Distributed Computing with Go

By: V.N. Nikhil Anurag

Overview of this book

Distributed Computing with Go gives developers who already have a good grasp of basic Go development the tools to fulfill the true potential of Golang development in a world of concurrent web and cloud applications. Nikhil starts out by setting up a professional Go development environment. Then you’ll learn the basic concepts and practices of Golang concurrent and parallel development. In the next few chapters, you’ll find out how to balance resources and data with REST and standard web approaches while keeping concurrency in mind. Most Go applications these days run in a data center or on the cloud, a condition upon which the next chapter depends. You’ll then expand your skills considerably by writing a distributed document indexing system over the following two chapters. This system has to balance a large corpus of documents with considerable analytical demands. Another use case is the way in which a web application written in Go can be consciously redesigned to take distributed features into account. That chapter is of particular interest to Go developers who have to migrate existing Go applications to computationally and memory-intensive environments. The final chapter covers the rather onerous task of testing parallel and distributed applications, something that is not usually taught in standard computer science curricula.
Table of Contents (11 chapters)

Document feeder – the REST API endpoint

The main aim of /api/feeder is to receive documents to be indexed, process them, and forward the processed data to Librarian to be added to the index. This means we need to accurately process the document. But what do we mean by "processing a document?"

It can be defined as the following set of consecutive tasks:

  1. We rely on the payload to provide us with a title and link to the document. We download the linked document and use it in our index.
  2. The document can be thought of as one big blob of text, and it is possible that we might have multiple documents with the same title. We need to be able to identify each document uniquely and also be able to easily retrieve them.
  3. The result of a search query expects the provided words to be present in the document. This means we need to extract all words from a document and also...
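
The tasks listed above can be sketched roughly as follows. This is a minimal illustration, not the book's implementation: the helper names `download`, `docID`, and `extractWords` are hypothetical, hashing the title together with the body is only one assumed way to obtain a unique identifier, and splitting on non-alphanumeric characters is an assumed definition of a "word".

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"net/http"
	"strings"
	"unicode"
)

// download fetches the linked document and returns its body as text.
// (Hypothetical helper; it is not called in main to keep the demo offline.)
func download(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s", resp.Status)
	}
	b, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// docID derives an identifier from the title and the body, so two
// documents that share a title but differ in content get different IDs.
// Assumption: a truncated SHA-256 digest is an acceptable identifier.
func docID(title, body string) string {
	sum := sha256.Sum256([]byte(title + "\x00" + body))
	return hex.EncodeToString(sum[:])[:16]
}

// extractWords lowercases the text and splits it on every rune that is
// neither a letter nor a digit.
func extractWords(text string) []string {
	return strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

func main() {
	body := "Go is fun. Go is fast!"
	// Same title, different content: the IDs should differ.
	fmt.Println(docID("intro", body) != docID("intro", body+" v2")) // true
	fmt.Println(extractWords(body)) // [go is fun go is fast]
}
```

Keeping each task in its own small function means the HTTP handler for /api/feeder would only need to decode the payload and call them in order before forwarding the result to Librarian.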