Programming MapReduce with Scalding

By: Antonios Chalkiopoulos

The late bound dependency pattern


Serialization of objects is a problem in all distributed systems, since objects have to be sent between different machines. Sometimes a class does not implement java.io.Serializable, and we cannot simply make it do so (for example, because it depends on a library such as HttpClient whose classes are not serializable). The following code is an example of such a case:

case class UserInfo(email: String, address: String)
// Note: this is a non-serializable class
class ExternalServiceImpl extends ExternalService { ... }

In such cases, we need to postpone the object instantiation. We want this object to be created on every node of the Hadoop cluster, instead of being instantiated once and then transferred between cluster nodes.

To achieve this, we can define all the non-serializable objects as abstract members in the operations trait. The binding can then be done either by using a lazy val member or by using a constructor function.
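The two binding styles described above can be sketched in plain Scala, without a Scalding job around them. This is a minimal illustration under stated assumptions, not the book's own listing: the method getUserInfo, the names ExternalServiceOperations, LazyBoundOperations, FactoryBoundOperations, and the SerializationCheck helper are all hypothetical, and the body of ExternalServiceImpl is a stand-in for a real client.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

case class UserInfo(email: String, address: String)

// Hypothetical service interface; getUserInfo is an assumed method name.
trait ExternalService {
  def getUserInfo(userId: String): UserInfo
}

// Deliberately NOT Serializable; stands in for a class wrapping e.g. an HTTP client.
class ExternalServiceImpl extends ExternalService {
  def getUserInfo(userId: String): UserInfo =
    UserInfo(s"$userId@example.com", "unknown")
}

// Operations trait: the non-serializable dependency is an abstract member,
// so the trait itself can be serialized and shipped to other nodes.
trait ExternalServiceOperations extends Serializable {
  def externalService: ExternalService
  def lookup(userId: String): UserInfo = externalService.getUserInfo(userId)
}

// Binding 1: a transient lazy val. The field is skipped during serialization
// and re-created on first access in whichever JVM uses it.
class LazyBoundOperations extends ExternalServiceOperations {
  @transient lazy val externalService: ExternalService = new ExternalServiceImpl
}

// Binding 2: a constructor function. Only the function is serialized, so it
// must itself be serializable and must not capture non-serializable state.
class FactoryBoundOperations(factory: () => ExternalService)
    extends ExternalServiceOperations {
  @transient lazy val externalService: ExternalService = factory()
}

// Helper that simulates shipping an object to another node through
// Java serialization (hypothetical, for illustration only).
object SerializationCheck {
  def roundTrip[T <: java.io.Serializable](value: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(value)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```

Calling SerializationCheck.roundTrip(new LazyBoundOperations) and then lookup on the result should succeed, because the ExternalServiceImpl instance is never written to the stream; each deserialized copy builds its own instance on first use. Marking the same member as a plain val instead would be expected to fail with a java.io.NotSerializableException as soon as the member has been initialized.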

In Scala, defining a variable as lazy will...