Apache Flume: Distributed Log Collection for Hadoop

By: Steve Hoffman, Steven Hoffman

Apache Flume: Distributed Log Collection for Hadoop Second Edition

The memory channel


A memory channel, as expected, is a channel where in-flight events are stored in memory. As memory is (usually) orders of magnitude faster than disk, events can be ingested much more quickly, which reduces hardware needs. The downside of using this channel is that an agent failure (hardware problem, power outage, JVM crash, Flume restart, and so on) results in the loss of any events still held in the channel. Depending on your use case, this might be perfectly fine. System metrics usually fall into this category, as a few lost data points aren't the end of the world. However, if your events represent purchases on your website, then a memory channel would be a poor choice.

To use the memory channel, set the type parameter on your named channel to memory.

agent.channels.c1.type=memory

This defines a memory channel named c1 for the agent named agent.
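
To show where that line sits in a complete agent definition, here is a minimal sketch. Only the c1 channel line comes from the text above; the source s1 (a netcat listener on localhost port 44444) and the sink k1 (a logger sink) are assumptions chosen purely for illustration, and a real deployment would substitute its own.

```properties
# Declare the components that belong to the agent named "agent"
agent.sources=s1
agent.channels=c1
agent.sinks=k1

# Hypothetical source for illustration: reads lines from a TCP socket
agent.sources.s1.type=netcat
agent.sources.s1.bind=localhost
agent.sources.s1.port=44444
# A source may feed several channels, hence the plural key
agent.sources.s1.channels=c1

# The memory channel from the text: in-flight events live on the JVM heap
agent.channels.c1.type=memory

# Hypothetical sink for illustration: writes events to the agent's log
agent.sinks.k1.type=logger
# A sink drains exactly one channel, hence the singular key
agent.sinks.k1.channel=c1
```

With this layout, any events sitting in c1 when the agent stops are gone, which is the trade-off described earlier.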

Here is a table of configuration parameters you can adjust from the default values:

Key          Required   Type     Default
type         Yes        String   memory
capacity...