Apache Flume: Distributed Log Collection for Hadoop

A memory channel, as expected, is a channel where in-flight events are stored in memory. Because memory is (usually) orders of magnitude faster than disk, events can be ingested much more quickly, which reduces hardware needs. The downside of this channel is that an agent failure (hardware problem, power outage, JVM crash, Flume restart, and so on) results in the loss of data. Depending on your use case, this might be perfectly fine. System metrics usually fall into this category, as a few lost data points aren't the end of the world. However, if your events represent purchases on your website, then a memory channel would be a poor choice.

To use the memory channel, set the type parameter on your named channel to memory.

agent.channels.c1.type=memory

This defines a memory channel named c1 for the agent named agent.
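
For context, a channel only does something once the agent declares it and a source and a sink are attached to it. The following is a minimal sketch of how the c1 channel might be wired into a complete agent definition. The source name s1, the sink name k1, the netcat source, the logger sink, and the port number are assumptions chosen for illustration and do not come from the text above.

```properties
# Declare the components of the agent named "agent".
# s1 and k1 are hypothetical names used only for this sketch.
agent.sources=s1
agent.channels=c1
agent.sinks=k1

# The memory channel from the example above.
agent.channels.c1.type=memory

# Assumed source: a netcat source listening on a local port (port is arbitrary).
agent.sources.s1.type=netcat
agent.sources.s1.bind=localhost
agent.sources.s1.port=12345
# A source can write to several channels, so the key is plural.
agent.sources.s1.channels=c1

# Assumed sink: a logger sink that writes events to the agent's log.
agent.sinks.k1.type=logger
# A sink reads from exactly one channel, so the key is singular.
agent.sinks.k1.channel=c1
```

With a layout like this, any event accepted by the source sits in c1, in memory only, until the sink takes it, which is exactly the window in which an agent failure would lose it.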

Here is a table of configuration parameters you can adjust from the default values:

Key	Required	Type	Default
`type`	Yes	`String`	`memory`
`capacity...`

By: Steven Hoffman