Apache Flume: Distributed Log Collection for Hadoop, Second Edition

By: Steven Hoffman
Summary


In this chapter, we covered several real-world considerations you need to think about when planning your Flume implementation, including:

  • Transport time does not always match event time

  • The mayhem that Daylight Saving Time introduces into certain time-based logic

  • Capacity planning considerations

  • Items to consider when you have more than one data center

  • Data compliance

  • Data retention and expiration
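
As a reminder of why the Daylight Saving Time point matters, here is a minimal sketch of hourly bucketing by local wall-clock time versus UTC. It is not Flume code: the helper names `local_bucket` and `utc_bucket`, the bucket format, and the fixed offsets standing in for US Eastern time on 7 November 2021 are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets stand in for US Eastern time on either side
# of the autumn clock change (EDT before, EST after).
EDT = timezone(timedelta(hours=-4))
EST = timezone(timedelta(hours=-5))


def local_bucket(ts):
    """Hypothetical hourly bucket built from the wall-clock time only."""
    return ts.strftime("%Y-%m-%d-%H")


def utc_bucket(ts):
    """Hypothetical hourly bucket built after converting to UTC."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d-%H")


# 01:30 occurs twice on this night: once before the clocks go back
# and once after, a full hour apart in real time.
first = datetime(2021, 11, 7, 1, 30, tzinfo=EDT)
second = datetime(2021, 11, 7, 1, 30, tzinfo=EST)

print(local_bucket(first), local_bucket(second))
print(utc_bucket(first), utc_bucket(second))
```

The two events fall into the same local bucket even though they happened an hour apart, while the UTC buckets keep them separate. That is the kind of surprise to check for in any time-based path or rollover logic.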

I hope you enjoyed this book and that you will be able to apply much of this information directly in your application/Hadoop integration efforts.

Thanks, this was fun!