Book Image

Implementing Splunk: Big Data Reporting and Development for Operational Intelligence

By : VINCENT BUMGARNER
Book Image

Implementing Splunk: Big Data Reporting and Development for Operational Intelligence

By: VINCENT BUMGARNER

Overview of this book

Splunk is a data collection, indexing, and visualization engine for operational intelligence. It's a powerful and versatile search and analysis engine that lets you investigate, troubleshoot, monitor, alert, and report on everything that's happening in your entire IT infrastructure from one location in real time. Splunk collects, indexes, and harnesses all the fast moving machine data generated by our applications, servers, and devices - physical, virtual, and in the cloud.Given a mountain of machine data, this book shows you exactly how to learn to use Splunk to make something useful from it. Depending on your needs, you can learn to search, transform, and display data, or learn to administer your Splunk installation, large or small. "Implementing Splunk: Big Data Reporting and Development for Operational Intelligence" will help you get your job done faster, whether you read from the beginning or jump to what you need to know today. New and experienced users alike will find nuggets of wisdom throughout.This book provides you with valuable examples and step-by-step instructions, showing you how to take advantage of everything Splunk has to offer you, to make the most out of your machine data."Implementing Splunk: Big Data Reporting and Development for Operational Intelligence" takes you on a journey right from inception to a fully functioning implementation of Splunk. Using a real-world data walkthrough, you'll be shown how to search effectively, create fields, build dashboards, reports, and package apps, manage your indexes, integrate into the enterprise, and extend Splunk. This practical implementation guide equips you with high-level knowledge for configuring, deploying, extending, and integrating Splunk. Depending on the goal and skills of the reader, enough topics are covered to get you on your way to dashboard guru, app developer, or enterprise administrator. This book uses examples curates reference, and sage advice to help you make the most of this incredibly powerful tool.
Table of Contents (19 chapters)
Implementing Splunk: Big Data Reporting and Development for Operational Intelligence
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Calculating top for a large time frame


One common problem is to find the top contributors out of some huge set of unique values. For instance, if you want to know what IP addresses are using the most bandwidth in a given day or week, you may have to keep track of the total of request sizes across millions of unique hosts to definitively answer this question. When using summary indexes, this means storing millions of events in the summary index, quickly defeating the point of summary indexes.

Just to illustrate, let's look at a simple set of data:

Time

1.1.1.1

2.2.2.2

3.3.3.3

4.4.4.4

5.5.5.5

6.6.6.6

12:00

99

100

100

100

  

13:00

99

 

100

100

100

 

14:00

99

100

 

101

100

 

15:00

99

 

99

100

100

 

16:00

99

100

  

100

100

total

495

300

299

401

400

100

If we only stored the top three IPs per hour, our data set would look like the following:

Time

1.1.1.1

2.2.2.2

3.3.3.3

4.4.4.4

5.5.5.5

6.6.6.6

12:00

 

100

100

100

  

13:00

  

100

100

100

 

14:00

 

100

 

101...