Implementing Splunk 7, Third Edition - Third Edition


Reducing summary index size


If the saved search populating a summary index produces too many results, the summary index is less effective at speeding up searches. This usually occurs because one or more of the fields used for grouping has more unique values than expected.

One common example of a field that can have many unique values is the URL in a web access log. The number of URL values might increase in instances where:

  • The URL contains a session ID
  • The URL contains search terms
  • Attackers are probing your site with arbitrary URLs, trying to break in
  • Your security team runs tools looking for vulnerabilities

On top of this, multiple URLs can represent exactly the same resource, as follows:

  • /home/index.html
  • /home/
  • /home/index.html?a=b
  • /home/?a=b

We will cover a few approaches to flatten these values. These are just examples and ideas, and your particular case may require a different approach.
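
To make the idea of flattening concrete, here is a minimal sketch in Python, run outside Splunk, that collapses the four equivalent URLs listed above into one grouping value. The helper name flatten_url and its two rules (drop the query string, treat index.html as its directory) are assumptions made for illustration; they are not the searches used in this book, and your data may need different rules.

```python
from collections import Counter
from urllib.parse import urlsplit


def flatten_url(url: str) -> str:
    """Collapse equivalent URL spellings into one grouping value (hypothetical helper)."""
    path = urlsplit(url).path          # keeps only the path, dropping ?a=b
    if path.endswith("/index.html"):   # assumption: index.html is the directory's default page
        path = path[: -len("index.html")]
    return path


# The four spellings of the same resource from the list above.
urls = [
    "/home/index.html",
    "/home/",
    "/home/index.html?a=b",
    "/home/?a=b",
]

raw_counts = Counter(urls)                            # one group per distinct raw URL
flat_counts = Counter(flatten_url(u) for u in urls)   # one group per flattened URL

print(len(raw_counts), "distinct raw values")
print(len(flat_counts), "distinct flattened values:", dict(flat_counts))
```

Grouping by the flattened value instead of the raw URL is what keeps the number of summary rows down: the fewer distinct values a grouping field has, the fewer results the populating search writes to the summary index.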

Using eval and rex to define grouping fields

One way to tackle this problem is to make up a new field from the URL using rex...