Implementing Splunk 7, Third Edition

Overview of this book

Splunk is a leading platform for searching, monitoring, and analyzing growing amounts of big data. This book will allow you to implement new services and use them to process machine-generated big data quickly and efficiently. We introduce you to the new features, improvements, and offerings of Splunk 7, and we cover the new modules of Splunk: Splunk Cloud and the Machine Learning Toolkit, which ease data usage. You will learn to use search terms effectively with Boolean and grouping operators, to modify your searches so that they run faster, and to use wildcards efficiently. Later, you will learn how to use stats to aggregate values, chart to turn data, and timechart to show values over time; you'll also work with fields and chart enhancements and learn how to create a data model with faster data model acceleration. Once this is done, you will learn about XML dashboards, working with apps, building advanced dashboards, configuring and extending Splunk, advanced deployments, and more. Finally, we teach you how to use the Machine Learning Toolkit, along with best practices and tips to help you implement Splunk services effectively and efficiently. By the end of this book, you will have learned about the Splunk software as a whole and implemented Splunk services in your own projects.

Title Page

Packt Upsell

Contributors

Preface

The Splunk Interface

Logging in to Splunk

The home app

The top bar

The Search & Reporting app

Using the time picker

Using the field picker

The top bar in Splunk Cloud

Splunk reference app – PAS

Universal forwarder

eventgen

Next steps

Summary

Understanding Search

Using search terms effectively

Boolean and grouping operators

Clicking to modify your search

Using fields to search

Using wildcards efficiently

All about time

Making searches faster

Sharing results with others

Searching job settings

Saving searches for reuse

Creating alerts from searches

Event annotations

Summary

Tables, Charts, and Fields

About the pipe symbol

Using top to show common field values

Using stats to aggregate values

Using chart to turn data

Using timechart to show values over time

Working with fields

Chart enhancements in version 7.0

Summary

Data Models and Pivots

What is a data model?

What does a data model search?

Acceleration in version 7.0

Creating a data model

Summary

Simple XML Dashboards

The purpose of dashboards

Using wizards to build dashboards

Converting the panel to a report

Back to the dashboard

Scheduling the generation of dashboards

Summary

Advanced Search Examples

Using subsearches to find loosely related events

Using transaction

Determining concurrency

Calculating events per slice of time

Rebuilding top

Acceleration

Version 7.0 advancements in metrics

Summary

Extending Search

Using tags to simplify search

Using event types to categorize results

Using lookups to enrich data

Using macros to reuse logic

Creating workflow actions

Using external commands

Summary

Working with Apps

Defining an app

Included apps

Installing apps

Building your first app

Editing navigation

Customizing the appearance of your app

Object permissions

App directory structure

Self-service app management

Summary

Building Advanced Dashboards

Reasons for working with advanced XML

Reasons for not working with advanced XML

Development process

Advanced XML structure

Converting simple XML to advanced XML

Module logic flow

Understanding layoutPanel

Reusing a query

Using intentions

Creating a custom drilldown

Third-party add-ons

Summary

Summary Indexes and CSV Files

Understanding summary indexes

When to use a summary index

When to not use a summary index

Populating summary indexes with saved searches

Using summary index events in a query

Using sistats, sitop, and sitimechart

How latency affects summary queries

How and when to backfill summary data

Reducing summary index size

Calculating top for a large time frame

Using CSV files to store transient data

Summary

Configuring Splunk

Locating Splunk configuration files

The structure of a Splunk configuration file

The configuration merging logic

An overview of Splunk .conf files

User interface resources

Summary

Advanced Deployments

Planning your installation

Splunk instance types

Common data sources

Sizing indexers

Planning redundancy

Working with multiple indexes

Deploying the Splunk binary

Using apps to organize configuration

Configuration distribution

Using LDAP for authentication

Using single sign-on

Load balancers and Splunk

Multiple search heads

Summary

Extending Splunk

Writing a scripted input to gather data

Using Splunk from the command line

Querying Splunk via REST

Writing commands

Writing a scripted lookup to enrich data

Writing an event renderer

Writing a scripted alert action to process results

Hunk

Summary

Machine Learning Toolkit

What is machine learning?

Defining the toolkit

The toolkit workbench

Assistants

Extended SPL (search processing language)

Building a model

Validation

Summary

Index

Calculating top for a large time frame

One common problem is finding the top contributors out of a huge set of unique values. For instance, if you want to know which IP addresses are using the most bandwidth in a given day or week, you may have to keep track of the total of request sizes across millions of unique hosts to answer the question definitively. With summary indexes, this means storing millions of events in the summary index, which quickly defeats the purpose of summary indexes.

Just to illustrate, let's look at a simple set of data:

Time   1.1.1.1  2.2.2.2  3.3.3.3  4.4.4.4  5.5.5.5  6.6.6.6
12:00  99       100      100      100
13:00  99                100      100      100
14:00  99       100               101      100
15:00  99                99       100      100
16:00  99       100                        100      100
total  495      300      299      401      400      100

If we only stored the top three IPs per hour, our dataset would look like the following:

Time   1.1.1.1  2.2.2.2  3.3.3.3  4.4.4.4  5.5.5.5  6.6.6.6
12:00           100      100      100
13:00                    100      100      100
14:00           100               101      100
15:00                    99       100      100
16:00           100                        100      100
total           300      299      401      400...
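
The effect of keeping only the top three IPs per hour can be checked with a few lines of plain Python (not a Splunk search). This is a minimal sketch: the names hourly, totals, and keep_top are hypothetical, the assignment of each value to an IP column is an assumption chosen to agree with the totals row, and the tie at 15:00 is broken in favor of 3.3.3.3 only so that the output matches the second table.

```python
from collections import Counter

# Hourly totals per IP, taken from the tables above. Which column each value
# belongs to is an assumption that is consistent with the "total" row.
hourly = {
    "12:00": {"1.1.1.1": 99, "2.2.2.2": 100, "3.3.3.3": 100, "4.4.4.4": 100},
    "13:00": {"1.1.1.1": 99, "3.3.3.3": 100, "4.4.4.4": 100, "5.5.5.5": 100},
    "14:00": {"1.1.1.1": 99, "2.2.2.2": 100, "4.4.4.4": 101, "5.5.5.5": 100},
    "15:00": {"1.1.1.1": 99, "3.3.3.3": 99, "4.4.4.4": 100, "5.5.5.5": 100},
    "16:00": {"1.1.1.1": 99, "2.2.2.2": 100, "5.5.5.5": 100, "6.6.6.6": 100},
}


def totals(rows):
    """Sum the values per IP across all time slices."""
    result = Counter()
    for per_ip in rows.values():
        result.update(per_ip)  # Counter.update adds counts from a mapping
    return result


def keep_top(rows, n):
    """Keep only the n largest values in each time slice.

    Ties are broken by IP string, highest first (an arbitrary choice made
    to reproduce the second table).
    """
    trimmed = {}
    for hour, per_ip in rows.items():
        ranked = sorted(per_ip.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)
        trimmed[hour] = dict(ranked[:n])
    return trimmed


full_totals = totals(hourly)
summary_totals = totals(keep_top(hourly, 3))

print(full_totals.most_common(1))     # [('1.1.1.1', 495)]
print(summary_totals.most_common(1))  # [('4.4.4.4', 401)]
```

In this sketch, 1.1.1.1 has the largest total in the full data but never reaches the top three in any single hour, so it is absent from the trimmed totals entirely.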
