Splunk Best Practices

Overview of this book

This book will give you an edge over others through insights that will help you in day-to-day instances. When you're working with data from various sources in Splunk and performing analysis on this data, it can be a bit tricky. With this book, you will learn the best practices of working with Splunk. You'll learn about tools and techniques that will ease your life with Splunk, and will ultimately save you time. In some cases, it will adjust your thinking of what Splunk is, and what it can and cannot do. To start with, you'll get to know the best practices to get data into Splunk, analyze data, and package apps for distribution. Next, you'll discover the best practices in logging, operations, knowledge management, searching, and reporting. To finish off, we will teach you how to troubleshoot Splunk searches, as well as deployment, testing, and development with Splunk.
Table of Contents (16 chapters)

Preface

Within the working world of technology, there are hundreds of thousands of different applications, all (usually) logging in different formats. As Splunk experts, our job is to make all those logs speak human, which is often an impossible task. With third-party applications that provide support, sometimes log formatting is out of our control. Take, for instance, Cisco or Juniper, or any other leading manufacturer.

These devices submit structured data specific to the manufacturer. There are also applications that we have more influence on, which are usually custom applications built for a specific purpose by the development staff of your organization. These are usually referred to as 'proprietary', 'in-house', or 'home-grown' applications, all of which mean the same thing.

The logs I am referencing belong to proprietary in-house (a.k.a. home grown) applications that are often part of the middleware, and usually control some of the most mission critical services an organization can provide.

Proprietary applications can be written in anything, but logging is usually left up to the developers for troubleshooting. Until now, the process of manually scraping log files to troubleshoot quality assurance issues and system outages has been very specialized: usually, the developers are the only people who truly understand what those log messages mean.

That being said, developers often write their logs in a way that they themselves can understand, because ultimately they will be the ones troubleshooting and fixing code when something severe breaks.

As an IT community, we haven't really started taking a look at the way we log things. Instead, we have tried to limit the confusion to developers, and then have them help the other SMEs who provide operational support understand what is actually happening.

This method has been successful but time-consuming, and the true value of any SME lies in reducing a system's MTTR and increasing uptime. With any system, more transactions processed means a larger scale, and beyond about 20 machines, troubleshooting with a manual process becomes more complex and time-consuming.

The goal of this book is to give you some techniques to build a bridge in your organization. We will assume you have a base understanding of what Splunk does, so that we can provide a few tools to make your day-to-day life easier with Splunk and not get bogged down in the vast array of SDKs, matching languages, and APIs. These tools range from intermediate to expert levels. My hope is that at least one person can take at least one concept from this book to make their life easier.

What this book covers

Chapter 1, Application Logging, discusses where the application data comes from, how that data gets into Splunk, and how Splunk reacts to the data. You will develop applications, or scripts, and also learn how to adjust Splunk to handle some non-standardized logging. Splunk is only as turnkey as the data you put into it. This means that if you have a 20-year-old application that logs unstructured data in debug mode only, your Splunk instance will not be turnkey. With a system such as Splunk, we can quote some data science experts in saying "garbage in, garbage out".
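
To make the "garbage in, garbage out" point concrete, here is a minimal sketch of an application that writes its events as key="value" pairs, a shape that Splunk's automatic key-value extraction can usually pick up at search time without custom configuration. It is an illustration only, not code from the book: the logger name orders_demo, the helper functions kv and build_logger, and the field names action, order_id, status, and reason are all hypothetical.

```python
import logging
import sys


def kv(**fields):
    """Render keyword arguments as space-separated key="value" pairs."""
    parts = []
    for key, value in fields.items():
        # Swap embedded double quotes so each value stays one quoted token.
        text = str(value).replace('"', "'")
        parts.append(f'{key}="{text}"')
    return " ".join(parts)


def build_logger(stream):
    """Return a logger that writes timestamped key=value events to stream."""
    logger = logging.getLogger("orders_demo")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s level=%(levelname)s %(message)s",
        datefmt="%Y-%m-%dT%H:%M:%S%z",  # timestamp first, with UTC offset
    ))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger(sys.stdout)
    # Hypothetical event: every value a searcher might need has its own key.
    log.info(kv(action="checkout", order_id=1042,
                status="failed", reason="card declined"))
```

Compared with a free-text message such as "checkout blew up again", an event in this form lets someone other than the original developer search on status or order_id directly.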

Chapter 2, Data Inputs, moves on to the kinds of data inputs Splunk uses to get data in. We see how to enable Splunk to use the input methods it provides. Finally, you will get a brief introduction to the data inputs for Splunk.

Chapter 3, Data Scrubbing, discusses how to format all incoming data into a Splunk-friendly format before indexing, in order to ease search querying and knowledge management going forward.

Chapter 4, Knowledge Management, explains some techniques for managing the incoming data to your Splunk indexers, some basics of how to leverage those knowledge objects to enhance performance when searching, as well as the pros and cons of pre and post field extraction.

Chapter 5, Alerting, discusses the growing importance of Splunk alerting, and the different levels of doing so. In the current corporate environment, intelligent alerting, and alert 'noise' reduction are becoming more important due to machine sprawl, both horizontally and vertically. Later, we will discuss how to create intelligent alerts, manage them effectively, and also some methods of 'self-healing' that I've used in the past and the successes and consequences of such methods in order to assist in setting expectations.

Chapter 6, Searching and Reporting, will talk about the anatomy of a search, and then some key techniques that help in real-world scenarios. Many people understand search syntax; however, using it effectively (a.k.a. becoming a search ninja) is something much more elusive and continuous. We will also see real-world use cases to get the point across, such as merging two datasets at search time and making the result sets of two searches match each other in time.

Chapter 7, Form-Based Dashboards, discusses how to create form-based dashboards leveraging $foo$ variables as selectors to appropriately pass information to another search or another dashboard. We also see how to create an effective drill-down effect.

Chapter 8, Search Optimization, shows how to optimize dashboards to increase performance. This ultimately affects how quickly dashboards load results. We do that by adjusting search queries and leveraging summary indexes, the KV Store, accelerated searches, and data models, to name a few.

Chapter 9, App Creation and Consolidation, discusses how to take a series of apps from Splunkbase, as well as any dashboard that is user created, and put them into a Splunk app for ease of use. We also talk about how to adjust the navigation XML to ease user navigation of such an app.

Chapter 10, Advanced Data Routing, discusses something that is becoming more commonplace in the enterprise. As many people use big data platforms like Splunk to move data around their network, things such as firewalls, data stream loss, and sourcetype renaming by environment can become administratively expensive.

What you need for this book

You will need at least a distributed deployment of an on-premises installation of Splunk for this book, collecting both Linux and Windows information, and a heavy forwarder as well. We will use all of these pieces to show you techniques to add value.

Who this book is for

This book is for administrators, developers, and search ninjas who have been using Splunk for some time. Its comprehensive coverage makes this book great for Splunk veterans and newbies alike.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "For instance, in Cisco log files there is a src_ip field."

A block of code is set as follows:

[mySourcetype] 
REPORT-fields = myLinuxScript_fields

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

[myUnstructured] 
DATETIME_CONFIG = 
NO_BINARY_CHECK = true 
category = Custom 
pulldown_type = true

Any command-line input or output is written as follows:

ssh -v -p 8089 mydeploymentserver.com

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "The most common messages we see are things like unauthorized login attempt <user> or Connection Timed out to <ip address>."

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

  1. Log in or register to our website using your e-mail address and password.

  2. Hover the mouse pointer on the SUPPORT tab at the top.

  3. Click on Code Downloads & Errata.

  4. Enter the name of the book in the Search box.

  5. Select the book for which you're looking to download the code files.

  6. Choose from the drop-down menu where you purchased this book from.

  7. Click on Code Download.

You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR / 7-Zip for Windows

  • Zipeg / iZip / UnRarX for Mac

  • 7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Splunk-Best-Practices. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book 

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/SplunkBestPractices_ColorImages.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at [email protected] with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.