Unstructured data


The following screenshot is an example of what unstructured data looks like:
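An unstructured log of this kind is typically a free-form, multi-line application log. As a purely hypothetical stand-in for the screenshot (the names and messages here are made up), it might look something like this:

    2016-06-14 15:42:07,991 ERROR [ajp-bio-8009-exec-42] com.example.web.OrderController - Order submission failed
        java.lang.NullPointerException: customerId was null
            at com.example.web.OrderController.submit(OrderController.java:118)
            at javax.servlet.http.HttpServlet.service(HttpServlet.java:646)
    2016-06-14 15:42:08,004 INFO  [ajp-bio-8009-exec-42] com.example.web.OrderController - Retrying order submission

Notice that there is no consistent key=value structure, and that a single event (the error plus its stack trace) can span several lines.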

These kinds of logs are much more complicated to bring value to, as all of the knowledge must be manually extracted by a Splunk engineer or admin. Splunk will look at your data and attempt to extract things that it believes are fields; however, these often turn out to be nothing like what you or your users want to use to add value to your dashboards.

That being the case, this is where one would need to speak to the developer/vendor of that specific software, and start asking some pointed questions.

In these kinds of logs, before we can start adding the proper value, there are some foundational elements that need to be correct. I'm only going to focus on the first two, as we will get to the others later in this book:

  • Time stamping

  • Event breaking

  • Event definitions

  • Field definitions (field mapping)

Event breaking - best practice

With structured data, Splunk will usually recognize the events and break them correctly on its own, as they are nice and orderly.

With unstructured data, in order to make sure we are getting the data in appropriately, the events need to be in some sort of organized chaos, and this usually begins with breaking an event at the appropriate line or character in the log. There are lots of ways to break an event in Splunk (see http://docs.splunk.com/Documentation/Splunk/6.4.1/Admin/Propsconf and search for break), but with the preceding data we are going to look at the timestamp to decide where these events should break, as using the first field, which is most often the timestamp, is the most effective way to break an event.

There are a few questions to ask yourself when breaking events, though one of the more important ones is: are these events all on one line, or are there multiple lines in each event? If you don't know the answer to this question, ask the SME (dev/vendor). Things can get messy once data is in, so save yourself a bit of time by asking this question before inputting data.

In the following example, we can see the timestamp is the event delimiter and that there can be multiple lines in an event. This means that we need to break events pre-indexing:

In order to do that, we need to adjust our props.conf on our indexer. Doing so will appropriately delineate log events as noted in the following image:
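The exact stanza depends on your data, but as a rough, hand-written sketch for a log like the hypothetical sample above (web_frontend is the source type name we will end up using later in this chapter), it might look something like this:

    [web_frontend]
    # Keep merging raw lines into the current event until a new line starts
    # with a date, so multi-line entries such as stack traces stay together.
    SHOULD_LINEMERGE = true
    BREAK_ONLY_BEFORE_DATE = true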

Note

Adding line breaking at the indexing tier in this way is a method of pre-index event breaking; once data has been indexed, it cannot be removed without cleaning the index.
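For reference, cleaning an index means wiping its events entirely, which you should only ever do on data you can afford to lose or re-index. With the Splunk CLI, and a hypothetical index name, that looks like this:

    # splunkd must be stopped before the clean command will run
    $SPLUNK_HOME/bin/splunk stop
    $SPLUNK_HOME/bin/splunk clean eventdata -index my_test_index
    $SPLUNK_HOME/bin/splunk start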

In this example, we have five indexers in a cluster pool, so using the UI on each of those indexers is not recommended. "Why?" you ask. In short, once you cluster your indexers, most of the files that would end up in $SPLUNK_HOME/etc/ become shared and must be pushed as a bundle by the cluster master. Editing them individually is also not recommended by Splunk support. Try it if you like, but be prepared for some long nights.
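To sketch why (these are the standard indexer clustering paths; your app layout may differ), anything you want on the peers goes under master-apps on the cluster master and is pushed out to slave-apps on every peer, overwriting whatever is already there:

    # On the cluster master - this is the copy you edit:
    $SPLUNK_HOME/etc/master-apps/_cluster/local/props.conf

    # On every clustered indexer - pushed by the master with each bundle,
    # so hand edits made here are overwritten:
    $SPLUNK_HOME/etc/slave-apps/_cluster/local/props.conf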

Currently, Splunk is set up to do this quite easily for an individual file via the UI, though when dealing with a larger infrastructure and multiple indexers, the UI often isn't the best way to administer the system. As a tip, if you're an admin and you don't have a personal instance of Splunk installed on your workstation for just this purpose, install one. Testing the features you are about to implement is a best practice on any system.

Best practices

Why should you install an instance of Splunk on your personal workstation, you ask? Because if you bump into an issue where you need to index a dataset but can't use the UI on the production system, you can get a subset of the data in a file and ingest it into your personal instance, leveraging the UI and all its neat features there. Then just copy all of the relevant settings to your indexers/cluster master. This is how you can do that:

  1. Get a subset of the data. The SME can copy and paste it into an e-mail, send it as an attachment, or deliver it any other way; just get the subset so you can try to input it. Save it to the machine that is running your personal Splunk instance.

  2. Log in to your personal Splunk instance and attempt to input the data. In Splunk, go to Settings | Data Inputs | Files & Directories | New and select your file, which should bring you to a screen that looks like this:

  3. Attempt to break your events using the UI.

Now we are going to let Splunk do most of the configuring here. We have three ways to do this:

  1. Auto: Let Splunk do the figuring.

  2. Every Line: This is self-explanatory; every single line becomes its own event.

  3. Regex...: Use a regex to tell Splunk where each event starts.
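Under the hood, these choices map roughly onto props.conf settings like the following, all of which live inside a source type stanza such as [web_frontend] (this is an approximate mapping for orientation, not the exact output of any particular Splunk version):

    # Auto: the defaults - merge lines and start a new event at each timestamp
    SHOULD_LINEMERGE = true
    BREAK_ONLY_BEFORE_DATE = true

    # Every Line: every single line becomes its own event
    SHOULD_LINEMERGE = false

    # Regex...: start a new event wherever this pattern matches
    # (here, a date at the beginning of a line)
    BREAK_ONLY_BEFORE = ^\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}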

For this example, I'm going to say we spoke to the developer and they actually did say that the timestamp was the event divider. It looks like Auto will do just fine, as Splunk naturally breaks events at timestamps:

Going down the rest of the options, we can leave the timestamp extraction to Auto as well, because the timestamp is easily readable in the log.

The Advanced tab is for adding settings manually, but for this example and the information we have, we won't need to worry about it.
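If you did want to use it, for instance to pin the timestamp down explicitly rather than trusting Auto, the settings added to the source type stanza would look something like this (the format string matches the hypothetical log shown earlier; adjust it to your own data):

    # The timestamp sits at the very start of each event
    TIME_PREFIX = ^
    # e.g. 2016-06-14 15:42:07,991
    TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N
    # Don't scan more than 23 characters when looking for the timestamp
    MAX_TIMESTAMP_LOOKAHEAD = 23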

When we click the Next button, we can set our source type. We want to pay attention to the App portion of this screen for the future, as that is where the configuration we are building will be saved:

Click Save and set all of the other values on the next couple of windows as well if you like. As this is your personal Splunk instance, it's not terribly important because you, the Splunk admin, are the only person who will see it.

When you're finished, make sure your data looks the way you expect it to in a search:
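One quick sanity check is to look at how many lines each event contains; for multi-line data you should see a spread of line counts rather than everything landing on a single line. Assuming you sent the file to the main index with a source type called myUnstructured, a search like this will show you:

    index=main sourcetype=myUnstructured
    | stats count by linecount
    | sort - count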

And if you're happy with it (and let's say we are), we can then look at moving this configuration to our cluster.

Remember when I mentioned we should pay attention to the App? That's where the configuration that we want was written. At this point, it's pretty much just copying and pasting.

Configuration transfer - best practice

All of that was only to get Splunk to auto-generate the configuration that you need to break your data, so the next step is just transferring that configuration to a cluster.

You'll need two files for this: the props.conf that we just edited on your personal instance, and the props.conf on your cluster master (for those of you unfamiliar, it lives under $SPLUNK_HOME/etc/master-apps/_cluster/local/ on your cluster master).

This was the config that Splunk just wrote in my personal instance:
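It is the stanza for the new myUnstructured source type, the same settings we will paste into the cluster master in a moment:

    [myUnstructured]
    DATETIME_CONFIG = 
    NO_BINARY_CHECK = true
    category = Custom
    pulldown_type = true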

Follow these steps to transfer the configuration:

  1. Go to the destination app's props.conf, copy the configuration, and paste it into your cluster master's props.conf, then distribute the configuration to the peers ($SPLUNK_HOME/etc/master-apps/_cluster/local/props.conf). In the case of our example:

          Copy source file = $SPLUNK_HOME/etc/apps/search/local/props.conf 
          Copy dest file = $SPLUNK_HOME/etc/master-apps/_cluster/local/props.conf 
    
  2. Change the stanza name to match your source type in the cluster:

    • When we pasted our configuration into our cluster master, it looked like this:

                        [myUnstructured] 
                        DATETIME_CONFIG = 
                        NO_BINARY_CHECK = true 
                        category = Custom 
                        pulldown_type = true
    • Yet there is no myUnstructured source type in the production cluster. In order to make these changes take effect on your production source type, just adjust the name of the stanza. In our example we will say that the log snippet we received was from a web frontend, which is the name of our source type.

    • The change would look like this: 

                            [web_frontend] 
                            DATETIME_CONFIG = 
                            NO_BINARY_CHECK = true 
                            category = Custom 
                            pulldown_type = true 
      
  3. Push the cluster bundle via the UI on the cluster master (or from the CLI, as sketched after this list):

  4. Make sure your data looks the way you want it to:
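If you prefer the command line, the push in step 3 can also be done with the standard cluster-bundle commands, run as the Splunk user on the cluster master:

    # Check the bundle for obvious problems before pushing it to the peers
    $SPLUNK_HOME/bin/splunk validate cluster-bundle

    # Push the bundle to the peers; they reload (or restart, if required)
    $SPLUNK_HOME/bin/splunk apply cluster-bundle

    # Watch the rollout until every peer reports the new bundle
    $SPLUNK_HOME/bin/splunk show cluster-bundle-status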

Once we have it coming into our index in some sort of reasonable chaos, we can begin extracting knowledge from events.