Splunk 7.x Quick Start Guide

By: James H. Baxter

Overview of this book

Splunk is a leading platform and solution for collecting, searching, and extracting value from ever-increasing amounts of big data, and big data is eating the world! This book covers all the crucial Splunk topics and gives you the information and examples to get the immediate job done. You will find enough insights to support further research and use Splunk to suit any business environment or situation. Splunk 7.x Quick Start Guide gives you a thorough understanding of how Splunk works. You will learn about all the critical tasks for architecting, implementing, administering, and utilizing Splunk Enterprise to collect, store, retrieve, format, analyze, and visualize machine data. You will find step-by-step examples based on real-world experience and practical use cases that are applicable to all Splunk environments. The book balances adequate coverage of all the critical topics with short but relevant deep-dives into the configuration options and steps to carry out the day-to-day tasks that matter. By the end of the book, you will be a confident and proficient Splunk architect and administrator.

Splunk events

As we discovered in the previous section, Splunk creates an event from each entry in a log file or data stream. You can search for specific types of events, within specified time frames, using SPL in Splunk Web. For example, let's say you run the following SPL search on the instance of Splunk on your laptop:

index=_internal sourcetype=splunk_web_access

If you press Enter, Splunk will return a number of events for Today, or for whatever other time frame you've selected in the Time Range drop-down. These events come from Splunk's internal web server, and reflect the format and fields that are typical of a web log:

Splunk events from a simple search

We'll cover all of the features and details of using Splunk Web in Chapter 6, Searching with Splunk, so for now let's focus on some of the most important and useful fields in the events themselves.

The following is a screenshot of a typical Splunk event:

A typical Splunk event

Regardless of the data source or type, Splunk always tags each event with a number of default fields; some of these come from the metadata mentioned in the discussion about the data pipeline in the previous section, and others are added at index time. You will want to become familiar with four of these fields right away, as they are used extensively as filtering arguments in SPL searches to return the events of interest. In the preceding screenshot, these key fields have been circled in red. They are as follows:

  • _time (timestamp)
  • host
  • source
  • sourcetype

The date and time shown in the Time column is the timestamp assigned to the event, which Splunk stores in the _time field. When an event contains a timestamp, as this one does ([11/Jun/2018:21:12:35.441 -0400]), Splunk parses that timestamp and saves it in the _time field as an epoch value (the number of seconds since 00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970). If Splunk cannot find a timestamp in an event, it falls back to other sources and, as a last resort, assigns the time the event was indexed to the _time field. Splunk displays the _time value in a date-time format, as seen previously, adjusted for the time zone specified in the Splunk Web account settings; we'll cover this in more detail in the chapter on Splunk search.
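To make the epoch conversion concrete, here is a minimal Python sketch that parses the example timestamp and renders it back in two time zones. It is not how Splunk itself does the parsing; the format string and the helper names to_epoch and render are assumptions made for this illustration, and the format string only matches this one example.

```python
from datetime import datetime, timedelta, timezone

# Timestamp text copied from the example event (brackets removed).
RAW_TIMESTAMP = "11/Jun/2018:21:12:35.441 -0400"

# Assumed format string that matches this one example;
# %b expects English month abbreviations (the default C locale).
LOG_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S.%f %z"


def to_epoch(raw: str) -> float:
    """Return seconds since 1970-01-01 00:00:00 UTC for a web-log timestamp."""
    return datetime.strptime(raw, LOG_TIME_FORMAT).timestamp()


def render(epoch: float, utc_offset_hours: int) -> str:
    """Render an epoch value as wall-clock time at a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(epoch, tz).strftime("%Y-%m-%d %H:%M:%S %z")


epoch = to_epoch(RAW_TIMESTAMP)
print(int(epoch))         # whole seconds since the epoch: 1528765955
print(render(epoch, -4))  # 2018-06-11 21:12:35 -0400
print(render(epoch, 0))   # 2018-06-12 01:12:35 +0000
```

The stored value is a single instant; only the rendering changes with the viewer's time zone, which is why the same event can show a different clock time for users with different account settings.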

The host field is the name or IP address of the physical device from which an event originates. You can use this field to create filters to return events from a specific host. In the preceding example, the host is a Splunk server called robotdev.

The source field identifies where the event originated. In the case of data obtained from log files, the source consists of the full pathname of the file; in the case of a network-based source, this field contains the protocol and port, such as UDP:514. In this example, the event came from a Splunk log file called web_access.log in the /opt/splunk/var/log/splunk directory.

The sourcetype field identifies the data structure of an event (what fields the event contains, where they are, and how they're formatted), and determines how Splunk parses the data into specified fields during the indexing process. Splunk Enterprise comes with a large set of predefined source types for known data source types, and will assign the correct sourcetype to your data if it recognizes the format. You can use the sourcetype field in searches to find all the data of a certain type, regardless of the source. In the preceding example, the sourcetype is a Splunk-specific type called splunk_web_access.
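The idea of filtering by sourcetype regardless of source can be modeled with a toy Python example. This is only an in-memory illustration, not Splunk code: the events list is hand-written sample data, and apart from the first entry (taken from the example above) every host, path, and source type in it is hypothetical.

```python
# Hand-written sample events; only the first entry mirrors the example
# in the text, the other two are hypothetical.
events = [
    {"host": "robotdev",
     "source": "/opt/splunk/var/log/splunk/web_access.log",
     "sourcetype": "splunk_web_access"},
    {"host": "hypothetical-host",
     "source": "/hypothetical/path/web_access.log",
     "sourcetype": "splunk_web_access"},
    {"host": "robotdev",
     "source": "/hypothetical/path/other.log",
     "sourcetype": "hypothetical_type"},
]


def by_sourcetype(all_events, sourcetype):
    """Return every event of the given source type, whatever its source."""
    return [e for e in all_events if e["sourcetype"] == sourcetype]


matches = by_sourcetype(events, "splunk_web_access")
for event in matches:
    print(event["host"], event["source"])
```

Both web access events are returned even though they come from different hosts and files, which is the behavior a sourcetype filter gives you in a search.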

One other important field, which is not displayed with each event by default but is essential for writing SPL searches, is index; as you can see, it was specified in the SPL search used to return the example events earlier. Splunk ships with several internal indexes, including _audit, _internal, _introspection, and _telemetry; you can explore the data in these to get familiar with events while you are getting started. Later, you will create and use custom indexes to store data from your company's host and device logs, and specify those indexes in your search strings.
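As a rough sketch of how the index and the other default fields combine into a search string, the following Python helper assembles one from keyword arguments. The helper names quote and build_search are hypothetical and not part of any Splunk library, and the quoting rule (double quotes with backslash escapes) is an assumption chosen to keep values such as file paths intact.

```python
def quote(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_search(index: str, **filters: str) -> str:
    """Join an index and optional default-field filters into one search string."""
    allowed = ("host", "source", "sourcetype")
    unknown = set(filters) - set(allowed)
    if unknown:
        raise ValueError(f"unsupported filter(s): {sorted(unknown)}")
    terms = [f"index={quote(index)}"]
    # Emit filters in a fixed order so the output is predictable.
    terms += [f"{name}={quote(filters[name])}" for name in allowed if name in filters]
    return " ".join(terms)


search = build_search("_internal", sourcetype="splunk_web_access", host="robotdev")
print(search)  # index="_internal" host="robotdev" sourcetype="splunk_web_access"
```

The resulting string narrows the example search from earlier to a single host; leaving out the host argument reproduces the original search with quoted values.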