Splunk 7.x Quick Start Guide

By: James H. Baxter

Overview of this book

Splunk is a leading platform for collecting, searching, and extracting value from ever-increasing amounts of big data, and big data is eating the world! This book covers all the crucial Splunk topics and gives you the information and examples to get the immediate job done. You will find enough insights to support further research and use Splunk to suit any business environment or situation. Splunk 7.x Quick Start Guide gives you a thorough understanding of how Splunk works. You will learn about all the critical tasks for architecting, implementing, administering, and utilizing Splunk Enterprise to collect, store, retrieve, format, analyze, and visualize machine data. You will find step-by-step examples based on real-world experience and practical use cases that are applicable to all Splunk environments. The book balances adequate coverage of all the critical topics with short but relevant deep dives into the configuration options and the steps to carry out the day-to-day tasks that matter. By the end of the book, you will be a confident and proficient Splunk architect and administrator.
Table of Contents (12 chapters)

What is Splunk?

Okay—so what is Splunk, anyway? How do you explain this product to your peers, friends, and family in a way that is easy to comprehend without watering down its awesome capabilities? Here's how I explain it, with an introductory setting and then increasing levels of powerful uses—until I notice their eyes starting to glaze over, at which point I stop and summarize again: "It's like Google for all kinds of machine data!"

Every company will have tens, hundreds, or maybe thousands of application and web servers, databases, and network devices such as switches, routers, and firewalls; all kinds of sensors, and so on—and all of these create log files or data streams that record their activities and statuses over time. Now, imagine needing to troubleshoot a problem that might be caused by any one of several parts of a system, and having to log into each of these machines one at a time, manually dig through its log file looking for clues, then log into the next, and so on—you can see how tedious and time-consuming this can become. Or maybe you want to monitor critical processes to make sure things are running well—how do you do that for a lot of machines?

Splunk is a software platform that collects and stores all this machine data in one place. It makes it as easy to search through and investigate that data as using Google. Basically, it's Google for log files! Beyond troubleshooting, you can use this search capability to build reports and dashboards to monitor performance, reliability, or other metrics across a whole collection of related servers and devices, and even create alerts to warn you by text or email when something is going wrong. It's also used to detect security threats, and since you have all this data in one place, you can do event correlation across devices and apply machine learning to it for the purposes of anomaly detection, user behavior analytics, and even predictive analytics to identify potential problems before they happen.
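To make the "one place to search" idea concrete, here is a minimal Python sketch of a toy keyword index over log lines gathered from several machines. It is only an illustration of the concept, not how Splunk is implemented; the host names, log lines, and the `build_index` and `search` helpers are all made up for this example.

```python
from collections import defaultdict

# Hypothetical sample events: (host, raw log line) pairs collected centrally.
EVENTS = [
    ("web01", "2018-11-01 10:00:01 GET /index.html 200"),
    ("web02", "2018-11-01 10:00:03 GET /login 500 error: upstream timeout"),
    ("db01", "2018-11-01 10:00:02 error: connection pool exhausted"),
    ("fw01", "2018-11-01 10:00:04 DENY tcp 10.0.0.5 -> 10.0.0.9:22"),
]


def build_index(events):
    """Map each lowercase token to the positions of the events containing it."""
    index = defaultdict(set)
    for pos, (_host, line) in enumerate(events):
        for token in line.lower().split():
            # Trim trailing/leading punctuation so "error:" matches "error".
            index[token.strip(":,")].add(pos)
    return index


def search(events, index, *terms):
    """Return the events that contain every search term (AND semantics)."""
    if not terms:
        return []
    hits = set.intersection(*(index.get(t.lower(), set()) for t in terms))
    return [events[pos] for pos in sorted(hits)]


index = build_index(EVENTS)

# One query covers every machine, instead of logging into each in turn.
for host, line in search(EVENTS, index, "error"):
    print(host, "|", line)
```

The point of the sketch is the workflow: once events from every source sit in one searchable store, a single query replaces a machine-by-machine hunt through individual log files.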

Splunk has a media kit brochure that covers the spectrum of ways Splunk helps companies extract value from their machine data, which can be found at: https://www.splunk.com/en_us/newsroom/media-kit.html.

The following diagram illustrates the spectrum of data you can collect with Splunk, and captures the essence of what Splunk does:

Figure: Splunk data sources and use cases

Splunk products

Splunk is available in several editions and licensing models to suit the needs of its customer base.

Splunk Enterprise is designed for on-premises deployments; it can be scaled to support an unlimited number of users and ingestion volumes by adding the necessary number and types of Splunk functional software components (indexers and search heads) on customer-supplied servers. The cost of a Splunk Enterprise license is based on daily ingestion volume. A wide range of applications that add value to Splunk Enterprise, written by both Splunk and the user community, is available for free from the Splunkbase website. Splunk also sells several sophisticated premium solution applications, such as IT Service Intelligence, Enterprise Security, and User Behavior Analytics, on an individual license basis, and a Machine Learning Toolkit is available for free from Splunkbase if you want to roll your own ML solutions.

Splunk Cloud is a cloud-based software as a service (SaaS) version of Splunk Enterprise; it is also licensed based on daily ingestion volume, and offers all the functionality of the on-premises Splunk Enterprise product. The core Splunk infrastructure is provided and managed by Splunk, while data inputs, approved apps, reports, dashboards, alerts, and so on are managed by the customer.

Splunk Light is designed to be a small-scale solution. It allows up to 20 GB/day of ingestion volume and five users, and supports reports, dashboards, and alerts, but it does not support distributed deployments or Splunkbase and premium solution apps (except for an app for Amazon Web Services (AWS)), and it has several other limitations.

Splunk Free is a free version of the core Splunk Enterprise product that has limits on users (one user), ingestion volume (500 MB/day), and other features; it also cannot be used in a distributed or clustered configuration.
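The limits quoted above can be turned into a quick sizing check. The following Python sketch is an illustration only: it encodes just the user, ingestion, and distributed-deployment limits mentioned in this section, treats 500 MB as 0.5 GB, and groups Splunk Enterprise and Splunk Cloud together because the text describes them as functionally equivalent. The `Edition` class and `smallest_fit` function are hypothetical names, and real licensing involves more factors than these three.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Edition:
    name: str
    max_gb_per_day: Optional[float]  # None = no fixed cap stated in the text
    max_users: Optional[int]         # None = no fixed cap stated in the text
    distributed: bool                # supports distributed deployments


# Ordered from smallest to largest; limits taken from the descriptions above.
EDITIONS = [
    Edition("Splunk Free", 0.5, 1, False),   # 500 MB/day assumed to be 0.5 GB
    Edition("Splunk Light", 20.0, 5, False),
    Edition("Splunk Enterprise / Splunk Cloud", None, None, True),
]


def smallest_fit(gb_per_day, users, needs_distributed=False):
    """Return the name of the first (smallest) edition whose limits are met."""
    for edition in EDITIONS:
        if edition.max_gb_per_day is not None and gb_per_day > edition.max_gb_per_day:
            continue
        if edition.max_users is not None and users > edition.max_users:
            continue
        if needs_distributed and not edition.distributed:
            continue
        return edition.name
    # Not reachable with the table above, but guards against an edited table.
    raise ValueError("no edition satisfies the requirements")


print(smallest_fit(gb_per_day=0.3, users=1))
print(smallest_fit(gb_per_day=5, users=3))
print(smallest_fit(gb_per_day=5, users=3, needs_distributed=True))
```

Treat the result as a first pass; the feature comparison chart remains the authoritative source for what each product includes.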

You can compare the available features of each product here: https://www.splunk.com/en_us/software/features-comparison-chart.html.

The history of Splunk

Splunk was designed from the beginning to allow people who were troubleshooting IT problems to search through their log files as easily as if using a search engine like Google. As a product, Splunk was conceived by founders Rob Das and Erik Swan between 2002 and 2004 after asking a number of people at 60-70 companies, "How do you find problems in your infrastructure today?" The answer was consistently that they would go through log files, and when asked what tools they used, they tended to answer that they wrote their own scripts and did everything manually. They also commented that they used Google to search for posts from other people who had solved the same problem. Rob and Erik thought, "Why not create a tool that allows people to search through log files as easily as they search the web using something like Google?" So they built a prototype to demo at Linux World with the tagline "Google for log files", and since they offered a free download, people tried it, liked it, and told their friends. From there, Splunk spread virally from person to person and company to company.

The name Splunk was derived from asking people, "What is it like to troubleshoot in your environment?" One answer was that it was like digging through caves with headlamps and helmets on, and crawling through the muck trying to find the problem, and it took forever. A common term for going through caves is spelunking, so they decided to call the product Splunk. From that beginning, Splunk has evolved into the powerful big data and business analytics tool that the founders envisioned from the start.

You can watch Rob and Erik tell their story here: https://www.splunk.com/view/SP-CAAAGBY.