Fundamentals-1:
Module-2
What is Splunk?
01. Index data
02. Search & Investigate
03. Add Knowledge
04. Monitor & Alert
05. Report & Analyze
Components?
01. Indexer – Indexers process incoming machine data, storing the results in an index as events. As the indexer indexes data, it creates a number of files organized into sets of directories by age.
02. Search Head – The search head allows users to use the Splunk search language to search the indexed data. It handles search requests from users and distributes the requests to the indexers, which perform the actual searches on the data. The search head then consolidates and enriches the results from the indexers before returning them to the user. Search heads also provide users with various tools, such as dashboards, reports and visualizations, to assist the search experience.
03. Forwarder – Forwarders are Splunk Enterprise instances that consume data and forward it to indexers for processing. In most Splunk deployments, forwarders serve as the primary way data is supplied for indexing.
Splunk Deployments and Scaling?
In Single-Instance Deployments, one instance handles all functions:
01. Input
02. Parsing
03. Indexing
04. Searching
This is a perfect environment for:
· Proof of concept
· Personal use
· Learning
It might also serve the needs of small, department-sized environments.
Module-3
Splunk installation on Linux, Windows, OSX & Splunk Cloud
Splunk Apps and Roles –
01. Apps – The apps you see are defined by a user with an administrator role.
02. Roles – Determine what a user can see, do and interact with.
Three main roles in Splunk Enterprise:
01. Administrator Role – Can install apps and create knowledge objects for all users.
02. Power Role – Can create and share knowledge objects for users of an app and perform real-time searches.
03. User Role – Will only see their own knowledge objects and those shared with them.
Module-4
Types of Data Input:
Monitor option – Allows us to monitor files, directories, HTTP events, network ports or data-gathering scripts located on Splunk Enterprise instances.
If we are in a Windows environment, we would also see options to monitor Windows-specific data. This includes:
01. Event logs
02. File system changes
03. Active Directory
04. Network information (both local & remote machines)
Forward option – We can receive data from external forwarders. Forwarders are installed on remote machines to gather data and forward it to an indexer over a receiving port. In most production environments, forwarders will be used as your main source of data input.
[Indexes are directories where data is stored]
[Having separate indexes can make your searches more effective]
[Specifying an index limits the amount of data Splunk searches & returns events only from that index]
[Multiple indexes also allow limiting access by user role – admins can control who sees what data]
[Retention of data – Separate indexes allow custom retention policies by index]
Module-5
Search & Reporting App – The Search & Reporting app provides the default interface for searching and analyzing data. It enables you to create knowledge objects, reports, dashboards and more.
The Search & Reporting app's main interface has seven main components.
[Limiting a search by time is key to faster results and is a best practice]
[Commands that create statistics and visualizations are called transforming commands] – These are the commands that transform event data into a data table.
[By default, a search job will remain active for 10 minutes after it is run. After 10 minutes, Splunk will have to run the job again to return the results]
[Shared search jobs remain active for 7 days and will be readable by everyone, meaning that anyone you share the job with will see exactly the same results you did when you first ran it]
Modes:
01. Fast Mode – Returns only information on default fields and fields required to fulfill your search. [Field discovery is disabled in Fast Mode]
02. Smart Mode – [default] Toggles behavior based on the type of search you are running.
03. Verbose Mode – Returns as much field and event data as possible, discovering all fields it can.
[Selecting or zooming into events uses your original search job]
[When you zoom out, Splunk runs a new search job to return the newly selected events]
[By default, events are shown in a List view, but there are options to display them as Raw events or in a Table]
Exploring Search Term Options in Splunk:
Uppercase Booleans (AND, OR, NOT) can be used to combine multiple terms:
· Failed NOT password
· Failed OR password
· Failed password – [If no Boolean is used, AND is implied]
[Boolean operations have an order of evaluation: NOT, OR, AND] [Parentheses can be used to control the evaluation]
· Failed NOT (success OR accepted)
· “failed password” – exact phrases can be searched by placing them in quotes
Escaping characters in a search:
01. Info=”user “chrisV4” not in database” – the unescaped inner quotes break the phrase
02. Info=”user \”chrisV4\” not in database” – backslashes escape the inner quotes
Module-6
Fields info – The fields sidebar shows all the fields that were extracted at search time. Fields are broken down into a Selected Fields list and an Interesting Fields list.
Selected fields are fields of the utmost importance to you.
Interesting fields have values in at least 20% of the events.
In the interesting fields:
- a denotes a string value
- # denotes a numerical value
[When you add a field to the Selected Fields list, that field will show in the events where it occurs and persists for subsequent searches]
Searching Fields – You can refine and run more efficient searches by using fields in them.
· sourcetype=linux_secure
[Field names are case sensitive while values are not / field name = sourcetype, value = linux_secure]
Field Operators:
Operators {= or !=} can be used with numerical or string
values. Operators {>, >=, <, <=} can be used with numerical values.
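A hedged illustration of the operators, reusing the web access data that appears later in these notes (assuming the status field holds numeric HTTP codes):
· sourcetype=access_combined status!=200
· sourcetype=access_combined status>=400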
Module-7
Best practices in Splunk:
01. Time
02. Index, source, host and sourcetype
[The above fields are extracted at index time, so they will not be extracted during each search]
[The more you tell the search engine, the more likely it is that you will get good results]
[Inclusion is generally better than exclusion] – e.g., searching for “access denied” is better than searching for NOT “access granted”
[Time abbreviations are used to tell Splunk what time range to search]
Index – One way we can filter events early in our search is by using the index. Indexes store data for searching. Splunk administrators will often use multiple indexes to segregate data.
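A minimal sketch combining these practices – index, sourcetype and a time abbreviation narrow the search before the search terms do (the security index name is an assumption for illustration):
· index=security sourcetype=linux_secure "failed password" earliest=-2h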
Module-8
The Splunk Search Language: The search language is made up of five components:
01. Search terms – what we are looking for; the foundation of the search
02. Commands – tell Splunk what we want to do with the search results, such as creating charts, computing statistics and formatting
03. Functions – explain how we want to chart, compute and evaluate the results
04. Arguments – the variables that we want to apply to the function
05. Clauses – explain how we want the results grouped or defined
E.g., sourcetype=acc* status=200 | stats list(product_name) as “Game Sold”
sourcetype=acc* status=200 → Search Terms
stats → Command (Blue)
list → Function (Purple)
product_name → Argument (Green)
as → Clause (Orange)
Visual syntax tools for SPL:
CTRL + \ (or Command + \ on a Mac) moves each pipe to a new line.
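For example, one of the vendor_sales searches from later in these notes, with the pipe moved to its own line:
index=sales sourcetype=vendor_sales
| top Vendor limit=5 showperc=False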
Fields Command: The fields command includes or excludes fields from search results. It is useful to limit the fields displayed and can make searches faster: field extraction is one of the most costly parts of searching in Splunk. Field exclusion happens after field extraction, only affecting the displayed results.
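A minimal sketch of both forms, assuming a clientip field exists in the web access data (no prefix or + includes the listed fields, - excludes them):
· index=web sourcetype=access_combined | fields status, clientip
· index=web sourcetype=access_combined | fields - status, clientip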
Table Command: The table command retains searched data in a tabulated format.
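For instance, with the same assumed fields as above:
· index=web sourcetype=access_combined | table clientip, action, status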
Rename Command: It is used to rename fields. Once renamed, the original name is not available to subsequent search commands; the new field names will need to be used further down the pipeline.
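A short sketch (clientip is an assumed field, as above):
· index=web sourcetype=access_combined | table clientip, status | rename clientip as "Client IP", status as "HTTP Status"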
Dedup Command: It is used to remove duplicate events from the results that share common values. You can pass a single field or multiple fields to the dedup command.
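A sketch with a single field and with multiple fields, reusing index and field names from the examples in Module-9:
· index=bcgassets sourcetype=asset_list | dedup Employee
· index=sales sourcetype=vendor_sales | dedup Vendor product_name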
Sort Command: The sort command will let you display the results in ascending or descending order. The sort command can also limit the results with the limit argument.
[String data is sorted alphanumerically]
[Numeric data is sorted numerically]
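For example, sorting vendor sales by price (- before a field sorts descending, + ascending) and capping the output at 10 results:
· index=sales sourcetype=vendor_sales | sort - sale_price
· index=sales sourcetype=vendor_sales | sort limit=10 - sale_price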
Module-9
Transforming Commands: These commands order the search results into a
data table that Splunk can use for statistical purposes.
Top Command: It finds the most common values of a given field.
· index=sales sourcetype=vendor_sales | top Vendor limit=20
· index=sales sourcetype=vendor_sales | top Vendor product_name limit=0
Top command clauses:
limit = int
countfield = string
percentfield = string
showcount = True/False
showperc = True/False
useother = True/False
otherstr = string
· index=sales sourcetype=vendor_sales | top Vendor limit=5 showperc=False
· index=sales sourcetype=vendor_sales | top Vendor limit=5 showperc=False countfield=”Number of Sales” useother=True
Using the “by” clause:
· index=sales sourcetype=vendor_sales | top product_name by Vendor limit=3 countfield=”Number of sales” showperc=False
Rare Command: It shows the least common values of a field set.
· index=sales sourcetype=vendor_sales | rare Vendor limit=5 showperc=False countfield=”Number of Sales” useother=True
Using the “by” clause:
· index=sales sourcetype=vendor_sales | rare product_name by Vendor limit=3 showperc=False countfield=”Number of Sales” useother=True
Stats Command: To produce statistics from our search results, we use the stats command. Some of the common stats functions are:
01. count – Returns the number of events matching the search criteria
02. distinct_count (dc) – Returns the count of unique values for a given field in the search results
03. sum – Returns the sum of numerical values
04. avg – Returns the average of numerical values
05. min – Returns the minimum numerical value
06. max – Returns the maximum numerical value
07. list – Lists all values of a given field
08. values – Returns the unique values of a given field; it works like the list function, but with duplicates removed
· index=sales sourcetype=vendor_sales | stats count as “Total sales by Vendors” by product_name, categoryid, sale_price
· [| stats count(field)]
· index=web sourcetype=access_combined | stats count(action) as ActionEvents, count as “Total Events”
· index=sales sourcetype=vendor_sales | stats distinct_count(product_name) as “Number of games for sale by vendors” by sale_price
· index=sales sourcetype=vendor_sales | stats sum(price) as “Gross Sales” by product_name
· index=sales sourcetype=vendor_sales | stats count as “Units Sold” sum(price) as “Gross Sales” by product_name → [When using stats, count and sum should be in the same pipe]
· index=sales sourcetype=vendor_sales | stats avg(sale_price) as “Average Price” → [Missing or misformatted values are not added to the calculation]
· index=sales sourcetype=vendor_sales | stats avg(sale_price) as “Average Price”, min(sale_price) as “Min Price”, max(sale_price) as “Max Price” by categoryId
· index=bcgassets sourcetype=asset_list | stats list(Asset) as “company assets” by Employee
· index=network sourcetype=cisco_wsa_squid | stats values(s_hostname) by cs_username
Module-11
Pivot – It allows users to design reports in a simple-to-use interface without ever having to craft a search string.
Data Models are knowledge objects that provide the data structure that drives Pivots.
These are created by users with Admin or Power roles who have knowledge of the search language and a solid understanding of the data.
The Data Model is the framework and the Pivot is the interface to the data. Each data model is made up of datasets. Datasets are smaller collections of your data, defined for a specific purpose. They are represented as tables, with field names for columns and field values for cells.
If you need to create a report but a data model does not currently exist, the Instant Pivot tool can get you working with the data without first creating a data model.
By entering a non-transforming command into the search bar, we will see a button in the Statistics and Visualization search results tabs.
The datasets that make up the data models can also be helpful in other ways. Allowing our users access to small slices of data can help them gain operational intelligence from the data without having to use the Splunk search language.
Datasets help users find data & get answers faster. Splunk also has a Datasets Add-on that you can download from Splunkbase. The Add-on allows you to rapidly build dataset tables without using the Splunk search language.
Module-12
Lookups – Lookups allow you to add fields and values to your events that are not included in the indexed data. We can combine fields from sources external to the index with searched events, based on paired fields present in the events. Sources might include CSV files, scripts or geospatial data.
[A lookup is categorized as a dataset]
There are two steps to set up a lookup file:
01. Define a lookup table
02. Define the lookup
(Optionally, you can configure your lookup to run automatically. Once defined, lookup field values are case-sensitive by default)
Create a Lookup Table -
Settings -> Lookups -> Lookup table files ->
Destination App:
Upload a lookup file: (http_status.csv)
Destination filename:
To verify the table, search: | inputlookup http_status.csv
Define a Lookup -
[Now that we have a table with our lookup data, we need to define the lookup]
Settings -> Lookups -> Lookup definitions ->
Destination App:
Name:
Type: (File-based/External/KV Store/Geospatial)
Lookup file:
The Lookup Command -
[index=web sourcetype=access_combined NOT status=200 | lookup http_status code as status OUTPUT code as "HTTP Code", description as "HTTP Description" | table host, "HTTP Code", "HTTP Description"]
[Input fields are not automatically returned by the lookup command]
By default, all fields in the lookup table except the input field are returned as output fields. We can choose which fields the lookup returns by adding an OUTPUT clause.
[If there are existing fields with the same names, they will be overwritten. We can use the OUTPUTNEW clause instead, which does not overwrite existing fields]
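A hedged sketch of OUTPUTNEW with the lookup defined above; the description is only added to events that do not already have an "HTTP Description" field:
· index=web sourcetype=access_combined NOT status=200 | lookup http_status code as status OUTPUTNEW description as "HTTP Description"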
Creating an Automatic Lookup -
Settings -> Lookups -> Automatic lookups ->
Destination app:
Name:
Lookup table:
Apply to:
Lookup input fields:
Lookup output fields:
index=web sourcetype=access_combined NOT status=200 | table
host, "Code", "Description"
Additional Lookup Options -
In addition to file-based lookups, you can also populate a lookup table with search results (see the sketch after this list).
Define a lookup based on an external script or command.
Use the Splunk DB Connect application to create lookups based on external databases.
Use geospatial lookups to create queries that can be used to generate choropleth map visualizations.
Populate events with KV Store fields.
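A minimal sketch of the first option – populating a lookup table from search results with the outputlookup command (the status_counts.csv file name is an assumption):
· index=web sourcetype=access_combined | stats count by status | outputlookup status_counts.csv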
Module-13
Scheduled Reports – A scheduled report is a report that runs on a scheduled interval and can trigger an action each time it runs.
[Running concurrent reports, and the searches behind them, can put a big demand on your system hardware even if everything is configured to the recommended specs]
[Include a schedule window only if the report doesn't have to start at a specific time...and you're OK with the delay]
Embedded Report -
[Embedding a report – Anyone with access to the web page will be able to see the report]
[An embedded report will not show data until the scheduled search is run. Once embedding is enabled, we are no longer able to edit the report]
Alerts – Alerts are based on searches that run on scheduled intervals or in real time. You can have Splunk alert you when the results of a search meet defined conditions. Alerts are triggered when the search is completed. When triggered, an alert can:
01. List in interface
02. Log events
03. Output to lookup
04. Send to a telemetry endpoint
05. Trigger scripts
06. Send emails
07. Use a webhook
08. Run a custom alert
There are two types of alerts: scheduled and real-time.
The scheduled alert type allows you to set a schedule and time range for the search to be run.
The real-time alert type will run the search continuously in the background. [Since real-time alerts run continuously, they can place more overhead on system performance]