Fundamental-2
Module-1
Introduction
Fundamentals One Refresher -
Splunk Search Terms:
01. Keywords [Keywords can be searched alone or combined; multiple keywords are
implicitly ANDed together]
02. Booleans [Boolean operators (AND, OR, NOT) must be uppercase]
03. Phrases [Exact phrases can be searched by placing the phrase in quotes]
04. Fields [We can also search on an extracted field by typing a field-value
pair into the search. Field names are case sensitive; field values are
not]
05. Wildcards
[Wildcards can be used at any point in keyword text and field values]
[Using a wildcard at the beginning of a keyword or field is very
inefficient]
06. Comparisons [Comparison operators can be used to filter events.]
[Supported operators are: = equal, != not equal, < less than, <= less than or
equal to, > greater than, >= greater than or equal to]
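A hedged example combining these search term types (the index, fields and values below are illustrative):
[index=web sourcetype=access_combined "payment failed" NOT action=purchase
clientip=10.2.* status>=400]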
Commonly used commands are,
fields - allows you to include or exclude specific fields from search
results.
[sourcetype=access_combined | fields clientip, action]
table - returns the specified fields in a table format.
[sourcetype=access_combined | table clientip, action]
rename - can be used to rename fields.
[sourcetype=access_combined | rename clientip as "userip"]
dedup - removes duplicate events from the results that share common values.
[sourcetype=access_combined | dedup clientip]
sort - allows you to display your results in ascending or descending order.
[sourcetype=access_combined | sort - price]
lookup - adds field values from external sources.
[sourcetype=access_combined | lookup dnslookup clientip]
Transforming commands - are used to order search results into a data table that
Splunk can use for statistical purposes. They are required to transform search
results into a visualization.
top & rare - quickly find the most common and rarest values in a result set.
stats - produces statistical information from our search results.
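Hedged examples of these transforming commands, using the same sample sourcetype as above (field names are illustrative):
[sourcetype=access_combined | top limit=5 product_name]
[sourcetype=access_combined | rare action]
[sourcetype=access_combined | stats count by status]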
Module-2
Beyond basic search fundamentals
[If a command references a specific value, that value will be case
sensitive]
[eg., the replace command]
[sourcetype=access_combined purchase | replace www1 with server1 in host]
[Field values from a lookup are case sensitive by default. A user with the
Admin role can choose to make values case insensitive when creating a lookup
table, but it is best to assume that this is not the case when searching.]
[Boolean operators are case sensitive. If a Boolean operator is not uppercase,
it is treated as a literal keyword]
[When searching using tags, tag values are case sensitive]
[When using regular expressions with commands, the regex terms follow the case
sensitivity defined in the expression]
[Buckets]
01. When Splunk ingests data, it is stored in buckets.
02. Buckets are directories containing a set of raw data and indexing data.
03. Buckets are configurable with a maximum size and a maximum time
span.
04. There are three kinds of searchable buckets in Splunk: Hot, Warm and
Cold.
Hot - As events are indexed, they are placed in hot buckets. Hot buckets are
the only writeable buckets.
A hot bucket rolls to a warm bucket when,
- Maximum size is reached
- Time span is reached
- The indexer is restarted
Warm - Upon rolling, the bucket is closed, renamed and changed to "read only"
status. Warm buckets are renamed with time stamps showing the youngest and
oldest events in the bucket.
A warm bucket rolls to a cold bucket when,
- Maximum size is reached
- Time span is reached
Cold - The bucket is typically stored in a different location than the hot and
warm buckets. This allows it to be stored on slower, more cost-effective
infrastructure.
Using Wildcards -
01. Wildcards are tested after all other search terms.
02. Only trailing wildcards make efficient use of the index.
[sourcetype=access*]
03. Wildcards at the beginning of a string cause Splunk to search all
events.
04. Wildcards in the middle of a string produce inconsistent results.
05. Avoid using wildcards to match punctuation.
06. Be as specific as possible in search terms.
Search Modes -
Knowing when to use the appropriate search mode can make your search more
efficient or allow better access to your data for discovery.
Fast Mode - It emphasizes performance and returns only essential data. When
running a non-transforming search in this mode, only the fields required for
the search are extracted and displayed in events. As with all non-transforming
commands, statistics and visualizations are not available, but patterns are. If
we run a transforming command, events and patterns are no longer returned, but
we have access to statistics and visualizations.
Verbose Mode - It emphasizes completeness by returning all field and event
data. If we run a non-transforming search in this mode, we get events and
patterns (as in Fast Mode), but all fields for the events are extracted and
displayed in the side bar. If we run a transforming search in this mode, we get
access to statistics and visualizations, but we can also see patterns and
events.
Smart Mode - It is designed to return the best results for the search being
run, using a combination of both fast and verbose modes. If we use a
non-transforming search, it acts like verbose mode, returning all fields for
events and access to patterns. If we use transforming commands, it acts like
fast mode.
General Best Practices,
01. The less data you have to search, the faster Splunk will be.
02. Fields extracted at index time do not need to be extracted for each search.
(time, index, source, host and sourcetype)
03. Inclusion is generally better than exclusion. (searching for "access
denied" is better than NOT "access granted")
Use the appropriate search mode,
Fast mode for performance
Verbose mode for completeness
Smart mode for the combination of both
Search Job Inspector,
There may be times you need to tune a search to make it more efficient. The
Search Job Inspector is a tool that can be used to troubleshoot the performance
of searches and determine which phase of a search takes the most time. It
dissects the behavior of searches to help you understand the cost of knowledge
objects, search commands and other components within the search. Any search job
that has not expired can be inspected.
Module-3
Splunk will allow you to visualize your data in many ways. Any
search that returns statistical values can be viewed as a chart. Most
visualizations require results structured as tables with at least two
columns.
The chart command can take two clause
statements (over & by).
Over - It tells Splunk which field you want to be on the X axis.
[Any stats function can be applied to the chart command.]
[index=web sourcetype=access_combined status>299 | chart count over
status]
status is the x-axis and count is the y-axis. The y-axis must always be
numeric so that it can be charted.
By - The "by" clause comes into play when we want to split our data
by an additional field.
[index=web sourcetype=access_combined status>299 | chart count over status
by host]
Unlike the stats command, only one field can be specified after the
"by" modifier when using the "over" clause. If two fields are supplied to
the "by" clause without an "over" clause, the first field is used as the
"over" clause.
[index=web sourcetype=access_combined status>299 | chart count by
status, host]
[index=web sourcetype=access_combined status>299 | chart count over host by
product_name]
[index=web sourcetype=access_combined status>299 | chart count over host by
product_name usenull=false] --> to remove null values from our data
[index=web sourcetype=access_combined status>299 product_name=* | chart
count over host by product_name] --> this removes null values in the
initial search, which is more efficient
{The chart command is limited to 10 columns by default; additional values are
grouped into an "OTHER" column unless the limit argument is changed}
[index=web sourcetype=access_combined status>299 product_name=* | chart
count over host by product_name useother=false] --> to remove the
"OTHER" column
[index=web sourcetype=access_combined status>299 product_name=* | chart count
over host by product_name limit=5] --> to set the number of product columns
shown
[index=web sourcetype=access_combined status>299 product_name=* | chart
count over host by product_name limit=0] --> using "limit=0" to
display all of the products
Timechart command - Performs stats aggregations against time. Time is always
the X axis.
[index=sales sourcetype=vendor_sales | timechart count]
[index=sales sourcetype=vendor_sales | timechart count by product_name]
As with chart, any stats function can be applied with the timechart command,
and only one field can be specified after the "by" modifier. The limit,
useother and usenull arguments are also available to timechart. The timechart
command intelligently clusters data into time intervals dependent on the time
range selected.
To change the time span of the clusters, use the span argument with the time
unit to group by.
[index=sales sourcetype=vendor_sales | timechart span=12hr sum(price) by
product_name limit=0]
We may want to compare data over specific time periods, and Splunk
provides the timewrap command.
[index=sales sourcetype=vendor_sales product_name="Dream Crusher" |
timechart span=1d sum(price) by product_name | timewrap 7d | rename _time as
Day | eval Day = strftime(Day, "%A")]
Line graph -
Chart overlay - will allow you to lay a line chart of one series over another
visualization.
[index=main (sourcetype=access_combined action=purchase status=200) OR
sourcetype=vendor_sales | timechart sum(price) by sourcetype | rename
access_combined as "web_sales"]
Area chart - The difference between a line graph and area formatting is the
ability to show the stack.
Column chart - It also allows you to stack data.
Bar graph - uses horizontal bars to show comparisons and can be stacked.
Pie chart - It takes the data and visualizes the percentage for each slice.
Scatter chart - It shows the relationship between two discrete data values,
plotted on an X & Y axis.
Bubble chart - We can add more versatility by adding a bubble chart. This
provides a visual way to view a third dimension of data. Each bubble plots
against two dimensions on the X & Y axis. The size of the bubble represents the
value of the third dimension.
Trellis layout - It allows us to split our visualization by a selected field or
aggregation. While we get multiple visualizations, the originating search is
only run once.
[Additional visualizations can be downloaded from Splunkbase]
Module-4
There are several options for representing data that includes
geographical information.
iplocation - It is used to look up and add location information to events.
Data such as city, country, region, latitude and longitude can be added to
events that include external IP addresses.
[index=security sourcetype=linux_secure action=success src_ip!=10.* |
iplocation src_ip]
Depending on the IP, not all location information might be available
for it. This is the nature of geolocation and should be taken into
consideration when searching your data.
If you are collecting Geographical data, you can use the Geostats
command to aggregate the data for use on a map visualization. The Geostats
command uses the same functions as the stats command.
[index=sales sourcetype=vendor_sales | geostats latfield=VendorLatitude
longfield=VendorLongitude count]
[index=sales sourcetype=vendor_sales | geostats latfield=VendorLatitude
longfield=VendorLongitude count by product_name]
Unlike the stats command, the Geostats command only accepts one "by"
argument. To control the column count, the globallimit argument can be
used.
[index=sales sourcetype=vendor_sales | geostats latfield=VendorLatitude
longfield=VendorLongitude count by product_name globallimit=4]
[You can look up geographical data to use with geostats using the iplocation
command]
[index=security sourcetype=linux_secure action=success src_ip!=10.* | iplocation
src_ip | geostats latfield=lat longfield=lon count]
Choropleth map - It is another way to see your data as a geographical
visualization. It allows us to use shading to show relative metrics over
predefined locations on a map.
[In order to use a choropleth map, you need a .kmz or compressed Keyhole Markup
Language file that defines the region boundaries]
To prepare our events for a choropleth map, we use the geom command. It adds a
field with geographical data structures matching polygons on the map.
[index=sales sourcetype=vendor_sales VendorID>=5000 AND VendorID<=5055 |
stats count as Sales by VendorCountry | geom geo_countries
featureIdField=VendorCountry]
::geo_countries - the name of the KMZ file, also known as the
featureCollection::
::the featureIdField argument is also required::
Single value visualizations -
When the result contains a single value, there are two different types of
visualizations you can use to display it.
You can pipe the results into the gauge command,
[index=web sourcetype=access_combined action=purchase | stats sum(price)
as total | gauge total 0 30000 60000 70000]
[Once the color range format is set, it stays persistent across the radial,
filler and marker gauges]
The trendline command computes moving averages of field values, giving you a
clear understanding of how your data is trending.
[index=web sourcetype=access_combined action=purchase status=200 | timechart
sum(price) as sales | trendline wma2(sales) as trend]
The trendline command requires three arguments,
Trendtype:
- simple moving average / sma
- exponential moving average / ema
- weighted moving average / wma
sma/ema/wma compute a moving average of data points over a period of time;
wma and ema assign a heavier weighting to more recent data points.
The number "2" averages the data points over every two days (the period).
The field "sales" defines the field to calculate the trend from.
Addtotals command - It computes the sum of all numeric fields for each event
and creates a total column.
[index=web sourcetype=access_combined file=* | chart sum(bytes) over
host by file | addtotals col=true label="Total"
labelfield="host" fieldname="Total by host"
row=false]
col - a column summary row is created by setting the "col" argument to true
label - the summary row is created unlabeled; we add a label by setting the
"label" argument to the name to use
labelfield - the "labelfield" argument specifies the field in which to show
the label
fieldname - the name of the per-event total field can be changed using the
"fieldname" argument
row - the per-event total field can be removed by setting the "row" argument
to false
Module-5
Eval command - It is used to calculate and manipulate field values. Arithmetic,
concatenation & Boolean operators are supported by the command. Results can be
written to a new field or replace an existing field. Field values created by
the eval command are case sensitive.
[index=network sourcetype=cisco_wsa_squid | stats sum(sc_bytes) as Bytes by
usage | eval bandwidth = Bytes/1024/1024]
[index=network sourcetype=cisco_wsa_squid | stats sum(sc_bytes) as Bytes by
usage | eval bandwidth = round(Bytes/1024/1024,2)]
[index=network sourcetype=cisco_wsa_squid | stats sum(sc_bytes) as Bytes by
usage | eval bandwidth = round(Bytes/1024/1024,2) | sort -bandwidth | rename
bandwidth as "Bandwidth (MB)" | fields - Bytes]
Along with converting values, the eval command allows you to perform
mathematical functions against fields with numerical values.
[index=web sourcetype=access_c* product_name=* action=purchase | stats
sum(price) as total_list_price, sum(sale_price) as total_sale_price by product_name
| eval discount = round(((total_sale_price - total_list_price) /
total_list_price)*100) | sort - discount | eval discount =
discount."%"]
Converting values with the eval command -
tostring function - It converts numerical values to strings. The tostring
function also allows formatting of strings, which lets you format values as
time, hexadecimal numbers or numbers with commas.
[index=web sourcetype=access* product_name=* action=purchase | stats
sum(price) as total_list_price, sum(sale_price) as total_sale_price by product_name
| eval total_list_price = "$" +
tostring(total_list_price,"commas")]
[After using the tostring function, the field values might not sort numerically
because they are now string (ASCII) values]
The fieldformat command - It can be used to format values without changing the
characteristics of the underlying values. It uses the same functions as the
eval command.
[index=web sourcetype=access* product_name=* action=purchase | stats
sum(price) as total_list_price, sum(sale_price) as total_sale_price by
product_name | eval total_list_price = "$" +
tostring(total_list_price,"commas") | fieldformat total_sale_price =
"$" + tostring(total_sale_price,"commas")]
[Fields formatted with fieldformat can still be sorted numerically because the
formatting happens at display level without changing the underlying data]
[While eval creates new field values, the underlying data in the index does
not change]
Multiple eval commands can be used in a search. Since eval creates a new
field, subsequent commands can reference the results of the eval commands that
come before them.
[index=web sourcetype=access_combined price=* | stats values(price) as
list_price, values(sale_price) as sale_price by product_name | eval
current_discount=round(((list_price - sale_price)/list_price) * 100) | eval
new_discount = (current_discount - 5) | eval new_sale_price = list_price -
(list_price * (new_discount/100)) | eval price_change_revenue = (new_sale_price
- sale_price)]
[The eval command has an "if" function that allows you to evaluate an
expression and assign defined values to a field depending on whether the
expression is true or false]
if(x,y,z)
x--boolean expression
y--it executes if x is true
z--it executes if x is false
[y & z must be in double quotes if not numerical]
[index=sales sourcetype=vendor_sales | eval SalesTerritory =
if(VendorID < 4000,"North America","Rest of the World")
| stats sum(price) as TotalRevenue by SalesTerritory]
[The eval "case" function behavios much like "if" function
but can take multiple boolean expressions and return the corresponding argument
that is true]
[index=web sourcetype=access_combined | eval httpCategory=case(status>=200
AND status<300,"Success")]
[index=web sourcetype=access_combined | eval httpCategory=case(status>=200
AND status<300,"Success", status>=300 AND
status<400,"Redirect", status>=400 AND
status<500,"Client Error", status>=500,"Server
Error")]
[If an event doesn't fit any of the cases, no value is assigned. If you want to
make sure a value is always returned from the case function, add a final
condition that evaluates to true]
[index=web sourcetype=access_combined | eval httpCategory=case(status>=200
AND status<300,"Success", status>=300 AND
status<400,"Redirect", status>=400 AND
status<500,"Client Error", status>=500,"Server Error",
true(),"Something Weird Happened")]
Eval commands can be wrapped in transforming commands.
[index=web sourcetype=access_combined | stats count(eval(status<300)) as
"Success", count(eval(status>=400 AND status<500)) as
"Client Error", count(eval(status>500)) as "Server
Error"]
A few things to note about using eval inside transforming commands,
["as" clause is required for transforming commands]
['"' double quotes are required for field values]
[resulting field values are case sensitive]
The search command - can be used to filter results at any point in the search.
The command behaves exactly like the search terms before the first pipe, but
allows you to filter your results further down the search pipeline.
[index=network sourcetype=cisco_wsa_squid usage=Violation | stats count(usage)
as Visits by cs_username| search Visits > 1]
[Remember: If you can filter events before the first pipe, do it there for a
more efficient search]
The where command - uses the same expression syntax as eval and many of the
same functions, but filters events to keep only the results that evaluate to
true.
[index=network sourcetype=cisco_wsa_squid | stats
count(eval(usage="Personal")) as Personal,
count(eval(usage="Business")) as Business by cs_username | where
Personal > Business | sort -Personal | where cs_username!="lsagers" |
sort -Personal]
[In the real world, never use a where command when you can filter
by search terms]
[Inside an eval or where command, the asterisk (*) can't be used as a
wildcard; instead, use the like operator with either the "%" (percent) or
"_" (underscore) character]
% (percent) - matches multiple characters
_ (underscore) - matches exactly one character
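A hedged sketch of the like function used with the where command (the field and pattern are illustrative):
[index=sales sourcetype=vendor_sales | where like(product_name, "Dream%")]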
If you want to evaluate whether a field is null or not, use the "isnull"
function (or "isnotnull" for the opposite)
[index=sales sourcetype=vendor_sales | timechart sum(price) as sales | where
isnull(sales)]
[index=sales sourcetype=vendor_sales | timechart sum(price) as sales | where
isnotnull(sales)]
When evaluating values with the where command, the values are case sensitive.
[index=sales sourcetype=vendor_sales | where product_name="final
sequel"] --> returns no results
[index=sales sourcetype=vendor_sales | where product_name="Final
Sequel"] --> this returns results, because the product_name value is case
sensitive when using the where command
If you use single quotes, Splunk will treat the string as a field name.
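A hedged illustration: with single quotes, 'sale_price' is treated as a field name, so this keeps events where the two fields are equal (double quotes would compare price to the literal string "sale_price"):
[index=web sourcetype=access_combined | where price='sale_price']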
The Fillnull command - It replaces any null
values in your events.
If you run a report that includes nulls for some data, your report is displayed
with empty fields.
[index=sales sourcetype=vendor_sales | chart sum(price) over product_name by
VendorCountry | fillnull]
By default, the null values are replaced with 0 (zero), but by using the
"value" argument any string can be used.
[index=sales sourcetype=vendor_sales | chart sum(price) over product_name by
VendorCountry | fillnull value="nothing to see here"]
Module-6
Transaction command – A transaction is any group of related events
that span time. These events can come from multiple applications or hosts.
Events related to a purchase from an online store can span
an application server, a database and an e-commerce engine.
One email message can create multiple events as it travels
through various queues. Each event in network traffic logs represents a single
user generating a single HTTP request.
Visiting a website normally generates multiple HTTP
requests for HTML, JavaScript, Flash, CSS files, images, etc.
[index=web sourcetype=access_combined | transaction clientip]
--> We get a list of events that share the same client IP
[index=web sourcetype=access_combined | transaction clientip
| table clientip, action, product_name]
The Transaction command can create two fields in raw events,
duration and eventcount.
Duration – The duration is the time difference between first
and last event in the transaction.
Eventcount – The eventcount is the number of events in the
transaction.
These fields can be used with statistics and reporting
commands,
[index=web sourcetype=access_combined | transaction clientip
| timechart avg(duration)]
The transaction command includes some definition options,
the most common being maxspan, maxpause, startswith & endswith.
Maxspan – It sets the maximum total time between the earliest and
latest events.
Maxpause – It sets the maximum pause allowed between events.
Startswith – It allows forming transactions starting with specified terms,
field values & evaluations.
Endswith – It allows forming transactions ending with specified terms,
field values & evaluations.
[index=web sourcetype=access_combined | transaction clientip
startswith="addtocart" endswith="purchase" | table clientip, action,
product_name]
The transaction command is incredibly handy when you need to
investigate an item, for example if you want to see which emails are rejected
by your email security device.
[index=network sourcetype=cisco_esa REJECT]
[index=network sourcetype=cisco_esa | transaction mid dcid
icid | search REJECT]
Since transactions are incredibly powerful, you might want to use them instead
of stats, but there are specific reasons to use one or the other.
Transactions –
01. Use transaction to see events correlated together
02. Use when events need to be grouped on start and end values
[By default, there is a limit of 1000 events per transaction]
Stats –
01. Use stats to see the results of a calculation
02. Use when events need to be grouped on a field value
03. The stats command is faster and more efficient, so when you have the
choice, use stats
[Stats does not have an event limit]
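A hedged pair of searches illustrating the difference, reusing the access_combined fields from the examples above:
[index=web sourcetype=access_combined | transaction clientip
startswith="addtocart" endswith="purchase" | stats avg(duration)]
[index=web sourcetype=access_combined | stats count by clientip]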
Module-7
What is a Knowledge Object? – Simply put, they are tools that
help you and your users discover and analyze your data. They include,
· Data interpretation
· Classification
· Enrichment
· Normalization and
· Search-time mapping of knowledge, called Data Models
Knowledge objects are useful in Splunk for several reasons: they can
be created by one user and shared with other users based on
permissions, they can be saved and reused by multiple people or in multiple
apps, and they can be used in searches.
[Knowledge objects are powerful tools for your deployments]
Your Role –
· Oversee knowledge object creation and usage
· Implement best practices for naming conventions
· Normalize data
· Create data models
[Keep the tool box (knowledge objects) clean and efficient]
Naming conventions –
· Developing a naming convention will help us and our users know exactly what
each knowledge object does and will help keep the Splunk tool box uncluttered.
· Create knowledge objects with six segmented keys: Group, Type, Platform,
Category, Time and Description.
[OPS_WFA_Network_Security_na_IPwhoisAction]
Permissions –
· Permissions play a major role in creating and sharing knowledge objects in
Splunk
· There are 3 pre-defined ways knowledge objects can be displayed to users:
Private, Specific App & All Apps
· When a user creates an object, it is set to private by default and is only
available to that user
· Power and Admin users are allowed to create knowledge objects that can be
shared with all users of an app. They may allow other roles to edit the object
by granting that role write permissions
· Admin is the only user role that is allowed to make knowledge objects
available to all apps
· As with shared app objects, these are automatically made readable to all
users, but an admin can choose to grant read and write access per role
· Admins can also read and edit private objects created by any role
Manage Knowledge Objects –
· Knowledge objects can be centrally managed under the Knowledge header in the
Settings menu
· Users with the Admin role will see a "Reassign knowledge objects" button
CIM Intro –
· As mentioned, normalizing indexed data is a major part of your role as a
knowledge manager.
· In most Splunk deployments, data comes from multiple sourcetypes; as a
result, the same values of data can occur under many different field names
Eg.,
sourcetype=access_combined – field: "clientip"
sourcetype=cisco_wsa_squid – field: "userIP"
· At search time, we may want to normalize these different occurrences to a
common structure and naming convention, allowing us to correlate events from
both source types
· Splunk supports the use of a "Common Information Model" or CIM to provide a
methodology for normalizing values to a common field name
· CIM uses a schema to define standard fields between sources. We can use
knowledge objects to help make these connections
Module-8
The Field Extractor – It is a utility that allows you to use a graphical
user interface to extract fields that persist as knowledge objects, making them
reusable in searches.
There are 2 different methods the field extractor can use
to extract data:
· Regular expression
· Delimiters
Regular expressions work well when you have unstructured
data and events that you want to extract fields from. The field extractor will
automatically build regular expressions from the samples you provide.
Delimiters are used when events contain fields separated by
a character.
There are 3 ways to access the field extractor utility,
01. From the Fields menu in Settings
02. From the fields sidebar
03. From an event's actions menu
· The workflow changes depending on how you access the field extractor and
which method you choose. The easiest way to extract a field is using the event
actions menu.
Extracting Fields : RegEx – If you manually edit the regular expression, you
will not be returned to the field extractor utility after doing so.
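A hedged sketch of the kind of extraction the field extractor builds; the same idea can be expressed inline with the rex command using a named capture group (the pattern and the field name src_ip_extracted are illustrative):
[index=security sourcetype=linux_secure "Failed password" | rex "from
(?<src_ip_extracted>\d+\.\d+\.\d+\.\d+)" | top src_ip_extracted]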
Extract with Delimiter -
Extracting Multiple Fields – The field extractor also makes it easy to extract
multiple fields from overlapping values.
Module-9
Field Alias – It gives you a way to normalize data over multiple sources. You
can assign one or more aliases to any extracted field and can also apply them
to lookups.
Normalizing the sourcetypes below by aliasing their user fields to the common
field "Employee":
sourcetype=cisco_firewall field="Username"
sourcetype=winauthentication_security field="User"
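A hedged sketch of searching on the aliased field once the alias exists (the index name and value are illustrative):
[index=security (sourcetype=cisco_firewall OR sourcetype=winauthentication_security)
Employee="djohnson"]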
Calculated Fields – If you find yourself writing repetitive, long or
complex eval commands, calculated fields can save you a lot of time and
headaches.
[Calculated fields must be based on an extracted or
discovered field]
[Output fields from a lookup table or fields generated from
within a search string are not supported]
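A hedged example: assuming a calculated field named bandwidth_mb (hypothetical name) has been defined for the cisco_wsa_squid sourcetype with the expression round(sc_bytes/1024/1024,2), it can then be used in a search like any extracted field:
[index=network sourcetype=cisco_wsa_squid | stats sum(bandwidth_mb) by usage]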
Module-10
Tags –
Tags in Splunk are knowledge objects that allow you to
designate descriptive names for key-value pairs. They enable you to search for
events that contain particular field values.
[index=web host=www*]
www1 & www2 are in San Francisco
www3 is in London
We will use tags to give these hosts function and location labels.
Creating Tags –
We can create tags by clicking an event's information link
and then clicking the Actions link for the field-value pair we want to tag.
[index=security tag=SF]
[tag values are case sensitive in a search]
Event Types – They allow you to categorize events based on
search terms.
Creating an Event Type from a search -
Event type builder – An event type can also be built using
the event type builder.
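A hedged example, assuming an event type named web_error (hypothetical name) has been saved from a search such as status>=400; the eventtype field can then be used like any other field:
[index=web sourcetype=access_combined eventtype=web_error | stats count by status]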
When to use Event Types vs Saved Reports: each option has
its own advantages depending on what you need to do with your data.
Event types –
· Allow you to categorize events based on a search string
· Use tags to organize your data
· Can be used with the "eventtype" field within a search string
· Event types don't include a time range
Saved reports –
· Used when the search criteria do not change (fixed search criteria)
· When you need to include a time range and formatting of the results
· When you want to share with other Splunk users
· When you want to add a report to dashboards
Module-11
Macros – are search strings, or portions of search strings, that can be
reused in multiple places within Splunk. They are useful when you frequently
run searches requiring similar or complicated search syntax.
There are a couple of things that make macros unlike any other
knowledge object:
· Macros allow you to store entire search strings, including pipes and eval
statements
· They are time-range independent, allowing the time range to be selected at
search time
· They allow you to pass arguments to the search
Create Macro –
[index=sales sourcetype=vendor_sales | stats sum(sale_price)
as total_sales by Vendor | eval total_sales = "$" +
tostring(round(total_sales,2),"commas")]
Settings --> Advanced search --> Add new in Search macros
Destination App: (search)
Macro Name: convertUSD
Definition: {This is the search string that will be expanded when
referenced – [eval total_sales = "$" +
tostring(round(total_sales,2),"commas")]}
[index=sales sourcetype=vendor_sales | stats sum(sale_price)
as total_sales by Vendor | `convertUSD`]
{Backticks tell Splunk that this is a macro and to
replace it with the search in the macro definition}
Macro Arguments – While this macro has saved us some
keystrokes, the goal should always be to make our macros as reusable as
possible.
The list of macros can be seen under Settings --> Advanced search --> Search macros
Destination App: (search)
Name: convertUSD(1)
Definition: eval $value$ = "$" +
tostring(round($value$,2),"commas")
Arguments: value
[index=sales sourcetype=vendor_sales | stats sum(sale_price)
as Total_Sales by Vendor | `convertUSD(Total_Sales)`]
- Any field name can be passed to the macro
[index=sales sourcetype=vendor_sales | stats sum(sale_price)
as Average_price by product_name | `convertUSD("Average_price")`]
Multiple Arguments –
Since we are using string functions with the eval command, our results sort
alphanumerically, which might not be the desired result. Let's add another
argument that allows users to choose whether to convert the currency with the
eval or the fieldformat command.
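A hedged sketch of such a two-argument macro (the argument name "commandname" and the macro name convertUSD(2) are illustrative):
Name: convertUSD(2)
Definition: $commandname$ $value$ = "$" +
tostring(round($value$,2),"commas")
Arguments: commandname, value
[index=sales sourcetype=vendor_sales | stats sum(sale_price)
as Total_Sales by Vendor | `convertUSD(fieldformat, Total_Sales)`]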
Expanding searches –
Splunk has a built-in search expansion tool that allows you
to preview your search, with macros expanded, without running it.
Ctrl+Shift+E (Command+Shift+E on Mac) opens the search expansion window
Module-12
Workflow Actions - let us create links within events that interact with
external resources or narrow down our search.
They use the HTTP GET or POST method to pass information to external sources or
pass information back to Splunk to perform a secondary search.
Workflow Action - GET Method
To create a workflow action - Settings --> Fields --> Workflow actions
(Add new)
Destination app
Name
Label - "Get WhoIs for $src_ip$" (this label is displayed in the UI when you
launch the action)
Apply only to the following fields - src_ip
URI - http://whois.domaintools.com/$src_ip$
Workflow Action - Search
A workflow action can also be used to launch a search.
Settings --> Fields --> Workflow actions
Destination app -
Name -
Label - Find other events for $src_ip$
Apply only to the following fields - src_ip
Apply only to the following event types -
Show action in - Event menu
Action type - search (selecting search will bring up the search configuration)
Search string - $src_ip$
Run in app - search
Open in view -
Run search in - New window
Module-13
Data Models Intro -
In the Fundamentals 1 course, you learned how to use the Pivot interface
to create reports and dashboards.
Pivot - It allows users to work with Splunk without ever having to understand
the Splunk search language.
Data Models - are hierarchically structured datasets. They consist of events,
searches & transactions.
You can think of the data model as the framework and Pivot as the interface to
the data.
Data Model Scenario - Some thought needs to go into creating our data models
before we build them.
With data models, users can use Pivot to search, report on and segment the
data any way they want.
[Any field can be made available to the data model]
We build dataset hierarchies by adding child datasets to the root
dataset.
Creating Root Datasets - Settings --> Data Models --> New Data
Models
Title -
ID -
App - "searching and reporting"
Description -
Add Dataset --> Root Event / Root Search
Root Event - It enables you to create hierarchies based on a set of events, and
is the most commonly used type of root data model object.
Root Search - It builds these hierarchies from a transforming search. Root
searches don't benefit from data model acceleration.
[Splunk suggests avoiding root searches whenever possible]
Root Transaction - These objects allow you to create datasets from groups of
related events that span time. They use an existing object from our data
hierarchy to group on.
Child Objects - They allow us to constrain or narrow down the events from the
objects above them in the hierarchical tree.
If we try to create a pivot with the current model, we can only use
inherited fields to split our data, which is not very helpful, so we will need
some additional fields.
Add fields -
01. Auto-Extracted - attributes are the fields Splunk extracts from our
data
02. Eval Expression - an attribute created by running an eval expression on
a field
03. Lookup - an attribute created using lookup tables
04. Regular Expression - allows us to create an attribute using a regular
expression on the data
05. Geo IP - an attribute created from GeoIP data in our events
[We select the fields we want to display and rename them for the end
user]
Transactions with Datasets - Do not benefit from data model acceleration.
Data Models in search -
[It is recommended to use the Pivot UI over the pivot command]
Manage Data Models - Settings --> Data Models
We can edit our data models or explore them in Pivot. We can also choose to
upload and restore data models from a backup file.
[Accelerating data models can make searches faster and more efficient]
Module-14
CIM - Common Information Model
01. Demystify CIM
02. Why make data CIM-compliant
03. How to validate compliance
[The same type of data can occur under different field names]
sourcetype=access_combined field "clientip"
sourcetype=cisco_wsa_squid field "userIP"
Using the CIM, we could normalize the different occurrences (clientip /
userIP) to a shared structure "SRC", allowing us to correlate the
clientip data with the userIP data under a shared field name.
Splunk provides the methodology for normalizing values to a common field name
by supporting the use of the CIM.
Using the CIM schema, we can make sure all our data maps to a defined method.
(Maps all data to a defined method)
Sharing a common language for field values. (Normalizes to a common
language)
You can normalize the data at index time or at search time using knowledge
objects. (Data can be normalized at index time or search time)
CIM schema should be used for,
* Field extractions
* Aliases
* Event types
* Tags
Knowledge objects can be shared globally across all apps, allowing us to take
advantage of the mappings no matter which app is in use at the time.
Splunk premium solutions like Splunk Enterprise Security rely heavily on data
that is CIM compliant when searching data, running reports and creating
dashboards.
Splunk provides a CIM add-on on Splunkbase that includes JSON data model
files that help you
* validate indexed data compliance
* use normalized data in pivots
* and improve performance through data model acceleration
* The add-on is free and performs no additional indexing, so it will not affect
your license in any way.
* The add-on only needs to be installed on a search head or on a single
instance deployment of Splunk.
* A user with the admin role is required to install the add-on.
Using CIM with your data
01. Getting Data in
02. Examine Data
03. Tag Events
04. Verify Tag
05. Normalize Fields
06. Validate Against Model
07. Package as Add-on
Settings --> Data Models
Normalizing Data to CIM
Field extractions and lookups can also be used to make fields CIM
compliant.
We can search our data models using the "datamodel" command.
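A hedged sketch of the datamodel command, assuming a data model named Buttercup_Games_Online_Sales with a root dataset named successful_purchase (both names are illustrative); the returned events can be piped to further commands:
[| datamodel Buttercup_Games_Online_Sales successful_purchase search | stats count]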