REGEX Flashcards
Why does Regex matter?
-Clarifies how Splunk brings data into indexes
-Shows you how to look at logs and pick out individual events
-Gives insight into why and how regex can be applied
-Helps with the onboarding of data
Logs vs Events
A. LOG = a file containing events of a specific type, documenting all the records generated over a particular period of time
B. EVENT = a single record located within a log; the individual action that makes it unique from the other "things" that happened
Process of Logs Rolling into Linux
-When logs roll over they create new files; as a log ages it rolls off into another file such as syslog.1
-The most current entries are in the un-numbered file; compressed files (e.g., .gz) are used to save space
-Some logs roll over by date; other logs roll over by volume
-Splunk has usually already ingested and indexed the rolled-over files as events; they are essentially backup files that can be cleaned up/deleted if you need more space in Linux
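A typical rotated syslog set on a Linux host looks something like this (exact names vary by distribution and logrotate settings):

    /var/log/syslog       current file, actively written
    /var/log/syslog.1     previous rotation, uncompressed
    /var/log/syslog.2.gz  older rotation, compressed to save space
    /var/log/syslog.3.gz  oldest rotation, next to be deleted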
How does Splunk know what logs to watch and ingest into its indexes?
A monitor stanza in inputs.conf (see the example below)
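A minimal monitor stanza might look like this (the path, index, and sourcetype are placeholders for your environment):

    [monitor:///var/log/syslog]
    index = os_linux
    sourcetype = syslog
    disabled = 0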
Reasons for lags in Splunk ingesting logs into indexes
-There are large geographical time differences between the data source and the indexer
-The system might not be running efficiently
-Servers might have outdated hardware
-The Splunk deployment might not be scaled to handle the log volume
-A forwarder might not be working properly
Ingestion rate of log files
Ingesting once per day (batch) is one way to bring in log files, but live streaming is usually the preferred method
True or False: Each event in a web access log carries a client IP address, so different IP addresses generally indicate different users
True
What is the GET command?
GET = an HTTP request method recorded in web access logs; it means the client asked to view a component of the site, such as a page, an image, or the cart. See the sample event after the next card.
What is the purpose of JSESSIONID?
A session ID given to visitors of the site; it identifies a user only for the duration of their visit
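A sample web access event showing both a GET request and a JSESSIONID (all values here are made up for illustration):

    192.0.2.10 - - [12/Mar/2024:10:15:32 -0500] "GET /cart.do?action=view&itemId=EST-6&JSESSIONID=SD5SL7FF6ADFF4953 HTTP/1.1" 200 1665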
What does Regex do?
- Filtering-eliminate unwanted data from searches
- Matching-advanced pattern matching to find the results you need
- Field Extractions-labeling bits of data so they can be used for further calculations and analysis (see the SPL sketch after this list)
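All three purposes in one hypothetical SPL search (the index, sourcetype, and status field are assumptions): the first line filters out unwanted events, rex does the pattern matching, and the named capture group is the field extraction that stats then analyzes.

    index=web sourcetype=access_combined status!=404
    | rex field=_raw "JSESSIONID=(?<session_id>\w+)"
    | stats count by session_id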
Greedy Regex vs Lazy Regex
Greedy Regex= your regex uses a quantifier that keeps "gobbling" up characters, matching as much as possible until it is forced to stop
Lazy Regex= your regex matches as little as possible, stopping at the first occurrence of the pattern you are looking for
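Run both against the input <a><b><c> to see the difference:

    <.*>    greedy: matches as much as possible  ->  <a><b><c>
    <.*?>   lazy:   matches the shortest hit     ->  <a>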
Index vs Search Time Extractions
At Index Time: the indexers are told to also extract field/value pairs from the data AS they commit the data to disk.
***Burdens the indexers and can affect performance
At Search Time: the search head performs extractions as it brings your data back from the indexers.
***The better option; these field extractions are knowledge objects that reside on the SH (see the props.conf sketch below).
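In props.conf terms, a search-time extraction is typically an EXTRACT- setting on the search head, while an index-time extraction uses a TRANSFORMS- setting paired with a transforms.conf stanza on the indexers. A sketch, with placeholder stanza and field names:

    Search time (props.conf on the SH):
    [my_sourcetype]
    EXTRACT-session = JSESSIONID=(?<session_id>\w+)

    Index time (props.conf on the indexers, paired with transforms.conf):
    [my_sourcetype]
    TRANSFORMS-session = extract_session_indexed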
True or False? Metadata (host, source, sourcetype) is attached to every single event so that you know where the event came from.
True
Describe PROD, DEV, UAT, & SANDBOX
PROD-Production
Where all changes are finalized and go live; end users or customers use this environment
DEV-Development
Most commonly used for testing and development. Sometimes isolated from prod, sometimes connected; this environment is unpublished and ideally an exact copy of prod
UAT-User Acceptance Test
Once the testing phase is over, the users who will be using the application must okay your work on their end
SANDBOX-testing
This is an isolated environment where you can safely write code and run tests without any communication with the production environment
Standard flow of data through environments
1.First, data goes into DEV; onboard your data there
2.Work done in DEV is copied to UAT (UAT is the middle environment where the team that needs your work reviews what you've done)
3.Then the work from UAT is sent to PROD (do not make ANY changes without permission)
List the different ways to onboard data.
-UF (Universal Forwarder)
-Syslog
-HEC (HTTP Event Collector; see the curl example below)
-API Collection
-Scripted Inputs
-One-Shot Upload
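As a quick taste of one of these, an event can be sent to HEC with curl (the host, port, and token here are placeholders):

    curl -k https://splunk.example.com:8088/services/collector/event \
      -H "Authorization: Splunk 12345678-1234-1234-1234-123456789012" \
      -d '{"event": "hello from HEC", "sourcetype": "demo", "index": "main"}'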
List primary .conf files for onboarding
-Inputs.conf: defines what data Splunk collects (e.g., monitor stanzas)
-Outputs.conf: defines where a forwarder sends its data, i.e., the receiving indexers (see the forwarder example below)
-Authentication.conf: specifies how Splunk users are authenticated
-Authorize.conf: specifies the permissions (roles and capabilities) that Splunk users have
-Serverclass.conf: defines server classes on a deployment server, i.e., which apps go to which deployment clients
-Props.conf: controls parsing (line breaking, timestamp recognition) and ties field extractions to sourcetypes
-Transforms.conf: specifies how Splunk transforms data (routing, masking, index-time extractions)
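A minimal forwarder-side pairing of the first two files (host name, port, path, and index are placeholders; 9997 is the conventional receiving port):

    inputs.conf
    [monitor:///var/log/messages]
    index = os_linux

    outputs.conf
    [tcpout]
    defaultGroup = primary_indexers

    [tcpout:primary_indexers]
    server = idx1.example.com:9997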
Outline onboarding data in Splunk
1-Gather requirements from the data owner
2-Decide between a custom or a pre-made TA
3-Decide which method of onboarding you will use to bring the data in: API/UF/syslog/DB Connect/scripted inputs
4-Determine where the data will reside in Splunk: which index you will use and which sourcetype is linked to that data
5-Obtain a sample of the data or log you will bring into Splunk from the data owner; use a props.conf config to test the data (see the sample stanza after this list), then bring the data in through all appropriate config files
6-Decide who needs access to this data
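For step 5, a common starting point is a props.conf stanza that pins down line breaking and timestamp parsing; the stanza name and values below are placeholders for a hypothetical sourcetype:

    [my_custom_sourcetype]
    SHOULD_LINEMERGE = false
    LINE_BREAKER = ([\r\n]+)
    TIME_PREFIX = ^
    TIME_FORMAT = %Y-%m-%d %H:%M:%S
    MAX_TIMESTAMP_LOOKAHEAD = 19
    TRUNCATE = 10000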
Process for Building Custom TA/APP
Step 1: Create a new directory for your TA.
The directory should be located in the $SPLUNK_HOME/etc/apps directory.
Step 2: Create an app.conf file in the new directory.
The app.conf file contains the configuration settings for your TA.
Step 3: Add the basic settings to the app.conf file (in practice these live under stanzas such as [ui] and [launcher]; see the sketch after these steps):
label: the display name of your TA.
version: The version of your TA.
author: The author of your TA.
description: A description of your TA.
Step 4: Add any additional configuration files that your TA needs.
For example, if your TA includes a dashboard, you will need to add the dashboard XML under the app's default/data/ui/views directory.
Step 5: Create any scripts that your TA needs.
For example, if your TA includes a scripted input, you will need to create the script (e.g., a .sh or .py file) in the app's bin directory.
Step 6: Package your TA.
Packaging creates a single gzipped tarball (.spl or .tgz) that contains all of the configuration files and scripts for your TA.
Step 7: Test your TA.
Make sure that your TA is working properly by deploying it to a test environment and testing all of its features.
Step 8: Deploy your TA to Splunk.
You can do this using the Splunk Web UI or the Splunk CLI.
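Putting steps 1-6 together for a hypothetical add-on named TA-example (all names and values are placeholders):

    $SPLUNK_HOME/etc/apps/TA-example/
        bin/collect.sh          scripted input (step 5)
        default/app.conf        basic settings (steps 2-3)
        default/inputs.conf     any inputs the TA defines (step 4)

    app.conf:
    [ui]
    is_visible = false
    label = TA-example

    [launcher]
    author = Your Name
    description = Example technology add-on
    version = 1.0.0

    Package from $SPLUNK_HOME/etc/apps (step 6):
    tar -czf TA-example.spl TA-example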
Global Stanza vs Local Stanza
Global modifications = default settings (e.g., a [default] stanza) that set the standard and are applied to every stanza beneath them
Local modifications override global ones for that specific stanza only (see the sketch below)
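A sketch in inputs.conf terms (paths and index names are placeholders):

    [default]
    # global: inherited by every stanza below
    index = os_linux

    [monitor:///var/log/syslog]
    # no index set here, so it inherits index = os_linux

    [monitor:///var/log/audit/audit.log]
    # local setting overrides the global one for this input only
    index = security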