ExtraPreEx3(pt2) Flashcards
How to test new logs with Splunk GUI?
To test new logs in the Splunk GUI, you can follow these steps:
Go to the Splunk GUI.
In the search bar, type the following:
index=<your_index> sourcetype=<your_sourcetype>
where <your_index> is the name of the index where your new logs are stored and <your_sourcetype> is the sourcetype of your new logs.
Click the Search button.
Review the search results to make sure that Splunk is parsing your new logs correctly.
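For example, a quick parsing check might look like this (a sketch; the index, sourcetype, and host below are placeholders, not values from your environment):
index=web_test sourcetype=access_combined earliest=-15m
| eval lag=_indextime - _time
| stats count, avg(lag) by host, sourcetype
A large or negative lag usually points to a timestamp or TZ problem. You can also search index=_internal for WARN/ERROR messages from parsing components such as DateParserVerbose or LineBreakingProcessor.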
You can also test new logs in the Splunk GUI with the Add Data preview, which lets you see how Splunk will break and timestamp your events before anything is indexed.
To use the data preview, follow these steps:
Go to the Splunk GUI.
Click Settings, then Add Data, and upload or point to a sample of the new logs.
On the Set Source Type page, select or create the sourcetype for your new logs.
Review the previewed events and check the line breaking and timestamps.
If you see any errors in the logs, or if Splunk is not parsing the logs correctly, you can adjust your Splunk configuration and test the logs again.
What is epoch/system time?
Epoch (or system/Unix) time is a system for measuring time in computing. It represents the number of seconds that have elapsed since a defined starting point called the “epoch”.
The epoch is commonly defined as January 1, 1970, at 00:00:00 Coordinated Universal Time (UTC).
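In SPL you can convert between epoch and human-readable time with strftime and strptime, for example (a small sketch):
| makeresults
| eval readable=strftime(_time, "%Y-%m-%d %H:%M:%S")
| eval back_to_epoch=strptime(readable, "%Y-%m-%d %H:%M:%S")
makeresults returns _time as the current epoch value, and the two eval functions convert it to a string and back.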
How to onboard data through UF?
Install and configure the Splunk UF on the data source.
Create an inputs.conf file to tell the Splunk UF where to find the data and how to parse it.
Create an outputs.conf file to tell the Splunk UF where to send the data.
Restart the Splunk UF.
Verify that the data is being sent to Splunk.
Here’s a more detailed breakdown of each step:
- Install and configure the Splunk UF on the data source.
The Splunk UF can be installed on a variety of operating systems, including Windows, Linux, and macOS. You can download the Splunk UF from the Splunk website.
Once you have installed the Splunk UF, you need to configure it to send data to Splunk. To do this, you edit outputs.conf (or use the Splunk CLI) to specify the indexer's IP address or hostname and its receiving port number.
You can also configure the Splunk UF to use a proxy server and to use authentication.
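A minimal CLI sketch for this step (the hostnames and ports are placeholders for the example):
$SPLUNK_HOME/bin/splunk add forward-server idx1.example.com:9997
$SPLUNK_HOME/bin/splunk set deploy-poll ds1.example.com:8089
$SPLUNK_HOME/bin/splunk enable boot-start
The first command writes the indexer into outputs.conf, the second points the UF at a deployment server, and the third makes the forwarder start at boot.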
- Create an inputs.conf file to tell the Splunk UF where to find the data and how to parse it.
The inputs.conf file is a text file that tells the Splunk UF where to find the data and how to parse it.
The inputs.conf file is divided into sections, one section for each type of data source.
For example, if you are collecting data from a web server, you would create a section in the inputs.conf file for the web server. In the section, you would specify the location of the web server log files and how to parse the log files.
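A minimal inputs.conf sketch for that web server case (the path, index, and sourcetype are assumptions for the example):
[monitor:///var/log/httpd/access_log]
index = web
sourcetype = access_combined
disabled = 0
Note that the heavier parsing rules (line breaking, timestamps) live in props.conf on the parsing tier rather than on the UF.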
- Create an outputs.conf file to tell the Splunk UF where to send the data.
The outputs.conf file is a text file that tells the Splunk UF where to send the data.
The outputs.conf file is divided into sections, one section for each type of destination.
For example, if you are sending the data to a Splunk server, you would create a section in the outputs.conf file for the Splunk server. In the section, you would specify the IP address and port number of the Splunk server.
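A minimal outputs.conf sketch (the indexer names are placeholders; 9997 is the conventional receiving port, but it must match what the indexers are listening on):
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = idx1.example.com:9997, idx2.example.com:9997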
- Restart the Splunk UF.
Once you have created the inputs.conf and outputs.conf files, you need to restart the Splunk UF.
The Splunk UF will read the inputs.conf and outputs.conf files and start collecting and sending data to Splunk.
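On Linux this is typically:
$SPLUNK_HOME/bin/splunk restart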
- Verify that the data is being sent to Splunk.
To verify that the data is being sent to Splunk, you can use the Splunk Search & Reporting interface.
In the Splunk Search & Reporting interface, you can search for the data that you are collecting.
If you see the data in the search results, then you know that the data is being sent to Splunk successfully.
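For example (using the index and sourcetype from the sketches above; the host name is a placeholder):
index=web sourcetype=access_combined host=webserver01 earliest=-15m
If nothing shows up, searching index=_internal host=webserver01 is a quick way to confirm whether the forwarder itself is phoning home.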
How to configure serverclass.conf?
Identify the server classes that you need.
Create a stanza for each server class in the serverclass.conf file.
Specify the following in each server class stanza:
[serverClass:<name>]: the stanza name carries the name of the server class.
whitelist.<n> / blacklist.<n>: patterns matching the forwarder hosts that belong to the server class.
[serverClass:<name>:app:<appName>]: one stanza per deployment app pushed to that server class, with options such as restartSplunkd and stateOnClient.
Any inputs.conf or outputs.conf the forwarders need are shipped inside those deployment apps, not in serverclass.conf itself.
Here is an example of a serverclass.conf file:
[serverClass:linux_servers]
whitelist.0 = host1
whitelist.1 = host2
whitelist.2 = host3

[serverClass:linux_servers:app:linux_secure]
restartSplunkd = true
stateOnClient = enabled

[serverClass:windows_servers]
whitelist.0 = host4
whitelist.1 = host5
whitelist.2 = host6

[serverClass:windows_servers:app:windows_secure]
restartSplunkd = true
stateOnClient = enabled
Process for getting premade TA from Splunkbase
Step 1 - Go to Splunkbase, find the add-on, and check it for compatibility with your Splunk version.
Step 2 - Read the details of the premade TA.
Step 3 - Install the TA: download it to your local computer, then either transfer it to the server with WinSCP or install it through the GUI of the server you want it on.
*In the GUI, go to Manage Apps, then Install app from file.
*If installed through the GUI, it should be found in the apps directory ($SPLUNK_HOME/etc/apps).
Step 4 - Move the TA to deployment-apps.
*Make sure the sourcetypes in inputs.conf match those in props.conf.
Step 5 - Use the DS or CM to push it out to the servers I want.
Splunk on Prem vs Splunk cloud
Cloud - Splunk themselves manage all the backend and you just use it.
On prem - you have to build and manage all the servers in the Splunk cluster yourself.
Tell me about yourself
1. Talk about your experience with Splunk, e.g. positions you've held and any certs you've obtained that apply to the field.
-3 years as a Splunk engineer; hold CompTIA Network+ and Security+ certs
-Experience working within the Linux command line, with Red Hat being the flavor of distro
-On the command line I am experienced with moving files and directories; copying files using absolute paths; renaming files; appropriately deleting files and directories; and editing files via Vi. Furthermore, on the CLI I am also familiar with testing connections, reviewing disk usage and volume, as well as checking the health/status of ports
-I have experience editing stanzas to activate or deactivate configuration files such as forwarder-level configs (inputs.conf, outputs.conf) and indexer-level configs (indexes.conf, props.conf). I also have experience writing configuration files and placing them in deployment-apps, from which they are pulled from the DS by the forwarders in their server class. Making sure such configurations are done correctly is important because they tell the forwarders what data to collect and to send it to the indexers. Along with working with config files, I am also well versed in the rules of file precedence, which comes in handy when the same stanza, setting, or attribute exists in two different locations.
-Use Preconfigured indexes to accomplish varying tasks. Some of these preconfigured indexes are:
a. main - this is the default index. All processed data is stored here unless otherwise specified.
b. _internal - stores all Splunk components' internal logs and processing metrics. It is often used for troubleshooting; search for logs that say ERROR or WARN.
c. _audit - stores events related to activity conducted in the component, including file system changes and user auditing such as search history and user-activity error logs.
d. _summary - summary indexing allows you to run fast searches over a large data set by scheduling Splunk to summarize data and then import it into the summary index over time.
e. _fishbucket - this index tracks how far into a file indexing has occurred, to prevent duplicate data from being stored. This is especially useful in the event of a server shutdown or connection errors.
-There are many instances where I have to figure out varying issues, and as a result I have picked up valuable troubleshooting knowledge involving btool and Linux/Splunk commands such as:
ps aux = check processes
kill -9 <pid> = kill a process
top = check which process is using the most resources on the server
cat /proc/meminfo = check how much memory is available
cat /proc/cpuinfo = check CPU information
df -h = check how much space is left on a volume
fdisk -l = list available drives
rpm -qa = check installed rpm packages
netstat = find which ports are open and listening for inbound data
-Also, I am familiar with some of the rules involving storage of data, i.e. the bucket lifecycle, which includes understanding how data flows into the index and then travels from hot to warm to cold and then to frozen buckets. I am also familiar with configuring Splunk indexer policies for bucket storage using indexes.conf. Some of the attributes involved in this process that I use are (see the indexes.conf sketch below):
-maxHotBuckets = the maximum number of hot buckets that can exist in an index
-maxDataSize = the maximum size of a hot bucket before it rolls to warm
-frozenTimePeriodInSecs = the time in seconds before data is frozen; 86400 sec x (hot days + cold days) = retention time
-coldToFrozenDir = the file path where the frozen data will be stored
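A small indexes.conf sketch (the index name, sizes, and retention below are placeholder values, not my actual settings):
[web]
homePath = $SPLUNK_DB/web/db
coldPath = $SPLUNK_DB/web/colddb
thawedPath = $SPLUNK_DB/web/thaweddb
maxHotBuckets = 3
maxDataSize = auto
frozenTimePeriodInSecs = 7776000
coldToFrozenDir = /opt/frozen/web
7776000 seconds is 90 days, so buckets whose newest event is older than that roll to the frozen path instead of being deleted.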
I am flexible when it comes to working with different Splunk components such as the UF and HF, and with standalone or clustered indexer and SH environments. Currently, my environment prefers to use indexer clustering since it provides us with:
-High availability = we replicate data within our indexers
-Multiple copies available for searching
-Data gets into our indexers in round-robin fashion
As a result of working in such an environment, I have been exposed to multisite clustering, which is found in roughly 80% of environments (ASK ABOUT THIS AT END).
When I practice multisite clustering, I help ensure that one of our multiple copies is stored at a different site (see the server.conf sketch after this list). This is useful for:
-data availability
-data fidelity
-data resiliency
-disaster recovery (tell short story fast)
-search affinity
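A rough server.conf sketch for the cluster manager side of a multisite setup (site names, factors, and key are assumptions; older releases use mode = master):
[general]
site = site1

[clustering]
mode = manager
multisite = true
available_sites = site1,site2
site_replication_factor = origin:2,total:3
site_search_factor = origin:1,total:2
pass4SymmKey = <key>
Each peer and search head then sets its own site value in [general], which is what drives search affinity.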
Moreover, I am also familiar with the process of creating a custom TA for configuration deployment, as well as the guidelines involved in app creation. I can also create/modify base apps (TAs) for a new or existing architecture deployment. The base apps act as the architecture settings that prepare my architecture to process data and are where I can manage and update architecture settings. Some examples of configurations that I work with in the base apps are:
-inputs.conf = deployed to all indexers
-outputs.conf = deployed to all servers that send data to the indexers
-server.conf = deploys license, indexer, and SH clustering configurations, etc.
-web.conf = disable or enable the web interface
-indexes.conf = deployed to indexers to maintain consistency in bucket settings
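For instance, an indexer base app might be as small as this (the app name is hypothetical):
# org_indexer_base/local/inputs.conf
[splunktcp://9997]
disabled = 0

# org_indexer_base/local/web.conf
[settings]
startwebserver = 0
The inputs.conf stanza opens the receiving port on every indexer, and the web.conf stanza turns off Splunk Web where it is not needed.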
Furthermore, I am experienced with applying regex when reviewing and onboarding data, with index-time and search-time extractions, and with extracting fields using SPL (Search Processing Language), which is very useful when I need to execute searches on the front end. I can write out regex control characters, character classes, and operators, which I have practiced with the help of the regex101 website. With regex I can:
-Filter - eliminate unwanted data from my searches
-Match - advanced pattern matching to find the results I need
-Field extractions - label bits of data that can be used for further calculations and analysis
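A quick search-time sketch with rex (the field name and pattern are made up for the example):
index=web sourcetype=access_combined
| rex field=_raw "user=(?<user_name>\w+)"
| stats count by user_name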
I also utilize props.conf and some of its attributes (which include but are not limited to TIME_PREFIX, MAX_TIMESTAMP_LOOKAHEAD, TIME_FORMAT, and TZ) when I need to refine and parse samples of data that are used in our data onboarding processes.
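A minimal props.conf sketch for those attributes (the sourcetype name and timestamp format are placeholders):
[my_app_logs]
TIME_PREFIX = ^\[
TIME_FORMAT = %Y-%m-%d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 25
TZ = America/New_York
LINE_BREAKER = ([\r\n]+)
SHOULD_LINEMERGE = false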
And lastly, I feel that it is important to mention that I have experience using transforms.conf as well, especially for scenarios in which I have to:
-send events to NULL queue
-Separate a single sourcetype into multiple sourcetypes
-host and source overrides based on regex
-delimiter-based field extractions
-anonymizing data -which I find to be very enjoyable because it relates to security.
2. What your current position is and the tasks you handle.
3. Mention the various tools you have used in conjunction with Splunk. You could speak on the different sources you've used to onboard data.
4. Speak about your current environment.
Can you describe transforms and what you know about it / how you use it in your environment?
- I can use transforms to specify transformations and lookups that can then be applied to any event.
These transforms and lookups are referenced by name in props.conf.
- I understand that props is used in conjunction with transforms: I cannot use transforms without first referencing it in props.conf. However, they have their own unique roles:
-props is responsible for helping me with data breaking and parsing
-whereas transforms is responsible for how my data will end up looking once I am done setting up and executing the appropriate attributes and stanzas
- In my environment I am also able to use transforms to:
-Manipulate data before it gets indexed
-Transform and alter raw data
-Change metadata
- Now I am going to go deeper into the important uses of transforms within my current place of work, which include:
A. Sending events to Null queue
-use this as a means to get rid of unwanted data
-unwanted data causes me extra work, increases the work hours spent processing, and is inefficient for my environment since it is a waste of storage
-In some instances, while onboarding data I have come across debug events which our in-house developers use; however, my department does not need them, so I use props and transforms to send the debug data to the null queue (see the sketch at the end of this list)
-Another way I can get rid of unwanted data is to use retention policies, such as frozenTimePeriodInSecs = the time in seconds before data is frozen; 86400 sec x (hot days + cold days) = retention time
-Truncating data in Splunk also gets rid of unwanted data by shortening events to a specific length or by removing a specific number of characters from the beginning or end of the data
I’ve used it to:
-Get rid of unwanted data in web server logs. For example, you might want to truncate the user-agent header in web server logs to remove sensitive information such as the user’s operating system and browser version.
-Improve the performance of searches on large datasets. For example, you might want to truncate the _raw field in large datasets to reduce the amount of data that needs to be searched.
-Remove unwanted characters from data. For example, you might want to remove carriage returns and line feeds from data before indexing it.
-I can also remove data from an index using the delete command. This makes the events unsearchable (it does not reclaim disk space) and requires the can_delete role.
To delete data from an index, I use the following search:
index=my_index sourcetype=my_sourcetype | delete
This will delete all events in the my_index index with sourcetype my_sourcetype from search results.
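A minimal null-queue sketch for the debug case above (the sourcetype and regex are placeholders):
props.conf
[app_logs]
TRANSFORMS-setnull = drop_debug

transforms.conf
[drop_debug]
REGEX = \bDEBUG\b
DEST_KEY = queue
FORMAT = nullQueue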
B. Separating a single sourcetype into multiple sourcetypes
Answer = a single log might contain events that are quite different from each other.
Sometimes logs are just written in a messy, inconsistent way; this method helps organize the data, makes searches easier, and allows me to apply different processing rules to different parts of the data (see the sketch after these examples):
-Splitting a single sourcetype for web server logs into multiple sourcetypes based on the HTTP status code. This would allow you to easily search for all events with a particular HTTP status code, such as 404 errors or 500 errors.
-Splitting a single sourcetype for firewall logs into multiple sourcetypes based on the source IP address. This would allow you to easily identify all traffic from a particular IP address.
-Splitting a single sourcetype for system logs into multiple sourcetypes based on the log file name. This would allow you to easily search for all events from a particular log file, such as the syslog file or the kernel log file.
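A sketch of the config side of a sourcetype override (the names and regex are hypothetical, keyed on 5xx status codes):
props.conf
[weblogs]
TRANSFORMS-split = set_error_sourcetype

transforms.conf
[set_error_sourcetype]
REGEX = \s5\d\d\s
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::weblogs_errors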
C. Host and source overrides based on Regex
-To correct or normalize host and source values.
-To categorize data.
-To filter data.
-To route data to specific indexes.
For Example:
There have been instances where I have had a load balancer that distributes traffic to multiple servers. With these load balancers, I thought it was best to override the host value for all events from the load balancer to the name of the actual server that handled the request. This helped me track which server is handling which requests more efficiently; a sketch of that override follows.
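A host-override sketch (the sourcetype and regex are assumptions; the capture group feeds $1):
props.conf
[lb_logs]
TRANSFORMS-sethost = override_host

transforms.conf
[override_host]
REGEX = backend_host=(\S+)
DEST_KEY = MetaData:Host
FORMAT = host::$1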
D. Delimiter-based field extractions are another use of transforms, important for organization and for improving the performance of search results.
Delimiter-based field extractions in Splunk are used by me to extract fields from data that is separated by delimiters. Delimiters are characters that are used to separate fields in data, such as commas, spaces, or tabs.
For example: I've used this method for extracting the date and time from a system log, which allows me to easily identify when events occurred. A small sketch follows.
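A sketch for a comma-delimited sourcetype (the names and field list are placeholders; REPORT- makes this a search-time extraction):
props.conf
[csv_app_logs]
REPORT-fields = csv_fields

transforms.conf
[csv_fields]
DELIMS = ","
FIELDS = date, time, level, message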
E. Anonymizing data-this is important for my environment because we have clients that entrust us to onboard and manage sensitive data such as:
Title 13 Data = Class of protected data containing private business or personal information
PII = Personal Identifying Information
PHI = Protected Health Information
Title 26 Data = Class protected data containing Federal Tax Information
-As a result I have to set up anonymization:
1-First make a reference in props.conf
[pretendpii]
TRANSFORMS-anon=hidelastname
2-Then, define in transforms.conf
[hidelastname]
REGEX= etc.
DEST_KEY=_raw
FORMAT=etc.
-I have also masked data using props.conf + SEDCMD (sed-style substitution at index time)
[sourcetype]
SEDCMD-hidesession = etc
SEDCMD-hideticket = etc
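A concrete SEDCMD sketch (the sourcetype, field names, and patterns are made up for the example):
[pretendpii]
SEDCMD-maskssn = s/ssn=\d{9}/ssn=XXXXXXXXX/g
SEDCMD-maskcc = s/(\d{4}-){3}(\d{4})/XXXX-XXXX-XXXX-\2/g
SEDCMD runs at index time on the parsing tier, so it changes what is written to _raw.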
-protection characters (i.e. \, @, ", []) = can help me to protect sensitive data in indexes and search results.
Example:
enclose it in the @ character:
@password@
Splunk will then ignore all characters inside the protection character.
To restart or not to restart
Configuration changes that require a restart
-changes to indexes.conf or inputs.conf
-changes to a home path in indexes.conf
-deleting an existing app
Configuration changes that do not need a restart
-adding a new index or new app with reloadable configs
-changes or additions to transforms.conf or props.conf
***Props.conf=does not need a restart
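For example, many changes can be picked up without a full restart (hostnames are placeholders):
$SPLUNK_HOME/bin/splunk reload deploy-server = re-reads serverclass.conf on the deployment server
https://searchhead.example.com:8000/debug/refresh = reloads many search-time configuration changes through the GUI
| extract reload=true = forces search-time props/transforms extractions to be re-read for a search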