Splunk 102 Flashcards
What is a bucket in Splunk?
A file system directory containing a portion of an index
A Splunk Enterprise index typically consists of many buckets, organized by age.
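For illustration, a warm bucket is just a directory on disk. A minimal sketch, assuming a default install and the main index (whose on-disk directory is defaultdb); the timestamps and ID are made up:

    $SPLUNK_HOME/var/lib/splunk/defaultdb/db/db_1706140800_1706054400_42

Warm buckets are named db_<newest_event_time>_<oldest_event_time>_<bucket_id>, so the directory name itself tells you the epoch time range of the events inside.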
What are the types of Splunk buckets?
Hot, warm, cold, frozen, thawed
What is the bucket “aging” process?
As buckets age, they “roll” from one state to the next. When data is first indexed, it gets written to a hot bucket. Hot buckets are buckets that are actively being written to. An index can have several hot buckets open at a time. Hot buckets are also searchable.
When certain conditions are met (for example, the hot bucket reaches a certain size or the indexer gets restarted), the hot bucket becomes a warm bucket (“rolls to warm”), and a new hot bucket is created in its place. The warm bucket is renamed but it remains in the same location as when it was a hot bucket. Warm buckets are searchable, but they are not actively written to. There can be a large number of warm buckets.
Once further conditions are met (for example, the index reaches some maximum number of warm buckets), the indexer begins to roll the warm buckets to cold, based on their age. It always selects the oldest warm bucket to roll to cold. Buckets continue to roll to cold as they age in this manner. Cold buckets reside in a different location from hot and warm buckets. You can configure the location so that cold buckets reside on cheaper storage.
Finally, after certain other time-based or size-based conditions are met, cold buckets roll to the frozen state, at which point they are deleted from the index, after being optionally archived.
If the frozen data has been archived, it can later be thawed. Data in thawed buckets is available for searches.
Settings in indexes.conf determine when a bucket moves from one state to the next.
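For example, a minimal indexes.conf sketch of those settings (the stanza name, paths, and thresholds here are illustrative, not recommendations):

    [my_index]
    # hot and warm buckets live under homePath
    homePath = $SPLUNK_DB/my_index/db
    # cold buckets can sit on cheaper storage
    coldPath = /cheap_storage/my_index/colddb
    thawedPath = $SPLUNK_DB/my_index/thaweddb
    # a hot bucket rolls to warm when it reaches its maximum size
    maxDataSize = auto
    # the oldest warm bucket rolls to cold past this count
    maxWarmDBCount = 300
    # buckets roll to frozen once older than ~180 days
    frozenTimePeriodInSecs = 15552000
    # optionally archive frozen buckets instead of deleting them
    coldToFrozenDir = /archive/my_index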
What is single-instance deployment? How can we scale our deployment?
In single-instance deployments, one instance of Splunk Enterprise handles all aspects of processing data, from input through indexing to search. A single-instance deployment can be useful for testing and evaluation purposes and might serve the needs of department-sized environments.
To support larger environments, however, where data originates on many machines and where many users need to search the data, you can scale your deployment by distributing Splunk Enterprise instances across multiple machines. When you do this, you configure the instances so that each instance performs a specialized task. For example, one or more instances might index the data, while another instance manages searches across the data.
What is an instance?
A single running installation of Splunk Enterprise.
What is a Splunk component?
One of several types of Splunk Enterprise instances.
What categories of Splunk components are there? Give some examples of components for each category.
These are the available processing component types:
Indexer
Forwarder
Search head
Management components include:
license master
monitoring console
deployment server
indexer cluster master node
search head cluster deployer
What is a License Master?
It is a component that is responsible for keeping track of the data ingestion quota.
What is a License Quota?
It is the maximum daily volume of data ingested into Splunk under a given purchased license.
What are some sizes of environment?
<3 TB: small environment
10-30 TB: large environment
>50 TB: massive environment
Tell me your environment!
Paweł:
26 TB ingestion, 28 TB quota
around 20k forwarders
around 145 indexers
around 250 clients
14 search heads
(the company plans to add 20% more devices to the network)
How to create a fake environment (worked example below):
a) quota = ingestion + 2 TB
b) 1 TB ≈ 1,000 forwarders (go under a little bit)
c) 1 TB ≈ 6 indexers (go under by 6-8%)
d) 200 users per 17 TB
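A worked example of the recipe, for a hypothetical 10 TB/day ingestion:

    a) quota = 10 + 2 = 12 TB
    b) forwarders ≈ 10 × 1,000 = 10,000; go under a little, so around 9,000
    c) indexers ≈ 10 × 6 = 60; go under by 6-8%, so around 56
    d) users ≈ 200 × (10 / 17) ≈ 120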
Mariusz:
? ? ? ?
Which Splunk components typically share an instance?
Deployment Server and License Master
How does data enter the indexer?
Through a port and an IP address.
Which port do we have to open to enable indexers to receive data?
9997 (sometimes 9998)
Some people call it “the Indexer port”
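As a sketch, receiving is enabled on the indexer in inputs.conf, and the forwarder points at it in outputs.conf (the host and group names below are made up):

    # indexer: inputs.conf
    [splunktcp://9997]
    disabled = 0

    # forwarder: outputs.conf
    [tcpout]
    defaultGroup = primary_indexers

    [tcpout:primary_indexers]
    server = idx01.example.com:9997

The same receiving port can also be opened from the indexer’s CLI with “splunk enable listen 9997”.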
What two types of files do indexes store?
Raw data (the full log files, as received before parsing) and index files (tsidx)
What is tsidx?
Time-series index files. Rather than a literal copy of the raw data, a tsidx file maps the extracted terms and metadata to the locations of the matching events in the raw data, which is what makes the index searchable.
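On disk, the two file types sit together inside each bucket directory. A rough, illustrative listing (names are made up):

    db_1706140800_1706054400_42/
        rawdata/
            journal.gz      (the compressed raw data)
        *.tsidx             (the time-series index files)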
What is metadata?
Metadata is “data that provides information about other data”. In Splunk, the metadata attached to the events includes:
Host (typically the hostname, IP address, or fully qualified domain name (FQDN) of the network host from which the event originated. Think “server”)
Source - the event source is the name of the file, stream, or other input from which the event originates. Think “path”
Sourcetype - the format of the data input that identifies the structure of the data, usually named by the admin or user
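These three fields, together with the timestamp, are what searches typically filter on first. A hypothetical SPL example (the host and source path are made up):

    index=main host=web-01 source="/var/log/nginx/access.log" sourcetype=access_combined earliest=-24h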
How does Splunk process data/logs?
The process consists of two stages:
Parsing stage:
- From a continuous log (a stream of data), the data is split into events, and these events (individual occurrences of recorded activity) are stored in the indexes.
- A set of metadata is then attached to each event; the metadata includes host, source, and sourcetype, which are used, along with the timestamp, as an identifier of that particular event
Indexing Stage:
- Places events into storage segments called “buckets” that can then be searched. Determines the level of segmentation, which affects indexing and searching speed, search capability, and efficiency of data compression
- Writes the raw data and index files to disk, where post-indexing compression occurs
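Much of the parsing stage (event breaking and timestamp extraction) is driven by props.conf. A minimal sketch for a hypothetical sourcetype (the stanza name and timestamp format are assumptions, not defaults):

    [my_custom_log]
    # treat each line as its own event
    SHOULD_LINEMERGE = false
    LINE_BREAKER = ([\r\n]+)
    # the timestamp follows a literal "ts=" in each event
    TIME_PREFIX = ts=
    TIME_FORMAT = %Y-%m-%dT%H:%M:%S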
What is an event?
A single piece of data in Splunk software, similar to a record in a log file or other data input. When data is indexed, it is divided into individual events. Each event is given a timestamp, host, source, and source type. Often, a single event corresponds to a single line in your inputs, but some inputs (for example, XML logs) have multiline events, and some inputs have multiple events on a single line. When you run a successful search, you get back events.
What happens in the Parsing stage of the data processing process?
Parsing stage:
- From a continuous log (a stream of data), the data is split into events, and these events (individual occurrences of recorded activity) are stored in the indexes.
- A set of metadata is then attached to each event; the metadata includes host, source, and sourcetype, which are used, along with the timestamp, as an identifier of that particular event
What happens in the Indexing stage of the data processing process?
Indexing Stage:
- Places events into storage segments called “buckets” that can then be searched. Determines the level of segmentation, which affects indexing and searching speed, search capability, and efficiency of data compression
- Writes the raw data and index files to disk, where post-indexing compression occurs
What is a host, source, and sourcetype?
Host - typically the hostname, IP address, or fully qualified domain name (FQDN) of the network host from which the event originated. Think “server”
Source - the event source is the name of the file, stream, or other input from which the event originates. Think “path”
Sourcetype - the format of the data input that identifies the structure of the data, usually named by the admin or user
What is included in metadata that is attached to each event in Parsing Stage?
Host - typically the hostname, IP address, or fully qualified domain name (FQDN) of the network host from which the event originated. Think “server”
Source - the event source is the name of the file, stream, or other input from which the event originates. Think “path”
Sourcetype - the format of the data input that identifies the structure of the data, usually named by the admin or user
What is a preconfigured Index?
Those are the indexes that come OOTB with Splunk.
What does OOTB mean?
Out Of The Box means that a software feature comes with the “base” of the software and doesn’t need to be installed separately to be accessed and used.
Tell us about 5 Splunk preconfigured indexes.
main: This is the default index. All processed data will be stored here unless otherwise specified
_internal: Stores all Splunk components’ internal logs and processing metrics. It is often used for troubleshooting. Search for logs that say ERROR or WARN.
_audit: Stores events related to the activities conducted in the component, including file system changes, and user auditing such as search history and user-activity error logs.
_summary: Summary indexing allows you to run fast searches over a large data set by scheduling Splunk to summarize the data, then “import” that data into the summary index from another, larger index over time
_fishbucket: This index tracks how far into a file indexing has occurred to prevent duplicate data from being stored. This is especially useful in the event of a server shutdown or connection error.
How to activate preconfigured indexes?
By configuring indexes.conf properly.
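Once indexes.conf is in place, a common way to confirm which indexes (including the internal ones) an instance can actually see is this SPL idiom, run from a search head:

    | eventcount summarize=false index=* index=_* | dedup index | fields index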
What does the main index do?
This is the default index. All processed data will be stored here unless otherwise specified
What does the _internal index do?
Stores all Splunk components’ internal logs and processing metrics. It is often used for troubleshooting. Search for logs that say ERROR or WARN.
Notably, it houses information from splunkd.log, a very important log file that tells you about the health of the Splunk component that you are on.
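A typical troubleshooting search against it (splunkd events carry a log_level field):

    index=_internal sourcetype=splunkd (log_level=ERROR OR log_level=WARN)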
What does the _audit index do?
Stores events related to the activities conducted in the component, including file system changes, and user auditing such as search history and user-activity error logs.
What does the _summary index do?
Summary indexing allows you to run fast searches over a large data set by scheduling Splunk to summarize the data, then “import” that data into the summary index from another, larger index over time
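As a sketch, a scheduled search can feed a summary index with the collect command (the source index, sourcetype, and target summary index named here are hypothetical, and the target index must already exist):

    index=web sourcetype=access_combined earliest=-1h@h latest=@h
    | stats count by status
    | collect index=my_summary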
What does the _fishbucket index do?
This index tracks how far into a file indexing has occurred to prevent duplicate data from being stored. This is especially useful in the event of a server shutdown or connection error.
What happens when you don’t specify an index (or specify a nonexistent index) for incoming data?
If no index is specified, the data goes to the main (default) index. If a nonexistent index is specified, the indexer drops the data by default; the lastChanceIndex setting in indexes.conf can be configured to catch such events instead.
What is splunkd.log?
It is one of the most important Splunk internal logs. It is stored in the _internal index.
How would you troubleshoot?
Check splunkd.log
How can you access splunkd.log?
Through the back end (the file system on the instance) or through a search head.
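Concretely (the file path assumes a default install location):

    Back end:    $SPLUNK_HOME/var/log/splunk/splunkd.log
    Search head: index=_internal source=*splunkd.log*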