Enterprise only content Flashcards
Splunk licensing is based …
on the amount of data indexed.
The daily license quota includes the full size of data flowing through the ___, but not the disk storage.
parsing pipeline
Replicated data, summary indexes, internal logs, and metadata ___ count towards license quota.
do not
What are the Splunk license options?
Enterprise, Free, Trial, Splunk for Industrial IoT, Forwarder, Dev/Test
Enterprise
o Can be bought for any indexing volume
o Enables all Splunk features including clustering and distributed search
o No enforcement. Users can still search after license violation
o Licenses can be stacked
Free
o Includes 500 MB/day indexing for life
o Disabled features include clustering, authentication, distributed search, alerting and deployment management
Trial
o Full Splunk features for 60 days
o After 60 days it automatically becomes a Free license
o Max 500 MB a day
o Sales Trial license can be provided for customised license
Splunk for industrial IoT
o Not stackable
o Access to Splunk Enterprise and select premium Splunk apps
Forwarder
o Allows forwarding of unlimited data
o Cannot be used for indexing
o No need to purchase separately
o Universal forwarders automatically apply the Forwarder license
o Heavy forwarders must be manually switched to the Forwarder license group
Dev/Test
o For running Splunk in Non Prod environments
o Cannot be used in distributed environment
o Not stackable
o Can be used for Splunk App development
What are the license warnings and violations?
- Exceeding daily volume quota results in a warning
- 5 or more warnings in a 30 day rolling period is a violation
- Searching is not disabled in violation period
- An alert is shown under Messages on every Splunk Web page
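The warning and violation rules above can be sketched in a few lines of Python. This is only an illustration of the counting logic, not Splunk's implementation: the function names, the per-day usage dictionary, and the treatment of "30 day rolling period" as the 30 calendar days ending today are all assumptions.

```python
from datetime import date, timedelta

def daily_warnings(usage_by_day, quota_bytes):
    """Return the days on which indexed volume exceeded the daily quota."""
    return sorted(day for day, used in usage_by_day.items() if used > quota_bytes)

def is_in_violation(warning_days, today, window_days=30, threshold=5):
    """True if `threshold` or more warning days fall in the rolling window.

    Assumption: the window is the `window_days` calendar days ending today.
    """
    window_start = today - timedelta(days=window_days - 1)
    recent = {d for d in warning_days if window_start <= d <= today}
    return len(recent) >= threshold

# Hypothetical usage: five over-quota days in the last week.
today = date(2024, 3, 31)
usage = {today - timedelta(days=i): 600 for i in range(5)}
warnings = daily_warnings(usage, quota_bytes=500)
print(is_in_violation(warnings, today))
```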
How do you monitor for license warnings?
- Monitoring console
- Licensing page in Splunk web
- Usage report in Splunk Web
How do you handle license violations?
- Review heavy hitters (usage report) and adjust intake
- The daily limit resets at midnight
- Buy more licenses
How is a search performed?
- During indexing, Splunk indexers convert the machine data stream into searchable events which are stored in indexes
- Indexes contain compressed raw data (journal.gz) and time-series index files (TSIDX)
- Indexes store data in time-oriented buckets (hot, warm, cold and frozen)
- Indexers perform the search and return the results
- Search results and metadata are stored as search artifacts until the search job expires
How does Splunk retrieve data?
- Timeframe – identify data buckets based on time range
- Bloom filter – calculate the bloom filter for the base search and compare it against each bucket's bloom filter
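The timeframe step can be pictured as a simple interval-overlap test. The sketch below is illustrative only: the `Bucket` type, its fields, and the epoch-second values are hypothetical and do not reflect how Splunk stores bucket metadata.

```python
from typing import NamedTuple

class Bucket(NamedTuple):
    name: str
    earliest: int  # epoch seconds of the oldest event in the bucket
    latest: int    # epoch seconds of the newest event in the bucket

def buckets_in_range(buckets, search_earliest, search_latest):
    """Keep only buckets whose time span overlaps the search time range."""
    return [
        b for b in buckets
        if b.earliest <= search_latest and b.latest >= search_earliest
    ]

# Hypothetical buckets covering three consecutive time spans.
buckets = [
    Bucket("cold_1", 0, 999),
    Bucket("warm_1", 1000, 1999),
    Bucket("hot_1", 2000, 2999),
]
print([b.name for b in buckets_in_range(buckets, 1500, 2500)])
```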
What is a bloom filter?
- A bloom filter is a bit array created by running search terms through a set of hash functions
- Splunk creates a bloom filter for each bucket
- When a search is run, Splunk calculates the bloom filter for the base search, and compares with the bucket bloom filter
- Only the matching buckets are opened
- Having as many filtering terms as possible in the base search improves search performance
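A generic bloom filter can be sketched as follows to show why non-matching buckets can be skipped. This is a textbook construction, not Splunk's on-disk format: the class, the bit-array size, the number of hashes, and the use of SHA-256 are all assumptions chosen for illustration.

```python
import hashlib

class BloomFilter:
    """Minimal bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as the bit array

    def _positions(self, term):
        # Derive several bit positions by salting one hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{term}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, term):
        for pos in self._positions(term):
            self.bits |= 1 << pos

    def might_contain(self, term):
        return all((self.bits >> pos) & 1 for pos in self._positions(term))

def candidate_buckets(bucket_filters, search_terms):
    """Return buckets whose filter may contain every search term."""
    return [
        name for name, bf in bucket_filters.items()
        if all(bf.might_contain(t) for t in search_terms)
    ]

# Hypothetical buckets, each with the terms that were indexed into it.
filters = {}
for name, terms in {"bucket_a": ["error", "timeout"], "bucket_b": ["login", "success"]}.items():
    bf = BloomFilter()
    for t in terms:
        bf.add(t)
    filters[name] = bf

print(candidate_buckets(filters, ["error", "timeout"]))
```

Each extra term in the base search is one more membership test a bucket must pass, which is why more filtering terms mean fewer buckets opened.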
What is a search artifact?
- Contains results and metadata
- Stored in $SPLUNK_HOME/var/run/splunk/dispatch
- Deleted when the search job expires
- Each job has its own directory
- Too many search artifacts can cause performance degradation
What is a distributed search?
Distributed search separates the search management and presentation layer from the indexing and search retrieval layer
How does a distributed search work?
- Search head receives the user's search request
- Search head dispatches searches to the search peers (indexers)
- Search peers run the search on behalf of search heads and return the results to the search head
- Search head merges the results from all the search peers
- Search head runs additional filtering and transformation commands (if applicable) and returns the results to the user
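The dispatch, merge, and final transformation steps above resemble a map-and-merge pattern. The sketch below models that flow in memory; the function names, the event dictionaries, and the "count by host" search are hypothetical and stand in for real SPL processing.

```python
from collections import Counter

def run_on_peer(peer_events, keyword):
    """Search peer: filter local events and return partial counts by host."""
    return Counter(e["host"] for e in peer_events if keyword in e["raw"])

def search_head(peers, keyword, top_n=None):
    """Search head: dispatch to every peer, merge, then post-process."""
    merged = Counter()
    for peer_events in peers.values():
        merged.update(run_on_peer(peer_events, keyword))
    # Final transformation on the search head: sort by count, then host name.
    ranked = sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n] if top_n else ranked

# Hypothetical events held by two search peers.
peers = {
    "idx1": [{"host": "web1", "raw": "error timeout"}, {"host": "web2", "raw": "ok"}],
    "idx2": [{"host": "web1", "raw": "error disk"}, {"host": "web3", "raw": "error"}],
}
print(search_head(peers, "error"))
```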
What are search peers?
- The indexers that participate in distributed search are called search peers
- Search peers must be added to the search head
- If the search head participates in an indexer cluster, search peers are added automatically
- When a peer goes down, search head removes it from the peers list (default timeout 10 seconds)
What is a knowledge bundle?
- Archive of knowledge objects that search head sends to all search peers
- Includes knowledge objects such as event types, saved searches
- Peers need these knowledge objects to execute searches on behalf of search heads
- Contains a subset of $SPLUNK_HOME/etc/system|apps|users
Where is the location of a knowledge bundle?
- Search head – $SPLUNK_HOME/var/run (.bundle or .delta extension)
- Search peers – $SPLUNK_HOME/var/run/searchpeers
How does a knowledge bundle get replicated?
- Entire knowledge bundle
- Delta – changes since last full bundle push
What are the four replication policies?
- Classic – search head directly replicates to all search peers
- Cascading – replicates to a subset of search peers which replicates to other search peers and so on
- Mounted – search head places knowledge bundle in shared storage (NOT recommended)
- Remote file storage – search head uploads knowledge bundle to a remote file system
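To see how cascading differs from classic replication, the sketch below plans the transfers in waves: the search head sends to a few peers, and each wave's receivers become the next wave's senders. The function name, the `fan_out` parameter, and the peer names are hypothetical; Splunk's actual replication plan is not described in these notes.

```python
def cascade_plan(peers, fan_out=2):
    """Return a list of waves; each wave is a list of (sender, receiver) pairs."""
    if fan_out < 1:
        raise ValueError("fan_out must be at least 1")
    pending = list(peers)
    senders = ["search_head"]
    waves = []
    while pending:
        wave = []
        for sender in senders:
            for _ in range(fan_out):
                if not pending:
                    break
                wave.append((sender, pending.pop(0)))
        waves.append(wave)
        # Peers that just received the bundle forward it in the next wave.
        senders = [receiver for _, receiver in wave]
    return waves

# Hypothetical seven-peer deployment.
for i, wave in enumerate(cascade_plan([f"idx{n}" for n in range(1, 8)]), start=1):
    print(i, wave)
```

With classic replication the search head would perform all seven transfers itself; in this sketch it performs only the first two.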
How can you manage knowledge bundles?
- You can customise what gets replicated
- Use distsearch.conf to blacklist large files you don’t need replicated
How can you monitor knowledge bundle replication?
- Splunk web – settings > distributed search
- Monitoring console – search > distributed search
- Command line – $SPLUNK_HOME/bin/splunk show bundle-replication-status
- REST API - /services/search/distributed/bundle/config
How do you set up a distributed search?
- Install the same version of Splunk Enterprise on the search head and search peers
- Search head and search peers must use a license master
- Set up the same indexes on all search peers
- Create a user with edit_user capability on all search peers
- Add search peers in search head via Splunk Web
How are the indexers prepared in a distributed environment?
- Access – create a user with edit_user capability
- Index – ensure indexes have data coming in from forwarders
- Connectivity – ensure search head can connect to management port (8089) of the indexer
How can a distributed search be verified?
- Examine the search peer in distributed search page in Splunk Web. Look for replication status
- Run a search to retrieve events from an index
- Check the internal logs on the indexer
What is a distributed search group?
- Search peers configured into specific groups using distsearch.conf
- Lets you run searches on targeted indexers
- Use the splunk_server_group option in SPL to specify the group
- Distributed search groups should be avoided in indexer clusters
How do you use a distributed search group?
- Specify the distributed search group as part of SPL
- index=infra splunk_server_group=sre
- Verify by examining splunk_server field
What is meant by quarantining a search peer and how is it performed?
- You can quarantine a search peer to stop it from participating in searches
- Lets you perform maintenance on the search peer without affecting searches
- Use Splunk Web to quarantine a search peer
What are the scaling options?
- Independent search heads – dedicated search heads with no communication between them
- Search head clusters – a group of search heads (min 3) in a cluster communicating with each other
- Indexer cluster – search heads that join an indexer cluster can be independent search heads or members of a search head cluster
What are the search head cluster considerations?
- Minimum 3 members required
- Always use new Splunk instances to create the cluster
- Cluster members must have the same hardware capacity
- Synchronise the clocks of all members including search peers
What are the key benefits of search head clustering?
- High availability and load balancing
- Captain manages and distributes the scheduled jobs
- Configuration and search artifacts replication
- Seamless user experience
What is a search head captain?
- Captain centrally coordinates all cluster-wide activities. Captain is also a member of the cluster
- Captaincy can be configured to be dynamic (default) or static
- With dynamic captaincy, the cluster automatically elects a new captain using the Raft consensus algorithm
- Captain consumes more CPU and memory
How are scheduled jobs and artifacts handled in a clustered environment?
- Captain is the only scheduler
- Captain chooses the search head cluster member to run search jobs based on load
- Search artifacts are replicated by captain to other members. Ad-hoc and real-time artifacts are not replicated
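Load-based job placement by the captain can be illustrated with a least-loaded picker. This is a simplified sketch: the notes only say the choice is "based on load", so using the number of running jobs as the load metric, along with the function and member names, is an assumption.

```python
def pick_member(active_jobs):
    """Return the member with the fewest running jobs (ties broken by name)."""
    return min(sorted(active_jobs), key=lambda member: active_jobs[member])

def dispatch(scheduled_searches, current_load):
    """Captain: assign each scheduled search to the least-loaded member."""
    load = dict(current_load)  # copy so the caller's view is not mutated
    assignment = {}
    for search in scheduled_searches:
        member = pick_member(load)
        assignment[search] = member
        load[member] += 1
    return assignment

# Hypothetical cluster where sh1 is already busy.
print(dispatch(["a", "b", "c"], {"sh1": 3, "sh2": 0, "sh3": 1}))
```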