1.0 Deploying Splunk Flashcards
What is an SVA?
Splunk Validated Architectures: proven reference architectures for stable, efficient, and repeatable deployments. Guidelines and certified architectures that ensure an initial deployment is built on a solid foundation.
Why and how does Splunk grow from standalone to distributed?
- Ingests more data
- Distributed search across indexers
- Adding high availability
- Dedicating LM and CM
- Adding ES
- SH Cluster for searching
- Disaster Recovery
What does and doesn’t SVA provide?
Does:
- Proven, vetted reference topologies and deployment guidelines.
Doesn’t:
- Implementation choices (OS, bare metal vs. virtual vs. cloud, etc.).
- Deployment sizing.
- A prescriptive approval of your architecture.
- A topology suggestion for every possible deployment scenario.
What is HA? How can Splunk accomplish it?
A continuously operational system, bounded by a set of tolerances.
ex. IDX cluster: if one node goes down, forwarders can still send data to the others.
SHC: multiple SHs can search the same data.
What is DR? How can Splunk accomplish it?
The process of backing up and restoring service in case of disaster.
- Standby nodes - backed up copies of node managers
- Multisite
- SF and RF
What instances are suitable to become the MC?
- A dedicated SH that has connectivity to the entire environment.
NEVER INSTALL ON:
- Prod (distributed) SH
- Member of SHC
- An IDX
- A DS or LM with > 50 clients
- Deployer sharing with CM
https://docs.splunk.com/Documentation/Splunk/8.0.3/DMC/WheretohostDMC
How to configure MC for single or distributed environment?
Single:
1) In Splunk Web, navigate to Monitoring Console > Settings > General Setup.
2) Check that search head, license master, and indexer are listed under Server Roles, and nothing else. If not, click Edit to correct.
3) Click Apply Changes.
Distributed:
1) Log into the instance on which you want to configure the monitoring console. The instance by default is in standalone mode, unconfigured.
2) In Splunk Web, select Monitoring Console > Settings > General Setup.
3) Click Distributed mode.
4) Confirm the following:
The columns labeled instance and machine are populated correctly and show unique values within each column.
- The server roles are correct. For example, a search head that is also a license master must have both server roles listed. If not, click Edit > Edit Server Roles and select the correct server roles for the instance.
- If you are using indexer clustering, make sure the cluster master instance is set to the cluster master server role. If not, click Edit > Edit Server Roles and select the correct server role.
- If you are hosting the monitoring console on an instance other than the cluster master, you must add the cluster master instance as a search peer and configure the monitoring console instance as a search head in that cluster.
- Make sure anything marked as an indexer is actually an indexer.
5) (Optional) Set custom groups. Custom groups are tags that map directly to distributed search groups. You might find groups useful, for example, if you have multisite indexer clustering in which each group can consist of the indexers in one location, or if you have an indexer cluster plus standalone peers. Custom groups are allowed to overlap. For example, one indexer can belong to multiple groups. See Create distributed search groups in the Distributed Search manual.
6) Click Apply Changes.
If you add another node to your deployment later, click Settings > General Setup and check that these items are accurate.
Why do server roles matter to the MC?
Server roles are used to create searches, reports, and alerts based on the server roles specified for each instance.
Why do groups matter to the MC?
Groups are used to correlate among similar instances (e.g., the members of a single cluster).
How are health checks performed on the MC?
Each health check item runs a separate search. The searches run sequentially. When one search finishes, the next one starts. After all searches have completed, the results are sorted by severity: Error, Warning, Info, Success, or N/A.
You can disable and enable individual health check items as needed, as well as change their thresholds.
The Health Check page lets you download new health check items provided by the Splunk Health Assistant Add-on on Splunkbase.
You can also create your own health check items.
What authentication methods are supported by Splunk?
LDAP - can’t use if SAML is enabled
SAML and SSO
Native Splunk accounts (created locally/internally)
Scripted authentication
Describe LDAP concepts.
A standard protocol for accessing directory credentials and services (e.g., AD).
LDAP directories are arranged in a tree-like structure. The information model is based on entries:
- The distinguished name (DN) is built from attributes:
cn=admin1,ou=people,dc=splunk,dc=com
Tree structure with cn at bottom and dc at top.
Describe LDAP configs.
https://docs.splunk.com/Documentation/Splunk/8.0.3/Security/ConfigureLDAPwithSplunkWeb
authentication.conf (example below)
- host =
- port =
- groupBaseDN =
- groupMemberAttribute =
- groupNameAttribute =
- realNameAttribute =
- userBaseDN =
- userNameAttribute =
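A minimal sketch of an LDAP strategy in authentication.conf; the hostnames, DNs, and group names are made-up examples:

authentication.conf
[authentication]
authType = LDAP
authSettings = corp_ldap

[corp_ldap]
host = ldap.example.com
port = 389
bindDN = cn=bind,dc=example,dc=com
bindDNpassword = changeme
userBaseDN = ou=people,dc=example,dc=com
userNameAttribute = uid
realNameAttribute = cn
groupBaseDN = ou=groups,dc=example,dc=com
groupMemberAttribute = member
groupNameAttribute = cn

[roleMap_corp_ldap]
admin = splunk_admins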
List SAML and SSO options
Review Slide Deck
Configure:
1) download the Splunk Service Provider Metadata file
2) Import the IdP metadata into Splunk
- SSO
- SLO (optional)
- IdP cert path
- IdP cert chains
- Replicate certs
- Issuer ID
- Entity ID
- Sign AuthnRequest
- Verify SAML Response
Roles in Splunk?
admin – this role has the most capabilities assigned to it.
power – this role can edit all shared objects (saved searches, etc) and alerts, tag events, and other similar tasks.
user – this role can create and edit its own saved searches, run searches, edit its own preferences, create and edit event types, and other similar tasks.
can_delete – This role allows the user to delete by keyword. This capability is necessary when using the delete search operator.
How can roles secure data?
Restrict index access per role (srchIndexesAllowed) - example below.
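A minimal authorize.conf sketch; the role and index names are made up:

authorize.conf
[role_web_analyst]
srchIndexesAllowed = web
srchIndexesDefault = web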
How can data be ingested by an indexer?
- Monitored
- Batch
- Script - opt/spl/etc/apps/bin/
- Modular inputs
- Syslog
- Network inputs
- HTTP
- Splunk TCP
- REST
How does Splunk communicate with Splunk?
Ports:
- 8000 - web
- 8089 - mgmt
- 8088 - HEC
- 9997 - tcp listening
- 9887 - replication (indexers; SHC replication)
- 8191 - KV store
- 514 - network input
Troubleshoot data inputs - monitor:
TailingProcessor for monitor inputs:
splunk _internal call /services/admin/inputstatus/TailingProcessor:FileStatus
Will show:
- what files it found
- whether they matched the wild card
- how far into the file it read
./splunk list monitor
list of currently monitored inputs.
Troubleshoot data inputs - conf files:
splunk btool <conf-name> list --debug
gives the merged on-disk configs
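For example, to check the merged monitor stanzas (hypothetical invocation):
./splunk btool inputs list --debug | grep -i monitor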
What are examples of indexing artifacts
rawdata - compressed form (journal.gz)
Time Series Index (tsidx) - indexes that point to raw data
Buckets - directories of index files organized by age
- splunk_home/var/lib/splunk/myindex/db
- bucket locations defined in indexes.conf
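A minimal indexes.conf sketch for a custom index; the index name is made up:

indexes.conf
[myindex]
homePath = $SPLUNK_DB/myindex/db
coldPath = $SPLUNK_DB/myindex/colddb
thawedPath = $SPLUNK_DB/myindex/thaweddb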
Describe event processing
Splunk processes incoming data and stores the resulting events in an index.
When Splunk indexes events:
- configures character set encoding
- configures line breaking for multi line events
- identifies timestamps
- extracts fields
- segments events
Name and describe the data pipelines
Parsing:
UTF-8 - Splunk will attempt to apply UTF-8 encoding to data
Line Breaker - Splunk will split data stream into events using default line breaker
Header - Splunk can multiplex different data streams into one “channel”
Merging/Aggregator:
- Splunk will merge lines separated by line breaker into events
Line breaking vs. line merging: LINE_BREAKER & SHOULD_LINEMERGE
Determining time: TIME_PREFIX, TIME_FORMAT, MAX_TIMESTAMP_LOOKAHEAD, DATETIME_CONFIG (example stanza after this card)
Typing:
Regex replacement – performs any regular expression replacements called for in props.conf/transforms.conf
Annotator – extracts the punct field
Indexing:
TCP/Syslog out – sends data to a remote server
Indexer – writes the data to disk
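A hypothetical props.conf stanza exercising the attributes above; the sourcetype and timestamp format are made up:

props.conf
[my:sourcetype]
LINE_BREAKER = ([\r\n]+)
SHOULD_LINEMERGE = false
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 19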
Describe the underlying text parsing process:
Splunk breaks events into segments at index and search time.
Index time - uses segments to build lexicons, which point to locations on disk
Search time - uses segments to split terms
Describe Times Series Index
Optimized to execute arbitrary boolean keyword searches and return millions of events in reverse time order
Inverted index
- allows fast full text searches
- maps keywords to locations in raw data
2 components:
lexicon
value arrays containing info about events
What is a lexicon TSIDX?
Each unique term from the raw event has its own row. Each row has a list of events containing the given term.
Looks like a table:
Term | Postings list (event #)
bacon | 0
beets | 2, 5, 7
crab | 0, 1, 9
What is a TSIDX Value Array?
Each raw event has its own row
Each row contains metadata including the seek address
Looks like a table:
# | seek address | _time | host | source | sourcetype
What is data retention?
Management of the storage of indexed data. Allows Splunk to expire old data and make room for new data.
*Most restrictive rule wins - prompts a change in state
What are retention bucket controls?
maxDataSize - max size for hot bucket
maxWarmDBCount - max num of warm buckets
maxTotalDataSizeMB - max size of an index -> cold to frozen
frozenTimePeriodInSecs - max age of bucket -> cold to frozen
homePath.maxDataSizeMB - max size for hot/warm storage
coldPath.maxDataSizeMB - max size for cold storage
maxVolumeDataSizeMB - max size for volume
maxHotBuckets - max number of hot buckets
timePeriodInSecBeforeTsidxReduction - how long indexers retain tsidx files
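A sketch combining several of these controls in indexes.conf; the values are arbitrary examples:

indexes.conf
[myindex]
maxDataSize = auto_high_volume
maxHotBuckets = 3
maxTotalDataSizeMB = 500000
frozenTimePeriodInSecs = 7776000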
Bucket Controls - Volumes
- Allows you to manage disk usage across multiple indexes
- Allows you to create max data size for them
- Typically separates hot/warm from cold storage
- Take precedence over other bucket controls
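A minimal volume sketch; the paths and sizes are made up:

indexes.conf
[volume:hotwarm]
path = /fast_disk/splunk
maxVolumeDataSizeMB = 300000

[volume:cold]
path = /slow_disk/splunk
maxVolumeDataSizeMB = 1000000

[myindex]
homePath = volume:hotwarm/myindex/db
coldPath = volume:cold/myindex/colddb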
What is the bucket control precedence?
“Most restrictive rule wins”
- Oldest bucket will be frozen first
- Age determined by most recent event
- Hot buckets are measured by size but are exempt from age controls
What are typical storage multiplication factors?
rawdata ≈ .15 of raw ingest (stored per RF copy); tsidx ≈ .35 of raw ingest (stored per SF copy)
Review formulas
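A rough sizing sketch using those factors (all numbers hypothetical): 100 GB/day ingest, 90-day retention, RF=3, SF=2:
disk ≈ 100 x 90 x (0.15 x 3 + 0.35 x 2) = 100 x 90 x 1.15 ≈ 10,350 GB across the cluster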
What is the search head dispatch search sequence? (same with stats)
User Query -> SH -> Check search quota -> Check disk -> dispatch directory -> indexers
Sequence of events for searching for events in Splunk?
SH sends search request:
1) request is received by indexer
2) Indexer checks disk
3) Creates dispatch directory in var/run/spl/dispatch
4) configures subsystem - initializes configs (props, transforms etc) using bundle identified by SH
5) Implement time range - finds buckets in range
6) uses bloom filters to minimize resource usage
7) checks the lexicon- find events matching keywords within the lexicon (tsidx files)
8) Use results returned to find the event offsets within raw data from the values array
9) uncompresses raw data - uncompresses appropriate raw data to get the _raw event
10) Process field extractions
11) Send results to the search head
What metrics does the job inspector provide?
Time spent in search
Time spent searching the index
Time spent fetching data
Workload undertaken by search peers
What is the REST endpoint to find job properties?
REST /services/search/jobs
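For example, listing jobs with curl (credentials are placeholders):
curl -k -u admin:pwd https://localhost:8089/services/search/jobs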
Search job inspector components?
Header
Execution costs
- categories listed as command.* and command.search.* reflect various phases of the search process
Search job properties
- Bundle
- Can summarize
- Create time
- Cursor time
- diskUsage
- dropCount
What are the types of search commands?
- Generating
- Streaming
- Transforming
- Centralized (stateful) streaming
- Non-streaming
Characteristic of a generating search command?
Invoked at beginning of a search.
Does not expect or require an input
| search is implied
Characteristic of a transforming search command?
Generates a report data structure
Operate on the entire event set
ex. chart, timechart, stats
Characteristic of a streaming search command?
Operates on each event individually
Distributable streaming - run on indexers
ex. eval, fields, rename, regex
Characteristic of a centralized (stateful) streaming search command?
Runs on SH
ex. head, streamstats
Characteristic of a non-streaming search command?
Force the entire set of events to the SH (sort, dedup, top)
Examples of a generating search command?
search, datamodel, inputcsv, metadata, rest, tstats
Examples of a streaming search command?
fields, lookup, rex, spath, where
Examples of a centralized streaming search command?
head, eventstats, streamstats, tail, transaction, lookup local=t
Examples of a transforming search command?
append, chart, join, stats, table, timechart, top
How does Splunk minimize searching?
Searches are parsed into a map-reduce model.
Job inspector -> Search job properties (2 parts):
- remoteSearch - the map phase, run on the indexers
- reportSearch - the reduce phase, run on the SH
Splunk tstats search sequence?
1) request received from SH on indexer
2) checks disk
3) creates remote dispatch folder
4) configs subsystem
5) checks time range
6) bloom filter
7) checks the lexicon (tsidx files)
8) results to SH
No raw data, no field extractions, doesn’t use results to offset in array
When to use sub-searches?
- Small result sets: max of 10,000 events, max runtime of 60 sec
- Certain commands require one (join, set)
- Used to produce search terms for the outer search
e.g. find a subset of hosts, determine a time range, craft the main search string dynamically
*Subsearches always run before the main search
When not to use sub-searches?
For subsearches that return many results - better to use stats or eval
- typically subsearches take longer than main
- GUI provides no feedback when subsearch runs
Best practice for maximizing search efficiency (example search below):
- filter early
- specify index
- utilize indexed extractions where avail
- use the TERM directive if applicable
- place streaming/remote commands before non-streaming
- avoid using table except at the very end; it causes data to be pushed to the SH
- Remove unnecessary data using | fields
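A hypothetical search applying several of these practices (index, sourcetype, and term are made up): filter early with index, sourcetype, and TERM, keep only needed fields, and put the transforming command last:

index=web sourcetype=access_combined TERM(10.1.1.5) status=404
| fields clientip, uri
| stats dc(clientip) AS unique_clients by uri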
Describe a deployment app:
An arbitrary unit of content deployed by the DS to a group of deployment clients
- Fully developed apps (Splunkbase)
- Simple groups of configs
- Usually focused on a specific type of data or business need
Where are deployment apps stored on the DS and client?
DS: etc/deployment-apps -> Client: etc/apps
What is the DS?
A centralized config manager that delivers updated content to deployment clients.
- units of content known as deployment apps
- operates on a “pull” model - clients phone home
- DS does not have SHC or IDXC clients
DS capacity
2,000 polls/minute (Windows); 10,000 polls/minute (Linux)
- Utilize the phoneHomeIntervalInSecs attribute in deploymentclient.conf
For more than 50 clients, the DS should be on its own server.
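A minimal deploymentclient.conf sketch on a client; the DS URI is a placeholder:

deploymentclient.conf
[deployment-client]
phoneHomeIntervalInSecs = 600

[target-broker:deploymentServer]
targetUri = ds.example.com:8089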
App deployment process from DS:
1) Client polls at certain interval: Client X, architecture
2) Determine apps for client using serverclasses
3) List of apps and checksums sent to client
4) Compares remote and local lists to determine updates
5) Client downloads new or updated apps from DS
6) Client restarts if necessary
What is a client state on the DS?
The client records its class membership, apps, checksums.
Client caches the bundle (tar archive) with the app content
App update procedure from DS?
App removed from DS, app removed from client
If app update is found on DS, client will delete and download a new copy
- Apps that store user settings locally will have those settings “erased”
- crossServerChecksum uses checksum rather than modtime
Describe deployment system configs
Use base configs to provide consistency
Benefit of base configs
Fast initial deployment time
Reusable, predictable, supportable
Faster troubleshooting
Common naming scheme
Base configs by deployment type:
Review
How is forwarder management maintained?
Serverclass:
- allows you to group Splunk instances by common characteristics and distribute content based on those characteristics
blacklist (takes precedence), whitelist, filter by instance type
Editing the serverclass.conf - what are the stanza levels?
[global] - global level
[serverClass:serverClassName] - individual serverclass; there can be multiple serverClass stanzas - one for each serverClass
[serverClass:serverClassName:app:appName] - app within the server class. Used to specify apps the serverclass applies to - one for each app in the serverClass (example below)
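A hypothetical serverclass.conf showing all three levels; the class, pattern, and app names are made up:

serverclass.conf
[global]

[serverClass:linux_hosts]
whitelist.0 = *.linux.example.com

[serverClass:linux_hosts:app:base_inputs]
restartSplunkd = true
stateOnClient = enabled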
Serverclass non-filtering attributes?
repositoryLocation
stateOnClient - the only one enabled by default
restartSplunkWeb
restartSplunkd
issueReload
Types of DS scaling
Horizontal
- all DS are peers
- all DS are on same level
- all DS respond to clients
Tiered
- primary DS that serves other DSs (parent/child)
- any peers can respond to client
Why would you need more than 1 DS?
Multiple regions
More than 2000/10000 clients
Network Segregation
HA is required
Load balancer if too many clients are phoning home
What do you need to set in tiered DS? child and parent
child:
deploymentclient.conf -
repositoryLocation = $SPL_HOME/etc/deployment-apps
serverRepositoryLocationPolicy = rejectAlways
child and parent:
serverclass.conf -
crossServerChecksum = true
What are the components of an indexer cluster?
Master node - a single node to manage the cluster
Peer nodes - to index, maintain, and search the data
Search heads - one or more to coordinate searches across peers
Purpose of the cluster master:
- Validates config settings before sending to the indexers
- Monitors indexer peers and coordinates remediation of node failures
- Acts as a single point of contact for the SHs/MC
Describe the indexer cluster communication sequence:
1) Indexers stream copies of their data to other indexers
2) Master node coordinates activities involving search peers and search head
3) Forwarders send load-balanced data to peer nodes
4) Indexers send search results to SH
Steps in deploying indexer cluster:
1 - identify requirements
2 - install Splunk Enterprise on instances
3 - enable clustering
4 - complete peer node configuration
5 - forward master node data to peers
What requirements are needed when determining your idxc?
- DR and failover needs
- single site vs. multisite
- RF
- SF
- quantity of data indexed and search load
Threshold on RF?
Do not set RF = # of indexers, because the cluster will not be able to handle failures
Enable the master node CLI and .conf for IDXC:
./splunk edit cluster-config -mode master -replication_factor <n> -search_factor <n> -secret your_key -cluster_label cluster1
server.conf
[clustering]
mode = master
replication_factor = <n>
search_factor = <n>
pass4SymmKey = pwd
cluster_label = label
Enable the peer node CLI and .conf for IDXC:
./splunk edit cluster-config -mode slave -master_uri https://<master>:8089 -replication_port 9887 -secret your_key
server.conf
[clustering]
mode = slave
master_uri = https://<master>:8089
pass4SymmKey = pwd
[replication_port://9887]
disabled = false
Enable the SH node CLI and .conf for IDXC:
./splunk edit cluster-config -mode searchhead -master_uri https://<master>:8089 -replication_port 9887
server.conf
[clustering]
mode = searchhead
master_uri = https://servername:8089
pass4SymmKey = pwd
Enable the SH node CLI and .conf for multi-site IDXC:
./splunk add cluster-master -master_uri https://<master>:8089 -secret your_key
server.conf
[clustering]
mode=searchhead
master_uri = clustermaster:one, clustermaster:two
[clustermaster:one]
master_uri = https://<cm-one>:8089
pass4SymmKey = pwd
[clustermaster:two]
master_uri = https://<cm-two>:8089
pass4SymmKey = pwd
How can you guard against data loss?
Indexer acknowledgement - the forwarder retains a copy of the data until the indexer acknowledges the events (example below)
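Enabled per output group in the forwarder's outputs.conf, e.g.:

outputs.conf
[tcpout:peer_nodes]
useACK = true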
What is indexer discovery?
forwarders query the master node to get a list of all indexers in the cluster (sketch below)
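A sketch of the paired configs; the names and URIs are placeholders. On the master, server.conf:

[indexer_discovery]
pass4SymmKey = discovery_key

On the forwarder, outputs.conf:

[indexer_discovery:cluster1]
pass4SymmKey = discovery_key
master_uri = https://cm.example.com:8089

[tcpout:peer_nodes]
indexerDiscovery = cluster1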
Where do apps to be pushed to indexers live on the master?
etc/master-apps -> etc/slave-apps
How to forward Splunk internal data to indexers?
outputs.conf
[tcpout]
defaultGroup = peer_nodes
forwardedindex.filter.disable = true
[tcpout:peer_nodes]
server=server1:9997, server2:9997, etc
Index cluster upgrade procedure:
1 Cluster master
2 SH tier
3 Indexers
Options:
Tier by tier
Site by site
Rolling peer-by-peer (7.1.x+)
Forwarders only need to be upgraded opportunistically
Jobs of the CM
- Listens for cluster peers - adds to cluster when peer registers
- Waits for RF number of peers to be satisfied before starting its functions
- Listens for peer heartbeats - if it doesn't hear one for a set amount of time, the peer is marked down
- Checks the manifest of buckets provided by all peers to determine if policy is met. If not, fix-up is triggered
Where does CM config bundle live?
SPL/var/run/splunk/cluster/remote-bundle
bundle contains the contents from master-apps
What are the 3 parts of a bucket ID
index name
local ID
orig indexer GUID
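A bucket ID typically takes the form index~localID~originGUID, e.g. main~5~<GUID>.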
Hot bucket lifecycle
1 idx notifies CM of new hot bucket
2 CM replies with list of streaming targets for replication
3 orig indexer begins replicating to new indexer
Warm bucket lifecycle
1 IDX notifies CM when bucket rolls to warm
2 Rep target notified that bucket is complete and rolls to warm
Frozen bucket lifecycle
CM notified when freezes
CM stops doing fix-up tasks for that bucket
What happens if min free space is hit on indexer?
stops processing events
CM is notified and idx enters detention
What happens if indexer is in detention?
Auto:
Stops indexing internal and external data.
Stops replication.
Doesn’t participate in searches.
Manual:
Stops indexing external data (can be switched)
Stops replication
What happens when CM goes down?
Cluster runs as normal as long as there are no other failures
If a peer creates a hot bucket, it will try to contact the master and fail; it continues replicating to its previous target peers
SH will continue to function but will eventually begin to access incomplete data
Forwarders continue to send to their list
When it comes back up:
Master starts fix up tasks
Peers continue to send heartbeats and reconnect
How to replace CM?
No failover
Must have standby
Copy over server.conf and master-apps
Ensure peer nodes can reach new master
What happens when indexer goes down?
Stops sending heartbeat.
Master detects after 60s and starts fix up tasks
Searches will continue but only provide partial results
What happens when indexer comes back up?
Starts sending heartbeat.
Master detects and adds back to cluster
Master rebalances cluster
IDX downloads latest conf bundle from master (if necessary)
Multi-site clustering configs
A multi-site cluster requires additional configuration in the [clustering] stanza
- The Cluster Master requires at least one host per site
- Origin site is the site originating the data or the site where data first entered the cluster
- origin: minimum number of copies held on the origin site
- site#: defines the minimum copies for that site
- total: defines the total copies across all sites
server.conf
[general]
site = site2
[clustering]
mode = master
multisite = true
available_sites = site2, site8, site44
site_replication_factor = origin:2, site8:4, total:8
site_search_factor = origin:1, site8:1, total:4
constrain_singlesite_buckets = false
Single-site to multi-site characteristics:
After migration from single to multi:
- cluster holds both single-site and multisite buckets
- buckets created with marker in journal.gz indicating site origin
- All SHs and CM are required to declare site membership
- Indexers only return data if their “primary” status matches the requested site
Indexer migration to new indexers
1) Install new Indexers
2a) Bootstrap indexers to join the cluster as peers
2b) Ensure new indexers receive common configuration
- Distribute files and apps with the configuration bundle
- Common files: indexes.conf, props.conf, transforms.conf
3a) Prepare to decommission old indexers
- Point forwarders to new indexers
- Put old indexers into detention
3b) Decommission old indexers (one at a time)
- Run command splunk offline --enforce-counts
- Wait for indexer status to show as GracefulShutdown in the CM UI
- Repeat for remaining indexers
- CM will fix / migrate buckets to new hardware
3c) Remove the old peer from the master's list
Upgrade CM
stop master
upgrade using normal Splunk Enterprise procedure
start the master
Upgrade search tier
stop all SHs
Upgrade using normal procedures
- if integrated with IDXC
- upgrade one member and make it the captain
- upgrade additional members one by one
- upgrade deployer
start SH
Upgrade IDX tier
- on CM enable maintenance mode to prevent unnecessary fix-ups
- stop indexers
- upgrade normally
- start indexers
- on CM disable maintenance-mode
Or searchable rolling restart on indexers
Manage SHC
Deployer - pushes out apps
SHs replicate knowledge objects
SHC Deployment
1) identify requirements
2) set up deployer
3) Install Splunk instance
4) initialize cluster members
5) bring up cluster captain
6) perform post-deployment set up
Deployer set up config:
server.conf
[shclustering]
pass4SymmKey = pwd
shcluster_label = label
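Apps staged under etc/shcluster/apps on the deployer are then pushed with apply shcluster-bundle; the target URI and credentials below are placeholders:

./splunk apply shcluster-bundle -target https://member1.example.com:8089 -auth admin:pwd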
How to initialize SHC members
splunk init shcluster-config -auth user:pwd -mgmt_uri https://<member>:8089 -replication_port <port> -replication_factor <n> -conf_deploy_fetch_url https://<deployer>:8089 -secret key -shcluster_label label
SHC benefits
- Horizontal scaling for increased capacity
- HA for scheduled search activity
- Centralized management of baseline configs
- Replication of user generated content for consistent user experience
What does SHC captain do?
Coordinates replication of artifacts, maintains registry.
- artifacts stored in /var/run/splunk/dispatch
Pushes knowledge bundles to peers
Replicates runtime config updates
Assigns jobs to members based on relative current loads
What causes new captain election SHC?
Uses a dynamic election - election occurs:
- current captain fails or restarts
- network errors cause 1+ members to disconnect
- current captain steps down after detecting that majority of members have stopped participating in the cluster
- New captain is elected with majority vote
Captain Election Implications
Cluster consists of at least 3 members
Captain election requires 51%
If deploying across 2 sites, the primary site must contain the majority of nodes, because a network disruption will then still allow an election
Why/how to control captaincy SHC?
server.conf
[shclustering]
preferred_captain = true
- to have one member always run as captain
- you don’t want captain performing ad hoc jobs
- repair the cluster
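Captaincy can also be moved on demand with the transfer command; the URI is a placeholder:

./splunk transfer shcluster-captain -mgmt_uri https://preferred-member.example.com:8089 -auth admin:pwd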