Section 5.1 Flashcards

(71 cards)

1
Q

What does the Indexing Layer do?

A

Allows you to clean up data.

Allows you to refine data.

Allows you to store data.

2
Q

What is Index clustering?

A

When multiple indexers are connected so that copies of the indexers' buckets (data) are replicated across them.

3
Q

Where is data stored?

A

In indexes on the indexer that have buckets.

4
Q

What is automatic failover?

A

Basically, it works like a backup of the data: if one indexer fails, the others pick up the slack and maintain continuity.

5
Q

High availability means…

A

Data is highly available for searching.

6
Q

Index Clustering in summary means

A

Data is protected from sudden loss

More copies are available for users who are actively searching

Indexer activities will continue in the event an indexer goes down

7
Q

Replication Factor determines

A

How many copies are maintained within an indexer cluster.

Default RF is 3

Maximum RF is determined by the number of indexers (peer nodes) you have.

8
Q

Search Factor determines

A

How many of these copies are immediately searchable.

Default SF is 2

9
Q

In a clustering environment you need a minimum of ____ Indexers

A

3

10
Q

Most important fact about a Search Factor (SF)

A

The Search Factor can never be more than the Replication Factor.
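
The rules from the RF and SF cards can be checked together. Below is a minimal sketch in Python, assuming a hypothetical helper named validate_factors; it is a study aid, not part of Splunk, and it only encodes the two limits stated in the cards (SF cannot be more than RF, and RF cannot be more than the number of peers).

```python
def validate_factors(replication_factor: int, search_factor: int, peer_count: int) -> list[str]:
    """Return a list of problems; an empty list means the settings are consistent."""
    problems = []
    if replication_factor < 1 or search_factor < 1:
        problems.append("RF and SF must both be at least 1")
    if search_factor > replication_factor:
        # Card 10: the Search Factor can never be more than the Replication Factor.
        problems.append("SF cannot be more than RF")
    if replication_factor > peer_count:
        # Card 7: the maximum RF is set by the number of indexers (peers).
        problems.append("RF cannot be more than the number of peers")
    return problems


# Defaults from the cards (RF 3, SF 2) on a three-indexer cluster.
print(validate_factors(3, 2, 3))
# A search factor larger than the replication factor is rejected.
print(validate_factors(2, 3, 3))
```

With the default values, the smallest cluster that passes the check has three indexers, which matches the minimum given in card 9.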

11
Q

Explain RF & SF

A

The Replication Factor (RF) tells us how many times we want the data to be copied over. Two of those copies are highly available (searchable), so a second one is ready just in case something happens to the first. If both of those copies go down, there is still the third copy, which is usually stored at an offsite location.

12
Q

When does the Cluster Master come in?

A

The Cluster Master comes into play when we start copying our data (when the environment becomes clustered).

13
Q

Cluster Master Manages what layer?

A

It manages the indexing layer.

14
Q

What is the Cluster Master?

A

A centralized configuration manager whose job is to manage the indexer cluster.

15
Q

Once the environment becomes clustered, the Deployment Server….

A

Only manages the forwarders.

16
Q

What does a Cluster Master do?

A

Manages cluster activities (adding peers, distributing configurations, determining the number of copies to maintain).

Maintains memory of peers, their buckets, and configs.

Tells the search head where to request data.

17
Q

What are Peers (Cluster Peer)?

A

Peers are Indexers

18
Q

What do Peer Nodes do?

A

Peers receive and index incoming data (typically from forwarders)

Replicate data to other peers

Respond to incoming searches by supplying search results

19
Q

A clustered architecture is called ..

A

A distributed search

20
Q

Clustering is Smart because it provides….

A

Data Availability
Data Fidelity
Data Resiliency
Disaster Recovery
Search Affinity

21
Q

Multi-site clustering =

A

Storing copies of your data at a different site

22
Q

Data fidelity =

A

The act of not losing data; reliability

23
Q

Benefits of Clustering =

A

1. Data Availability & fast recovery
2. Easier overall administration
3. Scalability of indexing
4. No additional cost for data replication

24
Q

Cons of clustering =

A

1. Increased storage requirements
2. Increased processing load
3. Requires additional Splunk instances
4. Indexers require the same OS and versions

25
When you enable a search head in a cluster environment, what must you specify?
Cluster settings (i.e. Master Node) and the port on which it receives data.
26
Transforms.conf=
specify transformations and lookups that can then be applied to any event
27
What is the file path on the CM from which apps are sent to its peers?
$SPLUNK_HOME/etc/master-apps
28
Where do bundles reside for cluster peer?
$SPLUNK_HOME/etc/slave-apps
29
$SPLUNK_HOME/etc/slave-apps =
where you will always find pushed configuration files (sent from CM to indexer)
30
Config changes that require restart?
A. Changes to indexes.conf or inputs.conf B. Home path changes in indexes.conf C. Deleting an existing app
31
Configuration changes that do not need a restart ?
Adding a new index or a new app with reloadable configs; changes or additions to transforms.conf or props.conf
32
Tell me about your environment
In my environment we have a current quota of about 50TB, and we are currently ingesting about 49TB per day with 600 users. We have about 290 indexers, with close to 32,000 forwarders and about 12 search heads.
33
Your environment has too many forwarders to manage one at a time. What Splunk instance would you install, and how would you configure it to manage all the forwarders?
Use a Deployment Server: put the forwarders in a serverclass and create deployment apps to configure all of them.
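
The serverclass idea in this card can be pictured as a small serverclass.conf file. The Python sketch below only builds the text of such a file. The class name, host pattern, and app name are hypothetical, and the stanza and setting names (serverClass, whitelist.N, restartSplunkd) reflect my understanding of serverclass.conf, so check them against the spec file for your Splunk version.

```python
def build_serverclass(class_name: str, host_patterns: list[str], apps: list[str]) -> str:
    """Build serverclass.conf text: one serverclass, its clients, and its apps."""
    lines = [f"[serverClass:{class_name}]"]
    # Each whitelist entry matches forwarders (deployment clients) by name or pattern.
    for position, pattern in enumerate(host_patterns):
        lines.append(f"whitelist.{position} = {pattern}")
    # Each app stanza assigns one deployment app to every client in the class.
    for app in apps:
        lines.append("")
        lines.append(f"[serverClass:{class_name}:app:{app}]")
        lines.append("restartSplunkd = true")
    return "\n".join(lines) + "\n"


# Hypothetical names, for illustration only.
print(build_serverclass("linux_web", ["web-*"], ["web_inputs"]))
```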
34
In your deployment app you are Configuring inputs.conf to bring in new data-you then search with search head and cannot find the data. What happened?
- Didn't send the deployment app to the correct serverclass
- Mistake in the monitoring stanza
- Did not specify the right index
- Serverclass has not phoned home
- Turn monitoring on (BEST ANSWER)
- Splunk does not have permissions to read the source file
35
In what directory of the deployment app must you place your inputs.conf file?
local directory
36
indexer uses what port
9997
37
fishbucket index importance
allows you to see how far into a file indexing has occurred-helps to avoid duplicates and comes in handy after server shutdown or connection errors.
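
The checkpoint idea behind this card can be sketched in a few lines of Python. This is a simplified model, not Splunk's implementation: the read_new_data helper and the dictionary of offsets keyed by file path are assumptions made for illustration. It shows why remembering how far into a file you have read avoids duplicates after a restart.

```python
import os
import tempfile


def read_new_data(path: str, checkpoints: dict[str, int]) -> bytes:
    """Return only the bytes added since the last read and advance the checkpoint."""
    offset = checkpoints.get(path, 0)  # where the previous read stopped
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
    checkpoints[path] = offset + len(data)
    return data


checkpoints = {}
with tempfile.TemporaryDirectory() as folder:
    log_path = os.path.join(folder, "app.log")
    with open(log_path, "wb") as log:
        log.write(b"event 1\n")
    first = read_new_data(log_path, checkpoints)
    with open(log_path, "ab") as log:
        log.write(b"event 2\n")
    # The second read starts at the saved offset, so "event 1" is not read again.
    second = read_new_data(log_path, checkpoints)
    print(first, second)
```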
38
advantages of indexer clustering
1. Data Availability & fast recovery
2. Easier overall administration
3. Scalability of indexing
4. No additional cost for data replication
A. Data Availability = how often your data is available to be utilized.
B. Data Fidelity = the act of not losing data.
C. Data Reliability = the accuracy, consistency, and dependability of the data being ingested, indexed, and queried within the platform.
D. Data Resiliency = the platform's ability to maintain data availability, integrity, and accessibility even in the face of unexpected failures.
E. Disaster Recovery = the set of processes and strategies put in place to ensure availability and continuity of Splunk services and data.
F. Search Affinity = search local sites; a mechanism for intelligently routing and distributing search jobs across a distributed Splunk environment.
39
explain data availability
how often your data is available to be utilized
40
who manages all indexes in cluster environment? Explain
Cluster Master/Master Node
41
how would you configure hot bucket to roll over by time
Set maxHotSpanSecs in indexes.conf
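
As a rough illustration, the Python sketch below builds an indexes.conf stanza that sets maxHotSpanSecs. The index name web_logs and the one-day value are hypothetical examples, not recommendations.

```python
def hot_bucket_time_stanza(index_name: str, span_seconds: int) -> str:
    """Build an indexes.conf stanza that limits a hot bucket's time span."""
    if span_seconds <= 0:
        raise ValueError("span_seconds must be positive")
    return f"[{index_name}]\nmaxHotSpanSecs = {span_seconds}\n"


# One day expressed in seconds; the index name is made up for this example.
print(hot_bucket_time_stanza("web_logs", 24 * 60 * 60))
```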
42
default port used for replication
8080 is the replication port. 8089 is the management port; it is used between a configuration manager and ANY client it manages (Deployment Server to its clients, Cluster Master to its indexers). 9997 is the data (receiving) port.
43
what is metadata and what does it contain?
Metadata is like a bar code: it tells you where a product is coming from (IP address, log path, and format of the data).
44
What is source
The name of the file, stream, or other input from which the event originates
45
give examples of sourcetypes you worked with
json and syslog or CSV
46
what is the largest sourcetype you have worked with?
syslog is network data and large
47
high availability
- High availability = when we are replicating data within our indexers
- Multiple copies are available for searching
- Data gets into our indexers in round-robin fashion
48
distributed search?
key feature that allows you to search and analyze data across multiple Splunk instances or indexers in a distributed Splunk deployment. This is especially useful in large-scale environments where the volume of data to be searched and analyzed exceeds the capacity of a single Splunk instance.
49
how replicated buckets are stored in indexers
1. Data arrives at the indexers, distributed in round-robin fashion.
2. The data is written on the indexer that received it.
3. The replication process then moves from indexer to indexer, looking for a healthy one to store a copy of that specific data.
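
The replication step can be modeled as a walk over the peers. The Python sketch below is a simplified picture of the steps above and assumes a hypothetical choose_replica_targets helper; in a real cluster the Cluster Master coordinates where copies go, so the walk order here is only an assumption.

```python
def choose_replica_targets(origin: str, peers: list[str], healthy: set[str],
                           replication_factor: int) -> list[str]:
    """Pick healthy peers to hold the extra copies of a bucket written on origin."""
    needed = replication_factor - 1  # the origin already holds one copy
    targets: list[str] = []
    start = peers.index(origin)
    # Walk the other peers in order, starting just after the origin.
    for step in range(1, len(peers)):
        if len(targets) >= needed:
            break
        candidate = peers[(start + step) % len(peers)]
        if candidate in healthy:
            targets.append(candidate)
    return targets


peers = ["idx1", "idx2", "idx3", "idx4"]
# idx2 is down, so the two extra copies for RF 3 skip it.
print(choose_replica_targets("idx1", peers, {"idx1", "idx3", "idx4"}, 3))
```

If fewer healthy peers exist than the replication factor needs, the function returns a shorter list, which corresponds to a cluster that cannot currently meet its RF.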
50
how does forwarder distribute data among indexers without replication (regular data)
round robin fashion
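
Round-robin distribution can be shown with a short Python sketch. It models only the idea in this card; the distribute_round_robin helper and the indexer names are hypothetical, and a real forwarder's load balancing is more involved than this.

```python
from itertools import cycle


def distribute_round_robin(events: list[str], indexers: list[str]) -> dict[str, list[str]]:
    """Assign each event to the next indexer in turn, wrapping around at the end."""
    assignment: dict[str, list[str]] = {name: [] for name in indexers}
    for event, name in zip(events, cycle(indexers)):
        assignment[name].append(event)
    return assignment


print(distribute_round_robin(["e1", "e2", "e3", "e4", "e5"], ["idx1", "idx2", "idx3"]))
```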
51
reloading vs restarting DS
When you update the clients of the DS, reload the deployment server; when you make changes to the DS itself, restart the DS.
52
when increasing ingestion in cluster environment
add more indexers to the cluster
53
Some things to consider when moving to a clustered environment
Cost of more Splunk instances, ingestion of data, storage requirements, processing requirements
54
You notice that your newly monitored data is not in the index that you have configured it to be in. Where is data possibly being stored and how would you troubleshoot it?
Go to inputs.conf and validate that the 'index' setting is correct. If no index is set, the data goes to the default index, main.
55
Where does fresh, newly indexed data in Splunk go?
Hot bucket
56
Under what circumstances would the data in the hotbucket stop writing?
If the hot bucket is too full or if there is a restart.
57
In order to have a Splunk search head, what would you need to download?
Splunk Enterprise
58
Maximum number of concurrent users per search head
12
59
What is maxHotBuckets?
The maximum number of hot buckets that can exist in an index
60
Which default port is for replication?
8080 port
61
What is the thawing process
Frozen data has to be thawed before it can be searched again. Move that file into the thawed directory and rename it to a name that Splunk recognizes.
62
What must happen before indexer can be part of a cluster?
Indexer must become cluster member
63
Cluster Master/Master Node
You only need ONE
64
Internal Index?
Used for troubleshooting; stores all Splunk components' internal logs and processing metrics. Search it for logs that say ERROR or WARN.
65
Monitoring stanza in Windows vs Linux
Windows = [monitor://C:\app\log\data\catalina.out] Linux = [monitor:///another/random/path]
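
Both stanza forms follow the same pattern: the prefix monitor:// followed by the path. The Python sketch below builds an inputs.conf monitor stanza from a path. The index and sourcetype values are hypothetical, and the helper is a study aid rather than a Splunk tool.

```python
def monitor_stanza(path: str, index: str, sourcetype: str, disabled: bool = False) -> str:
    """Build an inputs.conf monitor stanza for one file or directory."""
    lines = [
        f"[monitor://{path}]",  # a Linux path starting with / gives three slashes
        f"index = {index}",
        f"sourcetype = {sourcetype}",
        f"disabled = {1 if disabled else 0}",
    ]
    return "\n".join(lines) + "\n"


# The two paths come from the card; index and sourcetype names are made up.
print(monitor_stanza(r"C:\app\log\data\catalina.out", "web_logs", "catalina"))
print(monitor_stanza("/another/random/path", "web_logs", "syslog"))
```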
66
Two types of files indexes consist of
raw data (full log files) and index files (tsidx)
67
To disable the monitor to stop sending logs
Go to the monitor stanza and set disabled to true or 1
68
Explain summary index
Summary indexing allows you to run fast searches over a large data set by scheduling Splunk to summarize data then import data into the summary index over time
69
When increasing ingestion of data by 2 TB what will you have to do? In clustered environment-how would you accommodate it.
Adding indexers to the cluster to accommodate growth.
70
What directory are apps deployed to in a clustered environment
The slave-apps file path ($SPLUNK_HOME/etc/slave-apps)
71
When will you use the management port (8089)?
When the CM is communicating with its clients or slaves