Data Collection Flashcards
What are the forwarder types?
A universal forwarder:
- A streamlined binary package that contains only the components needed to forward data
- Data is forwarded unparsed
Heavy Forwarder:
- Uses the Splunk Enterprise binary with all capabilities
- Parses data before forwarding
- Can consume more resources than parsing on the indexer
Lightweight Forwarder:
- A full install of Splunk Enterprise, but it does not parse inputs
- Data is sent uncooked/raw
- Deprecated as of Splunk 6.0
What is cooked data?
Both parsed and unparsed forwarder data are considered “cooked” data; otherwise data is sent in raw form, as with syslog
What does each event’s header contain?
host, source, sourcetype, and target index
When unparsed data is forwarded, in what size blocks is it delivered?
64 KB blocks
Describe Parsed Data
Parsed data is load-balanced across all indexers using automatic load balancing.
Automatic switchover is done on a 30 second timer (default)
This can be configured based on volume in outputs.conf
By time: autoLBFrequency - 30 second default
By volume: the autoLBVolume setting. Default is 0 bytes. If set to anything other than 0, the forwarder will switch indexers based on the amount of data sent.
If both time and volume are set, then whichever threshold is reached first wins
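The load-balancing settings above can be sketched in outputs.conf on the forwarder (the group name and indexer hostnames below are placeholders):

```ini
# outputs.conf on the forwarder (indexer names are hypothetical)
[tcpout:primary_indexers]
server = idx1.example.com:9997, idx2.example.com:9997
# switch indexers every 30 seconds (the default)
autoLBFrequency = 30
# or switch after ~1 MB of data; whichever threshold is hit first wins
autoLBVolume = 1048576
```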
Describe Unparsed data
The unparsed data stream is sent to the indexer tagged with minimal metadata to identify host, source, sourcetype, and target index
The stream is divided into 64 KB blocks
The stream is stamped with the time zone of the originating forwarder
How is unparsed data loadbalanced?
Adheres to the same load balancing parameters as parsed data.
There are some issues with unparsed data and load balancing on a timer or volume setting: events can be truncated or trashed at the switchover point.
To avoid trashed or truncated events, set EVENT_BREAKER (a regex) and EVENT_BREAKER_ENABLE = true in props.conf on the forwarder.
Only the forwarder is required to be at least version 6.5; the indexers need not be upgraded.
What props.conf setting can be copied to EVENT_BREAKER for it to work?
LINE_BREAKER
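A minimal props.conf sketch on the forwarder; the sourcetype name is hypothetical, and the EVENT_BREAKER value mirrors the default LINE_BREAKER pattern:

```ini
# props.conf on the forwarder (sourcetype name is illustrative)
[my_custom_sourcetype]
EVENT_BREAKER_ENABLE = true
# copied from the sourcetype's LINE_BREAKER setting
EVENT_BREAKER = ([\r\n]+)
```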
Describe the Monitor input.
Most common input
Continuously watches a file or directory, ingesting new events as they arrive.
Files should be local to the host doing the monitoring
- use a forwarder on the host where the logs reside
Configure using the [monitor://<path>] stanza in inputs.conf
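A minimal monitor stanza sketch for inputs.conf; the path, index, and sourcetype are illustrative:

```ini
# inputs.conf (path and names are illustrative)
[monitor:///var/log/messages]
index = os_logs
sourcetype = syslog
disabled = 0
```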
Describe the batch input.
The batch input reads a file, indexes the data, and deletes the file
Best used with large archives of historic data, not files that continue to be written to.
Use a forwarder on the host where the files reside.
[batch://] stanza has several unique settings, but also uses the same settings as the monitor input.
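A batch stanza sketch, assuming a hypothetical archive directory; move_policy = sinkhole is the setting unique to batch that tells Splunk to delete each file after indexing it:

```ini
# inputs.conf (path and names are illustrative)
[batch:///opt/archive/old_logs]
# sinkhole is required: Splunk deletes each file once it is indexed
move_policy = sinkhole
index = history
sourcetype = archived_logs
```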
Describe the Script input
Splunk runs the script in the [script://] stanza and ingests the output (STDOUT/STDERR)
Splunk supports a number of script types including PowerShell, Python, Windows batch files, or any other utility that can format and stream the data that you want to index
Place scripts in bin/ directory of your app:
$SPLUNK_HOME/etc/apps/<app_name>/bin/
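A script input sketch; the script name is hypothetical, and interval controls how often Splunk runs it (in seconds):

```ini
# inputs.conf (script name is hypothetical)
[script://./bin/collect_stats.sh]
# run every 300 seconds; STDOUT/STDERR output is indexed
interval = 300
index = main
sourcetype = host_stats
```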
Describe the FIFO input
A First In, First Out (FIFO) input reads data from the FIFO queue referenced in the path specified in the stanza
- not currently supported by Splunk web
- If using Splunk Cloud, use an HF for FIFO inputs
It is important to note that data sent over FIFO queues is held only in memory and is not persisted to disk, which can make it an unreliable method for data sources.
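A FIFO stanza sketch; the queue path and names are illustrative:

```ini
# inputs.conf (queue path is illustrative)
[fifo:///var/run/my_app_fifo]
index = main
sourcetype = fifo_data
```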
Describe an fschange input
File system change (fschange) monitors the directory and subdirectories referenced in the path for updates, additions, and deletions
- note the stanza does not preface with “//”
Events arrive at Splunk to indicate a change from the prior state
The stanza uses different settings than other inputs
A directory cannot be simultaneously monitored by [fschange:] and [monitor://…].
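An fschange stanza sketch; the monitored path is illustrative, and note the single colon rather than “//”:

```ini
# inputs.conf (note the single colon, no "//")
[fschange:/etc]
# check for changes every 60 seconds
pollPeriod = 60
recurse = true
# set to true to index the full contents of changed files
fullEvent = false
```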
Describe Perfmon, WinEventLog, WMI, and admon Inputs
Windows inputs that monitor Perfmon counters, Windows event logs, and Active Directory (admon).
WMI can be used for remote servers, but is highly discouraged. A UF should be placed on the remote server and WinEventLog used instead
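A WinEventLog stanza sketch for a UF installed on the Windows host; the index name is illustrative:

```ini
# inputs.conf on a UF installed on the Windows host
[WinEventLog://Security]
disabled = 0
index = wineventlog
```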
Describe the http input
This configures the HTTP Event Collector (HEC), which is a token-based HTTP input that is secure and scalable
Note that the inputs.conf must live within the application scope of $SPLUNK_HOME/etc/apps/splunk_httpinput/local/inputs.conf to be active.
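A HEC sketch for that inputs.conf; the stanza name, token value, and index are placeholders:

```ini
# global stanza enables the HEC endpoint
[http]
disabled = 0

# per-token stanza (token value is a placeholder GUID)
[http://my_hec_input]
token = 00000000-0000-0000-0000-000000000000
disabled = 0
index = main
sourcetype = hec_json
```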