Getting Started Flashcards
What does the host map allow you to do? (4)
Quickly visualize your environment
Identify outliers
Detect usage patterns
Optimize resources
Why is tagging important for infra monitoring?
All machines show up in the infrastructure list.
You can see the tags applied to each machine.
Tagging allows you to indicate which machines have a particular purpose.
Datadog attempts to automatically categorize your servers. If a new machine is tagged, you can immediately see the stats for that machine based on what was previously set up for that tag.
APM
Provides you with deep insight into your application’s performance:
- from automatically generated dashboards for monitoring key metrics, like request volume and latency,
- to detailed traces of individual requests
side by side with your logs and infrastructure monitoring.
When a request is made to an application, Datadog can see the traces across a distributed system, and show you systematic data about precisely what is happening to this request.
Events Explorer
The Event Explorer displays the most recent events generated by your infrastructure and services.
You can also submit your own custom events using the Datadog API, custom Agent checks, DogStatsD, or the Events email API.
In the Event Explorer, filter your events by facets or search queries. Group or filter events by attribute and graphically represent them with event analytics.
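Of the submission routes listed above, DogStatsD takes events as plain-text UDP datagrams. The following is a minimal sketch, assuming an Agent with DogStatsD listening on the default 127.0.0.1:8125. The function names `build_event_datagram` and `send_event`, and the sample title, text, and tags, are illustrative and not part of any Datadog library.

```python
import socket


def build_event_datagram(title, text, tags=None, alert_type=None):
    """Build a DogStatsD event datagram: _e{title_len,text_len}:title|text|..."""
    # Line breaks in the text are sent as a literal backslash followed by "n".
    text = text.replace("\n", "\\n")
    # The header carries the UTF-8 byte lengths of the title and the text.
    header = "_e{%d,%d}" % (len(title.encode("utf-8")), len(text.encode("utf-8")))
    datagram = f"{header}:{title}|{text}"
    if alert_type:
        # Optional alert type field, for example "info" or "error".
        datagram += f"|t:{alert_type}"
    if tags:
        # Optional comma-separated tag list.
        datagram += "|#" + ",".join(tags)
    return datagram


def send_event(datagram, host="127.0.0.1", port=8125):
    """Send one datagram over UDP (assumed default DogStatsD address)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


if __name__ == "__main__":
    event = build_event_datagram(
        "Deploy finished",
        "web v1.2 rolled out",
        tags=["env:staging", "service:web"],
        alert_type="info",
    )
    print(event)
    # send_event(event)  # uncomment on a host where the Agent is running
```

UDP is fire-and-forget, so a successful send does not confirm that the event arrived. Check the Event Explorer to verify.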
Events from Events Explorer can include (4)
- Code deployments
- Service health changes
- Configuration changes
- Monitoring alerts
The Event Explorer automatically gathers events collected by the Agent and installed integrations.
NPM
- gives you visibility into your network traffic across any tagged object in Datadog: from containers to hosts, services, and availability zones.
- Group by anything—from datacenters to teams to individual containers.
- Use tags to filter traffic by source and destination. The filters then aggregate into flows, each showing traffic between one source and one destination, through a customizable network page and network map.
What does each NPM flow contain, and what does it report?
network metrics such as throughput, bandwidth, retransmit count, and source/destination information down to the IP, port, and PID levels.
It then reports key metrics such as traffic volume and TCP retransmits.
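To make the idea of a flow concrete, here is a toy sketch that aggregates connection records into flows, each keyed by one source and one destination. The record fields (`src`, `dst`, `bytes_sent`, `retransmits`) and the sample values are invented for illustration. This is not how Datadog stores or computes flows.

```python
from collections import defaultdict


def aggregate_flows(records, src_key="src", dst_key="dst"):
    """Sum traffic volume and TCP retransmits per (source, destination) pair."""
    flows = defaultdict(lambda: {"bytes_sent": 0, "retransmits": 0})
    for rec in records:
        # One flow per unique source/destination combination.
        flow = flows[(rec[src_key], rec[dst_key])]
        flow["bytes_sent"] += rec["bytes_sent"]
        flow["retransmits"] += rec["retransmits"]
    return dict(flows)


# Hypothetical records, grouped here by a service tag on each side.
records = [
    {"src": "service:web", "dst": "service:db", "bytes_sent": 1200, "retransmits": 0},
    {"src": "service:web", "dst": "service:db", "bytes_sent": 800, "retransmits": 2},
    {"src": "service:web", "dst": "service:cache", "bytes_sent": 300, "retransmits": 0},
]
flows = aggregate_flows(records)
```

Changing what `src` and `dst` hold (a host, a container, an availability zone) changes the grouping, which mirrors the "group by anything" idea on the card above.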
RUM
allows you to visualize and analyze real-time user activities and experiences.
Session Replay
Allows you to capture and view the web browsing sessions of your users to better understand their behavior.
RUM explorer allows you to
not only visualize load times, frontend errors, and page dependencies, but also correlate business and application metrics to troubleshoot issues with application, infrastructure, and business metrics in one dashboard.
Serverless
- Lets you write event-driven code and upload it to a cloud provider, which manages all of the underlying compute resources.
- brings together metrics, traces, and logs from your AWS Lambda functions running serverless applications into one view, so that you can optimize performance by filtering to functions that are generating errors, high latency, or cold starts.
Cloud SIEM
- automatically detects threats to your application or infrastructure.
- For example: a targeted attack, an IP communicating with your systems that matches a threat intel list, or an insecure configuration.
- These threats are surfaced in Datadog as Security Signals and can be correlated and triaged in the Security Explorer
Synthetics
- allow you to create and run API and browser tests that proactively simulate user transactions on your applications and monitor all internal and external network endpoints across your system’s layers.
- You can detect errors, identify regressions, and automate rollbacks to prevent issues from surfacing in production.
About the agent
- it’s software that runs on your hosts.
- It collects events and metrics from hosts and sends them to Datadog, where you can analyze your monitoring and performance data.
- It can run on your local hosts (Windows, macOS), in containerized environments (Docker, Kubernetes), and in on-premises data centers.
- You can install and configure it using configuration management tools (Chef, Puppet, Ansible).
How much & how often does the agent collect system level metrics?
The Agent is able to collect 75 to 100 system level metrics every 15 to 20 seconds.
What can the agent collect?
System level metrics
+
With additional configuration, the Agent can send live data, logs, and traces from running processes to the Datadog Platform.
Where can I find info on the agent?
The Datadog Agent is open source and its source code is available on GitHub at DataDog/datadog-agent.
Agent overhead
The amount of space and resources the Agent takes up depends on the configuration and what data the Agent is configured to send.
- At the onset, you can expect around 0.08% CPU used on average with a disk space of roughly 830MB to 880MB.
What agent metrics are sent by the agent about itself?
The following Agent metrics are information the Agent sends to Datadog about itself:
- datadog.agent.python.version (Shows 1 if the Agent is reporting to Datadog. The metric is tagged with the python_version.)
- datadog.agent.running (Shows 1 if the Agent is reporting to Datadog)
- datadog.agent.started (Reports 1 when the Agent starts; available in v6.12+.)
Why are agent metrics useful?
Help you determine things like:
- what hosts or containers have running Agents
- when an Agent starts
- what version of Python it’s running.
Differences between installing the agent on a host or in a containerized environment
- On a host:
1. Agent is configured using a YAML file
2. integrations are identified through the Agent configuration file
- Container:
1. Agent configuration options for a container’s Agent are passed in with environment variables, for example:
» DD_API_KEY for the Datadog API key
» DD_SITE for the Datadog site
2. integrations are automatically identified through Datadog’s Autodiscovery feature
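The host/container split above can be sketched as a name translation: a top-level setting in the host's YAML file corresponds to an environment variable with the `DD_` prefix and an uppercase name. This is a minimal sketch, assuming only top-level options such as `api_key` and `site`; nested options follow other rules that are not covered here. The helper name `to_container_env` and the placeholder key value are hypothetical.

```python
def to_container_env(host_settings):
    """Translate top-level YAML-style option names into DD_* variable names."""
    # Assumption: each key is a simple top-level option, e.g. api_key -> DD_API_KEY.
    return {"DD_" + key.upper(): str(value) for key, value in host_settings.items()}


# Placeholder values; substitute your own API key and Datadog site.
settings = {"api_key": "<YOUR_API_KEY>", "site": "datadoghq.com"}
env = to_container_env(settings)

for name, value in env.items():
    print(f"{name}={value}")
```

The resulting names match the two variables on the card, `DD_API_KEY` and `DD_SITE`.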
Why is it recommended I install the Agent? (2)
- The Agent needs to be installed to send data from any of the many Agent-based integrations.
- the Agent is the recommended method to forward your data to the Datadog Platform.
How many enabled checks does the Agent have?
The Agent has several checks enabled, which collect over 50 default metrics to provide greater insight into system-level data.
What are the prerequisites required to install the agent?
- Create a Datadog account.
- Have your Datadog API key on hand.
- Have the Datadog UI open.
How do you install the agent?
To install the Datadog Agent on a host, use the one-line install command from the Agent installation page in the Datadog UI, updated with your Datadog API key.
What message displays in your Events Explorer if your Agent successfully installs?
Datadog agent (v. 7.XX.X) started on <Hostname>
What service checks is the agent set up to deliver?
datadog.agent.up: Returns OK if the Agent connects to Datadog.
datadog.agent.check_status: Returns CRITICAL if an Agent check is unable to send metrics to Datadog, otherwise returns OK.
What are the agent service checks useful for?
These checks (datadog.agent.up and datadog.agent.check_status) can be used in the Datadog Platform to visualize the Agent status at a glance through monitors and dashboards.
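As a rough illustration of what a monitor on these checks does, the sketch below flags hosts whose latest status is anything other than OK. It assumes Datadog's numeric service check statuses (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The input shape, the host names, and the helper `hosts_needing_attention` are hypothetical; in practice a monitor or dashboard widget does this inside the platform.

```python
# Numeric service check statuses (assumed mapping: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).
STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def hosts_needing_attention(latest_statuses):
    """Return {host: status_name} for every host whose latest status is not OK."""
    return {
        host: STATUS_NAMES.get(code, "UNKNOWN")
        for host, code in sorted(latest_statuses.items())
        if code != 0
    }


# Hypothetical latest datadog.agent.check_status value per host.
sample = {"web-1": 0, "web-2": 2, "db-1": 0}
flagged = hosts_needing_attention(sample)
```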