Infra questions Flashcards
potential infrastructure questions
How long do we store metrics for?
We retain all of our metrics for 15 months
Can I automate the build of dashboards and monitors?
Yes - you can use our API alongside tools like Terraform to automate builds
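As a rough illustration of the API route, the sketch below builds a monitor definition and posts it to the monitor-creation endpoint. The monitor name, query, threshold, and message are made-up examples, and the URL assumes the US1 site; check the API reference for your own account before relying on it.

```python
import json
import urllib.request

API_URL = "https://api.datadoghq.com/api/v1/monitor"  # assumes the US1 site


def build_monitor(name, query, message):
    """Return a metric-alert monitor definition as a plain dict."""
    return {"name": name, "type": "metric alert", "query": query, "message": message}


def create_monitor(monitor, api_key, app_key):
    """POST the monitor definition and return the decoded JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(monitor).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "DD-API-KEY": api_key,
            "DD-APPLICATION-KEY": app_key,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical example monitor: alert when average CPU stays above 90%.
cpu_monitor = build_monitor(
    name="High CPU on web hosts",
    query="avg(last_5m):avg:system.cpu.user{role:web} by {host} > 90",
    message="CPU is high on {{host.name}}",
)
```

Keeping definitions like cpu_monitor in version control is what makes the builds repeatable; create_monitor is only called once real keys are supplied.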
Do you deploy the agent as a sidecar for my Kubernetes Cluster?
No - it’s deployed as a DaemonSet, so there is one Agent per host/node
How often do we receive metrics from API crawlers?
For cloud integrations, around every 10 minutes, which is the same interval you’d experience in cloud-native tools.
Do we need the Agent? Or can we just configure through the API?
The APIs collect metrics on a 10-15 minute interval, so to get more granular AND quicker metrics you should install the Agent, which collects every 15 seconds. Why would you NOT want the Agent?
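To put the granularity difference in numbers, this small sketch counts samples per hour at each interval quoted above. It is plain arithmetic on those figures, not a measurement.

```python
def points_per_hour(interval_seconds):
    """Number of samples collected in one hour at a fixed interval."""
    return 3600 // interval_seconds


agent = points_per_hour(15)           # Agent: every 15 seconds
crawler_fast = points_per_hour(600)   # API crawler: every 10 minutes
crawler_slow = points_per_hour(900)   # API crawler: every 15 minutes

print(agent, crawler_fast, crawler_slow)  # 240 6 4
```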
Where is my data stored?
We are fully hosted in the cloud, across multiple cloud providers (AWS, Azure, GCP). Mostly in the USA, but also in the EU.
Do you offer on-prem hosting?
No.
How could I get more advanced metrics from RDS?
Deploy the Agent to an EC2 instance in the same security group as the database - you can then hook into RDS like a regular DB
Can I automate setting up multiple AWS accounts rather than manually in the UI?
Yes, using Terraform, the API, or CloudFormation
Any community dashboards?
No. But we have all types of dashboards and it is very easy to customize your own
Can I convert Grafana dashboards into DD so I don't have to recreate them?
No
Can I trigger changes on my infrastructure when an alert goes off?
Not directly, but we can trigger a webhook so that a client can run a script in response. We also have AWS automation and Kubernetes integrations for autoscaling
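On the client side, the webhook pattern might look like the minimal receiver below, which reads the posted JSON and picks a remediation step. The payload fields (alert_type, title) and the action names are hypothetical; the real payload depends on how the webhook template is configured.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def choose_action(payload):
    """Map an alert payload to a remediation step (field names are hypothetical)."""
    is_error = payload.get("alert_type") == "error"
    if is_error and "disk" in payload.get("title", "").lower():
        return "cleanup_disk"
    if is_error:
        return "restart_service"
    return "noop"


class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        action = choose_action(payload)
        # A real receiver would launch the remediation script here.
        body = json.dumps({"action": action}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the receiver until interrupted."""
    HTTPServer(("", port), AlertHandler).serve_forever()
```

Calling serve() starts the listener; the decision logic lives in choose_action so it can be checked without a network.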
Will I see Watchdog alerts right away?
No, it can take anywhere from 2-6 weeks. Watchdog needs to collect data in order to start making assumptions. The more data it has, the better
Can I silence alerts periodically?
Yes, you can easily schedule downtimes for monitors
How long does it take to set up communication integrations?
With any supported tech stack, very quickly, but it can depend on the setup
What if I have to completely separate my data based on teams?
We could spin up multi-org environments, or separate DD instances
Is RBAC supported?
Yes, for logs and dashboards
Do you monitor NetFlow? (e.g. monitoring the flow between devices such as routers/firewalls)
Yes, in public beta
Can you support hybrid environments?
Yes
Do you support serverless?
Yes - there's a Serverless page under Infrastructure that shows AWS Lambda functions and AWS serverless resources
How often do live containers get updated?
2 second buffer period
What's the Agent overhead?
Depends… 0.08%-3% CPU
If a client is running AWS ECS on Fargate, do they deploy the agent as a sidecar using Docker within the task definition?
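For context, a sidecar here means a second container in the same task definition. The sketch below builds such a definition as a Python dict; the CPU and memory sizes, the container name, and the use of a secret reference for the API key are illustrative assumptions, so compare against the current Fargate integration docs before using it.

```python
def fargate_task_definition(app_name, app_image, api_key_secret_arn):
    """Build an ECS task definition dict with the Agent as a second container."""
    return {
        "family": app_name,
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",
        "cpu": "512",      # illustrative size
        "memory": "1024",  # illustrative size
        "containerDefinitions": [
            {"name": app_name, "image": app_image, "essential": True},
            {
                "name": "datadog-agent",
                "image": "public.ecr.aws/datadog/agent:latest",
                "essential": True,
                "environment": [{"name": "ECS_FARGATE", "value": "true"}],
                # Assumes the API key is stored as a secret the task can read.
                "secrets": [{"name": "DD_API_KEY", "valueFrom": api_key_secret_arn}],
            },
        ],
    }


# Hypothetical application and secret ARN, for illustration only.
task = fargate_task_definition(
    "web-app",
    "example/web-app:1.0",
    "arn:aws:secretsmanager:us-east-1:123456789012:secret:dd-api-key",
)
```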
Yes
Do I have to use Helm to deploy the agent in a Kubernetes environment?
Nope. We also provide YAML manifests for required resources
Can I see container health information beyond the 36 hours available in Live view?
Yes. Docker, Kubernetes, and Checks/Integrations retain container info for the standard 15 months
My K8s is ephemeral
Datadog can provide insights into the entire lifecycle of pods, including observing when they terminate. We can also capture real-time data on the CPU and memory utilization of pods
My servers are ephemeral
You can use a metric monitor on the system uptime metric to address this. This metric is an ever-increasing timer which resets to 0 when the host starts. You can use the diff() function with the metric to distinguish between a new server and a rebooted server
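The idea behind the uptime check can be simulated in plain Python: the difference between consecutive uptime samples only goes negative when the counter resets. This is an illustration of the logic with made-up sample values, not monitor query syntax.

```python
def classify_uptime(samples):
    """Label each sample after the first by its change from the previous one.

    samples: uptime readings in seconds for one host, oldest first.
    """
    labels = []
    for previous, current in zip(samples, samples[1:]):
        delta = current - previous  # what a diff-style function would report
        labels.append("rebooted" if delta < 0 else "running")
    return labels


existing_host = [86400, 86415, 12, 27]  # counter drops: the host rebooted
new_host = [5, 20, 35]                  # no earlier history, never drops

print(classify_uptime(existing_host))  # ['running', 'rebooted', 'running']
print(classify_uptime(new_host))       # ['running', 'running']
```

A rebooted server shows a negative difference against its own history, while a brand-new server has no earlier samples to drop from.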