Delta Live Tables Flashcards
What is Delta Live Tables?
A managed data pipeline tool exclusive to Databricks
1/ETL tool for data warehousing
2/Managed structured streaming for Spark
What problem does Delta Live Tables solve?
1/Write pipelines in simple, declarative code
2/Easily implement and monitor data quality constraints
3/Easily track and view data lineage
Why do customers care about Delta Live Tables?
It helps accelerate data pipeline development and reduces the burden of performance tuning and maintenance
When should you position Delta Live Tables with a customer?
Whenever you see data pipeline/ETL work in Databricks, whether batch or streaming
How do Delta Live Tables work?
1/Customers write DLT-specific SQL/Python code in a notebook
2/Once the code is written, they switch to Workflows, select Delta Live Tables, and configure a cluster to run the notebook
3/Once run, DLT populates the UI with a data lineage diagram and writes logs that are used for monitoring
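The three steps above can be pictured with a small plain-Python toy. This is only a sketch of the idea (declare tables, run them, get lineage and a log); `ToyPipeline`, its `table` decorator, and the sample order rows are hypothetical names made up for illustration and are not the DLT API. In a real DLT notebook the declarations would use the `dlt` Python module (for example `@dlt.table` and `@dlt.expect_or_drop`), which only runs inside a Databricks pipeline.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]


@dataclass
class ToyPipeline:
    """Plain-Python stand-in for a pipeline. NOT the real `dlt` module."""
    tables: Dict[str, dict] = field(default_factory=dict)
    event_log: List[dict] = field(default_factory=list)

    def table(self, name: str, depends_on=(), expectations=None) -> Callable:
        """Declare a table: its name, its upstream tables, and row-level rules."""
        def register(fn):
            self.tables[name] = {
                "fn": fn,
                "depends_on": list(depends_on),
                "expectations": dict(expectations or {}),
            }
            return fn
        return register

    def run(self) -> Dict[str, List[Row]]:
        """Build every table after its upstream tables and log row counts."""
        results: Dict[str, List[Row]] = {}

        def build(name: str) -> List[Row]:
            if name in results:
                return results[name]
            spec = self.tables[name]
            inputs = [build(dep) for dep in spec["depends_on"]]
            rows = list(spec["fn"](*inputs))
            # Drop any row that fails one of the declared expectations.
            kept = [r for r in rows
                    if all(check(r) for check in spec["expectations"].values())]
            self.event_log.append({
                "table": name,
                "rows_in": len(rows),
                "rows_dropped": len(rows) - len(kept),
            })
            results[name] = kept
            return kept

        for name in self.tables:
            build(name)
        return results

    def lineage(self) -> List[Tuple[str, str]]:
        """Return (upstream, downstream) edges, the raw data behind a lineage diagram."""
        return [(dep, name)
                for name, spec in self.tables.items()
                for dep in spec["depends_on"]]


pipeline = ToyPipeline()


@pipeline.table("raw_orders")
def raw_orders():
    # Hypothetical sample data; one row has an invalid negative amount.
    return [{"id": 1, "amount": 25.0},
            {"id": 2, "amount": -5.0},
            {"id": 3, "amount": 40.0}]


@pipeline.table("clean_orders", depends_on=["raw_orders"],
                expectations={"positive_amount": lambda r: r["amount"] > 0})
def clean_orders(raw):
    return raw


@pipeline.table("order_totals", depends_on=["clean_orders"])
def order_totals(clean):
    return [{"total": sum(r["amount"] for r in clean)}]


results = pipeline.run()
print(results["order_totals"])
print(pipeline.lineage())
print(pipeline.event_log)
```

The point of the toy is that the author only declares what each table is and what it depends on; the run order, the dropped-row counts, and the lineage edges all fall out of those declarations rather than being hand-coded.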
What are key features of Delta Live Tables?
1/Automatic table optimization
2/Enhanced autoscaling for spiky workloads
3/Easily create test pipelines
True or False: DLT is only for streaming
False: It runs in either continuous or triggered mode, so it covers both streaming and batch pipelines
True or False: DLT is low-code data engineering and not for advanced Spark users
False: DLT improves efficiency by keeping pipeline code simple, and it can handle optimization, testing, error handling, monitoring, and documentation
What should you look for when pitching DLT?
1/Pipelines with unpredictable workloads
2/Complex pipelines with many downstream tables
3/Mention of data lineage and/or quality
Competitors?
dbt, and any Structured Streaming solution
Cost
It costs more than alternatives but performs better. Pricing starts at $0.20 per DBU for Core
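As a rough worked example of the pricing arithmetic, the DBU charge is the number of DBUs consumed multiplied by the per-DBU rate. The sketch below uses the $0.20 Core starting rate from this card; the function name `dlt_dbu_cost` and the 150 DBU workload are hypothetical, chosen only to show the calculation.

```python
from decimal import Decimal

# Starting rate for Core, taken from the card above.
CORE_RATE_PER_DBU = Decimal("0.20")


def dlt_dbu_cost(dbus_consumed, rate_per_dbu=CORE_RATE_PER_DBU):
    """Return the DBU charge in dollars: DBUs consumed times the per-DBU rate."""
    return Decimal(str(dbus_consumed)) * rate_per_dbu


# Hypothetical workload: a pipeline that consumes 150 DBUs in a month.
monthly_cost = dlt_dbu_cost(150)
print(monthly_cost)  # 30.00
```

`Decimal` is used instead of floats so that currency amounts do not pick up binary rounding noise.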