UDA Flashcards

1
Q

UDA

A

Unified Data Analytics

2
Q

A UDA approach simply means that

A

an organization is able to collect and process big data, store that data over a long period of time, and have that data at their disposal whenever they need it for a variety of business purposes.

3
Q

What is UDA?

A

Organizations that adopt a UDA approach put infrastructure in place that combines their
data engineering, data science, and business intelligence workflows under one umbrella.
In other words, they implement one system that allows everyone on their data science
teams to work together on data through its entire lifecycle.

4
Q

What is the conceptual idea behind a Unified Data Analytics approach?

A

Combine an organization’s data engineering, data science, and business intelligence workflows.

5
Q

Databricks

A

offers a Unified Data Analytics Platform (UDAP) that enables organizations to bring together their
data science, data engineering, and business analytics workflows.

6
Q

MLflow

A

is an open-source project built by Databricks that helps data scientists manage their
machine learning (ML) workflows, including things like tracking their ML experiments or managing
their ML models.

What MLflow brings to data scientists is a way to centrally manage their ML models, track the
results of ML experiments, monitor and tune models at different stages, and deploy those models
into production. In other words, it makes it much easier to administer the entire machine learning
lifecycle.

With Managed MLflow, data scientists are more productive because they can focus on designing
and improving their models, instead of doing all of this manually (like keeping track of everything in
a spreadsheet).
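
To make "tracking experiments" concrete, here is a minimal sketch of what a tracking tool records for each run, in place of a hand-kept spreadsheet. It is a toy written only with the Python standard library. `ToyTracker` and its methods are hypothetical names, this is not MLflow's API, and the parameter and metric values are made-up placeholders rather than real results.

```python
import uuid


class ToyTracker:
    """Toy stand-in for an experiment tracker (hypothetical, not MLflow)."""

    def __init__(self):
        self.runs = {}  # run_id -> {"params": {...}, "metrics": {...}}

    def start_run(self, params):
        # Each run gets a unique id and a record of the settings used.
        run_id = uuid.uuid4().hex
        self.runs[run_id] = {"params": dict(params), "metrics": {}}
        return run_id

    def log_metric(self, run_id, name, value):
        # Store the latest value of a named metric for this run.
        self.runs[run_id]["metrics"][name] = value

    def best_run(self, metric, higher_is_better=True):
        # Compare all runs that logged the metric and return the best run's id.
        scored = [
            (run["metrics"][metric], run_id)
            for run_id, run in self.runs.items()
            if metric in run["metrics"]
        ]
        pick = max if higher_is_better else min
        return pick(scored)[1]


tracker = ToyTracker()
# Placeholder values for illustration only.
for learning_rate, accuracy in [(0.1, 0.81), (0.01, 0.88)]:
    run_id = tracker.start_run({"learning_rate": learning_rate})
    tracker.log_metric(run_id, "accuracy", accuracy)

best = tracker.best_run("accuracy")
print(tracker.runs[best]["params"])
```

The point of the sketch is that every run's settings and results live in one place and can be compared programmatically, which is the bookkeeping the card says data scientists would otherwise do by hand.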

7
Q

The components that make up the Unified Data Service (UDS) include

A

the Databricks Runtime, Delta Lake, and Databricks Ingest.

8
Q

The Databricks Runtime is

A

the actual processing engine of the UDAP. It’s built on an
optimized version of Apache Spark that has been tuned to run faster than
open-source Apache Spark.

The Databricks Runtime runs on auto-scaling infrastructure.

9
Q

Delta Lake

A

Organizations that use Databricks use a data lake as their long-term data storage solution.
Delta Lake adds intelligence, via a transaction log, to your data lake. This means that in
addition to getting the usual functionality of a data lake for data storage, you get additional
functionality that helps prepare data for machine learning and data science workflows,
plus added performance.
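
One way to picture what a transaction log adds to a data lake is the toy sketch below. It assumes a simplified model in which each commit records the data files added and removed. `ToyTransactionLog` and the file names are hypothetical, and this is not the Delta Lake API or its on-disk format. Replaying the log gives the current table, and replaying only part of it gives an earlier version.

```python
import json


class ToyTransactionLog:
    """Append-only list of JSON commits; commit i is version i of the table."""

    def __init__(self):
        self.commits = []

    def commit(self, add=(), remove=()):
        # A commit is recorded as a whole, so readers never see half a change.
        entry = json.dumps({"add": sorted(add), "remove": sorted(remove)})
        self.commits.append(entry)
        return len(self.commits) - 1  # version number just written

    def files_at(self, version=None):
        """Replay the log up to `version` to find the files that make up the table."""
        if version is None:
            version = len(self.commits) - 1
        live = set()
        for entry in self.commits[: version + 1]:
            action = json.loads(entry)
            live |= set(action["add"])
            live -= set(action["remove"])
        return sorted(live)


log = ToyTransactionLog()
v0 = log.commit(add=["part-000.parquet"])
v1 = log.commit(add=["part-001.parquet"])
# Compact two small files into one in a single commit.
v2 = log.commit(
    add=["part-002.parquet"],
    remove=["part-000.parquet", "part-001.parquet"],
)

print(log.files_at())    # files in the current version
print(log.files_at(v1))  # files as of an earlier version
```

Because the files themselves are never edited in place, the log alone decides which files count as "the table" at any version.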

10
Q

Databricks Ingest

A

gives organizations the ability to bring data together for use in analytics,
from their own data stores as well as from third parties, by incrementally ingesting
real-time data into their Delta Lakes (data lakes built with Delta Lake) from a variety of data sources.
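
"Incrementally ingesting" can be sketched as follows: remember which source files have already been loaded, and on each run load only the new ones. This is a toy illustration under that assumption, not Databricks Ingest itself. `IncrementalIngestor`, the in-memory `source` dictionary, and the file names are all hypothetical.

```python
class IncrementalIngestor:
    """Toy incremental loader that skips source files it has already ingested."""

    def __init__(self):
        self.seen = set()  # checkpoint: names of files already ingested
        self.table = []    # stand-in for the destination table

    def ingest(self, source):
        """Load only unseen files; return the names ingested in this run."""
        new_files = sorted(name for name in source if name not in self.seen)
        for name in new_files:
            self.table.extend(source[name])
            self.seen.add(name)
        return new_files


# The source maps file names to the records they contain.
source = {"orders-01.json": [{"id": 1}, {"id": 2}]}
ingestor = IncrementalIngestor()

first_run = ingestor.ingest(source)   # picks up the only file present
source["orders-02.json"] = [{"id": 3}]
second_run = ingestor.ingest(source)  # picks up just the newly arrived file

print(first_run, second_run, len(ingestor.table))
```

Each run does work proportional to the new data only, and rerunning it with nothing new is a no-op.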

11
Q

The final component of the Unified Data Analytics Platform is

A

the Enterprise Cloud Service.

12
Q

The Enterprise Cloud Service is

A

arguably the most important piece of the Unified Data
Analytics Platform – it’s what allows organizations to set up, secure, manage, and scale
their platform.

13
Q

How does the Enterprise Cloud Service tie back into a Unified Data Analytics approach?

A

It protects all of the work being done with your data - from ingestion to storage to performing analytics and generating real-time dashboards and periodic reports.

14
Q

Databricks Runtime

A

Optimized version of Apache Spark
