Instructor's Method - 6/16/2021 Flashcards

1
Q

Azure Data Factory

A

Ingest - copy data from 90+ different sources

Transform - mapping data flows; code-free workflows

Orchestration - end-to-end workflows

This is a standalone tool

2
Q

Difference between Azure Data Factory and Synapse Pipelines

A

Barely any - they share the same codebase

3
Q

Mapping Data Flows

A

Visually design ETL pipelines

Once data flow is created, run in

4
Q

Spark

A

Open-source data processing engine built around speed, ease of use, and sophisticated analytics

Compute engine designed for distributed data processing at scale

In-memory engine that can be up to 100 times faster than Hadoop MapReduce for some workloads

One of the largest open-source data processing projects

Multi-language support - Scala, Java, SQL, R & Python

5
Q

Spark SQL

A

Batch processing

6
Q

Spark Streaming

A

Stream processing

7
Q

Spark Pool

A

Node size

Number of nodes

Apache Spark version

Different library versions that will be installed

8
Q

Spark Pool

A

Node size

Number of nodes

Apache Spark version

Different library versions that will be installed

Auto-pause (auto-termination time)

9
Q

How to set the header option when reading a file

A

.option("header", "true")

10
Q

Which native Spark function gives summary statistics for analyzing data?

A

describe()

11
Q

What is the display() function used for?

A

Built-in notebook function (provided by Synapse, not native Spark) that renders the data in a DataFrame

12
Q

Difference between show() and display()

A

show() is a native Spark function that prints the data as plain text

display() is provided by Synapse notebooks and renders the data as an HTML table

13
Q

Show data in text format

A

spark.read.parquet("filename").show()

14
Q

Show data in tabular format

A

using display(Dframe, True)
