Instructor's Method - 6/16/2021 Flashcards by Haute Punjaban

Azure Data Factory

Ingest - copy data from 90+ different sources

Transform - mapping data flows. Code-free workflows

Orchestration - end-to-end workflows

This is a standalone tool

Difference between Azure Data Factory and Synapse Pipelines

Barely any - they share the same codebase

Mapping Data Flows

Visually design ETL pipelines

Once data flow is created, run in

Spark

Open-source data processing engine built around speed, ease of use, and sophisticated analytics

Compute engine designed for distributed data processing at scale

In-memory engine that can be up to 100 times faster than Hadoop MapReduce on some workloads

One of the largest open-source data processing projects

Multi-language support - Scala, Java, SQL, R & Python

Spark SQL

Batch processing

Spark Streaming

Stream processing

Spark Pool

Node size

Number of nodes

Apache Spark version

Different library versions that will be installed

Auto-pause (auto-termination time)

How to set header

.option("header", "true")

Which native Spark function gives summary statistics for data?

describe()

What is the display() function used for?

Built-in notebook function (provided by Synapse, not by Spark itself) to show the data in a DataFrame

Difference between show() and display()

show() is a native Spark function that prints the DataFrame as plain text

display() is provided by Synapse and renders the data as HTML

Show data in text format

spark.read.parquet("filename").show()

Show data in tabular format

using display(Dframe, True)
