Syntax Flashcards

1
Q

Search for database

A

show databases like ‘*name*’

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Search for table

A

show tables in attribution like ‘*tablename*’

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Query files under table

A

select distinct input_file_name() from database.tablename

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Query table metadata (2 types)

A

describe table [extended] database.tablename

describe detail database.tablename

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Get table history

A

describe history database.tablename

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Get Point in time for table

A

select * from database.table timestamp as of ‘2023-04-12T13:43:30’

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Search file system

A

dbutils.fs.ls(“/”)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Return X number of records in sql query

A

Select * from database.tablename Limit X

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Use data frame to build temporary view

A

df.createOrReplaceTempView(“database.viewname”

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Move file data into dataframe

A

df = (
spark.read
.format(“csv”)
.option(“delimter”,”,”)
.option(“header”,”true”)
.load(“/mnt/folder”)
)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

List secrets in scope

A

dbutils.secrets.list(“scopename”)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

View all scopes

A

dbutils.secrets.listscopes()

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Filter rows in data frame

A

df = df.filter(
(df.city == “Ki”) & (df.province == “Zu”)
)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Execute notebook with widgets

A

dbutils.notebook.run( path=”/Users/”,
timeout_seconds=12,
{
“mountPoint”:”/mnt/”
}
)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Activate intellisense

A

Tab

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Run cell

A

Ctr + Enter

17
Q

Run cell & create new cell

A

Shift + enter

18
Q

Create parameter

A

dbutils.widgets.text(
name=”mountPoint”,
defaultValue=””
)

19
Q

Use parameter

A

dbutils.widgets.get(
“mountPoint”
)

20
Q

Create Generated Surrogate key on delta table

A

personId as bigint GENERATED ALWAYS AS IDENTITY
(
START WITH 0
AND
INCREMENT BY 1
)

21
Q

3 Ways to create unique key in databricks

A

1) Monotonically_increasing_id()
2) Window function
3) Generated Identity column

22
Q

Set up Database

A

CREATE DATABASE database_name LOCATION (‘/mnt/gold/’)

23
Q

Create Parquet File

A

df.write.format(“parquet”).mode(“append”).save(“/mnt/table_name/)

24
Q

Delete table data from file storage

A

Dbutils.fs.rm(“/mnt/db/tablename”,True)

25
Q

Create Parquet file and hive table

A

Df.write.format(“parquet”).option(“path”,”/mnt/database/table_name”).saveAsTable(“database.filename”)

26
Q

sql function to load every file only once into table

A

Copy into database.table
From “/mnt/database/tablename/”
Fileformat = csv

27
Q

Explain checkpointing

A

Changes 1-9 generate a separate json file in delta log. When Json file 10 is written a checkpoint parquet file is also written

28
Q

Create Data Frame from List with metadata and data

A

df = createDataFrame(<list>).Collect()</list>

29
Q

Loop through data frame

A

for row in df:
print(row[“<column_name>"])</column_name>