Week 2: Data Extraction Flashcards

1
Q

REVERSED

  • Physical level: describes how a record is stored
  • Logical level: describes what data stored in the database and the relationships among the data e.g. name: string, salary: real
  • View level: application programs which hide details of data types. Views can also hide information for security/privacy purposes
A

What are the 3 database levels of abstraction?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

REVERSED

-structures: e.g. rows and columns, nodes and edges, key-value pairs
-constraints: e.g. all rows must have same number of columns, age must be positive, c
* operations: e.g. return the values of record x, find rows where the column “name” is “Jordan”

A

What are the 3 main components of data models?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

REVERSED

  • Schema: specifies name of relation, plus name and type of each column
  • Instance: a table with rows and columns. rows = records = tuples, column = attributes = fields = features (instance is a snapshot of the schema at a point in time)
A

What are the 2 parts that a relation is made up of?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

REVERSED

The number of columns

A

What is the degree of a relation?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

REVERSED

The number of columns

A

What is the arity of a relation?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

REVERSED

The number of rows

A

What is the cardinality of a relation?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

REVERSED

attribute values are normally required to be atomic: indivisible (not containing more than one value)

A

What is the requirement of attribute values in a relation?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

REVERSED

K is a superkey of R if values for K are sufficient to identify a unique tuple of each possible relation

A

What is a superkey of R?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

REVERSED

Superkey K is a candidate key if K is minimal

A

What is a candidate key of R?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

REVERSED

A selected candidate key used to represent the relation

A

What is the primary key of R?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

REVERSED

column or set of columns in table whose values correspond to the primary key in another table

A

What is a foreign key of R?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

REVERSED

  • used for defining the database schema
  • DDL is used to create and modify database objects such as tables, indexes and users
A

What is data definition language?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

REVERSED

Language for accessing and manipulating the data in a model
DML is used for adding, deleting and modifying the records in a database

A

What is data manipulation language?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

REVERSED

a statement requesting the retrieval of information

A

What is a query?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

REVERSED

  • and (^): retrieves records that satisfy both conditions
  • or (v): retrieves records that satisfy one of the conditions
  • not (~): retrieves records that don’t satisfy the condition
A

What are the three boolean operators?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

REVERSED

A^F = F 
AvF = A 
A^T = A 
AvT = T
A

What are the 4 boolean laws including true and false

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

REVERSED

A^A = A 
AvA = A
A

What is A^A and AvA?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

REVERSED

Av ~A = T
A^ ~A = F

A

What are the complement laws?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

REVERSED

~(~A) = A

A

What is the double negation law?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

REVERSED

~(A^B) = ~A v ~B 
~(AvB) = ~A ^ ~B
A

What are de morgans laws?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

REVERSED

A ^ (BvC) = (A^B) v (A^C) 
A v (B^C) = (AvB) ^ (AvC)
A

What are the distributive laws?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

REVERSED

A v (A^B) = A 
A ^ (AvB) = A
A

What are the absorptive laws?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

REVERSED

  • the output of a relational algebra (RA) is a relation that can be input for another RA operator
  • don’t need heaps of functions for each line of code
A

What are the advantages of composing relational algebra operators?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

REVERSED

Union (U): total set of records in both relations, only records each record once even if it is in both
intersection: common records
difference (-): records in one table but not the other
selection (sigma): a specific set of rows
projection (pi): a specific set of columns. Duplicate tuples are removed
join: the natural join of two tables

A

What are the 6 main relational algebra operators?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
# REVERSED rho (p)
What is the relational algebra symbol for renaming?
26
# REVERSED G or gamma
What is the relational algebra symbol for grouping and aggregation?
27
# REVERSED Gets all the tuples from relation R and pair them with all the tuples from relation S, end up with huge table
What does a cartesian product do?
28
# REVERSED set of all combinations of tuples in R and S that are equal on their common attribute names. and includes all other attributes from both relations filled in
What does a natural join do?
29
# REVERSED Pi(A1, sum(A3)) (A1, A2 G sum(A3) (sigma P (r1, r2)))
What is the form of relational algebra statement for the general sql query: select A1, sum (A3) from r1, r2 where P group by A1, A2
30
# REVERSED Can do the union, intersection or set difference (-) of full relational algebra statements - but not include them within the statements
How do you use union, intersection and set difference in relational algebra?
31
# REVERSED - char(n) or character(n): character string with fixed length n - varchar(n): variable length character string - int or integer: integer of any length - numeric(p,d): fixed point number with p digits and d of the digits are to the right of the decimal - float(n): floating point number with precision at least n digits
What are the 5 basic types in SQL?
32
# REVERSED SELECT desired attributes FROM one or more tables WHERE condition about tuples of the tables
What is the basic form of a select from where statement?
33
# REVERSED SELECT DISTINCT
How do you force the elimination of duplicates in an SQL query?
34
# REVERSED SELECT \*
How do you select all attributes in an SQL query?
35
# REVERSED results in a relation with one column and one row with value ‘542’
What is the result of SELECT ‘542’
36
# REVERSED returns a relation with one column and N rows(the number of tuples in instructor) where each row will contain the value ‘A’
What is the result of SELECT ‘A’ FROM instructor
37
# REVERSED using "as" e.g. SELECT ID, name, salary/12.0 as monthly\_salary FROM instructor or simply write salary/12.0 monthly\_salary do not need the "as"
How do you rename an attribute in SQL? (2 ways)
38
# REVERSED SELECT ID, name FROM student WHERE dept\_name = ‘Math’ AND tot\_cred \>= 24 can also use OR
What do you use in the where clause to meet more than one criteria?
39
# REVERSED generates every possible instructor - teaches pair, with all attributes from both relations/ common attributes in the resulting table are renamed using the relation name cartesian product
What is the result of SELECT \* FROM instructor, teaches
40
# REVERSED =, \<\> (not equal), , \<=, \>=
What are the comparison signs in SQL?
41
# REVERSED Self join is where a table joins with itself on two columns with different names but storing data from the same domain. distinguish copies by renaming the relations
What is a self join?
42
# REVERSED SELECT T.name AS N1, S.name AS N2 FROM instructor AS T, instructor AS S WHERE T.salary = S.salary AND T.name \< S.name
What is an SQL query for getting the name of instructors who have the same salary, avoids pairs like (miller, miller) and produces pairs in alphabetical order. From the relation instructor
43
# REVERSED All attributes from left table are listed. Matching attributes from the right table are added where they are available, null when not available SELECT \* FROM course LEFT OUTER JOIN prereq ON course.course\_id = prereq.course\_id
What is a left outer join and its code in SQL?
44
# REVERSED All attributes from the right table are listed. Matching attributes from the left table are added where they are available, null when not available SELECT \* FROM course RIGHT OUTER JOIN prereq ON course.course\_id = prereq.course\_id
What is a right outer join and its code in SQL?
45
# REVERSED takes all attributes from left and right tables, fills in all matching attributes and puts null where not available SELECT \* FROM course FULL OUTER JOIN prereq ON course.course\_id = prereq.course\_id
What is a full outer join and its code in SQL?
46
# REVERSED takes only the attributes that are common in both tables Inner join keeps both the columns in the output even though they will be the same, while natural join will only keep one common column
What is a natural join and an inner join?
47
# REVERSED SELECT \* FROM course JOIN prereq ON course.course\_id = prereq.course\_id INNER JOIN is also the same as JOIN
How do you do an inner join in SQL?
48
# REVERSED SELECT \* FROM course NATURAL JOIN prereq ON course.course\_id = prereq.course\_id
How do you do a natural join in SQL?
49
# REVERSED WHERE name LIKE ‘%Van der%’
How do you match a string in the where clause?
50
# REVERSED ‘Intro%’ #matches any string beginning with “Intro” ‘%Comp%’ #matches any string containing “Comp” as a substring
How do you match an exact substring and the beginning of a string?
51
# REVERSED ‘\_\_\_’ #matches any string of exactly three characters ‘\_\_\_%’ #matches any string of at least three characters
How do you match a string of 3 characters and a string of at least 3 characters?
52
# REVERSED WHERE ... NOT LIKE ...
How do you find mismatches of strings?
53
# REVERSED LIKE ‘100\%’ ESCAPE ‘\'
How would you match the string "100%"?
54
# REVERSED WHERE (instructor.ID, dept\_name) = (teaches.ID, ‘Biology’)
What is a tuple comparison?
55
# REVERSED SELECT name FROM instructor WHERE salary BETWEEN 90000 and 100000
How can you find the names of all instructors with salary between $90,000 and $100,000 (this includes the values equal to)
56
# REVERSED (SELECT course\_id FROM section WHERE sem = ‘Fall’ AND year = 2009) EXCEPT (SELECT course\_id FROM section WHERE sem = ‘Spring’ AND year = 2010)
How do you find the difference of two SQL queries?
57
# REVERSED WHERE salary IS NULL
How do you check for null values using SQL?
58
# REVERSED Unknown
What does 5\> unknown return?
59
# REVERSED - unknown OR true = true - unknown OR false = unknown - unknown OR unknown = unknown
What are the unknowns combined with the or operator?
60
# REVERSED - true AND unknown = unknown - false AND unknown = false - unknown AND unknown = unknown
What are the unknowns combined with the and operator?
61
# REVERSED treated as false if it evaluates to unknown
What happens to unknown values in the where clause?
62
# REVERSED WHERE dept\_name IN (‘Comp sci’, ‘Biology’)
What is another way to write Where dept\_name = 'Comp Sci' or dept\_name = 'Biology'
63
# REVERSED SELECT \* FROM instructor WHERE dept\_name = ‘Physics’ ORDER BY name -puts results in ascending order
How do you order the results of an SQL query? What order does it put them in?
64
# REVERSED - avg: average value - min: minimum value - max: maximum value - sum: sum of values - count: number of values
What aggregate functions can be used in SQL? (5)
65
# REVERSED SELECT dept\_name, AVG(salary) AS avg\_salary FROM instructor GROUP BY dept\_name
How do you find the average salary of instructors in each department
66
# REVERSED SELECT dept\_name, AVG(salary) AS avg\_salary FROM instructor GROUP BY dept\_name HAVING avg\_salary \> 80000
How do you find the average salary of instructors in each department which has average salary greater than 80000
67
# REVERSED Having must be used after a GROUP BY clasue
When do you use WHERE vs HAVING?
68
# REVERSED A subquery is a SELECT-FROM-WHERE expression that is nested within another query
What is a subquery in SQL?
69
# REVERSED SELECT A1, A2, …, An FROM r1, r2, …, rn WHERE P as follows: - Ai can be replaced by a sub query that generates a single value - ri can be replaced by any valid subquery - P can be replaced with an expression of the form: B (subquery)
What type of subqueries can be nested in each of the select, from and where clauses?
70
# REVERSED SELECT dept\_name, (SELECT COUNT(\*) FROM instructor WHERE department.dept\_name = instructor.dept\_name) AS num\_instructors FROM department
Code to list all departments along with the number of instructors in each department in SQL? where there is an instructor relation and department relation
71
# REVERSED ``` data = {‘State’: [‘Ohio, ‘Ohio’, ‘Ohio’, ‘Nevada’, ‘Nevada’, ’Nevada'], ‘Year’: [2000, 2001, 2002, 2001, 2002, 2003], ‘Population’: [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]} df = pd.DataFrame(data) ``` ``` data = [[‘Ohio’, 2000, 1.5], [‘Ohio’, 2001, 1.7], [‘Ohio’, 2002, 3.6], [Nevada, 2001, 2.4], [‘Nevada, 2002, 2.9], [‘Nevada’, 2003, 3.2]] cols = [‘State’, ‘Year’, ‘Population’] df = pd.DataFrame(data, columns = cols) ```
What are the 2 ways to create dataframes in python pandas?
72
# REVERSED df. head(n) #get first n rows df. tail(n) #get last n rows
How do you get the first and last rows of a dataframe in python?
73
# REVERSED df.info()
How do you get the index and data types of a dataframe in python?
74
# REVERSED df.describe()
How do you get the summary statistics of dataframe columns in python?
75
# REVERSED df.sum()
How do you get column sums in python?
76
# REVERSED df.count()
How do you get the count for each column in python?
77
# REVERSED df[‘column’].isnull()
How do you get a true/false vector for whether the value is null in a column in python?
78
# REVERSED df. iloc[5] df. iloc[5, :] df. loc[‘rowname’]
How can you extract row 5 in python by index (2) and rowname (1)?
79
# REVERSED df.iloc[0:5,1]
How do you extract the first 5 rows and column 2 in python?
80
# REVERSED df.columns
How do you get a vector containing the column names in python?
81
# REVERSED df[[‘actor’, ‘facebook\_likes’, ‘content\_rating’]]
How do you extract multiple columns by name in python?
82
# REVERSED df.director\_name df[“director\_name”]
What are 2 ways to extract a single column by name in python?
83
# REVERSED df[df.columns[[5,7,9]]][:] #extracts columns 5,7,9
How can you extract multiple columns by index in python?
84
# REVERSED df.loc[df[‘content\_rating’]==‘PG\_13’, [‘facebook\_likes’, ‘budget’]] #or df.iloc[(df[‘content\_rating]==‘PG\_13’).values,[3,5]] #or df[df.content\_rating == ‘PG\_13’][[‘facebook\_likes’, ‘budget’]]
What are 3 ways to extract the rows with the condition that content rating = PG and the columns with names facebook likes and budget or index 3 and 5?
85
# REVERSED df[‘newcolumn’] = …
How do you add a new column to a dataframe in python?
86
# REVERSED del df[‘column1’] df.drop(‘column1’, axis=1)
How do you delete a column in python? (2 ways)
87
# REVERSED df.T
How do you transpose a dataframe in python?
88
# REVERSED df.drop([‘row1’])
How do you delete a row in python?
89
# REVERSED print(len(df.columns))
How do you display the number of columns in python?
90
# REVERSED print(len(df))
How do you display the number of rows in python?
91
# REVERSED df.columns[1].min()
How do you get the minimum of column 1 in python?
92
# REVERSED pd.concat([df1, df2])
How do you concatenate dataframes row-wise in python?
93
# REVERSED pd.concat([df1, df2], axis=1)
How do you concatenate dataframes column-wise in python?
94
# REVERSED pd.merge(df1, df2, on=‘attribute’, how = 'outer')
How can you do a full outer join in python?
95
# REVERSED pd.merge(df1, df2, on=‘attribute’, how = 'left')
How do you do a left outer join in python?
96
# REVERSED list.unique()
How do you get the unique values in a list?
97
# REVERSED list.append(c)
How do you add values to the end of a list?
98
# REVERSED df.groupby([‘column']).count()
How do you group by a particular attribute and count how many in each group in python?
99
# REVERSED Where columns match, it will add the data in that column from each df, e.g. "BIOBIO" because strings will concatenate. Where columns don't match, will include all columns from both dataframes but entries will all be NaN
What is the result in python of adding two dataframes df1+df2