BIM 6 Flashcards

1
Q

What is data?

A

Data are raw facts and figures that have no meaning on their own.

2
Q

What is information?

A

Information is a collection of data organized so that it has meaning.

3
Q

What is database technology?

A

A collection of related data, organized to make it valuable and useful.

4
Q

What is the relational database model?

A

A model that uses a collection of tables to represent data and the relationships among the data.

5
Q

What are columns/attributes/fields?

A

These are the columns of the table. They represent the different categories of data.

6
Q

What are records/entities/instances/items?

A

These are the rows of the table; they hold the actual data.

7
Q

What is a primary key?

A

A primary key is a column (or set of columns) whose value uniquely identifies each row of a table, so that no two rows are the same.

8
Q

What is a file?

A

Files are the tables in the database; they are what the data is actually stored in.

9
Q

What is a database?

A

A database is the collection of files/tables that contain the information.

10
Q

What is a data entity?

A

The things we store information about, such as people, places, or objects. Each instance of an entity is a record/row of a table.

11
Q

What does SQL stand for?

A

SQL stands for Structured Query Language.

12
Q

What does query mean?

A

A query is a request for data/information from the database.

13
Q

What are the 3 languages/categories of SQL commands?

A
  1. Data Definition Language (DDL) - deals with the structure of the table/file. It creates, deletes, and alters tables.
  2. Data Manipulation Language (DML) - deals with the content. It allows editing, updating, inserting, etc.
  3. Data Control Language (DCL) - deals with access to the database.
14
Q

What is the DDL command for creating tables?

A

CREATE TABLE tablename (columnname1 TEXT, columnname2 INTEGER)
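
A minimal sketch of the CREATE TABLE pattern above, run through Python's built-in sqlite3 module against a throwaway in-memory database. The table name `students` and its columns are made up for illustration.

```python
import sqlite3

# In-memory database; "students", "name" and "age" are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, age INTEGER)")

# PRAGMA table_info returns one row per column: (cid, name, type, ...).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(students)")]
print(columns)
```

Each column definition is a name followed by a data type, and definitions are separated by commas.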

15
Q

What are the DDL data types?

A
  1. INTEGER - whole numbers
  2. TEXT - text
  3. REAL - allows decimals
  4. BLOB - binary large objects
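
A small sketch showing the four types in action. It assumes SQLite (via Python's sqlite3 module), whose `typeof()` function reports how a value is stored; the table `samples` and its one-letter column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One column per data type from the card; names are illustrative only.
conn.execute("CREATE TABLE samples (i INTEGER, t TEXT, r REAL, b BLOB)")
conn.execute(
    "INSERT INTO samples VALUES (?, ?, ?, ?)",
    (42, "hello", 3.14, b"\x00\x01"),
)

# typeof() returns the storage class SQLite used for each value.
types = conn.execute(
    "SELECT typeof(i), typeof(t), typeof(r), typeof(b) FROM samples"
).fetchone()
print(types)
```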
16
Q

Renaming columns

A

ALTER TABLE tablename RENAME COLUMN columnname TO newcolumnname

17
Q

Adding columns

A

ALTER TABLE tablename ADD COLUMN columnname datatype
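
A hedged sketch of both ALTER TABLE forms (renaming a column and adding one), using Python's sqlite3 module. The names `students`, `full_name` and `email` are invented for the example, and RENAME COLUMN assumes a reasonably recent SQLite (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, age INTEGER)")

# Rename an existing column (needs SQLite 3.25+).
conn.execute("ALTER TABLE students RENAME COLUMN name TO full_name")
# Add a new column at the end of the table.
conn.execute("ALTER TABLE students ADD COLUMN email TEXT")

# Column names after both changes, in table order.
names = [row[1] for row in conn.execute("PRAGMA table_info(students)")]
print(names)
```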

18
Q

Using DML to manipulate table rows - add values into rows

A

INSERT INTO tablename VALUES (C1R1, …), (C1R2, …)

INSERT INTO tablename (column1, column2) VALUES (valueC1, …)

INSERT INTO [Phones] (PersonID, [Phone Number]) SELECT PersonID, Phone
FROM Person;

INSERT INTO table2
SELECT * FROM table1
WHERE condition;
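
The INSERT forms above can be sketched as follows with Python's sqlite3 module. The `person` and `phones` tables and all sample values are hypothetical; they loosely mirror the Person/Phones example on this card.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER, name TEXT, phone TEXT)")
conn.execute("CREATE TABLE phones (person_id INTEGER, phone_number TEXT)")

# Form 1: several rows at once, values given in column order.
conn.execute(
    "INSERT INTO person VALUES (1, 'Ana', '555-0100'), (2, 'Ben', '555-0101')"
)
# Form 2: name the columns; columns left out (phone) become NULL.
conn.execute("INSERT INTO person (person_id, name) VALUES (3, 'Cy')")
# Form 3: fill one table from a SELECT on another, with a WHERE condition.
conn.execute(
    "INSERT INTO phones (person_id, phone_number) "
    "SELECT person_id, phone FROM person WHERE phone IS NOT NULL"
)

phones = conn.execute(
    "SELECT person_id, phone_number FROM phones ORDER BY person_id"
).fetchall()
print(phones)
```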

19
Q

Update values of rows

A

UPDATE tablename SET columnnameX=newvalue WHERE columnnameY=conditionvalue
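
A short sketch of the UPDATE pattern, assuming a made-up `students` table. Only the rows matching the WHERE condition change; without WHERE, every row would be updated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO students VALUES (?, ?)", [("Ana", 20), ("Ben", 22)])

# Set age (column X) where name (column Y) matches the condition value.
cur = conn.execute("UPDATE students SET age = ? WHERE name = ?", (21, "Ana"))
changed = cur.rowcount  # number of rows the UPDATE touched

rows = conn.execute("SELECT name, age FROM students ORDER BY name").fetchall()
print(changed, rows)
```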

20
Q

DML - reading rows

A

SELECT * FROM tablename WHERE columnname=conditionvalue

21
Q

How do you sort the rows by a column?

A

SELECT * FROM tablename WHERE columnname=conditionvalue ORDER BY columnname ASC/DESC

22
Q

What do LIMIT and OFFSET do?

A

LIMIT restricts the result to at most X rows, and OFFSET skips the first Y rows.
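
A sketch combining WHERE, ORDER BY, LIMIT and OFFSET on a hypothetical `scores` table, using Python's sqlite3 module. The data is invented so the effect of each clause is easy to follow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (player TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Ana", 30), ("Ben", 50), ("Cy", 10), ("Di", 40)],
)

# Filter with WHERE, then sort the remaining rows from highest to lowest.
ranked = conn.execute(
    "SELECT player FROM scores WHERE points > 15 ORDER BY points DESC"
).fetchall()

# Skip the first sorted row (OFFSET 1), then return at most two rows (LIMIT 2).
page = conn.execute(
    "SELECT player FROM scores ORDER BY points DESC LIMIT 2 OFFSET 1"
).fetchall()
print(ranked, page)
```

LIMIT and OFFSET are usually paired with ORDER BY, since without a sort order the rows that get skipped or kept are not guaranteed.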

23
Q

How do you join tables?

A

SELECT column FROM tablename JOIN table2 ON tablename.column=table2.column
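
A minimal JOIN sketch with two invented tables, `customers` and `orders`, linked on `customer_id`. It uses Python's sqlite3 module; all names and rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT)"
)
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, "pen"), (11, 1, "ink"), (12, 2, "pad")],
)

# The ON clause matches each order to the customer with the same customer_id.
joined = conn.execute(
    "SELECT customers.name, orders.item "
    "FROM customers JOIN orders ON customers.customer_id = orders.customer_id "
    "ORDER BY orders.order_id"
).fetchall()
print(joined)
```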

24
Q

What is normalization?

A

The process of streamlining data to minimize redundancy and increase flexibility.
It is a technique to manage the trade-off between reducing data redundancy and ease of use.

25
Q

What is data redundancy?

A

When the same piece of information is stored multiple times.

26
Q

What is first normal form (1NF)?

A

There are 2 rules of 1NF:
1. Values are atomic: a value in a cell cannot be divided any further.
2. Each cell contains only 1 value of its attribute.
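
A tiny sketch of what moving to 1NF looks like in practice, in plain Python. The names and phone numbers are made up; the point is that a cell holding two phone numbers is split so each row carries exactly one.

```python
# Not in 1NF: the second field packs several phone numbers into one cell.
raw = [("Ana", "555-0100, 555-0101"), ("Ben", "555-0102")]

# 1NF: one atomic value per cell, so each phone number gets its own row.
atomic = [
    (name, phone.strip())
    for name, phones in raw
    for phone in phones.split(",")
]
print(atomic)
```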

27
Q

What is 2NF?

A

2NF is about reducing redundancy. The 2 rules are:
1. It follows 1NF.
2. It does not have any partial dependencies in the case of composite keys. A partial dependency occurs when a non-key column depends on only part of the composite (primary) key instead of on the whole key. For example, with a composite key of tournament and year, place depends only on tournament and not on year. This is a partial dependency.
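
A hedged sketch of removing the partial dependency from the tournament example, using Python's sqlite3 module. It assumes "place" means the location of the tournament; the table names, the `winner` column and all rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Flat table with composite key (tournament, year); place repeats per year.
conn.execute(
    "CREATE TABLE results_flat (tournament TEXT, year INTEGER, winner TEXT, "
    "place TEXT, PRIMARY KEY (tournament, year))"
)
conn.executemany(
    "INSERT INTO results_flat VALUES (?, ?, ?, ?)",
    [("Open", 2020, "Ana", "Paris"), ("Open", 2021, "Ben", "Paris"),
     ("Cup", 2020, "Cy", "Rome")],
)

# 2NF split: place moves to a table keyed by tournament alone.
conn.execute("CREATE TABLE tournaments (tournament TEXT PRIMARY KEY, place TEXT)")
conn.execute(
    "CREATE TABLE results (tournament TEXT, year INTEGER, winner TEXT, "
    "PRIMARY KEY (tournament, year))"
)
conn.execute("INSERT INTO tournaments SELECT DISTINCT tournament, place FROM results_flat")
conn.execute("INSERT INTO results SELECT tournament, year, winner FROM results_flat")

place_rows = conn.execute(
    "SELECT tournament, place FROM tournaments ORDER BY tournament"
).fetchall()
print(place_rows)
```

After the split, each tournament's place is stored once instead of once per year.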

28
Q

What are composite keys?

A

A composite key is a primary key made up of multiple columns. It is needed when no single column can uniquely identify each row, so the columns together act as the primary key.

29
Q

What is 3NF?

A

The rules of 3NF are:
1. It follows 2NF.
2. There are no transitive dependencies. A transitive dependency occurs when a column depends on another column that isn’t part of the primary key. Such a column relates to the primary key only indirectly, through a column that itself depends on the primary key.
So rule 2 is satisfied when every non-key column depends directly on the primary key.
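
A sketch of removing a transitive dependency, using Python's sqlite3 module. The employee/department tables are an invented example: `dept_name` depends on `dept_id`, which in turn depends on the primary key `emp_id`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# dept_name depends on dept_id, not directly on emp_id (transitive dependency).
conn.execute(
    "CREATE TABLE employees_flat (emp_id INTEGER PRIMARY KEY, name TEXT, "
    "dept_id INTEGER, dept_name TEXT)"
)
conn.executemany(
    "INSERT INTO employees_flat VALUES (?, ?, ?, ?)",
    [(1, "Ana", 10, "Sales"), (2, "Ben", 10, "Sales"), (3, "Cy", 20, "IT")],
)

# 3NF split: dept_name moves to its own table keyed by dept_id.
conn.execute("CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute("CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER)")
conn.execute("INSERT INTO departments SELECT DISTINCT dept_id, dept_name FROM employees_flat")
conn.execute("INSERT INTO employees SELECT emp_id, name, dept_id FROM employees_flat")

departments = conn.execute(
    "SELECT dept_id, dept_name FROM departments ORDER BY dept_id"
).fetchall()
print(departments)
```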

30
Q

What are entity-relationship diagrams?

A

ERDs describe the relationships between things of interest.

31
Q

What 4 concepts does ERD rely on?

A
  1. Entities and entity sets
    Entities are the rows/records, while entity sets are the entire table. Entities describe the things of interest that we store data on.
  2. Attributes
    The columns, i.e. the categories of data.
  3. Relationships and relationship sets
    Show how entities are linked.
  4. Cardinalities
    Cardinalities (relationship degrees) represent the number of entities in one table that can be associated with entities in another table.
32
Q

What are the 3 types of cardinalities?

A
  1. One to one 1:1
    One entity in set A is associated with one entity in set B.
  2. One to many 1:M
    One entity in set A can be associated with many entities in set B.
  3. Many to many M:N
    Many entities in set A can be associated with many entities in set B.
33
Q

What are the 3 models in making an ERD?

A
  1. Conceptual model
    It is not detailed and just gives the files/tables that are the focus.
    It is about identifying what we want to store.
  2. Logical model
    Attributes and keys are attached. However, the problem with M:N relationships is not fixed yet.
  3. Physical model
    This is where the final ERD is created: the database structure can be implemented and M:N relationships are fixed.
34
Q

What are the different ERD lines/conditions?

A

look at doc

35
Q

How are many to many relationships solved?

A

They are solved by using an associative entity table. This replaces the M:N relationship with two 1:M relationships, and the new table holds the primary keys of both original tables as foreign keys.
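
A sketch of an associative table resolving a many to many relationship, using Python's sqlite3 module. The `students`, `courses` and `enrollments` tables are a made-up example: a student can take many courses and a course can have many students.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE courses (course_id INTEGER PRIMARY KEY, title TEXT)")
# Associative table: one row per (student, course) pair, both as foreign keys.
conn.execute(
    "CREATE TABLE enrollments ("
    "student_id INTEGER REFERENCES students(student_id), "
    "course_id INTEGER REFERENCES courses(course_id), "
    "PRIMARY KEY (student_id, course_id))"
)
conn.executemany("INSERT INTO students VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany("INSERT INTO courses VALUES (?, ?)", [(100, "SQL"), (200, "ERD")])
conn.executemany("INSERT INTO enrollments VALUES (?, ?)", [(1, 100), (1, 200), (2, 100)])

# Walk both 1:M links through the associative table.
pairs = conn.execute(
    "SELECT students.name, courses.title FROM enrollments "
    "JOIN students ON enrollments.student_id = students.student_id "
    "JOIN courses ON enrollments.course_id = courses.course_id "
    "ORDER BY students.name, courses.title"
).fetchall()
print(pairs)
```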

36
Q

What are bits and bytes?

A

A bit is the smallest unit of data that a computer can handle: a 0 or a 1.
A byte is a group of 8 bits.
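
A quick illustration in Python of one byte as 8 bits. The value 65 is an arbitrary choice; it happens to be the ASCII code for the letter A.

```python
value = 0b01000001            # the number 65 written as 8 individual bits
bits = format(value, "08b")   # back to a string of 0s and 1s, padded to 8 digits
as_byte = bytes([value])      # the same value stored as a single byte
combinations = 2 ** 8         # distinct values that 8 bits can represent

print(bits, as_byte, combinations)
```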

37
Q

What is a primary and foreign key?

A

A primary key is the unique identifier for each row.
A foreign key is a column that refers to the primary key of another table. It is mostly used as a look-up field to find data or link tables.
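
A hedged sketch of both kinds of key being enforced, using Python's sqlite3 module. The `customers` and `orders` tables and the helper `rejected` are hypothetical; note that SQLite only checks foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customers(customer_id))"
)
conn.execute("INSERT INTO customers VALUES (1, 'Ana')")


def rejected(sql):
    """Hypothetical helper: True if the database refuses the statement."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


duplicate_pk = rejected("INSERT INTO customers VALUES (1, 'Ben')")  # id 1 already used
dangling_fk = rejected("INSERT INTO orders VALUES (10, 99)")        # no customer 99
valid_fk = rejected("INSERT INTO orders VALUES (11, 1)")            # customer 1 exists
print(duplicate_pk, dangling_fk, valid_fk)
```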

38
Q

What do select and join do?

A

Select creates a subset consisting of the records in the file that meet the criteria.
Join combines tables.

39
Q

What is a blockchain?

A

A distributed database technology that enables firms to create transactions on a network without a central authority. It stores transactions as a distributed ledger among computers. It maintains a continuously growing list of records called blocks. Each block contains a timestamp and a link to the previous block.

40
Q

What are the 3Vs of big data?

A
  1. Velocity - the speed of processing
  2. Variety - a wide variety of data
  3. Volume - an extreme volume of data
41
Q

What is a data warehouse?

A

A data warehouse is a database that stores data of potential interest. The data is available to anyone, but it cannot be changed.

42
Q

What are data marts?

A

Data marts are subsets of a data warehouse. Each one summarizes a portion of the data in a separate database for a specific population.

43
Q

What are data lakes? (Hadoop)

A

Repositories for raw, unstructured data that hasn't been analyzed yet.

44
Q

What is in-memory computing?

A

It relies on the computer’s main memory (RAM) for data storage, which eliminates the bottlenecks of retrieving data from disk.

45
Q

What are some analytical tools?

A
  1. Online analytical processing (OLAP)
    Enables multidimensional data analysis
  2. Data mining
    Provides insight from data by finding hidden patterns and relationships in large databases
  3. Text mining
    Tools that help businesses analyze big text data
  4. Sentiment analysis software
    Can mine text comments in emails, blogs, etc. to determine opinions
  5. Web mining
    The analysis of patterns from the web
46
Q

What are the types of data mining?

A
  1. Associations
    Occurrences linked to a single event
  2. Sequences
    Events linked over time
  3. Classifications
    Recognizes patterns that describe the group to which an item belongs by examining existing items that have been classified
  4. Clusters
    Similar to classifications, but no groups have been defined yet
  5. Forecasts
    Uses a series of existing values to forecast what other values will be
47
Q

What is data governance?

A

It encompasses the policies through which data can be managed as an organizational resource.

48
Q

What is a data quality audit?

A

A structured survey of the accuracy and level of completeness of data.

49
Q

What is data cleansing?

A

The process of deleting and correcting data in a database that is wrong.