Exam 1 Flashcards
To cover Data Fundamentals
What are the steps to make data useful?
Data Acquisition, Data Modeling, Extraction
Name and Describe Data Provisioning
Data provisioning is the process of giving users and systems access to data. This includes the security authorizations that limit access to only the data a user or system is officially permitted to view.
Replication
Data is copied from the source and transferred to the analysis system, which keeps the source data intact. Replication is done in real time or in batches.
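Example (batch replication): a minimal sketch, assuming two in-memory SQLite databases stand in for the source and the analysis system. The `sales` table, its columns, and `BATCH_SIZE` are hypothetical names chosen for illustration.

```python
import sqlite3

# Source system with a few rows (hypothetical "sales" table).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
source.executemany("INSERT INTO sales VALUES (?, ?)",
                   [(i, i * 10.0) for i in range(1, 6)])

# Analysis system with the same table layout, initially empty.
analysis = sqlite3.connect(":memory:")
analysis.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")

BATCH_SIZE = 2  # assumed batch size for the sketch
cursor = source.execute("SELECT id, amount FROM sales ORDER BY id")
batches_copied = 0
while True:
    batch = cursor.fetchmany(BATCH_SIZE)  # read the next batch from the source
    if not batch:
        break
    analysis.executemany("INSERT INTO sales VALUES (?, ?)", batch)
    batches_copied += 1
analysis.commit()

# The source is only read, never modified, so it stays intact.
source_count = source.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
analysis_count = analysis.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
```

The copy only issues SELECT statements against the source, which is the sense in which replication keeps the original data intact.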
Structured Data
Structured data is computer readable and usable (e.g., databases, spreadsheets, flat files) and uses specific data types. Metadata is data about the data (meaning, context, purpose).
Create, Read, Update and Delete (CRUD)
Tables: Columns and Rows
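Example (CRUD on a table): a minimal sketch using Python's built-in `sqlite3` module. The `students` table and its columns are hypothetical, chosen only to show one statement per CRUD operation.

```python
import sqlite3

# In-memory database; the table has columns (id, name, major) and one row per student.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, major TEXT)")

# Create: add a new row.
conn.execute("INSERT INTO students (id, name, major) VALUES (?, ?, ?)",
             (1, "Ada", "Math"))

# Read: fetch the row back.
row_after_create = conn.execute(
    "SELECT id, name, major FROM students WHERE id = ?", (1,)).fetchone()

# Update: change one column of an existing row.
conn.execute("UPDATE students SET major = ? WHERE id = ?", ("Data Science", 1))
row_after_update = conn.execute(
    "SELECT id, name, major FROM students WHERE id = ?", (1,)).fetchone()

# Delete: remove the row.
conn.execute("DELETE FROM students WHERE id = ?", (1,))
remaining_rows = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
```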
Unstructured Data
Unstructured data does not conform to a data model and often lacks associated metadata. Examples: pictures, audio, video, tweets, reviews.
Relational Databases
Relationships between tables
Each row has a unique id called a primary key.
Linked tables are connected by a foreign key: a column in one table that references the primary key of another table.
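Example (primary and foreign keys): a minimal sketch with `sqlite3`. The `customers` and `orders` tables are hypothetical. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# customer_id is the primary key of customers ...
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
# ... and appears in orders as a foreign key.
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        item        TEXT
    )
""")
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (100, 1, 'Laptop')")

# The relationship lets the two tables be joined on the shared key.
joined = conn.execute("""
    SELECT o.order_id, c.name, o.item
    FROM orders AS o
    JOIN customers AS c ON o.customer_id = c.customer_id
""").fetchall()

# An order that points at a customer who does not exist is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (101, 99, 'Phone')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```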
CRUD Anomalies
Read - does not create an anomaly
Create anomalies - repeating data already stored, combining data, possibly creating unstructured data
Update anomalies - storing the same data in many different places, so an update can miss some copies
Delete anomalies - deleting a row of data that affects another table's data
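Example (update and delete anomalies): a minimal sketch, assuming a hypothetical unnormalized table `orders_flat` that repeats each customer's city on every order row. The delete case shown is the common textbook form, where removing a row also removes an unrelated fact stored in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical unnormalized table: customer and city are repeated per order.
conn.execute("""
    CREATE TABLE orders_flat (
        order_id INTEGER PRIMARY KEY, customer TEXT, city TEXT, item TEXT
    )
""")
conn.executemany("INSERT INTO orders_flat VALUES (?, ?, ?, ?)", [
    (1, "Ada", "Boston", "Laptop"),
    (2, "Ada", "Boston", "Mouse"),
    (3, "Grace", "Denver", "Phone"),
])

# Update anomaly: only one of Ada's rows is changed, so her rows now disagree.
conn.execute("UPDATE orders_flat SET city = 'Chicago' WHERE order_id = 1")
ada_cities = sorted(
    r[0] for r in conn.execute(
        "SELECT DISTINCT city FROM orders_flat WHERE customer = 'Ada'"))

# Delete anomaly: removing Grace's only order also erases where Grace lives.
conn.execute("DELETE FROM orders_flat WHERE order_id = 3")
grace_rows = conn.execute(
    "SELECT COUNT(*) FROM orders_flat WHERE customer = 'Grace'").fetchone()[0]
```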
Describe Normalization
Normalization is the process of decomposing a database into more tables until the database is no longer susceptible to anomalies.
Most common normal forms
- First normal form (1NF)
- Second normal form (2NF)
- Third normal form (3NF) (industry standard)
1NF
Each table cell should contain a single value
Each record needs to be unique
2NF
Rule 1- Be in 1NF
Rule 2- Single-column primary key (more generally, no non-key column depends on only part of a composite key)
3NF
Rule 1- Be in 2NF
Rule 2- Has no transitive functional dependencies
What is a transitive functional dependency?
A transitive functional dependency exists when changing one non-key column might cause another non-key column to change.
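Example (removing a transitive dependency): a minimal sketch in plain Python, assuming a hypothetical employee table in which `city` is determined by `zip_code`, a non-key column. All names and values are made up for illustration.

```python
# Hypothetical table keyed by emp_id. city depends on zip_code (a non-key
# column), so changing zip_code may force city to change as well.
employees = [
    {"emp_id": 1, "name": "Ada",   "zip_code": "02108", "city": "Boston"},
    {"emp_id": 2, "name": "Grace", "zip_code": "02108", "city": "Boston"},
    {"emp_id": 3, "name": "Alan",  "zip_code": "80202", "city": "Denver"},
]

# Decompose: move the zip_code -> city fact into its own table ...
zip_codes = {row["zip_code"]: row["city"] for row in employees}

# ... and keep only columns that depend directly on the key emp_id.
employees_3nf = [
    {"emp_id": r["emp_id"], "name": r["name"], "zip_code": r["zip_code"]}
    for r in employees
]


def city_of(emp_id):
    """Look up an employee's city through the zip_code reference."""
    for row in employees_3nf:
        if row["emp_id"] == emp_id:
            return zip_codes[row["zip_code"]]
    raise KeyError(emp_id)
```

After the split, each city is stored once per zip code, so correcting it is a single change.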
What are some examples of tagged data?
XML, HTML, and JSON are examples of tagged data.
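Example (reading tagged data): a minimal sketch using Python's standard `json` and `xml.etree.ElementTree` modules. The `student` record is hypothetical. It shows the same data tagged two ways and parsed into the same dictionary.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical record, tagged as JSON and as XML.
json_text = '{"student": {"name": "Ada", "major": "Math"}}'
xml_text = "<student><name>Ada</name><major>Math</major></student>"

# JSON: keys act as the tags.
from_json = json.loads(json_text)["student"]

# XML: each child element's tag names the value it wraps.
root = ET.fromstring(xml_text)
from_xml = {child.tag: child.text for child in root}
```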
What is AI?
The Turing test is a test of a machine's ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human.
Define Natural Language processing
NLP translates human speech and language into computer-readable text using programming languages.
Examples:
- Speech recognition
- Sentiment Analysis