Data relations (1.3.2 1) Flashcards
What are relational databases?
databases usually made up of 2D data structures called tables
What are table identifiers?
each table in the database needs a name
What is a record?
each row in a table
What is a field?
each column in a table
What is a data element/data item?
each individual cell in a table
Table and field identifier
case sensitive
should contain no spaces
Tables and keys
the most important aspect of a table (even in a flat file) is that each record must be uniquely identifiable in some way
to do this we define a primary key
What is a primary key?
a field or collection of fields which uniquely identify each record
ID fields
an easy way to create a uniquely identifiable record is to simply have a field which contains an automatically generated (often serialised) ID number
downside: it increases the size of the database
downside: it introduces an abstract piece of data which may have no relevance to real-life uses
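A minimal sketch of an automatically generated ID field, using Python's built-in sqlite3 module. The table name Pupil and its fields are hypothetical and chosen only for illustration; the automatic numbering shown here is SQLite's behaviour for an INTEGER PRIMARY KEY column, and other database systems do this differently.

```python
import sqlite3

# In-memory database, so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# pupil_id is the primary key. SQLite fills in an INTEGER PRIMARY KEY
# automatically when the INSERT statement leaves it out.
conn.execute(
    "CREATE TABLE Pupil ("
    " pupil_id INTEGER PRIMARY KEY,"
    " first_name TEXT,"
    " surname TEXT)"
)

# Two pupils with identical names are still two distinct records,
# because each one receives its own ID.
conn.execute("INSERT INTO Pupil (first_name, surname) VALUES ('Sam', 'Jones')")
conn.execute("INSERT INTO Pupil (first_name, surname) VALUES ('Sam', 'Jones')")

pupil_rows = conn.execute(
    "SELECT pupil_id, first_name, surname FROM Pupil ORDER BY pupil_id"
).fetchall()
print(pupil_rows)  # [(1, 'Sam', 'Jones'), (2, 'Sam', 'Jones')]
```

The ID carries no real-world meaning, which is the trade-off noted on the card: it exists only to tell the two records apart.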
What is a foreign key?
a field of one table which references the primary key of a different table
Foreign keys
using one produces a relation between the two tables
tables can be related to multiple other tables by use of foreign keys
cannot be used as the primary key in a related table and should not be used as part of a multi field primary key
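A rough sketch of a foreign key relating two tables, again using Python's sqlite3 module. The tables FormGroup and Pupil and their fields are hypothetical examples. Note the assumption that SQLite only enforces foreign keys after the PRAGMA shown below has been run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite ignores foreign key constraints unless this is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE FormGroup (form_id INTEGER PRIMARY KEY, form_name TEXT)"
)
# Pupil.form_id is a foreign key: it references the primary key of FormGroup.
conn.execute(
    "CREATE TABLE Pupil ("
    " pupil_id INTEGER PRIMARY KEY,"
    " surname TEXT,"
    " form_id INTEGER REFERENCES FormGroup(form_id))"
)
conn.execute("INSERT INTO FormGroup (form_id, form_name) VALUES (1, '12A')")
conn.execute("INSERT INTO Pupil (surname, form_id) VALUES ('Jones', 1)")

# Following the foreign key joins each pupil to their form group.
joined = conn.execute(
    "SELECT Pupil.surname, FormGroup.form_name"
    " FROM Pupil JOIN FormGroup ON Pupil.form_id = FormGroup.form_id"
).fetchall()
print(joined)  # [('Jones', '12A')]

# A foreign key value with no matching primary key is rejected.
try:
    conn.execute("INSERT INTO Pupil (surname, form_id) VALUES ('Smith', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```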
Entities and relations
we can describe the relation between the tables by considering the actual relation between the objects described in the table
One to many relation
One to one relation
Many to many relation
Linking tables
the inclusion of a linking table splits the many to many relation into two different one to many relations
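One possible sketch of a linking table, assuming hypothetical Pupil and Course tables where a pupil can take many courses and a course can have many pupils. The linking table Enrolment (also a made-up name) holds one record per pupil and course pairing, with its own ID field as primary key and a foreign key to each side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce foreign keys

conn.executescript("""
CREATE TABLE Pupil (pupil_id INTEGER PRIMARY KEY, surname TEXT);
CREATE TABLE Course (course_id INTEGER PRIMARY KEY, title TEXT);

-- Linking table: each record pairs one pupil with one course.
CREATE TABLE Enrolment (
    enrolment_id INTEGER PRIMARY KEY,
    pupil_id INTEGER REFERENCES Pupil(pupil_id),
    course_id INTEGER REFERENCES Course(course_id)
);

INSERT INTO Pupil VALUES (1, 'Jones'), (2, 'Smith');
INSERT INTO Course VALUES (1, 'Maths'), (2, 'Physics');
INSERT INTO Enrolment (pupil_id, course_id) VALUES (1, 1), (1, 2), (2, 1);
""")

# One pupil, many enrolments: every course taken by pupil 1.
courses_for_jones = [row[0] for row in conn.execute(
    "SELECT Course.title FROM Enrolment"
    " JOIN Course ON Enrolment.course_id = Course.course_id"
    " WHERE Enrolment.pupil_id = 1 ORDER BY Course.title"
)]

# One course, many enrolments: every pupil taking course 1.
pupils_on_maths = [row[0] for row in conn.execute(
    "SELECT Pupil.surname FROM Enrolment"
    " JOIN Pupil ON Enrolment.pupil_id = Pupil.pupil_id"
    " WHERE Enrolment.course_id = 1 ORDER BY Pupil.surname"
)]

print(courses_for_jones)  # ['Maths', 'Physics']
print(pupils_on_maths)    # ['Jones', 'Smith']
```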
Primary key and indexing
a primary key is automatically indexed within a database
an indexed field is very easily searched and records are quickly retrieved
Secondary key and indexing
can define a field as a secondary key which the database will then index
makes searching and retrieving records via secondary key much quicker at the cost of increasing the storage space required by the database
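A small sketch of indexing a secondary key in SQLite, assuming a hypothetical Pupil table that is often searched by surname. The index name idx_pupil_surname is made up for the example. EXPLAIN QUERY PLAN is SQLite's way of reporting how it intends to run a query; its exact wording varies between SQLite versions, so the sketch only checks whether the index is mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pupil (pupil_id INTEGER PRIMARY KEY, surname TEXT)")
conn.executemany(
    "INSERT INTO Pupil (surname) VALUES (?)",
    [("Jones",), ("Smith",), ("Patel",)],
)

def query_plan(connection):
    """Return SQLite's description of how it would search by surname."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM Pupil WHERE surname = 'Smith'"
    ).fetchall()
    # The human-readable description is the last column of each row.
    return " ".join(row[-1] for row in rows)

# Without an index the database has to scan every record.
plan_before = query_plan(conn)

# Define surname as a secondary key by indexing it.
conn.execute("CREATE INDEX idx_pupil_surname ON Pupil (surname)")

# With the index in place the plan should mention it.
plan_after = query_plan(conn)

print(plan_before)
print(plan_after)
```

The index is stored alongside the table, which is where the extra storage cost mentioned on the card comes from.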
Data
raw and unprocessed (can contain errors)
needs processing before we can get any information out of it
can come from automated or human sources
How to get data from automated sources
e.g. electronic sensors (weather stations, presence detectors, web cookies)
produces huge amounts of data cheaply
sensor errors can go unnoticed for a long time
ethical concerns (surveillance, anonymity)
How to get data from human sources
e.g. census, survey
error prone (due to human error)
expensive
difficult to get people to voluntarily provide private data
Turning data to information
data needs to be cleaned (removing any errors)
statistical methods/graphical methods can be applied to the data
visualisations are all about making the data easier to understand for the person looking at it
correlation vs causation (a correlation between two variables does not prove that one causes the other)
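A minimal sketch of cleaning data and then applying a simple statistical method. The readings are invented values standing in for a weather station's temperature log, and the limits of -50 and 60 degrees are an assumed plausible range, not a standard.

```python
from statistics import mean

# Raw data: includes a blank entry, a sensor fault code and a typing error.
raw_readings = ["12.5", "13.1", "", "-999", "12.9", "abc", "13.5"]

def clean(readings, low=-50.0, high=60.0):
    """Keep only readings that are numbers within a plausible range."""
    cleaned = []
    for text in readings:
        try:
            value = float(text)
        except ValueError:
            continue  # not a number at all, so discard it
        if low <= value <= high:
            cleaned.append(value)
    return cleaned

temperatures = clean(raw_readings)
average = mean(temperatures)

print(temperatures)        # [12.5, 13.1, 12.9, 13.5]
print(round(average, 1))   # 13.0
```

Without the cleaning step the fault code -999 would drag the average far below any real temperature, which is why processing comes before interpretation.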
Storing data
easiest way to store data is a table
rows are records
columns are fields
Flat file database
single table of data
requires no specialised software
portable (flat files are often stored in plain text)
easy to use (requires no specialist training)
large datasets start to become unwieldy (errors can become more prevalent)
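A short sketch of a flat file database held as plain comma-separated text and read with Python's built-in csv module. The field names and records are made up for illustration, and io.StringIO stands in for a real file on disk.

```python
import csv
import io

# A flat file: a single table stored as plain text.
# The first line holds the field names; every other line is one record.
flat_file = io.StringIO(
    "pupil_id,first_name,surname\n"
    "1,Sam,Jones\n"
    "2,Alex,Smith\n"
)

# Each record becomes a dictionary mapping field name to data item.
records = list(csv.DictReader(flat_file))
surnames = [record["surname"] for record in records]

print(len(records))  # 2
print(surnames)      # ['Jones', 'Smith']
```

Every value is read back as text, including the ID, which hints at why large flat files become error prone: nothing in the file itself checks data types or uniqueness.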