132-databases Flashcards

You may prefer our related Brainscape-certified flashcards:
1
Q

Database

A

persistent, organised store of data. Persistent: stored on permanent storage (e.g hard drive) & not stored in memory-only data like data structures array, queue/list). Organised: certain standardised catalogued way, stored in records&fields. Store of data: raw facts/figures an application needs

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Organising data into a database benefits & from paper-based to electronic databases:

A

-easier to efficiently search, manipulate, update and make data copies.
- can be accessed by multiple people at same time from different locations.
- Avoids inconsistencies and duplication of data, storing and managing data in a reliable and efficient way.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Database terminology (table, record, field, entity, attributes, field name, tuple)

A

A table is a collection of records representing one set of data in a database. A record is data fields about a single entity. A field is a piece of data about an entity in a record. A entity is an object, place, person, or thing that can be distinguished from others and has data stored about it
Attributes: Properties/characteristics of an entity.
Field name: An identifier for a piece of data.
Tuple: A record in a table.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Database concepts

A
  • database is a collection of tables that store data about entities in the real world.
  • Each table consists of records (rows/tuples) and fields (columns/attributes)
  • database is managed by a database application like a DBMS.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Flat file database

A
  • all data stored in only one table (a single file, often CSV and based around a single entity & it’s attributes)
  • Typical uses: when only one type of entity is being stored, storing contact details, small product database, maintaining music/game collection
  • Created using database/spreadsheet software but can be created manually using text editor/other tools
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Flat file database pros

A
  • very simple, easy & quick set up and design
  • little expertise to maintain
  • suitable for storing small amounts of data
    -easier to import into another system
    -easier to use for an inexperienced user
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Flat file database cons

A

-very inefficient, a lot of repetitive data, takes up unnecessary space
-gets worse as database grows: difficult to maintain, slow to query

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Relational database

A
  • database worganise data into multiple tables, each representing an entity in the system.
  • to minimize redundancy, repetitive data is split and stored in separate related tables, which are linked together using relationship
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Relational database pros

A
  • reduces data duplication - saves storage
  • easier to change data or the format of data
  • easier to add data
  • improves data integrity and consistency
  • allows different access levels/security levels
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Relational database cons

A
  • becomes complex as amount of data grows - decrease in performance - slow query responses
  • expensive - set up and maintenance
  • requires lot of physical storage.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Primary key

A

field, has a unique identifier (value) for each record in table. which is never repeated within a table so ensures each record is unique.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Secondary key

A
  • field that has some identifying information but typically not uniquely identifies record
  • allows database to be searched quicker for records/speeding up sorting - if key maintains a logical order to the records
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Foreign key

A
  • primary key (attribute) of another table(entity), to link tables together
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Entry relation/entity relationships

A
  • One-to-one: one entity can only be linked to one entity. e.g each customer’s firstname has one surname.
  • One-to-many: one entity can be associated with many other entities .e.g customer can have many orders but each order belongs to only one customer.
  • Many-to-many: many entities can be associated with many other entities. Each customer can buy many items, and each item can have many customers.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Data capture

A
  • collecting data by extracting info from paper/electronic documents and converting it into data to place into database.
  • manual- paper forms: human reading form & manually typing info into computer based system, don’t require technology/internet connection
  • automated methods - helps speed up & automate data input, very fast n efficient, less prone to human error): smart card readers, barcode readers, scanners, OCR, OMR, magnetic ink character recognition, sensors
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Data capture OCR (optical character recognition)

A
  • Automatically reads text by interpreting letters shapes.
  • Works better w/ printed text than handwriting.
  • Used by post office to read postcode & route mail, road cameras to handle congestion charging n identify speeding drivers (form of OCR: ANPR automatic number plate recognition software)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Data capture OMR (optical mark recognition)

A
  • detects the presence of marks or shaded areas.
  • Provides more accuracy than OCR.
  • Used by multiple choice tests, lottery tickets, surveys, ballots
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Data capture - Inputting Data

A
  • once data collected, can be transferred to database automatically using DBMS software/typing it in customised form/importing it from a spreadsheet or file/using EDI
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Selecting data

A
  • methods of querying the correct data and retrieving it.
  • Using query languages e.g SQL/QBE.
  • Both allow to specify tables/fields/different search criteria/output sorting (e.g ascending/descending order, & use boolean expressions.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

QBE (Query By Example)

A
  • makes use of visual tables where user enter commands n conditions
  • doesn’t need to remember syntax finer details, just need to understand how to use graphical interface.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

Managing data

A
  • managing, manipulating, adding, editing and deleting data.
  • Can be managed by SQL to sort, restructure and select certain sections - DML to modify (add new n edit existing data n delete data) a database once it has been initially set up(commands update delete insert) or by using built in facilities of database.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Database applications

A
  • software programs designed to retrieve, manage, distribute and present information required by the end users effectively from the raw data of the database.
  • Examples: database management system (DBMS), CRM systems, web applications, social media websites
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

Database management systems (DBMS)-

A
  • software application, allows database administrator to maintain one/more relational databases.
  • Hides underlying complex structure of data, provides layer of abstraction for user, so can define the database structures at logical level.
  • Examples: MySQL, SQLite
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

Features of a DBMS

A
  • allows different applications to access data at the same time, controls access to data
  • ensures it remains integral (enforce referential integrity) by preventing creation of duplicate primary keys n any operations that could damage relationships between tables of data
  • enforce validation rules, provides secure access (limits access), provides encryption
  • provides program data independence
  • manages multiple users
  • provides backups n ability to restore from a backup if disaster occurs
  • supports a query language n other languages (to extract/add/amend data & alter database structure)
  • controls concurrency (lock data whilst someone is working on it so someone else cannot change it at same time)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

Exchanging data -

A
  • Process of transferring the collected data w/ other people/electronic systems
  • Manual methods: memory stick, optical media, removable hard disk, email, paper-based
  • Automatic methods: bypass manual data exchange and have 2 database interface with one another (reading n writing directly to and from eachother’s table) through live connection e.g EDI
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

EDI (electronic data interchange)

A
  • Protocol between two systems to facilitate exchange of data.
  • Doesn’t require human interact, enables data transfer from one computer to another. Increases data transmission speed & efficiency of processes for users. Process can be entirely automated without any human involvement
  • though, any error in data will be replicated across multiple systems.
  • Uses e.g automatic order placement system when item stocks low.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

Exchanging data considerations

A

making sure can exchange data between different databases n other systems/application self spreadsheet software n accounting programs. Common formats: human readable, open formats for structuring data. Standards for storing n transporting data, systems allow. Examples: CSV, JSON, EDI, SOAP, RSS, APIs, SQL

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

CSV file (commas separated value)

A
  • stored in text file data is in a list with a comma between each one.
  • Used for excel documents.
  • Database: each record is stored on separate line n each field separated by comma.
  • Structure fixed and known, important routines can be written to extract data from file.
  • Many systems allow data to be output in CSV format.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

JSON file (JavaScript object notation)

A
  • json strings can be easily converted into an object
  • Common format for databases
  • can export and output data in JSON format. [{“id”:1, “last name”:”Smith”,”firstname”:”Sara”}, … }
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

Indexing a database

A
  • process of creating a database index, data structure that improves the speed of data retrieval
  • method used to store position of each record ordered by certain attribute
  • makes it quicker to query/search through n find required items on those particular attributes, as items not need to be searched sequentially.
  • When field becomes indexed==secondary key. Primary key is automatically indexed so can be located quickly n easily, but almost never queried as not normally remembered so secondary keys used (fields used as search criteria).
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

Normalisation

A
  • process of removing redundancy within a database
  • way of structuring the data in a relational database according to formal rules
  • process of coming up w/ best possible layout for relational database by splitting tables and arranging data to normalised forms( need to be normalised to ensure system operate properly
  • data integrity maintained n unnecessary duplicates avoided).
  • Tries to accomplish: no redundancy, consistent data throughout linked tables, records added n removed w/o issues, complex queries can be carried out.
  • makes accessing and storing data easier and more efficient
  • makes maintaining the database contents more quicker and easier
  • reduces the chances of data errors or duplication
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

Database normalisation forms

A

first normal form (1NF), second normal form (2NF), third normal form (3NF)

33
Q

UNF (unnormalised form)

A

data in any format/configuration. Flat file before any normalisation.

34
Q

1NF (first normal form)

A
  1. Each record have no repeating/similar groups of attributes.
  2. Data in each field cannot be sensibly broken down any further (atomic). E.g James Winfield should be split
  3. Each record has a unique field (primary key)
  1. All field names must be unique.
  2. Values in fields should be from the same domain (same type of data)
35
Q

Fixing a many to many relationship

A

M:M not considered to be good database design and is not allowed in 3NF. Create linking table, add primary keys from initial table as composition key for linking table. Becomes 2 separate 1:M relationships joined by link table.

36
Q

Composite key

A

combination of two or more fields in table used to uniquely identify each record.

37
Q

2NF (second normal form)

A
  1. Data already in 1NF.
  2. Any partial dependencies (happens with compositive keys) removed, all non key attributes must depend on every part of the primary key and not just one part of it
    (IF YOUR TABLE HAS NO COMPOSITE KEY, IT IS ALREADY IN 2NF)
38
Q

3NF (third normal normal)

A
  1. Data is already in 2NF.
  2. transitive dependencies removed (no non-key fields dependent on non-key fields)
39
Q

SQL(structured query language)

A
  • declarative language (programs specified in terms of what they must do), used to manipulate/interrogate databases
  • enables fast n efficient retrieval n run reports, manipulation(creating, removing, updating), deletion of records using simple set of commands. Allows query/manipulate/define data n control data access.
40
Q

SQL selecting data

A
  • SELECT (variable) FROM (table) WHERE (criteria).
  • SELECT collects fields from given table
  • paired with FROM to specify which table(s) info will come from,
  • WHERE specifies search criteria. n ORDER BY orders output shown to user, ORDER BY field ASC (ascending order top-bottom A-Z) or DESC
41
Q

SQL wildcards

A
    • wildcard for select, shows all fields from a table. % wildcard for: WHERE field LIKE “%A” gives all name field records that include A, “A%” gives all name field records that start with A
42
Q

SQL command groups (group, provides commands that allows to..)

A
  • DDL (data definition language) define n modify database & its components (e.g tables, relationships) e.g CREATE.
  • DQL (data query language) retrieve data from database e.g SELECT.
  • DML (data manipulation language) - insert, update and delete from database. e.g INSERT INTO, UPDATE, DELETE FROM
  • DCL (data control language) grant n revolve access permissions for database n its contents.
  • DTL (data transaction language) - manage transactions
43
Q

SQL CREATE

A
  • allows to make new databases, tables, or link tables.
  • (CREATE DATABASE databasename)
  • CREATE TABLE tablename(field 1 data type (length), field2 data type (length), …),
  • link table example:
    CREATE TABLE Class (CourseID CHAR(6), InstructorID INT, FOREIGN KEY CourseID REFERENCES Courses(CourseID), FOREIGN KEY InstructorID REFERENCES Instructors(InstructorID), PRIMARY KEY (CourseID, InstructorID))
44
Q

SQL INSERT INTO

A
  • insert record into database table:
  • INSERT INTO table (field1, field2, … ) VALUES (value1,value2, … )

e.g INSERT INTO Course (CourseCode, Fee) VALUES (“CR7531”, 25)

45
Q

SQL UPDATE

A
  • update record in database table:
  • UPDATE table SET field1=value1, field2=value2, field3=value3, … WHERE criteria;
46
Q

SQL DELETE FROM

A
  • delete record from database table: DELETE FROM table WHERE criteria;
47
Q

SQL JOIN (inner join)

A
  • combine data from two or more tables by specifying a common field between them. SELECT table.field, table.field, … FROM table1 INNER JOIN table2 ON table1.common field=table2.common field (example: SELECT Students.Surname, Tutors.Room FROM Students JOIN Tutors ON Students.Group=Tutors.Group)
48
Q

SQL DROP TABLE

A
  • deletes existing table in database, results in complete loss of all info stored in table.
  • DROP TABLE tablename
49
Q

SQL ALTER

A
  • add, delete, modify columns in a table.
  • ALTER TABLE tablename ADD attribute datatype (optional length)/DROP COLUMN attribute/MODIFY COLUMN attribute datatype(optional length)
50
Q

SQL data types

A
  • CHAR(n) string of fixed length n
  • VARCHAR(n) string of variable length with upper limit n.
  • BOOLEAN true/false values
  • INT integer
  • FLOAT number with floating decimal point
  • DATE date in format D/M/Y
  • TIME time in format H/M/S
  • CURRENCY sets number as monetary amount (money)
51
Q

Data integrity

A
  • maintenance and consistency of data in a data store
  • stop inconsistent/incomplete transactions from causing data corruption (data becomes unusable) and lost of accuracy
52
Q

Referential integrity

A
  • table relationships must always be consistent
  • concept where each foreign key must link to an existing/valid primary key.
  • every referenced value must exist
  • changes to data in one table must take into account data in linked tables.
  • ensures consistency + accuracy of data within relationships n database itself. enforced by DBMS
53
Q

Maintaining referential integrity

A
  • admin specify what happens when record updated/deleted
  • enforce a cascade delete restraint to primary key relationship (if record deleted, any records w/ matching foreign key values deleted).
  • prevent adding record if there isn’t matching primary key link to foreign key value
  • Ensuring consistency: info not removed if required elsewhere in linked database, if two database linked, one cannot be deleted as other table requires it.
  • if primary key value changed, matching foreign key values no longer valid, so must be auto updated/cascaded change
54
Q

Data redundancy

A
  • occurs when data duplication/repeated data kept in database. e.g same value repeated in 2 or more fields in table
  • same field replicated in 2 or more tables. if change made in one place=not all instances of data will have same value =data inconsistency->corruption, compromise data accuracy, increase storage requirements, redundant (unnecessary) data/wasted space.
  • Dealt by database normalisation.
55
Q

When redundancy is needed

A
  • Built in redundancy e.g RAID setup/mirroring servers (back up) in computer system (duplicate hardwares located in different geographical areas, mirrors every transaction taken place on main system, if main system fails, backup system takes over/if part of database is lost, recoverable)
56
Q

Data independence

A
  • database designed so reflects natural relationships between entities, is not structured in way that suits specific application.
  • So when changes made to data use structure/definition, not affect existing database applications that use database.
  • DBMS manages modifications so database application programs don’t need to be rewritten everytime a change occurs to database so user’s view n experience of data not compromised. Physical data independence and logical data independence
57
Q

Physical data independence

A

ability to adapt physical aspects of database w/o affecting database applications e.g changing devices used for storing data/way files organised

58
Q

Logical data independence

A

ability to adapt to logical aspects of database w/o affecting database applications e.g removing/adding entities/relationships/attributes

59
Q

Validation and verification techniques (database)

A
  • assuring data integrity when data entered/manipulated in database
  • database applications apply validation and verification checks - only valid/verified data sent to database, user friendly messages displayed to prompt user to correct invalid data.
  • DBMS incorporates data dictionary alongside database that specifies any rules/constraints automatically applied. Data rejected if not meet rules.
60
Q

Validation

A

Checking whether data is allowed and follows criteria. cannot ensure that it’s a wrong value, only ensures data is reasonable
e.g birthday date jan 1 2000

61
Q

Types of validation techniques

A

Check digit- Uses additional digit - check other data correct. Mathematical algorithm - calculate value from other data. value compared to check digit to ensure numbers been entered correctly. e.g Detects errors on identification numbers.
Format check-checks data is in the right format
Length check-checks the data isn’t too short or too long by checking how many characters there are
Lookup table-Looks up acceptable values in a table
Presence check-checks data has been entered into a field
Range check-checks a value falls within a specified range

62
Q

verification

A

checking data was entered correctly. double entry, visual check

63
Q

double-entry

A

data entered twice to check both sets of data match up
compared by computer n human, re-entered if discrepancy found

64
Q

visual check

A

double checking data input by looking at it. proof-reading/data compared to original source/intention by human n re-entered if discrepancy found.

65
Q

Database constraints

A
  • used to limit data that can be entered into database table, specify rules data needs to follow to be accepted in a table. SQL statements used in creation of table.
  • Examples: PRIMARY KEY: every row of column has unique value that isn’t null. UNIQUE - all values in a column are different. NOT NULL- column won’t accept null value. FOREGN KEY- establish referential integrity (record w/ foreign key value can only be inserted into table of theres record w/ same value as primary key in another table). CHECK- enforce specific condition to column of table e.g only positive numbers
66
Q

Transaction processing (database)

A
  • information processing must divided into individual, indivisible operations (transactions - single logical unit of work) executed on data.
  • each transaction must completed or not at all. Ensure integrity
  • conforms to ACID rules
67
Q

ACID rules (database)

A

set of properties that guarantees database transactions processed reliably n integrity of database maintained under all circumstances n no data inconsistency. Atomicity, Consistency, Isolation, Durability

68
Q

Record locking

A
  • Record locked when in use.
  • If one transaction amending record, no other transaction able to simultaneously access record until first transaction is complete/cancelled
  • prevents inconsistencies/loss of updates
  • Problems: deadlock
69
Q

ACID rules (database) Atomicity

A
  • database transaction atomic, must be completed in full/not at all but never partially processed under all situations e.g power cuts, hard disk crashes.
  • Held in temporary storage buffer until complete. e.g not possible for a review to be added without average rating being updated
70
Q

ACID rules (database) Consistency

A
  • no transaction can violate defined validation rules of database, change to database must retain overall valid state of database) e.g must maintain referential integrity rules between linked tables
71
Q

ACID rules (database) Isolation

A
  • transactions occur on their own, cannot be interrupted/influenced by other transactions happening at same time (which should lead to same result as if they were executed one after the other). DBMS enforces isolation w/ record locking
72
Q

ACID rules (database) Durability

A
  • once transaction has been executed (change made to database), it will remain so regardless of circumstances e.g system failure after it. DBMS writes effects of complete transaction immediately back to permanent secondary storage
73
Q

Deadlock

A
  • when one process requires the output of another
  • if two/multiple users locking their record are attempting to access/update eachother’s records
  • situation arises where neither can proceed as two/more transactions are waiting for one another to give up locks.
74
Q

Commitment ordering

A
  • evaluating impact of transactions committed to database system before these are executed
  • If transaction likely to cause deadlock, it is blocked until process it depends on has completed
  • Commit order produced and followed to execute transactions in an optimal way that avoids conflicts, maximises serialisation of transactions n improved overall database system performance
75
Q

Timestamp ordering

A
  • each record has two time stamps: last read, last written/updated.
  • All operations to read/write from database are processed in timestamp order, each requested operation given operation timestamp
  • If write operation requested: if read/write timestamp of record I’d newer than timestamp of request write operation, operation is reflected, otherwise allowed n read/write timestamp of record set to operation timestamp.
  • If read operation requested: if write timestamp of record newer than timestamp fox requested read operation, then read operation rejected, if write timestamp of record is older than/same as timestamp of requested operation then allowed n read timestamp updated. I
  • f operation is not allowed, suitable error message displayed to user to instruct them to try again.
76
Q

Serialisation

A

concept when two/more transactions executed concurrently, effect should be as the same as if been executed serially. Transaction serialisable if it can be carried out regardless of any other transaction. Achieved by placing transactions in a queue

77
Q

How can a company ensure accuracy of data and why is it important? 9 marker - points

A
  • careful transaction processing: be atomic, consistent, isolated, durable.
  • record locking
  • referential integrity
    -built in redundancy
    -data entered must be accurate in first place
    -security measures to prevent malicious tampering of data
    —————-
    -ensuring accuracy down to DBMS and code accessing DBMS
    -validation n verification rules n techniques to catch errors
    -some systems disable copy-paste to prevent input errors
    -RAID setup/mirroring servers
    -referential integrity enfroced by DBMS
    -security procedures: firewall, enforce sensible passwords, user access rights etc
    —————-
    -mistakes w/ data= lose money/legal trouble/lose business/reputation
    -without redundancy data could be lost
    -without careful transaction processing, transaction could overwrite another/half complete leading to inaccurate data
    -data protection act -obligation to keep personal data accurate.
78
Q

SQL sub queries

A

example:
SELECT name
FROM world
WHERE population >
(SELECT population
FROM world
WHERE name = ‘Algeria’)