Data Management Flashcards

Question

uniq

Answer 1

Removes duplicate adjacent lines
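
The "adjacent" part matters: a repeated line that is separated by a different line is kept. A minimal Python sketch of that behaviour follows; the function name `uniq` is only an illustrative stand-in for the shell command.

```python
from itertools import groupby


def uniq(lines):
    """Collapse each run of identical adjacent lines into one line."""
    # groupby groups consecutive equal items, which is the same rule uniq applies.
    return [key for key, _ in groupby(lines)]


result = uniq(["a", "a", "b", "a", "a"])
print(result)  # ['a', 'b', 'a'] - the later run of "a" survives
```

This is why input is usually sorted first when every duplicate should go.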

Answer 2

Records changes to files over time so they can be undone and viewed

Answer 3

Database on your computer holds all the changes. Does not allow collaboration

Answer 4

Single server stores changes, with users checking out files. Allows for collaboration but single point of failure

Answer 5

Changes (repository) stored in server and locally, with changes copied to each other.

Answer 6

put file in local repository

Answer 7

commit changes to file to local repository

Answer 8

Message describing a commit

Answer 9

Upload local repository content to remote repository

Answer 10

Download file from remote repository

Answer 11

When changes made cannot be merged automatically, resolved by manually applying changes to latest version

Answer 12

Copies new features from a branch into main. Working on a branch keeps unfinished code out of main until it is merged

Answer 13

Copies latest changes from main to branch, keeping branch up to date

Answer 14

Views current processes

Answer 15

CPU usage of processes

Answer 16

kill PID, where PID is the process ID. Signal options:
SIGTERM - requests the process to stop, leaving time for a graceful shutdown
SIGKILL - forces the process to stop execution

Answer 17

Moves a process to the background/foreground

Answer 18

Allows for creation of screens to run processes in the background

Answer 19

Accessible by all processes run in the shell

Answer 20

Ordered List of directories that store executables to be run

Answer 21

Sets environment variables: export variableName='value'. Lists all exported environment variables if given no argument

Answer 22

Searches for lines containing the given pattern. grep [pattern] [input]

Answer 23

\* - zero or more of the previous item
? - zero or one of the previous item
+ - one or more of the previous item
. - wildcard (any single character)
[] - any one character from the given set or range
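
These operators can be tried out in Python's `re` module, whose syntax for them matches extended regular expressions (as in `grep -E`). The helper `matches` is a hypothetical name used only for this sketch.

```python
import re


def matches(pattern, text):
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


print(matches(r"ab*", "a"))       # True: * allows zero b's
print(matches(r"ab?", "abb"))     # False: ? allows at most one b
print(matches(r"ab+", "a"))       # False: + needs at least one b
print(matches(r"a.c", "axc"))     # True: . stands for any one character
print(matches(r"[a-c]at", "bat")) # True: b is in the range a-c
print(matches(r"[a-c]at", "rat")) # False: r is outside the range
```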

Answer 24

Takes in text and modifies it. sed [command] [file]. Example commands: 's/Hello/Hi/' replaces the first Hello on each line with Hi; adding g, as in 's/Hello/Hi/g', replaces every instance on the line; '/Run/d' deletes all lines that contain Run

Answer 25

Allows for processing of tables. awk [pattern] {action}. By default the action is run on every line. $number refers to a column, e.g. $2 is the second column

Answer 26

action run once at the start

Answer 27

action run once at the end

Answer 28

(condition){action}, action only run on lines that meet condition

Answer 29

1. Easily allows for displaying of complex equations
2. Can compile large documents easily
3. Placement of figures and tables is easy
4. Automatic referencing
5. OS independent

Answer 30

Typed as a .tex file and the LaTeX engine compiles to .pdf

Answer 31

Open and close with $. Allows for mathematical symbols

Answer 32

Allows for operating on multiple files at once
\* - any characters
? - any single character
[] - one character out of those given

Answer 33

Changes permissions for files/directories. If using numbers, each digit is the value of a 3-bit binary number (read = 4, write = 2, execute = 1), with one digit each for owner, group and other
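
As a rough illustration of the numeric form, the sketch below turns a three-digit mode into the rwx string that `ls -l` would show. The function name `octal_to_rwx` is hypothetical.

```python
def octal_to_rwx(mode):
    """Convert a chmod-style octal string such as '754' to 'rwxr-xr--'."""
    out = []
    for digit in mode:
        bits = int(digit, 8)  # each digit is three bits: read, write, execute
        out.append("r" if bits & 4 else "-")
        out.append("w" if bits & 2 else "-")
        out.append("x" if bits & 1 else "-")
    return "".join(out)


print(octal_to_rwx("754"))  # rwxr-xr--
print(octal_to_rwx("644"))  # rw-r--r--
```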

Answer 34

First column shows permission information as a 10 character string. First character shows directory/file, then split into 3 character chunks for each accessor (owner, group and other). The three characters represent whether the file/directory is readable/writable/executable.

Answer 35

Executable directories can be opened

Answer 36

Tab-separated values, a form of structured data

Answer 37

* Searching
* Aggregation and summation
* Prediction
* Linking - links info from different sources

Answer 38

A way to make human data machine readable. Involves creating a model that shows relationships between elements

Answer 39

Entities are connected with each other and attributes in a tree structure

Answer 40

Entities and attributes are connected in a directional graph.

Answer 41

Entities and attributes are connected in a directional graph. Additionally, entities are classes, with attribute values coming with a pointer to the attribute

Answer 42

Entities are now tables. They are linked by a key property

Answer 43

Addition of metadata to a document. Gives structure and additional meaning to text, allowing a machine to glean meaning from it

Answer 44

Uses whitespace for structure to allow for easy reading but harder writing. Written with key-value pairs, i.e. variableName: data

Answer 45

* Config files
* Passing data between applications
* Storing simple application states

Answer 46

The syntax can be ambiguous, so may get different results with different parsers (code that splits up text). Not widely used

Answer 47

Stores objects. Subset of YAML. Can be read by most languages. Contains:
* Objects
* Values - "object": "value"
* Lists - "object": [value1, value2]
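
A short sketch of reading such a document with Python's standard `json` module. The field names and values are invented for illustration.

```python
import json

# An object holding a plain value, a list and a nested object.
text = '{"name": "Ada", "languages": ["Python", "SQL"], "address": {"city": "London"}}'

data = json.loads(text)          # parse the JSON text into Python dicts and lists
print(data["name"])              # Ada
print(data["languages"][1])      # SQL
print(data["address"]["city"])   # London

round_trip = json.loads(json.dumps(data))  # serialise and parse again
```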

Answer 48

Sending data on the web/between programs. Sometimes used for config data

Answer 49

A markup language used for documents with hypertext (links). Tags say how to display data, e.g. <text>TEXT HERE</text>

Answer 50

Markup language for Shopify

Answer 51

Standard Generalised Markup Language - A standard for defining markup languages. Parent of other markup languages, e.g. HTML and XML are derived from it

Answer 52

* Complex
* No strict structure
* Requires a definition of structure

Answer 53

* Easier to parse
* Simplifies SGML
* Don't need to define structure

Answer 54

eXtensible Markup Language - Hierarchical with only tags, attributes and content. Made to carry data not display data. No defined tags

Answer 55

Defines how it is written:
* Closing tags for all tags
* Case sensitive
* Must have a root element
* Attributes are quoted

Answer 56

Used as a template to ensure an XML file is written in a certain way

Answer 57

Only contains text

Answer 58

Can contain attributes and children

Answer 59

Gives a prefix to tags, allowing tags with the same name to be distinguished. Declared with xmlns:prefix="someurl". All tags in the namespace are then written prefix:tag

Answer 60

Can apply a stylesheet to add presentation to the XML file

Answer 61

XML path language. The way to query data from an XML file

Answer 62

Gives the node at the path given

Answer 63

Gets all nodes that match

Answer 64

Matches any node

Answer 65

Gets nodes with the given attribute. i.e. //tag[@attribute = 'value']

Answer 66

Gets the text directly inside of the node

Answer 67

Gets the nth occurrence of that node

Answer 68

Gets the last occurrence of that node
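
The attribute, position and last() predicates can be tried with Python's `xml.etree.ElementTree`, which supports only a limited subset of XPath. The `library`/`book` document below is made up for this sketch.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<library>"
    "<book lang='en'><title>Dune</title></book>"
    "<book lang='fr'><title>Candide</title></book>"
    "<book lang='en'><title>Emma</title></book>"
    "</library>"
)

# //tag[@attribute='value']: every book whose lang attribute is 'en'
english = [b.find("title").text for b in doc.findall(".//book[@lang='en']")]
# tag[n]: the nth book (positions start at 1)
second = doc.find("./book[2]/title").text
# tag[last()]: the last book
last = doc.find("./book[last()]/title").text

print(english)  # ['Dune', 'Emma']
print(second)   # Candide
print(last)     # Emma
```

Full XPath features such as text() and contains() need a fuller XPath engine than ElementTree provides.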

Answer 69

Gets node1s that contain a node2

Answer 70

Gets node1s with the given attribute being equal to the given value

Answer 71

Gets the parent of the current node

Answer 72

used in square brackets, also use brackets

Answer 73

Gets both the nodes returned from the query on the left and right

Answer 74

Used in square brackets, true for nodes containing the given text in the given location. location examples: @attribute, text()

Answer 75

Gets the length of the given string

Answer 76

True if the location (attribute/text) starts with the text.

Answer 77

True if the string ends with the text.

Answer 78

Database Management System - Collection of software that manages a database

Answer 79

Idea that other applications and users should be insulated from data structure (logical independence) and storage (physical independence).

Answer 80

Protection from changes to logical structure of data i.e. the schema of it

Answer 81

Protection from changes to physical storage of data i.e. whether it's stored on a hard drive

Answer 82

Defines how data is represented, organised and structured. Relational model is most widely used. Has a data language containing DDL and DML

Answer 83

Used to modify and retrieve data. Contains data definition language (DDL) and data manipulation language (DML)

Answer 84

Data definition language - syntax for describing database templates. Includes creating tables and defining keys e.g. XML schema

Answer 85

Data manipulation language - used for querying data e.g. XPath

Answer 86

A table, formally defined in set theory as a subset of a Cartesian product: R ⊆ S1 × S2, i.e. R ∈ P(S1 × S2)

Answer 87

ordered sequence of k elements

Answer 88

Unordered subset of the Cartesian product of k sets (attributes). Contains many k-tuples. An instance of a k-ary relation schema

Answer 89

An ordered sequence of k-attributes. A template for a k-ary relation

Answer 90

set of relation schemas

Answer 91

set of relations, each of which being an instance of a relation schema. Called a database

Answer 92

schemas. Changes rarely

Answer 93

instances i.e. relations

Answer 94

A set of attributes which is unique for all tuples. Can be made by combining attributes

Answer 95

A relation r satisfies a functional dependency A -> B if all tuples in r with the same value for attributes in A have the same value for attributes in B. Allows deduction of the value of B for a given value of A
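
The definition can be checked mechanically. In the sketch below a relation is assumed to be a list of dicts, and the function name `satisfies_fd` and the sample columns are invented for illustration.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if every pair of rows agreeing on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[b] for b in rhs)
        # Remember the first rhs value seen for this lhs value; any later
        # row with the same lhs must carry the same rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True


rows = [
    {"student": "Ada", "course": "DB", "tutor": "Lee"},
    {"student": "Bob", "course": "DB", "tutor": "Lee"},
    {"student": "Ada", "course": "ML", "tutor": "Kim"},
]
print(satisfies_fd(rows, ["course"], ["tutor"]))   # True
print(satisfies_fd(rows, ["student"], ["tutor"]))  # False: Ada has two tutors
```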

Answer 96

The set of attributes on the left hand side of a functional dependency

Answer 97

The set of attributes on the right hand side of a functional dependency

Answer 98

A -> B is equivalent to A -> every element of B

Answer 99

If B ⊆ A then A -> B | Obvious as e.g. if height and weight known, then height known

Answer 100

Some functional dependencies S ⊨ (implies) A -> B if every relation instance that satisfies S also satisfies A -> B. A relation that fits the requirements for all of S also follows A -> B

Answer 101

S is equivalent to T if S ⊨ T and T ⊨ S

Answer 102

If the attribute(s) are the determinant for every attribute. The set of all attributes is always a superkey

Answer 103

Uniquely identifies each tuple, created for this purpose

Answer 104

A superkey that has no other superkey included in it. e.g. if {height, weight} and {height} were superkeys, only {height} would be a candidate key

Answer 105

Determines if a set of attributes is a superkey. Steps:
1. Find a functional dependency whose determinant is contained in the current attributes
2. If no such dependency adds any new attributes, the original attributes are not a superkey, so stop
3. Else, union the current attributes with the dependent set
4. Repeat 1-3 until the current attributes are the set of all attributes
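
These steps amount to computing the closure of the attribute set. A minimal sketch, assuming functional dependencies are given as (determinant, dependent) pairs of sets; the names `closure` and `is_superkey` are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute reachable from attrs using the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # An FD applies when its whole determinant is already reachable
            # and it would add at least one new attribute.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, all_attrs, fds):
    """A set of attributes is a superkey if its closure is every attribute."""
    return closure(attrs, fds) == set(all_attrs)


fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(is_superkey({"A"}, {"A", "B", "C"}, fds))  # True: A -> B -> C
print(is_superkey({"B"}, {"A", "B", "C"}, fds))  # False: A is never reached
```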

Answer 106

Poor relation if X -> A and X is not a superkey, as it can lead to redundant data

Answer 107

Issues that can happen in a bad database:
* Redundancy - same data in multiple places
* Updates - updates can cause data to be inconsistent
* Inserts - forced to fill in extra irrelevant attributes
* Deletion - extra data that wasn't intended to be deleted is deleted

Answer 108

Unnormalised, all data stored in one table

Answer 109

* Cannot have multiple values for one attribute
* Cannot have the same attribute in multiple columns

Answer 110

* Each FD has 1 attribute on the right hand side
* Minimal number of attributes on the left hand side
* No redundant FDs, i.e. none implied by other FDs

Answer 111

Where an attribute only depends on part of any candidate key

Answer 112

* 1st Normal Form
* All attributes not part of any candidate key are dependent on all parts of all candidate keys

Answer 113

Take the illegal data and place it in a new table using the key

Answer 114

If attributes that are not part of any candidate key depend on only part of a candidate key, split them into a new table whose key is the attributes they depend on

Answer 115

* 2nd Normal Form
* All non-key attributes are only determined by the keys, not anything else

Answer 116

If a non-key attribute depends on other non-key attribute(s), make the depended-on attribute(s) the key of a new table containing the attributes that depend on them

Answer 117

* 3rd Normal Form
* Every determinant is a candidate key

Answer 118

If a functional dependency exists with the determinant being non-key, a new table is made with the key being the non-key determinant

Answer 119

* Less redundancy, so less storage
* More efficient to query
* No duplication so no inconsistency

Answer 120

A collection of tuples. Visually represented as a table

Answer 121

A link between relations

Answer 122

Type of modelling that identifies entity names and relationships, sometimes attributes. Made directly from requirements. No database design.

Answer 123

Type of modelling that identifies attributes and attribute types (e.g. int)

Answer 124

Type of modelling aiming to represent database structure. Has actual tables and attributes. Implements relationships, i.e. keys, join tables, indexes

Answer 125

* Serverless
* No configuration

Answer 126

* Not multi-user
* No concurrency
* Just a file

Answer 127

CREATE TABLE table ( column TYPE (NOT) NULL, ..., PRIMARY KEY (column, ...) )
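
A small runnable sketch of this statement, assuming SQLite (accessed here through Python's `sqlite3` module) and an invented `students` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary in-memory database
conn.execute("""
    CREATE TABLE students (
        id INTEGER NOT NULL,
        name TEXT NOT NULL,
        nickname TEXT,
        PRIMARY KEY (id)
    )
""")
conn.execute("INSERT INTO students (id, name) VALUES (1, 'Ada')")
row = conn.execute("SELECT id, name, nickname FROM students").fetchone()
print(row)  # (1, 'Ada', None) - nickname was left as NULL

# A NOT NULL column rejects a missing value.
try:
    conn.execute("INSERT INTO students (id, name) VALUES (2, NULL)")
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True
print(null_rejected)  # True
```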

Answer 128

INTEGER
REAL
TEXT
BLOB - any uninterpretable data, e.g. image
NULL

Answer 129

ALTER TABLE oldTable RENAME TO newTable

Answer 130

DROP TABLE table

Answer 131

Done with SELECT column,... followed by FROM tables and then any number of optional further constraints.

Answer 132

ORDER BY column ASC/DESC

Answer 133

LIMIT x. Result only shows the first x results

Answer 134

WHERE condition. Result only shows the rows that match the given condition. At its most basic, the condition is column = value. Can use AND, OR and NOT, and comparisons such as > and !=.

Answer 135

Used in a WHERE condition to match a string pattern, with % in the string acting as a wildcard. e.g. WHERE id LIKE '%1%'

Answer 136

Used in a WHERE condition when checking if the column value is in a given list. e.g. WHERE id NOT IN (1,3,5)

Answer 137

Allows for selecting data from multiple tables. Done with: JOIN table2 ON table1.foreignkey=table2.primarykey. If the primary key is composite, the ON condition must match every column in the primary key

Answer 138

Returns all combined rows and the rows from the first table that do not match a row in the second table. e.g. first table people and second table banks, if a person has a bank not in the second table still display that person's row
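
The people and banks example can be sketched in SQLite through Python's `sqlite3` module. The column names and rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE banks (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, bank_id INTEGER);
    INSERT INTO banks VALUES (1, 'First Bank');
    INSERT INTO people VALUES (1, 'Ada', 1), (2, 'Bob', 99);
""")

inner = conn.execute("""
    SELECT people.name, banks.name
    FROM people JOIN banks ON people.bank_id = banks.id
    ORDER BY people.id
""").fetchall()

left = conn.execute("""
    SELECT people.name, banks.name
    FROM people LEFT JOIN banks ON people.bank_id = banks.id
    ORDER BY people.id
""").fetchall()

print(inner)  # [('Ada', 'First Bank')] - Bob's bank 99 has no match
print(left)   # [('Ada', 'First Bank'), ('Bob', None)] - Bob is kept with NULL
```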

Answer 139

must reference columns by their table. e.g. table1.column

Answer 140

SELECT table.column1 AS column1, table2.column2 AS column2. Allows for renaming of columns for a query

Answer 141

INSERT INTO table (column,...) VALUES (value,...) Adds an entry to the table. The list of columns can be omitted if the entry has values for all columns. Can use a SELECT query in place of VALUES

Answer 142

Acts as a virtual table. Is a query allowing for data to be seen from tables in a specific way but does not store any data itself

Answer 143

CREATE VIEW view AS SELECT ...
DROP VIEW view

Answer 144

Specifies new values for columns in the table.
UPDATE table SET column=value... WHERE conditions
The WHERE clause determines which rows the changes will be applied to, but is optional

Answer 145

DELETE FROM table WHERE conditions
The WHERE clause determines which rows will be deleted, but is optional

Answer 146

Allows for applying many different operations to a database. Replaces a column after SELECT, i.e. SELECT function(column) FROM table

Answer 147

Applies a function separately to each group of rows that share the same value in the given column. e.g. SELECT student, avg(mark) FROM scores GROUP BY student gives the average mark of each student rather than the total average mark
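
The scores example can be run in SQLite through Python's `sqlite3` module. The rows below are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (student TEXT, mark REAL);
    INSERT INTO scores VALUES ('Ada', 80), ('Ada', 90), ('Bob', 60);
""")

# With GROUP BY: one average per student.
per_student = conn.execute(
    "SELECT student, avg(mark) FROM scores GROUP BY student ORDER BY student"
).fetchall()

# Without GROUP BY: one average over every row.
overall = conn.execute("SELECT avg(mark) FROM scores").fetchone()[0]

print(per_student)  # [('Ada', 85.0), ('Bob', 60.0)]
print(overall)
```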

Answer 148

In CREATE TABLE: FOREIGN KEY (column) REFERENCES table(column)
This ensures that any value in the foreign key column of the current table is also in the other table

Answer 149

Add actions when doing CREATE TABLE, after a FOREIGN KEY:
ON DELETE action - happens when parent record deleted
ON UPDATE action - happens when parent key updated

Answer 150

CASCADE - delete/update happens to current table
RESTRICT - prevent delete/update
SET DEFAULT - change value to default
SET NULL - change value to NULL

Answer 151

Data structure associated with a table to improve query speed. Ordered by the value of a key accessed regularly (think of it being stored in a binary tree). Increases table modification time

Answer 152

CREATE (UNIQUE) INDEX index ON table(column,...)
UNIQUE is optional and ensures only 1 entry of each value will be in the index

Answer 153

* Inflexible - changing requirements are tough and e.g. lists are hard to do
* ACID costs - limits performance and scalability
* JOIN complexity - creates complex queries
* Structure issues - optional data makes bad tables

Answer 154

Add more capacity to the server, not great as has a limit and is expensive. Relational databases have to mainly use this

Answer 155

* Replication - duplicate data to be stored on many servers, expensive and has synchronisation issues
* Partitioning - store parts of the database on many servers, prevents joins across tables on different servers
* Both not good

Answer 156

Not Only SQL. Can use SQL languages on their databases. Less strict schemas.

Answer 157

Typically JSON/BSON. Schema free (mostly?). Structured as a tree with nodes being documents

Answer 158

Contains keys that are linked to values. Often contains a partition key (splits data into partitions) and a sort key (gives a single entry)

Answer 159

A row represents an entry but only has columns where there is data for it. Like RDBMS and Key-Value

Answer 160

Entries represented as nodes with edges connecting the data together

Answer 161

Distributing data across multiple nodes, allows for easy scaling

Answer 162

When one database is overloaded with traffic while others are underused

Answer 163

strict/strong - all reads must wait for writes to be consistent
sequential - all writes happen in order
causal - operations that can change the outcome of each other happen in order
eventual - data will eventually be stored correctly

Answer 164

consistency - all users see the same data
availability - all users can always get a response
partition tolerance - the database works if communication breaks between nodes

Answer 165

For databases with data split across multiple servers, only two of the three attributes of CAP can be met

Answer 166

* Fast and simple
* Flexible structure choices
* Easily scale horizontally infinitely

Answer 167

* Design has to be done right early
* Can only access data in the way it's designed
* Functions are very difficult to use
* Changes later on get very expensive

Answer 168

Document-based using JSON/BSON documents. The documents are key value pairs, but MongoDB is not a key-value type database. Can have schema

Answer 169

Can add sections to certain documents referencing another document

Answer 170

Needed to allow for searching by certain fields and/or sorting by them

Answer 171

AWS key-value based serverless NoSQL database. Everything stored in 1 table

Answer 172

* Only allows basic lookups
* The data has to be modelled specifically for this purpose, so difficult to leave

Answer 173

Replace the partition and/or sort key automatically with a secondary index, allowing entries to be searched by some of their data.

Answer 174

Document Type Definition - Defines the structure of an XML document. (kinda like a schema)

Answer 175

XML Schema Definition - Defines the structure of an XML document and is itself written in XML.

Answer 176

Syntax - How XML is written
Namespaces - Gives IDs to tags to make them unique
Schema - Defines how an XML document is structured
XPATH - Finds data in an XML document

Answer 177

* Ensures valid file
* Can be used as template
* Identifies errors
* Eases parsing

Answer 178

* More complex
* More tables/relationships
* Longer queries

Answer 179

Improves query speed by adding redundant data in places where it is not strictly needed, but increases operation complexity

Answer 180

When a foreign key is/is not part of the primary key of a child table

Answer 181

Maximum number of times an entity can be related to another entity

Answer 182

Minimum number of times an entity can be related to another

Answer 183

PRAGMA foreign_keys = ON; Ensures that values in a foreign key must be in the table that it is referencing
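
A rough sketch of the pragma in use, run in SQLite through Python's `sqlite3` module. The `banks` and `people` tables are invented, and ON DELETE CASCADE is included only to show one of the foreign key actions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.executescript("""
    CREATE TABLE banks (id INTEGER PRIMARY KEY);
    CREATE TABLE people (
        id INTEGER PRIMARY KEY,
        bank_id INTEGER,
        FOREIGN KEY (bank_id) REFERENCES banks(id) ON DELETE CASCADE
    );
    INSERT INTO banks VALUES (1);
    INSERT INTO people VALUES (1, 1);
""")

# A bank_id with no matching row in banks is rejected.
try:
    conn.execute("INSERT INTO people VALUES (2, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Deleting the parent row cascades to the child row.
conn.execute("DELETE FROM banks WHERE id = 1")
remaining = conn.execute("SELECT count(*) FROM people").fetchone()[0]

print(rejected)   # True
print(remaining)  # 0
```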

Answer 184

* Multi-user - Allows for multiple people to use a computer concurrently/at different times
* Multi-processing - Utilizing multiple processors
* Multi-tasking - Running multiple processes at the same time

Answer 185

Used in a file path to go to the home directory e.g. ~/downloads


(209 cards)