Computer Science Flashcards

You only need a little bit of computer science to get ahead. Learn about it here.

1
Q

What is…

Data

A

Signals that can be transmitted and can potentially be interpreted by a computer or a human. Data is commonly a stream of binary information that needs further processing to be understood.

2
Q

What is…

Information

A

Information may be synonymous with data. It could also be argued that information is data that can be interpreted meaningfully, that is, it is more than just a nonsense stream of bytes. It has tangible meaning to the computer or human being interpreting it.

3
Q

What is…

Information Theory

A

The study of the quantification, storage, and communication of information. The application of information theory is crucial to compression techniques used in various hardware and software applications such as the transmission of signals, or JPEG compression.

4
Q

What is a…

Bit

A

A single binary signal – on or off – represented as a one or a zero.

5
Q

What is a…

Byte

A

A group of 8 bits (eight ones or zeros) that can be read as a single integer from 0 to 255. That integer then needs further interpretation by a computer, or user, by looking up an appropriate encoding scheme.
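
As a rough illustration of the idea above, the sketch below reads eight bits as one integer and then interprets that integer with an encoding scheme. The bit pattern and the choice of ASCII are assumptions picked for the example.

```python
# Eight bits written out as text (an arbitrary example pattern).
bits = "01000001"

# Read the eight bits as a single base-2 integer.
value = int(bits, 2)

# Wrap the integer in a one-byte sequence and interpret it using ASCII.
as_byte = bytes([value])
text = as_byte.decode("ascii")

print(value, text)
```

The same byte would mean something different under another encoding scheme, which is why the lookup step matters.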

6
Q

What is a…

Bitstream

A

A contiguous sequence of bits, usually handled as a stream of bytes, that requires further interpretation by a user or computer.

7
Q

What is an…

Integer

A

An integer is a whole number with no fractional part. It can be zero, positive, or negative, e.g. -2, -1, 0, 1, 2, 64, 128.

8
Q

What is…

Compression (General)

A

The exploitation of redundancy in a digital object (information theory measures the content that cannot be squeezed out as entropy) to enable it to be re-encoded such that the resulting bitstream is smaller than the original, while the original file, or an approximation of it, can still be presented back to the user. A file that has been compressed must be decompressed to be rendered or used.

9
Q

What is…

Compression (Lossless)

A

A method of encoding data so that the resulting bitstream is smaller than the original, e.g. for transmission or storage, but when the data is uncompressed it is exactly the same as the original byte-for-byte.
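
A minimal sketch of a lossless round trip, assuming Python's standard `zlib` module as the compressor. The sample data is made up and deliberately repetitive so that it compresses well.

```python
import zlib

# Highly repetitive sample data (an assumption chosen so compression helps).
original = b"AAAAABBBBBCCCCC" * 100

# Compress, then decompress again.
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), len(compressed))
print(restored == original)
```

The decompressed bytes match the original byte-for-byte, which is what distinguishes lossless from lossy compression.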

10
Q

What is…

Compression (Lossy)

A
  • Lossy compression is a term usually applied to files that have been transformed into something smaller through the removal of information, but which can be replayed back to the user in a way that is approximately the same.
  • The MP3 algorithm ‘compresses’ audio streams by removing parts of the signal that, theoretically, human beings cannot hear (for example very high frequencies, or quiet sounds masked by louder ones), transforming the signal, and then re-encoding it.
  • The removal of these signals equates to a loss of information, and is therefore lossy.
  • Should a user then recompress a lossy file using a lossy method, even more information is lost; think photocopy of a photocopy.
  • N.B. it is a myth that simply opening a lossy file, e.g. a JPG, can make it lose even more information. The user must actively choose to resave the file, and even then, choose lossy options when doing so.
11
Q

What is…

Uncertainty

A

The state of being uncertain, for example, not knowing when a project is expected to be completed by.

12
Q

What is…

Management of Uncertainty

A

The use of data about an uncertain topic or event to simulate a range of potential outcomes. These can be used to manage projects, risks, and costs by giving stakeholders an evidence-based projection of what may happen.

13
Q

What is a…

Heuristic

A

A rule of thumb, or set of rules or principles, that can be used to derive a good-enough outcome without a guarantee of correctness. For example, if asked to determine which direction is east or west, one might look at the time of day and the position of the sun, and estimate accordingly.

Heuristics are often employed in programming where a formal algorithm does not exist but an outcome still needs to be derived, e.g. in reverse engineering a file format from a sample corpus.

14
Q

What is an…

Algorithm

A
  • A formal, or codified description of a set of rules for determining an outcome from one or more inputs.
  • Any set of rules combined to generate an outcome can be described as an algorithm, for example, the set of rules for baking a certain type of cake.
  • An algorithm could be created to sort a set of numbers as efficiently as possible.
  • Algorithms are utilized in many of today’s online services, for example, YouTube, to determine the type of content that may be most interesting to its viewers.
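
To make the sorting example concrete, here is a minimal sketch of one well-known sorting algorithm, insertion sort. It is chosen for readability rather than speed, and the function name is only illustrative.

```python
def insertion_sort(numbers):
    """Return a new sorted list, leaving the input untouched."""
    result = list(numbers)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger values one place right to make room for `current`.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


print(insertion_sort([5, 2, 9, 1]))
```

Each step is a fixed rule applied to the input, which is what makes the description an algorithm rather than a heuristic.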
15
Q

What are…

Dependencies

A

The components of our computer architectures that make it possible to achieve an outcome or result. For example, to run Microsoft Word, a dependency may be Microsoft Windows. Configured in Microsoft Windows may be a number of software libraries that enable it to interact with your computer’s hardware configuration. As we dissect a file, or piece of software, we begin to understand what other technology it depends on to be able to run.

16
Q

What is…

Unit Testing

A

The automated testing of source code by breaking it down into its smallest functional components – units. Testing is done by controlling inputs and testing the output and state of the program at various stages.
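
A minimal sketch of a unit test using Python's standard `unittest` module. The function under test, `celsius_to_fahrenheit`, is a hypothetical unit invented for this example.

```python
import unittest


def celsius_to_fahrenheit(celsius):
    """The 'unit' under test: a small function with one clear job."""
    return celsius * 9 / 5 + 32


class TestCelsiusToFahrenheit(unittest.TestCase):
    def test_freezing_point(self):
        # Controlled input, expected output.
        self.assertEqual(celsius_to_fahrenheit(0), 32)

    def test_boiling_point(self):
        self.assertEqual(celsius_to_fahrenheit(100), 212)


# Run the tests programmatically and keep the result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestCelsiusToFahrenheit)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

In a real project the tests would normally live in their own file and be run by a test runner, rather than being executed inline like this.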

17
Q

What is…

Semantic Versioning

A

A method of assigning version numbers to software that makes it clear to users what changes they can expect, and makes software developers more accountable for the breadth and depth of their changes in any one release. Given a version number MAJOR.MINOR.PATCH, increment the:

  • MAJOR version when you make incompatible API changes,
  • MINOR version when you add functionality in a backwards-compatible manner, and
  • PATCH version when you make backwards-compatible bug fixes.
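
The three rules above can be sketched as a small helper. The function name `bump` and its arguments are hypothetical, and the sketch assumes plain `MAJOR.MINOR.PATCH` strings with no pre-release or build suffixes.

```python
def bump(version, part):
    """Return the next version string after a change of the given kind."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        # Incompatible API change: lower parts reset to zero.
        return f"{major + 1}.0.0"
    if part == "minor":
        # Backwards-compatible new functionality.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Backwards-compatible bug fix.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part}")


print(bump("1.4.2", "patch"))
print(bump("1.4.2", "minor"))
print(bump("1.4.2", "major"))
```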
18
Q

What is an…

Executable

A

A synonym for program, a file that can be run, or executed by a user of a computer system.

19
Q

What is a…

Symbolic Link/Shortcut

A

A symbolic link (symlink) or shortcut, on Windows or Linux, is an entry that points to a file or executable stored at some other location on the hard disk than where the symlink is positioned. The most common use is to make it easier to open a file or run a given application from a particular location.

20
Q

What is a…

Programming Language

A

A set of instructions and rules that can be combined to perform a computational task or set of tasks. A programming language is just a flavour of instructions that all need to be boiled down to something that the processor can understand – usually machine language. Programming languages usually differ in terms of abstraction, meaning that low-level languages work much closer to the hardware (closer to machine code) than high-level languages.

21
Q

What is a…

Scripting Language

A

A high-level language that is translated and executed at run-time by a program called an interpreter, rather than compiled to machine code ahead of time. By storing a large number of more complex, yet common, functions and procedures in an interpreter, the user is free to call those functions using fewer commands in a ‘script’. A user can interact with data, for example, without having to worry about the underlying memory model of the computer. A scripting language cannot be run in the absence of an interpreter, so a dependency of running such code, for example Ruby or Python, is that the interpreter must be installed on the host machine.

22
Q

What is…

Data Encoding

A

A method of structuring information in a way that can be processed further by a user or computer, e.g. Extensible Markup Language (XML), JavaScript Object Notation (JSON), and Comma Separated Values (CSV). File formats such as Microsoft Excel, Microsoft Word, or JPEG are also data encodings, albeit quite a bit more complex.
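
As a small illustration, the sketch below writes the same record in two of the encodings named above, JSON and CSV, using Python's standard library. The record and its field names are made up for the example.

```python
import csv
import io
import json

# One made-up record to encode two different ways.
record = {"title": "Flashcards", "cards": 57}

# JSON keeps the field names next to each value.
as_json = json.dumps(record)

# CSV puts the field names in a header row and the values beneath it.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["title", "cards"], lineterminator="\n")
writer.writeheader()
writer.writerow(record)
as_csv = buffer.getvalue()

print(as_json)
print(as_csv)
```

The information is the same in both cases; only the structure used to carry it differs.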

23
Q

What is…

Character Encoding

A

The mapping of binary numbers to a lexical or numerical character. Numerous character encodings exist, including ASCII and EBCDIC. The widest range of characters can be represented using a standard called Unicode.

24
Q

What is…

ASCII

A

American Standard Code for Information Interchange is the mapping of computer control signals, Latin alphanumeric characters, and punctuation to the 128 numbers (0 to 127) that can be represented using 7 bits. A single byte can hold 256 values, so various 8-bit ‘extended’ encodings use the remaining 128 values for additional characters. ASCII is heavily biased toward western writing systems, English in particular, and as such Unicode was created to make it easier to work with other writing systems in a computing environment.

25
Q

What is…

Unicode

A

Unicode (maintained by the Unicode Consortium) is a computing industry standard for the consistent encoding, representation, and handling of text expressed in most of the world’s writing systems. At the time this card was written, the Unicode Standard contained a repertoire of more than 128,000 characters covering 135 modern and historic scripts; both figures grow with each release. Unicode can be implemented by different character encodings including UTF-8 and UTF-16.

26
Q

What is…

UTF-8

A

UTF-8 is an encoding for Unicode and uses one byte for any ASCII character, all of which have the same code values in both UTF-8 and ASCII encoding, and up to four bytes for other characters.
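
A short sketch of the byte counts described above, using Python's built-in `str.encode`. The four sample characters are arbitrary choices that happen to need one, two, three, and four bytes.

```python
# Sample characters: Latin A, e-acute, euro sign, and an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Number of bytes each character occupies once encoded as UTF-8.
utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]

# An ASCII character has the same single-byte value in both encodings.
same_as_ascii = "A".encode("utf-8") == "A".encode("ascii")

print(utf8_lengths)
print(same_as_ascii)
```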

27
Q

What is…

UTF-16

A

UTF-16 is an encoding for Unicode and uses one 16-bit unit for the characters that were representable in a prior character encoding called UCS-2 and two 16-bit units (4 × 8 bits) to handle each of the additional characters in the Unicode standard.
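
The one-unit and two-unit cases can be seen with Python's built-in codecs. This sketch assumes the big-endian variant (`utf-16-be`) so that no byte order mark is added to the output.

```python
# A character that fits in a single 16-bit unit.
single_unit = "A".encode("utf-16-be")

# A character outside the original 16-bit range, encoded as two
# 16-bit units (a surrogate pair).
two_units = "\U0001F600".encode("utf-16-be")

print(len(single_unit), single_unit.hex())
print(len(two_units), two_units.hex())
```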

28
Q

What is…

EBCDIC

A

EBCDIC (Extended Binary Coded Decimal Interchange Code) is a legacy character encoding introduced on IBM mainframes in the 1960s. EBCDIC could be used internationally through the use of code pages, each of which is, in effect, a distinct EBCDIC-like character encoding. One would need to look up the Japanese code page 930 (CCSID 930), for example, to understand how to decode an EBCDIC message encoded using this variant.

30
Q

What is…

Version Control

A

A mechanism for the storage of text-based digital files and all subsequent changes made to them, literally their versions. Version control systems such as Git, Subversion, and Mercurial are key to software development workflows. Version control enables users to create ‘branches’ on which to work, and create ‘releases’ to aid in the maintenance of software released to the public.

31
Q

What is…

Git

A

An example of a distributed version control system, well known in part because of GitHub, a cloud-based hosting service built around the tool.

32
Q

What is…

GitHub

A

A cloud-based hosting service for Git repositories, used to store and share source code, datasets, and other forms of publishing. The command line tool Git allows users to interact with it. Users can use Git and GitHub to create, clone, branch, and contribute to open source projects.

33
Q

What is…

Linux

A

A free and open source operating system (OS) first released in 1991 by Finnish computer scientist Linus Torvalds and based on Unix-like principles. Android smartphones are Linux based, as are a number of commodity devices such as digital video recorders. Linux is characterized by its ‘kernel’, which provides the core control of the underlying computer system. Distributions add features to the operating system and are as well known as the OS itself, e.g. Ubuntu, Raspbian, Debian, and Red Hat.

34
Q

What is…

Unix

A

A precursor to Linux. Unix is an early operating system, developed at Bell Labs from 1969 and through the 1970s, to provide higher-level control of a computing system, e.g. for programming, or for users to script and run various sets of commands.

35
Q

What is an…

Enterprise Solution

A

A buzzword (jargon) for a piece of software, or a system, that has the potential to satisfy the needs of all, or a group of users, across an organization. An enterprise content management system is named as such as it is expected to be interacted with by all of a company’s employees.

36
Q

What is…

Shell

A

A command line interpreter (a text-based control mechanism of an operating system), usually accessed through a terminal. Bash is an example of a shell available in Linux; in the Windows environment the equivalents are the Command Prompt, which descends from DOS, and PowerShell.

37
Q

What is a…

Shell Script

A

A method of chaining commands and variables (e.g. in Bash) into something called a ‘script’ to perform a set of operations that together meet a user’s processing needs.

38
Q

What is…

Linked Open Data

A

Data created and made available using the strengths of the web-technology stack. Four main principles are followed:

  • Use URIs (Uniform Resource Identifiers) to name (identify) things.
  • Use HTTP URIs so that these things can be looked up (interpreted, ‘dereferenced’ e.g. via web browser).
  • Provide useful information about what a name identifies when it’s looked up, using open standards such as RDF, SPARQL, etc.
  • Refer to other things using their HTTP URI-based names when publishing data on the Web.
39
Q

What is a…

Write Blocker

A

Hardware or software-based protection of a storage system such that content can be read but cannot be written to. Write blockers are central to digital forensics, where the material collected from storage devices such as hard drives has important evidentiary value and must not have been tampered with.

40
Q

What are…

Zero-byte Files

A

A zero-byte file is an entry in the filesystem that records that a file exists (with a name and other metadata) but which has no content: either the file has not yet been written to, or its content has been cleared.
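
A minimal sketch of creating and detecting a zero-byte file with Python's standard library. The file name `empty.txt` is a made-up example, and a temporary folder is used so that nothing is left behind.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "empty.txt")

    # Opening a file for writing and closing it straight away creates
    # the filesystem entry without writing any content.
    open(path, "wb").close()

    exists = os.path.exists(path)
    size = os.path.getsize(path)

print(exists, size)
```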

41
Q

What is…

Binary (Base 2)

A

A number system of two digits, zero and one, through which all numbers can be represented. In computer systems, binary digits (bits) are collected into groups of 8 called bytes. In computer electronics, binary can be created through signals that are either on or off.

42
Q

What is…

Hexadecimal (Base 16)

A

A number system of 16 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F. Hexadecimal can represent all numbers. Its primary application is the representation of binary data, with each byte written as two hexadecimal digits. Hexadecimal makes binary easier to read, for example, the number 255 in binary is 0b11111111, and in hexadecimal is 0xFF. A hexadecimal number is often prefixed with the number zero and the letter ‘x’ to indicate that the following characters are hexadecimal.
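
The 255 example above can be checked with Python's built-in conversion functions. The two-byte sample at the end is an arbitrary addition to show the two-digits-per-byte convention.

```python
number = 255

# Built-in conversions to binary and hexadecimal text.
as_binary = bin(number)
as_hex = hex(number)

# Parsing goes the other way; the 0b and 0x prefixes are accepted.
from_binary = int("0b11111111", 2)
from_hex = int("0xFF", 16)

# Each byte is written as exactly two hexadecimal digits.
two_bytes = bytes([72, 105]).hex()

print(as_binary, as_hex, from_binary, from_hex, two_bytes)
```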

43
Q

What is…

Human Readable

A

A property of a data structure whose core components are easily understandable without calculation or computation. For example, XML is commonly understood, despite complexity in its structure, to be human readable, because readers can look at its elements, attributes, and data values, and understand what the data is and how it is represented.

44
Q

Who is…

Claude Shannon

A

Mathematician responsible for the creation of the field of study known as Information Theory: the study of the quantification, storage, and communication of information.

45
Q

Who is…

Ada Lovelace

A

Mathematician known for what is widely seen as the first computer program, written for Babbage’s Analytical Engine. Ada immediately recognised the potential of computers and their applications beyond calculation.

46
Q

Who is…

Alan Turing

A

Mathematician and computer scientist. Formalised the fundamentals of computer science, including a theoretical model of computation now called the Turing machine, and proposed a test (the Turing test) by which computers can be assessed as being artificially ‘intelligent’.

47
Q

What is…

Recursion

A

A method of repeating a process where the result of the process gets fed back into itself as an input. In programming, a recursive function calls itself on a smaller version of the problem until a certain exit condition (the base case) is met. Most programming languages support recursion, and languages that don’t have loop constructs rely on it to process data over and over.
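
A minimal sketch of recursion, using the factorial function as a stock example that is not taken from the card itself. The function calls itself on a smaller input until the exit condition is met.

```python
def factorial(n):
    """Return n! for a non-negative integer n."""
    if n <= 1:
        # Exit condition: stop calling ourselves.
        return 1
    # The result of the smaller problem feeds into the larger one.
    return n * factorial(n - 1)


print(factorial(5))
```

Without the exit condition the function would keep calling itself until the interpreter's recursion limit is reached.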

48
Q

What is…

Iterative Development

A

The development of a project, product, or software in a series of steps, with each step developing a minimum viable product, and each step providing a functional or behavioural improvement on the last.

49
Q

What is a…

Schema

A

A description of a data model, together with the restrictions and rules against which data can be validated, for a given data encoding, e.g. an XML document, JSON, or a database.

50
Q

What is an…

Ngram

A

A contiguous sequence of N ‘terms’ (syllables, words, names, etc.) taken from a corpus, or corpora, of information. Counting how often each sequence occurs is a common technique in research.
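
A minimal sketch of extracting N-term sequences from a piece of text. The helper name `ngrams` and the sample sentence are assumptions for illustration, and the terms here are whitespace-separated words.

```python
from collections import Counter


def ngrams(terms, n):
    """Return every run of n neighbouring terms, in order."""
    return [tuple(terms[i:i + n]) for i in range(len(terms) - n + 1)]


words = "the quick brown fox jumps over the quick dog".split()

bigrams = ngrams(words, 2)

# Counting occurrences is the usual next step in corpus research.
counts = Counter(bigrams)

print(bigrams[:3])
print(counts[("the", "quick")])
```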

51
Q

What is an…

API

A

Application Programming Interface. A description of a software library or web service and how users and software agents are expected to interact with it and retrieve, or contribute data.

52
Q

What is a…

RESTful API

A

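A web service interface that follows the REST (Representational State Transfer) style, most commonly used to retrieve data from a web service. The request is in the form of a suitable HTTP request, e.g. a GET on a URL that identifies a resource. An HTTP response is sent back that contains the data requested by the user agent.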

53
Q

What is…

State

A

A previous, current, or future representation of a computer program and its variables in memory.

54
Q

What is…

Atomicity (Databases)

A

The property that a group of operations in a database is committed (saved) together as a single transaction, or not at all. If one of the operations fails, none of the changes are written and the database is reverted to its state before the transaction began.
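
A minimal sketch of an all-or-nothing transaction, assuming Python's standard `sqlite3` module and an in-memory database. The `accounts` table, its names, and the balances are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

try:
    # Used as a context manager, the connection commits if the block
    # succeeds and rolls back if it raises.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance + 150 WHERE name = 'bob'")
        # This second step violates the CHECK rule, so it fails.
        conn.execute("UPDATE accounts SET balance = balance - 150 WHERE name = 'alice'")
except sqlite3.IntegrityError:
    pass

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)
```

Because the second update fails, the first is undone as well and both balances keep their starting values.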

55
Q

What is…

XMP

A

Extensible Metadata Platform (XMP) was created originally by Adobe, but is now an ISO standard. The standard can encode any set of metadata properties. A common use for the encoding is to record the activities that have been performed on a file, for example, on an image, one could record post-digitization efforts to crop (remove excess parts of the image) and de-skew (straighten the image). The XMP can then be looked upon as an audit trail for the file.

56
Q

What is…

SQL

A

Structured Query Language (SQL) is a standard mechanism for querying (getting results from) a relational database. Relational databases are made up of many tables, and SQL needs to be able to look across all of these and filter data at the same time to answer a user’s queries. Relational databases use a schema which is strict and fixed. An alternative is a graph database, where the structure is easily extended; it is extensible.
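
A minimal sketch of one query that looks across two tables and filters at the same time, assuming Python's standard `sqlite3` module. The `authors` and `books` tables and their contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES authors(id)
    );
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Alan');
    INSERT INTO books VALUES
        (1, 'Notes', 1), (2, 'Sketches', 1), (3, 'Machines', 2);
    """
)

# JOIN links the two tables; WHERE filters the combined rows.
rows = conn.execute(
    "SELECT books.title, authors.name "
    "FROM books JOIN authors ON books.author_id = authors.id "
    "WHERE authors.name = ? "
    "ORDER BY books.title",
    ("Ada",),
).fetchall()

print(rows)
```

The fixed column lists in the `CREATE TABLE` statements are the strict schema the card refers to.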

57
Q

What are…

Graph Databases

A

A NoSQL database option, that is, a database that doesn’t rely on SQL (Structured Query Language). Graphs are connected networks of information. Vertices are connected by edges. Vertices are often called resources (an identifier for a person, place, record) whose edges then describe them: edges are the properties belonging to a resource. Edges can be resources in their own right, as an edge may have its own meaning and semantic rules.

A simple graph may be:

Subject (Resource) -> Predicate (property name) -> Value (property value)

The USA -> hasStates -> 50

A graph database that stores RDF data is queried through a language called SPARQL (SPARQL Protocol and RDF Query Language). Other graph databases use their own query languages.

Graph databases are extensible meaning it is easy to add and connect more properties and resources.
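
The subject, predicate, value pattern can be sketched with plain Python tuples. This is only an in-memory illustration and not a real graph database; the resources, property names, and the `match` helper are all made up for the example.

```python
# Each triple is (subject, predicate, value).
graph = [
    ("Ada Lovelace", "hasOccupation", "Mathematician"),
    ("Ada Lovelace", "workedWith", "Charles Babbage"),
    ("Charles Babbage", "designed", "Analytical Engine"),
]


def match(triples, subject=None, predicate=None, value=None):
    """Return the triples that agree with every part that was given."""
    return [
        (s, p, v)
        for (s, p, v) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (value is None or v == value)
    ]


# Extending the graph is just a matter of adding another triple.
graph.append(("Analytical Engine", "isA", "Mechanical Computer"))

about_ada = match(graph, subject="Ada Lovelace")
print(about_ada)
```

No schema had to change to add the new property, which is the extensibility described above.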