04_14 Big Data and NoSQL Flashcards
A data model that organizes data around a central entity based on the way the data will be used.
aggregate aware
A data model that does not organize data around a central entity based on the anticipated usage of the data.
aggregate ignorant
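To make the contrast between the two cards above concrete, here is a minimal Python sketch. The order data, the names `order_aggregate`, `orders`, and `order_lines`, and both helper functions are invented for illustration; the sketch only shows one plausible way the same facts could be laid out under each approach.

```python
# Aggregate aware: everything about one order is stored together under its key,
# because the application is assumed to read whole orders at a time.
order_aggregate = {
    "order-1": {
        "customer": "Ana",
        "lines": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
    }
}

# Aggregate ignorant: the same facts are split into separate structures
# with no central entity chosen in advance.
orders = [{"order_id": "order-1", "customer": "Ana"}]
order_lines = [
    {"order_id": "order-1", "sku": "A1", "qty": 2},
    {"order_id": "order-1", "sku": "B7", "qty": 1},
]

def total_qty_aggregate(order_id):
    """Read one aggregate and sum its line quantities."""
    return sum(line["qty"] for line in order_aggregate[order_id]["lines"])

def total_qty_joined(order_id):
    """Scan the separate line records and match them to the order id."""
    return sum(line["qty"] for line in order_lines if line["order_id"] == order_id)
```

Both functions return the same total; the difference is whether the related data arrives in a single lookup or has to be matched up at query time.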
A process or set of operations in a calculation.
algorithm
A data processing method that runs tasks from beginning to end without any user interaction.
batch processing
In the HDFS…
A report sent every 6 hours by the data node to the name node informing the name node which blocks are on that data node.
block report
A computer-readable format for data interchange that expands the JSON format to include additional data types including binary objects.
BSON (Binary JSON)
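As a small illustration of why the additional data types matter, the Python sketch below uses only the standard `json` module to show that plain JSON cannot carry raw bytes. The `record` dictionary is a made-up example, and no BSON library is used here.

```python
import json

# Hypothetical record containing a binary object (raw bytes).
record = {"name": "thumbnail", "data": b"\x89PNG"}

try:
    json.dumps(record)
    json_accepts_bytes = True
except TypeError:
    # Plain JSON has no binary type, so the standard encoder rejects bytes.
    json_accepts_bytes = False

print(json_accepts_bytes)
```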
In a key-value database…
A logical collection of related key-value pairs.
bucket
In document databases…
A logical storage unit that contains similar documents, roughly analogous to a table in a relational database.
collection
In a column family database…
A collection of columns or super columns related to a collection of rows.
column family
A NoSQL database model that organizes data into key-value pairs, in which the value component is composed of a set of columns that vary by row.
column family database
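The idea that the set of columns can vary by row can be sketched with nested Python dictionaries. This is an illustrative model only, not how any particular column family database stores data; `customers` and `columns_for` are hypothetical names.

```python
# Hypothetical column family: each row key maps to its own set of columns.
customers = {
    "cust-1": {"name": "Ana", "email": "ana@example.com"},
    "cust-2": {"name": "Raj", "phone": "555-0100", "city": "Austin"},
}

def columns_for(row_key):
    """Return the sorted column names stored for one row key."""
    return sorted(customers[row_key])
```

Here `cust-1` and `cust-2` share the `name` column but otherwise hold different columns, which a fixed relational table schema would not allow without nulls.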
A physical data storage technique in which data is stored in blocks, which hold data from a single column across many rows.
column-centric storage
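A rough Python sketch of the layout described above, using lists to stand in for storage blocks. The sample `rows` and the names `column_blocks` and `total_sales` are assumptions made for the example.

```python
# Hypothetical table of three rows.
rows = [
    {"id": 1, "city": "Austin", "sales": 120},
    {"id": 2, "city": "Boston", "sales": 80},
    {"id": 3, "city": "Austin", "sales": 200},
]

# Column-centric layout: one "block" (here, a list) per column, spanning all rows.
column_blocks = {name: [row[name] for row in rows] for name in rows[0]}

# An aggregate over one column reads only that column's block.
total_sales = sum(column_blocks["sales"])
```

Summing `sales` touches a single list rather than every field of every row, which is the usual motivation for this layout in analytic workloads.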
A declarative query language used in Neo4j for querying a graph database.
Cypher
A NoSQL database model that stores data in key-value pairs in which the value component is composed of a tag-encoded document.
document database
In a graph database…
The representation of a relationship between nodes.
edge
Analyzing stored data to produce actionable results.
feedback loop processing
A MongoDB method to retrieve documents from a collection.
find()
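The retrieval idea can be imitated in plain Python without a database. The helper `find_matching` below is a hypothetical stand-in, not the MongoDB method itself: it only handles exact matches on top-level fields, and the `books` list is invented sample data.

```python
def find_matching(collection, query=None):
    """Return documents whose top-level fields equal every key/value in query.

    Hypothetical stand-in for illustration only; real MongoDB queries support
    operators, nested fields, and cursors that this sketch ignores.
    """
    query = query or {}
    return [
        doc for doc in collection
        if all(key in doc and doc[key] == value for key, value in query.items())
    ]

# Hypothetical collection of documents.
books = [
    {"title": "Dune", "year": 1965},
    {"title": "Emma", "year": 1815},
]
```

Calling `find_matching(books)` with no query returns every document, while `find_matching(books, {"year": 1965})` returns only the matching one.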
A NoSQL database model based on graph theory that stores relationship-rich data as a collection of nodes and edges.
graph database
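A minimal Python sketch of data held as nodes and edges. The node ids, relationship names, and the `neighbors` helper are all assumptions for illustration; an actual graph database would use its own storage and query language.

```python
# Hypothetical graph: nodes keyed by id, edges as (source, relationship, target).
nodes = {
    "ana": {"label": "Person"},
    "raj": {"label": "Person"},
    "acme": {"label": "Company"},
}
edges = [
    ("ana", "FRIEND_OF", "raj"),
    ("ana", "WORKS_AT", "acme"),
    ("raj", "WORKS_AT", "acme"),
]

def neighbors(node_id, relationship):
    """Follow outgoing edges of one relationship type from a node."""
    return [dst for src, rel, dst in edges if src == node_id and rel == relationship]
```

Each tuple in `edges` is one relationship between two nodes, so a question such as "where does ana work?" becomes a walk along `WORKS_AT` edges.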
A highly distributed, fault-tolerant file storage system designed to manage large amounts of data at high speeds.
Hadoop Distributed File System (HDFS)
In the HDFS…
A signal sent every 3 seconds from the data node to the name node to notify the name node that the data node is still available.
heartbeat
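The heartbeat mechanism can be sketched as simple bookkeeping on the name node side. In the Python sketch below, the 3-second interval comes from the definition above, while `MISSED_LIMIT`, the node ids, and the function names are assumptions chosen for the example and are not HDFS settings.

```python
HEARTBEAT_INTERVAL = 3  # seconds, as stated in the definition above
MISSED_LIMIT = 10       # assumption: missed heartbeats before a node counts as unavailable

last_seen = {}  # data node id -> time of its most recent heartbeat, in seconds

def record_heartbeat(node_id, now):
    """Name node side: note that a data node reported in at time `now`."""
    last_seen[node_id] = now

def available_nodes(now):
    """Return data nodes heard from within the allowed window."""
    cutoff = HEARTBEAT_INTERVAL * MISSED_LIMIT
    return sorted(node for node, seen in last_seen.items() if now - seen <= cutoff)

# Simulated timeline: dn2 stops sending heartbeats after time 0.
record_heartbeat("dn1", now=0)
record_heartbeat("dn2", now=0)
record_heartbeat("dn1", now=27)
```

Times are passed in explicitly rather than read from a clock so the behavior is easy to follow: at time 30 both nodes are still within the window, and at time 40 only `dn1` is.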