(2) Data Modeling Flashcards
Entity occurrence
A row in a relational table. Also known as entity instance.
Versioning
A property of an OODBMS that allows the database to keep track of the different transformations performed on an object.
Attribute
A characteristic of an entity or an object. It has a name and a data type.
Internal schema
A representation of an internal model using the database constructs supported by the chosen database.
Relational database management system (RDBMS)
A collection of programs that manages a relational database. This software translates a user’s logical requests (queries) into commands that physically locate and retrieve the requested data.
Name node
One of the three types of nodes in HDFS. This node stores all the metadata about the file system.
Class diagram notation
The set of symbols used in the creation of class diagrams in UML object modeling.
Entity relationship diagram (ERD)
A diagram that depicts an entity relationship model’s entities, attributes, and relations.
One-to-one (1:1 or 1..1) relationship
An association among two or more entities that is used by data models. In this relationship, one entity instance is associated with only one instance of the related entity.
Unified Modeling Language (UML)
A language based on object-oriented concepts that provides tools such as diagrams and symbols to graphically model a system.
Connectivity
The type of relationship between entities. Classifications include 1:1, 1:M, and M:N.
Client node
One of three types of nodes used in the Hadoop Distributed File System (HDFS). This node acts as the interface between the user application and the HDFS.
Extensible Markup Language (XML)
A meta-language used to represent and manipulate data elements. Unlike other markup languages, XML permits the manipulation of a document’s data elements. XML facilitates the exchange of structured documents such as orders and invoices over the Internet.
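As a rough illustration of manipulating a document's data elements, the sketch below parses a small invoice with Python's standard xml.etree.ElementTree module. The invoice layout (element and attribute names) and the invoice_total helper are hypothetical, made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical invoice document; element and attribute names are illustrative only.
INVOICE_XML = """
<invoice number="1001">
  <customer>Acme Corp</customer>
  <line sku="A-1" qty="2" price="9.50"/>
  <line sku="B-7" qty="1" price="20.00"/>
</invoice>
""".strip()


def invoice_total(xml_text: str) -> float:
    """Sum qty * price over every <line> element of the invoice."""
    root = ET.fromstring(xml_text)
    return sum(
        int(line.get("qty")) * float(line.get("price"))
        for line in root.findall("line")
    )


invoice_root = ET.fromstring(INVOICE_XML)
invoice_number = invoice_root.get("number")        # attribute of the root element
customer_name = invoice_root.findtext("customer")  # text of a child element
invoice_sum = invoice_total(INVOICE_XML)
print(invoice_number, customer_name, invoice_sum)
```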
MapReduce
An open-source application programming interface (API) that provides fast data analytics services; one of the main Big Data technologies that allows organizations to process massive data stores.
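The programming pattern behind the name can be sketched in plain Python: a map step emits key/value pairs, the pairs are grouped by key, and a reduce step collapses each group. This is a single-machine illustration only, not the Hadoop API; the function names map_phase, shuffle, and reduce_phase are assumptions chosen for the example.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document: str):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each key's list of values into a single count."""
    return {
        key: reduce(lambda a, b: a + b, values)
        for key, values in groups.items()
    }


documents = ["big data big insight", "data at scale"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
word_counts = reduce_phase(shuffle(pairs))
print(word_counts)
```

In a real cluster the map and reduce steps would run in parallel on many nodes; here they run sequentially to keep the idea visible.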
Big data
A movement to find new and better ways to manage large amounts of web-generated data and derive business insight from it, while simultaneously providing high performance and scalability at a reasonable cost.
Object/relational database management system (O/R DBMS)
A DBMS based on the extended relational data model (ERDM). The ERDM, championed by many relational database researchers, constitutes the relational model’s response to the OODM. This model includes many of the object-oriented model’s best features within an inherently simpler relational database structure.
Relational model
Developed by E. F. Codd of IBM in 1970, the relational model is based on mathematical set theory and represents data as independent relations. Each relation (table) is conceptually represented as a two-dimensional structure of intersecting rows and columns. The relations are related to each other through the sharing of common entity characteristics (values in columns).
Subschema
The portion of the database that interacts with application programs.
Class
A collection of similar objects with shared structure (attributes) and behavior (methods). It encapsulates an object’s data representation and a method’s implementation. Classes are organized in a class hierarchy.
Conceptual model
The output of the conceptual design process. This model provides a global view of an entire database and describes the main data objects, avoiding details.
Constraint
A restriction on data, usually expressed in the form of rules. For example, “A student’s GPA must be between 0.00 and 4.00.” Constraints are important because they help to ensure data integrity.
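One way the GPA rule might be enforced is with a CHECK constraint, sketched here using Python's built-in sqlite3 module and an in-memory database. The table and column names (student, stu_id, stu_gpa) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The CHECK clause encodes the rule "GPA must be between 0.00 and 4.00".
conn.execute("""
    CREATE TABLE student (
        stu_id  INTEGER PRIMARY KEY,
        stu_gpa REAL NOT NULL CHECK (stu_gpa BETWEEN 0.00 AND 4.00)
    )
""")

# A valid row is accepted.
conn.execute("INSERT INTO student (stu_id, stu_gpa) VALUES (1, 3.25)")

# A row that violates the constraint is rejected by the database itself.
try:
    conn.execute("INSERT INTO student (stu_id, stu_gpa) VALUES (2, 4.70)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(rejected, row_count)
```

Declaring the rule in the schema means every application that writes to the table is held to it, which is how the constraint helps protect data integrity.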
Method
In the object-oriented data model, a named set of instructions to perform an action. Methods represent real-world actions, and are invoked through messages.
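A minimal Python sketch of the class, object, and method ideas: the Student class, its attributes, and the enroll method are hypothetical names chosen for the example, and calling a method on an object stands in for sending it a message.

```python
class Student:
    """A class: shared structure (attributes) and behavior (methods)."""

    def __init__(self, name: str, credits: int = 0):
        self.name = name        # attribute
        self.credits = credits  # attribute

    def enroll(self, course_credits: int) -> int:
        """A method: a named set of instructions that performs an action."""
        self.credits += course_credits
        return self.credits


# Two objects (instances) of the same class, each with its own attribute values.
alice = Student("Alice")
bob = Student("Bob", credits=12)

# Invoking the method is the equivalent of sending the "enroll" message to alice.
alice_credits = alice.enroll(3)
print(alice.name, alice_credits, bob.name, bob.credits)
```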
Hadoop
A Java-based, open-source, high-speed, fault-tolerant distributed storage and computational framework. It uses low-cost hardware to create clusters of thousands of computer nodes to store and process data.
Application programming interface (API)
Software through which programmers interact with middleware. This allows the use of generic SQL code, thereby allowing client processes to be database server-independent.
3 Vs
Three basic characteristics of Big Data databases: volume, velocity, and variety.
Many-to-many (M:N or *..*) relationship
Association among two or more entities in which one occurrence of an entity is associated with many occurrences of a related entity and one occurrence of the related entity is associated with many occurrences of the first entity.
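In a relational database, an M:N relationship is commonly implemented with a third (bridge) table that holds one row per pairing. The sketch below uses Python's sqlite3 module; the student, course, and enroll tables and their columns are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (stu_id    INTEGER PRIMARY KEY, stu_name TEXT);
    CREATE TABLE course  (course_id INTEGER PRIMARY KEY, title    TEXT);

    -- Bridge table: each row pairs one student with one course.
    CREATE TABLE enroll (
        stu_id    INTEGER REFERENCES student(stu_id),
        course_id INTEGER REFERENCES course(course_id),
        PRIMARY KEY (stu_id, course_id)
    );

    INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO course  VALUES (10, 'Databases'), (20, 'Statistics');
    INSERT INTO enroll  VALUES (1, 10), (1, 20), (2, 10);
""")

# One student is associated with many courses ...
courses_for_ana = [row[0] for row in conn.execute(
    "SELECT c.title FROM course c "
    "JOIN enroll e ON e.course_id = c.course_id "
    "WHERE e.stu_id = 1 ORDER BY c.title"
)]

# ... and one course is associated with many students.
students_in_databases = [row[0] for row in conn.execute(
    "SELECT s.stu_name FROM student s "
    "JOIN enroll e ON e.stu_id = s.stu_id "
    "WHERE e.course_id = 10 ORDER BY s.stu_name"
)]

print(courses_for_ana, students_in_databases)
```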