DATA Flashcards
DATA ARCHITECTURE
Standards that govern which data is collected, and how it is stored, arranged, integrated, and put to use in data
systems and in organizations.
data scientist ( 🧐 to find insight and deals with…)
- Performs exploratory analysis to discover insights from the data. Deals with enormous masses of structured and unstructured data, using skills in math, statistics, programming, machine learning, etc.
data engineer 🏛
Develops, constructs, tests & maintains the complete architecture of large-scale processing systems.
data analyst
Takes data and uses it to help companies make better business decisions:
- Analyzes the data and translates it into plain language (the "English language").
- Upper management then uses this data to make business decisions.
DATA LAKE:
Is a storage repository that holds a vast amount of raw data in its native format until it is needed.
DATA PROCESSING: (us des)
Is the conversion of raw data into a usable and desired form.
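Example (a minimal sketch, not a standard method): the records, field names, and the `process` helper below are made up for illustration. It shows messy text being converted into a clean, usable form.

```python
# Hypothetical raw records, as they might arrive from a form or a log file.
raw_records = [" Alice , 34 ", "bob,  n/a", "CAROL,29"]

def process(record):
    """Convert one raw 'name, age' string into a clean dictionary."""
    name, age = [part.strip() for part in record.split(",")]
    return {
        "name": name.title(),                          # normalize capitalization
        "age": int(age) if age.isdigit() else None,    # unknown ages become None
    }

processed = [process(r) for r in raw_records]
print(processed)
```

The raw strings are hard to query; the processed dictionaries can be filtered, sorted, or loaded into a database directly.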
APACHE Hadoop:
Is an open-source framework used to efficiently store and process large datasets across clusters of computers.
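Example (a single-machine sketch, not Hadoop code): Hadoop's classic processing model is MapReduce. The input blocks and the function names `map_phase` and `reduce_phase` are assumptions chosen for illustration; a real cluster would run the map step on many machines at once.

```python
from collections import defaultdict

# Hypothetical input, already split into blocks the way a
# distributed file system might split one large file.
blocks = ["big data big", "data lake", "big lake"]

def map_phase(block):
    """Emit a (word, 1) pair for every word in one block."""
    return [(word, 1) for word in block.split()]

def reduce_phase(pairs):
    """Sum the counts for each word across all blocks."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

# Map every block, then reduce all emitted pairs into final counts.
mapped = [pair for block in blocks for pair in map_phase(block)]
word_counts = reduce_phase(mapped)
print(word_counts)
```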
APACHE Spark:
Is a data processing framework that can quickly perform processing tasks on very large data sets
and can also distribute data processing tasks across multiple computers.
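Example (a rough sketch of the idea only, not the Spark API): the data is split into partitions, each partition is processed by a separate worker, and the partial results are combined. Threads on one machine stand in for the multiple computers, and the helper names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_partitions(data, n):
    """Split a list into n roughly equal partitions."""
    return [data[i::n] for i in range(n)]

def partial_sum_of_squares(partition):
    """The work each worker performs on its own partition."""
    return sum(x * x for x in partition)

data = list(range(1, 101))
partitions = split_into_partitions(data, 4)

# Each partition is handled by a separate worker.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_results = list(pool.map(partial_sum_of_squares, partitions))

# Combine the partial results into the final answer.
total = sum(partial_results)
print(total)
```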
APACHE Hive:
Is open-source data warehouse software for reading, writing, and managing large data set files
that are stored directly in either the Apache Hadoop Distributed File System (HDFS) or other data storage systems
such as Apache HBase.
SQL:
Is a domain-specific language used in programming and designed for managing data held in a relational
database management system, or for stream processing in a relational data stream management system.
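Example (a minimal sketch using SQLite, the relational database bundled with Python): the `employees` table and its rows are invented for illustration. It shows SQL creating a table, inserting rows, and answering a question with a query.

```python
import sqlite3

# An in-memory relational database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ana", "Sales", 50000), ("Ben", "Sales", 60000), ("Cy", "IT", 70000)],
)

# Average salary per department, in alphabetical order.
rows = conn.execute(
    "SELECT department, AVG(salary) FROM employees "
    "GROUP BY department ORDER BY department"
).fetchall()
print(rows)
conn.close()
```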
NoSQL
NoSQL databases (aka "not only SQL") store data differently than relational tables do. They provide flexible schemas and scale easily with large amounts of data and high user loads.
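Example (a sketch of the flexible-schema idea using plain JSON documents, not any particular NoSQL product): the collection, its fields, and the `find` helper are hypothetical. Each document carries its own fields, so no table definition has to change when a new field appears.

```python
import json

# A hypothetical "users" collection; each document is stored as JSON text
# and is free to have different fields from its neighbors.
collection = [
    json.dumps({"_id": 1, "name": "Ana", "email": "ana@example.com"}),
    json.dumps({"_id": 2, "name": "Ben", "tags": ["admin", "beta"]}),
    json.dumps({"_id": 3, "name": "Cy", "address": {"city": "Oslo"}}),
]

def find(collection, field):
    """Return every document that contains the given top-level field."""
    docs = [json.loads(text) for text in collection]
    return [doc for doc in docs if field in doc]

with_tags = find(collection, "tags")
print(with_tags)
```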
DATA WAREHOUSE:
Data warehousing (DW) is the process of collecting and managing data from varied sources to
provide meaningful business insights.
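Example (a toy sketch, assuming two made-up sources: a store export in CSV and a web-shop export in JSON): each source is converted to one shared layout, collected in one place, and then queried for a simple business insight. All names and figures are invented.

```python
import csv
import io
import json

# Two hypothetical sources with different formats and field names.
store_csv = "product,amount\npen,3\nbook,12\n"
web_json = '[{"item": "pen", "total": 4}, {"item": "bag", "total": 30}]'

def from_csv(text):
    """Convert store rows to the shared layout."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"product": r["product"], "amount": float(r["amount"]), "source": "store"}
        for r in reader
    ]

def from_json(text):
    """Convert web-shop rows to the shared layout."""
    return [
        {"product": r["item"], "amount": float(r["total"]), "source": "web"}
        for r in json.loads(text)
    ]

# The "warehouse": all sources collected in one consistent structure.
warehouse = from_csv(store_csv) + from_json(web_json)

# A simple insight: total revenue per product across all sources.
revenue = {}
for row in warehouse:
    revenue[row["product"]] = revenue.get(row["product"], 0.0) + row["amount"]
print(revenue)
```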
SOCIAL DATA
IS THE INFORMATION ABOUT YOU, SUCH AS YOUR
MOVEMENTS, BEHAVIOR, AND INTERESTS, AS WELL
AS INFORMATION ABOUT YOUR RELATIONSHIPS
WITH OTHER PEOPLE, PLACES, PRODUCTS, EVEN
IDEOLOGIES.
BIG DATA is a phrase that refers to a
massive volume of both structured and
unstructured data, so large that it is difficult to process using
traditional database and software techniques.
BIG DATA has the potential to help
companies improve operations and make
faster, more intelligent decisions.