C2: Tools for Data Science Flashcards
What does DAM stand for?
Data Asset Management.
What is Data Management?
Collecting, persisting, and retrieving data securely, efficiently, and cost-effectively.
What is Data Integration and Transformation?
The Extract, Transform, Load (ETL) process.
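Example: a minimal sketch of the three ETL stages, assuming a small in-memory CSV string as the source and an in-memory SQLite database as the target. The data, the `scores` table, and the function names are hypothetical, chosen only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; a real pipeline would read from a file, database, or API.
RAW_CSV = "name,score\nada,90\ngrace,\nlinus,75\n"


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing score, fix casing, convert types."""
    return [(r["name"].title(), int(r["score"])) for r in rows if r["score"]]


def load(records, conn):
    """Load: write the cleaned records into a SQLite table."""
    conn.execute("CREATE TABLE IF NOT EXISTS scores (name TEXT, score INTEGER)")
    conn.executemany("INSERT INTO scores VALUES (?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
rows = conn.execute("SELECT name, score FROM scores ORDER BY name").fetchall()
print(rows)  # [('Ada', 90), ('Linus', 75)]
```

The row for "grace" is dropped in the transform step because its score is empty.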
What are the seven large toolsets?
- Data management tools. E.g., MySQL.
- Data integration tools. E.g., Apache Spark SQL / Airflow.
- Data visualization tools. E.g., Kibana / Tableau.
- Model deployment tools. E.g., Kubernetes.
- Model monitoring tools. E.g., Prometheus.
- Code asset management tools. E.g., GitHub.
- Data asset management tools. E.g., ODPi Egeria.
What is PMML?
Predictive Model Markup Language.
What is a Library in programming?
A collection of prewritten, reusable code.
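Example: a short sketch of reusing library code instead of writing it yourself, here with Python's standard `statistics` library. The sample values are made up.

```python
import statistics

values = [2, 4, 4, 5, 10]  # hypothetical sample data

# The library already implements these calculations, so we just call them.
mean_value = statistics.mean(values)
median_value = statistics.median(values)

print(mean_value, median_value)  # 5 4
```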
What does API stand for?
Application Programming Interface.
An API allows communication between two pieces of software.
What does REST API stand for?
REpresentational State Transfer | Application Programming Interface.
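Example: a rough, in-memory sketch of how a REST-style API maps HTTP methods onto resources. No real network or web framework is involved; the `handle` function, the `/items` path, and the status codes chosen for each case are assumptions made for illustration.

```python
# In-memory stand-in for a server-side resource collection (hypothetical data).
items = {1: {"name": "notebook"}}


def handle(method, path, body=None):
    """Return (status, payload) for a request, dispatching on method and path."""
    parts = path.strip("/").split("/")
    if parts[0] != "items":
        return 404, None
    if len(parts) == 1:  # the collection: /items
        if method == "GET":
            return 200, list(items.values())
        if method == "POST":
            new_id = max(items, default=0) + 1
            items[new_id] = body
            return 201, {"id": new_id}
        return 405, None
    if not parts[1].isdigit() or int(parts[1]) not in items:
        return 404, None
    item_id = int(parts[1])  # a single resource: /items/<id>
    if method == "GET":
        return 200, items[item_id]
    if method == "DELETE":
        del items[item_id]
        return 204, None
    return 405, None


created = handle("POST", "/items", {"name": "pen"})  # create a resource
fetched = handle("GET", "/items/2")                  # read it back
deleted = handle("DELETE", "/items/1")               # remove another one
missing = handle("GET", "/items/1")                  # it is gone now
print(created, fetched, deleted, missing)
```

Each resource has its own path, and the method (GET, POST, DELETE) says what to do with it.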
What is a Data Set?
A structured collection of data.
What is Supervised Learning in ML?
A human provides input data and correct outputs. The model tries to identify relationships and dependencies between the input data and the correct output.
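Example: a minimal sketch of supervised learning, using a least-squares line fit as one simple method among many. The inputs `xs` and the correct outputs `ys` are made-up values; the model learns the relationship between them and then predicts an output for an unseen input.

```python
# Labeled data: inputs and the correct outputs for each (hypothetical values).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # predict the output for an unseen input
print(slope, intercept, prediction)
```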
What is Unsupervised Learning in ML?
Unlabeled data is fed to the model. The model analyzes the data, trying to identify patterns and structure within the data based on its characteristics.
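Example: a minimal sketch of unsupervised learning, using one-dimensional k-means with two clusters as an illustrative method. The points carry no labels; the algorithm groups them by proximity alone. The data and the choice to start the centers at the minimum and maximum are assumptions for this sketch.

```python
# Unlabeled data: just values, no correct answers attached (hypothetical).
points = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]


def kmeans_1d(data, iterations=10):
    """Group 1-D points into two clusters by repeatedly updating the centers."""
    centers = [min(data), max(data)]  # simple deterministic starting centers
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for p in data:
            # Assign each point to the nearer center.
            idx = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            clusters[idx].append(p)
        # Move each center to the mean of its assigned points.
        centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
    return centers, clusters


centers, clusters = kmeans_1d(points)
print(centers)
print(clusters)
```

The low values end up in one cluster and the high values in the other, without anyone having said which is which.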
What is Reinforcement Learning?
Reward-based learning, loosely modeled on the way humans and other organisms learn: an agent takes actions and adjusts its behavior based on the rewards it receives.
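Example: a small sketch of reward-based learning using an epsilon-greedy strategy on two actions. The fixed payoffs in `REWARDS`, the step count, and the exploration rate are all hypothetical; real problems usually have noisy rewards and many more states and actions.

```python
import random

REWARDS = [0.2, 1.0]    # hypothetical fixed payoff for each action
rng = random.Random(0)  # seeded so that runs are repeatable


def run(steps=200, epsilon=0.1):
    """Learn a value estimate for each action from the rewards received."""
    estimates = [0.0] * len(REWARDS)
    counts = [0] * len(REWARDS)
    for step in range(steps):
        if step < len(REWARDS):
            action = step  # try every action once first
        elif rng.random() < epsilon:
            action = rng.randrange(len(REWARDS))  # explore: random action
        else:
            # Exploit: pick the action with the best estimate so far.
            action = max(range(len(REWARDS)), key=lambda a: estimates[a])
        reward = REWARDS[action]
        counts[action] += 1
        # Running average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


estimates, counts = run()
print(estimates, counts)
```

Nobody tells the agent which action is best; it works that out from the rewards alone.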
What is a CLI?
Command Line Interface.
What is R?
A statistical programming language.
Frameworks, Libraries, Packages, & Modules are all bundles of?
Reusable code.
Framework > Library > [ ? ] > Module
Package.
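Example: a quick sketch of the package versus module distinction in Python, using the standard library's `json`. In Python, a package is a module that can contain other modules, which shows up as a `__path__` attribute; a plain module does not have one.

```python
import json
import json.decoder

# json is a package: it contains submodules such as json.decoder.
json_is_package = hasattr(json, "__path__")

# json.decoder is a plain module inside that package.
decoder_is_package = hasattr(json.decoder, "__path__")

print(json_is_package, decoder_is_package)  # True False
```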