C2: Tools for Data Science Flashcards
What does DAM stand for?
Data Asset Management.
What is Data Management?
Collecting, persisting, and retrieving data securely, efficiently, and cost-effectively.
What is Data Integration and Transformation?
The ETL process: Extract, Transform, Load.
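Example: a minimal sketch of the three ETL steps, using only the Python standard library. The sample CSV, the table name temps, and the column names are made up for illustration and do not come from any real pipeline.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; a real pipeline would read a file or query a source system.
RAW_CSV = """name,temp_f
Austin,98.6
Oslo,50.0
"""

def extract(text):
    """Extract: parse the raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: convert Fahrenheit to Celsius, rounded to one decimal."""
    return [(r["name"], round((float(r["temp_f"]) - 32) * 5 / 9, 1)) for r in rows]

def load(records, conn):
    """Load: write the cleaned records into a SQLite table."""
    conn.execute("CREATE TABLE IF NOT EXISTS temps (name TEXT, temp_c REAL)")
    conn.executemany("INSERT INTO temps VALUES (?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory database, discarded on exit
load(transform(extract(RAW_CSV)), conn)
rows = conn.execute("SELECT name, temp_c FROM temps ORDER BY name").fetchall()
print(rows)
```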
What are the seven large toolsets?
- Data Management tools. E.g., MySQL.
- Data Integration tools. E.g., Apache SparkSQL / Airflow.
- Data Visualization tools. E.g., Kibana / Tableau.
- Model Deployment tools. E.g., Kubernetes.
- Model Monitoring tools. E.g., Prometheus.
- Code Asset Management tools. E.g., GitHub.
- Data Asset Management tools. E.g., ODPi Egeria.
What is PMML?
Predictive Model Markup Language.
What is a Library in programming?
A collection of prewritten, reusable code.
What does API stand for?
Application Programming Interface.
An API allows communication between two pieces of software.
What does REST API stand for?
REpresentational State Transfer | Application Programming Interface.
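Example: a rough sketch of what talking to a REST API looks like from the client side. The base URL https://api.example.com and the /datasets paths are hypothetical, and the requests are only built, not sent, so nothing here touches the network.

```python
import json
from urllib.request import Request

BASE_URL = "https://api.example.com"  # hypothetical service, for illustration only

def build_request(method, path, payload=None):
    """Build (but do not send) an HTTP request for a REST resource."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return Request(BASE_URL + path, data=data, headers=headers, method=method)

# GET reads the current state of a resource; POST creates a new one.
read_req = build_request("GET", "/datasets/42")
create_req = build_request("POST", "/datasets", {"name": "iris"})

print(read_req.get_method(), read_req.full_url)
print(create_req.get_method(), create_req.full_url)
```

Passing either object to urllib.request.urlopen would send it. That step is left out because the endpoint does not exist.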
What is a Data Set?
A structured collection of data.
What is Supervised Learning in ML?
A human provides input data and correct outputs. The model tries to identify relationships and dependencies between the input data and the correct output.
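Example: one very small instance of supervised learning is fitting a straight line to labeled points with least squares. The data below is invented for illustration, and real projects would normally use a library rather than this hand-written fit.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept to labeled examples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Input data and the correct outputs (labels) supplied by a human.
inputs = [1.0, 2.0, 3.0, 4.0]
labels = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(inputs, labels)

# The learned relationship is then used to predict an unseen input.
prediction = slope * 5.0 + intercept
print(slope, intercept, prediction)
```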
What is Unsupervised Learning in ML?
Unlabeled data is fed to the model. The model analyzes the data, trying to identify patterns and structure within the data based on its characteristics.
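Example: clustering is one common form of unsupervised learning. The sketch below is a simplified one-dimensional k-means with two clusters. The data and the choice to start the centers at the minimum and maximum values are assumptions made to keep it short.

```python
def two_means(values, iterations=10):
    """Split unlabeled 1-D values into two clusters (simplified k-means)."""
    centers = [min(values), max(values)]  # simple deterministic starting point
    for _ in range(iterations):
        groups = [[], []]
        # Assignment step: each value joins the nearest center.
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Update step: each center moves to the mean of its group.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# No labels are given; the structure is found from the values alone.
data = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
centers, groups = two_means(data)
print(centers, groups)
```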
What is Reinforcement Learning?
Reward-based learning, loosely modeled on the way human beings and other organisms learn.
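Example: a minimal sketch of reward-based learning using an epsilon-greedy multi-armed bandit, which is one simple reinforcement learning setup among many. The reward values, step count, and epsilon are arbitrary choices for illustration.

```python
import random

def train_bandit(rewards, steps=200, epsilon=0.1, seed=0):
    """Learn the value of each action from the rewards it returns."""
    rng = random.Random(seed)
    n_arms = len(rewards)
    counts = [0] * n_arms    # how often each action was tried
    values = [0.0] * n_arms  # running average reward per action
    for step in range(steps):
        if step < n_arms:
            arm = step  # try every action once first
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random action
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit the best so far
        reward = rewards[arm]  # fixed rewards keep the sketch easy to follow
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
    return values, counts

values, counts = train_bandit([0.2, 1.0, 0.5])
best_action = max(range(len(values)), key=lambda a: values[a])
print(values, counts, best_action)
```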
What is a CLI?
Command Line Interface.
What is R?
A statistical programming language.
Frameworks, Libraries, Packages, & Modules are all bundles of?
Reusable code.
Framework > Library > [ ? ] > Module
Package.
Framework > [ ? ] > Package > Module
Library.
Framework > Library > Package > [ ? ]
Module.
[ ? ] > Library > Package > Module
Framework.
In what way is a Framework different from Libraries, Packages, and Modules?
It calls your code, not the other way around.
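Example: the difference can be sketched in a few lines. With a library (here the standard statistics module) your code makes the call. With a framework you register a function and the framework decides when to call it. TinyFramework is a hypothetical stand-in written for this card, not a real package.

```python
import statistics

# Library style: your code is in charge and calls the library.
average = statistics.mean([2, 4, 6])

class TinyFramework:
    """Hypothetical mini framework: it owns the loop and calls your handlers."""

    def __init__(self):
        self._handlers = {}

    def on(self, event):
        """Decorator that registers a handler function for an event name."""
        def register(func):
            self._handlers[event] = func
            return func
        return register

    def run(self, events):
        """The framework drives execution and calls back into user code."""
        return [self._handlers[e](e) for e in events if e in self._handlers]

app = TinyFramework()

@app.on("start")
def handle_start(event):
    # Your code: written by you, invoked by the framework.
    return f"handled {event}"

results = app.run(["start", "unknown", "start"])
print(average, results)
```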
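What are Git, GitHub, GitLab, Bitbucket, and Beanstalk?
Git is a version control system. GitHub, GitLab, Bitbucket, and Beanstalk are hosted platforms for version-controlled repositories.
What is the SSH protocol?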
Protocol for secure remote login from one computer to another.
What is a Repository?
Where the project folders live. A data structure for storing documents including application source code.
What is a Fork?
A copy of a Repository.
What is a “pull request”?
A request for review and approval of your changes before they are merged.
What is the Working Directory? [Version Control Domain]
The files and subdirectories on your computer that are associated with a Git repository.
What does “git init” do in Git?
Create (initialize) a new local repository.
What does “git add” do in Git?
Stage new or changed files so they are included in the next commit.
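What does “git status” do in Git?
Show which files are changed, staged, or untracked.
What does “git commit” do in Git?
Record the staged changes as a new commit in the local repository.
What does “git revert” do in Git?
Undo an earlier commit by creating a new commit that reverses its changes.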
What does “git log” do in Git?
Review recent commits.
What does “git branch” do in Git?
List the branches and show which one is active.
What does “git checkout” do in Git?
Switch to a branch.
What does “git merge” do in Git?
Merge changes from another branch into your active branch.
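Example: a rough sketch that runs the commands from the cards above in a throwaway repository, driven from Python via subprocess. It assumes git is installed and on the PATH. The file name notes.txt, the branch name feature, and the demo identity are invented for illustration. Note that "git merge feature" brings the feature branch into whichever branch is currently checked out.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

def git(repo, *args):
    """Run one git command inside repo and return its trimmed output."""
    result = subprocess.run(
        ["git", "-c", "user.name=Demo", "-c", "user.email=demo@example.com",
         "-c", "commit.gpgsign=false", *args],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()

def demo_workflow():
    """Walk through init, add, commit, checkout, merge, and log."""
    if shutil.which("git") is None:
        return None  # git is not available on this machine
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init")                                   # new local repository
        main = git(repo, "symbolic-ref", "--short", "HEAD")  # default branch name
        (repo / "notes.txt").write_text("first line\n")
        git(repo, "add", "notes.txt")                       # stage the file
        git(repo, "commit", "-m", "Add notes")              # record the staged change
        git(repo, "checkout", "-b", "feature")              # create and switch branch
        (repo / "notes.txt").write_text("first line\nsecond line\n")
        git(repo, "add", "notes.txt")
        git(repo, "commit", "-m", "Extend notes")
        git(repo, "checkout", main)                         # back to the default branch
        git(repo, "merge", "feature")                       # bring feature into it
        return git(repo, "log", "--oneline").splitlines()   # review the commits

log_lines = demo_workflow()
print(log_lines)
```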
Slang for Git repositories?
Repos.
What is GitLab?
A complete DevOps platform.
What is CI? [DevOps Domain]
Continuous Integration.
What is CD? [DevOps Domain]
Continuous Delivery (sometimes Continuous Deployment).
What does VCS stand for?
Version Control System.