Tools for Data Science Flashcards
IBM Data Science Professional Certificate (Course 2/10)
Do I Visit Beautiful Destinations Most Autumns?
What data science categories does raw data need to pass through before it is deemed useful?
- Data management
- Data integration and transformation
- Data visualisation
- Model building
- Model deployment
- Model monitoring and assessment
Do Cats Dance Exquisitely Everywhere?
What categories of tools support the tasks performed in the data science categories?
- Data Asset Management
- Code Asset Management
- Development Environments
- Execution Environments
What is data management?
The process of collecting, persisting, and retrieving data securely, efficiently, and cost-effectively.
Where is data collected from?
Many sources, including (but not limited to) Twitter, Flipkart, sensors, and the Internet.
Where should you store data so it is available whenever you need it?
Store the data in persistent storage
What is data integration and transformation?
The process of extracting, transforming, and loading data (ETL)
Give examples of repositories where data is commonly distributed
- Databases
- Data cubes
- Flat files
During extraction, where should you save the extracted data?
It is common practice to save extracted data in a central repository like a data warehouse.
What is a data warehouse primarily used for?
It is primarily used to collect and store massive amounts of data for data analysis.
What is data transformation?
The process of transforming the values, structure, and format of data.
What do you do after extracting data?
Transform the data
What happens to data after it has been transformed?
It is loaded back into the data warehouse.
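The extract, transform, load sequence described in the cards above can be sketched in a few lines. This is only an illustration under stated assumptions: the CSV text, the `temps` table, and the function names are hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file source; a real pipeline might read a file, a database, or an API.
RAW_CSV = """city,temp_f
Paris,68
Cairo,95
Oslo,50
"""

def extract(text):
    """Extract: read rows from a flat file (CSV text here)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: change values and format (Fahrenheit strings to Celsius floats)."""
    return [(r["city"], round((float(r["temp_f"]) - 32) * 5 / 9, 1)) for r in rows]

def load(records, conn):
    """Load: write the transformed records into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS temps (city TEXT, temp_c REAL)")
    conn.executemany("INSERT INTO temps VALUES (?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")  # assumption: stand-in for a data warehouse
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute("SELECT city, temp_c FROM temps ORDER BY city").fetchall()
print(loaded)
```

Keeping each stage as its own function mirrors the three ETL steps and makes each one easy to test separately.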
What is data visualisation?
It is the graphical representation of data and information.
What are some ways to visualise data?
You can visualise the data through charts, plots, maps, and animations.
Why is data visualisation useful?
It conveys data more effectively to decision-makers.
What happens after data visualisation?
Model building
What is model building?
It is the step where you train a model on the data, using suitable machine learning algorithms, so that it learns the patterns in that data.
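As a rough illustration of training a model on data, the sketch below fits a straight line with ordinary least squares in plain Python. The data points and the `fit_line` helper are made up for this example; a real project would more likely use a machine learning library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data that follows y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)  # the "pattern" the model has learned
print(slope, intercept)
```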
What happens after model building?
Model deployment
What is model deployment?
It is the process of integrating a model into a production environment
In model deployment, how is a machine learning model made available to third-party applications?
It is made available via APIs.
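One way to picture a model behind an API is a small HTTP endpoint that accepts JSON and returns a prediction. The sketch below uses only Python's standard library; the `predict` function, the `x` field, and the `PredictHandler` name are hypothetical, and the server is defined but not started.

```python
import json
from http.server import BaseHTTPRequestHandler

def predict(x):
    """Hypothetical trained model: a fixed line y = 2x + 1."""
    return 2.0 * x + 1.0

def handle_predict(body):
    """Turn a JSON request body into a JSON response body."""
    payload = json.loads(body)
    return json.dumps({"prediction": predict(float(payload["x"]))})

class PredictHandler(BaseHTTPRequestHandler):
    """POST endpoint that exposes the model over HTTP."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        request_body = self.rfile.read(length).decode("utf-8")
        response = handle_predict(request_body).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(response)

# To serve it (not run here):
#   from http.server import HTTPServer
#   HTTPServer(("localhost", 8000), PredictHandler).serve_forever()
result = handle_predict('{"x": 3}')
print(result)
```

A third-party application would then send a POST request with a JSON body and read the prediction from the response, without needing any knowledge of how the model works internally.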
What goal do third-party applications help achieve during model deployment?
They allow business users to access and interact with data
What is the purpose of model monitoring?
To track the performance of deployed models.
Give an example of a tool used during model monitoring
Fiddler
What is the purpose of model assessment?
To check a model's accuracy, fairness, and robustness.
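A very simple form of monitoring and assessment is to log a deployed model's predictions next to the actual outcomes, then compute accuracy overall and per group as a rough fairness check. The records and field names below are invented for illustration, and this is not how Fiddler or any particular tool works.

```python
from collections import defaultdict

def accuracy(records):
    """Share of records where the prediction matched the actual outcome."""
    return sum(r["predicted"] == r["actual"] for r in records) / len(records)

def accuracy_by_group(records, key):
    """Accuracy per group; a large gap between groups can signal unfairness."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r)
    return {g: accuracy(rs) for g, rs in groups.items()}

# Hypothetical log of predictions made by a deployed model.
logged = [
    {"group": "A", "predicted": 1, "actual": 1},
    {"group": "A", "predicted": 0, "actual": 0},
    {"group": "B", "predicted": 1, "actual": 0},
    {"group": "B", "predicted": 1, "actual": 1},
]

overall = accuracy(logged)
per_group = accuracy_by_group(logged, "group")
print(overall, per_group)
```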