Intro to Data Analytics Flashcards
Which role is responsible for project initiation and providing the requirements for a project?
Project sponsor
Which job position is primarily responsible for designing and constructing data pipelines within the field of data analytics?
Data engineer
Which role in a data analytics project helps data scientists shape data for analysis?
Data engineer
Which role in a data analytics project provides expertise for analytical techniques?
Data scientist
Which skills are required by data scientists for converting unstructured data to structured data in data analytics projects?
Data wrangling skills
Which skill must a business intelligence analyst possess to collect and organize data?
Data preparation
Which project-related activity typically takes up the majority of a data analyst’s time?
Data preparation
Which software do business intelligence analysts use to perform their responsibilities?
Microsoft Excel
What is a skill required of a data engineer?
Maintaining databases
Which group of stakeholders consists of professionals such as line managers?
Business users
Which stakeholder is primarily responsible for ensuring the desired quality of the project?
Project manager
Which stakeholder has access to essential tables or storage systems and guarantees the highest levels of security in the data repository?
Database administrator
Why is it significant to establish failure criteria for a data analytics project in the discovery phase?
It helps the team determine when it is best to accept the conclusions.
Which stakeholder extracts and transforms data during the discovery phase?
Data engineer
A person has been assigned to manage a project to implement a company-wide customer relationship management (CRM) system. The CRM system aims to centralize customer details, automate sales processes, and improve customer service. What skills are crucial for the project team members working on the CRM system implementation?
Data analysis, system integration, and training
Why is formulating an initial hypothesis an integral part of the discovery phase of the data analytics lifecycle?
It guides the subsequent data collection, processing, and analysis activities.
Who should be included as stakeholders in an analytics project?
Anyone who will benefit from the project
Who offers suggestions on ideas to test as the team formulates hypotheses during the discovery phase of a data analytics project?
Data scientists
Which activity occurs during the data preparation phase of the data analytics lifecycle?
Understanding of data
Which data visualization is most suitable for understanding the trend and progression of a variable over time in the data preparation phase?
Line charts
Which task is commonly performed to identify and address data quality issues during the data preparation phase?
Conducting data profiling
Which common data cleaning task is used to address the missing data in a data set?
Imputation
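A minimal sketch of one common form of imputation, mean imputation, follows. It assumes a numeric column in which missing entries are stored as `None`; the function name `impute_mean` and the sample values are hypothetical and chosen only for illustration.

```python
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the observed (non-missing) values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]

# Hypothetical column with two missing entries.
ages = [20, None, 40, 30, None]
filled = impute_mean(ages)
```

Mean imputation is only one option; median or model-based imputation may suit skewed or structured data better.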
Which task is typically performed to handle outliers during the data preparation phase?
Truncating extreme values
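One way to truncate extreme values is to clip them to the interquartile-range (IQR) fences. The sketch below assumes the conventional 1.5 × IQR rule, which the card itself does not specify; `clip_to_iqr_fences` and the sample data are hypothetical.

```python
from statistics import quantiles

def clip_to_iqr_fences(values, factor=1.5):
    """Clip each value to [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles of the column
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [min(max(v, low), high) for v in values]

# Hypothetical measurements with one obvious outlier (95).
readings = [10, 12, 11, 13, 12, 11, 95]
clipped = clip_to_iqr_fences(readings)
```

Values inside the fences pass through unchanged; only the outlier is pulled back to the upper fence.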
A data analyst at a retail company is provided with a large dataset containing sales transactions, customer information, and product details. The analyst is tasked with preparing the data for analysis and modeling. Which activity would the analyst perform during the data preparation phase?
Exploring available data to understand its characteristics and suitability
Which activity is performed during the model planning phase of a data analysis project?
Selecting relevant features for modeling
Which programming language is primarily used for statistical analysis and data manipulation in the model planning phase?
R
Which classification model is based on probability and assigns class labels to instances according to their likelihood of belonging to a particular class?
Naive Bayes
Which tool is used to connect users to relational databases and data warehouse appliances in the model planning phase?
SAS/ACCESS
Which regression model is commonly used for predicting a continuous numerical outcome based on a set of input features?
Linear regression
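As an illustration of linear regression with a single input feature, the sketch below fits a line by ordinary least squares. The helper `fit_line` and the data points are hypothetical; real projects would normally rely on a statistics package instead of hand-written formulas.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
predicted = slope * 5 + intercept  # prediction for a new input x = 5
```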
Which phase of the data analytics life cycle involves running analytical software packages on small datasets to test and refine models?
Model execution phase
Which step is typically performed after executing the model in the model execution phase?
Result analysis
Which testing procedure is used for evaluating the performance of a model in the data analytics life cycle?
Cross-validation
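The idea behind k-fold cross-validation can be sketched as follows: split the data into k folds, hold each fold out in turn, train on the rest, and average the held-out error. The example assumes contiguous, unshuffled folds and a deliberately trivial "predict the training mean" baseline; both function names are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k roughly equal, contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

def cross_validate_mean_model(ys, k):
    """Average held-out mean squared error of a predict-the-training-mean baseline."""
    fold_errors = []
    for train_idx, test_idx in k_fold_indices(len(ys), k):
        prediction = sum(ys[i] for i in train_idx) / len(train_idx)
        mse = sum((ys[i] - prediction) ** 2 for i in test_idx) / len(test_idx)
        fold_errors.append(mse)
    return sum(fold_errors) / len(fold_errors)

# Hypothetical target values.
targets = [2.0, 4.0, 6.0, 8.0]
score = cross_validate_mean_model(targets, k=2)
```

In practice the rows are usually shuffled before splitting so that each fold is representative of the whole data set.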
What is the role of SPSS Modeler in the model execution phase of the data analytics life cycle?
It is used for applying the trained model to new data for predictions.
How does the communication of results tie to the operationalize phase of data analytics?
It implements data-driven insights into business functions.
Which data visualization tool in the communicate results phase is used to create web-based visualizations?
D3.js
Which measure assesses the validity of a correlation between two variables during the communicate results phase?
P-value
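To make the p-value concrete, the sketch below estimates one for a Pearson correlation with a permutation test. This is a swapped-in technique chosen because it needs only the standard library; statistical packages typically report a p-value from the t-distribution instead. The function names, data, permutation count, and seed are all hypothetical.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Two-sided p-value: share of shuffles with |r| at least as large as observed."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real pairing between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical, strongly correlated measurements.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]
p_value = permutation_p_value(hours, scores)
```

A small p-value indicates that a correlation this strong would rarely arise if the two variables were unrelated.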