Data & Artificial Intelligence (AI) Flashcards
Data, data storytelling, AI, and data science CDS Essentials high-level questions
“Data Literacy” refers to the ability to do what four things with data?
- Read
- Work with
- Analyze
- Argue with
What are the “Three Us” (U being “Understand”) of data storytelling?
- Understand the data
- Understand the audience
- Understand the business question or need
What data analysis technique “summarizes the main features of a dataset, providing insights into its characteristics”?
Descriptive analysis
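As an illustration of descriptive analysis, here is a minimal Python sketch that summarizes a small dataset using the standard library. The `daily_orders` values are made up for the example and are not from the course material.

```python
import statistics

# Hypothetical sample: daily order counts for one week (made-up values).
daily_orders = [12, 15, 11, 15, 20, 18, 14]

# Descriptive analysis only summarizes the data we have; it makes no
# claims about other data or about the future.
summary = {
    "count": len(daily_orders),
    "mean": statistics.mean(daily_orders),
    "median": statistics.median(daily_orders),
    "mode": statistics.mode(daily_orders),
    "min": min(daily_orders),
    "max": max(daily_orders),
    "stdev": statistics.stdev(daily_orders),  # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value}")
```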
What data analysis technique “uses a sample dataset to infer conclusions about the larger population of data”?
Inferential analysis
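One common form of inferential analysis is estimating a population mean from a sample. The sketch below computes an approximate 95% confidence interval. It assumes a normal approximation (z = 1.96), which is rough for a sample this small; a t-distribution would usually be more appropriate. The `sample` values are invented for illustration.

```python
import math
import statistics

# Hypothetical sample of 10 measurements drawn from a larger population.
sample = [4.1, 3.9, 4.3, 4.0, 4.2, 3.8, 4.4, 4.1, 4.0, 4.2]

n = len(sample)
sample_mean = statistics.mean(sample)

# Standard error of the mean: sample standard deviation / sqrt(n).
standard_error = statistics.stdev(sample) / math.sqrt(n)

# Assumption: normal approximation with z = 1.96 for roughly 95% coverage.
z = 1.96
ci = (sample_mean - z * standard_error, sample_mean + z * standard_error)

print(f"sample mean: {sample_mean:.3f}")
print(f"approx. 95% CI for the population mean: ({ci[0]:.3f}, {ci[1]:.3f})")
```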
What data analysis technique “identifies patterns and relationships in historical data to predict future outcomes, using algorithms and machine learning techniques”?
Predictive analysis
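A very small example of predictive analysis is fitting a straight line to historical data and extrapolating one step ahead. The sketch below uses ordinary least squares written out by hand. The `months` and `sales` values are hypothetical and deliberately lie on a perfect line so the result is easy to check.

```python
# Hypothetical historical data: month number and sales (made-up values).
months = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

mean_x = sum(months) / len(months)
mean_y = sum(sales) / len(sales)

# Ordinary least squares for a single feature:
# slope = covariance(x, y) / variance(x)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(months, sales))
s_xx = sum((x - mean_x) ** 2 for x in months)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x


def predict(month):
    """Predict sales for a month using the fitted line."""
    return intercept + slope * month


print(f"fitted line: sales = {intercept} + {slope} * month")
print(f"predicted sales for month 6: {predict(6)}")
```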
What data analysis technique “aims to understand the reasons behind past events or outcomes by analyzing historical data to identify causes, often used in troubleshooting and problem-solving”?
Diagnostic analysis
What data analysis technique “recommends actions in order to achieve desired goals”?
Prescriptive analysis
What sort of artificial intelligence has the functionality of “[focusing] on one narrow task”?
Artificial Narrow Intelligence (ANI)
What sort of artificial intelligence has the functionality of “understanding and learning any intellectual task that a human can”?
Artificial General Intelligence (AGI)
What sort of artificial intelligence would have the ability to “surpass human intelligence, evoking emotions, needs, and / or beliefs of its own”? [hypothetically exists]
Artificial Super Intelligence (ASI)
Artificial intelligence can be categorized in two ways; what are they?
- Technology: the type of technology the AI uses
- Functionality: what sort of functionality AI uses to complete its tasks
What subset of artificial intelligence uses training data and an initial algorithm to predict results and then continuously tests and tweaks the algorithm it uses until acceptable predictions are generated?
Machine learning
What subset of artificial intelligence uses neural networks to train itself?
Deep learning
What are the three main differences between machine learning and deep learning?
- Deep learning has better performance with similar amounts of data
- Deep learning does not require feature extraction to be done manually by a human
- Deep learning requires much more processing power
What branch of artificial intelligence can understand, analyze and interpret text and spoken word in the same way human beings can?
Natural Language Processing (NLP)
Why is understanding data sources important in data storytelling?
Being able to explain the data sources used helps build credibility and allows the audience to assess the reliability of the insights presented.
What is the role of the scientific method in data science?
Data science applies the scientific method (asking questions, forming hypotheses, testing with data) to make evidence-based decisions and predictions, moving beyond intuition and guesswork.
“An interdisciplinary field combining computer science, math and statistics, and domain knowledge to extract insights from data and transform it into actionable knowledge” is the definition of what?
Data science
What are the key roles on a data science team?
Data scientist, data engineer, and subject matter expert. Other roles may include project manager, software developer, and designer.
List a couple of popular tools used in data science.
- Programming Languages: SQL, Python, R.
- Relational Databases: MySQL, Microsoft SQL Server, PostgreSQL.
- Big Data Platforms: Spark, Hive.
- Spreadsheets/BI Tools: Excel, Tableau.
Describe the steps in the data science process.
- Define the question.
- Collect data.
- Prepare data.
- Create a model.
- Evaluate the model.
- Deploy the model.
What are the key ingredients of a data-driven organization?
Strategy, people, data, technology, and culture.
Describe the data-driven hierarchy of needs.
- Collect data.
- Organize data.
- Analyze data.
- Make predictions.
- Automate.
What are the three types of AI based on technology?
- Artificial Narrow Intelligence (ANI)
- Artificial General Intelligence (AGI)
- Artificial Super Intelligence (ASI).
What are the characteristics of reactive machines in AI?
- Operate solely on current information
- Lack memory or the ability to learn from past experiences
- React to input and perform programmed tasks.
What distinguishes limited memory AI systems?
They can learn from past data and experiences, enabling them to make more informed decisions compared to reactive machines.
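The contrast between reactive machines and limited memory systems can be sketched loosely with a toy thermostat. This is only an analogy under stated assumptions: the 20.0 °C threshold, the window size, and the names `reactive_controller` and `LimitedMemoryController` are all hypothetical, and real limited memory AI learns from past data rather than just averaging recent readings.

```python
from collections import deque

THRESHOLD_C = 20.0  # hypothetical target temperature


def reactive_controller(temp_c):
    """Reactive: the decision depends only on the current input."""
    return "heat_on" if temp_c < THRESHOLD_C else "heat_off"


class LimitedMemoryController:
    """Limited memory: the decision also uses a few recent observations."""

    def __init__(self, window=3):
        # Only the last `window` readings are kept; older ones are dropped.
        self.recent = deque(maxlen=window)

    def decide(self, temp_c):
        self.recent.append(temp_c)
        average = sum(self.recent) / len(self.recent)
        return "heat_on" if average < THRESHOLD_C else "heat_off"


readings = [25.0, 24.0, 19.0]  # made-up readings with a brief dip at the end
memory_controller = LimitedMemoryController(window=3)

reactive_decisions = [reactive_controller(t) for t in readings]
memory_decisions = [memory_controller.decide(t) for t in readings]

print("reactive:      ", reactive_decisions)
print("limited memory:", memory_decisions)
```

On the last reading the reactive controller switches the heat on, while the limited memory controller keeps it off because the recent average is still above the threshold.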
How does a machine learning algorithm learn?
- It is trained on data
- It receives feedback on its predictions
- It adjusts its parameters to improve accuracy over time.
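The loop described above (predict, get feedback, adjust parameters) can be sketched with gradient descent on a one-parameter model. This is a minimal illustration, not how any particular library trains; the training pairs, the learning rate, and the number of epochs are assumptions chosen so the example converges quickly.

```python
# Hypothetical training data that follows y = 2 * x exactly.
training_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

weight = 0.0          # the single parameter the model will learn
learning_rate = 0.05  # assumed step size
epochs = 200          # assumed number of passes over the data

for epoch in range(epochs):
    gradient = 0.0
    for x, y in training_data:
        prediction = weight * x          # 1. predict
        error = prediction - y           # 2. feedback: how wrong was it?
        gradient += 2 * error * x        # derivative of squared error w.r.t. weight
    gradient /= len(training_data)       # average over the dataset
    weight -= learning_rate * gradient   # 3. adjust the parameter

print(f"learned weight: {weight:.6f}")
```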
What is a key difference between deep learning and machine learning in feature extraction?
Deep learning automatically learns features from data, while traditional machine learning requires manual feature engineering.
What are the two main phases of Natural Language Processing (NLP)?
- Data preprocessing (preparing text data)
- Algorithm development (creating rules or models to process the data)
Give four examples of data preprocessing techniques in Natural Language Processing (NLP).
- Tokenization (breaking text into smaller units).
- Stop word removal (eliminating common words).
- Lemmatization/stemming (reducing words to their root form).
- Part-of-speech tagging.
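The first three techniques above can be sketched in plain Python. This is a simplified illustration: the stop word list and the suffix-stripping rule in `naive_stem` are small hypothetical stand-ins for what an NLP library would provide, and part-of-speech tagging is left out because it normally needs a trained tagger.

```python
import re

# Tiny hypothetical stop word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "of", "to", "in"}

# Suffixes checked in order by the naive stemmer below.
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Tokenization: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def remove_stop_words(tokens):
    """Stop word removal: drop very common words that carry little meaning."""
    return [token for token in tokens if token not in STOP_WORDS]


def naive_stem(token):
    """Crude stemming: strip one known suffix if at least 3 letters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run tokenization, stop word removal, and stemming in sequence."""
    return [naive_stem(token) for token in remove_stop_words(tokenize(text))]


print(preprocess("The cats are jumping in the garden"))
# prints ['cat', 'jump', 'garden']
```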
What are the two main types of Natural Language Processing (NLP) algorithms?
- Rule-based systems (using handcrafted linguistic rules)
- Machine learning-based systems (using statistical models trained on data).
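The difference between the two approaches can be sketched with a toy sentiment labeler. The rule-based version relies on handcrafted word lists, while the learned version derives word scores from labeled examples. Everything here is hypothetical: the word lists, the training examples, and the simple count-based scoring, which stands in for a real statistical model.

```python
from collections import Counter

# Rule-based: word lists written by hand.
POSITIVE_WORDS = {"good", "great", "love"}
NEGATIVE_WORDS = {"bad", "poor", "hate"}


def to_label(score):
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"


def rule_based_label(text):
    """Apply handcrafted rules: count positive and negative keywords."""
    words = text.lower().split()
    score = sum(w in POSITIVE_WORDS for w in words) - sum(
        w in NEGATIVE_WORDS for w in words
    )
    return to_label(score)


def train_word_scores(examples):
    """Learn a score per word from labeled examples (a toy stand-in for ML)."""
    scores = Counter()
    for text, label in examples:
        delta = 1 if label == "positive" else -1
        for word in text.lower().split():
            scores[word] += delta
    return scores


def learned_label(text, scores):
    """Label text using the learned scores; unseen words count as 0."""
    return to_label(sum(scores[w] for w in text.lower().split()))


# Hypothetical labeled training data.
examples = [
    ("excellent service and friendly staff", "positive"),
    ("slow service and rude staff", "negative"),
    ("excellent food", "positive"),
    ("rude waiter", "negative"),
]
word_scores = train_word_scores(examples)

print(rule_based_label("I love this great product"))
print(rule_based_label("excellent and friendly"))
print(learned_label("excellent and friendly", word_scores))
```

The rule-based labeler returns "neutral" for "excellent and friendly" because neither word is in its handcrafted lists, while the learned labeler picks up both words from the training examples.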