Sir Kyle Flashcards

Question 1

Q

What is data analytics?
a) The process of collecting and organizing data
b) The process of analyzing data to make decisions
c) The process of creating data
d) The process of deleting outdated data

Question 2

Q

Why is data analytics important for businesses?
a) It helps in predicting market trends
b) It provides insights for better decision-making
c) It identifies business performance issues
d) All of the above

Question 3

Q

Which type of analytics predicts future outcomes based on data?
a) Descriptive analytics
b) Diagnostic analytics
c) Predictive analytics
d) Prescriptive analytics

Question 4

Q

What is descriptive analytics?
a) Analytics that explains what has happened
b) Analytics that predicts what will happen
c) Analytics that determines why something happened
d) Analytics that recommends actions to take

Question 5

Q

Which of these is an example of structured data?
a) Social media posts
b) Email contents
c) Customer database with names and phone numbers
d) Images stored in a file

Question 6

Q

What type of data visualization is best suited for showing parts of a whole?
a) Line chart
b) Pie chart
c) Scatter plot
d) Histogram

Question 7

Q

Big Data refers to datasets that are…
a) Easy to store and manage
b) Too large and complex for traditional data-processing methods
c) Small but require a lot of computation
d) Structured and easy to analyze

Question 8

Q

What is the purpose of A/B testing in data analytics?
a) To compare two versions of a product or feature to determine which performs better
b) To clean data
c) To automate the analysis process
d) To visualize complex data
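
A minimal sketch of how the two versions in an A/B test might be compared, assuming the metric is a conversion rate and that a two-proportion z-test is an acceptable method (the card itself does not name one). The function name `ab_test` and the visitor counts are hypothetical.

```python
from math import erf, sqrt

def ab_test(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test: return (z, two-sided p-value) for B versus A."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled conversion rate under the assumption that A and B perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF written with erf; the p-value is two-sided.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Made-up numbers: version A converts 100 of 1000 visitors, version B 150 of 1000.
z, p = ab_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the versions is unlikely to be chance alone; the cutoff (often 0.05) is a convention, not part of the test.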

Question 9

Q

Which of the following describes prescriptive analytics?
a) Provides insights into why things happened
b) Describes what is happening in real-time
c) Recommends actions based on data analysis
d) Predicts future trends

Question 10

Q

What is the significance of data governance in analytics?
a) To regulate the storage of data
b) To ensure data privacy, security, and compliance
c) To visualize large datasets
d) To improve the speed of data processing

Question 11

Q

Which of the following is a type of data analytics?
A) Predictive Analytics
B) Descriptive Analytics
C) Prescriptive Analytics
D) All of the above

Question 12

Q

What type of data is “Gender” in a dataset?
A) Quantitative
B) Qualitative
C) Continuous
D) Interval

Question 13

Q

Which chart is most commonly used to show trends over time?
A) Pie Chart
B) Bar Chart
C) Line Chart
D) Scatter Plot

Question 14

Q

In data cleaning, which process removes duplicate values in a dataset?
A) Normalization
B) Deduplication
C) Data Merging
D) Standardization
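
Deduplication can be sketched in a few lines. This version assumes records are flat dictionaries and that only exact duplicates count; the function name `deduplicate` and the sample records are hypothetical.

```python
def deduplicate(records):
    """Return records with exact duplicates removed, keeping the first occurrence."""
    seen = set()
    unique = []
    for record in records:
        # Dicts are not hashable, so use a sorted tuple of items as the identity.
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

contacts = [
    {"name": "Ana", "phone": "555-0101"},
    {"name": "Ben", "phone": "555-0102"},
    {"name": "Ana", "phone": "555-0101"},  # exact repeat of the first record
]
print(deduplicate(contacts))
```

Near-duplicates (for example, the same name with different capitalization) would need standardization first; this sketch does not attempt that.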

Question 15

Q

What is the significance of data governance in analytics?
a) To regulate the storage of data
b) To ensure data privacy, security, and compliance
c) To visualize large datasets
d) To improve the speed of data processing

Question 16

Q

Which of the following is NOT a form of data visualization?
a) Bar Chart
b) Line Graph
c) Base Graph
d) Scatter Plot

Question 17

Q

Which chart is best suited for showing the distribution of data across different categories?
a) Line chart
b) Pie chart
c) Bar chart
d) Scatter plot

Question 18

Q

When creating a histogram, the X-axis represents:
a) Data frequency
b) Data values or ranges
c) Percentages
d) None of the above

Question 19

Q

In a histogram, what does the height of each bar represent?
a) The sum of data values in that range
b) The frequency or count of data in a specific range
c) The total data collected
d) The average of the data points in that bin
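
The two histogram cards can be illustrated with a small text histogram: bins (value ranges) run along one axis and the bar length is the count in each bin. This is a sketch assuming equal-width bins; `histogram_counts` and the sample ages are hypothetical.

```python
from collections import Counter

def histogram_counts(values, bin_width):
    """Map each bin's starting value to the number of values in [start, start + bin_width)."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

ages = [21, 23, 25, 31, 34, 35, 38, 42, 47, 63]
width = 10
for start, count in histogram_counts(ages, width).items():
    # Range label = the X-axis value; number of '#' = bar height (frequency).
    print(f"{start}-{start + width - 1}: {'#' * count}")
```

Bins with no values (50-59 here) are simply omitted by this sketch; a plotting library would normally draw them as empty bars.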

Question 20

Q

If the bars in a histogram are skewed to the right, what does this indicate about the distribution of the data?
a) Symmetric distribution
b) Positively skewed distribution
c) Negatively skewed distribution
d) Uniform distribution

Question 21

Q

Which measure of central tendency is most affected by outliers?
a) Mean
b) Median
c) Mode
d) All are equally affected
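
A quick way to see the effect of an outlier on each measure, using Python's standard `statistics` module. The salary figures are made up for illustration.

```python
from statistics import mean, median, mode

# Hypothetical salaries in thousands.
salaries = [30, 32, 32, 35, 38]
with_outlier = salaries + [500]  # add one extreme value

print("without outlier:", mean(salaries), median(salaries), mode(salaries))
print("with outlier:   ", mean(with_outlier), median(with_outlier), mode(with_outlier))
```

The mean jumps from 33.4 to above 111, while the median only moves from 32 to 33.5 and the mode stays at 32.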

Question 22

Q

The median is defined as:
a) The average of all values
b) The most frequently occurring value
c) The middle value when data is ordered
d) The range of the dataset

Question 23

Q

When would the median be a better measure of central tendency than the mean?
a) When data is symmetrically distributed
b) When data has outliers or is skewed
c) When data is categorical
d) When data contains repeated values

Question 24

Q

What does the mean of a dataset represent?
a) The most frequently occurring value
b) The value that divides the data into two equal parts
c) The average of all data points
d) The value with the highest frequency

Question 25

Q

If the mean and median of a dataset are equal, what type of distribution does the data likely have?
a) Skewed to the left
b) Skewed to the right
c) Relatively Symmetric
d) Uniform distribution

Question 26

Q

Which measure of central tendency divides the dataset into two equal parts?
a) Mean
b) Median
c) Mode
d) Interquartile range

Question 27

Q

In a dataset where the mean is greater than the median, what can you infer about the shape of the distribution?
a) It is symmetric
b) It is positively skewed (right-skewed)
c) It is negatively skewed (left-skewed)
d) It is normally distributed

Question 28

Q

When analyzing income data, which measure of central tendency is typically preferred and why?
a) Mean, because it includes all data values
b) Median, because it is less influenced by extreme outliers
c) Mode, because it represents the most common income level
d) Mean, because it minimizes the impact of variance

Question 29

Q

In a dataset with outliers, why might the median be a better measure of central tendency than the mean?
a) The median reflects all values in the dataset
b) The mean is distorted by extreme values, while the median is not
c) The mode is more reliable than the mean
d) The mean and median are always equal

Question 30

Q

Questions
Data Collection
Data Cleaning
Data Analysis
Data Interpretation

Answer

A

Data Analytics Workflow

Question 31

Q

Why are measures of central tendency important for summarizing large datasets?
a) They reduce the complexity of data by providing a single representative value
b) They eliminate the need to analyze individual data points
c) They measure the spread of the data
d) They provide insight into data variability

Question 32

Q

In 1965, Intel co-founder ____ predicted that
the number of transistors on a chip would double
roughly every two years, with a minimal rise in cost.

Answer

A

Gordon Moore

Question 33

Q

“I would expect that next year, people will share twice as
much information as they share this year, and next year,
they will be sharing twice as much as they did the year
before”

Answer

A

Mark Zuckerberg

Question 34

Q

A characteristic of members of a population

e.g., market share, revenue, season, Bike_Rentals, temperature,
date, weather condition

Answer

A

Variables

Question 35

Q

Observations can be named without particular order or ranking imposed on the data.

Words, letters, and even numbers are used to classify the data.

Answer

A

Nominal Value

Question 36

Q

Observations of a variable

e.g., 11%, $225M, summer, 985, 23.5˚, 1/12/2011, McDonald's

Question 37

Q

Contains variables and observations

Array (rows and columns)

Question 38

Q

Indicates an actual amount (numerical). The order and the difference between the variables can be known.

Its limitation is that it has no “true zero”.

Answer

A

Interval Level

Question 39

Q

The degree to which all required data is known.

Answer

A

Completeness

Question 40

Q

Describes ranking or order. The difference or ratio between rankings may not always be the same.

Answer

A

Ordinal Value

Question 41

Q

It has the same properties as the interval level. The order and difference can be described.

Additionally, it has a true zero, and the ratio between two points has a meaning.

Answer

A

Ratio Level

Question 42

Q

Accuracy. Ensure your data is close to the true values (real-world objects it
represents).

Validity. Whether the data measures what it is supposed to measure.

Completeness. The degree to which all required data is known.

Consistency. Ensure your data is consistent within the same dataset and/or
across multiple data sets.

Uniformity. The degree to which the data is specified using the same unit of
measure.

Answer

A

DATA QUALITY DIMENSIONS

Question 43

Q

Ensure your data is close to the true values (real-world objects it
represents).

Answer

A

Accuracy

Question 44

Q

Whether the data measures what it is supposed to measure

Answer

A

Validity

Question 45

Q

Right (positively) skewed:

The right tail is longer

Values of data extend to the right

Answer

A

Skewed to the Right

Question 46

Q

Ensure your data is consistent within the same dataset and/or
across multiple data sets.

Answer

A

Consistency

Question 47

Q

Gather data from various sources, such as databases, files, APIs, or surveys.

Ensure that the data collected is relevant to your research question or analysis
objectives.

Answer

A

Data Collection and Acquisition

Question 48

Q

The degree to which the data is specified using the same unit of
measure.

Answer

A

Uniformity

Question 49

Q

Examine the raw data to get a sense of its structure and contents.

Check for missing values, outliers, and anomalies that may require attention.

Answer

A

Data Inspection

Question 50

Q

Address missing data by deciding whether to fill in missing values or remove records with missing values.

Correct any data entry errors, inconsistencies, outliers, or duplicated records.

Standardize data formats (e.g., date formats, data types) to ensure consistency.

Answer

A

Data Cleansing
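
Two of the cleansing steps above, sketched with the standard library only. The choice of median as the fill value and the list of accepted date formats are assumptions made for illustration; the helper names `fill_missing` and `standardize_date` are hypothetical.

```python
from datetime import datetime
from statistics import median

def fill_missing(values):
    """Replace None entries with the median of the known values (one possible strategy)."""
    known = [v for v in values if v is not None]
    fill = median(known)
    return [fill if v is None else v for v in values]

def standardize_date(text):
    """Convert a date in one of a few assumed formats to ISO format (YYYY-MM-DD)."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y", "%Y.%m.%d"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next assumed format
    raise ValueError(f"Unrecognized date format: {text!r}")

print(fill_missing([10, None, 30, 20]))
print(standardize_date("12/01/2011"))  # read as day/month/year under the assumption above
```

Whether to fill or drop missing values depends on the analysis; dropping would be `[v for v in values if v is not None]`.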

Question 51

Q

Encode categorical variables into numerical format using techniques like one-hot encoding or label encoding.

Normalize or scale numerical features if necessary to bring them to a common scale.

Answer

A

Data Transformation
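
A minimal sketch of both transformations, assuming min-max scaling as the scaling method (the card does not name one). The function names and sample values are hypothetical.

```python
def one_hot(values):
    """Encode each category as a 0/1 vector, one column per category (sorted alphabetically)."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

def min_max_scale(values):
    """Rescale numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # all values equal: nothing to spread out
    return [(v - lo) / (hi - lo) for v in values]

seasons = ["summer", "winter", "summer"]
print(one_hot(seasons))              # columns: summer, winter
print(min_max_scale([10, 20, 30]))
```

Label encoding would instead map each category to a single integer, which is more compact but implies an order the categories may not have.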

Question 52

Q

Combine data from multiple sources if needed, ensuring that there are common identifiers to merge the
data correctly.

Answer

A

Data Integration
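
Merging on a common identifier can be sketched as an inner join over two lists of records. This assumes the key is unique in the second source; `merge_on`, `customer_id`, and the sample rows are hypothetical.

```python
def merge_on(left, right, key):
    """Inner-join two lists of dicts on a shared key (key assumed unique in `right`)."""
    lookup = {row[key]: row for row in right}
    # Keep only left rows whose identifier also appears in the right source.
    return [{**row, **lookup[row[key]]} for row in left if row[key] in lookup]

crm_rows = [
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Ben"},
]
order_rows = [
    {"customer_id": 1, "total": 250},
    {"customer_id": 3, "total": 90},
]
print(merge_on(crm_rows, order_rows, "customer_id"))
```

Only customer 1 appears in both sources, so only that record survives; rows without a match are a signal to check the identifiers before trusting the merged data.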

Question 53

Q

Create visualizations to explore the data further
and identify patterns, relationships, or outliers.

Visualization helps in understanding the data’s
characteristics and guiding further analysis.

Answer

A

Data Visualization

Question 54

Q

Data Creation/Collection

Data Ingestion (ETL)

Data Storage

Data Presentation and Visualization

Data Sharing and Distribution

Data Archiving and Retention

Data Backup and Disaster Recovery

Data Deletion and Disposal

Answer

A

Data Life Cycle

Question 55

Q

Left (negatively) skewed:

The left tail is longer

Values of data extend to the left

Answer

A

Skewed to the left
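
The direction of skew on the two skewness cards can be checked numerically. This sketch uses the moment-based skewness coefficient (third central moment divided by the 1.5 power of the variance), which is one common definition among several; the sample data are made up.

```python
from statistics import mean, median

def skewness(values):
    """Moment-based skewness: positive for a longer right tail, negative for a longer left tail."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n  # population variance
    m3 = sum((v - m) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5  # assumes the values are not all identical

right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]     # long tail to the right
left_skewed = [-v for v in right_skewed]     # mirror image: long tail to the left

for name, data in [("right", right_skewed), ("left", left_skewed)]:
    print(name, "skewness:", round(skewness(data), 2),
          "mean:", mean(data), "median:", median(data))
```

In the right-skewed sample the mean (4.75) sits above the median (3), and the relationship flips for the mirrored data, matching the mean-versus-median cards earlier in the deck.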