Chapter 1: Data Analysis Flashcards

1
Q

What is Data Analysis?

A

Data analysis is the process of collecting, transforming, cleaning, and interpreting data to make informed decisions or draw meaningful insights.

2
Q

What is descriptive analysis?

A

Descriptive analysis is the step in data analysis where we summarize data to make it more understandable, often using averages, percentages, and trends.
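
As a small illustration of the summaries mentioned above, the following sketch computes an average, a median, and a percentage with Python's standard library. The order values and return flags are hypothetical data invented for this example.

```python
from statistics import mean, median

# Hypothetical data: order values and whether each order was returned.
order_values = [20.0, 35.0, 50.0, 35.0, 60.0]
returned = [False, True, False, False, True]

average_value = mean(order_values)                    # arithmetic mean
typical_value = median(order_values)                  # middle value, less sensitive to outliers
pct_returned = 100 * sum(returned) / len(returned)    # share of orders returned

print(f"mean={average_value}, median={typical_value}, returned={pct_returned}%")
```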

3
Q

What is inferential analysis?

A

Inferential analysis uses samples of data to draw conclusions about a larger population, relying on techniques such as parameter estimation and hypothesis testing.

4
Q

What are some key techniques used in inferential analysis?

A

Key techniques include parameter estimation, hypothesis testing, and regression analysis.
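
Two of these techniques can be sketched in a few lines. The example below estimates a population mean with a confidence interval and runs a two-sided test of a hypothesized mean. It assumes a normal approximation (for a sample this small a t-distribution would usually be more appropriate), and the sample values and function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Estimate a population mean with a normal-approximation interval."""
    std_error = stdev(sample) / sqrt(len(sample))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    centre = mean(sample)
    return centre - z * std_error, centre + z * std_error

def two_sided_p_value(sample, null_mean):
    """Test H0: population mean equals null_mean, using a z statistic."""
    std_error = stdev(sample) / sqrt(len(sample))
    z = (mean(sample) - null_mean) / std_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical sample of 12 measurements drawn from a larger population.
sample = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8, 5.4, 5.1, 5.0, 5.2, 4.9, 5.3]
low, high = mean_confidence_interval(sample)
p_value = two_sided_p_value(sample, null_mean=4.5)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f}); p-value vs 4.5: {p_value:.4f}")
```

A small p-value suggests the sample is unlikely under the hypothesized mean; it does not prove the hypothesis false.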

5
Q

What is predictive analytics?

A

Predictive analytics combines historical data and machine learning to build models that predict future events or outcomes.

6
Q

What are the steps in the data analysis process?

A

The steps are:
- Define the objective
- Identify data requirements
- Collect data
- Process and format data
- Clean data
- Explore data
- Analyze data
- Model data
- Communicate results
- Monitor and update

7
Q

What are the key considerations when collecting data?

A

Key considerations include whether the process is manual or automated, limitations of the data, validation at the source, and the accuracy of converting manual data to electronic form.
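
Validation at the source can be as simple as checking each record when it is entered. The sketch below is one possible shape for such a check; the field names (patient_id, temperature_c) and the plausible range are assumptions made up for this example.

```python
def validate_reading(record):
    """Return a list of problems found in one manually entered record."""
    problems = []
    if not record.get("patient_id"):
        problems.append("missing patient_id")
    try:
        temperature = float(record.get("temperature_c", ""))
    except (TypeError, ValueError):
        problems.append("temperature_c is not a number")
    else:
        # Hypothetical plausibility range; catches slips such as 366 for 36.6.
        if not 30.0 <= temperature <= 45.0:
            problems.append("temperature_c outside plausible range")
    return problems

print(validate_reading({"patient_id": "A1", "temperature_c": "36.6"}))
print(validate_reading({"patient_id": "", "temperature_c": "366"}))
```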

8
Q

What is randomization in an experiment?

A

Randomization is the process of randomly assigning participants or samples to different groups or conditions so that, on average, the groups are similar in every respect except the treatment being tested.
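
One way to carry out such an assignment is to shuffle the participants and deal them into groups. This is a sketch, not a prescribed procedure; the randomize function, the group labels, and the seed value are assumptions for illustration.

```python
import random

def randomize(participants, groups=("treatment", "control"), seed=None):
    """Randomly assign participants to groups of near-equal size."""
    rng = random.Random(seed)      # a seed makes the assignment repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    # Deal round-robin so group sizes differ by at most one.
    for index, participant in enumerate(shuffled):
        assignment[groups[index % len(groups)]].append(participant)
    return assignment

assignment = randomize(range(10), seed=1)
print(assignment)
```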

9
Q

How does randomization help reduce bias?

A

Randomization reduces bias by making the groups comparable, on average, at the start of the experiment, which minimizes systematic differences between groups that could skew the results.

10
Q

What are confounding variables?

A

Confounding variables are factors other than the one being studied that are related to both the treatment and the outcome, so they can distort the apparent effect. Randomization helps distribute these variables evenly across groups.
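
The balancing effect can be seen in a small simulation. In this sketch, age plays the role of a potential confounder: when older participants select themselves into the treatment group, the groups differ sharply in age, while random assignment leaves the average ages close. The ages are simulated and the setup is hypothetical.

```python
import random
from statistics import mean

rng = random.Random(0)
ages = [rng.randint(18, 80) for _ in range(10_000)]  # simulated confounder

# Self-selection: everyone aged 50 or over opts into the treatment group.
treated_self = [age for age in ages if age >= 50]
control_self = [age for age in ages if age < 50]
self_selected_gap = mean(treated_self) - mean(control_self)

# Randomization: shuffle, then split into two equal halves.
shuffled = ages[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
randomized_gap = mean(shuffled[:half]) - mean(shuffled[half:])

print(f"age gap with self-selection: {self_selected_gap:.1f} years")
print(f"age gap with randomization:  {randomized_gap:.1f} years")
```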

11
Q

What is simple random sampling?

A

Simple random sampling involves randomly selecting participants from the entire population so that each person has an equal chance of being chosen.
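
In code, a simple random sample is a draw without replacement from the full population list. The population of 100 labelled people and the seed below are hypothetical.

```python
import random

# Hypothetical population of 100 people.
population = [f"person_{i}" for i in range(1, 101)]

rng = random.Random(2024)               # seeded so the draw can be repeated
sample = rng.sample(population, k=10)   # each person is equally likely to be chosen
print(sample)
```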

12
Q

What is stratified sampling?

A

Stratified sampling involves dividing the population into subgroups (strata) and then randomly sampling from each subgroup to ensure all subgroups are represented in the sample.
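
The following sketch shows one common variant, proportionate stratified sampling, where the same fraction is drawn from every stratum. The stratified_sample function, the urban/rural strata, and the 10% fraction are assumptions chosen for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, fraction, seed=None):
    """Draw the same fraction from every stratum, at least one unit each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)   # group units by stratum
    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))   # simple random sample within the stratum
    return sample

# Hypothetical population: 90 urban and 10 rural residents.
population = [("urban", i) for i in range(90)] + [("rural", i) for i in range(10)]
sample = stratified_sample(population, stratum_of=lambda unit: unit[0], fraction=0.1, seed=7)
urban_count = sum(1 for stratum, _ in sample if stratum == "urban")
rural_count = sum(1 for stratum, _ in sample if stratum == "rural")
print(urban_count, rural_count)
```

A simple random sample of 10 from this population could easily contain no rural residents at all; stratifying guarantees the rural stratum is represented.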

13
Q

What are the types of data determined by the collection process, and how do they affect analysis?

A

  • Cross-sectional data captures a snapshot at a single point in time; it is useful for comparisons but cannot show trends.
  • Longitudinal data tracks changes over time, showing trends.
  • Sensor data provides real-time measurements for monitoring.
  • Truncated data arises when observations outside a certain range are excluded entirely, which can lead to incomplete or biased results.

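
The bias from truncation is easy to demonstrate. In this sketch, a hypothetical collection process records only values below 100, so the larger observations never reach the dataset and the average is understated. All numbers are invented for the example.

```python
from statistics import mean

# Hypothetical true values in the population of interest.
true_values = [20, 35, 50, 65, 80, 120, 200]

# Hypothetical collection limit: values of 100 or more are never recorded.
recorded_values = [value for value in true_values if value < 100]

true_mean = mean(true_values)
recorded_mean = mean(recorded_values)
print(f"true mean={true_mean:.1f}, mean of truncated data={recorded_mean:.1f}")
```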
14
Q

What are the key characteristics of big data?

A

Big data is characterized by four Vs: volume (large amounts of data), velocity (fast data generation and processing), variety (different types of data such as text, images, and numbers), and veracity (the reliability, quality, and accuracy of the data).

15
Q

What is replication?

A

Replication involves an independent third party repeating the original experiment or analysis, typically with newly collected data, and obtaining consistent results, which verifies that the findings are reliable.

16
Q

What is reproducibility?

A

Reproducibility means obtaining the same results when the same data and methods are used, so that others can re-run the analysis and confirm the findings.

17
Q

What elements are required for reproducibility?

A

To ensure reproducibility, provide the original data, fully documented code kept under version control, detailed documentation of the software and versions used, a fixed random seed wherever randomness is involved, and as little manual work as possible.
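
Setting the seed and recording the software version can be sketched as follows. The bootstrap_mean function, the data, and the seed value 42 are hypothetical; the point is only that a fixed seed makes a randomized analysis return the same answer on every run.

```python
import platform
import random

def bootstrap_mean(data, n_resamples=1000, seed=42):
    """Average of resampled means; the seed fixes which resamples are drawn."""
    rng = random.Random(seed)
    resample_means = []
    for _ in range(n_resamples):
        resample = rng.choices(data, k=len(data))   # sample with replacement
        resample_means.append(sum(resample) / len(resample))
    return sum(resample_means) / n_resamples

data = [2.3, 3.1, 2.8, 3.6, 2.9, 3.3]   # hypothetical measurements
first_run = bootstrap_mean(data)
second_run = bootstrap_mean(data)

# Record the interpreter version alongside the result.
print(f"Python {platform.python_version()}: {first_run} (repeat run equal: {first_run == second_run})")
```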