stats final 251 Flashcards

1
Q

List three or more categorical variables.

A

Gender (male, female, other), color, race

2
Q

List three or more quantitative variables.

A

Age, number of siblings you have, the mass or weight of an object

3
Q

Explain the difference between categorical and quantitative variables.

A

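You can do meaningful math with quantitative variables because their values are measured numbers that can be classified on an interval or ratio scale, while categorical variables only sort observations into groups or labels.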

4
Q

Give an example of a numerical value that is NOT quantitative.

A

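A zip code, phone number, or jersey number. These are written with digits, but they only label things, so doing math on them (like taking an average) is meaningless.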

5
Q

Explain the difference between quantitative variables on the interval scale versus the ratio scale.

A

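Quantitative interval data has meaningful differences but no true zero, so ratios do not make sense (for example, temperature in degrees Celsius or Fahrenheit: 20 degrees is not "twice as hot" as 10 degrees),
while ratio data has a true zero, so both differences and ratios are meaningful (for example, age or mass).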

6
Q

Explain the difference between categorical variables that are nominal versus ordinal.

A

If there is no ordering that can be done between categories the variable is nominal, but if there is some type of order that can be applied the variable is ordinal.

7
Q

How do you visually display categorical data?

A

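With a bar chart or pie chart. In a bar chart, the horizontal axis lists the categories and the bar heights show the count or percentage in each one.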

8
Q

How do you visually display quantitative data?

A

With a histogram, dotplot, stem-and-leaf plot, or boxplot. In a histogram, the horizontal axis contains ranges of numbers (bins) that represent a continuum.

9
Q

Explain the difference between discrete and continuous distributions.

A

Discrete distributions take countable, separate values (such as counts), while continuous distributions can take any value within an interval, so there are infinitely many possible values between any two points.

10
Q

Describe how we can characterize a distribution.

A

Describe it in terms of where the modes can be found (for a uniform distribution this isn’t as useful).

11
Q

What are descriptive statistics and what does each one tell us about our data?

A

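They summarize our data: the mean is the average, the median is the middle value, the mode is the most common value, the range is the spread from minimum to maximum, and the standard deviation is the typical distance of values from the mean. Quartiles and percentiles show where values fall within the data and help flag outliers.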
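
A minimal sketch of these statistics using Python's standard `statistics` module. The sibling counts are made-up illustration data, not values from the class.

```python
import statistics

# Hypothetical data: number of siblings reported by nine students.
siblings = [0, 1, 1, 2, 2, 2, 3, 4, 7]

mean = statistics.mean(siblings)              # average
median = statistics.median(siblings)          # middle value of the sorted data
mode = statistics.mode(siblings)              # most common value
data_range = max(siblings) - min(siblings)    # spread from smallest to largest
stdev = statistics.stdev(siblings)            # sample standard deviation
q1, q2, q3 = statistics.quantiles(siblings, n=4)  # quartile cut points

print(round(mean, 2), median, mode, data_range, round(stdev, 2), (q1, q2, q3))
```

Note that the value 7 pulls the mean above the median, which is one quick sign of a right-skewed dataset or a possible outlier.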

12
Q

What are the similarities between bar charts, pie charts, histograms, stem-and-leaf, dotplots, and boxplots?

A

-All are tools for visualizing data.
-All are used to summarize and understand a dataset.
-All show how values are distributed (bar and pie charts for categorical data, the others for quantitative data).

13
Q

What are the differences between bar charts, pie charts, histograms, stem-and-leaf, dotplots, and boxplots?

A

-Bar Charts: Show differences between categories.
-Pie Charts: Display parts, usually as a % of a whole.
-Histograms: Show the distribution of quantitative data.

14
Q

What is metadata?

A

Data that provides information about other data. It gives details about a dataset or piece of information.

15
Q

What are some examples of descriptive statistics and what do they tell us?

A

One example of a descriptive statistic is the mean, which gives us the average value of a set of data.

15
Q

In this class, are we conducting experiments and/or doing observational studies?

A

We are analyzing data from experiments, but those experiments must be class related because of the tests we would have to complete in order to do human trials.

16
Q

What are some words you should NEVER use when talking about our studies?

A

Everyone/no one: these are generalizations.
I think/they think: facts matter, not opinions.

16
Q

What is a relative frequency?

A

The proportion or percentage of times a particular event or outcome occurs relative to the total number of observations.
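
As a rough illustration, relative frequency is just each count divided by the total. The helper name `relative_frequencies` and the color list are hypothetical.

```python
from collections import Counter

def relative_frequencies(observations):
    """Return each outcome's count divided by the total number of observations."""
    counts = Counter(observations)
    total = len(observations)
    return {outcome: count / total for outcome, count in counts.items()}

# Hypothetical sample of ten candy colors.
colors = ["red", "blue", "red", "green", "blue",
          "red", "red", "green", "blue", "red"]

print(relative_frequencies(colors))  # {'red': 0.5, 'blue': 0.3, 'green': 0.2}
```

The relative frequencies always add up to 1 (or 100%).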

17
Q

How many marginal distributions are there?

A

Two: one for each variable in the table (the row totals and the column totals).

18
Q

How many conditional distributions are there?

A

There are many, but in our M&M experiment, for example, we had 10: C, L, M, N, B, BR, G, O, R, Y.

19
Q

Explain how to read/interpret a contingency table.

A

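A contingency table lists the categories of one variable down the rows and the categories of the other variable (for example, the defect or type) across the columns. Each cell gives the count of observations in that row-and-column combination, and the margins give the row and column totals.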
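
A small sketch of how marginal and conditional distributions come out of a contingency table. The table, its category names, and its counts are invented for illustration and are not the class M&M data.

```python
# Hypothetical two-way table of counts: rows are bag types, columns are colors.
table = {
    "plain":  {"red": 10, "blue": 20, "green": 10},
    "peanut": {"red": 15, "blue": 5,  "green": 40},
}

grand_total = sum(sum(cells.values()) for cells in table.values())

# Marginal distribution of the row variable: row totals over the grand total.
row_marginal = {r: sum(cells.values()) / grand_total for r, cells in table.items()}

# Marginal distribution of the column variable: column totals over the grand total.
columns = list(next(iter(table.values())))
col_marginal = {c: sum(table[r][c] for r in table) / grand_total for c in columns}

# Conditional distributions of color given bag type: one per row category.
conditional_on_row = {
    r: {c: n / sum(cells.values()) for c, n in cells.items()}
    for r, cells in table.items()
}

print(row_marginal)
print(col_marginal)
print(conditional_on_row["plain"])
```

There are two marginal distributions (rows and columns), and one conditional distribution for every category you condition on.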

20
Q

Explain what a distribution is.

A

A distribution in statistics describes how data values are spread or organized.

21
Q

Explain how to distinguish between a sample and a population.

A

A sample is a subset of a larger group, the population; the sample is what data is actually collected from for analysis.

22
Q

Describe the meaning of N(𝛍,𝛔) and the standard normal distribution N(0,1).

A

N(μ,σ) is a normal (Gaussian) distribution with mean μ (mu) and standard deviation σ (sigma). The standard normal distribution N(0,1) has a mean of 0 and a standard deviation of 1.

23
Q

Explain how to interpret histograms

A

Histograms: show the data distribution by grouping values into side-by-side bins and displaying the frequency of each bin as the height of its bar.

24
Q

Explain how to interpret boxplots

A

Boxplots: summarize data with a box spanning the first to third quartiles, a line at the median, and whiskers showing the data range.

25
Q

Explain how to interpret comparative boxplots

A

Comparative Boxplots: compare the distributions of multiple datasets by drawing their boxplots side by side on the same scale.

26
Q

What is the meaning of the z-score?

A

Tells us how many standard deviations above or below the mean our data point is.

27
Q

z-score formula

A

Z = (X - μ) / σ

28
Q

What does each symbol mean?
Z = (X - μ) / σ

A

Z is the z-score.
X is the data point you want to convert.
μ is the mean of the dataset.
σ is the standard deviation of the dataset.
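
The formula can be sketched as a one-line function. The exam numbers (mean 70, standard deviation 8) are assumptions chosen for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical exam: mean 70, standard deviation 8.
print(z_score(86, 70, 8))  # 2.0  (two standard deviations above the mean)
print(z_score(66, 70, 8))  # -0.5 (half a standard deviation below the mean)
```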

29
Q

Describe how to use the 68-95-99.7 rule to approximate the area under the normal curve between any two data points or z-scores, or to the left of a given z-score, or to the right of a given z-score.

A

About 68% is within one standard deviation of the mean.
About 95% is within two standard deviations.
About 99.7% is within three standard deviations.
Because the curve is symmetric, you can split these percentages in half on each side of the mean and use them to get a rough estimate of the area between, to the left of, or to the right of specific z-scores or data points.
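
One way to sketch the rule in code, assuming whole-number z-scores from -3 to 3 only. The function names are hypothetical, and the results are approximations, not table values.

```python
# Approximate area within k standard deviations of the mean (68-95-99.7 rule).
WITHIN = {0: 0.0, 1: 0.68, 2: 0.95, 3: 0.997}

def area_left_of(z):
    """Approximate area to the left of a whole-number z-score from -3 to 3."""
    half = WITHIN[abs(z)] / 2  # area between the mean and z, by symmetry
    return 0.5 + half if z >= 0 else 0.5 - half

def area_right_of(z):
    """Approximate area to the right of z: everything not to the left."""
    return 1 - area_left_of(z)

def area_between(z_low, z_high):
    """Approximate area between two z-scores: left of high minus left of low."""
    return area_left_of(z_high) - area_left_of(z_low)

print(round(area_left_of(1), 3))       # 0.84
print(round(area_right_of(2), 3))      # 0.025
print(round(area_between(-1, 2), 3))   # 0.815
```

For example, the area between z = -1 and z = 2 is half of 68% plus half of 95%, or about 81.5%.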

30
Q

Explain how to find the area under the normal curve between data points or z-scores.

A

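Use a standard normal distribution table (z-table) or software with statistical functions, like R. The area between two z-scores is the area to the left of the larger one minus the area to the left of the smaller one.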
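
The class uses a table or R; as an equivalent sketch in Python, the standard library's `statistics.NormalDist` gives the same areas through its `cdf` method. The helper name `normal_area_between` and the N(70, 8) example are assumptions for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def normal_area_between(low, high, dist=standard_normal):
    """Area under the normal curve between low and high (left of high minus left of low)."""
    return dist.cdf(high) - dist.cdf(low)

# Areas for z-scores on the standard normal N(0, 1).
print(round(normal_area_between(-1, 1), 4))       # 0.6827
print(round(standard_normal.cdf(-2), 4))          # area to the left of z = -2: 0.0228
print(round(1 - standard_normal.cdf(1.5), 4))     # area to the right of z = 1.5: 0.0668

# Areas for raw data points on a hypothetical N(70, 8): between 62 and 86.
exam = NormalDist(mu=70, sigma=8)
print(round(normal_area_between(62, 86, exam), 4))  # 0.8186
```

The last result matches the area between z = -1 and z = 2, since 62 and 86 are one standard deviation below and two above the mean.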

31
Q

What is a sampling distribution?

A

A sampling distribution shows how a specific statistic, like the mean, would vary when calculated from random samples taken from the same group.

32
Q

How do you distinguish between distribution of a sample and sampling distribution?

A

-distribution of a sample refers to the pattern or arrangement of data values within a single sample
-sampling distribution is a theoretical probability distribution that illustrates how a specific statistic varies across multiple random samples drawn from the same population
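
A simulation can make the distinction concrete. This is only a sketch: the skewed population, the sample size of 30, and the 2,000 repetitions are arbitrary choices, and a simulation approximates the theoretical sampling distribution rather than giving it exactly.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 values from a skewed (exponential) distribution.
population = [random.expovariate(1.0) for _ in range(10_000)]

# Distribution of a sample: the data values inside ONE sample.
one_sample = random.sample(population, 30)

# Sampling distribution: the same statistic (the mean) computed from MANY samples.
sample_means = [
    statistics.mean(random.sample(population, 30)) for _ in range(2_000)
]

print("spread of values in the population:", round(statistics.stdev(population), 3))
print("spread of values in one sample:    ", round(statistics.stdev(one_sample), 3))
print("spread of the sample means:        ", round(statistics.stdev(sample_means), 3))
```

The sample means cluster much more tightly around the population mean than individual values do.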

33
Q

What does “random” mean in statistical terms?

A

refers to a process or event that occurs without any predictable pattern or bias.

34
Q

How can you ensure your sample is both random and representative? (What strategies could you use and how are they different?)

A

Random Sampling: Use random methods to select individuals or elements, ensuring each has an equal chance of being chosen.

Systematic Sampling: Select every “k-th” individual from a random starting point, balancing randomness and structure, like if you picked every 10th person that walked into U-REC.

Cluster Sampling: Randomly select clusters and sample all individuals within them, practical for large populations.

Convenience Sampling: Not random. It is based on convenience and may introduce bias, so it is best for quick data but not ideal for representativeness.
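
The three random strategies above can be sketched as follows. The population of 100 people in 10 groups and all function names are hypothetical.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: 100 people, 10 per group (cluster).
population = [{"id": i, "group": i // 10} for i in range(100)]

def simple_random_sample(units, n):
    """Every individual has an equal chance of being chosen."""
    return random.sample(units, n)

def systematic_sample(units, k):
    """Random starting point, then every k-th individual."""
    start = random.randrange(k)
    return units[start::k]

def cluster_sample(units, n_clusters):
    """Randomly choose whole clusters, then keep everyone inside them."""
    groups = sorted({u["group"] for u in units})
    chosen = set(random.sample(groups, n_clusters))
    return [u for u in units if u["group"] in chosen]

print([u["id"] for u in simple_random_sample(population, 10)])
print([u["id"] for u in systematic_sample(population, 10)])
print(sorted({u["group"] for u in cluster_sample(population, 2)}))
```

Simple random sampling picks individuals, systematic sampling picks positions in a list, and cluster sampling picks whole groups.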

35
Q

Why is it so important to have a sample that is both random and representative?

A

This is important because it ensures that the findings correctly represent the population, which minimizes bias and improves the statistical correctness of the results.

36
Q

Describe the main concept behind the central limit theorem.

A

States that as the sample size gets larger, the sampling distribution of the sample mean becomes more “normal/bell-shaped”, regardless of the original population’s shape.
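
A rough simulation of this idea, assuming a strongly right-skewed (exponential) population. Skewness near 0 indicates a symmetric, bell-like shape; the helper names and the sample sizes 1, 5, and 50 are illustrative choices.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

def skewness(values):
    """Average cubed z-score: about 0 for symmetric data, positive for right skew."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def simulated_sample_means(n, reps=5_000):
    """Means of `reps` random samples of size n from a skewed exponential population."""
    return [
        statistics.mean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

for n in (1, 5, 50):
    means = simulated_sample_means(n)
    print(f"n = {n:2d}: skewness of sample means = {skewness(means):.2f}")
```

As n grows, the skewness of the sample means shrinks toward 0, even though the population itself stays skewed.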

37
Q

Explain why the normal model shows up so often & why we spend so much time studying it.

A

We study the normal distribution so often because it shows up in many real-world problems, it helps us with statistical analysis, and it serves as a foundation for various statistical methods.