Chapter 11 statistical sampling Flashcards by Adam Brier

Mean

Sum of the numbers in a data set divided by the total number of values in the data set. Also called the average. Best used with a data set whose values are close together.

Median

Middle value of a data set once the values are arranged in ascending or descending order. With an even number of values, it is the average of the two middle values. Better than the mean for a data set with outliers.
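
A minimal sketch of why the median holds up better than the mean when a data set has an outlier. The values are hypothetical and chosen only for illustration.

```python
from statistics import mean, median

# Hypothetical data: five similar values plus one extreme outlier.
values = [10, 12, 11, 13, 12, 100]

avg = mean(values)    # sum of the values divided by how many there are
mid = median(values)  # average of the two middle values (even count)

print("mean:", avg)
print("median:", mid)
```

The single outlier pulls the mean well above most of the values, while the median stays at the middle of the data.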

Random Variable

A variable that describes all of the possible outcomes of a random process. For example, if X represents a coin flip, then X=1 when it is heads and X=0 when it is tails.

Discrete

The total number of possible outcomes is countable. An example is heads or tails.

Continuous

The total number of possible outcomes is uncountable. An example is time measurements.

Probability Density Function

A continuous probability distribution function. This means that for any measurement x_1, there exists a corresponding value f(x_1).

Empirical Probabilities

Probabilities estimated from observed data.

Expected Value

Also known as the mean or average of the probability distribution. Can be thought of as the outcome we should expect on average.

E(x)=Sum(x*P(x))
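
The formula above can be sketched in a few lines. The fair six-sided die used here is an assumed example, not one taken from the deck.

```python
# Hypothetical distribution: a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]
probabilities = [1 / 6] * 6  # each face is equally likely

# E(x) = Sum(x * P(x)): weight each outcome by its probability and add up.
expected_value = sum(x * p for x, p in zip(outcomes, probabilities))

print("expected value:", expected_value)
```

The result is about 3.5, a value the die can never actually show, which is why the expected value is best read as a long-run average.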

Random Sampling

A method of choosing a representative subset from a larger population by chance. The types are simple random sampling, stratified random sampling, cluster sampling, and systematic random sampling.

Sampling

Using a part of a population (a sample) to describe the whole group.

Population

All members of a specified group.

Simple Random Sampling

A type of random sampling where every member of the population has an equal, unsystematic chance of selection. Best used when a researcher does not know a lot about the demographics of the population.

Stratified Random Sampling

Divide members of a population into ‘strata’, or homogeneous subgroups, and then sample from each one. It differs from simple random sampling in that you separate the population into groups first. Strata cannot have crossover: each member belongs to exactly one stratum. Together, the strata must include all members of the population.
An example is splitting up a high school into freshman, sophomore, junior, and senior students, then deciding how many students from each group are needed for the sample. Best used when the researcher is familiar with the demographics. No more than four to six strata is recommended, but you can have as many as you want.
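
The high school example can be sketched as follows. The roster, the 200-student size, and the choice of 5 students per grade are all assumptions made for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical roster of 200 students: (student_id, grade level).
grades = ["freshman", "sophomore", "junior", "senior"]
roster = [(i, grades[i % 4]) for i in range(200)]

# Step 1: separate the population into non-overlapping strata.
strata = {g: [s for s in roster if s[1] == g] for g in grades}

# Step 2: draw a simple random sample from every stratum.
per_stratum = 5  # assumed number of students to take from each grade
sample = []
for g in grades:
    sample.extend(random.sample(strata[g], per_stratum))

print(sample)
```

Every student falls into exactly one stratum, and every stratum contributes to the sample.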

Cluster Random Samples

A sampling method where whole groups (clusters) within a population are used as the sample. Clusters cannot have crossover and together must include all members of the population. Unlike stratified sampling, cluster sampling does not need an equal selection from each group, but the clusters should be as close to the same size as possible. Use this when the entire population is unclear or unknown.

Systematic Random Sampling

Requires selecting samples based on a system of intervals in a population. For example, selecting every 4th customer in a movie theater. This works only if the population is homogeneous and the list is in random order.
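
A minimal sketch of interval-based selection, assuming a hypothetical list of 40 customers and an interval of 4. The random starting point is one common way to begin; the deck itself does not specify it.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

customers = list(range(1, 41))  # hypothetical customer numbers 1..40
k = 4                           # interval: take every 4th customer

# Assumed convention: pick a random starting position in the first interval.
start = random.randrange(k)
sample = customers[start::k]    # then step through the list k at a time

print(sample)
```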

Law of Large Numbers

Theorem stating that the larger the sample size, the closer the sample mean tends to be to the mean of the population.
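
A small simulation can illustrate the idea. This sketch assumes a fair coin coded as 1 for heads and 0 for tails, so the population mean is 0.5; the sample sizes are arbitrary.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

def sample_mean(n):
    """Mean of n simulated fair coin flips (1 = heads, 0 = tails)."""
    return sum(random.randint(0, 1) for _ in range(n)) / n

# Larger samples should land closer to the population mean of 0.5.
for n in (10, 1_000, 100_000):
    print(n, sample_mean(n))
```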

Normal Distribution

Roughly bell-shaped distribution that occurs over and over throughout populations and samples.

Central Limit Theorem

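If you repeat a random experiment enough times with large enough samples, the sample means will approximately follow a normal distribution, even when the underlying population does not. This holds only if each sample is drawn at random; for example, new students added to a data set must come from a random sampling of students.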
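
A rough simulation of this idea, assuming samples of 30 values drawn from a uniform distribution on 0 to 1 (which is flat, not bell-shaped). The sample size and number of repetitions are arbitrary choices.

```python
import random
from statistics import mean, stdev

random.seed(7)  # fixed seed so the sketch is repeatable

sample_size = 30     # assumed size of each random sample
repetitions = 2_000  # assumed number of repeated experiments

# Record the mean of each random sample.
sample_means = [
    mean(random.random() for _ in range(sample_size))
    for _ in range(repetitions)
]

# The sample means cluster tightly around the population mean of 0.5.
print("mean of sample means:", mean(sample_means))
print("spread of sample means:", stdev(sample_means))
```

Plotting a histogram of `sample_means` would show the bell shape; the sketch only prints the center and spread to stay dependency-free.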

Mean

Average value of all individuals in the sample.

Standard Error

A measure of how accurately your sample mean estimates the true mean of the population.
To find the standard error:
Standard error = (standard deviation) / sqrt(sample size)
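
The formula above can be worked through on a small data set. The scores are hypothetical, and the sketch assumes the sample standard deviation (dividing by n - 1), which is what `statistics.stdev` computes.

```python
from math import sqrt
from statistics import stdev

scores = [70, 75, 80, 85, 90]  # hypothetical sample of test scores

sd = stdev(scores)             # sample standard deviation (n - 1 divisor)
se = sd / sqrt(len(scores))    # standard error = sd / sqrt(sample size)

print("standard deviation:", sd)
print("standard error:", se)
```

Because the sample size sits under a square root, quadrupling the sample size only halves the standard error.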

Regression Line

A straight line that attempts to describe the relationship between two variables so that one can be predicted from the other. Also called a trend line or line of best fit.

Simple Linear Regression

A method of predicting a variable Y that depends on a second variable X, using the regression equation of a given set of data.

Scatterplot

A graph of ordered pairs showing a relationship between two sets of data.

Correlation

The relationship between two sets of variables used to describe or predict information.

Regression Analysis

Study of two variables in an attempt to find a relationship, or correlation.

Independent Variable

A condition or piece of data in an experiment that can be controlled or changed.

Dependent Variable

A condition or piece of data in an experiment that is controlled or influenced by an outside factor, most often the independent variable.

Positive Correlation

The dependent variables and the independent variables in the data set increase or decrease together.

Negative Correlation

The dependent variables and independent variables in a data set move in opposite directions: as one increases, the other decreases.

Causation

An observed event or action appears to have caused a second event or action.

Chi-square

A statistical test used to compare expected data with what we collected.

Null hypothesis

The prediction that there is no interaction between variables. If there is a big enough difference between the observed and expected scores, then we can say something significant happened, which means rejecting the null hypothesis.

Chi squared definition and formula

Chi-squared = Sum((O-E)^2/E), where O is the observed data and E is what you expected, added up over all categories. The p-value needs to be under .05 for the result to be considered significant. A statistical test used to compare expected data with what we collected.

Degrees of Freedom

Number of categories - 1 = Degrees of freedom
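
The chi-squared statistic and its degrees of freedom can be computed by hand as sketched below. The observed counts are hypothetical (100 coin flips), and the expected counts assume a fair coin.

```python
# Hypothetical experiment: 100 coin flips, counts of heads and tails.
observed = [48, 52]
expected = [50, 50]  # what a fair coin would give on average

# Add up (O - E)^2 / E over every category.
chi_squared = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom = number of categories - 1.
degrees_of_freedom = len(observed) - 1

print("chi-squared:", chi_squared)
print("degrees of freedom:", degrees_of_freedom)
```

The statistic is then compared against a chi-squared table at the .05 level for the given degrees of freedom. With 1 degree of freedom the critical value is about 3.84, so a statistic this small would not reject the null hypothesis.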

Formula for the ax+b regression line

a = (n*Sum(x*y) - Sum(x)*Sum(y)) / (n*Sum(x^2) - (Sum(x))^2)
b = (1/n) * (Sum(y) - a*Sum(x))
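
A minimal sketch of the slope and intercept formulas applied to a small data set. The five (x, y) pairs are hypothetical.

```python
# Hypothetical data points.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

# Slope a and intercept b of the line y = a*x + b.
a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b = (sum_y - a * sum_x) / n

print("slope a:", a)
print("intercept b:", b)
```

For this data the line comes out close to y = 0.6x + 2.2, which can then be used to predict y for a new x.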

Chapter 11 statistical sampling Flashcards

(35 cards)