Statistics 1 Flashcards by Ellie Ruddle

What is the definition of a population?

Every member of a group that has the selected characteristics and shares a common property, e.g. within a specific region

What is the definition of a sample?

A representative sub-set of a given population, whose members are independent of one another and chosen at random

What is the difference between the response (dependent) variable and the explanatory (independent) variable?

The response (dependent) variable is the one of interest in an experiment; it changes in response to another factor, the explanatory (independent) variable.

What are the two sub-sets of qualitative data?

Nominal Data - categorical information that lacks inherent order or ranking

Ordinal Data - information with order or ranking, but the differences between values are not quantifiable, e.g. survey responses or educational levels

What are the two sub-sets of quantitative data?

Discontinuous (discrete) - obtained by counting, so the values are integers

Continuous - (Most used) obtained by measurement e.g. height, BMI

What type of data is:
Number of carbon atoms in a molecule

Discontinuous quantitative

What type of data is:
Mass of a chemical compound weighed on a balance

Continuous quantitative

What type of data is:
Absorbance measured using a spectrophotometer

Continuous quantitative

What type of data is:
Gender of students in a class

Nominal Qualitative

What type of data is:
Educational levels of students in a class

Ordinal Qualitative

Define Accuracy

Closeness of measurements to the true value

Define precision

Closeness of repeated measurements to each other
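
The difference between accuracy and precision can be sketched with a small calculation. The reference mass and the five weighings below are hypothetical values chosen for illustration: the bias of the mean stands in for accuracy, and the standard deviation of the repeats stands in for precision.

```python
import statistics

# Hypothetical data: five repeated weighings (g) of a 10.00 g reference mass.
true_value = 10.00
measurements = [10.21, 10.19, 10.20, 10.22, 10.18]

# Accuracy: how far the average measurement sits from the true value (bias).
bias = statistics.mean(measurements) - true_value

# Precision: how tightly the repeats cluster around their own mean.
spread = statistics.stdev(measurements)

print(f"bias = {bias:.3f} g, spread (SD) = {spread:.3f} g")
```

In this made-up example the repeats agree closely with each other but sit about 0.2 g above the true value, so the balance would be precise but not accurate.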

Define Data Set

Collection of information based on an experiment or research question, collected in terms of observations and variables, ready to be processed, analyzed, distributed or shared.

Define descriptive statistics

Summarize a set of data values in terms of center and spread

What does the average show?

The central (general) tendency of the data

For what distribution of data can you find the true mean?

Normally distributed data

Define variance

Average squared deviation from the mean

Define Standard Deviation

Variability or spread of the data around the mean of the sample; it is the square root of the variance

Define Standard Error

How far the sample mean is expected to deviate from the mean of the population. It is an estimate (SD divided by the square root of the sample size) used to calculate confidence intervals
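
Variance, standard deviation and standard error build on each other, which a short calculation can make concrete. This is a minimal sketch using Python's standard `statistics` module. The heights are hypothetical, and it assumes the sample form of the variance (dividing by n - 1 rather than n), which is what `statistics.variance` computes.

```python
import math
import statistics

# Hypothetical sample: heights (cm) of eight students.
heights = [160, 165, 170, 172, 168, 175, 180, 162]

n = len(heights)
mean = statistics.mean(heights)

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
variance = statistics.variance(heights)

# Standard deviation: square root of the variance (same units as the data).
sd = statistics.stdev(heights)

# Standard error of the mean: SD divided by the square root of the sample size.
se = sd / math.sqrt(n)

print(f"mean={mean}, variance={variance:.2f}, SD={sd:.2f}, SE={se:.2f}")
```

The SE is always smaller than the SD for n > 1, and it shrinks as the sample grows, because a larger sample pins down the population mean more tightly.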

What is the Confidence Interval?

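A range of values, calculated from the sample, that is expected to contain the true population value at a stated level of confidence. For a 95% confidence interval, about 95% of the intervals obtained by repeating the test with different samples would contain the true value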
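
One common way to compute a confidence interval for a mean is mean ± critical value × standard error. The sketch below assumes the normal approximation (critical value of about 1.96 for 95%), which is reasonable for larger samples; for small samples a t critical value would normally be used instead. The function name `confidence_interval` is a hypothetical helper, not a library function.

```python
import math
import statistics
from statistics import NormalDist


def confidence_interval(sample, level=0.95):
    """Return (lower, upper) bounds for the mean, using the normal approximation."""
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value: about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * se, mean + z * se


# Hypothetical sample of 40 measurements.
sample = list(range(1, 41))
lower, upper = confidence_interval(sample)
print(f"95% CI for the mean: {lower:.2f} to {upper:.2f}")
```

Asking for a higher confidence level (e.g. `level=0.99`) gives a wider interval from the same data.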

Give the basic principles of the coefficient of variation (CoV)

The larger the number, the larger the spread relative to the mean
Normally expressed as a percentage of the mean
Useful for comparing 2 data sets in different units

Give the formula for the coefficient of variation (CoV)

CoV = (SD/mean)*100
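
The formula can be applied directly to two data sets in different units. In this minimal sketch the helper name `coefficient_of_variation` and both data sets are hypothetical, and the sample SD is assumed.

```python
import statistics


def coefficient_of_variation(values):
    """CoV = (SD / mean) * 100, i.e. the SD as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100


# Hypothetical data sets measured in different units.
heights_cm = [160, 165, 170, 175, 180]
masses_kg = [50, 60, 70, 80, 90]

print(f"height CoV = {coefficient_of_variation(heights_cm):.1f}%")
print(f"mass CoV   = {coefficient_of_variation(masses_kg):.1f}%")
```

Because the units cancel, the two percentages can be compared directly: here the masses vary more, relative to their mean, than the heights do.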

Define H0

The null hypothesis - there is no correlation/difference/association

Define H1

The alternative hypothesis
there is a correlation/difference/association
H1 and H0 are mutually exclusive

What is the P value?

The probability (chance) of obtaining results at least as extreme as those observed if the null hypothesis were true. 0.05 (5%) is the conventional statistical cutoff for rejecting H0, corresponding to 95% confidence.

What is the true cutoff for the P value?

0.05/number of predictor variables (i.e. the number of tests performed). This is the Bonferroni correction for multiple testing
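
Dividing the cutoff by the number of predictor variables can be sketched as follows. The predictor names and P values are invented for illustration, and `adjusted_cutoff` is a hypothetical helper.

```python
def adjusted_cutoff(alpha, n_predictors):
    """Divide the usual cutoff by the number of predictor variables tested."""
    return alpha / n_predictors


# Hypothetical P values, one per predictor variable.
p_values = {"age": 0.004, "bmi": 0.03, "smoking": 0.012, "height": 0.20}

cutoff = adjusted_cutoff(0.05, len(p_values))  # 0.05 / 4 = 0.0125
significant = [name for name, p in p_values.items() if p < cutoff]

print(f"cutoff = {cutoff}, significant predictors: {significant}")
```

In this example `bmi` (P = 0.03) would pass the plain 0.05 cutoff but fails the adjusted one, which is the point of the stricter threshold when several predictors are tested at once.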

Why is it best to use 2-tailed tests rather than 1-tailed tests?

A hypothesis can be either 1-sided or 2-sided. A 2-tailed test checks for statistical significance in both directions. If you only test in one direction you may miss an effect in the other direction!

What is the odds ratio?

A value indicating the strength of the relationship between 2 variables in data. It compares the relative odds of the occurrence of the outcome of interest (e.g. cancer vs no cancer), given exposure to the variable of interest (e.g. age)

What does the Odds Ratio mean in relation to 1?

- OR = 1: the variable does not affect the odds of the outcome
- OR > 1: the variable is associated with higher odds of the outcome (increases the risk of the response variable)
- OR < 1: the variable is associated with lower odds of the outcome (decreases the risk of the response variable)
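
An odds ratio can be worked out from a 2x2 table of counts. This is a minimal sketch with hypothetical counts; the function and argument names are illustrative only.

```python
def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Odds of the outcome among the exposed divided by the odds among the unexposed."""
    odds_exposed = exposed_cases / exposed_controls
    odds_unexposed = unexposed_cases / unexposed_controls
    return odds_exposed / odds_unexposed


# Hypothetical counts: 30 of 100 exposed people and 10 of 100 unexposed
# people develop the outcome.
or_value = odds_ratio(30, 70, 10, 90)
print(f"OR = {or_value:.2f}")
```

With these counts the OR is well above 1, so exposure is associated with higher odds of the outcome. Equal odds in both groups would give exactly 1.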

What is the Z score?

Log odds ratio / standard error of the log odds ratio
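
In practice this ratio is usually computed on the log scale. The sketch below assumes the standard large-sample approach for a 2x2 table, where the standard error of ln(OR) is sqrt(1/a + 1/b + 1/c + 1/d). The counts and the function name `odds_ratio_z` are hypothetical.

```python
import math
from statistics import NormalDist


def odds_ratio_z(a, b, c, d):
    """Z score for an odds ratio from a 2x2 table.

    a, b = exposed cases / exposed controls
    c, d = unexposed cases / unexposed controls
    """
    log_or = math.log((a * d) / (b * c))
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = log_or / se_log_or
    # Two-sided P value from the standard normal distribution.
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


z, p = odds_ratio_z(30, 70, 10, 90)
print(f"z = {z:.2f}, P = {p:.4f}")
```

A z score far from 0 corresponds to a small P value, i.e. an odds ratio that is unlikely to differ from 1 by chance alone.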

What are statistics tests used for?

To assess how probable the observed data would be if the null hypothesis were true, and so whether H0 can be rejected

When would you use z-test?

When the sample size is large (n≥30) and/or the population variance is known

When would you use t-test?

When the sample size is small (n<30) and/or the population variance is unknown

When would you use Chi-squared?

Goodness of fit - examine whether the observed results are in line with the expected values (categorical data)
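
The goodness-of-fit statistic is the sum of (observed - expected)² / expected over all categories. A minimal sketch, with hypothetical counts, compares the statistic with the tabulated critical value for 2 degrees of freedom at the 0.05 level (5.991):

```python
def chi_squared_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories (both lists total 100).
observed = [50, 30, 20]
expected = [40, 40, 20]

chi2 = chi_squared_statistic(observed, expected)

# Critical value for df = 3 categories - 1 = 2 at the 0.05 level.
critical_value = 5.991
reject_h0 = chi2 > critical_value

print(f"chi-squared = {chi2:.2f}, reject H0: {reject_h0}")
```

Here the statistic falls below the critical value, so these made-up observations would not be judged significantly different from the expected values.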

When would you use Fisher Exact?

Test of association - gauge if there is a significant difference between the proportions of the categories in two group variables (a contingency table), typically used when sample sizes are small

When would you use F-test?

Compare variances of 2 samples or the ratio of variances between multiple groups

When would you use ANOVA?

Uses F-tests to statistically test the equality of means across 3 or more groups of a quantitative variable
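
The F statistic behind a one-way ANOVA is the ratio of the variation between group means to the variation within groups. This is a minimal hand-rolled sketch with hypothetical data; `one_way_anova_f` is an illustrative helper, and turning F into a P value would additionally need the F distribution, which is not covered here.

```python
import statistics


def one_way_anova_f(*groups):
    """F = mean square between groups / mean square within groups."""
    k = len(groups)
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = statistics.mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual values around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical measurements for three groups.
f_stat = one_way_anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
print(f"F = {f_stat:.2f}")
```

A large F means the group means differ by much more than the scatter inside each group would explain.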

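When would you use Wilcoxon Rank tests?

Compare two groups (rank-sum for independent samples, signed-rank for paired samples) - used when data is not normally distributed. For 3 or more groups the rank-based equivalent is the Kruskal-Wallis test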

What does the result of a t-statistic mean?

The higher the absolute value of t, the lower the chance that the two sample means are from the same population, i.e. the more likely the two sample means are to be different.
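
A t-statistic for two independent samples is the difference between the sample means divided by the standard error of that difference. The sketch below assumes the unequal-variance (Welch) form; the data and the name `welch_t` are hypothetical.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """t = difference in means / standard error of the difference."""
    mean_diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    se_diff = math.sqrt(
        statistics.variance(sample_a) / len(sample_a)
        + statistics.variance(sample_b) / len(sample_b)
    )
    return mean_diff / se_diff


# Hypothetical measurements.
control = [5.1, 5.3, 4.9, 5.0, 5.2]
similar = [5.0, 5.2, 4.8, 5.1, 5.1]
shifted = [6.1, 6.3, 5.9, 6.0, 6.2]

print(f"control vs similar: t = {welch_t(control, similar):.2f}")
print(f"control vs shifted: t = {welch_t(control, shifted):.2f}")
```

The sign of t only shows which sample has the larger mean; it is the size of |t| that indicates how unlikely it is that both samples come from the same population.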

What is a Type I error?

False positive. Occurs when you reject H0 although it is true, e.g. due to data bias
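
The link between the 0.05 cutoff and false positives can be illustrated with a small simulation. This is a sketch under stated assumptions: every sample is drawn from a population where H0 is true (mean 0, known SD 1), a two-tailed z-test is applied, and the seed, sample size and number of trials are arbitrary choices.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(0)  # fixed seed so the run is repeatable

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for a two-tailed test
n, trials = 30, 2000

false_positives = 0
for _ in range(trials):
    # H0 is true by construction: the population mean really is 0.
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = fmean(sample) / (1 / math.sqrt(n))  # known population SD of 1
    if abs(z) > z_crit:
        false_positives += 1  # H0 rejected although it is true

type_i_rate = false_positives / trials
print(f"Type I error rate: {type_i_rate:.3f}")
```

Even with unbiased data, roughly 5% of the tests reject a true H0, which is what choosing a 0.05 cutoff accepts in advance.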

What is a Type II error?

False negative. Occurs when you fail to reject (accept) H0 although it is false, e.g. due to a lack of power

Statistics 1 Flashcards

(41 cards)