WEEK 9: STATISTICS Flashcards

1
Q

What is biostatistics

A

is the science of analyzing data and interpreting the results so that they can be applied to solving problems related to biology, health, or related fields

2
Q

what is univariate analysis

A

describes one variable in a data set using simple statistics like counts (frequencies), proportions, and averages

3
Q

what is bivariable analysis

A

uses rate ratios, odds ratios, and other comparative statistical tests to examine the associations between two variables (mostly exposure and outcome)

4
Q

what is multivariable analysis

A

encompasses statistical tests, such as multiple regression models, that examine the relationships among three or more variables

5
Q

what is a variable

A

Any quantity that varies from one entity to another (sometimes within an entity over time)
- any attribute, phenomenon or event that can have different values

6
Q

what are the 2 types of variables

A

quantitative and qualitative

7
Q

nominal variables (qualitative)

A
  • no intrinsic or logical order or value
  • ex. university programs
  • you can assign numbers to the different categories
  • the assigned numbers do not have any other numeric properties
8
Q

ordinal variables (qualitative)

A

Intrinsic order but with no clear or equal differences between levels (a set of ordered categories)
- ex. mild vs. moderate vs. severe pain
- rating scales

9
Q

3 ways to display qualitative data (nominal, ordinal)

A

pie chart, bar chart, frequency tables
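
A frequency table like the one named above can be sketched with Python's standard library. The pain ratings here are made-up illustration data, not values from the course.

```python
from collections import Counter

# Hypothetical ordinal responses (illustration only).
responses = ["mild", "moderate", "mild", "severe", "mild"]

counts = Counter(responses)            # frequency of each category
total = sum(counts.values())
proportions = {level: n / total for level, n in counts.items()}

# Print the rows in their ordinal order: category, count, percent.
for level in ["mild", "moderate", "severe"]:
    print(f"{level:<10}{counts[level]:>3}{proportions[level]:>8.0%}")
```

The same counts are what a bar chart or pie chart would draw.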

10
Q

numeric variable (quantitative)

A
  • any real number; depending on the nature of the variable, it can be expressed in decimals
  • meaningful numeric scales
  • age, blood pressure, # of friends, temperature
  • assigned numbers have full mathematical meaning
11
Q

continuous variable

A
  • can take any value within a range
  • ex. a person's height can be 60 inches or any fraction above or below
  • blood pressure, temp.
  • plotted as a line
12
Q

discrete variable

A
  • can take a finite or limited number of values
  • not continuous
  • a family cannot own 10 1/2 cars
  • age in years, number of drinks
  • can be plotted as dots
13
Q

quantitative variables: interval vs ratio

A

interval:
- difference is meaningful
- no natural zero

ratio:
- ratio is meaningful
- zero means absence of attribute (is natural)

14
Q

Mean

A

is calculated by adding up all the values for a particular variable and dividing that sum by the total number of individuals with a value for the variable (the arithmetic average)

15
Q

median

A

is the value in the middle when you rank the data in ascending or descending order
- Divides the data into 2 equal parts

16
Q

Mode

A

the most frequently occurring value for a particular variable in a data set
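
The mean, median, and mode cards can be checked against a small example. This is a minimal sketch using Python's built-in `statistics` module; the data values are hypothetical.

```python
import statistics

values = [2, 3, 3, 5, 7]  # hypothetical data, already in ascending order

mean = statistics.mean(values)      # (2 + 3 + 3 + 5 + 7) / 5 = 4
median = statistics.median(values)  # middle value of the ranked data = 3
mode = statistics.mode(values)      # most frequently occurring value = 3

print(mean, median, mode)
```

Replacing the 7 with 70 would raise the mean but leave the median and mode unchanged.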

17
Q

histogram

A

a graph that shows the frequency of numerical data using rectangles
- important to choose the intervals (bin widths) carefully
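
To see why the choice of intervals matters, the sketch below counts the same data with two interval widths. The helper `bin_counts` and the ages are hypothetical, and the helper assumes non-negative whole numbers.

```python
from collections import Counter

def bin_counts(values, width):
    """Count values per interval [start, start + width)."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))

ages = [21, 22, 25, 29, 31, 34, 35, 38, 41, 58]  # hypothetical ages

print(bin_counts(ages, 10))  # narrower intervals: more, shorter rectangles
print(bin_counts(ages, 20))  # wider intervals: fewer, taller rectangles
```

Each dictionary key is the start of an interval and each value is the height of that rectangle.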

18
Q

range

A

range for a variable is the difference between the minimum (lowest) and the maximum (highest) values in the data set

19
Q

what are quartiles

A

mark the three values that divide a data set into four equal parts

20
Q

what is the interquartile range

A

captures the middle 50% of values for a numeric variable
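
The range, quartile, and interquartile range cards can be tied together in one short example. This is a sketch with hypothetical data. It uses the default method of `statistics.quantiles`, and other software may use slightly different quartile conventions.

```python
import statistics

values = [1, 2, 3, 4, 5, 6, 7]  # hypothetical data

value_range = max(values) - min(values)         # maximum minus minimum
q1, q2, q3 = statistics.quantiles(values, n=4)  # the three quartile cut points
iqr = q3 - q1                                   # spread of the middle 50%

print(value_range, (q1, q2, q3), iqr)
```

The second quartile is the median, so `q2` matches `statistics.median(values)` for this data.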

21
Q

standard error of the mean

A

adjusts for the number of observations in the data set by dividing the variance by the total number of observations and then taking the square root of that number
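
The calculation described above can be sketched as follows. The card does not say whether the variance is the sample or population variance. This example assumes the sample variance (n - 1 denominator), and the data are hypothetical.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

n = len(values)
variance = statistics.variance(values)  # sample variance (assumption)
sem = math.sqrt(variance / n)           # divide by n, then take the square root

print(round(sem, 3))
```

This is the same as dividing the standard deviation by the square root of n.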

22
Q

confidence intervals

A

Provide information about the expected value of a measure in a source population based on the measured value in a study population
- a larger sample size will yield a narrower confidence interval

23
Q

what does a 95% confidence interval mean

A

a 95% interval is the one usually reported for statistical estimates; it means that 5% of the time the confidence interval is expected to miss the true value of a measure in the source population
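
One common way to build a 95% confidence interval for a mean is the estimate plus or minus about 1.96 standard errors. The sketch below assumes that normal approximation (small samples would normally use a t distribution instead). The helper name `ci95` and the data are hypothetical.

```python
import math
import statistics

def ci95(values):
    """Approximate 95% confidence interval for the mean (normal approximation)."""
    n = len(values)
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
    return mean - z * sem, mean + z * sem

small = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
large = small * 4                 # same values, four times as many observations

print(ci95(small))
print(ci95(large))  # narrower, because the sample is larger
```

Both intervals are centered on the same mean. Only the width changes with sample size.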

24
Q

inferential statistics

A

Techniques that use statistics from a random sample of a population to make evidence-based assumptions (inference) about the values of parameters in the population as a whole

25
Q

null hypothesis

A

there is no difference between the two or more values being compared

26
Q

alternative hypothesis

A

there is a difference between the two or more populations being compared

27
Q

steps in hypothesis testing

A
  1. Take a random sample from the population of interest
  2. Set up two competing hypotheses (based on research questions)
  3. Use sample statistics (mean, frequency) to decide whether to reject or fail to reject the null hypothesis
  4. Determine, if the null hypothesis is really true, what the observed sample statistics would be
28
Q

p value

A

Introduced by Fisher to determine whether the observed sample is consistent with the null hypothesis
- between 0.1 and 0.9: no reason to suspect the null hypothesis is false
- 0.05: the cutoff conventionally used in health research

29
Q

how is p value calculated

A

from observed data based on pertinent test statistic

30
Q

if p = 0.01 what does this mean

A

If p = 0.01, it means that if the null hypothesis is true in the real world (no difference), there is only a 1% chance that the data would show a difference at least as large as the one observed

31
Q

what is the significance level

A

is the p value threshold below which the null hypothesis is rejected, usually 0.05 in health research
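
As an illustration of how a p value comes from a test statistic and is then compared with the significance level, the sketch below assumes a z test statistic with a standard normal distribution under the null hypothesis. The function name `two_sided_p` and the statistic 2.576 are hypothetical.

```python
from statistics import NormalDist

def two_sided_p(z):
    """Two-sided p value for a z statistic under a standard normal null."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05            # significance level
z = 2.576               # hypothetical test statistic from observed data
p = two_sided_p(z)      # roughly 0.01
reject_null = p < alpha

print(round(p, 3), reject_null)
```

Other tests use other test statistics and distributions, but the final comparison of p with the significance level works the same way.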

32
Q

parametric test

A

assumes the variables being examined have particular distributions
- Inferential methods are based on types of distributions (mostly normal)

33
Q

nonparametric test

A

does not make assumptions about the distributions of responses
- Nonparametric tests are used for ranked variables and when the distribution of a ratio or interval variable is non-normal

34
Q

bar chart and pie chart

A

Bar Chart - graph that presents categorical data with rectangle lengths proportional to the values they represent
Pie Chart - circle in which each wedge or slice displays the percentage of participants who provided a particular answer to 1 question

35
Q

kurtosis

A

describes how peaked or flat a bell-shaped distribution is

36
Q

leptokurtic vs platykurtic

A

Leptokurtic - distribution curve is very peaked
Platykurtic - curve is relatively flat

37
Q

unimodal and bimodal

A

Unimodal - has 1 peak
Bimodal - has 2 peaks

38
Q

standard deviation

A

the square root of variance

39
Q

z score

A

is a number that indicates how many standard deviations an individual's response is from the sample mean
ex. a person who is the mean age has a z score of 0. A person whose age is 1 standard deviation above the mean in the population will have a z score of 1
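
The age example above can be reproduced in a few lines. This sketch uses hypothetical ages and the sample standard deviation (the square root of the sample variance).

```python
import statistics

ages = [20, 25, 30, 35, 40]  # hypothetical ages

mean_age = statistics.mean(ages)  # 30
sd_age = statistics.stdev(ages)   # square root of the sample variance

def z_score(x):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean_age) / sd_age

print(z_score(mean_age))           # a person of mean age
print(z_score(mean_age + sd_age))  # a person one standard deviation above the mean
```

Negative z scores indicate values below the mean.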

40
Q

fabrication, falsification and plagiarism

A

Fabrication - creation of fake data (creating fictitious data in a spreadsheet for people who never completed a questionnaire or ever participated in an experiment)
Falsification - misrepresentation of results (modifying extreme values to improve the results of statistical tests, manipulating photographs, or intentionally misreporting a study’s methods to make the study look more rigorous)
Plagiarism - the use of other people’s ideas, words, or images without permission and proper attribution

41
Q

outlier

A

value in a numeric data set that is distant from other observations and outside the expected range of values
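
The card does not give a numeric rule for "distant". One widely used rule of thumb, assumed here, flags values more than 1.5 interquartile ranges below the first quartile or above the third quartile. The data are hypothetical.

```python
import statistics

values = [1, 2, 3, 4, 5, 6, 7, 50]  # hypothetical data with one extreme value

q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # 1.5 x IQR fences (assumption)

outliers = [v for v in values if v < lower or v > upper]
print(outliers)
```

A flagged value is a prompt to check the data; it is not automatically an error.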

42
Q

Steps for identifying a statistical test

A
  1. Select variables to compare
  2. specify the goal of the test
  3. Check variable types
  4. Choose appropriate test for variables
  5. Confirm that assumptions of the test are met
  6. Run test and interpret results
43
Q

Fisher’s Exact Test

A

compares the values of a binomial variable in 2 independent populations
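
For a 2x2 table of counts (two populations by two outcomes), Fisher's exact test can be sketched with the standard library. The function name and the table below are hypothetical. The two-sided p value here sums the probabilities of every table, with the same row and column totals, that is no more likely than the observed one. That is one common convention.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # holding the row and column totals fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every possible table that is no more likely than the observed one
    # (small tolerance guards against floating-point ties).
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Hypothetical counts: population 1 has 3 "yes" and 1 "no"; population 2 has 1 and 3.
p = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p, 4))
```

With counts this small the p value is large, so the null hypothesis of no difference would not be rejected at 0.05.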