DATA DESCRIPTION Flashcards

1
Q

2 types of statistics

A

DESCRIPTIVE: describe study population

INFERENTIAL: use what we know (the sample) to infer what we don’t know (the population)

2
Q

3 key factors in designing research

A
  1. type of variables
  2. level of measurements
  3. extraneous + confounding variables
3
Q

research design model (6)

A
  1. current knowledge
  2. choose hypothesis to test
  3. design experiment
  4. do experiment
  5. statistical analysis
  6. interpret + report
4
Q

5 factors involved in good experimental research design

A

1) sample size and type of sample
2) accurate variables to reduce error
3) valid measuring instrument
4) practical experiment?
5) cost

5
Q

why is it important to use research design

A
  1. smooth operation
  2. efficiency
  3. blueprint for planning
  4. reduce errors
  5. reliability
6
Q

what makes good research design? (3)

A

1) reliability
2) replication
3) validity

7
Q

4 types of validity

A

measurement
internal
external
ecological

8
Q

Type of variable (3)

A

CONTINUOUS - temp (figure on a scale)

DISCRETE - no. of symptoms

CATEGORICAL - ethnicity, gender

9
Q

measurement variables (type of scale) (4)

A

INTERVAL
RATIO
NOMINAL
ORDINAL

10
Q

interval scale

A

order of magnitude
equal intervals on scale

11
Q

ratio scale

A

order of magnitude
equal intervals
absolute zero point

12
Q

nominal scale

A

attributes only named
e.g: gender - male female
ethnicity - white, black, asian

13
Q

ordinal scale

A

attributes only ordered
e.g: 1st, 2nd, 3rd

14
Q

difference between EXTRANEOUS variables
and
CONFOUNDING variables

A

EXTRANEOUS: may affect other variables, not accounted for in the study

CONFOUNDING: type of extraneous variable, directly affects our variables

15
Q

calculate median formula

A

position of median in ordered data = (n+1) / 2
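A minimal Python sketch of the (n+1) / 2 rule, assuming the data are sorted first and that a position ending in .5 means averaging the two neighbouring values. The function name `median_by_position` is made up for illustration.

```python
import math

def median_by_position(values):
    """Return (position, median) using the (n + 1) / 2 rule."""
    ordered = sorted(values)                  # the rule only applies to ordered data
    position = (len(ordered) + 1) / 2         # 1-based position of the median
    lower = ordered[math.floor(position) - 1]
    upper = ordered[math.ceil(position) - 1]
    # lower and upper are the same value when the position is a whole number
    return position, (lower + upper) / 2

print(median_by_position([7, 3, 9, 1, 5]))    # odd n: position 3
print(median_by_position([4, 8, 1, 6]))       # even n: position 2.5
```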

16
Q

what does data look like when it’s:
1) + skewed
2) normally distributed
3) - skewed

A

1) peak to the left, long tail to the right
2) symmetrical, equal on both sides
3) peak to the right, long tail to the left

17
Q

what is a factor

A

a categorical variable used to split data into groups
e.g: two categories: undergrad v postgrad

to compare their median, mode etc

18
Q

MAKING DECISION

if both variables are categorical use…

A

a contingency table
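A small Python sketch of building a contingency table from two categorical variables. The survey rows and category names are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical rows: (level of study, smoker?)
rows = [
    ("undergrad", "yes"), ("undergrad", "no"), ("undergrad", "no"),
    ("postgrad", "yes"), ("postgrad", "yes"), ("postgrad", "no"),
]

counts = Counter(rows)                        # count each (row, column) pair
row_labels = sorted({r for r, _ in rows})
col_labels = sorted({c for _, c in rows})

# table[row][column] = number of observations in that cell
table = {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}

for r in row_labels:
    print(r, table[r])
```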

19
Q

MAKING DECISION

if you have one categorical variable and one continuous use…

A

compare means/medians

or

collapse and use contingency tables

20
Q

what type of data is
1) mean
2) Median
best with

A

1) normal
2) skewed

21
Q

how to calculate a percentile value

A

position = (percentile / 100) X (n+1)

n = number of observations
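A hedged Python sketch of the percentile position rule. The card only gives the position, so interpolating between neighbouring values when the position is not a whole number is an assumption here, as is the name `percentile_value`.

```python
def percentile_value(values, percentile):
    """Value at the given percentile using position = percentile / 100 * (n + 1)."""
    ordered = sorted(values)
    n = len(ordered)
    position = percentile / 100 * (n + 1)     # 1-based position in ordered data
    position = min(max(position, 1), n)       # keep the position inside the data
    index = int(position) - 1                 # 0-based index of the lower neighbour
    fraction = position - int(position)
    if fraction == 0:
        return ordered[index]
    # assumption: interpolate linearly between the two neighbouring values
    return ordered[index] + fraction * (ordered[index + 1] - ordered[index])

data = [2, 4, 6, 8, 10, 12, 14]               # n = 7
print(percentile_value(data, 25), percentile_value(data, 50), percentile_value(data, 75))
```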

22
Q

what is RANGE

A

difference between highest and lowest value

23
Q

what is INTERQUARTILE RANGE

A

difference between upper and lower quartile

24
Q

what is STANDARD DEVIATION

A

measures the typical (roughly average) deviation of observations from the mean

25
Q

what is VARIANCE

A

standard deviation squared

26
Q

what are the upper and lower fences

A

values outside these (below the lower fence or above the upper fence) are outliers

27
Q

how to calculate upper and lower fence

A

Lower fence:
LQ - (1.5 X IQR)

Upper fence:
UQ + (1.5 X IQR)
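A short Python sketch of the fence rule, using the usual convention of subtracting 1.5 X IQR from the lower quartile and adding it to the upper quartile. Quartiles come from `statistics.quantiles`, whose default method is one of several quartile conventions, so other software may give slightly different fences. The data are made up.

```python
import statistics

def fences(values):
    """Return (lower fence, upper fence) from the quartiles and IQR."""
    lq, _median, uq = statistics.quantiles(values, n=4)
    iqr = uq - lq
    return lq - 1.5 * iqr, uq + 1.5 * iqr

data = [2, 4, 4, 4, 5, 5, 7, 30]              # hypothetical data with one extreme value
low, high = fences(data)
outliers = [x for x in data if x < low or x > high]
print(low, high, outliers)
```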

28
Q

elements of a box plot: (top to bottom) (5)

A

1) biggest observation below upper fence
2) UQ
3) Median
4) LQ
5) smallest observation above lower fence

UQ– LQ = IQR

29
Q

What does SD show

large SD and small SD

A

spread of data

LARGE SD: data more spread out

SMALL SD: data closer to mean

30
Q

equation for standard deviation

A

square root of:

sum of (each observation - mean) squared
divided by
(no. of observations - 1)
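The sample standard deviation written out step by step in Python, as a sketch assuming the usual formula: sum the squared deviations from the mean, divide by (no. of observations - 1), then take the square root. The function name and data are illustrative only.

```python
import math

def sample_sd(values):
    """Sample standard deviation: sqrt(sum((x - mean)^2) / (n - 1))."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    variance = squared_deviations / (n - 1)   # variance = SD squared
    return math.sqrt(variance)

data = [2, 4, 4, 4, 5, 5, 7, 9]               # mean 5, squared deviations sum to 32
print(sample_sd(data))
```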

31
Q

difference between categorical v continuous data

A

CATEGORICAL
data adds to a whole
e.g: BMI categories

CONTINUOUS
data on individuals over time
ratio/interval e.g: height

32
Q

what scale is categorical data measured on

A

nominal
ordinal

33
Q

what scale is continuous data measured on

A

ratio
interval

34
Q

graphs for categorical data (3)

A

1) bar chart
2) stacked bar chart
3) pie chart

35
Q

graphs for continuous data (6)

A

1) stem and leaf plot
2) histogram
3) box plot
4) bar chart with error bars
5) scatterplots
6) line graph for time series data

36
Q

when should a scatterplot be used

A

2 continuous variables

37
Q

what is an adjusted v non adjusted axis

A

unadjusted = start from zero

adjusted = start from e.g: 40 as that’s the lowest figure

38
Q

calculate standard error

A

SD divided by square root of number of observations
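A minimal Python sketch of the standard error of the mean, assuming the usual definition SE = SD / square root of n with the sample SD. The data are made up.

```python
import math
import statistics

def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(statistics.stdev(data), standard_error(data))   # SE is smaller than SD
```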

39
Q

when should
1) SD
2) SE
be used

A

1) describe data you have
2) show how confident you are in estimate of the mean

40
Q

what does a histogram show

A

distribution of data

puts into non-overlapping categories e.g: age 1-5, 6-10

x axis = categories
y = frequency in each category

bars touch if continuous
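A rough text histogram in Python to show the binning idea: each value is assigned to a category, and the frequency in each category is counted. The ages and the bin width of 5 are hypothetical.

```python
from collections import Counter

ages = [1, 3, 4, 6, 7, 7, 9, 12, 14]          # hypothetical ages
width = 5

# Each bin is labelled by its starting value: 0-4, 5-9, 10-14, ...
bins = Counter((age // width) * width for age in ages)

for start in sorted(bins):
    # x axis = category, row of # marks = frequency in that category
    print(f"{start:>2}-{start + width - 1:<2} | {'#' * bins[start]}")
```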

41
Q

how does a stem and leaf diagram work

A

stem : all but last digit

leaf : last digit

e.g: 43, 46, 47, 53, 54, 62

4| 3 6 7
5| 3 4
6| 2
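The same split can be sketched in Python, assuming non-negative whole numbers so that the stem is the value divided by 10 and the leaf is the remainder. The function name is made up for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group each value's last digit (leaf) under its leading digits (stem)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stems[value // 10].append(value % 10)
    return dict(stems)

plot = stem_and_leaf([43, 46, 47, 53, 54, 62])
for stem, leaves in plot.items():
    print(f"{stem}| {' '.join(str(leaf) for leaf in leaves)}")
```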

42
Q

adding or subtracting a constant number to each value in data when scaling

1) __________ SD
2) __________ mean

A

1) doesn’t change
2) changes mean by amount added or subtracted

43
Q

when multiplying or dividing each value by a constant number when scaling

1) SD __________
2) mean _________

A

1 and 2) both are multiplied or divided by the same number as the data
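A quick Python check of both scaling rules on made-up data: adding a constant shifts the mean but leaves the SD alone, while multiplying by a constant multiplies both.

```python
import statistics

data = [10, 12, 14, 16, 18]
shifted = [x + 5 for x in data]               # add a constant to every value
scaled = [x * 2 for x in data]                # multiply every value by a constant

for label, values in [("original", data), ("+5", shifted), ("x2", scaled)]:
    print(label, statistics.mean(values), round(statistics.stdev(values), 3))
```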

44
Q

when should
SCALING
and
STANDARDISATION
be used

A

SCALE: to put values in the same units, e.g: one person’s weight in lbs, one in kg

STANDARDISE: to compare against the right reference group, e.g: a boy and a girl both weigh 10kg at 26 months

standardise using gender

45
Q

what does z score show

A

number of SDs an observation is from the mean

46
Q

+ Z score =

- Z score =
A

+ = observation is above the mean

- = observation is below the mean
47
Q

what does it mean if the Z score is zero?

A

observation equals the mean

48
Q

Z score equation

A

(observation - mean) / SD

49
Q

1) mean of Z score =
2) SD of z score =

only when ….

A

1) 0
2) 1

working with whole data set they were collected from
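A small Python sketch that standardises a made-up data set and checks that the Z scores have mean 0 and SD 1. It assumes Z = (observation - mean) / SD, with the mean and SD taken from the same data set.

```python
import statistics

def z_scores(values):
    """Z score of each observation: (observation - mean) / SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(x - mean) / sd for x in values]

data = [10, 12, 14, 16, 18]
z = z_scores(data)
print(z)                                      # negative below the mean, positive above
print(statistics.mean(z), statistics.stdev(z))
```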

50
Q

in normal distribution curve
what is the % from -1SD to +1SD

A

approx. 68% (34.1% + 34.1%)

51
Q

imagine a normal distribution curve split into 8 ‘columns’ (one per SD, plus the two tails beyond 3SD),
name the % of each column going up then down

A

0.13%
2.15%
13.6%
34.1%
34.1%
13.6%
2.15%
0.13%
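These percentages can be reproduced in Python with `statistics.NormalDist`, cutting the standard normal curve at each whole SD from -3 to +3. The printed values may differ slightly from the card in the last digit because of rounding.

```python
from statistics import NormalDist

std_normal = NormalDist()                     # mean 0, SD 1
cuts = [-3, -2, -1, 0, 1, 2, 3]
cdf = [std_normal.cdf(z) for z in cuts]       # area to the left of each cut

# lower tail, six 1-SD-wide columns, upper tail
bands = [cdf[0]] + [b - a for a, b in zip(cdf, cdf[1:])] + [1 - cdf[-1]]

for band in bands:
    print(f"{100 * band:.2f}%")
```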

52
Q

on a ‘NORMAL DISTRIBUTION TABLE’ what does each column mean

A

along the left side: Z score to the first decimal place

along the top: second decimal place of the Z score

e.g: 0.66
0.6 along left side
0.06 along top

53
Q

when can you use a normal distribution table

A

e.g Q: what proportion of data lies between the mean and a Z score of 0.66
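A hedged Python sketch of the same lookup, assuming the table gives the area between the mean and a given Z score. It splits 0.66 into its row and column, then computes the area with `statistics.NormalDist` instead of reading a printed table.

```python
from statistics import NormalDist

z = 0.66
hundredths = round(z * 100)                   # 66
row = hundredths // 10 / 10                   # left side of the table
column = hundredths % 10 / 100                # top of the table

# Area between the mean (Z = 0) and Z = 0.66 under the standard normal curve
area = NormalDist().cdf(z) - 0.5
print(row, column, round(area, 4))
```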

54
Q

what’s the difference between a:
SAMPLE
and
POPULATION

A

SAMPLE: selection from population

POPULATION: whole, large group, everyone who fits the criteria

55
Q

theory of sampling (3)

A

1) STATISTICAL ESTIMATION
point/interval estimate

2) TESTING HYPOTHESIS
accept/reject null

3) STATISTICAL INFERENCES
general population statement

56
Q

limitations of sampling (5)

A
  • less accurate
  • changing of units
  • misleading conclusion
  • need special knowledge
  • is sampling possible?
57
Q

probability sampling methods (4)

A

1) simple random sampling
2) stratified sampling
3) systematic sampling
4) multistage sampling

58
Q

non probability sampling methods (4)

A

1) deliberate sampling
2) convenience sampling
3) snowball sampling
4) quota sampling

59
Q

PROBABILITY sampling methods:

+ and -

A

+:
detailed info of pop

measure precisely, unbiased

-:
require skill + expertise

time to plan

cost

60
Q

simple random sampling

characteristic

A

everyone has equal chance of being chosen

random number generator
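A minimal Python sketch of simple random sampling with a random number generator. The population of 100 ID numbers and the seed are hypothetical; the seed only makes the draw repeatable.

```python
import random

population = list(range(1, 101))              # hypothetical ID numbers 1-100
rng = random.Random(42)                       # seeded so the draw can be repeated

# Every member has the same chance of being chosen; no one is picked twice
sample = rng.sample(population, k=10)
print(sample)
```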

61
Q

stratified sampling

what are strata and what should they have

A

population split into strata (similar groups)

each stratum needs homogeneity

same sampling ratio in each stratum
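A sketch of stratified sampling in Python, assuming "same ratio" means drawing the same fraction at random from every stratum. The strata, their sizes and the function name are made up for illustration.

```python
import random

def stratified_sample(strata, fraction, seed=0):
    """Draw the same fraction at random from every stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, k=round(len(members) * fraction))
        for name, members in strata.items()
    }

# Hypothetical population split into two strata of ID numbers
strata = {"men": list(range(40)), "women": list(range(100, 160))}
picked = stratified_sample(strata, fraction=0.1)
print({name: len(members) for name, members in picked.items()})
```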

62
Q

systematic sampling

+ and -

A

order population, e.g: every 5th person

+: simple
smaller variance v
ordered population

-: estimate error
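A short Python sketch of systematic sampling: order the population, pick a random starting point, then take every 5th person. The random start and the function name are assumptions, not something the card specifies.

```python
import random

def systematic_sample(ordered_population, step, seed=0):
    """Take every step-th member after a random start within the first step."""
    start = random.Random(seed).randrange(step)
    return ordered_population[start::step]

people = list(range(1, 51))                   # hypothetical ordered list of 50 people
sample = systematic_sample(people, step=5)
print(sample)
```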
63
Q

summarise multistage sampling

A

e.g:
1) randomly select region
2) randomly select school in region
3) randomly select children in school
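The three stages can be sketched in Python with a nested structure. The regions, schools and children below are invented purely for illustration.

```python
import random

rng = random.Random(1)                        # seeded so the draw can be repeated

# Hypothetical population: region -> school -> children
regions = {
    "north": {"school A": ["Ann", "Ben", "Cal"], "school B": ["Dee", "Eli", "Fay"]},
    "south": {"school C": ["Gus", "Hal", "Ivy"], "school D": ["Jon", "Kim", "Lee"]},
}

region = rng.choice(sorted(regions))                      # stage 1: region
school = rng.choice(sorted(regions[region]))              # stage 2: school in that region
children = rng.sample(regions[region][school], k=2)       # stage 3: children in that school
print(region, school, children)
```

Only the children in the selected school need to be listed, which is why a complete population list is not required.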

64
Q

multistage sampling + and -

A

+:
complete pop list not needed
only need info on selected sample
cheaper if geographically defined

-:
larger errors

65
Q

NON PROBABILITY SAMPLING + and -

A

+:
include important units
practical
representative of importance

-:
risk of bias
not reliable

66
Q

convenience sampling

when to use (3)

A

use when:
- no clear population
- sampling not clear
- complete list of source not available

67
Q

snowball sampling

A

contact few people in target group
get more contacts from these people

68
Q

quota sampling

A

non random
select categories then quota e.g: 40% men 60% women
actively look for people to fit this

bias
cheaper

69
Q

factors that affect reliability of sample (5)

A

size of sample

representativeness

homogeneity

unbiased

parallel sampling - another sample for test

70
Q

3 errors in samples

A

1) SAMPLING VARIABILITY - diff samples from same pop have diff SD + mean

2) SAMPLING ERROR - mean of sample different to mean of pop

3) NON SAMPLING ERROR - error when asking / recording results

71
Q

SE formula

A

SD / √ number in sample

72
Q

when to use SE instead of SD

A

when using sample means

to determine precision

73
Q

The Central Limit theorem (3)

A

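1) sample means will have an approximately ‘normal distribution’ (for large enough samples)

2) mean of sample means = mean of population

3) SD of sample means = SE (population SD / √ number in sample)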
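A small simulation sketch in Python, assuming a skewed (exponential) made-up population and samples of size 30. It compares the mean and SD of many sample means with the population mean and with population SD / √n; the numbers are approximate and depend on the seed.

```python
import math
import random
import statistics

rng = random.Random(0)

# Hypothetical skewed population (not normally distributed)
population = [rng.expovariate(1.0) for _ in range(20_000)]
n = 30

# Draw many samples and record each sample mean
sample_means = [statistics.mean(rng.sample(population, n)) for _ in range(2_000)]

pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)

print("mean of sample means:", statistics.mean(sample_means), "population mean:", pop_mean)
print("SD of sample means:", statistics.stdev(sample_means), "SD / sqrt(n):", pop_sd / math.sqrt(n))
```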