Lec 1 & TB Flashcards

Question 1

Q

2 main types of data

Answer

A

Continuous
Discrete

Question 2

Q

Define discrete data or 2 types of discrete data
- 3 types of categorical data

Answer

A

Discrete data:

Counts: # of times something happens
Categorical: put in categories
- Binary (0/1)
- Ordered/ordinal: the order is meaningful; but the distance b/w each is not
  - E.g. S, M, L; agree, neutral, disagree
  - IOW: no meaningful distance b/w S to M
- Unordered/nominal: names; no logical order
  - race, sex

Question 3

Q

Define continuous data

2 types of cont data

Answer

A

Cont data: measured quantities that can, in principle, be measured to infinite precision (eg height, weight, BP); differences b/w values are meaningful
Sometimes data that are technically discrete are treated like cont data (eg SF-26 QoL instrument)
- Eg: likert scale is ordinal
  - Goal: add them up -> total #
  - Since the summed items are different variables, the total behaves LIKE a cont variable
2 types of cont data
- Ratio: has TRUE meaningful zero (eg height, weight)
- Interval: zero is arbitrary (eg scale data, temp)
  - Eg 0 in dC: freezing pt h2o
  - Eg 0 in dF: freezing pt of salt h2o
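A minimal sketch of why the arbitrary zero matters: on an interval scale, ratios change when the unit changes, while on a ratio scale they do not. The values and the `c_to_f` helper are illustrative assumptions, not from the lecture.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Interval scale (temperature): differences are meaningful, ratios are not.
low_c, high_c = 10.0, 20.0
ratio_c = high_c / low_c                    # looks like "twice as hot" in dC
ratio_f = c_to_f(high_c) / c_to_f(low_c)    # same two temperatures in dF

# Ratio scale (weight): true zero, so the ratio survives a unit change.
KG_TO_LB = 2.20462
low_kg, high_kg = 10.0, 20.0
ratio_kg = high_kg / low_kg
ratio_lb = (high_kg * KG_TO_LB) / (low_kg * KG_TO_LB)

print("temperature ratios:", ratio_c, ratio_f)
print("weight ratios:", ratio_kg, ratio_lb)
```

The temperature ratio depends on the unit, so "20 dC is twice 10 dC" is not a meaningful statement; the weight ratio is the same in kg and lb.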

Question 4

Q

define “X”

Answer

A

“X”: variable of interest
“xi”: the subscript number is a specific item in the data set
N: number in pop
“n”: lower case is # in sample
fi = frequency of xi
f = total # of observations in an interval
∑ = sum
Greek letters represent pop characteristics (parameters)
- µ = pop mean
- σ²= pop variance
- σ = pop SD
Roman letters rep sample characteristics (stats)
- x̄ = sample mean
- s²: sample variance
- s = sample sd

Question 5

Q

Type of data described by descriptive statistics

What does the distribution of data tell us?

Answer

A

Descriptive stats describe characteristics relating to distribution of data
The most appropriate descriptive stats depend on data distribution
Distribution of data = pattern of observations

Question 6

Q

mean

Where is the mean if the graph is right skewed?
Where is the mean if the graph is left skewed?
Geometric mean
Arithmetic mean

median

define
odd vs even # of observations

mode

Answer

A

mean = avg

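+ve or right skewed: mean is pulled to the right, toward the long tail (typically mean > median)
-ve or left skewed: mean is pulled to the left, toward the long tail (typically mean < median)
Geometric mean: multiply then root (n-th root of the product of n values)
Arithmetic mean: add then divide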

Median

(Q2): middle data value
Odd # of observations: median = middle #
Even # of observations: median = avg of 2 middle values

Mode

The # that occurs most frequently in the data set
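A short sketch of the three measures using Python's standard `statistics` module. The data set is a made-up right-skewed sample (one large value forms the long right tail), so the mean should land above the median.

```python
import statistics

# Hypothetical right-skewed sample: the 20 is the long right tail.
data = [1, 2, 2, 3, 4, 5, 20]

arithmetic_mean = statistics.mean(data)            # add, then divide by n
geometric_mean = statistics.geometric_mean(data)   # multiply, then take the n-th root
median = statistics.median(data)                   # odd n: the middle value
mode = statistics.mode(data)                       # most frequent value

even_median = statistics.median([1, 2, 3, 4])      # even n: avg of the 2 middle values

print("arithmetic mean:", round(arithmetic_mean, 2))
print("geometric mean:", round(geometric_mean, 2))
print("median:", median, "| mode:", mode, "| even-n median:", even_median)
```

The tail value drags the arithmetic mean above the median, while the median and mode stay put.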

Question 7

Q

5 number summary
How to get Q1 and Q3

Answer

A

5 # summary: min, Q1, Q2, Q3, max
Quartiles
- Sort the data
- Q1 = (n+1)/4 th ordered observation
- Q3 = 3(n+1)/4 th ordered observation
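The steps above can be sketched as follows. The notes do not say what to do when (n+1)/4 is not a whole number; linear interpolation between the two neighbouring observations is assumed here, and the function names are hypothetical.

```python
def ordered_observation(sorted_data, position):
    """Value at a 1-based, possibly fractional, position in sorted data.

    Fractional positions are linearly interpolated (an assumption)."""
    lower = int(position)             # whole part of the position
    fraction = position - lower       # how far to move toward the next value
    if lower < 1:
        return sorted_data[0]
    if lower >= len(sorted_data):
        return sorted_data[-1]
    below = sorted_data[lower - 1]
    above = sorted_data[lower]
    return below + fraction * (above - below)


def five_number_summary(data):
    """Return (min, Q1, Q2, Q3, max) using the (n+1)-based positions."""
    ordered = sorted(data)            # step 1: sort the data
    n = len(ordered)
    q1 = ordered_observation(ordered, (n + 1) / 4)
    q2 = ordered_observation(ordered, (n + 1) / 2)
    q3 = ordered_observation(ordered, 3 * (n + 1) / 4)
    return ordered[0], q1, q2, q3, ordered[-1]


print(five_number_summary([7, 1, 3, 9, 5, 11, 13]))
```

With n = 7 the positions are 2, 4 and 6, so no interpolation is needed; with n = 4 they are 1.25, 2.5 and 3.75, which is where the interpolation assumption comes in.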

Question 8

Q

Formulas

sample variance
coefficient of variation

Answer

A

Sample variance: s² = ∑(xi − x̄)² / (n − 1)

Sample standard deviation: s = √s²

Interquartile range: IQR = Q3 − Q1

Range: Max − Min

Coefficient of variation: CV = (s/x̄) × 100% (only valid for ratio data)
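A small worked sketch of these spread measures on a made-up data set, written out by hand so each formula is visible. The data and function name are illustrative assumptions.

```python
import math


def sample_variance(data):
    """s² = sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]       # hypothetical ratio-scale measurements

x_bar = sum(data) / len(data)          # sample mean
s2 = sample_variance(data)             # sample variance (squared units)
s = math.sqrt(s2)                      # sample sd (back in original units)
data_range = max(data) - min(data)     # range = max - min
cv = s / x_bar * 100                   # coefficient of variation, in %

print(f"mean = {x_bar}, s² = {s2:.3f}, s = {s:.3f}")
print(f"range = {data_range}, CV = {cv:.1f}%")
```

Here the squared deviations sum to 32, so s² = 32/7; dividing by n − 1 rather than n reflects the one degree of freedom used up by estimating the mean.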

Question 9

Q

degree of freedom

Why do we √ the variance

When is the empirical rule used?

Empirical rule

How do we determine outliers if data is asymmetrical, not normally distributed

Answer

A

Degrees of freedom

df of an estimate is the # of independent pieces of info used to obtain the estimate
√ the variance gives us the sd, and restores the original unit
If the freq distribution is symmetrical and bell-shaped (= normal distribution), the empirical rule is used
Empirical rule: for a normal distribution, nearly all the data lie w/in 3 sd of the mean
- 68% of data lie w/in interval µ +/- σ
- 95% of data in µ +/- 2σ
- 99.7% in µ +/- 3σ
IOW: only ~0.3% of data lie beyond 3 sd of the mean (potential outliers)
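The 68 / 95 / 99.7 figures can be checked against the normal curve itself. This sketch uses `statistics.NormalDist` from the standard library; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution lying within k sd of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} sd: {share_within(k):.4f}")

# Share beyond 3 sd on either side, i.e. the "potential outliers".
beyond_3_sd = 1 - share_within(3)
print(f"beyond 3 sd: {beyond_3_sd:.4f}")
```

The rule's percentages are rounded versions of these areas, and they only apply when the data are approximately normal.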
When we do not have a normal distribution (data distribution is asymmetric), outliers are identified as values
- < Q1 − (1.5 × IQR)
- > Q3 + (1.5 × IQR)
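A sketch of the IQR fences on a made-up sample. It assumes `statistics.quantiles` with its default "exclusive" method, which places the quartiles at (n+1)-based positions with linear interpolation; other software may use slightly different quartile rules and so give slightly different fences.

```python
import statistics

data = [1, 3, 5, 7, 9, 11, 13, 40]    # hypothetical sample; 40 looks suspicious

q1, q2, q3 = statistics.quantiles(data, n=4)   # quartiles of the sorted data
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Anything outside the fences is flagged as an outlier.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
print(f"fences: ({lower_fence}, {upper_fence}), outliers: {outliers}")
```

These fences are the same cut-offs a box-whisker plot typically uses to decide which points are drawn individually beyond the whiskers.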

Question 10

Q

Graphs that show distribution

Graphs that show association

Models for inferences

Answer

A

distributional: histogram, density plot, box-whisker plot, quantile-quantile (Q-Q) plot
Association: scatter plot
Inferences
- t-tests (parametric), Wilcoxon (non-parametric)
- Linear regression, analysis of variance (ANOVA)

Question 11

Q

Graphs

con
most common graph

Descriptive stats

what is displayed
how do we display relationships

Answer

A

Graphs
- Challenging to display at times
- Usually use dotplots
Descriptive stats
- # or prop (%) of each category
- Crosstabulations b/w categorical v (have multiple v) to display relationships
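A minimal crosstab sketch with the standard library only. The records (smoking status vs disease status) are invented for illustration; the point is the count and proportion per cell.

```python
from collections import Counter

# Hypothetical records: (smoking status, disease status) for 8 subjects.
records = (
    [("smoker", "disease")] * 3
    + [("smoker", "no disease")] * 1
    + [("non-smoker", "disease")] * 1
    + [("non-smoker", "no disease")] * 3
)

cell_counts = Counter(records)                    # count per (row, column) cell
row_totals = Counter(row for row, _ in records)   # count per smoking status

rows = ["smoker", "non-smoker"]
cols = ["disease", "no disease"]

# Row proportions: within each smoking group, the share in each disease category.
row_proportions = {
    (r, c): cell_counts[(r, c)] / row_totals[r] for r in rows for c in cols
}

for r in rows:
    cells = ", ".join(
        f"{c}: {cell_counts[(r, c)]} ({row_proportions[(r, c)]:.0%})" for c in cols
    )
    print(f"{r:>10} | {cells}")
```

Comparing the row proportions (75% vs 25% with disease here) is how the table displays a relationship b/w the two categorical variables.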

Question 12

Q

Inference and stat models

binomial test
Fisher’s exact or χ² test

Answer

A

Inference and stat methods

Binary data: Binomial (prop) test (single sample)
Fisher’s exact or χ² (chi-square) test (comparing samples)
More than 2 categories: use chi-square
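A hand-rolled sketch of two of these tests, to show what is being computed. It assumes a one-sided exact binomial test and a Pearson chi-square on a 2×2 table without continuity correction; the function names and example counts are hypothetical, and real analyses would normally use a stats package.

```python
import math


def binomial_upper_tail(successes, n, p0):
    """P(X >= successes) for X ~ Binomial(n, p0): one-sided exact binomial test."""
    return sum(
        math.comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(successes, n + 1)
    )


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 table (df = 1)."""
    (a, b), (c, d) = table
    total = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    # For df = 1 the chi-square upper-tail area equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Single sample: 8 successes out of 10, null proportion 0.5.
p_binomial = binomial_upper_tail(8, 10, 0.5)

# Two samples: 30/40 vs 10/40 with the characteristic.
chi2, p_chi2 = chi_square_2x2([[30, 10], [10, 30]])

print(f"binomial one-sided p = {p_binomial:.4f}")
print(f"chi-square = {chi2:.1f}, p = {p_chi2:.2g}")
```

When expected cell counts are small, the chi-square approximation is unreliable, which is the usual reason to switch to Fisher's exact test.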

Question 13

Q

Population
Population parameters
Sample
Sample stats

Answer

A

Population vs samples

Pop
- Collection of all possible subjects
- Parameters:
  - µ = mean
  - σ² = variance
  - π = proportion w/ characteristic
- Parameters: unknown constants to estimate
Sample
- Subset of pop (used to estimate pop characteristics)
  - x̄ = mean
  - s² = sample variance
  - p = proportion in the sample w/ characteristic
- Sample stats: variable b/c they depend on the particular sample
  - Used to estimate pop parameters

Mnemonic (Roman letter, Greek letter, meaning):

“m” “mu” “mean”

“s” “sigma” “sd”

“p” “pi” “prop”
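A small simulation sketch of the parameter vs statistic distinction: the population mean is one fixed number, while the sample mean changes from sample to sample. The population (1000 simulated heights) and the seed are assumptions made purely for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of N = 1000 values (e.g. heights in cm).
population = [random.gauss(170, 10) for _ in range(1000)]
mu = statistics.fmean(population)             # parameter: one fixed number
sigma_sq = statistics.pvariance(population)   # pop variance (divides by N)

# Draw several samples of n = 25; each one gives different statistics.
sample_means = []
sample_variances = []
for _ in range(5):
    sample = random.sample(population, 25)
    sample_means.append(statistics.fmean(sample))          # x̄
    sample_variances.append(statistics.variance(sample))   # s² (divides by n - 1)

print(f"mu = {mu:.2f}, sigma² = {sigma_sq:.2f}")
print("sample means:", [round(m, 2) for m in sample_means])
print("sample variances:", [round(v, 2) for v in sample_variances])
```

In practice only one sample is available and µ is unknown, so x̄ and s² serve as the estimates of µ and σ².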

Question 14

Q

Describe box plot

Answer

A

Box whisker

(14 cards)