Data Collection, Sampling and Descriptive Statistics Flashcards

Question 1

Q

Data Collection Techniques (5)

Answer

A

Observations
Tests and assessments
Surveys
Document analysis (published articles)
Interviews
Cannot mix the techniques

Question 2

Q

Types of Data (2)

Answer

A

Primary: data that you collected
Secondary: data that someone else collected

Question 3

Q

Secondary Data Disadvantages (6)

Answer

A

May be out of date (limited by time)
May not have been collected long enough to detect trends.
May be missing info on some observations
May be incomplete
No control over data quality
Data collection may be estimated

Question 4

Q

Secondary Data Advantages (4)

Answer

A

Saves time
Saves money
Easily accessible
Makes collaboration easy, for example multicenter collaboration on rare diseases.

Question 5

Q

Primary Data Disadvantages (4)

Answer

A

Can be expensive to collect
Selection of population or sample
Difficulty recruiting participants
Pretesting the instrument to determine presence or absence of measurement bias.

Question 6

Q

Probabilistic Sampling Methods (4)

Answer

A

Simple random
Stratified random
Systematic random
Clustered random
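
The four methods above can be sketched in a few lines. This is a minimal illustration, not a survey-grade implementation: the population of IDs 1 to 100, the two age strata, and the ten "clinic" clusters are all hypothetical, and the function names are invented for this example.

```python
import random

def simple_random(population, k, rng):
    """Simple random: every unit has the same chance of being selected."""
    return rng.sample(population, k)

def systematic_random(population, k, rng):
    """Systematic random: random starting point, then every step-th unit."""
    step = len(population) // k
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(k)]

def stratified_random(strata, k_per_stratum, rng):
    """Stratified random: a simple random sample drawn inside each stratum."""
    return {name: rng.sample(units, k_per_stratum) for name, units in strata.items()}

def cluster_random(clusters, n_clusters, rng):
    """Cluster random: pick whole clusters at random and keep all their units."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: participant IDs 1..100
population = list(range(1, 101))
strata = {"under_40": list(range(1, 61)), "40_plus": list(range(61, 101))}
clusters = {f"clinic_{c}": list(range(c * 10 + 1, c * 10 + 11)) for c in range(10)}

srs = simple_random(population, 10, rng)
sys_sample = systematic_random(population, 10, rng)
strat = stratified_random(strata, 5, rng)
clus = cluster_random(clusters, 2, rng)
```

The stratified version guarantees that every stratum is represented, while the cluster version keeps every unit of the few clusters it selects.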

Question 7

Q

Non-Probabilistic Sampling Methods (3)

Answer

A

Convenience
Purposive
Snowball

Question 8

Q

What are Descriptive Statistics used for? (3)

Answer

A

To summarize data, describe data and present data.

Question 9

Q

Types of Descriptive Statistics (4)

Answer

A

Measures of frequency: count, percent and frequency (how often an observation occurs).
Measures of central tendency: mean, median and mode (locate the middle or center of the distribution).
Measures of dispersion or variability: range, variance, standard deviation (typical distance between the observed scores and the mean) and interquartile range.
Measures of position and rank: Percentile ranks, quartile.
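
As a small illustration of the measures of frequency, the sketch below builds a count and percent table. The blood-type values are hypothetical and the function name is invented for this example.

```python
from collections import Counter

def frequency_table(values):
    """Return {value: (count, percent)} for each distinct value."""
    counts = Counter(values)
    n = len(values)
    return {v: (c, 100 * c / n) for v, c in sorted(counts.items())}

# Hypothetical observations
blood_types = ["A", "O", "O", "B", "O", "A", "AB", "O"]
table = frequency_table(blood_types)

for value, (count, percent) in table.items():
    print(value, count, f"{percent}%")
```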

Question 10

Q

Mean

Answer

A

Average
Mean = (Y1+Y2+…+Yn)/n
Y: variable
Y1: 1st observation of variable Y
Yn: last observation of variable Y
n: number of observations in sample
The mean is sensitive to outliers: a few extreme values can make it a poor measure of central tendency.
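
A short sketch of the formula, using hypothetical scores, shows how a single extreme value pulls the mean away from the bulk of the data.

```python
def mean(values):
    """Mean = (Y1 + Y2 + ... + Yn) / n"""
    return sum(values) / len(values)

scores = [4, 5, 5, 6, 5]        # hypothetical observations
with_outlier = scores + [50]    # same data plus one extreme value

print(mean(scores))        # 5.0
print(mean(with_outlier))  # 12.5
```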

Question 11

Q

Median

Answer

A

Put all values in rank order. The median is the value that splits the dataset into two equal halves. It is the same as the 50th percentile.
If there is an even number of values, the median is the average of the two middle values.
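
The rule can be sketched as follows; the input values are hypothetical.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 1, 5]))     # 5
print(median([7, 1, 5, 3]))  # 4.0
```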

Question 12

Q

Mode

Answer

A

Observation with the highest frequency.
Can have more than one mode: Bimodal (2 modes).
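
A minimal sketch that returns every mode, so a bimodal dataset yields two values. The data are hypothetical.

```python
from collections import Counter

def modes(values):
    """Return all values that share the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 2, 2, 3]))     # [2]
print(modes([1, 1, 2, 3, 3]))  # [1, 3]  (bimodal)
```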

Question 13

Q

Finding the mean when you have a bar chart with class intervals

Answer

A

You cannot find the exact mean when the data are grouped into class intervals, but you can estimate it using the midpoint of each interval.
Estimated mean = Σ(frequency × midpoint) / Σ(frequency)
For each class interval, multiply the frequency of the class by its midpoint, add these products together, and divide the sum by the total frequency.
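
The estimate can be sketched as below. The three class intervals and their frequencies are hypothetical, and each class is written as (lower bound, upper bound, frequency).

```python
def grouped_mean(classes):
    """Estimate the mean from (lower, upper, frequency) class intervals."""
    total_frequency = sum(f for _, _, f in classes)
    weighted_sum = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return weighted_sum / total_frequency

# Hypothetical bar chart: midpoints are 5, 15 and 25
classes = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]

# (2*5 + 5*15 + 3*25) / (2 + 5 + 3) = 160 / 10
print(grouped_mean(classes))  # 16.0
```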

Question 14

Q

Range

Answer

A

Difference between the lowest value and the highest value in a dataset.
Range = maximum value - minimum value
Can be affected by outliers.
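
A brief sketch with hypothetical values shows how one outlier inflates the range.

```python
def value_range(values):
    """Range = maximum value - minimum value"""
    return max(values) - min(values)

data = [3, 8, 5, 12]           # hypothetical observations
data_with_outlier = data + [90]

print(value_range(data))               # 9
print(value_range(data_with_outlier))  # 87
```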

Question 15

Q

Percentile

Answer

A

Percentile rank = ((C + 0.5 × f) / N) × 100%
C: number (count) of all observations lower than the observation of interest.
f: frequency of the observation of interest.
N: number of all observations.
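If the observation of interest occurs more than once, the tied observations are counted in f, not in C.
A percentile rank is not the same as a percentage score: it describes a score's position relative to the other scores, so the highest score has a rank near the 100th percentile and the lowest a rank near 0.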
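
A minimal sketch of the percentile rank calculation, reading the formula as (C + 0.5 × f) divided by N, times 100. The scores are hypothetical.

```python
def percentile_rank(values, x):
    """Percentile rank of x: 100 * (C + 0.5 * f) / N."""
    c = sum(1 for v in values if v < x)   # observations lower than x
    f = sum(1 for v in values if v == x)  # frequency of x itself
    return 100 * (c + 0.5 * f) / len(values)

scores = [50, 60, 70, 70, 80]  # hypothetical test scores

print(percentile_rank(scores, 70))  # 60.0  (C = 2, f = 2, N = 5)
print(percentile_rank(scores, 80))  # 90.0  (C = 4, f = 1, N = 5)
```

Note that under this formula even the highest score lands below 100, because half of its own frequency is counted.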

Question 16

Q

Interquartile Range

Answer

A

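Q1: the value at the 1/4 position of the ordered values (25th percentile).
Q3: the value at the 3/4 position of the ordered values (75th percentile).
IQR = Q3 - Q1
When the dataset has an even number of values, the median (Q2) lies between two middle values: the first middle value belongs to the lower half used to calculate Q1, and the second belongs to the upper half used to calculate Q3.
When the dataset has an odd number of values, the median (Q2) is a single value and is not included in the calculations of Q1 and Q3.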
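
One common convention, assumed here, takes Q1 and Q3 as the medians of the lower and upper halves of the ordered data, leaving the median itself out when the number of values is odd. Other conventions (for example, interpolation between positions) can give slightly different quartiles. The data below are hypothetical.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Odd n: skip the single median value; even n: split cleanly in two
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower), median(ordered), median(upper)

def iqr(values):
    """IQR = Q3 - Q1"""
    q1, _, q3 = quartiles(values)
    return q3 - q1

print(quartiles([1, 2, 3, 4, 5, 6, 7]))     # (2, 4, 6)
print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))  # (2.5, 4.5, 6.5)
print(iqr([1, 2, 3, 4, 5, 6, 7]))           # 4
```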

Question 17

Q

Variance

Answer

A

Measure of how close together or far apart the values in a dataset are.
The larger the variance, the further the individual values are from the mean.
The smaller the variance, the closer the individual values are to the mean.

Question 18

Q

Standard Deviation and Variance

Answer

A

s = standard deviation
s² = variance
therefore s = √(s²)
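
The card does not give the variance formula, so the sketch below assumes the usual sample variance, which divides the sum of squared deviations from the mean by n - 1. The standard deviation is then its square root. The data are hypothetical.

```python
import math

def sample_variance(values):
    """Assumed formula: sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    m = sum(values) / n
    return sum((y - m) ** 2 for y in values) / (n - 1)

def sample_sd(values):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(sample_variance(values))

data = [1, 2, 3, 4, 5]  # hypothetical; mean = 3, squared deviations sum to 10

print(sample_variance(data))  # 2.5
print(sample_sd(data))        # about 1.58
```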

Question 19

Q

Empirical Rules of Normal Distribution

Answer

A

In a normal distribution (symmetric, bell-shaped):
About 68% of values are within 1 SD of the mean.
About 95% of values are within 2 SDs of the mean.
About 99.7% of values are within 3 SDs of the mean.
Values more than 3 SDs from the mean are commonly treated as outliers.
Mean = Median = Mode for unimodal symmetrical normal distribution
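
These percentages can be checked against the standard normal distribution (mean 0, SD 1) with Python's built-in `statistics.NormalDist`. The helper name `share_within` is invented for this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def share_within(k):
    """Proportion of values within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    # Roughly 68%, 95% and 99.7%
    print(k, f"{100 * share_within(k):.2f}%")
```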

Question 20

Q

Asymmetrical Distribution (2 Types)

Answer

A

Positively skewed/right tailed: skewness > 0, the curve tails off gradually on the right side (long right tail).
Negatively skewed/left tailed: skewness < 0, the curve tails off gradually on the left side (long left tail).
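
The card does not say how skewness is computed. The sketch below assumes one common definition, the moment coefficient (third central moment divided by the second central moment raised to the power 1.5), and uses hypothetical data to show the sign in each case.

```python
def skewness(values):
    """Moment coefficient of skewness (assumed definition): m3 / m2 ** 1.5."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

right_tailed = [1, 2, 2, 3, 3, 3, 10]        # hypothetical, long right tail
left_tailed = [-v for v in right_tailed]     # mirror image, long left tail
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_tailed) > 0)  # True
print(skewness(left_tailed) < 0)   # True
print(skewness(symmetric))         # 0.0
```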

Question 21

Q

Describing what you see in relation to the mean example

Answer

A

To describe the relationship between the mean and the symmetry/asymmetry of the distribution, you could say that out of the 40 observations, 23 (57.5% of the sample) have IQ scores greater than or equal to the mean, so most people score at or above the mean, while fewer people (n = 17, or 42.5% of the sample) have IQ scores below the mean.

(21 cards)