Defining the Data Flashcards

1
Q

What is a population?

A

The collection of all the individuals of interest

2
Q

What is a sample?

A

The subset of the population that is selected as the result of sampling

3
Q

What is a biased sample?

A

A sample in which the study participants are not representative of the target population

4
Q

What is an unbiased sample?

A

A sample in which the study participants are representative of the target population

5
Q

What is validity?

A

the extent to which the instruments that are used in the study measure exactly what they should be measuring

6
Q

What is reliability?

A

the extent to which the results of the study are consistent when the study is repeated under the same conditions

7
Q

What is a variable?

A

something whose value can change or vary

8
Q

What is data?

A

the values we obtain when we measure a variable

9
Q

What are the two types of variables?

A

1. Categorical “attributes”

2. Quantitative “numbers”

10
Q

What are the two types of categorical variables, and what do they mean?

A

Nominal: Values are “names” that are unordered categories
Ordinal: Values are “names” that are ordered categories

11
Q

What are the two types of quantitative variables, and what do they mean?

A

Discrete: Values are whole-number counts (0, 1, 2, …) on a proper numeric scale
Continuous: Values are a measured number of units, including possible decimal values

12
Q

What are the two types of “continuous” quantitative variables, and what do they mean?

A

Interval: The scale has no true zero
Ratio: The scale has a true zero (0 means the absence of the quantity being measured)

13
Q

What are derived variables?

A

variables that you create by calculating or categorising variables that already exist in your data set

14
Q

What are the two different types of derived variables?

A

Calculated

Categorized

15
Q

What are threshold variables?

A

variables obtained by splitting the values of another variable into categories based on the values of well-known thresholds
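
A minimal sketch of a calculated variable followed by a threshold variable. Body mass index (BMI) and its commonly quoted cut-offs (18.5, 25, 30) are used purely as an illustration; the function names and the sample people are hypothetical and not part of the original cards.

```python
def bmi(weight_kg, height_m):
    """Calculated derived variable: BMI computed from weight and height."""
    return weight_kg / height_m ** 2


def bmi_category(value):
    """Threshold derived variable: split BMI into ordered categories."""
    # Cut-offs are the commonly quoted ones, assumed here for illustration.
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "normal"
    if value < 30:
        return "overweight"
    return "obese"


# Hypothetical (weight in kg, height in m) pairs.
people = [(50, 1.75), (70, 1.75), (85, 1.75), (100, 1.75)]
categories = [bmi_category(bmi(w, h)) for w, h in people]
print(categories)
```

The quantitative BMI value becomes an ordinal categorical variable once it is split at the thresholds.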

16
Q

What is a transformed variable?

A

a variable obtained by applying a mathematical transformation to another variable, which puts it on a different measurement scale (e.g. taking the square root, squaring…)

17
Q

What is an exposure variable?

A

a variable thought to predict an outcome variable

18
Q

What is an outcome variable?

A

a variable thought to change as a function of changes in an exposure variable

19
Q

What is the center of a data set?

A

A representative or average value that indicates where the middle of the data set is located

20
Q

What is variation in data?

A

A measure of how much the values vary among themselves and around the average value

21
Q

What is distribution in data?

A

The nature or shape of the distribution of data (such as bell-shaped, uniform, or skewed)

22
Q

What are outliers in data?

A

Sample values that lie very far away from the vast majority of other sample values

23
Q

What is time in data?

A

Changing characteristics of the data over time

24
Q

What are the measures of central tendency?

A

Mean, median, and mode

25
Q

What is the central tendency?

A

the tendency for values in a group to cluster around a central or ‘average’ value which is typical of the group

26
Q

Do extreme values affect the median?

A

Nope

27
Q

Do extreme values affect the mean?

A

Yep

28
Q

Do extreme values affect the mode?

A

Nope
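
A small sketch of the three answers above, using Python's standard `statistics` module. The two data sets are made up for illustration: they differ only in that the largest value is replaced by an extreme one.

```python
import statistics

values = [2, 3, 3, 4, 5]
with_outlier = [2, 3, 3, 4, 500]  # same data, largest value made extreme

for name, data in [("original", values), ("with outlier", with_outlier)]:
    print(
        name,
        "mean:", statistics.mean(data),
        "median:", statistics.median(data),
        "mode:", statistics.mode(data),
    )
```

Only the mean moves; the median and mode stay at 3 because they depend on the middle and the most frequent value, not on how far out the extreme value lies.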

29
Q

What is dispersion? (variability, scatter, spread)

A

how stretched or squeezed a distribution of values within a sample or a dataset is

30
Q

Which percentiles give a good summary of a sample?

A

the “Five Number Summary” (P0, P25, P50, P75, P100)
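
A minimal sketch of a five-number summary. It assumes the quartiles are taken with `statistics.quantiles(..., method="inclusive")`; other quartile conventions exist and can give slightly different values for P25 and P75. The helper name `five_number_summary` and the data are hypothetical.

```python
import statistics


def five_number_summary(data):
    """Return (P0, P25, P50, P75, P100) for a list of numbers."""
    ordered = sorted(data)
    # quantiles(n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, q2, q3, ordered[-1]


summary = five_number_summary([7, 1, 9, 3, 5, 2, 8, 4, 6])
print(summary)  # (1, 3.0, 5.0, 7.0, 9)
```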

31
Q

What are the measures of dispersion?

A

Range, interquartile range, and standard deviation

32
Q

What does a small standard deviation mean?

A

most data points are close to the mean

33
Q

What does a large standard deviation mean?

A

data points are widely spread from the mean
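
To make the contrast concrete, here is a short sketch with two made-up samples that share the same mean but differ in spread. `statistics.stdev` computes the sample standard deviation.

```python
import statistics

tight = [9, 10, 10, 10, 11]  # values close to the mean
wide = [0, 5, 10, 15, 20]    # values far from the mean

for name, data in [("tight", tight), ("wide", wide)]:
    print(name, "mean:", statistics.mean(data),
          "sd:", round(statistics.stdev(data), 2))
```

Both means are 10, but the standard deviation of the second sample is roughly eleven times larger.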

34
Q

What is a percentile?

A

a measure that indicates the value below which a given percentage of observations in a group of observations falls

35
Q

How do you calculate the IQR (interquartile range)?

A

Q3 - Q1

36
Q

What is Q1?

A

The 25th percentile (the value below which 25% of observations fall)

37
Q

What is Q3?

A

The 75th percentile (the value below which 75% of observations fall)

38
Q

What is the formula for the median when the sample size 𝑛 is odd?

A

The value at position (𝑛 + 1)/2 in the sorted data

39
Q

What is the formula for the median when the sample size 𝑛 is even?

A

The average of the values at positions 𝑛/2 and 𝑛/2 + 1 in the sorted data
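
The two position rules (odd and even sample size) can be sketched as one function. The name `median_by_position` is hypothetical; positions in the rules are 1-based, so the code subtracts 1 to index a Python list.

```python
def median_by_position(data):
    """Median via the position rules for odd and even sample sizes."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    if n % 2 == 1:
        # Odd n: the value at position (n + 1) / 2.
        return ordered[(n + 1) // 2 - 1]
    # Even n: average of the values at positions n/2 and n/2 + 1.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


print(median_by_position([38, 51, 46, 79, 57]))  # 51
print(median_by_position([1, 2, 3, 4]))          # 2.5
```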

40
Q

A garden contains 39 plants. The following plants were chosen at random, and their heights were recorded in cm: 38, 51, 46, 79, and 57. Calculate their heights’ standard deviation.

A

https://byjus.com/maths/standard-deviation-questions/
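
A worked sketch of the calculation, assuming the five heights are treated as a sample from the 39 plants, so the sum of squared deviations is divided by n - 1. If the linked solution uses a different convention, its figure may differ; the population version (dividing by n) is shown for comparison.

```python
import math
import statistics

heights = [38, 51, 46, 79, 57]

n = len(heights)
mean = sum(heights) / n                                 # 54.2
squared_devs = [(x - mean) ** 2 for x in heights]       # sums to about 962.8
sample_sd = math.sqrt(sum(squared_devs) / (n - 1))      # divide by n - 1
population_sd = math.sqrt(sum(squared_devs) / n)        # divide by n

print(round(mean, 1), round(sample_sd, 2), round(population_sd, 2))

# The standard library gives the same sample value.
library_sd = statistics.stdev(heights)
```

Under the sample assumption the standard deviation is about 15.51 cm.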

41
Q

SD indicates variation around which measure of central tendency?

A

Mean

42
Q

IQR indicates variation around which measure of central tendency?

A

median

43
Q

What is inferential statistics?

A

statistics used to infer, from relationships found in the sample, the relationships that truly exist in the population

44
Q

What is descriptive statistics?

A

statistics used to describe, show or summarize data in a meaningful way (like taking pictures of the data)

45
Q

What are the two types of statistics?

A

descriptive statistics and inferential statistics

46
Q

What is a theory?

A

a generalization about a phenomenon (an explanation of how or why something occurs)

47
Q

What is a hypothesis?

A

a proposed explanation made on the basis of limited evidence as a starting point for further investigation (without any assumption of its truth)

48
Q

What are the steps of the research process?

A

  1. Initial observation (Research question)
  2. Generate theory
  3. Generate hypothesis
  4. Collect data to test hypothesis
  5. Analyse data

49
Q

Why is data important?

A

  - Identifying problems
  - Planning & making informed decisions
  - Monitoring/evaluating progress
  - Testing hypotheses & making inferences about populations of interest

50
Q

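What is the formula to calculate percentiles?

A

L = 𝑛 × (𝑃/100), where 𝑛 is the sample size and 𝑃 is the desired percentile

If L is a whole number, use the average of the values at positions L and L + 1 in the sorted data.
If L is not a whole number, round L up to the next whole number and use the value at that position.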
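
The locator rule above can be sketched as follows. The function name `percentile_by_locator` and the data are hypothetical, and the sketch assumes 0 < P < 100 so that positions L and L + 1 both exist.

```python
import math


def percentile_by_locator(data, p):
    """Percentile via the locator L = n * (p / 100); positions are 1-based."""
    if not 0 < p < 100:
        raise ValueError("this sketch assumes 0 < p < 100")
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("percentile of an empty sample is undefined")
    if (n * p) % 100 == 0:
        # L is a whole number: average the values at positions L and L + 1.
        locator = (n * p) // 100
        return (ordered[locator - 1] + ordered[locator]) / 2
    # L is not whole: round up and take the value at that position.
    locator = math.ceil(n * p / 100)
    return ordered[locator - 1]


scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
q1 = percentile_by_locator(scores, 25)      # L = 2.5 -> position 3
median = percentile_by_locator(scores, 50)  # L = 5 -> average of positions 5 and 6
q3 = percentile_by_locator(scores, 75)      # L = 7.5 -> position 8
print(q1, median, q3, "IQR:", q3 - q1)
```

With this rule Q1 = 3 and Q3 = 8 for the ten scores, so the IQR (Q3 - Q1) is 5. Other percentile conventions can give slightly different quartiles for the same data.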