Basic Statistics Flashcards

1
Q

What is an observation ?

A

The units on which we measure data, such as persons, cars, animals… are called observations.

2
Q

What is a population ?

A

The collection of all units of interest

3
Q

What is a sample ?

A

A selection of n observations. A sample is always a subset of the population

4
Q

What is a qualitative variable ?

A

Variables which take values that cannot be ordered in a logical or natural way.

5
Q

What is a quantitative variable ?

A

Variables that represent measurable quantities. The values which these variables can take can be ordered in a logical and natural way.

6
Q

What is a graphic ?

A

It represents the relationship between two or more variables
It is an alternative way to summarize a variable’s information
It provides clues that words and equations do not
It is a great tool to form hypotheses and draw conclusions

7
Q

What is a disadvantage of graphs ?

A

They can be inaccurately interpreted, resulting in incorrect answers or conclusions

8
Q

What is the pie chart used for ?

A

Used to visualize the absolute and relative frequencies of nominal (categorical) and ordinal variables

9
Q

What is the bar chart used for ?

A

Used to visualize the absolute and relative frequencies of observed values of a variable. Can be used for nominal and ordinal variables.

10
Q

What is the histogram used for ?

A

Used to visualize the distribution of values of continuous variables.

11
Q

What are the differences between bar charts and histograms ?

A

Histograms show the distribution of variables whereas bar charts compare variables
Histograms show quantitative data whereas bar charts show categorical data
The bars in a histogram cannot be reordered

12
Q

What is a line graph used for ?

A

Used to visualize quantitative data collected on a specific topic over a specific time interval.
Data points are connected by a line, and each point represents an observation.

13
Q

What are box plots used for ?

A

Used to visualize the distribution of data based on a five number summary : minimum, first quartile, median, third quartile, maximum.
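
As a rough illustration, the five number summary behind a box plot could be computed as in the sketch below. The function name `five_number_summary` and the sample values are made up for this example, and the "inclusive" quartile method is an assumption: other quartile conventions give slightly different Q1 and Q3.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # Assumption: the "inclusive" quartile convention. Textbooks and software
    # packages differ on this, so Q1 and Q3 may vary slightly elsewhere.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, q2, q3, data[-1]

print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
# (1, 3.0, 5.0, 7.0, 9)
```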

14
Q

What is Q2 ?

A

The middle value of the data = the median

15
Q

What is Q1 ?

A

The lower quartile, the middle value between the smallest value and the median

16
Q

What is Q3 ?

A

The upper quartile, the middle value between the median and the highest value

17
Q

What is the interquartile range ?

A

The range from Q1 to Q3, which contains the middle 50% of the data: IQR = Q3-Q1

18
Q

How to determine the lower extreme in a box plot chart ?

A

Lower extreme = Q1-1.5*IQR
Where IQR = Q3-Q1

19
Q

How to determine the upper extreme in a box plot chart ?

A

Upper extreme = Q3+1.5*IQR
Where IQR = Q3-Q1

20
Q

What are scatter plots used for ?

A

Used to visualize the relationship between two quantitative variables measured on the same individuals.
It is useful to visually detect outliers
It shows the type of relationship between two variables

21
Q

What are tables useful for ?

A

Used to present results from research, e.g., within or between-group comparisons.

22
Q

What is an outlier ?

A

An outlier represents a value distant from the rest, due to variability or error.
Under Tukey’s rule, outliers are values more than 1.5 times the IQR below Q1 or above Q3

23
Q

How to detect an outlier ?

A
  • visually inspect data using a scatter plot or box plot
  • use Tukey’s rule to detect outliers : a value is an outlier if it lies below Q1-1.5*IQR or above Q3+1.5*IQR
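
A minimal sketch of the Tukey rule, flagging every value that falls more than 1.5 times the IQR below Q1 or above Q3. The function name `tukey_outliers`, the parameter `k`, and the data are illustrative only, and the "inclusive" quartile method is an assumption (other quartile conventions move the fences slightly).

```python
import statistics

def tukey_outliers(values, k=1.5):
    """Return the values lying outside the Tukey fences."""
    # Assumption: "inclusive" quartiles; the card does not name a convention.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

data = [2, 3, 4, 5, 6, 7, 8, 9, 50]
print(tukey_outliers(data))  # [50]
```

Here Q1 = 4 and Q3 = 8, so IQR = 4 and the fences are -2 and 14; only 50 falls outside them.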

24
Q

What is a correlation ? What is it useful for ?

A

Correlation is used to test the relationship between variables (quantitative or categorical)
It is a measure of how things are related.
Some correlations are high
Some correlations are low

It is useful to make predictions about future events

25
Q

Which graph is the most appropriate to read a correlation ?

A

The scatter plot
Adding a trend line helps to show the trend between the variables

26
Q

How to read a correlation between interval or ratio variables ?

A

Using the correlation coefficient or Pearson coefficient «r»
«r» describes the strength and direction of the linear association between two continuous (interval or ratio) variables.
«r» varies from -1 (perfect negative correlation) to 1 (perfect positive correlation)
0 = no linear correlation
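
For illustration, the Pearson coefficient can be computed directly from its definition: the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations. The function name `pearson_r` and the data below are invented for this sketch.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Note: this divides by zero if either variable is constant.
    return sxy / math.sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]    # hypothetical data
score = [2, 4, 6, 8, 10]
print(pearson_r(hours, score))  # 1.0
```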

27
Q

How to read a correlation between qualitative ordinal-scale variables ?

A

Using the Spearman correlation coefficient
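
One common way to obtain the Spearman coefficient is to replace each value by its rank and then compute a Pearson correlation on the ranks. The sketch below follows that approach. The function names are illustrative, and giving tied values the average of their ranks is an assumption of this example.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)

# A monotonic but non-linear relationship still gives a coefficient of 1.
print(spearman_rho([1, 2, 3, 4], [1, 4, 9, 16]))  # 1.0
```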

28
Q

In the context of correlation of ratio/interval variables, what is «r square» ?

A

The coefficient of determination : the ratio of the amount of variance explained by the regression model to the total variation in the data

29
Q

What is reliability ?

A

It is the overall consistency of a measure.
There is high reliability if a measure produces similar results under consistent conditions

30
Q

What is the reliability test for categorical variables ?

A

Percent agreement or the k-statistic (Cohen’s kappa)

31
Q

What does the k-statistic determine ?

A

It determines how well an observation produces the same value for the same patient on repeated measurements (ideally two examiners)
It determines:
- intra and inter examiner reliability
- intra and inter session reliability

32
Q

How to calculate the % of raw agreement ?

A

((Sum of normal observations)+(sum of abnormal observations))/total observations

We sum the agreed observations and divide by the total number of observations
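
As a small worked example of raw agreement, the sketch below counts the observations on which two examiners give the same rating and divides by the total. The function name `percent_agreement` and the ratings are hypothetical.

```python
def percent_agreement(ratings_a, ratings_b):
    """Percentage of observations on which two examiners give the same rating."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both examiners must rate the same non-empty set")
    agreed = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b)
    return 100 * agreed / len(ratings_a)

examiner_1 = ["normal", "normal", "abnormal", "abnormal", "normal"]
examiner_2 = ["normal", "abnormal", "abnormal", "abnormal", "normal"]
print(percent_agreement(examiner_1, examiner_2))  # 80.0 (4 of 5 agree)
```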

33
Q

What is the purpose of crosstabulation ?

A

The purpose of crosstabulation is to show in tabular format the relationship between two or more categorical variables.

34
Q

How to interpret K-statistics ?

A

0-0.59 : weak
0.60-0.79 : moderate
0.80-0.90 : strong
Above 0.90 : almost perfect

35
Q

What does k=0 mean ?

A

Represents the amount of agreement that can be expected from random chance

36
Q

What does k=1 mean ?

A

Represents the perfect agreement between the raters

37
Q

What does k=-1 mean ?

A

Represents complete disagreement among raters (agreement is worse than expected by chance)

38
Q

What is the k-statistic used for ?

A

It is used as a measure for quantifying agreement beyond chance for categorical variables

39
Q

How to determine K in Kappa statistics ?

A

K=(Po-Pe)/(1-Pe)
Where :
Po is the observed proportion of agreement = raw % agreement
Pe is the proportion of agreement expected by chance
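
To make the formula concrete, here is a sketch for two examiners who each classify patients as normal or abnormal, starting from the four cell counts of their 2x2 crosstabulation. The function name, parameter names, and counts are invented. The sketch assumes the usual definition of Pe, the sum over categories of the product of each examiner's marginal proportions.

```python
def cohen_kappa_2x2(both_normal, a_normal_b_abnormal,
                    a_abnormal_b_normal, both_abnormal):
    """Cohen's kappa for two raters and two categories, from the 2x2 counts."""
    total = (both_normal + a_normal_b_abnormal
             + a_abnormal_b_normal + both_abnormal)
    # Po: observed proportion of agreement (the raw agreement).
    po = (both_normal + both_abnormal) / total
    # Marginal proportion of "normal" ratings for each examiner.
    a_normal = (both_normal + a_normal_b_abnormal) / total
    b_normal = (both_normal + a_abnormal_b_normal) / total
    # Pe: agreement expected by chance, summed over the two categories.
    pe = a_normal * b_normal + (1 - a_normal) * (1 - b_normal)
    # Note: kappa is undefined (division by zero) when Pe equals 1.
    return (po - pe) / (1 - pe)

# Hypothetical counts: Po = 35/50 = 0.7 and Pe = 0.5, so kappa is about 0.4.
print(round(cohen_kappa_2x2(20, 5, 10, 15), 2))  # 0.4
```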

40
Q

What is the coefficient of variation ?

A

For continuous variables, the coefficient of variation (CV) expresses the standard deviation relative to the mean, which gives a very simple way to compare the variability of two sets of observations
Values close to zero show minimal variation

41
Q

How to determine the CV ?

A

CV=(standard deviation/mean)*100
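
A short sketch of the calculation. The function name and the measurements are illustrative, and using the sample standard deviation (n-1 in the denominator) is an assumption, since the card does not say which version is meant.

```python
import statistics

def coefficient_of_variation(values):
    """CV in percent: standard deviation divided by the mean, times 100."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return statistics.stdev(values) / mean * 100

measurements = [10, 12, 8, 11, 9]  # hypothetical repeated measurements
print(round(coefficient_of_variation(measurements), 1))  # 15.8
```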

42
Q

What is the intraclass correlation coefficient ?

A

Another reliability measure, used for continuous variables
Ranges between 0 and 1, and is always reported with a 95% confidence interval

43
Q

What is the standard error of measurement ?

A

A test of reliability
An estimation of the expected random variation in scores when no real change has taken place

44
Q

What is the detectable difference (or change) ?

A

A test of reliability
The minimum amount of change that needs to be observed at either the group or individual level for it to be considered a real change

45
Q

What is inferential statistics ?

A

It refers to the generalization of results from a sample of participants to the whole population.

46
Q

Why is it helpful to use inferential statistics ?

A

- making inferences about the population from the sample
- concluding whether a sample is significantly different from the population
- determining whether one model is significantly better than another
- hypothesis testing in general

47
Q

What is the most used method of inferential testing ?

A

Hypothesis testing

48
Q

What is hypothesis testing ?

A

It uses sample data to assess, through a probability (the p-value), whether an observed difference between groups is likely to be real or due to chance

49
Q

In hypothesis testing, what is p-value ?

A

The p-value provides evidence against the null hypothesis H0
The smaller p-value is, the stronger the evidence against H0 and in favor of the alternative hypothesis Ha
If the p-value is less than or equal to 0.05, then H0 is rejected in favor of Ha

50
Q

What is the null hypothesis ?

A

The hypothesis of no difference or no effect (H0) : the one we want to disprove

51
Q

What is the Chi-square test used for ?

A

Used to determine if there is a significant association between two categorical variables
The test compares the observed values with the values expected if the two variables were independent.
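
The comparison of observed and expected values can be sketched as follows. For each cell of the crosstabulation, the expected count under independence is (row total x column total) / grand total, and the statistic sums (observed - expected)^2 / expected over all cells. The function name and the table are made up. Turning the statistic into a p-value requires the chi-square distribution with (rows-1)*(columns-1) degrees of freedom, which this sketch does not do.

```python
def chi_square_statistic(observed):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (count - expected) ** 2 / expected
    return statistic

# Hypothetical 2x2 table: rows = exposure groups, columns = outcome yes/no.
table = [[30, 10],
         [20, 40]]
print(round(chi_square_statistic(table), 2))  # 16.67
```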

52
Q

What does the T-test determine ?

A

The t test tells us how significant the difference between group means is

53
Q

What are the requirements for the T-test?

A

-continuous variables
-normal distribution
-equal variance in the samples

54
Q

What are the three types of T-tests ?

A

Paired sample T-test
Independent sample T-test
One-sample T-test

55
Q

What is the paired samples T-test ?

A

Used to compare the means between two samples from the same group/individual
Comparing the means of two conditions where the same people are in both groups

56
Q

What is the independent samples T-test used for ?

A

Used to compare means between two samples from different groups/individuals
Comparing the means of two different groups

57
Q

What is the one-sample T-test used for ?

A

Comparing the mean of a sample with a pre-specified mean
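
The t statistics behind the three tests can be sketched as below. The function names and data are illustrative. The independent samples version assumes equal variances (the pooled form), in line with the requirements listed in the earlier card. Each function returns only the t statistic; obtaining a p-value requires the t distribution with the appropriate degrees of freedom, which is not computed here.

```python
import math
import statistics

def one_sample_t(sample, reference_mean):
    """t = (sample mean - reference mean) / standard error of the mean."""
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.mean(sample) - reference_mean) / standard_error

def paired_t(before, after):
    """Paired test: a one-sample test on the within-person differences."""
    differences = [a - b for a, b in zip(after, before)]
    return one_sample_t(differences, 0)

def independent_t(group_a, group_b):
    """Independent samples test with a pooled variance (equal variances assumed)."""
    na, nb = len(group_a), len(group_b)
    pooled_variance = ((na - 1) * statistics.variance(group_a)
                       + (nb - 1) * statistics.variance(group_b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_variance * (1 / na + 1 / nb))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error

# Hypothetical data for each of the three situations.
print(one_sample_t([2, 4, 6, 8, 10], 4))
print(paired_t(before=[1, 2, 3, 4], after=[2, 4, 5, 9]))
print(independent_t([1, 2, 3], [4, 5, 6]))
```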

58
Q

What are the 3 questions answered by each type of T-test ?

A
  • can I be certain that the difference between groups is not due to random chance ?
  • how big is the difference ?
  • is this difference important ?
59
Q

In Student’s t-test, what is the t-value ?

A

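The degree to which the difference can be explained by the group : the difference between means divided by its standard error
It is compared to a critical threshold value, or equivalently its p-value is compared to 0.05
p<=0.05 : H0 is rejected in favor of Ha, the difference is significant
p>0.05 : H0 is not rejected, the difference is NOT significant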