Final summary Flashcards

1
Q

population

A

consists of all the items or individuals about which you want to draw a conclusion

2
Q

sample

A

is the portion of the population selected for analysis

3
Q

measuring

A

means linking numerical values to research objects

4
Q

Observation unit (i.e. statistical unit)

A

is a single research object

5
Q

observation

A

is the measured result (value) that is related to one research object

6
Q

variable

A

is a characteristic of an item or an individual

7
Q

Discrete variables

A

arise from a counting process

8
Q

Continuous variables

A

arise from a measuring process

9
Q

Sampling methods can be categorized

A

Probability samples - the elements being included have a known chance of being selected
Non-probability samples - participants are selected in a purposeful way.

10
Q

Probability sampling methods

A

- Simple random sampling
- Systematic random sampling
- Stratified sampling
- Cluster sampling
- Sequence sampling
Probability samples are samples in which the elements being included have a known chance of being selected.

11
Q

Non-probability sampling methods

A

- Judgement sampling
- Quota sampling
Non-probability samples are ones in which participants are selected in a purposeful way.

12
Q

Systematic random sampling

A

The sampling units are chosen from the sampling frame at a uniform interval
- Sampling interval k = N/n (N = size of the population, n = sample size)
- The starting point is selected at random from the first interval, and then every kth element is selected
- For example: N = 200, n = 10 → k = 200/10 = 20
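
The interval rule above can be sketched in a few lines. This is a minimal illustration, assuming N is divisible by n and that elements are numbered 1..N; the function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Pick every kth element after a random start in the first interval."""
    k = population_size // sample_size      # sampling interval k = N/n
    rng = random.Random(seed)
    start = rng.randint(1, k)               # starting point within 1..k
    return [start + i * k for i in range(sample_size)]

# Card example: N = 200, n = 10, so k = 20
picked = systematic_sample(200, 10, seed=0)
print(picked)
```

Each run returns 10 element numbers spaced exactly 20 apart; only the starting point is random.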

12
Q

Simple random sampling

A

elements in the whole population are numbered and selected by using random numbers
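
A minimal sketch of this idea, assuming the population is simply numbered 1..N and that Python's pseudo-random generator stands in for a random number table; the function name is hypothetical.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Number the elements 1..N and draw n distinct numbers at random."""
    rng = random.Random(seed)
    numbered_elements = range(1, population_size + 1)
    # rng.sample draws without replacement, so no element is picked twice
    return sorted(rng.sample(numbered_elements, sample_size))

chosen = simple_random_sample(population_size=200, sample_size=10, seed=1)
print(chosen)
```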

13
Q

Stratified sampling

A

the population is divided into mutually exclusive strata/groups (based on nationality, profession, gender, ...)
- Each element can be included in only one stratum
- A sample is drawn randomly from each stratum/group
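
The two steps (split into strata, then sample within each) might look like the sketch below. The student data, the `stratum_of` callback and the fixed per-stratum sample size are assumptions made for illustration only.

```python
import random
from collections import defaultdict

def stratified_sample(elements, stratum_of, per_stratum, seed=None):
    """Group elements into strata, then draw a random sample from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in elements:
        strata[stratum_of(element)].append(element)   # one stratum per element
    return {
        name: rng.sample(members, min(per_stratum, len(members)))
        for name, members in strata.items()
    }

# Hypothetical population: 20 business and 10 engineering students
students = [("s%d" % i, "business" if i % 3 else "engineering") for i in range(30)]
sample = stratified_sample(students, stratum_of=lambda s: s[1], per_stratum=4, seed=2)
print(sample)
```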

14
Q

Cluster sampling

A
  • each cluster represents the whole population
  • clusters are selected at random for the sample
  • selected clusters are included in full, or random samples are drawn from them
15
Q

Sequence sampling

A

elements are picked sequentially until the results no longer change

16
Q

Judgement sampling

A

Relies on judgement or expertise

17
Q

Quota sampling

A

The first step is to estimate the sizes of the various subclasses or strata in the population
- Sampling continues until each quota is full
- Even quota
- Proportional quota
- Optimal quota

18
Q

Even quota

A

the same number of elements is picked from each stratum (e.g. 100 male and 100 female)

19
Q

Proportional quota

A

if the population is 60% male and 40% female, the sample is drawn in the same proportions: 60 male + 40 female = total sample size 100
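
The difference between even and proportional quotas can be shown with a short sketch. Both function names are hypothetical, and rounding is an assumption: with awkward shares the rounded quotas may not add up exactly to the total.

```python
def even_quotas(groups, per_group):
    """Even quota: the same number of elements from every stratum."""
    return {group: per_group for group in groups}

def proportional_quotas(population_shares, total_sample_size):
    """Proportional quota: each stratum gets its population share of the sample."""
    return {
        group: round(share * total_sample_size)
        for group, share in population_shares.items()
    }

even = even_quotas(["male", "female"], per_group=100)
proportional = proportional_quotas({"male": 0.60, "female": 0.40}, 100)
print(even)          # {'male': 100, 'female': 100}
print(proportional)  # {'male': 60, 'female': 40}
```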

20
Q

Convenience sampling

A

the participants are self-selecting; there is no sample design

21
Q

Sampling frame

A

a list of the population

22
Q

Sample size

A

is affected by the desired accuracy of the results

23
Q

p= (confidence of p)

A

the percentage in the sample. The maximum margin of error (e) is reached when p = 50%
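
Why the margin of error peaks at p = 50% can be checked numerically. The sketch assumes the usual formula e = Z * sqrt(p(1 - p) / n) with Z = 1.96 for a 95% confidence level; the cards do not state the formula, so treat it as an assumption.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Margin of error of a sample percentage p (as a fraction) for sample size n."""
    return z * math.sqrt(p * (1 - p) / n)

# Compare several sample percentages at the same sample size
errors = {p: margin_of_error(p, n=400) for p in (0.1, 0.3, 0.5, 0.7, 0.9)}
worst_p = max(errors, key=errors.get)
print(errors)
print(worst_p)  # 0.5
```

The product p(1 - p) is largest at p = 0.5, which is why sample size tables are usually built for that worst case.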

24
Q

S

A

standard deviation; measures the average scatter around the mean

25
Q

e

A

margin of error

26
Q

Z

A

critical value

27
Q

Confidence of mean

A

method of calculating a sample size, based on the allowed deviation from the population mean (margin of error)
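
A minimal sketch of such a calculation, assuming the common formula n = (Z * S / e)^2 rounded up, where S is the standard deviation, e the margin of error and Z the critical value from the cards above. The formula and the example numbers are assumptions, not taken from the deck.

```python
import math

def sample_size_for_mean(s, e, z=1.96):
    """Sample size needed to estimate a mean within +/- e."""
    return math.ceil((z * s / e) ** 2)

# Hypothetical inputs: standard deviation 15, margin of error 2, 95% confidence
n_mean = sample_size_for_mean(s=15, e=2)
print(n_mean)  # 217
```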

28
Q

Confidence of percentage

A

method of calculating a sample size, based on percentage margin of error
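
As an illustration, the sketch below assumes the common formula n = Z^2 * p(1 - p) / e^2 rounded up, with p and e given as fractions. The deck does not spell out the formula, so this is an assumption, and no finite-population correction is applied.

```python
import math

def sample_size_for_percentage(p, e, z=1.96):
    """Sample size needed to estimate a percentage within +/- e."""
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)

# Worst case p = 50%, margin of error 5 percentage points, 95% confidence
n_percentage = sample_size_for_percentage(p=0.5, e=0.05)
print(n_percentage)  # 385
```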

29
Q

Margin of error tables

A
  1. presents the effect of the sample size on the margin of error of percentages at a 95% confidence level
  2. presents the sample size based on margin of error and population size
30
Q

Cumulative percentage distribution

A

the cumulative frequency divided by the number of observations, F%

30
Q

Frequency

A

number of observations, f: the number of times each value of the variable occurs in the sample

30
Q

Cross tabulation

A

presents the results of two (or more) variables
The table contains frequencies in each cell at the intersection of rows and columns
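
A small sketch of how such a table can be built by counting value pairs. The gender and department data are made up for illustration.

```python
from collections import Counter

# Hypothetical observations: (gender, department) for each respondent
observations = [
    ("male", "BA"), ("male", "BA"), ("male", "IB"),
    ("female", "BA"), ("female", "IB"), ("female", "IB"),
]

cells = Counter(observations)                      # frequency of each pair
rows = sorted({gender for gender, _ in observations})
cols = sorted({dept for _, dept in observations})

# Each cell holds the frequency at the intersection of a row and a column
crosstab = {r: {c: cells[(r, c)] for c in cols} for r in rows}
print(crosstab)
```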

30
Q

Cumulative frequency

A

is the running total for the data, F

31
Q

Relative frequency

A

is the frequency in each class divided by the total number of observations, f%
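
The four columns f, f%, F and F% defined on these cards can be computed together. This is a minimal sketch with made-up grade data; the function name is hypothetical.

```python
from collections import Counter

def frequency_table(values):
    """Return f, f%, F and F% for each distinct value, in ascending order."""
    counts = Counter(values)
    n = len(values)
    rows = []
    running_total = 0
    for value in sorted(counts):
        f = counts[value]
        running_total += f                         # cumulative frequency F
        rows.append({
            "value": value,
            "f": f,
            "f%": 100 * f / n,                     # relative frequency
            "F": running_total,
            "F%": 100 * running_total / n,         # cumulative percentage
        })
    return rows

grades = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
table = frequency_table(grades)
for row in table:
    print(row)
```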

32
Q

Numerical descriptive measures are classified

A

as measures of central tendency and
measures of variation and shape

33
Q

Measures of central tendency

A

mode, median, mean, quartiles and fractiles

34
Q

Mode

A

The value in a set of data that appears most frequently

35
Q

Median

A

● The middle value in a set of data that has been ranked from smallest to largest
● Half the values are smaller than or equal to the median and half the values are larger than or equal to the median
● If there is an even number of values, the median is either of the two middle values, or the mean of the two middle values
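
Python's standard `statistics` module happens to offer all three conventions for an even number of values, which makes the last bullet easy to see. The data are made up.

```python
from statistics import median, median_high, median_low

values = [3, 1, 7, 5]            # ranked: 1, 3, 5, 7 (even number of values)

lower_middle = median_low(values)    # the smaller of the two middle values
upper_middle = median_high(values)   # the larger of the two middle values
mean_of_middle = median(values)      # the mean of the two middle values

print(lower_middle, upper_middle, mean_of_middle)  # 3 5 4.0
```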

36
Q

Arithmetic mean

A

The arithmetic mean (often simply called the “mean” or “average”) is a measure of central tendency that represents the sum of all values in a data set divided by the number of values.
X̄ (X-bar) is the arithmetic mean.
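
The definition translates directly into code: sum the values and divide by their count. The scores are made up.

```python
def arithmetic_mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)

scores = [4, 8, 6, 2]
mean_score = arithmetic_mean(scores)   # (4 + 8 + 6 + 2) / 4
print(mean_score)  # 5.0
```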

37
Q

Xi

A

represents each individual value in the data set

38
Q

Quartiles

A

Arrange the data in ascending order.
Find Q2 (the median):
If the number of data points is odd, the median is the middle number.
If the number of data points is even, the median is the average of the two middle numbers.
Find Q1 (first quartile):
The first quartile is the median of the lower half of the data (excluding the overall median if the number of data points is odd).
Find Q3 (third quartile):
The third quartile is the median of the upper half of the data (excluding the overall median if the number of data points is odd).
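
The steps above can be sketched as follows. Note that this median-of-halves method is only one of several quartile conventions, so statistical software may give slightly different values; the function name is hypothetical and at least two data points are assumed.

```python
from statistics import median

def quartiles(data):
    """Q1, Q2, Q3 by the median-of-halves method described above."""
    ordered = sorted(data)                       # arrange in ascending order
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd count, the overall median is left out of both halves
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

odd_case = quartiles([7, 1, 3, 9, 5, 11, 13])
even_case = quartiles([1, 2, 3, 4, 5, 6, 7, 8])
print(odd_case)   # (3, 7, 11)
print(even_case)  # (2.5, 4.5, 6.5)
```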

39
Q

Fractiles

A

Any other division of the data. The data must be arrangeable in ascending or descending order

40
Q

Range

A

Largest value minus the smallest value

41
Q

Interquartile range

A

● Interquartile Range = Q3 - Q1
● Extreme values do not affect it
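
A quick comparison shows the robustness claim. The sketch assumes quartiles computed by the median-of-halves method, and the data sets are made up so that they differ only in one extreme value.

```python
from statistics import median

def range_and_iqr(data):
    """Return (range, interquartile range) for a data set."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    value_range = ordered[-1] - ordered[0]         # largest minus smallest
    iqr = median(upper) - median(lower)            # Q3 - Q1
    return value_range, iqr

typical = range_and_iqr([1, 3, 5, 7, 9, 11, 13])
with_outlier = range_and_iqr([1, 3, 5, 7, 9, 11, 1300])
print(typical)       # (12, 8)
print(with_outlier)  # (1299, 8)
```

The range jumps from 12 to 1299, while the interquartile range stays at 8.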

42
Q

Normal distribution

A

Normality is tested by the Kolmogorov-Smirnov and Shapiro-Wilk tests
- If the sample size is less than 50, the Shapiro-Wilk test is used; if it is over 50, the Kolmogorov-Smirnov test is used
- If sig. > 0.05, normality is not rejected -> the variable is treated as normally distributed

43
Q

statistical testing

A

asks whether a phenomenon that is present in the sample is also present in the population. Statistical testing tells which of the hypotheses is supported

44
Q

Hypothesis

A

is a claim about a particular parameter of the population

45
Q

Null hypothesis

A

is always formulated as “no difference” or “no correlation”, e.g. H0: σ1 = σ2

46
Q

Parametric tests assume

A
  • Data at the interval or ratio level of measurement
  • Normal distribution of the population (the test variable is normally distributed)
47
Q

p-value

A

is the probability of getting a test statistic equal to or more extreme than the sample result, given that the null hypothesis is true.
The p-value is often referred to as the observed level of significance.
If p is less than 0.05, H0 is rejected and H1 is supported

48
Q

correlation

A

relationships between variables
Knowing the value of X, we can say something additional about Y

49
Q

Variation

A

measures the spread of values in a data set

50
Q

The shape of a data set

A

represents a pattern of all values, from lowest to highest value

51
Q

If sig. (p-value) < 0.05

A

H0 is rejected; H1 is supported

52
Q

methods to analyse statistical correlation

A
  • Cross tabulation
    o Chi-square (χ²) test
  • Statistical measures
    o Pearson’s correlation coefficient
    o Spearman’s rank-order correlation (non-parametric)
    o Partial correlation
53
Q

Scatter plot

A

From the scatter plot you can see approximately
1) existence of correlation
2) character of the correlation
3) Extreme values (outliers)

54
Q

Report the result in Chi square test

A

There is a statistically significant correlation between gender and choice of department (because Chi-square p = 0.013 < 0.05). Male students most often study Business administration, while female students are divided more evenly between Business administration and International business

55
Q

The credibility of the Chi-square test

A

for the test to be credible, the expected value in each cell should not be less than 1, and the expected value can be under 5 in at most 20% of the cells
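
The two conditions can be checked from the observed table alone. The sketch assumes the usual expected count for a cell, row total * column total / grand total; the function names and the example tables are hypothetical.

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_is_credible(observed):
    """No expected value below 1, and at most 20% of cells below 5."""
    expected = [e for row in expected_counts(observed) for e in row]
    share_under_5 = sum(e < 5 for e in expected) / len(expected)
    return min(expected) >= 1 and share_under_5 <= 0.20

large_table = [[30, 10], [20, 20]]   # expected counts: 25, 15, 25, 15
small_table = [[3, 1], [2, 2]]       # expected counts: 2.5, 1.5, 2.5, 1.5
print(chi_square_is_credible(large_table))  # True
print(chi_square_is_credible(small_table))  # False
```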

56
Q

Pearson’s correlation coefficient

A

| r | > 0.7 strong linear correlation
0.3 ≤ | r | ≤ 0.7 average linear correlation
| r | < 0.3 weak linear correlation
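
A minimal sketch that computes r from paired data and applies the thresholds above. The data are made up and the function names are hypothetical; significance (the p-value) is not computed here.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)

def strength(r):
    """Classify |r| using the thresholds on this card."""
    size = abs(r)
    if size > 0.7:
        return "strong"
    if size >= 0.3:
        return "average"
    return "weak"

r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r, 3), strength(r))
```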

57
Q

if p < 0.05

A

there is a statistically significant correlation between the variables

58
Q

Practical interpretation of the results of Pearson’s correlation

A

There is a positive (r = 0.363), average (0.3 < | r | < 0.7) linear correlation between gender and participation in lessons. On average, women participate more than men.