Sampling Flashcards

Question

Independent Variable

Answer 1

Can be measured without relying on other variables. Example: A person's weight.

Answer 2

Requires information about independent variables. Example: BMI, which depends on both weight and height.

Answer 3

Constant: A characteristic that does not change across individuals in the study. Example: In a study on students, their status as "students" is a constant.

Answer 4

Collected for a specific research project. Example: Using surveys to gather student feedback.

Answer 5

used to select representative groups from a population for research purposes.

Answer 6

Pre-existing data collected for other purposes. Example: Data from Statistics Finland.

Answer 7

is the sample mean

Answer 8

standard deviation

Answer 9

= critical value. Unless a specific confidence level is required, the 95% level is usually used. 90% = 1.64, 95% = 1.96, 99% = 2.58, 99.9% = 3.29
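The critical values on this card are rounded quantiles of the standard normal distribution. A minimal sketch of how they can be recomputed, assuming a two-sided interval; the function name `critical_value` is only illustrative.

```python
from statistics import NormalDist


def critical_value(confidence: float) -> float:
    """Two-sided critical value z for a confidence level such as 0.95."""
    # Half of the remaining probability lies in each tail.
    upper_point = 1 - (1 - confidence) / 2
    return NormalDist().inv_cdf(upper_point)


for level in (0.90, 0.95, 0.99, 0.999):
    print(f"{level:.1%}: z = {critical_value(level):.3f}")
```

Rounded to two decimals, the results are close to the table values used in the course notes.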

Answer 10

See notebook.

Answer 11

See notebook.

Answer 12

percentage in the sample

Answer 13

margin of error = deviation

Answer 14

The following table presents the effect of the sample size on the margin of error of percentages at a 95% confidence level.
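The table itself is not reproduced on this card. As a rough sketch, the relationship can be recomputed with the standard large-population formula e = z * sqrt(p(1 - p) / n); the formula, the sample sizes and the worst-case percentage p = 50% are assumptions here, not values taken from the course table.

```python
from math import sqrt


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Margin of error for a sample proportion p with sample size n."""
    return z * sqrt(p * (1 - p) / n)


# p = 0.5 gives the widest margin, so it is the usual worst case.
for n in (100, 400, 1000, 2500):
    print(f"n = {n:>4}: +/- {margin_of_error(0.5, n):.1%}")
```

Quadrupling the sample size only halves the margin of error, because n sits under the square root.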

Answer 15

The following table presents the sample size based on margin of error and population size.
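The table is not included on this card. One common way to get such numbers is Cochran's formula with a finite population correction; this is an assumption, and the course table may have been built with a different rule. The function name and the example populations are illustrative.

```python
from math import ceil


def sample_size(population: int, margin: float,
                p: float = 0.5, z: float = 1.96) -> int:
    """Required sample size for a proportion, with finite population correction."""
    # Sample size for an (effectively) infinite population.
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    # Shrink it when the population itself is small.
    n = n0 / (1 + (n0 - 1) / population)
    return ceil(n)


for population in (500, 1000, 10000, 1000000):
    print(population, sample_size(population, margin=0.05))
```

For very large populations the result levels off, which is why the required sample barely grows once the population is in the tens of thousands.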

Answer 16

● Everyone must find a suitable option ● Yes–No answers ● Scales

Answer 17

What do you think about this?

Answer 18

- Answer options include an “other” option

Answer 19

The research data is saved in table format in the analysis software. One row in the data contains information on one research unit (e.g. the responses of one respondent). The first row of the table contains the names of the variables. One column of the table contains the values of one variable (e.g. the respondent’s age). The first column contains the number of each statistical unit (e.g. the number of the respondent or questionnaire). The values of the variables are typically saved in number format. If a variable is not numeric by nature (e.g. gender), the researcher assigns a number code to each value (e.g. 1 = male, 2 = female).
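A small sketch of this layout in plain Python. The respondents and the variable names are hypothetical; only the coding 1 = male, 2 = female comes from the card.

```python
# Hypothetical questionnaire answers, one dict per respondent (research unit).
raw_responses = [
    {"gender": "female", "age": 23},
    {"gender": "male", "age": 31},
    {"gender": "female", "age": 27},
]

# Number codes for a variable that is not numeric by nature.
GENDER_CODES = {"male": 1, "female": 2}

# First row: variable names. First column: number of the statistical unit.
header = ["id", "gender", "age"]
rows = [
    [number, GENDER_CODES[answer["gender"]], answer["age"]]
    for number, answer in enumerate(raw_responses, start=1)
]
data_matrix = [header] + rows

for row in data_matrix:
    print(row)
```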

Answer 20

is the first step in data analysis even if the aim of the study is explanatory. The purpose is to summarise the information in a more easily interpretable format. This is done by presenting the data in tables, charts and numerical measures.

Answer 21

is the frequency in each class divided by the total number of observations. Usually in the tables, percentage distribution (f%) is presented.

Answer 22

= number of observations, count (f). The number of each value of the variable in the sample, i.e. the number of times a particular value appears in the dataset.

Answer 23

presents the number of observations that are less than or equal to a certain value (a running total): the sum of the frequencies of all previous values up to and including the current value. This helps you understand how many values are less than or equal to the current value.

Answer 24

(F%) presents the cumulative percentage of all observations: the percentage of values accumulated as the values in the sample increase. It shows what percentage of the total number of values is less than or equal to each particular value.

Answer 25

is a tabular representation of data that shows how often (with what frequency) different values occur in a data set. A frequency table helps us understand the distribution of the data, identify the most frequent values (modes) and spot patterns. A table usually consists of two main columns: Value (the unique values or categories of the data) and Frequency (the number of times each value appears in the dataset). In Excel, frequency tables are created as pivot tables. ● The number of observations in a sample is marked with a capital letter N ● Part of the sample is marked with a small letter n
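A minimal sketch of a frequency table with the four columns from the cards above: frequency (f), percentage (f%), cumulative frequency (F) and cumulative percentage (F%). The grades are made-up example data.

```python
from collections import Counter
from itertools import accumulate

grades = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]  # hypothetical sample

counts = Counter(grades)
values = sorted(counts)
total = len(grades)

f = [counts[v] for v in values]          # frequency of each value
f_pct = [100 * x / total for x in f]     # relative frequency in percent
F = list(accumulate(f))                  # cumulative frequency (running total)
F_pct = [100 * x / total for x in F]     # cumulative percentage

print("value  f    f%    F    F%")
for row in zip(values, f, f_pct, F, F_pct):
    print("{:>5} {:>2} {:>5.1f} {:>4} {:>5.1f}".format(*row))
```

The last cumulative frequency always equals the number of observations, and the last cumulative percentage is always 100.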

Answer 26

presents the results of two (or more) categorical variables. Key elements of cross tabulation: Variables: cross tabulation involves at least two variables; one is represented by the rows and the other by the columns. Cells: each cell in the table shows the frequency (count) or percentage of occurrences for the intersection of the two variables.
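A sketch of a cross tabulation built by counting pairs of values. The two variables (gender and a yes/no answer) and the responses are hypothetical.

```python
from collections import Counter

# Each tuple is one respondent: (row variable, column variable).
responses = [
    ("female", "yes"), ("female", "no"), ("male", "yes"),
    ("female", "yes"), ("male", "no"), ("male", "no"),
]

cells = Counter(responses)  # count of every (row, column) combination
row_labels = sorted({row for row, _ in responses})
col_labels = sorted({col for _, col in responses})

# One dict per row; each cell holds the count for that intersection.
table = {
    row: {col: cells[(row, col)] for col in col_labels}
    for row in row_labels
}

for row in row_labels:
    print(row, table[row])
```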

Answer 27

are classified as measures of central tendency and measures of variation and shape.

Answer 28

The central tendency is the extent to which all the data values group around a typical or central value. The measures of central tendency are mode, median, mean, quartiles and fractiles.

Answer 29

● The value in a set of data that appears most frequently ● Multiple modes can exist in a data set
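A short illustration of a data set with more than one mode, using made-up values and `statistics.multimode` from the Python standard library (Python 3.8 or newer).

```python
from statistics import multimode

shoe_sizes = [38, 39, 39, 41, 41, 43]  # hypothetical data

# multimode returns every value that shares the highest frequency.
modes = multimode(shoe_sizes)
print(modes)
```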

Answer 30

● The middle value in a set of data that has been ranked from smallest to largest ● Half the values are smaller than or equal to the median and half are larger than or equal to it ● Data has to be measured on an ordinal, interval or ratio scale ● If there is an even number of values, the median is either of the two middle values or the mean of the two middle values
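The even-count case can be checked with the standard library, which offers all three conventions named on the card. The values are made up.

```python
from statistics import median, median_high, median_low

values = [3, 8, 1, 10]  # hypothetical data, sorted: 1, 3, 8, 10

print(median_low(values))   # lower of the two middle values
print(median_high(values))  # higher of the two middle values
print(median(values))       # mean of the two middle values
```

For ordinal data the mean of two categories has no meaning, so `median_low` or `median_high` is the safer choice there.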

Answer 31

The arithmetic mean (often simply called the "mean" or "average") is a measure of central tendency that represents the sum of all values in a data set divided by the number of values. It provides a general idea of the "typical" value in the dataset. Sensitive to outliers: extreme values can significantly affect the mean. X̄ is the arithmetic mean.
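A quick check of the outlier sensitivity, comparing the mean with the median on made-up values.

```python
from statistics import mean, median

values = [20, 22, 24, 26, 28]   # hypothetical data
with_outlier = values + [200]   # the same data plus one extreme value

print(mean(values), median(values))
print(round(mean(with_outlier), 1), median(with_outlier))
```

One extreme value more than doubles the mean here, while the median moves by a single unit.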

Answer 32

is the arithmetic mean

Answer 33

represents each individual value in the data set

Answer 34

Q1 = 8, Q2 = 13.5 (the median), Q3 = 21

Answer 35

are statistical measures that divide a data set into four equal parts, each representing 25% of the data. Quartiles help identify the points where the data is split into quarters. The three quartiles are typically referred to as the first quartile (Q1), second quartile (Q2) and third quartile (Q3).
1. Arrange the data in ascending order.
2. Find Q2 (the median): if the number of data points is odd, the median is the middle number; if it is even, the median is the average of the two middle numbers.
3. Find Q1 (first quartile): the median of the lower half of the data (excluding the overall median if the number of data points is odd).
4. Find Q3 (third quartile): the median of the upper half of the data (excluding the overall median if the number of data points is odd).
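The procedure on this card (median of the lower and upper halves) can be sketched as follows. The function name and the example data are illustrative, not taken from the course material.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd count, the overall median is left out of both halves.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


print(quartiles([2, 4, 6, 8, 10, 12, 14]))        # odd number of values
print(quartiles([1, 3, 5, 7, 9, 11, 13, 15]))     # even number of values
```

Spreadsheets and statistics packages often use other interpolation rules, so their quartiles can differ slightly from this method.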

Answer 36

It is any other division of the data. The data must be sortable in ascending or descending order, so it has to be measured on an ordinal, interval or ratio scale.

Answer 37

Largest value minus the smallest value

Answer 38

● Interquartile range = Q3 - Q1 ● Not affected by extreme values
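A sketch comparing the range and the interquartile range on made-up data, with and without an extreme value. The quartiles are computed as medians of the lower and upper halves, which is an assumption about the method; other quartile rules give slightly different numbers.

```python
from statistics import median


def value_range(data):
    """Largest value minus the smallest value."""
    return max(data) - min(data)


def interquartile_range(data):
    """Q3 - Q1, with quartiles taken as medians of the two halves."""
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]    # values below the median position
    upper = ordered[-half:]   # values above the median position
    return median(upper) - median(lower)


typical = [5, 7, 8, 9, 10, 12, 13]      # hypothetical data
extreme = [5, 7, 8, 9, 10, 12, 130]     # same data with one extreme value

print(value_range(typical), interquartile_range(typical))
print(value_range(extreme), interquartile_range(extreme))
```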

Answer 39

● Measures the average scatter around the mean, symbol s ● Square root of the variance

Answer 40

● Standard deviation squared, s² ● Used in theoretical statistical analysis

Answer 41

V ● Always presented as percentage ● Relative measure for comparison
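The three measures of variation above can be computed together. This sketch assumes the sample versions (dividing by n - 1), which is what `statistics.stdev` and `statistics.variance` use; the data is made up.

```python
from statistics import mean, stdev, variance

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample

s = stdev(values)             # sample standard deviation
s2 = variance(values)         # sample variance, equal to s squared
v = 100 * s / mean(values)    # coefficient of variation in percent

print(f"s = {s:.2f}, s^2 = {s2:.2f}, V = {v:.1f}%")
```

Because V is relative to the mean, it allows comparing the spread of variables measured in different units.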

Answer 42

● Symbol g1

Answer 43

● Symbol g2

Answer 44

Normality is tested with the Kolmogorov-Smirnov and Shapiro-Wilk tests. ● If the sample size is less than 50, the Shapiro-Wilk test is used; if it is over 50, the Kolmogorov-Smirnov test is used ● If sig. > 0.05, normality is not rejected and the variable is treated as normally distributed

Answer 45

Separate bars.

Answer 46

Vertical or horizontal bars in which one bar combines two values at once, for example women and men, where each bar is split into women in business and women in "homa", and likewise for men.

Answer 47

The data is divided into bins (intervals) and plotted along the axes, so the resulting picture shows the shape and tendency of the distribution.

Answer 48

Line charts.

Answer 49

Pie chart.

Answer 50

How to read it: the top of the upper whisker shows the highest level of the data if there are no extreme values, so the chart shows the maximum level. The minimum level is about 7.3. In our example, however, there is an extreme value. It is so much smaller than the main data that it is classed as extreme, so the minimum is 6.3. The number 36 shows the row in which this answer was given. Next we describe the box. The first quartile is at 8.0, which means that 25% of the students have a grade below 8.0. The top of the box is the third quartile, which means that 25% of the students have a grade above 9.0. So we know that half of the students have a grade between 8 and 9, that is, between the first and third quartiles. Inside the box there is an x: the mean value, i.e. the calculated average. Median: half of the students have a grade below 8.6 and half above 8.6; it is the horizontal line inside the box. If we exclude the extreme case, we use only the minimum (in our example). Read p. 60.

Answer 51

A cloud of points. See p. 62.

Answer 52

Read p. 62.

(76 cards)