STATISTICS (QUALI) Flashcards

1
Q

MIDTERM EXAM

To determine whether a treatment is statistically significant, the best type of statistics to use is:

A

Inferential statistics

2
Q

MIDTERM EXAM

The variable “level of measurement” (i.e., nominal, ordinal, interval, and ratio) can itself be considered what level of measurement?

A

Ordinal

3
Q

MIDTERM EXAM

The psychological statistics instructors wanted to identify the level of statistics anxiety of first-year students using a self-made questionnaire that categorizes scores as low, moderate, or high statistics anxiety. The level of statistics anxiety is an example of what specific level of measurement?

A

Ordinal

4
Q

MIDTERM EXAM

Identifying the top-performing universities in the board examination is an example of what specific level of measurement?

A

Ordinal

5
Q

MIDTERM EXAM
Which of the following levels of measurement has magnitude, equal interval and true zero point?

A

Ratio

6
Q

MIDTERM EXAM

“A total of 250 first-year students passed the qualifying exam.” This is an example of what specific classification of variable?

A

Discrete

7
Q

MIDTERM EXAM

It is an organized tabulation of the number of individuals located in each category or score on the scale of measurement

A

Frequency distribution

8
Q

MIDTERM EXAM
Number of enrollees per academic year is best presented through:

A

Line graph

9
Q

MIDTERM EXAM
In a given study, it was found that 6560 of the individuals are male. The value provided is an example of ___

A

Proportion

10
Q

MIDTERM EXAM
If the variable “age” is categorized into “child”, “adolescent” and “adult”, what is the best graph to use in presenting this variable?

A

Bar graph

11
Q

MIDTERM EXAM

Suppose that you are now a fourth-year student and have been tracking your GPA since the start of your college life. To better determine the trend of your grades as you progressed through each year level, what graph is best to use?

A

Line chart

12
Q

MIDTERM EXAM

After gathering data, Laurence needs to identify the top 5 most soft-spoken psychology instructors in CVSU. To determine the names most frequently indicated in the survey, what measure of central tendency would you recommend to Laurence?

A

Mode

13
Q

MIDTERM EXAM

In the formula for the median, (n + 1)/2 gives what value?

A

Position of the middle value
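
As a minimal sketch of this idea, the position formula can be applied to a sorted list of scores. The function names and the sample scores below are illustrative assumptions, not part of the course material.

```python
import math

def median_position(n):
    """1-based position of the middle value among n sorted scores."""
    return (n + 1) / 2

def median(scores):
    """Median found by locating position (n + 1) / 2 in the sorted scores."""
    ordered = sorted(scores)
    pos = median_position(len(ordered))
    lower = ordered[math.floor(pos) - 1]  # score at or just below the position
    upper = ordered[math.ceil(pos) - 1]   # score at or just above the position
    return (lower + upper) / 2            # the same score twice when n is odd

# Hypothetical scores, for illustration only
odd_scores = [7, 1, 5, 3, 9]   # n = 5, so the position is 3
even_scores = [1, 2, 3, 4]     # n = 4, so the position is 2.5

print(median(odd_scores), median(even_scores))
```

Note that (n + 1)/2 gives the position, not the median itself; when n is even the position falls between two scores, and the sketch averages them.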

14
Q

MIDTERM EXAM

It is the frequency distribution in which all values have approximately the same frequency

A

Rectangular distribution

15
Q

MIDTERM EXAM

If continuous data is normally distributed and has no significant outliers, it is best to use ___ and ___ to determine central tendency and variability, respectively

A

Mean; standard deviation

16
Q

MIDTERM EXAM

Reynier correlates the level of job satisfaction and the level of supervisor support among BPO employees (n = 150). After identifying that both variables are continuous, he tested for normality and checked for significant outliers. No significant outliers were detected, and Reynier obtained a p-value of 0.02 in the Shapiro-Wilk test. Given this, what would be the most appropriate measure of central tendency to use?

A

Mean

17
Q

MIDTERM EXAM

It is the balance point of a distribution

A

Mean

18
Q

MIDTERM EXAM

If you are planning to make a visualization of the number of recorded cases of bullying per month, which graph would be best to use?

A

Line graph

19
Q

MIDTERM EXAM

In a histogram, the y-axis corresponds to the ___ of the individuals

A

Frequency

20
Q

MIDTERM EXAM

Fervin was assigned to analyze his group’s collected data using jamovi. For the first objective of their research, he needs to identify the average level of parental involvement and social desirability of middle children. If you were Fervin, what specific measure of central tendency would you most likely use?

A

Mean

21
Q

MIDTERM EXAM

The majority of scores in BSP 1-1 are found on the right side of the distribution, while the opposite is true for BSP 1-2. Given this data, we could probably say that:

A

The scores of BSP 1-1 are negatively skewed, while the scores of BSP 1-2 are positively skewed

22
Q

MIDTERM EXAM

If you are going to identify the average midterm examination score of BSP 1-5 in item number 38, what measure of central tendency will you use?

A

Median

23
Q

MIDTERM EXAM

The most frequent scores among first-year psychology students in the BPSY 55 midterm examination are 58 and 79. From this data set, what would be the shape of the distribution?

A

Bimodal distribution

24
Q

MIDTERM EXAM

Scores of BSP 1-7 in the ITP long examination are piling up at the higher tail/end of the distribution. Given this data, which of the following is correct?

A

Mode > median > mean
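
The ordering can be checked on a small data set. This is only a sketch: the scores below are hypothetical and chosen to pile up at the high end, and the variable names are illustrative.

```python
from statistics import mean, median, mode

# Hypothetical exam scores that pile up at the high end (negatively skewed)
scores = [1, 5, 8, 9, 9, 10, 10, 10]

score_mean = mean(scores)      # pulled toward the low-scoring tail
score_median = median(scores)  # middle of the sorted scores
score_mode = mode(scores)      # most frequent score, at the peak

print(score_mode, score_median, score_mean)
```

The few low scores drag the mean down the most, the median less so, and the mode stays at the peak.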

25
Q

MIDTERM EXAM

Type of kurtosis that is considered to have “fatter” tails

A

Leptokurtic

26
Q

MIDTERM EXAM

Kim and JL are competing over who will have a significant finding in their research. Highlighting the variability of their data, Kim obtained SD = 12.75 while JL reported SD = 4.89. Who do you think is more likely to obtain a significant result?

A

JL

27
Q

MIDTERM EXAM

Suppose that you are attempting to describe the variability of college students’ aggression. One of your respondents is a student known for his highly aggressive behavior compared to others. What measure of variability is expected to be used?

A

Median absolute deviation

28
Q

MIDTERM EXAM
True or false?

Two data sets can have the same value of central tendency but with different variability

A

Absolutely True

29
Q

If the kurtosis is leptokurtic, we would expect that the variability is?

A

Lower than mesokurtic
Lower than platykurtic

30
Q

MIDTERM EXAM

What measure of variability is best used to accurately describe how the scores differ from each other?

A

MAD

31
Q

MIDTERM EXAM

Arriving at a negative z-score means that:

A

The score falls below the mean

32
Q

MIDTERM EXAM

What is the mean of a distribution of z-scores?

A

0

33
Q

MIDTERM EXAM
Compare the scores of Bea on the narcissistic and antisocial tendency scales:
Narcissistic tendency: 38/45; M = 32; SD = 4
Antisocial tendency: 40/60; M = 42; SD = 5
Given this data, which of the following is Bea more susceptible to develop?

A

Narcissistic tendency
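
The comparison rests on converting each raw score to a z-score, z = (X − M) / SD. A minimal sketch using the values from the card (the function and variable names are illustrative assumptions):

```python
def z_score(raw, mean, sd):
    """Standardize a raw score: its distance from the mean in SD units."""
    return (raw - mean) / sd

# Bea's raw scores, means, and SDs as given in the card
narcissistic_z = z_score(38, 32, 4)   # 1.5 SD above the mean
antisocial_z = z_score(40, 42, 5)     # 0.4 SD below the mean

# The scale with the higher z-score is where Bea stands out more
higher = "Narcissistic" if narcissistic_z > antisocial_z else "Antisocial"
print(higher, narcissistic_z, antisocial_z)
```

Although the antisocial raw score (40) is larger than the narcissistic raw score (38), it sits below its own mean, which is why raw scores from different scales should not be compared directly.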

34
Q

MIDTERM EXAM
Suppose your IQ score is 88 wherein the mean is 100 and the standard deviation is 15, what can we infer from this data?

A

Your IQ is approximately 1 SD below the mean

35
Q

MIDTERM EXAM
TRUE OR FALSE?

The shape of the distribution will not change even if you convert raw scores into z-scores

A

Definitely true

36
Q

MIDTERM EXAM

In a normal curve, what percentage of scores falls within 2 standard deviations of the mean?

A

Approximately 95%
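
The figure can be reproduced from the standard normal curve. This is a sketch using Python's standard-library `statistics.NormalDist`; the helper name `proportion_within` is an illustrative assumption.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    # cdf(z) is the proportion to the LEFT of z, as in a z table
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1_sd = proportion_within(1)   # roughly 0.68
within_2_sd = proportion_within(2)   # roughly 0.95
print(round(within_1_sd, 4), round(within_2_sd, 4))
```

Subtracting the left-tail proportion at −k from the left-tail proportion at +k leaves the area between them.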

37
Q

MIDTERM EXAM
TRUE OR FALSE?

In negatively skewed data, the percentage of the left tail occupied is greater than the percentage of the right tail.

A

False

38
Q

MIDTERM EXAM
The percentage given by the z table always reports the proportion to the ___ of the z-score

A

Left

39
Q

MIDTERM EXAM

Rhai took a psychological test that measures disobedience. Her raw score was converted to a z-score of -2.74. From this data, Rhai’s score can be interpreted as:

A

Below average disobedience

40
Q

ASSUMPTIONS CHECK
Ideal if n < 50, because it can falsely report that normally distributed data is non-normal if the sample size is large
• If p-value is < .05 = ___
• If p-value is ≥ .05 = ___

A

Shapiro-Wilk test
Non-normal
Normal
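
The decision rule on this card can be written as a small helper. This is a sketch only: the function name and the p-values are hypothetical, and in practice the p-value would come from statistical software (for example jamovi, or a routine such as SciPy's `scipy.stats.shapiro`).

```python
def normality_decision(p_value, alpha=0.05):
    """Apply the card's rule: p < .05 means non-normal, p >= .05 means normal."""
    return "non-normal" if p_value < alpha else "normal"

# Hypothetical p-values, for illustration only
print(normality_decision(0.02))   # below .05
print(normality_decision(0.30))   # at or above .05
```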

41
Q

ASSUMPTIONS CHECK

Same decision rules as the Shapiro-Wilk test (n ≥ 50)

A

Kolmogorov-Smirnov test

42
Q

ASSUMPTIONS CHECK
Visually check if the distribution follows the normal curve
(Used if the sample size is large, …, 300)

A

Histogram

43
Q

ASSUMPTIONS CHECK
If the dots follow a straight line, the data is normally distributed
(Used with histogram if sample size is large)

A

Q-Q plot

44
Q

ASSUMPTIONS CHECK

• If no dots are present, there are no outliers
• If dots are present, there are outliers

A

Box plot

45
Q

ASSUMPTIONS CHECK

Visually check if there is plotted data that seems to be far from the pattern; if a datum is far from the pattern, there is a significant outlier.
(Applicable only for correlation)

A

Scatter plot

46
Q

ASSUMPTIONS CHECK

• ____ for T-test and ANOVA
• ____ (if p < .05 = ____, if p ≥ .05 = ___)

A

Homogeneity of variance
Levene’s test
Unequal variances (assumption violated)
Equal variances (assumption met)

47
Q

ASSUMPTIONS CHECK
* ____ for correlation
* ____ visually check if the data “fans out”; if the data shows this pattern, the data is not homoscedastic (i.e., heteroscedastic)
* ____ for correlation
* ____ visually check if the pattern of the data follows a linear path (i.e., a straight line)
• If yes, there is linearity of relationship
• If the pattern curves, the relationship of the variables is non-linear

A

Homoscedasticity
Scatter plot
Linearity of relationship
Scatter plot

48
Q

ASSUMPTIONS CHECK

To check if the observations are unrelated to each other, the respondents’ scores must not be duplicated

A

Independence of observation

49
Q

ASSUMPTIONS CHECK
CHI SQUARE STATISTICS
* ___ check expected frequencies: there must be at least 5 expected frequencies per group
* ___ check expected frequencies: less than 20% of the cells have an expected frequency that is less than 5
* ___ check that each observation contributes to only one cell (i.e., a respondent can be found in only one group); the encoded data should be checked to confirm that every respondent belongs to only one group

A

Chi-square test of goodness of fit
Chi-square test of independence
Mutual exclusivity

50
Q

ASSUMPTIONS CHECK (CHOOSING THE RIGHT STATISTICS)
* ____ used when the study aims to compare or identify the difference between two groups
* ____ used if the population mean is known but the population variance is unknown
ASSUMPTIONS:
1. Variable is ___
2. Observations are ___
3. ___ distribution
4. No ___

A

T-test
One-sample T-test
Continuous
Independent
Normal
Outliers

51
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC
* ____ comparing two different sets of data from a single set of respondents (i.e., 1 group of respondents, 2 scores each)
ASSUMPTIONS:
1. Dependent variable is ____ (interval or ratio)
2. ___ variable must consist of two categorical, related groups
3. Observations are ___
4. ___ distribution in the difference scores
5. No significant ___ in the difference scores
(Use ____, the nonparametric counterpart of the paired T-test, if the assumptions are violated)

A

T-test for dependent samples (paired T-test)
Continuous
Independent
Independent
Normal
Outliers
Wilcoxon signed-rank test

52
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

Comparing two sets of data from two different sets of respondents (i.e., 2 different groups of respondents, 1 score each)
ASSUMPTIONS:
1. Dependent variable is ___
2. Independent variable should have two categorical ____ (two ___)
3. Observations are independent
4. ___ distribution in the two groups
5. No outliers in the ___
6. There has to be ___ of variance (if this assumption is violated, use ___)

*** Use ___ (nonparametric counterpart of the T-test for independent samples) if the assumptions of the parametric test are violated

A

T-test for independent samples
Continuous
Independent groups (different groups)
Normal
Two groups
Homogeneity
Welch’s t
Mann-Whitney U test
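
The decision path on this card can be sketched as a small function. The function name and its boolean parameters are illustrative assumptions; the branches simply encode the card's rules (nonparametric test when normality or the no-outlier assumption fails, Welch's t when only homogeneity of variance fails).

```python
def choose_two_group_test(normal, no_outliers, equal_variances):
    """Pick a test for comparing two independent groups, per the card's rules."""
    if not (normal and no_outliers):
        # Parametric assumptions violated: use the nonparametric counterpart
        return "Mann-Whitney U test"
    if not equal_variances:
        # Only homogeneity of variance is violated
        return "Welch's t"
    return "T-test for independent samples"

# Hypothetical assumption-check outcomes, for illustration only
print(choose_two_group_test(normal=True, no_outliers=True, equal_variances=True))
print(choose_two_group_test(normal=True, no_outliers=True, equal_variances=False))
print(choose_two_group_test(normal=False, no_outliers=False, equal_variances=True))
```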

53
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

Used to compare three or more groups
ASSUMPTIONS:
1. Dependent variable is continous
2. Independent variable consists of two or more categorical, independent groups
3. ___ are independent
4. No significant outliers
5. ___ distribution is observed among all groups, or the residuals are normally distributed
6. There is ___ of variance
• If all assumptions are met, ____ should be used, followed by a ___ comparison if the ANOVA results indicate statistical significance

A

One-way ANOVA
Observations
Normal
Homogeneity
Student’s ANOVA
Post hoc

54
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC
— ONE-WAY ANOVA —
• _____ = equal sample sizes
• _____ = unequal sample sizes
• _____ = unequal sample sizes
• If all assumptions were met except homogeneity of variance, use ____ with _____ as the post hoc comparison
• If the assumptions were violated, ___ should be used; then either ___ (or ____) or ____ pairwise comparison can be used as the post hoc test

A

Tukey’s HSD test
Tukey-Kramer test
Scheffé test
Welch’s ANOVA
Games-Howell test
Kruskal-Wallis H test
Dunn’s test
Bonferroni procedure
DSCF (Dwass-Steel-Critchlow-Fligner)
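
The rules on this card can be summarized as a lookup. This is a sketch under stated assumptions: the function name and the three boolean parameters are hypothetical, with `assumptions_met` standing for the normality and outlier checks.

```python
def choose_anova(assumptions_met, equal_variances, equal_n):
    """Return (omnibus test, candidate post hoc tests) per the card's rules."""
    if not assumptions_met:
        # Parametric assumptions violated: nonparametric route
        return ("Kruskal-Wallis H test",
                ["Dunn's test", "Bonferroni procedure", "DSCF"])
    if not equal_variances:
        # Everything met except homogeneity of variance
        return ("Welch's ANOVA", ["Games-Howell test"])
    if equal_n:
        return ("One-way ANOVA", ["Tukey's HSD test"])
    return ("One-way ANOVA", ["Tukey-Kramer test", "Scheffé test"])

# Hypothetical outcome: assumptions met, but Levene's test was significant
test, post_hoc = choose_anova(assumptions_met=True, equal_variances=False, equal_n=True)
print(test, post_hoc)
```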

55
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC
• _____ ANOVA = three or more sets of scores come from the same respondents
• _____ ANOVA = two factors are used (e.g., measuring the effect of coffee and music on memory)
• _____ = ANOVA where other variables that might affect the study are controlled
• _____ = ANOVA where more than one dependent variable is studied at once (e.g., effect of coffee on hyperactivity and attention span)
• _____ = like ANCOVA, but more than one dependent variable is studied at once

A

Repeated measures
Two-way
Analysis of covariance (ANCOVA)
Multivariate analysis of variance (MANOVA)
Multivariate analysis of covariance (MANCOVA)

56
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

Used to determine the relationship between two variables
ASSUMPTIONS:
1. Variables are both ___
2. Variables should be ___
3. Observations are independent
4. The relationship must be ___
5. ___ normal distribution is present
6. No ___ or ___ outliers
7. ____ is present

• If the assumptions were violated, use either ____ or _____ (but ___ is preferred due to its robustness compared to Spearman’s)

A

Pearson’s r correlation
Continuous
Paired
Linear
Bivariate
Univariate/multivariate
Homoscedasticity
Kendall’s tau-b correlation
Spearman’s rho
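
For reference, Pearson's r can be computed directly from paired scores as the sum of cross-products divided by the square root of the product of the two sums of squares. The sketch below uses hypothetical paired scores; the function and variable names are illustrative assumptions.

```python
import math

def pearson_r(x, y):
    """Pearson's r for paired, continuous scores of equal length."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical paired scores from five respondents, for illustration only
supervisor_support = [1, 2, 3, 4, 5]
job_satisfaction = [2, 4, 5, 4, 5]

r = pearson_r(supervisor_support, job_satisfaction)
print(round(r, 4))
```

The sketch computes the coefficient only; it does not check the assumptions listed on the card, which still need to be verified (for example with a scatter plot) before interpreting r.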

57
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

Used to determine if the actual proportions of the categories being studied (e.g., number of men vs. women who committed crimes) do not fit an expected distribution (e.g., 50% of those who committed crimes are men while the other 50% are women)
ASSUMPTIONS:
1. ____ categorical Variables
2. ____ of observation
3. ____ exclusive groups
4. At least ___ expected frequencies in each group

A

Chi-square test for goodness of fit
One
Independence
Mutually
5
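
A minimal sketch of the goodness-of-fit computation, including the expected-frequency check from the assumptions above. The observed counts are hypothetical, and the function name is an illustrative assumption; the statistic is the usual sum of (O − E)² / E.

```python
def chi_square_gof(observed, expected_props):
    """Chi-square goodness-of-fit statistic for one categorical variable."""
    total = sum(observed)
    expected = [total * p for p in expected_props]
    # Assumption from the card: at least 5 expected frequencies per group
    if min(expected) < 5:
        raise ValueError("Each group needs an expected frequency of at least 5")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts of men and women, tested against a 50/50 expectation
observed_counts = [70, 30]
chi_square = chi_square_gof(observed_counts, [0.5, 0.5])
print(chi_square)
```

The resulting statistic would still need to be compared against the chi-square distribution (df = number of categories − 1) to obtain a p-value, which statistical software reports directly.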

58
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC
Used to determine if various categories are not related to each other (e.g., if certain color preferences are associated with specific personality traits).
ASSUMPTIONS:
1. ___ categorical variables with at least __ or more groups
2. ___ of observation
3. Less than ___ of the cells have an expected frequency of less than 5

A

Chi-square test for independence
Two
Independence
20%

59
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

It was revealed from previous literature that contractual employees are more optimistic about their job compared to tenured employees. If you were to conduct a similar study, which statistical test is appropriate to use if the data is non-normal, has an equal variance, and there were significant outliers?

A

Mann-Whitney U test

60
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

In her study, Maria wants to identify the effectiveness of mindfulness meditation on the self-transcendence of psychotherapists. She randomly selects a sample of n = 30, wherein one group will be exposed to the meditation technique while the other will not. If her data has no significant outliers but the variances are unequal and the data is not normally distributed, which of the following statistical tests will she utilize?

A

Mann-Whitney U test

61
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

Suppose that you are to examine the impact of learned helplessness on the self-esteem of student leaders. You measure their self-esteem using a 5-point Likert scale before intentionally giving them a very difficult examination, then measure their self-esteem again using the same scale. The difference scores before and after the treatment condition revealed that the data is normally distributed and has no significant outliers. Which of the following statistical tests will you use?

A

T test for dependent samples

62
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTICS

Levene’s test is used to check if?

A

There is homogeneity of variances

63
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC
One of the assumptions in t test is to identify if the observations are independent. Suppose that in your analysis, you were not able to meet this assumption, what does this mean?

A

The scores of the respondents are duplicated

64
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

True or false?
Generally t-test is robust which means that parametric tests can still be employed even if some assumptions were violated

A

Definitely true

65
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

In his study about the impact of social support to the susceptibility of PTSD, Dan is checking if the data is skewed or not normally distributed. In order to do so, what statistical tool should he look for?

A

Shapiro-Wilk test

66
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

Aira would like to compare the rumination tendency of people diagnosed with OCD and people diagnosed with major depressive disorder. Before employing the T-test for independent samples, Aira is looking at the boxplot as part of the assumptions checking. From this scenario, we could say that Aira is testing:

A

If there is/are significant outliers

67
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC
With the goal of comparing the means of the formed groups, if a researcher is checking whether the data is normally distributed, has significant outliers, and has homogeneity of variance, we could say that the researcher is least likely to utilize which parametric test?

A

T test for dependent samples

68
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

Which statistical test is most likely to be utilized if the null hypothesis is written as “There is no significant difference in terms of statistical anxiety among male, female, and LGBTQIA+”?

A

ANOVA

69
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

The HR supervisor wants to examine whether the new shift schedule improves worker productivity. Two groups of workers were formed: the first group follows the traditional shift schedule, and the other group follows the new shift schedule. Which statistical method should the HR supervisor use to determine if there is a significant difference between the formed groups, given that all assumptions were met?

A

T test for independent samples

70
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

Irish is a psychologist who is studying the effects of therapeutic approach (specifically psychodynamic and Rogerian therapy) and session frequency (i.e., once a week and twice a week) on patients’ phobia. If she wants to understand the main effect of each factor and any interaction effect between them, which statistical method should she use?

A

Two way ANOVA

71
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

After analyzing the data using one-way ANOVA, the researcher now needs to conduct a post hoc analysis to identify which of the pairwise comparisons are significantly different. If the sample size in each group is equal and the scores have equal variances among groups, what post hoc test should be used?

A

Tukey’s HSD test

72
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

What post hoc test should be utilized if there is homogeneity of variance but the sample size in each group is not equal?

A

Scheffé test

73
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

An industrial psychologist is conducting a comparative study to assess the levels of stress among employees from three different departments, particularly the HR, marketing, and accounting departments. He administered a stress inventory questionnaire to the employees in each department. Which statistical method should the psychologist use to determine if there are significant differences in stress levels among the three departments, given that all assumptions were met WITH THE EXCEPTION OF HOMOGENEITY OF VARIANCES?

• If the results of the study were significant, what post hoc comparison would be employed?

A

Welch’s ANOVA
Games-Howell test

74
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC
____ there is no significant difference in job satisfaction levels among employees from the three different departments
____ there is no significant difference between personality type (introvert vs extrovert) and task complexity (low vs high) on performance accuracy of factory workers
____ there is no significant difference in happiness scores and self-transcendence among participants exposed to mindfulness technique and psychodrama

A

One way ANOVA
Two way ANOVA
MANOVA

75
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

Jancis wants to compare the attitude towards death among three different age groups (18-25 yrs old, 26-35 yrs old, 36-45 yrs old). Before conducting the analysis, she needs to verify whether the assumption of homogeneity of variance is violated. Which statistical tool should Jancis use for this purpose?

A

Levene’s test

76
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

Suppose that a study will be conducted with the aim of comparing three groups of married couples in terms of their commitment and trust. What statistical test will be used?

• If you were to CONTROL the impact of the sex of the married couples in the analysis, what statistical test would you use?

A

MANOVA
MANCOVA

77
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

If you were to utilize the Kruskal-Wallis H test in your study, which of the following assumptions of the equivalent parametric test was most likely violated?

A

Normality of data

78
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

If you were to check if one way ANOVA will be appropriate to use in your study, which of the following statistical tools would you not utilize in assumption checking?

A

Scatter plot (for correlation)

79
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

Suppose that you are trying to determine if neuroticism (a personality trait that is measured with a total score from a scale) is related to the number of criminal offenses among persons deprived of liberty (PDL). To check if the relationship of the two variables is linear, you will look at:

A

The scatter plot

80
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

Miaca wants to identify if there is a significant relationship between GPA and educational attainment (ordinal/rank) within a group of respondents. If she intends to use the most robust and highest level of statistical analysis possible, we can infer that Miaca will use what specific statistical tool?

A

Kendall’s tau-B test

81
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

Which of the following statistical test will you use if you want to verify the assumption of linearity before employing Pearson’s r?

A

Scatter plot

82
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

In a correlational study about the driving anxiety and narcissistic tendency of truck drivers, the data in both variables were found to be normally distributed, homoscedastic and have no significant outliers. What statistical test will be utilized in this scenario?

A

Pearson’s r

83
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

Suppose that you are conducting a correlational study about the relationship of birth order (ranking) to the compassion of Filipinos. Given that the assumptions of equal variances, linearity of relationship, normal distribution, and no significant outliers are met, what statistical test will be best to use?

A

Kendall’s tau-B

84
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

Considering the high attrition rate in a BPO company, a researcher would like to conduct a study to predict the job commitment of applicants based on their IQ scores. Given that all assumptions were met, what statistical test would the researcher utilize to attain the objective of the study?

A

Simple linear regression

85
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

Ningning is examining the possible association between dating violence tendency and online disinhibition of Gen Z social media users. Upon checking the assumptions, she found that the data is homoscedastic, has no significant outliers, and the two variables have a linear relationship. Upon checking normality, the Shapiro-Wilk test for dating violence tendency yielded p = 0.048, while online disinhibition yielded p = 0.095. From the provided information on assumptions checking, what statistical test should Ningning use?

A

Kendall’s tau-B

86
Q

ASSUMPTIONS CHECK CHOOSING THE RIGHT STATISTIC

In a comparative study about fake news susceptibility between males and females, assumptions were verified to justify the use of a parametric test. Levene’s test was one of the statistical tests used, and it yielded p = 0.79. From this data we can infer that:

A

The data has equal variances

87
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)

A sociologist is studying the cultural diversity within a metropolitan city. They design a survey to collect data on the various ethnic backgrounds of the city’s inhabitants: “What ethnic group do you identify with?” (Asian, African, White, American, Pacific Islander, etc.) Respondents select the category that best represents their ethnic identity.
What type of level of measurement is this?

A

Nominal scale

88
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)

A market research team is conducting a survey to evaluate consumer preferences for different brands of coffee. They ask participants to rank their preferences for five brands from most preferred to least preferred. What type of measurement is this?

A

Ordinal scale

89
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)

A team of psychologists is studying the effect of a new cognitive behavioral therapy program on stress levels in adults. They decided to measure stress using a standardized psychological scale that rates stress from 0 to 100, where 0 represents no stress and 100 represents extreme stress. The scale is designed so that each increment represents an equal increase in stress level, but the zero point does not mean the absence of stress; it is simply the lowest point on the scale. Respondents are asked: “On a scale from 0 to 100, with 0 indicating no stress and 100 indicating extreme stress, how would you rate your current level of stress?”

A

Interval scale

90
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)

In a research study examining the impact of diet and exercise on weight loss, participants are asked to track their daily caloric intake and weekly weight loss over a period of three months. The researchers provide a digital food diary app where participants can log every meal and snack they consume, with the app calculating the total calories. Additionally, participants are given a smart scale that records their weight each week. The data collected include questions such as “Please enter the total number of calories you consumed today” and “How many pounds have you lost this week?”

A

Ratio scale

91
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)

A public health researcher is studying the incidence of flu cases in a small town during the winter season. They set up a tracking system where local clinics report the number of confirmed flu cases each week. The researcher asks the clinics, “Please report the total number of confirmed flu cases at the end of each week.” The clinics submit their weekly flu case counts.

What type of variable is this?

A

Discrete variable

92
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)

A climatologist is studying the impact of climate change on regional temperatures. They collect temperature data from various weather stations over a decade. The research includes analyzing daily high temperatures to observe trends and variations. The climatologist asks:
“Please record the daily high temperature in degrees Celsius for each day of the year.”

What type of variable is this?

A

Continuous variable

93
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)

A researcher is studying the factors that influence voter turnout in local elections. They design a survey to be distributed to a random sample of the population after the upcoming election. One of the key questions in the survey is:

“Did you vote in the recent local election?”
Yes
No
What type of variable is this?

A

Dichotomous variable

94
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)

A team of educational researchers wants to determine if a new teaching method improves student performance more than the traditional method. They select a random sample of 100 students from a large population of high school students and divide them into two groups. One group is taught using the new method, while the other group is taught using the traditional method. After a semester, the researchers measure the academic performance of students in both groups.

What type of statistical method is this?

A

Inferential statistics

95
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)

A group of agricultural scientists is conducting a study on the yield of a new variety of wheat. They collect data from 50 farms across the country that have adopted this new variety. The data includes the total wheat yield per acre for each farm.
The survey may include a question like: “On average, how many bushels of wheat did you harvest per acre this season?”

What type of statistical method is this?

A

Descriptive statistics

96
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)
___ Used to summarize and describe a group of numbers from a research study. They form the basis of virtually every quantitative analysis of data.
• They are often the FIRST STEP in data analysis, providing a foundation for further statistical analysis.
• They help in visualizing data to identify PATTERNS AND TRENDS.

Example:
What is the average monthly income of single mothers in Cavite?

What is the IQ level of first year psychology students?

A teacher wishes to determine the percentage of students who passed the preliminary examination in differential calculus.

A student wishes to determine the average monthly expenditures on school supplies for the past 3 weeks.

A

Descriptive statistics

97
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)

Used to draw conclusions and make inferences that are based on the numbers from a research study but that go beyond the numbers. Involves using SAMPLE data to draw conclusions about a population.
Example:
Is there a significant difference in terms of IQ between men and women?

Is there a significant relationship between monthly income and life satisfaction?

A manager would like to predict, based on previous years’ sales, the sales performance of a company for the next five years.

A politician would like to estimate, based on an opinion poll, his chance of winning in the upcoming 2019 senatorial election.

A basketball player wants to estimate his chance of winning the Most Valuable Player (MVP) award based on his season averages and the averages of his opponents.

A

Inferential statistics

98
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)

___ The entire set of the individuals of interest for a particular research question

___ A set of individuals selected from a population, usually intended to represent the population in a research study

A

Population
Sample

99
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)

A team of educational psychologists is interested in studying “math anxiety” among high school students. They hypothesize that math anxiety negatively impacts students’ performance in mathematics. To investigate this, they develop a research question:

“How does math anxiety, as measured by self-reported anxiety levels during math exams, affect the test scores of high school students?”

What type of data/variable is this?

A

Construct (Math anxiety)

100
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)

____ characteristics that can have different values
____ internal characteristics that cannot be directly observed; used to explore complex ideas and relationships that cannot be directly measured but can be assessed through proxy measures or indicators
____ possible number or category that a variable can have
____ a particular person’s value on a variable
____ collection of measurements or observations; the complete set of scores

A

Variable
Construct
Values
Score
Data/ data set

101
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)

___ a value, usually numerical, that describes a population
EX : average gross income of all people in the Philippines
___ a value, usually numerical, that describes a sample
EX : 2019 gross income of people in a sample of 3 regions

A

Parameter
Statistic

102
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)

The naturally occurring discrepancy that exists between a sample statistic and the corresponding population parameter

A

Sampling error
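
As a rough illustration, the Python sketch below compares a sample statistic with the population parameter it estimates. The population values and the hand-picked sample are hypothetical; in practice the sample would be drawn at random.

```python
from statistics import mean

# Hypothetical population of 10 scores (an assumption for illustration).
population = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# A fixed sample of 4 scores, chosen by hand so the result is reproducible.
sample = [2, 5, 7, 10]

parameter = mean(population)             # describes the population: 5.5
statistic = mean(sample)                 # describes the sample: 6
sampling_error = statistic - parameter   # discrepancy between the two: 0.5
print(sampling_error)
```

A different sample would usually give a different discrepancy, which is why the error is described as naturally occurring.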

103
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)
___ the values are names or categories, and the score is not numerical (the lowest level of measurement; it does not provide magnitude)
Example: Name, Sex, Nationality, Marital status, ID number, Eye color, Hair color, Blood type

____ the numbers stand only for relative ranking (also called a rank-ordered variable). The scale has magnitude. “Most of the psychological variables are ordinal in nature.”
Examples: awards in a contest, EDUCATIONAL LEVEL, opinion on an issue (strongly agree, agree, neutral, etc.), PAIN LEVEL (mild, moderate), MILITARY RANK (private, corporal, sergeant), SOCIOECONOMIC STATUS (low, middle, high)

A

Nominal scale
Ordinal scale

104
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)

___ measures magnitude with equal intervals. Psychological variables that are ordinal in nature are then considered to be interval once processed statistically (DISTANCE IS MEANINGFUL)
Examples: Temperature (10°C, 20°C, 30°C), midterm exam scores, results in a personality test, Dates (1990, 2000, 2010), Credit score (300 to 850)

____ measures magnitude with equal intervals between the values and has a true absolute zero (highest level of measurement)
Examples: TEMPERATURE in KELVIN, WEIGHT (50 kg, 60 kg), HEIGHT (150 cm, 160 cm), AGE measured in years (20 years, 30 years), INCOME ($30,000, $50,000), DISTANCE in miles or meters (5 meters, 10 meters)

A

Interval scale
Ratio scale

105
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)
____ one that has specific values and cannot have values between the specific values
Ex: number of cars, students, etc.
____ there are in theory an infinite number of values between any two values
Ex: speed, temperature

A

Discrete variable
Continuous variable

106
Q

CHAPTER 1 (TYPES OF DATA, VARIABLES AND MEASUREMENT)
____ these have only two values
___ naturally formed two categories (male & female, yes or no)
___ reflects an underlying continuous scale forced into a dichotomy (passed or failed)

A

Dichotomous
True dichotomous
Artificial dichotomous

107
Q

CHAPTER 2 DESCRIPTIVE STATISTICS
____ statistical measure that identifies the center point or typical value of a dataset. It helps summarize a large set of data with a single value that represents the middle of the distribution: the most typical or most representative value of the entire set of scores.
• It aims to provide an accurate description of the entire data set with a single value that reflects the center of the data distribution.

OUTLIERS : Extreme values can skew the mean (use the median or mode, which are less affected by outliers).
SKEWED : In a skewed distribution, the mean, median, and mode can differ significantly (choose the measure that best represents the data’s central tendency based on the distribution shape).

A

Central tendency

108
Q

CHAPTER 2 DESCRIPTIVE STATISTICS
___ the sum of all the scores in the distribution divided by the number of scores
CHARACTERISTICS:
- Changing a score in the distribution can affect the value of the mean.
- Introducing a new score or removing a score can affect the value of the mean.
- Adding or subtracting a constant from each score will change the value of the mean.
- Multiplying or dividing each score by a constant will change the value of the mean.
WHEN TO USE THE MEAN:
1. Commonly used in quantitative research, especially in psychological studies
2. Approximately normally distributed data without outliers
3. With equal interval variables
4. CONTINUOUS DATA/VARIABLE
5. INTERVAL/RATIO
6. Suitable for numerical data where you want to find the central tendency
Ex:
A researcher conducts a survey to measure job satisfaction among employees in a company. What should the researcher calculate to understand the overall sentiment?

A

Mean
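
The characteristics listed on this card can be checked with a few lines of Python. This is only a sketch; the scores are made up for illustration.

```python
from statistics import mean

scores = [2, 4, 6, 8, 10]        # hypothetical scores
base = mean(scores)              # (2 + 4 + 6 + 8 + 10) / 5 = 6

# Adding a constant to every score shifts the mean by that constant.
shifted = mean([s + 3 for s in scores])    # 9

# Multiplying every score by a constant multiplies the mean by it.
scaled = mean([s * 2 for s in scores])     # 12

# Changing a single score (10 becomes 100) changes the mean.
changed = mean([2, 4, 6, 8, 100])          # 24

print(base, shifted, scaled, changed)
```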

109
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

In a clinical trial, researchers measure the cholesterol levels of participants before and after a new treatment. What should they calculate to summarize the cholesterol levels and assess the treatment’s effectiveness?

A

Mean

110
Q

CHAPTER 2 DESCRIPTIVE STATISTICS
_____ an average in which each observation in the data set is multiplied by an assigned weight before summing to a single average value.
• It accounts for the varying significance of different data points, making it more accurate in certain contexts.

Ex: A hospital calculates the ___ recovery time for patients undergoing different treatments, considering the number of patients and the recovery time for each treatment type.

A

Weighted mean

111
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

A professor calculates the final grade for a course by assigning weights to assignments (20%), the midterm exam (30%), and the final exam (50%). What type of central tendency do we use to determine the overall performance of each student?

A

Weighted mean
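
A minimal sketch of the calculation, using the weights from this card. The three component scores are hypothetical.

```python
# Weights (in percent) come from the card; the scores are assumed for illustration.
weights = {"assignments": 20, "midterm": 30, "final": 50}
scores = {"assignments": 90, "midterm": 80, "final": 70}

# Multiply each score by its weight, sum, then divide by the total weight.
weighted_sum = sum(weights[k] * scores[k] for k in weights)   # 1800 + 2400 + 3500
final_grade = weighted_sum / sum(weights.values())            # 7700 / 100
print(final_grade)   # 77.0
```

The unweighted mean of the same three scores would be 80, so the heavier weight on the final exam pulls the grade toward that score.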

112
Q

CHAPTER 2 DESCRIPTIVE STATISTICS
___ the middle score when all the scores in a distribution are arranged from lowest to highest (not affected by outliers)
WHEN TO USE?
1. With rank-ordered variables
2. Non-normal or skewed distributions (robustness)
3. When a distribution has one or more outliers (scores that are extremely high or low in relation to the other scores in the distribution)

Ex: A school administrator collects test scores from students across different classes. They use this type of central tendency to evaluate the typical performance, minimizing the impact of extremely high or low scores.

A

Median
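
To see why the median is preferred when outliers are present, the sketch below compares the mean and the median on two hypothetical sets of test scores that differ in a single extreme value.

```python
from statistics import mean, median

scores = [70, 72, 75, 78, 80]           # hypothetical test scores
with_outlier = [70, 72, 75, 78, 180]    # same scores, one extreme value

mean_plain, median_plain = mean(scores), median(scores)                  # 75 and 75
mean_outlier, median_outlier = mean(with_outlier), median(with_outlier)  # 95 and 75

print(mean_plain, median_plain)
print(mean_outlier, median_outlier)
```

The single extreme score moves the mean by 20 points, while the median does not change.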

113
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

A researcher conducts a survey to measure the level of job satisfaction among employees in a company. What type of central tendency should they calculate from the satisfaction scores to understand the typical sentiment, especially when there are OUTLIERS?

A

Median

114
Q

CHAPTER 2 DESCRIPTIVE STATISTICS
____ the score or category that has the greatest frequency; it can be used to determine the typical or average value for any scale of measurement, including the nominal scale
WHEN TO USE?
1. With categorical (nominal) data, where the mean and median cannot be calculated, and also with numerical data
2. It provides insights into the distribution of data, especially in skewed distributions

Ex: A teacher collects data on the grades of students in a class. What type of central tendency do they calculate to find the most frequently occurring grade, which can help identify the overall performance trend?

A

Mode
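
Because the mode only counts how often each value occurs, it also works on non-numerical data. A small sketch with hypothetical letter grades:

```python
from statistics import mode

# Hypothetical letter grades for a small class (nominal/ordinal labels, not numbers).
grades = ["B", "A", "B", "C", "B", "A"]

most_common_grade = mode(grades)   # "B" occurs three times
print(most_common_grade)
```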

115
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

A researcher conducts a survey to find out the most common leisure activity among adults in a city. What type of central tendency should they use to identify the activity that is most frequently chosen?

A

Mode

116
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

___ frequency distribution with one value clearly having a larger frequency than any other (has only one high point)
• When the distribution is also symmetrical, the mean, median, and mode are all equal, providing a clear measure of central tendency.

Ex: A researcher collects data on the number of hours students spend studying each week. They find that the distribution centers on one point, with most students studying around 10 hours per week.

A

Unimodal distribution

117
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

In a study examining the distribution of daily physical activity levels among adults in a metropolitan area, researchers aimed to determine the type of distribution. They collected data on the number of steps taken per day by a sample of 1,000 participants over a month. The analysis revealed that the distribution of daily steps had a single peak around 7,000 steps per day. This indicated that most participants had a similar level of physical activity, clustering around this central value.

A

Unimodal distribution

118
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

___ frequency distribution with two approximately equal frequencies, each clearly larger than any others (has two fairly equal high points)
• Often indicates the presence of two distinct subpopulations within the data.
• The data has two different modes. This can occur when the data is derived from two different processes or populations.

Ex: In a study of daily commute times for employees, researchers found two peaks, indicating common commute durations of 30 and 60 minutes.

A

Bimodal distribution
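
A distribution with two modes can be detected by listing every value that shares the highest frequency. The sketch below uses Python's `statistics.multimode` on hypothetical commute times that echo the 30 and 60 minute example.

```python
from statistics import multimode

# Hypothetical commute times in minutes (values assumed for illustration).
commute_times = [30, 30, 30, 45, 60, 60, 60]

modes = multimode(commute_times)   # all values tied for the highest frequency
print(modes)                       # [30, 60]
```

Two values in the result suggest a bimodal shape; three or more would suggest a multimodal one.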

119
Q

CHAPTER 2 DESCRIPTIVE STATISTICS
____ frequency distribution with two or more high frequencies separated by a lower frequency (2 or more high points)
• Has two or more modes, which appear as distinct peaks
• There are multiple values (or ranges of values) that occur more frequently than others in the data set.

Ex: In a study of daily commute times for employees, researchers found that the distribution has three common commute durations: 20, 40, and 69 minutes
EX : Exam scores

A

Multimodal distribution

120
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

____ frequency distribution in which all values have approximately the same frequency
• This distribution can be either discrete or continuous
• Commonly used in uncertainty analysis when the exact distribution of data is unknown

Ex: In a study of the distribution of birth dates among students in a large university, researchers found that each month had an approximately equal number of births.

A

Rectangular distribution

121
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

____ distribution in which the pattern of frequencies on the left and right side are mirror images of each other
• The mean, median and mode are all equal and located at the center of the distribution
• Have zero skewness, meaning there is no bias towards the left or right side.

Ex: In a study of students’ test scores, researchers found that the scores were evenly distributed around the mean, with the left and right sides mirroring each other.

A

Symmetrical distribution

122
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

____ frequencies are not equally distributed on both sides of the central value. “THE MORE THE MEAN MOVES AWAY FROM THE MODE, THE LARGER THE SKEWNESS”

• The distribution is not symmetrical around the mean
• In this distribution the mean, median, and mode are not equal. The mean is PULLED in the direction of the skew.
• SKEWNESS can indicate the presence of OUTLIERS

Ex: In a study of household income in a city, researchers found that most households earned between $30,000 and $70,000 per year, but there were a few households with significantly higher incomes.

A

Skewed distribution

123
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

In a study examining the distribution of daily screen time among teenagers, researchers aimed to understand the patterns of digital device usage. They collected data from a sample of 1,500 teenagers, recording the number of hours spent on screens each day. The distribution indicated that while most teenagers spent between 2 and 4 hours per day on screens, there were a few who spent significantly less time.

What type of distribution is this?

A

Skewed distribution

124
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

___ a distribution that is skewed to the right. Most values are clustered on the left side, and the right tail of the distribution is longer (FLOOR EFFECT)

• The mean is greater than the median, which is greater than the mode (Mean > Median > Mode).
• THE LONGER TAIL ON THE RIGHT SIDE INDICATES THE PRESENCE OF HIGHER VALUES
• The distribution is asymmetrical, with a peak on the left and a tail extending to the right

EX : Income distribution. In many countries, a small percentage of people earn significantly more than the majority.
EX : Exam scores. On a particularly difficult exam, most students score low and a few score very high.

A

Positively skewed distribution
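
The ordering Mean > Median > Mode can be verified on a small right-skewed data set. The incomes below (in thousands) are hypothetical and chosen so that a single high earner stretches the right tail.

```python
from statistics import mean, median, mode

# Hypothetical incomes in thousands; 200 is the extreme high value.
incomes = [20, 20, 20, 30, 30, 40, 50, 200]

m_mean = mean(incomes)       # 410 / 8 = 51.25
m_median = median(incomes)   # average of the two middle values: 30.0
m_mode = mode(incomes)       # 20

print(m_mean > m_median > m_mode)   # True
```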

125
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

___ a distribution that is skewed to the left. Most values are clustered on the right side of the distribution, while the left tail of the distribution is longer (CEILING EFFECT)
• The mean is less than the median, which is less than the mode (Mean < Median < Mode).
• The LONGER TAIL on the left side indicates the PRESENCE OF LOWER VALUES.
• Peak on the right and a tail extending to the left

EX : Age at death, with few people dying at younger ages.
Ex: In a study of the age at which people retire, researchers found that while most people retire around the age of 65, there are a few who retire much earlier, extending the left tail.

A

Negatively skewed

126
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

___ the extent to which a frequency distribution deviates from a normal curve in terms of whether its curve in the middle is more peaked or flat than the normal curve.
• IT INDICATES HOW MUCH DATA RESIDES IN THE TAILS COMPARED TO THE CENTER OF THE DISTRIBUTION
• Measures the “tailedness” of a distribution, which refers to the FREQUENCY and EXTREMITY OF OUTLIERS
• Helps in identifying the presence of outliers and understanding the distribution’s shape

Ex: In a study of stock market returns, researchers found that the distribution had fat tails and a higher likelihood of extreme values compared to a normal distribution.

A

Kurtosis
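
One common way to put a number on kurtosis is the fourth central moment divided by the squared variance, which equals 3 for a normal curve. The sketch below assumes that population formula (statistical packages may apply sample corrections or report excess kurtosis instead) and uses two hypothetical data sets.

```python
def kurtosis(data):
    """Population kurtosis: fourth central moment divided by squared variance."""
    n = len(data)
    m = sum(data) / n
    m2 = sum((x - m) ** 2 for x in data) / n   # variance
    m4 = sum((x - m) ** 4 for x in data) / n   # fourth central moment
    return m4 / m2 ** 2

flat = [1, 2, 3, 4, 5]                            # evenly spread, no extremes
heavy_tails = [0, 0, 0, 0, 0, 0, 0, 0, -10, 10]   # mostly central, two extremes

k_flat = kurtosis(flat)           # about 1.7, below 3
k_heavy = kurtosis(heavy_tails)   # 5.0, above 3
print(k_flat, k_heavy)
```

Subtracting 3 from either value gives the excess kurtosis, which is negative for the first data set and positive for the second.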

127
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

___ the scores are CONCENTRATED TOWARD THE MEAN, but there are more extreme values than in a normal distribution.
• The presence of extreme values or outliers can create this distribution
• Fat tails, more outliers
• Kurtosis > 3 (greater than 3)
• The distribution has a sharp peak around the mean.

Ex: In a study of earthquakes in a seismically active region, researchers found that the distribution has a higher concentration of moderate magnitudes around the mean and a greater likelihood of extreme magnitudes compared to a normal distribution.

A

Leptokurtic

128
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

___ normal curve
• Kurtosis value of approximately 3, or an excess kurtosis of 0. This means that the distribution has a moderate level of “tailedness”, with neither particularly heavy nor particularly light tails.
• Typically symmetrical around the mean, like the normal distribution

Ex: In a study of the distribution of birth weights among newborns in a hospital, researchers found that the data had a moderate level of kurtosis, similar to a normal distribution.

A

Mesokurtic

129
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

___ the scores have an EXTREMELY LARGE DEVIATION FROM THE MEAN. Characterized by a lower peak and thinner tails.
• The data is more spread out and there are fewer extreme values (outliers)
• Has a flatter peak around the mean.
• Kurtosis less than 3 (< 3), indicating negative excess kurtosis
• Lacks outliers

Ex: In a study of daily rainfall amounts in a desert region, researchers found that the rainfall amounts were more evenly spread out, with fewer extreme events compared to a normal distribution.

A

Platykurtic

130
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

In a study examining the distribution of daily energy consumption in residential households, researchers aimed to understand the variability and patterns of electricity usage. They collected data over a year, recording the daily energy consumption in kilowatt-hours for a sample of 500 households. The analysis revealed that the energy consumption values were more evenly spread out, with fewer extreme highs or lows compared to a normal distribution. This finding suggested that while there were fluctuations in daily energy usage, the likelihood of very high or very low consumption days was lower.

What type of kurtosis is this?

A

Platykurtic

131
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

___ a way to organize data to show HOW OFTEN EACH VALUE OCCURS. It can be presented in a table or graph, making it easier to see patterns and trends in the data.
• It helps organize a large data set into a more understandable format.

Ex: In a study of the types of pets owned by students in a school, researchers found that the most common pet was the dog, followed by cats and fish.

A

Frequency distribution
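
A frequency distribution can be tabulated directly by counting how often each value occurs. The sketch below uses a short, made-up list of pets that follows the dog, cat, fish ordering in the example.

```python
from collections import Counter

# Hypothetical survey responses (assumed for illustration).
pets = ["dog", "cat", "dog", "fish", "dog", "cat"]

freq = Counter(pets)                   # maps each category to its frequency
for pet, f in freq.most_common():      # most frequent category first
    print(pet, f)
```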

132
Q

CHAPTER 2 DESCRIPTIVE STATISTICS
___ measures the fraction of the total group that is associated with each score.
• It shows that the two ratios are equal.
• Not all relationships are proportional.

Ex: Imagine you are conducting a study to understand the work preferences of employees in a mid-size company. You send out a survey to all 500 employees, asking them whether they prefer remote work or working in the office. You want to test if the ___ of employees who prefer remote work is significantly greater than 60%.

A

Proportion

133
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

An amount of something often expressed as a number out of 100

• Express relative values, making it easier to compare different quantities.

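FORMULA : percentage = p(100) = (f / n)(100), where f is the frequency, n is the total number of scores, and p = f / n is the proportion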

A

Percentage
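
The formula on this card, proportion times 100, can be sketched as follows. The frequency and group size are hypothetical.

```python
f = 15    # hypothetical frequency: individuals with a given score
n = 60    # hypothetical total number of individuals

proportion = f / n               # 0.25
percentage = proportion * 100    # 25.0
print(proportion, percentage)
```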

134
Q

CHAPTER 2 DESCRIPTIVE STATISTICS

___ frequency table in which the number of individuals (frequency) is given for each interval of values. It organizes data into intervals or groups, known as class intervals, and displays the frequency of scores that fall within each interval.
• Each class interval represents a range of values, and the frequency indicates how many data points fall within that range.

___ range of values in a grouped frequency table that are grouped together.

Ex: Imagine you are conducting a study to understand the age distribution of customers at a local coffee shop. Over a week, you collect data on the ages of 100 customers who visit the shop.

A

Grouped frequency table
Interval
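
A grouped frequency table can be built by assigning each score to its class interval and counting. The sketch below assumes intervals 10 years wide and uses ten hypothetical customer ages rather than the full 100.

```python
from collections import Counter

ages = [18, 22, 25, 31, 34, 37, 41, 45, 52, 67]   # hypothetical ages
width = 10                                         # assumed interval width

# The lower limit of each interval is the age rounded down to a multiple of 10.
table = Counter((age // width) * width for age in ages)

for lower in sorted(table):
    print(f"{lower}-{lower + width - 1}: {table[lower]}")
```

The first printed row covers the interval 10-19 and the last covers 60-69.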

135
Q

CHAPTER 2 DESCRIPTIVE STATISTICS
___ barlike graph of a frequency distribution in which the values are plotted along the horizontal axis and the height of each bar is the frequency of that value; the bars are usually placed next to each other without spaces (like a city’s skyline)
• Used for CONTINUOUS DATA, where data points are measured on a continuous level
• Used for identifying patterns, such as normal distribution or skewness

Ex: What type of graph did the teacher use to display the distribution of student test scores, making it easier to identify the most common performance levels?

A

Histogram

136
Q

CHAPTER 2 (MEASURES OF VARIABILITY)

___ Refers to how spread out or dispersed the values in a data set are. It helps us understand the degree of difference among data points.
• Low variability indicates that the data points are close to the mean, making predictions more reliable. High variability suggests greater differences among data points, making predictions less certain.
PURPOSE:
1. Describes the distribution and consistency of the data
2. Measures how well an individual score (or group of scores) represents the entire distribution
3. Detects unusual values that may affect analysis

EX : Imagine you are a quality control manager at a factory, and you want to analyze the consistency of the lengths of bolts produced. You measure the lengths of 10 bolts and get the following data (in mm): 50, 51, 49, 52, 50, 51, 49, 50, 52, 51

A

Variability
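
For the bolt lengths in the example, the usual measures of variability can be computed as below. The sketch treats the 10 bolts as the whole population (dividing by N); treating them as a sample would divide by N - 1 instead.

```python
from statistics import mean, pvariance, pstdev

lengths = [50, 51, 49, 52, 50, 51, 49, 50, 52, 51]   # bolt lengths in mm, from the card

center = mean(lengths)                  # 50.5
spread = max(lengths) - min(lengths)    # range: 52 - 49 = 3
variance = pvariance(lengths)           # sum of squared deviations 10.5 / 10 = 1.05
std_dev = pstdev(lengths)               # square root of 1.05, about 1.02

print(center, spread, variance, round(std_dev, 2))
```

A standard deviation of about 1 mm around a mean of 50.5 mm indicates low variability, that is, consistent bolt lengths.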

137
Q

CHAPTER 2 (MEASURES OF VARIABILITY)

Which is a more reliable measure of variability compared to the range?

It takes into account all data points and their distances from the mean, providing a more comprehensive measure of variability.

A

Standard deviation

138
Q

CHAPTER 2 (MEASURES OF VARIABILITY)
___ the distance covered by the scores in a distribution, from the lowest to the highest score (it does NOT give an accurate description of variability)
• It provides a quick sense of the spread or dispersion of the data, showing which data set has more variability.
• It is sensitive to outliers.
• It considers only the extreme values, ignoring the distribution of the rest of the data.

EX : Imagine you are a teacher analyzing the test scores of your students to understand the variability in their performance. The test scores are as follows: 55, 60, 65, 70, 75, 80, 85, 90, 95, 100. The highest score is 100 and the lowest score is 55, so the ____ is ____?

This tells you that there is a __ point difference between the highest and lowest scores, indicating the spread of the students’ performance.

A

Range
45
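
The calculation, and the sensitivity to outliers noted on the card, can be sketched as follows. The extra score of 5 is a hypothetical outlier added for illustration.

```python
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]   # test scores from the card

score_range = max(scores) - min(scores)              # 100 - 55 = 45

# One hypothetical extreme score is enough to change the range drastically.
with_outlier = scores + [5]
range_with_outlier = max(with_outlier) - min(with_outlier)   # 100 - 5 = 95

print(score_range, range_with_outlier)
```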

139
Q

CHAPTER 2 (MEASURES OF VARIABILITY)

Which measure of variability is less reliable if there are outliers in the data set?

This can be less reliable if there are OUTLIERS because it only considers the highest and lowest values, which can be significantly affected by extreme values.

A

Range

140
Q

CHAPTER 2 (MEASURES OF VARIABILITY)

___ the average of each score’s squared difference from the mean (rarely used as a descriptive statistic on its own because it is based on squared deviation scores)
• Variance provides a measure of how much the data points differ from the mean, indicating the spread of the data set

WHERE TO USE?
1. DESCRIPTIVE STATISTICS - (It provides insight on how much the data points differ from the mean)
2. HYPOTHESIS TESTING - (Used in various hypothesis tests to determine if there are significant differences between groups)
3. GOODNESS OF FIT - (Used to assess how well a statistical model fits the data)
4. To analyze the VARIABILITY in students’ test scores

A

Variance

141
Q

CHAPTER 2 (MEASURES OF VARIABILITY)

What type of measure of variability is useful in comparing the spread of TWO DIFFERENT data sets?

It provides a numerical value that represents the degree of spread or dispersion. This helps in understanding how much the values in each data set differ from their respective means.

A

Variance
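
As a sketch of such a comparison, the two hypothetical data sets below share the same mean but differ in spread. Population variance (dividing by N) is assumed here.

```python
from statistics import mean, pvariance

set_a = [48, 50, 52]   # hypothetical scores, tightly clustered
set_b = [40, 50, 60]   # hypothetical scores, widely spread

same_center = mean(set_a) == mean(set_b)   # both means are 50
var_a = pvariance(set_a)                   # (4 + 0 + 4) / 3, about 2.67
var_b = pvariance(set_b)                   # (100 + 0 + 100) / 3, about 66.67

print(same_center, round(var_a, 2), round(var_b, 2))
```

The larger variance of the second set shows that its scores differ more from their mean, even though the two sets have the same central tendency.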

142
Q

CHAPTER 2 (MEASURES OF VARIABILITY)
___ a statistic that consistently overestimates or underestimates the corresponding population parameter
• Deviates from the true value
• Results from non-random or poorly designed sampling methods
EX : Imagine you are a market researcher trying to understand customer satisfaction with a new product. You send out a survey, but only the most satisfied customers respond.

____ a statistic for which the expected value of the estimator equals the true value of the population parameter
• Accurate on average
• Typically results from proper random sampling methods
• More reliable for making inferences about the population
EX : You randomly select students from the entire school to estimate the average height.

A

Biased statistics
Unbiased statistics

143
Q

CHAPTER 2 (MEASURES OF VARIABILITY)

____ is a robust measure of how spread out the data is, used when the assumptions of the standard deviation are not met. It is LESS affected by OUTLIERS than other measures such as the standard deviation. Good for assessing the DISPERSION OF DATA with extreme values and a NON-NORMAL SHAPE of distribution.

HOW TO CALCULATE?
1. Find the median of the data set
2. Subtract the median from each value and take the absolute value to find each absolute deviation
3. Find the median of these absolute deviations

WHERE TO USE?
• Robust statistics - It is more resilient to outliers than the standard deviation, making it useful for datasets with extreme values.
• Non-normal distribution - It is particularly effective for data that does not follow a normal distribution
• Psychological research - To analyze reaction times or other measurements that may have outliers
• For normally distributed data, standard deviation might be more appropriate

A

Median absolute deviation
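
The three calculation steps on this card can be sketched in Python as follows. The data set is hypothetical and includes one extreme value to show the robustness.

```python
from statistics import median, pstdev

data = [1, 2, 3, 4, 100]    # hypothetical scores; 100 is an outlier

# Step 1: find the median of the data set.
med = median(data)                          # 3
# Step 2: absolute deviation of each value from the median.
abs_dev = [abs(x - med) for x in data]      # [2, 1, 0, 1, 97]
# Step 3: the median of those absolute deviations.
mad = median(abs_dev)                       # 1

print(mad, round(pstdev(data), 2))
```

The outlier inflates the standard deviation to roughly 39, while the median absolute deviation stays at 1.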

144
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)

___ It is the number of standard deviations that a score is above (or below, if it is negative) the mean of its distribution. It indicates how many standard deviations a data point is from the mean.
• It allows comparison of scores from different tests or measurements (e.g., comparing SAT and ACT scores).
• It assumes normality; if the data is skewed, it may not be accurate.
• It helps identify outliers (data points significantly different from the rest of the dataset). Typically, data points with a ___ greater than 3 or less than -3 are considered potential outliers.

EX : Imagine you are a researcher analyzing the heights of a sample of adult males. The average height is 175 cm with a standard deviation of 10 cm. You want to find out how a height of 190 cm compares to the average.
WHERE TO USE?
1. To compare scores from different distributions
2. To determine if a data point is significantly different from the mean
3. To find the probability of a data point occurring within a normal distribution
4. To assess whether a product measurement deviates significantly from the standard

A

Z score
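
For the height example on this card, the Z score is the raw score minus the mean, divided by the standard deviation. A minimal sketch:

```python
height = 190    # raw score in cm, from the card
mean_h = 175    # mean height in cm, from the card
sd_h = 10       # standard deviation in cm, from the card

z = (height - mean_h) / sd_h   # (190 - 175) / 10
print(z)                       # 1.5
```

A height of 190 cm is therefore 1.5 standard deviations above the mean, which is well inside the ±3 range used for flagging potential outliers.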

145
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)

___ is an unaltered measurement or data point collected from a test or observation. It represents the initial, direct result before any statistical analysis or transformation (ORIGINAL FORM).
WHERE TO USE?
1. To record and analyze student performance on tests and assignments
2. To measure responses in surveys or assessments
3. To gather initial data from surveys or questionnaires
4. To record initial measurements such as blood pressure, weight, or other health indicators

A

Raw score

146
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)
• The ___ of a distribution of Z scores is always equal to 0. A Z score of 0 indicates that the data point is exactly at the mean of the distribution.
WHERE TO USE?
1. To compare scores from different distributions
2. To determine if a data point is significantly different from the mean
3. To find the probability of a data point occurring within a normal distribution
4. To assess whether a product measurement deviates significantly from the standard

A

Mean in Z score

147
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)

____ Helps to standardize different data points, making it easier to compare them across different distributions (Z score).

A

Standard deviation of Z score

148
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)

___ a normal distribution that has been transformed so that it has a mean of 0 and a standard deviation of 1. This transformation is achieved by converting the values of a normal distribution into Z scores.
• Used in the Z test to determine if there is a significant difference between sample and population means
• Helps in constructing confidence intervals for population parameters
• Allows comparison of individual scores from different distributions by converting them to a common scale, even if the distributions have different means and standard deviations
• Helps in normalizing data, which is crucial for many statistical methods

A

Standardized distribution

149
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)

___ a symmetrical bell shaped curve that represents the distribution of a set of data. Most values cluster around the mean, and the probabilities for values further away from the mean taper off equally in both directions.
PERCENTAGE OF SCORES IN EACH BAND
(Left to right, with cut points at -3, -2, -1, 0 (the mean), +1, +2 and +3 standard deviations)
0.13% - 2.14% - 13.59% - 34.13% - 34.13% - 13.59% - 2.14% - 0.13%
CHARACTERISTICS:
• The mean, median and mode are all equal and located at the center of the distribution
• The highest point on the curve is at the mean, and the curve tapers off equally on both sides.
• The tails of the curve approach the x-axis but never touch it.

IMPORTANCE:
It helps in understanding the distribution of data, making PREDICTIONS and conducting various STATISTICAL ANALYSES

A

Normal curve
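
WORKED SKETCH (Python): one way to reproduce the percentages on this card, assuming the bands are cut at whole standard deviations from -3 to +3. It uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# Cut points in standard deviation units
cuts = [-3, -2, -1, 0, 1, 2, 3]

# Left tail, then the six bands between neighbouring cut points, then right tail
areas = [std_normal.cdf(cuts[0])]
areas += [std_normal.cdf(b) - std_normal.cdf(a) for a, b in zip(cuts, cuts[1:])]
areas.append(1 - std_normal.cdf(cuts[-1]))

percentages = [round(100 * a, 2) for a in areas]
print(percentages)
```

The eight areas add up to 100% because together they cover the whole curve.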

150
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)
____ The probability of any specific outcome is defined as a fraction or a proportion of all the possible outcomes (a measure of how likely an event is to occur)
EX : Your teacher wants you to know the probability that a student will randomly pick a red marble from a bag containing 5 red marbles, 3 blue marbles and 2 green marbles. Number of favourable outcomes = 5, total number of outcomes = 10, so P(red) = 5/10 = 0.5 or 50%

WHERE TO USE?
1. To calculate the odds of winning
2. To predict the likelihood of rain or other weather events
3. To determine the probability of a patient recovering from a disease
4. To assess the likelihood of defects in manufacturing

____ result of a specific event

A

Probability
Outcome
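
WORKED SKETCH (Python): the marble example as favourable outcomes divided by total outcomes. The dictionary layout and names are illustrative choices.

```python
from fractions import Fraction

# Contents of the bag from the example
bag = {"red": 5, "blue": 3, "green": 2}

total = sum(bag.values())            # 10 possible outcomes
p_red = Fraction(bag["red"], total)  # favourable / total, reduces to 1/2

print(p_red, float(p_red), f"{float(p_red):.0%}")
```

The same probability can be reported as a fraction, a decimal proportion or a percentage.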

151
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)

___ a technique used to select a subset of individuals from a larger population where each individual has an equal chance of being chosen. This method ensures that the sample is representative of the population, minimizing bias and allowing for reliable inferences
• Enhances the internal and external validity of the study.

IMPORTANCE???
– It ensures that the sample is representative of the population, reducing bias and allowing for accurate inferences

A

Random sampling

152
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)

____ each individual is returned to the population before making the next selection. It is possible for that unit(Person) to be selected again in subsequent draws.

EX : Imagine you have a hat with the names of 5 students: Andy, Karl, Tyler, Becca and Jessica. You want to draw 2 names with replacement
1. First draw : you might select Tyler
2. Replace : put Tyler's name back in the hat
3. Second draw : you might select Tyler again

A

Sampling with replacement
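
WORKED SKETCH (Python): the hat example, contrasted with drawing without replacement. `random.choices` draws with replacement and `random.sample` draws without; the seed value is arbitrary and only makes the sketch repeatable.

```python
import random

names = ["Andy", "Karl", "Tyler", "Becca", "Jessica"]
rng = random.Random(42)  # arbitrary fixed seed so reruns give the same draws

# With replacement: each name goes back in the hat, so a repeat is possible
with_replacement = rng.choices(names, k=2)

# Without replacement: a drawn name stays out, so no repeats can occur
without_replacement = rng.sample(names, k=2)

print(with_replacement, without_replacement)
```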

153
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)

___ used when there are two or more mutually exclusive outcomes. It helps us find the probability that either one of the two events will occur.
• Helps in assessing the likelihood of different risks happening simultaneously
• Used to calculate the probability of one or more events occurring

• If events A and B are MUTUALLY EXCLUSIVE they cannot happen at the same time
– EXAMPLE: Heads or tails on a single flip of a coin.
- Getting a 3, 4 or 5 in a single roll of a die
- You want to pick a random student from your university. If 30% are seniors and 20% are juniors, what is the probability of getting either a senior or a junior?

• Overlapping events : Always subtract the probability of the intersection of the events to avoid double-counting

A

Addition rule
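
WORKED SKETCH (Python): the addition rule applied to the examples on this card. The helper name `p_either` is hypothetical, and the heart-or-king case is an added illustration of overlapping events, not part of the original card.

```python
from fractions import Fraction

def p_either(p_a, p_b, p_both=0):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

# Mutually exclusive: a student cannot be both a senior and a junior
p_senior_or_junior = p_either(Fraction(30, 100), Fraction(20, 100))  # 1/2

# Mutually exclusive: 3, 4 or 5 on one roll of a die
p_3_4_or_5 = Fraction(1, 6) * 3  # 1/2

# Overlapping (assumed example): heart or king in a 52-card deck.
# The king of hearts is counted in both events, so it is subtracted once.
p_heart_or_king = p_either(Fraction(13, 52), Fraction(4, 52), Fraction(1, 52))  # 4/13
```

For mutually exclusive events the subtraction term is 0, so the rule reduces to a plain sum.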

154
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)

____ used when figuring probability of two or more events happening together.
• Essential for finding the probability of multiple events happening together

TWO VERSIONS OF THE RULE
1. For independent events (events that do not affect each other)
2. For dependent events (events that affect each other) Ex: Drawing two cards from a deck without replacement. The probability for the second card depends on the outcome of the first draw.

EX : Let's say you want to find the probability of flipping two heads in a row with a fair coin.
EVENT A: first flip is heads
EVENT B: second flip is heads
Determine if the events are independent: YES, the outcome of the first flip does not affect the second flip
EX : Getting a head on the second flip of the coin
EX : Getting a 5 on two throws of a die

A

Multiplication rule
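
WORKED SKETCH (Python): both versions of the multiplication rule. The two-aces case is an assumed illustration of the card-drawing example, which does not name specific cards.

```python
from fractions import Fraction

# Independent events: P(A and B) = P(A) * P(B)
p_two_heads = Fraction(1, 2) * Fraction(1, 2)  # 1/4
p_two_fives = Fraction(1, 6) * Fraction(1, 6)  # 1/36

# Dependent events: P(A and B) = P(A) * P(B given A)
# Assumed illustration: two aces in a row, drawn without replacement.
# After one ace is removed, 3 aces remain among 51 cards.
p_two_aces = Fraction(4, 52) * Fraction(3, 51)  # 1/221
```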

155
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)

___ specific list of the members of the population used to select a subset of that population. It serves as the SOURCE from which you draw your SAMPLES.

• The actual list from which you draw your sample.
• Discrepancies between the sampling frame and the actual population can lead to biased results. Minimize this error by using multiple sources and verifying the accuracy of the frame

A

Sampling frame

156
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)
____ involves selecting subjects in a way that the chance of being in the study can be calculated
WHERE TO USE???
1. When you need results that can be applied to the entire population
2. Large groups, random selection is feasible
3. When precise numerical data is required

IMPORTANCE!!!
— It is crucial in QUANTITATIVE research as it helps ensure that the sample is representative of the population, reducing bias and allowing for generalization of results

A

Probability sampling

157
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)

___ every participant has an equal chance of being selected as part of the sample
EX : Imagine you are conducting a study on the EXERCISE HABITS OF ADULTS in a city. You decide to use this method. You create a list of all adults in the city and use a random number generator to select participants. This method ensures that every adult has an equal chance of being included in the study.

A

Simple random sampling

158
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)

___ DIVIDING the population into small SUBGROUPS based on specific characteristics (age, gender, income level)
— WHEN TO USE? —
• When the population has distinct subgroups that you want to ensure are represented
• when you suspect that the subgroups will have different mean values for the variable you are studying

EX : Imagine you want to survey students in a school about their favorite subjects. The school has 500 students divided into 4 grades (freshmen, sophomores, juniors, seniors)
EX : Suppose you are a market researcher wanting to understand customer satisfaction with a new product. You have a list of 1,000 customers divided into three income groups (low, middle, high)

IMPORTANCE: It ensures that the sample is representative of all SUBGROUPS within the population. (Useful both in qualitative and quantitative research)

A

Stratified random sampling
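
WORKED SKETCH (Python): proportional stratified sampling for the school example. The card gives only the total of 500 students, so the grade sizes, the sample size of 50 and the function name `stratified_sample` are all assumptions made for illustration.

```python
import random

def stratified_sample(strata, total_sample_size, seed=0):
    """Draw a simple random sample from each stratum, sized in proportion
    to that stratum's share of the population."""
    rng = random.Random(seed)  # arbitrary seed so the sketch is repeatable
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(total_sample_size * len(members) / population_size)
        sample[name] = rng.sample(members, k)
    return sample

# Assumed grade sizes adding up to the 500 students in the example
grade_sizes = {"freshmen": 150, "sophomores": 130, "juniors": 120, "seniors": 100}
strata = {g: [f"{g}_{i}" for i in range(n)] for g, n in grade_sizes.items()}

sample = stratified_sample(strata, total_sample_size=50)
sample_sizes = {g: len(members) for g, members in sample.items()}
print(sample_sizes)
```

With other sizes the rounded allocations may not add up exactly to the requested total, so a real study would need a rule for the leftover units.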

159
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)

___ sampling frame is divided into clusters or sections and then the clusters are randomly selected. Instead of sampling individuals directly, entire clusters are randomly selected, and then all or a random sample of individuals within those clusters are studied.
WHERE TO USE???
1. When the population is too large and spread out to sample individually (large populations )
2. When the population is spread over a wide area (geographically dispersed groups)
3. When resources are limited and you need a practical way to sample
— EX : You are conducting a study on the READING HABITS OF HIGHSCHOOL STUDENTS in a large city. You decide to use this method by dividing the city into clusters based on schools. You randomly select a few schools and survey all students within those selected schools. This method is efficient and ensures that you can manage the data collection process effectively.

IMPORTANCE!!!!
– Useful in LARGE-SCALE SURVEYS and studies where it is impractical to sample every individual. It helps reduce costs and time while still providing a representative sample of the population. However, it can introduce bias if the clusters are not representative of the population

A

Cluster sampling

160
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)
___ sampling, a probability sampling method where the elements are chosen from a target population by selecting a random starting point and then selecting other members at a fixed sampling INTERVAL (selecting every nth person from the population)
• If the population has a cyclic pattern, the sample may not be representative. Ensure that the list is randomized
• If the population list is incomplete, the sample may not be representative. Verify the list before sampling.

WHEN TO USE????
1. When the population is too large for simple random sampling (the list must be evenly distributed)
2. When you have a list of the population in a random or pseudo random order
3. When you need a quick and straightforward method

IMPORTANCE!!
– Useful in both qualitative and quantitative research. It helps ensure that the sample is representative of the population, reducing bias. However, it can introduce bias if the population list has a PERIODIC PATTERN

A

Systematic sampling

161
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)

___ it is the combination of two or more probability sampling techniques. The population is divided into smaller and smaller groups (stages) and samples are taken from these groups at each stage.
• This method is useful for studying large, geographically dispersed populations and makes data collection more manageable.
EX : Imagine you are conducting a study on the EDUCATIONAL ATTAINMENT of adults in a country. You decide to use this method. First you divide the country into states (clusters). You randomly select a few states. Within each selected state, you divide into cities and randomly select some cities. Within each selected city, you further divide it into neighborhoods and randomly select some neighborhoods. Finally you survey all adults within the selected neighborhoods

IMPORTANCE!!
— It ensures that the sample is representative of the population, reducing bias and allowing for accurate inferences, especially when dealing with large, geographically dispersed population

A

Multistage sampling

162
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS) (NON PROBABILITY)

___ alternative to probability sampling if it is impossible to use. Samples are gathered in a process that does not give all the individuals in the population equal chances of being selected
• Higher risk of sampling bias
• Results may not be representative of the entire population
WHEN TO USE??
1. When you need initial insights or hypotheses
2. When time, budget or access is limited
3. When studying hard to reach groups

IMPORTANCE!!!
– Useful in QUALITATIVE research where the focus is on depth and detail rather than generalizability. It can introduce bias, so results should be interpreted with caution

A

Non probability sampling technique

163
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS) –NON PROBABILITY

____ The samples are selected based on availability. This method is QUICK, EASY and COST EFFECTIVE, but it may not represent the entire population accurately.
• Pilot studies, used to test research instruments and protocols before a full-scale study.
• Helps in gaining initial insights into a research question
• The sample may be too homogeneous, missing out on diverse perspectives, try to include a variety of participants to increase diversity.
EX: Imagine you are conducting a study on the eating habits of college students. You decide to use this method by surveying students in the campus cafeteria. This method is quick and easy but may not represent all students' eating habits

WHEN TO USE??
1. When you need initial insights or hypotheses (exploratory research)
2. When time and budget are limited
3. To test survey or research methods before a larger study (pilot testing)

A

Convenience sampling

164
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS) –NON PROBABILITY

___ selecting people who reflect a specific purpose of the study (also known as judgemental sampling). Participants are selected based on specific characteristics or qualities that are relevant to your research.
• Use when conducting exploratory or qualitative research
• Relies heavily on the researcher's judgement and knowledge

EX: You are conducting a study on the experiences of EXPERT SOFTWARE DEVELOPERS in the tech industry. You decide to use this technique by selecting developers who have at least 10 years of experience and have worked on large-scale projects. This method ensures you get detailed and relevant insights from highly knowledgeable individuals.

A

Purposive sampling

165
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS) –NON PROBABILITY
____ sampling technique wherein the assembled sample has the same proportions of individuals as the entire population with respect to known characteristics, traits or focused phenomenon. You divide the population into subgroups and then select participants from each subgroup to meet a predetermined quota.
• Useful when a complete list of the population is not available
WHEN TO USE??
1. To understand preferences across different demographic groups
2. When studying specific characteristics within a population
3. To ensure representation - guarantees that the specific subgroups are represented in the sample.

EX: Imagine you are conducting a study on the DIETARY HABITS of residents in a city. You set a quota for each group (for example, 100 participants aged 18-25), then select participants from each group until you reach the quota
IMPORTANCE!!!!
— It is useful in both qualitative and quantitative research. It helps ensure that specific subgroups are represented but it can introduce biases since the selection is not random.

A

Quota sampling

166
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS) –NON PROBABILITY
_____ technique used in situations where the population is unknown or rare
WHEN TO USE???
1. When studying groups that are difficult to identify or access (drug users, immigrants, etc.)
2. When researching sensitive topics that people may be reluctant to discuss publicly (stigmatized behaviors)
3. When targeting very specific or small populations (people with a rare disease)

• Ethical concerns: issues related to privacy and confidentiality, especially in sensitive populations.

A

Snowball sampling/ referral sampling

167
Q

CHAPTER 3 (KEY INGREDIENTS FOR INFERENTIAL STATISTICS)

EX : Imagine you are conducting a study on the WORK HABITS OF EMPLOYEES in a large company. You decide to use this method. The company has 1,000 employees and you need a sample of 100. You calculate the sampling interval as 1,000/100 = 10. You randomly select a starting point, say the 5th employee, and then select every 10th employee (5th, 15th, 25th, etc.) until you have your sample.

What type of technique is this?

A

Systematic sampling
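
WORKED SKETCH (Python): the employee example, with employees numbered 1 to 1,000. The function name `systematic_sample` and its parameters are hypothetical; the random start is drawn from the first interval, and a fixed start of 5 is passed to match the card.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Pick every k-th position, where k = population_size // sample_size."""
    interval = population_size // sample_size
    if start is None:
        # Random starting point somewhere in the first interval
        start = random.Random(seed).randint(1, interval)
    positions = range(start, population_size + 1, interval)
    return list(positions)[:sample_size]

# 1,000 employees, sample of 100, starting at the 5th employee
selected = systematic_sample(1000, 100, start=5)
print(selected[:3], selected[-1], len(selected))
```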

168
Q

CHAPTER 1
Identify the POPULATION, SAMPLE, PARAMETER, STATISTIC, VARIABLE and DATA for this example.
(A grocery store is interested in how much money, on average, their customers spend each visit in the produce department. Using their store records, they draw a sample of 1,000 visits and calculate each customer's average spending on produce)

POPULATION:
SAMPLE:
PARAMETER:
STATISTICS:
VARIABLE:
DATA:

A

POPULATION: all the shopping visits by all the store's customers
SAMPLE: the 1,000 visits drawn for the study
PARAMETER: the average expenditure on produce per visit by all the store's customers
STATISTICS: the average expenditure on produce per visit by the sample of 1,000
VARIABLE: the expenditure on produce for each visit
DATA: the dollar amounts spent on produce, for instance $12.00 or $11.53

169
Q

CHAPTER 1

What kind of data is "amount of money spent on produce per visit"?
A. Qualitative
B. Quantitative - continuous
C. Quantitative - discrete

The study finds that the mean amount spent on produce per visit by the customers in the sample is $12.84. This is an example of a?
A. Population
B. Sample
C. Parameter
D. Statistic

A

B
D

170
Q

CHAPTER 2
A health club is interested in knowing how many times a typical member uses the club in a week. They decided to ask every tenth customer on a specified day to complete a short survey including information about how many times they have visited the club in the past week.
What kind of sampling design is this?
A. Cluster B. Stratified C. Simple random D. Systematic
“Number of visits per week” is what kind of data?
A. Qualitative B. Quantitative - continuous C. Quantitative - discrete

A

Systematic sampling technique
Quantitative -discrete

171
Q

CHAPTER 1

The U.S. federal government conducts a survey of high school seniors concerning their plans for future education and employment. One question asks whether they are planning to attend a four-year college or university in the following year. Fifty percent answer yes to this question; that fifty percent is a?

Imagine that the U.S. government had the means to survey all high school seniors in the U.S. concerning their plans for future education and employment, and found that 50 percent were planning to attend a four-year college or university in the following year. This 50 percent is an example of a?

A

Statistic
Parameter

172
Q

CHAPTER 2

In a left-skewed distribution, which is greater: the mean, the median or the mode?

In a right-skewed distribution, which is greater: the mean, the median or the mode?

In a symmetrical distribution, what will be the relationship among the mean, median and mode?

A

Mode
Mean
They will be fairly close to each other

173
Q

CHAPTER 1.

DIFFERENT TYPES OF DATA
1.
- NOMINAL, DICHOTOMY, ORDINAL, DISCRETE

2.
- INTERVAL, CONTINUOUS, RATIO

A

Qualitative
Quantitative

174
Q

CHAPTER 1 ( One group with Two variables measured for each individual)
Examines the relationship between two or more variables without manipulating them. Two variables are observed and measured for each individual.
1. It DOES NOT IMPLY CAUSATION, meaning it cannot determine if one variable causes changes in another.
2. Variables are observed and measured WITHOUT ANY MANIPULATION by the researcher.
3. Typically involves NUMERICAL DATA to calculate the strength and direction of the relationship.

• The correlation coefficient indicates both the strength and the direction of the relationship.

A

Correlational method
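
WORKED SKETCH (Python): a Pearson correlation coefficient computed by hand, to show how one number carries both the direction (sign) and the strength (distance from 0) of a relationship. The card does not name a specific coefficient, so Pearson's r is an assumption, and the paired scores are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson r = sum of products of deviations / sqrt(SS_x * SS_y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data: two variables measured for each of five individuals
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 58, 61, 70, 74]

r = pearson_r(hours_studied, exam_score)  # positive and close to +1
r_negative = pearson_r([1, 2, 3], [3, 2, 1])  # perfectly negative relationship
print(round(r, 2), round(r_negative, 2))
```

A value near +1 or -1 describes a strong relationship, but it still says nothing about which variable, if either, causes the other.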

175
Q

CHAPTER 1. (COMPARING TWO (OR MORE) GROUPS OF SCORES)
____ METHOD
1. One or more variables are manipulated while another variable is observed or measured
2. Aims to establish a CAUSE-AND-EFFECT relationship between two variables and attempts to control all other variables to prevent them from influencing the results
3. It allows researchers to determine causality, which is crucial for understanding how different factors influence outcomes.

– EX : a researcher wants to test if a new teaching method improves student performance.
Independent variable: teaching method (new vs traditional)
Dependent variable: student performance (test score)

A

Experimental method

176
Q

CHAPTER 1. (COMPARING TWO (OR MORE) GROUPS OF SCORES)
____ VARIABLE

  1. A variable that is manipulated by the researcher
  2. Should consist of at least two (or more) levels (treatment conditions) to which subjects are exposed
    • PLOTTED ON X-AXIS

EX : a scientist wants to test the effect of different AMOUNTS OF SUNLIGHT on plant growth

____ VARIABLE, that is being observed or measured to assess the effect of the treatment
1. The VALUE depends on changes in the independent variable.
2. It provides a way to measure the outcome of an experiment, allowing researchers to draw conclusions about cause-and-effect relationships
• PLOTTED ON Y-AXIS

EX : a scientist wants to test the effect of different amounts of sunlight on PLANT GROWTH

BAR GRAPH or LINE GRAPH can be used to show the relationship between the two

A

Independent variable
Dependent variable

177
Q

CHAPTER 1. (COMPARING TWO (OR MORE) GROUPS OF SCORES)

Sample question:

A group of researchers conducted a study to determine the effects of background noise on class performance. Students in one classroom worked on a mathematical task with calming music in the background. Students in the second classroom heard aggressive, exciting music, and students in the last room had no music at all.

DETERMINE THE ?
1. Independent variable
2. Dependent variable
3. Levels of independent variable

A
  1. Background noise
  2. Class performance
  3. Calming music; aggressive, exciting music; no music at all
178
Q

CHAPTER 1. (COMPARING TWO (OR MORE) GROUPS OF SCORES)

____ DESIGN
Involves the manipulation of an independent variable without the random assignment of participants to conditions or orders of conditions. (Just like experimental but the participants are not randomized)
1. Participants are assigned to groups based on NON-RANDOM CRITERIA, such as pre-existing groups or other selection methods.
2. The researcher manipulates the independent variable to observe its effect on the dependent variable.

EX ; A school wants to test a new teaching method.
DESIGN : Two existing classes are used. One class uses the new method (experimental group) and the other continues with the traditional method (control group)

EXPLANATION; The classes are not randomly assigned, but the comparison can still provide insights into the effectiveness of the new method

A

Quasi-experimental

179
Q

CHAPTER 1. (COMPARING TWO (OR MORE) GROUPS OF SCORES)
____ DESIGN, a type of quasi-experimental design where participants are not randomly assigned to groups. This can lead to differences between groups that might affect the results. (COMPARES OUTCOMES BETWEEN PRE-EXISTING GROUPS)
– EX : A company wants to test a new employee training program.
DESIGN : One department receives the new training (experimental group) and another department continues with the old training (control group).
EXPLANATION: the departments are not randomly assigned, so there may be pre-existing differences between them that could influence the results.

____ DESIGN, involves measuring the dependent variable before and after the intervention in the same group to see if there is a change (COMPARES OUTCOMES BEFORE AND AFTER AN INTERVENTION IN THE SAME GROUP)
– EX : a healthcare researcher wants to test the effect of a new diet on weight loss.
DESIGN: measure participants' weight before starting the diet (pretest) and after completing the diet (posttest)
EXPLANATION: By comparing the pretest and posttest weights, the researcher can determine if the diet had an effect on weight loss

A

Non -equivalent groups
Pretest-posttest

180
Q

CHAPTER 1. (COMPARING TWO (OR MORE) GROUPS OF SCORES)
___ DESIGN, involves studying the same group of participants over a long period to observe changes over time. (STUDIES THE SAME PARTICIPANTS OVER A LONG PERIOD TO OBSERVE CHANGES)
• Crucial for understanding how variables change over time, providing insights into developmental trends and long-term effects
• They allow researchers to study phenomena in natural settings, enhancing the ecological validity of findings
• Typically no manipulation of variables occurs; researchers observe and collect data as it naturally unfolds.

– EX : A psychologist wants to study the development of social skills in children
DESIGN: Observe and measure the social skills of the same group of children at ages 5, 10 and 15

A

Longitudinal

181
Q

CHAPTER 2 (WAYS OF PRESENTING DATA)

___ Continuous line that represents the frequencies of scores within a class interval, based on a histogram; used for continuous data. Created by plotting points at the midpoints of each class interval and connecting them with straight lines.
• They are useful for comparing multiple datasets on the same graph, which can be more challenging with a histogram
• Suitable for continuous data, showing the distribution over a range.
• Helps in understanding the shape and spread of the data

EX: You want to compare the distribution of test scores across different classes more clearly than with a histogram

A

Frequency polygon

182
Q

CHAPTER 2 (WAYS OF PRESENTING DATA)
1. Uses vertical bars to represent data. Each bar represents a category and the height of the bar indicates the value.
• Useful for comparing the values of different categories or tracking changes over time
– EX: you want to compare the sales performance of different products (you have sales data for different products)

2. Uses horizontal bars. Each bar represents a category, and the length of the bar indicates the value.
• Excellent for comparing different categories of data.
• Can be used for a wide range of data types, including frequencies, counts and other measures.
– EX : you want to compare the popularity of different cuisines among survey respondents (you have survey results showing the number of people who prefer different types of cuisine)

3. Uses points connected by lines to show trends over time. It is useful for displaying data that changes CONTINUOUSLY (SPECIFICALLY USED FOR SHOWING TRENDS OVER TIME)
– EX : you want to analyze how sales have changed over the year to identify trends or patterns (you have monthly sales data for a year)

A

Column chart
Bar graph
Line graph

183
Q

CHAPTER 2
• If Z skewness and Z kurtosis are less than 1.96, the distribution is _____?

• If Z skewness and Z kurtosis are less than 3.39, the distribution is approximately ____

• The skewness must be between __ and ___ (it generally means the distribution is not extremely skewed and is within ACCEPTABLE limits for many statistical analyses.)
——- WHY IT MATTERS? —–
1. For NORMALITY ASSUMPTIONS

• The kurtosis must be between ___ and ___. It generally means that the distribution is within acceptable limits for many statistical analyses. This range suggests that the data is not extremely peaked or flat, making it more suitable for certain statistical tests and models that assume normality.
——– WHY IT MATTERS? ——-
1. Normality assumption

A

Approximately normal
Normal
+2 & -2
+7 & -7

184
Q

CHAPTER 2

• Square root of the average of the squared deviations from the mean; the most common descriptive statistic for variation.
• Approximately the average amount that scores in a distribution vary from the mean.

INTERPRETATION:
1. ____ indicates that the data points are close to the mean, showing low variability
2. ____ indicates that the data points are spread out over a wider range, showing high variability

—– KEY POINTS —:
1. Useful for comparison of variability between different datasets.
2. In a normal distribution about 68% of data points lie within one ___ of the mean, 95% within two, 99.7% within three

A

Standard deviation
Low standard deviation
High standard deviation

185
Q

CHAPTER 2 (formula for population variance and standard deviation)

___ Equation for a statistical procedure directly showing the meaning of the procedure
___ equation mathematically equivalent to the definitional formula. Easier to use for figuring by hand, it does not directly show the meaning of the procedure.

A

Definitional formula
Computational formula
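
WORKED SKETCH (Python): the two formulas side by side for a population variance, assuming the usual textbook forms: definitional = sum of (X - M)^2 divided by N, and computational = (sum of X^2 minus (sum of X)^2 / N) divided by N. The scores are hypothetical and the function names are illustrative.

```python
import math

def variance_definitional(scores):
    """Average squared deviation from the mean (shows the meaning directly)."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n

def variance_computational(scores):
    """Same quantity from raw sums only (easier to figure by hand)."""
    n = len(scores)
    return (sum(x ** 2 for x in scores) - sum(scores) ** 2 / n) / n

# Hypothetical scores (illustrative only)
scores = [2, 4, 4, 4, 5, 5, 7, 9]

var_def = variance_definitional(scores)    # 4.0
var_comp = variance_computational(scores)  # 4.0
sd = math.sqrt(var_def)                    # 2.0
print(var_def, var_comp, sd)
```

Both routes give the same variance; the standard deviation is its square root.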

186
Q

CHAPTER 3 (PROBABILITY)
FRACTION: Probability of drawing the ace of spades from a deck of cards
P(ace of spades) = ___
PERCENTAGE: Probability of rolling three dots on a die
P(3) = ___
DECIMAL PROPORTION: Probability of getting heads on a coin flip
P(heads) = ___

A

1/52
16.67%
0.50

187
Q

CHAPTER 4 (HYPOTHESIS TESTING, Z TEST AND CENTRAL LIMIT THEOREM)

____ a statistical method that uses sample data to evaluate hypotheses about a population. It involves making an assumption about a population parameter and then using sample data to test whether this assumption is likely to be true.

A

Hypothesis testing

188
Q

CHAPTER 4 (HYPOTHESIS TESTING, Z TEST AND CENTRAL LIMIT THEOREM)

___ is a prediction, often based on informal observation, previous research or theory, that is tested in a research study.
___ set of principles that attempt to explain one or more facts, relationships or events.

A

Hypothesis
Theory

189
Q

CHAPTER 4 (HYPOTHESIS TESTING, Z TEST AND CENTRAL LIMIT THEOREM)
—–HYPOTHESIS TESTING STEPS——-
1. ____ States that in the general population, there is no change, no difference or no relationship.
• It assumes that any observed effect in the data is due to random chance rather than a true effect.
• The purpose of this is to provide a baseline for comparison. Researchers collect data and perform statistical tests to determine whether there is enough evidence to reject this hypothesis.
• If the p-value is less than the chosen significance level (0.05), you reject the null hypothesis, indicating that there is sufficient evidence to support the alternative hypothesis
• If the p-value is greater than the significance level, you fail to reject the null hypothesis, indicating that there is not enough evidence to support the alternative hypothesis.
• It helps in making objective decisions about the presence of an effect or relationship.
—– POTENTIAL PROBLEM—
1. TYPE 1 ERROR: Rejecting the null hypothesis when it is true (false positive). Choose an appropriate significance level (0.05 or 0.01) to balance the risk of Type 1 and Type 2 errors
2. TYPE 2 ERROR: Failing to reject the null hypothesis when it is false (false negative). Increase the sample size to improve the power of the test

EX : In a study testing a new drug, the __ might be that the drug has no effect on patients

A

Null hypothesis (H0)

190
Q

CHAPTER 4 (HYPOTHESIS TESTING, Z TEST AND CENTRAL LIMIT THEOREM)
—–HYPOTHESIS TESTING STEPS——-
3. ____ distribution used in hypothesis testing which represents the population situation if the null hypothesis is true
• Z test - compares the means of two distributions when the sample size is large and variance is known
• T-test - Compares the means of two distributions when the sample size is small and variance is unknown
• ANOVA - Compares means across multiple groups
• CHI-SQUARE TEST -compares categorical data.

PROBLEMS AND SOLUTION;
1. Skewed data - use median and interquartile range instead of mean and standard deviation
2. outliers - Use robust statistical methods that are less affected by outliers
3. Different sample sizes - Use statistical tests that account for different sample sizes, like Welch's t-test

3.1 ___ a point in hypothesis testing, on the comparison distribution, used to determine whether a particular score falls within a certain range of interest.
• Used to calculate the margin of error and the range within which the population parameter is expected to lie
• One-tailed test: Used when the research hypothesis predicts a direction of the effect
• Two-tailed test: Used when the research hypothesis does not predict a direction
• It is essential in hypothesis testing to decide whether the observed data is significantly different from what is expected under the null hypothesis.
• They help in constructing confidence intervals, which provide a range of values within which the true population parameter is likely to fall.

A

Comparison distribution
Critical value / cut off sample score

191
Q

CHAPTER 4 (HYPOTHESIS TESTING, Z TEST AND CENTRAL LIMIT THEOREM)
—-PRACTICE PROBLEM —
A training program to increase friendliness is tried on one individual randomly selected from the general public. Among the general public (who do not get this training program) the mean on the friendliness measure is 30 with a standard deviation of 4. The researcher wants to test their hypothesis at the 5% significance level. After going through the training program, this individual takes the friendliness measure and gets a score of 40. What should the researcher conclude?
1. H0 :
2. α =
4. z = (40-30)/4
z = 2.50, p(z > 2.50) = 0.0062
5. Decision:

A

1. H0 : the training program does not significantly increase friendliness
2. 0.05
5. Reject the null hypothesis. The training program significantly increases friendliness
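
WORKED SKETCH (Python): the practice problem as a one-tailed Z test for a single score, assuming the friendliness measure is normally distributed in the general public. The upper-tail probability comes from `statistics.NormalDist`; variable names are illustrative.

```python
from statistics import NormalDist

population_mean = 30
population_sd = 4
score = 40
alpha = 0.05  # 5% significance level

# Z = (X - M) / SD
z = (score - population_mean) / population_sd  # 2.5

# One-tailed test: probability of a Z score at least this high if H0 is true
p_value = 1 - NormalDist().cdf(z)  # about 0.0062

reject_null = p_value < alpha
print(z, round(p_value, 4), reject_null)
```

Because the p-value is below 0.05, the score falls in the rejection region and the null hypothesis is rejected.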

192
Q

CHAPTER 4 (HYPOTHESIS TESTING, Z TEST AND CENTRAL LIMIT THEOREM)
—–HYPOTHESIS DIRECTION——-

____ test, a situation in which the region of the comparison distribution in which the H0 would be rejected is all on one side (tail) of the distribution.
• The research hypothesis predicts a specific direction of the effect (greater than or less than)
• If the score falls in the critical region, you would reject the null hypothesis
• They have more statistical power to detect an effect in one direction compared to a two-tailed test
POTENTIAL PROBLEM: TYPE 1 ERROR (choose an appropriate significance level to balance the risk of TYPE 1 ERROR)

____ is a situation in which the region of the comparison distribution in which the H0 would be rejected is divided into two sides (tails) of the distribution
• For significance level of (a) of 0.05 each tail would have 0.025
• Used when the hypothesis does not specify a direction
• SMALL SAMPLE SIZE - Use t-distribution instead of Z-distribution for small sample size (n<30)

A

One tailed test
Two tailed test
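
As a rough illustration of how the tail split changes the cutoff, the sketch below computes z critical values for one-tailed and two-tailed tests. The helper name `critical_z` is hypothetical, and the sketch assumes a normal comparison distribution.

```python
from statistics import NormalDist

def critical_z(alpha, tails):
    """Positive z cutoff for a one-tailed (tails=1) or two-tailed (tails=2) test."""
    # A two-tailed test splits alpha evenly between the two tails
    tail_area = alpha / tails
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.05, 0.01):
    for tails in (1, 2):
        print(f"alpha = {alpha}, tails = {tails}: z = {critical_z(alpha, tails):.2f}")
```

For a one-tailed test in the negative direction, the cutoff is the same value with a minus sign.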

193
Q

CHAPTER 1 (CORRELATIONAL METHOD)

What does it mean when we say “correlation does not imply causation”?

A

Just because two variables are correlated does not mean one causes the other

194
Q

CHAPTER 1 (EXPERIMENTAL METHOD)

Why is controlling extraneous variables important in the experimental method?

A

The researcher can isolate the effect of the independent variable on the dependent variable

195
Q

CHAPTER 1 (DATA STRUCTURE 3)

____ STUDY: the researcher investigates the potential cause of an event or phenomenon that has already occurred. (Examines relationships after the fact)

• This type of study is often used when it is unethical or impractical to manipulate the independent variable.
• The researcher does not manipulate the independent variable. Instead, they study the effects of variables that have already occurred.
• (RETROSPECTIVE) The study looks back in time to examine the relationship between variables (EX: Studying the long-term effect of childhood trauma on adult mental health)

EX: A researcher wants to study the relationship between smoking and lung cancer.
COLLECTION OF DATA: The researcher gathers data on the smoking habits of two groups: people with lung cancer and people without it.
ANALYSIS: They analyze the data to see if there is a higher incidence of smoking among those with lung cancer compared to those without

A

Ex-post facto study

196
Q

CHAPTER 1 (DATA STRUCTURE 3)

What does quasi-experimental mean?

A

It lacks random assignment and full control over variables / An independent variable is manipulated to observe its effect on a dependent variable without randomly assigning participants to groups

197
Q

CHAPTER 2 (MEASURES OF VARIABILITY)

___ measures the average distance of each data point from the mean. It provides insight into the spread of the data and is a widely used measure of variability.
• It is sensitive to outliers, which can distort the measure of variability (use robust measures like the median absolute deviation (MAD) instead)

• It provides a measure of how much the data points differ from the mean, indicating the spread of the data set.
• It tells you, on average, how far each value lies from the mean.
• The empirical rule (__ - __ - __) tells you where most of the values in a frequency distribution lie if they follow a normal distribution
- ___% of scores are within 1 standard deviation of the mean
- ___% of scores are within 2 standard deviations of the mean
- ___% of scores are within 3 standard deviations of the mean

A

Standard deviation
68
95
99.7
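
The empirical rule percentages can be reproduced from the standard normal curve. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; `share_within` is a hypothetical helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k) * 100:.1f}%")
```

The printed values are slightly more precise than the rounded 68, 95 and 99.7 used on the card.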

198
Q

CHAPTER 3 (KEY INGREDIENTS TO INFERENTIAL STATISTICS)
___ the branch of mathematics that deals with the likelihood of events occurring.
• It ranges from 0 to 1, where 0 means the event is impossible and 1 means the event is certain.
• Forms the basis for inferential statistics, allowing us to make predictions and generalizations about a population from samples.

A

Probability

199
Q

CHAPTER 3 (KEY INGREDIENTS TO INFERENTIAL STATISTICS )
___ refers to an individual unit of data that is part of a dataset. Each element represents a single observation or measurement and is characterized by specific attributes or variables.
• The most basic unit of information in a dataset
• Has attributes or variables that describe it (in a dataset of students, each student might have attributes like age, gender, and grade)

A

Element

200
Q

CHAPTER 4 HYPOTHESIS TESTING

___ a statement that indicates the presence of an effect, difference, or relationship between variables. It is what the researcher wants to prove or support through their study.
• It provides a clear direction for the research and helps in formulating the research question.
• Many statistical tests are designed to test the alternative hypothesis against the null hypothesis

TYPES OF ALTERNATIVE HYPOTHESIS:
1. ONE-TAILED HYPOTHESIS - Specifies the direction of the effect (greater than or less than).
EX: The new drug is more effective than the current drug
2. TWO-TAILED HYPOTHESIS - Does not specify the direction, only that there is a difference.
EX: The new drug has a different effect than the current drug

A

Alternative hypothesis

201
Q

CHAPTER 4 HYPOTHESIS TESTING
____ a value that is used to define the concept of “very unlikely”. It is used to decide whether to reject the null hypothesis.
• It represents the probability of making a Type 1 error, which occurs when the null hypothesis is rejected even though it is true. It is the risk of concluding that there is an effect or difference when there actually isn’t.
• It balances the risk of a Type 1 error (false positive) against a Type 2 error (false negative). A lower alpha level reduces the risk of a Type 1 error but increases the risk of a Type 2 error.
• (0.05) Most commonly used; there is a 5% chance of rejecting the null hypothesis when it is true.
• (0.01) Used when a higher level of certainty is required (e.g., medical research)
• (0.10) Sometimes used in exploratory research, where a higher risk of a Type 1 error is acceptable

A

Alpha level

202
Q

CHAPTER 4 HYPOTHESIS TESTING
3.3 ___ The range of values for the test statistic that leads to the rejection of the null hypothesis. If the test statistic falls in this region, it indicates that the observed data are significantly different from what is expected under the null hypothesis.
• In a one-tailed test it is found in one tail of the distribution; in a two-tailed test it is split between both tails of the distribution
• Helps in determining whether to reject the null hypothesis
• Assists in constructing confidence intervals for a population parameter

A

Critical region

203
Q

CHAPTER 4 HYPOTHESIS TESTING

—-WHEN TO USE ONE TAILED OR TWO TAILED
(.05) - one-tailed: -1.64 or +1.64
- two-tailed: ±1.96
(.01) - one-tailed: -2.33 or +2.33
- two-tailed: ±2.58

A
204
Q

CHAPTER 4 HYPOTHESIS TESTING
“If the p is low, the H0 must go”
• A low p-value indicates that the observed data are unlikely under the null hypothesis
• If the p-value is less than the significance level, it suggests that the observed result is statistically significant
• When the p-value is low (below the significance level), it means there is strong evidence against the null hypothesis. Therefore you reject the null hypothesis H0 and accept the alternative hypothesis H1

A
205
Q

CHAPTER 4 HYPOTHESIS TESTING
____ a measure that helps researchers determine whether their observations are likely due to chance or whether there is a true effect present.
• ( * ) Significant at the 0.05 level
• ( ** ) Significant at the 0.01 level
• ( *** ) Significant at the 0.001 level

• A low p-value (typically ≤ 0.05) suggests that the observed data are unlikely under the null hypothesis, leading to its rejection
• If the p-value is less than the chosen alpha level, the null hypothesis is rejected
• A p-value does not measure the size of an effect or the importance of a result

A

Significance level

206
Q

CHAPTER 4 HYPOTHESIS TESTING
____ The probability distribution of all possible sample means of a given size from a population. It is used to understand how sample means vary and how they relate to the population mean.

• According to the central limit theorem, it will be approximately normal if the sample size is sufficiently large (usually n ≥ 30), regardless of the population distribution.
• The mean of this distribution is equal to the mean of the population

A

Distribution of sample means

207
Q

CHAPTER 4 HYPOTHESIS TESTING

___ shows the range of possible values for a statistic (mean, proportion) from all possible samples of a given size from the same population.
• The standard deviation of the sampling distribution is called the “STANDARD ERROR”

EX: Suppose you have a population with a mean of 100 and a standard deviation of 15. You take samples of size 25.
Calculate the standard error:
15/√25 = 15/5 = 3
Sampling distribution:
The mean of the distribution is 100
The standard deviation (standard error) is 3

A

Sampling distribution
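
A small simulation can make the idea of a sampling distribution concrete. The sketch below assumes a normally distributed population (the card does not specify its shape), uses n = 25 as in the 15/√25 calculation, and picks an arbitrary number of repeated samples.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the sketch is repeatable

POP_MEAN, POP_SD = 100, 15   # population values from the card's example
N = 25                       # sample size used in the 15/√25 calculation
NUM_SAMPLES = 5000           # arbitrary number of repeated samples

# Draw many samples and keep the mean of each one
sample_means = [
    mean(random.gauss(POP_MEAN, POP_SD) for _ in range(N))
    for _ in range(NUM_SAMPLES)
]

theoretical_se = POP_SD / N ** 0.5
print(f"mean of sample means: {mean(sample_means):.2f}")
print(f"SD of sample means:   {stdev(sample_means):.2f}")
print(f"theoretical SE:       {theoretical_se:.2f}")
```

The simulated mean and SD of the sample means should land close to 100 and 3, the values the card derives by formula.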

208
Q

CHAPTER 4 HYPOTHESIS TESTING
___ is a measure of how much the sample mean of a data set is expected to vary from the true population mean. It is the same as the SD of a distribution of means.
• SEM = σ/√n (σ = population standard deviation)
• A smaller SEM indicates that the sample mean is a more accurate estimate of the population mean
• A larger SEM suggests more variability and less precision in the estimate
• “As the sample size (n) increases, the SEM decreases, indicating more precise estimates of the population mean.”
—- PROBLEM—-
1. UNKNOWN POPULATION STANDARD DEVIATION - Use the sample standard deviation (s) as an estimate of σ
2. SMALL SAMPLE SIZE - Ensure a sufficiently large sample size to get a reliable estimate of the SEM

A

Standard error of the mean
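
To see how the SEM shrinks as the sample size grows, here is a minimal sketch of the formula. The value σ = 15 and the sample sizes are illustrative assumptions, not figures from this card.

```python
from math import sqrt

def sem(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / sqrt(n)

sigma = 15  # illustrative population standard deviation
for n in (25, 100, 400):
    print(f"n = {n:>3}: SEM = {sem(sigma, n):.2f}")
```

Quadrupling the sample size halves the SEM, because n sits under a square root.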

209
Q

CHAPTER 4 HYPOTHESIS TESTING
____ a fundamental principle in statistics that describes the characteristics of the sampling distribution of the sample mean. It states that, given a sufficiently large sample size, the sampling distribution of the mean will be approximately normally distributed, regardless of the population’s distribution.

• NORMAL DISTRIBUTION - Holds when the sample size is large (usually n > 30)
• The samples should be independent of each other
• It allows the use of normal-distribution-based methods for hypothesis testing even when the population distribution is not normal

A

Central Limit theorem

210
Q

CHAPTER 4 HYPOTHESIS TESTING

STEPS FOR CONDUCTING Z TEST
Ex: Suppose you want to test whether the average height of a sample of 50 students is different from the population mean height of 170cm. The population standard deviation is known to be 10cm. The sample mean height is 172cm. You choose a significance level of 0.05
1. STATE THE HYPOTHESES
H0 : There is no difference from the population mean height of 170cm
H1 : There is a significant difference from the population mean height of 170cm
2. CALCULATE: Z = (172 - 170) / (10/√50) = 2/1.414 ≈ 1.41
3. CRITICAL VALUE: For a two-tailed test (0.05), the critical values are ±1.96
4. DECISION: 1.41 is less than 1.96, so you fail to reject the null hypothesis.
—— POTENTIAL PROBLEM —–
1. UNKNOWN POPULATION STANDARD DEVIATION: Use a t-test instead of a z-test when the population standard deviation is unknown

A

Z-TEST
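
The steps of the height example can be sketched in Python as follows. The function name `z_test` and its return layout are hypothetical; the sketch assumes a two-tailed test with a known population standard deviation, as in the card.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, pop_mean, pop_sd, n, alpha=0.05):
    """Two-tailed one-sample z test with a known population SD."""
    se = pop_sd / sqrt(n)                           # standard error of the mean
    z = (sample_mean - pop_mean) / se               # test statistic
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed cutoff
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # area in both tails
    return z, critical, p_value, abs(z) > critical

z, critical, p_value, reject = z_test(172, 170, 10, 50)
print(f"z = {z:.2f}, critical = ±{critical:.2f}, p = {p_value:.3f}, reject H0: {reject}")
```

With these numbers z comes out at about 1.41, inside ±1.96, so the sketch also fails to reject the null hypothesis.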

211
Q

CHAPTER 4 HYPOTHESIS TESTING

___ is a range of values that is likely to contain a population parameter with a certain level of confidence.
• The confidence level indicates the percentage of intervals that you would expect to contain the true population parameter if you were to repeat the sampling process many times.
• Common confidence levels are 90%, 95% and 99%
• MARGIN OF ERROR: The range within which the true population parameter is expected to lie, calculated using the standard error and critical value
EX: Suppose you have a sample mean (x̄) of 50, a population standard deviation (σ) of 10 and a sample size (n) of 100. You want to construct a 95% confidence interval for the population mean.
CALCULATE SE: SE = σ/√n = 10/√100 = 1
CRITICAL VALUE: For a 95% confidence interval, the critical value (Z) is approximately 1.96
MARGIN OF ERROR: ME = critical value × SE = 1.96 × 1 = 1.96
CONSTRUCT CI: CI = x̄ ± ME = 50 ± 1.96 = (48.04, 51.96)
• You can be 95% confident that the true population mean lies between 48.04 and 51.96

A

Confidence interval
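
The worked example on this card can be reproduced with a short sketch. It treats the value 50 as the sample mean and assumes the population standard deviation is known; `z_confidence_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, pop_sd, n, confidence=0.95):
    """Confidence interval for a mean when the population SD is known."""
    se = pop_sd / sqrt(n)
    # Critical value leaving (1 - confidence) / 2 in each tail
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * se
    return sample_mean - margin, sample_mean + margin

low, high = z_confidence_interval(50, 10, 100)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

Passing `confidence=0.90` or `confidence=0.99` gives the other common levels listed on the card.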

212
Q

MIDTERM EXAM

After the midterm examination in Theories of Personality, Ma’am Aya found out that the scores of BSP 1-5 are skewed to the left. What does this indicate?

A

The examination was easy and showed a ceiling effect

213
Q

MIDTERM EXAM

The variance is usually not used in interpreting the average distance of the scores from the mean because?

A

The value will be too large for the scores

214
Q

MIDTERM EXAM

Suppose that there is an existing parameter about the level of statistics anxiety among students of CVSU. Then your instructor required you to measure the statistics anxiety of selected students within the university. Assuming that you have selected a set of samples that represent the population, what can you expect about the variability?

A

The parameter is greater than the statistic

215
Q

MIDTERM EXAM

If Taeyeon’s skill is considered to be at 2 standard deviations above the mean, we can conclude that he is better than __ of the population, and it can also be inferred that ___ of the population is better than Taeyeon.

A

97.72%; 2.28%

216
Q

MIDTERM EXAM

What percentage of individuals fall within the range from 2 standard deviations below the mean to 1 standard deviation above the mean?

A

81.85%

217
Q

MIDTERM EXAM

If your z-score is 1.96, what is the percentage of individuals who scored above you?

A

2.50%
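
The normal-curve areas asked for in the last three cards can be checked with `statistics.NormalDist`. This is only a sketch with illustrative variable names. Note that the exact area from z = -2 to z = +1 is about 81.86%; the 81.85% on the card comes from adding rounded table values (47.72 + 34.13).

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

below_2sd = z.cdf(2)             # proportion scoring below z = +2
above_2sd = 1 - z.cdf(2)         # proportion scoring above z = +2
between = z.cdf(1) - z.cdf(-2)   # proportion between z = -2 and z = +1
above_196 = 1 - z.cdf(1.96)      # proportion scoring above z = 1.96

print(f"below +2 SD:            {below_2sd:.2%}")
print(f"above +2 SD:            {above_2sd:.2%}")
print(f"between -2 SD and 1 SD: {between:.2%}")
print(f"above z = 1.96:         {above_196:.2%}")
```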

218
Q

MIDTERM EXAM

Suppose that you have an experiment about the effect of caffeine on the examination performance of college students. Approximately 95% (within z= ±1.96) of the population of college students are found near the mean (M= 70, SD = 10 ) indicating average exam performance. Then you gave student X 10 shots of espresso and provided her with an exam. Given that her score is 83 what can you infer about the caffeine?

A

The caffeine did not have a significant effect on the performance of student X
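
The reasoning behind this answer can be sketched by converting the score to a z-score and comparing it with the ±1.96 cutoff given in the question. Variable names are illustrative.

```python
mean_score, sd = 70, 10   # population values given in the card
student_score = 83
cutoff = 1.96             # two-tailed cutoff stated in the card

# How many standard deviations student X is from the mean
z = (student_score - mean_score) / sd

significant = abs(z) > cutoff
print(f"z = {z:.2f}, significant: {significant}")
```

Since z = 1.30 falls inside ±1.96, the score is within the range expected of the general population of college students.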