Statistical Methods Flashcards

1
Q

SUS

A

The System Usability Scale (SUS) is a Likert-style questionnaire that captures subjective assessments of usability, covering aspects such as the need for support and training and the system's complexity. It yields a single composite score, allowing a global assessment of a system's usability. The SUS is reliable and low-cost, is not tied to any particular system, and can be used to compare usability across different contexts. It was originally proposed as a "quick and dirty" scale for industrial systems evaluation, and it reflects the view that usability is defined by the context of use: the appropriateness of a system to its purpose within a specific context.

2
Q

Median

A

The median is a robust statistical measure, well suited to ordinal data (values with a clear rank order). It is found by sorting all the values in the dataset and taking the middle-ranked value: with an odd number of values it is an original data point, while with an even number it is the mean of the two middle values. The median is a good "typical value", especially when the data distribution is skewed. It corresponds to the 50th percentile, meaning that 50% of the data points fall below it and 50% above it. Importantly, the median has a high breakdown point, remaining valid even when up to 50% of the data are outliers, which makes it a robust choice for datasets with extreme values.

3
Q

Mode

A

Mode refers to a statistical measure that represents the most frequently occurring value in a dataset. A dataset can have more than one mode (bimodal or multimodal) or, if no value repeats, no mode at all.

4
Q

Arithmetic mean

A

The arithmetic mean is a statistical measure that represents the average value of a set of numbers. It is calculated by summing all the values in the dataset and dividing by the number of values. The mean is a measure of central tendency and is commonly used to describe the typical value in a dataset. Unlike the median, it is sensitive to extreme values and can be pulled toward outliers.
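The three measures of central tendency above can be computed with Python's standard `statistics` module. The small dataset here is made up to show how a single outlier pulls the mean but not the median:

```python
import statistics

# Hypothetical dataset with one repeated value and one outlier (90)
data = [2, 3, 3, 4, 5, 90]

mean = statistics.mean(data)      # sum / count -> pulled up by the outlier
median = statistics.median(data)  # middle of the sorted values -> robust
mode = statistics.mode(data)      # most frequently occurring value

print(mean)    # 17.83... (dragged up by the outlier)
print(median)  # 3.5 (mean of the two middle values, 3 and 4)
print(mode)    # 3
```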

5
Q

Parametric statistics

A

A methodology for data analysis that assumes:
- The data is drawn from a normally distributed population.
- The data is measured on an interval or ratio scale.
- When comparing two or more groups, the variances in each group are roughly equal.

6
Q

Non-Parametric Statistics

A

Non-parametric statistics do not make strong assumptions about the parameters of the population distribution.
Assumptions:
Typically, the only assumption is that the data is ordinal (i.e., can be ranked), although some non-parametric tests can also be used with nominal data.

7
Q

Random Sampling

A

Random sampling allows us to gather data from a subset of the population to make estimates about the entire population. To account for the inherent variability in sampling, we present results as intervals (like confidence intervals) that provide a range of likely values for the population parameter, along with a measure of how confident we are about that range.

8
Q

Sampling variability

A

The score we get from our sample (like the mean or median) won’t be a perfect representation of the entire population’s score because it’s based only on the sample we took. Depending on which individuals end up in our sample, the score will fluctuate. This fluctuation due to random sampling is termed sampling variability.
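A quick simulation (with a made-up population of the numbers 1 to 100) illustrates the fluctuation: each random sample yields a different sample mean, even though the population mean is fixed at 50.5.

```python
import random

random.seed(1)  # fixed seed so the run is reproducible
population = list(range(1, 101))  # population mean is 50.5

# Three random samples of size 10 give three different sample means
sample_means = [sum(random.sample(population, 10)) / 10 for _ in range(3)]
print(sample_means)
```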

9
Q

Confidence Interval

A

Due to sampling variability, presenting a single mean or median doesn't give a complete picture. It gives a central or typical value, but it doesn't tell us about the range of scores we might expect if we were to take another random sample from the population.

To address this, we present an interval around the mean or median. This interval is typically called a confidence interval. It provides a range of values, and we can be fairly confident that the true population parameter (like the true mean or median for the entire population) lies within this range.
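A minimal sketch of computing such an interval in Python, using the normal approximation (z = 1.96 for 95%). The sample values are hypothetical; for small samples a t critical value would be more appropriate than z.

```python
import math
import statistics

# Hypothetical sample of measurements
sample = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3, 12.2, 11.7]

n = len(sample)
mean = statistics.mean(sample)
sem = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean

# 95% interval via the normal approximation (z = 1.96)
lower, upper = mean - 1.96 * sem, mean + 1.96 * sem
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```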

10
Q

Confidence level

A

Along with the interval, we also provide a qualification, often called the confidence level, which tells us how confident we are that the true value lies within that interval. For instance, a 95% confidence level means that if we took many samples and calculated the confidence interval for each one, about 95% of those intervals would contain the true population value.
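This frequentist interpretation can be checked by simulation: repeatedly sample from a known population, build a 95% interval each time, and count how often the interval contains the true mean. The population parameters below are made up.

```python
import math
import random
import statistics

random.seed(0)
TRUE_MEAN, TRUE_SD, N, TRIALS = 50, 10, 30, 1000

covered = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    m = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(N)  # standard error
    if m - 1.96 * sem <= TRUE_MEAN <= m + 1.96 * sem:
        covered += 1

coverage = covered / TRIALS
print(coverage)  # roughly 0.95 (slightly less, since z is used instead of t)
```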

11
Q

Significance Testing/Hypothesis testing

A

Significance testing is a statistical method used to determine whether there’s sufficient evidence in a set of data to infer that a certain condition or effect holds true for a larger population from which a sample is drawn.

12
Q

Null Hypothesis (H0)

A

In hypothesis testing, you start with a null hypothesis (H0) and an alternative hypothesis (Ha or H1).

The null hypothesis typically represents a statement of no effect, no difference, or no association in the context of your study.

13
Q

Alternative Hypothesis (H1)

A

This is what a researcher wants to show. For example, in a drug trial, the alternative hypothesis might state that the drug has a positive effect on patients.

14
Q

Test Statistic

A

Depending on the data and the hypothesis, a specific test (e.g., t-test, chi-square test, ANOVA) is chosen to compute a test statistic. This statistic measures how far our sample statistic (like sample mean) is from the population parameter under the null hypothesis.

15
Q

P-value

A

The p-value is a fundamental statistical concept that helps researchers assess the strength of evidence against a null hypothesis. It provides a quantitative way to decide, based on the observed data, whether to reject the null hypothesis.

  • The p-value is calculated based on the test statistic and the assumed distribution (e.g., normal distribution, t-distribution) under the null hypothesis.
  • It represents the probability of obtaining a test statistic as extreme or more extreme than the one observed in your data, assuming the null hypothesis is true.
  • Small p-value (typically ≤ 0.05) suggests strong evidence against the null hypothesis, leading to its rejection.
  • Large p-value suggests weak evidence against the null hypothesis, leading to a failure to reject it.
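As a sketch, here is how a two-sided p-value falls out of a z test statistic, using only the standard normal CDF (via `math.erf`); the observed statistic is hypothetical:

```python
import math

def normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))

z = 2.1  # observed test statistic (hypothetical)

# Two-sided p-value: probability of a statistic at least this extreme
# in either direction, assuming the null hypothesis is true
p_value = 2 * (1 - normal_cdf(abs(z)))
print(round(p_value, 4))  # about 0.036 -> below 0.05, so reject H0
```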
16
Q

Wilcoxon Signed-Rank Test

A

The Wilcoxon Signed-Rank Test determines whether there is a significant difference between two paired groups when the data is not normally distributed.

Assumptions:

  • Paired differences are independent and identically distributed.
  • Data is at least ordinal.

Procedure:

  • Calculate differences between paired observations.
  • Rank the differences by absolute value.
  • Sum positive ranks (W+) and negative ranks (W-).
  • Smaller of W+ and W- is the test statistic.
  • Compare the test statistic to critical values.

Use Cases:
- Compare before-and-after measurements (e.g., pre- and post-treatment).
- Assess the effectiveness of interventions.

Outcome:
- If test statistic ≤ critical value, a significant difference exists.
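The procedure above can be sketched as a small Python function. The before/after values are made up, and ties in absolute differences are not handled (a full implementation would assign average ranks):

```python
def wilcoxon_w(before, after):
    """Smaller of the positive/negative rank sums (no tie handling)."""
    diffs = [b - a for b, a in zip(before, after) if b != a]  # drop zeros
    # Rank the differences by absolute value (rank 1 = smallest)
    ranked = sorted(diffs, key=abs)
    w_plus = sum(rank for rank, d in enumerate(ranked, start=1) if d > 0)
    w_minus = sum(rank for rank, d in enumerate(ranked, start=1) if d < 0)
    return min(w_plus, w_minus)

# Hypothetical before-and-after measurements for six participants
before = [125, 118, 130, 122, 115, 128]
after = [120, 119, 124, 118, 113, 121]
print(wilcoxon_w(before, after))  # 1 -> compare against the critical value
```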

17
Q

Sign Test

A

The Sign Test determines if there is a significant difference between two paired groups based on the direction (positive or negative) of differences.

Assumptions:
- Less stringent assumptions; can work with ordinal or nominal data.

Procedure:
- Calculate differences between paired observations.
- Assign plus signs (+) to positive differences, minus signs (-) to negative differences.
- Count the number of plus signs (n+) and minus signs (n-).
- Use binomial distribution to assess significance.

Use Cases:
- Compare two conditions (e.g., success vs. failure).
- Analyse preference data (e.g., like vs. dislike).

Outcome:
- Evaluate if the proportion of plus signs significantly differs from 0.5 (no difference) using the p-value.
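The counting step and the binomial calculation can be sketched in a few lines of Python (`math.comb` gives binomial coefficients); the preference counts are hypothetical:

```python
import math

def sign_test_p(n_plus, n_minus):
    """Two-sided binomial p-value for H0: P(+) = 0.5 (ties excluded)."""
    n = n_plus + n_minus
    k = min(n_plus, n_minus)
    # P(at most k successes) under Binomial(n, 0.5), doubled for two sides
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# 9 participants preferred design A (+), 1 preferred design B (-)
p = sign_test_p(9, 1)
print(round(p, 4))  # 0.0215 -> significant at the 0.05 level
```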

18
Q

Hypothesis Test

A

  • You collect data and perform a statistical test to determine whether there is enough evidence to reject the null hypothesis in favor of the alternative hypothesis.
  • The test produces a test statistic, which measures the difference or effect observed in the data.

19
Q

Error Types

A

Type I and Type II errors are two concepts in statistical hypothesis testing that represent different kinds of mistakes that can be made when drawing conclusions from data. They are defined as follows:

  • Type I Error (False Positive): This occurs when a null hypothesis that is actually true is rejected. In other words, you conclude that there is an effect or difference when there isn’t one in reality. Type I errors are sometimes referred to as “false positives” or “alpha errors.”
  • Type II Error (False Negative): This occurs when a null hypothesis that is actually false is not rejected. In other words, you fail to detect an effect or difference that does exist in reality. Type II errors are sometimes referred to as “false negatives” or “beta errors.”