Statistical Methods Flashcards
SUS
The System Usability Scale (SUS) is a Likert-scale questionnaire that covers several aspects of system usability, including the need for support, the need for training, and complexity. Its items combine into a single composite score that gives a global assessment of subjective usability. The SUS is reliable and low-cost, is not specific to any particular system, and can be used to compare usability across different contexts. It gives researchers and evaluators a "quick and dirty" assessment of usability in industrial systems evaluation. The scale treats usability as the appropriateness of a system to its purpose, so usability is always defined relative to the context in which the system is used.
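The card does not say how the composite score is obtained. As a rough sketch, assuming the standard ten-item SUS form with each response coded 1 to 5, the commonly published scoring rule could be written like this (the function name `sus_score` is only illustrative):

```python
def sus_score(responses):
    """Return the 0-100 SUS score for one respondent.

    Assumes the standard ten-item form, answers coded 1 (strongly
    disagree) to 5 (strongly agree), listed in questionnaire order.
    """
    if len(responses) != 10:
        raise ValueError("SUS expects exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd-numbered items are positively worded: contribute response - 1
            total += response - 1
        else:
            # Even-numbered items are negatively worded: contribute 5 - response
            total += 5 - response

    # Each item contributes 0-4, so the sum is 0-40; scale it to 0-100
    return total * 2.5


print(sus_score([3] * 10))    # 50.0 (neutral answers throughout)
print(sus_score([5, 1] * 5))  # 100.0 (best possible answers)
```

The resulting number is a score on a 0 to 100 scale, not a percentage, and individual item scores are not meant to be interpreted on their own.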
Median
The median is a robust measure of central tendency that is particularly suited to ordinal variables, whose values have a clear order. It is found by ranking all the values in the dataset and taking the middle-ranked value. With an odd number of values, the median is the middle data point itself; with an even number, it is calculated as the mean of the two middle values. The median is a good typical value when the data distribution is skewed. It corresponds to the 50th percentile: about half of the data points fall below it and about half above it. The median also has a high breakdown point of 50%: it stays a sensible summary as long as fewer than half of the values are outliers, which makes it a robust choice for datasets with extreme values.
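The ranking procedure described on this card can be sketched in a few lines. This is a minimal illustration with a made-up function name, `median_of`; in practice Python's built-in `statistics.median` does the same job.

```python
def median_of(values):
    """Return the middle-ranked value of a non-empty list of numbers."""
    ranked = sorted(values)  # rank the values from smallest to largest
    n = len(ranked)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd count: the median is an original data point
        return ranked[middle]
    # Even count: the mean of the two middle values
    return (ranked[middle - 1] + ranked[middle]) / 2


print(median_of([5, 1, 3]))            # 3
print(median_of([4, 1, 3, 2]))         # 2.5
print(median_of([1, 2, 3, 4, 1000]))   # 3 (the extreme value has no effect)
```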
Mode
The mode is the most frequently occurring value in a dataset. A dataset can have more than one mode when several values share the highest frequency.
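As a small sketch of the idea, counting occurrences and keeping every value that reaches the highest count might look like this (`modes` is an illustrative name, and returning a list is a design choice so that ties are not hidden):

```python
from collections import Counter


def modes(values):
    """Return all values that occur most often, in first-seen order."""
    counts = Counter(values)
    if not counts:
        raise ValueError("mode of an empty dataset is undefined")
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


print(modes([1, 2, 2, 3]))                 # [2]
print(modes(["a", "b", "a", "b", "c"]))    # ['a', 'b']
```

Because it only counts occurrences, this also works for nominal data such as category labels.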
Arithmetic mean
The arithmetic mean is the average value of a set of numbers. It is calculated by summing all the values in the dataset and dividing the sum by the number of values. It is a measure of central tendency commonly used to describe the typical value in a dataset. Unlike the median, it is sensitive to extreme values, so a few outliers can shift it substantially.
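To make the sensitivity to outliers concrete, here is a short sketch on made-up data that computes the mean as sum divided by count and compares it with the median before and after one extreme value is added:

```python
import statistics


def arithmetic_mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)


scores = [2, 3, 3, 4, 5]          # hypothetical data
with_outlier = scores + [100]     # same data plus one extreme value

mean_before = arithmetic_mean(scores)
mean_after = arithmetic_mean(with_outlier)
median_before = statistics.median(scores)
median_after = statistics.median(with_outlier)

print(mean_before, mean_after)       # 3.4 19.5
print(median_before, median_after)   # 3 3.5
```

One added value moves the mean from 3.4 to 19.5, while the median only moves from 3 to 3.5.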
Parametric statistics
A methodology used in the analysis of data that assumes:
- The data is normally distributed.
- The data is measured on an interval or ratio scale.
- The variances in each group are roughly equal when comparing two or more groups.
Non-Parametric Statistics
Non-parametric statistics do not make strong assumptions about the parameters of the population distribution.
Assumptions:
Typically, the only assumption is that the data is ordinal (i.e., can be ranked), although some non-parametric tests can also be used with nominal data.
Random Sampling
Random sampling allows us to gather data from a subset of the population and use it to make estimates about the entire population. To account for the inherent variability in sampling, we present results as intervals (like confidence intervals) that provide a range of likely values for the population parameter, along with a measure of how confident we are about that range.
Sampling variability
The score we get from our sample (like the mean or median) won’t be a perfect representation of the entire population’s score because it’s based only on the sample we took. Depending on which individuals end up in our sample, the score will fluctuate. This fluctuation due to random sampling is termed sampling variability.
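A quick simulation can show this fluctuation. The sketch below uses an invented population of 10,000 scores (roughly normal around 50) and a fixed random seed; the population, sample size, and number of samples are all assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 scores
population = [rng.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Draw five independent random samples of 30 and record each sample mean
sample_means = [
    statistics.mean(rng.sample(population, 30))
    for _ in range(5)
]

print(f"population mean: {population_mean:.2f}")
for number, sample_mean in enumerate(sample_means, start=1):
    print(f"sample {number} mean: {sample_mean:.2f}")
```

Each sample mean lands near the population mean but not exactly on it, and no two samples agree exactly. That spread is the sampling variability.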
Confidence Interval
Because of sampling variability, presenting a single mean or median doesn’t give a complete picture. It gives a central or typical value, but it doesn’t tell us about the range of scores we might expect if we were to take another random sample from the population.
To address this, we present an interval around the mean or median. This interval is typically called a confidence interval. It provides a range of values, and we can be fairly confident that the true population parameter (like the true mean or median for the entire population) lies within this range.
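As a rough sketch, an interval around a sample mean can be computed as mean ± z × standard error. The version below uses the normal approximation from Python's standard library; that choice is an assumption for illustration, and for small samples a t-based interval would usually be preferred. The data and the function name `mean_confidence_interval` are made up.

```python
import math
import statistics
from statistics import NormalDist


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.mean(sample)
    # Standard error: sample standard deviation divided by sqrt(n)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # Critical value, about 1.96 for a 95% interval
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * standard_error
    return mean - margin, mean + margin


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores
low, high = mean_confidence_interval(sample)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```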
Confidence level
Along with the interval, we also provide a qualification, often called the confidence level, which tells us how confident we are that the true value lies within that interval. For instance, a 95% confidence level means that if we took many samples and calculated the confidence interval for each one, about 95% of those intervals would contain the true population value.
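The "many samples" reading of a 95% confidence level can be checked by simulation. The sketch below assumes a population that is normal with mean 100 and standard deviation 15 (invented values), builds a normal-approximation interval from each of 2,000 samples, and counts how often the interval contains the true mean.

```python
import math
import random
import statistics
from statistics import NormalDist

rng = random.Random(0)  # fixed seed so the run is repeatable

TRUE_MEAN, TRUE_SD = 100, 15         # assumed population parameters
SAMPLE_SIZE, REPETITIONS = 50, 2000  # assumed simulation settings
z = NormalDist().inv_cdf(0.975)      # critical value for a 95% interval

hits = 0
for _ in range(REPETITIONS):
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    mean = statistics.mean(sample)
    margin = z * statistics.stdev(sample) / math.sqrt(SAMPLE_SIZE)
    # Count the interval as a hit if it contains the true population mean
    if mean - margin <= TRUE_MEAN <= mean + margin:
        hits += 1

coverage = hits / REPETITIONS
print(f"intervals containing the true mean: {coverage:.1%}")
```

The printed share should be close to 95%, though not exactly 95%, both because the simulation itself is random and because the normal approximation is slightly optimistic for finite samples.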
Significance Testing/Hypothesis testing
Significance testing is a statistical method used to determine whether there’s sufficient evidence in a set of data to infer that a certain condition or effect holds true for a larger population from which a sample is drawn.
Null Hypothesis (H0)
In hypothesis testing, you start with a null hypothesis (H0) and an alternative hypothesis (Ha or H1).
The null hypothesis typically represents a statement of no effect, no difference, or no association in the context of your study.
Alternative Hypothesis (H1)
This is the claim the researcher is looking for evidence to support. For example, in a drug trial the null hypothesis would state that the drug has no effect, and the alternative hypothesis might state that the drug has a positive effect on patients.
Test Statistic
Depending on the data and the hypothesis, a specific test (e.g., t-test, chi-square test, ANOVA) is chosen to compute a test statistic. This statistic measures how far the sample result (such as the sample mean) is from what the null hypothesis predicts, usually relative to the variability expected from sampling.
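As one concrete instance, the one-sample t statistic is t = (sample mean − hypothesized mean) / (s / √n). The sketch below computes it for made-up data and a made-up null value of 4; the function name is illustrative.

```python
import math
import statistics


def one_sample_t_statistic(sample, null_mean):
    """Distance of the sample mean from the null value, in standard errors."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - null_mean) / standard_error


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, mean 5
t = one_sample_t_statistic(sample, null_mean=4)
print(f"t = {t:.3f}")  # t = 1.323
```

A value of about 1.32 says the sample mean sits roughly 1.3 standard errors above the value stated by the null hypothesis.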
P-value
The p-value is a fundamental statistical concept that helps researchers assess the strength of evidence against a null hypothesis. It provides a quantitative basis for deciding whether to reject or fail to reject the null hypothesis based on observed data.
- The p-value is calculated based on the test statistic and the assumed distribution (e.g., normal distribution, t-distribution) under the null hypothesis.
- It represents the probability of obtaining a test statistic as extreme or more extreme than the one observed in your data, assuming the null hypothesis is true.
- A small p-value (conventionally ≤ 0.05) is taken as evidence against the null hypothesis, leading to its rejection.
- A large p-value indicates weak evidence against the null hypothesis, leading to a failure to reject it. It does not show that the null hypothesis is true.
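The points above can be sketched with a two-sided z-test, where the test statistic is assumed to follow a standard normal distribution under the null hypothesis. The numbers (sample mean 103, null mean 100, known population standard deviation 15, n = 100) and the function name are invented for illustration.

```python
import math
from statistics import NormalDist


def two_sided_z_test(sample_mean, null_mean, population_sd, n):
    """Return (z, p_value) for a two-sided z-test with known population SD."""
    standard_error = population_sd / math.sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    # Probability of a statistic at least this extreme, in either
    # direction, if the null hypothesis is true
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p = two_sided_z_test(sample_mean=103, null_mean=100, population_sd=15, n=100)
alpha = 0.05  # conventional significance threshold
decision = "reject H0" if p <= alpha else "fail to reject H0"
print(f"z = {z:.2f}, p = {p:.4f}, decision: {decision}")
```

Here z is 2.0 and the p-value is about 0.046, just under the 0.05 threshold, so the null hypothesis would be rejected at that level.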