Vocab Quiz Flashcards
Absolute Value
Term used in mathematics to indicate distance of a point or number from the origin (zero point) of a number line or coordinate system. The symbol is a pair of vertical lines flanking the quantity whose absolute value is to be determined.
Alpha
Known as the significance level, alpha is the probability of a Type I error (rejecting the null hypothesis when it is true).
Alternative Hypothesis
States what the researcher expects to find; it is the tentative answer to the research question that guides the entire study. Also called the research/causal hypothesis.
Array
In statistics, an array is a data set arranged in order of magnitude (for example, from least to greatest), often laid out in rows and columns.
Association
Two variables are associated (correlated) when the values of one tend to change systematically as the values of the other change. A positive association means that higher values of one variable tend to occur with higher values of the other; the tendency does not have to hold 100% of the time.
Beta
Probability of a Type II error (failing to reject the null hypothesis when it is false). In other words, the probability of incorrectly concluding that there is no statistically significant effect.
Bias
Prejudice in favor of or against one thing, person, or group compared with another, usually in a way considered to be unfair. Can be knowingly or unknowingly done.
Bimodal
Having or involving two modes, in particular of a statistical distribution having two maxima.
Categorical
Variable that can take on one of a limited and usually fixed number of possible values, assigning each individual or other unit of observation to a particular group or nominal category on the basis of some qualitative property.
Causality
The relationship in which a change in one variable produces a change in another. Correlation between variables does not automatically mean that the change in one variable is the cause of the change in the other.
Causation
Indicates that one event is the result of the occurrence of the other event.
Central Tendency
A summary measure that attempts to describe a whole set of data with a single value that represents the middle or center of its distribution. There are three main measures of central tendency: the mean, median, and mode.
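As a quick illustration, the three measures can be computed with Python's standard `statistics` module. The data values below are made up for the example.

```python
from statistics import mean, median, mode

# Hypothetical data, chosen only for illustration.
scores = [2, 3, 3, 5, 7]

center_mean = mean(scores)      # sum of values divided by their count
center_median = median(scores)  # middle value of the sorted data
center_mode = mode(scores)      # most frequent value

print(center_mean, center_median, center_mode)
```

The mean is pulled upward by the larger values, while the median and mode stay at the most typical score.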
Cohort
A “group” or “panel”. Associated with types of studies used for determining the natural history of a condition.
Column
A vertical division of a data table.
Confidence Interval
Estimated range of values likely to include an unknown population parameter; the estimated range being calculated from a given set of sample data
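A minimal sketch of one common way to build such an interval for a mean, assuming a normal approximation (small samples would normally use the t distribution instead). The function name `normal_ci` and the sample values are illustrative, not from the card.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def normal_ci(sample, confidence=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    n = len(sample)
    center = mean(sample)
    standard_error = stdev(sample) / sqrt(n)
    # z value that leaves (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return center - margin, center + margin

sample = [4, 8, 6, 5, 7]  # hypothetical sample data
lower, upper = normal_ci(sample)
print(round(lower, 2), round(upper, 2))
```

Raising `confidence` widens the interval; a larger sample narrows it.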
Contingency Table
A table showing the distribution of one variable in rows and another in columns, used to study the association between the two variables.
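A small sketch of how such a table can be tallied from raw observations. The categories and observations are hypothetical.

```python
from collections import Counter

# Hypothetical observations: (exposure, outcome) for each subject.
observations = [
    ("smoker", "disease"), ("smoker", "disease"), ("smoker", "healthy"),
    ("non-smoker", "disease"), ("non-smoker", "healthy"), ("non-smoker", "healthy"),
]

cell_counts = Counter(observations)
rows = ["smoker", "non-smoker"]     # one variable in rows
columns = ["disease", "healthy"]    # the other variable in columns

# Each cell holds the count of subjects with that row/column combination.
table = [[cell_counts[(r, c)] for c in columns] for r in rows]

for label, counts in zip(rows, table):
    print(label, counts)
```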
Continuous
Variable whose values are ordered and measured on a numeric scale; summarized with the mean and standard deviation. Made up of interval variables (the distance between attributes has meaning) or ratio variables (it is possible to locate an absolute zero).
Control group
Group of subjects closely resembling the treatment group in many demographic variables but not receiving the active medication or factor under study and thereby serving as a comparison group when treatment results are evaluated.
Correlated
Having a mutual relationship or connection between variables.
Counts
The total number of observations in each category, determined by tallying.
Cross-sectional
A “one-time look”. All observations on a given subject are made at essentially one point in time. It is weaker than longitudinal (look over a period of time). It is used to describe some features of a population, such as prevalence (rather than incidence, which is longitudinal).
Cumulative
Of or relating to the total observed frequency of data, or the probability of a random variable, that is less than or equal to a specified value.
Dependent data
Measurements that are linked to one another, such as before and after measurements of the same subjects. A paired (dependent) t-test is performed to determine whether there is a difference between the two sets of measurements.
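A minimal sketch of the test statistic for before/after measurements on the same subjects, assuming the usual paired t formula (mean difference divided by its standard error). The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(before, after):
    """t statistic for dependent (paired) samples."""
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)
    # Mean difference divided by the standard error of the differences.
    return mean(differences) / (stdev(differences) / sqrt(n))

before = [10, 12, 9, 11]   # hypothetical pre-treatment scores
after = [12, 14, 10, 14]   # hypothetical post-treatment scores
t_stat = paired_t_statistic(before, after)
print(round(t_stat, 3))
```

The statistic would then be compared against a t distribution with n - 1 degrees of freedom (3 here); that lookup is left out of the sketch.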
Dependent variable
Variable being tested and measured in a scientific experiment (also called an outcome variable)
Descriptive
Studies that provide information about a problem, condition, or phenomenon; used to characterize it in a certain population.
Dichotomous
Nominal variable that only has two attributes.
Dispersion
Extent to which a distribution is spread; common measures include variance, SD, IQR.
Effect size
Descriptive statistic that estimates the strength of a relationship, or the size of the difference between the true value and the hypothesized value. Can be either standardized or unstandardized. The standardized effect size is calculated as the difference between two means divided by the pooled SD of the two samples (or the SD of either sample, assuming equal variances).
ES = mean difference / SD
Also called the magnitude of the phenomenon of interest in the population. For study planning, it is the smallest effect of clinical or practical significance, generally determined by an expert in the field, who dictates how much of a difference would be clinically significant.
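The ES formula above can be sketched as follows, assuming the pooled-SD version commonly known as Cohen's d. The function name and the two groups are illustrative only.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardized effect size: mean difference divided by pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)

treatment = [2, 4, 6]  # hypothetical scores
control = [1, 3, 5]    # hypothetical scores
effect_size = cohens_d(treatment, control)
print(effect_size)  # 0.5
```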
Experimental research
Experimental designs give the researcher control over the intervention or causal factor being studied, typically through randomization. The researcher actively alters the system under test and observes the result.
Exponent
The power to which a base is raised. In statistical formulas the base is most often e (Euler's number, approximately 2.718).
False positive
Test result which incorrectly indicates that a particular condition or attribute is present (Type 1/alpha error).
False negative
Test result which incorrectly indicates that a particular condition or attribute is absent (Type 2/beta error).
Frequencies
Rate at which something occurs or is repeated over a particular period of time or in a given sample.
Homogeneity
Assumption that a time series sample is drawn from a stable/homogenous process
Incidence
Rate of new cases of the disease. Reported as the number of new cases over a set period of time.
Independent
Two events are statistically independent if the occurrence of one does not affect the probability of occurrence of the other. Two random variables are independent if the realization of one does not affect the probability distribution of the other.
Independent variable
Sometimes called an experimental or predictor variable; the variable that is manipulated in an experiment in order to observe the effect on a dependent variable.
IQR
Interquartile range. Measure of variability based on dividing a data set into quartiles. Quartiles divide a rank-ordered data set into four equal parts; the IQR is the difference between the third and first quartiles (Q3 − Q1).
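A short sketch using `statistics.quantiles` from the Python standard library. Quartile conventions differ between textbooks and software, so other tools may give slightly different cut points for the same data; the data set is hypothetical.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7]  # hypothetical, already rank-ordered

# n=4 returns the three cut points that split the data into four parts.
q1, q2, q3 = quantiles(data, n=4)

# Spread of the middle half of the data.
iqr = q3 - q1
print(q1, q2, q3, iqr)
```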
Interval
Data in which the distance between attributes has meaning.
Intervention
The treatment or factor received by the experimental group.
Kurtosis
Measure based on the size of a distribution’s tails. Distributions with large tails are leptokurtic (taller) and those with small tails are platykurtic (shorter and wider). A distribution with the same kurtosis as the normal distribution is mesokurtic.
Leptokurtotic
Having large positive kurtosis; more concentrated around the mean.
Longitudinal
Tracks the same sample/cohort at different points in time.
Mean
Sum of sampled values divided by number of items in the sample.
Median
Middle value of an ordered set of numbers; half the values lie above it and half below.
Mode
Value that occurs most often in a data set.
Nominal
Data that is neither measured nor ordered; subjects are merely allocated into distinct categories.
Non-parametric
Refers to statistical methods in which the data do not need to fit a normal distribution. Often used for ordinal or other rank-based data.
Normal
Bell shaped frequency distribution curve, also called Gaussian.
Null hypothesis
Hypothesis that states there is no significant difference between specified populations, with any observed difference due to sampling or experimental error. Usually the ‘working hypothesis’ that the researcher wants to disprove or reject/discredit.
Observational research
A social research technique that involves the direct observation of phenomena in their natural setting.
Outcome variable
Also called a dependent variable.
Paired data
Ordered pairs; refers to two variables in the individuals of a population that are linked together in order to determine the correlation between them. Both of these data values must be linked or attached to one another and not considered separate.
Parameters
Numerical values that describe aspects of the parent population (for example, the population mean).
Parametric statistics
Assumes sample data comes from a population that follows a probability distribution based on a fixed set of parameters. The normal distribution is the most common example.
Percentages
Way to represent statistics ‘per hundred’.
Percentile
Measure used in stats indicating the value below which a given percentage of observations fall.
Placebo
Harmless, inactive intervention given for psychological rather than physiological benefit.
Platykurtotic
Having negative excess kurtosis; a flatter, wider curve than the normal (Gaussian) distribution.
Population
Entire collection of units of interest described by parameters.
Power
Probability that a test will reject a false null hypothesis. Power equals 1 − beta, so it is inversely related to the probability of a Type II error.
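A rough sketch of the relationship between power and beta, assuming the simplest case: a one-sided, one-sample z-test with a known standard deviation. The function name, effect size, and sample sizes are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a one-sided, one-sample z-test (SD assumed known)."""
    standard_normal = NormalDist()
    critical_z = standard_normal.inv_cdf(1 - alpha)
    # Under the alternative, the test statistic is centered at
    # effect_size * sqrt(n); beta is the chance it still falls
    # below the critical value.
    beta = standard_normal.cdf(critical_z - effect_size * sqrt(n))
    return 1 - beta

power = z_test_power(effect_size=0.5, n=25)
print(round(power, 2))
```

With these inputs the power comes out close to the conventional 0.80 target; increasing `n` or `effect_size` lowers beta and raises power.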
Predictor variable
An independent variable (IV); used to observe its effect on the dependent variable (DV).
Prevalence
Number of cases of a disease present in a particular population at a given time.
Prospective
Study where participants are enrolled prior to developing disease or outcome in question.
Quartile
One of the three points dividing a range of data or population into four equal parts.
Random
Sample selected from a finite population is said to be random if every possible sample has equal probability of selection.
Randomized clinical trial
Participants are assigned by chance to separate groups with different treatments; neither the researchers nor the participants can choose which group a participant is assigned to.
Range
Difference between max and min observations.
Rank
Determined by sorting data into order and replacing each value with its relative position in that order.
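A minimal sketch of ranking, assuming no tied values (ties are usually given the average of the positions they occupy, which this sketch does not handle). The helper name and scores are hypothetical.

```python
def ranks(values):
    """Return the rank of each value; rank 1 is the smallest. Assumes no ties."""
    order = sorted(values)
    # Position in the sorted list, shifted to start at 1.
    return [order.index(v) + 1 for v in values]

scores = [50, 20, 40, 10]  # hypothetical raw scores
score_ranks = ranks(scores)
print(score_ranks)  # [4, 2, 3, 1]
```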
Ratio
Interval variables with a true zero point: a value of 0 means there is none of the quantity being measured.
Regression
Statistical method that attempts to determine the strength of the relationship between one dependent variable and a series of other changing variables (the independent variables).
Reliability
Measure of reproducibility or stability; the degree to which an assessment tool produces consistent results.
Relational
Type of study designed to look at relationships between two or more variables.
Retrospective
Type of cohort study that looks backwards and examines exposures to suspected risk or protection factors in relation to an outcome that was established at the start of the study.
Row
1) Summary stats when row values have been put into ranges.
2) Horizontal sections of data tables.
Sample
Portion from a larger population of interest being studied; should be representative of the larger population to give generalizable results
Scale
Refers to the way in which variables are defined and categorized, e.g. nominal, ordinal, interval, or ratio.
Sensitivity
Ability of a test to correctly identify those with the disease (true positive rate). If a test is highly sensitive and the result is negative, you can rule out disease. Mnemonic: SnNout (Sensitive test, Negative result, rules OUT).
Skewed/skewness
Degree of asymmetry of data distribution. Positive and negative skew.
Specificity
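Ability of a test to correctly identify those without the disease (true negative rate). If a test is highly specific and the result is positive, you can rule in disease. Mnemonic: SpPin (Specific test, Positive result, rules IN).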
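Sensitivity and specificity can both be computed from the four cells of a 2x2 diagnostic table. The sketch below uses made-up counts; the function names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """True positive rate: share of diseased subjects the test catches."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """True negative rate: share of healthy subjects the test clears."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts: 100 subjects with the disease, 100 without.
tp, fn = 90, 10   # diseased: test positive / test negative
tn, fp = 80, 20   # healthy: test negative / test positive

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(sens, spec)  # 0.9 0.8
```

The false negatives here are Type II (beta) errors and the false positives are Type I (alpha) errors, matching the False negative and False positive cards.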
Standard deviation
Measure used to quantify the amount of variation or dispersion of a set of values.
Standard error
Standard deviation of a statistic's sampling distribution, or an estimate of that standard deviation.
Statistics
The practice or science of collecting and analyzing numerical data in large quantities, especially for the purpose of inferring proportions in a whole from those in a representative sample.
Stratified
Sampling method by which the researcher divides the population into separate groups called strata, with a random sample taken from each group
Systematic
Probability sampling method in which sample members are picked from a random starting point at a fixed periodic interval.
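A small sketch of the idea: choose a random start within the first interval, then take every k-th member. The function name, the `seed` parameter, and the population of 100 numbered units are assumptions made for the example.

```python
import random

def systematic_sample(population, interval, seed=None):
    """Pick a random start in the first interval, then every interval-th unit."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random starting index: 0 .. interval-1
    return population[start::interval]

population = list(range(1, 101))  # hypothetical sampling frame of 100 units
sample = systematic_sample(population, interval=10, seed=0)
print(sample)
```

Only the starting point is random; once it is chosen, the rest of the sample is fully determined by the interval.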
Type I Error
Incorrect rejection of a true null hypothesis (false positive finding)
Type II Error
Incorrect retention of a false null hypothesis (false negative finding)
Variability
Dispersion; the extent to which a distribution is stretched
Variance
Measures how far a set of numbers is spread out from its average value; the square of the standard deviation.
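A quick check of the variance and standard deviation relationship, using the population versions from Python's `statistics` module (the sample versions, `variance` and `stdev`, divide by n - 1 instead of n). The data set is hypothetical.

```python
from statistics import pstdev, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values with mean 5

var = pvariance(data)  # average squared distance from the mean
sd = pstdev(data)      # square root of the variance

print(var, sd)
```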
Validity
Extent that the instrument measures what it was designed to measure
Z-score
Measure of how many standard deviations a raw score lies below or above the population mean.
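As a worked illustration, the z-score is the raw score minus the population mean, divided by the population SD. The function name and the numbers below are made up for the example.

```python
def z_score(raw_score, population_mean, population_sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw_score - population_mean) / population_sd

# Hypothetical scale with population mean 100 and SD 15.
z = z_score(130, 100, 15)
print(z)  # 2.0
```

A positive z-score is above the mean and a negative one is below it.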