Descriptive Statistics Flashcards
What is the definition of descriptive statistics?
Descriptive statistics are methods for summarizing and organizing a group of scores to make them understandable.
How do descriptive statistics differ from inferential statistics?
Descriptive statistics summarize data, while inferential statistics generalize findings from a sample to a larger population.
What are the key components of descriptive statistics?
Sample Distributions
Measures of Central Tendency
Measures of Variability
Data Visualization
What is Central Tendency?
The central or typical value of a dataset, summarized by a single number such as the mode, median, or mean.
What is a sample distribution?
A summary of the distribution of scores for a variable, showing values and their frequencies.
What are common tools to summarize distributions?
Frequency tables
Bar charts
Histograms
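A frequency table can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the scores are invented, and `frequency_table` is a name chosen here for the example.

```python
from collections import Counter

# Hypothetical quiz scores, invented for illustration.
scores = [3, 5, 4, 5, 2, 5, 4, 3, 4, 5]

# Count how often each distinct value occurs.
frequency_table = Counter(scores)

# Print each value (ascending) next to its frequency.
for value in sorted(frequency_table):
    print(value, frequency_table[value])
```

Each row pairs a value with the number of times it appears, which is the information a sample distribution summarizes.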
What are the measures of central tendency?
Mode
Median
Mean
What are the advantages and disadvantages of using the mode?
Advantages: Unaffected by outliers, identifies the most common value, best for nominal data.
Disadvantages: Ignores every value except the most frequent one, may not be unique (a dataset can have several modes), not useful for small or uniform datasets.
What are the advantages and disadvantages of using the median?
Advantages: Resistant to outliers, useful for skewed distributions.
Disadvantages: Depends only on the middle position in the ordered data, so it ignores the exact values of the other observations.
What are the advantages and disadvantages of using the mean?
Advantages: Widely used, considers all data points.
Disadvantages: Affected by extreme values (outliers).
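The trade-offs on the three cards above can be checked with a small sketch. The income figures are invented for illustration, and `summarize` is a hypothetical helper built on Python's standard `statistics` module.

```python
import statistics

# Hypothetical annual incomes in thousands, invented for illustration.
incomes = [30, 32, 32, 35, 38]
incomes_with_outlier = incomes + [500]  # one extreme value added


def summarize(data):
    """Return the three measures of central tendency for a list of numbers."""
    return {
        "mode": statistics.mode(data),
        "median": statistics.median(data),
        "mean": statistics.mean(data),
    }


before = summarize(incomes)
after = summarize(incomes_with_outlier)

print(before)
print(after)
```

With the outlier added, the mode stays at 32 and the median moves only from 32 to 33.5, while the mean jumps from 33.4 to roughly 111.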
What is the difference between bar charts and histograms?
Bar Charts:
Represent categorical data.
Bars are separated with spaces.
Can be arranged in any order.
Example: Comparing different product sales.
Histograms:
Represent numerical (continuous) data.
Bars touch each other to show continuity.
Values are grouped into intervals (bins).
Example: Distribution of student test scores.
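The grouping of values into bins can be sketched without a plotting library. This assumes integer test scores and a bin width of 10. Both are choices made for this example, and the data is invented.

```python
# Hypothetical test scores, invented for illustration.
test_scores = [52, 58, 61, 67, 68, 70, 73, 75, 78, 81, 84, 88, 93]
bin_width = 10  # assumed interval width

# Map each score to the lower edge of its bin, e.g. 67 -> 60.
bin_counts = {}
for score in test_scores:
    lower_edge = (score // bin_width) * bin_width
    bin_counts[lower_edge] = bin_counts.get(lower_edge, 0) + 1

# Print a text histogram: one row per bin, one '#' per score in it.
for lower_edge in sorted(bin_counts):
    label = f"{lower_edge}-{lower_edge + bin_width - 1}"
    print(label, "#" * bin_counts[lower_edge])
```

Unlike bar chart categories, the bins have a natural order and cover adjacent intervals, which is why histogram bars touch.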
What is a normal distribution?
A symmetrical, bell-shaped curve where most data points cluster around the mean.
What are the key features of a normal distribution?
Half of the values fall above the mean and half below it
Symmetry
Defined by mean and standard deviation
What is a positively skewed distribution?
A distribution where the tail extends to the right, indicating more low values and a few high outliers.
What is a negatively skewed distribution?
A distribution where the tail extends to the left, indicating more high values and a few low outliers.
What is skewness?
A measure of the asymmetry of a distribution's shape.
What is kurtosis?
A measure of the "tailedness" or sharpness of the peak of a distribution.
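Skewness and kurtosis can be computed from central moments. The sketch below assumes the simple moment-based definitions (third and fourth central moments divided by powers of the second). Statistical packages often apply other conventions, such as bias corrections or reporting kurtosis minus 3, so results may differ. The function names and data are invented for illustration.

```python
def central_moment(data, k):
    """Average of (x - mean) ** k over all data points."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)


def skewness(data):
    # Third central moment scaled by the variance to the power 1.5.
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def kurtosis(data):
    # Fourth central moment scaled by the squared variance.
    return central_moment(data, 4) / central_moment(data, 2) ** 2


# Hypothetical datasets, invented for illustration.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 4, 15]   # tail extends to the right
left_skewed = [1, 12, 13, 13, 14, 14, 15]  # tail extends to the left

for name, data in [("symmetric", symmetric),
                   ("right_skewed", right_skewed),
                   ("left_skewed", left_skewed)]:
    print(name, skewness(data), kurtosis(data))
```

Under these definitions a positive skewness indicates a right (positive) skew, a negative value indicates a left (negative) skew, and a symmetric dataset gives zero.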
What are the measures of variability?
Range
Interquartile Range (IQR)
Variance
Standard Deviation
How is range calculated, and what are its pros and cons?
Formula: Range = Max value - Min value
Pros: Simple to calculate.
Cons: Sensitive to outliers.
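A short sketch of the range formula, using invented scores, shows how a single outlier changes the result. `value_range` is a hypothetical helper name.

```python
def value_range(data):
    """Range = maximum value minus minimum value."""
    return max(data) - min(data)


# Hypothetical scores, invented for illustration.
scores = [4, 7, 8, 9, 12]
scores_with_outlier = scores + [60]

range_before = value_range(scores)
range_after = value_range(scores_with_outlier)

print(range_before)
print(range_after)
```

One extreme value raises the range from 8 to 56, even though the other five scores are unchanged.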
What is the interquartile range (IQR), and why is it useful?
The IQR measures the range of the middle 50% of data, reducing the impact of outliers.
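What is the formula for IQR?
IQR = Q3 - Q1, where Q1 is the first quartile (25th percentile) and Q3 is the third quartile (75th percentile).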
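The IQR can be sketched with `statistics.quantiles` from Python's standard library. Quartile conventions vary between textbooks and tools. This example assumes the function's default "exclusive" method, so other methods may give slightly different quartiles. The data and the `iqr` helper name are invented for illustration.

```python
import statistics


def iqr(data):
    """Interquartile range: Q3 minus Q1 (default 'exclusive' quartile method)."""
    q1, _q2, q3 = statistics.quantiles(data, n=4)
    return q3 - q1


# Hypothetical values, invented for illustration.
values = [1, 2, 3, 4, 5, 6, 7]
values_with_outlier = [1, 2, 3, 4, 5, 6, 70]

iqr_before = iqr(values)
iqr_after = iqr(values_with_outlier)

print(iqr_before)
print(iqr_after)
```

Replacing the largest value with an outlier leaves the IQR at 4.0, because only the middle 50% of the data enters the calculation.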
What is the formula for deviation?
Deviation = X − X̄
Where:
X = individual data point
X̄ = mean of the dataset
What is the formula for variance?
s² = Σ(X − X̄)² / N, that is, the average squared deviation from the mean.
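The deviation and variance formulas can be worked through step by step. This sketch follows the card's formula and divides by N. Many textbooks divide by N − 1 when estimating variance from a sample, which gives a slightly larger value. The scores are invented for illustration.

```python
# Hypothetical scores, invented for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(scores)
mean = sum(scores) / n

# Deviation of each score from the mean (X minus the mean).
deviations = [x - mean for x in scores]

# Variance as on the card: sum of squared deviations divided by N.
variance = sum(d ** 2 for d in deviations) / n

print("mean:", mean)
print("deviations:", deviations)
print("sum of deviations:", sum(deviations))
print("variance:", variance)
```

The raw deviations always sum to zero, which is why they are squared before averaging. Here the mean is 5.0 and the variance is 4.0.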
What does standard deviation represent?
The typical distance of data points from the mean, calculated as the square root of the variance: s = √(s²).
How do you interpret standard deviation values?
Small SD: Data is tightly clustered around the mean.
Large SD: Data is widely spread out.
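Two datasets with the same mean but different spread illustrate this interpretation. The sketch uses `statistics.pstdev`, which divides by N before taking the square root. The data is invented for illustration.

```python
import statistics

# Hypothetical datasets with the same mean (50), invented for illustration.
tight = [48, 49, 50, 51, 52]
spread = [10, 30, 50, 70, 90]

# Standard deviation computed with N in the denominator.
sd_tight = statistics.pstdev(tight)
sd_spread = statistics.pstdev(spread)

print("tight: mean", statistics.mean(tight), "SD", sd_tight)
print("spread: mean", statistics.mean(spread), "SD", sd_spread)
```

Both datasets center on 50, but the first has an SD of about 1.4 and the second about 28.3, reflecting how far the points lie from the mean.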