Lecture 8 Flashcards by Kayleigh Enriquez

What is a variable?

Any quantity that can be measured

Finish the sentence “In a dataset there will be ___ of a variable for each individual in the sample”

Observations

What is central tendency?

The typical value of a variable

What is dispersion?

How far from the typical value the individual observations of a variable are

What is an association?

How a variable relates to another variable

What are inferential statistics?

Statistics used to make inferences (predictions) about the parameters of the population from sample data

What are parameters?

Characteristics of the population (for example, the population mean)

What estimates the parameters?

Statistics computed from a sample

What is probability?

The chance that a particular event will occur

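What is a sampling distribution?

The probability distribution of a statistic (such as the mean) across all possible samples of a given size; it tells us how likely the value observed in our sample is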

What is hypothesis testing?

Using sample data to test whether it supports our beliefs (hypotheses) about the population

When do we use descriptive statistics?

To summarise sample data

When do we use statistical inference?

To generalise about population parameters

What determines or influences what statistical methods we can apply?

The level of measurement of the data

What are descriptive statistics for?

To summarise the key features of data.
- To make it understandable for human readers
- To identify characteristics
- To identify patterns
- To provide basis for further analysis

What are the measures of central tendency?

Mean (x̄), median (M), mode (Z)

What is a measure of central tendency?

Single number that represents the ‘typical’ value of a variable (an average: mean, median, mode)
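
As a minimal sketch of the three averages, assuming a made-up list of exam scores (the data and variable names are illustrative, not from the lecture), Python's standard `statistics` module can compute each one:

```python
from statistics import mean, median, mode

# Hypothetical sample: exam scores for seven students (illustrative only)
scores = [52, 60, 60, 65, 70, 74, 95]

sample_mean = mean(scores)      # sum of the values divided by how many there are
sample_median = median(scores)  # middle value once the scores are sorted
sample_mode = mode(scores)      # most frequently occurring value

print(sample_mean, sample_median, sample_mode)
```

The three numbers differ here because one high score (95) pulls the mean above the median.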

How would you visualise data?

In frequency tables and charts, e.g. bar charts and histograms

What is skewness?

Asymmetry in a distribution: a relatively higher proportion of values sits at the low (left) or high (right) end of the range (on the graph)

Where can you visualise skewness best?

In histograms, by comparing the values of the mean, median and mode

What does a normal distribution look like?

Evenly spread above and below the mean (bell shape)

Which side does a positive skew lean towards?

Right: the long tail stretches towards the high (right) end, while most values sit at the low end

Which side does a negative skew lean towards?

Left: the long tail stretches towards the low (left) end, while most values sit at the high end

What is the mean the best representation of?

The average in most cases of continuous data

What does the median identify?

The central point

What is the median useful for?

Skewed data (it is less affected by extreme values than the mean) or when continuous variables are measured on subjective scales
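
To illustrate why the median suits skewed data, here is a small sketch using invented income figures (the numbers are an assumption for demonstration only). A single extreme value shifts the mean a long way but barely moves the median:

```python
from statistics import mean, median

# Hypothetical weekly incomes; the last value is an extreme outlier
incomes = [20, 22, 25, 27, 30, 200]

income_mean = mean(incomes)      # pulled upwards by the outlier
income_median = median(incomes)  # average of the two middle values, 25 and 27

print(income_mean, income_median)
```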

What is the mode suitable for?

Nominal data or grouped data

What does dispersion measure?

How far, on average, each observation is from the central tendency (mean)

What does the dispersion figure represent?

The variation in values within a variable

What do lower values of dispersion indicate?

That the central tendency (mean) is a better representation of the 'typical value' (more accurate)

What do the range and interquartile range provide?

A basic measure, useful for visualisation and identifying outliers

Why should we use variance and standard deviation?

They are more statistically powerful measures

What's the interquartile range?

The range of the middle 50% of values (the difference between the medians of the upper and lower halves of the data)
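
The "median of the halves" approach can be sketched as follows. The function name `interquartile_range` and the data are hypothetical, and other quartile conventions exist that can give slightly different answers:

```python
from statistics import median

def interquartile_range(values):
    """IQR as the median of the upper half minus the median of the lower half.

    For an odd number of values the middle value is left out of both halves.
    Assumes at least two values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]
    upper_half = ordered[-half:]
    return median(upper_half) - median(lower_half)

# Lower half [1, 3, 5, 7] has median 4; upper half [9, 11, 13, 15] has median 12
print(interquartile_range([1, 3, 5, 7, 9, 11, 13, 15]))
```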

What is variance?

The mean of the squared differences between each data point and the mean

What is standard deviation?

Square root of the variance (most common measure of dispersion)

What can measures of dispersion not be applied to?

Nominal variables

What is a good visual form for understanding dispersion of a variable and identifying outliers?

Box plots

What is a box plot outlier?

A value that lies outside the box plot limits

How do you calculate variance?

Mean of the squared differences between each value in the dataset and the mean

How do you calculate standard deviation?

Square root of the variance
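
The two calculations above can be sketched step by step. The data are invented for illustration. This follows the cards' definition (dividing by the number of values, i.e. the population variance); the sample variance divides by n - 1 instead.

```python
from math import sqrt
from statistics import pvariance

# Hypothetical data (illustrative only)
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean_value = sum(values) / len(values)                   # 5.0
squared_diffs = [(v - mean_value) ** 2 for v in values]  # squared distance of each value from the mean
variance = sum(squared_diffs) / len(values)              # mean of the squared differences
std_dev = sqrt(variance)                                 # square root of the variance

# Cross-check against the standard library's population variance
library_variance = pvariance(values)

print(variance, std_dev, library_variance)
```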

What does the measure of association consider?

The relationship between two variables

What does kurtosis mean?

How flat or peaked a distribution is

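What does a large SD (standard deviation) indicate?

A wide, flat distribution (values spread far from the mean)

What does a small SD (standard deviation) indicate?

A narrow distribution (values clustered close to the mean)

What does standard deviation tell us about in terms of distribution?

The spread of the distribution: how flat (wide) or narrow it is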

What's the statistic for categorical data?

Chi-squared (χ²)

What's the statistic for continuous data?

Pearson's correlation coefficient (r)

What does Chi-Squared measure?

The association between two categorical variables

What does chi-squared compare?

The observed frequencies in the sample data with the frequencies we would expect if there were no relationship
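
A minimal sketch of that comparison, assuming an invented 2x2 table of counts (the numbers and variable names are illustrative). Each expected count is the row total times the column total divided by the grand total, and the statistic sums (observed - expected)² / expected over all cells:

```python
# Hypothetical observed counts: rows are one categorical variable,
# columns are the other (illustrative data only)
observed = [[30, 10],
            [20, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_squared = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Count expected in this cell if the two variables were unrelated
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_squared += (obs - expected) ** 2 / expected

print(round(chi_squared, 3))
```

The larger the statistic, the further the observed counts are from what "no relationship" would predict.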

What does correlation (r) measure?

Strength (magnitude) and direction of the linear association between two variables

What does a positive correlation mean?

Increase in one variable associated with an increase in the other

What does a negative correlation mean?

Increase in one variable associated with a decrease in the other

What does 0 correlation mean?

No linear association

What does Chi-Squared rely on?

Testing for statistical significance

What is statistical significance?

A result is statistically significant when it would be unlikely to occur by chance alone if the null hypothesis were true

What is a critical value?

How far from the expected centre a test statistic needs to be before we say the result is unusual

What is correlation used for?

Ordinal and scale data (continuous)

What is chi-squared used for?

Nominal (Categorical) data

What is covariance?

The degree to which two variables deviate from their expected values (mean) in similar ways

What does a positive covariance indicate?

Variables that tend to ‘move together’ away from their means: if we observe a high value of x, we also expect to see a high value of y

What's a scatter plot good for?

Checking if there is a linear relationship between two variables

What does negative covariance indicate?

Variables that move in opposite directions: if we observe a high value of x, we expect to see a value of y below its mean
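
Covariance and Pearson's r can be sketched together, since r is the covariance rescaled by the spread of each variable. The data below are invented, and the sample covariance (dividing by n - 1) is an assumption about which version is intended:

```python
from math import sqrt

# Hypothetical paired observations (illustrative only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sum of the products of each pair's deviations from their means
cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
covariance = cross_products / (n - 1)  # positive: x and y tend to move together

# Pearson's r always lies between -1 and +1
sum_sq_x = sum((a - mean_x) ** 2 for a in x)
sum_sq_y = sum((b - mean_y) ** 2 for b in y)
r = cross_products / sqrt(sum_sq_x * sum_sq_y)

print(covariance, round(r, 3))
```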

What is a strong correlation's r value?

r = ±0.8

What is a weak correlation's r value?

r = ±0.3

What is an omitted variable?

A factor that could lead to changes in both X and Y

What is reverse causality?

A change in Y leads to the change in X, rather than X leading to Y

What is Sample selection bias?

When the individuals sampled have a different tendency to show the association than the whole population

What's a measurement error?

Values in the data that differ from the true value of the variable

What does association NOT imply?

Causation

What are alternative reasons for finding a relationship?

Omitted variables, reverse causality, sample selection bias and measurement error

What does the test we use depend on?

Data meeting certain assumptions

What does the statistical inference process rely on?

Estimating the probability of obtaining our sample results, based on the distribution of sample statistics and population parameters.

What's the standard normal (Z) distribution's mean and SD value?

Mean = 0 SD = 1

What does a value's Z-score represent?

How many standard deviations it lies from the mean

What does a higher Z score mean?

A lower probability of observing a value that far from the mean
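
As a sketch of the Z-score idea, assuming a variable with mean 100 and SD 15 (invented figures, and `z_score` is a hypothetical helper), the score is (value - mean) / SD, and the standard normal distribution gives the probability of a value at least that high:

```python
from statistics import NormalDist

def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

# Hypothetical variable with mean 100 and SD 15
z = z_score(130, mean=100, sd=15)

# Probability of a value at least this high under the standard normal (mean 0, SD 1)
upper_tail = 1 - NormalDist().cdf(z)

print(z, round(upper_tail, 4))
```

Repeating the calculation with a larger Z gives a smaller tail probability.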

What is the central limit theorem?

The sampling distribution of the mean will be approximately normally distributed, whatever the shape of the population distribution, as long as the sample size is large enough
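
A small simulation can make the theorem concrete. This is a sketch under stated assumptions: the population is taken to be a strongly skewed exponential distribution with mean 1 and SD 1, and the sample size, number of samples and seed are arbitrary choices. The sample means should centre on the population mean, with a spread close to SD / √n:

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the run is repeatable
sample_size = 50        # n, the size of each random sample
num_samples = 2000      # how many samples to draw

# Draw many samples from a skewed population and record each sample's mean
sample_means = []
for _ in range(num_samples):
    sample = [rng.expovariate(1.0) for _ in range(sample_size)]
    sample_means.append(sum(sample) / sample_size)

mean_of_means = mean(sample_means)   # should be close to the population mean, 1
sd_of_means = stdev(sample_means)    # should be close to 1 / sqrt(n)
expected_sd = 1 / sqrt(sample_size)

print(round(mean_of_means, 3), round(sd_of_means, 3), round(expected_sd, 3))
```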

What is the 'sampling distribution of the sample means'?

The distribution consisting of the means of all random samples of a given size (n) that can be drawn from a population.

What does the alternative hypothesis (H1) mean?

What we believe the data will support

What's the null hypothesis (H0)?

It covers all the states we want to disprove

What is the statistical significance level?

Conventionally 0.05 (a significance level of 5%)

Give the 4 levels of measurement of data

Nominal, ordinal, interval or ratio data

What do larger samples provide?

More reliable results

Lecture 8 Flashcards

(83 cards)