W2: RQ for Associations (Page 1-22) Flashcards
What does a good research question contain:
- A question
- All constructs being investigated
- The study population
- A verb that indicates the type of relationship among constructs being proposed (Associations, Predictions, Difference)
* Most important
What is a population and a sample? How many are there?
Population
- Complete set of all individuals relevant to our research question (often defined by psychological construct)
- Only one relevant to the RQ
Sample
- Subset of individuals selected by some sampling scheme from the population, and assumed to be representative of that population
- Many samples can be drawn from the population, but research typically uses only one
Random variable and random samples. What are the types? How is it typically done?
Random Variable
- Each value has an associated probability of occurrence
Continuous
- Any numerical value within a defined interval (e.g. 0 to 100, -infinity to +infinity)
Discrete
- Finite number of distinct values (e.g. integers 1,2,3,4,5)
Random Sample
- Each member of the population has an equal probability of being selected.
- Typically done using a uniform probability distribution
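A minimal sketch of drawing a random sample in which every member has the same chance of selection. The population of ID numbers and the sample size of 50 are made up for illustration.

```python
import random

# Hypothetical population: ID numbers 1..1000 (made up for illustration).
population = list(range(1, 1001))

rng = random.Random(42)  # fixed seed so the draw is reproducible

# random.sample draws without replacement; every member of the
# population is equally likely to end up in the sample.
sample = rng.sample(population, k=50)

print(len(sample), sample[:5])
```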
Construct, Measures and Scores
Construct
Unobservable attributes used to explain human behavior
Measure
Method to measure people on a construct to obtain a construct score
Scores
Numerical value on construct measure assigned to an individual
- Raw
- Deviation
- Z
- Standardised
What is a raw/observed score and what are they generally indicated by
Values obtained directly from construct measure.
Capital letter (X,Y), a variable containing a set of values
What is a deviation score and what are they generally indicated by.
x = X - mean(X)
- Lower case letter (x,y)
- Mean subtracted from individual score
- Mean: 0
- SD: Same as SD from Raw Scores
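A small sketch of the deviation-score card, using made-up raw scores. It checks the two properties listed above: the deviation scores average to 0 and keep the SD of the raw scores.

```python
import statistics

# Hypothetical raw scores X (made up for illustration).
X = [12, 15, 9, 20, 14]

mean_X = statistics.mean(X)          # 14
x = [score - mean_X for score in X]  # deviation scores: [-2, 1, -5, 6, 0]

print(statistics.mean(x))                        # mean of the deviation scores
print(statistics.stdev(X), statistics.stdev(x))  # SD of raw vs deviation scores
```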
What is a z score. What is the formula, mean and sd of z scores
A particular kind of standardized score, obtained by dividing a deviation score by the standard deviation
z = x / sd(x)
- Mean = 0
- SD = 1
What is a standardized score (Generally speaking). Give an example.
- Raw scores that have been transformed so that they have a predefined mean and a predefined scaling for each unit of standard deviation.
- IQ = 100 + 15*z
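A sketch that chains the two transformations, z = x / sd(x) followed by IQ = 100 + 15*z. The raw scores are made up, and using the sample SD (n - 1 denominator) is an assumption; the cards do not say which SD is meant.

```python
import statistics

# Hypothetical raw scores (made up for illustration).
X = [12, 15, 9, 20, 14]

mean_X = statistics.mean(X)
sd_X = statistics.stdev(X)  # assumption: sample SD with n - 1 denominator

# z scores: deviation score divided by the standard deviation.
z = [(score - mean_X) / sd_X for score in X]

# Standardized scores on an IQ-style scale: mean 100, 15 points per SD.
iq = [100 + 15 * zi for zi in z]

print(statistics.mean(z), statistics.stdev(z))    # approximately 0 and 1
print(statistics.mean(iq), statistics.stdev(iq))  # approximately 100 and 15
```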
What do score transformations change?
Changing raw scores into deviation, z or standardised scores:
- Will ONLY change scaling of the variable
If both measures are continuous/categorical, what do we use?
2 Continuous
Correlation
2 Categorical
Contingency Tables
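For the categorical case, a contingency table is just a count of how often each pair of categories occurs together. A rough sketch with made-up data (the group and answer labels are hypothetical):

```python
from collections import Counter

# Hypothetical categorical data: one (group, answer) pair per person.
observations = [
    ("student", "yes"), ("student", "no"), ("student", "yes"),
    ("staff", "no"), ("staff", "no"), ("staff", "yes"),
]

counts = Counter(observations)
rows = sorted({group for group, _ in observations})
cols = sorted({answer for _, answer in observations})

# table[row][col] = number of people with that combination of categories.
table = {r: {c: counts[(r, c)] for c in cols} for r in rows}

for r in rows:
    print(r, table[r])
```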
How do we quantify a relationship between 2 variables?
By calculating a relevant summary characteristic over all observed scores
What are summary characteristics? What kinds are there?
Aggregation undertaken on individual values to produce a single quantity informative about values (e.g. mean, standard deviation, correlation)
- Sample Statistics
- Aggregated summary characteristic of individual scores calculated in a single sample drawn from a population
- Can be many values for a sample statistic, one for each sample drawn from the same population
- Population Parameters
- Aggregated summary characteristic of individual scores derived from all members of a population
- Only one, which is unknown
What is a Sample and Population Distribution?
Sample Distribution
- If we measure individuals in a single sample drawn from the population, then the set of scores form a sample distribution
- Many possible distributions
Population Distribution
- If we measure everyone in the population on a construct, then the set of scores form a population distribution.
- One possible distribution
- Which will be much larger than any single sample distribution but the size of the population is unknown in most cases…
Scatterplot vs Correlation Plot
While associations between multiple variables can be observed in both, it is much easier to discern patterns in correlation plots
What is a Pearson Correlation
Measure of linear, symmetric association between two continuous variables.
What defines association strength?
The absolute value of a correlation indicates the strength of (linear) association. Ignore whether it is positive or negative in value
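A sketch of the Pearson correlation computed from deviation scores, with made-up data. The two correlations below differ only in sign, so they describe associations of equal strength.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [v - mean_x for v in xs]  # deviation scores for X
    dy = [v - mean_y for v in ys]  # deviation scores for Y
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator

# Hypothetical scores (made up for illustration).
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
reversed_score = [-s for s in score]  # same pattern, opposite direction

r_pos = pearson_r(hours, score)
r_neg = pearson_r(hours, reversed_score)

print(round(r_pos, 3), round(r_neg, 3))  # same absolute value, opposite sign
```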
What is the population correlation coefficient and sample correlation coefficient?
Population Correlation Coefficient
- ρ (rho)
- Correlation calculated on everybody in a population
- Value almost certainly unknown
Sample Correlation Coefficient
- r
- Correlation calculated on sample
- Differs from one sample to the next
- Use this known sample value to estimate the unknown population value
What are the effects of having a larger sample size
- Reduces variability of the sampling distribution
- More Narrow
- Smaller Standard Error
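A simulation sketch of the sample-size effect. It uses the sample mean as the statistic purely for simplicity, and the population, sample sizes, and number of repeats are all made up.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed for reproducibility

# Hypothetical population of 10,000 scores (roughly mean 100, SD 15).
population = [rng.gauss(100, 15) for _ in range(10_000)]

def sampling_sd(n, repeats=2000):
    """SD of the sample mean across repeated samples of size n."""
    means = []
    for _ in range(repeats):
        sample = rng.sample(population, n)
        means.append(sum(sample) / n)
    return statistics.stdev(means)

se_small = sampling_sd(10)   # small samples: wide sampling distribution
se_large = sampling_sd(100)  # large samples: narrow sampling distribution

print(se_small, se_large)
```

The second value should come out clearly smaller than the first, which is the narrowing described above.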
What is a sampling distribution. Why is it relevant?
A distribution of values of a sample statistic obtained from a large number of repeated samples taken from a population
- Anytime we have a distribution of values, we can calculate summary characteristics of those values (e.g. mean)
- Any kind of sample statistic (e.g. correlation coefficient) will have a corresponding sampling distribution
- Under certain conditions, it can be shown that the mean of a sampling distribution will get closer to the unknown population parameter value as the number of repeated samplings increases
- Standard error (SD of sampling distribution) can be estimated from ONE sample statistic
- Note: Range will be more variable…
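A simulation sketch of a sampling distribution, again using the sample mean as the statistic for simplicity. The population is simulated, so its parameter is known here; in real research it would not be. The estimate sd / sqrt(n) for the standard error of a mean is the standard formula, not something stated on the card.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed for reproducibility

# Hypothetical population; its mean is known only because we simulated it.
population = [rng.gauss(50, 10) for _ in range(10_000)]
mu = statistics.fmean(population)

n = 25
# Sampling distribution: the statistic recomputed over many repeated samples.
means = [statistics.fmean(rng.sample(population, n)) for _ in range(3000)]

mean_of_means = statistics.fmean(means)  # should sit close to mu
sd_of_means = statistics.stdev(means)    # empirical standard error

# Standard error estimated from ONE sample only.
one_sample = rng.sample(population, n)
se_estimate = statistics.stdev(one_sample) / math.sqrt(n)

print(mu, mean_of_means)
print(sd_of_means, se_estimate)
```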
What is a confidence interval and why is it used? What is it based on?
Research: One Sample.
We cannot construct a sampling distribution, but a confidence interval gives us a good idea of the likely value of the unknown population parameter
Confidence Intervals
Range of plausible values of an unknown population parameter based on the
- (1) value of SINGLE sample statistic
- (2) its standard error.
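A rough sketch of building an interval from those two ingredients, using a sample mean as the statistic. The data are made up, and the multiplier 1.96 assumes a 95% interval under a normal approximation; neither is specified on the card. A confidence interval for a correlation is usually built differently.

```python
import math
import statistics

# Hypothetical single sample (made up for illustration).
sample = [98, 105, 110, 92, 101, 99, 107, 95, 103, 100]

n = len(sample)
estimate = statistics.fmean(sample)                   # (1) the sample statistic
std_error = statistics.stdev(sample) / math.sqrt(n)   # (2) its standard error

z_crit = 1.96  # assumption: 95% interval, normal approximation

lower = estimate - z_crit * std_error
upper = estimate + z_crit * std_error

print(round(lower, 2), round(upper, 2))
```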
Will the population distribution be larger/smaller than a sample distribution
Much Larger
What does it mean by “symmetry” in a Pearson Correlation
Correlation of X and Y = Correlation of Y and X
What is the correlation value always between
-1 to +1
Will any kind of sample statistic have a corresponding sampling distribution. What happens when sampling increases (under conditions)
In theory, yes. Anytime we have a distribution of values, we can calculate summary characteristics.
Under certain conditions, as the number of repeated samples increases: 1.) The mean of the sampling distribution will get closer to the unknown population parameter value 2.) The SD of the sampling distribution can be calculated (known as the Standard Error)
How many types of scores are there
- Raw Score
- Deviation Score
- Z-Score
- Standardised Score
- Deviation, Z and Standardized Scores are transformations of Raw Score
- Changes scaling only