4.2.3.1 Data Handling And Analysis Flashcards
What are correlational techniques?
Non-experimental methods used to measure how strong the relationship is between two or more variables.
What is correlation?
A mathematical term that illustrates the strength and direction of the association between two co-variables.
Does correlation imply causation?
It does not imply causation, but will usually lead to further studies to determine if there is a cause and effect relationship.
What are the types of correlation and what do they mean?
Positive correlation- as one variable increases, the other also increases
Negative correlation- as one variable increases, the other variable decreases
Zero correlation- when a correlational study finds no association between variables
What is a correlation coefficient?
A number used to measure the strength and nature of the relationship between two co-variables.
The correlation coefficient number represents the strength of the relationship and can range from +1.0 to -1.0
The closer the number is to +1 or -1, the stronger the correlation.
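As a rough illustration of how a coefficient behaves, the sketch below computes Pearson's r, which is one common correlation coefficient (the cards above do not name a specific one, so that choice is an assumption). The function name and the data sets are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of paired scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of each pair's deviations from its own mean.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)

# Hypothetical co-variables, invented for illustration only.
hours_revised = [1, 2, 3, 4, 5]
test_score = [2, 4, 6, 8, 10]    # rises in step with hours revised
errors_made = [10, 8, 6, 4, 2]   # falls in step with hours revised

print(pearson_r(hours_revised, test_score))   # approximately +1 (perfect positive)
print(pearson_r(hours_revised, errors_made))  # approximately -1 (perfect negative)
```

The sign gives the direction of the relationship and the distance from zero gives its strength.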
What are the values of perfect positive and perfect negative correlations?
Perfect positive- 1
Perfect negative- -1
At what value can we determine a correlation between variables?
±0.8
What is a scattergram?
A graph that shows the correlation between two sets of data (co-variables) by plotting points to represent each pair of scores.
It indicates the degree and direction of the correlation between co-variables, one of which is indicated on the x and one of which is indicated on the y axis.
What are the strengths of correlational studies?
An ideal place to begin preliminary research investigations, as it provides valuable insight for future research by assessing the strength and direction of a relationship between co-variables. This will give researchers more confidence before committing to an experimental study.
Can be used when a laboratory experiment would be unethical, as the variables are not manipulated, simply correlated. Secondary data can be used which removes the concern over informed consent as the information is already in the public domain.
Quick and economical to carry out as there is no need to control the environment or manipulate variables.
Weaknesses of correlational studies?
As a result of the lack of experimental manipulation and control, it is impossible to establish a cause and effect relationship, and conclude that the change in one variable caused the change in the other. There could be other factors that influenced the relationship- known as the third variable problem. Therefore correlations may be misused or misinterpreted.
Correlations only identify linear relationships and not curvilinear, so cannot be used to identify relationships in all cases.
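To see why a curvilinear relationship can be missed, the sketch below feeds an invented inverted-U data set into Pearson's r (an assumed choice of coefficient). The variable names and scores are hypothetical: performance is fully determined by arousal, yet the coefficient comes out at about zero.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of paired scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)

# Hypothetical inverted-U data: performance peaks at a moderate arousal level.
arousal = [1, 2, 3, 4, 5, 6, 7, 8, 9]
performance = [16 - (a - 5) ** 2 for a in arousal]

r = pearson_r(arousal, performance)
print(abs(r) < 0.001)  # True: the coefficient is effectively zero
```

The rising half and the falling half of the curve cancel out, so a scattergram would reveal the pattern that the coefficient hides.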
What is a correlational hypothesis?
It includes no IV or DV, but instead states the relationship between two operationalised co-variables. They can be directional or non-directional.
What is quantitative data?
Numerical data that can be statistically analysed and converted easily into a graphical format. It can be counted and is usually given in numbers.
Strength of quantitative data
It is easy to analyse statistically, as it produces large amounts of numerical data on which descriptive statistics or inferential tests of significance can readily be carried out. This allows comparisons and trends to be identified between groups, and the mathematical procedures used make results objective, and so more scientific and less open to bias.
Weakness of quantitative data?
The data is not in-depth and is narrow in scope when explaining complex human behaviour. This means numerical findings can often lack meaning and context. Therefore it may not be a true representation of real life and thus lacks validity.
What is qualitative data?
Non-numerical, language based data expressed in words, which allows researchers to develop an insight into the unique nature of human experiences, opinions and feelings.
Strength of qualitative data
The data offers the researcher rich detail of information as participants can develop their responses freely. This provides the investigator with meaningful insights into the human condition. Therefore, the external validity of the findings is enhanced as they are more likely to represent an accurate, real-world view.
Weakness of qualitative data
It is difficult to analyse and difficult to summarise statistically, making it hard to identify patterns and make comparisons within and between data. Therefore, conclusions are often based on the opinions and judgements of the researcher, which increases subjectivity and increases the chance of researcher bias due to preconceptions.
What is primary data?
Refers to data that has been collected for a specific reason and reported by the original researcher. It is data that the participant reports directly to the researcher, or is witnessed first-hand. It is sometimes referred to as field research.
Strengths of primary data
It is authentic as it is collected with the sole purpose of being a specific part of the investigation, so it is designed to suit the aims of the research. This makes it more useful to draw conclusions from.
The research can also ensure a high level of control is kept, to ensure internal validity in the results.
Weakness of primary data
Designing and carrying out a study can take a long period of time and considerable effort. This makes it costly, due to the researchers' time and the expense of the equipment. Therefore, in comparison, secondary data which already exists can save the researcher time, effort and money.
What is secondary data?
Information that was collected by other researchers for a purpose other than the investigation in which it is currently being used. It is data that already exists, and its significance is already known.
Strength of secondary data
The information already exists in the public domain, so it is much less time consuming and expensive to collect. This means researchers can find the information they desire with very little effort.
Weakness of secondary data
There are some concerns over accuracy, as the information was not gathered to meet the specific aims of this experiment. This means that there may be significant variability in the quality of the data, and it may be of little to no value to the researchers, due to being outdated or incomplete. This challenges the validity of any conclusions made.
What is a meta-analysis?
A process whereby investigators combine findings from multiple studies (secondary data) on a specific subject, to make an overall analysis of trends and patterns arising across research. This can include a qualitative review of previous research, or a statistical, quantitative analysis to test for significance or effect size.
Strength of meta-analysis
Since the results are combined from many studies rather than just one, the conclusions drawn will be based on a larger sample size. This provides greater confidence for generalisation, increasing the external validity of the patterns and trends identified.
Weakness of meta-analysis
The researcher has control over the research they select for analysis, which means they may choose to omit certain findings from their investigation. This could lead to investigator bias, as the results do not accurately represent all of the relevant data on the topic, leading to low internal validity.
What are descriptive statistics?
A numerical summary of quantitative data, which allows researchers to view the data as a whole.
Why are descriptive statistics useful?
Helps the reader to get an understanding of the data and saves them from needing to navigate through sets of results.
What do descriptive statistics typically include?
A measure of central tendency, and a measure of dispersion, selected based on the type of data collected.
What are measures of central tendency?
Tell us about the central, most typical value in a data set and are calculated in different ways.
What are the three measures of central tendency?
Mean
Median
Mode
What is the mean?
The arithmetic average calculated by adding up all of the values in a set of data, and dividing by the number of values.
How is the mean ‘sensitive’?
It is the most sensitive measure of central tendency, and takes into consideration all of the values in the dataset.
This may lead to misrepresentation of the data set if there are extreme outliers present.
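A minimal sketch of the calculation and of its sensitivity to outliers, using invented scores:

```python
def mean(scores):
    # Add up all of the values, then divide by the number of values.
    return sum(scores) / len(scores)

typical = [4, 5, 5, 6, 7]        # hypothetical scores
with_outlier = typical + [60]    # the same scores plus one extreme value

print(mean(typical))       # 5.4
print(mean(with_outlier))  # 14.5
```

A single extreme score drags the mean well above every other value in the set.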
What is the median?
The central value in a set of data when values are arranged from lowest to highest.
Why is the median useful?
It is useful when there are extreme values in a data set, as it is not affected by data that is heavily skewed.
It is easy to calculate as it takes the middle value within the data set.
It is less sensitive than the mean as the actual values of lower and higher numbers don’t affect it.
What is the problem with the median?
It does not take extreme values into consideration, which may be of importance to the results, so the result may not be representative of the entire data set.
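A minimal sketch of finding the median, using invented scores. Averaging the two central values when there is an even number of scores is the usual convention, although the cards above do not state it.

```python
def median(scores):
    ordered = sorted(scores)       # arrange from lowest to highest
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]     # odd count: the single central value
    # Even count: average the two central values.
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([7, 4, 6, 5, 5]))        # 5
print(median([4, 5, 5, 6, 7, 60]))    # 5.5
print(median([4, 5, 5, 6, 7, 600]))   # 5.5
```

Changing the extreme score from 60 to 600 leaves the median untouched, which shows both its resistance to outliers and the fact that it ignores how extreme they are.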
What is the mode?
The most frequently occurring value in a set of data.
Why is the mode useful?
It is easy to calculate
It can be used for categorical data, whereas the mean and the median cannot.
What is the problem with the mode?
It can be misleading if an outlier in the data appeared multiple times, as the mode wouldn’t be representative of the entire data set.
What do bi-modal and multi-modal mean?
When the dataset has more than one mode, it is multi-modal. If it specifically has two modes, it is bi-modal.
When would there be no mode in a data set?
If all of the values obtained are different.
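The mode cards above can be sketched as one small function. The name `modes` and the data are hypothetical, and the function assumes a non-empty data set.

```python
from collections import Counter

def modes(values):
    counts = Counter(values)              # how often each value occurs
    highest = max(counts.values())
    if highest == 1:
        return []                         # all values different: no mode
    # Keep every value that shares the highest frequency.
    return [value for value, count in counts.items() if count == highest]

print(modes([2, 3, 3, 5]))            # [3]
print(modes([2, 2, 3, 5, 5]))         # [2, 5]  (bi-modal)
print(modes(["cat", "dog", "dog"]))   # ['dog'] (categorical data)
print(modes([1, 2, 3]))               # []      (no mode)
```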
What are measures of dispersion?
The general term for any measure of the spread or variation in a set of scores or results.
What are the two measures of dispersion?
Range
Standard deviation
What is the range?
A calculation of the dispersion in a set of scores which is worked out by subtracting the lowest score from the highest score, and adding one as a mathematical correction (as some of the scores in the data will have been rounded).
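The calculation described above, sketched with invented scores (the function name is hypothetical):

```python
def range_with_correction(scores):
    # Highest score minus lowest score, plus one as the correction.
    return max(scores) - min(scores) + 1

print(range_with_correction([3, 7, 8, 12, 15]))  # 13
```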
Why is the range useful?
It is easy to calculate
Why is the range not useful?
It only takes into account the two most extreme values, which may be unrepresentative of the data set as a whole.
It does not indicate whether most numbers are closely grouped around the mean or spread out- whereas the standard deviation does show this aspect of dispersion.
It does not show a difference between data with a strong negative skew and a strong positive skew, so provides limited insight into the data set.
What is standard deviation?
A measure of dispersion in a set of scores, which tells us by how much, on average, each score deviates from the mean.
Large SD- lots of variation around the mean
Small SD- data closely clustered around the mean
Zero SD- all of the data was the same
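The three cases can be illustrated with invented data sets. This sketch uses `statistics.pstdev`, which divides by the number of scores; the cards above do not say whether the n or the n - 1 version is intended, so that choice is an assumption (`statistics.stdev` gives the n - 1 version).

```python
from statistics import pstdev

# Hypothetical data sets, all with a mean of 5.
spread_out = [2, 4, 4, 4, 5, 5, 7, 9]
clustered = [4, 5, 5, 5, 5, 5, 5, 6]
identical = [5, 5, 5, 5, 5, 5, 5, 5]

print(pstdev(spread_out))  # 2.0  larger SD: more variation around the mean
print(pstdev(clustered))   # 0.5  smaller SD: scores close to the mean
print(pstdev(identical))   # 0.0  zero SD: every score is the same
```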
Why is the standard deviation useful?
It takes into account all of the values within the data set, and is a very precise measurement.
Why is the standard deviation not useful?
It is more complicated to calculate, in comparison to the range.
It is easily distorted by extreme values, which means that it misrepresents the data in some cases.
How is data summarised visually?
Using graphical techniques
What are tables?
The most straightforward way of presenting data, in rows and columns, that allows the reader to easily compare the most important values, without needing to interpret the data.
What is a scattergram?
A type of graph that represents the strengths and directions of the relationships between co-variables in correlational analysis.
What is a bar chart?
A type of graph in which the frequency of each variable is represented by the height of the bars, used to represent discrete/categoric data. They show the number of times that each category of the data occurred.
What is a histogram?
A type of graph that shows frequency, where the area of the bars represents frequency.
The data it represents is continuous, and the X-axis must start at zero, with bars touching each other.
What is normal distribution?
When there is a symmetrical spread of frequency data that forms a bell-shaped pattern. The mean, median and mode are all located at the highest peak, indicating that most scores are close to the mean, with progressively fewer scores being located at the extremes of either tail of the distribution.
How can we determine normal distribution from standard deviation?
For a data set to be considered normally distributed, approximately 68.26% of scores will lie within one standard deviation of the mean, and approximately 95.44% of scores will lie within two standard deviations.
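These percentages can be checked against the normal curve using Python's standard library. The figures on the card match values read from rounded statistical tables; computed directly, they come out a fraction higher.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of scores between -1 SD and +1 SD, then between -2 SD and +2 SD.
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_two_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)

print(f"{within_one_sd:.2%}")  # 68.27%
print(f"{within_two_sd:.2%}")  # 95.45%
```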
What is skewed distribution?
A spread of frequency data that is not symmetrical and the data clusters to one end.
What is a positive skew?
Occurs when most of the data is concentrated to the left of the graph, resulting in a long tail to the right.
What is a negative skew?
Occurs when most of the data is concentrated to the right of the graph, resulting in a long tail to the left.
What difficulty of test would form a positive/negative skew?
A positive skew is formed if a test is too hard, and a negative skew is formed if a test is too easy.
What are the positions of the measures of central tendency on a positively skewed graph?
The mode remains at the highest point of the peak, followed by the median.
The mean is dragged to the right as it is affected by extreme scores.
Mode - Median - Mean (reverse alphabetical)
What are the positions of the measures of central tendency on a negatively skewed graph?
The mode remains at the highest point of the peak, then the median.
The mean is dragged to the left as it is affected by extreme scores.
Mean - Median - Mode (alphabetical)
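Both orderings can be checked on small invented data sets. The scores below are hypothetical; the negatively skewed set is simply the mirror image of the positively skewed one.

```python
from statistics import mean, median, mode

# Most scores are low, with a long tail of high scores (positive skew).
positive_skew = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]
# Mirror image: most scores are high, with a long tail of low scores.
negative_skew = [20 - score for score in positive_skew]

for label, scores in [("positive", positive_skew), ("negative", negative_skew)]:
    print(label, mode(scores), median(scores), mean(scores))
# positive 2 3.0 4.5
# negative 18 17.0 15.5
```

Reading from left to right on the x-axis, the positive skew gives mode, median, mean, and the negative skew gives mean, median, mode.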