Research Methods - Techniques Of Data Handling And Analysis Flashcards
Quantitative data
Quantitative data involves numbers and can be measured objectively. It is immediately quantifiable.
Quantitative data includes
The dependent variable in an experiment.
Closed questions in questionnaires.
Structured interviews.
A tally of how many times a behavioural category is seen in an observation.
Qualitative data
Qualitative data involves words and the data is based on the subjective interpretation of language. It is only quantifiable if the data is put into categories and the frequency is counted.
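As a minimal sketch of how categorised qualitative data becomes quantifiable, the example below counts how often each category occurs. The category names and responses are invented for illustration and assume the researcher has already assigned each answer to one category.

```python
from collections import Counter

# Hypothetical coded responses: each open-ended answer has already been
# placed into one category by the researcher (invented data).
coded_responses = ["stress", "sleep", "stress", "diet", "stress", "sleep"]

# Counting how often each category occurs turns the words into frequencies.
category_counts = Counter(coded_responses)

# Print categories from most to least frequent.
for category, frequency in category_counts.most_common():
    print(f"{category}: {frequency}")
```

The resulting frequencies could then be displayed in a table or bar chart like any other quantitative data.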
Qualitative data includes
Open questions in questionnaires.
A transcript from an unstructured interview.
Researchers describing what they see in an observation.
Problems with qualitative data
Qualitative data is challenging to analyse because it relies on interpretation by the researcher, which could be inaccurate, subjective or even biased. Furthermore, qualitative data may not be easy to categorise or collate into a sensible number of answer types. The researcher could be left with lots of individual responses that cannot be summarised.
Primary data
Primary data is collected directly by the researcher for the purpose of the investigation.
Secondary data
Secondary data is information that was collected for a purpose other than the current use. The researcher could use data collected by them but for a different study, or collected by a different researcher. A researcher might make use of government statistics, such as mental health statistics collected by the NHS.
However, there is substantial variation in the quality and accuracy of secondary data and it can be hard for researchers to know how reliable secondary data is.
Meta-analysis
A meta-analysis is the process of combining results from a number of studies on a particular topic (secondary data) to provide an overall view. Because it draws on many studies, meta-analysis allows us to view conclusions with more confidence, and results can be generalised across much larger populations. However, meta-analysis may be prone to publication bias: studies with negative or non-significant results are less likely to be published, and the researcher may also choose to leave such studies out, so the combined result can be misleading.
Tables
When tables appear in the results section of a research report, the values in them are not raw scores but descriptive statistics (measures of central tendency or measures of dispersion) calculated from the raw scores. There should be a paragraph beneath the table explaining the data.
Scattergraph
A scattergraph is a graphical display that shows the correlation or relationship between two sets of data (or co-variables) by plotting dots to represent each pair of scores. A scattergraph indicates the strength and direction of the correlation between the co-variables.
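The strength and direction shown by a scattergraph can also be summarised as a single number. The source does not name a coefficient, so the sketch below assumes Pearson's r, which runs from -1 (perfect negative) to +1 (perfect positive). The co-variable names and scores are invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for paired scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How far each pair moves together away from the two means.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How far each co-variable varies on its own.
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_movement / (spread_x * spread_y)

# Hypothetical co-variables: one pair of scores per participant.
hours_revised = [1, 2, 3, 4, 5]
test_score = [52, 55, 61, 64, 70]

r = pearson_r(hours_revised, test_score)
print(f"r = {r:.2f}")  # the sign gives the direction, the size gives the strength
```

A value close to +1 here corresponds to dots that rise steeply together from left to right on the scattergraph.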
Bar chart
A bar chart is used to show frequency data for discrete (separate) variables. The height of each bar represents the frequency of each item. In a bar chart a space is left between each bar to indicate the lack of continuity. The frequency of each category is plotted on the vertical y-axis.
Distributions
For many variables, plotting the frequency of each measurement produces a bell-shaped curve. This is called a normal distribution, and it is symmetrical. Within a normal distribution most people are located in the middle area of the curve and very few people are at the extreme ends. The mean, median and mode all occupy the same midpoint of the curve.
Skews
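A positive skew is where most of the data is concentrated to the left of the graph, with a long tail stretching to the right. In this case the mode remains at the highest point of the peak, the median comes next, and the mean is dragged furthest to the right by the extreme high scores. The opposite occurs in a negative skew: most of the data is concentrated to the right, the tail stretches to the left, and the mean is dragged to the left of the median and mode.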
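A small worked example can make the ordering of the three averages concrete. The data sets below are invented for illustration; the sketch uses Python's standard `statistics` module.

```python
from statistics import mean, median, mode

# Invented data sets chosen to show each shape of distribution.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
positive_skew = [1, 2, 2, 2, 3, 3, 4, 6, 13]    # long tail to the right
negative_skew = [1, 8, 10, 11, 11, 12, 12, 12, 13]  # long tail to the left

for name, scores in [("symmetric", symmetric),
                     ("positive skew", positive_skew),
                     ("negative skew", negative_skew)]:
    print(f"{name}: mode={mode(scores)}, "
          f"median={median(scores)}, mean={mean(scores)}")

# symmetric:     mode, median and mean are all 3
# positive skew: mode 2 < median 3 < mean 4
# negative skew: mean 10 < median 11 < mode 12
```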
Measures of central tendency + mean
Measures of central tendency inform us about central values for a set of data. They are ‘averages’: ways of calculating a typical value for a set of data. The average can be calculated in different ways, each one appropriate for a different situation.
The mean is calculated by adding all the scores and dividing by the number of scores. The advantage of this method is that it is representative of all the data collected, as it is calculated using every individual value. This makes the mean the most sensitive measure of central tendency. However, the disadvantage is that it can be distorted by a single extreme value in the set, and the mean score may not be one of the actual scores in the set.
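The calculation, and the way one extreme value distorts it, can be sketched as follows. The scores are invented for illustration.

```python
def mean(scores):
    """Add all the scores and divide by the number of scores."""
    return sum(scores) / len(scores)

# Invented scores for five participants.
scores = [4, 5, 5, 6, 5]
mean_score = mean(scores)  # 25 / 5 = 5.0

# Adding one extreme score drags the mean well above most of the data.
scores_with_outlier = scores + [35]
mean_with_outlier = mean(scores_with_outlier)  # 60 / 6 = 10.0

print(mean_score, mean_with_outlier)
```

Note that the second mean, 10.0, is not one of the actual scores in the set and is higher than five of the six scores.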
The median
The median is calculated by arranging the scores in order and then choosing the middle score (or the mean of the two middle scores when there is an even number of scores). The advantage is that it is unaffected by extreme scores, unlike the mean. The disadvantage is that it does not use the actual values of all the scores, so it is less sensitive than the mean and does not represent all the findings.
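A minimal sketch of the calculation is shown below. For an even number of scores it assumes the usual convention of averaging the two middle values. The scores are invented for illustration.

```python
def median(scores):
    """Arrange the scores in order and return the numerical midpoint."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of scores: a single middle value exists.
        return ordered[middle]
    # Even number of scores: average the two middle values (assumed convention).
    return (ordered[middle - 1] + ordered[middle]) / 2

scores = [7, 3, 9, 5, 4]
print(median(scores))          # ordered: 3 4 5 7 9, so the median is 5

# An extreme score barely moves the median.
print(median(scores + [40]))   # ordered: 3 4 5 7 9 40, so (5 + 7) / 2 = 6.0
```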
The mode
The mode is the most frequent value in a set. The advantage is that it is unaffected by extreme scores. The disadvantage is that it tells us nothing about other scores in the data set.
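The mode can be found by counting how often each value occurs. In the sketch below, returning every tied value when two or more values share the highest frequency is an assumption, since the source does not say how ties are handled. The data is invented for illustration.

```python
from collections import Counter

def modes(scores):
    """Return the most frequent value(s) in a set, as a sorted list."""
    counts = Counter(scores)
    highest_frequency = max(counts.values())
    return sorted(value for value, count in counts.items()
                  if count == highest_frequency)

print(modes([2, 3, 3, 4, 9, 3]))       # [3]
print(modes([1, 1, 2, 2, 5]))          # [1, 2], two values tie
print(modes(["cat", "dog", "cat"]))    # ['cat'], works for categories too
```

Unlike the mean and median, the mode can be used with data that falls into categories rather than numbers.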
Measures of dispersion
A set of data can also be described in terms of how dispersed or spread out the data items are.
The range is calculated by taking the lowest score from the highest. An advantage of this is that it is quick and easy to calculate. A disadvantage is that it can be easily distorted by extreme values.
The standard deviation is a measure of the average amount by which each score differs from the mean. The larger the standard deviation, the more spread out the scores are around the mean. An advantage is that it takes account of all the scores. A disadvantage is that it is more difficult to calculate than the range.
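The two measures of dispersion can be compared with a short sketch. The source gives no formula for the standard deviation, so the version below is an assumption: it divides by the number of scores (the population form), whereas some courses divide by one fewer than the number of scores. The data sets are invented for illustration.

```python
from math import sqrt

def score_range(scores):
    """Take the lowest score from the highest."""
    return max(scores) - min(scores)

def standard_deviation(scores):
    """Population standard deviation (assumed form: divides by N)."""
    n = len(scores)
    mean = sum(scores) / n
    # Square each difference from the mean so negatives do not cancel out.
    squared_differences = [(score - mean) ** 2 for score in scores]
    # Average the squared differences, then take the square root.
    return sqrt(sum(squared_differences) / n)

# Two invented data sets with the same mean (5) and the same range (8).
clustered = [1, 5, 5, 5, 9]
spread_out = [1, 1, 5, 9, 9]

for name, scores in [("clustered", clustered), ("spread out", spread_out)]:
    print(f"{name}: range={score_range(scores)}, "
          f"sd={standard_deviation(scores):.2f}")
```

Both sets have the same range, but the standard deviation is larger for the second set because it takes account of every score rather than only the two extremes.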