Data Analysis Flashcards

Question

When/why is using the median not useful?

Answer 1

- some information is lost as the raw scores are not used in the calculation

Answer 2

when values are added up and then divided by the total number of values

Answer 3

- most appropriate with interval/ratio data, symmetrical distributions with no extreme values - includes information from all the items of data so is the most sensitive measure of central tendency (least information is lost)

Answer 4

- if the data is skewed (outliers) - mean may not be one of the original values (e.g. family does not have 3.2 children) so may be misleading - if the distribution is bimodal, again may be misleading
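
The first point above (an extreme value distorting the mean) can be illustrated with a minimal sketch using Python's standard `statistics` module. The scores are made up for illustration and are not from the flashcards.

```python
from statistics import mean, median

# Hypothetical scores; 95 is an extreme value (outlier).
scores = [4, 5, 5, 6, 7, 95]

mean_score = mean(scores)      # pulled upward by the single outlier
median_score = median(scores)  # middle of the ordered values, barely affected

print("mean:", mean_score)
print("median:", median_score)
```

Here the mean ends up larger than every score except the outlier, and it is not one of the original values, while the median stays close to the bulk of the data.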

Answer 5

how spread out the data is around the mid-point, e.g. range, interquartile range, standard deviation

Answer 6

calculated by subtracting the lowest from the highest value in the data set (often researchers add 1)
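
As a rough sketch of the calculation above: the function name `data_range`, the `add_one` flag and the scores are hypothetical, chosen only for illustration.

```python
def data_range(values, add_one=False):
    """Highest value minus lowest value; optionally add 1, as some researchers do."""
    spread = max(values) - min(values)
    return spread + 1 if add_one else spread

scores = [3, 7, 8, 12, 15]  # made-up data set

print(data_range(scores))                # 15 - 3 = 12
print(data_range(scores, add_one=True))  # 12 + 1 = 13
```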

Answer 7

- easy/quick to calculate

Answer 8

- includes end values, may be distorted by outliers - only having information from end scores contains no information about whether the values are spread evenly or clustered

Answer 9

measures how spread out a set of values is around the mean value - the larger the standard deviation, the greater the spread of scores within a set of data

Answer 10

1. calculate the mean 2. subtract the mean from each value in the data set to find the difference between each value and the mean 3. square each of these differences (this gets rid of the minus signs) 4. find the sum of all of these squared differences 5. divide by the number of values, N, for a population (or N - 1 for a sample) to get the variance 6. find the square root of the variance
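
The six steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: the card's "divide by population/sample" is read here as dividing by N for a population and by N - 1 for a sample, and the function name and scores are made up.

```python
import math
from statistics import pstdev, stdev

def standard_deviation(values, sample=False):
    n = len(values)
    mean = sum(values) / n                       # step 1: calculate the mean
    differences = [x - mean for x in values]     # step 2: value minus mean
    squared = [d ** 2 for d in differences]      # step 3: square (removes minus signs)
    total = sum(squared)                         # step 4: sum of squared differences
    # step 5 (assumption): divide by N for a population, N - 1 for a sample
    variance = total / (n - 1 if sample else n)
    return math.sqrt(variance)                   # step 6: square root of the variance

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean = 5

print(standard_deviation(scores))               # population version
print(standard_deviation(scores, sample=True))  # sample version
```

The results can be checked against the standard library's `statistics.pstdev` (population) and `statistics.stdev` (sample).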

Answer 11

- easy/quick to calculate

Answer 12

- includes end values, may be distorted by outliers - only having information from end scores contains no information about whether the values are spread evenly or clustered

Answer 13

includes descriptive statistics, common to include a paragraph or two after explaining what results show

Answer 14

all possible contingencies included, often for nominal data and shows the frequency of occurrences in each category (e.g. as well as showing those speeding, show also not speeding - so that wrong conclusions are not drawn)
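
A minimal sketch of building such a table of frequencies with `collections.Counter`. The speeding / not speeding categories follow the example above; the male/female grouping and the individual observations are invented for illustration.

```python
from collections import Counter

# Hypothetical nominal observations: (group, behaviour)
observations = [
    ("male", "speeding"), ("male", "not speeding"),
    ("female", "speeding"), ("female", "not speeding"),
    ("male", "speeding"), ("female", "not speeding"),
]

counts = Counter(observations)
groups = ["male", "female"]
behaviours = ["speeding", "not speeding"]  # both contingencies are listed

# One row per group, one column per behaviour; a missing pair counts as 0.
table = {g: {b: counts[(g, b)] for b in behaviours} for g in groups}

for g in groups:
    print(g, table[g])
```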

Answer 15

show continuous data, i.e. how one variable changes with respect to another (e.g. time)

Answer 16

used to show the relative proportions of different categories, showing the frequency of each category as a percentage
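
Converting category frequencies into the percentages described above might look like this; the categories and counts are made up for illustration.

```python
# Hypothetical frequencies for three categories.
frequencies = {"car": 12, "bus": 6, "walk": 2}

total = sum(frequencies.values())

# Each category's share of the whole, as a percentage.
percentages = {name: 100 * count / total for name, count in frequencies.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct:.0f}%")
```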

Answer 17

used to represent data from correlational research, each pair of values plotted, one against the other, to determine if a consistent trend is apparent

Answer 18

shows data in the form of categories which the researcher wishes to compare (e.g. males with females); categories go along the x-axis, the y-axis shows frequency, so the height of each bar represents frequency; used for discrete variables

Answer 19

used for continuous variables, rather than discrete, continuous variable plotted on x-axis indicated by no space between bars, y-axis must show frequency with which value on the x-axis occurs
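
As a sketch of how the bar heights are obtained: the continuous variable is divided into adjoining intervals and the frequency in each interval is counted. The function name, interval settings and reaction times (in milliseconds) are hypothetical.

```python
def histogram_counts(values, start, width, n_bins):
    """Count how many values fall into each adjoining interval of equal width."""
    counts = [0] * n_bins
    for v in values:
        index = (v - start) // width  # which interval the value belongs to
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

# Hypothetical reaction times in milliseconds.
reaction_times = [210, 250, 320, 380, 410, 440, 470, 550]

# Intervals: 200-299, 300-399, 400-499, 500-599 (no gaps between them).
counts = histogram_counts(reaction_times, start=200, width=100, n_bins=4)
print(counts)  # [2, 2, 3, 1]
```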

Answer 20

very similar to a histogram, and the variable on the x-axis must be continuous; drawn by drawing a line from the midpoint of the top of each bar in a histogram to the midpoint of the next - advantage: 2+ frequency distributions can be displayed on the same graph, allowing comparisons to be made

Answer 21

the pattern that can be seen on a graph, normal, positively skewed or negatively skewed

Answer 22

an arrangement of data that is symmetrical and forms a bell-shaped pattern where the mean, median and mode all fall in the centre at the highest peak (a symmetrical distribution can also be bimodal, with two peaks, in which case it is not normal)

Answer 23

an arrangement of data that is not symmetrical; data is clustered towards one end of the distribution

Answer 24

Mean, median, mode - possibly when a task is too easy and so participants might be expected to get a high score (ceiling effect); (left foot)

Answer 25

Mode, median, mean - may occur if task is too difficult (floor effect); (right foot)
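
The two orderings on the cards above (read from left to right along the x-axis) can be checked with a small sketch. Both data sets are invented scores out of 10: one for a task that is too easy, one for a task that is too hard.

```python
from statistics import mean, median, mode

# Hypothetical scores on an easy task: most people score high (ceiling effect).
easy_task = [2, 5, 7, 8, 9, 9, 10, 10, 10]

# Hypothetical scores on a hard task: most people score low (floor effect).
hard_task = [0, 0, 0, 1, 1, 2, 3, 5, 8]

# Tail to the left: mean < median < mode.
print(mean(easy_task), median(easy_task), mode(easy_task))

# Tail to the right: mode < median < mean.
print(mode(hard_task), median(hard_task), mean(hard_task))
```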

Answer 26

ways of analysing data using statistical tests that allow the researcher to draw conclusions about whether a hypothesis was supported by the results

Answer 27

P < 0.05 - the probability that the observed result is down to chance is less than 5%

Answer 28

P < 0.025 or P < 0.01 - more stringent levels used if study cannot easily be checked by replication or there is an aspect of risk involved

Answer 29

when we reject the null hypothesis but we shouldn't have, because the result was actually down to chance - more likely when the level of significance is too lenient (e.g. P < 0.10)

Answer 30

when we retain the null hypothesis, but there was actually a real effect taking place and we should have rejected it - more likely when the level of significance is too stringent (e.g. P < 0.01)

Answer 31

1. collect the data in a table 2. convert the data to NOMINAL level - for each participant, take the difference between the second and first rating and record whether it is positive (+) or negative (-) 3. count the number of times the less frequent sign occurs (this is S - the observed/calculated value) 4. to see if the difference between the two conditions is significant, choose the correct statistical table and find the critical value - if the observed value (S) is less than or equal to the critical value for a given level of significance, the null hypothesis can be rejected
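
A minimal sketch of the sign test steps above. Two things here go beyond the card and are assumptions: participants with no difference are dropped from N (the usual convention), and instead of looking up a critical value in a table, the sketch computes the exact two-tailed probability directly from the binomial distribution. The ratings and the function name are made up.

```python
from math import comb

def sign_test(first, second):
    # Step 2: sign of (second rating - first rating) for each participant.
    differences = [s - f for f, s in zip(first, second)]
    plus = sum(d > 0 for d in differences)
    minus = sum(d < 0 for d in differences)
    n = plus + minus          # assumption: zero differences are ignored
    s = min(plus, minus)      # step 3: count of the less frequent sign (S)
    # Step 4 (swapped-in technique): exact two-tailed probability of getting
    # S or fewer of the rarer sign if + and - were equally likely by chance.
    p = min(1.0, 2 * sum(comb(n, k) for k in range(s + 1)) / 2 ** n)
    return s, n, p

# Hypothetical ratings from 10 participants in two conditions.
first_rating = [5, 6, 4, 7, 5, 6, 3, 5, 6, 4]
second_rating = [7, 8, 6, 6, 7, 8, 5, 7, 6, 6]

s, n, p = sign_test(first_rating, second_rating)
print("S =", s, "N =", n, "p =", p)
print("reject null at 0.05:", p < 0.05)
```

With these made-up ratings there are 8 pluses, 1 minus and 1 tie, so S = 1 with N = 9.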

(55 cards)