Statistics Flashcards
definition of statistics?
use of a study to explore the most important/concise information from a huge set of data using a small set of data
definition of population/global data?
a huge set of data to be investigated, or an experimental set of data with a specific condition
definition of a sample(s)?
a small set of data from the population/global data
definition of sampling?
randomly taking a set of samples from population data
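a minimal sketch of random sampling, assuming a made-up population of 1000 numbered records (the population, the sample size of 50 and the seed are illustrative only):

```python
import random

# Stand-in population: 1000 numbered records (made up for illustration).
population = list(range(1, 1001))

# A seeded generator makes the example repeatable; real studies
# would normally not fix the seed.
rng = random.Random(42)

# Draw 50 records without replacement; every record has the same chance.
sample = rng.sample(population, 50)

print(len(sample))
```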
why should sampling always be randomised?
randomised sampling avoids selection bias, so the sample is more likely to be representative of the bigger population (it is not a guarantee, especially with a small sample)
what key indexes are used to represent a set of data?
- Mean
- Standard deviation (SD)
what is the definition of mean?
the mean is an average:
the sum of the sample values / total number of samples
what is the meaning of mean in terms of data distribution?
the mean gives information concerning how the data is CONCENTRATED
what is the definition of standard deviation?
the SD indicates how much the values of a data set vary from the mean value on average
i.e. gives the average distance from samples to a centre value (mean)
what is the meaning of standard deviation in terms of data distribution?
the standard deviation gives information concerning how the data is SEPARATED
what is the equation for standard deviation?
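SD = square root of [sum of (sample value - mean)^2 / (number of samples - 1)]
(for a whole population, divide by the number of samples instead of number of samples - 1)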
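a minimal sketch of the mean and the sample standard deviation, assuming the sample form with (number of samples - 1) in the denominator; the function names and the data are made up for illustration:

```python
import math

def mean(values):
    """Sum of the sample values divided by the number of samples."""
    return sum(values) / len(values)

def sample_sd(values):
    """Sample standard deviation, using n - 1 in the denominator."""
    centre = mean(values)
    squared_distances = [(x - centre) ** 2 for x in values]
    return math.sqrt(sum(squared_distances) / (len(values) - 1))

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example data
print(mean(scores))                 # 5.0
print(round(sample_sd(scores), 3))  # 2.138
```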
what does ‘frequency of data’ mean in stats?
the number of times the same (or a similar) value occurs in the data
what does ‘distribution of data’ mean in stats?
the shape constructed by data frequencies
what relationship exists between the frequency of data and distribution of data?
data frequencies plotted together will establish a pattern - the distribution of data
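a rough sketch of how frequencies build up a distribution, using made-up scores and a text "plot" (one # per occurrence):

```python
from collections import Counter

scores = [3, 4, 4, 5, 5, 5, 6, 6, 7]  # made-up example data

# Frequency: how many times each value occurs.
frequencies = Counter(scores)

# Plotting the frequencies side by side shows the shape (distribution).
for value in sorted(frequencies):
    print(value, "#" * frequencies[value])
```

here the bars rise to a peak at 5 and fall away evenly on both sides.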
In SPSS, what can users do with the Variable View interface?
define a variable: by name, type and labelling.
In SPSS, what can users do with the Data View interface?
edit data: e.g. copy, paste, delete.
when defining a variable in SPSS, what ‘type’ of data is preferred and why?
Numeric data - it can later be converted into another form (e.g. recoded into categories)
if a huge number of samples are collected from variables with natural characteristics, the data may form a famous distribution.
What is that?
What shape does it look like?
At the peak position of this distribution, what key value is expected?
A normal distribution curve.
Bell-shaped.
Peak position - the mean.
In Normal Distribution, how much data would be included in the approximate range of:
mean +/- 1 standard deviation?
mean +/- 2 standard deviations?
mean +/- 3 standard deviations?
mean +/- 1 SD: 68%
mean +/- 2 SD: 95%
mean +/- 3 SD: 99.7%
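these percentages can be checked from the normal distribution itself. a small sketch, assuming an ideal normal distribution (the function name is made up; the fraction within k SD of the mean equals erf(k / square root of 2)):

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within mean +/- k SD."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"mean +/- {k} SD: {fraction_within(k) * 100:.1f}%")
# mean +/- 1 SD: 68.3%
# mean +/- 2 SD: 95.4%
# mean +/- 3 SD: 99.7%
```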
In stats there are 3 main data types. What are they?
- Numeric data
  - e.g. body mass, age, score
- Nominal data
  - categories without rank
  - e.g. gender, colour, weekday
- Ordinal data
  - categories with rank
  - e.g. feeling, satisfaction, visual analogue scale
What types of files can be imported into SPSS directly?
txt
excel file
(data can also be entered manually)
What are the most important indexes to report on when describing a normal data distribution?
- Mean
- Standard Deviation
- Standard error of mean
- 95% confidence interval: mean +/- 2SEM
- Max and Min values
What is the definition of Standard Error of Mean (SEM)?
shows how precisely a sample mean estimates the “true mean” of the population. when multiple sample groups are taken, each gives its own mean; if “mean” is used as a variable, and the distribution of these means is plotted, it is still (approximately) a normal distribution.
SEM is the standard deviation of means.
SEM = SD/square root of number of samples
What is a confidence interval?
How do you use SEM to estimate the confidence interval range of mean?
the confidence interval is the range within which the global (population) mean is expected to fall, estimated from the sample mean and the SEM.
because plotting means as a variable produces a normal distribution curve, a 95% confidence interval is roughly:
mean +/- 2 SEM
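a minimal sketch of the SEM and the rough 95% confidence interval (mean +/- 2 SEM), assuming the sample SD is used; the function names and the data are made up for illustration:

```python
import math
import statistics

def sem(values):
    """Standard error of the mean: SD / square root of number of samples."""
    return statistics.stdev(values) / math.sqrt(len(values))

def approx_ci95(values):
    """Rough 95% confidence interval: mean +/- 2 SEM."""
    centre = statistics.mean(values)
    margin = 2 * sem(values)
    return centre - margin, centre + margin

readings = [1, 2, 3, 4, 5]  # made-up sample data
low, high = approx_ci95(readings)
print(f"SEM = {sem(readings):.3f}")
print(f"95% CI is roughly {low:.2f} to {high:.2f}")
```

the factor 2 is the rounded value for a normal distribution (about 1.96); with very small samples a somewhat larger multiplier is normally used, so treat this interval as approximate.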
How do standard deviation and the number of samples influence SEM?
since SEM = SD/square root of number of samples,
if SD increases, SEM also increases
If number of samples increases, SEM decreases
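a quick numerical check of both effects, using made-up values for SD and number of samples (the function name is hypothetical):

```python
import math

def sem_from_sd(sd, n):
    """SEM = SD / square root of number of samples."""
    return sd / math.sqrt(n)

# Same SD, more samples: SEM decreases.
for n in (4, 16, 64):
    print(n, sem_from_sd(10, n))
# 4 5.0
# 16 2.5
# 64 1.25

# Same number of samples, larger SD: SEM increases.
print(sem_from_sd(20, 16))  # 5.0
```

note that the number of samples has to be multiplied by 4 to halve the SEM.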
What are the most important indexes to report on when describing a data distribution which isn’t normal?
- Median
- Quartiles
- Frequency or percentage
What is the definition of median?
the value of the middle sample after reordering the data from smallest to largest (with an even number of samples, the average of the two middle values)
What is the difference between mean and median?
Median comes from sample values directly.
Mean isn’t necessarily a sample value. It is calculated from all the sample values, so it is the central value but may not be equal to any of them.
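a short sketch comparing the two on made-up data with one extreme value. averaging the two middle values for an even number of samples is a common convention, assumed here:

```python
def mean(values):
    """Sum of the sample values divided by the number of samples."""
    return sum(values) / len(values)

def median(values):
    """Middle value after sorting from smallest to largest."""
    ordered = sorted(values)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]
    # Even number of samples: average the two middle values (assumed convention).
    return (ordered[middle - 1] + ordered[middle]) / 2

waiting_times = [1, 2, 3, 4, 100]  # made-up data with one extreme value
print(median(waiting_times))  # 3
print(mean(waiting_times))    # 22.0
```

the median (3) is one of the sample values, while the mean (22.0) is not, and the single extreme value pulls the mean far more than the median.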
In terms of sample data, what are the meanings of the maximum and minimum?
the highest and lowest values in the data.
In SPSS, what can users do with the Cross-table function?
- arrange 2 variables into a table
- calculate chi-square
In SPSS, there are many tools to plot graphs. What does a Simple-Bar graph show?
proportions and percentage of data
In SPSS, there are many tools to plot graphs. What does a Pie-Plot graph show?
proportions and percentage of data
In SPSS, there are many tools to plot graphs. What does a Boxplot show?
median (50%), quartiles (25-75%), and extreme values (max/min) within a category
In SPSS, there are many tools to plot graphs. In Error-Bar, what do the circles and dashes represent?
Circle - mean
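Dashes - the ends of the error bar, e.g. mean +/- standard deviation (the bar can also be set to show SEM or a confidence interval)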
In SPSS, what characteristics from two variables can be shown using the Scatter/Dot graph?
- tendency of the data
- relationship between variables
Using SPSS, what file types can be exported as output?
many formats - txt, word file, excel, html
What methods ensure a sample is randomised?
- researchers apply no particular criterion or preference when selecting samples.
- samples have no particular characteristic that makes them more likely to be selected by researchers.
samples taken from a data set can either be ‘dependent’ or ‘independent’. what does this mean?
independent: measurements have no effect on each other e.g. different subjects from different areas are tested using the same equipment - data is independent
dependent: measurements have an effect on each other e.g. same subjects tested at different times, pre- and post- op - data is dependent
what is a double blind experiment?
neither the researchers nor the patients know which group (e.g. treatment or control) each sample belongs to
what is the difference between subjective and objective data?
subjective data - the results produced from the feeling or psychological impression of the participants
objective data - the results produced by the measurement instruments or equipment
why is it necessary to test if data is normally distributed?
some statistical methods require the data to be normally distributed, or close to it
What methods can be used to test whether the data is normally distributed or not?
- Skewness Coefficient (SC):
  - measure of asymmetry of distribution
  - SC close to 0: ND
  - SC > 0: long right tail
  - SC < 0: long left tail
- P-P plot:
  - points close to the line: ND, otherwise not ND.
- Kolmogorov-Smirnov test with p-value
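a minimal sketch of a skewness coefficient, assuming the simple moment-based form (third central moment divided by the second central moment to the power 1.5). statistical packages may apply a small-sample correction, so their values can differ slightly. the function name and data are made up:

```python
def skewness(values):
    """Moment-based skewness: 0 for symmetric data, > 0 for a long right
    tail, < 0 for a long left tail."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((x - centre) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - centre) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]            # made-up data
right_tail = [1, 1, 2, 2, 3, 10]       # one large value stretches the right
left_tail = [-10, -3, -2, -2, -1, -1]  # mirror image of right_tail

for data in (symmetric, right_tail, left_tail):
    print(round(skewness(data), 2))
```

the three results are zero, positive and negative respectively, matching the rules for SC.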
If two sets of sample data have different means, are their global means significantly different? Why?
Can’t be sure - random variation in the sampling could be the main reason for the difference. A statistical test is needed to check.