Introduction to Statistical Analysis Flashcards

1
Q

Why do we analyse data?

A

=To discover implicit structure in the data (finding patterns in experimental data which might
in turn suggest new models or experiments)
=To confirm or refute a hypothesis about the data

2
Q

What are the types of data and the scales within them?

A
=Qualitative
-Categorical scale
-Ordinal scale
=Quantitative
-Interval scale
-Ratio scale
3
Q

What is the categorical scale?

A

Each data item is drawn from a fixed number of categories, where the names of the categories may occur in any sequence and cannot be ordered
-Example: nationality (French, Japanese, Mexican, etc.)
-Also called the nominal scale

4
Q

What is the ordinal scale?

A

Data on an ordinal scale has a recognized
ordering between data items, but there is no
meaningful arithmetic on the values
-Finishing position in a race: 1st, 2nd, 3rd etc.

5
Q

What is the interval scale?

A

A numerical scale (usually with real number values) in which we are interested in relative rather than absolute values (e.g. the Celsius temperature scale)
=The differences between the numbers are interpretable, but the variable doesn’t have a “natural” zero value
=Subtraction and averaging are meaningful, but addition and multiplication are not
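A minimal Python sketch of why differences are meaningful on an interval scale while ratios are not, using the Celsius example from this card. The Kelvin conversion and the function name are illustrative additions, not part of the original card.

```python
def celsius_to_kelvin(c):
    """Convert Celsius (interval scale) to Kelvin (which has a true zero)."""
    return c + 273.15

a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences survive the change of zero point.
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios do not: 20 °C is not "twice as hot" as 10 °C.
ratio_c = b_c / a_c
ratio_k = b_k / a_k

print(diff_c, diff_k)
print(ratio_c, ratio_k)
```

The Celsius ratio depends entirely on where the scale happens to put its zero, which is why multiplication carries no meaning there.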

6
Q

What is the ratio scale?

A

A numerical scale (again usually with real number values) in which there is a notion of absolute value (e.g. response time, age in years)
=Zero really means zero
=Subtraction, averaging, addition and multiplication are all meaningful

7
Q

What is the difference between continuous and discrete data?

A

=Continuous variable: it is possible to have another value between any two values (e.g. response time)
=Discrete variable: a variable that is not continuous (e.g. graduation year)

8
Q

What scales are continuous and discrete?

A
Continuous= interval and ratio (quantitative)
Discrete= nominal, ordinal, interval, ratio (all)
9
Q

Describe normal distribution

A

Any normal distribution is described by two parameters:
=The mean μ is the centre around which the data clusters
=The standard deviation σ is a measure of the spread of the curve

10
Q

What are the percentages associated with standard deviations?

A
Approximate share of values within k standard deviations of the mean (normal distribution):
1= 68%
2= 95%
3= 99.7%
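These percentages can be checked numerically. The sketch below assumes a normal distribution and uses the standard identity that the fraction within k standard deviations of the mean is erf(k/√2); the function name is a hypothetical helper.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Print each fraction as a percentage.
    print(k, f"{fraction_within(k) * 100:.2f}%")
```

The card's 68%, 95% and 99.7% are rounded versions of these values.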
11
Q

What is a statistic?

A

A single value computed from data that captures some overall property of the data

12
Q

What measures are we interested in when describing data?

A

=Central tendency- idea of what a typical or common value for a given variable is (mean, median, mode)
=Dispersion- idea of how spread out data values are (range, variance, standard deviation)

13
Q

Describe the mean

A

-The total of the values divided by the number of values
-Appropriate for both interval and ratio scales; it does not depend on an absolute zero in the scale
-Not meaningful for qualitative data
-Affected by outliers
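A short sketch of how a single outlier moves the mean, using Python's standard `statistics` module. The response times are made-up illustrative values.

```python
from statistics import mean

# Hypothetical response times in seconds.
response_times = [1.2, 1.4, 1.3, 1.5, 1.1]
with_outlier = response_times + [30.0]

typical = mean(response_times)   # mean of the five ordinary values
distorted = mean(with_outlier)   # one extreme value drags the mean upward

print(typical, distorted)
```

With the outlier included, the mean no longer resembles any of the ordinary observations.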

14
Q

Descriptive vs inferential statistics

A
  • Descriptive= present information to summarise and visualise
  • Inferential= generalise to larger populations
15
Q

Describe the median

A

The middle value when the values are ranked in ascending or descending order
-With the values in non-decreasing order $x_1 \le \dots \le x_N$: $x_{(N+1)/2}$ for N odd, or any value between $x_{N/2}$ and $x_{N/2+1}$ for N even (conventionally their midpoint)
=Appropriate for qualitative ordinal data and quantitative interval and ratio data. It does not make sense for categorical data, as that has no appropriate ordering.
=The median is a good summary statistic for data where there is a forced cutoff at one end, or possible distortion by extreme outliers
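The rank rule for odd and even N can be sketched as follows. For even N this sketch picks the midpoint of the two middle values, which is one common convention among the permitted values; `median_by_rank` is a hypothetical helper name.

```python
from statistics import median

def median_by_rank(values):
    """Median via the rank rule; midpoint convention for even N."""
    xs = sorted(values)          # non-decreasing order
    n = len(xs)
    if n % 2 == 1:
        # Item (N+1)/2 in 1-based ranking, shifted to a 0-based index.
        return xs[(n + 1) // 2 - 1]
    # Midpoint of items N/2 and N/2 + 1 (1-based).
    return (xs[n // 2 - 1] + xs[n // 2]) / 2

print(median_by_rank([5, 1, 3]))
print(median_by_rank([4, 1, 3, 2]))
# An extreme value at one end does not move the middle value.
print(median_by_rank([1, 2, 3, 4, 1000]))
```

The standard library's `statistics.median` follows the same midpoint convention, so it can be used directly in practice.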

16
Q

Describe the mode

A
The most commonly occurring value
=It is most typically used for ordinal or categorical data
=It is not particularly informative for quantitative data with real-number values, where it is uncommon for the same data value to occur more than once
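A brief sketch contrasting the mode on categorical data with the mode on real-valued data, using `statistics.multimode` (Python 3.8+), which returns every value tied for most frequent. Both data sets are invented for illustration.

```python
from statistics import multimode

# Categorical data: one category clearly occurs most often.
nationalities = ["French", "Japanese", "French", "Mexican", "Japanese", "French"]
categorical_modes = multimode(nationalities)

# Real-valued data: no value repeats, so every value ties for "most common".
measurements = [1.21, 1.37, 1.29, 1.45]
real_valued_modes = multimode(measurements)

print(categorical_modes)
print(real_valued_modes)
```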
17
Q

Describe the range

A

The difference between the highest and the lowest values
=Often the minimum and maximum values are also reported
=The interquartile range is a widely used alternative measure that is less influenced by extreme values
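The sketch below compares the range with the interquartile range on a small made-up data set, with and without one extreme value. It uses `statistics.quantiles` with its default method; other quartile conventions give slightly different numbers, and the helper names are hypothetical.

```python
from statistics import quantiles

def value_range(values):
    """Difference between the highest and lowest values."""
    return max(values) - min(values)

def interquartile_range(values):
    """Q3 - Q1, using the default quartile method of statistics.quantiles."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1

without_outlier = [2, 4, 4, 5, 7, 9, 10, 12]
with_outlier = without_outlier + [50]

print(value_range(without_outlier), value_range(with_outlier))
print(interquartile_range(without_outlier), interquartile_range(with_outlier))
```

Adding the extreme value changes the range far more than it changes the interquartile range.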

18
Q

What is variance?

A

Mean square deviation from the mean
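Read literally, "mean square deviation from the mean" can be computed in three steps: find the mean, square each deviation from it, then average the squares. A minimal sketch with illustrative data and a hypothetical helper name:

```python
from statistics import pvariance

def mean_square_deviation(values):
    """Population variance: average of squared deviations from the mean."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

data = [2, 4, 4, 4, 5, 5, 7, 9]
var = mean_square_deviation(data)

# statistics.pvariance implements the same population formula.
print(var, pvariance(data))
```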

19
Q

Describe standard deviation

A

The square root of the variance, so it is expressed in the same units as the data
=The standard deviation makes sense for both interval and ratio data, but has no meaning for qualitative data scales
=This is perhaps the most popular measure of dispersion
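As a small illustration, taking the square root of the variance brings the measure of spread back to the units of the original data. The heights below are invented values.

```python
import math
from statistics import pstdev, pvariance

# Hypothetical heights in centimetres.
heights_cm = [150, 160, 170, 180, 190]

variance_cm2 = pvariance(heights_cm)    # in squared centimetres
std_dev_cm = math.sqrt(variance_cm2)    # back in centimetres

# statistics.pstdev computes the same population standard deviation directly.
print(variance_cm2, std_dev_cm, pstdev(heights_cm))
```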

20
Q

How do you take a representative sample?

A
=Select the sample at random
=Make the sample as large as practically possible

21
Q

How do we estimate the variance of the whole population using the sample mean?

A

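Use Bessel's correction: divide the sum of squared deviations from the sample mean by (n-1) rather than n
=This gives a different equation from the population variance equation: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$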
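A minimal sketch of the two denominators side by side, on illustrative data. In Python's standard library, `statistics.pvariance` divides by n and `statistics.variance` divides by n-1.

```python
from statistics import pvariance, variance

sample = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(sample)
sample_mean = sum(sample) / n
squared_deviations = sum((x - sample_mean) ** 2 for x in sample)

biased = squared_deviations / n          # population formula applied to a sample
unbiased = squared_deviations / (n - 1)  # with Bessel's correction

print(biased, pvariance(sample))
print(unbiased, variance(sample))
```

The corrected estimate is always slightly larger, and the gap shrinks as the sample grows.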