Statistics Midterm Flashcards

1
Q

Descriptive statistics

A

Consists of the collection, organisation, summarisation and presentation of data.

2
Q

Inferential statistics

A

Consists of generalising from samples to population, performing estimations, hypothesis testing, determining relationships among variables, and making predictions.

3
Q

Population

A

Consists of all subjects that are being studied.

4
Q

Sample

A

Is a group of subjects selected from the population.

5
Q

Parameter

A

A measure that describes a characteristic of a population.

6
Q

Statistic

A

A measure that describes a characteristic of a sample.

7
Q

Selection bias

A

A distortion of evidence or data that arises from the way that the data are collected.

8
Q

Variable

A

A characteristic or attribute that can assume different values.

9
Q

Data

A

The values that variables can assume.

10
Q

Qualitative variable

A

The characteristic being studied is non-numeric.

11
Q

Quantitative variable

A

Information is reported numerically.

12
Q

Discrete variable

A

Takes on values that can be counted, such as whole numbers. (e.g. the number of people)

13
Q

Continuous variable

A

Can take on any value within an interval, so infinitely many values are possible. (e.g. height, time)

14
Q

Nominal level of measurement

A

Classifies data into mutually exclusive (non-overlapping), exhaustive categories in which no order or ranking can be imposed.

15
Q

Ordinal level of measurement

A

Classifies data into categories that can be ranked, but precise differences between the ranks do not exist.

16
Q

Interval level of measurement

A

Ranks data, and precise differences between units of measurement do exist. However, there is no meaningful zero. (e.g. temperature in °C or °F)

17
Q

Ratio level of measurement

A

Ranks data, and precise differences between units of measurement do exist. A true zero exists.

18
Q

Frequency table (categorical frequency distribution)

A

The organisation of qualitative data into table form, using mutually exclusive classes and showing the number of observations in each class.

19
Q

Relative frequency

A

Class frequency / total frequency

20
Q

Pie chart

A

degrees = class relative frequency * 360
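
As a quick illustration of the relative frequency and pie chart formulas, here is a minimal Python sketch. The blood-type counts and the function names are made up for the example and are not part of the course material.

```python
def relative_frequencies(frequencies):
    """Relative frequency = class frequency / total frequency."""
    total = sum(frequencies.values())
    return {category: freq / total for category, freq in frequencies.items()}


def pie_degrees(frequencies):
    """Pie chart angle = class relative frequency * 360."""
    return {
        category: rel * 360
        for category, rel in relative_frequencies(frequencies).items()
    }


# Hypothetical class frequencies (made-up example data, 25 observations).
blood_types = {"A": 5, "B": 7, "AB": 4, "O": 9}

for category, degrees in pie_degrees(blood_types).items():
    print(f"{category}: {degrees:.1f} degrees")
```

The angles of all classes should add up to 360 degrees, which is a handy check when drawing the chart by hand.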

21
Q

Building frequency tables

A

You should have: class limits, class boundaries, class midpoint, frequency, cumulative frequency, relative frequency and sometimes cumulative relative frequency.

Rules: use between 5 and 20 classes, the classes must be mutually exclusive, the classes must be continuous (no gaps), the classes must be exhaustive (cover the full data range), the classes must be of equal width.

Procedure: find the range of the data (highest value - lowest value), decide the number of classes, decide the class width, set the class limits and boundaries, count the number of occurrences in each class.
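
The procedure above can be sketched in Python roughly as follows. This is only one possible approach, assuming whole-number data, a first class that starts at the lowest value, and a width that is always moved up to the next whole number; the `frequency_table` function and the scores are hypothetical.

```python
def frequency_table(data, num_classes):
    """Build a grouped frequency table for whole-number data."""
    low, high = min(data), max(data)
    data_range = high - low
    # Width: range / number of classes, always moved up to the next
    # whole number so that the last class still covers the highest value.
    width = data_range // num_classes + 1

    rows = []
    cumulative = 0
    for i in range(num_classes):
        lower = low + i * width
        upper = lower + width - 1
        freq = sum(1 for x in data if lower <= x <= upper)
        cumulative += freq
        rows.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),
            "midpoint": (lower + upper) / 2,
            "frequency": freq,
            "cumulative": cumulative,
            "relative": freq / len(data),
        })
    return rows


# Made-up exam scores, used only for illustration.
scores = [52, 55, 61, 64, 67, 70, 71, 73, 75, 78,
          80, 82, 85, 88, 91, 94, 97, 99, 60, 76]
table = frequency_table(scores, num_classes=5)

for row in table:
    print(row)
```

With these scores the range is 47, so five classes get a width of 10 and the first class runs from 52 to 61.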

22
Q

Cumulative frequency

A

The total number of values that are less than a given upper class boundary.

23
Q

Class midpoint

A

(lower class limit + upper class limit) / 2

24
Q

Histogram

A
Displays the data by using contiguous vertical bars of various heights to represent the frequencies of the classes.
Vertical axis: frequencies
Horizontal axis: class boundaries
25
Q

Frequency polygon

A
Displays the data by using lines that connect points plotted for the frequencies at the midpoints of the classes.
Vertical axis: frequencies
Horizontal axis: class midpoints
26
Q

Ogive / cumulative frequency

A
Represents the cumulative frequencies for the classes in a frequency distribution. The line never goes down.
Vertical axis: cumulative frequency
Horizontal axis: upper class boundaries
27
Q

Mean

A

The sum of the values, divided by the total number of values.

28
Q

Rounding rule of the mean

A

The mean should be rounded to one more decimal place than occurs in the raw data.
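
A small Python sketch of the mean together with the rounding rule is shown below. The `rounded_mean` helper and its `raw_decimals` parameter are hypothetical names chosen for this example, and the data are made up.

```python
def rounded_mean(values, raw_decimals=0):
    """Mean rounded to one more decimal place than the raw data."""
    mean = sum(values) / len(values)  # sum of the values / number of values
    # Note: Python's round() may treat exact ties differently from
    # the rounding taught in class, so check borderline results by hand.
    return round(mean, raw_decimals + 1)


# Made-up whole-number data (0 decimal places), so the mean gets 1 decimal.
data = [2, 3, 5, 7, 11, 13]
print(rounded_mean(data))  # 41 / 6 = 6.8333..., reported as 6.8
```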

29
Q

X-bar x̄

A

Sample mean

30
Q

μ

A

Population mean

31
Q

Weighted mean

A

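Used when some values are of more importance than others: multiply each value by its weight, add the products, and divide by the sum of the weights. (e.g. calculating a course grade)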
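
As an illustration, a weighted mean multiplies each value by its weight and divides the sum of the products by the sum of the weights. The sketch below assumes made-up grades weighted by credits; `weighted_mean` is a hypothetical helper name.

```python
def weighted_mean(values, weights):
    """Sum of (weight * value) divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Made-up example: three course grades weighted by their credits.
grades = [80, 90, 70]
credits = [2, 3, 5]

print(weighted_mean(grades, credits))  # (160 + 270 + 350) / 10 = 78.0
```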

32
Q

Median

A

The midpoint of a data set when the data is arranged in order.
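
The sketch below shows one way to find the median in Python, covering both an odd and an even number of values. The function name and the data are made up for illustration.

```python
import statistics


def median(values):
    """Middle value of the ordered data (mean of the two middle values if n is even)."""
    ordered = sorted(values)  # the data must be arranged in order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 5]))     # ordered: 1, 5, 7 -> 5
print(median([7, 1, 5, 3]))  # ordered: 1, 3, 5, 7 -> (3 + 5) / 2 = 4.0

# The standard library gives the same result.
print(statistics.median([7, 1, 5, 3]))
```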

33
Q

Mode

A

The value that occurs most often in the dataset.

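If no value occurs more than once, the data set has no mode. If two values occur with the same greatest frequency, the data set is called bimodal.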
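
A minimal Python sketch for finding the mode is given below. It assumes that a data set in which no value repeats has no mode, and it returns every value tied for the highest count; the `modes` helper is a hypothetical name.

```python
from collections import Counter


def modes(values):
    """Return all values that occur most often (empty list if nothing repeats)."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # assumption: no value repeats, so there is no mode
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([1, 2, 2, 3]))     # one mode
print(modes([1, 1, 2, 2, 3]))  # two modes: bimodal
print(modes([1, 2, 3]))        # no mode
```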

34
Q

Midrange

A

(lowest value + highest value) / 2
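
For comparison, the midrange adds the lowest and highest values before dividing by 2. A short Python sketch with made-up data:

```python
def midrange(values):
    """(lowest value + highest value) / 2."""
    return (min(values) + max(values)) / 2


data = [2, 3, 6, 8, 12]  # made-up example data
print(midrange(data))    # (2 + 12) / 2 = 7.0
```

Because only the two extreme values are used, a single very high or very low value changes the midrange a lot.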

35
Q

Normally distributed data

A

Bell-shaped and symmetric, with zero skewness. Approximately the same number of values fall on both sides, and the peak is in the middle.

The mean is the measure of average most often used.

36
Q

Positive skewness

A

The peak is on the left and the tail extends to the right. (e.g. salaries)

The median is the measure of average most often used.
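
To see why the median is often preferred here, the sketch below uses made-up salaries (in thousands) in which one very high value pulls the mean upward while the median stays near the bulk of the data. It is only an illustration, not a general proof.

```python
import statistics

# Made-up salaries in thousands: most are similar, one is much higher.
salaries = [30, 32, 35, 38, 40, 45, 120]

mean = statistics.mean(salaries)
median = statistics.median(salaries)

print(f"mean = {mean:.1f}, median = {median}")
# The single high salary drags the mean above the median.
```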

37
Q

Negative skewness

A

The peak is on the right and the tail extends to the left. (e.g. exam grades)

The median is the measure of average most often used.

38
Q

Range

A

Highest value - lowest value

39
Q

Variance

A

The average of the squares of the distance each value is from the mean.

  1. Find the mean.
  2. Subtract the mean from each value.
  3. Square each result.
  4. Add the squared results together.
  5. (sample) Divide the sum by the number of values in the data set minus one.
    (population) Divide the sum by the number of values in the data set.
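
The five steps above can be sketched in Python as follows. The `variance` function and its `sample` flag are hypothetical names for this example, and the data are made up; the result is cross-checked against the standard library.

```python
import statistics


def variance(values, sample=True):
    """Variance following the five steps; sample=True divides by n - 1."""
    n = len(values)
    mean = sum(values) / n                    # step 1: find the mean
    deviations = [x - mean for x in values]   # step 2: value - mean
    squared = [d ** 2 for d in deviations]    # step 3: square each result
    total = sum(squared)                      # step 4: add them together
    return total / (n - 1) if sample else total / n  # step 5


data = [10, 60, 50, 30, 40, 20]  # made-up example data, mean = 35

print(variance(data))                # sample variance: 1750 / 5 = 350.0
print(variance(data, sample=False))  # population variance: 1750 / 6

# Cross-check with the standard library.
print(statistics.variance(data), statistics.pvariance(data))
```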
40
Q

Standard deviation

A

The square root of the variance.
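
As a brief illustration, the standard deviation can be obtained in Python by taking the square root of the variance. The data below are made up, and the standard library functions are used for the variance itself.

```python
import math
import statistics

data = [10, 60, 50, 30, 40, 20]  # made-up example data

# Square root of the sample variance (divides by n - 1).
sample_sd = math.sqrt(statistics.variance(data))
# Square root of the population variance (divides by n).
population_sd = math.sqrt(statistics.pvariance(data))

print(f"sample: {sample_sd:.2f}, population: {population_sd:.2f}")
```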

41
Q

Chebyshev’s theorem

A

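The proportion of values that fall within k standard deviations of the mean is at least 1-(1/k^2), where k is a number greater than 1. This holds for any distribution.

Step 1: k = (difference between value and mean) / standard deviation

Step 2: 1-(1/k^2)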
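
The two steps can be sketched in Python as shown below. The function name and the example numbers (mean 50, standard deviation 5) are assumptions made for illustration only.

```python
def chebyshev_min_proportion(value, mean, std_dev):
    """Minimum proportion of data within k standard deviations of the mean."""
    # Step 1: how many standard deviations the value lies from the mean.
    k = abs(value - mean) / std_dev
    if k <= 1:
        raise ValueError("Chebyshev's theorem only applies for k greater than 1")
    # Step 2: apply 1 - (1 / k^2).
    return 1 - 1 / k ** 2


# Made-up example: mean 50, standard deviation 5, value 60, so k = 2.
print(chebyshev_min_proportion(60, mean=50, std_dev=5))  # 1 - 1/4 = 0.75
```

With k = 2, at least 75% of the values lie between 40 and 60, whatever the shape of the distribution.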