10 Descriptive Statistics Flashcards

1
Q

What do stats allow us to do?

A
  • understand real-world phenomena using the available data
  • obtain tools to summarise and interpret data
  • make sound inferences and forecasts
2
Q

Descriptive statistics

A

Used to summarise information which would otherwise be too complex to take in

3
Q

Statistical inference

A

Drawing conclusions about a population by studying a sample of data drawn from that population

4
Q

Variable

A

A specific characteristic of a unit

5
Q

What are the two different types of variable?

A

Numerical- where each observation takes a numerical value

Categorical- where each observation records which of a set of categories is observed

6
Q

What are the types of numerical variables?

A

Discrete- possible values are limited to a sequence of numbers (usually natural numbers)
Continuous- can take on any value within a range of real numbers

7
Q

What are the types of categorical variables?

A

Nominal- the categories have no ordering or ranking

Ordinal- the categories have a ranking

8
Q

Types of data

A

Cross-sectional data- data on several units at one point in time
Time series data- data on one unit across several points in time
Panel data- data on several units across several points in time

9
Q

Population

A

Describes the complete set of all units of interest to an investigator (N)

10
Q

Sample

A

An observed subset of the population (n)

11
Q

Simple random sampling

A
  • each member of the population is chosen strictly by chance
  • each member of the population is equally likely to be chosen
  • every possible sample of n objects is equally likely to be chosen
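As a rough illustration of these properties, Python's standard `random.sample` draws without replacement, so (up to pseudo-randomness) every subset of size n has the same chance of being selected. The population of 20 unit IDs, the seed, and the helper name `simple_random_sample` are assumptions made for this sketch.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; each size-n subset is equally likely."""
    rng = random.Random(seed)          # seeded only to make the sketch repeatable
    return rng.sample(population, n)   # sampling without replacement

population = list(range(1, 21))        # hypothetical population, N = 20 unit IDs
sample = simple_random_sample(population, 5, seed=0)   # sample of size n = 5
```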
12
Q

Xi

A

The i-th observation in a sample (i = 1, ..., n)

13
Q

Sigma

A

The summation symbol (Σ): the sum of the values that follow it

14
Q

Frequency distribution

A

A list or table containing groupings and the corresponding frequencies of the data within each group

15
Q

K

A

The number of groups (classes) into which the data can fall

16
Q

Absolute frequency

A

The number of observations belonging to a group

17
Q

Relative frequency

A

The proportion of observations belonging to that group

18
Q

Cumulative frequency

A

The total number of observations in that class and all previous classes

19
Q

Cumulative relative frequency

A

The proportion of observations in that class and all previous classes
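The four kinds of frequency can be computed side by side. The following is a minimal sketch, assuming each distinct value forms its own group; the data and the helper name `frequency_table` are hypothetical.

```python
from collections import Counter
from itertools import accumulate

def frequency_table(values):
    """Return one row per group:
    (group, absolute, relative, cumulative, cumulative relative)."""
    counts = Counter(values)
    groups = sorted(counts)                    # groups in increasing order
    n = len(values)                            # sample size
    absolute = [counts[g] for g in groups]     # observations in each group
    relative = [c / n for c in absolute]       # share of observations in each group
    cumulative = list(accumulate(absolute))    # running total of absolute frequencies
    cumulative_relative = [c / n for c in cumulative]
    return list(zip(groups, absolute, relative, cumulative, cumulative_relative))

data = [1, 2, 2, 3, 3, 3, 4, 4]                # hypothetical observations, n = 8
table = frequency_table(data)
```

The last row always has a cumulative frequency equal to n and a cumulative relative frequency of 1.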

20
Q

Arithmetic mean

A

The sum of all the values divided by the number of values

21
Q

Median

A

The middle observation when the values are sorted. If the number of values is even, the median is the mean of the two middle values

22
Q

Mode

A

The most frequently occurring value

23
Q

Which one out of mean, median and mode is most affected by outliers?

A

Mean
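A small sketch of why the mean reacts most strongly to outliers, using Python's standard `statistics` module. The data set and the helper name `summarise` are made up for illustration.

```python
import statistics

def summarise(values):
    """Return the mean, median and mode of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
    }

data = [2, 3, 3, 4, 5]          # hypothetical observations
with_outlier = data + [100]     # the same data plus one extreme value

before = summarise(data)
after = summarise(with_outlier)
```

Here the mean moves from 3.4 to 19.5, while the median only moves from 3 to 3.5 and the mode stays at 3.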

24
Q

Left skewed data

A

Data with a long tail of low values, which typically makes the mean less than the median

25
Q

Right skewed data

A

Data with a long tail of high values, which typically makes the mean greater than the median
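The mean-versus-median comparison can be written as a quick check. This is only a rule-of-thumb sketch, not a formal skewness measure; the function name `skew_direction` and the data sets are assumptions.

```python
import statistics

def skew_direction(values):
    """Compare mean and median as a rough indicator of skew."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right"      # long tail of high values pulls the mean up
    if mean < median:
        return "left"       # long tail of low values pulls the mean down
    return "no skew indicated"

right_tailed = [1, 2, 2, 3, 12]     # hypothetical: mean 4, median 2
left_tailed = [1, 10, 11, 11, 12]   # hypothetical: mean 9, median 11
balanced = [1, 2, 3, 4, 5]          # hypothetical: mean 3, median 3
```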

26
Q

Geometric mean

A

The nth root of the product of n numbers. It is used to measure the average rate of change of a variable over time
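As a sketch of the "nth root of a product" definition, the function below averages growth factors (1 + growth rate) over several periods. The yearly factors are invented, and the function requires strictly positive inputs.

```python
import math

def geometric_mean(values):
    """nth root of the product of n strictly positive numbers."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive numbers")
    return math.prod(values) ** (1 / len(values))

# Hypothetical yearly growth factors: +10%, +25%, -4%
growth_factors = [1.10, 1.25, 0.96]
average_factor = geometric_mean(growth_factors)
average_rate = average_factor - 1    # average growth rate per year
```

Applying `average_factor` three times in a row gives the same overall growth as the three original factors, which is why the geometric mean, rather than the arithmetic mean, is the natural average for rates of change.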

27
Q

Range

A

The difference between the largest and smallest values in the data

28
Q

Interquartile range

A

The range of the middle 50% of the data: Q3 - Q1
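A short sketch of both spread measures using the standard `statistics.quantiles` function. Several conventions exist for locating quartiles, so other tools may report slightly different Q1 and Q3 for the same data; the values below are hypothetical.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7]    # hypothetical observations

data_range = max(data) - min(data)             # largest minus smallest value

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4)   # default "exclusive" method
iqr = q3 - q1                                  # spread of the middle 50%
```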

29
Q

Variance

A

The average of squared deviations of values from the mean.

30
Q

What is the sample variance divided by and why?

A

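n-1 (rather than n). Deviations are measured from the sample mean, which is itself computed from the same data, so they are on average smaller than deviations from the true population mean. Dividing by n would therefore tend to underestimate the population variance; dividing by n-1 corrects for this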
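The difference between the two divisors can be seen in a minimal sketch. The function name, its `sample` flag, and the data are assumptions for illustration; the standard library's `statistics.pvariance` and `statistics.variance` are used only as a cross-check.

```python
import statistics

def variance(values, sample=True):
    """Average squared deviation from the mean.
    sample=True divides by n-1, sample=False divides by n."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return squared_deviations / (n - 1 if sample else n)

data = [2, 4, 4, 4, 5, 5, 7, 9]     # hypothetical observations, mean = 5

population_variance = variance(data, sample=False)   # 32 / 8
sample_variance = variance(data)                     # 32 / 7
```

The sample version is always the larger of the two, and the gap shrinks as n grows.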

31
Q

Standard deviation

A

The square root of the variance

32
Q

Why is standard deviation more useful than variance?

A

Standard deviation measures the spread around the mean in the same units as the data, whereas variance is in squared units

33
Q

Advantages of variance and standard deviation

A
  • each value of the data set is used in the calculations
  • values far from the mean are given extra weight

34
Q

Coefficient of variation

A

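Measures relative variation, i.e. the standard deviation relative to the mean, and can be used to compare the spread of two or more data sets, even when they are measured in different units. It is usually expressed as a percentage: (standard deviation / mean) × 100%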
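A minimal sketch of the coefficient of variation as (standard deviation / mean) × 100, assuming the sample standard deviation is used. The two data sets are invented to show that data on very different scales can have the same relative variation.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(values) / mean * 100

small_scale = [10, 12, 14]       # hypothetical: mean 12, standard deviation 2
large_scale = [100, 120, 140]    # hypothetical: mean 120, standard deviation 20

cv_small = coefficient_of_variation(small_scale)
cv_large = coefficient_of_variation(large_scale)
```

The standard deviations differ by a factor of ten, yet both coefficients of variation are about 16.7%.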

35
Q

Covariance

A

Measures the joint variability of two variables

36
Q

What does the sign of the covariance indicate?

A

The direction of the linear relationship between the two variables.
Cov(x,y) > 0: x and y tend to move in the same direction (positive relationship)
Cov(x,y) < 0: x and y tend to move in opposite directions (negative relationship)
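The sign rule can be checked with a small sketch. It assumes the sample covariance, dividing by n-1 in the same way as the sample variance; the function name and data are hypothetical.

```python
def sample_covariance(xs, ys):
    """Average product of paired deviations from the means (divisor n-1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return products / (n - 1)

x = [1, 2, 3, 4, 5]
y_rising = [2, 4, 6, 8, 10]     # hypothetical: rises with x
y_falling = [10, 8, 6, 4, 2]    # hypothetical: falls as x rises

cov_positive = sample_covariance(x, y_rising)    # 20 / 4
cov_negative = sample_covariance(x, y_falling)   # -20 / 4
```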

37
Q

Coefficient of correlation

A

Measures the relative strength of the linear relationship between two variables

38
Q

What values does the coefficient of correlation take?

A

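From -1 (perfect negative linear relationship) to 1 (perfect positive linear relationship); 0 indicates no linear relationship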

39
Q

Why is coefficient of correlation more useful than covariance?

A

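It gives both the direction and the strength of the relationship. It is unit-free, whereas the size of the covariance depends on the units in which the variables are measured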
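The following sketch illustrates the point by rescaling one variable: the covariance changes with the units, while the correlation coefficient does not. It assumes the usual definition r = Cov(x,y) / (s_x s_y) with sample (n-1) versions throughout; the variable names and data are made up.

```python
import math

def sample_covariance(xs, ys):
    """Average product of paired deviations from the means (divisor n-1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    variance_x = sample_covariance(xs, xs)   # covariance of x with itself
    variance_y = sample_covariance(ys, ys)
    return sample_covariance(xs, ys) / math.sqrt(variance_x * variance_y)

hours = [1, 2, 3, 4, 5]                      # hypothetical data
score = [2, 4, 5, 4, 5]
score_rescaled = [100 * s for s in score]    # same scores in different units

cov_original = sample_covariance(hours, score)            # 1.5
cov_rescaled = sample_covariance(hours, score_rescaled)   # 100 times larger
r_original = correlation(hours, score)
r_rescaled = correlation(hours, score_rescaled)           # unchanged
```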