Statistics Flashcards

1
Q

Standard deviation indication

A

shows how spread out the values are around the mean & how volatile they are

In the same units as the data set

2
Q

Variance indication

A

average of the squared distances of the points from the mean (in squared units)
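
A minimal Python sketch of the variance and standard deviation cards. It assumes the population formulas (divide by n, not n - 1); the function names and the data are made up for illustration.

```python
import math

def population_variance(values):
    """Mean of the squared deviations from the mean (squared units)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

def population_std_dev(values):
    """Square root of the variance, so it is back in the data's own units."""
    return math.sqrt(population_variance(values))

heights_cm = [2, 4, 4, 4, 5, 5, 7, 9]       # made-up data, mean is 5
variance = population_variance(heights_cm)  # 4.0, in cm squared
std_dev = population_std_dev(heights_cm)    # 2.0, in cm
```

A sample version would divide by n - 1 instead of n.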

3
Q

Outlier

A

any data point that lies an abnormal distance from the rest of the data

4
Q

Extrapolating

A

estimating a value outside the given data range

5
Q

Interpolating

A

estimating a value inside the given data range

6
Q

line of best fit

A

straight line that best fits the trend of the data; used for predictions by interpolating or (less reliably) extrapolating the graph
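
One way to get a line of best fit is least squares. The sketch below assumes that method; `fit_line`, `predict` and the data are hypothetical names and values chosen for illustration. It shows one interpolated and one extrapolated prediction.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

hours = [1, 2, 3, 4, 5]          # made-up data
scores = [52, 55, 61, 64, 68]
slope, intercept = fit_line(hours, scores)

def predict(x):
    return intercept + slope * x

inside = predict(2.5)   # interpolating: 2.5 is inside the range 1 to 5
outside = predict(8)    # extrapolating: 8 is outside the range, less reliable
```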

7
Q

venn diagram

A

a geometric representation of sets & the relationships between them

7
Q

quartiles

A

show you where certain percentages of the data lie; Q1, Q2 (median), Q3 = 25%, 50%, 75% respectively

7
Q

Outlier test

A

below Q1 - 1.5 × IQR = outlier (too low)
above Q3 + 1.5 × IQR = outlier (too high)
IQR = Q3 - Q1
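
A common form of this test flags a value when it is below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR. A minimal sketch, assuming the "inclusive" quartile method of Python's `statistics.quantiles` (textbooks use slightly different quartile conventions); `find_outliers` and the data are made up for illustration.

```python
import statistics

def find_outliers(values):
    """Return the values outside the 1.5 x IQR fences."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in values if x < lower_fence or x > upper_fence]

data = [10, 12, 13, 14, 15, 16, 18, 40]   # made-up data
outliers = find_outliers(data)            # [40]
```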

7
Q

set

A

collection of well-defined, unique objects

7
Q

list

A

an ordered collection of objects; repeats are allowed
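
Python's built-in `list` and `set` types behave much like the two cards above, so a short illustration may help (the dice rolls are made up):

```python
rolls_list = [3, 5, 3, 6, 5]   # a list keeps the order and the repeats
rolls_set = set(rolls_list)    # a set keeps each distinct value only once

print(len(rolls_list))  # 5
print(len(rolls_set))   # 3
```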

7
Q

PMCC

A

Pearson product-moment correlation coefficient
* only measures linear relationships
* - = negative correlation
* + = positive correlation
* always between -1 and 1
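
The PMCC can be computed as Sxy / sqrt(Sxx × Syy), which always lands between -1 and 1. A minimal sketch of that formula; the function name `pmcc` and the data are chosen here for illustration.

```python
import math

def pmcc(xs, ys):
    """Pearson product-moment correlation coefficient of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
rising = pmcc(xs, [2, 4, 6, 8, 10])    # perfect positive correlation
falling = pmcc(xs, [10, 8, 6, 4, 2])   # perfect negative correlation
```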

8
Q

Ogive

A

cumulative frequency curve
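
The points of an ogive come from running totals of the class frequencies, plotted against the upper class boundaries. A small sketch with made-up classes:

```python
from itertools import accumulate

upper_bounds = [10, 20, 30, 40]   # upper class boundaries (made up)
frequencies = [3, 7, 12, 8]       # frequency of each class (made up)

cumulative = list(accumulate(frequencies))          # running totals
ogive_points = list(zip(upper_bounds, cumulative))  # (x, y) points to plot
print(ogive_points)  # [(10, 3), (20, 10), (30, 22), (40, 30)]
```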

9
Q

central tendency

A

mean, mode, median = descriptive summary of data

10
Q

Skew

A

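measuring where most data lies
negative skew = most data lies at the higher values (tail points towards the lower values)
positive skew = most data lies at the lower values (tail points towards the higher values)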
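
One way to put a number on skew is the moment coefficient of skewness (third central moment divided by the standard deviation cubed). This is only one of several measures, and the sketch below assumes it; the data sets are made up. A long tail to the right gives a positive value, a long tail to the left a negative one.

```python
def skewness(values):
    """Moment coefficient of skewness: sign shows the direction of the tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # population variance
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    return m3 / m2 ** 1.5

right_tailed = [1, 1, 2, 2, 3, 10]    # most values low, tail to the right
left_tailed = [1, 8, 9, 9, 10, 10]    # most values high, tail to the left

positive_skew = skewness(right_tailed)   # greater than 0
negative_skew = skewness(left_tailed)    # less than 0
```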

11
Q

Analysing histograms

A

CSOS
* centre
* spread
* outlier
* shape

12
Q

Shape

A
  • number of peaks = unimodal, bimodal, multimodal
  • symmetry & skew
13
Q

unreliable data

A

if
* missing data
* errors in handling data

14
Q

sufficient data

A

if there is enough data to support your conclusion

15
Q

How is standard dev. affected when a constant is added or subtracted

A

unaffected, as all values shift by that number = distance between values remains the same

16
Q

How is standard dev. affected when the data is multiplied or divided by a constant

A

standard deviation is also multiplied or divided by that constant (ignoring its sign), as the distances between the values are scaled by it

17
Q

How is the mean affected when a constant is added to every value

A

the constant is also added to the mean, as every value shifts by it
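
A quick numerical check of the cards on adding and multiplying by a constant, using Python's `statistics` module. The data, the shift of 10 and the scale factor of 3 are made up for illustration, and the population standard deviation is assumed.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]        # made-up data: mean 5, std dev 2
shifted = [x + 10 for x in data]       # add a constant to every value
scaled = [x * 3 for x in data]         # multiply every value by a constant

mean_shifted = statistics.mean(shifted)    # 15: mean moves by the constant
sd_shifted = statistics.pstdev(shifted)    # 2.0: spread is unchanged
mean_scaled = statistics.mean(scaled)      # 15: mean is multiplied by 3
sd_scaled = statistics.pstdev(scaled)      # 6.0: spread is multiplied by 3
```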

18
Q

Target population

A

population from which you take a sample

19
Q

Sampling Unit

A

single member that is chosen to be sampled

19
Q

Sampling frame

A

list of all the items/people that can be sampled

20
Q

Sampling values

A

possible values the sampling variable can take

20
Q

Sampling Variable

A

variable under investigation

21
Q

Biases in sampling

A
  • non-response
  • bad design
  • bias in respondent
  • some members are excluded
22
Q

Reliable data

A

data is reliable when you can collect it again and get similar results

23
Q

Sufficient data

A

Data is sufficient if there is enough data available

24
Q

Qualitative Data

A
  • opinion based
  • expressed in words
  • can be described
  • ONLY mode can be calculated
25
Q

Quantitative data

A
  • expressed in numbers
  • can be discrete or continuous
  • can be measured
  • can be counted
26
Q

Discrete data

A
  • countable
  • in distinct categories
  • takes only specific, separate values

Types of graph
* dot plot
* bar chart

27
Q

Continuous data

A
  • measurable
  • can always be measured more accurately and to higher resolution
  • can take any value within a range (infinitely many possible values)

Graph
* histogram
* line graph (for example cumulative frequency)

28
Q

Simple random sampling

A

Sampling units are assigned numbers and a random number generator is used

Pros
* everyone has equal chance of being chosen = bias free
* simple & cheap

Cons
* not suitable for large population
* needs sampling frame
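
A minimal sketch of simple random sampling with Python's `random` module standing in for the random number generator. The function name, the `seed` parameter and the sampling frame of 50 students are assumptions made for illustration.

```python
import random

def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Pick sample_size distinct units, each equally likely to be chosen."""
    rng = random.Random(seed)   # a fixed seed only makes the run repeatable
    return rng.sample(sampling_frame, sample_size)

frame = [f"student_{i}" for i in range(1, 51)]   # made-up sampling frame
sample = simple_random_sample(frame, 5, seed=1)
```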

28
Q

Systematic Sampling

A

k = population size / sample size
assign a number to everyone in the sampling frame, start at a random number between 1 and k, then take every kth member

Pros
* simple & quick to use
* suitable for large sample sizes

Cons
* might be biased by how you choose who to start on
* sampling frame needed
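
A sketch of systematic sampling, assuming k is the population size divided by the sample size (rounded down) and that the starting member is picked at random from the first k. The function name and the numbered frame are made up for illustration.

```python
import random

def systematic_sample(sampling_frame, sample_size, seed=None):
    """Random start among the first k members, then every kth member."""
    k = len(sampling_frame) // sample_size   # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)                 # 0-based index of first pick
    return [sampling_frame[start + i * k] for i in range(sample_size)]

frame = list(range(1, 51))                   # members numbered 1 to 50
sample = systematic_sample(frame, 5, seed=2) # k = 10, so picks are 10 apart
```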

28
Q

Quota Sampling

A

Split population into groups based on characteristics
handpick items from each group until its quota is satisfied

Pros
* ensures variety in sample
* allows small groups to be represented
* no sampling frame required

Cons
* biased in choosing = not random

28
Q

Stratified sampling

A

put items into strata with common characteristics
find each stratum's proportion within the population = stratum size / population size
perform random sampling in each stratum, taking that proportion of the sample

Pros
* random
* represents the different groups in proportion to their size in the population

Cons
* needs sampling frame
* same cons as random sampling within each stratum
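
A sketch of stratified sampling: each stratum supplies a share of the sample equal to its proportion of the population, picked at random. The function name, the year-group strata and their sizes are made up; rounding the shares is an assumption, and with awkward proportions the rounded shares may not add up exactly to the requested total.

```python
import random

def stratified_sample(strata, total_sample_size, seed=None):
    """Randomly sample each stratum in proportion to its size."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # stratum size / population size, applied to the total sample size
        share = round(total_sample_size * len(members) / population_size)
        sample[name] = rng.sample(members, share)
    return sample

strata = {                                    # made-up strata
    "year_7": [f"y7_{i}" for i in range(60)],
    "year_8": [f"y8_{i}" for i in range(30)],
    "year_9": [f"y9_{i}" for i in range(10)],
}
sample = stratified_sample(strata, 10, seed=3)   # shares of 6, 3 and 1
```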

28
Q

Convenience sampling

A

find whoever is most convenient/closest proximity

Pros
* easy & inexpensive

Cons
* unrepresentative of the population
* highly biased

28
Q

Clustered sampling

A

split the population into groups (clusters) that each contain different kinds of people
select one or more whole clusters randomly
only sample from the chosen clusters

29
Q

things to remember when making box plots

A

Check for outliers