Data Analysis Flashcards

1
Q

Distribution =

A

How frequently different values are observed in the data

2
Q

Frequency =

A

Number of times value appears in the data

3
Q

Frequency distribution =

A

Table or graph that shows values and their corresponding frequencies

4
Q

Relative Frequency =

A

Frequency of a value/Total Number of Data Entries
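
As a rough illustration of frequency and relative frequency, here is a minimal Python sketch. The data set is hypothetical and chosen only for the example.

```python
from collections import Counter

# Hypothetical data set, chosen only for illustration.
data = [2, 3, 3, 5, 5, 5, 7, 7]

# Frequency: number of times each value appears in the data.
frequency = Counter(data)

# Relative frequency: frequency of a value / total number of data entries.
total = len(data)
relative_frequency = {value: count / total for value, count in frequency.items()}

# Print a small frequency table: value, frequency, relative frequency.
for value in sorted(frequency):
    print(value, frequency[value], relative_frequency[value])
```

The relative frequencies of all values should add up to 1.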

5
Q

Relative frequency distribution

A

Table or graph showing relative frequencies of each value

6
Q

Make predictions with the slope of trend line of a scatter plot

A
  1. Take or estimate 2 points on the trend line
  2. Work out the slope: (change in y) / (change in x) between the 2 points
  3. Slope = the change in y for each 1-unit increase in x
  4. Multiply the slope if needed to change the x-axis unit (for example, to state the change for every hour or for every week)
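
The steps above can be sketched in Python as follows. The two points, the units (days, units sold) and the function names `slope` and `predict` are assumptions made up for this example.

```python
def slope(p1, p2):
    """Slope of the line through two (x, y) points: change in y / change in x."""
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)


def predict(x, point, m):
    """Predict y at x by extending a line of slope m from a known point."""
    x0, y0 = point
    return y0 + m * (x - x0)


# Two hypothetical points estimated from a trend line
# (x in days, y in units sold).
m = slope((2, 10), (6, 30))   # change in y per 1 day
per_week = m * 7              # same rate, restated per week

print(m, per_week, predict(10, (2, 10), m))
```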
7
Q

Arithmetic Mean =

A

Sum of all the values / No. of values

8
Q

Weighted Mean =

A

Sum of (each UNIQUE value × its weight) / Sum of the weights

9
Q

Weight of a value =

A

The frequency with which it appears (number of times it occurs in the data)
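
A minimal sketch of a weighted mean that uses each unique value's frequency as its weight, i.e. sum of (value × weight) / sum of weights. The data and the helper name `weighted_mean` are hypothetical.

```python
from collections import Counter


def weighted_mean(values, weights):
    """Sum of (value * weight) divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical data set with repeated values.
data = [1, 1, 2, 4, 4, 4]

# Weight of each unique value = how many times it appears.
counts = Counter(data)
unique_values = list(counts)
weights = [counts[v] for v in unique_values]

wm = weighted_mean(unique_values, weights)
am = sum(data) / len(data)   # arithmetic mean of the full list
print(wm, am)
```

With frequencies as weights, the weighted mean of the unique values should match the arithmetic mean of the full list.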

10
Q

Median =

A

‘Middle Number’

  1. Order values from smallest to biggest
  2. If no. of values is odd, Median = the number in the middle of this list

If no. of values is even, there are 2 numbers in the middle. Median = mean of these 2 values
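
The odd and even cases can be sketched like this. The function below is an illustrative helper and assumes a non-empty list (Python's standard library also provides `statistics.median`).

```python
def median(values):
    """Middle value of the ordered data (assumes at least one value)."""
    ordered = sorted(values)          # step 1: order smallest to biggest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the single middle value
    # even count: mean of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([5, 1, 3]))
print(median([4, 1, 3, 2]))
```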

11
Q

Mode

A

‘Most frequent’

Value that occurs most frequently in list

There can be more than one in a data set

12
Q

Positions of data

A

(Order data from least to greatest)

L = Least

M = Median

G = Greatest

13
Q

Quartiles

A

Q1, Q2 (M) and Q3 split the data into 4 groups:

L - Q1

Q1 - Q2(M)

Q2(M) - Q3

Q3 - G

14
Q

Percentiles

A

The 99 percentiles split the data into 100 groups

Group 1: L - 1st percentile

Group 100: 99th percentile - G

15
Q

How to find Q1

A

Find median of 1st half of data (the data before median)

16
Q

How to find Q3

A

Find median of second half of data (the data after the median)
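
A minimal sketch of finding Q1, M and Q3 as medians of the halves. It assumes one common convention: when the number of values is odd, the middle value is left out of both halves. Other conventions exist and can give slightly different quartiles. The function names are hypothetical.

```python
def median(ordered):
    """Median of an already ordered, non-empty list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, M, Q3); assumes at least 2 values."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # the data before the median
    upper = ordered[half + n % 2:]    # the data after the median
    return median(lower), median(ordered), median(upper)


print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))
print(quartiles([1, 3, 5, 7, 9, 11, 13]))
```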

17
Q

Dispersion

A

Degree of spread of the data

Most common measures = range, interquartile range, standard deviation

18
Q

Range =

A

G - L

Greatest - Least

(Shows the maximal spread of the data but can be affected by outliers)

19
Q

Interquartile Range =

A

Q3 - Q1

Shows the spread of the middle half of the data. Is not affected by outliers

20
Q

Standard Deviation - measure of

A

Measure of spread that depends on every number in the data set (unlike ranges).

The more the data is spread away from the mean, the greater the standard deviation

Sometimes called Population Standard Deviation (to differentiate it from the sample standard deviation)

21
Q

How to calculate standard deviation =

A
  1. Find the mean
  2. Find the difference between each value and the mean and square it
  3. Find the mean of these squared differences
  4. Square root this number (take only the positive answer)
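
The four steps above can be sketched in Python as follows. The function name and the sample data are made up for illustration (the standard library's `statistics.pstdev` computes the same quantity).

```python
import math


def population_sd(values):
    """Population standard deviation, following the four steps."""
    mean = sum(values) / len(values)                    # 1. find the mean
    squared_diffs = [(v - mean) ** 2 for v in values]   # 2. squared differences
    variance = sum(squared_diffs) / len(values)         # 3. mean of those
    return math.sqrt(variance)                          # 4. positive square root


# Hypothetical data: mean is 5, squared differences sum to 32.
print(population_sd([2, 4, 4, 4, 5, 5, 7, 9]))
```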
22
Q

How to find the SAMPLE Standard Deviation

A
  1. Find the mean
  2. Find the difference between each value and the mean and square it
  3. Divide the sum of these squared differences by (no. of values - 1)
  4. Square root this number (take only the positive answer)

(Sometimes preferred for a sample of data taken from a larger ‘population’ (set) of data)
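
A minimal sketch of the sample version, which differs from the population version only in dividing by (no. of values - 1). The function name and data are hypothetical, and at least 2 values are assumed (the standard library's `statistics.stdev` computes the same quantity).

```python
import math


def sample_sd(values):
    """Sample standard deviation (assumes at least 2 values)."""
    n = len(values)
    mean = sum(values) / n                              # 1. find the mean
    squared_diffs = [(v - mean) ** 2 for v in values]   # 2. squared differences
    variance = sum(squared_diffs) / (n - 1)             # 3. divide by n - 1
    return math.sqrt(variance)                          # 4. positive square root


# Hypothetical sample: mean is 3, squared differences sum to 8, 8 / 2 = 4.
print(sample_sd([1, 3, 5]))
```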

23
Q

1, 2, 3 standard deviations above the mean =

A

Mean + 1d
Mean + 2d
Mean + 3d

d = standard deviation

24
Q

1, 2, 3 standard deviations below the mean =

A

Mean - 1d
Mean - 2d
Mean - 3d

d = standard deviation

25
How many standard deviations from the mean is X?
If X > Mean: Mean + Rd = X
If X < Mean: Mean - Rd = X
Where R = no. of standard deviations and d = standard deviation
Rewritten: R = (X - Mean) / d if X > Mean, OR R = (Mean - X) / d if X < Mean
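
As a small sketch, both cases can be handled with one signed formula, (X - Mean) / d, where a negative result means X is below the mean. The function name and the numbers are hypothetical.

```python
def deviations_from_mean(x, mean, d):
    """Signed number of standard deviations between x and the mean.

    Positive: x is above the mean. Negative: x is below the mean.
    Assumes d (the standard deviation) is greater than 0.
    """
    return (x - mean) / d


# Hypothetical values: mean 70, standard deviation 5.
print(deviations_from_mean(80, 70, 5))   # above the mean
print(deviations_from_mean(55, 70, 5))   # below the mean
```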
26
In any group of data, most values are within about ____ standard deviations of the mean
In any group of data, most values are within about 3 standard deviations of the mean
27
Set =
Collection of objects (aka members or elements). Repetitions do not count as additional elements. Order does not matter.
28
Finite set
All elements can be completely counted
29
Infinite Set
The elements cannot all be counted. E.g.: the set of all integers
30
Empty set
Has no elements/members. Denoted by ∅
31
Non Empty Set =
A set with 1 or more members/elements
32
Subset
A set whose elements all also appear in another set. Example: A and B are sets. All the elements in Set A are also in Set B. Therefore A is a SUBSET of B. Set A = {2, 8}, Set B = {0, 2, 4, 6, 8}
33
∅ is a subset of ______
∅ is a subset of every set
34
List =
Like a set, but the order of the elements matters. Can have repeating elements (unlike a set)
35
Intersections
A set formed from the elements that appear in both of 2 other sets. Example: the intersection of X and Y (written X ∩ Y) = all the elements that appear in both Set X and Set Y
36
Union
A set made up of all of the elements in 2 other sets (don't include elements twice). Example: the union of X and Y (written X ∪ Y) = all of the elements that are in Set X, in Set Y, or in both. If the sets are mutually exclusive: |X ∪ Y| = |X| + |Y|. If the sets can intersect: use the inclusion-exclusion principle
37
If sets have no elements in common they are said to be ____
If sets have no elements in common they are said to be mutually exclusive (or disjoint). Written as X ∩ Y = ∅
38
Inclusion-exclusion principle
|A ∪ B| = |A| + |B| - |A ∩ B| (used when the sets can intersect)
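
A quick way to check the principle is with Python's built-in sets, where `&` is intersection and `|` is union. The two example sets and the function name are hypothetical.

```python
def union_size(a, b):
    """|A ∪ B| computed as |A| + |B| - |A ∩ B|."""
    return len(a) + len(b) - len(a & b)


# Hypothetical sets that share one element (2).
a = {0, 2, 4, 6, 8}
b = {2, 3, 5, 7}

print(union_size(a, b), len(a | b))   # both count the union's elements
```

Subtracting |A ∩ B| removes the elements that would otherwise be counted twice.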
39
Multiplication principle
If K = number of different possibilities for the first choice and M = number of different possibilities for the second choice (independent of the first choice), then KM = number of different possibilities for the pair of choices. Example: 5 meals and 3 desserts = 15 combos. (Note: there can be more than 2 choices.)
40
Permutation
An ordering of elements. Example: how many permutations of the letters A, B and C are there? (Answer: 3! = 6)
41
Factorial
n! = n(n-1)(n-2)(n-3)...(1). Example: 3! = (3)(3-1)(3-2) = (3)(2)(1) = 6
42
Solving Permutation problems
1. Find the number of elements (n)
2. Calculate n!
43
No. of permutations (objects ARE placed in order) of k objects taken from a set of n objects, also written as: permutations of n objects taken k at a time
nPk = n!/(n-k)! Example: how many 5-digit positive integers can be made using the digits 1, 2, 3, 4, 5, 6, 7 if no digit can occur more than once? 1. n = 7, k = 5. 2. 7!/(7-5)! = 2,520
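The formula can be checked against a brute-force count. This sketch assumes Python 3.8+ for `math.perm`; the helper name `n_p_k` is hypothetical.

```python
import math
from itertools import permutations


def n_p_k(n, k):
    """Number of ordered selections of k objects from n: n! / (n-k)!."""
    return math.factorial(n) // math.factorial(n - k)


# Brute force: list every ordered arrangement of 5 distinct digits from 1-7.
arrangements = list(permutations("1234567", 5))

print(n_p_k(7, 5), math.perm(7, 5), len(arrangements))
```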
44
No. of combinations (objects are NOT placed in order) of k objects taken from a set of n objects, also written as: combinations of n objects taken k at a time
nCk = n!/(k!(n-k)!) Example: how many ways are there to select a 3-person committee from a group of 9? 1. n = 9, k = 3. 2. 9!/(3!(9-3)!) = 84
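The same kind of check works for combinations, where order does not matter. This sketch assumes Python 3.8+ for `math.comb`; the helper name `n_c_k` is hypothetical.

```python
import math
from itertools import combinations


def n_c_k(n, k):
    """Number of unordered selections of k objects from n: n! / (k!(n-k)!)."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))


# Brute force: list every 3-person committee from 9 people (numbered 0-8).
committees = list(combinations(range(9), 3))

print(n_c_k(9, 3), math.comb(9, 3), len(committees))
```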
45
Permutations nPk = Combinations nCk =
Permutations (nPk) = the number of ways to select AND ORDER k objects from a set of n objects. Combinations (nCk) = the number of subsets of a set of n objects that contain k objects
46
Sample Space
Set of all possible outcomes
47
Event
A particular set of outcomes
48
Probability that event (E) occurs =
P(E) = no. of outcomes that satisfy E / total no. of possible outcomes (when all outcomes are equally likely)
49
If event E is certain to occur P(E) =
P(E) = 1
50
If event E is certain not to occur P(E) =
P(E) = 0
51
If event E is possible but not certain
0 < P(E) < 1
52
Probability event E won't happen =
1 - P(E)
53
Sum of probabilities of all possible outcomes =
1
54
The probability that both event E and F occur =
If events E and F are independent: P(E and F) = P(E)P(F). If events E and F are mutually exclusive (cannot occur at the same time): P(E and F) = 0 (impossible)
55
The probability that event E or F or Both occur
In general: P(E or F) = P(E) + P(F) - P(E and F). If events E and F are independent, P(E and F) = P(E)P(F), so P(E or F) = P(E) + P(F) - P(E)P(F). If events E and F are mutually exclusive (cannot occur at the same time): P(E or F) = P(E) + P(F)
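These rules can be checked by listing a small sample space. The sketch below assumes two fair six-sided dice with all 36 outcomes equally likely; the events and function names are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two dice.
sample_space = list(product(range(1, 7), repeat=2))


def probability(event):
    """Outcomes that satisfy the event / total possible outcomes."""
    favourable = [o for o in sample_space if event(o)]
    return Fraction(len(favourable), len(sample_space))


def e(outcome):
    return outcome[0] == 6        # E: first die shows 6


def f(outcome):
    return outcome[1] % 2 == 0    # F: second die is even (independent of E)


def g(outcome):
    return outcome[0] == 1        # G: first die shows 1 (excludes E)


p_e, p_f, p_g = probability(e), probability(f), probability(g)
p_e_and_f = probability(lambda o: e(o) and f(o))
p_e_or_f = probability(lambda o: e(o) or f(o))
p_e_and_g = probability(lambda o: e(o) and g(o))
p_e_or_g = probability(lambda o: e(o) or g(o))

print(p_e_and_f, p_e_or_f, p_e_and_g, p_e_or_g)
```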