Data Analysis Flashcards by B Daniel

Quantitative or numerical variables

Result is a number (age, height, etc.)

Categorical or nonnumerical variables

Result is something other than a number (eye color, person voted for, etc.)

Frequency or count

Number of times a value or category appears in the data

Relative frequency

Frequency of a value divided by the total number of data points (expressed as a fraction, decimal, or percent)
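
As a rough illustration, counts and relative frequencies can be tabulated as in the sketch below. The helper name `frequency_table` and the eye-color data are made up for this example.

```python
from collections import Counter

def frequency_table(values):
    """Map each distinct value to (count, relative frequency)."""
    counts = Counter(values)
    total = len(values)
    return {value: (count, count / total) for value, count in counts.items()}

# Made-up observations of a categorical variable (assumption, not from the deck).
eye_colors = ["brown", "blue", "brown", "green", "brown", "blue"]
table = frequency_table(eye_colors)

for color, (count, relative) in table.items():
    print(f"{color}: count={count}, relative frequency={relative:.1%}")
```

The relative frequencies always add up to 1 (100%), which is a quick sanity check on a table like this.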

Histograms (4 things)

Show interval data (often as percentages or relative frequencies). Unlike bar graphs, there are NO gaps between bars; a gap indicates no data for that interval. Useful for identifying the shape or spread of data.
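
A minimal sketch of how data are grouped into intervals for a histogram, assuming equal-width intervals. The function name `bin_counts` and the ages are hypothetical.

```python
def bin_counts(values, start, width, bins):
    """Count how many values fall in each interval [start + i*width, start + (i+1)*width)."""
    counts = [0] * bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < bins:
            counts[index] += 1
    return counts

# Made-up ages (assumption, not from the deck).
ages = [21, 23, 25, 31, 34, 35, 38, 52, 55]
counts = bin_counts(ages, start=20, width=10, bins=4)

for i, count in enumerate(counts):
    low = 20 + 10 * i
    print(f"[{low}, {low + 10}): {'#' * count} {count / len(ages):.0%}")
```

The empty 40 to 50 interval prints with no bar, which is the kind of gap that signals no data for that interval.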

Measures of central tendency

Goal: find the “center” of the data. Mean, median, and mode.

Weighted mean

Multiply each DIFFERENT value by its frequency, add the products, and divide by the total of the frequencies (the number of data points, not the number of different values)

Ex: 2, 4, 5, 5, 6, 6, 6, 7, 9
[(2) + (4) + 2(5) + 3(6) + (7) + (9)] / 9 = 50 / 9 = 5.556
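
As a sketch, the weighted mean of the example list can be computed by multiplying each distinct value by its frequency and dividing by the total frequency (9 data points in all). The function name `weighted_mean` is chosen for this example.

```python
from collections import Counter

def weighted_mean(values):
    """Sum of (value * frequency) divided by the sum of the frequencies."""
    counts = Counter(values)
    weighted_sum = sum(value * freq for value, freq in counts.items())
    total_freq = sum(counts.values())
    return weighted_sum / total_freq

data = [2, 4, 5, 5, 6, 6, 6, 7, 9]
print(round(weighted_mean(data), 3))  # 5.556
```

The result matches the ordinary mean of the full list, since the frequencies only group repeated values.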

Which measure of central tendency is least affected by outliers?

The median
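
A small sketch of why: replacing one value with an extreme outlier (900 is an arbitrary choice for illustration) moves the mean a lot but leaves the median where it was.

```python
from statistics import mean, median

data = [2, 4, 5, 5, 6, 6, 6, 7, 9]
# Replace the largest value with an arbitrary outlier.
with_outlier = data[:-1] + [900]

print("mean:  ", mean(data), "->", mean(with_outlier))
print("median:", median(data), "->", median(with_outlier))
```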

Measures of position (6)

Least, greatest, median, quartiles, percentiles (99 percentiles divide the data into 100 groups)

How to calculate the 1st and 3rd quartiles

Order the data and find the median. The 1st quartile is the median of the lower half of the data, and the 3rd quartile is the median of the upper half.

Measures of dispersion (3)

indicate the degree of spread of the data

range, interquartile range, standard deviation

Interquartile range

difference between 3rd quartile and 1st quartile (measures the spread of the middle half of the data; less susceptible to outliers)
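
A minimal sketch of the quartiles (median of each half of the ordered data) and the interquartile range. It assumes the overall median is left out of both halves when the number of values is odd; conventions differ between textbooks and software. The names `quartiles` and `iqr` are chosen for this example.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: with an odd count, the middle value belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

def iqr(values):
    """Interquartile range: Q3 minus Q1."""
    q1, _, q3 = quartiles(values)
    return q3 - q1

data = [2, 4, 5, 5, 6, 6, 6, 7, 9]
with_outlier = data[:-1] + [900]  # arbitrary outlier for illustration

print(quartiles(data), iqr(data))
print(max(data) - min(data), max(with_outlier) - min(with_outlier))  # range changes
print(iqr(with_outlier))                                             # IQR does not
```

Here the range jumps from 7 to 898 once the outlier is added, while the interquartile range stays at 2.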

How to find the standard deviation (5 steps)

find the mean
find the difference between the mean and each value
square each difference
find the average of the squared differences
take the square root of the average
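
The five steps above can be sketched as follows. This divides by the number of values (the population form of the standard deviation), which is what "average of the squared differences" describes; the sample data are made up.

```python
from math import sqrt

def standard_deviation(values):
    mean = sum(values) / len(values)                # 1. find the mean
    differences = [x - mean for x in values]        # 2. difference from the mean
    squared = [d * d for d in differences]          # 3. square each difference
    average_squared = sum(squared) / len(squared)   # 4. average the squares
    return sqrt(average_squared)                    # 5. take the square root

# Made-up data with mean 5 (assumption, not from the deck).
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_deviation(scores))  # 2.0
```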

The mean is X SD away from the mean.

The mean is 0 SD from the mean

Most data fall within X SD of the mean

3 SD

How many elements does set S have?

S = {1, 2, 3, 2}

3 (repetitions are not counted)

How many elements does set S have?

S = {3, 1, 2}

3

T/F: {1, 2, 3, 2} and {3, 1, 2} are the same set

True

In a set, repetitions…

and order…

repetitions are not counted

order does not matter

In a list, repetitions…

and order…

repetitions are counted

order does matter

T/F: 1, 2, 3, 2 and 1, 2, 2, 3 are the same list

False
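
Python's built-in `set` and `list` types happen to follow the same rules, so the set and list cards above can be checked with a short sketch:

```python
# Sets: repetitions are not counted and order does not matter.
set_a = {1, 2, 3, 2}
set_b = {3, 1, 2}
print(len(set_a), len(set_b))   # 3 3
print(set_a == set_b)           # True

# Lists: repetitions are counted and order matters.
list_a = [1, 2, 3, 2]
list_b = [1, 2, 2, 3]
print(len(list_a))              # 4
print(list_a == list_b)         # False
```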

A ∪ C =

The union of sets A and C: every element that is in A, in C, or in both

A ∩ C = ∅

Sets A and C are mutually exclusive (their intersection is empty, so they have no elements in common)

Inclusion-exclusion principle

The number of elements in the union of two sets equals the sum of their individual numbers of elements minus the number of elements in their intersection: |A ∪ B| = |A| + |B| - |A ∩ B|
(Think chemistry, algebra, physics problem)
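
A quick sketch of the principle with two made-up class rosters (the names are hypothetical). Students enrolled in both classes would be counted twice if the intersection were not subtracted.

```python
# Made-up rosters (assumption, not from the deck).
chemistry = {"Ana", "Ben", "Cleo", "Dev"}
physics = {"Cleo", "Dev", "Eli"}

union = chemistry | physics      # in chemistry, physics, or both
overlap = chemistry & physics    # in both (the intersection)

print(len(union))                                      # 5
print(len(chemistry) + len(physics) - len(overlap))    # 4 + 3 - 2 = 5
```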

Multiplication principle

Two choices made sequentially, where the second choice is independent of the first: if the first choice has k possibilities and the second has m, there are k × m possibilities in total. Ex: 5 entrées, 3 desserts = 15 different meal combinations
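
The meal example can be checked by listing every pairing, as in this sketch. The dish names are placeholders.

```python
from itertools import product

# Placeholder menu items (assumption, not from the deck).
entrees = ["pasta", "steak", "salmon", "curry", "tacos"]
desserts = ["cake", "pie", "sorbet"]

meals = list(product(entrees, desserts))  # every (entrée, dessert) pair
print(len(meals))                         # 15
print(len(entrees) * len(desserts))       # 15
```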

permutations of n objects taken k at a time (select and order k objects from a group of n objects)

n! / (n-k)!

Permutations vs. combinations

Permutations--order DOES matter (can't repeat or put back, etc.) Combinations--order does NOT matter

combinations of n objects taken k at a time (n choose k)

n! / (k! (n-k)!)
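
As a sketch, both formulas can be written with factorials and compared against `math.perm` and `math.comb` from the Python standard library (available in Python 3.8 and later). The function names and the values n = 5, k = 2 are arbitrary choices for this example.

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

def permutations_count(n, k):
    """n! / (n-k)!  (order matters)"""
    return factorial(n) // factorial(n - k)

def combinations_count(n, k):
    """n! / (k! (n-k)!)  (order does not matter)"""
    return factorial(n) // (factorial(k) * factorial(n - k))

n, k = 5, 2
print(permutations_count(n, k), perm(n, k))   # 20 20
print(combinations_count(n, k), comb(n, k))   # 10 10

# Listing the selections directly gives the same counts.
print(len(list(permutations(range(n), k))))   # 20
print(len(list(combinations(range(n), k))))   # 10
```

Each combination of 2 objects corresponds to 2! = 2 permutations, which is why the combination count is the permutation count divided by k!.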

Probability formula

Number of outcomes in the event (the outcomes that fit the condition) / total number of possible outcomes, assuming all outcomes are equally likely. Ex: probability of rolling an even number on a die: 3 (2, 4, 6) / 6 = 1/2
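
A minimal sketch of the formula for the die example, assuming a fair six-sided die so that all outcomes are equally likely. The helper name `probability` is chosen for this example.

```python
from fractions import Fraction

def probability(event, sample_space):
    """Outcomes in the event divided by total possible outcomes."""
    favorable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favorable), len(sample_space))

die = [1, 2, 3, 4, 5, 6]
even = {2, 4, 6}
p_even = probability(even, die)
print(p_even)  # 1/2
```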

If event E is certain to occur, then P(E) is...

If event E is certain NOT to occur, then P(E) is...

If an event is possible but not certain, then P(E) is...

1

0

between 0 and 1

The probability that an event will NOT occur is equal to...

1 - the probability that it will occur: P(not E) = 1 - P(E)

P(E or F) = (in general)

P(E) + P(F) - P(E and F)

P(E or F) = (when E and F are mutually exclusive)

P(E) + P(F) if E and F are mutually exclusive

P(E and F) =

P(E) P(F) if E and F are independent

What is the link between data distributions and probability distributions?

For a random variable that represents a randomly chosen value from a distribution of data, the probability distribution of the random variable is the same as the relative frequency distribution of the data.
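
A rough simulation of this link, assuming each data value is equally likely to be chosen. The data list, the number of draws, and the seed are arbitrary; drawn frequencies should land close to the relative frequencies but will not match them exactly.

```python
import random
from collections import Counter

data = [2, 4, 5, 5, 6, 6, 6, 7, 9]
relative_frequency = {v: c / len(data) for v, c in Counter(data).items()}

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = 100_000
sample_counts = Counter(rng.choice(data) for _ in range(draws))
empirical = {v: sample_counts[v] / draws for v in relative_frequency}

for value in sorted(relative_frequency):
    print(f"{value}: relative frequency={relative_frequency[value]:.3f}, "
          f"share of random draws={empirical[value]:.3f}")
```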

4 properties of a bell curve

1. mean, median, and mode are all nearly equal
2. data are grouped fairly symmetrically around the mean
3. 2/3 of data are within 1 SD of mean
4. almost all of the data are within 2 SD of the mean
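
As a sketch, properties 3 and 4 can be checked against an exact normal distribution, which is an idealized bell curve; real data only approximate it. This uses `statistics.NormalDist` from the Python standard library (Python 3.8 and later), and the helper name `share_within` is chosen for this example.

```python
from statistics import NormalDist

def share_within(k, dist=NormalDist()):
    """Fraction of a normal distribution lying within k SD of its mean."""
    upper = dist.cdf(dist.mean + k * dist.stdev)
    lower = dist.cdf(dist.mean - k * dist.stdev)
    return upper - lower

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

For the idealized curve the shares come out to roughly 68%, 95%, and 99.7%, consistent with "about two-thirds" and "almost all".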

Data Analysis Flashcards

(39 cards)