data management ch 1 Flashcards
population
the entire group being studied
sample
a selection of some individuals from the population
cross-sectional data
observational study made at a specific point in time
longitudinal data
variables measured repeatedly over a long period of time
raw data
unprocessed information collected for a study
qualitative variable
cannot be measured numerically
quantitative variable
can be measured numerically
discrete data
only quantitative
measured with whole numbers
continuous data
only quantitative
can take any value within a given range
ordinal variable
can be put into relative order
nominal variable
categories that cannot be ordered
what are the titles for the circle graph chart
the chart for the information for the graph
- title
- percent of what people like
- angle
how to find the angle for a circle graph
(the percent people like / 100) * 360 degrees
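a minimal sketch of the angle rule above in Python. the function name `sector_angle` and the food percentages are made up for illustration:

```python
def sector_angle(percent):
    """Return the angle in degrees for a sector that covers `percent` of the circle."""
    return percent / 100 * 360


# hypothetical survey results: percent of people who like each food
shares = {"pizza": 50, "tacos": 25, "sushi": 25}

# one angle per category; together they should fill the full 360 degrees
angles = {name: sector_angle(p) for name, p in shares.items()}
print(angles)
```

if the percents add up to 100, the angles add up to 360.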
how does a stem and leaf plot key work
give an example
you can use any random number and show how it would look in the stem and leaf plot
1|3 means 13
how does a pictograph key work
each (picture) represents % of (title)
what is a frequency table
a table that shows how many times each value (or interval) occurs
what are the titles for the frequency table chart
- # of __ / interval
- tally
- frequency
- midpoint (if you’re using intervals)
- cumulative frequency (if question asks)
- relative frequency (if question asks)
how to calculate cumulative frequency
adding up the frequency one row at a time
how to calculate relative frequency
frequency / total freq
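a short Python sketch of a frequency table with the cumulative and relative frequency columns described above. the data values are made up for illustration:

```python
from collections import Counter
from itertools import accumulate

data = [2, 3, 3, 4, 4, 4, 5]  # made-up raw data

counts = Counter(data)
values = sorted(counts)                 # first column: the distinct values
freqs = [counts[v] for v in values]     # frequency of each value
cumulative = list(accumulate(freqs))    # running total, one row at a time
total = sum(freqs)
relative = [f / total for f in freqs]   # frequency / total frequency

for row in zip(values, freqs, cumulative, relative):
    print(row)
```

the last cumulative frequency always equals the total frequency, and the relative frequencies add up to 1.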
what is the point of a bargraph and what is it used for
for categorical or discrete data
the bars do not touch, which indicates separation between groups
what is a frequency polygon
same information as a histogram, but drawn as points at the interval midpoints joined by lines, so it is simpler to look at
what is a cumulative frequency graph (ogive)
the running total from lowest to highest
what is a histogram
a bar graph but the bars are touching
!! the x-axis labels are the interval boundaries, not the intervals; each bar sits between two boundary values !!
how to use the brackets for intervals
“[” means the endpoint is included (≤)
“(” means the endpoint is excluded (<)
relative frequency polygons y values
go up from (0-1)
0.1, 0.2, 0.3
median
middle value
mean
sum of all the values / number of values
mode
the most frequently occurring value
simple mean equation
with sigma
x̄ = Σx / n
x̄ = sample mean
n = total # of values in the sample
Σ = “the sum of”
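a small Python sketch of the mean formula, with the median and mode from the standard `statistics` module for comparison. the data list is made up:

```python
import statistics

data = [2, 3, 3, 4, 8]  # made-up sample

n = len(data)              # n = number of values in the sample
x_bar = sum(data) / n      # x̄ = Σx / n

median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequently occurring value

print(x_bar, median, mode)
```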
weighted mean sample equation
X̄w = Σxw / Σw
w = weight of each value
when the weights are percents, the denominator will always equal 100%
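a minimal Python sketch of the weighted mean, assuming the weights are given as percents. the marks and weights are made up for illustration:

```python
def weighted_mean(values, weights):
    """Return Σ(x * w) / Σw for paired values and weights."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)


marks = [80, 70, 90]      # hypothetical marks
weights = [50, 30, 20]    # hypothetical weights in percent (they sum to 100)

print(weighted_mean(marks, weights))
```

here Σxw = 4000 + 2100 + 1800 = 7900 and Σw = 100, so the weighted mean is 79.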
what is the interquartile range (IQR)
range of the middle half of the data
how to find the IQR
Q3 - Q1 = IQR
how to find the upper threshold
for modified box and whisker plot
- find IQR * 1.5
- upper threshold = Q3 + (IQR * 1.5)
how to find the lower threshold
for modified box and whisker plot
- find IQR * 1.5
- lower threshold = Q1 - (IQR * 1.5)
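a short Python sketch of both thresholds, assuming Q1 and Q3 have already been found. the function name and the quartile values are made up:

```python
def outlier_thresholds(q1, q3):
    """Return (lower, upper) thresholds for a modified box and whisker plot."""
    iqr = q3 - q1        # IQR = Q3 - Q1
    step = 1.5 * iqr     # IQR * 1.5
    return q1 - step, q3 + step


# hypothetical quartiles
lower, upper = outlier_thresholds(10, 18)
print(lower, upper)
```

any value below the lower threshold or above the upper threshold is plotted as an outlier.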
how to find the semi-interquartile range (SIQR)
SIQR = IQR / 2 = (Q3 - Q1) / 2
population deviation formula
x - μ
μ = population mean
x = point of data
sample deviation formula
x - x̄
x = point of data
x̄ = sample mean
population variance formula
σ^2 = Σ(x-μ)^2 / N
σ = population standard deviation (σ^2 = population variance)
Σ = “the sum of”
μ = population mean
N = # of elements in population
population standard deviation formula
σ = √[Σ(x-μ)^2 / N]
σ = population standard deviation
Σ = “the sum of”
μ = population mean
N = # of elements in population
sample variance formula
s^2 = Σ(x-x̄)^2 / (n - 1)
s = sample standard deviation (s^2 = sample variance)
Σ = “the sum of”
x̄ = sample mean
n = # of elements in sample
sample standard deviation formula
s = √[Σ(x-x̄)^2 / (n - 1)]
s = sample standard deviation
Σ = “the sum of”
x̄ = sample mean
n = # of elements in sample
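a Python sketch that works through the four variance and standard deviation formulas on one made-up data set. the only difference between population and sample is dividing by N or by n - 1:

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data

N = len(data)
mean = sum(data) / N                              # 5.0
sum_sq_dev = sum((x - mean) ** 2 for x in data)   # Σ(x - mean)^2 = 32.0

# treat the data as a whole population: divide by N
pop_variance = sum_sq_dev / N
pop_sd = math.sqrt(pop_variance)

# treat the data as a sample: divide by n - 1
sample_variance = sum_sq_dev / (N - 1)
sample_sd = math.sqrt(sample_variance)

print(pop_variance, pop_sd)
print(sample_variance, sample_sd)
```

the standard `statistics` module has `pvariance`, `pstdev`, `variance` and `stdev`, which should give the same results.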
population standard deviation for grouped data formula
σ = √[Σf * (x-μ)^2 / N]
f = frequency of each value (N = Σf)
when finding the standard deviation we use a table to stay organized. what are the titles for the table
- x
- x - x̄
- (x - x̄) ^2
change x̄ to μ for a population instead of a sample, if applicable
when finding the weighted standard deviation we use a table to stay organized. what are the titles for the table
there are 6 titles
- x
- frequency (f)
- cumulative frequency
- x - x̄
- (x - x̄)^2
- f * (x - x̄)^2
change x̄ to μ for a population instead of a sample, if applicable
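a Python sketch that builds the table columns above for grouped data and then takes the population standard deviation. the values and frequencies are made up, and the cumulative frequency column is left out because the calculation does not use it:

```python
import math

values = [1, 2, 3]   # made-up x values
freqs = [1, 2, 1]    # made-up frequency (f) of each value

N = sum(freqs)                                         # total frequency
mean = sum(f * x for x, f in zip(values, freqs)) / N   # weighted mean

rows = []
for x, f in zip(values, freqs):
    dev = x - mean                                     # x - mean
    rows.append((x, f, dev, dev ** 2, f * dev ** 2))   # one table row

sum_f_sq_dev = sum(row[4] for row in rows)             # Σ f * (x - mean)^2
sigma = math.sqrt(sum_f_sq_dev / N)                    # population standard deviation

for row in rows:
    print(row)
print(sigma)
```

for a sample, the last step would divide by N - 1 instead of N.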
how to get the deviation graph threshold
(mean ± standard deviation)
μ - σ = threshold -1
μ = threshold 0
μ + σ = threshold 1
μ + 2σ = threshold 2
sample z-score formula
z = (x - x̄) / s
(deviation / standard deviation)
population z-score formula
z = (x - μ) / σ
(deviation / standard deviation)
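a minimal Python sketch of the z-score. the same function works for a sample (x̄, s) or a population (μ, σ). the test score, mean and standard deviation are made up:

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x is from the mean."""
    return (x - mean) / sd


# hypothetical test score of 85 when the mean is 70 and the standard deviation is 10
z = z_score(85, 70, 10)
print(z)
```

a positive z-score is above the mean, and a negative z-score is below it.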
what is an index
relates the value of a variable (or group of variables) to its value on a particular base date
how to find the slope
m = (y2 - y1) / (x2 - x1)
growth/fall factor formula
in a stock graph, for example
(new number / old number) * 100 = the new value as a percent of the old value
percent change formula
- find (new number / old number) * 100
- subtract 100 from the answer in step 1 (positive = growth, negative = fall)
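a short Python sketch of the two percent change steps. the function name and the stock prices are made up:

```python
def percent_change(old, new):
    """Return the percent change from old to new (negative means a fall)."""
    percent_of_old = new / old * 100   # step 1: new value as a percent of the old value
    return percent_of_old - 100        # step 2: subtract 100


# hypothetical stock prices
print(percent_change(40, 50))   # price went up
print(percent_change(80, 60))   # price went down
```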
how to find how much your stock will go up depending on how much you invest
formula
multiply the money invested by the rate of change
(money invested) * (y2 - y1) / (x2 - x1) = $$
random sampling
any method that is 100% random, so everyone has an equal chance of being picked
* random number generator
* pick out of a hat
systematic random sampling
what is it, what formula do you have to use and why
- every nth person
- have to pick a random starting point
n = (population size) / (sample size) - n is the number of people you jump
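a Python sketch of systematic random sampling, assuming the population is a simple list. the function name, the list of student numbers, and the seed are made up for illustration:

```python
import random


def systematic_sample(population, sample_size, rng=random):
    """Pick every nth member after a random starting point."""
    step = len(population) // sample_size   # n = population size / sample size
    start = rng.randrange(step)             # random starting point within the first n members
    return population[start::step][:sample_size]


students = list(range(1, 101))              # hypothetical student numbers 1 to 100
sample = systematic_sample(students, 10, random.Random(0))
print(sample)
```

the random starting point is what keeps the sample random; after that, every jump is the same size.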
stratified random sampling
- population is divided into subgroups based on qualities
1. ‘relative frequency’ is ->
# of students / total # of students
2. ‘# surveyed in sample’ is
RF * (sample size)
sample size = the chosen % of the total # of students
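a Python sketch of the two stratified sampling steps, assuming the subgroups are grades. the grade sizes and the 10% sample size are made up:

```python
# hypothetical number of students in each grade
grade_sizes = {"grade 9": 300, "grade 10": 250, "grade 11": 250, "grade 12": 200}

total = sum(grade_sizes.values())   # total # of students
sample_size = total * 10 // 100     # survey 10% of all students (assumption)

# step 1: relative frequency of each grade
relative_freq = {g: n / total for g, n in grade_sizes.items()}

# step 2: # surveyed from each grade = RF * sample size (written as n * sample_size / total)
surveyed = {g: round(n * sample_size / total) for g, n in grade_sizes.items()}

print(relative_freq)
print(surveyed)
```

with other numbers, rounding can make the counts add up to slightly more or less than the sample size, so the totals should be checked.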
cluster random sampling
- divide population into groups
- randomly select a few of the many groups
- survey everyone in the group
unreliable if the chosen clusters do not represent the whole population
multistage random sampling
- multiple levels of random sampling
bias can occur because some areas around the world are not diverse
1. randomly choose city
2. randomly choose block
3. randomly choose houses within the block
convenience sampling - non random
- asking whoever is easiest to reach
- bias from unrepresented data -> only ask friends
voluntary sampling - non random
- people who willingly take the survey
- bias from super strong opinions of hate/ love
- people who don't care don't respond
what is a bias
occurs when a sample is not representative of the population
sampling bias
the sample does not accurately represent the population
* at a football game, asking whether to buy football or band equipment
household bias
different groups are not polled proportionally to their size
* 10 students sampled from each grade but there are more gr 9 students
measurement bias
the way data was collected influences the results
also happens when something is unnatural or unclear
* sign says slow down but you’re trying to find how many people speed
leading question bias
pushes people to answer in a certain way
* what are your fav songs, give 3 options
loaded question bias
certain words that imply a certain response
* do you really intend…
non-response bias
people choose not to participate
also non participation of certain groups
* only one group of students responds to a survey about school activities
response bias
feel embarrassed to give honest answers
also poorly written questions
* do you do illegal things (not anonymous)