exam 1 Flashcards

1
Q

statistics

A

a field of mathematics that develops and studies methods to collect, analyze, interpret, and present empirical evidence

2
Q

empirical vs anecdotal evidence

A

empirical - information received from the observation or measurements of patterns using experimentation
anecdotal - evidence collected in a casual or informal manner that relies heavily on personal testimony or conclusions (not statistical data collection)

3
Q

data

A

a collection of numerical facts or information from which conclusions can be drawn

4
Q

raw data

A

unformatted data (numerical measurements, instrument readings, text) that has not been processed or analyzed

5
Q

replicates

A

parallel measurements of a phenomenon to estimate variability in your sample (the number of replicates = n)

6
Q

sampling effort

A

the amount of data collected (e.g. the number of samples or replicates); asks "how much data do we need?"

7
Q

precision and accuracy

A

precision - how fine the divisions on a scale of measurement are
accuracy - how close to the truth our measurement is
(accuracy is the priority)

8
Q

descriptive statistics

A

quantitative description of observations sampled from a population (mathematically summarizing patterns, data centers, and variability without making conclusions about overall meaning of data)

9
Q

data distribution (histogram)

A

sampled observations grouped into rank-ordered intervals (bins) and presented graphically as the frequency of values in each bin

10
Q

normal distribution

A

an arrangement of data in which most values cluster in the middle of the range and the rest taper off symmetrically toward either extreme

11
Q

log-normal distribution

A

data are clustered at low values, but there are some much higher values (positive skew)
(can be made normal by applying a logarithm function)
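The effect of the log transform can be checked with a small simulation. This is only a sketch: the data are simulated, the seed is arbitrary, and `skewness` is a helper written for this example (0 means symmetric, positive means a long right tail).

```python
import math
import random
import statistics

def skewness(values):
    """Mean cubed z-score: 0 for symmetric data, > 0 for positive skew."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

rng = random.Random(1)  # fixed seed so the run is repeatable
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]  # simulated log-normal data
logged = [math.log(v) for v in raw]  # apply the logarithm to every value

print(round(skewness(raw), 2))     # clearly positive
print(round(skewness(logged), 2))  # close to 0
```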

12
Q

central tendency

A

numeric value describing a central position in a dataset. mean, median, and mode are valid measures

13
Q

skew

A

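positive (long tail toward high values), negative (long tail toward low values), or none (symmetric, as in a normal distribution)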

14
Q

central limit theorem

A

if a population with finite variance is sufficiently sampled, the mean of all the sample means will be approximately equal to the mean of the population, AND the distribution of the sample means will approach a normal distribution
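A quick simulation can make the theorem concrete. This is a sketch under assumptions: the exponential population, the sample size of 30, and the 1000 repeats are all made up for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical skewed (non-normal) population: exponential with mean 1
population = [rng.expovariate(1.0) for _ in range(20000)]

def sample_means(pop, n, repeats, rng):
    """Draw `repeats` random samples of size n and return the mean of each."""
    return [statistics.fmean(rng.sample(pop, n)) for _ in range(repeats)]

means = sample_means(population, n=30, repeats=1000, rng=rng)

pop_mean = statistics.fmean(population)
# Mean of the sample means vs. the population mean
print(round(statistics.fmean(means), 2), round(pop_mean, 2))
# Spread of the sample means vs. population SD / sqrt(n)
print(round(statistics.stdev(means), 3),
      round(statistics.pstdev(population) / 30 ** 0.5, 3))
```

Plotting `means` as a histogram should show a roughly bell-shaped curve even though the population itself is strongly skewed.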

15
Q

main steps in the scientific method

A

planning - what are you going to do? learn the system, develop ideas about how the system works (maybe do a pilot study), decide hypothesis, figure out what data you will need
recording - collect and properly record data, can take many forms, must record extremely carefully
analysis - interrogate data to test hypothesis, analysis cannot be successful if data gathering was not designed with analysis in mind, should allow you to accept or reject null
reporting - disseminating methods and media will depend on the type of work and audience, statistical results must be reported using proper conventions, graphs must be properly labelled

16
Q

types of data

A

continuous - data that can take any value (usually measured)
discrete - numerical data that can take a limited number of values (often counted)
ordinal - data in categories that can be placed in order, but magnitude of difference between categories is not fixed
categorical - data in categories that can’t be usefully ordered

17
Q

null and alternative hypothesis

A

null hypothesis - no change, difference, or effect (Ho)
alternative hypothesis - what you want to show (Ha or H1)

18
Q

sampling strategies

A

random - best choice
systematic - transects (sampling on a created line)
mixed - stratified random sampling
haphazard - used when random sampling is not practical
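The first three strategies can be sketched in a few lines. Everything here is hypothetical: the 100 numbered sampling units, the two strata, and the sample sizes are invented for illustration.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable
plots = list(range(1, 101))  # hypothetical sampling units numbered 1..100

# random: every unit has the same chance of being chosen
random_sample = rng.sample(plots, 10)

# systematic: every 10th unit along a line (transect), starting at unit 5
systematic_sample = plots[4::10]

# mixed (stratified random): split into strata, then sample randomly within each
strata = {"low": plots[:50], "high": plots[50:]}
stratified_sample = {name: rng.sample(units, 5) for name, units in strata.items()}

print(sorted(random_sample))
print(systematic_sample)
print(stratified_sample)
```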

19
Q

mean, median, mode

A

mean - the sum of the observations divided by the number of observations in the sample
median - the middle score for the sampled data that has been arranged by order of magnitude
mode - the most frequent score in a sampled dataset
(equations)
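A small worked example of the three measures, using a made-up sample and Python's standard `statistics` module:

```python
import statistics

scores = [2, 3, 3, 5, 7, 10]  # made-up sample, already in rank order

mean = sum(scores) / len(scores)    # 30 / 6
median = statistics.median(scores)  # even n: average of the two middle values, 3 and 5
mode = statistics.mode(scores)      # most frequent value

print(mean, median, mode)  # 5.0 4.0 3
```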

20
Q

data in quartiles

A

divide ranked data into quarters and use the five-number summary
steps -
rank data from smallest to largest
smallest value is the first number (minimum), largest is the fifth (maximum)
median is the third
median of the lower half is the second (first quartile), median of the upper half is the fourth (third quartile)
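The steps above can be sketched as a short function. `five_number_summary` is a name made up for this example, and it assumes the middle value is left out of both halves when n is odd. Statistical software may interpolate quartiles differently, so results can differ slightly.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    ranked = sorted(values)        # rank data from smallest to largest
    n = len(ranked)
    half = n // 2
    lower = ranked[:half]          # values below the median position
    upper = ranked[half + n % 2:]  # values above it (middle value skipped when n is odd)
    return (ranked[0],
            statistics.median(lower),
            statistics.median(ranked),
            statistics.median(upper),
            ranked[-1])

print(five_number_summary([7, 1, 9, 3, 5, 11, 13]))  # (1, 3, 7, 11, 13)
```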

21
Q

dividing by n-1 to calculate variance

A

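penalty for having a small number of replicates: deviations are measured from the sample mean rather than the true population mean, so dividing by n would underestimate the variance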
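A tiny made-up sample shows how much the divisor matters when n is small:

```python
import statistics

sample = [4, 6, 8]  # made-up replicate measurements, n = 3
n = len(sample)
mean = sum(sample) / n                     # 6.0
ss = sum((x - mean) ** 2 for x in sample)  # sum of squares: 4 + 0 + 4 = 8.0

print(ss / n)        # dividing by n gives the smaller value
print(ss / (n - 1))  # dividing by n - 1 gives 4.0

# statistics.variance() uses the n - 1 divisor, statistics.pvariance() uses n
print(statistics.variance(sample), statistics.pvariance(sample))
```

With three replicates the two divisors differ by a third; with hundreds of replicates the difference becomes negligible.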

22
Q

shapiro-wilk test and how to interpret

A

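takes a data distribution and determines whether it is significantly different from a normal distribution (Ho: the data are normally distributed)
p-value < 0.05 = not normal, reject Ho; p-value ≥ 0.05 = no evidence against normality, fail to reject Ho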

23
Q

standard error of the mean (SEM) (def and equation)

A

estimate of how close the sample mean is likely to be to the true population mean
the standard deviation of the means obtained by repeatedly resampling the population
$SE_{\bar{x}} = \frac{s_x}{\sqrt{n}}$
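A minimal sketch of the calculation, with made-up data and a helper name (`standard_error`) chosen for this example:

```python
import math
import statistics

def standard_error(values):
    """SEM = sample standard deviation (n - 1 divisor) / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample: mean 5, sum of squares 32, n = 8
print(round(standard_error(data), 3))  # 0.756
```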

24
Q

types of project

A

descriptive -
differences - is A different from B? used when there is a categorical variable and you want to know if the response variable differs between categories; shown with bar charts and box and whisker plots
correlations - links between variables, usually both the independent and the dependent variables are quantitative
associations - similar to correlations but with categorical data

25
Q

how to calculate mean

A

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

26
Q

how to calculate median

A

the middle value of the rank-ordered data (with an even number of observations, the mean of the two middle values)

27
Q

how to calculate mode

A

the most frequently occurring value in the data

28
Q

how to calculate range

A

rank-order the observations, then subtract the lowest from the highest (range = highest - lowest)

29
Q

how to calculate variance

A

$s_x^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$ OR $s_x^2 = \frac{SS}{n-1}$

30
Q

how to calculate standard deviation

A

$s_x = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$ OR $s_x = \sqrt{\frac{SS}{n-1}}$

31
Q

how to calculate standard error

A

$SE_{\bar{x}} = \frac{s_x}{\sqrt{n}}$

32
Q

copy>paste special> paste values in excel

A

pastes the calculated values as plain numbers rather than the formulas in the cells

33
Q

$ in excel

A

makes a cell reference absolute (e.g. $A$1), so the row and/or column stays fixed when the formula is copied or filled to other cells

34
Q

sum in excel

A

=SUM(array)

35
Q

count in excel

A

=COUNT(array)

36
Q

mean in excel

A

=AVERAGE(array)

37
Q

median in excel

A

=MEDIAN(array)

38
Q

mode in excel

A

=MODE(array)

39
Q

variance in excel

A

=VAR(array)

40
Q

standard deviation in excel

A

=STDEV(array)

41
Q

standard error of the mean in excel

A

=STDEV(array)/SQRT(COUNT(array))

42
Q

list in R

A

ls()

43
Q

remove in R

A

rm(objectname)

44
Q

quit in R

A

q()

45
Q

import .csv in R

A

objectname <- read.csv(file.choose())

46
Q

sum of values in a column in R

A

> sum(objectname$variablename)

47
Q

number of values in a column in R

A

> length(objectname$variablename)

48
Q

mean of values in a column in R

A

> mean(objectname$variablename)

49
Q

median of values in a column in R

A

> median(objectname$variablename)

50
Q

quartiles in R

A

> summary(objectname$variablename)

51
Q

plot quartiles as boxplot in R

A

> boxplot(objectname$variablename)

52
Q

plot histogram in R

A

> hist(objectname$variablename)

53
Q

variance in R

A

> var(objectname$variablename)

54
Q

standard deviation in R

A

> sd(objectname$variablename)

55
Q

shapiro-wilk test for normality in R

A

> shapiro.test(objectname$variablename)