General Flashcards

Question

Normal Data Set

Answer 1

This is a data set where mean = median = mode and where approximately 68% of the data lies within x̄ ± s, 95% lies within x̄ ± 2s, and 99.7% lies within x̄ ± 3s.
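The 68 / 95 / 99.7 figures can be checked against an ideal normal distribution. This is a minimal sketch using Python's standard library; the helper name `fraction_within` is made up for illustration, and a real data set will only match these fractions approximately.

```python
from statistics import NormalDist

def fraction_within(k: float) -> float:
    """Fraction of an ideal normal distribution within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {fraction_within(k):.4f}")
```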

Answer 2

This is ranked data that can be numerical, but the intervals separating the data are not equal (Ex: Mohs scale of hardness). Values also cannot be negative.

Answer 3

These are data sets collected to understand how one variable influences a different variable.

Answer 4

This is the total collection of elements that we want to investigate. It is usually too large for us to investigate each of the contained elements.

Answer 5

These are models that help us understand the validity of our conclusions by assigning probabilities to finding our results. They act as the basis of statistical inference; if an inference cannot be checked using a probability model, then we cannot conclude the inference is legitimate.

Answer 6

If the slope relating y and x is \< 0 then r \< 0, and vice versa. The absolute value of r indicates the linearity of the relationship. If r is for (x, y), where w = a + bx and z = c + dy, then r(x,y) = r(w,z), provided b and d have the same sign (if their signs differ, r changes sign).
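The behaviour of r under linear rescaling can be checked numerically. This is a small sketch with made-up data; `pearson_r` is a hand-written helper for illustration, not a library call, and the constants a, b, c, d are arbitrary.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data with a positive slope
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

w = [3 + 2 * xi for xi in x]              # w = a + b*x with a = 3, b = 2
z = [-1 + 0.5 * yi for yi in y]           # z = c + d*y with c = -1, d = 0.5
z_flipped = [-1 - 0.5 * yi for yi in y]   # same, but d = -0.5

r_xy = pearson_r(x, y)
r_wz = pearson_r(w, z)                    # b and d share a sign: r unchanged
r_flipped = pearson_r(w, z_flipped)       # b and d differ in sign: r flips sign
```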

Answer 7

This is data that is greater than zero and on a scale where each interval is spaced evenly. Examples include weight and length.

Answer 8

This is f/n, where f is the number of occurrences of a given phenomenon and n is the total number of observations investigated. The sum of f/n over all phenomena = 1.
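A quick sketch of f/n with invented observations (the mineral names are placeholders, not data from the deck):

```python
from collections import Counter

# Made-up observations; each entry is one observed phenomenon
observations = ["quartz", "feldspar", "quartz", "mica", "quartz", "feldspar"]

counts = Counter(observations)   # f for each phenomenon
n = len(observations)            # total number of observations
relative_freq = {name: f / n for name, f in counts.items()}
```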

Answer 9

The data value such that 100\*p% of the data lies at or below it (the count includes that data point itself). Here p = the proportion expressed as a decimal.

Answer 10

s² = Σ (x_i - x̄)² / (n - 1). This fundamentally finds the average value of the squared difference between each data point and the mean (hypotenuse). The difference is squared so that only its magnitude counts and positive and negative deviations do not cancel.
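The formula can be written out directly and compared with the standard library. The data values below are made up for illustration, and `sample_variance` is a hypothetical helper name.

```python
from statistics import variance

def sample_variance(data):
    """s^2 = sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up values, mean = 5
s2 = sample_variance(data)   # squared deviations sum to 32, so s^2 = 32/7
```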

Answer 11

These are subgroups of populations which ideally represent the population

Answer 12

Regular sampling is where sampling occurs in evenly distributed plots. Uniform sampling occurs by taking random samples within a defined area. Clustered sampling takes samples from an outcrop or other area of limited exposure.

Answer 13

These are x vs y plots where y is the dependent variable (DV).

Answer 14

This includes all techniques for manipulating a signal to minimize the noise. They are most often used in combination with time series analyses to make sense of things like geophysical data.

Answer 15

This is a suite of techniques used to understand how observations relate to one another in two- or three-dimensional space.

Answer 16

This is data that is collected in either two- or three-dimensional space and represents the occurrence of something in space. Ex: Spatial distribution of a tracer in water.

Answer 17

Central tendencies are influenced by addition/subtraction of a constant; spread is not. Multiplication by a constant c changes the variance by the constant squared (c²) and the standard deviation by |c|.
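A short numerical check of the shift and scale behaviour, using made-up numbers and an arbitrary constant:

```python
from statistics import mean, variance

data = [3.0, 5.0, 8.0, 12.0]           # made-up values
shifted = [x + 10 for x in data]       # add a constant
scaled = [3 * x for x in data]         # multiply by a constant c = 3

mean_shift = mean(shifted) - mean(data)          # the mean moves by the constant
var_shift = variance(shifted) - variance(data)   # the variance does not move
var_ratio = variance(scaled) / variance(data)    # the variance scales by c squared
```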

Answer 18

This is the art of learning from data and includes the collection and description of data and inference from it.

Answer 19

These are plots for small to medium datasets whose values have two parts that can be separated into a stem and a leaf.

Answer 20

This is understanding data sequences as a function of time. It can also include periodic oscillatory data.

Answer 21

This is reserved for data that is independent, meaning that no outcome under analysis influences any other.

Answer 22

These can be displayed using line graphs, bar graphs, or frequency polygons

Answer 23

This can be shown on a table, line graph, bar chart, relative frequency polygon, or a pie chart.

Answer 24

We need to be able to display data in a way that makes it interpretable and meaningful. It should be intuitive.

Answer 25

These are most common in very large datasets or the conglomeration of datasets

Answer 26

1. Arrange the data from least to greatest. n = number of data points.
2. Find n\*p. The np-th smallest value is the one that satisfies the 100p criterion.
3. IF np is not an integer, then ROUND UP and the value at that position is the 100p percentile. IF np is an integer, then the 100p percentile is the average of the np-th and (np+1)-th smallest values.

This is like the median: if n is an even number, you use the average of the n/2 and n/2+1 values.
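The steps above can be sketched as a small function. This follows the round-up / average rule described on this card; other textbooks and libraries use different interpolation rules, so results may differ from, for example, a spreadsheet percentile. The name `sample_percentile` is made up for illustration.

```python
import math

def sample_percentile(data, p):
    """100p percentile using the round-up / average rule (0 < p < 1)."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    ordered = sorted(data)            # step 1: least to greatest
    n = len(ordered)
    if n == 0:
        raise ValueError("data must not be empty")
    position = n * p                  # step 2: find n*p
    nearest = round(position)
    if math.isclose(position, nearest):
        # step 3, integer case: average the np-th and (np+1)-th smallest values
        return (ordered[nearest - 1] + ordered[nearest]) / 2
    # step 3, non-integer case: round up and take the value at that position
    return ordered[math.ceil(position) - 1]
```

For the values 1 through 10, p = 0.5 gives np = 5, so the result is the average of the 5th and 6th values; p = 0.25 gives np = 2.5, which rounds up to the 3rd value.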

Answer 27

These are the 25th, 50th, and 75th percentiles. You find which value of the data set meets each criterion by finding np where p = .25, .5, .75; if np is a whole number, use the average of the np-th and (np+1)-th values, otherwise round np up.

Answer 28

Generally, the mean gives a better understanding of the dataset in terms of describing the data. The median should be used when probabilities are involved and/or the value is being used to understand the order of a group. Ex: Housing. The mean income would be best for determining what the average person in an area can spend on a home, but if we want to design housing that we could expect 50% of the population to afford (P(Affordable) = .5), then the median is more useful.

Answer 29

S = sample space = all possible outcomes to some event or occurrence

Answer 30

This is a subset of the sample space S: one or more outcomes that are grouped together and treated as one event.

Answer 31

This is the ∩ symbol, which is used interchangeably with "and". To say that two events, E and F, both occur we could write EF, E∩F, or E and F. This means E and F must occur together; if E and F are mutually exclusive then EF = ϕ = null event.

Answer 32

ϕ = null event = the scenario where the input situation cannot occur. This means that there is no way for the inputs to occur as described. Ex: if E and F are mutually exclusive then EF = ϕ because there are no parts of E and F that overlap.

Answer 33

E ∪ F = E or F, which means that any outcome within either event E or event F (or both) is valid.

Answer 34

If we have an event, E, within the sample space, S, then E^c = complement of E and includes everything in S that is not E.

Answer 35

If the occurrence of E means that F must have also occurred, then E is contained within F, which is written E ⊂ F.

Answer 36

E ∪ F = F ∪ E: Event E or F = Event F or E.
EF = FE: Event E and F = Event F and E.

Answer 37

(E ∪ F) ∪ G = E ∪ (F ∪ G): (E or F) or G = E or (F or G).
(EF)G = E(FG): (E and F) and G = E and (F and G).

Answer 38

(E ∪ F)G = EG ∪ FG: Events (E or F) and G = Events (E and G) or (F and G).
EF ∪ G = (E ∪ G)(F ∪ G): Events (E and F) or G = Events (E or G) and (F or G).

Answer 39

(E ∪ F)^c = E^c F^c: neither E nor F occurs = E does not occur and F does not occur.
(EF)^c = E^c ∪ F^c: E and F do not both occur = E does not occur or F does not occur.
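The set identities on the last few cards can be spot-checked with Python sets, where `|` is union, `&` is intersection, and the complement is taken relative to S. The sample space and events below are arbitrary examples; a check on one example illustrates the laws but does not prove them.

```python
# Arbitrary sample space and events for illustration
S = set(range(1, 11))
E = {1, 2, 3, 4}
F = {3, 4, 5, 6}
G = {4, 6, 8, 10}

def complement(event):
    """Everything in the sample space S that is not in the event."""
    return S - event

commutative = ((E | F) == (F | E)) and ((E & F) == (F & E))
associative = (((E | F) | G) == (E | (F | G))) and (((E & F) & G) == (E & (F & G)))
distributive = (((E | F) & G) == ((E & G) | (F & G))) and \
               (((E & F) | G) == ((E | G) & (F | G)))
de_morgan = (complement(E | F) == (complement(E) & complement(F))) and \
            (complement(E & F) == (complement(E) | complement(F)))
```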

Answer 40

1: 0 ≤ P(E) ≤ 1
2: P(S) = 1
3: P(E₁ ∪ E₂ ∪ ... ∪ E_n) = Σ_{i=1}^{n} P(E_i) = P(E₁) + P(E₂) + ... + P(E_n)
This assumes that the events E₁, ..., E_n are all mutually exclusive.

Answer 41

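This refers to a sample space where each outcome has an equal probability of occurring, aka there is no **weight** to a particular outcome. In this scenario each single outcome has probability 1/N = p, where N is the number of outcomes in the sample space; an event E made up of several outcomes has P(E) = (number of outcomes in E)/N.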

Answer 42

This says that if we have a set of events that can occur in our sample space and each of these events creates a series of potential outcomes, then the total number of outcomes is the product of the number of first events and the number of secondary outcomes per event. In other terms, if the first experiment has "r" possible outcomes and, for each of these, the second experiment has "n" possible outcomes, then r\*n = total number of outcomes.
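A tiny enumeration shows the product rule at work. The outcome labels are invented for illustration.

```python
from itertools import product

first_experiment = ["A", "B", "C"]   # r = 3 possible first outcomes
second_experiment = [1, 2]           # n = 2 possible outcomes for each first outcome

# Every (first, second) pair is one combined outcome
outcomes = list(product(first_experiment, second_experiment))
total = len(outcomes)                # r * n = 3 * 2
```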

Answer 43

This is a specific ordered arrangement of a set of objects. The total number of permutations available to a set of things is equal to n!, where n is the total number of things in the set.

Answer 44

If we have a sample of size n and we want to know how many unique combinations of size r can be made from its elements, this is n!/[(n-r)!r!]. This says that from a sample of size n we can choose "r" elements in this many unique ways, ignoring order.
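The permutation and combination counts above can be confirmed by brute-force enumeration on a small made-up set:

```python
from itertools import combinations, permutations
from math import comb, factorial

items = ["a", "b", "c", "d"]   # n = 4 made-up objects
n, r = len(items), 2

n_permutations = len(list(permutations(items)))      # orderings of all n items
n_combinations = len(list(combinations(items, r)))   # unordered groups of size r
by_formula = factorial(n) // (factorial(n - r) * factorial(r))
```

`math.comb(n, r)` computes the same binomial coefficient directly.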

Answer 45

The number of unique combinations of n elements taken in groups of size r, written (n choose r), = n!/[(n-r)!r!] where r ≤ n

Answer 46

It is used when there is **limited information** within a problem (you are attempting to derive the probability of an event based on other events), or when it is the easiest way to find the probability of a cause or input to an event given new information (backwards reasoning).

Answer 47

P(E|F) = P(EF)/P(F). Probability of E occurring given F has occurred = the probability E and F both occur divided by the probability F occurs.
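A worked example with two fair dice, enumerating the sample space. The choice of events is arbitrary: E is "the sum is 8" and F is "the first die shows 3".

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely (first die, second die) pairs
S = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a true/false test on an outcome."""
    return Fraction(sum(1 for outcome in S if event(outcome)), len(S))

def E(outcome):
    return outcome[0] + outcome[1] == 8   # the sum is 8

def F(outcome):
    return outcome[0] == 3                # the first die shows 3

p_EF = prob(lambda o: E(o) and F(o))      # only (3, 5) satisfies both
p_E_given_F = p_EF / prob(F)              # P(E|F) = P(EF) / P(F)
```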

Answer 48

P(E|F^c) = P(EF^c)/P(F^c). This says that the probability of E occurring given that F does NOT occur equals the probability of E occurring and F not occurring divided by the probability F does not occur.

Answer 49

P(E) = P(EF) + P(EF^c). Probability of E = Probability of E and F + Probability of E and not F.

Answer 50

P(E) = P(E|F)P(F) + P(E|F^c)(1-P(F)). P(E) is the weighted average of the probability of E given that F has occurred and the probability of E given that F has not occurred, with weights P(F) and 1-P(F). It applies where E occurs as the consequence of F.
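A numerical sketch of the weighted average, with invented probabilities. The last line reverses the conditioning, P(F|E) = P(E|F)P(F)/P(E), which is the kind of backwards reasoning mentioned on an earlier card.

```python
from fractions import Fraction

# Invented inputs for illustration
p_F = Fraction(3, 10)              # P(F)
p_E_given_F = Fraction(8, 10)      # P(E|F)
p_E_given_not_F = Fraction(1, 10)  # P(E|F^c)

# P(E) = P(E|F)P(F) + P(E|F^c)(1 - P(F))
p_E = p_E_given_F * p_F + p_E_given_not_F * (1 - p_F)

# Reverse the conditioning: P(F|E) = P(E|F)P(F) / P(E)
p_F_given_E = p_E_given_F * p_F / p_E
```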

Answer 51

Answer 52

If P(E|F) = P(E) then E and F are independent: E is not a function of event F, and P(EF) = P(E)P(F). For three independent events E, F, and G, P(EFG) = P(E)P(F)P(G).
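Independence can be checked by enumeration on two fair dice. The events are arbitrary examples: E is "the first die is even" and F is "the second die shows more than 4".

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely (first die, second die) pairs
S = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a true/false test on an outcome."""
    return Fraction(sum(1 for outcome in S if event(outcome)), len(S))

def E(outcome):
    return outcome[0] % 2 == 0   # the first die is even

def F(outcome):
    return outcome[1] > 4        # the second die shows 5 or 6

p_E, p_F = prob(E), prob(F)
p_EF = prob(lambda o: E(o) and F(o))
p_E_given_F = p_EF / p_F         # equals P(E) when E and F are independent
```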

General Flashcards

(77 cards)