6 | Statistics for Proportions and Frequencies I Flashcards

1
Q

(POLL)

You would like to characterize the penguin species distribution on an island. Which measures could you use?
- median
- mean
- mode (modus)
- proportions
- standard deviation

A

mode (modus), proportions

2
Q

(POLL)

Which of the following statements is true about an independence table?
* it shows the observed numbers of our real data
* it shows the expected numbers of our data
* it shows the Pearson residuals of our data
* is used to calculate the Pearson residuals
* shows expected numbers if both variables are not related

A
  • it shows the expected numbers of our data
  • is used to calculate the Pearson residuals
  • shows expected numbers if both variables are not related

not these:
* it shows the observed numbers of our real data (no, this is in contingency table)

3
Q

(POLL)

The Pearson residual(s) show the …
* strength of the association between two variables
* normalized deviation from the expected values
* raw deviation from the expected values
* is one number for a 2x2 table
* are 4 numbers for a 2x2 table

A
  • normalized deviation from the expected values
  • are 4 numbers for a 2x2 table
4
Q

(POLL)

A good way to show the distribution of very similar count data of a single variable is the ….
* Assocplot
* barplot
* dotchart
* Histogram
* piechart
* Xyplot

A
  • barplot
  • dotchart

no:
* piechart ?
* Assocplot (no, usually used for 2 variables!)
* Histogram (no, for numerical data)
* Xyplot (no, for 2 numerical)

5
Q

(POLL)

Which of the following distributions can be used for the statistics of univariate data?
* Bernoulli distribution
* Binomial distribution
* Chisq distribution
* Normal distribution
* Poisson distribution

A
  • Bernoulli distribution
  • Binomial distribution
  • Poisson distribution

no:
* Chisq distribution (2 variables)
* Normal distribution (numerical)

6
Q

(POLL)

Which of the following distributions can be used for the statistics of bivariate data?
* Bernoulli distribution
* Binomial distribution
* Chisq distribution
* Normal distribution
* Poisson distribution

A
  • Chisq distribution - most appropriate.

yes but not the best:
* Bernoulli distribution (yes but not best)
* Binomial distribution (yes but not best)
* Poisson distribution (yes but not best)

no:
* Normal distribution (no not this)

7
Q

What are contingency tables? What kind of data are they used for?

A
  • tables with counts (of occurrences of certain level, or combination of levels)
  • normally used on factors/categorical data
  • [numerical data can be categorized with cut (“good” break → use quantiles for cutting)]
8
Q

How can numerical data be categorized so that one can create a contingency table? What is a good way to do this?

A
  • numerical data can be categorized with cut
  • (“good” break → use quantiles for cutting)
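
Quantile-based cutting can be sketched as below. The heights are reused from the cut() card further down; the three labels and the choice of terciles are assumptions made only for illustration.

```r
# Heights in cm (values reused from the cut() example card)
x <- c(150, 187, 165, 166, 170, 180, 145, 191, 160)

# Break points at the 0%, 33%, 67% and 100% quantiles
breaks <- quantile(x, probs = c(0, 1/3, 2/3, 1))

# include.lowest = TRUE keeps the minimum value in the first bin
groups <- cut(x, breaks = breaks, include.lowest = TRUE,
              labels = c("small", "medium", "large"))  # hypothetical labels

# Contingency table of the new factor: equally filled classes
table(groups)
```

Cutting at quantiles gives classes of roughly equal size, which helps to avoid sparsely filled cells in the resulting contingency table.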
9
Q

What is the frequency in the context of categorical data?

A
  • How often a category c is counted: $n_c$
  • For k categories the sum of all frequencies is n: $\sum_{c=1}^{k} n_c = n$
10
Q

What is the relative frequency in the context of categorical data?

A
  • sample proportion of a single category: $\hat{p}_c = n_c / n$
  • for k categories the sum of all proportions is 1: $\sum_{c=1}^{k} \hat{p}_c = 1$
  • percentages: relative frequencies * 100
11
Q

Frequency table vs contingency table?

A
  • frequency: 1 variable
  • contingency: ≥ 2 variables (two or more)
12
Q

What can we calculate from contingency tables?

A
  • Expected values
  • Residuals
13
Q

How can we calculate expected values?

A
  • Contingency table → Margin table → independence table
  • Independence table contains expected number: Rowtotal * Columntotal / Total
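
The path from contingency table to independence table can be sketched as follows. The 2x2 counts and the labels are invented for illustration and are not course data.

```r
# Hypothetical 2x2 contingency table (counts invented for illustration)
tab <- as.table(matrix(c(20, 10,
                         15, 25),
                       nrow = 2, byrow = TRUE,
                       dimnames = list(group = c("A", "B"),
                                       outcome = c("yes", "no"))))

# Margin table: the contingency table plus row and column totals
mtab <- addmargins(tab)

# Independence table: row total * column total / grand total for every cell
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# chisq.test() reports the same expected counts
same <- all.equal(unname(expected), unname(unclass(chisq.test(tab)$expected)))
```

Here `outer()` builds all row-total times column-total products at once, so no loop over the cells is needed.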
14
Q

What is a margin table?

A
  • The contingency table with total sums for rows and columns added
15
Q

What is an independence table?

A
  • Calculated from the margin table
  • Expected number of observations if there were no dependence between the variables
16
Q

Which bracket means inclusive? ( or [

A
  • [
17
Q

Which bracket means exclusive: ( or [ ?

A

(

18
Q

What is a Chi Square test? (statscast)

A
  • A statistic that checks for patterns or relationships in categorical variables
  • It checks whether any observed variations from evenly spread data are meaningful or just a coincidence
19
Q

With how many variables can you do a Chi Square test? (statscast)

A
  • One or more
20
Q

Give some examples of things you could test with Chi Square and state the number of variables and levels (statscast)

A
  • Whether a die is fair? 1 variable with 6 levels → one way chi square
  • Whether participating in a study group is related to passing an exam?
  • Does gender vary across educational majors? Female, male vs engineering, business, psychology.
21
Q

Consider this question: Does gender vary across educational majors? Female, male vs
engineering, business, psychology. (statscast)
What test could you use for this? What are the possibilities we want to determine?

A
  • Chi square
  • No relationship → expect gender evenly spread across majors → H0
  • Relationship → expect gender unevenly spread across majors → alternative hypothesis
22
Q

As an _________, a chi square allows us to make __________ about the ________ beyond our data.

A

As an inferential statistic a chi square allows us to make inferences about the population beyond our data.

23
Q

How is the chi square value calculated? (statscast)

A
  • For each group: (expected – observed)² / expected
  • Then add values for all groups
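
A minimal sketch of this calculation for the fair-die question (one variable, six levels). The 60 roll counts are invented for illustration.

```r
# Hypothetical counts of the faces 1..6 in 60 rolls (invented for illustration)
observed <- c(8, 12, 9, 11, 6, 14)

# A fair die would give the same expected count for every face
expected <- rep(sum(observed) / length(observed), length(observed))

# Per-group contributions, then the sum over all groups
contrib <- (expected - observed)^2 / expected
chisq <- sum(contrib)

# One-way test: degrees of freedom = number of levels minus one
df <- length(observed) - 1

# Upper-tail probability of the chi-square distribution
p.value <- pchisq(chisq, df = df, lower.tail = FALSE)

# chisq.test() on a plain vector tests against equal proportions by default
ref <- chisq.test(observed)
```

The hand-computed `chisq` and `p.value` agree with `ref$statistic` and `ref$p.value`.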
24
Q

Chi Square: how to find the degrees of freedom? (statscast)

A
  • DF = (number of levels of each categorical variable minus one), multiplied together: (rows – 1) x (columns – 1)
  • From contingency table: take away 1 row and 1 column and see how many cells are left
25
Q

What are some requirements for doing chi square? (statscast)

A
  • At least one case in every cell in the table
  • At least 80% of table cells should have 5 or more cases (some say all cells)
  • → if you have too low, you can combine some levels or collect more data
  • All data should be independent, ie scores should not influence one another
26
Q

Is Chi Square limited somehow?

A
  • At least 1 case in every cell, 80% should have at least 5
  • A significant chi square test doesn’t tell you which levels of your variables are driving the effect → Chi squares with many levels can be difficult to interpret
  • Results of inferential statistics should only be applied to populations that resemble the sample
  • All data should be independent, ie scores should not influence one another
27
Q

Robustness of chi square?

A
  • Non-parametric test → more robust than parametric tests (t-test, ANOVA, ..)
  • Don’t even need normal distributions!
28
Q

How do you write a report of a chi square test? (statscast)

A

A Chi Square test of independence was used to check for a relationship between x and y, χ²(df) = 6.0, p < .05, indicating a statistically significant relationship between x and y. (statscast)

29
Q

What are Pearson residuals?

A
  • Part of chi squared test
  • normalized measure for the distance of each cell frequency to the expected data
  • residuals = (Observed – Expected) / √Expected
  • → shows any under- or overrepresentation, ie how far the data are from the independence table
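
The residual formula can be checked by hand against chisq.test(). The 2x2 counts are invented for illustration.

```r
# Hypothetical 2x2 contingency table (counts invented for illustration)
observed <- matrix(c(20, 10,
                     15, 25), nrow = 2, byrow = TRUE)

# Expected counts under independence
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson residuals: one value per cell, so four values for a 2x2 table
pearson <- (observed - expected) / sqrt(expected)

# Positive = overrepresented cell, negative = underrepresented cell
signs <- sign(pearson)

# The squared residuals add up to the (uncorrected) chi-square statistic
chisq <- sum(pearson^2)

# Reference values from chisq.test()
ref <- chisq.test(observed, correct = FALSE)
```

`ref$residuals` holds the same four values as `pearson`, and `ref$statistic` equals `chisq`.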
30
Q

R:
What is a function that can be used for tabulating one or two variables?

A
  • table()
  • returns a contingency table
31
Q

R:
What is a function that can be used for tabulating more than two variables?

A
  • ftable()
  • returns a contingency table
32
Q

R:
What is a function that can be used for table extraction of multidimensional data?

A

apply() → you can ignore some fields and just extract some dimensions

33
Q

R:
How could you make a mode (modus) function in R?

A
  • use table() eg tab = table(x)
  • check what the maximum is eg idx = which(max(tab)==tab)
  • return(names(tab)[idx])
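
Put together, the three steps give a small function. The name `modus` and the species labels in the usage line are chosen for illustration only.

```r
# Mode of a vector: the most frequent value(s).
# Returns every value tied for the highest count, as character labels.
modus <- function(x) {
  tab <- table(x)                 # frequency of each distinct value
  idx <- which(tab == max(tab))   # position(s) of the highest count
  names(tab)[idx]                 # labels of the most frequent categories
}

# Hypothetical usage with made-up species observations
species <- c("Adelie", "Gentoo", "Adelie", "Chinstrap")
most.frequent <- modus(species)
```

Because `table()` stores the categories as names, a numeric input comes back as character; wrap the result in `as.numeric()` if numbers are needed.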
34
Q

R:
What function would one use to convert numerical data into categorical? Give an example.

A

cut()
~~~
survey = data.frame(cm = c(150, 187, 165, 166, 170, 180, 145, 191, 160))
cSize = cut(survey$cm, c(0, 160, 185, 250)) # bins (0,160], (160,185], (185,250]: 'dwarves', 'normal', 'giants'
~~~

35
Q

R:
What functions can be used to get the expected values in r?

A
  • either: mtab = addmargins(tab), then calculate row total * column total / grand total for each cell, eg mtab[i,"Sum"] * mtab["Sum",j] / mtab["Sum","Sum"]
  • or: chisq.test(tab)$expected
36
Q

R:
How to get the pearson residuals?

A

chisq.test(tab)$residuals

37
Q

What kind of analysis can one do with a contingency table, apart from chi square?

A

Calculate the proportions with prop.table() eg to:
* Summarize proportions from contingency tables.
* Normalize data (row-wise, column-wise, or total proportion).
* Check categorical distributions in datasets.
* Compare observed vs. expected values (like in Chi-square tests)

38
Q

R:
Function for a proportion table? And how to control whether the rows or columns sum to 1?

A
* prop.table(table)
* rowwise: prop.table(table,1)
* columnwise: prop.table(table,2)
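
The three variants can be compared on a small table. The counts and labels are invented for illustration.

```r
# Hypothetical 2x2 contingency table (counts invented for illustration)
tab <- as.table(matrix(c(20, 10,
                         15, 25),
                       nrow = 2, byrow = TRUE,
                       dimnames = list(group = c("A", "B"),
                                       outcome = c("yes", "no"))))

p.all <- prop.table(tab)      # all cells together sum to 1
p.row <- prop.table(tab, 1)   # every row sums to 1
p.col <- prop.table(tab, 2)   # every column sums to 1

# Percentages: relative frequencies * 100
pct.row <- round(100 * p.row, 1)
```

Row-wise proportions answer "what share of group A said yes?", while column-wise proportions answer "what share of the yes answers came from group A?".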

39
Q

Which graphics are appropriate for exploring single (categorical) variables?

A
  • Pie chart
  • Barplot
  • Dotchart
40
Q

Which graphics are appropriate for exploring 2 (categorical) variables?

A
  • Mosaicplot
  • Assocplot (or assoc from vcd)
  • Fourfold
41
Q

R:
How to make a plot with three figures, eg with a piechart, barplot and dotchart with values 6:11?

A

configure graphical parameters plotting multiple graphs in a single figure:
~~~
> par(mfrow=c(1,3),mai=c(0.5,0.5,0.5,0.3)) # three figures - 1 row, 3 columns; margins in inches around plot
> pie(6:11,col=1:6,cex=2)
> barplot(6:11,col=1:6,cex.axis=2,cex.names=2)
> box()
> dotchart(6:11,cex=1.4,col=1:6,xlim=c(0,12),pch=15)
~~~

42
Q

Pie angle problems?

A

The human eye is not good at judging differences between angles in a pie chart → it is often better to use eg a barplot or a dotchart.

43
Q

Dotcharts / dotplots pros ?

A
  • less cluttered
  • less ink
  • less redundant
  • overlay second variable
44
Q

What are the different parameters here?
~~~
dotchart(6:11,cex=1.4,col=1:6,xlim=c(0,12),pch=15)
~~~

A
  • cex: size (expansion factor) of markers and text
  • col: colour
  • xlim: limits of the x-axis
  • pch: shape of marker
45
Q

What is a mosaicplot?

A
  • Exploring the relationship between two variables
  • absolute numbers visualized
  • Visualises proportions with area
46
Q

What is an association plot?

A
  • assocplot()
  • Exploring the relationship between two variables
  • Visualisation of pearson residuals
47
Q

R:
Complete code for a figure showing a mosaic plot and an association plot for the survey$gender data according to the cut height data.
~~~
> ____(____=c(1,2), _____=c(0.4,0.7,0.5,0.0))
> ____ (table(____, ______), col=c(2,4),cex=1.0,main="____")
> ____ (table(____, ______),main="____")
~~~

A
> par(mfrow=c(1,2), mai=c(0.4,0.7,0.5,0.0)) 
> mosaicplot(table(cSize, survey$gender), col=c(2,4),cex=1.0,main="mosaic") 
> assocplot(table(cSize, survey$gender),main="assoc")
48
Q

Mosaicplot vs Assocplot ?

A

Mosaicplot / assocplot
* Width: actual proportions by absolute value / proportional to square root of total individuals
* Height: actual proportions by absolute value / proportion according to pearson residuals

49
Q

Barplot, dotcharts vs mosaicplot, assocplot?

A
  • One can also use a stacked barplot to visualise two variables. However, this does not show the actual numbers in the width, so a mosaicplot is more useful
  • One can also show Pearson residuals with a dotchart, but it's much more difficult to grasp than with an assocplot
50
Q

What’s better than an assocplot()? Give an example

A
  • assoc() from the vcd library, shows the scale of the Pearson residuals
  • ie shows the significance of the pearson residuals, also with colour coding if desired
    ~~~
    > library(vcd)
    > assoc(aids.azt,shade=T)
    ~~~
51
Q

What is a fourfold plot

A

A fourfold plot provides a graphical expression of the association in a 2×2 contingency table, visualising the odds ratio.
Each cell entry is represented as a quarter-circle (denoted by the middle of the three rings).
* Actual numbers from contingency table shown in the corners
* Proportions in relative terms
* Dark blue: significant changes between groups
* Confidence intervals – lines above and below circle

52
Q

R:
Code for four fold plot?

A
cotabplot(aids.azt,panel=cotab_fourfold)
53
Q

Graphics summary
~~~
--------------------------------------------------
            1D      2D:Numeric    2D:Categoric
--------------------------------------------------
Numeric     ??
Categoric
~~~

A

Hist, density

54
Q

Graphics summary
~~~
--------------------------------------------------
            1D      2D:Numeric    2D:Categoric
--------------------------------------------------
Numeric             ??
Categoric
~~~

55
Q

Graphics summary
~~~
--------------------------------------------------
            1D      2D:Numeric    2D:Categoric
--------------------------------------------------
Numeric                           ??
Categoric
~~~

A

boxplot
(stripchart)
(violinplot)

56
Q

Graphics summary
~~~
--------------------------------------------------
            1D      2D:Numeric    2D:Categoric
--------------------------------------------------
Numeric
Categoric   ??
~~~

A
  • (pie)
  • (barplot)
  • Dotchart – best option
57
Q

Graphics summary
~~~
--------------------------------------------------
            1D      2D:Numeric    2D:Categoric
--------------------------------------------------
Numeric
Categoric           ??
~~~

A

Boxplot
(stripchart)
(violinplot)

58
Q

Graphics summary
~~~
--------------------------------------------------
            1D      2D:Numeric    2D:Categoric
--------------------------------------------------
Numeric
Categoric                         ??
~~~

A
  • Association plot
  • Mosaic plot
  • Fourfold plot
59
Q

Descriptive statistics for categorical variables?

A
  • Contingency & proportion tables
  • Mode (modus)
  • Visualising: assoc, mosaic, fourfold plots
60
Q

What are the aims of inferential statistics of proportions and frequencies?

A
  • generalize from sample to population
  • not only for the average, but also spread, form of the distribution
61
Q

Probability theory
Why is mathematical probability theory used for inferential statistics of proportions and frequencies (categorical variables)?

A
  • Probability used as a measure of uncertainty
  • Uncertain as long as our sample is smaller than population → we estimate parameters of population and we specify the extent of uncertainty
  • How: compare our data with random data where we know there is no effect
62
Q

Probability theory
How is probability theory used for inferential statistics of proportions and frequencies?

A
  • compare our data with random data where we know there is no effect
63
Q

Probability theory
What is randomness? Eg?

A
  • a phenomenon is called random if the outcome cannot be calculated with certainty
  • ex: coin tossing, we don’t know outcome before
64
Q

Probability theory
What is the sample space? Examples

A
  • Sample space S: collection of possible outcomes
  • Eg coin: S = {Head, Tail} (Coin, $n_S = 2$)
  • Eg dice: S = {1, 2, 3, 4, 5, 6} (Dice, $n_S = 6$)
  • sample space can contain an infinite number of outcomes – eg body height if it could be measured exactly
65
Q

Probability theory
Measure of probability with a coin toss?

A

P(Head) + P(Tail) = 1

66
Q

Probability theory
What is an event? How can you calculate the probability of the event? Eg with a die roll.

A
  • an event (E) is a subset of the sample space
  • possible event when rolling a die: E = {1, 2, 4} (three of the six outcomes)
  • probability of this event P(E) = 1/6 + 1/6 + 1/6 = 1/2
67
Q

Probability theory
What is a complement?

A
  • for any event there exists a complement: those items in the sample space but not in the event (disjoint)
  • probability of the complement: $P(E^c) = 1 - P(E)$
68
Q

Probability theory
Union of two events?

A
  • the set of outcomes that are at least in one of the events
69
Q

Probability theory
What is conditional probability?

A
  • two events → conditional probability
  • P(E1|E2) = probability of event E1 if E2 has occurred
70
Q

Probability theory
Independent events?

A
  • two events independent if knowledge of outcome of E1 does not alter prob. for event E2
  • P(E1|E2) = P(E1) and P(E2|E1) = P(E2)
71
Q

Probability theory
If two events are independent, their joint probability is the product of their _____________. Formula? Example with dice?

A
  • Marginal probabilities
  • P(E1 ∩ E2) = P(E1) x P(E2)
  • Dice throw: two sixes in sequence → P(E6 ∩ E6) = 1/6 x 1/6 = 1/36 ≈ 0.028
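
The product rule can be checked with a quick simulation. This is only a sketch: the seed and the number of simulated double throws are arbitrary choices.

```r
set.seed(1)   # arbitrary seed, only to make the sketch reproducible
n <- 100000   # number of simulated double throws (arbitrary)

# Two independent throws of a fair die
first  <- sample(1:6, n, replace = TRUE)
second <- sample(1:6, n, replace = TRUE)

# Exact joint probability from the product of the marginal probabilities
p.exact <- (1/6) * (1/6)

# Relative frequency of "six followed by six" in the simulation
p.sim <- mean(first == 6 & second == 6)

# Marginal relative frequencies; their product should be close to p.sim
p.product <- mean(first == 6) * mean(second == 6)
```

With this many throws, `p.sim` and `p.product` should both land close to 1/36.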
72
Q

Probability theory
What is the mathematical definition for conditional probability?

A

P(A|B) = P(A∩B) / P(B), if P(B) ≠ 0

73
Q

Bayes Theorem
What is the Bayes Theorem (definition)?

A
  • Allows calculation of Ps for events which are not independent of each other
  • mathematical rule for inverting conditional probabilities → find P of a cause, given effect.
  • (= Bayes’ law or Bayes’ rule, after Thomas Bayes)
74
Q

Bayes Theorem
What is the Bayes Theorem (mathematical statement)?

A

P(A|B) = P(B|A)P(A) / P(B)

75
Q

Bayes Theorem
Example for cancer in smokers.
* E1 = smoker: P(SM) = 0.25
* E2 = lung cancer: P(C) = 0.001
* P(SM|C) = 0.40
What is the conditional P of C given SM, ie the probability of lung cancer if someone is a smoker, P(C|SM)

A
  • P(C|SM) = P(SM ∩ C) / P(SM) (conditional probability)
  • (0.4 * 0.001)/0.25 = 0.0016
  • smokers have around 60% higher risk of lung cancer
  • with Bayes theorem: P(C|SM) = P(SM|C) * P(C) / P(SM)
    https://www.youtube.com/watch?v=HZGCoVF3YvM
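
The same arithmetic in R, using only the three probabilities given in the question. The variable names are chosen for illustration.

```r
# Probabilities given in the card
p.sm         <- 0.25    # P(SM): being a smoker
p.c          <- 0.001   # P(C): lung cancer
p.sm.given.c <- 0.40    # P(SM|C): smoker, given lung cancer

# Bayes' theorem: P(C|SM) = P(SM|C) * P(C) / P(SM)
p.c.given.sm <- p.sm.given.c * p.c / p.sm

# Relative increase compared with the overall risk P(C)
relative.increase <- p.c.given.sm / p.c - 1
```

`p.c.given.sm` is 0.0016 and `relative.increase` is 0.6, ie the 60% quoted above is relative to the risk in the whole population.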
76
Q

Use of frequentist statistics?

A
  • supports statistical needs of experimental scientists and pollsters;
  • does not support all needs; gamblers typically require estimates of the odds without experiment
  • in SBI course: mostly frequentist
77
Q

Probability in Bayesian vs Frequentist statistics?

A

Bayesian:
* having prior probability and posterior probability
* degree of belief in event (based on prior knowledge, previous experiments, or beliefs)
* combining old data with new evidence

Frequentist:
* probability = limit of relative frequency of event after many trials
* → probabilities can be found (in principle) by a repeatable objective process
* → thus ideally devoid of opinion

78
Q

What are Random Variables? Eg with coin, dice?

A
  • A random variable (also called random quantity, aleatory variable, or stochastic variable) is a mathematical formalization of a quantity or object which depends on random events.
  • The term ‘random variable’ in its mathematical definition refers to neither randomness nor variability
    Eg random variable X for head or tail outcome of a coin:
  • P(H) + P(T) = P(X = 0) + P(X = 1) = 0.5 + 0.5 = 1
  • set of possible values is the range for the variable: range of X: X = {0, 1}
    Eg dice:
  • P(X=1) = 1/6
  • range of X for a die: X = {1, 2, 3, 4, 5, 6}
79
Q

Probability Distributions for random variables?

A
  • Bernoulli special case of Binomial distribution with n = 1 (just 1 trial)
  • Binomial distribution: seen when sequence of independent random variables with same P
  • parameter Θ is the probability of success (for events like: 1, survived, YES, female, tails)
  • P(X = 1) = Θ
  • P(X = 0) = 1 – Θ
80
Q

Binomial distribution?

A
  • Has parameters n and p
  • discrete P distribution of number of successes in sequence of n independent experiments
  • each with its own Boolean-valued outcome: success (with probability p) or failure
  • (with probability q = 1 – p)
  • basis for the binomial test of statistical significance.
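
A short sketch of the binomial point probabilities for 10 fair coin tosses. The values of n and p are chosen for illustration.

```r
n <- 10    # number of independent trials (illustrative)
p <- 0.5   # probability of success in each trial (illustrative)

# Point probabilities for 0, 1, ..., n successes
probs <- dbinom(0:n, size = n, prob = p)

# The probabilities over the whole range sum to 1
total <- sum(probs)

# P(X = 5) equals choose(10, 5) * p^5 * (1 - p)^5
p.five <- dbinom(5, size = n, prob = p)
by.hand <- choose(n, 5) * p^5 * (1 - p)^(n - 5)

# Bernoulli special case: a single trial, so P(X = 1) = p and P(X = 0) = 1 - p
bern <- dbinom(c(0, 1), size = 1, prob = p)
```

Unlike a Poisson count, the number of successes here can never exceed `n`.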
81
Q

Bernoulli distribution?

A
  • Special case of binomial distribution
  • A single success/failure experiment is also called a Bernoulli trial or Bernoulli experiment,
  • sequence of outcomes is called a Bernoulli process
  • for a single trial, i.e., n = 1, the binomial distribution is a Bernoulli distribution.
82
Q

R:
Probability functions
What does the pdqr stand for in pdqr functions? (Detlef called them rpdq)

A
  • r: random number generator
  • p: probability function (cumulative probability function c.d.f)
  • d: density function (point probability)
  • q: quantile function (inverse c.d.f)
83
Q

R:
What are the pdqr functions for the normal distribution?

A
  • pnorm(): Cumulative probability
  • dnorm(): Probability density
  • qnorm(): Quantile function
  • rnorm(): Generate random numbers
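
A brief sketch of the four functions for the standard normal distribution (mean 0, sd 1, the defaults). The seed is arbitrary.

```r
# p: cumulative probability P(X <= 1.96), about 0.975
p.val <- pnorm(1.96)

# d: density at 0, equal to 1 / sqrt(2 * pi)
d.val <- dnorm(0)

# q: quantile function, the inverse of pnorm(); qnorm(0.975) is about 1.96
q.val <- qnorm(0.975)

# r: five random draws (arbitrary seed for reproducibility)
set.seed(1)
r.val <- rnorm(5)

# q and p undo each other
round.trip <- qnorm(pnorm(1.3))
```

The same four prefixes work for other distributions, eg `pbinom()`, `dpois()`, `qchisq()` and `rbinom()`.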
84
Q

R:
Code for random numbers from binomial distribution for coins?
> # 20 times each time 10 coin trials # number of tails
> _____(__,__,__)
[1] 4 6 1 6 3 4 7 7 5 6 7 6 7 7 6 7 3 8 7 6
> # bernoulli special case with n=1 just one coin trial
> _____(__,__,__)
[1] 1 0 1 0 1 1 0 0 1 1 1 0 1 0 0 0 1 0 0 1

A

> # 20 times each time 10 coin trials # number of tails
> rbinom(20,10,prob=0.5)
[1] 4 6 1 6 3 4 7 7 5 6 7 6 7 7 6 7 3 8 7 6
> # bernoulli special case with n=1 just one coin trial
> rbinom(20,1,prob=0.5)
[1] 1 0 1 0 1 1 0 0 1 1 1 0 1 0 0 0 1 0 0 1

85
Q

Binomial vs Poisson Distribution?

A
  • binomial distribution has upper limit: throw coin 50 times → max head value is 50
  • count numbers without theoretical limits (spatial or temporal) → often follow Poisson distribution
  • lower limit is zero, but no upper limit
86
Q

Example for Poisson distribution? Parameter?

A
  • count cells in a grid, number of doctor visits by a patient …
  • parameter λ = rate of occurrence within a certain time or space; it is the mean of the distribution and is estimated by the sample mean
  • Higher λ → higher average count
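
A small simulation sketch of the role of λ. The rate, the seed and the sample size are arbitrary choices.

```r
set.seed(1)     # arbitrary seed for reproducibility
lambda <- 3     # hypothetical rate, eg events per grid cell

# 10000 simulated counts
counts <- rpois(10000, lambda)

# The sample mean estimates lambda
est <- mean(counts)

# Counts start at zero and have no fixed upper limit
lowest <- min(counts)

# Point probability of observing zero events: exp(-lambda)
p.zero <- dpois(0, lambda)
```

Rerunning the sketch with a larger `lambda` shifts the whole distribution of counts upwards.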
87
Q

χ² distribution?

A

A random variable has a Chi-square distribution if it can be written as a sum of squares of independent standard normal variables.
Sums of this kind are encountered very often in statistics, especially in the estimation of variance and in hypothesis testing. (wiki)
(watch youtube videos!)

88
Q

χ² distribution (two variables)?

A

If we cross-tabulate two random variables (eg binomial, Poisson), the test statistic is summed over the m cells of the table:
$\chi^2 = \sum_{j=1}^{m} \frac{(n_{jo} - n_{je})^2}{n_{je}}$
$\chi^2_{Yates} = \sum_{j=1}^{m} \frac{(|n_{jo} - n_{je}| - 0.5)^2}{n_{je}}$
$n_{jo}$ = observed, $n_{je}$ = expected
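
Both versions of the statistic can be computed by hand and compared with chisq.test(). The 2x2 counts are invented for illustration.

```r
# Hypothetical 2x2 contingency table (counts invented for illustration)
observed <- matrix(c(20, 10,
                     15, 25), nrow = 2, byrow = TRUE)

# Expected counts under independence
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Plain chi-square statistic
chisq.plain <- sum((observed - expected)^2 / expected)

# With Yates' continuity correction: shrink each |O - E| by 0.5 before squaring
chisq.yates <- sum((abs(observed - expected) - 0.5)^2 / expected)

# chisq.test() applies the correction to 2x2 tables by default (correct = TRUE)
ref.yates <- chisq.test(observed)
ref.plain <- chisq.test(observed, correct = FALSE)
```

The corrected statistic is always the smaller of the two, which makes the test more conservative for small 2x2 tables.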

89
Q

How often is χ² 19.6383336713559 …

A

> # res is assumed to hold the result of an earlier chisq.test() call
> chisq.vals=rchisq(1000*1000,df=1)
> hist(chisq.vals,col='light blue')
> box()
> abline(v=res$statistic,lwd=3,col='blue')
> length(which(chisq.vals>res$statistic))/length(chisq.vals)
[1] 1e-05
(watch youtube videos!)

90
Q

(QUIZ 2)
The ______distribution is a special case of the ______distribution where there is only one trial for a binary outcome. The ______distribution has no upper limit

A

Bernoulli, Binomial, Poisson

91
Q

(QUIZ 2)
From a 2x2 contingency ______ we can calculate the so-called independence ______, which holds the counts we would get if there is no relationship between our two variables. The deviations ______ minus ______ values can be used to calculate the ______.

A

From a 2x2 contingency table we can calculate the so-called independence table, which holds the counts we would get if there is no relationship between our two variables. The deviations observed minus expected values can be used to calculate the Pearson residuals.

92
Q

(QUIZ 2)
The formula to calculate the Pearson residuals for every cell of a contingency table is: (______ - ______) / ______
The ______ value calculation uses a similar formula (without sqrt) and sums up the values for every cell. ______ values of this measure are more likely to produce low p-values than ______ values.

A

The formula to calculate the Pearson residuals for every cell of a contingency table is: (observed - expected) / sqrt(expected)
The chisq value calculation uses a similar formula (without sqrt) and sums up the values for every cell. Higher values of this measure are more likely to produce low p-values than lower values.

93
Q

(QUIZ 2)
Which descriptive measures and plots are appropriate to summarize a contingency table?
* mode (modus)
* mosaicplot
* mean
* boxplot
* median
* xy-plot
* assocplot

A

mode (modus), mosaicplot, assocplot

94
Q

(QUIZ 2)
______ are just a measure of randomness, whereas ______ give both a measure of significance and a guess at how large the effect is. A highly ______ result does not mean that we have a large ______.

A

p-values, confidence intervals, significant, effect