8 | Statistics for Numerical Data I Flashcards
(POLL)
Which measures can be used to describe the data scatter?
* mean
* median
* cv
* sem
* sd
- cv
- sd
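A minimal R sketch of the answer above, using made-up example values (the vector `x` is hypothetical). It also computes the SEM to show why it is not a scatter measure: it shrinks with the sample size and describes the precision of the mean rather than the spread of the data.

```r
# Hypothetical example values
x <- c(12, 15, 11, 18, 14, 16, 13)

s   <- sd(x)                  # standard deviation: scatter of the data
cv  <- 100 * s / mean(x)      # coefficient of variation in percent (scatter relative to the mean)
sem <- s / sqrt(length(x))    # standard error of the mean: precision of the mean, not data scatter
```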
(POLL)
If one of the values is 0, the geometric mean is:
* positive
* negative
* zero
* undefined
* 1
- zero
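A short R sketch of why the answer is zero, assuming the geometric mean is defined as the n-th root of the product of the values. Base R has no geometric mean function, so `geo_mean` is a hypothetical helper written for this example.

```r
# Hypothetical helper: n-th root of the product of all values
geo_mean <- function(x) prod(x)^(1 / length(x))

g_pos  <- geo_mean(c(2, 8))      # sqrt(2 * 8) = 4
g_zero <- geo_mean(c(0, 2, 8))   # a single 0 makes the product, and so the result, 0
```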
(POLL)
If the data are normally distributed, within +/- 1 SD of the mean are …
* 50% of all data
* 2/3 of all data
* 1/3 of all data
* the SD does not tell us this
- 2/3 of all data
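The "2/3" in the answer is a rounded value. As a quick check, the exact share under a normal distribution can be computed in R from the cumulative distribution function `pnorm`:

```r
# Probability mass of a standard normal between -1 and +1 SD
p_1sd <- pnorm(1) - pnorm(-1)
round(p_1sd, 3)   # 0.683, i.e. roughly 2/3 of all data
```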
(POLL)
Which of the following measures can be used to describe the shape of the distribution, in addition to the data scatter?
* cv
* kurtosis
* mean
* sd
* sem
* skewness
* var
- kurtosis
- skewness
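Base R has no built-in skewness or kurtosis function, so the sketch below defines two hypothetical helpers from the central moments. It assumes the simple moment-based definitions (no small-sample correction) and reports excess kurtosis, which is 0 for a normal distribution; add-on packages may use slightly different formulas.

```r
# Hypothetical helpers based on central moments (no bias correction)
skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}
kurtosis <- function(x) {
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2 - 3   # excess kurtosis
}

sk_sym   <- skewness(c(1, 2, 3, 4, 5))    # symmetric data: 0
sk_right <- skewness(c(1, 1, 1, 2, 10))   # long right tail: positive
ku_flat  <- kurtosis(c(1, 2, 3, 4, 5))    # flatter than normal: negative
```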
(POLL)
With the 3 SD criterion, how many outliers would you expect among 1000 values if the data are normally distributed?
* 0
* 1
* 2-3
* 5-10
* 10-100
* 100-1000
- 2-3
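The expected count can be worked out from the normal distribution: the probability of falling more than 3 SD from the mean, on either side, times the number of values. A small R check:

```r
n <- 1000
p_outside <- 2 * pnorm(-3)        # two-sided tail probability beyond 3 SD
expected_outliers <- n * p_outside
round(expected_outliers, 1)       # 2.7
```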
(POLL)
You have a numerical variable with ten values. Which plot would you use for visualization?
* Histogram
* Stripchart
* density line
* violinplot
- Stripchart
also possible but not best:
* Histogram (stripchart better)
(POLL)
You visualize a numerical variable against a categorical one. Which is the appropriate plot to use?
* barplot
* boxplot
* histogram
* stripchart
* violinplot
* xyplot
- boxplot
- stripchart
- violinplot
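A possible base R sketch for two of the answers, using the built-in `iris` data as a stand-in (the dataset choice is an assumption for illustration). The formula `numeric ~ categorical` tells both functions to draw one group per category. A violin plot needs an add-on package and is left out here.

```r
# Boxplot of a numerical variable per category
bp <- boxplot(Sepal.Length ~ Species, data = iris,
              ylab = "Sepal length", xlab = "Species")

# Overlay the individual values as a jittered stripchart
stripchart(Sepal.Length ~ Species, data = iris,
           method = "jitter", vertical = TRUE, add = TRUE, pch = 19)
```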
(POLL)
Which test(s) could you use to check if your data are normally distributed?
* Chisq-Test
* Fisher-Test
* Kolmogorov-Smirnov-Test
* Shapiro-Wilk-Test
* T-Test
- Kolmogorov-Smirnov-Test
- Shapiro-Wilk-Test
(POLL)
Which things are shown on a boxplot?
* mean,max,minimum,outliers
* median,1st quartile,3rd quartile,outliers
* mean,1st quartile,3rd quartile,outliers
- median,1st quartile,3rd quartile,outliers
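The numbers behind a boxplot can be inspected with `boxplot.stats`. A small sketch with made-up values follows; note that R draws the box at the hinges, which are close to, but not always identical with, the 1st and 3rd quartiles.

```r
# Hypothetical values with one extreme observation
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 50)

bs <- boxplot.stats(x)
bs$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
bs$out     # points drawn individually as outliers
```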
R:
How would you create a summary of a dataset with the following variables?
Cat
Data Summaries
—————————————————————————————-
1D 2D 3D | function
—————————————————————————————-
Cat NA NA | table(c1)
Cat Cat NA | table(c1,c2), chisq.test(table(c1,c2))
Cat Cat Cat | ftable(c1,c2,c3)
—————————————————————————————-
Cat Num NA | aggregate(n2,by=list(c1),func)
Cat Num Num | sbi$aggregate2(n2,n3,c1,cor)
Cat Cat Num | aggregate(n3,by=list(c1,c2),func)
—————————————————————————————-
Num NA NA | mean(n1), median(n1), sd(n1), mad(n1)
Num Num NA | cor(n1,n2)
Num Num Num | cor(n1,n2), cor(n1,n3), cor(n2,n3) OR cor(data.frame(n1=n1,n2=n2,n3=n3))
—————————————————————————————-
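A few rows of the table, sketched in base R on simulated data. The variable names follow the table (`c1`, `c2` categorical; `n1`, `n2` numerical), but the data themselves are made up, and `mean` stands in for the generic `func`. The `sbi$aggregate2` row is left out because it relies on a helper that is not part of base R.

```r
set.seed(1)
c1 <- factor(rep(c("a", "b"), each = 10))    # categorical variable 1
c2 <- factor(rep(c("x", "y"), times = 10))   # categorical variable 2
n1 <- rnorm(20)                              # numerical variable 1
n2 <- n1 + rnorm(20)                         # numerical variable 2, related to n1

tab1 <- table(c1)                                   # Cat
tab2 <- table(c1, c2)                               # Cat Cat: contingency table
agg  <- aggregate(n2, by = list(group = c1), FUN = mean)   # Cat Num: mean per group
r    <- cor(n1, n2)                                 # Num Num: Pearson correlation
```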
R:
How would you create a summary of a dataset with the following variables?
Cat Cat
Data Summaries
—————————————————————————————-
1D 2D 3D | function
—————————————————————————————-
Cat NA NA | table(c1)
Cat Cat NA | table(c1,c2), chisq.test(table(c1,c2))
Cat Cat Cat | ftable(c1,c2,c3)
—————————————————————————————-
Cat Num NA | aggregate(n2,by=list(c1),func)
Cat Num Num | sbi$aggregate2(n2,n3,c1,cor)
Cat Cat Num | aggregate(n3,by=list(c1,c2),func)
—————————————————————————————-
Num NA NA | mean(n1), median(n1), sd(n1), mad(n1)
Num Num NA | cor(n1,n2)
Num Num Num | cor(n1,n2), cor(n1,n3), cor(n2,n3) OR cor(data.frame(n1=n1,n2=n2,n3=n3))
—————————————————————————————-
R:
How would you create a summary of a dataset with the following variables?
Cat Cat Cat
Data Summaries
—————————————————————————————-
1D 2D 3D | function
—————————————————————————————-
Cat NA NA | table(c1)
Cat Cat NA | table(c1,c2), chisq.test(table(c1,c2))
Cat Cat Cat | ftable(c1,c2,c3)
—————————————————————————————-
Cat Num NA | aggregate(n2,by=list(c1),func)
Cat Num Num | sbi$aggregate2(n2,n3,c1,cor)
Cat Cat Num | aggregate(n3,by=list(c1,c2),func)
—————————————————————————————-
Num NA NA | mean(n1), median(n1), sd(n1), mad(n1)
Num Num NA | cor(n1,n2)
Num Num Num | cor(n1,n2), cor(n1,n3), cor(n2,n3) OR cor(data.frame(n1=n1,n2=n2,n3=n3))
—————————————————————————————-
R:
How would you create a summary of a dataset with the following variables?
Cat Num
Data Summaries
—————————————————————————————-
1D 2D 3D | function
—————————————————————————————-
Cat NA NA | table(c1)
Cat Cat NA | table(c1,c2), chisq.test(table(c1,c2))
Cat Cat Cat | ftable(c1,c2,c3)
—————————————————————————————-
Cat Num NA | aggregate(n2,by=list(c1),func)
Cat Num Num | sbi$aggregate2(n2,n3,c1,cor)
Cat Cat Num | aggregate(n3,by=list(c1,c2),func)
—————————————————————————————-
Num NA NA | mean(n1), median(n1), sd(n1), mad(n1)
Num Num NA | cor(n1,n2)
Num Num Num | cor(n1,n2), cor(n1,n3), cor(n2,n3) OR cor(data.frame(n1=n1,n2=n2,n3=n3))
—————————————————————————————-
R:
How would you create a summary of a dataset with the following variables?
Cat Num Num
Data Summaries
—————————————————————————————-
1D 2D 3D | function
—————————————————————————————-
Cat NA NA | table(c1)
Cat Cat NA | table(c1,c2), chisq.test(table(c1,c2))
Cat Cat Cat | ftable(c1,c2,c3)
—————————————————————————————-
Cat Num NA | aggregate(n2,by=list(c1),func)
Cat Num Num | sbi$aggregate2(n2,n3,c1,cor)
Cat Cat Num | aggregate(n3,by=list(c1,c2),func)
—————————————————————————————-
Num NA NA | mean(n1), median(n1), sd(n1), mad(n1)
Num Num NA | cor(n1,n2)
Num Num Num | cor(n1,n2), cor(n1,n3), cor(n2,n3) OR cor(data.frame(n1=n1,n2=n2,n3=n3))
—————————————————————————————-
R:
How would you create a summary of a dataset with the following variables?
Cat Cat Num
Data Summaries
—————————————————————————————-
1D 2D 3D | function
—————————————————————————————-
Cat NA NA | table(c1)
Cat Cat NA | table(c1,c2), chisq.test(table(c1,c2))
Cat Cat Cat | ftable(c1,c2,c3)
—————————————————————————————-
Cat Num NA | aggregate(n2,by=list(c1),func)
Cat Num Num | sbi$aggregate2(n2,n3,c1,cor)
Cat Cat Num | aggregate(n3,by=list(c1,c2),func)
—————————————————————————————-
Num NA NA | mean(n1), median(n1), sd(n1), mad(n1)
Num Num NA | cor(n1,n2)
Num Num Num | cor(n1,n2), cor(n1,n3), cor(n2,n3) OR cor(data.frame(n1=n1,n2=n2,n3=n3))
—————————————————————————————-
R:
How would you create a summary of a dataset with the following variables?
Num
Data Summaries
—————————————————————————————-
1D 2D 3D | function
—————————————————————————————-
Cat NA NA | table(c1)
Cat Cat NA | table(c1,c2), chisq.test(table(c1,c2))
Cat Cat Cat | ftable(c1,c2,c3)
—————————————————————————————-
Cat Num NA | aggregate(n2,by=list(c1),func)
Cat Num Num | sbi$aggregate2(n2,n3,c1,cor)
Cat Cat Num | aggregate(n3,by=list(c1,c2),func)
—————————————————————————————-
Num NA NA | mean(n1), median(n1), sd(n1), mad(n1)
Num Num NA | cor(n1,n2)
Num Num Num | cor(n1,n2), cor(n1,n3), cor(n2,n3) OR cor(data.frame(n1=n1,n2=n2,n3=n3))
—————————————————————————————-
R:
How would you create a summary of a dataset with the following variables?
Num Num
Data Summaries
—————————————————————————————-
1D 2D 3D | function
—————————————————————————————-
Cat NA NA | table(c1)
Cat Cat NA | table(c1,c2), chisq.test(table(c1,c2))
Cat Cat Cat | ftable(c1,c2,c3)
—————————————————————————————-
Cat Num NA | aggregate(n2,by=list(c1),func)
Cat Num Num | sbi$aggregate2(n2,n3,c1,cor)
Cat Cat Num | aggregate(n3,by=list(c1,c2),func)
—————————————————————————————-
Num NA NA | mean(n1), median(n1), sd(n1), mad(n1)
Num Num NA | cor(n1,n2)
Num Num Num | cor(n1,n2), cor(n1,n3), cor(n2,n3) OR cor(data.frame(n1=n1,n2=n2,n3=n3))
—————————————————————————————-
R:
How would you create a summary of a dataset with the following variables?
Num Num Num
Data Summaries
—————————————————————————————-
1D 2D 3D | function
—————————————————————————————-
Cat NA NA | table(c1)
Cat Cat NA | table(c1,c2), chisq.test(table(c1,c2))
Cat Cat Cat | ftable(c1,c2,c3)
—————————————————————————————-
Cat Num NA | aggregate(n2,by=list(c1),func)
Cat Num Num | sbi$aggregate2(n2,n3,c1,cor)
Cat Cat Num | aggregate(n3,by=list(c1,c2),func)
—————————————————————————————-
Num NA NA | mean(n1), median(n1), sd(n1), mad(n1)
Num Num NA | cor(n1,n2)
Num Num Num | cor(n1,n2), cor(n1,n3), cor(n2,n3) OR cor(data.frame(n1=n1,n2=n2,n3=n3))
—————————————————————————————-
Univariate Descriptions of Numerical Data
How can we describe the center?
- center: mean, mean(x,trim=0.1), median
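A small R illustration of how the three center measures react to one extreme value (the vector is made up). With `trim=0.1`, the lowest and highest 10% of the values are dropped before averaging.

```r
# Hypothetical values with one extreme observation
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)

m_mean  <- mean(x)               # 14.5, pulled up by the value 100
m_trim  <- mean(x, trim = 0.1)   # 5.5, lowest and highest value dropped
m_media <- median(x)             # 5.5, robust against the extreme value
```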
Univariate Descriptions of Numerical Data
How can we describe the scatter?
- scatter: var, sd, cv
Univariate Descriptions of Numerical Data
How can we describe the distribution?
- distribution: quantile, IQR, max, min, range
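The same measures in base R, on a made-up vector. `quantile` uses R's default method (type 7); other quantile definitions can give slightly different values for small samples.

```r
x <- 1:9                # hypothetical values

q   <- quantile(x)      # 0%, 25%, 50%, 75%, 100%: 1, 3, 5, 7, 9
iqr <- IQR(x)           # 75% quantile minus 25% quantile: 4
rng <- range(x)         # minimum and maximum: 1 and 9
```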
Univariate Descriptions of Numerical Data
How can we describe the shape?
- shape: skewness, kurtosis
Univariate Descriptions of Numerical Data
How can we describe with graphics?
- plots: boxplot (barplot with arrows)