IDE 620 Week 2 Flashcards

1
Q

A way of depicting frequency distributions for categorical (nominal) variables, such as religious affiliation, ethnic group, or state of residence

A

Bar Graph

Note that the bars do not touch in a bar graph, as they do in a histogram.

2
Q

A distribution having two modes or peaks.

A

Bimodal Distribution

Strictly speaking, for a distribution to be called bimodal, the peaks should be the same height. However, it is quite common to call any two-humped distribution bimodal, even when the high points are not exactly equal.

3
Q

Any of several methods, pioneered by John Tukey, of discovering unanticipated patterns and relationships, often by presenting quantitative data visually. The stem-and-leaf display and the box-and-whisker diagram are well-known examples.

A

Exploratory Data Analysis (EDA)

4
Q

A tally of the number of times each score occurs in a group of scores. More formally, a way of presenting data that shows the number of cases having each of the attributes of a particular variable.

A

Frequency Distribution
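
As a minimal sketch of such a tally, the snippet below counts how often each score occurs. The scores are made up for illustration, and `Counter` is only one of several ways to build the table.

```python
from collections import Counter

scores = [3, 5, 3, 4, 5, 5, 2, 3, 5]  # made-up scores for illustration

# Tally how many times each score occurs.
frequency = Counter(scores)

# Print the distribution in score order: one "score count" pair per line.
for score in sorted(frequency):
    print(score, frequency[score])
```

The counts always add up to the number of cases, which is a quick check on any frequency table.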

5
Q

A graph with frequency shown by the height of contiguous bars, used for variables measured at the interval and ratio levels.

A

Histogram

Because the data in a histogram are interval or ratio, the bars should touch; in a bar graph for nominal or ordinal data, the bars do not touch.

6
Q

A graphic depiction of data relying on one or more lines.

A

Line graph

The lines can be linear or curvilinear. For example, the business cycle over the past 50 years might be plotted on a line graph.

7
Q

The most common (most frequent) score in a set of scores.

A

Mode

8
Q

A distribution of scores or measures that, when plotted on a graph, produce a nonsymmetrical curve.

A

Skewed Distribution 

A positively (or upward or right) skewed distribution is one in which the infrequent scores are on the high or right side of the x-axis, such as the scores on a difficult test. A negatively (or downward or left) skewed distribution is one in which the rare values are on the low or left side of the x-axis, such as the scores on an easy test. One way to sort out which is which is to remember that a skewer is a pointy thing; when the pointy end of the distribution is on the right, it is right skewed, and conversely for left skewed.

9
Q

The points falling between half a measurement unit below and half a unit above the number.

A

Real Limits (of a Number)  
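
A small sketch of the definition: the helper below returns the lower and upper real limits of a number for a given measurement unit. The function name `real_limits` and its `unit` parameter are hypothetical, chosen only for this illustration.

```python
def real_limits(value, unit=1.0):
    """Return (lower, upper): half a measurement unit below and above value."""
    half = unit / 2
    return value - half, value + half

# A score of 20 measured in whole units.
print(real_limits(20))

# A reading of 3.5 measured to the nearest tenth.
print(real_limits(3.5, unit=0.1))
```

With whole-number measurement, a score of 20 therefore stands for any value from 19.5 up to 20.5.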

10
Q

The degree to which measures or scores are bunched on one side of a central tendency and trail out (become pointy, like a skewer) on the other.

A

Skewness

The more skewness in a distribution, the more variability in the scores.

Computer programs often compute indexes of skewness. Positive values indicate a positive or right skew. Negative values indicate a negative or left skew.
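
One common index is the moment-based coefficient, the third central moment divided by the 1.5 power of the second. The sketch below assumes that formula; statistical packages may apply a bias correction, so their values can differ slightly. The data are made up.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

right_skewed = [1, 2, 2, 3, 3, 3, 10]      # rare high score: tail on the right
left_skewed = [-x for x in right_skewed]   # mirror image: tail on the left

print(skewness(right_skewed))  # positive value
print(skewness(left_skewed))   # negative value
```

A perfectly symmetrical set of scores, such as 1 through 5, gives an index of zero.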

11
Q

A way of recording the values of a variable, created by John Tukey, that presents raw numbers in a visual, histogram-like display. It is a histogram in which the bars are built out of numbers.

A

Stem-and-Leaf Display
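
The sketch below builds a simple stem-and-leaf display, assuming non-negative whole-number scores with the tens digits as stems (leading digits) and the units digit as leaves (trailing digits). The function name and the sample scores are illustrative only.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return display lines for a non-empty list of non-negative integers."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # leading digits, trailing digit
        rows[stem].append(leaf)
    lines = []
    # Include empty stems so gaps in the distribution stay visible.
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}")
    return lines

for line in stem_and_leaf([12, 15, 21, 23, 23, 38, 40]):
    print(line)
```

Each row's length shows the frequency for that stem, which is why the display reads like a histogram turned on its side.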

12
Q

A distribution with only one mode

A

Unimodal

13
Q

Any of several statistical summaries that, in a single number, represent the typical or average number in a group of numbers. Examples include the mean, mode, and median.

A

Measures of central tendency

A batting average is a well-known measure of central tendency in the United States. A grade point average might be a more important example for many college students.

14
Q

A variable that can take on many possible values

A

Continuous variable

15
Q

A variable that takes on only a few possible values

A

Discrete variable

16
Q

The variable you are measuring

A

Dependent variable

17
Q

The variable that you manipulate

A

Independent variable

18
Q

Person who created the stem-and-leaf display and EDA (exploratory data analysis)

A

John Tukey

19
Q

Leftmost digits of a number

A

Leading digits (most significant digits)

20
Q

Vertical axis of a stem-and-leaf display

A

Stem

21
Q

Digits to the right of the leading digits

A

Trailing digits (less significant digits)

22
Q

Horizontal axis of a stem-and-leaf display, containing the trailing digits

A

Leaf

23
Q

Average; most popular measure of location or central tendency; has the desirable mathematical property of minimizing the sum of squared deviations from it.

A

Mean (arithmetic mean)

24
Q

Computed by taking the nth root of the product of n scores (e.g., the square root of the product of 2 scores, the cube root of the product of 3, etc.).

A

Geometric mean
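
The definition can be checked directly, as in this sketch for positive scores. The hand-written `geometric_mean` helper is for illustration; Python 3.8+ also provides `statistics.geometric_mean`.

```python
import math
import statistics

def geometric_mean(scores):
    """nth root of the product of n positive scores."""
    n = len(scores)
    return math.prod(scores) ** (1 / n)

print(geometric_mean([2, 8]))     # square root of 2 * 8
print(geometric_mean([1, 3, 9]))  # cube root of 1 * 3 * 9

# The standard-library function should agree up to rounding error.
print(statistics.geometric_mean([1, 3, 9]))
```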

25
Q

Calculated by dividing n by the sum of the reciprocals of the numbers.

A

Harmonic mean

For positive numbers that are not all equal, it is smaller than both the arithmetic mean and the geometric mean.

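A short sketch of the calculation, using made-up positive scores, also shows the ordering of the three means. The `harmonic_mean` helper is written out for illustration; `statistics.harmonic_mean` does the same job.

```python
import statistics

def harmonic_mean(numbers):
    """n divided by the sum of the reciprocals of the numbers."""
    return len(numbers) / sum(1 / x for x in numbers)

scores = [2, 4, 8]  # made-up positive scores

hm = harmonic_mean(scores)
gm = statistics.geometric_mean(scores)
am = statistics.mean(scores)

# For positive scores that are not all equal: harmonic < geometric < arithmetic.
print(hm, gm, am)
```
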
26
Q

The middle score or measurement in a set of ranked scores or measurements; the point that divides a distribution into two equal halves; the 50th percentile.

A

Median

27
Q

The most common (most frequent) score in a set of scores.

A

Mode

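As a quick illustration of the median and the mode, the sketch below uses Python's `statistics` module on made-up scores. `statistics.multimode` (Python 3.8+) returns every mode, so it also shows whether a set of scores is unimodal or bimodal.

```python
import statistics

scores = [2, 3, 3, 5, 7, 7, 9]  # made-up scores for illustration

# Median: the middle value once the scores are ranked.
middle = statistics.median(scores)

# All most-frequent scores; two entries here, so the scores are bimodal.
modes = statistics.multimode(scores)

print(middle, modes)
```
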
28
Q

A mean computed after removing the extreme observations.

A

Trimmed Mean

A trimmed mean is thus a measure of central tendency that allows the researcher to deal separately with a distribution’s outliers.

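A minimal sketch of one way to trim, assuming the same proportion of observations is dropped from each end of the ranked scores. The function name, the default of 10%, and the data are illustrative choices, not a fixed convention.

```python
def trimmed_mean(values, proportion=0.1):
    """Mean after dropping `proportion` of the observations from each end."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # observations to drop per end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # 100 is an extreme observation

print(sum(scores) / len(scores))   # ordinary mean, pulled up by the outlier
print(trimmed_mean(scores, 0.1))   # mean of the middle eight scores
```
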
29
Q

Anything that produces systematic error in a research finding; causes of bias can range from poor data collection to flawed measurements to inappropriate statistical analysis. While the distortions due to systematic error continue to grow in the long run, random errors tend to balance out in the long run.

A

Bias

30
Q

The effects of any factor that the researcher did not expect to influence the dependent variable.

A

Bias

31
Q

A type of graph in which boxes and lines show a distribution’s shape, central tendency, and variability. The “boxplot,” as it is often called, gives an informative picture of the values of a single variable and is helpful for indicating whether a distribution is skewed and has outliers.

A

Box-and-whisker diagram

32
Q

The upper and lower boundaries of each box in a box-and-whisker diagram.

A

Hinges

33
Q

The number of values “free to vary” when computing an inferential statistic. It’s the number of pieces of information that can vary independently of one another or, alternatively stated, the number of unconstrained observations used in calculating an estimate.

A

Degrees of freedom

34
Q

A statistic showing the amount of variation or spread in the scores for, or values of, a variable.

A

Measure of dispersion

When the dispersion is large, the scores or values are widely scattered; when it is small, they are tightly clustered. Two commonly used measures of dispersion are the variance and the standard deviation. A measure of dispersion always implies the presence of a measure of central tendency, such as a mean. For example, the standard deviation measures deviation from the mean.

35
Q

The mean value of a variable in repeated samplings or trials.

A

Expected value

36
Q

The mean of the sampling distribution of a statistic.

A

Expected value

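Both definitions can be illustrated with a small simulation. The sketch below assumes a fair six-sided die as the population and uses a fixed random seed; the sample size of 5 and the 10,000 repetitions are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

die = [1, 2, 3, 4, 5, 6]
population_mean = statistics.mean(die)  # 3.5 for a fair die

# Draw many samples of 5 rolls each and record each sample's mean.
sample_means = [
    statistics.mean(random.choices(die, k=5))
    for _ in range(10_000)
]

# The mean of these sample means approximates the expected value.
mean_of_sample_means = statistics.mean(sample_means)
print(population_mean, mean_of_sample_means)
```

Individual sample means vary from trial to trial, but their long-run average settles near the population mean.
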
37
Q

The middle half of a distribution. A measure of dispersion calculated by taking the difference between the first and third quartiles (that is, the 25th and 75th percentiles). Also called “midspread.”

A

Interquartile Range (IQR)

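A sketch of the calculation using `statistics.quantiles` (Python 3.8+) on made-up scores. The `method="inclusive"` setting is an assumption for this example; textbooks and software use several quartile conventions, so results on small data sets can differ slightly.

```python
import statistics

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up scores for illustration

# The three cut points that divide the ranked scores into four groups.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

# Interquartile range: spread of the middle half of the distribution.
iqr = q3 - q1

print(q1, q2, q3, iqr)
```

The second cut point, `q2`, is the median.
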
38
Q

A subject or other unit of analysis that has an extreme value on a variable or a combination of variables or has a large residual value.

A

Outlier

Outliers are important because they can distort the interpretation of data or make a summary statistic (such as a mean) misleading. Outliers may also indicate that a sampling error has occurred by including a case from a population different from the target population.

39
Q

Divisions of the total rank-ordered cases or observations in a study into four groups of equal size. Technically, the three points that divide a series of ordered scores into four groups.

A

Quartiles

40
Q

A measure of variability, of the spread or the dispersion of values in a series of values.

A

Range

To get the range of a set of scores, you subtract the lowest value or score from the highest.

41
Q

A statistic that shows the spread, variability, or dispersion of scores in a distribution of scores. It is a measure of the average amount the scores in a distribution deviate from the mean.

A

Standard deviation

The standard deviation is the square root of the variance. As a variable, it is symbolized as SD, Sd, s, or lowercase sigma (σ).

42
Q

A measure of the spread of scores in a distribution of scores, that is, a measure of dispersion.

A

Variance

The larger the variance, the farther the individual scores are from the mean. The smaller the variance, the closer the individual scores are to the mean. Specifically, the variance is the sum of the squared deviations from the mean score divided by the number of scores; that is, it is the mean of the squared deviations, or the average squared distance from the mean. (See sum of squares for an example.) Taking the square root of the variance gives you the standard deviation (i.e., it converts the variance into regular, nonsquared units). A variance cannot be less than zero, nor can the standard deviation.

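The calculation can be sketched step by step on made-up scores. This version divides by the number of scores, as in the definition above; the sample variance used for inference divides by n - 1 instead (`statistics.variance`), which ties in with degrees of freedom.

```python
import math
import statistics

def variance(scores):
    """Sum of squared deviations from the mean, divided by the number of scores."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up scores; their mean is 5

var = variance(scores)
standard_deviation = math.sqrt(var)  # back to regular, nonsquared units

print(var, standard_deviation)

# Standard-library equivalents for the divide-by-n version.
print(statistics.pvariance(scores), statistics.pstdev(scores))
```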