IDE 620 Week 2 Flashcards
A way of depicting frequency distributions for categorical (nominal) variables, such as religious affiliation, ethnic group, or state of residence
Bar Graph
Note that the bars do not touch in a bar graph, as they do in a histogram.
A distribution having two modes or peaks.
Bimodal Distribution
Strictly speaking, for a distribution to be called bimodal, the peaks should be the same height. However, it is quite common to call any two-humped distribution bimodal, even when the high points are not exactly equal.
Any of several methods, pioneered by John Tukey, of discovering unanticipated patterns and relationships, often by presenting quantitative data visually. The stem-and-leaf display and the box-and-whisker diagram are well-known examples.
Exploratory Data Analysis (EDA)
A tally of the number of times each score occurs in a group of scores. More formally, a way of presenting data that shows the number of cases having each of the attributes of a particular variable.
Frequency Distribution
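A minimal sketch of a frequency distribution in Python, tallying a hypothetical list of nominal responses (the data values are invented for illustration):

```python
from collections import Counter

# Hypothetical nominal (categorical) responses -- invented for illustration.
responses = ["Protestant", "Catholic", "None", "Catholic",
             "Jewish", "Protestant", "None", "Protestant"]

# Counter tallies the number of times each attribute occurs.
frequency_distribution = Counter(responses)

for attribute, count in frequency_distribution.most_common():
    print(f"{attribute}: {count}")
```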
A graph with frequency shown by the height of contiguous bars, used for variables measured at the interval and ratio levels.
Histogram
Because the data in a histogram are interval or ratio, the bars should touch; in a bar graph for nominal or ordinal data, the bars do not touch.
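A rough sketch of the bar graph/histogram distinction, assuming matplotlib is installed; the category counts and interval-level scores are invented:

```python
import matplotlib.pyplot as plt

# Invented data for illustration only.
categories = ["Protestant", "Catholic", "Jewish", "None"]
counts = [12, 9, 3, 6]                      # nominal data -> bar graph
scores = [55, 61, 64, 68, 70, 71, 73, 75,
          76, 78, 80, 82, 85, 88, 93]       # interval data -> histogram

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

# Bar graph: separated bars, because the categories are not contiguous.
ax1.bar(categories, counts, width=0.6)
ax1.set_title("Bar graph (nominal)")

# Histogram: contiguous bars, because the score scale is continuous.
ax2.hist(scores, bins=5, edgecolor="black")
ax2.set_title("Histogram (interval/ratio)")

plt.tight_layout()
plt.show()
```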
A graphic depiction of data relying on one or more lines.
Line graph
The lines can be linear or curvilinear. For example, the business cycle over the past 50 years might be plotted on a line graph.
The most common (most frequent) score in a set of scores.
Mode
A distribution of scores or measures that, when plotted on a graph, produces a nonsymmetrical curve.
Skewed Distribution
A positively (or upward or right) skewed distribution is one in which the infrequent scores are on the high or right side of the x-axis, such as the scores on a difficult test. A negatively (or downward or left) skewed distribution is one in which the rare values are on the low or left side of the x-axis, such as the scores on an easy test. One way to sort out which is which is to remember that a skewer is a pointy thing; when the pointy end of the distribution is on the right, it is right skewed, and conversely for left skewed.
The points falling between half a measurement unit below and half a unit above the number.
Real Limits (of a Number)
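A small illustration of real limits; the helper name real_limits is hypothetical, and the measurement unit defaults to 1:

```python
def real_limits(number, unit=1.0):
    """Return the real limits of a number measured to the nearest `unit`."""
    half = unit / 2
    return number - half, number + half

# A score of 5 measured to the nearest whole unit has real limits 4.5 to 5.5.
print(real_limits(5))          # (4.5, 5.5)

# A weight of 68.3 measured to the nearest tenth has real limits of roughly
# 68.25 to 68.35 (floating-point output may show a tiny rounding error).
print(real_limits(68.3, 0.1))
```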
The degree to which measures or scores are bunched on one side of a central tendency and trail out (become pointy, like a skewer) on the other.
Skewness
The more skewness in a distribution, the more variability in the scores.
Computer programs often compute indexes of skewness. Positive values indicate a positive or right skew. Negative values indicate a negative or left skew.
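A minimal sketch of one common skewness index (the Fisher-Pearson coefficient, g1); the right-skewed scores are invented, and statistical packages may use slightly different formulas:

```python
from statistics import mean, pstdev

def skewness_g1(scores):
    """Fisher-Pearson coefficient: mean cubed deviation / (population SD) ** 3."""
    m = mean(scores)
    s = pstdev(scores)
    n = len(scores)
    return sum((x - m) ** 3 for x in scores) / (n * s ** 3)

# Invented scores from a "difficult test": most scores low, a few high.
difficult_test = [35, 38, 40, 42, 44, 45, 47, 50, 72, 90]
print(skewness_g1(difficult_test))   # positive -> right (positively) skewed
```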
A way of recording the values of a variable, created by John Tukey, that presents raw numbers in a visual, histogram-like display. It is a histogram in which the bars are built out of numbers.
Stem-and-Leaf Display
A distribution with only one mode
Unimodal
Any of several statistical summaries that, in a single number, represent the typical or average number in a group of numbers. Examples include the mean, mode, and median.
Measures of central tendency
A batting average is a well-known measure of central tendency in the United States. A grade point average might be a more important example for many college students.
A variable that can take on any value within its range (an unlimited number of possible values)
Continuous variable
A variable that takes on only a limited set of separate, distinct values
Discrete variable
The variable you are measuring
Dependent variable
The variable that you manipulate
Independent variable
Person who created the stem-and-leaf display and EDA (exploratory data analysis)
John Tukey
Leftmost digits of a number
Leading digits (most significant digits)
Vertical axis of a stem-and-leaf display
Stem
Digits to the right of the leading digits
Trailing digits (less significant digits)
Horizontal axis of a stem-and-leaf display, containing the trailing digits
Leaves
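A minimal sketch of building a stem-and-leaf display, assuming two-digit scores so the leading digit is the stem and the trailing digit is the leaf (the scores are invented):

```python
from collections import defaultdict

# Invented two-digit test scores.
scores = [62, 65, 71, 73, 73, 78, 81, 84, 85, 85, 90, 93]

stems = defaultdict(list)
for score in sorted(scores):
    stems[score // 10].append(score % 10)   # leading digit -> trailing digit

# Stems form the vertical axis; leaves extend horizontally.
for stem in sorted(stems):
    leaves = "".join(str(leaf) for leaf in stems[stem])
    print(f"{stem} | {leaves}")
# 6 | 25
# 7 | 1338
# 8 | 1455
# 9 | 03
```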
Average; the most popular measure of location or central tendency; has the desirable mathematical property of minimizing the sum of squared deviations.
Mean
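A quick check of that minimizing property, with invented scores: the sum of squared deviations about the mean is smaller than about any other candidate value.

```python
from statistics import mean

scores = [2, 4, 4, 5, 7, 8]
m = mean(scores)                       # 5.0

def sum_sq_dev(center):
    return sum((x - center) ** 2 for x in scores)

# The sum of squared deviations is smallest when taken about the mean.
for center in (4.0, m, 6.0):
    print(center, sum_sq_dev(center))  # 30, then 24 (the minimum), then 30
```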
Computed by taking the nth root of the product of n scores (e.g., the square root of the product of 2 scores, the cube root of the product of 3 scores, etc.).
Geometric mean
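A minimal sketch with invented growth factors; Python 3.8+ also provides statistics.geometric_mean as a built-in check:

```python
import math
from statistics import geometric_mean   # available in Python 3.8+

# Invented yearly growth factors (e.g., 1.10 = 10% growth).
factors = [1.10, 1.25, 0.96]

# nth root of the product of n scores.
by_hand = math.prod(factors) ** (1 / len(factors))
print(by_hand)
print(geometric_mean(factors))           # should match, within rounding
```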
Calculated by dividing n by the sum of the reciprocals of the n numbers.
Harmonic mean (never larger than the geometric or arithmetic mean)
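A minimal sketch with invented speeds (a classic use case for averaging rates); statistics.harmonic_mean offers a built-in check:

```python
from statistics import harmonic_mean, geometric_mean, mean

# Invented speeds in km/h over equal distances.
speeds = [40, 60]

# n divided by the sum of the reciprocals of the numbers.
n = len(speeds)
by_hand = n / sum(1 / x for x in speeds)

print(by_hand)                   # 48.0
print(harmonic_mean(speeds))     # 48.0
print(geometric_mean(speeds))    # ~48.99 -- larger than the harmonic mean
print(mean(speeds))              # 50.0  -- larger still
```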
The middle score or measurement in a set of ranked scores or measurements; the point that divides a distribution into two equal halves; the 50th percentile.
Median
A mean computed after removing the extreme observations.
Trimmed Mean
Thus, a trimmed mean is a measure of central tendency that allows the researcher to deal separately with a distribution’s outliers.
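A minimal sketch of a trimmed mean, with an invented data set containing one outlier; here 10% is trimmed from each end, a conventional but arbitrary choice:

```python
from statistics import mean

def trimmed_mean(scores, proportion=0.10):
    """Drop the highest and lowest `proportion` of scores, then average the rest."""
    ordered = sorted(scores)
    k = int(len(ordered) * proportion)        # number trimmed from each end
    trimmed = ordered[k:len(ordered) - k] if k else ordered
    return mean(trimmed)

# Invented scores with one extreme outlier (250).
scores = [12, 14, 15, 15, 16, 17, 18, 19, 20, 250]
print(mean(scores))            # 39.6  -- pulled up by the outlier
print(trimmed_mean(scores))    # 16.75 -- highest and lowest scores removed
```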
Anything that produces systematic error in a research finding; causes of bias can range from poor data collection to flawed measurements to inappropriate statistical analysis. While the distortions due to systematic error persist in the long run, random errors tend to balance out.
Bias
The effects of any factor that the researcher did not expect to influence the dependent variable.
Bias
A type of graph in which boxes and lines show a distribution’s shape, central tendency, and variability. The “boxplot,” as it is often called, gives an informative picture of the values of a single variable and is helpful for indicating whether a distribution is skewed and has outliers.
Box-and-whisker diagram
The upper and lower boundaries of each box in a box-and-whisker diagram
Hinges
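A minimal sketch of Tukey's hinges (the box boundaries), computed as medians of the lower and upper halves of the ordered data; the scores are invented, and some software instead uses the 25th and 75th percentiles, which can differ slightly:

```python
from statistics import median

# Invented scores.
scores = [3, 5, 7, 8, 9, 11, 13, 14, 18, 21, 40]

ordered = sorted(scores)
n = len(ordered)

# Tukey's hinges: medians of each half (the overall median is shared when n is odd).
lower_hinge = median(ordered[: (n + 1) // 2])
upper_hinge = median(ordered[n // 2 :])

print(lower_hinge, median(ordered), upper_hinge)   # 7.5 11 16
```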
The number of values “free to vary” when computing an inferential statistic. It’s the number of pieces of information that can vary independently of one another or, alternatively stated, the number of unconstrained observations used in calculating an estimate.
Degrees of freedom
A statistic showing the amount of variation or spread in the scores for, or values of, a variable.
Measure of dispersion
When the dispersion is large, the scores or values are widely scattered; when it is small, they are tightly clustered. Two commonly used measures of dispersion are the variance and the standard deviation. A measure of dispersion always implies the presence of a measure of central tendency, such as a mean. For example, the standard deviation measures deviation from the mean.
The mean value of a variable in repeated samplings or trials
Expected value
The mean of the sampling distribution of a statistic.
Expected value
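A small simulation sketch: the expected value of the sample mean equals the population mean, so the average of many sample means should land close to it (the population and sample size below are invented):

```python
import random
from statistics import mean

random.seed(1)

# Invented population of scores.
population = [2, 3, 5, 7, 8, 10, 12, 15]
print(mean(population))                        # population mean = 7.75

# Sampling distribution of the mean: draw many samples, record each sample mean.
sample_means = [mean(random.choices(population, k=4)) for _ in range(10_000)]

# The mean of the sampling distribution (the expected value) is close to 7.75.
print(mean(sample_means))
```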
The middle half of a distribution. A measure of dispersion calculated by taking the difference between the first and third quartiles (that is, the 25th and 75th percentiles). Also called “midspread.”
Interquartile Range (IQR)
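A minimal sketch of the interquartile range using statistics.quantiles; the scores are invented, and different programs use slightly different quartile conventions, so results can vary a bit:

```python
from statistics import quantiles

# Invented scores.
scores = [4, 7, 9, 10, 11, 12, 13, 15, 18, 22, 30, 45]

q1, q2, q3 = quantiles(scores, n=4)   # 25th, 50th, and 75th percentiles
iqr = q3 - q1                         # the middle half of the distribution
print(q1, q3, iqr)
```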
A subject or other unit of analysis that has an extreme value on a variable or a combination of variables or has a large residual value.
Outlier
Outliers are important because they can distort the interpretation of data or make a summary statistic (such as a mean) misleading. Outliers may also indicate that a sampling error has occurred by including a case from a population other than the target population.
Divisions of the total rank-ordered cases or observations in a study into four groups of equal size. Technically, the three points that divide a series of ordered scores into four groups.
Quartiles
A measure of variability, of the spread or the dispersion of values in a series of values.
Range
To get the range of a set of scores, you subtract the lowest value or score from the highest.
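A tiny illustration with invented scores:

```python
# Invented scores.
scores = [55, 61, 70, 72, 75, 80, 93]

# Range: highest value minus lowest value.
print(max(scores) - min(scores))   # 93 - 55 = 38
```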
A statistic that shows the spread, variability, or dispersion of scores in a distribution of scores. It is a measure of the average amount the scores in a distribution deviate from the mean.
Standard deviation
The standard deviation is the square root of the variance. As a variable, it is symbolized as SD, Sd, s, or lowercase sigma (σ).
A measure of the spread of scores in a distribution of scores, that is, a measure of dispersion.
Variance
The larger the variance, the farther the individual cases are from the mean. The smaller the variance, the closer the individual scores are to the mean.
Specifically, the variance is the sum of the squared deviations from the mean divided by the number of scores. That is, it is the average squared distance from the mean. (See sum of squares for an example.) Taking the square root of the variance gives you the standard deviation (i.e., it converts the variance into regular, nonsquared units). A variance cannot be less than zero, nor can the standard deviation.
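A minimal sketch computing the (population) variance by hand and confirming that the standard deviation is its square root; the scores are invented, and note that sample formulas divide by n - 1 instead of n:

```python
import math
from statistics import pvariance, pstdev

# Invented scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

m = sum(scores) / len(scores)                                # mean = 5.0
variance = sum((x - m) ** 2 for x in scores) / len(scores)   # mean squared deviation
sd = math.sqrt(variance)                                     # SD = sqrt(variance)

print(variance, sd)                          # 4.0 2.0
print(pvariance(scores), pstdev(scores))     # built-in checks: 4.0 2.0
```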