Weeks 1 And 2 Flashcards
True or false: To “operationalize” means to take observations and measurements and analyze them.
False. High-level theories and concepts lend importance to scientific investigations, but they can’t be measured directly. Operationalization involves turning these nebulous, but important, high-level ideas into observable measurements. The word probably came about because the end result of operationalization is a sequence of operations, unfolding in the lab or the field, that an investigator follows in his or her studies. To operationalize thus means to move from the big to the small: from theory to measurement.
In Harlow’s experiments with juvenile monkeys, which of the following was used to operationalize social bonding? a) using monkeys as test subjects; b) love between a mother and infant; c) amount of time spent clinging to objects, and d) wire mesh.
c) amount of time spent clinging to objects
A scatter plot presents measurements of the same variable on both the horizontal and vertical axes. True or false?
False. Scatter plots involve 2 distinct variables and present values of a different variable on each axis. Each point in a scatter plot thus represents the results of 2 measurements on one of the experimental subjects in the design, i.e., one measurement plotted on the y axis and the other on the x axis. The plot is meant to show the relationship between the two variables.
To make a histogram from a series of observations, the variable involved must be ordinal or better (i.e., either ordinal, interval, or ratio). True or false?
False. A histogram must use equal-sized bins, and equal spacing between values is not guaranteed for ordinal variables.
The mean is to the standard deviation as the median is to the... a) interquartile range; b) semi-interquartile range; c) variance, and d) range.
b) Semi-interquartile range. The mean (M) and standard deviation (SD) are often used to characterize data whose distribution is well behaved (no outliers, symmetric, and unimodal), with M ± 1 SD encompassing more than half of all the observations. For data that have outliers, are not symmetric, or are not normally distributed, use the median and the semi-interquartile range (SIR) to describe the centre and dispersion of the data.
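The contrast between the two pairs of summaries can be sketched in a few lines of Python. This is only an illustration: the `describe` helper and the `scores` data are made up for this card, and `statistics.quantiles` uses its default quartile method, which may differ slightly from the method used in class.

```python
import statistics

def describe(data):
    """Return both pairs of summaries: mean/SD and median/SIR."""
    # quantiles(n=4) returns the three cut points Q1, Q2, Q3
    # (default method; an assumption, other quartile rules exist).
    q1, q2, q3 = statistics.quantiles(data, n=4)
    return {
        "mean": statistics.mean(data),
        "sd": statistics.stdev(data),   # sample standard deviation
        "median": statistics.median(data),
        "sir": (q3 - q1) / 2,           # semi-interquartile range
    }

# Hypothetical data with one large outlier (50).
scores = [2, 3, 4, 5, 6, 7, 50]
summary = describe(scores)
print(summary)
```

With these made-up numbers the single outlier pulls the mean up to 11 and inflates the SD, while the median (5) and the SIR (2) still describe the bulk of the observations.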
An observation that is right in the middle of its distribution may appear as an outlier if you plot it in a scatter plot. True or false?
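True. Scatter plots show the RELATIONSHIP between two variables (e.g. height x weight). An observation can sit in the middle of the distribution of one variable (e.g. an average height) and still fall far from the overall trend because of its value on the other variable (e.g. an unusually high weight for that height).
Suppose the boxplot below describes a unimodal distribution. Where would you be likely to find the mode? (Note: the mode is the most commonly occurring observation). a) between the lower hinge and Q1; b) between Q1 and Q2; c) between Q2 and Q3; d) between Q3 and Q4, and e) above Q4. (image of a boxplot, with a median line near the top of the box)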
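One way to see this is with a small made-up dataset. In the sketch below (plain Python, hypothetical numbers, a hand-written least-squares fit rather than any particular method from class), the fifth point has an x value exactly at the median of x and a y value that is inside the range of the other y values, yet it is the point farthest from the fitted line.

```python
import statistics

# Hypothetical data: y follows x closely except for the fifth point.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 2, 3, 4, 9, 6, 7, 8, 9]

# Ordinary least-squares line, computed by hand.
mean_x = statistics.mean(xs)
mean_y = statistics.mean(ys)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# Vertical distance of each point from the fitted line.
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
worst = max(range(len(xs)), key=lambda i: abs(residuals[i]))

print("point farthest from the trend:", (xs[worst], ys[worst]))
print("median of x:", statistics.median(xs))
```

The point (5, 9) is unremarkable on either axis taken alone, but it stands out once the two variables are plotted against each other.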
c) Between Q2 and Q3.
Explanation: According to the boxplot, 25% of values in this distribution are clustered in the narrow range between Q2 and Q3. This is why the median appears very close to the top hinge (i.e., close to Q3). The other 75% of data are much more spread out. That means that the mode, the most commonly occurring observation, probably lies between Q2 and Q3.
What goes on the vertical axis of a boxplot? (Assume that the boxplot is laid out vertically so that Q3 is shown vertically above Q2, and so on) a) values of the variable under examination; b) frequencies of the occurrence of observations in bins; c) bins for the variable under examination, and d) category names.
a) values of the variable under examination.
Explanation: The vertical axis of a boxplot indicates values of the variable under examination. Remember that the Q1, Q2, and Q3 points are all displayed in the boxplot, so the vertical scale must be able to accommodate these values. Histograms, not boxplots, show accumulated frequencies of observations in bins.
According to the criteria used to construct boxplots in class, does the dataset {5,3,7,1} contain an outlier? Assume: 25th percentile = 1.5; 50th percentile = 4; 75th percentile = 6.5. Yes or no?
No. Explanation: To answer this question, you need to determine whether any points in the dataset lie more than 1.5 interquartile ranges (IQRs) beyond the upper and lower hinges. The lower hinge (same as Q1) is at 1.5 and the upper hinge (same as Q3) is at 6.5. Therefore, IQR = 6.5 - 1.5 = 5, and outliers must lie more than 1.5*5 = 7.5 units beyond the upper or lower hinge. The inner fence at the bottom end of the distribution is thus 1.5 - 7.5 = -6 and the top inner fence is 6.5 + 7.5 = 14. Referring back to the dataset, all values fall inside these fences, so there are no outliers.
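The same arithmetic can be checked with a short Python sketch. The function names `inner_fences` and `find_outliers` are made up for this card; the quartiles are passed in directly, as given in the question, rather than computed from the data.

```python
def inner_fences(q1, q3, k=1.5):
    """Return (lower, upper) inner fences, k IQRs beyond the hinges."""
    iqr = q3 - q1
    step = k * iqr
    return q1 - step, q3 + step

def find_outliers(data, q1, q3, k=1.5):
    """Return the values that lie outside the inner fences."""
    lower, upper = inner_fences(q1, q3, k)
    return [x for x in data if x < lower or x > upper]

data = [5, 3, 7, 1]
lower, upper = inner_fences(1.5, 6.5)
outliers = find_outliers(data, 1.5, 6.5)
print(lower, upper, outliers)  # -6.0 14.0 []
```

The empty list matches the answer on the card: no value falls below -6 or above 14.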
Which of the boxplots below most closely corresponds to the distribution shown in the histogram on the left? (Histogram is positively skewed, with the mean closer to the left.) a) large plot with box near the bottom and median below the centre of the box; b) box near the top of the plot with median above the centre of the box; c) small plot with box in the centre and median in the centre; d) smallish plot with box in the centre, median in the centre, and a single outlier way above the plot.
a) a large plot with the box near the bottom of the plot and the median below the centre of the box. Explanation: You can answer this question by a process of elimination. The distribution in the histogram is clearly positively skewed. Boxplot c represents a symmetric distribution, not a skewed one, so that isn’t it. Answer d is wrong because it shows a symmetric distribution with a single outlier at the top, which does not appear in the histogram. To make more sense of the remaining boxplots, turn them on their side: a has most of its values toward the left, matching the histogram, whereas b has more values on the right. Therefore, a is the right answer.
What advantage does a plot using error bars have over a boxplot? a) it can detect multiple modes in a distribution; b) it is better able to allow comparison of the properties of multiple distributions; c) it is better able to detect outliers, and d) it is more flexible: different properties of a distribution can be displayed.
d) it is more flexible: different properties of a distribution can be displayed.
Explanation: Boxplots and plots using error bars summarize data before presenting them graphically. Because of this, they sometimes miss things in your data. For instance, if you want to know whether your data are multimodal, you need a histogram. However, both types of plots allow side-by-side comparison of many distributions at once, and this is one of their main strengths. Plots with error bars are more flexible than boxplots in the choice of which properties of the distributions can be compared. Error bars may be used to indicate the SD, the SE of the mean, a CI, or some multiple of one of these measures. Boxplots, in contrast, always show Q1, Q3, the upper and lower inner fences, and the location of outliers. This means that they are less flexible than plots involving error bars. Plots involving error bars do not normally show outliers.
What do you call a normal distribution, according to how the data is “pushed together”?
mesokurtic
What is considered a “thin” distribution?
leptokurtic
What is considered a “flat” distribution?
platykurtic
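The three kurtosis terms can be tied to a number. The sketch below is only an illustration: it uses excess kurtosis (fourth central moment divided by the squared variance, minus 3, so that a normal distribution scores about 0), and the `tolerance` cut-off of 0.5 is an arbitrary assumption, not a standard threshold or one from class.

```python
def excess_kurtosis(data):
    """Population excess kurtosis: m4 / m2**2 - 3 (about 0 for a normal)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # variance
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

def describe_shape(data, tolerance=0.5):
    """Label a dataset; `tolerance` is a hypothetical cut-off."""
    k = excess_kurtosis(data)
    if k > tolerance:
        return "leptokurtic"   # "thin": sharp peak, heavy tails
    if k < -tolerance:
        return "platykurtic"   # "flat": broad, light tails
    return "mesokurtic"        # close to normal

flat = [1, 2, 3, 4, 5]          # evenly spread values
peaked = [0] * 8 + [-5, 5]      # most values piled up at the centre

print(describe_shape(flat))     # platykurtic
print(describe_shape(peaked))   # leptokurtic
```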