KEY TERMS Flashcards
Define Population (parameter)
Entire collection of events in which you are interested.
e.g., scores of all morphine-injected mice, or milk production of all cows in the country
Define Sample (statistics)
Set of actual observations; subset of a population
Define Statistics
Numerical values summarizing sample data
Define Parameters
Numerical values summarizing population data
Define Random Sample
A sample in which each member of a population has an equal chance of inclusion in the study.
Define Decision Tree
Graphical representation of decisions involved in the choice of statistical procedures
Define Measurement Data (quantitative data)
Data obtained by measuring objects or events
Define Categorical Data (frequency data & count data)
Data representing counts, or the number of observations, in each category
Define Measurement
The assignment of numbers to objects.
e.g., with paw-lick latency as a measure of pain sensitivity, we measure sensitivity by assigning a number (a time) to an object (a mouse) to assess the sensitivity of that mouse.
Define Scales of Measurement
Characteristics of relations among numbers assigned to objects.
Define Nominal Scale
Numbers used only to distinguish among objects.
e.g., numbers on jerseys have no meaning of their own; each is just a convenient
label that distinguishes players or their positions from one another, used for the purpose of classification
Note: categorical data are often measured on a nominal scale b/c we merely assign category labels (e.g. male or female; same-context group or different-context group) to observations.
Define Ordinal Scale
Numbers used only to place objects in order. Orders people, objects, or events along some continuum, e.g. 1, 2, 3, 4, or number of changes rather than ranks.
Define Interval Scale
Scale on which equal intervals b/w objects represent equal differences; differences are meaningful.
e.g., the Fahrenheit scale of temperature, in which a 10-point difference has the same meaning anywhere along the scale. Thus, the difference in temp. b/w 10F and 20F is the same as the difference b/w 80F and 90F
Define Ratio Scale
A scale with a true zero point- ratios are meaningful
e.g., the common physical scales of length and volume.
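A minimal Python sketch of the interval vs. ratio distinction, using the Fahrenheit example from the cards above. The helper name `fahrenheit_to_celsius` and the centimetre-to-inch lengths are illustrative assumptions, not part of the original cards. It shows that equal differences survive a change of temperature units, while ratios only survive on a scale with a true zero point.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (f - 32) * 5 / 9

# Interval scale: a 10-degree F difference is the same size anywhere on the scale.
diff_low = fahrenheit_to_celsius(20) - fahrenheit_to_celsius(10)
diff_high = fahrenheit_to_celsius(90) - fahrenheit_to_celsius(80)

# But ratios are NOT meaningful: 20F is not "twice as hot" as 10F,
# because the ratio changes when the units change.
ratio_f = 20 / 10
ratio_c = fahrenheit_to_celsius(20) / fahrenheit_to_celsius(10)

# Ratio scale (length has a true zero): the ratio is the same in cm or inches.
ratio_cm = 20 / 10
ratio_in = (20 / 2.54) / (10 / 2.54)

print(diff_low, diff_high)   # equal differences
print(ratio_f, ratio_c)      # different ratios
print(ratio_cm, ratio_in)    # same ratio
```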
Define Variables
Properties of objects or events that can take on different values.
e.g. hair colour, b/c it is a property of an object (hair) that can take on different values (brown, blonde, black). Properties such as height, length, and speed are variables for the same reason, as are bib numbers, position in a race, etc.
Define Discrete Variables
Variables that take on a small set of possible values.
(e.g. gender, marital status, # of TVs in home)
- there are no decimals in their values
Define Continuous Variables
Variables that take on any value (e.g. speed, paw-lick latency, amt. of milk produced by a cow, etc.)
The variable could assume any value b/w the lowest and highest points on the scale.
Note: nominal variables can never be continuous b/c they are not ordered along any continuum
Define Independent Variable
Those variables controlled by the experimenter.
i.e. forms of therapy, placement of stimulation electrodes, methods of treatment etc.
Define Dependent Variables
The variables being measured; the data or scores
(i.e. those that are not under the experimenter's control: the data)
“D” for Dependent Variable and “D” for Data
Define Random Assignment
The allocation or assignment of participants to groups by a random process.
Define Random Sampling
Each person of a population has an equal opportunity to be in the study.
(i.e. names drawn from a hat)
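Random sampling and random assignment are easy to confuse, so here is a short Python sketch of both using the standard `random` module. The population of 100 labelled people, the sample size of 12, and the two group names are hypothetical values chosen only for illustration.

```python
import random

rng = random.Random(2024)  # fixed seed only so the sketch is repeatable

# Random sampling: decides WHO is in the study.
# Every member of the population has an equal chance of being drawn.
population = [f"person_{i}" for i in range(1, 101)]  # hypothetical population
sample = rng.sample(population, k=12)

# Random assignment: decides WHICH GROUP each participant joins.
shuffled = sample[:]
rng.shuffle(shuffled)
groups = {
    "treatment": shuffled[0::2],  # every other participant, starting at the first
    "control": shuffled[1::2],    # the remaining participants
}

print(groups)
```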
Define Constant
A number that does not change in value in a given situation
Define Frequency Distribution
A distribution in which the values of a dependent variable are tabled or plotted against their frequency of occurrence.
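As a rough illustration, a frequency distribution can be tabled in Python with `collections.Counter`. The scores below are made up for the example.

```python
from collections import Counter

scores = [3, 5, 5, 7, 7, 7, 8, 8, 10]  # hypothetical scores

# Count how often each value occurs.
freq = Counter(scores)

# Table each value against its frequency, in order of value.
print("score  frequency")
for value in sorted(freq):
    print(f"{value:>5}  {freq[value]}")
```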
Define Real Lower Limit
The point halfway between the bottom of one interval and the top of the one below it.
Define Real Upper Limit
The point halfway between the top of one interval and the bottom of the one above it.
Define Midpoint
Center of the interval; average of upper and lower limits
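A small Python sketch of the three cards above, assuming scores are recorded in whole units so that adjacent intervals (e.g. 5-9, 10-14, 15-19) are one unit apart. The function name `real_limits` and the `unit` parameter are illustrative assumptions.

```python
def real_limits(lower, upper, unit=1):
    """Return (real lower limit, real upper limit, midpoint) of an interval.

    `lower` and `upper` are the stated limits; `unit` is the gap between
    the top of one interval and the bottom of the next.
    """
    real_lower = lower - unit / 2   # halfway to the top of the interval below
    real_upper = upper + unit / 2   # halfway to the bottom of the interval above
    midpoint = (real_lower + real_upper) / 2
    return real_lower, real_upper, midpoint

limits = real_limits(10, 14)
print(limits)  # (9.5, 14.5, 12.0)
```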
Define Histogram
Graph in which a rectangle is used to represent the frequency of observations within each interval.
Define Goldilocks Principle
Optimal number of intervals to use when grouping data (neither too many nor too few).
Define Stem and Leaf Displays
Graphical display presenting the original data arranged into a histogram.
Define Exploratory Data Analysis ( EDA)
A set of techniques developed by Tukey for presenting data in visually meaningful ways.
Define Leading Digits (most significant digits)
LEFT most digits of a number
Define Stem
Vertical axis of display containing the leading digits
Define Trailing Digits ( less significant digits)
Digits to the RIGHT of the leading digits
Define Leaves
Horizontal axis of display containing the trailing digits
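The stem, leading digit, trailing digit, and leaf cards can be tied together with a minimal Python sketch. It assumes non-negative two-digit integers, so the tens digit is the stem and the units digit is the leaf; the data are invented for the example, and stems with no leaves are simply skipped.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (leading digit) to its sorted leaves (trailing digits)."""
    display = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # tens digit, units digit
        display[stem].append(leaf)
    return dict(display)

data = [12, 15, 21, 23, 23, 27, 30, 34, 41]  # hypothetical scores
display = stem_and_leaf(data)

# Stems run down the vertical axis; leaves run along the horizontal axis.
for stem in sorted(display):
    leaves = "".join(str(leaf) for leaf in display[stem])
    print(f"{stem} | {leaves}")
```

Turned on its side, the rows of leaves form the bars of a histogram while still showing every original score.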
Define Bar Graphs
A graph in which frequency of occurrence of different values of X is represented by height of a bar
Define Line Graph
A graph in which the Y values corresponding to different values of X are connected by a line.
What are the steps involved in creating a graph?
- Decide what is plotted on each axis
- Identify independent/dependent variable
- Look for patterns in data
- In histograms: look at the shape of the distribution, hoping it is highest towards the center.
- In bar & line graphs: look for differences b/w groups and/or trends in the data.
What are the different ways to name Vertical axis?
- Vertical Axis
- Y Axis
- Ordinate
What are the different ways to name Horizontal axis?
- Horizontal Axis
- X Axis
- Abscissa
What are the guidelines for Plotting Data?
- Supply the title
- Label the axes
- Try and start both X & Y Axis at 0
- DO NOT use pie charts
- Try not to plot more than 2 dimensions
- Do not add non essential material
Define Symmetric
Having the same shape on both sides of the center
Define Bimodal
A distribution that has 2 distinct peaks
Define Unimodal
A distribution that has 1 distinct peak
Define Modality
The number of meaningful peaks in a frequency distribution of the data.
Define Negatively Skewed
A distribution that trails off to the LEFT
Define Positively Skewed
A distribution that trails off to the RIGHT
Define Skewness
A measure of the degree to which a distribution is asymmetrical
Define Central Tendency
Measures that relate to the center of distribution of scores. The most common measures are the Mean (average), Median (middle score) and Mode ( most common score)
Define Mode (MO)
The most common score ( least useful).
i.e. the score obtained from the largest # of participants.
- Therefore the mode is the value of X, the dependent variable, that corresponds to the highest point on the distribution.
Define Median (Mdn)
The middle score in the ordered set or data ( aka the 50th percentile)
e.g., in (3, 5, (7), 8, 15) the number 7 is the middle score, so 7 is the median.
* If there is an even # of scores, e.g. (3, 5, (7, 11), 14, 15), there is no single middle score, so the median falls halfway between the two middle scores, 7 and 11: (7 + 11) / 2 = 9 (the middle of 7, 8, (9), 10, 11).
Define Median Location
The location of median in an ordered series.
Median Location = ( N + 1 ) / 2
So, for the #'s (3, 5, 7, 8, 15) the median location = (5 + 1) / 2 = 3, which simply means that the median is the third # in the ordered series
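A short Python sketch of finding the median from the median location, (N + 1) / 2. The function name `median_from_location` is an illustrative assumption. When N is even the location ends in .5, and the sketch averages the two scores on either side of it.

```python
def median_from_location(scores):
    """Find the median using the median location (N + 1) / 2."""
    ordered = sorted(scores)
    location = (len(ordered) + 1) / 2      # 1-based position in the ordered series
    if location.is_integer():
        return ordered[int(location) - 1]  # odd N: a single middle score
    below = ordered[int(location) - 1]     # even N: score just below the location
    above = ordered[int(location)]         # ... and the score just above it
    return (below + above) / 2

print(median_from_location([3, 5, 7, 8, 15]))        # location 3   -> 7
print(median_from_location([3, 5, 7, 11, 14, 15]))   # location 3.5 -> 9.0
```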
Define Mean
The sum of all scores divided by the # of scores
(“ the average” )
The mean is the most common measure of central tendency
Define Trimmed Mean
The mean that results from trimming away (or discarding) a fixed percentage of extreme observations.
- To calculate a trimmed mean you take 1 or more of the largest and smallest values in the sample, set them aside and take the mean of what remains.
e.g., for a 10% trimmed mean we would set aside the largest 10% of the observations & the smallest 10% of the observations. The mean of what remained would be the 10% trimmed mean.
( always discard the same percentage of scores from each end of the distribution)
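The trimming procedure above can be sketched in Python as follows. The function name, the integer `percent` argument, and the ten example scores are assumptions made for illustration. When the percentage does not give a whole number of scores, this sketch rounds down, which is one possible convention.

```python
def trimmed_mean(scores, percent):
    """Mean after discarding `percent`% of scores from EACH end."""
    ordered = sorted(scores)
    k = len(ordered) * percent // 100   # number of scores to drop per end
    kept = ordered[k:len(ordered) - k]
    if not kept:
        raise ValueError("trimming removed every score")
    return sum(kept) / len(kept)

scores = [1, 12, 13, 14, 15, 16, 17, 18, 19, 100]  # hypothetical, two extreme scores

ordinary_mean = sum(scores) / len(scores)
ten_percent_trimmed = trimmed_mean(scores, 10)      # drops the 1 and the 100

print(ordinary_mean)        # 22.5
print(ten_percent_trimmed)  # 15.5
```

The trimmed mean sits much closer to the bulk of the scores because the two extreme values no longer pull on it.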
Define Dispersion (variability)
The degree to which individual data points are distributed around the mean.
Define Range
The distance from the lowest to the highest score.
Define Outlier
An extreme point that stands out from the rest of the distribution.
Define Interquartile Range
The range of the middle 50% of the observations
- Represents an attempt to circumvent the problem of the range being heavily dependent on extreme scores.
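A rough Python comparison of the range and the interquartile range, using made-up scores. Quartiles can be computed under several slightly different conventions; the `method="inclusive"` option of `statistics.quantiles` (Python 3.8+) is just one choice and may not match a given textbook exactly.

```python
import statistics

def score_range(scores):
    """Distance from the lowest to the highest score."""
    return max(scores) - min(scores)

def interquartile_range(scores):
    """Range of the middle 50% of the observations (Q3 - Q1)."""
    q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return q3 - q1

typical = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]   # same data, one extreme score

print(score_range(typical), score_range(with_outlier))
print(interquartile_range(typical), interquartile_range(with_outlier))
```

The single extreme score changes the range from 8 to 89 but leaves the interquartile range untouched.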
Define Trimmed Samples
Samples with a percentage of the extreme scores removed.
Define Trimmed Statistics
Statistics calculated on trimmed samples
Define Sample Variance
Sum of the squared deviations about the mean, divided by N - 1
Define Population Variance
Variance of a population; usually estimated, rarely computed.
Define Standard Deviation
The square root of the variance.
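A minimal Python sketch of the sample variance and standard deviation as defined on the cards above, applied to the scores (3, 5, 7, 8, 15) used earlier for the median. The function names are illustrative; the standard library's `statistics.variance` also divides by N - 1 and is used here only as a cross-check.

```python
import math
import statistics

def sample_variance(scores):
    """Sum of squared deviations about the mean, divided by N - 1."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / (n - 1)

def sample_sd(scores):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(sample_variance(scores))

scores = [3, 5, 7, 8, 15]
variance = sample_variance(scores)
sd = sample_sd(scores)

print(variance, sd)
print(statistics.variance(scores))  # cross-check from the standard library
```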
Define Bias
A property of a statistic whose long-range average is not equal to the parameter it estimates.
Define Degrees of freedom (df)
The number of independent pieces of information remaining after estimating one or more parameters
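The bias and degrees-of-freedom cards explain why the sample variance divides by N - 1. A small exact demonstration in Python, under assumptions chosen only for illustration: a tiny population of (2, 4, 6) and every possible ordered sample of size 2 drawn with replacement. `Fraction` is used so the averages are exact rather than rounded.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6]   # hypothetical population
mu = Fraction(sum(population), len(population))
population_variance = sum((x - mu) ** 2 for x in population) / len(population)

def variance(sample, divisor):
    """Sum of squared deviations about the sample mean, over `divisor`."""
    m = Fraction(sum(sample), len(sample))
    return sum((x - m) ** 2 for x in sample) / divisor

n = 2
samples = list(product(population, repeat=n))  # all 9 equally likely samples

# Long-range average of each estimator across every possible sample.
avg_dividing_by_n = sum(variance(s, n) for s in samples) / len(samples)
avg_dividing_by_n_minus_1 = sum(variance(s, n - 1) for s in samples) / len(samples)

print(population_variance)          # 8/3
print(avg_dividing_by_n)            # 4/3, too small on average (biased)
print(avg_dividing_by_n_minus_1)    # 8/3, matches the parameter (unbiased)
```

Estimating the mean from the sample uses up one piece of information, leaving N - 1 degrees of freedom, and dividing by that number removes the bias.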
Define Box Plot (aka box and whisker plot)
A graphical representation of the dispersion of a sample
Define Quartile Location
The location of the quartile in an ordered series
Define Whisker
Line from the top and bottom of the box to the farthest point that is no more than 1.5 times the interquartile range from the box.
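The whisker rule can be sketched in Python as below. The function name, the dictionary keys, and the data are assumptions for illustration, and the quartiles come from `statistics.quantiles` with `method="inclusive"`, which is only one of several quartile conventions.

```python
import statistics

def box_plot_summary(scores):
    """Quartiles, whisker ends, and outliers using the 1.5 x IQR rule."""
    ordered = sorted(scores)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr    # farthest a whisker may reach below the box
    high_fence = q3 + 1.5 * iqr   # farthest a whisker may reach above the box
    inside = [x for x in ordered if low_fence <= x <= high_fence]
    outliers = [x for x in ordered if x < low_fence or x > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],     # farthest actual point within the fence
        "whisker_high": inside[-1],
        "outliers": outliers,
    }

summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 90])
print(summary)
```

Each whisker stops at an actual data point, not at the fence itself, and any point beyond a fence is plotted separately as an outlier.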
Define Winsorized Variance
The variance of a Winsorized sample
Define Winsorized standard deviation
The standard deviation of a Winsorized sample.
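The cards do not spell out how a Winsorized sample is formed. The usual approach, assumed here, is to replace the scores that would have been trimmed from each end with the nearest remaining score, so the sample size stays the same. The function name and example data are illustrative.

```python
import statistics

def winsorize(scores, percent):
    """Replace the lowest and highest `percent`% of scores with the
    nearest score that would survive trimming."""
    ordered = sorted(scores)
    k = len(ordered) * percent // 100   # scores affected at each end
    low = ordered[k]                    # smallest score that is kept
    high = ordered[-k - 1]              # largest score that is kept
    return [min(max(x, low), high) for x in ordered]

scores = [1, 12, 13, 14, 15, 16, 17, 18, 19, 100]  # hypothetical scores
winsorized = winsorize(scores, 10)

winsorized_variance = statistics.variance(winsorized)
winsorized_sd = statistics.stdev(winsorized)

print(winsorized)   # [12, 12, 13, 14, 15, 16, 17, 18, 19, 19]
print(statistics.variance(scores), winsorized_variance)
```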
List the 4 different types of Data
- Qualitative (categorical)
- Quantitative (measurement)
- Descriptive (describes data)
- Inferential (use stats, which measures a sample, to infer values of parameters, which are measures of a population)