Chapter 1 - Describing data: graphical Flashcards
Population
The complete set of all items of interest in a study
Sample
A specific part of the population
Simple random sampling
A manner of selecting a sample of objects out of a population
A procedure in which each member of the population is chosen strictly by chance
The selection of one member does not have an influence on the probability of another member in the population being chosen
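A minimal sketch of simple random sampling, assuming a hypothetical population of 100 ID numbers and sampling without replacement, so that every possible sample of the chosen size is equally likely. The function name and the seed parameter are illustrative choices, not part of the flashcards.

```python
import random

# Hypothetical population: ID numbers 1..100 (illustrative only).
population = list(range(1, 101))

def simple_random_sample(items, n, seed=None):
    """Draw n distinct items purely by chance (sampling without replacement)."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(items, n)

sample = simple_random_sample(population, 10, seed=42)
print(sample)
```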
Systematic sampling
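Involves selecting every j-th item from a list of the population, starting from a randomly chosen item among the first j, where j is the population size divided by the desired sample size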
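A short sketch of systematic sampling, assuming the common textbook convention of taking every j-th item from the list after a random start, with j equal to the population size divided by the sample size. The population and the function name are hypothetical.

```python
import random

def systematic_sample(items, n, seed=None):
    """Take every j-th item from a list, starting at a random position among the first j."""
    if n <= 0 or n > len(items):
        raise ValueError("n must be between 1 and the population size")
    j = len(items) // n            # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(j)       # random index among the first j items
    return items[start::j][:n]

# Hypothetical population list of 100 ID numbers.
population = list(range(1, 101))
sample = systematic_sample(population, 10, seed=1)
print(sample)
```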
Parameter
A numerical measure that describes a specific characteristic of a population
Statistic
A numerical value that describes a specific characteristic of a sample
Sampling error
Often results from the fact that only information about a part of the entire population is available
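To make the parameter, statistic, and sampling error cards concrete, here is a hedged sketch with made-up data: the population mean plays the role of the parameter, the sample mean is the statistic, and the gap between them is one way to see sampling error. All values are simulated and purely illustrative.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population of 1000 ages (simulated, not real data).
population = [rng.randint(20, 60) for _ in range(1000)]
parameter = statistics.mean(population)    # describes the population

sample = rng.sample(population, 50)        # only part of the population is observed
statistic = statistics.mean(sample)        # describes the sample

sampling_error = statistic - parameter     # gap caused by seeing only a sample
print(parameter, statistic, sampling_error)
```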
Nonsampling errors
A type of error that can occur in any survey, whether it covers a sample or the whole population
Unrelated to the kind of sampling procedure used
Examples of nonsampling errors include:
The population sampled is not the relevant one
Survey subjects may give inaccurate or dishonest answers
There may be no response to survey questions
Descriptive statistics
Focuses on graphical and numerical procedures that are used to summarise and process data
Describes the overall data
Inferential statistics
Focuses on using the data to make predictions, forecasts, and estimates that support better decisions
Categorical variables
Produce responses that belong to groups and categories
Numerical variables
Can be split up into discrete and continuous variables
Discrete variables
Can take on only a finite or countable number of values
Continuous variables
Can take on any value within a given range of real numbers and usually arise from a measurement
Qualitative data
There is no measurable meaning to the ‘difference’ in numbers
Quantitative data
There is a measurable meaning to the difference in numbers
Nominal scale
A scale used for labelling variables into distinct groups
Ordinal scale
A measurement scale that shows the order of the values but not the size of the differences between them
Interval scale
A numerical scale where the order of the variables as well as the difference between these variables is known
Ratio scale
Almost the same as the interval scale
However, it also has a true zero point, so ratios between values are meaningful
Frequency distribution
A table used to organise data
The left column lists all possible responses on the variable being studied
The right column lists the frequencies, or number of observations, for each class
Relative frequency distribution
Obtained by dividing each frequency by the total number of observations and multiplying the resulting value by 100%
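A small sketch of a frequency distribution and its relative frequency distribution, using hypothetical survey responses. Each relative frequency is the count divided by the number of observations, times 100.

```python
from collections import Counter

# Hypothetical survey responses (illustrative data only).
responses = ["yes", "no", "yes", "unsure", "yes", "no", "yes", "no", "yes", "unsure"]

frequencies = Counter(responses)           # response -> number of observations
n = len(responses)
relative = {resp: 100 * count / n for resp, count in frequencies.items()}

# Left column: response; right columns: frequency and relative frequency.
for resp, count in frequencies.most_common():
    print(f"{resp:<8}{count:>4}{relative[resp]:>8.1f}%")
```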
Cross table
Lists the number of observations for every combination of values for two categorical or ordinal variables
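A cross table can be sketched by counting each combination of two categorical values. The regions and products below are hypothetical labels chosen only for illustration.

```python
from collections import Counter

# Hypothetical observations: (region, preferred product).
observations = [
    ("north", "A"), ("north", "B"), ("south", "A"),
    ("south", "A"), ("north", "A"), ("south", "B"),
]

counts = Counter(observations)                     # (region, product) -> count
rows = sorted({region for region, _ in observations})
cols = sorted({product for _, product in observations})

# Nested dict: one row per region, one column per product.
cross_table = {r: {c: counts[(r, c)] for c in cols} for r in rows}

for r in rows:
    print(r, cross_table[r])
```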
Pie chart
Used to draw attention to the proportion of frequencies in each category
Pareto diagram
A bar chart that displays the frequency of defect causes
The bar on the left indicates the most frequent cause
The bars to the right indicate causes with decreasing frequencies
Why is the Pareto diagram used?
To separate the vital few from the trivial many
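The ordering behind a Pareto diagram can be sketched without any plotting: sort the defect causes from most to least frequent and track the running share of all defects. The defect log below is hypothetical.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical defect log (illustrative only).
defects = ["scratch", "dent", "scratch", "misalignment",
           "scratch", "dent", "scratch", "stain"]

ordered = Counter(defects).most_common()           # most frequent cause first
total = len(defects)
cumulative = list(accumulate(count for _, count in ordered))

# Each row is one bar, left to right, with the cumulative percentage so far.
for (cause, count), cum in zip(ordered, cumulative):
    print(f"{cause:<14}{count:>3}{100 * cum / total:>8.1f}%")
```

The first row or two usually account for most of the total, which is the "vital few" idea.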
What are the 3 rules when constructing a frequency distribution?
Determine k, the number of classes. This is decided in a somewhat arbitrary manner
Classes should be the same width, w. The width, w, is determined from the range of the data and the number of classes
Classes must be inclusive and nonoverlapping
How to calculate the class width
(Largest observation - smallest observation)/number of classes
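A worked sketch of the class width formula with hypothetical data and k = 5 classes. Rounding the width up to a whole number and starting the first class at the smallest observation are assumptions made here so that the k classes cover every value.

```python
import math

# Hypothetical observations (illustrative only).
data = [12, 15, 21, 27, 33, 38, 41, 46, 52, 59]
k = 5                                              # number of classes, chosen by hand

# (largest observation - smallest observation) / number of classes, rounded up
w = math.ceil((max(data) - min(data)) / k)

lower = min(data)
classes = [(lower + i * w, lower + (i + 1) * w) for i in range(k)]

# Each observation falls in exactly one class (inclusive and nonoverlapping).
frequencies = [0] * k
for x in data:
    i = min((x - lower) // w, k - 1)               # keep an edge maximum in the last class
    frequencies[i] += 1

for (lo, hi), f in zip(classes, frequencies):
    print(f"{lo} to under {hi}: {f}")
```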
Cumulative frequency distribution
Contains the total number of observations whose values are less than the upper limit for each class
We construct a cumulative frequency distribution by adding the frequencies of all frequency distribution classes up to and including the present class
Relative cumulative frequency distribution
Cumulative frequencies expressed as cumulative proportions or percentages
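Cumulative and relative cumulative frequencies can be sketched as running totals. The class limits and frequencies below are hypothetical.

```python
from itertools import accumulate

# Hypothetical classes (upper limits) and their frequencies.
class_upper_limits = [20, 30, 40, 50, 60]
frequencies = [3, 1, 3, 1, 2]

# Add the frequencies of all classes up to and including the present class.
cumulative = list(accumulate(frequencies))

n = cumulative[-1]                                 # total number of observations
cumulative_percent = [100 * c / n for c in cumulative]

for limit, c, pct in zip(class_upper_limits, cumulative, cumulative_percent):
    print(f"less than {limit}: {c} observations ({pct:.0f}%)")
```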
Histogram
A graph that consists of vertical bars constructed on a horizontal line that is marked off with intervals for the variable being displayed
Ogive
A line that connects points that are the cumulative percent of observations below the upper limit of each interval in a cumulative frequency distribution
Stem and leaf display
Alternative to the histogram
Data are grouped according to their leading digits and the final digits are listed separately for each member of a class
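A minimal text sketch of a stem and leaf display, assuming two-digit hypothetical data where the tens digit is the stem and the units digit is the leaf.

```python
from collections import defaultdict

# Hypothetical two-digit observations (illustrative only).
data = [12, 15, 21, 27, 33, 38, 41, 46, 52, 59, 23, 35]

stems = defaultdict(list)
for x in sorted(data):
    stem, leaf = divmod(x, 10)     # leading digit and final digit
    stems[stem].append(leaf)

lines = []
for stem, leaves in sorted(stems.items()):
    leaves_text = " ".join(str(leaf) for leaf in leaves)
    lines.append(f"{stem} | {leaves_text}")

print("\n".join(lines))
```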
Scatter plot
Can be prepared by locating one point for each pair of two variables that represent an observation in the data set
What does a scatter plot show?
The range of each variable
The pattern of values over the range
A suggestion as to a possible relationship between the two variables
An indication of outliers