Intro Flashcards

1
Q

What infographic do you never use?

A

Pie charts

2
Q

Statistics

A

Statistics is the branch of mathematics that examines ways to process and analyse data. Statistics provides procedures to collect and transform data in ways that are useful to business decision makers. To understand anything about statistics, you first need to understand the meaning of a variable.

3
Q

4 fundamental terms of statistics

A

Population
Sample
Parameter
Statistic

4
Q

Population

A

A population consists of all the members of a group about which you want to draw a conclusion.

5
Q

Sample

A

A sample is the portion of the population selected for analysis.

6
Q

Parameter

A

A parameter is a numerical measure that describes a characteristic of a population (a measure used to describe a population). Greek letters refer to parameters.

7
Q

Statistic

A

A statistic is a numerical measure that describes a characteristic of a sample (a measure calculated from sample data). Roman letters refer to statistics.

8
Q

2 types of statistics

A

Descriptive statistics

Inferential statistics

9
Q

Descriptive statistics

A

Collecting, summarising and presenting data

10
Q

Inferential statistics

A

Drawing conclusions about a population based on sample data/results (i.e. estimating a parameter based on a statistic)

11
Q

3 steps of descriptive statistics

A

Collect data
Present data
Characterise data

12
Q

Collect data example

A

Survey

13
Q

Present data example

A

Tables and graphs

14
Q

Characterise data example

A

Sample mean

15
Q

Steps of inferential statistics

A

Estimation

Hypothesis Testing

16
Q

Estimation example

A

Estimate the population mean weight (parameter) using the sample mean weight (statistic)
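As a minimal sketch of this idea, the Python snippet below treats a small made-up list of weights as the whole population, draws a random sample, and uses the sample mean (statistic) to estimate the population mean (parameter). The weights, the sample size and the seed are assumptions chosen only for illustration.

```python
import random
import statistics

# Hypothetical population: weights in kilos for every member of a small group.
population = [62, 75, 81, 58, 90, 77, 68, 84, 71, 66]

# Parameter: the population mean (usually unknown in practice).
population_mean = statistics.mean(population)

# Draw a simple random sample of 4 members (seed fixed for repeatability).
rng = random.Random(1)
sample = rng.sample(population, 4)

# Statistic: the sample mean, used as an estimate of the parameter.
sample_mean = statistics.mean(sample)

print("population mean (parameter):", population_mean)
print("sample mean (statistic):", sample_mean)
```

In practice the population mean cannot be computed directly, which is why the sample mean is used as its estimate.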

17
Q

Hypothesis testing example

A

Test the claim that the population mean weight is 100 kilos

18
Q

4 important sources when collecting data

A

Data distributed by organisation or individual

Designed experiment

Survey

Observational study

19
Q

2 classifications of data sources

A

Primary

Secondary

20
Q

2 types of data

A

Categorical (defined categories)

Numerical (quantitative)

21
Q

2 types of numerical variables

A

Discrete (counted items)

Continuous (measured characteristics)

22
Q

Categorical data

A

Simply classifies data into categories (e.g. marital status, hair colour, gender)

23
Q

Numerical discrete data e.g.

A

Counted items: a finite or countable number of possible values (e.g. number of children, number of people who have type-O blood)

24
Q

Numerical continuous data e.g.

A

Measured characteristics: an infinite number of possible values within a range (e.g. weight, height)

25
4 levels of measurement (measurement scales) from highest to lowest
Ratio data, interval data, ordinal data, nominal data
26
Ratio data
Differences between measurements are meaningful and a true zero exists
27
Interval data
Differences between measurements are meaningful but no true zero exists
28
Ordinal data
Ordered categories (rankings, order or scaling)
29
Nominal data
Categories (no ordering or direction)
30
Ratio data e.g.
Height, weight, age, weekly food spending
31
Interval data e.g.
Temperature in degrees Celsius, standardised exam score
32
Ordinal data e.g.
Rankings in a tennis tournament, student letter grades, Likert scales
33
Nominal data e.g.
Marital status, type of car owned, gender, hair colour
34
What data is charted, and how is this done?
Categorical data, through the use of summary tables
35
What data is graphed, and how is this done?
Categorical data, through the use of bar charts and pie charts (numerical data is graphed with histograms, polygons and ogives)
36
Ordered array
A sequence of data in rank order. Shows range, min to max. Provides some signals about variability within the range and may help identify outliers. If the data set is large or if the data is highly variable the ordered array is less useful.
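A minimal Python sketch of an ordered array, assuming a small made-up list of ages: sorting the raw data gives the ordered array, from which the minimum, maximum and range can be read off directly.

```python
# Hypothetical raw data: ages of 10 survey respondents.
ages = [34, 21, 45, 29, 21, 52, 38, 27, 41, 30]

# The ordered array is the same data sorted from smallest to largest.
ordered = sorted(ages)

minimum = ordered[0]            # first value in the ordered array
maximum = ordered[-1]           # last value in the ordered array
data_range = maximum - minimum  # range = maximum - minimum

print(ordered)
print("min:", minimum, "max:", maximum, "range:", data_range)
```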
37
Frequency distribution
A frequency distribution is a summary table in which data are arranged into numerically ordered classes or intervals. The number of observations in each ordered class or interval becomes the corresponding frequency of that class or interval.
38
Why use a frequency distribution?
It is a way to summarise numerical data. It condenses the raw data into a more useful form. It allows for a quick visual interpretation of the data and first inspection of the shape of the data.
39
Frequency distribution rules
Classes must be mutually exclusive (no class overlaps another) and collectively exhaustive, so each data value belongs to exactly one class. Each class grouping has the same width. Usually use at least 5 but no more than 15 groupings. Round up the interval width to get desirable endpoints.
40
How is the width of the interval determined in a frequency distribution?
Interval width = range / number of desired class groupings
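The calculation can be sketched in Python as follows. The data values, the choice of 5 classes and the rule of rounding the width up to a multiple of 5 are all assumptions made for illustration; any convenient rounding that gives desirable endpoints would do.

```python
import math

# Illustrative data: 20 made-up whole-number measurements.
data = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
        32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

desired_classes = 5
data_range = max(data) - min(data)        # range = maximum - minimum
raw_width = data_range / desired_classes  # range / number of classes

# Assumption: round the width up to the next multiple of 5 for tidy endpoints.
width = math.ceil(raw_width / 5) * 5

# Start the first class at a multiple of the width at or below the minimum.
lower = (min(data) // width) * width

# Count the values falling in each class [start, start + width).
frequencies = {}
start = lower
while start <= max(data):
    count = sum(1 for x in data if start <= x < start + width)
    frequencies[(start, start + width)] = count
    start += width

for (low, high), count in frequencies.items():
    print(f"{low} to under {high}: {count}")
```

Using half-open classes (lower boundary included, upper boundary excluded) keeps the classes mutually exclusive, so every value is counted exactly once.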
41
Histogram
A graph of the data in a frequency distribution is called a histogram. The class boundaries (or class midpoints) are shown on the horizontal axis. The vertical axis is either frequency, relative frequency, or percentage. Bars of the appropriate heights are used to represent the frequencies (number of observations) within each class or the relative frequencies (percentage) of that class.
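As a rough text-only sketch, the Python snippet below draws a histogram from a made-up frequency distribution. It is rotated so that the classes run down the page and each bar's length represents the class frequency; the class boundaries and counts are assumptions for illustration.

```python
# Hypothetical frequency distribution: (lower, upper) boundaries -> frequency.
frequencies = {(10, 20): 3, (20, 30): 6, (30, 40): 5, (40, 50): 4, (50, 60): 2}

total = sum(frequencies.values())

lines = []
for (lower, upper), freq in frequencies.items():
    percentage = 100 * freq / total  # percentage of observations in this class
    bar = "#" * freq                 # bar length represents the class frequency
    lines.append(f"{lower} to under {upper} | {bar:<6} {freq} ({percentage:.0f}%)")

print("\n".join(lines))
```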
42
Important note about histograms
A histogram has no gaps between its bars, even though Excel draws gaps.
43
What allows you to compare two or more variables?
Frequency polygons and ogives
44
Scatter diagrams
Scatter diagrams are used to examine possible relationships between two numerical variables. In a scatter diagram, one variable is measured on the vertical axis (Y) and the other variable is measured on the horizontal axis (X).
45
Time series plot
A time-series plot is used to study patterns in the values of a variable over time. In a time-series plot: one variable is measured on the vertical axis and the time period is measured on the horizontal axis.
46
Stem and leaf display
A quick and simple way to see distribution details in a data set. Method: separate the sorted data series into groups (the stems) and the values within each group (the leaves).
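The method can be sketched in Python as below, assuming made-up two-digit values so that the tens digit serves as the stem and the units digit as the leaf.

```python
from collections import defaultdict

# Hypothetical two-digit data values.
data = [27, 21, 38, 24, 30, 26, 41, 24, 32, 27]

# Sort the data, then split each value into a stem (tens) and a leaf (units).
stems = defaultdict(list)
for value in sorted(data):
    stem, leaf = divmod(value, 10)
    stems[stem].append(leaf)

# Build one display row per stem, with its leaves in ascending order.
display = []
for stem, leaves in sorted(stems.items()):
    leaf_text = " ".join(str(leaf) for leaf in leaves)
    display.append(f"{stem} | {leaf_text}")

print("\n".join(display))
```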
47
Tables and charts for numerical data
Photo 1
48
Stem and leaf display example
Photos 2-5
49
Frequency distribution example
Photos 6-10
50
Histogram example
Photo 11
51
Frequency polygon example
Photo 12
52
The ogive example
Photo 13
53
Scatter diagrams example
Photo 14
54
Time series plot example
Photo 15
55
Variables
Variables are characteristics of items or individuals.
56
Data
Data are the observed values of variables.
57
Operational definition
Defines how a variable is to be measured.
58
Big Data
Large data sets characterised by their volume, velocity and variety.
59
Statistical packages
Computer programs designed to perform statistical analysis.
60
Primary sources
Provide information collected by the data analyser.
61
Secondary sources
Provide data collected by another person or organisation.
62
Focus group
An observational study. A group of people who are asked about attitudes and opinions for qualitative research.
63
Discrete variables
Can only take a finite or countable number of values.
64
Continuous variables
Can take any value between specified limits.
65
Problems for Section 1.4; Chapter 1 review problems; Problems for Section 2.1; Problems for Section 2.2; Problems for Section 2.3; Problems for Section 2.4; Problems for Section 2.5; Problems for Section 2.6; Chapter 2 review problems
Work through problems in textbook
66
Summary table
Summarises categorical or numerical data; gives the frequency, proportion or percentage of data values in each category or class.
67
Summary table examples
Photos 16-17
68
Bar chart
Graphical representation of a summary table for categorical data; the length of each bar represents the proportion, frequency or percentage of data values in a category.
69
Pie Chart
Graphical representation of a summary table for categorical data; each category is represented by a slice of a circle whose area represents the proportion or percentage share of the category relative to the total of all categories.
70
Class width (frequency distribution)
Distance between upper and lower boundaries of a class.
71
Range
Distance measure of variation; difference between maximum and minimum data values
72
Class boundaries (frequency distribution)
Upper and lower values used to define classes for numerical data.
73
Class midpoint
Centre of a class; representative value of class.
74
Relative Frequency Distributions and Percentage Distributions
A relative frequency distribution is obtained by dividing the frequency in each class by the total number of values. From this a percentage distribution can be obtained by multiplying each relative frequency by 100%.
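A minimal Python sketch of both calculations, assuming a made-up frequency distribution with five classes:

```python
# Hypothetical class frequencies.
frequencies = {
    "10 to under 20": 3,
    "20 to under 30": 6,
    "30 to under 40": 5,
    "40 to under 50": 4,
    "50 to under 60": 2,
}

total = sum(frequencies.values())  # total number of values

# Relative frequency = class frequency / total number of values.
relative = {label: freq / total for label, freq in frequencies.items()}

# Percentage = relative frequency x 100%.
percentage = {label: 100 * rel for label, rel in relative.items()}

for label in frequencies:
    print(f"{label}: {relative[label]:.2f} ({percentage[label]:.0f}%)")
```

The relative frequencies sum to 1 and the percentages sum to 100%, which is a quick check on the arithmetic.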
75
Relative frequency distribution
Summary table for numerical data which gives the relative frequency of data values in each class.
76
Percentage distribution
Summary table for numerical data which gives the percentage of data values in each class.
77
Cumulative percentage distribution
Summary table for numerical data; gives the cumulative percentage of each successive class. A cumulative percentage distribution gives the percentage of values that are less than a certain value.
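A short Python sketch, assuming a made-up frequency distribution and reporting the percentage of values below each class's upper boundary (one common way of laying out the table):

```python
from itertools import accumulate

# Hypothetical classes, identified by their upper boundaries, with frequencies.
upper_boundaries = [20, 30, 40, 50, 60]
frequencies = [3, 6, 5, 4, 2]

total = sum(frequencies)

# Running total of frequencies across successive classes.
cumulative_freq = list(accumulate(frequencies))

# Cumulative percentage = cumulative frequency / total x 100%.
cumulative_pct = [100 * cf / total for cf in cumulative_freq]

for boundary, pct in zip(upper_boundaries, cumulative_pct):
    print(f"less than {boundary}: {pct:.0f}%")
```

The last class always reaches 100%, because every value lies below the highest upper boundary.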
78
Percentage polygon
Graphical representation of a percentage distribution.
79
Cumulative percentage polygon (ogive)
Graphical representation of a cumulative percentage distribution.
80
Chartjunk
Unnecessary information and detail that reduces the clarity of a graph.