UNIT 1, ALL LESSONS VOCAB / KEY CONCEPTS Flashcards

memorize & understand key concepts and vocab

1
Q

descriptive statistics

A

summarizing and describing features of a data set without making any generalizations/conclusions about a population
› states facts and proven outcomes we already know

1
Q

inferential statistics

A

uses data from a sample to draw conclusions and make predictions about a larger population
› analyzes data to make predictions about what we don’t yet know

2
Q

phases of a statistical study

A

› phase 1, data gathering: any process that gets you data (surveys, questionnaires, counting, etc.)
› phase 2, data organization and analysis: includes making graphs, charts, and tables from data, and can also include calculating statistics and looking for patterns in the data
› phase 3, probability-based inference: the process of using data to make conclusions about a population based on a sample of that population

2
Q

statistical inference

A

process of using data to make conclusions about a population based on a sample

2
Q

statistic

A

number that describes a sample

3
Q

probability

A

mathematical concept that measures the likelihood of an event occurring

4
Q

data

A

a bunch of facts collected together for reference and analysis

4
Q

parameter

A

a number that describes a population

5
Q

data set

A

set or collection of data

6
Q

datum

A

a single piece of information (one fact); the singular form of data

7
Q

numeric/quantitative data

A

data expressed as numbers that can be sorted and used in calculations
› can be sorted and worked with mathematically
› presented in numbers
› can be discrete or continuous
- discrete data is gathered by counting and takes whole-number values (counted data is a form of discrete data: numbers of people, anything counted)
- continuous data is gathered by measuring, is always numerical, and can be fractions or decimals of any length (measured data is a form of continuous data: lengths, weights, volumes, anything measured)

7
Q

(non-numeric) categorical/qualitative data

A

data that is named or can be put into categories; it cannot be ordered numerically or used in calculations
› can be divided into groups or categories

8
Q

distribution

A

a set of numbers, shown on a graph, that gives the possible values of a variable and how often they occur
› not just any graph: it must show both the values and how often they occur

9
Q

variable

A

a characteristic, number, or quantity that can be measured or counted, and that can take on different values (ex. age, sex, income, etc.)

9
Q

univariate data

A

a set of data that focuses on only one variable (ex. salaries of employees in a company, the number of pets in different households, lengths of trout in a lake, etc.)

9
Q

unit of analysis

A

the major entity that you are analyzing in your study

10
Q

class width

A

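the difference between the upper and lower class limits of a class interval (ex. class interval 163-175, class width is 175-163=12)
› note. some textbooks instead measure width between class boundaries or between the lower limits of consecutive classes; for whole-number data, that convention gives 13 for the class 163-175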
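
The arithmetic in the example can be checked with a short sketch. It follows this card's definition (upper limit minus lower limit); `class_width` is a hypothetical helper name used only for illustration.

```python
def class_width(lower_limit, upper_limit):
    """Class width as defined on this card: upper limit minus lower limit."""
    return upper_limit - lower_limit


# Class interval 163-175 from the card's example
width = class_width(163, 175)
print(width)  # 12
```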

10
Q

constant

A

a fixed value that never changes within a given context

10
Q

bar graph

A

a visual representation of data that uses rectangular bars to show how the data is distributed across categories
› counts or percents are on the vertical axis
› categories are on the horizontal axis

10
Q

observation

A

a fact or figure we collect about a given variable

10
Q

class

A

the range of values assigned to a group of data points (ex. 0-10, 11-20, 21-30, etc.)

10
Q

class boundaries

A

the values that separate different classes (or groups) within a data set

11
Q

relative frequency

A

the ratio (fraction or proportion) of the number of times a value occurs in the data to the total number of outcomes
› calculation: divide the frequency of a specific category (the number of times that value of the variable has been observed) by the total number of observations
› can also be shown in table form: one column contains values or intervals, the other column contains frequencies
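
The calculation above can be sketched in a few lines. The data (number of pets in ten households) is made up for illustration, and the variable names are assumptions rather than anything from the course.

```python
from collections import Counter

# Hypothetical data: number of pets in ten households
pets = [0, 1, 1, 2, 0, 1, 3, 1, 2, 0]

frequency = Counter(pets)   # value -> number of times it occurs
total = len(pets)           # total number of observations

# Relative frequency = frequency of a value / total number of observations
relative_frequency = {
    value: count / total for value, count in sorted(frequency.items())
}

# Print a small table: value, frequency, relative frequency
for value, rel in relative_frequency.items():
    print(value, frequency[value], rel)
```

The relative frequencies always add up to 1, which is a quick way to check a table built by hand.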

11
Q

class interval

A

a range of values within a data set that are grouped together for analysis

11
Q

histogram

A

a graphical representation that displays the distribution of numerical data using rectangles
› height of bars (vertical axis) displays frequency: how often values in that class appear
› width of bars displays intervals
› has midpoints: the exact center value of a given bin (or class interval), calculated by adding the upper and lower boundary values of that bin and dividing the sum by two
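
A minimal sketch of how classes, midpoints, and bar heights fit together. The heights and class boundaries are invented for illustration, and a value on a boundary is assumed to belong to the class that starts there.

```python
# Hypothetical heights in cm and assumed class boundaries
heights = [151, 155, 158, 160, 162, 163, 165, 168, 171, 174, 176, 179]
boundaries = [150, 160, 170, 180]

# Pair consecutive boundaries into classes: (150, 160), (160, 170), (170, 180)
classes = list(zip(boundaries, boundaries[1:]))

# Midpoint = (lower boundary + upper boundary) / 2
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Frequency of each class = height of its bar
counts = [sum(lower <= h < upper for h in heights) for lower, upper in classes]

# Text version of the histogram, one row per class
for (lower, upper), mid, count in zip(classes, midpoints, counts):
    print(f"{lower}-{upper}  midpoint={mid}  {'#' * count}")
```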

11
Q

relative frequency histogram

A

a graph that displays the classes on the horizontal axis and the relative frequencies of the classes on the vertical axis

12
Q

frequency table

A

a list that shows how often each value occurs in a set of data

12
Q

cumulative relative frequency

A

a summary of a data set showing the relative frequency of items less than or equal to the upper class limit of each class; tells you the percentage of data that falls at or below a specific point on the data scale
› plot cumulative frequencies on the vertical axis
› place classes (class boundaries or interval midpoints) on the horizontal axis
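
A short sketch of the running-total idea, using made-up class frequencies. The variable names and numbers are assumptions for illustration only.

```python
from itertools import accumulate

# Hypothetical classes (upper class limits) and their frequencies
class_upper_limits = [160, 170, 180]
frequencies = [3, 5, 4]
total = sum(frequencies)

# Cumulative frequency: running total of the frequencies
cumulative_frequency = list(accumulate(frequencies))

# Cumulative relative frequency: running total divided by the overall total
cumulative_relative = [c / total for c in cumulative_frequency]

for limit, cum_rel in zip(class_upper_limits, cumulative_relative):
    print(f"at or below {limit}: {cum_rel:.0%}")
```

The last cumulative relative frequency is always 1 (100%), because every observation is at or below the highest class limit.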

13
Q

SUMMARY 1.2 ACTIVE RECALL

A

data presented with frequencies
› individual values often occur many times in a distribution
› classes or intervals group data
› frequency tables show frequencies for all the classes in your data
histograms from frequency data
› histograms are graphical representations of frequency data
› are made up of vertical bars
› height of bars display frequency
› width of bars display intervals
relative frequency tables
› relative frequencies are included in a summary table
› one column contains values or intervals
› one column contains frequencies
relative frequency histograms
› similar to a histogram from count data
› vertical axis represents the relative frequency of occurrence
cumulative frequency plots
› line graphs
› classes are on the horizontal axis
› cumulative frequencies are on the vertical axis

13
Q

frequency graph

A

a visual representation of data that shows how often each value (or category) appears in a dataset

14
Q

stem and leaf plot

A

a visual representation of data where each data point is divided into a “stem” (leading digit or digits) and a “leaf” (last digit)
› the plot displays the data in a way that lets you see the spread of the data, where most values cluster, and any extreme outliers
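
A minimal sketch of building a stem and leaf plot for two-digit values. The scores are invented, and splitting each value with `divmod(score, 10)` assumes the stem is the tens digit and the leaf is the ones digit.

```python
from collections import defaultdict

# Hypothetical two-digit test scores
scores = [47, 52, 55, 58, 61, 63, 63, 67, 70, 74, 89]

stems = defaultdict(list)
for score in sorted(scores):
    stem, leaf = divmod(score, 10)  # tens digit is the stem, ones digit is the leaf
    stems[stem].append(leaf)

# Print every stem in range, including empty ones, so gaps stay visible
for stem in range(min(stems), max(stems) + 1):
    leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
    print(f"{stem} | {leaves}")
```

In the printed plot, the row for stem 6 is the longest (a cluster), and 89 sits alone on the last row.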

14
Q

spread

A

how much variation or dispersion exists within a data set; describes how scattered the data points are within a dataset

15
Q

distribution

A

the way data is spread or organized; describes how frequently each value occurs and provides insights into patterns and characteristics of the dataset
› is symmetric when it can be cut in half by a vertical line and the halves are exact mirror images of each other
› may be described as approx. symmetric if the halves are close but not exact
› is called nonsymmetric (asymmetric) otherwise
› is uniform when all values have roughly the same number of observations
› can have one or several modes: unimodal, bimodal, or multimodal
- a unimodal distribution has one peak, a bimodal distribution has two peaks, and a multimodal distribution has three or more peaks
› if mound-shaped with a long tail of observations that trails out in one direction, it is skewed (tail to the left: left skewed; tail to the right: right skewed)
› can have gaps, clusters, and outliers
- gaps occur where there is a sizable range of values with no observations
- clusters are groups of observations at similar values
- outliers are isolated observations that lie far from the bulk of the distribution, separated from it by a wide gap

16
Q

measures of center

A

mean, median, mode
› mean, the sum of all values in a data set divided by the total number of values (the average)
› median, the middle value in a data set when arranged in ascending order (with an even number of values, the average of the two middle values)
› mode, the value that appears the most frequently in a data set
› important: when data is skewed, the median is often considered a better indicator of the center than the mean, as outliers can significantly impact the mean
- comparing the mean and the median can reveal information about the distribution of the data (symmetrical vs. skewed)
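
To see how an outlier pulls the mean but not the median, here is a small sketch using Python's standard `statistics` module. The salary figures are made up for illustration.

```python
import statistics

# Hypothetical salaries in thousands; 200 is an outlier
salaries = [30, 32, 32, 35, 38, 40, 200]

mean = statistics.mean(salaries)      # sum of values / number of values
median = statistics.median(salaries)  # middle value of the sorted data
mode = statistics.mode(salaries)      # most frequent value

print("mean:", round(mean, 1))
print("median:", median)
print("mode:", mode)
```

Here the mean lands well above the median because of the single large value, which is the pattern expected for right-skewed data.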

17
Q

measures of variation

A

standard deviation & interquartile range
› SD: measures variation as the typical distance of the values in a data set from the mean (roughly the average amount of variation or dispersion in the data)
› IQR: describes the middle 50% of values when ordered from lowest to greatest; the difference between the third quartile (Q3) and the first quartile (Q1)
- formula: IQR = Q3 - Q1
- a larger IQR indicates a wider spread in the middle portion of the data; a smaller IQR indicates a tighter distribution
- used in box plots
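
A minimal sketch of both measures on made-up data. Quartile conventions differ between textbooks and software; this version assumes the "median of each half" convention (leaving out the overall median when the count is odd), which may not match the one used in the course.

```python
import statistics

# Hypothetical data, sorted from lowest to greatest
data = sorted([2, 4, 4, 5, 7, 8, 9, 10, 12])

# Split into lower and upper halves (the middle value is left out when n is odd)
half = len(data) // 2
lower_half = data[:half]
upper_half = data[-half:]

q1 = statistics.median(lower_half)  # first quartile
q3 = statistics.median(upper_half)  # third quartile
iqr = q3 - q1                       # IQR = Q3 - Q1

sample_sd = statistics.stdev(data)  # sample standard deviation

print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
print("SD:", round(sample_sd, 2))
```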

18
Q

average

A

the mean of the data set

18
Q

degrees of freedom

A

the number of values in a statistical analysis that are free to vary without breaking any constraints
› for a sample standard deviation or variance, calculated as sample size - 1 (n - 1)
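
One place the "sample size - 1" rule shows up is the sample variance, where the sum of squared deviations is divided by the degrees of freedom rather than by the sample size. The sketch below uses made-up numbers and compares the hand calculation with `statistics.variance`, which uses the same divisor.

```python
import statistics

# Hypothetical sample
sample = [4, 8, 6, 2]
n = len(sample)
mean = statistics.mean(sample)

# Sum of squared deviations from the sample mean
squared_deviations = sum((x - mean) ** 2 for x in sample)

# Divide by the degrees of freedom (n - 1), not by n
degrees_of_freedom = n - 1
sample_variance = squared_deviations / degrees_of_freedom

print(degrees_of_freedom)
print(sample_variance, statistics.variance(sample))
```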

19
Q

population mean

A

average value of a variable for the entire population
› symbol, µ (mu)
› note. while the population mean is the average for the whole population, the sample mean is the average calculated from a smaller sample taken from that population
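
The difference between the two means can be sketched with a made-up population. The population (the whole numbers 1 to 100), the sample size of 10, and the fixed seed are all assumptions chosen for illustration.

```python
import random
import statistics

# Hypothetical population of 100 values
population = list(range(1, 101))
population_mean = statistics.mean(population)  # parameter (mu)

# Draw a simple random sample of 10 values; the seed makes the sketch repeatable
rng = random.Random(0)
sample = rng.sample(population, 10)
sample_mean = statistics.mean(sample)          # statistic

print("population mean:", population_mean)
print("sample mean:", sample_mean)
```

The sample mean usually differs somewhat from the population mean, and a different sample would give a different value.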

19
Q

population

A

the entire group of individuals that a researcher wants to study and draw conclusions about

20
Q

population standard deviation

A

measure of how spread out the data points are from the mean within an entire population
› average distance of data points from the population mean
› symbol, σ
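
For reference, the usual formulas behind µ and σ can be written as below. The symbols N (population size) and x_i (the i-th value in the population) are notation assumed here, not taken from the cards. Strictly speaking, σ is the square root of the average squared distance from µ, which is why "average distance" is best read as an approximation.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left(x_i - \mu\right)^2}
```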

20
Q

population size

A

the total number of individuals within a specific group that a researcher is interested in studying
› is considered a parameter, and describes a characteristic of the entire population not just a sample
› note. while population size refers to the total number of individuals in the entire group, sample size refers to the number of individuals selected from that population to collect data
