MODULE 1 - DESCRIPTIVE STATISTICS Flashcards
The study of statistics is often broken into what two main categories?
- descriptive statistics
- inferential statistics
inferential statistics (3)
- Frequently, it is impossible to contact every person in large populations, so a smaller group is used, called a sample.
- A researcher can draw conclusions about the larger population using the sample data.
- Focuses on using information from the sample to make conclusions about the population from which the sample was drawn.
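A minimal sketch of the idea above, assuming a simulated population (the heights, the population size, and the sample size are all made-up values for illustration): a mean computed from a small sample is used as an estimate of the population mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated heights in centimeters.
population = [random.gauss(170, 10) for _ in range(10_000)]

# In practice the whole population cannot be contacted, so draw a sample.
sample = random.sample(population, 200)

population_mean = statistics.mean(population)  # normally unknown
sample_mean = statistics.mean(sample)          # what the researcher computes

print(f"sample mean:     {sample_mean:.1f}")
print(f"population mean: {population_mean:.1f}")
```

The sample mean will usually land close to the population mean, which is what lets a researcher draw conclusions about the population from the sample.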
descriptive statistics (4)
- Focuses on summarizing survey data about a sample drawn from a population.
- Summary statistics include measures of central tendency such as mean, median, and mode; and dispersion such as range and standard deviation.
- Descriptive statistics cannot make conclusions based on the data.
- Rather, descriptive statistics is a way to present data in a meaningful way.
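The summary statistics named above can be sketched with Python's standard `statistics` module. The `ages` list is a hypothetical sample invented for illustration, and the standard deviation shown is the sample version.

```python
import statistics

# Hypothetical sample: ages of seven survey respondents (made-up values).
ages = [23, 25, 25, 31, 35, 40, 52]

# Measures of central tendency
mean_age = statistics.mean(ages)      # arithmetic average
median_age = statistics.median(ages)  # middle value when sorted
mode_age = statistics.mode(ages)      # most frequent value

# Measures of dispersion
age_range = max(ages) - min(ages)     # largest value minus smallest value
std_dev = statistics.stdev(ages)      # sample standard deviation

print("mean:", mean_age)
print("median:", median_age)
print("mode:", mode_age)
print("range:", age_range)
print("standard deviation:", round(std_dev, 2))
```

These numbers only describe the seven ages in the list; they say nothing about anyone outside the sample.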
What is data?
is information, especially facts or numbers, usually collected or computed for purposes of analysis.
Common sources of data (3)
- Social networks
- Traditional Business Systems
- Internet of Things
Data analytics
is the field of analyzing data to gain insight, draw conclusions, or make decisions.
Big data
refers to very large data sets that cannot be processed by traditional methods, and is characterized by high volume, rapid velocity of collection, and variety in type and quality.
3 Types of data analytics
- Descriptive
- Predictive
- Prescriptive
Descriptive data analytics
seeks to describe data, providing insight and knowledge.
Predictive data analytics
seeks to make predictions from data
Prescriptive data analytics
seeks to make decisions (prescriptions) based on data
Data is typically represented using what?
variables
variable
is an item that can have different (“varying”) values
Variables are often considered as being of two possible types:
- quantitative variable
- categorical variable
quantitative variable
can take on a numeric value (quantitative data) that can be measured and ordered
categorical variable (qualitative variable)
can take on the value (usually a label) of one of several categories
reason for distinguishing variable types (3)
- Each type is handled differently in data analytics
- A categorical variable typically involves counting the instances of each category, often then depicted with a bar chart or pie chart.
- But a quantitative variable is commonly plotted versus another quantitative variable, often depicted with a scatter plot or line chart
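A small sketch of how a categorical variable is typically handled, assuming a made-up list of fruit labels: the instances of each category are counted, and the counts are what a bar chart or pie chart would display. The text-only bars below are a stand-in for a real chart.

```python
from collections import Counter

# Hypothetical categorical data: favorite fruit reported by eight people.
fruits = ["apple", "orange", "apple", "grape",
          "apple", "orange", "grape", "apple"]

# Count the instances of each category.
counts = Counter(fruits)

# Text-only stand-in for a bar chart: one '#' per instance.
for category, count in counts.most_common():
    print(f"{category:<8}{'#' * count} ({count})")
```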
Two types of categorical variables are often distinguished
- Nominal
- Ordinal
Nominal variable
has no ordering, existing in name only, like apples, oranges, and grapes. (“Nominal” means “in name only”.)
Ordinal Variable
has an ordering, like disagree, neutral, and agree.
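One way to see the difference in practice, as a sketch with hypothetical survey responses: an ordinal variable can be sorted by its category order, so a middle response is meaningful, whereas a nominal variable would have no such order to sort by. The names `CATEGORY_ORDER` and `responses` are invented for this example.

```python
# Ordinal categories carry an ordering, recorded here explicitly.
CATEGORY_ORDER = ["disagree", "neutral", "agree"]

# Hypothetical survey responses (made-up values).
responses = ["agree", "disagree", "neutral", "agree", "disagree"]

# Map each label to its position in the ordering.
rank = {label: position for position, label in enumerate(CATEGORY_ORDER)}

# Sort by position in the ordering, not alphabetically.
sorted_responses = sorted(responses, key=lambda label: rank[label])

# With an odd number of responses, the middle one is the median category.
median_response = sorted_responses[len(sorted_responses) // 2]

print(sorted_responses)
print("median response:", median_response)
```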
Two types of quantitative variables are often distinguished
- continuous variable
- discrete variable
continuous variable
can take infinitely many values along a continuum within a range, typically real numbers. Continuous variables usually represent measurements, like height (in meters) or temperature (in degrees).
discrete variable (3)
- can take only a finite number of values within a range, typically integers.
- Discrete variables usually represent countable items, like people in a family or cars in a city.
- Generally, if “number of” can be added to the beginning, the variable is discrete, like “number of people in a family”, but not “number of height”.
Data visualization
is the display of data in a format, such as a table or chart, that aims to convey particular information to a viewer.