Introduction to Statistics Flashcards
What is data?
Data refers to any collection of facts, figures, measurements, or descriptions that can be analyzed to gain insights, answer questions, or solve problems. It’s fundamental in statistical analysis, allowing us to quantify and understand the world around us.
What is statistics?
Statistics is the science concerned with developing and studying methods for collecting, analysing, interpreting and presenting data.
What are the main types of data?
Data can be broadly categorized into two main types:
- Numeric Data (Quantitative)
- Non-Numeric Data (Categorical or Qualitative)
What is numeric data?
Numeric data, also known as quantitative data, involves numbers and can be further divided into two subcategories:
- Discrete Data
- Continuous Data
What is discrete data?
Discrete data consists of distinct, separate values that can be counted. It is often expressed in whole numbers but can include decimal values if they are exact and meaningful.
Examples of discrete data include:
- The number of students in a lecture: you can have 30, 31, or 32 students, but not 31.5 students.
- Money: you can have £1, £1.50, or £1.56, but not £1.563, because amounts are counted in whole pence.
Discrete data is counted.
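The idea that discrete data is counted can be sketched in Python. This is a minimal illustration, not a prescribed method: the names `attendance`, `price_pence`, and `format_pounds` are hypothetical, and storing money as a whole number of pence is one common convention assumed here.

```python
# Hypothetical attendance list: one entry per student present.
attendance = ["Ana", "Ben", "Chloe", "Dev"]
student_count = len(attendance)  # a count is always a whole number

# Money is discrete when stored in its smallest unit (pence).
price_pence = 156                # represents £1.56
total_pence = price_pence * 3    # still a whole number of pence


def format_pounds(pence: int) -> str:
    """Render a whole number of pence as a pounds-and-pence string."""
    return f"£{pence // 100}.{pence % 100:02d}"


print(student_count)
print(format_pounds(total_pence))
```

Keeping money as integer pence avoids ever representing an amount such as £1.563, which matches the flashcard's point that only exact, meaningful values occur.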
What is continuous data?
Continuous data can take any value within a range and can, in principle, be measured to any level of precision. In practice, recorded values are rounded, and their precision is limited by the accuracy of the measuring instrument.
Examples: height, weight.
Continuous data is measured.
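As a rough illustration of how a continuous quantity is recorded at different precisions, the sketch below rounds one height to the nearest decimetre, centimetre, and millimetre. The value `true_height_m` and the helper `record` are made up for the example.

```python
# Hypothetical "true" height in metres; a real value could have
# arbitrarily many further decimal places.
true_height_m = 1.756348


def record(value: float, decimals: int) -> float:
    """Round a measurement to the precision the instrument supports."""
    return round(value, decimals)


nearest_decimetre = record(true_height_m, 1)   # coarse tape measure
nearest_centimetre = record(true_height_m, 2)  # typical height chart
nearest_millimetre = record(true_height_m, 3)  # more precise instrument

print(nearest_decimetre, nearest_centimetre, nearest_millimetre)
```

Each recorded value is a rounded approximation of the same underlying quantity, which is why continuous data is described as measured rather than counted.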
Why is it important to be consistent with measurement units in continuous data?
Consistency in units is crucial to avoid significant errors.
For example, NASA lost the Mars Climate Orbiter in 1999 because of a unit mismatch: ground software supplied by Lockheed Martin reported thruster impulse in imperial units (pound-force seconds), while NASA's navigation software expected metric units (newton-seconds). The resulting trajectory error brought the orbiter far too close to Mars, where it is believed to have disintegrated in the atmosphere.
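The size of such a unit error can be sketched with a simple conversion. The impulse value below is hypothetical and the function name is illustrative; the conversion factor follows from the definition of the pound-force (0.45359237 kg × 9.80665 m/s²).

```python
# One pound-force expressed in newtons (exact by definition).
LBF_TO_NEWTON = 4.4482216152605


def lbf_s_to_newton_s(impulse_lbf_s: float) -> float:
    """Convert an impulse from pound-force seconds to newton-seconds."""
    return impulse_lbf_s * LBF_TO_NEWTON


# Hypothetical thruster impulse reported in pound-force seconds.
impulse_lbf_s = 100.0
impulse_n_s = lbf_s_to_newton_s(impulse_lbf_s)

# If the raw number were read as newton-seconds without converting,
# the impulse would be underestimated by this factor.
error_factor = impulse_n_s / impulse_lbf_s

print(f"{impulse_lbf_s} lbf·s = {impulse_n_s:.2f} N·s")
print(f"Reading the value unconverted is off by a factor of {error_factor:.2f}")
```

Converting every value to one agreed unit at the point where data enters a calculation is one simple way to guard against this kind of mistake.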
What is non-numeric data?
Non-numeric data, often termed categorical data, includes:
- Nominal Data: Unordered categories, with no natural ranking. Examples: types of fruit, colors, names.
- Ordinal Data: Ordered categories, with a natural ranking but without consistent differences between levels. Examples: movie ratings (poor, average, good, excellent), education levels (high school, bachelor’s, master’s, doctorate).
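The difference between the two categorical types shows up in which summaries make sense. In the sketch below (sample values are made up), nominal data supports only counting and finding the most common category, while ordinal data can also be sorted and given a median because its categories have an order.

```python
from collections import Counter

# Nominal: categories have no order, so the mode is the natural summary.
fruits = ["apple", "banana", "apple", "cherry", "apple"]
most_common_fruit = Counter(fruits).most_common(1)[0][0]

# Ordinal: an explicit ranking makes sorting and medians meaningful.
RATING_ORDER = ["poor", "average", "good", "excellent"]
rank = {label: position for position, label in enumerate(RATING_ORDER)}

ratings = ["good", "poor", "excellent", "average", "good"]
sorted_ratings = sorted(ratings, key=lambda label: rank[label])
median_rating = sorted_ratings[len(sorted_ratings) // 2]  # odd-length sample

print(most_common_fruit)
print(sorted_ratings)
print(median_rating)
```

The rank numbers only encode order; the gap between "poor" and "average" is not assumed to equal the gap between "good" and "excellent", so averaging the ranks would not be meaningful.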
What is time-series data?
- Data recorded in time order over a period of time, usually at regular intervals.
- The same quantity is collected repeatedly (e.g. hourly, daily, or monthly).
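A time series can be sketched as a list of (timestamp, value) pairs. The readings and start time below are hypothetical, and the hourly spacing is just one possible interval.

```python
from datetime import datetime, timedelta

# Hypothetical hourly temperature readings in degrees Celsius.
start = datetime(2024, 1, 1, 0, 0)
readings = [12.1, 11.8, 11.5, 11.9]

# Pair each reading with its timestamp, one hour apart.
series = [
    (start + timedelta(hours=i), value)
    for i, value in enumerate(readings)
]

# Check that consecutive observations are equally spaced.
intervals = {
    later[0] - earlier[0]
    for earlier, later in zip(series, series[1:])
}
is_regular = len(intervals) == 1

for timestamp, value in series:
    print(timestamp.isoformat(), value)
print("Regular spacing:", is_regular)
```

What distinguishes this from an ordinary list of measurements is that the order and spacing of the observations carry information.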