Section A.1: Introduction to Statistics Flashcards
What is data?
Data are measurements or observations collected as a source of information
What is statistics?
(A, I, P, O)
Statistics is a branch of mathematics that deals with the
- Analysis
- Interpretation
- Presentation
- Organisation
of data.
What are the two types of statistical analysis?
Descriptive analysis and inferential analysis.
Descriptive statistics
Descriptive statistics refers to the statistical measures used to summarise and describe the basic properties of data. It is used in univariate, bivariate, and multivariate analysis.
Types of statistical measures in descriptive statistics (2)
Measures of Central Tendency
Measures of Dispersion/Variability
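As a rough illustration of the two families of measures, the sketch below computes a few common ones with Python's standard `statistics` module. The `scores` list is made-up example data, and the particular measures chosen (mean, median, mode, range, variance, standard deviation) are an assumption about what the card has in mind.

```python
import statistics

# Hypothetical sample: exam scores for eight students.
scores = [55, 62, 62, 70, 74, 81, 90, 98]

# Measures of central tendency: where the data is centred.
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of dispersion: how spread out the data is.
value_range = max(scores) - min(scores)
variance = statistics.variance(scores)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(scores)      # sample standard deviation

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={value_range}, variance={variance:.2f}, std_dev={std_dev:.2f}")
```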
Inferential statistics
Inferential statistics refers to the processes that make inferences about a population based on a representative sample of the population.
Define statistical analysis
Statistical analysis refers to the analysis of data in order to gather insights, and discover underlying patterns and trends.
Define: Population Distribution
Population Distribution refers to the spread of a characteristic or variable across an entire population
What is meant by sample?
A sample is a subset of a population that is selected to represent the entire population.
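A minimal sketch of drawing a sample and using it to infer something about the population. The population here is simulated (heights drawn from an assumed normal distribution), and the sample size of 200 is an arbitrary choice for illustration.

```python
import random
import statistics

# Hypothetical population: heights in cm for 10,000 people, simulated here.
rng = random.Random(42)  # fixed seed so the sketch is reproducible
population = [rng.gauss(170, 10) for _ in range(10_000)]

# A sample is a subset of the population; drawing it at random
# helps it stay representative.
sample = rng.sample(population, 200)

# Inference: use the sample mean as an estimate of the population mean.
sample_mean = statistics.mean(sample)
population_mean = statistics.mean(population)

print(f"sample mean:     {sample_mean:.2f}")
print(f"population mean: {population_mean:.2f}")
```

In practice the population mean is usually unknown, which is why the sample estimate is needed; it is computed here only to show how close the estimate tends to be.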
Categorical Variable
A categorical variable is a character- or feature-based variable, that is, any variable that is not numerical. It is qualitative (characterised by qualities or features) and descriptive in nature.
Categorical variable data types (3)
Nominal - has no natural order or ranking
Binary - has exactly two possible values (eg. True, False)
Ordinal - has a natural order or ranking
Nominal data does not have a natural order or ranking (T/F)
True, it does not have a natural order or ranking.
Eg. Gender, eye colour.
Ordinal data
Ordinal data is data that has a natural order or ranking.
Eg. Education level, size (small, medium, large)
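The difference between ordinal and nominal data shows up when sorting. The sketch below assumes a hypothetical size variable with the ordering small < medium < large; the names `SIZE_ORDER` and `rank` are illustrative only.

```python
# Assumed ordering for a hypothetical "size" variable.
SIZE_ORDER = ["small", "medium", "large"]
rank = {label: position for position, label in enumerate(SIZE_ORDER)}

observations = ["large", "small", "medium", "small", "large"]

# Ordinal data can be sorted by its natural rank.
by_rank = sorted(observations, key=rank.__getitem__)

# A plain alphabetical sort ignores that rank and gives a different order.
alphabetical = sorted(observations)

print(by_rank)
print(alphabetical)
```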
Binary Data
A type of nominal data which has exactly two possible values
Eg. A or B, Yes or No.
Numerical variable - data types (2)
Discrete and Continuous
Discrete data
Discrete data is data that can only take on certain separate values, usually whole numbers, unlike continuous data.
Eg. Number of siblings
Continuous data
Continuous data is data that can take on ANY value within a range, unlike discrete data.
Eg. Height, temperature
Data Dimension
Data dimension refers to the number of variables or features in a dataset.
“A dataset can have one, two, or more dimensions depending on the number of variables being measured.”
Data Dimension types (3)
Univariate or one-dimensional: One variable (age)
Bivariate or two-dimensional: Two variables (age vs. height)
Multivariate (three-dimensional or more): 3 or more variables (Age vs. height vs. weight)
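One way to picture data dimension is to count the variables an analysis uses. The following is a sketch under assumed names: the `dataset` rows and the `describe_dimension` helper are hypothetical and exist only for illustration.

```python
# Hypothetical dataset: each row is one observation, each key is one variable.
dataset = [
    {"age": 21, "height_cm": 172.5, "weight_kg": 68.0},
    {"age": 34, "height_cm": 160.2, "weight_kg": 59.4},
    {"age": 45, "height_cm": 181.0, "weight_kg": 82.3},
]


def describe_dimension(rows, variables):
    """Label an analysis by the number of variables it uses."""
    if not variables:
        raise ValueError("at least one variable is required")
    # Every chosen variable must be present in every observation.
    missing = [name for name in variables if any(name not in row for row in rows)]
    if missing:
        raise KeyError(f"variables not in dataset: {missing}")
    if len(variables) == 1:
        return "univariate"
    if len(variables) == 2:
        return "bivariate"
    return "multivariate"


print(describe_dimension(dataset, ["age"]))
print(describe_dimension(dataset, ["age", "height_cm"]))
print(describe_dimension(dataset, ["age", "height_cm", "weight_kg"]))
```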
Variable
A variable is a characteristic or attribute that can take on different values for different observations
Independent variable
An independent variable is a variable that is manipulated or controlled by the researcher in a study to measure its effect on the dependent variable(s). It is conventionally plotted on the x-axis.
Dependent variable
A dependent variable is a variable whose response is measured or predicted in a study. It is conventionally plotted on the y-axis.
What are bar graphs used for?
Bar graphs are used to visualise groups of categorical data: they show the frequency or distribution of observations in each category and allow magnitudes to be compared across categories.
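The quantity a bar graph displays is the frequency of each category. As a minimal sketch, the code below counts made-up eye colour observations and prints a text-only bar per category; a plotting library would normally draw the bars, but none is assumed here.

```python
from collections import Counter

# Hypothetical nominal data: eye colour of six people.
eye_colours = ["brown", "blue", "brown", "green", "brown", "blue"]

# Frequency of observations in each category.
frequencies = Counter(eye_colours)

# One bar per category, longest first; bar length equals the count.
for category, count in frequencies.most_common():
    print(f"{category:<6} {'#' * count} ({count})")
```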
What are bar graphs not used for? (2)
Numerical data (use a histogram instead), and continuous data or categories that are not independent of each other.
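For contrast, a histogram groups numerical values into adjacent intervals (bins) rather than separate categories. The sketch below uses made-up height data; the bin width of 10 and the starting edge of 150 are assumptions chosen for illustration.

```python
# Hypothetical continuous data: heights in cm.
heights = [151.2, 158.9, 160.0, 163.4, 167.8, 169.9, 170.0, 174.5, 181.3, 189.6]

BIN_WIDTH = 10  # assumed bin width
START = 150     # assumed lower edge of the first bin

# Count how many values fall into each interval [lower, lower + BIN_WIDTH).
bin_counts = {}
for height in heights:
    lower = START + int((height - START) // BIN_WIDTH) * BIN_WIDTH
    bin_counts[lower] = bin_counts.get(lower, 0) + 1

# Print one text bar per bin, in ascending order of the bin's lower edge.
for lower in sorted(bin_counts):
    print(f"{lower}-{lower + BIN_WIDTH}: {'#' * bin_counts[lower]}")
```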