Week 1 ML Flashcards
Scientific Approach
- Systematic pursuit of knowledge
- Logical steps: Problem - Hypotheses
- Data collection: Observation of behaviour or experimentation
- Test hypotheses and draw conclusions
Research Methods
The systematic approach to answering questions
Statistics
- Numbers that summarise observations
- Mathematical procedures to produce those numbers
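A minimal sketch of both senses of "statistics", using Python's standard `statistics` module. The reaction-time values are made up for illustration and are not from the course material.

```python
import statistics

# Hypothetical observations: reaction times in milliseconds (made-up values).
reaction_times_ms = [310, 295, 330, 305, 290]

# The procedures (mean, median, standard deviation) produce the numbers
# that summarise the observations.
mean_ms = statistics.mean(reaction_times_ms)      # arithmetic mean
median_ms = statistics.median(reaction_times_ms)  # middle value when sorted
spread_ms = statistics.stdev(reaction_times_ms)   # sample standard deviation

print(mean_ms, median_ms, round(spread_ms, 1))
```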
Scientific Method
- Theory
- Hypothesis
- Exp/Observe (research methods)
- Evidence (statistics)
- Theory
Testing a hypothesis
- Reproducible observation of the hypothesised effect in action
- Controlled (reproducible) circumstances
- Empirically observed
- Variables measured and/or controlled
- Alternative explanations controlled/eliminated
- Observation/interpretation is unbiased
The scientist practitioner model
Furthers understanding through research
- Consumers of research
- Evidence-based practice
- Inform own practice and methodology
Scientific Enquiry
- Choose something to observe
- Choose method of observation
- Describe observations
- Identify variation in observations
- Explain variations
Types of data
Categorical
-> Nominal
-> Ordinal
Continuous
-> Interval
-> Ratio
Nominal data
carries identity information only: the values are labels with no inherent order or magnitude.
For example, gender, nationality, and the number assigned to a competitor in a race are all nominal data.
-> names or labels whose values carry no numerical meaning
Ordinal data
describes identity and also has magnitude, so values can be ranked.
For example, medal positions in a race are ordinal data. They have a sequential order (the gold medalist beat the silver medalist, who beat the bronze medalist), but the measure tells us nothing about the interval between competitors: they are simply categorised as 1, 2, 3.
-> data that is ordered but has no fixed intervals
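A small sketch of the medal example, assuming made-up competitor names and finishing times. It shows that positions can be sorted, while the gaps between adjacent positions are only visible in the underlying times.

```python
# Hypothetical race results: finishing positions are ordinal data.
positions = {"Ana": 1, "Ben": 2, "Cy": 3}

# Ordering is meaningful, so we can sort competitors by position.
ranking = sorted(positions, key=positions.get)

# Hypothetical finishing times (seconds). The gaps between adjacent
# positions need not be equal, which the positions alone do not reveal.
times_s = {"Ana": 9.8, "Ben": 9.9, "Cy": 11.2}
gap_1_to_2 = round(times_s["Ben"] - times_s["Ana"], 1)
gap_2_to_3 = round(times_s["Cy"] - times_s["Ben"], 1)

print(ranking, gap_1_to_2, gap_2_to_3)
```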
Interval data
a continuous type of variable that has identity, magnitude, and fixed intervals between units of measurement.
For example, temperature. The difference between 20 degrees and 30 degrees is the same as the difference between 60 degrees and 70 degrees. We can order data points by magnitude, as with ordinal data, and we can also quantify the amount of difference between them.
-> ordered data with fixed intervals (but no true zero), so differences between values can be quantified
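A sketch of the temperature example, assuming the degrees are Celsius (the card does not name the scale). The helper `c_to_f` is a name introduced here for illustration. Equal differences stay equal after converting to Fahrenheit, but ratios between values do not survive the conversion, which is why "twice as hot" is not a meaningful statement for interval data.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Equal differences on the Celsius scale stay equal after a change of units.
diff_low = c_to_f(30) - c_to_f(20)
diff_high = c_to_f(70) - c_to_f(60)

# Ratios are not preserved, because zero on either scale is arbitrary.
ratio_c = 20 / 10
ratio_f = c_to_f(20) / c_to_f(10)

print(diff_low == diff_high, ratio_c, ratio_f)
```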
Ratio data
has identity, magnitude, fixed intervals, and a true zero.
For example, the time taken to run a race cannot be negative.
-> e.g. time or height: fixed intervals plus a “true zero” below which the data cannot fall
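A short sketch of why the true zero matters, using made-up race times. Because zero seconds and zero minutes mean the same thing (no time elapsed), a ratio such as "twice as long" holds whichever unit is used.

```python
# Hypothetical race times in seconds (made-up values).
time_a_s = 60.0
time_b_s = 120.0

# Changing units (seconds to minutes) keeps zero at zero.
time_a_min = time_a_s / 60
time_b_min = time_b_s / 60

# So "B took twice as long as A" is true in either unit.
ratio_s = time_b_s / time_a_s
ratio_min = time_b_min / time_a_min

print(ratio_s, ratio_min)
```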
Descriptive statistics
- Each observation is a “datum”; the plural is “data”
- A collection of data is often called a “data set”
- Different types of data are analysed in different ways
- Most basic description is how frequently similar observations occurred
- Easiest description to follow is a picture
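The most basic description, how often each kind of observation occurred, can be sketched as a frequency table with `collections.Counter`. The nationality codes below are made-up sample data.

```python
from collections import Counter

# Hypothetical nominal observations (made-up sample).
nationalities = ["NZ", "AU", "NZ", "UK", "NZ", "AU"]

# Count how often each category was observed.
freq = Counter(nationalities)

# Print each category with its count and relative frequency,
# most frequent first.
for category, count in freq.most_common():
    print(f"{category}: {count} ({count / len(nationalities):.0%})")
```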
Data through pictures
- Bar graph
- Line graph
- Pie chart
- Scatter Plot
Categorical data through pictures
- Pie chart
- Bar graph
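As a rough illustration of a bar graph for categorical data, the sketch below draws one bar per category in plain text, with bar length equal to the count. The function name `text_bar_graph` and the medal counts are assumptions made for this example; a plotting library would normally be used instead.

```python
def text_bar_graph(counts):
    """Return a plain-text bar graph, one line per category."""
    lines = []
    for category, count in sorted(counts.items()):
        # Left-align the label in 8 characters, then one '#' per observation.
        lines.append(f"{category:<8}{'#' * count} {count}")
    return "\n".join(lines)

# Hypothetical medal counts (made-up values).
medal_counts = {"gold": 1, "silver": 2, "bronze": 3}

chart = text_bar_graph(medal_counts)
print(chart)
```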