1 Exploring and understanding Data Flashcards
What makes a bad graph?
- Not using the correct graph
- Not using the correct scale
- Using 3D
- Using % to compare groups with unequal sample sizes
- Inappropriate extrapolation
- Distorted perspective
- Pie Charts
- Suppress the origin or change the base line
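The point about percentages and unequal sample sizes can be illustrated with a minimal Python sketch. The group names and counts are made up for illustration only; they are not from any real data set.

```python
# Hypothetical counts: (successes, sample size) for two groups.
groups = {"A": (1, 2), "B": (450, 1000)}


def percent(successes, n):
    """Return the percentage of successes in a sample of size n."""
    return 100 * successes / n


summary = {name: percent(s, n) for name, (s, n) in groups.items()}

for name, (s, n) in groups.items():
    # Printing the raw counts next to the percentage exposes the tiny sample.
    print(f"Group {name}: {summary[name]:.0f}% ({s} of {n})")
```

Shown as bare percentages, group A (50%) appears to beat group B (45%), yet A rests on only two cases. Reporting the counts alongside the percentages is one way to avoid misleading the reader.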
What is causation?
A relationship (correlation) between two variables does not by itself mean ‘X causes Y.’
What is the problem with averages in statistics?
The average (mean) can be strongly affected by a single extreme outlier.
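A small sketch of this effect in Python, using made-up values. The median is included only as a point of comparison and is an addition for illustration, not part of the card itself.

```python
from statistics import mean, median

# Hypothetical values, then the same values plus one extreme outlier.
values = [10, 11, 12, 13, 14]
with_outlier = values + [1000]

mean_before = mean(values)
mean_after = mean(with_outlier)

median_before = median(values)
median_after = median(with_outlier)

print(f"mean:   {mean_before} -> {mean_after:.2f}")
print(f"median: {median_before} -> {median_after}")
```

A single extreme value drags the mean far away from the bulk of the data, while the median barely moves.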
What is bad sampling?
Bad sampling is when the sampling or data processing reduces the information content of the data. Results extracted from a sample cannot be better than the sample itself.
What are the two reasons we use graphs?
- Exploration (to find the story of the data)
- Explanation (to tell the story to an audience)
What are the rules for good data visualisation?
- Use an appropriate graph for your variable type.
- Check your data
- Label the axes
- Include a legend or figure caption
- Maintain integrity (do not misrepresent the data)
What makes a good graph?
- Only 2D
- Do not distort perspective
- Check you have not exaggerated main features
- No pie charts
- Summarise complex data into its simplest understandable form
- Split the data up if needed
- Only use lines to join points if the data are continuous. Leave gaps for missing data.
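The last rule (leave gaps for missing data) can be sketched in plain Python. The helper `split_at_gaps` and the readings are hypothetical; the idea is to break a series into runs of observed values so that a line is drawn only within each run and never across a gap.

```python
def split_at_gaps(values):
    """Split a series into runs of (position, value) pairs, breaking at None."""
    segments, current = [], []
    for x, y in enumerate(values):
        if y is None:
            # A missing value ends the current run, so no line crosses the gap.
            if current:
                segments.append(current)
                current = []
        else:
            current.append((x, y))
    if current:
        segments.append(current)
    return segments


# Hypothetical readings with two missing observations in the middle.
readings = [3.1, 3.4, None, None, 4.0, 4.2]
segments = split_at_gaps(readings)
print(segments)
```

Each segment would then be plotted as its own line, leaving a visible gap where the data are missing.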
Why have 2 graphs?
You need two graphs if the data can’t be adequately expressed in one graph.
Statistical Language: What are Cases?
Cases are the individuals or objects being described.
Statistical Language: What is a variable?
A variable is a characteristic of a case.
Statistical Language: What is Data?
Data are the observed values of the variables.
Statistical Language: What is a Data Set?
A data set contains the observed values of the variables for a group of individuals.
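The four terms above can be tied together with a tiny made-up data set in Python. The names, variables and values are all hypothetical and serve only to illustrate the vocabulary.

```python
# Hypothetical data set: each dictionary is one case (here, a student).
data_set = [
    {"name": "Ana", "height_cm": 162, "eye_colour": "brown"},
    {"name": "Ben", "height_cm": 175, "eye_colour": "blue"},
    {"name": "Caz", "height_cm": 168, "eye_colour": "green"},
]

# Cases: the individuals being described.
n_cases = len(data_set)

# Variables: the characteristics recorded for every case.
variables = list(data_set[0].keys())

# Data: the observed values of one variable across all cases.
heights = [case["height_cm"] for case in data_set]

print(n_cases, variables, heights)
```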