Topic 1 - Data, Samples and Descriptive Statistics Flashcards
Define population
The entire/complete group we are interested in
Define sample
A smaller group, drawn from the population, on which data is collected
Define parameter
Numerical measures which describe specific characteristics of a POPULATION (remember p parameter for p population)
Define statistics
Numerical measures computed from a sample (remember s statistics for s sample)
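A minimal sketch of the parameter/statistic distinction, assuming a made-up population of 1,000 values (the numbers, seed and sample size of 30 are illustrative only, not from the flashcards). The same measure, the mean, is a parameter when computed on the whole population and a statistic when computed on a sample.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1,000 made-up values (an assumption for illustration).
population = [rng.gauss(50, 10) for _ in range(1000)]

# Parameter: a numerical measure computed from the whole POPULATION.
population_mean = statistics.fmean(population)

# Statistic: the same measure computed from a SAMPLE of that population.
sample = rng.sample(population, 30)
sample_mean = statistics.fmean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two values will usually be close but not identical, which is why the statistic is only an estimate of the parameter.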
How many areas can statistical analysis be divided into?
2 areas
What are the areas that statistical analysis can be divided into?
1) Descriptive statistics
2) Inferential statistics
What are descriptive statistics?
Numerical and graphical methods used to summarise and present information in a meaningful way
What are inferential statistics?
When you make inferences or predictions about the population based on the findings from your sample
e.g. things like hypothesis testing
Describe the general process in statistics
1) From the population you collect data to form your sample
2) Using descriptive statistics you can describe the sample (statistics)
3) Using the statistics found you can use inferential statistics to make estimates about the parameters (features) of the population
4) This can then allow you to draw conclusions about the population
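The four steps above can be sketched roughly as follows. Everything here is an assumption for illustration: the population is simulated, the trait is a simple yes/no, and the approximate 95% interval uses the normal approximation, which is only one of several ways to make an inferential estimate.

```python
import math
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 units, each either has a trait (True) or not.
population = [rng.random() < 0.4 for _ in range(10_000)]

# 1) From the population, collect data to form the sample.
n = 200
sample = rng.sample(population, n)

# 2) Descriptive statistics: describe the sample (here, the sample proportion).
p_hat = sum(sample) / n

# 3) Inferential statistics: estimate the population proportion.
#    Normal-approximation standard error; ignores the finite population
#    correction, which is small here because n is 2% of the population.
std_error = math.sqrt(p_hat * (1 - p_hat) / n)
low = p_hat - 1.96 * std_error
high = p_hat + 1.96 * std_error

# 4) Draw a conclusion about the population.
print(f"sample proportion: {p_hat:.3f}")
print(f"approximate 95% interval for the population proportion: "
      f"({low:.3f}, {high:.3f})")
```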
What is simple random sampling?
Each unit of the population has an equal probability of being chosen to be part of the sample- each individual is chosen randomly (entirely by chance)
- gives a good, unbiased representation of the population
What are the variations of simple random sampling?
1) With replacement- the sample unit is drawn and then returned to the population after its characteristics are recorded
2) Without replacement- the unit drawn is not returned to the population after its characteristics are recorded- … each unit can only be drawn once (NOTE: here the sample values are not independent, as each draw affects the next- you cannot draw a unit you drew previously)
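A short sketch of the two variations, assuming a hypothetical population of ten units labelled 1 to 10. In Python's standard library, `random.choices` draws with replacement and `random.sample` draws without replacement.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: ten units labelled 1..10.
population = list(range(1, 11))

# With replacement: each drawn unit goes back, so it can be drawn again.
with_replacement = rng.choices(population, k=5)

# Without replacement: a drawn unit stays out, so each unit appears at most once.
without_replacement = rng.sample(population, k=5)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```

Duplicates can appear in the first list but never in the second, which is the dependence between draws that the note above describes.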
How could you describe the shape of a distribution shown in a histogram (or in another graphical representation of data)?
1) Symmetric distribution- the left and right halves of the distribution mirror each other
2) Positively skewed- skewed to the right- most of the data sits on the left, with a long tail stretching to the right
3) Negatively skewed- skewed to the left- most of the data sits on the right, with a long tail stretching to the left
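One way to check the three shapes numerically is a skewness coefficient. The sketch below uses the moment-based coefficient (the average cubed z-score), which is one common measure and is not mentioned in the flashcards; the three small data sets are made up for illustration. A positive value indicates a tail to the right, a negative value a tail to the left, and a value near zero indicates symmetry.

```python
import statistics


def skewness(data):
    """Moment-based skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)  # population standard deviation
    return sum(((x - mean) / sd) ** 3 for x in data) / len(data)


# Hypothetical data sets (assumptions for illustration).
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 1, 2, 2, 2, 3, 3, 4, 6, 12]   # long tail to the right
left_skewed = [13 - x for x in right_skewed]      # mirror image: tail to the left

for name, data in [("symmetric", symmetric),
                   ("positively skewed", right_skewed),
                   ("negatively skewed", left_skewed)]:
    print(f"{name:>18}: skewness = {skewness(data):+.2f}")
```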
How many ways can you summarise the data of one variable?
2 ways
What are the ways in which you can summarise the data of one variable and briefly describe or give examples of each way?
1) Frequency distribution e.g. ordered list or table
2) Histograms- a graph summarising the data in a frequency distribution- the vertical axis shows either frequency, relative frequency or percentage
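A small sketch of both summaries for one variable, assuming ten made-up observations. It builds a frequency table with `collections.Counter`, then prints frequency, relative frequency and percentage alongside a rough text histogram (one `#` per observation).

```python
from collections import Counter

# Hypothetical observations of one variable (an assumption for illustration).
data = [2, 3, 3, 1, 2, 3, 4, 2, 3, 5]

freq = Counter(data)   # frequency distribution: value -> count
n = len(data)

print(f"{'value':>5} {'frequency':>9} {'relative':>8} {'percent':>7}")
for value in sorted(freq):
    count = freq[value]
    rel = count / n    # relative frequency
    # The bar length is the frequency, as on a histogram's vertical axis.
    print(f"{value:>5} {count:>9} {rel:>8.2f} {rel:>7.0%}  {'#' * count}")
```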
In how many ways can you display the relationship between two variables?
2 general ways