Statistics definitions Flashcards
Descriptive statistics
When data is summarised and presented (ex: in tables, charts or averages)
Inferential statistics
When statisticians try to predict or forecast for a whole population based on responses from a small group
Explanatory variable
Variable whose effect on the response we study (independent)
Response variable
Variable whose changes we wish to study (dependent)
Data set
All observations of a particular variable for the elements of the sample; a set of facts or values
Categorical Data
Data from questions that cannot be answered with numbers (qualitative)
Numerical Data
Data from questions that are answered with numbers (quantitative)
Ordinal Data
Categorical data that can be ordered (ex: grades H1, H2)
Nominal Data
Categorical data that cannot be ordered (ex: hair colour)
Continuous Data
Numerical data that can take any value within a range, so there are infinitely many possible values (ex: rainfall measurements)
Discrete Data
Numbers or measurements that can only take certain separate values (ex: shoe size)
Univariate data
Data with one variable recorded for each item
Bivariate data
Data with two variables recorded for each item, giving paired values
Observational studies
Researchers collect information but do not influence events; includes case-control studies
A designed experiment
Researchers apply some treatment to a group, then observe the effect it has on them; a control group can be used for comparison
Experiment
A controlled study in which the researcher tries to understand cause-and-effect relationships
Population
The complete set of data under consideration
Census
The collection of data from the whole population
Sample
A small group selected from the population
Statistical inference
Conclusions drawn from a sample are applied to the whole population
Parameter
A number that describes a population characteristic
Bias
Anything that distorts the data so it will not give a representative sample
Causes of bias
Sample too small
Low response rates
Error in recording data
Failing to identify correct population
Name five sampling methods
Simple random sampling, stratified random sampling, systematic sampling, quota sampling, cluster sampling
Simple random sample
A sample of size n is selected in such a way that every possible sample of size n from the population has an equal chance of being selected (ex: lottery)
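As a rough sketch of the lottery idea, Python's standard `random.sample` draws n distinct items so that every possible group of n is equally likely. The population of 20 numbered people, the function name and the seed are made up for illustration.

```python
import random

# Hypothetical population: 20 people labelled 1 to 20
population = list(range(1, 21))

def simple_random_sample(items, n, seed=None):
    """Draw n distinct items; every possible group of n is equally likely."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(items, n)

sample = simple_random_sample(population, 5, seed=1)
print(sample)
```

Leaving `seed` as `None` gives a different draw each run, which is closer to a real lottery.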
Stratified random sample
Population is divided into subgroups or strata, with subjects in each group sharing a characteristic (ex: age). A simple random sample is selected from each stratum. The number taken from each group is proportional to the size of the stratum relative to the population
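The proportional allocation can be sketched as below. The school with 60 juniors and 40 seniors is invented for the example, and rounding each share to the nearest whole number is an assumption of this sketch (with awkward numbers the rounded shares may not add up exactly to the sample size).

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """strata maps a stratum name to a list of its members.
    Takes a simple random sample from each stratum, sized in
    proportion to that stratum's share of the population."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    chosen = {}
    for name, members in strata.items():
        share = round(sample_size * len(members) / total)
        chosen[name] = rng.sample(members, share)
    return chosen

# Hypothetical school: 60 juniors and 40 seniors, sample of 10 wanted
strata = {
    "junior": [f"J{i}" for i in range(60)],
    "senior": [f"S{i}" for i in range(40)],
}
result = stratified_sample(strata, 10, seed=1)
print({name: len(picked) for name, picked in result.items()})
# prints {'junior': 6, 'senior': 4}
```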
Systematic Sampling
Constant skip method. One person is selected at random, then every kth name forwards/backwards is selected. k is the size of the population divided by the sample size needed
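A minimal sketch of the constant skip method, assuming a simplified variant: the random start is chosen among the first k names and the skipping only goes forwards. The list of 100 names is hypothetical.

```python
import random

def systematic_sample(names, sample_size, seed=None):
    """Pick a random start among the first k names, then take every kth name."""
    k = len(names) // sample_size   # skip interval: population size / sample size
    rng = random.Random(seed)
    start = rng.randrange(k)        # random starting position, 0 to k - 1
    return names[start::k][:sample_size]

# Hypothetical population of 100 names, sample of 10 wanted, so k = 10
names = [f"Person {i}" for i in range(1, 101)]
picked = systematic_sample(names, 10, seed=1)
print(picked)
```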
Cluster Sampling
Population divided into clusters (subgroups) then a simple random sample of clusters is selected and all members in each chosen cluster are surveyed
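The two steps (randomly choose clusters, then survey everyone in them) might look like this in outline. The four classes of five students are invented for the example.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """clusters maps a cluster name to a list of its members.
    Randomly selects whole clusters, then surveys every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    surveyed = [member for name in chosen for member in clusters[name]]
    return chosen, surveyed

# Hypothetical school: classes A to D with 5 students each
clusters = {f"Class {c}": [f"{c}{i}" for i in range(1, 6)] for c in "ABCD"}
chosen, surveyed = cluster_sample(clusters, 2, seed=1)
print(chosen)
print(surveyed)
```

Unlike stratified sampling, nobody from the clusters that were not chosen is surveyed.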
Quota Sampling
The population is divided into groups by set criteria and a quota of subjects is interviewed from each group, but the interviewer chooses the subjects for convenience
What are some ethical issues of clinical trials?
Experimental techniques can be healing or harmful
Benefits of trials go to future patients not subjects
In controlled trials, if a treatment works, is it ethical not to give it to the control group?
What is important when collecting data from anybody? (ethics)
Informed consent and confidentiality
What are the main survey methods? Give one advantage and one disadvantage of each (not on Brainscape but do know this)
Face-to-face interview, telephone interview, postal questionnaire, online questionnaire, observation
Questionnaire
Set of questions designed to obtain data from a population
What makes a good questionnaire
Provides instructions or an example of how to complete it
Starts with simple questions
Is as brief as possible so it can be answered quickly
Is clear about who should complete it
What makes good questionnaire questions
Use clear, simple language
Are relevant to the survey
Are not open-ended (difficult to analyse)
Do not cause offence
Provide tick boxes, a yes or no answer, or a number answer
Symmetric distribution
Values smaller and larger than the midpoint are mirror images of each other
Negatively skewed distribution
Data is skewed left, mean < median
Positively skewed distribution
Data is skewed right, mean > median
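A quick check of the mean and median rules for both kinds of skew, using two small made-up data sets (one with a long tail on the right, one with a long tail on the left):

```python
from statistics import mean, median

# Hypothetical data sets chosen to show each kind of skew
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]        # long tail on the right
left_skewed = [1, 17, 18, 18, 19, 19, 19, 20]   # long tail on the left

for label, data in [("skewed right", right_skewed), ("skewed left", left_skewed)]:
    print(label, "mean =", mean(data), "median =", median(data))
# prints:
# skewed right mean = 4.75 median = 3.0
# skewed left mean = 16.375 median = 18.5
```

The extreme value drags the mean towards the tail, while the median barely moves.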
Correlation
Measures strength of the linear association between two quantitative variables
Outlier
An individual value that falls outside the overall pattern
Correlation coefficient
Measure of the strength of the linear relationship between two sets of data. Value is between -1 and 1; values close to -1 or 1 show a strong relationship, values close to 0 a weak one
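One way to see where the number comes from is to compute it by hand. The sketch below assumes the usual (Pearson) correlation coefficient; the study hours and marks are invented data.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical bivariate data: hours studied and test marks
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 70, 72]
r = correlation(hours, marks)
print(round(r, 3))  # close to 1, so a strong positive linear relationship
```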
Line of best fit/regression line
Line that describes the relationship between the two variables. Straight line that best fits the data
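A small sketch of finding such a line, assuming "best fit" means the least squares line (the card itself does not name a method). The hours and marks are invented data.

```python
def line_of_best_fit(xs, ys):
    """Least squares line y = slope * x + intercept for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)
    return slope, intercept

# Hypothetical bivariate data: hours studied and test marks
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 70, 72]
slope, intercept = line_of_best_fit(hours, marks)
print(f"marks = {slope} * hours + {intercept}")
# prints marks = 5.5 * hours + 45.5
```

Here each extra hour of study goes with about 5.5 more marks, which describes an association in this data and is not proof of cause and effect.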