Statistics Flashcards
Descriptive Statistics?
Descriptive statistics is what we can say about a sample by observing the sample itself. This is somewhat limited and mostly consists of summaries of the data, e.g. aggregates on a column in a database table.
Inferential Statistics
Inferential statistics is what we can say about a population based on what we know about a sample. That means that we can infer (deduce or conclude from evidence rather than from explicit statements) properties of the population from a smaller sample.
In statistics what is ‘Probability’?
Probability is what we can generally say about samples from a population.
So if we know 10 % of the population are left-handed, we can expect about 10 % of a randomly taken sample to be left-handed.
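A small simulation can illustrate this. It is only a sketch: the population size, the sample size and the seed are illustrative assumptions, not part of the card.

```python
import random

# Hypothetical population: 1 = left-handed, 0 = right-handed (10 % are 1).
population = [1] * 10_000 + [0] * 90_000

rng = random.Random(42)  # fixed seed so the run is repeatable
sample = rng.sample(population, 1_000)  # simple random sample, no replacement

share_left_handed = sum(sample) / len(sample)
print(f"Left-handed share in sample: {share_left_handed:.1%}")
```

The share in the sample will be close to 10 %, but rarely exactly 10 %, because of sampling variation.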
In Probability Theory:
What does the experiment yield?
One possible outcome from the sample space.
The sample space for tossing a coin is {head, tail}
In Probability Theory
What is a ‘Sample Space S’?
The set of all possible outcomes of an experiment.
The sample space for tossing a coin is {head, tail}
In Probability Theory
What is an ‘Event E’?
An event is a set of possible outcomes of an experiment (a subset of the sample space), e.g. the event head when we toss a coin.
In Probability Theory
What is a ‘Probability of Outcome P(s)’?
The probability of an outcome is always between 0 and 1 (inclusive), and the sum of the probabilities of all possible outcomes is 1.
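As a minimal sketch, the two properties can be checked for the coin toss. The assumption here (not stated on the card) is that the coin is fair, so both outcomes are equally likely; the names `sample_space` and `P` are illustrative.

```python
from fractions import Fraction

# Sample space of a coin toss; a fair coin is assumed.
sample_space = ["head", "tail"]
P = {s: Fraction(1, len(sample_space)) for s in sample_space}

# Every probability lies in the range from 0 to 1 ...
all_in_range = all(0 <= p <= 1 for p in P.values())
# ... and the probabilities over the whole sample space sum to 1.
total = sum(P.values())

print(all_in_range, total)
```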
Descriptive Statistics
In Descriptive Statistics, which are the two different areas?
Centrality and variability
Centrality: mean, median, mode
Descriptive Statistics
What is the Mean, or average and what kind of data is it most useful for?
The mean / average is the sum of the values divided by the number of values.
Most useful with homogeneous numeric data (variables of one type) without extreme outliers. For binary data coded as 0/1, the mean gives the proportion of ones.
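The definition can be sketched in a few lines; the data values are made up for illustration.

```python
from statistics import mean

values = [2, 4, 4, 5, 10]  # illustrative data

# Mean = sum of the values divided by the number of values.
manual_mean = sum(values) / len(values)

# The standard library gives the same result.
library_mean = mean(values)

print(manual_mean, library_mean)
```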
In Descriptive Statistics
What is the Median?
What is the median in an evenly numbered data set?
The exact middle value of the sorted data set.
If n is even, the median is the mean value of the two middle elements.
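A minimal sketch of both cases follows. The helper name `median_manual` and the data are illustrative; the standard library's `statistics.median` is used as a cross-check.

```python
from statistics import median

def median_manual(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_data = [7, 1, 3]       # sorted: 1, 3, 7
even_data = [7, 1, 3, 5]   # sorted: 1, 3, 5, 7

print(median_manual(odd_data), median_manual(even_data))
print(median(odd_data), median(even_data))
```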
In Descriptive Statistics
What is the Mode?
The mode is the most frequent element.
1, 1, 2, 3, 4: mode = 1
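The example above can be reproduced by counting how often each element occurs. This is one possible sketch using `collections.Counter`; `statistics.mode` would work as well.

```python
from collections import Counter

data = [1, 1, 2, 3, 4]

# Count occurrences and take the most frequent element.
counts = Counter(data)
mode_value, frequency = counts.most_common(1)[0]

print(f"mode = {mode_value} (occurs {frequency} times)")
```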
Standard Deviation
Measure of the amount of variation in a set of values.
A low standard deviation indicates that the values are closer to the mean - the distribution is less wide.
A high standard deviation indicates that the values are spread out over a wider range.
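The following sketch compares two made-up data sets with the same mean. It assumes the population form of the standard deviation (dividing by n); the sample form divides by n - 1 instead. The helper name `std_dev` is illustrative.

```python
from math import sqrt
from statistics import pstdev

def std_dev(values):
    """Population standard deviation: root of the mean squared deviation."""
    m = sum(values) / len(values)
    return sqrt(sum((v - m) ** 2 for v in values) / len(values))

narrow = [9, 10, 11]   # values close to the mean of 10
wide = [0, 10, 20]     # same mean, values spread out

print(std_dev(narrow), std_dev(wide))
```

Both sets have a mean of 10, but the second one yields a clearly larger standard deviation.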
In Descriptive Statistics
Is Standard Deviation describing variability or centrality?
Variability: dispersion of the data
Centrality: central tendency, the typical or middle value of the data (mean, median, mode)
What is Correlation Analysis concerned with?
Correlation analysis is concerned with relations between variables, e.g. if one goes up, what happens to the other?
What is a Correlation Coefficient?
A correlation coefficient is a statistical measure of the degree to which one variable Y is related to (for example, is a linear function of) another variable X.
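The card does not say which coefficient is meant. As an assumption, the sketch below uses the Pearson correlation coefficient, which measures linear association and ranges from -1 to 1. The function name `pearson_r` and the data are illustrative.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance up to a factor of n).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # goes up when x goes up
falling = [10, 8, 6, 4, 2]   # goes down when x goes up

print(pearson_r(x, rising), pearson_r(x, falling))
```

A value near 1 means Y rises with X, a value near -1 means Y falls as X rises, and a value near 0 means no linear relation.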