PRELIM LEC 2 (2): DESCRIPTIVE STATISTICS Flashcards
Deals with the collection and presentation of data and collection of summarizing values to describe its group characteristics
DESCRIPTIVE STATISTICS
✔ central thread of any activity
✔ Understanding the nature of data is most fundamental for proper and effective use of statistical skills
DATA
2 TYPES OF DATA
ACCORDING TO SOURCE
ACCORDING TO FUNCTIONAL RELATIONSHIP
2 TYPES OF DATA ACCORDING TO SOURCE
Primary Data
Secondary Data
interview, registration, experiment, questionnaire, etc
PRIMARY DATA
book, journal, newspaper, thesis, dissertation, etc.
SECONDARY DATA
2 TYPES OF DATA ACCORDING TO FUNCTIONAL RELATIONSHIP
Independent Data
Dependent Data
refers to any controlling data
Independent Data
refers to any data that is affected by controlling data
Dependent Data
METHODS OF COLLECTING DATA
o Objective Methods
o Subjective Methods
o Use of Existing Records
METHODS IN PRESENTING DATA
o Textual
o Tabular
o Graphical
summarizes a data set by giving a “typical value” within the range of the data values that describes its location relative to the entire data set
Measure of Location
MIN is the smallest value in the data set while MAX is the largest value in the data set
Minimum and Maximum
it is the average of the data
Mean
Properties of the Mean
o Uniqueness
o Simplicity
o Affected by extreme values
Divides the observations into two equal parts
o If n is odd, the median is the middle number.
o If n is even, the median is the average of the two middle numbers.
Median
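The mean and median cards above can be illustrated with a minimal sketch using Python's standard `statistics` module. The data values are hypothetical and chosen only to show the odd and even cases and the effect of one extreme value.

```python
from statistics import mean, median

odd_n = [2, 4, 6, 8, 10]           # n = 5 (odd)
even_n = [2, 4, 6, 8, 10, 12]      # n = 6 (even)
with_outlier = [2, 4, 6, 8, 100]   # odd_n with one extreme value

median_odd = median(odd_n)         # the middle number of the sorted data
median_even = median(even_n)       # the average of the two middle numbers

mean_plain = mean(odd_n)
mean_outlier = mean(with_outlier)      # pulled upward by the extreme value
median_outlier = median(with_outlier)  # not moved by the extreme value

print(median_odd, median_even)
print(mean_plain, mean_outlier, median_outlier)
```

Comparing `mean_outlier` with `median_outlier` shows why "affected by extreme values" is listed as a property of the mean.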
Value that occurs most often
Mode
A data set that has only one value that occurs with the greatest frequency
Unimodal
If a data set has two values that occur with the same greatest frequency, both values are modes
Bimodal
If a data set has more than two values that occur with the same greatest frequency, each value is used as the mode
Multimodal
When no data value occurs more than once
No mode
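The four mode cards above can be sketched as one small function. The name `classify_modes` is hypothetical, and the function assumes a non-empty data set; it simply applies the card definitions as written.

```python
from collections import Counter

def classify_modes(data):
    """Return (modes, label) following the flashcard definitions."""
    counts = Counter(data)
    top = max(counts.values())      # greatest frequency in the data
    if top == 1:                    # no value occurs more than once
        return [], "no mode"
    modes = sorted(v for v, c in counts.items() if c == top)
    if len(modes) == 1:
        label = "unimodal"
    elif len(modes) == 2:
        label = "bimodal"
    else:
        label = "multimodal"
    return modes, label

print(classify_modes([1, 2, 2, 3]))
print(classify_modes([1, 1, 2, 2, 3]))
print(classify_modes([1, 1, 2, 2, 3, 3, 4]))
print(classify_modes([1, 2, 3]))
```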
values that divide the distribution into 100 equal parts. P10 or tenth percentile locates the point that is greater than 10 percent of the items in the distribution
Percentiles
values that divide a distribution into 10 equal parts. The 1st decile is the 10th percentile; the 2nd decile is the 20th percentile…
Deciles
Divide an array into four equal parts, each part having 25% of the distribution of the data values. The 1st quartile is the 25th percentile; the 2nd quartile is the 50th percentile (also the median); and the 3rd quartile is the 75th percentile.
Quartiles
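The relationship between percentiles, deciles, and quartiles can be checked with `statistics.quantiles` (Python 3.8+). This is a sketch on hypothetical data, the whole numbers 0 to 100, and it assumes the "inclusive" interpolation method; other conventions can give slightly different cut points on other data sets.

```python
from statistics import median, quantiles

scores = list(range(101))  # hypothetical data: 0, 1, ..., 100

# Each call returns the cut points only: 3 for quartiles, 9 for deciles,
# and 99 for percentiles.
quartiles = quantiles(scores, n=4, method="inclusive")
deciles = quantiles(scores, n=10, method="inclusive")
percentiles = quantiles(scores, n=100, method="inclusive")

print(quartiles)                    # Q1, Q2, Q3
print(deciles[0], percentiles[9])   # 1st decile and 10th percentile
print(quartiles[1], median(scores)) # 2nd quartile and the median
```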
o single value that is used to describe the spread of the distribution
o A measure of central tendency alone does not uniquely describe a distribution
Measures of Dispersion
2 TYPES OF MEASURES OF DISPERSION
ABSOLUTE MEASURES OF DISPERSION
RELATIVE MEASURE OF DISPERSION
difference between the maximum and minimum value in a data set
Range
distance or range between the 25th percentile and the 75th percentile
Interquartile range
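A minimal sketch of the range and interquartile range cards, on hypothetical data. The quartiles are computed with the "inclusive" method of `statistics.quantiles`, which is an assumption; the cards do not say which quartile convention to use.

```python
from statistics import quantiles

values = [7, 1, 9, 3, 5, 2, 8, 4, 6]  # hypothetical, unsorted data

data_range = max(values) - min(values)   # maximum minus minimum

# quantiles() sorts the data internally and returns Q1, Q2, Q3 for n=4.
q1, q2, q3 = quantiles(values, n=4, method="inclusive")
iqr = q3 - q1                            # 75th percentile minus 25th

print(data_range, q1, q3, iqr)
```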
it measures the dispersion, or scatter, of the values about their mean
Variance
is the square root of variance
For a normal distribution, the share of values within: ● ±1 SD = 68.3% ● ±2 SD = 95.4% ● ±3 SD = 99.7%
Standard Deviation
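The variance and standard deviation cards can be sketched as follows. The cards do not say whether the population form (divide by n) or the sample form (divide by n - 1) is intended, so both are shown as an assumption. The last lines check the 68.3 / 95.4 / 99.7 percentages against a standard normal distribution, which is the case those figures apply to.

```python
from statistics import NormalDist, pstdev, pvariance, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data with mean 5

pop_var = pvariance(data)   # squared deviations summed, divided by n
pop_sd = pstdev(data)       # square root of the population variance
samp_var = variance(data)   # squared deviations summed, divided by n - 1
samp_sd = stdev(data)       # square root of the sample variance

# Share of a normal distribution within k standard deviations of the mean.
z = NormalDist()            # mean 0, standard deviation 1
coverage = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

print(pop_var, pop_sd, samp_var, samp_sd)
print({k: round(share * 100, 1) for k, share in coverage.items()})
```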
is a measure used to compare the dispersion in two sets of data; it is independent of the unit of measurement
Coefficient of Variation
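A common way to compute the coefficient of variation is the standard deviation divided by the mean, expressed as a percentage. The sketch below assumes that formula and the sample standard deviation; the data sets and the function name `coefficient_of_variation` are hypothetical.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """CV in percent: sample standard deviation relative to the mean."""
    return stdev(data) / mean(data) * 100

heights_cm = [150, 160, 170, 180, 190]  # hypothetical heights
weights_kg = [50, 60, 70, 80, 90]       # hypothetical weights

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

print(round(cv_heights, 2), round(cv_weights, 2))
```

Both sets have the same standard deviation, but the weights vary more relative to their mean, and the comparison does not depend on centimeters versus kilograms.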
A distribution is said to be symmetric about the mean if the distribution to the left of the mean is the “mirror image” of the distribution to the right of the mean
symmetry
measure of symmetry, or more precisely, the lack of symmetry. A distribution, or data set, is symmetric if it looks the same to the left and right of the center point
● Positively Skewed
● Negatively Skewed
● Symmetrical Distribution/Equal
Skewness
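The card does not give a formula for skewness. One common convention is the moment coefficient, the third central moment divided by the second central moment raised to the power 1.5; the sketch below assumes that convention, and the data sets are hypothetical.

```python
from statistics import fmean

def skewness(data):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    m = fmean(data)
    n = len(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]          # mirror image around the mean
right_tail = [1, 2, 2, 3, 10]        # long tail to the right
left_tail = [-10, -3, -2, -2, -1]    # long tail to the left

print(skewness(symmetric), skewness(right_tail), skewness(left_tail))
```

Under this convention a positive value corresponds to a positively skewed distribution, a negative value to a negatively skewed one, and zero to a symmetrical one.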
measure of whether the data are peaked or flat relative to a normal distribution.
● Leptokurtic
● Mesokurtic (Normal)
● Platykurtic
Kurtosis
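Kurtosis is often computed as the fourth central moment divided by the squared second central moment, minus 3 so that a normal distribution scores 0 ("excess kurtosis"). The sketch below assumes that convention; the card itself gives no formula, and the data sets are hypothetical.

```python
from statistics import fmean

def excess_kurtosis(data):
    """Excess kurtosis: m4 / m2**2 - 3 (population moments)."""
    m = fmean(data)
    n = len(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    return m4 / m2 ** 2 - 3

flat = [1, 2, 3, 4, 5]                        # evenly spread values
peaked = [-5, 0, 0, 0, 0, 0, 0, 0, 0, 5]      # sharp peak, heavy tails

print(excess_kurtosis(flat), excess_kurtosis(peaked))
```

Under this convention a value above 0 suggests a leptokurtic shape, a value near 0 a mesokurtic shape, and a value below 0 a platykurtic shape.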
a branch of mathematics which deals with the study of possible outcomes of an event or set of events together with the outcomes’ relative likelihood and distributions
Probability
2 TYPES OF PROBABILITY
OBJECTIVE PROBABILITY
SUBJECTIVE PROBABILITY
calculated by the process of abstract reasoning
Classical probability
depends on the repeatability of some process and the ability to count
relative frequency probability
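The difference between the classical and relative frequency cards can be sketched with a fair six-sided die, which is a hypothetical example not taken from the cards. The classical value comes from counting equally likely outcomes; the relative frequency value comes from repeating the process and counting. The seed and number of trials are arbitrary choices.

```python
import random
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]

# Classical: reason from equally likely outcomes, no experiment needed.
favorable = [f for f in faces if f % 2 == 0]
classical = Fraction(len(favorable), len(faces))

# Relative frequency: repeat the process and count how often it happens.
rng = random.Random(0)      # fixed seed so the run is repeatable
trials = 10_000
hits = sum(1 for _ in range(trials) if rng.choice(faces) % 2 == 0)
relative_frequency = hits / trials

print(classical, relative_frequency)
```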
based upon an educated guess
Subjective probability
3 PROPERTIES OF PROBABILITY THEORY
- Given some process (or experiment) with n mutually exclusive outcomes (called events), E1, E2, . . . , En, the probability of any event Ei is assigned a nonnegative number. That is, P(Ei) ≥ 0
- The sum of the probabilities of the mutually exclusive outcomes is equal to 1. P(E1) + P(E2) + … + P(En) = 1
✔ This is the property of EXHAUSTIVENESS
- Consider any two mutually exclusive events, Ei and Ej. The probability of the occurrence of either Ei or Ej is equal to the sum of their individual probabilities. P(Ei + Ej) = P(Ei) + P(Ej)
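The three properties can be checked on a small example. The sketch below assumes a fair six-sided die (a hypothetical example) and uses exact fractions so that the sum comes out to exactly 1.

```python
from fractions import Fraction

# Six mutually exclusive outcomes, each assigned probability 1/6.
probabilities = {face: Fraction(1, 6) for face in range(1, 7)}

def prob(event):
    """Probability of an event given as a set of outcomes."""
    return sum(probabilities[outcome] for outcome in event)

# Property 1: every probability is nonnegative.
nonnegative = all(p >= 0 for p in probabilities.values())

# Property 2 (exhaustiveness): the probabilities sum to 1.
exhaustive = sum(probabilities.values()) == 1

# Property 3: for mutually exclusive events, probabilities add.
e_i, e_j = {1}, {2}
additive = prob(e_i | e_j) == prob(e_i) + prob(e_j)

print(nonnegative, exhaustive, additive)
```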
Calculating the probability of an event
- Conditional Probability
✔ The conditional probability of A given B, denoted P(A|B), is the probability that event A has occurred in a trial of a random experiment for which it is known that event B has occurred.
- Joint Probability
✔ Calculates the likelihood of two events occurring together and at the same point in time
- The Multiplication Rule
- The Addition Rule
- Independent Events
✔ A and B are independent when P(A ∩ B) = P(A) * P(B) holds, which in turn is true if and only if P(A|B) = P(A) (equivalently, P(B|A) = P(B))
- Complementary Events
- Marginal Events
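Several of the items in this list can be worked through on one example. The sketch below assumes a fair six-sided die with two hypothetical events, A (an even face) and B (a face of 4 or less); the rule formulas used are the standard ones, since the cards name the rules without stating them.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}       # even face
B = {1, 2, 3, 4}    # face of 4 or less

joint = prob(A & B)                        # P(A and B)
conditional = joint / prob(B)              # P(A|B) = P(A and B) / P(B)
multiplication = conditional * prob(B)     # P(A|B) * P(B) gives P(A and B)
addition = prob(A) + prob(B) - joint       # P(A or B)
independent = joint == prob(A) * prob(B)   # product test for independence
complement = 1 - prob(A)                   # P(not A)

print(joint, conditional, addition, independent, complement)
```

Here the two events happen to be independent, so `conditional` equals `prob(A)`: knowing that B occurred does not change the probability of A.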