Statistics and Probability Full Blown Reviewer Flashcards
“STATISTICS” comes from the Italian word
“_______” which means “______”.
stato which means state
Comes from the Italian word
“stato” which means “state”.
STATISTICS
In early times, someone who worked with
statistics concerned himself or herself with
_________________
government affairs.
The word statistics first appeared in print in the
book by ___________ entitled “Abriss
statswissen der heutigen vornehmsten
europäischen Reiche und Republiken” in 1749.
Gottfried Achenwall
During this time statistics was simply a
______________________ like
numbers of deaths, births, causes of death, etc.
collection of data on government records
In the early days of statistics, data were ____________________
This is because the theory of probability was
not yet considered part of the analysis of data.
not
used to predict future events, nor were they
analyzed in relation to other sets of data.
The term __________________ , in layman’s terms, is the degree of
likelihood for an event to happen.
Probability
The concise mathematical computation on this degree falls
under the ____________
theory of probability.
The theory of probability has its beginnings during the time of
___________
Cardano in 1525.
In 1654, a certain __________ asked an intriguing
question about probability, which provoked the fertile
minds of ____________
Chevalier de Méré, Blaise Pascal and Pierre de Fermat.
A certain Dutch mathematician ____________
also worked on the problem posed by the Chevalier.
Christiaan Huygens
The science of collecting, analyzing,
presenting, and interpreting data.
Statistics
Statistics is the science of collecting, analyzing,
presenting, and interpreting _____.
Data
Governmental
needs for __________ as well as information
about a variety of economic activities provided
much of the early __________ for the field of statistics.
census data, impetus
- is the field of statistics that focuses on the
quantitative description of a collection of data.
Descriptive statistics
-It is usually used to define the basic characteristics
of the data in a study.
Descriptive statistics
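A minimal sketch of what "defining the basic characteristics of the data" can look like in practice. The quiz scores are made-up illustration data, and the choice of summary measures (mean, median, mode, range) is an assumption, not something the cards prescribe.

```python
import statistics

# Hypothetical quiz scores for eight students (made-up data).
scores = [7, 9, 6, 8, 9, 10, 5, 9]

# Descriptive statistics summarize the data at hand without
# drawing conclusions about any larger group.
summary = {
    "count": len(scores),
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "range": max(scores) - min(scores),
}

print(summary)
```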
- It is used to draw conclusions about the
probability that a difference between samples is
either reliable or due to chance.
Inferential statistics
- In this branch of statistics, conclusions are
formulated from the direct data.
Inferential statistics
___________ describes a whole population, while a
statistic describes a sample of a given population.
Parameter
are all the information about a given
population, and this is something that is hard to
determine since it requires a lot of time, resources, and
skills.
Parameters
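To make the parameter/statistic distinction concrete, here is a small sketch. The population of heights is randomly generated, made-up data, and the seed and sample size are arbitrary assumptions chosen only so the example is repeatable.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed, an arbitrary choice for repeatability

# Hypothetical population: heights in cm of 1000 people (generated, not real data).
population = [rng.randint(140, 190) for _ in range(1000)]

# A parameter describes the whole population.
parameter = statistics.mean(population)

# A statistic describes only a sample drawn from that population.
sample = rng.sample(population, 50)
statistic = statistics.mean(sample)

print("population mean (parameter):", parameter)
print("sample mean (statistic):", statistic)
```

In real studies the parameter is usually unknown, which is why the statistic is computed from a sample instead.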
is a measure of “types” and may be
represented in terms of characteristics, names or symbols.
Qualitative data
is a measure of “values” or “counts”
and expressed in numerical values.
Quantitative data
*Basically, qualitative data answer the question “__________”
while quantitative data answer the question “________________”.
what, how many
Measure of “types” and may
be represented by names or
symbols
QUALITATIVE DATA
Describes individuals or
objects by their categories or
groups
QUALITATIVE DATA
Answer the question “what
type”
QUALITATIVE DATA
Measure of “values” or
“counts” and expressed in
numbers
QUANTITATIVE DATA
Operations such as addition
and averaging make sense
QUANTITATIVE DATA
Answer the questions “how
many”, “how much”
QUANTITATIVE DATA
Data are in original form.
Raw data
Data collected are already arranged
in a certain pattern, such as ascending or
descending order.
Array data
ARE THE
CHARACTERISTICS OF THE
INDIVIDUAL TO BE OBSERVED OR
MEASURED.
VARIABLES
Called the
predictor variable.
Independent Variable
Called the criterion
variable.
Dependent Variable
Variables that can be expressed in
decimals.
Continuous Variables
Variables that cannot be
expressed in decimals.
Discrete or Discontinuous Variables
Data that consist of names,
labels, or categories only
The data cannot be
arranged in an ordering
scheme
Numbers or symbols are
used to classify an object
or person and identify the
group it belongs to
Examples:
Gender (Male and Female)
Nationality (Filipino,
American, Japanese)
NOMINAL SCALE
Data contain the properties of nominal level.
The data can be arranged in an ordering scheme or ranked.
The difference between the values of the data cannot be determined. The interval is meaningless.
ORDINAL SCALE
Data contain the properties of ordinal level.
Data values can be ranked.
The differences between the values of the data are of known size.
The interval between the values has meaning.
The “zero” does not imply the absence of characteristics.
The ratios of data values are meaningless.
Interval scale
Data contain the properties of interval level.
The “zero” indicates the absence of the characteristics under consideration.
The ratio of data values has meaning.
Ratio scale
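A short sketch of why ratios are meaningless on an interval scale but meaningful on a ratio scale. Temperature in Celsius and length in meters are illustrative assumptions; the cards themselves do not name examples for these two scales.

```python
def celsius_to_fahrenheit(c):
    """Convert a Celsius temperature to Fahrenheit."""
    return c * 9 / 5 + 32

# Interval scale: 0 degrees Celsius does not mean "no temperature".
c_low, c_high = 10.0, 20.0
f_low, f_high = celsius_to_fahrenheit(c_low), celsius_to_fahrenheit(c_high)

ratio_c = c_high / c_low  # 2.0 in Celsius
ratio_f = f_high / f_low  # 1.36 in Fahrenheit: the "ratio" depends on the unit

# Ratio scale: a length of 0 means no length at all.
m_low, m_high = 10.0, 20.0
cm_low, cm_high = m_low * 100, m_high * 100

ratio_m = m_high / m_low     # 2.0 in meters
ratio_cm = cm_high / cm_low  # 2.0 in centimeters: the ratio survives a unit change

print(ratio_c, ratio_f, ratio_m, ratio_cm)
```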
The branch of mathematics that deals
with uncertainty is the _____________
theory of
probability.
- well-defined results.
EXPERIMENTS
If the set of all outcomes of an experiment is the
sample space or probability space
then an event is a subset of the
sample space
Formula for probability?
P(E) = (number of outcomes in the event) / (number of outcomes in the sample space)
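The formula above can be sketched in a few lines. This assumes every outcome in the sample space is equally likely, which the formula needs in order to hold; the die example and the function name `probability` are illustrative choices.

```python
from fractions import Fraction

def probability(event, sample_space):
    """Outcomes in the event divided by outcomes in the sample space.

    Assumes all outcomes in the sample space are equally likely.
    """
    event = set(event)
    sample_space = set(sample_space)
    if not event <= sample_space:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))

# Rolling one fair die: the sample space has six outcomes.
sample_space = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}  # the event "an even number comes up"

p_even = probability(even, sample_space)
print(p_even)  # prints 1/2
```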
Methods of data collection:
Observation, Interview, Questionnaire, and Database
Five Most Common Methods in Collecting Data
1) Direct Method
2) Indirect Method
3) Registration Method
4) Observation Method
5) Experimental Method
Are data collected directly by the researcher himself.
PRIMARY DATA
First hand or original sources
PRIMARY DATA
are information taken from published and unpublished materials previously gathered by other researchers or agencies, such as books, newspapers, magazines, journals, and published and unpublished theses and dissertations.
SECONDARY DATA
“interview method”
Direct Method
done through a direct and personal contact of the researcher with the person from whom data will be collected
Direct Method
“questionnaire method”
Indirect Method
- Executed through the use of either online questionnaire or paper form questionnaire distributed to groups of people.
Indirect Method
- Done through the gathering of data from concerned offices.
Registration Method
Purely based on the subjective remarks of the observer.
Observation Method
- It is applicable to data pertaining to attitude, behavior, and values of individuals.
Observation Method
The method that determines the cause and effect relationships of a certain parameter or event under a controlled condition.
Experimental Method
- This method is usually used by researchers in the field of sciences.
Experimental Method
The complete set of individuals or subjects.
POPULATION
Is just a representative part of the whole population.
SAMPLE
This sampling technique is also called simple random sampling.
PROBABILITY SAMPLING
Probability sampling is also called what?
Simple random sampling
Are randomly picked
The samples
Each member of the population has an equal chance of being picked as part of the sample.
PROBABILITY SAMPLING
Oftentimes used when the population to be considered is too large.
Restricted Random Sampling
The sample is selected by picking every k-th element of the population.
Systematic Sampling
Is a process or activity that generates data
STATISTICAL EXPERIMENT
is an organized record of
measurements arranged in columns and
rows.
Data Set
is the set/collection of all possible outcomes in an
experiment.
SAMPLE SPACE
is a collection of one or more outcomes of an
experiment.
Event
is a function that
associates a real number to each element in
the sample space. It is a variable whose
values are determined by chance.
Random Variable
It is a variable whose
values are determined by chance.
Random Variable
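A small sketch of a random variable as a function on the sample space. Tossing a fair coin twice and counting heads is an illustrative assumption, as are the names `number_of_heads` and `probabilities`.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a coin: HH, HT, TH, TT.
sample_space = list(product("HT", repeat=2))

def number_of_heads(outcome):
    """Random variable: associates a real number with each outcome."""
    return outcome.count("H")

# Count how many outcomes map to each value of the random variable.
counts = Counter(number_of_heads(outcome) for outcome in sample_space)

# Assuming a fair coin, each of the four outcomes is equally likely.
probabilities = {
    value: Fraction(n, len(sample_space)) for value, n in counts.items()
}

for value in sorted(probabilities):
    print(value, probabilities[value])
```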
EVERY UNIT HAS A ‘CHANCE’ OF BEING SELECTED,
AND THAT CHANCE CAN BE QUANTIFIED.
PROBABILITY SAMPLING
NOT EVERY ITEM IN A POPULATION HAS AN
EQUAL CHANCE OF BEING SELECTED.
NON-PROBABILITY SAMPLING
INVOLVES THE SELECTION OF A
SAMPLE FROM A POPULATION, BASED ON THE PRINCIPLE
OF RANDOMIZATION OR CHANCE.
PROBABILITY SAMPLING
IS MORE COMPLEX, MORE
TIME-CONSUMING AND USUALLY MORE COSTLY THAN
NON-PROBABILITY SAMPLING
PROBABILITY SAMPLING
TO PREVENT THE POSSIBILITY OF BIAS OR ERRONEOUS INFERENCE,
RANDOM SAMPLING IS COMMONLY RECOMMENDED.
SIMPLE RANDOM SAMPLING
UNDER THE CONCEPT OF RANDOMNESS, EACH MEMBER OF THE
POPULATION HAS AN EQUAL CHANCE TO BE INCLUDED IN THE SAMPLE
GATHERED.
SIMPLE RANDOM SAMPLING
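A minimal sketch of simple random sampling using Python's standard `random.sample`, which draws without replacement so that every member has the same chance of being included. The population of 100 numbered members, the sample size of 10, and the seed are assumptions for illustration.

```python
import random

# Hypothetical population: members numbered 1 to 100.
population = list(range(1, 101))

rng = random.Random(42)  # fixed seed so the draw can be repeated

# Draw 10 distinct members; each member is equally likely to be chosen.
sample = rng.sample(population, 10)

print(sample)
```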
THE ITEMS OR INDIVIDUALS ARE ARRANGED IN SOME WAY, ALPHABETICALLY OR IN SOME OTHER ORDER.
SYSTEMATIC RANDOM SAMPLING
A RANDOM STARTING POINT IS SELECTED, AND THEN EVERY k-TH MEMBER AFTER IT IS ADDED TO THE SAMPLE.
SYSTEMATIC RANDOM SAMPLING
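The two steps on these cards (arrange the items, then take every k-th member from a random start) can be sketched as follows. The function name `systematic_sample`, the population of 100 numbered members, and k = 10 are assumptions for illustration.

```python
import random

def systematic_sample(population, k, rng=None):
    """Pick a random start among the first k items, then every k-th item after it."""
    if rng is None:
        rng = random.Random()
    start = rng.randrange(k)  # random starting index from 0 to k - 1
    return population[start::k]

# Hypothetical population, already arranged in order.
population = list(range(1, 101))

sample = systematic_sample(population, k=10, rng=random.Random(1))
print(sample)
```

With 100 members and k = 10, the sample always has 10 members spaced exactly 10 positions apart; only the starting point is left to chance.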
A population is first divided into homogeneous subsets called strata.
Stratified random sampling
A population is first divided into homogeneous subsets called what?
strata
The strata are as internally homogeneous as possible, and at the same time each stratum is as different from the others as possible.
Stratified random sampling
In stratified random sampling, the strata are as internally homogeneous as possible, and at the same time each _______________ is as different from the others as possible.
stratum
Samples are selected proportionally from each stratum which can be done through simple or systematic random sampling
Stratified random sampling
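A sketch of proportional selection from each stratum, using simple random sampling inside each one. The grade-level strata, their sizes, and the function name `stratified_sample` are hypothetical; the rounding step is a simplification that works cleanly here but may not sum to the requested size for other proportions.

```python
import random

def stratified_sample(strata, sample_size, rng):
    """Draw from each stratum in proportion to its share of the population."""
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; rounding is a simplification.
        share = round(sample_size * len(members) / total)
        sample[name] = rng.sample(members, share)
    return sample

# Hypothetical strata: students grouped by grade level (50, 30, and 20 members).
strata = {
    "Grade 7": [f"G7-{i}" for i in range(50)],
    "Grade 8": [f"G8-{i}" for i in range(30)],
    "Grade 9": [f"G9-{i}" for i in range(20)],
}

picked = stratified_sample(strata, sample_size=10, rng=random.Random(7))
for name, members in picked.items():
    print(name, len(members))
```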
Can be done by subdividing the population into smaller units and then selecting at random only some primary units, on which the study is then concentrated.
Cluster sampling
is sometimes referred to as “area sampling” because it is frequently applied on a geographical basis.
Cluster sampling
Cluster sampling is sometimes referred to as “__________” because it is frequently applied on a geographical basis.
area sampling
In general, we can get more precise results under cluster sampling when each cluster contains as varied a mixture as possible and, at the same time, the clusters are as nearly alike one another as possible.
Cluster sampling
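A minimal sketch of one-stage cluster sampling: the population is already split into areas, a few areas are chosen at random, and everyone in the chosen areas is studied. The area names, household labels, and the function name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Choose whole clusters at random; every member of a chosen cluster is kept."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical population subdivided into five areas of four households each.
clusters = {
    f"Area {letter}": [f"{letter}-household-{i}" for i in range(1, 5)]
    for letter in "ABCDE"
}

selected = cluster_sample(clusters, n_clusters=2, rng=random.Random(3))
for name, members in selected.items():
    print(name, members)
```

Unlike stratified sampling, only some of the groups are visited, which is why the cards ask for each cluster to be a varied mixture of the population.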