Introduction to Statistics Flashcards
statistics is derived from the Latin word
status
status means
state
its early uses involved compilation of data and graphs describing various aspects of the state or country
statistics
actual
numbers derived from data
statistics
method of analyzing and
interpreting data.
statistics
is a science which deals with the collection, presentation, analysis,
and interpretation of quantitative data
statistics
a collection of quantitative
data, such as statistics of crimes, statistics on enrollment, statistics on unemployment, and the like.
statistics
example of use of statistics
surveys
consumer preference
experiments
sampling
economics
It deals with the methods of organizing, summarizing and presenting a mass of data so as
to yield meaningful information.
descriptive statistics
It deals with making generalizations about a body of data where only a part of it is
examined. This comprises methods concerned with the analysis of a subset of data leading
to predictions or inferences about the entire set of data
inferential statistics
is the set of all individuals or entities under consideration or study. It may be
a finite or infinite collection of objects, events, or individuals, with specified class or
characteristics under consideration.
population
is a characteristic of interest measurable on each and every individual in the
population, denoted by any capital letter in the English alphabet.
variable
types of variable
qualitative
quantitative
consists of categories or attributes, which have non-numerical
characteristics.
qualitative variable
consists of numbers representing counts or measurements
quantitative variable
classification of quantitative variable
discrete
continuous
results from either a finite number of possible values or a
countable number of possible values
discrete
results from infinitely many possible values that can
be associated with points on a continuous scale in such a way that there are no gaps or
interruptions
continuous quantitative variable
is part of the population or a sub-collection of elements drawn from a population
sample
is a numerical measurement describing some characteristic of a population
parameter
is a numerical measurement describing some characteristic of a sample
statistic
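A minimal sketch of the parameter/statistic distinction from the cards above. The allowance figures and variable names are made up for illustration: the mean over every element is a parameter, while the mean over a drawn subset is a statistic.

```python
import random
import statistics

# Hypothetical population: weekly allowance (in pesos) of all 10 students in a class.
population = [500, 650, 700, 800, 550, 900, 750, 600, 850, 700]

# Parameter: a numerical measurement describing the whole population.
population_mean = statistics.mean(population)

# Sample: a sub-collection of elements drawn from the population.
random.seed(1)  # fixed seed so the sketch is repeatable
sample = random.sample(population, 4)

# Statistic: a numerical measurement describing the sample.
sample_mean = statistics.mean(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

The statistic changes from sample to sample, while the parameter stays fixed for a given population.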
is often conducted to gather opinions or feedback about a variety of topics
survey
most often simply referred to as census, is conducted by gathering
information from the entire population
census survey
most often simply referred to as survey, is conducted by gathering
information only from part of the population
sampling survey
classify the following quantitative variables
number of students
number of books
number of patients
age
monthly income
money
discrete quantitative
classify the following quantitative variables
height
weight
gwa
time
length
continuous quantitative variable
four levels of measurement
nominal
ordinal
interval
ratio
is characterized by data that consist of names, labels, or categories only
nominal
- The data cannot be arranged in an ordering scheme.
- Considered to be the weakest level, as it uses numbers and symbols to classify objects, persons, or characteristics.
- This level or scale is classificatory in nature.
nominal
level of measurement
ex: name, religion, civil status, address, sex, degree program
nominal
involves data that may be arranged in some order, but differences between data values
either cannot be determined or are meaningless.
ordinal
This scale, although categorical in nature, shows
difference or some kind of relation between categories (greater than or less than).
ordinal
level of measurement
military rank
job position
year level
ordinal
is like the ordinal level, with the additional property that meaningful amounts of
differences between data can be determined
interval
there is no inherent (natural) zero starting point.
interval
level of measurement
IQ score
temperature (C and F)
dates
interval
is the interval level modified to include the inherent zero starting point. For values at this
level, differences and ratios are meaningful.
Ratio
level of measurement
height, area, width, weekly allowance, absolute zero (kelvin)
ratio
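A small sketch of why the zero point matters when moving from the interval level to the ratio level. The two temperature readings are hypothetical; the Celsius-to-kelvin offset of 273.15 is the standard conversion.

```python
# Hypothetical temperature readings on the interval-level Celsius scale.
c_low, c_high = 10.0, 20.0

# Differences are meaningful at the interval level.
difference = c_high - c_low

# This quotient can be computed, but it is NOT meaningful:
# 0 degrees Celsius is not an inherent zero starting point.
naive_ratio = c_high / c_low

# Kelvin starts at absolute zero (ratio level), so ratios are meaningful.
k_low, k_high = c_low + 273.15, c_high + 273.15
true_ratio = k_high / k_low

print("difference in degrees:", difference)
print("Celsius quotient (not meaningful):", naive_ratio)
print("kelvin ratio (meaningful):", round(true_ratio, 3))
```

The difference is the same on both scales, but only the kelvin ratio describes how the two temperatures actually compare.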
what level of measurement is applied for qualitative variables
nominal
ordinal
level of measurement for quantitative variables
interval
ratio
Excel, JMP, SPSS, Minitab
are what
Statistical Software
classifying variables by type (2)
numerical
categorical
data that is derived from counting process (how many?)
discrete
data that is derived from a measuring process (how much?)
continuous
numbers from a population
parameter
all of the data that is collected in a particular study
data set
the entity on which the data are collected
elements
the characteristic from each element that we are studying
variable
each individual measurement
observation
data collected at one point in time
cross sectional data
data collected over several time periods
time series data
data that already exist in some form
secondary data
data that you collect for your use
primary data
data that exist within your corporation or organization
internal secondary data
data that exists outside your corporation or organization
external secondary data
process of capturing, storing, and maintaining data
data warehousing
system to extract information and uncover patterns
data mining
the process of using statistics to draw conclusions about population parameters
statistical inference
Determine whether the following statements use the area of descriptive statistics or
statistical inference.
A bowler wants to find his bowling average for the past 12 games.
descriptive
Determine whether the following statements use the area of descriptive statistics or
statistical inference.
A manager would like to predict, based on previous years' sales, the sales performance of
a company for the next five years.
inferential
Determine whether the following statements use the area of descriptive statistics or
statistical inference.
A politician would like to estimate, based on an opinion poll, his chance for winning in the
upcoming senatorial election.
inferential
Determine whether the following statements use the area of descriptive statistics or
statistical inference.
A teacher wishes to determine the percentage of students who passed the examination.
descriptive
Determine whether the following statements use the area of descriptive statistics or
statistical inference.
A student wishes to determine the average monthly expenditures on school supplies for the
past five months.
descriptive
Determine whether the following statements use the area of descriptive statistics or
statistical inference.
A basketball player wants to estimate his chance of winning the most valuable player
(MVP) award based on his current season averages and the averages of his opponents.
inferential
Classify the following statements as belonging to the area of descriptive statistics or
statistical inference. Write DS for descriptive statistics and write IS for inferential statistics on
the space provided.
- Recording the number of persons infected with COVID 19 in a certain barangay.
- If the present trend continues, the number of people infected with COVID 19 will drop dramatically within a month.
- In a certain city, arsonists deliberately set 3% of all fires reported last year.
- Records show that cases of dengue have decreased in the last 5 years.
- As a result of a recent poll, most Filipinos are in favor of electronic voting.
DS
IS
DS
IS
DS
Classify the following statements as belonging to the area of descriptive statistics or
statistical inference. Write DS for descriptive statistics and write IS for inferential statistics on
the space provided.
- The Philippines' Gross Domestic Product (GDP) grew by 5.9% in 2019, 0.3 percentage points lower than its 6.2% performance in 2018.
- The average grade of Annalyn, a BS Biology student, in her 8 subjects is 3.57.
- In the United States, it was predicted from the current trend that deaths due to COVID 19 will reach 15000 by the end of June 2020.
- All four provinces of ARMM are among the 10 poorest provinces in the Philippines for 2002.
- Data show that the number of enrollees in private institutions will decrease by 15% next school year.
DS
DS
IS
DS
IS
Identify the population, variable of interest, and type of variable in the following:
- The dean of CSCS would like to determine the average weekly allowance of BS Computer Science students.
- The registrar of DLSU-D would like to conduct a survey on the preferred courses of grade 12 students in Cavite.
- The dean of a certain college would like to know the number of students who are smoking.
- A survey by a group of students entitled "Dress Code" will be conducted among first-year students to determine the fashion preferences of these students.
- A group of researchers would like to know the number of deaths due to COVID 19 in all countries in Southeast Asia.
DQV
Quali
DQV
Quali
DQV
Identify the population, variable of interest, and type of variable in the following:
6. From all students registered this semester, the Mathematics and Statistics Department
would like to know how many students like Statistics.
7. A study to be conducted by an NGO would determine Filipinos' awareness of the
spread of COVID in the Philippines.
8. A group of students taking Statistics conducted a study on the effect of distance learning on
the academic performance of the students.
9. Some parents would like to determine whether Mobile Legends is good or bad for the
behavior of their children.
10. The head librarian would like to identify the e-book/s commonly read by DLSU-D
students.
DQV
Quali
Quali
Quali
Quali
Quali
Identify each statement as having discrete or continuous data. Write D for discrete and
C for continuous.
1. Among the 855 deaths due to COVID19, 349 are female.
2. A student spent on the average 3.54 hours per day studying his/her lesson.
3. Yesterday's records show that there is an increase of 955 new cases of COVID19 infection.
4. A COVID 19 patient can recover from sickness in about 12-16 days.
5. Upon completion of a diet and exercise program, Elmer weighed 12.37 lbs. less than when he
started the program.
D
C
D
C
C
Identify which of the following quantitative data would be presented by a discrete
variable or a continuous variable. Write DV for discrete variable and write CV for continuous
variable.
1. Number of pairs of pants
2. Time (in minutes) to finish a 4-km marathon.
3. Circumference (in inches) of coconut trunk
4. Length (in cm) of harvested bamboo
5. Dimension (LWT) of the newest brand of cellular phone
DV
CV
CV
CV
CV
Identify which of the following quantitative data would be presented by a discrete
variable or a continuous variable. Write DV for discrete variable and write CV for continuous
variable.
6. Percentage increase in enrolment this year
7. Number of COVID 19 infections of 50 countries
8. Monthly income of 100 randomly selected persons at KADIWA Market
9. Number of heads when a coin is tossed 25 times
10. Age in years of COVID 19 patients.
CV
DV
DV
DV
DV
At what level are the following variables measured? Write nominal, ordinal, interval
or ratio on the space provided.
- Student number
- Emotional quotient of teachers
- Telephone number
- Species of orchid plants
- Final course grades of 0.0, 1.0, 1.25, 1.50, ...
N
I
N
N
I
At what level are the following variables measured? Write nominal, ordinal, interval
or ratio on the space provided.
- LEVEL OF COMPLIANCE, such as always, usually, frequently, sometimes, never
- Intelligence quotients of 50 selected students in CSU.
- Lengths of TV commercials (in seconds)
- The years 1896, 2000, 1776, 1995
- Attitude toward gun laws such as favorable, somewhat favorable, somewhat unfavorable
O
I
R
N
O
At what level are the following variables measured? Write nominal, ordinal, interval
or ratio on the space provided.
- Zip codes
- Board exam rating
- Harvest of rice in kilograms per hectare
- Candidate voted for in 2019 senatorial elections
- Tax Identification Number
N
R
R
N
N
- Altitude of mountains liters of gasoline consumed day
- Rate of success in the entrance exam
- Systolic Blood pressure
- Height of students
- Number of COVID 19 infections per day
R
R
I
R
R
- Number of won cases in court
- Academic rank in High School
- Savings Account Number
- Are you a Pag-Ibig Member? (Yes/No)
- Number of books sold per day
R
O
N
N
R
- Weekly expenses in internet subscription of CvSU students
- Main source of income
- Birth order in the family
- Number of organizations involved in
- Car plate number
R
N
O
R
N
a narrative description of the data gathered.
textual method
a systematic arrangement of information into columns and rows
tabular method
an illustrative description of the data
graphical method
is a statistical table showing the frequency or number of observations contained in each of the defined classes or categories.
frequency distribution table
Parts of a Statistical Table (4)
table heading
body
stubs or classes
caption
includes the table number and the title of the table.
table heading
main part of the table that contains the information or figures.
body
classification or categories describing the data and usually found at the left most side of the table.
stubs or classes
designations or identifications of the information contained in a column, usually found at the topmost of the column.
caption
is a frequency distribution table where the data are grouped according to some qualitative characteristics; data are grouped into non numerical categories.
qualitative or categorical FDT
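A minimal sketch of a qualitative FDT, assuming a made-up list of civil-status responses. The categories play the role of the stubs and the counts form the body of the table.

```python
from collections import Counter

# Hypothetical raw data: civil status of 8 respondents (a qualitative variable).
responses = ["single", "married", "single", "widowed",
             "married", "single", "single", "married"]

# Count the number of observations falling in each category.
frequency = Counter(responses)

# Print the table: a caption row, then one row per category.
print(f"{'Civil status':<14}{'Frequency':>9}")
for category, count in frequency.most_common():
    print(f"{category:<14}{count:>9}")
print(f"{'Total':<14}{sum(frequency.values()):>9}")
```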
frequency distribution table where the data are grouped according to some numerical or quantitative characteristics.
quantitative FDT
range formula
R = highest value − lowest value
number of classes (K) formula
K = √N, where N is the total number of observations in the data set.
class size is determined by first computing what
preliminary class size
preliminary class size formula
c' = R/K
conditions of the actual class sizes are (2)
a. It should have the same number of decimal places as in the raw data.
b. It should be odd in the last digit.
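The range, number of classes, and class size cards above can be worked through on a small data set. This is a sketch under stated assumptions: the 25 scores are made up, K is rounded to the nearest whole number, and the preliminary size is rounded up to the next odd whole number. That rounding choice is an assumption for illustration; the cards only state the two conditions.

```python
import math

# Hypothetical raw data: 25 exam scores recorded as whole numbers.
scores = [52, 55, 58, 60, 61, 63, 65, 66, 68, 70, 71, 72, 74,
          75, 76, 78, 80, 81, 83, 85, 87, 88, 90, 93, 96]

N = len(scores)

# Range: highest value minus lowest value.
R = max(scores) - min(scores)

# Number of classes: square root of N (rounded here, an assumption for non-square N).
K = round(math.sqrt(N))

# Preliminary class size.
c_prelim = R / K

# Actual class size: whole number (same decimal places as the raw data)
# and odd in the last digit. Rounding UP is an assumed rule.
c = math.ceil(c_prelim)
if c % 2 == 0:
    c += 1

print("N =", N, "R =", R, "K =", K)
print("preliminary class size =", c_prelim, "actual class size =", c)
```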
other columns in FDT (5)
True class boundaries
Class Mark
Relative frequency
Cumulative frequency
Relative Cumulative frequency
TCB (true class boundaries) include (2)
Lower True Class Boundaries (LTCB)
Upper True Class Boundaries (UTCB)
LTCB formula
LTCB = LL − 1/2 unit of measure
UTCB formula
UTCB = UL + 1/2 unit of measure
midpoint of the class interval where the observations tend to cluster about.
class mark
class mark formula
CM = 1/2 (LL + UL) or CM = 1/2 (LTCB + UTCB)
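A short sketch applying the true class boundary and class mark formulas, where LL and UL are the lower and upper class limits. The class limits below are hypothetical and the data are assumed to be whole numbers, so the unit of measure is 1.

```python
# Hypothetical class limits (LL, UL) for whole-number data.
classes = [(52, 60), (61, 69), (70, 78), (79, 87), (88, 96)]
unit = 1  # unit of measure: data recorded to the nearest whole number

rows = []
for ll, ul in classes:
    ltcb = ll - unit / 2   # lower true class boundary
    utcb = ul + unit / 2   # upper true class boundary
    cm = (ll + ul) / 2     # class mark from the class limits
    rows.append((ltcb, utcb, cm))
    print(f"{ll}-{ul}: LTCB={ltcb}, UTCB={utcb}, CM={cm}")
```

Both class mark formulas agree, since the half units added and subtracted at the boundaries cancel out.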
the proportion of observations falling in a class, often expressed as a percentage.
relative frequency
RF formula
RF = frequency / N
%RF formula
%RF = (frequency / N) × 100
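A minimal sketch of the relative frequency columns, assuming a made-up list of class frequencies. N is the total number of observations.

```python
# Hypothetical class frequencies for five classes.
frequencies = [4, 5, 7, 5, 4]
N = sum(frequencies)

# Relative frequency and percentage relative frequency per class.
rf = [f / N for f in frequencies]
pct_rf = [f / N * 100 for f in frequencies]

print(f"{'f':>3}{'RF':>7}{'%RF':>7}")
for f, r, p in zip(frequencies, rf, pct_rf):
    print(f"{f:>3}{r:>7.2f}{p:>7.1f}")
```

The relative frequencies add up to 1, and the percentages to 100, which is a quick check on the column.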
accumulated frequency of the classes.
cumulative frequency
total number of observations whose values do not exceed the upper limit of the class.
less than CF (<CF)
total number of observations whose values are not less than the lower limit of the class
greater than CF (>CF)
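A small sketch of the two cumulative frequency columns, assuming made-up class frequencies listed from the lowest class to the highest.

```python
from itertools import accumulate

# Hypothetical class frequencies, ordered from the lowest class to the highest.
frequencies = [4, 5, 7, 5, 4]

# <CF: running total from the first class down the table.
less_cf = list(accumulate(frequencies))

# >CF: running total from the last class up the table.
greater_cf = list(accumulate(reversed(frequencies)))[::-1]

print(f"{'f':>3}{'<CF':>6}{'>CF':>6}")
for f, lcf, gcf in zip(frequencies, less_cf, greater_cf):
    print(f"{f:>3}{lcf:>6}{gcf:>6}")
```

The last <CF entry and the first >CF entry both equal the total number of observations.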
is a device for showing numerical values or relationships in pictorial form
graph/chart
advantages of a graph/chart
main features and implications of a body of data can be seen at once
can attract attention and hold the reader's interest
simplifies concepts that would otherwise have been expressed in so many words
can readily clarify data, frequently bring out hidden facts and relationships.
Qualities of a Good Graph:
It is accurate.
It is clear.
It is simple.
It has a good appearance.
Common Types of Graph
line chart
scatter graph
pie chart
column and bar graph
graphical presentation of data especially useful for showing trends over a period of time.
line chart
Is a graph used to present measurements or values that are thought to be related.
scatter graph
a circular graph that is useful in showing how a total quantity is distributed among a group of categories. The "pieces of pie" represent the proportions of the total that fall into each category.
pie chart
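A minimal sketch of how the pieces of a pie chart follow from the data, assuming made-up category counts. Each category's proportion of the total is turned into a central angle out of the 360 degrees of the circle.

```python
# Hypothetical category counts (for example, civil status of 8 respondents).
counts = {"single": 4, "married": 3, "widowed": 1}
total = sum(counts.values())

# Proportion of the total and the matching central angle for each slice.
slices = {}
for category, count in counts.items():
    proportion = count / total
    slices[category] = (proportion, proportion * 360)

for category, (proportion, angle) in slices.items():
    print(f"{category:<8}{proportion:>7.1%}{angle:>8.1f} degrees")
```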
like pie charts, column charts and bar charts are applicable only to grouped data. They should be used for discrete, grouped data of ordinal or nominal scale
column and bar graph
a bar graph that displays the classes on the horizontal axis and the frequencies of the classes on the vertical axis.
frequency histogram
a line chart that is constructed by plotting the frequencies at the class marks and connecting the plotted points by means of straight lines.
frequency polygon
graphs of the cumulative frequency distribution
ogives
the <CF is plotted against the UTCB
<Ogive
the >CF is plotted against the LTCB
> ogive
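A sketch of the points that would be plotted for the two ogives, assuming made-up true class boundaries and frequencies. No plotting library is used; the lists hold the (x, y) pairs that a line chart would connect.

```python
from itertools import accumulate

# Hypothetical grouped data: (LTCB, UTCB) for each class, and the class frequencies.
boundaries = [(51.5, 60.5), (60.5, 69.5), (69.5, 78.5), (78.5, 87.5), (87.5, 96.5)]
frequencies = [4, 5, 7, 5, 4]

less_cf = list(accumulate(frequencies))
greater_cf = list(accumulate(reversed(frequencies)))[::-1]

# < ogive: <CF plotted against the UTCB of each class.
less_ogive = [(utcb, cf) for (_, utcb), cf in zip(boundaries, less_cf)]

# > ogive: >CF plotted against the LTCB of each class.
greater_ogive = [(ltcb, cf) for (ltcb, _), cf in zip(boundaries, greater_cf)]

print("< ogive points:", less_ogive)
print("> ogive points:", greater_ogive)
```

The < ogive rises from left to right and the > ogive falls, since one accumulates from the lowest class and the other from the highest.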
study how to make FDT