Data Management pt. 1 Flashcards
2 types of statistics
Descriptive and inferential
Collection, organization, summary, and presentation of data. Beginning
e.g., measures of location, measures of variability, skewness, and kurtosis
Descriptive
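A minimal sketch of the descriptive measures named above, using Python's standard `statistics` module. The `scores` list is hypothetical data made up for illustration, and the moment-based skewness formula is one common convention among several.

```python
import statistics

# Hypothetical sample of quiz scores (illustration only, not from the flashcards)
scores = [7, 8, 8, 9, 10, 12, 16]
n = len(scores)

mean = statistics.mean(scores)       # measure of location
median = statistics.median(scores)   # measure of location
std_dev = statistics.pstdev(scores)  # measure of variability (population standard deviation)

# Moment-based skewness: third central moment divided by (second central moment)^1.5
m2 = sum((x - mean) ** 2 for x in scores) / n
m3 = sum((x - mean) ** 3 for x in scores) / n
skewness = m3 / m2 ** 1.5            # positive here: the long tail is on the right

print(mean, median, round(std_dev, 3), round(skewness, 3))
```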
interpretation and analysis of data.
a conclusion about the population is drawn based on a subset (sample) of it
e.g., hypothesis testing and regression analysis
Inferential
the characteristic that is being studied
varies across individuals or objects
Variables
data that can assume values that manifest the concept of attributes
AKA categorical data
Cannot be measured
Qualitative Variables
finite number of possible values
CAN be counted but CANNOT be measured
Whole numbers
Discrete Variables
data from counting or measuring
numerical data representing numerical value
Quantitative Variables
infinite number of possible values, can take any value within a given range
CAN be measured but CANNOT be counted
Continuous Variables
Levels of Measurement
Nominal, ordinal, ratio and interval (NORI)
used to label or classify variables using letters, words, and alphanumeric symbols. No particular order.
Nominal
represents discrete and ordered units, follows a natural order
Ordinal
tells the distances between measurements in addition to the classification and ordering
no true zero point
Interval
most informative as it combines the first three levels,
ordered units that have the same difference, plus a true zero point
Ratio
Examples of ratio
kelvin, height, weight, length, and time/duration
steps in statistical inquiry or investigation
Defining the problem
Collection/ gathering of info or data
Organization/presentation of data
Interpretation of data
2 types of sampling methods
Probability and Non-Probability
equal chance of getting selected, includes entire population
lottery, fishbowl method, and table of random numbers
Simple Random Sampling
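A minimal sketch of simple random sampling with Python's standard `random` module. The population of ID numbers and the sample size are hypothetical; the fixed seed is only there so the sketch is repeatable.

```python
import random

# Hypothetical population: ID numbers 1..50 (not from the flashcards)
population = list(range(1, 51))

rng = random.Random(42)               # fixed seed only so the sketch is repeatable
sample = rng.sample(population, k=5)  # 5 distinct members, each equally likely to be drawn

print(sample)
```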
everyone is assigned a number and individuals are chosen at regular intervals.
Systematic Sampling
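A minimal sketch of systematic sampling, assuming a hypothetical numbered population of 100 and a desired sample of 10. The random starting point is a common convention, not something the flashcard states.

```python
import random

# Hypothetical population: 100 individuals, each assigned a number
population = list(range(1, 101))
sample_size = 10

k = len(population) // sample_size  # sampling interval: pick every k-th individual
rng = random.Random(0)              # fixed seed only so the sketch is repeatable
start = rng.randrange(k)            # random starting index within the first interval

sample = population[start::k][:sample_size]
print(sample)
```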
population -> subgroups (strata) based on a relevant characteristic; members are then randomly sampled from each stratum, so not all members are included
(e.g. age, income, job role, gender …)
Stratified Random Sampling
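A minimal sketch of stratified random sampling, assuming two hypothetical strata ("junior" and "senior") and a 10% sample drawn from each. The names, sizes, and the proportional allocation are all assumptions for illustration.

```python
import random

# Hypothetical strata: the population split by job role
strata = {
    "junior": list(range(0, 60)),    # 60 members
    "senior": list(range(60, 100)),  # 40 members
}
fraction = 0.10                      # assumed sampling fraction per stratum

rng = random.Random(1)               # fixed seed only so the sketch is repeatable
sample = {
    name: rng.sample(members, k=round(len(members) * fraction))
    for name, members in strata.items()
}

print({name: len(chosen) for name, chosen in sample.items()})
```

Every stratum is represented, but only some members of each stratum are selected.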
area sampling
population -> subgroups (clusters); whole clusters are randomly selected and all members of the selected clusters are included
Cluster Sampling
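A minimal sketch of cluster sampling, assuming hypothetical class sections as clusters. Whole clusters are drawn at random, and every member of a chosen cluster enters the sample.

```python
import random

# Hypothetical clusters: class sections and their members
clusters = {
    "Section A": ["A1", "A2", "A3"],
    "Section B": ["B1", "B2", "B3", "B4"],
    "Section C": ["C1", "C2"],
    "Section D": ["D1", "D2", "D3"],
}

rng = random.Random(7)                      # fixed seed only so the sketch is repeatable
chosen = rng.sample(sorted(clusters), k=2)  # randomly pick 2 whole clusters

# All members of the chosen clusters are included
sample = [member for name in chosen for member in clusters[name]]
print(chosen, sample)
```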
most accessible individuals to the researcher
Convenience sampling
can be biased as some people are more likely to volunteer than others
Voluntary Response Sampling
individuals are handpicked by the researcher, deemed most useful for the research
Purposive Sampling
finding respondents through recommendations from previous participants
Snowball Sampling
2 types of sources
Primary and secondary sources
raw, first-hand evidence
e.g., interviews, statistical data, and artworks
Primary
second-hand information and commentary from other researchers
Secondary sources
Data collection techniques
Interviews
Projective technique
Delphi Technique
Focus Groups
Questionnaires
Mnemonic: PIDeFoQue (Projective, Interviews, Delphi, Focus Groups, Questionnaires)
researchers ask questions through direct interviews or by means of mass communication
Interviews
indirect interview; respondents know why they’re being asked, but the question is incomplete, to be filled in with their opinions, feelings, and attitudes
Projective Technique
each expert answers based on their field of specialization, then their responses are consolidated into one opinion.
Delphi Technique
6-12 people with a moderator discussing one topic / issue
Focus Groups
series of questions, either open-ended or closed-ended, related to the matter at hand.
Questionnaires
Presentation of Data
Textual, Tabular, Graphic
in narrative/ paragraph form
combines text and figures in a statistic
Textual Presentation
data is arranged in tables, allowing a more comprehensible comparison of figures or reports
Tabular Presentation
presented in visual or pictorial form
clear view of relationships through pics and colored maps
Graphic Presentation
Types of Graphic Presentation
Line Graph, Bar Graph, Circle Graph/ Pie Chart, and Pictograph/Pictogram
shows a trend over a period
Line Graph
for comparison of simple magnitude
can be horizontal or vertical
Bar Graph
circle divided into parts, sizes are proportional to the magnitude/ percentages they represent
shows component parts of a whole
Circle Graph / Pie Chart
makes use of pictorial symbols to indicate data with a legend
Pictograph / Pictogram
tabular arrangement of data with its classification / grouping according to magnitude / size
Frequency Distribution
number of given values (observations)
n
end numbers of a class
highest and lowest value
Class Limit
Total number of classes
Sturges’ formula: K = 1 + 3.322 log n (Slovin’s formula, n = N / (1 + Ne^2), gives a sample size instead, not the number of classes)
Number of Classes = K
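A minimal sketch of Sturges' formula, K = 1 + 3.322 log10(n), for the number of classes. The function name `sturges_classes` is hypothetical, and rounding to the nearest whole number is an assumption (some textbooks round up instead).

```python
import math

def sturges_classes(n):
    """Suggested number of classes K for n observations (Sturges' formula)."""
    return round(1 + 3.322 * math.log10(n))

print(sturges_classes(50))   # 1 + 3.322 * 1.699 is about 6.64, rounded to 7
print(sturges_classes(100))  # 1 + 3.322 * 2 = 7.644, rounded to 8
```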
“true” class limits defined by upper and lower boundaries
the upper boundary of a class (= lower boundary of the next class) can be determined by the average of the upper limit of the class and the lower limit of the next class
For whole-number limits: add 0.5 to the upper limit and subtract 0.5 from the lower limit
Class Boundaries =CB = +/- 0.5
average of the lower and upper limits of each class
Class Mark = midpoint = x
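A minimal sketch of class boundaries and class marks, assuming hypothetical classes with whole-number limits (so the +/- 0.5 rule applies).

```python
# Hypothetical classes as (lower limit, upper limit) pairs
classes = [(10, 19), (20, 29), (30, 39), (40, 49)]

# Class boundaries: subtract 0.5 from the lower limit, add 0.5 to the upper limit
boundaries = [(ll - 0.5, ul + 0.5) for ll, ul in classes]

# Class mark (midpoint): average of the lower and upper limits
marks = [(ll + ul) / 2 for ll, ul in classes]

print(boundaries)  # [(9.5, 19.5), (19.5, 29.5), (29.5, 39.5), (39.5, 49.5)]
print(marks)       # [14.5, 24.5, 34.5, 44.5]
```

Note that 19.5 is also the average of 19 (upper limit of the first class) and 20 (lower limit of the next class).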
difference of upper and lower boundaries of each class.
affected by the nature of data and the number of classes
Class Interval = range = r = UL-LL
the width of each class interval
Class size = i = R / K (R = highest value - lowest value; the next lower limit = LL + i)
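A minimal sketch of the class size, assuming the common textbook convention that the width is the range (highest value minus lowest value) divided by the number of classes, rounded up. The data and K = 4 are hypothetical.

```python
import math

# Hypothetical raw data and an assumed number of classes
data = [10, 14, 18, 22, 25, 27, 31, 36, 38, 41, 45, 49]
K = 4

R = max(data) - min(data)      # range: 49 - 10 = 39
class_size = math.ceil(R / K)  # 39 / 4 = 9.75, rounded up to 10

# Lower limits of the classes: start at the lowest value, step by the class size
lower_limits = [min(data) + j * class_size for j in range(K)]

print(R, class_size, lower_limits)  # 39 10 [10, 20, 30, 40]
```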
adding frequencies cumulatively from the lowest class up to the highest
“Less Than” (<cf)
adding frequencies cumulatively from the highest class down to the lowest
“More Than” (>cf)
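A minimal sketch of the two cumulative frequencies, assuming the usual convention: the "less than" cf of a class counts the observations in that class and all lower classes, and the "more than" cf counts that class and all higher classes. The frequencies are hypothetical and listed from the lowest class to the highest.

```python
from itertools import accumulate

# Hypothetical frequencies for classes 10-19, 20-29, 30-39, 40-49
frequencies = [3, 7, 6, 4]

# "Less than" cf: running total starting from the lowest class
less_than_cf = list(accumulate(frequencies))

# "More than" cf: running total starting from the highest class
more_than_cf = list(accumulate(reversed(frequencies)))[::-1]

print(less_than_cf)  # [3, 10, 16, 20]
print(more_than_cf)  # [20, 17, 10, 4]
```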
Relative frequency = %RF
(f/n)(100) = %RF
Cumulative Percentage (%>cf or %<cf)
(cf / n)(100) = %cf, where cf is either >cf or <cf
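A minimal sketch of the relative frequency and cumulative percentage formulas, using hypothetical frequencies and the "less than" cumulative frequency as the cf.

```python
from itertools import accumulate

# Hypothetical frequencies for four classes, lowest class first
frequencies = [3, 7, 6, 4]
n = sum(frequencies)  # total number of observations: 20

# %RF = (f / n)(100); multiplying first keeps the arithmetic exact here
relative_freq = [f * 100 / n for f in frequencies]

# %cf = (cf / n)(100), using the "less than" cumulative frequency
less_than_cf = list(accumulate(frequencies))
cumulative_pct = [cf * 100 / n for cf in less_than_cf]

print(relative_freq)   # [15.0, 35.0, 30.0, 20.0]
print(cumulative_pct)  # [15.0, 50.0, 80.0, 100.0]
```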
When intervals are uniform, width of the bar must also be uniform
x axis = class boundary (CB)
y axis = frequency (F)
Histogram
a broken-line curve constructed by connecting the points plotted at each class midpoint
x axis = midpoint (x)
y axis = frequency (F)
Frequency Polygon
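A minimal sketch of the coordinates each graph uses, assuming hypothetical classes with whole-number limits. It only computes the values to plot; any plotting library could draw them.

```python
# Hypothetical classes (lower limit, upper limit) and their frequencies
classes = [(10, 19), (20, 29), (30, 39), (40, 49)]
frequencies = [3, 7, 6, 4]

# Histogram: each bar spans the class boundaries on the x axis, height = frequency
bars = [(ll - 0.5, ul + 0.5, f) for (ll, ul), f in zip(classes, frequencies)]

# Frequency polygon: one point per class at (midpoint, frequency), joined by lines
points = [((ll + ul) / 2, f) for (ll, ul), f in zip(classes, frequencies)]

print(bars)    # [(9.5, 19.5, 3), (19.5, 29.5, 7), (29.5, 39.5, 6), (39.5, 49.5, 4)]
print(points)  # [(14.5, 3), (24.5, 7), (34.5, 6), (44.5, 4)]
```

Because the intervals are uniform, every bar has the same width (10 here), and adjacent bars share a boundary so there are no gaps.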