QM - Midterms Flashcards
- Used to filter individuals from a population and create samples
Probability
o – random selection & most used
Simple random sampling
o – population -> groups (strata)
Stratified random sampling
o – main segment -> clusters (geographic segmentation), e.g. univ -> colleges
Cluster sampling
o – first element = random, following elements = fixed interval (N/n)
Systematic sampling
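The four probability methods above can be sketched in a few lines each. This is only an illustration: the student population, the strata and college names, and the proportional allocation in the stratified version are assumptions, not part of the cards.

```python
import random

def simple_random_sample(population, n, rng):
    """Every element has the same chance of being picked."""
    return rng.sample(population, n)

def stratified_sample(strata, n, rng):
    """Draw from every stratum in proportion to its size (assumed allocation rule)."""
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        share = round(n * len(members) / total)  # rounding may not sum to n exactly
        sample.extend(rng.sample(members, share))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep everyone inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

def systematic_sample(population, n, rng):
    """Random first element, then every k-th element, with k = N // n."""
    k = len(population) // n
    start = rng.randrange(k)
    return population[start::k][:n]

rng = random.Random(42)  # fixed seed so the sketch is repeatable
students = list(range(1, 101))  # hypothetical population, N = 100
strata = {"freshmen": students[:60], "seniors": students[60:]}
colleges = {"Engineering": students[:30], "Business": students[30:70], "Arts": students[70:]}

print(simple_random_sample(students, 10, rng))
print(stratified_sample(strata, 10, rng))
print(cluster_sample(colleges, 1, rng))
print(systematic_sample(students, 10, rng))
```

With N = 100 and n = 10 the systematic interval is 10, so the picked IDs are always exactly 10 apart.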
WHEN TO USE PROBABILITY SAMPLING
o You want to reduce the sampling bias
o The population is usually diverse
o To create an accurate sample
Advantages of Probability Sampling
o It’s cost-effective
o It’s simple and straightforward
o It’s non-technical
o Can avoid sampling errors (unexpected results)
- Not all members of the population have an equal chance of being chosen
Non-probability
o – elements chosen = proximity to the researcher, quick and easy implementation
Convenience sampling
o – same as convenience sampling, but the researcher chooses 1 element / sample group
Consecutive sampling
o – elements chosen using knowledge of traits & personalities -> strata
Quota sampling
o – used when audience = rare
Snowball sampling
o – samples chosen based on the researcher's experience & skills
Judgmental sampling
WHEN TO USE NON-PROBABILITY
o A particular trait/characteristic exists in the population
o Aim at conducting qualitative research, pilot studies or exploratory research
o Have limited time to conduct research or budget constraints
o In-depth analysis is needed
o Get specific results
Advantages of Non-probability sampling
o More conducive and practical
o Faster and more cost-effective
o – research method used for collecting data
Survey
- – data -> same categories
Nominal
- – measures variables in a ranked order; meaningful insights from the order of responses
Ordinal
- – measures variables w/ equal intervals between values, e.g. temperature & time
Interval
- – comparison in ratios, %, averages
o Great for research in fields like science, engineering, and finance, where you need to use ratios, percentages, and averages to understand the data.
Ratio
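A rough sketch of which summary makes sense at each of the four levels of measurement. All the data below (blood types, rating scale, temperatures, weights) are made-up examples, not from the cards.

```python
import statistics

# Nominal: categories only, so counting and the mode are the meaningful summaries.
blood_types = ["A", "O", "B", "O", "AB", "O"]
nominal_mode = statistics.mode(blood_types)

# Ordinal: categories with a rank order, so the middle rank (median) is meaningful.
scale = ["poor", "fair", "good", "excellent"]  # hypothetical rating scale, lowest to highest
ratings = ["good", "fair", "excellent", "good", "poor"]
ranks = sorted(scale.index(r) for r in ratings)
ordinal_median = scale[ranks[len(ranks) // 2]]

# Interval: equal spacing but no true zero, so differences work and ratios do not
# (20 degrees Celsius is not "twice as hot" as 10 degrees Celsius).
temp_morning_c, temp_noon_c = 10.0, 20.0
interval_difference = temp_noon_c - temp_morning_c

# Ratio: has a true zero, so "twice as much" is a valid statement.
weight_a_kg, weight_b_kg = 10.0, 20.0
ratio = weight_b_kg / weight_a_kg

print(nominal_mode, ordinal_median, interval_difference, ratio)
```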
- – effective survey dist., widely used
- – responses are much higher using this
Buy respondents
- – ^ responses = close proximity to the brand
Embed survey on a website
- : social media -> survey aids = ^ responses
Social distribution
- : store the URL for the survey. Print & publish
QR code
- : quick and time-effective way to collect a high number of responses.
SMS survey
- is a market research term
o number of individuals included in conducting research.
Sample size
- is the process of choosing the right number of observations or people from a larger group to use in a sample.
Sample size determination
– how many people fit your demographic, total number
Population Size
- The ________ tells you how sure you can be that your data is accurate. Expressed as a percentage (e.g. 95%).
confidence level
- tell you how far off from the population mean you’re willing to allow your data to fall.
Confidence intervals
- A ________ describes how close you can reasonably expect a survey result to fall relative to the real population value.
margin of error
- is the measure of the dispersion of a data set from its mean. It measures the absolute variability of a distribution. The higher the dispersion or variability, the greater the standard deviation and the greater the magnitude of the deviation from the mean.
Standard deviation
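The cards name the inputs to sample size determination (population size, confidence level, margin of error, standard deviation) but give no formula. The sketch below assumes the commonly used Cochran-style formula for a proportion with a finite-population correction. The parameter `p` is the expected proportion, which many survey calculators label "standard deviation" and default to 0.5. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def z_score(confidence_level):
    """Two-sided z-score, e.g. about 1.96 for a 95% confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)

def sample_size(confidence_level, margin_of_error, p=0.5, population=None):
    """Assumed formula: n0 = z^2 * p * (1 - p) / e^2, corrected for a finite population."""
    z = z_score(confidence_level)
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)  # finite-population correction
    return math.ceil(n0)  # round up so the margin of error is not exceeded

def margin_of_error(confidence_level, n, p=0.5):
    """Margin of error for a proportion estimated from n responses."""
    return z_score(confidence_level) * math.sqrt(p * (1 - p) / n)

print(sample_size(0.95, 0.05))                    # very large population
print(sample_size(0.95, 0.05, population=10000))  # population of 10,000
print(round(margin_of_error(0.95, 385), 3))
```

A higher confidence level or a smaller margin of error both push the required sample size up; a small population pulls it down.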
- aka data scrubbing or data cleansing.
- Ensures the use of the highest-quality data to perform the analysis.
- “Garbage in, garbage out” (George Fuechsel)
- 80/20 Dilemma: 80% of research time is finding, cleaning, and reorganizing huge amounts of data. Only 20% is spent on actual data analysis.
Data Cleaning
o Also known as Exploratory Data Analysis (EDA)
o Developed by John Tukey in the late 1970s
o an approach used to better understand the data through quantitative and graphical methods
Discovering Data
o reshaping data for a particular statistical analysis. Raw data have irregularities and inconsistencies, which affect the accuracy of the researcher’s models
Structuring Data
o central to ensuring you have high-quality data for analysis.
Cleaning Data
o need to find other datasets and merge them into the current one
Enriching Data
o vital to ensure data are clean, correct, and useful.
o Incorrect data = incorrect answers
Validating Data
o making data available for appropriate use by others, as embodied by the FAIR principles.
Publishing Data
FAIR principles
Findable, Accessible, Interoperable, and Reusable
- improves the accuracy and quality of data ahead of data analysis.
Data cleaning
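A small sketch of the structuring, cleaning, and validating steps on a toy dataset. The rows, field names, and the 15 to 100 age range are hypothetical; real projects usually rely on a dataframe library, but plain Python is enough to show the idea.

```python
raw_rows = [
    {"name": " Ana ", "age": "21", "course": "BSBA"},
    {"name": "ben", "age": "", "course": "bsba"},       # missing age
    {"name": "Ana", "age": "21", "course": "BSBA"},     # duplicate of row 1 after trimming
    {"name": "Carl", "age": "210", "course": "BSIT"},   # implausible age
]

def clean(rows, min_age=15, max_age=100):
    cleaned, seen = [], set()
    for row in rows:
        # Structuring: normalize whitespace and letter case.
        name = row["name"].strip().title()
        course = row["course"].strip().upper()
        age_text = row["age"].strip()
        # Cleaning: drop rows with a missing or non-numeric age.
        if not age_text.isdigit():
            continue
        age = int(age_text)
        # Validating: keep only ages inside the assumed plausible range.
        if not min_age <= age <= max_age:
            continue
        # Cleaning: skip exact duplicates.
        key = (name, age, course)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "age": age, "course": course})
    return cleaned

print(clean(raw_rows))
```

Only one of the four raw rows survives, which is the "garbage in, garbage out" point: the other three would have skewed any average age computed from the raw data.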
- graphical representation of data through visual elements such as charts, graphs, maps, or timelines.
- it makes learning easier.
- it helps you break down, process, and present information in a visual context. This way, it takes less time and effort for the brain to digest the information than analyzing data tabularly.
Data visualization
Advantages of Data Visualization
- Easily sharing information.
- Interactively explore opportunities.
- Visualize patterns and relationships.
Disadvantages of Data Visualization
- Biased or inaccurate information.
- Correlation doesn’t always mean causation.
- Core messages can get lost in translation.
- : These consist of rows and columns used to compare variables. Can overwhelm users.
Tables
- : These graphs are divided into sections that represent parts of a whole. A simple way to organize data and compare the size of each component to one another.
Pie charts and stacked bar charts
- : These visuals show change in one or more quantities by plotting a series of data points over time and are frequently used within predictive analytics.
Line charts and area charts
- utilize lines to demonstrate changes in quantities over time
Line graphs
- connect data points with line segments, stacking variables on top of one another and using color to distinguish between variables
Area charts
- : This graph plots a distribution of numbers using a bar chart (with no spaces between the bars), representing the quantity of data that falls within a particular range. This visual makes it easy for an end user to identify outliers within a given dataset
Histograms
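A minimal text-only sketch of what a histogram does: count how many values fall in each equal-width range. The scores, bin width, and starting value are made up, and the function assumes integer values that are all at or above `start`.

```python
def histogram_counts(values, bin_width, start):
    """Return one count per bin; bin i covers start + i*bin_width up to the next bin."""
    n_bins = (max(values) - start) // bin_width + 1
    counts = [0] * n_bins
    for v in values:
        counts[(v - start) // bin_width] += 1
    return counts

scores = [52, 55, 61, 64, 67, 68, 70, 71, 73, 75, 78, 99]  # hypothetical exam scores
counts = histogram_counts(scores, bin_width=10, start=50)

for i, count in enumerate(counts):
    low = 50 + i * 10
    print(f"{low}-{low + 9}: {'#' * count}")
```

The empty 80-89 bin followed by a single value in 90-99 is how an outlier shows up in this kind of chart.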
- : These visuals are beneficial in revealing the relationship between two variables, and they are commonly used within regression data analysis. However, these can sometimes be confused with bubble charts, which are used to visualize three variables via the x-axis, the y-axis, and the size of the bubble.
Scatter plots
- : These graphical displays are helpful in visualizing behavioral data by location. This can be a location on a map, or even a webpage.
Heat maps
- : These display hierarchical data as a set of nested shapes, typically rectangles. Treemaps are great for comparing the proportions between categories via their area size.
Tree maps
- : Think about who your visualization is designed for and then make sure your data visualization fits their needs.
Know your audience(s)
- : Specific visuals are designed for specific types of datasets.
Choose an effective visual
- : Data visualization tools can make it easy to add all sorts of information to your visual. Be deliberate and add only what focuses the viewer's attention.
Keep it simple
– analyzing and gathering numerical data to uncover trends, calculate averages, evaluate relationships, and derive overarching insights.
Quantitative Research
- – distributing research questions -> target respondents
Survey research
- – c. analysis w/ graphs & diagrams to show relationships -> research
Correlational research
- – investigates cause & effect relationships between independent & dependent variables
Experimental research
- market research method that focuses on obtaining data through open-ended and conversational communication.
- “what” people think and “why” they think so
- e.g. one-on-one interviews, focus groups, ethnographic research, case study research, record keeping, process of observation
Qualitative Research
- emphasizes objective measurements and the statistical, mathematical, or numerical analysis of data collected through polls, questionnaires, and surveys, or by manipulating pre-existing statistical data using computational techniques
- collect and analyze data that can be measured in numbers
Quantitative Method
- designing experiments or surveys, gathering measurable data, and analyzing this data using mathematical models to draw conclusions or make predictions.
QM Scope
- provide a framework for obtaining objective data that can be universally measured and analyzed. This reduces the influence of personal biases and subjectivity in interpreting results.
Objective Analysis
- analyze data from a sample -> conclusions that can be generalized to a larger population (medicine, economics)
Generalizability
- developing models to predict outcomes based on measurable variables (weather forecasting, finance, logistics)
Predictive Capabilities
- precise measurement and replicable procedures ensure that studies can be repeated; key to verifying results and building on existing knowledge.
Replicability
- uses statistical tools to validate hypotheses. This provides a robust framework for testing theories and establishing facts with a known degree of accuracy.
Statistical Validity
o – educated guess or prediction
Hypotheses
- make informed decisions in business and policy-making. For example, by analyzing customer data, businesses can optimize their strategies to meet market demands better.
Decision Making
- To collect data from a large group of people efficiently.
Survey and Questionnaires
- To determine cause-and-effect relationships by manipulating variables.
Experiments
- To track changes over time.
Longitudinal Studies
- To analyze existing data collected for other research purposes.
Secondary Data Analysis
- surveys, polls, or questionnaires to gather quantitative data = in-depth & actionable numerical data
Structured tools
- – represents the target market
Sample size
- – cornerstone of quantitative research
Close-ended questions
- – searching previous studies (RRL, review of related literature)
Prior studies
- represented using tables, charts, graphs, or other numerical forms.
Quantitative data
- generalize results to the entire population.
Generalization of results
Benefits of QM
- Collect Reliable and Accurate Data
- Quick Data Collection
- Wider Scope of Data Analysis
- Eliminate Bias
- involves summarizing, organizing, and presenting data meaningfully and concisely.
- describing and analyzing a dataset’s main features and characteristics without making any generalizations or inferences to a larger population.
- Involves graphical representation and summary figures (averages, %)
Descriptive Statistics
- Datasets consist of a distribution of scores or values.
Distribution (Also Called Frequency Distribution)
- Mean, Mode, and Median
Measures of Central Tendency
- Range, Standard Deviation, and Variance
Variability (Also Called Dispersion)
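The measures on the two cards above can be computed with Python's built-in `statistics` module. The dataset is made up, and the sketch treats it as a whole population (hence `pvariance` and `pstdev`); for a sample, `statistics.variance` and `statistics.stdev` would be used instead.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical dataset

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of variability (dispersion)
data_range = max(scores) - min(scores)
variance = statistics.pvariance(scores)  # average squared distance from the mean
std_dev = statistics.pstdev(scores)      # square root of the variance

print(mean, median, mode)            # 5 4.5 4
print(data_range, variance, std_dev)
```

Here the standard deviation is 2, meaning a typical score sits about 2 points away from the mean of 5.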
– examine only one variable
Univariate Descriptive Statistics
- two variables are concurrently analyzed (compared) to see whether they are correlated.
- may also be treated as the simplest case of multivariate analysis
Bivariate Descriptive Statistics
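One common way to check whether two variables are correlated is Pearson's r, which runs from -1 (perfect negative) to +1 (perfect positive). The cards do not name a specific coefficient, so the choice of Pearson's r and the study-hours data below are assumptions for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two variables move together around their means.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(spread_x * spread_y)

hours_studied = [1, 2, 3, 4, 5]      # hypothetical data
exam_scores = [52, 58, 65, 70, 78]

print(round(pearson_r(hours_studied, exam_scores), 3))
```

A value close to +1 here says the two variables rise together; it does not by itself show that studying causes the higher scores.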