Chapter 1 Flashcards
Statistical Inference Procedures
Procedures that allow a decision maker to reach a conclusion about a set of data based on a subset of that data. The two primary categories are estimation and hypothesis testing.
Business Statistics
A collection of procedures and techniques that are used to convert data into meaningful information in a business environment
Statistical Inference Procedure: ESTIMATION
Used in situations in which we would like to know about all the data in a large data set but it is impractical to work with all the data. The estimates are formed by looking closely at a subset of the larger data set.
Statistical Inference Procedures: HYPOTHESIS
Hypothesis testing uses statistical techniques to validate a claim.
Experiment
An experiment is any process that generates data as its outcome, whose results cannot be predicted with certainty.
Closed-Ended Questions
Questions that require the respondent to select from a short list of defined choices.
Experimental Design
The plan for performing the experiment in which the variable of interest is defined is referred to as an experimental design. In the experimental design, one or more factors are identified to be changed so that the impact on the variable of interest can be observed or measured.
Open-Ended Questions
Questions that allow respondents the freedom to respond with any value, words, or statements of their own choosing.
Demographic Questions
Questions relating to the respondents’ characteristics, background and attributes.
Structured Interview
Interviews in which the questions are scripted.
Unstructured Interview
Interviews that begin with one or more broadly stated questions, with further questions being based on the responses.
Direct Observation
A procedure used to collect data in which the process from which the data are being collected is physically observed and the data are recorded based on what takes place in the process.
Bias
An effect that alters a statistical result by systematically distorting it; different from a random error, which may distort on any one occasion but balances out on average.
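The difference between bias and random error can be illustrated with a small simulation. This is only a sketch: the function `measure`, the true value of 100, the bias of 2, and the normally distributed noise are all assumptions chosen for illustration, not part of the flashcard.

```python
import random
import statistics

def measure(true_value, n, bias=0.0, noise_sd=5.0, seed=0):
    """Simulate n measurements: true value + systematic bias + random error."""
    rng = random.Random(seed)
    return [true_value + bias + rng.gauss(0, noise_sd) for _ in range(n)]

# Hypothetical true value of 100 measured 10,000 times.
unbiased = measure(100.0, 10_000)            # random error only
biased = measure(100.0, 10_000, bias=2.0)    # random error plus a systematic shift

# Random error roughly averages out; the bias does not.
print(f"mean with random error only: {statistics.mean(unbiased):.2f}")
print(f"mean with bias of 2.0:       {statistics.mean(biased):.2f}")
```

Taking more measurements shrinks the effect of random error on the average, but it leaves the systematic shift untouched.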
DATA COLLECTION ISSUES
- Data Accuracy
- Interviewer Bias
- Nonresponse Bias
- Observer Bias
- Selection Bias
- Measurement Error
- Internal Validity
- External Validity
Selection Bias
Bias interjected through the way subjects are selected for data collection.
Observer Bias
Data collection through personal observation is also subject to problems. People tend to view the same event or item differently. This is referred to as observer bias.
Internal Validity
A characteristic of an experiment in which data are collected in such a way as to eliminate the effect of variables within the experimental environment that are not of interest to the researcher.
External Validity
A characteristic of an experiment whose results can be generalized beyond the test environment so that the outcomes can be replicated when the experiment is repeated.
Population
the set of all objects or individuals of interest or the measurements obtained from all objects or individuals of interest.
Sample
a subset of the population
Census
an enumeration of the entire set of measurements taken from the whole population
Statistical Sampling Techniques:
those sampling methods that use selection techniques based on chance selection
Non-Statistical Sampling Techniques
Those methods of selecting samples using convenience, judgment, or other nonchance processes
Convenience Sampling
a sampling technique that selects the items from the population based on accessibility and ease of selection
Frame
the list of all objects or individuals in the population
Parameters
descriptive numerical measures, such as an average or a proportion, that are computed from an entire population
Statistic
corresponding measures for a sample
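To make the parameter/statistic distinction concrete, the sketch below computes the same measure (a mean) twice: once from every value in a population and once from a sample. The population of order totals and the function names are hypothetical, invented only for this illustration.

```python
import random
import statistics

def population_parameter(values):
    """Mean computed from every value in the population (a parameter)."""
    return statistics.mean(values)

def sample_statistic(values, n, seed=0):
    """Mean computed from a random sample of n values (a statistic)."""
    rng = random.Random(seed)
    return statistics.mean(rng.sample(values, n))

# Hypothetical population: order totals in dollars for 1,000 customers.
population = [20 + (i * 7) % 60 for i in range(1000)]

mu = population_parameter(population)       # uses all 1,000 values
x_bar = sample_statistic(population, n=50)  # uses only 50 of them

print(f"parameter (population mean): {mu:.2f}")
print(f"statistic (sample mean):     {x_bar:.2f}")
```

The statistic will generally differ somewhat from the parameter, and a different sample would give a different statistic.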
Statistical sampling methods
also called probability sampling, allow every item in the population to have a known or calculable chance of being included in the sample
Statistical Sampling Methods
The fundamental statistical sampling method is SIMPLE RANDOM SAMPLING. Other types of statistical sampling methods are STRATIFIED RANDOM SAMPLING, SYSTEMATIC SAMPLING, and CLUSTER SAMPLING.
Simple Random Sampling
A method of selecting items from a population such that every possible sample of a specified size has an equal chance of being selected
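A minimal sketch of simple random sampling, using the standard library's `random.sample`, which draws without replacement so that every subset of the requested size is equally likely. The frame of customer IDs and the function name are hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct items so every possible sample of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical frame: a list of 200 customer IDs.
frame = [f"customer_{i:03d}" for i in range(1, 201)]

sample = simple_random_sample(frame, n=10, seed=42)
print(sample)
```

The `seed` argument is there only to make the draw repeatable; leave it as `None` for a fresh random sample each time.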
Stratified Random Sampling
a statistical sampling method in which the population is divided into subgroups called strata, so that each population item belongs to only one stratum. The objective is to form strata such that the population values of interest within each stratum are as much alike as possible. Sample items are selected from each stratum using the simple random sampling method
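The stratified procedure can be sketched as follows. The card does not say how many items to take from each stratum, so drawing the same number from every stratum is an assumption made here; the customer data and the names `stratified_random_sample` and `stratum_of` are hypothetical.

```python
import random
from collections import defaultdict

def stratified_random_sample(items, stratum_of, n_per_stratum, seed=None):
    """Split items into strata, then take a simple random sample within each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        # Each item is placed in exactly one stratum.
        strata[stratum_of(item)].append(item)
    sample = {}
    for name, members in strata.items():
        # Assumption: equal allocation, capped at the stratum's size.
        k = min(n_per_stratum, len(members))
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population: (customer id, account size) pairs.
customers = [(i, "small" if i % 3 else "large") for i in range(1, 91)]

picked = stratified_random_sample(
    customers, stratum_of=lambda c: c[1], n_per_stratum=5, seed=1
)
for name, members in picked.items():
    print(name, members)
```

Other allocation rules, such as sampling in proportion to each stratum's size, fit the same structure by changing how `k` is computed.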
Systematic Random Sampling
a statistical sampling technique that involves selecting every Kth item in the population after a randomly selected starting point between 1 and K. The value of K is determined as the ratio of the population size over the desired sample size
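The selection rule above can be sketched as below. Rounding K down to a whole number when the population size is not an exact multiple of the sample size is an assumption of this sketch, as are the function name and the frame of invoice numbers.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every Kth item after a random start between 1 and K."""
    rng = random.Random(seed)
    k = len(frame) // n  # K = population size / desired sample size (rounded down)
    if k < 1:
        raise ValueError("sample size cannot exceed population size")
    start = rng.randint(1, k)  # random starting position, 1..K inclusive
    # Convert the 1-based start to a 0-based index, step by K, keep n items.
    return frame[start - 1::k][:n]

# Hypothetical frame: invoice numbers 1 through 100.
frame = list(range(1, 101))

sample = systematic_sample(frame, n=10, seed=3)
print(sample)
```

With 100 items and a desired sample of 10, K is 10, so the sample consists of one randomly chosen item from the first ten and every tenth item after it.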
Cluster Sampling
a method by which the population is divided into groups, or clusters, that are each intended to be mini-populations. A simple random sample of clusters is selected, and the items within each selected cluster can then be chosen using any probability sampling technique
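One common way to carry out cluster sampling is in two stages: first pick some clusters at random, then sample items inside the chosen clusters. The sketch below assumes that two-stage reading and uses simple random sampling within each cluster; the store and employee data and the function name are hypothetical.

```python
import random

def cluster_sample(clusters, m, n_per_cluster, seed=None):
    """Randomly choose m clusters, then randomly sample items within each."""
    rng = random.Random(seed)
    # Stage 1: simple random sample of cluster names.
    chosen = rng.sample(sorted(clusters), m)
    # Stage 2: simple random sample of items inside each chosen cluster.
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical population: 8 stores (clusters) with 20 employees each.
stores = {
    f"store_{s}": [f"s{s}_emp{e}" for e in range(1, 21)]
    for s in range(1, 9)
}

result = cluster_sample(stores, m=3, n_per_cluster=4, seed=7)
for name, employees in result.items():
    print(name, employees)
```

Only the chosen clusters need to be visited, which is the practical appeal of the method when clusters are geographically spread out.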
Quantitative data
measurements whose values are inherently numerical
Qualitative data
data whose measurement scale is inherently categorical
Time Series data
a set of consecutive data values observed at successive points in time
Cross-sectional data
a set of data values observed at a fixed point in time
DATA MEASUREMENT LEVELS
- Nominal data
- Ordinal data
- Interval data
- Ratio data
Nominal data
the lowest form of data. Assigning codes to categories generates nominal data. Values given to categories that have no specific meaning are nominal data, because the order of the categories is arbitrary. With NOMINAL data, we also have complete control over what codes are used
Ordinal data
ORDINAL or rank data are one notch above nominal data on the measurement hierarchy. At this level, the data elements can be rank-ordered on the basis of some relationship among them, with the assigned values indicating this order
Interval data
if the distance between two data items can be measured on some scale and the data have ordinal properties (>, <, =), the data are said to be interval data. The best example is TEMPERATURE
Ratio data
data that have all the characteristics of interval data but also have a true zero point (at which zero means “none”) are called RATIO data. Ratio measurement is the highest level of measurement. Weight is a good example of RATIO data.
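A short worked example may help separate interval from ratio data. On the Celsius scale zero is an arbitrary point, so differences are meaningful but ratios are not; weight has a true zero, so ratios are meaningful. The specific temperatures and weights below are made up for illustration.

```python
def celsius_to_kelvin(c):
    """Convert Celsius to Kelvin, a temperature scale with a true zero."""
    return c + 273.15

low_c, high_c = 10.0, 20.0

# Interval data: the difference is meaningful on the Celsius scale.
difference = high_c - low_c

# The Celsius ratio is 2.0, but 20 degrees is not "twice as hot" as 10 degrees,
# because 0 degrees Celsius does not mean "no temperature".
naive_ratio = high_c / low_c

# Measured from a true zero (Kelvin), the ratio is only about 1.035.
kelvin_ratio = celsius_to_kelvin(high_c) / celsius_to_kelvin(low_c)

# Ratio data: 80 kg really is twice as heavy as 40 kg.
weight_ratio = 80.0 / 40.0

print(f"difference in degrees C: {difference}")
print(f"naive Celsius ratio:     {naive_ratio}")
print(f"ratio on Kelvin scale:   {kelvin_ratio:.3f}")
print(f"weight ratio:            {weight_ratio}")
```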