Data Management Part 1 Flashcards
- is a process by which information is acquired and processed to ensure the accessibility and reliability of the data for its users.
- One of the most important tools for processing and managing such information is statistics.
Data Management
- is a science that deals with the collection, organization, presentation, analysis, and interpretation of data in order to produce more meaningful information.
- subdivided into two branches, namely: descriptive statistics and inferential statistics
Statistics
- refers to the collection, organization, summary, and presentation of data
- Examples are the measures of location, measures of variability, skewness and kurtosis.
Descriptive Statistics
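As a small illustration of descriptive statistics, the sketch below computes two measures of location and two measures of variability with Python's standard `statistics` module. The `scores` list is hypothetical data made up for this example, and skewness and kurtosis are left out because the standard library has no functions for them.

```python
import statistics

# Hypothetical sample of test scores (illustrative data only).
scores = [70, 75, 80, 85, 90]

# Measures of location
mean = statistics.mean(scores)
median = statistics.median(scores)

# Measures of variability
data_range = max(scores) - min(scores)
sample_sd = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)

print("mean:", mean)
print("median:", median)
print("range:", data_range)
print("sample standard deviation:", round(sample_sd, 2))
```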
- deals with the interpretation and analysis of data, where conclusions about a population are drawn from a subset (sample) of that population.
- Examples are hypothesis testing and regression analysis
Inferential statistics
5 stages in statistical investigation
- Collection of Data
- Organization of data
- Presentation of data
- Analysis of data
- Interpretation of data
- is a characteristic or attribute that can assume different values in different persons, places, or things.
- includes age, race, gender, intelligence, personality type, attitudes, ethnic group or patients, height, weight, heart rate, marital status, eye color, etc.
Variable
- data whose values describe attributes (qualities or categories) rather than amounts.
- are sometimes called categorical data.
- e.g. person’s gender, home town, birthdate, post code, marital status, eye color, etc.
Qualitative variables
- data obtained by counting or measuring.
- Numerical data that answer questions such as how much, how often, or how many.
- Numerical data give information about the quantity of a specific thing, e.g. height, length, weight, test score, and so on.
Quantitative variables
- takes only a finite or countable number of distinct possible values.
- this type of data is counted rather than measured, e.g. number of students in a class.
Discrete variables
- Continuous data can take any of an infinite number of possible values within a given range.
- This type of data is measured rather than counted, e.g. temperature.
Continuous variable
Levels of measurement:
* values are used to label or classify observations. They have no order.
* words, letters and alpha numeric symbols can be used.
Nominal
Levels of measurement:
* values represent discrete, ordered categories. They follow a natural order, but the differences between them are not necessarily equal.
Ordinal
Levels of measurement:
* values give the distance between measurements in addition to classification and ordering. The scale does not have a true zero point.
Interval
Levels of measurement:
* is the most informative level of measurement. It combines the properties of the first three levels (classification, order, and equal differences between units) and also has a true zero point, so ratios between values are meaningful.
Ratio
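One way to see the difference between the interval and ratio levels is to compare temperatures in Celsius (interval, no true zero) and Kelvin (ratio, true zero). The sketch below uses two made-up readings; the variable names are assumptions for illustration only.

```python
# Two hypothetical temperature readings in degrees Celsius (interval scale).
c1, c2 = 10.0, 20.0

# The same readings in Kelvin (ratio scale, which has a true zero).
k1, k2 = c1 + 273.15, c2 + 273.15

# Differences are meaningful on both scales and agree with each other.
diff_c = c2 - c1
diff_k = k2 - k1

# Ratios are only meaningful on the ratio scale.
ratio_c = c2 / c1  # 2.0, but 20 degrees C is not "twice as hot" as 10 degrees C
ratio_k = k2 / k1  # about 1.035, the physically meaningful ratio

print(diff_c, diff_k)
print(ratio_c, round(ratio_k, 3))
```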
the entire group that you want to draw conclusions about
population
is a way of selecting individual members or a subset of the population to make statistical inferences from them and estimate characteristics of the whole population.
sampling methods
the specific group of individuals that you will collect data from.
sample
means that members are chosen by random selection, so every member of the population has a known, non-zero chance of being selected. It is mainly used in quantitative research.
Probability sampling
involves non-random selection based on convenience or other criteria, allowing you to easily collect data. It is often used in exploratory and qualitative research
non-probability sampling
every member of the population has an equal chance of being selected. Your sampling frame should include the whole population. Two common ways of drawing the sample are the lottery (fishbowl) technique and a table of random numbers.
simple random sampling
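A minimal sketch of simple random sampling, assuming the sampling frame is a list of ID numbers. The population size, sample size, and seed are hypothetical choices; a random number generator stands in for the lottery or a table of random numbers.

```python
import random

# Hypothetical sampling frame: ID numbers 1..100 covering the whole population.
population = list(range(1, 101))

# A fixed seed is used only so the sketch gives repeatable results.
rng = random.Random(42)

# Draw 10 members without replacement; each member is equally likely to be chosen.
sample = rng.sample(population, k=10)

print(sample)
```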
is similar to simple random sampling, but it is usually slightly easier to conduct. Every member of the population is listed with a number, but instead of randomly generating numbers, individuals are chosen at regular intervals
systematic sampling
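Systematic sampling can be sketched as picking a random starting point and then taking every k-th member of the numbered list. The population, sample size, and seed below are assumptions made for illustration.

```python
import random

# Hypothetical numbered list of the population.
population = list(range(1, 101))
sample_size = 10

# Sampling interval k: population size divided by the desired sample size.
interval = len(population) // sample_size

# Random starting position within the first interval (seed fixed for repeatability).
rng = random.Random(7)
start = rng.randrange(interval)

# Select every k-th member, beginning at the random start.
sample = population[start::interval]

print(sample)
```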
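involves dividing the population into subgroups (clusters), where each subgroup should have characteristics similar to the whole population. Instead of sampling individuals from each subgroup, entire subgroups are randomly selected. Sometimes referred to as “area sampling”.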
Cluster sampling
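In cluster sampling as it is commonly described, whole subgroups are selected at random and their members make up the sample. The sketch below assumes a hypothetical population grouped by school; the school and student names are invented for illustration.

```python
import random

# Hypothetical clusters: each school is a subgroup of the population.
clusters = {
    "School A": ["A1", "A2", "A3"],
    "School B": ["B1", "B2"],
    "School C": ["C1", "C2", "C3", "C4"],
    "School D": ["D1", "D2"],
}

# Seed fixed only so the sketch gives repeatable results.
rng = random.Random(1)

# Randomly select two entire clusters rather than individual members.
chosen = rng.sample(sorted(clusters), k=2)

# Every member of each chosen cluster goes into the sample.
sample = [member for name in chosen for member in clusters[name]]

print(chosen)
print(sample)
```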
to use this sampling method, divide the population into subgroups (called strata) based on the relevant characteristic (e.g. gender, age range, income bracket, job role), then draw a random sample from each stratum.
stratified random sampling
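A minimal sketch of stratified random sampling, assuming a hypothetical population with a job role recorded for each person. Sampling the same fraction from every stratum (proportional allocation) is an assumption of this example, not something the card specifies.

```python
import random

# Hypothetical population of (person_id, job_role) pairs.
population = [(i, "manager") for i in range(20)] + [(i, "staff") for i in range(20, 100)]

# Divide the population into strata based on job role.
strata = {}
for person in population:
    strata.setdefault(person[1], []).append(person)

# Seed fixed only so the sketch gives repeatable results.
rng = random.Random(3)
fraction = 0.1  # assumed sampling fraction, applied to every stratum

# Draw a simple random sample from each stratum.
sample = []
for role, members in strata.items():
    n = round(len(members) * fraction)
    sample.extend(rng.sample(members, k=n))

print(sample)
```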
simply includes the individuals who happen to be most accessible to the researcher
convenience sampling