Management & Knowledge Generation Flashcards
Quantitative Data
Numbers within a statistical format
Gathered after the design of data collection is outlined
Primary or secondary data
Qualitative Data
Verbal, graphic, subjective
Time-intensive to gather
Useful at beginning of design process for data collection
Primary Data
Quantitative data collected for a particular purpose
Secondary Data
Quantitative data originally collected for another purpose
Data Management
Use of computers to store, access, and secure patient information
Stored as tables in relational databases
Data Warehouses
Used to store results from clinical trials or insurance companies
Not required on a daily basis
Used by management to make decisions
Frees space and increases response time
Knowledge-Based Data
Training, support, research, practice guidelines
Comparison Data
Internal or external comparisons to benchmarks or best-practice guidelines
Analog Data
TV, radio, telephone, recorded
Continuous waveform signals varying in intensity
Binary Code
Comprised of strings of 1s and 0s
1s stored in magnetized areas (on), 0s in non-magnetized areas (off)
Data converted into bits for digital transmission
1 Byte
8 bits
Can represent 256 distinct values (for example, 256 characters)
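A quick check of where the figure 256 comes from, as a minimal sketch: each bit has two states, so 8 bits give 2 to the power of 8 patterns. The variable names are illustrative only.

```python
BITS_PER_BYTE = 8

# Each bit has two states, so n bits give 2**n distinct patterns.
patterns_per_byte = 2 ** BITS_PER_BYTE

# The smallest and largest byte values, written as 8-bit binary strings.
lowest = format(0, "08b")
highest = format(patterns_per_byte - 1, "08b")

print(patterns_per_byte, lowest, highest)   # 256 00000000 11111111
```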
1 Kilobyte
1000 bytes
1 Megabyte
1 million bytes
1 Gigabyte
1 billion bytes
1 Terabyte
1 trillion bytes
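The unit cards above use powers of ten. A small sketch of converting a raw byte count into the largest fitting unit follows; `describe_size` is a hypothetical helper, and note that some systems use powers of 1024 instead, which this sketch does not cover.

```python
# Decimal storage units, matching the powers of ten on the cards above.
UNITS = [
    ("TB", 10 ** 12),
    ("GB", 10 ** 9),
    ("MB", 10 ** 6),
    ("KB", 10 ** 3),
    ("B", 1),
]

def describe_size(num_bytes):
    """Express a byte count in the largest unit that fits (hypothetical helper)."""
    for name, size in UNITS:
        if num_bytes >= size:
            return f"{num_bytes / size:g} {name}"
    return "0 B"

print(describe_size(2_500_000))   # 2.5 MB
```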
American Standard Code for Information Interchange (ASCII)
Most common binary coding scheme for English and European languages
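To illustrate how ASCII maps characters to binary, here is a minimal sketch using Python's built-in `ord` and `str.encode`. The sample text is arbitrary.

```python
text = "Hi"

for ch in text:
    code = ord(ch)                 # numeric ASCII code for the character
    bits = format(code, "08b")     # the same code written as one 8-bit byte
    print(ch, code, bits)

# Encoding as ASCII yields exactly one byte per character.
encoded = text.encode("ascii")
print(len(encoded))
```

The loop prints `H 72 01001000` and `i 105 01101001`, and the encoded form is 2 bytes long.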
Hexadecimal Coding System
2 hexadecimal characters represent 1 byte
Base 16, using 16 symbols (0-9, then A-F for 10-15)
1 hexadecimal digit = 1 nibble (4 bits)
1 byte = 1 octet
Binary code 1000 = Hexadecimal code 8
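The relationships on this card can be checked directly. The sketch below confirms that binary 1000 equals hexadecimal 8 and shows one byte splitting into two nibbles; `byte_to_hex` and `nibbles` are hypothetical helper names.

```python
# Binary 1000 and hexadecimal 8 are the same value (decimal 8).
assert int("1000", 2) == int("8", 16) == 8

def byte_to_hex(value):
    """Format one byte (0-255) as two hexadecimal characters."""
    if not 0 <= value <= 255:
        raise ValueError("a byte must be between 0 and 255")
    return format(value, "02X")

def nibbles(value):
    """Split one byte into its high and low 4-bit halves."""
    return value >> 4, value & 0x0F

print(byte_to_hex(0b10101111))   # AF
print(nibbles(0b10101111))       # (10, 15), i.e. A and F
```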
Unicode Standard Coding Scheme
Standardized coding system that has a large capacity and can represent most languages, including Asian languages
More than 110,000 characters
Users can assign values as needed
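As a small illustration of Unicode's reach beyond English, the sketch below looks up code points for a Latin, an accented, and a Japanese character. UTF-8 is assumed here as the storage encoding purely for illustration, and `describe` is a hypothetical helper.

```python
def describe(ch):
    """Return the Unicode code point label and the UTF-8 byte count for ch."""
    code_point = ord(ch)                  # Unicode code point as an integer
    utf8 = ch.encode("utf-8")             # bytes used to store it in UTF-8
    return f"U+{code_point:04X}", len(utf8)

for ch in ["A", "é", "日"]:
    print(ch, *describe(ch))
```

Characters outside the ASCII range need more than one byte in UTF-8, which is how a single scheme can cover many languages.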
Data Aggregation
Collection and summation of data for further use
May be used to collect data about one topic or person from multiple sources
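A minimal sketch of aggregation across sources, assuming two hypothetical feeds (lab results and pharmacy records) keyed by a made-up `patient_id` field. All data and field names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical records about the same patients from two separate sources.
lab_results = [
    {"patient_id": "P1", "glucose": 110},
    {"patient_id": "P2", "glucose": 95},
    {"patient_id": "P1", "glucose": 130},
]
pharmacy = [
    {"patient_id": "P1", "drug": "metformin"},
]

def aggregate(labs, prescriptions):
    """Collect every record for a patient and summarize the lab values."""
    summary = defaultdict(lambda: {"glucose": [], "drugs": []})
    for row in labs:
        summary[row["patient_id"]]["glucose"].append(row["glucose"])
    for row in prescriptions:
        summary[row["patient_id"]]["drugs"].append(row["drug"])
    return {
        pid: {
            "mean_glucose": (
                sum(d["glucose"]) / len(d["glucose"]) if d["glucose"] else None
            ),
            "drugs": d["drugs"],
        }
        for pid, d in summary.items()
    }

report = aggregate(lab_results, pharmacy)
print(report["P1"])   # {'mean_glucose': 120.0, 'drugs': ['metformin']}
```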
Data Aggregation Criteria
Apps should integrate with existing systems
Apps should be flexible and use industry standards
Fast and reliable performance
Scalable results
Efficient implementation with little training
Requires little increase in hardware, software, and storage
Cost-effective for organization
Subject-Oriented Data Warehouse
All events or objects that are the same are linked in a traceable manner
Time-Variant Data Warehouse
Ability to see information changes as a function of time
Non-Volatile Data Warehouse
Information can never be deleted or manipulated in a way that can cause loss
Integrated Data Warehouse
Information from all areas of the enterprise is placed into the same database for the sake of analysis
Data Warehouse Infrastructure
Hardware and software of the system
Data Warehouse Data
Diagram representations of the structures that send and store information and their relation
Data Warehouse Process
How information gets from one place to another or is dealt with
Codd Rules of Normalization
Used by data warehouses to break data down into a table to show relationships
Dimensional (data into numerical facts) or normalized (groups into tables by subject)
Knowledge Discovery in Databases (KDD)
Method by which to identify patterns and relationships in large amounts of data
Steps - Selecting data, preprocessing, transforming, data mining, interpreting results
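The five KDD steps can be walked through on a toy data set. This is only a sketch under stated assumptions: the records are invented, and simple frequency counting stands in for the data mining step.

```python
from collections import Counter

# Hypothetical raw visit records; None marks a missing value.
records = [
    {"age": 34, "dept": "ER", "diagnosis": "flu"},
    {"age": None, "dept": "ER", "diagnosis": "flu"},
    {"age": 71, "dept": "ER", "diagnosis": "fracture"},
    {"age": 68, "dept": "ICU", "diagnosis": "fracture"},
    {"age": 29, "dept": "ER", "diagnosis": "flu"},
]

# 1. Selection: keep only the target data set (ER visits).
selected = [r for r in records if r["dept"] == "ER"]

# 2. Preprocessing: drop records with missing values.
cleaned = [r for r in selected if r["age"] is not None]

# 3. Transformation: turn raw ages into coarse age groups.
transformed = [
    ("65+" if r["age"] >= 65 else "under 65", r["diagnosis"]) for r in cleaned
]

# 4. Data mining: count how often each (age group, diagnosis) pair occurs.
patterns = Counter(transformed)

# 5. Interpretation: report the most frequent pair.
top_pattern, count = patterns.most_common(1)[0]
print(top_pattern, count)   # ('under 65', 'flu') 2
```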
Data Perturbation
Hiding of confidential information while maintaining basic information in the database
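One common way to perturb data is to add random noise to each value. The sketch below uses that additive-noise approach and re-centers the noise so the overall mean survives; it is one possible technique, not the only one, and `perturb`, `scale`, and `seed` are illustrative names.

```python
import random

def perturb(values, scale=5.0, seed=0):
    """Mask individual values with random noise while keeping the mean."""
    rng = random.Random(seed)
    noise = [rng.gauss(0, scale) for _ in values]
    # Re-center the noise so it sums to zero and the overall mean is unchanged.
    offset = sum(noise) / len(noise)
    return [v + n - offset for v, n in zip(values, noise)]

# Hypothetical confidential values.
salaries = [52000, 61000, 58000, 75000]
masked = perturb(salaries, scale=1000.0)

original_mean = sum(salaries) / len(salaries)
masked_mean = sum(masked) / len(masked)
print(original_mean, round(masked_mean, 6))
```

Individual entries in `masked` differ from the originals, while the two means agree up to floating-point rounding.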
Data Preprocessing
Assembling target data set, cleaning data of noise
Data Mining
Analysis (often automatic) of large amounts of data to identify underlying or hidden patterns
Mean
Average number
Median
50th percentile
Mode
Number occurring with the highest frequency
Range
Distance from the highest to lowest number
Interquartile Range
Range between the 25th and 75th percentile
Variance
Distribution spread around an average value
Standard Deviation
Square root of the variance; shows the dispersion of data above and below the mean in equally measured distances.
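The descriptive measures on the cards above can all be computed with Python's standard `statistics` module. The sketch assumes the data are a whole population (hence `pvariance` and `pstdev`); for a sample, `variance` and `stdev` would apply instead. The data set is made up.

```python
import statistics

data = [2, 4, 4, 5, 7, 9, 11]

mean = statistics.mean(data)               # 6
median = statistics.median(data)           # 5, the 50th percentile
mode = statistics.mode(data)               # 4, the most frequent value
value_range = max(data) - min(data)        # 9

# quantiles(n=4) returns three cut points: the 25th, 50th, and 75th percentiles.
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

variance = statistics.pvariance(data)      # population variance
std_dev = statistics.pstdev(data)          # square root of the variance

print(mean, median, mode, value_range, iqr)
print(round(variance, 3), round(std_dev, 3))
```

Quartile values depend on the interpolation method; this uses the default of `statistics.quantiles`.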
Chi-Square Test
Means of establishing whether a variance in categorical data is statistically significant
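A worked sketch of the Pearson chi-square statistic for a table of counts, computed by hand without a statistics library. The counts are hypothetical, no continuity correction is applied, and `chi_square_statistic` is an illustrative name.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            # Count expected if the row and column categories were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed_count - expected) ** 2 / expected
    return statistic

# Hypothetical counts: rows = treatment / control, columns = improved / not.
observed = [[30, 10],
            [20, 20]]

chi2 = chi_square_statistic(observed)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

# 3.841 is the standard critical value for 1 degree of freedom at alpha = 0.05.
print(round(chi2, 3), degrees_of_freedom, chi2 > 3.841)   # 5.333 1 True
```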
T Test
Used to analyze data to determine if there is a statistically significant difference between the means of two groups
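A minimal sketch of the t statistic for two independent groups, using Welch's unequal-variance form as one common variant. The groups are invented, and converting the statistic into a p-value (which needs the t distribution) is not shown.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """t statistic for two independent samples (Welch's unequal-variance form)."""
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    # Standard error of the difference between the two sample means.
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error

# Hypothetical measurements from two groups.
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]

t = welch_t(group_a, group_b)
print(t)   # 4.0
```

The larger the absolute value of t, the less plausible it is that the two means differ only by chance.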
Regression Analysis
Used to evaluate data sets found in scattergrams
Compares relationship between the dependent and independent variables
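To make the dependent/independent relationship concrete, here is a sketch of simple linear regression by ordinary least squares. The dose and response values are hypothetical, and `fit_line` is an illustrative name.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Co-variation of x and y, and variation of x, around their means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical scatter of dose (independent) against response (dependent).
dose = [1, 2, 3, 4, 5]
response = [3, 5, 7, 9, 11]

slope, intercept = fit_line(dose, response)
print(slope, intercept)   # 2.0 1.0
```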
Sensitivity
The data include all positive cases, taking into account variables and decreasing the number of false-negatives
Specificity
The data include only those cases specific to the needs of the measurement, excluding those from a different population thereby decreasing the number of false-positives
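Sensitivity and specificity are usually expressed as ratios of true and false results. The sketch below uses the standard formulas with hypothetical counts.

```python
def sensitivity(true_pos, false_neg):
    """Share of actual positive cases that were captured."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Share of actual negative cases that were correctly excluded."""
    return true_neg / (true_neg + false_pos)

# Hypothetical screening results.
sens = sensitivity(true_pos=45, false_neg=5)
spec = specificity(true_neg=90, false_pos=10)

print(sens, spec)   # 0.9 0.9
```

Fewer false-negatives push sensitivity toward 1, and fewer false-positives push specificity toward 1.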
Stratification
Data are classified according to subsets, taking variables into consideration
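A short sketch of stratification: splitting records into subsets by one variable and comparing a rate across the subsets. The records, the `age_group` field, and the `stratify` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical readmission records; age_group is the stratifying variable.
patients = [
    {"age_group": "under 65", "readmitted": False},
    {"age_group": "under 65", "readmitted": True},
    {"age_group": "under 65", "readmitted": False},
    {"age_group": "under 65", "readmitted": False},
    {"age_group": "65+", "readmitted": True},
    {"age_group": "65+", "readmitted": True},
    {"age_group": "65+", "readmitted": False},
    {"age_group": "65+", "readmitted": False},
]

def stratify(rows, key):
    """Group rows into subsets (strata) by the value of one variable."""
    strata = defaultdict(list)
    for row in rows:
        strata[row[key]].append(row)
    return dict(strata)

strata = stratify(patients, "age_group")

# Readmission rate within each stratum, and for everyone combined.
rates = {
    name: sum(r["readmitted"] for r in rows) / len(rows)
    for name, rows in strata.items()
}
overall = sum(p["readmitted"] for p in patients) / len(patients)

print(rates, overall)   # {'under 65': 0.25, '65+': 0.5} 0.375
```

The overall rate hides a difference between the two strata that only becomes visible once the data are classified by subset.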
Recordability
The tool/indicator collects and measures the necessary data
Reliability
Results should be reproducible