Course 3: Module 1 Flashcards
How data is collected
- Interviews
- Observations
- Forms
- Questionnaires
- Surveys
- Cookies
Data collection considerations:
- Decide how the data will be collected
- Choose data sources
- Decide what data to use
- Select the right data type
- Determine the time frame
First-party data
Data collected by an individual or group using their own resources
Second-party data
Data collected by a group directly from its audience and then sold
Third-party data
Data collected from outside sources who did not collect it directly
Population
All possible data values in a certain dataset
Sample
A part of a population that is representative of the population
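As a rough illustration of the population and sample cards, the sketch below draws a simple random sample from a made-up population. The population values, the sample size of 50, and the seed are all assumptions chosen for the example. Random selection is one common way to aim for a representative sample, but it does not guarantee one.

```python
import random

# Hypothetical population: every possible value in the dataset (here, IDs 1..1000).
population = list(range(1, 1001))

# A seeded generator keeps the example reproducible; the seed value is arbitrary.
rng = random.Random(0)

# Draw 50 distinct members of the population without replacement.
sample = rng.sample(population, 50)

print(len(population), len(sample))
```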
Discrete data
Data that is counted and has a limited number of values
Continuous data
Data that is measured and can have almost any numeric value
Nominal data
A type of qualitative data that is categorized without a set order
Ordinal data
A type of qualitative data with a set order or scale
Internal data
Data that lives within a company’s own systems
External data
Data that lives and is generated outside of an organization
Structured data
Data organized in a certain format such as rows and columns
(Spreadsheets)
Unstructured data
Data that is not organized in any easily identifiable manner
(Audio and video files)
Data model
A model that is used for organizing data elements and how they relate to one another
Data elements
Pieces of information, such as people’s names, account numbers, and addresses
Conceptual data modeling
gives a high-level view of the data structure, such as how data interacts across an organization. It can be used to define the business requirements for a new database. It doesn’t contain technical details.
Logical data modeling
focuses on the technical details of a database such as relationships, attributes, and entities. For example, a logical data model defines how individual records are uniquely identified in a database. But it doesn’t spell out actual names of database tables. That’s the job of a physical data model.
Physical data modeling
depicts how a database operates. A physical data model defines all entities and attributes used; for example, it includes table names, column names, and data types for the database.
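To make the physical data modeling card concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and data types are hypothetical, loosely based on the data elements mentioned earlier (names, account numbers, addresses). They show the kind of detail a physical model pins down, not a recommended schema.

```python
import sqlite3

# In-memory database so the example leaves no files behind.
conn = sqlite3.connect(":memory:")

# A physical model names the actual tables, columns, and data types.
conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,  -- uniquely identifies each record
        name        TEXT NOT NULL,
        address     TEXT
    )
""")
conn.execute("""
    CREATE TABLE accounts (
        account_number INTEGER PRIMARY KEY,
        customer_id    INTEGER NOT NULL REFERENCES customers(customer_id)
    )
""")

# PRAGMA table_info reports each column's name (index 1) and declared type (index 2).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customers)")]
print(columns)
```

A logical model would stop at "each customer is uniquely identified and can hold accounts"; the table and column names above are the physical layer.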
Two common methods of data modeling
Entity Relationship Diagram (ERD) and the Unified Modeling Language (UML)
Data type
A specific kind of data attribute that tells what kind of value the data is
Data types in spreadsheets
- Number
- Text or string
- Boolean
Text or string data type
A sequence of characters and punctuation that contains textual information
Boolean data type
A data type with only two possible values, such as TRUE or FALSE
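The three spreadsheet data types above can be sketched with Python values standing in for spreadsheet cells. The function name `spreadsheet_type` and the sample cells are made up for this illustration. Real spreadsheet applications decide types by their own rules.

```python
def spreadsheet_type(value):
    """Map a Python value to one of the three spreadsheet data types."""
    # Check bool first: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, (int, float)):
        return "Number"
    if isinstance(value, str):
        return "Text or string"
    return "Other"

# Hypothetical cell values.
cells = [42, 3.5, "Hello, world", True]
types = [spreadsheet_type(cell) for cell in cells]
print(types)
```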
Wide data
Data in which every data subject has a single row with multiple columns to hold the values of various attributes of the subject
Long data
Data in which each row is one time point per subject, so each subject will have data in multiple rows
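The difference between wide and long data is easier to see with a small conversion. The sketch below uses invented data (two countries, two years) and a hypothetical helper, `wide_to_long`, to turn one row per subject into one row per subject per time point.

```python
# Hypothetical wide data: one row per subject, one column per year.
wide_rows = [
    {"country": "A", "2019": 10, "2020": 12},
    {"country": "B", "2019": 7, "2020": 9},
]

def wide_to_long(rows, id_column, variable_name="year", value_name="value"):
    """Return one row per (subject, non-id column) pair."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the id column identifies the subject; it is not a measurement
            long_rows.append(
                {id_column: row[id_column], variable_name: column, value_name: value}
            )
    return long_rows

long_rows = wide_to_long(wide_rows, "country")
for row in long_rows:
    print(row)
```

Each subject has one row in `wide_rows` and two rows in `long_rows`, one for each year.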
Data organization
Better organized data is easier to use
Data compatibility
Different applications or systems can use the same data
Data migration
Data with matching formats can be moved from one system to another
Data merging
Data with the same organization can be merged together
Data enhancement
Data can be displayed with more detailed fields
Data comparison
Apples-to-apples comparisons of the data can be made