Determine what data to collect Flashcards
What are some data collection considerations?
How the data will be collected
What data sources to choose
What data to use
How much data to collect
What data type to select
What time frame to use
What is a data sample?
A part of a population that is representative of the population
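As an illustration, one common way to draw a sample is simple random sampling. The sketch below assumes a hypothetical population of 1,000 customer IDs; the function name `draw_sample` and the sample size are made up for the example. Random selection tends to produce a representative sample but does not guarantee one.

```python
import random

def draw_sample(population, size, seed=0):
    """Return a simple random sample of `size` items, drawn without replacement."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return rng.sample(population, size)

# Hypothetical population: 1,000 customer IDs.
population = list(range(1, 1001))
sample = draw_sample(population, 50)
print(len(sample), "of", len(population), "items selected")
```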
What is first-party data?
Data that you collect yourself
What is second-party data?
Data that is collected directly by another group and then sold
What is third-party data?
Data that is sold by a provider that didn’t collect the data themselves
What is discrete data?
Data that is counted and has a limited number of values. Allowing partial measurements like half-stars or quarter-points would mean the data is not discrete
What is continuous data?
Data that is measured and can have almost any numeric value (e.g., 110.03568 minutes)
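A rough sketch of the distinction, following the definitions on these cards: the helper `looks_discrete` is a hypothetical heuristic that treats a list as discrete only when every value is a whole number. The example values are invented.

```python
def looks_discrete(values):
    """Heuristic: True when every value is a whole number (no partial measurements)."""
    return all(float(v).is_integer() for v in values)

star_ratings = [1, 3, 5, 4]           # hypothetical whole-star ratings (counted)
half_star_ratings = [1.5, 3.0, 4.5]   # partial stars allowed
watch_minutes = [110.03568, 95.5]     # measured durations (continuous)

for name, values in [("star_ratings", star_ratings),
                     ("half_star_ratings", half_star_ratings),
                     ("watch_minutes", watch_minutes)]:
    print(name, "discrete" if looks_discrete(values) else "not discrete")
```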
What is nominal data?
A type of qualitative data that is categorised without a set order
What is ordinal data?
A type of qualitative data with a set order or scale
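To make the difference between nominal and ordinal data concrete, here is a small sketch with invented survey answers. The variable names and the three-step satisfaction scale are assumptions for illustration only.

```python
from collections import Counter

# Nominal: categories with no set order (hypothetical survey answers).
favourite_colours = ["blue", "red", "blue", "green"]
colour_counts = Counter(favourite_colours)  # counting is meaningful, ranking is not

# Ordinal: categories with a set order (hypothetical satisfaction scale).
SATISFACTION_ORDER = ["low", "medium", "high"]
responses = ["high", "low", "medium", "low"]
sorted_responses = sorted(responses, key=SATISFACTION_ORDER.index)

print(colour_counts)
print(sorted_responses)
```

Sorting only makes sense for the ordinal responses, because the scale defines which category comes first.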
What is internal data?
Data that lives within a company’s own systems. Internal data is generally more reliable and easier to collect
What is external data?
Data that lives and is generated outside of an organisation
What is structured data?
Data organised in a certain format such as rows and columns (e.g., expense reports, inventory)
What is unstructured data?
Data that is not organised in any easily identifiable manner (e.g., social media posts, or video and audio files)
Define qualitative data
A subjective and explanatory measure of a quality or characteristic
Define quantitative data
A specific and objective measure such as a number, quantity, or range
What is a data model?
A model used to organise data elements and describe how they relate to one another. It helps keep data consistent and provides a map of how the data is organised
What are data elements?
Pieces of information, such as people’s names, account numbers, and addresses
What are the three most common types (levels) of data modeling?
Conceptual: a high-level view of the data structure (doesn’t contain technical details)
Logical: focuses on the technical details of a database such as relationships, attributes, and entities (doesn’t spell out the actual names of database tables)
Physical: depicts how a database operates. Defines all entities and attributes used (table names, column names, data types)
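As a sketch of what the physical level pins down, the example below uses Python's built-in `sqlite3` module to create two tables with concrete table names, column names, and data types. The `customer` and `account` tables and their columns are hypothetical, chosen to echo the data elements mentioned above (names, account numbers, addresses).

```python
import sqlite3

# In-memory database so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Physical model: exact table names, column names, and data types.
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    full_name   TEXT NOT NULL,
    address     TEXT
);
CREATE TABLE account (
    account_number INTEGER PRIMARY KEY,
    customer_id    INTEGER NOT NULL REFERENCES customer(customer_id)
);
""")

# List the tables that now exist in the database.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

A conceptual model would only say that customers have accounts; a logical model would add the attributes and the relationship; the statements above are the physical step.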
What are two common methods/techniques for developing data models?
Entity Relationship Diagram (ERD): a visual way to understand the relationship between entities in the data model
Unified Modeling Language (UML): detailed diagrams that describe the structure of a system by showing the system’s entities, attributes, operations, and relationships
What is a data type?
A specific kind of data attribute that tells what kind of value the data is
What is text or string data?
A sequence of characters and punctuation that contains textual information
What is boolean data?
A data type with only two possible values: true or false
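A short sketch of how these data types look in practice, using Python's built-in `type` function. The field names and values are invented for illustration.

```python
# Hypothetical record with one value of each kind.
record = {
    "name": "Ada Lovelace",   # text (string) data
    "is_active": True,        # boolean data
    "minutes": 110.03568,     # numeric data
}

# Map each field to the name of its Python data type.
kinds = {field: type(value).__name__ for field, value in record.items()}
print(kinds)
```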