2.2: OBTAINING THE DATA Flashcards
What are the two most commonly requested data formats?
The two most commonly requested data formats are text and tabular.
When might business analysts prefer to receive data in text format?
Business analysts may prefer text format for analyzing mostly unstructured text-based data, such as customer reviews, transcripts of phone calls, and social media data.
What is the characteristic of tabular data?
Tabular data are structured into rows and columns: each column is dedicated to one attribute, and each row corresponds to one instance of the data.
How should cells in tabular data be structured?
Each cell in tabular data should contain a single value, never multiple values.
For instance, multiple email addresses should be stored in separate fields.
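The single-value rule can be illustrated with a minimal sketch. The record, the semicolon separator, the `split_emails` helper, and the `email_1`/`email_2` field names are all assumptions made for illustration, not part of the source material.

```python
# Hypothetical record: one cell holds two email addresses,
# which violates the one-value-per-cell rule.
raw_row = {"name": "Ana Ruiz", "email": "ana@example.com; a.ruiz@example.org"}


def split_emails(row, max_fields=2):
    """Return a copy of row with each address in its own field.

    Assumes addresses are separated by semicolons. Addresses beyond
    max_fields are dropped, and unused fields are left empty.
    """
    addresses = [a.strip() for a in row["email"].split(";") if a.strip()]
    clean = {key: value for key, value in row.items() if key != "email"}
    for i in range(max_fields):
        clean[f"email_{i + 1}"] = addresses[i] if i < len(addresses) else ""
    return clean


clean_row = split_emails(raw_row)
```

After the split, every cell in `clean_row` holds at most one value, so the addresses can be sorted, filtered, or counted independently.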
What is the common file format for delivering tabular data?
Tabular data are commonly delivered as a comma-separated values file with the extension .csv, a format that is platform-independent and widely compatible.
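As a rough sketch of how a .csv file maps onto rows and columns, the snippet below reads a small in-memory CSV with Python's standard `csv` module. The column names and values are invented for illustration.

```python
import csv
import io

# Hypothetical CSV content: the first line is the header (one name per column),
# and each later line is one instance of the data.
csv_text = (
    "customer_id,city,order_total\n"
    "101,Austin,25.50\n"
    "102,Denver,40.00\n"
)

# csv.DictReader maps each data row to a dict keyed by the header names.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Pull one attribute (column) across all instances (rows).
cities = [row["city"] for row in rows]
```

Note that the reader returns every value as a string, so numeric columns such as `order_total` would need to be converted before doing arithmetic.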
Why is it beneficial to work with unformatted data for business analysis?
Working with unformatted (raw) data gives analysts full control: they can identify outliers and anomalies and frame the data in ways that support decision-making.
What does “aggregated data” refer to?
Aggregated data are individual data points that have been combined into subtotals, such as counts, sums, or averages.
They are data that have already been processed and cleaned.
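A minimal sketch of aggregation, assuming a hypothetical list of individual sales by region: the individual data points are combined into a count, a sum, and an average per region.

```python
from collections import defaultdict

# Hypothetical individual data points: one (region, amount) pair per sale.
sales = [("East", 100.0), ("East", 300.0), ("West", 50.0)]

# Group the individual amounts by region.
amounts_by_region = defaultdict(list)
for region, amount in sales:
    amounts_by_region[region].append(amount)

# Combine each group into subtotals: count, sum, and average.
summary = {
    region: {
        "count": len(amounts),
        "sum": sum(amounts),
        "average": sum(amounts) / len(amounts),
    }
    for region, amounts in amounts_by_region.items()
}
```

The `summary` dictionary has one entry per region, so the individual sales are no longer visible once only the aggregated form is delivered.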
What is raw data, and what is the benefit of using it?
Raw data are data that have not been processed, cleaned, or aggregated.
Working with raw data allows you to select the most relevant data, identify outliers and anomalies, and frame the data in ways that best support decision-making.
It provides greater control and insight into the data.
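To illustrate why raw data help in spotting outliers, here is a small sketch with invented order values. The flagging rule (more than three times the median) is an arbitrary assumption chosen for the example, not a recommended statistical test.

```python
import statistics

# Hypothetical raw data: individual order values, one of them unusual.
order_values = [20, 22, 21, 19, 500]

# An aggregate such as the mean hides the unusual value inside one number.
mean_value = statistics.mean(order_values)
median_value = statistics.median(order_values)

# With the raw values available, a simple rule can flag suspicious points.
# The threshold of 3x the median is an assumption made for illustration.
outliers = [v for v in order_values if v > 3 * median_value]
```

Someone who received only the mean would see a typical order value far above what most customers actually spent, with no way to tell that a single order caused it.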
Example of Tabular Data