2.3: Structured Data Types: Categorical vs. Numerical Flashcards
What are the two broad categories of structured data?
The two broad categories of structured data are categorical and numerical.
What is nominal data?
Nominal data refers to categorical data that cannot be ranked or ordered, such as country of origin and transaction type.
It is summarized using counting, grouping, and proportion methods.
How can you analyze categorical nominal data like the “Online or In-Person” column?
Categorical nominal data can be analyzed by counting and grouping transactions by their categories (e.g., online or in-person) and calculating proportions.
Proportions are found by dividing the number of observations in a category by the total number of observations in the sample.
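As a minimal sketch of the counting and proportion steps above, the following Python snippet uses a small made-up sample of the “Online or In-Person” column. The list values and variable names are illustrative assumptions, not data from the course.

```python
from collections import Counter

# Hypothetical sample of the "Online or In-Person" column (made-up values)
channel = ["Online", "In-Person", "Online", "Online", "In-Person"]

# Counting and grouping: number of observations in each category
counts = Counter(channel)

# Proportion: category count divided by the total number of observations
total = sum(counts.values())
proportions = {category: n / total for category, n in counts.items()}

print(counts)
print(proportions)
```

With this sample, three of the five transactions are online, so the online proportion is 3 / 5.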
What is proportion?
The number of observations in one category divided by the grand total of all available observations.
What is ordinal data, and what distinguishes it from nominal data?
Ordinal data are categorical data with a natural order, so in addition to counting, grouping, and calculating proportions, they can be ranked.
This distinguishes ordinal data from nominal data, which lack a natural order.
What are the three primary methods of summarizing ordinal data?
The three primary methods for summarizing ordinal data are counting and grouping, calculating proportions, and ranking (or sorting).
Can you provide examples of ordinal data?
Examples of ordinal data include letter grades (A, B, C, D, F) and Olympic medals (gold, silver, bronze). They have a natural order that allows ranking.
Why is sorting by rank valuable in analyzing ordinal data?
Sorting by rank adds value because it follows the natural order of the categories, which alphabetical or numerical sorting would not preserve. Sorted alphabetically, for example, bronze would come before gold.
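To illustrate the difference between alphabetical sorting and sorting by rank, here is a short Python sketch using the Olympic medal example. The `medals` list and the `rank` mapping are hypothetical; the mapping is one possible way to encode the natural order.

```python
# Hypothetical medal results (made-up values)
medals = ["silver", "bronze", "gold", "silver", "gold"]

# Assumed encoding of the natural order: lower number = higher rank
rank = {"gold": 1, "silver": 2, "bronze": 3}

# Alphabetical sorting ignores the natural order of the categories
alphabetical = sorted(medals)

# Sorting by rank follows the natural order (gold, silver, bronze)
by_rank = sorted(medals, key=lambda medal: rank[medal])

print(alphabetical)
print(by_rank)
```

The alphabetical result starts with bronze, while the ranked result starts with gold.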
What are numerical data, and what types of analysis can be performed on them?
Numerical data consist of meaningful numbers, and analysis can include operations such as summing, multiplying, or dividing.
What are the four primary methods for summarizing numerical data?
The four primary methods for summarizing numerical data are counting and grouping, calculating proportions, summing, and averaging.
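The four methods can be sketched in a few lines of Python. The transaction amounts and the 100-dollar grouping threshold below are made-up assumptions chosen only to keep the arithmetic easy to follow.

```python
from statistics import mean

# Hypothetical transaction amounts in dollars (made-up values)
amounts = [120.0, 80.0, 200.0, 100.0]

# Counting and grouping: how many transactions, and how many are "large"
count = len(amounts)
large = [a for a in amounts if a >= 100]  # assumed threshold of 100

# Proportion: share of transactions in the "large" group
proportion_large = len(large) / count

# Summing and averaging
total = sum(amounts)
average = mean(amounts)

print(count, proportion_large, total, average)
```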
What is interval data, and why does it lack a meaningful zero?
Interval data has equal intervals between data points but lacks a meaningful zero: zero does not represent the absence of the quantity being measured; it is simply another point on the scale.
Examples include temperature (in Celsius or Fahrenheit) and SAT scores.
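One way to see why the missing meaningful zero matters is to compare two Celsius temperatures. In the sketch below the readings are made up, and the Kelvin conversion is an added illustration (Kelvin is a temperature scale whose zero is meaningful), not something taken from the flashcards.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius to Kelvin, a scale with a true zero."""
    return celsius + 273.15

# Hypothetical temperature readings in degrees Celsius
cold_c, warm_c = 10.0, 20.0

# Differences are meaningful on an interval scale
difference = warm_c - cold_c

# The Celsius ratio is 2.0, but 20 degrees is not "twice as hot" as 10,
# because 0 degrees Celsius does not mean "no temperature"
naive_ratio = warm_c / cold_c

# On the Kelvin scale the same two readings differ by only a few percent
kelvin_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)

print(difference, naive_ratio, round(kelvin_ratio, 3))
```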
What distinguishes ratio data from interval data, and why is it considered the most sophisticated form of data?
Like interval data, ratio data has equal intervals between data points, but it also has a meaningful zero, which allows ratios to be calculated.
It is considered the most sophisticated form of data because it can be used to measure relationships and financial positions.
Provide examples of data types that fall under ratio data.
Examples of ratio data include transaction amounts, expenses, revenues, assets, salary, and taxes.
What is a “flag”?
A term used to describe a data field in which there are only two options, such as the “Online or In-Person” column.
What is the string data type, and what types of characters can it include?
The string data type consists of a collection of characters, which can be letters, numbers, or a combination.
However, the numbers in a string are not meaningful values and cannot be used in calculations.
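A short Python sketch shows what happens when digits are stored as strings. The ZIP code and invoice numbers are hypothetical values chosen for illustration.

```python
# Hypothetical identifiers stored as strings (made-up values)
zip_code = "02139"
invoice_a, invoice_b = "1001", "1002"

# "Adding" two strings joins the characters instead of summing numbers
concatenated = invoice_a + invoice_b

# Converting to integers does allow arithmetic, but the result
# (a sum of invoice numbers) has no real meaning
numeric_sum = int(invoice_a) + int(invoice_b)

# Converting an identifier to a number can also lose information:
# the leading zero of the ZIP code disappears
zip_as_number = int(zip_code)

print(concatenated, numeric_sum, zip_as_number)
```

This is why identifiers such as ZIP codes or invoice numbers are usually kept as strings even though they contain only digits.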