2.3: Structured Data Types: Categorical vs. Numerical Flashcards

1
Q

What are the two broad categories of structured data?

A

The two broad categories of structured data are categorical and numerical.

2
Q

What is nominal data?

A

Nominal data refers to categorical data that cannot be ranked or ordered, such as country of origin and transaction type.

It is summarized using counting, grouping, and proportion methods.

3
Q

How can you analyze categorical nominal data like the “Online or In-Person” column?

A

Categorical nominal data can be analyzed by counting and grouping transactions by their categories (e.g., online or in-person) and calculating proportions.

Proportions are found by dividing the number of observations in a category by the total number of observations in the sample.
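As a minimal sketch of counting, grouping, and calculating proportions, the snippet below uses a small made-up sample of the "Online or In-Person" column. The values in `channels` are hypothetical and only illustrate the arithmetic.

```python
from collections import Counter

# Hypothetical sample of the "Online or In-Person" column.
channels = ["Online", "In-Person", "Online", "Online", "In-Person"]

# Count and group the observations by category.
counts = Counter(channels)

# Total number of observations in the sample.
total = sum(counts.values())

# Proportion = observations in a category / total observations.
proportions = {category: n / total for category, n in counts.items()}

print(counts)
print(proportions)
```

The proportions across all categories always sum to 1, which is a quick sanity check on the counts.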

4
Q

What is proportion?

A

Proportion: the number of observations in one category divided by the grand total of all available observations.

5
Q

What is ordinal data, and what distinguishes it from nominal data?

A

Ordinal data are categorical data with a natural order or ranking. They allow counting, grouping, calculating proportions, and ranking by that order.

This distinguishes ordinal data from nominal data, which lack a natural order.

6
Q

What are the three primary methods of summarizing ordinal data?

A

The three primary methods for summarizing ordinal data are counting and grouping, calculating proportions, and ranking (or sorting).

7
Q

Can you provide examples of ordinal data?

A

Examples of ordinal data include letter grades (A, B, C, D, F) and Olympic medals (gold, silver, bronze). They have a natural order that allows ranking.

8
Q

Why is sorting by rank valuable in analyzing ordinal data?

A

Sorting by rank adds value to the analysis of ordinal data because it reflects the natural order of the categories, which a plain alphabetical or numerical sort of the labels does not capture.
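The difference between an alphabetical sort and a rank sort can be sketched with the Olympic medal example. The `MEDAL_RANK` mapping and the sample list are assumptions made for illustration; the point is only that the order has to be supplied explicitly.

```python
# Assumed natural order for Olympic medals: gold ranks first.
MEDAL_RANK = {"gold": 1, "silver": 2, "bronze": 3}

# Hypothetical sample of ordinal values.
medals = ["silver", "bronze", "gold", "silver"]

# Alphabetical sorting ignores the natural order of the categories.
alphabetical = sorted(medals)

# Sorting by rank uses the mapping to respect the natural order.
by_rank = sorted(medals, key=MEDAL_RANK.__getitem__)

print(alphabetical)
print(by_rank)
```

Alphabetically, bronze comes before gold; only the rank-based sort puts gold first.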

9
Q

What are numerical data, and what types of analysis can be performed on them?

A

Numerical data consist of meaningful numbers, and analysis can include operations such as summing, multiplying, or dividing.

10
Q

What are the four primary methods for summarizing numerical data?

A

The four primary methods for summarizing numerical data are counting and grouping, proportion, summing, and averaging.
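The four methods can be sketched on a short list of made-up transaction amounts. The data and the 50-dollar grouping threshold are hypothetical choices for illustration.

```python
from statistics import mean

# Hypothetical transaction amounts in dollars.
amounts = [20.0, 35.0, 50.0, 95.0]

# 1. Counting and grouping: count all rows, then group the large ones.
count = len(amounts)
large = [a for a in amounts if a >= 50.0]  # threshold is an arbitrary assumption

# 2. Proportion: share of transactions that fall in the "large" group.
proportion_large = len(large) / count

# 3. Summing.
total = sum(amounts)

# 4. Averaging.
average = mean(amounts)

print(count, proportion_large, total, average)
```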

11
Q

What is interval data, and why does it lack a meaningful zero?

A

Interval data have equal intervals between data points but lack a meaningful zero: zero does not represent the absence of the thing being measured; it is simply another point on the scale.

Examples include temperature (in Celsius or Fahrenheit) and SAT scores.

12
Q

What distinguishes ratio data from interval data, and why is it considered the most sophisticated form of data?

A

Like interval data, ratio data have equal intervals between data points, but they also have a meaningful zero, which allows ratios to be calculated.

It is considered the most sophisticated form of data because that meaningful zero lets it measure relationships and financial positions.
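One way to see why a meaningful zero matters is to check whether a ratio survives a change of units. The sketch below compares temperature (interval) with a money amount (ratio); the helper `c_to_f` and the sample values are illustrative assumptions.

```python
def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Interval data: the "ratio" of two temperatures depends on the unit,
# so saying 20 degrees is "twice as hot" as 10 degrees is not meaningful.
ratio_celsius = 20 / 10                     # 2.0
ratio_fahrenheit = c_to_f(20) / c_to_f(10)  # 68 / 50 = 1.36

# Ratio data: zero dollars means no money, so the ratio is the same
# whether the amounts are expressed in dollars or in cents.
ratio_dollars = 20 / 10
ratio_cents = 2000 / 1000

print(ratio_celsius, ratio_fahrenheit)
print(ratio_dollars, ratio_cents)
```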

13
Q

Provide examples of data types that fall under ratio data.

A

Examples of ratio data include transaction amounts, expenses, revenues, assets, salary, and taxes.

14
Q

What is a “flag”?

A

Flag: a term used to describe an attribute in which there are only two options, such as yes/no or true/false.

15
Q

What is the string data type, and what types of characters can it include?

A

The string data type consists of a collection of characters, which can be letters, numbers, or a combination.

However, the numbers in a string are not meaningful values and cannot be used in calculations.
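A short sketch of why digits stored as strings behave differently from numbers. The zip code and the other values are hypothetical examples.

```python
# A zip code is made of digits, but it is an identifier, not a quantity.
zip_code = "02139"

# Converting it to a number silently drops the leading zero.
as_number = int(zip_code)

# "+" on strings joins the characters instead of adding values.
joined = "10" + "5"

# The same digits stored as numbers can be used in calculations.
added = int("10") + int("5")

print(as_number, joined, added)
```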

16
Q

What is the date data type, and how is it typically formatted?

A

The date data type is a string of characters formatted in a traditional date format, such as mm/dd/yyyy or mm/dd/yy.
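As an illustration, Python's standard `datetime.strptime` can parse both formats mentioned above; `%m/%d/%Y` corresponds to mm/dd/yyyy and `%m/%d/%y` to mm/dd/yy. The sample dates are made up.

```python
from datetime import datetime

# Parse the same hypothetical date written in both formats.
four_digit_year = datetime.strptime("03/15/2024", "%m/%d/%Y").date()
two_digit_year = datetime.strptime("03/15/24", "%m/%d/%y").date()

# Once parsed, dates support calendar arithmetic that raw strings do not.
days_elapsed = (four_digit_year - datetime(2024, 1, 1).date()).days

print(four_digit_year, two_digit_year, days_elapsed)
```

Two-digit years are ambiguous by nature; Python resolves "24" to 2024 here, but other tools may apply a different cutoff.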

17
Q

What is the number data type reserved for, and what types of characters can be stored as numbers?

A

The number data type is reserved for numeric data, typically ratio data. Any characters stored as numbers can be used in calculations.

18
Q

What is geographic data, and how is it linked to maps?

A

Geographic data are attributes linked to a map, such as state, city, country, or zip/postal code.

These attributes are linked with latitude and longitude numbers to represent them on maps.

19
Q

What is a dimension, and where might you encounter this term, such as in Tableau?

A

A dimension is any attribute characterized as categorical.

The term dimension is often used in applications like Tableau to refer to categorical attributes.

20
Q

What is a measure, and where is it commonly used, such as in Tableau?

A

A measure is an attribute characterized as numerical. The term measure is often used in applications like Tableau to refer to numerical attributes.

21
Q

What do ETL and ELT stand for in the context of data preparation?

A

ETL stands for “Extract, Transform, Load,” while ELT stands for “Extract, Load, Transform.” Both are processes used for preparing data for analysis.

22
Q

What are the four steps involved in preparing data for analysis, often described as part of the ETL or ELT process?

A

The four steps in preparing data for analysis are

(1) ensuring data quality,

(2) validating data for completeness and integrity,

(3) cleansing the data, and

(4) performing preliminary exploratory analysis.
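The steps above can be sketched on a tiny in-memory data set. This is only an illustration under stated assumptions: the rows, the field names, and the choice to drop incomplete rows (rather than repair them) are all hypothetical.

```python
from statistics import mean

# Hypothetical extracted rows; None marks a missing amount.
rows = [
    {"id": 1, "channel": "Online ", "amount": "20.00"},
    {"id": 2, "channel": "in-person", "amount": None},
    {"id": 3, "channel": "ONLINE", "amount": "35.50"},
]

# Validate completeness: which rows are missing an amount?
incomplete = [r["id"] for r in rows if r["amount"] is None]

# Validate integrity: every id should appear exactly once.
ids_unique = len({r["id"] for r in rows}) == len(rows)

# Cleanse: normalize labels, convert amounts to numbers,
# and (as one possible policy) drop the incomplete rows.
clean = [
    {
        "id": r["id"],
        "channel": r["channel"].strip().lower(),
        "amount": float(r["amount"]),
    }
    for r in rows
    if r["amount"] is not None
]

# Preliminary exploratory analysis: a simple summary of the clean data.
summary = {
    "rows": len(clean),
    "mean_amount": mean(r["amount"] for r in clean),
}

print(incomplete, ids_unique, summary)
```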
