Chapter 23 Flashcards
What is a Pivot Table
A Pivot Table is a powerful innovation in spreadsheet technology. It helps us organize, summarize, analyze, and present selected fields in a dataset. In addition, it does so relatively easily with little effort required
from the user. Earlier, in this book, we saw that we could present data by columns or by rows by transposing a dataset. However, a limitation of
this approach is that we need to transpose all columns (or all rows). A Pivot Table allows us to organize this data by choosing specific fields
around which we can organize the data. Effectively the data pivots (or turns around) given fields.
Pre-requisites for a Pivot Table
Before creating a Pivot Table, we should check our data for potential problems:
- blank rows in data
- blank columns in the data
- hidden rows
- hidden columns
- columns without headers (or labels)
- columns with multiple level headers
- poor organization of data
Pre-requisites for Pivot Table
When inserting a pivot table, if the active cell is within the dataset, Excel chooses all contiguous data as the data that will be summarized in a Pivot Table. (We do have the option to modify this selection). As Excel
automatically selects all contiguous data, it is important to ensure that the data does not have breaks in the data in the form of blank rows or
blank columns. We also need to check if there are any hidden rows or columns. If there are hidden rows/columns that contain data, the Pivot Table includes these rows (as long as the data selection for the Pivot Table includes these rows/columns). We want to check for hidden rows/columns to help us decide if we want to include or exclude these rows/columns in the pivot table.
Pre-requisites for Pivot Table
For a Pivot Table, Excel requires that each field (column) in the dataset have a column header or label. Fortunately for us, if we have a column of data, but that column does not have a label, Excel alerts us to this issue. Another issue, with regard to labels, that we need to be careful about is that the dataset we choose for our pivot table has one level of headers or labels. Consider the table below which shows five
records of fruit sales. These records have labels in two rows. Both, the FRUIT label and the TRANSACTION label, have a series of sub-labels. We need to be sure we have selected the data and one-row of labels before we are ready to start on the pivot table.
A final check has to do with data organization. Every column should have a single element of data in it.
How to invoke a Pivot Table
Keyboard: Alt-N, V
Mouse: INSERT ribbon -> Tables group -> Pivot Table
Excel intelligently chooses the data for which we wish to create a Pivot Table. If the active cell is within the dataset, Excel chooses all contiguous data as the data that will be summarized in a Pivot Table.
The ‘Create PivotTable’ dialog box also asks us to choose where we want the PivotTable summary to be placed. It is preferable to put the summary in a New Worksheet. (This way, we avoid the possibility of overwriting data in the existing worksheet).
Once we click OK, Excel creates a list of fields from the dataset that we can choose to include in our Pivot Table. We can also choose where we want to place the field, e.g., as a column header or as a row header.
The COLUMNS and ROWS fields are used for labels. The VALUES field shows the values that will be summarized. We can also choose the specific manner in which we want to summarize the value. By clicking on the drop down arrow to the right of an item in the VALUES field, we can choose ‘Value Field Settings’ (usually at the bottom of the list) from the options provided.
We can use the Value Field Settings dialog box to rename our data fields.
We can use filters within our PivotTable. Our PivotTable Fields has a section titled ‘FILTERS’.
Slicer
A filter is one way to ‘slice’ your data. A drawback with using the filter is that it is not always evident what filter has been applied. This issue is overcome when we use the pivot table’s slicer feature that gives us a graphical display of the filter.
Once you have created a pivot table, you will see two new tabs called ‘Analyze’ and ‘Design’. These tabs are grouped under a header called PivotTable Tools. Note that this header, and the two tabs, appear only
when you have selected a pivot table. The tabs disappear if your active cell is not on the pivot table. In the Analyze tab is a menu option called insert Slicer. You now have several options by which to slice your data.
The ‘design’ tab, under PivotTable Tools offers different options for the layout and formatting of the pivot table. For instance, you can choose to display, or not, the row and column totals. Likewise, you can
format the pivot table with banded rows. (Banded rows are where each alternating row is formatted in alternating colors to make it easier to read large tables. You can do the same with banded columns).
To remove a slicer, you need to right-click on the slicer options.
A major advantage of the slicer feature is the ease of use and the ability to see the slicer options selected. However, the traditional filter has features – such as a text filter – that are not available with the slicer.
Showing details
A pivot table also gives you the option to drill down into the data.
Flat files and Relational Databases
When we have all the data arranged in rows and columns in a single table, we are said to have the data in a flat file structure. This is problematic for several reasons:
- Repetitive data takes up space, both on the display screen and when we store a file with repeated information
- Data may need to be updated. For instance, a customer’s address may change. However, if this address is represented in multiple locations, each of these locations needs to be updated.
Hence, it makes sense to split the data such that we eliminate repetitions. One way to do this is to present the data in separate files. These files can be connected to each other. This type of database - where we have multiple tables connected to each other - is known as a relational database.
A relational database typically has several tables each connected to one or more of the other tables. Tables in a relational database are not standalone tables, but have a relationship to each other; hence the term ‘relational’. When we store data in multiple tables, the richness of the data is not lost. At the same time, a relational database is also more efficient
as data is not repeated.
In both flat files and relational databases, a record, or an observation, is presented on one row. Each column represents a field. An observation is made up of one or more fields. Fields are also known as attributes. A database can have many records.