Data Science using Python and R - 2 Flashcards

1
Q

What is required to run Python code?

A

A Python interpreter and an environment for writing and running code, such as the Spyder IDE included in the Anaconda software package.

2
Q

How do you download Anaconda?

A

Go to the Spyder installation page and select the Anaconda link under Windows or MacOS X options.

3
Q

What are the three main boxes displayed when you first open Spyder?

A
  • Left-hand box: where you write Python code
  • Top-right box: lists data sets and items created by Python code
  • Bottom-right box: displays output and error messages.
4
Q

What are the five kinds of actions focused on in Python coding?

A
  • Using comments
  • Importing packages
  • Executing commands
  • Saving output
  • Getting data into Python.
5
Q

What character is used to start a comment in Python?

A

#

6
Q

True or False: Comments in Python are executed by the interpreter.

A

False

7
Q

What is the purpose of comments in Python code?

A

To help others understand the code better.

8
Q

How do you execute a single line of code in Spyder?

A

Place the cursor on the line and press the run button or use the keyboard shortcut.

9
Q

How do you execute multiple lines of code in Spyder?

A

Highlight the relevant lines and press the ‘Run selection or current line’ button or use the keyboard shortcut.

10
Q

What is the purpose of importing packages in Python?

A

To perform complex data science tasks without writing the code from scratch.

11
Q

Which two packages are commonly imported in Python for data science?

A
  • pandas
  • numpy.
12
Q

What command is used to import the pandas package as pd?

A

import pandas as pd

13
Q

What command is used to import the numpy package as np?

A

import numpy as np

14
Q

Fill in the blanks: To import specific commands from a package, use the format from _____ import _____.

A

from [package_name] import [command_name]
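
A minimal sketch of both import styles, assuming pandas and numpy are installed (as they are with Anaconda). The choice of `sqrt` and the sample values are illustrative and do not come from the deck.

```python
import pandas as pd      # whole package, under the short alias pd
import numpy as np       # whole package, under the short alias np
from numpy import sqrt   # one specific command from a package

values = np.array([1.0, 4.0, 9.0])   # illustrative numbers
roots = sqrt(values)                 # the imported command is called by its own name
roots_series = pd.Series(roots)      # package commands are reached through the alias
print(roots_series)
```

With the alias form, every command is prefixed (`np.array`, `pd.Series`); with the `from ... import ...` form, the command is used without a prefix.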

15
Q

What is the structure of the command to get a data set into Python?

A

your_name_for_the_data_set = pd.read_csv('the_path_to_the_file')

16
Q

What does the command pd.read_csv() do?

A

It imports a CSV file into a pandas DataFrame.
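
A small sketch of `pd.read_csv()` in action. To keep it self-contained, the CSV text is held in memory with `io.StringIO` instead of a file on disk; the rows and the `job` column are made-up sample data, not the real bank_marketing_training file.

```python
import io
import pandas as pd

# Made-up CSV content standing in for a file on disk (hypothetical values).
csv_text = (
    "age,job,marital\n"
    "56,housemaid,married\n"
    "57,services,married\n"
    "41,blue-collar,single\n"
)

# read_csv accepts a file path or a file-like object; the first row becomes the column names.
bank_train = pd.read_csv(io.StringIO(csv_text))
print(bank_train)
```

With a real file, the `io.StringIO(...)` argument would be replaced by the path to the file, written as a quoted string.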

17
Q

What is the syntax for saving output in Python?

A

your_name_for_the_output = the_command_that_generated_the_output

18
Q

What is the purpose of saving output in Python?

A

To use the output in later lines of code.

19
Q

What Python command is used to create a contingency table?

A

pd.crosstab()
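
A sketch of building a contingency table, saving it under a name, and reusing it in a later line. The data frame here is a tiny made-up stand-in, and the `response` column is an assumed name chosen for illustration.

```python
import pandas as pd

# Tiny made-up data set (hypothetical values and column names).
bank_train = pd.DataFrame({
    "marital": ["married", "single", "married", "single", "married"],
    "response": ["no", "yes", "yes", "no", "no"],
})

# Save the output of pd.crosstab() so it can be used again later.
crosstab_01 = pd.crosstab(bank_train["marital"], bank_train["response"])
print(crosstab_01)

# A later line that reuses the saved output: counts per marital status.
row_totals = crosstab_01.sum(axis=1)
print(row_totals)
```

The first argument supplies the rows of the table and the second supplies the columns.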

20
Q

How do you access a specific record in a pandas DataFrame?

A

Use the .loc attribute followed by the record index in square brackets.

21
Q

How do you view the first record in a DataFrame named bank_train?

A

bank_train.loc[0]

22
Q

How do you access multiple records in a DataFrame?

A

Use the .loc attribute and list the record indices.

23
Q

If you want to see the first 10 rows of a DataFrame, what is the syntax?

A

bank_train[0:10]
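
The three ways of selecting records covered in the cards above can be sketched together. The data frame is made up for illustration and uses the default index, which numbers records from 0.

```python
import pandas as pd

# Made-up data set with 12 records (hypothetical values).
bank_train = pd.DataFrame({
    "age": list(range(20, 32)),
    "marital": ["single", "married"] * 6,
})

first_record = bank_train.loc[0]           # a single record, by index label
some_records = bank_train.loc[[0, 2, 3]]   # several records, by a list of index labels
first_ten = bank_train[0:10]               # rows at positions 0 through 9 (10 is excluded)

print(first_record)
print(some_records)
print(first_ten)
```

Because indexing starts at 0, the first record is `loc[0]` and `[0:10]` stops just before position 10.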

24
Q

How do you access a single variable in a DataFrame?

A

Use bank_train['variable_name']

25
Q

How do you access multiple variables in a DataFrame?

A

Use bank_train[['var1', 'var2']]
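
A short sketch of selecting one variable versus several, using a made-up data frame (the values and the `job` column are hypothetical).

```python
import pandas as pd

# Made-up data set (hypothetical values).
bank_train = pd.DataFrame({
    "age": [56, 57, 41],
    "job": ["housemaid", "services", "blue-collar"],
    "marital": ["married", "married", "single"],
})

age = bank_train["age"]                       # one name in brackets gives a Series
age_marital = bank_train[["age", "marital"]]  # a list of names gives a DataFrame

print(age)
print(age_marital)
```

The doubled brackets in the second form are a list of column names placed inside the usual selection brackets.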

26
Q

What must you do to set up graphics in Spyder for better display?

A

Change the graphics settings to ‘Automatic’ in Preferences.

27
Q

What is the first step to change graphics settings in Spyder?

A

Click on Tools in the menu bar, then select Preferences.

28
Q

What is the first step to set up graphics options in Spyder?

A

Click on Tools in the menu bar, then select Preferences

29
Q

In Spyder, where do you find the Graphics tab to change settings?

A

In the Preferences window, on the top of the right‐hand side

30
Q

What should you select under Graphics backend to enable graphical output?

A

Choose Automatic from the Backend drop‐down menu

31
Q

What must you do after changing the graphics options in Spyder?

A

Close Spyder and reopen it for the new settings to take effect

32
Q

True or False: Changing the graphics backend will open graphical output in the same window.

A

False

33
Q

What is the main purpose of the Configure subplots button in the graphics output window?

A

To change the margins of the plot

34
Q

What is the first action required to download R?

A

Go to the R installation page and choose a mirror

35
Q

How do you open a new R script in RStudio?

A

Click on File > New File > R Script

36
Q

What is located in the top-left box of the RStudio interface?

A

Where you will type your R code

37
Q

What does the bottom-right box in RStudio primarily display?

A

Many tabs, including the ‘Plots’ tab for graphical output

38
Q

What symbol is used to start a comment in R code?

A

#

39
Q

How do you execute a single line of R code in RStudio?

A

Click the Run button or use the keyboard shortcut

40
Q

What are the two steps to make an R package available for use?

A
  • Downloading the package
  • Opening the package
41
Q

Fill in the blank: To download a package in R, you use the command _______.

A

install.packages()

42
Q

What command do you use to open an R package after it’s been downloaded?

A

library()

43
Q

What is the easiest method to get a data set into R?

A

Using the ‘Import Dataset’ button in the RStudio Environment tab

44
Q

What should be selected in the Import Dataset window to indicate the presence of column headers?

A

The ‘Yes’ button for ‘Heading’

45
Q

What is the general form to rename a data set in R?

A

object_name <- object_to_be_saved

46
Q

What command is used to read a CSV file into R?

A

read.csv()

47
Q

How do you create a contingency table in R?

A

Use the table() function

48
Q

What notation is used to access a specific record in a data set in R?

A

Bracket notation: data_set_name[ rows of interest , columns of interest ]

49
Q

What does the command bank_train[1, ] return?

A

The first record in the bank_train data set

50
Q

How do you access multiple records, for example, the first, third, and fourth records in R?

A

bank_train[c(1,3,4), ]

51
Q

How can you access specific variables in a data set in R?

A

Use bracket notation for columns: data_set_name[, c(column_indices)]

52
Q

What is the result of the command bank_train[, c(1, 3)]?

A

It returns the first and third variables from the bank_train data set

53
Q

What are the first and third variables in the data set?

A

age and marital

54
Q

How do you access specific variables in a data frame in R?

A

Use the syntax bank_train[, c(1, 3)]

55
Q

How can you access the age variable from the bank_train data set?

A

bank_train$age

56
Q

What property do data frames have that allows identifying variables of interest?

A

Their variables are named, so you can pick one out with a dollar sign ($), as in bank_train$age

57
Q

What programming languages are covered in this book?

A

Python and R

58
Q

What is the purpose of comments in code?

A

To provide explanations or notes without affecting output

59
Q

What character begins a comment in Python?

A

#

60
Q

What is the use of the ‘as’ keyword when importing Python packages?

A

To rename the imported package

61
Q

How do you save output generated by Python code?

A

Use assignment to a variable

62
Q

How do you save output generated by R code?

A

Assign output to a variable

63
Q

Why is it important to specify if a data set has column headings?

A

To ensure proper data interpretation and manipulation

64
Q

What are two ways to get a data set into R?

A
  • Using read.csv()
  • Using read.table()
65
Q

What is contained in the bottom-right window of a programming environment?

A

Output or results of executed code

66
Q

What is the output of executing a comment in R?

A

No output is generated

67
Q

What packages should be imported for Python in the exercises?

A
  • pandas
  • numpy
68
Q

What package should be imported for R in the exercises?

69
Q

What is the name given to the imported bank_marketing_training data set?

A

bank_train

70
Q

What is a contingency table?

A

A table of counts showing how often each combination of values of two categorical variables occurs

71
Q

What is the name of the saved output for the contingency table in Python?

A

crosstab_01

72
Q

What is the name of the saved output for the contingency table in R?

73
Q

How do you save the first nine records of the bank_train data set?

A

Assign them to a new data frame

74
Q

How do you save the age and marital variables of the bank_train data set?

A

Assign them to a new data frame

75
Q

How do you save the first three records of the age and marital variables?

A

Assign them to a new data frame
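
One possible Python version of the three "assign to a new data frame" answers above, as a sketch. The data are made up, and the names `first_nine`, `age_marital`, and `first_three_age_marital` are hypothetical; the deck does not say what the saved objects are called.

```python
import pandas as pd

# Made-up data set with 12 records (hypothetical values).
bank_train = pd.DataFrame({
    "age": list(range(20, 32)),
    "job": ["services"] * 12,
    "marital": ["single", "married"] * 6,
})

# First nine records: positions 0 through 8.
first_nine = bank_train[0:9]

# The age and marital variables, all records.
age_marital = bank_train[["age", "marital"]]

# First three records of age and marital.
# Note: .loc slices by label and includes the end label, so 0:2 gives three rows.
first_three_age_marital = bank_train.loc[0:2, ["age", "marital"]]

print(first_three_age_marital)
```

The difference between `[0:9]` (end excluded) and `.loc[0:2, ...]` (end included) is easy to miss, so it is worth checking the number of rows after each selection.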

76
Q

What should be done when importing the adult_ch3_training data set?

A

Use the ‘Heading: Yes’ setting

77
Q

What command should be imported for Python related to decision trees?

A

DecisionTreeClassifier from sklearn.tree

78
Q

What package should be imported for R related to decision trees?

79
Q

What is the name given to the contingency table of workclass and sex?

80
Q

What is the name given to the contingency table of sex and marital status?

81
Q

What records should be displayed to analyze sex and workclass?

A

The first record

82
Q

What records should be displayed to analyze sex and marital status?

A

Records 6–10

83
Q

What is the name of the new data set for married individuals?

A

adultMarried

84
Q

What is the name of the new data set for individuals older than 40?

A

adultOver40
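
The deck names these two data sets but does not show how they are built. One way to do it in Python is boolean filtering, sketched below; this technique does not appear in the cards, and the column names `age` and `marital_status` and the label "Married" are assumptions made for illustration.

```python
import pandas as pd

# Made-up stand-in for the adult data set (hypothetical columns and values).
adult = pd.DataFrame({
    "age": [25, 45, 38, 52],
    "marital_status": ["Married", "Single", "Married", "Married"],
})

# Keep only the records where the condition in brackets is True.
adultMarried = adult[adult["marital_status"] == "Married"]
adultOver40 = adult[adult["age"] > 40]

print(adultMarried)
print(adultOver40)
```

Each condition produces one True/False value per record, and only the True records are kept in the new data frame.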