Data Science Toolbox Flashcards
How do you define a function?
def functionname():
    expression
    expression
Calling functionname() runs the expressions in the function body
How do you define a function with a parameter?
def fxnname(parameter):
    expression
    expression
e.g.
def square(value):
    new_value = value ** 2
    print(new_value)
This prints the square of any argument (value) you pass to the fxn square()
Define a fxn with a parameter, but instead of printing the value, return it
def square(value):
    new_value = value ** 2
    return new_value
Now you can use the fxn to assign the result to a new variable, e.g. num = square(4)
What are docstrings? 4 facts
- Describe what your function does
- Serve as documentation for your function
- Placed on the line immediately after the function header
- In between triple double quotes """
e.g.
def square(value):
    """Return the square of a value"""
    new_value = value ** 2
    return new_value
When you assign to a variable the result of calling a function that prints a value but does not return one, what type of value will the variable hold?
e.g. y = print(x)
NoneType
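A minimal sketch of the same idea with a user-defined function. The name show_square is hypothetical and used only for illustration:

```python
def show_square(value):
    """Print the square of value; there is no return statement."""
    print(value ** 2)

# The call prints 9, but the function returns nothing, so y is None.
y = show_square(3)
print(type(y))  # <class 'NoneType'>
```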
How do you define a function with more than one parameter?
def fxnname(parameter1, parameter2):
    expressions
e.g.
def raise_to_power(value1, value2):
    new_value = value1 ** value2
    return new_value
What can you use to make functions return multiple values?
Facts about the answer to above
Tuples
e.g. even_nums = (2, 4, 6)
- Similar to a list, except you can’t modify the values (immutable), and constructed using parentheses
- Can unpack tuples into several variables:
a, b, c = even_nums
print(a) will output 2, print(b) will output 4, print(c) will output 6
- Can access tuple elements just like lists (zero-indexing)
second_num = even_nums[1]
print(second_num) will output 4
Use a tuple to return multiple values. Complete the code below
def raise_both(value1, value2):
    """Raise value1 to the power of value2 and vice versa."""
    new_value1 = value1 ** value2
    new_value2 = value2 ** value1
def raise_both(value1, value2):
    """Raise value1 to the power of value2 and vice versa."""
    new_value1 = value1 ** value2
    new_value2 = value2 ** value1
    new_tuple = (new_value1, new_value2)
    return new_tuple
Define Scope
Name and define 3 types of scope
Scope is the part of the program where an object or name may be accessible
3 Types:
- Global Scope: defined in the main body of a script
- Local scope: defined inside a function. Once execution of fxn is done any name inside the function ceases to exist, so can’t access those names outside the function definition
- Built-in scope: names in the pre-defined built-ins module, e.g. print()
new_val = 10
def square(value):
    """Returns the square of a number"""
    new_value2 = new_val ** 2
    return new_value2
new_val = 20
square(3)
What will be the output?
400
The global value accessed is the one at the time the function is called, not the value when the function is defined.
How do you alter the value of a global name within a function call?
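Use the global keyword
new_val = 10
def square(value):
    """Returns square of a number."""
    global new_val
    new_val = new_val ** 2
    return new_val
After calling square(3), new_val will be 100
*If you don’t use global, the assignment makes new_val local to the function, so reading it in new_val ** 2 raises UnboundLocalError and the global new_val stays 10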
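A short sketch of what happens with and without the global statement. The names counter, bump_without_global and bump_with_global are hypothetical and not part of the original card:

```python
counter = 10

def bump_without_global():
    """Assigning to counter makes it local, so reading it first fails."""
    counter = counter + 1
    return counter

def bump_with_global():
    """The global statement makes the assignment rebind the global name."""
    global counter
    counter = counter + 1
    return counter

try:
    bump_without_global()
except UnboundLocalError as err:
    error_name = type(err).__name__  # the global counter is untouched

result = bump_with_global()  # the global counter is now one higher
print(error_name, result, counter)
```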
How do you print the names in the module builtins?
import builtins
dir(builtins)
def raise_val(n):
    """Return the inner function."""
    def inner(x):
        """Raise x to the power of n."""
        raised = x ** n
        return raised
    return inner
square = raise_val(2)
square(2)
What will be the output?
4
The program created a function “square” that squares any number.
Similarly, can do cube = raise_val(3) to create a function that cubes any number, so cube(4) will return 64
How do you change names in an enclosing scope in a nested function?
Use the nonlocal keyword
def outer():
    n = 1
    def inner():
        nonlocal n
        n = 2
        print(n)
    inner()
    print(n)
Now, outer() will print 2 twice: once from inner() and once from outer() itself, because inner() rebinds the n of outer()
In which order are scopes searched?
Local scope, Enclosing functions, Global, Built-in
“LEGB rule”
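A small sketch of the lookup order. All names here (label, outer, inner, inner_no_local, reads_global, reads_builtin) are hypothetical and chosen for illustration:

```python
label = "global"

def outer():
    label = "enclosing"

    def inner():
        label = "local"
        return label  # found in the local scope

    def inner_no_local():
        return label  # not local, so found in the enclosing function

    return inner(), inner_no_local()

def reads_global():
    return label  # not local, no enclosing function, so the global is used

def reads_builtin():
    return len("abc")  # len is only defined in the built-in scope

found = outer() + (reads_global(), reads_builtin())
print(found)
```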
Add a default argument of 1 for pow in the following code
def power(number, pow):
    """Raise number to the power of pow."""
    new_value = number ** pow
    return new_value
After modifying the code, what would the following output?
power(9, 2)
power(9, 1)
power(9)
def power(number, pow=1):
    """Raise number to the power of pow."""
    new_value = number ** pow
    return new_value
power(9, 2): 81
power(9, 1): 9
power(9): 9
How do you add flexible arguments to a function?
Flexible arguments allow you to pass any number of arguments to the function
Use *args
e.g.
def add_all(*args):
    """Sum all values in *args together."""
    sum_all = 0
    for num in args:
        sum_all += num
    return sum_all
*args collects all the positional arguments into a tuple, then the for loop iterates over them
How do you make a flexible argument for key:value style pairs (keyword arguments)?
Use **kwargs (keyword arguments)
def print_all(**kwargs):
    """Print out key-value pairs in **kwargs."""
    for key, value in kwargs.items():
        print(key + ": " + value)
This collects the keyword arguments into a dictionary named kwargs. For the call
print_all(name='dumbledore', job='headmaster')
the output will look like:
name: dumbledore
job: headmaster
*Note: can add as many key/value pairs as you like
Write a lambda function
functionname = lambda argument1, argument2: expression
e.g.
raise_to_power = lambda x, y: x ** y
Output: raise_to_power(2,3) = 8
What is the map function?
map(func, seq)
Applies the function to all elements in the sequence
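A minimal sketch, assuming Python 3, where map() returns a lazy map object that you convert with list() to see the values. The list nums is made up for illustration:

```python
nums = [1, 2, 3, 4]

# map applies the lambda to each element; nothing is computed until consumed.
squared = map(lambda num: num ** 2, nums)
squared_list = list(squared)
print(squared_list)  # [1, 4, 9, 16]
```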
Filter function
filter(function, sequence)
function: a function that tests whether each element of the sequence is true or false
sequence: sequence which needs to be filtered
Can be used to filter out certain elements from a list
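A minimal sketch of filter() with a lambda, keeping only the strings longer than 5 characters. The list words is made up for illustration:

```python
words = ["frodo", "sam", "gandalf", "pippin"]

# filter keeps the elements for which the function returns True.
long_words = list(filter(lambda word: len(word) > 5, words))
print(long_words)  # ['gandalf', 'pippin']
```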
Reduce function
from functools import reduce
reduce(function, seq)
It applies a rolling computation to sequential pairs of values in a list.
e.g.
product = reduce((lambda x, y: x*y), [1, 2, 3, 4])
Output: 24
(1*2*3*4)
try-except clause
Exceptions are caught during execution
Python will try to run the code following try. If there’s an exception, it will run the code following except
e.g.
def sqrt(x):
    """Returns the square root of a number."""
    try:
        return x ** 0.5
    except:
        print('x must be an int or float')
*If you pass a string to the function sqrt, you will get the except message
Raise an error
Used when Python wouldn’t raise an error by itself, but the output is not desired
e.g. for a sqrt function, we don’t want people to input negative numbers
def sqrt(x):
    """Returns the square root of a number."""
    if x < 0:
        raise ValueError('x must be non-negative')
    try:
        return x ** 0.5
    except TypeError:
        print('x must be an int or float')
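A sketch of how a caller might handle the raised error. The function safe_sqrt is a hypothetical, trimmed-down variant of the card's sqrt, kept only to make the example self-contained:

```python
def safe_sqrt(x):
    """Return the square root of x, rejecting negative input."""
    if x < 0:
        raise ValueError('x must be non-negative')
    return x ** 0.5

# The caller decides what to do with the error.
try:
    safe_sqrt(-4)
except ValueError as err:
    message = str(err)

root = safe_sqrt(9)
print(message, root)
```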
Define iterable
Define iterator
Iterable: an object that can be passed to iter() (it defines an __iter__() method), e.g. lists, strings, dictionaries.
Applying iter() to an iterable creates an iterator
Iterator: an object that produces the next value each time next() is called on it
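A rough sketch of what a for loop does with these two pieces: it calls iter() once, then next() repeatedly until StopIteration is raised. The names letters and collected are made up for illustration:

```python
letters = ['a', 'b', 'c']

collected = []
it = iter(letters)        # iterable -> iterator
while True:
    try:
        item = next(it)   # ask the iterator for the next value
    except StopIteration:
        break             # the iterator is exhausted
    collected.append(item)

print(collected)
```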
Create an iterator named “it” from the following iterable:
word = 'Da'
What is the output of
next(it)
next(it)
next(it)
it = iter(word)
First next(it): 'D'
Second next(it): 'a'
Third next(it): raises a StopIteration error
How do you print all the values of an iterator?
Use the * operator:
word = 'Data'
it = iter(word)
print(*it)
Output: D a t a
*Note: if you use print(*it) again, no values will print because the iterator is exhausted. You would need to redefine the iterator
file = open('file.txt')
it = iter(file)
print(next(it))
What will be the output?
print(next(it))
What will be the output?
Output:
First line of file
Second line of file
range function
range(start, stop)
*stop is 1 + the highest number that will be in the range. Range starts from zero (unless a start is specified)
This creates a range object with an iterator that produces the values until it reaches the limit
e.g.
for i in range(5):
    print(i)
0
1
2
3
4
Make a range that produces values from 10 to 20 and assign it to “values”.
Create a list from “values”.
Sum all the numbers in “values.”
values = range(10, 21)
list(values)
sum(values)
How do you change the beginning index number used by the enumerate()?
enumerate(seq, start = number)
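A minimal sketch of enumerate() with a start value. The list avengers is made up for illustration:

```python
avengers = ['hawkeye', 'iron man', 'thor']

# Without start, counting would begin at 0; here it begins at 1.
pairs = list(enumerate(avengers, start=1))
print(pairs)  # [(1, 'hawkeye'), (2, 'iron man'), (3, 'thor')]
```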
zip function
zip(seq, seq)
Accepts an arbitrary number of iterables and returns an iterator of tuples (a zip object)
e.g.
list1 = [1, 2, 3, 4]
list2 = [5, 6, 7, 8]
z = zip(list1, list2)
z_list = list(z)  # turn the zip object into a list
print(z_list)
[(1, 5), (2, 6), (3, 7), (4, 8)]
Use a for loop to print the objects in zip(list1, list2)
list1 = [1, 2, 3, 4]
list2 = [5, 6, 7, 8]
for z1, z2 in zip(list1, list2):
    print(z1, z2)
Output:
1 5
2 6
3 7
4 8
Print all the elements of zip(list1, list2) using the splat (*) operator
list1 = [1, 2, 3, 4]
list2 = [5, 6, 7, 8]
list1 = [1, 2, 3, 4]
list2 = [5, 6, 7, 8]
z = zip(list1, list2)
print(*z)
Output:
(1, 5) (2, 6) (3, 7) (4, 8)
Unpack the tuples produced by zip() with * into list_1 and list_2
list1 = (1, 2, 3, 4)
list2 = (5, 6, 7, 8)
z = zip(list1, list2)
list_1, list_2 = zip(*z)
This will create two tuples that are like the original list1 and list2:
list_1 = (1, 2, 3, 4)
list_2 = (5, 6, 7, 8)
What can you do when there’s too much data to hold in memory?
Load the data in chunks
e.g. Summing column 'x' in data.csv
import pandas as pd
total = 0
for chunk in pd.read_csv('data.csv', chunksize=1000):
    total += sum(chunk['x'])
print(total)
list comprehension syntax
newlist = [output expression for iterator variable in iterable if predicate expression]
The output expression determines the values you wish to create. The for clause iterates over a sequence. The if expression is optional. *Note: you can also add a second for clause to simulate nested for loops. *Note 2: you can also use an if-else conditional expression in the output expression.
List comprehension helps you make a new list from another iterable as a reference. It collapses for loops for building lists into a single line
e.g.
nums = [12, 8, 21, 3, 16]
new_nums = [num + 1 for num in nums]
print(new_nums)
[13, 9, 22, 4, 17]
The for loop equivalent would be:
new_nums = []
for num in nums:
    new_nums.append(num + 1)
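A short sketch of the two optional features of the syntax: a predicate expression, and a second for clause that behaves like a nested loop. The names evens and pairs are made up for illustration:

```python
nums = [12, 8, 21, 3, 16]

# Predicate: keep only the even numbers.
evens = [num for num in nums if num % 2 == 0]

# Two for clauses: the second one is the inner loop.
pairs = [(x, y) for x in range(2) for y in range(3)]

print(evens)  # [12, 8, 16]
print(pairs)  # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
```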
Make the following matrix using nested list comprehension:
matrix = [[0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4]]
matrix = [[col for col in range(5)] for row in range(5)]
Note that the inner list comprehension creates each row of 0-4, and the outer one repeats it once per row.
What is % called and what does it do?
Modulo operator
Yields the remainder from the division of the first argument by the second
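A quick sketch of the operator, including the common even/odd check. The variable names are made up for illustration:

```python
remainder = 17 % 5       # 17 = 3 * 5 + 2, so the remainder is 2
is_even = 10 % 2 == 0    # an even number leaves no remainder when divided by 2
print(remainder, is_even)
```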
dict comprehension syntax
- dictname = {key: expression for iterator variable in iterable}
e.g.
pos_neg = {num: -num for num in range(9)}
Use an if-else statement on the output expression of a list comprehension
newlist = [output if condition else altoutput for iterator in iterable]
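A minimal sketch of the pattern, labelling each number as even or odd. The names nums and labels are made up for illustration:

```python
nums = [12, 8, 21, 3, 16]

# The if-else sits in the output expression, so every element produces a value.
labels = ['even' if num % 2 == 0 else 'odd' for num in nums]
print(labels)  # ['even', 'even', 'odd', 'odd', 'even']
```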
How is a generator different from a list comprehension?
How is it similar?
Differences:
- Does not construct a list
- Does not store the list in memory
Similarities:
- It’s an object we can iterate over
- We can use the same syntax as list comprehension, but with parentheses instead of square brackets
i.e.
generatorname = (output expression for iterator variable in iterable)
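A small sketch of the lazy behaviour: values are produced one at a time, and a generator can only be consumed once. The names squares, first, remaining and leftover are made up for illustration:

```python
squares = (num ** 2 for num in range(5))

first = next(squares)       # computes only the first value
remaining = list(squares)   # consumes the rest
leftover = list(squares)    # the generator is now exhausted

print(first, remaining, leftover)  # 0 [1, 4, 9, 16] []
```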
What is a generator function? (4 facts)
- Produces generator objects when called. Note that can iterate over generator objects.
- Defined like a regular function - def
- Yields a sequence of values instead of returning a single value
- Generates a value with “yield” keyword (instead of “return”)
Example of a generator function:
def num_sequence(n):
    """Generate values from 0 up to, but not including, n."""
    i = 0
    while i < n:
        yield i
        i += 1
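A sketch of how a generator function is used once defined. The function countdown is a hypothetical example, separate from the card's num_sequence, so that the block runs on its own:

```python
def countdown(start):
    """Yield start, start - 1, ..., 1."""
    current = start
    while current > 0:
        yield current
        current -= 1

gen = countdown(3)    # calling the function returns a generator object
first = next(gen)     # runs the body up to the first yield
rest = list(gen)      # resumes after the yield until the loop ends

print(first, rest)    # 3 [2, 1]
```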
How do you return the first n rows of a DataFrame?
dataframename.head(n)
n = number of rows to return (default is 5)