PythonBasics Flashcards
print working directory
pwd
my computer’s network name
hostname
make directory
mkdir
change directory
cd
list directory
ls
what should you do if you get lost in terminal?
Type “pwd” to print the working directory, then “cd ~” to get back to your home directory
How do you create a folder with multiple words in the name?
Use mkdir and put quotes around the name. For example: mkdir "that's pretty cool"
remove directory
rmdir
What does “cd ..” do?
It takes you up one folder, for example from /desktop/temp/stuff to /desktop/temp. You can also go up several folders at once with ../../.. (3 folders up)
What does CLI mean?
Command line interface
What does GUI mean?
graphical user interface
What will “ls -1R” do?
It lists, one entry per line, the contents of your current folder and of every folder below it (-R means recursive).
push directory
pushd (go to new location)
pop directory
popd (return to saved location)
what does “mkdir -p” do?
It lets you create an entire directory path at once, i.e. multiple nested folders in a tree, including any missing parent folders. For example: mkdir -p kap/jokap/chakap/ja/sådan/er/det
copy a file or directory
cp
move a file or directory
mv
What can be the problem if you are not allowed to rmdir a directory?
First, there could still be files inside the folder you are trying to delete. Second, if you are 100 % sure that it is empty, a hidden “.DS_Store” file created by macOS may be in the way; you can delete it by going to the folder and typing “rm -rf .DS_Store”
page through a file
less
print the whole file
cat
What will
mkdir something
cp awesome.txt something/
Do?
first we create a new folder/directory. Next we make a copy of the file awesome.txt from current pwd to this new folder called “something”
create a new file
touch (write…. “touch iamcool.txt” for example)
How do we copy a file?
go to the directory of the file and write “cp filename.txt newfilesname.text”
execute arguments
xargs
find files
find
how do you rename a file?
use mv (move a file) so you write “mv filename.doc newfilename.doc”. you can check the name change after by listing the directory content with ls
how can you see the content of a file?
write “less filename.txt” and it will show you the content. To quit, just type q
How can you remove a FILE?
just write “rm filename.doc” (you can also remove multiple files at once by listing them)
How can you remove a folder/directory?
rmdir foldername
How can you delete a file within a folder?
rm foldername/filename.txt
How can you delete a folder that contains files?
rm -rf foldername
Whats the “execute” command?
xargs
find things inside files
grep
read a manual page
man
find what man page is appropriate
apropos
look at your environment
env
print some arguments
echo
export/set a new environment variable
export
exit the shell
exit
What is the command to become a super-user?
sudo runs a command as the super-user (root), which lets you run commands that your normal account is not allowed to run
Which command can you use to change the access permissions to a file?
chmod
How can you change the file owner?
use chown
How do you run a python file from terminal?
Just write “python filename.py”
What does # do in python and what is it called?
A hash.
If you put a hash (#) in front of a line of code, it will not print / go in effect so you can use it for making comments
How can you print a hash/octothorpe? “#”
you put it inside strings; print “we have a # inside of the strings now” # the hash inside the strings will be printed but this comment after the second # will not be printed.
What is / called and what is it?
slash and it is the division sign
What is * called and what is it?
asterisk and it is the multiplier sign
How do you make additions?
with “+” for example “print 3 + 2” will output 5
How do you make subtractions?
with “-” for example “print 3 - 4” will output -1
What is “%” and what does it do?
It is the percent sign (the modulo operator) and it gives the remainder of a division: ‘X divided by Y with J remaining’. For example, 100 divided by 16 is 6 with 4 remaining, so “print 100 % 16” outputs 4, while “print 100 % 10” outputs 0.
What are floating point numbers?
Essentially just numbers that include decimals such as 5.5 or 0.02
What output will “print 20/3” give?
6 in Python 2, because dividing two integers drops the decimals. If you want the precise output including decimals you need to write “print 20.0/3.0” and you will get 6.666666666666667. (In Python 3, 20/3 already gives 6.666666666666667, and 20//3 gives 6.)
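The division and remainder cards can be tried out together. This is a minimal sketch, assuming Python 3 (where print is a function and / always keeps the decimals); the variable names are only illustrative.

```python
# Python 3: / is true division, // is floor division, % is the remainder.
precise = 20 / 3      # keeps the decimals
quotient = 20 // 3    # drops the decimals, like 20/3 in Python 2
remainder = 20 % 3    # what is left over after the whole division

print(precise)
print(quotient, remainder)

# The modulo examples from the card above.
print(100 % 16, 100 % 10)
```

Floor division and modulo belong together: quotient * 3 + remainder gives back 20.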
how do you write less-than-equal?
<=
how do you write greater-than-equal?
>=
What is the order of mathematical operations?
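PEMDAS: parentheses, exponents, multiplication, division, addition, subtraction. Multiplication and division have the same priority and are evaluated left to right, and so do addition and subtraction.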
How do you create a variable in python?
You write the name, an “=” sign, and the value. For example,
cars = 100 # creates a variable called cars that is equal to 100
cars_not_driven = cars - drivers # creates a variable for available cars (cars_not_driven)
What is the difference between = (single-equal) and == (double-equal)?
The = (single-equal) assigns the value on the right to a variable on the left. The == (double-equal) tests if two things have the same value.
Can we write x=100 instead of x = 100?
You can, but it’s bad form. You should add space around operators like this so that it’s easier to read.
How do you define number and character variables?
Examples…
my_name = "Lars Horsbol"
And
my_age = 23
How do you insert variables inside printed text?
You use %s for string variables and %d for number variables. For example,
print "Let's talk about %s" % my_name
And
print "He's %d centimeters tall." % my_age
How should you refer to a number-variable inside printed text?
With %d
print "He's %d centimeters tall." % my_age
How should you refer to a character-variable inside printed text?
With %s
print "Let's talk about %s" % my_name
How do you add variables together inside printed text?
Put one %d in the string for each value and list the values, in the same order, in parentheses after the %.
print "If I add %d, %d, and %d I get %d." % (my_age, my_height, my_weight, my_age + my_height + my_weight)
The last %d is filled with the sum of the three variables.
How does the %r format work?
%r formats any value using its “representation” (what repr() returns), so it works for both a character-variable (normally %s) and a number-variable (normally %d).
print "and with both number %r AND character %r" % (height, hair) # %r handles the number-variable and the character-variable alike
and with both number 192 AND character 'Dark-blond'
What are the different conversion types in Python and which ones are most common?
%d = signed integer decimal
%s = strings (the value is converted with str())
%r = any value, shown as its raw representation (the value is converted with repr())
What’s the point of %s and %d when you can just use %r?
The %r is best for debugging, and the other formats are for actually displaying variables to users.
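The difference between the display formats and the debugging format is easiest to see side by side. A small sketch, assuming Python 3 (the % operator works the same way there); the values of height and hair are taken from the card above, and the sentence wording is made up.

```python
height = 192
hair = "Dark-blond"

# %s and %d are meant for text shown to users.
display = "He has %s hair and is %d cm tall." % (hair, height)

# %r shows the representation: strings keep their quotes.
debug = "hair=%r height=%r" % (hair, height)

print(display)
print(debug)
```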
Can I make a variable like this: 1 = ‘Zed Shaw’?
No, the 1 is not a valid variable name. Names need to start with a letter or an underscore, so a1 would work, but 1 will not.
How can we insert a number inside “x = “There are %d types of people” ?
We write “% 10”
x = "There are %d types of people." % 10 # here we write a string and insert the value with % outside the string
How can you combine two strings such as:
w = “this is the left side of…“
e = “a string with a right side.” ??
just write; print w + e
How can I round a floating point number?
You can use the round() function like this: round(1.7333) or round(weight / pounds_to_kilo)
Can I use single-quotes or double-quotes to make a string or do they do different things?
In Python either way to make a string is acceptable, although typically you’ll use single-quotes for any short strings like ‘a’ or ‘snow’ and double-quotes for sentences.
Should I use %s or %r for formatting?
You should use %s and only use %r for getting debugging information about something. The %r will give you the “raw programmer’s” version of variable, also known as the “representation.”
Why do you put ‘ (single-quotes) around some strings and not others?
Mostly it’s because of style, but I’ll use a single-quote inside a string that has double-quotes. Look at line 10 to see how I’m doing that.
print “I also said: ‘%s’.” % y
What will; print “.” * 10 do?
It repeats the string 10 times, so the output will be: ..........
If, end1 = “C”
end2 = “h”
end3 = “e”
end4 = “e”
end5 = “s”
end6 = “e”
end7 = “b”
end8 = “u”
end9 = “r”
end10 = “g”
end11 = “e”
end12 = “r”,
what will;
print end1 + end2 + end3 + end4 + end5 + end6
print end7 + end8 + end9 + end10 + end11 + end12
result in?
Just.
Cheese
burger
On two different lines because there is no comma after “end6”.
Why do I have to put quotes around “one” but not around True or False?
That’s because Python recognizes True and False as keywords representing the concept of true and false. If you put quotes around them, then they are turned into strings and won’t work right.
If, end1 = “C”
end2 = “h”
end3 = “e”
end4 = “e”
end5 = “s”
end6 = “e”
end7 = “b”
end8 = “u”
end9 = “r”
end10 = “g”
end11 = “e”
end12 = “r”,
what will;
print end1 + end2 + end3 + end4 + end5 + end6,
print end7 + end8 + end9 + end10 + end11 + end12
result in?
Just.
Cheese burger
On the same line, because the comma after “end6” keeps the next print on the same line in the printed output.
What will; print “Its fleece was white as %s.” % ‘snow’ result in?
Its fleece was white as snow.
If you have; formatter = “%r %r %r %r” , what will happen if you write print formatter % (formatter, formatter, formatter, formatter)
‘%r %r %r %r’ ‘%r %r %r %r’ ‘%r %r %r %r’ ‘%r %r %r %r’
If you have; formatter = “%r %r %r %r” , what will happen if you write;
print formatter % (
“I had this thing.”,
“That you could type up right.”,
“But it didn’t sing.”,
“So I said goodnight.”
)
‘I had this thing.’ ‘That you could type up right.’ “But it didn’t sing.” ‘So I said goodnight.’
I tried putting Chinese (or some other non-ASCII characters) into these strings, but %r prints out weird symbols.
Use %s to print that instead and it’ll work.
Why does %r sometimes print things with single-quotes when I wrote them with double-quotes?
Python is going to print the strings in the most efficient way it can, not replicate exactly the way you wrote them. This is perfectly fine since %r is used for debugging and inspection, so it’s not necessary that it be pretty.
Why do the \n newlines not work when I use %r?
That’s how %r formatting works; it prints it the way you wrote it (or close to it). It’s the “raw” format for debugging.
What are two ways to make a string that goes across multiple lines?
- You can use \n, for example “Jan\nFeb\nMar\nApr\nMay\nJun\nJul\nAug”
- You can use a triple double quote “”” ………… “””
What will making three double-quotes allow you to do? “”” “””
Write a lot of text within the “”” here “”” on multiple different lines
print “””
There’s something going on here.
With the three double-quotes.
We’ll be able to type as much as we like.
Even 4 lines if we want, or 5, or 6.
”””
How do you write some text that is tabbed in?
Like this?
You use \t, so for example
print "\tI'm tabbed in."
How do you write a string of text and then split the text on different lines?
You use \n, so for example…
print "I'm split\non a line."
Will become;
I'm split
on a line.
What will; print "I'm \\ a \\ cat." ; result in?
I'm \ a \ cat.
So each double backslash (\\) is printed as a single backslash
How can you produce a list of points?
What does the escape sequence \\ do?
It inserts a single backslash (\) into the string.
What does the escape sequence \’ do?
It inserts a single-quote (') into the string, which is useful inside a string that is itself written with single-quotes.
What does the escape sequence; \a do?
It is the ASCII bell character. It prints nothing visible; many terminals beep or flash instead.
So;
print "hey man\a whats up"
Becomes
hey man whats up
What does the escape sequence \b do?
It inserts an ASCII backspace character.
What does the escape sequence \n do?
It inserts a new line (ASCII linefeed).
What does the escape sequence \r do?
It inserts an ASCII carriage return, which moves the cursor back to the start of the line.
What does the escape sequence \t do?
It inserts a horizontal tab.
When I use a %r format none of the escape sequences work.
That’s because %r is printing out the raw representation of what you typed, which is going to include the original escape sequences. Use %s instead. Always remember this: %r is for debugging; %s is for displaying.
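The escape sequences from the cards above, and the way %r leaves them visible, can be checked in a few lines. A minimal sketch, assuming Python 3; the strings are the ones used in the earlier cards.

```python
tabbed = "\tI'm tabbed in."
split = "I'm split\non a line."
backslash_cat = "I'm \\ a \\ cat."

print(tabbed)          # starts with a real tab
print(split)           # printed on two lines
print(backslash_cat)   # each \\ shows up as one backslash

# %s displays the text, %r shows the raw representation with the \n visible.
shown = "%s" % split
raw = "%r" % split
print(shown)
print(raw)
```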
What is the function of the comma in this:
print “How old are you?”,
age = raw_input()
the function of the comma is to make sure that the output is on the same line.
With comma we get; How old are you? 23
Without comma we get: How old are you?
23
What is raw_input in python and what is it useful for?
raw_input lets your script interact with the outside world to get input. The raw_input() function waits for the user to type some input and press return. It then returns whatever was typed as a string.
(In Python 3 this function is called input(). Python 2 also has an input() function, but that one tries to evaluate what was typed as Python code, whereas raw_input simply takes the raw input, which is usually easier to work with and safer.)
How can you insert prompts for raw_input requests?
Just write questions or prompts inside the string of the raw_input() like this;
height = raw_input("How tall are you? E.g. 1.82 m ")
How can you look up what any function in python does from within the terminal?
Type pydoc followed by the name of the function in the terminal, for example “pydoc raw_input”. Press q to quit the viewer.
Why would I use %r over %s?
Remember, %r is for debugging and is “raw representation” while %s is for display. I will not answer this question again, so you must memorize this fact. This is the #1 thing people ask repeatedly, and asking the same question over and over means you aren’t taking the time to memorize what you should. Stop now, and finally memorize this fact.
What do you also call a “.py” file?
A script
What are modules?
Modules are files of ready-made Python code whose features you can add to your script with import. Some people also call modules libraries.
One example is the sys module, from which you can import argv (the list of command-line arguments).
A script that unpacks argv into the script name and three variables is run like this:
python filename.py arg1 arg2 arg3
So for example
python e13.py I want snacks
Will give…
The script is called: e13.py
Your first variable is: I
Your second variable is: want
Your third variable is: snacks
What happens if you input fewer arguments for argv when trying to run in the command line?
Python stops with an error because there are too few values to unpack from argv, for example “ValueError: need more than 3 values to unpack”.
What’s the difference between argv and raw_input()?
The difference has to do with where the user is required to give input. If they give your script inputs on the command line, then you use argv. If you want them to input using the keyboard while the script is running, then use raw_input().
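How argv unpacking behaves, including the error for too few arguments, can be sketched without a real command line by passing a list that looks like argv. This assumes Python 3; describe_args is a hypothetical helper written for this illustration, not part of any library.

```python
def describe_args(argv):
    # Unpacking needs exactly four items: the script name plus three arguments.
    script, first, second, third = argv
    return [
        "The script is called: %s" % script,
        "Your first variable is: %s" % first,
        "Your second variable is: %s" % second,
        "Your third variable is: %s" % third,
    ]

# Same values as running: python e13.py I want snacks
lines = describe_args(["e13.py", "I", "want", "snacks"])
for line in lines:
    print(line)

# Too few arguments: the unpacking raises a ValueError.
try:
    describe_args(["e13.py", "I"])
    too_few_error = None
except ValueError as error:
    too_few_error = error
print("Too few arguments:", too_few_error)
```

In a real script you would pass sys.argv (after from sys import argv) instead of a hand-written list.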
How can you use raw_input prompts for multiple places in your script?
Store the prompt text in a variable once, for example prompt = '> ', and pass that variable to every call: raw_input(prompt)
Does txt = open(filename) return the contents of the file?
No, it doesn’t. It actually makes something called a “file object.” You can think of it like an old tape drive that you saw on mainframe computers in the 1950s or even like a DVD player from today. You can move around inside them, and then “read” them, but the file is not the contents
What does from sys import argv mean?
For now, just understand that sys is a package, and this phrase just says to get the argv feature from that package. You’ll learn more about these later.
Why is there no error when we open the file twice?
Python will not restrict you from opening a file more than once, and in fact sometimes this is necessary.
Which command will let you close files?
close()
Which command will let you read files?
read()
Which command will let you open files?
open()
Which command will let you read just one line?
readline()
Which command will let you empty a file of its content? (be careful)
truncate()
Which command will let you write stuff into your file?
write(stuff)
What could we write to open a file? (existing or not)
target = open(filename, "w") # "w" mode creates the file if it does not exist yet
What could we write to open and read a file?
txt = open(filename)
print txt.read()
What does the file mode “r” do in python?
This is the default mode. It opens the file for reading.
What does the file mode “w” do in python?
This mode opens the file for writing.
If the file does not exist, it creates a new file.
If the file exists, it truncates the file.
What does the file mode “x” do in python?
It creates a new file. If the file already exists, the operation fails. (This mode only exists in Python 3.)
What does the file mode “a” do in python?
It opens the file in append mode, so new content is added at the end.
If the file does not exist, it creates a new file.
What does the file mode “t” do in python?
This is the default mode. It opens the file in text mode.
What does the file mode “b” do in python?
It opens the file in binary mode.
What does the file mode “+” do in python?
It opens the file for reading and writing (updating)
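The modes above can be tried safely inside a temporary folder. A minimal sketch, assuming Python 3; the file name demo.txt and the variable names are made up for the illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.txt")  # hypothetical file name

    # "w" creates the file, or empties it if it already exists.
    with open(path, "w") as target:
        target.write("first line\n")

    # "a" keeps the existing content and adds to the end.
    with open(path, "a") as target:
        target.write("second line\n")

    # No mode given means "r": open for reading.
    with open(path) as txt:
        contents = txt.read()

    # "x" refuses to overwrite an existing file.
    try:
        open(path, "x").close()
        x_failed = False
    except FileExistsError:
        x_failed = True

print(contents)
print("x mode failed on existing file:", x_failed)
```

The with statement closes each file automatically, which replaces an explicit close() call.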
How can you write a simple script that will open a file specified in command/terminal?
If you open the file with ‘w’ mode, then do you really need the target.truncate()?
No. It doesn’t make any difference as the “w-mode” will overwrite the existing file so emptying that file first makes no difference.
Does just doing open(filename) open it in ‘r’ (read) mode?
Yes, that’s the default for the open() function.
What can the “import” statement do?
With the import statement we can import new modules to python.
Some modules and functions are already built into Python, but sometimes you need more. It is often smarter to import code written by others than to write everything yourself, and the import statement is how you bring such modules into your script.
Why should you call output.close() when you are done with a file?
It is good practice to always close files. Closing makes sure everything you wrote is saved to disk and releases the file, so later commands do not keep working with a file object you thought you were finished with.
What does the len() function do?
It gets the length of the string that you pass to it and then returns that as a number. Play with it.
What do functions do? (put simply)
- They name pieces of code the way variables name strings and numbers.
- They take arguments the way your scripts take argv.
- Using #1 and #2, they let you make your own “mini-scripts” or “tiny commands.”
Think of “function” as a mini-script
How do you create a function in python?
Write def (for define), then the function name, its arguments in parentheses, and a colon. The indented lines below that belong to the function.
A first version of a function such as print_two can be overly complicated: it takes *args, in the same way as we work with argv, and on the next line unpacks args into arg1 and arg2 before printing them.
There is no need to do this unpacking. We can simply put arg1, arg2 in the function definition from the beginning.
How should you start a function?
- With def (for define)
- Then open parentheses right after the function name
- Then list the arguments, comma separated
- Then close the parentheses AND add a colon:
def print_three(arg1, arg2, arg3):
How should your function name be?
It should only use letters, numbers, AND underscores, and not start with a number, so for example
print_two_again
How many spaces should the lines of code in a function be indented?
4 spaces. No more, no less.
This usually happens automatically in your script-writing program
How do you end a function? (stop adding more to it)
Stop indenting. The first line that is back at the original indentation level is no longer part of the function.
What does it mean to run a function?
The same as to “use” or to “call” a function
What does it mean to call a function?
The same as to “run” or to “use” a function
What does it mean to use a function?
The same as to “run” or to “call” a function
What’s allowed for a function name?
Just like variable names, anything that doesn’t start with a number and is letters, numbers, and underscores will work
What does the * in *args do?
That tells Python to take all the arguments to the function and then put them in args as a tuple (a list-like sequence). It’s like argv that you’ve been using, but for functions. It’s not normally used too often unless specifically needed.
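The two ways of receiving arguments can be compared directly. A short sketch, assuming Python 3; print_two and print_two_again follow the names used in the cards above, while collect is a hypothetical helper added to show what args contains.

```python
def print_two(*args):
    # args holds all positional arguments as a tuple, so it must be unpacked.
    arg1, arg2 = args
    return "arg1: %r, arg2: %r" % (arg1, arg2)

def print_two_again(arg1, arg2):
    # Naming the arguments directly avoids the unpacking step.
    return "arg1: %r, arg2: %r" % (arg1, arg2)

def collect(*args):
    # Hypothetical helper: hands back whatever *args gathered.
    return args

print(print_two("Zed", "Shaw"))
print(print_two_again("Zed", "Shaw"))
print(collect(1, 2, 3))
```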
Are functions and variables connected?
No. The variables inside the functions you write are not connected to the variables in the rest of your script. A function only sees the values you pass to it as arguments.
it will print:
We can just give the function numbers directly:
You have 20 cheeses!
You have 30 boxes of crackers!
Man that’s enough for a party!
Get a blanket.
This is because it first prints “We can just give the function numbers directly:” and then all the content of the function (def) cheese_and_crackers below it.
How should you write a user-input variable so that the result is an integer?
Put raw_input inside int() for integer
For example
eaten_packages = int(raw_input("How many packages did you eat by now? "))
Is there a limit to the number of arguments a function can have?
It depends on the version of Python and the computer you’re on, but it is fairly large. The practical limit, though, is about five arguments before the function becomes annoying to use.
What does f.seek do?
The seek function moves the file's current position (like the cursor in a text editor) to a certain place in the file. For example, we take it to the very beginning with f.seek(0) so we are at the start of the file again
What does += do?
+= adds a value to a variable and assigns the new value to the variable, changing the variable itself in the process (whereas + would not).
It adds the right operand to the left. x += 2 means x = x + 2
What does -= do?
-= subtracts a number from a variable, changing the variable itself in the process (whereas - would not).
It subtracts the right operand from the left. x -= 2 means x = x - 2
What does *= do?
*= multiplies a variable by a number, changing the variable itself in the process (whereas * would not).
It multiplies the left operand by the right. x *= 2 means x = x * 2
What does /= do?
/= divides a variable by a number, changing the variable itself in the process (whereas / would not).
It divides the left operand by the right. x /= 2 means x = x / 2
What does %= do?
%= finds the modulus of a variable and a number, changing the variable itself in the process (whereas % would not). Modulus = the remainder when dividing.
It finds the modulus, so x %= 3 means x = x % 3 (the remainder of x divided by 3)
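The five operators above can be followed step by step on one variable. A minimal sketch, assuming Python 3 (where /= always produces a float); the starting value and the history list are only there for the illustration.

```python
x = 10
history = [x]

x += 2   # x = x + 2
history.append(x)

x -= 4   # x = x - 4
history.append(x)

x *= 3   # x = x * 3
history.append(x)

x /= 4   # x = x / 4, a float in Python 3
history.append(x)

x %= 4   # x = x % 4, the remainder
history.append(x)

print(history)
```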
How does readline() know where each line is?
Inside readline() is code that scans each byte of the file until it finds a \n character, then stops reading the file to return what it found so far. The file f is responsible for maintaining the current position in the file after each readline() call, so that it will keep reading each line.
What does the return statement do/how does it work?
The RETURN statement hands a value back to whoever called the function, so the result can be stored in a variable or used by other functions
The PRINT statement simply displays what’s inside on the screen; the function itself then gives nothing back, so the result cannot be used by other functions
Example explanation:
def returnFunction(num):
    num = num * 2
    return num
def printFunction(num):
    num = num * 2
    print(num)
If we write this in the interactive Python shell…
returnFunction(4)
we get an output of; 8
if we write…
printFunction(4)
we get an output of; 8
So as such they look the same.
But here’s the difference.
x = returnFunction(4)
x # notice this one
8
y = printFunction(4)
8
So the return function gives us a value that is stored in x, but the print function only displays 8 and stores nothing (None) in y, so we cannot use it later.
If we now ask Python for x we will get 8, but if we ask for y we won’t get anything.
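The same comparison can be run as a script instead of in the interactive shell. A small sketch, assuming Python 3; the function names are lowercase variants of the ones in the card, and total is an extra variable added to show a returned value being reused.

```python
def return_function(num):
    # Hands the doubled value back to the caller.
    return num * 2

def print_function(num):
    # Only displays the doubled value; implicitly returns None.
    print(num * 2)

x = return_function(4)
y = print_function(4)

# A returned value can be used in further calculations.
total = return_function(4) + 1

print("x is", x)
print("y is", y)
print("total is", total)
```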
How can I use raw_input() to enter my own values?
Remember int(raw_input())? The problem with that is then you can’t enter floating point, so also try using float(raw_input()) instead.
What does “return” mean?
It means handing a value back from the function to the code that called it, i.e. the function’s “output”.
How do you write comments in python?
Write # and everything after it on that line is ignored by Python. The character is called a HASH or an OCTOTHORPE
But you can still write an octothorpe inside strings like:
print "here we # like to go on Instagram"
How do you print some text that is followed by maths?
Remember to put a comma…
print "Hens", 25 + 5
Will print
Hens 30
Or
print "What is 3 + 2?", 3 + 2
Will print
What is 3 + 2? 5
How do we create a variable in python?
Simply use =
For example
cars = 100
What does %r do?
%r formats any value using its “representation” (what repr() returns), so it works for both a character-variable (normally %s) and a number-variable (normally %d).
print "and with both number %r AND character %r" % (height, hair) # %r handles the number-variable and the character-variable alike
and with both number 192 AND character 'Dark-blond'
How can we repeat a printed string?
Just write an asterisk and the number of times you want to repeat it. For example:
print "." * 20
Will print
....................
How do you write a list of things but on different lines?
Use \n
For example:
months = 'Jan\nFeb\nMar\nApr\nMay\nJun\nJul\nAug'
print "Here are the months: ", months
How can you tab in some printed text?
Use \t
So for example
print """
I'll do a list:
\t* Sugar
\t* Spice
\t* And everything nice
"""
Will print
I'll do a list:
	* Sugar
	* Spice
	* And everything nice
How can you ask users to input information/answer questions?
Use raw_input()
For example…
print "How old are you?",
age = raw_input()
remember the comma after print
What do you need to do before you can use arguments?
Import argv by writing…
from sys import argv
Can you write strings inside raw_input()?
Yes. This is called a prompt, and it helps the user know what to answer
For example
print "What's your birthday? ",
raw_input("DD-MM-YYYY ")
How do you define what your arguments are?
First unpack argv into one variable per argument, such as…
script, filename = argv # the script name and the filename are defined here.
How can we create AND open a file at the same time?
note… you must define filename.
open(filename, "w")
How do you delete the contents of a file?
target.truncate()
How do you write something into a file through your script?
target.write()
How do you close a file?
target.close()
How do you create a line break?
\n
How do we get the number of bytes a file is long?
len() applied to the contents you have read from the file
How do we open and read a file in the same line of code?
note you must define from_file
open(from_file, "r").read()
How do you tell python that there is a function inside a certain file?
First you open the command line and run “python”
Then you “import file1” (whatever file it is, without the .py ending)
And then tell python to run a certain function within “file1” by adding “dot” and the function’s name. For example:
e25.print_last_word(sorted_words)
How do you incorporate help statements in your python functions and later display them in command?
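Put a documentation string (a triple-quoted string) as the first line inside each function. After importing the file in the Python shell, help() on the imported module displays them, for example: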
break_words(stuff)
This function will break up word for us.
print_first_and_last(sentence)
Prints the first and last words of the sentence.
print_first_and_last_sorted(sentence)
Sorts the words then prints the first and last one.
print_first_word(words)
Prints the first word after popping it off.
print_last_word(words)
Prints the last word after popping it off.
sort_sentence(sentence)
Takes in a full sentence and returns the sorted words.
sort_words(words)
Sorts the words.
What is a list?
lists
- consist of a countable number of ordered values
- fruits = ['orange', 'apple', 'pear', 'banana', 'kiwi', 'apple', 'banana']
What is a tuple?
- Immutable: Cannot be changed
- The advantage is that you will know the position of the elements better and thus also make it easier for the computer to locate elements: provides for speed
- You use it when you know that a variable will not change (e.g. days of the week)
- Values are separated by commas
- v = ('a', 'b', 'c')
What is a set?
sets
- unordered collection
- NO duplicate elements
- basket = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}
What is a dictionary?
dictionary
- a set of key: value pairs in which each key is unique
- tel = {'jack': 4098, 'john': 4139}
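The four collection types from the cards above behave differently when you try to change them or store duplicates. A minimal sketch, assuming Python 3; the data comes from the cards, and the extra item 'grape' and the flag tuple_is_immutable are made up for the illustration.

```python
# List: ordered, allows duplicates, can be changed.
fruits = ['orange', 'apple', 'pear', 'banana', 'kiwi', 'apple', 'banana']
fruits.append('grape')

# Tuple: ordered, but cannot be changed after creation.
v = ('a', 'b', 'c')
try:
    v[0] = 'z'
    tuple_is_immutable = False
except TypeError:
    tuple_is_immutable = True

# Set: unordered, duplicates are dropped automatically.
basket = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}

# Dictionary: look up a value by its unique key.
tel = {'jack': 4098, 'john': 4139}

print(len(fruits), fruits.count('apple'))
print(tuple_is_immutable)
print(len(basket))
print(tel['jack'])
```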
What are classes and objects/instances?
Classes are used to create new user-defined data structures that hold information about something. A class is generic: it is the blueprint.
- Class: Dog
- species = 'mammal'
An object/instance is built from the class with actual values (thus you have 1 class and multiple instances). An object/instance is specific.
- Instance/object: created by calling the class, which runs __init__ to store the actual values:
- def __init__(self, name, age):
-     self.name = name
-     self.age = age
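Put together, the Dog class and two instances could look like this. A sketch only, assuming Python 3; the dog names and ages are invented for the illustration.

```python
class Dog:
    # Class attribute: shared by every instance.
    species = 'mammal'

    def __init__(self, name, age):
        # Instance attributes: specific to each object.
        self.name = name
        self.age = age

# One class, several instances with their own values (hypothetical dogs).
philo = Dog('Philo', 5)
mikey = Dog('Mikey', 6)

print(philo.name, philo.age, philo.species)
print(mikey.name, mikey.age, mikey.species)
```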
Will these two turn out the same results?
What is the difference between for- and while loops?
- With for-loops you know in advance the number of times the loop will execute. With while-loops this is unknown, as the loop executes as long as the Boolean condition is True
- Use a for-loop for definite iterations = if you know the maximum number of times that you need to execute the code
- Use a while-loop if you require to repeat some computation until a condition is met that you cannot calculate in advance (needed for probabilistic results and human intervention)
what does a break statement do?
a break statement is used to immediately finish a loop.
(this is considered bad practice though)
what does a continue statement do?
a continue statement skips the rest of the code in the current iteration of a loop, BUT the loop then carries on with the next iteration
(this is considered bad practice though)
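The loop cards above can be illustrated with three tiny loops. A minimal sketch, assuming Python 3; the numbers and variable names are chosen only for the illustration.

```python
# continue: skip the odd numbers, keep looping.
evens = []
for i in range(1, 10):
    if i % 2 != 0:
        continue
    evens.append(i)

# break: leave the loop at the first number whose square exceeds 30.
first_big = None
for i in range(1, 10):
    if i * i > 30:
        first_big = i
        break

# while: repeat until a condition is met; the number of rounds is not fixed up front.
total = 0
rounds = 0
while total < 20:
    rounds += 1
    total += rounds

print(evens)
print(first_big)
print(rounds, total)
```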
What does “syntax error” refer to?
- Missing punctuation characters, such as parentheses, quotation marks or commas
What does “TypeError” refer to?
- When you try to combine two incompatible items, for example adding a string and a number
What does “NameError” refer to?
- Normally when you have used a variable before assigning it a value
What does “ValueError” refer to?
- When the value passed to a function has the right type but is not compatible with the function.
- For example, a string that does not contain a number is passed to int() to be converted to an integer
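Each of the four error types can be triggered on purpose and caught. A small sketch, assuming Python 3; error_name is a hypothetical helper written for this illustration, and the failing expressions are simple made-up examples.

```python
def error_name(action):
    # Hypothetical helper: run the action and report which error it raised.
    try:
        action()
    except Exception as error:
        return type(error).__name__
    return None

# Missing closing parenthesis in the code being compiled.
syntax = error_name(lambda: compile("print('hi'", "<example>", "exec"))

# Combining a string and a number.
type_err = error_name(lambda: "2" + 2)

# Using a name that was never assigned.
name_err = error_name(lambda: undefined_name)

# Right type (a string), but not a value int() can convert.
value_err = error_name(lambda: int("ten"))

print(syntax, type_err, name_err, value_err)
```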
What is the standard format to ask for help in programming communities?
TIRTBV
- Title
- Introduce
- Reproduce (others should be able to reproduce)
- Tags
- Background
- Version
How can you take the sqrt of numbers in python?
Import the math module and call math.sqrt(), for example math.sqrt(16) gives 4.0. Raising a number to the power 0.5 (16 ** 0.5) works too.
What is the difference between “a = 2” and “a == 2”?
a = 2 is defining a as 2
a == 2 is asking whether a is equal to 2 or not
What are 3 good rules for naming variables?
- Names can not start with a number
- Use _ where you would otherwise want a space
- Do not use capital letters (best practice is lowercase)
What are the 3 types of variables?
- Float
- Integer
- Boolean
How do you make addition, subtraction, multiplication, division, power functions and modulo?
- Addition (i.e. +)
- Subtraction (i.e. -)
- Multiplication (i.e. *)
- Division (i.e. /)
- Power functions (i.e. **)
- Modulo (i.e. %)
What is a possible reason why (0.1+0.2-0.3) equals 5.551115123125783e-17 instead of 0?
Computers store floating point numbers in binary, and numbers such as 0.1, 0.2 and 0.3 cannot be represented exactly, so a tiny rounding error is left over.
To turn it into 0 write: round(0.1+0.2-0.3)
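The rounding error and two ways of dealing with it can be checked directly. A minimal sketch, assuming Python 3; math.isclose is part of the standard library, and the variable names are illustrative.

```python
import math

difference = 0.1 + 0.2 - 0.3
print(difference)  # a tiny number, not exactly 0

# Option 1: round the result to a sensible number of decimals.
rounded = round(difference, 10)

# Option 2: compare with a tolerance instead of using ==.
exactly_equal = (0.1 + 0.2 == 0.3)
close_enough = math.isclose(0.1 + 0.2, 0.3)

print(rounded, exactly_equal, close_enough)
```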
How do you get the number of characters in a string?
Count number of characters in “Hello world”
my_text = ‘Hello world’
len(my_text)
11
What will….
my_text = ‘Hello world’
print(my_text[7])
return?
o
H = 0, e = 1, l = 2, l = 3, o = 4, “ “ = 5, w = 6, o = 7
If, my_text = ‘Hello world’, how can you write it out as both capital letters, small-case letters and as two different strings?
data:image/s3,"s3://crabby-images/71748/71748c9320919a81fac014f8187664c5c58aed78" alt=""
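One possible way to do this, as a sketch using the built-in string methods upper(), lower() and split(). The result names are only for illustration.

```python
my_text = 'Hello world'

upper_text = my_text.upper()   # all capital letters
lower_text = my_text.lower()   # all lowercase letters
words = my_text.split()        # split on whitespace into a list of strings

print(upper_text, lower_text, words)
```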
How can you reverse the order of a list?
data:image/s3,"s3://crabby-images/1f157/1f157b29a856b2ac3acc8687ed8701eb94f4a316" alt=""
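A sketch of three common options. The list and the variable names are assumptions made for the example.

```python
my_list = [1, 2, 3, 4]

reversed_copy = my_list[::-1]            # slicing returns a new reversed list
also_reversed = list(reversed(my_list))  # reversed() returns an iterator

my_list.reverse()                        # reverses the list in place, returns None

print(reversed_copy, also_reversed, my_list)
```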
If…
a = {‘a’,’b’,’c’,’d’}
b = {‘b’,’c’,’e’}
find all the elements that are in a or b (the union). Result = {‘a’, ‘b’, ‘c’, ‘d’, ‘e’}
a | b
If.. my_list = [1,2,3,4,2,2,2,2,5,5,5,5,9,9,9] how do we find the number of unique elements and how do we create set of these unique numbers?
len(set(my_list))
6
set(my_list)
{1, 2, 3, 4, 5, 9}
If…
a = {1, 2, 3}
b = {2, 3, 4}
how do you find the symmetric difference and what does it mean?
symmetric difference is everything that is not in the intersection, everything that’s not in common
print(a ^ b)
{1, 4}
What will…
for i in range(1, 10):
    if i % 2 == 0:
        print(i)
return?
2
4
6
8
You can do this much more efficiently, because this version will not spend time on the odd numbers…
for i in range(2, 10, 2):
    print(i)
range(A, B, C): A is the initial value, B is how far it should go (B itself is not included) and C is the step size
(2, 10, 2) will turn out as 2, 4, 6, 8 (but not 10)
How can you print all multiplication tables from 1 to 10? (nested for-loops)
for x in range(1, 11, 1):
    for y in range(1, 11, 1):
        print((x*y), end=" ")
    print()
List comprehension:…. number = [1, 2, 3, 4], return their square value when > 8
list_Numbers = [1, 2, 3, 4]
l = [i**2 for i in list_Numbers if i**2 > 8]
l
with list comprehension you create a NEW list so you have [] around it
and then here I have defined the new list containing 9, 16 as “l”
you can add the new list to an existing list with .append (it is added as ONE nested element; use .extend to add the items one by one)
list_Numbers.append(l)
list_Numbers
[1, 2, 3, 4, [9, 16]]
What do while-loops do?
A while-loop executes an unknown number of times, as long as its Boolean condition is True
data:image/s3,"s3://crabby-images/cebdf/cebdfcf3166b244e7b888a76205a0ffe97a2ba90" alt=""
data:image/s3,"s3://crabby-images/c68e1/c68e17eec28c20b7db4678a080e2b7cec3a18b55" alt=""
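A minimal while-loop sketch. The names count and values are hypothetical.

```python
count = 0
values = []

# the loop body repeats until the condition becomes False
while count < 3:
    values.append(count)
    count += 1  # without this update the loop would never end

print(values)
```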
Write a loop that prints string characters one by one
data:image/s3,"s3://crabby-images/e9735/e9735ffa14d44913bf23bed36b919e98b91eefd2" alt=""
How can you…x^2: x in {0,1,2,…10}?
data:image/s3,"s3://crabby-images/dbf2b/dbf2b92ad4d57979a233557c1734d438b5dd4268" alt=""
Implement x^4 with two nested comprehensions (x^2… then x^2 again)
data:image/s3,"s3://crabby-images/6710c/6710cccfb9c59398d72718e2693643755f115c3d" alt=""
Create a function that prints a hello and a name of the person
data:image/s3,"s3://crabby-images/99ada/99ada9772268d4bc44a0b54b32900b2f39fcda69" alt=""
Write a program that outputs the first recurring character in a string.
myString = “ABCDEBC”
data:image/s3,"s3://crabby-images/e513d/e513dd76e1057cb7bfa73e484ba3d55c2f82bc82" alt=""
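One way this could be solved, as a sketch that remembers the characters already seen in a set. The function name first_recurring is an assumption, not necessarily what the original answer used.

```python
def first_recurring(text):
    """Return the first character that appears a second time, or None."""
    seen = set()
    for char in text:
        if char in seen:
            return char
        seen.add(char)
    return None

my_string = "ABCDEBC"
result = first_recurring(my_string)
print(result)
```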
What is the standard way to define a function in python (5 elements)?
data:image/s3,"s3://crabby-images/dd02e/dd02e8359461eab808a4a281c21bf65308153667" alt=""
How can you adequately open a python file and read it?
with open('data.txt', 'r') as f:
    for line in f:
        print(line)
this will read the file line by line
you can also do it this way….
with open('data.txt', 'r') as f:
    data = f.read()
however, this reads the whole file into one single string, so if there are no line breaks in the file you will get one looooooong line
How can you cope with consecutive errors in your program?
With try and except
try:
    result = 10 / 0   # run the code that might fail here
except ZeroDivisionError:
    result = None     # if it does not work, do this instead
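How do you close a file?
call .close() on the file object, for example f.close(). A file opened with “with open(...) as f:” is closed automatically when the block ends.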
data:image/s3,"s3://crabby-images/6b876/6b8766f0087521a25e1d1fafcc611aa90ce79465" alt=""
How do you search for files that match a particular pattern, such as ‘.csv’?
You can use…
endswith()
startswith()
fnmatch.fnmatch()
data:image/s3,"s3://crabby-images/b201b/b201b73b9bc269237bf366058526ab16275bafa6" alt=""
Retrieve all files and directories starting with ‘Py’
data:image/s3,"s3://crabby-images/4a458/4a45894ded34f1a284c9d42713ec7be451fd9454" alt=""
What is the “finally” clause used for?
data:image/s3,"s3://crabby-images/76332/76332d8aba4d01bbd0820b38f1483f1f938c94a7" alt=""
What is NumPy?
NumPy is a Python module. It stands for ’Numerical Python’. It is a library consisting of multidimensional array objects and a collection of functions to process them.
What is an array?
- An array is a data structure consisting of a collection of values, each one identified by at least one non-negative index (it can have more than one dimension, e.g. matrices)
- all values have the same type (int, float….)
- (array = objects, ndarray = class name of these objects)
- each element within this is a “dtype” (data-type) element in NumPy
data:image/s3,"s3://crabby-images/38630/38630e17d06d75495210fbb8726a9b0ef485f5ae" alt=""
What are 5 great rules for writing clean code?
- clear variable and function names
- consistency in styling; ‘ vs “
- Duplication vs. abstraction
- Group like items in your code so that it is more reusable
- Break long program into different files
What’s the purpose of R and Python respectively?
- R
- Focuses on better, user friendly data analysis, statistics and graphical models
- Academics, researchers, data scientists
- Python
- Object based and emphasizes productivity and code reliability
- Programmers, data scientists
How do you convert a list into ndarray?
import numpy as np
x = [1, 2, 3]
a = np.asarray(x)
array([1, 2, 3])
or…
a = np.asarray(x, dtype = float)
array([1., 2., 3.])
What are the parameters in array slicing?
[start:stop:step]
data:image/s3,"s3://crabby-images/1e7b4/1e7b45805ff0644a405bb81d3c8d0f1401be8697" alt=""
How do you slice ndarrays with more than one dimension?
data:image/s3,"s3://crabby-images/61904/619048eef6b4eed233502c19ce2537bfe9a72ce3" alt=""
How do you multiply two arrays element wise in NumPy?
np.multiply(a,b). Multiply two arrays element wise
How do you perform a matrix multiplication in NumPy?
np.matmul(a,b). Performs a matrix multiplication
How do you retrieve the largest element in NumPy?
np.amax(a). Retrieve the largest element
What is a Series in pandas?
A Series is a one-dimensional labeled array and can be created using the following code
data:image/s3,"s3://crabby-images/e7050/e7050e3819e2c0886822599058a288cfc1eed03f" alt=""
What is the difference between syntax errors and exceptions?
- Syntax errors occur when the parser detects an incorrect statement
- An exception error occurs whenever syntactically correct python code results in an error
data:image/s3,"s3://crabby-images/923ba/923ba0f4b8cb10178e1a5694a996ff38368e5e05" alt=""
How can you insert an exception?
data:image/s3,"s3://crabby-images/fe1ab/fe1ab84be584707fc71b94cedb0fdf5fbc238da6" alt=""
What are assertions and how can they be used?
Assertions are a systematic way to check that the inputs in a program are what the program expects, so that you avoid running into exceptions later on.
Often useful for:
- checking parameter types, classes or values,
- checking “can’t happen” situations such as dividing by 0
- after calling a function to make sure that its return is reasonable
assert condition, error_message
data:image/s3,"s3://crabby-images/9e4de/9e4debaac2ab4e23eaba98e46e1afc05fe59100e" alt=""
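A short sketch of the “assert condition, error_message” form, checking both the input and the return value. The function average and the messages are hypothetical.

```python
def average(values):
    # check the input before using it
    assert len(values) > 0, "values must not be empty"
    result = sum(values) / len(values)
    # check that the return value is reasonable
    assert min(values) <= result <= max(values), "average out of range"
    return result

print(average([2, 4, 6]))

try:
    average([])
except AssertionError as error:
    message = str(error)
    print("Assertion failed:", message)
```

Keep in mind that assertions can be switched off (python -O), so they are a debugging aid and not a replacement for proper error handling.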
What is the “finally” clause for in python?
To implement some sort of action to clean up after executing a code.
data:image/s3,"s3://crabby-images/78b8a/78b8a61860a933b15c3a9d409e126d33dd4ec893" alt=""
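A small sketch showing that the finally block runs both when the code works and when it fails. The function divide and the list log are made up for the example.

```python
log = []

def divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        log.append("division by zero")
        return None
    finally:
        # runs whether or not an exception was raised
        log.append("cleanup")

print(divide(10, 2), divide(1, 0), log)
```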
What is a DataFrame in pandas?
data:image/s3,"s3://crabby-images/a2c92/a2c92c2eda6a2ba7c54f1d8582537a620065744a" alt=""
How do you remove a column in pandas?
df = df.drop(columns = ['ColumnName1', 'ColumnName2'])
What does “applymap()” do in pandas?
The function applymap() applies a function to each element of a DataFrame (two-dimensional data structure)
apply() # by default (axis = 0) applies the function to each column
and
apply(______, axis = 1) # applies the function to each row
What is “nan” in pandas?
nan = Not a Number
and is the value pandas uses to mark missing (null) values in a DataFrame.
To detect missing values in your dataframe you can use the isnull() and notnull() functions.
How do the different merge operations in Pandas look visually?
- Natural join
- full outer join
- left outer join
- right outer join
data:image/s3,"s3://crabby-images/6c5ad/6c5ad449a4c0fff5ca55cbfc3e23ab1eb62acdc2" alt=""
What will….
for i in range(2, 10, 2):
print(i)
result in?
data:image/s3,"s3://crabby-images/79a48/79a48b11e6746c68f4fd10de3d6750b6b4ca8ccd" alt=""
data:image/s3,"s3://crabby-images/0c85f/0c85f5b867cfe6c529a22c8ce5a47be9d84d6f19" alt=""
you can also rename the function at the same time…
from fibo import fib_print as fibonacci
How can you Retrieve all files and directories starting with ‘Py’?
data:image/s3,"s3://crabby-images/13fb3/13fb39e59c37ee648b04aa2b968bc38b9a99d963" alt=""
How can you Retrieve all .py files?
data:image/s3,"s3://crabby-images/95901/959017d6d71ae1fe2fda3e79be3f673ccf953f65" alt=""
How could you study the efficiency of your code?
Use the “time()” function from python’s time module to check the time it takes for your code to run.
data:image/s3,"s3://crabby-images/e768a/e768a202ea1dc20dbfdb0002fd95d56d3dd95375" alt=""
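A minimal timing sketch. It uses time.perf_counter() instead of time.time(), which is a swapped-in choice here because it is meant for measuring short durations. The workload is just a placeholder.

```python
import time

start = time.perf_counter()             # high-resolution timer
total = sum(i * i for i in range(100_000))
elapsed = time.perf_counter() - start   # seconds spent in the code above

print(f"took {elapsed:.6f} seconds")
```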
How does a NumPy array of shape (3, 7) look AND how can you make it?
NumPy takes rows FIRST and columns SECOND, so this is 3 rows and 7 columns.
data:image/s3,"s3://crabby-images/0afdf/0afdf64f0f07d5f8e5af392465882b7bbbd338ff" alt=""
What will happen if you try to create an array from the following python list… x = [(1, 2, 3), (4, 5)]?
it basically fails as the second tuple only contains two elements, so the rows do not have the same length.
data:image/s3,"s3://crabby-images/e5e83/e5e83b79b343e0b383557e9398f0a4255eb4cbe2" alt=""
Slicing multidimensional array (NumPy). How would you get [5, 6] from the following array?
data:image/s3,"s3://crabby-images/c6717/c6717dfcdea2c0f3474294bed6fc56ea3180e03f" alt=""
a[-1][-2:]
returns… array ([5, 6])
why?
you basically have…
data:image/s3,"s3://crabby-images/a125a/a125a68a4d674f50bd8885ccd6dfcdf0718628cc" alt=""
How can you define an array with values from 0 to 100 and filter out odd numbers?
y = np.arange(101)
y[y%2==0]
array([0, 2, 4, 6, 8, 10, 12, ………., 100])
note that arange only has one r.
What will np.arange(3)+5 return?
this is broadcasting…
we have an array with 3 elements starting from 0 and we add 5.
array([5, 6, 7])
What will np.ones((3, 3)) + np.arange(3) return?
this is broadcasting…
Here we first create np.ones((3,3)) which is
1, 1, 1
1, 1, 1
1, 1, 1
and then np.arange(3), which is 0, 1, 2, is added to every row, so the columns increase by 0, then 1, then 2.
data:image/s3,"s3://crabby-images/aaacb/aaacb9dd0e144fd4128f7e1e0cd31cf07add6a69" alt=""
What will np.arange(3).reshape((3, 1))+np.arange(3) return?
this is broadcasting…
np.arange(3) is..
0, 1, 2
BUT…. it is reshaped to 3 rows, 1 column… so…
0
1
2
and then we add another np.arange(3) to it…
0, 1, 2
and combine to…
0, 1, 2
1, 2, 3
2, 3, 4
data:image/s3,"s3://crabby-images/549ac/549ac5e9617bcaf241d62862292e4f94a46addcf" alt=""
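The three broadcasting cards above can be checked with a sketch like this (it assumes NumPy is installed; the names a, b and c are only for illustration).

```python
import numpy as np

# scalar broadcast: 5 is added to every element
a = np.arange(3) + 5

# row broadcast: [0, 1, 2] is added to each row of the 3x3 array
b = np.ones((3, 3)) + np.arange(3)

# a (3, 1) column plus a (3,) row broadcasts to a full 3x3 grid
c = np.arange(3).reshape((3, 1)) + np.arange(3)

print(a, b, c, sep="\n")
```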
How can you edit an array and raise all numbers to the power of 2?
say…
a = np.array([1, 2, 3, 4])
for x in np.nditer(a):
    print(x*x)
1, 4, 9, 16
note that you use “nditer”, and that this only prints the squares. To actually change the array, write a = a ** 2
What are three classic functions used to manipulate ndarray elements?
- # transpose; permutes the dimensions of an array
- # concatenate; joins a sequence of arrays along an existing axis
- # append; appends the values to the end of an array
What will happen if you use np.transpose(a) on a = np.array([[1, 2], [3, 4]])?
it starts as..
array([[1, 2],
[3, 4]])
and transpose will change the dimensions of the array (flip rows and columns) so after np.transpose(a) we have…
array([[1, 3],
[2, 4]])
How do you join two different arrays by rows and by columns respectively?
say you have..
a = np.array([[2, 2], [3, 3]])
b = np.array([[4, 4], [6, 6]])
by rows you would….
np.append(a, b, axis = 0) #axis 0 for rows
result…
array([[2, 2],
[3, 3],
[4, 4],
[6, 6]])
and for columns…
np.append(a, b, axis = 1) #axis 1 for columns.
result….
array([[2, 2, 4, 4],
[3, 3, 6, 6]])
How can you create an array of ALL zeros?
a = np.zeros((3, 3))
print(a)
[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
(the zeros are floats by default)
How can you create an array that has “42” in each element?
x = np.full((3, 3), 42)
print(x)
[[42 42 42]
 [42 42 42]
 [42 42 42]]
How can you create an array with random values?
e = np.random.random((2, 2))
if we have x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), what will… x[-3: 3: -1] return?
first we find the start index -3, which is the number 7
then we go backwards until index 3 (the number 3), which is NOT included because the stop index is exclusive
and the values come out in OPPOSITE order because the step is -1
array([7, 6, 5, 4])
What is the key difference to remember between NumPy and Pandas?
- NumPy = First, ROW, Second, COLUMN
- Pandas = First, COLUMN, Second ROW
How can you select multiple rows in Pandas?
Multiple rows can be selected using the ’:’ operator as in numpy arrays
df[1:3]
just remember that in pandas, square brackets with a name, such as df['col'], pick a COLUMN, while a slice such as df[1:3] picks rows by position
How can you get information about a file in pandas?
if the file titanic.csv has been loaded into a DataFrame called titanic, e.g. titanic = pd.read_csv('titanic.csv')
titanic.info()
will return
data:image/s3,"s3://crabby-images/98626/9862648e1de269a631837839030eb0334a2657eb" alt=""
How can you get descriptive statistics?
if the file titanic.csv has been loaded into a DataFrame called titanic
titanic.describe()
data:image/s3,"s3://crabby-images/72bc8/72bc85ab0fdeae902f647691d9ad08f5b3385bac" alt=""
What are the components in making a histogram?
- titanic.hist(
- bins = # the number of bars (a lower number gives wider bars)
- figsize=(16, 9) # the width and height of the figure
- grid=False #whether you want a grid or not
- plt.show() #and then show the histogram
data:image/s3,"s3://crabby-images/6b237/6b237edb47de23397d1d5908b9dc371be610fac4" alt=""
How do you make a histogram of the data within a specific column?
titanic.Age.hist(bins = 50, figsize=(12, 6), grid=True)
plt.xlabel('Age')
plt.ylabel('Number of people')
plt.show()
data:image/s3,"s3://crabby-images/ad0cc/ad0cc5ff6c3753e3d47433afa0cd1fced2eb8f6f" alt=""
Is NaN == NaN?
No…
NaN == NaN is False
NaN is special in that it doesn’t have a real value, so comparing it to itself doesn’t return true. Essentially, NaN is equal to nothing, not even NaN
Because of this, the only safe way to tell whether or not a value is missing in a DataFrame is by using the isnull() function. Indeed, this function can be used to filter rows with missing values;
edu[edu[“Value”].isnull()]
What can be said about columns and rows in pandas?
- COLUMNS
- are Series, which consists of several values, where each value has an index
- ROWS
- have a specific index to access them, which can be any name or value
If we just define a series as, c = pd.Series([1956, 1967, 1989, 2000]), what will the index be?
when not specifying the index, they will simply be a sequence of integers starting from 0.
0, 1, 2, 3 in this case
data:image/s3,"s3://crabby-images/a757d/a757d8ab2acf141fe0c121afbf1c340edb4437de" alt=""
How can we specify the index for the following series, c = pd.Series([1956, 1967, 1989, 2000])?
data:image/s3,"s3://crabby-images/44653/446539f054787bab6a1758330ba966d6811b487f" alt=""
What are the ‘head’ and ‘tail’ methods for?
head() shows the first rows (5 by default) of a dataset so we can see how the data looks.
If you insert a number in head(), you will get that number of rows instead of the default 5 rows.
The tail() method is the opposite in the way that it gives you the LAST 5 rows.
data:image/s3,"s3://crabby-images/97d14/97d142a1e037eddb1508257c5def94a89916c0bd" alt=""
How can you get the names of the columns and rows in a DataFrame?
filename.columns
and
filename.index
data:image/s3,"s3://crabby-images/b2f40/b2f401418838677767dae0b89f795f5e3de9426b" alt=""
What is the difference between edu[10:14] and edu.loc[10:14]?
edu[10:14] will provide the rows from 10th to 13th position BUT this does not take your index labels into account. Only the position.
edu.loc[10:14] will return the values of all the rows between the row labeled 10 and the row labeled 14 (both included). This could be 10,11,12,13,14 but it could just as well be only one value OR it could be many values. It could also include a row which is indexed 100 if that row has been placed between the rows labeled 10 and 14.
what is the difference between writing edu.max(axis=0) and edu.max(axis=1)?
axis = 0 finds the maximum value for each column (finding max within the rows for each column)
while
axis = 1 finds the maximum value for each row (finding max across the columns for each row)
data:image/s3,"s3://crabby-images/7526e/7526e9ad2752d876b9a9be09a5341f96a1a7ca78" alt=""
What is the difference between pandas’ and Python’s interpretation of NaN?
In Python, NaN values propagate through all operations without raising an exception. In contrast, pandas operations exclude NaN values, treating them as missing data. For example, the pandas max function skips NaN values, while the standard Python max function does not skip them and can return NaN as the maximum:
data:image/s3,"s3://crabby-images/751ed/751edd8ff38d808d1829fedb9522104b3a1a792f" alt=""
What is a lambda function?
It is essentially just a function without a name.
It is written in-line.
For example…
data:image/s3,"s3://crabby-images/08b02/08b025e5499fe3b39ef1579e44b9ff0be84470dd" alt=""
What is the following doing?
edu[“ValueNorm”] = edu[‘Value’]/edu[‘Value’].max()
edu.tail()
we are defining a new column called ‘ValueNorm’ by taking ‘Value’/the highest number in the ‘Value’ column
lastly we ask to see the 5 last rows of the updated DataFrame using tail()
data:image/s3,"s3://crabby-images/265be/265bea540311b29fbac6ef08cafdbefa244aa245" alt=""
How can you insert a new row at the bottom of a DataFrame?
using the append function, which receives as argument the new row represented as a dictionary where the keys are the name of the existing columns and the values the associated values.
It is important to set the ignore_index flag to True, otherwise the index 0 is given to this new row, which will produce an error if it already exists (may potentially overwrite if not set to True). Note that DataFrame.append was removed in pandas 2.0; there, pd.concat does the same job.
data:image/s3,"s3://crabby-images/31fae/31fae03af08843a0d87a79b021b88ad156d664ee" alt=""
How can you easily delete the last row in a Dataframe?
simply use the drop and max functions. Why max? because then we will be taking the last row if the index is ordered at least.
data:image/s3,"s3://crabby-images/f19c0/f19c069a3560f86cfd93e272f0581e38f14f09a1" alt=""
How can we fill in NaN values?
using fillna()
eduFilled = edu.fillna(value={‘Value’:0})
this will fill in the NaN with 0, but be careful: doing so will change the statistics of your data (for example the mean) when using pandas.
data:image/s3,"s3://crabby-images/2097a/2097ace32763d25614127f36d1a394f89dbf136f" alt=""
What will the following give us?
edu.sort_values(by='Value', ascending=False, inplace=True)
edu.head()
here we are using the sort function to sort the data by ‘Value’ and NOT in ascending order (so it must be descending order), meaning that we will get the 5 highest values as we use edu.head(), which by default gives 5 rows.
data:image/s3,"s3://crabby-images/21890/2189014193c239db800051e214c8dbfe04462d8d" alt=""
What will the following give us?
data:image/s3,"s3://crabby-images/aae5a/aae5a45407d84b9a0b716e3e4a8f757fd787dd4a" alt=""
here we are using groupby to group some data.
we are clearly interested in GEO (countries) and the values.
We are grouping by Geo (country) and asking for the mean of values for each country.
next, we ask pandas to sort this data by values in a descending order.
data:image/s3,"s3://crabby-images/d6486/d6486f1de8d1f9625b75c1ad5c08dfe48e161038" alt=""
What is the following doing?
data:image/s3,"s3://crabby-images/f2396/f2396e1ca078c57b4aabc815450b57712bdfcb50" alt=""
- first we are filtering the data with a boolean expression asking for data dated later than 2005 from the TIME column
- secondly, we use the pd.pivot_table function to rearrange our data such that the index/rows become GEO/countries and columns will be equal to time (the time values newer than 2005 as we are using the filtered data)
- lastly we ask to see the first 10 rows
data:image/s3,"s3://crabby-images/c160b/c160b382b154a38513e8ad3e59c1f34da6ee61f3" alt=""
Say we have x = 9, y = 3,….
then we write x + 10
and python returns 19
How do we get the previous result + y?
obviously we could just write 19 + y and get 22
we could also press the arrow key up in command and pick up x + 10 and add + y to get 22
the most correct and simplest way though is:
_ + y #underscore + y (in the interactive interpreter, _ holds the last result)
22
say
name = ‘Youtube’
name
name[1:4]
‘out’
say you have a list of numbers
numbers = [33, 44, 55, 66, 77, 88]
data:image/s3,"s3://crabby-images/6d678/6d678731e2ece7bcbca782e24b8729564c5487eb" alt=""
what are the two ways that we can remove 44 from the list?
[33, 44, 55, 66, 77, 79, 88, 79]
data:image/s3,"s3://crabby-images/d78aa/d78aa1c166d90dd91cfda6ec9943566955e2bd0a" alt=""
how can you add multiple values to an existing list?
data:image/s3,"s3://crabby-images/7cb71/7cb71dbceb69235fad219d5a9503ffcd29f9c19d" alt=""
How do you define a list, a set and how do you define a tuple?
- list =[]
- most flexible and can be both numbers and strings
- set = {1, 2, 3} (an empty {} creates a dictionary, so use set() for an empty set)
- like list BUT… will not maintain sequence and does not support duplicate values
- tuple = ()
- tuples you cannot change the values the same way as lists. Thus, use only tuples for things that are fixed.
- tuples are useful though, because they are faster for python to work with
What are the different data types in Python?
- None
- Numeric
- Int, float, complex, bool
- List []
- Tuple ()
- set {}
- string “”
- range (2,10,2)
- Dictionary
If x = 2, what will be the result of x *= 3?
2 * 3 = 6
If x = 2, what will be the result of x **= 3?
2^3 = 8
Why is bin(25) = ‘0b11001’ for a computer?
computers think binary.. (the 0b in front just marks a binary number)
11001
1*(2^4) + 1*(2^3) + 0*(2^2) + 0*(2^1) + 1*(2^0) = 16 + 8 + 0 + 0 + 1 = 25
How do you swap the values of the two variables below?
a = 5
b = 6
data:image/s3,"s3://crabby-images/af87c/af87c64cdb41384d82738063ccad4bca9e46b284" alt=""
How do you swap the values of the
two variables below in the easiest way?
c = 10
d = 15
data:image/s3,"s3://crabby-images/ea3a1/ea3a17c289cef5bcaed57738a26bdb2564578e5a" alt=""
What are python’s bitwise operators?
- complement (~) (‘tilde’ symbol)
- and (&)
- Or (|)
- XOR (^)
- Left shift (<<)
- right shift (>>)
Why is 10 << 2 returning 40?
because…
10 is in binary…. 1010
meaning that it is essentially
1010.000000000 (infinite number of 0s after the binary point)
now we shift 2 to the left so we get
101000.00000
and thus 101000
1*32 + 0*16 + 1*8 + 0*4 + 0*2 + 0*1 = 40
why is 100 >> 3 returning 12?
because…
100 is in binary 1100100
meaning that it is essentially
1100100.000000(all the 0 in the world if needed)
now we shift 3 to the right so it is now..
1100.100 and, dropping everything after the point, 1100
1*8 + 1*4 + 0*2 + 0*1 = 12
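The two shift cards can be double-checked with a short sketch. The names left and right are only for illustration.

```python
# shifting left by n multiplies by 2**n
left = 10 << 2     # same as 10 * 4

# shifting right by n is floor division by 2**n
right = 100 >> 3   # same as 100 // 8

print(bin(10), bin(left), left)
print(bin(100), bin(right), right)
```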
what are math.floor() and math.ceil() for?
math.floor() and math.ceil() (available after writing “import math”) round a number down to the nearest integer (floor) or up to the nearest integer (ceiling).
for 2.7, the floor is 2 and ceiling is 3
for 100.14 the floor is 100 and ceiling is 101
how can you print ejendomsinvestoren 5 times?
data:image/s3,"s3://crabby-images/33c6d/33c6db5310157ceb5ce45210105ac9e2934b1001" alt=""
How can you print each even number from 0 to 100?
data:image/s3,"s3://crabby-images/24bd8/24bd8dd1b37d7c7c3beb1a4738126259803e304b" alt=""
how can you print all numbers from 1 to 100
but excluding those that are divisible by 3 and 5?
data:image/s3,"s3://crabby-images/d54c8/d54c8a4b63dbb25d4b6130951fa226f4534e884d" alt=""
What is the difference between ‘continue’ and ‘break’?
for i in range(5):
    if i == 3:
        break
    print("Hello", i)
with break the loop stops completely at 3, so we only get: Hello 0, Hello 1, Hello 2
for i in range(5):
    if i == 3:
        continue
    print("Hello", i)
with continue the loop skips 3 and CONTINUES with the next value, so we get: Hello 0, Hello 1, Hello 2, Hello 4
How would you create the following:
####
####
####
####
?
data:image/s3,"s3://crabby-images/30bb4/30bb42f84e7f4cc0968d7ecd906ce40e93db764e" alt=""
How would you create the following:
#
##
###
####
?
data:image/s3,"s3://crabby-images/c2042/c2042a7f8853adecdbc02c4aa6e8654340a58175" alt=""
How would you create the following:
####
###
##
#
?
data:image/s3,"s3://crabby-images/bbc01/bbc011c849daba0c23b0153e42e43d655219cbae" alt=""
What does it mean to use FOR ELSE?
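A for-loop can have its own “else”, written at the same indentation level as the “for” (so it is NOT indented under the if). The else block runs only when the loop finishes without hitting a break.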
data:image/s3,"s3://crabby-images/cb113/cb113328336e7fd459b2632459b8b65024bab949" alt=""
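A sketch of for-else in use, searching a list for the first number divisible by 5. The function name first_divisible_by_5 is hypothetical.

```python
def first_divisible_by_5(nums):
    for num in nums:
        if num % 5 == 0:
            found = num
            break
    else:
        # runs only when the loop ended without break
        found = None
    return found

print(first_divisible_by_5([12, 15, 18, 23, 26]))
print(first_divisible_by_5([12, 18, 23]))
```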
say we have a list of numbers
nums = [12, 15, 18, 23, 26]
data:image/s3,"s3://crabby-images/37aa3/37aa34df1bd5fa4098d7a6f10ac9efe09ae8b0b9" alt=""
What is TypeCode used for?
You use TypeCode to specify which type the values are of for an ARRAY
import array
array.array(‘b’, [5, 7, 8, 10])
data:image/s3,"s3://crabby-images/fe1e1/fe1e139791837e1864a880de91ad3d809d55b78f" alt=""
how can you ask a user to enter values to an array?
data:image/s3,"s3://crabby-images/1b47c/1b47c77a3d4bb5007d5cb20673812cd4077e06b2" alt=""
Why do we need numPy?
For multi-dimensional arrays
What are the six ways to create arrays with NumPy?
- array()
- arr = array([1, 2, 3, 4, 5])
- linspace()
- arr = linspace(0, 16, 32)
- start, stop and number of parts (NOT a step size)
- from 0 to 16 (both included) and split into 32 evenly spaced numbers.
- if not specifying the number of parts it is 50 by default
- logspace()
- arr = logspace(1, 40, 5)
- start, stop and number of parts
- 5 numbers evenly spaced on a log scale from 10^1 to 10^40 (start and stop are exponents of 10)
- arange()
- arr = arange(1, 15, 2)
- will return… 1, 3, 5, 7, 9, 11, 13
- start, stop and STEP
- zeros()
- arr = zeros(5)
- array of ALL 0s, here [0., 0., 0., 0., 0.]
- ones()
- arr = ones(5)
- array of ALL 1s
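The six functions can be tried out with a sketch like the following (it assumes NumPy is installed and imported as np; the arguments are chosen just to keep the output short).

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])   # from a Python list
b = np.linspace(0, 16, 5)       # 5 evenly spaced values, stop included
c = np.logspace(0, 3, 4)        # 10**0 ... 10**3 in 4 values
d = np.arange(1, 15, 2)         # start, stop (excluded), step
e = np.zeros(5)                 # five 0.0 values
f = np.ones((2, 3))             # 2 rows, 3 columns of 1.0

for arr in (a, b, c, d, e, f):
    print(arr)
```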
copying an array in python SAME ID
data:image/s3,"s3://crabby-images/60c7e/60c7efec2724afec63582bff8708b6d9e3e77772" alt=""
how can you add two arrays?
data:image/s3,"s3://crabby-images/b2e0b/b2e0b585d46130c9148098071ff3814d84983a55" alt=""
how can you make two arrays into one long array?
data:image/s3,"s3://crabby-images/4c507/4c507a581071176b27e95c75c5ed4870efb592d5" alt=""
copying an array in python DIFFERENT ID
data:image/s3,"s3://crabby-images/08a8d/08a8dee9bc5f7e119bde86c2d77f4f34dbf13001" alt=""
How can you check how many dimensions an array contains?
print(arr1.ndim)
How can you check the shape of an array?
print(arr.shape)
How can you check the size of an array?
print(arr.size)
how can you convert a 2d-array into 1d-array?
multi-dimensional arrays…
data:image/s3,"s3://crabby-images/b0fd7/b0fd746fe1842605a7be03bc4bfa28d8524404f9" alt=""
how can you convert a 2d-array into 3d-array?
multi-dimensional arrays…
data:image/s3,"s3://crabby-images/ec987/ec987dcb82846694f210ed44be7aacdd60a2e3c6" alt=""
define a simple function and call it “greet”
data:image/s3,"s3://crabby-images/70890/70890d7d3dfb880b7c204a229e934fea07e2b173" alt=""
define a function that can add two numbers
data:image/s3,"s3://crabby-images/6217e/6217e198f44c39f81932425687ce3a2ae90bfbe9" alt=""
Create a function that can sum an unknown amount of numbers
data:image/s3,"s3://crabby-images/cc62f/cc62f149ccfceb648c8d3ca8a840bed74435d9cf" alt=""
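One way to do this is with *args, which collects any number of arguments into a tuple. A sketch, with the hypothetical function name sum_all:

```python
def sum_all(*numbers):
    """Add up any amount of numbers passed as separate arguments."""
    total = 0
    for number in numbers:
        total += number
    return total

print(sum_all(1, 2, 3))
print(sum_all())
```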
What is the difference between a local and global variable?
data:image/s3,"s3://crabby-images/ec941/ec941c1fba3b5750c3d1c71987714f518de14272" alt=""
How can you pass a LIST to a function in Python and count #odd and #even?
data:image/s3,"s3://crabby-images/3e899/3e899f7a93e8dc1cd1562d97245c23630c95c6f3" alt=""
How can you create a Fibonacci Sequence?
data:image/s3,"s3://crabby-images/66875/66875d50354b9e07287b2d2e1a1699cf85d66bd2" alt=""
How can you make a function that find the Factorial of a given number?
data:image/s3,"s3://crabby-images/c92b1/c92b17237ffcf62bd922a2fba5abf8331b1e81b5" alt=""
What does recursion mean?
data:image/s3,"s3://crabby-images/82e7c/82e7cc7e37e094cd69950085102acafe33ac3b6e" alt=""
How can we make factorial using recursion?
data:image/s3,"s3://crabby-images/7b2f3/7b2f3c9809daadb766cb31f42fb2f8e07d874a75" alt=""
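A minimal recursive sketch, assuming n is a non-negative integer:

```python
def factorial(n):
    # base case: 0! and 1! are both 1
    if n <= 1:
        return 1
    # recursive case: n! = n * (n - 1)!
    return n * factorial(n - 1)

print(factorial(5))
```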
What is a lambda function?
essentially an anonymous function = a function with no name
often used together with filter(), map() and reduce()
how can you filter this list for even numbers using lambda?
nums = [3, 2, 6, 4, 3, 9, 11]
data:image/s3,"s3://crabby-images/239c2/239c20dcf60ec177b950f5cf537c283337f7541e" alt=""
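A sketch of lambda combined with filter() and map(). The result names evens and doubled are made up for the example.

```python
nums = [3, 2, 6, 4, 3, 9, 11]

# filter keeps the items for which the lambda returns True
evens = list(filter(lambda n: n % 2 == 0, nums))

# map applies the lambda to every item
doubled = list(map(lambda n: n * 2, evens))

print(evens, doubled)
```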
what is __name__ and why is it useful?
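it is a special variable in python used to avoid running an entire script when the file is only imported as a module (when the file is run directly, __name__ is “__main__”)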
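The usual pattern looks roughly like this. The function main and its return value are placeholders.

```python
def main():
    # code that should only run when the file is executed directly
    return "running as a script"

# __name__ is "__main__" when this file is run directly,
# and the module's name when it is imported from another file
if __name__ == "__main__":
    print(main())
```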
what is the better word to use for “function”?
METHOD (when the function is defined inside a class)
How do “class” and “methods” work together?
A class is the overall blueprint..
class Computer:
and then the methods/functions are part of that class and can be called through the class with…
Computer.config(com1)
just as we write numpy.something
but.. you can actually also write… com1.config() which is more likely what you will see (calling the method FROM the object)
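A sketch of the two call styles. The attributes cpu and ram and their values are assumptions made for the example.

```python
class Computer:
    def __init__(self, cpu, ram):
        self.cpu = cpu
        self.ram = ram

    def config(self):
        # self is the object the method is called on
        return f"{self.cpu}, {self.ram} GB"

com1 = Computer("i5", 16)

via_class = Computer.config(com1)   # calling through the class
via_object = com1.config()          # calling from the object (usual form)
print(via_class, via_object)
```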
What is object oriented programming?
thinking in objects just like everything we need to do in real life
we need “objects” for such as a laptop, a camera, a frying pan etc.
in programming we need to build objects also and put them together
what matters is DESIGNING objects, not manufacturing them as that can
be done anywhere, and in the computer world they can be replicated
billions of times quickly.
What are the two types of variables?
data:image/s3,"s3://crabby-images/f821c/f821cef6f6e1d68bc3836b5ba2840a619796a3c5" alt=""
What is inheritance in python?
class A:
class B(A):
class C(B):
but we can also say that C is inheriting from A and B (mom and dad) who are unrelated by…
class A:
class B:
class C(A, B):
data:image/s3,"s3://crabby-images/da359/da35920d247c92a410f104f08ec6161a57cb4fb4" alt=""
What does “polymorphism” mean and how can it be implemented?
polymorphism = poly(many) morphism(forms) = objects can take many forms. It can behave in multiple ways.
duck typing
operator overloading
method overloading
method overriding
What is # method overloading?
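method overloading: if you have two methods with the same name,
def average(a,b) and def average(a, b, c) it is method overloading
one takes two parameters and one takes three
note that python does not support this directly: a second def with the same name simply replaces the first, so default or variable arguments are used instead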
What is # method overriding?
two methods with the same name AND the same number of parameters
also, if class B(A): inherits what is in A, but B has something itself, it will override the information from A.
Say class A: contains the car of your dad
Say class B(A): contains nothing, then your father’s car is yours(inheritance) but if you have your own car within B, it will be that one.
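The car example can be sketched like this. Class C is added here only to show the overriding case next to plain inheritance.

```python
class A:
    def car(self):
        return "dad's car"

class B(A):
    pass                      # nothing of its own, so it inherits car() from A

class C(A):
    def car(self):            # same name and parameters: overrides A.car
        return "my own car"

print(B().car(), C().car())
```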
What is a logical error?
when your code returns a value but the value is wrong.
if 3+2 returns 4, you have made a logical error
What is a runtime error?
when the code does not work during part of the run time…
this happens for example if people enter a string when only numbers are allowed, a float when only integers are allowed, a negative value where it is not allowed, or if you divide by 0, and so forth.
How do you handle exceptions?
use the
try:
except: #can use multiple excepts
except:
except:
finally:
setup….
data:image/s3,"s3://crabby-images/bf725/bf725b5a180934a903798f42fc2170e5ae7f4551" alt=""
Why is multi-threading important?
data:image/s3,"s3://crabby-images/001e8/001e845ef8cedc62254013d99244e68571775a79" alt=""
Is python a compiled or interpreted language?
It is actually both…
- Compiled language
- the computer does not understand the language we write, so the code is compiled into a form the computer understands, binary (0101010101010110110)
- Interpreted language
- python is also interpreted…
- source code => compiled => byte code => interpreted by python virtual machine => machine code
- it is taking things line by line
How do you swap the values a = 8 and b = 10?
a = 8
b = 10
then, using a temporary variable c…
c = a # now c is 8
a = b # now a is 10
b = c # now b is 8
the result is…
a = 10
b = 8
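As an alternative to a temporary variable, Python can swap two values in a single statement with tuple unpacking. A minimal sketch:

```python
a = 8
b = 10

# The right-hand side is evaluated first, then unpacked into a and b
a, b = b, a

print(a, b)
```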
Say you have the list…
areas = [“hallway”, 11.25, “kitchen”, 18.0, “living room”, 20.0, “bedroom”, 10.75, “bathroom”, 9.50]
find the # Sum of kitchen and bedroom area and call it: eat_sleep_area
data:image/s3,"s3://crabby-images/818e6/818e609c53d91d5f9fee6c32c2c3cf98857cdf42" alt=""
If.. areas = [“hallway”, 11.25, “kitchen”, 18.0, “living room”, 20.0, “bedroom”, 10.75, “bathroom”, 9.50]….. update the area of the bathroom area to be 10.50 square meters instead of 9.50.
AND Make the areas list more trendy! Change “living room” to “chill zone”
data:image/s3,"s3://crabby-images/123f1/123f1da7880d1de7fc065a872bfe281d147e8186" alt=""
If.. areas = [“hallway”, 11.25, “kitchen”, 18.0, “living room”, 20.0, “bedroom”, 10.75, “bathroom”, 9.50]….. now You decide to build a poolhouse and a garage. Add this to the list..
data:image/s3,"s3://crabby-images/d87d9/d87d9b7930b95c7c47d719dbe86cb37ea77729df" alt=""
Say you have… areas = [“hallway”, 11.25, “kitchen”, 18.0, “chill zone”, 20.0, “bedroom”, 10.75, “bathroom”, 10.50, “poolhouse”, 24.5, “garage”, 15.45]… how do we remove the poolhouse?
del(areas[-4:-2])
What is wrong here?
data:image/s3,"s3://crabby-images/f08cb/f08cba1f8286d8e9e8ac2cbeca2994c6b5d0a28b" alt=""
The problem is that areas_copy = areas does not copy the list; it only creates a second name that refers to the same list. Thus, when we change areas_copy[0] to 5 we are in fact also changing element [0] in the original areas list.
To avoid changing the original list when making changes in the copy, we must make an explicit copy with either list(areas) or areas[:]
data:image/s3,"s3://crabby-images/811a4/811a401ac31e1f9f0dbf4e5a5b62ecd3a3bb38fa" alt=""
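The difference between a second name for the same list and a real copy can be sketched like this. The list values and the variable names alias, copy_a and copy_b are assumptions for illustration.

```python
areas = [11.25, 18.0, 20.0, 10.75, 9.50]

alias = areas          # same list object, just a second name
copy_a = list(areas)   # new list with the same elements
copy_b = areas[:]      # slicing the whole list also creates a new list

alias[0] = 5.0         # this changes areas as well

print(areas[0])
print(copy_a[0])
print(copy_b[0])
```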
Say you have.
first = [11.25, 18.0, 20.0]
second = [10.75, 9.50]
now merge the two lists and write them in descending order…
data:image/s3,"s3://crabby-images/fab20/fab202fa4484868a2c74ae9235b8f2729f29ffbe" alt=""
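One way to merge the two lists and sort them in descending order, as a sketch. The names full and full_sorted are assumptions.

```python
first = [11.25, 18.0, 20.0]
second = [10.75, 9.50]

# + concatenates two lists into a new list
full = first + second

# sorted() returns a new list; reverse=True gives descending order
full_sorted = sorted(full, reverse=True)

print(full_sorted)
```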
How do you look up the documentation (arguments and behavior) of a built-in Python function?
you can always search the internet but otherwise use..
help(function_name) so for example help(max), help(sorted) and so forth
What is important to remember about methods?
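… METHODS are functions that belong to objects, and you call them on the object itself
there are different methods available to different types of objects.
also, sometimes the same method is available to different types of objects but behaves differently, for example index():
fam.index() on a list looks for an element, while on a string index() looks for a substring
Call the method “upper” on place = “poolhouse”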
place_up = place.upper()
print(place)
print(place_up)
poolhouse
POOLHOUSE
you have.. areas = [11.25, 18.0, 20.0, 10.75, 9.50], how can you add 24.5 and 15.45 to the list?
areas.append(24.5)
areas.append(15.45)
What are two reasons the NumPy ARRAY is better than LIST?
- the ARRAY is able to perform calculations on the entire set of values
- You can also use boolean operators to obtain results from arrays.
What are some of the disadvantages of ARRAY compared to LIST?
An array cannot contain different value types. If you mix value types, type coercion will take place and all elements are converted to one common type.
In NumPy, # Print out the 50th row of np_baseball
print(np_baseball[49])
In NumPy, # Select the entire second column of np_baseball and call it np_weight_lb
np_weight_lb = np_baseball[:, 1]
In Numpy, Create numpy array np_height_in that is equal to first column of np_baseball.
- Print out the mean of np_height_in.
- Print out the median of np_height_in.
NOTE that the format is…. np.function_name(array_name), for example np.mean(np_height_in)
and… remember that NumPy indexes ROW first, column second, so the first column is [:, 0]
data:image/s3,"s3://crabby-images/e7a21/e7a21151395894c3c5dab3c55b0d0432d435fc13" alt=""
In Numpy, # Print out correlation between first and second column. Replace ‘None’
corr = None
print(“Correlation: “ + str(corr))
data:image/s3,"s3://crabby-images/7d2db/7d2dbeecb4f4a0a27e6a2dda0b022c1efad18566" alt=""
Say you have two NumPy arrays…
np_heights = np.array(heights)
np_positions = np.array(positions)
Now define the heights of goalkeepers (GK), gk_heights and the height of other players, other_heights
Heights of the goalkeepers: gk_heights
gk_heights = np_heights[np_positions == ‘GK’]
Heights of the other players: other_heights
other_heights = np_heights[np_positions != ‘GK’]
Say you have two NumPy arrays…
Heights of the goalkeepers: gk_heights
gk_heights = np_heights[np_positions == ‘GK’]
Heights of the other players: other_heights
other_heights = np_heights[np_positions != ‘GK’]
How can we get the median for each of them and print it?
data:image/s3,"s3://crabby-images/22438/22438fc136ef815c4fbe0a835f2517674aa88ac6" alt=""
How should you import matplotlib.pyplot?
import matplotlib.pyplot as plt
How do you make a line plot and a scatter plot in matplotlib.pyplot?
plt.plot(x, y) = line plot
plt.scatter(x, y) = scatter plot (no lines, just data points)
How do you plot something on a logarithmic scale?
say you have plt.scatter(gdp_cap, life_exp)
plt.xscale(‘log’) #changing the x-axis to logarithmic
plt.show()
data:image/s3,"s3://crabby-images/9ff53/9ff5314ed5ce291bfb11b0a5bfe2a5e98e1ed791" alt=""
What are the most basic arguments in creating a histogram?
import matplotlib.pyplot as plt
plt.hist(dataname, bins = 10) #it is 10 by default
How do you add axis names to your pyplots?
plt.xlabel(‘price’)
plt.ylabel(‘quantity’)
How do you add a title to your pyplots?
plt.title(‘my title here’)
How do you change the increments on x and y axis?
for example..
plt.yticks([0, 2, 4, 6, 8, 10])
Say your x-axis is logarithmic and goes from 1,000 to 100,000, how can we simplify the x-axis ticks?
tick_val = [1000, 10000, 100000]
tick_lab = [‘1k’, ‘10k’, ‘100k’]
plt.xticks(tick_val, tick_lab)
plt.show()
data:image/s3,"s3://crabby-images/f7b25/f7b2592aa1b450999f53eebb0a4b9bb061f41adf" alt=""
Explain this…
plt.scatter(x = gdp_cap, y = life_exp, s = np.array(pop) * 2, c = col, alpha = 0.8)
- We are making a scatter plot
- with gdp_cap on the x-axis
- and life_exp on the y-axis
- size is set to np.array(pop) * 2 #doubling the sizes of the bubbles
- c = col => colors are set according to what has been defined in col, which in this case is different colors depending on continent of origin
- alpha = 0.8 sets the opacity of the bubbles: 0 = fully transparent, 1 = fully opaque
data:image/s3,"s3://crabby-images/fde61/fde6187c00b7fc0fda06911f3112a3130ac6b86e" alt=""
How can we mark China and India on the plot?
data:image/s3,"s3://crabby-images/c6133/c6133a7fa9f8fb4a9e01cf7f728aa7b951e60d50" alt=""
We can add text with
plt.text(1550, 71, ‘India’)
plt.text(5700, 80, ‘China’)
we need to know these x and y coordinates (in data units) first.
data:image/s3,"s3://crabby-images/c70de/c70de164ff64cc7c2c2f152c474198b19947bfed" alt=""
Dictionary… check which keys are in europe
print(europe.keys())
dict_keys([‘spain’, ‘france’, ‘norway’, ‘germany’])
Dictionary… called “europe”, print the value that belongs to the key, ‘norway’
print(europe[‘norway’])
oslo
When would you use list vs. dictionary?
List
- If you want to easily access values in an ordered manner (using indexes)
Dictionary
- When you want a lookup table in which values are accessed through unique keys, and the keys can take different forms (strings, numbers and so on)
data:image/s3,"s3://crabby-images/e9805/e9805df2c345835985be4cdd0a76aa6d42dea21d" alt=""
dictionary… say we have..
europe = {‘spain’:’madrid’, ‘france’:’paris’, ‘germany’:’berlin’, ‘norway’:’oslo’ }
Now add Italy to it.
and check if it has been added…
europe[‘italy’] = ‘rome’
now check with..
print(‘italy’ in europe)
True
Somebody thought it would be funny to mess with your accurately generated dictionary. An adapted version of the europe dictionary is shown below.
europe = {‘spain’:’madrid’, ‘france’:’paris’, ‘germany’:’bonn’,
‘norway’:’oslo’, ‘italy’:’rome’, ‘poland’:’warsaw’,
‘australia’:’vienna’ }
Can you clean up?
data:image/s3,"s3://crabby-images/1ab8c/1ab8c70da8c00e407102558eba5e0ba435abe469" alt=""
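One plausible clean-up, assuming the two things that were tampered with are the capital of Germany and the 'australia' entry (Vienna is the capital of Austria, not Australia). This is a sketch under that assumption.

```python
europe = {'spain': 'madrid', 'france': 'paris', 'germany': 'bonn',
          'norway': 'oslo', 'italy': 'rome', 'poland': 'warsaw',
          'australia': 'vienna'}

# Assigning to an existing key overwrites its value
europe['germany'] = 'berlin'

# del removes a key together with its value
del europe['australia']

print(europe)
```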
Dictionaries:
europe = { ‘spain’: { ‘capital’:’madrid’, ‘population’:46.77 },
‘france’: { ‘capital’:’paris’, ‘population’:66.03 },
‘germany’: { ‘capital’:’berlin’, ‘population’:80.62 },
‘norway’: { ‘capital’:’oslo’, ‘population’:5.084 } }
print out the capital of France:
print(europe[‘france’][‘capital’])
Dictionaries:
europe = { ‘spain’: { ‘capital’:’madrid’, ‘population’:46.77 },
‘france’: { ‘capital’:’paris’, ‘population’:66.03 },
‘germany’: { ‘capital’:’berlin’, ‘population’:80.62 },
‘norway’: { ‘capital’:’oslo’, ‘population’:5.084 } }
- Create a dictionary, named data, with the keys ‘capital’ and ‘population’. Set them to ‘rome’ and 59.83, respectively.
- Add a new key-value pair to europe; the key is ‘italy’ and the value is data, the dictionary you just built.
data:image/s3,"s3://crabby-images/6b82a/6b82ac8c43f15de3a3c099110e26e00bbf8dd1f9" alt=""
Pandas: Say you have a DataFrame ‘cars’, how can we change the index labels?
data:image/s3,"s3://crabby-images/e60a5/e60a534b8f068ae4a91c208a7bc45f7841e59007" alt=""
row_labels = [‘US’, ‘AUS’, ‘JPN’, ‘IN’, ‘RU’, ‘MOR’, ‘EG’]
cars.index = row_labels
data:image/s3,"s3://crabby-images/1fb03/1fb03d22aeaa4fbb1fc36aba7e94834896e07c7f" alt=""
Pandas, say you have a file, cars.csv, how can you import it as a DataFrame?
cars = pd.read_csv(‘cars.csv’) #remember ‘’ around the file name
print(cars)
data:image/s3,"s3://crabby-images/a3f78/a3f785a890e25cfba1ab5778ac35bd868afc1896" alt=""
Pandas: What is the difference between the output of:
cars[‘cars_per_cap’]
and
cars[[‘cars_per_cap’]]
The single bracket version gives a Pandas SERIES while the double bracket version gives a Pandas DataFrame.
Pandas.. What is the difference between loc and iloc?
loc = label-based meaning that you have to specify rows and columns based on their row and column labels.
iloc = integer index based so you have to specify rows and columns by their integer index starting from 0.
Pandas.. What does this do….. cars.iloc[[3, 4], 0]
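We ask for the rows at integer positions 3 and 4, and then the column at position 0.
we get it as a Series because the column is selected with a single integer; use cars.iloc[[3, 4], [0]] with [[]] around the column to get a DataFrame
Pandas.. Print out the drives_right value of the row corresponding to Morocco (its row label is MOR)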
data:image/s3,"s3://crabby-images/8da4b/8da4ba479754c9cebdd760eed216dc37bb9d9944" alt=""
Pandas..Print out a sub-DataFrame, containing the observations for Russia and Morocco and the columns country and drives_right.
# Print sub-DataFrame
print(cars.loc[[‘RU’, ‘MOR’], [‘country’, ‘drives_right’]])
Pandas.. from ‘cars’, # Print out drives_right column as DataFrame
print(cars.loc[:,[‘drives_right’]])
can also be done with iloc if we know the index number of the drives_right column
What will “alpha” <= “beta” return and why?
True
Because Python compares strings character by character in lexicographic (roughly alphabetical) order, and “alpha” comes before “beta”.
How can you use AND, OR, NOT operators in NumPy?
The plain and, or, not keywords do not work element-wise on arrays, so you need to use the equivalent functions in NumPy, which are:
- np.logical_and()
- np.logical_or()
- np.logical_not()
Example: np.logical_and(my_house < 11, your_house < 15)
If we have…
areas = [11.25, 18.0, 20.0, 10.75, 9.50]
How can we print the values one by one AND the index values?
use for loop + enumerate
data:image/s3,"s3://crabby-images/801f8/801f8c26c20c493361e75c11aed1945500609c6e" alt=""
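A minimal sketch of a for loop with enumerate(). The output wording ("room 0: 11.25") and the names index, area and lines are assumptions.

```python
areas = [11.25, 18.0, 20.0, 10.75, 9.50]

lines = []
# enumerate() yields (index, value) pairs, unpacked here into two names
for index, area in enumerate(areas):
    line = "room " + str(index) + ": " + str(area)
    lines.append(line)
    print(line)
```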
If we have…
house = [[“hallway”, 11.25],
[“kitchen”, 18.0],
[“living room”, 20.0],
[“bedroom”, 10.75],
[“bathroom”, 9.50]]
Write a for loop that goes through each sublist of house and prints out “the x is y sqm”, where x is the name of the room and y is the area of the room.
data:image/s3,"s3://crabby-images/e5066/e5066514b65f0d5bd78e0275a09a94fb07c3cc74" alt=""
How can you iterate over key values in a dictionary?
Key values in dictionary
- for key, val in my_dict.items():
- # use .items
how can you iterate over all elements in a numpy array?
elements in an array
- for val in np.nditer(my_array):
- # use np.nditer
how can you iterate over all elements in a pandas DataFrame?
for pandas Dataframes
- for lab, row in brics.iterrows():
- # use .iterrows()
How can you iterate over key values in a dictionary and how can you iterate over all elements in a numpy array? and for Pandas DataFrames?
Key values in dictionary
- for key, val in my_dict.items():
- # use .items
elements in an array
- for val in np.nditer(my_array):
- # use np.nditer
for pandas Dataframes
- for lab, row in brics.iterrows():
- # use .iterrows()
If we have…
europe = {‘spain’:’madrid’, ‘france’:’paris’, ‘germany’:’berlin’,
‘norway’:’oslo’, ‘italy’:’rome’, ‘poland’:’warsaw’, ‘austria’:’vienna’ }
Write a for loop that goes through each key:value pair of europe. On each iteration, “the capital of x is y” should be printed out, where x is the key and y is the value of the pair.
data:image/s3,"s3://crabby-images/d0629/d0629067fe5804a1d35a4b2fdeac1ef4e33d6f87" alt=""
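A sketch of the loop over key:value pairs using .items(). The names country, capital and sentences are assumptions.

```python
europe = {'spain': 'madrid', 'france': 'paris', 'germany': 'berlin',
          'norway': 'oslo', 'italy': 'rome', 'poland': 'warsaw',
          'austria': 'vienna'}

sentences = []
# .items() yields (key, value) pairs
for country, capital in europe.items():
    sentence = "the capital of " + country + " is " + capital
    sentences.append(sentence)
    print(sentence)
```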
We have…
import pandas as pd
cars = pd.read_csv(‘cars.csv’, index_col = 0)
and need to make the first iteration print out “US: 809”, the second iteration “AUS: 731”, and so on.
“country: cars_per_cap”
(already set as “lab” and “row”)
data:image/s3,"s3://crabby-images/3f9b9/3f9b9207c88c33233a78e517f5c854e4d933cc89" alt=""
In a DataFrame, how can you create a new column by calling a function on another column?
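use .apply: it calls a function on every value of a column and returns the results, which you can assign to a new column.
this is a very important feature.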
data:image/s3,"s3://crabby-images/6a615/6a6152cb3919a46e44f74146099a001160323c7a" alt=""
random; How can you roll a die?
data:image/s3,"s3://crabby-images/fa307/fa30799c649874ede48135a5856dc1a67fcb4f6f" alt=""
Why is it generally better to use “return” rather than “print” for functions?
Returning values is generally more desirable than printing them out because a returned value can be stored in a variable and reused, whereas print() only displays text and itself returns None (a print() call assigned to a variable has type NoneType).
How can you return multiple values from a function?
Using tuples!!
data:image/s3,"s3://crabby-images/e8bf4/e8bf4bf46ac708a5ea06bf3bb72e13a413484801" alt=""
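A minimal sketch of returning several values as a tuple and unpacking them. The function name min_and_max is hypothetical.

```python
def min_and_max(values):
    """Return the smallest and largest value as a tuple."""
    # Separating values with a comma builds a tuple
    return min(values), max(values)


result = min_and_max([3, 1, 4, 1, 5])

# A tuple can be unpacked into separate names
low, high = result

print(result)
print(low, high)
```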
Explain this code:
data:image/s3,"s3://crabby-images/1bcc4/1bcc402ff670badb847b8ee7d2deeb78a09602b3" alt=""
data:image/s3,"s3://crabby-images/d9d99/d9d99c51a799683dd020594f03e6e40cc2959b15" alt=""
What are the different function scopes?
- GLOBAL scope
- defined in the main body of a script
- LOCAL scope
- defined inside a function
- BUILT-IN scope
- names in the pre-defined built-ins module (such as ‘print’)
- writing ‘dir(builtins)’, you can see all built-in names
- NOTE: you need to ‘import builtins’ first to use this.
Python ALWAYS searches for a name in LOCAL scope first, second GLOBAL scope and lastly BUILT-IN scope (for nested functions, the enclosing function’s scope is searched between local and global).
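The lookup order can be sketched as follows. All function and variable names here are hypothetical.

```python
import builtins

x = "global value"  # defined in the main body: global scope


def read_global():
    # no local x exists, so Python falls back to the global x
    return x


def read_local():
    x = "local value"  # local scope shadows the global x
    return x


def use_builtin():
    # len is found in neither local nor global scope, so the built-in is used
    return len([1, 2, 3])


has_len = "len" in dir(builtins)

print(read_global())
print(read_local())
print(use_builtin())
print(has_len)
```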
How do you set a default parameter?
Here echo=1 => echo is set to 1 unless otherwise specified.
Thus, in line 13, we do not specify echo but it will return 1 time, while in line 15, we ask it to return 5 times
data:image/s3,"s3://crabby-images/dbb38/dbb38e8b51ed0c6b52c76f586624e54656fd063e" alt=""
How do you create a variable-length argument?
Use (*args) #single *
It collects any number of positional arguments into a tuple.
def sums(*args):
How do you create a variable-length keyword argument?
Use (**kwargs) #double **
It collects any number of keyword arguments into a dictionary.
def report_status(**kwargs):
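The two cards above can be sketched together. The function add_all and the bodies of both functions are assumptions; only the signature of report_status comes from the card.

```python
def add_all(*args):
    """Sum any number of positional arguments (args is a tuple)."""
    total = 0
    for value in args:
        total += value
    return total


def report_status(**kwargs):
    """Format any number of keyword arguments (kwargs is a dictionary)."""
    return [key + ": " + str(value) for key, value in kwargs.items()]


print(add_all(1, 2, 3))
print(report_status(name="luke", affiliation="jedi"))
```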
How would you write a lambda function add_bangs that adds three exclamation points ‘!!!’ to the end of a string a? And how would you call add_bangs with the argument ‘hello’?
data:image/s3,"s3://crabby-images/8fa19/8fa197196d557319398829a973e25787c7d8a2a8" alt=""
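A sketch of the lambda described in the question. The variable name result is an assumption.

```python
# lambda <parameters>: <expression>
add_bangs = lambda a: a + '!!!'

result = add_bangs('hello')
print(result)
```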
Convert this function to a lambda function
data:image/s3,"s3://crabby-images/257e7/257e78d75e8731307548c396ff68714722290d39" alt=""
data:image/s3,"s3://crabby-images/73e52/73e52c5ed883ce7cfc05ddbc1d51ffcaac4d701b" alt=""
what is happening here?
data:image/s3,"s3://crabby-images/0caaa/0caaab1efc0e9256b1416cc0b24f9ebc58613b9b" alt=""
data:image/s3,"s3://crabby-images/9a12c/9a12c85dd1bab2b1e37d7e6cacbd3eb2590bb606" alt=""
If…
fellowship = [‘frodo’, ‘samwise’, ‘merry’, ‘pippin’, ‘aragorn’, ‘boromir’, ‘legolas’, ‘gimli’, ‘gandalf’]
Using lambda, create a new list that contains only strings that have more than 6 characters…
data:image/s3,"s3://crabby-images/909de/909de5e7ad4c7e5b3f9610f1827a803744e9038a" alt=""
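One way to do this is to combine lambda with filter(). This is a sketch; the names member and result are assumptions.

```python
fellowship = ['frodo', 'samwise', 'merry', 'pippin', 'aragorn',
              'boromir', 'legolas', 'gimli', 'gandalf']

# filter() keeps only the elements for which the lambda returns True
result = list(filter(lambda member: len(member) > 6, fellowship))

print(result)
```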
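What are the ways to write in error messages?
catch an error with…
try:
return … # the code that might fail
except:
… # the message to show if it fails
and raise your own error with…
if ‘something’:
raise _____Error(‘explanation’)
try:
return …
except _____Error:
…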
data:image/s3,"s3://crabby-images/16fe8/16fe893203d1b777d63e63d2a206e7f4eaa83e1e" alt=""
How can you select and print the retweets ‘RT’ from a Twitter DataFrame;
data:image/s3,"s3://crabby-images/4b1e0/4b1e04458507155379d79f2c5ad942971926ad11" alt=""
What’s the difference between iterators and iterables?
Iterable
- An object with an associated iter() method
- Examples: lists, strings, dictionaries, file connections
- Applying iter() to an iterable creates an iterator
Iterator
- Produces next value with next()
How can you print all values of an iterator at ONCE?
unpack it with the star operator..
print(*it)
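A short sketch of iter(), next() and star unpacking. The sample values are assumptions. Note that an iterator is exhausted once all of its values have been produced.

```python
word = "Da"
it = iter(word)        # a string is an iterable; iter() creates an iterator

first = next(it)       # next() produces one value at a time
second = next(it)

it_all = iter(["a", "b", "c"])
print(*it_all)         # the star unpacks every remaining value at once

leftover = list(it_all)  # nothing is left after unpacking
print(leftover)
```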
What does the “enumerate()” function do?
It returns an enumerate object that produces a sequence of tuples, and each of the tuples is an index-value pair.
In this exercise, you are given a list of strings mutants and you will practice using enumerate() on it by printing out a list of tuples and unpacking the tuples using a for loop.
Now, create a list and unpack the tuples using a for loop:
data:image/s3,"s3://crabby-images/391c3/391c321132bfa2fa9f8d66ae5cb5b6de7537e52f" alt=""
data:image/s3,"s3://crabby-images/cc1b8/cc1b8d8b3ac308b00268b8397602847bc253edfa" alt=""
What does zip() do?
It takes any number of iterables and returns a zip object that is an iterator of tuples.
If you want to print the values of a zip object, you need to convert it to a list and then print it.
Say we have mutants and powers, how do you create a zip object? AND then unpack it again?
data:image/s3,"s3://crabby-images/92f49/92f49fead5cb565b51b7bf39ee08f26c457761f2" alt=""
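A sketch of zipping and unzipping. The contents of mutants and powers are assumed sample values, and the names pairs, names and abilities are hypothetical.

```python
mutants = ['charles xavier', 'bobby drake', 'kurt wagner']
powers = ['telepathy', 'thermokinesis', 'teleportation']

# zip() returns an iterator of tuples; list() makes the values visible
z1 = zip(mutants, powers)
pairs = list(z1)
print(pairs)

# zip(*...) reverses the operation and gives back tuples of the original values
names, abilities = zip(*pairs)
print(names)
print(abilities)
```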
Why may one want to load data in chunks and how is it done?
Because there can be too much data to hold in memory.
pandas function: read_csv()
- specify the chunk with… chunksize
read_csv(‘filename’, chunksize=xxxx)
data:image/s3,"s3://crabby-images/79221/79221a6394626533fb64b7065af63bc95e4d5bc3" alt=""
How can you iterate over the file tweets.csv in small portions, adding new words to a dictionary and counting the number of times each word occurs?
data:image/s3,"s3://crabby-images/1533d/1533d3fd9302f9cc829f0f31e7e9f86a36c19ba1" alt=""
Say we have…
doctor = [‘house’, ‘cuddy’, ‘chase’, ‘thirteen’, ‘wilson’]
What would a list comprehension that produces a list of the first character of each string in doctor look like?
data:image/s3,"s3://crabby-images/829a5/829a5b145c037d682712f51a37bfb42ea9c6b583" alt=""
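A minimal sketch of such a list comprehension. The names doc and first_chars are assumptions.

```python
doctor = ['house', 'cuddy', 'chase', 'thirteen', 'wilson']

# [output expression for iterator variable in iterable]
first_chars = [doc[0] for doc in doctor]

print(first_chars)
```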
How can you add a condition to a list comprehension?
data:image/s3,"s3://crabby-images/5e559/5e5592e5e3bd41224e494ef604534a8588a00eb7" alt=""
What are generators?
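A generator expression looks like a list comprehension but uses () instead of [].
Great for very large sequences as it produces values one at a time and does not store everything in memory (as a list comprehension would do).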
While a list comprehension produces a list as output, a generator produces a generator object.
data:image/s3,"s3://crabby-images/ca422/ca422857ccd85bc515ed91f9ac02a0249a36cd0c" alt=""
Create a generator object that will produce values from 0 to 30. Assign the result to result and print it using a for loop
data:image/s3,"s3://crabby-images/e712e/e712ea2fea577afd9342978e701890d4fbd184b8" alt=""
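A sketch of the generator, assuming "from 0 to 30" includes 30. The list collected is added only so the produced values can be inspected afterwards.

```python
# Parentheses create a generator object instead of a list
result = (num for num in range(0, 31))

collected = []
for num in result:
    collected.append(num)
    print(num)
```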
What is the basic format of a list comprehension?
data:image/s3,"s3://crabby-images/81d88/81d8843bd7941f14d57e3c0b9e54fe22f878676e" alt=""
The lists2dict() function has already been preloaded, together with a couple of lists, feature_names and row_lists. feature_names contains the header names of the World Bank dataset and row_lists is a list of lists, where each sublist is a list of actual values of a row from the dataset.
data:image/s3,"s3://crabby-images/ed839/ed8390057041ced6577f8e8d8744d65234158249" alt=""
Write a list comprehension to generate a list of values from pops_list for the new column ‘Total Urban Population’. The output expression should be the product of the first and second element in each tuple in pops_list. Because the 2nd element is a percentage, you also need to either multiply the result by 0.01 or divide it by 100. In addition, note that the column ‘Total Urban Population’ should only be able to take on integer values. To ensure this, make sure you cast the output expression to an integer with int().
data:image/s3,"s3://crabby-images/4833a/4833a516ae98f359b7ec583d581764cff318de83" alt=""
How do you read a file? (best practice)
data:image/s3,"s3://crabby-images/0f8ef/0f8efec9e3235d3c566799a2fccf3e5d26f41a41" alt=""
What will “! ls” do?
! ls will display the contents of your current directory.
How do you open a file using the “with”-method?
data:image/s3,"s3://crabby-images/4c05a/4c05a879b33a988920fb3a09664a4f6d32ad5c2e" alt=""
What are flat files?
- Text files containing records
- Table data
- Record: row of fields or attributes
- Column: feature or attribute
Flat files often have headers
.csv, .txt (tab-delimited),
What are the main three ways to import data to NumPy?
- np.loadtxt(filename, delimiter=‘,’)
- digits = np.loadtxt(file, delimiter=‘,’, skiprows=1)
- ‘,’ for comma-separated
- ‘\t’ for tab-delimited
- skiprows = 1 will skip the first row, for example if we have headers
- usecols => lets you specify which columns you wish to keep
- np.genfromtxt(‘filename’, delimiter = ‘,’, names=True, dtype=None)
- To be used when you have different data-types (strings, float, etc).
- names = True => means there is a header
- np.recfromcsv()
- Behaves similarly to np.genfromtxt BUT has default dtype=None, default ‘,’ as delimiter and default names=True.
What does pandas provide to data scientists that NumPy does not?
- Two-dimensional labeled data structures
- Columns of potentially different types
- Manipulate, slice, reshape, groupby, join, merge
- Perform statistics
- Work with time series data
Pandas is for data analysis and modeling.
From the file, digits.csv, import the first 5 rows to a DataFrame, build a numpy array from this DataFrame and print the datatype
data:image/s3,"s3://crabby-images/ba9a5/ba9a5d142b747991e01dbf962b871b50d2466321" alt=""
How is delimiter defined in pandas import?
pd.read_csv(filename, sep = ‘\t’, comment = ‘#’, na_values = ‘Nothing’)
What are pickled files?
- A file type native to python
- Not readable by humans but readable by python/machines
- JSON files are an option if you want it readable by humans
- Motivation; many datatypes for which it isn’t obvious how to store (such as lists, dictionaries etc.)
- ‘rb’ means the file is opened read-only and in binary mode
data:image/s3,"s3://crabby-images/fd4b5/fd4b5242e72b9d359fcca72b436989c4300677de" alt=""
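A self-contained sketch of writing and reading a pickled file. The dictionary contents and the file name data.pkl are assumptions, and a temporary directory is used so the example does not depend on an existing file.

```python
import os
import pickle
import tempfile

data = {'June': '69.4', 'Aug': '85', 'Airline': '8', 'Mar': '84.4'}

path = os.path.join(tempfile.mkdtemp(), 'data.pkl')

# 'wb' = write, binary mode
with open(path, 'wb') as file:
    pickle.dump(data, file)

# 'rb' = read-only, binary mode
with open(path, 'rb') as file:
    loaded = pickle.load(file)

print(loaded)
print(type(loaded))
```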
How do you import Excel files to pandas?
figure out what the sheets in the Excel file are called with
print(data.sheet_names) # here the file has been loaded with pd.ExcelFile() and assigned to ‘data’
and to make DataFrames of those sheets, use .parse() with the sheet name OR the sheet index
data:image/s3,"s3://crabby-images/88aa1/88aa1ede469a7d8b2efba9c2535746cdb9ce6323" alt=""
Import the file, battledeath.xlsx and display the sheet names
data:image/s3,"s3://crabby-images/10a82/10a8222502a299667153404b1adf5ae0a181b8df" alt=""
What is happening here?
data:image/s3,"s3://crabby-images/45618/4561860bc4093eee34dc3e757d49e4e3b5f35e97" alt=""
data:image/s3,"s3://crabby-images/b3ed5/b3ed5d05979426b9e7d6194a122ca59b4657457b" alt=""
What is SAS used for?
SAS (Statistical Analysis System) is very important for data science
used for:
- advanced analytics
- multivariate analysis
- business intelligence
- data management
- predictive analytics
- standard for computational analysis
How do you import a SAS file?
data:image/s3,"s3://crabby-images/1422a/1422ad34506acba8b7e0eb1b8a9693787244edc1" alt=""
How do you import a Stata file?
data:image/s3,"s3://crabby-images/d14e7/d14e7458f0c8543d666cd8bccf87270a72d5965b" alt=""
What is a HDF5 file and how do you import it?
- Hierarchical Data format Version 5
- Standard for storing large quantities of numerical data.
- hundreds of gigabytes or terabytes (can scale to exabytes)
data:image/s3,"s3://crabby-images/f27ef/f27ef275c917f470aa3f0a9d3b8ebffebd4994b3" alt=""
what is a MATLAB file and how do you import it?
- Matrix Laboratory
- Industry standard in engineering and science
scipy.io.loadmat() = read .mat files
scipy.io.savemat() = write .mat files
data:image/s3,"s3://crabby-images/eaab9/eaab9497a60968f6548143c50ab5eb72a57e8a1a" alt=""
What is a relational database?
- Based on relational model of data
- Each row is a record, and the columns are attributes for the information in the rows
- Each table needs a primary key column that has a unique entry for each row, which we can use to look up information (such as OrderID, CustomerNumber, EmployeeNumber etc.)
How can you create a database engine with SQLAlchemy?
data:image/s3,"s3://crabby-images/9f150/9f1500cf35adc5a6771c0e82472f0318bf8344f9" alt=""
What does querying mean?
get data out of the database
How do you create an engine, connect to it, make a basic query, create a DataFrame from the result and close the connection again?
- Import create_engine
- import pandas as pd
- Create the engine
- Connect to the engine
- Write a query (select * => imports all columns)
- Fetch the query results with fetchall() and turn them into a DataFrame
- Close the connection
data:image/s3,"s3://crabby-images/6af7c/6af7ccbf0e6596dcc07d3ae67b20a6f981c316e5" alt=""
How do you specify columns that you want to import AND specify the number of rows you want to import in a query + set the DataFrame’s column names to the corresponding names of the table columns?
data:image/s3,"s3://crabby-images/a8779/a87790a338fca9ed9371d1c2834a34657f3e6157" alt=""
How do you make a query that selects all “EmployeeID”s from “Employee” that is greater than or equal to 6?
data:image/s3,"s3://crabby-images/6ce35/6ce358bdcabdc36e695b8f9a6bcfdab1c34b9fca" alt=""
How can this be done much smarter with Pandas?
data:image/s3,"s3://crabby-images/f6aed/f6aed9242961ea77b44853c4bd043109dbde1b0a" alt=""
data:image/s3,"s3://crabby-images/5c96d/5c96d1671c1d5d75772e6a4c6c8475bde2aaa8df" alt=""
How do you import data from a url?
It is possible to skip the step of saving it locally (line 8-9) and just plug in the url directly in pd.read_csv so:
df = pd.read_csv(url, sep=’;’)
data:image/s3,"s3://crabby-images/c8dbd/c8dbddf5e1eba6670b49dfafb0c4a272c362fa63" alt=""
What does URL stand for?
Uniform/Universal Resource Locator
How do you send a GET request to a certain URL and afterwards read the response? (http.client.HTTPResponse)
data:image/s3,"s3://crabby-images/1f129/1f1290320afe0bf477d9cd8fc10033861d577de4" alt=""
How do you make HTTP requests for a URL using the requests package?
data:image/s3,"s3://crabby-images/2a407/2a4075fc011d40c63e9559c6bcc08fc5e8d21253" alt=""
What is the BeautifulSoup package for?
- Parse and extract structured data from HTML
- “Make tag soup beautiful and extract information”
data:image/s3,"s3://crabby-images/04c53/04c537a5024ae5680ee116d488518242c1b37d60" alt=""
use the BeautifulSoup package to parse, prettify and extract information from HTML.
The URL of interest is url = ‘https://www.python.org/~guido/’.
data:image/s3,"s3://crabby-images/24b6c/24b6c45b7ab65868b9f2494a0451c5ed8006e2a3" alt=""
What does API mean and what is it?
- Application Programming Interface
- Protocols and routines for building and interacting with software applications
- allows two software programs to communicate with each other
What is JSON?
- JavaScript Object Notation
- Real-time server-to-browser-communication
- It is readable by humans
- loads as a dictionary {}
Load the JSON ‘a_movie.json’ into the variable json_data, which will be a dictionary. Next, explore the JSON contents by printing the key-value pairs of json_data to the shell.
data:image/s3,"s3://crabby-images/7d8b6/7d8b6cd0f0f9f0b0e7db008f1c4407d967a5f149" alt=""
How do you connect to an API?
data:image/s3,"s3://crabby-images/e9b0a/e9b0a82965546bde349ba98e5201ee4f4823cf76" alt=""
Import and print the text from the URL, http://www.omdbapi.com/ with these two arguments in the query string; apikey=72bc447a and t=the+social+network
data:image/s3,"s3://crabby-images/79178/79178935b84087d7b43e800a2fb916a3cc90b21f" alt=""
Which function do you use to decode a JSON to a dictionary?
the json() method
so…
url = ‘https://en.wikipedia.org/w/api.php?action=query&prop=extracts&format=json&exintro=&titles=pizza’
# Package the request, send the request and catch the response: r
r = requests.get(url)
# Decode the JSON data into a dictionary: json_data
json_data = r.json()
What are some of the most common data problems?
- Inconsistent column names
- Missing data (NaN)
- Outliers
- Duplicate rows
- Untidy
- Need to process columns
- Column types can signal unexpected data values
What are some of the most important functions and attributes in pandas to check out your data?
- df.columns #to get column names (an attribute, so no parentheses)
- df.head() #first 5 rows by default
- df.tail() #last 5 rows by default
- df.shape #shape of dataframe as (rows, columns) (also an attribute, so no parentheses)
- df.info() #various info about the dataframe, non-missing values, data type etc.
- df.describe() #getting basic statistics ONLY on numeric columns
How can you get the frequency counts for each unique value in a column?
whether the data is numeric or not, you can use
.value_counts(dropna=False)
dropna=False is important to add to see missing values (NaN)
note that it is value_counts with s
What are great plotting tools to inspect your data and its outliers?
import matplotlib.pyplot as plt
- df[‘Existing Zoning Sqft’].plot(kind=’hist’, rot=70, logx=True, logy=True) #here setting both x and y axis as logarithmic
- df.boxplot(column=’initial_cost’, by=’Borough’, rot=90) # comparing initial_cost column across different values of the Borough column
- great for numeric values to be compared across different categories
- df.plot(kind=’scatter’, x=’initial_cost’, y=’total_est_fee’, rot=70)
- great for visualizing two numeric columns
plt.show()
What are the 3 principles of tidy data?
- Columns represent separate variables
- Rows represent individual observations
- Observational units form tables
Tidy data makes it easier to fix common data problems
data:image/s3,"s3://crabby-images/c6948/c694881967ff039e1b5442552a4cd298b1acf738" alt=""
What is this doing?
data:image/s3,"s3://crabby-images/192d0/192d0cfe9347e36451857078c616e3a75b1a6e11" alt=""
- We are melting columns except ‘month’ and ‘day’
- we call the value-column = reading
- and the variable column = measurement
Melting turns columns into rows
What does pivoting do?
turn unique values into separate columns
why:
- analysis friendly shape to reporting friendly shape
While melting takes a set of columns and turns it into a single column, pivoting will create a new column for each unique value in a specified column.
What is this doing?
data:image/s3,"s3://crabby-images/de18e/de18eaacdb3386ade4e61f67bd8076a80cc9d050" alt=""
Pivoting such that
- we do not change the two columns ‘month’ and ‘day’
- we change the ‘measurement’ column and split the unique values in this column into new columns
- and take the values from ‘reading’ and plug them into those new columns
data:image/s3,"s3://crabby-images/a6475/a647555dd11f29bfdc8e4cc6b58af6280f7057b0" alt=""
How would you split a column containing gender and age values in one such as m014, m65, f65, f4554 and so forth?
data:image/s3,"s3://crabby-images/ce9f8/ce9f8fb92bd45518cdb4e237d4bfb758ca2ac788" alt=""
How would you split a column with values such as Cases_Guinea, Deaths_Guinea, Cases_Liberia, Deaths_Liberia?
_ serves as a delimiter
we can use .str.split(‘_’) on the column
data:image/s3,"s3://crabby-images/4326c/4326cc85643eefce239f42f9daacd2104dc53abe" alt=""
How do you concatenate data?
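concatenating ROWS (stacking DataFrames on top of each other)
pd.concat([firstset, secondset, thirdset, etcset], axis=0)
concatenating COLUMNS (placing DataFrames side by side)
pd.concat([firstset, secondset, thirdset, etcset], axis=1)
NOTE… The DataFrames MUST be passed in a list to be concatenated.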
What is globbing about?
pattern matching for file names
wildcards: * (any number of characters) and ? (exactly one character)
- any csv file: *.csv
- any single character: file_?.csv
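The two wildcards can be tried without touching the file system by using fnmatch from the standard library, which applies the same shell-style patterns that the glob module uses. The file names below are made up for illustration.

```python
import fnmatch

names = ['file_1.csv', 'file_2.csv', 'file_10.csv', 'notes.txt']

# * matches any number of characters
all_csv = fnmatch.filter(names, '*.csv')

# ? matches exactly one character, so file_10.csv is excluded
single = fnmatch.filter(names, 'file_?.csv')

print(all_csv)
print(single)
```

With real files, glob.glob('*.csv') returns the matching paths in the current directory.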
How do you merge data?
pd.merge(left=dd, right=dd_d, on=None, left_on=’something’, right_on=’something’)
example:
two dataframes; visited and site
with matching key columns that have different names: “name” in site and “site” in visited
o2o = pd.merge(left=site, right=visited, left_on=’name’, right_on=’site’)
Many-to-many data merge ; The final merging scenario occurs when both DataFrames do not have unique keys for a merge. What happens here is that for each duplicated key, every pairwise combination will be created.
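A rough sketch of both cases. The values in site and visited are invented, and left/right are hypothetical frames that only serve to show the many-to-many behaviour.

```python
import pandas as pd

site = pd.DataFrame({"name": ["DR-1", "DR-3"], "lat": [-49.85, -47.15]})
visited = pd.DataFrame({"ident": [619, 622, 734],
                        "site": ["DR-1", "DR-1", "DR-3"]})

# The key columns have different names, so use left_on / right_on
merged = pd.merge(left=site, right=visited, left_on="name", right_on="site")

# Many-to-many: key 'a' appears twice on each side,
# so 2 x 2 = 4 row combinations are produced
left = pd.DataFrame({"key": ["a", "a"], "l": [1, 2]})
right = pd.DataFrame({"key": ["a", "a"], "r": [3, 4]})
m2m = pd.merge(left, right, on="key")

print(merged)
print(m2m)
```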
How can you see the data types in your DataFrame?
print(df.dtypes)
How do you convert an object column to categorical?
Converting to categorical lowers memory usage
data:image/s3,"s3://crabby-images/417ba/417bac2d09934adeb0ada4abf946c6b2e8a19e5f" alt=""
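A possible version of the conversion, using a made-up ‘sex’ column and comparing memory before and after.

```python
import pandas as pd

# Hypothetical column with few distinct values repeated many times
df = pd.DataFrame({"sex": ["Female", "Male", "Male", "Female"] * 1000})

before = df["sex"].memory_usage(deep=True)
df["sex"] = df["sex"].astype("category")
after = df["sex"].memory_usage(deep=True)

print(df["sex"].dtype)
print(before, after)
```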
How do you convert a column to numeric?
data:image/s3,"s3://crabby-images/a26c7/a26c79f8bfe2991255e97bf72cf3d6934e4db9cb" alt=""
How can you make the computer interpret $17.00 with escape sequences?
data:image/s3,"s3://crabby-images/8c6bb/8c6bb965b3907a790d52bb5d3a12119d13fb667e" alt=""
How can you make the computer interpret $17.89 with escape sequences? #make sure it only takes two decimals for the monetary value
data:image/s3,"s3://crabby-images/c96d6/c96d64f08729e67726e0ec41c1d3bd8e801085d2" alt=""
How do you write a pattern that matches a US phone number in the format xxx-xxx-xxxx?
data:image/s3,"s3://crabby-images/4a8e6/4a8e68235167469c4f91873d863979f01f8281c0" alt=""
How can you make Python extract the numbers found in a string such as ‘the recipe calls for 10 strawberries and 1 banana’?
data:image/s3,"s3://crabby-images/a6e62/a6e621d71b545281b4e3586d4fcdad0544adf3ac" alt=""
How could you write a pattern that returns True for ‘$123.45’?
data:image/s3,"s3://crabby-images/48e4d/48e4d7cda555ebab9436f9fe57059f344b65d0a7" alt=""
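The pattern-matching cards above could be answered along these lines with the standard re module. The exact patterns on the cards are only in the images, so these are one reasonable choice, not necessarily the original ones.

```python
import re

# \$ escapes the dollar sign, \d* is any run of digits,
# \. is a literal dot, \d{2} is exactly two decimals
money = re.compile(r"^\$\d*\.\d{2}$")

# Three digits, dash, three digits, dash, four digits
phone = re.compile(r"^\d{3}-\d{3}-\d{4}$")

money_ok = bool(money.match("$17.89"))
money_long = bool(money.match("$17.895"))     # three decimals: no match
money_big = bool(money.match("$123.45"))
phone_ok = bool(phone.match("123-456-7890"))

# findall returns every run of digits as a string
numbers = re.findall(r"\d+", "the recipe calls for 10 strawberries and 1 banana")

print(money_ok, money_long, money_big, phone_ok, numbers)
```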
How can you remove duplicate data?
df.drop_duplicates()
It is important to remove duplicate rows because they take up memory and cause unnecessary calculations when processing data
How can you fill missing values?
.fillna()
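A short sketch of both calls on a made-up frame. Filling with 0.0 is just an example value.

```python
import pandas as pd

df = pd.DataFrame({"name": ["a", "a", "b"], "score": [1.0, 1.0, None]})

# Drop rows that are exact copies of an earlier row
deduped = df.drop_duplicates()

# Fill missing values in 'score' with a chosen value
filled = deduped.fillna({"score": 0.0})
print(filled)
```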
How do you make an assert statement?
data:image/s3,"s3://crabby-images/87093/87093fdffaf73f888a37ae01045549d0038c5e82" alt=""
What is this doing?
data:image/s3,"s3://crabby-images/32f5d/32f5d08a469dafdd88072b58cea0cfaa4109e371" alt=""
- We have a dataset, ‘gapminder’, with 780 rows and 218 columns
- each column represents a year of life expectancy data
- but the very first column [0] holds the country names
- we then melt the data so that we get rows instead of columns: each row holds a year, the life expectancy and the associated country
- we leave the country column untouched by passing ‘life expectancy’ as id_vars
- we then rename the columns to ‘country’, ‘year’ and ‘life_expectancy’
- and print the head of it
data:image/s3,"s3://crabby-images/0ecdd/0ecdd549f4bc3d0c097a38e0446ce4d156d4cba4" alt=""
How do you convert the column ‘year’ in the df ‘gapminder’ to numeric AND assert that the change has been made?
data:image/s3,"s3://crabby-images/006c4/006c40b63f326d9c043a9b83a1138f9271fc6d7d" alt=""
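A sketch of the convert-then-assert idea. The three-column gapminder frame is a tiny stand-in for the real data.

```python
import pandas as pd

# Tiny stand-in for the melted gapminder data; 'year' starts out as text
gapminder = pd.DataFrame({
    "country": ["A", "B"],
    "year": ["1800", "1801"],
    "life_expectancy": [30.1, 31.2],
})

gapminder["year"] = pd.to_numeric(gapminder["year"])

# assert stays silent when the condition holds and raises AssertionError otherwise
assert pd.api.types.is_integer_dtype(gapminder["year"])
assert gapminder["year"].notnull().all()
```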
Say you have two lists, ‘list_keys’ and ‘list_values’, how do you create a dictionary and then a dataframe?
data:image/s3,"s3://crabby-images/ab116/ab116f6ac861306ae4037abebe9d9001e6e6ed4c" alt=""
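A possible answer: zip the two lists, build a dict, then pass it to pd.DataFrame. The list contents are made up.

```python
import pandas as pd

list_keys = ["Country", "Total"]
list_values = [["United States", "Russia"], [1118, 473]]

# Pair each key with its list of values
zipped = list(zip(list_keys, list_values))

# dict() turns the pairs into {column name: column values}
data = dict(zipped)

df = pd.DataFrame(data)
print(df)
```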
How can you in a structured way rename the columns of your dataframe?
data:image/s3,"s3://crabby-images/ceb47/ceb474ff2e3d213b0dd4e9a01a034dd768a28773" alt=""
Say we have a list of cities in Pennsylvania, how can we create a dataframe from this list AND add a column with the value ‘PA’ for Pennsylvania in all rows?
data:image/s3,"s3://crabby-images/3dba8/3dba8ac528b91d8bd24e71729571699c5f85dd9f" alt=""
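One way to do this relies on broadcasting: a single scalar in the dict is repeated for every row. The city names are placeholders.

```python
import pandas as pd

cities = ["Manheim", "Preston park", "Biglerville"]

# 'PA' is a scalar, so pandas repeats it to match the length of cities
df = pd.DataFrame({"state": "PA", "city": cities})
print(df)
```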
What are the 3 ways to create a histogram?
- df.plot(kind='hist')
- df.plot.hist()
- df.hist()
NOTE they have different syntax
How can you get the 5th and 95th percentile of a dataframe?
print(df.quantile([0.05, 0.95]))
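A quick check on made-up data where the answer is easy to see: the integers 0 to 100.

```python
import pandas as pd

df = pd.DataFrame({"x": list(range(101))})

# One row per requested quantile, one column per numeric column
q = df.quantile([0.05, 0.95])
print(q)
```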
How can you downsample hourly data of ‘Temperature’ in df to six hours and to daily?
data:image/s3,"s3://crabby-images/9106b/9106b66556a44c83d86f103db9e429b6120a6220" alt=""
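A sketch of downsampling with resample, assuming the mean is the aggregation wanted (the card's own code is only in the image). The hourly data is invented.

```python
import pandas as pd

# Two days of hypothetical hourly readings
idx = pd.date_range("2010-08-01", periods=48, freq="h")
df = pd.DataFrame({"Temperature": list(range(48))}, index=idx)

# Downsample to 6-hour bins and to daily bins, averaging within each bin
six_hourly = df["Temperature"].resample("6h").mean()
daily = df["Temperature"].resample("D").mean()

print(six_hourly.head())
print(daily)
```

resample needs a DatetimeIndex, and it always has to be followed by an aggregation such as mean(), max() or count().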
Say we have a df with the column ‘Temperature’ and we want to extract all of august 2010 and show the maximum daily temperature?
data:image/s3,"s3://crabby-images/b886f/b886ffbbd3650f616230c617e42661a432fed79b" alt=""
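One possible approach: select the month with partial string indexing, then resample to days and take the maximum. The data is made up so that every daily maximum is 23.

```python
import pandas as pd

# Hypothetical hourly data; the "temperature" is simply the hour of the day
idx = pd.date_range("2010-07-31", "2010-09-01 23:00", freq="h")
df = pd.DataFrame({"Temperature": list(idx.hour)}, index=idx)

# '2010-08' selects every row in August 2010
august = df.loc["2010-08", "Temperature"]

# Daily bins, keeping the highest value in each
daily_max = august.resample("D").max()
print(daily_max.head())
```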
How can you convert a column, say ‘wind_speed’ in df to numeric values?
df['wind_speed'] = pd.to_numeric(df['wind_speed'], errors='coerce')
Filter rows of august_2011 to keep days where the value exceeded august_max. Store the result in august_2011_high.
august_2011_high = august_2011.loc[august_2011 > august_max]
this uses .loc with a boolean condition
both august_2011 and august_max are already defined as variables
What is the difference between selecting df[‘eggs’] and df[[‘eggs’]]?
df[‘eggs’] returns a series (one-dimensional labeled data)
df[[‘eggs’]] returns a DataFrame
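A quick way to see the difference on a made-up frame.

```python
import pandas as pd

df = pd.DataFrame({"eggs": [47, 110], "salt": [12.0, 50.0]})

as_series = df["eggs"]     # single label: one-dimensional Series
as_frame = df[["eggs"]]    # list of labels: DataFrame with one column

print(type(as_series).__name__, as_series.shape)
print(type(as_frame).__name__, as_frame.shape)
```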
How can you assign an index name and an overall column name to a DataFrame, df?
df.index.name = 'MONTHS'
df.columns.name = 'PRODUCTS'
just as you set the index labels with
df.index = ['Jan', 'Feb', 'Mar', 'Apr']
Say we have the dataframe, ‘sales’ and want to make both the ‘state’ and the ‘month’ column indexes, what will we have to do?
data:image/s3,"s3://crabby-images/50bba/50bbaea0c7b0cc026d605c3aded21f803cf944b7" alt=""
# Set the index to be the columns ['state', 'month']: sales
sales = sales.set_index(['state', 'month'])
# Sort the MultiIndex: sales
sales = sales.sort_index()
# Print the sales DataFrame
print(sales)
Can you pivot a dataframe that has multiindex?
No…
you first have to unstack the MultiIndex, which moves an index level into the columns.
df.unstack(level='indexcolumnname')
or
df.unstack(level=1) # moves the second index level into the columns
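A small sketch of unstacking, reusing the ‘state’ and ‘month’ index idea from the sales card. The numbers and the ‘eggs’ column are made up.

```python
import pandas as pd

sales = pd.DataFrame({
    "state": ["CA", "CA", "NY", "NY"],
    "month": [1, 2, 1, 2],
    "eggs": [47, 110, 221, 77],
})
sales = sales.set_index(["state", "month"]).sort_index()

# Move the 'month' index level into the columns:
# one row per state, one ('eggs', month) column per month
by_month = sales.unstack(level="month")
print(by_month)
```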
How can you switch around indexes in a multi-index?
swapped = df.swaplevel(0, 1)
swapped.sort_index()
data:image/s3,"s3://crabby-images/10edd/10edd7e127f2cc00107cc1ad255b4b11aa4803c1" alt=""
How can we apply a z-score to a DataFrame after grouping by ‘region’ and selecting the columns ‘life’ and ‘fertility’?
standardized = gapminder_2010.groupby('region')[['life', 'fertility']].transform(zscore)
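A runnable sketch of the grouped transform. To avoid an extra dependency, zscore is written by hand here as a stand-in for an imported version, using the population standard deviation (ddof=0). The data is invented.

```python
import pandas as pd

def zscore(values):
    # Hand-written stand-in: (value - mean) / population standard deviation
    return (values - values.mean()) / values.std(ddof=0)

gapminder_2010 = pd.DataFrame({
    "region": ["Europe", "Europe", "Asia", "Asia"],
    "life": [80.0, 82.0, 70.0, 74.0],
    "fertility": [1.5, 1.7, 2.0, 2.4],
})

# transform applies zscore within each region and keeps the original row order
standardized = (
    gapminder_2010.groupby("region")[["life", "fertility"]].transform(zscore)
)
print(standardized)
```

Unlike an aggregation, transform returns one value per original row, so the result has the same length as the input.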
How does idxmax() and idxmin() work?
- idxmax()
- Row or column label where the maximum value is located
- idxmax(axis='columns') # for each row, the column label holding the maximum
- idxmin()
- Row or column label where the minimum value is located
- idxmin(axis='columns') # for each row, the column label holding the minimum
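A small made-up medal table shows what each call returns.

```python
import pandas as pd

df = pd.DataFrame({"gold": [10, 3], "silver": [5, 8]}, index=["USA", "GBR"])

# Default axis: for each column, the row label of the largest value
top_rows = df.idxmax()

# axis='columns': for each row, the column label of the largest value
top_cols = df.idxmax(axis="columns")

# Same idea for the smallest value
low_rows = df.idxmin()

print(top_rows)
print(top_cols)
print(low_rows)
```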
what type of plot do you get from usa_medals_by_year.plot.area()?
data:image/s3,"s3://crabby-images/a0b85/a0b85b628804bb0565534021bdba34ba6346e46e" alt=""
You may have noticed that the medals are ordered according to a lexicographic (dictionary) ordering: Bronze < Gold < Silver. However, you would prefer an ordering consistent with the Olympic rules: Bronze < Silver < Gold. How can we change the ordering?
You can achieve this using Categorical types.
Redefine the ‘Medal’ column of the DataFrame medals as an ordered categorical. To do this, use pd.Categorical()
medals.Medal = pd.Categorical(values=medals.Medal, categories=['Bronze', 'Silver', 'Gold'], ordered=True)
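A quick demonstration on a tiny made-up medals frame: after the conversion, sorting follows the category order instead of the alphabet.

```python
import pandas as pd

medals = pd.DataFrame({"Medal": ["Gold", "Bronze", "Silver", "Gold"]})

medals["Medal"] = pd.Categorical(
    values=medals["Medal"],
    categories=["Bronze", "Silver", "Gold"],  # lowest to highest
    ordered=True,
)

sorted_medals = medals.sort_values("Medal")
print(sorted_medals)
```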
Say you have a list of csv files, filenames = [‘Gold.csv’, ‘Silver.csv’, ‘Bronze.csv’], how can we import all these?
use a loop….
# Create the list of three DataFrames: dataframes
dataframes = []
for filename in filenames:
    dataframes.append(pd.read_csv(filename))
eu_metro_areas.drop("High Population", axis=1, inplace=True)
eu_metro_areas
What is the purpose of inplace = True?
inplace is False by default.
When True, the current DataFrame is changed directly.
When not set (and thus False), the call returns a new DataFrame without the column and leaves the original unchanged, so the result has to be assigned to a variable to be kept.
What will this do?
z.replace({"Nowhere": "A city"}, inplace=True)
z
we replace the string ‘Nowhere’ with ‘A city’ and overwrite the current DataFrame because inplace=True.
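A small sketch contrasting the two behaviours on a made-up frame z.

```python
import pandas as pd

z = pd.DataFrame({"city": ["Nowhere", "Paris"]})

# Without inplace: a new DataFrame is returned, z itself is untouched
replaced = z.replace({"Nowhere": "A city"})
still_original = z["city"].tolist()

# With inplace=True: z itself is modified
z.replace({"Nowhere": "A city"}, inplace=True)
now_changed = z["city"].tolist()

print(still_original, now_changed)
```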
How do you convert a DataFrame to CSV?
eu_metro_areas.to_csv('My_metro_areas.csv', sep=',')
How do you convert a DataFrame to Excel?
eu_metro_areas.to_excel('My_metro_areas.xlsx')
What is rug=True doing in sns.distplot(eu_metro_areas["GDP"], bins=7, rug=True)?
it draws a small tick on the axis for each observation/datapoint, so we can see where the data points fall and how many there are
data:image/s3,"s3://crabby-images/b49f5/b49f5a841bb825fc8889d6779656fb6de1905289" alt=""
What join does pd.concat take as default?
outer
How do you add a column to an existing dataframe, df?
df['newcolumnname'] = ['list', 'of', 'values', 'for', 'new', 'column']
When should you use .append(), pd.concat(), df.join() and pd.merge()?
- .append()
- stacking vertically (DataFrame.append was removed in pandas 2.0; use pd.concat instead)
- pd.concat()
- stacking many horizontally or vertically
- simple inner/outer joins on indexes
- df.join()
- inner, outer, left, right joins on indexes
- pd.merge()
- many joins on multiple columns
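A compact comparison of the three tools that remain in current pandas, on two made-up frames sharing a ‘key’ column.

```python
import pandas as pd

left = pd.DataFrame({"key": ["x", "y"], "a": [1, 2]})
right = pd.DataFrame({"key": ["y", "z"], "b": [3, 4]})

# pd.concat: stack frames; columns missing on one side are filled with NaN
stacked = pd.concat([left, right], ignore_index=True)

# df.join: combine on the index, here as a left join
joined = left.set_index("key").join(right.set_index("key"), how="left")

# pd.merge: combine on one or more columns, here an inner join
merged = pd.merge(left, right, on="key", how="inner")

print(stacked)
print(joined)
print(merged)
```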
What join does pd.merge_ordered take as default?
outer
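A short sketch showing the outer default of pd.merge_ordered on two made-up frames with a shared ‘date’ column.

```python
import pandas as pd

a = pd.DataFrame({
    "date": pd.to_datetime(["2016-01-01", "2016-03-01"]),
    "x": [1, 3],
})
b = pd.DataFrame({
    "date": pd.to_datetime(["2016-02-01", "2016-03-01"]),
    "y": [20, 30],
})

# Outer join by default: all dates from both frames are kept, sorted by 'date'
ordered = pd.merge_ordered(a, b, on="date")
print(ordered)
```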