PythonCheatsheet.Org Flashcards
Studying https://www.pythoncheatsheet.org/cheatsheet/basics
Math Operators
From highest to lowest precedence:
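A few quick checks of how precedence plays out, as a minimal sketch using Python's standard arithmetic operators (the variable names are only for illustration):

```python
# Exponentiation binds tighter than multiplication and division,
# which in turn bind tighter than addition and subtraction.
a = 2 + 3 * 6      # 20, not 30
b = (2 + 3) * 6    # 30, parentheses override precedence
c = 2 ** 8         # 256
d = 23 // 7        # 3, floor division
e = 23 % 7         # 2, remainder
f = 2 * 3 ** 2     # 18, because ** is applied before *
g = 23 / 7         # true division always returns a float

print(a, b, c, d, e, f)  # 20 30 256 3 2 18
```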
Augmented Assignment Operators
Walrus Operator
The Walrus Operator allows assignment of variables within an expression while returning the value of the variable
The Walrus Operator, or Assignment Expression Operator, was first introduced in 2018 via PEP 572 and officially released with Python 3.8 in October 2019.
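As a small sketch of the idea (the `readings` list and variable names are made up for illustration), the walrus operator lets a value be assigned and tested in the same expression:

```python
# Hypothetical data, purely for illustration.
readings = [3, 8, 12, 5, 20]

# Without the walrus operator: compute first, then test.
count = len(readings)
if count > 3:
    plain_message = f"{count} readings"

# With the walrus operator: assign and test in one expression.
if (n := len(readings)) > 3:
    walrus_message = f"{n} readings"

# It can also reuse a computed value inside a comprehension.
doubled_big = [d for r in readings if (d := r * 2) > 15]

print(walrus_message)  # 5 readings
print(doubled_big)     # [16, 24, 40]
```

The parentheses around `n := len(readings)` matter: without them the comparison would be evaluated first and its result assigned to `n`.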
Data Types
Concatenation and Replication
Variable naming rules
- It can only be one word.
- It can only use letters, numbers, and the underscore (_) character.
- It can’t begin with a number.
- Variables starting with an underscore (_) are considered “unuseful”.
Comments
The print()
Function
The print()
function writes the value of the argument(s) it is given. […] it handles multiple arguments, floating point-quantities, and strings.
Strings are printed without quotes, and a space is inserted between items, so you can format things nicely:
The end
keyword
The keyword argument end
can be used to avoid the newline after the output, or end the output with a different string:
The sep
keyword
The keyword argument sep
specifies how to separate the objects when there is more than one:
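A minimal sketch of both keywords follows. It writes to an in-memory buffer through `print()`'s `file` argument only so the exact output can be inspected; printing to the terminal behaves the same way.

```python
import io

buffer = io.StringIO()  # stands in for the terminal

# sep replaces the default single space between arguments.
print("cats", "dogs", "mice", sep=", ", file=buffer)

# end replaces the default trailing newline.
print("Hello", end=" ", file=buffer)
print("World", file=buffer)

output = buffer.getvalue()
print(repr(output))  # 'cats, dogs, mice\nHello World\n'
```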
The input()
Function
- This function takes the input from the user and converts it into a string.
- input() can also set a default message without using print().
- It is also possible to use formatted strings to avoid using .format.
The len()
Function
Evaluates to the integer value of the number of characters in a string, list, dictionary, etc.
*don’t use it to test emptiness
Should you use len()
to test emptiness?
No. Emptiness tests for strings, lists, dictionaries, etc. should not use len(); prefer direct boolean evaluation.
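A short sketch of direct boolean evaluation (the `describe` helper is hypothetical, not part of the cheatsheet):

```python
def describe(items):
    """Report whether a container is empty using direct boolean evaluation."""
    # Empty strings, lists, dicts, tuples and sets are all falsy.
    if items:
        return "has items"
    return "empty"

print(describe([]))         # empty
print(describe([1, 2, 3]))  # has items
print(describe(""))         # empty
print(describe({"a": 1}))   # has items
```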
The str()
, int()
, and float()
Functions
These functions allow you to change the type of variable. For example, you can transform from an integer or float to a string. Or from a string to an integer or float.
abs()
Return the absolute value of a number.
aiter()
Return an asynchronous iterator for an asynchronous iterable.
all()
Return True
if all elements of the iterable are true.
any()
Return True
if any element of the iterable is true.
ascii()
Return a string
with a printable representation of an object.
bin()
Convert an integer number to a binary string.
bool()
Return a Boolean value.
breakpoint()
Drops you into the debugger at the call site.
bytearray()
Return a new array of bytes.
bytes()
Return a new “bytes” object.
callable()
Return True
if the object argument is callable, False if not.
chr()
Return the string representing a character.
classmethod()
Transform a method into a class method.
compile()
Compile the source into a code or AST object.
complex()
Return a complex number with the value real + imag*1j
.
delattr()
Deletes the named attribute, provided the object allows it.
dict()
Create a new dictionary
dir()
Return the list of names in the current local scope
divmod()
Return a pair of numbers consisting of their quotient and remainder.
enumerate()
Return an enumerate object.
eval()
Evaluates and executes an expression.
exec()
This function supports dynamic execution of Python code.
filter()
Construct an iterator from those elements of an iterable for which a function returns true.
float()
Return a floating point number from a number or string
format()
Convert a value to a “formatted” representation
frozenset()
Return a new frozenset object
getattr()
Return the value of the named attribute of the object
globals()
Return the dictionary implementing the current module namespace.
hasattr()
Return True
if the string is the name of one of the object’s attributes.
hash()
Return the hash value of the object
help()
Invoke the built-in help system
hex()
Convert an integer number to a lowercase hexadecimal string
id()
Return the “identity” of an object
input()
This function takes an input and converts it into a string
int()
Return an integer object constructed from a number or string
isinstance()
Return True
if the object argument is an instance of the classinfo argument
issubclass()
Return True
if the class is a subclass of classinfo
iter()
Return an iterator object
len()
Return the length (the number of items) of an object.
list()
Rather than being a function, list is a mutable sequence type
locals()
Update and return a dictionary with the current local symbol table
map()
Return an iterator that applies function to every item of iterable
max()
Return the largest item in an iterable
min()
Return the smallest item in an iterable
next()
Retrieve the next item from the iterator.
object()
Return a new featureless object
oct()
Convert an integer to an octal string
open()
Open a file and return a corresponding file object
ord()
Return an integer representing the Unicode code point of a character
pow()
Return base to the power exp.
print()
Print objects to the text stream file
property()
Return a property attribute
repr()
Return a string containing a printable representation of an object
reversed()
Return a reverse iterator
round()
Return number rounded to ndigits precision after the decimal point.
set()
Return a new set
object
setattr()
This is the counterpart of getattr()
slice()
Return a slice object representing a set of indices
sorted()
Return a new sorted list from the items in iterable
staticmethod()
Transform a method into a static method
str()
Return a str version of object
sum()
Sums start and the items of an iterable
super()
Return a proxy object that delegates method calls to a parent or sibling
tuple()
Rather than being a function, tuple is actually an immutable sequence type
vars()
Return the __dict__
attribute for a module, class, instance, or any other object with a __dict__ attribute
zip()
Iterate over several iterables in parallel
__import__()
This function is invoked by the import statement
Comparison Operators
evaluate to True
or False
depending on the values you give them.
Boolean operators
there are 3: and
, or
, and not
The order of precedence, from highest to lowest, is not
, and
, and or
The and
Operators Truth table
The or
Operators Truth table
The not
Operators Truth table
Can you mix boolean and comparison operators?
yes.
>>> 2 + 2 == 4 and not 2 + 2 == 5 and 2 * 2 == 2 + 2
True
>>> # and binds tighter than or, so the next line is read as
>>> # 5 > 4 or (3 < 4 and 5 > 5), which is True or False, giving True.
>>> 5 > 4 or 3 < 4 and 5 > 5
True
>>> # Now the parentheses group the or first,
>>> # so the result is True and False, giving False.
>>> (5 > 4 or 3 < 4) and 5 > 5
False
if
, elif
, else
The if
statement evaluates an expression, and if that expression is True
, it then executes the following indented code.
The else
statement executes only if the evaluation of the if
and all the elif
expressions are False
.
Only after the if
statement expression is False
, the elif
statement is evaluated and executed.
the elif
and else
parts are optional.
Ternary Conditional Operator
Many programming languages have a ternary operator, which defines a conditional expression. The most common usage is to make a terse, simple conditional assignment statement. In other words, it offers one-line code that evaluates the first expression if the condition is true, and otherwise evaluates the second expression.
*Ternary operators can be chained.
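A minimal sketch of Python's conditional expression, including a chained form (the `age` value and labels are invented for illustration):

```python
age = 15

# value_if_true if condition else value_if_false
status = "adult" if age >= 18 else "minor"

# Chained: conditions are checked left to right.
group = "kid" if age < 13 else "teen" if age < 18 else "adult"

print(status)  # minor
print(group)   # teen
```

Chaining more than two conditions this way quickly hurts readability; an if/elif/else block is usually clearer at that point.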
Switch-Case Statement
In computer programming languages, a switch statement is a type of selection control mechanism used to allow the value of a variable or expression to change the control flow of program execution via search and map.
Switch-Case statements, or Structural Pattern Matching, were first proposed in 2020 via PEP 622 (later superseded by PEPs 634 to 636) and officially released with Python 3.10 in October 2021.
Matching single values
Matching with the or Pattern
Matching by the length of an Iterable
Matching default value:
Matching Builtin Classes
Guarding Match-Case Statements
while
Loop Statements
The while
statement is used for repeated execution as long as an expression is True
:
break
Statements
If the execution reaches a break
statement, it immediately exits the while
loop’s clause:
continue
Statements
When the program execution reaches a continue
statement, the program execution immediately jumps back to the start of the loop.
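A small sketch combining while, break and continue (the variable names are illustrative): it collects the odd numbers from 1 to 10.

```python
collected = []
number = 0

while True:            # loop until a break is reached
    number += 1
    if number > 10:
        break          # exits the loop immediately
    if number % 2 == 0:
        continue       # skips the rest of this pass and re-checks the condition
    collected.append(number)

print(collected)  # [1, 3, 5, 7, 9]
```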
For loop
The for
loop iterates over a list
, tuple
, dictionary
, set
or string
:
The range()
function
The range()
function returns a sequence of numbers. It starts from 0, increments by 1
, and stops before a specified number.
The range()
function can also take up to 3 arguments, overriding its defaults. The first two are the start
and stop
values, and the third will be the step
argument. The step is the amount that the variable is increased by after each iteration.
You can even use a negative number for the step
argument to make the for loop
count down instead of up.
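The three forms of range() described above, as a quick sketch (wrapped in list() only to make the generated numbers visible):

```python
up_to_five = list(range(5))         # stop only: starts at 0, step 1
two_to_five = list(range(2, 6))     # start and stop
evens = list(range(0, 10, 2))       # start, stop and step
countdown = list(range(5, 0, -1))   # negative step counts down

print(up_to_five)   # [0, 1, 2, 3, 4]
print(two_to_five)  # [2, 3, 4, 5]
print(evens)        # [0, 2, 4, 6, 8]
print(countdown)    # [5, 4, 3, 2, 1]
```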
For else
statement
This lets you specify a block to execute once the loop has run to completion without hitting a break. It is only useful when a break
condition can occur in the loop:
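A minimal sketch of for/else (the `find_first_multiple` function is a made-up example): the else block runs only when the loop was not interrupted by break.

```python
def find_first_multiple(numbers, divisor):
    """Return a message about the first multiple of divisor in numbers."""
    for n in numbers:
        if n % divisor == 0:
            result = f"found {n}"
            break              # skips the else clause
    else:
        # Runs only when the loop finished without hitting break.
        result = "no multiple found"
    return result

print(find_first_multiple([1, 2, 3, 4, 5], 3))  # found 3
print(find_first_multiple([1, 2, 4], 3))        # no multiple found
```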
Ending a Program with sys.exit()
The sys.exit()
function allows exiting Python.
Function Arguments
A function can take arguments
and return values
:
In the following example, the function say_hello receives the argument “name” and prints a greeting:
Keyword Arguments
To improve code readability, we should be as explicit as possible. We can achieve this in our functions by using Keyword Arguments:
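As a small sketch (the `say_hi` function is hypothetical), the same call can be written positionally or with keyword arguments; with keywords the order no longer matters and the intent is easier to read:

```python
def say_hi(name, greeting):
    """Build a greeting and hand it back with a return statement."""
    return f"{greeting} {name}"

positional = say_hi("John", "Hello")
keyword = say_hi(greeting="Hello", name="John")  # order no longer matters

print(positional)  # Hello John
print(keyword)     # Hello John
```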
Return Values
When creating a function using the def
statement, you can specify what the return value should be with a return
statement. A return statement consists of the following:
The return
keyword.
The value or expression that the function should return.
Local and Global Scope
Code in the global scope
cannot use any local variables
.
However, a local scope
can access global variables
.
Code in a function’s local scope
cannot use variables in any other local scope
.
You can use the same name for different variables if they are in different scopes. That is, there can be a local variable
named spam
and a global variable
also named spam
.
The global
Statement
If you need to modify a global variable
from within a function, use the global
statement:
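A minimal sketch of the difference, run at module level (the names `counter`, `increment` and `shadow` are invented for illustration):

```python
counter = 0  # global variable

def increment():
    global counter   # refer to the module-level name instead of creating a local
    counter += 1

def shadow():
    counter = 100    # assignment without global creates a separate local variable
    return counter

increment()
increment()
local_value = shadow()

print(counter)      # 2
print(local_value)  # 100
```

Calling `shadow()` leaves the global `counter` untouched, because its assignment binds a new local name.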
What are the four rules to tell whether a variable is in a local scope or global scope?
- If a variable is being used in the global scope (that is, outside all functions), then it is always a global variable.
- If there is a global statement for that variable in a function, it is a global variable.
- Otherwise, if the variable is used in an assignment statement in the function, it is a local variable.
- But if the variable is not used in an assignment statement, it is a global variable.
Lambda Functions
In Python, a lambda
function is a single-line, anonymous function, which can have any number of arguments, but it can only have one expression.
Lambda functions can only evaluate an expression, like a single line of code.
lambda
is a minimal function definition that can be used inside an expression. Unlike FunctionDef, body holds a single node.
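A short sketch of two common uses (the names and data are illustrative): binding a lambda to a name, and passing one inline as a sort key.

```python
# Equivalent to: def add(x, y): return x + y
add = lambda x, y: x + y

pairs = [(1, "one"), (3, "three"), (2, "two")]
# The lambda is used inside an expression, as the key function.
by_number = sorted(pairs, key=lambda pair: pair[0])

print(add(5, 3))   # 8
print(by_number)   # [(1, 'one'), (2, 'two'), (3, 'three')]
```

For anything that needs a name or more than one expression, a regular def is usually the better choice.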
Python Lists
Lists are one of the 4 data types in Python used to store collections of data.
['John', 'Peter', 'Debora', 'Charles']
Getting list values with indexes
*lists
‘table’
>>> furniture = ['table', 'chair', 'rack', 'shelf'] >>> furniture[0] >>> furniture[1] # 'chair' >>> furniture[2] # 'rack' >>> furniture[3] # 'shelf'
Negative indexes
*lists
‘shelf’
>>> furniture = ['table', 'chair', 'rack', 'shelf'] >>> furniture[-1] >>> furniture[-3] # 'chair' >>> f'The {furniture[-1]} is bigger than the {furniture[-3]}' # 'The shelf is bigger than the chair'
Getting sublists with Slices
*lists
[‘table’, ‘chair’, ‘rack’, ‘shelf’]
>>> furniture = ['table', 'chair', 'rack', 'shelf'] >>> furniture[0:4] >>> furniture[1:3] # ['chair', 'rack'] >>> furniture[0:-1] # ['table', 'chair', 'rack'] >>> furniture[:2] # ['table', 'chair'] >>> furniture[1:] # ['chair', 'rack', 'shelf'] >>> furniture[:] # ['table', 'chair', 'rack', 'shelf']
Slicing
the complete list will perform a copy:
*lists
[‘cat’, ‘bat’, ‘rat’, ‘elephant’]
>>> spam2 = spam[:] >>> spam.append('dog') >>> spam # ['cat', 'bat', 'rat', 'elephant', 'dog'] >>> spam2 # ['cat', 'bat', 'rat', 'elephant']
Getting a list length with len()
4
>>> furniture = ['table', 'chair', 'rack', 'shelf'] >>> len(furniture)
Changing list values with indexes
[‘desk’, ‘chair’, ‘rack’, ‘shelf’]
>>> furniture = ['table', 'chair', 'rack', 'shelf'] >>> furniture[0] = 'desk' >>> furniture >>> furniture[2] = furniture[1] >>> furniture # ['desk', 'chair', 'chair', 'shelf'] >>> furniture[-1] = 'bed' >>> furniture # ['desk', 'chair', 'chair', 'bed']
Concatenation and Replication
*lists
[1, 2, 3, ‘A’, ‘B’, ‘C’]
>>> [1, 2, 3] + ['A', 'B', 'C'] >>> ['X', 'Y', 'Z'] * 3 # ['X', 'Y', 'Z', 'X', 'Y', 'Z', 'X', 'Y', 'Z'] >>> my_list = [1, 2, 3] >>> my_list = my_list + ['A', 'B', 'C'] >>> my_list # [1, 2, 3, 'A', 'B', 'C']
Using for
loops with Lists
table
>>> furniture = ['table', 'chair', 'rack', 'shelf'] >>> for item in furniture: ... print(item) # chair # rack # shelf
Getting the index in a loop with enumerate()
*lists
index: 0 - item: table
>>> furniture = ['table', 'chair', 'rack', 'shelf'] >>> for index, item in enumerate(furniture): ... print(f'index: {index} - item: {item}') # index: 1 - item: chair # index: 2 - item: rack # index: 3 - item: shelf
Loop in Multiple Lists with zip()
*lists
The table costs $100
>>> furniture = ['table', 'chair', 'rack', 'shelf'] >>> price = [100, 50, 80, 40] >>> for item, amount in zip(furniture, price): ... print(f'The {item} costs ${amount}') # The chair costs $50 # The rack costs $80 # The shelf costs $40
The in
and not in
operators
*lists
True
>>> 'rack' in ['table', 'chair', 'rack', 'shelf'] >>> 'bed' in ['table', 'chair', 'rack', 'shelf'] # False >>> 'bed' not in furniture # True >>> 'rack' not in furniture # False
The Multiple Assignment Trick
*lists
‘table’
The multiple assignment trick is a shortcut that lets you assign multiple variables with the values in a list in one line of code.
So instead of doing this:
>>> furniture = ['table', 'chair', 'rack', 'shelf'] >>> table = furniture[0] >>> chair = furniture[1] >>> rack = furniture[2] >>> shelf = furniture[3]
You could type this line of code:
>>> furniture = ['table', 'chair', 'rack', 'shelf'] >>> table, chair, rack, shelf = furniture >>> table >>> chair # 'chair' >>> rack # 'rack' >>> shelf # 'shelf'
The multiple assignment trick can also be used to swap the values in two variables:
>>> a, b = 'table', 'chair' >>> a, b = b, a >>> print(a) # chair >>> print(b) # table
The index
Method
*lists
The index
method allows you to find the index of a value by passing the value itself:
append()
*list
append
adds an element to the end of a list
insert()
*list
insert
adds an element to a list at a given position:
del
*list
del
removes an item using the index
:
remove()
*list
remove
removes an item by its value:
*If the value appears multiple times in the list, only the first instance of the value will be removed.
pop()
*list
By default, pop
will remove and return the last item of the list.
You can also pass the index
of the element as an optional parameter:
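The list operations above (index, append, insert, del, remove and pop) can be sketched in one short session. The `furniture` list reuses the cheatsheet's example data with an extra 'chair' added to show that remove() only drops the first match.

```python
furniture = ['table', 'chair', 'rack', 'shelf', 'chair']

position = furniture.index('chair')  # 1, index of the first match
furniture.append('bed')              # add to the end
furniture.insert(1, 'desk')          # add at index 1
del furniture[0]                     # remove by index ('table')
furniture.remove('chair')            # remove by value, first match only
last = furniture.pop()               # remove and return the last item
first = furniture.pop(0)             # remove and return the item at index 0

print(last, first)  # bed desk
print(furniture)    # ['rack', 'shelf', 'chair']
```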
Sorting values with sort()
*lists
- sort() sorts the list in place.
- You can also pass True for the reverse keyword argument to have sort() sort the values in reverse order.
- If you need to sort the values in regular alphabetical order, pass str.lower for the key keyword argument in the sort() method call.
Sorting values with sorted()
*lists
You can use the built-in function sorted
to return a new list
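A small sketch contrasting sort() and sorted() (the `letters` list is illustrative). By default uppercase letters sort before lowercase ones, which is why `key=str.lower` is needed for regular alphabetical order.

```python
letters = ['b', 'C', 'a', 'D']

default_order = sorted(letters)                # new list, original untouched
alphabetical = sorted(letters, key=str.lower)  # case-insensitive ordering

letters_copy = letters[:]
letters_copy.sort(reverse=True)                # in place; sort() returns None

print(default_order)  # ['C', 'D', 'a', 'b']
print(alphabetical)   # ['a', 'b', 'C', 'D']
print(letters_copy)   # ['b', 'a', 'D', 'C']
print(letters)        # ['b', 'C', 'a', 'D']
```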
Tuples vs Lists
The key difference between tuples
and lists
is that, while tuples
are immutable objects, lists
are mutable.
This means that tuples
cannot be changed while the lists
can be modified.
Tuples
are more memory efficient than the lists.
The main way that tuples are different from lists is that tuples, like strings, are immutable
.
Converting between list()
and tuple()
>>> tuple(['cat', 'dog', 5]) # ('cat', 'dog', 5) >>> list(('cat', 'dog', 5)) # ['cat', 'dog', 5] >>> list('hello') # ['h', 'e', 'l', 'l', 'o']
Python Dictionaries
In Python, a dictionary is an insertion-ordered (from Python 3.7) collection of key: value
pairs.
The main operations on a dictionary are storing a value with some key and extracting the value given the key. It is also possible to delete a key:value pair with del
.
Set key, value
using subscript operator []
*dict
>>> my_cat = { ... 'size': 'fat', ... 'color': 'gray', ... 'disposition': 'loud', ... } >>> my_cat['age_years'] = 2 >>> print(my_cat) {'size': 'fat', 'color': 'gray', 'disposition': 'loud', 'age_years': 2}
Get value
using subscript operator []
*dict
fat
>>> my_cat = { ... 'size': 'fat', ... 'color': 'gray', ... 'disposition': 'loud', ... } >>> print(my_cat['size']) ... >>> print(my_cat['eye_color']) # Traceback (most recent call last): # File "<stdin>", line 1, in <module> # KeyError: 'eye_color'
*In case the key is not present in dictionary KeyError
is raised.
values()
*dict
The values()
method gets the values of the dictionary:
>>> pet = {'color': 'red', 'age': 42} >>> for value in pet.values(): ... print(value) ... # red # 42
keys()
*dict
The keys()
method gets the keys of the dictionary.
>>> pet = {'color': 'red', 'age': 42} >>> for key in pet.keys(): ... print(key) ... # color # age
There is no need to use .keys()
since by default you will loop through keys.
>>> pet = {'color': 'red', 'age': 42} >>> for key in pet: ... print(key) ... # color # age
items()
*dict
The items()
method gets the items of a dictionary and returns each one as a (key, value) Tuple
:
>>> pet = {'color': 'red', 'age': 42} >>> for item in pet.items(): ... print(item) ... # ('color', 'red') # ('age', 42)
Using the keys()
, values()
, and items()
methods, a for
loop can iterate over the keys, values, or key-value pairs in a dictionary, respectively.
>>> pet = {'color': 'red', 'age': 42} >>> for key, value in pet.items(): ... print(f'Key: {key} Value: {value}') ... # Key: color Value: red # Key: age Value: 42
get()
*dict
The get()
method returns the value of an item with the given key. If the key doesn’t exist, it returns None
>>> wife = {'name': 'Rose', 'age': 33} >>> f'My wife name is {wife.get("name")}' # 'My wife name is Rose' >>> f'She is {wife.get("age")} years old.' # 'She is 33 years old.' >>> f'She is deeply in love with {wife.get("husband")}' # 'She is deeply in love with None'
You can also change the default None
value to one of your choice.
>>> wife = {'name': 'Rose', 'age': 33} >>> f'She is deeply in love with {wife.get("husband", "lover")}' # 'She is deeply in love with lover'
Adding items with setdefault()
*dict
It’s possible to add an item to a dictionary in this way:
>>> wife = {'name': 'Rose', 'age': 33} >>> if 'has_hair' not in wife: ... wife['has_hair'] = True
Using the setdefault
method, we can make the same code shorter:
>>> wife = {'name': 'Rose', 'age': 33} >>> wife.setdefault('has_hair', True) >>> wife # {'name': 'Rose', 'age': 33, 'has_hair': True}
pop()
*dict
The pop()
method removes and returns an item based on a given key.
>>> wife = {'name': 'Rose', 'age': 33, 'hair': 'brown'} >>> wife.pop('age') # 33 >>> wife # {'name': 'Rose', 'hair': 'brown'}
popitem()
*dict
The popitem()
method removes the last item in a dictionary and returns it.
>>> wife = {'name': 'Rose', 'age': 33, 'hair': 'brown'} >>> wife.popitem() # ('hair', 'brown') >>> wife # {'name': 'Rose', 'age': 33}
del
*dict
The del
statement removes an item based on a given key.
>>> wife = {'name': 'Rose', 'age': 33, 'hair': 'brown'} >>> del wife['age'] >>> wife # {'name': 'Rose', 'hair': 'brown'}
clear()
*dict
The clear()
method removes all the items in a dictionary.
>>> wife = {'name': 'Rose', 'age': 33, 'hair': 'brown'} >>> wife.clear() >>> wife # {}
Checking keys
in a Dictionary
>>> person = {'name': 'Rose', 'age': 33} >>> 'name' in person.keys() # True >>> 'height' in person.keys() # False >>> 'skin' in person # You can omit keys() # False
Checking values
in a Dictionary
>>> person = {'name': 'Rose', 'age': 33} >>> 'Rose' in person.values() # True >>> 33 in person.values() # True
Pretty Printing
*dict
>>> import pprint >>> wife = {'name': 'Rose', 'age': 33, 'has_hair': True, 'hair_color': 'brown', 'height': 1.6, 'eye_color': 'brown'} >>> pprint.pprint(wife) # {'age': 33, # 'eye_color': 'brown', # 'hair_color': 'brown', # 'has_hair': True, # 'height': 1.6, # 'name': 'Rose'}
Merge two dictionaries
>>> dict_a = {'a': 1, 'b': 2} >>> dict_b = {'b': 3, 'c': 4} >>> dict_c = {**dict_a, **dict_b} >>> dict_c # {'a': 1, 'b': 3, 'c': 4}
Python Sets
A set
is an unordered collection with no duplicate elements.
Basic uses include membership testing and eliminating duplicate entries.
>>> s = {1, 2, 3} >>> s = set([1, 2, 3]) >>> s = {} # this will create a dictionary instead of a set >>> type(s) # <class 'dict'>
A set
automatically removes all duplicate values.
>>> s = {1, 2, 3, 2, 3, 4} >>> s # {1, 2, 3, 4}
And as an unordered data type, they can’t be indexed.
>>> s = {1, 2, 3} >>> s[0] # Traceback (most recent call last): # File "<stdin>", line 1, in <module> # TypeError: 'set' object does not support indexing
Initializing a set
There are two ways to create sets: using curly braces {}
and the built-in function set()
NOTE: When creating set
, be sure to not use empty curly braces {}
or you will get an empty dictionary instead.
set: add()
and update()
*set
{1, 2, 3, 4}
Using the add()
method we can add a single element to the set.
~~~
>>> s = {1, 2, 3}
>>> s.add(4)
>>> s
~~~
And with update()
, multiple ones:
>>> s = {1, 2, 3} >>> s.update([2, 3, 4, 5, 6]) >>> s # {1, 2, 3, 4, 5, 6}
set: remove()
and discard()
Both methods will remove an element from the set, but remove()
will raise a KeyError if the value doesn’t exist.
>>> s = {1, 2, 3} >>> s.remove(3) >>> s # {1, 2} >>> s.remove(3) # Traceback (most recent call last): # File "<stdin>", line 1, in <module> # KeyError: 3
discard()
won’t raise any errors.
>>> s = {1, 2, 3} >>> s.discard(3) >>> s # {1, 2} >>> s.discard(3)
set: union
union()
or |
will create a new set with all the elements from the sets provided.
>>> s1 = {1, 2, 3} >>> s2 = {3, 4, 5} >>> s1.union(s2) # or 's1 | s2' # {1, 2, 3, 4, 5}
set: intersection
intersection()
or &
will return a set with only the elements that are common to all of them.
>>> s1 = {1, 2, 3} >>> s2 = {2, 3, 4} >>> s3 = {3, 4, 5} >>> s1.intersection(s2, s3) # or 's1 & s2 & s3' # {3}
set: difference
difference()
or -
will return only the elements that are unique to the first set (invoked set).
>>> s1 = {1, 2, 3} >>> s2 = {2, 3, 4} >>> s1.difference(s2) # or 's1 - s2' # {1} >>> s2.difference(s1) # or 's2 - s1' # {4}
set: symmetric_difference
symmetric_difference()
or ^
will return all the elements that are not common between them.
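A minimal sketch, reusing the sets from the difference example above:

```python
s1 = {1, 2, 3}
s2 = {2, 3, 4}

only_in_one = s1.symmetric_difference(s2)  # or s1 ^ s2
print(only_in_one)  # {1, 4}
```

Unlike difference(), the result is the same whichever set the method is called on.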
List Comprehensions
[‘Charles’, ‘Susan’, ‘Patrick’, ‘George’]
List Comprehensions are a special kind of syntax that let us create lists out of other lists, and are incredibly useful when dealing with numbers and with one or two levels of nested for loops.
List comprehensions provide a concise way to create lists. […] or to create a subsequence of those elements that satisfy a certain condition.
This is how we create a new list from an existing collection with a For
Loop:
>>> names = ['Charles', 'Susan', 'Patrick', 'George'] >>> new_list = [] >>> for n in names: ... new_list.append(n) ... >>> new_list
And this is how we do the same with a List Comprehension:
>>> names = ['Charles', 'Susan', 'Patrick', 'George'] >>> new_list = [n for n in names] >>> new_list # ['Charles', 'Susan', 'Patrick', 'George']
We can do the same with numbers:
>>> n = [(a, b) for a in range(1, 3) for b in range(1, 3)] >>> n # [(1, 1), (1, 2), (2, 1), (2, 2)]
*The basics of list
comprehensions also apply to sets and dictionaries.
Adding conditionals to list comprehensions
If we want new_list
to have only the names that start with C, with a for loop, we would do it like this:
>>> names = ['Charles', 'Susan', 'Patrick', 'George', 'Carol'] >>> new_list = [] >>> for n in names: ... if n.startswith('C'): ... new_list.append(n) ... >>> print(new_list) # ['Charles', 'Carol']
In a List Comprehension, we add the if
statement at the end:
>>> new_list = [n for n in names if n.startswith('C')] >>> print(new_list) # ['Charles', 'Carol']
To use an if-else
statement in a List Comprehension:
>>> nums = [1, 2, 3, 4, 5, 6] >>> new_list = [num*2 if num % 2 == 0 else num for num in nums] >>> print(new_list) # [1, 4, 3, 8, 5, 12]
Set comprehension
>>> b = {"abc", "def"} >>> {s.upper() for s in b} {"ABC", "DEF"}
Dict comprehension
>>> c = {'name': 'Pooka', 'age': 5} >>> {v: k for k, v in c.items()} {'Pooka': 'name', 5: 'age'}
A List comprehension can be generated from a dictionary:
>>> c = {'name': 'Pooka', 'age': 5} >>> ["{}:{}".format(k.upper(), v) for k, v in c.items()] ['NAME:Pooka', 'AGE:5']
Escape characters
An escape character is created by typing a backslash \
followed by the character you want to insert.
\'
Single quote
\"
Double quote
\t
Tab
\n
Newline (line break)
\\
Backslash
\b
backspace
\ooo
Octal value
\r
carriage return
Raw strings
A raw string entirely ignores all escape characters and prints any backslash that appears in the string.
*mostly used for regex
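A short sketch of the difference (the sample strings are arbitrary): prefixing the literal with r keeps the backslash as an ordinary character.

```python
regular = 'a\tb'   # \t is interpreted as a tab: 3 characters
raw = r'a\tb'      # backslash and t are kept as-is: 4 characters

print(len(regular), len(raw))  # 3 4
print(raw)                     # a\tb

# Typical regex use: no need to double the backslashes.
digits_pattern = r'\d+'
```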
Multiline Strings
Indexing and Slicing strings
H  e  l  l  o     w  o  r  l  d  !
0  1  2  3  4  5  6  7  8  9  10 11
Indexing:
~~~
>>> spam = 'Hello world!'
>>> spam[0]
# 'H'
>>> spam[4]
# 'o'
>>> spam[-1]
# '!'
# Slicing
>>> spam = 'Hello world!'
>>> spam[0:5]
# 'Hello'
>>> spam[:5]
# 'Hello'
>>> spam[6:]
# 'world!'
>>> spam[6:-1]
# 'world'
>>> spam[:-1]
# 'Hello world'
>>> spam[::-1]
# '!dlrow olleH'
>>> fizz = spam[0:5]
>>> fizz
# 'Hello'
~~~
The in
and not in
operators
*strings
>>> 'Hello' in 'Hello World' # True >>> 'Hello' in 'Hello' # True >>> 'HELLO' in 'Hello World' # False >>> '' in 'spam' # True >>> 'cats' not in 'cats and dogs' # False
upper()
, lower()
and title()
*strings
Transforms a string to upper, lower and title case
>>> greet = 'Hello world!' >>> greet.upper() # 'HELLO WORLD!' >>> greet.lower() # 'hello world!' >>> greet.title() # 'Hello World!'
isupper()
and islower()
methods
Returns True
or False
after evaluating if a string is in upper or lower case:
isalpha()
returns True
if the string consists only of letters.
isalnum()
returns True
if the string consists only of letters and numbers.
isdecimal()
returns True
if the string consists only of decimal digit characters.
isspace()
returns True
if the string consists only of spaces, tabs, and new-lines.
istitle()
returns True
if the string consists only of words that begin with an uppercase letter followed by only lowercase characters.
startswith()
and endswith()
*strings
>>> 'Hello world!'.startswith('Hello') # True >>> 'Hello world!'.endswith('world!') # True >>> 'abc123'.startswith('abcdef') # False >>> 'abc123'.endswith('12') # False >>> 'Hello world!'.startswith('Hello world!') # True >>> 'Hello world!'.endswith('Hello world!') # True
join()
The join()
method takes all the items in an iterable, like a list, dictionary, tuple or set, and joins them into a string. You can also specify a separator.
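A minimal sketch of join() (the sample data is illustrative). The separator is the string the method is called on, and every item in the iterable must itself be a string.

```python
words = ['My', 'name', 'is', 'Simon']

sentence = ' '.join(words)              # list joined with spaces
csv_line = ', '.join(('cats', 'rats'))  # tuple joined with a comma and space
keys = '-'.join({'a': 1, 'b': 2})       # joining a dict uses its keys

print(sentence)  # My name is Simon
print(csv_line)  # cats, rats
print(keys)      # a-b
```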
split()
The split()
method splits a string into a list. By default, it will use whitespace to separate the items, but you can also set another character of choice:
>>> 'My name is Simon'.split()
# ['My', 'name', 'is', 'Simon']
>>> 'MyABCnameABCisABCSimon'.split('ABC')
# ['My', 'name', 'is', 'Simon']
>>> 'My name is Simon'.split('m')
# ['My na', 'e is Si', 'on']
>>> ' My  name is  Simon'.split()
# ['My', 'name', 'is', 'Simon']
>>> ' My  name is  Simon'.split(' ')
# ['', 'My', '', 'name', 'is', '', 'Simon']
Justifying text with rjust()
, ljust()
and center()
>>> 'Hello'.rjust(10)
# '     Hello'
>>> 'Hello'.rjust(20)
# '               Hello'
>>> 'Hello World'.rjust(20)
# '         Hello World'
>>> 'Hello'.ljust(10)
# 'Hello     '
>>> 'Hello'.center(20)
# '       Hello        '
An optional second argument to rjust()
and ljust()
will specify a fill character apart from a space character:
~~~
>>> 'Hello'.rjust(20, '*')
# '***************Hello'
>>> 'Hello'.ljust(20, '-')
# 'Hello---------------'
>>> 'Hello'.center(20, '=')
# '=======Hello========'
~~~
Removing whitespace with strip()
, rstrip()
, and lstrip()
>>> spam = ' Hello World ' >>> spam.strip() # 'Hello World' >>> spam.lstrip() # 'Hello World ' >>> spam.rstrip() # ' Hello World' >>> spam = 'SpamSpamBaconSpamEggsSpamSpam' >>> spam.strip('ampS') # 'BaconSpamEggs'
count()
*strings
3
Counts the number of occurrences of a given character or substring in the string it is applied to.
Can be optionally provided start
and end
indexes.
>>> sentence = 'one sheep two sheep three sheep four' >>> sentence.count('sheep') >>> sentence.count('e') # 9 >>> sentence.count('e', 6) # 8 # returns count of e after 'one sh' i.e 6 chars since beginning of string >>> sentence.count('e', 7) # 7
replace()
*strings
‘Hello, planet!’
Replaces all occurrences of a given substring with another substring.
Can be optionally provided a third
argument to limit the number of replacements.
Returns a new string.
>>> text = "Hello, world!" >>> text.replace("world", "planet") >>> fruits = "apple, banana, cherry, apple" >>> fruits.replace("apple", "orange", 1) # 'orange, banana, cherry, apple' >>> sentence = "I like apples, Apples are my favorite fruit" >>> sentence.replace("apples", "oranges") # 'I like oranges, Apples are my favorite fruit'
Python String Formatting
The formatting operations described here (%
operator) exhibit a variety of quirks that lead to a number of common errors […]. Using the newer formatted string literals […] helps avoid these errors. These alternatives also provide more powerful, flexible and extensible approaches to formatting text.
%
operator
(%d
)
*strings
>>> name = 'Pete' >>> 'Hello %s' % name # "Hello Pete"
We can use the %d
format specifier to convert an int value to a string:
~~~
>>> num = 5
>>> 'I have %d apples' % num
# "I have 5 apples"
~~~
*NOTE: For new code, using str.format
, or formatted string literals (Python 3.6+) over the %
operator is strongly recommended.
str.format
Python 3 introduced a new way to do string formatting that was later back-ported to Python 2.7. This makes the syntax for string formatting more regular.
>>> name = 'John' >>> age = 20 >>> "Hello I'm {}, my age is {}".format(name, age) # "Hello I'm John, my age is 20" >>> "Hello I'm {0}, my age is {1}".format(name, age) # "Hello I'm John, my age is 20"
Formatted String Literals or f-Strings
A formatted string literal or f-string
is a string literal that is prefixed with f
or F
. These strings may contain replacement fields, which are expressions delimited by curly braces {}
. While other string literals always have a constant value, formatted strings are really expressions evaluated at run time.
>>> name = 'Elizabeth' >>> f'Hello {name}!' # 'Hello Elizabeth!'
It is even possible to do inline arithmetic with it:
>>> a = 5 >>> b = 10 >>> f'Five plus ten is {a + b} and not {2 * (a + b)}.' # 'Five plus ten is 15 and not 30.'
*If you are using Python 3.6+, f-Strings
are the recommended way to format strings.
Multiline f-Strings
>>> name = 'Robert' >>> messages = 12 >>> ( ... f'Hi, {name}. ' ... f'You have {messages} unread messages' ... ) # 'Hi, Robert. You have 12 unread messages'
The =
specifier
*fstrings
This will print the expression and its value:
>>> from datetime import datetime >>> now = datetime.now().strftime("%b/%d/%Y - %H:%M:%S") >>> f'date and time: {now=}' # "date and time: now='Nov/14/2022 - 20:50:01'"
Adding spaces or characters
*fstrings
>>> f"{name.upper() = :-^20}"
# 'name.upper() = -------ROBERT-------'
>>> f"{name.upper() = :^20}"
# 'name.upper() =        ROBERT       '
>>> f"{name.upper() = :20}"
# 'name.upper() = ROBERT              '
Adding thousands separator
*fstrings
‘10,000,000’
>>> a = 10000000 >>> f"{a:,}"
rounding
*fstrings
‘3.14’
>>> a = 3.1415926 >>> f"{a:.2f}"
showing as a Percentage
*fstrings
‘81.66%’
>>> a = 0.816562 >>> f"{a:.2%}"
| Number | Format | Output | Description |
| --- | --- | --- | --- |
| 3.1415926 | {:.2f} | `'3.14'` | Format float 2 decimal places |
| 3.1415926 | {:+.2f} | `'+3.14'` | Format float 2 decimal places with sign |
| -1 | {:+.2f} | `'-1.00'` | Format float 2 decimal places with sign |
| 2.71828 | {:.0f} | `'3'` | Format float with no decimal places |
| 4 | {:0>2d} | `'04'` | Pad number with zeros (left padding, width 2) |
| 4 | {:x<4d} | `'4xxx'` | Pad number with x’s (right padding, width 4) |
| 10 | {:x<4d} | `'10xx'` | Pad number with x’s (right padding, width 4) |
| 1000000 | {:,} | `'1,000,000'` | Number format with comma separator |
| 0.35 | {:.2%} | `'35.00%'` | Format percentage |
| 1000000000 | {:.2e} | `'1.00e+09'` | Exponent notation |
| 11 | {:11d} | `'         11'` | Right-aligned (default, width 11) |
| 11 | {:<11d} | `'11         '` | Left-aligned (width 11) |
| 11 | {:^11d} | `'    11     '` | Center aligned (width 11) |
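The format specifications listed above can be checked with the built-in format() function, which takes the text after the colon as its second argument. A minimal sketch follows; the variable names are only illustrative:

```python
# Each spec is the part of a replacement field that follows the colon.
rounded = format(3.1415926, ".2f")      # two decimal places
signed = format(-1, "+.2f")             # always show the sign
zero_padded = format(4, "0>2d")         # fill with '0', right-align, width 2
x_padded = format(10, "x<4d")           # fill with 'x', left-align, width 4
grouped = format(1000000, ",")          # comma as thousands separator
percent = format(0.35, ".2%")           # multiply by 100 and append '%'
scientific = format(1000000000, ".2e")  # exponent notation
centered = format(11, "^11d")           # centered in a field of width 11

print(rounded, signed, zero_padded, x_padded, grouped, percent, scientific)
print(repr(centered))  # repr() makes the padding spaces visible
```

The same specs work unchanged inside f-strings and str.format replacement fields.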
Template Strings
Template strings are a simpler and less powerful mechanism, but they are recommended when handling format strings generated by users. Due to their reduced complexity, template strings are a safer choice.
>>> from string import Template >>> name = 'Elizabeth' >>> t = Template('Hey $name!') >>> t.substitute(name=name) # 'Hey Elizabeth!'
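One way to see why template strings are the safer choice: a str.format pattern can reach into the attributes of the objects passed to it, while a Template only swaps in the values it is given. The sketch below illustrates this; the Account class, its api_key attribute, and the sample strings are hypothetical and exist only for the demonstration.

```python
from string import Template


class Account:
    """Hypothetical object holding a value that should stay private."""

    def __init__(self):
        self.api_key = "s3cr3t"


account = Account()

# A user-supplied str.format() pattern can follow attribute references.
user_pattern = "Hello {acct.api_key}"
leaked = user_pattern.format(acct=account)

# A Template only replaces $-placeholders with the values passed in.
# safe_substitute() leaves unknown placeholders untouched instead of raising.
user_template = Template("Hello $name, token: $token")
rendered = user_template.safe_substitute(name="Elizabeth")

print(leaked)
print(rendered)
```

substitute() would raise KeyError for the missing $token placeholder; safe_substitute() is used here so the sketch runs without a complete mapping.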
Regular Expressions
A regular expression (shortened as regex […]) is a sequence of characters that specifies a search pattern in text. […] used by string-searching algorithms for “find” or “find and replace” operations on strings, or for input validation.
- Import the regex module with `import re`.
- Create a Regex object with the `re.compile()` function. (Remember to use a raw string.)
- Pass the string you want to search into the Regex object’s `search()` method. This returns a Match object.
- Call the Match object’s `group()` method to return a string of the actual matched text.
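The four steps above can be sketched as follows. The five-digit pattern and the sample sentence are made up for illustration:

```python
import re  # step 1: import the regex module

# Step 2: compile a pattern; the raw string keeps the backslash intact.
zip_regex = re.compile(r'\d{5}')  # hypothetical pattern: five digits in a row

# Step 3: search() returns a Match object, or None when nothing matches.
mo = zip_regex.search('Ship to Springfield, 62704.')

# Step 4: group() returns the text that was actually matched.
matched = mo.group() if mo else None
print(matched)
```

Checking for None before calling group() avoids an AttributeError when the pattern is not found.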
?
*regex
zero or one of the preceding group.
*
*regex
zero or more of the preceding group.
+
*regex
one or more of the preceding group.
{n}
*regex
exactly n of the preceding group.
{n,}
*regex
n or more of the preceding group.
{,m}
*regex
0 to m of the preceding group.
{n,m}
*regex
at least n and at most m of the preceding group.
{n,m}?
or *?
or +?
*regex
performs a non-greedy match of the preceding group.
^spam
*regex
means the string must begin with spam.
spam$
*regex
means the string must end with spam.
.
*regex
any character, except newline characters.
\d
, \w
, and \s
*regex
a digit, word, or space character, respectively.
\D
, \W
, and \S
*regex
anything except a digit, word, or space, respectively.
[abc]
*regex
any character between the brackets (such as a, b, or c).
[^abc]
*regex
any character that isn’t between the brackets.
Matching regex objects
*regex
>>> phone_num_regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d') >>> mo = phone_num_regex.search('My number is 415-555-4242.') >>> print(f'Phone number found: {mo.group()}') # Phone number found: 415-555-4242
Grouping regex with parentheses
*regex
using group()
>>> phone_num_regex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
>>> mo = phone_num_regex.search('My number is 415-555-4242.')
>>> mo.group(1)  # '415'
>>> mo.group(2)  # '555-4242'
>>> mo.group(0)  # '415-555-4242'
>>> mo.group()   # '415-555-4242'
To retrieve **all the groups at once**, use the groups()
method:
>>> mo.groups()  # ('415', '555-4242')
>>> area_code, main_number = mo.groups()
>>> print(area_code)  # 415
>>> print(main_number)  # 555-4242
Multiple groups with Pipe
*regex
You can use the |
character anywhere you want to match one of many expressions.
>>> hero_regex = re.compile(r'Batman|Tina Fey')
>>> mo1 = hero_regex.search('Batman and Tina Fey.')
>>> mo1.group()  # 'Batman'
>>> mo2 = hero_regex.search('Tina Fey and Batman.')
>>> mo2.group()  # 'Tina Fey'
You can also use the pipe
to match one of several patterns as part of your regex:
~~~
>>> bat_regex = re.compile(r'Bat(man|mobile|copter|bat)')
>>> mo = bat_regex.search('Batmobile lost a wheel')
>>> mo.group()   # 'Batmobile'
>>> mo.group(1)  # 'mobile'
~~~
Optional matching with the Question Mark
*regex
The ?
character flags the group that precedes it as an optional part of the pattern.
>>> bat_regex = re.compile(r'Bat(wo)?man') >>> mo1 = bat_regex.search('The Adventures of Batman') >>> mo1.group() # 'Batman' >>> mo2 = bat_regex.search('The Adventures of Batwoman') >>> mo2.group() # 'Batwoman'
Matching zero or more with the Star
*regex
The *
(star or asterisk) means “match zero or more”. The group that precedes the star can occur any number of times in the text.
>>> bat_regex = re.compile(r'Bat(wo)*man')
>>> mo1 = bat_regex.search('The Adventures of Batman')
>>> mo1.group()  # 'Batman'
>>> mo2 = bat_regex.search('The Adventures of Batwoman')
>>> mo2.group()  # 'Batwoman'
>>> mo3 = bat_regex.search('The Adventures of Batwowowowoman')
>>> mo3.group()  # 'Batwowowowoman'
Matching one or more with the Plus
*regex
The +
(or plus) means match one or more. The group preceding a plus must appear at least once:
>>> bat_regex = re.compile(r'Bat(wo)+man') >>> mo1 = bat_regex.search('The Adventures of Batwoman') >>> mo1.group() # 'Batwoman' >>> mo2 = bat_regex.search('The Adventures of Batwowowowoman') >>> mo2.group() # 'Batwowowowoman' >>> mo3 = bat_regex.search('The Adventures of Batman') >>> mo3 is None # True
Matching specific repetitions with Curly Brackets
If you have a group that you want to repeat a specific number of times, follow the group in your regex with *a number in curly brackets*:
>>> ha_regex = re.compile(r'(Ha){3}') >>> mo1 = ha_regex.search('HaHaHa') >>> mo1.group() # 'HaHaHa' >>> mo2 = ha_regex.search('Ha') >>> mo2 is None # True
Instead of one number, you can specify a range with a minimum and a maximum between the curly brackets. For example, the regex (Ha){3,5} will match ‘HaHaHa’, ‘HaHaHaHa’, and ‘HaHaHaHaHa’.
>>> ha_regex = re.compile(r'(Ha){2,3}') >>> mo1 = ha_regex.search('HaHaHaHa') >>> mo1.group() # 'HaHaHa'
Greedy and non-greedy matching
*regex
Python’s regular expressions are greedy by default: in ambiguous situations they will match the longest string possible.
The non-greedy version of the curly brackets, which matches the shortest string possible, has the closing curly bracket followed by a question mark.
>>> greedy_ha_regex = re.compile(r'(Ha){3,5}') >>> mo1 = greedy_ha_regex.search('HaHaHaHaHa') >>> mo1.group() # 'HaHaHaHaHa' >>> non_greedy_ha_regex = re.compile(r'(Ha){3,5}?') >>> mo2 = non_greedy_ha_regex.search('HaHaHaHaHa') >>> mo2.group() # 'HaHaHa'
The findall()
method
*regex
The findall()
method will return the strings of every match in the searched string.
>>> phone_num_regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d') # has no groups >>> phone_num_regex.findall('Cell: 415-555-9999 Work: 212-555-0000') # ['415-555-9999', '212-555-0000']
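The "has no groups" remark matters because findall() changes its return shape when the pattern contains groups: it then returns a list of tuples, one tuple of group strings per match. A small sketch, reusing the sample text above with an assumed three-group version of the pattern:

```python
import re

text = 'Cell: 415-555-9999 Work: 212-555-0000'

# No groups: findall() returns the full text of each match.
no_groups = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')
whole_matches = no_groups.findall(text)

# With groups: findall() returns one tuple of group strings per match.
with_groups = re.compile(r'(\d\d\d)-(\d\d\d)-(\d\d\d\d)')
group_tuples = with_groups.findall(text)

print(whole_matches)
print(group_tuples)
```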
Making your own character classes [ ]
and [a-zA-Z0-9]
*regex
You can define your own character class using square brackets. For example, the character class [aeiouAEIOU]
will match any vowel, both lowercase and uppercase.
You can also include ranges of letters or numbers by using a hyphen. For example, the character class [a-zA-Z0-9]
will match all lowercase letters, uppercase letters, and numbers.
>>> vowel_regex = re.compile(r'[aeiouAEIOU]') >>> vowel_regex.findall('Robocop eats baby food. BABY FOOD.') # ['o', 'o', 'o', 'e', 'a', 'a', 'o', 'o', 'A', 'O', 'O']
By placing a caret character (^
) just after the character class’s opening bracket, you can make a **negative character class** that will match all the characters that are not in the character class:
>>> consonant_regex = re.compile(r'[^aeiouAEIOU]')
>>> consonant_regex.findall('Robocop eats baby food. BABY FOOD.')  # ['R', 'b', 'c', 'p', ' ', 't', 's', ' ', 'b', 'b', 'y', ' ', 'f', 'd', '.', ' ', 'B', 'B', 'Y', ' ', 'F', 'D', '.']
The Caret and Dollar sign characters
*regex
- You can also use the caret symbol `^` at the start of a regex to indicate that a match must occur at the beginning of the searched text.
- Likewise, you can put a dollar sign `$` at the end of the regex to indicate the string must end with this regex pattern.
- And you can use the `^` and `$` together to indicate that the entire string must match the regex.
The r'^Hello'
regular expression string matches strings that begin with ‘Hello’:
>>> begins_with_hello = re.compile(r'^Hello')
>>> begins_with_hello.search('Hello world!')  # <re.Match object; span=(0, 5), match='Hello'>
>>> begins_with_hello.search('He said hello.') is None  # True
The r'\d$'
regular expression string matches strings that end with a numeric character from 0 to 9, while r'^\d+$' matches only strings made up entirely of digits:
>>> whole_string_is_num = re.compile(r'^\d+$')
>>> whole_string_is_num.search('1234567890')  # <re.Match object; span=(0, 10), match='1234567890'>
>>> whole_string_is_num.search('12345xyz67890') is None  # True
>>> whole_string_is_num.search('12 34567890') is None  # True
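To contrast the two anchors, here is a short sketch that runs an end-anchored pattern and a fully anchored pattern over the same made-up sentences:

```python
import re

ends_with_digit = re.compile(r'\d$')   # only the last character must be a digit
only_digits = re.compile(r'^\d+$')     # every character must be a digit

end_match = ends_with_digit.search('Your number is 42')
end_miss = ends_with_digit.search('Your number is forty two.')
whole_miss = only_digits.search('Your number is 42')
whole_match = only_digits.search('42')

print(end_match.group(), end_miss, whole_miss, whole_match.group())
```

The first pattern is satisfied by the trailing 2 alone, whereas the second rejects the sentence because of its non-digit characters.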
The Wildcard character
The .
(or dot) character in a regular expression will match any character except for a newline:
>>> at_regex = re.compile(r'.at') >>> at_regex.findall('The cat in the hat sat on the flat mat.') ['cat', 'hat', 'sat', 'lat', 'mat']
Matching everything with Dot-Star
*regex
The .*
uses **greedy mode**: it will always try to match as much text as possible.
~~~
>>> name_regex = re.compile(r'First Name: (.*) Last Name: (.*)')
>>> mo = name_regex.search('First Name: Al Last Name: Sweigart')
>>> mo.group(1)  # 'Al'
>>> mo.group(2)  # 'Sweigart'
~~~
To match any and all text in a **non-greedy** fashion, use the dot, star, and question mark (.*?). The question mark tells Python to match in a non-greedy way:
>>> non_greedy_regex = re.compile(r'<.*?>') >>> mo = non_greedy_regex.search('<To serve man> for dinner.>') >>> mo.group() # '<To serve man>' >>> greedy_regex = re.compile(r'<.*>') >>> mo = greedy_regex.search('<To serve man> for dinner.>') >>> mo.group() # '<To serve man> for dinner.>'
Matching newlines with the Dot character
*regex
The .*
dot-star will match everything except a newline.
By passing re.DOTALL
as the second argument to re.compile()
, you can make the dot character match all characters, including the newline character:
>>> no_newline_regex = re.compile('.*')
>>> no_newline_regex.search('Serve the public trust.\nProtect the innocent.\nUphold the law.').group()  # 'Serve the public trust.'
>>> newline_regex = re.compile('.*', re.DOTALL)
>>> newline_regex.search('Serve the public trust.\nProtect the innocent.\nUphold the law.').group()  # 'Serve the public trust.\nProtect the innocent.\nUphold the law.'
Case-Insensitive matching
*regex
To make your regex case-insensitive, you can pass `re.IGNORECASE` or `re.I` as a second argument to `re.compile()`:
>>> robocop = re.compile(r'robocop', re.I)
>>> robocop.search('Robocop is part man, part machine, all cop.').group()  # 'Robocop'
>>> robocop.search('ROBOCOP protects the innocent.').group()  # 'ROBOCOP'
>>> robocop.search('Al, why does your programming book talk about robocop so much?').group()  # 'robocop'
Substituting strings with the sub()
method
The sub()
method for Regex objects is passed two arguments:
- The **first** argument is a string to replace any matches.
- The second is the string to search, in which the matches are replaced.
The sub()
method returns a string with the substitutions applied:
>>> names_regex = re.compile(r'Agent \w+')
>>> names_regex.sub('CENSORED', 'Agent Alice gave the secret documents to Agent Bob.')  # 'CENSORED gave the secret documents to CENSORED.'
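The replacement string can also refer back to groups in the pattern, and sub() accepts an optional count that limits how many matches are replaced. The sketch below shows both; the masking pattern is an assumed variation on the example above.

```python
import re

message = 'Agent Alice gave the secret documents to Agent Bob.'

# \1 in the replacement stands for the text captured by the first group,
# here the first letter of each agent's name.
agent_regex = re.compile(r'Agent (\w)\w*')
masked = agent_regex.sub(r'\1****', message)

# count=1 replaces only the first match and leaves the rest untouched.
names_regex = re.compile(r'Agent \w+')
first_only = names_regex.sub('CENSORED', message, count=1)

print(masked)
print(first_only)
```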
Managing complex Regexes
*regex
To tell the re.compile()
function to ignore whitespace and comments inside the regular expression string, “verbose mode” can be enabled by passing the flag re.VERBOSE
as the *second* argument to re.compile()
.
Now instead of a hard-to-read regular expression like this:
phone_regex = re.compile(r'((\d{3}|\(\d{3}\))?(\s|-|\.)?\d{3}(\s|-|\.)\d{4}(\s*(ext|x|ext.)\s*\d{2,5})?)')
you can spread the regular expression over multiple lines with comments like this:
phone_regex = re.compile(r'''(
    (\d{3}|\(\d{3}\))?            # area code
    (\s|-|\.)?                    # separator
    \d{3}                         # first 3 digits
    (\s|-|\.)                     # separator
    \d{4}                         # last 4 digits
    (\s*(ext|x|ext.)\s*\d{2,5})?  # extension
    )''', re.VERBOSE)
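As a quick check that verbose mode changes only the layout and not what the pattern matches, the sketch below compiles both forms and runs them on a made-up sample string. The names compact_regex, verbose_regex, and sample are illustrative.

```python
import re

compact_regex = re.compile(
    r'((\d{3}|\(\d{3}\))?(\s|-|\.)?\d{3}(\s|-|\.)\d{4}(\s*(ext|x|ext.)\s*\d{2,5})?)'
)

verbose_regex = re.compile(r'''(
    (\d{3}|\(\d{3}\))?            # area code
    (\s|-|\.)?                    # separator
    \d{3}                         # first 3 digits
    (\s|-|\.)                     # separator
    \d{4}                         # last 4 digits
    (\s*(ext|x|ext.)\s*\d{2,5})?  # extension
    )''', re.VERBOSE)

sample = 'Call 415-555-1234 ext 89 today'
compact_match = compact_regex.search(sample).group()
verbose_match = verbose_regex.search(sample).group()

print(compact_match)
print(verbose_match)
```

Note that in verbose mode a literal space or # inside the pattern has to be escaped or placed in a character class, since both are otherwise ignored.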
What are the two main modules in Python that deal with path manipulation?
os.path
and pathlib
The pathlib
module was added in Python 3.4, offering an object-oriented way to handle file system paths.
Linux and Windows Paths
On Windows, paths are written using backslashes (\
) as the separator between folder names. On Unix-based operating systems such as macOS, Linux, and the BSDs, the forward slash (/
) is used as the path separator. Joining paths can be a headache if your code needs to work on different platforms.
Fortunately, Python provides easy ways to handle this. We will showcase two of them: os.path.join
and pathlib.Path.joinpath.
Using os.path.join
on Windows:
>>> import os >>> my_files = ['accounts.txt', 'details.csv', 'invite.docx'] >>> for filename in my_files: ... print(os.path.join('C:\\Users\\asweigart', filename)) ... # C:\Users\asweigart\accounts.txt # C:\Users\asweigart\details.csv # C:\Users\asweigart\invite.docx
Using pathlib
on *nix:
>>> from pathlib import Path >>> print(Path('usr').joinpath('bin').joinpath('spam')) # usr/bin/spam
pathlib
also provides a shortcut to joinpath using the /
operator:
~~~
>>> from pathlib import Path
>>> print(Path('usr') / 'bin' / 'spam')
# usr/bin/spam
~~~
Joining paths is helpful if you need to create different file paths under the same directory.
~~~
>>> my_files = ['accounts.txt', 'details.csv', 'invite.docx']
>>> home = Path.home()
>>> for filename in my_files:
...     print(home / filename)
...
# /home/asweigart/accounts.txt
# /home/asweigart/details.csv
# /home/asweigart/invite.docx
~~~
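To see how the same join produces different separators per platform without switching machines, pathlib also offers PurePosixPath and PureWindowsPath, which only manipulate path strings and never touch the file system. A minimal sketch, reusing the example folder names from above:

```python
from pathlib import PurePosixPath, PureWindowsPath

# Pure paths can be created on any operating system.
posix_path = PurePosixPath('/home/asweigart') / 'accounts.txt'
windows_path = PureWindowsPath('C:\\Users\\asweigart') / 'accounts.txt'

print(posix_path)    # joined with forward slashes
print(windows_path)  # joined with backslashes
```

In ordinary code, plain Path picks the right flavour for the current platform automatically.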
The current working directory, using os
on Windows
>>> import os >>> os.getcwd() # 'C:\\Python34' >>> os.chdir('C:\\Windows\\System32') >>> os.getcwd() # 'C:\\Windows\\System32'
The current working directory, using pathlib
on *nix
>>> from pathlib import Path >>> from os import chdir >>> print(Path.cwd()) # /home/asweigart >>> chdir('/usr/lib/python3.6') >>> print(Path.cwd()) # /usr/lib/python3.6
Creating new folders, using os
on Windows
>>> import os >>> os.makedirs('C:\\delicious\\walnut\\waffles')
Creating new folders, using pathlib
on *nix
>>> from pathlib import Path >>> cwd = Path.cwd() >>> (cwd / 'delicious' / 'walnut' / 'waffles').mkdir() # Traceback (most recent call last): # File "<stdin>", line 1, in <module> # File "/usr/lib/python3.6/pathlib.py", line 1226, in mkdir # self._accessor.mkdir(self, mode) # File "/usr/lib/python3.6/pathlib.py", line 387, in wrapped # return strfunc(str(pathobj), *args) # FileNotFoundError: [Errno 2] No such file or directory: '/home/asweigart/delicious/walnut/waffles'
Oh no, we got a nasty error! The reason is that the ‘delicious’ directory does not exist, so we cannot make the ‘walnut’ and the ‘waffles’ directories under it. To fix this, do:
>>> from pathlib import Path >>> cwd = Path.cwd() >>> (cwd / 'delicious' / 'walnut' / 'waffles').mkdir(parents=True)
And all is good :)
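The failure and the fix can be reproduced safely inside a temporary directory. The sketch below is one way to do it; it also passes exist_ok=True, an optional mkdir() argument that makes a repeated call harmless.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'delicious' / 'walnut' / 'waffles'

    # Without parents=True the missing 'delicious' folder causes an error.
    try:
        target.mkdir()
        failed_without_parents = False
    except FileNotFoundError:
        failed_without_parents = True

    # parents=True creates every missing folder along the way.
    target.mkdir(parents=True, exist_ok=True)
    # Calling it again does nothing because of exist_ok=True.
    target.mkdir(parents=True, exist_ok=True)
    created = target.is_dir()

print(failed_without_parents, created)
```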
Absolute path
An absolute path always begins with the root folder.
Relative path
A relative path is relative to the program’s current working directory.
Dot (.) and dot-dot (..) folders
These are not real folders, but special names that can be used in a path. A single period (“dot”) for a folder name is shorthand for “this directory.”
Two periods (“dot-dot”) means “the **parent** folder.”
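A small sketch of how these special names collapse, using posixpath.normpath so that it behaves the same on any operating system. The example path is made up, and normpath works purely on the text of the path, so the result can differ from the real location when symbolic links are involved.

```python
import posixpath
from pathlib import PurePosixPath

messy = '/home/asweigart/./projects/../notes.txt'

# '.' is dropped and 'projects/..' cancels out.
clean = posixpath.normpath(messy)

# .parent gives the folder that '..' would refer to.
parent = PurePosixPath('/home/asweigart/projects').parent

print(clean)
print(parent)
```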
Handling Absolute paths, using pathlib
on *nix
>>> from pathlib import Path >>> Path('/').is_absolute() # True >>> Path('..').is_absolute() # False
Extracting an absolute path
~~~
>>> from pathlib import Path
>>> print(Path.cwd())
# /home/asweigart
>>> print(Path('..').resolve())
# /home
~~~
Handling Relative paths, using pathlib
on *nix
>>> from pathlib import Path >>> print(Path('/etc/passwd').relative_to('/')) # etc/passwd
Checking if a file/directory exists, using pathlib
on *nix
>>> from pathlib import Path
>>> Path('.').exists()  # True
>>> Path('setup.py').exists()  # True
>>> Path('/etc').exists()  # True
>>> Path('nonexistentfile').exists()  # False
Checking if a path is a file, using pathlib
on *nix
>>> from pathlib import Path
>>> Path('setup.py').is_file()  # True
>>> Path('/home').is_file()  # False
>>> Path('nonexistentfile').is_file()  # False
Checking if a path is a directory, using pathlib
on *nix
>>> from pathlib import Path
>>> Path('/').is_dir()  # True
>>> Path('setup.py').is_dir()  # False
>>> Path('/spam').is_dir()  # False
Getting a file’s size in bytes, using pathlib
on *nix
>>> from pathlib import Path >>> stat = Path('/bin/python3.6').stat() >>> print(stat) # stat contains some other information about the file as well # os.stat_result(st_mode=33261, st_ino=141087, st_dev=2051, st_nlink=2, st_uid=0, # --snip-- # st_gid=0, st_size=10024, st_atime=1517725562, st_mtime=1515119809, st_ctime=1517261276) >>> print(stat.st_size) # size in bytes # 10024
Listing directories, using pathlib
on *nix
>>> from pathlib import Path >>> for f in Path('/usr/bin').iterdir(): ... print(f) ... # ... # /usr/bin/tiff2rgba # /usr/bin/iconv # /usr/bin/ldd # /usr/bin/cache_restore # /usr/bin/udiskie # /usr/bin/unix2dos # /usr/bin/t1reencode # /usr/bin/epstopdf # /usr/bin/idle3 # ...
Directory file sizes, using pathlib
on *nix
WARNING: Directories themselves also have a size! So, you might want to check whether a path is a file or a directory using the methods discussed in the sections above.
>>> from pathlib import Path >>> total_size = 0 >>> for sub_path in Path('/usr/bin').iterdir(): ... total_size += sub_path.stat().st_size ... >>> print(total_size) # 1903178911
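Following the warning above, one possible way to count only regular files is to filter with is_file() before summing. The sketch builds a small temporary folder so the result is predictable; the file names and contents are made up.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / 'a.txt').write_bytes(b'12345')  # 5 bytes
    (folder / 'b.txt').write_bytes(b'123')    # 3 bytes
    (folder / 'subdir').mkdir()               # has a size of its own, skipped below

    # Sum st_size for regular files only, ignoring directories.
    total_size = sum(
        sub_path.stat().st_size
        for sub_path in folder.iterdir()
        if sub_path.is_file()
    )

print(total_size)
```

Like the original loop, this looks at one directory level only and does not descend into subfolders.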
Copying files and folders with shutil
The shutil
module provides functions for copying files, as well as entire folders.
>>> import shutil, os
>>> os.chdir('C:\\')
>>> shutil.copy('C:\\spam.txt', 'C:\\delicious')  # 'C:\\delicious\\spam.txt'
>>> shutil.copy('eggs.txt', 'C:\\delicious\\eggs2.txt')  # 'C:\\delicious\\eggs2.txt'
While shutil.copy()
will copy a single file, shutil.copytree()
will copy an entire folder and every folder and file contained in it:
~~~
>>> import shutil, os
>>> os.chdir('C:\\')
>>> shutil.copytree('C:\\bacon', 'C:\\bacon_backup')
# 'C:\\bacon_backup'
~~~
Moving and Renaming, with shutil
>>> import shutil >>> shutil.move('C:\\bacon.txt', 'C:\\eggs') # 'C:\\eggs\\bacon.txt'
The destination path can also specify a filename. In the following example, the source file is moved and renamed:
~~~
>>> shutil.move('C:\\bacon.txt', 'C:\\eggs\\new_bacon.txt')
# 'C:\\eggs\\new_bacon.txt'
~~~
If there is no eggs folder, then `move()` will rename `bacon.txt` to a file named eggs:
~~~
>>> shutil.move('C:\\bacon.txt', 'C:\\eggs')
# 'C:\\eggs'
~~~
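The two behaviours of move() can be tried out without touching real files by working inside a temporary directory. A sketch under that assumption; ham.txt and new_ham.txt are made-up names added for the rename case.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'eggs').mkdir()
    (root / 'bacon.txt').write_text('crispy')
    (root / 'ham.txt').write_text('smoked')

    # Destination is an existing folder: the file keeps its name.
    moved = shutil.move(str(root / 'bacon.txt'), str(root / 'eggs'))

    # Destination includes a filename: the file is moved and renamed.
    renamed = shutil.move(str(root / 'ham.txt'), str(root / 'eggs' / 'new_ham.txt'))

    moved_name = Path(moved).name
    renamed_name = Path(renamed).name
    source_still_exists = (root / 'bacon.txt').exists()

print(moved_name, renamed_name, source_still_exists)
```

move() returns the final destination path, which is why the new file names can be read from its return value.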