Module 1: Intro to Data Science Flashcards
what is the predictive modelling process
Define the problem
Prepare data: finding the data you need and transforming it into a suitable format
Create a model: initial analysis of the dataset to find useful patterns and selecting one or more modelling approaches. If machine learning toolsets are used this will also involve training the model on a dataset of previously observed results for a given set of inputs
Test the model
Validate the model to confirm whether its predictions are correct, using data that was not used to train the model
Evaluate the model and compare several models to determine which one performs best
Deploy the model: in a business context, this means integrating the model into business operations
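The steps above can be sketched in a few lines of plain Python. This is only a toy illustration under stated assumptions: the data, the threshold rule, and every name here (train_threshold, accuracy, baseline_model) are invented for the example and are not part of the course material.

```python
# Toy data invented for illustration: (hours_studied, passed) pairs.
data = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 1), (6, 1),
        (2, 0), (5, 1), (1, 0), (6, 1)]

# Prepare data: keep the last four rows aside so they are never used in training.
train_rows, holdout_rows = data[:6], data[6:]


def train_threshold(rows):
    """Create a model: the midpoint between the mean hours of each class."""
    failed_hours = [hours for hours, label in rows if label == 0]
    passed_hours = [hours for hours, label in rows if label == 1]
    mean_failed = sum(failed_hours) / len(failed_hours)
    mean_passed = sum(passed_hours) / len(passed_hours)
    return (mean_failed + mean_passed) / 2


def accuracy(predict, rows):
    """Validate: share of rows where the prediction matches the known label."""
    correct = sum(1 for hours, label in rows if predict(hours) == label)
    return correct / len(rows)


threshold = train_threshold(train_rows)


def threshold_model(hours):
    return 1 if hours > threshold else 0


def baseline_model(hours):
    return 1  # naive comparison model: always predicts "passed"


# Evaluate: compare both models on the held-out rows and keep the best one.
scores = {
    "threshold": accuracy(threshold_model, holdout_rows),
    "baseline": accuracy(baseline_model, holdout_rows),
}
best_model = max(scores, key=scores.get)
print(scores, best_model)
```

Deployment is not shown, since it depends on the business system the model is integrated into.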
what is data mining, machine learning, and AI writ large
1.4 data mining
Data mining applies when the focus is less on developing a predictive model and more on discovering relationships within a dataset to present new and valuable insights
Using a large dataset of network events captured from a credit card processing system and identifying patterns that represent attempts by hackers to break in
Identifying behaviours of smartphone customers indicating they’re leaving for a competitor, so that the customer relationship team can proactively attempt to retain them
1.5 machine learning
A term used to describe the automatic extraction of knowledge from data
The knowledge can be derived from hard rules (statements that appear to always be true) or soft rules (statements that are probabilistically true)
1.6 artificial intelligence
Machine learning is a subset of knowledge and techniques within the much broader topic of AI
AI is intelligence exhibited by machines to perform human-like tasks that solve complex problems.
what is a variable in Python
Variable: a symbolic name that refers to a value. It acts as a container that holds information: you assign a value to a variable, then later on you can refer to the variable instead of the value
Starts with a letter or an underscore – can contain letters, numbers, and underscores
Snake case: underscores between words in a longer variable name (for example, first_name)
what are the different data types in Python
Integer: whole numbers, both positive and negative
Float: numbers with a decimal point
String: sequences of characters, enclosed in either single or double quotes
Boolean: represents truth values, and can have only two possible values: True or False – used in conditional statements to control the flow of the program
Example of defining a variable. type() returns the data type of a value.
ozan = "the goat"
print(ozan)
print(type(ozan))
type(9)    # int
type(9/3)  # float, because / always returns a float
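As a quick check of the four data types listed above, the sketch below assigns one value of each type and inspects it with type(). The variable names are made up for illustration.

```python
count = 42        # integer
price = 3.14      # float
name = "Ada"      # string
is_ready = True   # Boolean

values = (count, price, name, is_ready)
for value in values:
    print(value, type(value))

# Collect just the type names, e.g. 'int' for an integer.
type_names = [type(value).__name__ for value in values]
print(type_names)
```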
statements and expressions
A statement is a unit of code that the Python interpreter can execute. Think of it as an instruction telling Python what to do.
It represents an action that performs some operation – includes assignments (x = 5), conditional statements (if, else, elif), loops (for, while), function definitions (def)
Typically does not have a value; its purpose is to perform an operation or control the flow of the program
An expression is a combination of values, variables, operators, and function calls that evaluates to a single value – think of it as a calculation that always gives you a result
Represent computations and always produce a value
Arithmetic expressions (2 + 3), function calls (len("hello")), variable references (x), and more
The value can be an integer, float, string, etc.
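A minimal sketch of the difference, using made-up names: each line below is a statement (an assignment), while the right-hand side of each assignment is an expression that produces a value.

```python
x = 5                  # statement: assignment; the literal 5 is an expression
total = x + 2          # x + 2 is an arithmetic expression
length = len("hello")  # a function call is also an expression
same_x = x             # a bare variable reference is an expression too

print(total, length, same_x)
```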
Python supports the arithmetic operators +, -, *, /, // (floor division – divides and rounds the result down to a whole number, so 5 // 2 = 2), % (modulo – gives the remainder when one number is divided by another, so 5 % 2 = 1), and ** (exponentiation – raises one number to the power of another, so 5 ** 2 = 25).
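The operators can be tried side by side. The operands 7 and 2 are arbitrary values chosen for illustration.

```python
a, b = 7, 2

results = {
    "a + b": a + b,    # addition
    "a - b": a - b,    # subtraction
    "a * b": a * b,    # multiplication
    "a / b": a / b,    # true division, always returns a float
    "a // b": a // b,  # floor division
    "a % b": a % b,    # remainder
    "a ** b": a ** b,  # exponentiation
}
for expression, value in results.items():
    print(expression, "=", value)

# Floor division rounds down, which matters for negative numbers.
negative_floor = -7 // 2
print(negative_floor)
```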
what is a function
A named sequence of statements that performs a computation. Functions are blocks of reusable code that perform a specific task, defined by specifying a function name and a sequence of statements. A function either returns a value or, if it only performs an action, returns nothing useful (a void function). When you build a house you don’t want a new hammer every time you need one; you have one and you reuse it each time.
Once defined, a function can be called by its name – known as a function call
The definition of a new function must start with the keyword def, followed by the function’s name and a list of parameters in parentheses. This line of code, the function header, ends with a colon (:). If the function returns a value, the body ends with a return statement.
def function_name(param_1, param_2, …):
    do something with parameters
    …
    return final_result
Example:
def hello_func():
    print('Hello Function!')
hello_func()
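To illustrate the difference between a function that returns a value and a void function, here is a small sketch. The names greet and make_greeting are hypothetical. In Python, a function without a return statement returns None.

```python
def greet():
    # Performs an action (printing) and has no return statement.
    print('Hello Function!')


def make_greeting():
    # Returns a value that the caller can store or reuse.
    return 'Hello Function!'


result_void = greet()           # prints the text; the call itself gives None
result_value = make_greeting()  # nothing is printed; the text is returned

print(result_void)
print(result_value)
```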
You can also pass arguments to a function through its parameters.
def hello_func(greeting):
    return '{} Function.'.format(greeting)
print(hello_func('Hi'))
This makes the greeting parameter a required argument.
Another thing you can do is give a parameter a default value, which is used if no argument is passed for it.
def hello_func(greeting, name='You'):
    return '{}, {}'.format(greeting, name)
print(hello_func('Hi'))
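A default value can still be overridden by the caller, either by position or by keyword. The sketch below repeats the function so it runs on its own; the name 'Ozan' is just a sample argument.

```python
def hello_func(greeting, name='You'):
    return '{}, {}'.format(greeting, name)


default_result = hello_func('Hi')               # name falls back to 'You'
positional_result = hello_func('Hi', 'Ozan')    # second argument fills name
keyword_result = hello_func('Hi', name='Ozan')  # name passed by keyword

print(default_result)
print(positional_result)
print(keyword_result)
```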
what are conditionals
Conditionals use Boolean logic to decide which code runs, based on whether a condition is True or False.
The if branch executes when the condition evaluates to True; otherwise the else branch is executed
score = 92
if score > 50:
    print("You passed!")
else:
    print("You failed.")
Comparison operators (==, !=, <, >, <=, >=) are used for constructing Boolean expressions
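Each comparison operator produces a Boolean value. A short sketch, reusing the score of 92 from the example above (the comparison values are arbitrary):

```python
score = 92

checks = {
    "score == 92": score == 92,  # equal to
    "score != 92": score != 92,  # not equal to
    "score > 50": score > 50,    # greater than
    "score < 50": score < 50,    # less than
    "score >= 92": score >= 92,  # greater than or equal to
    "score <= 91": score <= 91,  # less than or equal to
}
for expression, value in checks.items():
    print(expression, "is", value)
```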
Another example
language = 'Python'
if language == 'Python':
    print('Conditional was True')
Chained Conditionals:
elif – a way to add additional conditions to an if statement
The initial if statement checks a condition; if it’s True, its code runs and the remaining elif and else branches are skipped
If it’s False, Python checks each elif condition in order and runs the code of the first one that is True; if none of them is True, the final else branch runs
A chain can have only one if and one else but any number of elif branches, so elif helps you create complex decision-making processes
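A chained conditional might look like the sketch below. The grade function and its cut-off scores are assumptions made up for this example.

```python
def grade(score):
    # Only the first branch whose condition is True runs.
    if score >= 90:
        return 'A'
    elif score >= 80:
        return 'B'
    elif score >= 70:
        return 'C'
    else:
        return 'F'


# 92 also satisfies score >= 80, but the chain stops at the first match.
print(grade(92))
print(grade(85))
print(grade(40))
```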
and / or / not keywords:
and returns True only if both expressions are True
or returns True if at least one of the Boolean expressions is True
not flips the Boolean value it is applied to: it turns True into False, and False into True
For example, if you only want to do something when a user is not logged in, you can set is_logged_in = False (you know someone isn’t logged in), then write if not is_logged_in: print("please log in") – not is_logged_in evaluates to True because the user is not logged in, so the message is printed
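The three keywords can be compared directly. In this sketch is_admin is a hypothetical flag added alongside is_logged_in for illustration.

```python
is_admin = True
is_logged_in = False

both = is_admin and is_logged_in   # True only if both are True
either = is_admin or is_logged_in  # True if at least one is True
negated = not is_logged_in         # flips False to True

message = ''
if not is_logged_in:
    message = 'please log in'

print(both, either, negated, message)
```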
Another example:
user = 'Admin'
logged_in = False
if user == 'Admin' and logged_in:
    print('Admin Page')
else:
    print('Bad Creds')
Note that certain values always evaluate to False in a condition:
False
None
0 (zero of any numeric type)
Any empty sequence or collection: '' () [] {}
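The built-in bool() function shows how a value is treated in a condition. A small sketch, with sample values chosen for illustration:

```python
falsy_values = [False, None, 0, 0.0, '', (), [], {}]
falsy_results = [bool(value) for value in falsy_values]

# Non-empty or non-zero values are treated as True, even a list holding 0.
truthy_values = ['a', [0], 1, -1]
truthy_results = [bool(value) for value in truthy_values]

print(falsy_results)
print(truthy_results)
```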
What is looping in python
For loop: can be used to access each item in a sequence such as a list or a string. The body is indented to show that everything in it belongs to the for loop.
nums = [1, 2, 3, 4, 5]
for num in nums:
    print(num)
Break statement – let’s say we’re looking for a certain value, and once we find it we don’t need to run the rest of the loop.
for num in nums:
    if num == 3:
        print('Found')
        break
    print(num)
Continue statement – skips the rest of the current iteration and continues with the next iteration of the loop
for num in nums:
    if num == 3:
        print('Found')
        continue
    print(num)
Note on for loops: the loop variable is a temporary variable. You can change it inside the loop, for example multiply it by 2, but that change exists only within the loop – it does not modify the list. It is called a loop because it visits every element in the collection in turn. To keep the changed values you need an extra line, such as calling .append on a new list
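A short sketch of this point, assuming a list called nums and a new list called doubled (both names are illustrative): reassigning the loop variable leaves the original list alone, while .append builds a list that keeps the results.

```python
nums = [1, 2, 3, 4, 5]

# Reassigning the loop variable only changes the temporary name.
for num in nums:
    num = num * 2
print(nums)  # the original list is unchanged

# To keep the doubled values, append them to a new list.
doubled = []
for num in nums:
    doubled.append(num * 2)
print(doubled)
```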
You are accessing each value, but you might want to know the index that you are on. This is done through the enumerate function.
for index, item in enumerate(my_list):
    print(index, item)
Range function – for times when you want to go through a loop a certain number of times
for i in range(10):
    print(i)
This will print out 0 through 9.
If you don’t want to start at 0, you can pass the number you want to start at. The stop value is not included, so this prints 1 through 10.
for i in range(1, 11):
    print(i)
While loops – for loops iterate through values, while loops keep going until their condition becomes False or until we hit a break
x = 0
while x < 10:
    print(x)
    x += 1
The last line increments x so that it eventually reaches 10 and the loop ends.
At any point you can put a break statement within the while loop.
x = 0  # reset x, since the previous loop left it at 10
while x < 10:
    if x == 5:
        break
    print(x)
    x += 1
Infinite loops – never end until we get some input or find some value
x = 0  # start again from 0
while True:
    if x == 5:
        break
    print(x)
    x += 1
You change the while condition to True – but here we have a nested conditional that says if you get to 5 you should break.