1 Flashcards

1
Q
  1. How do you execute code in a Jupyter Notebook? What happens when you press SHIFT+ENTER?
A

To execute code in a Jupyter Notebook, select a cell and press Shift and Enter at the same time. This runs the current cell and moves the selection to the next cell.

2
Q
  1. How do you assign a value to a variable in Python?
A

Write the variable name, then the assignment operator =, then the value (a number, string, list, or dictionary).

Number: x = 5
String: name = "Alice"
List: numbers = [1, 2, 3, 4, 5]
Dictionary: person = {"name": "Bob", "age": 30}

3
Q
  1. What are the differences between a regular variable, a list, and a dictionary?
A

A regular variable in Python stores a single data point. A variable can hold data of any type, like integers, floating-point numbers, strings, or even complex objects.

A list stores more than one element in an ordered sequence, and each element is accessed by its integer position.

A dictionary in Python is a collection of data in key:value pair form (since Python 3.7 it keeps the order in which items were inserted). A dictionary stores more than one element and links each key to its value.
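
A minimal sketch of the three structures side by side. The names `price`, `prices`, and `price_by_item` and their values are illustrative assumptions, not part of the original cards.

```python
# A regular variable holds a single value.
price = 4.5

# A list holds several values in order; access them by integer position.
prices = [4.5, 2.0, 7.25]
first_price = prices[0]

# A dictionary links each key to a value; access values by key.
price_by_item = {"coffee": 4.5, "tea": 2.0, "cake": 7.25}
tea_price = price_by_item["tea"]

print(price, first_price, tea_price)
```

The list is read by position (`prices[0]`), while the dictionary is read by a meaningful key (`price_by_item["tea"]`).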

4
Q
  1. How would you ensure the input data is a string?
A

data = input("Enter something: ")  # input() always returns a string
data = str(data)  # Ensures that data is a string (a no-op here)
print(data)

OR

print(type(data))  # Prints the type of the input: <class 'str'>
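
When the value does not come from `input()`, one way to check it is the built-in `isinstance`. The sketch below uses a hypothetical helper named `is_string`; it is an illustration, not part of the original answer.

```python
def is_string(value):
    """Return True when value is a str, False otherwise."""
    return isinstance(value, str)

print(is_string("42"))       # a string literal
print(is_string(42))         # an integer
print(is_string(str(42)))    # an integer converted with str()
```

Converting with `str()` first and checking afterwards is a simple way to guarantee the rest of the code receives a string.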

5
Q
  1. How do you define a string in Jupyter?
A

Put the text in quotation marks (single or double).

Welcome = "Welcome to my exam"

6
Q
  1. How can you determine the type of a variable in Python?
A

Use the type function, passing the variable itself (without quotation marks).

variable = 5
print(type(variable))  # Prints the type of the variable: <class 'int'>

7
Q
  1. What is the purpose of an if statement?
A

It is a fundamental control structure that enables a program to make decisions and execute different code branches based on whether a given condition is True or False.

Temperature = 25  # Assign a value to the variable

if Temperature > 24:  # If statement to check if the number is greater than 24
    print("The weather is hot outside.")

8
Q
  1. How would you check one or more conditions?
A

if checks the first condition and runs its block when that condition is True.
Use if on its own for one-condition problems.

Temperature = 25  # Assign a value to the variable

if Temperature > 24:  # If statement to check if the number is greater than 24
    print("The weather is hot outside.")

else does not have a condition; it catches anything that wasn't caught by the preceding if and elif statements. It's essentially the "default" action when all other conditions fail. Use if with else for two-outcome problems.

if Temperature > 24:  # If statement to check if the number is greater than 24
    print("The weather is hot outside.")
else:
    print("The weather is not hot outside")

elif follows an if or another elif and provides additional conditions to check. It is useful when multiple, mutually exclusive conditions need to be checked sequentially. Use elif for more than two condition problems.

if Temperature > 24:  # If statement to check if the number is greater than 24
    print("The weather is hot outside.")
elif Temperature > 20:
    print("The weather is okay outside")
elif Temperature < 15:
    print("The weather is not good outside")
else:
    print("The weather is not hot outside")

9
Q
  1. How is a dictionary different from a list in Python?
A

List: A list is an ordered sequence of elements. Each element in a list can be accessed by its position, or index, within the list. Indexes in lists are integers starting from 0 for the first element.

List: numbers = [1, 2, 3, 4, 5]

Dictionary: A dictionary is a collection of data in a key-value pair format (since Python 3.7 it keeps insertion order, but its elements are accessed by key, not by position). Each element in a dictionary is stored as a key paired with a value. The keys must be unique within a single dictionary, and they are typically used to describe or identify the associated value. Each key is linked to its value.

Dictionary: person = {"name": "Bob", "age": 30}

10
Q
  1. What are the use cases of lists and dictionaries? (When is it best to use a list and when is it best to use a dictionary?)
A

List: Ideal for ordered tasks where the arrangement of elements is significant, such as storing a list of numbers, processing items in a specific sequence, or iterating through elements in the order they were added.

Dictionary: Used when you need a logical association between keys and values. Looking up a value by its key in a dictionary is typically faster than searching through a list, since dictionaries use a hashing mechanism to store and retrieve data. They are ideal for representing real-world data that involves mapping between unique identifiers and data (like a database).
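
A small sketch of the lookup difference. The student IDs and the helper `find_in_list` are hypothetical and only illustrate the idea: the list must be scanned item by item, while the dictionary goes straight to the key.

```python
# Hypothetical data: the same records stored two ways.
students_list = [("s001", "Lukas"), ("s002", "Hayat"), ("s003", "Emma")]
students_dict = {"s001": "Lukas", "s002": "Hayat", "s003": "Emma"}

def find_in_list(records, student_id):
    """Scan the list item by item until the ID matches."""
    for record_id, name in records:
        if record_id == student_id:
            return name
    return None

name_from_list = find_in_list(students_list, "s003")  # checks up to every item
name_from_dict = students_dict.get("s003")            # direct lookup by key
print(name_from_list, name_from_dict)
```

Both approaches return the same name; the difference shows up in speed as the collection grows.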

11
Q
  1. Explain the difference between the “and” & “or” operators
A

Functionality: The and operator checks that all conditions specified are True. If all conditions are True, then the and expression itself returns True. If any one of the conditions is False, the and expression returns False.

age = 17  # Assign values to variables
is_citizen = True

if age >= 17 and is_citizen:  # If statement using the 'and' operator to check both conditions
    print("You are eligible to vote.")
else:
    print("You are not eligible to vote.")

Functionality: The or operator checks if at least one of the conditions is True. The or expression returns True as soon as one of its conditions is True. It only returns False if all conditions are False.

if age >= 17 or is_citizen:  # If statement using the 'or' operator to check whether at least one condition is True
    print("You are eligible to vote.")
else:
    print("You are not eligible to vote.")

12
Q
  1. What libraries are necessary to visualize our data?
A

Seaborn and Matplotlib are two of the most popular libraries in Python for data visualization.

Import seaborn: import seaborn as sns

Import Matplotlib: import matplotlib.pyplot as plt

13
Q
  1. In what cases do we need to import Seaborn and Matplotlib?
A

When we want to visualize our data and find trends. They offer a wide array of plots and customization options, making them highly flexible for creating sophisticated and professional figures.

14
Q
  1. What construct can run a block of code as many times as we need?
A

The for loop (a statement, not a function)

# Define a list of numbers
numbers = [1, 2, 3, 4, 5]

# Loop through each number in the list
for number in numbers:
    print(number)

15
Q
  1. Why do we use For Loops?
A

Repetition: For loops are used to repeat a block of code multiple times. For example, if you want to perform the same operation on every element of a list, a for loop can automate and streamline this process.

Control Structure: For loops can contain decision-making logic inside the loop body, using conditional statements like if, to perform different actions depending on the item.
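
A minimal sketch combining both points, repetition and a decision inside the loop. The list `labels` is an illustrative name, not something from the original cards.

```python
numbers = [1, 2, 3, 4, 5]
labels = []

# Repeat the same block for every element, deciding per item what to do.
for number in numbers:
    if number % 2 == 0:
        labels.append(f"{number} is even")
    else:
        labels.append(f"{number} is odd")

print(labels)
```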

16
Q
  1. What was the solution for finding the best match in the Tinder data for all users at once, instead of coding each user manually?
A

A for loop. It runs the same matching code once per user, so there is no need to write or run it separately for each individual.
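
The cards do not show the actual Tinder dataset or matching rule, so the sketch below is built entirely on assumptions: hypothetical users, hypothetical interests, and "best match" defined as the other user with the most shared interests.

```python
# Hypothetical data: each user's interests (not the actual course dataset).
interests = {
    "Lukas": {"hiking", "music", "cooking"},
    "Hayat": {"music", "cooking", "films"},
    "Emma": {"films", "travel"},
}

best_match = {}
for user in interests:                 # one pass per user, no manual repetition
    best_candidate = None
    best_score = -1
    for candidate in interests:
        if candidate == user:
            continue                   # a user cannot match with themselves
        # Assumed scoring rule: number of shared interests.
        score = len(interests[user] & interests[candidate])
        if score > best_score:
            best_candidate = candidate
            best_score = score
    best_match[user] = best_candidate

print(best_match)
```

Adding a fourth user only requires a new dictionary entry; the loop itself stays unchanged.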

17
Q
  1. What is the PuLP library for?
A

PuLP is a popular open-source library in Python used for linear programming. It enables users to formulate mathematical optimization problems using Python expressions and solve them using various supported solvers.

What type of problems would be solved by PuLP?
PuLP is primarily used for linear optimization problems but can also handle integer and mixed-integer programming problems.

18
Q
  1. What are the steps to code an LP mathematical model in PuLP?
A

  1. Import the PuLP library
  2. Specify the name of the problem and whether it is a maximization or minimization problem
  3. Define the decision variables
  4. Define the objective function
  5. Add the constraints
  6. Solve the problem
  7. Retrieve and display the results
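
The steps can be sketched end to end as follows. The two constraints and their names ("Capacity", "Demand_A") are hypothetical, added only so the model has something to solve, and the sketch assumes the CBC solver bundled with PuLP is available.

```python
# Step 1: import the PuLP library.
from pulp import LpMaximize, LpProblem, LpStatus, LpVariable, PULP_CBC_CMD, value

# Step 2: name the problem and state that it is a maximization.
problem = LpProblem("Maximize_Profit", LpMaximize)

# Step 3: define the decision variables.
A = LpVariable("A", lowBound=0, cat="Continuous")
B = LpVariable("B", lowBound=0, cat="Continuous")

# Step 4: define the objective function.
problem += 5 * A + 3 * B, "Total_Profit"

# Step 5: add constraints (hypothetical limits, for illustration only).
problem += A + B <= 10, "Capacity"
problem += A <= 6, "Demand_A"

# Step 6: solve, with solver log output switched off.
problem.solve(PULP_CBC_CMD(msg=False))

# Step 7: retrieve and display the results.
status = LpStatus[problem.status]
print(status, A.varValue, B.varValue, value(problem.objective))
```

With these assumed constraints the solver should pick A = 6 and B = 4, giving a total profit of 42.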

19
Q
  1. What elements are essential to define a variable in PuLP?
A

Name, lower bound, upper bound, and category (continuous, integer, or binary)

from pulp import LpVariable

x = LpVariable("x", lowBound=0, upBound=1000, cat="Continuous")

y = LpVariable("y", lowBound=0, upBound=1000, cat="Integer")

z = LpVariable("z", cat="Binary")  # Binary variables only take the values 0 or 1

20
Q
  1. How would you define an objective function in PuLP?
A

Set a name for the model and define its type (max or min), define the decision variables, then add the objective expression to the problem with +=.

from pulp import LpMaximize, LpProblem, LpVariable

problem = LpProblem("Maximize_Profit", LpMaximize)

A = LpVariable("A", lowBound=0, upBound=100, cat="Continuous")

B = LpVariable("B", lowBound=0, upBound=100, cat="Continuous")

problem += 5 * A + 3 * B, “Total_Profit”

21
Q
  1. What are the outputs of PuLP solvers?
A

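The solution status, the optimal values of the decision variables, and the optimal value of the objective function.

Is it possible for the optimal value of a variable to be zero? How?

Yes. It depends on the objective function, the coefficients of the variables, and the constraints: a variable that contributes less to the objective than its alternatives can end up at zero.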
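
A minimal sketch of a variable ending at zero, assuming a single hypothetical constraint and the CBC solver bundled with PuLP. Both variables compete for the same ten units, and A earns more per unit, so the solver gives everything to A.

```python
from pulp import LpMaximize, LpProblem, LpVariable, PULP_CBC_CMD

problem = LpProblem("Zero_Variable_Example", LpMaximize)
A = LpVariable("A", lowBound=0)
B = LpVariable("B", lowBound=0)

problem += 5 * A + 3 * B      # objective: A has the larger coefficient
problem += A + B <= 10        # hypothetical shared limit

problem.solve(PULP_CBC_CMD(msg=False))
print(A.varValue, B.varValue)
```

Here B is feasible at any value from 0 to 10, but the optimum sets it to 0 because every unit given to B would be worth more if given to A.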

22
Q
  1. What is the difference between optimality and feasibility?
A

Feasibility refers to whether a given solution satisfies all the constraints of the optimization problem.

Optimality, on the other hand, refers to a solution that not only satisfies all the constraints of the optimization problem (i.e., it is feasible) but also maximizes or minimizes the objective function. An optimal solution is the best possible solution among all feasible solutions according to the criterion defined by the objective function.
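
The distinction can be illustrated without a solver. The constraints, the profit function, and the candidate points below are all hypothetical; the sketch only compares a handful of candidates and does not search the whole feasible region.

```python
def is_feasible(a, b):
    """A point is feasible when it satisfies every constraint."""
    return a >= 0 and b >= 0 and a + b <= 10 and a <= 6

def profit(a, b):
    """Objective function to maximize."""
    return 5 * a + 3 * b

candidates = [(2, 3), (6, 4), (8, 2), (0, 10)]

# Feasibility: keep only the points that satisfy all constraints.
feasible = [point for point in candidates if is_feasible(*point)]

# Optimality (among these candidates): the feasible point with the best objective.
best = max(feasible, key=lambda point: profit(*point))

print(feasible, best, profit(*best))
```

The point (8, 2) has the highest profit of the four, but it violates a <= 6, so it is not feasible and cannot be optimal.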

23
Q
  1. What is a DataFrame in pandas?
A

A DataFrame is a two-dimensional data structure with labeled axes (rows and columns). It is one of the most commonly used data structures in data science and analytics. A DataFrame can represent a data set just like a real table with headers.

24
Q
  1. Which library is essential for using DataFrames?
A

Pandas is a powerful library for data manipulation and analysis in Python. It provides data structures and functions designed to make working with structured data fast, easy, and expressive.

NumPy is a fundamental package for scientific computing with Python. It provides support for multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays efficiently.

Pandas relies heavily on NumPy arrays for its internal data representation and operations. When you create a Pandas DataFrame, the data is stored internally as one or more NumPy arrays.

25
Q
  1. What are the differences between a DataFrame, a list, and a dictionary?
A

List: A list is an ordered sequence of elements. Each element in a list can be accessed by its position, or index, within the list. Indexes in lists are integers starting from 0 for the first element.

List: numbers = [1, 2, 3, 4, 5]

Dictionary: A dictionary is a collection of data in a key-value pair format (since Python 3.7 it keeps insertion order, but its elements are accessed by key, not by position). Each element in a dictionary is stored as a key paired with a value. The keys must be unique within a single dictionary, and they are typically used to describe or identify the associated value. Each key is linked to its value.

Dictionary: person = {"name": "Bob", "age": 30}

DataFrame: A DataFrame is a two-dimensional data structure with labeled axes (rows and columns). It is one of the most commonly used data structures in data science and analytics. A DataFrame can represent a data set just like a real table with headers.

DataFrame:
import pandas as pd

# Each dictionary key becomes a column header; each list becomes that column's values
data = {
    "age": [19, 21, 22],
    "name": ["Lukas", "Hayat", "Emma"],
    "city": ["Odense", "Odense", "Odense"],
}
df = pd.DataFrame(data)
print(df)

26
Q
  1. Which code can locate and index specific data in a DataFrame?
A

The loc method is used for label-based indexing and selection. It allows you to access and manipulate data using row and column labels (names), which can be particularly intuitive and readable when working with DataFrames that have meaningful labels.

The iloc method is used for integer-based indexing and selection. It allows you to access and manipulate data using integer positions (indices), which can be useful when you want to access data based on its positional location within the DataFrame.

loc vs iloc: loc uses labels (names) and is more intuitive when working with labeled data, while iloc uses integer positions and is useful when you know the exact position of the data.

print(df.iloc[0,1])
print(df.loc[0])
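
The difference is easier to see when the row labels are not integers. The sketch below assumes a small DataFrame whose row labels are names; using names as the index is an illustrative choice, not something stated in the original cards.

```python
import pandas as pd

df = pd.DataFrame(
    {"age": [19, 21, 22], "city": ["Odense", "Odense", "Odense"]},
    index=["Lukas", "Hayat", "Emma"],  # row labels (assumed for illustration)
)

by_label = df.loc["Hayat", "age"]   # loc: row label and column label
by_position = df.iloc[1, 0]         # iloc: second row, first column

print(by_label, by_position)
```

Both expressions reach the same cell; `loc` finds it by name and `iloc` finds it by position.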

27
Q
  1. Explain two ways to import datasets into a notebook.
A

Importing the dataset by its full path:
file_path = 'path/to/your/dataset.xlsx'
df = pd.read_excel(file_path)
or
Uploading the file to the Jupyter notebook's directory, then referring to it by its file name alone.

28
Q
  1. What are the steps to clean a data set?
A

Find and remove duplicated data
Find and remove missing (NA) data
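
A minimal pandas sketch of both steps. The small DataFrame is hypothetical, and dropping rows is only one way to handle missing values; it is shown here because the card describes removal.

```python
import pandas as pd

# Hypothetical data with one duplicated row and one missing value.
df = pd.DataFrame({
    "name": ["Lukas", "Hayat", "Hayat", "Emma"],
    "age": [19, 21, 21, None],
})

n_duplicates = int(df.duplicated().sum())   # rows that repeat an earlier row
n_missing = int(df.isna().sum().sum())      # cells with missing values

# Remove the duplicated rows, then the rows that contain missing values.
clean = df.drop_duplicates().dropna()
print(n_duplicates, n_missing)
print(clean)
```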

29
Q
  1. How can we deal with duplicated data in our datasets?
A

Find them (in pandas, for example with duplicated()) and remove them (for example with drop_duplicates()).

30
Q
  1. What are the benefits of using scatter plots in data analysis?
A

Scatter plots are useful for visualizing the relationship between two numerical variables, helping to identify correlations, trends, clusters, and outliers in the data.

31
Q
  1. How can we measure the strength of the relationship (correlation) between variables?
A

The correlation method quantifies the strength of the relationship between two numeric variables. This helps in understanding how closely changes in one variable are associated with changes in another.
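
As an illustration, the Pearson correlation coefficient can be computed by hand. The function name `pearson_r` and the sample data are assumptions made for this sketch; in practice pandas offers `DataFrame.corr` for the same purpose.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

hours = [1, 2, 3, 4, 5]
score_up = [2, 4, 6, 8, 10]      # rises in step with hours
score_down = [10, 8, 6, 4, 2]    # falls in step with hours

r_up = pearson_r(hours, score_up)
r_down = pearson_r(hours, score_down)
print(r_up, r_down)
```

The first pair moves perfectly together and the second perfectly in opposite directions, so the coefficients sit at the two extremes of the scale.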

32
Q
  1. Can you explain the difference between using a scatter plot and the correlation method to find a relationship between two variables?
A

A scatter plot shows the general shape of the relationship between two variables visually, whereas the correlation method measures the strength of that relationship as a single number.

33
Q
  1. What are the main correlation coefficient values?
A

0, 1 and -1

34
Q
  1. Can you explain the correlation coefficient values?
A
  • +1 or -1: Values of +1 or -1 represent a perfect linear relationship, with +1 being a perfect positive linear relationship and -1 being a perfect negative linear relationship.
  • Values Close to +1 (e.g., 0.8, 0.9): These indicate a strong positive linear relationship. It means that as one variable increases, the other variable tends to also increase. The closer the coefficient is to +1, the stronger the association. For example, a correlation coefficient of 0.9 suggests a very strong positive relationship.
  • Values Close to -1 (e.g., -0.8, -0.9): These indicate a strong negative linear relationship. It means that as one variable increases, the other variable tends to decrease. A correlation coefficient of -0.9 also indicates a very strong relationship but in the opposite direction.
  • Values Around 0 (e.g., -0.1, 0, +0.1): These suggest no linear relationship. As one variable increases, there is no predictable change in the other variable. A correlation coefficient of 0 means there is no linear correlation between the variables.
35
Q
  1. What are the main purposes of using the correlation method?
A

Understanding correlation coefficients and their values helps in multiple aspects:

Data Analysis: Identifying which variables are related and how changes in one might impact another.

Feature Selection: In machine learning, choosing features that are strongly correlated with the target variable but minimally correlated with each other.