Random Question Flashcards

Question

What is the main advantage of using ensemble methods in machine learning?

Answer 1

They combine multiple models to improve predictive performance.

Answer 2

Data points that differ significantly from other observations in a dataset.

Answer 3

To summarize the main characteristics of a dataset, often using visual methods.

Answer 4

Batch learning processes data in batches, while online learning processes data one instance at a time.

Answer 5

Data -> linear model -> sigmoid function -> probability between 0 and 1 -> threshold classifier

Answer 6

P = 1 / (1 + exp(-y)), where y = ax + b
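
A minimal Python sketch of this formula, followed by a threshold step. The function names, the single-feature model, and the default threshold of 0.5 are assumptions for illustration; they do not come from the card.

```python
import math


def sigmoid(y: float) -> float:
    """Map a linear score y to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-y))


def classify(x: float, a: float, b: float, threshold: float = 0.5) -> int:
    """Linear model -> sigmoid -> threshold, for a single feature x."""
    y = a * x + b      # linear score
    p = sigmoid(y)     # probability of the positive class
    return 1 if p >= threshold else 0
```

For example, classify(2.0, a=1.0, b=-1.0) scores y = 1, which the sigmoid maps above 0.5, so the predicted class is 1.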

Answer 7

1) Calculate the entropy of the target and of the predictor attributes. 2) Calculate the information gain. 3) The root is the feature with the highest information gain. Repeat.
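
The three steps can be sketched in Python as follows. This is an illustrative sketch for categorical features only; the function names and the dict-of-lists input format are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the target minus the weighted entropy after splitting."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    weighted = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - weighted


def best_root(features, labels):
    """Name of the feature with the highest information gain."""
    return max(features, key=lambda name: information_gain(features[name], labels))
```

To grow a full tree, the same selection would be repeated on the subset of rows that falls into each branch.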

Answer 8

Select k records randomly (k < m). Calculate the node D using the best split. Repeat for daughter nodes. Repeat with another k.

Answer 9

1) Keep the model simple. 2) Detect it via cross-validation. 3) Regularization. 4) Add more data or use feature selection. 5) Early stopping.

Answer 10

Filter methods: LDA, chi-square, ANOVA. Wrapper methods: forward feature selection, backward feature elimination, recursive feature elimination (the other two look at one feature at a time).

Answer 11

Less storage space, less computational power, removal of redundant features.

Answer 12

lambda^3 - 4 lambda^2 - 27 lambda + 90
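
If this expression is a characteristic polynomial (an assumption, since the card does not state the matrix), its roots would be the eigenvalues. A small Python sketch that searches for integer roots among the divisors of the constant term:

```python
def p(lam: int) -> int:
    """Evaluate lambda^3 - 4 lambda^2 - 27 lambda + 90."""
    return lam**3 - 4 * lam**2 - 27 * lam + 90


# Any integer root of a monic integer polynomial divides the constant term.
candidates = [d for d in range(-90, 91) if d != 0 and 90 % d == 0]
roots = [d for d in candidates if p(d) == 0]
```

This finds the three roots -5, 3, and 6, so the polynomial factors as (lambda + 5)(lambda - 3)(lambda - 6).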

Answer 13

Collaborative filtering, content-based filtering.

Answer 14

Calculate the WSS, the sum of squared distances between the centroid and each member of a cluster, and look for the elbow (elbow method).
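
A minimal Python sketch of the WSS computation for a given cluster assignment. The function name and the input format (points as tuples, one cluster label per point) are assumptions; the clustering itself is not shown.

```python
def wss(points, labels):
    """Within-cluster sum of squared distances to each cluster centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(m[i] for m in members) / len(members) for i in range(dim)]
        total += sum(
            sum((m[i] - centroid[i]) ** 2 for i in range(dim)) for m in members
        )
    return total
```

For the elbow method, this value would be computed for k = 1, 2, 3, ... and plotted against k; the chosen k is where the curve stops dropping sharply.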

Answer 15

Remove them if they are garbage. Normalize. Use another model. Use an algorithm that is robust to outliers, such as random forest.

Answer 16

TP / (TP + FP)

Answer 17

TP / (TP + FN)

Answer 18

-sum(p * log2(p))

Answer 19

The output is categorical vs. continuous.

Answer 20

Bagging: aims to reduce variance in a noisy dataset: split the data, train models, average. Boosting is ensemble learning that strengthens weak models by learning from previous errors, e.g. gradient boosting (risk of overfitting).

Answer 21

2 x precision x recall / (precision + recall)
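
A short Python sketch of this formula, built on precision = TP / (TP + FP) and recall = TP / (TP + FN). The function names are illustrative, and the sketch assumes the denominators are non-zero.

```python
def precision(tp: int, fp: int) -> float:
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of actual positives that were found."""
    return tp / (tp + fn)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)
```

With tp=8, fp=2, fn=8, precision is 0.8, recall is 0.5, and the F1 score is 8/13, roughly 0.615.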

Answer 22

Linear dependency between feature and y Independence

Answer 23

Predictive analysis to find relationships between a dependent binary variable and independent features using the logistic regression equation.

Answer 24

Tool to classify data and determine the probabilities of an outcome of a system. The base is a root node, which branches into decision nodes and then into leaf nodes.

Answer 25

Eliminate leaves to avoid overfitting, using the Gini index.

Answer 26

Observed value - true value vs. observed value - estimated value.

Answer 27

Multiple models are used to improve predictive performance.

Answer 28

Classification algorithm that assumes that the features are independent.

Answer 29

Prediction and classification using hyperplanes to separate two classes.

Answer 30

To get the expected result, one should run the experiment a large number of times.

Answer 31

Variable that has an effect on others (cause and effect).

Answer 32

No, there are some local optima.

Answer 33

n! / ((n - x)! x!) * p^x * q^(n - x)
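
A minimal Python sketch of this formula, assuming q = 1 - p (the card does not define q explicitly) and using an illustrative function name:

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p  # assumed: probability of failure
    return comb(n, x) * p**x * q ** (n - x)
```

For example, binomial_pmf(2, 4, 0.5) is 6 * 0.25 * 0.25 = 0.375.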

Answer 34

False positive

Answer 35

False negative

Answer 36

L1: absolute value of the weights * lambda; leads to a sparse model with values near zero; good for high-dimensional data with irrelevant features. L2: squared values of the weights; prevents overfitting without eliminating features; works well with correlated features.
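
The two penalty terms can be sketched in Python as below. The function names are illustrative, and scaling the L2 term by the same lambda is an assumption; in practice either term is added to the training loss.

```python
def l1_penalty(weights, lam: float) -> float:
    """Lasso-style penalty: lambda times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam: float) -> float:
    """Ridge-style penalty: lambda times the sum of squared weights."""
    return lam * sum(w * w for w in weights)
```

For weights [1.0, -2.0, 0.0] and lambda 0.5, the L1 penalty is 1.5 and the L2 penalty is 2.5.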

Answer 37

Min-max scaling: (x - min) / (max - min). Z-score transformation: (x - mean) / std. Log transformation. Robust scaling.
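
A minimal Python sketch of min-max scaling and the z-score transformation. The function names are illustrative, the use of the population standard deviation is an assumption, and constant input (max equal to min, or zero standard deviation) is not handled.

```python
import statistics


def min_max_scale(xs):
    """Rescale values to the range [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def z_score(xs):
    """Center on the mean and divide by the population standard deviation."""
    mean = statistics.fmean(xs)
    std = statistics.pstdev(xs)
    return [(x - mean) / std for x in xs]
```

For example, min_max_scale([2, 4, 6]) gives [0.0, 0.5, 1.0].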

Answer 38

Visualize, statistical methods (z-score, IQR). 1) Removal 2) Transformation 3) Capping 4) Investigation.
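
A short Python sketch of IQR-based detection, removal, and capping. The function names and the conventional multiplier k = 1.5 are assumptions; quartiles use the default method of statistics.quantiles.

```python
import statistics


def iqr_bounds(xs, k: float = 1.5):
    """Lower and upper fences: Q1 - k * IQR and Q3 + k * IQR."""
    q1, _, q3 = statistics.quantiles(xs, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(xs, k: float = 1.5):
    """Values that fall outside the IQR fences."""
    lo, hi = iqr_bounds(xs, k)
    return [x for x in xs if x < lo or x > hi]


def cap(xs, k: float = 1.5):
    """Capping: clip every value to the IQR fences."""
    lo, hi = iqr_bounds(xs, k)
    return [min(max(x, lo), hi) for x in xs]
```

Removal would keep only the values not returned by find_outliers; capping keeps every row but limits extreme values.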

Answer 39

Transformation of categories into binary values.

Answer 40

One-hot encoding (0, 1). Label encoding (1, 2, 3, ...). Target encoding uses the mean of the target. Frequency encoding replaces the category with its frequency. Domain-specific encoding, i.e. encoded based on the distance from a central point.
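
The first four encodings can be sketched in plain Python as follows. The function names are illustrative, and ordering categories alphabetically is an assumption; domain-specific encoding is omitted because it depends on the problem.

```python
from collections import Counter, defaultdict


def one_hot(values):
    """One 0/1 column per category, columns in sorted category order."""
    cats = sorted(set(values))
    return [[1 if v == c else 0 for c in cats] for v in values]


def label_encode(values):
    """Map each category to an integer 1, 2, 3, ... in sorted order."""
    mapping = {c: i + 1 for i, c in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values]


def frequency_encode(values):
    """Replace each category with its relative frequency."""
    counts = Counter(values)
    return [counts[v] / len(values) for v in values]


def target_encode(values, targets):
    """Replace each category with the mean target value for that category."""
    sums, counts = defaultdict(float), Counter(values)
    for v, t in zip(values, targets):
        sums[v] += t
    return [sums[v] / counts[v] for v in values]
```

In practice, target encoding is usually fitted on training data only, since it uses the target values.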

Answer 41

Bias: error introduced by oversimplification; underfits the data. Variance: error from model sensitivity; overfits the data.

(68 cards)