DS Entretien 2 Flashcards

1
Q

Bias and Variance

A

Bias = underfitting (too simple)

Variance = overfitting (too complex)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Time Complexity:

looping over N elements
for i in range(n)

A

O(N), linear

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Time Complexity:

Nested Loops
for i in range(n): for j in range(n):

A

O(N^2), quadratic

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Time Complexity:

Sorting
- order by column (SQL)
- quick sort, merge sort

A

O(N log N)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Time Complexity:

Using an index
- where id = x (SQL)

A

O (log N), logarathmic

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Space Complexity:
Single variable

A

O(1), linear

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Space Complexity:
Array of size N

A

O(N), linear

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Space Complexity:
2D matrix of size NxN

A

O(N^2), quadratic

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Complexity:

Select avg(sales) over (order by month rows between 2 preceding and current row) from sales_data

A
  • O(N log N) time complexity
  • Space O(1)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Complexity:

select customer
from orders
group by customer
having count(order_id)>2;

A
  • Group by causes extra space of O(N)
  • Time complexity is O(N log N) as counting order_id is O(N) and group by is O(N log N)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Complexity:

Select rank() over (partition by department order by salary desc) as rank_salary
From employees
Where rank_salary <= 3;

A
  • O(N log N) time, because database needs to sort the data into departments (partition by department)
  • O(N) space for the storage required when partitioning
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Complexity:

select tweet_id
from Tweets
where length(content)>15;

A
  • Time complexity: O(N) full table scan
  • O(1) space complexity, no extra storage beyond the query execution
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Complexity:

def count_salary_categories(accounts: pd.DataFrame) -> pd.DataFrame:
low = len(accounts[accounts.income<20000])
high = len(accounts[accounts.income>50000])
avg = len(accounts) - low - high
return pd.DataFrame(
{‘category’: [‘Low Salary’,’Average Salary’,’High Salary’], ‘accounts_count’:[low,avg,high]})

A
  • Time complexity O(N)
    - Len lines scan table, filtering data O(N), although Len operation itself is O(1)
    - Pd.dataframe creates 3 rows so its constant O(1)
    • Space complexity O(1)
      • Len lines are integers (constant space) O(1)
      • Dataframe is constant size too
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What’s a holdout set?

A

A holdout set is a portion of data that is separated from the training set and used to evaluate the performance of a machine learning model after it has been trained.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

When to use grid search or random search for hyper parameter tuning.

A

Grid search tries every possible parameter combination (comp expensive). Random only tries a subset but works pretty well.

Use Grid Search when you have <100 combinations and want precision.

Use Random Search when you have hundreds or thousands of possible combinations and need efficiency

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Why are insertions and deletions more efficient in linked lists than arrays?

A

Arrays require shifting elements when inserting/deleting from the beginning or middle, making these operations O(n). Linked lists use pointers, allowing O(1) insertion/deletion at the head since no shifting is needed. However, inserting or deleting in the middle or end still requires traversal (O(n)).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What is the time complexity of inserting an element at the head of a linked list?

A

O(1). The new node is created, pointed to the current head, and the head pointer is updated. No shifting is required.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What is the time complexity of inserting an element in the middle of a linked list?

A

O(n). The list must be traversed to find the insertion point before updating pointers.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

What is the time complexity of deleting an element at the head of a linked list?

A

O(1). The head pointer is updated to the next node, and the old head is freed.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

What is the time complexity of deleting an element in the middle of a linked list?

A

O(n). The list must be traversed to locate the node before updating pointers.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

How does inserting in an array compare to inserting in a linked list?

A

Inserting at the head of an array is O(n) due to shifting, while it is O(1) in a linked list. Inserting in the middle or end of both structures is O(n) unless a tail pointer is used in a linked list, which makes end insertions O(1).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

How does deleting in an array compare to deleting in a linked list?

A

Deleting at the head of an array is O(n) due to shifting, while it is O(1) in a linked list. Deleting in the middle or end is O(n) in both structures unless a tail pointer exists in a doubly linked list, making end deletions O(1).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

How does Merge Sort work, and what is its time complexity?

A

Merge Sort is a divide-and-conquer algorithm that:
1. Recursively splits the array into two halves until each half has one element.
2. Merges the sorted halves back together in a sorted order.

Time Complexity:
- Best, Worst, and Average Case: O(n log n)
- Space Complexity: O(n) (due to extra space for merging)
- Stable? Yes (maintains relative order of equal elements)

None

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

What is Quick Sort, and why is it efficient in practice?

A

Quick Sort is a divide-and-conquer algorithm that:
1. Selects a pivot element.
2. Partitions the array into elements smaller and larger than the pivot.
3. Recursively sorts the subarrays.

Time Complexity:
- Best and Average Case: O(n log n)
- Worst Case: O(n²) (if the pivot is always the smallest or largest element)
- Space Complexity: O(log n) (for recursion)
- Stable? No (swaps can change the relative order of equal elements)
- Why is it fast in practice? Has good cache locality and usually outperforms Merge Sort for large datasets.

None

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

How does Binary Search work, and what are its advantages?

A

Binary Search is an efficient search algorithm that:
1. Checks the middle element of a sorted array.
2. If the target is smaller, it searches the left half; if larger, it searches the right half.
3. Repeats until the element is found or the search space is empty.

Time Complexity: O(log n)
Space Complexity: O(1) (iterative), O(log n) (recursive)
Limitations: Only works on sorted arrays.

None

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

When is Linear Search better than Binary Search?

A

Linear Search checks each element one by one.
- Best for: Unsorted or small datasets.
- Time Complexity: O(n) (worst case)
- When to use it? When data is unsorted, or when you expect the target to be near the beginning.

None

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

What is Gradient Descent, and what are its types?

A

Gradient Descent is an optimization algorithm used in ML to minimize loss functions by updating model parameters.

Types:
- Batch Gradient Descent: Uses all data points; slow but stable.
- Stochastic Gradient Descent (SGD): Updates after each data point; noisy but faster.
- Mini-Batch Gradient Descent: Uses small batches; balances speed and stability.

None

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

What is the EM Algorithm, and where is it used?

A

The EM Algorithm finds maximum likelihood estimates for models with latent variables (e.g., Gaussian Mixture Models).
1. Expectation Step (E-step): Estimates missing data based on current parameters.
2. Maximization Step (M-step): Updates parameters to maximize likelihood.
3. Repeats until convergence.

Used in clustering (GMMs), HMMs, and topic modeling.

None

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

What is BFS, and when is it used?

A

BFS explores a graph level by level using a queue.
- Time Complexity: O(V + E) (V = vertices, E = edges)
- Used for: Finding shortest paths (unweighted graphs), web crawling, network broadcasting.

None

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

How does DFS work, and what are its applications?

A

DFS explores as deep as possible before backtracking.
- Time Complexity: O(V + E)
- Used for: Cycle detection, topological sorting, solving mazes.

None

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

What is Dijkstra’s Algorithm, and when does it fail?

A

Finds the shortest path in a weighted graph using a priority queue.
- Time Complexity: O((V + E) log V) with a priority queue
- Fails when: Graph has negative weights (use Bellman-Ford instead).

None

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

What is K-Means, and what are its limitations?

A

K-Means partitions data into K clusters by minimizing variance.
- Limitations: Sensitive to initialization, assumes spherical clusters, struggles with non-convex shapes.

None

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

How does DBSCAN handle noise better than K-Means?

A

DBSCAN groups dense regions and labels outliers separately.
- Advantages: Detects arbitrary-shaped clusters, handles noise well.
- Limitations: Struggles with varying density clusters.

None

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
34
Q

What does PCA do, and how does it work?

A

PCA reduces dimensionality by transforming data into uncorrelated principal components.
- Steps:
1. Standardize the data.
2. Compute the covariance matrix.
3. Find eigenvalues/eigenvectors.
4. Select top k eigenvectors.
- Used in: Feature reduction, noise filtering, data visualization.

None

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

Why is t-SNE better for visualization than PCA?

A

t-SNE preserves local structures in high-dimensional data, making clusters more visible, while PCA preserves global structure.
- Limitation: Computationally expensive, non-deterministic.

None

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
36
Q

How is LDA different from PCA?

A

LDA:
- Supervised: Maximizes class separability.
- Used for: Feature reduction in classification tasks.

PCA:
- Unsupervised: Maximizes variance.

None

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
37
Q

What is Dynamic Programming, and when is it useful?

A

DP solves problems by breaking them into overlapping subproblems and storing results to avoid redundant calculations.
- Used in: Time series forecasting, reinforcement learning, optimization problems (e.g., Knapsack, Fibonacci).

None

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
38
Q

What is the time complexity of Merge Sort?

A

O(n log n)

Merge Sort is a divide-and-conquer algorithm that splits the array into halves, sorts each half, and merges them back together.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
39
Q

What is the space complexity of Merge Sort?

A

O(n)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
40
Q

What is the description of Quick Sort?

A

A divide-and-conquer algorithm that selects a pivot and partitions the array around the pivot, sorting recursively.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
41
Q

What is the time complexity of Quick Sort in the worst case?

A

O(n²)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
42
Q

What is the space complexity of Quick Sort?

A

O(log n)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
43
Q

What does Binary Search do?

A

Finds the position of a target value within a sorted array by repeatedly dividing the search interval in half.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
44
Q

What is the time complexity of Binary Search?

A

O(log n)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
45
Q

What is the space complexity of Binary Search?

A

O(1)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
46
Q

What is Ternary Search?

A

A variation of binary search that divides the array into three parts and eliminates two-thirds of the search space each time.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
47
Q

What is the time complexity of Ternary Search?

A

O(log₃ n)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
48
Q

What is the space complexity of Ternary Search?

A

O(1)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
49
Q

What is Counting Sort?

A

A non-comparative sorting algorithm that counts the occurrences of each element and uses this information to place elements in the correct position.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
50
Q

What is the time complexity of Counting Sort?

A

O(n + k)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
51
Q

What is the space complexity of Counting Sort?

52
Q

What is Radix Sort?

A

A non-comparative sorting algorithm that sorts numbers digit by digit starting from the least significant digit.

53
Q

What is the time complexity of Radix Sort?

54
Q

What is the space complexity of Radix Sort?

55
Q

What does BFS stand for?

A

Breadth-First Search

56
Q

What is the time complexity of BFS?

57
Q

What is the space complexity of BFS?

58
Q

What is the description of DFS?

A

A graph traversal algorithm that explores as far as possible along each branch before backtracking.

59
Q

What is the time complexity of DFS?

60
Q

What is the space complexity of DFS?

61
Q

What is Dijkstra’s Algorithm used for?

A

Finds the shortest paths from a source node to all other nodes in a weighted graph.

62
Q

What is the time complexity of Dijkstra’s Algorithm using priority queue?

A

O(E + V log V)

63
Q

What is the space complexity of Dijkstra’s Algorithm?

64
Q

What does the Bellman-Ford Algorithm handle?

A

Negative edge weights and detects negative weight cycles.

65
Q

What is the time complexity of Bellman-Ford Algorithm?

66
Q

What is the space complexity of Bellman-Ford Algorithm?

67
Q

What is the A* Algorithm?

A

A pathfinding algorithm that uses heuristics to find the shortest path while optimizing search time.

68
Q

What is the time complexity of the A* Algorithm?

69
Q

What is the space complexity of the A* Algorithm?

70
Q

What does the Floyd-Warshall Algorithm do?

A

Finds all pairs shortest paths in a graph.

71
Q

What is the time complexity of the Floyd-Warshall Algorithm?

72
Q

What is the space complexity of the Floyd-Warshall Algorithm?

73
Q

What is Kruskal’s Algorithm used for?

A

Finding the minimum spanning tree of a graph by adding edges in increasing weight order.

74
Q

What is the time complexity of Kruskal’s Algorithm?

A

O(E log E)

75
Q

What is the space complexity of Kruskal’s Algorithm?

76
Q

What does Prim’s Algorithm do?

A

Finds the minimum spanning tree by growing the tree one vertex at a time.

77
Q

What is the time complexity of Prim’s Algorithm?

A

O(E log V)

78
Q

What is the space complexity of Prim’s Algorithm?

79
Q

What is Tarjan’s Algorithm used for?

A

Finding strongly connected components (SCC) in a directed graph.

80
Q

What is the time complexity of Tarjan’s Algorithm?

81
Q

What is the space complexity of Tarjan’s Algorithm?

82
Q

What does Hopcroft-Karp Algorithm find?

A

The maximum matching in a bipartite graph.

83
Q

What is the time complexity of Hopcroft-Karp Algorithm?

A

O(√V * E)

84
Q

What is the space complexity of Hopcroft-Karp Algorithm?

85
Q

What is Topological Sort?

A

A linear ordering of vertices in a Directed Acyclic Graph (DAG).

86
Q

What is the time complexity of Topological Sort?

87
Q

What is the space complexity of Topological Sort?

88
Q

What does Kadane’s Algorithm find?

A

The maximum sum subarray in an array of integers.

89
Q

What is the time complexity of Kadane’s Algorithm?

90
Q

What is the space complexity of Kadane’s Algorithm?

91
Q

What is Floyd’s Cycle Detection (Tortoise and Hare)?

A

A fast and slow pointer algorithm used to detect cycles in a linked list.

92
Q

What is the time complexity of Floyd’s Cycle Detection?

93
Q

What is the space complexity of Floyd’s Cycle Detection?

94
Q

What does the Knapsack Algorithm solve?

A

The 0/1 knapsack problem by finding the optimal selection of items.

95
Q

What is the time complexity of the Knapsack Algorithm?

96
Q

What is the space complexity of the Knapsack Algorithm?

97
Q

What is the Longest Common Subsequence (LCS)?

A

A dynamic programming algorithm to find the longest subsequence common to two sequences.

98
Q

What is the time complexity of Longest Common Subsequence?

99
Q

What is the space complexity of Longest Common Subsequence?

100
Q

What does the Bellman-Ford Algorithm (DP version) do?

A

Finds the shortest paths from a source to all vertices in a graph with negative weights.

101
Q

What is the time complexity of Bellman-Ford Algorithm (DP version)?

102
Q

What is the space complexity of Bellman-Ford Algorithm (DP version)?

103
Q

What does Longest Increasing Subsequence (LIS) find?

A

The longest subsequence in a sequence where the elements are in strictly increasing order.

104
Q

What is the time complexity of Longest Increasing Subsequence?

105
Q

What is the space complexity of Longest Increasing Subsequence?

106
Q

What does the Coin Change Problem solve?

A

Finds the minimum number of coins required to make a given amount using a set of denominations.

107
Q

What is the time complexity of the Coin Change Problem?

108
Q

What is the space complexity of the Coin Change Problem?

109
Q

What is Union-Find (Disjoint Set Union)?

A

A data structure to manage a partition of a set into disjoint subsets.

110
Q

What is the time complexity of Union-Find per operation?

111
Q

What is the space complexity of Union-Find?

112
Q

What is a Segment Tree?

A

A tree-based data structure that supports range queries and updates efficiently.

113
Q

What is the time complexity of Segment Tree for queries and updates?

114
Q

What is the space complexity of Segment Tree?

115
Q

What is a Trie (Prefix Tree)?

A

A tree-based data structure used for efficient string storage and retrieval.

116
Q

What is the time complexity of Trie for search/insert?

117
Q

What is the space complexity of Trie?

118
Q

What does KMP Algorithm stand for?

A

Knuth-Morris-Pratt

119
Q

What is the time complexity of KMP Algorithm?

120
Q

What is the space complexity of KMP Algorithm?

121
Q

What is Reservoir Sampling?

A

An algorithm for selecting k random samples from a stream of data where the total number of elements is unknown.

122
Q

What is the time complexity of Reservoir Sampling?

123
Q

What is the space complexity of Reservoir Sampling?

124
Q

What is Fenwick Tree (Binary Indexed Tree)?

A

A data structure that provides efficient methods for querying and updating prefix sums in a dynamic array.

125
Q

What is the time complexity of Fenwick Tree for updates and queries?

126
Q

What is the space complexity of Fenwick Tree?