ML_SWEngDevOps Flashcards
Data science report/experiment workflow
- get dataset
- clean data
- process data
- optimize hyperparameters
- call fit/predict
- report results
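A minimal sketch of these steps with scikit-learn; the CSV path, the ‘label’ column and the hyperparameter grid are illustrative assumptions, not part of the card:
```python
# Sketch of the experiment workflow (paths, columns and settings are assumptions).
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

# get dataset
df = pd.read_csv("data/raw.csv")

# clean data: drop duplicates and rows with missing values
df = df.drop_duplicates().dropna()

# process data: split features/target and train/test
X, y = df.drop(columns=["label"]), df["label"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# optimize hyperparameters on the training data via cross-validation
pipe = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression(max_iter=1000))])
search = GridSearchCV(pipe, {"clf__C": [0.1, 1.0, 10.0]}, cv=5)

# call fit/predict
search.fit(X_train, y_train)
y_pred = search.predict(X_test)

# report results
print("best params:", search.best_params_)
print(classification_report(y_test, y_pred))
```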
CI/CD - DevOps pipeline
iterate through:
- code and push to version control (VCS)
- test
- build
- deploy
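A hedged sketch of the test/build/deploy stages as a plain Python driver script; real pipelines normally live in the CI system's own config, and the pytest/docker/deploy commands here are illustrative assumptions:
```python
# Sketch of the CI/CD stages triggered after a push (commands are illustrative).
import subprocess
import sys

def run(stage: str, cmd: list[str]) -> None:
    print(f"--- {stage} ---")
    if subprocess.run(cmd).returncode != 0:
        sys.exit(f"{stage} failed, stopping the pipeline")

run("test",   ["pytest", "-q"])                            # test
run("build",  ["docker", "build", "-t", "myapp:ci", "."])  # build
run("deploy", ["./deploy.sh", "staging"])                  # deploy (assumed script)
```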
Testing in ML
- applicable in 2 different scenarios:
- using the dev (validation) set for estimator optimization
- using the test set for variance and performance evaluation
- different from unit and integration testing
- test data sets must be kept separate from training data (sketch below)
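A minimal sketch of keeping the dev and test sets separate from the training data, assuming scikit-learn and a toy dataset; the split sizes are arbitrary:
```python
# Sketch: separate train / dev (validation) / test sets (split sizes are illustrative).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# test set: used only once, for the final variance/performance evaluation
X_trainval, X_test, y_trainval, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# dev (validation) set: used repeatedly for estimator/hyperparameter optimization
X_train, X_dev, y_train, y_dev = train_test_split(X_trainval, y_trainval, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("dev score (tune against this):", model.score(X_dev, y_dev))
print("test score (report once, at the end):", model.score(X_test, y_test))
```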
ML training and DevOps
- ML training time can be much longer than CI/CD test, build time
- run training outside the CI/CD cycle, on its own timeline
- ML training data should not be kept in the same repo as the code (for size and business reasons)
- given fixed performance targets, retraining a model may not produce the desired outcome even without any bug in the code
ML as a separate service
Pros:
- clear separation of responsibilities
- ability to use different programming languages & frameworks suitable for the task
Cons:
- unclear boundaries for the ML service
std DevOps methods - technical debt
- refactoring
- increase code coverage with unit tests
- remove dead code
- decrease dependencies
- tighten APIs
- improve documentation
ML related technical debt
- blurred system-level abstraction boundaries
- signal reuse that increases component coupling
- glue code needed around the ML black boxes
- changes in real-world signals may unexpectedly change ML system behaviour, increasing maintenance cost
Changing Anything Changes Everything (CACE) / Entanglement
- ML systems mix signals together, creating entanglement; isolating the models from their data sources is effectively impossible
- no inputs are ever really independent (see the toy illustration after this list)
- changes in hyperparameters have a similarly entangled effect on behavior
- the first version of an ML system may be easy; subsequent improvements become difficult
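A toy illustration of CACE on synthetic data (the data and model are assumptions): rescaling a single input signal shifts the learned weights of the other features as well.
```python
# Toy CACE demo: an upstream change to one signal changes *all* learned weights.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
X[:, 1] = 0.7 * X[:, 0] + 0.3 * X[:, 1]   # feature 1 is correlated with feature 0
y = (X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

w_before = LogisticRegression().fit(X, y).coef_[0]

X_changed = X.copy()
X_changed[:, 0] *= 3.0                    # an upstream "improvement" to signal 0 only
w_after = LogisticRegression().fit(X_changed, y).coef_[0]

print("weights before:", np.round(w_before, 2))
print("weights after: ", np.round(w_after, 2))  # weights of untouched features shift too
```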
CACE - mitigation strategies
- isolate models and serve ensembles
- develop methods that give deep insight into model prediction behavior
- use more sophisticated regularization methods so that changes in prediction performance carry a cost in the training objective
CACE - mitigation strategies pros/cons
strategy 1.
- this approach may not scale to all situations
- works when the maintenance cost is outweighed by the modularity benefits
strategy 2.
- use visualization to see effects across different dimensions
- use metrics on a slice-by-slice basis (see the sketch after this list)
strategy 3.
- may add more debt by increasing system complexity
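A small sketch of strategy 2's slice-by-slice metrics; the DataFrame, column names and slices are illustrative assumptions:
```python
# Sketch: evaluate a model per slice instead of only globally.
import pandas as pd
from sklearn.metrics import accuracy_score

eval_df = pd.DataFrame({
    "country": ["US", "US", "DE", "DE", "FR", "FR"],
    "label":   [1, 0, 1, 1, 0, 0],
    "pred":    [1, 0, 0, 1, 0, 1],
})

# a change that looks neutral globally may hurt one slice badly
print("global accuracy:", accuracy_score(eval_df["label"], eval_df["pred"]))
for country, g in eval_df.groupby("country"):
    print(country, accuracy_score(g["label"], g["pred"]))
```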
Hidden feedback loops
- real-world feedback (e.g. user clicks) flows back into the training data over a period longer than the rate of occurrence of the events (e.g. clicks aggregated weekly)
- the system may change subtly over a longer period of time (e.g. more than a week), so the change is not visible in quick experiments
- remove such loops whenever feasible
Undeclared consumers
- aka visibility debt
- output of an ML system (e.g. logs, predictions) can be consumed by other, undeclared systems, creating unintended dependencies that future changes of the ML system can break
- signals get grabbed when they become available, often under deadline pressure
- difficult to detect
- design the system to effectively guard against it
Data dependencies
- contribute to code complexity and technical debt
- large data-dependency chains are difficult to untangle
Data dependency problems
- unstable data dependencies
- underutilized data dependencies
- static analysis of data dependencies
- correction cascades
Unstable data dependencies
- input signals that qualitatively change over time
- e.g. produced by another model that updates over time
- e.g. coming from a data-dependent lookup table (such as a TF-IDF table)
- e.g. when engineering ownership of the input signal differs from engineering ownership of the ML model consuming it
- can be mitigated with ‘data/signal versioning’ (see the sketch below)
- versioning itself can also increase technical debt (e.g. staleness, maintaining multiple versions)
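A minimal sketch of data/signal versioning; the signals/ directory layout, parquet files and the load_signal helper are hypothetical, not an existing API:
```python
# Sketch: pin the model to an explicit signal version instead of consuming "latest".
import pandas as pd

SIGNAL_VERSION = "tfidf_v3"  # bumped deliberately, only after validating the new signal

def load_signal(name: str, version: str) -> pd.DataFrame:
    """Load a frozen, versioned copy of an input signal (e.g. a TF-IDF table)."""
    return pd.read_parquet(f"signals/{name}/{version}.parquet")

features = load_signal("tfidf", SIGNAL_VERSION)
# training and serving both read the pinned version, so an upstream recomputation
# of the signal cannot silently change this model's behaviour
```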
Underutilized data dependencies
- similar to YAGNI code
- legacy features (become redundant over time but are never removed)
- bundled features (feature bundles may include features with little or no value)
- ε-features (epsilon features: their small performance gain is outweighed by the added complexity)
- can be mitigated by regularly evaluating and removing such features (leave-one-feature-out sketch below)
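A hedged sketch of flagging underutilized features via leave-one-feature-out cross-validation; the dataset, model and 0.001 threshold are illustrative assumptions:
```python
# Sketch: flag features whose removal barely changes performance.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
baseline = cross_val_score(model, X, y, cv=5).mean()

for i in range(X.shape[1]):
    score = cross_val_score(model, np.delete(X, i, axis=1), y, cv=5).mean()
    if baseline - score < 0.001:
        # dropping this feature barely changes performance: removal candidate
        print(f"feature {i}: removal candidate (delta = {baseline - score:+.4f})")
```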
Static analysis of data dependencies
- no real equivalent exists of the static-analysis tools available for code
- in large systems not everyone may know all features or where they are used
- can be mitigated by annotating data sources and code, ideally automatically (see the sketch below)
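A lightweight sketch of annotating data dependencies in code so they can be queried; the uses_data decorator, registry and signal names are hypothetical, not an existing tool:
```python
# Sketch: record which data sources each feature-building function reads.
from collections import defaultdict

DATA_DEPS: dict[str, set[str]] = defaultdict(set)

def uses_data(*sources: str):
    """Annotate a function with the data sources it depends on."""
    def decorator(fn):
        DATA_DEPS[fn.__name__].update(sources)
        return fn
    return decorator

@uses_data("clickstream.weekly", "catalog.products")
def build_ctr_features():
    ...

@uses_data("catalog.products")
def build_price_features():
    ...

# who depends on the clickstream signal? answerable without grepping the codebase
consumers = [name for name, deps in DATA_DEPS.items() if "clickstream.weekly" in deps]
print(consumers)  # ['build_ctr_features']
```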
Correction cascades
- similar to ‘boosting’ done for expedience: a new model is trained to correct the errors of an existing one that was built for a slightly different problem and is applied to a slightly different test distribution
- makes improvements over time difficult and may end up in a local optimum
- can be mitigated by augmenting the original model with new features that help it distinguish between the use-cases
System-level spaghetti code
system-design anti-patterns:
- glue code needed to use general-purpose, self-contained packages
- pipeline jungles: data preparation becomes a tangle of scrapes, joins and sampling steps with intermediate files, difficult to maintain, test and recover from failure
- dead experimental code paths, left over from running alternative experiments as conditional branches in production code; hard to keep backward compatible, can interact in unpredictable ways, increase system complexity
- configuration debt: large systems have lots of configuration options (features used, how data is selected, algorithm settings, pre- & post-processing, etc.)
Spaghetti code - solutions
glue code:
- reduce it by re-implementing within the system's problem space, tweaked with problem-specific knowledge
- hybrid teams: research and engineering
pipeline jungle:
- do a ‘holistic’ design for data collection and feature extraction from the very beginning (may require a re-start)
- hybrid teams: research and engineering
dead experimental code paths:
- evaluate & remove dead code paths
- isolate experimental code & tighten code APIs
configuration debt:
- visual side-by-side diffs of configs (usually copy&paste files w/ small modifications)
- assertions about config invariants (must be carefully thought out; see the sketch after this list)
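A minimal sketch of asserting configuration invariants at load time; the config keys and rules are illustrative assumptions:
```python
# Sketch: fail fast on invalid configs instead of discovering them in production.
def validate_config(cfg: dict) -> None:
    assert cfg["train_window_days"] > 0, "training window must be positive"
    assert cfg["features"], "at least one feature must be enabled"
    assert not (cfg.get("use_feature_x") and cfg.get("use_feature_x_v2")), \
        "feature_x and feature_x_v2 are mutually exclusive"
    assert 0.0 < cfg["learning_rate"] < 1.0, "learning rate outside sane range"

validate_config({
    "train_window_days": 28,
    "features": ["ctr", "price"],
    "use_feature_x": True,
    "learning_rate": 0.05,
})
```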
Changes in the external world
- fixed thresholds (manually set) in dynamic systems: re-evaluate them with held-out validation data, or estimate them with numerical optimization instead of setting them by hand
- when correlations no longer correlate: use ML strategies that can tell the correlated effects apart
monitoring and testing: don’t rely only on unit and integration testing, but also on live monitoring
- start the ‘what to monitor?’ analysis from:
- prediction bias: the distribution of predicted labels matching that of observed labels (with caveats) is a useful diagnostic to detect sudden changes in the real world (sketch below)
- action limits for systems that take action in the real world: when a limit is hit, trigger a notification (avoiding spurious ones) that should be looked into
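A hedged sketch of monitoring prediction bias by comparing the predicted-positive rate to the observed-positive rate over a window; the tolerance and the example arrays are illustrative assumptions:
```python
# Sketch: alert when the predicted label distribution drifts from the observed one.
import numpy as np

def prediction_bias_alert(predicted: np.ndarray, observed: np.ndarray,
                          tolerance: float = 0.05) -> bool:
    """Return True if the predicted-positive rate drifts from the observed rate."""
    bias = predicted.mean() - observed.mean()
    if abs(bias) > tolerance:
        print(f"ALERT: prediction bias {bias:+.3f} exceeds tolerance {tolerance}")
        return True
    return False

# e.g. last week's binary predictions vs. the labels observed since then
prediction_bias_alert(np.array([1, 0, 1, 1, 0, 1]), np.array([0, 0, 1, 0, 0, 1]))
```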