! S 11: Distributed Deep Learning Flashcards
1
Q
Deep learning frameworks
A
- software libraries that provide tools for development, training, and deployment of deep learning models
- e.g. TensorFlow, PyTorch
- focus on neural networks (basis for deep learning)
2
Q
Deep Learning Frameworks - Steps
A
1. Build Computational Graph from network definition
2. Input training data & compute loss function
3. Update parameters
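The three steps above can be sketched in plain Python (no framework; the "computational graph" is just a forward function, and the model is a single weight fit by gradient descent):

```python
# Minimal illustration of the three steps, without any framework.

def forward(w, x):          # 1. network definition / "graph"
    return w * x

def loss(w, x, y):          # 2. compute loss on training data
    return (forward(w, x) - y) ** 2

def grad(w, x, y):          # analytic gradient of the squared loss
    return 2 * (forward(w, x) - y) * x

w = 0.0                     # parameter to learn (true slope is 3)
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
lr = 0.02

for epoch in range(200):    # 3. update parameters
    for x, y in data:
        w -= lr * grad(w, x, y)

print(round(w, 3))          # converges close to 3.0
```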
3
Q
Deep Learning Frameworks - Model Training
A
- Define-and-run = complete 1. (build graph) before the other steps, e.g. TensorFlow, Caffe
- Define-by-run = combine 1. & 2. (input training data & compute loss function), e.g. PyTorch
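The two styles can be contrasted in plain Python (a toy sketch, not real framework code): define-and-run builds a graph object first and feeds data later, while define-by-run is ordinary code that executes immediately, as in PyTorch's eager mode.

```python
# Define-and-run: build a graph first, execute it later with data.
class Node:
    def __init__(self, op, *inputs):
        self.op, self.inputs = op, inputs

    def run(self, feed):
        if self.op == "input":
            return feed[self.inputs[0]]
        vals = [n.run(feed) for n in self.inputs]
        return vals[0] + vals[1] if self.op == "add" else vals[0] * vals[1]

x = Node("input", "x")
y = Node("input", "y")
graph = Node("mul", Node("add", x, y), y)   # (x + y) * y -- no data yet

print(graph.run({"x": 2, "y": 3}))          # executed later -> 15

# Define-by-run: the computation is ordinary, immediately executed code;
# the graph is implicit in the call order.
def define_by_run(x, y):
    return (x + y) * y                      # runs as soon as it is called

print(define_by_run(2, 3))                  # -> 15
```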
4
Q
ONNX
A
- Open Neural Network eXchange
- Open-source shared model representation for framework interoperability & shared optimization
5
Q
Distributed ML
A
- using multiple compute resources that are distributed across a network / machines
- leverages the power of parallel computing
- enables faster training, larger data sets & more complex deep learning tasks
6
Q
Ways of distributed ML
A
- Data Parallelization
- Model Parallelization
7
Q
Data Parallelization
A
- each machine receives a copy of the model & operates on a batch of data
- Each worker: training on a non-overlapping batch -> update of parameters
- Requirement: parameter synchronization
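A toy sketch of synchronous data parallelism in plain Python (workers are simulated, not real processes): each worker holds a copy of the single parameter and its own non-overlapping data shard, and parameter synchronization is done by averaging the gradients (an all-reduce) before every update.

```python
# Each "worker" computes a gradient on its own shard of the data;
# the averaged gradient is applied identically everywhere.

def grad(w, shard):
    # gradient of mean squared error for the model y = w * x on one shard
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

shards = [                      # non-overlapping batches, one per worker
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0)],
]
w, lr = 0.0, 0.01

for step in range(300):
    local_grads = [grad(w, s) for s in shards]    # runs in parallel in reality
    avg = sum(local_grads) / len(local_grads)     # parameter sync (all-reduce)
    w -= lr * avg                                 # identical update on every copy

print(round(w, 3))   # approaches 2.0 (the true slope)
```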
8
Q
Model Parallelization
A
- each machine processes a different portion of the model
- Forward pass: layer computes output signal -> sent to workers that hold the next layer
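A toy sketch of model parallelism in plain Python (workers simulated as functions): each worker owns one layer, and the forward pass passes each layer's output signal on to the worker that holds the next layer.

```python
# One layer per worker; the hidden signal hops from worker to worker.

def layer1(x):                  # held by worker 0
    return 2 * x + 1

def layer2(h):                  # held by worker 1
    return max(0.0, h)          # ReLU

def layer3(h):                  # held by worker 2
    return h * 0.5

workers = [layer1, layer2, layer3]

def forward(x):
    signal = x
    for layer in workers:       # each step = sending the activation onward
        signal = layer(signal)
    return signal

print(forward(3.0))   # layer1 -> 7.0, layer2 -> 7.0, layer3 -> 3.5
```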
9
Q
Tensorflow
A
- open-source platform for ML created by Google; offers tools & libraries
- tensor = multidimensional array (like numpy with GPU support)
- Main feature: Express numeric computation as graph
- graph can be exported to train & run in different places
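The "exportable graph" idea can be sketched in plain Python (illustrative only; TensorFlow uses its own serialized graph formats, not JSON): because the computation is described as data rather than code, it can be serialized, shipped elsewhere, and evaluated there.

```python
import json

# Graph as data: nodes reference their inputs by name.
graph = {
    "x":   {"op": "input"},
    "y":   {"op": "input"},
    "p":   {"op": "mul", "in": ["x", "y"]},
    "out": {"op": "add", "in": ["p", "x"]},    # (x * y) + x
}

exported = json.dumps(graph)            # "export": the graph is just data

def evaluate(serialized, feed, node="out"):
    g = json.loads(serialized)          # another process/machine can load it
    def val(name):
        n = g[name]
        if n["op"] == "input":
            return feed[name]
        a, b = (val(i) for i in n["in"])
        return a + b if n["op"] == "add" else a * b
    return val(node)

print(evaluate(exported, {"x": 2, "y": 3}))   # (2 * 3) + 2 = 8
```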