Hugging Face ecosystem | HF NLP course | 2. Using Hugging Face Transformers | Other Flashcards

1
Q

[page] Models: [page section] Creating a Transformer: [q] To initialize a BERT model from scratch, you first need to initialize a ? and a ?

A
from transformers import BertConfig, BertModel

# Building the config
config = BertConfig()

# Building the model from the config
model = BertModel(config)
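A model built this way is randomly initialized, so it needs training before it is useful. As a quick sanity check, the config object exposes the architecture's hyperparameters; a minimal sketch (the printed values are BertConfig's defaults):

# Inspecting the architecture hyperparameters
print(config.hidden_size)        # 768
print(config.num_hidden_layers)  # 12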
2
Q

[page] Models: [page section] Creating a Transformer: [section] Different loading methods: [q] Code to load a Transformer model with a specific architecture that is already trained.

A
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-cased")
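Equivalently, the checkpoint-agnostic AutoModel class can be used in place of BertModel; a minimal sketch:

from transformers import AutoModel

# AutoModel picks the right architecture from the checkpoint's config
model = AutoModel.from_pretrained("bert-base-cased")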
3
Q

[page] Models: [page section] Creating a Transformer: [section] Different loading methods: [q] How can you customize your cache folder?

A

Set the HF_HOME environment variable.
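A minimal sketch of doing this from Python, assuming a hypothetical cache path; the variable must be set before transformers is imported:

import os
os.environ["HF_HOME"] = "/path/to/my/cache"  # hypothetical path

from transformers import BertModel  # downloads now cache under the custom folder
model = BertModel.from_pretrained("bert-base-cased")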

4
Q

[page] Models: [page section] Creating a Transformer: [section] Saving methods: [q] How do you save a model?

A

model.save_pretrained("directory_on_my_computer")
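This writes the model's configuration file and its weights into the given directory, and the model can be reloaded from that path. A minimal round-trip sketch, reusing the directory name above:

from transformers import BertModel

model = BertModel.from_pretrained("bert-base-cased")
model.save_pretrained("directory_on_my_computer")  # writes config + weight files

# Reload from the local directory instead of the Hub
model = BertModel.from_pretrained("directory_on_my_computer")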

5
Q

[page] Models: [page section] Using a Transformer model for inference: [q] What does the model accept as inputs? What creates these inputs?

A

Tensors, which must be rectangular (all sequences in a batch have the same length). Tokenizers create these inputs.
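A minimal sketch of a tokenizer turning text into the tensors a model accepts (PyTorch here; bert-base-cased chosen to match the other cards):

from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModel.from_pretrained("bert-base-cased")

# The tokenizer outputs a rectangular tensor of input ids
inputs = tokenizer("Hello!", return_tensors="pt")
outputs = model(**inputs)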

6
Q

[page] Models: [video] Instantiate a transformers model: [q] Code to load a model configuration from a checkpoint.

A
from transformers import AutoConfig

bert_config = AutoConfig.from_pretrained("bert-base-cased")
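The loaded config can be tweaked before building a freshly initialized model from it; a sketch, with an illustrative layer count:

from transformers import BertModel

bert_config.num_hidden_layers = 10  # illustrative change
model = BertModel(bert_config)      # randomly initialized with the modified architecture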
7
Q

[page] Tokenizers: [page section] Loading and saving: [q] Code to load the BERT tokenizer trained with the same checkpoint as BERT.

A
from transformers import BertTokenizer
tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
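A quick usage sketch: calling the loaded tokenizer on a sentence returns the inputs the model expects:

encoded = tokenizer("Using a Transformer network is simple!")
print(encoded["input_ids"])       # token ids, including the special [CLS]/[SEP] tokens
print(encoded["attention_mask"])  # 1 for every real token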
8
Q

[page] Handling multiple sequences: [page section] Models expect a batch of inputs: [q] For a single sequence, initialize a model for classification and show the intermediate methods of the tokenizer to generate ids for the example (3 steps).

A
"import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
					
checkpoint = ""distilbert-base-uncased-finetuned-sst-2-english""
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
					
sequence = ""I've been waiting for a HuggingFace course my whole life.""
					
tokens = tokenizer.tokenize(sequence)
ids = tokenizer.convert_tokens_to_ids(tokens)
					
input_ids = torch.tensor([ids])
print(""Input IDs:"", input_ids)
					
output = model(input_ids)
print(""Logits:"", output.logits)"