Hessian Approximation Flashcards

1
Q

If a large language model has 1 trillion parameters, how many entries does its Hessian have?

A

The Hessian is n by n, so 1 trillion by 1 trillion = 10^12 x 10^12 = 10^24 entries = one septillion
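The count can be checked with a minimal Python sketch. The function name `hessian_entries` is hypothetical, chosen only for illustration, and the sketch assumes a dense Hessian with one entry per ordered pair of parameters.

```python
def hessian_entries(num_params: int) -> int:
    """Number of entries in a dense n x n Hessian (hypothetical helper)."""
    return num_params * num_params

n = 10**12  # 1 trillion parameters
entries = hessian_entries(n)
print(entries == 10**24)  # True: one septillion entries
```

Python integers are arbitrary precision, so the product is exact rather than a floating-point approximation.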

2
Q

How much memory would you need to store 1 septillion (10^24) Hessian entries at 2 bytes each?

A

2 yottabytes (10^24 entries x 2 bytes = 2 x 10^24 bytes; 1 yottabyte = 10^24 bytes)
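As a rough check of the memory figure, here is a short Python sketch. The name `hessian_bytes` and the constant `YOTTABYTE` are hypothetical, introduced for illustration; the sketch assumes a dense Hessian, 2 bytes per entry (for example a 16-bit float), and the SI decimal definition of a yottabyte.

```python
YOTTABYTE = 10**24  # bytes, SI decimal definition (assumption of this sketch)

def hessian_bytes(num_params: int, bytes_per_entry: int = 2) -> int:
    """Bytes needed to store a dense n x n Hessian (hypothetical helper)."""
    return num_params * num_params * bytes_per_entry

total = hessian_bytes(10**12)  # 1 trillion parameters, 2-byte entries
print(total // YOTTABYTE)  # 2
```

Changing `bytes_per_entry` to 4 (32-bit floats) doubles the requirement under the same assumptions.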
