Floating-point arithmetic Flashcards

1
Q

What numbers can we store exactly on a computer?

A

Integers up to some maximum size

2
Q

What is the largest integer that can be stored using 64 bits?

A

Assuming one bit is used to store the sign ±, the largest possible number is 2^63 - 1
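
As a quick check of the 64-bit limit, here is a minimal Python sketch. The name `INT64_MAX` is ours, and `struct` is used only because Python's own integers are not limited to 64 bits.

```python
import struct

# One sign bit leaves 63 bits for the magnitude.
INT64_MAX = 2**63 - 1
print(INT64_MAX)  # 9223372036854775807

# The value fits into exactly 8 bytes (64 bits) as a signed integer ...
packed = struct.pack("<q", INT64_MAX)

# ... but the next integer does not.
try:
    struct.pack("<q", INT64_MAX + 1)
    fits = True
except struct.error:
    fits = False
print(fits)  # False
```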

3
Q

What is fixed point representation?

A
4
Q

What is (10.1)_2?

A

1 × 2^1 + 0 × 2^0 + 1 × 2^-1 = 2.5
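
The same expansion can be sketched in Python. The helper `positional_value` is a hypothetical name introduced for illustration; it simply sums digit × base^power on each side of the point.

```python
def positional_value(digits: str, base: int = 2) -> float:
    """Evaluate a digit string such as '10.1' in the given base."""
    whole, _, frac = digits.partition(".")
    value = 0.0
    # Digits left of the point carry powers base^0, base^1, ... (right to left).
    for power, digit in enumerate(reversed(whole)):
        value += int(digit, base) * base**power
    # Digits right of the point carry powers base^-1, base^-2, ...
    for power, digit in enumerate(frac, start=1):
        value += int(digit, base) * base**-power
    return value

print(positional_value("10.1"))  # 2.5
```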

5
Q

With fixed-point numbers, do two different representations ever give the same number?

A

No - every number has a unique representation

6
Q

What is a problem with fixed-point representation?

A

Easy to “escape”

7
Q

What is meant by fixed-point representation being easy to escape?

A

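The result of an operation can fall outside the representable set: for example (0.01)_10 × (0.10)_10 = (0.001)_10 can’t be represented with the same number of fractional digits.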
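
A small sketch of this escape, using Python's `decimal` module. The two-fractional-digit format (`TWO_PLACES`) is an assumption chosen to match the example; the card does not state the format.

```python
from decimal import Decimal, ROUND_HALF_EVEN

# Assumed fixed-point format: exactly two digits after the decimal point.
TWO_PLACES = Decimal("0.01")

x = Decimal("0.01")
y = Decimal("0.10")

exact = x * y  # needs three fractional digits
stored = exact.quantize(TWO_PLACES, rounding=ROUND_HALF_EVEN)

print(exact)   # 0.0010
print(stored)  # 0.00
```

Both operands are representable, but their product is forced to zero once it is squeezed back into the format.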

8
Q

What is floating-point representation?

A
9
Q

What is the (0.d_1 d_2 … d_m)_β in the following called?

A
  • Fraction
  • Significand
  • Mantissa
10
Q

What are β and e in the following called?

A
  • Base
  • Exponent
11
Q

What is one advantage and one disadvantage of using floating-point numbers over fixed-point numbers?

A
  • You can represent a much larger range of numbers in a floating-point representation
  • However the numbers in floating-point representation are not equally spaced
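
The unequal spacing can be seen directly with `math.ulp`, which returns the gap between a double and the next larger one. This sketch assumes Python 3.9 or later (where `math.ulp` was added); the sample values are arbitrary.

```python
import math

# The gap to the next representable double grows with the magnitude of x.
for x in (1.0, 1024.0, 2.0**52, 1e300):
    print(x, math.ulp(x))

gap_at_1 = math.ulp(1.0)         # 2^-52
gap_at_2_52 = math.ulp(2.0**52)  # 1.0: above 2^52 only integers are representable
```
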
12
Q

If d_1 ≠ 0, each number in F has a unique representation. What is such a number called?

A

Normalised

13
Q

What is the IEEE standard?

A

IEEE 754 is a standard for floating-point arithmetic; its double-precision format uses 64 bits

14
Q

What are the 64 bits used in the IEEE standard?

A
  • 52 bits for the fraction
  • 11 for the exponent
  • 1 for the sign
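
To see the three fields, the bits of a Python float (a 64-bit double on common platforms) can be unpacked with `struct`. The helper `double_fields` is a hypothetical name. Note the exponent field printed here is the raw stored integer, before any bias is subtracted.

```python
import struct

def double_fields(x: float) -> tuple:
    """Return (sign, exponent field, fraction field) of a 64-bit double."""
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                   # 1 bit
    exponent = (bits >> 52) & 0x7FF     # 11 bits
    fraction = bits & ((1 << 52) - 1)   # 52 bits
    return sign, exponent, fraction

print(double_fields(1.0))   # (0, 1023, 0)
print(double_fields(-2.5))  # (1, 1024, 1125899906842624)
```
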
15
Q

What is the IEEE representation?

A
16
Q

What does exponent bias mean in the IEEE standard?

A

The 11-bit exponent field stores a non-negative integer, which is shifted by a fixed bias; the actual exponents are in the range -1022 to 1025
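
A sketch of the bias under the (0.d_1 d_2 …)_2 × 2^e convention these cards use. The value `BIAS = 1022` is an assumption that follows from that convention; the IEEE 754 standard itself is usually described with a (1.f)_2 significand and a bias of 1023, which shifts every exponent by one. `math.frexp` happens to use the same 0.1… normalisation as the cards, so it serves as an independent check.

```python
import math
import struct

BIAS = 1022  # assumed: bias for the 0.d1d2... convention, not the 1.f convention

def exponent_field(x: float) -> int:
    """Raw 11-bit exponent field of a 64-bit double."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return (bits >> 52) & 0x7FF

for x in (1.0, 0.15625, 6.0e23):
    _, e = math.frexp(x)  # x = m * 2**e with 0.5 <= |m| < 1
    print(x, exponent_field(x), exponent_field(x) - BIAS, e)
```

For each normalised value, the stored field minus the bias agrees with the exponent reported by `math.frexp`.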

17
Q

What are the exponents -1022 and 1025 used to store in the IEEE standard?

A

±0 and ±∞ respectively
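
The two reserved exponents correspond to the all-zeros and all-ones exponent fields. The sketch below reads those fields; `BIAS = 1022` is again an assumption tied to the cards' 0.d_1 d_2 … convention. In IEEE 754 the same two fields also hold subnormal numbers and NaN, which these cards do not cover.

```python
import math
import struct

BIAS = 1022  # assumed bias for the 0.d1d2... convention used on these cards

def exponent_field(x: float) -> int:
    """Raw 11-bit exponent field of a 64-bit double."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return (bits >> 52) & 0x7FF

zero_exp = exponent_field(0.0) - BIAS       # all-zeros field
inf_exp = exponent_field(math.inf) - BIAS   # all-ones field
print(zero_exp, inf_exp)  # -1022 1025
```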

18
Q

When β = 2, what does the first digit being normalised mean?

A

The first digit is normalised to 1, so doesn’t need to be stored in memory
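
Python's `math.frexp` splits a float into a significand in [0.5, 1) and a power of two, which is the normalised form (0.1…)_2 × 2^e. A minimal sketch, with 2.5 as an arbitrary sample value:

```python
import math

m, e = math.frexp(2.5)   # 2.5 = m * 2**e with 0.5 <= m < 1
print(m, e)              # 0.625 2

# 0.625 = (0.101)_2, so the first binary digit after the point is 1.
first_digit = int(m * 2)
print(first_digit)       # 1
```

Because that leading digit is always 1 for a normalised binary number, storing it would waste a bit.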

19
Q

Define underflow.

A

If a calculation falls below the lower non-zero limit (in absolute value), it is called underflow.
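
A minimal illustration with Python floats, assuming IEEE 754 doubles. The operands are arbitrary; their exact product, 1e-400, lies far below the smallest positive double.

```python
import sys

tiny = sys.float_info.min     # smallest positive normalised double
product = 1e-200 * 1e-200     # exact value 1e-400 is not representable

print(tiny)     # 2.2250738585072014e-308
print(product)  # 0.0
```

Here the underflow is silent: the result is replaced by zero without an error being raised.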

20
Q

Define overflow

A

If a calculation falls above the upper limit (in absolute value), it is called overflow, and it usually results in a floating-point exception
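
How overflow surfaces depends on the environment. As a sketch in Python (assuming IEEE 754 doubles): plain float multiplication returns infinity, while `math.exp` raises `OverflowError`.

```python
import math
import sys

big = sys.float_info.max      # largest finite double
overflowed = big * 10.0       # arithmetic overflow yields infinity
print(overflowed)             # inf

try:
    math.exp(1000.0)          # math functions raise instead
    raised = False
except OverflowError:
    raised = True
print(raised)                 # True
```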

21
Q

Define rounding.

A

The mapping from ℝ to F is called rounding.

22
Q

What is used to denote rounding?

A

fl(x)

23
Q

How do you round a number?

A

Round x to the nearest number in F. If x lies exactly midway between two numbers in F, a method of breaking ties is required; the usual choice is to round to the neighbour whose last digit is even.
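
The tie-breaking rule can be observed near 1.0, where neighbouring doubles are 2^-52 apart. This sketch assumes IEEE 754 doubles with the default rounding mode; `half_gap` is our name for half that spacing.

```python
half_gap = 2.0**-53           # half the spacing between 1.0 and the next double

# Exactly midway between 1 and 1 + 2^-52: the tie goes to 1 (even last bit).
down = 1.0 + half_gap
# Exactly midway between 1 + 2^-52 and 1 + 2^-51: the tie goes up to 1 + 2^-51.
up = 1.0 + 3 * half_gap

print(down == 1.0)            # True
print(up == 1.0 + 2.0**-51)   # True
```

One tie rounds down and the other rounds up, so ties are not systematically biased in one direction.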

24
Q

How do we count significant figures?

A

Start with the first non-zero digit from the left, and count all digits thereafter, including final zeros if they are after the decimal point.
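
The counting rule can be sketched for numbers written as plain decimal strings. The function name is hypothetical, and treating final zeros of a number without a decimal point as not significant is our reading of the rule.

```python
def significant_figures(number: str) -> int:
    """Count significant figures in a plain decimal string such as '0.00450'."""
    digits = number.lstrip("+-")
    if "." in digits:
        # Leading zeros never count; final zeros after the point do.
        return len(digits.replace(".", "").lstrip("0"))
    # No decimal point: final zeros are not counted (assumed reading of the rule).
    return len(digits.strip("0"))

print(significant_figures("0.00450"))  # 3
print(significant_figures("120.0"))    # 4
print(significant_figures("1200"))     # 2
```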

25
Q

What is the equation for the fl(x)?

A

fl(x) = x(1 + δ)
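
For a concrete δ, take x = 0.1, which has no finite binary expansion. The sketch below uses `fractions.Fraction` so that both x and fl(x) are held exactly. The bound 2^-53 used in the check is the value usually quoted for double precision, stated here as an assumption.

```python
from fractions import Fraction

x = Fraction(1, 10)      # the real number 0.1, held exactly
fl_x = Fraction(0.1)     # the double nearest to 0.1, also held exactly

delta = (fl_x - x) / x   # so that fl(x) = x * (1 + delta)
print(float(delta))

within_bound = abs(delta) <= Fraction(1, 2**53)
print(within_bound)      # True
```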

26
Q

What is the equation for the relative error incurred by rounding?

A

|fl(x) - x| / |x| = |δ|

27
Q

What does δ stand for?

A

The relative rounding error

28
Q

How do we find an upper bound of |δ|?

A
29
Q

What is the upper bound of |δ|?

A

|δ| ≤ ε_M

30
Q

What does ε_M stand for?

A

Machine epsilon (or unit roundoff)

31
Q

Why is the machine epsilon also called the unit roundoff?

A

It is the distance between 1 and the smallest number greater than 1 that is not rounded to 1
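
This can be probed numerically. Python exposes `sys.float_info.epsilon`, the gap between 1.0 and the next double (2^-52). Whether "machine epsilon" means that gap or half of it (2^-53) varies between texts, so the sketch only shows which perturbations of 1.0 survive rounding.

```python
import sys

eps = sys.float_info.epsilon    # gap between 1.0 and the next double

print(eps == 2.0**-52)          # True
print(1.0 + eps > 1.0)          # True: lands on the next double
print(1.0 + eps / 2 == 1.0)     # True: an exact tie, rounded back to 1.0
print(1.0 + 0.75 * eps > 1.0)   # True: past the midpoint, rounded up
```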

32
Q

What does ε_M equal?

A
33
Q

What is the fundamental axiom of floating-point arithmetic?

A

fl(x ∘ y) = (x ∘ y)(1 + δ) with |δ| ≤ ε_M, for each operation ∘ in {+, -, ×, ÷}

34
Q

What is the error when we are adding the following two numbers?

A
35
Q

What is a major cause of error in floating-point calculations?

A

Loss of significance

36
Q

What is loss of significance?

A

If x ± y is very close to zero (the two terms nearly cancel), then the result can have an arbitrarily large relative error compared to the errors in the initial values of x and y.
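
A minimal sketch of cancellation, assuming IEEE 754 doubles. The value `small = 1e-15` is an arbitrary choice: adding it to 1.0 incurs only a tiny relative rounding error, but subtracting 1.0 again exposes that error at full size.

```python
small = 1e-15

a = 1.0 + small         # rounded: most digits of `small` are lost here
recovered = a - 1.0     # the subtraction itself is exact

rel_err = abs(recovered - small) / small
print(recovered)        # 1.1102230246251565e-15
print(rel_err)
```

The relative error of the recovered value is about 11%, compared with roughly 1e-16 for the sum that produced it.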

37
Q

Does (a + b) + c = a + (b + c) in floating point arithmetic?

A

Not always
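
A standard counterexample, sketched in Python with IEEE 754 doubles:

```python
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print(left)            # 0.6000000000000001
print(right)           # 0.6
print(left == right)   # False
```

Each addition is rounded separately, so the grouping decides which intermediate result gets rounded.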
