Chapter 3 - Machine-Level Representation of Programs Flashcards
What register is used to store where a computer is in its program sequence?
The program counter register, %rip.
Which register is associated with the program counter?
%rip.
How many locations are there in a typical modern integer register file?
16 registers.
How many bits of information can each register in the register file hold?
64 bits.
What program can be used on a Linux system to view the assembly code of a given object file?
objdump.
What term is given to the class of programs used to inspect the contents of machine-code files?
Disassemblers.
What command can be used to generate a file containing the assembly-level version of the source file prog.cc?
g++ -Og -S prog.cc
Using g++, a source file is translated into both assembly code (with the -S flag) and an executable (with the -o flag). What is likely to be the most noticeable difference after dumping the contents of the generated files? (Either using ‘objdump’ or ‘cat’.)
The offset in the address.
How does the Intel format differ from the ATT format?
- Intel code omits size designation suffixes (i.e. push not pushq).
- Intel code omits the ‘%’ character in front of register names.
- Intel code describes memory locations differently (e.g. QWORD PTR [rbx] rather than (%rbx)).
- Intel code lists operands in reverse order to ATT code.
State the number of bytes taken to represent the following data types: char, short, int, long, char*, float, double.
- char: 1 byte
- short: 2 bytes
- int: 4 bytes
- long: 8 bytes
- char*: 8 bytes
- float: 4 bytes
- double: 8 bytes
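These sizes can be checked with sizeof. The sketch below assumes an x86-64 Linux toolchain (the LP64 model), where long and pointers are 8 bytes; the struct and function names are hypothetical, and other platforms may report different values.

```c
#include <stddef.h>

/* Sizes, in bytes, of the C types listed in the card above.
 * Hypothetical helper type, for illustration only. */
struct type_sizes {
    size_t char_size, short_size, int_size, long_size;
    size_t pointer_size, float_size, double_size;
};

/* Ask the compiler for each size rather than hard-coding it. */
struct type_sizes measure_type_sizes(void) {
    struct type_sizes s;
    s.char_size    = sizeof(char);
    s.short_size   = sizeof(short);
    s.int_size     = sizeof(int);
    s.long_size    = sizeof(long);    /* 8 under the assumed LP64 model */
    s.pointer_size = sizeof(char *);  /* 8 on x86-64 */
    s.float_size   = sizeof(float);
    s.double_size  = sizeof(double);
    return s;
}
```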
How many registers did the original 8086 processor have and what size data could they contain?
There were eight 16-bit registers.
How many registers did the IA32 processor have and what size data could they contain?
There were eight 32-bit registers.
How many registers does the x86-64 processor have and what size data can it contain?
There are 16 64-bit registers.
What labels did the registers in the original 8086 processor have?
There were eight registers labelled %ax through to %sp.
What labels did the registers in the IA32 processor have?
There were eight registers labelled %eax through to %esp.
What labels do the registers of the x86-64 processor have?
There are 16 registers; the first eight registers are labelled %rax through to %rsp, and the other eight are labelled %r8 through %r15.
When instructions have registers for destinations, what happens to the remaining bytes when the instructions generate size 1, 2 or 4-byte values?
When 1 and 2-byte values are generated, the remaining bytes in the register are left untouched.
When 4-byte values are generated, the high-order 4 bytes are set to zero.
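The rule for partial register writes can be modelled in C. This is a minimal sketch, not how hardware is implemented; write_register is a hypothetical helper that takes the old 64-bit register contents, the generated value, and the operand size in bytes.

```c
#include <stdint.h>

/* Model of writing a 1-, 2-, 4-, or 8-byte result into a 64-bit register.
 * Hypothetical helper for illustration. */
uint64_t write_register(uint64_t old_value, uint64_t result, int size) {
    switch (size) {
    case 1:  /* only the low byte changes; the other 7 bytes are kept */
        return (old_value & ~0xFFULL) | (result & 0xFFULL);
    case 2:  /* only the low 2 bytes change; the other 6 bytes are kept */
        return (old_value & ~0xFFFFULL) | (result & 0xFFFFULL);
    case 4:  /* low 4 bytes written, high-order 4 bytes set to zero */
        return result & 0xFFFFFFFFULL;
    default: /* 8 bytes: the whole register is overwritten */
        return result;
    }
}
```

For example, writing the 4-byte value 0x12345678 into a register that held all ones leaves 0x0000000012345678, whereas writing the 1-byte value 0x12 leaves 0xFFFFFFFFFFFFFF12.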
Which register is associated with the stack pointer?
%rsp.
What suffixes are used in integer instructions to indicate the size of the operand?
1-byte = 'b', 2-byte = 'w', 4-byte = 'l', 8-byte = 'q'
What is the syntax for providing an immediate value as an operand to an assembly instruction?
The given value is preceded by the $ character.
What is the general syntax for providing a memory location as an operand to an assembly instruction?
The syntax has the form Imm(rb,ri,s).
An assembly instruction is given “Imm(rb,ri,s)” as an operand. Describe what this operand represents.
Imm(rb,ri,s) refers to the value stored at the memory location:
M[ Imm + R[rb] + R[ri] * s ]
The individual terms are: Imm = immediate offset; R[rb] = value of the base register; R[ri] = value of the index register; s = scale factor (1, 2, 4, or 8).
What are the three main types of operands supplied to assembly instructions?
Immediate, register and memory.
Assume the following values are stored at the indicated memory addresses and registers:
Address Value
0x100 0xFF
0x104 0xAB
0x108 0x13
0x10C 0x11
Register Value
%rax 0x100
%rcx 0x1
%rdx 0x3
What are the values of the following operands?
260(%rcx,%rdx)
0xFC(,%rcx,4)
(%rax,%rdx,4)
260(%rcx,%rdx) = M[0x108] = 0x13.
0xFC(,%rcx,4) = M[0x100] = 0xFF.
(%rax,%rdx,4) = M[0x10C] = 0x11.
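The address arithmetic for these operands can be checked with a small C model. This is a sketch under stated assumptions: effective_address and load are hypothetical helpers, an omitted base register is passed as 0, an omitted scale as 1, and the toy memory holds only the four values from the table above.

```c
#include <stdint.h>

/* Effective address for Imm(rb,ri,s): Imm + R[rb] + R[ri] * s. */
uint64_t effective_address(int64_t imm, uint64_t base, uint64_t index,
                           unsigned scale) {
    return (uint64_t)imm + base + index * scale;
}

/* Toy memory containing only the table entries; other addresses read as 0. */
uint64_t load(uint64_t address) {
    switch (address) {
    case 0x100: return 0xFF;
    case 0x104: return 0xAB;
    case 0x108: return 0x13;
    case 0x10C: return 0x11;
    default:    return 0;
    }
}
```

With %rax = 0x100, %rcx = 0x1 and %rdx = 0x3, load(effective_address(0xFC, 0, 0x1, 4)) reads address 0xFC + 4 = 0x100.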
Name the four instructions in the MOV class.
movb, movw, movl, and movq.
True or false? Both operands supplied to a move instruction can be memory locations.
False.
How many instructions does it take to copy data from one memory location to another?
Two; one to copy the data into a register and another to copy it to the destination.
What is the movabsq instruction used for and what operand can it have for the destination?
To move 64-bit immediate values, and the destination must be a register.
N.B. movq can only handle 32-bit immediate values which are sign-extended to 64 bits.
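The sign extension that movq applies to a 32-bit immediate can be sketched in C. movq_immediate is a hypothetical name; the casts mirror the widening from 32 to 64 bits.

```c
#include <stdint.h>

/* Value placed in a 64-bit destination by movq with a 32-bit immediate:
 * the immediate is sign-extended to 64 bits. Hypothetical helper. */
uint64_t movq_immediate(int32_t imm32) {
    return (uint64_t)(int64_t)imm32;
}
```

An immediate of -1 therefore becomes 0xFFFFFFFFFFFFFFFF, while a constant such as 0x00000000FFFFFFFF cannot be produced this way and needs movabsq.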
What two classes of move instructions can be used to copy a smaller source to a larger destination?
The movz and movs class.
Name the five different movz instructions.
movzbw, movzbl, movzwl, movzbq, and movzwq.
Why is there no movzlq instruction?
A 4-byte source is automatically zero-extended when it is copied to an 8-byte register destination, so the movl instruction already does what a movzlq instruction would do.
Name the six different movs instructions.
movsbw, movsbl, movswl, movsbq, movswq, and movslq.
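The difference between the movz and movs classes corresponds to unsigned versus signed widening in C. The sketch below uses hypothetical function names and assumes a two's-complement platform where converting an out-of-range value to int8_t wraps, as mainstream compilers do.

```c
#include <stdint.h>

/* movzbl-style widening: the upper 24 bits are filled with zeros. */
uint32_t zero_extend_byte(uint8_t b) {
    return (uint32_t)b;
}

/* movsbl-style widening: the upper 24 bits copy the byte's sign bit. */
uint32_t sign_extend_byte(uint8_t b) {
    return (uint32_t)(int32_t)(int8_t)b;
}
```

The byte 0x80 widens to 0x00000080 with zero extension and to 0xFFFFFF80 with sign extension.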
What instruction requires no operands and sign-extends a 4-byte source to an 8-byte destination? What register does it operate on?
The cltq instruction, which operates on the %rax register.
For each of the following lines of assembly language, determine the appropriate instruction suffix based on the operands.
mov___ %eax, (%rsp)
mov___ (%rax), %dx
mov___ $0xFF, %bl
movl %eax, (%rsp)
movw (%rax), %dx
movb $0xFF, %bl
For each of the following lines of assembly language, determine the appropriate instruction suffix based on the operands.
mov___ (%rsp,%rdx,4), %dl
mov___ (%rdx), %rax
mov___ %dx, (%rax)
movb (%rsp,%rdx,4), %dl
movq (%rdx), %rax
movw %dx, (%rax)
Each of the following lines of code generates an error message when we invoke the assembler. Explain what is wrong with each line.
movb $0xF, (%ebx)
movl %rax, (%rsp)
movw (%rax),4(%rsp)
movb $0xF, (%ebx)
// Memory references require 8-byte registers.
movl %rax, (%rsp)
// Mismatch between instruction suffix and register ID
movw (%rax),4(%rsp)
// Only one operand can be a memory location.
Each of the following lines of code generates an error message when we invoke the assembler. Explain what is wrong with each line.
movb %al,%sl
movq %rax,$0x123
movb %al,%sl
// There is no register named sl.
movq %rax,$0x123
// The destination cannot be an immediate value.
Each of the following lines of code generates an error message when we invoke the assembler. Explain what is wrong with each line.
movl %eax,%rdx
movb %si, 8(%rbp)
movl %eax,%rdx
// The destination of the movl instruction cannot be an 8-byte register.
movb %si, 8(%rbp)
// Mismatch between instruction suffix and register ID.
Which register is associated with an integer return value?
%rax.
Assume variables sp and dp are declared with types
src_t *sp;
dest_t *dp;
where src_t and dest_t are data types declared with typedef. We wish to use the appropriate pair of data movement instructions to implement the operation
*dp = (dest_t) *sp;
Assume that the values of sp and dp are stored in registers %rdi and %rsi. For each entry below, show the two instructions that implement the specified data movement.
S = long, D = long:
movq (%rdi), %rax
movq %rax, (%rsi)
S = char, D = int:
______________
______________
S = char, D = unsigned:
______________
______________
S = char, D = int:
movsbl (%rdi), %eax
movl %eax, (%rsi)
S = char, D = unsigned:
movsbl (%rdi), %eax
movl %eax, (%rsi)
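The char-to-unsigned case uses movsbl because C first widens the signed source and only then reinterprets it as unsigned. A minimal sketch, using signed char explicitly (plain char signedness is implementation-defined), a 32-bit unsigned, and hypothetical function names:

```c
/* *dp = (unsigned) *sp with a signed 1-byte source. */
void copy_char_to_unsigned(const signed char *sp, unsigned *dp) {
    *dp = (unsigned)*sp;  /* sign-extend first, then reinterpret as unsigned */
}

/* Convenience wrapper returning the converted value. */
unsigned convert_char(signed char c) {
    unsigned result;
    copy_char_to_unsigned(&c, &result);
    return result;
}
```

A source value of -1 (byte 0xFF) is stored as 0xFFFFFFFF, not 0x000000FF, which is what sign extension produces.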
Assume variables sp and dp are declared with types
src_t *sp;
dest_t *dp;
where src_t and dest_t are data types declared with typedef. We wish to use the appropriate pair of data movement instructions to implement the operation
*dp = (dest_t) *sp;
Assume that the values of sp and dp are stored in registers %rdi and %rsi. For each entry below, show the two instructions that implement the specified data movement.
S = unsigned char, D = long:
______________
______________
S = int, D = char:
______________
______________
S = unsigned char, D = long:
movzbl (%rdi), %eax
movq %rax, (%rsi)
// Note the special trick with the first step; this relies on the fact that the high-order 4 bytes of the register will be cleared with this instruction.
S = int, D = char:
movl (%rdi), %eax
movb %al, (%rsi)
Assume variables sp and dp are declared with types
src_t *sp;
dest_t *dp;
where src_t and dest_t are data types declared with typedef. We wish to use the appropriate pair of data movement instructions to implement the operation
*dp = (dest_t) *sp;
Assume that the values of sp and dp are stored in registers %rdi and %rsi. For each entry below, show the two instructions that implement the specified data movement.
S = unsigned, D = unsigned char:
______________
______________
S = char, D = short:
______________
______________
S = unsigned, D = unsigned char:
movl (%rdi), %eax
movb %al, (%rsi)
S = char, D = short:
movsbw (%rdi), %ax
movw %ax, (%rsi)
What discipline does a stack adhere to?
A “last-in, first-out” (LIFO) discipline.
What effect does the pushq instruction have?
The stack pointer is decremented by 8 (to allocate 8 bytes) and the quad word provided as an operand is written to the new top-of-stack address.
What effect does the popq instruction have?
The quad word is read from the top-of-stack location and stored at the operand destination, then the stack pointer is incremented by 8 (to deallocate 8 bytes).
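The push and pop behaviour can be modelled with a byte array and an index standing in for %rsp. This is a sketch only: struct machine, pushq, popq and stack_demo are hypothetical names, and the 64-byte memory is an arbitrary choice.

```c
#include <stdint.h>
#include <string.h>

/* Toy machine: a small memory and a stack pointer (an offset into it). */
struct machine {
    uint8_t  memory[64];
    uint64_t rsp;
};

/* pushq: decrement the stack pointer by 8, then store the quad word. */
void pushq(struct machine *m, uint64_t value) {
    m->rsp -= 8;
    memcpy(&m->memory[m->rsp], &value, 8);
}

/* popq: load the quad word, then increment the stack pointer by 8. */
uint64_t popq(struct machine *m) {
    uint64_t value;
    memcpy(&value, &m->memory[m->rsp], 8);
    m->rsp += 8;
    return value;
}

/* Push two values and pop them; returns 1 if the order is last-in,
 * first-out and the stack pointer ends where it started. */
int stack_demo(void) {
    struct machine m = { {0}, 64 };  /* stack starts at the highest address */
    pushq(&m, 0x11);
    pushq(&m, 0x22);
    uint64_t rsp_after_pushes = m.rsp;  /* 64 - 16 */
    uint64_t first = popq(&m);
    uint64_t second = popq(&m);
    return rsp_after_pushes == 48 && first == 0x22 && second == 0x11
           && m.rsp == 64;
}
```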
What has the smallest address, the top of the stack or the bottom?
The top.
What does the cltq instruction do? How many operands does it take?
It sign-extends a 4-byte source to an 8-byte destination and takes no operands; it sign-extends the value in %eax and stores the result in %rax.
Which assembly instruction is used to generate pointers?
The leaq instruction.
What instruction classes are used to compute x++, x--, -x, and ~x?
x++ = INC, x-- = DEC, -x = NEG, ~x = NOT
How many operands do instructions in the INC, DEC, NEG, and NOT classes take?
The instructions in these classes apply unary operations. They take a single operand which corresponds to both the source and the destination.
How many operands do instructions in the ADD, SUB, IMUL, XOR, OR and AND classes take?
They take two operands: a source, and a destination (in that order).
What are the classes of instructions used for left and right-shift operations? Are they all unique?
SAL, SAR, SHL, SHR. The SAL and SHL are not unique; they both have the same effect.
What is the result of the following assembly code?
long scale(long x, long y, long z)
x in %rdi, y in %rsi, z in %rdx
scale:
leaq (%rdi,%rsi,4), %rax
leaq (%rdx,%rdx,2), %rdx
leaq (%rax,%rdx,4), %rax
ret
long t = x + 4 * y + 12 * z;
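The three leaq steps can be traced one at a time in C. This is a sketch that mirrors each instruction with one statement; the comments give the corresponding assembly line.

```c
/* C equivalent of the scale routine, one statement per leaq. */
long scale(long x, long y, long z) {
    long t = x + 4 * y;  /* leaq (%rdi,%rsi,4), %rax : x + 4*y */
    z = z + 2 * z;       /* leaq (%rdx,%rdx,2), %rdx : 3*z */
    t = t + 4 * z;       /* leaq (%rax,%rdx,4), %rax : x + 4*y + 12*z */
    return t;
}
```

For example, scale(1, 2, 3) evaluates to 1 + 8 + 36 = 45.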
What is the result of the following assembly code?
short scale3(short x, short y, short z)
x in %rdi, y in %rsi, z in %rdx
scale3:
leaq (%rsi,%rsi,9), %rbx
leaq (%rbx,%rdx), %rbx
leaq (%rbx,%rdi,%rsi), %rbx
ret
long t = 10 * y + z + x * y;
Suppose register %rbx holds value p and %rdx holds value q. Fill in the table below with formulas indicating the value that will be stored in register %rax for each of the given assembly-code instructions:
leaq 9(%rdx), %rax ____________
leaq (%rdx,%rbx), %rax ____________
leaq (%rdx,%rbx,3), %rax ____________
leaq 9(%rdx), %rax = 9 + q
leaq (%rdx,%rbx), %rax = q + p
leaq (%rdx,%rbx,3), %rax = q + 3 * p
Suppose register %rbx holds value p and %rdx holds value q. Fill in the table below with formulas indicating the value that will be stored in register %rax for each of the given assembly-code instructions:
leaq 2(%rbx,%rbx,7), %rax ____________
leaq 0xE(,%rdx,3), %rax ____________
leaq 6(%rbx,%rdx,7), %rax ____________
leaq 2(%rbx,%rbx,7), %rax = 2 + 8 * p
leaq 0xE(,%rdx,3), %rax = 14 + 3 * q
leaq 6(%rbx,%rdx,7), %rax = 6 + p + 7 * q
What type of operands can be supplied to a bit-shift instruction?
An immediate value or the single-byte register %cl.
What is the shift amount of the following instruction?
shll $0xF3,%eax
A 32-bit value can be shifted by at most 31 positions (2^5 = 32 possible amounts), so only the low-order five bits of 0xF3 are used, i.e. [10011]. Thus, the shift amount is 19.
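The masking of the shift count can be written out in C. A minimal sketch with hypothetical helper names, covering only the 32-bit and 64-bit operand sizes (five and six low-order bits respectively):

```c
#include <stdint.h>

/* Shift amount actually used for a 32-bit operand: low-order 5 bits. */
unsigned shift_count32(uint8_t count) {
    return count & 0x1Fu;
}

/* Shift amount actually used for a 64-bit operand: low-order 6 bits. */
unsigned shift_count64(uint8_t count) {
    return count & 0x3Fu;
}

/* Model of shll: shift left by the masked count. */
uint32_t shll_model(uint32_t value, uint8_t count) {
    return value << shift_count32(count);
}
```

Here 0xF3 is 1111 0011 in binary, so shift_count32(0xF3) keeps 1 0011, which is 19.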
What does the cqto instruction do?
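It sign-extends the quad word in %rax to an oct word, storing the result in %rdx:%rax (high-order 8 bytes in %rdx, low-order 8 bytes in %rax).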
What is the difference between the imulq and mulq instructions?
imulq = signed multiplication, mulq = unsigned multiplication
If the imulq and mulq instructions are only provided one operand, what is their function?
They are special one-operand forms that compute the full 128-bit product of %rax and the operand, storing the high-order 64 bits in %rdx and the low-order 64 bits in %rax.
imulq = signed full multiplication, mulq = unsigned full multiplication
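The full 128-bit unsigned product can be modelled in portable C by splitting each operand into 32-bit halves. This is a sketch of the arithmetic, not of how the processor computes it; struct product128 and mulq_model are hypothetical names, with hi standing for %rdx and lo for %rax.

```c
#include <stdint.h>

/* 128-bit result split the way one-operand mulq leaves it:
 * hi corresponds to %rdx, lo to %rax. Hypothetical type. */
struct product128 {
    uint64_t hi;
    uint64_t lo;
};

/* Unsigned 64 x 64 -> 128-bit multiplication using 32-bit halves. */
struct product128 mulq_model(uint64_t a, uint64_t b) {
    uint64_t a_lo = a & 0xFFFFFFFFULL, a_hi = a >> 32;
    uint64_t b_lo = b & 0xFFFFFFFFULL, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;  /* contributes to bits 0..63 */
    uint64_t p1 = a_lo * b_hi;  /* contributes to bits 32..95 */
    uint64_t p2 = a_hi * b_lo;  /* contributes to bits 32..95 */
    uint64_t p3 = a_hi * b_hi;  /* contributes to bits 64..127 */

    /* Sum of everything that lands in bits 32..63, with its carry. */
    uint64_t mid = (p0 >> 32) + (p1 & 0xFFFFFFFFULL) + (p2 & 0xFFFFFFFFULL);

    struct product128 r;
    r.lo = (mid << 32) | (p0 & 0xFFFFFFFFULL);
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}
```

Multiplying 0xFFFFFFFFFFFFFFFF by 2 gives hi = 1 and lo = 0xFFFFFFFFFFFFFFFE, a result that does not fit in 64 bits.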
Name four of the most useful condition flags.
The carry flag, zero flag, sign flag, and overflow flag.
What is the carry flag used for?
It indicates that the most recent operation generated a carry out of the most significant bit. It’s used to detect overflow of unsigned calculations.
What is the zero flag used for?
To indicate that the most recent operation yielded zero.
What is the sign flag used for?
To indicate that the most recent operation yielded a negative value.
What is the overflow flag used for?
To indicate that the most recent operation caused a two’s complement overflow – either positive or negative.
What two classes of instructions set the condition codes without altering any registers?
The CMP and TEST classes of instructions.
What is the syntax of an instruction in the CMP family and what is its effect?
Syntax: cmp{b,w,l,q} S_1, S_2.
Effect: Set condition flags based on S_2 - S_1.
What is the syntax of an instruction in the TEST family and what is its effect?
Syntax: test{b,w,l,q} S_1, S_2.
Effect: Set condition flags based on S_2 & S_1.
The condition flags are CF, ZF, SF, and OF. Combine them to express the condition tested by the instruction “setle”.
(SF ^ OF) | ZF // Note: the SF ^ OF term gives the correct signed less-than result even when the comparison overflows.
The condition flags are CF, ZF, SF, and OF. Combine them to express the condition tested by the instruction “setg”.
~(SF ^ OF) & ~ZF // Note: the SF ^ OF term gives the correct signed less-than result even when the comparison overflows.
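The flag combinations for setle and setg can be checked with a small C model of a 32-bit cmpl. This is a sketch with hypothetical names (struct flags, cmpl_flags, setle, setg); the flags are computed from S_2 - S_1, matching the CMP definition.

```c
#include <stdint.h>

/* The four condition flags, each 0 or 1. Hypothetical type. */
struct flags {
    int cf, zf, sf, of;
};

/* Flags after cmpl s1, s2, which evaluates s2 - s1 on 32-bit values. */
struct flags cmpl_flags(uint32_t s1, uint32_t s2) {
    uint32_t t = s2 - s1;  /* wraps modulo 2^32 */
    struct flags f;
    f.cf = s2 < s1;                               /* unsigned borrow */
    f.zf = (t == 0);                              /* result is zero */
    f.sf = (int)(t >> 31);                        /* sign bit of result */
    f.of = (int)(((s2 ^ s1) & (s2 ^ t)) >> 31);   /* signed overflow */
    return f;
}

/* setle: signed less-than-or-equal, (SF ^ OF) | ZF. */
int setle(struct flags f) {
    return (f.sf ^ f.of) | f.zf;
}

/* setg: signed greater-than, ~(SF ^ OF) & ~ZF. */
int setg(struct flags f) {
    return !(f.sf ^ f.of) & !f.zf;
}
```

Comparing s2 = 0x80000000 (the most negative int) with s1 = 1 overflows, leaving SF = 0 and OF = 1, yet setle still reports 1 because SF ^ OF is 1.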
The C code:
int comp(data_t a, data_t b) { return a COMP b; }
shows a general comparison between arguments a and b, where data_t, the data type of the arguments, is defined (via typedef) to be one of the integer data types, either signed or unsigned. The comparison COMP is defined via #define.
Suppose a is in some portion of %rdi while b is in some portion of %rsi. For each of the following instruction sequences, determine which data types data_t and which comparisons COMP could cause the compiler to generate this code.
A. cmpl %esi, %edi
setl %al
B. cmpw %si, %di
setge %al
A. cmpl %esi, %edi
setl %al
// Data types: int
// Operation: a<b
B. cmpw %si, %di
setge %al
// Data types: short
// Operation: a>=b
The C code:
int comp(data_t a, data_t b) { return a COMP b; }
shows a general comparison between arguments a and b, where data_t, the data type of the arguments, is defined (via typedef) to be one of the integer data types, either signed or unsigned. The comparison COMP is defined via #define.
Suppose a is in some portion of %rdi while b is in some portion of %rsi. For each of the following instruction sequences, determine which data types data_t and which comparisons COMP could cause the compiler to generate this code.
A. cmpb %sil, %dil
setbe %al
B. cmpq %rsi, %rdi
setne %al
A. cmpb %sil, %dil
setbe %al
// Data types: unsigned char
// Operation: a<=b
B. cmpq %rsi, %rdi
setne %al
// Data types: long, unsigned long or a pointer
// Operation: a!=b
What three common ways are there to access condition codes?
(1) We can set a single byte to 0 or 1 depending on some combination of the condition codes;
(2) We can conditionally jump to some other part of the program, or;
(3) We can conditionally transfer data.
What do the suffixes of the instructions in the SET class denote?
The condition, NOT the data size.
The C code:
int test(data_t a) { return a TEST 0; }
shows a general comparison between argument a and 0, where we can set the data type of the argument by declaring data_t with a typedef, and the nature of the comparison by declaring TEST with a #define declaration. The following instruction sequences implement the comparison, where a is held in some portion of register %rdi. For each sequence, determine which data types data_t and which comparisons TEST could cause the compiler to generate this code.
A. testq %rdi, %rdi
setge %al
B. testw %di, %di
sete %al
A. testq %rdi, %rdi
setge %al
// Data types: long
// Operation: a>=0
B. testw %di, %di
sete %al
// Data types: short or unsigned short
// Operation: a==0
The C code:
int test(data_t a) { return a TEST 0; }
shows a general comparison between argument a and 0, where we can set the data type of the argument by declaring data_t with a typedef, and the nature of the comparison by declaring TEST with a #define declaration. The following instruction sequences implement the comparison, where a is held in some portion of register %rdi. For each sequence, determine which data types data_t and which comparisons TEST could cause the compiler to generate this code.
A. testb %dil, %dil
seta %al
B. testl %edi, %edi
setle %al
A. testb %dil, %dil
seta %al
// Data types: unsigned char
// Operation: a>0
B. testl %edi, %edi
setle %al
// Data types: int
// Operation: a<=0