CSCI 223 Test 1 Flashcards
source program
created with an editor and saved in a text file (extension .c)
how many bits are in a byte
8
files that consist exclusively of ASCII characters are known as ?; all other files are known as
text files; binary files
? goes into the preprocessor and ? comes out
source program (text); modified source program (text)
? goes into the compiler and ? comes out
modified source program (text); assembly program (text)
? goes into the assembler and ? comes out
assembly program (text); object file (binary)
? goes into the linker and ? comes out
object file (binary); executable file (binary)
what programs perform the four phases of preprocessing, compiling, assembling, and linking?
preprocessor, assembler, compiler, linker
Why do we need to understand how compilation systems work?
to optimize program performance, understand link-time errors, and avoid security holes
assembly language definition
low-level programming language which is a human readable, textual representation of machine code
assembly language facts
each statement corresponds to a single machine code; each assembly language is specific to a particular processor architecture (ISA); it contains a lot of hardware info
ISA
Instruction Set Architecture: interface b/w software and hardware (e.g. x86, ARM, MIPS, …); assembly language is very important to the ISA
what translates high level languages to low level languages?
the compiler
examples of high level programming
C/C++, Java, …
examples of low level programming
x86, ARM, MIPS, …
a system is made up of
hardware and software
software is made up of
application programs and the operating system
hardware is made up of
processor, memory, I/O devices, etc.
ALU
arithmetic logic unit
what is the fastest yet smallest memory space?
the register file (inside the CPU)
onchip memory versus offchip memory
offchip: outside CPU and onchip: inside CPU (e.g. main memory is offchip; cache, register file is onchip)
onchip memory has (faster/slower) access time than main memory
faster
memory hierarchy generalization
smaller, faster, and costlier (per byte) storage devices at the top to larger, slower, and cheaper (per byte) storage devices at the bottom
memory hierarchy L0
registers
memory hierarchy L1
L1 cache (SRAM)
memory hierarchy L2
L2 cache (SRAM)
memory hierarchy L3
L3 cache (SRAM)
memory hierarchy L4
main memory (DRAM)
memory hierarchy L5
local secondary storage (local disks)
memory hierarchy L6
remote secondary storage (distributed file systems, web servers)
between levels ? and ? in the memory hierarchy is where we can differentiate onchip (above) and offchip (below)
L3 and up is onchip, L4 and below is offchip
main memory holds ? retrieved from local disks
disk blocks
cache holds ? retrieved from memory
cache lines
CPU registers hold ? retrieved from cache memory
words
why does the cache matter?
for performance; performance depends on access patterns
binary representation
everything is represented as a sequence of binary numbers in digital systems (0 or 1)
byte
8 bit chunk; smallest data access unit
each byte has a ? memory address
unique
machine code views memory as ?
a very large array of bytes (referred to as virtual memory)
widely-used numeral systems
decimal (base 10), hexadecimal/hex (base 16), octal (base 8), and binary (base 2)
how to convert from decimal to hexadecimal
divide by 16, read the remainders from bottom to top
how to convert from decimal to binary
divide by 2, use the remainders from bottom to top (should be 0s and 1s)
how to convert from hexadecimal to decimal
label the positions from right to left as 0, 1, 2, … and multiply the number by 16^? and add the results together
how to convert from binary to decimal
label the positions from right to left as 0, 1, 2, … and multiply the number by 2^? and add the results together
how to convert from binary to hexadecimal or binary to hexadecimal
use the chart
0x0
0000
0x1
0001
0x2
0010
0x3
0011
0x4
0100
0x5
0101
0x6
0110
0x7
0111
0x8
1000
0x9
1001
0xA
1010
0xB
1011
0xC
1100
0xD
1101
0xE
1110
0xF
1111
Little Endian ordering
least significant byte (furthest to the right) has lowest address
pointer
a variable whose content is a memory address
how to declare a pointer
int *p;
and chart
& 0 1
0 0 0
1 0 1
or chart
0 1
0 0 1
1 1 1
not chart
0 1
~ 1 0
exclusive or chart
^ 0 1
0 0 1
1 1 0
logical operations in C (logical and, logical or, logical not)
&&, ||, !
view zero as (true/false) and anything nonzero as (true/false)
false, true
for logical or, you want to place the more likely term to be (0 or 1) first
1
for logical and, you want to place the more likely term to be (0 or 1) first
0
left shift x «_space;y will
shift bit-vector x left by y positions: throw away extra bits on left and will with 0’s on the right; similar to multiplication by 2^x
right shift x»_space; y will
shift bit-vector x right by y positions; similar to division by power of 2
logical right shift
fill with zero’s on the left
arithmetic right shift
replicate the most significant bit to the right on the left
ISA
Instruction Set Architecture - interface between software and hardware
ISA is considered (micro/macro)architecture
macroarchitecture
microarchitecture
implementation of macroarchitecture; not visible to software; can vary as long as it satisfies the macroarchitecture
system stack: software
problem, algorithm, program
system stack: hardware
microarchitecture, circuits, transistors
Moore’s Law
transistor counts double every 18-24 months on a single chip (means the performance will double)
PC (program counter)
contains the address of the next instruction to execute; called “%eip” (IA32) or “%rip” (x86-64)
register file
collection of registers; used to store heavily used program data
condition codes
store status information about most recent arithmetic operation; used for conditional branching
memory
byte addressable array; code, user data, some OS data; includes stack used to support procedures
three basic operations of machine instructions
- arithmetic and logic operation (ALU)
- memory operation (AKA data movement)
- control-flow operation
x86 is a (CISC/RISC) type
CISC
CISC
Complex Instruction Set Computers; large number of instructions, varying instruction length, various addressing formats, complex compiler and hardware, compact code size
RISC
Reduced Instruction Set Computers; smaller number of instructions, fixed instruction length, only a few addressing formats, simpler compiler and hardware, larger code size
assembly data types
integer data (including memory address), floating point data; no aggregate types such as arrays or structures, just continuously allocated bytes in memory
assembly operations
perform arithmetic/logical function on register or memory data, transfer data b/w memory and register, transfer control
disassembler
useful tool for examining object code; analyses bit pattern of series of instructions, produces approximate rendition of assembly code, can be run on either a.out (complete executable) or .o file
x86-64 Integer registers
supposed to contain integer value or memory address; can reference low-order 4 bytes; controlled by compiler; 16 registers
IA32 integer registers
8 integer registers (first 6 - general purpose, %esp: stack pointer, %ebp: base pointer); can reference low-order 1 byte, 2 bytes, and 4 bytes; 16-bit registers with backwards compatibility
operand types
immediate, register, memory
immediate operand type
constant integer data; like C constant, but prefixed w/ ‘$’; encoded with 1, 2, or 4 bytes
register operand type
one of 16 integer registers; %rsp reserved for special use; others have special uses for particular instructions
memory operand type
8 consecutive bytes of memory at address given by register; various other “address modes”
simple memory addressing modes
Normal: (R) Mem[Reg[R]]
Diplacement: D(R) Mem[Reg[R]+[D]]
complete memory addressing modes
most general form: D(Rb, Ri, S)
Mem[Reg[Rb] + S*Reg[Ri]+D]
where D = constant “displacement”, Rb = base register, Ri = index register, and S = scale
special memory address cases
(Rb, Ri) > Mem[Reg[Rb]+Reg[Ri]]
D(Rb, Ri) > Mem[Reg[Rb]+Reg[Ri]+D]
(Rb, Ri, S) > Mem[Reg[Rb]+S*Reg[Ri]]
ALU two operand instructions (Src, Dest)
addq (+), subq (-), imulq (*), salq (<>), shrq (»), xorq (^), andq (&), orq (|)
ALU one operand instructions (Dest)
incq (Dest+1), decq (Dest-1), negq (-Dest), notq (~Dest)
leaq Src, Dest
pure ALU instruction that simply computes memory addresses (in contrast to movq, which accesses memory); Src is address mode expression, set Dst to address denoted by expression; useful for computing addresses without a memory reference and computing arithmetic instructions fo the form x+k*y
control flow instructions: processor state
info about currently executing program:
temporary data, location of runtime stack, location of current code control point, status of recent tests
condition codes for single bit registers
CF - Carry Flag (for unsigned), ZF - Zero Flag, SF - Sign Flag (for unsigned), OF - Overflow Flag (for signed)
implicitly set condition codes by arithmetic operations
e.g. addl Src, Dest //t = a+b
CF set if carry out most significant bit (used for unsigned comparisons)
ZF set if t==0
SF set if t<0 (as signed)
OF set if two’s complement (signed overflow): (a>0 && b<0 && (a-b)<0) || (a<0 && b>0 && (a-b) > 0)
condition codes explicit setting by Test instruction
testl/testq Src2, Src1
testl b,a like computing a&b without setting destination
sets condition codes based on value of Src1 & Src2
useful to have one of the operands be a mask
ZF set when a&b == 0
SF set when a&b < 0
C data type char size
1 byte
C data type short size
2 bytes
C data type int size
4 bytes
C data type long size
4 bytes
C data type long long size
8 bytes
C data type float size
4 bytes
C data type double
8 bytes
C data type long double size
8 bytes
C data type pointer size
4 bytes
condition codes explicit setting by Compare instruction
cmpl/cmpq Src2, Src1
cmpl b,a like computing a-b (subtraction) without setting destination
CF set if carry out from most significant big (for unsigned)
ZF set if a==b
SF set if (a-b)<0 (as unsigned)
OF set if two’s complement (signed) overflow
SetX instructions
set a single byte based on combo of condition codes
one of 8 addressable byte registers; does not alter remaining 3 bytes; typically use movzbl to finish job
jX instructions
jump to a different part of the code depending on condition codes
jump target
in assembly code - symbolic labels
in machine binary - PC relative, absolute
difference between while, do-while, and for loops
do while will run at least once
while checks condition first
for loop initializes, tests, and updates in condition
mechanisms in procedures
passing control (to beginning of procedure code or back to return point), passing data (procedure arguments, return value), memory management (allocate during procedure execution, de-allocate upon return)
all implemented with machine instructions; x86-64 implementation of a procedure only uses those mechanisms required
x86-64 stack
region of memory managed with stack discipline (FILO), grows toward lower addresses, register %rsp contains the address of the top of the stack (lowest address)
x86-64 stack: push
pushq Src (Dest implicit > top of stack)
- fetch (read) operant at Src
- decrement %rsp by 8
- write operand at address given by %rsp
x86-64 stack: pop
popq Dest
- read value at address given by %rsp
- increment %rsp by 8
- store value at Dest (must be register)