Midterm Flashcards
why is this code “dangerous”?
- int test1 = 0;
- char line[20];
- int test2 = 0;
- scanf("%s", line);
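with a bare %s, scanf has no idea how big the target buffer is, so it keeps writing for as long as the input keeps coming (you need a field width such as %19s, or fgets, to bound the read)
reading input that goes beyond the end of the buffer will overwrite whatever sits next to it in memory, possibly the values of other variables (in this case, test1 and test2)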
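a minimal sketch of the bounded version, assuming the same 20-byte buffer as the card. parse_word and LINE_SIZE are names made up for this example, and it reads from a string with sscanf (same format rules as scanf) so it is easy to try out:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LINE_SIZE 20

/* Copies the first whitespace-delimited word of `input` into `line`.
 * The field width 19 is LINE_SIZE - 1, which leaves room for the '\0'.
 * Returns 1 if a word was read, 0 otherwise. */
int parse_word(const char *input, char line[LINE_SIZE]) {
    assert(input != NULL);
    assert(line != NULL);
    return sscanf(input, "%19s", line) == 1;
}
```

with the width in place, a 26-character word is cut off after 19 characters instead of running past the end of line.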
why is this code “dangerous”?
int readData(char *string, int max) {
    int read = -1;
    FILE *in;
    in = fopen("in.txt", "r");
    if (in != NULL) {
        fgets(string, max, in);
        read = strlen(string);
    }
    return read;
}
the file is never closed with fclose.
-> the code leaks a file handle on every call
could potentially cause the program to fail or crash
- eg. if someone calls this function repeatedly, the process runs out of file handles and fopen starts returning NULL
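one possible fix, as a sketch only. read_data and its path parameter are assumptions added for this example (the card hard-codes in.txt), and it also checks the result of fgets before calling strlen:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Reads the first line of the file at `path` into `string` (at most
 * max - 1 characters). Returns the number of characters read, or -1 if
 * the file could not be opened or nothing could be read.
 * The `path` parameter is an addition for this sketch. */
int read_data(const char *path, char *string, int max) {
    int read = -1;
    FILE *in;

    assert(path != NULL);
    assert(string != NULL);
    assert(max > 0);

    in = fopen(path, "r");
    if (in != NULL) {
        /* only measure the string if fgets actually filled it */
        if (fgets(string, max, in) != NULL) {
            read = (int) strlen(string);
        }
        fclose(in); /* release the handle on every path that opened it */
    }
    return read;
}
```

fclose sits inside the if so it only runs when the file actually opened.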
Here is the physical memory pointed to by char *str.
How long is this string? (i.e. what will strlen(str) return?)
str -> [a][v][4][0][\0][E][\0][h][\0]
4
The string is terminated by the null terminator at str[4].
extra:
- The null terminator is essential because C does not store the length of a string explicitly
-> functions that work with strings rely on the \0 to know where the string ends.
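a small sketch that rebuilds the same bytes as the card and lets strlen count them. card_memory and card_length are hypothetical names used only here:

```c
#include <assert.h>
#include <string.h>

/* The same bytes as on the card: a v 4 0 \0 E \0 h \0 */
static const char card_memory[] = { 'a', 'v', '4', '0', '\0', 'E', '\0', 'h', '\0' };

/* Length as strlen sees it: it counts up to the first '\0' and stops. */
size_t card_length(void) {
    return strlen(card_memory);
}
```

the array is 9 bytes long, but everything after the first null terminator is invisible to strlen.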
explain why you would not test for NULL when writing test cases, even as an edge case. (3)
- testing is meant to simulate user input. users can’t input NULL
- Applying the principles of design by contract should catch programming errors (like passing NULL to a function)
- Passing NULL to a function would cause a Segmentation fault or Abort.
- A Segmentation fault or Abort isn’t a test failure, it’s a crash.
passing NULL to a function would cause ________________ or _________
segmentation fault or abort
what is the main purpose of testing code?
to simulate human interaction with our code
(user input)
what are the 3 general classifications of inputs we need to use to test our code?
- general cases
  - valid, expected input
- edge cases
  - input is valid but looks weird
- memory leaks
  - use loops to run code repeatedly
  - observation with tools like top (see man top) or profilers
how do you check for memory leaks in your code?
- use loops to run code repeatedly
- observation with tools like top or profilers
list three edge cases for the following in-place sorting routine:
void sort( int *array, int size );
- array is empty
  - [], 0
- array has one value
  - [1]
- array has repeated values
  - [2, 2, 2]
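these edge cases could be written as one small test function each. this is only a sketch: the insertion sort is a stand-in so the tests have something to run against (the card only gives the prototype), and the test_ names are made up:

```c
#include <assert.h>

/* Stand-in implementation (insertion sort) so the tests have something to
 * run against; the routine on the card only gives the prototype. */
void sort(int *array, int size) {
    int i, j, key;
    for (i = 1; i < size; i++) {
        key = array[i];
        for (j = i - 1; j >= 0 && array[j] > key; j--) {
            array[j + 1] = array[j];
        }
        array[j + 1] = key;
    }
}

/* Each test owns its data and checks one thing. */
void test_empty(void) {
    int array[1] = { 42 };   /* standard C has no zero-length arrays, so pass size 0 */
    sort(array, 0);
    assert(array[0] == 42);  /* nothing may be touched */
}

void test_one_value(void) {
    int array[] = { 1 };
    sort(array, 1);
    assert(array[0] == 1);
}

void test_repeated_values(void) {
    int array[] = { 2, 2, 2 };
    sort(array, 3);
    assert(array[0] == 2 && array[1] == 2 && array[2] == 2);
}
```

a main function for this would do nothing except call the three test functions.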
the only output you receive when you run a program is Segmentation fault. Explain what tool you would use to identify the problem, and what the tool is helping you understand about the program.
asserts, debugger, print statements can all help but for this the best option is a debugger
a debugger helps you:
- step through your program line-by-line (observe the flow)
  - helps you find where the code is crashing
- watch variables as they change (observe the state)
  - determine which variables have the wrong values that are causing the crash
Given the following function, what should be the pre and postconditions for the function, and what
should the loop invariant(s) be? Write your answer as a series of assertions. Give at least two examples of
each type
void substr(char *in, int start, int end, char *out) {
    int i;
    for (i = start; i < end; i++)
        out[i - start] = in[i];
    out[i - start] = '\0';
}
// preconditions:
assert(in != NULL);
assert(out != NULL);
assert(start < end);
assert(start >= 0);
assert(end <= (int) strlen(in));
// loop invariant
assert(i - start >= 0);
assert(i - start < end - start);
// postconditions
assert(out[0] == in[start]);
assert((int) strlen(out) == end - start);
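a sketch of where those assertions would sit inside the function. two assumptions in this version: in is declared const (the card uses plain char *), and the terminator is written at out[i - start] so that it lands right after the copied characters:

```c
#include <assert.h>
#include <string.h>

/* Copies in[start..end) into out and null-terminates it.
 * out must have room for (end - start) + 1 characters. */
void substr(const char *in, int start, int end, char *out) {
    int i;

    /* preconditions */
    assert(in != NULL);
    assert(out != NULL);
    assert(start >= 0);
    assert(start < end);
    assert(end <= (int) strlen(in));

    for (i = start; i < end; i++) {
        /* loop invariants: the write index stays inside the substring */
        assert(i - start >= 0);
        assert(i - start < end - start);
        out[i - start] = in[i];
    }
    out[i - start] = '\0';

    /* postconditions */
    assert(out[0] == in[start]);
    assert((int) strlen(out) == end - start);
}
```

preconditions go before any work is done, invariants inside the loop, postconditions just before returning.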
Below is a structure definition for a string data structure in C:
typedef struct STRING {
char *content;
int length;
} string;
Write a function that will return the index of a character in an instance of this string structure. For complete points, your code must apply the principles of design by contract (you must define pre and postconditions for this function). No invariant function is provided, you should write your own.
Here’s the prototype:
// return the index of the specified character in the string (similar to
// Java’s String#indexOf method). Returns -1 if not found.
int index_of(string *, char);
see notes
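one way this could look, as a sketch rather than the answer from the notes. validate_string is a hypothetical invariant function written for this example, and it assumes length always matches the position of the first null terminator:

```c
#include <assert.h>
#include <string.h>

typedef struct STRING {
    char *content;
    int length;
} string;

/* Invariant: a valid string has content, a non-negative length, and its
 * first null terminator sits exactly at content[length]. */
static void validate_string(const string *s) {
    assert(s != NULL);
    assert(s->content != NULL);
    assert(s->length >= 0);
    assert((int) strlen(s->content) == s->length);
}

/* Returns the index of the first occurrence of c, or -1 if not found. */
int index_of(string *s, char c) {
    int i;
    int index = -1;

    /* preconditions */
    validate_string(s);

    for (i = 0; i < s->length && index == -1; i++) {
        if (s->content[i] == c) {
            index = i;
        }
    }

    /* postconditions */
    assert(index >= -1 && index < s->length);
    assert(index == -1 || s->content[index] == c);
    validate_string(s);

    return index;
}
```

the invariant is checked on the way in and on the way out, since searching should never change the string.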
The degree of reliance on other modules/functions
is called ________
coupling
Higher coupling means higher __________ and we are more likely to be affected [positively/negatively] by changes
dependency
negatively
The degree to which a function adheres to one task is called __________
cohesion
_________ cohesion means doing many independent
tasks. is this good?
low
no. we want high cohesion. we don’t want our functions doing many independent tasks
well designed programs should have:
______ cohesion. The elements of each module
should be closely related to one another.
– makes modules easier to use and makes the entire program easier to understand.
_______coupling. Modules should be as independent
of each other as possible.
– makes it easier to modify the program and reuse modules.
high
low
high cohesion means the elements of each module are what? how does this help?
closely related to each other
makes modules easier to use and makes the program easier to understand
low coupling means modules should be…
how does this help?
as independent of each other as possible
makes it easier to modify the program and reuse modules
The assembler translates the assembly to…
binary/machine code
The von Neumann Architecture
A “stored-program” computer architecture consists of: (4)
- input device
- output device
- memory unit
- central processing unit
what is piping for?
piping redirects standard output from one program to another to be processed as standard input
transfers standard output to some other destination
what are the key differences between C and Java (7/8)
- C doesn’t have objects
  - no String class. char arrays are strings
- C doesn’t garbage collect
  - Java destroys unused arrays and frees memory for you. C does not do this
- C doesn’t have exceptions
  - because it doesn’t have objects
  - you must handle exceptional cases yourself
- C does not check bounds for you
  - will allow you to overflow and overwrite other blocks of memory
- C has no concept of information hiding
  - everything is public
- variables can be used without being initialized
- arrays
  - not bounds checked
  - array size must be initialized. C doesn’t do this for you. eg. char name[]; // ERROR
  - array data isn’t initialized with 0s (depends)
  - you should initialize them yourself
- C gives you direct access to memory
  - there are types but they are just representation of the bits & bytes in memory
  - in Java, bits & bytes are abstracted away into types
what is important about C not having objects
- no string class. char arrays are strings
- no exceptions. you must handle errors yourself
explain the statement that “C does not garbage collect”
- when a pointer goes out of scope, the resources it refers to (heap memory, open files) are not automatically reclaimed
- nothing hands those resources back, so they can’t be used again while the program runs (leaks)
- this is also a problem with dynamic memory allocation
- C doesn’t destroy things for you after you finish using them like Java does
- this is important because to make efficient use of memory, you must do it yourself
cmd 1 | cmd 2
explain whats happening here
piping
cmd 1 output is being fed as input to cmd 2
if the output is going by too quickly to see, what can you do?
you can pipe it somewhere (to a pager, like more or less) to view it easier
why is it important that C doesn’t have exceptions
no try/catch. no bounds checking
- C will write outside of the bounds and overwrite other information
why is it important that C does not check bounds for you?
because it won’t cause an error. even if the input has too much data for the target address, it will keep going
it will overflow and write over other memory, potentially replacing other variables in your program
memory can be thought of as…
a long piece of paper
separated into segments that have an address
each segment can store info
What can function as a Boolean in C?
numbers.
0 = false
anything else = true
int i = 10;
while (i) {
    i--;
}
what are some important things to remember for arrays in C
not bounds checked
must be passed using pointers
you must initialize the size yourself
- say how much space you’ll need
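data in arrays isn’t initialized like in Java
- a local int array won’t necessarily be filled with 0s
- depends on where the array lives: global and static arrays start zeroed, local and malloc’d arrays hold whatever was already in memory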
how is C compiled?
- preprocessor
- code generator
- assembler
- linker
what is the preprocessor? what does it do?
define
a text replacement engine
- finds lines that start with #
- Replace those lines with file contents (#include)
- Uses these lines to replace other text in the file (#define)
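a small sketch of both kinds of replacement. MAX_LINE, SQUARE, and squared_limit are names invented for this example:

```c
#include <assert.h>   /* this line is replaced by the contents of assert.h */

#define MAX_LINE 20                 /* every MAX_LINE below becomes 20 */
#define SQUARE(x) ((x) * (x))       /* plain text substitution, so parenthesize */

/* After preprocessing this reads: char line[20]; */
char line[MAX_LINE];

int squared_limit(void) {
    /* After preprocessing this reads: return ((20) * (20)); */
    return SQUARE(MAX_LINE);
}
```

running gcc with -E stops after the preprocessor and prints the replaced text, which is a handy way to see exactly what the later stages receive.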
what does the code generator do?
where does it take output from?
- generates abstract syntax tree,
- optimizes it,
- converts to assembly
- takes output from preprocessor
what does the assembler do
where does it take output from
translates assembly to an object file (machine code)
takes output from code generator
what does the linker do
where does it take output from
what does it produce?
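links the functions
- matches each function call in one object file to the function’s definition in another object file or library
- prototypes are important for this: they let each file compile on its own, but if the definition can’t be found anywhere you will get an error from the linker (undefined reference)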
takes output from the assembler
Produces an executable binary file
how does an array look in memory?
contiguous blocks of memory
- the size you set for an array will reserve that many blocks of memory
define exceptions
unexpected behaviour
happens during code execution
interrupts flow of execution
what are some causes of exceptions?
- file handling
- null pointers
- division by 0
explain what happens when a function is called
a stack frame is created with the memory necessary for the variables in the function
when the function has finished executing, the frame is popped off, and the memory allocated becomes available to the rest of the program
explain why stack memory cannot be returned from a function
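stack memory only belongs to the function while it is executing. when the function finishes, its stack frame gets popped off
the memory that held the variables in that frame becomes available again and can be overwritten by the next function call, so a pointer to one of them points at garbage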
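a sketch of the usual alternative, returning heap memory instead. make_squares is a made-up example function, and the broken stack version is shown only inside a comment:

```c
#include <assert.h>
#include <stdlib.h>

/* WRONG (shown only as a comment): `squares` lives in the stack frame,
 * which is popped when the function returns.
 *
 *     int *bad_squares(void) {
 *         int squares[3] = { 0, 1, 4 };
 *         return squares;
 *     }
 */

/* Heap memory outlives the call, so it can be returned.
 * Returns NULL if the allocation fails; the caller must free the result. */
int *make_squares(int count) {
    int i;
    int *squares;

    assert(count > 0);

    squares = malloc(count * sizeof(int));
    if (squares != NULL) {
        for (i = 0; i < count; i++) {
            squares[i] = i * i;
        }
    }
    return squares;
}
```

the trade-off is that whoever calls make_squares now owns the memory and has to free it.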
struct vs class in Java
struct just has variable declarations
struct does not have methods
explain what happens when resources aren’t released after use
leaks.
if a function leaks, and is called repeatedly, can cause program to crash
what are some programming errors (not at run time)
syntax
indexing out of bounds
explain design by contract
what does it contain?
helps programmer ensure code is running the way they expect it to
contains preconditions, invariants, postconditions
essentially a debugging tool
classifications of test data (3)
- general cases
  - valid and expected input
- edge cases
  - valid input that may require special handling
- leaks
  - run code repeatedly
explain the purpose of automated testing
makes it easy to re-run all our tests every time we change our program
create a test template once and reuse it
- saves time in the long run, especially for larger programs
strategies for debugging
- printf statements
- assertions
- spelunking
- debugger
what’s the problem with using printf for debugging?
what can be done about this?
printf is buffered
- may not print near the crash
use fflush to ensure buffer is emptied
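a minimal sketch of a debugging print that flushes right away. debug_print is a hypothetical helper, not something from the notes:

```c
#include <assert.h>
#include <stdio.h>

/* Prints a debugging message and flushes stdout right away, so the text
 * is on screen even if the very next statement crashes.
 * Returns 0 if the flush succeeded (the return value of fflush). */
int debug_print(const char *message) {
    assert(message != NULL);
    printf("DEBUG: %s\n", message);
    return fflush(stdout);
}
```

printing with fprintf to stderr is another common choice, since stderr is not fully buffered.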
what does debugging reveal
flow
- how the program is running. where the crash is
state
- state of variables
explain the difference between interface and implementation
what does an interface include?
what does implementation include?
interface
- contains min amount of info for a piece of data
- tells you what you can do
- includes function prototypes
- data type declaration
- header files
implementation
- data type definitions
- function implementations
classify public interface and private behaviour
private behaviour hides implementation
- user doesn’t need to know what’s going on behind the scenes
whats the purpose of a build tool
helps us compile our code
- useful for large programs that contain many .c and .h files
- compiling those ourselves each time we make a change to our code is tedious and error-prone
example: make
explain why information/implementation hiding is important
hiding implementation allows for the use of programs/code/tools without needing to understand the inner workings
is sharing data across scopes ok? why or why not
it’s bad!
- it exposes implementation
- it increases coupling
identify code that can be separated into modules
abstract data types, data structures, and subsystems each get their own module
examples of abstract data types
lists, queues, stacks
examples of data structures
linked lists, hash tables
what is coupling
measurement of how much a class/module/function/concept depends on others
how interdependent
- coupling is unavoidable but we want to keep it as low as possible
what is cohesion
measurement of how much code and concepts belong together
- classes/modules implement ONE concept
do we want high or low coupling? why? how can this be helped?
we want low coupling
high coupling makes it hard to make changes to our code
- our code is interdependent. changing one thing affects many others
modular design helps avoid coupling
how do we reduce coupling?
modular design
(what is that?)
what is modular design?
subdividing a system into smaller parts (modules)
modules can be independently created, modified, replaced
what is the purpose of modular design(4)
- breaks code down and makes it easier to read and understand
- provides abstractable concepts that are easier to use
- allows multiple ppl to work on same thing
- allows for dynamic implementation which can be changed and not affect the rest of the program
T or F
C gives you the choice for arrays to be placed in “stack memory” or “heap memory”
true
how can you prevent exceptions from occurring
- check that variables have valid values before you use them
- for now: writing a lot of if statements & checking error codes (invariants)
- assertions
how do we return an array in C
using a pointer
what is a pointer
a type that stores an address
what is the basic idea of Output Parameters
each function should be responsible for allocating the memory it needs, even if that memory is populated by another function
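a sketch of the idea, with fill_squares as a made-up example. the caller allocates the array and passes a pointer to it; the function only populates it:

```c
#include <assert.h>
#include <stddef.h>

/* The caller owns `out` and says how big it is; this function only fills it. */
void fill_squares(int *out, int count) {
    int i;

    assert(out != NULL);
    assert(count >= 0);

    for (i = 0; i < count; i++) {
        out[i] = i * i;
    }
}
```

because the array belongs to the caller, it can live on the caller’s stack and nothing needs to be freed.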
int * x_ptr, y_ptr;
what will these be declared as
x_ptr = a pointer
y_ptr = an int
Practice: consider the following program. draw it out
int x = 1;
int y = 2;
int z = 3;
int *a, *b;
a = &y;
b = &x;
*b = z;
x = *a;
see notes sep 17/19
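a traced version of the same statements, to check a drawing against. the wrapper function and the practice_result struct are additions for this sketch so the final values can be inspected:

```c
#include <assert.h>

typedef struct {
    int x, y, z;
} practice_result;

practice_result pointer_practice(void) {
    int x = 1;
    int y = 2;
    int z = 3;
    int *a, *b;
    practice_result result;

    a = &y;   /* a points at y */
    b = &x;   /* b points at x */
    *b = z;   /* writes through b: x becomes 3 */
    x = *a;   /* reads through a: x becomes 2 (the value of y) */

    result.x = x;
    result.y = y;
    result.z = z;
    return result;
}
```

at the end x is 2, y is 2, and z is 3; a and b never change where they point.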
scanf returns one of three things:
- number of tokens parsed (success)
- 0 for a formatting error
- EOF for the end of the file
how does fgets get input safely? (unlike _______)
unlike scanf,
fgets takes the size of the buffer: it reads at most size - 1 characters and always puts a null terminator after them
what are the steps to opening a file
1. fopen
1.5 check if file opened
   - fopen returns NULL if file doesn’t exist
2. read file
3. fclose the file
   - only if it opened successfully
T or F
C treats stdout as a file
True
what is a class?
a “blueprint” for what data and methods go together
what is a struct
a named grouping of related heterogeneous data
like a class but only variables. no methods
what is typedef for
so that every time we declare a variable of type struct FOO, we DONT have to write struct FOO
- defines foo as a type
what happens if you pass a struct as a parameter or return a struct
a copy will be made
also, if you assign structs to each other
can you compare two structs?
no, == doesn’t work on structs
but you could do a field-by-field comparison to check for duplicates
what should you do if you want to modify a struct in a function
pass it as a pointer
otherwise you will be working with a copy and won’t change the original struct
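a sketch of the difference, using a hypothetical point struct. it also shows a field-by-field comparison, since structs can’t be compared with ==:

```c
#include <assert.h>
#include <stddef.h>

typedef struct POINT {
    int x;
    int y;
} point;   /* hypothetical struct for illustration */

/* Receives a copy: the caller's struct is untouched. */
void move_copy(point p, int dx) {
    p.x += dx;
}

/* Receives the address: the caller's struct is modified. */
void move_in_place(point *p, int dx) {
    assert(p != NULL);
    p->x += dx;
}

/* Field-by-field comparison. Returns 1 if equal, 0 otherwise. */
int point_equals(const point *a, const point *b) {
    assert(a != NULL && b != NULL);
    return a->x == b->x && a->y == b->y;
}
```

calling move_copy leaves the original alone, while move_in_place changes it. assigning one point to another makes an independent copy in the same way.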
what does malloc do
what does it return
requests allocation of heap memory
a pointer to the start of the space it allocated, which is the amount you requested (in bytes)
- or NULL if it failed
malloc takes a size in bytes
whats important to know for this?
how many things we want to store
multiplied by:
how many bytes each thing takes up (what type)
- use sizeof
different platforms use different amounts of space for the same type. how do we deal with this when using malloc?
sizeof
- unary operator that tells us how many bytes a specific data type requires
char * arr = malloc(10 * sizeof(char));
// gives us 10 characters worth of space
what should you always remember to do when using malloc?
clean up after yourself! (C won’t garbage collect for you)
free memory when we’re done using it
eg. free(array)
when we design a data structure, our responsibility is to provide… (2)
hint, one is unique to C
- a constructor
- a destructor
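a sketch of what that pair might look like for a string struct like the one used earlier in these cards. string_create and string_destroy are hypothetical names:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct STRING {
    char *content;
    int length;
} string;

/* Constructor: allocates the struct and its own copy of `text`.
 * Returns NULL if either allocation fails. */
string *string_create(const char *text) {
    string *s;

    assert(text != NULL);

    s = malloc(sizeof(string));
    if (s != NULL) {
        s->length = (int) strlen(text);
        s->content = malloc((s->length + 1) * sizeof(char));
        if (s->content == NULL) {
            free(s);        /* don't leak the struct if the copy fails */
            s = NULL;
        } else {
            strcpy(s->content, text);
        }
    }
    return s;
}

/* Destructor: frees everything the constructor allocated. */
void string_destroy(string *s) {
    assert(s != NULL);
    free(s->content);
    free(s);
}
```

the destructor is the part unique to C: every malloc in the constructor needs a matching free.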
what is design by contract a tool for?
what is it not a tool for?
- prevent programming errors
- exception prevention
NOT for:
- preventing user errors
what is the state of a program?
the value of all variables that currently exist
what kind of programming errors can a programmer write?
- index and bounds errors
- forget to check inputs (data type, NULL, format)
- syntax errors (handled by the compiler)
exceptions usually come from a violation of our ____________
3 places this can happen:
assumptions
- before we start processing data
- after we finish processing data
- the state throughout execution
(while we’re processing data)
char char_at(String *str, int loc) {
    return str->contents[loc];
}
what are some assumptions we may have about this code?
- loc <= strlen(str->contents)
- loc >= 0
- str != NULL
- str->contents != NULL
- str->contents has a null terminator at position str->length (and this is the first null terminator)
the preconditions, postconditions, and invariants of a function/block/thing are its ___________
contracts
how do we check for leaks?
run code repeatedly and see if it crashes
what are some edge cases for a sorting algorithm (5)
edge cases:
○ a list of length 1
○ a sorted list
○ a reverse sorted list
○ an empty list (length 0)
○ n identical elements
preventing exceptions…
what kind of assumptions might we have
- provided arguments are not NULL
- variables are non-negative
- values are within bounds of array
- a counter actually reflects the contents of an object
what would some edge cases for a split function be?
(split words separated by spaces)
- “hello” -> no spaces
  – 1 element {“hello”}
  program expects 0 elements
- “hello,  world” -> 2 spaces in a row
  – 2 elements {“hello,”, “world”}
  program expects 3 elements
- “   ” -> only space characters
  – 1 element {“”} (empty string)
  program expects n elements
- “ hello ” -> space at the beginning and the end
  – 1 element {“hello”}
  program expects 3 elements
- “” -> empty string
  – 0 elements
organizing tests:
1. Each test function should be as ______ as
possible
– Test _____ thing. (and test it well.)
2. The data for the test belongs in the test ________.
3. The main function should only call other __________
atomic. test one thing
function
functions
advantages of typedefs?
- make our program easier to understand
- make our program easier to modify
An __________ type is a type whose values are
listed (“____________”) by the programmer
(same word fits both blanks)
enumerated
enumerations are a tool for
creating types that can only take on a small number of values
eg.
enum suit {
heart,
spade,
club,
diamond
};
enum suit card = heart;
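a short sketch of using the enum above. is_red is a made-up function for illustration:

```c
#include <assert.h>

enum suit {
    heart,    /* 0 */
    spade,    /* 1 */
    club,     /* 2 */
    diamond   /* 3 */
};

/* Returns 1 for the red suits, 0 for the black ones. */
int is_red(enum suit card) {
    int red = 0;
    switch (card) {
        case heart:
        case diamond:
            red = 1;
            break;
        case spade:
        case club:
            red = 0;
            break;
    }
    return red;
}
```

the names are just integers counted up from 0, but the code reads in terms of suits.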
what’s one thing enumerations are useful for
creating a boolean
typedef enum {
    false, // 0
    true   // 1
} boolean;
general strategy for code that crashes
key words: flow and state
- run the code in the debugger with no breakpoints (run or r)
- inspect the stack trace. (bt)
  ○ tells you where the crash is happening
- add a break point near the crash. (b or break, followed by the line number)
- run program again, stepping through each line after the breakpoint. (n or next)
- observe state step by step
goals of modular design (3)
- smaller, easier to understand components
- ability for multiple ppl to work on program
- extract reusable structures and concepts
modular design has 2 main parts
(what do we want to do with these two things)
- the interface of the software
- the implementation of the software
- our main goal is to separate these two things
what does a header file contain? (3)
+ (1) extra detail
- function prototypes (public function declarations)
- struct definitions (as necessary)
- enum definitions (as necessary)
-> should not expose internal implementation details
running the compiler without the linker (the -c flag) produces __________ files
object (.o)
what are header guards
for when we include the same header more than once
Problem: what happens if we include the same header twice?
what can we use to help this?
redefinitions -> compilation fails
header guards
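a sketch that simulates including the same header twice in one file. counter.h, COUNTER_H, and the counter type are hypothetical names; in a real project the guarded part would live in its own .h file:

```c
#include <assert.h>
#include <stddef.h>

/* ---- first inclusion of the (hypothetical) counter.h ---- */
#ifndef COUNTER_H
#define COUNTER_H

typedef struct COUNTER {
    int value;
} counter;

void counter_increment(counter *c);

#endif

/* ---- second inclusion: COUNTER_H is already defined, so the
 *      preprocessor skips everything up to #endif ---- */
#ifndef COUNTER_H
#define COUNTER_H

typedef struct COUNTER {
    int value;
} counter;

void counter_increment(counter *c);

#endif

/* ---- counter.c ---- */
void counter_increment(counter *c) {
    assert(c != NULL);
    c->value++;
}
```

without the #ifndef / #define / #endif lines, the second copy would redefine struct COUNTER and compilation would fail.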
a makefile consists of a list of ______ that describe how to build a program
a ____ has 3 parts:
rules/rule
- a target
  ○ the name of the thing this rule generates
- a list of prerequisites
  ○ things that must exist before you build this thing
- a list of recipes
  ○ the commands that will be run to generate the target
eg. string.c is a prereq for string.o
problem: in C, everything is effectively public by default
how do we prevent calling private functions?
use “static” modifier
- in C, static means per-file private
- ie, this function/data is only available in the file where it’s declared
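a sketch of a private helper next to a public function. clamp and percent are made-up names for this example:

```c
#include <assert.h>

/* Private helper: `static` limits it to this file, so no other .c file
 * can call it (and it should not appear in the header). */
static int clamp(int value, int low, int high) {
    if (value < low) {
        value = low;
    } else if (value > high) {
        value = high;
    }
    return value;
}

/* Public function: this prototype would go in the header. */
int percent(int part, int whole) {
    assert(whole > 0);
    return clamp(part * 100 / whole, 0, 100);
}
```

other files can call percent, but a call to clamp from another .c file would fail at link time.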
what is the single responsibility principle
each piece should do one thing
re: rules
a target is
the name of the thing this rule generates
re: rules
a list of prerequisites is…
example ?
things that must exist before you build this thing
eg. string.c is prereq for string.o
re: rules
a list of recipes is…
the commands that will be run to generate the target
Questions to ask when identifying functions to manipulate your data model: (4)
- what kind of input do you have?
- how do you get that input?
- how do you transform it to/from your data model?
- what kind of operations might someone else need to do? (what are the public operations?)