Algorithms Flashcards
What is hardware?
physical devicecs
What is software?
programs that run on hardware to implement applications
How has computer storage changed?
computers have gotten much better at storing information in smaller spaces
changed computing – much larger number of pieces of data in a device
What are the 4 types of computer memory?
registers
caches
RAM
hard drive
What does the actual work in computers?
CPU/”chip” actually does the work
What memory does the CPU/”chip” have?
- registers
- several layers of caches
What are registers?
hold the data that the computer is actively working with
very small, and very fast to access
What are caches?
fast memory on CPU/chip
What memory does the computer have?
- RAM
- hard drive/solid state drive (ssd)
What is RAM?
random access memory
- fairly fast and small
- slower to access + a bit bigger than cache
Where is the RAM?
on motherboard, NOT on chip
Where is the hard drive/solid state drive (ssd)?
NOT on chip
NOT on motherboard
How fast is access to the hard drive/solid state drive (ssd)?
very slow to access, but generally much bigger
What is Moore’s Law?
computer speed and memory on a chip will double every 18 months to 2 years
1965: number of transistors on chips doubles every year
1975: number of transistors on microchips doubles every 2 years
2019: Moore’s law no longer holds true – the trend persists, but it is slowing down
(making chips smaller also makes them faster – but this has come to an end)
How big are transistors?
around 10 nm
Why is it becoming harder to double the number of transistors on a chip?
- harder to work on environments that are that small
- atom size – we are limited by the size of an atom
- battery cells too close together – overheating
- software is not the problem
What is computing speed up due to?
- hardware
- software
What is an operating system?
special kind of software that allows the other software to run (ie. Windows, Mac OS)
How does software work?
programmers write down algorithms in a special language that the computer can understand
Evaluating Algorithms - Sequential Algorithm
-
Evaluating Algorithms - Breaking Bad Algorithm
-
What do signal algorithms use?
decomposition
abstraction
What must occur before sorting?
swapping
What are the 3 ways to evaluate algorithms?
- validity
- time (copies, swaps, comparisons)
- space (memory slots for cards, markers, dividers)
Simple Sort vs. Selection Sort vs. Insertion Sort
on average, Selection Sort and Insertion Sort have roughly ½ as many comparisons, and use roughly ½ as much space as Simple Sort
Quick Sort and Merge Sort
even faster
What is a classifier?
algorithm that maps input data to a specific category, derived from patterns or correlations from data
What is training data?
data that classifiers learn the patterns that has the “answer”
What is test data?
some training data that is held back to check and see if the classifier works
What do classifiers do?
apply patterns to new data with no “answer”
What is classification?
general class of algorithms
What is the idea behind classification?
want to use patterns and/or correlations to make decisions
What are classifiers?
algorithms that perform classification, and are specific
What are the steps of classification?
- start with the data you have
- split data into training and test sets
- build classifier (find pattern in training set)
- use classifier on test data - after you come up with a classifier that seems to do okay with your training data, you use it on your test data to see what kinds of decisions it makes
- calculate accuracy - if results of your classifier match up with decisions you made in your test data, it’s looking good
What is a tree?
collection of nodes that…
- one node is the designated root
- a node can have zero or more children; a node with zero children is a leaf
- all non-root nodes have a single parent
- edges denote parent-child relationships
- nodes and/or edges may be labeled by data
How are rooted trees in CS often, but not always, drawn?
with root on top
What is a decision tree?
trees whose node labels are attributes, and edge labels are conditions
How do you build decision trees from training data?
look for the best way to reduce entropy
How do you determine entropy?
- if all are ‘yes’ or all are ‘no’ then entropy would be 0
- if there is a mix between ‘yes’ and ‘no’ then entropy is the total number of ‘yes’ and ‘no’
Building Decision Trees from Training Data
-
What is the purpose of a decision tree?
to decide which class(es) a set of objects belong to
Will a greedy algorithm for creating a decision tree always create the most optimal solution? Why or why not?
no
optimal choice for now may not be the most optimal choice in the long run
What is a greedy algorithm?
algorithm that makes choices based on what is the best choice at that current moment
does not consider what would be the best choice in the long run, although sometimes greedy algorithms give you the same answer as what would be the best choice in the long run
Which memory type has the highest access speed?
CPU registers
List the memory types from slowest to fastest access speed.
disc storage (virtual RAM, hard drive) - slowest
physical RAM (main memory)
level 2 cache
level 1 cache
CPU registers
List the memory types in order from lowest to highest capacity.
CPU registers
level 1 cache and level 2 cache
physical RAM (main memory)
disc storage (virtual RAM, hard drive)
What are CPU registers for?
used for current tasks that the computer is running, and are often overwritten as the computer switches between tasks
What may happen to values that CPU registers hold?
may eventually get written to memory and stored permanently, but it will not store any values on a permanent basis
What do hard drives do?
house all the data on your computer on a permanent basis
that’s why you can turn computer on and off and still retain access to the data you had previously
What does CPU stand for?
central processing unit
What is the brain of the computer system? Why?
CPU
responsible for telling the computer which instruction it needs to execute
What does RAM store?
frequently used values
What does RAM do?
helps increase overall speed of computer tasks since computer does not need to go to the hard disk to get the values it needs
(getting a value from the hard drive is much slower than getting it from RAM)
What is the motherboard?
circuit board that allows other parts of computer (like CPU and memory) to communicate with each other
What are hard drive disks (HDD)?
where data is stored on a long term basis
What do hard drive disks (HDD) do?
allows you to restart or power on computer without losing all values you were working with previous to restarting/powering on
What do operating systems do?
- provide platform for applications (bridge between additional software/applications and computer’s hardware)
- manage computer’s resources (CPU, memory, peripherals)
- establish user interface (icons, windows, menus)
What is the connection between the operating system, software applications, and hardware?
- operating system has instructions that tell hardware what to do
- software applications are built on top of operating system, and tell operating system what to do in order to run the software
What are peripherals?
anything connected to your computer such as your mouse, keyboard, printer, etc.
What is a transistor?
semiconductor device used to amplify or switch electronic signals and electrical power
When is fairness a non-issue?
for unambiguous tasks
What is bias?
prejudice in favour of or against one thing, person, or group compared with another, usually in a way considered
to be unfair
When do we tend to see bias?
for ambiguous tasks
ie. shows higher hotel prices to Mac users as compared to PC users
ie. rate of getting misidentified by a facial recognition algorithm goes up as the colour of your skin becomes darker
What are unambiguous tasks?
tasks in which there is a clear input, output, and way to evaluate
What are ambiguous tasks?
tasks in which the input is variable, the mode of evaluation is subjective, and the output is clear or variable
Do computers have bias?
computers are susceptible to bias that humans have because humans are still the ones collecting data and feeding the algorithm
What is the Garbage In, Garbage Out (GIGO) principle?
flawed (nonsense) input data produces nonsense output
What classifier is fairest?
Max Profit Rationale
drops least amount of people, makes most money
What classifier is fairest?
Demographic Parity and Equal Opportunity Rationale
relatively higher profit while kind of decreasing gap between loan threshold of two different demographics
What classifier is fairest?
Equal Opportunity Rationale
has highest true positives (best)
What classifier is fairest?
Group Unaware Rationale
it’s unethical to discriminate based on something racial or similar types of information to that
Should sensitive information be considered?
yes - fairness through awareness
sometimes, in order to be
fair, it is important to make use of sensitive information…
What does fairness mean?
similar people are treated similarly