Part 5, diving into data Flashcards
What is the median?
This is a measure of location, also called a measure of central tendency. It is the middle value of the sorted data set, so half of the values lie at or below it and half at or above it.
What is the interquartile range?
This is a measure of spread, or dispersion. It tells us how spread out the middle 50% of the data is around the median.
It's found by splitting the sorted data set into four quarters and taking the range of the middle 50%, that is, the upper quartile minus the lower quartile.
What would the value of r be for each of these?
- strong positive correlation
- weak negative correlation
- no correlation
What does each of these correlation coefficients mean?
- r = 1 (a perfect positive correlation; a strong one is close to 1)
- r = -0.3
- r = 0
In the context of programming, what would we do at stage 3 (carry out the plan) of George Polya's problem-solving process?
At this stage we would write and test code that implements the algorithm.
What is the Python csv library for?
This is a built-in Python module that can read CSV files and turn each row into a list of strings. It can also write CSV files.
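As a minimal sketch of reading rows into lists, the example below feeds csv.reader an in-memory string instead of a real file. The sample data and the name sample are made up for illustration. Note that every value comes back as a string.

```python
import csv
import io

# Hypothetical CSV text standing in for the contents of a real file.
sample = io.StringIO("name,score\nAda,9\nLin,7\n")

# csv.reader yields one list of strings per row.
rows = list(csv.reader(sample))

print(rows)  # [['name', 'score'], ['Ada', '9'], ['Lin', '7']]
```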
What are the steps of the high-level algorithm for finding the median?
1. sort numbers
2. find length of list
3. if length is odd
3a. set median to middle number
4. otherwise
4a. find two middle numbers
4b. set median to mean of two middle numbers
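The steps above can be sketched in Python as follows. This is one possible implementation, not the only one. The function name median is chosen here for illustration, and the sketch assumes the list is not empty.

```python
def median(numbers):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(numbers)      # sort numbers
    n = len(ordered)               # find length of list
    middle = n // 2
    if n % 2 == 1:                 # if length is odd
        return ordered[middle]     # the median is the middle number
    # otherwise take the mean of the two middle numbers
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([7, 1, 5]))     # 5
print(median([7, 1, 5, 3]))  # 4.0
```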
Who was George Polya?
He was a Hungarian mathematician who studied the process of problem solving.
What is the Python open() function for?
This function opens a file and returns it as a file object.
What are the 4 steps of an algorithm to find the interquartile range?
- Find the median
- Split the sorted data into a lower half and an upper half, either side of the median
- Find the median of each half
- Subtract the lower median from the upper one.
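A rough sketch of these four steps is shown below. It uses statistics.median from the standard library. It also assumes that, when the number of values is odd, the median itself is left out of both halves; other conventions for computing quartiles exist and can give slightly different answers. The function name interquartile_range is illustrative, and the data must hold at least two values.

```python
from statistics import median


def interquartile_range(numbers):
    """Return the upper-half median minus the lower-half median."""
    ordered = sorted(numbers)
    n = len(ordered)
    half = n // 2                  # position of the median in the sorted list
    lower = ordered[:half]         # values below the median position
    # Assumption: for an odd count, skip the median value itself.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(upper) - median(lower)


print(interquartile_range([1, 3, 5, 7, 9, 11, 13]))  # 8
```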
What are heuristics?
These are methods of discovery or invention: rules of thumb that help us to solve problems.
In the context of programming, what would we do at stage 2 (make a plan) of George Polya's problem-solving process?
At this stage we would create an algorithm and apply heuristics to help solve any complex problems.
What are the four steps of problem solving that the Hungarian mathematician George Polya laid out?
- Understand the problem
- Make a plan
- Carry out the plan
- Look back
Name 7 heuristics.
- Break the problem into smaller sub-problems.
- Try to solve a simpler form of the problem.
- Think if there is a pattern you have seen before.
- Try working backwards from the solution.
- Try representing the problem in a different way.
- Don’t give up too quickly.
- Don’t be afraid to make mistakes.
What is the syntax for the Python open() function?
The syntax for this is
open(file, mode)
file = the path and name of the file
mode = a string defining which mode to open the file in, for example "r" to read, "w" to write or "a" to append.
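The short sketch below shows the file and mode arguments in use. It writes to a temporary directory so it can run anywhere. The file name example.txt and the text written are made up for illustration. The with statement is used so that each file is closed automatically.

```python
import os
import tempfile

# Hypothetical file path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

with open(path, "w") as f:   # "w": create the file, or overwrite it
    f.write("first line\n")

with open(path, "a") as f:   # "a": append to the end of the file
    f.write("second line\n")

with open(path, "r") as f:   # "r": read (the default mode)
    contents = f.read()

print(contents)
```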
What is the algorithm for including or excluding rows from a table given a condition?
1. initialise an empty table table1
2. for each row in table0
2a. if row satisfies condition, append row to table1
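As an illustration, the algorithm above might look like this in Python, with a table held as a list of rows. The names filter_rows and condition, and the sample data, are assumptions made for this sketch. The condition is passed in as a function that returns True for rows to keep.

```python
def filter_rows(table0, condition):
    """Return a new table holding the rows of table0 that satisfy condition."""
    table1 = []                  # initialise an empty table
    for row in table0:           # for each row in table0
        if condition(row):       # if row satisfies condition
            table1.append(row)   # append row to table1
    return table1


# Hypothetical data: each row is [name, score].
table0 = [["Ada", 9], ["Lin", 7], ["Sam", 4]]
passed = filter_rows(table0, lambda row: row[1] >= 5)

print(passed)  # [['Ada', 9], ['Lin', 7]]
```

To exclude rows instead of including them, pass the opposite condition, for example lambda row: row[1] < 5.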
In the context of programming, what would we do at stage 1 (understand the problem) of George Polya's problem-solving process?
At this stage we would write tests first to better understand the problem at hand.
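One way to picture "tests first" is sketched below for the median problem. The expected answers are written down as assertions before any code is designed. Here statistics.median from the standard library stands in for the function that would later be written, and the test cases themselves are illustrative choices.

```python
from statistics import median  # stand-in for the function still to be written


def test_median():
    # Each assertion records one thing we expect a correct median to do.
    assert median([5]) == 5              # a single value is its own median
    assert median([7, 1, 5]) == 5        # unsorted input, odd length
    assert median([7, 1, 5, 3]) == 4     # even length: mean of 3 and 5
    assert median([2, 2, 2, 2]) == 2     # repeated values


test_median()
print("all median tests passed")
```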

What is the formula to find the interquartile range?
Interquartile range = upper quartile - lower quartile (IQR = Q3 - Q1)
What is the correlation coefficient?
This is a number between -1 and +1, usually denoted by r. It tells us how closely the paired points of two data sets follow a straight trend line, and whether that line slopes up (positive r) or down (negative r).
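The sketch below shows one way r could be computed. It assumes the Pearson product-moment coefficient is the one meant here, which the card itself does not state. The function name and the sample data are illustrative. Both lists must be the same length, and neither may be constant.

```python
from math import sqrt


def correlation_coefficient(xs, ys):
    """Return Pearson's r for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of the deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each data set.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Points lying exactly on a rising line, then on a falling line.
print(correlation_coefficient([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0
print(correlation_coefficient([1, 2, 3, 4], [8, 6, 4, 2]))  # -1.0
```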
In the context of programming, what would we do at stage 4 (look back) of George Polya's problem-solving process?
At this stage we would look at any challenges overcome, any resources used, and what was learnt that can be applied in the future.