Regression Flashcards

1
Q

what is regression?

A

used to estimate equations that describe the relationship between variables, such as demand and cost functions, in the form Y = f(X) OR Y = f(X1, X2 … XN)

we use it to understand the impact of the regressor X, or the independent causal variable, on any one specific regressand Y, or the dependent affected variable

There will always be a SET of causal variables (X1, X2 … XN) impacting the dependent variable (Y), but for this class you only need to understand how to study the impact of any ONE X on Y at a time.

converts data clumps into comprehensible information.

2
Q

differentiate between X and Y

A

1) independent VS dependent variable
2) regressor VS regressand
3) predetermined VS determined variable
4) serves as cause/stimulant VS effect/consequence

3
Q

define ‘stochastic variable’. why is Y also known as this?

A

stochastic = random variable that can assume more than one value due to chance, AKA probabilistic variable

  • the value of Y is not fixed and depends on the outcome of a random process, e.g. rolling a die or random selection using a computer-generated number
4
Q

describing the functional relationship Y = f(X)

A
  • either positive/direct, where Y rises as X rises (in the simplest case Y is directly proportional to X: Y ∝ X)
  • or negative/indirect/inverse, where Y falls as X rises (in the simplest case Y is inversely proportional to X: Y ∝ 1/X)
  • positive has positive slope and upward sloping trajectory; negative has negative slope and downward sloping trajectory
5
Q

converting regression data into a linear equation

A
  • the equation will indicate the curve’s TRAJECTORY (sign of gradient) and VALUE OF SLOPE (value of gradient)
  • Y = mX + h (population regression function PRF) OR Y^ = m^X + h^ (sample regression function SRF)
6
Q

significance of h

A

h is the Y-intercept: the value of Y when X is zero, i.e. the part of Y that does not depend on X (Y in the absence of X)

7
Q

definition of ceteris paribus

A

with all else in a state of dormancy/passivity, NOT permanent constancy (the other variables can change; you are only holding them constant for the sake of observation)

8
Q

what is the best way to collect data, and why?

A

randomisation, because it ensures all elements in the population have an equal chance of being selected and thus minimises selection bias and improves both representativeness and generalisability

9
Q

format of functional fractions

A

effect / cause

this is done to examine the proportion of an effect RELATIVE to a specific cause

link to dy/dx - we want to see how much y changes after x changes by a certain amount

10
Q

why do we call it the X and Y axes?

A
  • X axis measures ONLY the cause; the cause is represented by the variable X
  • Y axis measures ONLY the effect; the effect is represented by the variable Y

11
Q

trajectory VS slope

A

trajectory = direction and has NO VALUE; represented by the sign accompanying the value of the slope

slope = angle or gradient ('slopedness') and has VALUE

12
Q

relationship between elasticity and slope of a curve

A

elasticity is reflected in the slope of the curve (with price on the Y axis) - at a given price and quantity, the greater (steeper) the slope, the more price INELASTIC
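
A minimal sketch of the link between slope and elasticity, assuming the usual point-elasticity formula (dQ/dP) × (P/Q) and a demand curve drawn with price on the Y axis. The function name and the numbers are hypothetical, not from the class notes.

```python
def point_elasticity(slope_dp_dq, price, quantity):
    """Point price elasticity of demand at (quantity, price).

    slope_dp_dq is the slope of the demand curve as drawn
    (price on the Y axis), so dQ/dP is its reciprocal.
    """
    return (1.0 / slope_dp_dq) * (price / quantity)

# Two hypothetical demand curves passing through the same point.
price, quantity = 10.0, 5.0
flat = point_elasticity(-0.5, price, quantity)   # gentle slope
steep = point_elasticity(-4.0, price, quantity)  # steep slope

print(flat, steep)  # flat is elastic (|E| > 1), steep is inelastic (|E| < 1)
```

At the same price and quantity, the steeper curve gives the smaller elasticity in absolute value, which is the sense in which a greater slope means more inelastic.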

13
Q

n VS N

A

n (sample size) is a SUBSET of N (population size)

the use of n will give you a result that is only an ESTIMATE

as n tends to N, estimation tends to certainty

14
Q

conditions for n to ensure quality of statistical analysis

A

1) n>30
2) n must make up at LEAST 10% of N

15
Q

hat notation

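A

a hat (^) on a symbol marks an estimate calculated from sample data, e.g. Y^, m^ and h^ in the sample regression function (SRF), as opposed to the population values Y, m and h in the PRF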
16
Q

definition of goodness of fit

A

how well the estimated regression line fits the observed sample data; measured by R^2 (coefficient of determination), which lies between 0 and 1
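
A minimal sketch of one common way to compute R^2, assuming the standard definition R^2 = 1 - (sum of squared residuals) / (total sum of squares). The data and the fitted line (m^ = 0.6, h^ = 2.2) are hypothetical.

```python
# Hypothetical sample and a line already fitted to it.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
m_hat, h_hat = 0.6, 2.2

y_bar = sum(ys) / len(ys)
predictions = [m_hat * x + h_hat for x in xs]

# Variation left unexplained by the line.
ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predictions))
# Total variation of Y around its mean.
ss_total = sum((y - y_bar) ** 2 for y in ys)

r_squared = 1 - ss_residual / ss_total
print(round(r_squared, 4))
```

A value near 1 means the line accounts for most of the variation in Y; a value near 0 means it accounts for very little.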

17
Q

3 methods of regression

A

1) OLS
2) Time-series data
3) Exponential/LOGIT

18
Q

OLS method

A

Ordinary Least Squares

  • is used to find the line of best fit (or OLS regression line), i.e. the line for which the sum of the ‘squared gaps’ between the points on your scatter plot and the line is as small as possible
  • line closest to all the points in a fair way!
  • measure each gap, square the gaps (to ensure no negative and positive numbers cancel out each other) and add all the squared gaps to get a ‘total gap score’. You adjust the line until this score is minimised, which means you have found the line of best fit
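
The 'total gap score' idea can be sketched as follows. This is an illustration under stated assumptions: the data are hypothetical, and instead of adjusting the line by trial and error it uses the standard closed-form OLS formulas for m^ and h^, which give the same minimising line.

```python
# Hypothetical sample data: X = cause, Y = effect.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

def ols_fit(xs, ys):
    """Return (m_hat, h_hat) that minimise the sum of squared gaps."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m_hat = sxy / sxx
    h_hat = y_bar - m_hat * x_bar
    return m_hat, h_hat

def total_gap_score(xs, ys, m, h):
    """Sum of squared vertical gaps between each point and the line."""
    return sum((y - (m * x + h)) ** 2 for x, y in zip(xs, ys))

m_hat, h_hat = ols_fit(xs, ys)
best = total_gap_score(xs, ys, m_hat, h_hat)
# Nudging the slope away from the OLS value raises the score.
nudged = total_gap_score(xs, ys, m_hat + 0.1, h_hat)

print(m_hat, h_hat, best, nudged)
```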
19
Q

time-series regression

A

using past patterns to estimate future data

20
Q

for time-series regression, why is using the median year convenient? (for examples used in class, ONLY)

A

in the cases done in class, coding the median year as X = 0 makes sigma X = 0, which results in several parts of m’s formula cancelling out -> simplifies calculation
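
A minimal sketch of why sigma X = 0 helps, assuming m is computed with the standard formula m = (n ΣXY - ΣX ΣY) / (n ΣX² - (ΣX)²). The years and sales figures are hypothetical and an odd number of years is assumed so that a single median year exists.

```python
# Hypothetical time series.
years = [2019, 2020, 2021, 2022, 2023]
sales = [10.0, 12.0, 13.0, 15.0, 18.0]

# Code each year as its distance from the median year, so X sums to zero.
median_year = sorted(years)[len(years) // 2]
xs = [year - median_year for year in years]  # -2, -1, 0, 1, 2

n = len(xs)
sum_x = sum(xs)
sum_y = sum(sales)
sum_xy = sum(x * y for x, y in zip(xs, sales))
sum_x2 = sum(x * x for x in xs)

# Full formulas.
m_full = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
h_full = (sum_y - m_full * sum_x) / n

# With sum_x == 0 the formulas collapse to these shortcuts.
m_short = sum_xy / sum_x2
h_short = sum_y / n

print(m_full, m_short, h_full, h_short)
```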

21
Q

for time-series regression, forecast estimates for how many years into the future are considered unreliable?

A

more than 5 years into the future - we DO NOT KNOW what will happen by that time!

no similar issue for precasts because we already know what has happened in the past and can use this knowledge to explain our results and adjust for various factors

22
Q

exponential regression

A

used when the data changes (increases or decreases) exponentially rather than linearly, i.e. Y is multiplied by the same factor for each step in X instead of changing by the same amount each time along a straight line
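
One common way to fit such data, sketched here as an assumption rather than the method used in class, is to model Y = a × b^X, take logs so that ln Y is linear in X, and run an ordinary straight-line fit on (X, ln Y). The data are hypothetical.

```python
import math

# Hypothetical data that doubles with each step in X.
xs = [0, 1, 2, 3, 4]
ys = [2.0, 4.0, 8.0, 16.0, 32.0]

# ln Y = ln a + X * ln b is a straight line in X.
log_ys = [math.log(y) for y in ys]

n = len(xs)
x_bar = sum(xs) / n
ly_bar = sum(log_ys) / n
slope = (sum((x - x_bar) * (ly - ly_bar) for x, ly in zip(xs, log_ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = ly_bar - slope * x_bar

# Convert back from logs to the exponential form.
a = math.exp(intercept)  # starting level when X = 0
b = math.exp(slope)      # growth factor per unit of X

print(a, b)
```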

23
Q

how to describe concave to x upwards, downwards and convex to x upwards, downwards

A

1) concave upwards - for more of X, we get less and less of Y (Y increases at a decreasing rate)

2) concave downwards - for more of X, we give up more and more of Y (Y decreases at an increasing rate)

3) convex upwards - for more of X, we get more and more of Y (Y increases at increasing rate)

4) convex downwards - for more of X, we give up less and less of Y (decreases at decreasing rate)

the change in Y (delta Y) increases/decreases at an increasing/decreasing rate for the same increase in X
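
The four cases can be told apart from a table of values by looking at delta Y for equal steps in X. The sketch below is illustrative only: the helper name and the four example curves are hypothetical, and it assumes Y moves in one direction throughout.

```python
def describe(ys):
    """Return (direction, rate) for Y values taken at equal steps of X."""
    deltas = [b - a for a, b in zip(ys, ys[1:])]
    direction = "increases" if all(d > 0 for d in deltas) else "decreases"
    sizes = [abs(d) for d in deltas]
    # Are successive changes in Y getting bigger or smaller in size?
    growing = all(later > earlier for earlier, later in zip(sizes, sizes[1:]))
    rate = "increasing" if growing else "decreasing"
    return direction, rate

xs = [1, 2, 3, 4, 5]
case_1 = describe([x ** 0.5 for x in xs])   # more X, less and less extra Y
case_2 = describe([-(x ** 2) for x in xs])  # more X, give up more and more Y
case_3 = describe([x ** 2 for x in xs])     # more X, more and more extra Y
case_4 = describe([1 / x for x in xs])      # more X, give up less and less Y

for case in (case_1, case_2, case_3, case_4):
    print("Y %s at a(n) %s rate" % case)
```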

24
Q

what is a learning curve?

A
  • the time it takes for an employee to learn the systems involved in the production process
  • visual representation of the relationship between task proficiency and experience in that job
  • based on the premise that individuals need time to become proficient at something new
  • therefore businesses need to invest in training to obtain a certain target output - over time, the trainee learns, becomes more efficient and therefore more productive
  • applied to business, the learning curve represents the relationship between cost and output
25
Q

describe a learning curve

A

X - number of attempts at learning
Y - performance measure
- slow-paced, fast-paced and plateau phase

26
Q

types of learning curves

A

1) diminishing returns - illustrates tasks quick to learn and early to plateau; manual tasks

2) increasing returns - tasks difficult to learn at first but rate of return is significant after some time; operating sophisticated instrument

3) sigmoid - AKA increasing-decreasing return curve; represents tasks that are difficult to learn initially, with learning beginning to plateau once proficiency is obtained

4) complex - learning trajectory traced over a long time period where the individual may experience a temporary belief of mastery, only to discover there is more to learn

27
Q

what is inflexion?

A

the point at which the curve changes concavity: the rate of change switches from increasing to decreasing (or vice versa) while the trajectory can stay the same. The curve is neither concave up nor concave down at that exact point

28
Q

second derivatives and inflexion

A

f''(x)>0 at a stationary point means minima (concave up - smile)
f''(x)<0 at a stationary point means maxima (concave down - frown)
f''(x)=0 means inflexion MAY occur at that value of x. (f''(x) must change sign, from positive to negative or from negative to positive, as you pass through this point. If no change in sign, it is NOT an inflexion point and merely a stationary/flat region)
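
A small check of these rules on two hypothetical functions, f(x) = x³ - 3x (second derivative 6x) and g(x) = x⁴ (second derivative 12x²). The second derivatives are written out by hand, and the sign-change test simply samples a point just to the left and just to the right; this is a sketch, not a general-purpose routine.

```python
def f_second(x):
    """Second derivative of f(x) = x**3 - 3*x."""
    return 6 * x

def g_second(x):
    """Second derivative of g(x) = x**4."""
    return 12 * x ** 2

def classify(second, x, step=1e-3):
    """Apply the second-derivative rules at the point x."""
    value = second(x)
    if value > 0:
        return "concave up (minimum if the first derivative is zero)"
    if value < 0:
        return "concave down (maximum if the first derivative is zero)"
    # Second derivative is zero: check for a sign change around x.
    left, right = second(x - step), second(x + step)
    if left * right < 0:
        return "inflexion"
    return "not an inflexion"

at_plus_one = classify(f_second, 1)    # stationary point of f
at_minus_one = classify(f_second, -1)  # stationary point of f
f_at_zero = classify(f_second, 0)      # sign changes from - to +
g_at_zero = classify(g_second, 0)      # stays positive on both sides

print(at_plus_one, at_minus_one, f_at_zero, g_at_zero, sep="\n")
```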

29
Q

why don’t parabolas have inflexion points?

A
  • inflexion occurs where the curve changes its concavity, but parabolas are always either fully concave up or fully concave down
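
The same point can be shown with the second derivative, writing a general parabola with the usual (assumed) coefficient names a, b and c:

```latex
f(x) = ax^2 + bx + c, \quad a \neq 0
\qquad\Rightarrow\qquad
f'(x) = 2ax + b
\qquad\Rightarrow\qquad
f''(x) = 2a
```

Since f''(x) = 2a is a non-zero constant, it never equals zero and never changes sign, so there is no value of x at which inflexion can occur.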
30
Q

define minima and maxima points

A

maxima - highest point on curve FOLLOWING WHICH there is a fall

minima - lowest point on curve FOLLOWING WHICH there is an increase