midterm 2 Flashcards
what are the formulas for regression?
- y = mx + b (slope intercept form of a line)
- m = (nΣxy - ΣxΣy) / (nΣx^2 - (Σx)^2) (slope of the line of best fit)
- b = ȳ - mx̄ (y-intercept of the line of best fit)
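A minimal Python sketch of the slope and intercept formulas above. The function name best_fit_line and the sample data are made up for illustration; they are not part of the course material.

```python
def best_fit_line(xs, ys):
    """Return (m, b) for the least-squares line y = mx + b."""
    n = len(xs)
    sum_x = sum(xs)                               # Σx
    sum_y = sum(ys)                               # Σy
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # Σxy
    sum_x2 = sum(x * x for x in xs)               # Σx^2
    # m = (nΣxy - ΣxΣy) / (nΣx^2 - (Σx)^2)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # b = ȳ - m x̄
    b = sum_y / n - m * sum_x / n
    return m, b


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
m, b = best_fit_line(xs, ys)
print(round(m, 4), round(b, 4))
```

For this sample data the slope works out to about 0.6 and the intercept to about 2.2.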
what is the formula for Spearman’s coefficient?
S = 1 - (6ΣD^2) / (n(n^2 - 1))
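A minimal Python sketch of this formula, assuming the two lists already hold ranks (not raw scores) and that there are no tied ranks. The function name and the sample ranks are made up for illustration.

```python
def spearman_from_ranks(ranks_a, ranks_b):
    """Spearman's coefficient from two lists of ranks (assumes no ties)."""
    n = len(ranks_a)
    # ΣD^2: square each rank difference, then add them up
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


# Hypothetical ranks given to five items by two judges.
ranks_a = [1, 2, 3, 4, 5]
ranks_b = [2, 1, 4, 3, 5]
print(round(spearman_from_ranks(ranks_a, ranks_b), 4))
```

Here ΣD^2 = 4 and n = 5, so the result is about 0.8. Identical rankings give 1, and fully reversed rankings give -1.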
what is the formula for exponential smoothing?
F(t+1) = αY(t) + (1 - α)F(t)
what is the formula to calculate simple averages?
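A minimal Python sketch that applies this formula one period at a time. The starting forecast is an assumption: the formula needs some F(1) to begin with, and here it is simply passed in. The function name and the sample numbers are made up for illustration.

```python
def exponential_smoothing(actuals, alpha, initial_forecast):
    """Return the forecasts F(1), F(2), ..., F(n+1) for n actual values."""
    forecasts = [initial_forecast]  # F(1) is an assumed starting point
    for y in actuals:
        # F(t+1) = αY(t) + (1 - α)F(t)
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts


# Hypothetical demand for three periods, with α = 0.5.
print(exponential_smoothing([100, 110, 120], 0.5, 100))
# prints: [100, 100.0, 105.0, 112.5]
```

The last number in the list is the forecast for the next, not yet observed, period.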
simple average = Sum of all data points / Number of data points
what is the formula to calculate moving averages?
moving average (e.g., 3 months) = (Sum of data points in the last 3 periods) / 3
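A minimal Python sketch of both averages. The function names, the default window of 3, and the sample data are made up for illustration.

```python
def simple_average(data):
    """Sum of all data points divided by the number of data points."""
    return sum(data) / len(data)


def moving_average(data, window=3):
    """Average of the last `window` data points."""
    if len(data) < window:
        raise ValueError("not enough data points for this window")
    return sum(data[-window:]) / window


# Hypothetical monthly values.
data = [10, 12, 14, 16, 18]
print(simple_average(data))      # prints: 14.0
print(moving_average(data, 3))   # prints: 16.0
```

The simple average uses all five values, while the 3-period moving average uses only the last three (14, 16, 18).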
how do you calculate sum of squared errors (SSE)?
- find the errors - for each data point, subtract the forecasted/line of best fit value from the observed value: Error(i) = Actual(i) - Predicted(i)
- square each error: Error(i)^2
- sum the squared errors: SSE = Σ(Error(i))^2
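The three steps above can be sketched in a few lines of Python. The function name and the sample values are made up for illustration.

```python
def sum_squared_errors(actual, predicted):
    """SSE: subtract, square, and sum over every data point."""
    total = 0
    for a, p in zip(actual, predicted):
        error = a - p        # step 1: Error(i) = Actual(i) - Predicted(i)
        total += error ** 2  # steps 2 and 3: square it and add it to the sum
    return total


# Hypothetical observed and forecasted values.
actual = [10, 12, 15]
predicted = [11, 12, 13]
print(sum_squared_errors(actual, predicted))  # prints: 5
```

The errors are -1, 0, and 2, so the squared errors are 1, 0, and 4, which sum to 5.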
what is the formula for the correlation coefficient?
r = (nΣxy - ΣxΣy) / √[(nΣx^2 - (Σx)^2)(nΣy^2 - (Σy)^2)]
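A minimal Python sketch of the correlation coefficient, with the numerator and the square-root denominator computed separately. The function name and the sample data are made up for illustration.

```python
import math


def correlation(xs, ys):
    """Correlation coefficient r for paired x and y values."""
    n = len(xs)
    sum_x = sum(xs)                               # Σx
    sum_y = sum(ys)                               # Σy
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # Σxy
    sum_x2 = sum(x * x for x in xs)               # Σx^2
    sum_y2 = sum(y * y for y in ys)               # Σy^2
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Hypothetical data, for illustration only.
print(round(correlation([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]), 4))
```

For this sample data r is about 0.77. Data that lies exactly on a rising straight line, such as x = 1, 2, 3 with y = 2, 4, 6, gives r = 1.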
what are variables?
quantities whose values can change
what does y represent?
the value on the vertical axis (y-axis)
what does x represent?
the value on the horizontal axis (x-axis)
what does m represent?
the slope of the line of best fit (how much y changes when x increases by 1 unit)
what does b represent?
the y-intercept of the line (where the line crosses the y-axis when x = 0)
what does n represent?
the number of data points
what does Σx mean?
the sum of all the x-values
what does Σy mean?
the sum of all the y-values
what does Σxy mean?
the sum of the product of each pair of x and y values (multiply each x-value by its corresponding y-value, then add up all those results)
what does Σx^2 mean?
the sum of all the squares of x-values (for each x-value, square it, then add up all those values)
what does Σy^2 mean?
the sum of all squares of y-values (for each y-value, square it, then add up all those values)
what does (Σx)^2 mean?
the square of the sum of the x-values (add up all the x-values, then square the total)
what does (Σy)^2 mean?
the square of the sum of the y-values (add up all the y-values, then square the total)
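Σx^2 and (Σx)^2 are easy to mix up, so here is a tiny Python illustration of the difference, using made-up values. The same distinction applies to Σy^2 and (Σy)^2.

```python
xs = [1, 2, 3]  # hypothetical x-values

sum_of_squares = sum(x ** 2 for x in xs)  # Σx^2: 1 + 4 + 9
square_of_sum = sum(xs) ** 2              # (Σx)^2: (1 + 2 + 3)^2

print(sum_of_squares, square_of_sum)      # prints: 14 36
```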
what does ȳ represent?
the average of all y-values
what does x̄ represent?
the average of all x-values
what does S represent?
Spearman’s rank correlation coefficient (always between -1 and 1)
what does it mean if S = 1?
there’s a perfect positive relationship between the two sets of ranks (the two rankings are identical, so as one variable’s rank goes up, the other does as well)
what does it mean if S = -1?
there’s a perfect negative relationship between the two ranks (as one rank goes up, the other goes down)
what does it mean if S = 0?
there’s no relationship between the ranks
what does D represent?
the difference between the ranks of each pair of items
what does ΣD^2 mean?
the sum of all squared rank differences (for each item, find the rank difference, square it, and then add up all those squared values)
what does n(n^2 - 1) mean?
it is a normalizing factor that ensures S falls between -1 and 1
what does it mean if S is close to 1?
the two ranks are very similar (a strong positive correlation)
what does it mean if S is close to -1?
the two ranks are nearly opposite (a strong negative correlation: as one rank goes up, the other tends to go down)
what does it mean if S is near 0?
there’s little to no rank correlation
what does F(t+1) represent?
the forecasted value for the next period (or future data point)
what does α represent?
the smoothing constant (a value between 0 and 1 that controls how much weight is given to the most recent data points)
what does a higher α (closer to 1) mean?
more weight is given to the recent data point, making the forecast more volatile
what does a lower α (closer to 0) mean?
makes the forecast rely more on the previous forecasted value, making the forecast more stable
what does Y(t) represent?
the actual value observed for the current period (the latest data you have)
what does F(t) represent?
the forecast that was made for the current period (the previous forecast), which provides a baseline from past predictions
what is the line of best fit?
a straight line that best represents the data on a scatter plot; it is used to show the relationship between two variables
what is the formula to calculate line of best fit?
y = mx + b
what is exponential smoothing?
forecasts the next period as a weighted blend of the latest actual value and the previous forecast; the smoothing constant (α) sets the weight given to recent data and, therefore, the forecast’s stability
what is a moving average?
calculates the average of a fixed number of past data points to smooth out short-term fluctuations and highlight long-term trends
what is regression analysis?
uses relationships between variables to project future values
what is the sum of squared errors (SSE)?
a measure of the accuracy of a forecast/trend line by showing how much the data points deviate from the predicted values
what does a low SSE indicate?
the forecast/model is a good fit for the data
how do you calculate the inter-relation of variables?
calculate the correlation coefficient (r)
what does r represent?
the correlation coefficient (used to measure the inter-relation of variables)
what does it mean if r = 1?
perfect positive correlation between the variables
what does it mean if r = -1?
perfect negative correlation between the variables
what does it mean if r = 0?
no correlation between the variables
what is the Excel function to calculate slope?
=SLOPE(known_y’s, known_x’s)
what is the Excel function to calculate intercept?
=INTERCEPT(known_y’s, known_x’s)