Midterm \ Flashcards
Independent Variable
Variable used to explain the values of the dependent variable; denoted by X
Linear Regression
Regression analysis where relationships between the independent and dependent variables are approximated by a straight line
Simple linear regression
regression analysis that involves one dependent and one independent variable
Multiple Linear Regression
regression analysis involving one dependent variable and more than one independent variable
Dummy Variable
a variable used to model the effect of categorical independent variables in a regression model
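A minimal sketch of dummy coding, assuming the common convention of k-1 dummy variables for a categorical variable with k categories (one category serves as the baseline). The function name `make_dummies` and the `regions` data are hypothetical, chosen only for illustration.

```python
def make_dummies(values, baseline):
    """Encode a categorical column as 0/1 dummy variables, omitting the baseline."""
    # One dummy per category other than the baseline (k - 1 dummies in total).
    levels = sorted(set(values) - {baseline})
    return [{f"is_{level}": int(v == level) for level in levels} for v in values]

# Hypothetical categorical independent variable with three categories
regions = ["east", "west", "south", "east"]
rows = make_dummies(regions, baseline="east")
# An "east" observation has every dummy equal to 0.
```

Each row can then be used as a set of numeric independent variables in a regression model.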
Least Squares Method
a procedure for using sample data to find the estimated regression equation
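As a rough illustration, the sketch below fits a simple linear regression by least squares using the standard closed-form formulas for the slope and intercept. The function name and the sample data are assumptions made for this example.

```python
def fit_simple_linear_regression(x, y):
    """Return (b0, b1) that minimize the sum of squared residuals for y ~ b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical sample data lying exactly on y = 1 + 2x
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
b0, b1 = fit_simple_linear_regression(x, y)  # b0 = 1.0, b1 = 2.0
```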
Business Analytics
The scientific process of transforming data into insight for making better decisions
What are the 3 types of analytics?
Descriptive, Predictive, Prescriptive
What is descriptive analytics?
Analytics that describes what has happened (e.g., queries, reports, data mining)
What is predictive analytics?
Techniques that use models constructed from past data to predict the future
What is prescriptive analytics?
Techniques that analyze input data and yield the best course of action.
What are some business analytics examples?
Financial, human resources (HR), marketing, healthcare, and supply chain analytics
What are the steps in the decision-making process?
(1) identify and define the problem,
(2) determine the criteria that will be used to evaluate alternative solutions,
(3) determine the set of alternative solutions,
(4) evaluate the alternatives, and
(5) choose an alternative.
Cluster Analysis
the goal of clustering is to organize observations into smaller groups based on observable variables.
What is k-means clustering?
process of organizing observations into one of K groups based on a measure of similarity
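One common way to organize observations into K groups is k-means (Lloyd's algorithm). The sketch below assumes Euclidean distance as the similarity measure and takes the starting centroids as an argument; the function name, data, and starting centroids are hypothetical.

```python
import math

def k_means(points, centroids, max_iter=100):
    """Alternate assignment and centroid-update steps until centroids stop moving."""
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each observation joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical observations and starting centroids, with K = 2
points = [(1, 1), (1, 2), (8, 8), (9, 8)]
centroids, clusters = k_means(points, [(0, 0), (10, 10)])
```

In practice the result can depend on the starting centroids, so the algorithm is often run from several starting points.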
What is hierarchical clustering?
Process of agglomerating observations into a series of nested groups based on a measure of similarity.
What is Euclidean distance?
Geometric measure of dissimilarity between observations based on the Pythagorean theorem.
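A short sketch of the calculation, using two hypothetical observations measured on two variables:

```python
import math

def euclidean_distance(u, v):
    """Square root of the sum of squared differences across variables."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical observations
u = (1, 2)
v = (4, 6)
d = euclidean_distance(u, v)  # sqrt(3**2 + 4**2) = 5.0
```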
Jaccard distance
Measure of dissimilarity between observations based on Jaccard’s coefficient.
Matching distance
Measure of dissimilarity between observations based on the matching coefficient.
Matching coefficient
Measure of similarity between observations based on the number of matching values of categorical variables.
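The sketch below works through the matching and Jaccard measures for two hypothetical observations on five binary (0/1) variables. It assumes the usual definitions: the matching coefficient counts all agreeing variables, Jaccard's coefficient ignores variables where both observations are 0, and each distance is 1 minus the corresponding coefficient.

```python
def matching_coefficient(u, v):
    """Share of variables on which the two observations have the same value."""
    matches = sum(1 for a, b in zip(u, v) if a == b)
    return matches / len(u)

def jaccard_coefficient(u, v):
    """Share of matching 1s among variables that are not 0 in both observations."""
    both_one = sum(1 for a, b in zip(u, v) if a == 1 and b == 1)
    both_zero = sum(1 for a, b in zip(u, v) if a == 0 and b == 0)
    # Undefined (division by zero) when every variable is 0 in both observations.
    return both_one / (len(u) - both_zero)

# Hypothetical binary observations
u = [1, 0, 1, 0, 0]
v = [1, 1, 0, 0, 0]

matching_distance = 1 - matching_coefficient(u, v)  # 1 - 3/5 = 0.4
jaccard_distance = 1 - jaccard_coefficient(u, v)    # 1 - 1/3, about 0.667
```

Jaccard's version is often preferred when a shared 0 (for example, neither customer bought an item) should not count as evidence of similarity.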
Manhattan distance
Measure of dissimilarity between two observations based on the sum of the absolute differences in each variable dimension.
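A short sketch of the calculation, using two hypothetical observations measured on two variables:

```python
def manhattan_distance(u, v):
    """Sum of absolute differences across variables."""
    return sum(abs(a - b) for a, b in zip(u, v))

# Hypothetical observations
d = manhattan_distance((1, 2), (4, 6))  # |1 - 4| + |2 - 6| = 7
```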
Single linkage
Method of calculating dissimilarity between clusters by considering only the two most similar observations between the two clusters.
Complete linkage
Method of calculating dissimilarity between clusters by considering only the two most dissimilar observations between the two clusters.
Group average linkage
Method of calculating dissimilarity between clusters by considering the average distance between each pair of observations between two clusters.
Centroid linkage
Method of calculating dissimilarity between clusters by considering the two centroids of the respective clusters.
Median linkage
Method that computes the similarity between two clusters as the median of the similarities between each pair of observations in the two clusters.
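To compare the linkage methods side by side, the sketch below computes single, complete, group average, and centroid linkage for two small hypothetical clusters. It assumes Euclidean distance between observations; median linkage is left out. All function names are illustrative.

```python
import math
from itertools import product

def pairwise_distances(cluster_a, cluster_b):
    """Distances between every observation in one cluster and every one in the other."""
    return [math.dist(p, q) for p, q in product(cluster_a, cluster_b)]

def single_linkage(a, b):
    return min(pairwise_distances(a, b))   # two most similar observations

def complete_linkage(a, b):
    return max(pairwise_distances(a, b))   # two most dissimilar observations

def group_average_linkage(a, b):
    d = pairwise_distances(a, b)
    return sum(d) / len(d)                 # average over all cross-cluster pairs

def centroid(cluster):
    """Mean of the cluster's observations on each variable."""
    return tuple(sum(c) / len(cluster) for c in zip(*cluster))

def centroid_linkage(a, b):
    return math.dist(centroid(a), centroid(b))

# Hypothetical clusters of two observations each
cluster_a = [(0, 0), (0, 1)]
cluster_b = [(3, 0), (4, 0)]

single = single_linkage(cluster_a, cluster_b)          # 3.0
complete = complete_linkage(cluster_a, cluster_b)      # sqrt(17), about 4.12
average = group_average_linkage(cluster_a, cluster_b)
centroid_d = centroid_linkage(cluster_a, cluster_b)    # sqrt(12.5), about 3.54
```

In agglomerative hierarchical clustering, the two clusters with the smallest linkage value are the ones merged at each step.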