Chemometrics quizzes Flashcards
Difference between nominal and ordinal variable
Nominal is categorical and can't be ranked; ordinal is also categorical (often coded as a number) but can be ranked
What's the difference between discrete and continuous variables
Discrete variables take specific, countable values (like integers); continuous variables can take any real value in between
Inferential vs descriptive stats?
Inferential stats make an inference based on the data (analyze it, draw conclusions) - descriptive stats just describe it (eg mean, median, mode)
What does the null hypothesis mean and what is the alternative hypothesis
Null is that there is no significant difference between the groups or means - alternative is that there is
What is a two-tailed t test
A t test that looks for a difference in either direction (both greater than and less than)
Why do we use post hoc tests with ANOVA
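To see specifically which groups differ - ANOVA only says at least one group differs, post hoc tests show which pairs of groups have a statistically significant difference
With a Bonferroni adjusted p value - what do you use to adjust the p value
The number of tests/comparisons (multiply each p value by it, or equivalently divide alpha by it)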
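Quick sketch of the Bonferroni adjustment from the card above, in plain Python. The p values are made up and the function name `bonferroni_adjust` is just for illustration, not from any library.

```python
def bonferroni_adjust(p_values):
    """Multiply each raw p value by the number of comparisons, capped at 1."""
    m = len(p_values)  # number of tests/comparisons
    return [min(1.0, p * m) for p in p_values]

raw_p = [0.010, 0.020, 0.300]  # hypothetical p values from three pairwise tests
adjusted = bonferroni_adjust(raw_p)
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

Note the second comparison looks significant before adjustment (0.020) but not after.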
In two way anova how many factors do you have
2
When checking for normal distribution - is p < or > 0.05 for normal
p > 0.05 (the normality test does not reject the null that the data are normally distributed)
Know how to name the parts of a box and whisker plot
We have the whiskers and the interquartile range (the box)
We also have min, max, lower quartile, upper quartile and median
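A small sketch of the numbers behind a box and whisker plot, using Python's standard `statistics` module. Quartile conventions differ between packages; the `"inclusive"` method here is just one choice, and the data are made up.

```python
import statistics

def five_number_summary(values):
    """Return min, lower quartile, median, upper quartile, max."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical measurements
low, q1, med, q3, high = five_number_summary(data)
iqr = q3 - q1  # interquartile range = length of the box
print(low, q1, med, q3, high, iqr)
```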
What is a correlation matrix
It's a matrix showing the correlation between all pairwise combinations of variables
What is one-way repeated measures ANOVA
When the same subjects are measured more than once (eg same subject but different time points)
With correlations what does the magnitude of the correlation describe
Strength of the relationship (the sign gives the direction)
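To see magnitude vs sign, here is a hand-rolled Pearson correlation in plain Python (data made up, `pearson_r` is an illustrative helper). Flipping the sign of y flips the sign of r but leaves its magnitude unchanged.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y_up = [2.1, 3.9, 6.2, 7.8, 10.1]   # hypothetical: rises with x
y_down = [-v for v in y_up]          # same strength, opposite direction

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
print(r_up, r_down)
```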
Name 2 ways 1st order polynomial regressions differ from 2nd order
Different degrees of freedom; quadratic has an extra coefficient (the c term); quadratic fit is a curve, not a straight line
What is a residual and when they are all summed what do they equal
Vertical distance of each point from the best-fit line - all summed up they equal 0 (for a least-squares fit with an intercept)
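A minimal check of the residual card: fit y = mx + b by ordinary least squares in plain Python and sum the residuals. The data and the helper name `fit_line` are made up for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = m*x + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return slope, intercept

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.2, 3.8, 6.1, 8.3, 9.6]  # hypothetical measurements

m, b = fit_line(x, y)
residuals = [yi - (m * xi + b) for xi, yi in zip(x, y)]  # observed minus fitted
print(m, b, sum(residuals))  # the sum is zero up to floating-point rounding
```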
If the slope is not statistically significant (p > 0.05) what does this mean - if the overall model has a p value that is not significant what does this mean
If slope not significant, there is no evidence of a relationship between x and y (can't reject b = 0)
If model not significant - it doesn't effectively predict y
What is the difference between two-way ANOVA and MANOVA
MANOVA has more than one dependent variable
What is the equation for linear regression
y = mx + b
Things to check on influence plot
Outliers, leverage points and influence
4 assumptions for linear regression
Error in x is negligible; the dependent variable needs to be normally distributed; variance of the error across y should be constant; x and y are continuous and the observations are independent
What is the parametric equivalent of the Mann-Whitney U test
2-sample independent t test
What is the non-parametric equivalent of the one-way ANOVA test
Kruskal-Wallis
Difference between supervised and unsupervised learning
Supervised - we know the outcome and this informs the model - unsupervised is only given the data, with no existing info or input about the outcome
What are the two things used to calculate PCA scores
Magnitude (concentration) and influence (variance)
Difference between PCA scores plot and PCA loadings plot
Scores plot - shows the PC scores for each sample (often colored by group) on a 2D plane
Loadings plot - shows the individual features within the whole experiment and which influence the PCs the most
How do robust methods work
They use the median or other estimators that aren't as affected by outliers
What is MAD
Median absolute deviation - a way to describe variation in a data set with outliers - it is the median of the absolute distances from the median
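A short sketch of MAD next to the ordinary standard deviation, using Python's `statistics` module. The data are made up, with one obvious outlier, and no scaling constant is applied to the MAD here.

```python
import statistics

def mad(values):
    """Median of the absolute distances from the median."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])

data = [9.8, 10.0, 10.1, 10.2, 10.3, 25.0]  # hypothetical, 25.0 is an outlier

print(mad(data))               # barely moved by the outlier
print(statistics.stdev(data))  # inflated by the outlier
```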
Before you do PCA - what do you do
Scale/transform the data - normalize
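One common way to do this is autoscaling (mean-center each variable and divide by its standard deviation), so variables in big units don't dominate the PCs. This is a sketch of that one option only; the column names and values are hypothetical.

```python
import statistics

def autoscale(column):
    """Mean-center a variable and scale it to unit standard deviation."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)
    return [(v - mean) / sd for v in column]

conc_ppm = [120.0, 150.0, 180.0, 210.0]  # hypothetical variable, large numbers
ph = [6.8, 7.0, 7.1, 7.3]                # hypothetical variable, small numbers

scaled = [autoscale(col) for col in (conc_ppm, ph)]
for col in scaled:
    print(statistics.mean(col), statistics.stdev(col))  # about 0 and 1
```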
what is PARSIMONY
Getting to the core explanation of a system with the least amount of info
What test do you run for sig relationship between categorical variables
chi squared
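A bare-bones sketch of the chi-squared statistic for a contingency table (no continuity correction, and no p value lookup). The counts and the function name are made up; for 1 degree of freedom the 0.05 critical value is 3.841.

```python
def chi_squared(table):
    """Chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count expected if independent
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

counts = [[20, 30],   # hypothetical: group A, outcome yes / no
          [30, 20]]   # hypothetical: group B, outcome yes / no
stat, dof = chi_squared(counts)
print(stat, dof, stat > 3.841)
```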
What's the difference between logistic and Poisson regression
Poisson - dependent variable is counts; logistic - dependent variable is categorical (eg binary)
4 ingredients in machine learning
A model
a loss function
a way to improve the model (optimization)
and data
What does PLS-DA stand for
Partial least squares discriminant analysis
What is agglomerative clustering
Each observation starts as its own cluster - the closest clusters are joined together until there is one cluster
What does height in a dendrogram indicate
Order in which clusters joined (the height is the distance at which they merged)
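A toy sketch of agglomerative clustering on 1-D points, recording the distance at each merge (these are the heights a dendrogram would draw). Single linkage is assumed here just to keep it short; the points are made up.

```python
def single_linkage(points):
    """Merge the two closest clusters until one remains; return the merge log."""
    clusters = [[p] for p in points]  # every observation starts as its own cluster
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single linkage: distance between the closest members
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges

merges = single_linkage([1.0, 1.5, 5.0, 5.2, 9.0])  # hypothetical observations
heights = [d for _, _, d in merges]
for left, right, d in merges:
    print(left, "+", right, "joined at height", round(d, 2))
```

The earliest joins have the smallest heights, so reading a dendrogram from the bottom up gives the order of merging.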
What is K in regards to clustering
K is # of clusters desired
What is overfitting in machine learning
Model only fits your (training) data - not generalizable to new data
Within a confusion matrix what is sensitivity and specificity
Sensitivity is TP / (TP + FN)
Specificity is TN / (FP + TN)
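The two formulas above as a tiny Python sketch. The confusion-matrix counts are made up.

```python
def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: TN / (FP + TN)."""
    return tn / (fp + tn)

tp, fn, fp, tn = 40, 10, 5, 45  # hypothetical counts
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(sens, spec)
```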
What are ensemble methods
Use multiple learning algorithms to obtain better predictive performance
4 quantities of power analysis
power, sample size, alpha, effect size
2 principles for appropriate sampling
randomization and representation
What does R^2 tell you in a cal curve
How closely the measurements match the linear model
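A sketch of R^2 for a straight-line cal curve in plain Python, computed as 1 minus (residual sum of squares / total sum of squares). The concentrations and signals are made up; points exactly on a line give R^2 = 1, noisy points give a bit less.

```python
def r_squared(x, y):
    """R^2 of an ordinary least-squares straight-line fit of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

conc = [0.0, 1.0, 2.0, 4.0, 8.0]              # hypothetical standards
signal_clean = [0.5 + 2.0 * c for c in conc]  # exactly on a line
signal_noisy = [0.4, 2.9, 4.1, 9.0, 16.2]     # hypothetical noisy readings

print(r_squared(conc, signal_clean), r_squared(conc, signal_noisy))
```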
Matrix effects - what are they
behaviour of cal curve changed due to matrix components
what is weighted regression
Cal curve is fitted so that points with the lowest variation get the most weight (the line passes closest to them)
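A minimal weighted least-squares sketch in plain Python. It assumes the common choice of weights w = 1/s^2, where s is the standard deviation of the replicates at each point; the card itself does not specify the weights, and the data are made up.

```python
def weighted_fit(x, y, s):
    """Weighted least-squares line; s holds the standard deviation at each point."""
    w = [1.0 / si ** 2 for si in s]  # assumption: weight = 1 / variance
    total_w = sum(w)
    xw = sum(wi * xi for wi, xi in zip(w, x)) / total_w  # weighted centroid
    yw = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    slope = (sum(wi * (xi - xw) * (yi - yw) for wi, xi, yi in zip(w, x, y))
             / sum(wi * (xi - xw) ** 2 for wi, xi in zip(w, x)))
    intercept = yw - slope * xw
    return slope, intercept

conc = [0.0, 1.0, 2.0, 4.0]
signal = [1.0, 3.0, 5.0, 9.6]   # hypothetical: last point reads high
sd = [0.1, 0.1, 0.1, 1.0]       # hypothetical: last point is also the noisiest

slope_w, intercept_w = weighted_fit(conc, signal, sd)
slope_u, intercept_u = weighted_fit(conc, signal, [1.0] * 4)  # equal weights = ordinary fit
print(slope_w, intercept_w, slope_u, intercept_u)
```

With the noisy top standard down-weighted, the weighted slope stays closer to the trend of the three precise points than the unweighted one does.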
What is the main difference between QA and QC
QA is planned before data are collected - QC is the actions performed at all stages of sample analysis
What does a Shewhart chart show
Sequential plot of observations obtained from a QC material analyzed in successive runs, together with warning and action limits, to identify when things went wrong
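A sketch of flagging QC results against Shewhart limits. It assumes the usual convention of warning limits at target ± 2s and action limits at target ± 3s (the card does not state the multipliers), and the target, s and run values are made up.

```python
def shewhart_limits(target, sd):
    """Warning (±2s) and action (±3s) limits around the target value."""
    return {
        "warning": (target - 2 * sd, target + 2 * sd),
        "action": (target - 3 * sd, target + 3 * sd),
    }

def classify(value, limits):
    """Label one QC result relative to the limits."""
    low_a, high_a = limits["action"]
    low_w, high_w = limits["warning"]
    if value < low_a or value > high_a:
        return "action"
    if value < low_w or value > high_w:
        return "warning"
    return "in control"

limits = shewhart_limits(target=50.0, sd=1.0)  # hypothetical QC material
runs = [50.2, 49.5, 52.4, 50.1, 53.5]          # hypothetical successive runs
flags = [classify(v, limits) for v in runs]
print(flags)
```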
What is the main reason to use system suitability
ensure instrument is working properly before you start study
Blanks
Field and Trip Blank
Reagent blank