MIDA 2 Flashcards
[Intro] What is the difference between White-Box and Black-Box modeling?
- White-box uses first-principle equations and requires knowledge of the system’s physical parameters.
- Black-box relies on data collected from experiments and doesn’t require knowledge of the underlying physical system
[Intro] What is Software Sensing?
It’s an algorithmic approach to estimate un-measurable variables using indirect measurements.
[Intro] What are some advantages and disadvantages to Software Sensing?
+ Measures variables otherwise unmeasurable
+ Reduces the need for physical sensors, saving costs.
+ Eliminates maintenance and fault risks of physical sensors.
- Development costs for designing and calibrating the algorithm
- Potential for higher variance in sensing errors due to indirect measurement.
[Ch1] What are the three main mathematical representations of discrete-time linear systems?
1.- State Space (Internal representation)
2.- Transfer Function (external representation)
3.- Impulse Response (I.R.)
[Ch1] What equations define the state-space representation?
- State Equation: x(t+1) = Fx(t) + Gu(t)
- Output Equation: y(t) = Hx(t) + Du(t)
[Ch1] What do the matrices F, G, H, and D represent in state-space representation?
- F: State Matrix
- G: Input Matrix
- H: Output Matrix
- D: Direct Transmission Matrix (0 for strictly proper systems).
[Ch1] What’s the formula to transform from State-Space to Transfer Function representation?
W(z) = H(zI - F)^-1 G + D (the D term vanishes for strictly proper systems)
[Ch1] What is the Z-transform? What does it mean?
The Z-transform converts discrete-time signals into the transform domain, simplifying the analysis of linear systems. It is crucial for moving between IR and TF representations.
[Ch1] How is the impulse response representation defined?
It represents the output y(t) as the convolution of the input u(t) with the system’s impulse response:
y(t) = SUM_{k=0}^{∞} w(k) u(t-k)
with w(k) being the impulse response
[Ch1] Why is the impulse response representation rarely used in practice?
It requires all values of the impulse response to be noise-free and fully known, often impractical.
[Ch1] How can the eigenvalues of the state matrix F determine system stability in state-space representation?
- If all eigenvalues of F lie within the unit circle in the complex plane, the system is asymptotically stable.
- Eigenvalues on the unit circle (with none outside it) indicate at most simple stability.
- Eigenvalues outside the unit circle indicate instability.
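The eigenvalue check above can be sketched numerically. This is a minimal illustration, assuming NumPy; the function name `classify_stability`, the tolerance and the example matrices are hypothetical and not part of the course material.

```python
import numpy as np

def classify_stability(F, tol=1e-9):
    """Classify a discrete-time system from the eigenvalue moduli of F."""
    radius = np.max(np.abs(np.linalg.eigvals(np.atleast_2d(F))))
    if radius < 1 - tol:
        return "asymptotically stable"
    if radius > 1 + tol:
        return "unstable"
    # Largest modulus sits on the unit circle: no convergence guarantee
    return "on the unit circle (at best simply stable)"

F_stable = np.array([[0.5, 0.1], [0.0, 0.8]])    # eigenvalues 0.5 and 0.8
F_unstable = np.array([[1.2, 0.0], [0.0, 0.3]])  # eigenvalue 1.2 is outside

print(classify_stability(F_stable))
print(classify_stability(F_unstable))
```

The tolerance is only there because floating-point eigenvalues are never exactly on the unit circle.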
[Ch1] What’s the difference between asymptotical stability and simple stability?
Asymptotic stability guarantees that the system returns to its equilibrium after disturbances. Simple stability does not guarantee convergence, posing a risk in dynamic systems.
[Ch1] How does the concept of “Strictly Proper” systems relate to the impulse response?
Strictly proper systems ensure that the output doesn’t respond instantaneously to an input jump. This reflects physical system behavior without abrupt changes.
[Ch1] What’s the formula to transform from a state-space representation to an Impulse Response?
w(t) = H F^(t-1) G for t>0 (and w(0) = 0 for a strictly proper system)
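As a sanity check, the closed-form impulse response H F^(t-1) G can be compared with a direct simulation of the state equations driven by a unit impulse. This is a sketch assuming NumPy and a strictly proper system (D = 0); the matrices are arbitrary example values.

```python
import numpy as np

# Hypothetical second-order example (strictly proper, D = 0)
F = np.array([[0.5, 1.0], [0.0, 0.2]])
G = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])

def impulse_response_formula(F, G, H, n_samples):
    """w(0) = 0 and w(t) = H F^(t-1) G for t > 0."""
    w = [0.0]
    for t in range(1, n_samples):
        w.append((H @ np.linalg.matrix_power(F, t - 1) @ G).item())
    return np.array(w)

def impulse_response_simulated(F, G, H, n_samples):
    """Simulate x(t+1) = F x(t) + G u(t), y(t) = H x(t) with a unit impulse at t = 0."""
    x = np.zeros((F.shape[0], 1))
    y = []
    for t in range(n_samples):
        u = 1.0 if t == 0 else 0.0
        y.append((H @ x).item())
        x = F @ x + G * u
    return np.array(y)

w_formula = impulse_response_formula(F, G, H, 8)
w_simulated = impulse_response_simulated(F, G, H, 8)
print(np.allclose(w_formula, w_simulated))
```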
[Ch1] Why is transformation from impulse response to state-space representation challenging (not recommended)?
It requires reconstructing state matrices from measured impulse responses, which are sensitive to noise and involves complex calculations.
[Ch1] What is Observability in the Context of a System?
A system is observable if its state can be determined from the output history over time. Mathematically, the observability matrix
O = [H ; HF ; HF^2; … ; HF^(n-1)]
must have full rank, where n is the order of the system.
[Ch1] What is Controllability, How is it Tested?
Controllability means the system’s states can be fully controlled using the inputs. It is tested using the controllability matrix:
R= [ G FG F^2G … F^(n-1)G]
This must be Full rank.
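Both rank tests can be run numerically. The sketch below assumes NumPy; the helper names and the example matrices (including the deliberately unobservable output matrix `H_blind`) are hypothetical.

```python
import numpy as np

def observability_matrix(F, H):
    """Stack H, HF, ..., HF^(n-1) row-wise."""
    n = F.shape[0]
    return np.vstack([H @ np.linalg.matrix_power(F, i) for i in range(n)])

def controllability_matrix(F, G):
    """Stack G, FG, ..., F^(n-1)G column-wise."""
    n = F.shape[0]
    return np.hstack([np.linalg.matrix_power(F, i) @ G for i in range(n)])

F = np.array([[0.5, 1.0], [0.0, 0.2]])
G = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])
H_blind = np.array([[0.0, 1.0]])  # only sees the second state, which the first never affects

rank_O = np.linalg.matrix_rank(observability_matrix(F, H))
rank_R = np.linalg.matrix_rank(controllability_matrix(F, G))
rank_O_blind = np.linalg.matrix_rank(observability_matrix(F, H_blind))
print(rank_O, rank_R, rank_O_blind)
```

With `H_blind` the rank drops below n = 2, so the first state cannot be reconstructed from the output.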
[Ch1] What are the implications of a system not being fully observable or controllable?
If a system is not fully observable, some states cannot be inferred from outputs. If it is not fully controllable, certain states cannot be influenced by inputs. Each limitation restricts, in a different way, the ability to design effective controllers.
[Ch1] What does 4SID mean?
Subspace-based State Space System Identification.
[Ch1] What is the role of the Hankel matrix in the 4SID algorithm?
The Hankel matrix organizes impulse response data into a structured form. Its rank indicates the system’s order, and it can be factorized to estimate observability and controllability matrices.
[Ch1] How is the system order determined using the Hankel matrix?
By increasing the matrix size and checking its rank. When the rank stops increasing with added rows or columns, the rank indicates the system order. (This happens when we find a Non-full-rank matrix)
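A small sketch of this rank test on a noise-free impulse response, assuming NumPy. The order-2 system that generates the data, the helper names and the rank tolerance are illustrative assumptions.

```python
import numpy as np

# Hypothetical order-2 system, used only to generate a noise-free impulse response
F = np.diag([0.5, -0.3])
G = np.ones((2, 1))
H = np.ones((1, 2))
w = [(H @ np.linalg.matrix_power(F, t) @ G).item() for t in range(10)]  # w[0] is w(1)

def hankel(w, n):
    """n x n Hankel matrix whose (i, j) entry is w(i + j + 1) in 1-based time."""
    return np.array([[w[i + j] for j in range(n)] for i in range(n)])

def estimate_order(w, max_size):
    """Grow the Hankel matrix until its rank stops increasing."""
    previous_rank = 0
    for n in range(1, max_size + 1):
        rank = np.linalg.matrix_rank(hankel(w, n), tol=1e-9)
        if rank == previous_rank:
            return rank
        previous_rank = rank
    return previous_rank

order = estimate_order(w, max_size=5)
print(order)
```

With noisy data the exact rank test no longer works, which is why the SVD-based variant is used instead.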
[Ch1] Why is 4SID considered non-parametric?
It does not assume a specific model structure or use optimization. It directly estimates state-space matrices using impulse response data.
[Ch1] What is SVD and why is it used in 4SID?
Singular Value Decomposition decomposes a matrix into three components (U,S,V):
H = USV^T
Separating the system dynamics from noise.
It is used when building a Hankel Matrix with all the dataset.
[Ch1] What does the “knee” in the singular value curve represent in noisy data?
It represents the transition between system-related singular values and noise-related singular values, guiding the choice of system order.
[Ch1] How does noise affect the Hankel Matrix in the 4SID algorithm?
Noise adds distortions, leading to overestimated ranks and inaccurate matrix factorization. SVD helps filter out noise by retaining only dominant singular values.
[Ch1] What is the trade-off in choosing the size of the Hankel matrix?
A larger matrix provides better precision but increases computational complexity. The ideal size balances accuracy and feasibility.
[Ch1] How can one verify observability and controllability using block schemes?
By visually analyzing the block scheme:
- Observability: Check if all state contributions can be traced to the output.
- Controllability: Ensure input paths can influence all states.
[Ch1] What are some limitations of the 4SID algorithm?
- Highly sensitive to noise in the impulse response.
- Inefficient use of data
[Ch1] How is the 4SID algorithm generalized for non-impulse inputs?
It adapts to handle general input signals by modifying the data organization and ID Steps, but increases complexity.
[Ch1] Why is SVD critical for the 4SID algorithm?
It is used to decompose a noisy Hankel matrix into system-related singular values and noise-related singular values.
[Ch1] How do singular values determine the system order in 4SID?
The number of significant (system-related) singular values corresponds to the system order. In an ideal case, there is a sharp drop after the system-related singular values, making the order evident (Knee).
[Ch1] What are the three matrices obtained from SVD, and what do they represent?
H = USV^T
where:
- U: Contains the left singular vectors related to observability.
- S: A diagonal matrix with singular values, indicating the system’s dynamics and noise levels.
- V: Contains the right singular vectors related to controllability.
[Ch1] What are the key steps in applying 4SID algorithm using the full dataset?
1.- Build Hankel Matrix: using the measured dataset (W(1), W(2) … W(n)).
2.- Perform SVD: Identify system/noise singular values.
3.- Determine system order: Analyze the singular value curve for a “jump” or “knee”
4.- Reconstruct Clean Hankel Matrix: Hsystem = UnSnVn^T
5.- Factorize Hsystem: decompose it into the extended observability and controllability matrices.
6.- Estimate State-Space matrices
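The six steps above can be sketched end to end on synthetic data. This is only an illustration under stated assumptions: NumPy, a hypothetical order-2 system generating the data, a crude relative threshold (1% of the largest singular value) standing in for a visual "knee" inspection, a balanced split of the singular values between the two factors, and a pseudo-inverse (least squares) for the shift equation instead of a square inverse.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" system, used only to generate a noisy impulse response
F = np.diag([0.5, -0.3])
G = np.ones((2, 1))
H = np.ones((1, 2))
N = 12
w_true = np.array([(H @ np.linalg.matrix_power(F, t) @ G).item() for t in range(N)])
w_meas = w_true + 1e-4 * rng.standard_normal(N)  # index 0 holds w(1)

# 1. Build the Hankel matrix from the measured samples
q = d = 6
hankel = np.array([[w_meas[i + j] for j in range(d)] for i in range(q)])

# 2-3. SVD and order selection (assumed threshold instead of a visual knee)
U, S, Vt = np.linalg.svd(hankel)
n = int(np.sum(S > 1e-2 * S[0]))

# 4-5. Keep the n dominant components and factorize into O and R
sqrt_S = np.diag(np.sqrt(S[:n]))
O = U[:, :n] @ sqrt_S   # extended observability matrix
R = sqrt_S @ Vt[:n, :]  # extended controllability matrix

# 6. Estimate the state-space matrices (in some unknown state basis)
H_hat = O[:1, :]
G_hat = R[:, :1]
F_hat = np.linalg.pinv(O[:-1, :]) @ O[1:, :]  # shift invariance: O1 F = O2

w_hat = np.array([(H_hat @ np.linalg.matrix_power(F_hat, t) @ G_hat).item() for t in range(N)])
print(n, np.sort(np.real(np.linalg.eigvals(F_hat))))
```

The estimated matrices live in a different state basis than the original ones, so only basis-independent quantities (eigenvalues of F, impulse response) should be compared.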
[Ch1] How does the choice of q and d for the Hankel Matrix dimensions affect the use of SVD in 4SID?
- Larger q and d improve precision but are harder to compute.
- A balanced choice is recommended. q>d/2
[Ch2] What is a Time series, and how is it characterized?
A time series is a sequence of data points measured at successive equally spaced points in time. It represents data like pollutant concentrations, stock values, or sport statistics.
[Ch2] Why are time-series considered “output-only” systems?
Because they focus on the output y(t) without measuring or modeling all the inputs (these can be too many, unmeasurable, or have minimal influence on the output).
[Ch2] How can a time series be modeled without measurable inputs?
Time series are modeled using a fictitious input e(t), which is considered white noise. This input is not physically available but serves as a mathematical construct.
[Ch2] What is the significance of modeling a time series as a stationary stochastic process?
Modeling as a stationary stochastic process integrates practical data into a theoretical framework, enabling predictions and analysis of y(t) based on consistent statistical properties.
[Ch2] What are the key statistical properties of a stationary stochastic process?
- Mean value (my) is constant over time.
- Covariance Function (gamma_y(tau)) depends only on the time difference tau.
- Spectrum (Sy(omega)) is the frequency domain representation of the covariance via a Fourier Transform.
[Ch2] What are the main features of white noise in time and frequency domains?
- Time: White noise is completely uncorrelated, making it unpredictable.
- Frequency: It has a flat spectrum, meaning energy is evenly distributed across all frequencies.
[Ch2] What is an ARMA model?
It’s an Auto Regressive Moving Average model. Containing:
- AR : a linear combination of past outputs.
- MA : a linear combination of current and past inputs (white noise).
y(t) = SUM_{i=1}^{N} a_i y(t-i) + SUM_{j=0}^{M} c_j e(t-j)
[Ch2] What are the special cases of ARMA models?
- ARMA (0, M): Moving Average model.
- ARMA (N, 0): Auto regressive model
[Ch2] How do ARMA models ensure stationarity?
Stationarity is ensured if all poles of the transfer function are inside the unit circle.
[Ch2] What are the requirements for canonical representation of an ARMA model?
1.- C(z) and A(z) have the same degree.
2.- Both are monic (highest degree term = 1)
3.- They are co-prime (no common factors).
4.- All roots of C(z) and A(z) lie strictly inside the unit circle.
EX:
y(t) = (z + 1/2) / (z - 1/3) e(t)
[Ch2] Why are canonical representations considered unique?
Because the canonical form minimizes the model order and removes every redundancy while remaining equivalent to the original representation, only one such form exists for a given process.
[Ch2] What is the goal of predicting ARMA processes?
To estimate future values y(N + k | N) using data up to the present time N, and k being the prediction horizon.
[Ch2] How is the “Optimal Prediction” defined?
A prediction is optimal if the prediction error is uncorrelated with the predictor, meaning no additional information can improve the prediction.
[Ch2] What are the two types of predictors for ARMA processes?
- Predictor from noise: uses white noise e(t) but is impractical since e(t) is un-measurable.
- Predictor from data: Uses past outputs of y(t) and is practical for real-world applications.
[Ch2] What are the steps to compute an optimal predictor for ARMA models?
- Rewrite y(t) in terms of white noise e(t) using canonical representation.
- Perform Polynomial division of C(z) / A(z) to separate predictable and unpredictable components.
- Use predictable past data for the prediction.
[Ch2] How does the prediction horizon affect error variance?
- Variance increases with the horizon (k) since predicting farther into the future introduces more uncertainty.
- As k → ∞, predictions converge to the process mean.
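The growth of the error variance with the horizon can be made concrete with the polynomial long division of C(z) / A(z): the first k quotient coefficients multiply the future (unpredictable) noise samples, so the k-step error variance is lambda^2 times the sum of their squares. The sketch below is plain Python; the function names and the AR(1) example y(t) = 0.8 y(t-1) + e(t) are illustrative assumptions.

```python
def long_division(c, a, k):
    """First k quotient coefficients of C(z)/A(z).

    c and a hold the coefficients of increasing powers of z^-1, with a[0] = 1.
    """
    remainder = list(c) + [0.0] * (k + len(a))
    quotient = []
    for i in range(k):
        q_i = remainder[i] / a[0]
        quotient.append(q_i)
        for j, a_j in enumerate(a):
            remainder[i + j] -= q_i * a_j
    return quotient

def prediction_error_variance(c, a, k, lam2=1.0):
    """Variance of the k-step-ahead prediction error of y(t) = C(z)/A(z) e(t)."""
    return lam2 * sum(q * q for q in long_division(c, a, k))

# AR(1) example: A(z) = 1 - 0.8 z^-1, C(z) = 1
c, a = [1.0], [1.0, -0.8]
variances = [prediction_error_variance(c, a, k) for k in (1, 2, 5, 200)]
print(variances)
```

For large k the error variance approaches the process variance lambda^2 / (1 - 0.8^2), which is what predicting the mean would give.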
[Ch2] What is an all-pass filter, and what is its main property?
An all-pass filter has the form:
T(z) = 1/a (z+a)/(z+1/a)
where |a| < 1. It preserves the spectral characteristics of the signal (unit magnitude at every frequency) but introduces a phase shift.
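The all-pass property can be checked numerically by evaluating T(z) on the unit circle. A minimal sketch assuming NumPy and a real coefficient; the value a = 0.5 is an arbitrary example.

```python
import numpy as np

a = 0.5                                # example real coefficient
omega = np.linspace(0.0, np.pi, 200)   # frequencies from 0 to pi
z = np.exp(1j * omega)                 # points on the unit circle

T = (1.0 / a) * (z + a) / (z + 1.0 / a)
magnitude = np.abs(T)
phase = np.angle(T)

print(magnitude.min(), magnitude.max())  # both equal to 1 up to rounding
print(phase.min(), phase.max())          # the phase is not constant
```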
[Ch2] Why are all-pass filters useful in ARMA models?
They simplify representations by removing unstable zeros while maintaining spectral equivalence.
[Ch2] What is the Error-to-Signal Ratio (ESR), why is it important?
ESR is a normalized metric that compares prediction error variance to the signal variance:
ESR(k) = var[y(t+k) - yhat(t+k|t)] / var[y(t)]
It evaluates the prediction quality. 0 = perfect, 1 = trivial (predicting the mean)
[Ch2] Why is predicting processes close to white noise challenging?
White noise is highly unpredictable due to its flat spectrum and lack of correlation, making optimal prediction inherently less accurate
[Ch2] In the Toy Example, why is it necessary to assume a model for prediction?
In the absence of a theoretical model of the system, a given model is needed to make predictions, as it provides mathematical basis to approximate the system’s behavior using the limited dataset
[Ch2] In the Toy Example, how do the predictors differ between AR and MA processes?
- AR process: Predictions depend only on past outputs y(t-1), y(t-2), …, making computations easier.
- MA process: Predictions depend on both past outputs and unmeasured past white noise e(t), requiring assumptions about initial conditions
[Ch2] What are the conditions for a transfer function to have a canonical representation?
All poles and zeros must lie strictly inside the unit circle, ensuring stability and minimal representation.
[Ch2] Why is canonical representation not always possible?
It is impossible when the poles or zeros lie on or outside the unit circle, as reciprocal transformations fail to satisfy stability and minimality requirements.
[Ch2] What distinguishes ARIMA models from ARMA models?
ARIMA models include Integration to handle non-stationary processes, represented by poles at z=1
[Ch2] What is the significance of the “random walk” process in an ARIMA?
It models cumulative effects over time, with the time-domain behavior
y(t) = y(t-1) + e(t),
representing unpredictable, non-stationary patterns.
[Ch2] How is the performance index for AR model identification computed?
Using the Prediction Error Method:
J_N(theta) = (1/N) SUM_{t=1}^{N} [y(t) - phi(t)^T theta]^2
where:
phi(t) is the vector of past outputs
theta contains the model parameters
[Ch2] Why is the optimization problem for AR models simpler compared to ARMA?
The predictor is linear with respect to parameters, making the performance index quadratic and solvable using explicit least-squares solutions.
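Because the index is quadratic, the minimizer can be computed in one shot with ordinary least squares. The sketch below assumes NumPy and uses simulated AR(2) data; the coefficients, sample size and random seed are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data from y(t) = 1.2 y(t-1) - 0.5 y(t-2) + e(t), e ~ WN(0, 1)
a_true = np.array([1.2, -0.5])
N = 5000
e = rng.standard_normal(N)
y = np.zeros(N)
for t in range(2, N):
    y[t] = a_true[0] * y[t - 1] + a_true[1] * y[t - 2] + e[t]

# Regression matrix: row t is phi(t) = [y(t-1), y(t-2)]
Phi = np.column_stack([y[1:-1], y[:-2]])
target = y[2:]

# Explicit least-squares solution of the quadratic performance index
theta_hat, *_ = np.linalg.lstsq(Phi, target, rcond=None)
J_N = float(np.mean((target - Phi @ theta_hat) ** 2))

print(theta_hat, J_N)
```

The residual variance `J_N` should land close to the variance of the driving white noise.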
[Ch2] What makes ARMA model optimization more challenging than AR models?
The presence of the moving average part introduces non-linearity, making the performance index non-quadratic and therefore requiring iterative methods to find optimal parameters
[Ch2] How is the order of ARMA models selected?
By testing multiple models with different orders and choosing the one with the minimum prediction error, while balancing simplicity and generalization using cross-validation.
[Ch2] What is an ARMAX model? How does it differ from ARMA?
ARMAX adds an exogenous input component (u(t)) to the ARMA framework enabling modeling of input-output systems, which is very useful for control applications.
[Ch2] When should ARMAX models be preferred over ARMA?
- When measurable inputs significantly influence the output.
- For control applications requiring input-output relationships
[Ch2] What are the advantages of ARMA over ARMAX models?
- Simplicity and fewer variables to measure.
- Suitable when only past output values are available and no dominant input exists.
[Ch2] How does cross-validation help in selecting the optimal model order?
By splitting the dataset into training and validation sets, models are evaluated on unseen data to ensure generalization, avoiding overfitting or underfitting.
[Ch2] What is the criterion for selecting the best model in cross-validation?
The model with the lowest prediction error variance on the validation set is considered optimal
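A rough sketch of order selection by cross-validation for AR models, assuming NumPy. The simulated AR(2) data, the half/half split and the helper names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated AR(2) data: y(t) = 1.2 y(t-1) - 0.5 y(t-2) + e(t)
N = 4000
e = rng.standard_normal(N)
y = np.zeros(N)
for t in range(2, N):
    y[t] = 1.2 * y[t - 1] - 0.5 * y[t - 2] + e[t]

y_train, y_valid = y[: N // 2], y[N // 2:]

def build_regression(y, order):
    """Rows are [y(t-1), ..., y(t-order)], targets are y(t)."""
    Phi = np.column_stack([y[order - i: len(y) - i] for i in range(1, order + 1)])
    return Phi, y[order:]

def fit_ar(y, order):
    Phi, target = build_regression(y, order)
    theta, *_ = np.linalg.lstsq(Phi, target, rcond=None)
    return theta

def prediction_error_variance(y, theta):
    Phi, target = build_regression(y, len(theta))
    return float(np.mean((target - Phi @ theta) ** 2))

# Fit on the training half, score on the unseen validation half
validation_error = {
    order: prediction_error_variance(y_valid, fit_ar(y_train, order))
    for order in range(1, 6)
}
best_order = min(validation_error, key=validation_error.get)
print(validation_error, best_order)
```

Orders above the true one typically give almost the same validation error, so in practice the smallest order that reaches the plateau is a reasonable pick.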
[Ch2] Why is a whiteness test conducted on residuals after model identification?
To confirm that the residual error contains no predictable patterns, indicating the model has captured all significant dynamics of the system.
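One simple way to check whiteness is to look at the sample autocorrelation of the residuals and count how many lags fall outside the approximate 95% band ±1.96/sqrt(N). This is a simplified sketch, not necessarily the test used in the course; it assumes NumPy, and the synthetic "residual" sequences are made up for the example.

```python
import numpy as np

def sample_autocorrelation(x, max_lag):
    """Normalized sample autocorrelation for lags 0..max_lag."""
    x = x - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)])

def fraction_outside_band(x, max_lag=20):
    """Share of lags 1..max_lag whose correlation exceeds the 95% white-noise band."""
    rho = sample_autocorrelation(x, max_lag)[1:]
    band = 1.96 / np.sqrt(len(x))
    return float(np.mean(np.abs(rho) > band))

rng = np.random.default_rng(5)
N = 2000
white = rng.standard_normal(N)  # residuals of a good model

colored = np.zeros(N)           # residuals that still contain dynamics
for t in range(1, N):
    colored[t] = 0.8 * colored[t - 1] + white[t]

frac_white = fraction_outside_band(white)
frac_colored = fraction_outside_band(colored)
print(frac_white, frac_colored)
```

For truly white residuals roughly 5% of the lags are expected outside the band; a much larger share suggests unmodeled dynamics.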
[Ch2] What does a failed whiteness test imply?
- The assumptions were incorrect (linearity)
- Some important dynamics are missing (Nonlinearities or higher-order terms)
[Ch2] What is the expression for an ARMAX model using transfer functions?
y(t) = B(z) / A(z) z^(-k) u(t) + C(z) / A(z) e(t)
[Ch2] What are the general steps to solve a problem involving ARMAX models?
1.- Collect and preprocess the dataset
2.- Choose the ARMAX model structure (n, m, p, k) based on prior knowledge or starting with the simplest and increasing complexity.
3.- Derive the predictor from available data using
yhat(t|t-k) = B(z)E(z)/C(z) u(t-k) + R(z)/C(z) y(t)
E(z) being the quotient and R(z) the remainder of the k-step long division C(z)/A(z) = E(z) + R(z)/A(z)
4.-Use PEM to define performance index.
5.- Minimize PEM JN to estimate model parameters.
6.- Check residuals and/or use cross-validation to validate the model.
7.- Perform the forecast
[Ch4] What is the Kalman Filter (KF), and how is it categorized?
Kalman Filter is an algorithm based on state-space representation, used for state estimation and software sensing in control and modeling applications. It is model-based and originates from classical modeling and control theory.
[Ch4] What is the primary assumption behind the Kalman Filter design?
It assumes a state-space model of the system with linear, time-invariant dynamics and incorporates two types of noise, state noise v1(t) and output noise v2(t)
[Ch4] What are the three main applications of the Kalman Filter?
- k-steps ahead prediction of Output y(t+k|t), predicting future outputs
- k-steps ahead prediction of States x(t+k|t), predicting future state variables
- Filtering of the states x(t|t), estimating current state variables based on present data
[Ch4] Why is software sensing important in MIMO systems?
It enables state estimation when there are fewer sensors than states. Critical for:
- Control design
- Monitoring
[Ch4] What is the formula for the state prediction of a Kalman Filter? What about for the Output?
State equation:
x(t +1|t) = Fx(t|t-1) + K(t) e(t)
Output Equation:
y(t|t-1) = Hx(t|t-1)
[Ch4] What are the 3 general “blocks” needed to build the Kalman Gain and DRE equations?
State Block: F P(t) F^T + V1
Output Block: H P(t) H^T + V2
Mix Block: F P(t) H^T + V12
[Ch4] Using the 3 Building Blocks, what is the equation for the Kalman Gain? What about the DRE?
K(t) = Mix_Block * (Output_Block)^-1
DRE: P(t+1) = State_Block - (Mix_Block) * (Output_Block)^-1 * (Mix_Block)^T
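The three blocks translate directly into code. The sketch below assumes NumPy and iterates the DRE on a scalar example until P(t) settles; the numerical values of F, H, V1, V2, V12 and the initial P are arbitrary.

```python
import numpy as np

def riccati_step(F, H, V1, V2, V12, P):
    """One step of the DRE, returning the gain K(t) and P(t+1)."""
    state_block = F @ P @ F.T + V1
    output_block = H @ P @ H.T + V2
    mix_block = F @ P @ H.T + V12
    K = mix_block @ np.linalg.inv(output_block)
    P_next = state_block - K @ mix_block.T
    return K, P_next

# Scalar example written with 1x1 matrices
F = np.array([[0.5]])
H = np.array([[1.0]])
V1 = np.array([[1.0]])
V2 = np.array([[1.0]])
V12 = np.array([[0.0]])

P = np.array([[1.0]])  # initial error covariance (a guess)
for _ in range(100):
    K, P = riccati_step(F, H, V1, V2, V12, P)

print(K, P)
```

For this example the fixed point satisfies P^2 - 0.25 P - 1 = 0, so the iteration should settle at the positive root of that equation.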
[Ch4] How are the state noise v1 and output noise v2 modeled?
- State noise (v1(t)):
- Mean: E[v1(t)] = 0
- Covariance: E[v1(t) v1(t)^T] = V1 (positive semi-definite).
- Output noise (v2(t)): the same holds, except that its covariance V2 must be positive definite.
[Ch4] What are the main extensions of the Kalman Filter?
1.- Multi-step Predictor.
2.- Filter Form.
3.- Inclusion of Exogenous Inputs.
4.- Time-Varying Systems.
5.- Nonlinear Systems -> Extended Kalman Filter (EKF)
[Ch4] What’s the difference between the DRE and the ARE?
The ARE provides a steady-state solution for “infinite-horizon” problems, and is NOT time varying. (uses P_bar)
DRE is a time-varying solution for finite-horizon problems. (Uses P(t))
[Ch4] How does the multi-step predictor work in KF?
Starting from the 1-step predictor, propagate future states by repeatedly multiplying by F:
x(t+k|t) = F^(k-1) x(t+1|t)
y(t+k | t)=H x(t+k |t)
[Ch4] What is the asymptotic Kalman Filter and why is it used?
The asymptotic KF uses a steady-state gain K when P(t) converges to a constant P. It simplifies computations and ensures stability in LTI systems.
[Ch4] What are the 2 theorems to establish if there exists an Asymptotic Kalman Filter Solution?
1:
- V12=0
- All Eig (F) inside Unit Circle
2:
- V12=0
- (F, H) is Fully Observable.
- (F, Γ) is Fully Controllable
Where Γ * ΓT = V1
[Ch4] How does the Extended Kalman Filter handle non-linear systems?
1.- Linearizes the system around the current state estimate at each time step using Jacobians:
F(t) = ∂f(x(t), u(t))/∂x , H(t) = ∂h(x(t))/∂x
2.- Applies KF equations with these locally linearized matrices.
3.- Recomputes F(t) and H(t) at each time step.
[Ch4] What are the limitations of the EKF?
- Lack of guaranteed stability.
- Computationally demanding due to repeated linearization.
- Sensitive to model inaccuracies.
[Ch4] Why is software sensing important in KF applications?
It estimates unmeasurable states, reduces costs by minimizing the need for physical sensors, and provides redundancy in safety-critical systems.
[Ch4] What’s an example where the KF is used in practice?
Estimating the vertical speed of a seat in an off-road vehicle (tractor).
It uses a white-box model to describe the seat dynamics, incorporates accelerometer data, and applies the KF for software sensing to estimate the unmeasurable speed.
[Ch4] What are common challenges in implementing the KF?
- Accurately modeling noise covariances V1 and V2
- Ensuring numerical stability of the Riccati Equations.
- Dealing with computational demands, especially in high-dimensional spaces.
[Ch4] How can noise dynamics be incorporated into KF?
By embedding noise dynamics into the state-space model using state extension, transforming non-white noise into a compatible form.
[Ch5] What is the primary difference between the white-box and black-box approaches to software sensing?
- White-Box relies on a predefined model of the system and does not require a training set.
- Black-Box uses system identification techniques to estimate models and requires a training dataset where states must be physically measured.
[Ch5] Why is the training data-set necessary in black-box software sensing?
The training dataset provides physically measured states, which serve as the targets when identifying the estimator during the training phase. These measurements are replaced by the estimation algorithm during the production phase.
[Ch5] How is software sensing implemented for LTI systems?
For LTI systems, software sensing involves estimating transfer functions Sux(z, θ) and Syx(z, θ) using parametric system identification techniques.
Where Sux and Syx are the Transfer Functions used to transform the input, and output information into the States information.
[Ch5] What are the main steps to implement software sensing for LTI systems?
1.- Data collection: Gather input, output and state measurements in the training phase.
2.- System Identification: Estimate the parametric T.F. Sux and Syx
3.- Deployment: Use the identified model to estimate states in real-time, replacing physical sensors.
[Ch5] What challenges arise when applying software sensing to nonlinear systems?
These would require more complex architectures such as:
- Recurrent Neural Networks
- FIR-Based Nonlinear Architectures.
- Recursive Architectures.
[Ch5] What are the differences between the 3 different architectures for software sensing in nonlinear systems?
- Recurrent Neural Networks: Handles nonlinear dynamics but has issues with stability and training complexity.
- FIR-Based: Ensures stability by design but may become computationally intensive in high-dimensional systems.
- Recursive (IIR): Reduces input dimensionality but risks instability during production.
[Ch5] Why are FIR-Based architectures preferred for most practical cases of Non-linear systems software sensing?
It ensures stability due to its finite impulse response scheme and simplifies training by focusing only on the non-linear static part.
[Ch5] What are the advantages and disadvantages of black-box software sensing compared to Kalman Filters?
Advantages:
- Do not require a predefined model of the system.
- Can achieve higher accuracy when high-quality dataset is available.
Disadvantages:
- Cannot estimate completely unmeasurable variables.
- Requires a training dataset and supervised learning.
- Lacks interpretability of results
[Ch6] What is a gray-box system, why is it used?
Gray-Box combines white and black-box modeling approaches. It uses physical equations (White-Box) with some unknown parameters estimated from data (Black-Box), making it suitable for scenarios with partial physical insights.
[Ch6] What is the role of the Kalman Filter in gray-box identification?
The K.F. estimates both system states and unknown parameters by treating the parameters as extended states in the system.
[Ch6] How does the state extension trick work in gray-box identification?
Unknown parameters are added as state variables with fictitious dynamics:
θ(t+1)=θ(t)+vθ(t)
where:
vθ(t) is a fictitious noise, making the KF able to adjust parameters.
[Ch6] Why is the fictitious noise necessary in the state extension trick?
Fictitious noise ensures that the K.F. does not over-rely on initial conditions and adjusts parameters dynamically to fit the data.
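A deliberately stripped-down sketch of the state extension idea, assuming NumPy. Here the only "state" is the unknown parameter itself: the hypothetical model is y(t) = θ u(t) + v2(t) with θ(t+1) = θ(t) + vθ(t), so F = 1 and the output matrix is the time-varying H(t) = u(t). In a real gray-box problem θ would be appended to the physical states. The variances `q` (fictitious noise) and `r` (output noise) are tuning assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

theta_true = 2.0  # unknown parameter to recover
N = 500
u = rng.standard_normal(N)
y = theta_true * u + 0.1 * rng.standard_normal(N)

def track_parameter(u, y, q, r=0.01, theta0=0.0, P0=1.0):
    """Scalar Kalman predictor for theta(t+1) = theta(t) + v_theta(t), y(t) = u(t) theta(t) + v2(t)."""
    theta_hat, P = theta0, P0
    for u_t, y_t in zip(u, y):
        output_block = u_t * P * u_t + r  # H P H^T + V2
        mix_block = P * u_t               # F P H^T with F = 1
        K = mix_block / output_block
        theta_hat = theta_hat + K * (y_t - u_t * theta_hat)
        P = P + q - K * mix_block         # DRE with V1 = q
    return theta_hat, P

theta_slow, P_slow = track_parameter(u, y, q=1e-6)  # small fictitious noise
theta_fast, P_fast = track_parameter(u, y, q=1e-2)  # large fictitious noise

print(theta_slow, P_slow)
print(theta_fast, P_fast)
```

A larger `q` keeps the error variance P high, so the estimate keeps reacting to new data (fast but noisy); a smaller `q` lets P shrink, giving a smoother but slower estimate.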
[Ch6] How does one use the Gray-box approach to estimate the friction coefficient in a mechanical system?
1.- Start with physical model: mx_dd + cx_d +kx = F(t)
2.- Discretize using a method like Euler Forward.
3.- Extend the state to include c as a fictitious variable.
4.- Apply the K.F. to estimate both x(t) and c.
[Ch6] Why does gray-box identification often lead to non-linear systems?
Non-linearities arise from interactions between state variables and unknown parameters, requiring methods like the EKF for estimation.
[Ch6] What is the Simulation Error Method, how does it differ from K.F?
SEM minimizes the simulation error between measured outputs and simulated outputs by optimizing parameters in a system model. It does not predict states but evaluates a performance index offline (Not like K.F)
[Ch6] When is SEM preferred over K.F.?
SEM is preferred for systems with a small number of constant parameters, especially in offline identification scenarios.
[Ch6] What are the trade-offs when choosing vθ or lambda^2 in gray-box sys id?
- Small vθ: Slow convergence but low variance in parameter estimates.
- Big vθ: Fast convergence but high variance in estimates.
The choice depends on the application’s precision and speed requirements.
[Ch6] What is the practical limitation on the number of estimated parameters in gray-box systems?
The ratio of measured variables (Sensors) to estimated parameters must be high enough to ensure reliable estimation.
Ex:
3 sensors, 2 parameters, and 5 states is a GOOD system.
3 sensors, 15 parameters and 10 states is a BAD system
[General] What does the value for the Determinant of a Matrix tell us about its Rank?
If det(A) != 0, it means that the matrix is full rank, so the rank is the same value as the size of the matrix (if the matrix is 3x3, the rank = 3)
If det(A) = 0, it means that the matrix is Not full rank, so the rank is < size of the matrix.
[General] How does one get the inverse of a 2x2 matrix?
Given matrix A = [a b; c d]
the inverse is
A^-1 = 1/det(A) * [d -b; -c a], with det(A) = ad - bc (the inverse exists only if det(A) != 0)
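A minimal sketch of the 2x2 inverse in plain Python: swap the diagonal entries, negate the off-diagonal ones, and divide by the determinant ad - bc. The function name and the example matrix are illustrative.

```python
def inverse_2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] via the adjugate divided by the determinant."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular (det = 0), no inverse exists")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[4.0, 7.0], [2.0, 6.0]]  # det = 4*6 - 7*2 = 10
A_inv = inverse_2x2(4.0, 7.0, 2.0, 6.0)

# Multiply A by its inverse to confirm the result is the identity
product = [
    [sum(A[i][k] * A_inv[k][j] for k in range(2)) for j in range(2)]
    for i in range(2)
]
print(A_inv)
print(product)
```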
[General] How is the Power Spectral Density of a process calculated?
PSD = GAMMA(w) = |H(e^{jw})|^2 * var(WN)
[General] What’s the formula to find the variance of the y(t) of an AR(1) system?
var[y(t)] = 1/(1-a^2) * lambda^2
where a is the coefficient in y(t) = a*y(t-1) + e(t), with |a| < 1,
and lambda^2 is the variance of the input white noise.
[General] What’s the formula to find the variance of an MA system?
For y(t) = [C0 + C1 z^(-1) + C2 z^(-2) …. Cm z^(-m)] e(t)
var[y(t)] = SUM_{j=0}^{m} Cj^2 * lambda^2
[General] What is e^{jw} + e^{-jw}?
e^{jw} + e^{-jw} = 2 cos(w)
[General] What’s the formula to find the Power Spectral Density of a signal (output)?
GAMMA(w) = W(z=e^{jw}) * W(z=e^{-jw}) * lambda_n^2
Where lambda_n^2 is the variance of the white noise n(t) at the input of the Transfer Function W
[General] What’s the formula to find the Cost Function of a model from Data?
J_N = (1/N) SUM_{t=0}^{N-1} E(t+1|t)^2
Where N is the number of samples
and E is the Prediction Error
Also:
J = var[E(t+1|t)]
[Ch3] What are the 2 general steps for Frequency Domain System Identification?
1.- Given u(t) and y(t) compute an estimate of the system Frequency Response.
2.- Find the model parameters by fitting W_hat(ω):
minθ SUM_ω |W_hat(ω) - W(ω, θ)|
[Ch3] In what step of the Frequency Domain Sys. Id. is there the most Challenge?
In the first step. The non-parametric description of the Frequency Response needs to be estimated with different tools.
[Ch3] Which kind of Input signals can be used for Frequency Domain Sys. Id.?
In general we need to use “Longer” signals, to ensure transients due to unknown init. cond. have disappeared.
- Single-Tone Sinewave u(t) = A cos(k wo t) with wo being the frequency resolution = 2pi/T and k=1,…, pi/wo
- Multi-Tone Sinewave: u(t) = SUM_k A cos(k wo t + Φ_k), where Φ_k is a random phase between 0 and 2pi.
- Pseudo-Random Binary Signal (PRBS): Uses XOR operator and signal delays. Change of State is Random
- Gaussian Noise: u(t) ~ WN(m, lambda^2). It is the only non-periodic one.
[Ch3] What are the advantages of using a Multi-tone versus a single-tone Sinewave as the input signal?
- It is much less time consuming (It allows testing many frequencies in a single input)
- Within 1 period the signal is asymptotically Gaussian (Non-predictable), after the 1st period it can be predicted.
[Ch3] What’s the Frequency Response Theorem?
Y(ω) = W(ω) U(ω)
[Ch3] In the Frequency Response Theorem, how are Y(ω) and U(ω) calculated?
Using the Discrete Fourier Transform (DFT) (NOT the Discrete Time Fourier Transform DTFT).
[Ch3] What is the difference between the DTFT and the DFT?
The DTFT uses a summation from -infinity to +infinity and a continuous frequency variable, which is not useful for practical applications.
DFT takes into account a SUM up to the number of available samples.
[Ch3] What is the formula for the Discrete Fourier Transform?
V_D(ω_D) = SUM_{t=0}^{N-1} v(t) e^{-j ω_D t}
Where:
ω_D = -(N-1)/2 * ωo , … , (N-1)/2 * ωo
and ωo = 2pi/T
[Ch3] Using the DFT what are the results for the VD(ω) when T=N?
VD(ω) = A/2 * N for ω = +- ωo
and
VD(ω) = 0 for ω != +- ωo
When T=N the DFT is very similar to the DTFT results.
[Ch3] In the DTFT what’s the meaning of Dirac’s Delta?
It’s a function representing a very short and intense input to a system as the mathematical idealization of a perfect impulse.
[Ch3] What is the result of the DFT when T != N?
The result is not defined for ω = ωo
There is a lot of Leakage
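Both cases can be reproduced with an FFT of a sampled cosine v(t) = A cos(2pi t / T). The sketch assumes NumPy; note that `np.fft.fft` indexes frequencies from 0 to N-1, so the bin at -ωo appears as index N-1 rather than on a symmetric grid. The amplitude, N and the mismatched period are arbitrary example values.

```python
import numpy as np

A = 2.0
N = 64
t = np.arange(N)

def dft_magnitude(period):
    """Magnitude of the DFT of A*cos(2*pi*t/period) over N samples."""
    v = A * np.cos(2 * np.pi * t / period)
    return np.abs(np.fft.fft(v))

mag_matched = dft_magnitude(period=N)           # T = N: exactly one period in the window
mag_mismatched = dft_magnitude(period=N / 1.5)  # T != N: 1.5 periods in the window

# T = N: two clean lines of height A*N/2 at +wo (bin 1) and -wo (bin N-1)
print(mag_matched[1], mag_matched[N - 1])
print(np.max(np.delete(mag_matched, [1, N - 1])))

# T != N: the energy spreads over many bins (leakage)
leaky_bins = int(np.sum(mag_mismatched > 0.01 * mag_mismatched.max()))
print(leaky_bins)
```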
[Ch3] What is the meaning of Leakage in the context of DFT and DTFT?
The spectral content of a frequency ωo appears also on other frequencies. The information is distributed, so the result is not correct.
[Ch3] How is the approximate Transfer Function obtained from the Frequency Response Theorem?
W_hat(ω) = YD(ω) / UD(ω)
Also called Empirical TF Estimation
where the sub D represents solutions using DFT.
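A noise-free sketch of the empirical estimate, assuming NumPy. A hypothetical first-order system y(t) = a y(t-1) + b u(t-1) is driven by a multi-tone input for several periods; only the last period is transformed so that the transient has died out, and the ratio Y/U is read at the excited frequencies. The system, tones and phases are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

N, n_periods = 64, 10    # samples per period, number of periods
omega0 = 2 * np.pi / N   # frequency resolution
tones = np.arange(1, 11) # excited harmonics k = 1..10
phases = rng.uniform(0, 2 * np.pi, size=tones.size)

t = np.arange(N * n_periods)
u = sum(np.cos(k * omega0 * t + p) for k, p in zip(tones, phases))

# Hypothetical system to identify: y(t) = a y(t-1) + b u(t-1)
a, b = 0.5, 1.0
y = np.zeros_like(u)
for i in range(1, len(u)):
    y[i] = a * y[i - 1] + b * u[i - 1]

# DFT of the last period only (steady state), evaluated at the excited bins
U = np.fft.fft(u[-N:])
Y = np.fft.fft(y[-N:])
W_hat = Y[tones] / U[tones]

# True frequency response for comparison: W(z) = b z^-1 / (1 - a z^-1)
omega = tones * omega0
W_true = b * np.exp(-1j * omega) / (1 - a * np.exp(-1j * omega))

print(np.max(np.abs(W_hat - W_true)))
```

The ratio is only meaningful at frequencies where the input actually has energy; elsewhere U is numerically zero and the division is ill-conditioned.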
[Ch3] What does Windowing do?
Windowing is multiplying a signal by a smooth Window Function w. What this does is reduce the side lobes in the frequency domain, reducing leakage, and emphasizes the main lobe. It smoothens transitions at the edges reducing high-frequency components, it makes it so that frequencies close to each other are less likely to “spill” into one another, and maintains the main frequency components.
[Ch3] What are the solutions to Leakage?
1.- Properly select the experiment duration (N = T). Not always possible.
2.- Properly selecting the Window shape. (Windowing)
[Ch3] What is normally the most used Window type?
It’s called the Hanning Window, it has a large peak in the middle and it goes to zero on either side (looks like a gaussian bell). It’s not perfect, but it helps a lot to reduce leakage.
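The effect of the window can be seen by comparing the spectrum of a tone that does not fit the record length, with and without the window. This sketch assumes NumPy (`np.hanning` provides the window); the tone frequency of 10.5 bins and the guard width around the peak are arbitrary choices for the illustration.

```python
import numpy as np

N = 256
t = np.arange(N)
v = np.cos(2 * np.pi * 10.5 * t / N)  # 10.5 periods: worst-case mismatch with the DFT grid

window = np.hanning(N)
spec_rect = np.abs(np.fft.rfft(v))          # no window (rectangular)
spec_hann = np.abs(np.fft.rfft(v * window)) # Hanning window applied

def far_leakage(spec, peak_bin=10, guard=5):
    """Largest magnitude away from the peak region, relative to the peak."""
    far = np.concatenate([spec[: peak_bin - guard], spec[peak_bin + guard + 2:]])
    return far.max() / spec.max()

leak_rect = far_leakage(spec_rect)
leak_hann = far_leakage(spec_hann)
print(leak_rect, leak_hann)
```

The price of the lower side lobes is a wider main lobe, so two tones that are very close in frequency become harder to separate.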
[Ch3] Why can’t Leakage be avoided in Frequency Domain sys id.?
Because the true frequencies of the signal are not known in advance, the DFT generally samples the spectrum at frequencies that do not match them, so it is essentially impossible to choose the correct value for N.
[Ch3] How is the effect of Noise minimized in Freq. Domain Sys. ID?
In general, we need to perform multiple experiments (or a single very long one) with the input being the same in each one, which allow us to average the results of Y and U, and therefore estimate the means.
Ybar = Expected[Y(ω)]
Ubar = Expected[U(ω)]
Wbar_hat = Ybar / Ubar
[Ch3] Why does the noise have more influence on the high frequencies of the signal than at the lower frequencies?
Because the noise to signal ratio at high frequency is bigger (the response of the signal for high frequencies is smaller than for low frequencies)
[Ch3] How do we solve the noise problem for a Random Noise Input?
Using the Power Spectral Density of the Input, and the Cross-PSD of the output.
W_hat(ω) = Γyu(ω) / Γuu(ω)
Γyu(ω) = Y(ω) U*(ω) = W(ω) U(ω) U*(ω) + E(ω) U*(ω)
= W(ω) Γuu(ω) + 0 (on average, since the noise E is uncorrelated with the input U)
[Ch4] What’s the formula for the 1-step ahead Kalman predictor? (xhat(t+1|t))
xhat(t+1|t) = (F - KH) xhat(t|t-1) + K y(t)
Where K is found using the usual Formula
K = Mix_Block * (Output_Block)^-1
[Ch4] How is the optimal k-steps ahead predictor found? (proof)
1.- y(t) = B(z)/A(z) u(t-k) + C(z)/A(z) e(t) with e(t) ~ WN(0,lambda^2)
2.- Compute C(z)/A(z) = R(z)/A(z) + E(z) (k steps of long division) and substitute in #1: y(t) = B(z)/A(z) u(t-k) + R(z)/A(z) e(t) + E(z) e(t). The term E(z)e(t) is unpredictable at time t-k, so it is dropped from the predictor.
3.- From #1 solve for e(t) and substitute in #2 to get: yhat(t|t-k)
4.- From the long division, substitute C(z) - R(z) = E(z)A(z)
5.- Simplify to get: yhat(t|t-k) = B(z)E(z)/C(z) u(t-k) + R(z)/C(z) y(t)
[General] How is an All-Pass filter built?
H(z) = 1/a (z+a)/(z+1/a)
[Ch1] What are the steps to factorize the Hankel Matrix into the Observability and Controllability matrices?
Prior: State that H = O . R
1.- From the Hankel Matrix, build the extended O (N+1 x N) and R (N x N+1) matrices.
2.- From the Extended Matrices build O1 and O2 (as well as R1 and R2), each of size (N x N) with a shift between them.
3.- Having O1 and O2, build F_hat = O1^-1 . O2