8. Implementation Aspects Flashcards
What is a real-time system?
When the sampling rate at the input and at the output of the scheme is the same. Such a processing unit is called a real-time system if it fulfills the timeliness property. This means that the system is sufficiently fast to ensure that always fs^(out) valid output samples y[k] are produced per second, whatever the signal applied to the input or the operation mode the processing scheme is in.
(p. 229)
What is a batch processing system?
Real-time systems contrast with batch processing systems, i.e. with systems that process a finite set of (prerecorded) data samples (typically stored in the memory of a microprocessor or on hard disk) in an off-line way. Batch processing systems are usually not dictated by time constraints : the user simply has to wait till all the samples have been processed.
(p. 230)
Comment on the correctness of the code.
When porting a signal processing solution onto dedicated hardware, it does not suffice to simply debug the code until all the warnings and (allocation) errors have disappeared: a thorough validation of the correctness of the code is mandatory. This is typically done by processing representative, random test signals both with the original (reference) code in the higher programming language and with the code written in the lower programming language. Only if the difference is small, i.e. around machine precision, correctness can generally be assumed.
(p. 230)
Make sure you can calculate the number of arithmetic operations per sample and the number of arithmetic operations per second that are needed to implement an algorithmic scheme (for example the FIR filtering operation of equation 8.1).
Observe that the number of (arithmetic) operations required to implement an (N + 1)-taps FIR filter amounts to N + 1 multiplications (assuming that all bn 6= 0 and bn 6= ±1) and N additions (assuming that all terms bnx[k − n] 6= 0) per output sample y[k]. As typically, both a multiplication and an addition of two real-valued numbers can be performed in one machine cycle, the number of (arithmetic) operations equals 2N + 1 op. per output sample. If fs is the rate at which filter input x[k], and hence, output y[k] is sampled, the required number of (arithmetic) operations per second is given by (2N + 1)fs ops.
(p. 231)
What are MIPS and MOPS?
Processing power is typically expressed in MIPS (Millions of Instructions per Second) or MOPS (Millions of Operations per Second), and can be found in the data sheet of the processor.
Keep in mind, though, that as each chip manufacturer uses its own definition of MIPS and MOPS, these numbers need to be interpreted with care.
(p. 231)
What is the disadvantage of cost estimates?
Recall that a cost estimate merely counting the number of multiplications and additions is a rough approximation and underestimate for the total execution time only. Nevertheless, it is a useful instrument to highlight the dependence of the total cost on each of the algorithm parameters and to compare and benchmark different algorithmic schemes.
(p. 232)
Make sure you can calculate the required amount of memory that is needed to implement an algorithmic scheme in real time (for example the FIR filtering operation of equation 8.1).
For example, it is easily verified that for a real-time implementation of the FIR filter of equation 8.1 in total 2N + 1 data words need to be stored : N + 1 memory locations are required for the filter coefficients bn (assuming that all bn 6= 0 and bn 6= ±1) and N locations for delayed input samples, i.e. for x[k−1], . . . , x[k−N]. Depending on the specific implementation, a few more storage registers (memory cells) are needed to store additional intermediate variables and the current input x[k] and current output y[k]. The amount of memory required for the program code depends on the software encoding and the programming style.
(p. 232)
Which parts does the input/output delay of a real-time implementation scheme consist of?
- a delay due to the A/D and the D/A converter
- a latency caused by the operating system
- a delay inserted by the block processing
- possibly, an algorithmic delay introduced (on purpose) inside the signal processing scheme
(p. 232 - 234)
What is block processing?
What is sample-based processing?
Block processing is a popular data handling technique used inside many signal processing schemes. The idea is to transfer data and process this data in frames of P samples rather than on a sampleby-sample basis. Sample-based processing is in fact a special case of block processing with P = 1.
(p. 232 - 233)
How long is the delay introduced by a block processing system with frame length P?
Make sure you understand figure 8.1.
Unfortunately, block processing inevitably introduces a delay, which is at least as large as block length, also called frame length P.
(p. 234)
What is in trade-off with the processing delay?
Typically, there is a trade-off between processing delay on the one hand, and computational complexity and/or algorithm performance on the other hand.
(p. 234)
Make sure that you understand figures 8.2 and 8.3, and that you can give the corresponding difference equation, transfer function, filter order, impulse response, frequency response and the poles and zeros.
You may furthermore be asked to discuss the stability and to calculate the number of arithmetic operations, the expected block processing delay and the amount of memory required for a real-time implementation.
(p. 235 - 237)
What is a multiply-accumulate operation? What is it used for?
Many DSP processors provide hardware-encoded so-called multiply-accumulate (MAC) instructions to efficiently perform such a combined multiplication and addition.
Hence, if MAC instructions can be used, the number of operations per second needed
for the FIR filtering operation of equation 7.1 reduces to (N + 1)fs ops.
(p. 235 - 236)
What is a circular buffer?
What is it used for?
If a linear buffer is used, the first cell contains the current input x[k] and in the last cell the least recent input sample x[k − N] is stored. Disadvantage of this approach is that each time a new
input sample arrives the entire delay line has to be updated (mass shift of data). This explains why often a circular buffer is preferred, where at each sample instance only the oldest sample x[k − N], which is not needed any further, is replaced with the newest input sample x[k+ 1] and all other data in memory are left unchanged.
(p. 236)
When are fast convolution techniques needed?
Be aware that FIR filtering operations often require a large number of calculations, and that hence, cost efficient implementation schemes based on fast convolution techniques are called for.
(p. 237)