Cue_Cards Flashcards
What is Reconfigurable Computing (RC)?
Reconfigurable Computing (RC) uses programmable logic like FPGAs to accelerate computations. It combines software flexibility with hardware speed, exploits parallelism at various levels (bit, instruction, and process), and can be dynamically reconfigured for different tasks.
What are the main benefits of Spatial Computing?
Spatial computing improves performance by optimizing for specific problems, efficiently utilizing area, lowering power consumption due to customization, and allowing adaptation for changing requirements.
What techniques enhance RC performance?
Techniques to enhance performance include pipelining, which breaks tasks into stages, and parallelism, where multiple tasks execute simultaneously. For example, FPGAs utilize hardware-level parallelism for acceleration.
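To make the pipelining claim concrete, here is a minimal sketch of the usual cycle-count arithmetic. It assumes every stage takes exactly one clock cycle and the pipeline never stalls; the function names are illustrative and do not come from the cards.

```python
def unpipelined_cycles(n_items: int, n_stages: int) -> int:
    """Each item passes through all stages before the next item starts."""
    return n_items * n_stages


def pipelined_cycles(n_items: int, n_stages: int) -> int:
    """The first item needs n_stages cycles to fill the pipeline;
    every later item completes one cycle after the previous one."""
    if n_items == 0:
        return 0
    return n_stages + (n_items - 1)


if __name__ == "__main__":
    n_items, n_stages = 1000, 4  # assumed workload, for illustration only
    slow = unpipelined_cycles(n_items, n_stages)
    fast = pipelined_cycles(n_items, n_stages)
    print(slow, fast)  # prints: 4000 1003
```

For long input streams the speedup approaches the number of stages, which is why pipelining pays off mainly on streaming workloads.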
What are the coupling approaches in RC systems?
The coupling approaches in RC systems include functional units (FUs) integrated into the CPU’s data path, co-processors operating independently alongside the CPU, attached processing units similar to DMA but less tightly coupled, and standalone units with their own memory.
What is the trade-off between tight and loose coupling?
The trade-off between tight and loose coupling is in communication and independence. Tightly coupled systems have lower communication overhead but limited capability for complex tasks, while loosely coupled systems offer greater independence and parallelism but incur higher communication overhead.
How can we manage large FPGA designs?
Large FPGA designs can be managed by using the largest available FPGAs, optimizing architecture through semi-parallel or serial implementations, replacing floating-point arithmetic with fixed-point, or employing runtime reconfiguration to dynamically adapt the FPGA for different tasks.
What are the benefits of fixed-point arithmetic in FPGA designs?
Fixed-point arithmetic is advantageous because it is less costly, simpler in hardware, consumes less power, and is faster compared to floating-point arithmetic. However, it has limited precision and is not suitable for applications requiring high accuracy.
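The precision trade-off can be illustrated with a small sketch of fixed-point quantization. It assumes a signed format with 8 fractional bits and ignores overflow and saturation, which real hardware has to handle because word widths are finite; all names here are hypothetical.

```python
def to_fixed(x: float, frac_bits: int) -> int:
    """Quantize x to an integer that represents x * 2**frac_bits."""
    return round(x * (1 << frac_bits))


def to_float(q: int, frac_bits: int) -> float:
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def fixed_mul(a: int, b: int, frac_bits: int) -> int:
    """Multiply two fixed-point values. The raw product carries
    2 * frac_bits fractional bits, so shift right to rescale."""
    return (a * b) >> frac_bits


if __name__ == "__main__":
    F = 8  # assumed number of fractional bits
    a, b = to_fixed(1.5, F), to_fixed(2.25, F)
    print(to_float(fixed_mul(a, b, F), F))  # prints: 3.375
    # 0.1 is not exactly representable: the stored value is 26 / 256.
    print(to_fixed(0.1, F), to_float(to_fixed(0.1, F), F))  # prints: 26 0.1015625
```

The multiplication is a plain integer multiply plus a shift, which is why it is cheaper in hardware than a floating-point unit. The second print shows the rounding error that limits accuracy.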
What are the differences between parallel, semi-parallel, and serial designs?
Parallel designs provide high speed but consume more area, while semi-parallel designs strike a balance between speed and area. Serial designs are the most area-efficient but have the lowest speed. The choice depends on constraints like cost and performance.
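As a rough illustration of the speed/area trade-off, the sketch below counts cycles for a 16-term multiply-accumulate when a different number of multipliers is instantiated. It assumes each multiplier completes one product per cycle and ignores adder-tree latency and control overhead; the 16-term workload is an assumption.

```python
import math


def mac_cycles(n_terms: int, n_multipliers: int) -> int:
    """Cycles needed to compute n_terms products with n_multipliers units."""
    return math.ceil(n_terms / n_multipliers)


if __name__ == "__main__":
    n_terms = 16  # assumed workload, for illustration only
    for name, mults in [("parallel", 16), ("semi-parallel", 4), ("serial", 1)]:
        # Area grows with the multiplier count; cycles shrink with it.
        print(name, mults, mac_cycles(n_terms, mults))
    # prints:
    # parallel 16 1
    # semi-parallel 4 4
    # serial 1 16
```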
What are common metaheuristic methods for partitioning optimization?
Metaheuristic methods for optimization include simulated annealing, which escapes local minima by accepting uphill (cost-increasing) moves with a probability that shrinks as the temperature is lowered, and genetic algorithms, which use evolutionary strategies like selection, crossover, and mutation to refine solutions. These methods balance exploration and exploitation.
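A minimal sketch of simulated annealing applied to a balanced two-way partition follows. The cost function (number of cut edges), the swap move, the geometric cooling schedule, and every parameter value are assumptions chosen for illustration, not a method prescribed by the cards.

```python
import math
import random


def cut_size(edges, side):
    """Number of edges whose endpoints lie in different partitions."""
    return sum(1 for u, v in edges if side[u] != side[v])


def anneal_bipartition(n_nodes, edges, seed=0, t_start=2.0, t_end=0.01,
                       alpha=0.95, moves_per_temp=50):
    """Return (best_side, best_cost) for a balanced bipartition (n_nodes >= 2)."""
    rng = random.Random(seed)
    side = [i % 2 for i in range(n_nodes)]  # alternating start: balanced
    cost = cut_size(edges, side)
    best_side, best_cost = side[:], cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            # Move: swap one node from each side, which preserves balance.
            a = rng.choice([i for i in range(n_nodes) if side[i] == 0])
            b = rng.choice([i for i in range(n_nodes) if side[i] == 1])
            side[a], side[b] = 1, 0
            new_cost = cut_size(edges, side)
            delta = new_cost - cost
            # Always accept improvements; accept uphill moves with
            # probability exp(-delta / t), which shrinks as t cools.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best_side, best_cost = side[:], cost
            else:
                side[a], side[b] = 0, 1  # reject: undo the swap
        t *= alpha  # geometric cooling
    return best_side, best_cost


if __name__ == "__main__":
    # Two 4-node cliques joined by a single bridge edge (3, 4).
    clique_a = [(i, j) for i in range(4) for j in range(i + 1, 4)]
    clique_b = [(i, j) for i in range(4, 8) for j in range(i + 1, 8)]
    edges = clique_a + clique_b + [(3, 4)]
    print(anneal_bipartition(8, edges))
```

At high temperature almost any swap is accepted (exploration); as the temperature drops the search behaves more and more like greedy descent (exploitation).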
When should you use ASAP scheduling, and when is ALAP scheduling more appropriate?
Use ASAP scheduling when minimizing latency is critical. It ensures operations are executed as early as possible for the shortest latency. Use ALAP scheduling when optimizing for resource usage or energy efficiency. It schedules operations as late as possible, maximizing slack and allowing for better resource sharing.
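The difference between the two schedules can be sketched on a small dataflow graph. The sketch assumes every operation takes one time step and that there are no resource limits; the operation names and the graph are made up for illustration.

```python
def asap(ops, deps):
    """Earliest start step per op. `ops` must be in topological order;
    deps[op] lists the predecessors of op."""
    start = {}
    for op in ops:
        start[op] = max((start[p] + 1 for p in deps.get(op, [])), default=0)
    return start


def alap(ops, deps, latency_bound):
    """Latest start step per op such that everything finishes
    within latency_bound steps."""
    succs = {op: [] for op in ops}
    for op, preds in deps.items():
        for p in preds:
            succs[p].append(op)
    start = {}
    for op in reversed(ops):
        start[op] = min((start[s] - 1 for s in succs[op]),
                        default=latency_bound - 1)
    return start


if __name__ == "__main__":
    # Hypothetical graph: a1 = m1 + m2, a2 = a1 + m3
    ops = ["m1", "m2", "m3", "a1", "a2"]
    deps = {"a1": ["m1", "m2"], "a2": ["a1", "m3"]}
    early = asap(ops, deps)
    late = alap(ops, deps, latency_bound=3)
    print(early)  # prints: {'m1': 0, 'm2': 0, 'm3': 0, 'a1': 1, 'a2': 2}
    print(late)   # prints: {'a2': 2, 'a1': 1, 'm3': 1, 'm2': 0, 'm1': 0}
    # Slack (mobility) = ALAP start - ASAP start; only m3 can move.
    print({op: late[op] - early[op] for op in ops})
```

Here ASAP needs three multipliers in step 0, while ALAP delays `m3` to step 1 so two multipliers suffice. That is the resource-sharing benefit the card refers to.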
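What are the limitations of using the bit data type to represent a physical signal in VHDL?
Bit can only take the values ‘0’ and ‘1’, so it cannot model other physical conditions such as high impedance (‘Z’), unknown (‘X’), or uninitialized (‘U’), which std_logic provides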
What’s the difference between a VARIABLE and a SIGNAL in synthesis and simulation?
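A VARIABLE is local to a process, updates immediately on assignment, and is used for intermediate computation; it typically maps to combinational logic. A SIGNAL is used for communication or storage, updates only after the process suspends, and maps to flip-flops or interconnect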
Name three VHDL constructs that can’t be translated into hardware
Timed wait statements (wait for), assertions, and loops without static bounds
What does a BLE consist of?
LUTs, Flip Flops, MUX
What does a CLB consist of?
BLEs and Slices
How can a BLE be used to realize a boolean function?
LUTs can be used for combinational circuits while the Flip Flops can be used for sequential circuits
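To show how a LUT realizes a Boolean function, the sketch below models a k-input LUT as a 2**k-entry truth table addressed by the input bits. The helper names and the majority-function example are illustrative assumptions.

```python
def make_lut(func, k):
    """Precompute the 2**k truth-table entries (the LUT's configuration
    bits) for a k-input Boolean function."""
    table = []
    for addr in range(1 << k):
        bits = [(addr >> i) & 1 for i in range(k)]  # bits[0] is input 0
        table.append(func(*bits) & 1)
    return table


def lut_read(table, *inputs):
    """Evaluate the LUT: the inputs form the address into the table."""
    addr = sum(bit << i for i, bit in enumerate(inputs))
    return table[addr]


if __name__ == "__main__":
    # 3-input majority function stored in a 3-LUT.
    majority = make_lut(lambda a, b, c: (a & b) | (a & c) | (b & c), 3)
    print(majority)                     # prints: [0, 0, 0, 1, 0, 1, 1, 1]
    print(lut_read(majority, 1, 1, 0))  # prints: 1
    print(lut_read(majority, 1, 0, 0))  # prints: 0
```

Any function of k inputs fits in the same table, so the lookup delay does not depend on how complex the function is. In a BLE, the multiplexer then selects either this combinational output or the registered copy from the flip-flop.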
Name three flaws of a fine grain FPGA
1) Delay through the LUT will be constant 2) LUTs are grouped in CLBs, making connections outside of LUT slow 3) Lots of bits have to be downloaded as a bitstream, leading to a high power consumption
What are main advantages of using a medium grain FPGA over fine grain?
A medium grain FPGA has built-in blocks of RAM along with DSP units, which increases area efficiency and makes it faster
Give the FPGA CAD flow
Design Entry -> Synthesis -> Logic Optimization -> Map to k-LUTs -> Packing -> Placement -> Routing -> Configure the FPGA
What are the four objectives of placement in an FPGA?
1) Minimize wire length 2) Minimize congestion 3) Minimize signal delay 4) Ensure routability
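Wire length is commonly estimated during placement with the half-perimeter wirelength (HPWL) of each net's bounding box. The sketch below assumes that metric and uses a made-up placement and netlist; the cards do not specify which estimate a given tool uses.

```python
def hpwl(pins):
    """Half-perimeter of the bounding box around a net's pin coordinates."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_wirelength(placement, nets):
    """Sum of HPWL over all nets; placement maps block name -> (x, y)."""
    return sum(hpwl([placement[block] for block in net]) for net in nets)


if __name__ == "__main__":
    # Hypothetical placement of three blocks on a grid, with two nets.
    placement = {"A": (0, 0), "B": (3, 1), "C": (1, 4)}
    nets = [["A", "B"], ["A", "B", "C"]]
    print(total_wirelength(placement, nets))  # prints: 11
```

A placer would compare this cost before and after moving a block and keep the moves that reduce it, subject to the congestion and delay objectives.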
What are advantages and disadvantages of separating placement and routing into two stages?
ADVANTAGE: splitting simplifies the problem. DISADVANTAGE: placement and routing interact, so critical values and other information may be lost between the two stages
State Flynn’s taxonomy of Multiprocessing
SISD (Single Instruction, Single Data), SIMD (Single Instruction, Multiple Data), MISD (Multiple Instruction, Single Data), MIMD (Multiple Instruction, Multiple Data)
What is Amdahl’s Law?
Amdahl’s Law gives the maximum speedup of an overall system when only a portion of it is sped up. With a fraction p of the work parallelized across n processors, speedup = 1 / ((1 - p) + p / n)
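A short sketch of the standard Amdahl formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can be parallelized and n is the number of processors. The function name and the example numbers are illustrative.

```python
def amdahl_speedup(parallel_fraction: float, n_processors: float) -> float:
    """Overall speedup when parallel_fraction of the work is spread
    over n_processors and the rest stays serial."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


if __name__ == "__main__":
    # 90% of the work parallelized across 8 processors (assumed values).
    print(round(amdahl_speedup(0.9, 8), 2))  # prints: 4.71
    # Even with a huge processor count the speedup stays below 1 / (1 - 0.9) = 10.
    print(amdahl_speedup(0.9, 1e9) < 10.0)   # prints: True
```

The serial fraction sets the ceiling. This is why hardware acceleration should target the bottleneck that dominates run time.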
What are the advantages/disadvantages of SRAM-based technology in comparison to Anti-Fuse and EEPROM?
ADVANTAGE: Speed, Programmability, Power Consumption.
DISADVANTAGE: Volatility, External Storage to download Bitstream, Radiation
Explain the hierarchy between LCs, Slices, and CLBs inside an FPGA
The fastest interconnects are between LCs within a slice and the next fastest are between Slices within a CLB; highly related modules are packed together in order to optimize area and performance
What are main drivers behind RCS?
FPGA architectures (fine, medium, coarse grain), PLDs and CPLDs; GPPs are too slow for compute-intensive algorithms; ASICs are too expensive
Temporal vs Spatial Implementations for RCS?
Spatial refers to creating multiple multipliers/adders/ALUs to solve the problem in the minimum number of clock cycles (High Space, Low Cycle). Temporal refers to sharing a single execution unit to perform several operations (Low Space, High Cycle).
Give formal definition of “Hardware/Software Co-Design”
The cooperative design between hardware and software components
What are the main concerns that you have to pay attention to (besides partitioning the application) between the general purpose processor and hardware accelerator?
Communication overhead, load balancing, interfacing, and achieving a balance between performance and flexibility
What’s a simple approach for taking a software application and mapping it onto a hardware/software co-design platform?
1) Profile application 2) Find bottlenecks 3) Partition application to move bottlenecks to hardware
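What is meant by both placement and routing being NP-complete problems?
No polynomial-time algorithm is known that finds the optimal solution, so exact methods are impractical for realistic design sizes and heuristics are used instead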
Give two applications that aren’t suitable for RCS
Applications with extensive recursion, and applications dominated by floating-point arithmetic
If a design exceeds the size of the FPGA, name three options you can pursue, with their advantages/disadvantages
1) Get larger FPGA. Simple solution but expensive
2) Use multiple FPGAs, requires partitioning application
3) Local RTR (run-time reconfiguration), flexible but needs tools and flows, and is harder than Global RTR