V4 FPGA Optimization
Hardware Architecture
Objectives
- VHDL or Verilog concepts
- C language knowledge (see, for example, our L2 training course)
- Familiarity with FPGA concepts
- Theoretical course
- PDF course material (in English)
- The trainer answers trainees’ questions during the training and provides technical and pedagogical assistance
- Practical activities
- Practical activities represent 40% to 50% of the course duration
- Example code, labs and solutions
- Vivado or Libero for design, synthesis, and timing analysis; ModelSim or Vivado for simulation
- Any embedded systems engineer or technician with the above prerequisites.
- The prerequisites indicated above are assessed before the training by the trainee's technical supervisor within their company, or by the trainee personally in the exceptional case of an individual trainee.
- Trainee progress is assessed in two different ways, depending on the course:
- For courses lending themselves to practical exercises, the trainer checks the results of the exercises and, where necessary, helps trainees complete them by providing additional details.
- Quizzes are offered at the end of sections that do not include practical exercises, to verify that the trainees have assimilated the points presented.
- At the end of the training, each trainee receives a certificate attesting that they have successfully completed the course.
- If a problem caused by a trainee's lack of prerequisites is discovered during the course, a different or additional training course is offered to them, generally to reinforce their prerequisites, in agreement with their company manager if applicable.
Course Outline
- High Throughput
- Low Latency
- Timing
- Add Register Layers
- Parallel Structures
- Flatten Logic Structures
- Register Balancing
- Reorder Paths
- Exercise: Example of Optimizing a Multiply-Accumulate Block
- Rolling Up the Pipeline
- Control-Based Logic Reuse
- Resource Sharing
- Impact of Reset on Area
- Resources Without Reset
- Resources Without Set
- Resources Without Asynchronous Reset
- Resetting RAM
- Utilizing Set/Reset Flip-Flop Pins
- Exercise: Example of analyzing, comparing and optimizing multiple designs
- Clock Control
- Clock Skew
- Managing Skew
- Input Control
- Reducing the Voltage Supply
- Dual-Edge Triggered Flip-Flops
- Modifying Terminations
- AES Architectures
- One Stage for Sub-bytes
- Zero Stages for Shift Rows
- Two Pipeline Stages for Mix-Column
- One Stage for Add Round Key
- Compact Architecture
- Partially Pipelined Architecture
- Fully Pipelined Architecture
- Performance Versus Area
- Other Optimizations
- Abstract Design Techniques
- Graphical State Machines
- DSP Design
- Software/Hardware Codesign Thread Fundamentals
- Crossing Clock Domains
- Metastability
- Solution 1: Phase Control
- Solution 2: Double Flopping
- Solution 3: FIFO Structure
- Partitioning Synchronizer Blocks
- Gated Clocks in ASIC Prototypes
- Clocks Module
- Gating Removal Runtime Statistics
- Exercise: Show the effects of metastability when crossing an asynchronous signal
- Exercise: Measure the probability of metastability by simulating with random input changes
- Hardware Division
- Multiply and Shift
- Iterative Division
- The Goldschmidt Method
- Taylor and Maclaurin Series Expansion
- The CORDIC Algorithm
- Exercise: Example Design: I2S Versus SPDIF
- Exercise: Example Design: Floating-Point Unit
- Asynchronous Versus Synchronous
- Problems with Fully Asynchronous Resets
- Fully Synchronized Resets
- Asynchronous Assertion, Synchronous Deassertion
- Mixing Reset Types
- Nonresettable Flip-Flops
- Internally Generated Resets
- Multiple Clock Domains
- Exercise: Observe the differences between async and sync resets on flip-flops
- Testbench Architecture
- Testbench Components
- Testbench Flow
- Main Thread
- Clocks and Resets
- Test Cases
- System Stimulus
- MATLAB
- Bus-Functional Models
- Code Coverage
- Gate-Level Simulations
- Toggle Coverage
- Run-Time Traps
- Timescale
- Glitch Rejection
- Combinatorial Delay Modeling
- Exercise: Understanding event bit group by synchronizing several threads
- Design Partitioning
- Critical-Path Floorplanning
- Floorplanning Dangers
- Optimal Floorplanning
- Data Path
- High Fan-Out
- Device Structure
- Reusability
- Reducing Power Dissipation
- Standard Analysis
- Latches
- Asynchronous Circuits
- Combinatorial Feedback
- Power Supply
- Supply Requirements
- Regulation
- Decoupling Capacitors
- Concept
- Calculating Values
- Capacitor Placement
- SRC Architecture
- Synthesis Optimizations
- Speed Versus Area
- Pipelining
- Physical Synthesis
- Floorplan Optimizations
- Partitioned Floorplan
- Critical-Path Floorplan
- FPGA Memory Types
- Flip-Flops (FF) vs LUT RAM vs Block RAM (BRAM) vs UltraRAM
- When NOT to use Flip-Flops
- Resource explosion and routing impact
- Efficient Memory Mapping
- Using BRAM for buffers and FIFOs
- Inferring RAM in HDL
- Distributed RAM usage strategies
- DSP Blocks in FPGA
- Multipliers and MAC units
- FFT Architectures
- Radix-2 / Radix-4 basics
- Pipelined vs iterative FFT
- Fixed-point vs floating-point trade-offs
- Throughput vs resource trade-offs
- Exercise: Implement a dynamic FFT IP from the PS part
More
To book a training session or for more information, please contact us at info@ac6-training.com.
Registrations are accepted until one week before the start date for scheduled classes. For late registrations, please consult us.
You can also fill in and send us the registration form.
This course can be provided either remotely, in our Paris training center or worldwide on your premises.
Scheduled classes are confirmed as soon as there are two confirmed bookings.
Last update of course schedule: 23 February 2026
Booking one of our training courses is subject to our General Terms of Sales.
Related Courses
- ALT1: CYCLONE-V CORTEX-A9 HARD PROCESSOR SYSTEM
- ALT2: FPGA Nios (Nios II / Nios V) implementation
- H1: Lattice Mico32 FPGA embedded processor
- H2: Lattice Diamond
- HX4: AMD (Xilinx) - Microblaze implementation
- HX5: AMD Zynq All Programmable SoC: Hardware and Software Design
- MSP: Microchip SmartFusion2 Programming
- RV1: RISC-V Architecture
- U1: SystemVerilog
- V0: Programmable components fundamentals
- V1: VHDL Language Basics
- V2: Advanced VHDL for FPGA
- V3: Design with SystemC