ac6-training, un département d'Ac6 SAS
Site displayed in English (USA)
View the site in FrenchSite affiché en English (USA)Voir le site en English (GB)
+ +
- -
Online Training
Calendar  Details
Operating Systems
Calendar  Details
Calendar  Details
Calendar  Details
Calendar  Details
+ +
> >
- -

ac6 >> ac6-training >> Processors >> ARM Cores >> Cortex-M7 implementation Download as PDF Write us

RM4 Cortex-M7 implementation

This course covers the Cortex-M7 V7E-M compliant CPU

  • This course is split into 3 important parts:
    • Cortex-M7 architecture
    • Cortex-M7 software implementation and debug
    • Cortex-M7 hardware implementation.
  • Through a tutorial, the Cortex-M7 low level programming is explained, particularly the ARM linker parameterizing and some tricky assembly instructions.
  • Note that attendees can replay these labs after the training.
  • The course also indicates how to use caches and TCMs which are new units with regard to Cortex-M4.
  • The course also details the hardware implementation and provides some guidelines to design a SoC based on Cortex-M7, taking benefit of concurrent AXI/AHB transactions.
  • An overview of the Coresight specification is provided prior to describing the debug related units.

A more detailed course description is available on request at
  • Our course reference RI0 - AXI3 / AXI4 INTERCONNECT course details the operation of AXI4 bus.
  • Contents can adapted when attendees have already the knowledge of Cortex-M4.
  • Theoretical course
    • PDF course material (in English) supplemented by a printed version for face-to-face courses.
    • Online courses are dispensed using the Teams video-conferencing system.
    • The trainer answers trainees' questions during the training and provide technical and pedagogical assistance.
  • At the start of each session the trainer will interact with the trainees to ensure the course fits their expectations and correct if needed
  • Any embedded systems engineer or technician with the above prerequisites.
  • The prerequisites indicated above are assessed before the training by the technical supervision of the traineein his company, or by the trainee himself in the exceptional case of an individual trainee.
  • Trainee progress is assessed by quizzes offered at the end of various sections to verify that the trainees have assimilated the points presented
  • At the end of the training, each trainee receives a certificate attesting that they have successfully completed the course.
    • In the event of a problem, discovered during the course, due to a lack of prerequisites by the trainee a different or additional training is offered to them, generally to reinforce their prerequisites,in agreement with their company manager if applicable.

Course Outline

  • Memory interfaces: TCMs, caches and AHB peripheral port
  • Fixed memory map
  • Configuration options
  • Lock-step implementation
  • Highlighting the new features of the V7E-M architecture
  • In-order superscalar pipeline
  • Dynamic branch prediction
  • Bit-banding
  • System timer
  • System control block
  • Detail of Data Processing Unit
  • Load Store Unit, store buffering
  • Caches
    • Cortex-M7 cache implementation
    • Cache maintenance operations
    • Dynamic read allocate mode, recovering from errors
    • Store buffer merging
    • L1 Caches error correcting code
  • TCMs
    • Benefit of implementing two separate D-TCM ports
    • Implementing external ECC for TCMs
  • Internal bus matrix
  • Accessing ROM
  • Accessing SRAM
  • Connecting peripherals
  • Sharing resources between Cortex-M7 and other CPUs
  • STM32F7 architecture
  • Atomicity in single processor multiple thread systems
  • Operation of the Local monitor
  • Wait For Interrupt
  • Events
  • Exception behavior
  • Exception-continuable instructions
  • Non-maskable exceptions
  • Fault handling MPU faults, external faults
  • Priority boosting
  • Reset sequence, initialization requirements
  • Interrupt entry / exit, timing diagrams
  • Tail chaining
  • NVIC registers
  • Interrupt prioritization
  • Interrupt handlers
  • Wake-up Interrupt Controller
  • Device and normal memory ordering
  • Memory type access restrictions
  • Memory ordering restrictions
  • Memory protection overview, ARM v7 PMSA
  • Fault status and address registers
  • Region overview, memory type and access control, sub-regions
  • Region overlapping
  • Setting up the MPU
  • Introduction to IEEE754,
  • Floating point arithmetic
  • Cortex-M7 single and double precision FPU
  • Hardware support for denormals and all IEEE rounding modes
  • Improving the performance by selection flush-to-zero mode and default NaN mode
  • Extension of AAPCS to include FP registers
  • Lazy floating-point context save
  • Highlighting the new features of FPv5
  • Overview of AXI bus specification, explaining ordering rules
  • AXI attributes and transactions
  • Restrictions on AXI transfers
  • Overview of AHB-Lite protocol
  • AHBP
  • AHBS
  • Pinout
  • Clocking and reset, power management
  • Using an external Wake-up Interrupt Controller (WIC)
  • Cache configurability
  • AXI interface characteristics
  • Understanding through sequences how cacheable information is copied from memory to level 1 and level 2 caches
  • Transient operations, utilization of line buffers LFBs, LRBs, EBs and STBs
  • Power management
  • Cache event monitoring
  • Cache lockdown
  • Interrupt management
  • Coresight debug infrastructure, DAP
  • Cortex-M7 debug features
  • Halt mode
  • Vector catching
  • Monitor mode
  • Debug event sources
  • Flash patch and breakpoint features
  • Data watchpoint and trace
  • AHB-lite Debug interface
  • ROM table
  • Basic ETM operation
  • Instruction trace principles
  • Instrumentation Trace Macrocell
  • DWT trace packets
  • Time-stamping packets
  • Instruction tracing, branch packets, exception tracing packets
  • TPIU components
  • Embedded Trace Buffer
  • Purpose of this debug unit
  • Trigger signals to CTI and Trigger signals from CTI
  • Embedded development process
  • Application startup
  • Placing code, data, stack and heap in the memory map, scatterloading
  • Reset and initialisation
  • Placing a minimal vector table
  • Further memory map considerations, 8-byte stack alignment in handlers
  • Building and debugging your image
  • Long branch veneers
  • Coding guidelines when a cache is used
  • ARM compiler optimisations, tail-call optimization, inlining of functions
  • Mixing C/C++ and assembly
  • Coding with ARM compiler
  • Measuring stack usage
  • Unaligned accesses
  • Local and global data issues, alignment of structures
  • Further optimisations, linker feedback
  • Data processing instructions
  • Branch and control flow instructions
  • Exception generating instructions
  • If…then conditional blocks
  • Stack in operation
  • Memory barriers and synchronization
  • Multiply instructions
  • Packing / unpacking instructions
  • V6 ARM SIMD packed add / sub instructions
  • SIMD combined add/sub instructions
  • Multiply and multiply accumulate instructions
  • SIMD sum absolute difference instructions
  • SIMD select instruction
  • Saturation instructions