ac6-training, a department of Ac6 SAS

FCQ1 P101X QorIQ implementation

This course covers the NXP QorIQ P1010 and P1014 processors.

  • The course clarifies the architecture of the P1010 and P1014, particularly the operation of the coherency module that interconnects the e500 to memory and high-speed interfaces.
  • Cache coherency protocol is introduced in increasing depth.
  • The e500 core is examined in detail, especially the SPE unit that enables vector processing.
  • The boot sequence and the clocking are explained.
  • The course focuses on the hardware implementation of the P101X.
  • A thorough introduction to DDR SDRAM operation precedes the study of the DDR3/3L SDRAM controller.
  • The PCI Express port is described in depth.
  • The course explains how to implement QoS on the Gigabit Ethernet controllers.

  • ACSYS has developed an optimized SPE-based FFT coded in assembly language.
  • Performance for 1024 complex single-precision floating-point samples:
    • 91,386 core clock cycles without reverse ordering; 94,124 with reverse ordering
  • Performance for 4096 complex single-precision floating-point samples:
    • 470,778 core clock cycles without reverse ordering; 511,227 with reverse ordering
  • For any information contact
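The FFT cycle counts quoted above can be put in perspective with a little arithmetic. The following Python sketch derives cycles per sample and the relative cost of reverse ordering from those figures. The 800 MHz core clock used to convert cycles into time is an assumption chosen for illustration only (the description does not state a frequency), and the function names are hypothetical.

```python
# Core clock cycle counts quoted in the course description.
FFT_CYCLES = {
    1024: {"natural": 91_386, "reordered": 94_124},
    4096: {"natural": 470_778, "reordered": 511_227},
}

# ASSUMPTION: illustrative core clock, not taken from the course description.
ASSUMED_CORE_HZ = 800_000_000


def cycles_per_sample(n, order):
    """Average core clock cycles spent per complex sample."""
    return FFT_CYCLES[n][order] / n


def reorder_overhead_pct(n):
    """Extra cost of reverse ordering, as a percentage of the plain FFT."""
    plain = FFT_CYCLES[n]["natural"]
    reordered = FFT_CYCLES[n]["reordered"]
    return 100.0 * (reordered - plain) / plain


def exec_time_us(n, order, core_hz=ASSUMED_CORE_HZ):
    """Execution time in microseconds at the given core clock."""
    return FFT_CYCLES[n][order] / core_hz * 1e6


if __name__ == "__main__":
    for n in sorted(FFT_CYCLES):
        print(
            f"N={n}: {cycles_per_sample(n, 'natural'):.2f} cycles/sample, "
            f"reverse ordering adds {reorder_overhead_pct(n):.2f}%, "
            f"{exec_time_us(n, 'natural'):.1f} us at the assumed clock"
        )
```

Passing a different `core_hz` to `exec_time_us` rescales the timing for another clock frequency; the cycle counts themselves do not change.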
A more detailed course description is available on request at
  • Theoretical course
    • PDF course material (in English) supplemented by a printed version for face-to-face courses.
    • Online courses are delivered using the Teams video-conferencing system.
    • The trainer answers trainees' questions during the training and provides technical and pedagogical assistance.
  • At the start of each session, the trainer interacts with the trainees to ensure the course fits their expectations and adjusts it if needed.
  • Any embedded systems engineer or technician with the above prerequisites.
  • The prerequisites indicated above are assessed before the training by the trainee's technical supervisor within their company, or by the trainee themselves in the exceptional case of an individual trainee.
  • Trainee progress is assessed by quizzes offered at the end of various sections to verify that the trainees have assimilated the points presented.
  • At the end of the training, each trainee receives a certificate attesting that they have successfully completed the course.
    • If a problem due to a lack of prerequisites is discovered during the course, a different or additional training is offered to the trainee, generally to reinforce those prerequisites, in agreement with their company manager if applicable.

Course Outline

  • Address map, ATMU, OCEAN configuration
  • Local vs external address spaces, inbound and outbound address decoding
  • Access control unit
  • Dual-issue superscalar control, out-of-order execution
  • Execution units
  • Dynamic branch prediction
  • Execution timing
  • Load store unit, data buffering between LSU and CCB
  • Store miss merging and store gathering
  • Memory access ordering
  • Lock acquisition and import barriers
  • Thread vs process
  • The first level MMU and the second level MMU, consistency between L1 and L2 TLBs
  • TLB software reload, page attributes WIMGE
  • Process protection, variable number of PID registers and sharing
  • 36-bit real addressing
  • The L1 caches
  • Level 2 cache, partition into L2 cache plus SRAM
  • Snooping mechanism
  • Stashing mechanism
  • L2 cache locking
  • ECC protection
  • Differences between the new Book E architecture and the classic PowerPC architecture
  • Floating Point units, Double-Precision FP
  • Signal Processing Engine (SPE) APU: implementation of the SIMD capability without using a separate unit
  • PowerPC EABI: sections, C-to-assembly interface
  • Critical versus non-critical interrupts
  • Handler table
  • Syndrome registers
  • Core timers
  • Performance monitoring
  • JTAG emulation
  • Watchpoint logic
  • Voltage configuration selection
  • Power-on reset sequence, using the I2C interface to access serial ROM
  • Power-on reset configuration
  • Power management
  • Secure boot and trust architecture
  • I/O arbiter
  • CCB arbiter
  • Global data multiplexor
  • On-die termination
  • Calibration mechanism
  • Mode registers initialization, bank selection and precharge
  • Command truth table
  • Bank activation, read, write and precharge timing diagrams, page mode
  • Introduction to the DDR-SDRAM controller
  • Initial configuration following Power-on-Reset
  • Timing parameters programming
  • Functional muxing of pins between NAND, NOR, and GPCM
  • Data Buffer Control
  • Normal GPCM FSM
  • NOR flash FSM
  • Generic ASIC FSM
  • NAND flash FSM
  • 1-lane PCI Express interface
  • Modes of operation, Root Complex / Endpoint
  • Transaction ordering rules
  • Programming inbound and outbound ATMUs
  • Configuration, initialization
  • Electrical specification
  • Native command queuing, command descriptor
  • Interrupt coalescing
  • Port multiplier operation
  • Initialization steps
  • Interrupt sources
  • Integrated timers
  • Per-CPU register usage
  • Nesting implementation
  • Priority between the 4 channels
  • Support for cascading descriptor chains
  • Scatter/gather
  • Selectable hardware enforced coherency
  • Event counting
  • Threshold events
  • Chaining, triggering
  • Watchpoint facility
  • Trace buffer
  • Address recognition, pattern matching
  • Buffer descriptors management
  • Physical interfaces: RGMII, SGMII
  • Buffer descriptor management
  • Layer 2 acceleration
  • 256-entry hash table
  • Direct queuing of four flows
  • Management of VLAN
  • Quality of service
  • Filer programming
  • IEEE1588 compliant time-stamping
  • Hardware interface
  • Program options for frame sync and clock generation
  • Network mode of operation with up to 128 time-slots
  • DMA configuration
  • TDM power-down feature
  • Configuring the TDM for I2S Operation
  • Storing and executing commands targeting the external card
  • Multi-block transfers
  • Moving data by using the dedicated DMA controller
  • Dividing large data transfers
  • Card insertion and removal detection
  • Dual-role (DR) operation
  • EHCI implementation
  • ULPI interfaces to the transceiver
  • Dedicated DMA channels
  • Endpoints configuration
  • Introduction to DES and 3DES algorithms
  • Data packet descriptors
  • Crypto channels
  • Link tables
  • XOR acceleration
  • Message buffers, mask registers
  • Time Stamp based on 16-bit free-running timer
  • Short latency time due to an arbitration scheme for high-priority messages
  • Description of the NS16552 compliant DUART
  • I2C controllers
  • Enhanced SPI
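
One outline item above mentions a time stamp based on a 16-bit free-running timer. Such a counter wraps around every 65,536 ticks, so software comparing two time stamps usually takes the difference modulo 2^16. The Python sketch below illustrates that general technique only; it does not reproduce any register layout or driver API of the P101X, and the names `elapsed_ticks` and `TimestampExtender` are hypothetical.

```python
TIMER_BITS = 16
TIMER_MASK = (1 << TIMER_BITS) - 1  # 0xFFFF


def elapsed_ticks(earlier, later):
    """Ticks between two raw 16-bit time stamps, tolerant of one wraparound.

    Only valid if fewer than 2**16 ticks separate the two samples.
    """
    return (later - earlier) & TIMER_MASK


class TimestampExtender:
    """Accumulates raw 16-bit time stamps into an unbounded tick count.

    ASSUMPTION: update() is called at least once per timer period,
    otherwise whole wraparounds go unnoticed.
    """

    def __init__(self):
        self._last = None
        self._total = 0

    def update(self, raw):
        raw &= TIMER_MASK
        if self._last is not None:
            self._total += elapsed_ticks(self._last, raw)
        self._last = raw
        return self._total


if __name__ == "__main__":
    ext = TimestampExtender()
    for raw in (0xFFF0, 0x0010, 0x0020):
        print(hex(raw), ext.update(raw))
```

The extended count is relative to the first sample, so the first call to `update` returns 0.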