Architecting and Building High-Speed SoCs

By: Mounir Maaref

Overview of this book

Modern and complex SoCs can adapt to many demanding system requirements by combining the processing power of ARM processors with feature-rich Xilinx FPGAs. You’ll need to understand many protocols, use a variety of internal and external interfaces, pinpoint the bottlenecks, and define the architecture of an SoC in an FPGA to produce a superior solution in a timely and cost-efficient manner. This book adopts a practical approach to helping you master both the hardware and software design flows, understand key interconnects and interfaces, analyze the system performance and enhance it using acceleration techniques, and finally build an RTOS-based software application for an advanced SoC design. You’ll start with an introduction to FPGA SoC technology fundamentals and the associated design tools. Gradually, the book will guide you through building the SoC hardware and software, from the architecture definition to testing on a demo board or a virtual platform. The level of complexity increases as the book progresses and covers advanced applications such as communications, security, and coherent hardware acceleration. By the end of this book, you’ll have learned the concepts underlying the advanced features of FPGA SoCs and you’ll have constructed a high-speed SoC targeting a high-end FPGA from the ground up.
Table of Contents (20 chapters)
Part 1: Fundamentals and the Main Features of High-Speed SoC and FPGA Designs
Part 2: Implementing High-Speed SoC Designs in an FPGA
Part 3: Implementation and Integration of Advanced High-Speed FPGA SoCs

Xilinx Zynq-7000 SoC family hardware features

As mentioned previously, the Zynq FPGA SoC integrates a popular ARM CPU based on the ARMv7-A architecture with a classical FPGA fabric based on Xilinx 7 series logic, which offers rich hardware features.

For a detailed description of the Zynq-7000 SoC FPGA and its features, please refer to the SoC Technical Reference Manual (TRM) available at https://www.xilinx.com/support/documentation/user_guides/ug585-Zynq-7000-TRM.pdf.

This section specifies the main Zynq-7000 SoC features and defines them to help you quickly visualize the device’s capabilities.

The SoC is mainly composed of an application processor unit (APU), a connectivity matrix, an on-chip memory (OCM) interface, external memory interfaces, and the I/O peripherals (IOP) block.

The following diagram provides a detailed architectural view of the Zynq-7000 SoC:

Figure 1.6 – Zynq-7000 SoC architecture – dual-core cluster example

Zynq-7000 SoC APU

The CPU cluster is built around the ARM Cortex-A9 MPCore, which comes in a dual-core or a single-core configuration. Each CPU core has an L1 instruction cache, an L1 data cache, its own memory management unit (MMU), a floating-point unit (FPU), and a NEON SIMD engine. The cluster has a shared L2 cache and a snoop control unit (SCU). The SCU provides an accelerator coherency port (ACP) that extends cache coherency beyond the cluster to external masters implemented in the FPGA logic.

Each core delivers 2.5 DMIPS/MHz at an operating frequency ranging from 667 MHz to 1 GHz, depending on the Zynq FPGA speed grade. The FPU supports both single- and double-precision operands and delivers 2.0 MFLOPS/MHz. The CPU core is TrustZone-enabled for secure operation. It supports code compression through the Thumb-2 instruction set. The L1 instruction and data caches are each 32 KB in size and 4-way set-associative.
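To put the per-MHz figures into perspective, the following minimal sketch multiplies them by the two clock frequencies quoted above. The function name `peak_per_core` is purely illustrative, and the results are theoretical per-core peaks rather than measured benchmark scores.

```python
# Per-MHz figures quoted in the text.
DMIPS_PER_MHZ = 2.5
MFLOPS_PER_MHZ = 2.0


def peak_per_core(freq_mhz: float) -> dict:
    """Return the theoretical per-core peak figures at a given clock (MHz)."""
    return {
        "dmips": DMIPS_PER_MHZ * freq_mhz,
        "mflops": MFLOPS_PER_MHZ * freq_mhz,
    }


if __name__ == "__main__":
    # Lowest and highest operating frequencies mentioned in the text.
    for freq_mhz in (667, 1000):
        figures = peak_per_core(freq_mhz)
        print(f"{freq_mhz} MHz: {figures['dmips']:.1f} DMIPS, "
              f"{figures['mflops']:.1f} MFLOPS per core")
```

On a dual-core device, the aggregate figure is at best twice the per-core value, and only for workloads that spread evenly across both cores.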

The CPU cluster supports both symmetric multiprocessing (SMP) and asymmetric multiprocessing (AMP) operation modes. The L2 cache is 512 KB in size, eight-way set-associative, and shared by both CPU cores for both instructions and data. The cluster also has a 256 KB OCM RAM that can be accessed by the APU and the programmable logic (PL).
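The cache sizes and associativities given in this section determine how many sets each cache has and how an address is split into tag, set index, and line offset. The sketch below works this out under one assumption that is not stated in the text: a 32-byte cache line for both the L1 and L2 caches. The helper names are hypothetical, and the breakdown is the generic textbook one for a set-associative cache, not a description of the actual cache controllers.

```python
LINE_BYTES = 32  # assumption: 32-byte cache lines (not stated in the text)


def cache_sets(size_bytes: int, ways: int, line_bytes: int = LINE_BYTES) -> int:
    """Number of sets in a set-associative cache."""
    sets, remainder = divmod(size_bytes, ways * line_bytes)
    if sets == 0 or remainder:
        raise ValueError("size must be a non-zero multiple of ways * line size")
    return sets


def split_address(addr: int, size_bytes: int, ways: int,
                  line_bytes: int = LINE_BYTES) -> tuple:
    """Split an address into (tag, set index, byte offset within the line)."""
    sets = cache_sets(size_bytes, ways, line_bytes)
    offset = addr % line_bytes
    index = (addr // line_bytes) % sets
    tag = addr // (line_bytes * sets)
    return tag, index, offset


if __name__ == "__main__":
    print("L1 sets:", cache_sets(32 * 1024, 4))    # 32 KB, 4-way
    print("L2 sets:", cache_sets(512 * 1024, 8))   # 512 KB, 8-way
    print("L1 split:", split_address(0x00100040, 32 * 1024, 4))
```

With these assumptions, two addresses compete for the same L1 set when they are a multiple of 8 KB apart (256 sets of 32 bytes), which is worth keeping in mind when laying out buffers that are accessed together.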

The processing system (PS) has an 8-channel DMA engine that supports memory-to-memory, memory-to-peripheral, peripheral-to-memory, and scatter-gather transfers. Its interfaces are based on the AXI protocol. The FPGA PL can use up to four of the DMA channels.
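As a rough illustration of what a scatter-gather operation does, the sketch below models memory as a Python `bytearray` and walks a chain of transfer descriptors. The `Descriptor` type and `run_scatter_gather` function are hypothetical; this is a conceptual model only and does not reflect how the PS DMA controller is actually programmed.

```python
from typing import List, NamedTuple


class Descriptor(NamedTuple):
    """One element of a scatter-gather chain (hypothetical layout)."""
    src: int     # source offset in memory
    dst: int     # destination offset in memory
    length: int  # number of bytes to move


def run_scatter_gather(memory: bytearray, chain: List[Descriptor]) -> int:
    """Execute each descriptor in order and return the total bytes moved."""
    moved = 0
    for desc in chain:
        if min(desc.src, desc.dst, desc.length) < 0:
            raise ValueError("negative field in descriptor")
        if max(desc.src, desc.dst) + desc.length > len(memory):
            raise ValueError("descriptor exceeds memory bounds")
        # The right-hand slice is copied first, so overlapping ranges are safe.
        memory[desc.dst:desc.dst + desc.length] = \
            memory[desc.src:desc.src + desc.length]
        moved += desc.length
    return moved


if __name__ == "__main__":
    mem = bytearray(b"ABCDEFGH" + bytes(8))
    # Gather two source fragments into one destination buffer, reordered.
    chain = [Descriptor(0, 12, 4), Descriptor(4, 8, 4)]
    print(run_scatter_gather(mem, chain), bytes(mem[8:16]))
```

The point of the descriptor chain is that the CPU sets it up once and the DMA engine then moves all the fragments without further software intervention.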

The SoC has a generic interrupt controller (GIC) version 1.0 (GIC v1). The GIC distributes interrupts to the CPU cluster cores according to the user’s configuration and provides support for priority and preemption.
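The priority and preemption behavior can be illustrated with a very simplified model. It relies on the GIC convention that a lower priority value means a higher priority. The tie-break on the lowest interrupt ID is an assumption made for this sketch, and the model ignores enable bits, priority masking, and CPU targeting. The function name and the interrupt IDs used in the example are hypothetical.

```python
from typing import Dict, Optional


def select_interrupt(pending: Dict[int, int],
                     running_priority: Optional[int] = None) -> Optional[int]:
    """Pick the interrupt ID to signal to a core, or None.

    pending maps interrupt ID -> priority value (lower value = higher priority).
    running_priority is the priority of the handler currently executing, if any.
    """
    if not pending:
        return None
    # Highest priority first; ties broken by lowest ID (assumption).
    irq_id, priority = min(pending.items(), key=lambda item: (item[1], item[0]))
    # Preemption only happens for a strictly higher priority (lower value).
    if running_priority is not None and priority >= running_priority:
        return None
    return irq_id


if __name__ == "__main__":
    pending = {61: 0xA0, 29: 0x20}  # hypothetical interrupt IDs and priorities
    print(select_interrupt(pending))                         # core is idle
    print(select_interrupt({61: 0xA0}, running_priority=0x20))  # no preemption
```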

The PS supports debugging and tracing based on ARM CoreSight technology.

Zynq-7000 SoC memory controllers

The Zynq device supports both DDR SDRAM and static memories. DDR3, DDR3L, DDR2, and LPDDR2 devices are supported. The static memory controllers interface to QSPI flash, NAND flash, and parallel NOR flash.

The SDRAM DDR interface

The SDRAM DDR interface has a dedicated 1 GB of system address space. It can be configured to interface to a full-width 32-bit memory or a half-width 16-bit memory, and it supports several DDR protocols. The PS also includes the DDR PHY, which can operate at a range of speeds up to a maximum of 1,333 Mb/s per data pin. The controller is multi-ported and shares the SDRAM DDR memory bandwidth among SoC clients in the PS or PL regions over four ports. One port is connected to the CPU cluster, two ports serve the PL, and the fourth port is exposed to the SoC central switches, making access possible for all the connected masters.
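The maximum data rate and the two bus widths give a quick upper bound on the memory bandwidth that the four ports share. The sketch below assumes that the 1,333 Mb/s figure is a per-pin data rate and ignores refresh, bank conflicts, and read/write turnaround, so sustained bandwidth will be lower. The function name is illustrative.

```python
def peak_bandwidth_mb_per_s(rate_mbps_per_pin: int, bus_width_bits: int) -> float:
    """Theoretical peak DDR bandwidth in MB/s for a given bus width."""
    if bus_width_bits not in (16, 32):
        raise ValueError("the text describes 16-bit and 32-bit configurations")
    # Each data pin carries rate_mbps_per_pin megabits per second.
    return rate_mbps_per_pin * bus_width_bits / 8


if __name__ == "__main__":
    for width in (32, 16):
        print(f"{width}-bit bus: {peak_bandwidth_mb_per_s(1333, width):.0f} MB/s")
```

Since all four ports draw from this single figure, a PL master that streams data continuously directly reduces what is left for the CPU cluster.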

The following diagram is a memory-centric representation of the SDRAM DDR interface of the Zynq-7000 SoC:

Figure 1.7 – Zynq-7000 SoC DDR SDRAM memory controller

Static memory interfaces

The static memory controller (SMC) is based on ARM’s PL353 IP. It can interface to NAND flash, SRAM, or NOR flash memories. It can be configured through an APB interface via its operational registers. The SMC supports the following external static memories:

  • 64 MB of SRAM in 8-bit width
  • 64 MB of parallel NOR flash in 8-bit width
  • NAND flash

The following diagram provides a micro-architectural view of the Zynq-7000 SoC SMC:

Figure 1.8 – Zynq-7000 SoC static memory controller architecture

QSPI flash controller

The IOP block of the Zynq-7000 SoC includes a QSPI flash interface. It supports serial flash memory devices, as well as three modes of operation: linear addressing mode, I/O mode, and legacy SPI mode.

The software implements the flash device protocol in I/O mode. It provides the commands and data to the controller using the interface registers and reads the received data from the flash memory via the flash registers.

In linear addressing mode, the controller maps the flash address space onto the AXI address space and acts as a translation block between them. Requests that are received on the AXI port of the QSPI controller are converted into the necessary commands and data phases, while read data is put on the AXI bus when it’s received from the flash memory device.
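The translation performed in linear addressing mode can be sketched as follows. Two values here are assumptions that do not come from the text: the base address of the linear window (`0xFC000000`) and the classic SPI flash read opcode (`0x03`) with a 3-byte address. Check both against the TRM and the flash data sheet for a real design. The function names are hypothetical.

```python
QSPI_LINEAR_BASE = 0xFC000000  # assumption: base of the linear AXI window
READ_OPCODE = 0x03             # assumption: basic SPI flash READ, 3-byte address


def axi_to_flash_offset(axi_addr: int, flash_size_bytes: int) -> int:
    """Translate an AXI address in the linear window to a flash offset."""
    offset = axi_addr - QSPI_LINEAR_BASE
    if not 0 <= offset < flash_size_bytes:
        raise ValueError("address is outside the mapped flash range")
    return offset


def read_command(axi_addr: int, flash_size_bytes: int) -> bytes:
    """Build the command phase that a read at axi_addr would require."""
    if flash_size_bytes > 1 << 24:
        raise ValueError("a 3-byte address covers at most 16 MB")
    offset = axi_to_flash_offset(axi_addr, flash_size_bytes)
    # Opcode followed by the 24-bit flash address, most significant byte first.
    return bytes([READ_OPCODE]) + offset.to_bytes(3, "big")


if __name__ == "__main__":
    flash_size = 16 * 1024 * 1024
    print(read_command(0xFC001000, flash_size).hex())
```

In hardware, the controller performs this conversion transparently, which is why the flash contents can be read with ordinary load instructions in this mode.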

In legacy mode, the QSPI interface behaves just like an ordinary SPI controller.

To write software drivers for a flash device controlled through the Zynq-7000 SoC QSPI controller, refer to both the flash vendor’s data sheet and the QSPI controller operational mode settings detailed in the Zynq-7000 TRM. The URL for the TRM was given at the beginning of this section.

The QSPI controller supports multiple flash device arrangements, such as 8-bit access using two parallel devices (to double the device throughput) or a 4-bit dual-rank arrangement (to increase the memory capacity).

Zynq-7000 I/O peripherals block

The IOP block contains the external communication interfaces and includes two tri-mode (10/100/1000 Mb/s) Ethernet MACs, two USB 2.0 OTG peripherals, two full CAN bus interfaces, two SDIO controllers, two full-duplex SPI ports, two high-speed UARTs, and two I2C interfaces that support master and slave modes. It also includes four 32-bit GPIO banks. The IOP can interface externally through 54 flexible multiplexed I/Os (MIOs).

Zynq-7000 SoC interconnect

The interconnect is based on ARM AMBA AXI and supports QoS. It groups masters and slaves from the PS and extends the connectivity to PL-implemented masters and slaves. Multiple outstanding transactions are supported. Through the Cortex-A9 ACP, I/O coherency is possible so that external masters and the CPU cores can coherently share data, minimizing the cache management operations that the CPU cores must perform. The interconnect topology is formed by many switches based on the ARM NIC-301 interconnect and AMBA 3 ports. The following diagram provides an overview of the Zynq-7000 SoC interconnect:

Figure 1.9 – Zynq-7000 SoC interconnect topology
