An introduction to heterogeneous multicore processing architecture
Devices today are expected to combine high-performance processing with low power consumption. They are also expected to offer a high degree of functional integration and to perform complex operations. These products target a growing market of connected and portable devices. To take advantage of such devices, an Operating System (OS) is needed to run the applications through which the end user interacts with the device. A processor with a heterogeneous multicore processing architecture (several CPU cores plus a special-purpose processor) provides more flexibility when working with advanced embedded systems.
The i.MX7 processor, for example, includes an ARM Cortex-A7 (one or two cores) plus one ARM Cortex-M4 core, which makes it possible to run an OS like Linux on the Cortex-A7 and a real-time OS like FreeRTOS on the Cortex-M4. Linux ties together the different features the device requires, and FreeRTOS takes care of the real-time requirements.
This article gives an overview of how to take advantage of a heterogeneous architecture (using the i.MX7 as the example), how communication between the different modules can be managed, and which components are involved.
Taking advantage of the heterogeneous multicore unit
In the i.MX7, the Cortex-A7 and Cortex-M4 are attached to the same interconnect, which gives both of them the same access to all the peripherals. Sharing resources in this way can affect the functionality or performance of the system.
Malfunctions or performance degradation occur when the software does not use the mechanisms the SoC already provides for careful collaboration between the different domains.
The following diagram shows the interconnection between A7 and M4, plus extra components.
To get the best out of the heterogeneous architecture, the software running on it must use the multicore support to ensure safe access and to enforce access restrictions on peripherals and memory. The multicore support consists of the Resource Domain Controller (RDC), the Messaging Unit (MU) and hardware semaphores. Together, these three components enable reliable communication between the different cores.
The RDC provides robust support for isolating destination memory-mapped locations, such as peripherals and memory, to a single core, a bus master, or a set of cores and bus masters. This supports robust and secure operation of the chip. It does so by assigning cores, bus masters, peripherals and memory regions to domain identifiers; accesses can then be monitored and restricted based on those identifiers.
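The domain-assignment idea can be illustrated with a small behavioural model. This is only a sketch in Python that runs on a host PC, not driver code: the class name, the domain numbers, the resource names and the permission layout are all assumptions made for illustration and do not mirror the actual RDC register map.

```python
from dataclasses import dataclass, field

READ = 0b01
WRITE = 0b10


@dataclass
class ResourceDomainModel:
    """Toy model of domain-based access control (not the real RDC registers)."""
    # bus master name -> domain identifier
    master_domain: dict = field(default_factory=dict)
    # resource name -> {domain identifier: permission bits}
    permissions: dict = field(default_factory=dict)

    def assign_master(self, master, domain):
        self.master_domain[master] = domain

    def set_permission(self, resource, domain, bits):
        self.permissions.setdefault(resource, {})[domain] = bits

    def is_allowed(self, master, resource, access):
        """Return True if the master's domain holds all requested access bits."""
        domain = self.master_domain.get(master)
        if domain is None:
            return False  # unassigned masters get no access in this model
        granted = self.permissions.get(resource, {}).get(domain, 0)
        return (granted & access) == access


rdc = ResourceDomainModel()
rdc.assign_master("cortex_a7", 0)  # hypothetical: Linux side in domain 0
rdc.assign_master("cortex_m4", 1)  # hypothetical: FreeRTOS side in domain 1

# Hypothetical policy: one UART is reserved for the M4,
# and a shared memory region is read-only for the M4.
rdc.set_permission("uart_b", 1, READ | WRITE)
rdc.set_permission("shared_ram", 0, READ | WRITE)
rdc.set_permission("shared_ram", 1, READ)

print(rdc.is_allowed("cortex_m4", "uart_b", WRITE))      # M4 owns the UART
print(rdc.is_allowed("cortex_a7", "uart_b", READ))       # A7 is locked out
print(rdc.is_allowed("cortex_m4", "shared_ram", WRITE))  # read-only for M4
```

The point of the model is that the access decision depends only on the domain identifier of the requester, not on which software happens to be running on it.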
The MU enables two processors within the SoC to communicate and coordinate by passing messages through the MU interface. It also provides the ability for one processor to signal the other processor using interrupts.
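As a rough illustration of message passing with interrupt-style signalling, the following Python sketch models a two-sided mailbox. It is a simplified assumption, not the MU programming model: the side names "A" and "B", the single pending word per direction, and all method names are hypothetical.

```python
class MessagingUnitModel:
    """Toy model of a two-sided mailbox with interrupt-style notification."""

    def __init__(self):
        self._mailbox = {"A": None, "B": None}   # one pending word per receiver
        self._handlers = {"A": None, "B": None}  # optional receive "interrupt" handlers

    @staticmethod
    def _other(side):
        return "B" if side == "A" else "A"

    def on_receive(self, side, handler):
        """Register a callback that stands in for the receive interrupt."""
        self._handlers[side] = handler

    def try_send(self, sender, word):
        """Post a word to the other side; return False if its mailbox is still full."""
        receiver = self._other(sender)
        if self._mailbox[receiver] is not None:
            return False
        self._mailbox[receiver] = word
        handler = self._handlers[receiver]
        if handler is not None:
            handler(self)  # signal the receiving processor
        return True

    def receive(self, side):
        """Read and clear the pending word for this side (None if empty)."""
        word, self._mailbox[side] = self._mailbox[side], None
        return word


mu = MessagingUnitModel()
received_by_b = []

# Side B reads the word from inside its "interrupt handler".
mu.on_receive("B", lambda unit: received_by_b.append(unit.receive("B")))

sent = mu.try_send("A", 0x1234)
print(sent, received_by_b)
```

Side A has no handler registered in this example, so a word sent from B to A stays pending until A polls it with `receive("A")`; a second send before that returns False.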
The semaphore module implements hardware-enforced semaphores. It provides 16 hardware-enforced gates with the following features:
- The hardware gates appear as a 16-entry byte-size array with read and write accesses
- Optional interrupt notification after a failed lock write provides a mechanism to indicate when the gate is locked
- Secure reset mechanisms are supported to clear the contents of individual semaphore gates or notification logic, as well as a clear_all capability
- The programming model allocates memory space to support up to 8 processors and up to 64 gates
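The gate behaviour listed above can be sketched as a small Python model. The encoding used here (0 for unlocked, processor number plus one for locked) and all names are assumptions for illustration only; the real gate encoding and reset sequence should be taken from the SoC reference manual.

```python
NUM_GATES = 16
UNLOCKED = 0


class SemaphoreGatesModel:
    """Toy model of a 16-entry byte-size array of hardware-enforced gates."""

    def __init__(self):
        self.gates = bytearray(NUM_GATES)  # all gates start unlocked
        self.failed_lock_log = []          # stands in for the failed-lock notification

    def try_lock(self, gate, cpu):
        """Attempt a lock write, then read back to see who owns the gate."""
        if self.gates[gate] == UNLOCKED:
            self.gates[gate] = cpu + 1     # assumed encoding: owner = cpu + 1
        owned = self.gates[gate] == cpu + 1
        if not owned:
            self.failed_lock_log.append((gate, cpu))
        return owned

    def unlock(self, gate, cpu):
        """Only the owning processor can unlock a gate."""
        if self.gates[gate] == cpu + 1:
            self.gates[gate] = UNLOCKED
            return True
        return False

    def reset_gate(self, gate):
        self.gates[gate] = UNLOCKED

    def reset_all(self):
        """Model of the clear_all capability: clears gates and notifications."""
        self.gates = bytearray(NUM_GATES)
        self.failed_lock_log.clear()


sema = SemaphoreGatesModel()
A7, M4 = 0, 1   # hypothetical processor numbers
UART_GATE = 3   # hypothetical gate chosen to guard one peripheral

if sema.try_lock(UART_GATE, M4):
    # The M4 would use the peripheral exclusively here.
    a7_got_it = sema.try_lock(UART_GATE, A7)  # fails while the M4 holds the gate
    sema.unlock(UART_GATE, M4)

print(a7_got_it, sema.failed_lock_log)
```

Which gate guards which peripheral is purely a software convention in this model: both cores have to agree on the mapping for the lock to mean anything.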
All these components require support from the software running on the different cores. Linux and FreeRTOS already provide such support, and this document briefly describes how to work with it so that the cores collaborate successfully. The following section covers what is already available for the different components inside the multicore unit.
How the cores interact in a heterogeneous architecture
As mentioned previously, the i.MX7 processor contains an ARM Cortex-A7 (one or two cores) and one ARM Cortex-M4 core. This makes it possible to run a general-purpose OS on the A7 and an RTOS on the M4. This section describes how communication between the two is set up and how to use it to access resources dedicated only to the M4 core.
The interaction between the cores is based on the following details:
- Cortex-A7 runs Linux and Cortex-M4 runs FreeRTOS
- Cortex-A7 boots first, starts the clocks, writes the firmware address information in the Cortex-M4 bootROM and takes the Cortex-M4 out of reset. Cortex-A7 is the master and Cortex-M4 is the slave
- Cortex-A7 and Cortex-M4 can access the same peripherals, and the RDC is used to ensure safe access to the shared resources
- Remote Processor Messaging (RPMSG) is available for communication between the cores
  - FreeRTOS uses the OpenAMP framework for this purpose, which provides an API to create, delete, read and write an RPMSG channel
  - The Linux kernel also supports this feature and provides an API to create, delete, read and write an RPMSG channel
- HW semaphores are enabled to ensure exclusive access to peripherals
  - FreeRTOS already supports this and provides an API to lock/unlock the gate for a specific peripheral
  - Some modifications are necessary on the Linux side to integrate the access control
- The MU is enabled to control power states between the cores
  - FreeRTOS and Linux make use of RPMSG and the MU to check the different power states
  - Cortex-M4 shares its status (RUN, WAIT, STOP) with Cortex-A7
  - Based on those values, Cortex-A7 can go into Deep Sleep mode
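The power-state handshake in the last item can be sketched as follows. This Python model is an assumption-laden illustration: the class and method names are hypothetical, the status is passed as a plain string rather than over a real RPMSG/MU link, and the rule that only STOP permits Deep Sleep is an assumed policy, since the text above does not say which M4 states allow it.

```python
from enum import Enum


class M4State(Enum):
    RUN = "RUN"
    WAIT = "WAIT"
    STOP = "STOP"


class A7PowerPolicy:
    """Toy model of the A7 side deciding whether Deep Sleep is permitted."""

    def __init__(self, allow_deep_sleep_in=(M4State.STOP,)):
        # Assumed policy: Deep Sleep only when the M4 reports one of these states.
        self._allowed = frozenset(allow_deep_sleep_in)
        # Conservative default before the M4 has reported anything.
        self._last_reported = M4State.RUN

    def on_m4_status(self, message):
        """Handle a status message from the M4 (raises ValueError if unknown)."""
        self._last_reported = M4State(message)

    def can_enter_deep_sleep(self):
        return self._last_reported in self._allowed


policy = A7PowerPolicy()
history = []
for status in ("RUN", "WAIT", "STOP"):
    policy.on_m4_status(status)
    history.append((status, policy.can_enter_deep_sleep()))

print(history)
```

Passing a different `allow_deep_sleep_in` tuple changes the policy without touching the message handling, which keeps the assumed rule easy to replace once the actual behaviour is confirmed against the BSP documentation.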
The purpose of this document is to show that software solutions for a heterogeneous multicore processing architecture are already available and that most of them work out of the box. Some minor changes or adaptations may be necessary, but in most cases the existing support should already work. These software solutions (on Linux and FreeRTOS) include many examples that can help to develop a custom configuration based on the requirements of each project, whether it is for a connected, standalone or IoT device. Knowing what is already available also helps to get the most out of current SoC architectures: high performance, a high degree of functional integration, and the ability to perform complex operations on devices built around that particular chip.