An introduction to heterogeneous multicore processing architecture

Why would you need a heterogeneous multicore processing architecture?

Today, IoT devices are expected to combine high-performance processing with low power consumption, offer a high degree of functional integration, and perform complex operations. These products target a growing market of connected and portable devices that run user-friendly applications. To take advantage of these devices, some type of operating system (OS) is needed to run the applications through which the end user interacts with the device. A processor with a heterogeneous multicore processing architecture (several CPU cores plus a special-purpose processor) offers more flexibility when working with advanced embedded systems.

The NXP i.MX 7 processor, for example, includes an ARM Cortex-A7 (one or two cores) plus one ARM Cortex-M4 core, which makes it possible to run an OS like Linux on the Cortex-A7 and a real-time OS like FreeRTOS on the Cortex-M4. Linux ties together the different features the device requires, while FreeRTOS takes care of the real-time capabilities.

This article gives an overview of how to take advantage of a heterogeneous multicore processing architecture (i.MX 7), how communication between the different cores can be managed, and which components are involved.

Take Advantage of the Heterogeneous Multicore Unit for Your Embedded Software Development

In the i.MX 7, the Cortex-A7 and Cortex-M4 are attached to the same interconnect, which gives both of them the same access to all the peripherals. Sharing resources in this way can affect the functionality or performance of the system.

Malfunctions or degraded performance will appear if the software does not use the mechanisms the SoC already provides to coordinate the different domains.

The following diagram shows the interconnection between A7 and M4, plus extra components.

To get the most out of the heterogeneous multicore processing architecture, the software running on it must use the SoC's multicore support to ensure safe access to peripherals and memory and to restrict that access where needed. The multicore support includes the Resource Domain Controller (RDC), the Messaging Unit (MU), and hardware semaphores. These three components are in place to guarantee successful communication between the different cores.

The RDC isolates destination memory-mapped locations, such as peripherals and memory, to a single core, a bus master, or a set of cores and bus masters, which supports robust and secure operation of the chip. It does so by assigning cores, bus masters, peripherals, and memory regions to domain identifiers. Accesses can then be monitored and restricted based on those domain identifiers.
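
The domain-identifier idea can be illustrated with a small software model. This is only a sketch and not the RDC register interface: the type and function names, the four-domain limit, and the permission encoding are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_DOMAINS 4u  /* assumption for this model */

/* Per-domain permission bits (hypothetical encoding). */
enum { PERM_NONE = 0u, PERM_READ = 1u, PERM_WRITE = 2u };

/* One protected resource (a peripheral or a memory region) with a
 * permission entry for each domain identifier. */
typedef struct {
    const char *name;
    uint8_t perm[NUM_DOMAINS];
} resource_t;

/* Grant or restrict access to the resource for one domain. */
static void resource_set_perm(resource_t *res, unsigned domain, uint8_t perm)
{
    assert(domain < NUM_DOMAINS);
    res->perm[domain] = perm;
}

/* Return true only if the domain holds every requested permission bit. */
static bool resource_access_ok(const resource_t *res, unsigned domain,
                               uint8_t wanted)
{
    assert(domain < NUM_DOMAINS);
    return (res->perm[domain] & wanted) == wanted;
}
```

In this model, a peripheral reserved for the Cortex-M4 would carry read and write permission only in the entry of the domain the M4 is assigned to, so an access check from any other domain fails.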

The MU enables two processors within the SoC to communicate and coordinate by passing messages through the MU interface. It also provides the ability for one processor to signal the other processor using interrupts.
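
As a rough illustration of this message-plus-interrupt pattern, the sketch below models one direction of such a unit in plain C. It is a software model under stated assumptions, not the MU register layout: `mailbox_t`, `mailbox_send`, and `mailbox_receive` are hypothetical names, and a boolean flag stands in for the interrupt line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Software model of one direction of a messaging unit: a single
 * message slot plus a flag standing in for the interrupt line. */
typedef struct {
    uint32_t data;
    bool full;         /* slot holds an unread message */
    bool irq_pending;  /* receiver has been signalled */
} mailbox_t;

/* Sender side: post a message and signal the receiver.
 * Fails if the previous message has not been read yet. */
static bool mailbox_send(mailbox_t *mb, uint32_t msg)
{
    assert(mb != NULL);
    if (mb->full) {
        return false;
    }
    mb->data = msg;
    mb->full = true;
    mb->irq_pending = true;
    return true;
}

/* Receiver side: fetch the message and clear the signal.
 * Fails if there is nothing to read. */
static bool mailbox_receive(mailbox_t *mb, uint32_t *out)
{
    assert(mb != NULL && out != NULL);
    if (!mb->full) {
        return false;
    }
    *out = mb->data;
    mb->full = false;
    mb->irq_pending = false;
    return true;
}
```

A real system would use one such channel per direction, with the receiving core reading the message from its interrupt handler.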

The semaphore module implements hardware-enforced semaphores in the form of 16 hardware-enforced gates with the following features:

The hardware gates appear as a 16-entry byte-size array with read and write accesses
Optional interrupt notification after a failed lock write provides a mechanism to indicate when the gate is locked
Secure reset mechanisms are supported to clear the contents of individual semaphore gates or notification logic, as well as clear_all capability
The programming model allocates memory space to support up to 8 processors and up to 64 gates.
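
The gate behaviour can be sketched as a software model of the 16-entry byte array. The encoding used here (0 for unlocked, processor number plus one for the owner) and the function names are assumptions for illustration; on real hardware the module itself enforces these rules, and the gates are reached through memory-mapped registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_GATES     16u
#define GATE_UNLOCKED 0u

/* Model of the gate array: one byte per gate. */
typedef struct {
    uint8_t gate[NUM_GATES];
} sema_t;

/* Try to lock a gate for the given processor. The lock write only
 * takes effect if the gate is currently unlocked; the caller learns
 * the outcome by reading the gate back. */
static bool gate_try_lock(sema_t *s, unsigned g, unsigned cpu)
{
    assert(g < NUM_GATES);
    uint8_t owner = (uint8_t)(cpu + 1u);  /* assumed owner encoding */
    if (s->gate[g] == GATE_UNLOCKED) {
        s->gate[g] = owner;
    }
    return s->gate[g] == owner;
}

/* Unlock a gate. Only the processor that holds the gate may do so. */
static bool gate_unlock(sema_t *s, unsigned g, unsigned cpu)
{
    assert(g < NUM_GATES);
    if (s->gate[g] != (uint8_t)(cpu + 1u)) {
        return false;
    }
    s->gate[g] = GATE_UNLOCKED;
    return true;
}
```

A core would typically take the gate associated with a shared peripheral, access the peripheral, and then release the gate so the other core can proceed.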

All these components require support from the software running on the different cores. Linux and FreeRTOS already provide such support, and this article briefly lists how to work with it so that the cores collaborate successfully. The following section covers what is already available for the different components inside the multicore unit.

How do cores interact in heterogeneous multicore processing architecture?

As mentioned previously, the NXP i.MX 7 processor pairs an ARM Cortex-A7 with an ARM Cortex-M4. This makes it possible to run an OS on the A7 and an RTOS on the M4. This section describes how communication between these two cores is set up and how to use it to access resources dedicated only to the M4 core.

The interaction between the cores is based on the following details:

Cortex-A7 will run Linux and Cortex-M4 will run FreeRTOS
Cortex-A7 will boot first, start the clocks, write the firmware address information in the Cortex-M4 bootROM, and release the Cortex-M4 from reset. Cortex-A7 is the master and Cortex-M4 is the slave
Cortex-A7 and Cortex-M4 can access the same peripherals and RDC will be used to ensure safe access to the shared resources
To allow communication between cores, Remote Processor Messaging (RPMSG) is available
- OpenAMP framework is used by FreeRTOS for this purpose and there is an API to create, delete, read, and write an RPMSG channel
- Linux kernel also includes support for this feature and an API is also available to create, delete, read, and write an RPMSG channel
Enable HW semaphore to ensure exclusive access to peripherals
- FreeRTOS already supports this and provides an API to lock/unlock the gate for a specific peripheral
- Some modifications are needed on the Linux side to integrate the access control
Enable MU to control power states between cores
- FreeRTOS and Linux make use of RPMSG and MU to check for the different power states
- Cortex-M4 shares its status (RUN, WAIT, STOP) with Cortex-A7
- Based on those values Cortex-A7 can go into Deep Sleep mode
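
The power-state handshake in the last item can be sketched as follows. The three states come from the list above, but the policy shown (the A7 may enter Deep Sleep only while the M4 does not report RUN) is an assumption made for illustration, as are all type and function names; the actual rule is defined by the Linux and FreeRTOS support packages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* States the Cortex-M4 reports to the Cortex-A7. */
typedef enum { M4_RUN, M4_WAIT, M4_STOP } m4_state_t;

/* A7-side record of the most recent state reported by the M4. */
typedef struct {
    m4_state_t m4_state;
} power_tracker_t;

/* Called when a status message from the M4 arrives. */
static void power_tracker_update(power_tracker_t *t, m4_state_t reported)
{
    assert(t != NULL);
    t->m4_state = reported;
}

/* Assumed policy: Deep Sleep is blocked while the M4 is in RUN. */
static bool a7_may_deep_sleep(const power_tracker_t *t)
{
    assert(t != NULL);
    return t->m4_state != M4_RUN;
}
```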

The purpose of this article is to show that software solutions for a heterogeneous multicore processing architecture are already available, and that most of them work out of the box. Some minor changes or adaptations may be necessary, but in most cases everything already works. These software solutions (on Linux and FreeRTOS) include many examples that can help in developing a custom configuration based on the requirements of each project, whether it is for connected, standalone, or IoT devices. Knowing what is already available also helps to take advantage of current SoC architectures and get the most out of them: high performance, a high degree of functional integration, and the ability to perform complex operations on the device that uses that particular chip.

If you need more help with your custom embedded software development, get in touch today to speak with one of our experts about your project.