Motivation:

The Internet of Things (IoT) has become part of everyday life. The rapid growth in the number and diversity of IoT devices and their applications poses new challenges for circuit and system designers. One of the primary challenges for IoT end devices is reducing their energy consumption. The general-purpose commercial microcontrollers used in IoT devices are not power-efficient, which limits device lifetime. For this reason, considerable research has gone into reducing the energy cost of computing on IoT end devices. Approximate computing is one of the most promising approaches for applications that have an inherent tolerance for inaccuracy.

Problem Statement:

Millions of IoT end devices are expected to be connected to the Internet infrastructure in the near future. Relying on cloud computing alone may not be feasible for such vast numbers of devices in real time, since it is impractical to transmit all of their data over today's already overburdened Internet backbone. Edge AI, which brings intelligence to the edge nodes, has been proposed as a solution to these problems [1]. Such edge nodes process IoT data locally using machine learning (ML) algorithms. Because IoT end devices and edge nodes are often battery-powered, energy efficiency is a top priority for extending network lifetime.

Many software-level optimizations of ML algorithms have been proposed to reduce the power consumption of IoT edge devices [1]-[2]. At the hardware level, however, running IoT-optimized ML on commercial off-the-shelf (COTS) microcontrollers is power-inefficient. For example, using a general-purpose processor's floating-point unit for optimized edge ML applications, such as k-nearest neighbors (KNN) [3] and k-means (KM) [4], adds extra computational complexity. The algorithmic optimizations of ML should therefore be supported by specialized hardware accelerators.

Proposed Solution:

Considerable research has gone into reducing the energy cost of IoT edge node computing at the circuit, memory, and architecture levels. Approximate computing is one of the most promising approaches, as it reduces power usage at the cost of some accuracy [5]-[6]. Many applications show an inherent tolerance to inaccuracy, including machine learning, deep neural networks, image and video processing, and variable-accuracy architectures. Because it exploits the inherent error resilience of these modern application domains, approximate computing allows performance improvement without violating power density constraints.

In this work, we present an approximate processor for IoT devices that reduces power costs, especially for machine learning applications. Fig. 1 shows a general flow diagram of the proposed processor design. The work covers both circuit-level and architecture-level optimizations for the IoT processor. At the architecture level, we introduce an approximate data path that supports approximate operations. Multiplication and addition are the most resource-intensive and power-hungry operations in most machine learning algorithms, so we focus on functional units such as approximate multipliers and adders. The 32-bit RISC-V Instruction Set Architecture (ISA) is extended to activate the approximate operations. Note that the processor handles only basic logic and integer operations, which keeps it simple and suitable for IoT end devices.

Fig. 1. Edge-computing depiction: the machine learning (ML) system is split among layers, with the lowest levels implemented in the edge node and end device. The dashed rectangle at the edge node highlights the general flow diagram of the proposed processor design.
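
As a rough illustration of how an ISA extension can route operands to an approximate data path, the Python sketch below decodes a 32-bit R-type instruction word and dispatches either to an exact adder or to an approximate one. The encoding is an assumption made for illustration only: it places the approximate add in the RISC-V custom-0 opcode space, and the names `encode_rtype`, `loa_add`, and `execute` are hypothetical. The lower-part OR adder used here is a well-known approximate adder from the literature, chosen as a stand-in; it is not necessarily the adder used in the proposed processor.

```python
OP = 0b0110011       # standard RISC-V major opcode for register-register ALU ops
CUSTOM0 = 0b0001011  # RISC-V custom-0 major opcode (assumed home of the approximate ops)
MASK32 = 0xFFFFFFFF


def encode_rtype(funct7, rs2, rs1, funct3, rd, opcode):
    """Pack the fields of a 32-bit RISC-V R-type instruction word."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def loa_add(a, b, k=8):
    """Lower-part OR adder: OR the low k bits, add the remaining bits exactly."""
    low = (a | b) & ((1 << k) - 1)
    # Guess the carry into the exact part from the top bit of each low part.
    carry = (a >> (k - 1)) & (b >> (k - 1)) & 1
    high = ((a >> k) + (b >> k) + carry) << k
    return (high | low) & MASK32


def execute(instr, regs):
    """Execute one R-type add on a 32-entry register file of unsigned values."""
    opcode = instr & 0x7F
    rd = (instr >> 7) & 0x1F
    funct3 = (instr >> 12) & 0x7
    rs1 = (instr >> 15) & 0x1F
    rs2 = (instr >> 20) & 0x1F
    funct7 = (instr >> 25) & 0x7F
    a, b = regs[rs1], regs[rs2]
    if opcode == OP and funct3 == 0 and funct7 == 0:
        result = (a + b) & MASK32          # exact ADD
    elif opcode == CUSTOM0 and funct3 == 0 and funct7 == 0:
        result = loa_add(a, b)             # hypothetical approximate ADD
    else:
        raise ValueError(f"unsupported instruction {instr:#010x}")
    if rd != 0:                            # x0 is hard-wired to zero
        regs[rd] = result


regs = [0] * 32
regs[1], regs[2] = 0x1234, 0x00FF
execute(encode_rtype(0, 2, 1, 0, 3, OP), regs)       # x3 = x1 + x2, exact
execute(encode_rtype(0, 2, 1, 0, 4, CUSTOM0), regs)  # x4 = x1 + x2, approximate
print(hex(regs[3]), hex(regs[4]))
```

With these operands the exact add yields 0x1333, while the approximate add yields 0x12ff. Under this assumed encoding, software chooses exact or approximate arithmetic per instruction, so accuracy-critical code paths can stay exact.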

At the circuit level, a novel carry-aware approximate radix-4 Booth multiplier is used along with an approximate adder. As an example, the architecture of the proposed 6x6 approximate radix-4 multiplier is shown in Fig. 2. Unlike existing approximate radix-4 multipliers, the proposed multiplier calculates partial sums sequentially, which reduces area and power usage. Existing approximate multipliers also suffer from a high error rate: they apply approximations to the lower bits but do not account for the incorrect carry propagation into the higher bits that these approximations cause. As a result, their error rate is high and can sometimes grow uncontrollably. The proposed multiplier, by contrast, reduces the error rate through its carry-aware design. In summary, the proposed modifications improve area and power utilization and also significantly improve the error metrics.

Fig. 2. Proposed novel approximate design of the radix-4 Booth multiplier (6x6).
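
The effect of ignoring carries from the approximated lower bits can be seen in a small behavioural model. The Python sketch below is not the proposed circuit. It is a generic illustration with hypothetical function names, assuming a 6x6 signed radix-4 Booth multiplier in which the `k` low columns of every partial product are dropped. The `carry_aware` option adds a simple carry estimate taken from the highest dropped column, which is only one of many possible compensation schemes.

```python
def booth_radix4_digits(y, bits=6):
    """Recode a signed multiplier (even bit width) into radix-4 Booth digits in {-2,...,2}."""
    y &= (1 << bits) - 1              # two's-complement bit pattern of y
    digits, prev = [], 0              # prev models the implicit bit y[-1] = 0
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def partial_products(x, y, bits=6):
    """Shifted partial products whose sum is the exact product x * y."""
    return [(d * x) << (2 * i) for i, d in enumerate(booth_radix4_digits(y, bits))]


def booth_multiply(x, y, bits=6):
    """Exact radix-4 Booth multiplication."""
    return sum(partial_products(x, y, bits))


def approx_multiply(x, y, bits=6, k=4, carry_aware=True):
    """Drop the k low columns of every partial product; optionally estimate the lost carry."""
    high, top_column_ones = 0, 0
    for pp in partial_products(x, y, bits):
        high += pp >> k                          # keep only columns k and above
        top_column_ones += (pp >> (k - 1)) & 1   # ones in the highest dropped column
    if carry_aware:
        high += top_column_ones // 2             # two ones in column k-1 carry into column k
    return high << k


def mean_abs_error(bits=6, **kwargs):
    """Mean absolute error over all pairs of signed `bits`-wide operands."""
    lo, hi = -(1 << (bits - 1)), 1 << (bits - 1)
    errors = [abs(x * y - approx_multiply(x, y, bits, **kwargs))
              for x in range(lo, hi) for y in range(lo, hi)]
    return sum(errors) / len(errors)


print(mean_abs_error(carry_aware=False), mean_abs_error(carry_aware=True))
```

For x = 10 and y = 5 the exact product is 50. Dropping four columns without compensation gives 32, and the carry estimate brings the result to 48. Python integers stand in for the arithmetic value of each partial product here; a hardware implementation would form negative partial products by inversion plus a correction bit.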

Team Lead: Engr. Ali Sabir

Team Members:

References:

[1] Merenda M, Porcaro C, Iero D. Edge Machine Learning for AI-Enabled IoT Devices: A Review. Sensors. 2020;20(9):2533.

[2] Verhelst M, Murmann B. Machine learning at the edge. NANO-CHIPS, chapter 18, 2020, pp. 293-322.

[3] Viegas E, Santin AO, Franca A, Jasinski R, Pedroni VA, Oliveira LS. Towards an energy-efficient anomaly-based intrusion detection engine for embedded systems. IEEE Transactions on Computers. 2016 Apr 29;66(1):163-77.

[4] Suárez JN, Salcedo A. ID3 and k-means based methodology for Internet of Things device classification. In: 2017 International Conference on Mechatronics, Electronics and Automotive Engineering (ICMEAE). IEEE; 2017, pp. 129-133.

[5] Ometov A, Nurmi J. Towards Approximate Computing for Achieving Energy vs. Accuracy Trade-offs. Target.;10:18.

[6] Le KH, Le-Minh KH, Thai HT. BrainyEdge: An AI-enabled framework for IoT edge computing. ICT Express. 2021 Dec 31.

Owner

Ali Sabir

Organization URL

http://isb.nu.edu.pk/rfcs2/

Description

A low-power approximate processor for IoT intelligent edge node applications. The proposed design not only improves on-chip silicon area and power consumption but also significantly improves the error metrics.

Git URL

https://github.com/Ali-Sabir2/Approximate_Processor-.git

Version

Final

Labels

chipIgnite

SSCS-22