As VLSI technology advances, transistor counts and circuit complexity increase, which raises power consumption. With the growth of wearable electronics and other battery-powered portable devices, there is a need for digital integrated circuits that deliver high processing speed while consuming little power. Approximate computing is emerging as a design methodology that achieves low power consumption by trading computational accuracy for hardware performance. Many applications, including image processing and machine learning, are computationally intensive yet tolerant of error. The arithmetic circuits in these applications play a major role in determining computational speed and energy consumption. Among the four basic arithmetic operations, division is the least used because of its high latency. An approximate divider addresses both power and latency by simplifying the hardware structure.
The proposed approximate divider aims to reduce the critical path delay of the exact restoring divider, which is 𝑂(𝑛²). By introducing a small error into the truth table of the subtractor cell, the complexity of the divider cell is reduced, which lowers the critical path delay, area, and power of the approximate restoring divider compared to its exact counterpart. The accuracy of the restoring division can be varied through the approximation factor 𝑝, which determines how many exact divider cells are replaced by approximate divider cells. In the proposed approximate restoring divider, 𝑝(𝑝+1)/2 exact restoring divider cells are replaced by approximate restoring divider cells in a triangle replacement fashion, as shown in Fig. 1.
Fig. 1: An example of the proposed 8/4 approximate restoring divider
Fig. 2: Exact divider cell
Fig. 3: (a) AD1 cell, (b) AD2 cell, (c) AD3 cell, (d) AD4 cell
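As a point of reference for the cell-level designs, the behaviour of the exact restoring divider can be modelled in software. The following is a minimal Python sketch, not the authors' Verilog description: the function name `restoring_divide` is hypothetical, the model operates on whole partial remainders rather than on individual subtractor cells, and it assumes a 2n-bit dividend and an n-bit divisor whose quotient fits in n bits.

```python
def restoring_divide(dividend, divisor, n=8):
    """Exact restoring division of a 2n-bit dividend by an n-bit divisor.

    Behavioural model only (an assumption of this sketch): each loop
    iteration corresponds to one row of the array divider.
    """
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    if (dividend >> n) >= divisor:
        # The quotient would need more than n bits.
        raise OverflowError("quotient does not fit in n bits")

    remainder = dividend >> n  # upper n bits form the first partial remainder
    quotient = 0
    for i in range(n - 1, -1, -1):
        # Shift in the next lower dividend bit.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        trial = remainder - divisor
        if trial >= 0:
            # Subtraction succeeded: keep the difference, quotient bit = 1.
            remainder = trial
            quotient = (quotient << 1) | 1
        else:
            # Subtraction failed: restore the old remainder, quotient bit = 0.
            quotient = quotient << 1
    return quotient, remainder


q, r = restoring_divide(100, 7)  # same result as divmod(100, 7)
```

The approximate designs replace some of the subtract-and-restore steps with simplified cells, so a model like this one is only useful as the accurate reference against which approximate outputs are compared.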
The error characteristics of the approximate circuits are evaluated using two widely used metrics, NMED and MRED. The error distance (ED) is the difference between the accurate output and the approximate output. The Normalized Mean Error Distance (NMED) is the mean of the ED normalized by the maximum possible output of the accurate design. The Mean Relative Error Distance (MRED) is the mean of the ratio between the error distance and the accurate output. A Python script generates all input combinations for which the accurate Q and R outputs of the 16-by-8 divider do not overflow. Q-NMED and R-NMED are listed in Table 1. The accuracy evaluation of AD2 shows that the quotient NMED is low for smaller values of 𝑝. For 𝑝 = 2, 3, 4, the Q-NMED and R-NMED of AD4 are an order of magnitude smaller than those of the other designs. Although the R-NMED values of AD2 and AD4 are higher, they remain competitive for approximation factors 𝑝 = 2, 3. Notably, AD4 with approximation factor 𝑝 = 14 has a smaller Q-NMED than any other design. Overall, AD4 has the lowest Q-NMED of all the designs and can be used in error-tolerant applications with improved hardware performance.
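The metric definitions above can be sketched in Python as follows. This is an illustrative reconstruction, not the authors' script: the names `valid_inputs`, `quotient_error_metrics` and `truncating_divide` are hypothetical, the error distance is taken as an absolute value (the common convention, assumed here), inputs with an accurate quotient of zero are skipped for MRED, and `truncating_divide` is only a stand-in for an approximate divider, not one of AD1 to AD4. The example uses a small 8-by-4 divider so that exhaustive enumeration stays fast.

```python
def valid_inputs(n):
    """Yield every (dividend, divisor) pair of a 2n-by-n divider
    whose exact quotient fits in n bits (no overflow)."""
    for divisor in range(1, 1 << n):
        # quotient < 2**n  <=>  dividend < divisor * 2**n
        for dividend in range(divisor << n):
            yield dividend, divisor


def quotient_error_metrics(approx_divide, n):
    """Return (NMED, MRED) of the quotient produced by approx_divide."""
    max_quotient = (1 << n) - 1  # maximum output of the accurate design
    ed_sum = 0
    red_sum = 0.0
    count = 0
    red_count = 0
    for dividend, divisor in valid_inputs(n):
        exact_q = dividend // divisor
        ed = abs(exact_q - approx_divide(dividend, divisor))
        ed_sum += ed
        count += 1
        if exact_q != 0:
            # Assumption: the relative error is undefined for a zero
            # accurate output, so those inputs are left out of MRED.
            red_sum += ed / exact_q
            red_count += 1
    nmed = ed_sum / count / max_quotient
    mred = red_sum / red_count
    return nmed, mred


def truncating_divide(dividend, divisor):
    """Hypothetical stand-in: ignores the two low dividend bits."""
    return (dividend & ~0b11) // divisor


nmed, mred = quotient_error_metrics(truncating_divide, n=4)
```

The remainder metrics would follow the same pattern with the remainder in place of the quotient and the maximum accurate remainder as the normalization constant.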
Table 1: Error analysis summary table
Design | 𝑝 (approximation factor) | Q-NMED | R-NMED |
AD2 | 2 | 0.003257 | 0.010018 |
AD2 | 4 | 0.004718 | 0.032696 |
AD2 | 6 | 0.006276 | 0.084553 |
AD2 | 8 | 0.007837 | 0.172986 |
AD2 | 10 | 0.009251 | 0.244578 |
AD2 | 12 | 0.010339 | 0.275009 |
AD2 | 14 | 0.011036 | 0.373132 |
AD4 | 2 | 0.003431 | 0.006828 |
AD4 | 4 | 0.004411 | 0.017357 |
AD4 | 6 | 0.005070 | 0.038892 |
AD4 | 8 | 0.060784 | -0.668627 |
AD4 | 10 | 0.005908 | 0.151779 |
AD4 | 12 | 0.006217 | 0.246451 |
AD4 | 14 | 0.006493 | 0.333986 |
The proposed designs are described at the structural level in Verilog HDL. Simulation and functional verification were carried out with Icarus Verilog (iverilog) and GTKWave. All divider designs were synthesized using Cadence Genus® with a 180 nm CMOS technology, and the critical path delay, area and power consumption were measured. Table 2 reports the area, power and delay of the exact divider and the proposed dividers. The results show that the proposed dividers offer a significant performance improvement over the exact divider. With an approximation factor of 14, the proposed dividers AD1, AD2, AD3 and AD4 reduce power by 32%, 56%, 45% and 62%, respectively, compared to the exact design.
Table 2: Area, Power, Timing summary
Design | 𝑝 (approximation factor) | Area (μm²) | Power (μW) | Delay (ps) |
Exact design | - | 7344.691 | 973479.874 | 14776 |
AD1 | 2 | 7244.899 | 968870.432 | 14606 |
AD1 | 4 | 7138.454 | 933106.425 | 14326 |
AD1 | 6 | 6995.419 | 909329.592 | 14067 |
AD1 | 8 | 6815.794 | 857933.356 | 13928 |
AD1 | 10 | 6599.578 | 803143.234 | 13797 |
AD1 | 12 | 6346.771 | 736370.972 | 13692 |
AD1 | 14 | 6057.374 | 653830.036 | 13559 |
AD2 | 2 | 7101.864 | 951304.636 | 14613 |
AD2 | 4 | 6845.731 | 907802.245 | 14288 |
AD2 | 6 | 6469.848 | 851974.922 | 13699 |
AD2 | 8 | 6030.763 | 773073.749 | 13067 |
AD2 | 10 | 5478.581 | 673572.388 | 12190 |
AD2 | 12 | 4846.565 | 559122.286 | 11074 |
AD2 | 14 | 4084.819 | 424981.817 | 9634 |
AD3 | 2 | 7224.941 | 949250.404 | 14539 |
AD3 | 4 | 7105.190 | 916942.588 | 14144 |
AD3 | 6 | 6945.523 | 879686.099 | 13645 |
AD3 | 8 | 6745.939 | 813348.273 | 13068 |
AD3 | 10 | 6506.438 | 746872.264 | 12414 |
AD3 | 12 | 6227.021 | 671881.874 | 11847 |
AD3 | 14 | 5907.686 | 532689.572 | 11123 |
AD4 | 2 | 7005.398 | 931072.581 | 14455 |
AD4 | 4 | 6666.106 | 871542.376 | 13977 |
AD4 | 6 | 6213.715 | 804422.092 | 13282 |
AD4 | 8 | 5648.227 | 714310.215 | 12369 |
AD4 | 10 | 4969.642 | 611192.220 | 11236 |
AD4 | 12 | 4177.958 | 495919.815 | 9984 |
AD4 | 14 | 3273.178 | 365209.936 | 8092 |
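The power savings quoted for an approximation factor of 14 can be reproduced from the Table 2 figures with a short calculation. The Python sketch below is only a worked check: the helper name `reduction_percent` and the dictionary layout are assumptions of this example, while the numbers are copied from Table 2.

```python
# Power of the exact design and of each proposed design at p = 14,
# copied from Table 2 (units as given in the table).
EXACT_POWER = 973479.874
POWER_P14 = {
    "AD1": 653830.036,
    "AD2": 424981.817,
    "AD3": 532689.572,
    "AD4": 365209.936,
}

# Critical path delay in ps, also from Table 2.
EXACT_DELAY_PS = 14776
AD4_P14_DELAY_PS = 8092


def reduction_percent(exact, approximate):
    """Relative saving of the approximate design, in percent."""
    return 100.0 * (exact - approximate) / exact


# Truncating to whole percent gives the figures quoted in the text.
savings = {name: int(reduction_percent(EXACT_POWER, power))
           for name, power in POWER_P14.items()}

# Absolute delay saving of AD4 at p = 14, in ps.
ad4_delay_saving_ps = EXACT_DELAY_PS - AD4_P14_DELAY_PS
```

Truncated to whole percent, the savings are 32, 56, 45 and 62 for AD1 to AD4, and the delay saving of AD4 at 𝑝 = 14 is 6684 ps.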
Image division is used to detect changes between two images or between frames of a video; the output of change detection highlights the difference between the two images. Change detection is widely used in applications such as CCTV surveillance, intrusion detection and remote sensing. The first image is multiplied by 64 to fit the 16-bit dividend, and the second image is fed directly as the 8-bit divisor. AD4 with approximation factor 𝑝 = 14 produces a PSNR of 38.14 dB, as shown in Fig. 4. The PSNR and SSIM of the proposed designs are compared in Table 3.
Fig. 4: Change detection between (a) Image-1 and (b) Image-2, using (c) the exact divider and (d) AD4 with 𝑝 = 14 (PSNR = 38.1402 dB, SSIM = 0.9716)
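The change detection flow described above (first image scaled by 64, second image as divisor) and the PSNR comparison can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `change_detection` and `psnr` are hypothetical, images are plain nested lists of 8-bit pixels, a zero divisor is replaced by 1, the quotient is saturated to 8 bits, and PSNR uses a peak value of 255. None of these handling choices is specified in the text.

```python
import math


def change_detection(image1, image2, divide=lambda a, b: a // b):
    """Divide image1 * 64 by image2 pixel by pixel.

    `divide` can be swapped for a model of an approximate divider.
    """
    result = []
    for row1, row2 in zip(image1, image2):
        out_row = []
        for p1, p2 in zip(row1, row2):
            dividend = p1 * 64      # at most 255 * 64 = 16320, fits in 16 bits
            divisor = max(p2, 1)    # assumption: avoid division by zero
            quotient = divide(dividend, divisor)
            out_row.append(min(quotient, 255))  # assumption: saturate to 8 bits
        result.append(out_row)
    return result


def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    errors = [(r - t) ** 2
              for row_r, row_t in zip(reference, test)
              for r, t in zip(row_r, row_t)]
    mse = sum(errors) / len(errors)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)


frame_a = [[10, 20], [30, 40]]
frame_b = [[10, 20], [30, 80]]
exact_output = change_detection(frame_a, frame_b)
```

In this setup, unchanged pixels map to 64 and changed pixels move away from 64. The PSNR of an approximate design would be obtained by passing a model of that design as `divide` and comparing its output with `exact_output`.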
Image division can also be used to remove an unwanted or dim background from an image. First, the input image is scaled to fit the 16-bit input range of the dividend. The background of the image is then estimated and fed to the divisor. The PSNR and SSIM values of the implemented designs for various approximation factors are summarized in Table 3. The images produced by AD4 for 𝑝 = 14 are shown in Fig. 5. AD4 produces the highest-quality output images, closely followed by AD2. As 𝑝 increases, PSNR and SSIM decrease because more cells are approximate, although SSIM does not fall below 0.9. Notably, AD4 with 𝑝 = 14 produces a PSNR of 42 dB, closely followed by AD-M1 with 38 dB for 𝑝 = 14.
Fig. 5: (a) Test image, (b) estimated background, (c) exact divider, (d) AD4 with 𝑝 = 14 (PSNR = 42.836 dB, SSIM = 0.98)
Table 3: PSNR/SSIM for change detection and background removal application
Design | 𝑝 (approximation factor) | Change detection (PSNR (dB) / SSIM) | Background removal (PSNR (dB) / SSIM) |
AD2 | 2 | 72.2714 / 0.9999 | 61.2598 / 0.9940 |
AD2 | 4 | 63.9770 / 0.9993 | 59.1972 / 0.9937 |
AD2 | 6 | 58.0331 / 0.9980 | 56.3781 / 0.9929 |
AD2 | 8 | 52.1977 / 0.9942 | 52.0704 / 0.9912 |
AD2 | 10 | 45.7021 / 0.9868 | 47.4060 / 0.9880 |
AD2 | 12 | 38.7986 / 0.9680 | 40.3287 / 0.9736 |
AD2 | 14 | 30.3745 / 0.9112 | 32.4449 / 0.9407 |
AD4 | 2 | 73.8317 / 0.9999 | 61.5735 / 0.9940 |
AD4 | 4 | 65.4778 / 0.9993 | 60.2104 / 0.9939 |
AD4 | 6 | 61.5869 / 0.9998 | 58.3884 / 0.9936 |
AD4 | 8 | 56.3593 / 0.9983 | 56.5074 / 0.9929 |
AD4 | 10 | 50.0007 / 0.9977 | 54.1545 / 0.9916 |
AD4 | 12 | 43.4651 / 0.9931 | 49.3753 / 0.9898 |
AD4 | 14 | 38.1402 / 0.9716 | 42.8361 / 0.9850 |
Four approximate divider cell designs, AD1, AD2, AD3 and AD4, are proposed and analyzed in terms of error metrics and hardware performance. AD1 and AD2 introduce approximation in the full subtractor cell, whereas AD3 and AD4 approximate the entire divider cell, which results in better hardware performance. AD4 with 𝑝 = 14 shows a power saving of 62% and a delay reduction of about 6.7 ns (14776 ps to 8092 ps) compared to the exact design. Reducing the number of exact divider cells reduces area and power consumption. The feasibility of employing the proposed dividers in real-time applications has been demonstrated with change detection and background removal.
A low-power, high-speed approximate divider based on the restoring array architecture has been designed. Approximation is realized by replacing the subtractor/divider cells with approximate subtractor/divider cells of reduced gate-level complexity. Four approximate divider architectures, namely AD1, AD2, AD3 and AD4, have been designed. The amount of approximation is scaled through the approximation factor, and the proposed dividers have been analyzed for different values of it. The simulation results show that the proposed dividers AD1, AD2, AD3 and AD4 with an approximation factor of 10 achieve power reductions of 17%, 30%, 23% and 37%, respectively, compared to the exact restoring divider. All simulations are carried out using GPDK 180 nm CMOS technology. Image processing applications such as change detection and background removal have been implemented with the designed dividers to show the feasibility of employing the approximate divider in real-time applications.