## **Open** Access



Iraqi Journal of Industrial Research (IJOIR)

Journal homepage: http://ijoir.gov.iq



# A New Technology for Reducing Dynamic Power Consumption in 8-Bit ALU Design

<sup>1</sup>Ahmed Lateef Hameed, <sup>2</sup>Maan Hameed\*, <sup>3</sup>Raed Abdulkareem Hasan

<sup>1</sup>Diyala Education Directorate, Ministry of Education – Iraq

<sup>2</sup>State Commission for Reservoirs and Dams, Ministry of Water Resources – Iraq

<sup>3</sup>Unit of Renewable Energy Research, Northern Technical University – Iraq

#### **Article information**

Article history: Received: October, 09, 2022 Accepted: November, 19, 2022 Available online: December, 14, 2022

*Keywords*: Tri-state buffer, VLSI, Glitches, Hazards, Spartan

\*Corresponding Author: Maan Hameed maanhameid34@gmail.com

DOI: https://doi.org/10.53523/ijoirVol9I3ID279

#### Abstract

Clock gating is an effective way to decrease the power dissipated in synchronous designs. The most effective way to do this is to mask the clock that goes to the unused part of the design. In this paper, a comparative evaluation of power consumption under existing clock gating techniques in an Arithmetic Logic Unit (ALU) design is carried out. An innovative clock gating method is proposed that offers extra immunity against the issues present in existing mechanisms. The proposed gated clock generation circuit uses a tri-state buffer connection and a logic gate formed by a bubbled-input NAND gate. This design saves power even when the clock is applied to the target module. Complete power analysis reveals that the proposed technique reduces dynamic power and thereby decreases total power consumption by up to 24.90% relative to the traditional design. All experiments are done on an arithmetic logic unit design. A 130 nm standard logic library has been used to implement the ALU. The ALU architecture was developed in Verilog HDL, and the simulations were performed using ModelSim-Altera 10.0c (Quartus II 11.1) Starter Edition.

#### 1. Introduction

Low-power Arithmetic Logic Unit (ALU) design is a goal for all designers. Clock gating is one of the well-known low-power techniques that are highly successful in minimizing power consumption in digital designs. Under a condition determined by the clock gating circuit, clock gating deactivates or suppresses transitions on parts of the clock path, such as the flip-flops and the clock network [1]. In other words, the clock is disabled when it is not needed, which reduces power dissipation. Clock gating simply turns off the clock where power would otherwise be consumed unnecessarily. By following this procedure, power consumption can be reduced by up to half without affecting the design performance [2]. Otherwise, the chip requires sophisticated and expensive packaging and cooling arrangements to regulate temperature levels, which increases the cost of the system. The growing demand for portable communication equipment and computer systems has increased the need to optimize chip power dissipation. Overall, low-power design is a crucial technology in the semiconductor field today [3].

The introduction of integrated circuits (ICs), also known simply as chips or microchips, was accompanied by the need to test these devices. Small-Scale Integration (SSI) circuits, with a few logic transistors in the early 1960s, and Medium-Scale Integration (MSI) designs, with a larger number of logic transistors in the late 1960s, were comparatively simple to test. During the 1970s, Large-Scale Integration (LSI) designs, with many thousands of logic transistors, caused many problems in testing. Very-Large-Scale Integration (VLSI) architectures, with many thousands of logic transistors, were defined in the early 1980s [4]. Developments in VLSI technology have since led to architectures with several millions of logic transistors [5] [6].

The main goal of this research is to decrease design complexity by decreasing the number of registers, because each register uses a clock signal, and the clock signal consumes much power. Power consumption is therefore decreased by reducing the number of clock signals. In this research, the input signal is supplied to the NAND gate and the tri-state buffer. When the clock switches to 1 and En is 0, the design logic generates an output of 1. This output goes to the first clock generation stage, which generates the signal used for design control. Following this procedure saves much power.

#### 2. Theoretical Part

Clock gating is a common technique for decreasing dynamic power dissipation and is used in many real-time systems. It saves power by adding logic to a circuit in order to prune the clock tree [7]. Pruning the clock disables parts of the circuit so that the flip-flops in them do not have to change state. In digital architectures, it is an effective method of reducing dynamic power consumption. In a synchronous design such as a basic microprocessor, only part of the design operates at any time. Power dissipation can therefore be saved by turning off the inactive portion of the design. One way to achieve this is to mask the clock that goes to the inactive portion of the design [8]. Clock gating (CG) disables the clock of a design block by combining the clock with a gate control signal when the block is not required, which avoids the power consumed by unnecessary charging and discharging of the inactive block. In particular, clock gating targets the clock power dissipated in dynamic CMOS logic, which is used instead of static logic to gain speed and area. However, an efficient clock gating technique requires a methodology that specifies which design module is gated, when, and for how long.

In the clock gating technique, selected synchronous components of the design are disabled by removing the clock signal while they are in the inactive or sleep mode of operation [9]. The simplest clock gating approach uses a single AND gate with two inputs: the clock signal and the enable signal. This technique is not without drawbacks, however: it can lead to setup and hold time violations in the circuit, caused by improper alignment of the clock edges [10]. Another technique uses a flip-flop to synchronize the enable signal with the clock and reduce clock misalignment [11].
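The difference between the two conventional schemes described above can be illustrated with a small discrete-time model. The following Python sketch is only an illustration under stated assumptions, not the circuit evaluated in this paper: the sampled waveforms, the function names, and the choice of a negative-edge flip-flop for the synchronizing element are all hypothetical. It shows how an enable signal that changes while the clock is high truncates pulses in the AND-gated clock, while the flip-flop-synchronized version passes only full-width pulses.

```python
def and_gated_clock(clk, en):
    """Simple AND gating: the enable passes straight through to the gate."""
    return [c & e for c, e in zip(clk, en)]


def ff_gated_clock(clk, en):
    """Gating with the enable captured by a negative-edge flip-flop (assumed)."""
    gated = []
    en_q = 0          # flip-flop output, assumed to reset to 0
    prev_clk = 0
    prev_en = 0
    for c, e in zip(clk, en):
        if prev_clk == 1 and c == 0:   # falling clock edge
            en_q = prev_en             # capture the enable seen before the edge
        gated.append(c & en_q)
        prev_clk, prev_en = c, e
    return gated


def pulse_widths(signal):
    """Return the length of every run of consecutive 1s in the signal."""
    widths, run = [], 0
    for s in signal:
        if s:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


# Hypothetical waveforms: 4 samples per clock period, high for 2 samples.
clk = [0, 0, 1, 1] * 4
# The enable rises and falls in the middle of a high clock phase.
en = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

and_widths = pulse_widths(and_gated_clock(clk, en))   # [1, 2, 1]
ff_widths = pulse_widths(ff_gated_clock(clk, en))     # [2, 2]
print("AND gating pulse widths:", and_widths)
print("FF-synchronized pulse widths:", ff_widths)
```

In this model a width of 2 is a full clock pulse, so the width-1 entries in the AND-gated result are the truncated pulses that can cause timing violations in the gated registers.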

#### 3. Results and Discussion

In CMOS circuits, power consumption is of two kinds, dynamic power and static power, which together make up the total power in equation (1). Dynamic power comprises internal power and switching power. Switching power is caused by charging the load capacitance. Internal power is generated by charging internal capacitances and by short-circuit currents [12].

$$P_{total} = P_{dynamic} + P_{static} \tag{1}$$

Where: $P_{total}$: total power consumption, $P_{dynamic}$: dynamic power consumption, $P_{static}$: static power consumption

#### **Dynamic Power**

Dynamic power is of two kinds: internal power and switching power. Internal power is consumed by a cell when one of its inputs changes while the output does not. It arises from the short-circuit current that passes through the PMOS-NMOS (P-channel metal-oxide-semiconductor, N-channel metal-oxide-semiconductor) stack during a transition [13], and from charging internal capacitances [12].

#### **Switching Power**

Since current flows only during logic transitions at the gate, switching power dissipation depends on the frequency of the clock signal (possible transitions per second) and on the switching activity (the occurrence or absence of transitions at the gate in successive clock cycles), as given in equation (2) and shown in Figure (1). Switching power is caused by charging the load capacitance.

$$P_{sw} = \alpha C F V^2 \tag{2}$$

Where: $\alpha$: switching activity, *C*: capacitance, *F*: frequency and *V*: supply voltage



Figure (1). Switching power.
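Equation (2) can be evaluated numerically to see how each factor affects switching power. The Python sketch below is illustrative only: the function name and all numeric values (switching activity, capacitance, supply voltage) are hypothetical assumptions chosen for the example, not measurements from this work.

```python
def switching_power(alpha, capacitance, frequency, voltage):
    """Switching power in watts: P_sw = alpha * C * F * V^2 (equation 2)."""
    return alpha * capacitance * frequency * voltage ** 2


# Hypothetical example values, not taken from the paper.
ALPHA = 0.2          # switching activity (fraction of cycles with a transition)
CAP = 10e-12         # switched capacitance in farads
VDD = 1.2            # supply voltage in volts

for freq_mhz in (100, 200, 300, 500):
    p_mw = switching_power(ALPHA, CAP, freq_mhz * 1e6, VDD) * 1e3
    print(f"{freq_mhz} MHz: {p_mw:.3f} mW")

# Clock gating lowers the effective switching activity; power scales with it.
p_ungated = switching_power(0.20, CAP, 100e6, VDD)
p_gated = switching_power(0.15, CAP, 100e6, VDD)
saving_percent = (p_ungated - p_gated) / p_ungated * 100.0
print(f"Saving from alpha 0.20 -> 0.15: {saving_percent:.1f}%")
```

Because the expression is linear in the switching activity and the frequency but quadratic in the supply voltage, reducing unnecessary clock transitions lowers switching power in direct proportion without changing the operating voltage or frequency.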

#### Static (leakage) Power

Static power is the leakage power that transistors consume at all times; this consumption remains constant. Reverse-bias p-n junction diode leakage, sub-threshold leakage and gate leakage are the main sources of static power, as shown in Figure (2).



Figure (2). Static (leakage) Power.

### **Synopsys Design Compiler**

Design Compiler (DC) is the Synopsys synthesis tool. In simple terms, the tool takes a Register Transfer Level (RTL) description written in Verilog and a standard cell library as input, and produces a technology-dependent gate-level netlist [14]. Design Compiler needs technology libraries, DesignWare libraries, and symbol libraries to carry out synthesis. During synthesis, Design Compiler converts the RTL description into elements taken from the technology library and the DesignWare library. Internally, the synthesis tool performs several steps, which are listed in Figure (3).



Figure (3). Synopsys design compiler flow procedure.

#### **Synthesis Process**

The synthesis process optimizes the generic gate-level netlist produced by logic synthesis into a final netlist [15]. The essential operations are executed during synthesis [16] in three steps. The first step, mapping, uses sequential and combinational logic gates from the technology library to create a gate-level design that aims to meet the area and timing goals. The second step, delay optimization, fixes delay violations introduced in the mapping stage; it does not resolve design rule violations or meet area constraints. The third step, design rule fixing, corrects design rule violations by resizing existing cells or adding buffers.

#### A New Technology of Clock Gating

In this work, a new circuit was implemented that saves additional power. To achieve this goal, the new gated clock signal is produced using the tri-state buffer connection and the connected bubbled-input NAND gate, as shown in Figure (4). With this method, power is saved even when the clock of the target device is off, since in that case the clock of the controlling device is off as well [17]. The proposed design saves additional power by preventing switching of unused clock signals [18].



Figure (4). Clock gating based upon Tri-State buffer.

#### **Proposed Clock Gating**

The input signal Clk is supplied to the NAND gate and the tri-state buffer connection. Whenever the clock switches to 1 while En is 0, the NAND logic with the negative-edge clock generates an output of 1, and this output goes to the first clock generation stage, which generates the signal used for design control. The first stage is the tri-state logic, which has the Global Clock as one input and ground as the other input. As x switches to 1, this connection generates a clock signal used to control the latch. In the next cycle, when the clock flips to 0, the second clock generation stage, a NAND gate with En and the Global Clock at its inputs, generates a clock pulse that goes to the target unit when Gen goes to '1'. Since Gen is '1', the NAND generates '1', so the OR gate generates a constant HIGH at CClk (Composite Clock) until En turns to '0'. In this way GClk (Global Clock) keeps running while CClk stays at constant '1', which ensures that the latch keeps its state without any switching. The operation of the circuit across all steps can be seen from the signal outputs in Figure (4)b.

A tri-state buffer can be thought of as an input-controlled switch whose output can be electronically turned "ON" or "OFF" by an external "Control" or "Enable" (EN) input signal. This control signal can be either a logic "0" or a logic "1", which places the tri-state buffer either in a state where its output operates normally and produces the required output, or in a state where its output is blocked or disconnected.

A tri-state buffer therefore requires two inputs: the data input and the enable or control input, as shown in Figure (5).

When activated into its third state, the buffer disables or turns "OFF" its output, producing an open-circuit condition that is neither a logic "HIGH" nor a logic "LOW" but a state of very high impedance, High-Z, or more commonly Hi-Z. This type of device thus has two logic input states, "0" or "1", but can produce three different output states, "0", "1" or "Hi-Z", which is why it is called a "Tri" or "3-state" device. Figure (6) displays the RTL Viewer representation of the 8-bit ALU architecture described in Verilog HDL.



Figure (5). Tri-state Buffer Switch Equivalent.
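The three output states of the tri-state buffer described above can be captured in a few lines. The Python sketch below is a behavioral illustration only: the function name, the `"Z"` marker for the high-impedance state, and the assumption of an active-high enable are hypothetical choices, since the polarity of the enable depends on the specific buffer used.

```python
HI_Z = "Z"  # marker for the high-impedance (disconnected) output state


def tri_state_buffer(data, enable):
    """Behavioral model of a tri-state buffer with an active-high enable (assumed).

    Returns the data bit when enabled, otherwise the high-impedance marker.
    """
    if data not in (0, 1) or enable not in (0, 1):
        raise ValueError("data and enable must each be 0 or 1")
    return data if enable == 1 else HI_Z


# Truth table: two binary inputs, three possible output states.
for enable in (0, 1):
    for data in (0, 1):
        print(f"EN={enable} DATA={data} -> OUT={tri_state_buffer(data, enable)}")
```

With the enable at 0 the output is Hi-Z regardless of the data input, which corresponds to the open switch in Figure (5); with the enable at 1 the output follows the data input.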



Figure (6). RTL viewer of 8-bit ALU design.

### **Implementation Detail**

The ALU architecture was implemented at various clock frequencies, together with the proposed model for decreasing dissipated power. All frequencies were evaluated at the same temperature and voltage [19]. The dynamic, static and total power were calculated for each case [20]. In this work, clock gating strategies are applied to an 8-bit arithmetic logic unit (ALU). The result tables and waveform simulations are shown below. All experiments were carried out in Verilog HDL on the arithmetic logic unit architecture, using Quartus II 11.1 Web Edition (32-bit) and simulating with ModelSim-Altera 10.0c (Quartus II 11.1) Starter Edition. Power was calculated using Synopsys Power Compiler. Figure (7) shows the waveform simulation of the tri-state ALU design.



Figure (7). Waveform simulation of 8-bit ALU design with tri-state.

### Power analysis of ALU based on tri-state

#### Power dissipation of 8-bit ALU without tri-state

| No | Frequency<br>(MHz) | Dynamic<br>power<br>(mW) | Static<br>power<br>(mW) | Total<br>power<br>(mW) |
|----|--------------------|--------------------------|-------------------------|------------------------|
| 1  | 100                | 0.203658                 | 0.00007                 | 0.2044                 |
| 2  | 200                | 0.524645                 | 0.00006                 | 0.5259                 |
| 3  | 300                | 0.897623                 | 0.00001                 | 0.8991                 |
| 4  | 500                | 1.5766                   | 0.00002                 | 1.5781                 |

Table (1). Power dissipation of 8-bit ALU without tri-state.



Figure (8). Power analyses without tri-state.

Table (1) and Figure (8) show the power consumption of the 8-bit ALU when only the conventional clock gating method is applied to the circuit. The total power of the 8-bit ALU is 0.2044, 0.5259, 0.8991 and 1.5781 mW when the design operates at 100 MHz, 200 MHz, 300 MHz and 500 MHz, respectively.

#### Power dissipation of 8-bit ALU with tri-state

| No | Frequency<br>(MHz) | Dynamic<br>power<br>(mW) | Static<br>power<br>(mW) | Total<br>power<br>(mW) | Reduction<br>in total<br>power |
|----|--------------------|--------------------------|-------------------------|------------------------|--------|
| 1  | 100                | 0.164988                 | 0.00013                 | 0.1661                 | 18.73% |
| 2  | 200                | 0.431887                 | 0.00014                 | 0.4333                 | 17.60% |
| 3  | 300                | 0.673769                 | 0.00015                 | 0.6753                 | 24.90% |
| 4  | 500                | 1.3922                   | 0.00012                 | 1.3934                 | 11.70% |

Table (2). Power Consumption of 8-bit ALU with tri-state.



Figure (9). Power analyses with tri-state.

Table (2) and Figure (9) show the power consumption of the 8-bit ALU when clock gating with tri-state is applied to the circuit. The total power of the 8-bit ALU is 0.1661, 0.4333, 0.6753 and 1.3934 mW when the design operates at the same frequencies of 100 MHz, 200 MHz, 300 MHz and 500 MHz, respectively. Figure (10) compares the power consumption with and without tri-state.



Figure (10). Power analysis with tri-state versus without tri-state.
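The percentages in the last column of Table (2) can be checked against the total-power columns of Tables (1) and (2). The short Python sketch below recomputes them, assuming that the column expresses the reduction in total power relative to the design without tri-state; the variable and function names are illustrative.

```python
# Total power in mW, copied from Table (1) and Table (2).
FREQ_MHZ = [100, 200, 300, 500]
TOTAL_WITHOUT = [0.2044, 0.5259, 0.8991, 1.5781]   # without tri-state
TOTAL_WITH = [0.1661, 0.4333, 0.6753, 1.3934]      # with tri-state


def reduction_percent(baseline_mw, gated_mw):
    """Relative reduction in total power, in percent of the baseline."""
    return (baseline_mw - gated_mw) / baseline_mw * 100.0


reductions = [reduction_percent(b, g) for b, g in zip(TOTAL_WITHOUT, TOTAL_WITH)]
best_freq = FREQ_MHZ[reductions.index(max(reductions))]

for freq, red in zip(FREQ_MHZ, reductions):
    print(f"{freq} MHz: {red:.2f}% lower total power")
print(f"Largest reduction at {best_freq} MHz")
```

The recomputed values agree with the tabulated percentages to within about 0.01 percentage points, a difference attributable to rounding, and the largest reduction occurs at 300 MHz.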

#### 4. Conclusions

This work presented a new ALU design technique that saves more power. As Figure (10) shows, the tri-state technique reduces power consumption compared with the traditional design. The main contribution of this research is a new tri-state buffer connection with clock gating that improves design efficiency. Increasing dynamic power renders the design unstable; reducing switching activity therefore relieves the design of this constraint. The new low-power tri-state connection-based clock gating design was successfully synthesized and analyzed using Synopsys Power Compiler. A new NAND gate with tri-state-based clock gating technique with low power consumption is thus proposed. Comparative analysis of the dissipated power shows that the proposed design reduces dynamic power, lowering total power by up to 24.90% compared with the traditional design. The proposed design will also reduce the hardware complexity of the system.

## References

- [1] M. Hameed, A. K. Khmag, F. Z. Rokhani, A. R. Ramli, "VLSI Implementation of Huffman Design Using FPGA with A Comprehensive Analysis of Power Restrictions", *International Journal of Advanced Research in Computer Science and Software Engineering (IJARCSSE)*, vol. 5, no. 6, pp. 49-54, 2015.
- [2] Sahni, K., Rawat, K., Pandey, S., & Ahmad, Z. Low power approach for implementation of 8B/10B encoder and 10B/8B decoder used for high-speed communication. Paper presented at the 2014 2nd International Conference on Emerging Technology Trends in Electronics Communication and Networking (ET2ECN): 1 – 5.
- [3] J., Kawa, "Low Power and Power Management for CMOS—An EDA Perspective," *Electron Devices, IEEE Transactions on*, vol. 55, no. 1, pp. 186-196, 2008.
- [4] M. Hameed, F. Z. Rokhani, A. R. Ramli, "Low Power Approach for Implementation of Huffman Coding for High Data Compression", *International Journal of Advances in Electronics and Computer Science (IJAECS)*, vol. 2, no. 12, pp. 98-101, 2015.
- [5] Alidash, H.K.; Sayedi, S.M., "Activity aware clock gated storage element design," Electric-Engineering (ICEE), 2011 19th Iranian Conference on Electrical Engineering, ISSN: 2164-7054. 18 July 2011.
- [6] Kathuria, J.; Ayoubkhan, M., Noor, A. "A Review of Clock Gating Techniques", *MIT International Journal of Electronics and Communication Engineering*. vol. 1, no. 2, pp 106-114, 2011.
- [7] T. S. Kiong, N., Soin, "Low power clock gates optimization for clock tree distribution," Quality Electronic Design (ISQED), 2010 11th International Symposium on, vol., no., pp.488, 492, 22-24, 2010.
- [8] M. Hameed, A. K. Khmag, F. Z. Rokhani, A. R. Ramli, "CMOS Technology Using Clock Gating Techniques with Tri-State Buffer", *Walailak Journal of Science and Technology (WJST)*, vol. 14, no. 4, 2017.
- [9] Wimer, S., & Koren, I. "Design flow for flip-flop grouping in data-driven clock gating". *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 22, no. 4, pp. 771-778, 2014.
- [10] Xin, M., Horiyama, T., & Kimura, S. Automatic multi-stage clock gating optimization using ILP formulation. *IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences*, vol. 95, no. 8, pp. 1347-1358, 2012.
- [11] Aanandam, S. K. Deterministic clock gating for low power VLSI design. Thesis submitted on National Institute of Technology, Rourkela. (2007).
- [12] Vošalík, J. Design of a digital I2C slave IP block. Thesis submitted on Faculty of Information Technology Czech Technical University in Prague Department of Digital Design. (2012).
- [13] Chen, D., Cong, J., & Fan, Y. Low-power high-level synthesis for FPGA architectures. Paper presented at the Proceedings of the 2003 international symposium on Low power electronics and design: 134 – 139. (2003).
- [14] Dev, M. P., Baghel, D., Pandey, B., Pattanaik, M., & Shukla, A. Clock gated low power sequential circuit design. Paper presented at the 2013 IEEE Conference on Information & Communication Technologies (ICT): 440 – 444. (2013).
- [15] Podila, R. Asynchronous interface, implementation of complete ASIC design flow. *Thesis submitted* on California State University Northridge. (2013).
- [16] M. Hameed, "Low Power Approach for Implementation of Huffman Coding", Scholars' Press, 56 pages, 2018. ISBN-13: 978-620-2-31711-5. http://www.scholars-press.com
- [17] M. Hameed, H. Sh. Mogheer, A. Mansour. Power reduction using high speed with saving mode clock gating technique. Paper presented at the 2nd International Scientific Conference of Engineering Sciences (ISCES 2020), University of Diyala – Iraq. IOP Conf. Series: Materials Science and Engineering 1076 (2021) 012055. doi:10.1088/1757-899X/1076/1/012055.
- [18] M. Hameed, A. L. Hameed and K. N. Jasm, "Dynamic Power Reduction in Huffman Design using 130 nm Technology Library," 2022 International Conference on Computer Science and Software Engineering (CSASE), 2022, pp. 19-23, doi: 10.1109/CSASE51777.2022.9759786.
- [19] S. Ali Kadhim, O. A. Ibrahim, "Improving the Thermal Efficiency of Flat Plate Solar Collector Using Nano-Fluids as a Working Fluids: A Review", *Iraqi Journal of Industrial Research*, vol. 8, no. 3, 2021.
- [20] A. Abdulameer Khalaf, "Electrical Power Generation from Industrial Waste Heat Sources According to the Iraqi Environment", *Iraqi Journal of Industrial Research*, vol. 9, no. 2, 2022.