A Novel Two Level Edge Activated Carry Save Adder for High Speed Processors

With the growing demand for higher integration levels in VLSI and ULSI processors, memory capacity and ALU efficiency play a critical role in design. The chip size of a memory depends on the number of flip-flops (FFs), which are the basic cells that store binary values. An efficient adder is a key parameter in estimating the cost effectiveness of the multipliers used by an ALU. This paper focuses on clock frequency utilization and on low power consumption. It presents a novel Carry Save Adder (CSA) combined with the concept of two level clock triggering for high speed integrated circuits. The authors propose new Two Level Edge Triggered (TLET) FFs built with 14 transistors (14T) and 12 transistors (12T) that are efficient in terms of switching power dissipation and delay. The CSAs built with 14T and 12T FFs are compared in terms of Switching Power Dissipation (SPD) over supply voltages from 0.8 V to 2.0 V. The difference in SPD across this supply range is 132.0 nW for the CSA using 16T FFs, 85.6 nW for the CSA using 14T FFs and only 70.3 nW for the CSA using 12T FFs. The design makes full use of the clock signal.
Keywords: Carry Save Adder; digital integrated circuits; flip-flop; switching power dissipation; two level edge triggering


I. INTRODUCTION
At present, technology is evaluated by its computational capability. In industry, digital implementations prioritize structural design to obtain high-yield adders in electronic architectures. A Field Programmable Gate Array (FPGA) consists of numerous configurable logic gates [1]. FPGA designs tailor the architecture to market requirements and use modular arithmetic architectures to increase system performance through better utilization of the clock signal. Dual Data Rate operation in Very Large Scale Integration (VLSI) circuits has come into existence, and many Two Level Edge Triggered FF circuits have been proposed [2][3][4][5]. To increase clock performance, both edges of the clock are used to trigger the FFs, which raises the effective operating frequency. A pulse-triggered memory cell (FF) consists of a pulse generator and a bistable latch for storing binary values. The circuit complexity and the number of stages inside these pulse-triggered FFs are reduced to shorten the D-to-Q delay. Clock-edge-triggered FFs are broadly of two kinds: implicit pulse-triggered FFs and explicit pulse-triggered FFs [6]. In an implicit pulse-triggered FF the clock pulse is generated inside the FF; examples are the Data Close to Output FF, the Hybrid Latch FF (HLFF) and the Semi switching FF (SFF) [6].
In an explicit pulse-triggered FF (Ep-FF), the pulse is generated externally so that all neighboring FFs can share it. The Explicit Pulse Triggered Data Close to Output FF (EP-DCO) and the static Conditional Discharge FF (S-CDFF) [7] are the principal Ep-FF design techniques. Pulse-based FFs are valued mainly for their soft clock edge property, which allows time borrowing and reduces clock skew [8]. They also offer superior latency and can incorporate complex logic. Among the recently proposed Two Level Edge Triggered FFs is the EXDCO [8]. It uses NAND logic gates for clock pulse generation; the power consumption of the clock generator is low because the ON time of transistor MN2 is short. However, the path from data input (D) to data output (Q) shows more delay, because D has to traverse transistors MN1 and MN2 and the clock has to switch transistor MN3. Node X in the circuit has a long discharge time, which results in a positive setup time. To avoid the discharge at node X in the EP-DCO DETFF, the Explicit Pulse Triggered Conditional Discharge FF (EXCDFF) is proposed in [8].
The 16T Improved Two Level Edge Triggered FF (ITETFF) is obtained simply by replacing the transmission gates on the clock input with n-MOS switches [9]. It has two data paths from data in (D) to data out (Q) and shows master-slave FF behavior. Because an n-MOS transistor together with a CMOS inverter is used as the latch, it stores both a strong '0' and a strong '1'. The 16T ITETFF is free from the threshold voltage loss of pass transistors. Replacing the p-type pass transistor with an n-type transistor also reduces the area, because an NMOS transistor is smaller than a PMOS transistor; this compensates for the mobility difference between NMOS and PMOS devices. The modified two level edge triggered FF is therefore more efficient in area, power and speed than earlier FFs.
The concept of two level edge triggering is combined with the CSA to enhance the performance of adders, thereby improving the speed of multipliers.
The selection of a suitable adder with the required properties is characterized by different features [10][11][12][13]. In this case study, the most important criteria are low power and high frequency, which matter today for Internet of Things applications. Adders play a major role in signal processing, image processing and VLSI applications [14][15][16][17][18][19]. The demand for high-speed, low-power adders has led to a focus on design strategies for efficient adders. They not only work in the arithmetic logic unit of computers and other processors but are also used to compute addresses, table indices and parallel operations. Binary adders exist in an enormous variety of algorithms and implementations [20].
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 11, No. 4, 2020, www.ijacsa.thesai.org
In a CSA, the parallel architecture provides a significant performance benefit [19][20][21][22], and the computation can be carried out with a variety of operands. The first phase of the CSA computes the bit generate G_i and the bit propagate P_i for each bit position i. These are then used to compute the final sum and output carry bits, as given in equation (2).
In the contemporary semiconductor industry, the total power dissipation of CMOS ICs is strongly affected by leakage current and leakage power. Static and dynamic power dissipation increase exponentially, which affects the efficiency and value of the system [23]. The main contributors to dynamic switching power are the gate capacitance, load capacitance and wiring capacitance; the relation is expressed in equation (3). Subthreshold leakage current (I_SUB) and gate tunneling leakage current (I_G) are identified as the key sources of leakage current in all transistors. Subthreshold leakage occurs only in turned-off transistors [24], [25]. For an individual device, the leakage current can be calculated by equation (6).
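For reference, the generate and propagate terms and the resulting sum and carry bits are conventionally written as below. These are the standard textbook forms, assumed here rather than taken from the paper's own equations. A_i, B_i and C_i denote the i-th operand and carry bits, and the switching power expression uses the usual activity factor α, switched capacitance C_L, supply voltage V_DD and clock frequency f, symbols that the text itself does not define.

```latex
\begin{align*}
G_i &= A_i \cdot B_i, & P_i &= A_i \oplus B_i \\
S_i &= P_i \oplus C_i, & C_{i+1} &= G_i + P_i \cdot C_i \\
P_{\text{switching}} &= \alpha \, C_L \, V_{DD}^{2} \, f
\end{align*}
```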
In this paper, the leakage current of the CSA using 16T, 14T and 12T FFs is studied by varying the temperature from -40 °C to 120 °C. Fast Fourier Transforms (FFTs) are widely applied in image processing, data compression, signal spectral analysis, signal filtering, etc. In the biomedical field, medical imaging plays an important role in diagnosing various health conditions. Electrocardiography (ECG) and electroencephalography (EEG) are techniques for studying the patterns of signals generated by the heart and brain. Addition and multiplication are the two fundamental operations needed to carry out an FFT. Higher-order adders contribute larger delay and energy consumption, slowing down the FFT [26]. This implies that the proposed adder can be used efficiently in high speed processors.
Here W is the twiddle factor, defined as … (9). The development of high-speed algorithms, known as FFTs, has made computation of the DFT practical in real-time applications. The Fast Fourier Transform (FFT) and the Energy Scattered Spectrum (ESS) are powerful tools for analyzing and evaluating signals. The energy scattered analysis in an ESS of the FFT at the fundamental frequency is given by equations (10) to (11).
Here K is the spring constant, Q is the Q-factor of the ESS, θ is the phase of the fundamental frequency, P is the perturb ratio, and P1 and P0 are the perturbed amplitudes.
The FFT analysis of the CSA using 16T, 14T and 12T TLETFFs is carried out with the help of the ESS. The CSA using 12T FFs shows lower energy consumption at the fundamental frequency (Fig. 1).
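To make the role of the twiddle factor concrete, the sketch below evaluates a direct DFT in Python. It assumes the conventional definition W = exp(-j2π/N), which is not spelled out in the text, and the function name `dft` is illustrative. An FFT produces the same result with fewer additions and multiplications, which is why adder delay and energy matter for it.

```python
import cmath

def dft(samples):
    """Direct O(N^2) DFT: X[k] = sum over n of x[n] * W**(k*n)."""
    n_points = len(samples)
    # Twiddle factor, assumed conventional definition (not taken from the paper)
    w = cmath.exp(-2j * cmath.pi / n_points)
    return [
        sum(x * w ** (k * n) for n, x in enumerate(samples))
        for k in range(n_points)
    ]

# A unit impulse has a flat spectrum; a constant signal has only a DC term.
impulse_spectrum = dft([1, 0, 0, 0])
constant_spectrum = dft([1, 1, 1, 1])
```

Each output bin costs N complex multiplications and N - 1 complex additions in this direct form, so the adder sits on the critical path of every bin.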

A. 14T Improved Two Level Edge Triggered FF (TLETFF-1)
A different FF, TLETFF-1, is proposed by modifying the 16T FF: the two n-MOS transistors connecting the back-to-back inverters are removed. In the circuit of Fig. 2, the n-MOS transistor in the upper D-to-Q path is switched by clkb for the read operation during the positive edge, and in the lower path the clk signal switches the n-MOS transistor connecting the inverters for the read operation. Because two transistors in a single data path switch on the same clock edge for the same operation, the n-MOS transistors connecting the back-to-back inverters are redundant and are removed. This reduces the transistor count to 14. The operation of the new 14T TLETFF is shown in Fig. 2.

B. Improved Two Level Edge Triggered FF-2 (TLETFF-2)
In the Improved Two Level Edge Triggered FF-2, the upper data path 1 is activated on the rising ('0' to '1') clock edge and the lower data path 2 is activated on the falling ('1' to '0') edge. In this memory cell, an inverter and a PMOS transistor form the bistable element that holds the bit value. This reduces the transistor count to 12. The operation of the new 12T TLETFF is shown in Fig. 3.
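The two-path behaviour described above can be illustrated with a purely behavioural Python model. This is a sketch under stated assumptions, not a transistor-level description of the 14T or 12T circuits: the class name `DualEdgeFlipFlop` and its `sample` method are hypothetical, and the model only captures that D is transferred to Q on both rising and falling clock edges.

```python
class DualEdgeFlipFlop:
    """Behavioural model: Q takes the value of D on every clock transition."""

    def __init__(self):
        self.prev_clk = 0
        self.q = 0

    def sample(self, clk, d):
        rising = self.prev_clk == 0 and clk == 1    # data path 1
        falling = self.prev_clk == 1 and clk == 0   # data path 2
        if rising or falling:
            self.q = d
        self.prev_clk = clk
        return self.q

ff = DualEdgeFlipFlop()
clk_trace = [0, 1, 1, 0, 0, 1]
d_trace = [1, 1, 0, 0, 1, 1]
q_trace = [ff.sample(c, d) for c, d in zip(clk_trace, d_trace)]
```

In this trace the output changes on the rising edge at the second sample and again on the falling edge at the fourth sample, so one clock period carries two data captures.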

III. CARRY SAVE ADDER (CSA) USING TWO LEVEL EDGE TRIGGERED FFS
As Dual Data Rate VLSI circuits have emerged, numerous Two Level Edge Triggered FF circuits have been proposed. To increase clock performance, both edges of the clock are used to trigger the FFs, which raises the effective operating frequency. Pulse-based FFs are valued mainly for their soft clock edge property, which allows time borrowing and reduces clock skew. They also provide superior latency and are capable of incorporating complex logic [13][14][15]. CSAs are fast and accurate adders used in high speed processor applications. The frequency at which the sum bits are generated can be doubled by applying the TLETFF concept. The intermediate carry bits are stored in flip-flops until the individual sum bits are generated, and these are then summed to obtain the final sum and carry out. In Fig. 5, for the CSA using 14T TLETFFs, A and B are 6-bit wide inputs with A = "010101", B = "000100" and carry input Cin = '0'; the output sum from S5 to S0 is "110010" and Cout = '0'. The AND gate output is the intermediate carry, which is stored in a 14T TLETFF and given as one input to the EX-OR gate. The second input to the EX-OR gate at the output is the sum of the next corresponding bits. By introducing TLETFFs to save the carry bits, the frequency of operation is doubled and the clock efficiency increases to 100%. For the CSA using 12T TLETFFs, A and B are 6-bit wide inputs with A = B = "100011" and carry input Cin = '1'; the output sum from S0 to S5 is "000111" and Cout = '1'. For the CSAs implemented using 12T and 14T FFs, the clock frequency is equal to the frequency at which the sum bits are generated. The results of the CSAs using 14T and 12T FFs are compared in terms of power, delay and leakage currents for a 6-bit adder and extended up to a 32-bit CSA.
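The interplay of XOR sum bits and stored AND carry bits described above can be sketched in Python. This is a functional model under assumptions, not the paper's gate-level circuit: the function name `carry_save_add` is illustrative, each loop iteration stands in for one clock edge on which the stored carries are fed back, and the bit strings of the 12T example are read most significant bit first.

```python
def carry_save_add(a, b, cin=0, width=6):
    """Add two unsigned `width`-bit integers by repeated carry-save steps."""
    mask = (1 << width) - 1
    # First stage: per-bit sum (XOR) and saved carry (majority), carry shifted left
    partial_sum = a ^ b ^ cin
    saved_carry = ((a & b) | (a & cin) | (b & cin)) << 1
    # Later stages: feed the stored carries back until none remain
    while saved_carry:
        partial_sum, saved_carry = (
            partial_sum ^ saved_carry,
            (partial_sum & saved_carry) << 1,
        )
    return partial_sum & mask, (partial_sum >> width) & 1

# 12T example from the text: A = B = 100011, Cin = 1
total, cout = carry_save_add(0b100011, 0b100011, cin=1)
```

With these inputs the model returns the 6-bit sum 000111 and a carry out of 1, matching the values quoted for the 12T case under the MSB-first reading.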

A. Power and Delay Analysis of Two Level Edge Triggered FFs
All the circuits are functionally verified, and the calculations are done using Mentor Graphics tools at the 45 nm technology node. Area (number of transistors), delay and power of the Carry Save Adder using 16T, 14T and 12T Two Level Edge Triggered FFs are compared. The Carry Save Adder using 12T FFs gives the most efficient results compared with the previous methods. The sum and carry outputs of the adder are available at every edge of the clock signal. Fig. 4 shows the switching power dissipation of the 16T, 14T and 12T FFs. The change between the maximum (at 100 fF) and minimum (at 0 fF) power dissipation is 51 µW for the 16T FF, 44 µW for the 14T FF and 37 µW for the 12T FF. The switching power of the 12T FF is reduced by 27.4% compared with the 16T FF, and that of the 14T FF is reduced by 13.5% compared with the 16T FF. The 12T FF thus shows a larger percentage reduction in power than the 14T FF.
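The percentage reductions quoted above are relative differences with respect to the 16T FF. A minimal Python sketch of that arithmetic follows, assuming the percentages are derived from the rounded 51 µW, 44 µW and 37 µW deltas. The text does not state its exact inputs, so small differences from the reported 27.4% and 13.5% are to be expected; the function name is illustrative.

```python
def percent_reduction(reference, improved):
    """Relative reduction of `improved` with respect to `reference`, in percent."""
    return (reference - improved) / reference * 100.0

# Power deltas between 0 fF and 100 fF load, in microwatts (values from the text)
delta_16t, delta_14t, delta_12t = 51.0, 44.0, 37.0

reduction_12t = percent_reduction(delta_16t, delta_12t)  # roughly 27.45
reduction_14t = percent_reduction(delta_16t, delta_14t)  # roughly 13.73
```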

B. Delay Analysis
There are four timing parameters: rise transition time, fall transition time, high-to-low propagation delay and low-to-high propagation delay. The rise transition time (tr) is the time taken for the output to change from 10% to 90% of its maximum value during a transition. The fall transition time (tf) is the time taken for the output to change from 90% to 10% of its maximum value. Some designs instead use 30% to 70% for the rise time and 70% to 30% for the fall time; the thresholds vary between designs.
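The rise-time definition above can be applied to a sampled waveform as in the following Python sketch. It assumes a monotonically rising edge given as parallel lists of time and voltage samples and uses linear interpolation between samples; the function names and the `low` and `high` threshold parameters are illustrative and do not come from the simulation tool used in the paper.

```python
def crossing_time(times, values, level):
    """First time the waveform crosses `level` upward (linear interpolation)."""
    for i in range(len(times) - 1):
        v0, v1 = values[i], values[i + 1]
        if v0 < level <= v1:
            t0, t1 = times[i], times[i + 1]
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")

def rise_time(times, values, low=0.1, high=0.9):
    """Time between the `low` and `high` fractions of the output swing."""
    v_min, v_max = min(values), max(values)
    swing = v_max - v_min
    t_low = crossing_time(times, values, v_min + low * swing)
    t_high = crossing_time(times, values, v_min + high * swing)
    return t_high - t_low

# Ideal linear ramp from 0 V to 1 V in 1 ns: the 10%-90% rise time is 0.8 ns
ramp_rise = rise_time([0.0, 1e-9], [0.0, 1.0])
```

Passing `low=0.3, high=0.7` gives the alternative 30% to 70% measurement mentioned in the text.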
The high-to-low propagation delay (tpHL) is the delay with which the output changes from high to low after the input changes from low to high. The delay is normally measured at the 50% point of the input and output transitions. Table I lists the rise and fall transition times calculated at capacitive loads from 10 fF to 100 fF. The slope of the delay is smaller for the 14T FF than for the 12T FF, as the change in rise and fall delay is smaller. For the 12T FF the rise and fall of the signal start at 0 ns, whereas for the 14T FF they start at a positive value, which increases the propagation delay of the FF.
Leakage currents for the 16T, 14T and 12T FFs are calculated at temperatures ranging from -40 °C to 120 °C. The 14T FF shows lower leakage current than the 12T FF, because in the 12T FF a p-MOS transistor connected in the feedback loop creates a short-circuit path from the Vdd of the p-MOS transistor to ground through the n-MOS transistor of the CMOS inverter.
From Fig. 4 and Table I, it is clear that the 12T TLETFF is efficient in terms of switching power dissipation and delay. The leakage current of the 12T FF increases with temperature because of the p-MOS transistor in the feedback path, as shown in Fig. 5.
This paper discusses the implementation and evaluation of the Carry Save Adder using three types of flip-flops. Fig. 6, 7 and 8 show the output waveforms of the 6-bit CSA using 16T, 14T and 12T FFs. The carry output is generated at both edges of the clock because of the two level edge triggered flip-flops incorporated into the adder.
The waveform in Fig. 6 shows the sum and carry outputs of the CSA using 16T flip-flops. The sum bits S0 to S5 are highly distorted by glitches. The delay of the carry output with respect to the carry input is 904.7 ps at a clock frequency of 1 GHz, which is high compared with the CSAs using 14T and 12T FFs. Fig. 7 and 8 show the sum and carry output waveforms of the two proposed techniques. The sum outputs S0 to S5 are smooth and free from glitches. From the two waveforms it can be clearly observed that the carry output is generated at every edge of the clock signal. Fig. 9 shows the transition delay of the carry output with respect to the carry input at different output loads. Capacitances from 0 fF to 100 fF are connected to the carry output of the CSA circuits designed using 16T, 14T and 12T TLETFFs. The delay of the CSA using 12T TLETFFs is reduced by 27% compared with the CSA using 16T TLETFFs, and its maximum transition delay is only 1.109 ns. The delay reduction for the CSA using 14T TLETFFs with respect to the CSA using 16T TLETFFs is 16.2%.
Table II shows the switching power dissipation of the CSA using the existing and proposed two level edge triggered flip-flops. The variation in power dissipation from 0.8 V to 1.2 V is 133.5 nW for the CSA using 16T TLETFFs, 86.1 nW for the CSA using 14T TLETFFs and 70.5 nW for the CSA using 12T TLETFFs. It can be inferred that the CSA using 12T FFs gives the most desirable result compared with previous techniques. This shows that the 12T flip-flop can be used over a wide range of fluctuating supply voltages.

Using low-power, area-efficient and high-speed multipliers and adders in the Fast Fourier Transform (FFT) improves performance and efficiency. Fig. 10, 11 and 12 show the FFT analysis of the CSAs constructed using 12T, 14T and 16T TLETFFs, carried out over a frequency range of 250 MHz to 8 GHz. The energy dissipation at the fundamental frequency is lower for 12T than for 14T and 16T TLETFFs. It is clear that the energy dissipated at the fundamental and resonant frequencies is lowest for 12T. From the above discussion we conclude that the adder using 12T FFs can be used for high speed DSP processors.

V. CONCLUSION
The ESS graphs clearly show that the energy dissipated at the fundamental frequency for the CSA using 12T TLETFFs is only 168.9 mV, whereas it is 290.33 mV for 14T and 314.20 mV for 16T. The 12T TLETFF shows better performance in terms of power and delay with reduced area. Because of the p-MOS transistor connected in the bistable element of the 12T FF, its leakage current is higher than that of both the 14T and 16T FFs. The Carry Save Adder implemented using Two Level Edge Triggering generates the carry output at both the rising edge and the falling edge of the clock, which improves clock efficiency to 100%. The variation in transition delay of the carry output from no load to 100 fF is smaller for the CSA using 12T than for the CSAs using 14T and 16T TLETFFs. The switching power dissipation of the CSA using 12T at 1.2 V is reduced by 46.42% relative to the existing technique, and for the CSA using 14T it is reduced by 28.78%.