Energy-Efficient, Noise-Tolerant CMOS Domino VLSI Circuits in VDSM Technology

Abstract: Compared to static CMOS logic, dynamic logic offers higher speed. Wide fan-in dynamic logic such as domino is often used in performance-critical paths where static CMOS fails to meet performance objectives. However, domino gates typically consume more dynamic switching and leakage power and have weaker noise immunity than static CMOS gates. To address these problems in existing designs, novel energy-efficient domino circuit techniques are proposed. The proposed techniques reduce dynamic switching power, short-circuit current overhead and idle-mode leakage power, and improve evaluation speed and noise immunity in domino logic circuits. They also reduce the power-delay product (PDP) compared to standard full-swing circuits in deep-submicron CMOS technology. In addition, the noise immunity of CMOS domino circuits with various techniques and keepers is analyzed. Various noise sources are considered and noise-immune domino logic is proposed.


INTRODUCTION
Dynamic domino logic circuits are widely used in modern digital VLSI circuits. They are often favoured in high-performance designs because of their speed advantage over static CMOS logic circuits. The main drawbacks of dynamic logic are a lack of design automation, decreased noise tolerance and increased power dissipation: domino gates typically consume more dynamic switching and leakage power and have weaker noise immunity than static CMOS logic circuits. In this paper, novel energy-efficient domino circuit techniques are proposed.
This paper is organized as follows. In Section II, a dual-rail domino circuit with a self-timed precharge scheme is proposed. The pseudo-footless dynamic circuit technique is presented in Section III. Section IV describes performance evaluation results of energy-efficient dual-Vt domino logic. Section V describes the noise-immune domino logic. Conclusions are presented in Section VI.

II. DUAL-RAIL DOMINO FOOTLESS CIRCUIT WITH SELF-TIMED PRECHARGE SCHEME (DRDFSTP):
Conventional domino circuits: In this section, several conventional domino circuits with their own clocking schemes are briefly reviewed.

A. Dynamic DCVSL Footed Circuit (DDCVSLF):
Fig. 1 shows an AND/NAND dynamic DCVSL footed circuit. One disadvantage of this kind of domino circuit is that the foot transistor slows the gate somewhat, as it presents an extra series resistance. Moreover, simultaneous precharge may cause unacceptable IR-drop noise.
Fig. 2 shows an AND/NAND dynamic DCVSL footless circuit. Footless domino gates bring two benefits: improved pull-down speed and reduced precharge signal load. The main disadvantage is that simultaneous precharge causes short-circuit current.
Fig. 3 illustrates the delayed-reset domino AND/NAND circuit [3]. However, the use of delay elements, together with the need for both footed and footless cell libraries, tends to increase design complexity.
The D4L circuit uses input signals instead of a precharge signal for correct precharge and evaluation sequencing [5]. Consequently, clock-buffering and clock-distribution problems are eliminated. Furthermore, the foot transistor can be removed without causing a short-circuit problem. A D4L two-input AND/NAND gate is shown in Fig. 4.

Dual-Rail Domino Footless Circuit with Self-Timed Precharge Scheme (DRDFSTP):
The foot transistor in the conventional dynamic DCVSL circuit slows the gate somewhat, as it presents an extra series resistance. To remove the transistor safely, two constraints must be met: (1) the gate enters the evaluation phase before valid inputs arrive; (2) the gate enters the precharge phase only after the inputs return to zero. We propose a footless dual-rail domino circuit with a self-timed precharge scheme to realize a high-performance footless domino circuit that meets these constraints. The self-timed precharge scheme is also expected to reduce the peak precharge current. Fig. 5 shows the AND/NAND gate of the proposed circuit. The self-timed precharge control logic consists of static CMOS inverters whose NMOS sources are tied to the input signals; these generate the sub-precharge signals (PC1-PC4) from the precharge signal P when the corresponding input signals are zero. The PMOS precharge tree above the pull-down network (PDN) precharges the corresponding gate.
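The self-timed precharge control can be sketched behaviourally as follows. This is a minimal logic-level sketch, not the authors' netlist: it assumes that P is active high, that each sub-precharge signal is active low (it drives a PMOS gate), and that the precharge tree conducts only when all sub-precharge signals are low. The function names `sub_precharge` and `gate_precharges` are hypothetical.

```python
from typing import Sequence


def sub_precharge(p: int, x: int) -> int:
    """Active-low sub-precharge signal PC_i for precharge signal p and input x.

    Assumed model: a CMOS inverter driven by p whose NMOS source is tied to x.
    """
    if p == 0:
        return 1  # inverter PMOS conducts: PC_i is high, precharge device is off
    return x      # inverter NMOS conducts: PC_i follows the input on its source


def gate_precharges(p: int, inputs: Sequence[int]) -> bool:
    """True when every device of the (assumed series) PMOS precharge tree conducts."""
    return all(sub_precharge(p, x) == 0 for x in inputs)


if __name__ == "__main__":
    # Four rail inputs of a dual-rail AND/NAND gate, all returned to zero.
    print(gate_precharges(1, [0, 0, 0, 0]))
    # One input still high: precharge is held off, as constraint (2) requires.
    print(gate_precharges(1, [0, 1, 0, 0]))
```

Under these assumptions a gate starts precharging only once P is asserted and all of its inputs have returned to zero, which is the condition that makes removing the foot transistor safe.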

Footed domino circuit with a global clock (FD):
Fig. 6 shows the most conventional domino circuit, which comprises footed domino gates driven by a common clock.

Footed domino circuit with delayed clocks:
To improve logic construction flexibility, the clock-delayed domino (CD-domino) circuit, shown in Fig. 8, allows both positive and negative logic gates within a block. To achieve this flexibility, the rising clock edge of a gate must be delayed until all incoming data settle. However, the delayed evaluation and the footed gates seriously degrade the performance of the whole circuit. In this work, we adopt an improved delayed-evaluation clocking style to preserve the logic construction flexibility, and add new circuit techniques to remove the other source of speed limitation, namely the use of footed gates.

Pseudo-footless domino circuit (PF-domino):
The pseudo-footless domino circuit (PF-domino) is shown in Fig. 9. The circuit structure of the PF-domino is the same as that of the CD-domino circuit, with two differences. First, all the logic gates are pseudo-footless (PF) dynamic gates (as the inset gate shows) rather than footed gates. Second, an enhanced self-timed delayed-evaluation clocking scheme replaces the simple clock-delayed scheme used in the CD-domino circuit. These two techniques are introduced in turn below.
The pseudo-footless dynamic gates: The pseudo-footless dynamic circuit technique has been proposed previously. The PF gate inset in Fig. 9 is the primitive version, which is quite similar to a typical footed domino gate except that MN is moved up to sit directly beneath MP. The preferred PDN function is NOR. Such an arrangement benefits both speed and power. First, in the dynamic part only a small output node is precharged, so the charge that is discharged, when a discharge occurs, is much smaller than in a conventional footed gate. Second, all data inputs are required to be ready before the clock rises. Most of the charge in the PDN has therefore been discharged before the evaluation phase, which results in a very fast discharge during evaluation. This mechanism is where the name "pseudo-footless" comes from.
When used in a general domino environment, the PDN may realize a complicated large-fan-in function, and the increased capacitance at node n2 slows down the discharge. The circuit shown in Fig. 10(a) is proposed to speed up the gate in this condition. The transistor MD is added in parallel with the PDN and is activated in the precharge phase to deplete the charge at n2 in advance. During evaluation, MD is initially disabled because n1 is high. If n1 is pulled down, MD turns on to help the discharge. This gate is called a fast PF gate. When the capacitance of n2 is much larger than that of n1, charge sharing must be considered. In this case, we can use the gate shown in Fig. 10(b), a robust PF gate, in which a second keeper MK2 replenishes the charge on n1 when it is subject to a voltage fluctuation due to charge sharing.
The output loading and the fan-in number are the dominant factors that determine the performance of PF gates. Hence, we need to find out which type of PF gate is the best choice for each loading and fan-in combination. First, different PF gates with different fan-in numbers are designed and characterized for various loading conditions. Second, the fastest circuit without the charge sharing effect is taken as the best choice.
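The two-step selection rule (characterize, then pick the fastest gate that shows no charge sharing) can be illustrated with a short sketch. The `Characterization` record, the function name `best_pf_gate` and all numbers below are hypothetical placeholders for one fan-in and loading combination; they are not measurements from this work.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Characterization:
    gate_type: str        # "primitive", "fast" or "robust" PF gate
    delay_ps: float       # characterized delay for this fan-in and load
    charge_sharing: bool  # True if a charge sharing failure was observed


def best_pf_gate(candidates: Iterable[Characterization]) -> Optional[Characterization]:
    """Return the fastest candidate that is free of charge sharing, if any."""
    safe = [c for c in candidates if not c.charge_sharing]
    return min(safe, key=lambda c: c.delay_ps) if safe else None


# Placeholder characterization data (illustrative only).
candidates = [
    Characterization("primitive", 42.0, False),
    Characterization("fast", 35.0, True),     # fastest, but rejected
    Characterization("robust", 38.0, False),
]

if __name__ == "__main__":
    print(best_pf_gate(candidates))
```

In this made-up case the fast PF gate has the smallest delay but is excluded because of charge sharing, so the robust PF gate is selected.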

The enhanced self-timed delayed-evaluation:
The delay element is the key component for speed, as explained in the following. If a gate receives only non-inverted inputs, the arrival time of the rising clock edge cannot cause a malfunction. In this case, the clock signal is usually designed to arrive ahead of the data inputs so that a higher speed is obtained. For a gate with at least one pull-down path controlled by inverted inputs, the clock signal must be delayed until all the data inputs settle, to avoid an unrecoverable error. A sufficient margin must be kept in this delay to cover PVT variations. In the CD-domino circuit, a simple buffer-type delay element is used, which requires quite a large delay margin and causes considerable performance degradation. We propose to use a more robust self-tracking delay element.
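The timing rule in this paragraph can be written as a small calculation. This is a simplified sketch under stated assumptions: the function name `required_clock_delay`, the representation of the PVT margin as a fraction of the latest data arrival, and the example numbers are all illustrative and do not come from the paper.

```python
from typing import Sequence


def required_clock_delay(arrival_ps: Sequence[float],
                         has_inverted_input: bool,
                         margin: float = 0.2) -> float:
    """Minimum delay of the rising clock edge for one gate, in picoseconds.

    A gate with only non-inverted inputs needs no delay: an early clock
    cannot cause a malfunction. A gate with an inverted input must wait
    for the latest data arrival plus an (assumed fractional) PVT margin.
    """
    if not has_inverted_input:
        return 0.0
    return max(arrival_ps) * (1.0 + margin)


if __name__ == "__main__":
    # Latest input settles at 150 ps; a 20% margin is assumed.
    print(required_clock_delay([100.0, 150.0], has_inverted_input=True))
    print(required_clock_delay([100.0, 150.0], has_inverted_input=False))
```

A delay element that tracks the data path more closely allows a smaller `margin`, which is why the choice of delay element directly affects speed.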

Simulation results:
OR2, AND2 and XOR2 gates are implemented using the above techniques. The design styles are compared through detailed transistor-level simulations of benchmark circuits, using the DSCH3 and Microwind3 CAD tools for 65 nm technology.

A. Standard single threshold (low-Vt) voltage
Here, all transistors in the benchmark circuits are standard low-threshold-voltage devices (Vt = 0.4 V); the circuits are simulated using DSCH and Microwind 3.1.

B. Standard single threshold (high-Vt) voltage
Here, all transistors in the benchmark circuits are standard high-threshold-voltage devices (Vt = 0.7 V).

C. Standard dual threshold voltage
This dual-threshold CMOS (DTCMOS) design technique uses fast low-threshold-voltage (LTV) and slow high-threshold-voltage (HTV) devices. The aim of DTCMOS is to maximize the leakage reduction obtained from the HTV devices without degrading circuit performance. Here, the PMOS and NMOS transistors in the output inverter are high-Vt devices and the remaining transistors are low-Vt devices.
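To see why high-Vt devices are attractive for leakage, a first-order estimate is useful. The sketch below uses the standard textbook approximation that subthreshold leakage falls by one decade for every subthreshold swing S of threshold increase. The value S = 100 mV/decade is an assumption chosen for illustration, not a parameter reported in this work; the two threshold voltages are the ones quoted for the low-Vt and high-Vt devices.

```python
def leakage_ratio(vt_low: float, vt_high: float,
                  swing_v_per_decade: float = 0.1) -> float:
    """First-order ratio of subthreshold leakage, low-Vt device over high-Vt device.

    Assumes I_sub is proportional to 10 ** (-Vt / S), with S the subthreshold
    swing in volts per decade (0.1 V/decade is an assumed, typical value).
    """
    return 10.0 ** ((vt_high - vt_low) / swing_v_per_decade)


if __name__ == "__main__":
    # Vt = 0.4 V (low) versus Vt = 0.7 V (high), as used for the benchmarks.
    print(f"{leakage_ratio(0.4, 0.7):.0f}x")
```

Under this assumption a 0.3 V increase in Vt reduces subthreshold leakage by roughly three orders of magnitude, at the cost of a slower device, which is why the high-Vt devices are kept off the speed-critical evaluation path.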

D. Modified dual-Vt technology
This is the proposed technology, a modification of the standard dual-threshold technology. In the standard dual-Vt technology, the transistors of the output inverter in CMOS domino logic are high-Vt transistors. In the modified dual-Vt technology, only the pull-down transistor is a standard high-Vt transistor and

Simulation results:
In this work, we implemented benchmark circuits using the above four technologies. The figure of merit used to compare them is the power-delay product (PDP). The benchmark circuits are AND2, OR2, OR8, OR16, XOR2, a 16-bit adder, a 16-bit comparator, a D-latch and a 4-bit LFSR; the results are given in Tables 1-9. The OR2 gate in the proposed technologies is illustrated in Figures 12 and 13.
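The figure of merit is simply the product of average power and propagation delay, so a lower PDP means less energy per operation. The sketch below shows how techniques could be ranked by PDP. The function names, the technique labels and all numbers are hypothetical placeholders for illustration; they are not results from the tables.

```python
from typing import Dict, List, Tuple


def power_delay_product(power_w: float, delay_s: float) -> float:
    """PDP in joules: average power (W) multiplied by propagation delay (s)."""
    return power_w * delay_s


def rank_by_pdp(measurements: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return technique names ordered from lowest (best) to highest PDP."""
    return sorted(measurements,
                  key=lambda name: power_delay_product(*measurements[name]))


# Placeholder (power in W, delay in s) pairs; NOT data from this work.
example = {
    "technique_a": (12e-6, 80e-12),
    "technique_b": (6e-6, 140e-12),
    "technique_c": (7e-6, 95e-12),
}

if __name__ == "__main__":
    for name in rank_by_pdp(example):
        print(name, power_delay_product(*example[name]))
```

Note that the technique with the lowest power or the lowest delay alone is not necessarily the one with the lowest PDP, which is why the product is used for comparison.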

V. NOISE IMMUNE DOMINO LOGIC CIRCUITS
In domino gates, noise immunity is sacrificed for high performance. The DC noise margin of a domino gate is equal to the threshold voltage of its pull-down transistors. Unlike in static CMOS gates, charge lost from the dynamic node due to noise cannot be restored in a basic domino gate, which makes domino gates more vulnerable to noise than static CMOS gates. A keeper is therefore used to restore any loss of charge from the dynamic node. An analytical noise model for domino gates that takes the keeper effect into account is considered.

Noise Margin:
The noise margin is the maximum voltage amplitude of an extraneous signal that can be algebraically added to the noise-free worst-case input level without causing the output voltage to deviate from the allowable logic voltage level.
A typical n-type domino CMOS logic gate, shown in Fig. 16, consists of clock-controlled transistors M1 and M2, an n-type pull-down transistor network, and an output driver. The operation of a domino CMOS logic gate can be divided into two phases. In the precharge phase, when the clock CLK is low, the dynamic node is charged to logic high through M1 and the output of the NOT gate is low. The evaluation phase starts when the clock goes high. In this phase, M1 is OFF and M2 is ON. The dynamic node discharges or retains its charge depending on the inputs to the pull-down network. A two-input AND gate is illustrated in Fig. 17.
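The two-phase operation can be captured in a small behavioural model of the two-input AND gate. This is an idealized logic-level sketch (no leakage, noise, keeper or timing), and the class name `DominoAnd2` is hypothetical.

```python
class DominoAnd2:
    """Idealized two-input n-type domino AND gate."""

    def __init__(self) -> None:
        self.dynamic_node = 1  # assume the gate starts precharged

    def step(self, clk: int, a: int, b: int) -> int:
        """Apply one clock level with inputs a and b; return the gate output."""
        if clk == 0:
            # Precharge: M1 is ON, M2 is OFF, the dynamic node is charged high.
            self.dynamic_node = 1
        elif a and b:
            # Evaluation: M2 is ON and the series pull-down network conducts.
            self.dynamic_node = 0
        # Otherwise the dynamic node keeps its charge (ideally).
        return 1 - self.dynamic_node  # output inverter


if __name__ == "__main__":
    gate = DominoAnd2()
    print(gate.step(0, 1, 1))  # precharge: output is low whatever the inputs
    print(gate.step(1, 1, 1))  # evaluation with A = B = 1: output rises
    print(gate.step(1, 0, 0))  # node cannot recharge until the next precharge
```

The last call shows why domino inputs must be monotonic during evaluation: once the dynamic node has discharged, only the next precharge phase can restore it, so a noise pulse that falsely turns on the pull-down network produces an unrecoverable error.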
Noise sources in dynamic logic circuits can be broadly classified into two basic types: (1) gate-internal noise, including charge sharing noise and leakage noise; (2) external noise, including input noise, power and ground noise, and substrate noise.

Domino Noise Model:
Fig. 18 describes the noise model for domino gates. Note that the keeper effect does not add any computational cost, since T is obtained from the already available input noise pulse and I_k-max can be precharacterized.
In the internal-node precharging technique, the internal nodes of the pull-down network are precharged along with the dynamic node, which prevents charge sharing. An example of a dynamic 3-input AND gate using this technique is illustrated in Fig. 19. Techniques based on precharging internal nodes alone, however, are not very effective against external noise.
The pull-up technique, shown in Fig. 20, employs a PMOS transistor at node N2 that forms a resistive voltage divider with the bottom clock-controlled transistor. One major drawback of this technique is the DC power consumed in the resistive voltage divider. Furthermore, since the voltage at the dynamic node S can never fall below the voltage at node N2, the voltage swing at node S is not rail-to-rail. When the PMOS pull-up transistor is made large in an effort to raise gate noise immunity aggressively, the gate output may also lose its rail-to-rail swing.
An improved method, shown in Fig. 21, employs a pull-up transistor with feedback control. Here an NMOS transistor M1 pulls up the voltage of an internal node. This design allows the pull-up transistor to be shut off when the voltage of the dynamic node goes low, so the dynamic node S undergoes a rail-to-rail voltage swing. The DC power consumption problem is also partially solved.
The mirror technique employs a feedback-controlled NMOS transistor similar to the NMOS pull-up technique. In addition, it duplicates the pull-down network to further reduce DC power consumption and further improve gate noise tolerance. A 2-input dynamic AND gate designed using the mirror technique is shown in Fig. 22. However, this technique significantly lengthens the discharge path in the pull-down network, which potentially leads to a slower circuit, or to considerably increased active area when the transistors are aggressively sized.
The NMOS two-transistor technique adopts NMOS pull-up transistors at all internal nodes to further improve dynamic gate noise immunity. In addition, the drain nodes of the pull-up NMOS transistors are connected to the inputs instead of to the power-supply network, as illustrated in Fig. 23. As an example, Fig. 24 shows a 3-input OR-AND gate implementing the logic function (A + B)·C. Assume input A is high while inputs B and C are low. The dynamic node S stays high because C is low and there is no discharge path to ground. In this scenario, there is a DC conducting path between the two inputs A and B, as illustrated in Fig. 25.

Fig. 26. Complementary weak p-network technique
The basic principle of this class of techniques is to construct a weak complementary p-network that prevents the dynamic node from floating in the evaluation phase. One such technique is illustrated in Fig. 26. In addition to the silicon area overhead of the pull-up network, a major practical drawback of this technique is its ineffectiveness for very wide logic gates, for example wide OR gates, which is exactly where dynamic logic styles outperform static CMOS logic gates.
Inverter Technique (CMITQ):

Noise immune logic using different keepers: Domino Always on Keeper (DAOK):
The always-on keeper uses a "weak" PMOS device between the output node and VDD, as shown in Figure 30. Because its gate is connected to GND, this PMOS device is always turned ON. So, even in the evaluation phase, the output node remains connected to VDD to some degree. The PMOS keeper maintains the charge on the output node even at slower clock speeds. Although this configuration has advantages, it introduces another PMOS device into each stage and also causes excess power dissipation, because a conducting path can form from VDD to GND through the PMOS keeper and the NMOS devices.

Domino Feedback Keeper (DFBK):
The use of a keeper PMOS in dynamic logic can be improved further by connecting the gate of the keeper not to GND but to the output node of the inverter stage, as shown in Figure 31. The keeper now functions as a latch, cutting off whenever the output of the inverter is high. In this way, power dissipation is significantly reduced whenever a pull-down path to GND is formed in the NMOS logic block, since this makes the input to the inverter low and thus the output of the inverter high. When the output of the inverter is low, however, as is the case when no pull-down path to ground is formed in the NMOS logic block, the keeper PMOS turns on and maintains the high charge on the precharge node, even at reduced clock speeds or when idle.
The conditional feedback keeper consists of two NOT gates, a NAND gate and a PMOS transistor. It provides two delays, using the two NOT gates, in order to retain the voltage at the dynamic node when the pull-down network is off during the evaluation phase.
The modified feedback keeper high performance, also termed the high-speed feedback keeper, consists of two NOT gates, a CMOS inverter and a PMOS transistor. It likewise provides two delays, using the two NOT gates, in order to retain the voltage at the dynamic node when the pull-down network is off during the evaluation phase.
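The difference between the always-on keeper and the feedback keeper can be summarized with a steady-state logic sketch. It ignores the transient contention that occurs while the dynamic node is falling and does not model the conditional or high-performance variants; the function names and the `kind` labels are hypothetical.

```python
def keeper_on(kind: str, dynamic_node: int) -> bool:
    """Steady-state conduction of the keeper PMOS for a given dynamic node level."""
    if kind == "always_on":
        return True                    # gate tied to GND: always conducting
    if kind == "feedback":
        inverter_out = 1 - dynamic_node
        return inverter_out == 0       # PMOS conducts only when its gate is low
    raise ValueError(f"unknown keeper kind: {kind}")


def static_dc_path(kind: str, dynamic_node: int, pdn_conducts: bool) -> bool:
    """True if VDD is connected to GND through the keeper and the pull-down network."""
    return keeper_on(kind, dynamic_node) and pdn_conducts


if __name__ == "__main__":
    # Pull-down network conducting, dynamic node settled low.
    print(static_dc_path("always_on", 0, True))  # keeper still fights the PDN
    print(static_dc_path("feedback", 0, True))   # keeper has switched off
    # No pull-down path: the feedback keeper holds the node high.
    print(keeper_on("feedback", 1))
```

In this simplified view the feedback keeper removes the static VDD-to-GND path of the always-on keeper while still holding the dynamic node high when the pull-down network is off.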

Simulation and Implementation Results:
The simulation results are given in Tables 13-21 below.
OR8 (65nm Technology):
VI. CONCLUSIONS
This work consists of four parts. In Section II, the dynamic DCVSL footed circuit, the dynamic DCVSL footless circuit, dual-rail data-driven dynamic logic and the dual-rail footless domino gate with self-timed precharge scheme are implemented using CMOS domino logic. The proposed circuits offer improved power dissipation, speed and noise tolerance compared with the standard domino circuit. In Section III, the pseudo-footless domino circuit is proposed, and it offers better performance. In Section IV, energy-efficient domino logic is presented. Among the four techniques, the standard dual-Vt and modified dual-Vt techniques offer better performance. In Section V, the noise immunity of the benchmark domino circuits is simulated with different techniques and keeper transistors, which are the basic building blocks for high performance. The proposed circuits offer improved power dissipation and noise tolerance compared with the standard domino circuit. The results show that the DMDFBK and DMDFBKHP have lower PDP and high noise immunity. It is therefore concluded that the proposed designs provide a platform for designing high-performance, low-power and highly noise-immune digital circuits such as processors and multipliers.

Fig. 3. The delayed-reset domino AND/NAND circuit.

Fig. 5. Dual-rail footless domino AND/NAND gate with self-timed precharge scheme.
Simulation results: In this work, we implemented a dynamic DCVSL circuit, dual-rail data-driven dynamic logic and the proposed dual-rail domino footless circuit with self-timed precharge scheme. The simulation results are shown in Tables 1-3 below.
Table 1. AND/NAND gate

Fig. 6. The footed domino gate.
Footless domino circuit with delayed clocks (DR-domino):
Fig. 7 illustrates the delayed-reset domino circuit (DR-domino). The DR-domino circuit does not improve the logic construction flexibility because it still accepts only true-logic gates.

Fig. 18. Crosstalk noise model for domino gates.
Domino Noise Margin: To obtain an analytical solution for the noise margin of domino gates, consider the current model for the PDN NMOS transistor. The domino noise margin DNM_DOMINO is defined in terms of NM_inv, C_d, T, I_k-max and g_m (the expression itself is not legible in this copy).

Fig. 19. Precharging internal nodes.
A simple and effective way to prevent the charge sharing problem is to precharge the internal nodes in the pull-down network along with the dynamic node.

Fig. 27(c). Direct conducting path.
PMOS transistors can also be employed at a per-transistor level, as shown in Fig. 27. This technique is known as the inverter technique.
Inverter Gated Technique (GCMITQ):

Fig. 34. Modified Feedback Keeper High Performance.
One of the disadvantages of this kind of domino circuit is that it must be constructed with only true-logic gates. Moreover, simultaneous precharge may cause unacceptable IR-drop noise.