

# Reduced Area Bidirectional Shift Register with Pulsed Latch Used in DA FIR Filters

# Sachinma M<sup>1</sup>, Rahul M Nair<sup>2</sup>, S Rajkumar<sup>3</sup>

<sup>1</sup>M.Tech Student, Department of ECE, NCERC, Pampady, Trissur, Kerala, India <sup>2</sup>Assistant Professor, Department of ECE, NCERC, Pampady, Trissur, Kerala, India <sup>3</sup>HOD, Department of ECE, NCERC, Pampady, Trissur, Kerala, India \*\*\*

Abstract- This paper proposes a reduced-area bidirectional shift register with pulsed latches for use in a distributed arithmetic (DA) based FIR filter. Bidirectional shift registers are storage devices capable of shifting data to the right or left depending on the mode. Here, a conventional bidirectional shift register built from master-slave flip-flops and 2-to-1 multiplexers is replaced with the proposed bidirectional shift register built from bidirectional pulsed latches. The proposed shift register uses a delayed clock pulse generator with a 2-to-4 decoder, which reduces the area and power consumption of the overall system. The proposed bidirectional shift register is implemented in a distributed arithmetic FIR filter. The distributed arithmetic technique replaces the MAC operations with a series of look-up table and addition operations. The performance of the bidirectional shift register with pulsed latches using a delayed clock pulse generator is compared with that of the bidirectional shift register using a delayed clock pulse generator with a 2-to-4 decoder. Using the proposed shift register in the DA FIR filter reduces the complexity of the filter design and makes it less time consuming.

*Key Words:* Area-efficient, bidirectional shift-register, flip-flop, pulsed clock, pulsed-latch, 2:4 decoder, FIR filter, distributed arithmetic.

# **1. INTRODUCTION**

Bidirectional shift-registers are widely used in many applications, such as digital DC-DC buck converters, digital low-dropout (LDO) regulators, decompressors, and digital delay-locked loops (DLLs). A conventional N-bit bidirectional shift-register consists of N master-slave flip-flops and N 2-to-1 multiplexers. It shifts the data right when the direction signal is '1' and left when the direction signal is '0'. The conventional bidirectional shift-register uses master-slave flip-flops, each consisting of two latches. The area and power consumption can be reduced by replacing the master-slave flip-flops with pulsed-latches, each consisting of a latch and a pulsed clock signal. However, the pulsed-latches of a bidirectional shift-register cannot share a single pulsed clock signal: all pulsed-latches would be enabled during the same clock pulse width, which causes a race condition, so the register cannot shift the data correctly to the left or right. A shift-register using pulsed-latches solves this problem by using sub shift-registers and additional temporary storage latches. However, it still cannot shift the data to the left because of the reverse-order pulsed clock signals, even if 2-to-1 multiplexers are added as in the conventional bidirectional shift-register, and it also requires a long hold time to maintain the input signal. To solve these problems, an area-efficient bidirectional shift-register using bidirectional pulsed-latches is proposed. It can shift the data to the right or left by using the proposed bidirectional pulsed-latches. The area and power consumption are reduced by replacing the master-slave flip-flops and 2-to-1 multiplexers with the proposed bidirectional pulsed-latches and non-overlapping delayed pulsed clock signals. This also reduces the hold time to a clock pulse width.

The finite impulse response (FIR) filter is an influential block in various signal processing applications. The number of multiply and accumulate (MAC) operations dominates the complexity of VLSI implementations of FIR filters. Distributed Arithmetic (DA) is an alternative technique in which the MAC operations are replaced by a series of look-up table and addition operations. FIR filters based on DA are computationally efficient because of the high degree of mechanization involved in implementing the MAC operations using DA. Many reconfigurable and non-reconfigurable FIR filter architectures can be developed using DA. A conventional FIR filter requires N MAC blocks, which consume more area and power, whereas an FIR filter implemented with DA avoids these MAC blocks. The modified DA (MDA) based FIR filter architecture contains a shift register, a LUT, and adder/shifter blocks. The 2's complement input from the shift register addresses the ROM. The size of the LUT grows with the order of the filter, which eventually increases area and degrades performance. The shift register plays an important role in the FIR filter, so replacing it with the proposed bidirectional shift register helps reduce the area and power consumption of the system.

# 2. EXISTING SYSTEM

# 2.1 Bidirectional Shift Register

Bidirectional shift registers are storage devices capable of shifting data either right or left depending on the mode selected. An n-bit shift register can be formed by connecting n flip-flops, where each flip-flop stores a single bit of data. An existing conventional N-bit bidirectional shift-register, consisting of N master-slave flip-flops and N 2-to-1 multiplexers, is shown in figure 1. It can be implemented using gate logic that transfers a data bit from one stage to the next stage to the right or to the left, depending on the level of a control line. A HIGH on the RIGHT / LEFT control input allows the data bits inside the register to be shifted to the right, and a LOW allows them to be shifted to the left.



Fig -1: Conventional N-Bit Bidirectional Shift-Register
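The behaviour of the conventional register can be illustrated with a minimal behavioural sketch in Python. It assumes ideal edge-triggered flip-flops and a single serial input feeding both ends; the function name `conventional_shift` and its arguments are illustrative and do not come from the original design.

```python
def conventional_shift(q, direction, serial_in):
    """One clock edge of an N-bit bidirectional shift register.

    q          -- list of stored bits, q[0] is the leftmost stage
    direction  -- 1 shifts right, 0 shifts left (as in the text)
    serial_in  -- bit entering at the vacated end (assumed shared input)
    """
    n = len(q)
    next_q = []
    for i in range(n):
        # Each 2-to-1 multiplexer picks the left or right neighbour.
        left = q[i - 1] if i > 0 else serial_in
        right = q[i + 1] if i < n - 1 else serial_in
        next_q.append(left if direction == 1 else right)
    # All flip-flops update together on the clock edge.
    return next_q


print(conventional_shift([1, 0, 1, 1], 1, 0))  # [0, 1, 0, 1]
print(conventional_shift([1, 0, 1, 1], 0, 0))  # [0, 1, 1, 0]
```

Because every next value is computed from the old contents before any stage updates, the model reflects the edge-triggered behaviour of master-slave flip-flops.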

Initially, all the flip-flops in the register are reset by driving their clear pins high. The serial input data is connected at both ends of the circuit. A low-power, high-speed Bidirectional Shift Register (BSR) architecture can be used for the design and implementation of hard arithmetic operations. Flip-flops (FFs) are the basic storage elements used extensively in all kinds of digital designs. In particular, digital designs nowadays often adopt intensive pipelining techniques and employ many FF-rich modules such as register files, shift registers, and first-in first-out buffers. It is also estimated that the clock system, which consists of clock distribution networks and storage elements, accounts for a large share of the total system power. FFs thus contribute a significant portion of the chip area and power consumption of the overall system design. The conventional bidirectional shift-register uses master-slave flip-flops consisting of two latches, as shown in figure 2.



Fig -2: Master-Slave Flip-Flop

Its area and power consumption can be reduced by replacing the master-slave flip-flops with pulsed-latches, each consisting of a latch and a pulsed clock signal. However, the pulsed-latches of a bidirectional shift-register cannot share a single pulsed clock signal, because all pulsed-latches would be enabled during the same clock pulse width, and this causes a race condition.
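The race condition can be illustrated with a simplified Python model. It assumes, for illustration only, that the shared pulse is wide enough for data to ripple through every transparent latch; the function name `shared_pulse_shift` is hypothetical.

```python
def shared_pulse_shift(q, serial_in):
    """Right shift attempted with all latches transparent at once.

    Assumption: the pulse stays high long enough for the input to
    propagate through the whole chain of level-sensitive latches.
    """
    q = list(q)
    value = serial_in
    for i in range(len(q)):
        # A transparent latch passes its input straight to its output,
        # so the next latch already sees the new value.
        q[i] = value
        value = q[i]
    return q


print(shared_pulse_shift([1, 0, 1, 1], 0))  # [0, 0, 0, 0]
```

Instead of moving one position, the input overwrites every stage, which is why the latches need separate, non-overlapping pulsed clock signals.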

# 2.2 Binary FIR Filter

Filters designed using a finite number of samples of the impulse response are called FIR filters. In digital signal processing, an FIR filter is a filter whose impulse response is of finite duration, because it settles to zero in finite time. The impulse response of an Nth-order discrete-time FIR filter lasts precisely N+1 samples before it settles to zero. FIR filters are the most popular kind of filter implemented in software; they can be continuous-time or discrete-time, and analog or digital. Figure 3 shows the normal FIR filter, where x(n) is the input and y(n) is the output.



Fig -3: Normal FIR Filter

A normal FIR filter consists of delay elements, multipliers, and adder blocks. Each delayed input is multiplied by the corresponding impulse response coefficient b<sub>i</sub>, and the products are summed. The analog signal is converted to digital form by sampling and applied to the filter as x(n).
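The structure of delay, multiply, and add can be sketched in Python as a direct-form filter computing y(n) as the sum of b<sub>i</sub>·x(n−i). The function name `fir_filter` and the sample coefficients are illustrative assumptions, not values from this work.

```python
def fir_filter(b, x):
    """Direct-form FIR filter: y[n] = sum over i of b[i] * x[n - i].

    b -- list of N+1 coefficients for an Nth-order filter
    x -- list of input samples; samples before x[0] are taken as zero
    """
    y = []
    for n in range(len(x)):
        acc = 0
        for i, bi in enumerate(b):
            if n - i >= 0:          # delayed sample exists
                acc += bi * x[n - i]  # one multiply-accumulate (MAC)
        y.append(acc)
    return y


# A unit impulse reproduces the coefficients, then settles to zero.
print(fir_filter([1, 2, 3], [1, 0, 0, 0, 0]))  # [1, 2, 3, 0, 0]
```

The impulse input shows the finite response described above: a 2nd-order filter produces exactly three non-zero outputs and then zero. Each output sample costs one MAC per coefficient, which is the cost that DA removes.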

# **3. PROPOSED SYSTEM**

# 3.1 Proposed Bidirectional Shift Register

The proposed bidirectional shift-register reduces the area and power consumption by replacing master-slave flip-flops and 2-to-1 multiplexers with the proposed bidirectional pulsed-latches and non-overlapping delayed pulsed clock signals, and by using sub shift-registers and extra temporary storage latches. A bidirectional shift-register whose pulsed-latches share one pulsed clock signal cannot shift the data right or left because of the race condition. The shift-register using pulsed-latches solved this problem by using sub shift-registers and additional temporary storage latches. A pulsed-latch, consisting of a latch and a pulsed clock signal, is shown in figure 4.



Fig - 4: Pulsed-Latch

The proposed design also reduces the hold time to a clock pulse width. An N-bit bidirectional shift-register can be realized by connecting N bidirectional pulsed-latches (BD-PLs) in series.

#### 3.1.1 Bidirectional Shift-Register Using BD-PLs

Figure 5 shows the proposed 256-bit bidirectional shift-register using BD-PLs. It uses sub shift-registers and additional temporary latches to reduce the number of pulsed clock signals. To make the operation easier to explain, the figure simplifies the BD-PL block by omitting the complementary signals. The register consists of a bidirectional delayed pulsed clock generator, 64 4-bit sub bidirectional shift-registers, and an extra temporary latch. The extra temporary BD-PL is added in front of the bidirectional shift-register to store the input signal (IN) for right-shifting.







Fig - 6: Waveforms of the 256-Bit Bidirectional Shift-Register when (a) Right-Shifting and (b) Left-Shifting

The 4-bit sub bidirectional shift-register requires five BD-PLs to shift data right or left by using five pulsed clock signals for right-shifting (CLK\_pulse\_R<1:4> and CLK\_pulse\_R<T>) or five pulsed clock signals for left-shifting (CLK\_pulse\_L<1:4> and CLK\_pulse\_L<T>), respectively. In the 4-bit sub bidirectional shift-register #1, four BD-PLs store 4-bit data (Q<1>-Q<4>) and shift the 4-bit data right or left. The temporary BD-PL stores Q<4> or the first BD-PL data (Q<5>) of the next sub bidirectional shift-register. Exceptionally, in the sub bidirectional shift-register #64, the temporary BD-PL stores the input signal (IN) for left-shifting.

When right-shifting, CLK\_pulse\_R<T> first updates the temporary data T<0> to the input signal (IN). Simultaneously, in the sub bidirectional shift-register #1, the temporary data T<1> is updated to Q<4>. Then, the latch data Q<4>-Q<2> are sequentially updated to their left latch data Q<3>-Q<1>. Finally, the first latch data Q<1> is updated to T<0>, which holds the input signal (IN). The other sub bidirectional shift-registers operate in the same way as the sub bidirectional shift-register #1. When left-shifting, on the other hand, the latches other than the temporary latches operate in the reverse order of right-shifting. Therefore, the proposed bidirectional shift-register can shift data to the right or to the left. Because it stores IN in the temporary latch data T<0> or T<64> at the first pulsed clock signal, it minimizes the hold time (THOLD) of the input signal (IN).
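The pulse ordering described above can be checked with a small behavioural sketch in Python. It models only the update order of the latches, not timing or transistor-level behaviour, and it assumes the register length is a multiple of the sub-register width. The names `shift_right` and `shift_left` are illustrative.

```python
def shift_right(q, data_in, sub_bits=4):
    """One right shift; q[0] models Q<1>. Returns the new contents."""
    assert len(q) % sub_bits == 0
    q = list(q)
    n_sub = len(q) // sub_bits
    # Pulse T: T<0> captures IN, T<k> captures the last latch of sub-register k.
    t = [data_in] + [q[k * sub_bits + sub_bits - 1] for k in range(n_sub)]
    # Pulses <4>..<1>: update from the last latch down to the first,
    # so each latch reads its left neighbour before that one changes.
    for pos in range(sub_bits - 1, -1, -1):
        for k in range(n_sub):
            i = k * sub_bits + pos
            q[i] = q[i - 1] if pos > 0 else t[k]
    return q


def shift_left(q, data_in, sub_bits=4):
    """One left shift; the last temporary latch captures IN."""
    assert len(q) % sub_bits == 0
    q = list(q)
    n_sub = len(q) // sub_bits
    # Pulse T: T<k> captures the first latch of the next sub-register.
    t = [q[(k + 1) * sub_bits] for k in range(n_sub - 1)] + [data_in]
    # Pulses <1>..<4>: reverse order of the right shift.
    for pos in range(sub_bits):
        for k in range(n_sub):
            i = k * sub_bits + pos
            q[i] = q[i + 1] if pos < sub_bits - 1 else t[k]
    return q


bits = [1, 0, 1, 1, 0, 0, 1, 0]
print(shift_right(bits, 1))  # [1, 1, 0, 1, 1, 0, 0, 1]
print(shift_left(bits, 0))   # [0, 1, 1, 0, 0, 1, 0, 0]
```

In this model only one latch per sub-register is updated at a time, and the temporary latches carry the boundary bit between neighbouring sub-registers, so no value is overwritten before it has been read.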

#### 3.1.2 Bidirectional Delay Pulsed Clock Generator

The BD-PL stores the left or right latch data according to CLK\_pulse\_R or CLK\_pulse\_L, respectively. The 4-bit sub bidirectional shift-register requires five BD-PLs to shift data right or left by using five pulsed clock signals for right-shifting or five pulsed clock signals for left-shifting.



Fig -7: Proposed Bidirectional Delayed Pulsed Clock Generator

The shift register plays an important role in the FIR filter, so replacing it with the proposed bidirectional shift register helps reduce the area and power consumption of the system. The proposed system uses a decoder-based clock pulse generator, which reduces the clock generation circuitry: the outputs of only two clock pulse generators are decoded into four pulsed clock signals. This also helps reduce the hold time to a clock pulse width.

#### 3.1.3 Pulsed Clock Generator with 2:4 Decoder

Like a multiplexer, a decoder is not restricted to a single address line and can therefore have more than two outputs. A 2-to-4 decoder has two input lines, which are decoded into four output lines. Only one of the four output lines is active at a time; the other three are held at logic zero. Figure 8 shows the 2-to-4 decoder. In the proposed system, instead of using a separate clock pulse circuit for each clock signal, the number of clock pulse generator circuits is halved by using a 2:4 decoder logic circuit.



Fig - 8: 2:4 Decoder Logic
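The decoder logic can be expressed as a short Python sketch. It assumes active-high outputs and no enable input; the function name `decode_2to4` is illustrative.

```python
def decode_2to4(a1, a0):
    """2-to-4 decoder: returns (y0, y1, y2, y3) with exactly one output high.

    a1, a0 -- the two input bits (a1 is the more significant bit)
    """
    index = (a1 << 1) | a0
    return tuple(1 if i == index else 0 for i in range(4))


for a1 in (0, 1):
    for a0 in (0, 1):
        print(a1, a0, decode_2to4(a1, a0))
```

Since only one output is high for any input combination, the four decoded signals never overlap, which is the property needed for the pulsed clock signals of the latches.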

Figure 9 shows the proposed bidirectional clock pulse generator with the 2:4 decoder. The clock signal is split into two paths, and a delay is applied to one of them. The two clock signals are fed to two pulsed clock generators, whose outputs drive the 2:4 decoder. Thus, from the outputs of just two clock pulse generators, the clock inputs for four latches are generated. The four signals are then distributed in parallel to all the 4-bit sub shift-registers.



Fig -9: Clock Pulse Generator with 2:4 Decoder Given to Bidirectional Shift Register

# 3.2 Modified Binary DA FIR Filter with Bidirectional Shift-Register

The Finite Impulse Response (FIR) digital filter is one of the principal blocks in Digital Signal Processing (DSP) systems. The Multiply Accumulate (MAC) operations involved in implementing FIR filters occupy a large area and consume considerable power. The number of MAC operations required per filter increases linearly with the order of the filter, so real-time implementation of higher-order filters becomes a challenging task. DA is one of the prominent architectures for FIR filters and has gained popularity in recent years for its high-throughput processing capability. The direct form and transposed form architectures of an FIR filter require N MAC blocks, which consume more area and power, whereas an FIR filter implemented with DA avoids these MAC blocks. The modified DA (MDA) based FIR filter architecture is shown in Figure 10.



Fig -10: Modified Binary DA FIR Filter

It contains a shift register, a LUT, and adder/shifter blocks. The 2's complement input from the shift register addresses the ROM. A four-input look-up table is used, with the output of the shift register as its input. Each bit of x(n) is given serially as input to the shift register, and the LSB of the shift-register output is applied to the LUT. The LUT output corresponding to the address is passed to the accumulator, where the sign handling is also performed. The accumulator output is taken as the filter output y(n). The MAC blocks in the FIR filter are replaced by multiplierless architectures to reduce the area and power consumption. In FIR filters, the filter coefficients are fixed, and the input samples are tapped from a delay line that changes every clock cycle. Here, serially arriving data can be applied directly without conversion to parallel form. Since multipliers are not used, the proposed filter is less time consuming.
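The look-up and accumulate procedure can be sketched in Python for a four-tap filter. This is a minimal software model of the general DA technique rather than the architecture of Figure 10; the coefficient values, the 4-bit word length, and the names `build_lut` and `da_output` are assumptions made for illustration.

```python
def build_lut(coeffs):
    """LUT entry `addr` holds the sum of coefficients whose address bit is 1."""
    k = len(coeffs)
    return [sum(c for j, c in enumerate(coeffs) if (addr >> j) & 1)
            for addr in range(1 << k)]


def da_output(samples, lut, bits):
    """One filter output from K samples in `bits`-bit two's complement."""
    acc = 0
    for b in range(bits):                 # LSB first, one bit slice per step
        addr = 0
        for j, x in enumerate(samples):
            addr |= ((x >> b) & 1) << j   # bit b of each sample forms the address
        term = lut[addr] << b             # shift replaces the multiplication
        # The sign bit carries negative weight in two's complement.
        acc += -term if b == bits - 1 else term
    return acc


coeffs = [1, 2, 3, 4]        # illustrative coefficients
samples = [3, -2, 5, -7]     # x(n), x(n-1), x(n-2), x(n-3)
lut = build_lut(coeffs)
print(da_output(samples, lut, bits=4))                # -14
print(sum(c * x for c, x in zip(coeffs, samples)))    # -14
```

The DA result matches the direct sum of products while using only table look-ups, shifts, and additions. The table has 2^K entries for K taps (16 here), which is why the LUT becomes the limiting cost as the filter order grows.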

# 4. PERFORMANCE AND ANALYSIS

Table 1 shows the performance comparison of the bidirectional shift register with the pulsed clock generator and the bidirectional shift register using the pulsed clock generator with 2:4 decoder.

Table -1: Compared Output

| Shift<br>Register  | With Clk<br>Pulse<br>Generator | With Clk Pulse<br>Generator Using<br>2:4 Decoder |
|--------------------|--------------------------------|--------------------------------------------------|
| Area (transistors) | 206                            | 160                                              |
| Delay              | 2.451ns                        | 2.451ns                                          |
| Power              | 169mW                          | 150mW                                            |

The area and power consumption are reduced in the shift register implemented using the clock pulse generator with 2:4 decoder, compared with the implementation using the bidirectional clock pulse generator. The number of transistors is 206 in the first design and is reduced to 160 in the second. Similarly, the power consumption is reduced from 169mW to 150mW, while the delay is unchanged at 2.451ns.
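As a quick check on the figures in Table 1, the relative savings can be computed directly from the reported values. The helper name `percent_reduction` is illustrative.

```python
def percent_reduction(before, after):
    """Relative saving of `after` with respect to `before`, in percent."""
    return 100.0 * (before - after) / before


# Values taken from Table 1.
print(f"Area:  {percent_reduction(206, 160):.1f}%")   # Area:  22.3%
print(f"Power: {percent_reduction(169, 150):.1f}%")   # Power: 11.2%
```

On these numbers, the decoder-based generator saves roughly a fifth of the transistors and about a ninth of the power.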

Table 2 shows the performance results of the DA FIR filter with the proposed bidirectional shift-register with pulsed clock using 2:4 decoder.

Table 2: Output of DA FIR Filter

| Parameter          | DA FIR Filter With Bidirectional Shift-<br>Register |
|--------------------|-----------------------------------------------------|
| Area (transistors) | 223                                                 |
| Delay              | 4.492ns                                             |
| Power              | 168mW                                               |

The number of transistors used in the filter design is 223, and the power consumption is 168mW. The results show that implementing the FIR filter with the DA method, using the bidirectional shift register with delayed clock pulse and 2:4 decoder, can reduce the area and power consumption in DSP applications.

The output waveforms obtained when the design was run in the ModelSim software are shown below. The VHDL code was simulated in ModelSim to obtain accurate waveforms.



Fig -11: Output waveform of bidirectional shift register with pulsed clk generator



Fig -12: Output waveform of bidirectional shift register using clk pulse generator with 2:4 decoder



**Fig -13:** Output waveform of DA FIR Filter with bidirectional shift register with delayed clk pulse using 2:4 decoder

# **5. CONCLUSION**

An area-efficient bidirectional shift-register using bidirectional pulsed-latches is proposed. It reduces area and power consumption by replacing master-slave flip-flops and 2-to-1 multiplexers with the proposed bidirectional pulsed-latches and delayed pulsed clock signals, and by using sub shift-registers and extra temporary storage latches. The proposed bidirectional shift-register can minimize the hold time of the input signal. A small number of pulsed clock signals is used by grouping the latches into several sub shift-registers and using additional temporary storage latches. The pulsed clock signals are generated by the proposed bidirectional delayed pulsed clock generator. The design removes the input and output buffers (inverters) of the 2-to-1 multiplexer to minimize the number of transistors, because the output inverter of the master-slave flip-flop can work as the input buffer. The clock signals are driven by global clock buffers instead of internal clock buffers to reduce the area.
