Reconfigurable Efficient Design of Viterbi Decoder for Wireless Communication Systems

Abstract: Viterbi decoders are employed in digital wireless communication systems to decode convolutional codes, which are forward error correction codes. These decoders are quite complex and dissipate a large amount of power. With the proliferation of battery-powered devices such as cellular phones and laptop computers, power dissipation, along with speed and area, is a major concern in VLSI design. In this paper, a low-power, high-speed Viterbi decoder has been designed. The proposed design has been developed using Matlab, synthesized using the Xilinx Synthesis Tool and implemented on a Xilinx Virtex-II Pro based XC2vpx30 FPGA device. The results show that the proposed design can operate at an estimated frequency of 62.6 MHz while consuming fewer resources on the target device.


I. INTRODUCTION
The Viterbi decoder has been recognized as an attractive solution to a variety of digital estimation problems, much as the Kalman filter has been adapted to analog estimation problems. The Viterbi algorithm is widely used in many wireless and mobile communication systems for optimal decoding of convolutional codes. Convolutional codes, which are forward error correction codes, offer a good alternative to block codes for transmission over a noisy channel. The purpose of forward error correction (FEC) is to improve the capacity of a channel by adding carefully designed redundant information to the data being transmitted through the channel [1]. The Viterbi algorithm essentially performs maximum likelihood decoding to correct the errors in the received data that are caused by channel noise. However, it reduces the computational load by taking advantage of the special structure of the code trellis. Moreover, Viterbi decoding has a fixed decoding time, which is well suited to hardware decoder implementation [2]. The requirements for the Viterbi decoder, which is a processor that implements the Viterbi algorithm, depend on the application in which it is used. This results in a very wide range of data throughput. The decoder structure is very simple for short constraint lengths, making decoding feasible at rates of up to 100 Mbit/s. The Viterbi decoder is effective in achieving noise tolerance, but the cost is an exponential growth in memory, computational resources and power consumption.

II. VITERBI DECODER AND ALGORITHM
The Viterbi algorithm is commonly used in a wide range of communication and data storage applications. It is used for decoding convolutional codes, in baseband detection for wireless systems, and for detection of recorded data in magnetic disk drives. The Viterbi detectors used in cellular telephones have low data rates (typically less than 1 Mb/s) and should have very low energy consumption. At the opposite end of the scale, very high speed Viterbi detectors are used in magnetic disk drive read channels, with throughputs over 600 Mb/s, but there power consumption is not as critical. The Viterbi maximum likelihood algorithm is one of the best techniques for communications, especially wireless communications, where energy efficiency is the most important factor.

The algorithm works on the principle of selecting the code word closest to the received word. The Viterbi decoder examines an entire sequence of received signal of a given length. The decoder computes a metric for each path and makes a decision based on this metric. The metric is the Hamming distance between the received branch word and the expected branch word [4]. For bipolar (±1) symbols, this is equivalent, up to an affine transformation, to the dot product between the received codeword and the allowable codeword. All paths are followed until two paths converge on one node. Then the path with the lower metric is kept and the one with the higher metric is discarded. The paths selected are called the survivors. For an N-bit sequence, the total number of possible received sequences is $2^N$. The Viterbi algorithm applies the maximum-likelihood principle to limit the comparison to $2^{kL}$ surviving paths instead of checking all the paths. The selection of survivors lies at the heart of the Viterbi algorithm and ensures that the algorithm terminates with the maximum likelihood path. The algorithm terminates when all of the nodes in the trellis have been labeled and their entering survivors have been determined. We then go to the last node in the trellis and trace back through the trellis. At any given node, we can only continue backward on a path that survived upon entry into that node. Since each node has only one entering survivor, the trace-back operation always yields a unique path. This path is the maximum likelihood estimate that predicts the most likely transmitted sequence. The maximum likelihood is given by:

$$P\left(Z \mid U^{(m')}\right) = \max_{\text{over all } U^{(m)}} P\left(Z \mid U^{(m)}\right)$$

where $Z$ is the received sequence, $U^{(m)}$ is one of the possible transmitted sequences, and the decoder chooses the sequence $U^{(m')}$ that attains the maximum (the possible sequence closest to the received sequence).
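
The survivor selection and trace-back steps described above can be sketched in a few lines of Python. This is a minimal hard-decision illustration under stated assumptions, not the implementation used in this paper: it uses a small rate-1/2 code with K = 3 and octal generators (7, 5), chosen only to keep the trellis small, and the names `encode` and `viterbi_decode` as well as the convention that the newest input bit sits in the most significant bit of the shift register are hypothetical.

```python
def parity(x):
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1


def encode(bits, K=3, gens=(0o7, 0o5)):
    """Rate-1/n convolutional encoder (assumed convention: newest bit in the MSB)."""
    state, out = 0, []
    for b in bits:
        reg = (b << (K - 1)) | state          # K-bit shift register contents
        out.extend(parity(reg & g) for g in gens)
        state = reg >> 1                      # drop the oldest bit
    return out


def viterbi_decode(received, K=3, gens=(0o7, 0o5)):
    """Hard-decision Viterbi decoding with Hamming-distance branch metrics."""
    n, n_states = len(gens), 1 << (K - 1)
    inf = float("inf")
    pm = [0] + [inf] * (n_states - 1)         # the encoder starts in state 0
    history = []                              # per stage: survivor (prev_state, bit)
    for t in range(0, len(received), n):
        block = received[t:t + n]
        new_pm = [inf] * n_states
        survivor = [None] * n_states
        for s in range(n_states):
            if pm[s] == inf:
                continue                      # state not reachable yet
            for b in (0, 1):
                reg = (b << (K - 1)) | s
                expected = [parity(reg & g) for g in gens]
                bm = sum(r != e for r, e in zip(block, expected))
                nxt = reg >> 1
                if pm[s] + bm < new_pm[nxt]:  # keep the path with the lower metric
                    new_pm[nxt] = pm[s] + bm
                    survivor[nxt] = (s, b)
        pm = new_pm
        history.append(survivor)
    # Trace back from the end state with the smallest path metric.
    state = min(range(n_states), key=lambda s: pm[s])
    decoded = []
    for survivor in reversed(history):
        state, b = survivor[state]
        decoded.append(b)
    return decoded[::-1]


message = [1, 0, 1, 1, 0, 0]                  # two trailing zeros flush the encoder
codeword = encode(message)
noisy = list(codeword)
noisy[3] ^= 1                                 # inject a single channel error
print(viterbi_decode(noisy) == message)
```

Because each node keeps exactly one entering survivor, the trace-back loop never has to choose between alternatives, which is the uniqueness property noted above.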

III. ARCHITECTURE OF THE VITERBI DECODER
The input to the communication system is a stream of analog, modulated signals. The primary task of the receiver is the recovery of the carrier signal and the synchronization of bit timing, so that the individual received data bits can be removed from the carrier and separated from one another efficiently. Both tasks are generally performed through the use of phase locked loops [5]. The analog baseband signal is applied to an analog-to-digital converter with a b-bit quantizer to obtain a received bit stream. The bit stream is then applied as the input to the Viterbi decoder. In order to compute the branch metrics at any given point in time, the Viterbi decoder must be able to segment the received bit stream into n-bit blocks, each block corresponding to a stage in the trellis. A trellis diagram is a time-indexed version of a state diagram, and the simplest 2-state trellis is shown in Fig. 1. Each state in the trellis corresponds to a possible pattern of recently received data bits, and each branch corresponds to the receipt of the next (noisy) input. The goal is to find the path of maximum likelihood through the trellis, because that path corresponds to the most likely pattern that the transmitter actually sent [8]. In this paper, we assume that the input to our proposed design consists of identified code symbols and frames.
The basic building blocks of the Viterbi decoder are:

A. Branch Metric Unit
The branch metric unit (BMU) takes the fuzzy bit and calculates the cost for each branch of the trellis. A simple branch metric unit may use the Hamming or Euclidean distance as the metric for calculating the cost of the branch [7]. It is based on a look-up table containing the various bit metrics. The unit looks up the n bit metrics associated with each branch and sums them to obtain the branch metric.
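
The look-up-and-sum operation of the BMU can be illustrated with a short Python sketch. It assumes hard-decision inputs and a Hamming bit metric; the table `BIT_METRIC` and the helper names `branch_metric` and `all_branch_metrics` are hypothetical names introduced for illustration and do not come from the proposed design.

```python
from itertools import product

# Bit-metric look-up table (assumed hard decision, Hamming cost):
# BIT_METRIC[received_bit][expected_bit] -> cost of that bit.
BIT_METRIC = [[0, 1],
              [1, 0]]


def branch_metric(received, expected):
    """Sum the n bit metrics of one branch to obtain its branch metric."""
    return sum(BIT_METRIC[r][e] for r, e in zip(received, expected))


def all_branch_metrics(received):
    """Branch metric for every possible n-bit branch label."""
    n = len(received)
    return {label: branch_metric(received, label)
            for label in product((0, 1), repeat=n)}


# For a rate-1/2 code there are only four distinct branch labels per stage.
print(all_branch_metrics((1, 0)))
# -> {(0, 0): 1, (0, 1): 2, (1, 0): 0, (1, 1): 1}
```

A soft-decision variant would only change the contents of the table (one row per quantization level), not the look-up-and-sum structure.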

B. Add-Compare-Select Unit
The add-compare-select unit (ACSU) is the heart of the Viterbi algorithm and calculates the state metrics. It recursively accumulates the branch metrics as the path metrics (PM), compares the incoming path metrics, selects the most likely state transition for each state of the trellis, and generates the corresponding decision bits. The branch metrics are added to the state metrics from the previous time instant, and the smaller sum is selected as the new state metric:

$$sm_{k}(j) = \min_{i}\left[\, sm_{k-1}(i) + bm_{k}(i, j) \,\right]$$

where $sm_{k-1}(i)$ is the state metric of predecessor state $i$ at the previous time instant and $bm_k$ is the Hamming distance between the received and expected sequences on the branch from state $i$ to state $j$.
For a given code with rate 1/n and total memory M, the number of ACS operations required to decode a received sequence of length L is $L \times 2^M$.
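
As a rough illustration of the add-compare-select step and of the operation count above, consider the following Python sketch. The function names `acs` and `acs_count` are hypothetical, and the sketch assumes that smaller metrics are better, as with the Hamming distance.

```python
def acs(prev_state_metrics, branch_metrics):
    """One add-compare-select operation for a single trellis state.

    prev_state_metrics[i] is the metric of the i-th predecessor state and
    branch_metrics[i] is the cost of the branch from that predecessor.
    Returns (new_state_metric, decision), where decision is the index of
    the surviving predecessor.
    """
    sums = [pm + bm for pm, bm in zip(prev_state_metrics, branch_metrics)]  # add
    decision = min(range(len(sums)), key=sums.__getitem__)                  # compare
    return sums[decision], decision                                         # select


def acs_count(L, M):
    """Number of ACS operations for a sequence of length L and total memory M."""
    return L * (1 << M)


print(acs([2, 1], [2, 1]))   # sums are [4, 2], so the second path survives -> (2, 1)
print(acs_count(20, 6))      # K = 7 gives M = 6 -> 1280
```

The exponential factor $2^M$ is why the ACSU dominates the computational cost as the constraint length grows.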

C. Survivor Memory Unit
The survivor memory unit (SMU) is responsible for keeping track of the information bits associated with the surviving paths designated by the path metric updating and storage unit. There are two basic design approaches for the SMU: register exchange and traceback. In both techniques, a shift register is associated with every trellis node throughout the decoding operation. This register has a length equal to the frame length. The register exchange method works well for small constraint lengths, whereas the traceback method works well for codes with longer constraint lengths. The traceback method stores the decisions from the ACS unit in a RAM, together with the path information in the form of an array of recursive pointers [9]. The best path is determined by reading backwards through the RAM. The general approach to traceback is to accumulate path metrics for up to five times the constraint length (5 * (K - 1)), find the node with the best accumulated metric (the smallest, when the Hamming distance is used), and begin traceback from this node [15]. The traceback unit can then output the sequence of branches used to reach that state. In practice, the survivor paths merge after some number of iterations. The trellis depth at which all the survivor paths merge with high probability is referred to as the survivor path length.
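
Reading backwards through the decision memory can be sketched as follows. This is an illustrative Python model under assumptions not taken from the paper: each state stores one decision bit per stage (the oldest register bit of the surviving predecessor), the state holds the K - 1 most recent input bits with the newest bit in the most significant position, and the function name `traceback` is hypothetical.

```python
def traceback(decisions, end_state, K):
    """Recover the input bits by reading the decision memory backwards.

    decisions[t][s] is the decision bit stored for state s at stage t.
    """
    mask = (1 << (K - 1)) - 1
    state, bits = end_state, []
    for stage in reversed(decisions):
        bits.append(state >> (K - 2))          # newest bit of the state = input bit
        d = stage[state]                       # decision bit read from the RAM
        state = ((state << 1) | d) & mask      # step to the surviving predecessor
    bits.reverse()
    return bits


# Decision memory for a K = 3 trellis (4 states, 6 stages). Only the entries
# on the surviving path matter; the remaining entries are filled with 0.
decisions = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
print(traceback(decisions, end_state=0, K=3))  # -> [1, 0, 1, 1, 0, 0]
```

Only one decision bit per state and per stage is written, and each location is written once, which is the property the low power design in the next section relies on.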

IV. PROPOSED LOW POWER DESIGN
The ACSU and SMU consume most of the power of the decoder. In this paper we focus on the survivor memory unit of the Viterbi decoder to develop a low power model. Of the two memory organization techniques in the SMU, register exchange and traceback, the traceback approach is the one used for low power applications. In the traceback approach, each register storing the survivor path information updates its content only once during the entire period of a code word. In contrast, all the registers in the register-exchange approach update their contents for each code symbol. Hence, the switching activity of the registers in a traceback approach is much lower than that of the registers in a register-exchange approach, and low power design techniques can be applied readily to the traceback module. In our work we use clock gating to develop a low power design [2]. The key point is that the content of each register does not change once it has been updated. This is very useful in our low power design, as the registers do not have to be activated after each update, which reduces the switching activity and leads to a reduction in power dissipation. Some blocks of a circuit are used only during a certain period of time. The clocks of these blocks can be disabled to eliminate unnecessary switching when the blocks are not in use. Fig. 4 shows clock gating to disable a unit. The survivor path storage block holds the information on survivor paths. When the i-th code symbol is received, the survivor path information is obtained and stored in the i-th register. At this moment all other registers hold their contents, and hence their clocks can be gated to save power, as shown in Fig. 5. The clock is gated by the information coming from a ring counter that indicates the current state.

V. HARDWARE IMPLEMENTATION
A Matlab model was first written for the convolutional encoder with constraint length K = 7, rate 1/2 and the two generator polynomials (G1, G2) = {171, 133} (octal), and for our proposed Viterbi decoder design using traceback with clock gating, in order to evaluate the performance of the proposed design. Fig. 6 shows the BER curve versus Eb/No over the AWGN channel for both the uncoded data and the coded data using convolutional coding with Viterbi decoding. The data is decoded every clock cycle but delayed by 20 clock cycles. Since the traceback module is not activated until the end of the frame, and then only for one clock cycle, this feature helps in saving power. The results in Table 2 show how the speed is enhanced and the power is reduced for the proposed design using traceback with clock gating, as compared to other conventional designs.
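
For readers who want to reproduce the encoder side of this setup outside Matlab, the following Python sketch implements a rate-1/2, K = 7 convolutional encoder with octal generators 171 and 133. It is not the authors' Matlab code: the function name `conv_encode` is hypothetical, and the tap ordering (most significant generator bit applied to the current input bit) is an assumption, since the paper does not state its convention.

```python
def conv_encode(bits, K=7, gens=(0o171, 0o133)):
    """Rate-1/n convolutional encoder; returns n output bits per input bit.

    Assumed convention: the current input bit occupies the most significant
    bit of the K-bit shift register, and each generator is masked against it.
    """
    state, out = 0, []
    for b in bits:
        reg = (b << (K - 1)) | state
        for g in gens:
            out.append(bin(reg & g).count("1") & 1)   # modulo-2 sum of the taps
        state = reg >> 1                              # shift, dropping the oldest bit
    return out


# Impulse response: a single 1 followed by K - 1 zeros reads out both generators.
print(conv_encode([1, 0, 0, 0, 0, 0, 0]))
# -> [1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1]
```

Reading every other bit of the impulse response gives 1111001 and 1011011, the binary forms of octal 171 and 133, which is a quick sanity check on the tap ordering.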

VI. CONCLUSION
Features such as flexibility, reconfigurability and shorter time to market make FPGAs suitable for a wide range of applications. In this paper, a high-speed, low-power Viterbi decoder has been designed which benefits from clock gating: blocks are switched off when not in use, which saves power. The design has been described in VHDL and implemented on a Virtex-II Pro based xc2vpx70 FPGA using ISE 10.1. The power analysis has been carried out using the Xilinx XPower analyzer tool. The overall design shows that the effective speed of operation increases by 24.8%, with a reduction in power dissipation to about 45%, as compared to the conventional shift-register design without clock gating.

Fig. 6. Matlab simulation for Viterbi decoder

The next step was the development of the VHDL code. A test bench was written for both the convolutional encoder and the Viterbi decoder using the ModelSim SE 6.4b simulator to test the functionality of the implemented decoder. The simulated results, obtained after applying an error pattern, show the efficiency of the decoder in correcting those errors.

Fig. 7. ModelSim simulation for decoder

The Xilinx ISE 10.1 tool has been used to map the design to the Xilinx Virtex-II Pro xc2vpx70 FPGA with speed grade -5. The proposed design using clock gating is then analyzed with the Xilinx XPower analyzer tool. Table 1 shows the device utilization summary.

TABLE 1: DEVICE UTILIZATION SUMMARY

TABLE 2: COMPARISON OF POWER DISSIPATION AND SPEED