Performance Realization of CORDIC based GMSK System with FPGA Prototyping

Gaussian Minimum Shift Keying (GMSK) is a digital modulation scheme based on frequency shift keying with no phase discontinuities, and it provides higher spectral efficiency in radio communication systems. In this article, a cost-effective hardware architecture of the GMSK system is designed using pipelined CORDIC and optimized CORDIC models. In the transmitter section, the GMSK system consists mainly of an NRZ encoder, an integrator, and a Gaussian filter, followed by an FM modulator that uses the CORDIC models for IQ generation and a Digital Frequency Synthesizer (DFS) for IQ modulation. After the channel, the receiver section has an FM demodulator followed by a differentiator and an NRZ decoder. The CORDIC algorithms play a crucial role in GMSK systems for IQ generation and improve the system performance on a single chip. Both the pipelined CORDIC and optimized CORDIC models are designed with six stages. The optimized CORDIC model is designed using the quadrant mapping method along with a pipeline structure. The GMSK systems are implemented and prototyped on an Artix-7 FPGA. The performance analysis is presented in terms of hardware constraints such as area, time, and power. The results show that the optimized CORDIC based GMSK system is a better option than the pipelined CORDIC based GMSK system for real-time scenarios.
Keywords: GMSK; CORDIC algorithm; FPGA; DFS; Gaussian filter; pipelined; integrator; differentiator; channel


I. INTRODUCTION
In the current era of high-speed communication, a prime objective is to achieve modulation whose power spectrum has a constant amplitude and adequate bandwidth. Two efficient techniques are Minimum Shift Keying (MSK) and Gaussian Minimum Shift Keying (GMSK). Both are derived from the Continuous-Phase FSK (CPFSK) family of modulation and operate with a constant envelope. Because the modulated signal has constant amplitude, power consumption can be minimized by using a Class C RF amplifier, which is important for battery-operated units. In MSK, the modulating signal on the I and Q components is passed through a half-sine pulse shaper before modulation. GMSK uses the same technique as MSK; the only difference is that the input bits are shaped by a Gaussian bell curve instead of a half-sine pulse. Such shapers are realized using various digital or analog circuits or PCM LUTs [1][2][3][4][5][6][7].
Present wireless digital communication systems use different modulation techniques to achieve high performance with efficient use of the available spectrum. MSK provides constant-envelope signals for power amplification, which reduces the problems caused by non-linear distortion, and it encodes each bit as a half sinusoid. However, the MSK spectrum is not compact enough to support data rates approaching the RF channel bandwidth. GMSK overcomes this problem of MSK by limiting the spectral bandwidth with Gaussian filtering, and it is used in many wireless applications [1][2].
GMSK is widely used in mobile communications and is the modulation scheme of the GSM standard; most GSM mobiles have a long battery life because of the efficiency of their RF power amplifiers. Software-defined radio is a simplified approach that is adopted in most cellular standards. American and European countries use the GMSK modulation scheme in cellular phones as per the GSM standard. GMSK is a non-linear modulation that is easy to implement in hardware [3][4][5]. As the radio signal frequency increases in mobile communication, conventional technologies such as DSPs, microcontrollers, and microprocessors become difficult to use in real-time conditions because of the computational complexity. A GMSK transmitter and receiver has been designed with timing synchronization using the Mueller and Muller method. Timing synchronization is necessary for real-time use of GMSK systems, but the GMSK receiver consumes a larger chip area and affects the overall system performance [6]. GMSK modulation for EDGE mobile communication has been established with a new digital technique of linear approximation based on inter-symbol interference (ISI) and partial response signaling. GSM/EDGE systems support narrow bandwidth, reasonable power spectrum usage, less impulse response overshoot, higher immunity to noise interference, and a constant filter output pulse with a modulation index of 0.5 [7]. The GMSK system is also used in Automatic Identification Systems (AIS) [8], OFDM communication systems [9], digital image processing [10], and other applications.
Section II discusses the existing approaches to the GMSK system and its applications, reviews recent CORDIC algorithms, and identifies the problem and research gaps. Section III describes the CORDIC algorithm principles along with the pipelined and optimized CORDIC hardware architectures. Section IV elaborates the GMSK system, covering the GMSK principles and the modulation and demodulation architectures. Section V presents the results and performance analysis of the pipelined CORDIC and optimized CORDIC based GMSK systems. Section VI concludes the overall work on GMSK systems and the improvements in the hardware constraints.
www.ijacsa.thesai.org

II. EXISTING WORKS
This section reviews existing approaches to GMSK systems and their applications, together with a few existing CORDIC based approaches. Munir et al. [11] present a CubeSat AIS receiver module using GMSK modulation. The AIS signals received from the transmitter are at low power, which makes it challenging to analyze the operating frequency, so GMSK modulation is deployed to analyze the signals under appropriate conditions. Poletaev et al. [12] present phase distortion variation estimation of GMSK modulated signals at very low frequency (VLF). The phase distortion detector is designed using VCOs for IQ modulation and an arctangent module for IQ generation, in order to estimate the phase drift of VLF signals. Rakesh et al. [13] present software-based approaches for BER analysis of a GMSK system under AWGN channel conditions. The Simulink tool is used to model the GMSK system, which includes an encoder and a GMSK modulator followed by an AWGN channel and a GMSK demodulator along with a Viterbi decoder. George et al. [14] generate GMSK signals for mobile radio telephony using Simulink modeling. The performance of the GMSK model is analyzed for different fading channel conditions with hard decision and soft decision decoding; soft decision decoding gives a better BER than hard decision decoding. Ghnimi et al. [15] present GMSK modulation under radio mobile propagation conditions in the Matlab environment and analyze its performance for AWGN along with one-path and four-path fading channels.
Nitin et al. [16] design an FPGA based GMSK modulator for the GSM system, which includes a differential encoder-decoder, ROM based sine and cosine wave (IQ) generation using the phase trajectory, and phase concatenation followed by a phase accumulation module. Gupta et al. [17] present a GMSK transceiver on the FPGA platform for VHF waveform generation. The transmitter design includes a Gaussian filter followed by an FM modulator with a direct digital synthesizer (DDS) up-converter, built using the Xilinx System Generator tool. Similarly, the GMSK receiver uses a DDS followed by a cascaded integrator-comb (CIC) decimator filter and a non-coherent Viterbi decoder. Jhaidri et al. [18] evaluate the GMSK carrier phase recovery loop for non-linear analysis; the system includes a precoder and a GMSK modulator followed by an AWGN channel, along with coherent demodulation and carrier phase recovery in the Matlab simulation environment. Kumar et al. [19] design an FPGA based GMSK demodulator using a CORDIC engine. The demodulator includes IQ generation using a 1024 x 16 ROM module followed by a 6-stage CORDIC module along with a differentiator and a decision synchronizer module. The design consumes a large amount of chip area and power.
Bingyi et al. [20] design a floating-point arithmetic processor based on the CORDIC algorithm, which includes mantissa bit generation by a pre-processing unit, followed by a reconfigurable CORDIC rotation unit and exponent bit normalization using a post-processing model. The CORDIC rotation unit is a unified structure for multiplication, division, and square roots. Hoang et al. [21] present a 32-bit floating-point FFT twiddle factor calculation unit using an adaptive CORDIC module. The CORDIC module generates the sine and cosine values that are used for the FFT twiddle factor calculation in each iteration. Madi et al. [22] present a hardware implementation of sine and cosine generation using a CORDIC module. The phase inputs are generated from a ROM table, and CORDIC 5.0, adopted from the Xilinx IP core, is modeled using Xilinx System Generator. Chagela et al. [23] design a Mousetrap radix-2 CORDIC module, which uses an asynchronous pipelined architecture to improve the throughput and the power-delay product.
Research Gap: The review shows that most existing works on GMSK systems are software based, and only a few are hardware based. The recent approaches to GMSK systems are not yet optimized and face computational complexity, performance degradation, and high resource utilization in hardware. Existing GMSK modulators use a voltage controlled oscillator (VCO) or local oscillators for IQ modulation, or ROM based LUTs or current waveform generation approaches for IQ generation, which affect the overall chip area and system performance. A few approaches use CORDIC based designs for IQ generation, but they face problems with angle rotation convergence, need more iteration stages to reach the sine and cosine values, are not pipelined, and use LUT based memory for updating the tangent values. These research gaps are addressed by the proposed CORDIC based GMSK system in the following sections.

III. CORDIC DESIGNS
The CORDIC algorithm computes trigonometric and hyperbolic functions, and it is also known as Volder's algorithm. CORDIC typically converges at one digit per iteration. This section explains the CORDIC algorithm principle and the hardware architectures of the pipelined and optimized CORDIC models in detail.

A. CORDIC Algorithm Principles
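The system receives a vector $(x, y)$ and rotates it by an angle $\theta$. The rotated vector $(x', y')$ is
$$x' = x\cos\theta - y\sin\theta, \qquad y' = x\sin\theta + y\cos\theta \tag{1}$$
The sine and cosine of an angle are calculated in rotation mode. Factoring out $\cos\theta$ simplifies equation (1) to
$$x' = \cos\theta\,(x - y\tan\theta), \qquad y' = \cos\theta\,(y + x\tan\theta) \tag{2}$$
A general rotation requires four multiplications along with an addition and a subtraction. To avoid the multiplications, the rotation is split into small elementary angles $\theta_i$ with $\tan\theta_i = 2^{-i}$, so that each multiplication by $\tan\theta_i$ becomes a right shift by $i$ bits:
$$x_{i+1} = K_i\,(x_i - S_i\,y_i\,2^{-i}), \qquad y_{i+1} = K_i\,(y_i + S_i\,x_i\,2^{-i}) \tag{3}$$
where $K_i = \cos\theta_i = 1/\sqrt{1 + 2^{-2i}}$ is a constant scale factor. The rotated angle is accumulated by
$$z_{i+1} = z_i - S_i \arctan(2^{-i}) \tag{4}$$
starting from the initial value $z_0$. The sign $S_i$ is $+1$ for the counterclockwise direction and $-1$ for the clockwise direction.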
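The rotation-mode iteration can be illustrated with a short floating-point sketch. This is not the hardware design of this paper; the function name `cordic_rotate` and the choice of 16 iterations are assumptions made only for illustration, and the scale factor is applied once through the initial value of x instead of in every stage.

```python
import math

def cordic_rotate(theta, n_iter=16):
    """Rotation-mode CORDIC sketch: returns (cos(theta), sin(theta)).

    Valid for |theta| up to about 1.74 rad, the sum of all elementary angles.
    """
    # Elementary angles arctan(2^-i), one per iteration.
    angles = [math.atan(2.0 ** -i) for i in range(n_iter)]
    # Aggregate gain of the shift-add rotations: product of sqrt(1 + 2^-2i).
    gain = 1.0
    for i in range(n_iter):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    # Pre-scale x so the final vector has unit length.
    x, y, z = 1.0 / gain, 0.0, theta
    for i in range(n_iter):
        s = 1.0 if z >= 0 else -1.0          # rotate toward z = 0
        x, y = x - s * y * 2.0 ** -i, y + s * x * 2.0 ** -i
        z -= s * angles[i]                   # accumulate the rotated angle
    return x, y
```

Each pass uses only a sign decision, two shift-scaled additions, and one table lookup, which is what makes the method attractive for FPGA logic.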

B. Pipelined CORDIC Model
The CORDIC model is an iterative structure, so it takes a long time to process an algorithm over the required number of iterations. Pipeline registers are therefore placed between the iteration stages to speed up the computation, and the sine and cosine values are computed in sequence by the pipelined structure. The hardware architecture of the pipelined CORDIC model is represented in figure 1. For the rotation mode, the initial values are set to $x_0 = 1$, $y_0 = 0$, and $z_0 = \theta$. The pipelined structure has a 6-stage iterative process ($i = 0$ to $5$). The 8-bit CORDIC model has pipeline registers, shifters, and adder/subtractors, along with constant tangent values selected by a counter. The accuracy of the computation increases with the number of iterative stages. The values are shifted by $i$ bits, where $i$ is an integer from 0 to 5, so the shifter performs the division of $X_i$ and $Y_i$ by $2^i$ in each stage. New vector values are generated at the intermediate stages, and the given vector is rotated iteratively to reach the desired angle $Z_i$. $Z_i$ decides the sign for the addition and subtraction. The sign $S_i$ is selected by
$$S_i = \begin{cases} +1, & Z_i \ge 0 \\ -1, & Z_i < 0 \end{cases}$$
If the initial values are set to $x_0 = 1$ and $y_0 = 0$, the results are fractional sine and cosine values, which are difficult to realize on an FPGA. The $X_i$ and $Y_i$ values are therefore scaled to avoid fractional values in hardware. To improve the accuracy, the constant $K = 0.611$ is multiplied by $2^6$, so the initial values are set to $X_i = 38 + x_0$ and $Y_i = 0 + y_0$ for six iterations. The adder/subtractor outputs are fed back to the pipeline registers to obtain the sine and cosine outputs of each CORDIC stage.
The tangent values are generated based on the following equation:
$$ZT_i = \arctan(2^{-i}), \qquad i = 0, 1, \ldots, 5$$
The tangent values ($ZT_i$) are assigned to successive iterations using the counter to generate the $Z_i$ values. The counter counts up to 5 and then resets to 0.
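A minimal integer model of the six-stage shift-and-add datapath is sketched below. It starts from X0 = 38 and Y0 = 0 as in the text, but the phase scaling (64 units per 90 degrees) and the resulting table `ATAN_TABLE` are assumptions introduced here, as is the function name `cordic6`; the actual constants of the design are not listed in the text.

```python
STAGES = 6
X0 = 38  # scaled initial value taken from the text

# Assumed angle constants: arctan(2^-i) rounded to units of 360/256 degrees,
# i.e. 45.00, 26.57, 14.04, 7.13, 3.58 and 1.79 degrees.
ATAN_TABLE = [32, 19, 10, 5, 3, 1]

def cordic6(phase):
    """Six shift-and-add stages for a first-quadrant phase in 0..63."""
    x, y, z = X0, 0, phase
    for i in range(STAGES):
        if z >= 0:   # remaining angle is positive: rotate counterclockwise
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
        else:        # remaining angle is negative: rotate clockwise
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]
    return x, y      # approximately (A*cos, A*sin) with A near 62.6
```

With these assumed constants, `cordic6(0)` returns `(62, 0)` and `cordic6(32)` returns `(43, 46)`, against ideal values of about 62.6 and 44.2; the residual error reflects the six-stage resolution and the integer truncation of the shifts.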

C. Optimized CORDIC Model
An improved version of the pipelined CORDIC model is designed in this section and is called the optimized CORDIC model. The significant differences between the pipelined CORDIC above and the optimized CORDIC are quadrant mapping for proper angle rotation using preprocessing and post-processing, separate pipelined structures for the sine and cosine calculations, and a delay module that synchronizes the quadrant mapping with the pipeline structure. The optimized CORDIC model is represented in figure 2, and the pipelined structure that forms part of it is represented in figure 3.
The optimized CORDIC overcomes the rotation angle range problem, reduces the computational complexity, and avoids the ROM table for the tangent calculation, which saves chip area. This makes it a strong candidate for sine and cosine (IQ) signal generation, improving the speed and accuracy of GMSK systems and their applications.
The optimized CORDIC model has three stages, namely preprocessing, the pipelined structure, and post-processing. The preprocessing stage receives the 8-bit phase input and uses its two MSBs [7:6] for quadrant mapping, transforming the phase to the first quadrant. The same two bits are passed to the delay unit. If the phase bits [7:6] are "00", the 8-bit phase input (PI) is assigned to the phase register unchanged, since it is already in the first quadrant. Similarly, if the bits are "01", PI - 90° is assigned; for "10", PI - 180°; and for "11", PI - 270°, so that the phase register always holds a first-quadrant angle. This 8-bit phase register is the input to the pipelined CORDIC structure. The number of pipeline stages determines the accuracy of the angles, and the pipelined structure provides high-speed calculation. Before the iterations start, the pipelined structure is initialized with X0 = 38 (decimal), Y0 = 0, and Z0 = the 8-bit phase register value as the angle input. The pipelined structure has six stages (n = 6), as shown in figure 3. The sign bit Zi[7] decides the sign (S) for the corresponding iteration stage. The constant tangent values (a0 to a5) are assigned directly while performing the iterations. The adder/subtractor of each level completes the corresponding iteration stage. Pipeline registers (not shown in figure 3) are placed after each addition/subtraction operation except in the final iteration stage. At the sixth stage, the sine and cosine values are generated.
The delay register receives the two MSBs of the phase input and delivers the delayed quadrant bits in parallel with the pipelined structure outputs, which keeps the two paths synchronized. The post-processing unit maps the first-quadrant pipelined outputs back to the original quadrant of the phase input. Using the two quadrant bits, it generates the final cosine and sine values from the sixth-stage pipelined outputs, as shown in table 1.
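The preprocessing and post-processing steps can be sketched as follows. Table 1 is not reproduced in the text, so the sign and swap rules below are an assumption derived from the standard identities cos(θ + 90°) = -sin θ and sin(θ + 90°) = cos θ. The helper `first_quadrant_core` is a hypothetical stand-in for the six-stage pipeline (it uses ideal values with an assumed amplitude of 62), and `optimized_cordic` is likewise an illustrative name.

```python
import math

def first_quadrant_core(phase6, scale=62):
    """Hypothetical stand-in for the pipeline: ideal scaled (cos, sin)."""
    angle = phase6 * (math.pi / 2) / 64      # 64 phase units = 90 degrees
    return round(scale * math.cos(angle)), round(scale * math.sin(angle))

def optimized_cordic(phase8, core=first_quadrant_core):
    """Quadrant mapping around a first-quadrant CORDIC core."""
    quadrant = (phase8 >> 6) & 0b11          # preprocessing: phase bits [7:6]
    reduced = phase8 & 0x3F                  # PI minus quadrant * 90 degrees
    x, y = core(reduced)                     # first-quadrant cos and sin
    # Post-processing (assumed mapping): restore the original quadrant.
    if quadrant == 0b00:
        return x, y
    if quadrant == 0b01:
        return -y, x
    if quadrant == 0b10:
        return -x, -y
    return y, -x
```

In hardware the quadrant bits must be delayed by the pipeline depth before they reach the post-processing unit; this sequential model needs no such delay because each call completes before the next begins.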

IV. PROPOSED GMSK SYSTEM
This section describes the proposed GMSK system methodology using the pipelined CORDIC and optimized CORDIC models; the design flow is represented in figure 4. The main aim of the proposed work is to design a cost-effective hardware architecture of the GMSK system using pipelined CORDIC and optimized CORDIC models on a single chip. The CORDIC models are incorporated for IQ generation, which speeds up the GMSK system in real-time scenarios. The GMSK system consists mainly of a Non-Return-to-Zero (NRZ) encoder-decoder, an integrator-differentiator, a Gaussian filter, an FM modulator, and an FM demodulator along with a channel. The GMSK design takes 1-bit input data fed serially, processes the GMSK operation, and generates a 1-bit output that is similar to the GMSK input data.

A. Algorithm Principle
The mathematical representation of the GMSK modulated signal is explained in this section.
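2) The integrator is a single-pole infinite impulse response (IIR) filter with a unity feedback coefficient, and in general it is represented as
$$y[n] = x[n] + y[n-1]$$
where $n = 0, 1, 2, \ldots$ is the sample index and the delayed term $y[n-1]$ is held in a delay element of the integrator.
3) Gaussian filtering minimizes the group delay, and the Gaussian function is used as the impulse response of the Gaussian filter. The Gaussian filter is designed using the FIR method and is expressed as
$$i(t) = \sum_{k=0}^{M-1} h(k)\,x(t-k) \tag{9}$$
where $i(t)$ is the filtered pulse, $x$ is the filter input, and $M = 8$ is the number of filter taps. The Gaussian filter is specified by its BT product, and the impulse response $h(t)$ is
$$h(t) = \sqrt{\frac{2\pi}{\ln 2}}\,B\,\exp\!\left(-\frac{2\pi^2 B^2 t^2}{\ln 2}\right)$$
4) The phase of the modulated signal is
$$\varphi(t) = 2\pi f_m \int_{-\infty}^{t} i(\tau)\,d\tau$$
where the phase has been normalized to 0 at $t = -\infty$. The instantaneous frequency of the modulated signal will be
$$f(t) = f_c + \frac{1}{2\pi}\frac{d\varphi(t)}{dt} = f_c + f_m\,i(t)$$
where $f_c$ is the carrier frequency and $f_m$ is the peak frequency deviation.
5) The peak frequency deviation is determined by the bit rate. GMSK requires a phase change of $\pi/2$ over a one-bit interval of period $T_b$, which gives
$$f_m = \frac{1}{4T_b}$$
6) The modulated carrier is generated using the DFS method, and it can produce
$$s(t) = \cos\big(2\pi f_c t + \varphi(t)\big)$$
7) Maintaining adequate sampling rates with digital methods is demanding at higher operating frequencies, so a quadrature implementation approach is used to generate the modulated signal. The GMSK modulated signal is
$$s(t) = I(t)\cos(2\pi f_c t) - Q(t)\sin(2\pi f_c t)$$
where $I(t) = \cos\varphi(t)$ and $Q(t) = \sin\varphi(t)$.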
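The quadrature principle can be checked with a small floating-point baseband model. This is a sketch under stated assumptions, not the fixed-point design of this paper: the function name `gmsk_baseband`, the BT product of 0.3, the 8 samples per bit, and the 3-bit filter span are illustrative choices, and the Gaussian shaping is applied before the phase integration. The phase step is chosen so that one unfiltered bit advances the phase by π/2, matching the modulation index of 0.5 mentioned in the introduction.

```python
import math

def gmsk_baseband(bits, samples_per_bit=8, bt=0.3, span_bits=3):
    """Return (I, Q, phase) sample lists for a bit sequence."""
    # NRZ mapping: bit 1 -> +1, bit 0 -> -1, held for one bit period.
    nrz = [1.0 if b else -1.0 for b in bits for _ in range(samples_per_bit)]
    # Sampled Gaussian impulse response, time measured in bit periods.
    half = span_bits * samples_per_bit // 2
    taps = [math.exp(-2.0 * (math.pi * bt * k / samples_per_bit) ** 2 / math.log(2))
            for k in range(-half, half + 1)]
    total = sum(taps)
    taps = [h / total for h in taps]          # unity DC gain
    # Causal FIR filtering with zero initial state.
    shaped = []
    for n in range(len(nrz)):
        acc = 0.0
        for k, h in enumerate(taps):
            if 0 <= n - k < len(nrz):
                acc += h * nrz[n - k]
        shaped.append(acc)
    # Integrate frequency into phase: pi/2 per bit for a full-scale input.
    step = (math.pi / 2) / samples_per_bit
    phase, i_out, q_out, phases = 0.0, [], [], []
    for v in shaped:
        phase += step * v
        phases.append(phase)
        i_out.append(math.cos(phase))         # in-phase component
        q_out.append(math.sin(phase))         # quadrature component
    return i_out, q_out, phases
```

Because I and Q are the cosine and sine of the same phase, the envelope I² + Q² stays at 1 for every sample, which is the constant-envelope property that motivates GMSK.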

B. Hardware Implementation
The GMSK modulator consists mainly of an NRZ encoder, an integrator, and a Gaussian filter, followed by FM modulation that uses the CORDIC models for IQ generation and the DFS for IQ modulation. The hardware architecture of the GMSK modulator is represented in figure 5. The GMSK modulator processes the 1-bit GMSK input serially and generates a 12-bit GMSK modulated output. The NRZ encoder receives the 1-bit GMSK input and generates a 1-bit output according to the NRZ logic. It mainly contains an XOR module and a data flip-flop (D-FF). The XOR module receives the GMSK input along with the D-FF feedback output. The NRZ encoder is mainly used for slow-speed transmission and interfaces between synchronous and asynchronous processes. In NRZ, logic '1' means the bit is set to a high value and logic '0' means the bit is set to a low value. The NRZ encoded output is inverted and fed to the integrator, which is mainly used for high data rate conversion as a flexible multiplier-less filter. The integrator used in the design is a single-pole infinite impulse response (IIR) filter with unity feedback. It uses four data registers, and the NRZ encoded data is the input to the first register. The first and fourth register outputs are added to generate the 1-bit integrated output.
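The XOR and D flip-flop structure of the encoder can be modeled in a few lines. The sketch assumes that the flip-flop stores the previous encoder output and resets to 0; the inversion stage and the integrator are left out, and the function names are illustrative. A matching XOR-and-delay decoder is included only to confirm that the encoding is reversible.

```python
def nrz_encode(bits, reset=0):
    """XOR of each input bit with the fed-back D flip-flop output."""
    dff = reset                 # assumed reset state of the flip-flop
    out = []
    for b in bits:
        dff = b ^ dff           # the XOR output is registered in the D-FF
        out.append(dff)
    return out

def nrz_decode(encoded, reset=0):
    """XOR of each received bit with its one-sample delayed copy."""
    prev = reset
    out = []
    for e in encoded:
        out.append(e ^ prev)
        prev = e                # delay unit holds the previous sample
    return out
```

For example, the input `[1, 0, 1, 1, 0, 0, 1]` encodes to `[1, 1, 0, 1, 1, 1, 0]`, and decoding that sequence returns the original bits.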
The Gaussian filter is used to minimize the group delay in the GMSK process, and the Gaussian function is the impulse response of the filter. A low-power FIR filter is used to design the Gaussian filter. The FIR design is an 8-tap filter: the eight Gaussian filter coefficients are multiplied with the integrated output to generate eight temporary output values, which are stored in eight registers in parallel. A chain of adders sums the corresponding register outputs, and the output of the seventh adder is the Gaussian filter output.
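One arrangement that is consistent with this description is a transposed-form FIR filter, in which the input is multiplied by all eight coefficients at once and a chain of seven adders with registers accumulates the delayed products. The sketch below assumes that arrangement; the function name `fir_transposed` and the integer coefficients in the usage note are hypothetical and are not the coefficients of the design.

```python
def fir_transposed(samples, taps):
    """Transposed-form FIR: parallel products, then an adder/register chain."""
    regs = [0] * (len(taps) - 1)             # registers between the adders
    out = []
    for x in samples:
        products = [h * x for h in taps]     # one product per coefficient
        out.append(products[0] + regs[0])    # last adder in the chain
        for k in range(len(regs) - 1):       # remaining adders, old register values
            regs[k] = products[k + 1] + regs[k + 1]
        regs[-1] = products[-1]              # first register loads the last product
    return out
```

With the hypothetical symmetric taps `[1, 2, 3, 4, 4, 3, 2, 1]`, a unit impulse at the input reproduces the taps at the output, and a constant input of 1 settles to their sum, 20, after eight samples.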
The FM modulation consists mainly of the pipelined CORDIC or optimized CORDIC model for IQ generation, two Digital Frequency Synthesizer (DFS) modules for IQ modulation, a control unit that generates I and Q separately, and an adder unit.
The pipelined CORDIC and optimized CORDIC models are explained in section III. They receive the phase (angle) input and generate the in-phase (cos1) and quadrature-phase (sine1) waveforms. The DFS is used to convert the phase value to a sine wave. The DFS module mainly contains a phase accumulator, which generates the slope value, followed by a complementor, which generates the triangle wave values. The multiplexer tree generates the half-wave (one-sided) values, and finally the format converter generates the sine waveform. The DFS outputs (cos2 and sine2) are the inputs to the control unit. Based on the integrator output values, the IQ output values are generated. The adder unit adds the I and Q values to generate the final GMSK modulated output.
The channel generates noise using random sequences, which are produced by an LFSR module. For rapid implementation in hardware, a Galois field LFSR is used. The Galois LFSR is designed with the 5th-order generator polynomial $G(x) = x^5 + x^2 + 1$. The Galois LFSR hardware design uses 5 adders, 6 multipliers, and 6 data flip-flops. Two LFSR modules with the same polynomial are used, and their outputs are added to form the random sequence. These random sequences are used in place of standard AWGN generation in real-time operation. The GMSK modulated output is XORed with the channel output to produce the corrupted output values, which are the input to the GMSK demodulator.
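A single Galois LFSR of order 5 can be sketched as below. The right-shifting form, the seed, and the tap mask `0b10010` (chosen here to represent the x^5 and x^2 terms) are assumptions for illustration; the register arrangement of the actual design is not given in the text, and the summation of two LFSR outputs is not modeled.

```python
from itertools import islice

def galois_lfsr(seed=0b00001, mask=0b10010):
    """Yield successive 5-bit states of a right-shifting Galois LFSR."""
    state = seed                 # any non-zero 5-bit seed
    while True:
        yield state
        lsb = state & 1          # bit shifted out this cycle
        state >>= 1
        if lsb:
            state ^= mask        # apply the feedback taps

# Collect one full cycle plus the first repeated state.
states = list(islice(galois_lfsr(), 32))
```

Starting from the seed 1, the register visits all 31 non-zero states before returning to the seed, which is the maximal-length behavior expected from a primitive polynomial of degree 5.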
The GMSK demodulator is used to recover the GMSK input data. It consists mainly of an FM demodulator, which uses the pipelined or optimized CORDIC model for IQ generation and the DFS for IQ modulation, along with a differentiator and an NRZ decoder. The hardware architecture of the GMSK demodulator is represented in figure 6.
The FM demodulator receives the corrupted data (demod_in) as an input to the control unit. The pipelined or optimized CORDIC module generates the cosine (cos1) waveform based on the phase input values. The CORDIC output is the input to the DFS process, which generates the arbitrary waveform cos2. The delay unit synchronizes the DFS modulated output with the demodulator input (demod_in). The control unit compares the delayed DFS output with the demodulator input and generates the 1-bit FM demodulated output. The differentiator processes the FM demodulated output using four registers and a subtractor. The fourth register output is subtracted from the first register output to generate the differentiator output, which is the input to the NRZ decoder. The NRZ decoder decodes the differentiator output using a delay unit and an XOR module. The differentiator output and the delayed differentiator output are the inputs to the XOR module. The XOR output is inverted to generate the 1-bit GMSK demodulated output, which is almost identical to the original GMSK input data sequence.

V. RESULTS AND ANALYSIS
The results of the pipelined CORDIC and optimized CORDIC based GMSK systems are analyzed in this section. The two designs are modeled on the Xilinx ISE 14.7 platform using Verilog HDL, simulated in ModelSim 6.5f, and finally prototyped on an Artix-7 FPGA.
The resource utilization of the pipelined CORDIC and optimized CORDIC based GMSK systems is tabulated in table 2, and the graphical representation is in figure 7. The results are compared in terms of area, time, and power. The optimized CORDIC based GMSK system utilizes only 145 slice registers, operates at 259.477 MHz, and consumes 0.09 W of total power on the Artix-7 FPGA.
Compared with the pipelined CORDIC based GMSK system, the optimized CORDIC based GMSK system reduces the area by about 41.53% in slice registers, 44.75% in slice LUTs, and 38.13% in LUT-FF pairs. It also improves the operating frequency by about 38.8% and reduces the total power utilization by 2.17%.
The summary of the hardware synthesis results shows that the optimized CORDIC based GMSK system is better than the pipelined CORDIC based GMSK system. The optimized CORDIC model uses quadrant mapping along with the pipeline structure, and the constant tangent values are assigned directly in the iteration stages. In the pipelined CORDIC model, by contrast, the constant tangent values are assigned using the counter method.
The hardware resource comparison of the pipelined CORDIC and optimized CORDIC models is tabulated in table 3, and the graphical representation is in figure 8. The pipelined CORDIC model utilizes 146 slice registers, 331 slice LUTs, and 124 LUT-FF pairs on the FPGA, which is considerably higher than the optimized CORDIC model. The pipelined CORDIC operates at a maximum frequency of 157.588 MHz with a minimum period of 6.346 ns on the Artix-7 FPGA. Compared with the pipelined CORDIC model, the optimized CORDIC model reduces the area by about 32.87% in slice registers, 65.55% in slice LUTs, and 34.67% in LUT-FF pairs, and improves the operating frequency by about 69%.
In terms of speed, the optimized CORDIC based GMSK system improves the latency by about 12.5% and the throughput by about 72.89% compared with the pipelined CORDIC based GMSK system.

VI. CONCLUSION AND FUTURE WORK
The proposed GMSK system is designed using the pipelined CORDIC and optimized CORDIC models individually. The CORDIC model generates the IQ waveforms with low latency, which improves the speed of the GMSK system in real-time scenarios. The pipelined CORDIC model is designed using the shift-and-add method with pipeline registers, and its tangent values are selected by the counter method. The optimized CORDIC model is designed using quadrant mapping and a pipeline structure. Both CORDIC models are designed with six stages. The hardware architecture of the GMSK system is implemented and prototyped on an Artix-7 FPGA. The two CORDIC based GMSK systems are synthesized, and their performance is summarized in terms of hardware constraints such as area, time, and power. Compared with the pipelined CORDIC based GMSK system, the optimized CORDIC based GMSK system improves the area (slice registers) by about 41.53%, the operating frequency by about 38.8%, and the power utilization by about 2.17%. Similarly, compared with the pipelined CORDIC model, the optimized CORDIC model improves the area (slice registers) by about 32.87%, the operating frequency by about 69%, the latency by about 12.5%, and the throughput by about 72.89%. In the future, the proposed GMSK system can be adopted in the GSM standard for real-time usage and extended for further spectral efficiency enhancements.