L-Bit to M-Bit Code Mapping to Avoid Long Consecutive Zeros in NRZ with Synchronization

We investigate codes that map L bits to m bits to obtain a set of codewords that contain no run of n consecutive "0"s. Such codes are desirable in the design of line codes which, in the absence of clock information in the data, provide reasonable clock recovery through sufficient state changes. Two problems are tackled: (i) we derive the minimum n for fixed L and m, and (ii) we determine the minimum overhead for fixed L and n. The results benefit telecommunication applications where clock synchronization of received data must be achieved with minimum overhead.
Keywords: overhead; mapping; synchronization; consecutive "0"


I. INTRODUCTION AND BACKGROUND
In serial communications, data is transferred on a medium that carries a signal varying with time. For a digital signal, each bit is represented as a high or low voltage held for a fixed amount of time. We call this time period a clock cycle. The clock of the communication line is very important, as it tells the transmitter when to transmit a new bit and tells the receiver when to read. For short-distance transfer, such as communication within a digital system, we can run a clock signal between the transmitter and receiver to synchronize them; for example, the Serial Peripheral Interface (SPI) uses a dedicated clock signal for synchronization. However, for long-distance communication, adding signaling solely for clock synchronization consumes part of the bandwidth. At the same time, it is impossible to match the clock speeds of the transmitter and receiver exactly. On the other hand, employing codes that contain explicit clock information (e.g., Manchester coding) wastes half of the available bandwidth [1]. In practice, the clock information is embedded within the data so that, at the receiver end, the clock can be extracted and used to clock in the received data (using devices such as a phase-locked loop, or PLL). Nonetheless, a long period of flat signal (which may correspond to consecutive "0"s) may cause synchronization to be lost. The signal that carries the data must therefore have sufficient transitions, or state changes, to allow a PLL to lock onto the incoming data.
With scrambling techniques for data encoding, the transmitter should provide a sufficient number of signal transitions for the receiver to maintain clock synchronization [2]. Line coding is applied to data before transmission, especially in high-speed serial links, to bound the maximum run length (RL) and so guarantee frequent transitions for clock and data recovery (CDR) in asynchronous links [3]. Examples are B8ZS and HDB3, which substitute a long sequence of zeros with a code violation of the encoding rule. These techniques require either a higher signal rate for the same data rate or more than 2 signal levels to represent binary data. For Manchester and Differential Manchester, the signal rate is twice the data rate (50% overhead). For B8ZS and HDB3, having 3 signal levels to represent a single binary bit creates a 33% overhead. Although line codes can generate adequate timing information for clock recovery and error detection [5] [6], they usually come at the cost of additional bits. In this paper, we discuss how to minimize the overhead while keeping the same clock recovery performance.
Another technique is to eliminate long sequences of zeros by encoding the data so that the transmitted data does not contain long sequences of "0"s. The widely used 8b/10b encoding [4] adds 2 bits for every 8 bits, resulting in 2/8 = 25% overhead, while ensuring a maximum RL of 5. Another example is mapping 4-bit data to 5-bit codes such that a sequence of 3 "0"s is avoided (n = 3). There are 2^4 = 16 possible codes in 4-bit data. In the 5-bit code space, 24 (32 - 8) codes without the "000" sequence are available, so all 4-bit data can be mapped to 5-bit codes.
This paper proposes an empirical method of calculating the minimum overhead needed to avoid a given number of consecutive "0"s. The rest of this paper is organized as follows: Section II discusses the basic theory of avoiding long sequences of "0"s. Section III introduces the methodology used to obtain two empirical formulas for our questions. The results and conclusion are given in Sections IV and V, respectively.

II. THEORY
The research question here is: given a specific size of a code, what is the smallest overhead needed to avoid a given number of consecutive "0"s?
Example 1. We are given a 9-bit code and want to avoid the sequence "000". First we check whether the 10-bit code space is large enough to hold all 9-bit codes while avoiding "000". If 10 bits are not enough, we consider the 11-bit code and continue until we find the smallest code size that can hold the 9-bit code and avoid "000".
To check whether a 10-bit code is enough, we first enumerate the 10-bit codes that contain the "000" sequence. The calculations are shown in Table II. In the patterns given in the table, X can be 0 or 1, and each line must exclude the cases already counted in the previous lines.
In particular, when a run of X's of length 3 or more appears on the right side, we have to exclude any codes in which those X's contain the "000" sequence, because they were already covered in previous lines.
Subtracting the 520 codes that contain "000" from the total code space of 2^10 = 1024 leaves only 504 codes, which is not enough to map all 2^9 = 512 9-bit codes to 10-bit codes.
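The count in Example 1 can be cross-checked by exhaustive enumeration. The sketch below is only an illustration and is not the Appendix program; the function name count_with_zero_run is our own.

```python
def count_with_zero_run(m: int, n: int) -> int:
    """Count m-bit codes that contain at least n consecutive '0's (brute force)."""
    run = "0" * n
    # format(v, "010b") style formatting pads each value to exactly m binary digits.
    return sum(1 for v in range(2 ** m) if run in format(v, f"0{m}b"))


with_run = count_with_zero_run(10, 3)
without_run = 2 ** 10 - with_run
# 9-bit data needs 2**9 = 512 distinct codewords.
print(with_run, without_run, without_run >= 2 ** 9)  # 520 504 False
```

Enumeration costs 2^m string checks, so it is practical only for small m.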

Generalization
To answer the general question of whether it is possible to map all L-bit codes to m-bit codes that avoid a sequence of n consecutive zeros, we first have to find the number of codes without an n-zero sequence by subtracting the number of codes with an n-zero sequence from the total m-bit code space 2^m.
To answer the other question, that of finding the smallest number of consecutive "0"s that can be avoided, we can start with 2 zeros and work upward. If we cannot avoid 2 zeros, we test whether we can avoid 3, 4, 5, and so on, repeating until we find the smallest number of consecutive "0"s that we can avoid through L-bit to m-bit mapping.
To find the number of codes with sequence(s) of n "0"s in the m-bit space, we follow steps similar to those of the 9B-10B mapping example (Table III).
We can see that calculating f(m, n) requires recursive calculation of the number of codes with an n-zero sequence for code lengths less than m. We must define a base case for the recursive function; otherwise the calculation never terminates. We know that an n-zero sequence cannot exist in a code whose length m is shorter than n bits, so f(m, n) = 0 for m < n.
For m ≥ n, we continue the generalization. Adding the terms and simplifying gives a piecewise recursive function f(m, n). Calculating f(m, n) directly looks complicated because of the summation. We can make the calculation easier and more efficient by observing the following 2 special cases of m and n.
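A worked form of the recurrence may help here. The following is a reconstruction under stated assumptions, not necessarily the authors' exact expression: codes are classified by the position of their last run of n zeros, the index i (our notation) is the length of the suffix that follows the "1" terminating that run, and f(i, n) = 0 for i < n.

```latex
\begin{align*}
f(m,n) &= 2^{\,m-n} + \sum_{i=0}^{m-n-1} 2^{\,m-n-1-i}\,\bigl(2^{\,i} - f(i,n)\bigr), \qquad m \ge n \\
       &= 2^{\,m-n} + (m-n)\,2^{\,m-n-1} - \sum_{i=n}^{m-n-1} 2^{\,m-n-1-i}\, f(i,n).
\end{align*}
```

For m = 10 and n = 3 this gives 128 + 448 - 56 = 520, in agreement with the count in Example 1.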

Case (i): if m = n, there is only one code that contains the n-zero sequence, namely the n-zero sequence itself, so f(n, n) = 1.

Case (ii): if n < m ≤ 2n, consider the last term added to the sum. Taking the largest m in this range, we substitute m with 2n; the recursive call to f then returns 0 because its first argument is less than n.
Similarly, every m between n and 2n makes recursive calls to f with a first argument less than the second argument n, which ultimately give 0 as the result. So for n < m ≤ 2n, the summation evaluates to 0.
By separating the domain of the function in this way, we obtain a new formula with 4 pieces that is easier to calculate by hand and more efficient to compute digitally.
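A four-piece function of this kind can be sketched as follows. This is a reconstruction under our own assumptions rather than the Appendix code: the closed form used for n < m ≤ 2n and the residual sum used for m > 2n come from classifying codes by their last run of n zeros, and the names f, k and i are ours.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def f(m: int, n: int) -> int:
    """Number of m-bit codes containing at least one run of n consecutive '0's (n >= 1)."""
    if m < n:          # piece 1: the code is too short to hold the run
        return 0
    if m == n:         # piece 2: only the all-zero code qualifies
        return 1
    k = m - n
    base = 2 ** k + k * 2 ** (k - 1)
    if m <= 2 * n:     # piece 3: every recursive term would be zero
        return base
    # piece 4: subtract the codes already counted for shorter suffix lengths i
    return base - sum(2 ** (k - 1 - i) * f(i, n) for i in range(n, k))


print(f(10, 3), f(11, 3))  # 520 1121
```

Memoization keeps the recursion cheap, because each f(i, n) is evaluated only once.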
We can use the code in the Appendix to calculate the function f. With m = 10 and n = 3, the function returns 520, which matches our previous calculation for the 9-bit to 10-bit mapping example. We can also check the result by counting the codes with the "000" pattern using a brute-force checking program created by Edgar Solorio (see Appendix); our result matches the number counted by this program (520). Similarly, with m = 11 and n = 3, the function returns 1121, which means that 2048 - 1121 = 927 codes are available for mapping. While this is not enough to map 10-bit codes, it is sufficient for 9-bit codes, giving 18.2% overhead.
Generally speaking, for given m and n, the largest data length that can be mapped is
L = ⌊log2(2^m - f(m, n))⌋ (1)
where the floor ⌊x⌋ is the greatest integer function, mapping a real number to the largest integer not exceeding it.
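As a minimal sketch of (1), the number of available codes can be computed directly and its base-2 logarithm floored with integer arithmetic. Two assumptions here are ours: the available codes are counted with the standard n-step Fibonacci recurrence instead of via f(m, n), and the names count_without_zero_run and max_data_bits are hypothetical.

```python
def count_without_zero_run(m: int, n: int) -> int:
    """Number of m-bit codes with no run of n consecutive '0's (n >= 1)."""
    counts = [2 ** k for k in range(n)]      # every code shorter than n bits is valid
    for k in range(n, m + 1):
        counts.append(sum(counts[k - n:k]))  # n-step Fibonacci recurrence
    return counts[m]


def max_data_bits(m: int, n: int) -> int:
    """Largest L with 2**L <= number of valid m-bit codes, i.e. floor(log2(count))."""
    available = count_without_zero_run(m, n)  # always >= 1 (the all-ones code)
    return available.bit_length() - 1         # exact integer floor of log2


print(max_data_bits(10, 3), max_data_bits(11, 3), max_data_bits(22, 3))  # 8 9 19
```

Using bit_length avoids floating-point rounding, which matters once m grows to 64 bits and beyond.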
We define the minimum overhead bits as
h = m - L (2)
and quote the overhead as the percentage h/m. If we try 19-bit to 22-bit mapping, which is possible, there is only 13.6% (3 bits) overhead. Similarly, the 64-bit code space does not have enough codes without "000" to map 61-bit codes, but it has enough to map 56-bit codes. 8-bit to 9-bit mapping is also possible: the 9-bit space has 238 codes with "000", leaving 274 codes available to map the 8-bit (256 possible) codes. The overhead for 8B9B is 11.1%. Since 9B-10B is impossible, code mapping with 1-bit overhead stops at 8-bit to 9-bit mapping.

III. METHODOLOGY
The result of the 9B-10B example in Section II shows that mapping 9-bit codes to 10-bit codes cannot avoid all codes with 3 consecutive zeros. If we want to avoid 3 consecutive zeros, the minimum overhead to map a 9-bit code is 2 bits. The following two questions are our main concerns in avoiding runs of consecutive zero-level signal in order to maintain synchronization.
A. For a given pattern length and a mapping from L bits to m bits, what is the minimum number of consecutive "0"s we can avoid? (Fixed L and m, find n.) The code in the Appendix shows how to solve this question.
Table IV shows the minimum avoidable zeros with 1 to 9 bits of overhead for mapping data of lengths up to 24 bits.
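Question A can be sketched as a simple upward search on n, starting from 2 as described in Section II. This is an illustration under our own assumptions and not the Appendix code; count_without_zero_run and min_avoidable_run are hypothetical names, and the counting uses the n-step Fibonacci recurrence.

```python
def count_without_zero_run(m: int, n: int) -> int:
    """Number of m-bit codes with no run of n consecutive '0's (n >= 1)."""
    counts = [2 ** k for k in range(n)]      # every code shorter than n bits is valid
    for k in range(n, m + 1):
        counts.append(sum(counts[k - n:k]))  # n-step Fibonacci recurrence
    return counts[m]


def min_avoidable_run(L: int, m: int) -> int:
    """Smallest n >= 2 such that all L-bit data fit into m-bit codes with no n-zero run."""
    if m < L:
        raise ValueError("m must be at least L")
    n = 2
    # The loop ends at n = m + 1 at the latest, where every m-bit code is valid.
    while count_without_zero_run(m, n) < 2 ** L:
        n += 1
    return n


print(min_avoidable_run(8, 9), min_avoidable_run(9, 10), min_avoidable_run(9, 11))  # 3 4 3
```

The second value illustrates a jump: with one bit of overhead, moving from 8-bit to 9-bit data raises the shortest avoidable run from 3 to 4 zeros.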
The jumps in n are highlighted in bold. We plot the data along both the horizontal (Fig. 1) and vertical (Fig. 2) directions. The plots show that L is approximately linear in h and roughly exponential in n. This means that, for a fixed n, the overhead bits h should be proportional to L. We therefore assume a formula whose coefficients a, b, and c are to be determined.
B. For a given number of consecutive "0"s to avoid, how can we minimize the overhead? We solve the problem for the specific case of avoiding two "0"s first. Under what conditions can we map "L" to "L+1"? If that is not possible, what about "L" to "L+2" or "L" to "L+3"? (Fixed n, find h.) The code in the Appendix shows how to solve question B. Table VI shows the minimum overheads needed to avoid n consecutive zeros over a range of n and of data lengths L. The jumps in overheads are highlighted in bold.
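Question B can be sketched the same way, by growing the code length one bit at a time until enough valid codes exist. Again this is a hedged illustration and not the Appendix code; count_without_zero_run and min_overhead are hypothetical names.

```python
def count_without_zero_run(m: int, n: int) -> int:
    """Number of m-bit codes with no run of n consecutive '0's (n >= 1)."""
    counts = [2 ** k for k in range(n)]      # every code shorter than n bits is valid
    for k in range(n, m + 1):
        counts.append(sum(counts[k - n:k]))  # n-step Fibonacci recurrence
    return counts[m]


def min_overhead(L: int, n: int) -> int:
    """Smallest h such that all L-bit data fit into (L + h)-bit codes with no n-zero run."""
    if n < 2:
        raise ValueError("n must be at least 2; avoiding every '0' leaves a single code")
    m = L
    while count_without_zero_run(m, n) < 2 ** L:
        m += 1  # try L -> L+1, then L -> L+2, and so on
    return m - L


print(min_overhead(8, 3), min_overhead(9, 3), min_overhead(19, 3))  # 1 2 3
print(min_overhead(4, 2))  # 2
```

The search considers single codewords only; runs that form across codeword boundaries are outside its scope.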
Fig. 3. Plot of the data from Table VI.
If we rearrange and extend the data of Table VI, we obtain Table VII.
Fig. 4. Minimum overheads versus message length L and number of consecutive "0"s to be avoided.

IV. RESULTS
We can check the accuracy of (4) at different check points (Table VIII). From this table, we can see that, when n is between 3 and 9, (4) is accurate enough to determine the minimum number of consecutive "0"s that can be avoided for fixed L and m.
We can also check the validity of (5) at different check points in Table IX. From this table, we can see that, when n is between 3 and 6, this formula is accurate enough to determine the minimum required overhead bits h to avoid n consecutive "0"s for a fixed message length L.

V. CONCLUSION
We have considered the problem of L to m mapping to avoid a set of n consecutive "0"s. We derived two formulas to calculate (i) the minimum number of consecutive "0"s that can be avoided for fixed L and m (4) and (ii) the minimum overhead required to avoid a given number of consecutive "0"s with fixed L (5). We found the exact values for small values of L, m and n (Table IV and Table VII). For very long messages, we used the empirical results and a combination of several tables to arrive at a formula that gives the desired answer to a close approximation.
One may think of splitting a long code into smaller codes and using the results for small values to obtain the parameters for the long code. For example, splitting a 56-bit code into eight 7-bit codes would simplify the calculation, but it does not work: even if none of the eight 7-bit codes contains "000", when the frame is longer than 7 bits (e.g., 64 bits), a run of "000" can straddle the end of one 7-bit code and the start of the next.
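The boundary problem is easy to reproduce. The two 7-bit codewords below are our own illustrative choice, and has_zero_run is a hypothetical helper name.

```python
def has_zero_run(bits: str, n: int) -> bool:
    """Return True if the bit string contains n consecutive '0's."""
    return "0" * n in bits


left, right = "1111100", "0111111"  # each 7-bit code is free of "000"
print(has_zero_run(left, 3), has_zero_run(right, 3))  # False False
print(has_zero_run(left + right, 3))                  # True
```

The trailing "00" of the first code and the leading "0" of the second form "000" once the codes are transmitted back to back.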
The results obtained can find applications in coding and communication where the synchronization of the transmitter and receiver is of primary concern.

TABLE II. ENUMERATION OF 10-BIT CODES THAT CONTAIN "000"
Pattern    The number of occurrences
XXXXX

TABLE VI. MINIMUM OVERHEADS TO AVOID N CONSECUTIVE ZEROS

TABLE VII. MINIMUM L FOR SPECIFIC N AND H
This table is slightly different from Table V. We can use a similar procedure and get . . .

TABLE VIII. COMPARISON OF THEORETICAL AND CALCULATED N FOR FIXED L AND H

TABLE IX. COMPARISON OF THEORETICAL AND CALCULATED N FOR FIXED L AND H