Using of Redundant Signed-Digit Numeral System for Accelerating and Improving the Accuracy of Computer Floating-Point Calculations

The article proposes a method for the software implementation of floating-point computations on a graphics processing unit (GPU) with increased accuracy. The method eliminates the sharp increase in rounding errors that occurs when addition, subtraction or multiplication is performed on numbers that differ significantly in magnitude. It is based on representing floating-point numbers as decimal fractions that are uniformly distributed within a range, and on the use of a redundant signed-digit numeral system to speed up calculations. The results of computational experiments evaluating the effectiveness of the proposed approach are presented. Acceleration is obtained for the problems of calculating the sum of an array of numbers and determining the dot product of vectors. The proposed approach is also applicable to the discrete Fourier transform.

Keywords: High-precision computation; redundant signed-digit numeral system; signed-digit floating-point format; redundant signed-digit arithmetic; decimal fractions


I. INTRODUCTION
Most computer calculations are carried out in floating-point format, and double precision is sufficient for solving many computational problems.
However, there are a number of problems, for example in computational geometry and other areas, where double precision floating-point arithmetic is not sufficient [1]. To solve such problems, well-known high-precision computation libraries are used, such as ZREAL (Russia), MPARITH (Germany) and GMP (USA), which implement floating-point calculations at the software level with a mantissa length set by the user [2,3,6,7,8,9,10].
However, the calculation time of these libraries rises sharply as the length of the mantissa and the number of arithmetic operations increase. In addition, they inherit the disadvantages of the floating-point format itself, which does not always guarantee an accurate result.
One such disadvantage is the uneven distribution of floating-point numbers. Fig. 1 shows the uneven distribution of normalized floating-point numbers with a mantissa length of 3 binary digits and an order from 0 to 4 [4].
As an example of the loss of accuracy in computer calculations, consider the problem of determining the dot product of two vectors whose coordinates depend on two parameters. The true result is 8779. The dot product was calculated in single precision format, and the relative error was calculated with one parameter held constant at 1 and the other, β, ranging from 1 to 21.
The relative error of the dot product, in percent, was calculated using the formula:
$$\delta = \frac{|(x, y) - 8779|}{8779} \cdot 100$$
The dependence of the relative error of the dot product on the parameter β in single precision floating-point format is presented in the graph shown in Fig. 2. Fig. 2 shows that when the two parameters differ significantly, starting from β = 18, there is a sharp loss of accuracy in the dot product, because the calculations are performed on numbers that differ greatly from each other in magnitude.
Using double precision floating-point format, the increase in the relative error occurs at larger values of β compared to the single precision format.
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 11, No. 9, 2020
Let a vector x be given. The discrete Fourier transform of the vector x into a vector y and the inverse Fourier transform are each performed using a corresponding formula. If we carry out the direct discrete Fourier transform and then the inverse Fourier transform of the resulting vector, the original vector x should be recovered exactly in the absence of rounding errors.
The first goal of this work is to eliminate the sharp loss of accuracy in calculations with numbers that differ greatly from each other in magnitude. The second goal is to speed up computations by parallelizing them.
The first goal is achieved by moving from floating-point representation to decimal representation that is evenly distributed within the range, as shown in Fig. 4, for example for decimal fractions of the first degree.
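The difference between the two representations can be illustrated with a minimal sketch. It assumes that a decimal fraction with one digit after the point is stored as an integer count of steps of 0.1; the names `SCALE`, `to_fixed` and `fixed_to_str` are hypothetical and are not taken from the article. Because the step is the same everywhere in the range, a small addend is not absorbed by a large one, whereas in double precision it is.

```python
from decimal import Decimal

SCALE = 10  # hypothetical choice: one decimal place, i.e. a uniform step of 0.1


def to_fixed(text):
    """Parse a decimal string into an integer number of 1/SCALE steps."""
    scaled = Decimal(text) * SCALE
    if scaled != scaled.to_integral_value():
        raise ValueError("value is not representable with the chosen step")
    return int(scaled)


def fixed_to_str(value):
    """Format an integer number of 1/SCALE steps as a decimal string."""
    sign = "-" if value < 0 else ""
    whole, frac = divmod(abs(value), SCALE)
    return f"{sign}{whole}.{frac}"


big, small = "10000000000000000", "0.1"

# Double precision: neighbouring values near 1e16 are 2 apart, so 0.1 is lost.
float_sum = float(big) + float(small)

# Uniformly spaced decimal fractions: the addition is exact integer arithmetic.
fixed_sum = to_fixed(big) + to_fixed(small)
```

The sketch relies on Python's arbitrary-precision integers; a GPU implementation would have to store the digits explicitly, which is where the signed-digit representation comes in.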
The second goal is achieved through the use of a redundant signed-digit numeral system, in which redundant negative digits are added to the digit set in such a way that a carry produced during addition propagates no further than one digit position [4,5]. As a result, the arithmetic operations of addition, subtraction and multiplication can be parallelized, which accelerates them, especially as the number of digits increases. The time required for addition or subtraction does not depend on the number of digits in the operands.
In [11,12,14,15,16], methods of representation and algorithms for performing arithmetic operations in a modular numeral system are presented, together with a method for accelerating them by parallelizing over several moduli.
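The limited carry propagation can be illustrated with the classical two-step signed-digit addition scheme for radix 10. This is a sketch under stated assumptions, not the addition rule of the article: the digit set {-6, ..., 6} is one admissible choice, and the names `RADIX`, `MAX_DIGIT`, `to_sd`, `sd_value` and `sd_add` are hypothetical. Digits are stored least significant first.

```python
RADIX = 10
MAX_DIGIT = 6  # assumed digit set {-6, ..., 6}


def to_sd(n):
    """Convert an integer to signed digits in [-5, 5], least significant first."""
    sign = -1 if n < 0 else 1
    n = abs(n)
    digits = []
    while n:
        d = n % RADIX
        n //= RADIX
        if d > 5:          # replace d by d - 10 and carry one into the next digit
            d -= RADIX
            n += 1
        digits.append(sign * d)
    return digits or [0]


def sd_value(digits):
    """Integer value of a signed-digit list."""
    return sum(d * RADIX**i for i, d in enumerate(digits))


def sd_add(x, y):
    """Add two signed-digit numbers; a carry moves at most one position."""
    n = max(len(x), len(y))
    x = x + [0] * (n - len(x))
    y = y + [0] * (n - len(y))
    transfer = [0] * (n + 1)
    interim = [0] * (n + 1)
    # Step 1: every position is handled independently of all the others.
    for i in range(n):
        p = x[i] + y[i]
        if p >= MAX_DIGIT:
            t = 1
        elif p <= -MAX_DIGIT:
            t = -1
        else:
            t = 0
        transfer[i + 1] = t
        interim[i] = p - RADIX * t   # always within [-5, 5]
    # Step 2: each position absorbs the transfer of its right neighbour only.
    return [interim[i] + transfer[i] for i in range(n + 1)]
```

Both loops have no dependency between iterations, so on a GPU each digit position could be assigned to its own thread with one synchronization between the two steps.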
This article proposes a different approach based on the transition to a redundant signed-digit numeral system.
Such a numeral system has an advantage over modular systems: conversion to and from the traditional numeral system is simpler, and there is no need for overflow control.
A method for summing a group of numbers and calculating the dot product, oriented towards parallel implementation on a GPU, is considered. The results of experimental studies of the effectiveness of this method of high-precision calculations are obtained. The next section considers a possible way to represent numbers in a redundant signed-digit numeral system.

II. REPRESENTING NUMBERS IN A REDUNDANT SIGNED-DIGIT NUMERAL SYSTEM
Consider the representation of floating-point numbers in the form (6) [4], which is defined by the following quantities: the floating-point number itself; its mantissa, an integer bounded by an inequality; the base (radix) q of the numeral system; the order, an integer bounded by an inequality; a natural number characterizing the length of the mantissa; and a natural number characterizing the maximum order of representable numbers. Table I includes the positive and negative minimum and maximum numbers representable in the form (6) [11,13,17,18].
The range of representable numbers (6) follows from these bounds. Consider the sum of a group of numbers of this form. If $q = 10$, the maximum number of digits required to describe the sum can be determined, and the resulting expression shows that adding groups of numbers in floating-point format requires calculations with large numbers.
Consider the format (11) for representing floating-point numbers (6) in the signed-digit numeral system, expressed in terms of the digits of the signed-digit representation. This format will be referred to as the floating-point signed-digit format.
Consider the rule for adding two numbers and a group of numbers in this format. ...  In the next section, the first and second methods of summing a group of numbers are considered.

IV. METHODS OF SUMMING GROUPS OF NUMBERS
The first method of summing groups of numbers is carried out according to the formula (12) using the addition rule presented in Section III in redundant signed-digit arithmetic in parallel over the digits and sequentially for each number of the group. This method, when implemented on GPU, requires a large number of synchronizations between cores. In Section V the results of experimental study of the effectiveness of this method are presented.
Consider the second method of summation with fewer synchronizations.
Let d be the number of digits of the summed numbers, and consider the maximum possible number of digits required to describe the sum.
Let a set of numbers for summation be given. Each thread, identified by its thread number, finds the digits $j_1, j_2, j_3$ and forms numbers from them. These numbers are then summed sequentially, one after another, bit-parallel in the signed-digit numeral system according to the first method.

The next section considers the results of experimental studies of the efficiency of summation of groups of numbers.

V. EXPERIMENTAL STUDY OF THE EFFICIENCY OF HIGH-PRECISION SUMMATION OF GROUPS OF NUMBERS IN THE SIGNED-DIGIT NUMERAL SYSTEM
Numerical experiments were carried out on the addition of groups of integers of different magnitudes, with the number of integers k = 10000, 100000, 1000000. On the CPU, the addition followed the rules of traditional arithmetic, digit by digit, with the numbers of the group processed sequentially. On the Nvidia GPU (1.78 GHz, 1280 cores), the first method was used: bit-parallel within each number and sequential over the numbers of the group.
GPU calculations were performed as follows:
1) Initial data were generated randomly: integers of fixed length were generated and stored in arrays.
2) Arrays were transferred to the GPU.
3) A number of threads were created matching the number of digits. Each thread carried out sequential summation of the array numbers in its corresponding digit in parallel and independently.
4) The result of the summation was transferred from the GPU to the CPU.
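Steps 1) to 4) can be emulated sequentially on the CPU as follows. This is a minimal sketch, not the GPU code used in the experiments: each iteration over `j` in `column_sums` stands in for one GPU thread that sums its own digit position over the whole array, and the final carry normalization is a plain decimal pass rather than a signed-digit addition. All function names are hypothetical, and the sketch is restricted to non-negative integers.

```python
import random


def to_digits(n, width):
    """Decimal digits of a non-negative integer, least significant first."""
    return [(n // 10**i) % 10 for i in range(width)]


def column_sums(numbers, width):
    """Per-digit sums; every position j is independent of the others."""
    rows = [to_digits(n, width) for n in numbers]
    return [sum(row[j] for row in rows) for j in range(width)]


def normalize(columns):
    """Turn column sums into ordinary decimal digits by propagating carries."""
    digits, carry = [], 0
    for c in columns:
        carry, d = divmod(c + carry, 10)
        digits.append(d)
    while carry:
        carry, d = divmod(carry, 10)
        digits.append(d)
    return digits


def digits_to_int(digits):
    """Integer value of a little-endian decimal digit list."""
    return sum(d * 10**i for i, d in enumerate(digits))


def generate(k, width, seed=0):
    """Step 1: random non-negative integers with at most `width` digits."""
    rng = random.Random(seed)
    return [rng.randrange(10**width) for _ in range(k)]


def group_sum(numbers, width):
    """Steps 3 and 4: column-wise summation followed by normalization."""
    return digits_to_int(normalize(column_sums(numbers, width)))
```

For example, `group_sum(generate(1000, 50), 50)` agrees with Python's built-in `sum` on the same list. No thread needs the result of another thread until the normalization pass, which is why the number of synchronizations does not grow with k in this arrangement.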
The time required for summation on the CPU and on the GPU was measured, including the transfer of data to the GPU and back to the CPU. On the basis of these measurements, the absolute acceleration coefficients were determined for different numbers of digits and values of k.
Experiments showed that data transfer between the CPU and the GPU was very fast and did not delay the computation. One of the main reasons for the low efficiency of the first summation method on the GPU in the signed-digit numeral system is the need for synchronization after the addition of each pair of numbers in the group, which slows down the computation. If the array contains k numbers, this summation method requires k-1 synchronizations.
Experiments on summing the same groups of numbers using the second method showed that it is more efficient than the first. For the second method, the computation times on the CPU and the GPU were measured, and the absolute acceleration coefficient was determined by formula (15) for different values of k. Fig. 6 shows the dependence of the absolute acceleration coefficient on the number of digits for this summation method. According to Fig. 6, using the second summation method on the GPU with numbers comprising 800 to 900 digits speeds up the computation by a factor of 3 to 4 compared with summation on the CPU. With a further increase in k and in the number of digits, the acceleration is expected to be even greater.
The next section considers a method for multiplying numbers based on redundant signed-digit arithmetic.

VI. METHOD FOR MULTIPLYING NUMBERS
Let d be the number of digits required to represent the initial data, and consider the maximum possible number of digits required to represent the result of multiplication.
Let two numbers be given. The partial products are formed in parallel and independently by each thread for each product of two numbers.
They are then summed using the second method described in Section IV.
The next section considers the results of experimental studies of the efficiency of calculating the dot product of vectors.

VII. EXPERIMENTAL STUDY OF THE EFFICIENCY OF HIGH-PRECISION CALCULATION OF THE DOT PRODUCT OF VECTORS IN SIGNED-DIGIT NUMERAL SYSTEM
The dot product of vectors (x,y), where x = (x1,x2,…,xk), y = (y1,y2,…,yk), is calculated as follows:
1) The values of arrays x,y are transferred to the GPU.
2) For each pair xi and yi partial products (24-26) are calculated in parallel and independently on the GPU.
3) Next, the summation of the obtained partial products is carried out using the second method.
4) The result of the dot product is transferred to the CPU.
Numerical experiments were carried out to calculate the dot product of vectors with different numbers of coordinates k = 10000, 100000, 1000000. The coordinates were integers with a length of 50 decimal digits. The dot product was calculated according to the rules of traditional integer arithmetic on the CPU and in redundant signed-digit arithmetic bit-parallel on the GPU.
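The dot product procedure in steps 1) to 4) can be sketched on the CPU in the same spirit. The sketch assumes that the partial products of a pair are the digit-by-digit products accumulated per digit position; formulas (24-26) of the article are not reproduced here, so this is an illustration of the general idea rather than the authors' scheme. The function names are hypothetical, and signed decimal digits are used so that negative coordinates are handled.

```python
def to_signed_digits(n, width):
    """Decimal digits carrying the sign of n, least significant first."""
    sign = -1 if n < 0 else 1
    m = abs(n)
    return [sign * ((m // 10**i) % 10) for i in range(width)]


def partial_product_columns(a_digits, b_digits):
    """Digit-by-digit products accumulated per result position."""
    cols = [0] * (len(a_digits) + len(b_digits))
    for i, a in enumerate(a_digits):
        for j, b in enumerate(b_digits):
            cols[i + j] += a * b     # independent of every other (i, j) pair
    return cols


def dot_product(xs, ys, width):
    """Dot product of integer vectors via per-position accumulation."""
    cols = [0] * (2 * width)
    for x, y in zip(xs, ys):
        pair_cols = partial_product_columns(
            to_signed_digits(x, width), to_signed_digits(y, width)
        )
        # Column-wise summation: position p never needs data from position q.
        cols = [c + p for c, p in zip(cols, pair_cols)]
    # Final recombination; it stands in for the carry normalization step.
    return sum(c * 10**p for p, c in enumerate(cols))
```

Calling `dot_product(xs, ys, 50)` on two lists of 50-digit integers returns the same value as `sum(x * y for x, y in zip(xs, ys))`. All carries are deferred to the last line, so the work before it can be split across threads by digit position or by coordinate pair.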
The results of the experiments are presented in the graph in Fig. 7.
The graph shows that the proposed method provides greater acceleration as d, the number of digits required to represent the initial data, and the maximum possible number of digits required to represent the result of the dot product increase.

VIII. CONCLUSION
This article proposes an approach to the software implementation of computations on a GPU that prevents the sharp loss of precision in calculations with numbers that differ greatly from each other in magnitude. The approach is based on representing floating-point numbers as decimal fractions and on the use of a redundant signed-digit numeral system to speed up computations with them on the GPU.
Acceleration was demonstrated experimentally for the summation of an array of numbers on the GPU and for the calculation of the dot product of vectors.
The proposed approach is also applicable to the discrete Fourier transform, both for the case presented in the article and for other cases.