Performance Evaluation of Network Gateway Design for NoC based System on FPGA Platform

Guruprasad S.P\textsuperscript{1}

Research Scholar, Dept. of ECE
Jain University, Bangalore, India

Dr.Chandrasekar B.S\textsuperscript{2}

Director, CDEVL
Jain University, Bangalore, India

Abstract—Network on Chip (NoC) is an emerging interconnect solution that brings reliability and scalability to System on Chip (SoC) designs and helps overcome the drawbacks of bus-based interconnection. Cores and other networks attached to an NoC have a boundary: they can communicate only with the devices directly connected to them. To communicate with cores outside this boundary, the NoC requires gateway functionality. In this manuscript, a cost-effective Network Gateway (NG) model is designed, and an NoC based system in which network gateways connect multiple cores to the NoC is prototyped on an Artix-7 FPGA. The NG mainly consists of a serializer and a deserializer for transmitting and receiving data packets with proper synchronization, a temporary register to hold the network data, and an electronic crossbar switch that is connected to multiple cores and controlled by a switch controller. The NG with a router and NoC based systems of different sizes are designed using congestion-free adaptive-XY routing. The implementation results and performance are analyzed for the NG based NoC in terms of average latency and maximum throughput for different Packet Injection Rates (PIR). The proposed network gateway achieves low latency and high throughput in NoC based systems for different PIR.

Keywords—Network gateway; network on chip; FPGA; routing; network interface; crossbar switch

I. INTRODUCTION

The NoC is expected to play a growing role in future high-performance Chip Multi-Processors (CMP) by addressing interconnection problems. In recent years, most research has focused on packet-switched NoC design, which improves system performance by using network optimization techniques for better latency and bandwidth and supports both on-chip and off-chip communication. Photonic NoC communication supports large data transfers with higher bandwidth and lower power consumption, and photonic NoCs interface multiple cores using a gateway switch [1] [3]. Multi-Processor SoCs (MPSoC), in which multiple cores reside on a single chip, are moving towards mixed-criticality systems that must provide dependability, security, and access by different blocks to shared resources. Real-time messages from outside networks are communicated to the MPSoC through a gateway [2]. In general, a network gateway is a node that connects two networks with different transmission protocols and consolidates internet connectivity into one electronic device. In business use, the gateway node can also act as a firewall and proxy server. A gateway provides compatibility between two different protocols and can operate at any of the Open Systems Interconnection (OSI) layers. A gateway on a single-core chip supports multifunctional intercommunication. Different protocol standards such as Bluetooth, Modbus, serial bus, Process Field Bus, and Controller Area Network (CAN) intercommunicate through a gateway [4]. An intelligent gateway is interoperable and reconfigurable, achieves better communication among different bus networks, and supports fast conversion speed, flexibility, intelligent control, reliability, and a higher-level interface. A protocol-converting gateway works on most of the OSI layers [5-6]. High-performance computation needs high-speed interconnects such as Ethernet and InfiniBand. Data transmission between two heterogeneous networks needs an efficient network gateway to improve system performance in terms of bandwidth and latency [7] [12].

The term gateway is commonly used across applications for protocol conversion and data packet transfer. Network gateways are used in most real-time embedded and Internet of Things (IoT) applications. A home gateway requires a standard ARM chip with an SoC that integrates the Consumer Electronics Bus (CEBus) with home appliances such as a TV, microwave oven, refrigerator, and washing machine. The user sends a command over the internet; the network control module receives the command and issues a request signal to the chip to control the home appliances [8]. A heterogeneous gateway provides different interfaces to the internet, GSM, CDMA, PSTN, and so on, to support different application scenarios [9]. The embedded gateway is a backbone for smart grid home networks [10], wireless applications [11] [14], indoor high-precision positioning systems [13], and IoT applications [15] that communicate with other networks.

In this manuscript, a cost-effective network gateway model is designed along with a gateway based NoC system using adaptive XY routing. The network gateway is hardware resource-efficient and achieves low latency and high throughput for the input traffic evaluated on the NoC based system. Section II reviews related work on gateway mechanisms used in different applications and summarizes the research gaps. Section III describes the network gateway architecture built around an electronic crossbar switch. Section IV explains the network gateway based NoC system and its router architecture. The results and performance evaluation are analyzed with tables and graphs in Section V. Finally, Section VI concludes the proposed work and outlines its future scope.

II. RELATED WORKS

In this section, general gateway related work and gateway applications are reviewed. Shi et al. [16] presented an embedded dual-home architecture with a secured gateway on both hardware and software platforms. The gateway reduces the risk to information transmitted by the user, and an FPGA-based network isolation module improves security through data signatures and key management. A secured embedded Virtual Private Network (VPN) gateway is presented by Han et al. [17] to improve data transmission security with protection capability in application terminals. This VPN gateway works at the L3, L4, and L7 layers with firewall protection, VPN functions, and network isolation modules. Ajami et al. [18] presented an FPGA-based embedded network firewall that supports highly customized data packet filtering on a network gateway. The firewall is customized in real time by changing the TCP/UDP port ID, source MAC address, and source and destination IP addresses. Abuteir et al. [19] introduced a gateway design that establishes a hierarchical platform for interaction between multi-core chips over on-chip or off-chip networks. The software-based gateway supports message classification, traffic shaping, downsampling, protocol conversion, egress queuing, ingress queuing, virtual-link queuing, and serialization services.

Obermaisser et al. [20] described mixed-criticality systems for end-to-end real-time communication, which involve gateways between multiple off-chip networks and gateways between off-chip and on-chip networks. The gateway node resolves contention between source-controlled and autonomous networks and also supports end-to-end addressing and routing. A cloud storage gateway on an FPGA platform was presented by Dumitru et al. [21]. Secure data encryption and transparency are handled by the FPGA, which sits between the host and the outside interface in the cloud infrastructure. Lee et al. [22] presented a high-performance hardware-software gateway design for In-Vehicle Networks (IVN) with CAN/FlexRay controllers. Data conversion from CAN to FlexRay and vice versa is achieved using a routing table converter block with an AXI interface on a ZedBoard. Shreejith et al. [23] described a vehicular Ethernet gateway that connects multiple network protocols such as FlexRay, CAN, and Ethernet with embedded computing units. The Ethernet gateway is designed using a switch fabric between the FlexRay and Ethernet controllers, and the switch fabric is built from a crossbar switch.

An embedded gateway between a Fourth Generation (4G) mobile network and Process Fieldbus (PB) with Decentralized Periphery (DP) is described by Zhou et al. [24] on an FPGA platform. The AES algorithm is used for secure data transactions in the gateway, which converts data between the two protocols, 4G and PB. Korona et al. [25] introduced an Internet Protocol Security (IPsec) gateway for multi-gigabit networks, which includes a security association database to store security information and an Internet Key Exchange module to set up secure channels, and is responsible for all security operations with packet encapsulation. A programmable-SoC (PSoC) based cyber-physical production system (CPPS) gateway is described by Urbina et al. [26] to meet Industry 4.0 standards. The industrial network architecture includes CPPS gateways, which are interconnected with multiple peripherals and electronic and electrical devices using different network protocols such as Profinet, Profibus, and High-availability Seamless Redundancy (HSR). Kwak et al. [27] presented a trust domain gateway system to address the structural problems of the untrusted internet.

Gaps in the research: Most existing work concerns traditional software-based gateway designs, which suffer from latency and throughput limitations. Hardware-based network gateway designs use bus-based interconnections for embedded real-time applications and lack scalability and reliability. Existing research addresses protocol conversion using gateways, but not NoC based systems. To resolve these problems, a cost-effective network gateway for NoC based systems is designed.

III. NETWORK GATEWAY DESIGN

The gateway provides network access and information to the four gateway cores; its hardware architecture is shown in Fig. 1. The network gateway mainly consists of a deserializer and a serializer for receiving and transmitting data with proper synchronization, a temporary register, an electronic crossbar switch, a switch controller, a priority encoder, and four gateway cores. The gateway cores can be processors, buffers, caches, peripheral devices, and so on. In this design, FIFO buffers are used as the gateway cores.

Data arrives at the deserializer from the network, either from the network interface or from the router. The deserializer receives the data serially and works in a Serial In Parallel Out (SIPO) manner. The received 8-bit data is converted to 32-bit parallel data using shift operations, and a synchronization signal is issued to the serializer. Synchronization between the serializer and the deserializer is achieved using a counter and a proper clocking mechanism. The temporary register receives the deserialized data and holds it until the electronic crossbar can access it. This register only stores the received deserialized data, which the switch controller schedules towards the gateway cores. The electronic crossbar switch receives the temporary-register data along with the gateway core (buffer) inputs and operates under the switch controller; its hardware architecture is shown in Fig. 2.
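The SIPO behaviour described above can be illustrated with a small behavioural model in Python. This is only a sketch, not the Verilog used in the design: the function name `deserialize`, the byte order (first byte received lands in the most significant position), and the use of a count-to-four counter as the synchronization event are assumptions made for illustration.

```python
def deserialize(byte_stream):
    """Shift 8-bit inputs into a 32-bit word, SIPO style.

    Yields one 32-bit word for every four bytes received.
    Assumption: the first byte received becomes the most significant byte.
    """
    shift_reg = 0
    count = 0  # counter that marks a completed word (stand-in for the sync signal)
    for byte in byte_stream:
        if not 0 <= byte <= 0xFF:
            raise ValueError("each input must fit in 8 bits")
        # Shift the register left by one byte and append the new byte.
        shift_reg = ((shift_reg << 8) | byte) & 0xFFFFFFFF
        count += 1
        if count == 4:
            yield shift_reg
            shift_reg = 0
            count = 0


words = list(deserialize([0xDE, 0xAD, 0xBE, 0xEF, 0x01, 0x02, 0x03, 0x04]))
```

Each yielded word plays the role of the value latched into the temporary register before the crossbar reads it.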

![Fig. 1. Hardware Architecture of Network Gateway.](image-url)
The electronic crossbar switch has five multiplexers, each with five inputs: four from the gateway cores and one from the temporary register (which carries data from the network). The switch controller issues the select lines to the crossbar switch based on time slot and priority, and the crossbar generates the prioritized output. The switch controller receives five request inputs, four from the gateway cores and one from the temporary register, and works on the basis of arbitration and time slots. Its control mechanism is a Finite State Machine (FSM) that receives the input requests, grants the input with priority, and keeps the other requests in a waiting state.
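A minimal behavioural sketch of the crossbar and its controller is given below. The text does not specify the priority order or the exact time-slot scheme, so a fixed priority (temporary register first, then cores 0 to 3) and one grant per time slot are assumed here; the names `arbitrate`, `schedule`, `mux5`, and `crossbar` are hypothetical.

```python
# Index 0 is the temporary register, indices 1-4 are the four gateway cores.
SOURCES = ("temp_reg", "core0", "core1", "core2", "core3")


def arbitrate(requests, priority_order=(0, 1, 2, 3, 4)):
    """Return the index of the granted request, or None if nothing is pending.

    Assumption: fixed priority in the order given by priority_order.
    """
    for idx in priority_order:
        if requests[idx]:
            return idx
    return None


def schedule(requests, priority_order=(0, 1, 2, 3, 4)):
    """Serve pending requests one per time slot; ungranted requests keep waiting."""
    pending = list(requests)
    served = []
    while any(pending):
        grant = arbitrate(pending, priority_order)
        served.append(grant)
        pending[grant] = False
    return served


def mux5(inputs, select):
    """Five-input multiplexer: forward the input chosen by the select line."""
    return inputs[select]


def crossbar(inputs, selects):
    """Five multiplexers that all see the same five inputs; selects[i] drives mux i."""
    return [mux5(inputs, s) for s in selects]


order = schedule([False, True, False, True, True])
outputs = crossbar([10, 20, 30, 40, 50], [4, 3, 2, 1, 0])
```

In this model the list returned by `schedule` is the sequence of select values the controller would issue over consecutive time slots.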

The electronic crossbar, under the switch controller, passes the data to the priority encoder as input. The same select signal passes the priority-encoded data to the serializer, which is then ready to transmit the data from the crossbar switch to the network. The serializer converts the 32-bit parallel data to 8-bit serial data in a Parallel In Serial Out (PISO) manner with synchronization and sends it to the network. The data transmitted to the network matches the data received from it, which indicates that the designed gateway works correctly.
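The loopback property stated above (transmitted data equals received data) can be checked on a behavioural model. The sketch below assumes most-significant-byte-first ordering in both directions and uses the hypothetical helpers `serialize` and `deserialize_word`; it models only the data path, not the clocking or the crossbar.

```python
def serialize(word):
    """PISO model: split a 32-bit word into four 8-bit values.

    Assumption: the most significant byte is sent first.
    """
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("word must fit in 32 bits")
    return [(word >> shift) & 0xFF for shift in (24, 16, 8, 0)]


def deserialize_word(four_bytes):
    """SIPO model: shift four 8-bit values into one 32-bit word."""
    word = 0
    for byte in four_bytes:
        word = ((word << 8) | byte) & 0xFFFFFFFF
    return word


received = [0x12, 0x34, 0x56, 0x78]   # bytes arriving from the network
temp_reg = deserialize_word(received)  # value held in the temporary register
transmitted = serialize(temp_reg)      # bytes sent back to the network
```

If the two orderings were chosen differently, the round trip would return the bytes in reversed order, which is why the synchronization between the two blocks matters.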

IV. NETWORK GATEWAY FOR NoC SYSTEM

The network gateway is interconnected with the NoC based system and offers on-chip and off-chip data flow control and arbitration among the gateway cores of the NoC based Multiprocessing SoC (MPSoC). FPGA or ASIC devices are considered as the MPSoC chips for prototyping the network gateway with the NoC. The network gateway interconnected with the NoC is shown in Fig. 3, which is an example of a 4x4 mesh topology. It contains 16 routers and 16 network gateways with 64 processing cores, all interconnected by link wires. The architecture is flexible: data from any of the 64 processing cores can be transmitted to any of the 16 routers via the network gateways using an adaptive routing algorithm.

The network gateway with its cores is connected to a router via a network interface (NI). In this design, the mesh topology is used for the 2x2, 3x3, and 4x4 NoC architectures. In Fig. 3, the 4x4 NoC has 16 routers (R1 to R16), all interconnected by link wires. The inputs from each network gateway and its cores reach the corresponding router via the network interface, and data is transferred according to the destination router address.

The router architecture for the network gateway is shown in Fig. 4. The designed router is a congestion-free router that finds the shortest route to the destination. Each router has five-port input registers followed by packet formation, priority-based arbitration, and the adaptive XY routing algorithm. The five-port input register receives the gateway data and stores it in the local input port (Li); for the NoC, the East (Ei), West (Wi), South (Si), and North (Ni) inputs are provided to route packets to the corresponding destinations.

The 8-bit local gateway data is used for packet formation along with the user address and the request input. The packet formation for the network gateway based router is shown in Fig. 5. The packet is framed from the request, the destination address provided by the user, and the gateway input. The NoC therefore uses a 13-bit packet, which includes a 1-bit request, a 2-bit destination X address, a 2-bit destination Y address, and 8-bit gateway data.
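The 13-bit packet layout can be sketched as simple bit packing. The field widths come from the text; the bit positions (request in the most significant bit, followed by destination X, destination Y, and the data in the low byte) are an assumption based on the order in which the fields are listed, and `pack_packet` and `unpack_packet` are hypothetical names.

```python
def pack_packet(request, dest_x, dest_y, data):
    """Build a 13-bit packet: [12] request, [11:10] dest X, [9:8] dest Y, [7:0] data.

    Assumption: this bit ordering follows the order the fields are listed in the text.
    """
    if request not in (0, 1):
        raise ValueError("request is 1 bit")
    if not (0 <= dest_x <= 3 and 0 <= dest_y <= 3):
        raise ValueError("destination X and Y are 2 bits each")
    if not 0 <= data <= 0xFF:
        raise ValueError("gateway data is 8 bits")
    return (request << 12) | (dest_x << 10) | (dest_y << 8) | data


def unpack_packet(packet):
    """Split a 13-bit packet back into (request, dest_x, dest_y, data)."""
    return (
        (packet >> 12) & 0x1,
        (packet >> 10) & 0x3,
        (packet >> 8) & 0x3,
        packet & 0xFF,
    )


packet = pack_packet(request=1, dest_x=2, dest_y=1, data=0xA5)
fields = unpack_packet(packet)
```

Two address bits per axis are enough to address every router in a mesh of up to 4x4, which is the largest size considered here.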

![Fig. 2. Electronic Crossbar Switch Diagram.](image)

![Fig. 3. Network Gateway based NoC Design Architecture.](image)

![Fig. 4. Hardware Architecture of Network Gateway based Router.](image)
The framed packet, along with the four other packets from the input register, is input to the priority encoder, which works on the basis of arbitration. The arbiter takes the MSB of each of the five ports as a request and generates 5-bit grants based on priority. These grants act as the select lines of the priority encoder. The encoded output is the prioritized packet, which is sent as input to the adaptive XY routing algorithm. Each router, R1 to R16, has a fixed 4-bit current XY address that identifies it. For example, in this design R4 is set to “0011” and R14 to “1101”.
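The two example addresses are consistent with numbering the routers from zero in binary (R4 gives 3, R14 gives 13). The sketch below assumes this pattern holds for all sixteen routers and models the arbiter as a fixed-priority one-hot grant generator over the request bits; the port order and the names `router_address` and `grants` are assumptions, not taken from the design.

```python
PORTS = ("L", "E", "W", "S", "N")  # assumed priority order, local port first


def router_address(n):
    """4-bit address of router Rn, assuming Rn is encoded as n - 1 in binary."""
    if not 1 <= n <= 16:
        raise ValueError("router index must be between 1 and 16")
    return format(n - 1, "04b")


def grants(packets):
    """Return a one-hot 5-bit grant list from five 13-bit packets.

    The request is the MSB (bit 12) of each packet; the first requesting
    port in PORTS order wins (assumed fixed priority).
    """
    for i, packet in enumerate(packets):
        if (packet >> 12) & 1:
            return [1 if j == i else 0 for j in range(len(PORTS))]
    return [0] * len(PORTS)


grant = grants([0x0000, 0x19A5, 0x1001, 0x0000, 0x0000])
```

The position of the set bit in the grant list is what selects the packet forwarded to the routing stage.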

To perform the routing computation, the congestion parameters are first defined along with the destination XY address taken from the encoded packet. Adaptive XY routing is a congestion-free, adaptive form of normal XY routing [28]. The X or Y direction with the shorter routing path is determined, and the packet is steered towards the destination along the less congested direction. Based on the congestion parameters, the router finds the shortest routing path to the destination with less traffic. The network gateway based single router and the 2X2, 3X3, and 4X4 NoCs are designed and prototyped on FPGA, as explained in detail in the next section.
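One common way to realize this idea is minimal adaptive routing: only directions that reduce the distance to the destination are considered, and the less congested one is chosen. The sketch below follows that scheme under stated assumptions; the congestion metric (an integer load per output port), the axis orientation (E and N increase X and Y), and the tie-break in favour of the X direction are illustrative choices, and the design's actual congestion parameters are not given in the text.

```python
def adaptive_xy(current, destination, congestion):
    """Choose an output port for a packet at `current` heading to `destination`.

    current, destination: (x, y) router coordinates.
    congestion: dict mapping port name to a load value (assumed metric).
    Returns "L" when the packet has arrived.
    """
    cx, cy = current
    dx, dy = destination
    candidates = []
    # Only productive directions are candidates, so every hop shortens the path.
    if dx > cx:
        candidates.append("E")
    elif dx < cx:
        candidates.append("W")
    if dy > cy:
        candidates.append("N")
    elif dy < cy:
        candidates.append("S")
    if not candidates:
        return "L"
    # min() keeps the first candidate on ties, so X wins as in plain XY routing.
    return min(candidates, key=lambda port: congestion.get(port, 0))


port = adaptive_xy((0, 0), (2, 1), {"E": 3, "N": 1})
```

With equal loads on both candidate ports the function behaves like ordinary XY routing, which matches the description of adaptive XY as an extension of it.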

V. RESULTS AND PERFORMANCE ANALYSIS

This section analyzes the results and performance of the Network Gateway (NG) module and the NG based NoC using mesh topology. The NG and the NG based NoC are designed in Verilog HDL on the Xilinx platform and implemented on an Artix-7 FPGA.

A. Implementation Results

The network gateway implementation results after place and route on the Artix-7 FPGA are tabulated in Table I. The table lists the resources used in terms of slice registers and LUTs, the design operating frequency, and the total power. The NG uses 450 slice registers and 893 slice LUTs and operates at 319.642 MHz. According to the XPower Analyzer, the NG consumes 0.104 W of total power, of which 0.022 W is dynamic power.

The NG module is designed for NoC based multiprocessing SoC applications. The NG based router is designed using the adaptive XY routing algorithm. Networks of different sizes (2X2, 3X3, and 4X4) are designed using mesh topology. The chip area utilization of the NG based NoC designs is given in Table II and shown graphically in Fig. 6.

The total power (W) of the NG based NoC designs at different clock frequencies is shown in Fig. 7.

The power analysis results are generated using the Xilinx XPower Analyzer, with the ambient temperature and the initial source voltage set to 25°C and 1 V, respectively. The NG router and the NG-4X4 NoC consume 1.032 W and 1.10 W, respectively, at a 5000 MHz clock frequency. The network gateway based NoC designs are implemented effectively on FPGA and achieve good tradeoffs among chip area, speed, and power.

TABLE I. NETWORK GATEWAY RESOURCE IMPLEMENTATION RESULTS

<table>
<thead>
<tr>
<th>Resources</th>
<th>Utilized on Artix-7 FPGA</th>
</tr>
</thead>
<tbody>
<tr>
<td>Slice Registers</td>
<td>450</td>
</tr>
<tr>
<td>Slice LUTs</td>
<td>893</td>
</tr>
<tr>
<td>LUT-FF pairs</td>
<td>252</td>
</tr>
<tr>
<td>Max. Frequency (MHz)</td>
<td>319.642</td>
</tr>
<tr>
<td>Total power (W)</td>
<td>0.104</td>
</tr>
</tbody>
</table>

TABLE II. RESOURCE UTILIZATION FOR NG-NOC DESIGNS

<table>
<thead>
<tr>
<th>Area Utilization</th>
<th>NG Router</th>
<th>NG-2X2 NoC</th>
<th>NG-3X3 NoC</th>
<th>NG-4X4 NoC</th>
</tr>
</thead>
<tbody>
<tr>
<td>Slice Registers</td>
<td>470</td>
<td>603</td>
<td>823</td>
<td>1193</td>
</tr>
<tr>
<td>Slice LUTs</td>
<td>914</td>
<td>1011</td>
<td>1216</td>
<td>1551</td>
</tr>
<tr>
<td>LUT-FF pairs</td>
<td>273</td>
<td>365</td>
<td>561</td>
<td>891</td>
</tr>
</tbody>
</table>

Fig. 5. Packet formation for Network Gateway based Router.

Fig. 6. Network Gateway-NoC Designs Area utilization on Artix-7 FPGA.

Fig. 7. Total Power vs. Frequency for NG based NoC Designs.

B. Performance Evaluation

The performance of this work is evaluated using average latency and maximum throughput with respect to input traffic. Wormhole switching and a uniform traffic pattern are assumed for the analysis. The Packet Injection Rate (PIR) is defined as the number of data packets that can be sent in a single clock cycle. The average latency of the network gateway is calculated using equation (1).

\[(\text{Avg. Latency})_{\text{NG}} = \text{Min. Avg. Latency} + \text{No. of Flits} \tag{1}\]

The minimum network gateway (NG) latency is 18.5 clock cycles, and the number of flits used in the design is 8, so the average latency of the NG is 26.5 clock cycles. The average latency of the NG based NoC design is given by equation (2).

\[(\text{Avg. Latency})_{\text{NoC}} = (\text{No. of PE's} \times (\text{Avg. Latency})_{\text{NG}}) + (\text{No. of PE's} \times 2) \tag{2}\]

![Fig. 8. Average Latency vs Input Traffic for Network Gateway-NoC Designs using Mesh Topology.](image)

![Fig. 9. Max. Throughput vs Input Traffic for Network Gateway-NoC Designs.](image)

The number of processing elements (PEs) is determined by the mesh size used in the NoC. Two additional clock cycles are counted per PE, which is the time taken to forward the packet from source to destination in the NoC. The average latency of each NG-NoC design, evaluated for input traffic in terms of PIR, is shown in Fig. 8. The average latency of the NG based NoC is expressed in clock cycles.
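Equations (1) and (2) can be evaluated directly for the three mesh sizes. The short Python sketch below uses only the constants given in the text (18.5 cycles, 8 flits, 2 extra cycles per PE); the function names are hypothetical. The final PIR scaling is an inference, not something the equations state: multiplying by a PIR of 0.6 reproduces the 15.9 and 273.6 clock-cycle figures quoted in the conclusion.

```python
MIN_NG_LATENCY = 18.5   # clock cycles, from the text
NUM_FLITS = 8           # flits per packet, from the text
FORWARD_CYCLES = 2      # extra cycles counted per PE, from equation (2)


def ng_latency():
    """Equation (1): average latency of a single network gateway."""
    return MIN_NG_LATENCY + NUM_FLITS


def noc_latency(num_pe):
    """Equation (2): average latency of an NG based NoC with num_pe PEs."""
    return num_pe * ng_latency() + num_pe * FORWARD_CYCLES


def scaled_by_pir(latency, pir):
    """Assumed PIR scaling, inferred from the figures quoted in the conclusion."""
    return latency * pir


for label, num_pe in (("2X2", 4), ("3X3", 9), ("4X4", 16)):
    print(f"NG-{label} NoC: {noc_latency(num_pe):.1f} clock cycles")
```

For the 4X4 mesh, equation (2) gives 16 x 26.5 + 16 x 2 = 456 clock cycles before any PIR scaling.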

The maximum throughput of the NG based NoC is determined by the number of PEs, the gateway data width, the PIR, and the maximum operating frequency (MHz), as given by equation (3).

\[(\text{Throughput})_{\text{NoC}} = \text{No. of PE's} \times \text{Data width} \times \text{PIR} \times F_{\text{max}} \tag{3}\]

The PEs are connected to the NoC boundary via a network interface. The throughput depends on the data width used in the gateway. For example, the 4X4 NoC with 8-bit data packets connects 16 PEs and operates at the maximum frequency \(F_{\text{max}}\) of the gateway design on the Artix-7 FPGA. The maximum throughput with respect to input traffic is shown in Fig. 9. The maximum throughput of the NG router and the NG-4X4 NoC is 1.5342 Gbps and 24.548 Gbps, respectively. The maximum throughput varies with the selected data width; an 8-bit data width is used in this design.
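As a check on equation (3), the quoted throughput values can be reproduced with a few lines of Python. The sketch assumes a PIR of 0.6 (the value named in the conclusion for these figures), treats the single NG router as one PE, and uses the 319.642 MHz maximum frequency from Table I; `throughput_bps` is a hypothetical helper name.

```python
F_MAX_HZ = 319.642e6  # maximum operating frequency of the NG, from Table I


def throughput_bps(num_pe, data_width_bits, pir, f_max_hz=F_MAX_HZ):
    """Equation (3): maximum throughput in bits per second."""
    return num_pe * data_width_bits * pir * f_max_hz


router_gbps = throughput_bps(num_pe=1, data_width_bits=8, pir=0.6) / 1e9
noc_4x4_gbps = throughput_bps(num_pe=16, data_width_bits=8, pir=0.6) / 1e9

print(f"NG router:   {router_gbps:.4f} Gbps")
print(f"NG-4X4 NoC:  {noc_4x4_gbps:.3f} Gbps")
```

Under these assumptions the results agree with the reported 1.5342 Gbps and 24.548 Gbps up to the rounding of the last digit, and the linear dependence on data width is visible directly in the formula.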

VI. CONCLUSION AND FUTURE WORK

This manuscript presents an efficient and cost-effective Network Gateway (NG) model built around an electronic crossbar switch, together with its use in an NoC based system. The NG design is flexible enough to support multiple cores and is easy to prototype on on-chip devices. The NG with a single router and NoCs of different sizes using mesh topology are designed with adaptive XY routing. On the Artix-7 FPGA, the NG implementation utilizes <1% of the hardware resources and the NG based 4X4 NoC utilizes >2% of the resources. The NG operates at 319.6 MHz and consumes a low total power of about 0.104 W on the FPGA. The performance of the NG based NoC is evaluated using average latency and maximum throughput with respect to different input traffic. The average latency of the NG and the NG based 4X4 NoC design is 15.9 and 273.6 clock cycles at 0.6 PIR, respectively. The maximum throughput of the NG and the NG based 4X4 NoC design is 1.53 Gbps and 24.54 Gbps at 0.6 PIR, respectively, for an 8-bit data width. In future work, security features can be added to the network gateway and NoC based systems to protect data packets from attacks.

REFERENCES


