Open Packet Processing Acceleration

Craig Nuzzo, cnuzz2@uis.edu

Summary

The amount of data in our world is growing rapidly; that much is obvious. The behind-the-scenes impact of this growth, however, is less apparent. All of that data has to be moved by something, and the answer has always been computer networks. Over the years these networks have been refined from the inside out, moving from bare-metal systems to virtualized and then to software-defined ones. The problem the industry faces is the bottleneck of keeping up with the speed of the Internet without any new bleeding-edge hardware architecture to lean on. With Moore's Law slowing, hardware now gets only a little faster each generation, so we have to look at software instead. The volume of data is so large that we need to discuss what else can be done to accelerate the packets that move it around the Internet. This is where Xen and OpenDataPlane (ODP) come into play for the VirtuOR group. After a brief overview of the solution and a closer look at ODP, this report examines the design in detail and the results VirtuOR saw in the end.

VirtuOR addresses part of this problem by implementing a packet processing accelerator, thereby shrinking the bottleneck for data. Xen was chosen for its high compatibility with ODP. This choice was not made lightly, since ODP can drastically affect the performance of every process on a device, by as much as 89% (Braham, 2016, p. 408). The team chose to modify the existing Xen architecture (Braham, 2016, p. 408) with an implementation of the OpenDataPlane project. The new Xen architecture (shown in Figure 1) virtualizes the CPU cores by integrating them into a privileged virtual domain called the driver domain, achieving accelerated packet processing without the overhead of the physical CPU cores.

ODP is an open-source project that gives application programmers an easy-to-use programming environment for data plane applications (OpenDataPlane, 2014). It achieves this by providing common APIs, utilities, and configuration files for the underlying hardware; the goal of ODP is a data plane application framework that spans many different platforms. The accelerated packet processing described in this report uses ODP's Pull Model packet processing scheme. The Pull Model (shown in Figure 2) organizes packets through a scheduler function, and its advantage is that desired packets can be prioritized for faster processing in the long run (Braham, 2016, p. 409). All of this depends on the number of threads, since ODP scales with how many CPU cores are allocated to the application. Each thread uses all the resources available to it to accelerate packets, so processing speed is controlled by the number of allocated cores, or in this case the number of threads launched by the application.

All of these ideas are placed into a Linux virtual machine inside the driver domain of the new Xen architecture (Figure 1). The driver domain is responsible for adding or removing the virtual CPU cores used by ODP, which is what achieves the accelerated packet processing. ODP then launches threads corresponding to the number of virtual cores in the driver domain, and those threads accelerate the processing of packets without loading up the physical CPU. A minimal sketch of one such worker appears below.
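To make the Pull Model concrete, here is a rough C sketch of an ODP worker, offered as an illustration rather than as VirtuOR's actual code: it creates one scheduled queue at high priority, then pulls events from the ODP scheduler one at a time, with one such loop running per worker thread, i.e. per virtual core handed to ODP. Exact header and constant names vary across ODP releases, and process_packet() is a hypothetical placeholder.

    #include <odp_api.h>

    /* Hypothetical per-packet work; not part of the ODP API. */
    static void process_packet(odp_packet_t pkt) { (void)pkt; }

    /* Create a scheduled queue served at the highest priority -- this is
       how the Pull Model lets desired flows jump ahead of others. */
    static odp_queue_t create_fast_queue(void)
    {
        odp_queue_param_t qp;

        odp_queue_param_init(&qp);
        qp.type        = ODP_QUEUE_TYPE_SCHED;
        qp.sched.prio  = odp_schedule_max_prio();
        qp.sched.sync  = ODP_SCHED_SYNC_ATOMIC;
        qp.sched.group = ODP_SCHED_GROUP_ALL;
        return odp_queue_create("fast-path", &qp);
    }

    /* One loop per worker thread / virtual CPU core: the thread *pulls*
       the next event from the scheduler instead of having packets
       pushed to it. */
    static void worker_loop(void)
    {
        for (;;) {
            odp_event_t ev = odp_schedule(NULL, ODP_SCHED_WAIT);

            if (odp_event_type(ev) != ODP_EVENT_PACKET) {
                odp_event_free(ev);
                continue;
            }
            odp_packet_t pkt = odp_packet_from_event(ev);
            process_packet(pkt);
            odp_packet_free(pkt);
        }
    }

Because the queue is of type ODP_QUEUE_TYPE_SCHED, the scheduler, not the application, decides which queue feeds each worker next, which is what allows the prioritization described above.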
The beauty of virtualized CPU cores is that adding more of them has no influence on the underlying physical CPU of the system (Braham, 2016, p. 410). Replacing the physical CPU cores with virtual CPU cores in the driver domain is the crux of VirtuOR's solution. In the end, the use of ODP brought several advantages, including but not limited to: 1) compatibility with the majority of NICs and drivers on the market, and 2) classification of different packet flows with ODP functions, enabling better prioritization of packets for monitoring.
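For a sense of how this elasticity looks in practice, note that stock Xen already lets an administrator change a running domain's virtual CPU count through the xl toolstack, for example:

    xl vcpu-set driver-domain 4

Here "driver-domain" is a placeholder domain name, and the new count must stay within the maxvcpus limit in the domain's configuration file. The paper's driver domain automates this kind of adjustment on behalf of ODP; the exact mechanism VirtuOR uses is not spelled out in the summary above.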

The real-life implementation by VirtuOR is within their Metamorphic Networks (M-Net) platform, which can dynamically create, remove, or move VMs within the Xen environment (shown in Figure 3). M-Net uses the TRILL protocol, connected through a wired network of two physical nodes. TRILL provides simple forwarding and speed, since it calculates the shortest path using a combination of the IS-IS protocol and Dijkstra's algorithm (Braham, 2016, p. 410). All traffic going to other domains is then managed by the driver domain and ODP.

The solution was tested on two M-Net devices, each equipped with a 2.5 GHz Intel Core 2 Duo processor and four Intel 82571EB Gigabit Ethernet cards (Braham, 2016, p. 411), running a proprietary Linux distribution developed by VirtuOR that contained the new Xen architecture. The evaluation parameters were: maximum reached throughput, number of processed packets, bandwidth use percentage, and the use percentage of the virtual and physical CPU resources on both architectures (Braham, 2016, p. 411). As shown in Figure 4, packet processing in the new architecture gains 15% once more than one virtual CPU core is in use. The throughput evaluation concluded that 958 Mbit/s is achievable; on a Gigabit Ethernet link that works out to roughly 95% bandwidth use, a level reached with two virtual CPU cores in the new architecture. The team observed that the only CPU resources used for packet processing were virtual ones: 89% utilization in the new architecture versus 9.4% in the old (Braham, 2016, p. 412). For future work, the VirtuOR team hopes to compare its solution against other packet processing accelerators in the industry.

Future Work

One of the main reasons this paper was chosen is that it exemplifies the open-source community: by bringing together multiple open-source solutions, a new one is born. The team at VirtuOR combines three major open-source projects, the Xen Project, the Linux kernel, and OpenDataPlane, and together these allow a faster packet processing solution inside their own products. Open code is on the rise, and we see more of it than ever before: Microsoft has recently joined the Linux Foundation and has opened up its .NET platform, and many more companies are uploading their code to places like GitHub for the public to see. This growth will only help packet processing and software-defined networking speed up the Internet further; the collaboration is becoming a healthy solution for the networking industry.

Two related open-source topics are cloud computing and graphics processing, and integrating those platforms may help the software-defined networking side of packet processing. The cloud has become a popular option for the modern IT department, giving it the ability to concentrate on improving its code without the physical overhead of running in-house servers. An implementation like the one in this paper would most definitely improve those services, not only for the cloud provider's business but also for clients, if they can use software-defined accelerated packet processing as an option or by default. This would be a lucrative arrangement for both parties.

Another consideration is to take advantage of graphics processing units (GPUs). Modern GPU architectures offer quite high computational throughput and very efficient memory access. GPUs in this application would bring benefits on both the software and the hardware side; a GPU is inexpensive and more readily available than many CPUs. Even more impressive is the fact that the 60-200 ns of latency involved in retrieving data from main memory can be hidden (Kalia, Zhou, & Andersen, n.d.). Paired with code written in CUDA or OpenCL, this would do wonders for a project like the one Metamorphic Networks is conducting. A short sketch of the latency-hiding idea appears below.
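The Kalia, Zhou, and Andersen paper cited above attributes much of the GPU's edge to keeping many memory accesses in flight at once, and shows that a CPU can hide the same DRAM latency with software prefetching over a batch of packets. The following C sketch illustrates that batching idea only; the fwd_entry table, direct key indexing, and BATCH size are illustrative assumptions, not details from either paper.

    #include <stddef.h>
    #include <stdint.h>

    #define BATCH 16                      /* illustrative batch size */

    struct fwd_entry { uint32_t next_hop; };

    /* Look up a batch of keys against a forwarding table. Pass 1 starts
       all the DRAM fetches back to back; pass 2 then reads each entry,
       by which time its cache line has had the rest of the batch's
       worth of time to arrive from memory. (__builtin_prefetch is a
       GCC/Clang builtin.) */
    static void lookup_batch(const struct fwd_entry *table,
                             const uint32_t keys[BATCH],
                             uint32_t next_hops[BATCH])
    {
        size_t i;

        for (i = 0; i < BATCH; i++)       /* pass 1: prefetch only */
            __builtin_prefetch(&table[keys[i]], 0, 1);

        for (i = 0; i < BATCH; i++)       /* pass 2: dependent reads */
            next_hops[i] = table[keys[i]].next_hop;
    }

Overlapping the fetches this way is what "removes" the 60-200 ns per access that the text refers to: the latency is still paid, but once for the whole batch rather than once per packet.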

Accelerated packet processing may not be the only part of the OSI model that can be virtualized, and research into virtualizing other aspects of computer networking should be considered. Academia and enterprise already use network virtualization not only to teach networking concepts but to apply them to real-world solutions; we see this in software-defined networking implementations already. The same idea could be used to sandbox certain aspects of networking in order to contain the otherwise inevitable damage of cyber attacks: modular code would help IT departments avoid unnecessary exposure by letting them remove and replace networking components from a software control panel, or at the command prompt itself.

Continued research on, and implementation of, accelerated packet processing is more important now than ever. Companies should treat it as a serious consideration as their network stacks are overrun by massive amounts of data. As 4K video is pushed out into the wild, video streaming services should look to implement some of the aforementioned ideas. That would do us all some good.

Citations

Rabia, T., Braham, O., & Pujolle, G. (2016). Accelerating packet processing in a Xen environment with OpenDataPlane. 2016 IEEE 30th International Conference on Advanced Information Networking and Applications (AINA), 408-413.

OpenDataPlane introduction and overview [An in-depth introduction to OpenDataPlane]. (2014, January).

Kalia, A., Zhou, D., & Andersen, D. G. (n.d.). Raising the bar for using GPUs in software packet processing. Carnegie Mellon University and Intel Labs.

Figures and Tables

Figure 1: The new Xen architecture, with virtual CPU cores integrated into the driver domain.
Figure 2: The ODP Pull Model and its scheduler.
Figure 3: Dynamic creation, removal, and movement of VMs in the M-Net platform.
Figure 4: Evaluation results comparing the new and old architectures.