In a building, each floor depends on the strength of the floors below it. Ensuring that the fifth floor is reinforced provides very little comfort if there is a structural problem on the third floor. In order to ensure that the building keeps standing, you need to reinforce every floor starting from the bottom.

And so it is with protocol stacks. Providing security at the transport layer (e.g., TLS) has questionable value if the packet exchange is compromised at the network or data link layer. Yet, we rarely worry about protecting these layers.

This new technical brief from Northforge describes some of the common attacks that can occur at the data link layer and how MAC-layer Security or MACsec (IEEE std 802.1AE™) can be used to provide hop-by-hop or end-to-end authentication and encryption to protect the lowest floors of your protocol building.

For a technical brief on MACsec, download here.

High performance I/O from a NIC

XEN and DPDK can be complementary. Clearly, the DPDK EAL can be used within a VM running on top of the XEN hypervisor to provide high-performance I/O from a NIC. However, XEN and DPDK can also be alternatives for implementing a solution to the same problem. Consider a system that needs to perform the following sequence of functions: Rx, Proc, and Tx, as in the general packet processing model described in this series.

Using DPDK in a single VM, this could be decomposed as follows:

The solution can be implemented natively on top of XEN in multiple VMs like this:

The world of using virtual machines to implement virtualized network functions is still new. Each application is different, whether administratively, functionally, or in its performance requirements, and therefore each will need an implementation that addresses its specific requirements. DPDK and XEN are two of the tools that can assist with these design problems.

If you need help designing and implementing your VNFs, call Northforge Innovations. This is what we do.

By Larry S.

Implementing DPDK and Xen

With DPDK, packet processing is performed at the application layer in the virtual machine. Receive processing is based on polling the receive interface (using the EAL) rather than on interrupts. Interrupts require a fair amount of overhead in the “normal” case, but when interrupts must be propagated from the host operating system to the hypervisor to the guest operating system to the application, they end up being very expensive.
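To make the polling model concrete, here is a self-contained sketch of a DPDK-style poll-mode receive loop. The real DPDK call is rte_eth_rx_burst(); stub_rx_burst() below is an illustrative stand-in that fabricates packets so the sketch compiles and runs without a NIC, and the other names are ours, not DPDK's.

```c
#include <stdint.h>

#define BURST 32

/* Stand-in for DPDK's rte_eth_rx_burst(port, queue, pkts, n): fills in up
 * to n packet pointers and returns how many it filled. This stub fabricates
 * 100 fake packets and then reports an empty queue, so the sketch runs
 * without a NIC. */
static uint16_t stub_rx_burst(void **pkts, uint16_t n)
{
    static int budget = 100;
    uint16_t got = 0;
    while (got < n && budget > 0)
        pkts[got++] = (void *)(uintptr_t)budget--;
    return got;
}

/* Poll-mode receive loop: no interrupts, just repeatedly ask the NIC for a
 * burst of packets and process whatever arrived. Returns the number of
 * packets processed. */
static unsigned long poll_loop(void)
{
    void *pkts[BURST];
    unsigned long total = 0;
    for (;;) {
        uint16_t n = stub_rx_burst(pkts, BURST);
        if (n == 0)
            break;              /* a real loop would keep spinning */
        for (uint16_t i = 0; i < n; i++)
            total++;            /* process_packet(pkts[i]) would go here */
    }
    return total;
}
```

The key point is that the cost per packet is just a function call and a loop iteration; there is no interrupt delivery path through host OS, hypervisor, and guest OS.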

With DPDK, threads communicate through shared memory queues. DPDK provides lockless ring buffers (no locks or blocking; producers and consumers coordinate through atomic head and tail indices) that support single and multiple writers as well as single and multiple readers. This library is called rte_ring.
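To show the idea behind such rings, here is a minimal single-producer/single-consumer lockless ring. DPDK's rte_ring generalizes this to multiple producers and consumers with a more elaborate protocol; the names and structure below are illustrative, not the DPDK API.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Minimal single-producer/single-consumer lockless ring. Head/tail are
 * coordinated with atomic loads and stores instead of locks; DPDK's
 * rte_ring generalizes this idea. */
#define RING_SIZE 8   /* power of two, so index wrap is a cheap mask */

struct spsc_ring {
    void *slots[RING_SIZE];
    _Atomic size_t head;   /* next slot the producer writes */
    _Atomic size_t tail;   /* next slot the consumer reads  */
};

/* Returns 0 on success, -1 if the ring is full. Producer side only. */
static int ring_enqueue(struct spsc_ring *r, void *obj)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SIZE)
        return -1;
    r->slots[head & (RING_SIZE - 1)] = obj;
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 0;
}

/* Returns 0 on success, -1 if the ring is empty. Consumer side only. */
static int ring_dequeue(struct spsc_ring *r, void **obj)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail)
        return -1;
    *obj = r->slots[tail & (RING_SIZE - 1)];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 0;
}
```

The acquire/release pairing is what lets one thread enqueue while another dequeues without any mutex.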

There’s a better approach for increased performance and efficiency

For Rx (Receive Processing), the NIC (Network Interface Card) performs a Direct Memory Access (DMA) transfer to a buffer ring. If the NIC supports Receive Side Scaling (RSS), it can queue packets to different threads on different cores based on packet filters set up on the NIC. This increases packet processing performance by spreading the work over multiple cores.

The Proc function (Processing) can be scaled by decomposing it into a sequence of steps that can be organized as a pipeline. For example, if the processing is organized as three sequential steps, the three threads can be assigned to different cores, and once the pipeline is full the system is, in effect, working on three packets simultaneously. There are a couple of different models for this depending on whether RSS is being used (RSS is shown in the top example in the following diagram).

[Figure 2: Implementing DPDK and Xen]

The Tx (Transmit Processing) is initiated by putting the packet on a shared queue. The Tx process can then transfer the packet to the NIC for transmission.

DPDK is intended for a solution where all of the threads are running on the same Virtual Machine.

Packet Processing with XEN

Not all packet processing solutions are designed to run on a single Virtual Machine. There are administrative reasons for splitting the system across multiple virtual machines. For example, if the packet stream represents multiple customers, then it might be desirable to split the processing across multiple VMs to provide separation and protection between customers as well as facilitating billing. There are also functional reasons for splitting the system across multiple virtual machines. For example, if the server is providing both a client-focused capability (such as DHCP) and also a network service such as an IP Router, then running these on different VMs makes sense.

DPDK can provide performance benefits within a single VM, but splitting the processing across VMs is a bit more problematic since each VM appears as an independent and self-contained machine, each with its own memory. Providing communications and data movement between these VMs could be done using a networking function (i.e. transmitting and receiving packets through a virtual switch), but there are performance problems with this. Using shared memory between processes is much faster.

This is where XEN can play a part. XEN provides inter-VM shared memory using a page grant mechanism. The XEN hypervisor runs beneath all of the virtual machines; its privileged control domain, Domain 0 (Dom0), has access to the hardware page tables and memory management functions, while guests run in unprivileged user domains (DomU). A process in Dom0 (e.g., a NIC driver) or in a DomU (e.g., an application) can share a page by making a request to XEN. XEN enters the page into a grant table and returns a handle (grant reference) for the page. The handle can then be provided to a process running in another VM to grant it access to the page; the handle is typically exchanged through XEN's shared configuration store, called "xenstore".

XEN provides support for non-blocking ring buffers on top of its shared-page mechanism, similar in spirit to DPDK's rte_ring. These are called xenstore-rings.
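To give a feel for page sharing without requiring a Xen installation, here is a rough user-space analogue using POSIX shared memory between processes rather than grant tables between VMs. The shared-memory name stands in for a grant reference brokered by the hypervisor; all names are illustrative, and this is not the Xen API.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Rough analogue of XEN page granting: one side creates a named page and
 * writes into it; the other side opens the same name and maps the same
 * physical page. Under XEN, the "name" would be a grant reference handed
 * out by the hypervisor. */
static int grant_roundtrip(const char *msg, char *out, size_t outlen)
{
    const char *name = "/demo_grant_page";    /* stands in for a grant ref */
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, 4096) < 0) {            /* one page */
        close(fd); shm_unlink(name); return -1;
    }
    char *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (page == MAP_FAILED) {
        close(fd); shm_unlink(name); return -1;
    }

    /* The "granting" side writes into the shared page... */
    strncpy(page, msg, 4095);

    /* ...and the "granted" side (the same process here, for brevity; a
     * second process would shm_open() the same name) reads it back. */
    strncpy(out, page, outlen - 1);
    out[outlen - 1] = '\0';

    munmap(page, 4096);
    close(fd);
    shm_unlink(name);
    return 0;
}
```

The point of the analogy is that data moves between the two sides with no packet transmission at all, which is why shared pages beat a virtual switch for inter-VM data movement.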

Implementing DPDK and Xen fig 3

In the final post of the series, we compare and summarize the two solutions: Comparing XEN and DPDK Solutions – A Summary.

 

By Larry S.

In general, packet processing applications follow a standard regimen:

  •     Receive a packet (Rx)
  •     Process a packet (Proc)
  •     Transmit a packet (Tx)

The Rx part is, more or less, the same regardless of the type of packet processing. The Proc part is the heart of the application.

[Figure: Use multiple cores in your general packet processing model to enhance performance]

This could be an Ethernet Switching application, an IP Routing application, Deep Packet Inspection (DPI), or a protocol process such as DHCP.

In gateway applications (routing and switching), the Tx usually uses a different interface than the Rx, whereas with a protocol process the Tx is usually a response to the sender and therefore uses the same interface as the Rx.

The performance requirements vary from application to application, but a 1 Gbps Ethernet link can carry approximately 1.5 million (minimum-size) packets per second, so the requirements can be very steep.

One approach to achieving high performance is to employ multiple cores. For Rx, this can be done by using a hardware capability called Receive Side Scaling (RSS). With RSS, the Network Interface Card (NIC) can queue packets to different threads on different cores based on fields in the packet. For Proc, this can be done by breaking the processing into multiple tasks and creating a pipeline. Tx usually requires very little processing, so there is little motivation for partitioning it.
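The core of RSS is a deterministic hash over packet fields. Real NICs compute a Toeplitz hash over the 5-tuple with a configurable key; the toy FNV-style mix below just shows the idea that every packet of a flow hashes to the same value, pinning the flow to one receive queue and hence one core. All names here are illustrative.

```c
#include <stdint.h>

/* Sketch of RSS-style receive queue selection over a 5-tuple. */
struct flow {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
};

static uint32_t flow_hash(const struct flow *f)
{
    uint32_t h = 2166136261u;                 /* FNV offset basis */
    uint32_t fields[5] = { f->src_ip, f->dst_ip,
                           f->src_port, f->dst_port, f->proto };
    for (int i = 0; i < 5; i++)
        h = (h ^ fields[i]) * 16777619u;      /* FNV prime */
    return h;
}

/* Map a flow onto one of n_queues receive queues (one per core). */
static unsigned rx_queue_for(const struct flow *f, unsigned n_queues)
{
    return flow_hash(f) % n_queues;
}
```

Because the mapping depends only on the flow, packets of one TCP connection never reorder across cores, while distinct flows spread out statistically.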

[Figure: Use multiple cores in your general packet processing model to enhance performance]

By effectively using multiple cores, the packet processing performance of the system is greatly enhanced because the system is processing multiple packets simultaneously.

Based on the general packet processing model described here, we will next show how this model can be implemented using DPDK and XEN: Packet Processing with DPDK and XEN

 

By Andrei C.

No host OS + paravirtualization support = performance improvement

XEN is a hypervisor. A hypervisor is a supervisory program (think: operating system) that provides support for virtual machines. Parallels, VMware (Dell), and VirtualBox (Oracle) are all hypervisors. They provide an environment that hosts a number of processes (virtual machines), where each virtual machine believes it is running on the underlying hardware. Each virtual machine contains a guest operating system (e.g., Windows, macOS, Linux) and one or more processes/applications running within the guest operating system. Each of these hypervisors sits on top of a host operating system (e.g., Windows, macOS, Linux).

It is common to run a hypervisor like VirtualBox on a Mac and load one or more Windows virtual machines in order to run applications that only run on Windows.

XEN is different in several ways from the three hypervisors listed above. First, it is a "bare metal" hypervisor. It runs at the lowest level, right on top of the hardware; there is no host operating system. As you'd expect, this improves performance and efficiency.

Second, XEN supports paravirtualization. With paravirtualization, XEN provides APIs for many system functions and the guest operating system in the virtual machines can be rebuilt to access system resources and the I/O subsystem via this API.

Several Linux distributions have been rebuilt to use the XEN interface. This can improve performance and also provides stronger support for virtual machines on CPUs that don't have good hardware VM support (some x86s, older ARMs, etc.). XEN also supports accessing the underlying hardware like the other hypervisors do (this is called Hardware Virtual Machine, HVM). The final difference is that XEN is open source.

DPDK

One of the major problems with implementing network processing applications in virtual machines is implementing high performance I/O. Packet processing applications often need to process tens of thousands or even millions of packets per second. This is difficult at the application layer in general, but even more difficult when the system I/O calls are made to a guest operating system which has to access the hardware through the hypervisor and the host operating system.

The Data Plane Development Kit (DPDK) is a solution to this problem. DPDK is a framework that provides for creating software libraries tailored for specific hardware architectures (e.g., x86) and specific operating systems (e.g., Linux). These libraries (called the Environment Abstraction Layer, or EAL) provide high-performance, generic (i.e., hardware- and OS-independent) access to hardware and operating system resources, including the I/O subsystem.

Using the DPDK EAL allows the development of high performance user-mode packet processing applications which can also be tuned to exploit multi-core CPUs.

Next, we provide an overview of a general packet processing model: General Packet Processing Model

 

By Larry S.

 

First in a five-part series

It is increasingly common to implement packet processing functions in virtual machines. This is what Network Functions Virtualization (NFV) is all about.

The most common implementation model for network functions has been to replicate the functions in the devices distributed around the network. This is the easiest way to do it, but taking a step back, it becomes clear that it is not the most efficient way, from either a management or an overall efficiency point of view.

Consider DHCP, for example. A network could have 50 edge routers running DHCP, but there is no reason why DHCP has to run in each edge router. The traffic load associated with DHCP is small, so it makes sense to run each of the 50 DHCP instances as a Virtual Machine (VM), or possibly as a thread of a DHCP VM, on a centralized server. There is an efficiency benefit in statistically multiplexing the computational load (all 50 instances are never running at the same time) and several management benefits, such as only having to update a single system to fix bugs and add features, and only having to configure a single local system. Additionally, modern cloud computing technology provides the ability to migrate VMs for redundancy and to cloud-burst for unexpected load peaks.

NFV requires high-performance software-based implementations of these packet processing functions in a virtual machine environment. These implementations must be able to read packets from the network, process them, and send packets out to the network. There are a variety of tools and techniques for implementing packet processing in Virtual Machines. In this blog series we will discuss two common, but very different approaches. One is the XEN Hypervisor and the other is the Data Plane Development Kit (DPDK). Depending on the functionality required, these two solutions can be used independently or concurrently. Over a sequence of blog posts we will describe these two approaches and how they can be used to implement high-performance packet-processing software.

In the next blog post we provide a brief overview of the XEN hypervisor and DPDK.

-Larry S.

Are you looking to reduce costs and improve latency? The Multi-Stage Content Aware Engine of the Broadcom StrataXGS allows additional packet inspection and manipulation to be performed in the switching chip rather than in an external device such as an FPGA, an NPU, or software on a host processor. Because packet inspection and the handling decisions based on it never leave the chip, you get both cost reduction and lower packet processing latency.

This content aware engine capability enables customers to replace FPGAs, redirect traffic to applications, and develop diagnostic tools. The Broadcom XGS Content Aware Engine provides the following capabilities:

  • Dropping frames that are identified as potential network security risks, such as DDoS packets
  • Forwarding select control packets (IGMP, OAM, etc.) to the CPU
  • Assigning a new priority, VLAN ID, or VRF to a selected traffic stream
  • Counting or metering an ingress flow across multiple ports
  • Redirecting a select flow to a new egress port
  • Redirecting or mirroring traffic based on the egress port
These capabilities are applied at different stages of the pipeline: before the L2 lookup (VCAP), at the end of the ingress stage (ICAP), and at the end of the egress stage (ECAP). Broadcom implements them with TCAMs in the chipset.

 

The Content Aware Engine can be further classified in the following phases:

  • Selection phase: packets are matched and selected per the configured rules/entries
  • Action phase: per the configured rules, the packets are subject to the following actions:
    • Drop the packet
    • Redirect the packet
    • Copy the packet to the host CPU
    • Modify the packet
    • Note: In some cases, more than one action is required. Broadcom allows actions to be combined, such as dropping the packet and also copying it to the host CPU.
  • Statistics phase: the user can enable the statistics feature so that the engine counts the number of packets and bytes processed by the corresponding rule
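The selection phase above boils down to ternary matching in a TCAM. The toy model below illustrates that: each entry is a value plus a care-mask (bits where the mask is 0 are wildcards) and carries an action, with the first hit winning, as in hardware priority order. The names are illustrative, not the Broadcom SDK API.

```c
#include <stdint.h>

/* Toy model of TCAM entries behind a content-aware-engine stage. */
enum cae_action { ACT_PERMIT, ACT_DROP, ACT_COPY_TO_CPU, ACT_REDIRECT };

struct tcam_entry {
    uint32_t value;        /* e.g. a destination IP to match    */
    uint32_t mask;         /* 1-bits must match, 0-bits ignored */
    enum cae_action act;
};

/* A TCAM returns the highest-priority hit; modelled here as the first
 * matching entry in the array, with a default action on no hit. */
static enum cae_action tcam_lookup(const struct tcam_entry *t, int n,
                                   uint32_t key)
{
    for (int i = 0; i < n; i++)
        if ((key & t[i].mask) == (t[i].value & t[i].mask))
            return t[i].act;
    return ACT_PERMIT;
}
```

In hardware, all entries are compared in parallel in a single clock, which is why these match-and-act decisions add essentially no latency.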

At Northforge, we can help you program the content aware engine to your application's requirements and speed up your development process by implementing the required functionality on the Broadcom chipset. We have hands-on experience helping customers replace their FPGAs and develop applications such as IPTV with the Broadcom content aware engine, meeting their complex deployment scenarios in the field.

-Manohar R.

Let’s say you want to develop a service where a SmartHome monitoring system detects smoke and a phone call is generated to alert the homeowner. With the ability to receive external events from any monitoring software, there’s free, open-source communications software for creating voice and messaging products that can do exactly that.

FreeSWITCH is a softswitch for PBX applications that can create that phone call alert and then connect the homeowner to the 911 operator. It allows you to generate calls for automation systems: you play audio files, collect user input, and then decide whether to make another call and have two parties talk to each other. FreeSWITCH can also receive events from external programs, such as the home monitoring system, and can generate calls remotely. Its built-in Interactive Voice Response (IVR) system lets you play a pre-recorded audio file, collect user input (usually DTMF digits), and make decisions: play something else, generate another call, hang up, or basically anything else within the phone system. Some of this functionality is provided by external systems that are integrated using FreeSWITCH modules.
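As a sketch of how such an IVR might look, here is a hedged FreeSWITCH dialplan fragment of the kind that lives under conf/dialplan/. The extension number, audio file names, and gateway name are all illustrative assumptions, not part of any real configuration:

```xml
<!-- Hypothetical extension: answer, play a smoke-alert prompt, collect one
     DTMF digit into ${alert_choice}, then bridge onward. The extension
     number, file names, and the "pstn" gateway are made up. -->
<extension name="smoke_alert_ivr">
  <condition field="destination_number" expression="^9100$">
    <action application="answer"/>
    <action application="sleep" data="500"/>
    <!-- play_and_get_digits arguments: min max tries timeout terminators
         prompt-file invalid-file variable-name digit-regex -->
    <action application="play_and_get_digits"
            data="1 1 3 5000 # ivr/smoke_alert.wav ivr/invalid.wav alert_choice \d"/>
    <!-- a real dialplan would branch on ${alert_choice}; here we simply
         bridge the caller to an operator -->
    <action application="bridge" data="sofia/gateway/pstn/911"/>
  </condition>
</extension>
```

An external program (the monitoring system) would trigger this flow by originating a call to the extension, for example over FreeSWITCH's event socket.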

There’s more to FreeSWITCH than this one use case. You can use FreeSWITCH to develop text-to-speech services where you turn any text into an audio file and then play it, or have your audio files professionally pre-recorded and play them while still receiving user input. Or the reverse: use it for speech recognition, such as when you want cell phone voicemail messages to also appear in text format. It’s very suitable for voicemail and eFax applications. FreeSWITCH runs on Linux and Windows, and supports English, French, German, and Russian.

FreeSWITCH is quite popular with telecommunications software developers who need to create automated telemarketing programs. Using FreeSWITCH as a base, you can generate automated telemarketing calls where FreeSWITCH generates the call, plays a pre-recorded audio file, and then hangs up.

It can be used to develop services that receive incoming calls into FreeSWITCH as one leg of the call, play an audio file, and, based on what the user wants to do, connect to another application, such as voicemail. Or create a service where, when users receive a voicemail, they also receive an email with an attached audio file telling them that a voicemail is waiting.

Our Northforge team has created voice and messaging services based on FreeSWITCH for several customers who want to expand or start new services. Based on our experience with this open-source software, we can get our customers’ telecom programs up and running quickly.

Infrastructure-less Network

In the last few years, mobile communications have dramatically increased in popularity and usage. This growth has inspired the development of advanced communication protocols offering higher throughput and reliability over wireless links.

Much of wireless technology is based on the principle of direct point-to-point communication, where participating nodes “speak” directly to a centralized access point.

However, there is an alternative, “multi-hop” approach, where nodes communicate with each other using other nodes as relays for traffic when the endpoint is out of direct communication range.

Mobile Ad hoc NETwork (MANET), described here, uses the multi-hop model.

Wikipedia describes MANET (Mobile Ad Hoc Network) as a continuously self-configuring, infrastructure-less network of mobile devices connected wirelessly.

All the nodes (devices) are wireless, mobile and equal (no access points, base stations, or any other kind of infrastructure).

The best analogy is a cellular network WITHOUT base stations, where all the phones need to create a multi-hop mesh network.

     Infrastructure-based Network (i.e. Cellular):

[Figure: Infrastructure-based Network]

     Infrastructure-less Network (MANET – Mobile Ad hoc Network):

 

[Figure: Infrastructure-less Network (MANET – Mobile Ad hoc Network)]

These networks are self-configuring and can be set up randomly and on demand. Such networks can have dynamically changing multi-hop topologies, typically composed of bandwidth-constrained wireless links.

The concept of the mobile ad-hoc network implies the incorporation of routing functionality into the mobile nodes; in other words, all nodes should be able to act as routers for each other.

Need

Since an infrastructure-based network always outperforms an infrastructure-less network, MANET is relevant only in cases where laying the infrastructure is impossible or impractical:

  • Natural disasters: for rescue forces
  • Remote areas / difficult terrain, e.g., pit mines, tunnels, mountains, deserts, and so on
  • Military, paramilitary, rescue, anti-terror forces
  • Others: Vehicular ad hoc networks, distributed sensor network, smartphones ad hoc network…

 

MANET – Layer-3 Routing Core

Ad-hoc networks are not restricted to special hardware or a certain link layer. MANET is a routing core (Layer-3 routing protocols) running on top of any possible Layer-2 wireless medium that is able to provide connectivity between the neighboring (1-hop) nodes:

[Figure: MANET – Layer-3 Routing Core]

It is important to note the difference between MANET routing and traditional IP routing. Routing in fixed networks is based on address aggregation combined with best-prefix matching: when a packet is to be forwarded, the routing table is consulted and the packet is transmitted on the interface registered with the route containing the best match for the destination, i.e., all hosts within the same subnet are reachable on a single one-hop network segment. In MANETs, however, a node typically has a single wireless interface and forwards traffic by retransmitting packets on the same interface they arrived on.

Aggregation is not required in MANETs: all routing is host-based, and a sender has a specific route for every destination within the MANET.

There are two principal approaches for route maintenance in MANET – reactive and proactive:

  • Reactive routing protocols set up traffic routes on-demand (examples – Ad hoc On-demand Distance Vector, Dynamic Source Routing)
  • Proactive routing protocols dynamically maintain a full understanding of the topology (examples – Optimized Link State Routing Protocol, Babel)

 

Northforge implemented the Optimized Link State Routing Protocol (OLSR). OLSR is an IP routing protocol optimized for mobile and wireless ad hoc networks. The protocol was integrated into a commercial routing stack suite. OLSR is documented in RFC 3626 and uses the link-state scheme in an optimized manner to propagate topology information. The optimization is based on a technique called MultiPoint Relaying (MPR).

OLSR operation mainly consists of updating and maintaining information in routing tables. The data in these tables is based on received control traffic and the control traffic is generated based on information retrieved from the tables.
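The MPR optimization mentioned above is essentially a set-cover problem: elect a minimal set of 1-hop neighbors that together reach all 2-hop neighbors, so only the elected relays retransmit flooded control traffic. The sketch below uses a simple greedy pass (RFC 3626 specifies a more elaborate heuristic) with neighbor sets as bitmasks, and relies on the GCC/Clang __builtin_popcount intrinsic; all names are illustrative.

```c
#include <stdint.h>

/* Greedy MultiPoint Relay selection sketch. Bit j of covers[i] means
 * 1-hop neighbor i reaches 2-hop neighbor j; twohop_set is the set of
 * all 2-hop neighbors to cover. Returns a bitmask of elected MPRs. */
static uint32_t select_mprs(const uint32_t *covers, int n_onehop,
                            uint32_t twohop_set)
{
    uint32_t mprs = 0, uncovered = twohop_set;
    while (uncovered) {
        int best = -1, best_gain = 0;
        for (int i = 0; i < n_onehop; i++) {
            /* gain = how many still-uncovered 2-hop nodes i would cover */
            int gain = __builtin_popcount(covers[i] & uncovered);
            if (gain > best_gain) { best_gain = gain; best = i; }
        }
        if (best < 0)
            break;                 /* remaining 2-hop nodes are unreachable */
        mprs |= 1u << best;        /* elect neighbor `best` as an MPR */
        uncovered &= ~covers[best];
    }
    return mprs;
}
```

Because only MPRs forward topology broadcasts, flooding overhead drops sharply in dense networks, which is the optimization OLSR is built around.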

A general MANET network is illustrated below:

 

[Figure: A general MANET network]

Key to Success: Network Simulation

Network simulation is designed for characterizing, creating, and validating communication solutions, computer networks, and distributed or parallel systems. It enables the prediction of network behavior and performance. One can create, run, and analyze any desired communication scenario.

Generally, simulation is the only method that allows continuous development, testing, and debugging of a network comprising hundreds or thousands of mobile MANET nodes, since a standard lab won’t do, and field tests are expensive, difficult to operate, and non-deterministic.

One of the challenges during the development was testing OLSR with various topologies, e.g., two nodes, three 1-hop neighbors, or a 2-hop neighbor. To validate correct behavior of OLSR, it was important to emulate the dynamic nature of MANET, where nodes can roam around, come up, and go down. To address these challenges, we decided to deploy a virtualized test environment based on Linux containers (LXC), enabling execution of multiple OS instances on a single x86 machine.

In addition, network simulation has a very important practical use: it can be supplied as a Network Planning System add-on product for the MANET core.

This application supplies the following abilities:

  • Planning of node movements
  • Showing the dynamic status of network topology and connectivity (map or canvas based)
  • Operational planning based on communication conditions
  • Planning of the communications infrastructure
  • Verification & comparison of communication solutions
  • “What if” analysis of real situations by changing the scenario
  • Real terrain, realistic radio and propagation models

 

We learned a lot during this MANET project, specifically how to interpret and implement RFCs in MANET situations and how network simulation based on LXC is a key to success. We’ll take this knowledge to future MANET projects as mobile ad-hoc networks continue to grow in use.

Authors: Oleg P. and Sasha I.

With the Internet of Things gaining popularity, embedded devices are getting more attention. An operating system that’s also getting its share of attention is OpenWrt, an OS that’s primarily for embedded networking devices. It’s based on the Linux kernel and primarily used on embedded devices to route network traffic – essentially a Linux distribution for your router.

If you are developing an application, OpenWrt gives you the framework to build it without having to build complete firmware around the application, giving you the ability to fully customize devices in ways you never imagined. While OpenWrt isn’t the ideal solution for everyone, it is flexible and can be installed on various routers. OpenWrt has a web interface, but if all you want is a web interface, you’re probably better off with another replacement router firmware.

Having a modular Linux distribution on your router opens up many opportunities: setting up a proper VPN on your OpenWrt router; running server software such as a web server, IRC server, or BitTorrent tracker on the router to use less power than a computer (perfect for lightweight servers); creating a separate wireless guest network for security purposes; or capturing and analyzing network traffic.
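As one concrete example of the guest-network idea above, here is a hedged OpenWrt UCI fragment in the format used in /etc/config/wireless. The radio name, SSID, and key are illustrative, and a matching guest network interface and firewall zone would also be needed:

```
config wifi-iface 'guest'
        option device 'radio0'        # first radio; the name varies by board
        option mode 'ap'
        option network 'guest'        # a separate interface, not 'lan'
        option ssid 'Guest-WiFi'
        option encryption 'psk2'
        option key 'changeme123'
        option isolate '1'            # guest clients cannot see each other
```

After editing, `uci commit wireless` and `wifi reload` apply the change; keeping guests on their own interface is what lets the firewall isolate them from the LAN.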

OpenWrt is being adopted in many consumer-grade and small-business products, but out of the box it lacks many of the functions an enterprise-class router needs before larger enterprises or carriers can use it.

An enterprise-class router needs many functions, such as a CLI management interface; multiple native routing protocols; multicast; traffic management (queuing, policy-based routing, etc.); security (NAT, ACLs, VPN, AAA, etc.); an interface to manage the L2 switch; flow control; hardware OAM; statistics such as RMON and others; flash image management; and much more.

Some of the key features OpenWrt lacks are hardware monitoring; storage of two or more firmware images (for rollback, etc.); a CLI; extensive statistics; traffic management; some AAA (TACACS+, RADIUS); and many multicast features. Extensive development is required to build out OpenWrt for use in an enterprise router. In addition, extensive QA would be required to ensure proper operation and robustness in an enterprise environment.

For one customer, we looked at whether OpenWrt (release 14.07) would be a suitable option for their enterprise router. The goal was to adapt OpenWrt to meet their needs, taking consumer-grade quality and completeness and converting it to enterprise-grade quality and completeness. We identified 51 features in the customer’s requirements that weren’t supported in OpenWrt, along with a unified CLI needed to cover all features for L2 switches and L3 routers. We would need to apply our experience to new development work in the customer’s infrastructure, management interfaces, native protocols, multicast, encapsulation, traffic management, IPv4/IPv6 routing, L2, security, and L3 OAM.

Though OpenWrt is not suitable for all environments, it may be time to start thinking about what’s possible with OpenWrt for your applications. If you’re hit with challenges, we’ve been there too, and we can help develop a solution that takes advantage of the flexibility OpenWrt has to offer.