Virtual Application Networks for Hybrid Cloud Interconnect

Multi-cloud and hybrid cloud computing are popular topics of late. Good advances have been made in workload portability and in the management of rented computing resources. One important aspect remains elusive: how to interconnect all of the components distributed across the various cloud locations.

This article introduces Virtual Application Networks (VANs) as a general solution to multi-cloud and hybrid cloud interconnect.

A Virtual Application Network must meet the following requirements:

  • Flexible topology for interconnected cloud locations
    • No requirement for a star or full-mesh topology
    • Allow redundant paths for resiliency
  • Location-independence for addresses of software components
    • Cross-network multicast
    • Cross-network anycast for load balancing
  • Support for large numbers of private/edge locations having overlapping IP sub-networks
  • Can be set up by a developer without special privileges

The Internet Protocol is not well suited for Hybrid Cloud Interconnect

First, it will help to understand the real difficulty with hybrid cloud networking. Everything is connected to the Internet, right? So what’s the big problem?

The problem is that the Internet Protocol (IP)-based network is built for a client/server world. In the earlier days of the Internet, as it was exploding in popularity and size, it became apparent that the address space was too small for the demand. IPv6 was introduced with a much larger address space, but the cost of conversion was too high and the new standard was not universally adopted.

A number of innovations were made to address the problem: Classless Inter-Domain Routing (CIDR) helped make more efficient use of the available addresses, but the real solution came with Network Address Translation (NAT). By introducing the idea of private IP networks connected via NAT routers, major swaths of the address space were relocated behind one or a small handful of public addresses. Now, millions of small and large private networks can share the same IP subnets. This is the state of the Internet today, and the network still comfortably fits in the 32-bit IPv4 address space.

The NAT solution is effective because the predominant architecture for distributed software is client/server. The network is oriented north/south with clients in private networks connecting to big, dedicated, and expensive servers in public data centers.

Modern software development is moving away from client/server toward more flexible architectures with services located everywhere. The desired communication patterns are north/south, east/west, and every other direction on the compass. Developers and operators want to deploy only as much computing capacity as is needed at the moment. And they want to be able to flexibly deploy that capacity when and where it is needed to support their users.

The hard fact is that the current global network does not provide the addressing needed to meet the needs of modern distributed software.

What is a Virtual Application Network?

A VAN lets IP do what it is good at (fast and reliable north/south connection of everything) and introduces a new layer of addressing for fine-grained application components, with no constraints on who can talk to whom.

Whereas TCP/IP addressing uses a host:port pair to refer to an endpoint, VAN addressing uses arbitrary strings to refer to endpoints directly. In the modern world of virtualization, containers, and serverless computing, the notion of a host for a port is unnecessary and overly constraining.

Also, whereas IP addressing is primarily unicast, with each address referring to a single host, VAN addressing is either anycast or multicast. It builds in the idea that multiple software components can attach to the network with the same address, for multicast delivery or for load balancing. IP does have a notion of multicast addressing, but it is not used in practice beyond the scope of a local area network. VAN multicast and anycast are not constrained by component location and therefore provide network-wide multicast delivery and anycast load balancing.
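As a sketch of the idea (the names and structure here are illustrative, not Skupper's API), a minimal VAN-style address table might map one string address to many attached components, with multicast delivering to all of them and anycast picking one at a time for load balancing:

```python
class AddressTable:
    """Toy VAN-style address table: many components may share one string address."""

    def __init__(self):
        self.endpoints = {}   # address -> list of attached component callbacks
        self.rr = {}          # address -> round-robin counter for anycast

    def attach(self, address, component):
        self.endpoints.setdefault(address, []).append(component)

    def multicast(self, address, message):
        # Deliver one copy to every component attached to the address.
        for component in self.endpoints.get(address, []):
            component(message)

    def anycast(self, address, message):
        # Deliver to exactly one attached component (round-robin load balancing).
        targets = self.endpoints.get(address, [])
        if targets:
            i = self.rr.get(address, 0)
            targets[i % len(targets)](message)
            self.rr[address] = i + 1

received = []
table = AddressTable()
table.attach("orders", lambda m: received.append(("replica-1", m)))
table.attach("orders", lambda m: received.append(("replica-2", m)))

table.multicast("orders", "ping")   # both replicas get a copy
table.anycast("orders", "job-1")    # one replica gets it
table.anycast("orders", "job-2")    # the other replica gets it
```

Note that neither replica needed a host:port identity; "orders" is the only name the sender ever sees.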

Another important aspect of a VAN is that it is deployed at the application layer for the purpose of serving a single distributed application. It does not need to be part of the network infrastructure; a VAN can be created quickly and easily by a developer or operator, without administrative privileges or special networking infrastructure like VPNs, IPSec, SDN, or firewall mapping rules.

How does a Virtual Application Network work?

Layers of a Virtual Application Network

First, a VAN router is deployed in every site participating in the distributed application. Connectivity between the routers is established using TCP/IP in whatever topology provides full reachability. It need not be a full mesh; there needs to be at least one path from every site to every other site, even if that path transits through other sites. Redundant paths are encouraged because they provide resilience against the failure of a site or a connection. Note that it is OK for multiple private networks to participate even if they have overlapping CIDR subnetworks. Note also that there is no central controller or single point of failure.

An example VAN topology

The network of routers uses a routing protocol to learn the topology and compute least-cost paths between each pair of sites. The routing protocol reacts quickly and automatically to changes in the topology. This provides resilience against failure and also makes it easy to add locations to, and remove locations from, the network as expansion or changes are needed.
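The path computation itself is standard shortest-path work. A hypothetical sketch (the sites and link costs are invented for illustration) of computing least-cost paths and next hops over a non-mesh topology:

```python
import heapq

def least_cost_paths(topology, source):
    """Dijkstra over a site graph: returns (cost, next hop) info for each site."""
    dist = {source: 0}
    next_hop = {}                      # destination site -> first hop from source
    queue = [(0, source, None)]        # (cost so far, site, first hop taken)
    while queue:
        cost, site, hop = heapq.heappop(queue)
        if cost > dist.get(site, float("inf")):
            continue                   # stale queue entry; a cheaper path won
        if hop is not None:
            next_hop[site] = hop
        for neighbor, link_cost in topology.get(site, {}).items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                # Leaving the source, the first hop is the neighbor itself.
                heapq.heappush(queue, (new_cost, neighbor, hop or neighbor))
    return dist, next_hop

# Invented topology: site D is reachable only through B, and the
# redundant A-B / A-C / B-C links survive any single link failure.
topology = {
    "A": {"B": 1, "C": 1},
    "B": {"A": 1, "C": 1, "D": 1},
    "C": {"A": 1, "B": 1},
    "D": {"B": 1},
}
dist, next_hop = least_cost_paths(topology, "A")
# dist["D"] is 2: D is reachable even though A has no direct link to it.
```

A real link-state protocol also floods topology updates so every router can rerun this computation when a link or site appears or disappears.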

At the next layer up, the VAN maps endpoint addresses to the sites in which the addressed components reside. Routing traffic to an address is then a two-step process of finding the location (or locations) where the desired destination lives and then using the router’s forwarding tables to route the traffic toward those locations.
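The two-step process can be sketched by combining an address-to-sites map with the routing table (all names here are invented for illustration):

```python
def route(address, address_sites, next_hop, local_site):
    """Two-step VAN routing sketch: find the site(s) hosting an address,
    then look up the next hop toward each of those sites."""
    hops = set()
    for site in address_sites.get(address, []):
        if site == local_site:
            hops.add("deliver-locally")
        else:
            hops.add(next_hop[site])
    return hops

# Invented example data: "orders" runs in two sites for load balancing.
address_sites = {"orders": ["site-b", "site-d"]}
# From site-a's routing table, site-b is reached directly and site-d via site-b.
next_hop = {"site-b": "site-b", "site-d": "site-b"}

hops = route("orders", address_sites, next_hop, local_site="site-a")
# All traffic for "orders" leaves site-a toward site-b, which forwards
# some of it onward to site-d.
```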

The topmost layer maps protocol endpoints (DNS names, Kubernetes service names, messaging destinations, etc.) to VAN addresses. This is done for any application protocol, be it HTTP, gRPC, TCP, UDP, AMQP, MQTT, etc.

Where can one find a VAN?

There is an open-source implementation of a Virtual Application Network available from the Skupper project. The Skupper implementation is initially focused on providing multi-cluster interconnect for Kubernetes. Please visit the site to learn more, to see demonstrations on video, and to access example code and configurations that you can try on your own.

Skupper uses the Apache Qpid Dispatch Router as its VAN router. The router is based on the AMQP protocol and is lightweight and stateless. In addition to the base router, Skupper provides tools to facilitate network setup and to provide mappings from commonly used protocols like HTTP, gRPC, and TCP onto the VAN network.

Apache Qpid Dispatch Router uses a Link-State routing protocol, similar to OSPF or IS-IS, to detect topology and compute least-cost paths across the network.


AMQP as a Network Protocol

October 1, 2015

The Advanced Message Queueing Protocol (AMQP) is usually discussed in the context of message queueing systems since it was designed as a protocol for messaging and indeed has the word “Queueing” in its name.  AMQP is, however, a general-purpose networking protocol that has real advantages over HTTP for service and API delivery.

Brief Overview of AMQP

The purpose of this overview is not to provide a comprehensive description of the protocol.  Readers who wish to see more depth can refer to the specification itself.  Rather, this overview shall introduce the key concepts of the protocol as background for the rest of the article.

AMQP is a wire-line protocol specification with the following characteristics:

  • It runs over any reliable point-to-point transport.  Common transports are TCP/IP (default port 5672) and TLS/SSL (default port 5671).
  • It is a symmetric peer-to-peer protocol.  Once the transport is in place and security layers are established, both endpoints of the transport connection have identical roles.  There is no notion of client/server or client/broker.
  • It is asynchronous.  There is no strict handshaking that occurs between the endpoints.
  • It has a framing layer whereby large transfers are broken into relatively small frames.  Frame sizes are typically tens or hundreds of thousands of octets.
  • It has a layered structure with multiple sessions within a connection and multiple links within a session.

AMQP Anatomy


When two processes are connected with each other using an AMQP connection, they send data in the form of messages over unidirectional links within that connection.  The layers of communication within AMQP are as follows:

  • Connection: The connection is associated with the transport (i.e. a TCP connection).  The AMQP connection is where the endpoints authenticate themselves to each other.
  • Session: Sessions may be established by either endpoint.  Sessions provide a sliding-window form of flow control that can be used to limit the total amount of memory that incoming traffic can consume within a process.
  • Link: Links may be established by either endpoint.  They are unidirectional and are commonly referred to as “senders” and “receivers”.  A link is a sender on one end and a receiver on the other.

Links provide a rich set of capabilities for message delivery, including message-level flow control and different guarantees for message hand-off.
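The nesting of these layers can be sketched as plain data structures (this is illustrative only, not an AMQP implementation; the field names are invented):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Link:
    name: str
    role: str                  # "sender" on one end, "receiver" on the other

@dataclass
class Session:
    incoming_window: int       # sliding-window flow control, counted in frames
    links: List[Link] = field(default_factory=list)

@dataclass
class Connection:
    container_id: str          # identity presented when the endpoints authenticate
    sessions: List[Session] = field(default_factory=list)

# One connection carrying two sessions and several unidirectional links.
conn = Connection("client-1", sessions=[
    Session(incoming_window=1024, links=[Link("bulk-in", "receiver"),
                                         Link("bulk-out", "sender")]),
    Session(incoming_window=64, links=[Link("control", "receiver")]),
])
```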

Benefits to Service Delivery and Distributed Systems


Multiplexing is the ability to put multiple independent conversations, or message flows, over the same connection at the same time.  AMQP’s structure is such that it provides transports within a transport for maximum flexibility.  It means that a communicating process does not need to manage multiple sockets in order to separate different kinds of message or network traffic.

It was mentioned above that AMQP has a framing layer that underlies the entire protocol structure.  The framing layer guarantees that no single unit of communication will reserve more than one frame’s worth of network capacity.  Small messages will typically fit into a single frame whereas large messages will require multiple frames.  The benefit of the framing layer is that the connection will not be dominated by the delivery of a single large message.  Large messages are broken into frames and will be interleaved with the frames of deliveries on other links.
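The interleaving effect can be illustrated with a toy framer (frame sizes shrunk to a few octets for readability; real frames are far larger):

```python
def frames(payload, max_frame):
    """Split one message body into frames of at most max_frame octets."""
    return [payload[i:i + max_frame] for i in range(0, len(payload), max_frame)]

def interleave(deliveries, max_frame):
    """Round-robin one frame at a time across links, so a single large
    message cannot monopolize the connection."""
    pending = {link: frames(body, max_frame) for link, body in deliveries}
    wire = []
    while pending:
        for link in list(pending):
            wire.append((link, pending[link].pop(0)))
            if not pending[link]:
                del pending[link]
    return wire

wire = interleave([("big", b"x" * 10), ("small", b"hi")], max_frame=4)
# The 2-octet message goes out after only one frame of the 10-octet message,
# instead of waiting for all of it.
```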

Flow Control

Flow control is a critically important aspect of distributed systems, especially large ones.  Without good flow control, components of distributed systems need to be designed with a lot of excess capacity to handle bursts of activity.  This drives up hardware costs and has other undesirable effects as well.  For example, a server with a large backlog of yet-to-be-handled requests might fail, leaving the entire backlog either lost or needing to be recovered by some mechanism.

AMQP provides a two-tiered strategy for flow control.  Session flow control is for limiting the amount of memory that a class of incoming messages can consume and link flow control is for limiting the backlog of messages that a consumer can have outstanding at any time.  Note that link flow control is specific to each link.  A process can independently control the flow of incoming messages for each of its receiving links.

Consider an AMQP connection carrying two sessions: one for bulk message transfer and another for the transfer of control messages.  The incoming window of each session can be set to correspond to the amount of memory that the process wishes to allocate to the storage of received messages for that session.  This allows for independence between the two sessions.  If the volume of incoming (or outgoing) bulk transfers is such that the bulk session is congested, the control session is unaffected by this congestion and control messages can flow freely along the AMQP connection in spite of the bulk congestion.

The individual links within the sessions have their own credit-based flow control to meter the flow of messages.  Note that link credit cannot be tied to memory use because messages can be of any size.  One link-credit allows the delivery of one message, be it a 64-byte message or a 64-megabyte message.

Typically, session windows will be large (hundreds of megabytes or more).  A process wants to use the available memory but needs to prevent memory exhaustion.  On the other hand, link credit backlogs should be relatively small (dozens or less).  They should be just big enough to keep deliveries flowing continually.  Too few link credits will cause the sender to stall while waiting for credit to be replenished, reducing throughput on the link.
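A minimal simulation of link credit (the numbers are invented) shows the stall-and-replenish behavior described above:

```python
class CreditedLink:
    """Toy model of AMQP link credit: the sender may transfer one message
    per unit of credit granted by the receiver, regardless of message size."""

    def __init__(self):
        self.credit = 0
        self.delivered = []

    def flow(self, n):
        # The receiver grants n more message deliveries.
        self.credit += n

    def send(self, message):
        if self.credit == 0:
            return False          # sender stalls until credit is replenished
        self.credit -= 1
        self.delivered.append(message)
        return True

link = CreditedLink()
link.flow(2)                      # receiver is willing to accept 2 messages
assert link.send("m1") and link.send("m2")
assert not link.send("m3")        # stalled: no credit left
link.flow(1)                      # receiver processed a message; replenish
assert link.send("m3")
```

This is why too small a credit backlog hurts throughput: every stall costs a round trip before the next delivery can move.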

Delivery Guarantees

AMQP provides a variety of levels of delivery-guarantee that can make writing distributed software systems much simpler by removing the burden of requiring the application to implement its own guarantee mechanisms.  These delivery levels come into effect when there is loss of communication between sender and receiver.  A link that is lost (when the connection fails, for example) can be recovered after reconnect.

At-most-once

Also called “fire-and-forget” and “best-effort”.  In this case, the sender of the message doesn’t want to know whether the message was received.  There are no circumstances in which the message will be re-sent after the original transfer.

At-least-once

The sender receives an acknowledgement that the receiver got the message.  Before the acknowledgement is received, the sender considers the message unsettled, or in-doubt.  During link-recovery, the sender will re-send its unsettled deliveries to ensure receipt.  It is possible that a message will be delivered to the receiving process more than once.

Exactly-once

The sender receives an acknowledgement of receipt and sends a settlement message back to the receiver.  The receiver keeps a record of received-but-unsettled deliveries so it can detect and remove duplicates that arrive during link-recovery.  Only one copy of each delivered message is passed up to the receiving process.
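The duplicate-detection step can be sketched as follows (delivery tags and the recovery flow are simplified for illustration):

```python
class ExactlyOnceReceiver:
    """Toy model of the receiving side of exactly-once delivery:
    received-but-unsettled delivery tags are remembered so that
    resends after link recovery can be detected and dropped."""

    def __init__(self):
        self.unsettled = set()   # tags received but not yet settled by the sender
        self.application = []    # messages passed up to the receiving process

    def transfer(self, tag, message):
        if tag in self.unsettled:
            return               # duplicate re-sent during link recovery: drop it
        self.unsettled.add(tag)
        self.application.append(message)

    def settle(self, tag):
        # The sender's settlement means the tag can safely be forgotten.
        self.unsettled.discard(tag)

rx = ExactlyOnceReceiver()
rx.transfer("d1", "order-42")
rx.transfer("d1", "order-42")    # resend after a connection failure: ignored
rx.settle("d1")
# Only one copy of the message reached the application.
```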

Asynchronous, Full-Duplex Communication

Since applications using AMQP can create as many links as they need, there is no constraint on the patterns of communication that can be used.  A single connection can be used simultaneously for multiple kinds of communication:

  • Streaming sends (acknowledged or best-effort)
  • Streaming receives (acknowledged or best-effort)
  • Request/response
  • etc.

Serialization of Structured Data

Any kind of data can be sent in the body of an AMQP message.  A message can carry JSON or XML data, straight binary data, text, or any other format.  AMQP also provides its own type system for serializing structured data like Python dictionaries, Java hashmaps, arrays of numbers, etc.

Language-specific APIs for AMQP take advantage of this feature to remove the burden of encoding and decoding structures.  A programmer can simply provide a language-native data structure as the message body and the API library will encode it in a (hopefully) interoperable way.

AMQP and Routing

AMQP provides an addressing capability for messages and for links that provides the groundwork for sophisticated message routing.  Watch this space for future articles on this topic.

A Word About Earlier AMQP Specifications

During the development of the AMQP protocol, several early versions were released.  AMQP versions 0-8, 0-9, and 0-10 are quite different from the final 1.0 specification.  All of the preliminary versions of the protocol are asymmetric in that they define client and broker roles for the two endpoints of a connection.  Because of this, the 0-* versions are relegated to the status of traditional brokered-messaging protocols.  The AMQP 1.0 specification dispensed with these roles and adopted a symmetric structure.  AMQP 1.0, being unconstrained by traditional messaging, is now an enabling technology for interesting, new ways to interconnect distributed systems, with and without messaging brokers.
