October 1, 2015
The Advanced Message Queueing Protocol (AMQP) is usually discussed in the context of message queueing systems since it was designed as a protocol for messaging and indeed has the word “Queueing” in its name. AMQP is, however, a general-purpose networking protocol that has real advantages over HTTP for service and API delivery.
Brief Overview of AMQP
The purpose of this overview is not to provide a comprehensive description of the protocol. Readers who wish to see more depth can refer to the specification itself. Rather, this overview shall introduce the key concepts of the protocol as background for the rest of the article.
AMQP is a wire-level protocol specification with the following characteristics:
- It runs over any reliable point-to-point transport. Common transports are TCP/IP (default port 5672) and TLS/SSL (default port 5671).
- It is a symmetric peer-to-peer protocol. Once the transport is in place and security layers are established, both endpoints of the transport connection have identical roles. There is no notion of client/server or client/broker.
- It is asynchronous. There is no strict handshaking that occurs between the endpoints.
- It has a framing layer whereby large transfers are broken into relatively small frames. Frame sizes are typically tens or hundreds of thousands of octets.
- It has a layered structure with multiple sessions within a connection and multiple links within a session.
When two processes are connected with each other using an AMQP connection, they send data in the form of messages over unidirectional links within that connection. The layers of communication within AMQP are as follows:
- The connection is associated with the transport (e.g. a TCP connection). The AMQP connection is where the endpoints authenticate themselves to each other.
- Sessions may be established by either endpoint. Sessions provide a sliding-window form of flow control that can be used to limit the total amount of memory that incoming traffic can consume within a process.
- Links may be established by either endpoint. They are unidirectional and are commonly referred to as “senders” and “receivers”. A link is a sender on one end and a receiver on the other. Links provide a rich set of capabilities for message delivery, including message-level flow control and different guarantees for message hand-off.
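The nesting of connection, session, and link can be pictured as a small data model. The following is only an illustrative sketch: the class names, method names, and the example address are hypothetical and do not correspond to any particular AMQP library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Link:
    name: str
    role: str  # "sender" or "receiver", as seen from this endpoint


@dataclass
class Session:
    name: str
    links: List[Link] = field(default_factory=list)

    def open_link(self, name: str, role: str) -> Link:
        # Links are unidirectional, so a role must be chosen up front.
        if role not in ("sender", "receiver"):
            raise ValueError("role must be 'sender' or 'receiver'")
        link = Link(name, role)
        self.links.append(link)
        return link


@dataclass
class Connection:
    transport: str  # hypothetical address of the underlying transport
    sessions: List[Session] = field(default_factory=list)

    def begin_session(self, name: str) -> Session:
        session = Session(name)
        self.sessions.append(session)
        return session


# One transport connection carrying two sessions and three links.
conn = Connection("tcp://example.invalid:5672")
bulk = conn.begin_session("bulk")
control = conn.begin_session("control")
bulk.open_link("file-upload", "sender")
control.open_link("commands", "receiver")
control.open_link("status", "sender")
```

Because the protocol is symmetric, the peer at the other end of the transport could hold a mirror image of this structure, with each sender paired to a receiver.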
Benefits to Service Delivery and Distributed Systems
Multiplexing is the ability to put multiple independent conversations, or message flows, over the same connection at the same time. AMQP’s layered structure of sessions and links effectively provides transports within a transport. As a result, a communicating process does not need to manage multiple sockets in order to separate different kinds of message or network traffic.
It was mentioned above that AMQP has a framing layer that underlies the entire protocol structure. The framing layer guarantees that no single unit of communication will reserve more than one frame’s worth of network capacity. Small messages will typically fit into a single frame whereas large messages will require multiple frames. The benefit of the framing layer is that the connection will not be dominated by the delivery of a single large message. Large messages are broken into frames and will be interleaved with the frames of deliveries on other links.
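The effect of framing on a shared connection can be sketched in a few lines. This toy model ignores frame headers entirely and assumes a simple round-robin schedule between links; real implementations are free to schedule frames differently, and the function names here are made up for illustration.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


def split_into_frames(payload: bytes, max_frame_size: int) -> List[bytes]:
    """Cut a message payload into chunks of at most max_frame_size octets."""
    frames = [
        payload[i:i + max_frame_size]
        for i in range(0, len(payload), max_frame_size)
    ]
    return frames or [b""]  # an empty message still needs one frame


def interleave(
    deliveries: Dict[str, bytes], max_frame_size: int
) -> List[Tuple[str, bytes]]:
    """Emit one frame per link in turn until every delivery is complete."""
    queues: Dict[str, Deque[bytes]] = {
        link: deque(split_into_frames(payload, max_frame_size))
        for link, payload in deliveries.items()
    }
    wire: List[Tuple[str, bytes]] = []
    while queues:
        for link in list(queues):  # copy keys so finished links can be removed
            wire.append((link, queues[link].popleft()))
            if not queues[link]:
                del queues[link]
    return wire


# A 10-octet "bulk" message and a 2-octet "control" message, 4-octet frames.
wire = interleave({"bulk": b"X" * 10, "control": b"ok"}, max_frame_size=4)
order = [link for link, _ in wire]
```

Here the small control message goes out after the first bulk frame rather than waiting for the whole bulk message to finish.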
Flow control is a critically important aspect of distributed systems, especially large ones. Without good flow control, components of distributed systems need to be designed with a lot of excess capacity to handle bursts of activity. This drives up hardware costs and has other undesirable effects as well. For example, a server with a large backlog of yet-to-be-handled requests might fail, leaving the entire backlog either lost or needing to be recovered by some mechanism.
AMQP provides a two-tiered strategy for flow control. Session flow control is for limiting the amount of memory that a class of incoming messages can consume and link flow control is for limiting the backlog of messages that a consumer can have outstanding at any time. Note that link flow control is specific to each link. A process can independently control the flow of incoming messages for each of its receiving links.
Consider the following cross-section of an AMQP connection:
This connection has two sessions: One for bulk message transfer and another for the transfer of control messages. The incoming window of each session can be set to correspond to the amount of memory that the process wishes to allocate to the storage of received messages for that session. This allows for independence between the two sessions. If the volume of incoming (or outgoing) bulk transfers is such that the bulk session is congested, the control session is unaffected by this congestion and control messages can flow freely along the AMQP connection in spite of the bulk congestion.
The individual links within the sessions have their own credit-based flow control to meter the flow of messages. Note that link credit cannot be tied to memory use because messages can be of any size. One link-credit allows the delivery of one message, be it a 64-byte message or a 64-megabyte message.
Typically, session windows will be large (hundreds of megabytes or more). A process wants to use the available memory but needs to prevent memory exhaustion. On the other hand, link credit backlogs should be relatively small (a few dozen or fewer). They should be just big enough to keep deliveries flowing continually. Too few link credits will cause the sender to stall while waiting for credit to be replenished, reducing throughput on the link.
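Link credit can be modeled as a simple counter on the sending side. The sketch below is a simplification under stated assumptions: the class and method names are hypothetical, and it models only the rule that one credit permits one message regardless of size, not the protocol's actual flow frames.

```python
from collections import deque
from typing import Deque, List


class CreditedSender:
    """Sending end of a link: may transfer one message per unit of credit."""

    def __init__(self, messages: List[bytes]) -> None:
        self.pending: Deque[bytes] = deque(messages)
        self.credit = 0

    def grant(self, credit: int) -> None:
        # Called when the receiver issues more credit.
        self.credit += credit

    def send(self) -> List[bytes]:
        # Transfer as many pending messages as the current credit allows.
        sent: List[bytes] = []
        while self.credit > 0 and self.pending:
            sent.append(self.pending.popleft())
            self.credit -= 1  # one credit per message, whatever its size
        return sent


sender = CreditedSender([b"a" * 64, b"b" * 64_000, b"c", b"d", b"e"])
stalled = sender.send()       # no credit yet, so nothing is sent
sender.grant(2)
first_burst = sender.send()   # a 64-octet and a 64,000-octet message
sender.grant(2)
second_burst = sender.send()  # the next two messages
```

The first burst shows why link credit cannot bound memory use: two credits admit both a tiny message and a much larger one. Bounding memory is the job of the session window.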
AMQP provides a variety of levels of delivery-guarantee that can make writing distributed software systems much simpler by removing the burden of requiring the application to implement its own guarantee mechanisms. These delivery levels come into effect when there is loss of communication between sender and receiver. A link that is lost (when the connection fails, for example) can be recovered after reconnect.
The weakest level is “at-most-once”, also called “fire-and-forget” or “best-effort”. In this case, the sender of the message doesn’t want to know if the message was received or not. There are no circumstances in which the message will be re-sent after the original transfer.
With “at-least-once” delivery, the sender receives an acknowledgement that the receiver got the message. Before the acknowledgement is received, the sender considers the message unsettled, or in-doubt. During link-recovery, the sender will re-send its unsettled deliveries to ensure receipt. It is possible that a message will be delivered to the receiving process more than once.
With “exactly-once” delivery, the sender receives an acknowledgement of receipt and sends a settlement message back to the receiver. The receiver keeps a record of received-but-unsettled deliveries so it can detect and remove duplicates that are received during link-recovery. Only one copy of each delivered message is passed up to the receiving process.
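The difference between the acknowledged levels shows up when an acknowledgement is lost and the link is recovered. The following toy model makes that concrete. All names are hypothetical, and it simulates only the bookkeeping described above, not the wire protocol.

```python
from typing import Dict, List, Set


class Receiver:
    def __init__(self, dedupe: bool) -> None:
        self.dedupe = dedupe
        self.unsettled: Set[str] = set()  # received-but-unsettled delivery tags
        self.delivered: List[str] = []    # what the application actually sees

    def on_transfer(self, tag: str, body: str) -> None:
        if self.dedupe and tag in self.unsettled:
            return  # duplicate re-sent during link recovery, drop it
        self.unsettled.add(tag)
        self.delivered.append(body)

    def on_settle(self, tag: str) -> None:
        # The sender has settled, so the tag no longer needs tracking.
        self.unsettled.discard(tag)


def send_with_lost_ack(receiver: Receiver) -> List[str]:
    """Send one message whose first acknowledgement never arrives."""
    unsettled: Dict[str, str] = {"tag-1": "hello"}
    # First attempt: the receiver gets the message, but the ack is lost.
    for tag, body in unsettled.items():
        receiver.on_transfer(tag, body)
    # Link recovery: the sender re-sends every unsettled delivery.
    for tag, body in unsettled.items():
        receiver.on_transfer(tag, body)
    # This time the ack arrives, and the sender settles the delivery.
    for tag in list(unsettled):
        receiver.on_settle(tag)
        del unsettled[tag]
    return receiver.delivered


at_least_once = send_with_lost_ack(Receiver(dedupe=False))
exactly_once = send_with_lost_ack(Receiver(dedupe=True))
```

Without duplicate detection the application sees the message twice. With the receiver tracking unsettled deliveries, it sees the message once.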
Asynchronous, Full-Duplex Communication
Since applications using AMQP can create as many links as they need, there is no constraint on the patterns of communication that can be used. A single connection can be used simultaneously for multiple kinds of communication:
- Streaming sends (acknowledged or best-effort)
- Streaming receives (acknowledged or best-effort)
Serialization of Structured Data
Any kind of data can be sent in the body of an AMQP message. A message can carry JSON or XML data, straight binary data, text, or any other format. AMQP also provides its own type system for serializing structured data like Python dictionaries, Java hashmaps, arrays of numbers, etc.
Language-specific APIs for AMQP take advantage of this feature to remove the burden of encoding and decoding structures. A programmer can simply provide a language-native data structure as the message body and the API library will encode it in a (hopefully) interoperable way.
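To give a feel for what such a library does, here is a heavily simplified encoder for a handful of types. It is a sketch, not a conforming implementation: the format codes used (0x40 null, 0x41/0x42 booleans, 0x81 long, 0xa1 short UTF-8 string, 0xc1 short map) are the ones I believe the AMQP 1.0 type system assigns and should be checked against the specification. The function name is made up.

```python
import struct
from typing import Any


def encode(value: Any) -> bytes:
    """Encode a small subset of Python values in an AMQP-1.0-style format."""
    if value is None:
        return b"\x40"  # null
    if value is True:   # check bool before int: bool is a subclass of int
        return b"\x41"
    if value is False:
        return b"\x42"
    if isinstance(value, int):
        return b"\x81" + struct.pack(">q", value)  # 64-bit signed, big-endian
    if isinstance(value, str):
        data = value.encode("utf-8")
        if len(data) > 255:
            raise ValueError("this sketch only handles short strings")
        return b"\xa1" + bytes([len(data)]) + data
    if isinstance(value, dict):
        items = b"".join(encode(k) + encode(v) for k, v in value.items())
        count = 2 * len(value)  # keys and values are counted separately
        size = len(items) + 1   # size covers the count octet plus the items
        if size > 255 or count > 255:
            raise ValueError("this sketch only handles small maps")
        return b"\xc1" + bytes([size, count]) + items
    raise TypeError(f"unsupported type: {type(value).__name__}")


encoded = encode({"a": True})
```

A real client library would also handle the remaining types, the larger size variants, and decoding. That is exactly the burden the API removes from the programmer.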
AMQP and Routing
AMQP provides an addressing capability for messages and for links that provides the groundwork for sophisticated message routing. Watch this space for future articles on this topic.
A Word About Earlier AMQP Specifications
During the development of the AMQP protocol, several early versions were released. AMQP versions 0-8, 0-9, and 0-10 are quite different from the final 1.0 specification. All of the preliminary versions of the protocol are asymmetric in that they define client and broker roles for the two endpoints of a connection. Because of this, the 0-* versions are relegated to the status of traditional brokered-messaging protocols. The AMQP 1.0 specification dispensed with these roles and adopted a symmetric structure. AMQP 1.0, being unconstrained by traditional messaging, is now an enabling technology for interesting, new ways to interconnect distributed systems, with and without messaging brokers.