There is client-server software, and there is peer-to-peer software. Are these two architectures all we ever need for distributed software?
I’d like to suggest that some of the world’s most successful distributed software architectures are neither client-server nor peer-to-peer when you look at them closely. They follow an architectural pattern that I’d like to call the 4-Point Architecture.
Consider e-mail, a distributed and network-centric application if there ever was one. To accurately describe how e-mail gets sent from my computer to your computer, we need four computers in the picture (the “four points” of this architecture): My Computer, my SMTP server, your SMTP server, and Your Computer.
In this case, My Computer first hands the e-mail message to my SMTP server through one of several possible protocols (e.g. SMTP, …). My SMTP server then uses SMTP to send the message to your SMTP server, which in turn holds the message until Your Computer downloads it using a protocol such as POP.
In this architecture, we have elements of both peer-to-peer and client-server. Either SMTP server can initiate communication with the other, which makes them peers. (However, SMTP is not a symmetrical protocol: within any one exchange, the initiating server acts as the client. In effect, we run client-server on top of peer-to-peer.) The relationship between an SMTP server and its client computer is clearly a client-server relationship.
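To make the three legs of that journey concrete, here is a minimal sketch in Python. It is a toy model, not real SMTP or POP: the `MailServer` class, its method names, the `directory` dictionary (standing in for however one server finds another) and the `example` domains are all my own assumptions for illustration.

```python
from collections import defaultdict


class MailServer:
    """Toy stand-in for an always-reachable mail server (hypothetical, not real SMTP)."""

    def __init__(self, domain, directory):
        self.domain = domain
        self.directory = directory          # domain -> MailServer, a stand-in for lookup
        self.mailboxes = defaultdict(list)  # user -> messages held for later download
        directory[domain] = self

    def submit(self, sender, recipient, body):
        """Client-server leg: a client hands a message to its own server."""
        user, domain = recipient.split("@")
        peer = self.directory[domain]
        # Server-to-server leg: either server may be the one initiating this call.
        peer.accept(sender, user, body)

    def accept(self, sender, user, body):
        """Hold the message until the recipient's computer asks for it."""
        self.mailboxes[user].append((sender, body))

    def download(self, user):
        """Client-server leg: the recipient's computer pulls (and empties) its mailbox."""
        messages, self.mailboxes[user] = self.mailboxes[user], []
        return messages


directory = {}
my_server = MailServer("mine.example", directory)
your_server = MailServer("yours.example", directory)

# My Computer -> my server -> your server, then Your Computer pulls.
my_server.submit("me@mine.example", "you@yours.example", "hello")
print(your_server.download("you"))
```

Note that neither client ever calls the other: My Computer only talks to `my_server`, and Your Computer only talks to `your_server`.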
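If we redraw the picture in a more generic manner, we arrive at two “bright” points in the first row, each connected to the other and to a “dark” point beneath it in the second row. (I will get to the labeling in a second.)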
The 4-Point Architecture is also used by the following technologies:
- Jabber/XMPP presence and instant messaging. Jabber clients log into Jabber servers using an asymmetrical protocol. Jabber servers talk to each other using a symmetrical protocol. Same picture.
- Modern P2P file sharing networks. It turns out that virtually all of them distinguish between regular nodes and super nodes. While the protocol may allow entirely peer-to-peer communication between any two nodes, in practice only subsets of it are used, which makes peer-to-super-peer relationships client-server and super-peer-to-super-peer relationships peer-to-peer. Roles may be reassigned dynamically, but at any point in time these networks still follow the 4-Point Architecture.
- Update April 25: Of course, blogs work the same way as well: we edit locally on our PCs but publish on visible servers with a defined address.
- There are lesser-known ones as well, including what we have in our product.
We could call this architecture a mix of client-server and peer-to-peer. However, it comes with some rules that cannot adequately be described without looking at all 4 points of the architecture:
- The computers in the first row must be addressable and routable by any other computer in the first row. As a result, they must belong to the “bright” internet (i.e. those parts of the internet that can be reliably addressed and found).
- No computer in the second row needs to be addressable and routable. To communicate, it must be able to initiate a connection to a bright point, but there is no requirement that a bright point initiate a connection on its own. Therefore, the computers in the second row may belong to the “dark” internet (i.e. those parts of the internet that may have rapidly changing IP addresses, move quickly from network to network, as laptops do, and generally do not have a well-defined DNS name).
- Any dark point only interacts with exactly one bright point (at least for a given application or protocol). It is extremely rare that a dark point interacts with more than one bright point (for a given protocol) for reasons other than basic availability of bright points. Thus the relationship between a bright point and its dark points is a fairly stable one.
- No dark point ever interacts directly with another dark point; dark points always go through their respective bright points first.
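The four rules above can be written down as a small connection check. This is only a sketch under my own assumptions: the `Point` class, the `home` field (a dark point's one bright point) and both function names are invented for illustration, and the model treats a bright point connecting to a dark point as never happening, which is a stricter reading than “no requirement”.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Point:
    name: str
    bright: bool
    home: Optional["Point"] = None  # a dark point's single bright point (hypothetical field)


def may_connect(initiator, target):
    """Return True if the rules let `initiator` open a connection to `target`."""
    if initiator.bright and target.bright:
        return True                      # bright points can all reach each other
    if initiator.bright:
        return False                     # dark points need not be reachable
    if target.bright:
        return initiator.home is target  # a dark point uses exactly one bright point
    return False                         # dark points never talk to each other directly


def connections_for(sender, recipient):
    """List the (initiator, target) connections that move a message between two dark points."""
    steps = [(sender, sender.home)]                  # sender pushes to its bright point
    if sender.home is not recipient.home:
        steps.append((sender.home, recipient.home))  # bright-to-bright relay
    steps.append((recipient, recipient.home))        # recipient pulls from its bright point
    return steps


a_server = Point("a-server", bright=True)
b_server = Point("b-server", bright=True)
laptop = Point("laptop", bright=False, home=a_server)
phone = Point("phone", bright=False, home=b_server)

# The direct path is forbidden, yet delivery needs only permitted connections.
assert not may_connect(laptop, phone)
assert all(may_connect(i, t) for i, t in connections_for(laptop, phone))
```

The last step is the interesting one: the recipient's bright point never has to find the dark point, because the dark point is the one that initiates the connection and pulls.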
Why do I think this 4-Point Architecture is important? It’s important because it is a generally useful, proven architecture for distributed applications that span all kinds of devices, not just “bright” servers. Some of its benefits are:
- It clearly assigns responsibilities: bright points must be stable and available, for example, while dark points need not be.
- It enables reliable communication between unreliable dark points that may not even be able to route to each other directly, regardless of how much work one is willing to put in.
- It allows organizations to specialize: operating a bright point requires a different set of expertise than operating a dark point.
- It allows local innovation: if somebody invents IMAP, they can install it locally (between their bright and dark points), without impacting the rest of the network.
- It provides clear guidance on where to attach functions and procedures such as logging, approval, quotas and so forth.
- It reduces the requirements on the dark points, which are often relatively underpowered devices (e.g. cell phones can send e-mail without having to support all of SMTP).
Watch for the 4-Point Architecture to become more prominent in many places going forward … it simply makes too much sense not to.