Traditionally, the Internet is built on a more or less complex protocol stack. It is less complex than the "theoretical" Open Systems Interconnection (OSI) model, but it still consists of several protocols at different layers. Over the last years we have seen a radical simplification: very different applications now use just one protocol, the Hyper Text Transfer Protocol (HTTP). This is mainly driven by the increased usage of cloud technologies. In this blog post I will discuss the drivers and implications for future networking technologies.
In the next section I will explain the traditional Internet protocol stack, followed by the currently evolving Internet protocol stack. Afterwards, I will present trends to simplify it even further by removing certain layers in the protocol stack and by reducing the number of protocol choices available at a given layer. These trends indicate that HTTP will be the base of many application protocols, and I will investigate why this is the case as well as possible implications.
Current Internet Stack
The following figure presents the "traditional" OSI reference model for networking. It separates the whole networking stack into different layers, where each layer communicates only with the adjacent layer without bypassing layers.
Each layer fulfils one task, relying on the services of the layer below.
It is a theoretical model that is often taught in universities to provide a clean understanding of networks and to design complex network architectures.
The OSI reference model was established around the same time the Internet started to grow in civilian contexts, and it was already available when the protocols used today, such as the Internet Protocol (IP), were deployed.
However, the Internet stack cannot always be mapped to the OSI reference model. It is a simplification in which some layers, such as the transport layer, include parts of the network layer and higher layers. This is illustrated in the next figure.
Even the Internet Protocol has some application characteristics (e.g. ping). In particular, some of the higher layers, such as the session, presentation and application layers, are combined into a single layer or omitted.
The transport layer offers two different protocols: the Transmission Control Protocol (TCP), a connection-oriented transport, and the User Datagram Protocol (UDP), a connectionless transport. With IPv6 it is effectively four protocols: each IP version needs a dedicated TCP and UDP stack, as subtle differences in the IP header (e.g. the IPv6 header has no checksum field, which changes how the transport checksums are computed) require changes in the transport protocols.
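To give a feeling for this duplication, here is a minimal sketch in Python (the port and addresses are just example values): a server that wants to be reachable over both IP versions typically has to open and manage a separate socket per address family.

```python
import socket

# One listening socket per address family: the IPv4 and IPv6 stacks are
# separate, even though the application on top is the same.
listeners = []
for family, address in ((socket.AF_INET, "0.0.0.0"), (socket.AF_INET6, "::")):
    sock = socket.socket(family, socket.SOCK_STREAM)   # TCP; repeat with SOCK_DGRAM for UDP
    if family == socket.AF_INET6:
        # Keep the IPv6 socket IPv6-only so it does not clash with the IPv4 socket.
        sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((address, 8080))
    sock.listen()
    listeners.append(sock)
```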
Challenges with the current Internet stack
One challenge with the current Internet stack is that it has grown historically and creates complexity. Besides the need to support legacy network protocols, such as IPv4, it has two transport protocols, each of which has to run on top of two incompatible IP versions (4 and 6).
Additionally, each application, such as SMTP, HTTP, SFTP etc., has a dedicated protocol, and these protocols are often conceptually and technically not comparable to each other.
Security (e.g. authentication/authorisation) is implemented differently in each application as they are conceptually and technically different.
Encryption is also realized differently in each application, even though it is often based on Transport Layer Security (TLS). Each application is responsible for the TLS handshake and does it differently.
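As a small illustration (host names are placeholders): SMTP typically starts in plaintext and upgrades the connection with a protocol-specific STARTTLS command, whereas HTTPS negotiates TLS implicitly before any application data is exchanged.

```python
import smtplib
import ssl
import urllib.request

ctx = ssl.create_default_context()

# SMTP: the application protocol itself has to request the TLS upgrade.
with smtplib.SMTP("mail.example.com", 587) as smtp:   # placeholder host
    smtp.starttls(context=ctx)                         # explicit, SMTP-specific step
    smtp.noop()

# HTTPS: TLS is negotiated implicitly when connecting to port 443.
with urllib.request.urlopen("https://www.example.com/") as resp:  # placeholder URL
    resp.read()
```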
Caching and other means to improve performance and reduce compute needs are implemented differently in each application – if supported at all.
The current Internet stack has proven to be a very flexible, yet simplified, solution compared to the OSI reference stack. Nevertheless, it is still complex.
For example, applications that provide file services, such as SFTP, FTPS and HTTPS, are diverse, each using a different protocol with different features underneath. SFTP uses port 22, FTPS uses dynamic port assignments and may require the client to open an ingress port, and HTTPS uses port 443. This has an impact on the security and firewall configuration of clients as well as servers. It may also require mediators and translations between the protocols.
Each of them uses a different exchange pattern and has different security features. SFTP runs on top of another application protocol (SSH). HTTPS has a wide range of other applications, as it can expose resources and operations on resources.
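To make the contrast concrete, here is a sketch (host, paths and credentials are placeholders; it assumes the third-party requests and paramiko libraries): downloading the same file over HTTPS and over SFTP requires entirely different client stacks, ports and authentication styles.

```python
import requests   # third-party; HTTPS on port 443
import paramiko   # third-party; SFTP on port 22 via SSH

# HTTPS: one request, TLS and resource semantics come with the protocol.
resp = requests.get("https://files.example.com/report.pdf",        # placeholder URL
                    auth=("alice", "secret"), timeout=10)
open("report_https.pdf", "wb").write(resp.content)

# SFTP: an SSH session with its own authentication and its own file API.
ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect("files.example.com", port=22, username="alice", password="secret")  # placeholders
sftp = ssh.open_sftp()
sftp.get("/srv/files/report.pdf", "report_sftp.pdf")
sftp.close()
ssh.close()
```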
The authentication also changes for each of them. SFTP supports different ways of authentication, such as password-based or public-key-based. HTTPS has some rudimentary built-in authentication, but nowadays relies on standards such as OIDC, where authentication is managed by a third-party server that supports various authentication means, including public keys and multi-factor authentication.
There are also different protocols in use at the lower layers, mainly due to the ongoing but accelerating migration to IPv6.
Such variety is due to the understanding and the different needs that have evolved from the beginning of the Internet until today.
Future Internet Stack
It can be observed that the future Internet stack simplifies itself further, as illustrated in the next figure. It is mainly one application protocol, HTTP/3, that allows access to resources, i.e. to any application. The idea stems originally from Representational State Transfer (REST). For example, you can have resources representing mails. Manipulating these resources using HTTP verbs, such as GET, PUT etc., allows sending emails, receiving emails, organizing them in directories and so on. The key is that any application can be represented as a set of resources.
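A minimal sketch of this idea (the endpoint and resource layout are hypothetical, not an existing mail API; it assumes the third-party requests library): a mail application driven entirely by HTTP verbs on resources.

```python
import requests

BASE = "https://mail.example.com/api"   # hypothetical endpoint
auth = {"Authorization": "Bearer <token>"}

# List mailboxes (resources) and read one mail (a sub-resource).
mailboxes = requests.get(f"{BASE}/mailboxes", headers=auth).json()
mail = requests.get(f"{BASE}/mailboxes/inbox/mails/42", headers=auth).json()

# Sending a mail is just creating a new resource in an outbox collection.
requests.post(f"{BASE}/mailboxes/outbox/mails", headers=auth,
              json={"to": ["bob@example.com"], "subject": "Hi", "body": "Hello"})

# Moving a mail to a folder is updating the resource.
requests.put(f"{BASE}/mailboxes/inbox/mails/42", headers=auth,
             json={**mail, "folder": "archive"})
```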
For instance, files and directories can be resources (cf. object store APIs such as S3). Users and groups can be resources (cf. the System for Cross-domain Identity Management (SCIM)). Rich text documents and wikis can be resources. Within cloud computing, one finds virtual machines, disk drives, networks, object stores, machine learning jobs, big data processing jobs, backups etc., all as resources that are managed through HTTP. Video and audio streams are accessible as resources through HTTP (streaming).
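User management is a good example, because SCIM turns users into ordinary HTTP resources. A brief sketch against a hypothetical SCIM endpoint (assuming the requests library):

```python
import requests

SCIM = "https://idm.example.com/scim/v2"     # hypothetical SCIM endpoint
headers = {"Authorization": "Bearer <token>",
           "Content-Type": "application/scim+json"}

# Create a user by POSTing a new resource to the Users collection.
user = requests.post(f"{SCIM}/Users", headers=headers, json={
    "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
    "userName": "alice@example.com",
    "active": True,
}).json()

# Reading and deleting are just the corresponding HTTP verbs on the resource URL.
requests.get(f"{SCIM}/Users/{user['id']}", headers=headers)
requests.delete(f"{SCIM}/Users/{user['id']}", headers=headers)
```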
Despite some criticism of the progress of migrating to IPv6, it is unavoidable. We see that current techniques to keep the depleted IPv4 address space usable, such as carrier-grade NAT, reach their limits, are costly to deploy and just increase complexity without any benefits. Several large technology companies went IPv6-only a long time ago (e.g. Google, Meta and Microsoft). Additionally, IPv6 can be flexibly extended with custom headers out of the box, without the need for every part of the Internet to support each of them. This provides opportunities to extend existing applications without modifying the application itself. IPv6 can bring significant performance improvements compared to IPv4 (cf. e.g. the Apple WWDC 2020 presentation). The number of smartphones alone, not counting other devices, already exceeds the number of available IPv4 addresses. Hence, in the mobile Internet market one often finds IPv6 first and IPv4 only for legacy websites. IPv4 will become less and less relevant.
Since the emergence of HTTP/3 we can observe that the traditional transport layer, consisting of TCP and UDP, does not provide enough flexibility for a "generic" HTTP protocol. Such a protocol requires multiplexing in a low-latency fashion to access one or more applications through the same server. This was not efficiently possible using TCP. Hence, HTTP/3 implements a connection-oriented, multiplex-capable transport (QUIC) on top of the lightweight UDP. Thus, there is no need for TCP any more.
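HTTP/3 client libraries for Python are still maturing, so as a sketch of the multiplexing idea itself, here is the same pattern with HTTP/2 (placeholder URLs, assuming the third-party httpx package installed with its HTTP/2 extra). HTTP/3 keeps this model but runs it over QUIC/UDP, which avoids TCP's head-of-line blocking.

```python
import asyncio
import httpx  # third-party; install as httpx[http2] for HTTP/2 support

async def main():
    # One connection, many concurrent streams: the property HTTP/3 keeps,
    # but implements over QUIC/UDP instead of TCP.
    async with httpx.AsyncClient(http2=True) as client:
        urls = [f"https://www.example.com/resource/{i}" for i in range(5)]  # placeholders
        responses = await asyncio.gather(*(client.get(u) for u in urls))
        for r in responses:
            print(r.http_version, r.status_code)

asyncio.run(main())
```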
Applications in the Future Internet Stack
As already mentioned, applications are modelled differently in the future Internet stack. Contrary to the past, we do not have to invent a new protocol for every new application.
For example, e-mail has featured many different protocols for sending (e.g. SMTP) and for receiving (e.g. POP3, IMAP). One can realize that this is simply about accessing resources (e.g. mailboxes, folders, mails, contacts, calendars), and thus one can simply use HTTP, where each resource is accessed using HTTP verbs. This is what the JSON Meta Application Protocol (JMAP) describes. Admittedly, SMTP does not only describe a client-to-server protocol, but also a server-to-server protocol. However, this also works analogously with HTTP.
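As a brief sketch of what that looks like on the wire (the API URL and account id are placeholders, assuming the requests library): JMAP batches method calls on mail resources into a single HTTP POST instead of opening an IMAP session.

```python
import requests

API = "https://jmap.example.com/api"     # placeholder JMAP API endpoint
headers = {"Authorization": "Bearer <token>", "Content-Type": "application/json"}

# Ask for the ids of the newest mails, then fetch their subjects, in one request.
body = {
    "using": ["urn:ietf:params:jmap:core", "urn:ietf:params:jmap:mail"],
    "methodCalls": [
        ["Email/query", {"accountId": "a1",      # placeholder account id
                         "sort": [{"property": "receivedAt", "isAscending": False}],
                         "limit": 10}, "q"],
        ["Email/get", {"accountId": "a1",
                       "#ids": {"resultOf": "q", "name": "Email/query", "path": "/ids"},
                       "properties": ["subject", "from"]}, "g"],
    ],
}
result = requests.post(API, headers=headers, json=body).json()
```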
There are other applications based on HTTP where the application simply exposes a set of resources:
Application | Standard | Resources | Traditional Protocol
---|---|---|---
Activity | ActivityPub | Actors, Activities, Objects | Proprietary protocols in social networks
Calendar | JMAP | Calendars, Events, Tasks | CalDAV (already HTTP-based)
Contacts | JMAP | Addresses, Address books | CardDAV (already HTTP-based)
Cloud Resources | depends on cloud provider | VMs, Containers, Storage, Backups, Big Data Jobs | proprietary protocols, Simple Network Management Protocol (SNMP)
Data | Open Data Protocol (OData) | Entities, Relations | proprietary protocols
Email | JMAP | Folders, Mails | SMTP, POP3, IMAP
Object Storage | S3 (originally AWS, but reused in other clouds) | Objects, Prefixes, Policies | WebDAV (partially), SFTP, FTPS, proprietary commercial storage applications
Text versioning | Git over HTTP | Services, Refs | WebDAV (partially)
Video/Audio Streaming | HTTP Live Streaming | Media streams, Playlists | proprietary protocols
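For streaming, the resource view is quite literal: an HTTP Live Streaming client fetches a playlist resource and then the media segment resources it lists, all with plain GET requests. A small sketch (placeholder URL, assuming the requests library):

```python
from urllib.parse import urljoin
import requests

PLAYLIST = "https://stream.example.com/live/index.m3u8"   # placeholder HLS playlist

# The playlist is a plain-text resource that lists further resources (media segments).
playlist = requests.get(PLAYLIST).text
segment_urls = [line.strip() for line in playlist.splitlines()
                if line.strip() and not line.startswith("#")]

# Each media segment is an ordinary HTTP resource and can be cached like any other.
for url in segment_urls[:3]:
    segment = requests.get(urljoin(PLAYLIST, url)).content
```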
There are other applications based on HTTP, but they were designed earlier and not in the (full) spirit of the HTTP resource approach. For example:
- WebDAV (and variants, such as CalDAV, CardDAV): distributed authoring and versioning. It requires changes to the original HTTP protocol, such as additional methods (see the sketch below). JMAP or similar standards may replace them in the future.
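For illustration (server URL and credentials are placeholders, assuming the requests library): listing a WebDAV collection needs the non-standard PROPFIND method and an XML body, rather than a plain GET on a resource.

```python
import requests

DAV = "https://dav.example.com/files/"    # placeholder WebDAV collection

# PROPFIND is an HTTP method added by WebDAV; generic HTTP tooling has to be
# told about it explicitly instead of using the standard verbs.
resp = requests.request(
    "PROPFIND", DAV,
    auth=("alice", "secret"),                       # placeholder credentials
    headers={"Depth": "1", "Content-Type": "application/xml"},
    data='<?xml version="1.0"?><d:propfind xmlns:d="DAV:"><d:allprop/></d:propfind>',
)
print(resp.status_code)   # 207 Multi-Status on success, another WebDAV extension
```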
Although most applications are accessible through HTTP, there are some applications where this is only partially the case, for example peer-to-peer applications and video/voice conferencing. This is mainly due to IPv4, which requires complex additional infrastructure, such as NAT and network discovery protocols, to communicate directly between participants. This will become largely obsolete with IPv6, and thus we can expect that these applications also move to HTTP/3, as it also enables low-latency communication.
Examples where this development still needs to happen:
- WebRTC (voice/video communication, screen sharing): Only the signaling (finding participants) is done via HTTP. ICE and STUN/TURN are only needed for outdated IPv4 environments involving NAT. The connection over which the media is exchanged uses custom protocols on custom ports (the Secure Real-time Transport Protocol (SRTP)), and a dedicated key-exchange protocol (Datagram Transport Layer Security (DTLS) for SRTP) exists for exchanging the keys for end-to-end encryption. However, as we have seen, there are HTTP-based standards, such as HTTP Live Streaming or WebTransport, which could be used between the two peers of a communication instead, but they are not used at the moment. Especially HTTP/3 seems suitable for this.
Discussion
While there seems to be a clear sign that the Internet is moving to a simpler model based on IPv6 and HTTP, one might wonder whether this is an issue or whether the benefits of simplicity indeed outweigh the disadvantages.
Firstly, this move only makes sense if it is based on IPv6 and not on IPv4. IPv4 introduces a lot of complexity and bottlenecks with Network Address Translation (NAT), additional (reverse) proxies etc. due to the scarcity of IPv4 addresses. This is not only related to deploying them, but also to troubleshooting issues and scaling the infrastructure in case of higher demand. It is also more complex from a security perspective to determine what is going on in the network.
If one designs the network architecture IPv6-only right from the start, then one can indeed benefit from this simplicity.
Currently, IPv6 is widely deployed, but there is still a significant amount of IPv4. Additionally, IPv6 deployment has not advanced equally across the world. Nevertheless, there are clear trends towards IPv6, and large corporations, especially from the software industry, move to IPv6-only due to the complexity of operating IPv6 and IPv4 in parallel.
Even where IPv6 is widely deployed, HTTP/3 is not yet, but it is on a rapidly accelerating path. While it is not a prerequisite for most applications, it will be needed to make the protocol more efficient, especially for real-time communication. Given the rapid development in the open source world related to HTTP/3 (e.g. curl, NGINX, Jetty etc.) and the fact that browsers already support HTTP/3 in production, it is only a matter of time until it is supported everywhere.
Another problem is that while HTTP has some generic semantics, such as resources and methods (verbs), these semantics need to be defined in more detail by the applications. For example, in the case of email you will need to describe the resources (e.g. folders, mails), the underlying data model of a folder and a mail, as well as what the methods mean: e.g. what is the difference between POSTing an email and PUTting an email? How does a client synchronize emails via HTTP in an efficient fashion? How do servers exchange emails between themselves to deliver them to the destination?
This is not a blocking problem: as you can see from the table above, it has already been done. But it needs to be done properly, and it is not simply "use HTTP".
Compared to other protocols, HTTP also has some advantages: it is stateless, and requests to resources can be cached efficiently using various caching mechanisms. Proxies on the server side and the client side can provide general caching without necessarily knowing the underlying application. Mobile phones and other devices with unreliable connections benefit a lot from statelessness.
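A short sketch of how generic this is (hypothetical URL, assuming the requests library): any HTTP client or intermediary can revalidate a cached resource with a conditional request, regardless of what the application behind it is.

```python
import requests

URL = "https://api.example.com/mailboxes/inbox"   # placeholder resource

# First request: the server identifies the representation with an ETag.
first = requests.get(URL)
etag = first.headers.get("ETag")

# Later revalidation: if nothing changed, the server answers 304 without a body,
# and the cached copy can be reused. No application knowledge is needed for this.
later = requests.get(URL, headers={"If-None-Match": etag} if etag else {})
if later.status_code == 304:
    body = first.content   # reuse the cached representation
else:
    body = later.content
```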
This also facilitates centralized monitoring of applications, as one can easily see which resources are available and how they are accessed by whom.
Also from a security perspective we can use standard mechanisms, such as OpenID Connect, that do not need to know the underlying application, as they simply work with anything using HTTP. For example, IMAP for emails still does not have the same level of support for multi-factor authentication as we see with OpenID Connect. Similarly, for machine-to-machine communication as envisioned by web identity federation, there is nothing foreseen in IMAP. The reason is that these things go beyond an individual application and need to be available generally to all applications.
Using OpenID Connect, one can also centrally manage authorisation: as each application based on HTTP simply exposes resources and methods on them, one can centrally manage authorisation to those resources without knowing the underlying application in detail.
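A brief sketch of the pattern (the issuer, client id and API URL are placeholders, assuming the requests library): a client obtains a token from the identity provider and then presents it as a bearer token to any HTTP-based application; the application, or a gateway in front of it, only needs to validate the token and check the requested resource and method.

```python
import requests

ISSUER = "https://id.example.com"                 # placeholder OpenID Connect provider
API = "https://api.example.com/mailboxes/inbox"   # placeholder protected resource

# Machine-to-machine: fetch a token via the OAuth2 client credentials grant,
# which OpenID Connect providers expose at a token endpoint.
token_resp = requests.post(f"{ISSUER}/oauth2/token", data={   # endpoint path is provider-specific
    "grant_type": "client_credentials",
    "client_id": "mail-batch-job",       # placeholder client
    "client_secret": "<secret>",
    "scope": "mail.read",
})
access_token = token_resp.json()["access_token"]

# The same bearer token works against any HTTP application; authorisation can be
# enforced centrally per resource and method (e.g. GET allowed, DELETE denied).
requests.get(API, headers={"Authorization": f"Bearer {access_token}"})
```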
Certainly, some applications will pose challenges for the HTTP model, such as peer-to-peer (P2P) and real-time communication. They can in principle be based on HTTP, as we already have, for example, video streaming over HTTP. Streams in real-time communication would simply be resources on which one can execute standard HTTP methods. Users would need to expose an HTTP port for direct communication instead of a set of proprietary ports used by custom application protocols. If they do not want to expose their HTTP port directly, they can also channel the HTTP requests through a reverse proxy that may provide additional protection mechanisms.
However, these applications will require IPv6 (for direct communication) and HTTP/3 (for higher throughput).
Finally, one should ask whether this is a positive development or not. We cannot say for sure yet. Clearly, compared to the OSI model it is much less complex, but this is only on the outside: in the future Internet stack there are fewer layers, but each layer is itself much more complex and does more tasks. This is typical in many scenarios, such as microservices and the like: complexity does not disappear by cutting the architecture differently, it is just moved to a different part. This can make sense in certain scenarios, but many organisations do not assess whether it makes sense for them and end up with very expensive wrong designs.
We currently do not know whether representing everything as resources is a good choice or not. There are indications of different, more complex models, such as in ActivityPub, which can make sense in certain situations.
As with any architectural choice, there is no single best choice, and not everything fits everyone equally. However, one needs to find a good common denominator to be interoperable in a complex world.
Conclusions
There is a trend towards simplifying the Internet stack, at least on the outside, ideally based on IPv6 and HTTP/3. We already see that most new applications use HTTP (e.g. HTTP/1.1 or HTTP/2) and not proprietary protocols on top of the transport layer. This facilitates security and monitoring as well as simplifies the setup of your application. IPv6 and HTTP/3 are currently deployed to different degrees. Nevertheless, applications that run on older IP and HTTP versions still benefit from the standardization: they have very mature and high-performance server and client libraries in virtually every software ecosystem.
Most likely, if the trend continues, you will build your application based on HTTP. Ideally you support IPv6 and HTTP/3. Nevertheless, you should study existing applications based on HTTP to make sure that you can benefit from its advantages also in the context of your application's architecture (e.g. security, caching etc.). Some of the disadvantages of older HTTP versions have been addressed in newer versions to make the move to HTTP more attractive.