As web platforms grow more complex and demand higher throughput, architecture and design decisions become increasingly important. Different web platforms with different use cases will call for different solutions, but there are a number of common components and patterns.
At the most basic level, a web platform will consist of an application layer on a web server and a persistence layer, typically a database. The application layer contains the majority of a platform's logic. Depending on the platform this may be as simple as an API layer allowing for basic CRUD (Create, Read, Update, Delete) operations, but in most cases it will be where the business logic associated with the platform is executed. Databases may be relational, such as MySQL, PostgreSQL, and Oracle, or non-relational (NoSQL) stores, such as MongoDB, Redis, and Cassandra.
Technically, one could simplify even further by placing the web server and database on a single machine, but this is rarely done and is fit only for the most basic of sandbox applications.
Such a basic architecture unfortunately does not scale. While it may work for small platforms or as an initial minimum viable product, it will not last for long. The most glaring issue is the number of potential points of failure: if either the database or the web server goes down, the site goes down with it. Furthermore, having a single web server severely limits the number of requests the platform can handle at a time.
Scaling the Web Servers
There are two methods to scale one's server architecture: vertically and horizontally. Vertical scaling involves upgrading one's server to increase its memory and/or CPU processing power. This is inherently constrained by technological limits, though. One can only increase a server's memory and CPU to such a degree, and even then it may become prohibitively expensive. Worse still, such an architecture has a high risk of failure: if the one server goes down, there goes your platform. Because of this, in the majority of cases horizontal scaling is the correct choice.
Horizontal scaling involves incorporating additional servers to balance the load between servers. The division of load can be done by a separation of services, where separate logic is placed on separate servers, or by placing all of the same logic on all servers. Then to properly route requests between these web servers, a load balancer must be used. A load balancer is software or hardware that acts as a reverse proxy, distributing requests between the multiple web servers.
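One of the simplest distribution strategies a load balancer can use is round-robin, cycling through the pool of web servers in order. Here is a minimal sketch in Python; the server addresses are hypothetical placeholders:

```python
import itertools

class RoundRobinBalancer:
    """Hands out web servers in round-robin order, one per request."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def next_server(self):
        return next(self._cycle)

# Hypothetical pool of identical web servers behind the balancer.
balancer = RoundRobinBalancer(["web1:8080", "web2:8080", "web3:8080"])

# Six incoming requests: each server receives exactly two of them.
targets = [balancer.next_server() for _ in range(6)]
```

Real load balancers such as HAProxy or NGINX offer more sophisticated strategies (least connections, weighted distribution, health checks), but round-robin illustrates the core idea of spreading requests across the pool.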
Scaling the Database
With multiple web servers behind a load balancer, this architecture becomes much more scalable, but is still limited. The primary limiting factor here is the database. Writing data, especially in relational databases such as MySQL, PostgreSQL, and Oracle, is expensive and in many use cases will severely limit how far a web platform may scale. Scaling one's database can become significantly more complicated than scaling one's web server tier, but is essential in scaling one's web presence.
To scale a read-heavy web platform, a common solution is to use master-slave database replication. In this setup when writing data, the data is written to the master database and then replicated to all slave databases. Reading of data can then be distributed between the databases, easing the load on any one database.
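The routing logic this setup implies can be sketched in a few lines: writes always go to the master, while reads are spread across the slave replicas. The database names below are hypothetical:

```python
import random

class ReplicatedDatabase:
    """Routes writes to the master and distributes reads across slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = slaves

    def route_write(self):
        # All writes hit the master, which replicates them to the slaves.
        return self.master

    def route_read(self):
        # Reads can be served by any replica, easing load on the master.
        return random.choice(self.slaves)

db = ReplicatedDatabase("db-master", ["db-slave1", "db-slave2"])
```

Note that replication is asynchronous in many deployments, so a read routed to a slave may briefly return stale data after a write; read-heavy platforms generally tolerate this.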
A master-master configuration can also be used, where data is written to and read from both databases. Such a configuration helps with reliability and eliminates the single point of failure from the database layer, but makes consistency between the databases very difficult to maintain.
Another technique for scaling the database layer is to split decoupled data between databases or to utilize database sharding. Database sharding involves splitting database table rows between separate databases. An example of this could involve a database table of phone numbers. Phone numbers starting between 000…499 would be written to one database, while phone numbers starting between 500…999 would be written to another database. Doing this distributes write load between multiple databases, though it does require a clear and straightforward algorithm for determining which shard to read and write data from.
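The phone-number example above can be expressed as a small routing function. This is a sketch assuming two shards split on the leading three digits; the shard names are hypothetical:

```python
def shard_for(phone_number, shards):
    """Map a phone number to a shard based on its leading three digits."""
    prefix = int(phone_number[:3])
    # With two shards: prefixes 000-499 land on shard 0, 500-999 on shard 1.
    return shards[prefix * len(shards) // 1000]

shards = ["db-shard-a", "db-shard-b"]
```

A range-based scheme like this is simple but can produce uneven "hot" shards if the key distribution is skewed; hashing the key before bucketing is a common alternative.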
Eliminating the Single Point of Failure
With multiple web servers and multiple databases in place, the final single point of failure in this architecture lies within the load balancer. If the load balancer dies, the entire system collapses. To remedy this, one may integrate a second load balancer, typically in either a master-slave or master-master configuration.
In a master-slave load balancer configuration, only the master load balancer handles traffic under normal operation. At set time intervals the master load balancer sends a "heartbeat" to the slave load balancer, notifying it that all is okay. If the master load balancer goes down, the slave load balancer is there to pick up the slack: upon not hearing the master's "heartbeat", the slave load balancer steps up and takes on master duties.
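The failover decision on the slave side boils down to tracking when the last heartbeat arrived. A minimal sketch of that logic, with the timeout value chosen arbitrarily for illustration:

```python
import time

class SlaveBalancer:
    """Promotes itself to master if no heartbeat arrives within the timeout."""

    def __init__(self, timeout=3.0):
        self.timeout = timeout
        self.last_heartbeat = time.monotonic()
        self.is_master = False

    def receive_heartbeat(self):
        # Called whenever the master's periodic heartbeat arrives.
        self.last_heartbeat = time.monotonic()

    def check(self):
        # If the heartbeat is overdue, presume the master is dead and take over.
        if time.monotonic() - self.last_heartbeat > self.timeout:
            self.is_master = True
        return self.is_master
```

Production tools such as Keepalived implement this pattern (via VRRP) along with the harder parts, like moving the shared virtual IP address to the newly promoted balancer.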
A master-master load balancer configuration utilizes both load balancers and depends on DNS to balance requests between the two load balancers.
With an architecture such as this, with multiple load balancers, multiple web servers, and multiple databases, a web platform will be able to handle significant load. That said, this architecture does not yet employ caching, which severely hinders its efficiency. In a follow-up post I will dive into the world of caching, including Varnish, Memcached, Redis, and more.