System Design - Chat Messenger Like WhatsApp

What is a chat messenger?

 
Most of us use a personal chat messenger such as WhatsApp or Signal to send messages to individuals or groups. Messages can be plain text or media (images, videos, documents, and so on).
 

Functional requirements

 
We will discuss and design the following features of a chat messenger:
  1. Send text message (One to one)
  2. Ack of Sent, Delivered and Read receipts
  3. Last seen of an individual
  4. Send media message
  5. Profile management
These services must also be reliable and able to handle very large volumes of traffic, including spikes on occasions such as New Year.
 

High-level design

 
[Figure: high-level design of the chat messenger]
 

Network protocol

 
When A sends a message to B, there has to be an intermediary between them because A does not know B's network address. The server plays this role: A sends the message to the server, and the server forwards it to B. But can the server initiate a request to B? Plain HTTP is request-response: the server can send data to a client only in response to a request from that client. In our scenario the recipient (B) is not the client that made the request (A), so the server cannot push the message to B over plain HTTP.
 
To solve this problem, we can use WebSocket, which provides a full-duplex channel over a single TCP connection. When a user's client comes online, it opens a connection to the server and keeps it open. Over this persistent connection, both client and server can send messages to each other at any time, which is why it is called a full-duplex connection. Running it over TLS (wss://) keeps the traffic private.
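To make the idea of a server-initiated send concrete, here is a minimal sketch. It does not use a real WebSocket library: an `asyncio.Queue` stands in for the server-to-client half of an open connection, and the `Gateway` class and its method names are hypothetical, chosen only for illustration.

```python
import asyncio


class Gateway:
    """Holds one open channel per connected user.

    Assumption: an asyncio.Queue stands in for the server-to-client
    half of a WebSocket connection so the sketch runs without a network.
    """

    def __init__(self) -> None:
        self.connections: dict[str, asyncio.Queue] = {}

    def connect(self, user_id: str) -> asyncio.Queue:
        # The client opens the connection once and keeps it open.
        channel: asyncio.Queue = asyncio.Queue()
        self.connections[user_id] = channel
        return channel

    def push(self, user_id: str, payload: str) -> bool:
        # Server-initiated send: no prior request from user_id is needed.
        channel = self.connections.get(user_id)
        if channel is None:
            return False  # the user has no open connection
        channel.put_nowait(payload)
        return True


async def demo() -> str:
    gateway = Gateway()
    inbox_b = gateway.connect("B")      # B comes online
    gateway.push("B", "hello from A")   # server pushes without a request from B
    return await inbox_b.get()          # B's client reads from its open channel
```

The point of the sketch is the `push` method: once B's connection exists, the server can write to it whenever a message for B arrives, which plain request-response HTTP does not allow.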
 

Send text message

 
When user A sends a message to user B while A's device is not connected to the internet, the mobile client saves the message in a local SQL database such as SQLite.
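A client-side outbox of this kind could look roughly like the sketch below. The table name `outbox`, its columns, and the function names are assumptions made for illustration; a real mobile client would use its platform's SQLite bindings rather than Python.

```python
import sqlite3


def open_outbox(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local database that holds unsent messages."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS outbox (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               recipient TEXT NOT NULL,
               body TEXT NOT NULL
           )"""
    )
    return conn


def queue_message(conn: sqlite3.Connection, recipient: str, body: str) -> int:
    """Store a message locally while the device is offline."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO outbox (recipient, body) VALUES (?, ?)",
            (recipient, body),
        )
    return cur.lastrowid


def flush_outbox(conn: sqlite3.Connection, send) -> int:
    """Send pending messages in insertion order once the device is online.

    `send` is a hypothetical callable(recipient, body) that hands the
    message to the gateway; a row is deleted only after send returns.
    """
    rows = conn.execute(
        "SELECT id, recipient, body FROM outbox ORDER BY id"
    ).fetchall()
    for row_id, recipient, body in rows:
        send(recipient, body)
        with conn:
            conn.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
    return len(rows)
```

Deleting each row only after `send` returns means a crash mid-flush leaves the remaining messages in the outbox for the next attempt.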
 
When A comes back online, the client establishes a duplex connection with a gateway (say G1) and sends the pending message over it. G1 passes the request to MessageService, which queries the database to check whether B is currently connected to any gateway. If B is not found, MessageService keeps the message on the server; when B comes online, it forwards the message to B via B's gateway (G2) and deletes the stored copy from the server.
 
 
To store these connections, we can use Redis as a distributed cache. It is a key-value store that keeps all data in main memory, so we can quickly look up which user is connected to which gateway. Redis can also persist this data to disk when required.
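The routing flow described in this section can be sketched as follows. This is only an illustration under stated assumptions: a plain dictionary stands in for the Redis connection registry, gateways are modelled as hypothetical callables, and the method names are invented for the example.

```python
from collections import defaultdict


class MessageService:
    """Routes messages to the recipient's gateway or holds them until it connects."""

    def __init__(self, gateways: dict) -> None:
        # gateway_id -> callable(user_id, message); hypothetical gateway handles
        self.gateways = gateways
        # user_id -> gateway_id; a dict standing in for the Redis registry
        self.connections: dict = {}
        # user_id -> messages stored on the server while the user is offline
        self.pending = defaultdict(list)

    def user_connected(self, user_id: str, gateway_id: str) -> None:
        self.connections[user_id] = gateway_id
        # Forward anything stored for this user, removing it from the server.
        for message in self.pending.pop(user_id, []):
            self.gateways[gateway_id](user_id, message)

    def user_disconnected(self, user_id: str) -> None:
        self.connections.pop(user_id, None)

    def send(self, sender: str, recipient: str, text: str) -> bool:
        """Return True if forwarded immediately, False if stored for later."""
        message = {"from": sender, "text": text}
        gateway_id = self.connections.get(recipient)
        if gateway_id is None:
            self.pending[recipient].append(message)
            return False
        self.gateways[gateway_id](recipient, message)
        return True
```

For example, if B is offline when A sends "hello", `send` returns `False` and the message waits in `pending`; a later `user_connected("B", "G2")` forwards it through G2 and clears the stored copy.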
 

Ack of Sent, Delivered, and Read receipts

 
When MessageService receives the message from A, it sends a Sent ack back to A. Once MessageService has delivered the message to B, B's client sends an ack to MessageService saying that the message has been delivered, even though B has not opened it yet, and MessageService forwards this Delivered ack to A. In the same way, when B reads the message, B's client sends another ack to MessageService, which forwards the Read receipt to A.
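One way to model the three receipts is as an ordered state per message. The sketch below assumes that Sent is recorded when the server accepts the message and that Delivered and Read arrive later as acks from the recipient's client; `Receipt`, `ReceiptTracker`, and the `notify_sender` callback are hypothetical names.

```python
from enum import IntEnum


class Receipt(IntEnum):
    SENT = 1
    DELIVERED = 2
    READ = 3


class ReceiptTracker:
    """Tracks the highest receipt reached by each message and notifies the sender."""

    def __init__(self, notify_sender) -> None:
        # notify_sender: hypothetical callable(sender_id, message_id, receipt)
        self.notify_sender = notify_sender
        self.state: dict = {}  # message_id -> (sender_id, Receipt)

    def on_server_received(self, message_id: str, sender_id: str) -> None:
        # Assumption: "Sent" means the server has accepted the message.
        self.state[message_id] = (sender_id, Receipt.SENT)
        self.notify_sender(sender_id, message_id, Receipt.SENT)

    def on_ack(self, message_id: str, receipt: Receipt) -> bool:
        """Apply an ack from the recipient's client; ignore stale or duplicate acks."""
        sender_id, current = self.state[message_id]
        if receipt <= current:
            return False
        self.state[message_id] = (sender_id, receipt)
        self.notify_sender(sender_id, message_id, receipt)
        return True
```

Because the states are ordered, a duplicate Delivered ack that arrives after Read is ignored rather than moving the message backwards.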
 

Last seen

 
We can track a user's last seen in a couple of ways. When the user performs any activity in the client, such as sending a text or media message, MessageService invokes LastSeenService to update that user's timestamp. Sometimes the user is connected to a gateway while the app is closed, and the client keeps receiving messages in the background (for example, as notifications) and sends Delivered acks back to the service. In such cases the user has not opened the application, so these are system-initiated messages rather than user-initiated ones, and LastSeen should not be updated. In this way, we keep the user's last seen up to date in the database.
 
 
But there is a catch. A user can keep the application open without performing any activity, and in this case LastSeen should still be updated.
 
To handle this, the client can send a LastSeen timestamp at regular intervals, say every 5 or 10 seconds, and LastSeenService updates that user's timestamp in the database. The disadvantage is that these regular updates consume network bandwidth.
 
To store users' LastSeen values, we can use Redis, so the last seen of any user can be retrieved quickly when needed.
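The two update paths discussed in this section (user-initiated events and periodic heartbeats from an open app) can be sketched as below. A dictionary stands in for Redis, and the class layout, method names, and injectable `clock` parameter are assumptions added to keep the example self-contained.

```python
import time


class LastSeenService:
    """Keeps the most recent user-initiated activity time per user."""

    def __init__(self, clock=time.time) -> None:
        self.clock = clock          # injectable so the behaviour can be checked
        self.last_seen: dict = {}   # user_id -> timestamp; stand-in for Redis

    def record_event(self, user_id: str, user_initiated: bool) -> None:
        # System-initiated events (e.g. a background Delivered ack) are ignored.
        if user_initiated:
            self.last_seen[user_id] = self.clock()

    def heartbeat(self, user_id: str) -> None:
        # Sent periodically by the client while the app is open, even if idle.
        self.last_seen[user_id] = self.clock()

    def get(self, user_id: str):
        return self.last_seen.get(user_id)
```

With this split, a Delivered ack sent while the app is closed leaves the timestamp unchanged, while an idle but open app keeps it fresh through heartbeats.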
 

Send media messages

 
Media messages can be images, videos, or documents. When user C sends a media message, the client invokes a service called MediaService.
 
This service saves the media file to external storage or a CDN. It also generates a unique hash for the message and invokes MessageService, which forwards the message (containing the hash) to user B so that B's client can download the media file using that hash. Why do we need the hash? It identifies the message (so an ack can be sent back to the sender) and locates the stored media file until it has been downloaded to the user's device.
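A rough sketch of the upload and download path follows. It assumes the hash is a SHA-256 digest of the file contents, which is one possible choice (a randomly generated ID would also work, and would stay unique even when the same file is sent twice); the dictionary standing in for external storage and the method names are hypothetical.

```python
import hashlib


class MediaService:
    """Stores media by hash and produces the small message that is forwarded."""

    def __init__(self) -> None:
        # media hash -> file bytes; stand-in for external storage or a CDN
        self.storage: dict = {}

    def upload(self, sender: str, recipient: str, data: bytes) -> dict:
        # Assumption: the hash is derived from the file contents.
        media_hash = hashlib.sha256(data).hexdigest()
        self.storage[media_hash] = data
        # Only this reference travels through MessageService, not the file.
        return {"from": sender, "to": recipient, "media_hash": media_hash}

    def download(self, media_hash: str) -> bytes:
        # The recipient's client resolves the hash to the stored file.
        return self.storage[media_hash]
```

Keeping the file out of the message itself means the chat path carries only a short reference, while the heavy transfer goes directly between the client and storage.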
 
 

Load balancers for each microservice

 
A single instance of a service cannot handle all the traffic, so we need multiple instances of each service. A load balancer can be placed in front of each service to distribute traffic across the instances of that service.
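As an illustration of how traffic might be spread across instances, here is a minimal round-robin balancer. Round-robin is only one possible strategy and is an assumption here, as are the class name and the modelling of service instances as plain callables.

```python
import itertools


class RoundRobinBalancer:
    """Sends each request to the next service instance in turn."""

    def __init__(self, instances) -> None:
        instances = list(instances)
        if not instances:
            raise ValueError("at least one service instance is required")
        # itertools.cycle repeats the instances endlessly in order.
        self._cycle = itertools.cycle(instances)

    def route(self, request):
        instance = next(self._cycle)
        return instance(request)


# Two hypothetical MessageService instances, modelled as callables.
balancer = RoundRobinBalancer([
    lambda request: f"ms-1:{request}",
    lambda request: f"ms-2:{request}",
])
```

Production load balancers usually also track instance health and load, which this sketch leaves out.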