The Incredible Scalability of Hotstar

Hotstar, launched in 2015, is India's leading streaming media and video-on-demand service. It lets users stream popular shows across many languages and genres over the internet, with approximately 300 million users and a billion minutes' worth of content.
However, the service's primary and most sought-after feature is the live streaming of cricket matches.
Recently, the streaming platform received record traffic of 10.3 million concurrent viewers in the latest edition of the IPL T20 cricket tournament, overtaking its previous record of 8.26 million.
So what are the technical foundations that make streaming at this scale so smooth?
The answer lies in Hotstar's backend: the Backend-for-Frontend (BFF) pattern.
Back-end development entails working on server-side software: everything that cannot be seen on a website. Back-end developers keep the site functioning properly by concentrating on databases, back-end logic, application programming interfaces (APIs), architecture, and servers. They write the code that lets browsers communicate with databases to store, read, and delete data.
The Backend-for-Frontend (BFF) pattern
The BFF is tightly coupled to a specific client experience and is typically maintained by the same team as the client interface. This makes it easier to define and adapt the API as the UI requires, and it simplifies coordinating the release of the client and server components.
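To make the pattern concrete, here is a minimal sketch of a per-client BFF in Fastify (the framework discussed later in this article). The route, upstream call, and fields are illustrative assumptions, not Hotstar's actual API:

```ts
// A minimal sketch of the BFF idea: each client experience gets its own
// thin backend, owned by the client team, that shapes responses for that UI.
// Service names and fields are illustrative, not Hotstar's real API.
import Fastify from "fastify";

const mobileBff = Fastify();

// Hypothetical upstream fetch standing in for a real content service.
async function fetchTray(): Promise<{ title: string; items: string[] }> {
  return { title: "Trending", items: ["match-1", "show-7"] };
}

// The mobile team can evolve this route alongside its UI, independently
// of the TV or web BFFs, and release both sides together.
mobileBff.get("/home", async () => {
  const tray = await fetchTray();
  return { tray: { title: tray.title, items: tray.items.slice(0, 10) } };
});

mobileBff.listen({ port: 3000 }, (err) => {
  if (err) throw err;
});
```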

Achieving scalability
There are two basic models for ensuring seamless scalability: traffic-based and ladder-based.
When using traffic-based scaling, the technical team simply adds new servers and infrastructure to the pool as the number of requests processed by the system grows.
In cases where the details and nature of incoming traffic are less clear, ladder-based scaling is used: the Hotstar tech team pre-defines infrastructure "ladders" per million concurrent users, and as the system processes more requests, capacity is added one ladder at a time.
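A rough sketch of the ladder model follows; every number below is invented purely for illustration:

```ts
// Illustrative ladder-based scaling: infrastructure is pre-defined in
// "ladders", one per step of concurrent users, rather than reacting to
// raw request counts. All figures are made up for this sketch.
interface Ladder {
  maxConcurrentUsers: number; // this ladder covers traffic up to this level
  appServers: number;
  cacheNodes: number;
}

const ladders: Ladder[] = [
  { maxConcurrentUsers: 1_000_000, appServers: 50, cacheNodes: 10 },
  { maxConcurrentUsers: 2_000_000, appServers: 110, cacheNodes: 22 },
  { maxConcurrentUsers: 3_000_000, appServers: 180, cacheNodes: 36 },
];

// Pick the smallest ladder that covers the current concurrency.
function ladderFor(concurrentUsers: number): Ladder {
  const ladder = ladders.find((l) => concurrentUsers <= l.maxConcurrentUsers);
  return ladder ?? ladders[ladders.length - 1];
}

console.log(ladderFor(1_500_000)); // -> the 2M-user ladder
```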

As of now, Hotstar maintains a concurrency buffer of 2 million concurrent users, which is fully utilized during peak events such as World Cup matches or IPL tournaments.
If the number of users exceeds this concurrency level, it takes 90 seconds to add new infrastructure to the pool and 74 seconds to start the container and the application.
To deal with this time lag, the team keeps a pre-provisioned buffer, the opposite of reactive auto-scaling, which has proven to be the better option.
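As a sketch of why the buffer matters, the hypothetical check below combines the two lag figures above (90 s + 74 s) with the 2-million-user buffer; the growth model and the function itself are assumptions, not Hotstar's actual scaling logic:

```ts
// Sketch of the pre-provisioned-buffer idea: because adding capacity takes
// roughly 90 s (infrastructure) + 74 s (container + app start), scaling
// must be triggered while the buffer can still absorb growth. Everything
// except the two lag figures and the buffer size is illustrative.
const PROVISIONING_LAG_SECONDS = 90 + 74;

function shouldScaleUp(
  currentUsers: number,
  usersPerSecondGrowth: number, // e.g. from the prediction models
  provisionedCapacity: number,
  bufferUsers = 2_000_000 // Hotstar's stated concurrency buffer
): boolean {
  // Users expected by the time newly requested capacity becomes usable.
  const projected =
    currentUsers + usersPerSecondGrowth * PROVISIONING_LAG_SECONDS;
  // Scale up before the projected load eats into the reserved buffer.
  return projected > provisionedCapacity - bufferUsers;
}

console.log(shouldScaleUp(8_000_000, 5_000, 10_500_000)); // -> true
```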
The team also has an internal dashboard called Infradashboard, which helps it make smart decisions during an important event based on concurrency levels and prediction models for incoming users.
The Road that led to BFF
All client apps used to receive the same response payload from the backend services, regardless of the hardware device or its capabilities. Compared to a web app or a smart TV app, the payload for mobile clients could be much smaller, so controlling the wire size and optimizing the data for specific screen formats became critical.
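For illustration, here is how the same source metadata might be projected differently per device; the type and field names are invented for this sketch:

```ts
// Illustrative device-specific payload shaping: the same source metadata,
// projected differently per screen format. Field names are invented.
interface ContentMeta {
  id: string;
  title: string;
  synopsis: string;
  images: { thumb: string; poster: string; banner4k: string };
}

// Mobile gets a trimmed payload: essential fields and a small image only.
function toMobilePayload(m: ContentMeta) {
  return { id: m.id, title: m.title, image: m.images.thumb };
}

// A smart TV can afford the full synopsis and large artwork.
function toTvPayload(m: ContentMeta) {
  return {
    id: m.id,
    title: m.title,
    synopsis: m.synopsis,
    image: m.images.banner4k,
  };
}
```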
The majority of the UI processing logic lived on the clients, and each client processed the service response with its own logic. This frequently shifted how content information was derived and rendered in the UI, creating mismatches in the rendered information and a broken experience across devices. And because most business-logic processing occurred on the client, migrating it to a backend service could improve perceived page-load time, particularly on low-end devices.
Personalization is a standalone microservice. It returns unique identifiers for content reflecting the user's preferences; the clients would then ask the content service for the metadata corresponding to those identifiers. This layout made the client apps chatty and contributed significantly to page-load time.
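Seen from a client, that chatty flow looked roughly like this; the endpoint paths are hypothetical. Nothing can render until both sequential round trips complete:

```ts
// The "chatty" flow described above, as seen from a client. Endpoint
// paths are hypothetical. Two sequential round trips are needed before
// anything can be rendered, which inflates page-load time.
async function loadPersonalizedTray(userId: string) {
  // Round trip 1: personalization returns only content identifiers.
  const ids: string[] = await (
    await fetch(`https://api.example.com/personalization/${userId}`)
  ).json();

  // Round trip 2: the content service resolves identifiers to metadata.
  const metadata = await (
    await fetch(`https://api.example.com/content?ids=${ids.join(",")}`)
  ).json();

  return metadata;
}
```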
The fix for these problems was to create a middle layer between the clients and the microservices.

This layer should make all the multi-service calls and compose the entire metadata payload for the client to consume, and it must do so at high throughput with low latency. The technology or framework used to build it should allow a high level of client-team participation, so that client developers can make changes without much help from the backend services team. The layer's services should also be highly scalable.
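A minimal sketch of such a middle layer in Fastify, fanning out to hypothetical internal services in parallel and composing one payload for the client (all URLs and shapes are assumptions):

```ts
// Sketch of the middle layer composing one response from several
// microservices in parallel. Service URLs and shapes are illustrative.
import Fastify from "fastify";

const bff = Fastify();

bff.get("/home/:userId", async (req) => {
  const { userId } = req.params as { userId: string };

  // Fan out to the microservices concurrently instead of letting the
  // client make sequential calls.
  const [ids, profile] = await Promise.all([
    fetch(`https://internal.example.com/personalization/${userId}`).then(
      (r) => r.json()
    ),
    fetch(`https://internal.example.com/profile/${userId}`).then((r) =>
      r.json()
    ),
  ]);

  // Resolve identifiers to metadata in one upstream call, then return a
  // single composed payload to the client.
  const metadata = await fetch(
    `https://internal.example.com/content?ids=${(ids as string[]).join(",")}`
  ).then((r) => r.json());

  return { profile, tray: metadata };
});

bff.listen({ port: 3001 }, (err) => {
  if (err) throw err;
});
```

The client now makes one request and one round trip instead of several, which is precisely the page-load-time win the middle layer was built for.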
One of the key implementation decisions was which technology to choose so that client-team developers could contribute to the development and maintenance of this layer without investing heavily in learning a new technology or framework.
So the technology or framework chosen needed to be fast and easy for anyone to pick up: Node.js. The Node.js ecosystem has several general-purpose web frameworks (Hapi, Express, Restify), but Fastify is blazingly fast compared to the alternatives.
Because of its plug-in model, Fastify allows software reuse and extensibility, and its rich plug-in ecosystem covers the documentation, CORS enablement, logging, and response compression requirements. Validation and serialization, provided out of the box, are another useful Fastify offering: incoming requests can be quickly validated for the appropriate headers and query parameters, and JSON serialization gets a significant speedup from the fast-json-stringify library that Fastify uses internally.
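For example, a Fastify route can declare a querystring schema for validation and a response schema that fast-json-stringify uses to serialize quickly; the route and fields below are illustrative:

```ts
// Fastify's out-of-the-box validation and serialization: the querystring
// schema rejects malformed requests early, and the response schema lets
// fast-json-stringify speed up serialization. Route and fields are
// illustrative, not Hotstar's API.
import Fastify from "fastify";

const app = Fastify();

app.get(
  "/content",
  {
    schema: {
      // Requests without an `ids` query parameter are rejected with 400.
      querystring: {
        type: "object",
        required: ["ids"],
        properties: { ids: { type: "string" } },
      },
      // The 200 response is serialized against this schema.
      response: {
        200: {
          type: "object",
          properties: {
            items: {
              type: "array",
              items: {
                type: "object",
                properties: {
                  id: { type: "string" },
                  title: { type: "string" },
                },
              },
            },
          },
        },
      },
    },
  },
  async (req) => {
    const { ids } = req.query as { ids: string };
    return {
      items: ids.split(",").map((id) => ({ id, title: `Title for ${id}` })),
    };
  }
);

app.listen({ port: 3000 }, (err) => {
  if (err) throw err;
});
```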
DNS caching is used to avoid network delays and improve performance. It is also critical that the services be rigorously load tested before going into production; this exhaustive load-testing exercise yielded a configuration for the socket timeouts and the heap memory of each service instance that allows it to handle 1k requests per minute.
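The article does not say how DNS caching was implemented; one common approach in Node, sketched below as an assumption, is the cacheable-lookup package installed on the agent used for upstream calls:

```ts
// One common way to add DNS caching in Node, using the cacheable-lookup
// package (an assumption; the team's actual mechanism is not stated).
// Lookups are cached per TTL, avoiding a resolver round trip on every
// upstream call.
import http from "node:http";
import CacheableLookup from "cacheable-lookup";

const cacheable = new CacheableLookup();

// Install the cached lookup on the agent used for upstream service calls.
const upstreamAgent = new http.Agent({ keepAlive: true });
cacheable.install(upstreamAgent);

// Illustrative upstream call; the host is a placeholder.
http.get(
  { host: "internal.example.com", path: "/health", agent: upstreamAgent },
  (res) => console.log("status:", res.statusCode)
);
```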
What else can be improved?
Currently, the middle-layer BFF communicates with all services over HTTP/1.1.
If HTTP/2 were used, its support for multiplexing different network requests over the same connection would yield significant speed improvements when aggregating responses from multiple services.
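A sketch with Node's built-in http2 module, where two requests are multiplexed over a single connection; the upstream URL and paths are illustrative:

```ts
// HTTP/2 multiplexing with Node's built-in http2 module: two requests
// share one TCP connection instead of two HTTP/1.1 connections. The
// upstream authority and paths are placeholders.
import http2 from "node:http2";

const session = http2.connect("https://internal.example.com");

function get(path: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const req = session.request({ ":path": path });
    let body = "";
    req.setEncoding("utf8");
    req.on("data", (chunk) => (body += chunk));
    req.on("end", () => resolve(body));
    req.on("error", reject);
  });
}

// Both requests are in flight concurrently on the same session.
Promise.all([get("/personalization/42"), get("/content?ids=a,b")])
  .then(([personalization, content]) => {
    console.log(personalization, content);
    session.close();
  })
  .catch(console.error);
```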
To move data through the middle layer as soon as it arrives, Node streams could be used, making each service instance more memory- and time-efficient.
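For instance, the BFF could pipe an upstream response straight through to the client instead of buffering it whole; Fastify accepts a readable stream as the reply body. The upstream host below is a placeholder:

```ts
// Sketch of streaming through the middle layer: the upstream response is
// piped to the client as it arrives, so the BFF never buffers the whole
// payload in memory. Fastify accepts a stream as the reply body.
import Fastify from "fastify";
import https from "node:https";

const app = Fastify();

app.get("/passthrough", (req, reply) => {
  // Illustrative upstream host.
  https.get("https://internal.example.com/large-tray", (upstream) => {
    reply
      .header(
        "content-type",
        upstream.headers["content-type"] ?? "application/json"
      )
      .send(upstream); // upstream is a Readable stream; Fastify pipes it
  });
});

app.listen({ port: 3000 }, (err) => {
  if (err) throw err;
});
```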
Finally, all communication with the backend microservices currently goes through the CDN, where DNS resolution, rate limiting, and WAF protection all add latency. Communicating directly with the backend microservices' ELB may be considered in the future.
This implementation, however, is not an exact implementation of the traditional BFF pattern; it functions more as an API gateway, since the client app still holds the logic for generating the view layer from the metadata. The goal is to evolve the service towards the traditional BFF pattern, where it also generates the view for each client app.
Even so, Hotstar has proven to deliver a delightful and hassle-free viewing experience.

Content Contribution: Siddhi Mohanty, BTech CSE (AI)