Revamped and Automated the Infrastructure for NTN Buzztime

Executive Summary
NTN Buzztime Inc. was looking for scalable infrastructure and a new platform that could display real-time restaurant menus on 50,000 devices placed across the United States. HashedIn modernized NTN Buzztime's infrastructure and enabled the company to go to market faster.
Problem Statement
NTN Buzztime Inc. is an interactive gaming company known for introducing innovative dining technologies to bars and restaurants across the United States and Canada. Buzztime wanted to modernize its infrastructure to go to market faster. It wanted a new platform that would allow effective monitoring with zero downtime.

Business Requirements

Objective
To modernize the architecture and provide a stable platform that performs consistently across all environments.

Key Requirements
The requirements put forth by the client are summarized below:
- Modernize the existing architecture with the latest tech stack to address cost and scalability
- Build a stable platform that can handle a large volume of requests with zero downtime
- Implement continuous deployment and delivery to enable Release Early and Often
- Standardize environments and version control
- Maintain the infrastructure proactively

Impact and Involvement of Stakeholders
Smartshift has a large user base. A large network of operations teams is also involved on a daily basis.
- Restaurant Customers & Employees: Food menus were delivered to 50K tablets, so customers and employees in restaurants used the platform to place orders.
- Website users: Website data was served by the same platform. Approximately 200K requests were served per day.
- Web Trivia game users: Gamers used the platform's metadata for live web trivia games. The platform had to scale to serve a large number of live gamers.
- Product Managers: Product managers were focused on delivering features early and often.
- Infrastructure Admins: The admins were focused on reducing the time spent on release management tasks.
- Developers: The developers were responsible for identifying incidents before users were impacted.
Solution Approach

Our Solution Structure
- Docker: Containerization with lightweight Docker containers made effective use of VM resources. It also brought consistency across environments. The isolation and segregation that Docker provides reduced the areas needing regression testing.
- Docker Swarm: Continuous deployment of containers was implemented on Docker Swarm clusters, and swarm failover was enabled to protect against node outages and achieve zero downtime. Docker Swarm was also set up to add or remove container instances as computing demand changed, which reduced cost.
- Nginx: Reverse proxying and load balancing were implemented with advanced security settings, which helped achieve 100% uptime.
- Redis: Smart caching with Redis made it possible to serve real-time food menus to 50,000 tablets across the US. A Redis Sentinel setup with 3 Redis servers on 3 hosts, instead of a single server, made the system more fault tolerant.
- Gunicorn: A Gunicorn application server was implemented and tuned for high performance and lower response times.
- New Relic: New Relic was set up for enhanced monitoring, which helped in proactive maintenance of the application and infrastructure.
- Logentries: Logentries was implemented for centralized log management of Docker and other applications. This provided a proactive alert system for live incidents and reduced debugging time by 30%.
- Logrotate: Logrotate was set up to periodically collect logs from each application node. It also periodically cleans up old logs to keep disk usage steady.
- Beanstalk and Jenkins: Beanstalk and Jenkins were used for continuous integration and orchestration, which resulted in smooth deployments and error-free builds, with the ability to fall back in case of failures.
- ApacheBench & Gatling (Automated testing): These tools enabled automated performance testing, which helped keep the system stable with each deployment.
- HAProxy and Docker Swarm: Together, HAProxy and Docker Swarm make the system more efficient at handling a large load.
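The case study does not list the Gunicorn settings that were tuned, so the following is only a minimal sketch of what such a configuration file can look like. Gunicorn reads its configuration from a Python file; the setting names below are real Gunicorn settings, but every value, including the port, is an assumption chosen for illustration.

```python
# gunicorn.conf.py: illustrative settings only. The values are assumptions,
# not the configuration used in the project.
import multiprocessing

# Address the NGINX upstream would point at (hypothetical port).
bind = "0.0.0.0:8000"

# Gunicorn's documentation suggests (2 x cores) + 1 workers as a starting point.
workers = multiprocessing.cpu_count() * 2 + 1

# Synchronous workers are the Gunicorn default worker type.
worker_class = "sync"

# Seconds a worker may stay silent before it is killed and restarted.
timeout = 30

# Restart each worker after this many requests to limit slow memory growth.
max_requests = 1000

# Random spread added to max_requests so workers do not all restart at once.
max_requests_jitter = 100
```

A file like this is typically passed to the server with `gunicorn -c gunicorn.conf.py app:application`, where `app:application` is a placeholder for the real WSGI module and callable.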
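The case study does not say how the automated performance test results were evaluated. One common approach is to have the build fail when latency or error rate crosses a threshold. The sketch below assumes latencies have already been collected by a load-testing tool; the function names `percentile` and `passes_gate` and the default thresholds are hypothetical.

```python
import statistics


def percentile(samples, pct):
    """Return the pct-th percentile (1 to 99) of samples.

    Needs at least two samples on older Python versions.
    """
    # quantiles(n=100) returns 99 cut points; index pct - 1 is the pct-th one.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return cuts[pct - 1]


def passes_gate(latencies_ms, errors, p95_limit_ms=200.0, max_error_rate=0.01):
    """Decide whether a load-test run is good enough to deploy.

    latencies_ms: response times of successful requests, in milliseconds.
    errors: number of failed requests in the same run.
    The default limits are assumptions, not values from the project.
    """
    total_requests = len(latencies_ms) + errors
    error_rate = errors / total_requests
    p95 = percentile(latencies_ms, 95)
    return p95 <= p95_limit_ms and error_rate <= max_error_rate
```

A CI job could call `passes_gate` on the parsed load-test report and stop the pipeline when it returns `False`, so that a slow or failing build never reaches deployment.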
Solution Dynamics and Interactions
The overall solution is composed of the following layers:

HAProxy Layer
HAProxy is the load balancer and the first point of contact for an incoming request. It forwards each request to one of the swarm load balancers in the Docker Swarm cluster.

Docker Swarm Load Balancer
The Docker Swarm load balancer's job is to forward the request to one of the multiple NGINX containers across the three hosts.

NGINX Layer
The NGINX layer is connected to multiple upstreams, one for each application supported by the platform. Each upstream is a set of WSGI containers running on different ports. Each request is routed to the correct container based on its URL.

WSGI Layer
This is the application server layer. It runs the Gunicorn server, which is the process that handles the request. This layer interacts with multiple external services such as the database, MongoDB, and Firebase.

Redis Layer
Only requests that need Redis reach this layer. Redis follows a sentinel pattern, with one master and two replicas, to avoid possible downtime.
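To make the WSGI and Redis layers more concrete, here is a minimal sketch of a WSGI application that serves a menu from a cache and falls back to a slower lookup on a miss (the cache-aside pattern). The case study does not describe the caching logic, so this is an assumption. The `/menu/` URL, the `FakeCache` class (an in-memory stand-in for a Redis client), and `load_menu_from_database` are all hypothetical.

```python
import json


class FakeCache:
    """In-memory stand-in for a Redis client; only get and set are modelled."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Returns None on a miss, as a Redis client does.
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def load_menu_from_database(venue_id):
    """Hypothetical slow lookup standing in for the real data store."""
    return {"venue": venue_id, "items": ["burger", "fries"]}


def make_app(cache):
    """Build a WSGI callable that an application server such as Gunicorn can serve."""

    def application(environ, start_response):
        path = environ.get("PATH_INFO", "")
        if not path.startswith("/menu/"):
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"not found"]

        venue_id = path[len("/menu/"):]
        key = "menu:" + venue_id

        # Cache-aside: try the cache first, fill it on a miss.
        body = cache.get(key)
        source = "cache"
        if body is None:
            menu = load_menu_from_database(venue_id)
            body = json.dumps(menu).encode("utf-8")
            cache.set(key, body)
            source = "database"

        headers = [
            ("Content-Type", "application/json"),
            ("X-Menu-Source", source),  # hypothetical debugging header
        ]
        start_response("200 OK", headers)
        return [body]

    return application
```

In a real deployment the `FakeCache` instance would be replaced by a Redis client obtained through Sentinel, so that the application keeps working when the master changes. The first request for a venue reports `database` in the `X-Menu-Source` header and later requests report `cache`.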
Technology Stack
- Nginx
- Docker and Docker Swarm
- Gunicorn
- Redis
- Logentries
- New Relic
- Beanstalk
- Jenkins
- ApacheBench

Business Outcomes
- The delivered infrastructure served 2 million requests with zero downtime.
- Application performance improved 4X with the new infrastructure.
- The platform has been stable since it went live.
- Continuous integration with automated performance testing made it possible to find issues earlier and shorten development cycles.
- Consistency across environments via Docker helped stabilize the program.

HashedIn has helped many promising firms across the globe by building customized solutions that give users a hassle-free experience. If you have a specific problem or use case, let us know and we can provide more information or consult with you. https://hashedin.com/contact-us/