Sep 29, 2023

Building Scalable Node.js Applications

Learn how to build Node.js applications that are scalable and reliable, with the best tips and tricks to ensure your applications are optimized for success!

In an increasingly digital world, using Node JS development services to create scalable, high-performance applications is not only advantageous but often required. This article digs into Node.js, a runtime environment loved by developers worldwide, and offers best practices, essential tools, and strategic patterns to improve the performance of scalable Node.js projects. Whether you're a beginner just getting started with Node JS development services or a seasoned developer looking to improve your application, this post walks you through the fundamental steps to take your Node.js application from functional to exceptional. Use these insights and tactics to build scalable Node.js apps that not only meet but exceed your performance goals.

What is Node.js?

Node.js is a JavaScript runtime built on Chrome's V8 JavaScript engine that employs an event-driven, non-blocking I/O model. In other words, Node.js lets developers execute JavaScript code on the server side, so JavaScript developers can build both front-end and back-end apps. Using a single programming language across the stack is one of Node.js' many selling points. Among the others are:

  • Because Node.js is asynchronous and event-driven, if one operation is taking a long time to complete, the application can continue running other operations while it waits. This improves both the efficiency and the speed of Node.js applications.
  • Node.js executes code extremely quickly because it is built on the V8 JavaScript engine.
  • Node.js has a sizable developer community. That means there are plenty of resources to learn from when you get stuck, as well as numerous libraries that make development easier.
  • Node.js is platform independent: it runs on Windows, Linux, and macOS. And because it is essentially just JavaScript on the server side, it is easy to learn, use, and hire for. It is not difficult to assemble a team capable of developing Node.js, React Native, and ReactJS applications that span the entire development process.
  • Node.js is lightweight: it requires few resources and is simple to scale. In backend development, scaling means that an application can process more requests per second without crashing or slowing down, resulting in a more pleasant user experience. Because scaling is the primary topic of this post, we will go over it in greater depth.

Understanding the NodeJS Event Loop

Before we go into scaling, let's take a quick look at the event loop, a fundamental concept in Node.js. The event loop is a single-threaded mechanism that runs indefinitely and oversees the asynchronous execution of operations such as reading files, querying databases, or making network requests. Rather than waiting for a task to finish, Node.js registers callback functions that are executed once the task completes. Used properly, this non-blocking design makes Node.js exceedingly fast and highly scalable.
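
To see this non-blocking behavior in action, here is a minimal sketch (it assumes a local file named data.txt exists): the file read is handed off with a registered callback, and the rest of the program keeps running in the meantime.

const fs = require("fs");

// Register a callback; Node.js does not wait for the read to finish
fs.readFile("./data.txt", "utf8", (err, data) => {
  if (err) throw err;
  console.log("2) File contents:", data);
});

// This line runs first, while the file read is still in flight
console.log("1) Reading file...");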

What is scaling?

In its most basic form, scaling refers to an application's ability to handle many concurrent requests per second. Scaling terminology distinguishes two approaches. Vertical scaling, also known as scaling up, improves an application's capacity to fulfill requests by upgrading its resources: adding more RAM, boosting CPU, and so on. Horizontal scaling, also known as scaling out, adds more instances of the server.

NodeJS Scaling with Multiple Instances

First and foremost, consider why we scale at all. Simply put, in this day and age, an application that cannot manage all incoming requests from all of its users cannot expect to stay in the game.

And, as backend developers, we must ensure that our application is quick, responsive, and secure. Scaling lets developers improve performance by distributing the workload across multiple instances or nodes, handle more traffic, and build in fault tolerance: with multiple instances running, if one fails, the others can take over and keep the Node.js application alive.

While some programming languages, such as Go, can handle concurrent requests by default, NodeJs handles processes differently due to its single-threaded nature. As a result, the approaches employed to scale vary.

Node.js is quick. Very quick. However, because it is single-threaded, it can struggle with CPU-intensive work, since it can only run one thread at a time. Too many heavy requests at once can clog the event loop.

Scaling Node JS Applications

There are several ways to scale Node.js apps. Let us take a quick look at some of them, such as microservices architecture, caching, the cluster module, and worker threads.

Node.js microservices architecture is the practice of designing software as a set of loosely coupled, independent services. Each service is a separate Node.js application, written and deployed on its own, and services communicate with one another via HTTP requests or messaging systems such as RabbitMQ or Apache Kafka. Instead of stuffing everything into one monolith, this way of building software lets developers focus on each service independently and ship changes without affecting the others. It should be emphasized, though, that the benefits of microservices are a contentious topic, and the pattern should be adopted with caution.

Let’s look at a hypothetical e-commerce application to better grasp microservices design. This program could be divided into microservices such as Product, Cart, and Order. Each microservice is created and deployed on its own.

For example, the Product microservice may be in charge of maintaining product data in the system. It would expose an HTTP API with CRUD endpoints through which other microservices interact with product information.
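
As an illustration only (the route, port, and in-memory data below are assumptions, not a prescribed design), a stripped-down Product service built on Node's http module might look like this:

const http = require("http");

// In-memory stand-in for the Product service's own data store
const products = [
  { id: 1, name: "Mug", price: 12 },
  { id: 2, name: "Poster", price: 25 },
];

const server = http.createServer((req, res) => {
  if (req.method === "GET" && req.url === "/products") {
    res.writeHead(200, { "content-type": "application/json" });
    res.end(JSON.stringify(products));
  } else {
    // The remaining CRUD endpoints (POST, PUT, DELETE) would be handled here
    res.writeHead(404, { "content-type": "application/json" });
    res.end(JSON.stringify({ error: "Not found" }));
  }
});

server.listen(4001, () => {
  console.log("Product service listening on port 4001");
});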

The Cart microservice could handle all cart functionality, such as adding products, adjusting quantities, and computing totals, and would offer an API for other microservices to create and update carts. In addition, the Order microservice might support order creation, payment processing, status tracking, and other functions, with APIs for cart checkout and order lookup.

By dividing concerns into standalone, decoupled microservices, the application becomes easier to scale and maintain. Each microservice focuses on a single domain capability while still collaborating to deliver the complete application experience.

Each service also owns its data: the Cart microservice, for example, would keep cart state in its own database, while the Order microservice would provide endpoints for placing orders and looking up order history, acting as the link between the cart and the product information.

This allows each microservice team to concentrate on their specific area of the application. The Cart team is in charge of cart capabilities, the Product team is in charge of product data and APIs, and the Order team is in charge of order processing and integration.

In theory, this domain separation speeds up development by splitting effort and reducing overlapping functionality among teams. It also encourages independence and loose coupling of services. Each microservice is less reliant on other parts of the system, decreasing the impact of changes and increasing reliability.

Caching

Caching is a technique used to improve the efficiency and scalability of Node.js projects by temporarily storing frequently used data for quick lookup.

Consider the following example: We need to create an app that retrieves and shows museum data – photographs, titles, descriptions, and so on – in a grid format. Users can also view different pages of data using pagination.

Each paginated request could retrieve up to 20 items from the museum's public API. Because it is a public API, it likely enforces rate limiting to prevent abuse. If we request data from the API on every page change, we will quickly hit those rate limits.

Instead of making duplicate API calls, we can employ caching. We cache the first page of data when it is requested. On subsequent views of the same page, we check whether the data is still in the cache; if it is, we return the cached data and avoid hitting the rate limits.

Caching allows quick access to previously retrieved data. For public APIs, or any data that does not change frequently, caching can dramatically increase performance and minimize costs and rate-limit pressure on backend services.

One excellent solution to this problem is to cache the data with a caching service such as Redis. It works as follows: we fetch the data for page 1 from the API and store it in Redis.

When the user navigates to page 2, we make the regular request to the museum API and cache that page as well.

However, caching is most useful when a user returns to a previously seen page. For example, instead of sending a new API request when the user returns to page 1 after seeing previous pages, we first check to see if the data for page 1 is already in the cache. If it does, we instantly return the cached data, avoiding an extra API request.

Only if the cache does not contain the data do we make the API request, save the result in the cache, and return it to the user. As a result, we limit the number of duplicate API calls as users return to pages. By serving from the cache wherever possible, we optimize performance and stay within the API's rate limits. The cache acts as a temporary data store, reducing calls to the backend.
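
Here is a minimal sketch of that flow, assuming the node-redis (v4) client, a Redis server on localhost, and a hypothetical fetchMuseumPage helper that calls the museum's public API:

const { createClient } = require("redis");

const redisClient = createClient(); // assumes Redis running on localhost:6379

async function getPage(pageNumber) {
  if (!redisClient.isOpen) await redisClient.connect();

  const cacheKey = `museum:page:${pageNumber}`;

  // 1. Serve from the cache if we already fetched this page
  const cached = await redisClient.get(cacheKey);
  if (cached) return JSON.parse(cached);

  // 2. Otherwise call the API, cache the result for an hour, and return it
  const data = await fetchMuseumPage(pageNumber); // hypothetical API helper
  await redisClient.set(cacheKey, JSON.stringify(data), { EX: 3600 });
  return data;
}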

Practice: Cluster Module, Multithreading, and Worker Processes

Theory without practice is only half the job. In this section, we will look at two approaches for scaling Node.js applications: the cluster module and multithreading. We will first use Node's built-in cluster module and, once we understand how it works, use the pm2 process manager to make things easier. Then we'll alter the example slightly and use the worker_threads module to spawn multiple threads.

Cluster Module

Because Node.js is single-threaded, it will only use a single core of your CPU, regardless of how many cores you have. This is perfectly acceptable for input/output work, but if the code is CPU-intensive, the Node app may run into performance problems. The cluster module fixes this: it lets us spawn child processes that share the same server port as the parent process.

This way, we can make use of all the CPU's cores. To understand what that means and how it works, let's build a small Node.js application as an example.

We’ll begin by making a new folder called nodeJs-scaling, and then inside that folder, we’ll make a file called no-cluster.js. We will write the following code snippet inside that file:

const http = require("http");

const server = http.createServer((req, res) => {
  if (req.url === "/") {
    res.writeHead(200, { "content-type": "text/html" });
    res.end("Home Page");
  } else if (req.url === "/slow-page") {
    res.writeHead(200, { "content-type": "text/html" });

    // Simulate a slow, CPU-bound page; this loop blocks the event loop
    let j = 0;
    for (let i = 0; i < 9000000000; i++) {
      j++;
    }

    res.end("Slow Page " + j); // Send the response after the loop completes
  }
});

server.listen(5000, () => {
  console.log("Server listening on port : 5000....");
});

Here, we start by importing Node's built-in HTTP module and use it to create a server with two endpoints: a base endpoint and a slow-page endpoint. The idea behind this structure is that the base endpoint loads as usual, while the slow-page endpoint takes a long time because of the CPU-bound for loop. While this is a simple example, it is a great way to understand how the process works.

Now, if we start the server by running node no-cluster.js and then send a request to the base endpoint via cURL, or just open the page in a browser, it will load pretty quickly. An example cURL request is curl -i http://localhost:5000/. If we do the same for curl -i http://localhost:5000/slow-page, we will see that it takes a long time, and it might even end with an error. This is because the event loop is blocked by the for loop and cannot handle any other requests until the loop completes. There are a couple of ways to solve this problem. We will first use the built-in cluster module, and then a handy library called PM2.

Built-in cluster module

Now let’s create a new file called cluster.js in the same directory and write the following snippet inside of it:

const cluster = require("cluster");
const os = require("os");
const http = require("http");

// Check if the current process is the master process
// (in Node 16+, "isMaster" is an alias of "isPrimary")
if (cluster.isMaster) {
  // Get the number of CPUs
  const cpus = os.cpus().length;
  console.log(`${cpus} CPUs`);
} else {
  console.log("Worker process " + process.pid);
}

Here, we start by importing the cluster, operating system (os), and http modules. Next, we check whether the current process is the master process; if so, we log the CPU count. When we run node cluster.js we should get a response like "6 CPUs". This machine has 6; yours may differ. Now, let's modify the code a bit:

const cluster = require("cluster");
const os = require("os");
const http = require("http");

// Check if the current process is the master process
if (cluster.isMaster) {
  // Get the number of CPUs
  const cpus = os.cpus().length;

  console.log(`Forking for ${cpus} CPUs`);
  console.log(`Master process ${process.pid} is running`);

  // Fork a worker process for each CPU
  for (let i = 0; i < cpus; i++) {
    cluster.fork();
  }
} else {
  console.log("Worker process " + process.pid);

  const server = http.createServer((req, res) => {
    if (req.url === "/") {
      res.writeHead(200, { "content-type": "text/html" });
      res.end("Home Page");
    } else if (req.url === "/slow-page") {
      res.writeHead(200, { "content-type": "text/html" });

      // Simulate a slow, CPU-bound page; this blocks only this worker
      let j = 0;
      for (let i = 0; i < 1000000000; i++) {
        j++;
      }

      res.end("Slow Page " + j); // Send the response after the loop completes
    }
  });

  server.listen(5000, () => {
    console.log("Server listening on port : 5000....");
  });
}

In this updated version, we fork the process once for each CPU. We could also have called cluster.fork() manually up to 6 times (6 being the CPU count of this machine; yours may differ).

There's a catch here: we should not succumb to the tantalizing idea of creating more forks than there are CPU cores, as this creates performance issues instead of solving them. That is why we fork the process once per CPU in a for loop.

Now, if we run node cluster.js we should get a response like this:

Forking for 6 CPUs
Master process 39340 is running
Worker process 39347
Worker process 39348
Worker process 39349
Server listening on port : 5000....
Worker process 39355
Server listening on port : 5000....
Server listening on port : 5000....
Worker process 39367
Worker process 39356
Server listening on port : 5000....
Server listening on port : 5000....
Server listening on port : 5000....

As you can see, each worker process has a different id. Now, if we first open the slow-page endpoint and then the base endpoint, instead of waiting for the long for loop to complete, we get a fast response from the base endpoint.

This is because the slow-page endpoint is being handled by a different process.
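
For completeness, the pm2 process manager mentioned earlier can do this forking for us, without any cluster code of our own. A typical workflow (assuming pm2 is installed globally) looks roughly like this; -i 0 tells pm2 to run one instance per CPU core:

npm install -g pm2

# Run the original, cluster-free server with one instance per CPU core
pm2 start no-cluster.js -i 0

# Inspect and manage the running instances
pm2 ls
pm2 logs
pm2 stop all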

Multiple Threads

While the cluster module allows us to run multiple instances of NodeJs that can distribute workloads, the worker_threads module enables us to run multiple application threads within a single NodeJs instance.

So the JavaScript code runs in parallel. We should note here that code executed in a worker thread runs in a separate thread within the same process, with its own V8 instance and event loop, so it does not block our main application.

Let us again see this process in action. Let’s create a new file called main-thread.js and add the following code:

const http = require("http");
const { Worker } = require("worker_threads");

const server = http.createServer((req, res) => {
  if (req.url === "/") {
    res.writeHead(200, { "content-type": "text/html" });
    res.end("Home Page");
  } else if (req.url === "/slow-page") {
    // Offload the CPU-bound work to a new worker thread
    const worker = new Worker("./worker-thread.js");

    worker.on("message", (j) => {
      res.writeHead(200, { "content-type": "text/html" });
      res.end("slow page " + j); // Send the response once the worker finishes
    });
  }
});

server.listen(5000, () => {
  console.log("Server listening on port : 5000....");
});

Let’s also create a second file named worker-thread.js and add the following code:

const { parentPort } = require("worker_threads");

// Simulate a slow, CPU-bound task off the main thread
let j = 0;
for (let i = 0; i < 1000000000; i++) {
  j++;
}

// Post the result back to the main thread
parentPort.postMessage(j);

Now what is going on here? In the first file, we destructure the Worker class from the worker_threads module.

With worker.on plus a callback function, we listen for the message that worker-thread.js posts back to its parent, the main-thread.js file. This pattern lets us run code in parallel in Node.js.
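
In a real application you would also want to handle worker failures: worker threads emit "error" and "exit" events, and listening to them prevents a failed worker from leaving the request hanging. A minimal sketch, added inside the slow-page handler above:

// Surface errors thrown inside the worker instead of leaving the request open
worker.on("error", (err) => {
  res.writeHead(500, { "content-type": "text/html" });
  res.end("Worker failed: " + err.message);
});

// A non-zero exit code means the worker died before posting a result
worker.on("exit", (code) => {
  if (code !== 0) console.error(`Worker stopped with exit code ${code}`);
});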

Conclusion

We've covered various techniques for scaling Node.js apps in this article, including microservices architecture, in-memory caching, the cluster module, and multithreading. We've also worked through specific examples to demonstrate how these approaches operate in practice. If you liked the article, please contact us if you require any assistance.