A Guide to Scaling a Sails Node.js App with Heroku
One of the most difficult challenges you may face as a software engineer is scaling your applications, whether manually or automatically. This used to be a really hard task a few years ago, but not anymore with Heroku.
Please read on to see how we approached this problem in our tourware Suite at tourware.net, one of the biggest tourism software providers in Germany.
A bit about our business concepts
tourware Suite is a stateless, scalable web application intended to be accessed from anywhere in the world by any individual, group, or tour agency, so they can publish their travel packages, vacation rentals, and hotels, accept bookings from customers, and, more importantly, delegate the configuration of the whole platform to the customers themselves.
tourware Suite is composed of two main applications. The Mid/Back office handles all the configuration of every customer's packages, payment methods, prices, seasons, and hotels, so our customers can edit their IBEs (Internet Booking Engines) online with a rich, cutting-edge HTML editor (see Figure 1 below).
Once a customer is done setting up the Mid/Back office, they can go ahead and point their domain to our IBE, which automatically gets a new Heroku domain assigned with a brand-new IBE app running in a container (see Figure 2); or, if they prefer, they can simply keep their own existing website, built with any CMS (see Figure 3).
At this point, I have walked you through our business model a bit. As you can see, we require that our app can be easily auto-scaled so that, in theory, we can support millions of concurrent users: we have N clients who each have N customers, so our app can't be down for any reason.
So, enough talk; let's get to the tech stack:
- Heroku (build, deliver, monitor, and scale apps)
- Node.js
- Sails.js (a Node.js framework with an embedded REST and WebSocket API, the Waterline ORM, and much more; very simple, flexible, and extensible)
- Socket.io shipped within Sails.js
- GitHub
- Azure Redis Cache server
- Kue, a priority job queue backed by Redis, built for Node.js
- MongoLab for MongoDB database
- Express HTTP Server
- Throng, a one-liner for clustered Node.js apps: runs X workers, respawns them if they go down, and correctly handles signals from the OS
- CircleCI for continuous integration and delivery
- Mocha, Sinon, Chai, Nightwatch with Selenium for the tests suite
- and other cool technologies/frameworks.
You may say, wow, that's a lot to take in, but I am sure most of you have tried these out at some point in your career.
The Sails app structure
To be able to create your scalable app, you first need an existing Sails.js app, or you can create a new one.
A Sails.js app uses an MVC architecture and is basically composed of:
- Models (class representations of database tables that, via a driver, can run queries against any database engine)
- Controllers (containing all the logic for processing users' requests)
- Services (letting you keep database queries and other heavy work out of the controllers)
- and Views, which can be rendered with any configured templating engine, like swig, ejs, or handlebars, and served/cached as static content via Express.
To give you an idea of our app structure, see Figure 4 for a screenshot of our folder structure.
The biggest advantage that Sails.js brings to the game is organization: it more or less forces you to respect that frame, although everything is configurable. Therefore, in the future, you can always change the complete app by just switching to a different config file or installing a new npm package, and the rest of your app remains the same.
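To make that separation of models, controllers, and services concrete, here is a minimal, hypothetical sketch of a controller delegating to a service. The names (BookingService, BookingController) are purely illustrative, and the service returns a fixture instead of a real Waterline query so the example stays self-contained:

```javascript
// Hypothetical service (api/services/BookingService.js): keeps heavy
// database work out of the controller. A real implementation would use
// Waterline here, e.g. Booking.find({ departure: { '>': new Date() } }).
const BookingService = {
  findUpcoming: async () => [{ id: 1, destination: 'Berlin' }]  // fixture data
};

// Hypothetical controller (api/controllers/BookingController.js): only
// request/response handling, delegating the actual work to the service.
const BookingController = {
  upcoming: async (req, res) => res.json(await BookingService.findUpcoming())
};
```

In a real Sails app, these would live under api/services/ and api/controllers/, and the service would go through Waterline so the underlying database engine can be swapped via configuration.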
Heroku
In Heroku, applications run in a collection of lightweight Linux containers called dynos. Heroku lets you deploy your application in different environments completely for free, with logs, a database, hosting, and many other cool features, until your dyno goes to sleep. However, if you want to get serious, you will have to pay 😉
I am not going to cover all the details of getting your app onto Heroku; it should be pretty simple, as Heroku integrates completely with Git: you can import any repository at any time, and Heroku detects new changes and auto-deploys every time you merge your changes into master or the branch selected for your app.
In Figure 5 below you can see how Heroku manages the different environments of your app using pipelines. Each pipeline is connected to a Git branch, detects changes, and deploys automatically if the user so desires.
At this point, you have an understanding of how Heroku manages apps and are ready for the next step.
From stateful to stateless
This is the biggest challenge, so please keep reading and you will find out how we dealt with it.
Before scaling with Heroku, you will need to convert your app from a stateful to a stateless app, so that you can run it across different servers/processes/dynos/workers sharing the same session, database, and other resources.
First, our app needs different processes
You may skip this step if your app does not need to run cron tasks or any heavy processing that could degrade the performance of your web application. But if it does, you should consider splitting your app into small sub-apps: Clock, Worker, and Web.
Heroku has a very cool feature for this called "The Process Model", which lets you separate your app into small sub-processes so that your web app's performance is not affected by a heavy cron task running in the background. Heroku engineers say that your apps should not be executed as monolithic entities; instead, you should run them as one or more lightweight processes. During development, this can be a single process launched via the command line. In production, you can then run as many processes as you wish (see Figure 6).
In our case, our app is configured to run three different scalable process types: 2+ workers, 1 clock, and 2 web processes, plus one additional release process that runs only after every release to make changes in the database, send deployment emails, etc. Below is the Procfile configuration file in our app that tells Heroku how it should deploy our apps.
release: yarn run after-release
web: yarn run start
worker: yarn run worker
clock: yarn run clock
Therefore, we need four different Sails apps to separate the concerns of each process. See Figure 7 for details on how the processes communicate with each other.
- release: The release phase enables us to run certain tasks before and after a new release of the app is deployed. The release phase can be useful for tasks such as:
- Sending CSS, JS, and other assets from your app's slug to a CDN or S3 bucket
- Running database schema migrations
- web: Runs a Sails application with only the web process, but no cron tasks or workers. Any heavy work the app needs to do is passed to the worker.
- worker: The worker is also a Sails app, but without the main hooks: only the database connection, no HTTP, no pub/sub mechanism, no cronjobs hook, etc. The worker takes the jobs sent to it for processing through the Redis cache server as Kue jobs, does the heavy work, and responds to the requester by changing the job status to "complete".
- clock: The clock.js app loads a Sails application without any hook: no database connection, no HTTP, no web app, and no views, except the configured cron tasks and Kue with a valid connection to the Redis server.
// clock.js
process.chdir(__dirname);
const _ = require('lodash');
// Ensure a "sails" can be located:
(function () {
var sails;
try {
sails = require('sails');
} catch (e) {
console.error(e);
return;
}
// Try to get `rc` dependency
var rc;
try {
rc = require('rc');
} catch (e0) {
try {
rc = require('sails/node_modules/rc');
} catch (e1) {
console.error('Could not find dependency: `rc`.');
console.error('Your `.sailsrc` file(s) will be ignored.');
console.error('To resolve this, run:');
console.error('npm install rc --save');
rc = function () {
return {};
};
}
}
// we load the default sails configuration for the configured environment
let defaultConfig = rc('sails');
let config = _.merge(defaultConfig, {
bootstrapTimeout: 10000,
generators: {
modules: {}
},
// and as we only want to run the clock, we lift sails with only
// the cronjobs hook running, no views, no http, nothing else,
// just a clock
hooks: {
cronjobs: true,
blueprints: false,
controllers: false,
endpoints: false,
cors: false,
csrf: false,
grunt: false,
http: false,
i18n: false,
logger: false,
policies: false,
pubsub: false,
request: false,
responses: false,
session: false,
sockets: false,
views: false
}
});
// Load sails with the clock environment
sails.load(config, function (err) {
sails.emit('loaded');
if (err) {
sails.log.error("Error, could not load clock.js");
sails.log.error(err);
}
sails.log.info('Sails running from clock.js to send cron tasks via Redis to the worker.js');
});
})();
You can also see an example of one of our cron tasks, in the file updateFollowups.js:
/**
* @class cron.updateFollowups
 */

/**
* With this configuration the clock.js publishes a job every 2 minutes to Kue, so the worker.js can process it
*/
exports.schedule = '*/2 * * * *';
exports.report = false;

// Method executed by the worker.js once a job of type updateFollowups arrives
exports.process = async () => {
let followups = await Followups.find({
date: {'<': new Date()},
sent: false
}).populate('user').populate('createdBy');
return Promise.all(_.map(followups, exports.sendFollowupEmail));
};
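The handshake this sets up between clock.js and worker.js can be sketched with a tiny in-memory stand-in for Kue. In the real app both sides talk to Redis through Kue (wrapped by our QueueService); nothing here touches a network, so the sketch only illustrates the publish/consume contract:

```javascript
// Minimal in-memory stand-in for Kue, illustrating the clock -> worker
// handshake. The real queue is backed by Redis; this one dispatches
// jobs immediately and in-process.
function createQueue() {
  const handlers = {};
  return {
    // worker.js side: register a processor for a job type
    process(type, handler) { handlers[type] = handler; },
    // clock.js side: publish a job; the stand-in dispatches on save()
    create(type, data) {
      return { save: () => handlers[type]({ data }, () => {}) };
    }
  };
}

const queue = createQueue();
const processed = [];

// worker.js registers its 'cron' processor
queue.process('cron', (job, done) => {
  processed.push(job.data.name);  // the real worker runs cronTask.process(job) here
  done();
});

// clock.js publishes the cron task by name every time its schedule fires
queue.create('cron', { name: 'updateFollowups' }).save();
// processed is now ['updateFollowups']
```

Kue's real API follows the same shape: `queue.create(type, data).save()` on the publisher side and `queue.process(type, handler)` on the consumer side, with Redis carrying the jobs between processes.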
Then worker.js takes care of the tasks coming from Kue:
/**
* worker.js
*
* Use `worker.js` to run your app without `sails load`.
* To start the server, run: `node worker.js`.
*
* This is handy in situations where the sails CLI is not relevant or useful.
*
* For example:
 * => `yarn run worker`
*/
// Ensure we're in the project directory, so relative paths work as expected
// no matter where we actually lift from.
process.chdir(__dirname);
const throng = require('throng');
const os = require('os');
const _ = require('lodash');
const WORKERS = process.env.WEB_CONCURRENCY || os.cpus().length;
// Ensure a "sails" can be located:
function boot() {
var sails;
try {
sails = require('sails');
} catch (e) {
console.error(e);
return;
}
// Try to get `rc` dependency
var rc;
try {
rc = require('rc');
} catch (e0) {
try {
rc = require('sails/node_modules/rc');
} catch (e1) {
console.error('Could not find dependency: `rc`.');
console.error('Your `.sailsrc` file(s) will be ignored.');
console.error('To resolve this, run:');
console.error('npm install rc --save');
rc = function () {
return {};
};
}
}
let defaultConfig = rc('sails');
/// @todo: as we scale up in number of workers, reduce the number of concurrent jobs
const CONCURRENT_JOBS = process.env.CONCURRENT_JOBS || 3; // process up to 3 concurrent jobs
let config = _.merge(defaultConfig, {
bootstrapTimeout: 20000,
// indicates to sails that this is the worker, so we can skip some things at runtime
isWorker: true,
generators: {
modules: {}
},
hooks: {
// No cronjobs hook
cronjobs: false,
blueprints: false,
controllers: false,
endpoints: false,
cors: false,
csrf: false,
grunt: false,
logger: false,
policies: false,
pubsub: false,
request: false,
responses: false,
session: false,
sockets: false
}
});
// Load sails with the worker environment
sails.load(config, function (err) {
if (err) {
sails.log.error('Error, could not load worker.js');
sails.log.error(err);
}
sails.log.info('Sails running from worker.js to receive and run cron tasks and other events incoming from Kue ');
QueueService.getQueue().process('cron', CONCURRENT_JOBS, async (job, done) => {
let data = job.data || {};
let cronTask = sails.config.cron[data.name] || {};
let report = true === cronTask.report;
if (job.state() === 'active' && _.isFunction(cronTask.process)){
report && sails.log.debug(`Processing cron task '${data.name}' with job id: '${job.id}'`);
try {
await cronTask.process(job);
} catch (err) {
sails.log.error(`Error found while processing job '${data.name}' with id: '${job.id}'`);
sails.log.error(err);
}
// mark the job complete
job.complete();
done();
} else {
done();
}
});
});
}

// Run multiple workers per app using throng
throng({
workers: WORKERS,
lifetime: Infinity
}, boot);
It is important to point out that all these processes load the same Sails application but enable different hooks to improve performance, while still using all the capabilities of the framework, like accessing the ORM, services, controllers, etc.
Next, every process in Heroku needs a dyno. The basic dynos will be enough initially, but if your application grows in number of users, you may need an auto-scaling strategy; there are good blog posts explaining how to auto-scale Heroku apps.
Last but not least, scale the web process
To use Heroku containers to the fullest, you can include the Throng library so that it runs your web app in multiple processes per dyno:
const throng = require('throng');
const os = require('os');
// Specify how many times your app should be lifted
const WORKERS = process.env.WEB_CONCURRENCY || os.cpus().length;

throng({
workers: WORKERS,
lifetime: Infinity
}, liftSailsApp );
function liftSailsApp(){
// Lift your app normally
}
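If you are wondering what Throng actually buys you here: it forks one copy of the start function per worker (using Node's cluster module) and respawns any copy that dies. The single-process toy below mimics that respawn contract with plain function calls; it is only an illustration of the idea, not how Throng is implemented:

```javascript
// Toy, single-process illustration of Throng's contract: run `workers`
// copies of `start` and restart any copy that throws. The real Throng
// forks OS processes and respawns on process exit, not on exceptions.
function supervise(workers, start, maxRestarts = 5) {
  const attempts = {};  // restarts performed per worker id
  for (let id = 1; id <= workers; id++) {
    attempts[id] = 0;
    for (;;) {
      try {
        start(id);
        break;  // this copy came up cleanly
      } catch (err) {
        if (++attempts[id] > maxRestarts) break;  // crash loop: give up
        // otherwise fall through and "respawn" the worker
      }
    }
  }
  return attempts;
}

// Example: worker 1 crashes once before lifting successfully
let crashed = false;
const attempts = supervise(2, (id) => {
  if (id === 1 && !crashed) { crashed = true; throw new Error('boom'); }
});
// attempts[1] === 1 (one respawn), attempts[2] === 0
```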
Make the rest of the app stateless
Take, for example, the following configuration file for the staging environment. You can see that the session is stored in MongoDB, and that Sails sockets (whether over long polling or WebSockets) are configured to be relayed through Redis as a message queue, so we can notify any connected client from the server (even from spawned worker processes).
/**
* Staging environment settings
*/
module.exports = {
port: process.env.PORT || 1337,
skipFixtures: false,
liftTimeout: 12000,
orm: {
_hookTimeout: 50000
},
// For the web process we don't want to run cronjobs
hooks: {
cronjobs: false
},
sockets: {
adapter: 'socket.io-redis',
url: process.env.REDIS_URL,
transports: ['websocket'],
_hookTimeout: 50000
},
models: {
migrate: 'safe'
},
// Tell Sails to load session data from a clustered MongoDB database
session: {
adapter: 'connect-mongo',
url: process.env.MONGODB_URI,
collection: 'sessions',
saveUninitialized: false
},
connections: {
// Setup your MongoDB cluster url
mongodb: {
adapter: 'sails-mongo',
url: process.env.MONGODB_URI
},
// And initialize Kue with the redisUrl and the prefix for the given
// environment; note the prefix is added to every job type
kue: {
prefix: 'staging',
redisUrl: process.env.REDIS_URL
}
}
};
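Why the external session store matters can be shown with a toy model: two "dynos" that share one store (the Map below stands in for the MongoDB sessions collection behind connect-mongo) both see a session written by either of them, which is exactly what keeps the app stateless across processes:

```javascript
// Toy model of two web dynos sharing one external session store. Each
// dyno holds only a reference to the shared store, never its own
// per-process session memory.
const sharedStore = new Map();  // stands in for the 'sessions' collection

const makeDyno = (store) => ({
  login: (sessionId, user) => store.set(sessionId, { user }),
  whoAmI: (sessionId) => (store.get(sessionId) || {}).user
});

const dynoA = makeDyno(sharedStore);
const dynoB = makeDyno(sharedStore);

// A request hits dyno A and logs the user in...
dynoA.login('sess-123', 'alice');
// ...and the next request, routed to dyno B, still knows who they are
const user = dynoB.whoAmI('sess-123');  // 'alice'
```

With an in-process store (one Map per dyno) the second request would come back anonymous, which is why session data has to live outside the dynos before you can scale the web process.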
Give it a try
Create all the required Heroku settings or environment variables on your local PC and run the app:
set WEB_CONCURRENCY=4
yarn run app
Note that on Heroku this will run your app that many times (4 threads × dyno count), so you can multiply your app easily.
Note: With the provided Procfile, Heroku would lift all configured processes per dyno for you on every deployment.
Conclusions
With this document, I hope to have cleared up the doubts of the many developers who may be struggling to set up a stateless, scalable application with Heroku and Sails in an easy way.
Looking forward to hearing your comments, suggestions, questions, and feedback.
Thanks for reading!!!