Mastering Node.js - Second Edition

By : Sandro Pasquali, Kevin Faaborg

Mastering Node.js - Second Edition

By: Sandro Pasquali, Kevin Faaborg

Overview of this book

Node.js, a modern development environment that enables developers to write server- and client-side code with JavaScript, thus becoming a popular choice among developers. This book covers the features of Node that are especially helpful to developers creating highly concurrent real-time applications. It takes you on a tour of Node's innovative event non-blocking design, showing you how to build professional applications. This edition has been updated to cover the latest features of Node 9 and ES6. All code examples and demo applications have been completely rewritten using the latest techniques, introducing Promises, functional programming, async/await, and other cutting-edge patterns for writing JavaScript code. Learn how to use microservices to simplify the design and composition of distributed systems. From building serverless cloud functions to native C++ plugins, from chatbots to massively scalable SMS-driven applications, you'll be prepared for building the next generation of distributed software. By the end of this book, you'll be building better Node applications more quickly, with less code and more power, and know how to run them at scale in production environments.

Preface

What this book covers

What you need for this book

Free Chapter

Understanding the Node Environment

Introduction – JavaScript as a systems language

Standard libraries

Extending JavaScript

V8, JavaScript, and optimizations

The process object

The REPL

Summary

Understanding Asynchronous Event-Driven Programming

Node's unique design

Understanding the event loop

Listening for events

Timers

Concurrency and errors

Building a Twitter feed using file events

Summary

Streaming Data Across Nodes and Clients

Why use streams?

Exploring streams

Creating an HTTP server

HTTPS, TLS (SSL), and securing your server

The request object

Working with headers

Handling POST data

Creating and streaming images with Node

Summary

Using Node to Access the Filesystem

Directories, and iterating over files and folders

Reading from a file

Writing to a file

Serving static files

Handling file uploads

A simple file browser

Summary

Managing Many Simultaneous Client Connections

Understanding concurrency

Routing requests

Using Redis for tracking client state

Handling sessions

Authenticating connections

Summary

V8, JavaScript, and optimizations

V8 is Google's JavaScript engine, written in C++. It compiles and executes JavaScript code inside of a VM (Virtual Machine). When a webpage loaded into Google Chrome demonstrates some sort of dynamic effect, like automatically updating a list or news feed, you are seeing JavaScript, compiled by V8, at work.

V8 manages Node's main process thread. When executing JavaScript, V8 does so in its own process, and its internal behavior is not controlled by Node. In this section, we will investigate the performance benefits that can be had by playing with these options, learning how to write optimizable JavaScript, and the cutting-edge JavaScript features available to users of the latest Node versions (such as 9.x, the version we use in this book).

Flags

There are a number of settings available to you for manipulating the Node runtime. Try this command:

$ node -h

In addition to standards such as --version, you can also flag Node to --abort-on-uncaught-exception.

You can also list the options available for v8:

$ node --v8-options

Some of these settings can save the day. For example, if you are running Node in a restrained environment like a Raspberry Pi, you might want to limit the amount of memory a Node process can consume, to avoid memory spikes. In that case, you might want to set the --max_old_space_size (by default ~1.5GB) to a few hundred MB.

You can use the -e argument to execute a Node program as a string; in this case, logging out of the version of V8 your copy of Node contains:

$ node –e "console.log(process.versions.v8)"

It's worth your time to experiment with Node/V8 settings, both for their utility and the path, to give you a slightly stronger understanding of what is happening (or might happen) under the hood.

Optimizing your code

The simple optimizations of smart code design can really help you. Traditionally, JavaScript developers working in browsers did not need to concern themselves with memory usage optimizations, having quite a lot to use for what were typically uncomplicated programs. On a server, this is no longer the case. Programs are generally more complicated, and running out of memory takes down your server.

The convenience of a dynamic language is in avoiding the strictness that compiled languages impose. For example, you need not explicitly define object property types, and can actually change those property types at will. This dynamism makes traditional compilation impossible, but opens up some interesting new opportunities for exploratory languages such as JavaScript. Nevertheless, dynamism introduces a significant penalty in terms of execution speeds when compared to statically compiled languages. The limited speed of JavaScript has regularly been identified as one of its major weaknesses.

V8 attempts to achieve the sorts of speeds one observes for compiled languages for JavaScript. V8 compiles JavaScript into native machine code, rather than interpreting bytecode, or using other just-in-time techniques. Because the precise runtime topology of a JavaScript program cannot be known ahead of time (the language is dynamic), compilation consists of a two-stage, speculative approach:

Initially, a first-pass compiler (the full compiler) converts your code into a runnable state as quickly as possible. During this step, type analysis and other detailed analysis of the code is deferred, prioritizing fast compilation – your JavaScript can begin executing as close to instantly as possible. Further optimizations are accomplished during the second step.
Once the program is up and running, an optimizing compiler then begins its job of watching how your program runs, and attempting to determine its current and future runtime characteristics, optimizing and re-optimizing as necessary. For example, if a certain function is being called many thousands of times with similar arguments of a consistent type, V8 will re-compile that function with code optimized on the optimistic assumption that future types will be like the past types. While the first compile step was conservative with as-yet unknown and un-typed functional signature, this hot function's predictable texture impels V8 to assume a certain optimal profile and re-compile based on that assumption.

Assumptions help us make decisions more quickly, but can lead to mistakes. What if the hot function V8's compiler just optimized against a certain type signature is now called with arguments violating that optimized profile? V8 has no choice, in that case: it must de-optimize the function. V8 must admit its mistake and roll back the work it has done. It will re-optimize in the future if a new pattern is seen. However, if V8 must again de-optimize at a later time, and if this optimize/de-optimize binary switching continues, V8 will simply give up, and leave your code in a de-optimized state.

Let's look at some ways to approach the design and declaration of arrays, objects, and functions, so that you are helping, rather than hindering the compiler.

Numbers and tracing optimization/de-optimization

The ECMA-262 specification defines the Number value as a "primitive value corresponding to a double-precision 64-bit binary format IEEE 754 value". The point is that there is no Integer type in JavaScript; there is a Number type defined as a double-precision floating-point number.

V8 uses 32-bit numbers for all values internally, for performance reasons that are too technical to discuss here. It can be said that one bit is used to point to another 32-bit number, should greater width be needed. Regardless, it is clear that there are two types of values tagged as numbers by V8, and switching between these types will cost you something. Try to restrict your needs to 31-bit signed Integers where possible.
Because of the type ambiguity of JavaScript, switching the types of numbers assigned to a slot is allowed. For example, the following code does not throw an error:

let a = 7;
a = 7.77;

However, a speculative compiler like V8 will be unable to optimize this variable assignment, given that its guess that a will always be an Integer turned out to be wrong, forcing de-optimization.

We can demonstrate the optimization/de-optimization process by setting some powerful V8 options, executing V8 native commands in your Node program, and tracing how v8 optimizes/de-optimizes your code.

Consider the following Node program:

// program.js
let someFunc = function foo(){}
console.log(%FunctionGetName(someFunc));

If you try to run this normally, you will receive an Unexpected Token error – the modulo (%) symbol cannot be used within an identifier name in JavaScript. What is this strange method with a % prefix? It is a V8 native command, and we can turn on execution of these types of functions by using the --allow-natives-syntax flag:

node --allow-natives-syntax program.js
// 'someFunc', the function name, is printed to the console.

Now, consider the following code, which uses native functions to assert information about the optimization status of the square function, using the %OptimizeFunctionOnNextCall native method:

let operand = 3;
function square() {
    return operand * operand;
}
// Make first pass to gather type information
square();
// Ask that the next call of #square trigger an optimization attempt;
// Call
%OptimizeFunctionOnNextCall(square);
square();

Create a file using the previous code, and execute it using the following command: node --allow-natives-syntax --trace_opt --trace_deopt myfile.js. You will see something like the following returned:

 [deoptimize context: c39daf14679]
 [optimizing: square / c39dafca921 - took 1.900, 0.851, 0.000 ms]

We can see that V8 has no problem optimizing the square function, as operand is declared once and never changed. Now, append the following lines to your file and run it again:

%OptimizeFunctionOnNextCall(square);
operand = 3.01;
square();

On this execution, following the optimization report given earlier, you should now receive something like the following:

**** DEOPT: square at bailout #2, address 0x0, frame size 8
 [deoptimizing: begin 0x2493d0fca8d9 square @2]
 ...
 [deoptimizing: end 0x2493d0fca8d9 square => node=3, pc=0x29edb8164b46, state=NO_REGISTERS, alignment=no padding, took 0.033 ms]
 [removing optimized code for: square]

This very expressive optimization report tells the story very clearly: the once-optimized square function was de-optimized following the change we made in one number's type. You are encouraged to spend some time writing code and testing it using these methods.

Objects and arrays

As we learned when investigating numbers, V8 works best when your code is predictable. The same holds true with arrays and objects. Nearly all of the following bad practices are bad for the simple reason that they create unpredictability.

Remember that in JavaScript, an object and an array are very similar under the hood (resulting in strange rules that provide no end of material for those poking fun at the language!). We won't be discussing those differences, only the important similarities, specifically in terms of how both these data constructs benefit from similar optimization techniques.

Avoid mixing types in arrays. It is always better to have a consistent data type, such as all integers or all strings. As well, avoid changing types in arrays, or in property assignments after initialization if possible. V8 creates blueprints of objects by creating hidden classes to track types, and when those types change the optimization, blueprints will be destroyed and rebuilt—if you're lucky. Visit https://github.com/v8/v8/wiki/Design%20Elements for more information.

Don't create arrays with gaps, such as the following:

let a = [];
a[2] = 'foo';
a[23] = 'bar';

Sparse arrays are bad for this reason: V8 can either use a very efficient linear storage strategy to store (and access) your array data, or it can use a hash table (which is much slower). If your array is sparse, V8 must choose the least efficient of the two. For the same reason, always start your arrays at the zero index. As well, do not ever use delete to remove elements from an array. You are simply inserting an undefined value at that position, which is just another way of creating a sparse array. Similarly, be careful about populating an array with empty values—ensure that the external data you are pushing into an array is not incomplete.

Try not to preallocate large arrays—grow as you go. Similarly, do not preallocate an array and then exceed that size. You always want to avoid spooking V8 into turning your array into a hash table. V8 creates a new hidden class whenever a new property is added to an object constructor. Try to avoid adding properties after an object is instantiated. Initialize all members in constructor functions in the same order. Same properties + same order = same object.

Remember that JavaScript is a dynamic language that allows object (and object prototype) modifications after instantiation. Since the shape and volume of an object can, therefore, be altered after the fact, how does V8 allocate memory for objects? It makes some reasonable assumptions. After a set number of objects are instantiated from a given constructor (I believe 8 is the trigger amount), the largest of these is assumed to be of the maximum size, and all further instances are allocated that amount of memory (and the initial objects are similarly resized). A total of 32 fast property slots, inclusive, are then allocated to each instance based on this assumed maximum size. Any extra properties are slotted into a (slower) overflow property array, which can be resized to accommodate any further new properties.

With objects, as with arrays, try to define as much as possible the shape of your data structures in a futureproof manner, with a set number of properties, of types, and so on.

Functions

Functions are typically called often, and should be one of your prime optimization focuses. Functions containing try-catch constructs are not optimizable, nor are functions containing other unpredictable constructs, like with or eval. If, for some reason, your function is not optimizable, keep its use to a minimum.

A very common optimization error involves the use of polymorphic functions. Functions that accept variable function arguments will be de-optimized. Avoid polymorphic functions.

An excellent explanation of how V8 performs speculative optimization can be found here: https://ponyfoo.com/articles/an-introduction-to-speculative-optimization-in-v8

Optimized JavaScript

The JavaScript language is in constant flux, and some major changes and improvements have begun to find their way into native compilers. The V8 engine used in the latest Node builds supports nearly all of the latest features. Surveying all of these is beyond the scope of this chapter. In this section, we'll mention a few of the most useful updates and how they might be used to simplify your code, helping to make it easier to understand and reason about, to maintain, and perhaps even become more performant.

We will be using the latest JavaScript features throughout this book. You can use Promises, Generators, and async/await constructs as of Node 8.x, and we will be using those throughout the book. These concurrency operators will be discussed at depth in Chapter 2, Understanding Asynchronous Event-Driven Programming, but a good takeaway for now is that the callback pattern is losing its dominance, and the Promise pattern in particular is coming to dominate module interfaces.

In fact, a new method util.promisify was recently added to Node's core, which converts a callback-based function to a Promise-based one:

const {promisify} = require('util');
const fs = require('fs');

// Promisification happens here
let readFileAsync = promisify(fs.readFile);

let [executable, absPath, target, ...message] = process.argv;

console.log(message.length ? message.join(' ') : `Running file ${absPath} using binary ${executable}`);

readFileAsync(target, {encoding: 'utf8'})
.then(console.log)
.catch(err => {
  let message = err.message;
  console.log(`
    An error occurred!
    Read error: ${message}
  `);
});

Being able to easily promisify fs.readFile is very useful.

Did you notice any other new JavaScript constructs possibly unfamiliar to you?

Help with variables

You'll be seeing let and const throughout this book. These are new variable declaration types. Unlike var, let is block scoped; it does not apply outside of its containing block:

let foo = 'bar';

if(foo == 'bar') {
    let foo = 'baz';
    console.log(foo); // 1st
}
console.log(foo); // 2nd

// baz
// bar
// If we had used var instead of let:
// baz
// baz

For variables that will never change, use const, for constant. This is helpful for the compiler as well, as it can optimize more easily if a variable is guaranteed never to change. Note that const only works on assignment, where the following is illegal:

const foo = 1;
foo = 2; // Error: assignment to a constant variable

However, if the value is an object, const doesn't protect members:

const foo = { bar: 1 }
console.log(foo.bar) // 1
foo.bar = 2;
console.log(foo.bar) // 2

Another powerful new feature is destructuring, which allows us to easily assign the values of arrays to new variables:

let [executable, absPath, target, ...message] = process.argv;

Destructuring allows you to rapidly map arrays to variable names. Since process.argv is an array, which always contains the path to the Node executable and the path to the executing file as the first two arguments, we can pass a file target to the previous script by executing node script.js /some/file/path, where the third argument is assigned to the target variable.

Maybe we also want to pass a message with something like this:

node script.js /some/file/path This is a really great file!

The problem here is that This is a really great file! is space-separated, so it will be split into the array on each word, which is not what we want:

[... , /some/file/path, This, is, a, really, great, file!]

The rest pattern comes to the rescue here: the final argument ...message collapses all remaining destructured arguments into a single array, which we can simply join(' ') into a single string. This also works for objects:

let obj = {
    foo: 'foo!',
    bar: 'bar!',
    baz: 'baz!'
};

// assign keys to local variables with same names
let {foo, baz} = obj;

// Note that we "skipped" #bar
console.log(foo, baz); // foo! baz!

This pattern is especially useful for processing function arguments. Prior to rest parameters, you might have been grabbing function arguments in this way:

function (a, b) {
    // Grab any arguments after a & b and convert to proper Array
    let args = Array.prototype.slice.call(arguments, f.length);
}

This was necessary previously, as the arguments object was not a true Array. In addition to being rather clumsy, this method also triggers de-optimization in compilers like V8.

Now, you can do this instead:

function (a, b, ...args) {
    // #args is already an Array!
}

The spread pattern is the rest pattern in reverse—you expand a single variable into many:

const week = ['mon','tue','wed','thur','fri'];
const weekend = ['sat','sun'];

console.log([...week, ...weekend]); // ['mon','tue','wed','thur','fri','sat','sun']

week.push(...weekend);
console.log(week); // ['mon','tue','wed','thur','fri','sat','sun']

Arrow functions

Arrow functions allow you to shorten function declarations, from function() {} to simply () => {}. Indeed, you can replace a line like this:

SomeEmitter.on('message', function(message) { console.log(message) });

To:

SomeEmitter.on('message', message => console.log(message));

Here, we lose both the brackets and curly braces, and the tighter code works as expected.

Another important feature of arrow functions is they are not assigned their own this—arrow functions inherit this from the call site. For example, the following code does not work:

function Counter() {
    this.count = 0;

    setInterval(function() {
        console.log(this.count++);
    }, 1000);
}

new Counter();

The function within setInterval is being called in the context of setInterval, rather than the Counter object, so this does not have any reference to count. That is, at the function call site, this is a Timeout object, which you can check yourself by adding console.log(this) to the prior code.

With arrow functions, this is assigned at the point of definition. Fixing the code is easy:

setInterval(() => { // arrow function to the rescue!
  console.log(this);
  console.log(this.count++);
}, 1000);
// Counter { count: 0 }
// 0
// Counter { count: 1 }
// 1
// ...

String manipulation

Finally, you will see a lot of backticks in the code. This is the new template literal syntax, and along with other things, it (finally!) makes working with strings in JavaScript much less error-prone and tedious. You saw in the example how it is now easy to express multiline strings (avoiding 'First line\n' + 'Next line\n' types of constructs). String interpolation is similarly improved:

let name = 'Sandro';
console.log('My name is ' + name);
console.log(`My name is ${name}`);
// My name is Sandro
// My name is Sandro

This sort of substitution is especially effective when concatenating many variables, and since the contents of each ${expression} can be any JavaScript code:

console.log(`2 + 2 = ${2+2}`)  // 2 + 2 = 4

You can also use repeat to generate strings: 'ha'.repeat(3) // hahaha.

Strings are now iterable. Using the new for...of construct, you can pluck apart a string character by character:

for(let c of 'Mastering Node.js') {
    console.log(c);
    // M
    // a
    // s
    // ...
}

Alternatively, use the spread operator:

console.log([...'Mastering Node.js']);
// ['M', 'a', 's',...]

Searching is also easier. New methods allow common substring seeks without much ceremony:

let targ = 'The rain in Spain lies mostly on the plain';
console.log(targ.startsWith('The', 0)); // true
console.log(targ.startsWith('The', 1)); // false
console.log(targ.endsWith('plain')); // true
console.log(targ.includes('rain', 5)); // false

The second argument to these methods indicates a search offset, defaulting to 0. The is found at position 0, so beginning the search at position 1 fails in the second case.

Great, writing JavaScript programs just got a little easier. The next question is what's going on when that program is executed within a V8 process?

Mastering Node.js - Second Edition

By : Sandro Pasquali, Kevin Faaborg

Mastering Node.js - Second Edition

By: Sandro Pasquali, Kevin Faaborg

Overview of this book

Related Content you might be interested in

Current Title:

Mastering Node.js - Second Edition

Node Cookbook.

Node.js Design Patterns

Building Cross-Platform Desktop Applications with Electron