azat-co/node-advanced

footer: © NodeProgram.com, Node.University and Azat Mardan 2018 slidenumbers: true theme: Simple, 1 build-lists: true autoscale:true

[.slidenumbers: false] [.hide-footer]


Node Advanced

Overview

Azat Mardan @azat_co

![left](images/azat node interacitev no pipe.jpeg)


Node Advanced


Course Overview


Course Overview

  • Table of Contents
  • What to expect
  • What you need

Curriculum


Curriculum

  1. Node Modules
  2. Node Event Loop and Async Programming
  3. Streaming
  4. Networking
  5. Debugging
  6. Scaling

What to Expect

Focus on:

  • Pure Node
  • Core Node modules
  • ES6-8

What not to Expect

Do not expect:

  • Much JavaScript fundamentals or any old ES5
  • Much Linux, Unix, Windows or computer fundamentals
  • Many fancy npm modules or frameworks

Prerequisites


What You Need


Mindset

  • Embrace errors
  • Increase curiosity
  • Experiment by iteration
  • Get comfortable reading source code of Node.js, npm, and npm modules
  • Enjoy the process

Reading Source Code

You learn how to use a module and how to be a better developer


Tips for Deeper (Advanced) Understanding

  • Learn to think like V8 (the JavaScript engine that powers Node): when in doubt, use console.log or a debugger to walk through execution
  • Read call stack error messages carefully. Learn and know the common errors (address in use, cannot find module, undefined, etc.)
  • Upgrade your tools (no Notepad++, seriously)

Tips for Deeper (Advanced) Understanding (Cont)

  • Memorize all the array, string and Node core methods - saves tons of time and keeps focus (can work offline too)
  • Read good books, take in-person classes from good instructors and watch good video courses
  • Build side-projects
  • Subscribe to Node Weekly to stay up-to-date
  • Teach

Module 1: Modules


Importing Modules with require()

  1. Resolving
  2. Loading
  3. Wrapping
  4. Evaluating
  5. Caching

Modules Can Have Code

code/modules/module-1.js:

console.log(module) // the module-local object, not global.module

Module {
  id: '.',
  exports: {},
  parent: null,
  filename: '/Users/azat/Documents/Code/node-advanced/code/module-1.js',
  loaded: false,
  children: [],
  paths:
   [ '/Users/azat/Documents/Code/node-advanced/code/node_modules',
     '/Users/azat/Documents/Code/node-advanced/node_modules',
     '/Users/azat/Documents/Code/node_modules',
     '/Users/azat/Documents/node_modules',
     '/Users/azat/node_modules',
     '/Users/node_modules',
     '/node_modules' ] }

require()

  • Local paths take precedence: module.paths is searched in order, from index 0 to N
  • A module can be a file or a folder with index.js (or whichever file package.json main specifies in that folder)
  • loaded is false while the file is still executing and true once it has finished loading, e.g., when inspected from the file that required it
  • id is '.' for the entry file and the full path when the file is required by another
  • parent and children are populated accordingly

require.resolve()

Checks whether a package exists (is installed) and returns its resolved path, but does not execute it. It throws if the module cannot be found.
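
A minimal sketch of that check, assuming a hypothetical helper named isInstalled (it is not part of Node). require.resolve() throws an error with the code MODULE_NOT_FOUND when resolution fails, so wrapping it in try/catch gives a yes/no answer without running the module:

```javascript
// Hypothetical helper: true when `name` can be resolved from this file
const isInstalled = (name) => {
  try {
    require.resolve(name) // resolves the path only; the module is not executed
    return true
  } catch (err) {
    if (err.code === 'MODULE_NOT_FOUND') return false
    throw err // anything else is unexpected, so surface it
  }
}

console.log(isInstalled('fs')) // true (core module)
console.log(isInstalled('surely-not-installed-package-xyz')) // false
```

The second package name is made up for the demo; replace it with any dependency you want to probe before requiring it.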


How require() Checks Files

  1. Try name.js
  2. Try name.json
  3. Try name.node (compiled addon example)
  4. Try name folder, i.e., name/index.js

require.extensions

{ '.js': [Function], '.json': [Function], '.node': [Function] }

// require.extensions['.js'].toString()
function (module, filename) {
  var content = fs.readFileSync(filename, 'utf8');
  module._compile(internalModule.stripBOM(content), filename);
}

// require.extensions['.json'].toString()
function (module, filename) {
  var content = fs.readFileSync(filename, 'utf8');
  try {
    module.exports = JSON.parse(internalModule.stripBOM(content));
  } catch (err) {
    err.message = filename + ': ' + err.message;
    throw err;
  }
}

// require.extensions['.node'].toString()
function (module, filename) {
  return process.dlopen(module, path._makeLong(filename));
}

Caching

Running require() twice will not print twice but just once:

cd code/modules && node
> require('./module-1.js')
...
> require('./module-1.js')
{}

(Or run modules/main.js)


A better way to execute code multiple times is to export it as a function and then invoke that function.


Exporting Module


Exporting Code

module.exports = () => {

}

CSV to Node Object Converter Module

code/modules/module-2.js
module.exports.parse = (csvString = '') => {
  const lines = csvString.split('\n')
  let result = []
  ...
  return result
}

CSV to Node Object Converter Main Program

code/modules/main-2.js
const csvConverter = require('./module-2.js').parse

const csvString = `id,first_name,last_name,email,gender,ip_address
...
10,Allin,Bernadot,[email protected],Male,15.162.216.199`

console.log(csvConverter(csvString))

Module Patterns

  • Export Function
  • Export Class
  • Export Function Factory
  • Export Object
  • Export Object with Methods

More on these patterns at Node Patterns


Exporting Tricks and Gotchas

module.exports.parse = () => {} // ok
exports.parse = () => {} // ok
global.module.exports.parse = () => {}  // not ok, use local module

Exporting Tricks and Gotchas (Cont)

exports.parse = ()=>{} // ok
module.exports = {parse: ()=>{} } // ok again 
exports = {parse: ()=>{} } // not ok, reassigns the local exports variable; module.exports is unchanged

Module Wrapper Function

Keeps local vars local

require('module').wrapper

node
> require('module').wrapper
[ '(function (exports, require, module, __filename, __dirname) { ',
  '\n});' ]
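
The wrapper also explains why assigning to exports does not work. Below is a simplified simulation, not Node's actual loader: run and fakeModule are hypothetical names, and the real wrapper also receives require, __filename and __dirname. It shows that exports is just a function parameter that initially points at module.exports:

```javascript
// Simulate the wrapper: the module body becomes a function that receives
// `exports` (a reference to module.exports) and `module`.
const run = (body) => {
  const fakeModule = { exports: {} }
  const wrapper = new Function('exports', 'module', body)
  wrapper(fakeModule.exports, fakeModule)
  return fakeModule.exports // what require() would hand back
}

const a = run('exports.parse = () => {}') // mutates the shared object
const b = run('exports = { parse: () => {} }') // only rebinds the parameter
const c = run('module.exports = { parse: () => {} }') // replaces the export

console.log(typeof a.parse, typeof b.parse, typeof c.parse)
// function undefined function
```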

Tricky Local Globals

exports and require are specific to each module rather than true globals; the same goes for module, __filename and __dirname

console.log(global.module === module) // false
console.log(arguments)

What You Export === What You Use

module.exports = { 
  parse: (csv) => {
    //...
  }
}

Importing object, so use:

const parse = require('./name.js').parse
// or: const {parse} = require('./name.js')
parse(csv)

What You Export === What You Use (Cont)

const Parser = { 
  parse(csv) {
    // ...
  }
}
module.exports = Parser

Again importing object, so use:

const parse = require('./name.js').parse
// or: const {parse} = require('./name.js')
parse(csv)

What You Export === What You Use (Cont)

module.exports = () => { 
  return {
    parse: (csv) => {}
  }
}

Importing function, not object, so use:

const {parse} = require('./name.js')()
// or: const parse = require('./name.js')().parse

(modules/main-3.js and modules/module-3.js)


What You Export === What You Use (Cont)

// name.js
class Parser extends BaseClass {
  parse(csv) {
    // ...
  }
}
module.exports = Parser

// main.js: importing a class, so instantiate it first
const Parser = require('./name.js')
const parser = new Parser()
const parse = parser.parse // or const {parse} = parser

import vs import() vs require()


Node experimental ESM support

import fs from 'fs'
import('./button.js')

For now, it's better to use Babel or just stick with require


Caching

require.cache has the cache


Clear Cache

main-4.js prints twice (unlike main-1.js):

require('./module-4.js')
delete require.cache[require.resolve('./module-4.js')]
require('./module-4.js')
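
A self-contained way to watch the cache at work, as a sketch: it writes a throwaway module to a temporary folder (the counter.js file name and the global.__loads counter are made up for this demo) and counts how many times the module body is evaluated.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Create a tiny module that bumps a global counter each time it is evaluated
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'))
const file = path.join(dir, 'counter.js')
fs.writeFileSync(file, 'global.__loads = (global.__loads || 0) + 1\n')

require(file)
require(file) // served from require.cache, the body does not run again
const afterTwoRequires = global.__loads

delete require.cache[require.resolve(file)] // drop the cache entry
require(file) // evaluated again
const afterCacheClear = global.__loads

console.log(afterTwoRequires, afterCacheClear) // 1 2
```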

Global

var limit = 1000 // local, not available outside
const height = 50 // local
let i = 10 // local
console = () => {} // global, overwrites console outside
global.Parser = {} // global, available in other files
max = 999 // global too

npm

  • registry

  • cli: folders, git, private registries (self hosted npm, Nexus, Artifactory)

  • yarn

  • pnpm


npm Git

npm i expressjs/express -E

npm i expressjs/express#4.14.0 -E
npm install https://github.com/indexzero/forever/tarball/v0.5.6
npm install git+ssh://[email protected]:npm/npm#semver:^5.0
npm install git+https://[email protected]/npm/npm.git

When in doubt: npm i --dry-run express


npm ls

npm ls express
npm ls -g --depth=0
npm ll -g --depth=0
npm ls -g --depth=0 --json

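If the current folder has no package.json or node_modules, npm walks up the directory tree and may install into a parent folder such as ~/node_modules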


Creating package.json For Lazy Programmers

npm init -y

Setting Init Configs

List:

npm config ls

My npm Configs: cli, user, global

; cli configs
scope = ""
user-agent = "npm/4.2.0 node/v7.10.1 darwin x64"

; userconfig /Users/azat/.npmrc
init-author-name = "Azat Mardan"
init-author-url = "http://azat.co/"
init-license = "MIT"
init-version = "1.0.1"
python = "/usr/bin/python"

; node bin location = /Users/azat/.nvm/versions/node/v7.10.1/bin/node
; cwd = /Users/azat/Documents/Code/node-advanced
; HOME = /Users/azat
; "npm config ls -l" to show all defaults.

Configs for npm init

init-author-name = "Azat Mardan"
init-author-url = "http://azat.co/"
init-license = "MIT"
init-version = "1.0.1"

Setting up npm registry Config

npm config set registry "https://registry.npmjs.org/"

or

edit ~/.npmrc, e.g., /Users/azat/.npmrc


Setting up npm proxy

npm config set https-proxy http://proxy.company.com:8080
npm config set proxy http://proxy_host:port

Note: The https-proxy value uses http:// as its protocol, not https://.


Dependency Options

  • npm i express -S (--save, the default since npm v5)
  • npm i express -D (--save-dev)
  • npm i express -O (--save-optional)
  • npm i express -E (--save-exact)

npm update and npm outdated

  • < and <=
  • =
  • .x
  • ~
  • ^
  • > and >=

npm Tricks

npm home express
npm repo express
npm docs express

npm Linking for Developing CLI Tools

npm link 
npm unlink

Module 2: Node Event Loop and Async Programming


Event loop


Two Categories of Tasks

  • CPU-bound
  • I/O-bound

CPU Bound Tasks

CPU-bound tasks examples:

  • Encryption
  • Password hashing
  • Encoding
  • Compression
  • Calculations

Input and Output Bound Tasks

Input/Output examples:

  • Disk: write, read
  • Networking: request, response
  • Database: write, read

CPU-bound tasks are typically not the bottleneck in networking apps. I/O tasks are the bottleneck because they usually take more time.


Dealing with Slow I/O

  • Synchronous
  • Forking (later module)
  • Threading (more servers, computers, VMs, containers)
  • Event loop (this module)

Call Stack

The call stack uses push and pop operations on a FILO/LIFO/LCFS basis, i.e., functions are added to and removed from the top (the opposite of a queue).

^https://techterms.com/definition/filo


Call Stack Illustration

const f3 = () => {
  console.log('executing f3')
  undefinedVariableError //  ERROR!
}
const f2 = () => {
  console.log('executing f2')
  f3()
}
const f1 = () => {
  console.log('executing f1')
  f2()
}

f1()

Call Stack as a Bucket

Starts with Anonymous, then f1, f2, etc.

f3() // last in the bucket but first to go
f2()
f1()
anonymous() // first in the bucket but last to go

Call Stack Error

> f1()
executing f1
executing f2
executing f3
ReferenceError: undefinedVariableError is not defined
    at f3 (repl:3:1)
    at f2 (repl:3:1)
    at f1 (repl:3:1)
    at repl:1:1
    at ContextifyScript.Script.runInThisContext (vm.js:23:33)
    at REPLServer.defaultEval (repl.js:339:29)
    at bound (domain.js:280:14)
    at REPLServer.runBound [as eval] (domain.js:293:12)
    at REPLServer.onLine (repl.js:536:10)
    at emitOne (events.js:101:20)

Event Queue

A FIFO queue of callbacks waiting to be pushed onto the call stack


Async Callback Messes Call Stack

const f3 = () => {
  console.log('executing f3')
  setTimeout(()=>{
    undefinedVariableError // STILL an ERROR but async in this case
  }, 100)
}
const f2 = () => {
  console.log('executing f2')
  f3()
}
const f1 = () => {
  console.log('executing f1')
  f2()
}

f1()

Different Call Stack!

There is no f1, f2 or f3 in the call stack of the setTimeout callback because the event loop has moved on, i.e., the error comes from a different event queue:

> f1()
executing f1
executing f2
executing f3
undefined
> ReferenceError: undefinedVariableError is not defined
    at Timeout.setTimeout [as _onTimeout] (repl:4:1)
    at ontimeout (timers.js:386:14)
    at tryOnTimeout (timers.js:250:5)
    at Timer.listOnTimeout (timers.js:214:5)
>

Event Loop Order of Operation:

  1. Timers
  2. I/O callbacks
  3. Idle, prepare
  4. Poll (incoming connections, data)
  5. Check
  6. Close callbacks

^https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick/

[.autoscale: true]


Phases Overview

  1. Timers: this phase executes callbacks scheduled by setTimeout() and setInterval().
  2. I/O callbacks: executes almost all callbacks with the exception of close callbacks, the ones scheduled by timers, and setImmediate().
  3. Idle, prepare: only used internally.
  4. Poll: retrieve new I/O events; node will block here when appropriate.
  5. Check: setImmediate() callbacks are invoked here.
  6. Close callbacks: e.g. socket.on('close', ...).

[.autoscale: true]


https://youtu.be/PNa9OMajw9w?t=5m48s



setTimeout vs. setImmediate vs. process.nextTick

  • setTimeout(fn, 0) - pushes fn to the next event loop cycle (timers phase)
  • setImmediate(fn) - similar to setTimeout(fn, 0), but fn runs in the check phase, so the timing sometimes differs; recommended when you need to execute on the next cycle
  • process.nextTick(fn) - not the next cycle (same cycle!); used to make functions fully async or to postpone code for events
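
The difference is easiest to see by scheduling all three from the same place. This sketch schedules them from inside an I/O callback (fs.stat on the current folder is only a convenient way to get one), because from there the order is deterministic; from the main module, the relative order of setTimeout(fn, 0) and setImmediate() can vary between runs.

```javascript
const fs = require('fs')

const order = []

// Inside an I/O callback the loop is in the poll phase: the check phase
// (setImmediate) comes next, and timers only run in the following cycle.
fs.stat('.', () => {
  setTimeout(() => {
    order.push('setTimeout')
    console.log(order.join(' -> '))
    // sync -> nextTick -> setImmediate -> setTimeout
  }, 0)
  setImmediate(() => order.push('setImmediate'))
  process.nextTick(() => order.push('nextTick')) // same cycle, right after this callback
  order.push('sync')
})
```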

nextTick Usage

All callbacks passed to process.nextTick() will be resolved before the event loop continues

  • To emit event after .on()
  • To make some sync code async

Event Emit nextTick Example in http

In http, to make sure event listeners are attached before emitting error (or anything else), from the Node source:

    if (err) {
      process.nextTick(() => this.emit('error', err));
      return;
    }

Async or Sync Error Handling in fs

To postpone the callback if it's set (async) or throw the error right away (sync), from the Node source:

function handleError(val, callback) {
  if (val instanceof Error) {
    if (typeof callback === 'function') {
      process.nextTick(callback, val);
      return true;
    } else throw val;
  }
  return false;
}

Async Code Syntax

  • Just Callbacks: code and data are arguments
  • Promises: code is separate from data
  • Generators and Async/await: look like sync but actually async

Error-First Callback

Define your async function:

const myFn = (cb) => {
  // Define error and data
  // Do something...
  cb(error, data)
}

Error-First Callback Usage

Use your function:

myFn((error, data)=>{
  
})
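
A concrete version of the two snippets above, as a sketch: divide is a hypothetical function invented for this example, and process.nextTick stands in for real async work.

```javascript
// Hypothetical async function that follows the error-first convention
const divide = (a, b, cb) => {
  // nextTick keeps the callback async in both the error and the success case
  process.nextTick(() => {
    if (b === 0) return cb(new Error('Cannot divide by zero'))
    cb(null, a / b) // no error, so the first argument is null
  })
}

divide(6, 3, (error, data) => {
  if (error) return console.error(error.message)
  console.log(data) // 2
})

divide(1, 0, (error, data) => {
  if (error) return console.error(error.message) // Cannot divide by zero
  console.log(data)
})
```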

Arguments Naming

Argument names don't matter but the order does matter, put errors first and callbacks last:

myFn((err, result)=>{
  
})

Error-First

Errors first but the callback last

(Popular convention but not enforced by Node)


Arguments Order and Callback-First?

Some functions don't follow error-first and use callback first, e.g., setTimeout(fn, time).


Callback-First

With the ES6 rest operator, it might make sense to start using callback-first style more because rest can only be the last parameter, e.g.

const myFn = (cb, ...options) => {

}

How to Know What is The Function Signature

  1. You created it so you should know
  2. Someone else created it, so get to know other people's modules by reading the source code, checking the documentation, testing, and reading examples, tests and tutorials.

Callbacks not Always Async

Synchronous code can take a function as an argument too:

const arr = [1, 2, 3]
arr.map((item, index, list) => {
  return item*index // called arr.length times
})

Promises

Externalize the callback code and separate it from the data arguments


Promises for Developers

  • Consume a ready promise from a library/module (axios, koa, etc.) - most likely
  • Create your own using ES6 Promise or a library (bluebird or q) - less likely

Usage and Consumption of Ready Promises


Callbacks Syntax

Where to put the callback and does the error argument go first?

asyncFn1((error1, data1) => {
  asyncFn2(data1, (error2, data2) => {
    asyncFn3(data2, (error3, data3) => {
      asyncFn4(data3, (error4, data4) => {
        // Do something with data4
      })
    })
  })
})

Promise Syntax

Clear separation of data and control flow arguments:

promise1(data1)
  .then(promise2)
  .then(promise3)
  .then(promise4)
  .then(data4=>{
    // Do something with data4
  })
  .catch(error=>{
    // handle error1, 2, 3 and 4
  })
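
To make the chain above runnable, here is a sketch with three made-up steps (double, addTen and failIfBig are hypothetical names, not from any library). Each step receives the previous result and returns a promise, and a single catch covers all of them:

```javascript
// Hypothetical steps: each takes the previous result and returns a promise
const double = (n) => Promise.resolve(n * 2)
const addTen = (n) => Promise.resolve(n + 10)
const failIfBig = (n) =>
  n > 100 ? Promise.reject(new Error(`${n} is too big`)) : Promise.resolve(n)

const run = (start) =>
  double(start)
    .then(addTen)
    .then(failIfBig)
    .then((result) => `result: ${result}`)
    .catch((error) => `failed: ${error.message}`) // one handler for every step

run(5).then(console.log) // result: 20
run(50).then(console.log) // failed: 110 is too big
```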

Axios GET Example

const axios = require('axios')
axios.get('http://azat.co')
  .then((response) => response.data)
  .then(html => console.log(html))

Axios GET Error Example

const axios = require('axios')
axios.get('https://azat.co') // https will cause an error!
  .then((response)=>response.data)
  .then(html => console.log(html))
  .catch(e=>console.error(e))

Error: Hostname/IP doesn't match certificate's altnames: "Host: azat.co. is not in the cert's altnames: DNS:.github.com, DNS:github.com, DNS:.github.io, DNS:github.io"


Let's implement our own naive promise.

We can learn how simple promises are, and this is an advanced course after all, so why not?


Naive Promise: Callback Async Function

function myAsyncTimeoutFn(data, callback) {
  setTimeout(() => {
    callback()
  }, 1000)
}

myAsyncTimeoutFn('just a silly string argument', () => {
  console.log('Final callback is here')
})

Naive Promise: Implementation

function myAsyncTimeoutFn(data) {
  let _callback = null
  setTimeout(() => {
    if (_callback) _callback()
  }, 1000)
  return {
    then(cb){
      _callback = cb
    }
  }
}

myAsyncTimeoutFn('just a silly string argument').then(() => {
  console.log('Final callback is here')
})

Naive Promise: Implementation with Errors

const fs = require('fs')
function readFilePromise( filename ) {
  let _callback = () => {}
  let _errorCallback = () => {}

  fs.readFile(filename, (error, buffer) => {
    if (error) _errorCallback(error)
    else _callback(buffer)
  })

  return {
    then( cb, errCb ){
      _callback = cb
      _errorCallback = errCb
    }
  }

}

Naive Promise: Reading File

readFilePromise('package.json').then( buffer => {
  console.log( buffer.toString() )
  process.exit(0)
}, err => {
  console.error( err )
  process.exit(1)
})

Naive Promise: Triggering Error

readFilePromise('package.jsan').then( buffer => {
  console.log( buffer.toString() )
  process.exit(0)
}, err => {
  console.error( err )
  process.exit(1)
})
{ Error: ENOENT: no such file or directory, open 'package.jsan'
  errno: -2,
  code: 'ENOENT',
  syscall: 'open',
  path: 'package.jsan' }

Creating Promises Using The Standard ES6/ES2015 Promise


ES6/ES2015 Promise in Node

Node version 8+ (v8 not V8):

Promise === global.Promise

The ES6 Promise constructor takes a callback with resolve and reject arguments


Simple Promise Implementation with ES6/ES2015

const fs = require('fs')
function readJSON(filename, enc='utf8'){
  return new Promise(function (resolve, reject){
    fs.readFile(filename, enc, function (err, res){
      if (err) reject(err)
      else {
        try {
          resolve(JSON.parse(res))
        } catch (ex) {
          reject(ex)
        }
      }
    })
  })
}

readJSON('./package.json').then(console.log)

Advanced Promise Implementation with ES6/ES2015 for Both Promises and Callbacks

const fs = require('fs')

const readFileIntoArray = function(file, cb = null) {
  return new Promise((resolve, reject) => {
    fs.readFile(file, (error, data) => {
      if (error) {
        if (cb) return cb(error) 
        return reject(error)
      }

      const lines = data.toString().trim().split('\n')
      if (cb) return cb(null, lines)
      else return resolve(lines)
    })
  })
}

Example Calls with then and a Callback

const printLines = (lines) => {
  console.log(`There are ${lines.length} line(s)`)
  console.log(lines)
}
const FILE_NAME = __filename

readFileIntoArray(FILE_NAME)
  .then(printLines)
  .catch(console.error)

readFileIntoArray(FILE_NAME, (error, lines) => {
  if (error) return console.error(error)
  printLines(lines)
})

Event Emitters

  1. Import require('events')
  2. Extend class Name extends ...
  3. Instantiate new Name()
  4. Add listeners .on()
  5. Emit .emit()

Emitting Outside Event Emitter Class

const events = require('events')
class Encrypt extends events {
  constructor(ops) {
    super(ops)
    this.on('start', () => {
      console.log('beginning A')
    })    
    this.on('start', () => {
      console.log('beginning B')
    })
  }
}

const encrypt = new Encrypt()
encrypt.emit('start')

Emitting Outside and Inside

const events = require('events')
class Encrypt extends events {
  constructor(ops) {
    super(ops)
    this.on('start', () => {
      console.log('beginning A')
    })    
    this.on('start', () => {
      console.log('beginning B')
      setTimeout(()=>{
        this.emit('finish', {msg: 'ok'})
      }, 0)
    })
  }
}

const encrypt = new Encrypt()
encrypt.on('finish', (data) => {
  console.log(`Finished with message: ${data.msg}`)
})
encrypt.emit('start')

Working with Events

Events are about building extensible functionality and making modular code flexible

  • .emit() can be in the module and .on() in the main program which consumes the module
  • .on() can be in the module and .emit() in the main program, and in constructor or in instance
  • pass data with emit()
  • error is a special event (if you listen to it, an emitted error will not crash the process)
  • on() listeners execute in the order in which they were added (change it with prependListener or removeListener)
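
A short sketch of the last two points, using only the core events module (the event names and messages are arbitrary):

```javascript
const EventEmitter = require('events')

const emitter = new EventEmitter()
const calls = []

emitter.on('start', () => calls.push('A'))
emitter.on('start', () => calls.push('B'))
emitter.prependListener('start', () => calls.push('first')) // jumps the queue

// Without this listener, emitting 'error' would throw
emitter.on('error', (err) => calls.push(`handled: ${err.message}`))

emitter.emit('start')
emitter.emit('error', new Error('boom'))

console.log(calls) // [ 'first', 'A', 'B', 'handled: boom' ]
```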

[.autoscale: true]


Default Max Event Listeners

The default maximum number of listeners is 10 (to help find memory leaks); change it with setMaxListeners (source):

var defaultMaxListeners = 10;
...
EventEmitter.prototype.setMaxListeners = function setMaxListeners(n) {
  if (typeof n !== 'number' || n < 0 || isNaN(n)) {
    const errors = lazyErrors();
    throw new errors.RangeError('ERR_OUT_OF_RANGE', 'n',
                                'a non-negative number', n);
  }
  this._maxListeners = n;
  return this;
};

Promises vs Events

  • Events are synchronous while Promises are typically asynchronous
  • Events react to the same event from multiple places, a Promise just from one call
  • Events react to the same event multiple times, then() just once
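
The first and last points can be checked with a few lines. In this sketch (the ping event name is arbitrary), the listener runs inside emit() every time, while the then() callback runs once and only after the current synchronous code has finished:

```javascript
const EventEmitter = require('events')

const emitter = new EventEmitter()
const order = []

emitter.on('ping', () => order.push('listener'))
Promise.resolve().then(() => order.push('then'))

emitter.emit('ping') // listeners run synchronously, inside emit()
emitter.emit('ping') // and again for every emit
order.push('after emit')

const syncSnapshot = order.slice()
console.log(syncSnapshot) // [ 'listener', 'listener', 'after emit' ]
// 'then' is appended to order later, after this synchronous code completes
```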

nextTick in class

Again, nextTick helps to emit events later such as in a class constructor

const events = require('events')
class Encrypt extends events {
  constructor() {
    super() // must be called before using this in a subclass
    process.nextTick(()=>{  // otherwise, emit will happen before .on('ready')
      this.emit('ready', {})
    })
  }
}
const encrypt = new Encrypt()
encrypt.on('ready', (data) => {})

Async/await


How Developers Use Async/await

  • Consume ready async/await functions from libraries which support it - often
  • Create your own from callback or promises - not often (Node's util.promisify)

You need Node v8+ for both


Consuming Async Fn from axios

const axios = require('axios')
const getAzatsWebsite = async () => {
  const response = await axios.get('http://azat.co')
  return response.data
}
getAzatsWebsite().then(console.log)

util.promisify

const fs = require('fs')
const util = require('util')
const f = async function () {
  try {
    const data = await util.promisify(fs.readFile)('os.js', 'utf8') // <- try changing to non existent file to trigger an error
    console.log(data)
  } catch (e) {
    console.log('ooops')
    console.error(e)
    process.exit(1)
  }
}

f()
console.log('could be doing something else')

(Can be used with plain Promises as well)


Consuming Async Fn from mocha and axios

const axios = require('axios')
const {expect} = require('chai')
const app = require('../server.js')
const port = 3004

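before(async function() {
  // app.listen() takes a callback and returns no promise, so wrap it to make await wait
  await new Promise((resolve) => app.listen(port, resolve))
  console.log('server is running')
  console.log('code after the server is running')
})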

Consuming Async Fn from mocha and axios (Cont)

describe('express rest api server', () => { // describe callbacks must be sync; only hooks and tests are async
  let id

  it('posts an object', async () => {
    const {data: body} = await axios
      .post(`http://localhost:${port}/collections/test`, 
      { name: 'John', email: '[email protected]'})
    expect(body.length).to.eql(1)
    expect(body[0]._id.length).to.eql(24)
    id = body[0]._id
  })

  it('retrieves an object', async () => {
    const {data: body} = await axios
      .get(`http://localhost:${port}/collections/test/${id}`)
    expect(typeof body).to.eql('object')
    expect(body._id.length).to.eql(24)
    expect(body._id).to.eql(id)
    expect(body.name).to.eql('John')
  })
  // ...
})

Project: Avatar Service

Koa Server with Mocha and Async/await Fn and Promise.all

Terminal:

cd code
cd koa-rest
npm i
npm start

Open in a Browser: http://localhost:3000/?email=YOURMAIL, e.g., http://localhost:3000/[email protected] to see your avatar (powered by Gravatar)


Module 3: Streaming


Streams are abstractions for handling data in continuous chunks, i.e., data which is not available all at once, without requiring too much memory.


No need to wait for the entire resource to load


Types of Streams

  • Readable, e.g., fs.createReadStream
  • Writable, e.g., fs.createWriteStream
  • Duplex, e.g., net.Socket
  • Transform, e.g., zlib.createGzip

Streams Inherit from Event Emitter


Streams are Everywhere!

  • HTTP requests and responses
  • Standard input/output (stdin&stdout)
  • File reads and writes

Readable Stream Example

process.stdin

Standard input streams contain data going into applications.

  • Event data: on('data')
  • read() method

Input typically comes from the keyboard used to start the process.

To listen in on data from stdin, use the data and end events:

// stdin.js
process.stdin.resume()
process.stdin.setEncoding('utf8')

process.stdin.on('data', function (chunk) {
  console.log('chunk: ', chunk)
})

process.stdin.on('end', function () {
  console.log('--- END ---')
})

Readable stdin Stream Demo

$ node stdin.js


Interface read()

var readable = getReadableStreamSomehow()
readable.on('readable', () => {
  var chunk
  while (null !== (chunk = readable.read())) { // SYNC!
    console.log('got %d bytes of data', chunk.length)
  }
})

^readable.read is sync but the chunks are small


Writable Stream Example

  • process.stdout: Standard output streams contain data going out of the applications.
  • response (server request handler response)
  • request (client request)

More on networking in the next module


Write to Writable Stream

Use write() method

process.stdout.write('A simple message\n')

Data written to standard output is visible on the command line.


Writable stdout Stream Demo

node stdout.js

Pipe

source.pipe(destination)

  • source - readable or duplex
  • destination - writable, transform or duplex


Linux vs Node Piping

Linux shell:

operationA | operationB | operationC | operationD

Node :

streamA.pipe(streamB).pipe(streamC).pipe(streamD)

or

streamA.pipe(streamB)
streamB.pipe(streamC)
streamC.pipe(streamD)

How pipe really works: the readable source is paused when the queue for the writable/transform/duplex destination stream is full. Once the queue drains, the readable is resumed and read again (see the Node source).

[.footer:hide]


                                                   +===================+
                         x-->  Piping functions   +-->   src.pipe(dest)  |
                         x     are set up during     |===================|
                         x     the .pipe method.     |  Event callbacks  |
  +===============+      x                           |-------------------|
  |   Your Data   |      x     They exist outside    | .on('close', cb)  |
  +=======+=======+      x     the data flow, but    | .on('data', cb)   |
          |              x     importantly attach    | .on('drain', cb)  |
          |              x     events, and their     | .on('unpipe', cb) |
+---------v---------+    x     respective callbacks. | .on('error', cb)  |
|  Readable Stream  +----+                           | .on('finish', cb) |
+-^-------^-------^-+    |                           | .on('end', cb)    |
  ^       |       ^      |                           +-------------------+
  |       |       |      |
  |       ^       |      |
  ^       ^       ^      |    +-------------------+         +=================+
  ^       |       ^      +---->  Writable Stream  +--------->  .write(chunk)  |
  |       |       |           +-------------------+         +=======+=========+
  |       |       |                                                 |
  |       ^       |                              +------------------v---------+
  ^       |       +-> if (!chunk)                |    Is this chunk too big?  |
  ^       |       |     emit .end();             |    Is the queue busy?      |
  |       |       +-> else                       +-------+----------------+---+
  |       ^       |     emit .write();                   |                |
  |       ^       ^                                   +--v---+        +---v---+
  |       |       ^-----------------------------------<  No  |        |  Yes  |
  ^       |                                           +------+        +---v---+
  ^       |                                                               |
  |       ^               emit .pause();          +=================+     |
  |       ^---------------^-----------------------+  return false;  <-----+---+
  |                                               +=================+         |
  |                                                                           |
  ^            when queue is empty     +============+                         |
  ^------------^-----------------------<  Buffering |                         |
               |                       |============|                         |
               +> emit .drain();       |  ^Buffer^  |                         |
               +> emit .resume();      +------------+                         |
                                       |  ^Buffer^  |                         |
                                       +------------+   add chunk to queue    |
                                       |            <---^---------------------<
                                       +============+
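
The pause/drain cycle in the diagram can be sketched by hand. This is a simplified illustration, not Node's actual pipe() implementation: manualPipe is a hypothetical helper, the tiny highWaterMark and the 10 ms delay are assumptions chosen to force backpressure, and error handling is left out.

```javascript
const { Readable, Writable } = require('stream')

// Hypothetical helper: a stripped-down version of what src.pipe(dest) does
const manualPipe = (src, dest) => {
  src.on('data', (chunk) => {
    const canContinue = dest.write(chunk)
    if (!canContinue) {
      src.pause() // destination queue is full
      dest.once('drain', () => src.resume()) // queue has emptied, read again
    }
  })
  src.on('end', () => dest.end())
  return dest
}

const received = []
const source = Readable.from(['a', 'b', 'c'])
const sink = new Writable({
  highWaterMark: 1, // tiny queue so every chunk triggers backpressure
  write(chunk, encoding, callback) {
    received.push(chunk.toString())
    setTimeout(callback, 10) // simulate a slow destination
  }
})

manualPipe(source, sink).on('finish', () => {
  console.log(received.join('')) // abc
})
```

In real code, prefer src.pipe(dest), which also handles unpipe and cleanup.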

Pipe and Transform

Encrypts and Zips:

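const fs = require('fs')
const crypto = require('crypto')
const zlib = require('zlib')
// SECRET is assumed to be defined elsewhere
const r = fs.createReadStream('file.txt')
const e = crypto.createCipher('aes256', SECRET) // deprecated in newer Node versions in favor of createCipheriv
const z = zlib.createGzip()
const w = fs.createWriteStream('file.txt.gz')
r.pipe(e).pipe(z).pipe(w)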

^Readable.pipe takes writable and returns destination


Readable Streams Events

  • data
  • end
  • error
  • close
  • readable

Readable Streams Methods

  • pipe()
  • unpipe()
  • read()
  • unshift()
  • resume()
  • pause()
  • isPaused()
  • setEncoding()

Writable Streams Events

  • drain
  • finish
  • error
  • close
  • pipe
  • unpipe

Writable Streams Methods

  • write()
  • end()
  • cork()
  • uncork()
  • setDefaultEncoding()

With pipe, we can listen to events too!

const r = fs.createReadStream('file.txt')
const e = crypto.createCipher('aes256', SECRET) 
const z = zlib.createGzip()
const w = fs.createWriteStream('file.txt.gz')
r.pipe(e)
  .pipe(z).on('data', () => process.stdout.write('.')) // progress dot "."
  .pipe(w).on('finish', () => console.log('all is done!')) // when all is done

Readable Stream

paused: stream.read() - safe; call stream.resume() to switch to flowing

flowing: EventEmitter - data can be lost if no listeners or they are not ready; call stream.pause() to switch to paused


What about HTTP?


Core http uses Streams!

const http = require('http')
var server = http.createServer( (req, res) => {
  req.setEncoding('utf8')
  req.on('data', (chunk) => { // readable
    processDataChunk(chunk) // This function is defined somewhere else
  })
  req.on('end', () => {  
    res.write('ok') // writable
    res.end()
  })
})

server.listen(3000)

Streaming for Servers

streams/large-file-server.js:

const path = require('path')
const fileName = path.join(
  __dirname, process.argv[2] || 'webapp.log') // 67Mb
const fs = require('fs')
const server = require('http').createServer()

server.on('request', (req, res) => {
  if (req.url === '/callback') {
    fs.readFile(fileName, (err, data) => {
      if (err) return console.error(err)
      res.end(data)
    })
  } else if (req.url === '/stream') {
    const src = fs.createReadStream(fileName)
    src.pipe(res)
  }
})

server.listen(3000)



Before, we were consuming streams; now let's create our own stream. This is an advanced course after all!


Create a Stream

const stream = require('stream')
const writable = new stream.Writable({...})
const readable = new stream.Readable({...})
const transform = new stream.Transform({...})
const duplex = new stream.Duplex({...})

or

const {Writable} = require('stream')
const writable = new Writable({...})

Create a Writable Stream

const translateWritableStream = new Writable({
  write(chunk, encoding, callback) {
    translate(chunk.toString(), {to: 'en'}).then(res => {
        console.log(res.text)
        //=> I speak English
        console.log(res.from.language.iso)
        //=> nl
        callback()
    }).catch(err => {
        console.error(err)
        callback()
    })
  }
})

streams/writable-translate.js


Creating Readable

const {Readable} = require('stream')
const Web3 = require('web3')
const web3 = new Web3(new Web3.providers.HttpProvider("https://mainnet.infura.io/jrrVdXuXrVpvzsYUkCYq"))

const latestBlock = new Readable({
  read(size) {
    web3.eth.getBlock('latest')
      .then((x) => {
        // console.log(x.timestamp)
        this.push(`${x.hash}\n`)
        // this.push(null)
      })
  }
})

latestBlock.pipe(process.stdout)

Creating Duplex

const {Duplex} = require('stream')

const MyDuplex = new Duplex ({
  write(chunk, encoding, callback) {
    callback()
  },
  read(size) {
    this.push(data) // data defined
    this.push(null)
  }
})

Creating Transform

const {Transform} = require('stream')

const MyTransform = new Transform({
  transform(chunk, encoding, callback) {
    this.push(data)
    callback()
  }
})
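A concrete transform may make the skeleton above easier to read. This is a small sketch under the assumption that upper-casing text is an acceptable stand-in for real work; `createUpperCase` is a hypothetical name, not something from the course code.

```javascript
const { Transform } = require('stream')

// Hypothetical example: upper-case every chunk that passes through
const createUpperCase = () => new Transform({
  transform (chunk, encoding, callback) {
    // First argument is an error (none here), second is the transformed data
    callback(null, chunk.toString().toUpperCase())
  }
})

const upperCase = createUpperCase()
upperCase.on('data', (chunk) => process.stdout.write(chunk))
upperCase.write('hello ')
upperCase.end('streams\n')
```

Passing the result to `callback(null, data)` is shorthand for `this.push(data)` followed by `callback()`.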

Transform Real Life Example: Zlib from Node Core source

Zlib.prototype._transform = function _transform(chunk, encoding, cb) {
  // If it's the last chunk, or a final flush, we use the Z_FINISH flush flag
  // (or whatever flag was provided using opts.finishFlush).
  // If it's explicitly flushing at some other time, then we use
  // Z_FULL_FLUSH. Otherwise, use the original opts.flush flag.
  var flushFlag;
  var ws = this._writableState;
  if ((ws.ending || ws.ended) && ws.length === chunk.byteLength) {
    flushFlag = this._finishFlushFlag;
  } else {
    flushFlag = this._flushFlag;
    // once we've flushed the last of the queue, stop flushing and
    // go back to the normal behavior.
    if (chunk.byteLength >= ws.length)
      this._flushFlag = this._origFlushFlag;
  }
  processChunk(this, chunk, flushFlag, cb);
};

Backpressure

  • Data clogs
  • Reading is typically faster than writing
  • Unhandled backpressure leads to memory exhaustion and frequent GC (triggering GC too often is expensive)
  • Streams in Node handle backpressure automatically with pipe() by pausing the source (readable) stream when needed
  • highWaterMark option, defaults to 16KB
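When writing by hand instead of using pipe(), the return value of write() and the drain event carry the backpressure signal. Here is a minimal sketch, assuming an artificially small 4-byte highWaterMark and a hypothetical slow destination called `slow`.

```javascript
const { Writable } = require('stream')

// Hypothetical slow destination: tiny 4-byte buffer, each write completes on a later tick
const slow = new Writable({
  highWaterMark: 4,
  write (chunk, encoding, callback) {
    setImmediate(callback)
  }
})

const first = slow.write('ab') // 2 bytes buffered: below highWaterMark
const second = slow.write('cdef') // 6 bytes buffered: at or above highWaterMark
console.log(first, second) // true false

if (!second) {
  // The producer should stop here and resume only after 'drain'
  slow.once('drain', () => console.log('drained, safe to write again'))
}
```

A `false` return value does not reject the chunk. It only tells the producer to pause until `drain` fires.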

Overwrite Streams

Since Node.js v0.10, the Stream class has offered the ability to modify the behavior of the .read() or .write() by using the underscore version of these respective functions (._read() and ._write()).
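With ES6 classes, the underscore methods are usually overridden in a subclass. The sketch below is illustrative only; `ArrayWriter` is a hypothetical class that collects chunks in memory.

```javascript
const { Writable } = require('stream')

// Hypothetical example: a writable that collects chunks in memory
class ArrayWriter extends Writable {
  constructor (options) {
    super(options)
    this.chunks = []
  }

  // Invoked by the public write(); application code never calls _write() directly
  _write (chunk, encoding, callback) {
    this.chunks.push(chunk.toString())
    callback() // signal that this chunk has been handled
  }
}

const writer = new ArrayWriter()
writer.write('node')
writer.end('streams')
writer.on('finish', () => console.log(writer.chunks))
```

This is equivalent to passing a `write` function in the constructor options, as in the earlier slides.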

Guide


Module 4: Networking


net


Any server, not just http or https!

const server = require('net').createServer()
server.on('connection', socket => {
  socket.write('Enter your command: ') // Sent to client
  socket.on('data', data => {
    // Incoming data from a client
  })

  socket.on('end', () => {
    console.log('Client disconnected')
  })
})

server.listen(3000, () => console.log('Server bound'))

Chat

chat.js:

if (!sockets[socket.id]) {
  socket.name = data.toString().trim()
  socket.write(`Welcome ${socket.name}!\n`)
  sockets[socket.id] = socket
  return
}
Object.entries(sockets).forEach(([key, cs]) => {
  if (socket.id === key) return
  cs.write(`${socket.name} ${timestamp()}: `)
  cs.write(data)
})
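The broadcast part of the chat can be isolated and tried without a network. This is a sketch, not the code from chat.js: `broadcast` and `makeStub` are hypothetical helpers, and it assumes socket ids are numbers (as in a counter), which is why the sender id is converted to a string before comparing it with an object key.

```javascript
// Hypothetical helper: send a message to every client except the sender
const broadcast = (sockets, sender, data) => {
  Object.entries(sockets).forEach(([id, clientSocket]) => {
    // Object keys are always strings, so normalize the sender id before comparing
    if (String(sender.id) === id) return
    clientSocket.write(`${sender.name}: ${data}`)
  })
}

// Stub sockets standing in for real net.Socket objects
const makeStub = (id, name) => ({
  id,
  name,
  write: (text) => console.log(`to ${id}: ${text}`)
})
const ann = makeStub(0, 'ann')
const bob = makeStub(1, 'bob')
broadcast({ 0: ann, 1: bob }, ann, 'hi')
```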

Client?

telnet localhost 3000

or

nc localhost 3000

or write your own TCP/IP client using Node, C++, Python, etc.


Bitcoin Price Ticker

node code/bitcoin-price-ticker.js


Ticker Server

const https = require('https')

const server = require('net').createServer()
let counter = 0
let sockets = {}
server.on('connection', socket => {
  socket.id = counter++

  console.log('Welcome to Bitcoin Price Ticker (Data by Coindesk)')
  console.log(`There are ${counter} clients connected`)
  socket.write('Enter currency code (e.g., USD or CNY): ')

  socket.on('data', data => {
    // process data from the client
  })

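  socket.on('end', () => {
    // Stop the ticker for this client, if one was started
    if (sockets[socket.id]) clearInterval(sockets[socket.id].interval)
    delete sockets[socket.id]
    console.log('Client disconnected')
  })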
})

server.listen(3000, () => console.log('Server bound'))

Processing Data from the Client

    let currency = data.toString().trim()
    if (!sockets[socket.id]) {
      sockets[socket.id] = {
        currency: currency
      }
      console.log(currency)
    }
    fetchBTCPrice(currency, socket)
    clearInterval(sockets[socket.id].interval)
    sockets[socket.id].interval = setInterval(()=>{
      fetchBTCPrice(currency, socket)
    }, 5000)

Making request to Coindesk API (HTTPS!)

API: https://api.coindesk.com/v1/bpi/currentprice/{CODE}.json, where {CODE} is a currency code

  • https://api.coindesk.com/v1/bpi/currentprice/USD.json
  • https://api.coindesk.com/v1/bpi/currentprice/JPY.json
  • https://api.coindesk.com/v1/bpi/currentprice/RUB.json
  • https://api.coindesk.com/v1/bpi/currentprice/NYC.json


Response

{
  "time": {
    "updated": "Jan 9, 2018 19:52:00 UTC",
    "updatedISO": "2018-01-09T19:52:00+00:00",
    "updateduk": "Jan 9, 2018 at 19:52 GMT"
  },
  "disclaimer": "This data was produced from the CoinDesk 
  Bitcoin Price Index (USD). Non-USD currency data 
  converted using hourly conversion rate from openexchangerates.org",
  "bpi": {
    "USD": {
      "code": "USD",
      "rate": "14,753.6850",
      "description": "United States Dollar",
      "rate_float": 14753.685
    }
  }
}

HTTPS GET

const fetchBTCPrice = (currency, socket) => {
  const req = https.request({
    port: 443,
    hostname: 'api.coindesk.com',
    method: 'GET',
    path: `/v1/bpi/currentprice/${currency}.json`
  }, (res) => {
    let data = ''
    res.on('data', (chunk) => {
      data +=chunk
    })
    res.on('end', () => {
      socket.write(`1 BTC is ${JSON.parse(data).bpi[currency].rate} ${currency}\n`)
    })
  })
  req.end()
}
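The end handler above assumes that the body is valid JSON and that the currency exists. A more defensive version could look like the sketch below; `formatPrice` is a hypothetical helper and the error messages are made up for illustration.

```javascript
// Hypothetical helper: turn a raw response body into the line sent to the client
const formatPrice = (body, currency) => {
  try {
    const quote = JSON.parse(body).bpi[currency]
    if (!quote) return `No price found for ${currency}\n`
    return `1 BTC is ${quote.rate} ${currency}\n`
  } catch (error) {
    // Covers non-JSON bodies (e.g., an error page) and a missing "bpi" field
    return `Could not fetch the price for ${currency}\n`
  }
}

const sample = '{"bpi":{"USD":{"code":"USD","rate":"14,753.6850"}}}'
process.stdout.write(formatPrice(sample, 'USD'))
process.stdout.write(formatPrice('<html>Not found</html>', 'XYZ'))
```

Inside `res.on('end')`, the call would become `socket.write(formatPrice(data, currency))`, so a mistyped currency code no longer crashes the server.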

Client

telnet localhost 3000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Enter currency code (e.g., USD or CNY): USD
1 BTC is 14,707.9438 USD
1 BTC is 14,694.5113 USD
1 BTC is 14,694.5113 USD
CNY
1 BTC is 40,202.5000 CNY
RUB
1 BTC is 837,400.5342 RUB
1 BTC is 837,400.5342 RUB
1 BTC is 837,400.5342 RUB

http

Protected SQL archive (file-server/file-server.js):

const url = require('url')
const fs = require('fs')
const SECRET = process.env.SECRET
const server = require('http').createServer((req, res) => {
  console.log(`URL is ${req.url} and the method is ${req.method}`)
  const course = req.url.match(/courses\/([0-9]*)/) // works for /courses/123 to get 123
  const query = url.parse(req.url, true).query // works for /?key=value&key2=value2 
  if (course && course[1] && query.API_KEY === SECRET) {
    fs.readFile('./clients_credit_card_archive.sql', (error, data)=>{
      if (error) {
        res.writeHead(500)
        res.end('Server error')
      } else {
        res.writeHead(200, {'Content-Type': 'text/plain' })
        res.end(data)
      }
    })
  } else {
    res.writeHead(404)
    res.end('Not found')
  }
}).listen(3000, () => {
  console.log('server is listening on 3000')
})
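The URL handling in this server can also be written with the WHATWG URL class, which is available as a global in newer Node versions. This is a sketch only: `parseCourseRequest` is a hypothetical helper, the base host is a placeholder, and the pattern is stricter than the one above because it requires at least one digit.

```javascript
// Hypothetical helper: extract the course id and API key from a request URL
const parseCourseRequest = (requestUrl) => {
  // req.url is relative, so a base is required; the host here is only a placeholder
  const { pathname, searchParams } = new URL(requestUrl, 'http://localhost')
  const match = pathname.match(/^\/courses\/([0-9]+)$/)
  return {
    courseId: match ? match[1] : null,
    apiKey: searchParams.get('API_KEY') // null when the parameter is missing
  }
}

console.log(parseCourseRequest('/courses/123?API_KEY=NNN'))
```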

HTTP File Server

Command to run the server:

SECRET=NNN nodemon file-server.js

Browser request: http://localhost:3000/courses/123?API_KEY=NNN


HTTP Routing

You can use switch...

const server = require('http').createServer((req, res) => {
  switch (req.url) {
    case '/api':
      res.writeHead(200, { 'Content-Type': 'application/json' })
      // fetch data from a database
      res.end(JSON.stringify(data))
      break
    case '/home':
      res.writeHead(200, { 'Content-Type': 'text/html' })
      // send html from a file
      res.end(html)
      break
    default:
      res.writeHead(404)
      res.end()
  }
}).listen(3000, () => {
  console.log('server is listening on 3000')
})

HTTP Routing Puzzle

Find a problem with this server (from Advanced Node by Samer Buna):

const fs = require('fs')
const server = require('http').createServer()
const data = {}

server.on('request', (req, res) => {
  switch (req.url) {
  case '/api':
    res.writeHead(200, { 'Content-Type': 'application/json' })
    res.end(JSON.stringify(data))
    break
  case '/home':
  case '/about':
    res.writeHead(200, { 'Content-Type': 'text/html' })
    res.end(fs.readFileSync(`.${req.url}.html`))
    break
  case '/':
    res.writeHead(301, { 'Location': '/home' })
    res.end()
    break
  default:
    res.writeHead(404)
    res.end()
  }
})

server.listen(3000)

Puzzle Answer

Always reading (no caching) and blocking!

  case '/about':
    res.writeHead(200, { 'Content-Type': 'text/html' })
    res.end(fs.readFileSync(`.${req.url}.html`))
    break
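One possible way to address both problems is to read each file asynchronously once and keep it in memory. The sketch below is an assumption about how that could look, not the fix from the original course; `createPageCache` and `getPage` are hypothetical names, and the reader is injectable only to make the helper easy to try.

```javascript
const fs = require('fs')

// Hypothetical helper: read each page once, then serve it from memory
const createPageCache = (readFile = fs.readFile) => {
  const cache = new Map()
  return (fileName, callback) => {
    if (cache.has(fileName)) return callback(null, cache.get(fileName))
    readFile(fileName, (error, data) => {
      if (error) return callback(error)
      cache.set(fileName, data)
      callback(null, data)
    })
  }
}

const getPage = createPageCache()
// Inside the request handler, instead of fs.readFileSync():
// getPage(`.${req.url}.html`, (error, html) => {
//   if (error) { res.writeHead(500); return res.end() }
//   res.writeHead(200, { 'Content-Type': 'text/html' })
//   res.end(html)
// })
```

The trade-off is that file changes are not picked up until the process restarts, which is usually acceptable for static pages.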

Use HTTP Status Codes

http.STATUS_CODES
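http.STATUS_CODES maps numeric codes to their standard reason phrases, which saves hard-coding strings. A short sketch follows; `sendStatus` is a hypothetical helper.

```javascript
const http = require('http')

// Hypothetical helper: reply with a status code and its standard reason phrase
const sendStatus = (res, code) => {
  res.writeHead(code, { 'Content-Type': 'text/plain' })
  res.end(http.STATUS_CODES[code])
}

console.log(http.STATUS_CODES[404]) // Not Found
```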

Core https Module

Server needs the key and certificate files:

openssl req -x509 -newkey rsa:2048 -nodes -sha256 -subj '/C=US/ST=CA/L=SF/O=NO\x08A/OU=NA' \
  -keyout server.key -out server.crt

HTTPS Server with Core https Module

const https = require('https')
const fs = require('fs')

const server = https.createServer({
  key: fs.readFileSync('server.key'),
  cert: fs.readFileSync('server.crt')
}, (req, res) => {
  res.writeHead(200)
  res.end('hello')
}).listen(443)

https Request with Streaming

const https = require('https') 

const req = https.request({
    hostname: 'webapplog.com',
    port: 443, 
    path: '/',
    method: 'GET'
  }, (res) => {
  console.log('statusCode:', res.statusCode)
  console.log('headers:', res.headers)

  res.on('data', (chunk) => {
    process.stdout.write(chunk)
  })
})

req.on('error', (error) => {
  console.error(error)
})
req.end()

HTTP/2 with http2


Generating Self-Signed SSL

openssl req -x509 -newkey rsa:2048 -nodes -sha256 -subj '/C=US/ST=CA/L=SF/O=NO\x08A/OU=NA' \
  -keyout server.key -out server.crt

Using Core http2 Module

const http2 = require('http2')
const fs = require('fs')

const server = http2.createSecureServer({
  key: fs.readFileSync('server.key'),
  cert: fs.readFileSync('server.crt')
}, (req, res) => {
  res.writeHead(200, {'Content-Type': 'text/plain' })
  res.end('<h1>Hello World</h1>') // JUST LIKE HTTP!
})
server.on('error', (err) => console.error(err))
server.listen(3000)

Running H2 Hello Server

cd code
cd http2
node h2-hello.js

Browser: https://localhost:3000

Terminal:

curl https://localhost:3000/ -vik



curl https://localhost:3000/ -vik
 Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 3000 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection:
...
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: C=US; ST=CA; L=SF; O=NOx08A; OU=NA
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)

Using Core http2 Module with Stream

const http2 = require('http2')
const fs = require('fs')

const server = http2.createSecureServer({
  key: fs.readFileSync('server.key'),
  cert: fs.readFileSync('server.crt')
})

server.on('error', (err) => console.error(err))
server.on('socketError', (err) => console.error(err))

server.on('stream', (stream, headers) => {
  // stream is a Duplex
  stream.respond({
    'content-type': 'text/html',
    ':status': 200
  })
  stream.end('<h1>Hello World</h1>')
})

server.listen(3000)

WTF is http2 Server Push?


Example: index.html refers to four static assets

HTTP/1: server requires five requests from a client:

  1. index.html
  2. style.css
  3. bundle.js
  4. favicon.ico
  5. logo.png

Example: index.html refers to four static assets (Cont)

HTTP/2: server with server push requires just one request from a client:

  1. index.html
      • style.css
      • bundle.js
      • favicon.ico
      • logo.png

HTML and assets are pushed by the server but assets are not used unless referred to by HTML.


Let's implement some server push!


Start with a Normal H2 Server

const http2 = require('http2')
const fs = require('fs')

const server = http2.createSecureServer({
  key: fs.readFileSync('server.key'),
  cert: fs.readFileSync('server.crt')
})

server.on('error', (err) => console.error(err))
server.on('socketError', (err) => console.error(err))

Use Stream and pushStream

server.on('stream', (stream, headers) => {
  stream.respond({
    'content-type': 'text/html',
    ':status': 200
  })
  stream.pushStream({ ':path': '/myfakefile.js' }, (err, pushStream) => {
    if (err) return console.error(err) // newer Node versions pass the error first
    pushStream.respond({ 
      'content-type': 'text/javascript',
      ':status': 200 
    })
    pushStream.end(`alert('you win')`)
  })
  stream.end('<script src="/myfakefile.js"></script><h1>Hello World</h1>')
})

server.listen(3000)


Additional server push articles


Advanced Express REST API Routing in HackHall Demo


Conclusion

Just don't use core http directly. Use Express, Hapi or Koa.


Module 5: Debugging


Debugging Strategies

  • Don't guess and don't think too much
  • Isolate (use binary search)
  • Watch/check values
  • Trial and error
  • Full Stack overflow development (skip question, read answers)
  • Read source code, docs can be outdated or subpar

console.log is one of the best debuggers

  • Not breaking the execution flow
  • Nothing extra needed (unlike Node Inspector/DevTools or VS Code)
  • Robust: clearly shows if a line is executed
  • Clearly shows data

Console Tricks


Streaming Logs to Files

const fs = require('fs')

const out = fs.createWriteStream('./out.log')
const err = fs.createWriteStream('./err.log')

const console2 = new console.Console(out, err)

setInterval(() => {
  console2.log(new Date())
  console2.error(new Error('Whoops'))
}, 500)

Console Parameters

console.log('Step', 2) // Step 2
const name = 'Azat'
const city = 'San Francisco'
console.log('Hello %s from %s', name, city)

util.format and util.inspect

const util = require('util')
console.log(util.format('Hello %s from %s', name, city)) 
// Hello Azat from San Francisco
console.log('Hello %s from %s', 'Azat', {city: 'San Francisco'}) 
// Hello Azat from [object Object] (Node 12+ prints { city: 'San Francisco' } instead)
console.log({city: 'San Francisco'}) 
// { city: 'San Francisco' }
console.log(util.inspect({city: 'San Francisco'})) 
// { city: 'San Francisco' }
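Besides %s, util.format (and therefore console.log) understands a few other placeholders. A short sketch with made-up values:

```javascript
const util = require('util')

// %d converts the argument to a number, %j serializes it as JSON
console.log(util.format('%s has %d modules', 'Course', '6')) // Course has 6 modules
console.log(util.format('%j', { city: 'San Francisco' })) // {"city":"San Francisco"}
// Arguments without a matching placeholder are appended, separated by spaces
console.log(util.format('Step', 2)) // Step 2
```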

console.dir()

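const str = util.inspect(global, {depth: 0})
console.dir(global, {depth: 0})
// console.info() is an alias of console.log()
// console.warn() is an alias of console.error()
// console.trace() prints the call stack
// console.assert() checks a condition (see also require('assert'))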

Console Timers

console.log('Ethereum transaction started')
console.time('Ethereum transaction')
web3.send(txHash, (error, results)=>{
  console.timeEnd('Ethereum transaction') 
  // Ethereum transaction: 4545.921ms
})

REPL Tricks (which can be used for quick testing and debugging)

  • Core modules are there already
  • You can load any module with require() (must be installed with proper path)
  • You can see all your sessions' histories in ~/.node_repl_history, i.e., cat ~/.node_repl_history or tail ~/.node_repl_history

REPL Commands

  • .break: When in the process of inputting a multi-line expression, entering the .break command (or pressing the Ctrl-C key combination) will abort further input or processing of that expression.
  • .clear: Resets the REPL context to an empty object and clears any multi-line expression currently being input.
  • .exit: Close the I/O stream, causing the REPL to exit.
  • .help: Show this list of special commands.
  • .save: Save the current REPL session to a file: > .save ./file/to/save.js
  • .load: Load a file into the current REPL session. > .load ./file/to/load.js

Editing in REPL

.editor - Enter editor mode (Ctrl-D to finish, Ctrl-C to cancel)

> .editor
// Entering editor mode (^D to finish, ^C to cancel)
function welcome(name) {
  return `Hello ${name}!`;
}

welcome('Node.js User');

// ^D
'Hello Node.js User!'
>

Real Debuggers

  • CLI
  • DevTools
  • VS Code

Node CLI Debugger

$ node inspect debug-me.js
< Debugger listening on ws://127.0.0.1:9229/80e7a814-7cd3-49fb-921a-2e02228cd5ba
< For help see https://nodejs.org/en/docs/inspector
< Debugger attached.
Break on start in myscript.js:1
> 1 (function (exports, require, module, __filename, __dirname) { global.x = 5;
  2 setTimeout(() => {
  3   console.log('world');
debug>

Node CLI Debugger (Cont)

Stepping
cont, c - Continue execution
next, n - Step next
step, s - Step in
out, o - Step out
pause - Pause running code (like pause button in Developer Tools)

Node V8 Inspector

$ node --inspect index.js
Debugger listening on 127.0.0.1:9229.
To start debugging, open the following URL in Chrome:
    chrome-devtools://devtools/bundled/inspector.html?experiments=true&v8only=true&ws=127.0.0.1:9229/dc9010dd-f8b8-4ac5-a510-c1a114ec7d29

Better to break right away:

node --inspect-brk debug-me.js

Old (deprecated):

node --inspect --debug-brk index.js

Node V8 Inspector Demo


VS Code Demo


CPU profiling


Networking Debugging with DevTools


V8 Memory Scheme

Resident Set:

  • Code: Node/JS code
  • Stack: Primitives, local variables, pointers to objects in the heap and control flow
  • Heap: Reference types such as objects, strings and closures

process.memoryUsage()
{ rss: 12476416,
  heapTotal: 7708672,
  heapUsed: 5327904,
  external: 8639 }

Heap

  • New Space (Young Generation): New allocations, size 1-8MB, fast collection (Scavenge), ~20% goes into Old Space
  • Old Space (Old Generation): Allocation is fast but collection is expensive (Mark-Sweep)

Garbage Collection

The mechanism that allocates and frees heap memory is called garbage collection.


Garbage Collection (Cont)

  • Automatic in Node, thanks to V8
  • Stops the world - expensive
  • Objects with refs are not collected (memory leaks)

Memory Leak



Leaky Server

const express = require('express')

const app = express()

let cryptoWallet = {}
const generateAddress = () => {
  const initialCryptoWallet = cryptoWallet
  const tempCryptoWallet = () => {
    if (initialCryptoWallet) console.log('We received initial cryptoWallet')
  }
  cryptoWallet = {
    key: new Array(1e7).join('.'),
    address: () => {
      // ref to tempCryptoWallet ???
      console.log('address returned')
    }
  }
}

app.get('*', (req, res) => {
  generateAddress()
  console.log( process.memoryUsage())
  return res.json({msg: 'ok'})
})
app.listen(3000)
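The leak comes from closures created in the same function call sharing one scope: `address` keeps that scope alive, and the scope still references the previous wallet through `initialCryptoWallet`. A commonly used fix for this pattern is to drop that reference explicitly. The sketch below leaves out Express and uses a smaller key so it can run on its own; it illustrates the idea and is not the course's official fix.

```javascript
let cryptoWallet = {}
const generateAddress = () => {
  let initialCryptoWallet = cryptoWallet
  const tempCryptoWallet = () => {
    if (initialCryptoWallet) console.log('We received initial cryptoWallet')
  }
  cryptoWallet = {
    key: new Array(1e5).join('.'), // smaller than in the server above, to keep the sketch light
    address: () => console.log('address returned')
  }
  // Break the chain: without this line every new wallet keeps the previous one reachable
  initialCryptoWallet = null
}

for (let i = 0; i < 100; i++) generateAddress()
console.log(process.memoryUsage().heapUsed)
```

With the reference cleared, only the latest wallet stays reachable and older ones can be collected.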

Starting the LEAK

loadtest -c 100 --rps 100 http://localhost:3000
node leaky-server/server.js
{ rss: 1395490816,
  heapTotal: 1469087744,
  heapUsed: 1448368200,
  external: 16416 }
{ rss: 1405501440,
  heapTotal: 1479098368,
  heapUsed: 1458377224,
  external: 16416 }
{ rss: 1335377920,
  heapTotal: 1409097728,
  heapUsed: 1388386720,
  external: 16416 }

GCs

<--- Last few GCs --->

[35417:0x103000c00]    36302 ms: Mark-sweep 1324.1 (1345.3) -> 1324.1 (1345.3) MB, 22.8 / 0.0 ms  allocation failure GC in old space requested
[35417:0x103000c00]    36328 ms: Mark-sweep 1324.1 (1345.3) -> 1324.1 (1330.3) MB, 26.4 / 0.0 ms  last resort GC in old space requested
[35417:0x103000c00]    36349 ms: Mark-sweep 1324.1 (1330.3) -> 1324.1 (1330.3) MB, 20.9 / 0.0 ms  last resort GC in old space requested

Line 12

==== JS stack trace =========================================

Security context: 0x3c69fae25ee1 <JSObject>
    2: generateAddress [/Users/azat/Documents/Code/node-advanced/code/leaky-server/server.js:12] [bytecode=0x3c69df959db9 offset=42](this=0x3c69a7f0c0b9 <JSGlobal Object>)
    4: /* anonymous */ [/Users/azat/Documents/Code/node-advanced/code/leaky-server/server.js:20] [bytecode=0x3c69df959991 offset=7](this=0x3c69a7f0c0b9 <JSGlobal Object>,req=0x3c69389c07c1 <IncomingMessage map = 0x3c693e7300f1...


    FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory

Memory Leak Mitigation

  • Reproduce the error/leak
  • Check for variables and fn arguments, pure fns are better
  • Take heap dumps and compare (with debug and DevTools or heapdump modules)
  • Update Node
  • Get rid of extra npm modules
  • Trial and error: remove code you think is leaky
  • Modularize and refactor

Useful Libraries


Heap Dumping

code/leaky-server/server-heapdump.js:

// ...
const heapdump = require('heapdump')
setInterval(function () {
  heapdump.writeSnapshot()
}, 2 * 1000)
// ...

Creates files in the current folder:

heapdump-205347232.998971.heapsnapshot
heapdump-205508465.289834.heapsnapshot
heapdump-205513413.472744.heapsnapshot



Module 6: Scaling


Why You Need to Scale

  • Performance (e.g., under 100ms response time)
  • Availability (e.g., 99.999%)
  • Fault tolerance (e.g., zero downtime)

^ Zero downtime
^ Offload the workload: when a Node server is a single process, it can be easily blocked
^ https://blog.interfaceware.com/disaster-recovery-vs-high-availability-vs-fault-tolerance-what-are-the-differences/


Scaling Strategies

  • Forking (just buy more EC2s) - what we will do
  • Decomposing (e.g., microservices just for bottlenecks) - in another course
  • Sharding (e.g., eu.docusign.com and na2.docusign.net) - not recommended

Offload the Workload

  • spawn() - events, stream, messages, no size limit, no shell
  • fork() - Node processes, exchange messages
  • exec() - callback, buffer, output limited by the maxBuffer option (1MB by default in recent Node versions), creates shell
  • execFile() - exec file, no shell

Sync Processes (Dumb)

  • spawnSync()
  • execFileSync()
  • execSync()
  • Note: there is no forkSync() in child_process
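The sync variants block the event loop until the child exits, which is why they suit scripts and start-up code rather than servers. A minimal sketch, using the running node binary as the child so that no other program needs to be installed:

```javascript
const { execFileSync } = require('child_process')

// process.execPath is the absolute path of the running node binary;
// -p evaluates the expression and prints the result
const output = execFileSync(process.execPath, ['-p', '1 + 1'])
console.log(output.toString().trim()) // 2
```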

Executing bash and Spawn params

const {spawn} = require('child_process')
spawn('cd $HOME/Downloads && find . -type f | wc -l', 
  {stdio: 'inherit', 
  shell: true, 
  cwd: '/', 
  // env replaces the whole environment, so spread process.env to keep HOME and PATH
  env: {...process.env, PASSWORD: 'dolphins'}
})

Good Examples of Offloading the Workload

  • Hashing
  • Encryption
  • Requests
  • Encoding
  • Archiving/Compression
  • Computation

Let's use Node to launch a Python script that hashes a long string with SHA-512 and returns the result to Node.


Executing Python with exec()

code/exec-hash.js:

const {exec} = require('child_process')
console.time('hashing')
const str = 'React Quickly: Painless web apps with React, JSX, Redux, and GraphQL'.repeat(100)
exec(`STR="${str}" python ${__dirname}/py-hash.py`, (error, stdout, stderr) => {
  if (error) return console.error(error)
  console.timeEnd('hashing')
  console.log(stdout)
})

console.log('could be doing something else')

Python SHA512 Hashing

code/py-hash.py:

import os
str = os.environ['STR'] 
import hashlib
hash_object = hashlib.sha512(str.encode())
hex_dig = hash_object.hexdigest()
print(hex_dig)

Let's launch a Ruby script from Node to encrypt a string with AES into a file, without waiting for it to finish.


Node Sends a Long String for Encryption to Ruby

const {spawn} = require('child_process')
const str = 'React Quickly: Painless web apps with React, JSX, Redux, and GraphQL'.repeat(100)
console.time('launch encryption')

const rubyEncrypt = spawn('ruby', ['encrypt.rb'], {
  env: {STR: str},
  detached: true,
  stdio: 'ignore'
})
rubyEncrypt.unref() // Do not wait because the results will be in the file.

console.timeEnd('launch encryption')

Ruby Script is Encrypting with AES 256

require 'openssl'
cipher = OpenSSL::Cipher.new('aes-256-cbc')
cipher.encrypt # We are encrypting
key = cipher.random_key
iv = cipher.random_iv

encrypted_string = cipher.update ENV["STR"]
encrypted_string << cipher.final
File.write('ruby-encrypted.txt', encrypted_string)

Quick Summary About Spawn

  • Use params to pass data around
  • Offload work to other processes even when they are in other languages
  • Compare timing

Scaling by forking will require the core os module.


os Module


Things You Can Do with os

const os = require('os')
console.log(os.freemem())
console.log(os.type())
console.log(os.release())
console.log(os.cpus())
console.log(os.uptime())
console.log(os.networkInterfaces())

Network Interface Results

{ lo0:
   [ { address: '127.0.0.1',
       netmask: '255.0.0.0',
       family: 'IPv4',
       mac: '00:00:00:00:00:00',
       internal: true },
  ...
  en0:
   [ { address: '10.0.1.4',
       netmask: '255.255.255.0',
       family: 'IPv4',
       mac: '78:4f:43:96:c6:f1',
       internal: false } ],
  ...  

macOS terminal command to get the same IP:

ifconfig | grep "inet " | grep -v 127.0.0.1

CPU Usage in %

code/os-cpu.js:

const os = require('os')
let cpus = os.cpus()

cpus.forEach((cpu, i) => {
  console.log('CPU %s:', i)
  let total = 0
  for (let type in cpu.times) {
    total += cpu.times[type]
  }
  for (let type in cpu.times) {
    console.log(`\t ${type} ${Math.round(100 * cpu.times[type] / total)}%`)
  }
})

The Core cluster Module

  • Master process
  • Worker processes: each has its own PID, event loop and memory space
  • Load balancing - round robin (the default on all platforms except Windows), or the second approach, where the master process creates the listen socket and sends it to interested workers. The workers then accept incoming connections directly.
  • Use the child_process.fork() method and messaging

Load Testing

cluster uses round robin, implemented with shift and push (from the Node core source):

RoundRobinHandle.prototype.distribute = function(err, handle) {
  this.handles.push(handle);
  const worker = this.free.shift();

  if (worker)
    this.handoff(worker);
};

Load/Stress Testing Tools

Node loadtest:

npm i loadtest -g
loadtest -c 10 --rps 100 10.0.1.4:3000

or Apache ab

ab -c 200 -t 10 http://localhost:3000

With Clusters

Avoid in-memory caching (each worker has its own memory) and sticky sessions. Use an external state store.


Cluster Messaging

Master:

cluster.workers
worker.send(data)

Worker:

process.on('message', data=>{})

Optimizing a Slow Password Salting+Hashing Server


A sync function which is a very CPU-Intensive task

offload/server-v1.js:

// ...
const bcrypt = require('bcrypt')

const hashPassword = (password, cb) => {
  const hash = bcrypt.hashSync(password, 16) // bcrypt has async but we are using sync here for the example
  cb(hash)
}
// ...
app.post('/signup', (req, res) => {
  hashPassword(req.body.password.toString(), (hash) => { // callback but sync
    // Store hash in your password DB.
    res.send('your credentials are stored securely')
  })
})
app.listen(3000)

Benchmarking The Password Salting+Hashing Server

Terminal:

node server-v1.js

Another terminal (not the first terminal):

curl localhost:3000/signup -d '{"password":123}' -H "Content-Type: application/json" -X POST

Third terminal window/tab:

curl localhost:3000

Result: 2nd request (3rd terminal) will wait for the 1st request (2nd terminal)


Optimizing The Password Salting+Hashing Server

Server with forked hashing code/offload/worker-v2.js:

const bcrypt = require('bcrypt')

process.on('message', (password) => {
  const hash = bcrypt.hashSync(password, 16)
  process.send(hash)
})

Optimizing Server (Cont)

Optimized server code/offload/server-v2.js:

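const {fork} = require('child_process')

const hashPassword = (password, cb) => {
  const hashWorker = fork('worker-v2.js')
  hashWorker.send(password)
  hashWorker.once('message', hash => {
    hashWorker.kill() // one worker per request: stop it once the hash is back
    cb(hash)
  })
}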
app.use(bodyParser.json())
app.get('/', (req, res) => {
  res.send('welcome to strong password site')
})

app.post('/signup', (req, res) => {
  hashPassword(req.body.password.toString(), (hash) => { // callback, now async (hashing runs in the worker)
    // Store hash in your password DB.
    res.send('your credentials are stored securely')
  })
})

Testing Server v2 (Forked Process)

Terminal:

node server-v2.js

Another terminal (not the first terminal):

curl localhost:3000/signup -d '{"password":123}' -H "Content-Type: application/json" -X POST

Third terminal window/tab:

curl localhost:3000

Result: 2nd request (3rd terminal) will NOT wait for the 1st request (2nd terminal)


We can fork the v1 server without splitting the hashing+salting function into a worker


Server With a Forked Cluster

code/offload/server-v3.js:

const express = require('express')
const app = express()
const path = require('path')
const bodyParser = require('body-parser')
const bcrypt = require('bcrypt')

const cluster = require('cluster')

if (cluster.isMaster) {
  const os = require('os')
  os.cpus().forEach(() => {
    const worker = cluster.fork()
    console.log(`Started worker ${worker.process.pid}`)
  })
  return true
} 

Server With a Forked Cluster (Cont)

code/offload/server-v3.js:

// cluster.isWorker === true
const hashPassword = (password, cb) => {  
  const hash = bcrypt.hashSync(password, 16) // bcrypt has async but we are using sync here for the example
  cb(hash)
}

app.use(bodyParser.json())
app.get('/', (req, res) => {
  res.send('welcome to strong password site')
})

app.post('/signup', (req, res) => {
  hashPassword(req.body.password.toString(), (hash) => { // callback but sync
    // Store hash in your password DB.
    res.send('your credentials are stored securely')
  })
})

app.listen(3000)

Testing Server v3 (Forked Server)

Terminal:

node server-v3.js

Another terminal (not the first terminal):

curl localhost:3000/signup -d '{"password":123}' -H "Content-Type: application/json" -X POST

Third terminal window/tab:

curl localhost:3000

Result: 2nd request (3rd terminal) will NOT wait for the 1st request (2nd terminal)


Node.js does not automatically manage the number of workers, however. It is the application's responsibility to manage the worker pool based on its own needs.


No Fault Tolerance in Server v3

node server-v3.js
ps aux | grep 'node'
kill 12668

Implementing Fault Tolerance in Server v4

in isMaster in code/offload/server-v4.js:

  cluster.on('exit', (worker, code, signal) => {
    if (signal) {
      console.log(`worker was killed by signal: ${signal}`);
    } else if (code !== 0) { // &&!worker.exitedAfterDisconnect
      console.log(`worker exited with error code: ${code}`);
    } else {
      console.log('worker success!');
    }
    const newWorker = cluster.fork()
    console.log(`Worker ${worker.process.pid} exited. Thus, starting a new worker ${newWorker.process.pid}`)
  })

Fault Tolerance in Server v4

node server-v4.js
ps aux | grep 'node'
kill 12668

cluster is good but pm2 is better


pm2 Basics

npm i -g pm2
pm2 start app.js              # Start, daemonize and auto-restart application (Node)
pm2 start app.js --watch
pm2 start app.js --name="bitcoin-exchange-api"
pm2 reset bitcoin-exchange-api
pm2 stop all
pm2 stop bitcoin-exchange-api

pm2 Advanced

pm2 startup
pm2 save
pm2 unstartup 
pm2 start app.js -i 4         # Start 4 instances of application in cluster mode 
                              # it will load balance network queries to each app
pm2 start app.js -i max       # Start auto-detect instances of application in cluster mode  
pm2 reload all                # Zero Second Downtime Reload
pm2 scale [app-name] 10       # Scale Cluster app to 10 processes

pm2 More

pm2-dev
pm2-docker

Outro


Summary

  • Debugging
  • Console, Node REPL and npm tricks
  • Forking and spawning
  • Creating streams, async/await and naive promises
  • How globals, modules and require() really work

The End!