It has been a while since I’ve used Node.js for anything serious. To give you an idea of how long ago we’re talking… I originally hacked together the Green Man Gaming stock control system in Node.js 0.1.x, and to this day it only runs on 0.2.x because of the way 0.4.x changed things way, way back. It works, so no one dares upgrade it.

Since then, the Node.js ecosystem has matured and, although people make fun of it… Node.js is heavily used at massive scale. I’d also like to think I’ve learnt to write better JavaScript since working on TweetDeck… but let’s not get ahead of ourselves!

So I wanted to get up to date on Node, and figured that parsing a Debian Packages.gz file would make a quick exercise for trying out piping.

This is going to be very crude in places, I’m sure… but here is my quick hack:

// There be lots of dragons...
var zlib = require('zlib');

var R = require('request'),
    _ = require('lodash'),
    through2 = require('through2'),
    es = require('event-stream');

var url = 'http://security.debian.org/dists/jessie/updates/main/binary-amd64/Packages.gz';

var all = [];

R.get(url)
    .pipe(zlib.createGunzip())
    .on('error', function() {
        console.log("it's all broke yo");
    })
    // Split by each 'object' in the packages file
    .pipe(es.split(/\n\n/))
    .pipe(through2.obj(function(chunk, enc, callback) {
        // We'll get an empty chunk at the end
        if (chunk === "") {
            callback();
            return;
        }
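        // Create kv pairs from each line, splitting on the first ': ' only
        // so that values containing ': ' themselves are kept whole
        var kvPairs = _.map(chunk.split('\n'), function(line) {
            var idx = line.indexOf(': ');
            return idx === -1 ? [line] : [line.slice(0, idx), line.slice(idx + 2)];
        });
        // Lower case the attributes - cos that's better
        var fixedAttribs = _.map(kvPairs, function(obj) {
            return [obj[0].toLowerCase(), obj[1]];
        });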

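        // zipObject accepts an array of [key, value] pairs in lodash 3;
        // lodash 4 moved that form to _.fromPairs
        this.push(_.zipObject(fixedAttribs));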
        callback();
    }))
    .on('data', function(data) {
        all.push(data);
    })
    .on('end', function () {
        console.log(JSON.stringify(all));
    });
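
The per-stanza parsing can also be pulled out into a plain function, which makes it easy to poke at without the network or any streams. This is only a sketch: parseStanza is a made-up name rather than anything from the hack above, it uses no lodash, and it assumes the usual Debian control-file layout where each field is 'Name: value' and continuation lines start with whitespace.

```javascript
// Hypothetical helper (not from the original hack): turn one stanza of a
// Packages file into an object with lower-cased keys.
function parseStanza(stanza) {
    var result = {};
    var lastKey = null;

    stanza.split('\n').forEach(function (line) {
        // Assumption: a line starting with whitespace continues the
        // previous field rather than starting a new one.
        if (/^\s/.test(line) && lastKey !== null) {
            result[lastKey] += '\n' + line.trim();
            return;
        }
        // Split on the first colon only, so values such as "1:2.0-1"
        // keep their own colons.
        var idx = line.indexOf(':');
        if (idx === -1) {
            return; // not a "Name: value" line, skip it
        }
        lastKey = line.slice(0, idx).toLowerCase();
        result[lastKey] = line.slice(idx + 1).trim();
    });

    return result;
}
```

Something like this could then be called from inside the through2 step, e.g. this.push(parseStanza(chunk)), leaving the stream code to deal only with the plumbing.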

I know Streams were introduced way back in 0.8.x of Node.js - but… wow that code is far simpler to understand than counterparts in other languages. It also improved debugging - as I was able to debug each part of the pipe.
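
To give an idea of what debugging each part of the pipe can look like, here is a rough sketch of a pass-through stage that reports every chunk and forwards it untouched. The tap name and its report callback are my own invention for illustration, and it is built on the core stream module rather than through2.

```javascript
const { Transform } = require('stream');

// Hypothetical helper: a stage that hands each chunk to `report` and then
// passes it along unchanged, so it can be dropped between any two pipes.
function tap(label, report) {
    return new Transform({
        objectMode: true, // accept strings, buffers or parsed objects alike
        transform(chunk, enc, callback) {
            report(label, chunk);
            callback(null, chunk); // forward the chunk as-is
        }
    });
}
```

Dropping .pipe(tap('after split', console.log)) in after the es.split step, for instance, would print each stanza before it gets parsed.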

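The only thing that took me a short while was ‘what if the request fails’. It turned out that I had to attach the ‘on’ error handler at the point in the pipe where I add the gunzip stream. Since .pipe() returns the stream being piped into, that handler listens on the gunzip stream, so if the response isn’t valid gzip (an error page, say) the error is raised there and I’m all good. Errors are not forwarded along a pipe, though, so a connection-level failure emitted by the request itself would need its own handler.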
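
Here is a small sketch of that behaviour using only core modules. The PassThrough stream is a stand-in for R.get(url), gunzipAll is a made-up helper name, and stream.pipeline assumes a much newer Node (10 or later) than the one I wrote the hack against.

```javascript
const { PassThrough, Writable, pipeline } = require('stream');
const zlib = require('zlib');

// .pipe() returns the destination, so a chained .on('error') ends up
// listening on the gunzip stream, not on the source feeding it.
const source = new PassThrough(); // hypothetical stand-in for R.get(url)
const gunzip = zlib.createGunzip();
const pipeReturnsDestination = source.pipe(gunzip) === gunzip;

// Hypothetical helper: gunzip everything from `input` and hand the text to
// `done`. stream.pipeline reports an error from any stage through the one
// callback, instead of needing a handler per stream.
function gunzipAll(input, done) {
    const chunks = [];
    const sink = new Writable({
        write(chunk, enc, callback) {
            chunks.push(chunk);
            callback();
        }
    });
    pipeline(input, zlib.createGunzip(), sink, function (err) {
        done(err, err ? null : Buffer.concat(chunks).toString());
    });
}
```

With pipeline, a bad response body and a failing source both land in the same callback, which is arguably tidier than working out which stream in the chain a given .on('error') is attached to.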

Either way - as a post Dota2 TI5 final night bit of experimenting it was certainly worthwhile.