I use boot2docker a lot. By a lot I mean every day. A particular bug in boot2docker on OSX has led to me constantly having to destroy and rebuild my boot2docker VM. So… I’m leaving this here for people to Google/find (including me).

The symptom is:

➜  Code  docker version
Client version: 1.7.0
Client API version: 1.19
Go version (client): go1.4.2
Git commit (client): 0baf609
OS/Arch (client): darwin/amd64
An error occurred trying to connect: Get https://192.168.59.103:2376/v1.19/version: x509: certificate is valid for 127.0.0.1, 10.0.2.15, not 192.168.59.103

Well that’s annoying. Normally I’d do this:

$ boot2docker halt
$ boot2docker destroy
$ boot2docker up

That has a serious downside: losing all my images. So, hunting around, I found boot2docker (v1.4.1 and 1.5) bad cert issues on OSX 10.9.3, and a guy called garthk had the answer!

$ boot2docker ssh
$ sudo curl -o /var/lib/boot2docker/profile https://gist.githubusercontent.com/garthk/d5a17007c277aa5c76de/raw/3d09c77aae38b4f2809d504784965f5a16f2de4c/profile
$ sudo halt
$ # Now load up VirtualBox and manually power off the VM
$ boot2docker up

Job done!

➜  Code  docker version
Client version: 1.7.0
Client API version: 1.19
Go version (client): go1.4.2
Git commit (client): 0baf609
OS/Arch (client): darwin/amd64
Server version: 1.7.0
Server API version: 1.19
Go version (server): go1.4.2
Git commit (server): 0baf609
OS/Arch (server): linux/amd64

This is going to be a bit of a rant - but one prompted by something that came up recently, where someone was considering MongoDB.

I was just reading MongoDB Set to Become the ‘New Default’ Database

Just… wow. Quite a bold statement there. To save you giving your details on the form (another personal bugbear of mine… so I filled it with junk) - here’s the link to the relevant piece.

HIGH PERFORMANCE BENCHMARKING: MongoDB and NoSQL Systems

First things first let’s pick apart the minor error in the press release that eWeek clearly didn’t check up on.

All tests were performed with 400M records distributed across three servers, which represents a data set larger than RAM.

Ok…

Our setup consisted of one database server and one client server to ensure the YCSB client was not competing with the database for resources. Both servers were identical.

And…

Load 20M records using the “load” phase of YCSB

So that’d be mistake one… it wasn’t three servers at all. That is a gross error, as the read statistics for Cassandra would be way off as a result. In fact, they say as much in the Conclusions.

We focused on single server performance in these tests. Multi-server deployments address high availability and scale out for all three databases. We believe that this introduces a different set of considerations, and that the trade offs may be quite different.

My point is that it looks like the creators of MongoDB commissioned and paid for this report. If they didn’t, then the press release and the news around it is tripe; if they did… where’s the disclosure of bias?

It’s worth adding that the three databases tested are completely different! Cassandra, MongoDB and Couchbase each have very different use cases, so it’s not entirely fair to pit them against each other. Pitting MongoDB against CouchDB would be fairer. Couchbase is really CouchDB, but prettier and with a very clever caching front end on it.

I have deployed a large Cassandra setup and a very large CouchDB setup. I wouldn’t use either one for the other’s workload.

Rant over…

Docker is a hot topic at the moment in the DevOps world. I use it almost every day and want to look at how automation can be achieved in terms of security and monitoring.

Containers in computing aren’t new. In fact, FreeBSD had containers before Google was using them on Linux, although it calls them jails.

Docker is great in that it’s brought containers to the masses. They were once the preserve of people with the patience to set up LXC on Linux or jails on FreeBSD - side note: jails are very painful; I might talk about that another time.

We can talk to Docker via its RESTful API, and libraries exist for almost every language. The two popular obvious ones are Go and Python - I say obvious, but it’s more that I just prefer these two languages. I’m sure the Ruby one is awesome too.

The downside of Docker that’s coming up more and more is managing the security of containers. People often use official images without a second thought, and these end up in production. Plenty of posts full of FUD on the topic exist already - but in general, how do you ensure you keep your containers’ operating system packages up to date?

Sounds like a task for a script. I broke it down into the following tasks:

  • Connect to Docker (boot2docker in my case)
  • Get a list of installed packages in debian:jessie image
  • Get a list of packages from security.debian.org
  • Compare the two

I should add that I used Python 3.4 for this, which may make some of the syntax look a little odd to Python 2.x eyes!

Let’s get connecting out of the way:

from docker.client import Client
from docker.utils import kwargs_from_env

# So we can use boot2docker
kwargs = kwargs_from_env()
kwargs['tls'].assert_hostname = False
kwargs['tls'].verify = False  # Workaround https://github.com/docker/docker-py/issues/465

# Set up the client
client = Client(**kwargs)

It took me a little while to figure out the issue: OpenSSL 1.0.2a causes problems for quite a few libraries when talking to APIs. To work around it for now, I disable the verify part of requests - it’ll complain a lot about that.

Now we’re connected we can make a container and get some stuff out:

# Create my Debian Jessie container
container = client.create_container(
    image='debian:jessie',
    stdin_open=True,
    tty=True,
    command="/usr/bin/dpkg-query -Wf '${Package},${Version}\n'",
)

# Launch it with the custom command
client.start(container)

# Grab the dpkg-query output
output = client.logs(container).decode("utf-8")

packages = {}

lines = output.split('\n')
lines.pop(0)
for line in lines:
    # Last line is a blank
    if not line:
        continue

    k, v = line.split(',')
    packages[k] = v

We now have a stack of packages in a dictionary keyed by package name. To do this we make good use of dpkg-query to get a CSV-like list of package,version.
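To see that parsing in isolation, here it is on a couple of sample lines (the versions here are invented for illustration):

```python
# Sample of what dpkg-query -Wf '${Package},${Version}\n' might emit
# (versions are made up for illustration)
sample_output = "adduser,3.113+nmu3\nbase-files,8+deb8u1\n"

packages = {}
for line in sample_output.split('\n'):
    # Skip the trailing blank line
    if not line:
        continue
    name, version = line.split(',')
    packages[name] = version

print(packages['base-files'])  # 8+deb8u1
```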

What we want next is a similar dict for up to date packages. Now, I know a lot of people who might read this would launch into apt-get update and then query the global list of packages. Would you do that in production? Really? You just want a list of stuff… Let’s just get it from security.debian.org directly.

import gzip
import re

import requests

r = requests.get('http://security.debian.org/dists/jessie/updates/main/binary-amd64/Packages.gz', stream=True)
gz = gzip.GzipFile(fileobj=r.raw)

A small point here… We make use of the gzip library directly to decompress the file downloaded via Requests. To do this we treat ‘r.raw’ like a file, which GzipFile can use without any issue.
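If you want to see that trick without the network call, any file-like object will do - here’s a small sketch with io.BytesIO standing in for r.raw:

```python
import gzip
import io

# Compress some bytes in memory to stand in for the downloaded Packages.gz
raw = io.BytesIO(gzip.compress(b"Package: sed\nVersion: 4.2.2-4+b2\n"))

# GzipFile only needs something file-like - exactly how r.raw is used above
gz = gzip.GzipFile(fileobj=raw)
text = gz.read().decode("utf-8")
print(text)
```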

Now the format of this file is a bit weird. It’s a list of key value pairs for each package with a blank line between packages. The two keys we’re interested in for each package are Package (the name) and Version.

kvregex = re.compile(r'(\w+): (.*)')
security_updates = {}

current_package = None
current_version = None

# Build up a security updates dict
for line in gz.readlines():
    m = kvregex.match(line.decode("utf-8"))
    if not m:
        security_updates[current_package] = current_version
        continue

    g = m.groups()
    if g[0] == 'Package':
        current_package = g[1]
    elif g[0] == 'Version':
        current_version = g[1]

r.close()

Perfect! We now have a dict with all the security updates in Jessie keyed by the package name again.
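As a quick sanity check, the same loop works on a made-up two-package sample. Note it relies on the real file ending every stanza - including the last - with a blank line; without that final blank the last package would never be flushed into the dict:

```python
import re

kvregex = re.compile(r'(\w+): (.*)')

# Two stanzas in the Packages file format (versions invented)
sample = [b"Package: sed\n", b"Version: 4.2.2-4+b2\n", b"\n",
          b"Package: bash\n", b"Version: 4.3-11+deb8u1\n", b"\n"]

security_updates = {}
current_package = None
current_version = None

for line in sample:
    m = kvregex.match(line.decode("utf-8"))
    if not m:
        # A blank line ends a stanza - flush the current package
        security_updates[current_package] = current_version
        continue

    g = m.groups()
    if g[0] == 'Package':
        current_package = g[1]
    elif g[0] == 'Version':
        current_version = g[1]
```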

With these two dicts we can intersect them and get only the elements that are in both. If the versions don’t match, spit it out. I had to fake an update to test this properly, as there were no out-of-date packages when I ran it.

def common_entries(*dcts):
    for i in set(dcts[0]).intersection(*dcts[1:]):
        yield (i,) + tuple(d[i] for d in dcts)

# Fake a security update for Sed... (cos Jessie 8.1 is quite up to date)
security_updates['sed'] = '4.2.2-4+b2'

for entry in common_entries(packages, security_updates):
    if entry[1] != entry[2]:
        print('%s is %s and a security update exists for %s' % entry)

And, there we have it (spot the whining from requests…):

$ python foobar.py
.../venv/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py:768: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
  InsecureRequestWarning)
.../venv/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py:768: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
  InsecureRequestWarning)
.../venv/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py:768: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
  InsecureRequestWarning)
sed is 4.2.2-4+b1 and a security update exists for 4.2.2-4+b2

Awesome! There we have it - a quick way to grab and compare packages against containers.
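To make common_entries a bit more concrete, here it is on a pair of toy dicts (the names and versions are invented):

```python
def common_entries(*dcts):
    # Yield (key, value_from_dict_1, value_from_dict_2, ...) for keys in every dict
    for i in set(dcts[0]).intersection(*dcts[1:]):
        yield (i,) + tuple(d[i] for d in dcts)

installed = {'sed': '4.2.2-4+b1', 'bash': '4.3-11+deb8u1'}
updates = {'sed': '4.2.2-4+b2', 'openssl': '1.0.1k-3+deb8u1'}

# Only 'sed' appears in both dicts, and its versions differ
for entry in common_entries(installed, updates):
    if entry[1] != entry[2]:
        print('%s is %s and a security update exists for %s' % entry)
```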

It’s about time I updated this site. I go through stints of bothering with it, which I find is very common among people who still blog.

However, as I’m using Twitter less and less (can’t put my finger on why) and I like to keep my Facebook more private than most… it’s about time I bothered once more.

So, new theme. Went off, got the Octostrap theme. It’s awesome and well worth it.

I did look at Octopress 3 - but I don’t like the way it works. The approach of using rake still works for me, and it seems like separating things for the sake of doing so… a bit like something Hubot has done over the past year too.

As for the ‘new start’ - I’m going to try and blog more. On top of that, I have a Tumblr I post random things to as well, which may be more up to date.

In fact - ways to find stuff I’m doing are:

And because it does happen…

I probably spend way too much time configuring my VIM setup. It tends to change depending on what I’m working on. So, at the moment the following things matter to me most:

There would be Scala, but I use the excellent IntelliJ IDEA product for that. Nothing can beat it, so there’s no point trying to get VIM to do it.

It matters to me that my editor works cross-platform too. I’m not so fussed about VIM on Windows (although it’s nice when that works too), but more between OSX and Linux, as they are the two main operating systems I use.

So I felt I’d do a post about how I manage my VIM config as it may/may not be useful for others.

Setup

Let’s start nice and empty:

mv ~/.vim ~/.vim.old
mv ~/.vimrc ~/.vimrc.old
mkdir ~/.vim
touch ~/.vim/myvimrc
ln -s ~/.vim/myvimrc ~/.vimrc

Why do this? Well, simply put - this way your .vim folder can be easily stored in Git or another VCS you fancy. Job done!

Right, so what next? vundle all the things.

mkdir ~/.vim/bundle
git clone https://github.com/gmarik/Vundle.vim.git ~/.vim/bundle/Vundle.vim

Now you need a small bit at the top of your .vimrc file.

vi ~/.vimrc
set nocompatible              " be iMproved, required
filetype off                  " required

" set the runtime path to include Vundle and initialize
set rtp+=~/.vim/bundle/Vundle.vim
call vundle#begin()

" let Vundle manage Vundle, required
Plugin 'gmarik/Vundle.vim'

" Your stuff is going to go here...

" All of your Plugins must be added before the following line
call vundle#end()            " required
filetype plugin indent on    " required

Now we have a basis of a working VIM we can work on. Let’s set up some cool stuff now…

Some obvious bootstrap things

By default, VIM likes to behave a little bit old fashioned. We want some niceties from the off - so let’s do that:

set expandtab     " Soft tabs all the things
set tabstop=2     " 2 spaces is used almost everywhere now
set shiftwidth=2  " When using >> then use 2 spaces
set autoindent    " Well, obviously
set smartindent   " As opposed to dumb indent

set noautowrite
set number
set autoread      " Read changes from underlying file if it changes
set showmode      " Showing current mode is helpful++
set showcmd
set nocompatible  " Actually make this vim
set ttyfast       " We don't use 33.6 modems these days
set ruler

set incsearch     " Use incremental search like OMG yes
set ignorecase    " Ignore case when searching
set hlsearch      " Highlight searching
set showmatch     " Show me where things match
set diffopt=filler,iwhite "Nice diff options
set showbreak=↪   " Cooler linebreak
set noswapfile    " It's 2014, GO AWAY FFS

set esckeys       " Allow escape key in insert mode
set cursorline    " Highlight the line we're on
set encoding=utf8 " Really, people still use ASCII

You’ll notice that 2 spaces is the default but, obviously, Python is a good example of a language that uses 4.

au FileType python setlocal tabstop=8 expandtab shiftwidth=4 softtabstop=4

This way, you’ll see, we get to customise each language. It’s nice. ‘au’ is short for ‘autocmd’ - as in, automatically run this when the FileType is python.

Syntastic

This is the Batman utility belt. It’s also easy to set up and serves as a good example of how Vundle works.

Bundle 'scrooloose/syntastic'

Job done. Make sure this goes between the Vundle begin and end calls.

Now save that and we’ll reload/install from the command line:

vim -c "execute \"BundleInstall\" | q | q"

This will load up vim, install all the things and then exit when done.

Just spent the weekend at FOSDEM 2014. It’s the first time I have been to FOSDEM and checked out more of the Open Source world.

Seven of us went from Green Man Gaming, and the main thing I’ll remember for the future is that I need to turn up well in advance to the talks I want to see.

Rooms always filled up very quickly. I managed to meet lots of cool and awesome people though.

Go was the language of the conference

There’s no way you could avoid this. Go is mainstream now. It’s been heading this way for a while - but it’s very clear that this language, whose point people once wondered about, is now relevant to the point of obsession. The room was constantly rammed, with people staying for talk after talk.

I’m not the only one that laughs at MongoDB

Yep, turns out lots of people find the stability amusing.

There are tons of PostgreSQL I don’t know

This was always going to be a given. The RDBMS is still relevant and still attracting a lot of attention. There was a definite lack of MySQL and MariaDB however. Maybe that swings it a bit.

There’s loads of stuff coming in 9.4.x for JSON and the like. The main thing I got from the talks though was an understanding of TOAST.

I need to do more here

Next year… I need to make more of an effort to plan and attend more talks.

I will be here next year!

I was trying to make HTTP calls using Finagle today and all I would get was this traceback from my logs:

FAT [20121215-15:49:46.803] HttpServer: A server service client threw an exception
FAT [20121215-15:49:46.803] HttpServer: com.twitter.finagle.WriteException$$anon$1: java.net.ConnectException: connection timed out
FAT [20121215-15:49:46.803] HttpServer:     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processConnectTimeout(NioClientSocketPipelineSink.java:391)
FAT [20121215-15:49:46.803] HttpServer:     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:289)
FAT [20121215-15:49:46.803] HttpServer:     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
FAT [20121215-15:49:46.803] HttpServer:     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
FAT [20121215-15:49:46.803] HttpServer:     at java.lang.Thread.run(Thread.java:722)
FAT [20121215-15:49:46.803] HttpServer: Caused by java.net.ConnectException: connection timed out
FAT [20121215-15:49:46.803] HttpServer:     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processConnectTimeout(NioClientSocketPipelineSink.java:391)
FAT [20121215-15:49:46.803] HttpServer:     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:289)
FAT [20121215-15:49:46.803] HttpServer:     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
FAT [20121215-15:49:46.803] HttpServer:     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
FAT [20121215-15:49:46.803] HttpServer:     at java.lang.Thread.run(Thread.java:722)

It turns out I needed to set up my ClientBuilder a little differently:

// Imports assumed for this era of Finagle - check against your version
import com.twitter.conversions.time._
import com.twitter.finagle.builder.ClientBuilder
import com.twitter.finagle.http.Http

val client = ClientBuilder()
  .codec(Http())
  .hosts("localhost:80")
  .tcpConnectTimeout(2.seconds)
  .requestTimeout(55.seconds)
  .hostConnectionLimit(5)
  .build()

The important bits, which don’t appear to be well documented, are tcpConnectTimeout and requestTimeout. The usual ‘timeout’ on the ClientBuilder is not what you want.

This was more a note for me - but figured people Googling might find it useful also.

Recently I was catching up on talks from DjangoCon EU 2012. I wish I could have been there. This talk on Flasky Goodness (or, why Django sucks) struck a chord with me.

Why? Well, for me the point of writing everything like it’s going to be open source seems like a great way of doing things. It’s a great philosophy to have. Seriously.

Interesting tidbit posted by The Register last week. It’s a topic fairly close to my heart as I don’t have a degree. I do, weirdly, get asked about this quite a bit - “should I get a degree?”. If you think you should, you should. Don’t let any “IT pro” sway your decision.

This is especially important now that degrees are just so damned expensive. According to El Reg, you’re looking at £27k for your degree. Wow!

Personally, when I look at people’s CVs, the first thing I’ll do is Google them. Then, I’ll take a look at their Github profile. Then I look at their education. It’s important that if they did a degree, they did well - but it doesn’t matter if they didn’t do one at all.