Docker is a hot topic at the moment in the DevOps world. I use it almost every day and want to look at how automation can be achieved in terms of security and monitoring.

Containers in computing aren’t new. In fact, FreeBSD had containers before Google was using them in Linux, although it calls them jails.

Docker is great in that it’s brought containers to the masses. They were once the reserve of people with the patience to set up LXC on Linux or jails on FreeBSD. Side note: setting up jails is very painful; I might talk about that another time.

We can talk to Docker via its RESTful API, and libraries exist for almost every language. The two obvious ones are Go and Python. I say obvious, but it’s more that I just prefer these two languages. I’m sure the Ruby one is awesome too.

A downside of Docker that’s coming up more and more is managing the security of containers. People often just use official images without a second thought, and these end up in production. There are already plenty of posts full of FUD on the topic, but in general, how do you ensure you keep your containers’ operating system packages up to date?

Sounds like a task for a script. I broke it down into the following tasks:

  • Connect to Docker (boot2docker in my case)
  • Get a list of installed packages in the debian:jessie image
  • Get a list of packages from security.debian.org
  • Compare the two

I should add that I used Python 3.4 for this, so the syntax may look a little odd if you’re used to Python 2.x.

Let’s get connecting out of the way:

from docker.client import Client
from docker.utils import kwargs_from_env

# So we can use boot2docker
kwargs = kwargs_from_env()
kwargs['tls'].assert_hostname = False
kwargs['tls'].verify = False  # Workaround https://github.com/docker/docker-py/issues/465

# Set up the client
client = Client(**kwargs)

It took me a little while to figure out that OpenSSL 1.0.2a causes problems for quite a few libraries when talking to APIs. To get around it for now I disable the verify part of requests, which makes it complain a lot.

Now we’re connected we can make a container and get some stuff out:

# Create my Debian Jessie container
container = client.create_container(
    image='debian:jessie',
    stdin_open=True,
    tty=True,
    command="/usr/bin/dpkg-query -Wf '${Package},${Version}\n'",
)

# Launch it with the custom command
client.start(container)

# Wait for dpkg-query to exit so the logs are complete
client.wait(container)

# Grab the dpkg-query output
output = client.logs(container).decode("utf-8")

packages = {}

lines = output.split('\n')
lines.pop(0)
for line in lines:
    # Last line is a blank
    if not line:
        continue

    k, v = line.split(',')
    packages[k] = v

We now have a stack of packages in a dictionary keyed by package name. To get there we make good use of dpkg-query to produce a CSV-like list of package,version pairs.
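The parsing step can also be pulled out into a small function, which makes it easy to try without a running container. This is only a sketch: `parse_dpkg_output` is a name made up here, and it assumes the output may contain `\r\n` line endings, which can happen when the container is created with a TTY. Rather than dropping the first line, it simply skips any line that has no comma.

```python
def parse_dpkg_output(output):
    """Turn 'package,version' lines into a {package: version} dict."""
    packages = {}
    for line in output.splitlines():
        # strip() removes stray '\r' characters and surrounding whitespace
        line = line.strip()
        if not line:
            continue
        name, sep, version = line.partition(',')
        # Ignore anything that is not a package,version pair
        if sep:
            packages[name] = version
    return packages


# Made-up sample output, just to show the shape of the result
sample = "sed,4.2.2-4+b1\r\ntar,1.27.1-2+b1\r\n\r\n"
print(parse_dpkg_output(sample))
```

The loop from the script could then be replaced with `packages = parse_dpkg_output(output)`.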

What we want next is a similar dict for up-to-date packages. Now, I know a lot of people who might read this would launch into apt-get update and then query the global list of packages. Would you do that in production? Really? You just want a list of stuff… Let’s just get it from security.debian.org directly.

import gzip
import re

import requests

r = requests.get('http://security.debian.org/dists/jessie/updates/main/binary-amd64/Packages.gz', stream=True)
gz = gzip.GzipFile(fileobj=r.raw)

A small point here: we use the gzip library directly to decompress the file downloaded via Requests. To do this we treat r.raw as a file object, which GzipFile can read without any issue.
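The same trick works with any file-like object, not just a Requests response. Here is a minimal sketch that runs offline by swapping the download for an in-memory buffer; the payload is made-up sample data, and `io.BytesIO` stands in for `r.raw`.

```python
import gzip
import io

# Made-up sample standing in for the downloaded Packages.gz
payload = b"Package: sed\nVersion: 4.2.2-4+b1\n\n"
buf = io.BytesIO(gzip.compress(payload))

# GzipFile only needs something it can read() from
with gzip.GzipFile(fileobj=buf) as gz:
    lines = [line.decode("utf-8") for line in gz]

print(lines)
```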

Now the format of this file is a bit weird. It’s a list of key-value pairs for each package, with a blank line between packages. The two keys we’re interested in for each package are Package (the name) and Version.

kvregex = re.compile(r'(\w+): (.*)')
security_updates = {}

current_package = None
current_version = None

# Build up a security updates dict
for line in gz.readlines():
    m = kvregex.match(line.decode("utf-8"))
    if not m:
        security_updates[current_package] = current_version
        continue

    g = m.groups()
    if g[0] == 'Package':
        current_package = g[1]
    elif g[0] == 'Version':
        current_version = g[1]

r.close()

Perfect! We now have a dict with all the security updates in Jessie keyed by the package name again.
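For comparison, here is an alternative sketch of the same parsing that works on whole stanzas instead of tracking state line by line. It assumes the whole decompressed file fits in memory as one string. `parse_packages` is a hypothetical helper name and the sample text is made up, not taken from the real index.

```python
def parse_packages(text):
    """Return {package: version} from the text of a Debian Packages index."""
    versions = {}
    # Stanzas are separated by blank lines
    for stanza in text.split('\n\n'):
        fields = {}
        for line in stanza.splitlines():
            # Continuation lines (long descriptions) start with whitespace
            if not line or line[0].isspace():
                continue
            key, sep, value = line.partition(':')
            if sep:
                fields[key] = value.strip()
        # Only record stanzas that carry both fields we care about
        if 'Package' in fields and 'Version' in fields:
            versions[fields['Package']] = fields['Version']
    return versions


# Made-up sample data in the Packages format
sample = (
    "Package: sed\n"
    "Version: 4.2.2-4+b2\n"
    "Installed-Size: 799\n"
    "Description: GNU stream editor\n"
    " sed reads the specified files.\n"
    "\n"
    "Package: libssl1.0.0\n"
    "Source: openssl\n"
    "Version: 1.0.1k-3+deb8u1\n"
)
print(parse_packages(sample))
```

Because each stanza is handled on its own, a version can never leak from one package to the next, and keys containing hyphens such as Installed-Size are parsed like any other field.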

With these two dicts we can intersect them and keep only the packages that appear in both. If the versions don’t match, we print the package. I had to fake an update to test this properly, as there were no out-of-date packages when I ran it.

def common_entries(*dcts):
    for i in set(dcts[0]).intersection(*dcts[1:]):
        yield (i,) + tuple(d[i] for d in dcts)

# Fake a security update for Sed... (cos Jessie 8.1 is quite up to date)
security_updates['sed'] = '4.2.2-4+b2'

for entry in common_entries(packages, security_updates):
    if entry[1] != entry[2]:
        print('%s is %s and a security update exists for %s' % entry)

And, there we have it (spot the whining from requests…):

$ python foobar.py
.../venv/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py:768: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
  InsecureRequestWarning)
.../venv/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py:768: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
  InsecureRequestWarning)
.../venv/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py:768: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
  InsecureRequestWarning)
sed is 4.2.2-4+b1 and a security update exists for 4.2.2-4+b2

Awesome! There we have it - a quick way to grab and compare packages against containers.
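One caveat on the comparison: `!=` only says the two version strings differ, not which one is newer. Debian has its own ordering rules (epochs, `~` sorting before everything, and so on), so below is a rough pure-Python sketch of that ordering, written from the description in Debian Policy. The function names are made up for this example and it is not a drop-in replacement for the real thing; `dpkg --compare-versions` is the authoritative check.

```python
import re


def _order(ch):
    # '~' sorts before everything (even the end of the string),
    # letters sort before all other characters
    if ch == '~':
        return -1
    if ch.isalpha():
        return ord(ch)
    return ord(ch) + 256


def _compare_part(a, b):
    """Compare two upstream-version or revision strings."""
    while a or b:
        # Compare the leading non-digit runs character by character
        na = re.match(r'\D*', a).group()
        nb = re.match(r'\D*', b).group()
        a, b = a[len(na):], b[len(nb):]
        for i in range(max(len(na), len(nb))):
            ca = _order(na[i]) if i < len(na) else 0
            cb = _order(nb[i]) if i < len(nb) else 0
            if ca != cb:
                return -1 if ca < cb else 1
        # Then compare the leading digit runs numerically
        da = re.match(r'\d*', a).group()
        db = re.match(r'\d*', b).group()
        a, b = a[len(da):], b[len(db):]
        ia, ib = int(da or 0), int(db or 0)
        if ia != ib:
            return -1 if ia < ib else 1
    return 0


def _split(version):
    """Split '[epoch:]upstream[-revision]' into its three parts."""
    epoch, sep, rest = version.partition(':')
    if not sep:
        epoch, rest = '0', version
    upstream, sep, revision = rest.rpartition('-')
    if not sep:
        upstream, revision = rest, ''
    return int(epoch), upstream, revision


def compare_versions(a, b):
    """Return -1, 0 or 1 if a is older than, equal to or newer than b."""
    ea, ua, ra = _split(a)
    eb, ub, rb = _split(b)
    if ea != eb:
        return -1 if ea < eb else 1
    return _compare_part(ua, ub) or _compare_part(ra, rb)


print(compare_versions('4.2.2-4+b1', '4.2.2-4+b2'))
```

With something like this in place, the check in the loop could become `compare_versions(entry[1], entry[2]) < 0`, so only packages that are actually behind the security archive get reported.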