Categories
Transport Layer Security (TLS)

Monitoring certificate issuance with the power of certificate transparency

Wow, what a long title. And so big! I really need to rework this site layout – or think of shorter titles.

Basics first, what is certificate transparency?

It’s a (still new) technology to monitor the issuance of certificates. (Wow what a great statement, that’s exactly the title of this post). No really, the thing is that the TLS protocol (and many other security protocols) have always had trust issues. Not psychological, but technical.

“On the Internet, nobody knows you’re a dog”

– Peter Steiner

And that’s where trust issues start. The protocols that make up the internet at it’s core do not make any guarantees about who you’re talking with. That’s a problem if you want security – if I want to talk to my banking website, how can I be certain that it’s actually my bank at the other side of the wire? Could be anyone.

In order to solve that problem, digital certificates were introduced. But to be honest, those didn’t solve the trust issue, but just shifted it. Certificates are basically a proof that you’re who you claim you are. In order to make them worth something, they need to get some kind of notarization. Those are done by certificate authorities. They are magical instances which never make mistakes and only issue certificates to good persons with a proven identity. That’s why everyone trusts these certificate authorities and that’s how we solved trust issues on the internet.

Sounds too good to be true? In fact it actually works surprisingly well. But not always. Certificate authorities (CA) make mistakes, get hacked or otherwise compromised and then you got the problem:
Because everyone trusts the CA and the power of a CA is hardly limited (there’s CAA, but we’ll keep that to another post), a fraudulent CA can pretty much issue certificates for every domain it wants.

That is obviously a bad thing and there should be protection against that – and there is! One way of protection is certificate transparency.

You still haven’t told what that is…

We’ll get there. The idea behind Certificate Transparency is a requirement that all certificates must be logged publicy. If you can force every CA to always and without exception log every certificate that it issues, you can at least detect immediatly when a CA goes full rogue – with smart (and automatic) monitoring you can also detect other abnormalities in the certificate ecosystem, but that’s way beyond the scope of this post.

And how does it work exactly?

Sadly, the details are also way beyond the scope of this post. To make it short, Google Chrome (and other browsers, most notably Apple products) requires every freshly issued certificate to appear in a public log that records all certificates. If a certificate isn’t logged there, the browser does not allow a connection to the website. That’s how you force every CA to log every certificate issuance there. There’s a lot of crypto involved to make sure that certificates can prove they’re part of a log and the logs use a Merkle tree to prove they’re also doing fair-play.

I’m not a certificate authority. If it works, why should I care about certificate transparency?

The thing is, certificate transparency is pretty worthless if it isn’t fully utilized. If every certificate is put into a log, but no one watches that log, the log is useless. There are a few people and companies that monitor all logs for unusual behavior and notify relevant authorities if something happens. But that isn’t enough to detect every attack. Small (e.g. single domain) attacks cannot be detected by large-scale monitoring. That only works if domain owners do their own monitoring.

Note that certificate transparency logs are public. That means anyone is free to query them if they want to. That’s something that we can utilize. If you’re a domain owner, you can setup an automatic monitor application that tells you whenever someone issued a certificate for your domain. Imagine someone took over your domain unnoticed – maybe due to a leak, DNS hijack or something else. If an attacker has (limited or full) control over your domain, they can most likely also issue a fake certificate for the domain and thus even intercept encrypted communications via a man in the middle attack.

Certificate Transparency can help in detecting such attacks.

If you have a monitor that tells you who has issued certificates for your domain(s) recently, you get notified immediatly that something’s wrong. That in itself doesn’t fix your security hole, but at least aids in early detection and possibly mitigation of further breaches.

That sounds fine, but where/how can I monitor my domain(s)?

The ideal way would be to query all certificate transparency logs yourself. But that’s a complicated and resource-intensive thing, which we will not cover here.

Luckily, there’s a “cheap alternative”. Theres a service, hosted by Sectigo (formerly Comodo), called crt.sh. They automatically query all approved certificate transparency logs and put every certificate in a public database, that can be nicely accessed – both via web and via PostgreSQL.

If you want automatic monitoring for your domains, you can just access that website (or database) periodically, fetch every certificate for your domain and look for new ones. And that is it! That’s what basic monitoring is. You can get pretty sophisticated monitoring, sure, but a simple email notification should already do the trick for most people. You will probably need a human though to determine if a new certificate issuance was “legal” (remember, Let’s Encrypt certificates are renewed/re-issued every 60 days) or fraudulent (you can in theory automate this, but it’s not that easy).

crt.sh protip: When querying the web service of crt.sh, putting the parameter output=json in a query will get you results in nice JSON format – much easier for scripts. For example, if you want all certificates for %.germancoding.com (% means “match anything”) you could do this:
https://crt.sh/?q=%25.germancoding.com&output=json

If you go to advanced search you will see some more web filter options (like excluding expired certificates). For implementation details, or if you want to access the database directly, it’s probably best to head over to their forums and look for answers there – most things are explained there.

Tutorial time: Simple monitor python script

The following is an example script that does the following:

  • Fetch all currently valid certficates from crt.sh for a few given example domains (germancoding.com and a few more in this example)
  • Bail out if that fails, but have automatic retries too
  • Load a local SQLite database from disk (“knowncerts.db”) that stores all certificates that we already know
  • Compare list of certificates from crt.sh with the local database, look for new certificates
  • If one or more new certificates are found:
    • Try to fetch additional information about them
    • Then send an email to a preconfigured address, telling them about the new certificate(s)
    • Finally, add the new certificate(s) to the local database

Known caveats and other things to know about the script:

  • The script relies on some python-dependencies (subprocess, requests, sqlite, urllib3, and json) being available (as well as python itself of course)
  • The script relies on some OS commands being available:
    • ‘openssl’ command in order to parse a new certificate
    • A working ‘mail’ command in order to send emails
  • The local SQLite database ‘knowncerts.db’ must already exist locally. In order to create the database, you need to do the following:
    • Make sure sqlite/sqlite3 is installed (including the python module)
    • sqlite3 knowncerts.db will create a new database (or open an existing one)
    • Executing CREATE TABLE IF NOT EXISTS certs (id text, serial text); in the freshly opened SQLite-terminal will create the neccessary table structure expected by the script.
    • Type .exit to get out of that terminal.

Now, let’s finally get to it. The formatting below is a lot different than the rest of this page, but this snippet really needs as much space as it can get.

import subprocess
from urllib3.util.retry import Retry
import requests
from requests.adapters import HTTPAdapter
import sqlite3
from subprocess import Popen, PIPE
import json

# Replace email with your own
email = "your-email@example.com"

# Replace domain(s) below with your own
# The OR syntax is understood by crt.sh and searches
# multiple domains
domains = "%.germancoding.com OR example.com OR letsencrypt.org"

# Setup a retry system in case crt.sh has temporary hiccups
session = requests.Session()
retry = Retry(
   total=10,
   status=10,
   status_forcelist=[429],
   backoff_factor=1,
   respect_retry_after_header=True,
)
adapter = HTTPAdapter(max_retries=retry)
session.mount('http://', adapter)
session.mount('https://', adapter)

payload = {'Identity': domains, 'exclude': 'expired', 'output': 'json'}
r = session.get('https://crt.sh/', params=payload)
r.raise_for_status()
allcerts = r.json()

changes = False
conn = sqlite3.connect('knowncerts.db')
c = conn.cursor()
c.execute("SELECT id FROM certs");
knowncerts = c.fetchall()
knowncerts = [i[0] for i in knowncerts]
knownserials = []
for item in allcerts:
   id = str(item['id'])
   if id not in knowncerts:
      domain = item['name_value'].strip().splitlines()[0]
      text = "ERROR UNDEFINED TEXT"
      serial = "ERROR"
      sendmail = True
      try:
        # Get PEM cert from crt.sh and print it using OpenSSL
        payload = {'d': id}
        r = session.get('https://crt.sh/', params=payload)
        r.raise_for_status()
        pem = r.text
        domain = subprocess.run(["openssl", "x509", "-subject", "-noout"], input=pem, text=True, check=True, capture_output=True, timeout=5).stdout.replace("subject=", "").strip().splitlines()[0].replace("CN = ", "")
        serial = subprocess.run(["openssl", "x509", "-serial", "-noout"], input=pem, text=True, check=True, capture_output=True, timeout=5).stdout
        serial = serial.replace("serial=", "").strip().splitlines()[0]
        c.execute('SELECT id, serial FROM certs WHERE serial=?', [serial])
        if len(c.fetchall()) > 0 or serial in knownserials:
           #print("debug: serial already known")
           sendmail = False
        text = subprocess.run(["openssl", "x509", "-text", "-noout"], input=pem, text=True, check=True, capture_output=True, timeout=5).stdout
      except Exception as e:
        text = "(Failed to fetch additional data - " + repr(e) + ")"
        pass
      c.execute("INSERT INTO certs VALUES(?, ?)", [id, serial]);
      if sendmail:
         mailbody = "Issued cert can be found at https://crt.sh/?id=" + id + "\n\n\n" + text
         subprocess.run(["mail", "-s", "New cert " + domain + " for a monitored domain", email], input=mailbody, text=True, check=True, timeout=60)
      knowncerts.append(id)
      knownserials.append(serial)
      changes = True
if changes:
  conn.commit()
c.close()
conn.close()

You can put the script above in a certnotify.py python file and run it with a cronjob, hourly or so:

# crontab -e

27 * * * * python3 /path/to/certnotify.py >> /var/log/certnotify/certnotify.log 2>> /var/log/certnotify/certnotify.err.log

Categories
Uncategorized

Hello, World!

Well, this is it. The beginning of a new blog. Not the first one to be honest. We will see how long it will last.

Soo, what’s this about? Well, I don’t know yet. The idea is to write about all sorts of software or computer related stuff (admistration, programming, debugging, …) that I somehow encounter. Maybe a bit tutorial-style, but we’ll see that when we get there.

Note that this is not a professional blog in any way. As such I don’t care at all about correct grammar – it either works or it does not (as with all software 🙂 ).