Categories
Linux

I can’t remember tar commands

.zip, .tar, .tar.gz, .tar.bz, .gz, .tar.bz2 and so many more fucking archive file formats! There are like a bazillion archive/compression formats on this earth. And everyone uses a different one.

But how to handle all these? On systems with a proper GUI this is mostly pretty simple stuff – just install a proper tool that can handle all of these. For example on Windows just put it into 7-Zip and you’re good to go. Most modern Linux distros also ship something at least partially useful nowadays.

But what if you’re working purely on the command line with no fancy tools installed AND YOU JUST WANT TO FUCKING UNZIP THE ARCHIVE YOU JUST DOWNLOADED? The professionals will tell you to just tar -xzf the whole archive (or something thereabouts). But wait, wasn’t it tar -jxf? I just can’t remember…

“Luckily, there’s always a search engine to help you out”, I thought. “The answer is just one google away.” – And well, it is. But after I googled the same tar command for the 9000th time I just told myself: “I don’t want to do that anymore”. It just feels stupid to forget the same command over and over again, but I can’t help it. So I needed a solution that felt at least slightly smarter than just googling the command every time I unpack some archive.

Protip: Bash supports aliases!

And that’s exactly how I solved it. I just added some aliases for common archive files, each with an easy-to-remember name. The alias rule is pretty simple: Just write “un” (short for “unpack”) and then the file ending without any dots. For example, if I wanted to unpack a .tar.gz file I would just write “untargz <filename>”. MUCH easier to remember, right? Well, at least for me it is.
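The naming rule can also be written down in a few lines of Python, for example to generate the alias lines instead of typing them. This is only a sketch: the function names alias_name and alias_line and the UNPACK_COMMANDS table are made up for illustration, and the commands in the table are the ones used in this post.

```python
# Suffix -> unpack command, taken from the aliases used in this post.
UNPACK_COMMANDS = {
    ".tar.gz": "tar -xzf",
    ".tar.bz2": "tar -jxf",
    ".gz": "gunzip",
}


def alias_name(suffix):
    """Apply the rule: "un" plus the file ending without any dots."""
    return "un" + suffix.replace(".", "")


def alias_line(suffix, command):
    """Build one line that can be pasted into ~/.bashrc."""
    return f"alias {alias_name(suffix)}='{command}'"


for suffix, command in UNPACK_COMMANDS.items():
    print(alias_line(suffix, command))
```

Adding another format is then just one more entry in the table.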

If you want to do the same thing, you will need to create the necessary aliases in your shell. A common place to put these is the .bashrc file located in your user’s home directory (e.g. /home/max/.bashrc). Some systems also have a dedicated .bash_aliases file in the same directory, which is loaded by the .bashrc file. If you have such a file you may also use that. The following is an excerpt from my .bashrc file, which includes aliases for common Unix archive filetypes.

# ~/.bashrc: executed by bash(1) for non-login shells.

# You may uncomment the following lines if you want `ls' to be colorized: (Everyone likes colors, right?)
export LS_OPTIONS='--color=auto'
eval "`dircolors`"
alias ls='ls $LS_OPTIONS'
alias ll='ls $LS_OPTIONS -l'
alias l='ls $LS_OPTIONS -lA'
alias untargz='tar -xzf'
alias untarbz='tar -jxf'
alias ungz='gunzip'
alias untarbz2='tar -jxf'

There are many more archive filetypes, but the aliases above cover the most common ones – if you need more, you can certainly figure out how to add these yourself. To me these aliases have been highly useful. Maybe someone else will find them useful too.

PS: The “ls” aliases shown above were suggested by someone else. I also found them to be useful, especially the ‘l’ shortcut for ls -lA, which is why I included them here too.

Categories
Transport Layer Security (TLS)

Monitoring certificate issuance with the power of certificate transparency

Wow, what a long title. And so big! I really need to rework this site layout – or think of shorter titles.

Basics first, what is certificate transparency?

It’s a (still new) technology to monitor the issuance of certificates. (Wow, what a great statement, that’s exactly the title of this post.) No really, the thing is that the TLS protocol (like many other security protocols) has always had trust issues. Not psychological, but technical.

“On the Internet, nobody knows you’re a dog”

– Peter Steiner

And that’s where trust issues start. The protocols that make up the internet at its core do not make any guarantees about who you’re talking to. That’s a problem if you want security – if I want to talk to my banking website, how can I be certain that it’s actually my bank at the other end of the wire? Could be anyone.

In order to solve that problem, digital certificates were introduced. But to be honest, those didn’t solve the trust issue, they just shifted it. Certificates are basically proof that you are who you claim to be. To be worth anything, they need some kind of notarization. That is done by certificate authorities. They are magical entities which never make mistakes and only issue certificates to good people with a proven identity. That’s why everyone trusts these certificate authorities and that’s how we solved trust issues on the internet.

Sounds too good to be true? In fact it actually works surprisingly well. But not always. Certificate authorities (CAs) make mistakes, get hacked or are otherwise compromised, and then you’ve got a problem:
Because everyone trusts the CA and the power of a CA is hardly limited (there’s CAA, but we’ll save that for another post), a fraudulent CA can issue certificates for pretty much any domain it wants.

That is obviously a bad thing and there should be protection against that – and there is! One way of protection is certificate transparency.

You still haven’t told what that is…

We’ll get there. The idea behind Certificate Transparency is a requirement that all certificates must be logged publicly. If you can force every CA to always and without exception log every certificate that it issues, you can at least detect immediately when a CA goes full rogue – with smart (and automatic) monitoring you can also detect other abnormalities in the certificate ecosystem, but that’s way beyond the scope of this post.

And how does it work exactly?

Sadly, the details are also way beyond the scope of this post. To make it short, Google Chrome (and other browsers, most notably Apple products) requires every freshly issued certificate to appear in a public log that records all certificates. If a certificate isn’t logged there, the browser does not allow a connection to the website. That’s how you force every CA to log every certificate issuance there. There’s a lot of crypto involved to make sure that certificates can prove they’re part of a log, and the logs use a Merkle tree to prove that they are playing fair too.
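To give at least a rough idea of the Merkle tree part, here is a toy sketch in Python. It follows the hashing rules from RFC 6962 (leaves are hashed with a 0x00 prefix, inner nodes with a 0x01 prefix, and the tree is split at the largest power of two smaller than the number of entries). Everything else is simplified: the entries are plain byte strings instead of real log entries, there are no signatures, and the function names are made up for illustration.

```python
import hashlib


def leaf_hash(entry):
    return hashlib.sha256(b"\x00" + entry).digest()


def node_hash(left, right):
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_point(n):
    """Largest power of two strictly smaller than n (for n > 1)."""
    return 1 << ((n - 1).bit_length() - 1)


def merkle_root(entries):
    """Hash a list of entries down to a single root hash."""
    n = len(entries)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(entries[0])
    k = split_point(n)
    return node_hash(merkle_root(entries[:k]), merkle_root(entries[k:]))


def audit_path(index, entries):
    """Sibling hashes needed to prove that entries[index] is in the tree."""
    n = len(entries)
    if n == 1:
        return []
    k = split_point(n)
    if index < k:
        return audit_path(index, entries[:k]) + [merkle_root(entries[k:])]
    return audit_path(index - k, entries[k:]) + [merkle_root(entries[:k])]


def root_from_path(index, size, leaf, path):
    """Recompute the root from one leaf hash and its audit path."""
    if size == 1:
        return leaf
    k = split_point(size)
    if index < k:
        return node_hash(root_from_path(index, k, leaf, path[:-1]), path[-1])
    return node_hash(path[-1], root_from_path(index - k, size - k, leaf, path[:-1]))


# Made-up log content: five "certificates"
certs = [b"cert-a", b"cert-b", b"cert-c", b"cert-d", b"cert-e"]
root = merkle_root(certs)
proof = audit_path(2, certs)
print(root_from_path(2, len(certs), leaf_hash(certs[2]), proof) == root)
```

The point is that the proof for one certificate only needs a handful of hashes (three in this example), not the whole log, and that changing any logged entry changes the root.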

I’m not a certificate authority. If it works, why should I care about certificate transparency?

The thing is, certificate transparency is pretty worthless if it isn’t fully utilized. If every certificate is put into a log, but no one watches that log, the log is useless. There are a few people and companies that monitor all logs for unusual behavior and notify relevant authorities if something happens. But that isn’t enough to detect every attack. Small (e.g. single-domain) attacks cannot be detected by large-scale monitoring. That only works if domain owners do their own monitoring.

Note that certificate transparency logs are public. That means anyone is free to query them if they want to. That’s something that we can utilize. If you’re a domain owner, you can set up an automatic monitor application that tells you whenever someone issued a certificate for your domain. Imagine someone took over your domain unnoticed – maybe due to a leak, DNS hijack or something else. If an attacker has (limited or full) control over your domain, they can most likely also get a fake certificate issued for the domain and thus even intercept encrypted communications via a man-in-the-middle attack.

Certificate Transparency can help in detecting such attacks.

If you have a monitor that tells you who has issued certificates for your domain(s) recently, you get notified immediately when something’s wrong. That in itself doesn’t fix your security hole, but at least aids in early detection and possibly mitigation of further breaches.

That sounds fine, but where/how can I monitor my domain(s)?

The ideal way would be to query all certificate transparency logs yourself. But that’s a complicated thing, which we will not cover here.

Luckily, there’s a “cheap alternative”. There’s a service, hosted by Sectigo (formerly Comodo), called crt.sh. They automatically query all approved certificate transparency logs and put every certificate in a public database that can be nicely accessed – both via web and via PostgreSQL.

If you want automatic monitoring for your domains, you can just access that website (or database) periodically, fetch every certificate for your domain and look for new ones. And that is it! That’s what basic monitoring is. You can get pretty sophisticated monitoring, sure, but a simple email notification should already do the trick for most people. You will probably need a human though to determine if a new certificate issuance was “legal” (remember, Let’s Encrypt certificates are typically renewed/re-issued roughly every 60 days) or fraudulent (you can in theory automate this, but it’s not that easy).
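The “look for new ones” step is basically a set difference, and a first rough triage can be done on the issuer. The following is a minimal sketch, assuming every entry in the crt.sh JSON output carries an id and an issuer_name field. The sample entries, the EXPECTED_ISSUERS list and the function names are made up for illustration.

```python
# Hypothetical allowlist: CAs we expect to issue certificates for our domain.
EXPECTED_ISSUERS = ("Let's Encrypt",)


def find_new_certs(fetched, known_ids):
    """Return the fetched entries whose id is not in known_ids yet."""
    return [entry for entry in fetched if str(entry["id"]) not in known_ids]


def looks_expected(entry):
    """Rough triage: True if the issuer name mentions an expected CA."""
    issuer = entry.get("issuer_name", "")
    return any(name in issuer for name in EXPECTED_ISSUERS)


# Made-up sample data in the shape assumed above
fetched = [
    {"id": 1001, "issuer_name": "C=US, O=Let's Encrypt, CN=R3"},
    {"id": 1002, "issuer_name": "C=XX, O=Some Unknown CA, CN=Unknown"},
]
known_ids = {"1001"}

for entry in find_new_certs(fetched, known_ids):
    verdict = "probably a renewal" if looks_expected(entry) else "needs a human look"
    print(entry["id"], verdict)
```

Keep in mind that the issuer check is only a hint. An attacker who controls your domain can get a certificate from the same CA you use, which is why a human should still look at the notifications.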

crt.sh protip: When querying the web service of crt.sh, putting the parameter output=json in a query will get you results in nice JSON format – much easier for scripts. For example, if you want all certificates for %.germancoding.com (% means “match anything”) you could do this:
https://crt.sh/?q=%25.germancoding.com&output=json
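Note that the % in the URL above had to be percent-encoded as %25. If you build the URL in a script, it’s easiest to let the standard library do that. A small sketch (the helper name crtsh_url is made up, the q and output parameters are the ones from the URL above):

```python
from urllib.parse import urlencode


def crtsh_url(pattern):
    """Build a crt.sh query URL that returns JSON for the given pattern."""
    return "https://crt.sh/?" + urlencode({"q": pattern, "output": "json"})


print(crtsh_url("%.germancoding.com"))
```

If you use the requests module and pass the parameters via params=..., this encoding is also done for you.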

If you go to advanced search you will see some more web filter options (like excluding expired certificates). For implementation details, or if you want to access the database directly, it’s probably best to head over to their forums and look for answers there – most things are explained there.

Tutorial time: Simple monitor python script

The following is an example script that does the following:

  • Fetch all currently valid certificates from crt.sh for a given example domain (germancoding.com in this example)
  • Bail out if that fails
  • Load a local SQLite database from disk (“knowncerts.db”) that stores all certificates that we already know
  • Compare list of certificates from crt.sh with the local database, look for new certificates
  • If one or more new certificates are found:
    • Try to fetch additional information about them
    • Then send an email to a preconfigured address, telling them about the new certificate(s)
    • Finally, add the new certificate(s) to the local database

Known caveats and other things to know about the script:

  • The script needs Python 3.7 or newer and the third-party requests module (the subprocess and sqlite3 modules it also uses are part of the standard library)
  • The script relies on some OS commands being available:
    • ‘openssl’ command in order to parse a new certificate
    • A working ‘mail’ command in order to send emails
  • The local SQLite database ‘knowncerts.db’ must already exist locally. In order to create the database, you need to do the following:
    • Make sure sqlite/sqlite3 is installed (including the python module)
    • Running sqlite3 knowncerts.db on the command line will create a new database (or open an existing one)
    • Executing CREATE TABLE IF NOT EXISTS certs (id text, serial text); in the freshly opened SQLite terminal will create the necessary table structure expected by the script.
    • Type .exit to get out of that terminal.
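If you don’t want to use the SQLite terminal, the same table can be created from Python. This is a minimal sketch using the sqlite3 module from the standard library. The file name and the table definition are the ones from the list above, the function name create_schema is made up.

```python
import sqlite3


def create_schema(conn):
    """Create the table expected by the monitor script (does nothing if it exists)."""
    with conn:  # commits on success, rolls back on error
        conn.execute("CREATE TABLE IF NOT EXISTS certs (id text, serial text)")


if __name__ == "__main__":
    conn = sqlite3.connect("knowncerts.db")  # creates the file if it is missing
    create_schema(conn)
    conn.close()
```

Because of the IF NOT EXISTS, running this more than once is harmless.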

Now, let’s finally get to it. The formatting below is a lot different than the rest of this page, but this snippet really needs as much space as it can get.

import subprocess
import sqlite3

import requests

# Replace domain below with your own
payload = {'Identity': '%.germancoding.com', 'exclude': 'expired', 'output': 'json'}
r = requests.get('https://crt.sh/', params=payload)
r.raise_for_status()
allcerts = r.json()
# You can repeat the above and then do the following to make multiple queries in one go:
# allcerts.extend(r.json())

changes = False
conn = sqlite3.connect('knowncerts.db')
c = conn.cursor()
c.execute("SELECT id FROM certs")
# fetchall() returns one tuple per row, keep only the first element (the id)
# https://stackoverflow.com/questions/12142133/how-to-get-first-element-in-a-list-of-tuples
knowncerts = [row[0] for row in c.fetchall()]
knownserials = []  # Serials seen during this run
for item in allcerts:
    cert_id = str(item['id'])  # Field is called 'min_cert_id' on legacy servers
    if cert_id not in knowncerts:
        # name_value can contain several lines, use the first one for the mail subject
        domain = item['name_value'].strip().splitlines()[0]
        #print("debug: new cert " + domain + " " + cert_id)
        text = "ERROR UNDEFINED TEXT"
        serial = "ERROR"
        sendmail = True
        try:
            # Get PEM cert from crt.sh and print it using OpenSSL
            payload = {'d': cert_id}
            r = requests.get('https://crt.sh/', params=payload)
            r.raise_for_status()
            pem = r.text
            # <PEM encoded cert> >> openssl x509 -serial -noout (Get cert serial)
            serial = subprocess.run(["openssl", "x509", "-serial", "-noout"], input=pem, text=True, check=True, capture_output=True, timeout=5).stdout
            serial = serial.replace("serial=", "").strip().splitlines()[0]
            # Same serial under a different id: already reported, so don't send another mail
            c.execute('SELECT id, serial FROM certs WHERE serial=?', [serial])
            if len(c.fetchall()) > 0 or serial in knownserials:
                #print("debug: serial already known")
                sendmail = False
            # <PEM encoded cert> >> openssl x509 -text -noout (Get cert in human readable format)
            text = subprocess.run(["openssl", "x509", "-text", "-noout"], input=pem, text=True, check=True, capture_output=True, timeout=5).stdout
        except Exception as e:
            # Still store the cert and send the mail, just without the details
            text = "(Failed to fetch additional data - " + repr(e) + ")"
        # Put into database
        c.execute("INSERT INTO certs VALUES(?, ?)", [cert_id, serial])
        if sendmail:
            mailbody = "Issued cert can be found at https://crt.sh/?id=" + cert_id + "\n\n\n" + text
            # Replace below with your own email address - note that the "mail" command needs to work (e.g. your system needs a working MTA)
            # mail -s <Subject> <Dest Email> << <Email Text>
            subprocess.run(["mail", "-s", "New cert for monitored domain " + domain, "root@germancoding.com"], input=mailbody, text=True, check=True, timeout=60)
        knowncerts.append(cert_id)
        knownserials.append(serial)
        changes = True
if changes:
    conn.commit()
c.close()
conn.close()

You can put the script above in a certnotify.py python file and run it with a cronjob, hourly or so:

# crontab -e

27 * * * * cd /home/max && python3 ./certnotify.py >> /var/log/certnotify/certnotify.log 2>> /var/log/certnotify/certnotify.err.log

Categories
Uncategorized

Hello, World!

Well, this is it. The beginning of a new blog. Not the first one to be honest. We will see how long it will last.

Soo, what’s this about? Well, I don’t know yet. The idea is to write about all sorts of software or computer related stuff (administration, programming, debugging, …) that I somehow encounter. Maybe a bit tutorial-style, but we’ll see that when we get there.

Note that this is not a professional blog in any way. As such I don’t care at all about correct grammar – it either works or it does not (as with all software 🙂 ).