Deploying a Self-Hosted Static Website with ‘git push’

It’s something of a trope that when a programmer starts a blog, an inordinate number of their blog posts will be about the act of blogging, especially with regard to their chosen tech stack.

It’s also something of a trope that said programmer/blogger is doomed perpetually to tinker with said tech stack, rather than spending that energy on more directly productive activities; like, I don’t know, writing.

Alas, I am immune to neither of these; and so we find ourselves here, as I prepare to regale you with the tale of how I migrated my blog from Netlify’s fancy specialized static-site hosting to my own general-purpose Ubuntu VM on Digital Ocean.

Well… there’s really not that much of a tale. Towards the beginning of this year, I decided I wanted to try hosting Git repositories on my own domain; and, ultimately, I did so via Digital Ocean. Originally, I intended to leave this website where it was, figuring it would be better for the sake of robustness if my Git server and my blog remained separate.

Then, in February, I came across a viral news story about someone on Netlify’s free tier—which I was also on—who suddenly and unexpectedly received a bill for over $100,000 USD. The bill was ultimately dropped, but the incident nonetheless gave me cause to reconsider my hosting arrangements.

Netlify’s free tier has a usage limit of 100 GB per month, which seems pretty reasonable—I never came close to breaking one gigabyte, let alone one hundred. If you ever do reach the limit, though, that’s when things get hairy: rather than cutting off the site, as I and many others might assume, they instead bill you $55; and from there, another $55 for every additional 100 GB of bandwidth used until the next month begins.

Meanwhile, my Digital Ocean VM costs $6 per month and allows 1,000 GB of bandwidth before imposing overage charges of $0.01 per gigabyte. I would have to use 5,900 GB of bandwidth in one month for Digital Ocean to charge me—including the $6!—as much as Netlify would for 101 GB. Since I was already paying for the VM, hosting my website there too is effectively free and gives me much less cause for anxiety over hypothetical spikes in usage.
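For the skeptical, that break-even point falls out of a one-line calculation (dollar figures as above):

```shell
# Break-even bandwidth: the point where Digital Ocean's bill ($6 base, plus
# $0.01/GB past the included 1,000 GB) equals Netlify's first $55 overage pack.
awk 'BEGIN { gb = (55 - 6) / 0.01 + 1000; print gb " GB" }'
# → 5900 GB
```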

There were three main reasons why I chose Netlify in the first place:

  1. It was free. As I just mentioned, though, this is irrelevant since I would have been paying for the VM regardless.
  2. They support the static-site generator Hugo, which I no longer use.
  3. They automatically re-build and re-deploy your site whenever you push commits to GitHub (or GitLab, or a handful of other repository hosts).

This last point was the only one left that I particularly cared about. I had already set up the capability to git push to my server; how hard could it be to make that trigger a deployment as well?

As it turns out, it’s not all that difficult. Still, that didn’t stop me from making a few missteps along the way.

How it works

The repository for my blog is hosted privately, so I can commit drafts in a branch without being too self-conscious about my raw unpolished prose attracting scrutiny. Whereas my public repositories live in the filesystem under /srv/git, my private ones can be found at /home/daniel/private. I also have a symlink /home/daniel/public that points to /srv/git, which allows me to configure my remote URLs as e.g. git.rdnlsmith.com:public/fitbit-cpu-clock.git or git.rdnlsmith.com:private/blog.git. I find this very satisfying.
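If you’d like to replicate that layout, it amounts to a couple of directories and one symlink. Here’s a sketch that reproduces it inside a throwaway directory rather than the real filesystem root (all paths below are illustrative):

```shell
# Recreate the public/private repository layout in a temporary directory.
root=$(mktemp -d)
mkdir -p "$root/srv/git" "$root/home/daniel/private"

# "public" in the home directory is just a symlink into /srv/git:
ln -s "$root/srv/git" "$root/home/daniel/public"
readlink "$root/home/daniel/public"   # prints the /srv/git path
```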

Meanwhile, the web-server-accessible files for my cgit installation—and, now, my blog—are under /var/www:

daniel@git.rdnlsmith.com:~$ ls -lh /var/www
total 12K
drwxr-xr-x 6 daniel daniel 4.0K Oct  6 20:46 blog
drwxr-xr-x 2 root   root   4.0K Mar 29  2024 cgit
drwxr-xr-x 2 root   root   4.0K Mar 25  2024 html

(The html directory contains the default “Welcome to nginx!” page.)

The contents of /var/www are typically owned by root, so regular users can’t accidentally mess them up. When I push commits, though, my SSH key authenticates me as the user daniel, so any actions resulting from e.g. a Git hook will be performed as daniel. Consequently, daniel must at least have write access to blog; and with it being my personal website and all, I decided it would be simplest just to make daniel the owner.

As is usually the case on the server side, /home/daniel/private/blog.git is a bare repository: essentially what would be the contents of the .git subdirectory in a normal repository, with no working tree whatsoever.

At first, I tried to simply add a working tree, corresponding to the main branch, under /var/www/blog:

cd /var/www
sudo mkdir blog
sudo chown daniel:daniel blog
cd /home/daniel/private/blog.git
git worktree add /var/www/blog main

This did work, more or less, along with a post-receive Git hook that would git reset --hard to update the files to match whatever commits were just pushed. However, I found that this process left the index in a weird state; and while I suppose that doesn’t really matter as long as the files come out right, I didn’t want to leave it at that.

I removed the worktree, and instead created a self-contained local clone:

git worktree remove /var/www/blog   # this deletes the directory, too
sudo mkdir /var/www/blog
sudo chown daniel:daniel /var/www/blog
cd /var/www/blog
git clone /home/daniel/private/blog.git ./

I had tried to avoid this with the worktree route, because I didn’t want to have a second copy of the .git objects/metadata/etc. on the same machine. It turns out, though, that local clones create hard links to the parent repository for most of those files instead of copying them, so it doesn’t particularly waste any more space this way.
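You can check the hard-linking behavior yourself. The sketch below (throwaway repositories, illustrative names) makes a local clone and then looks for object files whose link count is greater than one, meaning the parent and the clone share the same bytes on disk:

```shell
# Create a tiny repository, clone it locally, and list shared object files.
tmp=$(mktemp -d)
git init -q "$tmp/parent"
(
  cd "$tmp/parent"
  git config user.email demo@example.com
  git config user.name demo
  echo "hello" > file.txt
  git add file.txt
  git commit -qm "initial commit"
)
git clone -q "$tmp/parent" "$tmp/child"

# On a typical filesystem, these object files are hard links, not copies:
find "$tmp/child/.git/objects" -type f -links +1
```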

The post-receive hook is a file named post-receive in /home/daniel/private/blog.git/hooks. The final version ended up looking like this:

#!/bin/sh

TARGET=/var/www/blog

while read -r old new ref
do
    if [ "$ref" = "refs/heads/main" ]; then
        echo "deploying $(git show --no-patch --format=reference "$new")..."
        git --git-dir="$TARGET/.git" --work-tree="$TARGET" fetch \
        && git --git-dir="$TARGET/.git" --work-tree="$TARGET" reset --hard origin/main
    fi
done

(Don’t forget to make this executable: chmod +x post-receive.)

When Git invokes this script, it pipes in one line to standard input for each ref (branch, tag, etc.) that is getting updated. Each line lists the commit hash that the ref used to point to; the commit hash that it points to now; and the ref name. The while loop above reads each line into three variables: old, new, and ref.

If one of the refs being updated is the main branch, then the script tells the /var/www/blog copy of the repository to fetch the new commits and reset itself to match the updated main ref. It wouldn’t actually hurt anything to do this unconditionally, because /var/www/blog always reflects the main branch no matter what I push; but it would be extra work to make it go through the fetch and reset for no reason when I push some other branch.
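To see the branch filtering in action without pushing anything, you can pipe fabricated input (the hashes here are made up) through an equivalent loop:

```shell
# Simulate the "<old> <new> <ref>" lines Git pipes into post-receive.
printf '%s\n' \
    'aaa1111 bbb2222 refs/heads/drafts' \
    'ccc3333 ddd4444 refs/heads/main' |
while read -r old new ref
do
    if [ "$ref" = "refs/heads/main" ]; then
        echo "would deploy $new"
    fi
done
# → would deploy ddd4444
```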

The --git-dir option is needed because Git sets the GIT_DIR environment variable while running the hook, pointing at the /home/daniel/private/blog.git copy of the repository that triggered it; without the override, the fetch and reset would operate on that copy instead of the /var/www/blog copy where the work needs to happen. The --work-tree option then tells Git where that copy’s checked-out files live, since a repository addressed via --git-dir doesn’t get an implied working tree.

The echo line sends a message back down to my local terminal to let me know which commit is being deployed:

remote: deploying 3b32c43 (Test Mastodon author attribution, 2024-09-21)...

Finally, I added a fairly simple Nginx configuration file named /etc/nginx/sites-available/blog, symlinked it under /etc/nginx/sites-enabled/, and re-loaded Nginx. The first few lines indicate that this file applies to requests with either of the domains rdnlsmith.com or www.rdnlsmith.com, and that the files to serve are located in /var/www/blog/src (as the repository contains more than just the web pages, and as my site is currently handwritten source files with no build step). The rest of the configuration was put there by Certbot to enable HTTPS; this is pretty similar to what it did for cgit, so you can read that post if you want more details.

server {
    server_name rdnlsmith.com www.rdnlsmith.com;
    root /var/www/blog/src;

    listen [::]:443 ssl; # managed by Certbot
    listen 443 ssl; # managed by Certbot
    ssl_certificate /etc/letsencrypt/live/git.rdnlsmith.com/fullchain.pem; # managed by Certbot
    ssl_certificate_key /etc/letsencrypt/live/git.rdnlsmith.com/privkey.pem; # managed by Certbot
    include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
}

server {
    if ($host = www.rdnlsmith.com) {
        return 301 https://$host$request_uri;
    } # managed by Certbot


    if ($host = rdnlsmith.com) {
        return 301 https://$host$request_uri;
    } # managed by Certbot

    listen 80;
    listen [::]:80;

    server_name rdnlsmith.com www.rdnlsmith.com;
    return 404; # managed by Certbot
}

Some words of warning about DNS

Aside from hosting my blog, Netlify was also my DNS provider; this allowed them to manage the TLS certificates to enable HTTPS for rdnlsmith.com and www.rdnlsmith.com. As I set up my Git server and my Fastmail address earlier this year, Netlify became responsible for those DNS records too. Before I could close out my Netlify account, I needed to move those records to a different provider.

My domain registrar, Namecheap, offers free DNS, so I went with that. Unfortunately, I made the mistake of assuming that I could just copy each entry verbatim from Netlify to Namecheap. This proved not to be the case: while Netlify accepted e.g. git.rdnlsmith.com to specify a subdomain, Namecheap expected just git for the same record, and @ for the record that specified the rdnlsmith.com base domain itself.
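To illustrate, the same two A records would be entered like this in each provider (the IP address below is a documentation placeholder, not my real one):

```
# Netlify: full domain in the name field
A    rdnlsmith.com        203.0.113.10
A    git.rdnlsmith.com    203.0.113.10

# Namecheap: host relative to the zone ("@" = the bare domain)
A    @      203.0.113.10
A    git    203.0.113.10
```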

When I deleted my Netlify DNS records, my website immediately became inaccessible. I had read somewhere that after transferring DNS, it could take up to 24 hours for downstream DNS resolvers to become aware of the new source; so at first, I chided myself for being too zealous in deleting the old records, and settled in to wait. It wasn’t until the next day, with my website still down, that I finally decided to actually verify that I had set up the new DNS records correctly—which, of course, I hadn’t.

The moral of this story is: when you transfer DNS—or any service, really—from one provider to another, read their documentation first.