I Have a Git Server Now
Yes, you read that right: in the year 2024 C.E.—sixteen years after the advent of GitHub, whose reign spans not just my entire professional career but even my (formal) computer science education—I’ve decided to take a step backwards in time and host my software projects independently.
“But why?” you ask, “Why on Earth would anyone bother to do such a thing, in this day and age?”
A few years ago now, I encountered SourceHut, and I was immediately fascinated. Where GitHub aims to replace “old-school” practices and tools with its own ideas—which, admittedly, are friendlier, if not necessarily better—SourceHut embraces them: patch submissions, code review, and general discussions are all based around email. While this may sound archaic, it does have its advantages.
Sporadically, I’ve contemplated migrating or mirroring some of my projects from GitHub to SourceHut; the main reason I haven’t done so is because it would cost money.1 I consider SourceHut’s financial reliance on regular users rather than enterprises or advertising to be a point in their favor, but I am always reticent to commit myself to yet another subscription. Even so, SourceHut’s approach has continued to intrigue me; and, gradually, GitHub has begun to push me away. Compared to its early simplicity, the GitHub of today has begun to feel a bit bloated; but more than that, I find their recently-declared change of focus especially off-putting.
As it seemed more and more likely that, sooner or later, I was going to end up subscribing to SourceHut, a thought occurred: if I’m willing to spend five-ish dollars per month on Git hosting anyways, why not spend that on a VM and further my new goal to have more ownership of my own web presence?
The following narrative endeavors to be complete and accurate, and to always make clear not just what I did but also why. That said, I did take a few slight liberties in order to present a more coherent guide, be it for my future self or some other interested party. Also, this all went down five or six months ago—because I lost momentum in the middle of writing this post and then my brain refused to engage with it again until now—so there may be a few slight mis-rememberings here and there on that basis.
Deciding on software
SourceHut’s entire platform is free software—another point in their favor—so at first, I intended to try running their Git module. While this would be possible, upon investigation I didn’t think it would be especially easy for a first-timer like myself.
Forgejo is another interesting option, especially with their ongoing work towards federation. However, they explicitly imitate GitHub, and I wanted something simpler.
The canonical Linux and Git repositories are both hosted at <https://git.kernel.org>, which uses cgit. Not exactly cutting-edge, but clearly dependable; and better still, it’s very lightweight and very easy to set up. Plus, it shares the same utilitarian design sensibilities as SourceHut.2
Obtaining a server
I’ve never hosted any kind of public-facing web service before, so I wasn’t really sure what to look for in a provider. Ultimately, I settled on DigitalOcean, partly because the price seemed right—$4/month for the most barebones shared-CPU VM,3 or $6/month for one with a little more breathing room—and partly because I had recently read about how Molly White hosts her newsletter there.
I was fairly sure either the $4 or the $6 option would be sufficient; but, again, I've never done this before, so I decided to click through DigitalOcean’s “Getting Started” wizard. I selected “Host a website or static site,” then “Deploy an Ubuntu server,” and it recommended the following configuration:
- 1 GiB RAM
- 1 CPU
- 25 GiB SSD
- 1000 GiB transfer4
…all for $6 per month. I didn’t proceed from there, though, because it selected Ubuntu 23.10 and I didn’t see a way to change this. Although my laptop ran 23.10 at the time—24.04 now—I wanted to stick to LTS releases for my server.
I backed out to the welcome page, selected “Spin up a Droplet,”5 and selected mostly the same options as the wizard:
- Choose Region: New York
- Choose an Image: Ubuntu 22.04 (LTS) x64
- Choose Size: Shared CPU/Basic
  - CPU Options: Regular/SSD
  - $6 (same configuration as above)
- Choose Authentication Method: SSH Key
- Finalize Details: set `git.rdnlsmith.com` as the hostname
Note, of course, that setting the hostname here only determines what the VM calls itself; the name remains meaningless to the outside world pending the creation of a DNS record.
After a minute or so, it declared the VM was ready and gave me an IP address.
rdnlsmith@zephyr ~ $ ssh root@138.197.81.0
Welcome to Ubuntu 22.04.2 LTS (GNU/Linux 5.15.0-67-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
System information as of Fri Mar 22 17:38:37 UTC 2024

  System load:  0.7216796875     Users logged in:        0
  Usage of /:   6.8% of 24.05GB  IPv4 address for eth0:  138.197.81.0
  Memory usage: 25%              IPv4 address for eth0:  [REDACTED]
  Swap usage:   0%               IPv4 address for eth1:  [REDACTED]
  Processes:    100
Expanded Security Maintenance for Applications is not enabled.
17 updates can be applied immediately.
13 of these updates are standard security updates.
To see these additional updates run: apt list --upgradable
Enable ESM Apps to receive additional future security updates.
See https://ubuntu.com/esm or run: sudo pro status
The list of available updates is more than a week old.
To check for new updates run: sudo apt update
The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.
root@git:~#
Securing the server
The very first thing I did was install updates:
root@git:~# apt update && apt upgrade
Next, I set about creating a user for myself, and giving that user `sudo` rights. It’s generally considered a bad idea to just be `root` all the time—or, indeed, to allow `root` to log in at all. Partly, this helps protect you from accidentally destroying something important; and partly, it takes away an obvious point of attack for unscrupulous ne’er-do-wells.
I go by `rdnlsmith` on my current laptop and most websites, but I used to just use `daniel` for local accounts. Now that my email address is <daniel@rdnlsmith.com>, I decided to go back to that.
adduser daniel
usermod -aG sudo daniel
Because I uploaded my public SSH key when I created the VM, the `root` account already had the necessary configuration to allow access in its `~/.ssh` directory. Following a DigitalOcean community tutorial, I copied that to my new user:
rsync --archive --chown=daniel:daniel ~/.ssh /home/daniel
The `--archive` flag preserves file permissions (and other attributes), which are important—SSH will refuse to authenticate a key if anyone besides the target user has write access to the `authorized_keys` file. `--chown=daniel:daniel` changes both the owner and group of the copied files from `root` to `daniel`.
Now, I can switch users:
root@git:~# exit
logout
Connection to 138.197.81.0 closed.
rdnlsmith@zephyr ~ $ ssh daniel@138.197.81.0
…and disable `root` login:
daniel@git:~$ sudo vim /etc/ssh/sshd_config
This file contained a commented-out entry `#PermitRootLogin no`, which I un-commented by removing the `#`. Because I provided an SSH key rather than a password when I created the VM, this file also contained the entry `PasswordAuthentication no`, which I would have added if it weren’t there already: you don’t have to worry as much about leaked or insecure passwords if your server doesn’t accept passwords in the first place. Instead, I’ll authenticate exclusively by SSH key. (It’s maybe worth double-checking that your file doesn’t have an entry that says `PubkeyAuthentication no`, or else you might end up locked out of your server. Note, too, that changes here only take effect once the SSH daemon is restarted—on Ubuntu, `sudo systemctl restart ssh`.)
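For reference, the net effect of my edits on `/etc/ssh/sshd_config` is just these two (uncommented) lines:

```
PermitRootLogin no
PasswordAuthentication no
```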
The guide linked above also recommends configuring UFW (“Uncomplicated Firewall”) to block any services you aren’t using. UFW was installed by default; I just had to configure it to allow SSH, and then enable it (in that order; or else, again, you might end up locked out).
daniel@git:~$ sudo ufw app list
[sudo] password for daniel:
Available applications:
OpenSSH
daniel@git:~$ sudo ufw allow OpenSSH
Rules updated
Rules updated (v6)
daniel@git:~$ sudo ufw enable
Command may disrupt existing ssh connections. Proceed with operation (y|n)? y
Firewall is active and enabled on system startup
daniel@git:~$ sudo ufw status
Status: active
To                         Action      From
--                         ------      ----
OpenSSH                    ALLOW       Anywhere
OpenSSH (v6)               ALLOW       Anywhere (v6)
Finally, after reading this blog post by Bryan Brattlof and another DigitalOcean community tutorial, I also set up Fail2Ban.
sudo apt install fail2ban
Fail2Ban watches the authentication logs for various services. If it notices repeated authentication failures originating from the same IP address within a short span of time—by default, five attempts within ten minutes—it automatically configures your firewall to temporarily ban that address (by default, for ten minutes). This helps mitigate the (likely mild) performance impact of any automated attempts to compromise your server, and (if you haven’t disabled password authentication) substantially hinders attempts at password-guessing.
Fail2Ban’s configuration file is `/etc/fail2ban/jail.conf`, but this file can be overwritten by package upgrades, so you shouldn’t edit it directly. Instead, I created a new file `jail.local` in the same directory. The local file only needs to contain the settings that are different from what’s in `jail.conf`.
I figure I don’t adequately understand the ramifications of fiddling with Fail2Ban’s settings, so I decided to only change what was absolutely necessary. The default configuration creates its ban rules via `iptables` directly (which UFW sits on top of); but since I’m using UFW to manage my firewall, I configured Fail2Ban to do the same. These two lines:
banaction = ufw
banaction_allports = ufw
…tell Fail2Ban to read the file `/etc/fail2ban/action.d/ufw.conf` to understand how it should create and remove firewall rules, instead of `/etc/fail2ban/action.d/iptables.conf`.
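In full, then, my `jail.local` amounts to little more than this (placing the overrides in the `[DEFAULT]` section, where `jail.conf` defines these options, is my reading of the documentation):

```
# /etc/fail2ban/jail.local — only the overrides; everything else
# falls back to the defaults in jail.conf.
[DEFAULT]
banaction = ufw
banaction_allports = ufw
```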
With that done, I enabled the service:
sudo systemctl enable fail2ban
sudo systemctl start fail2ban
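To confirm that it’s watching SSH, you can query the `sshd` jail with `fail2ban-client`; the output looks something like this (the counts here are illustrative, not mine):

```
daniel@git:~$ sudo fail2ban-client status sshd
Status for the jail: sshd
|- Filter
|  |- Currently failed: 0
|  |- Total failed:     0
|  `- File list:        /var/log/auth.log
`- Actions
   |- Currently banned: 0
   |- Total banned:     0
   `- Banned IP list:
```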
Configuring DNS
The DNS records for my domain are managed by Netlify, because when I first set up this blog (on Netlify), allowing them to manage DNS meant they would also manage the TLS certificate. Whatever DNS provider you use, though, you’ll almost certainly need to fill out the same four fields:
- Type: A
- Name: git.rdnlsmith.com
- Value: 138.197.81.0
- TTL: 3600
“A” records represent a mapping between a domain—in this case, a new “git” subdomain under rdnlsmith.com—and an IPv4 address. The time-to-live (TTL) value indicates the maximum length of time, in seconds, that any DNS resolver should cache the record; here, I’m telling them to check for updated values before responding to a query at least once per hour.
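Once the record exists, you can also query it directly—and see the remaining TTL your resolver is reporting—with `dig`; the output is shaped roughly like this:

```
rdnlsmith@zephyr ~ $ dig +noall +answer git.rdnlsmith.com
git.rdnlsmith.com.    3600    IN    A    138.197.81.0
```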
To prove that it worked, immediately before configuring the record,6 I ran the following `ping` tests:
rdnlsmith@zephyr ~ $ ping 138.197.81.0
PING 138.197.81.0 (138.197.81.0) 56(84) bytes of data.
64 bytes from 138.197.81.0: icmp_seq=1 ttl=45 time=36.1 ms
64 bytes from 138.197.81.0: icmp_seq=2 ttl=45 time=34.6 ms
64 bytes from 138.197.81.0: icmp_seq=3 ttl=45 time=31.9 ms
^C
--- 138.197.81.0 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 31.911/34.194/36.096/1.729 ms
rdnlsmith@zephyr ~ $ ping git.rdnlsmith.com
ping: git.rdnlsmith.com: Name or service not known
…and then after:
rdnlsmith@zephyr ~ $ ping git.rdnlsmith.com
PING git.rdnlsmith.com (138.197.81.0) 56(84) bytes of data.
64 bytes from git.rdnlsmith.com (138.197.81.0): icmp_seq=1 ttl=45 time=35.5 ms
64 bytes from git.rdnlsmith.com (138.197.81.0): icmp_seq=2 ttl=45 time=36.0 ms
64 bytes from git.rdnlsmith.com (138.197.81.0): icmp_seq=3 ttl=45 time=32.0 ms
^C
--- git.rdnlsmith.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2004ms
rtt min/avg/max/mdev = 31.982/34.521/36.041/1.807 ms
Nice.
One more thing: you may have noticed in earlier snippets that, when I’m connected to the server, the hostname part of my prompt reads `@git`, not `@git.rdnlsmith.com`. There are two factors at play here.
The first is the default prompt configuration, which can be found in `~/.bashrc`:
if [ "$color_prompt" = yes ]; then
PS1='${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$ '
else
PS1='${debian_chroot:+($debian_chroot)}\u@\h:\w\$ '
fi
unset color_prompt force_color_prompt
# If this is an xterm set the title to user@host:dir
case "$TERM" in
xterm*|rxvt*)
PS1="\[\e]0;${debian_chroot:+($debian_chroot)}\u@\h: \w\a\]$PS1"
;;
*)
;;
esac
The placeholder `\h`, which appears in each of the three `PS1=` lines above, represents the hostname. More precisely, from the man page for Bash itself (`man bash`):
PROMPTING
    When executing interactively, bash displays the primary prompt PS1 when it is ready to read a command… Bash allows these prompt strings to be customized by inserting a number of backslash-escaped special characters that are decoded as follows:

        \h    the hostname up to the first '.'
        \H    the hostname
This makes sense if you have a large number of machines on the same domain, each with its own subdomain: everything after the first `.` will be the same, and you only need to see the first part to know which machine you’re connected to. I only have the one VM, and I wanted it to display the full hostname, so I changed each `\h` to `\H`.
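After the edit, the color-prompt line, for example, reads:

```
PS1='${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\H\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$ '
```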
After re-loading the configuration with `source ~/.bashrc`, however, I still saw `daniel@git:~$`. The second factor was the file `/etc/hostname`:
daniel@git:~$ cat /etc/hostname
git
I changed this to read `git.rdnlsmith.com`. Then, I also checked `/etc/hosts`:
daniel@git:~$ cat /etc/hosts
# Your system has configured 'manage_etc_hosts' as True.
# As a result, if you wish for changes to this file to persist
# then you will need to either
# a.) make changes to the master file in /etc/cloud/templates/hosts.debian.tmpl
# b.) change or remove the value of 'manage_etc_hosts' in
# /etc/cloud/cloud.cfg or cloud-config from user-data
#
127.0.1.1 git.rdnlsmith.com git
127.0.0.1 localhost
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
…but no change was needed here, because the first loopback address (`127.0.1.1`) was already correlated with both `git` and `git.rdnlsmith.com`. Per the comments at the top of that file, I also went into `/etc/cloud/cloud.cfg` and changed `preserve_hostname: false` to `preserve_hostname: true`, to ensure that DigitalOcean won’t overwrite my change to `/etc/hostname`.
After `sudo reboot`, my prompt read `daniel@git.rdnlsmith.com:~$`.
Serving webpages
In order to actually see anything upon visiting <http://git.rdnlsmith.com>, I needed to install three more packages:
sudo apt install nginx fcgiwrap cgit
Nginx is a popular, lightweight web server, designed to handle more traffic under tighter resources than the older-but-more-featureful Apache. cgit is, of course, the application that I actually want to run. So, what’s `fcgiwrap`?
An imprecise history of dynamic webpages, from someone who was not there for most of it
The simplest kind of website is just a collection of HTML files in some directory on a server. When you request a particular URL through your browser, a program running on that server—Apache and Nginx being examples—maps the URL to a file path and sends back the file. This works well if you have a website that people will only read, such as this blog.
If you want a website that people will interact with—submit comments, for instance—then you’re going to need some kind of database to store the information those people submit, and you’re going to need some way to pull content back out of that database and inject it into a webpage. Nowadays, this is usually done client-side with JavaScript: it runs in your browser, fetches content in the background, and rewrites the webpage on the fly to incorporate the content.
But the web has been around longer than JavaScript. In the olden days, you would write another server-side program to serve as a gateway between the web server software and your database. Instead of locating a file, the server software would pass on the request to your gateway; the gateway would then find the appropriate information in the database, generate a webpage containing that information, and pass it back.
Perhaps unsurprisingly, people started writing a whole lot of gateway programs to do specific things with specific databases. Each program might depend on implementation details of a particular web server in order to function: if two people wanted to do something similar, but they used different server software, they might have to write two separate gateways.
Eventually, the web community standardized the interface between web servers and gateways, so that any compliant gateway would be compatible with any compliant web server. This was named the Common Gateway Interface, or CGI. The name “cgit” is a portmanteau of “CGI” and “Git”—it’s a gateway program that uses Git as its database.
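To make that interface concrete, here’s a toy CGI program—my own sketch, not anything from cgit. The web server passes request details to the program through environment variables such as `QUERY_STRING`, and reads the response (headers, a blank line, then the body) from the program’s standard output:

```
#!/bin/sh
# A minimal "gateway" program: read the query string from the
# environment, write an HTTP response to stdout.
echo "Content-Type: text/html"
echo ""
echo "<p>You asked for: ${QUERY_STRING}</p>"
```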
FastCGI came along sometime later to address scaling issues with regular-CGI. With regular-CGI, every incoming web request spawns a new instance of the gateway program, which serves that one request and then terminates. Under high traffic, this approach can lead to latency (as each request waits for a process to spawn) and resource exhaustion (from running so many independent processes at once). With FastCGI, a smaller number of longer-running processes each handle multiple requests, which can be much more efficient.
Okay, back to fcgiwrap
As mentioned above, cgit is a CGI program. For my use case, the performance implications of CGI vs. FastCGI aren’t likely to be an issue. What is an issue is the fact that Nginx doesn’t support CGI—but it does support FastCGI. As you may have guessed by now, `fcgiwrap` is a wrapper for CGI programs: it spawns a persistent process that interacts with a FastCGI-compatible web server on one side and a regular-CGI program on the other.
The `fcgiwrap` service started automatically upon installation (check `sudo systemctl status fcgiwrap`), so I didn’t need to do anything else with this.
Configuring Nginx
Nginx is capable of serving multiple distinct websites from one machine. Each site gets its own `server { }` configuration block; typically, you would put each such block in its own file under `/etc/nginx/sites-available` and symlink each file to `/etc/nginx/sites-enabled`. This allows you to easily take individual websites down and put them back up again by removing and re-creating the symlink.
sudo touch /etc/nginx/sites-available/cgit
sudo ln -s /etc/nginx/sites-available/cgit /etc/nginx/sites-enabled/
Initially, my `cgit` configuration file looked like this:
server {
listen 80;
listen [::]:80;
These first two lines tell Nginx that any requests intended for this website should be expected on port 80—the standard port for HTTP traffic—for IPv4 and IPv6, respectively. I haven’t actually enabled IPv6 for my VM as of this writing; but if I ever do, I won’t need to change this file.
server_name git.rdnlsmith.com;
This line means that Nginx will only consider this file if the hostname portion of the request URL matches `git.rdnlsmith.com`. A request with any other hostname, even one that comes through port 80, will be handled by some other website configured in some other file.
root /var/www/cgit;
# First attempt to serve request as file (logo, css), then fall back to
# calling cgit (all pages).
try_files $uri @cgit;
The first line means that any literal files to be served will be found somewhere under `/var/www/cgit`. By default, the remainder of the path to each file should match the path portion of the request URL.
In my case, the only literal files I have are the cgit logo (`cgit.png`), favicon (`favicon.ico`), and CSS (`cgit.css`), each of which I copied from `/usr/share/cgit`. I intend to customize them eventually.
The second line (not counting the comments) means Nginx will check for a literal file matching the request URL first, and any requests that don’t map to a literal file will be handled by the `location { }` block labeled `@cgit`. Locations are normally identified with a regular expression that matches some part of the URL; the `@` syntax lets you identify a location block by name instead.
Finally:
location @cgit {
include fastcgi_params;
fastcgi_param SCRIPT_FILENAME /usr/lib/cgit/cgit.cgi;
fastcgi_param PATH_INFO $uri;
fastcgi_param QUERY_STRING $args;
fastcgi_param HTTP_HOST $server_name;
fastcgi_pass unix:/run/fcgiwrap.socket;
}
}
The first line reads in the file `/etc/nginx/fastcgi_params`, which maps several Nginx variables to the corresponding FastCGI parameters. The next four override some select parameters, most notably setting the location where the cgit executable is installed. The last line tells Nginx where to find the running `fcgiwrap` process. I’ll admit that I didn’t expend much effort trying to understand these; as much as I usually like to make my own informed decisions rather than blindly copying from others, there’s a lot that’s new to me going into this project and this bit seems pretty innocuous.
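For reference, here’s the whole file at this stage, assembled from the fragments above:

```
server {
    listen 80;
    listen [::]:80;

    server_name git.rdnlsmith.com;

    root /var/www/cgit;

    # First attempt to serve request as file (logo, css), then fall back to
    # calling cgit (all pages).
    try_files $uri @cgit;

    location @cgit {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME /usr/lib/cgit/cgit.cgi;
        fastcgi_param PATH_INFO $uri;
        fastcgi_param QUERY_STRING $args;
        fastcgi_param HTTP_HOST $server_name;
        fastcgi_pass unix:/run/fcgiwrap.socket;
    }
}
```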
With the file created and symlinked, I tested that it was valid:
sudo nginx -t
…and re-loaded Nginx so it would take effect:
sudo nginx -s reload
Next, I needed to allow HTTP traffic through the firewall. Installing Nginx added three new entries to UFW’s app list:
daniel@git.rdnlsmith.com:~$ sudo ufw app list
Available applications:
Nginx Full
Nginx HTTP
Nginx HTTPS
OpenSSH
“Nginx HTTP” and “Nginx HTTPS” are pretty self-explanatory; “Nginx Full” combines both. For starters, I just enabled HTTP:
sudo ufw allow 'Nginx HTTP'
At this point, it was possible to visit <http://git.rdnlsmith.com> in a web browser and see cgit’s homepage, albeit with no actual content.
Adding Repositories
In its chapter on hosting, the book Pro Git depicts an example Git server where the repositories are kept in `/srv/git`. The Filesystem Hierarchy Standard seems to agree with this:
> **Purpose**
>
> `/srv` contains site-specific data which is served by this system.
>
> **Rationale**
>
> Th[e] main purpose of specifying this is so that users may find the location of the data files for a particular service … Data that is only of interest to a specific user should go in that user[’s] home directory. If the directory and file structure of the data is not exposed to consumers, it should go in `/var/lib`.
>
> The methodology used to name subdirectories of `/srv` is unspecified as there is currently no consensus on how this should be done. One method for structuring data under `/srv` is by protocol, eg. `ftp`, `rsync`, `www`, and `cvs`.
…so, I decided to put public repositories in `/srv/git` and private ones in `/home/daniel`.
I expect to be the only person ever to have write access to any repositories on this server, even the public ones, so I could have given ownership of `/srv/git` to `daniel`. Nonetheless, I wanted to do this “right.” I created a `git` group, and made myself a member:
sudo addgroup git
sudo usermod -aG git daniel
I had to `exit` and reconnect in order for my session to pick up the new group membership.
Then, I created the `/srv/git` directory, made `git` the owning group (leaving `root` as the owning user), and toggled the setgid bit so that any contents created therein would inherit the group ownership:
sudo mkdir /srv/git
sudo chgrp git /srv/git
sudo chmod g+s /srv/git
After that, I created an empty repository for each of my projects; for example:
git init --bare --shared iphoto-extractor.git
The `--bare` flag creates only the `.git` folder with no working directory, as is typical for server-side repositories. The `--shared` flag propagates the group ownership from the repository’s parent directory, though I’m not sure this is actually necessary since I already set the setgid bit.
Within each repository, in the `hooks` subdirectory, I saved a copy of the `post-receive` hook example from the cgit repository (with the `.agefile` extension removed). This enables cgit to inspect commit metadata whenever changes are pushed in order to calculate accurate age values—which are shown on the “summary” page for each repository, and a few other places—instead of trying to estimate them based on file modification timestamps.
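I won’t reproduce the hook exactly—see the cgit repository for the canonical version—but it amounts to a short shell script along these lines:

```
#!/bin/sh
# Record the date of the newest commit where cgit's "agefile"
# machinery looks for it by default.
agefile="$(git rev-parse --git-dir)"/info/web/last-modified

mkdir -p "$(dirname "$agefile")" &&
git for-each-ref \
    --sort=-authordate --count=1 \
    --format='%(authordate:iso8601)' \
    >"$agefile"
```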
I also created a symlink called `public` in my home directory that points to `/srv/git`, plus a directory named `private`. This lets me use SSH URLs with the form `git.rdnlsmith.com:public/repo-name` or `git.rdnlsmith.com:private/repo-name` instead of `git.rdnlsmith.com:/srv/git/repo-name`.
ln -s /srv/git ~/public
mkdir ~/private
On my local machine, I renamed the existing remote for each repository and added a new default remote pointing to my server.
cd ~/code/iPhotoExtractor
git remote rename origin github
git remote add origin git.rdnlsmith.com:public/iphoto-extractor.git
git push --all origin
Configuring cgit
cgit’s configuration file is `/etc/cgitrc`. The available options are described by `man cgitrc`. Unfortunately, it doesn’t give much indication as to which options you’re likely to need; but, so far, I haven’t needed much:
cache-size=1000
Any positive value here enables caching, so cgit won’t have to re-generate a recently-served page if someone else visits it (or the same person visits it again). To save disk space, cgit will start deleting the oldest cached pages if the number of entries reaches the configured number (1000).
You can also configure how long different types of pages should be served from a cached copy before the cache is considered stale and the page is re-generated anyways. I’ve kept the defaults: most pages can be cached for about five minutes; repository “about” pages for fifteen; commits indefinitely, since they’re immutable.
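For reference, the options in question and their documented defaults look something like this, as I read `man cgitrc` (I left them all unset):

```
#cache-root=/var/cache/cgit
#cache-dynamic-ttl=5     # most pages: five minutes
#cache-about-ttl=15      # "about" pages: fifteen minutes
#cache-static-ttl=-1     # pages tied to a specific commit: forever
```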
readme=:README.md
mimetype-file=/etc/mime.types
about-filter=/usr/lib/cgit/filters/about-formatting.sh
email-filter=/usr/lib/cgit/filters/email-gravatar.py
The first line above says to look for a root-level file named `README.md` in each repository and use its contents for the repository’s “about” tab. You can list this option more than once with different file names if you use different conventions from one repository to another; or you can configure it separately for each repository.
The second line tells cgit to use the file `/etc/mime.types`—commonly included in Linux distributions—to look up which MIME types to use for which file extensions. This is necessary in order for e.g. embedded pictures in the “about” pages to actually render as pictures.
The last two lines specify scripts to be run when generating “about” pages or when displaying contributor names, respectively. Both of these are included with cgit, but you can use your own custom scripts too.
`about-formatting.sh` checks if the “about” file is one of several common formats—Markdown, reStructuredText, a man page, a plain-text file—and runs it through an appropriate converter program so it will render nicely as HTML. I had to install the `python3-markdown` package for Markdown to work.
`email-gravatar.py` fetches the Gravatar image for each contributor’s email address and displays it beside their name wherever it appears. There’s also a Lua version, which is supposed to be faster, but the Python one worked fine for me and I didn't want to bother figuring out how to get the Lua script working.
You can use filters to enable syntax highlighting, as well. I’ve left this off for now, in keeping with my blog’s aesthetic.
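(For the record, the relevant knob is cgit’s `source-filter` option; the package ships highlighting filters alongside the ones above, so I’d expect enabling it to look something like this—the exact path is an assumption by analogy:)

```
#source-filter=/usr/lib/cgit/filters/syntax-highlighting.py
```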
enable-git-config=1
This lets me store repository-specific settings in each repository’s Git configuration file (`./config` in a bare repository, or `./.git/config` in one that has a working tree) instead of having a separate cgit configuration file in each. The only thing I’m using this for right now is to allow some of my projects to have their displayed names written differently than their URLs; for example, my iPhotoExtractor project has its name displayed in Pascal case (as is typical in the .NET ecosystem) but all of my repository URLs are in kebab case (`iphoto-extractor`).
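Concretely, that means a short stanza in the bare repository’s `config` file; the `cgit.name` key here is my reading of the `enable-git-config` documentation:

```
# /srv/git/iphoto-extractor.git/config
[cgit]
	name = iPhotoExtractor
```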
enable-http-clone=0
clone-url=https://git.rdnlsmith.com/$CGIT_REPO_URL.git
cgit supports cloning over HTTP via Git’s older, “dumb” HTTP protocol. This is on by default, but I chose to disable it in favor of the “smart” HTTP(S) protocol, which I’ll cover later on.
The second line sets the pattern for the clone URL(s) displayed at the bottom of each repository’s “summary” page. You can list multiple patterns separated by spaces if e.g. you support more than one protocol. Note, however, that listing these patterns is for display purposes only; it does not make those URLs actually work.
The variable `$CGIT_REPO_URL` contains the path to the repository relative to a configured root directory. In my case, the root directory is `/srv/git` and the path is just the repository name. This can be overridden with per-repository configuration.
remove-suffix=1
virtual-root=/
As is typical for bare repositories, my directories under `/srv/git` all have a `.git` suffix; e.g. `iphoto-extractor.git`. The `remove-suffix` option excludes that from the URL and the displayed name (if not overridden) for each repository. I set this because it looks pretty. Consequently, I needed to include `.git` at the end of my `clone-url` pattern above.
I’m honestly not totally sure why I need the `virtual-root` setting here. The man page implies I shouldn’t need it anymore if I’ve set my `PATH_INFO` CGI parameter correctly, which I think I have. Without it, though, relative links throughout the website started their paths too far up the hierarchy and, consequently, didn’t work.
scan-path=/srv/git
Finally, this tells cgit where to look for my repositories. This pretty much has to go last, because only the settings above this line will be applied to the repositories discovered here. It’s also possible to explicitly list out paths to individual repositories (with a different setting) if you don’t want cgit to scan for them.
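So, assembled in order, the whole of my `/etc/cgitrc`:

```
cache-size=1000

readme=:README.md
mimetype-file=/etc/mime.types
about-filter=/usr/lib/cgit/filters/about-formatting.sh
email-filter=/usr/lib/cgit/filters/email-gravatar.py

enable-git-config=1

enable-http-clone=0
clone-url=https://git.rdnlsmith.com/$CGIT_REPO_URL.git

remove-suffix=1
virtual-root=/

# scan-path stays last: only the settings above it apply to the
# repositories it discovers.
scan-path=/srv/git
```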
Enabling HTTPS and Read-Only Public Cloning
Enabling HTTPS requires two things. First, I needed to allow HTTPS traffic through the firewall:
sudo ufw allow 'Nginx HTTPS'
(You could skip this step by allowing “Nginx Full” in the first place, instead of starting with just HTTP.)
Secondly, I needed to obtain a TLS certificate from a widely-trusted certificate authority. The simplest way to do this is to get one from Let’s Encrypt using Certbot, which, once configured, will automatically obtain a certificate and renew it whenever necessary.
For Ubuntu, Certbot’s official instructions recommend installing it as a snap package:
sudo snap install core
sudo snap install --classic certbot
sudo ln -s /snap/bin/certbot /usr/bin/certbot
Then, run it and answer the prompts:
daniel@git.rdnlsmith.com:~$ sudo certbot --nginx -d git.rdnlsmith.com
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Enter email address (used for urgent renewal and security notices)
(Enter 'c' to cancel): daniel@rdnlsmith.com
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Please read the Terms of Service at
https://letsencrypt.org/documents/LE-SA-v1.3-September-21-2022.pdf. You must
agree in order to register with the ACME server. Do you agree?
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(Y)es/(N)o: y
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Would you be willing, once your first certificate is successfully issued, to
share your email address with the Electronic Frontier Foundation, a founding
partner of the Let's Encrypt project and the non-profit organization that
develops Certbot? We'd like to send you email about our work encrypting the web,
EFF news, campaigns, and ways to support digital freedom.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(Y)es/(N)o: n
Account registered.
Requesting a certificate for git.rdnlsmith.com
Successfully received certificate.
Certificate is saved at: /etc/letsencrypt/live/git.rdnlsmith.com/fullchain.pem
Key is saved at: /etc/letsencrypt/live/git.rdnlsmith.com/privkey.pem
This certificate expires on 2024-06-26.
These files will be updated when the certificate renews.
Certbot has set up a scheduled task to automatically renew this certificate in the background.
Deploying certificate
Successfully deployed certificate for git.rdnlsmith.com to /etc/nginx/sites-enabled/cgit
Congratulations! You have successfully enabled HTTPS on https://git.rdnlsmith.com
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
If you like Certbot, please consider supporting our work by:
* Donating to ISRG / Let's Encrypt: https://letsencrypt.org/donate
* Donating to EFF: https://eff.org/donate-le
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Certbot found the Nginx configuration file that contains the server block for `git.rdnlsmith.com` and made a few changes. It removed the lines
listen 80;
listen [::]:80;
that I had written originally, and inserted several lines at the end to listen for HTTPS traffic and use the certificate it acquired:
listen [::]:443 ssl ipv6only=on; # managed by Certbot
listen 443 ssl; # managed by Certbot
ssl_certificate /etc/letsencrypt/live/git.rdnlsmith.com/fullchain.pem; # managed by Certbot
ssl_certificate_key /etc/letsencrypt/live/git.rdnlsmith.com/privkey.pem; # managed by Certbot
include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
It also added another server block in the same file to catch HTTP traffic and redirect it to HTTPS:
server {
if ($host = git.rdnlsmith.com) {
return 301 https://$host$request_uri;
} # managed by Certbot
listen 80;
listen [::]:80;
server_name git.rdnlsmith.com;
return 404; # managed by Certbot
}
Git’s smart HTTP backend is another CGI executable, `git-http-backend`, which is included in Ubuntu’s `git` package. All I had to do to get it working was add another location block to my Nginx configuration, very similar to the one for cgit:
# Smart HTTP backend
location ~ \.git {
include fastcgi_params;
fastcgi_param SCRIPT_FILENAME /usr/lib/git-core/git-http-backend;
fastcgi_param GIT_HTTP_EXPORT_ALL "";
fastcgi_param GIT_PROJECT_ROOT /srv/git;
fastcgi_param PATH_INFO $uri;
fastcgi_pass unix:/run/fcgiwrap.socket;
}
The `~` after `location` indicates a regular expression; `\.git` will match any URL containing `.git` (not just ending with it—clone requests use paths like `/dotnet-pgn.git/info/refs`, so the pattern must not be anchored to the end). So, the URL <https://git.rdnlsmith.com/dotnet-pgn> will be handled by cgit, but <https://git.rdnlsmith.com/dotnet-pgn.git> will be handled by `git-http-backend`.
The `GIT_HTTP_EXPORT_ALL` line creates an empty environment variable of the same name, instructing `git-http-backend` to serve any and all repositories it finds in the `GIT_PROJECT_ROOT` directory. Without this, I would have to create a file named `git-daemon-export-ok` within each repository that I wanted to make available.
Examples that I’ve seen also included `client_max_body_size 0;`, disabling the default size limit for incoming requests. However, this is to facilitate `git push`, and I only intend to push via SSH, so I left this out.
Now, I can clone repositories via HTTPS:
rdnlsmith@zephyr ~ $ git clone https://git.rdnlsmith.com/dotnet-pgn.git
Cloning into 'dotnet-pgn'...
remote: Enumerating objects: 25, done.
remote: Counting objects: 100% (25/25), done.
remote: Compressing objects: 100% (25/25), done.
remote: Total 25 (delta 3), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (25/25), 9.42 KiB | 3.14 MiB/s, done.
Resolving deltas: 100% (3/3), done.
…or HTTP, via redirect:
rdnlsmith@zephyr ~ $ rm -rf dotnet-pgn
rdnlsmith@zephyr ~ $ git clone http://git.rdnlsmith.com/dotnet-pgn.git
Cloning into 'dotnet-pgn'...
warning: redirecting to https://git.rdnlsmith.com/dotnet-pgn.git/
remote: Enumerating objects: 25, done.
remote: Counting objects: 100% (25/25), done.
remote: Compressing objects: 100% (25/25), done.
remote: Total 25 (delta 3), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (25/25), 9.42 KiB | 4.71 MiB/s, done.
Resolving deltas: 100% (3/3), done.
…but any attempt to push via HTTP(S) is rejected, as intended:
rdnlsmith@zephyr ~ $ cd dotnet-pgn
rdnlsmith@zephyr ~/dotnet-pgn [master ≡]$ vim README.md
rdnlsmith@zephyr ~/dotnet-pgn [master ≡ +0 ~1 -0 !]$ git commit -am "push test"
[master a5f14e0] push test
1 file changed, 2 insertions(+)
rdnlsmith@zephyr ~/dotnet-pgn [master ↑1]$ git push
fatal: unable to access 'https://git.rdnlsmith.com/dotnet-pgn.git/': The requested URL returned error: 403
Footnotes
SourceHut is still in a public alpha stage, and until that changes, paying for Git hosting is optional. Nonetheless, it wouldn’t feel right to me to use it without paying. ↩︎
The git.sr.ht module’s first commit actually incorporates some CSS from cgit (later customized). ↩︎
I refuse to call them “droplets.” ↩︎
The data transfer allowance actually accrues based on how much time the VM is active; 1000 GiB (and $6) assumes it’s running for the entire month. Usage in excess of your allowance is billed at $0.01 per 1 GiB. ↩︎
For the record: quoting verbatim from DigitalOcean’s website does not count as me “call[ing] them ‘droplets.’” ↩︎
In retrospect, this was risky. DNS resolvers can cache a negative result—as in, cache the fact that a (sub)domain doesn’t exist. In that case, trying to `ping git.rdnlsmith.com` before I created the record could have forced me to wait quite some time—possibly hours—before the follow-up test would have worked. ↩︎