Trying out Nikola Update 1

It's been a couple of weeks now since I started "trying out Nikola", so I thought I'd do an update. As I've worked through getting the site set out the way I want, I've learnt quite a few things about Nikola, and about hosting it on my RPi 4. The mere fact that I've stuck with it so far probably means that it can do all the things I need it to. Here are a few observations on that process:

1. Speed

There are three factors here - speed of setting up, speed of getting posts up, and speed of serving the site.

Getting Nikola set up isn't really that hard, and doesn't take that long, but it's a bit more involved than just downloading and installing a package. There was a fair bit of Linux command-line (CLI) action here: installing required Linux packages with "apt-get"; then installing the required Python libraries with pip; then configuring Nikola, and a web server; and finally getting the web server accessible to the outside world. These parts, spread unevenly over the past 19 days, have been by far the most time-consuming. However, I feel that's par for the course. Even if I'd gone for a less "technical" web-publishing system, I'd still have had to do the operational bits if I wanted to host it on my own broadband - which I very much do.

Getting posts up is not as user friendly as using a dynamic web publishing platform like WordPress or Wix, but it is pretty simple and quick, because I'm pretty comfortable with the CLI. I'm sure I could use a GUI editor to create the raw posts if I wanted, but at the moment, I don't feel a need for that. My workflow currently looks mostly like this:

1. Run "nikola new_post -e posts/.md"
2. Type in the title, then after the "vim" editor launches...
3. Edit the metadata header (the one for this post is shown below)
4. Type the post body as Markdown formatted text (or one of many other text-based formats)
5. Run "nikola build"
6. Run "nikola deploy"
7. If revisions are required (they usually are), edit the post with vim, and go back to step 5.

The metadata block from step 3 looks like this:

<!--
 .. title: Trying out Nikola Update 1
 .. slug: trying-out-nikola-update1
 .. date: 2024-01-18 13:13:54 UTC
 .. tags: Linux,Project,RaspberryPi,WebTech,Updates
 .. category: Trying out Nikola
 .. link:
 .. description:
 .. type: text
-->

Because I'm a bit lazy, I decided that the "deploy" step would be superfluous if I just served the content directly from the "output" sub-directory of where my blog sits. Because of my setup, deploying actually would just copy it somewhere else on the same volume, and I can't see any reason that I should do that. This step is supposed to be used where Nikola resides on a different computer to the web server, and mine doesn't. I may come back to that later - perhaps by running Nikola on my laptop, and not on the RPi, but that's for another day.

The speed of serving the site was a big surprise. I've worked with lots of dynamic web publishing systems in the past, as well as quite a few static ones, so I should have seen this one coming really. Bearing in mind that it's a static site, served by "lighttpd" (Lighty to its friends) running on a lowly Raspberry Pi 4 that is currently connected over WiFi, not Ethernet, the site is shockingly fast. I would have been happy with "as fast as a free online blogging system", but the page render times I'm seeing would make most e-commerce site admins (which I used to be) envious. The bottleneck, it seems, is entirely my broadband, but even then, it's not making a big impact. It will probably slow down a bit when I work out how to remove dependencies on the Google Fonts API, but still.

2. Theme

Theming is relatively easy, and yet again, quite CLI-intensive. The docs for theming Nikola are a bit spread out, but there is information out there. Hacking an existing theme is fine if you can find one that you like as it is, or can mould into what you want. I chose the latter - the Nikola site, https://getnikola.com/ has a nice gallery here: https://themes.getnikola.com/ where you can view examples of free themes full of Lorem Ipsum text. I ended up choosing the "gruberwine" theme because I liked the aesthetics - taste isn't universal, and I can't explain it more than that! To do this, you just have to install the theme (on the CLI) and then enable it in your site's conf.py file. There's not much to configure at this stage.
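As a rough sketch of those two steps: "nikola theme -i" fetches a theme from the gallery, and the THEME setting in conf.py switches the site over to it. The helper below is purely illustrative (the name set_nikola_theme is made up, and editing conf.py by hand in vim works just as well); it assumes conf.py already contains a top-level THEME line, as the default generated one does.

```shell
#!/bin/sh
# Point the THEME setting in a Nikola conf.py at the given theme name.
# Hypothetical helper: assumes an existing line that starts with "THEME = ".
set_nikola_theme() {
    conf=$1     # path to the site's conf.py
    theme=$2    # theme name, e.g. gruberwine
    tmp=$(mktemp) || return 1
    # Rewrite only the THEME assignment; every other line passes through as-is.
    if sed "s/^THEME = .*/THEME = \"$theme\"/" "$conf" > "$tmp"; then
        cat "$tmp" > "$conf"    # overwrite in place, keeping file permissions
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return $status
}

# Typical use, from the site directory (needs Nikola installed):
#   nikola theme -i gruberwine
#   set_nikola_theme conf.py gruberwine
#   nikola build
```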

3. Operations

Making the site accessible on the Internet happened in roughly three phases:

First, I had to configure my router to forward web traffic to the web server. This is quite easy on most modern-ish broadband routers - you just have to find the right part of the admin menu. For some reason, I still have to have port 80 (HTTP) open, even though I'm serving everything over 443 (HTTPS). It's partly to do with the Letsencrypt setup, and I would much rather not have port 80 open at all.

Secondly, I had to set up a dynamic DNS provider, so I can have a "real" domain name, rather than just an IP address which changes whenever my ISP decides. I opted for dynu.com, who still offer really free dynamic DNS. After a significant time pondering what to call the site, I ended up with makerpunkbuzz.mywire.org... not perfect, but in the end, I didn't want to spend forever deciding, and it had already taken too long! Anyway, once decided and registered, I had to set up an agent on the RPi to keep the address updated. This takes the form of installing and configuring DDClient from the Debian repository. Very straightforward, but obviously with a little CLI fun thrown in.

Finally, to enable proper HTTPS, or encrypted HTTP, I opted for Letsencrypt. This was a bit annoying. Letsencrypt have to choose the way of packaging which works best for them, but why they chose "snap" packages I don't know. It mainly involved working out how to install the snap software on Debian, just so I could install Letsencrypt's certbot. Certbot is a great system, which automates all the admin required for setting up and maintaining the TLS digital certificates. Now that it's set up, it should basically run itself.

For day-to-day operations, I've ended up doing a fair bit of tweaking, and most of what I've done was to do with the Nikola theme - minor adjustments to the CSS styling like adding in the handwriting-style font, called RockSalt, and the layout, and learning how to do some things with the templating. I'm not super-hot on CSS, but I can find my way with lots of "goggling" (not a smelling pistake!) and reading of blog posts, w3c docs, stackoverflow questions and so on. I don't remember the exact details, but I eventually got everything the way I want it. It's not going to win any awards, but it seems OK to me ;-)

I've also added quite a few "comfort" features to the command-line environment, like the mosh shell (so I can keep SSH sessions to the RPi going even if I hibernate my main laptop); command-line utilities not installed by default on Dietpi; tweaking my .vimrc so it's not annoying when pasting text from the clipboard; and configuring the access log on Lighty so I can see which pages are being viewed but not much else.

I wanted to enable some basic request logging without storing any tracking data or personal information like IP addresses, and in doing so I came up against some quirks of Dietpi, the RPi OS I'm using. By default, Dietpi stores server logs in a RAMdisk, for better webserver performance and to save wear on the flash. Because RAM is quite limited, it then clears the logs at 17 minutes past each hour, meaning there is no permanent record of requests more than an hour old. You can "tail" Lighty's access.log file to see what requests are coming in, but it reports "file truncated" every hour, because the contents have been wiped by this hourly maintenance script.

As I want a longer-term record of which pages are being read, I had to come up with a proper solution. What I have ended up doing is writing a little "one liner" script, which runs just before Dietpi's script (in /etc/cron.hourly) and appends the current log onto a backup file on the main flash storage, so the entries don't get lost. This will obviously cause more wear on the flash, but only a small fraction of what it would have been if I'd changed Lighty to log directly to flash. I'm aware there's a race condition here that could lose any requests logged between when my script runs and when the cleanup script runs, but I'm comfortable with that very low risk.

Anyway, after running this arrangement for a few days, and getting so much junk logged from drive-by exploit attempts, I've now evolved the script to keep only the successful requests, and to store the backup log compressed with gzip. I feel this strikes a decent balance between flash wear and my ability to see what's going on.
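A backup step along those lines might look roughly like the sketch below. It is not the exact script running here, and several things are assumptions: the function name, the log layout (time, status, size, verb, URI, so the status code is the third whitespace-separated field), and treating "successful" as any 2xx response. It leans on the fact that gzip files can simply be appended to each other and still decompress as one stream.

```shell
#!/bin/sh
# Append the successful (2xx) requests from the live access log to a
# gzip-compressed backup file.
# Assumed log line layout (an illustration, not the real config):
#   [18/Jan/2024:13:13:54 +0000] 200 5120 GET /posts/example/
# The timestamp contains a space, so the status code is field 3.
backup_access_log() {
    src=$1     # live log in the RAMdisk
    dest=$2    # compressed backup on flash storage
    # Nothing to do if the live log is missing or empty.
    [ -s "$src" ] || return 0
    # Keep only 2xx lines, then append them as a new gzip member.
    awk '$3 ~ /^2[0-9][0-9]$/' "$src" | gzip -c >> "$dest"
}

# Example call with made-up paths:
#   backup_access_log /var/log/lighttpd/access.log /root/access-backup.log.gz
```

The whole backup can be read back with "gzip -dc" (or zcat), which decompresses every appended chunk in order.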

It does complicate seeing requests for the current day, however, because up to an hour's worth of logs will still be in the RAMdisk. That's easy enough to cope with though, and I now have a nice BASH shell one-liner that summarises requests since midnight, so I can see if anyone is actually reading what I'm writing... Sweet.
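For illustration, a summary of that kind could be sketched as below. Again this is an assumption-laden example rather than the actual one-liner: the function name is invented, and it assumes the same log layout as before (timestamp in fields 1 and 2, status in field 3, URI in field 5). It reads log lines on standard input, so the compressed backup and the live RAMdisk log can be fed in together.

```shell
#!/bin/sh
# Count successful (2xx) requests per URI for one day.
# Reads log lines on stdin; the day is given as it appears in the log
# timestamps, e.g. 18/Jan/2024.
summarise_day() {
    day=$1
    awk -v day="$day" '
        # Match lines whose timestamp starts with the requested day.
        index($1, "[" day ":") == 1 && $3 ~ /^2[0-9][0-9]$/ { hits[$5]++ }
        END { for (uri in hits) printf "%d %s\n", hits[uri], uri }
    ' | sort -k1,1nr -k2    # busiest pages first, then alphabetical
}

# Example call with made-up paths (LC_ALL=C keeps the month name in English):
#   { gzip -dc /root/access-backup.log.gz; cat /var/log/lighttpd/access.log; } \
#       | summarise_day "$(LC_ALL=C date +%d/%b/%Y)"
```

One caveat with combining the two files this way: if the backup has already run this hour but the cleanup hasn't yet, the same lines appear in both and get counted twice.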

4. Hacking scans

One thing that I had largely forgotten about since giving up doing Linux admin for a living was how utterly hostile the open web is. I decided I'd enable limited logging on Lighty (lighttpd) so I could see if anyone was actually reading my posts. By limited, I mean I log just the time of each request, the response code, response size, HTTP verb and the URI requested, so nothing that identifies viewers. What shocked me most was that on many days, drive-by scans by hacker scripts were by far the most frequent requests made. I think this is mainly because of port 80 being open, which I think is just for Letsencrypt - this is definitely something I'll be looking at further. It's a fearful world out there, especially on the "Tinkernet".