Disaster Recovery and The Lab’s Gaming Rig

Not too long ago, coding with my coffee on my patio, I went to check some changes into the lab git repository. The repo was not responding. I went to log in to the virtual server handling the repo ~ also no response. On my way to the Nagios monitoring dashboard, I spotted the alert emails ~ it seemed that one service was not all that was missing. An entire host server had fallen off the radar!

Now this particular host was something of an experiment: it was an HP pre-built / about-to-be-discontinued model from Best Buy with an 8th-generation, 6-core Intel Core i7 processor with hyperthreading, 16 GB of memory, a 128 GB NVMe drive, and a 1 TB SATA spinner — not too shabby. On-board graphics meant no need for a GPU to show a text console on the screen — bonus. I’d picked it up, stripped Windows from it, and installed XCP-ng, the Xen server virtualization platform. With the six cores plus hyperthreading, the machine could run the equivalent of 10+ Amazon AWS micro-instances, which cost around $10/month each ~ very cool. Its primary role, as an experiment itself, was to host experiments ~ non-critical stuff that didn’t require failover or similar.

Good thing!

Firstly, no worries: the hosted virtual machines themselves were routinely backed up to local disconnected storage and could be reloaded on other servers, and particular services such as the local git repo were redundant with cloud services. Bottom line: there was nothing critical that couldn’t be recovered in some reasonable state with a little bit of work and time. That said, it would certainly be nice if I could get the server running again from where it left off…

Recovery

The server itself was roughly a $1000 investment and had been running for 1-2 years. It had some quirks, but was not too much trouble overall. Finding it powered up with fans gently spinning but unresponsive was not in any way typical. Bottom line? The box wouldn’t POST: no video signal, no beeps ~ only the fans firing up and then ramping down and settling in. Pulling the memory and trying again did result in error beeping, so at least something was alive in there…

Approach? Well: there was clearly power to the motherboard, the NVMe solid-state boot drive placed very little demand on it, and single-stick memory checks didn’t produce different results, so I speculated I was facing either a failed motherboard or a failed CPU. Surely at least the memory and drives could be salvaged ~ maybe the CPU too, especially given the difference in price between a motherboard and a CPU.

Well, the prebuilt design produced a tight and reportedly non-standard form factor with little space for upgrades or expansion — a consideration in light of the “memento mori” event and the thought that the salvage should be anything but a disposable commodity — so I’d probably need a new case. And looking at the power supply? A nondescript 180W? One or two future accessories would kill that for sure. Ok then! Round One: motherboard, case, and power supply it is!

… and apparently a CPU cooler, too — for the second visit, that is. The original cooler, a simple air cooler, was probably suitable, but remember that prebuilt aspect? The cooler’s mounting bracket was glued to the bottom of the motherboard ~ not reusable.

Alright: New motherboard and cooler, old CPU, memory, drives, and power supply. The results? No POST.

Next trip? Wait — about these “next trips:” The nearest places carrying all of these raw computer parts are our friendly neighborhood Micro Centers, where the “neighborhoods” are a choice of a 45-minute drive plus tolls either to the northeast of Baltimore or to the northwest of Washington, D.C. ~ choose your Beltway. So for the next trip, I’d pick up both replacement memory (as the old set was a pair of nondescript 8 GB, 2666 MHz sticks) as well as a new CPU — after carefully confirming the return policy, of course. The CPU would be hedging the bet and saving a trip, just in case… The 8th-generation i7 was no longer sold by Micro Center, as all the 9th-generation stuff was out. The top-end model, the Core i9, was certainly overkill — and overpriced — for my needs, so the step in between would do: the 9th-generation Core i7-9700K. I’m told the gamers love it, what with 8 cores and overclocking capability, what’s not to love except maybe the price of around $400?

Well, the memory alone was not a fix — but as long as I had the better memory, might as well keep it. And yes, big air cooler off, old CPU pull, new CPU push, air cooler remount did the trick — we have POST!

What I didn’t have was any recognized boot drive. Finagling a wired USB keyboard, a wired USB mouse, and a USB thumb drive, I got a live instance of Ubuntu running. From there, yes, I could see the boot partition, but the box somehow couldn’t… And the XCP-ng partitioning wasn’t great (at least not at the time): only about 40 GB of the 128 GB NVMe was used by XCP-ng during the auto-install. In an old effort to recover some of that space, I had linked the spinning drive to the NVMe drive with Linux logical volume management (LVM) — a bad move that produced no useful results. “It’d probably just be easier to kill those disks and start fresh, right?” I thought to myself… and then pushed the button.
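For the curious, that LVM “linking” was roughly the following ~ a sketch with illustrative device and volume-group names, not my exact commands:

# fold the spinner into the NVMe-backed volume group:
pvcreate /dev/sdb1                        # spinner partition becomes a physical volume
vgextend local_storage /dev/sdb1          # extend the volume group across both drives
lvextend -l +100%FREE local_storage/data  # grow the logical volume into the new space
resize2fs /dev/local_storage/data         # grow the filesystem to match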

Okay then: a fresh Linux install. “Maybe this will serve as a development station in the meantime…” Install the desktop environment and… achievement unlocked!

… but what’s with that annoying flickering?

Google research… upgrade motherboard BIOS… tinker with settings in the OS and in BIOS… Swap HDMI ports, cables, monitors… No, still the occasional, full-screen blackout flicker. That is certainly not suitable for a workstation… The internet had no solution indexed — just some discussions regarding the Intel integrated graphics and the latest Linux 5.3 kernel…

Last trip, this time to the local Best Buy for a light mugging by the price of a GPU card. Later that evening, after some installation details worthy of their own story, the flicker is gone. Woot.

Looking into the virtualization options, a hiccup. Remember the hyperthreading on the 6-core, 8th-gen i7? And remember the hyperthreading on the 8-core, 9th-gen i9? The hyperthreading that effectively doubles the number of apparent CPUs for a hypervisor and would place me somewhere between 12 and 16? Guess what Intel omitted from the 9th-gen i7?

Right. No hyperthreading. Only 8 apparent CPUs from a hypervisor’s perspective. However fast they might be, it was a net loss for the intended use.

So what did I end up with? Well, if I stick Windows on there I’ll have built a fairly substantial gaming rig! That’s a far cry from the targeted virtualization server… Maybe my son will enjoy the new box… right after he has me buy several new games for it :-p

Disaster Recovery Scorecard?

Well, given that we had classified this server as experimental, deploying no primary or critical services on it, and given that we did keep routine backups off-box and off-site, there was no impact to operations. There were, however, man-hours and expenses associated with restoring to the previous state, which has not yet been accomplished.

Was it worth it?

Well, kind of.

  • The expenses associated with the original purchase and even the replacement parts and time compared favorably to commercial cloud hosting. Admittedly, though, that’s because this host was not significant to operations — that is, no clients were impacted by downtime.
  • As cloud provisioning becomes increasingly trendy and easy, I believe we’re losing a lot of our basic knowledge and DIY skills. Increasingly, we lose the capability to architect systems with anything other than cloud solutions and, as a result, we’re drawn into a deepening dependency cycle with pricing to match — the latest variant on “vendor lock-in.”
  • We hold to the tenet that using cloud computing means we’re using other people’s computers and networks — something that is fundamentally not conducive to operations requiring data security, privacy, anonymity, controls against vendor outages, and so forth.

To be resilient and effective, it’s important to keep skills sharp at the “roughing it”- and “guerrilla warfare”-levels of computing and network operations. Lessons learned at the lowest levels are applicable at every level.

So, yes: it’s worth it ~ personally and professionally, for ourselves and for our clients 🙂

Tor Hidden Service Access

More as an experiment than anything else, for the time being this site is available via the following Tor hidden service address:

dgszfpm3lxkssbu6xof3th2hv5c2drmxpt4qhmfv7pyhsb6cumfnqeqd.onion

This will allow access for anonymous reading as well as access from known non-U.S. IP addresses that are now temporarily blocked by policy.

Users should find that the mechanism prevents contact form submission as well as the ability to log in to the site; however, all posts and pages are available as usual. Note that this secondary access is not designed as a hidden service hardened against known Tor implementation weaknesses; rather, it’s just another way to view the site with general Tor protections. Lessons learned here will likely find their way to the Sanctuary IdP project. In the meantime, I make no guarantee that I’ll keep this particular access open.
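For anyone curious how this sort of access is typically published, it’s only a couple of lines on the Tor side ~ a sketch of the usual torrc shape, not necessarily this site’s exact setup:

# torrc excerpt: publish a local web server as a (v3) hidden service
HiddenServiceDir /var/lib/tor/hidden_service/
HiddenServicePort 80 127.0.0.1:80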

Enjoy!

Dovecot XOAUTH2 & Keycloak #technote

This is a technical, detailed follow-up to the last, more general post, Integrator Challenges, Email Example. It required a bit of work to get the parameters right, with no clear examples found googling around. So, for the archives and for anyone following, my notes:

Objective

SMTP & IMAP (email) authentication using OpenID-Connect tokens.

Audience

Sophisticated user, not for the faint of heart. This is a sketch of the solution involving several moving parts, not a full installation guide.

Our Baseline

  • Ubuntu Server 18.04
  • Postfix 3.3.0-1ubuntu0.2
  • Dovecot (dovecot-core 1:2.2.33.2-1ubuntu4.5)

References

Notes

  • In an integrated deployment, Dovecot provides the authentication mechanisms for Postfix (SMTP), for Dovecot itself (IMAP), and for other Dovecot-supported protocols.
  • According to Dovecot, XOAUTH2 & OAUTHBEARER support were added beginning with v2.2.28. From my earlier note, you’ll see that Ubuntu Server 16.04’s repository-default Dovecot package fell short of that mark (a quick version check follows these notes). No additional Dovecot package is required to add these two mechanisms.
  • I assume you have a working installation supporting the PLAIN & LOGIN mechanisms.
  • I will assume you have a working Keycloak installation supporting SSL/TLS, a realm with a user for testing, and that the email system handles users from the domain matching the users’ email fields.
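As noted above, a quick check that your installed Dovecot clears the v2.2.28 mark:

# xoauth2/oauthbearer require dovecot >= 2.2.28:
dovecot --version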

Instruction

Set up a default-valued Keycloak client, changing it (1) from a public to a confidential client, and (2) removing all client scopes except email. Lesson learned: the default access token JWT, with all client scopes and roles present, will be too large to be an IMAP value in a key-value authentication instruction. Trimming down to email-only makes it work in initial testing.
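For the command-line inclined, the client creation can be sketched with Keycloak’s admin CLI ~ the client ID “dovecot” and the server URL here are illustrative assumptions, and the scope trimming remains easiest in the admin console:

# authenticate the admin CLI against the master realm (prompts for the password):
kcadm.sh config credentials --server https://keycloak.example.com/auth --realm master --user admin
# create a confidential (non-public) client for the mail stack:
kcadm.sh create clients -r <realm-name> -s clientId=dovecot -s enabled=true -s publicClient=false
# then, in the admin console: Clients -> dovecot -> Client Scopes,
# remove all assigned default scopes except "email"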

In /etc/dovecot/conf.d/10-auth.conf, add the auth_mechanisms and an include for passdb details:

-auth_mechanisms = plain login
+auth_mechanisms = plain login oauthbearer xoauth2
+!include auth-oauth2.conf.ext

Create the file /etc/dovecot/conf.d/auth-oauth2.conf.ext:

passdb {
  # hand token validation off to Dovecot's oauth2 passdb driver
  driver = oauth2
  mechanisms = xoauth2 oauthbearer
  # driver settings live one directory-level up (created next)
  args = /etc/dovecot/dovecot-oauth2.conf.ext
}

The passdb above references a file one directory-level up. Create /etc/dovecot/dovecot-oauth2.conf.ext, making the obvious replacements for your Keycloak server FQDN, realm name, client ID, and client secret. The debug and rawlog lines are optional. If you are using something other than a public CA chain for your server certificates, you’ll need to change the tls_ line and possibly specify public and private keys as noted in the References.

tokeninfo_url = https://keycloak.example.com/auth/realms/<realm-name>/protocol/openid-connect/userinfo?access_token=
introspection_url = https://client_id:client_secret@keycloak.example.com/auth/realms/<realm-name>/protocol/openid-connect/token/introspect
introspection_mode = post
debug = yes
rawlog_dir = /tmp/oauth2
tls_ca_cert_file = /etc/ssl/certs/ca-certificates.crt
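Before restarting anything, it doesn’t hurt to sanity-check the endpoint and the merged configuration ~ this assumes an access token already sits in $TOKEN (the Testing section below shows one way to obtain one):

# a valid token should return the user's claims from the tokeninfo URL:
curl "https://keycloak.example.com/auth/realms/<realm-name>/protocol/openid-connect/userinfo?access_token=$TOKEN"
# confirm the passdb made it into the merged Dovecot config, then restart:
doveconf -n | grep -A 3 'driver = oauth2'
systemctl restart dovecot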

Testing

Testing is described in the referenced documents. Since default Keycloak access tokens have a lifespan of five minutes, you’ll have to perform the test sequence quickly (or adjust that lifespan for testing). The steps, with a consolidated shell sketch after the list:

  1. Use Postman or curl to obtain an access token via a password grant request to Keycloak.
  2. Base64 encode the authentication request, which includes the username (here, the user’s email address) and the Bearer token (the access token you just received).
  3. Connect to the IMAP server using openssl, e.g. openssl s_client -connect imapserver.example.com:993.
  4. Try to authenticate, e.g.: aaa authenticate xoauth2 base64-encoded-blob-from-last-steps.
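Putting the four steps together ~ a sketch assuming the “dovecot” client from above, with curl and jq installed; the user, hostnames, and secret variables are placeholders:

# 1. obtain an access token via a password grant to Keycloak:
TOKEN=$(curl -s \
  -d grant_type=password \
  -d client_id=dovecot \
  -d client_secret="$CLIENT_SECRET" \
  -d username=alice@example.com \
  -d password="$USER_PASSWORD" \
  "https://keycloak.example.com/auth/realms/<realm-name>/protocol/openid-connect/token" \
  | jq -r .access_token)

# 2. build the XOAUTH2 string (0x01 separators) and base64 encode it:
B64=$(printf 'user=%s\001auth=Bearer %s\001\001' alice@example.com "$TOKEN" | base64 -w 0)

# 3. connect to the IMAP server over TLS:
openssl s_client -connect imapserver.example.com:993 -quiet

# 4. within the five-minute token lifespan, at the IMAP prompt:
#      aaa authenticate xoauth2 <contents of $B64>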

With any luck, your user will be logged in. Otherwise, we have some debugging to do…

To Do

Note: I still have to test if this integration works when using Keycloak’s own TOTP 2FA mechanism.

Fingers crossed as we move to modifying the webmail client to take the token from Apache and pass it on to the mail servers.