md array with disk gone missing – recovering data
Had a server that decided to drop a disk (or disk went faulty) in a RAID5 array. On reboot the array didn’t want to start. Output of
mdadm --detail /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Wed May 22 18:17:58 2013
     Raid Level : raid5
  Used Dev Size : -1
   Raid Devices : 3
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Sun Jan  3 06:50:05 2021
          State : active, degraded, Not Started
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           Name : nebula:0  (local to host nebula)
           UUID : 962b8ff0:00d88161:5a030e1f:236466af
         Events : 31168

    Number   Major   Minor   RaidDevice State
       4       8       16        0      active sync   /dev/sdb
       1       0        0        1      removed
       3       8       48        2      active sync   /dev/sdd
Try to start array with
mdadm --run /dev/md0
mdadm: failed to run array /dev/md0: Input/output error
As expected, given it didn’t autorun on boot.
mdadm --assemble --run /dev/md0 /dev/sdb /dev/sdd
mdadm: /dev/sdb is busy - skipping
mdadm: /dev/sdd is busy - skipping
This is because the md device is still holding the disks. Stop it with
mdadm --stop /dev/md0
then reassemble with
mdadm --assemble --run --force /dev/md0 /dev/sd[bd]
mdadm: Marking array /dev/md0 as 'clean'
mdadm: /dev/md0 has been started with 2 drives (out of 3).
Then
mount -a to remount everything in
fstab. Took a few seconds but worked. Copy data off ASAP!
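For reference, the whole sequence from this incident in one place (the device names are specific to this machine, and the rsync destination is just an example):

mdadm --stop /dev/md0
mdadm --assemble --run --force /dev/md0 /dev/sd[bd]
mount -a
# copy everything off before touching the array further, e.g.
rsync -a /mnt/md0/ /mnt/backup/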
Running svnadmin verify as root – permission issues on Berkeley DB repositories
After a system crash I ran
svnadmin verify on some of the relevant repositories to check things were ok – this was done as root. Afterwards normal network access (via Apache) was broken with
Internal error: Berkeley DB error for filesystem
appearing in the logs. The fix was to
chown -R www-data:www-data on the broken repositories. It looks like running
svnadmin verify as root changes permissions on SVN repositories somewhere (at least Berkeley DB ones).
Lesson learned – check normal SVN access after doing this sort of thing!
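One way to avoid the problem in future is to run the verify as the repository owner rather than as root (the repository path here is just an example):

sudo -u www-data svnadmin verify /srv/svn/myrepo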
Blocking or disabling autofs automounts with the -null map
Suppose you have a Linux network setup with automounter maps that come from the network (via
LDAP etc.) and you want to block some of them acting on a particular system. In our case we have an automount map that acts on
/opt and mounts various software packages from network shares. The problem with this is that you can’t then install your own stuff locally to
/opt, which is what a lot of Debian/Ubuntu packages expect to be able to do.
It turns out there is an option in the automounter for this sort of situation. There is a built-in map called
-null that blocks any further automounts to a particular mountpoint. In our case we want to block
auto.opt, so we add a line to
auto.master (somewhere before the bottom
+auto.master include line, if there is one):

/opt    -null
Then restart the
autofs service (if stuff was mounted on
/opt then unmount it first), or reboot the system. You should find that you can now put stuff in the local /opt.
To check the map is blocked you can also run
automount -m
(also handy for checking what is actually meant to be mapped where).
Another way of doing this that leaves the system
auto.master untouched is to create a file
/etc/auto.master.d/opt.autofs (the first part of the name can be anything you want). Put the same contents in the file, e.g.

/opt    -null
Note that using this mechanism normally requires two files – one in
/etc/auto.master.d/ and a map file that it refers to. In this case
-null is a built-in map.
Unfortunately this option is not well documented.
There are also other built-in maps, e.g.
-hosts and
-fedfs. Of these only the
-hosts map is documented in the
auto.master(5) man page.
-null is confirmed to work in CentOS 7, CentOS 8, Ubuntu 20.04, Debian 10.
TrueNAS and Windows clients – NTLMv2 issues
Situation – TrueNAS (or FreeNAS, or other Samba servers) serving an SMB share with NTLMv1 authentication disabled. A standalone Windows 10 system can connect to it, but a domain-joined Windows 10 system constantly claims the password is wrong.
The culprit here was an old group policy setting in the domain:
Network Security: LAN Manager authentication level
(Computer Configuration - Windows Settings - Security Settings - Local Policies - Security Options)
This was set to
Send LM & NTLM - use NTLMv2 session security if negotiated, for backwards-compatibility reasons with Win 2000 boxes and the like. This affects the registry key
lmcompatibilitylevel (setting it to 1) under
HKLM\SYSTEM\CurrentControlSet\Control\Lsa.
Unfortunately this is a bit misleading. According to this article:
Security Watch: The Most Misunderstood Windows Security Setting of All Time
This should negotiate better session security if possible, but does not actually send NTLMv2 requests or responses.
Thus trying to connect to a TrueNAS SMB share fails unless NTLMv1 Auth is explicitly enabled (in the service settings).
Ideally the group policy should be removed and the normal setting restored (NTLMv2 only). Or we can enable NTLMv1 on the share if it isn’t going to be a permanent setup.
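For the record, the client-side registry change (on a standalone machine, or just for testing – group policy will overwrite it again on a domain member) would be something like this from an elevated prompt:

reg add "HKLM\SYSTEM\CurrentControlSet\Control\Lsa" /v LmCompatibilityLevel /t REG_DWORD /d 3 /f

Level 3 ("Send NTLMv2 response only") or higher makes the client actually send NTLMv2.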
OpenProject Apache reverse proxy with https secure connection
These are some notes on setting up OpenProject on a backend server (let’s call it
backsrv.example.com), and accessing it via a front-end system (
frontsrv.example.com). Normally we’d do the SSL termination at the reverse proxy, and there is some documentation on this. In this case I wanted to do things properly, and protect the login credentials all the way. This means using an https connection between the reverse proxy and the back end server.
Firstly, the reverse proxy has to trust the SSL certificate that the back end uses. There are several ways to go about this. I chose to set up a local certificate authority using the
easy-rsa scripts (using another small virtual machine set up only for this purpose). For one connection this is probably overkill, but for multiple backends in the future it will make the administration a lot easier.
- Set up CA
- Debian 10, install
easy-rsa package, do required setup.
- Copy CA root certificate to the systems that need to trust it.
- For Debian systems, copy to
/usr/local/share/ca-certificates/ (with a .crt extension) and run
update-ca-certificates.
- Create CSR on
backsrv, copy it to CA, sign it and copy resulting certificate to
backsrv. Put cert and key in sensible places (
/etc/ssl/local-certs/). Make sure permissions are correct.
- Configure Apache on
backsrv and check cert works (for OpenProject edit
/etc/openproject/installer.dat to put in the correct certificate paths and run
openproject configure to update the config).
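The CA side of this with the easy-rsa 3 scripts looks roughly like the following (names and paths are illustrative, not what was actually used):

# on the CA machine, once:
./easyrsa init-pki
./easyrsa build-ca
# for each backend - import the CSR copied over from backsrv and sign it:
./easyrsa import-req /tmp/backsrv.req backsrv
./easyrsa sign-req server backsrv
# the signed certificate ends up in pki/issued/backsrv.crt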
Set up Apache to do proxy stuff on
frontsrv. Here’s the beginning fragment of
default-ssl.conf that should work:
<IfModule mod_ssl.c>
    <VirtualHost _default_:443>
        ServerAdmin webmaster@localhost
        DocumentRoot /var/www/html

        RequestHeader edit Destination ^https http early
        SSLProxyEngine on
        SSLProxyCheckPeerName off

        # To openproject server on backsrv
        ProxyPass /openproject https://backsrv.example.com/openproject
        ProxyPassReverse /openproject https://backsrv.example.com/openproject
        <Location /openproject>
            ProxyPreserveHost On
            Require all granted
        </Location>
You also need to go to the OpenProject web interface admin area, go to System Settings – General and change the
Host name to the reverse proxy, and set protocol to https. It will complain if there’s a hostname mismatch (case sensitive, even!). You may also want to go to Email – Email notifications and change the
Emission email address to be consistent.
Don’t forget: for OpenProject the subdirectory locations on the front and back ends do need to match.
ProxyPreserveHost On is required per the OpenProject documentation. Unfortunately, that means it tries to match the name
frontsrv.example.com to the back end cert, and the SSL handshake fails. This is the reason for the
SSLProxyCheckPeerName off directive – it disables checking the certificate CN or Subject Alternative Names.
SSLProxyCheckPeerName off can go in a
<Proxy>...</Proxy> matching block with Apache 2.4.30 or newer, which would be nice. As it is this will turn it off for the whole vhost, which is a small lessening of security.
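With a new enough Apache, something like this inside the vhost should scope the directive to just that backend (untested sketch – the URL needs to match the ProxyPass target):

<Proxy "https://backsrv.example.com/openproject">
    SSLProxyCheckPeerName off
</Proxy>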
I suppose in principle we could create the certificate for the back end with the name of the front end, or add it to the SANs. I haven’t tried this and it seems like it could be a recipe for confusion and subtle bugs.
Secure disk wipe with Windows format command
In Windows 8 Microsoft snuck a refinement into the
format command. It is now possible to get it to do multi-pass random-number disk wipes. From the help (Win 10 20H2):
/P:count        Zero every sector on the volume. After that, the volume
                will be overwritten "count" times using a different
                random number each time. If "count" is zero, no
                additional overwrites are made after zeroing every
                sector. This switch is ignored when /Q is specified.
So to do a single-pass random wipe:
- Repartition disk with one partition (if desired) and give it a drive letter (let’s say F: for this example). Probably a good idea to remove any OEM, EFI, recovery partitions and the like; a quick way to do this is to use the
diskpart clean command.
- Run
format F: /P:1
- If you feel like it, finish up with a quick format (
format F: /Q) so the disk ends up with a usable filesystem.
This should do a pass with all zeros, and then a random-number pass.
Note this isn’t a full ‘write random data to every block in the drive’ erase, but should still be secure enough for most purposes.
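Putting it all together, an elevated command prompt session might look like this (the disk number is an example – triple-check it before running clean!):

diskpart
DISKPART> list disk
DISKPART> select disk 2
DISKPART> clean
DISKPART> create partition primary
DISKPART> assign letter=F
DISKPART> exit

format F: /FS:NTFS /P:1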
Triggering redetection of network type in Server 2012
Had an issue where a Windows Server 2012 R2 system could not be accessed by RDP or remote management, as the network type had changed to Private (and thus the firewall wasn’t letting these connections through). File sharing was still working.
Found solution via a SpiceWorks forum post: restart the Network Location Awareness service (needed to log on to the system locally to do this). This triggered a redetection and the type went back to Domain. RDP etc. then worked again.
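The service’s short name is NlaSvc, so from an elevated PowerShell prompt the restart is:

Restart-Service NlaSvc -Force

(the -Force is there because other services may depend on it).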
Upgrading from m.2 SATA to Crucial NVMe drive on Latitude 7490
Dell Latitude 7490 with existing SATA m.2 SSD. We want to upgrade to a larger NVMe drive (Crucial 1 TB).
First tried new drive in Startech NVMe USB enclosure (M2E1BMU31C). Downloaded Crucial cloning software (locked version of Acronis). Problem – not recognised as Crucial drive so Acronis won’t run.
Posts suggest that the new drive should be installed in the laptop first and the system booted via USB. So take current drive out and put it in a SATA USB m.2 enclosure. Attach this to USB-C port and reboot.
This doesn’t work. What does work is attaching it to a USB-A port instead. Then it boots with no intervention.
After that the disk was cloned (with no reboot necessary!), the old drive disconnected and the system booted happily from the new drive.