Kea DHCP service failing to open network socket on boot

System – Kea IPv4 DHCP server (ver 2.2.0) running on Debian 12, serving addresses on second network interface (PCIe card).

On reboot the server starts, but warns that it could not open the socket on the prescribed interface, and so is not serving any addresses. The problem shows up in the output of

systemctl status kea-dhcp4-server.service

where the error is


kea-dhcp4[582]: WARN DHCPSRV_OPEN_SOCKET_FAIL failed to open socket: the interface enp3s0 is not running
kea-dhcp4[582]: INFO DHCP4_OPEN_SOCKETS_FAILED maximum number of open service sockets attempts: 0, has been exhausted without success

Restarting the service afterwards works normally. This is a known problem. There are various suggestions to fix the ordering, but the simplest (albeit slightly inelegant) way to get it to work is to tell it to retry a bunch of times with the service-sockets-max-retries and service-sockets-retry-wait-time options. So the start of the configuration looks like this:

"Dhcp4": {
    "interfaces-config": {
        "interfaces": [ "enp3s0" ],
        "dhcp-socket-type": "raw",
        "service-sockets-max-retries": 100,
        "service-sockets-retry-wait-time": 5000
    },

This retries every 5 seconds up to 100 times, which should be enough to get things in order. Note that, annoyingly, there is no message logged when a retry succeeds.
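As a rough sanity check on those numbers, the total time Kea will keep retrying can be worked out from the two options. This is a minimal Python sketch, assuming the fragment has been saved as strict JSON (Kea config files may contain comments, which json.loads will not accept). The retry_window_seconds helper is hypothetical, and the fallback wait time of 5000 ms is an assumption about the default.

```python
import json

# The interfaces-config fragment from above, wrapped so it is valid JSON.
CONFIG = """
{
  "Dhcp4": {
    "interfaces-config": {
      "interfaces": ["enp3s0"],
      "dhcp-socket-type": "raw",
      "service-sockets-max-retries": 100,
      "service-sockets-retry-wait-time": 5000
    }
  }
}
"""

def retry_window_seconds(config_text):
    """Return the total time (in seconds) spent waiting between retries."""
    ifcfg = json.loads(config_text)["Dhcp4"]["interfaces-config"]
    retries = ifcfg.get("service-sockets-max-retries", 0)
    # Fallback of 5000 ms is an assumed default, not taken from the post.
    wait_ms = ifcfg.get("service-sockets-retry-wait-time", 5000)
    return retries * wait_ms / 1000.0

print(retry_window_seconds(CONFIG))  # 500.0
```

So with these settings the interface has a little over eight minutes to come up before Kea gives up.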

Blocking or disabling autofs automounts with the -null map

Suppose you have a Linux network setup with automounter maps that come from the network (via NIS, sssd, LDAP etc.) and you want to block some of them from acting on a particular system. In our case we have an automount map that acts on /opt and mounts various software packages from network shares. The problem with this is that you can’t then install your own stuff locally to /opt, which is what a lot of Debian/Ubuntu packages expect to be able to do.

It turns out there is an option in the automounter for this sort of situation. There is a built-in map called -null that blocks any further automounts to a particular mountpoint. In our case we want to block auto.opt, so we add a line to auto.master (somewhere before the bottom +auto.master line)

/opt  -null

Then restart the autofs service (if stuff was mounted on /opt then unmount it). Or reboot the system. You should find that you can put stuff in the local /opt.

To check the map is blocked you can also run

automount --dumpmaps

(also handy for checking what is actually meant to be mapped where).

Another way of doing this that leaves the system auto.master untouched is to create a file /etc/auto.master.d/opt.autofs (the first part of the name can be anything you want). Put the same contents in the file, e.g.

/opt  -null

Note that using this mechanism normally requires two files – one in /etc/auto.master.d/ and a map file that it refers to. In this case -null is a built-in map.
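For checking a batch of machines, the master map text can be scanned for -null entries. The following is only a sketch: null_mountpoints is a hypothetical helper, the /home line in the sample is made up for illustration, and it only looks at the text you give it (it does not follow +auto.master includes or network maps, so automount --dumpmaps remains the real test).

```python
def null_mountpoints(master_text):
    """Return mount points whose map is the built-in -null map."""
    blocked = []
    for raw in master_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if not line or line.startswith("+"):  # skip blanks and includes
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[1] == "-null":
            blocked.append(fields[0])
    return blocked

# Hypothetical auto.master contents for illustration.
SAMPLE = """\
# local overrides go before the include line
/opt  -null
/home  auto.home
+auto.master
"""

print(null_mountpoints(SAMPLE))  # ['/opt']
```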

Unfortunately this option is not well documented.

There are also other built-in maps, e.g. -passwd, -hosts, -fedfs. Of these only the -hosts map is documented in the auto.master(5) man page.

-null is confirmed to work in CentOS 7, CentOS 8, Ubuntu 20.04, Debian 10.

Triggering redetection of network type in Server 2012

Had an issue where a Windows Server 2012 R2 system could not be accessed by RDP or remote management, as the network type had changed to Private (and thus the firewall wasn’t letting these connections through). File sharing was still working.

Found the solution via a SpiceWorks forum. Restart the Network Location Awareness service (I needed to log on to the system locally to do this). This triggered a redetection and the type went back to Domain. RDP etc. then worked again.

Setting Windows 10 web proxy per-user

There are a couple of GUI routes for setting the system web proxy for Windows 10 – the old control panel page (via Network and Internet – Network Options):

Windows 7 and later Control Panel system web proxy settings panel.

And the new settings style:

Windows 10 system Proxy settings new style.

Note that the new style does not warn you that you may not be allowed to set the proxy – you can change the settings, but if you select another panel and then go back to Proxy your settings will be gone.

The reason for this is often that the system is configured to set the proxy at the machine level, not per-user. On domain systems this can be changed using Group policy. On standalone systems this can be changed using a registry key, located under

HKLM\Software\Policies\Microsoft\Windows\CurrentVersion\Internet Settings

There is a DWORD value here called ProxySettingsPerUser (if not, create it). 0 means the proxy is set at machine level, 1 enables per-user settings.

If you change this to 1 then you should immediately be able to change the proxy settings.
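If the same change is needed on several standalone machines, one option is to generate a .reg file and import it with administrator rights. A minimal Python sketch follows; proxy_per_user_reg is a hypothetical helper that only builds the file text, and saving and importing it is left to you.

```python
# Registry key from the text above, in the long form that .reg files use.
KEY = (r"HKEY_LOCAL_MACHINE\Software\Policies\Microsoft\Windows"
       r"\CurrentVersion\Internet Settings")

def proxy_per_user_reg(enabled):
    """Build .reg file text setting ProxySettingsPerUser to 1 or 0."""
    value = 1 if enabled else 0
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        "[%s]\r\n"
        "\"ProxySettingsPerUser\"=dword:%08x\r\n" % (KEY, value)
    )

print(proxy_per_user_reg(True))
```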

FQDNs in FlexNet license files

When querying FlexNet licenses using lmutil or similar from systems with a different DNS suffix, make sure that the SERVER line in the license file contains the FQDN of the license server. If not, you may find that lmutil complains that the lmgrd process is not running, even though you can run the actual program with the appropriate license.

For example, with a line in the licence file like

SERVER servername 4eca3b4b8326 1055

running

lmutil lmstat -a -c 1055@servername.physics.gla.ac.uk

results in a HOST_NOT_FOUND error. Running the Client ANSLIC_ADMIN Utility gives the same error. However, ANSYS fires up as normal.

To fix this, put the FQDN in the license file (refreshing the licence server afterwards!)

SERVER servername.physics.gla.ac.uk 4eca3b4b8326 1055

The lmutil queries should then work as normal.
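A quick way to spot license files with this problem is to pull out the host field of each SERVER line and see whether it looks fully qualified. This is only a sketch: server_hosts and is_fqdn are hypothetical helpers, and "contains a dot" is a crude test that assumes the field holds a hostname rather than an IP address.

```python
def server_hosts(license_text):
    """Return the host field of every SERVER line in a license file."""
    hosts = []
    for line in license_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "SERVER":
            hosts.append(fields[1])
    return hosts

def is_fqdn(host):
    """Crude check: a fully qualified name has at least one inner dot."""
    return "." in host.strip(".")

before = "SERVER servername 4eca3b4b8326 1055"
after = "SERVER servername.physics.gla.ac.uk 4eca3b4b8326 1055"

for text in (before, after):
    for host in server_hosts(text):
        print(host, is_fqdn(host))
```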

Making nis authentication work with Ubuntu 16, Debian 8, Fedora 28 etc.

After updating anything to use systemd-235, NIS logins either don’t work at all (usually for GUI logins), or take a long time to log in (console or ssh, sometimes). The culprit is a line in the systemd-logind.service:

IPAddressDeny=any

This sandboxes the service and doesn’t allow it to talk to the network. Unfortunately this affects nis lookups done via the glibc NSS API. See the links at https://github.com/systemd/systemd/pull/7343

The quick solution is to turn off the sandboxing, either by commenting out or changing the line in systemd-logind.service, or creating a drop-in snippet that overrides it. This can be done by creating a file /etc/systemd/system/systemd-logind.service.d/IPAddress_clear.conf with the contents:

[Service]
IPAddressDeny=

The file can be called anything you like, as long as the name ends in .conf.

Then restart things:

systemctl daemon-reload
systemctl restart systemd-logind.service

You can check that the drop-in is being loaded with

systemctl status systemd-logind.service

In the output you should see something like:

   Loaded: loaded (/lib/systemd/system/systemd-logind.service; static; vendor preset: enabled)
  Drop-In: /etc/systemd/system/systemd-logind.service.d
  └─IPAddress_clear.conf

The other test is to see if NIS logins work correctly, of course…
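If the drop-in has to go onto a number of machines, writing it can be scripted. A minimal Python sketch, where write_dropin is a hypothetical helper: it takes a root directory so it can be tried out in a scratch location first, and it does not run the daemon-reload or restart steps.

```python
import tempfile
from pathlib import Path

# Location and contents of the drop-in described above.
DROPIN_DIR = "etc/systemd/system/systemd-logind.service.d"
DROPIN_TEXT = "[Service]\nIPAddressDeny=\n"

def write_dropin(root, name="IPAddress_clear.conf"):
    """Create the drop-in under the given root and return its path."""
    target = Path(root) / DROPIN_DIR / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(DROPIN_TEXT)
    return target

# Try it in a scratch directory; pass root="/" (as root) to do it for real.
with tempfile.TemporaryDirectory() as scratch:
    path = write_dropin(scratch)
    written = path.read_text()
    print(path.name)
    print(written)
```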

The slightly slower solution is to use nscd to cache the lookup requests; apparently it does so in a way that plays nicely with the sandboxing. The much slower solution is to switch to using sssd or similar and ditch NIS once and for all…

Note – this may also affect systemd-udevd.

Making static networking work in Ubuntu 18.04

Ubuntu 18.04 has switched to netplan for configuring the network interfaces. Netplan generates configurations for NetworkManager or systemd-networkd and effectively replaces ifupdown and the /etc/network/interfaces file.

In an install of Ubuntu Desktop, the default netplan configuration comes from /etc/netplan/01-network-manager-all.yaml which reads:

network:
  version: 2
  renderer: NetworkManager

This basically hands over all network control to NetworkManager. For a static setup we can change the configuration to:

network:
  version: 2
  renderer: networkd
  ethernets:
    enp0s25:
      dhcp4: no
      addresses: [aaa.aaa.aaa.aaa/24]
      gateway4: bbb.bbb.bbb.bbb
      nameservers:
        search: [example.co.uk]
        addresses: [xxx.xxx.xxx.xxx,yyy.yyy.yyy.yyy,zzz.zzz.zzz.zzz]

Where enp0s25 is my network interface in this case, the /24 on the address means a netmask of 255.255.255.0, and the search is the default DNS search domain(s) (note this can be vital for getting automounting to work if your setup just uses machine names and assumes the domain is the same).
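If you are unsure what prefix length matches an old-style netmask, Python's ipaddress module can do the conversion and check that the gateway sits inside the network. The addresses below are placeholders from the documentation range 192.0.2.0/24, not the ones used in the configuration above.

```python
import ipaddress

# Placeholder address in CIDR form, as netplan expects it.
iface = ipaddress.ip_interface("192.0.2.10/24")
print(iface.netmask)   # 255.255.255.0
print(iface.network)   # 192.0.2.0/24

# A gateway outside the interface's network usually means a typo.
gateway = ipaddress.ip_address("192.0.2.1")
print(gateway in iface.network)  # True
```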

Note that if you have a laptop you could put this in a file called, say 02_ethernet_interface.yaml and it should override the first configuration for that interface only. I think. Later configurations override earlier ones.

To test:

netplan try

This applies the configuration and then rolls it back in 120 seconds (by default). Press Enter to accept the new settings.

netplan apply

To apply the changes.

For a desktop I just deleted the first config file.

In theory you could probably use this to generate a configuration for NetworkManager (note that on a server you need to explicitly configure NetworkManager to bring up the interface on boot).

Connecting to NFS shares at boot using fstab in Debian 9 Stretch

Note – this fix in principle should work on most systemd distributions.

Problem – trying to get a Debian 9 system to mount an NFS share at boot. This was declared in /etc/fstab in the normal way, but kept failing on boot. However, once the system was up you could log in and do a mount -a, which would work fine. Reading around, it looks like a case of the system trying to mount before the network is up (and in this case the network should be reliable, as it’s an internal one between a VM and its host…)

Tried using the bg option first, which should mount in the background and come up eventually, but still got an error on boot.

192.168.45.1:/export  /hostshare  nfs  bg,rw,soft  0   0

There is another option that in theory should help: _netdev. I haven’t tried this yet.

What does work is adding an option x-systemd.automount. This, unsurprisingly, tells systemd to try and mount the share on demand. So changing the line in fstab to read:

192.168.45.1:/export  /hostshare  nfs  rw,soft,x-systemd.automount  0   0

works. Booting the system gives no errors (on the console anyway). The share does not show as mounted until the local mountpoint is accessed, and then it works without complaint.
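To check which fstab entries already carry the option, the options field can be split out of each line. A small Python sketch, with fstab_options as a hypothetical helper that handles a single, non-comment fstab line:

```python
def fstab_options(line):
    """Return the mount options (4th field) of one fstab entry as a list."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("not a complete fstab entry: %r" % line)
    return fields[3].split(",")

ENTRY = "192.168.45.1:/export  /hostshare  nfs  rw,soft,x-systemd.automount  0   0"

opts = fstab_options(ENTRY)
print(opts)
print("x-systemd.automount" in opts)  # True
```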

Footnotes

  1. The context for this is a VM running under VirtualBox. The VM is Debian 9, the host is Ubuntu 17.10. The VM has one network interface with the default NAT setup to talk to the outside world, and a second interface to talk to a host-only network. This allows you to SSH into the guest from the host, and also allows this NFS setup. You can use the VirtualBox shared folder setup to transfer files, but I figured as both the host and guest were Linux NFS would be easier (and not require the Guest Extensions to be installed on the guest).
  2. Debian 9 wouldn’t successfully install on the laptop, but I needed it for an easy install of LALSuite (Debian is a reference system for this, Ubuntu isn’t and has dependency issues). Hence this rather complicated setup. Fortunately LALSuite is entirely command line based…
  3. Yes, Docker or similar would be more efficient. I’m not so familiar with it and it’s a bit of a pain to get it talking to the host filesystem. I’d argue that running a full VM is slightly more portable, although you’d need to change how the filesharing is set up on a Windows or Mac host.

Monitoring GPU temperatures with nvidia-smi and Check MK (OMD)

The Nvidia monitoring setup described at https://elwe.rhrk.uni-kl.de/howto/ worked in Check MK 1.2.8, but fails in 1.4. It works again after two modifications to the check script /omd/yoursite/local/share/check_mk/checks/nvidia_smi:

Remove the grouping of nvidia_smi.errors1 and 2 (I can live with this as our GTX1070 doesn’t report this anyway).

Remove the unicode degree characters from the temperature output, as this seems to cause the system to choke on the textual output.

Needed to delete and recreate the host to get it to work properly – possibly unicode characters hanging around in the generated graph definitions or similar?

#!/usr/bin/python
# -*- encoding: utf-8; py-indent-offset: 4 -*-
# +------------------------------------------------------------------+
# |             ____ _               _        __  __ _  __           |
# |            / ___| |__   ___  ___| | __   |  \/  | |/ /           |
# |           | |   | '_ \ / _ \/ __| |/ /   | |\/| | ' /            |
# |           | |___| | | |  __/ (__|   <    | |  | | . \            |
# |            \____|_| |_|\___|\___|_|\_\___|_|  |_|_|\_\           |
# |                                                                  |
# | Copyright Mathias Kettner 2012             mk@mathias-kettner.de |
# +------------------------------------------------------------------+
#
# This file is part of Check_MK.
# The official homepage is at http://mathias-kettner.de/check_mk.
#
# check_mk is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
# the Free Software Foundation in version 2. check_mk is distributed
# in the hope that it will be useful, but WITHOUT ANY WARRANTY; with-
# out even the implied warranty of MERCHANTABILITY or FITNESS FOR A
# PARTICULAR PURPOSE. See the GNU General Public License for more de-
# tails. You should have received a copy of the GNU General Public
# License along with GNU Make; see the file COPYING. If not, write
# to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor,
# Boston, MA 02110-1301 USA.

#######################################
# Check developed by
#######################################
# Dr. Markus Hillenbrand
# University of Kaiserslautern, Germany
# hillenbr@rhrk.uni-kl.de
#######################################
# Tweaked by Jamie Scott
# University of Glasgow
# Jamie.Scott@glasgow.ac.uk
#######################################

# the inventory functions

def inventory_nvidia_smi_fan(info):
    inventory = []
    for line in info:
        if line[2] != 'N/A':
            inventory.append( ("GPU"+line[0], "", None) )
    return inventory

def inventory_nvidia_smi_gpuutil(info):
    inventory = []
    for line in info:
        if line[3] != 'N/A':
            inventory.append( ("GPU"+line[0], "", None) )
    return inventory

def inventory_nvidia_smi_memutil(info):
    inventory = []
    for line in info:
        if line[4] != 'N/A':
            inventory.append( ("GPU"+line[0], "", None) )
    return inventory

def inventory_nvidia_smi_errors1(info):
    inventory = []
    for line in info:
        if line[5] != 'N/A':
            inventory.append( ("GPU"+line[0], "", None) )
    return inventory

def inventory_nvidia_smi_errors2(info):
    inventory = []
    for line in info:
        if line[6] != 'N/A':
            inventory.append( ("GPU"+line[0], "", None) )
    return inventory

def inventory_nvidia_smi_temp(info):
    inventory = []
    for line in info:
        if line[7] != 'N/A':
            inventory.append( ("GPU"+line[0], "", None) )
    return inventory

def inventory_nvidia_smi_power(info):
    inventory = []
    for line in info:
        if line[8] != 'N/A' and line[9] != "N/A":
            inventory.append( ("GPU"+line[0], "", None) )
    return inventory

# the check functions

def check_nvidia_smi_fan(item, params, info):
    for line in info:
        if "GPU"+line[0] == item:
            value = int(line[2])
            perfdata = [('fan', value, 90, 95, 0, 100 )]
            if value > 95:
                return (2, "CRITICAL - %s fan speed is %d%%" % (line[1], value), perfdata)
            elif value > 90:
                return (1, "WARNING - %s fan speed is %d%%" % (line[1], value), perfdata)
            else:
                return (0, "OK - %s fan speed is %d%%" % (line[1], value), perfdata)
    return (3, "UNKNOWN - GPU %s not found in agent output" % item)

def check_nvidia_smi_gpuutil(item, params, info):
    for line in info:
        if "GPU"+line[0] == item:
            value = int(line[3])
            perfdata = [('gpuutil', value, 100, 100, 0, 100 )]
            return (0, "OK - %s utilization is %s%%" % (line[1], value), perfdata)
    return (3, "UNKNOWN - GPU %s not found in agent output" % item)

def check_nvidia_smi_memutil(item, params, info):
    for line in info:
        if "GPU"+line[0] == item:
            value = int(line[4])
            perfdata = [('memutil', value, 100, 100, 0, 100 )]
            if value > 95:
                return (2, "CRITICAL - %s memory utilization is %d%%" % (line[1], value), perfdata)
            elif value > 90:
                return (1, "WARNING - %s memory utilization is %d%%" % (line[1], value), perfdata)
            else:
                return (0, "OK - %s memory utilization is %d%%" % (line[1], value), perfdata)
    return (3, "UNKNOWN - GPU %s not found in agent output" % item)

def check_nvidia_smi_errors1(item, params, info):
    for line in info:
        if "GPU"+line[0] == item:
            value = int(line[5])
            if value > 500:
                return (2, "CRITICAL - %s single bit error counter is %d" % (line[1], value))
            if value > 100:
                return (1, "WARNING - %s single bit error counter is %d" % (line[1], value))
            else:
                return (0, "OK - %s single bit error counter is %d" % (line[1], value))
    return (3, "UNKNOWN - GPU %s not found in agent output" % item)

def check_nvidia_smi_errors2(item, params, info):
    for line in info:
        if "GPU"+line[0] == item:
            value = int(line[6])
            if value > 500:
                return (2, "CRITICAL - %s double bit error counter is %d" % (line[1], value))
            if value > 100:
                return (1, "WARNING - %s double bit error counter is %d" % (line[1], value))
            else:
                return (0, "OK - %s double bit error counter is %d" % (line[1], value))
    return (3, "UNKNOWN - GPU %s not found in agent output" % item)

def check_nvidia_smi_temp(item, params, info):
    for line in info:
        if "GPU"+line[0] == item:
            value = int(line[7])
            perfdata = [('temp', value, 80, 90, 0, 95 )]
            if value > 90:
                return (2, "CRITICAL - %s temperature is %dC" % (line[1], value), perfdata)
            elif value > 80:
                return (1, "WARNING - %s temperature is %dC" % (line[1], value), perfdata)
            else:
                return (0, "OK - %s temperature is %dC" % (line[1], value), perfdata)
    return (3, "UNKNOWN - GPU %s not found in agent output" % item)

def check_nvidia_smi_power(item, params, info):
    for line in info:
        if "GPU"+line[0] == item:
            draw = float(line[8])
            limit = float(line[9])
            value = draw * 100.0 / limit
            perfdata = [('power', draw, limit * 0.8, limit * 0.9, 0, limit )]
            if value > 90:
                return (2, "CRITICAL - %s power utilization is %d%% of %dW" % (line[1], value, limit), perfdata)
            elif value > 80:
                return (1, "WARNING - %s power utilization is %d%% of %dW" % (line[1], value, limit), perfdata)
            else:
                return (0, "OK - %s power utilization is %d%% of %dW" % (line[1], value, limit), perfdata)
    return (3, "UNKNOWN - GPU %s not found in agent output" % item)

# declare the check to Check_MK

check_info['nvidia_smi.fan'] = (check_nvidia_smi_fan, "%s fan speed" , 1, inventory_nvidia_smi_fan)
check_info['nvidia_smi.gpuutil'] = (check_nvidia_smi_gpuutil, "%s utilization" , 1, inventory_nvidia_smi_gpuutil)
check_info['nvidia_smi.memutil'] = (check_nvidia_smi_memutil, "%s memory" , 1, inventory_nvidia_smi_memutil)
#check_info['nvidia_smi.errors1'] = (check_nvidia_smi_errors1, "%s errors single" , 0, inventory_nvidia_smi_errors1)
#check_info['nvidia_smi.errors2'] = (check_nvidia_smi_errors2, "%s errors double" , 0, inventory_nvidia_smi_errors2)
check_info['nvidia_smi.temp'] = (check_nvidia_smi_temp, "%s temperature" , 1, inventory_nvidia_smi_temp)
check_info['nvidia_smi.power'] = (check_nvidia_smi_power, "%s power" , 1, inventory_nvidia_smi_power)

#checkgroup_of['nvidia_smi.errors1'] = 'hw_errors'
#checkgroup_of['nvidia_smi.errors2'] = 'hw_errors'
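
The second modification amounts to keeping the degree sign out of the text that the check returns. If you would rather filter it than edit each message by hand, something along these lines should do. This is a sketch only: strip_degree is a hypothetical helper that is not part of the script above, and the sample message is made up.

```python
# -*- coding: utf-8 -*-

def strip_degree(text):
    """Remove the Unicode degree sign (U+00B0) from a status message."""
    return text.replace(u"\u00b0", u"")

# Made-up status line of the kind the temperature check produces.
message = u"OK - GTX1070 temperature is 61\u00b0C"
print(strip_degree(message))  # OK - GTX1070 temperature is 61C
```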