FQDNs in FlexNet license files

When querying FlexNet licenses with lmutil or similar from systems with a different DNS suffix, make sure that the SERVER line in the license file contains the server's FQDN. If it doesn't, lmutil may complain that the lmgrd process is not running, even though you can run the actual program with the appropriate license.

For example, with a line in the licence file like

SERVER servername 4eca3b4b8326 1055

running

lmutil lmstat -a -c 1055@servername.physics.gla.ac.uk

results in a HOST_NOT_FOUND error. Running the Client ANSLIC_ADMIN Utility gives the same error. However, ANSYS fires up as normal.

To fix this, put the FQDN in the license file (refreshing the licence server afterwards!):

SERVER servername.physics.gla.ac.uk 4eca3b4b8326 1055

The lmutil queries should then work as normal.
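
If you have several license files to look after, a quick check for short hostnames on SERVER lines can save some head-scratching. The following is only a rough sketch: the function name is made up for illustration, and it treats "no dot in the hostname" as "probably not an FQDN", which is a heuristic and nothing FlexNet itself defines.

```python
def short_server_hosts(license_text):
    """Return hostnames on SERVER lines that contain no dot (likely not FQDNs)."""
    hosts = []
    for line in license_text.splitlines():
        fields = line.split()
        # A SERVER line has the form: SERVER <hostname> <hostid> [port]
        if len(fields) >= 3 and fields[0] == "SERVER":
            hostname = fields[1]
            # Heuristic (an assumption): a name without a dot is a short name.
            if "." not in hostname:
                hosts.append(hostname)
    return hosts
```

Feeding it the first example line above returns the bare servername, while the corrected line returns an empty list.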

Monitoring GPU temperatures with nvidia-smi and Check MK (OMD)

In the previous post on this subject we used code from Technische Universität Kaiserslautern to monitor our GPUs using OMD checkmk (now checkmk raw). With some new RTX2080s installed this broke, as the nvidia-smi check doesn’t report anything for ECC errors (rather than 0, as previous cards did). The solution was to remove the ECC checking completely.

The new scripts are:

On the client system in /usr/lib/check_mk_agent/local/ (or plugins/)

#!/bin/bash
# Needs bash (for arrays) and xml_grep.
if which nvidia-smi >/dev/null; then
   echo '<<<nvidia_smi>>>'
   nvidia-smi -q -x > /tmp/.check_mk_nvidia_smi
   cards=$(xml_grep --text_only 'nvidia_smi_log/attached_gpus' /tmp/.check_mk_nvidia_smi | tr -d ' ')
   IFS=$'\n' names=($(xml_grep --text_only 'nvidia_smi_log/gpu/product_name' /tmp/.check_mk_nvidia_smi | tr -d ' '))
   IFS=$'\n' fan_speed=($(xml_grep --text_only 'nvidia_smi_log/gpu/fan_speed' /tmp/.check_mk_nvidia_smi | tr -d ' '))
   IFS=$'\n' gpu_utilization=($(xml_grep --text_only 'nvidia_smi_log/gpu/utilization/gpu_util' /tmp/.check_mk_nvidia_smi | tr -d ' '))
   IFS=$'\n' mem_utilization=($(xml_grep --text_only 'nvidia_smi_log/gpu/utilization/memory_util' /tmp/.check_mk_nvidia_smi | tr -d ' '))
   IFS=$'\n' temperature=($(xml_grep --text_only 'nvidia_smi_log/gpu/temperature/gpu_temp' /tmp/.check_mk_nvidia_smi | tr -d ' '))
   IFS=$'\n' power_draw=($(xml_grep --text_only 'nvidia_smi_log/gpu/power_readings/power_draw' /tmp/.check_mk_nvidia_smi | tr -d ' '))
   IFS=$'\n' power_limit=($(xml_grep --text_only 'nvidia_smi_log/gpu/power_readings/power_limit' /tmp/.check_mk_nvidia_smi | tr -d ' '))

   for i in $(seq 1 $cards) ; do
       index=$(($i - 1))
       fan_speed[$index]=${fan_speed[$index]/\%/}
       gpu_utilization[$index]=${gpu_utilization[$index]/\%/}
       mem_utilization[$index]=${mem_utilization[$index]/\%/}
       temperature[$index]=${temperature[$index]/C/}
       power_draw[$index]=${power_draw[$index]/W/}
       power_limit[$index]=${power_limit[$index]/W/}
       echo "$index ${names[$index]} ${fan_speed[$index]} ${gpu_utilization[$index]} ${mem_utilization[$index]} ${temperature[$index]} ${power_draw[$index]} ${power_limit[$index]}"
   done
fi

Don't forget to make it executable! You also need xml_grep installed.
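
If you would rather not depend on xml_grep, the same fields can be pulled out with Python's standard library. This is a sketch rather than a drop-in replacement: the function name and the FIELDS list are my own, it assumes the XML layout implied by the paths in the script above, and it takes the XML as a string so it can be tried without a GPU.

```python
import xml.etree.ElementTree as ET

# Paths relative to each <gpu> element, in the column order the agent script uses.
FIELDS = [
    "product_name",
    "fan_speed",
    "utilization/gpu_util",
    "utilization/memory_util",
    "temperature/gpu_temp",
    "power_readings/power_draw",
    "power_readings/power_limit",
]


def agent_lines(xml_text):
    """Build one agent output line per GPU from `nvidia-smi -q -x` XML."""
    root = ET.fromstring(xml_text)
    lines = []
    for index, gpu in enumerate(root.findall("gpu")):
        values = []
        for path in FIELDS:
            # Missing or empty elements are reported as N/A.
            text = gpu.findtext(path) or "N/A"
            # Drop spaces so each value stays a single whitespace-separated column.
            text = text.replace(" ", "")
            # Strip the trailing unit, but leave the product name alone.
            if path != "product_name":
                text = text.rstrip("%CW")
            values.append(text)
        lines.append(" ".join([str(index)] + values))
    return lines
```

To use it for real you would print the section header and then the result of agent_lines() applied to the output of nvidia-smi -q -x.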

On the OMD server at /omd/sites/omd_XYZ/local/share/check_mk/checks/


#!/usr/bin/python
# -*- encoding: utf-8; py-indent-offset: 4 -*-
# +------------------------------------------------------------------+
# |             ____ _               _        __  __ _  __           |
# |            / ___| |__   ___  ___| | __   |  \/  | |/ /           |
# |           | |   | '_ \ / _ \/ __| |/ /   | |\/| | ' /            |
# |           | |___| | | |  __/ (__|   <    | |  | | . \            |
# |            \____|_| |_|\___|\___|_|\_\___|_|  |_|_|\_\           |
# |                                                                  |
# | Copyright Mathias Kettner 2012             mk@mathias-kettner.de |
# +------------------------------------------------------------------+
#
# This file is part of Check_MK.
# The official homepage is at http://mathias-kettner.de/check_mk.
#
# check_mk is free software;  you can redistribute it and/or modify it
# under the  terms of the  GNU General Public License  as published by
# the Free Software Foundation in version 2.  check_mk is  distributed
# in the hope that it will be useful, but WITHOUT ANY WARRANTY;  with-
# out even the implied warranty of  MERCHANTABILITY  or  FITNESS FOR A
# PARTICULAR PURPOSE. See the  GNU General Public License for more de-
# tails.  You should have  received  a copy of the  GNU  General Public
# License along with GNU Make; see the file  COPYING.  If  not,  write
# to the Free Software Foundation, Inc., 51 Franklin St,  Fifth Floor,
# Boston, MA 02110-1301 USA.

#######################################
# Check developed by
#######################################
# Dr. Markus Hillenbrand
# University of Kaiserslautern, Germany
# hillenbr@rhrk.uni-kl.de
#######################################

# the inventory functions

def inventory_nvidia_smi_fan(info):
    inventory = []
    for line in info:
        if line[2] != 'N/A':
           inventory.append( ("GPU"+line[0], "", None) )
    return inventory
def inventory_nvidia_smi_gpuutil(info):
    inventory = []
    for line in info:
        if line[3] != 'N/A':
           inventory.append( ("GPU"+line[0], "", None) )
    return inventory
def inventory_nvidia_smi_memutil(info):
    inventory = []
    for line in info:
        if line[4] != 'N/A':
           inventory.append( ("GPU"+line[0], "", None) )
    return inventory
def inventory_nvidia_smi_temp(info):
    inventory = []
    for line in info:
        if line[5] != 'N/A':
           inventory.append( ("GPU"+line[0], "", None) )
    return inventory
def inventory_nvidia_smi_power(info):
    inventory = []
    for line in info:
        if line[6] != 'N/A' and line[7] != "N/A":
           inventory.append( ("GPU"+line[0], "", None) )
    return inventory

# the check functions

def check_nvidia_smi_fan(item, params, info):
    for line in info:
        if "GPU"+line[0] == item:
           value = int(line[2])
           perfdata = [('fan', value, 90, 95, 0, 100 )]
           if value > 95:
              return (2, "CRITICAL - %s fan speed is %d%%" % (line[1], value), perfdata)
           elif value > 90:
              return (1, "WARNING - %s fan speed is %d%%" % (line[1], value), perfdata)
           else:
              return (0, "OK - %s fan speed is %d%%" % (line[1], value), perfdata)
    return (3, "UNKNOWN - GPU %s not found in agent output" % item)

def check_nvidia_smi_gpuutil(item, params, info):
    for line in info:
        if "GPU"+line[0] == item:
           value = int(line[3])
           perfdata = [('gpuutil', value, 100, 100, 0, 100 )]
           return (0, "OK - %s utilization is %s%%" % (line[1], value), perfdata)
    return (3, "UNKNOWN - GPU %s not found in agent output" % item)

def check_nvidia_smi_memutil(item, params, info):
    for line in info:
        if "GPU"+line[0] == item:
           value = int(line[4])
           perfdata = [('memutil', value, 100, 100, 0, 100 )]
           if value > 95:
              return (2, "CRITICAL - %s memory utilization is %d%%" % (line[1], value), perfdata)
           elif value > 90:
              return (1, "WARNING - %s memory utilization is %d%%" % (line[1], value), perfdata)
           else:
              return (0, "OK - %s memory utilization is %d%%" % (line[1], value), perfdata)
    return (3, "UNKNOWN - GPU %s not found in agent output" % item)

def check_nvidia_smi_temp(item, params, info):
    for line in info:
        if "GPU"+line[0] == item:
           value = int(line[5])
           perfdata = [('temp', value, 80, 90, 0, 95 )]
           if value > 90:
              return (2, "CRITICAL - %s temperature is %dC" % (line[1], value), perfdata)
           elif value > 80:
              return (1, "WARNING - %s temperature is %dC" % (line[1], value), perfdata)
           else:
              return (0, "OK - %s temperature is %dC" % (line[1], value), perfdata)
    return (3, "UNKNOWN - GPU %s not found in agent output" % item)

def check_nvidia_smi_power(item, params, info):
    for line in info:
        if "GPU"+line[0] == item:
           draw = float(line[6])
           limit = float(line[7])
           value = draw * 100.0 / limit
           perfdata = [('power', draw, limit * 0.8, limit * 0.9, 0, limit )]
           if value > 90:
              return (2, "CRITICAL - %s power utilization is %d%% of %dW" % (line[1], value, limit), perfdata)
           elif value > 80:
              return (1, "WARNING - %s power utilization is %d%% of %dW" % (line[1], value, limit), perfdata)
           else:
              return (0, "OK - %s power utilization is %d%% of %dW" % (line[1], value, limit), perfdata)
    return (3, "UNKNOWN - GPU %s not found in agent output" % item)

# declare the check to Check_MK

check_info['nvidia_smi.fan']     = (check_nvidia_smi_fan,     "%s fan speed"      , 1, inventory_nvidia_smi_fan)
check_info['nvidia_smi.gpuutil'] = (check_nvidia_smi_gpuutil, "%s utilization"    , 1, inventory_nvidia_smi_gpuutil)
check_info['nvidia_smi.memutil'] = (check_nvidia_smi_memutil, "%s memory"         , 1, inventory_nvidia_smi_memutil)
check_info['nvidia_smi.temp']    = (check_nvidia_smi_temp,    "%s temperature"    , 1, inventory_nvidia_smi_temp)
check_info['nvidia_smi.power']   = (check_nvidia_smi_power,   "%s power"          , 1, inventory_nvidia_smi_power)
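
The fan, memory and temperature checks above all follow the same warn/crit pattern, so if you want to tweak thresholds it may be easier to factor that out. A possible shape is sketched below; classify and check_temp are names invented for this example and are not part of Check_MK or the original check.

```python
def classify(value, warn, crit):
    """Map a reading onto a Nagios-style state code and label."""
    if value > crit:
        return 2, "CRITICAL"
    if value > warn:
        return 1, "WARNING"
    return 0, "OK"


def check_temp(name, value, warn=80, crit=90):
    """Mirror the shape of check_nvidia_smi_temp using the shared helper."""
    state, label = classify(value, warn, crit)
    # Same perfdata layout as above: (name, value, warn, crit, min, max)
    perfdata = [("temp", value, warn, crit, 0, 95)]
    return state, "%s - %s temperature is %dC" % (label, name, value), perfdata
```

As in the original checks, the comparisons are strict, so a reading exactly on the warning threshold is still OK.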

To get the pretty indicators put this in /omd/sites/omd_XYZ/share/check_mk/web/plugins/perfometer/

#!/usr/bin/python

def perfometer_nvidia_smi_fan(row, check_command, perf_data):
    varname, value, unit, warn, crit, minn, maxx = perf_data[0]
    perc_used = 100 * (float(value) / float(maxx))
    perc_free = 100 - float(perc_used)
    return str(value)+" %", '<table><tr>' \
                               + perfometer_td(perc_used, '#0f8') \
                               + perfometer_td(perc_free, '#fff') \
                               + '</tr></table>'
def perfometer_nvidia_smi_gpuutil(row, check_command, perf_data):
    varname, value, unit, warn, crit, minn, maxx = perf_data[0]
    perc_used = 100 * (float(value) / float(maxx))
    perc_free = 100 - float(perc_used)
    return str(value)+" %", '<table><tr>' \
                               + perfometer_td(perc_used, '#0f8') \
                               + perfometer_td(perc_free, '#fff') \
                               + '</tr></table>'
def perfometer_nvidia_smi_memutil(row, check_command, perf_data):
    varname, value, unit, warn, crit, minn, maxx = perf_data[0]
    perc_used = 100 * (float(value) / float(maxx))
    perc_free = 100 - float(perc_used)
    return str(value)+" %", '<table><tr>' \
                               + perfometer_td(perc_used, '#0f8') \
                               + perfometer_td(perc_free, '#fff') \
                               + '</tr></table>'
def perfometer_nvidia_smi_temp(row, check_command, perf_data):
    varname, value, unit, warn, crit, minn, maxx = perf_data[0]
    perc_used = 100 * (float(value) / float(maxx))
    perc_free = 100 - float(perc_used)
    return str(value)+" C", '<table><tr>' \
                               + perfometer_td(perc_used, '#0f8') \
                               + perfometer_td(perc_free, '#fff') \
                               + '</tr></table>'
def perfometer_nvidia_smi_power(row, check_command, perf_data):
    varname, value, unit, warn, crit, minn, maxx = perf_data[0]
    perc_used = 100 * (float(value) / float(maxx))
    perc_free = 100 - float(perc_used)
    return str(value)+" W", '<table><tr>' \
                               + perfometer_td(perc_used, '#0f8') \
                               + perfometer_td(perc_free, '#fff') \
                               + '</tr></table>'

perfometers['check_mk-nvidia_smi.fan']     = perfometer_nvidia_smi_fan
perfometers['check_mk-nvidia_smi.gpuutil'] = perfometer_nvidia_smi_gpuutil
perfometers['check_mk-nvidia_smi.memutil'] = perfometer_nvidia_smi_memutil
perfometers['check_mk-nvidia_smi.temp']    = perfometer_nvidia_smi_temp
perfometers['check_mk-nvidia_smi.power']   = perfometer_nvidia_smi_power
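
The five perfometer functions above differ only in the unit they print, so they could be generated by a small factory instead. This is a sketch under assumptions: make_perfometer is a made-up name, and fake_td is a hypothetical stand-in for Check_MK's perfometer_td so the code can be tried outside the web plugin environment.

```python
def make_perfometer(unit, td):
    """Return a perfometer function that labels its value with `unit`.

    `td` is the cell-drawing helper (perfometer_td inside Check_MK).
    """
    def perfometer(row, check_command, perf_data):
        varname, value, _unit, warn, crit, minn, maxx = perf_data[0]
        perc_used = 100 * (float(value) / float(maxx))
        perc_free = 100 - perc_used
        bar = "<table><tr>" + td(perc_used, "#0f8") + td(perc_free, "#fff") + "</tr></table>"
        return "%s %s" % (value, unit), bar
    return perfometer


def fake_td(percent, colour):
    """Hypothetical stand-in for perfometer_td, only for testing this sketch."""
    return '<td style="width:%d%%;background:%s"></td>' % (percent, colour)
```

Inside the plugin file the registration would then presumably look like perfometers['check_mk-nvidia_smi.temp'] = make_perfometer("C", perfometer_td), though I have not run it that way.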

Notes on setting up Canon PIXMA iX6850 A3 inkjet

Windows driver – the IJ Network Tool does eventually allow you to input an IP address (the Mac version does not). Conveniently our print server was on the same subnet as the printer, so it found it straight away. The driver can be installed on Server 2012 and shared, but cannot be shared as an LPD queue (as Canon don’t use a standard IP port).

Printer does function as an IPP printer, and LPD (if enabled).

On the Mac, use the IP address of the printer – it doesn’t communicate properly with the DNS name.

Cannot print in colour or double-sided using HP Universal driver

If the HP Universal print driver cannot automatically retrieve a printer configuration (say SNMP is blocked on the network or disabled on the device) it defaults to monochrome printing (UPD version 5.0 and newer). This can also lead to long pauses while trying to open the printer properties, as it presumably times out while trying to retrieve the data. On the Device Settings tab, click Device Type: Auto Detect to display the drop-down list and then select Color (and also check the duplex setting).

See HP LaserJet, HP PageWide Enterprise, HP OfficeJet Enterprise – Unable to Print in Color or Unable to Auto-Duplex (2-Sided Printing Fails) after installing the UPD

Cloning Mac disk using external enclosure and Disk Utility

We got an SSD upgrade for a Mac (JetDrive 885) which comes with an enclosure to allow data to be copied to the new drive. This can be done with Disk Utility – as you are cloning the boot drive you need to run this in recovery mode (boot holding Cmd-R). Presumably booting from a USB stick would work as well.

Once in Disk Utility choose the external drive, select Restore and choose to restore from the Macintosh HD. At this point we ran into an issue:

Could not change the partition type for /dev/disk1s1 - error -5342

We eventually found the problem – the new SSD had been configured with an MBR partition table, not GPT. To fix this go to the View menu and choose Show All Devices. Then select the external device at the top level and click Erase. This should allow you to choose between the MBR and GPT partition schemes; pick GPT.

Publishing websites with Jekyll, Apache and SVN

Now that I’ve got this working to some extent, here are some notes about setting up Jekyll with SVN and Apache:

Server – Debian 9 Stretch, normal command-line only install. Set up system to use email server (campus smarthost in our case).

Install SVN and Apache and set up accordingly.

Install Jekyll:

apt install jekyll

Create an SVN repository for the site files.

Create new project directories at a temporary location, e.g.

jekyll new /tmp/newsite

Commit these files to the SVN repository (I normally check out the repository on my local workstation, copy the directory in /tmp from the server into the working directory on the workstation, add them and commit). Delete the directory in /tmp.

On the server, create the actual website file location by exporting from the SVN via a temporary location:

svn export file:///path/to/repository /tmp/buildfiles
jekyll build --source /tmp/buildfiles --destination /var/www/sitename
rm -Rf /tmp/buildfiles

Configure Apache to serve from /var/www/sitename. In our case we ultimately wanted to serve multiple sites through a reverse proxy, so we used a vhost serving on an alternate port. This can be a handy testing configuration – you don’t have to worry about fiddling with the other website settings. For example, using port 8081:

<VirtualHost *:8081>

    ServerAdmin webmaster@localhost
    DocumentRoot /var/www/sitename

</VirtualHost>

(Remember to change ports.conf to listen on the new port!)

Test by pointing a web browser at server:8081

Once that’s all working, set up the post-commit hook script to automatically build the site on a commit. Our current setup is:

#!/bin/sh

# POST-COMMIT HOOK

REPOS="$1"
REV="$2"
REPOS_BASENAME=$(/usr/bin/basename "$REPOS")
TMP_SVN_EXPORT="/tmp/$REPOS_BASENAME"

# These two need configured!
PUBLIC_WWW="/var/www/sitename"
BUILD_EMAIL="your.email@this.address"

"$REPOS"/hooks/mailer.py commit "$REPOS" $REV "$REPOS"/hooks/mailer.conf

LOGVAR=$(/export0/svn_config/jekyll_build.sh "$REPOS" $REV "$TMP_SVN_EXPORT" "$PUBLIC_WWW" 2>&1)

echo "$LOGVAR" | /usr/bin/unix2dos | mail -s "$REPOS_BASENAME build $REV" "$BUILD_EMAIL"

(Note that on Debian you need to install the dos2unix package to get unix2dos. It is needed as plain text email expects CRLF line terminators, as specified in RFC 2822.)

The build script called above, /export0/svn_config/jekyll_build.sh, is:

#!/bin/sh

REPOS="$1"
REV="$2"
TMP_SVN_EXPORT="$3"
PUBLIC_WWW="$4"

/usr/bin/svn export --quiet file:///"$REPOS" "$TMP_SVN_EXPORT"
/usr/bin/jekyll build --source "$TMP_SVN_EXPORT" --destination "$PUBLIC_WWW"
rm -Rf "$TMP_SVN_EXPORT"
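
For comparison, the same export, build and clean-up sequence could be written in Python, which makes it a bit easier to stop on errors while still removing the temporary export. This is a sketch only: build_commands and run_build are invented names, and it assumes the repository path is absolute so that prefixing file:// gives a valid URL.

```python
import shutil
import subprocess


def build_commands(repos, tmp_export, public_www):
    """Return the export and build commands as argument lists."""
    return [
        # Assumes repos is an absolute path, giving file:///path/to/repository
        ["/usr/bin/svn", "export", "--quiet", "file://" + repos, tmp_export],
        ["/usr/bin/jekyll", "build", "--source", tmp_export, "--destination", public_www],
    ]


def run_build(repos, tmp_export, public_www):
    """Run each step in order, stopping at the first failure."""
    try:
        for argv in build_commands(repos, tmp_export, public_www):
            subprocess.run(argv, check=True)
    finally:
        # Always remove the temporary export, even if a step failed.
        shutil.rmtree(tmp_export, ignore_errors=True)
```

Unlike the shell version, this does not attempt the Jekyll build if the export fails.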

Note that the build process runs under the Apache user account, so set permissions appropriately. Also, when troubleshooting remember that on Debian 9 the Apache process is configured by default to use a private /tmp directory!

This works for our current needs, although it isn’t optimised. Improvements would be:

  • Unify the setup for the commit email and build email scripts.
  • Build the site in the background (although you’d have to tweak how the logging output works in that case).

Of course, the professionals would use something like a combination of GitLab and Jenkins to automate this stuff properly…

 

Private /tmp directories in Debian 9 Stretch with Apache

In Debian 9 Stretch Apache is configured to use systemd’s PrivateTmp feature by default. This means that the Apache tmp directory actually lives in /tmp/systemd-private-BIGLONGSTRING-apache2.service-STRING.

So if you are running an SVN server that uses Apache for serving, anything written to /tmp in the hook scripts ends up in the private directory rather than the normal userspace one.
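
When troubleshooting it helps to find that directory quickly. Below is a small sketch that picks the Apache entry out of a directory listing; the function name is made up, and the trailing /tmp component reflects my understanding that systemd puts the service's view of /tmp in a tmp subdirectory of the private directory, so treat that as an assumption.

```python
import fnmatch


def apache_private_tmp(entries, base="/tmp"):
    """Return candidate private tmp paths for apache2 from a directory listing."""
    pattern = "systemd-private-*-apache2.service-*"
    return [
        # Assumption: the files live in a tmp/ subdirectory of the private dir.
        "%s/%s/tmp" % (base, name)
        for name in sorted(entries)
        if fnmatch.fnmatchcase(name, pattern)
    ]
```

Run as root, something like apache_private_tmp(os.listdir("/tmp")) should then point you at the files written by the hook scripts.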

Dell Latitude 7490 freezing when unplugging USB3 WD15 dock

Setup – Dell Latitude 7490 running Ubuntu 18.04 Bionic and Dell WD15 USB-C dock.

Problem – system freezes when dock unplugged.

This problem started after updates. The solution found was to revert to the previous kernel (4.15.0-43-generic) from 4.15.0-44-generic. Did this by setting GRUB to remember the boot setting – change /etc/default/grub with:

GRUB_DEFAULT=saved
GRUB_SAVEDEFAULT=true

and run

update-grub

Then hit Esc at the loading screen to get to the GRUB menu and choose the older kernel. GRUB will then remember that choice on later boots.
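
If you manage a few of these laptops, a quick way to confirm the two settings are in place is to parse the GRUB defaults file. The sketch below works on the file contents as a string; both function names are invented for this example, and it only understands simple KEY=value lines.

```python
def grub_settings(text):
    """Parse simple KEY=value lines, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Values may be quoted; drop surrounding quotes.
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def remembers_last_choice(text):
    """True if GRUB is set to boot whatever was selected last time."""
    s = grub_settings(text)
    return s.get("GRUB_DEFAULT") == "saved" and s.get("GRUB_SAVEDEFAULT") == "true"
```

You would pass it the contents of the GRUB defaults file, for example open("/etc/default/grub").read().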