Scylla / OpenNMS NewTS? Use a Redis cache

We noticed that our larger hosts, specifically PoP routers with thousands of interfaces, had intermittent gaps in their resource graphs. This seemed strange, since the Scylla backend we are using with NewTS has gobs of resources, and the Horizon server is also fine resource-wise. As it turns out, implementing the Redis cache for OpenNMS/Horizon makes a world of difference.

In our case we went from ~3600 queries/sec against the Scylla cluster to ~450/sec, all of the graphing gaps went away, and viewing resource graphs got faster. It would appear that the internal cache in Horizon just isn't powerful or efficient enough compared to Redis.
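For reference, enabling it is just a couple of properties dropped into a file under opennms/etc/opennms.properties.d/. This is a minimal sketch based on the OpenNMS Newts documentation – the file name is arbitrary and the hostname is a placeholder, so double-check the property names against your Horizon version:

# opennms.properties.d/newts-redis.properties (file name is up to you)
# Swap the in-process Newts resource metadata cache for Redis
org.opennms.newts.config.cache.strategy=org.opennms.netmgt.newts.support.RedisResourceMetadataCache
org.opennms.newts.config.cache.redis_hostname=redis.example.com
org.opennms.newts.config.cache.redis_port=6379

Restart OpenNMS after making the change.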

OpenNMS Email Notifications

Lately I’ve been facing an issue with hosts that are only supposed to generate a notification after a 10-minute outage. They are LTE devices, they drop a lot, and we really don’t need to know about it until they’ve been down for a while; then we can take action.

Well, we tried to exclude these from our catch-all nodeDown filter. All of our LTE devices are in RFC 1918 10.10.X.X space, so we created a filter on our global rule to exclude them:

!(IPADDR IPLIKE 10.10.*.*)

The inverse of this, (IPADDR IPLIKE 10.10.*.*), works just fine on our 10-minute-delay filter.
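For context, these expressions live in the <rule> element of each notification in notifications.xml. A rough sketch of what the catch-all looks like with the exclusion in place (the destinationPath name here is made up):

<notification name="nodeDown" status="on">
    <uei>uei.opennms.org/nodes/nodeDown</uei>
    <rule>!(IPADDR IPLIKE 10.10.*.*)</rule>
    <destinationPath>Email-NOC</destinationPath>
    <text-message>Node %nodelabel% is down.</text-message>
    <subject>nodeDown: %nodelabel%</subject>
</notification>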

Well, it turns out that even if you are not monitoring an IP, if that IP exists on the host the rule will still try to match it. This is a big problem: it means you either have to explicitly list all of the IPs that you want to EXCLUDE, or this code needs to be changed to only look at monitored IPs. I think an isManaged filter should be added to the SQL query. At any rate, if you’re hitting mystery notifications that don’t show up in the validate list, that is why.

OpenNMS Horizon 30 Update / 503 / Karaf Failure

When upgrading to OpenNMS Horizon 30, you may find that even after following the standard upgrade procedures it produces an HTTP 503. Meaning Jetty started but… Karaf is dead. This appears to be another *finger guns* gotcha from the OpenNMS team for the non-paid product.

You must MANUALLY update your config.properties file in opennms/etc to update the reference to Felix, which was upgraded from 6.0.4 to 6.0.5:

config.properties:karaf.framework.felix=mvn\:org.apache.felix/org.apache.felix.framework/6.0.4

to

config.properties:karaf.framework.felix=mvn\:org.apache.felix/org.apache.felix.framework/6.0.5

Wiki.JS docker-compose w/ postgres persistent storage via NFS and Traefik

Here’s an example of our docker-compose file for Wiki.JS with NFS DB storage, Postgres, and Traefik:

version: "3"
volumes:
  db-data:
      driver_opts:
        type: "nfs"
        o: addr=nfshost.example.com,nolock,soft,rw
        device: ":/mnt/Pool0/WikiJS"
services:
  db:
    image: postgres:11-alpine
    environment:
      POSTGRES_DB: wiki
      POSTGRES_PASSWORD: Supersecurepassword
      POSTGRES_USER: wikijs
    command: postgres -c listen_addresses='*'
    logging:
      driver: "none"
    restart: unless-stopped
    networks:
      - internal
    labels:
      - traefik.enable=false
    volumes:
      - type: volume
        source: db-data
        target: /var/lib/postgresql/data
        volume:
            nocopy: true


  wiki:
    image: ghcr.io/requarks/wiki:2
    depends_on:
      - db
    environment:
      DB_TYPE: postgres
      DB_HOST: db
      DB_PORT: 5432
      DB_USER: wikijs
      DB_PASS: Supersecurepassword
      DB_NAME: wiki
    restart: unless-stopped
    networks:
      - proxy
      - internal
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=proxy"
      - "traefik.http.routers.ex-wikijs.entrypoints=https"
      - "traefik.http.routers.ex-wikijs.rule=Host(`wikijs.example.com`)"
      - "traefik.http.services.ex-wikijs.loadbalancer.server.port=3000"

networks:
  proxy:
    external: true
  internal:
    external: false

This stands up a Postgres instance using the NFS mount as storage, allows the internal network to connect to it, stands up a Wiki.js instance, and gets it all going for you, all behind a Traefik proxy.
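One assumption baked into those labels is a Traefik v2 instance with an entrypoint named https. A minimal sketch of the matching static config (traefik.yml), with TLS details omitted:

entryPoints:
  https:
    address: ":443"

The proxy network is declared external as well, so it has to exist before you bring the stack up (docker network create proxy).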

LED Rope Lighting / Alexa

I’ve had a few people ask me about this, so I figured I would write an article on what I used to do it.

Parts List:

Zigbee/Z-Wave Controller: I utilize a SmartThings hub (now Aeotec), but if you already have a hub, anything that is compatible with Zigbee/Z-Wave will do.

RGBW Controller: I used a Monoprice 136511, but it looks like they may not be making these anymore, or they are sold out. Any RGBW controller that is Z-Wave or Zigbee will work; just make sure it supports enough amperage to power your LED strips, which, depending on length, can be up to 20 amps.

LED Power Supply: This depends on your LED length and voltage. I used 12VDC 16′ light sections from Amazon for the big ceiling; that run is three of those strips connected together. As such, I need quite a bit of amperage at 12VDC, so I chose a hard-wired PSU, to the top of which I added a cooling fan.

For our shorter runs, we used standard power-brick-style PSUs, although you could use the hard-wired style for both, since it has plenty of power capability.

LED Strips: I chose generic RGBWW (warm white) LED strips, which are pretty cheap and have an adhesive backing.

Installation:

Installation is fairly simple if you are handy and have basic electrical skills.

Wire the power supply to the RGBW controller, and the controller to the LED light strip(s).

Generally, once the controller is powered, your hub will see it, you’ll be able to adopt it, and it basically just works from there. I’m not going to go into detail on this step because the variations between devices are limitless, but as long as they speak the same protocol, Z-Wave or Zigbee, you should have no issues.

We used multiple controllers, one for each set of LEDs, so that they could be controlled individually. With things like SmartThings and Alexa you can group them, so they can be told to change together or independently; the flexibility is yours.

We have fantastic ceilings with a great place to hide the lights behind some crown moulding, and we couldn’t be happier with the results. Two years into these being in service and used every day, they all still work perfectly:

The green around the fan is one controller, and one adjoined long set of RGBWW LEDs.

The red around and above the TV is another set of two LED strips, connected with a longer extension but controlled by a single controller.

The white gap on the left above the fireplace is our dining room, which is on another controller. We’ve also started adding them underneath the cabinets in the kitchen, although that project has stalled a bit, lol.

Rev.IO and Netbox…

Rev.IO has kind of a terrible asset management interface (and they’ve killed their WWW subdomain without a redirect… but that’s another rant), but we chose it for its ability to handle MSP billing, so while it’s not ideal, it’s something my team has to work with.

So our first task was taking in all of the inventory that was in Rev.IO, which has been transitioned into being our asset management platform as well; since inventory is already tracked in Rev.IO for billing purposes, it does not make sense to use another platform given our small volume. The issue lies in how Rev.IO does its asset management: you can add multiple sites, but you cannot associate an inventory item with a physical location. This is where Netbox comes in for us. Since we have all of our physical locations in Netbox, we can associate an asset tag ID and asset ID from Rev.IO with a physical location or customer.

Here is what I did, with Python, to make this work:

First, we have to ensure that we import all of our customers from Rev.IO into Netbox, and right away we hit an issue with Rev.IO. Their documentation indicates that the ALL flag will get you customers in every status – OPEN, CLOSED, PENDING, etc. This is untrue; the ALL flag returns 0 records, so you must run each status individually to get them all.

We created a custom_field in Netbox called revio that holds the customer_id from Rev.IO, to allow pivoting on that ID.
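If you’d rather create that custom field via the API than the UI, something like this should do it. This is a sketch assuming a NetBox 3.x API (newer releases renamed content_types to object_types), with placeholder URL and token:

import requests

netbox = "https://netbox.example.com/api/"            # placeholder Netbox URL
netbox_h = {"Authorization": "Token <netbox-token>"}  # placeholder API token

# Define a 'revio' integer custom field on tenants to hold the Rev.IO customer_id
cf = {
    "name": "revio",
    "label": "Rev.IO Customer ID",
    "type": "integer",
    "content_types": ["tenancy.tenant"],
}
r = requests.post(netbox + "extras/custom-fields/", json=cf, headers=netbox_h)
print(r.status_code, r.text)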

#!/usr/bin/python
import re
import requests

# API endpoints and auth headers -- the values below are placeholders
url = "https://restapi.rev.io/v1/"                    # your Rev.IO REST base URL
headers = {"Authorization": "Basic <revio-credentials>"}
netbox = "https://netbox.example.com/api/"            # your Netbox API base URL
netbox_h = {"Authorization": "Token <netbox-token>"}

# Get the customer list from Rev.IO and ensure they are all in Netbox.
# Remember: run this once per status (OPEN, CLOSED, PENDING, ...),
# since the ALL flag returns nothing.
r_parms = {"search.page_size": "100000", "search.status": "OPEN"}
response = requests.request("GET", url + "Customers", headers=headers, params=r_parms)
netbox_r = requests.request("GET", netbox + "tenancy/tenants/?limit=10000", headers=netbox_h)

# Read the JSON from both responses
data = response.json()
netbox_d = netbox_r.json()

# Iterate over the Rev.IO customers and add any that Netbox is missing
customercount = 0
for i in data['records']:
        customercount += 1
        revio_id = i['customer_id']
        netbox_hasit = False
        for n in netbox_d['results']:
                if n['custom_fields']['revio'] == revio_id:
                        netbox_hasit = True
                        break
        if not netbox_hasit:
                print("No Netbox entry for " + i['service_address']['company_name'] + " - " + str(revio_id) + " - Adding it")
                netbox_name = i['service_address']['company_name']
                # Strip characters that Netbox won't accept in a slug
                netbox_slug = re.sub('[!@#$\'\".,&()]', '', netbox_name)
                netbox_slug = netbox_slug.replace("/", "-")
                netbox_p = {'name': netbox_slug, 'slug': netbox_slug.replace(" ", "-"), 'custom_fields': {'revio': revio_id}}
                sc = requests.post(netbox + "tenancy/tenants/", json=netbox_p, headers=netbox_h)
                print(sc.text)

The above will check Netbox for existing customers (tenants) that have a matching custom_field value and, for any that are missing, add them. Again, you have to change the search.status parameter from OPEN to CLOSED, etc., to get everyone.
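Rather than rerunning the script by hand for each status, you can also loop over them in the script itself; a minimal sketch of the same query using the statuses mentioned above:

# Work around the broken ALL flag by querying each status in turn
for status in ("OPEN", "CLOSED", "PENDING"):
        r_parms = {"search.page_size": "100000", "search.status": status}
        response = requests.request("GET", url + "Customers", headers=headers, params=r_parms)
        # ...then run the same Netbox check/add logic as above...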

More to follow in a later post.