OpenNMS / PagerDuty

With the release of OpenNMS 30, we found that the PagerDuty plugin was broken.

Issue #9 was opened on Jul 20 to address the error:

Error executing command: Unable to resolve root: missing requirement [root] osgi.identity; osgi.identity=opennms-plugins-pagerduty; type=karaf.feature; version="[0.1.3,0.1.3]"; filter:="(&(osgi.identity=opennms-plugins-pagerduty)(type=karaf.feature)(version>=0.1.3)(version<=0.1.3))" [caused by: Unable to resolve opennms-plugins-pagerduty/0.1.3: missing requirement [opennms-plugins-pagerduty/0.1.3] osgi.identity; osgi.identity=org.opennms.plugins.pagerduty-plugin; type=osgi.bundle; version="[0.1.3,0.1.3]"; resolution:=mandatory [caused by: Unable to resolve org.opennms.plugins.pagerduty-plugin/0.1.3: missing requirement [org.opennms.plugins.pagerduty-plugin/0.1.3] osgi.wiring.package; filter:="(&(osgi.wiring.package=org.opennms.integration.api.v1.alarms)(version>=0.5.0)(!(version>=1.0.0)))"]]

The devs did update the master tree a few weeks ago to accommodate the changes for OpenNMS 30, however it still does not build. I have forked the project and made one small change and it compiles and works.

Scylla / OpenNMS NewTS? – Use a REDIS cache

We noticed that our larger hosts, specifically PoP routers with thousands of interfaces were having intermittent resource graphing. This seemed strange since we have a Scylla backend we are using with NewTS that has gobs of resources. The Horizon server is also fine resource wise, well as it turns out, implementing the REDIS cache for OpenNMS/Horizon makes a world of difference.

In our case we went from ~3600/sec queries against the Scylla cluster to ~450/sec and all graphing gaps went away. Also viewing resource graphs got faster. It would appear that the internal Cache in Horizon may just not be powerful enough and is not very efficient when compared to REDIS.

OpenNMS Email Notifications

Lately I’ve been facing an issue where some hosts that are only supposed to be notified after a 10 minute outage, due to them being LTE devices, they drop a lot, and we really don’t need to know about it until they’ve been down for a while, then we can take action.

Well we tried to exclude these from our catch-all nodeDown filter, all our LTE devices are in RFC1918 10.10.X.X space, so we created a filter on our global rule to exclude them:

!(IPADDR IPLIKE 10.10.*.*)

The inverse of this (IPADDR IPLIKE 10.10.*.*) works just fine on our 10m delay filter.

Well it turns out even if you are not monitoring an IP, and that IP exists on the host, it will try and match it. This is a big problem, it means you have to either explicitly list all the IPs that you want to EXCLUDE or, this code needs to be changed to only look at monitored IPs. I think a isManaged filter should be added on the SQL query. At any rate, if you’re hitting mystery notification, that don’t show up in the validate list, that is why.

OpenNMS Horizon 30 Update / 503 / Karaf Failure

When upgrading to OpenNMS Horizon 30, you may find that even after following standard upgrade procedures it will produce a HTTP/503 for you. Meaning Jetty started but… Karaf is dead. This appears to be another *finger guns* gotcha by the OpenNMS team for the non-paid product.

You must MANUALLY update your config.properties file in opennms/etc to update the reference for Felix that was upgraded from 6.0.4 to 6.0.5;

config.properties:karaf.framework.felix=mvn\:org.apache.felix/org.apache.felix.framework/6.0.4

to

config.properties:karaf.framework.felix=mvn\:org.apache.felix/org.apache.felix.framework/6.0.5

Wiki.JS docker-compose w/ postgres persistent storage via NFS and Traefik

Here’s an example of our docker-file for Wiki.JS with NFS DB storage, Postgres and Traefik;

version: "3"
volumes:
  db-data:
      driver_opts:
        type: "nfs"
        o: addr=nfshost.example.com,nolock,soft,rw
        device: ":/mnt/Pool0/WikiJS"
services:
  db:
    image: postgres:11-alpine
    environment:
      POSTGRES_DB: wiki
      POSTGRES_PASSWORD: Supersecurepassword
      POSTGRES_USER: wikijs
    command: postgres -c listen_addresses='*'
    logging:
      driver: "none"
    restart: unless-stopped
    networks:
      - internal
    labels:
     - traefik.enable=false
    volumes:
      - type: volume
        source: db-data
        target: /var/lib/postgresql/data
        volume:
            nocopy: true


  wiki:
    image: ghcr.io/requarks/wiki:2
    depends_on:
      - db
    environment:
      DB_TYPE: postgres
      DB_HOST: db
      DB_PORT: 5432
      DB_USER: wikijs
      DB_PASS: Supersecurepassword
      DB_NAME: wiki
    restart: unless-stopped
    networks:
      - proxy
      - internal
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=proxy"
      - "traefik.http.routers.ex-wikijs.entrypoints=https"
      - "traefik.http.routers.ex-wikijs.rule=Host(`wikijs.example.com`)"
      - "traefik.http.services.ex-wikijs.loadbalancer.server.port=3000"

networks:
  proxy:
    external: true
  internal:
    external: false

This stands up a postgres instance using the NFS mount as storage, allows the internal network to connect to it, stands up a wiki.js instance and gets it all going for you. All behind a Traefik proxy.

LED Rope Lighting / Alexa

I’ve had a few people ask me about this so, I figured I would write an article on what I used to do it.

Parts List:

Zigbee/Zwave Controller: I utilize a SmartThings hub ( now Aeotec ) but you can use anything that is compatible with Zigbee/Zwave if you already have a hub.

RGBW Controller: I used a monoprice 136511 but it looks like maybe they aren’t making these any more or are sold out? Any RGBW controller that is Z-Wave or Zigbee will work, just make sure it supports enough amperage to power your LED strips, depending on length, this can be up to 20 Amps.

LED Power Supply: Now this depends on your LED length and voltage, I used 12VDC 16′ light sections from Amazon for the big ceiling, these are three of those strips connected together. As such I need quite a bit of amperage at 12VDC, so I chose a hard wired PSU which I added a cooling fan to the top of.

For our shorter runs, we used standard power brick style PSUs. Although you could use this style for both, it has enough power capability.

LED Strips: I chose generic RGBWW ( Warm White ) LED strips that are pretty cheap with an adhesive backing to them.

Installation:

Installation is fairly simple if you hare hand and know basic electrical skills.

Power supply to the RGBW controller, controller to the LED light strip(s).

Generally, once the controller is powered the hub you have will see it and you’ll be able to adopt it and it basically just works from there, I’m not going to go into detail on this step because the variations depending upon your devices is limitless but, as long as they speak the same protocol(s): Z-Wave or Zigbee, you should have no issues.

We used multiple controllers, one for each set of LEDs so that they could be controlled individually, with things like SmartThings and Alexa you can group them so if you want them to change together they can, or they can be told to change independently, the flexibility is yours.

We have fantastic ceilings that have a great place to hide the lights behind some crown moulding, and we couldn’t be happier with the results, this is 2-years into these being in service and being used every day and they all still work perfectly:

The green around the fan is one controller, and one adjoined long set of RGBWW LEDs.

The red around the TV and above it are another set of two LED strips connected with a longer extension but controlled by the same controller.

The white gap on the left above the fireplace is our dining room on another controller. We also have started adding them underneath our cabinets in the kitchen although that project has stalled a bit lol.