Holiday Busy-ness

Well, it’s the holidays, and as such there are a million other little projects getting in the way of hobby jobs and tinkering, so here are a couple of the things we’ve been working on.

First, we made snow out of a few parts from the hardware store, the big air compressor and a pressure washer:

We had to get our Christmas lights hung up:

Then we had to fix an outlet on the front of the house where we connect the Christmas lights, as it had gone flaky:

And finally we did actually get to work on the RV a bit and replace the door lock:

Bad kitchen plumbing

So this is an issue that has been around since we bought the house, and frankly I should have fixed it a long time ago, but here we are… It is one of those low items on the list, as it only presents an issue every year or so, and clearing it each time takes significantly less time than a permanent fix would. At any rate, I made a video so that people can understand that if you add a garbage disposal, you should take care to understand your drain elevations, so you know to either expect this issue or, far preferably, avoid it.

OpenNMS / PagerDuty

With the release of OpenNMS 30, we found that the PagerDuty plugin was broken.

Issue #9 was opened on Jul 20 to address the error:

Error executing command: Unable to resolve root: missing requirement [root] osgi.identity; osgi.identity=opennms-plugins-pagerduty; type=karaf.feature; version="[0.1.3,0.1.3]"; filter:="(&(osgi.identity=opennms-plugins-pagerduty)(type=karaf.feature)(version>=0.1.3)(version<=0.1.3))" [caused by: Unable to resolve opennms-plugins-pagerduty/0.1.3: missing requirement [opennms-plugins-pagerduty/0.1.3] osgi.identity; osgi.identity=org.opennms.plugins.pagerduty-plugin; type=osgi.bundle; version="[0.1.3,0.1.3]"; resolution:=mandatory [caused by: Unable to resolve org.opennms.plugins.pagerduty-plugin/0.1.3: missing requirement [org.opennms.plugins.pagerduty-plugin/0.1.3] osgi.wiring.package; filter:="(&(osgi.wiring.package=org.opennms.integration.api.v1.alarms)(version>=0.5.0)(!(version>=1.0.0)))"]]
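The last clause of that error is the useful part: plugin 0.1.3 imports the package org.opennms.integration.api.v1.alarms at version 0.5.0 or later, but below 1.0.0. As a rough illustration of how that range behaves, the Python sketch below evaluates the same filter against a few made-up version numbers. The function names and the candidate versions are hypothetical, and the error itself does not say which version Horizon 30 actually exports.

```python
def parse_version(text):
    """Turn '0.5.0' into a comparable tuple of ints: (0, 5, 0)."""
    return tuple(int(part) for part in text.split("."))


def satisfies_plugin_requirement(exported):
    """Mirror the filter (&(version>=0.5.0)(!(version>=1.0.0))) from the error."""
    v = parse_version(exported)
    return v >= parse_version("0.5.0") and not v >= parse_version("1.0.0")


# Hypothetical exported versions, for illustration only.
for candidate in ("0.4.9", "0.5.0", "0.6.1", "1.0.0"):
    print(candidate, satisfies_plugin_requirement(candidate))
```

Anything at 1.0.0 or above falls outside the range, which would leave the bundle unresolvable in exactly the way the message describes.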

The devs did update the master tree a few weeks ago to accommodate the changes for OpenNMS 30; however, it still does not build. I have forked the project and made one small change, and it now compiles and works.

Scylla / OpenNMS NewTS? – Use a REDIS cache

We noticed that our larger hosts, specifically PoP routers with thousands of interfaces, were graphing resources only intermittently. This seemed strange, since the Scylla backend we use with NewTS has gobs of resources, and the Horizon server is also fine resource-wise. Well, as it turns out, implementing the REDIS cache for OpenNMS/Horizon makes a world of difference.

In our case we went from ~3600 queries/sec against the Scylla cluster to ~450 queries/sec, roughly an eightfold drop, and all graphing gaps went away. Viewing resource graphs got faster too. It would appear that the internal cache in Horizon may just not be powerful enough, and it is not very efficient compared to REDIS.

OpenNMS Email Notifications

Lately I’ve been facing an issue with some hosts that are only supposed to trigger a notification after a 10 minute outage. They are LTE devices, so they drop a lot, and we really don’t need to know about it until they’ve been down for a while; then we can take action.

Well, we tried to exclude these from our catch-all nodeDown filter. All our LTE devices are in RFC1918 10.10.X.X space, so we created a filter on our global rule to exclude them:

!(IPADDR IPLIKE 10.10.*.*)

The inverse of this (IPADDR IPLIKE 10.10.*.*) works just fine on our 10m delay filter.

Well, it turns out that even if you are not monitoring an IP, if that IP exists on the host, the filter will try to match it. This is a big problem: it means you either have to explicitly list all the IPs that you want to EXCLUDE, or this code needs to be changed to only look at monitored IPs. I think an isManaged filter should be added to the SQL query. At any rate, if you’re hitting mystery notifications that don’t show up in the validate list, that is why.
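To make the failure mode concrete, here is a small Python model of the behaviour described above. It is a simplified sketch, not OpenNMS code: the iplike helper only handles literal octets and `*`, the node and its addresses are hypothetical, and managed_only stands in for the isManaged check suggested above.

```python
def iplike(address, pattern):
    """Simplified IPLIKE: compare octet by octet, '*' matches anything.

    The real operator accepts more pattern forms; they are left out here.
    """
    return all(p == "*" or p == a
               for a, p in zip(address.split("."), pattern.split(".")))


def rule_matches_node(interfaces, predicate, managed_only=False):
    """A node is selected if ANY of its considered interfaces satisfies the predicate."""
    return any(predicate(ip) for ip, managed in interfaces
               if managed or not managed_only)


# Hypothetical LTE node: one monitored 10.10.x.x address plus an
# unmonitored address that merely exists on the device.
lte_node = [("10.10.4.20", True), ("192.168.1.1", False)]


def not_lte(ip):
    """Stand-in for !(IPADDR IPLIKE 10.10.*.*)."""
    return not iplike(ip, "10.10.*.*")


print(rule_matches_node(lte_node, not_lte))                     # True
print(rule_matches_node(lte_node, not_lte, managed_only=True))  # False
```

The first call selects the node because of the unmonitored 192.168.1.1 address, so the catch-all rule still fires. Restricting the check to monitored addresses gives the result we actually wanted.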

OpenNMS Horizon 30 Update / 503 / Karaf Failure

When upgrading to OpenNMS Horizon 30, you may find that even after following standard upgrade procedures it will produce an HTTP/503 for you. Meaning Jetty started but… Karaf is dead. This appears to be another *finger guns* gotcha by the OpenNMS team for the non-paid product.

You must MANUALLY edit your config.properties file in opennms/etc to update the reference to Felix, which was upgraded from 6.0.4 to 6.0.5:

config.properties:karaf.framework.felix=mvn\:org.apache.felix/org.apache.felix.framework/6.0.4

to

config.properties:karaf.framework.felix=mvn\:org.apache.felix/org.apache.felix.framework/6.0.5
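If you would rather script the change than edit by hand, something like the following Python sketch would do it. It assumes the `config.properties:` prefix shown above is just the file name from grep output and not part of the line itself. The function name bump_felix and the shortened sample text are made up for illustration. The sketch only transforms a string, so reading and writing the real file (after taking a backup) is left to you.

```python
import re

OLD, NEW = "6.0.4", "6.0.5"
KEY = "karaf.framework.felix"


def bump_felix(text, old=OLD, new=NEW):
    """Rewrite only the karaf.framework.felix line, leaving other lines untouched."""
    pattern = re.compile(
        r"^(" + re.escape(KEY) + r"\s*=.*org\.apache\.felix\.framework/)"
        + re.escape(old) + r"[ \t]*$",
        re.MULTILINE,
    )
    return pattern.sub(lambda m: m.group(1) + new, text)


# Shortened stand-in for the contents of opennms/etc/config.properties.
sample = (
    "karaf.framework=felix\n"
    "karaf.framework.felix=mvn\\:org.apache.felix/org.apache.felix.framework/6.0.4\n"
)
print(bump_felix(sample))
```

Anchoring the pattern on the key means other lines that happen to mention 6.0.4 are left alone, and running it against a file that already says 6.0.5 changes nothing.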