This Week (or Two) in OpenNMS, Friday February 27th

By RangerRick, on February 27th, 2009

So I'm a slacker. But I have an excuse, really! I was having so much fun working on JSPs last Friday that I totally forgot to write TWiO. OK, it's JSPs, so I guess I'm lying about having fun, but I was really almost done with a new feature. 😉

Anyways, since I've gotten a little sloppy, I'm going to make it up to you with a feature on the new Provisiond, which is shaping up quite nicely. And to show my shame, I've dubbed this week's article: This Week (or Two) in OpenNMS.

But, to begin... What's been going on for the last two weeks?

Project Updates

[Screenshot: New Node Page]

Stable: Current Release is 1.6.2

1.6.2 is still the current release, and while there are a few fixes pending since its release, there are no immediate plans for a 1.6.3.
Unstable: Current Release is 1.7.0

Commits have been speeding by on trunk as Provisiond works its way into a feature-complete state. I'm still hoping we'll get a 1.7.1 release out soon so people can give it a shot, but, well, we have to stop to breathe first. In the meantime, feel free to try the nightly snapshots (if it's not on a production system, of course).
Trunk: RANCID Updates

Antonio and Guglielmo have been furiously working away on the RANCID integration, focusing for the most part on authentication and the UI.
Trunk: SNMP Refactoring

Matt has spent some time refactoring some of the SNMP data so that it lives in the snmpInterface table where it belongs, which means no more 0.0.0.0 IPs where they don't belong. This cleans up our model a bit to be more sane. (A bit.)
Trunk: Provisiond Updates

Provisiond is nearing feature-complete, and is definitely at a point where it can be tried out on real-world networks. See below for a feature on Provisiond.
Trunk: Node Page Updates

Donald and Matt have both been working on updating the node page to be more dynamic. It's pretty slick now, offering tabs detailing various information related to the node.
Branch: Inventory Management

Matt Raykowski continued work on his inventory daemon. I'm still waiting for a more detailed update on what it does/will do; I'll try to get one before next week's TWiO.

Provisiond: An Overview

[Screenshots: Provisioning Groups: UI; Provisioning Groups: Edit Requisition; Provisioning Groups: Edit Foreign Source]

A History Lesson

Architecturally, OpenNMS is very sound. It was written from the beginning with scalability in mind, by people who had been dealing with network management for their entire careers. However, Java, as a language (and as an environment), has grown immensely in power, reliability, and expressiveness since we started this project. Gone are the heady days of requiring IBM's 1.3 JDK because it was the only one that handled large numbers of open ports gracefully.

That does not mean all remnants of those days are gone, however.

Over the years, most of OpenNMS has gone through an architectural upgrade, under the guidance of Matt Brozowski. Whole subsystems were cleaned up, rewritten, and/or modernized. Three glaring exceptions remained, until recently: Capsd (the capability scanner), Notifd (the notification engine), and the web UI (I don't know if it's web 1.0 or web 0.9, but well... it needs an upgrade). Thanks to the recent implementation of Provisiond, I'm happy to say we only have two glaring exceptions left to the wonderful architectural upgrades OpenNMS has gone through. Notifd: I'm coming for you next!

Capsd! Huugh! Good God, Y'all! What Is It Good For?

OpenNMS has three distinct phases when it comes to how it discovers and works with nodes:

Discovery (What's Out There?)

Discovery is the beginning of the life of a node. OpenNMS does a ping sweep across the range of addresses specified in your configuration, and asks, "Did it respond?" If the answer is "yes," discovery shouts into the event bus, "HEY! I suspect I might have found a new node!" Discovery is not actually a requirement of OpenNMS; it is entirely possible to use OpenNMS without doing discovery, either by sending your own newSuspect event (through a script, or the "Add Interface" link in the admin UI), or by using the model importer.
Capability Scanning (What Does It Do?)

Capsd, the capability scanning daemon, listens for newSuspect events, and says, "Hmm, what is this IP address capable of?" It does service scans (based on the capsd plugins defined), checks for SNMP, and does some other things to determine whether it's a new node or part of another node, and what services are publicly listening.
Monitoring (Is It Still Doing It?)

Once a node has been discovered and capability scanned, it goes into the queue to be monitored, based on the poller-configuration. At that point it becomes a part of the normal poll/data-collection/thresholding/notification cycle that you know and love.
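
The hand-off between the three phases can be sketched in a few lines. This is only a toy model in Python, not OpenNMS code: the `EventBus` class, the probe functions, and the fake `alive` network are all made up for illustration. The only name taken from the text above is the `newSuspect` event.

```python
from collections import defaultdict


class EventBus:
    """Tiny in-process publish/subscribe bus (hypothetical, for illustration)."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, event_name, handler):
        self._listeners[event_name].append(handler)

    def publish(self, event_name, ip_addr):
        for handler in self._listeners[event_name]:
            handler(ip_addr)


def discover(bus, candidates, responds):
    """Phase 1: 'ping' each address and announce the ones that answer."""
    for ip_addr in candidates:
        if responds(ip_addr):
            bus.publish("newSuspect", ip_addr)


def make_capability_scanner(probes, monitored):
    """Phase 2: on newSuspect, record which services the address offers."""
    def on_new_suspect(ip_addr):
        services = [name for name, probe in probes.items() if probe(ip_addr)]
        monitored[ip_addr] = services  # Phase 3 would poll these services
    return on_new_suspect


# Fake network: which addresses answer, and which services they run.
alive = {"10.0.0.1": {"ICMP", "SNMP"}, "10.0.0.3": {"ICMP"}}
probes = {svc: (lambda ip, svc=svc: svc in alive.get(ip, set()))
          for svc in ("ICMP", "SNMP", "HTTP")}

bus = EventBus()
monitored = {}
bus.subscribe("newSuspect", make_capability_scanner(probes, monitored))
discover(bus, ["10.0.0.1", "10.0.0.2", "10.0.0.3"], lambda ip: ip in alive)
```

Because the scanner only cares about the event, anything that publishes `newSuspect` (a script, a UI link) can stand in for the ping sweep, which is why discovery is optional.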

The Model Importer

Some of our users are integrating OpenNMS into larger systems, or want tighter control over what nodes OpenNMS is aware of than can be provided by a ping sweep and Capsd scan. For that, we wrote the model importer, or "provisioner."

To use the model importer, you would create an import definition XML file, which describes the nodes, their interfaces, and the services that you want to manage directly. Most folks who are doing this either export from some other internal database of managed nodes, or use the Manage Provisioning Groups UI and custom-build a set of nodes directly. You hit the import button, and voilà! Your database is populated with the exact set of nodes, interfaces, and services you specified.

This is very powerful, but also very rigid. It's pretty much an all-or-nothing affair; there's no room for discovering extra information about provisioned nodes.
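
For folks exporting from another database, generating that file is mostly a matter of walking your own records. Here is a rough Python sketch using the standard library's `xml.etree.ElementTree`. The element and attribute names (`model-import`, `foreign-source`, `foreign-id`, `node-label`, `ip-addr`, `monitored-service`) are assumptions for illustration, and so are the sample node and the `build_import` helper. Check the schema that ships with your OpenNMS version before relying on any of them.

```python
import xml.etree.ElementTree as ET


def build_import(foreign_source, nodes):
    """Build an import-definition tree from plain dictionaries.

    Element and attribute names are assumed, not taken from the schema.
    """
    root = ET.Element("model-import", {"foreign-source": foreign_source})
    for node in nodes:
        node_el = ET.SubElement(root, "node", {
            "foreign-id": node["foreign_id"],
            "node-label": node["label"],
        })
        for ip_addr, services in node["interfaces"].items():
            iface_el = ET.SubElement(node_el, "interface", {"ip-addr": ip_addr})
            for service in services:
                ET.SubElement(iface_el, "monitored-service",
                              {"service-name": service})
    return root


# Hypothetical inventory record exported from some other system.
nodes = [{
    "foreign_id": "rtr-001",
    "label": "core-router",
    "interfaces": {"192.0.2.1": ["ICMP", "SNMP"]},
}]

root = build_import("datacenter", nodes)
xml_text = ET.tostring(root, encoding="unicode")
```

Note that nothing here is discovered: the file lists exactly what gets imported, which is the rigidity described above.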

What Does Provisiond Do?

Provisiond replaces the functionality of Capsd, and replaces and expands upon the old model importer: it defines not only the nodes and interfaces you wish to monitor, but also how we should behave when importing these nodes. It has a pluggable architecture which allows for defining policies such as, "scan the imported node, and if it's in this IP address range, automatically put it in the routers category," and "scan the specified interface, automatically discover the SNMP interfaces, and if they are non-IP, don't bother putting them in the database."
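
To make those two example policies concrete, here is a minimal sketch of the idea in Python. It is not the Provisiond API (real policies are Java classes): `ScannedNode`, `ScannedInterface`, `categorize_by_range`, `drop_non_ip_interfaces`, and `apply_policies` are all hypothetical names, and the address range is invented.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class ScannedInterface:
    name: str
    ip_addr: Optional[str] = None  # None models a non-IP SNMP interface


@dataclass
class ScannedNode:
    label: str
    interfaces: List[ScannedInterface]
    categories: Set[str] = field(default_factory=set)


def categorize_by_range(cidr, category):
    """Policy factory: tag nodes that have an address inside `cidr`."""
    network = ipaddress.ip_network(cidr)

    def policy(node):
        if any(i.ip_addr is not None
               and ipaddress.ip_address(i.ip_addr) in network
               for i in node.interfaces):
            node.categories.add(category)
        return node
    return policy


def drop_non_ip_interfaces(node):
    """Policy: don't persist interfaces that have no IP address."""
    node.interfaces = [i for i in node.interfaces if i.ip_addr is not None]
    return node


def apply_policies(node, policies):
    """Run each policy in order over the scanned node."""
    for policy in policies:
        node = policy(node)
    return node


node = ScannedNode("core-router", [
    ScannedInterface("eth0", "10.1.0.5"),
    ScannedInterface("Null0"),
])
result = apply_policies(node, [
    categorize_by_range("10.1.0.0/16", "Routers"),
    drop_non_ip_interfaces,
])
```

Each policy is just a function from scanned data to (possibly modified) scanned data, so adding new business logic means adding one more entry to the list.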

Architecturally, Provisiond is chopped up into a number of subsystems:

The Provisioner

This is the meat of Provisiond; it's responsible for managing a provisioning "lifecycle," which includes scheduling scans, listening for events, and kicking off the various parts of scanning and writing to the database. Matt has written a very powerful engine for doing these various tasks, which I suspect will move into other parts of the code as we refactor them in the future. There's a lot more flexibility in determining when things will be scanned than in Capsd's main loop.
Detectors

The detectors are the replacements for Capsd scanners, and they've received an upgrade over their Capsd counterparts as well. They've been rewritten with an architecture that allows them to run asynchronously, so service scans which take a while don't have to bottle up the scanning of other nodes. Additionally, the SNMP scanning has received an upgrade that lets us react to bulk gets as soon as a complete "row" of data is received, so node sub-interfaces discovered through SNMP will start showing up sooner.
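
The benefit of asynchronous detectors is easy to see in miniature. The sketch below uses Python's `asyncio` purely as an illustration of the idea; it says nothing about how Provisiond implements it. The `detect` coroutine is a stand-in for a real probe, and the service names, addresses, and delays are invented.

```python
import asyncio


async def detect(service, ip_addr, delay, present):
    """Stand-in for a real service probe; `delay` simulates a slow scan."""
    await asyncio.sleep(delay)
    return ip_addr, service, present


async def scan(targets):
    """Start every probe at once and collect results as each one finishes."""
    tasks = [asyncio.create_task(detect(*target)) for target in targets]
    finished = []
    for next_done in asyncio.as_completed(tasks):
        finished.append(await next_done)
    return finished


targets = [
    # (service, address, simulated delay in seconds, is the service there?)
    ("SlowService", "10.0.0.1", 0.2, False),
    ("ICMP", "10.0.0.2", 0.01, True),
    ("SNMP", "10.0.0.2", 0.05, True),
]
results = asyncio.run(scan(targets))
```

Even though the slow probe was started first, the two quick probes against the other node report back without waiting for it, so the slow scan finishes last instead of holding up the queue.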
Policies

Policies act as filters on the data as it's coming in. We've got a few simple ones in the first release, but since they're written against a very simple, pluggable API, it's easy for you to define your own business logic by creating an incredibly simple class and dropping it in the classpath.

A Bit More on the Model

The existing model importer had the concept of a "provisioning group," which is a collection of node/interface/etc. definitions, and was configured through the Manage Provisioning Groups UI. Each provisioning group had a "foreign source" which defined a discrete collection of nodes, so that if you removed a node from that foreign source, it would be deleted from the database as well.
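
The "remove it from the foreign source and it disappears from the database" behavior boils down to a set difference scoped to one foreign source. Here is a minimal Python sketch under that assumption. The dictionary standing in for the database, the `reconcile` function, and the sample IDs are all hypothetical; the real importer works against the OpenNMS database, not a dict.

```python
def reconcile(database, foreign_source, requisition):
    """Make the entries for one foreign source match the requisition exactly.

    `database` maps (foreign_source, foreign_id) -> node label.
    `requisition` maps foreign_id -> node label for this foreign source.
    Returns the foreign IDs that were added and removed.
    """
    existing = {fid for (src, fid) in database if src == foreign_source}
    wanted = set(requisition)

    # Nodes that vanished from the requisition are deleted...
    for fid in existing - wanted:
        del database[(foreign_source, fid)]
    # ...and everything listed is created or updated.
    for fid in wanted:
        database[(foreign_source, fid)] = requisition[fid]

    return sorted(wanted - existing), sorted(existing - wanted)


database = {
    ("datacenter", "rtr-001"): "core-router",
    ("datacenter", "sw-007"): "old-switch",
    ("branch", "rtr-900"): "branch-router",
}
requisition = {"rtr-001": "core-router", "fw-002": "edge-firewall"}

added, removed = reconcile(database, "datacenter", requisition)
```

Nodes that belong to a different foreign source (here, `branch`) are left alone, which is what makes each foreign source a discrete collection.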

In Provisiond, the foreign source is no longer just a tag that serves as a unique identifier; it's a definition of the behavior of the provisioner (i.e., it defines the detectors and policies). To that end, we've renamed the generic "provisioning group" to a Requisition. There is then a separate configuration for the Foreign Source, which defines the policies that describe how the requisition is imported.

What Works?

At this point, almost everything. The only thing missing is the part that listens for newSuspects, so it doesn't completely replace Capsd yet. However, barring that, all of the functionality of Capsd is essentially implemented in Provisiond at this point.

How Do I Try It?

Grab a 1.7.1 snapshot or build from source, and fire it up. Provisiond is enabled by default now, and you can go to the Manage Provisioning Groups link in the UI to try it out. Did I mention this is unstable software? Really, it's unstable software. Don't do this on a production system! That said, it's cool. =)

Upcoming Events

March 14th, 2009: OpenNMS User Conference Europe 2009 will be held in Frankfurt am Main, Germany.
April 6th-10th, 2009: OpenNMS training will be available through The OpenNMS Group at the OpenNMS training facility in Pittsboro, NC.
June 14th-19th, 2009: OpenNMS Dev-Jam 2009, the annual OpenNMS developers conference, will be in Minneapolis-St. Paul this year.

If you have anything to add to the events list, or you wish to be a Dev-Jam sponsor, please let me know.

I'm Beat

Man, it was a lot of work to write that up. I'm so tired, I just might miss TWiO next week. We'll just have to see. 😉

As always, if you wish to berate me publicly, feel free to leave a comment on my blog, and if you wish to berate me privately, feel free to send me an e-mail. Until next time!
