OpenNMS: MIB Management Through the GUI

Two of the main strengths of OpenNMS as a network management application platform are the fact that it writes to a database and that most of the configuration is done via XML.

The database is sometimes invaluable in addressing a complex problem. XML isn’t the easiest format to use, but since it can be parsed by both humans and machines, the idea has always been to eventually have a configuration user interface for every file. That way you can choose whichever method works best for you.

I like to tell a story of a trip I took to Australia, where the client handed me a list of names, usernames, e-mail addresses and phone numbers for over 50 users and asked me to put them in OpenNMS. Rather than struggle to enter them via the GUI (with some hard-to-catch transposition errors to be expected), I was able to save the list as a comma-separated values (CSV) file and then use a script to convert that to XML. I just dropped that into the users.xml configuration file and was done.
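For the curious, a conversion like that can be sketched in a few lines of Python. This is not the script I used on that trip. The CSV column names and the XML element and attribute names below (user-id, full-name, contact) are assumptions for illustration, so check them against the users.xml shipped with your OpenNMS version before relying on them.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_users_xml(csv_text):
    """Convert CSV rows (name, username, email, phone) into a <users> XML fragment.

    The element and attribute names are assumptions, not a verified schema.
    """
    users = ET.Element("users")
    for row in csv.DictReader(io.StringIO(csv_text)):
        user = ET.SubElement(users, "user")
        ET.SubElement(user, "user-id").text = row["username"].strip()
        ET.SubElement(user, "full-name").text = row["name"].strip()
        # One <contact> element per contact method (assumed layout).
        ET.SubElement(user, "contact", type="email", info=row["email"].strip())
        ET.SubElement(user, "contact", type="workPhone", info=row["phone"].strip())
    return ET.tostring(users, encoding="unicode")


# Made-up sample data standing in for the client's list.
SAMPLE = """name,username,email,phone
Jane Doe,jdoe,jdoe@example.com,555-0100
John Roe,jroe,jroe@example.com,555-0101
"""

if __name__ == "__main__":
    print(csv_to_users_xml(SAMPLE))
```

Because the script reads each value exactly once from the source list, the transposition errors you would expect from retyping 50 entries by hand simply don't happen.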

But some things can be a bit laborious to do by hand and a decent user interface would make things that much easier. One such task is dealing with SNMP MIB files for both events and data collection configuration. So I was very excited when I saw this video that Alejandro put together on his new UI feature.

Check it out. It runs a little over ten minutes and does a great job of demonstrating the functionality. Hats off to Alejandro and the rest of the team for pulling this together. You can find it in the 1.11 releases and it will be included in the next stable version of OpenNMS due in 2013.

OpenNMS: Topology Views

I haven’t been blogging much lately, mainly because I’ve been on the road a lot. It doesn’t look like this is going to change much for the rest of the year, but I am hoping to share more about what is going on in OpenNMS-land.

For the next stable release of OpenNMS, we are aiming to greatly improve aspects of the user interface. We have settled on a technology called Vaadin which should make the UI more like a desktop client than a web page.

A lot of that effort has been focused on a new topology view, i.e. a map. I’ve gone on record before as someone who really hates maps. It goes back to 1999, when I was deployed at a client site and spent hours each day tweaking HP OpenView maps. While the pointy-haired boss types really seem to like maps, they tend to have limited usefulness in actually diagnosing network issues, and quite frequently the effort that goes into maintaining the map isn’t worth it.

So the problem we were trying to solve is: how to make a map that is both useful and maintainable?

Leveraging the fact that OpenNMS is an application platform, Matt wrote a complex piece of code we call the “topology provider”. The idea was that different people need to see the network in different ways, so rather than limit things to one view, why not create a generic topology provider that can be customized to fit those needs? You could look at the network from a Layer 2 and/or Layer 3 perspective (i.e. most maps), or you could choose a geographical view, or a business function view, etc.
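To make the idea a little more concrete, here is a minimal sketch of what “one interface, many views” could look like. It is written in Python purely for illustration and is not the actual OpenNMS topology provider API. Every class and method name here is hypothetical.

```python
from abc import ABC, abstractmethod


class TopologyProvider(ABC):
    """Hypothetical interface: each provider exposes one way of seeing the network."""

    @abstractmethod
    def vertices(self):
        """Return the things to draw."""

    @abstractmethod
    def edges(self):
        """Return (source, target) pairs connecting the vertices."""


class Layer2Provider(TopologyProvider):
    """Builds the view from discovered links between devices."""

    def __init__(self, links):
        self._links = list(links)

    def vertices(self):
        return sorted({node for link in self._links for node in link})

    def edges(self):
        return list(self._links)


class SiteProvider(TopologyProvider):
    """Builds a geographical view: every node hangs off the site it lives in."""

    def __init__(self, node_sites):
        self._node_sites = dict(node_sites)

    def vertices(self):
        return sorted(set(self._node_sites) | set(self._node_sites.values()))

    def edges(self):
        return sorted((site, node) for node, site in self._node_sites.items())


def render(provider):
    """The UI only talks to the interface, so any provider can be plugged in."""
    return {"vertices": provider.vertices(), "edges": provider.edges()}


if __name__ == "__main__":
    print(render(Layer2Provider([("switch1", "router1"), ("switch1", "server1")])))
    print(render(SiteProvider({"router1": "Fulda", "server1": "Pittsboro"})))
```

The point of the design is that the map code never needs to know where the vertices and edges came from, which is what lets a new view be added without touching the UI.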

The development team in Fulda, Germany, ran with this and decided to create a VMware vSphere view. With this topology provider, vSphere is queried in order to gather a layout of virtual machines, both host systems and guests, as well as their network attached storage.

One nice feature of the new topology view is what we are calling “semantic zooming”. While you can zoom around the map like you are used to, you can also zoom in and out from a semantic perspective. For example, here is a top-level view of three separate vSphere instances.

As you zoom in, you get to see the main host machines as well as other network details:

Zooming in farther brings up a much greater level of detail:

And then you can use traditional zooming to zero in on a machine that is having issues:

The new topology view also provides context sensitive menus. In this case you can bring up the ability to SSH into the system that is having problems:

This is a web-based SSH client that proxies through the OpenNMS system, so you can reach devices on your network even if you are accessing OpenNMS over a VPN or NAT (of course, you can remove this if you think it poses a security risk).

This is available in the current 1.11 development (unstable) branch of OpenNMS, and we are improving it every day. This is just one of the changes we hope will improve the usability of OpenNMS. I’ll post another one tomorrow.

Zipcar

I travel a lot, and one thing I really hate to do is to rent a car. My usual trip is a week or so at a client site deploying OpenNMS, and outside of work and dinner I pretty much just live in the hotel, so there isn’t much driving around that I need to do. However, at certain sites you just can’t avoid driving so I just suck it up and pay something like $5 a mile for the privilege.

Enter Zipcar.

A couple of years ago I was planning on being in Portland, Oregon, probably for OSCON. Portland has great public transportation, so I hate renting a car when I go there, but I wanted to visit a client across the river in Vancouver, Washington. So I checked out the Zipcar service. Zipcar lets you rent cars by the hour. It includes fuel and insurance (with a $750 deductible) and it is supposed to be pretty easy to pick up and drop off. I figured I’d rent a car for a few hours, drive over and drive back, all for less hassle than a usual car rental.

So I signed up. It costs $50 a year for a membership and then rentals are an hourly rate on top of that. For whatever reason, I didn’t use Zipcar on that trip, but I’ve kept my membership current and I travel with my card.

Last week I was in Chicago. I love Chicago, especially since I can get around pretty easily on public transit. They have a great subway system supplemented with lots of bus service. However, I ended up having to go to Loves Park, Illinois, not realizing it is practically in Wisconsin. I pretty much decided that I couldn’t make the trip until I remembered Zipcar.

I was staying on the Magnificent Mile, and the Zipcar website told me that there was a location a couple of blocks away at the Four Seasons hotel. I booked a car for the afternoon, setting up my reservation on-line (I later downloaded the Android app) and when the time came I used the detailed instructions on the website to find the parking garage and my car.

The Zipcar membership card uses RFID, and there is a sensor mounted to the driver’s side windshield of the car. You use your card to both unlock and lock the vehicle, as the key is supposed to stay with the car at all times (mine was attached to the dash using a retractable pull string). Once I got settled in the car, I used the access card in a pocket in the visor (right next to a gas charge card) to exit the garage.

The trip to Loves Park was uneventful, and it would have been impossible to get there on public transit.

Included in the car rental is fuel. The Zipcar philosophy is all about sharing and courtesy, so you are supposed to return the car with at least a quarter of a tank of fuel. I estimated that I would have at least that much left after I returned to Chicago, but I wanted to try out their gas purchasing system, so I filled the tank in Loves Park (I also figured the cost would be less than in Chicago proper).

Each Zipcar comes with a gas card. It works like a credit card, but when you use it the pump asks you for the odometer reading and your membership (or “driver”) number. I’m not sure if this is something unique to Zipcar or if this is a common system usable by any motor fleet, but as soon as I entered the information I was able to fill the tank.

Oh, you might be wondering what type of car Zipcar provided to me. It was a BMW 328i.

I was requested to use premium gasoline.

Some extra nice touches: there was a micro-USB charger in the glovebox, which I ended up using since I needed my phone to act as a GPS (sorry iPhone users – there was no plug for you). There was also a pretty decent mixed CD in the CD player (burned on a CD-ROM). I don’t know if they come with all Zipcars or someone just left them there, but I did the same to keep within the Zipcar spirit.

The only downside was that there was something a little off with the alignment and there was a pronounced vibration while driving. It was very noticeable in the 55-65 mph range, and I dutifully tested a variety of speeds (strictly as a matter of science) but it never fully went away. I can’t seem to find any way to report this to Zipcar, however. While I seem to remember seeing something on my phone during the reservation, I was trying so hard to get the car back in time for the next renter that I didn’t explore it at the time, and now the option is gone. I did manage to get back under the 180 mile/day limit, clocking in at 174.

Outside of the mileage limit (which won’t affect most renters) the other thing to watch out for is that there is no refund if you return the car early, and while I am not sure of the penalties, the website hints that it is very uncool to return a car late, especially if someone has a reservation after you.

Overall the experience was so nice that I’m thinking that NASDAQ:ZIP at $6.50 is probably a deal. It was way more convenient than any car rental I have done in my life, and that is actually saying something.

Oh, are any of my three readers going to the LISA conference in San Diego in December? I won’t be at the conference but I plan to be in the city, and I’ll be using Zipcar to get around if anyone wants to get together.

[UPDATE: I got a surprise call this morning from Nef, the Chicago Zipcar fleet manager. He apologized for the alignment issues I was experiencing and even gave me a credit against future rentals.

What I love about this is that a) someone found this post, b) cared enough to read it and then c) bothered to look up my phone number and call me. I’m definitely looking forward to my next opportunity to use their service]

Ever Wonder What Your Support Dollar Buys?

One of the hardest parts of our business is justifying the purchase of support and services for our open source product. Shouldn’t it “just work”? If I have a bunch of smart people working for me, shouldn’t I be able to figure this out on my own?

The issue is that with a product as powerful and complex as OpenNMS, quite often it is something outside of the application that is causing the problem. For example, I am visiting a large telecom provider this week and we spent part of my time figuring out a complex issue with syslogs. It had nothing to do with OpenNMS and everything to do with their various devices each sending in logs with different formats (sometimes two or more formats from the same type of equipment). I doubt any solution without the flexibility of OpenNMS would have been able to solve it, and the customer told me today “we couldn’t have gotten this to work without you”.
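To give a feel for the kind of problem this was, here is a small sketch of the general technique: try a list of patterns in order and take the first one that matches. It is written in Python for illustration and is not the OpenNMS syslog configuration we built on site. The two patterns are simplified approximations of the RFC 3164 and RFC 5424 header layouts, and the sample messages are made up.

```python
import re

# Ordered list of (format name, pattern); the first match wins.
# Both patterns are simplified assumptions, not complete RFC grammars.
PATTERNS = [
    ("rfc5424", re.compile(
        r"^<(?P<pri>\d{1,3})>1 (?P<timestamp>\S+) (?P<host>\S+) (?P<message>.*)$")),
    ("rfc3164", re.compile(
        r"^<(?P<pri>\d{1,3})>(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
        r"(?P<host>\S+) (?P<message>.*)$")),
]


def parse_syslog(line):
    """Return a dict of fields for the first pattern that matches the line."""
    for name, pattern in PATTERNS:
        match = pattern.match(line)
        if match:
            fields = match.groupdict()
            fields["format"] = name
            # PRI encodes facility * 8 + severity.
            fields["facility"], fields["severity"] = divmod(int(fields.pop("pri")), 8)
            return fields
    # Nothing matched: keep the raw line so it is not silently lost.
    return {"format": "unknown", "message": line}


if __name__ == "__main__":
    print(parse_syslog("<34>Oct 11 22:14:15 switch1 su: login failed"))
    print(parse_syslog("<165>1 2012-10-17T19:20:40Z router1 link down"))
```

The real work is never the code itself but finding out which devices send which format, and that is exactly the part that takes experience.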

Also, today Jeff was dealing with a support issue with another one of our clients. Their large provisioning import was never completing. He dug around and posted this reply:

Watching the logs with Provisiond turned up to DEBUG, I noticed a single pattern accounting for nearly all the messages in provisiond.log:

2012-10-17 19:20:40,087 DEBUG [DefaultUDPTransportMapping_127.0.0.1/0]SingleInstanceTracker: Requesting oid following: .1.3.6.1.2.1.1.1
2012-10-17 19:20:40,087 DEBUG [DefaultUDPTransportMapping_127.0.0.1/0]SingleInstanceTracker: Requesting oid following: .1.3.6.1.2.1.1.2
2012-10-17 19:20:40,087 DEBUG [DefaultUDPTransportMapping_127.0.0.1/0]SingleInstanceTracker: Requesting oid following: .1.3.6.1.2.1.1.3
2012-10-17 19:20:40,087 DEBUG [DefaultUDPTransportMapping_127.0.0.1/0]SingleInstanceTracker: Requesting oid following: .1.3.6.1.2.1.1.4
2012-10-17 19:20:40,087 DEBUG [DefaultUDPTransportMapping_127.0.0.1/0]SingleInstanceTracker: Requesting oid following: .1.3.6.1.2.1.1.5
2012-10-17 19:20:40,087 DEBUG [DefaultUDPTransportMapping_127.0.0.1/0]SingleInstanceTracker: Requesting oid following: .1.3.6.1.2.1.1.6
2012-10-17 19:20:40,087 DEBUG [DefaultUDPTransportMapping_127.0.0.1/0]SnmpWalker: Sending tracker pdu of size 6
2012-10-17 19:20:40,090 DEBUG [DefaultUDPTransportMapping_127.0.0.1/0]SnmpWalker: Received a tracker PDU of type RESPONSE from /172.22.66.210 of size 0, errorStatus = 0, errorStatusText = Success, errorIndex = 0

It’s the same six objects being requested over and over from ten hosts, all of which appear to be Eltek Valere power units. See there how the size of the RESPONSE PDU is listed as zero? That seemed odd, so I captured some of the SNMP traffic and loaded up the dump into Wireshark.

These devices are replying to our BULK-GETs with response PDUs containing no varbinds, but also indicating no errors, which is silly and seems to send our SnmpWalker class into an infinite loop. You can reproduce this problem using the Net-SNMP snmpbulkget utility:

snmpbulkget --verbose -v2c -c public -Cn6 -Cr1 172.22.66.106 .1.3.6.1.2.1.1.1 .1.3.6.1.2.1.1.2 .1.3.6.1.2.1.1.3 .1.3.6.1.2.1.1.4 .1.3.6.1.2.1.1.5 .1.3.6.1.2.1.1.6

Falling back to SNMPv1 and GET-NEXT seems to elicit a valid response. So I’ve done that for these ten nodes.
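The failure mode Jeff describes is easy to picture with a toy walker. The sketch below is in Python and is not the OpenNMS SnmpWalker class. The agents are fake in-memory functions, it walks one GET-NEXT at a time rather than using BULK-GET, and all the names are hypothetical. It only shows why a “success” response with no varbinds gives a walker nothing to advance on, and one way to guard against it.

```python
def oid_key(oid):
    """Turn '.1.3.6.1' into a tuple of ints so OIDs compare numerically."""
    return tuple(int(part) for part in oid.strip(".").split("."))


def guarded_walk(send_getnext, base_oid, max_requests=100):
    """Walk the subtree under base_oid, stopping if the agent makes no progress.

    send_getnext(oid) is a stand-in transport returning a list of (oid, value).
    """
    base = oid_key(base_oid)
    current = base_oid
    collected = []
    for _ in range(max_requests):
        varbinds = send_getnext(current)
        if not varbinds:
            # No error, but no varbinds either: a naive loop would re-send
            # the same request forever, so give up on this agent instead.
            return collected, "empty-response"
        next_oid, value = varbinds[0]
        key = oid_key(next_oid)
        if key[:len(base)] != base:
            return collected, "end-of-subtree"
        if key <= oid_key(current):
            return collected, "oid-not-increasing"
        collected.append((next_oid, value))
        current = next_oid
    return collected, "request-limit"


# Fake agents for illustration only.
TABLE = [
    (".1.3.6.1.2.1.1.1.0", "descr"),
    (".1.3.6.1.2.1.1.2.0", "objid"),
    (".1.3.6.1.2.1.2.1.0", "ifNumber"),
]


def healthy_agent(oid):
    """Return the first entry that sorts after the requested OID."""
    for entry in TABLE:
        if oid_key(entry[0]) > oid_key(oid):
            return [entry]
    return []


def broken_agent(oid):
    """Mimics the misbehaving devices: always an empty, error-free response."""
    return []


if __name__ == "__main__":
    print(guarded_walk(healthy_agent, ".1.3.6.1.2.1.1"))
    print(guarded_walk(broken_agent, ".1.3.6.1.2.1.1"))
```

A request budget like max_requests is a blunt second line of defense, but it guarantees the walk ends even when an agent misbehaves in a way nobody anticipated.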

This is an example of why we at The OpenNMS Group only hire highly experienced people – how long would it have taken the average person to figure out that out of 7000+ devices these ten were the culprits, as well as coming up with a workaround?

This customer has a bunch of talented people working for them, but they are focused on that client’s business and aren’t as expert on OpenNMS as we are. They might have figured this out, but it could have taken days. Outside of the salaries for that time, the business would suffer since the solution wouldn’t be working. This case on its own probably justified the cost of support.

Now someone might say that a commercial product wouldn’t have suffered from this problem. I find that hard to believe, as any device that abuses the standard to this degree would cause problems for any application. Plus, commercial vendors view support as a cost center, not a revenue stream, and the chance that you would have gotten someone knowledgeable on the first try is slim. So you are back to wasting time, and time is money.