Using an XML Parser

October 9th, 2014

You know when the XML nerds say not to use regular expressions to parse XML? They’re right.

As part of a “less is more” project, we wanted to remove the alarm-data tags from all of the OpenNMS event files. We spent much of the morning trying a number of ways to find those tags and replace them with empty space, and we failed. We came close a couple of times, but then some weird aspect of formatting (tags that spanned multiple lines, some with spaces and some without, etc.) would foil it.

Then I found out about xmlstarlet. We installed it and ran:

xml ed -L -d "/events/event/alarm-data" [filename]

and it just worked. Pipe that bad boy through find and you are good to go.
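If xmlstarlet isn’t available, the same idea (use a real parser, not a regex) can be sketched in Python with the standard library. This is only a rough sketch and not what we ran: the function names are made up for illustration, and it matches on the local tag name so that it works whether or not the file declares a default namespace.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def strip_alarm_data(xml_text: str) -> str:
    """Return xml_text with every <alarm-data> element removed."""
    root = ET.fromstring(xml_text)
    for parent in list(root.iter()):
        for child in list(parent):
            # Compare local names so a default namespace does not matter.
            if isinstance(child.tag, str) and child.tag.rsplit("}", 1)[-1] == "alarm-data":
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


def strip_tree(directory: str) -> None:
    """Rewrite every .xml file under directory in place (hypothetical helper)."""
    for path in Path(directory).rglob("*.xml"):
        path.write_text(strip_alarm_data(path.read_text()))
```

The parser doesn’t care whether a tag spans several lines or how its attributes are spaced, which is exactly what tripped up the regular expressions. Be aware that ElementTree drops comments and the XML declaration when it rewrites a file and may rename namespace prefixes, so try it on a copy first.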

It would be cool if, instead of deleting the tag, we could just comment it out, but that doesn’t seem to be currently possible.
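Outside of xmlstarlet, commenting out an element is doable with a general-purpose XML library. Here is a minimal sketch using Python’s standard library; the function name is hypothetical, and it assumes the element being commented out contains no “--” sequence, since that is not allowed inside an XML comment.

```python
import xml.etree.ElementTree as ET


def comment_out(xml_text: str, tag: str = "alarm-data") -> str:
    """Replace each matching element with a comment holding its markup."""
    root = ET.fromstring(xml_text)
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if isinstance(child.tag, str) and child.tag.rsplit("}", 1)[-1] == tag:
                # Detach the trailing whitespace so it stays outside the comment.
                tail, child.tail = child.tail, None
                comment = ET.Comment(" " + ET.tostring(child, encoding="unicode") + " ")
                comment.tail = tail
                parent.remove(child)
                parent.insert(index, comment)
    return ET.tostring(root, encoding="unicode")
```

The element is swapped one-for-one with a comment at the same position, so the rest of the event definition keeps its order.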

When Less is More

October 7th, 2014

One of the things I’ve noticed in my years of deploying network management solutions is that people can get real excited when they go from having no visibility into their network to being able to see in great detail what’s happening, as when they deploy OpenNMS. The problem then changes from having no information to having too much.

Network geeks like myself tend to be loath to turn off certain alerts, but sometimes that can be the best thing for an organization.

When OpenNMS was started, workflow was based on events. Events appeared in the browser, events triggered notices, you could acknowledge events – pretty much everything was events. But events can be noisy, especially if you leverage the SNMP trap capability of many devices. This is why we implemented the alarms subsystem. Alarms can take many events and reduce them into a single alarm. Alarm processing can be automated to ensure that issues that are important are escalated, and issues that have been cleared can be removed. The alarms list is supposed to be a “to do” list for the NOC staff.
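To make the “many events, one alarm” idea concrete, here is a toy sketch in Python. It is not OpenNMS code: the class, the field names and the sample keys are invented for illustration, and the only thing borrowed from OpenNMS is the notion of a reduction key that identifies events belonging to the same alarm.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    reduction_key: str
    severity: str
    count: int = 0


def reduce_events(events):
    """Collapse (reduction_key, severity) events into one alarm per key."""
    alarms = {}
    for key, severity in events:
        alarm = alarms.setdefault(key, Alarm(key, severity))
        alarm.count += 1            # how many events this alarm represents
        alarm.severity = severity   # the latest event wins
    return alarms


events = [("linkDown:node1:eth0", "Minor")] * 3 + [("nodeDown:node2", "Major")]
alarms = reduce_events(events)
```

Four noisy events end up as two rows on the “to do” list, and the count on each alarm still tells you how often the problem fired.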

In order to make that happen, it is a good idea to consider each alarm in your system and ensure that it is “actionable”. Each alarm comes with two fields for tracking the resolution progress, and these can be used to document the actions taken to fix the alarm.

The “Sticky” memo field is used to annotate a particular instance of an alarm. For example, suppose there was a “link down” alarm due to a circuit being cut by a backhoe. The NOC engineer would be able to note that the repair was in process and maybe even include a case number. Once the issue is resolved the sticky memo goes away.

The “Journal” memo field is permanently associated with the alarm. This is for notes that could be useful the next time the alarm happens, such as “Contact Jim – he knows how to fix this”, etc.

Alarms can be acknowledged, which will remove them from the list of current issues. It is pretty easy to create an automation that can unacknowledge an alarm if it hasn’t been cleared in a particular amount of time. Thus you can automate “reminders” that the issue is still outstanding.
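The logic of such a “reminder” is simple enough to sketch in a few lines of Python. This only illustrates the rule, not how an OpenNMS automation is actually configured; the dictionary fields and the four-hour default are assumptions made up for the example.

```python
from datetime import datetime, timedelta


def unacknowledge_stale(alarms, now, max_age=timedelta(hours=4)):
    """Unacknowledge alarms that were acked too long ago and never cleared."""
    reopened = []
    for alarm in alarms:
        acked_at = alarm.get("acked_at")
        if acked_at is None or alarm.get("cleared"):
            continue  # never acknowledged, or already resolved
        if now - acked_at > max_age:
            alarm["acked_at"] = None  # back onto the list of current issues
            reopened.append(alarm["id"])
    return reopened


now = datetime(2014, 10, 7, 12, 0)
alarms = [
    {"id": 1, "acked_at": now - timedelta(hours=6), "cleared": False},
    {"id": 2, "acked_at": now - timedelta(hours=1), "cleared": False},
    {"id": 3, "acked_at": now - timedelta(hours=9), "cleared": True},
]
reopened = unacknowledge_stale(alarms, now)
```

Only the first alarm comes back: it was acknowledged six hours ago and is still not cleared.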

This doesn’t discount the value of events. In OpenNMS, events have become more like log messages. When an alarm happens on a particular node, that node’s page will reflect the events associated with it, which may shed some light onto the problem. But having too many events appear as alarms can overwhelm the NOC staff to the point that they stop using the system.

Unfortunately, often the best way of dealing with network issues involves trial and error. By limiting alarms it is possible to miss something important. But once that happens, alarms can be created to ensure it doesn’t happen again. The opposite, dumping too much information into alarms, will guarantee that alarms will be ignored, greatly increasing the chance that something important will be missed.

I developed my alarm philosophy during my first network management deployment in the early 1990s. I was consulting for a cellular provider and installing HP OpenView Network Node Manager (version 2.2 I believe) and they had me working in the server room. Besides the room being a bit cold, there was a large UPS in the corner that was constantly beeping.

Beep … Beep … Beep

I asked Avery, the guy I was working with, what was wrong and he replied “Oh, it always does that”. At that very moment I decided that if there is an alarm and you don’t do anything to resolve it, just turn it off.

Just remember that OpenNMS is a platform and thus you get to make a lot of the decisions on how best to get it to work with your organization. Consider that when deciding which events to turn into alarms, and then focus on using automations to ensure that the most important issues are treated as such.

OpenNMS-based App Wins Digital Jersey Hackathon

October 3rd, 2014

I was delighted to find out that an Android app using OpenNMS as the backend won the “Best App” prize at the first ever Jersey Hackathon.

Note: This is Jersey as in the island and not Jersey as in New.

The Open Alert “Man on Site” app is a small Android application that is designed to track the activities of people working alone at a remote site. From the wiki:

When activated this reports the location of the phone on a regular basis back to a central OpenNMS server. OpenNMS is configured to plot the current location and status of the device on a geographical map (Open Streetmap).

The App has four buttons;

Start Job – This is pressed by the worker when they start lone working on site. This starts a timer in the local App and on OpenNMS. The local timer will generate an alarm on the local device if the user forgets to report in after a set time.

Report In – This must be pressed when prompted by the local timer. If it is pressed both the timer in OpenNMS and the local device will be reset. If it isn’t pressed then OpenNMS will escalate the ‘Man on site’ event to the next level of severity and notify the OpenNMS operator that there is a problem. (Obviously the local timer should be set to 5-10 minutes less than the OpenNMS time out.) OpenNMS will keep escalating the alarm until it is signalled as critical. If the alarm is escalated, then there should be manual processes in place to contact the worker by other means or send someone else to site to make sure they are OK.

Finish Job – This should be pressed when the worker leaves site. The man on site alarm is cleared in OpenNMS and no further escalation takes place.

Panic – If the panic button is pressed, an immediate critical alarm is created in OpenNMS indicating that the worker on site is in trouble and needs help.

OpenNMS maintains a log of all of the movements of the user and also of the time of starting work / stopping work / panic events which could be important for triage if an incident happens.
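The server-side behaviour described in the wiki can be modelled in a few lines of Python. This is my own toy sketch, not code from the app: the class name, the severity ladder starting at “Warning”, and the choice to reset the severity when the worker reports in are all assumptions.

```python
SEVERITIES = ["Warning", "Minor", "Major", "Critical"]  # assumed ladder


class ManOnSiteAlarm:
    """Toy model of the server-side timer for one lone worker."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.level = None     # None means there is no active alarm
        self.deadline = None  # None means no timer is running

    def start_job(self, now):
        self.level = 0
        self.deadline = now + self.timeout_s

    # Reporting in restarts the timer (and, by assumption, the severity).
    report_in = start_job

    def finish_job(self):
        self.level = None
        self.deadline = None

    def panic(self):
        self.level = len(SEVERITIES) - 1

    def tick(self, now):
        """Escalate one level if the deadline has passed, then restart the timer."""
        if self.level is None:
            return "Cleared"
        if self.deadline is not None and now >= self.deadline:
            self.level = min(self.level + 1, len(SEVERITIES) - 1)
            self.deadline = now + self.timeout_s
        return SEVERITIES[self.level]
```

Calling `tick` on a schedule plays the part of the OpenNMS timeout: every missed report-in pushes the alarm one step closer to critical, while Finish Job clears it and Panic jumps straight to the top.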

Congratulations to the authors, Craig Gallen and Mark Wharton, who created this during the 48 hours of the Hackathon. We built OpenNMS to be a platform and not just an application and this is one example of what can be created leveraging it.

More information can be found on the UK OpenNMS Site and the code is available on GitHub.

OpenNMS in Dublin

October 2nd, 2014

I’ve been to 34 countries so far, and my goal is to hit 50 by the time I’m 50 (which is closer than I’d like). In all that time I’ve managed to miss Ireland, but that is about to change.

Airspeed Telecom is hosting a workshop next Wednesday, October 8th, at the Morrison Hotel in Dublin, described as “probably the hippest & coolest luxury hotel in Dublin city centre”.

That’s just how we roll.

The workshop will feature a case study by Airspeed, as well as a futures roadmap presentation by Dr. Craig Gallen and David Hustace.

Oh, and I’ll be there, too.

If you can make it, please email Liz at lhand@airspeed.ie.

Review: OnePlus One Android Phone

October 1st, 2014

I agonize over my technology decisions, often to a point that other people, including free software people, tease me about it. Is my distribution of choice free enough? Is it secure? Is my privacy protected so that I choose exactly what I want to share?

My current Android ROM of choice is OmniROM, and I’ve been quite happy with it. I do have issues with the limited number of phones that are officially supported, but it was my choice of ROM that drove me to buy an HTC One (m7).

I like the HTC. My main complaint is with the horrible battery life, and the phone is somewhat old having been replaced by the m8 which I don’t believe is supported by OmniROM. I’ve been frustrated in that it seems I have to choose between freedom and cool gear.

But maybe that isn’t the case anymore.

My friend Ronny first brought the OnePlus One (OPO) to my attention, and recently, through one of my Ingress friends Audrey, I was able to get an invite to purchase the new OnePlus One handset. While not supported officially by OmniROM as of yet, it is one of the new phones to ship with CyanogenMod, and since OmniROM is a fork it should be compatible. Plus, it is very similar to the phones from Oppo which are supported by OmniROM, so perhaps support will come when the OPO becomes more widely available.

The first thing I realized when I opened the box is that this handset is a monster. It boasts a 5.5 inch screen at 1920×1080 pixels (full HD), which gives it the same size, resolution and pixel density (401 ppi) as the new iPhone 6 Plus. It has a 2.5GHz Qualcomm Snapdragon 801 processor and 3GB of LPDDR3 RAM at 1866MHz, which makes it fast. I bought the 64GB version (quite a jump from my HTC One’s 16GB) and the 3100mAh battery lasts all day and then some. I thought the size would worry me, but I quickly got used to it. I can even read magazines on it which may cause me to travel less with my Nexus 7, and as my eyes age I’m finding the OPO’s screen to be much to my liking.
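The 401 ppi figure follows directly from the screen size and resolution: take the length of the diagonal in pixels and divide by the diagonal in inches. A quick check in Python (the function name is just for illustration):

```python
import math


def pixels_per_inch(width_px, height_px, diagonal_in):
    """Pixel density from the resolution and the diagonal screen size."""
    return math.hypot(width_px, height_px) / diagonal_in


ppi = pixels_per_inch(1920, 1080, 5.5)
print(round(ppi))  # prints 401
```

The unrounded value is a little over 400.5, which is why the spec sheets quote 401.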

The phone arrived via USPS two days after I ordered it, in two separate boxes. There was a thin square one holding the phone and, underneath it, a USB cable and a SIM tray removal tool. To remove the OPO SIM you need a longer tool than the standard Apple one, so I’ll have to be sure to carry it with me. In a separate small box was a wall charger.

There was zero paper and no earbuds of any sort, but I would rate the packaging equal to that of other premium products like those from Apple.

Even though it has pretty much the same size screen as the iPhone 6 Plus, the phone itself is slightly smaller and lighter, although thicker (the iPhone is wicked thin – you are almost worried you’ll bend it). The back of the “Sandstone Black” model is coated with a rough textured finish that makes the phone feel solid in your hand and I haven’t come close to dropping it.

Another improvement over the HTC is the camera. The OPO comes with a 13MP Sony Exmor IMX214 with six physical lenses. It can shoot 4K video (including slow motion) or 720p video at 120fps. It takes nice pictures.

But you could have read that on the website. How does it fare in real life?

I was concerned with the fact it ran CyanogenMod. When they announced they were going to take on investment to license their code to handset makers, they handled their community poorly (which resulted in the OmniROM fork) and I was worried that the OPO would be “less free”. I was happy to find out that it was very open. Unlocking the phone was the same as with Nexus devices: simply hook it up to your computer and run “fastboot oem unlock”. While I despised the “flat” icon theme that shipped with the device, it took about two taps to change it back. If I wanted a theme that looked like Windows 8 I would have bought an iPhone.

All my usual options were there. I disabled the Google search bar, increased the icon layout grid size and otherwise customized the phone exactly how I wanted it. I rooted the device and used Helium to restore my application settings and the whole conversion took less than an hour.

I did have to make a change to allow the phone to work with my Linux Mint Desktop. The system wouldn’t recognize it when I plugged it in, and I had to edit “/lib/udev/rules.d/69-libmtp.rules” to include the following two lines:

# Added for OPO
ATTR{idVendor}=="05c6", ATTR{idProduct}=="6764", SYMLINK+="libmtp-%k", MODE="660", GROUP="audio", ENV{ID_MTP_DEVICE}="1", ENV{ID_MEDIA_PLAYER}="1", TAG+="uaccess"
ATTR{idVendor}=="05c6", ATTR{idProduct}=="6765", SYMLINK+="libmtp-%k", MODE="660", GROUP="audio", ENV{ID_MTP_DEVICE}="1", ENV{ID_MEDIA_PLAYER}="1", TAG+="uaccess"

After that it was a breeze. Note that on one system I had to reboot to get it to recognize the phone, but I don’t think I did on the first one. Strange.

There are a few shortcomings. It took me several tries to get it to pair with my Motorola T505 Bluetooth speaker, but once paired it seems to connect reliably. The voice recognition sucks, as it does on most Android phones. I don’t use Google Now but I shouldn’t have to send information off to a remote server to voice dial a call. I do miss that from my iPhone days when the original (non-Siri) voice dialer rarely made a mistake. Voice dialing on the OPO is usable, though, and there is a rumour that there will be an “OK OnePlus” voice activation feature like on the Moto X but it isn’t there now. There is no microSD slot, but with 64GB of internal flash memory that is less of an issue, and fewer and fewer phones offer one. I also just tested this little dongle I have for accessing microSD cards via the USB port and it worked just fine.

I’m sticking with the stock ROM for now to see what CyanogenMod will do in the future, but I know that I have the ability to put on my own Recovery and ROM should I so choose. At the moment they are in the “not evil” column, but I was a little worried about their Gallery app. I noticed a new Gallery app account on my phone that looked like it was going to sync my pictures somewhere. Some research suggests that it is disabled when autobackup is off, but I would still like a little more transparency about random, non-removable accounts on my phone.

All in all I’ve been very happy with the OnePlus One and I’m eager to see where they take it. I am especially enamored of the price. At US$349, the black 64GB version is the same price as a 16GB Nexus 5 and half the price of the iPhone 6 Plus. Probably the best bang for the buck in the Android world at the moment, if not phones in general.