Monitoring? Meh.

Posted on June 1, 2017 by Tarus

Recently, I was talking to a person in the tech industry and describing all of the cool things we are doing with OpenNMS, when he kind of cut me off and went “Oh, monitoring? Meh.”

Well, I can’t remember if there was an actual “meh” but that’s how it came across, and I’m afraid the reaction is probably more common that I would think. Monitoring isn’t sexy, but it surprises me that people can’t see how critical it will be to the future of any business.

While forecasts vary, by 2020 there are expected to be over 30 billion devices on the Internet, and that figure will skyrocket to over 75 billion by 2025. Just knowing what is connected to your business network is going to become critical, as well as making sure it belongs there in the first place and, if so, is functioning properly.

Outside of the obvious security concerns, as people began to transact business more and more through devices rather than people, faults in those devices will directly impact revenue as people search for other options when faced with a bad experience.

Here are a couple of examples.

One of the greatest inventions in my lifetime is the ability to buy fuel at the pump. You just pull up, swipe your card, pump and then leave. You used to have to pay inside, and some places made you pay first which meant two trips in if you were paying by credit card. It could be cold or rainy, and not only did you have to wait in line behind people buying food or lottery tickets, you had to leave your car out by the pump possibly blocking the next customer.

The only problem I’ve experienced with this process concerns the receipt. Quite frequently I need a receipt, but it seems the pumps I choose are always out of paper. The little red indicator mark when the paper roll is almost finished isn’t visible to the cashier since there really isn’t one out by the pump. It is frustrating, but it is not like I have a choice at the moment. If there was some way to monitor the pump for a “low paper” alarm, it would improve my shopping experience.

One shopping experience that did result in my leaving the store without a purchase happened yesterday at a Lowe’s Home Improvement store. I needed some florescent lights for the new office so I went by on my way home. I picked up four bulbs (two that I needed and two spares) and went to the checkout area.

I walked past several unmanned cash registers until I got to the “Self Checkout” section, which was the only thing open. Of the four machines, two had red blinking lights on them (that are green when things are functioning normally) and the one lone, overworked cashier was doing her best to help people out. I usually don’t mind using Self Checkout and when I noticed one of the two machines was open (everyone else was waiting for the attention of the lone cashier) I went to it and started my purchase.

I scanned my “My Lowe’s” card and then the first bulb. “Eight ninety-five” piped up the voice and I placed it in a bag.

Here is where the problems started. First, I hate the fact that with these Self Checkout kiosks they don’t trust you to use a “quantity” key. I was buying four identical items but I was required to scan each one. Next, the bulb was light enough that it didn’t register as having been bagged, so the interface yelled at me and presented me with a button marked “Skip Bagging Item?”.

I sighed and, having no other option, hit the button. I then went on to scan the next three bulbs. However, as I bagged the fourth bulb, the scale must have started working since the whole unit went into some kind of alarm mode, screeching “Unidentified Object in the Bagging Area!” and the screen was locked until the cashier had time to come and fix it.

I looked around the area, and by this time all four kiosks had a flashing red light, there were at least three shoppers lined up to use them in addition to those of us already there, and our valiant cashier was busy helping a guy ring up his plumbing supply purchase which consisted of a ton of small copper fittings which most likely wouldn’t be registered by the scale.

I gave up. I picked up my bulbs and returned them to the Lighting section, passing three employees in the customer service area helping zero customers. Before I reached the car I’d ordered the same bulbs on Amazon at a fraction of the price, and they’ll be here on Friday.

Yes, I’m complaining, but how could monitoring have helped here? First, there is some sort of monitoring – those little red lights. When they all light up you would assume someone, or perhaps multiple someones, would come by to help. A monitoring system could have made sure that happened by using an additional notification system outside of the lights, and escalating it until the problem was addressed.

A more long term solution would be to collect information on the purchasing experience and the problems people encountered and to make changes to the automated kiosk software. I’m certain that Lowe’s didn’t write that software but instead bought it, and like most proprietary software solutions they now have to fit their processes to the application instead of the other way around. It probably wasn’t designed for a store that sells a lot of small, light things which is central to the issues I have using it.

With the rise of IoT devices, robotics and other forms of automation, monitoring is going to become extremely important. Lowe’s lost out on a $40 sale, but think of something like an assembly line where a problem could result in the loss of thousands of dollars a minute. Our goal at OpenNMS is to be ready for it, and to build products that make people go “Monitoring? Oh yeah!”.

Server Room Nightmares

Posted on May 25, 2017 by Tarus

I’m interested in any server room nightmares people would like to share.

Here’s one of mine.

We are in the process of moving offices from Pittsboro, NC down the road to Apex. Unfortunately, we are having some issues getting Spectrum Enterprise to complete the fiber installation at the new place, so while we are out of our old building the lack of network access in the new building means we have a bunch of servers in the old location.

Today while I was working in the new office and mooching of our kind neighbor’s wi-fi, I got several notices that links had failed.

linkDown event list

These were some workstations that we use for training, but when they are not in use we use them as part of our continuous improvement Bamboo farm. I immediately hopped on our Mattermost IT channel and asked if anyone was rebooting or otherwise messing with the machines, and when the answer was “no” I started to investigate.

One suggestion was that the air conditioning may have failed and those machines shut down from overheating. It has happened in the past, but it was both rather cool today and other machines that are more sensitive to such things were still running. I checked it out anyway using our AKCP probe.

temperature graph

The temperature had increased a bit, but it wasn’t anything that should have caused problems (it was caused by the server room door being left open).

Being 30 minutes away, I decided to text my friend Donnie, who is technically gifted as well as working in our old location, and he went to investigate.

For some reason, those three machines had been disconnected from the switch.

Now just for this situation we have an Arlo camera installed in the server room, so using the time stamp on the linkDown traps I found the following video.

Note the slightly balding guy in the red shirt in the lower left corner of the video. He is busy unplugging our devices.

Why? I have no idea. These people represent the IT people for the new tenant, and I assume they had legitimate reasons for being in the server room but messing with our equipment was not one of them.

Seriously, in over 30 years of working with computers, I’ve never heard of anyone going into someone’s house, office, server room or data center and just start unplugging cables. I still have not heard an explanation, but the landlord has had a discussion with the new tenant and it shouldn’t be happening again. It is one reason the important stuff is in that locked half-rack seen in the upper left corner of the video, and the really important stuff is hosted elsewhere.

I am curious – I’m certain this pales compared to other stories out there. Do you have any whoppers to share?

Fifteen Years

Posted on May 9, 2017 by Tarus

On Sunday my mother celebrated her 75th birthday.

Although a happy occasion, why is this relevant to an open source blog? Well, it was soon after her 60th birthday in 2002 that I started my first company around OpenNMS.

I did not start OpenNMS, it began in the summer of 1999, with the first code posted on Sourceforge in March of 2000 by a company called Oculan. I started working with Oculan in September of 2001, and in May of 2002 they decided to stop contributing to OpenNMS. I saw the potential, so I asked Steve Giles, the founder and CEO, if I could have the OpenNMS project. He looked at his watch and said if I was off his payroll by Friday, he’d give me the domain names, a couple of servers, and he would sprinkle water on me and I would be the new OpenNMS maintainer.

That was actually the easy part. Explaining to my wife that I had quit my job and started a company “selling free software” was a bit harder.

sortova.com from archive.org circa May 2002

And thus Sortova Consulting Group was born. It was named after my farm. When Andrea and I decided we wanted to have a farm, we first bought raw land. In driving out from Raleigh to work on it we would pass this little farm with a barn, some cows, etc., and on the mailbox was a sign reading “Almosta Farm”. I joked that if that was “almost a farm” then what we had was just “sort of a farm”. Later, when we bought the place where we still live, the name Sortova Farm stuck.

We pronounce it “Sore-toe-va”. Only one customer ever pulled me aside and asked if it really meant “sort of a” consulting group. He laughed when I confirmed that it did.

Considering that I didn’t have any prior business experience, Java experience, or even real Internet access at my home, it is amazing that OpenNMS survived to this day. It is a wonder what you can accomplish with pure stubbornness.

Now my one true superpower is my ability to get the most fantastic people on the planet to work with me. The first group of those came from the OpenNMS community. When I was running Sortova it was the gang that later became the Order of the Green Polo that kept me going, mainly through mailing lists and IRC. In September of 2004 my good friend and business partner David Hustace and I founded the OpenNMS Group, and that corporation is still going strong. In 2009 we mortgaged our houses to buy the copyright to the Oculan OpenNMS code and thus brought all of it back under one organization, and two of the original OpenNMS team at Oculan now work for OpenNMS.

When I visit Silicon Valley I often get to meet some brilliant people, but the joy of this can be offset by the pervasive attitude of focusing on technology simply to make money. I know of a number of personally successful people who built companies, sold them, and then those products vanished into obscurity. Remember VA Linux? Their stock rose over 700% on the first day of trading, but where are they now? Did they ever deliver on their promises to the stockholders?

I want to build with OpenNMS something that will last well beyond my involvement with the project. I’ve gotten it to the point where I am not longer expressly required to make it thrive, but I am still working on its legacy. We want it to be nothing less than the de facto standard for monitoring everything, which is a high bar.

Note that I still would like to make a lot of money, but that isn’t the core driving force of the business. Our mission statement is “Help Customers – Have Fun – Make Money” in that order. If you have happy customers and happy employees, the money will come.

Fifteen years ago I made a leap of faith, in both myself, my family and my friends. I’m extremely happy I did.

How Version 2.0 Killed Android Wear

Posted on April 10, 2017 by Tarus

I am the happy owner of an LG Urbane smartwatch. Unfortunately, I just upgraded to Android Wear 2.0 and now I can’t use it.

Andrea Wear 2.0 Upgrade

Luckily for me, my smartwatch is not “mission critical”. If I leave it at home by mistake, I don’t turn around to go back to get it. The main thing I use it for is notifications. I like the fact that if it is with me, it will automatically mute my phone and then vibrate when I have a notice. A quick glance at my wrist will tell me if I need to deal with it right this moment, or if it can wait.

The second thing I use it for is to do simple voice searches or to set reminders and timers. Outside of that there are a few apps I use and I like the fact that it tracks my steps, but overall I don’t use a ton of features.

When the notice popped up that I could upgrade, I blindly went ahead and did it. In retrospect, that was stupid, but I often get in trouble rushing out to install the “new shiny”. The upgrade seemed to go fine, and I didn’t think that much about it until lunch.

One of the things I do before heading out to lunch is check the temperature to see if I need a jacket. So I did the usual wrist flick to “wake” the watch and said “Ok Google” to get to the voice prompt.

Nothing happened.

Hrm, I did some research and apparently with 2.0 you have to press the button on the side of the watch to get to the Google prompt. I think this is a huge step backward, because now I have to involve both hands, and I find it ironic that with Android Wear 1.5 I I had to sit through a demo of one-handed gestures over and over again (I often have to re-pair my watch due to reloading software on my phone) and now they’ve thrown “do everything with one hand” out the window.

Anyway, I pressed the button which then brought up the Google Assistant setup screen on my phone. With 2.0 if you want to use voice searches, etc., you must use Google Assistant and you have to give Google access to all of your contacts, calendars etc.

(sigh)

I work hard to “sandbox” my Google activity from the rest of my digital life. It’s not that I think they are evil, it’s just that I don’t want anyone to have that much information on me, well, other than me. I kind of despair for free and open source software solutions in the consumer space. Everyone seems to be rushing to adopt these “always on” digital assistants with absolutely no regard to privacy, and this is causing vendors to lock down their ecosystems more and more. While open source is definitely winning on the server side, I don’t think the outlook has ever been so grim on the consumer side.

There were some upsides with 2.0, such as improvements to the look and feel, but I also found that I didn’t care for the new notification system (I seemed to miss a lot of them – perhaps I needed to change a configuration). But the requirement for Google Assistant was a deal breaker.

I thought about going back to 1.5, which I liked, but I can’t seem to find a factory image. In trying to locate one, I discovered that TWRP does have a version for bass (the codename for the LG Urbane) and I should have installed that and made a backup before upgrading. I contacted LG and they told me it was impossible to downgrade. That’s a load of crap because I could easily sideload the old version if they made it available, but then I’d have to deal with constant upgrade reminders and the few apps I do use would probably stop support for 1.5 to focus on 2.0.

It just isn’t worth it.

I know at least one of my three readers is thinking I should just cave and learn to embrace the Google, but I can’t bring myself to do it. I am eagerly awaiting open source alternatives like Asteriod OS (which just isn’t ready for daily use) and Mycroft (which is supposed to be shipping units this month) but I really don’t think I’ll miss my Urbane enough to spend the time on it.

I plan to sell my Urbane on eBay and I’ve gone back to my previous “dumb” watch (a nice little Frederique Constant I bought on a flight from Dubai to London). It’s kind of a shame since I enjoyed using it, but to be honest I’m not going to miss it all that much.

The Importance of Contributor Agreements

Posted on March 27, 2017 by Tarus

One thing that puzzles me is the resistance within the open source community to contributor agreements. This was brought into focus today when I read that the OpenSSL Project wants to migrate to the Apache 2.0 license from the current project specific OpenSSL license.

In order to do that they need permission from all of the nearly 400 contributors of the project over the last 20+ years, and contacting them will be a huge undertaking. If one person refuses to agree, then they will either have to abandon the effort, or locate that person’s contribution and either remove or replace it.

Many years ago we found out that a company was using OpenNMS in violation of our license. When our lawyer approached them about it, they claimed that they were only using those parts of the code for which we didn’t hold copyright. At that time, early versions of OpenNMS were still copyright Oculan, the company that started the project, and not OpenNMS. Since Oculan wasn’t around anymore it took us awhile to track down the intellectual property, but in the end David and I were able to mortgage our houses to purchase that copyright so that now the project can control all of the code and defend it from license abuse in the future.

But the question arose about what to do moving forward, specifically how should we deal with community contributions? In the past companies like MySQL required all contributors to sign a document with phrases like “You hereby irrevocably assign, transfer, and convey to MySQL all right, title and interest in and to the Contribution” which seemed a little harsh.

I posed this question to the Order of the Green Polo, the de facto project administrators, and DJ Gregor suggested we adopt the Sun Contributor Agreement that we now call the OpenNMS Contributor Agreement, or OCA. This was a straightforward document that asked two things.

First, you attest that you have the right to contribute the code. This is more important than you know, because it helps remove liability from the project should the contribution turn out to be encumbered in some way, such at the author writing it as part of their job and thus it is actually the property of the employer. We allow both individuals and companies to sign the OCA.

Second, you assign copyright to OpenNMS while retaining copyright yourself. This introduces the concept of “dual copyright”. Now some critics will say that this concept hasn’t been tested in court, but there is a long history of authors sharing copyright. Considering that Oracle maintained the agreement in the form of the Oracle Contributor Agreement, it appears that their lawyers were satisfied.

I claim responsibility for the license under which these Contributor Agreements are published: the Creative Commons Attribution-Share Alike License. When DJ suggested the Sun Contributor Agreement I noticed that there wasn’t any license on the agreement itself. I didn’t want to just copy it and change “Sun” to “OpenNMS”, so I contacted Brian Aker who had just moved to Sun with the MySQL acquisition and asked him about it. Soon thereafter the Agreement was updated with the license and we adopted our version of it.

Once we adopted the OCA, I was tasked with tracking down anyone who had ever contributed to OpenNMS outside of the company or Oculan and asking them to sign it. They all did, but I can tell you that I had a hard time tracking down a number of them (people move, e-mails change). I don’t envy OpenSSL at all.

I hope this story illustrates the importance of some sort of Contributor Agreement for open source projects. They don’t have to be evil, and in the end getting your copyright and licensing issues completely sorted out will make managing them in the future so much easier.

Electronic Devices and CPB

Posted on March 9, 2017March 9, 2017 by Tarus

With the change in administration in the United States, Customs and Border Protection (CBP) have modified their behavior to include actions with which I don’t agree. These include forcing a US citizen to unlock his mobile device, even though it was a work device and contained sensitive information. I set out to come up with how I will deal with this situation should it arise in the future.

TL;DR My plan is as follows: before I enter the United States, I will generate a long, random password and set that as the encryption password for my laptop and my handy. I will then ssh into an old iMac I have on my desk, store the password in a file, and then shut the computer down. At that point I will not be able to access the information on my device until I return to the office and power on the system.

UPDATE: The EFF has published a detailed guide to help understand your rights at the border.

First off, let me say that until recently I’ve always respected CPB. They have a tough job and everyone I’ve ever met while returning from my travels has been efficient, competent and friendly.

But after the recent “Muslim Ban” fiasco I’ve come to realize that my experience is not universal. I think one of the main problems is this idea that the Constitution stops at the CBP desk, and until you are past it you really aren’t “in America” and thus the Constitution doesn’t apply.

I don’t agree with this interpretation, but it can probably be traced to the actions taken by the US government after 9/11 and the creation of the prison at Guantanamo Bay.

Prior to that, when “bad hombres” were captured by the US government, they fell into one of two categories: criminals or prisoners of war. How each class was treated was fairly well defined. Criminals were processed according to the rule of law, and the treatment of POW’s was covered under the various Geneva Conventions.

The US government decided that those two classifications were inconvenient, and so they ventured into the murky waters of “enemy combatant” and Guantanamo. Their logic goes that since Guantanamo isn’t in the US, US law doesn’t apply, and since these people aren’t members of a foreign country’s military force with which we are at war, then they aren’t POWs. So, the US gets to make up its own rules about how these people are treated.

This is dangerous for a number of reasons. Since nothing is really codified about the treatment and rights of the detainees at Guantanamo, the rules are arbitrary. Also, this opens the door for other countries such as Russia to do similar things without fear of international repercussions. The US has survived for so long because things like this are not supposed to happen, yet here we are.

This thought now extends to the border. Even though a US citizen is being questioned by another US citizen, in the role of a representative of the US government on US soil, somehow the rules of the Constitution are suspended. It’s arbitrary and I don’t buy it. The Constitution codifies a right to privacy in the Fourth Amendment, and it doesn’t go away when entering the country. And it definitely extends to mobile devices, which in today’s world are probably the most personal item people own.

So how can people like me, with almost no political power, resist this threat to our freedom?

I’ve always done little things, like opting out of millimeter wave scans at airports and getting a pat down instead (I’m not shy). If everyone did this the whole system would collapse, and they would find better ways of dealing with security than the security theater we have now. Seriously, if the Israelis don’t use it, it ain’t worth using.

When I turned to the problem of dealing with CBP, my main thoughts went to two devices that I use when traveling: my handy (mobile “phone”) and my laptop. I figured the easiest thing to do would be to just wipe them before coming into the country, but that presents some logistics problems.

For example, I could make a backup of my handy, copy it to a server at home, and then wipe it. The problem is that I have 64GB of storage on the device and I doubt I could transfer a backup in time over, say, a hotel Wi-Fi connection. One of my coworkers uses an iPhone and they thought about wiping their phone and just restoring it from iCloud when they were in the country, but then CBP could require that he turn over his iCloud password.

On my laptop I use whole disk encryption, but I thought about just rsync’ing my home directory and then deleting it before leaving, then again there is the WiFi issue and I really don’t want to have to deal with copying everything back when I’m home.

Then it dawned on me that if I didn’t know the encryption password, then I couldn’t reveal it. The problem became how to create a secure password that I couldn’t remember yet get it back when I needed it.

While my main desktop computer runs Linux Mint, I keep an old iMac on my desk mainly to run WebEx sessions and for those rare times I am forced to use a piece of software not available for Linux. It’s connected to the network, so I can access it remotely. But, if I can access it, I would be lying if CBP asked me for my password and I said I couldn’t retrieve it. Unlike the US Attorney General, I refuse to perjure myself.

Then it dawned on me that I could shut the iMac down remotely and have no way to turn it back on. Thus I could store a passphrase on it, retrieve it when I was back in the country, but until then I would be unable to unlock my devices.

That became the plan. So, the next time I’m returning from overseas, I’ll generate a new, random password. I’ll set that as the whole disk encryption password on my laptop and the encryption password on my handy (note that this is different from the screen-lock password). This will also tie up all of my social network passwords since I use complex ones and store them on those devices. Well, with the exception of my Google account, but since I use two-factor authentication I should be safe as my handy is the device that generates the codes (and I won’t carry any of the backup codes). As long as both of those devices stay powered on, I’ll be able to use them, but once I power them off they will be useless until I get to the office, power on the iMac, and retrieve the passphrase. Note that in order to do that, I’ll be firmly in the US and anyone who wants me to unlock my devices will need a court order.

Which I would respect, unlike CBP. I think the scariest part of the whole “Muslim Ban” incident was when CBP refused to honor court orders. America is built on three branches of government, and when the Executive branch ignores the orders of the Judicial branch we are all in trouble.

I had a two other problems to address, one of which is done. If I’m in the US but my handy is locked, how would I make calls? I might need to call my ride home, etc. To that end I bought a cheap “feature” phone and I’ll just move the SIM card to it when we land.

ZTE Feature Phone

The second issue is that while I should be on solid legal ground concerning my electronic devices, there is nothing preventing CBP from holding me for a long time. Thus the final step is to find an attorney and execute a G-28 form allowing them to represent me. I’m not sure if I need a civil rights lawyer or an immigration lawyer but I’m looking into it. My goal is to be able to notify my attorney when I am coming back into the country, and then send an SMS to them when I am through immigration. If that doesn’t arrive within two hours of my scheduled arrival, they need to come and get me.

I think the thing that bothers me the most about this whole process is the need for it. I’m not a tinfoil-hat conspiracy guy but the actions of the new government have me worried. As I use open source software almost exclusively I know I’m safer than most when it comes to surveillance, and I also don’t expect to run into any problems being an older, white male. But I’d rather be safe than sorry, and the only thing necessary for the triumph of evil is that good men do nothing.

Fourteen Years

Posted on February 19, 2017February 19, 2017 by Tarus

I just wanted to take a second to thank my three readers for fourteen years of support.

My first post on this blog happened on this date in 2003, and when I wrote it I had little idea I’d still be doing it almost a decade and a half later.

It does seem weird that I still consider OpenNMS a start-up. We took a much different path than a lot of other companies, focusing on our customers instead of fundraising. With our mission statement of “Help Customers, Have Fun, Make Money” and our business plan of “Spend Less Than You Earn” we’ve not only managed to survive but thrive, and both the company and the project have never been stronger. While we are always looking for good investors, this allows us to pick just the right partner.

I’d like to end this with a quote from Michael Seibel of Ycombinator. Actually, it is almost his entire blog post but it really resonated with me.

I’d like to make the point that success isn’t the same as raising a round of financing. Quite the opposite: raising a round should be a byproduct of success. Using fundraising itself as a benchmark is dangerous for the entire community because it encourages a culture of optimizing for short term showmanship instead of making something people want and creating lasting value.

I believe founders, investors, and the tech press should fundamentally change how they think about fundraising. By deemphasizing investment rounds we would have more opportunity to celebrate companies who develop measurable milestones of value creation, focus on serving a customer with a real need, and generate sustainable businesses with good margins.

Optimizing for funding rounds is just as unproductive as optimizing for headcount, press mentions, conference invites, fancy offices, speaking gigs or top line revenue growth with massively negative unit economics.

Ulf: My Favorite Open Source Animal

Posted on February 15, 2017 by Tarus

Over at opensource.com they asked “What’s your favorite open source animal?” Hands down, it’s Ulf.

OpenNMS Kiwi: Ulf

When I was at FOSDEM this year, we were often asked about the origin of having a kiwi as our mascot. Kiwi’s are mainly associated with New Zealand, and OpenNMS is not from New Zealand. But Ulf is.

Every year we have a developer’s conference called “Dev Jam“. Back in 2010, a man named Craig Miskell came from NZ and brought along a plush toy kiwi. He gave it to a group of people who had come from Germany, since he had come the furthest east for the conference and they had come the furthest west. They named him “Ulf”.

There was no conscious decision to make Ulf our mascot, it just happened organically. People in the project started treating him as a “traveling gnome“, setting up a wiki page to track some of the places he’s been, and he even has his own Twitter account.

I lost him once. We had a holiday party a few years ago and Ulf went missing. We thought he had been left in a limo, so I dutifully sought out a replacement. I found one for US$9, but of course shipping from NZ was an additional US$80 more, so I bought two. I later found Ulf hiding in the pocket of a formal overcoat I rarely wear (but had the night of the party) so now we have a random array of individual Ulf’s.

Anyway, Ulf manages to represent OpenNMS often, from stickers to holiday cards and keychains. I love the fact that he just kind of happened, we didn’t make a conscious decision to use him in marketing. If you happen to come across OpenNMS at conferences like FOSDEM, be sure to stop by and say “hi”.

OpenNMS 101

Posted on January 20, 2017 by Tarus

One of my favorite things to do is to teach people about OpenNMS. I am one of the main trainers, and I usually run the courses we hold here at OpenNMS HQ. I often teach these classes on-site as well (if you have three or more people who want to attend, it can be cheaper to bring someone like me in for a week than to send them here), and the feedback I got from a recent course at a defense contractor was “that was the best class I’ve ever attended, except for the ones I got to blow stuff up.”.

Unfortunately, a lot of people can’t spare a week away from the office nor do they have the training or travel budget to come to our classes. And teaching them can be draining. While I can easily talk about OpenNMS for hours on end, it is much harder to do for days on end.

To help with that I’ve decided to record the lessons in a series of videos. I am not a video editing wizard, but I’ve found a setup using OBS that works well for me and I do post production with OpenShot.

The first class is called “OpenNMS 101” and we set it up as a video playlist on Youtube. The lessons are built on one another so beginners will want to start with Module 0, the Introduction, although you can choose a particular single episode if you need a refresher on that part of OpenNMS.

My goal is to put up two or three videos a week until the course material is exhausted. That will not begin to cover all aspects of OpenNMS, so the roadmap includes a follow up course called “OpenNMS 102” which will consist of standalone episodes focused on a particular aspect of the platform. Finally, I have an idea for an “OpenNMS 201” to cover advanced features, such as the Drools integration.

I’ve kept the videos as informal as the training – when I make a mistake I tend to own it and explain how to fix it. It also appears that I use “ummmmmmm” a lot as a place holder, although I’m working to overcome that. I just posted the first part of “Module 4: Notifications” and I apologize for the long running time and the next lessons will be shorter. I had to redo this one (the longest, of course) as during the first take I forgot to turn on the microphone (sigh).

We have also posted the slides, videos and supporting configuration files on the OpenNMS project website.

I’d appreciate any feedback since the goal is to improve the adoption of OpenNMS by making it easier to learn. Any typos in the slides will be fixed on the website but I am not sure I’ll be able to redo any of the videos any time soon. I think it is more important to get these out than to get them perfect.

Perfection is the enemy of done.

Network World Reviews OpenNMS

Posted on November 21, 2016November 21, 2016 by Tarus

Today Network World published the results of a comparison among open source network monitoring applications. OpenNMS did not win but I was pretty happy with the article.

The main criticism I have is that the winner, Pandora FMS, seems to be the only one of the four reviewed that is more “open core” than “open source”. They have a large number of versions, each with different features, and you have to pay for those features based on the number of monitored devices. It seems to be difficult to have open source software that is limited in this fashion, as anyone should be able to easily remove that limit. Thus I have to assume that their revenue model is firmly based on selling software licenses, which is antithetical to open source. That said, it looks like the review was based on the “community” version of Pandora which does appear to be free software, just don’t expect any of the “enterprise” features to be available in that version any time soon.

I don’t know why I have such a visceral dislike of the “per managed node” pricing model, outside of having to deal with it back in the 1990s and 2000s. It seems like an unnecessary tax on your growth, “hey, customer, for every new device you add you have to pay for another monitoring license.” Plus, in these days of virtualization and microservices it seems silly. Our customers might spin up between 10 and 100 virtual servers as needed and tear them down just as quickly, and I can’t imagine the complexity that would get added to have to manage a license of each one of them.

Network World Comparison

Of the other applications reviewed, I’m not familiar with NetXMS but I do know Zabbix. They, like OpenNMS, are 100% open source and they are great people. It was awesome to finally meet Alexei Vladishev in person at this year’s All Things Open conference.

Alexei Vladishev and Tarus Balog

The only other thing that immediately pushed a button was the sentence “All four products were surprisingly good.” At first I took it to express surprise that free software could also be good, but then I calmed down a bit and figured they meant it was surprising that all four applications were strong.

For the article they installed OpenNMS on Windows. When I read that my heart just sank, because while it does run on Windows our support of that operating system grew out of a bet. We were talking many years ago about Java’s “write once, run anywhere” slogan and I mentioned that if that were true, why don’t we run on Windows? The team took up the challenge and it took two weeks to port. The first week was spent getting the few bits of code written in C to compile on Windows, and the second week on soft-coding the file separator character so that it would use a back-slash instead of a forward-slash. Even on Windows, the comments in the article were really positive, which make me think this whole Java thing isn’t such a bad idea after all (grin).

They used Windows because apparently was an issue with getting OpenNMS installed on CentOS 7, which was a surprise to me, but then Ronny pointed out that there can be some weird conflicts with Java and packages like LibreOffice that I don’t experience since I always do a minimal install. There is a cool installer for CentOS 7 which may help with that. We also maintain Docker images that make installation easy if you are used to that environment.

Fortunately, or unfortunately, not much has been done for OpenNMS on Windows since we got it working. It is fortunate because not much is required to keep OpenNMS running on Windows due to Java, but it is unfortunate because we really don’t have the Windows expertise that would be required to get it to run as a service, create an MSI installer, etc. Susan Perschke, the author of the article, seems to be a Windows-guru so I plan to reach out to her about improving the OpenNMS experience for Windows users.

One thing that is both common and valid is criticism of the web user interface. At the moment we spend most of our time focused on making OpenNMS even more scalable, and thus we don’t have the resources to make the user interface easier to use. That is changing, and most of the current effort goes into Compass™, the OpenNMS mobile app. The article didn’t mention it which means they probably didn’t try it out, which is more a failure on our part to market it versus an oversight on theirs.

They also didn’t talk directly about scalability, although it was listed in the comparison chart (see above). OpenNMS is designed to monitor tens of thousands to hundreds of thousands of devices with our goal to be virtually unlimited in order to address scale on the order of the Internet of Things. That is why we wrote Newts for storing performance data and are working on both the Minion and Underling to easily distribute OpenNMS functionality.

Another reason we haven’t spent much time on the user interface is that our larger customers tend not to use it much. They rely on the ReST interface to integrate their own systems with OpenNMS and on things like the Business Service Monitoring.

But still, it was nice to be included. We don’t do much direct marketing and even though typing “open source network monitoring” into Google returns OpenNMS as the first hit we are often overlooked. Let’s hope they revisit this in a year and we can impress them even more.

Adventures in Open Source

Commentary