A Brief History of an Open Source Company

I’ve been invited to give a keynote at this year’s Ohio Linuxfest being held in Columbus, Ohio, on 29-30 September. I am both excited and humbled as this is one of my favorite conferences of the year and I know a lot of amazing people will be there to share their knowledge of free and open source software.

Ohio Linuxfest Logo

I take my presentations pretty seriously, especially keynotes, so I wanted to come up with something that was both funny and interesting. They asked me to speak on running a business around open source software, and I immediately thought I should come up with some click-bait title like “Ten Things About Open Source Business, Four of Them Will Shock You!” but it just didn’t feel right. Then I thought about Hawking’s A Brief History of Time and that seemed more fitting.

My most popular talk so far has been on starting an open source business, but that focuses mainly on the mechanics of the process. For this talk I want to trace my history with OpenNMS starting with my first day on the job and then describing how it grew to become what it is today. In those 15+ years I’ve had a lot of adventures, some good and some bad, and I’ve met some wonderful people. It is the work of many of those people that actually makes OpenNMS what it is – I act more like a “crap umbrella” with my one job being to block all of the things that might keep the team from being productive – and I want to talk about how that came about. This presentation will consist almost entirely of real world examples of the problems we encountered and our decision process for solving them.

I hope it will be entertaining and useful, and look forward to seeing you there.

When Not To Start an Open Source Company

Over the weekend, Chris Aniszczyk posted a link on Twitter to a very interesting article by Matt Klein about his decision not to start an open source company around his project, Envoy. I thought it raised a number of interesting points worth a few comments.

First off, Matt works for Lyft, which, in case you haven’t heard of it, is Uber without the moral decay. I abandoned Uber some time ago, despite being an early adopter, and I’ve been very happy with Lyft. One of the main differences is that Lyft allows you to tip your driver, which I almost always do with few exceptions. The fact that Lyft is able to keep and motivate people like Matt speaks volumes for their corporate culture.

It also demonstrates a wonderful trend of commercial companies starting and maintaining open source projects. I’ve been working with open source for almost two decades and I can remember when any software developed at a company was considered confidential. To this day there are a number of vendors who consider their SNMP MIB files (which, I should point out, are really only useful to people who have purchased their products) proprietary information. Companies like Lyft, PayPal and Facebook, none of which would self-identify as open source companies, have gained a lot of value for little cost by making the tools they use open source.

When talking about open source for the enterprise, I often talk about the fact that it is the processes that a company uses to serve its customers that make it unique and define its value, not the tools used by the company. So often with commercial software you have to change those processes to fit how the application thinks you should work, and in the process you lose some part of what makes you special to your customers. With open source you can fit the application to those processes. It is how you use the tools and not the tools themselves that is important, and so there is a lot to gain and little to lose by making them open source.

Getting back to Matt’s article, he is a project maintainer for Envoy, which is a “high performance C++ distributed proxy and communication bus designed for large service oriented architectures.” While I don’t consider myself a coder and don’t claim to fully understand its advantages, I do recognize enough buzzwords in that sentence to know that it would attract some attention from investors, and Matt was approached about leaving Lyft to start a commercial business around Envoy. He decided not to, and as I read his article about his decision I realized I’d found a kindred soul, someone who was more interested in creating something of value that would last versus making a quick buck.

He had me with this paragraph:

In my opinion, the best opportunity to commercialize OSS lies with projects that can be easily turned into SaaS products. Ultimately, even if software is completely open, many customers are happy to pay for a turnkey solution that “just works” and has a defined SLA with 24/7 monitoring and support. In some sense, customers pay for the operational expertise that comes from deeply understanding and running the software, versus the software itself.

Amen.

I’ve been making a living on open source for 15 years now working with OpenNMS, and I’ve spent a lot of time thinking about business models. We started out with the “service and support” model, which kept the doors open but limited growth. Then our clients started asking us for features, so we added custom development, which was time intensive but allowed us to finance OpenNMS features which attracted even more customers as the platform became more powerful. When we hit the problem of trying to balance the “release early, release often” philosophy of open source with the need for stability, we adopted the Red Hat model of splitting our application into a feature-rich, rapidly developed release (which we call Horizon™, similar to Fedora) and a more stable, subscription-based release that may lag in features but is better suited to production environments (which we call Meridian®, similar to RHEL). But ultimately we came to the decision that what we really wanted to do was to offer OpenNMS as a service.

One company that inspired that decision was Automattic, maintainers of WordPress. I don’t think I know of a more powerful piece of software that is easier to install. They have a famous “5 Minute Install” that is quite simple. First, you drop the software into the webroot of your web server of choice. Next, you create a database account on your database of choice with certain permissions. Then you navigate to a web page and follow the prompts.

However, for a lot of people, terms like “webroot” are gibberish, and even with WordPress you still need some minimal database skills to maintain it. So Automattic offers up WordPress as a service. For a small monthly fee they’ll do everything for you, and this has generated revenues on the order of tens of millions of dollars per year.

OpenNMS is way more complicated, thus the value of a hosted version should be greater. In order to do so we needed some way to access the client’s network in a secure fashion, so with Horizon 20 we introduced the Minion. The Minion software allows for OpenNMS functionality to be distributed. It is built on the Karaf container, so once installed all of its features can be remotely managed. For smaller networks, the Minion can be sold as an appliance and talk to a hosted version of OpenNMS. It can bring a complex and powerful tool like OpenNMS into the hands of the masses.

For larger companies it solves issues of scale as Minions can be deployed to cover even the largest networks (our goal is IoT scale). We’ve had them in production at one client for months now handling over 2 million events an hour. That translates to around 555 events per second, although the system itself can handle over 10,000 events per second so they have room to grow. If they ever hit that limit, we can simply add more Minions. They have the option of hosting all of OpenNMS in their own data center, or they could choose a hybrid model where some of the functionality is outsourced.
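The arithmetic behind those numbers is easy to check. Here is a minimal Python sketch, with function and variable names chosen purely for illustration, that converts the hourly figure to a per-second rate and compares it with the 10,000 events per second mentioned above:

```python
def events_per_second(events_per_hour: float) -> float:
    """Convert an hourly event count into an average per-second rate."""
    return events_per_hour / 3600


observed = events_per_second(2_000_000)  # rate at the client described above
capacity = 10_000                        # events per second the system can handle
headroom = capacity / observed           # how many times the current load fits

print(f"{observed:.1f} events/s, roughly {headroom:.0f}x headroom")
```

In other words, that client could grow its event volume by a factor of about 18 before reaching the stated limit.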

For pretty much the first time in the history of OpenNMS, we are seriously and actively seeking investment. There are a number of companies entering this space who have raised enormous amounts of money, and we think we can be competitive for far less money and provide a better solution. Plus, also for the first time in the history of OpenNMS, we have a reason to make it easier to use versus spending all of our resources making it more powerful.

Matt talks about investment in his post (remember Matt? As usual, I’ve made this all about me. Meeee!) It was actually his stories about dealing with investors that prompted me to write this. As Envoy started to get some traction, investors wanted him to leave and start a company. He writes:

Over the last few months I’ve been told by several investors that no OSS has become ubiquitous without having explicit commercial backing. I think this is false and is situation dependent. If anything, I would argue that if I were to leave Lyft now and start a platform company around Envoy, it will decrease the chance of Envoy becoming ubiquitous, primarily because it would negate all of the reasons laid out above.

That first sentence is interesting, since “ubiquitous” and “commercial” are a little vague. I would make the claim that the Apache web server was ubiquitous until its success spawned NGINX, and it was backed by the Apache Software Foundation which is a non-profit. Is a foundation “commercial”? The idea that for a project to become successful it needs a number of people to spend a lot of time working on it seems obvious, and the best way to achieve that is to pay those people to work on it.

He goes on to write:

It took me a lot of time to ultimately understand the previous simple point. Investors are extremely persuasive. They capitalize on “fear of missing out.” However, it’s important to realize that the opportunity cost is hugely mismatched between investor and company.

When he writes “investors” above I believe he means specifically venture capitalists. We’ve talked with a few VCs in the past and I can remember the almost “strong arm” tactics they used. If I hear “a rising tide lifts all boats” one more time, I might have to hit somebody. I’m not saying that all VCs are the same, but many of them come across as gamblers and not investors. I’m risk friendly but I don’t gamble. I’m heavily invested in wanting to build something with OpenNMS that outlasts me (it is already much bigger than me as the team I work with has way more to do with its success than I do) and I don’t want to gamble with it.

I do hope that there are some investors out there that can appreciate that aspect of our company as well as the fact that we’re profitable, have mature products and wonderful customers. Perhaps private equity or perhaps another company that shares our vision and wants to advance the project through acquisition. In any case we’re looking for them.

When I was a young man, old guys like I am now would tell me “work on something you love, not just for the money”. I always dismissed it with the thought that with enough money I can buy love. When you immerse yourself in something as personal as an open source project for ten to twelve hours a day, year after year, you really do have to love it and the satisfaction you get just can’t be bought. Matt’s thoughts are similar:

Ultimately, on a personal level I’m just having too much fun solving tough computer science problems at large scale at Lyft and building a community around Envoy. The bar to do something different is therefore extremely high, and it took a long time to realize that it’s perfectly OK to accept that and keep going down the existing path that I’m on. On another level, leaving now to start a company would feel very much like not following through on my original goal of open sourcing Envoy; the industry desperately needs a high quality and community-driven solution to microservice networking. Follow-through is something I take very seriously.

With that attitude the success of Envoy is almost assured.

CubaConf 2017

UPDATE: Today the United States administration announced tougher restrictions on travel to Cuba. While nothing has changed at the moment, there will be some changes in the next 30 days. This should not impact people attempting to go to Cuba for this conference as it should fall under the “professional” or “educational” travel categories. This may change again before November and I’ll be sure to post updates.

While tourist travel remained officially banned, Obama also allowed a broad category of “people to people” visits to Cuba. Trump’s new directive still allows individual travel in all but that category, and reverts to an earlier policy of requiring “people to people” visits only in a Treasury-licensed group.

Free and open source software is as close to a true meritocracy as anything else I’ve found. It doesn’t matter what the color of your skin is, what your gender is or where you live; your value is judged simply by your contributions to the project. I wrote up my favorite instance of that for opensource.com concerning my friend Alejandro who got involved with OpenNMS when he lived in Venezuela. He and his wife are now permanent residents in the US due to his work on our project.

I actually forget how I came across CubaConf, but I was immediately interested in attending. This is an annual free software conference held in Havana, Cuba.

CubaConf

It has been illegal for US Citizens to travel to Cuba since before I was born. Last year the Obama administration eased some of those restrictions, so it is now possible, under certain conditions, to travel to Cuba as well as to use US Dollars while there.

Cuba has been pretty isolated since the 1960s, and as it races to catch up with the rest of the world it will need access to modern technology, especially software. I see an opportunity for free software to play a huge role in the future of that country, and I am eager to meet the people who will help make that happen.

I want to use this post to encourage all of my free and open source software friends to come to CubaConf. This is a three-day event that follows a format similar to one we used for our OpenNMS user conferences. The first day is a normal conference, with various tracks and presentations set to a schedule. The second day is a “barcamp” style conference where the attendees will set the agenda, and the third day is a hackathon.

Presentations are welcome in both Spanish and English, so I’ve submitted two talks (both in English). One is on starting an open source business. This will be different from my usual talk as I want to focus on how someone in Cuba could both spread the use of free software while getting paid to do it, without as much focus on setting up a corporation or other formal business entity. The second talk is on OpenNMS. While business transactions are still difficult between the US and Cuba, I really want to bring the magic that is OpenNMS to their attention so that when things ease between our countries people will be familiar with it.

I plan to attend all three days, and Alejandro is coming with me to help with any language issues (my Spanish is passable but not nearly as fluent as a native speaker). Note that the Call for Papers is open until the end of August.

Since you might be hesitant to consider going to Cuba from the United States, I wanted to share with you how it works.

First, tourism to Cuba for Americans is still illegal. However, the State Department has come up with a list of 12 categories which qualify for visiting.

12 Visa Categories for Cuba

In the case of CubaConf, you will choose either number four “Professional research and professional meetings” or number five “Educational activities”. I guess number six might work “Public performances, clinics, workshops, athletic and other competitions, and exhibitions” since it is kind of a workshop, but I’d stick with the first two. Since I am a free software professional, I plan to use number four, as I consider this a professional meeting.

Note that Cuba couldn’t care less about why you are there – this is a requirement of the US government.

Second, once you have a legal travel category, you’ll need a visa. In speaking with my favorite airline, American Airlines (they offer direct flights to Havana from Charlotte, NC, and Miami, FL), once you book your travel they will outsource the visa process to Cuba Travel Services who will handle the whole thing via e-mail. The visa costs $50 and it looks like there may be a $35 fee, but I’m not sure if the fee applies if you are referred via the airline and it may be built into the price of the ticket.

Speaking of things included in the price, the third thing to consider is that all Americans traveling to Cuba must have non-US health insurance. This is included as a $25 charge when you purchase your ticket.

That covers much of the “getting there” part. The fourth, and in my mind most important thing to know is that Cuba is still very much a cash-only country. American banks are still not doing business there so your credit cards won’t work, nor will the ATM, so you’ll need to bring cash. I verified this with calls to Bank of America, Chase and Citi – currently none of those banks have cards that work in that country.

There are two types of currency in use: The Cuban National Peso (CUP) and the Cuban Convertible Peso (CUC) or “kook”. The CUC is pegged to the dollar and is the currency used by most visitors. Luckily, Havana is a pretty safe place, although I still won’t want to carry around a lot of money if I can avoid it.

I’m not sure where I will stay. Being a big Marriott fan I do have the option to stay at the Four Points Sheraton, but it seems to be pretty far away from the Colegio Universitario San Gerónimo where the conference will be held. Most people visiting stay in a “casa particular” which is a room in someone’s house, and it appears that Airbnb is also in Cuba.

I plan to use the open source way and just ask my friends organizing the conference where I should stay. It is very easy to do, as they have set up a Telegram channel for the conference. While Spanish is the main language in the channel, English is welcome, and if you are thinking about coming to CubaConf I would consider going there first.

I am very excited about the opportunity to visit Havana in November. Despite the modern history between the US and Cuba, I know I’ll make some new friends.

Software libre crea amistades inmediatas.

Horizon™ Version 20 Released

Just a heads up that version 20 of Horizon has been released.

Since version 20 coincides with the 20th anniversary of the film The Fifth Element, we’ve decided to use characters from that movie as codenames for this release. Version 20.0.0 is called “Leeloo”.

This release continues our commitment to rapid releases in the Horizon product line, and is mainly focused on bug fixes, small enhancements and code cleanup. We have removed all use of Castor for the parsing of XML files and replaced it with JAXB, and a number of deprecated events have been removed from the system.

Probably the biggest new feature is a topology provider that can be used to create custom maps. The Asset Topology Provider generates a GraphML topology based on node metadata including asset fields.
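To give a feel for what that means, here is a rough Python sketch of the general idea: group nodes by one asset field and write the result out as GraphML. This is not the Asset Topology Provider's code or its exact output format. The function name, the sample nodes and the "building" field are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def assets_to_graphml(nodes: dict, field: str) -> str:
    """Build a GraphML document linking each node to a group vertex
    named after the value of one (hypothetical) asset field."""
    ET.register_namespace("", GRAPHML_NS)
    root = ET.Element(f"{{{GRAPHML_NS}}}graphml")
    graph = ET.SubElement(root, f"{{{GRAPHML_NS}}}graph",
                          id="assets", edgedefault="undirected")
    seen_groups = set()
    for label, assets in nodes.items():
        group_id = f"{field}:{assets.get(field, 'unknown')}"
        if group_id not in seen_groups:
            # One vertex per distinct asset value, created on first use.
            seen_groups.add(group_id)
            ET.SubElement(graph, f"{{{GRAPHML_NS}}}node", id=group_id)
        ET.SubElement(graph, f"{{{GRAPHML_NS}}}node", id=label)
        ET.SubElement(graph, f"{{{GRAPHML_NS}}}edge",
                      source=group_id, target=label)
    return ET.tostring(root, encoding="unicode")


# Made-up inventory: node label -> asset fields.
sample_nodes = {
    "router-1": {"building": "HQ"},
    "switch-1": {"building": "HQ"},
    "ap-1": {"building": "Warehouse"},
}
graphml = assets_to_graphml(sample_nodes, "building")
print(graphml)
```

The resulting document has one vertex per building and one per device, with an edge tying each device to its building, which is the kind of structure a map can be drawn from.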

You can read the announcement and for more information, check out the release notes.

Why the FCC’s Title II is so Important (Spectrum Rant)

Here is a rant about Time Warner/Charter/Spectrum or whatever the heck they call themselves these days. It illustrates how this large company can have a huge negative impact on a small business, and why treating Internet providers as common carriers is so important.

Our company wouldn’t exist without the Internet. Outside of the fact that our products are mainly used to monitor Internet resources, we host a number of servers from our office and about half of the staff works remotely so we rely on the Internet to communicate and coordinate.

Back in 2012 I contracted with Time Warner to provide Internet access to our office. We had fiber to the building and while our service was considerably more expensive than coax, I liked the fact that it was symmetrical and expandable. We started off with 20 Mbps but soon increased that to 50 Mbps. Over five years we only had one outage, due to a misconfiguration of our Customer Premise Equipment (CPE), and they corrected it within 20 minutes. I loved the fact that when you called in, the person who answered the phone understood terms like “duplex”, and they were always very helpful.

Note the scenario: happy customer who is happy paying a premium for enterprise-level service.

Now let me tell you why all that goodwill has gone away.

Earlier this year we decided to move our office from Pittsboro, NC to Apex, NC. The first thing I did was contact Time Warner (well, Charter at the time) to ensure that they could provide fiber to the new location. They said they could, although it would take 45 to 60 days. As our new office space needed to be completed, we were targeting an April 1st move in date anyway, so on February 15th I placed the order for the new service. At best, it would be available on the 1st and at worst it would be ready by the 15th. We told the old landlord we’d be out by April 30th just in case and to give us more time to move.

Finally, Spectrum doubled our speed and cut the price in half. I was feeling pretty good about the whole thing.

The feeling didn’t last.

As we got closer to April, things started to go wrong, most of it due to the fact that Spectrum is now such a behemoth that they have no idea what they are doing. In order to get fiber into our new building, they needed what is called a “Right of Entry”. They sent it to our landlord who promptly completed the form and sent it back. However, that person didn’t let the project manager know the form had been received, so he did absolutely nothing. Ten days (!) later I get a note that our build out had been suspended because of the lack of the ROE form. A form, I should point out, that was sent to them, twice.

At the end of March I’m told that our new date is May 11th. I’m unhappy – due to their poor processes I now have a new office that I can’t use for six weeks (remember, we took possession and started paying rent on April 1st). We also had to be out of the old office by the end of April. Luckily I work with a great team that is able to be productive when working from home, so I decided to suck it up and live with it.

On April 12th I get an update – the new date for the end of construction is now May 15th due to processes within Spectrum taking too long to finalize the work with a contractor. Now the actual date we’ll have Internet has been pushed out to the week of May 29th.

I am livid. By this point I’m ready to switch to the other option, AT&T. Unfortunately, they also need 45 to 60 days for service installation so I realize at this point I’m stuck with Spectrum.

I ask my salesperson for options and he suggests we get coax installed for a month (for a fee, of course). Since our office is right next to a large housing development they can get coax in the following week. I sign off on it.

It didn’t happen. When May arrived some of us started working in the new office mooching off the neighbor’s Wi-Fi from AT&T (with permission of course). I ended up traveling for a couple of weeks so I completely forgot about the coax option (it’s not like Spectrum was keeping me updated on anything – I’d have to reach out to them for an update). I did get a note on May 10th that all construction had been completed for the fiber and another note on May 18th that our new install date was June 2nd.

(sigh)

So, 45 days late, we have a firm install date. Wonderful.

Imagine how I felt when on the 24th of May I received a note that more construction was needed and that it would be pushed out another 30 days at least. When I get extremely angry I refer to it as going “non-linear” as that is how fast my blood pressure rises. As I was ranting to pretty much everyone I’d ever interacted with at Spectrum it dawned on me that this could be for the coax order. Turns out that was the case. Apparently our crack project manager on the coax side decided to route our service from a point several miles away instead of from the one nearly across the street. This is why it was delayed and why the construction was needed. By this time we are about a week out from having fiber so I canceled the order. I did get a very apologetic call from the coax salesperson which I appreciated (under Spectrum, fiber [Enterprise] is handled by one sales team and coax [Business] is handled by another), and I made it clear that I’d be okay with everything as long as the fiber was delivered as promised on the 2nd.

It was. Around noon on June 2nd we had our 100 Mbps service and on the 3rd we moved all of our devices from the old office in Pittsboro to the new one in Apex. I informed my salesperson that they could disconnect the old service and despite all of the problems, I was happy with the new service.

So the whole process cost me two months’ rent and a few years off my life, but it was finally over.

Not so fast – the other shoe fell today.

I get an e-mail that I need to confirm my disconnect request. That didn’t bother me, in fact I appreciated it, but what did bother me was an additional note that it would be done within 30 days. When I replied I asked for clarification – would I be *paying* for the service I wasn’t using until they could disconnect it? The answer was “yes”.

I experienced a new word – apoplectic.

Due to the fact that the bureaucracy behind the new merged Spectrum company is so bad, I’m out nearly ten thousand dollars. That is the real money – it’s probably cost us twice that again in lost productivity from lack of network access and dealing with them throughout this process. We’re not one of those companies that is too big to fail so this really impacts us negatively. Had it been explained to me that I’d have to pay for the service until it was disconnected, I would have put the disconnect order in a month ago, but then had I used the date I was originally promised, our servers would have been off-line for over a month. That would have been catastrophic to our company.

Finally, I’ve gone from a happy customer to an extremely pissed off one who will be actively looking for options. Based on my experience I would suggest any business looking for network access look elsewhere.

Access to the Internet has become as important as other utilities such as electricity, water and sewer and just like those utilities it needs to be regulated as one. This is why the decision by the new industry-picked head of the FCC to reverse the decision to classify Internet access under Title II as a “common carrier” is so devastating to businesses like mine. Our company is small, yet we put millions of dollars into the local economy each year. You multiply that by the number of other small businesses and it can have a great impact on any community. Barriers put up by companies like Spectrum demonstrate that they can’t self-regulate and the government needs to take a firmer hand (and this is coming from a left-leaning libertarian).

I will be protesting that final bill for Internet access and I would welcome any advice on how to deal with a company like Spectrum. Let’s hope that there is a change soon so that other businesses can focus on creating value and not have to deal with the crap we had to endure.

I’m not holding my breath.

Service Outage Tomorrow, Saturday June 3rd

Wonder of wonders, Time Warner/Charter/Spectrum/whatever has finally delivered connectivity to our new office, albeit a month late.

So, we’ll be moving a number of servers from our old location to the new one, which means certain things, such as demo and Bamboo, will be down for a few hours. Almost everything else is hosted elsewhere and redundant, so we shouldn’t have any other issues.

Sorry for the outage and thanks for your patience.

Monitoring? Meh.

Recently, I was talking to a person in the tech industry and describing all of the cool things we are doing with OpenNMS, when he kind of cut me off and went “Oh, monitoring? Meh.”

Well, I can’t remember if there was an actual “meh” but that’s how it came across, and I’m afraid the reaction is probably more common than I would think. Monitoring isn’t sexy, but it surprises me that people can’t see how critical it will be to the future of any business.

IoT Devices Over Time

While forecasts vary, by 2020 there are expected to be over 30 billion devices on the Internet, and that figure will skyrocket to over 75 billion by 2025. Just knowing what is connected to your business network is going to become critical, as well as making sure it belongs there in the first place and, if so, is functioning properly.

Outside of the obvious security concerns, as people begin to transact business more and more through devices rather than people, faults in those devices will directly impact revenue as people search for other options when faced with a bad experience.

Here are a couple of examples.

One of the greatest inventions in my lifetime is the ability to buy fuel at the pump. You just pull up, swipe your card, pump and then leave. You used to have to pay inside, and some places made you pay first which meant two trips in if you were paying by credit card. It could be cold or rainy, and not only did you have to wait in line behind people buying food or lottery tickets, you had to leave your car out by the pump possibly blocking the next customer.

The only problem I’ve experienced with this process concerns the receipt. Quite frequently I need a receipt, but it seems the pumps I choose are always out of paper. The little red indicator mark when the paper roll is almost finished isn’t visible to the cashier since there really isn’t one out by the pump. It is frustrating, but it is not like I have a choice at the moment. If there was some way to monitor the pump for a “low paper” alarm, it would improve my shopping experience.

One shopping experience that did result in my leaving the store without a purchase happened yesterday at a Lowe’s Home Improvement store. I needed some fluorescent lights for the new office so I went by on my way home. I picked up four bulbs (two that I needed and two spares) and went to the checkout area.

I walked past several unmanned cash registers until I got to the “Self Checkout” section, which was the only thing open. Of the four machines, two had red blinking lights on them (that are green when things are functioning normally) and the one lone, overworked cashier was doing her best to help people out. I usually don’t mind using Self Checkout and when I noticed one of the two machines was open (everyone else was waiting for the attention of the lone cashier) I went to it and started my purchase.

I scanned my “My Lowe’s” card and then the first bulb. “Eight ninety-five” piped up the voice and I placed it in a bag.

Here is where the problems started. First, I hate the fact that with these Self Checkout kiosks they don’t trust you to use a “quantity” key. I was buying four identical items but I was required to scan each one. Next, the bulb was light enough that it didn’t register as having been bagged, so the interface yelled at me and presented me with a button marked “Skip Bagging Item?”.

I sighed and, having no other option, hit the button. I then went on to scan the next three bulbs. However, as I bagged the fourth bulb, the scale must have started working since the whole unit went into some kind of alarm mode, screeching “Unidentified Object in the Bagging Area!” and the screen was locked until the cashier had time to come and fix it.

I looked around the area, and by this time all four kiosks had a flashing red light, there were at least three shoppers lined up to use them in addition to those of us already there, and our valiant cashier was busy helping a guy ring up his plumbing supply purchase which consisted of a ton of small copper fittings which most likely wouldn’t be registered by the scale.

I gave up. I picked up my bulbs and returned them to the Lighting section, passing three employees in the customer service area helping zero customers. Before I reached the car I’d ordered the same bulbs on Amazon at a fraction of the price, and they’ll be here on Friday.

Yes, I’m complaining, but how could monitoring have helped here? First, there is some sort of monitoring – those little red lights. When they all light up you would assume someone, or perhaps multiple someones, would come by to help. A monitoring system could have made sure that happened by using an additional notification system outside of the lights, and escalating it until the problem was addressed.
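To make the escalation idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the step names, the delays and the `targets_to_notify` function are made up for illustration, and it is not how OpenNMS (or Lowe’s) actually configures notifications. It only shows the principle that an unacknowledged alarm reaches more people the longer it stays open.

```python
from dataclasses import dataclass


@dataclass
class EscalationStep:
    delay_minutes: int  # minutes after the alarm is raised before this step fires
    target: str         # who (or what) gets notified at this step


# Hypothetical escalation path for a stuck self-checkout kiosk.
KIOSK_PATH = [
    EscalationStep(0, "kiosk light"),
    EscalationStep(2, "self-checkout cashier"),
    EscalationStep(5, "customer service desk"),
    EscalationStep(10, "store manager"),
]


def targets_to_notify(path, minutes_open, acknowledged):
    """Return every target that should have been notified by now.

    Once someone acknowledges the alarm, escalation stops.
    """
    if acknowledged:
        return []
    return [step.target for step in path if step.delay_minutes <= minutes_open]
```

With this path, a kiosk that has been stuck for six minutes with nobody responding would have notified the light, the cashier and the customer service desk, which is roughly the point at which I walked out.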

A more long-term solution would be to collect information on the purchasing experience and the problems people encountered and to make changes to the automated kiosk software. I’m certain that Lowe’s didn’t write that software but instead bought it, and like most proprietary software solutions they now have to fit their processes to the application instead of the other way around. It probably wasn’t designed for a store that sells a lot of small, light things, which is central to the issues I have using it.

With the rise of IoT devices, robotics and other forms of automation, monitoring is going to become extremely important. Lowe’s lost out on a $40 sale, but think of something like an assembly line where a problem could result in the loss of thousands of dollars a minute. Our goal at OpenNMS is to be ready for it, and to build products that make people go “Monitoring? Oh yeah!”.

Server Room Nightmares

I’m interested in any server room nightmares people would like to share.

Here’s one of mine.

We are in the process of moving offices from Pittsboro, NC down the road to Apex. Unfortunately, we are having some issues getting Spectrum Enterprise to complete the fiber installation at the new place, so while we are out of our old building, the lack of network access in the new building means we still have a bunch of servers in the old location.

Today while I was working in the new office and mooching off our kind neighbor’s wi-fi, I got several notices that links had failed.

linkDown event list

These were some workstations that we use for training, but when they are not in use we use them as part of our continuous integration Bamboo farm. I immediately hopped on our Mattermost IT channel and asked if anyone was rebooting or otherwise messing with the machines, and when the answer was “no” I started to investigate.

One suggestion was that the air conditioning may have failed and those machines shut down from overheating. It has happened in the past, but it was rather cool today, and other machines that are more sensitive to such things were still running. I checked it out anyway using our AKCP probe.

temperature graph

The temperature had increased a bit, but it wasn’t anything that should have caused problems (it was caused by the server room door being left open).

Being 30 minutes away, I decided to text my friend Donnie, who is technically gifted and also works at our old location, and he went to investigate.

For some reason, those three machines had been disconnected from the switch.

Now just for this situation we have an Arlo camera installed in the server room, so using the time stamp on the linkDown traps I found the following video.

Note the slightly balding guy in the red shirt in the lower left corner of the video. He is busy unplugging our devices.

Why? I have no idea. These people represent the IT people for the new tenant, and I assume they had legitimate reasons for being in the server room but messing with our equipment was not one of them.

Seriously, in over 30 years of working with computers, I’ve never heard of anyone going into someone’s house, office, server room or data center and just starting to unplug cables. I still have not heard an explanation, but the landlord has had a discussion with the new tenant and it shouldn’t be happening again. It is one reason the important stuff is in that locked half-rack seen in the upper left corner of the video, and the really important stuff is hosted elsewhere.

I am curious – I’m certain this pales compared to other stories out there. Do you have any whoppers to share?

New Meridian® Releases Available

Just a quick note to point out that new Meridian releases are now available: 2015.1.5 and 2016.1.5

For those who aren’t aware, Meridian is a subscription-based version of OpenNMS built to complement Horizon, the cutting edge release. You can think of Meridian as our Red Hat Enterprise Linux to Horizon’s Fedora. There is one major Meridian release per year and each major release is supported for three years.

Before the Meridian/Horizon split it was taking us 18 months or so to do a new major release of OpenNMS. Now we do three to four Horizon major releases a year.

About half of our revenue comes from support contracts and so we had to be extra careful when doing a release, and even then many of our customers were reluctant to upgrade because the process could be involved. This was bad for two main reasons: often they wouldn’t get bug fixes, which meant an increase in support tickets, and more importantly they might miss security updates.

Updates to Meridian, within a major release, are dead simple. This is the process I used yesterday to upgrade our production instance of OpenNMS.

First, I made a backup of the /opt/opennms/etc and /opt/opennms/jetty-webapps/opennms directories. The first is out of habit since configuration files shouldn’t change between point releases, but the second is to preserve any customizations made to the webUI. I modify the main OpenNMS page to include a “weather widget” and that customization gets removed on upgrades. Most users won’t have an issue but just in case I like having a backup.

Next, I stopped OpenNMS and ran yum install opennms, which downloads and installs the new release. The final step was to run /opt/opennms/bin/install -dis to ensure the database is up to date.

And that’s it. In my case, I copy the index.jsp from my backup to restore the weather information, but otherwise you just restart OpenNMS. The process takes minutes and is basically as fast as your Internet connection.
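For those who like to see the steps laid out, here is a rough sketch of that sequence as a small Python script. The directories, the yum command and the install -dis step are the ones described above; the backup file name, the use of tar, and stopping and starting the service with /opt/opennms/bin/opennms are my assumptions for illustration, so adjust them to match your own system. It defaults to a dry run that only prints the commands.

```python
import subprocess

OPENNMS_HOME = "/opt/opennms"


def upgrade_plan(backup_file="/tmp/opennms-backup.tar.gz"):
    """Return the upgrade steps, in order, as argv lists.

    backup_file is a hypothetical location; pick one that suits you.
    """
    return [
        # Back up the configuration and any webUI customizations (tar is an assumption).
        ["tar", "-czf", backup_file,
         f"{OPENNMS_HOME}/etc", f"{OPENNMS_HOME}/jetty-webapps/opennms"],
        # Stop the service (assumed control script; use your init system if you prefer).
        [f"{OPENNMS_HOME}/bin/opennms", "stop"],
        # Download and install the new release.
        ["yum", "install", "opennms"],
        # Make sure the database is up to date.
        [f"{OPENNMS_HOME}/bin/install", "-dis"],
        # Restart OpenNMS (assumed control script, as above).
        [f"{OPENNMS_HOME}/bin/opennms", "start"],
    ]


def run_plan(plan, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for argv in plan:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # stop at the first failure


if __name__ == "__main__":
    run_plan(upgrade_plan())  # dry run: prints the commands without running them
```

Restoring something like my customized index.jsp would be an extra copy step between the install and the restart, which I have left out since most users won’t need it.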

If you have a Meridian subscription, be sure to upgrade as soon as you are able, and if you don’t, what are you waiting for? (grin)

OpenNMS Team Wins 5000€ Prize at TM Forum {open}:hack

A group of four students from Southampton Solent University, mentored by Dr. Craig Gallen, used OpenNMS to win the top prize at the TeleManagement Forum {open}:hack competition at the TM Forum Live conference in Nice, France.

{open}:hack Winners

Now, a little background is in order. Dr. Gallen founded Entimoss, our OpenNMS partner in the UK and Ireland. He got involved with OpenNMS over a decade ago when he was working on his doctoral thesis entitled “Improving the Practice of Operations Support Systems in the Telecommunications Industry using Open Source”.

Most of his work was focused on a business solution framework called NGOSS (now Frameworx) developed by the TM Forum for creating next generation OSS/BSS software and systems. Now the TM Forum is the world’s leading trade organization for telecommunications providers and at the time was not very friendly toward open source. He demonstrated how an open source platform like OpenNMS could be used to integrate with and tie together these different interfaces to build a reference implementation for part of the framework. Open source was a new concept for the industry, and we were branded the “open source pirates” at first. But Craig persisted, and in 2011 he was awarded the TM Forum’s Outstanding Contributor Award.

In addition to his persistence and ability to deal with large organizations, Craig is also a great teacher. When the TM Forum introduced its {open}:hack program, he wanted to get involved and he found several interested students at Southampton Solent University.

The goals of {open}:hack are:

  1. Accelerate industry deployment of Forum Open APIs, metamodels and architecture across the industry
  2. Validate existing APIs and provide feedback for future iterations to technical collaboration teams
  3. Create IoT/Smart City & NFV/SDN solutions leveraging the Forum Open APIs
  4. Accelerate the incubation of new digital business opportunities in the areas of 5G Network Services & IoT/Smart City
  5. Create extensions to Forum Open APIs to be shared with industry

Participants were given access to APIs from the TM Forum, Huawei, Salesforce and Vodafone, which included things like data from drones, and tasked with creating something beneficial. The Solent team’s project, called “Port-o-matic”, was an application for accessing services at shipping ports, as well as measuring environmental factors such as pollution. This was especially relevant to them since Southampton is the UK’s number one cruise port and second largest container port (the Titanic set sail from there).

{open}:hack architecture

Their solution leveraged the power of the OpenNMS platform to tie all of these APIs together and then to provide aggregated data to their web application. It can scale to almost any size using the new OpenNMS “Minion” feature which can distribute data collection and monitoring out to the edges of a network, offloading the need to have all of the functionality in a central location and positioning OpenNMS for the Internet of Things (IoT).

The hardest thing to get across to people new to OpenNMS is that it is a platform and not strictly an application. The learning curve can be steep and it is hard to see its value straight out of the box. I love the fact that solutions like the “Port-o-matic” demonstrate the power of OpenNMS.

It is also interesting to note that the second place prize went to a team from Red Hat. For an organization like the TM Forum that was wary of open source to demonstrate such a change of heart is encouraging, and I credit Dr. Gallen with a lot of that advancement.

{open}:hack Group Photo

So congratulations to Joe Appleton, Jergus Lejko, Michael Sievenpiper and Marcin Wisniewski, the winners of this latest {open}:hack competition, and I look forward to seeing more great things from you in the future.