Fifteen Years

On Sunday my mother celebrated her 75th birthday.

Although a happy occasion, why is this relevant to an open source blog? Well, it was soon after her 60th birthday in 2002 that I started my first company around OpenNMS.

I did not start OpenNMS, it began in the summer of 1999, with the first code posted on Sourceforge in March of 2000 by a company called Oculan. I started working with Oculan in September of 2001, and in May of 2002 they decided to stop contributing to OpenNMS. I saw the potential, so I asked Steve Giles, the founder and CEO, if I could have the OpenNMS project. He looked at his watch and said if I was off his payroll by Friday, he’d give me the domain names, a couple of servers, and he would sprinkle water on me and I would be the new OpenNMS maintainer.

That was actually the easy part. Explaining to my wife that I had quit my job and started a company “selling free software” was a bit harder.

sortova.com from archive.org circa May 2002

And thus Sortova Consulting Group was born. It was named after my farm. When Andrea and I decided we wanted to have a farm, we first bought raw land. In driving out from Raleigh to work on it we would pass this little farm with a barn, some cows, etc., and on the mailbox was a sign reading “Almosta Farm”. I joked that if that was “almost a farm” then what we had was just “sort of a farm”. Later, when we bought the place where we still live, the name Sortova Farm stuck.

We pronounce it “Sore-toe-va”. Only one customer ever pulled me aside and asked if it really meant “sort of a” consulting group. He laughed when I confirmed that it did.

Considering that I didn’t have any prior business experience, Java experience, or even real Internet access at my home, it is amazing that OpenNMS survived to this day. It is a wonder what you can accomplish with pure stubbornness.

Now my one true superpower is my ability to get the most fantastic people on the planet to work with me. The first group of those came from the OpenNMS community. When I was running Sortova it was the gang that later became the Order of the Green Polo that kept me going, mainly through mailing lists and IRC. In September of 2004 my good friend and business partner David Hustace and I founded the OpenNMS Group, and that corporation is still going strong. In 2009 we mortgaged our houses to buy the copyright to the Oculan OpenNMS code and thus brought all of it back under one organization, and two of the original OpenNMS team at Oculan now work for OpenNMS.

When I visit Silicon Valley I often get to meet some brilliant people, but the joy of this can be offset by the pervasive attitude of focusing on technology simply to make money. I know of a number of personally successful people who built companies, sold them, and then those products vanished into obscurity. Remember VA Linux? Their stock rose over 700% on the first day of trading, but where are they now? Did they ever deliver on their promises to the stockholders?

I want to build with OpenNMS something that will last well beyond my involvement with the project. I’ve gotten it to the point where I am not longer expressly required to make it thrive, but I am still working on its legacy. We want it to be nothing less than the de facto standard for monitoring everything, which is a high bar.

Note that I still would like to make a lot of money, but that isn’t the core driving force of the business. Our mission statement is “Help Customers – Have Fun – Make Money” in that order. If you have happy customers and happy employees, the money will come.

Fifteen years ago I made a leap of faith, in both myself, my family and my friends. I’m extremely happy I did.

OpenNMS Group Turns Twelve

Heh, it almost slipped my mind completely but The OpenNMS Group turned 12 years old today.

I did have to go give our co-founder, David Hustace, a hug and if we weren’t so slammed it would have been time for a beer. Raincheck.

I did spend a second reflecting on our wonderful customers who make all this possible, as well as all the people who contribute to and use OpenNMS. There are a lot of people who don’t believe a company can survive with a 100% open source model, but the funny thing is that we’ve outlived quite a few proprietary software companies in the last decade or so, thus we must be doing something right.

Our business plan of “Spend Less Money Than You Earn” and our mission statement of “Help Customers, Have Fun, Make Money” are as true today as they were in 2004. I look forward to getting ever better on delivering on both of them.

Kippis!

New Additions to OpenNMS

I am very happy to announce that Chris Manigan has joined the OpenNMS team.

Chris has been using OpenNMS since 2010 when he worked at Towerstream in Rhode Island. He gave us a very nice testimonial for the website, and has a lot of experience with using OpenNMS as scale.

Chris Manigan

He put that experience to use at Turbine, insuring that their infrastructure could deliver gaming content to users who demand performance. Now he’s going to use that experience to insure that OpenNMS is ready to take on the Internet of Things, for both our internal infrastructure and those of our customers.

I also want to announce that Jesse White, our CTO, and his wife Sara welcomed Charles White into the world early yesterday morning.

Charles White

Weighing 7 pounds and 11 ounces, he is already writing code in Python and we hope to have him making commits in Java in the next week or so.

Nagios XI vs. OpenNMS Meridian – the Return of the FUD

It seems like our friends over at Nagios have been watching a little too much election coverage this year, and they’ve updated their “Nagios vs. OpenNMS” document with even more rhetoric and misinformation.

As my three readers may recall, back in 2011 I tore apart the first version of this document. Now they have decided to update it to target our Meridian™ version.

Let’s see how they did (please look at it and follow along as it is quite amusing).

The first misleading bit is the opening paragraph with the phrase “most widely used open-source monitoring project in the world”. Now, granted, they do indicate that means “Nagios Core” but it seems a little disingenuous since what they are selling is Nagios XI, which is much different.

Nagios XI is not open source. It is published under the “Nagios Open Software License” which is about as proprietary as they get. I’m not even sure why the word “open” was added, except to further mislead people into thinking it is open source. The license contains clauses like “The Software may not be Forked” and “The Software may only be used in conjunction with products, projects, and other software distributed by the Company.” Think about it, you can’t even integrate Nagios XI with, say, a home grown trouble ticketing system without violating the license. Doesn’t sound very “open” at all. OpenNMS Meridian is published under the AGPLv3, or a similar proprietary license should your organization have an issue with the AGPL. You don’t have that choice with Nagios XI.

Next, let’s check out the price. The OpenNMS Group has always published its prices on-line. One instance of Meridian, which includes support in the form of access to our “Connect” community, is $6,000. They have it listed as $25,995, which is the price should you choose the much more intensive “Prime” support option. I’m not sure why they didn’t just choose our most expensive product, Ultra Support with the 24×7 option, to make them seem even better.

Nagios XI Node Limitation

Also, note the fine print “Price based on one instance of XI with 220 nodes/devices”. There is no device limit with OpenNMS Meridian. So let’s be clear, for $6000 you get access to the Meridian software under an open source license versus $5000 to monitor 220 nodes with extreme limitations on your rights.

Our smaller customers tend to have around 2000 devices, which means to manage that with Nagios XI you would need roughly ten instances costing nearly $50,000 (using the math presented in this document). And from the experience we’ve heard with customers coming to us from Nagios, the reason it is limited to so few nodes is that you probably can’t run much more on a single instance of Nagios XI. Compare that to OpenNMS where we have customers with over 100,000 devices in a single instance (and they’ve been running it for years).

We also price OpenNMS as a platform. You get everything: trouble-ticketing integration, graphing, reporting, etc. in one application. It looks like Nagios has decided to nickel and dime you for logs, etc. and a thing called “Nagios Fusion” which you’ll need to manage your growing number of Nagios instances since it won’t natively scale. And remember, due to the license you are forbidden from using the software with your own tools.

I especially had to laugh at the “You Speak, We Listen” part. If you have a feature or change you need, if you ask nicely they might make it for you. With OpenNMS Meridian you are free to make any changes you need since it is 100% open source, and with our open issue tracker we address dozens of user requests each point release.

Finally, there is the feature comparison, which at a minimum is misleading and is often just blatantly false. Almost every feature marked as lacking in Meridian exists, and at a level far beyond what Nagios XI can provide. Seriously, is it really objective to state that OpenNMS doesn’t support Nagvis, a specific tool that even has “Nagios” in the name?

Nagvis

I had to laugh at the hubris. They obviously didn’t Google “opennms nagvis“, because, guess what? There has been an OpenNMS Nagvis integration for some time now, contributed by our community. Just in case you were wondering, we have an integration with Network Weathermap as well.

Nagios is just another proprietary software product that wants to lock you into its ecosystem, and this is just a shameful attempt to monetize an application that is long past its prime. Heck, it was the inability of the Nagios leadership to get along with others that resulted in the very popular Icinga fork, and with it Nagios lost a lot of contribution that helped make up its “Thousands of Free Add-Ons” (and the way Nagios took over the community lead plug-in site was also poorly handled). Plus, many of those add-ons won’t scale in an enterprise environment, which probably lead to the 220 device limit.

Compare that to OpenNMS. We not only want to encourage you integrate with other products, we do a lot of it for you. OpenNMS has great graphing, but we also created the first third party plug-in for Grafana. When it comes to mapping, OpenNMS is on the leading edge, with a focus on various topology views that can ultimately handle millions of devices in a fashion that is actually usable. Need to see a Layer 2 topology? Choose the “enhanced linkd view”. Run VMware and Vcenter? It is simple to import all of your machines and see them in a view that shows hosts, guests and network storage. Plus the unique ability to focus on just those devices of interest allows you to use a map with hundreds of thousands if not millions of nodes.

Nagios Map

Compare that to the Nagios map screenshot where it looks like “localhost” is having some issues. Oh no, not localhost! That’s like, all of my machines.

As for “Business Process Intelligence” I’ve been told that the Nagios XI version is like our Business Service Monitor “Except BSM is more featureful, and has a significantly better UI/UX”. Need real Business Intelligence? OpenNMS has Red Hat Drools support, the open source leader, built right into the product.

We also support integration with popular Trouble Ticketing systems such as Request Tracker, Jira, OTRS and Remedy. And the kicker is that you can also run any Nagios check script natively in OpenNMS using the “System Execute Monitor“, but once you get used to the OpenNMS platform, why would you?

I’m not really sure why Nagios goes out of its way to spread fear, uncertainty and doubt about OpenNMS. We rarely compete in the same markets. I’m sure that Sunrise Community Banks get their money’s worth from Nagios, and for companies like NRS Small Business Solutions, Nagios might be a good fit. But if you have enterprise and carrier-level requirements, there is no way Nagios will work for you in the long term.

When a company does something like this to mislead, from wrong information about our product to using terms like “open” when they mean “closed”, it shows you what they think of their competition. What does it say about what they think about their customers?

2016 PB and Jam

OpenNMS is headquartered in the idyllic small town of Pittsboro, NC, sometimes just called “PBO”. Since a number of people who come to Dev-Jam travel a fair distance, we’ve started a tradition of a “mini Dev-Jam” the week after, hosted at OpenNMS HQ.

This is much more focused on the work of The OpenNMS Group, but it is still a lot of fun. Last night as a team building exercise we decided to try an “escape” room.

This is a a relatively new thing where a group of people get put in a room and they have a certain amount of time to figure out puzzles and escape. Jessica set us up with Cipher Escape in their “Geek Room” which was the only one that could accommodate 11 of us.

It’s a lot of fun. For our experience we were lead into about a 15×15 room and given the following backstory: you are watching your neighbors cat while they are on vacation and after you feed her you realize you are locked in their house. You have 60 minutes to escape.

One thing I thought was funny was that the room was dotted with little pink stickers and we were told that these indicate things that don’t need to be manipulated (e.g. there was a picture frame that when you turned it over you would see the stickers, which meant you weren’t supposed to take it apart). I can only imagine the beta testing that went into determining where to put the stickers (our hostess specifically mentioned that you didn’t need to take the legs off the furniture).

To tell anything more would spoil it, but I was extremely proud that the team escaped with over 10 minutes to spare (we missed the best time by ten minutes, so it wasn’t close, but we did beat a team from Cisco that didn’t escape at all).

Escape Room Success

It was a ton of fun, and I’d put this team up against any challenge.

Afterward, most of us went out for sushi at Waraji. I’ve known the owner Masatoshi Tsujimura for almost 30 years, and even though they were packed they were able to set us up with a tatami room.

Waraji Dinner

It’s a bit out of the way for me to visit often, so I was happy to have an excuse for a victory celebration.

Emley Moor, Kirklees, West Yorkshire

I spent last week back in the United Kingdom. I always find it odd to travel to the UK. When I’m in, say, Germany or Spain, I know I’m in a different country. With the UK I sometimes forget and hijinks ensue. As Shaw may have once said, we are two countries separated by a common language.

Usually I spend time in the South, mainly Hampshire, but this trip was in Yorkshire, specifically West Yorkshire. I was looking forward to this for a number of reasons. For example, I love Yorkshire Pudding, and the Four Yorkshiremen is my favorite Monty Python routine.

Also, it meant that I could fly into Manchester Airport and miss Heathrow. Well, I didn’t exactly miss it.

I was visiting a big client that most people have never heard of, even though they are probably an integral part of your life if you live in the UK. Arqiva provides the broadcast infrastructure for much of the television and mobile phone industry in the country, as well as being involved in deploying networks for projects such as smart metering and the Internet of Things.

We were working at the Emley Moor location, which is home to the Emley Moor Mast. This is the largest freestanding structure in Britain (and third in the European Union). With a total height of 1084 feet, it is higher than the Eiffel Tower and almost twice as high as the Washington Monument.

Emily Moor Mast View

The mast was built in 1971 to replace a metal lattice tower that fell, due to a combination of ice and wind, in 1969. I love the excerpt from the log book mentioned in the Wikipedia article:

  • Day: Lee, Caffell, Vander Byl
  • Ice hazard – Packed ice beginning to fall from mast & stays. Roads close to station temporarily closed by Councils. Please notify councils when roads are safe (!)
  • Pye monitor – no frame lock – V10 replaced (low ins). Monitor overheating due to fan choked up with dust- cleaned out, motor lubricated and fan blades reset.
  • Evening :- Glendenning, Bottom, Redgrove
  • 1,265 ft (386 m) Mast :- Fell down across Jagger Lane (corner of Common Lane) at 17:01:45. Police, I.T.A. HQ, R.O., etc., all notified.
  • Mast Power Isolator :- Fuses removed & isolator locked in the “OFF” position. All isolators in basement feeding mast stump also switched off. Dehydrators & TXs switched off.

They still have that log book, open to that page.

Emily Moor Log Book

If you have 20 minutes, there is a great old documentary on the fall of the old tower and the construction of the new mast.

On my last day there we got to go up into the structure. It’s pretty impressive:

Emily Moor Mast Up Close

and the inside looks like something from a 1970s sci-fi movie:

Emily Moor Mast Inside

The article stated that it takes seven minutes to ride the lift to the top. I timed it at six minutes, fifty-seven seconds, so that’s about right (it’s fifteen seconds quicker going down). I was working with Dr. Craig Gallen who remembers going up in the open lift carriage, but we were in an enclosed car. It’s very small and with five of us in it I will admit to a small amount of claustrophobia on the way up.

But getting to the top is worth it. The view is amazing:

View from Emily Moor Mast

It was a calm day but you could still feel the tower sway a bit. They have a plumb bob set up to measure the drift, and it was barely moving while we were up there. Toby, our host, told of a time he had to spend seven hours installing equipment when the bob was moving four to five inches side to side. They had to move around on their hands and knees to avoid falling over.

Plumb Bob

I’m glad I wasn’t there on that day, but our day was fantastic. Here is a shot of the parking lot where the first picture (above) was taken.

View of Emily Moor Parking Lot

I had a really great time on this trip. The client was amazing, and I really like the area. It reminds me a bit of the North Carolina mountains. I did get my Yorkshire Pudding in Yorkshire (bucket list item):

Yorkshire Pudding in Yorkshire

and one evening Craig and I got to meet up with Keith Spragg.

Keith Spragg and Craig Gallen

Keith is a regular on the OpenNMS IRC channel (#opennms on freenode.net), and he works for Southway Housing Trust. They are a non-profit that manages several thousand homes, and part of that involves providing certain IT services to their tenants. They are mainly a Windows/Citrix shop but OpenNMS is running on one of the two Linux machines in their environment. He tried out a number of solutions before finding that OpenNMS met his needs, and he pays it forward by helping people via IRC. It always warms my heart to see OpenNMS being used in such places.

I hope to return to the area, although I was glad I was there in May. It’s around 53 degrees north latitude, which puts it level with the southern Alaskan islands. It would get light around 4am, and in the winter ice has been known to fall in sheets from the Mast (the walkways are covered to help protect the people who work there).

I bet Yorkshire Pudding really hits the spot on a cold winter’s day.

Welcome Ecuador (Country 29)

It is with mixed emotions that I get to announce that we now have a customer in Ecuador, our 29th country.

My emotions are mixed as my excitement at having a new customer in a new country is offset by the tragedy that country suffered recently. Everyone at OpenNMS is sending out our best thoughts and we hope things settle down (quite literally) soon.

They join the following countries:

Australia, Canada, Chile, China, Costa Rica, Denmark, Egypt, Finland, France, Germany, Honduras, India, Ireland, Israel, Italy, Japan, Malta, Mexico, The Netherlands, Portugal, Singapore, Spain, Sweden, Switzerland, Trinidad, the UAE, the UK and the US.

OpenNMS is Sweet Sixteen

It was sixteen years ago today that the first code for OpenNMS was published on Sourceforge. While the project was started in the summer of 1999, no one seems to remember the exact date, so we use March 30th to mark the birthday of the OpenNMS project.

OpenNMS Project Details

While I’ve been closely associated with OpenNMS for a very long time, I didn’t start it. It was started by Steve Giles, Luke Rindfuss and Brian Weaver. They were soon joined by Shane O’Donnell, and while none of them are associated with the project today, they are the reason it exists.

Their company was called Oculan, and I joined them in 2001. They built management appliances marketed as “purple boxes” based on OpenNMS and I was brought on to build a business around just the OpenNMS piece of the solution.

As far as I know, this is the only surviving picture of most of the original team, taken at the OpenNMS 1.0 Release party:

OpenNMS 1.0 Release Team

In 2002 Oculan decided to close source all future work on their product, thus ending their involvement with OpenNMS. I saw the potential, so I talked with Steve Giles and soon left the company to become the OpenNMS project maintainer. When it comes to writing code I am very poorly suited to the job, but my one true talent is getting great people to work with me, and judging by the quality of people involved in OpenNMS, it is almost a superpower.

I worked out of my house and helped maintain the community mainly through the #opennms IRC channel on freenode, and surprisingly the project managed not only to survive, but to grow. When I found out that Steve Giles was leaving Oculan, I applied to be their new CEO, which I’ve been told was the source of a lot of humor among the executives. The man they hired had a track record of snuffing out all potential from a number of startups, but he had the proper credentials that VCs seem to like so he got the job. I have to admit to a bit of schadenfreude when Oculan closed its doors in 2004.

But on a good note, if you look at the two guys in the above picture right next to the cake, Seth Leger and Ben Reed, they still work for OpenNMS today. We’re still here. In fact we have the greatest team I’ve every worked with in my life, and the OpenNMS project has grown tremendously in the last 18 months. This July we’ll have our eleventh (!) annual developers conference, Dev-Jam, which will bring together people dedicated to OpenNMS, both old and new, for a week of hacking and camaraderie.

Our goal is nothing short of making OpenNMS the de facto management platform of choice for everyone, and while we still have a long way to go, we keep getting closer. My heartfelt thanks go out to everyone who made OpenNMS possible, and I look forward to writing many more of these notes in the future.

OpenNMS at Scale

So, yes, the gang from OpenNMS will be at the SCaLE conference this weekend (I will not be there, unfortunately, due to a self-imposed conference hiatus this year). It should be a great time, and we are happy to be a Gold Sponsor.

But this post is not about that. This is about how Horizon 17 and data collection can scale. You can come by the booth at SCaLE and learn more about it, but here is the overview.

When OpenNMS first started, we leveraged the great application RRDTool for storing performance data. When we discovered a java port called JRobin, OpenNMS was modified to support that storage strategy as well.

Using a Round Robin database has a number of advantages. First, it’s compact. Once the file containing the RRD database is created, it never grows. Second, we used RRDTool to also graph the data.

However, there were problems. Many users had a need to store the raw collected data. RRDTool uses consolidation functions to store a time-series average. But the biggest issue was that writing lots of files required really fast hard drives. The more data you wanted to store, the greater your investment in disk arrays. Ultimately, you would hit a wall, which would require you to either reduce your data collection or partition out the data across multiple systems.

No more. With Horizon 17 OpenNMS fully supports a time-series database called Newts. Newts is built on Cassandra, and even a small Cassandra cluster can handle tens of thousands of inserts a second. Need more performance? Just add more nodes. Works across geographically distributed systems as well, so you get built-in high availability (something that was very difficult with RRDTool).

Just before Christmas I got to visit a customer on the Eastern Shore of Maryland. You wouldn’t think that location would be a hotbed of technical excellence, but it is rare that I get to work with such a quick team.

They brought me up for a “Getting to Know You” project. This is a two day engagement where we get to kick the tires on OpenNMS to see if it is a good fit. They had been using Zenoss Core (the free version) and they hit a wall. The features they wanted were all in the “enterprise” paid version and the free version just wouldn’t meet their needs. OpenNMS did, and being truly open source it fit their philosophy (and budget) much better.

This was a fun trip for me because they had already done most of the work. They had OpenNMS installed and monitoring their network, and they just needed me to help out on some interesting use cases.

One of their issues was the need to store a lot of performance data, and since I was eager to play with the Newts integration we decided to test it out.

In order to enable Newts, first you need a Cassandra cluster. It turns out that ScyllaDB works as well (more on that a bit later). If you are looking at the Newts website you can ignore the instructions on installing it as it it built directly into OpenNMS.

Another thing built in to OpenNMS is a new graphing library called Backshift. Since OpenNMS relied on RRDTool for graphing, a new data visualization tool was needed. Backshift leverages the RRDTool graphing syntax so your pre-defined graphs will work automatically. Note that some options, such as CANVAS colors, have not been implemented yet.

To switch to newts, in the opennms.properties file you’ll find a section:

###### Time Series Strategy ####
# Use this property to set the strategy used to persist and retrieve time series metrics:
# Supported values are:
#   rrd (default)
#   newts

org.opennms.timeseries.strategy=newts

Note: “rrd” strategy can refer to either JRobin or RRDTool, with JRobin as the default. This is set in rrd-configuration.properties.

The next section determines what will render the graphs.

###### Graphing #####
# Use this property to set the graph rendering engine type.  If set to 'auto', attempt
# to choose the appropriate backend depending on org.opennms.timeseries.strategy above.
# Supported values are:
#   auto (default)
#   png
#   placeholder
#   backshift
org.opennms.web.graphs.engine=auto

If you are using Newts, the “auto” setting will utilize Backshift but here is where you could set Backshift as the renderer even if you want to use an RRD strategy. You should try it out. It’s cool.

Finally, we come to the settings for Newts:

###### Newts #####
# Use these properties to configure persistence using Newts
# Note that Newts must be enabled using the 'org.opennms.timeseries.strategy' property
# for these to take effect.
#
org.opennms.newts.config.hostname=10.110.4.30,10.110.4.32
#org.opennms.newts.config.keyspace=newts

There are a lot of settings and most of those are described in the documentation, but in this case I wanted to demonstrate that you can point OpenNMS to multiple Cassandra instances. You can also set different keyspace names which allows multiple instances of OpenNMS to talk to the same Cassandra cluster and not share data.

From the “fine” documentation, they also recommend that you store the data based on the foreign source by setting this variable:

org.opennms.rrd.storeByForeignSource=true

I would recommend this if you are using provisiond and requisitions. If you are currently doing auto-discovery, then it may be better to reference it by nodeid, which is the default.

I want to point out two other values that will need to be increased from the defaults: org.opennms.newts.config.ring_buffer_size and org.opennms.newts.config.cache.max_entries. For this system they were both set to 1048576. The ring buffer is especially important since should it fill up, samples will be discarded.

So, how did it go? Well, after fixing a bug with the ring buffer, everything went well. That bug is one reason that features like this aren’t immediately included in Meridian. Luckily we were working with a client who was willing to let us investigate and correct the issue. By the time it hits Meridian 2016, it will be completely ready for production.

If you enable the OpenNMS-JVM service on your OpenNMS node, the system will automatically collected Newts performance data (assuming Newts is enabled). OpenNMS will also collect performance data from the Cassandra cluster including both general Cassandra metrics as well as Newts specific ones.

This system is connected to a two node Cassandra cluster and managing 3.8K inserts/sec.

Newts Samples Inserted

If I’m doing the math correctly, since we collect values once every 300 seconds (5 minutes) by default, that’s 1.15 million data points, and the system isn’t even working hard.

OpenNMS will also collect on ring buffer information, and I took a screen shot to demonstrate Backshift, which displays the data point as you mouse over it.

Newts Ring Buffer

Horizon 17 ships with a load testing program. For this cluster:

[root@nms stress]# java -jar target/newts-stress-jar-with-dependencies.jar INSERT -B 16 -n 32 -r 100 -m 1 -H cluster
-- Meters ----------------------------------------------------------------------
org.opennms.newts.stress.InsertDispatcher.samples
             count = 10512100
         mean rate = 51989.68 events/second
     1-minute rate = 51906.38 events/second
     5-minute rate = 38806.02 events/second
    15-minute rate = 31232.98 events/second

so there is plenty of room to grow. Need something faster? Just add more nodes. Or, you can switch to ScyllaDB which is a port of Cassandra written in C. When run against a four node ScyllaDB cluster the results were:

[root@nms stress]# java -jar target/newts-stress-jar-with-dependencies.jar INSERT -B 16 -n 32 -r 100 -m 1 -H cluster
-- Meters ----------------------------------------------------------------------
org.opennms.newts.stress.InsertDispatcher.samples
             count = 10512100
         mean rate = 89073.32 events/second
     1-minute rate = 88048.48 events/second
     5-minute rate = 85217.92 events/second
    15-minute rate = 84110.52 events/second

Unfortunately I do not have statistics for a four node Cassandra cluster to compare it directly with ScyllaDB.

Of course the Newts data directly fits in with the OpenNMS Grafana integration.

Grafana Inserts per Second

Which brings me to one down side of this storage strategy. It’s fast, which means it isn’t compact. On this system the disk space is growing at about 4GB/day, which would be 1.5TB/year.

Grafana Disk Space

If you consider that the data is replicated across Cassandra nodes, you would need that amount of space on each one. Since the availability of multi-Terabyte drives is pretty common, this shouldn’t be a problem, but be sure to ask yourself if all the data you are collecting is really necessary. Just because you can collect the data doesn’t mean you should.

OpenNMS is finally to the point where the storing of performance data is no longer an issue. You are more likely to hit limits with the collector, which in part is going to be driven by the speed of the network. I’ve been in large data centers with hundreds of thousands of interfaces all with sub-millisecond latency. On that network, OpenNMS could collect on hundreds of millions of data points. On a network with lots of remote equipment, however, timeouts and delays will impact how much data OpenNMS could collect.

But with a little creativity, even that goes away. Think about it – with a common, decentralized data storage system like Cassandra, you could have multiple OpenNMS instances all talking to the same data store. If you have them share a common database, you can use collectd filters to spread data collection out over any number of machines. While this would take planning, it is doable today.

What about tomorrow? Well, Horizon 18 will introduce the OpenNMS Minion code. Minions will allow OpenNMS to scale horizontally and can be managed directly from OpenNMS – no configuration tricks needed. This will truly position OpenNMS for the Internet of Things.

UPDATE: Alejandro pointed out the following:

There is a property in OpenNMS for Newts that doesn’t appear in opennms.properties called heartbeat org.opennms.newts.query.heartbeat, which expects a duration in milliseconds. By default is 450000 (i.e. 1.5 x 5min) and it is used when no heartbeat is specified. Should generally be 1.5x your biggest collection interval.

If you were using 10 minute polls, set it to 15 min (1.5 x 10min), and then you will see graphs. To do this, add the following to opennms.properties and restart OpenNMS:

org.opennms.newts.query.heartbeat=900000

♫ Don’t Call It a Comeback ♫

Welcome to 2016. My year started out with an invitation to join the AARP. (sigh)

As my three readers know, when it comes to this business of open source we are pretty much making things up as we go along. We are lead by our business plan of “spend less money than you earn” and our mission statement of “help customers, have fun, make money” but the rest is pretty fluid.

In 2013 we mixed things up and tried a more “traditional” start up path by seeking out investment and spending more money than we had. It didn’t work out so well.

Thus 2014 was more of a rebuilding year as we tried to move the focus back to our roots. It paid off, as 2015 was a very good year. We had record gross revenues, and although we didn’t make much money on the bottom line, it was positive once again. At the moment we are still investing in the company and the project so pretty much every extra dollar goes into growth.

And we had a lot of growth. The decision to split OpenNMS into Meridian and Horizon paid off in three major Horizon releases. Horizon 17 was an especially large and important release as it brought in the Newts integration. At the moment we are working with it on a customer site using a ScyllaDB cluster capable of supporting 75K inserts per second. The technologies introduced in 2015 will make it in to Meridian 2016, due in the spring, and it should solidify OpenNMS as a platform that can really scale.

In 2015 we also received orders from two of the Fortune 5 companies. I’ll leave it as an exercise to the reader to guess which two and you have a 1 in 16 shot at getting it right (grin). The fact that companies that can choose, literally, any technology they want yet they choose OpenNMS speaks volumes.

One of these days we’re going to have to figure out a way to talk about our customers by name, since they are all so cool. We are working on it, but it is surprisingly difficult to get permission to publicly post that information. Above all we respect our clients’ privacy.

I have high expectations for 2016 and the power of the Open Source Way. Thanks to everyone who has supported us over the last decade and more, and we just hope you find our efforts provide some value.

Happy New Year.