Dealing with Docker Interfaces

We run a lot of instances of OpenNMS (’natch) and lately we’ve seen issues with disk space being used up faster than expected.

We tracked the issue down to Docker. If Docker is running on a machine, SNMP will discover a Docker interface, usually labelled “docker0”. When that instance is stopped and restarted, or another Docker instance is created, yet another interface appears. Each of these interfaces generates RRD files of limited usefulness, so here is how to address it.

First, we want to tell OpenNMS not to discover those interfaces in the first place. This is done using a “policy” in the foreign source definition for the devices in question. Here is what it looks like in the webUI:

Skip Docker Interfaces Policy

The “SNMP Interface Policy” matches on various fields in the snmpinterface table in the database, including ifDescr. The regular expression matches any ifDescr that starts with the string “docker”, and the action tells OpenNMS not to persist (add) the interface to the database. Since the policy has only one match parameter, either “Match All Parameters” or “Match Any Parameter” will work.

If you want to use the command line, or have a lot of custom foreign source definitions, you can paste this into the proper file (usually $OPENNMS_HOME/etc/foreign-sources/<foreign-source>.xml):

   <policies>
      <policy name="Ignore Docker interfaces" class="org.opennms.netmgt.provision.persist.policies.MatchingSnmpInterfacePolicy">
         <parameter key="action" value="DO_NOT_PERSIST"/>
         <parameter key="ifDescr" value="~^docker.*$"/>
         <parameter key="matchBehavior" value="ALL_PARAMETERS"/>
      </policy>
   </policies>

This will not deal with any existing interfaces, however. For that there are two steps: delete the interfaces from the database and delete them from the file system.

For the database, with OpenNMS stopped, access PostgreSQL (usually with psql -U opennms opennms) and run:

delete from ipinterface where snmpinterfaceid in (select id from snmpinterface where snmpifdescr like 'docker%');

and restart OpenNMS.
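If you want to double-check which interfaces the delete will touch, you can run a preview query first. This is just a sketch, assuming the same connection details as above:

psql -U opennms -d opennms -c "select id, snmpifdescr from snmpinterface where snmpifdescr like 'docker%';"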

For the filesystem, navigate to where your RRDs are stored (usually /opt/opennms/share/rrd/snmp) and run:

find . -type d -name "docker*" -exec rm -r {} \;
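Note that find may print “No such file or directory” errors as it tries to descend into directories it has just deleted. These are harmless, but adding -prune (which tells find not to descend into the matching directories) avoids them:

find . -type d -name "docker*" -prune -exec rm -r {} \;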

That should get rid of existing Docker interfaces, free up disk space and prevent new Docker interfaces from being discovered.

Open Source is Still Dead

Last week I attended the 20th O’Reilly Open Source Conference (OSCON), held this year in Portland, Oregon.

OSCON 20th Anniversary Sign

This is the premier open source conference in the US, if not the world, and it is rather well run. It is equal to, if not better than, many of the proprietary technology conferences I’ve attended, perhaps because it is pretty much a proprietary software conference in itself. I found it a little ironic that the Wednesday morning keynotes started off with a short, grainy video clip in which an open source geek shouts out “We’re starting a revolution!”.

I tried to find the source of that quote, and I thought it came from the documentary “Revolution OS”. That movie chronicles the early days of open source software, in which the stated goal was to take back software from large companies like Microsoft. There is a famous quote by Eric S. Raymond where he replies to a person from Microsoft with the words “I’m your worst nightmare.” Microsoft is now a major sponsor of OSCON.

When I attended OSCON in 2014 I asked the question “Is Open Source Dead?” Obviously the open source development model has never been more alive, but I was thinking back to my early involvement with open source, when the idea was to move control of software out of the hands of big companies like IBM and Microsoft and into the hands of the users. Back then the terms “open source” and “free software” were synonymous. It was obvious that open source operating systems, mainly Linux, would rule the world of servers, so the focus was on the desktop. No one in open source predicted the impact of mobile and, by extension, the “cloud”. Open source today is little more than a development model used mostly to help create proprietary software, usually provided as a subscription or a service over the network. I mean, it makes sense. Companies like Google, Facebook and Amazon wouldn’t exist today if it weren’t for Linux. If they had to pay a license to Microsoft or Sun (now Oracle) for every server they deployed, their business models simply wouldn’t work, so using open source to build the infrastructure for their applications makes sense.

Please note that I am not trying to make any sort of value judgement. I am still a big proponent of free software, and there are companies like Red Hat, OpenNMS and Nextcloud that try to honor the original intention of open source. All of us, open and proprietary, benefit from the large amount of quality open source software being created these days. But I do mourn the end of open source as I knew it. It used to be that open source software was published with “restrictive” licenses like the GPL, whereas now the trend is to move to “permissive” licenses like the MIT or Apache licenses. This allows for the commercialization of open source software, which in turn creates an incentive for large software companies to get involved.

This trend was seen throughout OSCON. The “diamond” sponsors were companies like IBM, Microsoft, Amazon and Google. The main buzzword was “Kubernetes” (or “K8s” if you’re one of the cool kids), an open source orchestration layer for managing containers. Almost all of the expo companies were cloud companies that either used open source software to provide a platform for their applications or created open source agents that feed data back to their proprietary cloud back-ends.

I attended my first OSCON in 2009 as a speaker, and I was a speaker for several years after that. My talks were always well-attended, but then for several years none of my paper submissions were accepted. I thought I had pissed off one or more of the organizers (it happens) but perhaps my thoughts on open source software had just become outdated.

I still like going to the conference, even though I no longer attempt to submit a talk. When I used to speak, I found I spent most of my time on the Expo floor, so now I just try to schedule other business during the week of OSCON and get a free “Expo only” pass. The pass also includes access to the keynotes, so I was sure to be in attendance as the conference officially started.

OSCON Badge

My favorite keynote was the first one, by Suz Hinton from Microsoft. She is known for doing live coding on the streaming platform Twitch, and she did a live demonstration for her keynote. She used an Arduino to control a light sensor and a servo. When she covered the sensor, the servo would move and “wave” at the OSCON audience. It was a little hard to fight the cognitive dissonance of a Microsoft employee using a Mac to program an open hardware device, but it was definitely entertaining.

OSCON Suz Hinton

My second favorite talk was by Camille Eddy. As interactions between computers and humans become more automated, a number of biases are starting to appear. Google Photos had a problem where it would label pictures of black people as “gorillas”. An African-American researcher at MIT named Joy Buolamwini found that a robot recognized her better if she wore a white mask. Microsoft had an infamous experiment where it created a Twitter bot named “Tay” that within 24 hours was making racist posts. While not directly related to open source, a focus on an issue that affects the user community is very much in the vein of classic open source philosophy.

OSCON Camille Eddy

The other keynotes were from Huawei, IBM and Amazon (when you are a diamond sponsor you get a keynote) and they focused more on how those large software companies were using the open source development model to, well, offset the cost of development.

OSCON Tim O'Reilly

The Wednesday keynotes closed with Tim O’Reilly, who talked about “Open Source and Open Standards in the Age of Cloud AI”. It kind of cemented the theme for me that open source had changed, and that the idea is now much more about tools development and open APIs than about creating user-owned software.

OSCON Expo Floor

The rest of my time was spent wandering the Expo floor. OSCON offers space to traditional open source projects in an area I usually refer to as the “Geek Ghetto”. This year it was split across either side of the main area, and I got to spend some time chatting with people from the Software Freedom Conservancy and the Document Foundation, among others.

OSCON Geek Ghetto

I enjoyed the conference, even if it was a little bittersweet. Portland is a cool town and the people around OSCON are cool as well. If I can combine the trip with other business, expect to find me there next year, wandering the Expo floor.

Prodigal Customers

Growing up in the southern United States meant Sunday mornings were spent at Sunday School. One of the stories we would study was the Parable of the Prodigal Son. A man has two sons. The younger son asks for his inheritance in advance and he goes off and squanders it. When he returns, his father throws a big celebration to welcome him back.

I never really got the point of that story, as I always identified with the older, dutiful son, so it is surprising that it took working with OpenNMS for me to understand it.

We have great customers. Since we do little marketing, before we get a customer they have to first discover OpenNMS, then investigate it to see if it meets their needs, and only then contact us. It means that they are self-selecting, and without exception they are incredibly smart, physically beautiful and possessed of a wit so sharp they make Ginsu knives look dull. (grin)

The first company to ever buy an OpenNMS support subscription did so in December of 2001, and this year they renewed for the 17th time. It is a wonderful testament to the work of the team that they created something to inspire such a long commitment.

That said, we do lose a few customers each year. The first one I lost was a little heartbreaking. It was a hospital in Virginia, and when I called them to see if they would renew their support subscription they told me “no”. I was a little shocked, as I was unaware of any problems, and they hadn’t opened tickets in a while. They told me that was the point: they loved OpenNMS, and they were still using it, but it “just worked”, so they saw no value in paying for support.

A more common case for us losing a customer is that our “internal champion” leaves. OpenNMS is a complex and powerful tool, and it does take a while to climb the learning curve to see its full potential. If all of that knowledge is focused on one person, and that person leaves, their replacement can be overwhelmed and seek out something simpler, even if it is more expensive and less powerful.

I am always saddened when this happens, but lately we’ve been experiencing what I’m calling “Prodigal Customers”. These are customers who leave and come back.

Cartoon by Chad Essley http://www.cartoonmonkey.com

I love them, and always want to slaughter (figuratively) the fattened calf to welcome them back.

It’s hard to explain, but while it is wonderful to have someone use something you’ve created for almost two decades straight, it is almost more rewarding to have someone go and try something else and discover it doesn’t stack up. Heck, I’d love it if all our customers could try out every possible option, because those that then chose OpenNMS for their solution would truly recognize what an awesome platform it can be.

Being 100% open source, OpenNMS does not have any way to “lock in” a particular customer. You can use it with our services or without, but you always have access to the latest code. Thus choosing to use OpenNMS is a validation of the work we’ve put into it, and whether you are a long time customer, a new customer, or a “prodigal” customer, your preference to use OpenNMS makes all the work to create it worthwhile.

2018 New Zealand Network Operators Group (NZNOG)

One thing that all open source projects struggle with is getting users. Most people in IT and software are overwhelmed with a plethora of information and options, and matching the right material to the right audience is a non-trivial problem.

Last year my friend Chris suggested that I speak at a Network Operators Group (NOG) meeting, specifically AusNOG. It was a lot of fun, and I felt very comfortable among this crowd, so I decided to reach out to more NOGs to see if they would be interested in learning about OpenNMS.

The thing I like the most about NOGs is that they value getting things done above all else. While “getting things done” is still important with the free and open source crowd, there seems to be more philosophy and tribalism at those shows. “Oh, that’s written in PHP, it must suck” etc. As a “freetard” I live for the philosophical and social justice aspects of the community, but from a business standpoint it doesn’t translate well into paying customers.

At NOGs the questions are way more business-focused. Does it work? Is it supported? What does it cost? While I’m admittedly biased toward OpenNMS and its open source nature, the main reason I keep promoting it is that it just makes solid business sense for many companies to use it instead of their current solution.

Plus, these folks are pretty smart and entertaining while dispensing solid advice and knowledge.

Anyway, with that preamble, at AusNOG I learned about the New Zealand NOG (NZNOG) and submitted a talk. It got accepted, and I found myself in Queenstown.

NZNOG Scenery

The main conference was spread out over two days, and like AusNOG it consisted of 30 to 45 minute talks in one track.

While I know it won’t work for a lot of conferences, I really like the “one track” format. It exposes me to things I wouldn’t have gone to otherwise, and if there is something I am simply not interested in learning about I can use that time to catch up on work or participate in the hallway track.

NZNOG Clare Curran

The conference started with a presentation by the Honorable Clare Curran, a newly minted Member of Parliament (they recently held elections in New Zealand). I’m slowly seeing politicians getting more involved in information technology conferences, which I think is a good thing, and I can only hope it continues. She spoke about a number of issues the government is facing with respect to communications technology.

Several things bother me about the US government, but one big one is the lack of understanding of the importance of access to the Internet at broadband speeds. Curran stated that “lack of reliable high-speed network access is a new measure of poverty”. Later in the day John Greenhough spoke on New Zealand’s Ultra-Fast Broadband (UFB) project, where on one slide broadband was defined as 20Mbps download speed.

NZNOG John Greenhough

Where I live in the US I am lucky to get 10Mbps, and many of my neighbors are worse off, yet the government is ceding more of the decision-making process about where to build out new infrastructure to the telecommunications companies, which have zero incentive to improve my service. It’s wonderful to see a government realize the benefits of a connected populace and take steps to make it happen.

Because we all need Netflix, right? (grin)

There was a cool talk about how Netflix works, and I didn’t realize that they partner with communications providers to offer low-latency, geographically distributed content delivery. They supply providers with caching content servers so that customers can access Netflix content while minimizing traffic over expensive backhaul links.

NZNOG Netflix RRD

I did find it cool that one of the bandwidth graphs presented was obviously done using RRDtool. I don’t know if they collected the data themselves or used something like OpenNMS, but I hope it was the latter.

With this push for ubiquitous network access comes other concerns. New Zealand has a law called TICSA that requires network providers to intercept and store network traffic data for use by law enforcement.

NZNOG Lawful Intercept

I thought the requirements were pretty onerous, but I was told that the NZ government did set aside some funds to help providers deploy solutions for collecting and storing this data (though I doubt it will cover the whole cost, especially over time). The new OpenNMS Drift telemetry project might be able to help with this.

NZNOG Aftab Siddiqui

There were a couple of talks I had seen in some form at AusNOG. The ever entertaining Aftab Siddiqui talked about MANRS (Mutually Agreed Norms for Routing Security) but unlike in Australia he was hard pressed to find good examples of violations. Part of that could be that New Zealand is much smaller than Australia, but I’m giving the NZ operators the credit for just doing a good job.

NZNOG NetNORAD

The Facebook folks were back to talk about their NetNORAD project. While I have a personal reluctance to deploy agents, there really isn’t a way to measure latency at the detail they want without them. I think it would be cool to be able to gather and manage the data created by this project under OpenNMS.

NZNOG Geoff Huston

What I like most about these NOG meetings is that I always learn something cool, and this one was no different. Geoff Huston gave a humorous talk on DNSSEC and handling DNS-based DDoS attacks. While I was somewhat familiar with DNSSEC, I was unaware of the NSEC part of it.

Most DNS DDoS attacks work by asking for non-existent domains, and the overhead in processing them is what causes the denial of service. The domain name is usually randomly generated, such as jeff123.example.com, jeff234.example.com, etc. If the DNS server doesn’t have the name in its cache, it will have to ask another DNS server, which in turn won’t have it, as the name doesn’t exist.

The NSEC part of DNSSEC, when responding to a request for a non-existent name, will return the next valid name. In the example above, if I ask for jeff123.example.com, the example.com DNS server can reply that the name is invalid and, in addition, that the next valid name is www.example.com. If implemented correctly, the original DNS server should then never query for jeff234.example.com, since it knows it, too, doesn’t exist.
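You can watch this happen with dig. This is just a sketch, assuming a DNSSEC-signed zone; the exact records in the reply depend on the zone contents:

dig +dnssec jeff123.example.com A

# The status will be NXDOMAIN, and the AUTHORITY section will include
# NSEC records (with their RRSIG signatures) proving that no names exist
# in the gap, along the lines of:
#   example.com.  3600  IN  NSEC  www.example.com. A NS SOA RRSIG NSEC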

Pretty nifty.

NZNOG Rada Stanic

One talk I was eagerly awaiting was from Rada Stanic at Cisco. She also spoke at AusNOG, but I had to leave early and missed it. While she disrespected SNMP a little more than I liked (grin), her talk was on implementing new telemetry-based monitoring using protocols such as gRPC. OpenNMS Drift will add this functionality to the platform. Our experience so far is that the device vendors’ implementations of the telemetry protocols leave something to be desired, but the approach does show promise.

NZNOG Ulf

It was nice being in New Zealand again, and our mascot Ulf seemed to be popular with the locals. Can’t imagine why.

2018 Linuxconf Australia Sysadmin Miniconf

I just wanted to put up a quick post on my trip to Linuxconf Australia (LCA) being held this week in Sydney.

First, a little background. I’ve been curtailing my participation in free and open source software conferences for the last couple of years. It’s not that I don’t like them, quite the opposite, but my travel is funded by The OpenNMS Group and we just don’t get many customers from those shows. A lot of people are into FOSS for the “free” (as in gratis) aspect.

Contrast that with telcos and network operators, who tend to have the opposite viewpoint (if they aren’t spending a ton of money then they must be doing it wrong), and you can see why I’ve been spending more of my time focusing on that market.

Anyway, we have recently signed up a new partner in Australia, R-Group International, to help us work with clients in the Pacific Rim countries, and I wanted to come out to Perth and do some training with their team. Chris Markovic, their Technical Director (and “mobius” on the OpenNMS chat server), suggested I come out the week after LCA, so I asked the LCA team if they had room on their program for me to talk about OpenNMS. They offered me a spot on their Sysadmin Miniconf day.

Linuxconf Australia Sign

The conference is being held at the University of Technology, Sydney (UTS) and I have to say the conference hall for the Sysadmin track was one of the coolest, ever.

Linuxconf Australia - UTS Lecture Hall

The organizers grouped three presentations together dealing with monitoring: one on Icinga 2, one from Nagios and mine on OpenNMS. While I don’t know much about Icinga, I do know the people who maintain it and they are awesome. One might think OpenNMS would have an antagonistic relationship with other FOSS monitoring projects, but as long as they are pure FOSS (like Icinga and Zabbix) we tend to get along rather well. Plus I’m jealous that Icinga is used on the ISS.

Linuxconf Australia Icinga2 Talk

I think my talk went well. I only had 15 minutes, and for once I think I was a few seconds under that limit. While it wasn’t live-streamed, it was up on YouTube very quickly, and you can watch it if you want.

I had to leave LCA to head to the New Zealand Network Operators Group (NZNOG) meeting, so I missed the main conference, but I am grateful the organizers gave me the opportunity to speak and I hope to return in the future.

Linuxconf Australia During a Break

Conferences: Australia, New Zealand and Senegal

Just a quick note to mention some conferences I will be attending. If you happen to be there as well, I would love the opportunity to meet face to face.

Next week I’ll be in Sydney, Australia, for linux.conf.au. I’ll only be able to attend for the first two “miniconf” days, and I’ll be doing a short introduction to OpenNMS on Tuesday as part of the Systems Administration Miniconf.

Then I’m off to Queenstown, New Zealand for the New Zealand Network Operators Group (NZNOG) conference. I will be the first presenter on Friday at 09:00, talking about, you guessed it, OpenNMS.

The week after that I will be back in Australia, this time on the other side in Perth, working with our new Asia-Pacific OpenNMS partner R-Group International. We are excited to have such a great partner bringing services and support for OpenNMS to organizations in that hemisphere. Being roughly 12 hours out from our home office in North Carolina, USA, can make communication a little difficult, so it will be nice to be able to help users in (roughly) their own timezone.

Plus, I hope to learn about Cricket.

Finally, I’m excited that I’ve been asked to do a one day tutorial at this year’s African Network Operators Group (AfNOG) in Dakar, Senegal, this spring. The schedule is still being decided but I’m eager to visit Africa (I’ve never been) and to meet up with OpenNMS users (and make some new ones) in that part of the world.

I’ll be posting a lot more about all of these trips in the near future, and hope to see you at at least one of these events.

Welcome to 2018

I love New Year’s. Not exactly the party on New Year’s Eve, as I tend to spend it as a quiet evening with friends, but the idea of starting over and starting fresh.

It is also a good time to reflect on the year past. While 2017 was pretty tumultuous for the world at large, for OpenNMS it was a pretty good year.

Our decision to split OpenNMS into two versions is still paying off. We did three major releases of Horizon (19, 20, and 21) as well as point releases every month there wasn’t a major release, and Meridian 2017 finally came out, although later than I would have liked. Horizon users get to experience rapid advancements in power and features while Meridian users can relax knowing their system is very stable and secure.

While it is hard to pick out the best features added in 2017, I’d have to go with OpenNMS Helm and the Minion.

Helm allows you to combine and manage multiple instances of OpenNMS from a Grafana dashboard.

OpenNMS Helm

The Minion is our foray into the whole “Internet of Things” space with an application that can be installed on a small device and used to send remotely collected data to a central OpenNMS instance. Minions have minimal configuration and can be configured redundantly, yet they have the ability to collect massive amounts of monitoring data. We’re very eager to see what novel uses our users come up with for the technology (we have one customer that is “Minion-only”, i.e. they do no monitoring or collection from the central OpenNMS instance at all and instead just put two Minions at each location).

As for the OpenNMS Group, the company behind OpenNMS, we experienced modest growth but still had a record year for gross revenue. What is more exciting is that net income was also a record and several hundred percent above last year, so we are going into 2018 well positioned in our Business Plan of “Spend less than you earn”.

2018 should be exciting. The OpenNMS Drift project brings telemetry (flow) data into OpenNMS, and we are working on some exciting features regarding correlation which will probably involve new machine learning technology.

As always, these features will be available as 100% free and open source software.

Personally, I added three new countries to my list, bringing the total number of countries I’ve been in to forty. I had a great time in Estonia and Latvia, and I really enjoyed my trip to Cuba.

One last thing. If you are reading this you are probably a user of OpenNMS. If so, thank you. We are a small but dedicated group of people creating this platform and often we don’t get much feedback on who uses it and what they like about it. The fact that people do find it useful makes it worthwhile, and we wouldn’t exist without our users and clients.

So, Happy New Year, and may 2018 exceed your wildest expectations.

Update on Expensify

I recently posted a rant on how a vendor we use, Expensify, appeared to be exposing confidential data to workers with the Amazon Mechanical Turk service. In response to the general outcry, they posted a detailed explanation on their blog.

It did little to change my mind.

So apparently what happened is that they used the Mechanical Turk from 2009 to 2012, so if you were a customer back then your information was disclosed to those third-party workers. Then they stopped, supposedly switching to some other, similar, in-house system.

But, some genius there decided that the best way for certain customers to ensure their receipts were truly private was to have them use the Mechanical Turk with their own staff. I covered that in my first post, and it is so complex it hardly registers as a solution.

Of course, they decided to test this new “solution” starting the day before the American Thanksgiving holiday. This was done using receipts from “non-paying customers”. While we pay to use the service (not for much longer), if you were trying it out for free your receipts were exposed to Mechanical Turk workers. Heh, if you aren’t paying for the product you are the product. The post goes on to talk about the security of the Mechanical Turk service, which was surprising because they went on and on about how they didn’t use it.

What really angered me was this paragraph:

The company was away with our families and trying hard to be responsive, while also making the most of a rare opportunity to be with our loved ones. Accordingly, this vacuum of information provided by the company was filled with a variety of well-intentioned but inaccurate theories that generated a bunch of compounding, exaggerated fears. As a family-friendly business we try hard to separate work life from home life, and in this case that separation came at a substantial cost.

Well, boo hoo. If you truly cared about your employees you wouldn’t start a major beta test the day before a big holiday. I spent my holiday worrying about my employees’ personal data possibly being exposed through the Expensify service. Thanks for that.

What pisses me off the most is this condescending Silicon Valley speak that their lack of transparency is somehow our fault. That our fears are just “exaggerated”. When Ryan Schaffer posted on Quora that nothing personal is included on receipts, he demonstrated a tremendous lack of understanding about something on which he should be an expert. As they turn this new leaf and try to be more transparent, I noticed he deleted his answer from the Quora question.

Smells like a cover up to me.

Look, I know that being from North Carolina I can’t possibly understand all the nuances of the brain-heavy Valley, but if Expensify truly does have a “patented, award-winning” methodology for scanning receipts, why don’t they just make that available to their customers instead of using the Turk? This long-winded defense of the Turk seems like they are protesting too much. Something doesn’t make sense here.

I’ve told my folks to stop using SmartScan and that we would move away from Expensify at the end of the year. If you use, or are planning to use, Expensify you should deeply consider whether or not this is a company you want to associate with and if they will act in your best interests.

I decided the answer was “no”.

Dougie Stevenson – The Elvis of Network Management

David messaged me yesterday that Dougie Stevenson had died.

I hadn’t seen Dougie in person in a long time, but I’d kept up with him through the very networks he, in part, helped manage. While I had heard he wasn’t in the best of health, the news of his passing hit me harder than I expected.

I can’t remember the first time I met Dougie. I do remember it was always Dougie, rarely Doug and never Douglas. While most adults might drop such a nickname, it is a reflection on his almost childlike friendliness and good nature that he kept it. I do know that I was working at a company called Strategic Technologies at the time, so this would be the mid-1990s. I was working with tools like HP OpenView, and I’d often run into Dougie at OpenView Forum events. When he decided to take a job at Predictive Systems I followed him, even though it meant commuting to DC four to five days a week.

It was at Predictive that I got to see his genius at work. With his unassuming nature and down-to-earth mannerisms it was easy to miss the mind behind them, but when it came to seriously thinking about the problems of managing networks there were few who could match his penchant for great ideas. I used to refer to him as the “Elvis” of network management.

We were both commuters then. While he had lived in many places, he called Texas home as much as I do North Carolina. We were working on a large project for Qwest near the Ballston metro stop, and after work we’d often visit the nearby Pizzeria Uno. The wait staff loved to see Dougie, and would always laugh when he referred to the cheese quesadillas appetizer as “queasy-dillies”. This was back during the first Internet bubble, around 1999, and while many of us were working hard to make our fortune, Dougie never really cared that much for money. He used to joke it would all go to his ex-wives anyway. I know he had been married but we didn’t talk too much about that aspect of his life. He’d much rather talk about the hotrod pickup truck he was always working on when he had the time. I do remember he once walked away from a small fortune over principles – that was just the kind of person he was.

I can’t remember the last time I saw Dougie, but it could have been in Austin back in 2008. I have this really bad picture I took then:

Dougie and Me

Notice he has on his OpenNMS shirt. He never failed to promote our efforts to create a truly free and open source network management platform whenever he could.

As I’ve gotten older, I wish more for time than money. Between the business and the farm I’m kept so busy that I rarely get to spend as much time with the amazing people I know, and it would have been nice to see Dougie at least once more. In any case, a small part of him lives on in the hearts and minds of those who did know him.

Though it saddens me to say it, Elvis has left the building.

Expensify and Why I Hate the Cloud

Over the weekend I found out that Expensify, a service I use for my company, outsources a feature to Amazon’s Mechanical Turk service. Expensify handles the management of business expenses, which for a company like ours can be problematic as we do a lot of travel when deploying services. The issue is that the feature, the “smart scanning” of receipts, could potentially expose confidential data to third parties. As a user of Expensify, this bothers me.

Expensify touts “SmartScan” as:

As background, SmartScan is the patented, award-winning technology that underpins our “fire and forget” design for expense management. When you get a receipt, rather than stuffing it into your pocket to dread for later, just:

1. Take your phone out of your pocket
2. SmartScan the receipt
3. Put your phone back in your pocket

What they never told us is that if their “patented, award-winning technology” can’t read your receipt, they send it to the Mechanical Turk, which in turn presents it to a human being who will interpret the receipt manually. The thing is, we have no control over who will see that information, which could be confidential. For example, when I post a receipt for an airline ticket, it may include my record locator, ticket number and itinerary, all of which are sensitive.

This apparently never occurred to the folks at Expensify. Take this Quora answer from Ryan Schaffer, listed as Expensify Director of Marketing & Strategy:

Also, its worth mentioning, they don’t see anything that can personally identify you. They see a date, merchant, and amount. Receipts, by their very nature, are intended to be thrown away and are explicitly non-sensitive. Anyone looking at a receipt is unable to tell if that receipt is from me, you, your neighbor, or someone on the other side of the world.

Wrong, wrong, wrong. It seems that Mr. Schaffer may limit his business expenses to the occasional coffee at Starbucks, but for the rest of us they are rarely that simple. For someone whose job is to perfect dealing with receipts, his view is pretty myopic.

For examples of what Expensify exposes, take a look at this tweet by Gary Pendergast.

Information Exposed by Expensify Tweet

It is also worth noting that it appears Expensify does its business on the Mechanical Turk as “Fluffy Cloud” instead of Expensify, which strikes me as a little disingenuous.

In a blog post this morning the company addressed this:

As you might imagine, doing this is easier said than done. Given the enormous scale and 24/7 nature of this task, we have agents positioned around the world to hand off this volume from timezone to timezone. Most of the US team is located in Ironwood, MI or Portland, OR (where we have offices and can train in person). Most of the international team is in Nepal or Honduras (where we work with a third-party provider to manage the on-site logistics). But regardless of the location, every single agent is bound by a confidentiality agreement, and subject to severe repercussions if that agreement is broken.

But if this were true, why are random people on Twitter announcing that they can see this data? Are they relying on the Amazon agreement with the people working as part of the Mechanical Turk? That doesn’t instill much confidence in me. But then in the same blog post they double down, and suggest that if you want extra security, you can just set up your own staff as part of the Mechanical Turk:

1. You hire a 24/7 team of human transcription agents.
   - For the fastest processing we suggest staffing three separate shifts — or daytime shifts in three different offices around the world. Otherwise your receipts might lag for many hours before getting processed.

2. They apply to Amazon Mechanical Turk for an account. Be aware that this is a surprisingly involved process, including:
   - The agent must sign up using their actual personal Amazon account. If your account doesn’t have an adequate history of purchases (each of which implies a successful credit card billing transaction and package delivery) or other activity, you will be rejected.
   - The agent must provide their full name, address, and bank account information for reimbursement. Amazon verifies this with a variety of techniques (eg, confirm that your IP is in the country you say you are, verify the bank account is owned by the name and address provided, full criminal background check), and if anything doesn’t add up, you will be rejected.
   - Rejection is final. It requires such an abundance of verifiable documentation (most notably being an active Amazon account with a long history) that you can’t just create a new account and try again.
   - There is no apparent appeals process. Accordingly, I would recommend confirming before hiring that the candidate can pass Amazon Mechanical Turk’s many strict controls because we have no ability to override their judgement.

3. You notify us of the “workerID” of each of your authorized agents.
   - Though you are not obligated to share your staff’s identity with us directly, your staff will still be obligated to follow the Expensify terms of services. Failure to comply with our terms will result in an appropriate response, starting with immediate banning by our automated systems, ranging up to our legal team subpoenaing you (or failing that, Amazon) for the identity of the agent to press charges directly.

4. We will create a “Qualification” for your “Human Intelligence Tasks” (HITs) that ensures only your agents will see your receipts.

5. Your staff will use the Amazon Mechanical Turk interface to discover and process your employee’s receipts.

That’s the solution? This is what passes for security at Expensify? Hire three shifts of employees, all using verified personal Amazon accounts, and then you can be sure your confidential data is kept confidential?

Wouldn’t it just be easier to create a small webapp that would present receipts to people in a company directly without going through the Mechanical Turk? Heck, why not just bounce it back out to you – it isn’t that great of a chore.

In essence, Expensify is saying that if you don’t do all of this, they can’t keep your information secure.

This is what frustrates me the most about “the Cloud”. Everyone is in such a rush to deploy solutions that they just don’t think about security. Hey, it’s only receipts, right? Look what I was able to find out with just a discarded boarding pass – receipts can have much more information. And this is from a company that is supposed to be focused on dealing with expenses.

I demand two things from companies I trust with our information in the cloud: security and transparency. It looks like Expensify has neither.

I will be moving us away from Expensify. If you know of any decent solutions, let me know. Xpenditure looks pretty good, and since they are based in the EU perhaps they understand privacy a little better than they do in San Francisco.