MariaDB Server Fest 2022

The thing I like most about my job is that I get to meet and work with amazing people. Recently I traveled to Helsinki, Finland, to attend the MariaDB Server Fest conference. It was a great experience and I met some very talented people, including Monty Widenius himself.

Note: The usual disclaimer that this is my personal blog and what I write here does not necessarily reflect the views of my employer, Amazon Web Services.

My role at AWS is to work with open source companies and communities and to act as a liaison between them and Amazon. In thinking about important open source projects one of the first that comes to mind is MariaDB.

When I first got seriously involved in open source back in 2001, the MySQL database was an example of an open source success story. While a lot of the focus of the early days of open source was on the operating system, MySQL demonstrated that open source applications were powerful enough to compete with existing proprietary solutions. Plus, if you were building an open source application, quite often you needed a database, and MySQL provided a great option.

Many of us expected MySQL to IPO, but instead the company was bought by Sun Microsystems. That wasn’t too worrisome since Sun was a big proponent of open source, but when Sun was bought out by Oracle a couple of years later, that all changed.

On the day the acquisition was announced, Monty Widenius (the lead developer of MySQL) announced a fork of MySQL called MariaDB. In the years since then, a lot of people have replaced MySQL with MariaDB. While Oracle has continued to work on MySQL, the last major release, version 8.0, came out in April of 2018 so one must wonder how motivated they are to work on a product that competes with their main proprietary offering.

When I learned about Server Fest I decided to attend. As much as I like the ease of remote communication, sometimes nothing beats meeting face to face. I had also been to Helsinki a couple of times before and I really like the city, although I really should try to visit once in the summertime.

Boarding Sign for Flight to Helsinki

I flew from North Carolina to JFK and then took a Finnair flight to Finland. Helsinki is seven hours ahead of New York, so it is one of those weird trips where you leave in the night and land the following afternoon. When I travel I tend to stay at Marriott properties, but all the Marriott affiliated hotels were booked. I later learned that this was because a popular start-up conference called Slush was happening at the same time as Server Fest. Because of this there was no meeting space for rent, so the MariaDB event was being held at Monty’s house, which I thought was kind of cool.

Proposed schedule for Server Fest 2022

The conference was on Thursday, November 17th, and was going to be live-streamed on YouTube. In order to better match up with the time zone in New York, it started around 3pm and ran into the night. I arrived mid-morning.

When you walk into Monty’s house, the first thing you notice is that it has a very open floorplan. Directly across from the entryway is a huge table that can probably seat about 20 people, and that’s where most folks had set up their laptops. To the right of that was a large kitchen, and to the left was an open area where the walls were lined with bookcases, and that is where lights and cameras had been set up for the livestream.

Now MariaDB is organized in two parts. There is the MariaDB corporation, which is the main commercial enterprise behind the project, and there is the MariaDB Foundation, which manages project governance and promotion. Both were represented in the day’s presenters, and I also got to meet and spend a lot of time with Kaj Arnö, who is the CEO of the Foundation.

I also got to meet the true boss of the event, Anna Widenius, Monty’s wife. As you can imagine, getting a bunch of open source geeks organized is like herding cats, but she did a great job in getting the conference underway and keeping it moving.

The event kicked off with a presentation by Manuel Arostegui, who helps maintain the MySQL infrastructure at the Wikimedia Foundation.

Manuel Arostegui Presenting

He talked about “Chasing Bugs in Production”. When Wikimedia upgraded from MariaDB 10.3 to 10.6 they ran into a performance issue. His talk described their upgrade process and how they were able to work with MariaDB to get the issue addressed. I also found it interesting that they run MariaDB on bare metal. So much of today’s IT infrastructure is based on clouds and Kubernetes that it was refreshing to see someone taking advantage of individual servers when it makes sense.

There was a bit of a hiccup with the second speaker who was supposed to join remotely, so Monty Widenius moved his presentation on query optimization in MariaDB to the second slot.

Monty Widenius Presenting

There are several methods that can be used to execute a query against a database. A good database will compare the candidate plans and choose the one that returns the result in the fastest time. In MariaDB 11, Monty has changed the method to use a cost-based optimizer (versus rule-based) with parameters that can be tuned by the user. This has resulted in more efficient queries and thus a better user experience.
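To make the idea of a cost-based choice concrete, here is a minimal sketch in Python. It is not MariaDB's optimizer: the `Plan` fields, the cost parameter names, and the numbers are all made up for illustration. It only shows the general technique of estimating a cost for each candidate plan and picking the cheapest, with the cost parameters left tunable.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    rows_examined: int  # estimated number of rows the plan has to read
    random_reads: int   # estimated number of random lookups (e.g. via an index)


# Hypothetical tunable cost parameters. These are NOT MariaDB's real
# parameter names or default values.
COSTS = {"row_read": 1.0, "random_read": 4.0}


def plan_cost(plan: Plan, costs: dict) -> float:
    """Estimate the total cost of a plan from the tunable parameters."""
    return (plan.rows_examined * costs["row_read"]
            + plan.random_reads * costs["random_read"])


def choose_plan(plans: list, costs: dict) -> Plan:
    """Pick the plan with the lowest estimated cost."""
    return min(plans, key=lambda p: plan_cost(p, costs))


plans = [
    Plan("full_table_scan", rows_examined=100_000, random_reads=0),
    Plan("index_lookup", rows_examined=500, random_reads=500),
]

print(choose_plan(plans, COSTS).name)
```

With these made-up numbers the index lookup wins, but if random reads are made expensive enough (say, on slow storage) the same code prefers the full scan, which is the point of letting the user tune the parameters.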

The third presentation was by Daniel Black, who joined us via Zoom from Australia (where it was 1am the next day).

Daniel Black via Zoom with Kaj Arnö

His topic involved the community, and I really liked his comment that MariaDB is “dependent on its users to give MariaDB its purpose” which I thought was pretty insightful.

The next presentation was one of my favorites. It was on MindsDB and was presented by Jorge Torres.

Jorge Torres Presenting

MindsDB allows you to integrate machine learning easily into your database. In his example he used a model trained by Hugging Face to analyze text in order to detect “sentiment” – i.e. is the text positive, negative or neutral. And you access this using SQL queries.

For example, suppose you have a blog or other website where users can submit comments. MindsDB would allow you to examine those comments to detect general sentiment without having to learn an entirely new system. I thought it was pretty cool.

The fifth presentation was on “respecting the stack” or ensuring continuity within MariaDB. This was led by Soeren Von Varchmin along with Kaj Arnö and Dom Taylor.

Soeren Von Varchmin via Zoom with Kaj Arnö

This resonated with me as it focused on building the sponsorship community within MariaDB, for both individuals and entities. MariaDB is an important piece of technology and there are a lot of stakeholders, and this talk really reinforced the idea of a “big tent” environment within the project.

For the next presentation we finally got to hear from Federico Razzoli, founder of Vettabase (he was originally scheduled to go second but there was some time zone confusion), as he talked about new MariaDB features to learn “for a happy life”.

Federico Razzoli via Zoom with Kaj Arnö

He started off with the comment that MariaDB (and open source projects in general) are very good at creating new features and not so good about documenting or advertising them. He discussed the most recent releases of MariaDB and then highlighted various new features that people should find useful.

The seventh presentation was by Sergey Petrunia and revisited the optimizer, but focused on changes made before Monty’s changes in MariaDB 11.

Sergey Petrunia Presenting

His talk focused on those changes made in the last year (i.e. since the last Server Fest) and it looks like a lot of progress was made to make the optimizer more consistent.

The following talk by Marko Mäkelä was on Multi-Version Concurrency Control (MVCC) and it went right over my head.

Marko Mäkelä Presenting

From what I can tell, the idea behind MVCC is that active databases are constantly processing transactions but there is a need to provide a consistent “view” at a given point in time, so MVCC determines which transactions are supposed to be considered committed at that point in time and which are not. This is to prevent someone who is reading from the database from being served incomplete information.
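A toy model helps show what a consistent “view” means here. The sketch below is only an illustration in Python and is not how MariaDB or InnoDB implement MVCC: the `VersionedStore` class and its integer timestamps are assumptions made for the example. Each write adds a new version instead of overwriting the old one, and a reader only sees versions that were committed at or before its snapshot time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    value: str
    commit_ts: Optional[int]  # None means the writer has not committed yet


class VersionedStore:
    """Toy multi-version store: writes append versions, reads pick one."""

    def __init__(self):
        self.versions = {}  # key -> list of Version objects

    def write(self, key, value, commit_ts=None):
        self.versions.setdefault(key, []).append(Version(value, commit_ts))

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts."""
        visible = [
            v for v in self.versions.get(key, [])
            if v.commit_ts is not None and v.commit_ts <= snapshot_ts
        ]
        if not visible:
            return None
        return max(visible, key=lambda v: v.commit_ts).value


store = VersionedStore()
store.write("balance", "100", commit_ts=1)
store.write("balance", "250", commit_ts=5)
store.write("balance", "999")  # still in flight, not committed

print(store.read("balance", snapshot_ts=3))   # reader that started at time 3
print(store.read("balance", snapshot_ts=10))  # reader that started at time 10
```

The reader at time 3 gets the value committed at time 1, the reader at time 10 gets the value committed at time 5, and neither ever sees the uncommitted write.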

The final scheduled talk was by Andrew Hutchings who works to analyze contributions to the MariaDB project.

Andrew Hutchings Presenting

As transparency is key to any open source project, MariaDB publishes statistics on code contributions. The latest one I can find is through September of this year, and I was happy to see Amazon on the list of contributors to the MariaDB Server code.

Of course the majority of code commits, nearly 80%, were done by the MariaDB corporation, and another 14% by the MariaDB Foundation. Amazon represented 1.42% of contributions, but Andrew pointed out to me that they came from 14 unique committers versus 8 from the Foundation. I’d love to see that involvement increase.

We did get a bonus talk by Andrei Elkin on the GTID-aware mariadb-binlog.

Andrei Elkin Presenting

The binary log contains a record of all changes to the databases, both data and structure. mariadb-binlog is the command line tool that lets you examine this log, and it now supports Global Transaction IDs, making it easier to filter transactions.
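As a rough illustration of why transaction IDs make filtering easier, here is a small Python sketch. It does not call mariadb-binlog or reproduce its options. It only assumes the MariaDB GTID text format of domain ID, server ID and sequence number joined by dashes (for example `0-1-100`), and the event list, the `filter_events` helper and its inclusive range are made up for the example.

```python
from typing import NamedTuple


class Gtid(NamedTuple):
    domain_id: int
    server_id: int
    seq_no: int


def parse_gtid(text: str) -> Gtid:
    """Parse a 'domain-server-sequence' string such as '0-1-100'."""
    domain, server, seq = (int(part) for part in text.split("-"))
    return Gtid(domain, server, seq)


def filter_events(events, domain_id, first_seq, last_seq):
    """Keep events in one domain whose sequence number falls in
    [first_seq, last_seq]. The inclusive range is a choice made for
    this sketch, not a statement about the real tool's semantics."""
    selected = []
    for gtid_text, statement in events:
        gtid = parse_gtid(gtid_text)
        if gtid.domain_id == domain_id and first_seq <= gtid.seq_no <= last_seq:
            selected.append((gtid_text, statement))
    return selected


# Made-up log entries: (GTID, statement)
events = [
    ("0-1-100", "CREATE TABLE t (id INT)"),
    ("0-1-101", "INSERT INTO t VALUES (1)"),
    ("1-2-7", "INSERT INTO u VALUES (9)"),
    ("0-1-102", "DELETE FROM t WHERE id = 1"),
]

for gtid_text, statement in filter_events(events, domain_id=0, first_seq=101, last_seq=102):
    print(gtid_text, statement)
```

Because every transaction carries its own ID, picking out a range is a simple comparison rather than a hunt for file names and byte offsets.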

After the Server Fest stream was over, we got to my favorite part of any conference – the socializing.

I did spend some time talking with Manuel Arostegui. One of my friends, Eric Evans, works at the Wikimedia Foundation focused on Cassandra. It turns out that both Manuel and Eric are in the same department. Small world.

Manuel and Me

We eventually sat down to dinner prepared by the Wideniuses. Monty cooked a huge beef tenderloin, and we talked, sang songs and drank. I managed to get back to my hotel about 1am the next morning.

Usually when I travel home from Europe my flight will leave around noon and I get back in the early afternoon local time. For some reason those flights were over $1000 more than the Finnair flight that left at 5pm, so I returned to Monty’s on Friday morning to visit for a few hours.

I loved the fact that Monty was so welcoming and also that he and his family keep a lot of animals (we do the same). In addition to six cats and three dogs, there is a boa constrictor named Monty Python who is about two meters long. The story I heard was that it was a gift given to a family member that ended up at Monty’s. They originally thought it was a python but later learned it was a boa, but the name stuck.

Monty Python the Boa Constrictor

The trip home was uneventful except for the fact that I got home close to 2am and I ended up catching a bad case of influenza. To my knowledge no one else at the conference got sick, for which I’m happy, and while it knocked me out of commission for almost two weeks it was worth it.

Remaining 2022 Conferences

Today marks my three month anniversary with AWS, and I’m loving it. It has been a lot of fun returning to conferences, so I thought I’d post a list of the ones I will be attending for the rest of the year.

If you are going to any of these as well, please reach out as I miss seeing people in person and would love to catch up (or, get acquainted).

2022 SCaLE 19x – Day Four

The last day of SCaLE was bittersweet, as I didn’t want it to be over but I was also ready to head home.

After stopping by the booth I was eager to visit a session on OpenNMS presented by my friend Jeff Gehlbach.

Jeff Gehlbach presenting on network flow data

Jeff has stepped up into the presenter role I used to have, and he did a very good job of covering what network flows are, the different types and why they are important.

Back in the Exhibit Hall I was happy to learn that the AWS booth had won the “Most Memorable” award.

Most Memorable Booth Certificate

Hats off to Spot and Ashley for coming up with such a cool concept and creating a great space for people to hang out.

At 1:30pm we held a raffle for a pretty nice 3D printer. You had to be present to win and there was a lot of interest.

Spot surrounded by a crowd as the raffle winner is chosen

Then it was time to tear down the booth as the Exhibit Hall closed at 2pm.

This gave us time to get to the closing keynote by Internet pioneer Vint Cerf.

Vint Cerf in front of a podium

For someone who recently turned 79 he was a dynamic and entertaining speaker, and it was fun to listen to his stories on creating ARPANET, and how it grew into the public Internet we use today.

He also mentioned Jon Postel several times. I had an e-mail correspondence with Jon in the mid-1990s when I was trying to wrap my brain around the process for getting an “enterprise number” from IANA. I didn’t realize until after his untimely death who he was, and I’m still impressed at how much time he was willing to give a newbie like me.

While I enjoyed the presentation, I did regret that we ran out of time for details on his last slide, which concerned “unfinished business”.

Slide Listing Unfinished Business for the Internet

I mean, I get it. Each of the six topics on the slide could be a talk on its own, but I was very curious to hear his thoughts on fixing things such as disinformation. I love living in a world with almost instant access to information and the ability to connect with others, but there are problems, too, and I’m not sure we have the solutions.

All in all I am extremely happy to have been able to attend SCaLE. I’m still not comfortable in crowds and I was a little put out that not everyone in attendance decided to honor the mask policy. I talked with the SCaLE staff and they told me they were doing all they could, but even when people were reminded to mask up they tended to remove them as soon as the staff member walked away.

I was especially unhappy when I saw sponsors going maskless. On the one hand I am happy for their support of SCaLE, but on the other when you are standing in front of your company logo showing a disregard for the safety of your potential customers, it sends a bad message.

I’m not bringing this up to start a debate on the efficacy of masks, as I realize that they provide varying degrees of protection depending on type and use, but if your staff isn’t willing to abide by the conference rules, perhaps you just shouldn’t be there.

Note that I did refrain from posting the pictures I took of specific sponsors since it really wouldn’t change anything. I must be getting soft in my old age.

In any case I hope this is a non-issue for SCaLE 20x in Pasadena next March. I’m not optimistic that the pandemic will be over but for me the risk was worth the benefit, and I can’t wait to return.

2022 SCaLE 19x – Day Three

Day Three of SCaLE kicked off the start of the main conference, which meant I spent most of the day in the AWS booth.

AWS Booth Picture Showing a Television Screen Playing Frontalot Videos

Traffic was pretty good and I got to talk with a lot of interesting people. I did take a break around 2pm and noticed from Twitter that I was missing a talk by Frank Karlitschek of Nextcloud fame, so I skedaddled over to his room to catch it.

It was pretty good. It focused on how copyleft-style licenses are often better for business since they level the playing field for all contributors, versus a number of newer licenses that are more “source available” instead of “open source”.

Frank Karlitschek Presenting at SCaLE

Please note that I’m an unabashed Nextcloud fanboy so I have some biases. (grin)

The big evening event was “Game Night” where they turned the basement ballrooms into a big gaming playground. From the classics such as checkers and chess, to Vegas-style games such as roulette and blackjack, up to the most modern of games using VR, there was something for everyone.

AWS sponsored the music for the event, and I was eager to see MC Frontalot perform. He didn’t disappoint.

MC Frontalot Performing at SCaLE

He did an hour-long set spanning the classics to the newer stuff, including “Secrets From the Future” featuring a video generated using AI.

Afterward he hung out at the merch table to chat with folks, and I got to spend some time with a new friend named Silona Bonewald.

MC Frontalot and Silona Bonewald

I was introduced to Silona through Spot as she was on the same hotel shuttle bus when we arrived on Wednesday evening. She is in charge of open source at IEEE as well as being a Burner, and I always look forward to the chance to talk with her.

Today is the final day of the conference, and remember if you are reading this before 1:30pm PDT there is a raffle for an awesome 3D printer at the AWS booth, so come by to get your ticket.

2022 SCaLE 19x – Day Two

This is the first conference since joining AWS that I have booth duty, so I won’t be able to spend as much time in the sessions as I would like, but I did want to catch one of the first sessions of the day which was “Speedrunning Kubernetes”.

A slide with the session title 'Speedrunning Kubernetes'

The main reason I wanted to see this talk was to see Kat Cosgrove in action. Prior to coming to AWS I didn’t know about her but I ended up following her on Twitter and found that she has strong opinions, and I tend to like people who have strong opinions. I figured the presentation would be entertaining and that I might learn something.

I wasn’t disappointed.

Kat Cosgrove being introduced by Josh Berkus

The title alludes to a “speedrun” which is an attempt to complete a video game as quickly as possible. The goal of this talk was to bring up a working Kubernetes cluster as if you were doing a speedrun. It also included one of the more … unusual … analogies I’ve seen in a technical presentation (including my own) by using a Chihuahua as a metaphor.

A chihuahua with two cheeseburgers under each of their four feet

If the goal is to provide the “cheeseburger” application, consisting of the bun service, the patty service, the cheese service, the mustard service, etc., each instance of the application (i.e. each burger) can be considered a “pod”. There are two pods under each foot of the dog representing two-pod “nodes” and the dog forms the control plane.

Remember, now that you’ve seen it, you can’t unsee it.

That was the only session I made on Day Two, but I did get some time to wander around the Exhibit Hall. The Software Freedom Conservancy had a booth, and since they are one of my favorite organizations I stopped by to chat with Pono Takamori. I know a number of folks that work there and they serve as almost a reference implementation for trying to live using 100% free software. Pono was telling me that it was getting almost impossible to find a totally free mobile wireless solution since 3G went away, as all of the modern modems tend to use binary blobs.

Pono Takamori in the Software Freedom Conservancy Booth

Now, when these exhibit halls are being set up, the “booths” are laid out with little generic signs showing the owner of the booth, and most of the time they eventually get covered up once the booth is complete.

The MySQL booth with an Oracle sign in the background

I know the Sun acquisition was a long time ago, but I still get cognitive dissonance when I see a MySQL sign next to an Oracle one.

The AWS booth for this conference is really awesome. I bow down to the genius that is Spot Callaway, and he pitched a booth design that was to evoke a teenage geek’s basement, where one might play video games and Dungeons and Dragons (think Stranger Things). The walls of the booth are made to look like brick, and there are chairs, a couch and an SNES console emulator.

The AWS Booth showing people playing a video game

The featured AWS project for this conference is Bottlerocket, and I got to learn a bit about it and meet members of the team. Bottlerocket is a minimal operating system designed just to run containers. I compared it to LibreELEC, which is a purpose-built O/S that I use to run Kodi, and while it was explained to me that I was oversimplifying things a bit, it was otherwise a good analogy.

While it is, of course, being used within AWS, it is a 100% open source project and you can get the code on GitHub, and the hope is that others will find it valuable and will get involved with the community. If this is something you’re into, stop by the booth and say “hi”.

Speaking of stopping by the booth, we do have some tasty sodas and Bottlerocket branded bottle openers, but the big giveaway is an awesome 3D printer. Get a raffle ticket and stop by the booth at 1:30pm on Sunday for the drawing (you must be present to win).

AWS employees are not eligible to participate. (sniff)

2022 SCaLE 19x – Day One

I am back at the Southern California Linux Expo (SCaLE) for the first time in many years, and I was surprised at how happy this makes me. It is always a well run conference and it tends to bring a lot of people I like together in one place, which means I get to meet a lot more people to like as well.

The main SCaLE sessions occur over the weekend, but there are a lot of cool things that happen in the days before. For Thursday, AWS sponsored Cloud Native Builder Day to showcase some of the amazing open source technologies one can use to solve a number of challenges, and I was eager to learn about them.

But before that I needed to get registered. The first step was to show proof of vaccination. While I am thankful that we can have these events, COVID is still a thing and the organizers are doing all they can to mitigate the risk to the conference attendees. Since I’m an old I’ve had two shots and two boosters but the darn thing keeps mutating.

SCaLE Registration Sign

Once past that I headed upstairs where I could use the self check-in kiosks. It was pretty simple to sign in and get my badge printed, and then it was just a short trip down the hall to pick up the conference “swag bag” which included the badge holder and lanyard.

SCaLE Registration Area with People Checking In

The only change I would make to the process is that once you printed your badge, you should really hit the “close window” button on the screen, as there is a “back” button that could allow the next person who registers to see your name and e-mail. No biggie, but the security nerd in me always thinks about these things.

The conference spans two floors. The Exhibit Hall with the sponsor booths is on the ground floor behind registration (it is technically in the Plaza Ballroom so I just followed the signs for “ballrooms”) while the sessions are on the second floor along with registration. AWS is going to have a pretty cool booth this year.

As an AWS employee I guess I should say that we always have a cool booth (grin) but I especially like the idea behind this one, despite the fact that we were unable to get a mounted deer head (seriously). It’s booth numbers 300, 302 and 304 if you want to swing by, and for those of you who couldn’t make it I’ll be sure to post about it later.

Cloud Native Builder Day showcased three different open source projects, the first one being Triggermesh. This was presented by Jeff Naef who I immediately liked as he was the first to notice that my mask is made by K&N, a company known for their high-end automotive airflow products. He loves performance automobiles as well as open source (he was wearing a Snap-On tools hat) so I knew we would get along.

Jeff Naef Presenting on Triggermesh

In dealing with cloud native technologies, a lot of the workflow is event driven. Triggermesh lets you seamlessly link together sources and targets for events, normalizing and enriching them along the way. While it does support the ability to create functions using code (in a variety of languages) a lot of the implementation can be done just through configuration.

In one example the data was encoded in base64, and a person asked if Triggermesh could render that in clear text. Jeff was like, sure, and he bravely set out to implement that as we watched. He got really close, but in any case deserves kudos for the attempt, especially considering he was holding a microphone with one hand the entire time.
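For anyone curious what that transformation amounts to, here is a minimal Python sketch of decoding a base64 event body back into clear text. It is plain Python from the standard library and has nothing to do with how Triggermesh is actually configured. The sample payload and the `decode_event_data` name are made up for the example.

```python
import base64
import json


def decode_event_data(encoded: str) -> str:
    """Decode a base64 string into UTF-8 text."""
    return base64.b64decode(encoded).decode("utf-8")


# Build a made-up event body and encode it the way it might arrive.
payload = json.dumps({"message": "hello"})
encoded = base64.b64encode(payload.encode("utf-8")).decode("ascii")

print(encoded)                     # the opaque form
print(decode_event_data(encoded))  # the readable form
```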

The next speaker was Zoe Steinkamp from InfluxDB. I first met Zoe at the Open Source Summit in Austin and she is one of my favorite new acquaintances I’ve met through my job at AWS.

Now full disclosure: I missed the first half of her presentation.

SCaLE has done something delightful with the schedule, which is allowing 30 minutes between talks. I’ve talked about this before but this lets speakers switch out without the usual urgency, allows more time for attendees to interact with the speaker after the talk, and improves the hallway track.

I thought I had enough time to grab lunch, which was In-N-Out that Spot had brought for me. We don’t have In-N-Out in North Carolina so I rarely pass up a chance to get it, and I figured I could be back in time. I was wrong. But I did slip into the back of the room which is why this picture isn’t as close as the others.

Zoe Steinkamp Presenting on InfluxDB

I used to work on an open source project that relied heavily on time series data, so I’m a bit of a time series data geek. Every time I see a presentation on InfluxDB I learn more things to like about it. This time I found out that it is possible to get started with it without being a programmer. A lot of people in the data science field aren’t coders, but they can send their data to InfluxDB pretty easily. The folks at Influx have created InfluxDB University as a free resource to get the most out of their solution, and while I haven’t gone through it yet it looks really comprehensive.

The final presenter was Matt Overstreet from Datastax. Datastax focuses on providing solutions around the Apache Cassandra project, which is a distributed “NoSQL” database.

Matt Overstreet Presenting on Apache Cassandra

When most people hear the word “database” they think of relational databases. This is a data structure usually based on “rows” of data made up of “fields” and indexed by a primary key. One then uses something like the Structured Query Language (SQL) to retrieve values from those fields. This is all well and good but it tends to be extremely monolithic, which doesn’t work well in today’s distributed cloud environment.

Think about it. In a datacenter you might have sub-millisecond latency, so a query can be returned quickly. Move that datacenter across the country, and now your latency is, say, 100ms. Move that to the other side of the world and, well, you get the picture. Now if you only have a few queries that might be okay, but when you consider thousands and then millions of queries, the response time of your application is going to take a hit.
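The arithmetic is easy to sketch. The Python below assumes the simplest possible case, where queries are issued one after another and each waits a full round trip. It uses 0.5ms as a stand-in for “sub-millisecond” and the 100ms figure from above. Real applications overlap requests, so treat this as an upper bound for illustration only.

```python
def total_wait_seconds(queries: int, latency_ms: float) -> float:
    """Total time spent waiting if queries run strictly one at a time."""
    return queries * latency_ms / 1000.0


scenarios = [("same datacenter", 0.5), ("cross-country", 100.0)]

for label, latency_ms in scenarios:
    for queries in (10, 1_000, 1_000_000):
        seconds = total_wait_seconds(queries, latency_ms)
        print(f"{label}: {queries} queries -> {seconds} seconds of waiting")
```

Under these assumptions a thousand sequential queries cost half a second inside the datacenter and a hundred seconds across the country, which is why putting the data near the customer matters.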

Cassandra allows you to distribute that data both within a datacenter (for reliability) and also regionally. You can then put your data near your customers, improving their experience.

I was already sold on Cassandra (we used it at OpenNMS) but what I learned from this presentation was the wonderfulness that is “k8ssandra” (kate-sandra). This is Cassandra but running in Kubernetes. If you have ever had to extend and expand a Cassandra cluster, you know that while it isn’t super difficult there are a number of gotchas that can cause problems. What if you could automate it? Matt showed us an example that let him spin up (and tear down) an 800 node cluster in minutes.

Cool, huh?

The first day of SCaLE 19x was a blast, and I am eager to see what the rest of the week brings.

Why You Should Attend SCaLE 19x

The 19th iteration of the Southern California Linux Expo (SCaLE) is around two weeks away, and I wanted to suggest some reasons why you should attend, assuming you are into free and open source software. AWS, where I work, is a platinum sponsor. The conference runs for four days starting on July 28th and is located at the Los Angeles Airport Hilton.

Note: Everything expressed here represents my own thoughts and opinions and I am not speaking for my employer Amazon Web Services.

I’ve been to a number of SCaLE conferences and I’m always impressed at how well they are run. This is a grass-roots, volunteer-led conference yet it is always on par with the more commercial trade shows I attend and sometimes exceeds them. This year looks exceptionally good.

The first reason you should go is the content. The conference has quite a number of tracks including one focused on containers and orchestration (‘natch) and also infrastructure, security and observability. There are tracks on using open source in the medical field as well as government. Big Data gets its own track as well as embedded systems, and there are several more tracks guaranteed to touch on almost every interest within free and open source software.

The conference spans four days, with the first two days focused more on workshops. Co-located with SCaLE is a two day, two track technical conference focused on PostgreSQL, and on Friday is the tenth DevOps Day LA. AWS is hosting a half-day workshop focused on Cloud Native builders with presentations on Kubernetes, InfluxDB and Apache Cassandra.

The second reason you should go is networking, or what is often called the “hallway track”.

For the last several years I’ve worked remote (i.e. not in an office outside of the home) and I will probably continue to do so for the rest of my career. Remote work has become almost a standard within technical jobs.

But I have to say I miss being able to see people face to face. When I was with OpenNMS we had this product where you could buy a year of support coupled with a week of on-site professional services and training. I used to love doing those, but even before COVID those trips became less frequent as companies adopted a distributed work force. There was really no “on-site” place to go when your team was across four time zones.

Technical conferences, such as SCaLE, provide a great opportunity to get together in person, and it can be wonderful to talk in an informal setting to people you may only know through e-mails, video calls and social media. A number of my coworkers will be at SCaLE and I am looking forward to spending some “in real life” time with them.

If you look through the list of speakers at this year’s conference, it is a “who’s who” of open source leaders and contributors, and you’ll have the chance to meet them as well as other like-minded people. I love the fact that the organizers have built in a 30 minute cushion between talks. Not only does this avoid the rush that usually happens as one speaker finishes and another sets up, it gives people time to socialize before heading off to the next talk. Of course, it goes without saying that you should be courteous to speakers and other attendees, and SCaLE has published a Code of Conduct to formalize what that means, but also don’t let that stop you from asking tough or difficult questions of the speakers (just be nice about it). I always loved it when I was a speaker and someone asked me something I had never thought about.

The third reason you should go is the Exhibition Hall. There are a ton of sponsors who will have booths at the show (including AWS) and this is a great chance to talk with those projects you love, find new ones to love, and often there is some great swag to be had. The hall will be open on Friday through Sunday.

Finally, on Saturday night there is the famous “Game Night” reception and party. I’m excited that the original nerdcore rapper, MC Frontalot, will be performing. Frontalot combines musicianship with nerdy topics like video games, cosplay, fairy tales and technology into an incredibly entertaining show. If you are new to his work check out his YouTube channel. One of my favorite songs is “Stoop Sale” (kids especially like that one, so I guess I’m a kid at heart), and he recently had a fan take his song “Secrets from the Future” (about how all of our encrypted secrets will one day be an open book) and run the lyrics through the Midjourney AI image generator. The result is pretty amazing.

A full SCaLE pass runs $85, and I can’t think of a better value. In-person technical instruction runs $500+ a day, and even one of those online class sites will charge you $15-$50 a class. Here you can attend 15 or so sessions for under $6 each, and that doesn’t include all the extra stuff outside of the presentations. Even with travel it is still a deal.

I am very eager to attend and I hope to see you there, too.

Just one more note, this one on COVID. I am pretty rigorous when it comes to avoiding this disease which is one reason I haven’t traveled much in the last 2+ years. The first conference I attended since the pandemic started was the Open Source Summit in Austin, and while some people did test positive it was a small fraction of total attendees. One reason was that they had a mask requirement (except when eating or drinking) and you had to show proof of vaccination or a negative test. SCaLE has adopted a similar policy, and while this won’t mean it is impossible to get sick the evidence suggests that this will greatly limit exposure among the attendees. If you have health issues you may still want to stay home and if you come and don’t feel well use your best judgement. I will be taking along some rapid tests that I got for free from covid.gov as well as frequently taking my temperature just to be sure.

2022 Open Source Summit – Day 4

I always feel a little sad on the last day of any conference, and Open Source Summit was no different. It seems like the week went by too fast.

With the Sponsor Showcase closing on Thursday, attendance at the Friday keynotes was light, but those of us that showed up got to hear some pretty cool presentations.

Picture of Rachel Rose on stage

The first one was from Rachel Rose, who supervises R&D at Industrial Light and Magic. As a fanboy of ILM I was very eager to hear what she had to say, and she didn’t disappoint. (sorry about the unflattering picture but I took three and they were all bad)

In the past, the computer generated imagery (CGI) and the live action in a lot of special effects were created separately. The live action actors performed in front of a green screen and the CGI backgrounds were added later. Technology has advanced to the point that the cutting edge now involves live action sets that are surrounded by enormous, curved LED screens, and the backgrounds are projected as the actors perform.

This presents a number of challenges as the backgrounds may need to change as the camera moves, but it provides a much better experience for the actors and the audience.

The tie-in to open source is that a lot of the libraries used in the creation of these effects are now open. In fact, the Academy of Motion Picture Arts and Sciences (the people responsible for the Oscars) along with the Linux Foundation have sponsored the Academy Software Foundation (ASWF) to act as a steward for the “content creation industry’s open source software base”. The projects under the ASWF fall into one of two tiers: Adopted and Incubation. Currently there are four projects that are mature enough to be adopted and several more in the incubation stage.

A lot of this was so specific to the industry that it went over my head, but I could understand the OpenEXR project, which provides a reference implementation of the EXR file format for storing high quality images.

A slide showing the ILM Stagecraft volume setup

She then went on to talk about Stagecraft, which is the name of the ILM platform for producing content. I would love to be able to visit one day. It would be so cool to see a feature being made with the CGI, sets and actors all integrated.

Picture of Vini Jaiswal on stage

The next speaker was Vini Jaiswal, Developer Advocate for Databricks. I had seen a cool Databricks presentation back on Day 2 and the first part was similar, but Jaiswal skipped the in-depth technical details and focused more on features and adoption. A rather large number of companies are using the Delta Lake technology as a way to apply business intelligence to data lakes, and as the need to analyze normally unstructured data becomes more important, I expect to see even more organizations adopt it.

The third presentation was a video by Dmitry Vinnik of Meta on measuring open source project health.

Begin rant.

To be honest I was a little unhappy to see a video as a keynote. It was the only one for the entire week and I have to admit I kind of tuned it out. It wasn’t even novel, as he has given it at least twice before. The video we were shown is available on YouTube from a conference earlier in the month and he posted another one dated June 24th from the Python Web Conference (while it has a different splash screen it looks to be the same presentation).

A still picture of a part of the video sent in by Dmitry Vinnik

Look, I’ve given the same talk multiple times at different conferences, so I get it. But to me keynotes are special and should be unique. I was insulted that I bothered to show up in person, wear a mask, get my temperature checked each day, and I expected something better than a video I could have watched at home.

Note: Rachel Rose played a video as part of her presentation and that’s totally cool, as she didn’t “phone in” the rest of it.

Okay, end rant.

The next two presenters were very inspiring young people, and it was nice to have them included as part of the program.

Picture of Alena Analeigh on stage

The first speaker was Alena Analeigh, an amazing young woman who, among other achievements, has been accepted to medical school at age 13 (note that in trying to find a reference for that I came up blank, except for her twitter bio, so if you have one please let me know and I can update this post).

Med school is just one of her achievements. She also founded The Brown STEM Girls as an organization to get more women of color interested in science, technology, engineering and math. She stated that while men make up 52% of the workforce, they represent 76% of people employed in STEM fields.

My love of such things was fostered at an early age, and programs like hers are a great step to encourage young women of color to get interested in and eventually pursue careers in STEM.

While she seemed a little nervous and tentative while presenting, the final speaker of the morning was the exact opposite. Orion Jean is only 11 years old, and I could listen to him speak for hours.

Picture of Orion Jean on stage

Orion also has a number of accolades, including Time Magazine’s “Kid of the Year“. He got his start as the winner of a speech contest sponsored by Think Kindness, and since then has started the Race to Kindness (“a race where everybody wins”) to spread kindness around the world.

To help inspire acts of kindness he uses the acronym K.I.N.D.:

  • Keep Your Eyes Open: Look for opportunities to be kind to others. One example he used is one I actually practice. If you are in line to check out at the store, and you see a person with a lot fewer items than you, why not offer to let them check out first?
  • Include Others: No one can effect change alone. Get others involved.
  • Nothing Is Too Small: One thing that keeps us from spreading kindness is that we can try to think too big. Even small acts of kindness can have a huge impact.
  • Do Something About It: Take action. Nothing can change if we do nothing.

After the keynotes I had to focus on some work stuff that I had let languish for the week, so I didn’t make it to any of the presentations, but overall I was happy with my first conference in three years.

There were a few people that attended who tested positive for COVID, so I plan to take some precautions when I get home and hope that the steps the Linux Foundation took to mitigate infection worked. So far I’ve tested negative twice, and I’ll probably take another test on Monday.

My next conference will be SCaLE in Los Angeles at the end of July, and I plan to be in Dublin, Ireland for Open Source Summit – Europe. If you are comfortable getting out and about I hope to see you there.

2022 Open Source Summit – Day 3

Thursday at the Open Source Summit started as usual at the keynotes.

Picture of Robin Bender Ginn on stage

Robin Bender Ginn opened today’s session with a brief introduction and then we jumped into the first session by Matt Butcher of Fermyon.

Picture of Matt Butcher on stage

I’ve enjoyed these keynotes so far, but to be honest nothing has made me go “wow!” as much as this presentation by Fermyon. I felt like I was witnessing a paradigm shift in the way we provide services over the network.

To digress quite a bit, I’ve never been happy with the term “cloud”. An anecdotal story is that the cloud got its name from the fact that the Visio icon for the Internet was a cloud (it’s not true) but I’ve always preferred the term “utility computing”. To me cloud services should be similar to other utilities such as electricity and water where you are billed based on how much you use.

Up until this point, however, instead of buying just electricity it has been more like you are borrowing someone else’s generator. You still have to pay for infrastructure.

Enter “serverless“. While there are many definitions of serverless, the idea is that when you are not using a resource your cost should be zero. I like this definition because, of course, there have to be servers somewhere, but under the utility model you shouldn’t be paying for them if you aren’t using them. This is even better than normal utilities because, for example, my electricity bill includes fees for things such as the meter and even if I don’t use a single watt I still have to pay for something.

Getting back to the topic at hand, the main challenge with serverless is how do you spin up a resource fast enough to be responsive to a request without having to expend resources when it is quiescent? Containers can take seconds to initialize and VMs much longer.

Fermyon hopes to address this by applying WebAssembly to microservices. WebAssembly (Wasm) was created to allow high performance applications, written in languages other than JavaScript, to be served via web pages, although as Fermyon went on to demonstrate this is not its only use.

The presentation used a game called Finicky Whiskers to demonstrate the potential. Slats the cat is a very finicky eater. Sometimes she wants beef, sometimes chicken, sometimes fish and sometimes vegetables. When the game starts Slats will show you an icon representing the food she wants, and you have to tap or click on the right icon in order to feed her. After a short time, Slats will change her choice and you have to switch icons. You have 30 seconds to feed as many correct treats as possible.

Slide showing infrastructure for Finicky Whiskers: 7 microservices, Redis in a container, Nomad cluster on AWS, Fermyon

Okay, so I doubt it will have the same impact on game culture as Doom, but they were able to implement it using only seven microservices, all in Wasm. There is a detailed description on their blog, but I liked the fact that it was language agnostic. For example, the microservice that controls the session was written in Ruby, but the one that keeps track of the tally was written in Rust. The cool part is that these services can be spun up on the order of a millisecond or less and the whole demo runs on three t2.small AWS instances.

This is the first implementation I’ve seen that really delivers on the promise of serverless, and I’m excited to see where it will go. But don’t let me put words into their mouth, as they have a blog post on Fermyon and serverless that explains it better than I could.

Picture of Carl Meadows on stage

The next presentation was on OpenSearch by Carl Meadows, a Director at AWS.

Note: Full disclosure, I am an AWS employee and this post is a personal account that has not been endorsed or reviewed by my employer.

OpenSearch is an open source (Apache 2.0 licensed) set of technologies for storing large amounts of text that can then be searched and visualized in near real time. Its main use case is for making sense of streaming data that you might get from, say, log files or other types of telemetry. It uses the Apache Lucene search engine and the latest version is based on Lucene 9.1.

One of the best ways to encourage adoption of an open source solution is by having it integrate with other applications. With OpenSearch this has traditionally been done using plugins, but there is an initiative underway to create an “extension” framework.

Plugins have a number of shortcomings, especially in that they tend to be tightly coupled to a particular version of OpenSearch, so if a new version comes out your existing plugins may not be compatible until they, too, are upgraded. I run into this with a number of applications I use such as Grafana and it can be annoying.

The idea behind extensions is to provide an SDK and API that are much more resistant to changes in OpenSearch so that important integrations are decoupled from the main OpenSearch application. This also provides an extra layer of security as these extensions will be more isolated from the main code.

I found this encouraging. It takes time to build a community around an open source project but one of the best ways to do it is to provide easy methods to get involved and extensions are a step in the right direction. In addition, OpenSearch has decided not to require a Contributor License Agreement (CLA) for contributions. While I have strong opinions on CLAs this should make contributing more welcome for developers who don’t like them.

Picture of Taylor Dolezal on stage

The next speaker was Taylor Dolezal from the Cloud Native Computing Foundation (CNCF). I liked him from the start, mainly because he posted a picture of his dog:

Slide of a white background with the head and sad eyes of a cute black dog

and it looks a lot like one of my dogs:

Picture of the head of my black Doberman named Kali

Outside of having a cool dog, Dolezal has a cool job and talked about building community within the CNCF. Just saying “hey, here’s some open source code” doesn’t mean that qualified people will give up nights and weekends to work on your project, and his experiences can be applied to other projects as well.

The final keynote was from Chris Wright of Red Hat and talked about open source in automobiles.

Picture of Chris Wright on stage

A while ago I actually applied for a job with Red Hat to build a community around their automotive vertical (I didn’t get it). I really like cars and I thought that combining that with open source would just be a dream job (plus I wanted the access). We are on the cusp of a sea change with automobiles as the internal combustion engine gives way to electric motors. Almost all manufacturers have announced the end of production for ICEs and electric cars are much more focused on software. Wright showed a quote predicting that automobile companies will need four times the amount of software-focused talent that they need now.

A slide with a quote stating that automobile companies will need more than four times of the software talent they have now

I think this is going to be a challenge, as the automobile industry is locked into 100+ years of “this is the way we’ve always done it”. For example, in many states it is still illegal to sell cars outside of a dealership. When it comes to technology, these companies have recently been focused on locking their customers into high-margin proprietary features (think navigation) and only recently have they realized that they need to be more open, such as supporting Android Auto or CarPlay. As open source has disrupted most other areas of technology, I expect it to do the same for the automobile industry. It is just going to take some time.

I actually found some time to explore a bit of Austin outside the conference venue. Well, to be honest, I went looking for a place to grab lunch and all the restaurants near the hotel were packed, so I decided to walk further out.

Picture of the wide Brazos river from under the Congress Avenue bridge

The Brazos River flows through Austin, and so I decided to take a walk on the paths beside it. The river plays a role in the latest Neal Stephenson novel called Termination Shock. I really enjoyed reading it and, spoiler alert, it does actually have an ending (fans of Stephenson’s work will know what I’m talking about).

I walked under the Congress Avenue bridge, which I learned was home to the largest urban bat colony in the world. I heard mention at the conference of “going to watch the bats” and now I had context.

A sign stating that drones were not permitted to fly near the bat colony under the Congress Avenue bridge

Back at the Sponsor Showcase I made my way over to the Fermyon booth where I spent a lot of time talking with Mikkel Mørk Hegnhøj. When I asked if they had any referenceable customers he laughed, as they have only been around for a very short amount of time. He did tell me that in addition to the cat game they had a project called Bartholomew that is a CMS built on Fermyon and Wasm, and that was what they were using for their own website.

Picture of the Fermyon booth with people clustered around

If you think about it, it makes sense, as a web server is, at its heart, a fileserver, and those already run well as a microservice.

They had a couple of devices up so that people could play Finicky Whiskers, and if you got a score of 100 or more you could get a T-shirt. I am trying to simplify my life which includes minimizing the amount of stuff I have, but their T-shirts were so cool I just had to take one when Mikkel offered.

Note that when I got back to my room and actually played the game, I came up short.

A screenshot of my Finicky Whiskers score of 99

The Showcase closed around 4pm and a lot of the sponsors were eager to head out, but air travel disruptions affected a lot of them. I’m staying around until Saturday and so far so good on my flights. I’m happy to be traveling again but I can’t say I’m enjoying this travel anxiety.

[Note: I overcame my habit of sitting toward the back and off to the side so the quality of the speaker pictures has improved greatly.]

2022 Open Source Summit – Day 2

The word for Day 2 of the Open Source Summit is SBOM.

When I first heard the term my thought was that someone had spoken a particular profanity at an inappropriate time, but SBOM in this context means “Software Bill of Materials”. Open source is so prevalent these days that it is probably included in a lot of the software you use and you may not be aware of it, so when an issue such as Log4Shell is discovered it can be hard to determine what software is affected. The idea of asking all vendors (both software-only and software running on devices) to provide an SBOM is a first step to being able to audit this software.

It isn’t as easy as you might think. The OpenNMS project I was involved with used over a hundred different open source libraries. I know because I once did a license audit to make sure everything being used had compatible licenses. I also have used Black Duck Software (now Synopsys) to generate a list of included software, and it looks like they now offer SBOM support as well, but I get ahead of myself.

Note that Synopsys is here in the Sponsor Showcase but when I stopped by the booth no one was there.

Getting back to the conference, the second morning keynotes were more sparsely attended than yesterday, but the room was far from empty. The opening remarks were given by Mike Dolan, SVP and GM of Projects at the Linux Foundation, and he was a last minute replacement for Jim Zemlin, who was not feeling well.

Picture of Mike Dolan on stage

Included in the usual housekeeping announcements was a short “in memoriam” for Shubhra Kar, the Linux Foundation CTO who passed away unexpectedly this year.

Dolan also mentioned that the Software Package Data eXchange (SPDX) open standard used for creating SBOMs had turned 10 years old (and it looks like it will hit 11 in August). This was relevant because with applications of any complexity including hundreds if not thousands of open source software projects, there had to be some formal way of listing them for analysis in an SBOM, and most default to SPDX.
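To give a feel for why a formal listing helps, here is a minimal Python sketch of searching an SBOM for one component. The document is a heavily trimmed, made-up example loosely shaped like SPDX JSON (a real SPDX file carries many more fields), and the find_packages helper and package list are my own inventions for illustration, not part of any SPDX tooling.

```python
import json

# A tiny, hypothetical SBOM. Real SPDX documents have many more fields;
# only "packages", "name" and "versionInfo" are used here.
sbom_text = """
{
  "spdxVersion": "SPDX-2.2",
  "name": "example-app",
  "packages": [
    {"name": "log4j-core", "versionInfo": "2.14.1"},
    {"name": "commons-lang3", "versionInfo": "3.12.0"}
  ]
}
"""


def find_packages(sbom, wanted_name):
    """Return (name, version) pairs for every package called wanted_name."""
    return [
        (pkg["name"], pkg.get("versionInfo", "unknown"))
        for pkg in sbom.get("packages", [])
        if pkg["name"] == wanted_name
    ]


sbom = json.loads(sbom_text)
hits = find_packages(sbom, "log4j-core")
for name, version in hits:
    print(f"{name} {version} is listed in {sbom['name']}")
```

With a list like this for every product, answering “do we ship log4j anywhere?” becomes a search instead of an archaeology project. Deciding whether a given version is actually vulnerable is a separate step that this sketch doesn’t attempt.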

The next speaker was Hilary Carter who is in charge of research for the Linux Foundation.

Picture of Mike Dolan and Hilary Carter on stage

She spoke on the work the Linux Foundation is doing to measure the worldwide impact of open source. As part of that she mentioned that there is a huge demand for open source talent in the market place, but there are also policy barriers for employees of many companies to contribute to open source. She also brought up SBOMs as a way to determine how widespread open source use is in modern applications.

Stylized Mercator Map Projection

Since diversity has been a theme at this conference I wanted to address a pet peeve of mine. This is a slide from Carter’s presentation and it uses a stylized Mercator projection to show the world. I just think it is about time we stop using this projection, as the continent highlighted, Africa, is actually much, much larger in proportion to the other continents than is shown on this map. As an alternative I would suggest the Gall-Peters projection.

Gall-Peters projection of the world yoinked from Wikipedia

To further digress, I asked my friend Ben to run “stylized Gall-Peters projection” through Midjourney but I didn’t feel comfortable posting any of the results (grin).

Anyway, enough of that. The next presenter was Kevin Jakel, who founded Unified Patents.

Picture of Kevin Jakel on stage

The goal of Unified Patents is to protect open source from patent trolls. Patent trolls are usually “non-practicing entities” who own a lot of patents but, rather than building products, exist to extract revenue from companies they believe are infringing upon them. Quite frequently it is cheaper to settle than pursue legal action against these entities and this just encourages more actions on the part of the trolls.

The strategy to combat this is described as “Detect, Disrupt and Deter”. For a troll, the most desired patents are ones that are broad, as this means more companies can be pursued. However, overly broad patents are also subject to review, and if the Patent and Trademark Office is convinced a patent isn’t specific enough it can invalidate it, destroying the revenue stream for the patent troll.

I’m on the fence over software patents in general. I mean, let’s say a company could create a piece of software that exactly modeled the human body and how a particular drug would interact with it. I think that deserves some protection. But I don’t think that anyone owns the idea of, say, “swipe left to unlock”. Also it seems like software rights could be protected by copyright, but then again IANAL (one source for more information on this is Patent Absurdity).

Picture of Amir Montazery on stage

The next person on stage was Amir Montazery, of the Open Source Technology Improvement Fund. The mission of the OSTIF is to help secure open source software. They do this through both audits and fundraising to provide the resources to open source projects to make sure their software is as secure as possible.

Jennings Aske, of New York-Presbyterian Hospital, spoke next. I have worked a bit with technology in healthcare and as he pointed out there are a lot of network connected devices used in medicine today, from the devices that dispense drugs to the hospital beds themselves. Many of those do not have robust security (and note that these are proprietary devices). Since a hack or other breach could literally be a life and death situation, steps are being taken to mitigate this.

Picture of Jennings Aske on stage

I enjoyed this talk mainly because it was from the point of view of a consumer of software. As customers are what drive software revenues, they stand the best chance in getting vendors to provide SBOMs, along with government entities such as the National Telecommunications and Information Administration (NTIA). The NTIA has launched an effort called Software Component Transparency to help with this, and Jennings introduced a project his organization sponsors called DaggerBoard that is designed to scan SBOMs to look for vulnerabilities.

Picture of Arun Gupta on stage

The next keynote was from Arun Gupta of Intel. His talk focused on building stronger communities and how Intel was working to build healthy, open ecosystems. He pointed out that open source is based largely on trust, which is an idea I’ve promoted since I got involved in FOSS. Trust is something that can’t be bought and must be earned, and it is cool to see large companies like Intel working toward it.

Picture of Melissa Smolensky on stage

The final presenter was Melissa Smolensky from GitLab who based her presentation around a “love letter to open source”. It was cute. I too have a strong emotional connection to my involvement in free and open source software that I don’t get anywhere else in my professional life, at least to the same degree.

I did get to spend some time near the AWS booth today, and after chatting at length with the FreeRTOS folks I happened to be nearby when Chris Short did a presentation on GitOps.

Chris Short presenting GitOps

In much the same way that Apple inspired a whole generation of Internet-focused products to put an “i” in front of their name, DevOps has spawned all kinds of “Ops” such as AIOps and MLOps and now GitOps. The idea of DevOps was built around creating processes to more closely tie software development to software operation and deployment, and key to this was configuration management software such as Puppet and Ansible. Instead of having to manage configuration files per instance, one could store them centrally and use agents to deploy them into the environment. This central repository allows for a high degree of control and versioning.

It is hard to think of a better tool for versioning than git, and thus GitOps was born. With GitOps, software is controlled by configuration files (usually in YAML), and changes are made through git.

While I am not an expert on GitOps by any means, suppose your application used a configuration file to determine the various clusters to create. To generate a new cluster you would just edit the file in your local copy of the repo, git commit and git push.

Your application would then use something like Flux (not to be confused with the Flux query language from InfluxData) to note that a change has occurred and then do a git pull, which would cause the change to be applied.
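To make that loop a little more concrete, here is a minimal Python sketch of the “compare and apply” step. To be clear, this is not how Flux is implemented. The reconcile function and the cluster names are hypothetical, and I’m assuming the desired state has already been read from the configuration file in the repo.

```python
def reconcile(desired, actual):
    """Return the actions needed to make the actual state match the desired one."""
    actions = []
    # Anything in the config but not running needs to be created.
    for name in sorted(desired - actual):
        actions.append(("create", name))
    # Anything running but no longer in the config needs to be removed.
    for name in sorted(actual - desired):
        actions.append(("delete", name))
    return actions


# Desired state, as it might be parsed from the config file after a git pull.
desired_clusters = {"prod-east", "prod-west", "staging"}
# What is currently running in the environment.
running_clusters = {"prod-east", "staging", "old-test"}

plan = reconcile(desired_clusters, running_clusters)
for action, name in plan:
    print(action, name)
```

The point is that nobody logs in and creates “prod-west” by hand. You commit the change to the file, and the difference between what git says and what is running drives everything else.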

Pretty cool, huh? A lot of people are familiar with git so it makes the DevOps learning curve a lot less steep. It also allows for the configuration of multiple repositories so you can control, say, access to secrets differently than the main application configuration.

Spot Callaway and Brian Proffitt

Also while I was in the booth I got this picture of two Titans of Open Source, Spot Callaway and Brian Proffitt. Oh yeah.

My final session of the day was given by Kelly O’Malley of Databricks on Delta Lake.

Kelly O'Malley presenting on Delta Lake

Now as someone who has given a lot of talks, I try to be respectful of the presenter and with the exception of the occasional picture and taking notes I try to stay off my phone. I apologized to her afterward as I was spending a lot of time looking up terms with which I was unfamiliar, such as “ACID” and “parquet“.

Delta Lake is an open source project to create a “Lakehouse”. The term is derived from a combination of “Data Warehouse” and “Data Lake“.

Data warehouses have been around for a very long time (in one of my first jobs I worked for a VAR that built hardware solutions for storing large data warehouses) and the idea was to bring together large amounts of operational data into one place so that “business intelligence” (BI) could be applied to help make decisions concerning the particular organization. Typically this data has been very structured, such as numeric or text data.

But people started figuring out that a lot of data, such as images, needed to be stored in more of a raw format. This form of raw data didn’t lend itself well to the usual BI analysis techniques.

Enter Delta Lake. Based on Apache Spark, it attempts to make data lakes more manageable and to make them as useful as data warehouses. I’m eager to find the time to learn more about this. When I was at OpenNMS we did a proof of concept about using Apache Spark to perform anomaly detection and it worked really well, so I think it is perfectly matched to make data lakes more useful.

My day ended at an internal event sponsored by Nithya Ruff, who in addition to being the chairperson of the Linux Foundation is also the head of the AWS OSPO. I made a number of new friends (and also got to meet Amir Montazery from the morning keynotes in person) but ended up calling it an early night because I was just beat. I was eager to be fresh for the next day of the conference.