OpenStack Swift mid-cycle hackathon summary
Last week more than 30 people from all over the world met at the Rackspace
office in San Antonio, TX for the Swift mid-cycle hackathon. All major companies
contributing to Swift sent people, including Fujitsu, HPE, IBM, Intel, NTT,
Rackspace, Red Hat, and SwiftStack. As always, it was a packed week with a lot
of deep technical discussions around current and future changes within Swift.
There are always far more topics to discuss than time allows, so we collected
topics first and then everyone voted. We came up with the following major
discussions, which are currently the most interesting ones within our community:
- Hummingbird replication
- Crypto - what's next
- Partition power increase
- High-latency media
- Container sharding
- Golang - how to get it accepted in master
- Policy migration
There were a lot more topics, and I'd like to highlight a few of them.
H9D aka Hummingbird / Golang
This was a big topic - as expected. Rackspace has already shown that H9D
improves the performance of the object servers and replication significantly
compared to the current Python implementation. There were also some
investigations into whether the speed could be improved using PyPy and other
optimizations; however, the major problem is that Python blocks processes on
file I/O, whether or not async I/O is used. Sam wrote a very nice summary of
this earlier.
NTT also benchmarked H9D and showed some impressive numbers as well. In short,
throughput increased 5-10x depending on parameters such as object size. It
seems disks are no longer the bottleneck - the proxy CPU is the new
bottleneck. That said, inode cache memory seems to be even more important,
because with H9D one can issue many more disk requests.
Of course there were also discussions about another proposal to accept golang
within OpenStack, and these discussions will continue. My personal view is that
the H9D implementation has some major advantages, and hopefully a refactored
subset of it will be accepted and merged to master.
Crypto retro & what's next
Swift 2.9.0 was released last week and includes the merged crypto branch.
Kudos to everyone involved, especially Janie and Alistair! This middleware
makes it possible for operators to fully encrypt object data on disk.
We did a retro on the work done so far; this was the third time that we used
a feature branch and a final soft freeze to land a major change within Swift.
There are pros and cons to this, but overall it worked pretty well again. It
also made sense that reviewers stepped in late in the process, because this
brought fresh perspectives to the whole work. Soft freezes also push more
reviewers to contribute and to finally get the change merged.
SwiftStack benchmarked the crypto branch; as expected, throughput decreases
somewhat with crypto enabled (especially with small objects), while proxy CPU
usage increases. There were some discussions about improving the performance,
and it seems the impact of checksumming is significant here.
The next steps to improve the crypto middleware are to work on some external
key master implementations (for example using Barbican) as well as key rotation.
Partition power increase
Finally there is a patch ready for review that will allow an operator to
increase the partition power without downtime for end users.
I gave an overview of the implementation and also demoed how it works. Based
on discussions during the week I spotted some minor edge cases, which have
been fixed in the meantime, and I hope to get this merged before Barcelona. We
also dreamed a bit about a future Swift with automatic partition power
increase, where an operator needs to think about this much less than today.
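To illustrate why raising the partition power by one is tractable at all, here is a minimal sketch of how a path maps to a partition. It assumes the partition is taken from the top `part_power` bits of the MD5 hash of the object path, as Swift's ring does; the per-cluster hash prefix and suffix that real Swift mixes in are omitted, and the function name and the path are hypothetical.

```python
import hashlib
import struct


def partition_for(path: str, part_power: int) -> int:
    """Return the partition for a path: the top part_power bits of its MD5.

    Simplification: real Swift also mixes a cluster-wide hash prefix and
    suffix into the hashed string; that is left out of this sketch.
    """
    digest = hashlib.md5(path.encode("utf-8")).digest()
    # Interpret the first 4 bytes as an unsigned big-endian integer and
    # keep only the most significant part_power bits.
    return struct.unpack_from(">I", digest)[0] >> (32 - part_power)


path = "/AUTH_demo/container/object"  # hypothetical object path
old = partition_for(path, 10)
new = partition_for(path, 11)

# One more bit of the same hash is used, so every old partition splits
# into exactly two new ones: 2 * old and 2 * old + 1.
assert new in (2 * old, 2 * old + 1)
```

Because the new partition is always one of two values derived from the old one, an object's new location can be computed from its current one without rehashing the whole cluster.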
Proposed middlewares
There are some proposed middlewares that are important to their authors, and we
discussed quite a few of them. This includes:
- High-latency media (aka archiving)
The idea behind supporting high-latency media is to use cold storage (like tape
or other public cloud object storage with a possible multi-hour latency) for
less frequently accessed data, and especially to offer a low-cost long-term
archival solution based on Swift. This is somewhat challenging for the upstream
community, because most contributors don't have access to large enterprise tape
libraries for testing. In the end this middleware needs to be supported by the
community, so a stand-alone repository outside of Swift itself might make the
most sense (similar to the swift3 middleware).
- History-based versioning
True history-based versioning was proposed earlier, and we talked through some
open questions. This should hopefully land soon, adding an improved approach to
versioning compared to today's stack-based versioning.
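The difference between the two modes can be sketched with a toy in-memory model. This is not Swift's middleware code: the class names are hypothetical, and the only behavior assumed is that a stack-based DELETE restores the previous version, while a history-based DELETE keeps every version and records a delete marker.

```python
class StackVersioned:
    """Toy model: DELETE pops the newest archived version back into place."""

    def __init__(self):
        self.current = {}   # name -> live data
        self.archive = {}   # name -> list of older versions, oldest first

    def put(self, name, data):
        if name in self.current:
            self.archive.setdefault(name, []).append(self.current[name])
        self.current[name] = data

    def delete(self, name):
        older = self.archive.get(name)
        if older:
            # The previous version becomes the live object again.
            self.current[name] = older.pop()
        else:
            self.current.pop(name, None)


class HistoryVersioned:
    """Toy model: DELETE archives the live version plus a delete marker."""

    DELETED = object()  # sentinel standing in for a delete marker

    def __init__(self):
        self.current = {}
        self.history = {}   # name -> full list of past states, oldest first

    def put(self, name, data):
        if name in self.current:
            self.history.setdefault(name, []).append(self.current[name])
        self.current[name] = data

    def delete(self, name):
        if name in self.current:
            past = self.history.setdefault(name, [])
            past.append(self.current.pop(name))
            past.append(self.DELETED)


stack, hist = StackVersioned(), HistoryVersioned()
for store in (stack, hist):
    store.put("report", "v1")
    store.put("report", "v2")
    store.delete("report")

assert stack.current["report"] == "v1"   # old version resurfaces
assert "report" not in hist.current      # object stays deleted
```

In the history-based model the full sequence of states, including the deletion, remains available for auditing or restoring.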
- Notifications
Sending out notifications based on writes to Swift has been discussed before,
and thankfully Zaqar now supports temporary signed URLs, solving some of the
issues we faced earlier. I'll update my patch shortly. There is also another
option: using oslo.messaging. All in all, the idea is to use a best-effort
approach - it's simply not possible to guarantee that a notification has been
delivered successfully without blocking requests.
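A best-effort approach of this kind can be sketched as a bounded in-memory queue drained by a background thread. This is only an illustration, not the proposed middleware: the class name, the `send` callable, and the queue size are hypothetical, and a real implementation would deliver to Zaqar or oslo.messaging instead.

```python
import queue
import threading


class BestEffortNotifier:
    """Queue events for background delivery without blocking the caller."""

    def __init__(self, send, max_pending=100):
        self._send = send  # hypothetical delivery callable
        self._pending = queue.Queue(maxsize=max_pending)
        self.dropped = 0
        threading.Thread(target=self._drain, daemon=True).start()

    def notify(self, event):
        """Enqueue an event; return False (and drop it) if the queue is full."""
        try:
            self._pending.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def _drain(self):
        while True:
            event = self._pending.get()
            try:
                self._send(event)
            except Exception:
                # Best effort: a failed delivery is counted, not retried.
                self.dropped += 1
            finally:
                self._pending.task_done()

    def flush(self):
        """Wait until all queued events have been attempted (for tests)."""
        self._pending.join()


delivered = []
notifier = BestEffortNotifier(delivered.append)
notifier.notify({"method": "PUT", "path": "/AUTH_demo/container/object"})
notifier.flush()
assert len(delivered) == 1
```

The request path only ever calls `notify`, which returns immediately; the trade-off is that events can be lost when the queue is full or delivery fails.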
Container sharding
As of today it's a good idea to avoid billions of objects in a single container
in Swift, because writes to that container can then become slow. Matt started
working on container sharding some time ago, and iterated once again because
he ran into new problems with the previous ideas. My impression is that the new
idea is getting much closer to something that will eventually be merged, thanks
to Matt's persistence on this topic.
There were a lot more (smaller) topics that were discussed, but this should
give you an overview of the current work going on in the Swift community and
the interesting new features that we'll hopefully see soon in Swift itself.
Thanks to everyone who contributed and participated, and special thanks to
Richard for organizing the hackathon - it was a great week and I'm looking
forward to the next months!