In Reflecting on Apache Arrow in 2022, Will Jones does a really nice job providing a history of the Apache Arrow project and the broader ecosystem it was originally created to help foster. It’s worth a read in full.
In his post, he describes the C++ Arrow ecosystem as somewhat fractured. He suggests this may stem primarily from other teams' need to move fast, but points out that it may also have something to do with libarrow’s attractiveness as a dependency.
One quote that jumped out at me as particularly insightful is this one:
Yet those are all the same challenges our users experience; would it not be better if we felt those pains ourselves and had incentive to address them? I tend to think we would design better public APIs if we had to use them ourselves for our own query engine.
This immediately reminded me of something I think Jenny Bryan said (which I cannot currently find) about doing the hard things often so they aren’t hard anymore. If integrating parts of the Arrow ecosystem with each other is hard for members of the Arrow project, it’s likely to be considerably harder for those outside of it, and I look forward to watching work on this front progress in 2023.