Cephalocon 2022 has ended
July 11 - 13, 2022 | Portland, Oregon + Virtual

Wednesday, July 13 • 5:20pm - 6:00pm
pgremapper: CRUSHing Cluster Operational Complexity - Joshua Baergen, DigitalOcean

When working with production Ceph clusters, making changes to your CRUSH map (e.g. adding storage) can be highly disruptive. These changes can result in days to weeks of backfill that's hard to control due to the way that Ceph's backfill scheduling works, holding your cluster hostage by blocking other maintenance activities and slowing down recovery from failure events. In 2018, Dan van der Ster of CERN IT presented techniques ("Mastering Ceph Operations: Upmap and the Mgr Balancer") that they had developed to improve the safety of CRUSH changes. DigitalOcean's storage team has built on these techniques with an open-source tool called pgremapper. In this talk, Joshua Baergen will discuss the problems that operators encounter with CRUSH changes at scale and how DigitalOcean built pgremapper to control and speed up CRUSH-induced backfill.

Joshua Baergen

Senior Engineer II, DigitalOcean
Joshua Baergen is the Technical Lead of the Storage Systems team at DigitalOcean. This team is responsible for designing and operating the persistence layers for the Volumes and Spaces products (which are built on Ceph) as well as for Droplet images, snapshots, and backups. He has...

Wednesday July 13, 2022 5:20pm - 6:00pm PDT
Regency Ballroom B