Cephalocon 2022 has ended
July 11 - 13, 2022 | Portland, Oregon + Virtual

Tuesday, July 12 • 4:05pm - 4:45pm
Why We Built A “Message-Driven Telemetry System At Scale” Ceph Cluster - Xiaolin Lin & Matthew Leonard, Bloomberg LP

Bloomberg's Ceph clusters are the backbone of our internal S3 cloud storage systems, which handle billions of requests per day. To keep these clusters running smoothly, we need a reliable telemetry system. Ceph’s Prometheus module provides performance counter metrics via the ceph-mgr component. While this paradigm works well for smaller installations, placing the metrics workload on ceph-mgr can become problematic at scale. To make this even more challenging, Ceph is just one component of our internal S3 product. We also need to gather telemetry data about consumables such as space, objects per bucket, and buckets per tenancy, as well as telemetry from a software-defined distributed quality of service (QoS) system, which is not natively supported by Ceph. We therefore built a holistic telemetry system that collects and monitors several aspects of our product, including Ceph clusters, usage, and QoS, and presents them to our internal users as a unified view in a single pane of glass. In this presentation, we will talk about why we built a custom message-driven telemetry monitoring system and how we made it scalable, extensible, fault-tolerant, and able to support both S3 and block storage clusters.

Xiaolin Lin

Senior Software Engineer, Bloomberg LP
Xiaolin Lin is a Senior Software Engineer at Bloomberg. He is part of the Storage Engineering team in the company's Technology Infrastructure department. Prior to his current role, he worked on the Charting Platform, dealing with real-time time-series data. He currently leads metric system...

Matthew Leonard

Storage Engineering Manager, Bloomberg LP
Matthew Leonard leads the Storage Engineering organization in Bloomberg Engineering’s Technology Infrastructure department. Matthew worked on software for fighter planes, and now leverages his “slow is smooth, smooth is fast” mentality from the aerospace industry to direct Bloomberg’s...

Tuesday July 12, 2022 4:05pm - 4:45pm PDT
Regency Ballroom B