4/30/2023

Monit vs prometheus

At Mattermost, our monitoring solution is continuously evolving to meet the needs of our growing infrastructure. Our previous architecture used Prometheus federation and was perfect for our small/medium infrastructure size, but was not able to scale in the way we needed. This post will explain how we used Thanos and the Prometheus operator to scale our monitoring infrastructure and meet our long-term storage needs.

Overcoming Prometheus Scaling Difficulties

Prometheus is an open source systems monitoring and alerting toolkit that's designed to record numeric time series events and make it easy to query data about the events. Prometheus is designed for reliability, and we used the federation service to scrape metrics in a multi-cluster environment. In this setup, a central Prometheus server scraped data from multiple client Prometheus servers using Prometheus federation.

It was possible to scrape metrics and handle alerts in real time, but everything was stored in local volumes, which caused several problems:

- Data retention policy needed to be low to avoid huge, costly volumes.
- Each Prometheus server needed a big dedicated volume to operate, which increased costs.
- We were not able to retrieve metric data for a long period of time.
- Recovering data after volume failures was hard.
- Alert anomaly detection was not possible with a low data retention policy.
- An increase in client Prometheus metrics meant that the central Prometheus had to scrape more and more data, which led to timeouts.

For our new monitoring setup, we selected our tooling with long-term needs in mind. The Prometheus deployment has been replaced by the Prometheus operator running in all our clusters, and Thanos is the newly introduced component. Thanos brings high availability and long-term storage capabilities to Prometheus. Grafana and all the dashboards that were developed in the first setup are still being used, but Grafana now gets data from a Thanos query component.
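
To make the trade-off between retention and volume size in the federation setup more concrete, here is a rough Python sketch. It uses the capacity-planning rule of thumb from the Prometheus storage documentation (retention time in seconds, multiplied by ingested samples per second, multiplied by bytes per sample, with roughly 1 to 2 bytes per sample on average). The function names and all input numbers are hypothetical illustrations, not figures from our clusters.

```python
# Rough local-storage estimate for a single Prometheus server.
# Rule of thumb from the Prometheus storage documentation:
#   needed_disk_space = retention_seconds * samples_per_second * bytes_per_sample
# The ingestion rate and retention periods used below are hypothetical.

SECONDS_PER_DAY = 24 * 60 * 60


def estimate_disk_bytes(retention_days, samples_per_second, bytes_per_sample=2.0):
    """Return the approximate on-disk size in bytes for the given retention."""
    if retention_days < 0 or samples_per_second < 0 or bytes_per_sample <= 0:
        raise ValueError("inputs must be non-negative and bytes_per_sample positive")
    retention_seconds = retention_days * SECONDS_PER_DAY
    return retention_seconds * samples_per_second * bytes_per_sample


def to_gib(num_bytes):
    """Convert a byte count to gibibytes."""
    return num_bytes / 1024**3


if __name__ == "__main__":
    # Compare a short retention period with longer ones at the same ingestion rate.
    for days in (15, 90, 365):
        size_gib = to_gib(estimate_disk_bytes(days, samples_per_second=100_000))
        print(f"{days:>3} days of retention -> about {size_gib:,.0f} GiB")
```

Because the estimate grows linearly with retention, keeping a year of data on every server's local volume costs about 24 times as much disk as keeping 15 days. That is the pressure that kept our retention low and made long-term storage outside the local volumes attractive.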