I’m having a pretty basic confusion about Prometheus: I’m trying to get Prometheus stats out of Heroku dynos so that I can see the stats in Grafana. But Heroku dynos are constantly changing / restarting / etc and there’s a variable number of them. If each running dyno has its own Prometheus instance, how do I send those per-dyno statistics to some centralized thing that can be used as a datasource in Grafana?
I think you are misinterpreting how Prometheus scraping is meant to be used. You should have a persistent Prometheus instance that scrapes the /metrics endpoints of your short-lived dynos. You'd then use that persistent Prometheus instance as a datasource in your Grafana.
But there’s no way in the Heroku architecture to access a specific dyno’s /metrics endpoint, afaik.
Dynos are just meant to be these transient, replaceable/restartable things. I don’t have much Kubernetes experience or otherwise, but don’t those systems support the idea of rolling deploys, e.g. spinning up new server instances while old ones are running, and then retiring the old instances? In such a system, given the dynamism, how would the centralized Prometheus instance even know which endpoints to scrape?
Hm, in that case you have three options:

1. Use Prometheus Pushgateway and push metrics from dynos there.
2. Deploy dyno-local Promethei and use the remote_write feature to forward the data to a central Prometheus instance.
3. Run Grafana Cloud Agent on each dyno and have it forward your metrics to a central Prometheus instance.

Option [3] is pretty much the same as [2], except the only thing the Agent does is forward the metrics. Option [2] would also allow you to query the dyno-local Promethei, but I’m not sure there’s value in that.
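For option [2], remote_write is just a block in each dyno-local instance's config. A minimal sketch (the central endpoint URL is a hypothetical placeholder; note that the central instance also has to have remote-write receiving enabled, e.g. via the remote-write-receiver feature flag in recent Prometheus versions):

```yaml
# prometheus.yml on each dyno-local Prometheus
remote_write:
  - url: "https://central-prom.example.com/api/v1/write"  # hypothetical central endpoint
    # optional extras (auth, buffering) would go here, e.g.:
    # basic_auth: ...
    # queue_config: ...
```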
To address your second comment: Prometheus was built with service discovery in mind; you can read about it here — https://github.com/prometheus/prometheus/tree/master/discovery. E.g. in a Kubernetes environment, Prometheus usually relies on the k8s API server and Pod labels to figure out which IP addresses it should scrape.
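As a concrete illustration of that Kubernetes case, a scrape job using pod service discovery might look roughly like this (a sketch; the opt-in label name `scrape_me` is an assumption for the example, not a standard):

```yaml
scrape_configs:
  - job_name: "k8s-pods"
    kubernetes_sd_configs:
      - role: pod              # discover Pods dynamically via the k8s API server
    relabel_configs:
      # keep only pods that carry the (hypothetical) opt-in label scrape_me=true
      - source_labels: [__meta_kubernetes_pod_label_scrape_me]
        action: keep
        regex: "true"
```

New pods matching the label are picked up automatically; retired pods drop out of the target list, so the dynamism is handled for you.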
Thank you for your help. So hard to know if this stuff is even worth fitting in my brain. We’re a small company, already spread thin, and don’t want to spend much time on devops (hence the Heroku charm), but I also want to have centralized metrics/reporting that doesn’t live in 5 different proprietary dashboards.
I hate to be a salesman, but you can always try out Grafana Cloud. There’s a free tier that should help you get started.
Maybe; one thing we’re interested in is providing an embedded multi-tenant dashboard to our clients.
Still trying to work through the details, but something tells me a productized cloud service might not offer the flexibility we need
It’s actually possible to have your hosted Grafana stack be accessible by different accounts with different permissions (albeit not on the free plan). I’m of course not sure how much flexibility you need, but the hosted solution may well be flexible enough.
Thanks for your help, I’ll take a closer look at hosted Grafana
If I understand the problem right, you should be able to use promxy and configure a single Prometheus datasource which abstracts over the multiple Prometheus instances.
That’s what we do at my place, it works rather well for our 5 separate prometheus instances.
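For the record, promxy's config is just a list of downstream server groups it fans queries out to. A minimal sketch (the hostnames are hypothetical):

```yaml
# promxy config: each server group lists a set of downstream Prometheus instances
promxy:
  server_groups:
    - static_configs:
        - targets:
            - "prom-1.example.internal:9090"   # hypothetical downstream instances
            - "prom-2.example.internal:9090"
```

Grafana then talks to promxy as if it were a single Prometheus.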
Do the separate Prometheus instances “push” to the single aggregate one? Or does the aggregate fan out and pull from the multiple Prom instances?
It’s just an HA proxy at the end of the day, so pull is my guess. Any specific requirements around that?
The main snag for getting this working on Heroku is that you can’t access the /metrics of specific running Heroku instances. Everything’s hidden away behind the Heroku load balancer, so I think that disqualifies any pull-based approach.
Is there any other need for the Heroku LB in the first place?