Are there other metrics that’d be interesting to add here? I’ve added the DNS queries per second (for a subset of the servers that the system is getting metrics for anyway).
It’s not super interesting because the statuspage system seems to average it over 5 minutes; even in the 1 minute averages the brief peaks are quite smoothed out.
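To put rough numbers on the smoothing, here is a small sketch with made-up data (not real pool traffic): a steady 100 queries per second with a single 10 second burst at 2000 qps, averaged over 1 minute and 5 minute windows.

```python
def window_means(samples, window):
    """Average consecutive, non-overlapping windows of `window` samples."""
    means = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        means.append(sum(chunk) / len(chunk))
    return means

# Hypothetical per-second query counts: 5 minutes at 100 qps...
samples = [100] * 300
# ...with one 10 second burst at 2000 qps starting at t = 120 s.
for second in range(120, 130):
    samples[second] = 2000

one_minute = window_means(samples, 60)    # five 1 minute averages
five_minute = window_means(samples, 300)  # one 5 minute average

print("peak raw sample:    ", max(samples))
print("peak 1 min average: ", round(max(one_minute), 1))
print("peak 5 min average: ", round(max(five_minute), 1))
```

With these invented numbers the 2000 qps burst shows up as roughly 417 in the 1 minute series and roughly 163 in the 5 minute series, which is why short peaks nearly vanish from the graph.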
Yes, I’d like to do that, since it’s also what’s going to feed into the future automatic “help the weak zones” system. I don’t think it’s possible to create 100+ graphs, etc. with statuspage.io.
I have a Grafana dashboard with a version of this; the query is pretty intensive on the Prometheus backend and I don’t know how to make the Grafana dashboard “secure enough for the internet”, so it’ll need some more work to figure out how to share it.
You could feed the data from Prometheus into something like Munin, which will make graph image files using RRDtool. These images can be hosted anywhere with little effort.
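As a rough idea of the first half of that (getting the numbers out of Prometheus), here is a minimal sketch using the Prometheus HTTP API's `query_range` endpoint. The host name, the `dns_queries_total` metric and the sample response are invented for illustration; the Munin/RRDtool side is not shown.

```python
from urllib.parse import urlencode

def query_range_url(base_url, promql, start, end, step):
    """Build a URL for Prometheus' /api/v1/query_range endpoint."""
    params = {"query": promql, "start": start, "end": end, "step": step}
    return f"{base_url.rstrip('/')}/api/v1/query_range?{urlencode(params)}"

def series_points(response):
    """Flatten a query_range JSON response into (timestamp, value) tuples."""
    if response.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    points = []
    for series in response["data"]["result"]:
        for timestamp, value in series["values"]:
            # Prometheus returns sample values as strings.
            points.append((float(timestamp), float(value)))
    return points

# Hypothetical server and metric name; substitute your own.
url = query_range_url(
    "http://prometheus.example.org:9090",
    "sum(rate(dns_queries_total[1m]))",
    1700000000,
    1700000300,
    "60s",
)

# Hand-written response in the shape the API documents for a range query.
sample_response = {
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [
            {"metric": {}, "values": [[1700000000, "120.5"], [1700000060, "133"]]}
        ],
    },
}
points = series_points(sample_response)
print(url)
print(points)
```

The resulting points could then be handed to whatever draws the graphs; fetching the URL (for example with `urllib.request.urlopen`) is left out so the sketch runs without a live server.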
Thanks @mengzhuo for the tip. I’ve been looking at sharing some of my stats myself.
To make it easier, here is the request in curl. This returns a PNG from the specified dashboard and panel number.
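For anyone who would rather script it, a rough Python version of that kind of request might look like the sketch below. It is not the exact command from this post: it assumes a Grafana install where the image rendering endpoint (`/render/d-solo/...`) is available and an API token is accepted as a Bearer header, and the host name, dashboard UID, slug, panel number and token are all made-up placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_render_request(base_url, dashboard_uid, slug, panel_id, api_token,
                         width=1000, height=500,
                         time_from="now-6h", time_to="now"):
    """Build (but do not send) a request for a single panel rendered as PNG."""
    params = {
        "orgId": 1,            # assumption: default organisation
        "panelId": panel_id,
        "width": width,
        "height": height,
        "from": time_from,
        "to": time_to,
    }
    url = (f"{base_url.rstrip('/')}/render/d-solo/"
           f"{dashboard_uid}/{slug}?{urlencode(params)}")
    return Request(url, headers={"Authorization": f"Bearer {api_token}"})

# All values below are hypothetical placeholders.
req = build_render_request(
    "https://grafana.example.org/", "abc123", "dns-qps", 4, "TOKEN"
)
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` and writing the response body to a file would give you the PNG, which can then be hosted as a static image without exposing the dashboard itself.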