• Jo Miran@lemmy.ml
    4 months ago

    We do Grafana + Prometheus for most of our clients, but I think adding Loki to the mix might be necessary. The number of clients that are missing basic events like “you’ve run out of disk space… two days ago” is too damn high.

    • Machindo@lemmy.ml
      4 months ago

      I would add Alertmanager to your stack if you haven’t already. It’s pretty tightly integrated with Prometheus. There are some canned alerting rules that predict when a disk will fill up within X days. We wire Alertmanager to PagerDuty.
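
      For anyone wondering what those predictive rules boil down to: as far as I know they are built on PromQL's `predict_linear()`, which fits a straight line through recent samples and extrapolates it. Below is a rough Python sketch of that idea, not Prometheus's actual code. The function names, the `(timestamp, free_bytes)` sample format, and the 4-day default horizon are all assumptions made up for illustration.

```python
from typing import Optional, Sequence, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, free bytes); hypothetical format


def seconds_until_full(samples: Sequence[Sample]) -> Optional[float]:
    """Fit a least-squares line through the samples and return the number of
    seconds after the last sample at which free space is projected to hit zero.
    Returns None if there is too little data or free space is not shrinking."""
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples share one timestamp, no trend to fit
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    if slope >= 0:
        return None  # free space is flat or growing
    intercept = mean_v - slope * mean_t
    t_zero = -intercept / slope  # time at which the fitted line crosses zero
    return max(0.0, t_zero - samples[-1][0])


def should_alert(samples: Sequence[Sample], horizon_days: float = 4.0) -> bool:
    """True if the disk is projected to fill within the given horizon."""
    eta = seconds_until_full(samples)
    return eta is not None and eta <= horizon_days * 86400


if __name__ == "__main__":
    GIB = 2 ** 30
    # Hypothetical disk: 100 GiB free, losing 1 GiB per hour, sampled hourly.
    history = [(h * 3600.0, (100 - h) * GIB) for h in range(7)]
    print(seconds_until_full(history), should_alert(history))
```

      The point of alerting on the trend rather than on a fixed "90% full" threshold is that the page arrives days ahead of the outage instead of at the moment the disk is already nearly full. In a real setup the rule would live in Prometheus and Alertmanager would only handle routing.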