Hi there, I’m new to Prometheus and have absolutely no dev skills and no understanding of metrics. I know there are docs about Prometheus on their GitHub and I tried to read them, but I still can’t write my own metrics…
Could anyone translate some of my objectives into Prometheus metrics? (I would like to create alerts via Alertmanager.)
N.B.: Everything works (Prometheus, Alertmanager, prom2teams and Grafana); I just need to write the metrics for what I want to monitor:
These are on Windows (wmi exporter):
• Logical disk is nearly full (alert at 5%)
• Physical disk is nearly full (alert at 5%)
• network OK (ping) (alert when ping has been failing for 5m)
• CPU above 90% (alert when CPU has been above 90% for 1h, with exceptions for some servers)
• memory above 90% (alert when memory has been above 90% for 1h)
• network card is using DHCP instead of a static IP
• number of network packets in error is above 0 (only on ESXi and physical servers, not on virtual machines)
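
To make these objectives concrete, a rule file for some of them might look like the rough sketch below. The metric names are assumed from the wmi exporter's logical_disk, cpu, os, cs and net collectors and should be checked against the exporter's /metrics output; the group name, alert names and the excluded-server pattern are hypothetical. It also assumes that "alert at 5%" means less than 5% free space. The `up` check only shows that Prometheus can scrape the exporter; a real ICMP ping would need something like the blackbox exporter's `probe_success` metric.

```yaml
groups:
  - name: windows-basics            # hypothetical group name
    rules:
      - alert: LogicalDiskNearlyFull
        # Assumption: "alert at 5%" means less than 5% free space remains.
        expr: 100 * wmi_logical_disk_free_bytes / wmi_logical_disk_size_bytes < 5
      - alert: TargetUnreachable
        # Scrape failure for 5 minutes; this is not a true ICMP ping.
        expr: up == 0
        for: 5m
      - alert: HighCpu
        # CPU usage = 100 minus the idle percentage, averaged over all cores.
        # The instance filter is a hypothetical way to express server exceptions.
        expr: 100 - (avg by (instance) (rate(wmi_cpu_time_total{mode="idle", instance!~"excluded-server.*"}[5m])) * 100) > 90
        for: 1h
      - alert: HighMemory
        # Used memory as a percentage of total physical memory.
        expr: 100 * (1 - wmi_os_physical_memory_free_bytes / wmi_cs_physical_memory_bytes) > 90
        for: 1h
      - alert: NetworkPacketErrors
        # Any received or outbound packet errors during the last 5 minutes.
        expr: rate(wmi_net_packets_received_errors[5m]) + rate(wmi_net_packets_outbound_errors[5m]) > 0
```

Limiting the packet-error rule to ESXi and physical servers would need a label that separates them from virtual machines, which this sketch does not assume exists.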
More specific:
• Citrix: when Citrix services haven’t been available for 5 minutes
• SQL: when SQL Server services haven’t been available for 5 minutes
• IIS: when IIS services haven’t been available for 5 minutes
• printing: when spoolers haven’t been available for 5 minutes
• Oracle: when Oracle services haven’t been available for 5 minutes
• AD: when AD hasn’t been available for 5 minutes
• DHCP & DNS: when DHCP & DNS services haven’t been available for 5 minutes
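
For the service checks above, one possible shape is a single rule on the wmi exporter's service collector, sketched below. It assumes the `wmi_service_state` metric with `name` and `state` labels and that service names are exposed in lowercase; the names listed are the usual Windows service names for the print spooler, IIS, a default SQL Server instance, DNS, DHCP and AD, and they are assumptions to verify on each server. Citrix and Oracle service names vary by installation, so they are left out.

```yaml
groups:
  - name: windows-services          # hypothetical group name
    rules:
      - alert: WindowsServiceDown
        # A value of 0 for state="running" means the service is not running.
        # Assumed service names; adjust the regex to the services actually installed.
        expr: wmi_service_state{state="running", name=~"spooler|w3svc|mssqlserver|dns|dhcpserver|ntds"} == 0
        for: 5m
        annotations:
          summary: "Service {{ $labels.name }} on {{ $labels.instance }} is not running"
```

One rule with a regex keeps the file short; splitting it into one alert per service would allow different routing in Alertmanager.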
Thanks in advance to anyone answering my questions