- Shell: for a usage example, see
- Java: see the PushGateway class
- Go: see and PushAdd
- Python: see
- Ruby: see Pushgateway
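
Each of these clients ultimately sends metrics to the Pushgateway over HTTP. As a rough sketch of what such a push looks like on the wire, the following Java (standard library only, Java 11 or later) renders one gauge in the Prometheus text exposition format and builds, without sending, a POST request to the `/metrics/job/<job>` endpoint. The class name `PushSketch`, its method names, and the sample address are hypothetical and exist only for illustration. Real clients also escape special characters and handle errors, which this sketch leaves out.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any Prometheus client library.
class PushSketch {
    // Renders a single gauge sample in the text exposition format.
    // Assumes name and help contain no characters that need escaping.
    static String renderGauge(String name, String help, double value) {
        return "# HELP " + name + " " + help + "\n"
             + "# TYPE " + name + " gauge\n"
             + name + " " + value + "\n";
    }

    // Builds (but does not send) a POST request for the given job.
    // POST replaces only metrics with the same name in the group;
    // PUT would replace the whole group.
    static HttpRequest buildPushAdd(String address, String job, String body) {
        return HttpRequest.newBuilder()
            .uri(URI.create("http://" + address + "/metrics/job/" + job))
            .header("Content-Type", "text/plain; version=0.0.4")
            .POST(HttpRequest.BodyPublishers.ofString(body, StandardCharsets.UTF_8))
            .build();
    }
}
```

To actually send the request, pass it to `java.net.http.HttpClient.newHttpClient().send(...)` while a Pushgateway is listening on the chosen address.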
The following Java example shows how to instrument a batch job and alert when it has not completed successfully.
Code that runs the batch job:
    import io.prometheus.client.CollectorRegistry;
    import io.prometheus.client.Gauge;
    import io.prometheus.client.exporter.PushGateway;

    void executeBatchJob() throws Exception {
      CollectorRegistry registry = new CollectorRegistry();
      Gauge duration = Gauge.build()
          .name("my_batch_job_duration_seconds")
          .help("Duration of my batch job in seconds.")
          .register(registry);
      Gauge.Timer durationTimer = duration.startTimer();
      try {
        // Your code here.

        // This is only added to the registry after success,
        // so that a previous success in the Pushgateway isn't overwritten on failure.
        Gauge lastSuccess = Gauge.build()
            .name("my_batch_job_last_success_unixtime")
            .help("Last time my batch job succeeded, in unixtime.")
            .register(registry);
        lastSuccess.setToCurrentTime();
      } finally {
        // Record the elapsed time and push, whether or not the job succeeded.
        durationTimer.setDuration();
        PushGateway pg = new PushGateway("127.0.0.1:9091");
        pg.pushAdd(registry, "my_batch_job");
      }
    }
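
The reason the success gauge is registered only after the job succeeds follows from how `pushAdd` differs from `push`: `pushAdd` replaces only the metrics whose names are in the pushed registry, while `push` replaces everything stored for the job. The following in-memory model is a simplified sketch of that difference, not the Pushgateway's actual implementation. The class `GroupModel` and its fields are hypothetical, and labels and grouping keys are ignored.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the metrics stored for one job on the Pushgateway.
class GroupModel {
    // Stored sample values, keyed by metric name.
    final Map<String, Double> stored = new HashMap<>();

    // Models push: everything previously stored for the job is dropped.
    void push(Map<String, Double> pushed) {
        stored.clear();
        stored.putAll(pushed);
    }

    // Models pushAdd: only metrics with the same name are replaced,
    // all other stored metrics are kept.
    void pushAdd(Map<String, Double> pushed) {
        stored.putAll(pushed);
    }
}
```

In this model, a failed run that pushes only `my_batch_job_duration_seconds` with `pushAdd` leaves an earlier `my_batch_job_last_success_unixtime` value in place, which is what the alert relies on.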
To alert through Alertmanager when the job has not run successfully for a while, add the following to the alerting rules of the Prometheus server that scrapes the Pushgateway:
    ALERT MyBatchJobNotCompleted
      IF min(time() - my_batch_job_last_success_unixtime{job="my_batch_job"}) > 60 * 60
      FOR 5m
      WITH { severity="page" }
      SUMMARY "MyBatchJob has not completed successfully in over an hour"
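
To make the arithmetic in the rule's `IF` expression concrete, here is a small Java sketch of the same comparison for a single time series: the job counts as stale once more than 60 * 60 seconds have passed since the last recorded success. The class `StaleJobCheck` and its method are hypothetical. The sketch does not model the `FOR 5m` clause, which additionally requires the condition to hold for five minutes before the alert fires.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper mirroring:
//   time() - my_batch_job_last_success_unixtime > 60 * 60
class StaleJobCheck {
    // One hour, the threshold used in the alert expression.
    static final Duration MAX_AGE = Duration.ofSeconds(60 * 60);

    // Returns true when the last success is more than one hour before now.
    static boolean isStale(double lastSuccessUnixtime, Instant now) {
        double ageSeconds = now.getEpochSecond() - lastSuccessUnixtime;
        return ageSeconds > MAX_AGE.getSeconds();
    }
}
```

The comparison is strict, so a job whose last success was exactly one hour ago does not yet satisfy the condition.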