I want to create Prometheus alert rules for the scenarios below:
Max capacity reached for the cluster
Unusual scaling activity

I think "Max capacity reached for the cluster" can be obtained from a combination of the following metrics:
1. cluster_autoscaler_unschedulable_pods_count > 0
2. sum(cluster_autoscaler_unneeded_nodes_count) == 0

And "Unusual scaling activity" can be obtained from sum(cluster_autoscaler_scaled_up_nodes_total).
I have enabled metrics for the Cluster Autoscaler. However, I am not sure how to create Prometheus rule expressions with these metrics. Should I create any ServiceMonitors? How should I combine these metrics for the scenarios mentioned above? Do you already have examples of Prometheus rules for the Cluster Autoscaler metrics?
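
To make the question concrete, here is a rough sketch of the kind of rule file I have in mind. It is only a guess at how the metrics above might be combined: the group name, alert names, thresholds (10 nodes), time windows (30m), and `for` durations are placeholders I made up, not recommended values. I used increase() on the counter because the raw total only ever grows.

```yaml
# Sketch only: names, thresholds, and durations below are assumptions.
groups:
  - name: cluster-autoscaler            # hypothetical group name
    rules:
      # Scenario 1: pods cannot be scheduled and no node is marked unneeded.
      - alert: ClusterAutoscalerMaxCapacityReached   # hypothetical alert name
        expr: |
          sum(cluster_autoscaler_unschedulable_pods_count) > 0
          and
          sum(cluster_autoscaler_unneeded_nodes_count) == 0
        for: 15m                         # placeholder duration
        labels:
          severity: warning
        annotations:
          summary: Unschedulable pods exist and no nodes are unneeded.

      # Scenario 2: more nodes added in the window than expected.
      - alert: ClusterAutoscalerUnusualScaleUp       # hypothetical alert name
        expr: sum(increase(cluster_autoscaler_scaled_up_nodes_total[30m])) > 10
        labels:
          severity: warning
        annotations:
          summary: More than 10 nodes were added in the last 30 minutes.
```

If the Prometheus Operator is in use, I assume the same `groups` section would go under `spec` of a PrometheusRule resource instead of a standalone rule file.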