hey all, I’m configuring database monitoring and tracing in one of our production clusters and we’ve noticed that deploying the change blows up datadog-agent memory usage on one of the pods (it goes from 1 GB to 6+ GB; the rest are mostly unaffected). I need some advice on which <https://github.com/DataDog/integrations-core/blob/master/postgres/datadog_checks/postgres/data/conf.yaml.example|postgres monitoring parameters> I should tune to optimise this a little.
```
cluster_check = true
init_config   = {}
instances = [
  for endpoint in setunion(
    [var.dd_agent_postgres_URL],
    [for instance in var.dd_agent_postgres_replicas : "${instance}.${local.db_host}"]
  ) : {
    dbm      = true
    host     = endpoint
    port     = 5432
    username = "datadog"
    dbname   = var.dd_agent_db_name
    password = var.dd_agent_postgres_password
    dbstrict = true

    collect_settings = {
      enabled = true
    }
    collect_schemas = {
      enabled = true
    }
    collect_function_metrics = true
    collect_bloat_metrics    = true

    query_samples = {
      enabled                       = true
      explain_parameterized_queries = true
    }

    tags = [
      "aws_account_environment:${var.environment}",
      "env:${var.environment}",
      "region:${var.region}",
      "service:aurora_${var.dd_agent_db_name}",
    ]

    max_relations = 1000
    relations = [
      {
        relation_regex = ".*"
        relkind        = ["r", "i", "s", "t", "m", "c", "f", "p"]
      }
    ]
  }
]
```
that’s our TF config for the agent. Postgres is an Aurora cluster: 1 writer and 2 read replicas
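looking at the example conf, my guess is the heaviest knobs in that instance block are the catch-all relations entry (regex ".*" across every relkind, with max_relations bumped to 1000) plus collect_schemas. this is the scoped-down variant I’m considering. just a sketch, assuming the TF keys map 1:1 onto the YAML options; the app_ prefix and public schema are placeholders for whatever we actually query:
```
# Track only the tables we actually care about instead of ".*" over every relkind.
relations = [
  {
    relation_regex = "^app_.*"   # hypothetical prefix, narrow as needed
    relkind        = ["r", "p"]  # ordinary + partitioned tables only
    schemas        = ["public"]
  }
]
max_relations = 300              # lower the cap (300 is the example-file default, IIRC)

# Schema collection is expensive on wide databases; stretch the interval.
collect_schemas = {
  enabled             = true
  collection_interval = 3600     # seconds, vs. the much shorter example-file default
}
```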
I have a feeling this may be caused by the config being repeated for each instance separately, but that’s the only way I was able to get the metrics populated correctly
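for context on why only one pod blows up: that for expression expands to three full DBM instances inside a single cluster check, and with cluster_check = true the whole check presumably gets scheduled onto one cluster-check runner. roughly this (replica hostnames made up):
```
# What the for expression effectively produces: one check, three instances,
# each carrying all of the dbm/collect_* options above.
instances = [
  { host = var.dd_agent_postgres_URL, dbm = true },      # writer
  { host = "replica-1.${local.db_host}", dbm = true },   # reader (hypothetical name)
  { host = "replica-2.${local.db_host}", dbm = true },   # reader (hypothetical name)
]
```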
ok so I pinpointed it: if I disable query_metrics, the memory usage does not increase. unfortunately that also disables most of the DBM features, so I’ll try tuning the collection interval there I guess
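something along these lines, sketch only, assuming query_metrics / query_samples take an explicit collection_interval the way they do in the conf.yaml.example:
```
# Keep query metrics on but poll less aggressively.
query_metrics = {
  enabled             = true
  collection_interval = 60   # seconds; the example-file default is much shorter
}

# Samples are the other per-query collector; back this one off too.
query_samples = {
  enabled                       = true
  collection_interval           = 10   # vs. the ~1s default, if I'm reading the example right
  explain_parameterized_queries = true
}
```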