AWS RDS Proxy for Aurora MySQL cluster connection issue

Hi all, I’m having a weird issue using AWS RDS Proxy with an Aurora MySQL cluster: when I check the status of the targets, one of the instances is in an AVAILABLE state but the other is unhealthy. This is the output of aws rds describe-db-proxy-targets:

```
{
    "Targets": [
        {
            "Endpoint": "endpoint-instance-0.region.rds.amazonaws.com",
            "TrackedClusterId": "db-cluster",
            "RdsResourceId": "db-instance-0",
            "Port": 3306,
            "Type": "RDS_INSTANCE",
            "Role": "READ_WRITE",
            "TargetHealth": {
                "State": "AVAILABLE"
            }
        },
        {
            "Endpoint": "endpoint-instance-1.region.rds.amazonaws.com",
            "TrackedClusterId": "db-cluster",
            "RdsResourceId": "db-instance-1",
            "Port": 3306,
            "Type": "RDS_INSTANCE",
            "Role": "UNKNOWN",
            "TargetHealth": {
                "State": "UNAVAILABLE",
                "Reason": "UNREACHABLE",
                "Description": "Timeout connecting to the database"
            }
        },
        {
            "RdsResourceId": "db-cluster",
            "Port": 3306,
            "Type": "TRACKED_CLUSTER"
        }
    ]
}
```
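
For anyone triaging output like this, here is a minimal Python sketch that pulls out only the unhealthy instance targets. The function name `unhealthy_targets` and the trimmed `SAMPLE` payload are made up for illustration; only the field names come from the output above.

```python
import json

# Trimmed, illustrative payload; the "UNAVAILABLE" state value is an assumption.
SAMPLE = """
{
    "Targets": [
        {"RdsResourceId": "db-instance-0", "Type": "RDS_INSTANCE",
         "Role": "READ_WRITE", "TargetHealth": {"State": "AVAILABLE"}},
        {"RdsResourceId": "db-instance-1", "Type": "RDS_INSTANCE",
         "Role": "UNKNOWN",
         "TargetHealth": {"State": "UNAVAILABLE", "Reason": "UNREACHABLE",
                          "Description": "Timeout connecting to the database"}},
        {"RdsResourceId": "db-cluster", "Type": "TRACKED_CLUSTER"}
    ]
}
"""


def unhealthy_targets(payload: str) -> list:
    """Return a summary dict for each instance target that is not AVAILABLE."""
    problems = []
    for target in json.loads(payload).get("Targets", []):
        if target.get("Type") != "RDS_INSTANCE":
            continue  # the TRACKED_CLUSTER entry has no TargetHealth block
        health = target.get("TargetHealth", {})
        if health.get("State") != "AVAILABLE":
            problems.append({
                "id": target.get("RdsResourceId"),
                "role": target.get("Role"),
                "state": health.get("State"),
                "reason": health.get("Reason"),
                "description": health.get("Description"),
            })
    return problems


if __name__ == "__main__":
    for problem in unhealthy_targets(SAMPLE):
        print(problem)
```

To run it against a real proxy, you could save the CLI output to a file and pass the file contents in instead of `SAMPLE`.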


And this is what I see in the instance logs:



```[Server] Access denied for user 'rdsadmin'@'localhost' (using password: YES) (sql_authentication.cc:1412)```

I mean, trust the error message: bad password. Do you have automated password rotation via Secrets Manager?

Though one of the errors says timeout

I don’t use Secrets Manager; the password is self-managed, and I created the cluster and instances via Terraform. I can also connect to the instance and cluster endpoints with DBeaver

So you can connect to all nodes in the cluster?

No, individually with the master password. Connecting to the proxy also works but it is probably using the available RW endpoint

Gotcha, you can’t go through to some specific endpoint?

I assume if all the instances are good it’s a proxy misconfiguration somehow, like it’s trying to route to an old instance that was rebuilt or something

Or at least the health check is doing something different than you are when you can connect to the instance successfully
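
One way to separate the two symptoms in this thread (a timeout versus an access-denied error) is a plain TCP probe: a timeout points at the network path, while an open port followed by a login failure points at credentials. A rough Python sketch follows. `probe_tcp` is a hypothetical helper, and it only tells you something about the proxy's view if you run it from a host in the same VPC and subnets as the proxy, which is an assumption about your setup.

```python
import socket


def probe_tcp(host: str, port: int = 3306, timeout: float = 3.0) -> str:
    """Attempt a TCP connect and classify the result.

    Returns "open", "timeout", "refused", or "error: <details>".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"  # TCP handshake completed; auth is a separate question
    except socket.timeout:
        return "timeout"  # packets dropped, e.g. by a security group or routing
    except ConnectionRefusedError:
        return "refused"  # host reachable, but nothing is listening on the port
    except OSError as exc:
        return f"error: {exc}"  # DNS failure, unreachable network, etc.


# Example call (endpoint name taken from the output earlier in the thread):
# print(probe_tcp("endpoint-instance-1.region.rds.amazonaws.com", 3306))
```

If the probe returns "open" from inside the VPC while the proxy still reports a timeout, the next suspects would be the proxy's own subnets and security groups rather than the instance.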

That’s the thread I would pull on but unfortunately I don’t have like an aha here’s the problem

Are you logging in with the same account as the proxy health check?

The fact that you got a bad password error makes me think that something changed

Yes, same account. The health check gets the credentials from a secret, but that secret replicates the master password

Can you just look at the values and make sure that they’re right?

Or make sure that they match I guess
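
A small Python sketch of that comparison, done without printing the password anywhere. It assumes the secret body is JSON with `username` and `password` keys, which is the shape RDS Proxy reads; `secret_matches` is a hypothetical helper, and fetching the secret text (for example with `aws secretsmanager get-secret-value`) is left out.

```python
import hmac
import json


def secret_matches(secret_string: str, expected_user: str, expected_password: str) -> dict:
    """Compare a secret's JSON body with the expected credentials.

    Returns only booleans, so the result is safe to paste into a chat.
    """
    result = {"valid_json": False, "username_ok": False,
              "password_ok": False, "has_stray_whitespace": False}
    try:
        secret = json.loads(secret_string)
    except json.JSONDecodeError:
        return result
    if not isinstance(secret, dict):
        return result

    user = str(secret.get("username", ""))
    password = str(secret.get("password", ""))
    result["valid_json"] = True
    # compare_digest gives a constant-time comparison of the byte strings
    result["username_ok"] = hmac.compare_digest(user.encode(), expected_user.encode())
    result["password_ok"] = hmac.compare_digest(password.encode(), expected_password.encode())
    # A trailing newline or space is an easy mistake when templating secrets
    result["has_stray_whitespace"] = password != password.strip()
    return result
```

The whitespace flag is there because a templated secret can pick up a trailing newline, which would make the stored value differ from the master password even though the two look identical.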

I believe something is wrong with the instance but can’t pinpoint it… the bad instance is in us-east-1c and the good one is in us-east-1a

You can take drastic measures but I would do all of the reasonable sanity checks first

I’m thinking my next step should be putting them both in the same AZ, since I’m not going to need multi-AZ at this point, though I wonder why this fails given that multi-AZ is supported

That seems even more drastic than I was thinking lol

You have some threads to pull on, you’re not in outage