Troubleshooting GKE N1 Instance Pool Exhaustion in New Project

Hi all! I’m trying to create a GKE cluster in a new project, but I’m unable to provision n1-standard-* instances. When I click through to the instance group I see ZONE_RESOURCE_POOL_EXHAUSTED, despite being able to create instances in the same zone in another project. Even when I extend the cluster to provision nodes in the asia-northeast1-a, -b, and -c zones, I get the same ZONE_RESOURCE_POOL_EXHAUSTED error in every zone. What am I missing?
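For anyone who wants to confirm that every zone really reports the same code, here is a minimal sketch that tallies error codes from the output of `gcloud compute instance-groups managed list-errors <group> --zone <zone> --format=json`. The sample data and the function name `tally_error_codes` are made up for illustration, and the JSON shape (an `error` object with a `code` field per entry) is my assumption about that output, so check it against what you actually get.

```python
import json
from collections import Counter


def tally_error_codes(list_errors_json: str) -> Counter:
    """Count error codes in a JSON list of managed instance group errors.

    Assumes each entry looks like {"error": {"code": ..., "message": ...}}.
    Entries without a code are counted under "UNKNOWN".
    """
    entries = json.loads(list_errors_json)
    return Counter(
        entry.get("error", {}).get("code", "UNKNOWN") for entry in entries
    )


# Hypothetical sample, not real output from the project in this thread.
sample_errors = json.dumps([
    {"error": {"code": "ZONE_RESOURCE_POOL_EXHAUSTED", "message": "..."}},
    {"error": {"code": "ZONE_RESOURCE_POOL_EXHAUSTED", "message": "..."}},
    {"error": {"code": "QUOTA_EXCEEDED", "message": "..."}},
])

codes = tally_error_codes(sample_errors)
for code, count in codes.most_common():
    print(f"{code}: {count}")
```

If the only code that ever shows up is ZONE_RESOURCE_POOL_EXHAUSTED, in every zone, that points away from a one-off capacity blip.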

Do you need N1 specifically, rather than another instance type? I would probably use something else if possible. That said, did you check the various quotas and limits for the new project? Maybe this is a quota issue, even though the error isn’t the usual, obvious quota error.

Is the cluster in standard or autopilot mode?

I have sometimes seen odd or non-obvious errors like this on new projects, so you may want to contact your sales team or your reseller to double-check that they don’t need to tweak any settings for you.

We need N1 instances because we need to use Nvidia T4 GPUs.

We raised the T4 GPU and general compute quotas, but the issue persists, and I narrowed it down to N1 instances specifically. There’s no specific quota setting for N1 CPU cores.
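A rough way to double-check those quotas outside the console is to read them from `gcloud compute regions describe asia-northeast1 --format=json`. As far as I know, N1 cores count against the general `CPUS` regional quota and T4s against `NVIDIA_T4_GPUS`. The sketch below assumes that, and the sample numbers and the name `quota_headroom` are hypothetical.

```python
import json

# Regional quota metrics assumed to matter for N1 + T4 node pools.
QUOTA_METRICS = ("CPUS", "NVIDIA_T4_GPUS")


def quota_headroom(region_json: str, metrics=QUOTA_METRICS) -> dict:
    """Return limit minus usage for each quota metric of interest.

    Expects the JSON printed by `gcloud compute regions describe`,
    which carries a "quotas" list of {"metric", "limit", "usage"}.
    """
    region = json.loads(region_json)
    headroom = {}
    for quota in region.get("quotas", []):
        if quota["metric"] in metrics:
            headroom[quota["metric"]] = quota["limit"] - quota["usage"]
    return headroom


# Hypothetical sample values, not taken from the project in this thread.
sample_region = json.dumps({
    "name": "asia-northeast1",
    "quotas": [
        {"metric": "CPUS", "limit": 24.0, "usage": 0.0},
        {"metric": "NVIDIA_T4_GPUS", "limit": 4.0, "usage": 0.0},
        {"metric": "N2_CPUS", "limit": 24.0, "usage": 8.0},
    ],
})

print(quota_headroom(sample_region))
```

There is also a project-wide `GPUS_ALL_REGIONS` quota, visible through `gcloud compute project-info describe`, that may be worth checking on a new project. If all of these show headroom and the error persists, quota is probably not the cause.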

Last I heard, the client is reaching out to Google about their account, but I was hoping to find a solution sooner, since they’re not as quick about getting things done as I’d like.

I ran into something similar, and at the time contacting support was the only way to get it fixed, IIRC. You can definitely see some quota-related things (and request increases) from within the console too, though.

Are you attaching the GPUs as part of the cluster creation too?

I’m specifically requesting GPU nodes when I create the node pool, which limits me to N1 machine types.
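For reference, the kind of node pool request being described looks roughly like the sketch below, which builds a `gcloud container node-pools create` command with a T4 accelerator. The pool name, cluster name, machine size, and the helper `t4_node_pool_command` are placeholders I made up, and T4 availability differs by zone, so verify the zone list before running anything.

```python
import shlex


def t4_node_pool_command(pool, cluster, region, zones,
                         machine_type="n1-standard-4", gpu_count=1):
    """Build a gcloud command for a GKE node pool with NVIDIA T4 GPUs.

    All names passed in are placeholders. The N1 check mirrors the
    constraint discussed in this thread: T4s attach to N1 machine types.
    """
    if not machine_type.startswith("n1-"):
        raise ValueError("T4 GPUs require an N1 machine type")
    return [
        "gcloud", "container", "node-pools", "create", pool,
        f"--cluster={cluster}",
        f"--region={region}",
        f"--node-locations={','.join(zones)}",
        f"--machine-type={machine_type}",
        f"--accelerator=type=nvidia-tesla-t4,count={gpu_count}",
    ]


# Hypothetical names; only the region and zone come from this thread.
cmd = t4_node_pool_command(
    pool="gpu-pool",
    cluster="my-cluster",
    region="asia-northeast1",
    zones=["asia-northeast1-a"],
)
print(shlex.join(cmd))
```

Printing the command rather than running it keeps the sketch safe to try. Paste the output into a shell once the names match your setup.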

The news this morning is that the client did reach out to Google, and apparently they can’t use those things until their first payment goes through. :man-facepalming:

I’m guessing that it’s a security “feature” that the error is so vague vs. making it clear that it’s an artificial limitation :upside_down_face:, but that all jibes with my memory.

Do they have a Google account rep? They should ask the rep to whitelist the billing account. There is an internal tool for this that will upgrade their billing account to have a better “reputation”, which should allow them to proceed.