I am seeing an issue scaling up m6i.8xlarge instances with the autoscaler on AWS. Other instance types, such as m6i.large and m6i.xlarge, are being launched.

Below is my pod definition:

```
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  namespace: argo
  labels:
    app: my-app
spec:
  containers:
    - name: my-container
      image: nginx:latest
      resources:
        requests:
          memory: "120Gi"
          cpu: "30"
        limits:
          memory: "128Gi"
          cpu: "32"
  nodeSelector:
    eks.amazonaws.com/nodegroup: default-spot
  tolerations:
    - key: pipeline
      operator: Equal
      value: ""
      effect: NoSchedule
```


Autoscaler pod logs:


```
I0731 02:05:42.399310       1 scale_up.go:93] Pod my-pod can't be scheduled on eks-default-spot-26c51474-c6ce-4ced-a882-5bfbcaf57a06, predicate checking error: Insufficient cpu, Insufficient memory; predicateName=NodeResourcesFit; reasons: Insufficient cpu, Insufficient memory; debugInfo=
I0731 02:05:42.399329       1 scale_up.go:262] No pod can fit to eks-default-spot-26c51474-c6ce-4ced-a882-5bfbcaf57a06
```
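
To make the `NodeResourcesFit` message in the log easier to read, here is a rough Python sketch of the comparison it reports: the pod's requests (30 CPU, 120Gi) checked against the raw size of an empty node. The capacity table and the `insufficient_resources` helper are illustrative assumptions, not autoscaler code, and the log does not show which instance size the autoscaler assumed for this node group.

```python
# Raw capacity (vCPU, memory in GiB) of the m6i sizes mentioned in this thread.
# Assumption: a real node's allocatable capacity is somewhat lower, because the
# kubelet and the system reserve part of the CPU and memory.
M6I_CAPACITY = {
    "m6i.large": (2, 8),
    "m6i.xlarge": (4, 16),
    "m6i.8xlarge": (32, 128),
}


def insufficient_resources(cpu_request, mem_request_gib, instance_type):
    """Return the reasons a request would not fit on an empty node of this type."""
    cpu, mem_gib = M6I_CAPACITY[instance_type]
    reasons = []
    if cpu_request > cpu:
        reasons.append("Insufficient cpu")
    if mem_request_gib > mem_gib:
        reasons.append("Insufficient memory")
    return reasons


# Requests taken from the pod definition above: cpu "30", memory "120Gi".
for instance_type in M6I_CAPACITY:
    reasons = insufficient_resources(30, 120, instance_type)
    print(instance_type, ", ".join(reasons) if reasons else "fits")
```

Under these assumptions only m6i.8xlarge passes the check, so the same pair of reasons as in the log would appear whenever the check is made against one of the smaller sizes.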


Screenshot of the ASG

This is probably just a lack of free compute capacity for the given parameters in the given region/AZ. AWS does not guarantee that capacity is available to run a Spot instance at any given time. Even on-demand instances sometimes hit capacity problems in a single AZ.

Thanks for the response, but how do we manage this? Another observation: if an m6i.8xlarge instance is already running in the node group and I request another one, the autoscaler launches the new one.

Try using other instance types. AWS does not necessarily have sufficient compute capacity to launch Spot instances with the given parameters (specific CPU and RAM).
Spot instances run on capacity that nobody else is using at a given time: one physical host may have 2 free CPUs and some RAM, another 4, and so on. Requesting smaller types gives a higher probability that such resources are available on some server.
If you want guaranteed availability of a given compute capacity, you must use reservations, and then scaling is not needed.
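
As a small illustration of the "other instance types" suggestion, the Python sketch below filters a hand-written table of EC2 types down to those whose raw size covers the pod's requests (30 CPU, 120Gi). The `INSTANCE_SPECS` table and the `candidate_types` helper are hypothetical names introduced here; the table is a small subset copied by hand, so check the current EC2 specifications before relying on it.

```python
# (vCPU, memory in GiB) for a few EC2 instance types.
# Assumption: hand-copied subset for illustration; verify against the EC2 docs.
INSTANCE_SPECS = {
    "m6i.4xlarge": (16, 64),
    "m6i.8xlarge": (32, 128),
    "m5.8xlarge": (32, 128),
    "m5a.8xlarge": (32, 128),
    "m6a.8xlarge": (32, 128),
    "r6i.8xlarge": (32, 256),
    "m6i.12xlarge": (48, 192),
}


def candidate_types(cpu_request, mem_request_gib, specs=INSTANCE_SPECS):
    """Return the instance types whose raw capacity covers the request, sorted by name."""
    return sorted(
        name
        for name, (cpu, mem_gib) in specs.items()
        if cpu >= cpu_request and mem_gib >= mem_request_gib
    )


# Requests taken from the pod definition in the question.
print(candidate_types(30, 120))
```

Offering several types of at least this size in the Spot node group gives AWS more capacity pools to draw from. The comparison uses raw capacity only, so a type that passes here can still be too small once node reservations are subtracted.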