Managing resource limits in Kubernetes: Does reaching the memory limit trigger instant pod termination or a rolling update?

Hi there, quick question about how resource limits are managed in Kubernetes: will a pod reaching its memory limit be killed instantly, or is it supposed to trigger a rolling update? We have a memory leak in one of our pods, and until we find a way to fix this leak we don’t want any downtime when it reaches the limit.

It’ll be killed. But you can simply use a replica count > 1 and you’ll be fine (except if all of your pods exceed their memory limit at the same time).
(also, when it’s OOM-killed like that, it might not terminate connections cleanly and things like that)
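Something like this, as a minimal sketch — the my-app name, image and memory values are placeholders, not your actual setup:

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2                  # a second replica keeps serving while one pod is OOM-killed and restarted
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:1.0
          resources:
            requests:
              memory: "256Mi"
            limits:
              memory: "512Mi"  # exceeding this limit gets the container OOM-killed
```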

Thanks for your answer. Yes indeed, but the other problem we have is that our pod is currently not able to load balance traffic between 2 replicas, unfortunately :disappointed:

Your pod is not doing the load balancing, a k8s Service will do that.
Do you mean that your app does not support running 2 instances (or more) at the same time? In that case… :man-shrugging:
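(For the load-balancing part, the Service is what spreads the traffic; a minimal sketch, assuming the pods are labelled app: my-app and listen on 8080 — both placeholders:)

```
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app          # traffic is balanced across all ready pods carrying this label
  ports:
    - port: 80
      targetPort: 8080
```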

Yes, exactly, that’s what I meant: it’s not able to do that because of a limitation of our application

Well you’re out of luck.

Yes, that’s what I thought… Thanks a lot for your quick answer!

Note that you will have downtime when upgrading as well, even without a memory leak

Even if I trigger a restart? Because if I trigger an upgrade it will launch a new pod, wait for the health check to be OK and then switch to it, correct?

Well, you just told me that it does not support 2 instances at the same time.

Kubernetes can do a “surge”, i.e. create the new pod before removing the old one, but that doesn’t help if your app can’t run concurrently (well, it will still do it, but it won’t work)

At the same time no, but if it’s like this, then yes:
• pod 1 is running
• trigger a deployment
• new pod is being launched but no traffic is sent to it until healthcheck is OK
• new pod healthcheck is OK, traffic is switched from pod 1 to pod 2
• pod 1 is killed

that’s how it’s supposed to work with only 1 replica, right?

not step 4, there is no traffic switch, the old pod is just stopped (with SIGTERM, I think)

(which implicitly removes it from the LB pool, yeah)
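(The “no traffic until the healthcheck is OK” part is the readiness probe; a minimal sketch of what that looks like in the container spec, assuming a hypothetical /healthz endpoint on port 8080:)

```
# goes under the container entry in the pod template
readinessProbe:
  httpGet:
    path: /healthz       # hypothetical health endpoint
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
# while the probe fails, the pod stays out of the Service endpoints;
# when the pod is terminated (SIGTERM) it is removed from the endpoints too
```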

OK OK, got it. According to our dev team, it’s not possible to have more than 1 replica currently because it would mean that data received by our pod would be processed twice. So what you’re telling me is that currently triggering a rolling update will lead to downtime, but data should not be processed twice? Or is there a possibility that the 2 pods will be running together and both receiving traffic for a short period of time?

Depends on how you configure your deployment. You can choose downtime or overlap. It should be in the strategy key in your manifest

https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy
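For the “downtime but never two pods at once” option, a minimal sketch of the strategy key (it sits under spec in the Deployment):

```
spec:
  strategy:
    type: Recreate       # old pod is stopped before the new one is created: short downtime, no overlap
```

The default is RollingUpdate instead, which does the surge/overlap behaviour described above.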

We have the default strategy, which is RollingUpdate

with both maxSurge and maxUnavailable also left at the default of 25%
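(With one replica and those defaults, maxSurge: 25% rounds up to 1 and maxUnavailable: 25% rounds down to 0, so a second pod is created and the two can overlap briefly during a rollout. A minimal sketch of keeping RollingUpdate but forbidding the overlap instead:)

```
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0          # never create an extra pod alongside the old one
      maxUnavailable: 1    # allow the single pod to go down first, i.e. accept downtime
```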