Troubleshooting CPU Utilization and Core Allocation on Ubuntu EC2 Instances

I’m trying to run multiple processes and take advantage of all 8 vCPUs of my EC2 instance. However, when I do this, htop shows the 8 processes running, but only 2 of the 8 cores are at 100%. I’m not sure what’s going on. How many physical cores do I have? I’m running an Ubuntu instance. Is there a command I can run to clarify things? I’m hitting a bottleneck in my data loading for PyTorch and basically want to make sure I’m using all the CPU resources I can.

How many processes are you running? Python is limited by the GIL, so unless you run 8 processes, you won’t have 8 cores busy. On most x86 EC2 instance types a vCPU is a hardware thread, with 2 vCPUs per physical core, but if you’re running Graviton, it’s 1:1. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/cpu-options-supported-instances-values.html
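To answer the "is there a command" part: `lscpu` prints threads per core and cores per socket, which tells you how vCPUs map to physical cores. From inside Python, something like the sketch below shows how many logical CPUs the OS sees and which of them the current process is actually allowed to run on. The function name `cpu_report` is just made up for illustration, and `os.sched_getaffinity` only exists on Linux, so there is a fallback for other platforms.

```python
import os


def cpu_report(pid=0):
    """Return the logical CPU count and the CPUs the given process may run on.

    pid=0 means "the calling process".
    """
    logical = os.cpu_count()
    # sched_getaffinity is Linux-only; elsewhere assume all CPUs are allowed.
    if hasattr(os, "sched_getaffinity"):
        allowed = sorted(os.sched_getaffinity(pid))
    else:
        allowed = list(range(logical or 1))
    return {"logical_cpus": logical, "allowed_cpus": allowed}


if __name__ == "__main__":
    report = cpu_report()
    print(f"logical CPUs visible to the OS: {report['logical_cpus']}")
    print(f"CPUs this process may run on:   {report['allowed_cpus']}")
```

If the second list is shorter than the first, the process has been restricted to a subset of the cores, and that would be one possible explanation for htop showing only a few cores busy. Running the same check inside a worker process is probably more telling than running it in the parent.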

This may be relevant for you: https://serverfault.com/questions/1121908/use-more-cpu-cores-using-code-in-pytorch-that-also-uses-gpu

So I ended up stopping and restarting the instance the next day, and I was able to use all 8 vCPUs according to htop. I read somewhere that the host could be shared with other tenants (CPU steal or something like that), so I’m assuming that’s what happened.
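For anyone who wants to check the CPU steal theory rather than guess: on Linux the aggregate `cpu` line in `/proc/stat` has a steal counter as its eighth value (htop and top show the same thing as `st`). A rough sketch, with the helper names `parse_cpu_line` and `steal_percent` invented here for illustration, that samples it twice and reports the share of time stolen in between:

```python
import os
import time


def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        if line.startswith("cpu "):
            return [int(x) for x in line.split()[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def steal_percent(before, after):
    """Percentage of CPU time stolen between two samples.

    Field order: user nice system idle iowait irq softirq steal ...
    Only the first 8 fields are summed, since guest time is already
    counted inside user time.
    """
    total = sum(after[:8]) - sum(before[:8])
    steal = after[7] - before[7]
    return 100.0 * steal / total if total > 0 else 0.0


if __name__ == "__main__" and os.path.exists("/proc/stat"):
    with open("/proc/stat") as f:
        first = parse_cpu_line(f.read())
    time.sleep(1)  # sampling interval; 1 second is an arbitrary choice
    with open("/proc/stat") as f:
        second = parse_cpu_line(f.read())
    print(f"steal over the last second: {steal_percent(first, second):.1f}%")
```

A value near zero while only 2 cores are busy would suggest steal is not the explanation.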

Back to only 2 vCPUs…

Okay, for the record, it’s not AWS related. It’s a PyTorch DataLoader issue:
https://github.com/pytorch/pytorch/issues/101850#issuecomment-1557249911