Haven’t finished debugging this unit yet. Thought you might like to see top info. These are physical CPUs BTW, not SMT.
top - 09:21:29 up 3 min, 2 users, load average: 0.22, 0.21, 0.09
Tasks: 219 total, 1 running, 218 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.7%us, 0.3%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 0.0%us, 13.2%sy, 0.0%ni, 86.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu8 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu9 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu10 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu11 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu12 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu13 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu14 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu15 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu16 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu17 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu18 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu19 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu20 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu21 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu22 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu23 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu24 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu25 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu26 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu27 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu28 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu29 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu30 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu31 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 529366036k total, 9816336k used, 519549700k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 70116k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2126 root 39 19 0 0 0 S 13 0.0 0:08.42 kipmi0
1882 root 20 0 15780 736 520 S 0 0.0 0:00.17 irqbalance
2388 root 20 0 79340 3760 2944 S 0 0.0 0:00.05 sshd
2467 root 20 0 19476 1520 1068 R 0 0.0 0:00.05 top
1 root 20 0 24008 2196 1340 S 0 0.0 0:07.03 init
2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0 0.0 0:00.05 ksoftirqd/0
4 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/0:0
5 root 20 0 0 0 0 S 0 0.0 0:00.14 kworker/u:0
ahhhh
It really has 1TB. It probably needs some boot options or some other bits to get it to see all the RAM.
Yeah, it’s fun. We have a 40-physical-core box with 0.25TB… Needs more memory, but hey, it’s on loan gratis.
48 cores with a mere 16 to 32 GB of RAM here (special-purpose HPC)…
Don’t you just love this message: “Sorry, terminal is not big enough”
@kirjoittaessani
Sadly, something like 1/2 the memory isn’t showing up. I’ll have to run into the lab today and test the RAM. I am guessing a mixture of bad DIMMs and memory cards. Ugh.
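As a quick sanity check on the "half the memory" estimate, the Mem total from the top output above can be converted to GiB. This is only a minimal sketch: it assumes top's `k` suffix means KiB and that "1TB" means 1024 GiB, and the helper name `kib_to_gib` is made up for illustration.

```python
KIB_PER_GIB = 1024 * 1024


def kib_to_gib(kib):
    """Convert a KiB count (as printed by top with a 'k' suffix) to GiB."""
    return kib / KIB_PER_GIB


reported_kib = 529366036  # "Mem: ... total" value from the top output
installed_gib = 1024      # assumption: "1TB" installed means 1024 GiB

reported_gib = kib_to_gib(reported_kib)
fraction_visible = reported_gib / installed_gib

print(f"{reported_gib:.2f} GiB visible, {fraction_visible:.1%} of installed")
```

Under those assumptions the visible total works out to roughly 505 GiB, a little under half of the installed memory, which matches the "1/2" guess.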
As I said in an earlier comment, in case you missed it, there are some pretty serious issues, in my opinion, with anyone reading /proc on kernels from 2.6.32 forward. I wrote it up here: http://collectl.sourceforge.net/SlowProc.html
If this includes your system perhaps you can try out my ‘strace -c’ test and confirm you’re seeing this issue too.
-mark
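For anyone who wants a rough first look before running the 'strace -c' test from the linked page, the cost of reading a /proc file can be timed directly. This is not Mark's test, just a sketch under assumptions: the function name `time_reads` is hypothetical, and `/proc/stat` and `/proc/meminfo` are picked here only as examples of files a monitoring tool would read.

```python
import os
import time


def time_reads(path, repeats=100):
    """Return the average wall-clock seconds needed to read `path` in full."""
    start = time.perf_counter()
    for _ in range(repeats):
        with open(path, "rb") as f:
            f.read()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    # Assumption: these two files are representative of what gets polled.
    for path in ("/proc/stat", "/proc/meminfo"):
        if os.path.exists(path):
            avg = time_reads(path)
            print(f"{path}: {avg * 1e6:.1f} us per read")
```

Comparing the per-read time on a many-core box against a small machine running the same kernel would give a hint as to whether the slowdown is present, though it says nothing about where the time goes.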
@Mark
Good catch there … I am wondering if this is what I’ve been running into with Collectl on our 2.6.32 kernels.
OK … this smells like a /proc – NUMA problem. The CPUs handling the /proc interface could be different, so it’s possible that reads are causing all sorts of joyous access issues.
@Mark 2.6.32 is fairly old now; does this still happen with current kernels?
re newer kernels – I believe it still is a problem. Nevertheless it would be good to test yourself if you have access to a many-core box.
joe – it would be very interesting to see if this is what you’re bumping into. Can you try some of the tests I outlined on that web page?
I too thought it was a NUMA issue, but I think it’s more an issue of handling all the locking on the different memory sections one needs to traverse with a lot of cores. While it turns out you can’t have a lot of cores without a lot of sockets and hence NUMA, it’s not really the NUMA code that is doing this. At least that’s my understanding.
-mark
@marc sadly not, our largest system is 32 cores (an SGI box with 1TB of RAM) and that’s in production, so can’t fiddle..
@marc – not sure you know, but the Red Hat bugzilla was locked down a few months ago to prevent access by non-subscribed people to bugs, apparently for “security reasons” (my understanding based on what RH told me happened to a bug of ours). So the BZ you link to from your collectl page is not viewable by anyone else I’m afraid.
Is there a discussion on LKML about this kernel regression?
Found something about this – a patch from October, with a discussion about why it’s a hard problem for them.
http://lkml.indiana.edu/hypermail/linux/kernel/1110.2/00529.html
…and after chasing the thread over different websites, this appears to be the most recent response (indicating the patch made a huge difference to performance and asking what was needed to get it merged):
https://lkml.org/lkml/2011/12/5/542
It’s not in the mainline as of now.