Renil Thomas
2011-05-18 14:32:01 UTC
Hello,
A Netra T2000 cluster running S10 + SC3.2U1 is encountering CPU spikes on the second node only, each lasting less than 2 seconds.
"mpstat" output shows that whenever the CPU peaks (i.e. sys is consuming most of the CPU), the "smtx" value is very high, while the other columns look normal. Below is an excerpt of the 3 samples with the highest sys time (lowest idle time):
SET minf mjf  xcal  intr ithr   csw icsw migr    smtx srw syscl usr sys wt idl sze
  0  402   0 56398  5810  845 18241  991 5438 2157529   9  5704   1  90  0   8  32
  0 1606  18 47535 11672 1725 26948 2769 9099 1525015   4 17115  13  72  0  16  32
  0 1755   8 25457  6653 1252 19434 1729 3956 1554547   4 15133   5  69  0  28  32
One general observation is that whenever sys is high, "smtx" is also high.
So it looks like during cpu peaks the SYS time is spent in "smtx" which is
In order to pin down the exact process involved in this high smtx, I can use this:
$ dtrace -n 'lockstat:::adaptive-spin, lockstat:::adaptive-block
{
@execs[execname,probename] = count();
}'
However, the problem occurs randomly, for instance two or three times in 45 minutes, and the high smtx is not sustained. Could you please recommend a DTrace script to catch the process on the fly whenever a high value is detected?
Thanking you in anticipation.
Regards,
Renil Thomas.