Discussion: dtrace - in time of need
tester
2010-05-20 11:46:06 UTC
Hello,

We had a situation on a T5140 running Solaris 10 where the load averages from prstat were 20-25, 30-40, 60-80. We ran a simple DTrace command to capture jstack() output via the pid provider, using the PID from prstat. That action pushed the 1-minute load average to 120. Not only did we get no output, the system became unresponsive to SSH sessions, and Ctrl-C was ignored for at least 30 seconds. Some other simple DTrace one-liners to aggregate syscalls per PID resulted in the same behaviour. Any DTrace action during that time pushed the load up significantly. I would like to believe this is a case of operator error rather than a DTrace issue. Please let us know if we missed something, or if there are tunables for such situations.

DTrace command used:
dtrace -n 'pid1234:::entry{@[jstack()] = count()}'
and some other simple DTrace one-liners to aggregate syscalls per PID.

Thanks
Chad Mynhier
2010-05-20 12:47:00 UTC
Post by tester
[snip]
We ran a simple DTrace command to capture jstack() output via the pid provider, using the PID from prstat. That action pushed the 1-minute load average to 120.
[snip]
The pid provider is fairly heavyweight. What's more, you've
instrumented every single function in that process (including all
libraries). This is going to have a significant performance impact on
the application, and probably on the server itself if the application
is the main thing running there.
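
If you really do need the pid provider, restricting the probe
description to the main executable's own functions (the a.out module)
enables far fewer probes. A sketch, reusing the made-up PID from above:

dtrace -n 'pid1234:a.out::entry { @[probefunc] = count(); }'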

A much better way to get what you want is the following:

profile-397
/ pid == 1234 /
{
@[jstack()] = count();
}

This will give you a feel for where the application is spending all of
its time, but it does it in a far more lightweight manner. Doing it
this way has a secondary benefit in that it can identify hot spots
inside functions. Your original script will only tell you how often
particular functions are entered and says nothing about how much time
is spent in those functions.
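
And if you do want to see how much time is spent inside one particular
function, a thread-local timestamp between entry and return works. A
sketch - the function name some_func is only a placeholder:

pid1234::some_func:entry
{
        self->ts = timestamp;
}

pid1234::some_func:return
/self->ts/
{
        /* nanoseconds spent in this call, as a power-of-two histogram */
        @["ns in some_func"] = quantize(timestamp - self->ts);
        self->ts = 0;
}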

Chad
tester
2010-05-20 15:10:42 UTC
Chad: Thanks for the information.
tester
2010-06-22 22:00:23 UTC
Hi,

Sorry to bring up an old thread. I need some general guidance on how to use DTrace under loads that are higher than normal. Again, this is on T2+ systems running with a 1-minute load average of 10-15. Running the script below shoots the load up into the 50-60 range and distorts any test.

profile-397
/pid == 18900/
{
        @[jstack(40, 1024)] = count();
}

tick-5s
{
        trunc(@, 10);
        printa(@);
        trunc(@);
}

The other symptom is that the system becomes unresponsive. Any thoughts?


Thanks
Mike Gerdts
2010-06-22 22:33:51 UTC
Post by tester
Sorry to bring up an old thread. I need some general guidance on how to use DTrace under loads that are higher than normal. Again, this is on T2+ systems running with a 1-minute load average of 10-15. Running the script below shoots the load up into the 50-60 range and distorts any test.
A T2+ implies a multi-processor system, with 8 cores x 8 threads per
socket. Let's assume you don't have a T5440. A load average of 15 on a
dual-processor T2+ system is less than 12% busy (15 / (2*8*8) = 11.7%).
Post by tester
[snip]
The other symptom is that the system becomes unresponsive. Any thoughts?
Is it unresponsive only when dtrace is running, or normally? With
recent releases of Solaris, I've found systems to be quite responsive
with a load average many times higher than the number of CPUs
(as seen by mpstat - 128 for the typical dual-processor T5140/T5240).
It seems highly unlikely that the problem is related to being short on
CPU (again, only at about 12% CPU utilization).

If it is unresponsive or sluggish before you start dtrace, I would
guess that one of the following is the case:

- The machine is short on RAM and is paging. Use vmstat to diagnose
(see the sketch after this list). Look at the "b" column (blocked on
I/O) and paging-related columns such as sr (scan rate). You would see
things as being extremely sluggish (e.g. when executing a command)
because the disk reads needed to load the commands and related
libraries are getting queued behind the I/O requests for paging.

- The network is having troubles. Look for a duplex mismatch or
non-zero values in:

kstat -p e1000g | nawk '$NF != 0 && $0 ~ /(err|drop|fail)/'

- There is some other I/O problem. Does iostat -En show hard errors
on any disk? Does "iostat -xzn 1" show wsvc_t + asvc_t over
20ms? How many I/Os are queued and active?
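
A minimal first pass over those checks might look like this (standard
Solaris invocations; the 5-second interval and 12-sample count are just
illustrative):

vmstat 5 12                        # watch "b" and "sr"
iostat -En | grep "Hard Errors"    # per-device error counters
iostat -xzn 5 12                   # wsvc_t/asvc_t and queue depths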


Your question is about performance - but you jumped to the conclusion
that dtrace would tell you the answer. It may, but there are likely
other tools that will be helpful with a lot less effort and less
system impact. perf-discuss may be a better list to ask for more help.
--
Mike Gerdts
http://mgerdts.blogspot.com/
tester
2010-06-22 23:22:53 UTC
Thanks Mike.
Post by Mike Gerdts
Is it unresponsive only when dtrace is running, or normally?
It becomes unresponsive after starting DTrace.
Post by Mike Gerdts
With recent releases of Solaris, I've found systems to be quite responsive with a load average many times higher than the number of CPUs (as seen by mpstat - 128 for the typical dual-processor T5140/T5240).
The system is a T5440 with 256 GB RAM.
Post by Mike Gerdts
It seems highly unlikely that the problem is related to being short on CPU (again, only at about 12% CPU utilization).
vmstat reports more than 95% CPU idle; core utilization is between 2-3%.
Post by Mike Gerdts
If it is unresponsive or sluggish before you start dtrace, I would guess that one of the following is the case:
No. It gets sluggish only after DTrace is started.
Post by Mike Gerdts
- The machine is short on RAM and is paging. Use vmstat to diagnose. [snip]
There's plenty of RAM, ~240 GB free.
Post by Mike Gerdts
- The network is having troubles. [snip]
- There is some other I/O problem. [snip]
Your question is about performance - but you jumped to the conclusion that dtrace would tell you the answer. [snip]
We were checking application performance when we engaged this script to find the hot spots; we had to Ctrl-C dtrace because of its behavior.

Even now, on an idle server (the same system), here is what I see, although it is not that unresponsive now (very little load to start with):

Before dtrace:
Total: 162 processes, 2058 lwps, load averages: 0.79, 2.87, 2.26
Total: 162 processes, 2058 lwps, load averages: 1.03, 2.89, 2.27
After dtrace:
Total: 161 processes, 2057 lwps, load averages: 20.61, 6.88, 3.61
Total: 161 processes, 2057 lwps, load averages: 38.40, 10.76, 4.93
Total: 161 processes, 2057 lwps, load averages: 35.38, 10.59, 4.91

This time I was able to get some output from the script; previously, under load, I never saw any script output at all. So you can imagine the state of the system when the initial load was 10-15.

Please let me know if you need more details.
Mike Gerdts
2010-06-23 04:00:35 UTC
Post by tester
Thanks Mike.
Post by Mike Gerdts
Is it unresponsive only when dtrace is running, or normally?
It becomes unresponsive after starting DTrace.
[snip]
Post by tester
This time I was able to get some output from the script; previously, under load, I never saw any script output at all. So you can imagine the state of the system when the initial load was 10-15.
Please let me know if you need more details.
Hmmm... it looks like you are just having trouble with the enabled
probes being too heavy. My approach would be to try to reduce the
amount of data that is being collected. You could try:

- Use a lower frequency in your profile provider. That is, switch
profile-397 to something like profile-113.
- Reduce the number of stack frames from 40 to 10.
- Try tracing only one thread at a time.

For the last suggestion, you can probably figure out which are the
interesting/busy threads with:

prstat -mLc -n 10 1 5

You should be able to observe one thread (tid - the part after the /
at the end of the line in prstat output) at a time with...

profile-113
/pid == 18900 && tid == 123/
{
        @[jstack(10, 1024)] = count();
}

tick-5s
{
        trunc(@, 10);
        printa(@);
        trunc(@);
}


Perhaps if you share the output of "prstat -mLc -n 10 1 5" when the
unexpected load exists, something will jump out. For example, if we
see a large amount of %sys, it may be worthwhile looking at the
syscall provider.
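
For example, something along these lines (same PID as before) would
show which syscalls dominate:

dtrace -n 'syscall:::entry /pid == 18900/ { @[probefunc] = count(); } tick-10s { trunc(@, 10); printa(@); trunc(@); }'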

You may also want to take a look at tools that are java-specific. I
haven't had a need to try these, but a quick search reveals
http://profiler.netbeans.org/ which looks interesting.

Also, I haven't had much of a chance to compare the performance of
jstack vs. ustack. It could be that you are seeing a performance hit
due to the JVM saying "256 CPUs? Ahh, you must want 256 garbage
collection threads!" Using ustack() instead of jstack() may be
lighter weight (I've had good success with ustack(5) at profile-997 on
a pretty busy T5240 debugging JNI calls). I suspect a similar
approach may point out heavy GC activity.
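
That is, the same profiling approach with ustack() swapped in for
jstack() - a sketch:

profile-997
/pid == 18900/
{
        @[ustack(5)] = count();
}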

http://www.c0t0d0s0.org/archives/6617-About-Java,-parallel-garbage-collection-and-processor-sets.html
--
Mike Gerdts
http://mgerdts.blogspot.com/
Phil Harman
2010-06-23 06:39:23 UTC
Check the archives for Jon Haslam's post on DTrace being sluggish. The original poster there also had a T5440. The reason he gave is that DTrace allocates 4MB of buffer space per CPU, and the same again if you are using aggregations.

In your case this means 2GB of allocations. Although you have loads of RAM, it still has to be managed. If your system has been running for a long time, there may not be many large pages left. In the worst case, 2GB is a huge number of 8K pages, and that's a lot of work!

So your run queues are growing because DTrace has given the system quite a lot of extra work to do. Jon suggests that you could tune the buffer size down, but you may begin to see drops as a result.
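
For example (the sizes here are purely illustrative - start small and watch dtrace's output for drop warnings):

dtrace -x bufsize=512k -x aggsize=512k -n 'profile-397 /pid == 18900/ { @[jstack(10, 1024)] = count(); }'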

Your simple profiling script should be fine. Obviously extreme profile frequencies will have a greater probe effect, but I've never considered even 997Hz that extreme (although you do have one of the slowest SPARC machines on the planet)!

I think your issue is more to do with initialising DTrace. It might have settled down if you had let it, but I don't blame you for pulling the plug. That isn't what DTrace promises on the tin. It's just that on sane hardware there will be less of an impact.

Phil

p.s. In cases like this, nothing beats a bit of good old vmstat, mpstat, iostat -xnz and (less old) prstat -Lm ... say 12 samples at a 5-second interval for each.
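
That is, something like:

vmstat 5 12
mpstat 5 12
iostat -xnz 5 12
prstat -Lm 5 12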

http://harmanholistix.com
Focusing on the big picture and the detail
Post by tester
[snip]
Mike Gerdts
2010-05-20 15:36:09 UTC
Post by tester
Hello,
We had a situation on a T5140 running Solaris 10 where the load averages from prstat were 20-25, 30-40, 60-80.
The T5140 presumably has two 8 core x 8 strand processors. On this
box, a load average of 128 indicates that there is work for all of the
CPU strands. That is, a T5140 with a load average of 128 has a
saturated CPU, just like a Netra T1 with one UltraSPARC IIi processor
has processor saturation when the load average is 1.

Put another way, a load average of 20 means the processor is only
15.6% utilized (20 / 128). Of course this is simplistic - you need to
look at other factors, particularly corestat (use Google).
--
Mike Gerdts
http://mgerdts.blogspot.com/
Bart Smaalders
2010-05-24 21:34:17 UTC
Post by Mike Gerdts
[snip]
Put another way, a load average of 20 means the processor is only 15.6% utilized (20 / 128). Of course this is simplistic - you need to look at other factors, particularly corestat (use Google).
Perhaps we should consider a new load average that is normalized to
the number of CPUs....

- Bart
--
Bart Smaalders Solaris Kernel Performance
bart.smaalders-QHcLZuEGTsvQT0dZR+***@public.gmane.org http://blogs.sun.com/barts
"You will contribute more with mercurial than with thunderbird."