Discussion:
Invalid address in call to ustack()?
Ryan Johnson
2010-07-10 07:52:02 UTC
Hi all,

I'm trying to use the following profile probe to grab stack traces at
profile-$FREQ /pid==$PID/ {
	@profile[ustack()] = count();
}
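dtrace: error on enabled probe ID 4 (ID 60477: profile:::profile-7777us): invalid address (0x100000) in action #2
dtrace: error on enabled probe ID 4 (ID 60477: profile:::profile-7777us): invalid address (0x100000) in action #2
dtrace: error on enabled probe ID 4 (ID 60477: profile:::profile-7777us): invalid address (0x100000) in action #2
dtrace: error on enabled probe ID 4 (ID 60477: profile:::profile-7777us): invalid address (0x17f3fa000) in action #2
dtrace: error on enabled probe ID 4 (ID 60477: profile:::profile-7777us): invalid address (0x100000) in action #2
dtrace: error on enabled probe ID 4 (ID 60477: profile:::profile-7777us): invalid address (0x0) in action #2
dtrace: error on enabled probe ID 4 (ID 60477: profile:::profile-7777us): invalid address (0x100000) in action #2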
There are enough of those appearing that I suspect they're increasing
dtrace's probe effect. I'm tempted to just throw a -q at it, but that
would only mask the symptoms. Is there something else I should do to
prevent the errors from occurring at all?

I'm on Solaris 10/sparcv9, and the app being traced is heavily threaded,
in case that makes any difference.

Thanks!
Ryan
Adam Leventhal
2010-07-10 20:13:30 UTC
Hey Ryan,
There's enough of those appearing that I suspect it's increasing dtrace's probe effect. I'm tempted to just throw a -q at it, but that would only mask the symptoms. Is there something else I should do to prevent the errors from occurring at all?
It's hard to say why you might be hitting that. You could investigate by doing something like this:

ERROR
{
	stop();
	printf("stopped %d due to an error\n", pid);
	exit(0);
}

Then you can use gcore <pid> to grab a core dump and prun <pid> to set the process running again. From that core we should be able to figure out why the stack backtrace failed.
I'm on Solaris 10/sparcv9, and the app being traced is heavily threaded, in case that makes any difference.
Threading should not be a contributing factor to this issue.

Adam
--
Adam Leventhal, Fishworks http://blogs.sun.com/ahl
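
For readers following along, the ERROR clause above can be combined with the original profile clause into a single script. What follows is a minimal sketch under stated assumptions, not the script either poster ran: it assumes the target is supplied with -p (so $target stands in for the $PID shell substitution), it picks an arbitrary 997 Hz sampling rate, and it prints the ERROR probe's arg4 (fault type) and arg5 (fault-specific value, which is the address for an invalid-address fault) together with the thread ID. Because stop() is a destructive action, the script enables the destructive option.

```dtrace
#!/usr/sbin/dtrace -s

/* stop() is a destructive action, so destructive mode must be enabled. */
#pragma D option destructive

/* Assumption: invoked as "script.d -p <pid>", so $target is the traced pid. */
profile-997
/pid == $target/
{
	@profile[ustack()] = count();
}

/* Fires whenever an action in any enabled probe faults. */
ERROR
{
	/* arg4: fault type; arg5: fault-specific value (the bad address here). */
	printf("pid %d tid %d: fault type %d, value 0x%x\n",
	    pid, tid, arg4, arg5);
	stop();		/* freeze the process so gcore and prun can be used */
	exit(0);
}
```

Once the script exits, the stopped process can be dumped with gcore <pid> and resumed with prun <pid>, as described in the reply above.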
Ryan Johnson
2010-07-12 13:26:46 UTC
Post by Adam Leventhal
There's enough of those appearing that I suspect it's increasing dtrace's probe effect. I'm tempted to just throw a -q at it, but that would only mask the symptoms. Is there something else I should do to prevent the errors from occurring at all?
ERROR
{
	stop();
	printf("stopped %d due to an error\n", pid);
	exit(0);
}
Then you can use gcore <pid> to grab a core dump and prun <pid> to set the process running again. From that core we should be able to figure out why the stack backtrace failed.
OK, I did that (took a looong time -- 5GB dump). I don't see anything
out of the ordinary. Also tried attaching directly with dbx and pstack
-- no errors. I added a printout of the offending tid in hopes of
narrowing things down, but still nothing jumps out -- random tid and
function every time.

Nearly always the offending address is 0x0, but I did (once) manage to
get 0x100000; the output of pstack never shows any sign of the bad
address, though. It always goes all the way down to _lwp_start.

Disassembling functions showed nothing useful either -- the errors seem
to come after any instruction (even those which do not reference memory,
like "rd %pc, %o7").

Is there something specific I should look for?

Thanks!
Ryan

Ryan Johnson
2010-07-12 12:48:35 UTC
*** Migrating this back to the list (curse the broken reply-to headers...)

Hi Jim,

Thanks for pointing that out -- I'd never noticed the section about
profile probe args on the wiki before. However, I've never seen anything
which suggests that [u|k]stack() depends on any of the arg* values.

Adding the arg1 predicate reduces the number of errors, but comparing
runs using /arg1/ vs /!arg1/ shows that most errors actually occur with
/arg1/, even though I'm hitting both cases with roughly equal frequency
(241k vs 216k samples over 10s).

Regards,
Ryan
Post by Jim
Try adding arg1 to the predicate:
/arg1 && pid == $PID/
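
The /arg1/ versus /!arg1/ comparison discussed in this message can be sketched as follows. For profile probes, arg0 is the kernel program counter and arg1 is the user-level program counter, and whichever does not apply at the time of the sample is 0. This is an illustrative reconstruction rather than the script Ryan ran: the 997 Hz rate, the use of $target in place of $PID, and the aggregation names are all assumptions.

```dtrace
#!/usr/sbin/dtrace -s

/* Sample landed in user-level code: arg1 holds the user PC. */
profile-997
/pid == $target && arg1/
{
	@samples["user"] = count();
	@stacks[ustack()] = count();
}

/* Sample landed in the kernel: arg1 is 0 and arg0 holds the kernel PC. */
profile-997
/pid == $target && !arg1/
{
	@samples["kernel"] = count();
}

/* Stop after 10 seconds, matching the run length mentioned above. */
tick-10s
{
	exit(0);
}
```

Run with -p <pid>, this prints the user and kernel sample counts alongside the aggregated user stacks when it exits, so the split between the two cases can be compared with the number of errors reported in each run.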