C<perl5db.pl> is the Perl debugger. It is loaded automatically by Perl when
you invoke a script with C<perl -d>. This documentation tries to outline the
structure and services provided by C<perl5db.pl>, and to describe how you
can use them.
=head1 GENERAL NOTES
The debugger can look pretty forbidding to many Perl programmers. There are
a number of reasons for this, many stemming from the debugger's history.
When the debugger was first written, Perl didn't have a lot of its nicer
features - no references, no lexical variables, no closures, no object-oriented
programming. So a lot of the things one would normally have done using such
features were done using global variables, globs and the C<local()> operator
in creative ways.
Some of these have survived into the current debugger; a few of the more
interesting and still-useful idioms are noted in this section, along with notes
on the comments themselves.
=head2 Why not use more lexicals?
Experienced Perl programmers will note that the debugger code tends to use
mostly package globals rather than lexically-scoped variables. This is done
to allow a significant amount of control of the debugger from outside the
debugger itself.
Unfortunately, though the variables are accessible, they're not well
documented, so it's generally been a decision that hasn't made a lot of
difference to most users. Where appropriate, comments have been added to
make variables more accessible and usable, with the understanding that these
I<are> debugger internals, and are therefore subject to change. Future
development should probably attempt to replace the globals with a well-defined
API, but for now, the variables are what we've got.
=head2 Automated variable stacking via C<local()>
As you may recall from reading C<perlfunc>, the C<local()> operator makes a
temporary copy of a variable in the current scope. When the scope ends, the
old copy is restored. This is often used in the debugger to handle the
automatic stacking of variables during recursive calls:
    sub foo {
        local $some_global = $some_global + 1;
        # Do some stuff, then ...
        return;
    }
What happens is that on entry to the subroutine, C<$some_global> is localized,
then altered. When the subroutine returns, Perl automatically undoes the
localization, restoring the previous value. Voila, automatic stack management.
The debugger uses this trick a I<lot>. Of particular note is C<DB::eval>,
which lets the debugger get control inside of C<eval>'ed code. The debugger
localizes a saved copy of C<$@> inside the subroutine, which allows it to
keep C<$@> safe until C<DB::eval> returns, at which point the previous
value of C<$@> is restored. This makes it simple (well, I<simpler>) to keep
track of C<$@> inside C<eval>s which C<eval> other C<eval>s.
In any case, watch for this pattern. It occurs fairly often.
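As a minimal sketch of the stacking idiom (the names C<$depth>, C<@seen>, and
C<descend> are hypothetical and do not appear in the debugger), each recursive
call below sees its own value, and the outer value comes back on return:

```perl
use strict;
use warnings;

# Hypothetical package globals standing in for debugger variables.
our $depth = 0;
our @seen;

sub descend {
    my ($remaining) = @_;

    # Stack a new value for this call; the old one returns on scope exit.
    local $depth = $depth + 1;
    push @seen, $depth;

    descend( $remaining - 1 ) if $remaining > 0;
    return;
}

descend(2);
```

After the outermost call returns, C<$depth> is back at its starting value
without any explicit bookkeeping.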
=head2 The C<^> trick
This is used to cleverly reverse the sense of a logical test depending on
the value of an auxiliary variable. For instance, the debugger's C<S>
(search for subroutines by pattern) allows you to negate the pattern
like this:
    # Find all non-'foo' subs:
    S !/foo/
Boolean algebra states that the truth table for XOR looks like this:
=over 4
=item * 0 ^ 0 = 0
(! not present and no match) --> false, don't print
=item * 0 ^ 1 = 1
(! not present and matches) --> true, print
=item * 1 ^ 0 = 1
(! present and no match) --> true, print
=item * 1 ^ 1 = 0
(! present and matches) --> false, don't print
=back
As you can see, the first pair applies when C<!> isn't supplied, and
the second pair applies when it is. The XOR simply allows us to
compact a more complicated if-then-elseif-else into a more elegant
(but perhaps overly clever) single test. After all, it needed this
explanation...
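The truth table above can be sketched as a single C<grep> test. This is an
illustration only: C<select_names> and the sample sub names are assumptions,
not the debugger's own code.

```perl
use strict;
use warnings;

# Hypothetical helper: keep names that match $pattern, or the ones that
# do not match when $negate is true.
sub select_names {
    my ( $negate, $pattern, @names ) = @_;

    # Both sides are reduced to a plain 0 or 1 so that ^ acts as a
    # one-bit XOR.
    return grep { ( $negate ? 1 : 0 ) ^ ( /$pattern/ ? 1 : 0 ) } @names;
}

my @subs    = qw(main::foo main::bar Other::foobar Other::baz);
my @with    = select_names( 0, qr/foo/, @subs );
my @without = select_names( 1, qr/foo/, @subs );
```

Here C<@with> holds the names containing C<foo> and C<@without> holds the
rest; the same test expression serves both cases.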
=head2 FLAGS, FLAGS, FLAGS
There is a certain C programming legacy in the debugger. Some variables,
such as C<$single>, C<$trace>, and C<$frame>, have I<magical> values composed
of 1, 2, 4, etc. (powers of 2) OR'ed together. This allows several pieces
of state to be stored independently in a single scalar.
A test like
    if ($scalar & 4) ...
is checking to see if the appropriate bit is on. Since each bit can be
"addressed" independently in this way, C<$scalar> is acting sort of like
an array of bits. Obviously, since the contents of C<$scalar> are just a
bit-pattern, we can save and restore it easily (it will just look like
a number).
The problem, of course, is that this tends to leave magic numbers scattered
all over your program whenever a bit is set, cleared, or checked. So why do
it?
=over 4
=item *
First, doing an arithmetical or bitwise operation on a scalar is
just about the fastest thing you can do in Perl: C<use constant> actually
creates a subroutine call, and array and hash lookups are much slower. Is
this over-optimization at the expense of readability? Possibly, but the
debugger accesses these variables a I<lot>. Any rewrite of the code will
probably have to benchmark alternate implementations and see which is the
best balance of readability and speed, and then document how it actually
works.
=item *
Second, it's very easy to serialize a scalar number. This is done in
the restart code; the debugger state variables are saved in C<%ENV> and then
restored when the debugger is restarted. Having them be just numbers makes
this trivial.
=item *
Third, some of these variables are being shared with the Perl core
smack in the middle of the interpreter's execution loop. It's much faster for
a C program (like the interpreter) to check a bit in a scalar than to access
several different variables (or a Perl array).
=back
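A small sketch of the flag idiom may help. The bit names here
(C<$STEP_INTO>, C<$STEP_OVER>, C<$SHOW_DEPTH>) are hypothetical labels chosen
for readability; the debugger itself uses the bare numbers.

```perl
use strict;
use warnings;

# Hypothetical names for the bits; assumed for illustration only.
my ( $STEP_INTO, $STEP_OVER, $SHOW_DEPTH ) = ( 1, 2, 4 );

my $state = 0;
$state |= $STEP_INTO;     # set one bit
$state |= $SHOW_DEPTH;    # set another, independently

my $depth_on = ( $state & $SHOW_DEPTH ) ? 1 : 0;    # test a bit

$state &= ~$STEP_INTO;    # clear a bit, leaving the others alone

# The whole state is a single number, so it survives a round trip
# through a string (such as an environment variable).
my $serialized = "$state";
my $restored   = $serialized + 0;
```

Setting, testing, and clearing each touch only the bit involved, and saving
the state is nothing more than copying a number.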
=head2 What are those C<XXX> comments for?
Any comment containing C<XXX> means that the comment is either somewhat
speculative - it's not exactly clear what a given variable or chunk of
code is doing, or that it is incomplete - the basics may be clear, but the
subtleties are not completely documented.
Send in a patch if you can clear up, fill out, or clarify an C<XXX>.
=head1 DATA STRUCTURES MAINTAINED BY CORE
There are a number of special data structures provided to the debugger by
the Perl interpreter.
The array C<@{$main::{'_<'.$filename}}> (aliased locally to C<@dbline> via glob
assignment) contains the text from C<$filename>, with each element
corresponding to a single line of C<$filename>.
The hash C<%{'_<'.$filename}> (aliased locally to C<%dbline> via glob
assignment) contains breakpoints and actions. The keys are line numbers;
you can set individual values, but not the whole hash. The Perl interpreter
uses this hash to determine where breakpoints have been set. Any true value is
considered to be a breakpoint; C<perl5db.pl> uses C<$break_condition\0$action>.
Values are magical in numeric context: 1 if the line is breakable, 0 if not.
The scalar C<${"_<$filename"}> simply contains the string C<_<$filename>.
This is also the case for evaluated strings that contain subroutines, or
which are currently being executed. The C<$filename> for C<eval>ed strings looks
like C<(eval 34)> or C<(re_eval 19)>.
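To illustrate the C<$break_condition\0$action> layout, here is a hedged sketch
that unpacks such values. C<%fake_dbline> is an ordinary hash standing in for
the magical C<%dbline>, and its contents are invented; the debugger's own
parsing may differ in detail.

```perl
use strict;
use warnings;

# Stand-in for %dbline; the real hash is magical and supplied by perl.
my %fake_dbline = (
    12 => join( "\0", '$x > 3', 'print "hit"' ),    # condition and action
    20 => join( "\0", '1',      '' ),               # breakpoint only
    31 => join( "\0", '',       'warn "here"' ),    # action only
);

my %parsed;
for my $line ( sort { $a <=> $b } keys %fake_dbline ) {

    # Split at the first NUL only; default missing halves to ''.
    my ( $condition, $action ) = split /\0/, $fake_dbline{$line}, 2;
    $parsed{$line} = {
        condition => $condition // '',
        action    => $action    // '',
    };
}
```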
=head1 DEBUGGER STARTUP
When C<perl5db.pl> starts, it reads an rcfile (C<perl5db.ini> for
non-interactive sessions, C<.perldb> for interactive ones) that can set a number
of options. In addition, this file may define a subroutine C<&afterinit>
that will be executed (in the debugger's context) after the debugger has
initialized itself.
Next, it checks the C<PERLDB_OPTS> environment variable and treats its
contents as the argument of a C<o> command in the debugger.
=head2 STARTUP-ONLY OPTIONS
The following options can only be specified at startup.
To set them in your rcfile, add a call to
C<&parse_options("optionName=new_value")>.
=over 4
=item * TTY
the TTY to use for debugging i/o.
=item * noTTY
if set, the debugger goes into C<NonStop> mode. On interrupt, if C<TTY> is not
set, it uses the value of C<noTTY> or F</tmp/perldbtty$$> to find the TTY using
C<Term::Rendezvous>. The current variant expects the name of the TTY to be in
this file.
=item * ReadLine
If false, a dummy ReadLine is used, so you can debug
ReadLine applications.
=item * NonStop
if true, no i/o is performed until interrupt.
=item * LineInfo
file or pipe to print line number info to. If it is a
pipe, a short "emacs like" message is used.
=item * RemotePort
host:port to connect to on remote host for remote debugging.
=back
=head3 SAMPLE RCFILE
    &parse_options("NonStop=1 LineInfo=db.out");
    sub afterinit { $trace = 1; }
The script will run without human intervention, putting trace
information into C<db.out>. (If you interrupt it, you had better
reset C<LineInfo> to something I<interactive>!)
=head1 INTERNALS DESCRIPTION
=head2 DEBUGGER INTERFACE VARIABLES
Perl supplies the values for C<%sub>. It effectively inserts
a C<&DB::DB();> in front of each place that can have a
breakpoint. At each subroutine call, it calls C<&DB::sub> with
C<$DB::sub> set to the called subroutine. It also inserts a C<BEGIN
{require 'perl5db.pl'}> before the first line.
After each C<require>d file is compiled, but before it is executed, a
call to C<&DB::postponed($main::{'_<'.$filename})> is done. C<$filename>
is the expanded name of the C<require>d file (as found via C<%INC>).
=head3 IMPORTANT INTERNAL VARIABLES
=head4 C<$CreateTTY>
Used to control when the debugger will attempt to acquire another TTY to be
used for input.
=over
=item * 1 - on C<fork()>
=item * 2 - debugger is started inside debugger
=item * 4 - on startup
=back
=head4 C<$doret>
The value -2 indicates that no return value should be printed.
Any positive value causes C<DB::sub> to print return values.
=head4 C<$evalarg>
The item to be eval'ed by C<DB::eval>. Used to prevent messing with the current
contents of C<@_> when C<DB::eval> is called.
=head4 C<$frame>
Determines what messages (if any) will get printed when a subroutine (or eval)
is entered or exited.
=over 4
=item * 0 - No enter/exit messages
=item * 1 - Print I<entering> messages on subroutine entry
=item * 2 - Adds exit messages on subroutine exit. If no other flag is on, acts like 1+2.
=item * 4 - Extended messages: C<< <in|out> I<context>=I<fully-qualified sub name> from I<file>:I<line> >>. If no other flag is on, acts like 1+4.
=item * 8 - Adds parameter information to messages, and overloaded stringify and tied FETCH is enabled on the printed arguments. Ignored if C<4> is not on.
=item * 16 - Adds C<I<context> return from I<subname>: I<value>> messages on subroutine/eval exit. Ignored if C<4> is not on.
=back
To get everything, use C<$frame=30> (or C<o f=30> as a debugger command).
The debugger internally juggles the value of C<$frame> during execution to
protect external modules that the debugger uses from getting traced.
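As a quick check of the arithmetic, the sketch below builds the "everything"
value from the bits listed above and decodes it again. The labels in
C<@frame_bits> are informal shorthand, not names used by the debugger.

```perl
use strict;
use warnings;

# Bit values from the list above; the labels are informal shorthand.
my @frame_bits = (
    [ 1,  'entry messages' ],
    [ 2,  'exit messages' ],
    [ 4,  'extended messages' ],
    [ 8,  'parameter information' ],
    [ 16, 'return values' ],
);

my $frame = 2 | 4 | 8 | 16;    # the "everything" setting

# Collect the label of every bit that is on in $frame.
my @enabled = map { $_->[1] } grep { $frame & $_->[0] } @frame_bits;
```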
=head4 C<$level>
Tracks current debugger nesting level. Used to figure out how many
C<E<lt>E<gt>> pairs to surround the line number with when the debugger
outputs a prompt. Also used to help determine if the program has finished
during command parsing.
=head4 C<$onetimeDump>
Controls what (if anything) C<DB::eval()> will print after evaluating an
expression.
=over 4
=item * C<undef> - don't print anything
=item * C<dump> - use C<dumpvar.pl> to display the value returned
=item * C<methods> - print the methods callable on the first item returned
=back
=head4 C<$onetimeDumpDepth>
Controls how far down C<dumpvar.pl> will go before printing C<...> while
dumping a structure. Numeric. If C<undef>, print all levels.
=head4 C<$signal>
Used to track whether or not an C<INT> signal has been detected. C<DB::DB()>,
which is called before every statement, checks this and puts the user into
command mode if it finds C<$signal> set to a true value.
=head4 C<$single>
Controls behavior during single-stepping. Stacked in C<@stack> on entry to
each subroutine; popped again at the end of each subroutine.
=over 4
=item * 0 - run continuously.
=item * 1 - single-step, go into subs. The C<s> command.
=item * 2 - single-step, don't go into subs. The C<n> command.
=item * 4 - print current sub depth (turned on to force this when C<too much
recursion> occurs).
=back
=head4 C<$trace>
Controls the output of trace information.
=over 4
=item * 1 - The C<t> command was entered to turn on tracing (every line executed is printed)
=item * 2 - watch expressions are active
=item * 4 - user defined a C<watchfunction()> in C<afterinit()>
=back
=head4 C<$slave_editor>
1 if C<LINEINFO> was directed to a pipe; 0 otherwise.
=head4 C<@cmdfhs>
Stack of filehandles that C<DB::readline()> will read commands from.
Manipulated by the debugger's C<source> command and C<DB::readline()> itself.
=head4 C<@dbline>
Local alias to the magical line array, C<@{$main::{'_<'.$filename}}> ,
supplied by the Perl interpreter to the debugger. Contains the source.
=head4 C<@old_watch>
Previous values of watch expressions. First set when the expression is
entered; reset whenever the watch expression changes.
=head4 C<@saved>
Saves important globals (C<$@>, C<$!>, C<$^E>, C<$,>, C<$/>, C<$\>, C<$^W>)
so that the debugger can substitute safe values while it's running, and
restore them when it returns control.
=head4 C<@stack>
Saves the current value of C<$single> on entry to a subroutine.
Manipulated by the C<c> command to turn off tracing in all subs above the
current one.
=head4 C<@to_watch>
The 'watch' expressions: to be evaluated before each line is executed.
=head4 C<@typeahead>
The typeahead buffer, used by C<DB::readline>.
=head4 C<%alias>
Command aliases. Stored as character strings to be substituted for a command
entered.
=head4 C<%break_on_load>
Keys are file names, values are 1 (break when this file is loaded) or undef
(don't break when it is loaded).
=head4 C<%dbline>
Keys are line numbers, values are C<condition\0action>. If used in numeric
context, values are 0 if not breakable, 1 if breakable, no matter what is
in the actual hash entry.
=head4 C<%had_breakpoints>
Keys are file names; values are bitfields:
=over 4
=item * 1 - file has a breakpoint in it.
=item * 2 - file has an action in it.
=back
A zero or undefined value means this file has neither.
=head4 C<%option>
Stores the debugger options. These are character string values.
=head4 C<%postponed>
Saves breakpoints for code that hasn't been compiled yet.
Keys are subroutine names, values are:
=over 4
=item * C<compile> - break when this sub is compiled
=item * C<< break +0 if <condition> >> - break (conditionally) at the start of this routine. The condition will be '1' if no condition was specified.
=back
=head4 C<%postponed_file>
This hash keeps track of breakpoints that need to be set for files that have
not yet been compiled. Keys are filenames; values are references to hashes.
Each of these hashes is keyed by line number, and its values are breakpoint
definitions (C<condition\0action>).
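The shape of this structure can be sketched as follows. The hash name
C<%postponed_file_sketch>, the file names, line numbers, and code fragments
are all invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical contents: file names, line numbers, and code are made up.
my %postponed_file_sketch = (
    'lib/Foo.pm' => {
        10 => join( "\0", '$count > 5', '' ),
        42 => join( "\0", '1',          'print "in Foo\n"' ),
    },
    'lib/Bar.pm' => {
        7 => join( "\0", '1', '' ),
    },
);

# Walk the structure: outer keys are files, inner keys are line numbers.
my @pending;
for my $file ( sort keys %postponed_file_sketch ) {
    my $lines = $postponed_file_sketch{$file};
    for my $line ( sort { $a <=> $b } keys %$lines ) {
        push @pending, "$file:$line";
    }
}
```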
=head1 DEBUGGER INITIALIZATION
The debugger's initialization actually jumps all over the place inside this
package. This is because there are several BEGIN blocks (which of course
execute immediately) spread through the code. Why is that?
The debugger needs to be able to change some things and set some things up
before the debugger code is compiled; most notably, the C<$deep> variable that
C<DB::sub> uses to tell when a program has recursed deeply. In addition, the
debugger has to turn off warnings while the debugger code is compiled, but then
restore them to their original setting before the program being debugged begins
executing.
The first C<BEGIN> block simply turns off warnings by saving the current
setting of C<$^W> and then setting it to zero. The second one initializes
the debugger variables that are needed before the debugger begins executing.
The third one puts C<$^W> back to its former value.
We'll detail the second C<BEGIN> block later; just remember that if you need
to initialize something before the debugger starts really executing, that's
where it has to go.
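The three-block arrangement can be sketched like this. It is a simplified
illustration, not the debugger's actual code; C<$saved_warn> and
C<$early_setting> are hypothetical names.

```perl
use strict;
use warnings;

our ( $saved_warn, $early_setting );

# First block: remember the warning flag, then silence warnings.
BEGIN { $saved_warn = $^W; $^W = 0; }

# Second block: set up whatever must exist before later code compiles.
BEGIN { $early_setting = 100; }

# Third block: put the warning flag back.
BEGIN { $^W = $saved_warn; }
```

Because each C<BEGIN> block runs as soon as it has been compiled, any code
placed between the first and third blocks is compiled with C<$^W> set to
zero.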
=cut
package DB;
use IO::Handle;
# Debugger for Perl 5.00x; perl5db.pl patch level:
$VERSION = 1.28;
$header = "perl5db.pl version $VERSION";
=head1 DEBUGGER ROUTINES
=head2 C<DB::eval()>
This function replaces straight C<eval()> inside the debugger; it simplifies
the process of evaluating code in the user's context.
The code to be evaluated is passed via the package global variable
C<$DB::evalarg>; this is done to avoid fiddling with the contents of C<@_>.
Before we do the C<eval()>, we preserve the current settings of C<$trace>,
C<$single>, C<$^D> and C<$usercontext>. The latter contains the
preserved values of C<$@>, C<$!>, C<$^E>, C<$,>, C<$/>, C<$\>, C<$^W> and the
user's current package, grabbed when C<DB::DB> got control. This causes the
proper context to be used when the eval is actually done. Afterward, we
restore C<$trace>, C<$single>, and C<$^D>.
Next we need to handle C<$@> without getting confused. We save C<$@> in a
local lexical, localize C<$saved[0]> (which is where C<save()> will put
C<$@>), and then call C<save()> to capture C<$@>, C<$!>, C<$^E>, C<$,>,
C<$/>, C<$\>, and C<$^W> and set C<$,>, C<$/>, C<$\>, and C<$^W> to values
considered sane by the debugger. If there was an C<eval()> error, we print
it on the debugger's output. If C<$onetimeDump> is defined, we call
C<dumpit> if it's set to 'dump', or C<methods> if it's set to
'methods'. Setting it to something else causes the debugger to do the eval
but not print the result - handy if you want to do something else with it
(the "watch expressions" code does this to get the value of the watch
expression but not show it unless it matters).
In any case, we then return the list of output from C<eval> to the caller,
and unwinding restores the former version of C<$@> in C<@saved> as well
(the localization of C<$saved[0]> goes away at the end of this scope).
=head3 Parameters and variables influencing execution of DB::eval()
C<DB::eval> isn't parameterized in the standard way; this is to keep the
debugger's calls to C<DB::eval()> from mucking with C<@_>, among other things.
The variables listed below influence C<DB::eval()>'s execution directly.
=over 4
=item C<$evalarg> - the thing to actually be eval'ed
=item C<$trace> - Current state of execution tracing
=item C<$single> - Current state of single-stepping
=item C<$onetimeDump> - what is to be displayed after the evaluation
=item C<$onetimeDumpDepth> - how deep C<dumpit()> should go when dumping results
=back
The following variables are altered by C<DB::eval()> during its execution. They
are "stacked" via C<local()>, enabling recursive calls to C<DB::eval()>.
=over 4
=item C<@res> - used to capture output from actual C<eval>.
=item C<$otrace> - saved value of C<$trace>.
=item C<$osingle> - saved value of C<$single>.
=item C<$od> - saved value of C<$^D>.
=item C<$saved[0]> - saved value of C<$@>.
=item C<$\> - for output of C<$@> if there is an evaluation error.
=back
=head3 The problem of lexicals
The context of C<DB::eval()> presents us with some problems. Obviously,
we want to be 'sandboxed' away from the debugger's internals when we do
the eval, but we need some way to control how punctuation variables and
debugger globals are used.
We can't use C<local>, because the code inside C<DB::eval> can see localized
variables; and we can't use C<my> either for the same reason. The code
in this routine compromises and uses C<my>.
After this routine is over, we don't have user code executing in the debugger's
context, so we can use C<my> freely.
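The general pattern (argument passed through a package global, C<$@> kept
safe for the caller) can be sketched as below. This is a simplified
illustration under assumed names: the C<Sketch> package, C<guarded_eval>, and
C<$last_error> are hypothetical, and the real C<DB::eval> does considerably
more.

```perl
use strict;
use warnings;

package Sketch;

our $evalarg;       # code to evaluate, set by the caller instead of @_
our $last_error;    # hypothetical: where this sketch reports failures

sub guarded_eval {

    # Keep the caller's $@ intact whatever the evaluated code does.
    local $@;

    # @result is not yet in scope while the string is being evaluated.
    my @result = eval "package main; $evalarg";
    $last_error = $@;
    return @result;
}

package main;

$@ = 'outer error';
$Sketch::evalarg = '1 + 2';
my @ok          = Sketch::guarded_eval();
my $outer_after = $@;

$Sketch::evalarg = 'die "boom\n"';
my @bad         = Sketch::guarded_eval();
my $inner_error = $Sketch::last_error;
```

The caller's C<$@> is unchanged after both calls, while the error from the
failing evaluation is still available through the separate global.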
=cut
############################################## Begin lexical danger zone
# 'my' variables used here could leak into (that is, be visible in)
# the context that the code being evaluated is executing in. This means that
# the code could modify the debugger's variables.
#
# Fiddling with the debugger's context could be Bad. We insulate things as
# much as we can.
sub eval {
# 'my' would make it visible from user code
# but so does local! --tchrist
# Remember: this localizes @DB::res, not @main::res.
local @res;
{
# Try to keep the user code from messing with us. Save these so that
# even if the eval'ed code changes them, we can put them back again.
# Needed because the user could refer directly to the debugger's
# package globals (and any 'my' variables in this containing scope)
# inside the eval(), and we want to try to stay safe.
=head3 Scalar, array, and hash completion: partially qualified package
Much like the above, except we have to do a little more cleanup:
=cut
if ( $text =~ /^[\$@%](.*)::(.*)/ ) { # symbols in a package
=pod
=over 4
=item *
Determine the package that the symbol is in. Put it in C<::> (effectively C<main::>) if no package is specified.
=cut
$pack = ( $1 eq 'main' ? '' : $1 ) . '::';
=pod
=item *
Figure out the prefix vs. what needs completing.
=cut
$prefix = ( substr $text, 0, 1 ) . $1 . '::';
$text = $2;
=pod
=item *
Look through all the symbols in the package. C<grep> out all the possible hashes/arrays/scalars, and then C<grep> the possible matches out of those. C<map> the prefix onto all the possibilities.
=cut
my @out = map "$prefix$_", grep /^\Q$text/, grep /^_?[a-zA-Z]/,
keys %$pack;
=pod
=item *
If there's only one hit, and it's a package qualifier, and it's not equal to the initial text, re-complete it using the symbol we actually found.
=cut
if ( @out == 1 and $out[0] =~ /::$/ and $out[0] ne $itext ) {
return db_complete( $out[0], $line, $start );
}
# Return the list of possibles.
return sort @out;
} ## end if ($text =~ /^[\$@%](.*)::(.*)/)
=pod
=back
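The steps above can be tried in isolation with the following sketch. The
C<Demo::Pkg> package, its variables, and C<complete_symbol> are hypothetical;
the sketch omits the re-completion step and the C<main> special case.

```perl
use strict;
use warnings;

# Hypothetical package with a few symbols to complete against.
package Demo::Pkg;
our $alpha    = 1;
our @alphabet = ( 'a' .. 'c' );
our %beta     = ( key => 'value' );
our $_private = 0;

package main;

sub complete_symbol {
    my ($typed) = @_;

    # Split "$Demo::Pkg::al" into sigil, package, and partial name.
    my ( $sigil, $pack, $partial ) = $typed =~ /^([\$\@%])(.*)::(.*)/
        or return;
    my $prefix = $sigil . $pack . '::';

    # The package's symbol table is reached through a symbolic reference.
    no strict 'refs';
    return sort map { $prefix . $_ }
        grep { /^\Q$partial/ }
        grep { /^_?[a-zA-Z]/ } keys %{ $pack . '::' };
}

my @hits = complete_symbol('$Demo::Pkg::al');
```

Note that the symbol table is keyed by name only, so the sigil typed by the
user is simply carried over to every candidate.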
=head3 Symbol completion: current package or package C<main>.
=cut
if ( $text =~ /^[\$@%]/ ) { # symbols (in $package + packages in main)
=pod
=over 4
=item *
If it's C<main>, delete main to just get C<::> leading.
We set the prefix to the item's sigil, and trim off the sigil to get the text to be completed.
=cut
$prefix = substr $text, 0, 1;
$text = substr $text, 1;
=pod
=item *
If the package is C<::> (C<main>), create an empty list; if it's something else, create a list of all the packages known. Append whichever list to a list of all the possible symbols in the current package. C<grep> out the matches to the text entered so far, then C<map> the prefix back onto the symbols.