home *** CD-ROM | disk | FTP | other *** search
- =head1 NAME
-
- perlsec - Perl security
-
- =head1 DESCRIPTION
-
- Perl is designed to make it easy to program securely even when running
- with extra privileges, like setuid or setgid programs. Unlike most
- command line shells, which are based on multiple substitution passes on
- each line of the script, Perl uses a more conventional evaluation scheme
- with fewer hidden snags. Additionally, because the language has more
- builtin functionality, it can rely less upon external (and possibly
- untrustworthy) programs to accomplish its purposes.
-
- Perl automatically enables a set of special security checks, called I<taint
- mode>, when it detects its program running with differing real and effective
- user or group IDs. The setuid bit in Unix permissions is mode 04000, the
- setgid bit mode 02000; either or both may be set. You can also enable taint
- mode explicitly by using the B<-T> command line flag. This flag is
- I<strongly> suggested for server programs and any program run on behalf of
- someone else, such as a CGI script. Once taint mode is on, it's on for
- the remainder of your script.
-
- While in this mode, Perl takes special precautions called I<taint
- checks> to prevent both obvious and subtle traps. Some of these checks
- are reasonably simple, such as verifying that path directories aren't
- writable by others; careful programmers have always used checks like
- these. Other checks, however, are best supported by the language itself,
- and it is these checks especially that contribute to making a set-id Perl
- program more secure than the corresponding C program.
-
- You may not use data derived from outside your program to affect
- something else outside your program--at least, not by accident. All
- command line arguments, environment variables, locale information (see
- L<perllocale>), results of certain system calls (readdir(),
- readlink(), the variable of shmread(), the messages returned by
- msgrcv(), the password, gcos and shell fields returned by the
- getpwxxx() calls), and all file input are marked as "tainted".
- Tainted data may not be used directly or indirectly in any command
- that invokes a sub-shell, nor in any command that modifies files,
- directories, or processes, B<with the following exceptions>:
-
- =over 4
-
- =item *
-
- Arguments to C<print> and C<syswrite> are B<not> checked for taintedness.
-
- =item *
-
- Symbolic methods
-
- $obj->$method(@args);
-
- and symbolic sub references
-
- &{$foo}(@args);
- $foo->(@args);
-
- are not checked for taintedness. This requires extra carefulness
- unless you want external data to affect your control flow. Unless
- you carefully limit what these symbolic values are, people are able
- to call functions B<outside> your Perl code, such as POSIX::system,
- in which case they are able to run arbitrary external code.
-
- =back
-
- The value of an expression containing tainted data will itself be
- tainted, even if it is logically impossible for the tainted data to
- affect the value.
-
- Because taintedness is associated with each scalar value, some
- elements of an array can be tainted and others not.
-
- For example:
-
- $arg = shift; # $arg is tainted
- $hid = $arg, 'bar'; # $hid is also tainted
- $line = <>; # Tainted
- $line = <STDIN>; # Also tainted
- open FOO, "/home/me/bar" or die $!;
- $line = <FOO>; # Still tainted
- $path = $ENV{'PATH'}; # Tainted, but see below
- $data = 'abc'; # Not tainted
-
- system "echo $arg"; # Insecure
- system "/bin/echo", $arg; # Considered insecure
- # (Perl doesn't know about /bin/echo)
- system "echo $hid"; # Insecure
- system "echo $data"; # Insecure until PATH set
-
- $path = $ENV{'PATH'}; # $path now tainted
-
- $ENV{'PATH'} = '/bin:/usr/bin';
- delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};
-
- $path = $ENV{'PATH'}; # $path now NOT tainted
- system "echo $data"; # Is secure now!
-
- open(FOO, "< $arg"); # OK - read-only file
- open(FOO, "> $arg"); # Not OK - trying to write
-
- open(FOO,"echo $arg|"); # Not OK
- open(FOO,"-|")
- or exec 'echo', $arg; # Also not OK
-
- $shout = `echo $arg`; # Insecure, $shout now tainted
-
- unlink $data, $arg; # Insecure
- umask $arg; # Insecure
-
- exec "echo $arg"; # Insecure
- exec "echo", $arg; # Insecure
- exec "sh", '-c', $arg; # Very insecure!
-
- @files = <*.c>; # insecure (uses readdir() or similar)
- @files = glob('*.c'); # insecure (uses readdir() or similar)
-
- # In Perl releases older than 5.6.0 the <*.c> and glob('*.c') would
- # have used an external program to do the filename expansion; but in
- # either case the result is tainted since the list of filenames comes
- # from outside of the program.
-
- $bad = ($arg, 23); # $bad will be tainted
- $arg, `true`; # Insecure (although it isn't really)
-
- If you try to do something insecure, you will get a fatal error saying
- something like "Insecure dependency" or "Insecure $ENV{PATH}".
-
- =head2 Laundering and Detecting Tainted Data
-
- To test whether a variable contains tainted data, and whose use would
- thus trigger an "Insecure dependency" message, you can use the
- tainted() function of the Scalar::Util module, available in your
- nearby CPAN mirror, and included in Perl starting from the release 5.8.0.
- Or you may be able to use the following I<is_tainted()> function.
-
- sub is_tainted {
- return ! eval { eval("#" . substr(join("", @_), 0, 0)); 1 };
- }
-
- This function makes use of the fact that the presence of tainted data
- anywhere within an expression renders the entire expression tainted. It
- would be inefficient for every operator to test every argument for
- taintedness. Instead, the slightly more efficient and conservative
- approach is used that if any tainted value has been accessed within the
- same expression, the whole expression is considered tainted.
-
- But testing for taintedness gets you only so far. Sometimes you have just
- to clear your data's taintedness. The only way to bypass the tainting
- mechanism is by referencing subpatterns from a regular expression match.
- Perl presumes that if you reference a substring using $1, $2, etc., that
- you knew what you were doing when you wrote the pattern. That means using
- a bit of thought--don't just blindly untaint anything, or you defeat the
- entire mechanism. It's better to verify that the variable has only good
- characters (for certain values of "good") rather than checking whether it
- has any bad characters. That's because it's far too easy to miss bad
- characters that you never thought of.
-
- Here's a test to make sure that the data contains nothing but "word"
- characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
- or a dot.
-
- if ($data =~ /^([-\@\w.]+)$/) {
- $data = $1; # $data now untainted
- } else {
- die "Bad data in '$data'"; # log this somewhere
- }
-
- This is fairly secure because C</\w+/> doesn't normally match shell
- metacharacters, nor are dot, dash, or at going to mean something special
- to the shell. Use of C</.+/> would have been insecure in theory because
- it lets everything through, but Perl doesn't check for that. The lesson
- is that when untainting, you must be exceedingly careful with your patterns.
- Laundering data using regular expression is the I<only> mechanism for
- untainting dirty data, unless you use the strategy detailed below to fork
- a child of lesser privilege.
-
- The example does not untaint $data if C<use locale> is in effect,
- because the characters matched by C<\w> are determined by the locale.
- Perl considers that locale definitions are untrustworthy because they
- contain data from outside the program. If you are writing a
- locale-aware program, and want to launder data with a regular expression
- containing C<\w>, put C<no locale> ahead of the expression in the same
- block. See L<perllocale/SECURITY> for further discussion and examples.
-
- =head2 Switches On the "#!" Line
-
- When you make a script executable, in order to make it usable as a
- command, the system will pass switches to perl from the script's #!
- line. Perl checks that any command line switches given to a setuid
- (or setgid) script actually match the ones set on the #! line. Some
- Unix and Unix-like environments impose a one-switch limit on the #!
- line, so you may need to use something like C<-wU> instead of C<-w -U>
- under such systems. (This issue should arise only in Unix or
- Unix-like environments that support #! and setuid or setgid scripts.)
-
- =head2 Cleaning Up Your Path
-
- For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to a
- known value, and each directory in the path must be non-writable by others
- than its owner and group. You may be surprised to get this message even
- if the pathname to your executable is fully qualified. This is I<not>
- generated because you didn't supply a full path to the program; instead,
- it's generated because you never set your PATH environment variable, or
- you didn't set it to something that was safe. Because Perl can't
- guarantee that the executable in question isn't itself going to turn
- around and execute some other program that is dependent on your PATH, it
- makes sure you set the PATH.
-
- The PATH isn't the only environment variable which can cause problems.
- Because some shells may use the variables IFS, CDPATH, ENV, and
- BASH_ENV, Perl checks that those are either empty or untainted when
- starting subprocesses. You may wish to add something like this to your
- setid and taint-checking scripts.
-
- delete @ENV{qw(IFS CDPATH ENV BASH_ENV)}; # Make %ENV safer
-
- It's also possible to get into trouble with other operations that don't
- care whether they use tainted values. Make judicious use of the file
- tests in dealing with any user-supplied filenames. When possible, do
- opens and such B<after> properly dropping any special user (or group!)
- privileges. Perl doesn't prevent you from opening tainted filenames for reading,
- so be careful what you print out. The tainting mechanism is intended to
- prevent stupid mistakes, not to remove the need for thought.
-
- Perl does not call the shell to expand wild cards when you pass B<system>
- and B<exec> explicit parameter lists instead of strings with possible shell
- wildcards in them. Unfortunately, the B<open>, B<glob>, and
- backtick functions provide no such alternate calling convention, so more
- subterfuge will be required.
-
- Perl provides a reasonably safe way to open a file or pipe from a setuid
- or setgid program: just create a child process with reduced privilege who
- does the dirty work for you. First, fork a child using the special
- B<open> syntax that connects the parent and child by a pipe. Now the
- child resets its ID set and any other per-process attributes, like
- environment variables, umasks, current working directories, back to the
- originals or known safe values. Then the child process, which no longer
- has any special permissions, does the B<open> or other system call.
- Finally, the child passes the data it managed to access back to the
- parent. Because the file or pipe was opened in the child while running
- under less privilege than the parent, it's not apt to be tricked into
- doing something it shouldn't.
-
- Here's a way to do backticks reasonably safely. Notice how the B<exec> is
- not called with a string that the shell could expand. This is by far the
- best way to call something that might be subjected to shell escapes: just
- never call the shell at all.
-
- use English '-no_match_vars';
- die "Can't fork: $!" unless defined($pid = open(KID, "-|"));
- if ($pid) { # parent
- while (<KID>) {
- # do something
- }
- close KID;
- } else {
- my @temp = ($EUID, $EGID);
- my $orig_uid = $UID;
- my $orig_gid = $GID;
- $EUID = $UID;
- $EGID = $GID;
- # Drop privileges
- $UID = $orig_uid;
- $GID = $orig_gid;
- # Make sure privs are really gone
- ($EUID, $EGID) = @temp;
- die "Can't drop privileges"
- unless $UID == $EUID && $GID eq $EGID;
- $ENV{PATH} = "/bin:/usr/bin"; # Minimal PATH.
- # Consider sanitizing the environment even more.
- exec 'myprog', 'arg1', 'arg2'
- or die "can't exec myprog: $!";
- }
-
- A similar strategy would work for wildcard expansion via C<glob>, although
- you can use C<readdir> instead.
-
- Taint checking is most useful when although you trust yourself not to have
- written a program to give away the farm, you don't necessarily trust those
- who end up using it not to try to trick it into doing something bad. This
- is the kind of security checking that's useful for set-id programs and
- programs launched on someone else's behalf, like CGI programs.
-
- This is quite different, however, from not even trusting the writer of the
- code not to try to do something evil. That's the kind of trust needed
- when someone hands you a program you've never seen before and says, "Here,
- run this." For that kind of safety, check out the Safe module,
- included standard in the Perl distribution. This module allows the
- programmer to set up special compartments in which all system operations
- are trapped and namespace access is carefully controlled.
-
- =head2 Security Bugs
-
- Beyond the obvious problems that stem from giving special privileges to
- systems as flexible as scripts, on many versions of Unix, set-id scripts
- are inherently insecure right from the start. The problem is a race
- condition in the kernel. Between the time the kernel opens the file to
- see which interpreter to run and when the (now-set-id) interpreter turns
- around and reopens the file to interpret it, the file in question may have
- changed, especially if you have symbolic links on your system.
-
- Fortunately, sometimes this kernel "feature" can be disabled.
- Unfortunately, there are two ways to disable it. The system can simply
- outlaw scripts with any set-id bit set, which doesn't help much.
- Alternately, it can simply ignore the set-id bits on scripts. If the
- latter is true, Perl can emulate the setuid and setgid mechanism when it
- notices the otherwise useless setuid/gid bits on Perl scripts. It does
- this via a special executable called B<suidperl> that is automatically
- invoked for you if it's needed.
-
- However, if the kernel set-id script feature isn't disabled, Perl will
- complain loudly that your set-id script is insecure. You'll need to
- either disable the kernel set-id script feature, or put a C wrapper around
- the script. A C wrapper is just a compiled program that does nothing
- except call your Perl program. Compiled programs are not subject to the
- kernel bug that plagues set-id scripts. Here's a simple wrapper, written
- in C:
-
- #define REAL_PATH "/path/to/script"
- main(ac, av)
- char **av;
- {
- execv(REAL_PATH, av);
- }
-
- Compile this wrapper into a binary executable and then make I<it> rather
- than your script setuid or setgid.
-
- In recent years, vendors have begun to supply systems free of this
- inherent security bug. On such systems, when the kernel passes the name
- of the set-id script to open to the interpreter, rather than using a
- pathname subject to meddling, it instead passes I</dev/fd/3>. This is a
- special file already opened on the script, so that there can be no race
- condition for evil scripts to exploit. On these systems, Perl should be
- compiled with C<-DSETUID_SCRIPTS_ARE_SECURE_NOW>. The B<Configure>
- program that builds Perl tries to figure this out for itself, so you
- should never have to specify this yourself. Most modern releases of
- SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition.
-
- Prior to release 5.6.1 of Perl, bugs in the code of B<suidperl> could
- introduce a security hole.
-
- =head2 Protecting Your Programs
-
- There are a number of ways to hide the source to your Perl programs,
- with varying levels of "security".
-
- First of all, however, you I<can't> take away read permission, because
- the source code has to be readable in order to be compiled and
- interpreted. (That doesn't mean that a CGI script's source is
- readable by people on the web, though.) So you have to leave the
- permissions at the socially friendly 0755 level. This lets
- people on your local system only see your source.
-
- Some people mistakenly regard this as a security problem. If your program does
- insecure things, and relies on people not knowing how to exploit those
- insecurities, it is not secure. It is often possible for someone to
- determine the insecure things and exploit them without viewing the
- source. Security through obscurity, the name for hiding your bugs
- instead of fixing them, is little security indeed.
-
- You can try using encryption via source filters (Filter::* from CPAN,
- or Filter::Util::Call and Filter::Simple since Perl 5.8).
- But crackers might be able to decrypt it. You can try using the byte
- code compiler and interpreter described below, but crackers might be
- able to de-compile it. You can try using the native-code compiler
- described below, but crackers might be able to disassemble it. These
- pose varying degrees of difficulty to people wanting to get at your
- code, but none can definitively conceal it (this is true of every
- language, not just Perl).
-
- If you're concerned about people profiting from your code, then the
- bottom line is that nothing but a restrictive licence will give you
- legal security. License your software and pepper it with threatening
- statements like "This is unpublished proprietary software of XYZ Corp.
- Your access to it does not give you permission to use it blah blah
- blah." You should see a lawyer to be sure your licence's wording will
- stand up in court.
-
- =head2 Unicode
-
- Unicode is a new and complex technology and one may easily overlook
- certain security pitfalls. See L<perluniintro> for an overview and
- L<perlunicode> for details, and L<perlunicode/"Security Implications
- of Unicode"> for security implications in particular.
-
- =head2 Algorithmic Complexity Attacks
-
- Certain internal algorithms used in the implementation of Perl can
- be attacked by choosing the input carefully to consume large amounts
- of either time or space or both. This can lead into the so-called
- I<Denial of Service> (DoS) attacks.
-
- =over 4
-
- =item *
-
- Hash Function - the algorithm used to "order" hash elements has been
- changed several times during the development of Perl, mainly to be
- reasonably fast. In Perl 5.8.1 also the security aspect was taken
- into account.
-
- In Perls before 5.8.1 one could rather easily generate data that as
- hash keys would cause Perl to consume large amounts of time because
- internal structure of hashes would badly degenerate. In Perl 5.8.1
- the hash function is randomly perturbed by a pseudorandom seed which
- makes generating such naughty hash keys harder.
- See L<perlrun/PERL_HASH_SEED> for more information.
-
- The random perturbation is done by default but if one wants for some
- reason emulate the old behaviour one can set the environment variable
- PERL_HASH_SEED to zero (or any other integer). One possible reason
- for wanting to emulate the old behaviour is that in the new behaviour
- consecutive runs of Perl will order hash keys differently, which may
- confuse some applications (like Data::Dumper: the outputs of two
- different runs are no more identical).
-
- B<Perl has never guaranteed any ordering of the hash keys>, and the
- ordering has already changed several times during the lifetime of
- Perl 5. Also, the ordering of hash keys has always been, and
- continues to be, affected by the insertion order.
-
- Also note that while the order of the hash elements might be
- randomised, this "pseudoordering" should B<not> be used for
- applications like shuffling a list randomly (use List::Util::shuffle()
- for that, see L<List::Util>, a standard core module since Perl 5.8.0;
- or the CPAN module Algorithm::Numerical::Shuffle), or for generating
- permutations (use e.g. the CPAN modules Algorithm::Permute or
- Algorithm::FastPermute), or for any cryptographic applications.
-
- =item *
-
- Regular expressions - Perl's regular expression engine is so called
- NFA (Non-Finite Automaton), which among other things means that it can
- rather easily consume large amounts of both time and space if the
- regular expression may match in several ways. Careful crafting of the
- regular expressions can help but quite often there really isn't much
- one can do (the book "Mastering Regular Expressions" is required
- reading, see L<perlfaq2>). Running out of space manifests itself by
- Perl running out of memory.
-
- =item *
-
- Sorting - the quicksort algorithm used in Perls before 5.8.0 to
- implement the sort() function is very easy to trick into misbehaving
- so that it consumes a lot of time. Nothing more is required than
- resorting a list already sorted. Starting from Perl 5.8.0 a different
- sorting algorithm, mergesort, is used. Mergesort is insensitive to
- its input data, so it cannot be similarly fooled.
-
- =back
-
- See L<http://www.cs.rice.edu/~scrosby/hash/> for more information,
- and any computer science text book on the algorithmic complexity.
-
- =head1 SEE ALSO
-
- L<perlrun> for its description of cleaning up environment variables.
-