By default, environment variables are inherited from a process' parent.
However, when a program executes another program, the calling program
can set the environment variables to arbitrary values.
This is dangerous to setuid/setgid programs, because their invoker can
completely control the environment variables they're given.
Since they are usually inherited, this also applies transitively; a
secure program might call some other program and, without special measures,
would pass potentially dangerous environment variables values on to the
program it calls.
The following subsections discuss environment variables and what to
do with them.
Some environment variables are dangerous because
many libraries and programs are controlled by environment
variables in ways that are obscure, subtle, or undocumented.
For example, the IFS variable is used by the sh and bash
shell to determine which characters separate command line arguments.
Since the shell is invoked by several low-level calls
(like system(3) and popen(3) in C, or the back-tick operator in Perl),
setting IFS to unusual values can subvert apparently-safe calls.
This behavior is documented in bash and sh, but it's obscure;
many long-time users only know about IFS because of its use in breaking
security, not because it's actually used very often for its intended purpose.
What is worse is that not all environment variables are documented, and
even if they are, those other programs may change and add dangerous
environment variables.
Thus, the only real solution (described below) is to select the ones you
need and throw away the rest.
Normally, programs should use the standard access routines to access
environment variables.
For example, in C, you should get values
using getenv(3), set them using the
POSIX standard routine putenv(3) or the BSD extension setenv(3)
and eliminate environment variables using unsetenv(3).
I should note here that setenv(3) is implemented in Linux, too.
However, crackers need not be so nice; crackers can directly control the
environment variable data area passed to a program using execve(2).
This permits some nasty attacks, which can only be understood by
understanding how environment variables really work.
In Linux, you can see environ(5) for a summary how about environment variables
really work.
In short, environment variables are internally stored as a pointer to
an array of pointers to characters; this array is stored in order and
terminated by a NULL pointer (so you'll know when the array ends).
The pointers to characters, in turn, each
point to a NIL-terminated string value of the form ``NAME=value''.
This has several implications, for example, environment variable names
can't include the equal sign, and neither the name nor value can have
embedded NIL characters.
However, a more dangerous implication of this format is that it allows
multiple entries with the same variable name, but with different values
(e.g., more than one value for SHELL).
While typical command shells prohibit doing this,
a locally-executing cracker can create such a situation using execve(2).
The problem with this storage format (and the way it's set)
is that a program might check one of these values
(to see if it's valid) but actually use a different one.
In Linux,
the GNU glibc libraries try to shield programs from this;
glibc 2.1's implementation of getenv will always get the first matching
entry, setenv and putenv will always set the first matching entry, and
unsetenv will actually unset all of the matching entries
(congratulations to the GNU glibc implementers for implementing
unsetenv this way!).
However, some programs go directly to the environ variable and iterate
across all environment variables; in this case,
they might use the last matching entry instead of the first one.
As a result, if checks were made against the first matching entry instead,
but the actual value used is the last matching entry,
a cracker can use this fact to circumvent the protection routines.
For secure setuid/setgid programs, the short list of environment variables
needed as input (if any) should be carefully extracted.
Then the entire environment should be erased,
followed by resetting a small set of necessary environment
variables to safe values.
There really isn't a better way if you make any calls to subordinate
programs; there's no practical
method of listing ``all the dangerous values''.
Even if you reviewed the source code of every program you call
directly or indirectly,
someone may add new undocumented environment variables after you
write your code, and one of them may be exploitable.
The simple way to erase the environment in C/C++
is by setting the global variable
environ
to NULL.
The global variable environ is defined in <unistd.h>; C/C++ users will
want to #include this header file.
You will need to manipulate this value before spawning threads, but that's
rarely a problem, since you want to do these manipulations very early in
the program's execution (usually before threads are spawned).
The global variable environ's definition is defined in various standards; it's
not clear that the official standards condone directly changing its value,
but I'm unaware of any Unix-like system that has trouble
with doing this.
I normally just modify the ``environ'' directly;
manipulating such low-level components is possibly non-portable, but
it assures you that you get a clean (and safe) environment.
In the rare case where you need later access to the entire set of
variables, you could save the ``environ'' variable's value somewhere,
but this is rarely necessary; nearly all programs need only a few values,
and the rest can be dropped.
Another way to clear the environment
is to use the undocumented clearenv() function.
The function
clearenv() has an odd history; it was supposed to be defined in POSIX.1, but
somehow never made it into that standard.
However, clearenv() is defined in POSIX.9
(the Fortran 77 bindings to POSIX), so there is a quasi-official status for it.
In Linux,
clearenv() is defined in <stdlib.h>, but before using #include
to include it you must make sure that __USE_MISC is #defined.
A somewhat more ``official'' approach is to cause __USE_MISC to be defined
is to first #define either _SVID_SOURCE or _BSD_SOURCE, and then
#include <features.h> -
these are the official feature test macros.
One environment value you'll almost certainly re-add is PATH,
the list of directories to search for programs; PATH should
not include the current directory and usually be something simple like
``/bin:/usr/bin''.
Typically you'll also set
IFS (to its default of `` \t\n'', where space is the first character)
and TZ (timezone).
Linux won't die if you don't supply either IFS or TZ,
but some System V based systems have problems if you don't supply a TZ value,
and it's rumored that some shells need the IFS value set.
In Linux, see environ(5) for a list of common environment variables that you
might want to set.
If you really need user-supplied values, check the values first
(to ensure that the values match a pattern for legal values and that they
are within some reasonable maximum length).
Ideally there would be some standard trusted file in /etc with the
information for ``standard safe environment variable values'',
but at this time there's no standard file defined for this purpose.
For something similar, you might want to examine the PAM module pam_env
on those systems which have that module.
If you're using a shell as your programming language,
you can use the ``/usr/bin/env'' program with the ``-'' option
(which erases all environment variables of the program being run).
Basically, you call /usr/bin/env, give it the ``-'' option,
follow that with the set of variables and their values you wish to set
(as name=value),
and then follow that with the name of the program to run and its arguments.
You usually want to call the program using the full pathname
(/usr/bin/env) and not just as ``env'', in case a user has created
a dangerous PATH value.
Note that GNU's env also accepts the options
"-i" and "--ignore-environment" as synonyms (they also erase the
environment of the program being started), but these aren't portable to
other versions of env.
If you're programming a setuid/setgid program in a language
that doesn't allow you to reset the environment directly,
one approach is to create a ``wrapper'' program.
The wrapper sets the environment program to safe values, and then
calls the other program.
Beware: make sure the wrapper will actually invoke the intended program;
if it's an interpreted program, make sure there's no race condition possible
that would allow the interpreter to load a different program than the one
that was granted the special setuid/setgid privileges.