home *** CD-ROM | disk | FTP | other *** search
- <?xml version='1.0' encoding='UTF-8' ?>
- <!DOCTYPE manualpage SYSTEM "../style/manualpage.dtd">
- <?xml-stylesheet type="text/xsl" href="../style/manual.en.xsl"?>
- <manualpage metafile="cgi.xml.meta">
- <parentdocument href="./">How-To / Tutorials</parentdocument>
-
- <title>Apache Tutorial: Dynamic Content with CGI</title>
-
- <section id="intro">
- <title>Introduction</title>
-
- <related>
- <modulelist>
- <module>mod_alias</module>
- <module>mod_cgi</module>
- </modulelist>
-
- <directivelist>
- <directive module="mod_mime">AddHandler</directive>
- <directive module="core">Options</directive>
- <directive module="mod_alias">ScriptAlias</directive>
- </directivelist>
- </related>
-
- <p>The CGI (Common Gateway Interface) defines a way for a web
- server to interact with external content-generating programs,
- which are often referred to as CGI programs or CGI scripts. It
- is the simplest, and most common, way to put dynamic content on
- your web site. This document will be an introduction to setting
- up CGI on your Apache web server, and getting started writing
- CGI programs.</p>
- </section>
-
- <section id="configuring">
- <title>Configuring Apache to permit CGI</title>
-
- <p>In order to get your CGI programs to work properly, you'll
- need to have Apache configured to permit CGI execution. There
- are several ways to do this.</p>
-
- <section id="scriptalias">
- <title>ScriptAlias</title>
-
- <p>The
- <directive module="mod_alias">ScriptAlias</directive>
-
- directive tells Apache that a particular directory is set
- aside for CGI programs. Apache will assume that every file in
- this directory is a CGI program, and will attempt to execute
- it, when that particular resource is requested by a
- client.</p>
-
- <p>The <directive module="mod_alias">ScriptAlias</directive>
- directive looks like:</p>
-
- <example>
- ScriptAlias /cgi-bin/ /usr/local/apache/cgi-bin/
- </example>
-
- <p>The example shown is from your default <code>httpd.conf</code>
- configuration file, if you installed Apache in the default
- location. The <directive module="mod_alias">ScriptAlias</directive>
- directive is much like the <directive module="mod_alias"
- >Alias</directive> directive, which defines a URL prefix that
- is to mapped to a particular directory. <directive>Alias</directive>
- and <directive>ScriptAlias</directive> are usually used for
- directories that are outside of the <directive module="core"
- >DocumentRoot</directive> directory. The difference between
- <directive>Alias</directive> and <directive>ScriptAlias</directive>
- is that <directive>ScriptAlias</directive> has the added meaning
- that everything under that URL prefix will be considered a CGI
- program. So, the example above tells Apache that any request for a
- resource beginning with <code>/cgi-bin/</code> should be served from
- the directory <code>/usr/local/apache/cgi-bin/</code>, and should be
- treated as a CGI program.</p>
-
- <p>For example, if the URL
- <code>http://www.example.com/cgi-bin/test.pl</code>
- is requested, Apache will attempt to execute the file
- <code>/usr/local/apache/cgi-bin/test.pl</code>
- and return the output. Of course, the file will have to
- exist, and be executable, and return output in a particular
- way, or Apache will return an error message.</p>
- </section>
-
- <section id="nonscriptalias">
- <title>CGI outside of ScriptAlias directories</title>
-
- <p>CGI programs are often restricted to <directive module="mod_alias"
- >ScriptAlias</directive>'ed directories for security reasons.
- In this way, administrators can tightly control who is allowed to
- use CGI programs. However, if the proper security precautions are
- taken, there is no reason why CGI programs cannot be run from
- arbitrary directories. For example, you may wish to let users
- have web content in their home directories with the
- <directive module="mod_userdir">UserDir</directive> directive.
- If they want to have their own CGI programs, but don't have access to
- the main <code>cgi-bin</code> directory, they will need to be able to
- run CGI programs elsewhere.</p>
- </section>
-
- <section id="options">
- <title>Explicitly using Options to permit CGI execution</title>
-
- <p>You could explicitly use the <directive module="core"
- >Options</directive> directive, inside your main server configuration
- file, to specify that CGI execution was permitted in a particular
- directory:</p>
-
- <example>
- <Directory /usr/local/apache/htdocs/somedir><br />
- <indent>
- Options +ExecCGI<br />
- </indent>
- </Directory>
- </example>
-
- <p>The above directive tells Apache to permit the execution
- of CGI files. You will also need to tell the server what
- files are CGI files. The following <directive module="mod_mime"
- >AddHandler</directive> directive tells the server to treat all
- files with the <code>cgi</code> or <code>pl</code> extension as CGI
- programs:</p>
-
- <example>
- AddHandler cgi-script cgi pl
- </example>
- </section>
-
- <section id="htaccess">
- <title>.htaccess files</title>
-
- <p>A <a href="htaccess.html"><code>.htaccess</code> file</a> is a way
- to set configuration directives on a per-directory basis. When Apache
- serves a resource, it looks in the directory from which it is serving
- a file for a file called <code>.htaccess</code>, and, if it
- finds it, it will apply directives found therein.
-
- <code>.htaccess</code> files can be permitted with the
- <directive module="core">AllowOverride</directive> directive,
- which specifies what types of directives can
- appear in these files, or if they are not allowed at all. To
- permit the directive we will need for this purpose, the
- following configuration will be needed in your main server
- configuration:</p>
-
- <example>
- AllowOverride Options
- </example>
-
- <p>In the <code>.htaccess</code> file, you'll need the
- following directive:</p>
-
- <example>
- Options +ExecCGI
- </example>
-
- <p>which tells Apache that execution of CGI programs is
- permitted in this directory.</p>
- </section>
- </section>
-
- <section id="writing">
- <title>Writing a CGI program</title>
-
- <p>There are two main differences between ``regular''
- programming, and CGI programming.</p>
-
- <p>First, all output from your CGI program must be preceded by
- a MIME-type header. This is HTTP header that tells the client
- what sort of content it is receiving. Most of the time, this
- will look like:</p>
-
- <example>
- Content-type: text/html
- </example>
-
- <p>Secondly, your output needs to be in HTML, or some other
- format that a browser will be able to display. Most of the
- time, this will be HTML, but occasionally you might write a CGI
- program that outputs a gif image, or other non-HTML
- content.</p>
-
- <p>Apart from those two things, writing a CGI program will look
- a lot like any other program that you might write.</p>
-
- <section id="firstcgi">
- <title>Your first CGI program</title>
-
- <p>The following is an example CGI program that prints one
- line to your browser. Type in the following, save it to a
- file called <code>first.pl</code>, and put it in your
- <code>cgi-bin</code> directory.</p>
-
- <example>
- #!/usr/bin/perl<br />
- print "Content-type: text/html\n\n";<br />
- print "Hello, World.";
- </example>
-
- <p>Even if you are not familiar with Perl, you should be able
- to see what is happening here. The first line tells Apache
- (or whatever shell you happen to be running under) that this
- program can be executed by feeding the file to the
- interpreter found at the location <code>/usr/bin/perl</code>.
- The second line prints the content-type declaration we
- talked about, followed by two carriage-return newline pairs.
- This puts a blank line after the header, to indicate the end
- of the HTTP headers, and the beginning of the body. The third
- line prints the string "Hello, World.". And that's the end
- of it.</p>
-
- <p>If you open your favorite browser and tell it to get the
- address</p>
-
- <example>
- http://www.example.com/cgi-bin/first.pl
- </example>
-
- <p>or wherever you put your file, you will see the one line
- <code>Hello, World.</code> appear in your browser window.
- It's not very exciting, but once you get that working, you'll
- have a good chance of getting just about anything working.</p>
- </section>
- </section>
-
- <section id="troubleshoot">
- <title>But it's still not working!</title>
-
- <p>There are four basic things that you may see in your browser
- when you try to access your CGI program from the web:</p>
-
- <dl>
- <dt>The output of your CGI program</dt>
- <dd>Great! That means everything worked fine.</dd>
-
- <dt>The source code of your CGI program or a "POST Method Not
- Allowed" message</dt>
- <dd>That means that you have not properly configured Apache
- to process your CGI program. Reread the section on
- <a href="#configuringapachetopermitcgi">configuring
- Apache</a> and try to find what you missed.</dd>
-
- <dt>A message starting with "Forbidden"</dt>
- <dd>That means that there is a permissions problem. Check the
- <a href="#errorlogs">Apache error log</a> and the section below on
- <a href="#permissions">file permissions</a>.</dd>
-
- <dt>A message saying "Internal Server Error"</dt>
- <dd>If you check the
- <a href="#errorlogs">Apache error log</a>, you will probably
- find that it says "Premature end of
- script headers", possibly along with an error message
- generated by your CGI program. In this case, you will want to
- check each of the below sections to see what might be
- preventing your CGI program from emitting the proper HTTP
- headers.</dd>
- </dl>
-
- <section id="permissions">
- <title>File permissions</title>
-
- <p>Remember that the server does not run as you. That is,
- when the server starts up, it is running with the permissions
- of an unprivileged user - usually <code>nobody</code>, or
- <code>www</code> - and so it will need extra permissions to
- execute files that are owned by you. Usually, the way to give
- a file sufficient permissions to be executed by <code>nobody</code>
- is to give everyone execute permission on the file:</p>
-
- <example>
- chmod a+x first.pl
- </example>
-
- <p>Also, if your program reads from, or writes to, any other
- files, those files will need to have the correct permissions
- to permit this.</p>
-
- <p>The exception to this is when the server is configured to
- use <a href="../suexec.html">suexec</a>. This program allows
- CGI programs to be run under different
- user permissions, depending on which virtual host or user
- home directory they are located in. Suexec has very strict
- permission checking, and any failure in that checking will
- result in your CGI programs failing with an "Internal Server
- Error". In this case, you will need to check the suexec log
- file to see what specific security check is failing.</p>
- </section>
-
- <section id="pathinformation">
- <title>Path information</title>
-
- <p>When you run a program from your command line, you have
- certain information that is passed to the shell without you
- thinking about it. For example, you have a path, which tells
- the shell where it can look for files that you reference.</p>
-
- <p>When a program runs through the web server as a CGI
- program, it does not have that path. Any programs that you
- invoke in your CGI program (like 'sendmail', for example)
- will need to be specified by a full path, so that the shell
- can find them when it attempts to execute your CGI
- program.</p>
-
- <p>A common manifestation of this is the path to the script
- interpreter (often <code>perl</code>) indicated in the first
- line of your CGI program, which will look something like:</p>
-
- <example>
- #!/usr/bin/perl
- </example>
-
- <p>Make sure that this is in fact the path to the
- interpreter.</p>
- </section>
-
- <section id="syntaxerrors">
- <title>Syntax errors</title>
-
- <p>Most of the time when a CGI program fails, it's because of
- a problem with the program itself. This is particularly true
- once you get the hang of this CGI stuff, and no longer make
- the above two mistakes. Always attempt to run your program
- from the command line before you test if via a browser. This
- will eliminate most of your problems.</p>
- </section>
-
- <section id="errorlogs">
- <title>Error logs</title>
-
- <p>The error logs are your friend. Anything that goes wrong
- generates message in the error log. You should always look
- there first. If the place where you are hosting your web site
- does not permit you access to the error log, you should
- probably host your site somewhere else. Learn to read the
- error logs, and you'll find that almost all of your problems
- are quickly identified, and quickly solved.</p>
- </section>
- </section>
-
- <section id="behindscenes">
- <title>What's going on behind the scenes?</title>
-
- <p>As you become more advanced in CGI programming, it will
- become useful to understand more about what's happening behind
- the scenes. Specifically, how the browser and server
- communicate with one another. Because although it's all very
- well to write a program that prints "Hello, World.", it's not
- particularly useful.</p>
-
- <section id="env">
- <title>Environment variables</title>
-
- <p>Environment variables are values that float around you as
- you use your computer. They are useful things like your path
- (where the computer searches for a the actual file
- implementing a command when you type it), your username, your
- terminal type, and so on. For a full list of your normal,
- every day environment variables, type
- <code>env</code> at a command prompt.</p>
-
- <p>During the CGI transaction, the server and the browser
- also set environment variables, so that they can communicate
- with one another. These are things like the browser type
- (Netscape, IE, Lynx), the server type (Apache, IIS, WebSite),
- the name of the CGI program that is being run, and so on.</p>
-
- <p>These variables are available to the CGI programmer, and
- are half of the story of the client-server communication. The
- complete list of required variables is at
- <a href="http://hoohoo.ncsa.uiuc.edu/cgi/env.html"
- >http://hoohoo.ncsa.uiuc.edu/cgi/env.html</a>.</p>
-
- <p>This simple Perl CGI program will display all of the
- environment variables that are being passed around. Two
- similar programs are included in the
- <code>cgi-bin</code>
-
- directory of the Apache distribution. Note that some
- variables are required, while others are optional, so you may
- see some variables listed that were not in the official list.
- In addition, Apache provides many different ways for you to
- <a href="../env.html">add your own environment variables</a>
- to the basic ones provided by default.</p>
-
- <example>
- #!/usr/bin/perl<br />
- print "Content-type: text/html\n\n";<br />
- foreach $key (keys %ENV) {<br />
- <indent>
- print "$key --> $ENV{$key}<br>";<br />
- </indent>
- }
- </example>
- </section>
-
- <section id="stdin">
- <title>STDIN and STDOUT</title>
-
- <p>Other communication between the server and the client
- happens over standard input (<code>STDIN</code>) and standard
- output (<code>STDOUT</code>). In normal everyday context,
- <code>STDIN</code> means the keyboard, or a file that a
- program is given to act on, and <code>STDOUT</code>
- usually means the console or screen.</p>
-
- <p>When you <code>POST</code> a web form to a CGI program,
- the data in that form is bundled up into a special format
- and gets delivered to your CGI program over <code>STDIN</code>.
- The program then can process that data as though it was
- coming in from the keyboard, or from a file</p>
-
- <p>The "special format" is very simple. A field name and
- its value are joined together with an equals (=) sign, and
- pairs of values are joined together with an ampersand
- (&). Inconvenient characters like spaces, ampersands, and
- equals signs, are converted into their hex equivalent so that
- they don't gum up the works. The whole data string might look
- something like:</p>
-
- <example>
- name=Rich%20Bowen&city=Lexington&state=KY&sidekick=Squirrel%20Monkey
- </example>
-
- <p>You'll sometimes also see this type of string appended to
- the a URL. When that is done, the server puts that string
- into the environment variable called
- <code>QUERY_STRING</code>. That's called a <code>GET</code>
- request. Your HTML form specifies whether a <code>GET</code>
- or a <code>POST</code> is used to deliver the data, by setting the
- <code>METHOD</code> attribute in the <code>FORM</code> tag.</p>
-
- <p>Your program is then responsible for splitting that string
- up into useful information. Fortunately, there are libraries
- and modules available to help you process this data, as well
- as handle other of the aspects of your CGI program.</p>
- </section>
- </section>
-
- <section id="libraries">
- <title>CGI modules/libraries</title>
-
- <p>When you write CGI programs, you should consider using a
- code library, or module, to do most of the grunt work for you.
- This leads to fewer errors, and faster development.</p>
-
- <p>If you're writing CGI programs in Perl, modules are
- available on <a href="http://www.cpan.org/">CPAN</a>. The most
- popular module for this purpose is <code>CGI.pm</code>. You might
- also consider <code>CGI::Lite</code>, which implements a minimal
- set of functionality, which is all you need in most programs.</p>
-
- <p>If you're writing CGI programs in C, there are a variety of
- options. One of these is the <code>CGIC</code> library, from
- <a href="http://www.boutell.com/cgic/"
- >http://www.boutell.com/cgic/</a>.</p>
- </section>
-
- <section id="moreinfo">
- <title>For more information</title>
-
- <p>There are a large number of CGI resources on the web. You
- can discuss CGI problems with other users on the Usenet group
- <a href="news:comp.infosystems.www.authoring.cgi"
- >comp.infosystems.www.authoring.cgi</a>. And the -servers mailing
- list from the HTML Writers Guild is a great source of answers
- to your questions. You can find out more at
- <a href="http://www.hwg.org/lists/hwg-servers/"
- >http://www.hwg.org/lists/hwg-servers/</a>.</p>
-
- <p>And, of course, you should probably read the CGI
- specification, which has all the details on the operation of
- CGI programs. You can find the original version at the
- <a href="http://hoohoo.ncsa.uiuc.edu/cgi/interface.html"
- >NCSA</a> and there is an updated draft at the
- <a href="http://web.golux.com/coar/cgi/">Common Gateway
- Interface RFC project</a>.</p>
-
- <p>When you post a question about a CGI problem that you're
- having, whether to a mailing list, or to a newsgroup, make sure
- you provide enough information about what happened, what you
- expected to happen, and how what actually happened was
- different, what server you're running, what language your CGI
- program was in, and, if possible, the offending code. This will
- make finding your problem much simpler.</p>
-
- <p>Note that questions about CGI problems should <strong>never</strong>
- be posted to the Apache bug database unless you are sure you
- have found a problem in the Apache source code.</p>
- </section>
- </manualpage>
-
-