Text File | 1992-07-06 | 79.8 KB | 2,041 lines |
- ages complain of mismatched
- psect attributes. You can ignore them. @xref{VMS Install}.
-
- @cindex Alliant
- @item
- On the Alliant, the system's own convention for returning structures
- and unions is unusual, and is not compatible with GNU CC no matter
- what options are used.
-
- @cindex RT PC
- @cindex IBM RT PC
- @item
- On the IBM RT PC, the MetaWare HighC compiler (hc) uses yet another
- convention for structure and union returning. Use
- @samp{-mhc-struct-return} to tell GNU CC to use a convention compatible
- with it.
-
- @cindex Vax calling convention
- @cindex Ultrix calling convention
- @item
- On Ultrix, the Fortran compiler expects registers 2 through 5 to be saved
- by function calls. However, the C compiler uses conventions compatible
- with BSD Unix: registers 2 through 5 may be clobbered by function calls.
-
- GNU CC uses the same convention as the Ultrix C compiler. You can use
- these options to produce code compatible with the Fortran compiler:
-
- @smallexample
- -fcall-saved-r2 -fcall-saved-r3 -fcall-saved-r4 -fcall-saved-r5
- @end smallexample
- @end itemize
-
- @node Incompatibilities
- @section Incompatibilities of GNU CC
- @cindex incompatibilities of GNU CC
-
- There are several noteworthy incompatibilities between GNU C and most
- existing (non-ANSI) versions of C. The @samp{-traditional} option
- eliminates many of these incompatibilities, @emph{but not all}, by
- telling GNU C to behave like the other C compilers.
-
- @itemize @bullet
- @cindex string constants
- @cindex read-only strings
- @cindex shared strings
- @item
- GNU CC normally makes string constants read-only. If several
- identical-looking string constants are used, GNU CC stores only one
- copy of the string.
-
- @cindex @code{mktemp}, and constant strings
- One consequence is that you cannot call @code{mktemp} with a string
- constant argument. The function @code{mktemp} always alters the
- string its argument points to.
-
- @cindex @code{sscanf}, and constant strings
- @cindex @code{fscanf}, and constant strings
- @cindex @code{scanf}, and constant strings
- Another consequence is that @code{sscanf} does not work on some systems
- when passed a string constant as its format control string or input.
- This is because @code{sscanf} incorrectly tries to write into the string
- constant. Likewise @code{fscanf} and @code{scanf}.
-
- The best solution to these problems is to change the program to use
- @code{char}-array variables with initialization strings for these
- purposes instead of string constants. But if this is not possible,
- you can use the @samp{-fwritable-strings} flag, which directs GNU CC
- to handle string constants the same way most C compilers do.
- @samp{-traditional} also has this effect, among others.
-
- @item
- @code{-2147483648} is positive.
-
- This is because 2147483648 cannot fit in the type @code{int}, so
- (following the ANSI C rules) its data type is @code{unsigned long int}.
- Negating this value yields 2147483648 again.
-
- @item
- GNU CC does not substitute macro arguments when they appear inside of
- string constants. For example, the following macro in GNU CC
-
- @example
- #define foo(a) "a"
- @end example
-
- @noindent
- will produce output @code{"a"} regardless of what the argument @var{a} is.
-
- The @samp{-traditional} option directs GNU CC to handle such cases
- (among others) in the old-fashioned (non-ANSI) fashion.
-
- @cindex @code{setjmp} incompatibilities
- @cindex @code{longjmp} incompatibilities
- @item
- When you use @code{setjmp} and @code{longjmp}, the only automatic
- variables guaranteed to remain valid are those declared
- @code{volatile}. This is a consequence of automatic register
- allocation. Consider this function:
-
- @example
- jmp_buf j;
-
- foo ()
- @{
- int a, b;
-
- a = fun1 ();
- if (setjmp (j))
- return a;
-
- a = fun2 ();
- /* @r{@code{longjmp (j)} may occur in @code{fun3}.} */
- return a + fun3 ();
- @}
- @end example
-
- Here @code{a} may or may not be restored to its first value when the
- @code{longjmp} occurs. If @code{a} is allocated in a register, then
- its first value is restored; otherwise, it keeps the last value stored
- in it.
-
- If you use the @samp{-W} option with the @samp{-O} option, you will
- get a warning when GNU CC thinks such a problem might be possible.
-
- The @samp{-traditional} option directs GNU C to put variables in
- the stack by default, rather than in registers, in functions that
- call @code{setjmp}. This results in the behavior found in
- traditional C compilers.
-
- @item
- Programs that use preprocessor directives in the middle of macro
- arguments do not work with GNU CC. For example, a program like this
- will not work:
-
- @example
- foobar (
- #define luser
- hack)
- @end example
-
- ANSI C does not permit such a construct. It would make sense to support
- it when @samp{-traditional} is used, but it is too much work to
- implement.
-
- @cindex external declaration scope
- @cindex scope of external declarations
- @cindex declaration scope
- @item
- Declarations of external variables and functions within a block apply
- only to the block containing the declaration. In other words, they
- have the same scope as any other declaration in the same place.
-
- In some other C compilers, an @code{extern} declaration affects all the
- rest of the file even if it happens within a block.
-
- The @samp{-traditional} option directs GNU C to treat all @code{extern}
- declarations as global, like traditional compilers.
-
- @item
- In traditional C, you can combine @code{long}, etc., with a typedef name,
- as shown here:
-
- @example
- typedef int foo;
- typedef long foo bar;
- @end example
-
- In ANSI C, this is not allowed: @code{long} and other type modifiers
- require an explicit @code{int}. Because this criterion is expressed
- by Bison grammar rules rather than C code, the @samp{-traditional}
- flag cannot alter it.
-
- @cindex typedef names as function parameters
- @item
- PCC allows typedef names to be used as function parameters. The
- difficulty described immediately above applies here too.
-
- @cindex whitespace
- @item
- PCC allows whitespace in the middle of compound assignment operators
- such as @samp{+=}. GNU CC, following the ANSI standard, does not
- allow this. The difficulty described immediately above applies here
- too.
-
- @cindex apostrophes
- @cindex '
- @item
- GNU CC will flag unterminated character constants inside of preprocessor
- conditionals that fail. Some programs have English comments enclosed in
- conditionals that are guaranteed to fail; if these comments contain
- apostrophes, GNU CC will probably report an error. For example,
- this code would produce an error:
-
- @example
- #if 0
- You can't expect this to work.
- #endif
- @end example
-
- The best solution to such a problem is to put the text into an actual
- C comment delimited by @samp{/*@dots{}*/}. However,
- @samp{-traditional} suppresses these error messages.
-
- @cindex @code{float} as function value type
- @item
- When compiling functions that return @code{float}, PCC converts the
- return value to @code{double}. GNU CC actually returns a @code{float}.
- If you are concerned with PCC compatibility, you should declare your
- functions to return
- @code{double}; you might as well say what you mean.
-
- @cindex structures
- @cindex unions
- @item
- When compiling functions that return structures or unions, the code
- that GNU CC outputs normally uses a method different from that used on most
- versions of Unix. As a result, code compiled with GNU CC cannot call
- a structure-returning function compiled with PCC, and vice versa.
-
- The method used by GNU CC is as follows: a structure or union which is
- 1, 2, 4 or 8 bytes long is returned like a scalar. A structure or union
- with any other size is stored into an address supplied by the caller
- (usually in a special, fixed register, but on some machines it is passed
- on the stack). The machine-description macros @code{STRUCT_VALUE} and
- @code{STRUCT_INCOMING_VALUE} tell GNU CC where to pass this address.
-
- By contrast, PCC on most target machines returns structures and unions
- of any size by copying the data into an area of static storage, and then
- returning the address of that storage as if it were a pointer value.
- The caller must copy the data from that memory area to the place where
- the value is wanted. GNU CC does not use this method because it is
- slower and nonreentrant.
-
- On some newer machines, PCC uses a reentrant convention for all
- structure and union returning. GNU CC on most of these machines uses a
- compatible convention when returning structures and unions in memory,
- but still returns small structures and unions in registers.
-
- You can tell GNU CC to use a compatible convention for all structure and
- union returning with the option @samp{-fpcc-struct-return}.
- @end itemize
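-
- The @code{char}-array workaround recommended above for string constants
- might look like the following minimal sketch. The function
- @code{fill_template} is a hypothetical stand-in for @code{mktemp},
- assumed here only because it alters its argument in place in the same
- way; @code{make_name} and the file name are likewise made up for the
- illustration.
-
```c
#include <string.h>

/* Hypothetical stand-in for mktemp: like mktemp, it overwrites the
   trailing 'X' characters of its argument in place. */
static char *fill_template(char *template_name, unsigned id)
{
    size_t i = strlen(template_name);

    while (i > 0 && template_name[i - 1] == 'X') {
        template_name[--i] = (char)('0' + id % 10);
        id /= 10;
    }
    return template_name;
}

char *make_name(void)
{
    /* An array initialized from a string, not a pointer to a string
       constant: the storage is writable and private to this function. */
    static char name[] = "/tmp/fooXXXXXX";

    return fill_template(name, 1992);
}
```
-
- Because @code{name} is an array initialized from the string rather than
- a pointer to a string constant, the function writes into the program's
- own writable copy, so the code should behave the same with or without
- @samp{-fwritable-strings}.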
-
- @node Disappointments
- @section Disappointments and Misunderstandings
-
- These problems are perhaps regrettable, but we don't know any practical
- way around them.
-
- @itemize @bullet
- @item
- Certain local variables aren't recognized by debuggers when you compile
- with optimization.
-
- This occurs because sometimes GNU CC optimizes the variable out of
- existence. There is no way to tell the debugger how to compute the
- value such a variable ``would have had'', and it is not clear that would
- be desirable anyway. So GNU CC simply does not mention the eliminated
- variable when it writes debugging information.
-
- You have to expect a certain amount of disagreement between the
- executable and your source code, when you use optimization.
-
- @cindex conflicting types
- @cindex scope of declaration
- @item
- Users often think it is a bug when GNU CC reports an error for code
- like this:
-
- @example
- int foo (struct mumble *);
-
- struct mumble @{ @dots{} @};
-
- int foo (struct mumble *x)
- @{ @dots{} @}
- @end example
-
- This code really is erroneous, because the scope of @code{struct
- mumble} in the prototype is limited to the argument list containing it.
- It does not refer to the @code{struct mumble} defined with file scope
- immediately below---they are two unrelated types with similar names in
- different scopes.
-
- But in the definition of @code{foo}, the file-scope type is used
- because that is available to be inherited. Thus, the definition and
- the prototype do not match, and you get an error.
-
- This behavior may seem silly, but it's what the ANSI standard specifies.
- It is easy enough for you to make your code work by moving the
- definition of @code{struct mumble} above the prototype. It's not worth
- being incompatible with ANSI C just to avoid an error for the example
- shown above.
- @end itemize
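-
- The reordering suggested above for @code{struct mumble} can be sketched
- as follows. This is only an illustration: the member @code{value} and
- the body of @code{foo} are assumptions, since the original example
- elides both.
-
```c
/* Define the structure at file scope first, so that the prototype and
   the definition below both refer to this one type. */
struct mumble {
    int value;              /* hypothetical member, for illustration only */
};

int foo(struct mumble *);   /* the prototype now names the file-scope type */

int foo(struct mumble *x)
{
    return x->value;        /* hypothetical body */
}
```
-
- With the definition moved above the prototype, both declarations of
- @code{foo} mention the same type, so they match.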
-
- @node Non-bugs,, Disappointments, Trouble
- @section Certain Changes We Don't Want to Make
-
- This section lists changes that people frequently request, but which
- we do not make because we think GNU CC is better without them.
-
- @itemize @bullet
- @item
- Checking the number and type of arguments to a function which has an
- old-fashioned definition and no prototype.
-
- Such a feature would work only occasionally---only for calls that appear
- in the same file as the called function, following the definition. The
- only way to check all calls reliably is to add a prototype for the
- function. But adding a prototype eliminates the motivation for this
- feature. So the feature is not worthwhile.
-
- @item
- Warning about using an expression whose type is signed as a shift count.
-
- Shift count operands are probably signed more often than unsigned.
- Warning about this would cause far more annoyance than good.
-
- @item
- Warning about assigning a signed value to an unsigned variable.
-
- Such assignments must be very common; warning about them would cause
- more annoyance than good.
-
- @item
- Warning when a non-void function value is ignored.
-
- Coming as I do from a Lisp background, I balk at the idea that there is
- something dangerous about discarding a value. There are functions that
- return values which some callers may find useful; it makes no sense to
- clutter the program with a cast to @code{void} whenever the value isn't
- useful.
-
- @item
- Assuming (for optimization) that the address of an external symbol is
- never zero.
-
- This assumption is false on certain systems when @samp{#pragma weak} is
- used.
-
- @item
- Making @samp{-fshort-enums} the default.
-
- This would cause storage layout to be incompatible with most other C
- compilers. And it doesn't seem very important, given that you can get
- the same result in other ways. The case where it matters most is when
- the enumeration-valued object is inside a structure, and in that case
- you can specify a field width explicitly.
-
- @item
- Making bitfields unsigned by default on particular machines where ``the
- ABI standard'' says to do so.
-
- The ANSI C standard leaves it up to the implementation whether a bitfield
- declared plain @code{int} is signed or not. This in effect creates two
- alternative dialects of C.
-
- The GNU C compiler supports both dialects; you can specify the dialect
- you want with the option @samp{-fsigned-bitfields} or
- @samp{-funsigned-bitfields}. However, this leaves open the question
- of which dialect to use by default.
-
- Currently, the preferred dialect makes plain bitfields signed, because
- this is simplest. Since @code{int} is the same as @code{signed int} in
- every other context, it is cleanest for them to be the same in bitfields
- as well.
-
- Some computer manufacturers have published Application Binary Interface
- standards which specify that plain bitfields should be unsigned. It is
- a mistake, however, to say anything about this issue in an ABI. This is
- because the handling of plain bitfields distinguishes two dialects of C.
- Both dialects are meaningful on every type of machine. Whether a
- particular object file was compiled using signed bitfields or unsigned
- is of no concern to other object files, even if they access the same
- bitfields in the same data structures.
-
- A given program is written in one or the other of these two dialects.
- The program stands a chance to work on most any machine if it is
- compiled with the proper dialect. It is unlikely to work at all if
- compiled with the wrong dialect.
-
- Many users appreciate the GNU C compiler because it provides an
- environment that is uniform across machines. These users would be
- inconvenienced if the compiler treated plain bitfields differently on
- certain machines.
-
- Occasionally users write programs intended only for a particular machine
- type. On these occasions, the users would benefit if the GNU C compiler
- were to support by default the same dialect as the other compilers on
- that machine. But such applications are rare. And users writing a
- program to run on more than one type of machine cannot possibly benefit
- from this kind of compatibility.
-
- This is why GNU CC does and will treat plain bitfields in the same
- fashion on all types of machines (by default).
-
- There are some arguments for making bitfields unsigned by default on all
- machines. If, for example, this becomes a universal de facto standard,
- it would make sense for GNU CC to go along with it. This is something
- to be considered in the future.
-
- (Of course, users strongly concerned about portability should indicate
- explicitly in each bitfield whether it is signed or not. In this way,
- they write programs which have the same meaning in both C dialects.)
-
- @item
- Undefining @code{__STDC__} when @samp{-ansi} is not used.
-
- Currently, GNU CC defines @code{__STDC__} as long as you don't use
- @samp{-traditional}. This provides good results in practice.
-
- Programmers normally use conditionals on @code{__STDC__} to ask whether
- it is safe to use certain features of ANSI C, such as function
- prototypes or ANSI token concatenation. Since plain @samp{gcc} supports
- all the features of ANSI C, the correct answer to these questions is
- ``yes''.
-
- Some users try to use @code{__STDC__} to check for the availability of
- certain library facilities. This is actually incorrect usage in an ANSI
- C program, because the ANSI C standard says that a conforming
- freestanding implementation should define @code{__STDC__} even though it
- does not have the library facilities. @samp{gcc -ansi -pedantic} is a
- conforming freestanding implementation, and it is therefore required to
- define @code{__STDC__}, even though it does not come with an ANSI C
- library.
-
- Sometimes people say that defining @code{__STDC__} in a compiler that
- does not completely conform to the ANSI C standard somehow violates the
- standard. This is illogical. The standard is a standard for compilers
- that claim to support ANSI C, such as @samp{gcc -ansi}---not for other
- compilers such as plain @samp{gcc}. Whatever the ANSI C standard says
- is relevant to the design of plain @samp{gcc} without @samp{-ansi} only
- for pragmatic reasons, not as a requirement.
-
- @item
- Undefining @code{__STDC__} in C++.
-
- Programs written to compile with C++-to-C translators get the
- value of @code{__STDC__} that goes with the C compiler that is
- subsequently used. These programs must test @code{__STDC__}
- to determine what kind of C preprocessor that compiler uses:
- whether they should concatenate tokens in the ANSI C fashion
- or in the traditional fashion.
-
- These programs work properly with GNU C++ if @code{__STDC__} is defined.
- They would not work otherwise.
-
- In addition, many header files are written to provide prototypes in ANSI
- C but not in traditional C. Many of these header files can work without
- change in C++ provided @code{__STDC__} is defined. If @code{__STDC__}
- is not defined, they will all fail, and will all need to be changed to
- test explicitly for C++ as well.
-
- @item
- Deleting ``empty'' loops.
-
- GNU CC does not delete ``empty'' loops because the most likely reason
- you would put one in a program is to have a delay. Deleting them will
- not make real programs run any faster, so it would be pointless.
-
- It would be different if optimization of a nonempty loop could produce
- an empty one. But this generally can't happen.
- @end itemize
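-
- The portability advice above about bitfields, namely to state the
- signedness explicitly in each one, might look like this minimal sketch.
- The structure and function names are hypothetical and chosen only for
- the illustration.
-
```c
struct packed_fields {
    signed int   delta : 4;   /* explicitly signed in either dialect */
    unsigned int count : 4;   /* explicitly unsigned in either dialect */
};

/* Store a negative value in the signed field and read it back. */
int read_delta(void)
{
    struct packed_fields f = { 0, 0 };

    f.delta = -1;
    return f.delta;
}

/* Store the largest 4-bit value in the unsigned field and read it back. */
unsigned read_count(void)
{
    struct packed_fields f = { 0, 0 };

    f.count = 15;
    return f.count;
}
```
-
- Since neither field is declared plain @code{int}, the program should
- mean the same thing under @samp{-fsigned-bitfields} and
- @samp{-funsigned-bitfields}.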
-
- @node Bugs
- @chapter Reporting Bugs
- @cindex bugs
- @cindex reporting bugs
-
- Your bug reports play an essential role in making GNU CC reliable.
-
- When you encounter a problem, the first thing to do is to see if it is
- already known. @xref{Trouble}. If it isn't known, then you should
- report the problem.
-
- Reporting a bug may help you by bringing a solution to your problem, or
- it may not. (If it does not, look in the service directory; see
- @ref{Service}.) In any case, the principal function of a bug report is
- to help the entire community by making the next version of GNU CC work
- better. Bug reports are your contribution to the maintenance of GNU CC.
-
- In order for a bug report to serve its purpose, you must include the
- information that makes it possible to fix the bug.
-
- @menu
- * Criteria: Bug Criteria. Have you really found a bug?
- * Where: Bug Lists. Where to send your bug report.
- * Reporting: Bug Reporting. How to report a bug effectively.
- * Patches: Sending Patches. How to send a patch for GNU CC.
- * Known: Trouble. Known problems.
- * Help: Service. Where to ask for help.
- @end menu
-
- @node Bug Criteria
- @section Have You Found a Bug?
- @cindex bug criteria
-
- If you are not sure whether you have found a bug, here are some guidelines:
-
- @itemize @bullet
- @cindex fatal signal
- @cindex core dump
- @item
- If the compiler gets a fatal signal, for any input whatever, that is a
- compiler bug. Reliable compilers never crash.
-
- @cindex invalid assembly code
- @cindex assembly code, invalid
- @item
- If the compiler produces invalid assembly code, for any input whatever
- (except an @code{asm} statement), that is a compiler bug, unless the
- compiler reports errors (not just warnings) which would ordinarily
- prevent the assembler from being run.
-
- @cindex undefined behavior
- @cindex undefined function value
- @cindex increment operators
- @item
- If the compiler produces valid assembly code that does not correctly
- execute the input source code, that is a compiler bug.
-
- However, you must double-check to make sure, because you may have run
- into an incompatibility between GNU C and traditional C
- (@pxref{Incompatibilities}). These incompatibilities might be considered
- bugs, but they are inescapable consequences of valuable features.
-
- Or you may have a program whose behavior is undefined, which happened
- by chance to give the desired results with another C or C++ compiler.
-
- For example, in many nonoptimizing compilers, you can write @samp{x;}
- at the end of a function instead of @samp{return x;}, with the same
- results. But the value of the function is undefined if @code{return}
- is omitted; it is not a bug when GNU CC produces different results.
-
- Problems often result from expressions with two increment operators,
- as in @code{f (*p++, *p++)}. Your previous compiler might have
- interpreted that expression the way you intended; GNU CC might
- interpret it another way. Neither compiler is wrong. The bug is
- in your code.
-
- After you have localized the error to a single source line, it should
- be easy to check for these things. If your program is correct and
- well defined, you have found a compiler bug.
-
- @item
- If the compiler produces an error message for valid input, that is a
- compiler bug.
-
- @cindex invalid input
- @item
- If the compiler does not produce an error message for invalid input,
- that is a compiler bug. However, you should note that your idea of
- ``invalid input'' might be my idea of ``an extension'' or ``support
- for traditional practice''.
-
- @item
- If you are an experienced user of C or C++ compilers, your suggestions
- for improvement of GNU CC or GNU C++ are welcome in any case.
- @end itemize
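-
- As an illustration of the increment-operator problem mentioned above,
- here is one way a call such as @code{f (*p++, *p++)} could be rewritten
- so that its meaning no longer depends on the compiler. This is a sketch
- under assumptions: the body of @code{f} and the wrapper
- @code{call_in_order} are hypothetical, and the left-to-right order shown
- is only a guess at what the programmer intended.
-
```c
/* Hypothetical callee: combines its arguments so their order is visible. */
static int f(int first, int second)
{
    return first * 10 + second;
}

int call_in_order(const int *p)
{
    int first = *p++;    /* each increment is in its own declaration, */
    int second = *p++;   /* so the order of the two reads is fixed */

    return f(first, second);
}
```
-
- Putting each increment in a separate statement makes the order of the
- two reads explicit, so every conforming compiler should produce the same
- result.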
-
- @node Bug Lists
- @section Where to Report Bugs
- @cindex bug report mailing lists
-
- Send bug reports for GNU C to one of these addresses:
-
- @example
- bug-gcc@@prep.ai.mit.edu
- @{ucbvax|mit-eddie|uunet@}!prep.ai.mit.edu!bug-gcc
- @end example
-
- Send bug reports for GNU C++ to one of these addresses:
-
- @example
- bug-g++@@prep.ai.mit.edu
- @{ucbvax|mit-eddie|uunet@}!prep.ai.mit.edu!bug-g++
- @end example
-
- @strong{Do not send bug reports to @samp{help-gcc}, or to the newsgroup
- @samp{gnu.gcc.help}.} Most users of GNU CC do not want to receive bug
- reports. Those that do, have asked to be on @samp{bug-gcc} and/or
- @samp{bug-g++}.
-
- The mailing lists @samp{bug-gcc} and @samp{bug-g++} both have newsgroups
- which serve as repeaters: @samp{gnu.gcc.bug} and @samp{gnu.g++.bug}.
- Each mailing list and its newsgroup carry exactly the same messages.
-
- Often people think of posting bug reports to the newsgroup instead of
- mailing them. This appears to work, but it has one problem which can be
- crucial: a newsgroup posting does not contain a mail path back to the
- sender. Thus, if maintainers need more information, they may be unable
- to reach you. For this reason, you should always send bug reports by
- mail to the proper mailing list.
-
- As a last resort, send bug reports on paper to:
-
- @example
- GNU Compiler Bugs
- Free Software Foundation
- 675 Mass Ave
- Cambridge, MA 02139
- @end example
-
- @node Bug Reporting
- @section How to Report Bugs
- @cindex compiler bugs, reporting
-
- The fundamental principle of reporting bugs usefully is this:
- @strong{report all the facts}. If you are not sure whether to state a
- fact or leave it out, state it!
-
- Often people omit facts because they think they know what causes the
- problem and they conclude that some details don't matter. Thus, you might
- assume that the name of the variable you use in an example does not matter.
- Well, probably it doesn't, but one cannot be sure. Perhaps the bug is a
- stray memory reference which happens to fetch from the location where that
- name is stored in memory; perhaps, if the name were different, the contents
- of that location would fool the compiler into doing the right thing despite
- the bug. Play it safe and give a specific, complete example. That is the
- easiest thing for you to do, and the most helpful.
-
- Keep in mind that the purpose of a bug report is to enable someone to
- fix the bug if it is not known. It isn't very important what happens if
- the bug is already known. Therefore, always write your bug reports on
- the assumption that the bug is not known.
-
- Sometimes people give a few sketchy facts and ask, ``Does this ring a
- bell?'' This cannot help us fix a bug, so it is basically useless. We
- respond by asking for enough details to enable us to investigate.
- You might as well expedite matters by sending them to begin with.
-
- Try to make your bug report self-contained. If we have to ask you for
- more information, it is best if you include all the previous information
- in your response, as well as the information that was missing.
-
- To enable someone to investigate the bug, you should include all these
- things:
-
- @itemize @bullet
- @item
- The version of GNU CC. You can get this by running it with the
- @samp{-v} option.
-
- Without this, we won't know whether there is any point in looking for
- the bug in the current version of GNU CC.
-
- @item
- A complete input file that will reproduce the bug. If the bug is in the
- C preprocessor, send a source file and any header files that it
- requires. If the bug is in the compiler proper (@file{cc1}), run your
- source file through the C preprocessor by doing @samp{gcc -E
- @var{sourcefile} > @var{outfile}}, then include the contents of
- @var{outfile} in the bug report. (When you do this, use the same
- @samp{-I}, @samp{-D} or @samp{-U} options that you used in actual
- compilation.)
-
- A single statement is not enough of an example. In order to compile
- it, it must be embedded in a function definition; and the bug might
- depend on the details of how this is done.
-
- Without a real example one can compile, all anyone can do about your bug
- report is wish you luck. It would be futile to try to guess how to
- provoke the bug. For example, bugs in register allocation and reloading
- frequently depend on every little detail of the function they happen in.
-
- @item
- The command arguments you gave GNU CC or GNU C++ to compile that example
- and observe the bug. For example, did you use @samp{-O}? To guarantee
- you won't omit something important, list all the options.
-
- If we were to try to guess the arguments, we would probably guess wrong
- and then we would not encounter the bug.
-
- @item
- The type of machine you are using, and the operating system name and
- version number.
-
- @item
- The operands you gave to the @code{configure} command when you installed
- the compiler.
-
- @item
- A complete list of any modifications you have made to the compiler
- source. (We don't promise to investigate the bug unless it happens in
- an unmodified compiler. But if you've made modifications and don't tell
- us, then you are sending us on a wild goose chase.)
-
- Be precise about these changes---show a context diff for them.
-
- Adding files of your own (such as a machine description for a machine we
- don't support) is a modification of the compiler source.
-
- @item
- Details of any other deviations from the standard procedure for installing
- GNU CC.
-
- @item
- A description of what behavior you observe that you believe is
- incorrect. For example, ``The compiler gets a fatal signal,'' or,
- ``The assembler instruction at line 208 in the output is incorrect.''
-
- Of course, if the bug is that the compiler gets a fatal signal, then one
- can't miss it. But if the bug is incorrect output, the maintainer might
- not notice unless it is glaringly wrong. None of us has time to study
- all the assembler code from a 50-line C program just on the chance that
- one instruction might be wrong. We need @emph{you} to do this part!
-
- Even if the problem you experience is a fatal signal, you should still
- say so explicitly. Suppose something strange is going on, such as, your
- copy of the compiler is out of synch, or you have encountered a bug in
- the C library on your system. (This has happened!) Your copy might
- crash and the copy here would not. If you @i{said} to expect a crash,
- then when the compiler here fails to crash, we would know that the bug
- was not happening. If you don't say to expect a crash, then we would
- not know whether the bug was happening. We would not be able to draw
- any conclusion from our observations.
-
- Often the observed symptom is incorrect output when your program is run.
- Sad to say, this is not enough information unless the program is short
- and simple. None of us has time to study a large program to figure out
- how it would work if compiled correctly, much less which line of it was
- compiled wrong. So you will have to do that. Tell us which source line
- it is, and what incorrect result happens when that line is executed. A
- person who understands the program can find this as easily as finding a
- bug in the program itself.
-
- @item
- If you send examples of assembler code output from GNU CC or GNU C++,
- please use @samp{-g} when you make them. The debugging information
- includes source line numbers which are essential for correlating the
- output with the input.
-
- @item
- If you wish to suggest changes to the GNU CC source, send them as
- context diffs. If you even discuss something in the GNU CC source,
- refer to it by context, not by line number.
-
- The line numbers in the development sources don't match those in your
- sources. Your line numbers would convey no useful information to the
- maintainers.
-
- @item
- Additional information from a debugger might enable someone to find a
- problem on a machine which he does not have available. However, you
- need to think when you collect this information if you want it to have
- any chance of being useful.
-
- @cindex backtrace for bug reports
- For example, many people send just a backtrace, but that is never
- useful by itself. A simple backtrace with arguments conveys little
- about GNU CC because the compiler is largely data-driven; the same
- functions are called over and over for different RTL insns, doing
- different things depending on the details of the insn.
-
- Most of the arguments listed in the backtrace are useless because they
- are pointers to RTL list structure. The numeric values of the
- pointers, which the debugger prints in the backtrace, have no
- significance whatever; all that matters is the contents of the objects
- they point to (and most of the contents are other such pointers).
-
- In addition, most compiler passes consist of one or more loops that
- scan the RTL insn sequence. The most vital piece of information about
- such a loop---which insn it has reached---is usually in a local variable,
- not in an argument.
-
- @findex debug_rtx
- What you need to provide in addition to a backtrace are the values of
- the local variables for several stack frames up. When a local
- variable or an argument is an RTX, first print its value and then use
- the GDB command @code{pr} to print the RTL expression that it points
- to. (If GDB doesn't run on your machine, use your debugger to call
- the function @code{debug_rtx} with the RTX as an argument.) In
- general, whenever a variable is a pointer, its value is no use
- without the data it points to.
-
- In addition, include a debugging dump from just before the pass
- in which the crash happens. Most bugs involve a series of insns,
- not just one.
- @end itemize
-
- Here are some things that are not necessary:
-
- @itemize @bullet
- @item
- A description of the envelope of the bug.
-
- Often people who encounter a bug spend a lot of time investigating
- which changes to the input file will make the bug go away and which
- changes will not affect it.
-
- This is often time-consuming and not very useful, because the way we
- will find the bug is by running a single example under the debugger with
- breakpoints, not by pure deduction from a series of examples. You might
- as well save your time for something else.
-
- Of course, if you can find a simpler example to report @emph{instead} of
- the original one, that is a convenience. Errors in the output will be
- easier to spot, running under the debugger will take less time, etc.
- Most GNU CC bugs involve just one function, so the most straightforward
- way to simplify an example is to delete all the function definitions
- except the one where the bug occurs. Those earlier in the file may be
- replaced by external declarations if the crucial function depends on
- them. (Exception: inline functions may affect compilation of functions
- defined later in the file.)
-
- However, simplification is not vital; if you don't want to do this,
- report the bug anyway and send the entire test case you used.
-
- @item
- A patch for the bug.
-
- A patch for the bug is useful if it is a good one. But don't omit the
- necessary information, such as the test case, on the assumption that a
- patch is all we need. We might see problems with your patch and decide
- to fix the problem another way, or we might not understand it at all.
-
- Sometimes with a program as complicated as GNU CC it is very hard to
- construct an example that will make the program follow a certain path
- through the code. If you don't send the example, we won't be able to
- construct one, so we won't be able to verify that the bug is fixed.
-
- And if we can't understand what bug you are trying to fix, or why your
- patch should be an improvement, we won't install it. A test case will
- help us to understand.
-
- @xref{Sending Patches}, for guidelines on how to make it easy for us to
- understand and install your patches.
-
- @item
- A guess about what the bug is or what it depends on.
-
- Such guesses are usually wrong. Even I can't guess right about such
- things without first using the debugger to find the facts.
- @end itemize
-
- @node Sending Patches,, Bug Reporting, Bugs
- @section Sending Patches for GNU CC
-
- If you would like to write bug fixes or improvements for the GNU C
- compiler, that is very helpful. When you send your changes, please
- follow these guidelines to avoid causing extra work for us in studying
- the patches.
-
- If you don't follow these guidelines, your information might still be
- useful, but using it will take extra work. Maintaining GNU C is a lot
- of work in the best of circumstances, and we can't keep up unless you do
- your best to help.
-
- @itemize @bullet
- @item
- Send an explanation with your changes of what problem they fix or what
- improvement they bring about. For a bug fix, just include a copy of the
- bug report, and explain why the change fixes the bug.
-
- (Referring to a bug report is not as good as including it, because then
- we will have to look it up, and we have probably already deleted it if
- we've already fixed the bug.)
-
- @item
- Always include a proper bug report for the problem you think you have
- fixed. We need to convince ourselves that the change is right before
- installing it. Even if it is right, we might have trouble judging it if
- we don't have a way to reproduce the problem.
-
- @item
- Include all the comments that are appropriate to help people reading the
- source in the future understand why this change was needed.
-
- @item
- Don't mix together changes made for different reasons.
- Send them @emph{individually}.
-
- If you make two changes for separate reasons, then we might not want to
- install them both. We might want to install just one. If you send them
- all jumbled together in a single set of diffs, we have to do extra work
- to disentangle them---to figure out which parts of the change serve
- which purpose. If we don't have time for this, we might have to ignore
- your changes entirely.
-
- If you send each change as soon as you have written it, with its own
- explanation, then the two changes never get tangled up, and we can
- consider each one properly without any extra work to disentangle them.
-
- Ideally, each change you send should be impossible to subdivide into
- parts that we might want to consider separately, because each of its
- parts gets its motivation from the other parts.
-
- @item
- Send each change as soon as that change is finished. Sometimes people
- think they are helping us by accumulating many changes to send them all
- together. As explained above, this is absolutely the worst thing you
- could do.
-
- Since you should send each change separately, you might as well send it
- right away. That gives us the option of installing it immediately if it
- is important.
-
- @item
- Use @samp{diff -c} to make your diffs. Diffs without context are hard
- for us to install reliably. More than that, they make it hard for us to
- study the diffs to decide whether we want to install them. Unidiff
- format is better than contextless diffs, but not as easy to read as
- @samp{-c} format.
-
- If you have GNU diff, use @samp{diff -cp}, which shows the name of the
- function that each change occurs in.
-
- @item
- Write the change log entries for your changes. We get lots of changes,
- and we don't have time to do all the change log writing ourselves.
-
- Read the @file{ChangeLog} file to see what sorts of information to put
- in, and to learn the style that we use. The purpose of the change log
- is to show people where to find what was changed. So you need to be
- specific about what functions you changed; in large functions, it's
- often helpful to indicate where within the function the change was.
-
- On the other hand, once you have shown people where to find the change,
- you need not explain its purpose. Thus, if you add a new function, all
- you need to say about it is that it is new. If you feel that the
- purpose needs explaining, it probably does---but the explanation will be
- much more useful if you put it in comments in the code.
-
- If you would like your name to appear in the header line for who made
- the change, send us the header line.
-
- @item
- When you write the fix, keep in mind that I can't install a change that
- would break other systems.
-
- People often suggest fixing a problem by changing machine-independent
- files such as @file{toplev.c} to do something special that a particular
- system needs. Sometimes it is totally obvious that such changes would
- break GNU CC for almost all users. We can't possibly make a change like
- that. At best it might tell us how to write another patch that would
- solve the problem acceptably.
-
- Sometimes people send fixes that @emph{might} be an improvement in
- general---but it is hard to be sure of this. It's hard to install
- such changes because we have to study them very carefully. Of course,
- a good explanation of the reasoning by which you concluded the change
- was correct can help convince us.
-
- The safest changes are changes to the configuration files for a
- particular machine. These are safe because they can't create new bugs
- on other machines.
-
- Please help us keep up with the workload by designing the patch in a
- form that is good to install.
- @end itemize
-
- @node Service
- @chapter How To Get Help with GNU CC
-
- If you need help installing, using or changing GNU CC, there are two
- ways to find it:
-
- @itemize @bullet
- @item
- Send a message to a suitable network mailing list. First try
- @code{bug-gcc@@prep.ai.mit.edu}, and if that brings no response, try
- @code{help-gcc@@prep.ai.mit.edu}.
-
- @item
- Look in the service directory for someone who might help you for a fee.
- The service directory is found in the file named @file{SERVICE} in the
- GNU CC distribution.
- @end itemize
-
- @node VMS
- @chapter Using GNU CC on VMS
-
- @menu
- * Include Files and VMS:: Where the preprocessor looks for the include files.
- * Global Declarations:: How to do globaldef, globalref and globalvalue with
- GNU CC.
- * VMS Misc:: Misc information.
- @end menu
-
- @node Include Files and VMS
- @section Include Files and VMS
-
- @cindex include files and VMS
- @cindex VMS and include files
- @cindex header files and VMS
- Due to the differences between the filesystems of Unix and VMS, GNU CC
- attempts to translate file names in @samp{#include} into names that VMS
- will understand. The basic strategy is to prepend a prefix to the
- specification of the include file, convert the whole filename to a VMS
- filename, and then try to open the file. GNU CC tries various prefixes
- one by one until one of them succeeds:
-
- @enumerate
- @item
- The first prefix is the @samp{GNU_CC_INCLUDE:} logical name: this is
- where GNU C header files are traditionally stored. If you wish to store
- header files in non-standard locations, then you can assign the logical
- @samp{GNU_CC_INCLUDE} to be a search list, where each element of the
- list is suitable for use with a rooted logical.
-
- @item
- The next prefix tried is @samp{SYS$SYSROOT:[SYSLIB.]}. This is where
- VAX-C header files are traditionally stored.
-
- @item
- If the include file specification by itself is a valid VMS filename, the
- preprocessor then uses this name with no prefix in an attempt to open
- the include file.
-
- @item
- If the file specification is not a valid VMS filename (i.e. does not
- contain a device or a directory specifier, and contains a @samp{/}
- character), the preprocessor tries to convert it from Unix syntax to
- VMS syntax.
-
- Conversion works like this: the first directory name becomes a device,
- and the rest of the directories are converted into VMS-format directory
- names. For example, @file{X11/foobar.h} is translated to
- @file{X11:[000000]foobar.h} or @file{X11:foobar.h}, whichever one can be
- opened. This strategy allows you to assign a logical name to point to
- the actual location of the header files.
-
- @item
- If none of these strategies succeeds, the @samp{#include} fails.
- @end enumerate
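-
- The Unix-to-VMS conversion described in step 4 can be sketched in C
- roughly as follows. This is only an illustration of the rule stated
- above, not the code the preprocessor actually uses: the function name
- @code{unix_to_vms} and its buffer interface are hypothetical, and the
- dotted form used for nested directories is an assumption based on
- ordinary VMS directory syntax.
-
```c
#include <string.h>

/* Hypothetical helper, not part of GNU CC: convert a relative
   Unix-style include name such as "X11/foobar.h" into a VMS-style
   file name.  The first directory becomes the device; any remaining
   directories become a VMS directory list.  When no directories
   remain, the root directory [000000] is used.
   Returns 0 on success, -1 if NAME has no usable '/' or if the
   result might not fit in OUT.  */
int
unix_to_vms (const char *name, char *out, size_t outsize)
{
  const char *first = strchr (name, '/');
  const char *last = strrchr (name, '/');
  const char *p;
  char *q = out;

  if (first == NULL || first == name)
    return -1;

  /* Worst case: a single '/' is replaced by ":[000000]" (9 chars),
     and one more byte is needed for the terminating NUL.  */
  if (strlen (name) + 9 > outsize)
    return -1;

  /* Device: everything before the first slash, then ':'.  */
  for (p = name; p < first; p++)
    *q++ = *p;
  *q++ = ':';

  if (first == last)
    {
      /* No further directories: use the root directory.  */
      strcpy (q, "[000000]");
      q += 8;
    }
  else
    {
      /* Remaining directories: a/b becomes [a.b] (assumed form).  */
      *q++ = '[';
      for (p = first + 1; p < last; p++)
        *q++ = (*p == '/') ? '.' : *p;
      *q++ = ']';
    }

  /* File name: everything after the last slash.  */
  strcpy (q, last + 1);
  return 0;
}
```
-
- With this sketch, @file{X11/foobar.h} becomes
- @file{X11:[000000]foobar.h}. The sketch does not produce the
- alternative @file{X11:foobar.h} form, and it does not try to open any
- file.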
-
- Include directives of the form:
-
- @example
- #include foobar
- @end example
-
- @noindent
- are a common source of incompatibility between VAX-C and GNU CC. VAX-C
- treats this much like a standard @code{#include <foobar.h>} directive.
- That is incompatible with the ANSI C behavior implemented by GNU CC: to
- expand the name @code{foobar} as a macro. Macro expansion should
- eventually yield one of the two standard formats for @code{#include}:
-
- @example
- #include "@var{file}"
- #include <@var{file}>
- @end example
-
- If you have this problem, the best solution is to modify the source to
- convert the @code{#include} directives to one of the two standard forms.
- That will work with either compiler. If you want a quick and dirty fix,
- define the file names as macros with the proper expansion, like this:
-
- @example
- #define stdio <stdio.h>
- @end example
-
- @noindent
- This will work, as long as the name doesn't conflict with anything else
- in the program.
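-
- A small self-contained illustration of the macro approach is shown
- below. The macro name @code{STDIO_HEADER} and the function
- @code{format_answer} are made up for this example; any name that does
- not clash with the rest of the program would do.
-
```c
/* Hypothetical macro name; it expands to a standard header name.  */
#define STDIO_HEADER <stdio.h>

/* The preprocessor expands the macro first and then treats the
   result as an ordinary `#include <stdio.h>' directive.  */
#include STDIO_HEADER

/* Uses sprintf from <stdio.h>, which shows that the header really
   was included.  Returns the number of characters written.  */
int
format_answer (char *buf)
{
  return sprintf (buf, "%d", 42);
}
```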
-
- Another source of incompatibility is that VAX-C assumes that:
-
- @example
- #include "foobar"
- @end example
-
- @noindent
- is actually asking for the file @file{foobar.h}. GNU CC does not
- make this assumption, and instead takes what you ask for literally;
- it tries to read the file @file{foobar}. The best way to avoid this
- problem is to always specify the desired file extension in your include
- directives.
-
- GNU CC for VMS is distributed with a set of include files that is
- sufficient to compile most general purpose programs. Even though the
- GNU CC distribution does not contain header files to define constants
- and structures for some VMS system-specific functions, there is no
- reason why you cannot use GNU CC with any of these functions. You may
- first have to generate or create header files, either by using the public
- domain utility @code{UNSDL} (which can be found on a DECUS tape), or by
- extracting the relevant modules from one of the system macro libraries,
- and using an editor to construct a C header file.
-
- @node Global Declarations
- @section Global Declarations and VMS
-
- @findex GLOBALREF
- @findex GLOBALDEF
- @findex GLOBALVALUEDEF
- @findex GLOBALVALUEREF
- GNU CC does not provide the @code{globalref}, @code{globaldef} and
- @code{globalvalue} keywords of VAX-C. You can get the same effect with
- an obscure feature of GAS, the GNU assembler. (This requires GAS
- version 1.39 or later.) The following macros allow you to use this
- feature in a fairly natural way:
-
- @smallexample
- #ifdef __GNUC__
- #define GLOBALREF(TYPE,NAME) \
-   TYPE NAME \
-   asm ("_$$PsectAttributes_GLOBALSYMBOL$$" #NAME)
- #define GLOBALDEF(TYPE,NAME,VALUE) \
-   TYPE NAME \
-   asm ("_$$PsectAttributes_GLOBALSYMBOL$$" #NAME) \
-   = VALUE
- #define GLOBALVALUEREF(TYPE,NAME) \
-   const TYPE NAME[1] \
-   asm ("_$$PsectAttributes_GLOBALVALUE$$" #NAME)
- #define GLOBALVALUEDEF(TYPE,NAME,VALUE) \
-   const TYPE NAME[1] \
-   asm ("_$$PsectAttributes_GLOBALVALUE$$" #NAME) \
-   = @{VALUE@}
- #else
- #define GLOBALREF(TYPE,NAME) \
-   globalref TYPE NAME
- #define GLOBALDEF(TYPE,NAME,VALUE) \
-   globaldef TYPE NAME = VALUE
- #define GLOBALVALUEDEF(TYPE,NAME,VALUE) \
-   globalvalue TYPE NAME = VALUE
- #define GLOBALVALUEREF(TYPE,NAME) \
-   globalvalue TYPE NAME
- #endif
- @end smallexample
-
- @noindent
- (The @code{_$$PsectAttributes_GLOBALSYMBOL} prefix at the start of the
- name is removed by the assembler, after it has modified the attributes
- of the symbol.) These macros are provided in the VMS binaries
- distribution in a header file @file{GNU_HACKS.H}. An example of the
- usage is:
-
- @example
- GLOBALREF (int, ijk);
- GLOBALDEF (int, jkl, 0);
- @end example
-
- The macros @code{GLOBALREF} and @code{GLOBALDEF} cannot be used
- straightforwardly for arrays, since there is no way to insert the array
- dimension into the declaration at the right place. However, you can
- declare an array with these macros if you first define a typedef for the
- array type, like this:
-
- @example
- typedef int intvector[10];
- GLOBALREF (intvector, foo);
- @end example
-
- Array and structure initializers will also break the macros; you can
- define the initializer to be a macro of its own, or you can expand the
- @code{GLOBALDEF} macro by hand. You may find a case where you wish to
- use the @code{GLOBALDEF} macro with a large array, but you are not
- interested in explicitly initializing each element of the array. In
- such cases you can use an initializer like: @code{@{0,@}}, which will
- initialize the entire array to @code{0}.
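-
- The short initializer relies on the C rule that elements of a
- partially initialized array which have no explicit initializer are set
- to zero. A plain C sketch of the idea, with the hypothetical names
- @code{TABLE_SIZE}, @code{table} and @code{table_is_all_zero}:
-
```c
#define TABLE_SIZE 1000

/* Only the first element is written out explicitly; the remaining
   elements are set to zero by the compiler.  */
int table[TABLE_SIZE] = {0,};

/* Returns 1 if every element of the table is zero, 0 otherwise.  */
int
table_is_all_zero (void)
{
  int i;

  for (i = 0; i < TABLE_SIZE; i++)
    if (table[i] != 0)
      return 0;
  return 1;
}
```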
-
- A shortcoming of this implementation is that a variable declared with
- @code{GLOBALVALUEREF} or @code{GLOBALVALUEDEF} is always an array. For
- example, the declaration:
-
- @example
- GLOBALVALUEREF(int, ijk);
- @end example
-
- @noindent
- declares the variable @code{ijk} as an array of type @code{int [1]}.
- This is done because a globalvalue is actually a constant; its ``value''
- is what the linker would normally consider an address. That is not how
- an integer value works in C, but it is how an array works. So treating
- the symbol as an array name gives consistent results---with the
- exception that the value seems to have the wrong type. @strong{Don't
- try to access an element of the array.} It doesn't have any elements.
- The array ``address'' may not be the address of actual storage.
-
- The fact that the symbol is an array may lead to warnings where the
- variable is used. Insert type casts to avoid the warnings. Here is an
- example; it takes advantage of the ANSI C feature allowing macros that
- expand to use the same name as the macro itself.
-
- @example
- GLOBALVALUEREF (int, ss$_normal);
- GLOBALVALUEDEF (int, xyzzy,123);
- #ifdef __GNUC__
- #define ss$_normal ((int) ss$_normal)
- #define xyzzy ((int) xyzzy)
- #endif
- @end example
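-
- The self-referential macro trick does not depend on VMS. In the
- following portable sketch the variable @code{status_code} and the
- function @code{status_as_int} are hypothetical stand-ins; the point is
- only that the macro's own name is not expanded a second time inside
- its expansion.
-
```c
/* Hypothetical variable standing in for a globalvalue symbol.  */
static const long status_code = 1;

/* Inside the expansion, `status_code' is not expanded again, so it
   refers to the variable declared above.  */
#define status_code ((int) status_code)

/* Every later use of the name gets the cast automatically.  */
int
status_as_int (void)
{
  return status_code;
}
```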
-
- Don't use @code{globaldef} or @code{globalref} with a variable whose
- type is an enumeration type; this is not implemented. Instead, make the
- variable an integer, and use a @code{globalvaluedef} for each of the
- enumeration values. An example of this would be:
-
- @example
- #ifdef __GNUC__
- GLOBALDEF (int, color, 0);
- GLOBALVALUEDEF (int, RED, 0);
- GLOBALVALUEDEF (int, BLUE, 1);
- GLOBALVALUEDEF (int, GREEN, 3);
- #else
- enum globaldef color @{RED, BLUE, GREEN = 3@};
- #endif
- @end example
-
- @node VMS Misc
- @section Other VMS Issues
-
- @cindex exit status and VMS
- @cindex return value of @code{main}
- @cindex @code{main} and the exit status
- GNU CC automatically arranges for @code{main} to return 1 by default if
- you fail to specify an explicit return value. This will be interpreted
- by VMS as a status code indicating a normal successful completion.
- Version 1 of GNU CC did not provide this default.
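-
- The value 1 counts as success because VMS treats a status value whose
- low-order bit is set as successful. The following sketch states that
- convention in C; the helper name @code{vms_status_is_success} is
- hypothetical, and the rule comes from general VMS conventions rather
- than from anything specific to GNU CC.
-
```c
/* Hypothetical helper: returns nonzero if a VMS-style status value
   indicates success, that is, if its low-order bit is set.  */
int
vms_status_is_success (unsigned long status)
{
  return (status & 1) != 0;
}
```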
-
- GNU CC on VMS works only with the GNU assembler, GAS. You need version
- 1.37 or later of GAS in order to produce valid debugging information for
- the VMS debugger. Use the ordinary VMS linker with the object files
- produced by GAS.
-
- @cindex shared VMS run time system
- @cindex @file{VAXCRTL}
- Under previous versions of GNU CC, the generated code would occasionally
- give strange results when linked to the sharable @file{VAXCRTL} library.
- Now this should work.
-
- A caveat for use of @code{const} global variables: the @code{const}
- modifier must be specified in every external declaration of the variable
- in all of the source files that use that variable. Otherwise the linker
- will issue warnings about conflicting attributes for the variable. Your
- program will still work despite the warnings, but the variable will be
- placed in writable storage.
-
- @cindex name augmentation
- @cindex case sensitivity and VMS
- @cindex VMS and case sensitivity
- The VMS linker does not distinguish between upper and lower case letters
- in function and variable names. However, usual practice in C is to
- distinguish case. Normally GNU CC (by means of the assembler GAS)
- implements usual C behavior by augmenting each name that is not all
- lower-case. A name is augmented by truncating it to at most 23
- characters and then adding more characters at the end which encode the
- case pattern of the rest.
-
- Name augmentation yields bad results for programs that use precompiled
- libraries (such as Xlib) which were generated by another compiler. You
- can use the compiler option @samp{/NOCASE_HACK} to inhibit augmentation;
- it makes external C functions and variables case-independent as is usual
- on VMS. Alternatively, you could write all references to the functions
- and variables in such libraries using lower case; this will work on VMS,
- but is not portable to other systems.
-
- Function and variable names are handled somewhat differently with GNU
- C++. The GNU C++ compiler performs @dfn{name mangling} on function
- names, which means that it adds information to the function name to
- describe the data types of the arguments that the function takes. One
- result of this is that the name of a function can become very long.
- Since the VMS linker only recognizes the first 31 characters in a name,
- special action is taken to ensure that each function and variable has a
- unique name that can be represented in 31 characters.
-
- If the name (plus a name augmentation, if required) is less than 32
- characters in length, then no special action is performed. If the name
- is longer than 31 characters, the assembler (GAS) will generate a
- hash string based upon the function name, truncate the function name to
- 23 characters, and append the hash string to the truncated name. If the
- @samp{/VERBOSE} compiler option is used, the assembler will print both
- the full and truncated names of each symbol that is truncated.
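-
- The truncation rule can be sketched as follows. Only the numbers 31
- and 23 come from the description above. The hash function and the
- eight-digit hexadecimal suffix are invented for illustration, and the
- names @code{name_hash} and @code{shorten_name} are hypothetical; GAS
- uses its own algorithm, which is not reproduced here.
-
```c
#include <stdio.h>
#include <string.h>

#define VMS_NAME_MAX 31   /* characters the VMS linker recognizes */
#define KEEP_CHARS   23   /* characters kept from a long name */

/* Hypothetical hash: a simple multiplicative string hash, reduced
   to 32 bits so that it prints as at most eight hex digits.  */
static unsigned long
name_hash (const char *s)
{
  unsigned long h = 5381;

  while (*s)
    h = (h * 33 + (unsigned char) *s++) & 0xffffffffUL;
  return h;
}

/* Write into OUT (at least VMS_NAME_MAX + 1 bytes) a name of at most
   31 characters for NAME.  Short names are copied unchanged; longer
   names are cut to 23 characters and followed by 8 hash digits.  */
void
shorten_name (const char *name, char *out)
{
  if (strlen (name) <= VMS_NAME_MAX)
    strcpy (out, name);
  else
    sprintf (out, "%.*s%08lx", KEEP_CHARS, name, name_hash (name));
}
```
-
- In this sketch, two long names that agree in their first 23 characters
- normally still receive different shortened names, because the suffix
- depends on the whole name.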
-
- The @samp{/NOCASE_HACK} compiler option should not be used when you are
- compiling programs that use libg++. libg++ has several instances of
- objects (for example, @code{Filebuf} and @code{filebuf}) which become
- indistinguishable in a case-insensitive environment. This leads to
- cases where you need to inhibit augmentation selectively (if you were
- using libg++ and Xlib in the same program, for example). There is no
- special feature for doing this, but you can get the result by defining a
- macro for each mixed case symbol for which you wish to inhibit
- augmentation. The macro should expand into the lower case equivalent of
- itself. For example:
-
- @example
- #define StuDlyCapS studlycaps
- @end example
-
- These macro definitions can be placed in a header file to minimize the
- number of changes to your source code.
-
- @ifset INTERNALS
- @node Portability
- @chapter GNU CC and Portability
- @cindex portability
- @cindex GNU CC and portability
-
- The main goal of GNU CC was to make a good, fast compiler for machines in
- the class that the GNU system aims to run on: 32-bit machines that address
- 8-bit bytes and have several general registers. Elegance, theoretical
- power and simplicity are only secondary.
-
- GNU CC gets most of the information about the target machine from a machine
- description which gives an algebraic formula for each of the machine's
- instructions. This is a very clean way to describe the target. But when
- the compiler needs information that is difficult to express in this
- fashion, I have not hesitated to define an ad-hoc parameter to the machine
- description. The purpose of portability is to reduce the total work needed
- on the compiler; it was not of interest for its own sake.
-
- @cindex endianness
- @cindex autoincrement addressing, availability
- @findex abort
- GNU CC does not contain machine-dependent code, but it does contain code
- that depends on machine parameters such as endianness (whether the most
- significant byte has the highest or lowest address of the bytes in a word)
- and the availability of autoincrement addressing. In the RTL-generation
- pass, it is often necessary to have multiple strategies for generating code
- for a particular kind of syntax tree, strategies that are usable for different
- combinations of parameters. Often I have not tried to address all possible
- cases, but only the common ones or only the ones that I have encountered.
- As a result, a new target may require additional strategies. You will know
- if this happens because the compiler will call @code{abort}. Fortunately,
- the new strategies can be added in a machine-independent fashion, and will
- affect only the target machines that need them.
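-
- As an illustration of what endianness means, the following sketch
- tests the byte order of the machine it runs on. It is not how GNU CC
- learns the byte order of its target, which comes from the machine
- description; the function name @code{host_is_big_endian} is
- hypothetical, and the test assumes that @code{unsigned int} is wider
- than one byte.
-
```c
/* Returns 1 if the most significant byte of a word has the lowest
   address (big-endian), 0 if the least significant byte does
   (little-endian).  Assumes unsigned int is wider than one byte.  */
int
host_is_big_endian (void)
{
  unsigned int word = 1;
  const unsigned char *bytes = (const unsigned char *) &word;

  /* On a little-endian machine the byte holding the 1 comes first.  */
  return bytes[0] == 0;
}
```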
- @end ifset
-
- @ifset INTERNALS
- @node Interface
- @chapter Interfacing to GNU CC Output
- @cindex interfacing to GNU CC output
- @cindex run-time conventions
- @cindex function call conventions
- @cindex conventions, run-time
-
- GNU CC is normally configured to use the function calling convention
- that is standard on the target system. This is done with the
- machine-description macros (@pxref{Target Macros}).
-
- @cindex unions, returning
- @cindex structures, returning
- @cindex returning structures and unions
- However, returning of structure and union values is done differently on
- some target machines. As a result, functions compiled with PCC
- returning such types cannot be called from code compiled with GNU CC,
- and vice versa. This does not cause trouble often because few Unix
- library routines return structures or unions.
-
- GNU CC code returns structures and unions that are 1, 2, 4 or 8 bytes
- long in the same registers used for @code{int} or @code{double} return
- values. (GNU CC typically allocates variables of such types in
- registers also.) Structures and unions of other sizes are returned by
- storing them into an address passed by the caller (usually in a
- register). The machine-description macros @code{STRUCT_VALUE} and
- @code{STRUCT_INCOMING_VALUE} tell GNU CC where to pass this address.
-
- By contrast, PCC on most target machines returns structures and unions
- of any size by copying the data into an area of static storage, and then
- returning the address of that storage as if it were a pointer value.
- The caller must copy the data from that memory area to the place where
- the value is wanted. This is slower than the method used by GNU CC, and
- fails to be reentrant.
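-
- The difference between the two conventions can be imitated by hand in
- ordinary C. The sketch below is only an emulation written with
- explicit pointers, not compiler output, and all of its names
- (@code{struct triple}, @code{make_triple_static},
- @code{make_triple_into}, @code{static_result_clobbered}) are
- hypothetical.
-
```c
struct triple { int a, b, c; };

/* PCC-style emulation: the result lives in a single static area,
   and the caller receives the address of that area.  */
static struct triple static_area;

struct triple *
make_triple_static (int a, int b, int c)
{
  static_area.a = a;
  static_area.b = b;
  static_area.c = c;
  return &static_area;
}

/* Emulation of the other method: the caller passes the address
   where the result should be stored.  */
void
make_triple_into (struct triple *result, int a, int b, int c)
{
  result->a = a;
  result->b = b;
  result->c = c;
}

/* Returns 1 if a second call overwrote the result of the first,
   which is why the static-area method is not reentrant.  */
int
static_result_clobbered (void)
{
  struct triple *first = make_triple_static (1, 2, 3);
  struct triple *second = make_triple_static (4, 5, 6);

  return first == second && first->a == 4;
}
```
-
- With caller-supplied storage, each call writes to its own destination,
- so two results can be held at the same time.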
-
- On some target machines, such as RISC machines and the 80386, the
- standard system convention is to pass to the subroutine the address of
- where to return the value. On these machines, GNU CC has been
- configured to be compatible with the standard compiler, when this method
- is used. It may not be compatible for structures of 1, 2, 4 or 8 bytes.
-
- @cindex argument passing
- @cindex passing arguments
- GNU CC uses the system's standard convention for passing arguments. On
- some machines, the first few arguments are passed in registers; on
- others, all are passed on the stack. It would be possible to use
- registers for argument passing on any machine, and this would probably
- result in a significant speedup. But the result would be complete
- incompatibility with code that follows the standard convention. So this
- change is practical only if you are switching to GNU CC as the sole C
- compiler for the system. We may implement register argument passing on
- certain machines once we have a complete GNU system so that we can
- compile the libraries with GNU CC.
-
- On some machines (particularly the Sparc), certain types of arguments
- are passed ``by invisible reference''. This means that the value is
- stored in memory, and the address of the memory location is passed to
- the subroutine.
-
- @cindex @code{longjmp} and automatic variables
- If you use @code{longjmp}, beware of automatic variables. ANSI C says that
- automatic variables that are not declared @code{volatile} have undefined
- values after a @code{longjmp}. And this is all GNU CC promises to do,
- because it is very difficult to restore register variables correctly, and
- one of GNU CC's features is that it can put variables in registers without
- your asking it to.
-
- If you want a variable to be unaltered by @code{longjmp}, and you don't
- want to write @code{volatile} because old C compilers don't accept it,
- just take the address of the variable. If a variable's address is ever
- taken, even if just to compute it and ignore it, then the variable cannot
- go in a register:
-
- @example
- @{
-   int careful;
-   &careful;
-   @dots{}
- @}
- @end example
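-
- For compilers that do accept @code{volatile}, a minimal sketch of the
- safe pattern looks like this. The names @code{env}, @code{jump_back}
- and @code{count_after_longjmp} are hypothetical; the variable is
- changed between the calls to @code{setjmp} and @code{longjmp}, which
- is the case where @code{volatile} matters.
-
```c
#include <setjmp.h>

static jmp_buf env;

/* Jump back to the setjmp call in count_after_longjmp.  */
static void
jump_back (void)
{
  longjmp (env, 1);
}

/* Because `count' is volatile, the value stored before the longjmp
   is still there after setjmp returns for the second time.  */
int
count_after_longjmp (void)
{
  volatile int count = 0;

  if (setjmp (env) == 0)
    {
      count = 1;        /* changed after setjmp, before longjmp */
      jump_back ();
    }
  return count;
}
```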
-
- @cindex arithmetic libraries
- @cindex math libraries
- Code compiled with GNU CC may call certain library routines. Most of
- them handle arithmetic for which there are no instructions. This
- includes multiply and divide on some machines, and floating point
- operations on any machine for which floating point support is disabled
- with @samp{-msoft-float}. Some standard parts of the C library, such as
- @code{bcopy} or @code{memcpy}, are also called automatically. The usual
- function call interface is used for calling the library routines.
-
- These library routines should be defined in the library @file{libgcc.a},
- which GNU CC automatically searches whenever it links a program. On
- machines that have multiply and divide instructions, if hardware
- floating point is in use, normally @file{libgcc.a} is not needed, but it
- is searched just in case.
-
- Each arithmetic function is defined in @file{libgcc1.c} to use the
- corresponding C arithmetic operator. As long as the file is compiled
- with another C compiler, which supports all the C arithmetic operators,
- this file will work portably. However, @file{libgcc1.c} does not work if
- compiled with GNU CC, because each arithmetic function would compile
- into a call to itself!
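-
- A function of the kind described might look like the sketch below.
- The name @code{soft_multiply} is a hypothetical stand-in, not the name
- of a real library routine. On a machine without a multiply
- instruction, GNU CC would compile the @code{*} operator into a call to
- the multiply routine, which is why such a file must be built with a
- different compiler.
-
```c
/* Hypothetical stand-in for a multiply support routine.  The body
   simply uses the C operator, so the compiler that builds this file
   must be able to implement `*' without calling this function.  */
long
soft_multiply (long a, long b)
{
  return a * b;
}
```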
- @end ifset
-
- @ifset INTERNALS
- @node Passes
- @chapter Passes and Files of the Compiler
- @cindex passes and files of the compiler
- @cindex files and passes of the compiler
- @cindex compiler passes and files
-
- @cindex top level of compiler
- The overall control structure of the compiler is in @file{toplev.c}. This
- file is responsible for initialization, decoding arguments, opening and
- closing files, and sequencing the passes.
-
- @cindex parsing pass
- The parsing pass is invoked only once, to parse the entire input. The RTL
- intermediate code for a function is generated as the function is parsed, a
- statement at a time. Each statement is read in as a syntax tree and then
- converted to RTL; then the storage for the tree for the statement is
- reclaimed. Storage for types (and the expressions for their sizes),
- declarations, and a representation of the binding contours and how they
- nest remains until the function is finished being compiled; all of
- these are needed to output the debugging information.
-
- @findex rest_of_compilation
- @findex rest_of_decl_compilation
- Each time the parsing pass reads a complete function definition or
- top-level declaration, it calls @code{rest_of_compilation} or
- @code{rest_of_decl_compilation}, respectively, in @file{toplev.c}.
- These functions are responsible for all further processing
- necessary, ending with output of the assembler language. All other
- compiler passes run, in sequence, within @code{rest_of_compilation}.
- When that function returns from compiling a function definition, the
- storage used for that function definition's compilation is entirely
- freed, unless it is an inline function (@pxref{Inline}).
-
- Here is a list of all the passes of the compiler and their source files.
- Also included is a description of where debugging dumps can be requested
- with @samp{-d} options.
-
- @itemize @bullet
- @item
- Parsing. This pass reads the entire text of a function definition,
- constructing partial syntax trees. This and RTL generation are no longer
- truly separate passes (formerly they were), but it is easier to think
- of them as separate.
-
- The tree representation does not entirely follow C syntax, because it is
- intended to support other languages as well.
-
- Language-specific data type analysis is also done in this pass, and every
- tree node that represents an expression has a data type attached.
- Variables are represented as declaration nodes.
-
- @cindex constant folding
- @cindex arithmetic simplifications
- @cindex simplifications, arithmetic
- Constant folding and some arithmetic simplifications are also done
- during this pass.
-
- The language-independent source files for parsing are
- @file{stor-layout.c}, @file{fold-const.c}, and @file{tree.c}.
- There are also header files @file{tree.h} and @file{tree.def}
- which define the format of the tree representation.@refill
-
- The source files for parsing C are @file{c-parse.y}, @file{c-decl.c},
- @file{c-typeck.c}, @file{c-convert.c}, @file{c-lang.c}, and
- @file{c-aux-info.c}, along with the header files @file{c-lex.h} and
- @file{c-tree.h}.
-
- The source files for parsing C++ are @file{cp-parse.y},
- @file{cp-class.c}, @file{cp-cvt.c},@*
- @file{cp-decl.c}, @file{cp-decl2.c},
- @file{cp-dem.c}, @file{cp-except.c},@*
- @file{cp-expr.c}, @file{cp-init.c}, @file{cp-lex.c},
- @file{cp-method.c}, @file{cp-ptree.c},@*
- @file{cp-search.c}, @file{cp-tree.c}, @file{cp-type2.c}, and
- @file{cp-typeck.c}, along with header files @file{cp-tree.def},
- @file{cp-tree.h}, and @file{cp-decl.h}.
-
- The special source files for parsing Objective C are
- @file{objc-parse.y}, @file{objc-actions.c}, @file{objc-tree.def}, and
- @file{objc-actions.h}. Certain C-specific files are used for this as
- well.
-
- The file @file{c-common.c} is also used for all of the above languages.
-
- @cindex RTL generation
- @item
- RTL generation. This is the conversion of syntax tree into RTL code.
- It is actually done statement-by-statement during parsing, but for
- most purposes it can be thought of as a separate pass.
-
- @cindex target-parameter-dependent code
- This is where the bulk of target-parameter-dependent code is found,
- since often it is necessary for strategies to apply only when certain
- standard kinds of instructions are available. The purpose of named
- instruction patterns is to provide this information to the RTL
- generation pass.
-
- @cindex tail recursion optimization
- Optimization is done in this pass for @code{if}-conditions that are
- comparisons, boolean operations or conditional expressions. Tail
- recursion is detected at this time also. Decisions are made about how
- best to arrange loops and how to output @code{switch} statements.
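- 
- The effect of tail recursion elimination can be pictured at the source
- level. This is only a sketch: the compiler works on RTL, not on C
- text, and the function names below are hypothetical. The second
- function shows roughly what the first one turns into once the
- recursive call is replaced by parameter updates and a jump.
- 
```c
#include <assert.h>

/* As written: the recursive call is the last action of the function.  */
long
toy_sum_recursive (int n, long acc)
{
  assert (n >= 0);
  if (n == 0)
    return acc;
  return toy_sum_recursive (n - 1, acc + n);
}

/* After the transformation: the call has become assignments to the
   parameters followed by a jump back to the top of the function.  */
long
toy_sum_loop (int n, long acc)
{
  assert (n >= 0);
top:
  if (n == 0)
    return acc;
  acc += n;
  n -= 1;
  goto top;
}

/* Check that both versions agree on a range of inputs.  */
void
toy_check (void)
{
  int n;
  for (n = 0; n <= 20; n++)
    assert (toy_sum_recursive (n, 0) == toy_sum_loop (n, 0));
}
```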
-
- The source files for RTL generation include @file{stmt.c},
- @file{function.c}, @file{expr.c}, @file{calls.c}, @file{explow.c},
- @file{expmed.c}, @file{optabs.c} and @file{emit-rtl.c}. Also, the file
- @file{insn-emit.c}, generated from the machine description by the
- program @code{genemit}, is used in this pass. The header file
- @file{expr.h} is used for communication within this pass.@refill
-
- @findex genflags
- @findex gencodes
- The header files @file{insn-flags.h} and @file{insn-codes.h},
- generated from the machine description by the programs @code{genflags}
- and @code{gencodes}, tell this pass which standard names are available
- for use and which patterns correspond to them.@refill
-
- Aside from debugging information output, none of the following passes
- refers to the tree structure representation of the function (only
- part of which is saved).
-
- @cindex inline, automatic
- The decision of whether the function can and should be expanded inline
- in its subsequent callers is made at the end of RTL generation. The
- function must meet certain criteria, currently related to the size of
- the function and the types and number of parameters it has. Note that
- this function may contain loops, recursive calls to itself
- (tail-recursive functions can be inlined!), and gotos; in short, all
- constructs supported by GNU CC. The file @file{integrate.c} contains
- the code to save a function's RTL for later inlining and to inline that
- RTL when the function is called. The header file @file{integrate.h}
- is also used for this purpose.
-
- The option @samp{-dr} causes a debugging dump of the RTL code after
- this pass. This dump file's name is made by appending @samp{.rtl} to
- the input file name.
-
- @cindex jump optimization
- @cindex unreachable code
- @cindex dead code
- @item
- Jump optimization. This pass simplifies jumps to the following
- instruction, jumps across jumps, and jumps to jumps. It deletes
- unreferenced labels and unreachable code, except that unreachable code
- that contains a loop is not recognized as unreachable in this pass.
- (Such loops are deleted later in the basic block analysis.) It also
- converts some code originally written with jumps into sequences of
- instructions that directly set values from the results of comparisons,
- if the machine has such instructions.
-
- Jump optimization is performed two or three times. The first time is
- immediately following RTL generation. The second time is after CSE,
- but only if CSE says repeated jump optimization is needed. The
- last time is right before the final pass. That time, cross-jumping
- and deletion of no-op move instructions are done together with the
- optimizations described above.
-
- The source file of this pass is @file{jump.c}.
-
- The option @samp{-dj} causes a debugging dump of the RTL code after
- this pass is run for the first time. This dump file's name is made by
- appending @samp{.jump} to the input file name.
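- 
- The simplification of jumps to jumps can be sketched on a toy
- instruction array. Everything here (@code{struct toy_insn},
- @code{toy_collapse_jump_chains}) is a hypothetical stand-in, assuming
- that every jump target is a valid index; @file{jump.c} operates on RTL
- insn chains and handles many more cases.
- 
```c
#include <assert.h>

/* Hypothetical toy instruction: either an unconditional jump or
   anything else.  */
enum toy_kind { TOY_OTHER, TOY_JUMP };

struct toy_insn
{
  enum toy_kind kind;
  int target;   /* Index of the destination; used only by TOY_JUMP.  */
};

/* Follow a chain of unconditional jumps beginning at index START and
   return the index where the chain ends.  The step limit N keeps a
   cycle of jumps from looping forever.  */
static int
toy_final_target (const struct toy_insn *code, int n, int start)
{
  int pc = start, steps = 0;
  assert (start >= 0 && start < n);
  while (code[pc].kind == TOY_JUMP && steps++ < n)
    pc = code[pc].target;
  return pc;
}

/* Redirect every jump whose destination is itself a jump so that it
   goes straight to the end of the chain.  Returns the number of jumps
   that were changed.  */
int
toy_collapse_jump_chains (struct toy_insn *code, int n)
{
  int i, changed = 0;
  for (i = 0; i < n; i++)
    if (code[i].kind == TOY_JUMP)
      {
        int t = toy_final_target (code, n, code[i].target);
        if (t != code[i].target)
          {
            code[i].target = t;
            changed++;
          }
      }
  return changed;
}
```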
-
- @cindex register use analysis
- @item
- Register scan. This pass finds the first and last use of each
- register, as a guide for common subexpression elimination. Its source
- is in @file{regclass.c}.
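- 
- The kind of information gathered by a register scan might be computed
- along the following lines. This is a minimal sketch on a hypothetical
- instruction record (@code{struct toy_insn}); it does not reflect the
- data structures actually used in @file{regclass.c}.
- 
```c
#include <assert.h>

#define TOY_MAX_REGS_PER_INSN 3

/* Hypothetical instruction record listing the registers it mentions.  */
struct toy_insn
{
  int nregs;                        /* Number of entries used in REGS.  */
  int regs[TOY_MAX_REGS_PER_INSN];  /* Registers mentioned by the insn.  */
};

/* For each register 0..NREGS-1, store in FIRST_USE and LAST_USE the
   index of the first and last instruction that mentions it, or -1 if
   the register is never mentioned.  */
void
toy_reg_scan (const struct toy_insn *insns, int ninsns, int nregs,
              int *first_use, int *last_use)
{
  int i, j, r;

  for (r = 0; r < nregs; r++)
    first_use[r] = last_use[r] = -1;

  for (i = 0; i < ninsns; i++)
    for (j = 0; j < insns[i].nregs; j++)
      {
        r = insns[i].regs[j];
        assert (r >= 0 && r < nregs);
        if (first_use[r] < 0)
          first_use[r] = i;
        last_use[r] = i;
      }
}
```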
-
- @cindex jump threading
- @item
- Jump threading. This pass detects a conditional jump that branches to an
- identical or inverse test. Such jumps can be @samp{threaded} through
- the second conditional test. The source code for this pass is in
- @file{jump.c}. This optimization is only performed if
- @samp{-fthread-jumps} is enabled.
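- 
- A source-level picture of jump threading, assuming two consecutive
- tests of the same condition, is given below. The function names are
- hypothetical, and the second function is written with explicit
- @code{goto} statements only to make the redirected jumps visible; the
- pass itself works on RTL.
- 
```c
#include <assert.h>

/* As written: both if statements test the same condition, so each arm
   of the first one ends by branching to an identical test.  */
int
toy_before (int a)
{
  int x, y;
  if (a > 5)
    x = 1;
  else
    x = 2;
  if (a > 5)
    y = 10;
  else
    y = 20;
  return x + y;
}

/* After threading: each arm of the first test goes directly to the
   matching arm of the second, which is never evaluated again.  */
int
toy_after (int a)
{
  int x, y;
  if (a > 5)
    goto first_true;
  x = 2;
  goto second_false;    /* The second test is known to fail here.  */
first_true:
  x = 1;
  y = 10;               /* The second test is known to succeed here.  */
  goto done;
second_false:
  y = 20;
done:
  return x + y;
}

/* Check that both versions agree on a range of inputs.  */
void
toy_check (void)
{
  int a;
  for (a = 0; a <= 10; a++)
    assert (toy_before (a) == toy_after (a));
}
```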
-
- @cindex common subexpression elimination
- @cindex constant propagation
- @item
- Common subexpression elimination. This pass also does constant
- propagation. Its source file is @file{cse.c}. If constant
- propagation causes conditional jumps to become unconditional or to
- become no-ops, jump optimization is run again when CSE is finished.
-
- The option @samp{-ds} causes a debugging dump of the RTL code after
- this pass. This dump file's name is made by appending @samp{.cse} to
- the input file name.
-
- @cindex loop optimization
- @cindex code motion
- @cindex strength-reduction
- @item
- Loop optimization. This pass moves constant expressions out of loops,
- and optionally does strength-reduction and loop unrolling as well.
- Its source files are @file{loop.c} and @file{unroll.c}, plus the header
- @file{loop.h} used for communication between them. Loop unrolling uses
- some functions in @file{integrate.c} and the header @file{integrate.h}.
-
- The option @samp{-dL} causes a debugging dump of the RTL code after
- this pass. This dump file's name is made by appending @samp{.loop} to
- the input file name.
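- 
- Moving a constant expression out of a loop and strength reduction can
- both be pictured at the source level. The functions below are
- hypothetical and only sketch the idea; @file{loop.c} performs these
- transformations on RTL and decides for itself when they pay off.
- 
```c
#include <assert.h>
#include <stddef.h>

/* As written: scale * 2 does not change inside the loop, and
   i * stride costs a multiplication on every iteration.  */
long
toy_sum_before (const long *a, int n, int stride, long scale)
{
  long sum = 0;
  int i;
  assert (a != NULL || n <= 0);
  for (i = 0; i < n; i++)
    sum += a[i * stride] * (scale * 2);
  return sum;
}

/* After optimization: the invariant product is computed once, and the
   index is advanced by addition instead of being recomputed.  */
long
toy_sum_after (const long *a, int n, int stride, long scale)
{
  long sum = 0;
  long factor = scale * 2;  /* Invariant, computed before the loop.  */
  int index = 0;            /* Takes the successive values of i * stride.  */
  int i;
  assert (a != NULL || n <= 0);
  for (i = 0; i < n; i++)
    {
      sum += a[index] * factor;
      index += stride;      /* An addition replaces the multiplication.  */
    }
  return sum;
}

/* Check that both versions compute the same result.  */
void
toy_check (void)
{
  static const long data[6] = { 1, 2, 3, 4, 5, 6 };
  assert (toy_sum_before (data, 3, 2, 3) == 54);
  assert (toy_sum_after (data, 3, 2, 3) == 54);
}
```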
-
- @item
- If @samp{-frerun-cse-after-loop} was enabled, a second common
- subexpression elimination pass is performed after the loop optimization
- pass. Jump threading is also done again at this time if it was specified.
-
- The option @samp{-dt} causes a debugging dump of the RTL code after
- this pass. This dump file's name is made by appending @samp{.cse2} to
- the input file name.
-
- @cindex register allocation, stupid
- @cindex stupid register allocation
- @item
- Stupid register allocation is performed at this point in a
- nonoptimizing compilation. It does a little data flow analysis as
- well. When stupid register allocation is in use, the next pass
- executed is the reloading pass; the others in between are skipped.
- The source file is @file{stupid.c}.
-
- @cindex data flow analysis
- @cindex analysis, data flow
- @cindex basic blocks
- @item
- Data flow analysis (@file{flow.c}). This pass divides the program
- into basic blocks (and in the process deletes unreachable loops); then
- it computes which pseudo-registers are live at each point in the
- program, and makes the first instruction that uses a value point at
- the instruction that computed the value.
-
- @cindex autoincrement/decrement analysis
- This pass also deletes computations whose results are never used, and
- combines memory references with add or subtract instructions to make
- autoincrement or autodecrement addressing.
-
- The option @samp{-df} causes a debugging dump of the RTL code after
- this pass. This dump file's name is made by appending @samp{.flow} to
- the input file name. If stupid register allocation is in use, this
- dump file reflects the full results of such allocation.
-
- @cindex instruction combination
- @item
- Instruction combination (@file{combine.c}). This pass attempts to
- combine groups of two or three instructions that are related by data
- flow into single instructions. It combines the RTL expressions for
- the instructions by substitution, simplifies the result using algebra,
- and then attempts to match the result against the machine description.
-
- The option @samp{-dc} causes a debugging dump of the RTL code after
- this pass. This dump file's name is made by appending @samp{.combine}
- to the input file name.
-
- @cindex instruction scheduling
- @cindex scheduling, instruction
- @item
- Instruction scheduling (@file{sched.c}). This pass looks for
- instructions whose output will not be available by the time that it is
- used in subsequent instructions. (Memory loads and floating point
- instructions often have this behavior on RISC machines.) It re-orders
- instructions within a basic block to try to separate the definition and
- use of items that otherwise would cause pipeline stalls.
-
- Instruction scheduling is performed twice. The first time is immediately
- after instruction combination and the second is immediately after reload.
-
- The option @samp{-dS} causes a debugging dump of the RTL code after this
- pass is run for the first time. The dump file's name is made by
- appending @samp{.sched} to the input file name.
-
- @cindex register class preference pass
- @item
- Register class preferencing. The RTL code is scanned to find out
- which register class is best for each pseudo register. The source
- file is @file{regclass.c}.
-
- @cindex register allocation
- @cindex local register allocation
- @item
- Local register allocation (@file{local-alloc.c}). This pass allocates
- hard registers to pseudo registers that are used only within one basic
- block. Because the basic block is linear, it can use fast and
- powerful techniques to do a very good job.
-
- The option @samp{-dl} causes a debugging dump of the RTL code after
- this pass. This dump file's name is made by appending @samp{.lreg} to
- the input file name.
-
- @cindex global register allocation
- @item
- Global register allocation (@file{global-alloc.c}). This pass
- allocates hard registers for the remaining pseudo registers (those
- whose life spans are not contained in one basic block).
-
- @cindex reloading
- @item
- Reloading. This pass renumbers pseudo registers with the hardware
- register numbers they were allocated. Pseudo registers that did not
- get hard registers are replaced with stack slots. Then it finds
- instructions that are invalid because a value has failed to end up in
- a register, or has ended up in a register of the wrong kind. It fixes
- up these instructions by reloading the problematical values
- temporarily into registers. Additional instructions are generated to
- do the copying.
-
- The reload pass also optionally eliminates the frame pointer and inserts
- instructions to save and restore call-clobbered registers around calls.
-
- Source files are @file{reload.c} and @file{reload1.c}, plus the header
- @file{reload.h} used for communication between them.
-
- The option @samp{-dg} causes a debugging dump of the RTL code after
- this pass. This dump file's name is made by appending @samp{.greg} to
- the input file name.
-
- @cindex instruction scheduling
- @cindex scheduling, instruction
- @item
- Instruction scheduling is repeated here to try to avoid pipeline stalls
- due to memory loads generated for spilled pseudo registers.
-
- The option @samp{-dR} causes a debugging dump of the RTL code after
- this pass. This dump file's name is made by appending @samp{.sched2}
- to the input file name.
-
- @cindex cross-jumping
- @cindex no-op move instructions
- @item
- Jump optimization is repeated, this time including cross-jumping
- and deletion of no-op move instructions.
-
- The option @samp{-dJ} causes a debugging dump of the RTL code after
- this pass. This dump file's name is made by appending @samp{.jump2}
- to the input file name.
-
- @cindex delayed branch scheduling
- @cindex scheduling, delayed branch
- @item
- Delayed branch scheduling. This optional pass attempts to find
- instructions that can go into the delay slots of other instructions,
- usually jumps and calls. The source file name is @file{reorg.c}.
-
- The option @samp{-dd} causes a debugging dump of the RTL code after
- this pass. This dump file's name is made by appending @samp{.dbr}
- to the input file name.
-
- @cindex register-to-stack conversion
- @item
- Conversion from usage of some hard registers to usage of a register
- stack may be done at this point. Currently, this is supported only
- for the floating-point registers of the Intel 80387 coprocessor. The
- source file name is @file{reg-stack.c}.
-
- The option @samp{-dk} causes a debugging dump of the RTL code after
- this pass. This dump file's name is made by appending @samp{.stack}
- to the input file name.
-
- @cindex final pass
- @cindex peephole optimization
- @item
- Final. This pass outputs the assembler code for the function. It is
- also responsible for identifying spurious test and compare
- instructions. Machine-specific peephole optimizations are performed
- at the same time. The function entry and exit sequences are generated
- directly as assembler code in this pass; they never exist as RTL.
-
- The source files are @file{final.c} plus @file{insn-output.c}; the
- latter is generated automatically from the machine description by the
- tool @code{genoutput}. The header file @file{conditions.h} is used
- for communication between these files.
-
- @cindex debugging information generation
- @item
- Debugging information output. This is run after final because it must
- output the stack slot offsets for pseudo registers that did not get
- hard registers. Source files are @file{dbxout.c} for DBX symbol table
- format, @file{sdbout.c} for SDB symbol table format, and
- @file{dwarfout.c} for DWARF symbol table format.
- @end itemize
-
- Some additional files are used by all or many passes:
-
- @itemize @bullet
- @item
- Every pass uses @file{machmode.def} and @file{machmode.h} which define
- the machine modes.
-
- @item
- Several passes use @file{real.h}, which defines the default
- representation of floating point constants and how to operate on them.
-
- @item
- All the passes that work with RTL use the header files @file{rtl.h}
- and @file{rtl.def}, and subroutines in file @file{rtl.c}. The tools
- @code{gen*} also use these files to read and work with the machine
- description RTL.
-
- @findex genconfig
- @item
- Several passes refer to the header file @file{insn-config.h} which
- contains a few parameters (C macro definitions) generated
- automatically from the machine description RTL by the tool
- @code{genconfig}.
-
- @cindex instruction recognizer
- @item
- Several passes use the instruction recognizer, which consists of
- @file{recog.c} and @file{recog.h}, plus the files @file{insn-recog.c}
- and @file{insn-extract.c} that are generated automatically from the
- machine description by the tools @code{genrecog} and
- @code{genextract}.@refill
-
- @item
- Several passes use the header files @file{regs.h}, which defines the
- information recorded about pseudo register usage, and @file{basic-block.h},
- which defines the information recorded about basic blocks.
-
- @item
- @file{hard-reg-set.h} defines the type @code{HARD_REG_SET}, a bit-vector
- with a bit for each hard register, and some macros to manipulate it.
- This type is just @code{int} if the machine has few enough hard registers;
- otherwise it is an array of @code{int} and some of the macros expand
- into loops.
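- 
- The array case can be sketched as follows, assuming a machine with 70
- hard registers. All of the @code{TOY_} names and @code{toy_mark_reg}
- are hypothetical and are not the macros of @file{hard-reg-set.h}; the
- sketch also uses @code{unsigned int} words so that the shifts are well
- defined.
- 
```c
#include <assert.h>
#include <limits.h>

/* Hypothetical register count, too large for a single word.  */
#define TOY_NUM_HARD_REGS 70
#define TOY_BITS_PER_WORD ((int) (sizeof (unsigned int) * CHAR_BIT))
#define TOY_SET_WORDS \
  ((TOY_NUM_HARD_REGS + TOY_BITS_PER_WORD - 1) / TOY_BITS_PER_WORD)

/* One bit per hard register, spread over several words.  */
typedef unsigned int toy_reg_set[TOY_SET_WORDS];

#define TOY_SET_BIT(SET, REG) \
  ((SET)[(REG) / TOY_BITS_PER_WORD] |= 1u << ((REG) % TOY_BITS_PER_WORD))

#define TOY_TEST_BIT(SET, REG) \
  (((SET)[(REG) / TOY_BITS_PER_WORD] >> ((REG) % TOY_BITS_PER_WORD)) & 1u)

/* Whole-set operations must expand into loops over the words.  */
#define TOY_CLEAR_SET(SET)                              \
  do {                                                  \
    int toy_i;                                          \
    for (toy_i = 0; toy_i < TOY_SET_WORDS; toy_i++)     \
      (SET)[toy_i] = 0;                                 \
  } while (0)

#define TOY_IOR_SET(TO, FROM)                           \
  do {                                                  \
    int toy_i;                                          \
    for (toy_i = 0; toy_i < TOY_SET_WORDS; toy_i++)     \
      (TO)[toy_i] |= (FROM)[toy_i];                     \
  } while (0)

/* Add REG to SET after checking that it is a valid register number.  */
void
toy_mark_reg (toy_reg_set set, int reg)
{
  assert (reg >= 0 && reg < TOY_NUM_HARD_REGS);
  TOY_SET_BIT (set, reg);
}
```
- 
- With few enough registers the same interface could be kept while the
- type shrinks to a single word and the loops disappear.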
-
- @item
- Several passes use instruction attributes. A definition of the
- attributes defined for a particular machine is in file
- @file{insn-attr.h}, which is generated from the machine description by
- the program @file{genattr}. The file @file{insn-attrtab.c} contains
- subroutines to obtain the attribute values for insns. It is generated
- from the machine description by the program @file{genattrtab}.@refill
- @end itemize
- @end ifset
-
- @include rtl.texi
- @include md.texi
- @include tm.texi
-
- @ifset INTERNALS
- @node Config
- @chapter The Configuration File
- @cindex configuration file
- @cindex @file{xm-@var{machine}.h}
-
- The configuration file @file{xm-@var{machine}.h} contains macro
- definitions that describe the machine and system on which the compiler
- is running, unlike the definitions in @file{@var{machine}.h}, which
- describe the machine for which the compiler is producing output. Most
- of the values in @file{xm-@var{machine}.h} are actually the same on all
- machines that GNU CC runs on, so large parts of all configuration files
- are identical. But there are some macros that vary:
-
- @table @code
- @findex USG
- @item USG
- Define this macro if the host system is System V.
-
- @findex VMS
- @item VMS
- Define this macro if the host system is VMS.
-
- @findex FAILURE_EXIT_CODE
- @item FAILURE_EXIT_CODE
- A C expression for the status code to be returned when the compiler
- exits after serious errors.
-
- @findex SUCCESS_EXIT_CODE
- @item SUCCESS_EXIT_CODE
- A C expression for the status code to be returned when the compiler
- exits without serious errors.
-
- @findex HOST_WORDS_BIG_ENDIAN
- @item HOST_WORDS_BIG_ENDIAN
- Defined if the host machine stores words of multi-word values in
- big-endian order. (GNU CC does not depend on the host byte ordering
- within a word.)
-
- @findex HOST_FLOAT_FORMAT
- @item HOST_FLOAT_FORMAT
- A numeric code distinguishing the floating point format for the host
- machine. See @code{TARGET_FLOAT_FORMAT} in @ref{Storage Layout} for the
- alternatives and default.
-
- @findex HOST_BITS_PER_CHAR
- @item HOST_BITS_PER_CHAR
- A C expression for the number of bits in @code{char} on the host
- machine.
-
- @findex HOST_BITS_PER_SHORT
- @item HOST_BITS_PER_SHORT
- A C expression for the number of bits in @code{short} on the host
- machine.
-
- @findex HOST_BITS_PER_INT
- @item HOST_BITS_PER_INT
- A C expression for the number of bits in @code{int} on the host
- machine.
-
- @findex HOST_BITS_PER_LONG
- @item HOST_BITS_PER_LONG
- A C expression for the number of bits in @code{long} on the host
- machine.
-
- @findex ONLY_INT_FIELDS
- @item ONLY_INT_FIELDS
- Define this macro to indicate that the host compiler supports only
- @code{int} bit fields. Most C compilers also accept bit fields of
- other integral types, including @code{enum}.
-
- @findex EXECUTABLE_SUFFIX
- @item EXECUTABLE_SUFFIX
- Define this macro if the host system uses a naming convention for
- executable files that involves a common suffix (such as, in some
- systems, @samp{.exe}) that must be mentioned explicitly when you run
- the program.
-
- @findex OBSTACK_CHUNK_SIZE
- @item OBSTACK_CHUNK_SIZE
- A C expression for the size of ordinary obstack chunks.
- If you don't define this, a usually-reasonable default is used.
-
- @findex OBSTACK_CHUNK_ALLOC
- @item OBSTACK_CHUNK_ALLOC
- The function used to allocate obstack chunks.
- If you don't define this, @code{xmalloc} is used.
-
- @findex OBSTACK_CHUNK_FREE
- @item OBSTACK_CHUNK_FREE
- The function used to free obstack chunks.
- If you don't define this, @code{free} is used.
-
- @findex USE_C_ALLOCA
- @item USE_C_ALLOCA
- Define this macro to indicate that the compiler is running with the
- @code{alloca} implemented in C. This version of @code{alloca} can be
- found in the file @file{alloca.c}; to use it, you must also alter the
- @file{Makefile} variable @code{ALLOCA}. (This is done automatically
- for the systems on which we know it is needed.)
-
- If you do define this macro, you should probably do it as follows:
-
- @example
- #ifndef __GNUC__
- #define USE_C_ALLOCA
- #else
- #define alloca __builtin_alloca
- #endif
- @end example
-
- @noindent
- so that when the compiler is compiled with GNU CC it uses the more
- efficient built-in @code{alloca} function.
-
- @item FUNCTION_CONVERSION_BUG
- @findex FUNCTION_CONVERSION_BUG
- Define this macro to indicate that the host compiler does not properly
- handle converting a function value to a pointer-to-function when it is
- used in an expression.
-
- @findex HAVE_VPRINTF
- @findex vprintf
- @item HAVE_VPRINTF
- Define this if the library function @code{vprintf} is available on your
- system.
-
- @findex MULTIBYTE_CHARS
- @item MULTIBYTE_CHARS
- Define this macro to enable support for multibyte characters in the
- input to GNU CC. This requires that the host system support the ANSI C
- library functions for converting multibyte characters to wide
- characters.
-
- @findex HAVE_PUTENV
- @findex putenv
- @item HAVE_PUTENV
- Define this if the library function @code{putenv} is available on your
- system.
-
- @findex NO_SYS_SIGLIST
- @item NO_SYS_SIGLIST
- Define this if your system @emph{does not} provide the variable
- @code{sys_siglist}.
-
- @vindex sys_siglist
- Some systems do provide this variable, but with a different name such
- as @code{_sys_siglist}. On these systems, you can define
- @code{sys_siglist} as a macro which expands into the name actually
- provided.
-
- @findex NO_STAB_H
- @item NO_STAB_H
- Define this if your system does not have the include file
- @file{stab.h}. If @samp{USG} is defined, @samp{NO_STAB_H} is
- assumed.
- @end table
-
- @findex bzero
- @findex bcmp
- In addition, configuration files for System V define @code{bcopy},
- @code{bzero} and @code{bcmp} as aliases. Some files define @code{alloca}
- as a macro when compiled with GNU CC, in order to take advantage of
- GNU CC's built-in @code{alloca}.
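- 
- As a rough illustration, a configuration file for an imaginary System V
- host might contain definitions like these. The file name
- @file{xm-example.h} and every value shown are assumptions chosen for
- the example, not the contents of any real configuration file.
- 
```c
/* xm-example.h: hypothetical host configuration for an imaginary
   System V machine with 32-bit int and long.  All values here are
   assumptions made for illustration.  */

/* Sizes of the host's basic types, in bits.  */
#define HOST_BITS_PER_CHAR 8
#define HOST_BITS_PER_SHORT 16
#define HOST_BITS_PER_INT 32
#define HOST_BITS_PER_LONG 32

/* Exit status after a clean run and after serious errors.  */
#define SUCCESS_EXIT_CODE 0
#define FAILURE_EXIT_CODE 1

/* The host is System V, so NO_STAB_H is assumed as well.  */
#define USG

/* The host C library provides vprintf.  */
#define HAVE_VPRINTF

/* The library calls its signal-name table _sys_siglist.  */
#define sys_siglist _sys_siglist
```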
-
-
- @node Index
- @unnumbered Index
- @end ifset
-
- @ifclear INTERNALS
- @node Index
- @unnumbered Index
- @end ifclear
-
- @printindex cp
- @contents
- @bye
-