Text File | 1993-11-24 | 143.0 KB | 3,233 lines |
-
- From: scs@adam.mit.edu (Steve Summit)
- To: Matthew.Hunt@p2.f3.n2616.z1.fidonet.org
- Date: Thu, 20 May 93 09:03:42 -0400
-
- Here is the latest version of the comp.lang.c FAQ list. You
- might also want to watch for newer versions in comp.lang.c, near
- the first of each month (see also the last question).
-
- Newsgroups: comp.lang.c,comp.answers,news.answers
- From: scs@adam.mit.edu (Steve Summit)
- Subject: comp.lang.c Answers to Frequently Asked Questions (FAQ List)
- Message-ID: <1s387bINNrot@senator-bedfellow.MIT.EDU>
- Followup-To: poster
- Supersedes: <1pdstkINN3aj@senator-bedfellow.MIT.EDU>
- Reply-To: scs@adam.mit.edu
- X-Archive-Name: C-faq/faq
- Date: 3 May 1993 13:54:51 GMT
- X-Last-Modified: May 2, 1993
- Expires: 3 Jun 1993 00:00:00 GMT
- Lines: 3210
-
- [Last modified May 2, 1993 by scs.]
-
- Certain topics come up again and again on this newsgroup. They are good
- questions, and the answers may not be immediately obvious, but each time
- they recur, much net bandwidth and reader time is wasted on repetitive
- responses, and on tedious corrections to the incorrect answers which are
- inevitably posted.
-
- This article, which is posted monthly, attempts to answer these common
- questions definitively and succinctly, so that net discussion can move
- on to more constructive topics without continual regression to first
- principles.
-
- No mere newsgroup article can substitute for thoughtful perusal of a
- full-length tutorial or language reference manual. Anyone interested
- enough in C to be following this newsgroup should also be interested
- enough to read and study one or more such manuals, preferably several
- times. Some vendors' compiler manuals are unfortunately inadequate; a
- few even perpetuate some of the myths which this article attempts to
- refute. Several noteworthy books on C are listed in this article's
- bibliography. Many of the questions and answers are cross-referenced to
- these books, for further study by the interested and dedicated reader
- (but beware of ANSI vs. ISO C Standard section numbers; see question
- 5.1).
-
- If you have a question about C which is not answered in this article,
- first try to answer it by checking a few of the referenced books, or by
- asking knowledgeable colleagues, before posing your question to the net
- at large. There are many people on the net who are happy to answer
- questions, but the volume of repetitive answers posted to one question,
- as well as the growing number of questions as the net attracts more
- readers, can become oppressive. If you have questions or comments
- prompted by this article, please reply by mail rather than following up
- -- this article is meant to decrease net traffic, not increase it.
-
- Besides listing frequently-asked questions, this article also summarizes
- frequently-posted answers. Even if you know all the answers, it's worth
- skimming through this list once in a while, so that when you see one of
- its questions unwittingly posted, you won't have to waste time
- answering.
-
- This article is always being improved. Your input is welcomed. Send
- your comments to scs@adam.mit.edu, scs%adam.mit.edu@mit.edu, and/or
- mit-eddie!adam.mit.edu!scs; this article's From: line may be unusable.
-
- The questions answered here are divided into several categories:
-
- 1. Null Pointers
- 2. Arrays and Pointers
- 3. Memory Allocation
- 4. Expressions
- 5. ANSI C
- 6. C Preprocessor
- 7. Variable-Length Argument Lists
- 8. Boolean Expressions and Variables
- 9. Structs, Enums, and Unions
- 10. Declarations
- 11. Stdio
- 12. Library Subroutines
- 13. Lint
- 14. Style
- 15. Floating Point
- 16. System Dependencies
- 17. Miscellaneous (Fortran to C converters, YACC grammars, etc.)
-
- Herewith, some frequently-asked questions and their answers:
-
-
- Section 1. Null Pointers
-
- 1.1: What is this infamous null pointer, anyway?
-
- A: The language definition states that for each pointer type, there
- is a special value -- the "null pointer" -- which is
- distinguishable from all other pointer values and which is not
- the address of any object. That is, the address-of operator &
- will never yield a null pointer, nor will a successful call to
- malloc. (malloc returns a null pointer when it fails, and this
- is a typical use of null pointers: as a "special" pointer value
- with some other meaning, usually "not allocated" or "not
- pointing anywhere yet.")
-
- A null pointer is conceptually different from an uninitialized
- pointer. A null pointer is known not to point to any object; an
- uninitialized pointer might point anywhere. See also questions
- 3.1, 3.11, and 17.1.
-
- As mentioned in the definition above, there is a null pointer
- for each pointer type, and the internal values of null pointers
- for different types may be different. Although programmers need
- not know the internal values, the compiler must always be
- informed which type of null pointer is required, so it can make
- the distinction if necessary (see below).
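As a minimal sketch of the "not allocated" use of a null pointer described above, the helper below (copy_string is a hypothetical name, not part of any library) passes malloc's failure indication straight back to its caller, who must test the result before using it:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a freshly allocated copy of s,
 * or a null pointer if malloc fails. */
char *copy_string(const char *s)
{
    char *p = malloc(strlen(s) + 1);   /* +1 for the terminating '\0' */

    if (p == NULL)                     /* malloc failed: nothing was allocated */
        return NULL;
    strcpy(p, s);
    return p;
}
```

A caller would write something like "if((q = copy_string(str)) == NULL)" and handle the failure before dereferencing q.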
-
- References: K&R I Sec. 5.4 pp. 97-8; K&R II Sec. 5.4 p. 102; H&S
- Sec. 5.3 p. 91; ANSI Sec. 3.2.2.3 p. 38.
-
- 1.2: How do I "get" a null pointer in my programs?
-
- A: According to the language definition, a constant 0 in a pointer
- context is converted into a null pointer at compile time. That
- is, in an initialization, assignment, or comparison when one
- side is a variable or expression of pointer type, the compiler
- can tell that a constant 0 on the other side requests a null
- pointer, and generate the correctly-typed null pointer value.
- Therefore, the following fragments are perfectly legal:
-
- char *p = 0;
- if(p != 0)
-
- However, an argument being passed to a function is not
- necessarily recognizable as a pointer context, and the compiler
- may not be able to tell that an unadorned 0 "means" a null
- pointer. For instance, the Unix system call "execl" takes a
- variable-length, null-pointer-terminated list of character
- pointer arguments. To generate a null pointer in a function
- call context, an explicit cast is typically required, to force
- the 0 to be in a pointer context:
-
- execl("/bin/sh", "sh", "-c", "ls", (char *)0);
-
- If the (char *) cast were omitted, the compiler would not know
- to pass a null pointer, and would pass an integer 0 instead.
- (Note that many Unix manuals get this example wrong.)
-
- When function prototypes are in scope, argument passing becomes
- an "assignment context," and most casts may safely be omitted,
- since the prototype tells the compiler that a pointer is
- required, and of which type, enabling it to correctly convert
- unadorned 0's. Function prototypes cannot provide the types for
- variable arguments in variable-length argument lists, however,
- so explicit casts are still required for those arguments. It is
- safest always to cast null pointer function arguments, to guard
- against varargs functions or those without prototypes, to allow
- interim use of non-ANSI compilers, and to demonstrate that you
- know what you are doing. (Incidentally, it's also a simpler
- rule to remember.)
-
- Summary:
-
-     Unadorned 0 okay:        Explicit cast required:
-
-     initialization           function call,
-                              no prototype in scope
-     assignment
-                              variable argument in
-     comparison               varargs function call
-
-     function call,
-     prototype in scope,
-     fixed argument
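The varargs case can be sketched with a small function of our own, which avoids depending on execl. The name count_strings is hypothetical. Like execl, it takes a list of character pointers terminated by a null pointer, and since the prototype says nothing about the variable arguments, the terminator has to be cast by the caller:

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical example: counts the char * arguments that precede
 * the terminating null pointer. */
int count_strings(const char *first, ...)
{
    va_list ap;
    const char *s;
    int n = 0;

    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        n++;
    va_end(ap);
    return n;
}
```

A correct call looks like count_strings("a", "b", "c", (char *)0). With the cast omitted, an int 0 would be passed where va_arg expects a char *, and the behavior would be undefined.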
-
- References: K&R I Sec. A7.7 p. 190, Sec. A7.14 p. 192; K&R II
- Sec. A7.10 p. 207, Sec. A7.17 p. 209; H&S Sec. 4.6.3 p. 72; ANSI
- Sec. 3.2.2.3 .
-
- 1.3: What is NULL and how is it #defined?
-
- A: As a matter of style, many people prefer not to have unadorned
- 0's scattered throughout their programs. For this reason, the
- preprocessor macro NULL is #defined (by <stdio.h> or
- <stddef.h>), with value 0 (or (void *)0, about which more
- later). A programmer who wishes to make explicit the
- distinction between 0 the integer and 0 the null pointer can
- then use NULL whenever a null pointer is required. This is a
- stylistic convention only; the preprocessor turns NULL back to 0
- which is then recognized by the compiler (in pointer contexts)
- as before. In particular, a cast may still be necessary before
- NULL (as before 0) in a function call argument. (The table
- under question 1.2 above applies for NULL as well as 0.)
-
- NULL should _only_ be used for pointers; see question 1.8.
-
- References: K&R I Sec. 5.4 pp. 97-8; K&R II Sec. 5.4 p. 102; H&S
- Sec. 13.1 p. 283; ANSI Sec. 4.1.5 p. 99, Sec. 3.2.2.3 p. 38,
- Rationale Sec. 4.1.5 p. 74.
-
- 1.4: How should NULL be #defined on a machine which uses a nonzero
- bit pattern as the internal representation of a null pointer?
-
- A: Programmers should never need to know the internal
- representation(s) of null pointers, because they are normally
- taken care of by the compiler. If a machine uses a nonzero bit
- pattern for null pointers, it is the compiler's responsibility
- to generate it when the programmer requests, by writing "0" or
- "NULL," a null pointer. Therefore, #defining NULL as 0 on a
- machine for which internal null pointers are nonzero is as valid
- as on any other, because the compiler must (and can) still
- generate the machine's correct null pointers in response to
- unadorned 0's seen in pointer contexts.
-
- 1.5: If NULL were defined as follows:
-
- #define NULL (char *)0
-
- wouldn't that make function calls which pass an uncast NULL
- work?
-
- A: Not in general. The problem is that there are machines which
- use different internal representations for pointers to different
- types of data. The suggested #definition would make uncast NULL
- arguments to functions expecting pointers to characters to work
- correctly, but pointer arguments to other types would still be
- problematical, and legal constructions such as
-
- FILE *fp = NULL;
-
- could fail.
-
- Nevertheless, ANSI C allows the alternate
-
- #define NULL ((void *)0)
-
- definition for NULL. Besides helping incorrect programs to work
- (but only on machines with homogeneous pointers, thus
- questionably valid assistance) this definition may catch
- programs which use NULL incorrectly (e.g. when the ASCII NUL
- character was really intended; see question 1.8).
-
- References: ANSI Rationale Sec. 4.1.5 p. 74.
-
- 1.6: I use the preprocessor macro
-
- #define Nullptr(type) (type *)0
-
- to help me build null pointers of the correct type.
-
- A: This trick, though popular in some circles, does not buy much.
- It is not needed in assignments and comparisons; see question
- 1.2. It does not even save keystrokes. Its use suggests to the
- reader that the author is shaky on the subject of null pointers,
- and requires the reader to check the #definition of the macro,
- its invocations, and _all_ other pointer usages much more
- carefully. See also question 8.1.
-
- 1.7: Is the abbreviated pointer comparison "if(p)" to test for non-
- null pointers valid? What if the internal representation for
- null pointers is nonzero?
-
- A: When C requires the boolean value of an expression (in the if,
- while, for, and do statements, and with the &&, ||, !, and ?:
- operators), a false value is produced when the expression
- compares equal to zero, and a true value otherwise. That is,
- whenever one writes
-
- if(expr)
-
- where "expr" is any expression at all, the compiler essentially
- acts as if it had been written as
-
- if(expr != 0)
-
- Substituting the trivial pointer expression "p" for "expr," we
- have
-
- if(p) is equivalent to if(p != 0)
-
- and this is a comparison context, so the compiler can tell that
- the (implicit) 0 is a null pointer, and use the correct value.
- There is no trickery involved here; compilers do work this way,
- and generate identical code for both statements. The internal
- representation of a pointer does _not_ matter.
-
- The boolean negation operator, !, can be described as follows:
-
- !expr is essentially equivalent to expr?0:1
-
- It is left as an exercise for the reader to show that
-
- if(!p) is equivalent to if(p == 0)
-
- "Abbreviations" such as if(p), though perfectly legal, are
- considered by some to be bad style.
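The equivalences above can be checked with a few one-line functions. The function names are made up for this illustration; each pair should return the same result for any pointer argument, whatever the machine's internal null pointer representation:

```c
#include <stddef.h>

/* Abbreviated and explicit tests for a non-null pointer. */
int nonnull_short(const char *p) { return p ? 1 : 0; }
int nonnull_long(const char *p)  { return (p != 0) ? 1 : 0; }

/* Abbreviated and explicit tests for a null pointer. */
int isnull_short(const char *p)  { return !p; }
int isnull_long(const char *p)   { return p == 0; }
```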
-
- See also question 8.2.
-
- References: K&R II Sec. A7.4.7 p. 204; H&S Sec. 5.3 p. 91; ANSI
- Secs. 3.3.3.3, 3.3.9, 3.3.13, 3.3.14, 3.3.15, 3.6.4.1, and
- 3.6.5 .
-
- 1.8: If "NULL" and "0" are equivalent, which should I use?
-
- A: Many programmers believe that "NULL" should be used in all
- pointer contexts, as a reminder that the value is to be thought
- of as a pointer. Others feel that the confusion surrounding
- "NULL" and "0" is only compounded by hiding "0" behind a
- #definition, and prefer to use unadorned "0" instead. There is
- no one right answer. C programmers must understand that "NULL"
- and "0" are interchangeable and that an uncast "0" is perfectly
- acceptable in initialization, assignment, and comparison
- contexts. Any usage of "NULL" (as opposed to "0") should be
- considered a gentle reminder that a pointer is involved;
- programmers should not depend on it (either for their own
- understanding or the compiler's) for distinguishing pointer 0's
- from integer 0's.
-
- NULL should _not_ be used when another kind of 0 is required,
- even though it might work, because doing so sends the wrong
- stylistic message. (ANSI allows the #definition of NULL to be
- (void *)0, which will not work in non-pointer contexts.) In
- particular, do not use NULL when the ASCII null character (NUL)
- is desired. Provide your own definition
-
- #define NUL '\0'
-
- if you must.
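A short sketch may make the division of labor concrete. In the hypothetical function length_of below, NULL is used only where a pointer is meant, and the NUL macro suggested above only where a character is meant:

```c
#include <stddef.h>

#define NUL '\0'    /* the ASCII null character, as suggested above */

/* Hypothetical example: length of s, or 0 if s is a null pointer. */
size_t length_of(const char *s)
{
    const char *p = s;

    if (s == NULL)          /* pointer context: NULL (or 0) */
        return 0;
    while (*p != NUL)       /* character context: NUL, never NULL */
        p++;
    return (size_t)(p - s);
}
```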
-
- References: K&R II Sec. 5.4 p. 102.
-
- 1.9: But wouldn't it be better to use NULL (rather than 0) in case
- the value of NULL changes, perhaps on a machine with nonzero
- null pointers?
-
- A: No. Although symbolic constants are often used in place of
- numbers because the numbers might change, this is _not_ the
- reason that NULL is used in place of 0. Once again, the
- language guarantees that source-code 0's (in pointer contexts)
- generate null pointers. NULL is used only as a stylistic
- convention.
-
- 1.10: I'm confused. NULL is guaranteed to be 0, but the null pointer
- is not?
-
- A: When the term "null" or "NULL" is casually used, one of several
- things may be meant:
-
- 1. The conceptual null pointer, the abstract language
- concept defined in question 1.1. It is implemented
- with...
-
- 2. The internal (or run-time) representation of a null
- pointer, which may or may not be all-bits-0 and which
- may be different for different pointer types. The
- actual values should be of concern only to compiler
- writers. Authors of C programs never see them, since
- they use...
-
- 3. The source code syntax for null pointers, which is the
- single character "0". It is often hidden behind...
-
- 4. The NULL macro, which is #defined to be "0" or
- "(void *)0". Finally, as red herrings, we have...
-
- 5. The ASCII null character (NUL), which does have all bits
- zero, but has no relation to the null pointer except in
- name; and...
-
- 6. The "null string," which is another name for an empty
- string (""). The term "null string" can be confusing in
- C (and should perhaps be avoided), because it involves a
- null ('\0') character, but not a null pointer, which
- brings us full circle...
-
- This article always uses the phrase "null pointer" (in lower
- case) for sense 1, the character "0" for sense 3, and the
- capitalized word "NULL" for sense 4.
-
- 1.11: Why is there so much confusion surrounding null pointers? Why
- do these questions come up so often?
-
- A: C programmers traditionally like to know more than they need to
- about the underlying machine implementation. The fact that null
- pointers are represented both in source code, and internally to
- most machines, as zero invites unwarranted assumptions. The use
- of a preprocessor macro (NULL) suggests that the value might
- change later, or on some weird machine. The construct
- "if(p == 0)" is easily misread as calling for conversion of p to
- an integral type, rather than 0 to a pointer type, before the
- comparison. Finally, the distinction between the several uses
- of the term "null" (listed above) is often overlooked.
-
- One good way to wade out of the confusion is to imagine that C
- had a keyword (perhaps "nil", like Pascal) with which null
- pointers were requested. The compiler could either turn "nil"
- into the correct type of null pointer, when it could determine
- the type from the source code, or complain when it could not.
- Now, in fact, in C the keyword for a null pointer is not "nil"
- but "0", which works almost as well, except that an uncast "0"
- in a non-pointer context generates an integer zero instead of an
- error message, and if that uncast 0 was supposed to be a null
- pointer, the code may not work.
-
- 1.12: I'm still confused. I just can't understand all this null
- pointer stuff.
-
- A: Follow these two simple rules:
-
- 1. When you want to refer to a null pointer in source code,
- use "0" or "NULL".
-
- 2. If the usage of "0" or "NULL" is an argument in a
- function call, cast it to the pointer type expected by
- the function being called.
-
- The rest of the discussion has to do with other people's
- misunderstandings, or with the internal representation of null
- pointers (which you shouldn't need to know), or with ANSI C
- refinements. Understand questions 1.1, 1.2, and 1.3, and
- consider 1.8 and 1.11, and you'll do fine.
-
- 1.13: Given all the confusion surrounding null pointers, wouldn't it
- be easier simply to require them to be represented internally by
- zeroes?
-
- A: If for no other reason, doing so would be ill-advised because it
- would unnecessarily constrain implementations which would
- otherwise naturally represent null pointers by special, nonzero
- bit patterns, particularly when those values would trigger
- automatic hardware traps for invalid accesses.
-
- Besides, what would this requirement really accomplish? Proper
- understanding of null pointers does not require knowledge of the
- internal representation, whether zero or nonzero. Assuming that
- null pointers are internally zero does not make any code easier
- to write (except for a certain ill-advised usage of calloc; see
- question 3.11). Known-zero internal pointers would not obviate
- casts in function calls, because the _size_ of the pointer might
- still be different from that of an int. (If "nil" were used to
- request null pointers rather than "0," as mentioned in question
- 1.11, the urge to assume an internal zero representation would
- not even arise.)
-
- 1.14: Seriously, have any actual machines really used nonzero null
- pointers, or different representations for pointers to different
- types?
-
- A: The Prime 50 series used segment 07777, offset 0 for the null
- pointer, at least for PL/I. Later models used segment 0, offset
- 0 for null pointers in C, necessitating new instructions such as
- TCNP (Test C Null Pointer), evidently as a sop to all the extant
- poorly-written C code which made incorrect assumptions. Older,
- word-addressed Prime machines were also notorious for requiring
- larger byte pointers (char *'s) than word pointers (int *'s).
-
- The Eclipse MV series from Data General has three
- architecturally supported pointer formats (word, byte, and bit
- pointers), two of which are used by C compilers: byte pointers
- for char * and void *, and word pointers for everything else.
-
- Some Honeywell-Bull mainframes use the bit pattern 06000 for
- (internal) null pointers.
-
- The CDC Cyber 180 Series has 48-bit pointers consisting of a
- ring, segment, and offset. Most users (in ring 11) have null
- pointers of 0xB00000000000.
-
- The Symbolics Lisp Machine, a tagged architecture, does not even
- have conventional numeric pointers; it uses the pair <NIL, 0>
- (basically a nonexistent <object, offset> handle) as a C null
- pointer.
-
- Depending on the "memory model" in use, 80*86 processors (PC's)
- may use 16 bit data pointers and 32 bit function pointers, or
- vice versa.
-
- 1.15: What does a run-time "null pointer assignment" error mean? How
- do I track it down?
-
- A: This message, which occurs only under MS-DOS (see, therefore,
- section 16) means that you've written, via a null pointer, to
- location zero.
-
- A debugger will usually let you set a data breakpoint on
- location 0. Alternatively, you could write a bit of code to copy
- 20 or so bytes from location 0 into another buffer, and
- periodically check that it hasn't changed.
-
-
- Section 2. Arrays and Pointers
-
- 2.1: I had the definition char a[6] in one source file, and in
- another I declared extern char *a. Why didn't it work?
-
- A: The declaration extern char *a simply does not match the actual
- definition. The type "pointer-to-type-T" is not the same as
- "array-of-type-T." Use extern char a[].
-
- References: CT&P Sec. 3.3 pp. 33-4, Sec. 4.5 pp. 64-5.
-
- 2.2: But I heard that char a[] was identical to char *a.
-
- A: Not at all. (What you heard has to do with formal parameters to
- functions; see question 2.4.) Arrays are not pointers. The
- array declaration "char a[6];" requests that space for six
- characters be set aside, to be known by the name "a." That is,
- there is a location named "a" at which six characters can sit.
- The pointer declaration "char *p;" on the other hand, requests a
- place which holds a pointer. The pointer is to be known by the
- name "p," and can point to any char (or contiguous array of
- chars) anywhere.
-
- As usual, a picture is worth a thousand words. The statements
-
- char a[] = "hello";
- char *p = "world";
-
- would result in data structures which could be represented like
- this:
-
-        +---+---+---+---+---+---+
-     a: | h | e | l | l | o |\0 |
-        +---+---+---+---+---+---+
-
-        +-----+     +---+---+---+---+---+---+
-     p: |  *======> | w | o | r | l | d |\0 |
-        +-----+     +---+---+---+---+---+---+
-
- It is important to realize that a reference like x[3] generates
- different code depending on whether x is an array or a pointer.
- Given the declarations above, when the compiler sees the
- expression a[3], it emits code to start at the location "a,"
- move three past it, and fetch the character there. When it sees
- the expression p[3], it emits code to start at the location "p,"
- fetch the pointer value there, add three to the pointer, and
- finally fetch the character pointed to. In the example above,
- both a[3] and p[3] happen to be the character 'l', but the
- compiler gets there differently. (See also question 17.14.)
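One way to see the difference from within a program is to compare sizeof results. This sketch reuses the declarations from the picture, except that p is const-qualified here because a string literal must not be modified. The accessor function names are made up for the example:

```c
#include <stddef.h>

char a[] = "hello";          /* six chars of storage named a */
const char *p = "world";     /* a pointer named p, pointing at six chars elsewhere */

size_t size_of_a(void) { return sizeof a; }   /* 6: the array itself */
size_t size_of_p(void) { return sizeof p; }   /* the size of a pointer */

char fourth_of_a(void) { return a[3]; }       /* 'l', fetched relative to a */
char fourth_of_p(void) { return p[3]; }       /* 'l', fetched via the value stored in p */
```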
-
- 2.3: So what is meant by the "equivalence of pointers and arrays" in
- C?
-
- A: Much of the confusion surrounding pointers in C can be traced to
- a misunderstanding of this statement. Saying that arrays and
- pointers are "equivalent" does not by any means imply that they
- are interchangeable.
-
- "Equivalence" refers to the following key definition:
-
- An lvalue [see question 2.5] of type array-of-T
- which appears in an expression decays (with
- three exceptions) into a pointer to its first
- element; the type of the resultant pointer is
- pointer-to-T.
-
- (The exceptions are when the array is the operand of a sizeof or
- & operator, or is a literal string initializer for a character
- array.)
-
- As a consequence of this definition, there is no apparent
- difference in the behavior of the "array subscripting" operator
- [] as it applies to arrays and pointers. In an expression of
- the form a[i], the array reference "a" decays into a pointer,
- following the rule above, and is then subscripted just as would
- be a pointer variable in the expression p[i] (although the
- eventual memory accesses will be different, as explained in
- question 2.2). In either case, the expression x[i] (where x is
- an array or a pointer) is, by definition, exactly equivalent to
- *((x)+(i)).
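The decay rule and the definition of subscripting can be exercised together in a small sketch. Both function names are hypothetical. sum_ints is declared to take a pointer, yet it can be handed an array, because the array decays to a pointer to its first element:

```c
/* Hypothetical helper: sums n ints starting at x, written with the
 * pointer form that x[i] is defined to mean. */
int sum_ints(const int *x, int n)
{
    int i, total = 0;

    for (i = 0; i < n; i++)
        total += *((x) + (i));      /* exactly equivalent to x[i] */
    return total;
}

/* Returns 1 if using the array in an expression yields &a[0]. */
int decays_to_first_element(void)
{
    int a[4] = {1, 2, 3, 4};
    const int *p = a;               /* a decays to a pointer to a[0] */

    return p == &a[0] && sum_ints(a, 4) == sum_ints(p, 4);
}
```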
-
- References: K&R I Sec. 5.3 pp. 93-6; K&R II Sec. 5.3 p. 99; H&S
- Sec. 5.4.1 p. 93; ANSI Sec. 3.2.2.1, Sec. 3.3.2.1, Sec. 3.3.6 .
-
- 2.4: Then why are array and pointer declarations interchangeable as
- function formal parameters?
-
- A: Since arrays decay immediately into pointers, an array is never
- actually passed to a function. As a convenience, any parameter
- declarations which "look like" arrays, e.g.
-
- f(a)
- char a[];
-
- are treated by the compiler as if they were pointers, since that
- is what the function will receive if an array is passed:
-
- f(a)
- char *a;
-
- This conversion holds only within function formal parameter
- declarations, nowhere else. If this conversion bothers you,
- avoid it; many people have concluded that the confusion it
- causes outweighs the small advantage of having the declaration
- "look like" the call and/or the uses within the function.
-
- References: K&R I Sec. 5.3 p. 95, Sec. A10.1 p. 205; K&R II
- Sec. 5.3 p. 100, Sec. A8.6.3 p. 218, Sec. A10.1 p. 226; H&S
- Sec. 5.4.3 p. 96; ANSI Sec. 3.5.4.3, Sec. 3.7.1, CT&P Sec. 3.3
- pp. 33-4.
-
- 2.5: How can an array be an lvalue, if you can't assign to it?
-
- A: The ANSI C Standard defines a "modifiable lvalue," which an
- array is not.
-
- References: ANSI Sec. 3.2.2.1 p. 37.
-
- 2.6: Why doesn't sizeof properly report the size of an array which is
- a parameter to a function?
-
- A: The sizeof operator reports the size of the pointer parameter
- which the function actually receives (see question 2.4).
-
- 2.7: Someone explained to me that arrays were really just constant
- pointers.
-
- A: This is a bit of an oversimplification. An array name is
- "constant" in that it cannot be assigned to, but an array is
- _not_ a pointer, as the discussion and pictures in question 2.2
- should make clear.
-
- 2.8: Practically speaking, what is the difference between arrays and
- pointers?
-
- A: Arrays automatically allocate space, but can't be relocated or
- resized. Pointers must be explicitly assigned to point to
- allocated space (perhaps using malloc), but can be reassigned
- (i.e. pointed at different objects) at will, and have many other
- uses besides serving as the base of blocks of memory.
-
- Due to the "equivalence of arrays and pointers" (see question
- 2.3), arrays and pointers often seem interchangeable, and in
- particular a pointer to a block of memory assigned by malloc is
- frequently treated (and can be referenced using [] exactly) as
- if it were a true array (see also question 2.13).
-
- 2.9: I came across some "joke" code containing the "expression"
- 5["abcdef"] . How can this be legal C?
-
- A: Yes, Virginia, array subscripting is commutative in C. This
- curious fact follows from the pointer definition of array
- subscripting, namely that a[e] is exactly equivalent to
- *((a)+(e)), for _any_ expression e and primary expression a, as
- long as one of them is a pointer expression and one is integral.
- This unsuspected commutativity is often mentioned in C texts as
- if it were something to be proud of, but it finds no useful
- application outside of the Obfuscated C Contest (see question
- 17.9).
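For the skeptical, the commutativity is easy to confirm. The two hypothetical functions below should return the same character, since both expressions reduce to *(("abcdef")+(5)):

```c
/* Conventional and "joke" subscripting of the same string literal. */
char sixth_normal(void)  { return "abcdef"[5]; }
char sixth_swapped(void) { return 5["abcdef"]; }
```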
-
- References: ANSI Rationale Sec. 3.3.2.1 p. 41.
-
- 2.10: My compiler complained when I passed a two-dimensional array to
- a routine expecting a pointer to a pointer.
-
- A: The rule by which arrays decay into pointers is not applied
- recursively. An array of arrays (i.e. a two-dimensional array
- in C) decays into a pointer to an array, not a pointer to a
- pointer. Pointers to arrays can be confusing, and must be
- treated carefully. (The confusion is heightened by the
- existence of incorrect compilers, including some versions of pcc
- and pcc-derived lint's, which improperly accept assignments of
- multi-dimensional arrays to multi-level pointers.) If you are
- passing a two-dimensional array to a function:
-
- int array[NROWS][NCOLUMNS];
- f(array);
-
- the function's declaration should match:
-
- f(int a[][NCOLUMNS]) {...}
- or
- f(int (*ap)[NCOLUMNS]) {...}   /* ap is a pointer to an array */
-
- In the first declaration, the compiler performs the usual
- implicit parameter rewriting of "array of array" to "pointer to
- array;" in the second form the pointer declaration is explicit.
- Since the called function does not allocate space for the array,
- it does not need to know the overall size, so the number of
- "rows," NROWS, can be omitted. The "shape" of the array is
- still important, so the "column" dimension NCOLUMNS (and, for 3-
- or more dimensional arrays, the intervening ones) must be
- included.
-
- If a function is already declared as accepting a pointer to a
- pointer, it is probably incorrect to pass a two-dimensional
- array directly to it.
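Here is a small sketch of both parameter forms in use. The values of NROWS and NCOLUMNS, the function names, and the extra nrows argument are assumptions made for the example. The two functions have the same parameter type after the compiler's rewriting, so either accepts a true two-dimensional array:

```c
#define NROWS    2      /* assumed sizes, for illustration only */
#define NCOLUMNS 3

/* Array-style parameter: rewritten by the compiler to int (*a)[NCOLUMNS]. */
int sum_a(int a[][NCOLUMNS], int nrows)
{
    int i, j, total = 0;

    for (i = 0; i < nrows; i++)
        for (j = 0; j < NCOLUMNS; j++)
            total += a[i][j];
    return total;
}

/* Explicit pointer-to-array parameter: the same type as above. */
int sum_ap(int (*ap)[NCOLUMNS], int nrows)
{
    int i, j, total = 0;

    for (i = 0; i < nrows; i++)
        for (j = 0; j < NCOLUMNS; j++)
            total += ap[i][j];
    return total;
}
```

Given "int array[NROWS][NCOLUMNS];", both sum_a(array, NROWS) and sum_ap(array, NROWS) are correct calls. Passing an int ** to either would not be.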
-
- References: K&R I Sec. 5.10 p. 110; K&R II Sec. 5.9 p. 113.
-
- 2.11: How do I write functions which accept 2-dimensional arrays when
- the "width" is not known at compile time?
-
- A: It's not easy. One way is to pass in a pointer to the [0][0]
- element, along with the two dimensions, and simulate array
- subscripting "by hand:"
-
- f2(aryp, m, n)
- int *aryp;
- int m, n;
- { /* ... ary[i][j] is really aryp[i * n + j] ... */ }
-
- This function could be called with the array from question 2.10
- as
-
- f2(&array[0][0], NROWS, NCOLUMNS);
-
- See also question 2.14.
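Filled out, the "by hand" subscripting might look like the following sketch. The name sum2d and the choice to sum the elements are ours, made only to give the function something to do:

```c
/* Hypothetical example: sums an m-by-n array passed as a pointer to
 * its [0][0] element, computing each subscript by hand. */
int sum2d(const int *aryp, int m, int n)
{
    int i, j, total = 0;

    for (i = 0; i < m; i++)
        for (j = 0; j < n; j++)
            total += aryp[i * n + j];   /* stands in for ary[i][j] */
    return total;
}
```

It would be called as sum2d(&array[0][0], NROWS, NCOLUMNS), just like f2 above.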
-
- 2.12: How do I declare a pointer to an array?
-
- A: Usually, you don't want to. When people speak casually of a
- pointer to an array, they usually mean a pointer to its first
- element.
-
- Instead of a pointer to an array, consider using a pointer to
- one of the array's elements. Arrays of type T decay
- into pointers to type T (see question 2.3), which is convenient;
- subscripting or incrementing the resultant pointer accesses the
- individual members of the array. True pointers to arrays, when
- subscripted or incremented, step over entire arrays, and are
- generally only useful when operating on arrays of arrays, if at
- all. (See question 2.10 above.)
-
- If you really need to declare a pointer to an entire array, use
- something like "int (*ap)[N];" where N is the size of the array.
- (See also question 10.4.) If the size of the array is unknown,
- N can be omitted, but the resulting type, "pointer to array of
- unknown size," is useless.
-
- 2.13: How can I dynamically allocate a multidimensional array?
-
- A: It is usually best to allocate an array of pointers, and then
- initialize each pointer to a dynamically-allocated "row." Here
- is a two-dimensional example:
-
- int **array1 = (int **)malloc(nrows * sizeof(int *));
- for(i = 0; i < nrows; i++)
- array1[i] = (int *)malloc(ncolumns * sizeof(int));
-
- (In "real" code, of course, malloc would be declared correctly,
- and each return value checked.)
-
- You can keep the array's contents contiguous, while making later
- reallocation of individual rows difficult, with a bit of
- explicit pointer arithmetic:
-
- int **array2 = (int **)malloc(nrows * sizeof(int *));
- array2[0] = (int *)malloc(nrows * ncolumns * sizeof(int));
- for(i = 1; i < nrows; i++)
- array2[i] = array2[0] + i * ncolumns;
-
- In either case, the elements of the dynamic array can be
- accessed with normal-looking array subscripts: array[i][j].
-
- If the double indirection implied by the above schemes is for
- some reason unacceptable, you can simulate a two-dimensional
- array with a single, dynamically-allocated one-dimensional
- array:
-
- int *array3 = (int *)malloc(nrows * ncolumns * sizeof(int));
-
- However, you must now perform subscript calculations manually,
- accessing the i,jth element with array3[i * ncolumns + j]. (A
- macro can hide the explicit calculation, but invoking it then
- requires parentheses and commas which don't look exactly like
- multidimensional array subscripts.)
-
- Finally, you can use pointers-to-arrays:
-
- int (*array4)[NCOLUMNS] =
- (int (*)[NCOLUMNS])malloc(nrows * sizeof(*array4));
-
- , but the syntax gets horrific and all but one dimension must be
- known at compile time.
-
- With all of these techniques, you may of course need to remember
- to free the arrays (which may take several steps; see question
- 3.8) when they are no longer needed, and you cannot necessarily
- intermix the dynamically-allocated arrays with conventional,
- statically-allocated ones (see question 2.14 below, and also
- question 2.10).
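-
- As a rough sketch of what "real" code for the contiguous (array2)
- scheme might look like, with each malloc return value checked
- and a matching two-step free. The names alloc2d and free2d are
- hypothetical, and the sketch assumes that nrows * ncolumns does
- not overflow a size_t:
-
```c
#include <stdlib.h>

/* Hypothetical helper: allocate an nrows x ncolumns array of int
   using the contiguous layout.  Returns NULL on failure. */
int **alloc2d(size_t nrows, size_t ncolumns)
{
    int **rows;
    size_t i;

    if (nrows == 0 || ncolumns == 0)
        return NULL;

    rows = malloc(nrows * sizeof *rows);
    if (rows == NULL)
        return NULL;

    rows[0] = malloc(nrows * ncolumns * sizeof **rows);
    if (rows[0] == NULL) {
        free(rows);             /* don't leak the row pointers */
        return NULL;
    }

    for (i = 1; i < nrows; i++)
        rows[i] = rows[0] + i * ncolumns;

    return rows;
}

/* Free the element block first, then the array of row pointers. */
void free2d(int **rows)
{
    if (rows != NULL) {
        free(rows[0]);
        free(rows);
    }
}
```
-
- The result is still subscripted as a[i][j], and must be released
- with free2d(a) rather than a single call to free.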
-
- 2.14: How can I use statically- and dynamically-allocated
- multidimensional arrays interchangeably when passing them to
- functions?
-
- A: There is no single perfect method. Given the array and f() as
- declared in question 2.10, f2() as declared in question 2.11,
- array1, array2, array3, and array4 as declared in 2.13, and a
- function f3() declared as:
-
- f3(pp, m, n)
- int **pp;
- int m, n;
-
- ; the following calls would be legal, and work as expected:
-
- f(array, NROWS, NCOLUMNS);
- f(array4, nrows, NCOLUMNS);
- f2(&array[0][0], NROWS, NCOLUMNS);
- f2(*array2, nrows, ncolumns);
- f2(array3, nrows, ncolumns);
- f2(*array4, nrows, NCOLUMNS);
- f3(array1, nrows, ncolumns);
- f3(array2, nrows, ncolumns);
-
- The following two calls would probably work, but involve
- questionable casts, and work only if the dynamic ncolumns
- matches the static NCOLUMNS:
-
- f((int (*)[NCOLUMNS])(*array2), nrows, ncolumns);
- f((int (*)[NCOLUMNS])array3, nrows, ncolumns);
-
- If you can understand why all of the above calls work and are
- written as they are, and if you understand why the combinations
- that are not listed would not work, then you have a _very_ good
- understanding of arrays and pointers (and several other areas)
- in C.
-
- 2.15: Here's a neat trick: if I write
-
- int realarray[10];
- int *array = &realarray[-1];
-
- I can treat "array" as if it were a 1-based array.
-
- A: Although this technique is attractive (and is used in the book
- Numerical Recipes in C), it does not conform to the C standards.
- Pointer arithmetic is defined only as long as the pointer points
- within the same allocated block of memory, or to the imaginary
- "terminating" element one past it; otherwise, the behavior is
- undefined, _even if the pointer is not dereferenced_. The code
- above could fail if, while subtracting the offset, an illegal
- address were generated (perhaps because the address tried to
- "wrap around" past the beginning of some memory segment).
-
- References: ANSI Sec. 3.3.6 p. 48, Rationale Sec. 3.2.2.3 p. 38;
- K&R II Sec. 5.3 p. 100, Sec. 5.4 pp. 102-3, Sec. A7.7 pp. 205-6.
-
- 2.16: I passed a pointer to a function which initialized it:
-
- main()
- {
- int *ip;
- f(ip);
- return 0;
- }
-
- void f(ip)
- int *ip;
- {
- static int dummy;
- ip = &dummy;
- *ip = 5;
- }
-
- , but the pointer in the caller was unchanged.
-
- A: Did the function try to initialize the pointer itself, or just
- what it pointed to? Remember that arguments in C are passed by
- value. The called function altered only the passed copy of the
- pointer. You'll want to pass the address of the pointer (the
- function will end up accepting a pointer-to-a-pointer).
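-
- A minimal sketch of the corrected function, written with an
- ANSI prototype; the caller now passes the address of its
- pointer, as in f(&ip):
-
```c
/* Receives the address of the caller's pointer. */
void f(int **ipp)
{
    static int dummy;

    *ipp = &dummy;      /* changes the caller's pointer */
    **ipp = 5;          /* changes what it now points to */
}
```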
-
- 2.17: I have a char * pointer that happens to point to some ints, and
- I want to step it over them. Why doesn't
-
- ((int *)p)++;
-
- work?
-
- A: In C, a cast operator does not mean "pretend these bits have a
- different type, and treat them accordingly;" it is a conversion
- operator, and by definition it yields an rvalue, which cannot be
- assigned to, or incremented with ++. (It is an anomaly in pcc-
- derived compilers, and an extension in gcc, that expressions
- such as the above are ever accepted.) Say what you mean: use
-
- p = (char *)((int *)p + 1);
-
- , or simply
-
- p += sizeof(int);
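-
- Wrapped up as a function, the second form might look like the
- sketch below. The name skip_int is hypothetical, and the sketch
- assumes that p really does point at properly aligned ints:
-
```c
/* Hypothetical helper: advance a char pointer past one int. */
char *skip_int(char *p)
{
    return p + sizeof(int);
}
```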
-
- References: ANSI Sec. 3.3.4, Rationale Sec. 3.3.2.4 p. 43.
-
-
- Section 3. Memory Allocation
-
- 3.1: Why doesn't this fragment work?
-
- char *answer;
- printf("Type something:\n");
- gets(answer);
- printf("You typed \"%s\"\n", answer);
-
- A: The pointer variable "answer," which is handed to the gets
- function as the location into which the response should be
- stored, has not been set to point to any valid storage. That
- is, we cannot say where the pointer "answer" points. (Since
- local variables are not initialized, and typically contain
- garbage, it is not even guaranteed that "answer" starts out as a
- null pointer. See question 17.1.)
-
- The simplest way to correct the question-asking program is to
- use a local array, instead of a pointer, and let the compiler
- worry about allocation:
-
- #include <string.h>
-
- char answer[100], *p;
- printf("Type something:\n");
- fgets(answer, 100, stdin);
- if((p = strchr(answer, '\n')) != NULL)
- *p = '\0';
- printf("You typed \"%s\"\n", answer);
-
- Note that this example also uses fgets instead of gets (always a
- good idea), so that the size of the array can be specified and
- fgets will not write past the end of the array if the user types
- an overly-long line. (Unfortunately for this example, fgets does
- not automatically delete the trailing \n, as gets would.) It
- would also be possible to use malloc to allocate the answer
- buffer, and/or to parameterize its size
- (#define ANSWERSIZE 100).
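-
- The malloc-based, parameterized variant might be sketched as
- follows. The function name read_answer is hypothetical; it
- returns NULL on allocation failure or end-of-file, and the
- caller must free whatever it returns:
-
```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ANSWERSIZE 100

/* Hypothetical helper: read one line from fp into malloc'ed storage. */
char *read_answer(FILE *fp)
{
    char *answer = malloc(ANSWERSIZE);
    char *p;

    if (answer == NULL)
        return NULL;

    if (fgets(answer, ANSWERSIZE, fp) == NULL) {
        free(answer);           /* nothing was read */
        return NULL;
    }

    if ((p = strchr(answer, '\n')) != NULL)
        *p = '\0';              /* strip the newline fgets leaves in */

    return answer;
}
```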
-
- 3.2: I can't get strcat to work. I tried
-
- char *s1 = "Hello, ";
- char *s2 = "world!";
- char *s3 = strcat(s1, s2);
-
- but I got strange results.
-
- A: Again, the problem is that space for the concatenated result is
- not properly allocated. C does not provide an automatically-
- managed string type. C compilers only allocate memory for
- objects explicitly mentioned in the source code (in the case of
- "strings," this includes character arrays and string literals).
- The programmer must arrange (explicitly) for sufficient space
- for the results of run-time operations such as string
- concatenation, typically by declaring arrays, or by calling
- malloc.
-
- strcat performs no allocation; the second string is appended to
- the first one, in place. Therefore, one fix would be to declare
- the first string as an array with sufficient space:
-
- char s1[20] = "Hello, ";
-
- Since strcat returns the value of its first argument (s1, in
- this case), the s3 variable is superfluous.
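-
- The other fix, calling malloc, might be sketched like this. The
- function name concat is hypothetical; the caller must free the
- result:
-
```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return s1 followed by s2 in newly malloc'ed
   storage, or NULL if malloc fails. */
char *concat(const char *s1, const char *s2)
{
    char *result = malloc(strlen(s1) + strlen(s2) + 1);  /* +1 for \0 */

    if (result != NULL) {
        strcpy(result, s1);
        strcat(result, s2);
    }
    return result;
}
```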
-
- References: CT&P Sec. 3.2 p. 32.
-
- 3.3: But the man page for strcat says that it takes two char *'s as
- arguments. How am I supposed to know to allocate things?
-
- A: In general, when using pointers you _always_ have to consider
- memory allocation, at least to make sure that the compiler is
- doing it for you. If a library routine's documentation does not
- explicitly mention allocation, it is usually the caller's
- problem.
-
- The Synopsis section at the top of a Unix-style man page can be
- misleading. The code fragments presented there are closer to
- the function definition used by the call's implementor than the
- invocation used by the caller. In particular, many routines
- which accept pointers (e.g. to structs or strings) are usually
- called with the address of some object (a struct, or an array --
- see questions 2.3 and 2.4). Another common example is stat(),
- which is normally passed the address of a struct stat variable.
-
- 3.4: I have a function that is supposed to return a string, but when
- it returns to its caller, the returned string is garbage.
-
- A: Make sure that the memory to which the function returns a
- pointer is correctly allocated. The returned pointer should be
- to a statically-allocated buffer, to a buffer passed in by the
- caller, or to memory obtained from malloc, but _not_ to a local
- array. See also question 17.3.
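-
- Two of the safe arrangements can be sketched as follows. The
- function names and the INTBUFSIZE macro are hypothetical; the
- buffer size is a deliberately generous bound on the number of
- characters needed to print an int:
-
```c
#include <limits.h>
#include <stdio.h>

#define INTBUFSIZE (sizeof(int) * CHAR_BIT / 3 + 3)  /* digits, sign, \0 */

/* Version 1: statically-allocated buffer.  Each call overwrites the
   result of the previous call. */
char *itoa_static(int n)
{
    static char buf[INTBUFSIZE];

    sprintf(buf, "%d", n);
    return buf;
}

/* Version 2: caller-supplied buffer of at least INTBUFSIZE chars. */
char *itoa_buf(int n, char *buf)
{
    sprintf(buf, "%d", n);
    return buf;
}
```
-
- Had buf in the first version been declared without "static", it
- would have ceased to exist when the function returned.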
-
- 3.5: You can't use dynamically-allocated memory after you free it,
- can you?
-
- A: No. Some early man pages for malloc stated that the contents of
- freed memory were "left undisturbed;" this ill-advised guarantee
- was never universal and is not required by ANSI.
-
- Few programmers would use the contents of freed memory
- deliberately, but it is easy to do so accidentally. Consider
- the following (correct) code for freeing a singly-linked list:
-
- struct list *listp, *nextp;
- for(listp = base; listp != NULL; listp = nextp) {
- nextp = listp->next;
- free((char *)listp);
- }
-
- and notice what would happen if the more-obvious loop iteration
- expression listp = listp->next were used, without the temporary
- nextp pointer.
-
- References: ANSI Rationale Sec. 4.10.3.2 p. 102; CT&P Sec. 7.10
- p. 95.
-
- 3.6: How does free() know how many bytes to free?
-
- A: The malloc/free package remembers the size of each block it
- allocates and returns, so it is not necessary to remind it of
- the size when freeing.
-
- 3.7: So can I query the malloc package to find out how big an
- allocated block is?
-
- A: Not portably.
-
- 3.8: I'm allocating structures which contain pointers to other
- dynamically-allocated objects. When I free a structure, do I
- have to free each subsidiary pointer first?
-
- A: Yes. In general, you must arrange that each pointer returned
- from malloc be individually passed to free, exactly once (if it
- is freed at all).
-
- 3.9: I have a program which mallocs but then frees a lot of memory,
- but memory usage (as reported by ps) doesn't seem to go back
- down.
-
- A: Most implementations of malloc/free do not return freed memory
- to the operating system (if there is one), but merely make it
- available for future malloc calls.
-
- 3.10: Is it legal to pass a null pointer as the first argument to
- realloc()? Why would you want to?
-
- A: ANSI C sanctions this usage (and the related realloc(..., 0),
- which frees), but several earlier implementations do not support
- it, so it is not widely portable. Passing an initially-null
- pointer to realloc can make it easier to write a self-starting
- incremental allocation algorithm.
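-
- A self-starting algorithm of this kind might be sketched as
- follows, assuming an ANSI-conforming realloc. The struct and
- function names are hypothetical; the structure starts out with
- items == NULL and both counts zero:
-
```c
#include <stdlib.h>

/* Hypothetical growable array of int. */
struct intvec {
    int *items;
    size_t nitems;      /* elements in use */
    size_t nalloc;      /* elements allocated */
};

/* Append value; returns 1 on success, or 0 if realloc fails (in
   which case the existing contents are left intact). */
int intvec_append(struct intvec *v, int value)
{
    if (v->nitems == v->nalloc) {
        size_t newalloc = (v->nalloc == 0) ? 8 : v->nalloc * 2;
        /* On the first call v->items is NULL, so realloc acts
           like malloc; no special case is needed. */
        int *newitems = realloc(v->items, newalloc * sizeof *newitems);

        if (newitems == NULL)
            return 0;
        v->items = newitems;
        v->nalloc = newalloc;
    }
    v->items[v->nitems++] = value;
    return 1;
}
```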
-
- References: ANSI Sec. 4.10.3.4 .
-
- 3.11: What is the difference between calloc and malloc? Is it safe to
- use calloc's zero-fill guarantee for pointer and floating-point
- values? Does free work on memory allocated with calloc, or do
- you need a cfree?
-
- A: calloc(m, n) is essentially equivalent to
-
- p = malloc(m * n);
- memset(p, 0, m * n);
-
- The zero fill is all-bits-zero, and therefore does not guarantee
- useful zero values for pointers (see section 1 of this list) or
- floating-point values. free can (and should) be used to free
- the memory allocated by calloc; no cfree is needed.
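-
- When pointer or floating-point members must start out as true
- zeros, one portable approach is to assign them explicitly after
- allocating. A sketch, with a hypothetical struct node and
- function name:
-
```c
#include <stdlib.h>

struct node {
    int count;
    double weight;
    struct node *next;
};

/* Hypothetical helper: allocate n nodes and set every member to a
   genuine zero value.  Returns NULL on failure. */
struct node *new_nodes(size_t n)
{
    struct node *p = calloc(n, sizeof *p);
    size_t i;

    if (p == NULL)
        return NULL;

    for (i = 0; i < n; i++) {
        p[i].count = 0;         /* calloc's zero fill suffices here */
        p[i].weight = 0.0;      /* not guaranteed by all-bits-zero */
        p[i].next = NULL;       /* not guaranteed by all-bits-zero */
    }
    return p;
}
```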
-
- References: ANSI Secs. 4.10.3 to 4.10.3.2 .
-
- 3.12: What is alloca and why is its use discouraged?
-
- A: alloca allocates memory which is automatically freed when the
- function which called alloca returns. That is, memory allocated
- with alloca is local to a particular function's "stack frame" or
- context.
-
- alloca cannot be written portably, and is difficult to implement
- on machines without a stack. Its use is problematical (and the
- obvious implementation on a stack-based machine fails) when its
- return value is passed directly to another function, as in
- fgets(alloca(100), 100, stdin).
-
- For these reasons, alloca cannot be used in programs which must
- be widely portable, no matter how useful it might be.
-
- References: ANSI Rationale Sec. 4.10.3 p. 102.
-
-
- Section 4. Expressions
-
- 4.1: Why doesn't this code:
-
- a[i] = i++;
-
- work?
-
- A: The subexpression i++ causes a side effect -- it modifies i's
- value -- which leads to undefined behavior if i is also
- referenced elsewhere in the same expression.
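-
- If the intent was to store the old value of i and then advance,
- the side effect can simply be moved into a statement of its own.
- A sketch, with a hypothetical function name:
-
```c
/* Hypothetical example: store each index in the corresponding element. */
void fill_with_index(int a[], int n)
{
    int i = 0;

    while (i < n) {
        a[i] = i;       /* i is only read in this statement */
        i++;            /* the side effect stands alone */
    }
}
```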
-
- References: ANSI Sec. 3.3 p. 39.
-
- 4.2: Under my compiler, the code
-
- int i = 7;
- printf("%d\n", i++ * i++);
-
- prints 49. Regardless of the order of evaluation, shouldn't it
- print 56?
-
- A: Although the postincrement and postdecrement operators ++ and --
- perform the operations after yielding the former value, the
- implication of "after" is often misunderstood. It is _not_
- guaranteed that the operation is performed immediately after
- giving up the previous value and before any other part of the
- expression is evaluated. It is merely guaranteed that the
- update will be performed sometime before the expression is
- considered "finished" (before the next "sequence point," in ANSI
- C's terminology). In the example, the compiler chose to
- multiply the previous value by itself and to perform both
- increments afterwards.
-
- The behavior of code which contains multiple, ambiguous side
- effects has always been undefined. Don't even try to find out
- how your compiler implements such things (contrary to the ill-
- advised exercises in many C textbooks); as K&R wisely point out,
- "if you don't know _how_ they are done on various machines, that
- innocence may help to protect you."
-
- References: K&R I Sec. 2.12 p. 50; K&R II Sec. 2.12 p. 54; ANSI
- Sec. 3.3 p. 39; CT&P Sec. 3.7 p. 47; PCS Sec. 9.5 pp. 120-1.
- (Ignore H&S Sec. 7.12 pp. 190-1, which is obsolete.)
-
- 4.3: But what about the &&, ||, and comma operators?
- I see code like "if((c = getchar()) == EOF || c == '\n')" ...
-
- A: There is a special exception for those operators (as well as
- ?:); each of them does imply a sequence point (i.e. left-to-right
- evaluation is guaranteed). Any book on C should make this
- clear.
-
- References: K&R I Sec. 2.6 p. 38, Secs. A7.11-12 pp. 190-1;
- K&R II Sec. 2.6 p. 41, Secs. A7.14-15 pp. 207-8; ANSI
- Secs. 3.3.13 p. 52, 3.3.14 p. 52, 3.3.15 p. 53, 3.3.17 p. 55,
- CT&P Sec. 3.7 pp. 46-7.
-
- 4.4: If I'm not using the value of the expression, should I use i++
- or ++i to increment a variable?
-
- A: Since the two forms differ only in the value yielded, they are
- entirely equivalent when only their side effect is needed. Some
- people will tell you that in the old days one form was preferred
- over the other because it utilized a PDP-11 autoincrement
- addressing mode, but those people are confused.
-
- 4.5: Why doesn't the code
-
- int a = 1000, b = 1000;
- long int c = a * b;
-
- work?
-
- A: Under C's integral promotion rules, the multiplication is
- carried out using int arithmetic, and the result may overflow
- and/or be truncated before being assigned to the long int left-
- hand-side. Use an explicit cast to force long arithmetic:
-
- long int c = (long int)a * b;
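-
- The same cast, wrapped in a function for illustration (the name
- mul_as_long is hypothetical). Note that on a machine where long
- is no wider than int, the product can of course still overflow:
-
```c
/* Hypothetical helper: multiply two ints using long arithmetic. */
long mul_as_long(int a, int b)
{
    /* The cast applies to a alone; b is then converted to long by
       the usual arithmetic conversions before the multiplication. */
    return (long)a * b;
}
```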
-
-
- Section 5. ANSI C
-
- 5.1: What is the "ANSI C Standard?"
-
- A: In 1983, the American National Standards Institute commissioned
- a committee, X3J11, to standardize the C language. After a
- long, arduous process, including several widespread public
- reviews, the committee's work was finally ratified as an
- American National Standard, X3.159-1989, on December 14, 1989,
- and published in the spring of 1990. For the most part, ANSI C
- standardizes existing practice, with a few additions from C++
- (most notably function prototypes) and support for multinational
- character sets (including the much-lambasted trigraph
- sequences). The ANSI C standard also formalizes the C run-time
- library support routines.
-
- The published Standard includes a "Rationale," which explains
- many of its decisions, and discusses a number of subtle points,
- including several of those covered here. (The Rationale is "not
- part of ANSI Standard X3.159-1989, but is included for
- information only.")
-
- The Standard has been adopted as an international standard,
- ISO/IEC 9899:1990, although the sections are numbered
- differently (briefly, ANSI sections 2 through 4 correspond
- roughly to ISO sections 5 through 7), and the Rationale is
- currently not included.
-
- 5.2: How can I get a copy of the Standard?
-
- A: ANSI X3.159 has been officially superseded by ISO 9899.
- Copies are available from
-
- American National Standards Institute
- 11 W. 42nd St., 13th floor
- New York, NY 10036 USA
- (+1) 212 642 4900
-
- or
-
- Global Engineering Documents
- 2805 McGaw Avenue
- Irvine, CA 92714 USA
- (+1) 714 261 1455
- (800) 854 7179 (U.S. & Canada)
-
- The cost is $130.00 from ANSI or $162.50 from Global.
- Copies of the original X3.159 (including the Rationale) are
- still available at $205.00 from ANSI or $200.50 from Global.
- (Editorial comment: yes, these prices are outrageous.)
-
- Note that ANSI derives revenues to support its operations from
- the sale of printed standards, so electronic copies are _not_
- available.
-
- The Rationale, by itself, has been printed by Silicon Press,
- ISBN 0-929306-07-4.
-
- 5.3: Does anyone have a tool for converting old-style C programs to
- ANSI C, or vice versa, or for automatically generating
- prototypes?
-
- A: First, are you sure you really need to convert lots of old code
- to ANSI C? The old-style function syntax is still acceptable.
-
- Two programs, protoize and unprotoize, convert back and forth
- between prototyped and "old style" function definitions and
- declarations. (These programs do _not_ handle full-blown
- translation between "Classic" C and ANSI C.) These programs
- exist as patches to the FSF GNU C compiler, gcc. Look for the
- file protoize-1.39.0.5.Z in pub/gnu at prep.ai.mit.edu
- (18.71.0.38), or at several other FSF archive sites.
-
- The unproto program (/pub/unix/unproto4.shar.Z on
- ftp.win.tue.nl) is a filter which sits between the preprocessor
- and the next compiler pass, converting most of ANSI C to
- traditional C on-the-fly.
-
- The GNU GhostScript package comes with a little program called
- ansi2knr.
-
- Several prototype generators exist, many as modifications to
- lint. Version 3 of CPROTO was posted to comp.sources.misc in
- March, 1992. (See also question 17.8.)
-
- 5.4: I'm trying to use the ANSI "stringizing" preprocessing operator
- # to insert the value of a symbolic constant into a message, but
- it keeps stringizing the macro's name rather than its value.
-
- A: You must use something like the following two-step procedure to
- force the macro to be expanded as well as stringized:
-
- #define str(x) #x
- #define xstr(x) str(x)
- #define OP plus
- char *opname = xstr(OP);
-
- This sets opname to "plus" rather than "OP".
-
- An equivalent circumlocution is necessary with the token-pasting
- operator ## when the values (rather than the names) of two
- macros are to be concatenated.
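-
- For ##, the two-step arrangement might be sketched like this
- (all of the macro and variable names are hypothetical):
-
```c
#define paste(a, b)  a##b
#define xpaste(a, b) paste(a, b)    /* arguments are expanded first */

#define PREFIX my_
#define NAME   counter

/* xpaste(PREFIX, NAME) yields my_counter, whereas
   paste(PREFIX, NAME) would have yielded PREFIXNAME. */
int xpaste(PREFIX, NAME) = 42;
```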
-
- References: ANSI Sec. 3.8.3.2, Sec. 3.8.3.5 example p. 93.
-
- 5.5: What's the difference between "char const *p" and
- "char * const p"?
-
- A: "char const *p" is a pointer to a constant character (you can't
- change the character); "char * const p" is a constant pointer to
- a (variable) character (i.e. you can't change the pointer).
- (Read these "inside out" to understand them. See question
- 10.4.)
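-
- A short sketch of what each declaration permits; the lines
- marked "error" are commented out because a conforming compiler
- must diagnose them. The function name const_demo is
- hypothetical:
-
```c
/* Hypothetical demonstration; buf must hold at least two characters. */
char const_demo(char *buf)
{
    char const *p = buf;        /* pointer to constant char */
    char * const q = buf;       /* constant pointer to char */

    p = buf + 1;                /* OK: p itself may be changed */
    /* *p = 'X'; */             /* error: the char is const via p */

    *q = 'X';                   /* OK: the char may be changed via q */
    /* q = buf + 1; */          /* error: q itself is const */

    return *p;
}
```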
-
- References: ANSI Sec. 3.5.4.1 .
-
- 5.6: Why can't I pass a char ** to a function which expects a
- const char **?
-
- A: You can use a pointer-to-T (for any type T) where a
- pointer-to-const-T is expected. However, the rule (an explicit
- exception) which permits slight mismatches in qualified pointer
- types is applied only at the top level, not recursively.
-
- You must use explicit casts (e.g. (const char **) in this case)
- when assigning (or passing) pointers which have qualifier
- mismatches at other than the first level of indirection.
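-
- A sketch of such a cast in use; both function names are
- hypothetical:
-
```c
#include <stddef.h>

/* Hypothetical function taking a NULL-terminated vector of strings
   which it promises not to modify. */
size_t count_strings(const char **v)
{
    size_t n = 0;

    while (v[n] != NULL)
        n++;
    return n;
}

/* The caller holds a plain char **, so the qualifier mismatch is at
   the second level of indirection and a cast is required. */
size_t count_args(char **argv)
{
    return count_strings((const char **)argv);
}
```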
-
- References: ANSI Sec. 3.1.2.6 p. 26, Sec. 3.3.16.1 p. 54,
- Sec. 3.5.3 p. 65.
-
- 5.7: My ANSI compiler complains about a mismatch when it sees
-
- extern int func(float);
-
- int func(x)
- float x;
- {...
-
- A: You have mixed the new-style prototype declaration
- "extern int func(float);" with the old-style definition
- "int func(x) float x;". Old C (and ANSI C, in the absence of
- prototypes, and in variable-length argument lists) "widens"
- certain arguments when they are passed to functions. Type float
- is promoted to double, and characters and short integers are
- promoted to integers. (The values are automatically converted
- back to the corresponding narrower types within the body of the
- called function, if they are declared that way there.)
-
- The problem can be fixed either by using new-style syntax
- consistently in the definition:
-
- int func(float x) { ... }
-
- or by changing the new-style prototype declaration to match the
- old-style definition:
-
- extern int func(double);
-
- (In this case, it would be clearest to change the old-style
- definition to use double as well, as long as the address of that
- parameter is not taken.)
-
- It may also be safer to avoid "narrow" (char, short int, and
- float) function arguments and return types.
-
- References: ANSI Sec. 3.3.2.2 .
-
- 5.8: Why does the declaration
-
- extern f(struct x {int s;} *p);
-
- give me an obscure warning message about "struct x introduced in
- prototype scope"?
-
- A: In a quirk of C's normal block scoping rules, a struct declared
- only within a prototype cannot be compatible with other structs
- declared in the same source file, nor can the struct tag be used
- later as you'd expect (it goes out of scope at the end of the
- prototype).
-
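- To resolve the problem, declare the structure at file scope,
- before the prototype, and refer to it within the prototype by
- its tag alone (struct x *p). If only pointers to the structure
- are needed at that point, the vacuous-looking declaration
-
- struct x;
-
- is enough: it reserves a place at file scope for struct x's
- definition, which can be completed later in the file. (A struct
- declaration containing a member list within the prototype always
- declares a new type, local to the prototype.)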
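-
- One arrangement that avoids the warning is to put the complete
- structure declaration at file scope, ahead of the prototype, and
- to mention only the tag inside the prototype. A sketch (the
- body of f is invented for illustration):
-
```c
struct x {
    int s;
};

extern int f(struct x *p);      /* refers to the file-scope struct x */

int f(struct x *p)
{
    return p->s;
}
```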
-
- References: ANSI Sec. 3.1.2.1 p. 21, Sec. 3.1.2.6 p. 26,
- Sec. 3.5.2.3 p. 63.
-
- 5.9: I'm getting strange syntax errors inside code which I've
- #ifdeffed out.
-
- A: Under ANSI C, the text inside a "turned off" #if, #ifdef, or
- #ifndef must still consist of "valid preprocessing tokens."
- This means that there must be no unterminated comments or quotes
- (note particularly that an apostrophe within a contracted word
- could look like the beginning of a character constant), and no
- newlines inside quotes. Therefore, natural-language comments
- and pseudocode should always be written between the "official"
- comment delimiters /* and */. (But see also questions 6.7 and
- 17.10.)
-
- References: ANSI Sec. 2.1.1.2 p. 6, Sec. 3.1 p. 19 line 37.
-
- 5.10: Can I declare main as void, to shut off these annoying "main
- returns no value" messages? (I'm calling exit(), so main
- doesn't return.)
-
- A: No. main must be declared as returning an int, and as taking
- either zero or two arguments (of the appropriate type). If
- you're calling exit() but still getting warnings, you'll have to
- insert a redundant return statement (or use some kind of
- "notreached" directive, if available).
-
- References: ANSI Sec. 2.1.2.2.1 pp. 7-8.
-
- 5.11: Is exit(status) truly equivalent to returning status from main?
-
- A: Yes, except under a few older, nonconforming systems.
-
- References: ANSI Sec. 2.1.2.2.3 p. 8.
-
- 5.12: Why does the ANSI Standard not guarantee more than six monocase
- characters of external identifier significance?
-
- A: The problem is older linkers which are under the control of
- neither the ANSI standard nor the C compiler developers on the
- systems which have them. The limitation is only that
- identifiers be _significant_ in the first six characters, not
- that they be restricted to six characters in length. This
- limitation is annoying, but certainly not unbearable, and is
- marked in the Standard as "obsolescent," i.e. a future revision
- will likely relax it.
-
- This concession to current, restrictive linkers really had to be
- made, no matter how vehemently some people oppose it. (The
- Rationale notes that its retention was "most painful.") If you
- disagree, or have thought of a trick by which a compiler
- burdened with a restrictive linker could present the C
- programmer with the appearance of more significance in external
- identifiers, read the excellently-worded section 3.1.2 in the
- X3.159 Rationale (see question 5.1), which discusses several
- such schemes and explains why they could not be mandated.
-
- References: ANSI Sec. 3.1.2 p. 21, Sec. 3.9.1 p. 96, Rationale
- Sec. 3.1.2 pp. 19-21.
-
- 5.13: What is the difference between memcpy and memmove?
-
- A: memmove offers guaranteed behavior if the source and destination
- arguments overlap. memcpy makes no such guarantee, and may
- therefore be more efficiently implementable. When in doubt,
- it's safer to use memmove.
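-
- A typical overlapping case is shifting the contents of a buffer
- within itself. A sketch, with a hypothetical function name:
-
```c
#include <string.h>

/* Hypothetical helper: insert c at the front of the string in buf.
   buf must have room for strlen(buf) + 2 characters. */
void prepend_char(char *buf, char c)
{
    /* Source and destination overlap, so memmove is required;
       the + 1 moves the terminating \0 as well. */
    memmove(buf + 1, buf, strlen(buf) + 1);
    buf[0] = c;
}
```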
-
- References: ANSI Secs. 4.11.2.1, 4.11.2.2, Rationale
- Sec. 4.11.2 .
-
- 5.14: My compiler is rejecting the simplest possible test programs,
- with all kinds of syntax errors.
-
- A: Perhaps it is a pre-ANSI compiler, unable to accept function
- prototypes and the like.
-
- 5.15: Why won't the Frobozz Magic C Compiler, which claims to be ANSI
- compliant, accept this code? I know that the code is ANSI,
- because gcc accepts it.
-
- A: Most compilers support a few non-Standard extensions, gcc more
- so than most. Are you sure that the code being rejected doesn't
- rely on such an extension? It is usually a bad idea to perform
- experiments with a particular compiler to determine properties
- of a language; the applicable standard may permit variations, or
- the compiler may be wrong.
-
- 5.16: Is char a[3] = "abc"; legal? What does it mean?
-
- A: Yes, it is legal; it declares an array of size three,
- initialized with the three characters 'a', 'b', and 'c', without
- the usual terminating '\0' character.
-
- References: ANSI Sec. 3.5.7 pp. 72-3.
-
- 5.17: What are #pragmas and what are they good for?
-
- A: The #pragma directive provides a single, well-defined "escape
- hatch" which can be used for all sorts of implementation-
- specific controls and extensions: source listing control,
- structure packing, warning suppression (like the old lint
- /* NOTREACHED */ comments), etc.
-
- References: ANSI Sec. 3.8.6 .
-
-
- Section 6. C Preprocessor
-
- 6.1: How can I write a generic macro to swap two values?
-
- A: There is no good answer to this question. If the values are
- integers, a well-known trick using exclusive-OR could perhaps be
- used, but it will not work for floating-point values or
- pointers, or if the two values are the same variable (and the
- "obvious" supercompressed implementation for integral types
- a^=b^=a^=b is in fact illegal due to multiple side-effects,
- and...). If the macro is intended to be used on values of
- arbitrary type (the usual goal), it cannot use a temporary,
- since it does not know what type of temporary it needs, and
- standard C does not provide a typeof operator.
-
- The best all-around solution is probably to forget about using a
- macro, unless you're willing to pass in the type as a third
- argument.
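-
- A sketch of the three-argument form, using the do/while(0)
- wrapper described in question 6.3. The macro name and the name
- of its temporary are hypothetical, and each argument is
- evaluated more than once, so arguments with side effects must be
- avoided:
-
```c
/* Hypothetical macro: the caller supplies the type as the third
   argument. */
#define SWAP(a, b, type) do { \
        type swap_tmp_ = (a); \
        (a) = (b);            \
        (b) = swap_tmp_;      \
    } while (0)
```
-
- It is invoked as, for example, SWAP(x, y, int); or
- SWAP(d, e, double);.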
-
- 6.2: I have some old code that tries to construct identifiers with a
- macro like
-
- #define Paste(a, b) a/**/b
-
- but it doesn't work any more.
-
- A: That comments disappeared entirely and could therefore be used
- for token pasting was an undocumented feature of some early
- preprocessor implementations, notably Reiser's. ANSI affirms
- (as did K&R) that comments are replaced with white space.
- However, since the need for pasting tokens was demonstrated and
- real, ANSI introduced a well-defined token-pasting operator, ##,
- which can be used like this:
-
- #define Paste(a, b) a##b
-
- (See also question 5.4.)
-
- References: ANSI Sec. 3.8.3.3 p. 91, Rationale pp. 66-7.
-
- 6.3: What's the best way to write a multi-statement cpp macro?
-
- A: The usual goal is to write a macro that can be invoked as if it
- were a single function-call statement. This means that the
- "caller" will be supplying the final semicolon, so the macro
- body should not. The macro body cannot be a simple brace-
- delineated compound statement, because syntax errors would
- result if it were invoked (apparently as a single statement, but
- with a resultant extra semicolon) as the if branch of an if/else
- statement with an explicit else clause.
-
- The traditional solution is to use
-
- #define Func() do { \
- /* declarations */ \
- stmt1; \
- stmt2; \
- /* ... */ \
- } while(0) /* (no trailing ; ) */
-
- When the "caller" appends a semicolon, this expansion becomes a
- single statement regardless of context. (An optimizing compiler
- will remove any "dead" tests or branches on the constant
- condition 0, although lint may complain.)
-
- If all of the statements in the intended macro are simple
- expressions, with no declarations or loops, another technique is
- to write a single, parenthesized expression using one or more
- comma operators. (See the example under question 6.10 below.
- This technique also allows a value to be "returned.")
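-
- To illustrate why the do/while(0) form matters, here is a sketch
- of such a macro used as the if branch of an if/else. The macro
- and function names are hypothetical:
-
```c
#define CLEAR_PAIR(a, b) do { \
        (a) = 0;              \
        (b) = 0;              \
    } while (0)

/* Hypothetical caller: with a bare { ... } macro body, the
   semicolon after the invocation would end the if statement and
   leave the else dangling. */
int reset_or_bump(int reset, int *x, int *y)
{
    if (reset)
        CLEAR_PAIR(*x, *y);     /* expands to a single statement */
    else
        (*x)++;
    return *x + *y;
}
```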
-
- References: CT&P Sec. 6.3 pp. 82-3.
-
- 6.4: Is it acceptable for one header file to #include another?
-
- A: There has been considerable debate surrounding this question.
- Many people believe that "nested #include files" are to be
- avoided: the prestigious Indian Hill Style Guide (see question
- 14.3) disparages them; they can make it harder to find relevant
- definitions; they can lead to multiple-declaration errors if a
- file is #included twice; and they make manual Makefile
- maintenance very difficult. On the other hand, they make it
- possible to use header files in a modular way (a header file
- #includes what it needs itself, rather than requiring each
- #includer to do so, a requirement that can lead to intractable
- headaches); a tool like grep (or a tags file) makes it easy to
- find definitions no matter where they are; a popular trick:
-
- #ifndef HEADER_FILE_NAME
- #define HEADER_FILE_NAME
- ...header file contents...
- #endif
-
- makes a header file "idempotent" so that it can safely be
- #included multiple times; and automated Makefile maintenance
- tools (which are a virtual necessity in large projects anyway)
- handle dependency generation in the face of nested #include
- files easily. See also section 14.
-
- 6.5: Does the sizeof operator work in preprocessor #if directives?
-
- A: No. Preprocessing happens during an earlier pass of
- compilation, before type names have been parsed. Consider using
- the predefined constants in ANSI's <limits.h>, if applicable, or
- a "configure" script, instead. (Better yet, try to write code
- which is inherently insensitive to type sizes.)
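-
- For instance, a choice that might be attempted with
- #if sizeof(int) >= 4 can often be made by testing a <limits.h>
- constant instead. A sketch, with a hypothetical typedef name:
-
```c
#include <limits.h>

/* Hypothetical typedef: a signed type with at least 32 bits. */
#if INT_MAX >= 2147483647
typedef int atleast32;
#else
typedef long atleast32;
#endif
```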
-
- References: ANSI Sec. 2.1.1.2 pp. 6-7, Sec. 3.8.1 p. 87
- footnote 83.
-
- 6.6: How can I use a preprocessor #if expression to tell if a machine
- is big-endian or little-endian?
-
- A: You probably can't. (Preprocessor arithmetic uses only long
- ints, and there is no concept of addressing.) Are you sure you
- need to know the machine's endianness explicitly? Usually it's
- better to write code which doesn't care.
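-
- For example, a value stored externally as four 8-bit bytes, most
- significant byte first, can be read and written with shifts,
- which operate on values rather than on storage layout. A sketch
- (the function names are hypothetical):
-
```c
/* Hypothetical helper: assemble a 32-bit value from the four bytes
   at p, most significant byte first. */
unsigned long get_be32(const unsigned char *p)
{
    return ((unsigned long)p[0] << 24) |
           ((unsigned long)p[1] << 16) |
           ((unsigned long)p[2] << 8)  |
            (unsigned long)p[3];
}

/* Hypothetical helper: store the low 32 bits of value at p in the
   same byte order. */
void put_be32(unsigned char *p, unsigned long value)
{
    p[0] = (unsigned char)((value >> 24) & 0xFF);
    p[1] = (unsigned char)((value >> 16) & 0xFF);
    p[2] = (unsigned char)((value >> 8) & 0xFF);
    p[3] = (unsigned char)(value & 0xFF);
}
```
-
- The same source works unchanged on big-endian and little-endian
- machines.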
-
- 6.7: I've got this tricky processing I want to do at compile time and
- I can't figure out a way to get cpp to do it.
-
- A: cpp is not intended as a general-purpose preprocessor. Rather
- than forcing it to do something inappropriate, consider writing
- your own little special-purpose preprocessing tool, instead.
- You can easily get a utility like make(1) to run it for you
- automatically.
-
- If you are trying to preprocess something other than C, consider
- using a general-purpose preprocessor (such as m4).
-
- 6.8: I have some code which contains far too many #ifdef's for my
- taste. How can I preprocess the code to leave only one
- conditional compilation set, without running it through cpp and
- expanding all of the #include's and #define's as well?
-
- A: There is a program floating around called unifdef which does
- exactly this. (See question 17.8.)
-
- 6.9: How can I list all of the pre#defined identifiers?
-
- A: There's no standard way, although it is a frequent need. The
- most expedient way is probably to extract printable strings from
- the compiler or preprocessor executable with something like the
- Unix strings(1) utility.
-
- 6.10: How can I write a cpp macro which takes a variable number of
- arguments?
-
- A: One popular trick is to define the macro with a single argument,
- and call it with a double set of parentheses, which appear to
- the preprocessor to indicate a single argument:
-
- #define DEBUG(args) (printf("DEBUG: "), printf args)
-
- if(n != 0) DEBUG(("n is %d\n", n));
-
- The obvious disadvantage is that the caller must always remember
- to use the extra parentheses. Another solution is to use
- different macros (DEBUG1, DEBUG2, etc.) depending on the number
- of arguments. (It is often better to use a bona-fide function,
- which can take a variable number of arguments in a well-defined
- way. See questions 7.1 and 7.2 below.)
-
-
- Section 7. Variable-Length Argument Lists
-
- 7.1: How can I write a function that takes a variable number of
- arguments?
-
- A: Use the <stdarg.h> header (or, if you must, the older
- <varargs.h>).
-
- Here is a function which concatenates an arbitrary number of
- strings into malloc'ed memory:
-
- #include <stdlib.h> /* for malloc, NULL, size_t */
- #include <stdarg.h> /* for va_ stuff */
- #include <string.h> /* for strcat et al */
-
- char *vstrcat(char *first, ...)
- {
- size_t len = 0;
- char *retbuf;
- va_list argp;
- char *p;
-
- if(first == NULL)
- return NULL;
-
- len = strlen(first);
-
- va_start(argp, first);
-
- while((p = va_arg(argp, char *)) != NULL)
- len += strlen(p);
-
- va_end(argp);
-
- retbuf = malloc(len + 1); /* +1 for trailing \0 */
-
- if(retbuf == NULL)
- return NULL; /* error */
-
- (void)strcpy(retbuf, first);
-
- va_start(argp, first);
-
- while((p = va_arg(argp, char *)) != NULL)
- (void)strcat(retbuf, p);
-
- va_end(argp);
-
- return retbuf;
- }
-
- Usage is something like
-
- char *str = vstrcat("Hello, ", "world!", (char *)NULL);
-
- Note the cast on the last argument. (Also note that the caller
- must free the returned, malloc'ed storage.)
-
- Under a pre-ANSI compiler, rewrite the function definition
- without a prototype ("char *vstrcat(first) char *first; {"),
- include <stdio.h> rather than <stdlib.h>, add "extern
- char *malloc();", and use int instead of size_t. You may also
- have to delete the (void) casts, and use the older varargs
- package instead of stdarg. See the next question for hints.
-
- Remember that in variable-length argument lists, function
- prototypes do not supply parameter type information; therefore,
- default argument promotions apply (see question 5.7), and null
- pointer arguments must be typed explicitly (see question 1.2).
-
- References: K&R II Sec. 7.3 p. 155, Sec. B7 p. 254; H&S
- Sec. 13.4 pp. 286-9; ANSI Secs. 4.8 through 4.8.1.3 .
-
- 7.2: How can I write a function that takes a format string and a
- variable number of arguments, like printf, and passes them to
- printf to do most of the work?
-
- A: Use vprintf, vfprintf, or vsprintf.
-
- Here is an "error" routine which prints an error message,
- preceded by the string "error: " and terminated with a newline:
-
- #include <stdio.h>
- #include <stdarg.h>
-
- void
- error(char *fmt, ...)
- {
- va_list argp;
- fprintf(stderr, "error: ");
- va_start(argp, fmt);
- vfprintf(stderr, fmt, argp);
- va_end(argp);
- fprintf(stderr, "\n");
- }
-
- To use the older <varargs.h> package, instead of <stdarg.h>,
- change the function header to:
-
- void error(va_alist)
- va_dcl
- {
- char *fmt;
-
- change the va_start line to
-
- va_start(argp);
-
- and add the line
-
- fmt = va_arg(argp, char *);
-
- between the calls to va_start and vfprintf. (Note that there is
- no semicolon after va_dcl.)
-
- References: K&R II Sec. 8.3 p. 174, Sec. B1.2 p. 245; H&S
- Sec. 17.12 p. 337; ANSI Secs. 4.9.6.7, 4.9.6.8, 4.9.6.9 .
-
- 7.3: How can I discover how many arguments a function was actually
- called with?
-
- A: This information is not available to a portable program. Some
- systems provide a nonstandard nargs() function, but its use is
- questionable, since it typically returns the number of words
- passed, not the number of arguments. (Floating point values and
- structures are usually passed as several words.)
-
- Any function which takes a variable number of arguments must be
- able to determine from the arguments themselves how many of them
- there are. printf-like functions do this by looking for
- formatting specifiers (%d and the like) in the format string
- (which is why these functions fail badly if the format string
- does not match the argument list). Another common technique
- (useful when the arguments are all of the same type) is to use a
- sentinel value (often 0, -1, or an appropriately-cast null
- pointer) at the end of the list (see the execl and vstrcat
- examples under questions 1.2 and 7.1 above).
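-
- As a minimal sketch of the sentinel technique (the function name
- sum_ints and the choice of -1 as the sentinel are illustrative
- assumptions, not part of any library):
-
```c
#include <stdarg.h>

/* Sum a list of int arguments terminated by the sentinel -1.
   The sentinel itself is not included in the sum. */
int sum_ints(int first, ...)
{
    va_list argp;
    int total = 0;
    int n;

    va_start(argp, first);
    for (n = first; n != -1; n = va_arg(argp, int))
        total += n;
    va_end(argp);

    return total;
}
```
-
- A call such as sum_ints(1, 2, 3, -1) yields 6. The obvious cost is
- that -1 can no longer be passed as an ordinary value.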
-
- 7.4: I can't get the va_arg macro to pull in an argument of type
- pointer-to-function.
-
- A: The type-rewriting games which the va_arg macro typically plays
- are stymied by overly-complicated types such as pointer-to-
- function. If you use a typedef for the function pointer type,
- however, all will be well.
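-
- A sketch of the typedef approach follows. The names intfunc,
- apply_chain, add_one, and twice are hypothetical, chosen only for
- illustration:
-
```c
#include <stdarg.h>

/* Hypothetical typedef: pointer to function taking int, returning int. */
typedef int (*intfunc)(int);

/* Apply each function in a null-terminated list to x, in order. */
int apply_chain(int x, ...)
{
    va_list argp;
    intfunc f;

    va_start(argp, x);
    /* The typedef name gives va_arg a simple type to work with. */
    while ((f = va_arg(argp, intfunc)) != (intfunc)0)
        x = f(x);
    va_end(argp);

    return x;
}

/* Two sample functions to pass in. */
int add_one(int n) { return n + 1; }
int twice(int n)   { return n * 2; }
```
-
- For example, apply_chain(3, add_one, twice, (intfunc)0) yields 8.
- Note that the terminating null pointer is cast to the function
- pointer type.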
-
- References: ANSI Sec. 4.8.1.2 p. 124.
-
- 7.5: How can I write a function which takes a variable number of
- arguments and passes them to some other function (which takes a
- variable number of arguments)?
-
- A: In general, you cannot. You must provide a version of that
- other function which accepts a va_list pointer, as does vfprintf
- in the example above. If the arguments must be passed directly
- as actual arguments (not indirectly through a va_list pointer)
- to another function which is itself variadic (for which you do
- not have the option of creating an alternate, va_list-accepting
- version) no portable solution is possible. (The problem can be
- solved by resorting to machine-specific assembly language.)
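-
- When you do control the other function, the usual arrangement can
- be sketched as follows. The names vformat_tagged, format_tagged,
- and format_warning are hypothetical, and the caller is assumed to
- supply a buffer large enough for the result:
-
```c
#include <stdio.h>
#include <stdarg.h>

/* The va_list-accepting "worker," which any variadic function can call.
   Writes "tag: " followed by the formatted text; returns the length. */
int vformat_tagged(char *buf, const char *tag, const char *fmt, va_list argp)
{
    int n = sprintf(buf, "%s: ", tag);
    return n + vsprintf(buf + n, fmt, argp);
}

/* The variadic front end merely packages its arguments. */
int format_tagged(char *buf, const char *tag, const char *fmt, ...)
{
    va_list argp;
    int n;

    va_start(argp, fmt);
    n = vformat_tagged(buf, tag, fmt, argp);
    va_end(argp);
    return n;
}

/* A second variadic function which reuses the same worker. */
int format_warning(char *buf, const char *fmt, ...)
{
    va_list argp;
    int n;

    va_start(argp, fmt);
    n = vformat_tagged(buf, "warning", fmt, argp);
    va_end(argp);
    return n;
}
```
-
- format_warning cannot call format_tagged directly, but both can
- share vformat_tagged.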
-
- 7.6: How can I call a function with an argument list built up at run
- time?
-
- A: There is no guaranteed or portable way to do this. If you're
- curious, ask this list's editor, who has a few wacky ideas you
- could try... (See also question 16.10.)
-
-
- Section 8. Boolean Expressions and Variables
-
- 8.1: What is the right type to use for boolean values in C? Why
- isn't it a standard type? Should #defines or enums be used for
- the true and false values?
-
- A: C does not provide a standard boolean type, because picking one
- involves a space/time tradeoff which is best decided by the
- programmer. (Using an int for a boolean may be faster, while
- using char may save data space.)
-
- The choice between #defines and enums is arbitrary and not
- terribly interesting (see also question 9.1). Use any of
-
- #define TRUE 1
- #define FALSE 0
-
- #define YES 1
- #define NO 0
-
- enum bool {false, true};
-
- enum bool {no, yes};
-
- or use raw 1 and 0, as long as you are consistent within one
- program or project. (An enum may be preferable if your debugger
- expands enum values when examining variables.)
-
- Some people prefer variants like
-
- #define TRUE (1==1)
- #define FALSE (!TRUE)
-
- or define "helper" macros such as
-
- #define Istrue(e) ((e) != 0)
-
- These don't buy anything (see question 8.2 below; see also
- question 1.6).
-
- 8.2: Isn't #defining TRUE to be 1 dangerous, since any nonzero value
- is considered "true" in C? What if a built-in boolean or
- relational operator "returns" something other than 1?
-
- A: It is true (sic) that any nonzero value is considered true in C,
- but this applies only "on input", i.e. where a boolean value is
- expected. When a boolean value is generated by a built-in
- operator, it is guaranteed to be 1 or 0. Therefore, the test
-
- if((a == b) == TRUE)
-
- will work as expected (as long as TRUE is 1), but it is
- obviously silly. In general, explicit tests against TRUE and
- FALSE are undesirable, because some library functions (notably
- isupper, isalpha, etc.) return, on success, a nonzero value
- which is _not_ necessarily 1. (Besides, if you believe that
- "if((a == b) == TRUE)" is an improvement over "if(a == b)", why
- stop there? Why not use "if(((a == b) == TRUE) == TRUE)"?) A
- good rule of thumb is to use TRUE and FALSE (or the like) only
- for assignment to a Boolean variable, or as the return value
- from a Boolean function, never in a comparison.
-
- The preprocessor macros TRUE and FALSE (and, of course, NULL)
- are used for code readability, not because the underlying values
- might ever change. (See also question 1.7.)
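-
- The rule of thumb might look like this in practice (all_upper is
- a hypothetical function used only for illustration):
-
```c
#include <ctype.h>

#define TRUE  1
#define FALSE 0

/* Return TRUE if every character of s is an upper-case letter. */
int all_upper(const char *s)
{
    for (; *s != '\0'; s++) {
        /* Test the result directly: isupper may return any nonzero
           value, so comparing it against TRUE could fail. */
        if (!isupper((unsigned char)*s))
            return FALSE;
    }
    return TRUE;
}
```
-
- TRUE and FALSE appear only as return values; the caller would
- write "if(all_upper(s))", not "if(all_upper(s) == TRUE)".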
-
- References: K&R I Sec. 2.7 p. 41; K&R II Sec. 2.6 p. 42,
- Sec. A7.4.7 p. 204, Sec. A7.9 p. 206; ANSI Secs. 3.3.3.3, 3.3.8,
- 3.3.9, 3.3.13, 3.3.14, 3.3.15, 3.6.4.1, 3.6.5; Achilles and the
- Tortoise.
-
-
- Section 9. Structs, Enums, and Unions
-
- 9.1: What is the difference between an enum and a series of
- preprocessor #defines?
-
- A: At the present time, there is little difference. Although many
- people might have wished otherwise, the ANSI standard says that
- enumerations may be freely intermixed with integral types,
- without errors. (If such intermixing were disallowed without
- explicit casts, judicious use of enums could catch certain
- programming errors.)
-
- Some advantages of enums are that the numeric values are
- automatically assigned, that a debugger may be able to display
- the symbolic values when enum variables are examined, and that
- they obey block scope. (A compiler may also generate nonfatal
- warnings when enums and ints are indiscriminately mixed, since
- doing so can still be considered bad style even though it is not
- strictly illegal.) A disadvantage is that the programmer has
- little control over the size (or over those nonfatal warnings).
-
- References: K&R II Sec. 2.3 p. 39, Sec. A4.2 p. 196; H&S
- Sec. 5.5 p. 100; ANSI Secs. 3.1.2.5, 3.5.2, 3.5.2.2 .
-
- 9.2: I heard that structures could be assigned to variables and
- passed to and from functions, but K&R I says not.
-
- A: What K&R I said was that the restrictions on struct operations
- would be lifted in a forthcoming version of the compiler, and in
- fact struct assignment and passing were fully functional in
- Ritchie's compiler even as K&R I was being published. Although
- a few early C compilers lacked struct assignment, all modern
- compilers support it, and it is part of the ANSI C standard, so
- there should be no reluctance to use it.
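-
- A brief illustration (struct point and translate are hypothetical
- names):
-
```c
struct point {
    int x, y;
};

/* Takes a struct by value and returns a new struct by value. */
struct point translate(struct point p, int dx, int dy)
{
    p.x += dx;      /* modifies only the local copy */
    p.y += dy;
    return p;
}
```
-
- Given "struct point a = {1, 2}, b;", the statements "b = a;" and
- "b = translate(a, 10, 20);" are both valid, and neither changes a.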
-
- References: K&R I Sec. 6.2 p. 121; K&R II Sec. 6.2 p. 129; H&S
- Sec. 5.6.2 p. 103; ANSI Secs. 3.1.2.5, 3.2.2.1, 3.3.16 .
-
- 9.3: How does struct passing and returning work?
-
- A: When structures are passed as arguments to functions, the entire
- struct is typically pushed on the stack, using as many words as
- are required. (Programmers often choose to use pointers to
- structures instead, precisely to avoid this overhead.)
-
- Structures are often returned from functions in a location
- pointed to by an extra, compiler-supplied "hidden" argument to
- the function. Some older compilers used a special, static
- location for structure returns, although this made struct-valued
- functions nonreentrant, which ANSI C disallows.
-
- References: ANSI Sec. 2.2.3 p. 13.
-
- 9.4: The following program works correctly, but it dumps core after
- it finishes. Why?
-
- struct list
- {
- char *item;
- struct list *next;
- }
-
- /* Here is the main program. */
-
- main(argc, argv)
- ...
-
- A: A missing semicolon causes the compiler to believe that main
- returns a structure. (The connection is hard to see because of
- the intervening comment.) Since struct-valued functions are
- usually implemented by adding a hidden return pointer, the
- generated code for main() tries to accept three arguments,
- although only two are passed (in this case, by the C start-up
- code). See also question 17.15.
-
- References: CT&P Sec. 2.3 pp. 21-2.
-
- 9.5: Why can't you compare structs?
-
- A: There is no reasonable way for a compiler to implement struct
- comparison which is consistent with C's low-level flavor. A
- byte-by-byte comparison could be invalidated by random bits
- present in unused "holes" in the structure (such padding is used
- to keep the alignment of later fields correct). A field-by-
- field comparison would require unacceptable amounts of
- repetitive, in-line code for large structures.
-
- If you want to compare two structures, you must write your own
- function to do so. C++ would let you arrange for the ==
- operator to map to your function.
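-
- Such a function might be sketched as follows, for a hypothetical
- struct employee:
-
```c
#include <string.h>

struct employee {
    char   name[20];
    int    id;
    double salary;
};

/* Return nonzero if the two structures are equal, field by field.
   Padding bytes are never examined. */
int employee_equal(const struct employee *a, const struct employee *b)
{
    return strcmp(a->name, b->name) == 0
        && a->id == b->id
        && a->salary == b->salary;
}
```
-
- Each field is compared in the way appropriate to its type (strcmp
- for the string, == for the numbers), which a generic byte-by-byte
- comparison could not know to do.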
-
- References: K&R II Sec. 6.2 p. 129; H&S Sec. 5.6.2 p. 103; ANSI
- Rationale Sec. 3.3.9 p. 47.
-
- 9.6: I came across some code that declared a structure like this:
-
- struct name
- {
- int namelen;
- char name[1];
- };
-
- and then did some tricky allocation to make the name array act
- like it had several elements. Is this legal and/or portable?
-
- A: This technique is popular, although Dennis Ritchie has called it
- "unwarranted chumminess with the compiler." The ANSI C standard
- allows it only implicitly. It seems to be portable to all known
- implementations. (Compilers which check array bounds carefully
- might issue warnings.)
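-
- The "tricky allocation" typically resembles the following sketch.
- The function makename is a hypothetical name, and the code relies
- on exactly the chumminess described above:
-
```c
#include <stdlib.h>
#include <string.h>

struct name {
    int  namelen;
    char name[1];
};

/* Allocate a struct name with room for the whole string.  The one
   declared array element already accounts for the trailing '\0'. */
struct name *makename(const char *s)
{
    size_t len = strlen(s);
    struct name *np = malloc(sizeof(struct name) + len);

    if (np == NULL)
        return NULL;
    np->namelen = (int)len;
    strcpy(np->name, s);    /* deliberately runs past name[1] */
    return np;
}
```
-
- The caller releases the whole object with a single free().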
-
- References: ANSI Rationale Sec. 3.5.4.2 pp. 54-5.
-
- 9.7: How can I determine the byte offset of a field within a
- structure?
-
- A: ANSI C defines the offsetof macro, which should be used if
- available; see <stddef.h>. If you don't have it, a suggested
- implementation is
-
- #define offsetof(type, mem) ((size_t) \
- ((char *)&((type *) 0)->mem - (char *)((type *) 0)))
-
- This implementation is not 100% portable; some compilers may
- legitimately refuse to accept it.
-
- See the next question for a usage hint.
-
- References: ANSI Sec. 4.1.5, Rationale Sec. 3.5.4.2 p. 55.
-
- 9.8: How can I access structure fields by name at run time?
-
- A: Build a table of names and offsets, using the offsetof() macro.
- The offset of field b in struct a is
-
- offsetb = offsetof(struct a, b)
-
- If structp is a pointer to an instance of this structure, and b
- is an int field with offset as computed above, b's value can be
- set indirectly with
-
- *(int *)((char *)structp + offsetb) = value;
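-
- Putting the pieces together, a table-driven version could be
- sketched like this. The table fields and the function set_field
- are hypothetical, and only int fields are handled:
-
```c
#include <stddef.h>
#include <string.h>

struct a {
    int b;
    int c;
};

/* Hypothetical lookup table mapping field names to byte offsets. */
static const struct {
    const char *name;
    size_t offset;
} fields[] = {
    { "b", offsetof(struct a, b) },
    { "c", offsetof(struct a, c) }
};

/* Set the int field called name to value.
   Returns 0 on success, -1 if there is no such field. */
int set_field(struct a *structp, const char *name, int value)
{
    size_t i;

    for (i = 0; i < sizeof fields / sizeof fields[0]; i++) {
        if (strcmp(fields[i].name, name) == 0) {
            *(int *)((char *)structp + fields[i].offset) = value;
            return 0;
        }
    }
    return -1;
}
```
-
- A structure with fields of several types would also need a type
- code in each table entry, so that the right cast could be chosen.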
-
- 9.9: Why does sizeof report a larger size than I expect for a
- structure type, as if there were padding at the end?
-
- A: Structures may have this padding (as well as internal padding;
- see also question 9.5), so that alignment properties will be
- preserved when an array of contiguous structures is allocated.
-
- 9.10: My compiler is leaving holes in structures, which is wasting
- space and preventing "binary" I/O to external data files. Can I
- turn off the padding, or otherwise control the alignment of
- structs?
-
- A: Your compiler may provide an extension to give you this control
- (perhaps a #pragma), but there is no standard method. See also
- question 17.2.
-
- 9.11: Can I initialize unions?
-
- A: ANSI Standard C allows an initializer for the first member of a
- union. There is no standard way of initializing the other
- members (nor, under a pre-ANSI compiler, is there generally any
- way of initializing any of them).
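-
- For instance (union number, answer, and set_double are
- hypothetical names):
-
```c
union number {
    int    i;   /* first member: the only one an initializer can set */
    double d;
};

/* Initializes answer.i, because i is the first member. */
union number answer = { 42 };

/* Other members must be given values by ordinary assignment. */
void set_double(union number *up, double value)
{
    up->d = value;
}
```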
-
-
- Section 10. Declarations
-
- 10.1: How do you decide which integer type to use?
-
- A: If you might need large values (above 32767 or below -32767),
- use long. Otherwise, if space is very important (there are
- large arrays or many structures), use short. Otherwise, use
- int. If well-defined overflow characteristics are important
- and/or negative values are not, use the corresponding unsigned
- types. (But beware of mixing signed and unsigned in
- expressions.) Similar arguments apply when deciding between
- float and double.
-
- Although char or unsigned char can be used as a "tiny" int type,
- doing so is often more trouble than it's worth, due to
- unpredictable sign extension and increased code size.
-
- These rules obviously don't apply if the address of a variable
- is taken and must have a particular type.
-
- If for some reason you need to declare something with an _exact_
- size (usually the only good reason for doing so is when
- attempting to conform to some externally-imposed storage layout,
- but see question 17.2), be sure to encapsulate the choice behind
- an appropriate typedef.
-
- 10.2: What should the 64-bit type on new, 64-bit machines be?
-
- A: Some vendors of C products for 64-bit machines support 64-bit
- long ints. Others fear that too much existing code depends on
- sizeof(int) == sizeof(long) == 32 bits, and introduce a new 64-
- bit long long int type instead.
-
- Programmers interested in writing portable code should therefore
- insulate their 64-bit type needs behind appropriate typedefs.
- Vendors who feel compelled to introduce a new long long int type
- should advertise it as being "at least 64 bits" (which is truly
- new; a type traditional C doesn't have), and not "exactly 64
- bits."
-
- 10.3: I can't seem to define a linked list successfully. I tried
-
- typedef struct
- {
- char *item;
- NODEPTR next;
- } *NODEPTR;
-
- but the compiler gave me error messages. Can't a struct in C
- contain a pointer to itself?
-
- A: Structs in C can certainly contain pointers to themselves; the
- discussion and example in section 6.5 of K&R make this clear.
- The problem with this example is that the NODEPTR typedef is not
- complete at the point where the "next" field is declared. To
- fix it, first give the structure a tag ("struct node"). Then,
- declare the "next" field as "struct node *next;", and/or move
- the typedef declaration wholly before or wholly after the struct
- declaration. One corrected version would be
-
- struct node
- {
- char *item;
- struct node *next;
- };
-
- typedef struct node *NODEPTR;
-
- There are at least three other equivalently correct ways of
- arranging it.
-
- A similar problem, with a similar solution, can arise when
- attempting to declare a pair of typedef'ed mutually recursive
- structures.
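-
- For the mutually recursive case, one workable arrangement is to
- place both typedefs wholly before either structure declaration.
- The structure and function names below are hypothetical:
-
```c
#include <stddef.h>

/* Declare the typedefs first; the structures are incomplete here,
   but pointers to them are still well defined. */
typedef struct node  *NODEPTR;
typedef struct owner *OWNERPTR;

struct node {
    char *item;
    NODEPTR next;
    OWNERPTR owner;     /* refers to the other structure */
};

struct owner {
    char *name;
    NODEPTR first;      /* and vice versa */
};

/* Count the nodes on the list belonging to an owner. */
int count_nodes(OWNERPTR op)
{
    int n = 0;
    NODEPTR p;

    for (p = op->first; p != NULL; p = p->next)
        n++;
    return n;
}
```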
-
- References: K&R I Sec. 6.5 p. 101; K&R II Sec. 6.5 p. 139; H&S
- Sec. 5.6.1 p. 102; ANSI Sec. 3.5.2.3 .
-
- 10.4: How do I declare an array of pointers to functions returning
- pointers to functions returning pointers to characters?
-
- A: This question can be answered in at least three ways (all
- declare the hypothetical array with 5 elements):
-
- 1. char *(*(*a[5])())();
-
- 2. Build the declaration up in stages, using typedefs:
-
- typedef char *pc; /* pointer to char */
- typedef pc fpc(); /* function returning pointer to char */
- typedef fpc *pfpc; /* pointer to above */
- typedef pfpc fpfpc(); /* function returning... */
- typedef fpfpc *pfpfpc; /* pointer to... */
- pfpfpc a[5]; /* array of... */
-
- 3. Use the cdecl program, which turns English into C and vice
- versa:
-
- cdecl> declare a as array 5 of pointer to function returning
- pointer to function returning pointer to char
- char *(*(*a[5])())()
-
- cdecl can also explain complicated declarations, help with
- casts, and indicate which set of parentheses the arguments
- go in (for complicated function definitions, like the
- above). Versions of cdecl are in volume 14 of
- comp.sources.unix (see question 17.8) and K&R II.
-
- Any good book on C should explain how to read these complicated
- C declarations "inside out" to understand them ("declaration
- mimics use").
-
- References: K&R II Sec. 5.12 p. 122; H&S Sec. 5.10.1 p. 116.
-
- 10.5: I'm building a state machine with a bunch of functions, one for
- each state. I want to implement state transitions by having
- each function return a pointer to the next state function. I
- find a limitation in C's declaration mechanism: there's no way
- to declare these functions as returning a pointer to a function
- returning a pointer to a function returning a pointer to a
- function...
-
- A: You can't do it directly. Either have the function return a
- generic function pointer type, and apply a cast before calling
- through it; or have it return a structure containing only a
- pointer to a function returning that structure.
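-
- The second approach can be sketched as follows. The states idle
- and running, their transition rules, and the driver run_machine
- are all invented for illustration:
-
```c
#include <stddef.h>

/* A structure containing only a pointer to a function returning
   that structure. */
struct state {
    struct state (*func)(int input);
};

struct state idle(int input);
struct state running(int input);

/* idle: an input of 1 starts the machine; anything else is ignored. */
struct state idle(int input)
{
    struct state next;
    next.func = (input == 1) ? running : idle;
    return next;
}

/* running: 0 returns to idle, 2 halts (null pointer), else stay. */
struct state running(int input)
{
    struct state next;
    if (input == 0)
        next.func = idle;
    else if (input == 2)
        next.func = NULL;
    else
        next.func = running;
    return next;
}

/* Feed inputs to the machine, starting in idle, until the inputs run
   out or the machine halts.  Returns the number of inputs consumed. */
int run_machine(const int *inputs, int n)
{
    struct state s;
    int i;

    s.func = idle;
    for (i = 0; i < n && s.func != NULL; i++)
        s = s.func(inputs[i]);
    return i;
}
```
-
- No casts are needed, at the cost of one extra ".func" at each
- call.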
-
- 10.6: What's the best way to declare and define global variables?
-
- A: First, though there can be many _declarations_ (and in many
- translation units) of a single "global" (strictly speaking,
- "external") variable (or function), there must be exactly one
- _definition_. (The definition is the declaration that actually
- allocates space, and provides an initialization value, if any.)
- It is best to place the definition in some central (to the
- program, or to the module) .c file, with an external declaration
- in a header (".h") file, which is #included wherever the
- declaration is needed. The .c file containing the definition
- should also #include the header file containing the external
- declaration, so that the compiler can check that the
- declarations match.
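-
- A sketch of this arrangement follows. The file names counter.h
- and counter.c and the identifiers are hypothetical, and the two
- files are shown together as one unit for brevity:
-
```c
/* --- counter.h: external declarations, #included wherever needed --- */
extern int error_count;
extern void note_error(void);

/* --- counter.c: the one definition.  It would #include "counter.h",
       so the compiler sees both and can check that they agree. --- */
int error_count = 0;

void note_error(void)
{
    error_count++;
}
```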
-
- This rule promotes a high degree of portability, and is
- consistent with the requirements of the ANSI C Standard. Note
- that Unix compilers and linkers typically use a "common model"
- which allows multiple (uninitialized) definitions. A few very
- odd systems may require an explicit initializer to distinguish a
- definition from an external declaration.
-
- It is possible to use preprocessor tricks to arrange that the
- declaration need only be typed once, in the header file, and
- "turned into" a definition, during exactly one #inclusion, via a
- special #define.
-
- References: K&R I Sec. 4.5 pp. 76-7; K&R II Sec. 4.4 pp. 80-1;
- ANSI Sec. 3.1.2.2 (esp. Rationale), Secs. 3.7, 3.7.2,
- Sec. F.5.11 .
-
- 10.7: I finally figured out the syntax for declaring pointers to
- functions, but now how do I initialize one?
-
- A: Use something like
-
- extern int func();
- int (*fp)() = func;
-
- When the name of a function appears in an expression but is not
- being called (i.e. is not followed by a "("), it "decays" into a
- pointer (i.e. it has its address implicitly taken), much as an
- array name does.
-
- An explicit extern declaration for the function is normally
- needed, since implicit external function declaration does not
- happen in this case (again, because the function name is not
- followed by a "(").
-
- 10.8: I've seen different methods used for calling through pointers to
- functions. What's the story?
-
- A: Originally, a pointer to a function had to be "turned into" a
- "real" function, with the * operator (and an extra pair of
- parentheses, to keep the precedence straight), before calling:
-
- int r, func(), (*fp)() = func;
- r = (*fp)();
-
- It can also be argued that functions are always called through
- pointers, but that "real" functions decay implicitly into
- pointers (in expressions, as they do in initializations) and so
- cause no trouble. This reasoning, made widespread through pcc
- and adopted in the ANSI standard, means that
-
- r = fp();
-
- is legal and works correctly, whether fp is a function or a
- pointer to one. (The usage has always been unambiguous; there
- is nothing you ever could have done with a function pointer
- followed by an argument list except call through it.) An
- explicit * is harmless, and still allowed (and recommended, if
- portability to older compilers is important).
-
- References: ANSI Sec. 3.3.2.2 p. 41, Rationale p. 41.
-
-
- Section 11. Stdio
-
- 11.1: Why doesn't this code:
-
- char c;
- while((c = getchar()) != EOF)...
-
- work?
-
- A: For one thing, the variable to hold getchar's return value must
- be an int. getchar can return all possible character values, as
- well as EOF. By passing getchar's return value through a char,
- either a normal character might be misinterpreted as EOF, or the
- EOF might be altered and so never seen.
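-
- A corrected loop, sketched here as a hypothetical function
- count_chars which reads from any stream:
-
```c
#include <stdio.h>

/* Count the characters remaining on fp.  c must be an int so that
   EOF can be distinguished from every valid character value. */
long count_chars(FILE *fp)
{
    int c;
    long n = 0;

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}
```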
-
- References: CT&P Sec. 5.1 p. 70.
-
- 11.2: Why doesn't the code scanf("%d", i); work?
-
- A: You must always pass addresses (in this case, &i) to scanf.
-
- 11.3: Why doesn't this code:
-
- double d;
- scanf("%f", &d);
-
- work?
-
- A: With scanf, use %lf for values of type double, and %f for float.
- (Note the discrepancy with printf, which uses %f for both double
- and float, due to C's default argument promotion rules.)
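-
- The distinction is shown below using sscanf, which follows the
- same conversion rules as scanf (parse_double and parse_float are
- hypothetical helper names):
-
```c
#include <stdio.h>

/* Parse a double from text; return 1 on success, 0 otherwise. */
int parse_double(const char *text, double *result)
{
    return sscanf(text, "%lf", result) == 1;    /* %lf: pointer to double */
}

/* Parse a float; note the different conversion specifier. */
int parse_float(const char *text, float *result)
{
    return sscanf(text, "%f", result) == 1;     /* %f: pointer to float */
}
```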
-
- 11.4: Why won't the code
-
- while(!feof(fp))
- fgets(buf, MAXLINE, fp);
-
- work?
-
- A: C's I/O is not like Pascal's. EOF is only indicated _after_ an
- input routine has tried to read, and has reached end-of-file.
- Usually, you should just check the return value of the input
- routine (fgets in this case); feof() is rarely needed.
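-
- For example, a loop controlled by fgets's return value might look
- like this (count_lines is a hypothetical function, and no line is
- assumed to be longer than MAXLINE-1 characters):
-
```c
#include <stdio.h>

#define MAXLINE 100

/* Count lines by testing fgets's return value, not feof(). */
int count_lines(FILE *fp)
{
    char buf[MAXLINE];
    int n = 0;

    while (fgets(buf, MAXLINE, fp) != NULL)
        n++;
    return n;
}
```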
-
- 11.5: Why does everyone say not to use gets()?
-
- A: It cannot be told the size of the buffer it's to read into, so
- it cannot be prevented from overflowing that buffer.
-
- 11.6: Why does errno contain ENOTTY after a call to printf?
-
- A: Many implementations of the stdio package adjust their behavior
- slightly if stdout is a terminal. To make the determination,
- these implementations perform an operation which fails (with
- ENOTTY) if stdout is not a terminal. Although the output
- operation goes on to complete successfully, errno still contains
- ENOTTY.
-
- References: CT&P Sec. 5.4 p. 73.
-
- 11.7: My program's prompts and intermediate output don't always show
- up on the screen, especially when I pipe the output through
- another program.
-
- A: It is best to use an explicit fflush(stdout) whenever output
- should definitely be visible. Several mechanisms attempt to
- perform the fflush for you, at the "right time," but they tend
- to apply only when stdout is a terminal. (See question 11.6.)
-
- 11.8: When I read from the keyboard with scanf, it seems to hang until
- I type one extra line of input.
-
- A: scanf was designed for free-format input, which is seldom what
- you want when reading from the keyboard. In particular, "\n" in
- a format string does _not_ mean to expect a newline, but rather
- to read and discard characters as long as each is a whitespace
- character.
-
- A related problem is that unexpected non-numeric input can cause
- scanf to "jam." Because of these problems, it is usually better
- to use fgets() to read a whole line, and then use sscanf or
- other string functions to pick apart the line buffer. If you do
- use sscanf, don't forget to check the return value to make sure
- that the expected number of items was found.
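-
- A minimal sketch of the fgets-plus-sscanf approach (the function
- name read_int_line and the 80-character line buffer are
- assumptions made for illustration):
-
```c
#include <stdio.h>

/* Read one line from fp and extract a single int from it.
   Returns 1 on success, 0 on bad input, EOF at end-of-file. */
int read_int_line(FILE *fp, int *result)
{
    char line[80];

    if (fgets(line, sizeof line, fp) == NULL)
        return EOF;
    return sscanf(line, "%d", result) == 1;
}
```
-
- Because a whole line is consumed on every call, a line of bad
- input is reported once and does not remain in the stream to
- "jam" later reads.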
-
- 11.9: I'm trying to update a file in place, by using fopen mode "r+",
- then reading a certain string, and finally writing back a
- modified string, but it's not working.
-
- A: Be sure to call fseek before you write, both to seek back to the
- beginning of the string you're trying to overwrite, and because
- an fseek or fflush is always required between reading and
- writing in the read/write "+" modes.
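-
- The sequence of calls can be sketched as follows, with a single
- character standing in for the string being updated. The function
- upcase_first is hypothetical; fp is assumed to be open in an
- update ("+") mode:
-
```c
#include <stdio.h>
#include <ctype.h>

/* Replace the first occurrence of the character ch on fp with its
   upper-case equivalent.
   Returns 1 if a replacement was made, 0 otherwise. */
int upcase_first(FILE *fp, int ch)
{
    long pos;
    int c;

    for (;;) {
        pos = ftell(fp);        /* remember where this character starts */
        if ((c = getc(fp)) == EOF)
            return 0;
        if (c == ch)
            break;
    }

    /* Seek back to the character; this is also the call required
       between reading and writing. */
    if (pos < 0 || fseek(fp, pos, SEEK_SET) != 0)
        return 0;
    putc(toupper(c), fp);

    /* An fflush or fseek is required again before any further reads. */
    return fflush(fp) == 0;
}
```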
-
- References: ANSI Sec. 4.9.5.3 p. 131.
-
- 11.10: How can I read one character at a time, without waiting for the
- RETURN key?
-
- A: See question 16.1.
-
- 11.11: How can I flush pending input so that a user's typeahead isn't
- read at the next prompt? Will fflush(stdin) work?
-
- A: fflush is defined only for output streams. Since its definition
- of "flush" is to complete the writing of buffered characters
- (not to discard them), discarding unread input would not be an
- analogous meaning for fflush on input streams. There is no
- standard way to discard unread characters from a stdio input
- buffer, nor would such a way be sufficient; unread characters
- can also accumulate in other, OS-level input buffers.
-
- 11.12: How can I redirect stdin or stdout to a file from within a
- program?
-
- A: Use freopen.
-
- 11.13: Once I've used freopen, how can I get the original stdout (or
- stdin) back?
-
- A: If you need to switch back and forth, the best all-around
- solution is not to use freopen in the first place. Try using
- your own explicit output (or input) stream variable, which you
- can reassign at will, while leaving the original stdout (or
- stdin) undisturbed.
-
- 11.14: How can I recover the file name given an open file descriptor?
-
- A: This problem is, in general, insoluble. Under Unix, for
- instance, a scan of the entire disk (perhaps requiring special
- permissions) would theoretically be required, and would fail if
- the file descriptor was a pipe or referred to a deleted file
- (and could give a misleading answer for a file with multiple
- links). It is best to remember the names of files yourself when
- you open them (perhaps with a wrapper function around fopen).
-
-
- Section 12. Library Subroutines
-
- 12.1: Why does strncpy not always place a '\0' termination in the
- destination string?
-
- A: strncpy was first designed to handle a now-obsolete data
- structure, the fixed-length, not-necessarily-\0-terminated
- "string." strncpy is admittedly a bit cumbersome to use in
- other contexts, since you must often append a '\0' to the
- destination string by hand.
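-
- One way to package the extra step is a small wrapper such as the
- following (copy_string is a hypothetical name; size is the full
- size of the destination buffer and must be greater than 0):
-
```c
#include <string.h>

/* Copy at most size-1 characters of src into dest and always
   terminate the result. */
char *copy_string(char *dest, const char *src, size_t size)
{
    strncpy(dest, src, size - 1);
    dest[size - 1] = '\0';      /* strncpy may not have supplied one */
    return dest;
}
```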
-
- 12.2: I'm trying to sort an array of strings with qsort, using strcmp
- as the comparison function, but it's not working.
-
- A: By "array of strings" you probably mean "array of pointers to
- char." The arguments to qsort's comparison function are
- pointers to the objects being sorted, in this case, pointers to
- pointers to char. (strcmp, of course, accepts simple pointers
- to char.)
-
- The comparison routine's arguments are expressed as "generic
- pointers," const void * or char *. They must be converted back
- to what they "really are" (char **) and dereferenced, yielding
- char *'s which can be usefully compared. Write a comparison
- function like this:
-
- int pstrcmp(p1, p2) /* compare strings through pointers */
- char *p1, *p2; /* const void * for ANSI C */
- {
- return strcmp(*(char **)p1, *(char **)p2);
- }
-
- 12.3: Now I'm trying to sort an array of structures with qsort. My
- comparison routine takes pointers to structures, but the
- compiler complains that the function is of the wrong type for
- qsort. How can I cast the function pointer to shut off the
- warning?
-
- A: The conversions must be in the comparison function, which must
- be declared as accepting "generic pointers" (const void * or
- char *) as discussed above.
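-
- An ANSI C sketch, for a hypothetical struct record sorted on its
- key field:
-
```c
#include <stdlib.h>

struct record {
    int  key;
    char tag;
};

/* Comparison function with the signature qsort expects; the
   conversions to the real type happen inside the function. */
int record_cmp(const void *p1, const void *p2)
{
    const struct record *r1 = p1;
    const struct record *r2 = p2;

    if (r1->key < r2->key) return -1;
    if (r1->key > r2->key) return 1;
    return 0;
}

/* Sort an array of n records by key. */
void sort_records(struct record *recs, size_t n)
{
    qsort(recs, n, sizeof recs[0], record_cmp);
}
```
-
- No cast is needed in the call to qsort, because record_cmp
- already has exactly the type qsort requires.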
-
- 12.4: How can I convert numbers to strings (the opposite of atoi)? Is
- there an itoa function?
-
- A: Just use sprintf. (You'll have to allocate space for the result
- somewhere anyway; see questions 3.1 and 3.2. Don't worry that
- sprintf may be overkill, potentially wasting run time or code
- space; it works well in practice.)
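-
- For instance (int_to_string is a hypothetical wrapper; the caller
- supplies the buffer):
-
```c
#include <stdio.h>

/* Convert n to its decimal representation in buf, which the caller
   must make large enough for the digits, a sign, and the '\0'. */
char *int_to_string(int n, char *buf)
{
    sprintf(buf, "%d", n);
    return buf;
}
```
-
- A 32-bit int needs at most 12 characters, including the sign and
- the terminating '\0'.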
-
- References: K&R I Sec. 3.6 p. 60; K&R II Sec. 3.6 p. 64.
-
- 12.5: How can I get the current date or time of day in a C program?
-
- A: Just use the time, ctime, and/or localtime functions. (These
- routines have been around for years, and are in the ANSI
- standard.) Here is a simple example:
-
- #include <stdio.h>
- #include <time.h>
-
- main()
- {
- time_t now;
- time(&now);
- printf("It's %.24s.\n", ctime(&now));
- return 0;
- }
-
- References: ANSI Sec. 4.12 .
-
- 12.6: I know that the library routine localtime will convert a time_t
- into a broken-down struct tm, and that ctime will convert a
- time_t to a printable string. How can I perform the inverse
- operations of converting a struct tm or a string into a time_t?
-
- A: ANSI C specifies a library routine, mktime, which converts a
- struct tm to a time_t. Several public-domain versions of this
- routine are available in case your compiler does not support it
- yet.
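-
- A sketch of a call to mktime (the wrapper make_time and its
- calendar-style parameters are assumptions made for illustration):
-
```c
#include <time.h>

/* Build a time_t from local date and time fields.
   Returns (time_t)-1 if the time cannot be represented. */
time_t make_time(int year, int month, int day, int hour, int min, int sec)
{
    struct tm tm = {0};

    tm.tm_year  = year - 1900;  /* years since 1900 */
    tm.tm_mon   = month - 1;    /* months are numbered 0-11 */
    tm.tm_mday  = day;
    tm.tm_hour  = hour;
    tm.tm_min   = min;
    tm.tm_sec   = sec;
    tm.tm_isdst = -1;           /* let mktime work out daylight time */

    return mktime(&tm);
}
```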
-
- Converting a string to a time_t is harder, because of the wide
- variety of date and time formats which should be parsed.
- Public-domain routines have been written for performing this
- function (see, for example, the file partime.c, widely
- distributed with the RCS package), but they are less likely to
- become standardized.
-
- References: K&R II Sec. B10 p. 256; H&S Sec. 20.4 p. 361; ANSI
- Sec. 4.12.2.3 .
-
- 12.7: I need a random number generator.
-
- A: The standard C library has one: rand(). The implementation on
- your system may not be perfect, but writing a better one isn't
- necessarily easy, either.
-
- References: ANSI Sec. 4.10.2.1 p. 154, Knuth Vol. 2 Chap. 3
- pp. 1-177.
-
- 12.8: Each time I run my program, I get the same sequence of numbers
- back from rand().
-
- A: You can call srand() to seed the pseudo-random number generator
- with a more random initial value. Popular seed values are the
- time of day, or the elapsed time before the user presses a key
- (although keypress times are hard to determine portably; see
- question 16.9).
-
- References: ANSI Sec. 4.10.2.2 p. 154.
-
- 12.9: I need a random true/false value, so I'm taking rand() % 2, but
- it's just alternating 0, 1, 0, 1, 0...
-
- A: Poor pseudorandom number generators (such as the ones
- unfortunately supplied with some systems) are not very random in
- the low-order bits. Try using the higher-order bits.
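-
- One way to use the higher-order bits is to divide rather than
- take a remainder, as in this sketch (random_bit is a hypothetical
- name):
-
```c
#include <stdlib.h>

/* Return a pseudo-random 0 or 1, derived from the high-order part of
   rand()'s range rather than from its low-order bit. */
int random_bit(void)
{
    return rand() / (RAND_MAX / 2 + 1);
}
```
-
- Values in the lower half of rand()'s range give 0, and values in
- the upper half give 1.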
-
- 12.10- I'm trying to port this old program. Why do I get "undefined
- 12.14: external" errors for:
-
- A: These routines are variously obsolete; you should instead:
-
- 12.10: index? A: use strchr.
- 12.11: rindex? A: use strrchr.
- 12.12: bcopy? A: use memmove, after interchanging the first and
- second arguments (see also question 5.13).
- 12.13: bcmp? A: use memcmp.
- 12.14: bzero? A: use memset, with a second argument of 0.
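-
- The substitutions can be sketched as thin wrappers. The my_ names
- below are invented for illustration (they keep the old BSD
- argument order); in real code it is usually cleaner to edit the
- calls themselves:
-

```c
#include <string.h>

/* Hypothetical wrappers keeping the old BSD argument order,
   each implemented with the corresponding ANSI routine. */
char *my_index(const char *s, int c)  { return strchr(s, c); }
char *my_rindex(const char *s, int c) { return strrchr(s, c); }

/* bcopy takes (source, destination); memmove takes (destination, source) */
void my_bcopy(const void *src, void *dst, size_t n) { memmove(dst, src, n); }

int  my_bcmp(const void *a, const void *b, size_t n) { return memcmp(a, b, n); }
void my_bzero(void *p, size_t n) { memset(p, 0, n); }
```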
-
- 12.15: How can I execute a command with system() and read its output
- into a program?
-
- A: Unix and some other systems provide a popen() routine, which
- sets up a stdio stream on a pipe connected to the process
- running a command, so that the output can be read (or the input
- supplied).
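-
- A small sketch of reading a command's output through popen(),
- assuming a Unix-like system; the helper name first_line_of is
- invented for this example:
-

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* ask for the popen/pclose declarations */
#endif
#include <stdio.h>

/* Run cmd and copy the first line of its output into buf.
   Returns 0 on success, -1 if the command could not be started
   or produced no output.  first_line_of is a hypothetical helper. */
int first_line_of(const char *cmd, char *buf, size_t size)
{
    FILE *fp = popen(cmd, "r");
    int ok;

    if (fp == NULL)
        return -1;
    ok = (fgets(buf, (int)size, fp) != NULL);
    pclose(fp);                 /* always pclose, never fclose */
    return ok ? 0 : -1;
}
```

-
- For instance, first_line_of("date", line, sizeof line) would leave
- the output of the date command in line.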
-
- 12.16: How can I read a directory in a C program?
-
- A: See if you can use the opendir() and readdir() routines, which
- are available on most Unix systems. Implementations also exist
- for MS-DOS, VMS, and other systems. (MS-DOS also has FINDFIRST
- and FINDNEXT routines which do essentially the same thing.)
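-
- Where opendir() and readdir() are available, a directory listing
- might look roughly like this (list_directory is a name made up for
- the sketch):
-

```c
#include <stdio.h>
#include <dirent.h>

/* Print the names found in a directory.
   Returns the number of entries, or -1 if it cannot be opened. */
int list_directory(const char *path)
{
    DIR *dir = opendir(path);
    struct dirent *entry;
    int count = 0;

    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL) {
        printf("%s\n", entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```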
-
-
- Section 13. Lint
-
- 13.1: I just typed in this program, and it's acting strangely. Can
- you see anything wrong with it?
-
- A: Try running lint first (perhaps with the -a, -c, -h, -p and/or
- other options). Many C compilers are really only half-
- compilers, electing not to diagnose numerous source code
- difficulties which would not actively preclude code generation.
-
- 13.2: How can I shut off the "warning: possible pointer alignment
- problem" message lint gives me for each call to malloc?
-
- A: The problem is that traditional versions of lint do not know,
- and cannot be told, that malloc "returns a pointer to space
- suitably aligned for storage of any type of object." It is
- possible to provide a pseudoimplementation of malloc, using a
- #define inside of #ifdef lint, which effectively shuts this
- warning off, but a simpleminded #definition will also suppress
- meaningful messages about truly incorrect invocations. It may
- be easier simply to ignore the message, perhaps in an automated
- way with grep -v.
-
- 13.3: Where can I get an ANSI-compatible lint?
-
- A: A product called FlexeLint is available (in "shrouded source
- form," for compilation on 'most any system) from
-
- Gimpel Software
- 3207 Hogarth Lane
- Collegeville, PA 19426 USA
- (+1) 215 584 4261
-
- The System V release 4 lint is ANSI-compatible, and is available
- separately (bundled with other C tools) from Unix Support Labs
- (a subsidiary of AT&T), or from System V resellers.
-
-
- Section 14. Style
-
- 14.1: Here's a neat trick:
-
- if(!strcmp(s1, s2))
-
- Is this good style?
-
- A: It is not particularly good style, although it is a popular
- idiom. The test succeeds if the two strings are equal, but its
- form suggests that it tests for inequality.
-
- Another solution is to use a macro:
-
- #define Streq(s1, s2) (strcmp((s1), (s2)) == 0)
-
- Opinions on code style, like those on religion, can be debated
- endlessly. Though good style is a worthy goal, and can usually
- be recognized, it cannot be codified.
-
- 14.2: What's the best style for code layout in C?
-
- A: K&R, while providing the example most often copied, also supply
- a good excuse for avoiding it:
-
- The position of braces is less important,
- although people hold passionate beliefs.
- We have chosen one of several popular styles.
- Pick a style that suits you, then use it
- consistently.
-
- It is more important that the layout chosen be consistent (with
- itself, and with nearby or common code) than that it be
- "perfect." If your coding environment (i.e. local custom or
- company policy) does not suggest a style, and you don't feel
- like inventing your own, just copy K&R. (The tradeoffs between
- various indenting and brace placement options can be
- exhaustively and minutely examined, but don't warrant repetition
- here. See also the Indian Hill Style Guide.)
-
- The elusive quality of "good style" involves much more than mere
- code layout details; don't spend time on formatting to the
- exclusion of more substantive code quality issues.
-
- References: K&R Sec. 1.2 p. 10.
-
- 14.3: Where can I get the "Indian Hill Style Guide" and other coding
- standards?
-
- A: Various documents are available for anonymous ftp from:
-
- Site:                      File or directory:
-
- cs.washington.edu          ~ftp/pub/cstyle.tar.Z
- (128.95.1.4)               (the updated Indian Hill guide)
-
- cs.toronto.edu             doc/programming
-
- giza.cis.ohio-state.edu    pub/style-guide
-
-
- Section 15. Floating Point
-
- 15.1: My floating-point calculations are acting strangely and giving
- me different answers on different machines.
-
- A: First, make sure that you have #included <math.h>, and correctly
- declared other functions returning double.
-
- If the problem isn't that simple, recall that most digital
- computers use floating-point formats which provide a close but
- by no means exact simulation of real number arithmetic.
- Underflow, cumulative precision loss, and other anomalies are
- often troublesome.
-
- Don't assume that floating-point results will be exact, and
- especially don't assume that floating-point values can be
- compared for equality. (Don't throw haphazard "fuzz factors"
- in, either.)
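-
- One commonly suggested alternative to an exact == test, shown
- here only as a sketch, is to compare the difference against a
- tolerance scaled by the magnitudes involved. The name nearly_equal
- and the choice of tolerance are assumptions that depend on the
- calculation at hand:
-

```c
/* Absolute value, written out so that no math library is needed. */
static double dabs(double x)
{
    return x < 0 ? -x : x;
}

/* Nonzero if a and b agree to within the relative tolerance epsilon.
   nearly_equal is a hypothetical helper, not a library routine. */
int nearly_equal(double a, double b, double epsilon)
{
    double diff  = dabs(a - b);
    double scale = dabs(a) > dabs(b) ? dabs(a) : dabs(b);

    return diff <= epsilon * scale;
}
```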
-
- These problems are no worse for C than they are for any other
- computer language. Floating-point semantics are usually defined
- as "however the processor does them;" otherwise a compiler for a
- machine without the "right" model would have to do prohibitively
- expensive emulations.
-
- This article cannot begin to list the pitfalls associated with,
- and workarounds appropriate for, floating-point work. A good
- programming text should cover the basics.
-
- References: EoPS Sec. 6 pp. 115-8.
-
- 15.2: I'm trying to do some simple trig, and I am #including <math.h>,
- but I keep getting "undefined: _sin" compilation errors.
-
- A: Make sure you're linking against the correct math library. For
- instance, under Unix, you usually need to use the -lm option,
- and at the _end_ of the command line, when compiling/linking.
-
- 15.3: Why doesn't C have an exponentiation operator?
-
- A: You can #include <math.h> and use the pow() function, although
- explicit multiplication is often better for small positive
- integral exponents.
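-
- For small nonnegative integral exponents, the explicit
- multiplication can be wrapped up along these lines (int_power is
- an illustrative name, not a standard function):
-

```c
/* Raise base to a small nonnegative integer power by repeated
   multiplication, without calling pow(). */
double int_power(double base, unsigned int exponent)
{
    double result = 1.0;

    while (exponent-- > 0)
        result *= base;
    return result;
}
```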
-
- References: ANSI Sec. 4.5.5.1 .
-
- 15.4: I'm having trouble with a Turbo C program which crashes and says
- something like "floating point formats not linked."
-
- A: Some compilers for small machines, including Turbo C (and
- Ritchie's original PDP-11 compiler), leave out floating point
- support if it looks like it will not be needed. In particular,
- the non-floating-point versions of printf and scanf save space
- by not including code to handle %e, %f, and %g. It happens that
- Turbo C's heuristics for determining whether the program uses
- floating point are insufficient, and the programmer must
- sometimes insert an extra, explicit call to a floating-point
- library routine to force loading of floating-point support.
-
-
- Section 16. System Dependencies
-
- 16.1: How can I read a single character from the keyboard without
- waiting for a newline?
-
- A: Contrary to popular belief and many people's wishes, this is not
- a C-related question. (Nor are closely-related questions
- concerning the echo of keyboard input.) The delivery of
- characters from a "keyboard" to a C program is a function of the
- operating system in use, and has not been standardized by the C
- language. Some versions of curses have a cbreak() function
- which does what you want. Under UNIX, use ioctl to play with
- the terminal driver modes (CBREAK or RAW under "classic"
- versions; ICANON, c_cc[VMIN] and c_cc[VTIME] under System V or
- Posix systems). Under MS-DOS, use getch(). Under VMS, try the
- Screen Management (SMG$) routines, or curses, or issue low-level
- $QIO's to ask for one character at a time. Under other
- operating systems, you're on your own. Beware that some
- operating systems make this sort of thing impossible, because
- character collection into input lines is done by peripheral
- processors not under direct control of the CPU running your
- program.
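-
- On Posix systems, the ICANON, c_cc[VMIN], and c_cc[VTIME]
- settings mentioned above are normally reached through the termios
- interface. The following is a sketch under that assumption; the
- function names are invented here, and a real program should
- restore the saved settings before it exits:
-

```c
#include <termios.h>

/* Switch terminal fd to one-character-at-a-time input, saving the
   previous settings in *saved.  Returns 0 on success, -1 on error
   (for example, if fd is not a terminal). */
int enter_char_mode(int fd, struct termios *saved)
{
    struct termios t;

    if (tcgetattr(fd, saved) != 0)
        return -1;
    t = *saved;
    t.c_lflag &= ~ICANON;       /* no line-at-a-time collection */
    t.c_cc[VMIN]  = 1;          /* read returns after one character */
    t.c_cc[VTIME] = 0;          /* no inter-character timer */
    return tcsetattr(fd, TCSANOW, &t);
}

/* Put back the settings saved by enter_char_mode. */
int leave_char_mode(int fd, const struct termios *saved)
{
    return tcsetattr(fd, TCSANOW, saved);
}
```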
-
- Operating system specific questions are not appropriate for
- comp.lang.c . Many common questions are answered in
- frequently-asked questions postings in such groups as
- comp.unix.questions and comp.os.msdos.programmer . Note that
- the answers are often not unique even across different variants
- of a system; bear in mind when answering system-specific
- questions that the answer that applies to your system may not
- apply to everyone else's.
-
- References: PCS Sec. 10 pp. 128-9, Sec. 10.1 pp. 130-1.
-
- 16.2: How can I find out if there are characters available for reading
- (and if so, how many)? Alternatively, how can I do a read that
- will not block if there are no characters available?
-
- A: These, too, are entirely operating-system-specific. Some
- versions of curses have a nodelay() function. Depending on your
- system, you may also be able to use "nonblocking I/O", or a
- system call named "select", or the FIONREAD ioctl, or kbhit(),
- or rdchk(), or the O_NDELAY option to open() or fcntl().
-
- 16.3: How can I clear the screen? How can I print things in inverse
- video?
-
- A: Such things depend on the terminal type (or display) you're
- using. You will have to use a library such as termcap or
- curses, or some system-specific routines, to perform these
- functions.
-
- 16.4: How do I read the mouse?
-
- A: Consult your system documentation, or ask on an appropriate
- system-specific newsgroup (but check its FAQ list first). Mouse
- handling is completely different under the X window system, MS-
- DOS, Macintosh, and probably every other system.
-
- 16.5: How can my program discover the complete pathname to the
- executable file from which it was invoked?
-
- A: argv[0] may contain all or part of the pathname, or it may
- contain nothing. You may be able to duplicate the command
- language interpreter's search path logic to locate the
- executable if the name in argv[0] is present but incomplete.
- However, there is no guaranteed or portable solution.
-
- 16.6: How can a process change an environment variable in its caller?
-
- A: In general, it cannot. Different operating systems implement
- name/value functionality similar to the Unix environment in
- different ways. Whether the "environment" can be usefully
- altered by a running program, and if so, how, is system-
- dependent.
-
- Under Unix, a process can modify its own environment (some
- systems provide setenv() and/or putenv() functions to do this),
- and the modified environment is usually passed on to any child
- processes, but it is _not_ propagated back to the parent
- process.
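-
- The Unix behavior can be sketched as follows, assuming a system
- that supplies setenv(); the helper name set_and_run is made up for
- the example:
-

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* ask for the setenv declaration */
#endif
#include <stdlib.h>

/* Set name=value in this process's own environment, then run a
   command.  The command, being a child, sees the new value; the
   process that started us never does.  Returns -1 if setenv fails,
   otherwise whatever system() returns. */
int set_and_run(const char *name, const char *value, const char *command)
{
    if (setenv(name, value, 1) != 0)
        return -1;
    return system(command);
}
```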
-
- 16.7: How can I find out the size of a file, prior to reading it in?
-
- A: If the "size of a file" is the number of characters you'll be
- able to read from it in C, it is in general impossible to
- determine this number in advance. Under Unix, the stat call
- will give you an exact answer, and several other systems supply
- a Unix-like stat which will give an approximate answer. You can
- fseek to the end and then use ftell, but this usage is
- nonportable (it gives you an accurate answer only under Unix,
- and otherwise a quasi-accurate answer only for ANSI C "binary"
- files).
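-
- The fseek/ftell approach, with the portability caveats just
- described, might be written like this (stream_size is a name
- invented for the sketch, and the stream is assumed to have been
- opened in binary mode):
-

```c
#include <stdio.h>

/* Size in bytes of an open binary stream, or -1L on error.
   The current position is restored before returning. */
long stream_size(FILE *fp)
{
    long here = ftell(fp);
    long size;

    if (here == -1L || fseek(fp, 0L, SEEK_END) != 0)
        return -1L;
    size = ftell(fp);
    if (fseek(fp, here, SEEK_SET) != 0)
        return -1L;
    return size;
}
```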
-
- Are you sure you have to determine the file's size in advance?
- Since the most accurate way of determining the size of a file as
- a C program will see it is to open the file and read it, perhaps
- you can rearrange the code to learn the size as it reads.
-
- 16.8: How can a file be shortened in-place without completely clearing
- or rewriting it?
-
- A: BSD systems provide ftruncate(), several others supply chsize(),
- and a few may provide a (possibly undocumented) fcntl option
- F_FREESP. Under MS-DOS, you can sometimes use write(fd, "", 0).
- However, there is no truly portable solution.
-
- 16.9: How can I implement a delay, or time a user's response, with
- sub-second resolution?
-
- A: Unfortunately, there is no portable way. V7 Unix, and derived
- systems, provided a fairly useful ftime() routine with
- resolution up to a millisecond, but it has disappeared from
- System V and Posix. Other routines you might look for on your
- system include clock() and gettimeofday(). The select() and
- poll() calls (if available) can be pressed into service to
- implement simple delays. On MS-DOS machines, it is possible to
- reprogram the system timer and timer interrupts.
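-
- Where select() exists, a simple sub-second delay can be sketched
- by calling it with no descriptors at all and only a timeout. The
- name delay_ms is an assumption of this example, and the actual
- resolution depends on the system:
-

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>           /* struct timeval */

/* Pause for roughly ms milliseconds.
   Returns 0 after the timeout, or -1 if interrupted or on error. */
int delay_ms(long ms)
{
    struct timeval tv;

    tv.tv_sec  = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return select(0, NULL, NULL, NULL, &tv);
}
```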
-
- 16.10: How can I read in an object file and jump to routines in it?
-
- A: You want a dynamic linker and/or loader. It is possible to
- malloc some space and read in object files, but you have to know
- an awful lot about object file formats, relocation, etc. Under
- BSD Unix, you could use system() and ld -A to do the linking for
- you. Many (most?) versions of SunOS and System V have the -ldl
- library which allows object files to be dynamically loaded.
- There is also a GNU package called "dld". See also question
- 7.6.
-
-
- Section 17. Miscellaneous
-
- 17.1: What can I safely assume about the initial values of variables
- which are not explicitly initialized? If global variables start
- out as "zero," is that good enough for null pointers and
- floating-point zeroes?
-
- A: Variables with "static" duration (that is, those declared
- outside of functions, and those declared with the storage class
- static), are guaranteed initialized to zero, as if the
- programmer had typed "= 0". Therefore, such variables are
- initialized to the null pointer (of the correct type) if they
- are pointers, and to 0.0 if they are floating-point.
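-
- A short illustration of the guarantee for variables with static
- duration (the variable and function names are arbitrary):
-

```c
#include <stddef.h>

static int    counter;      /* starts out as 0 */
static double total;        /* starts out as 0.0 */
static char  *name;         /* starts out as a null pointer */

/* Returns nonzero if all three still hold their initial values. */
int statics_start_at_zero(void)
{
    return counter == 0 && total == 0.0 && name == NULL;
}
```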
-
- Variables with "automatic" duration (i.e. local variables
- without the static storage class) start out containing garbage,
- unless they are explicitly initialized. Nothing useful can be
- predicted about the garbage.
-
- Dynamically-allocated memory obtained with malloc and realloc is
- also likely to contain garbage, and must be initialized by the
- calling program, as appropriate. Memory obtained with calloc
- contains all-bits-0, but this is not necessarily useful for
- pointer or floating-point values (see question 3.11, and section
- 1).
-
- 17.2: How can I write data files which can be read on other machines
- with different word size, byte order, or floating point formats?
-
- A: The best solution is to use text files (usually ASCII), written
- with fprintf and read with fscanf or the like. (Similar advice
- also applies to network protocols.) Be skeptical of arguments
- which imply that text files are too big, or that reading and
- writing them is too slow. Not only is their efficiency
- frequently acceptable in practice, but the advantages of being
- able to manipulate them with standard tools can be overwhelming.
- If you must use a binary format, you can improve portability,
- and perhaps take advantage of prewritten I/O libraries, by
- making use of standardized formats such as Sun's XDR (RFC 1014),
- OSI's ASN.1, CCITT's X.409, or ISO 8825 "Basic Encoding Rules."
- See also question 9.10.
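-
- As a sketch of the text-file approach, a record can be written
- with fprintf and read back with fscanf. The struct record layout
- and the function names are invented for illustration:
-

```c
#include <stdio.h>

struct record {
    long   id;
    double value;
};

/* Write one record as a line of text.  Returns 0 on success. */
int write_record(FILE *fp, const struct record *r)
{
    /* 17 significant digits are enough to reproduce an IEEE double */
    return fprintf(fp, "%ld %.17g\n", r->id, r->value) < 0 ? -1 : 0;
}

/* Read one record written by write_record.  Returns 0 on success. */
int read_record(FILE *fp, struct record *r)
{
    return fscanf(fp, "%ld %lf", &r->id, &r->value) == 2 ? 0 : -1;
}
```

-
- Because the file holds only characters, word size and byte order
- on the writing machine do not matter to the reader.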
-
- 17.3: How can I return several values from a function?
-
- A: Either pass pointers to locations which the function can fill
- in, or have the function return a structure containing the
- desired values, or (in a pinch) consider global variables. See
- also questions 2.16, 3.4, and 9.2.
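-
- The first two approaches might look like this; the function and
- structure names are chosen only for the example:
-

```c
/* Style 1: the caller passes pointers for the function to fill in. */
void divide(int num, int den, int *quot, int *rem)
{
    *quot = num / den;
    *rem  = num % den;
}

/* Style 2: the function returns a structure holding both values. */
struct divresult {
    int quot;
    int rem;
};

struct divresult divide2(int num, int den)
{
    struct divresult r;

    r.quot = num / den;
    r.rem  = num % den;
    return r;
}
```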
-
- 17.4: If I have a char * variable pointing to the name of a function
- as a string, how can I call that function?
-
- A: The most straightforward thing to do is maintain a
- correspondence table of names and function pointers:
-
- int function1(), function2();
-
- struct {char *name; int (*funcptr)(); } symtab[] =
- {
- { "function1", function1 },
- { "function2", function2 },
- };
-
- Then, just search the table for the name, and call through the
- associated function pointer.
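-
- The search step could be sketched as below. The lookup function
- find_function and the bodies of function1 and function2 are
- placeholders invented for the example:
-

```c
#include <stddef.h>
#include <string.h>

static int function1(void) { return 1; }   /* placeholder bodies */
static int function2(void) { return 2; }

static struct {
    const char *name;
    int (*funcptr)(void);
} symtab[] = {
    { "function1", function1 },
    { "function2", function2 },
};

/* Return the function registered under name, or NULL if none. */
int (*find_function(const char *name))(void)
{
    size_t i;

    for (i = 0; i < sizeof symtab / sizeof symtab[0]; i++)
        if (strcmp(symtab[i].name, name) == 0)
            return symtab[i].funcptr;
    return NULL;
}
```

-
- A caller would check the result for NULL before calling through
- it.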
-
- 17.5: I seem to be missing the system header file <sgtty.h>. Can
- someone send me a copy?
-
- A: Standard headers exist in part so that definitions appropriate
- to your compiler, operating system, and processor can be
- supplied. You cannot just pick up a copy of someone else's
- header file and expect it to work, unless that person is using
- exactly the same environment. Ask your compiler vendor why the
- file was not provided (or to send a replacement copy).
-
- 17.6: How can I call FORTRAN (C++, BASIC, Pascal, Ada, LISP) functions
- from C? (And vice versa?)
-
- A: The answer is entirely dependent on the machine and the specific
- calling sequences of the various compilers in use, and may not
- be possible at all. Read your compiler documentation very
- carefully; sometimes there is a "mixed-language programming
- guide," although the techniques for passing arguments and
- ensuring correct run-time startup are often arcane. More
- information may be found in FORT.Z by Glenn Geers, available via
- anonymous ftp from suphys.physics.su.oz.au in the src directory.
-
- cfortran.h, a C header file, simplifies C/FORTRAN interfacing on
- many popular machines. It is available via anonymous ftp from
- zebra.desy.de (131.169.2.244).
-
- In C++, a "C" modifier in an external function declaration
- indicates that the function is to be called using C calling
- conventions.
-
- 17.7: Does anyone know of a program for converting Pascal or FORTRAN
- (or LISP, Ada, awk, "Old" C, ...) to C?
-
- A: Several public-domain programs are available:
-
- p2c written by Dave Gillespie, and posted to
- comp.sources.unix in March, 1990 (Volume 21); also
- available by anonymous ftp from csvax.cs.caltech.edu,
- file pub/p2c-1.20.tar.Z .
-
- ptoc another comp.sources.unix contribution, this one written
- in Pascal (comp.sources.unix, Volume 10, also patches in
- Volume 13?).
-
- f2c jointly developed by people from Bell Labs, Bellcore,
- and Carnegie Mellon. To find out about f2c, send the mail
- message "send index from f2c" to netlib@research.att.com
- or research!netlib. (It is also available via anonymous
- ftp on research.att.com, in directory dist/f2c.)
-
- This FAQ list's maintainer also has available a list of other
- commercial translation products, and some for more obscure
- languages.
-
- See also question 5.3.
-
- 17.8: Where can I get copies of all these public-domain programs?
-
- A: If you have access to Usenet, see the regular postings in the
- comp.sources.unix and comp.sources.misc newsgroups, which
- describe, in some detail, the archiving policies and how to
- retrieve copies. The usual approach is to use anonymous ftp
- and/or uucp from a central, public-spirited site, such as uunet
- (ftp.uu.net, 192.48.96.9). However, this article cannot track
- or list all of the available archive sites and how to access
- them. The comp.archives newsgroup contains numerous
- announcements of anonymous ftp availability of various items.
- The "archie" mailserver can tell you which anonymous ftp sites
- have which packages; send the mail message "help" to
- archie@quiche.cs.mcgill.ca for information. Finally, the
- newsgroup comp.sources.wanted is generally a more appropriate
- place to post queries for source availability, but check _its_
- FAQ list, "How to find sources," before posting there.
-
- 17.9: When will the next International Obfuscated C Code Contest
- (IOCCC) be held? How can I get a copy of the current and
- previous winning entries?
-
- A: The contest typically runs from early March through mid-May. To
- obtain a current copy of the rules and guidelines, send e-mail
- with the Subject: line "send rules" to:
-
- {apple,pyramid,sun,uunet}!hoptoad!judges or judges@toad.com
- (not the addresses for submitting entries)
-
- Contest winners are first announced at the Summer Usenix
- Conference in mid-June, and posted to the net sometime in July-
- August. Winning entries from previous years (to 1984) are
- archived at uunet (see question 17.8) under the directory
- ~/pub/ioccc.
-
- As a last resort, previous winners may be obtained by sending
- e-mail to the above address, using the Subject: "send YEAR
- winners", where YEAR is a single four-digit year, a year range,
- or "all".
-
- 17.10: Why don't C comments nest? Are they legal inside quoted
- strings?
-
- A: Nested comments would cause more harm than good, mostly because
- of the possibility of accidentally leaving comments unclosed by
- including the characters "/*" within them. For this reason, it
- is usually better to "comment out" large sections of code, which
- might contain comments, with #ifdef or #if 0 (but see question
- 5.9).
-
- The character sequences /* and */ are not special within
- double-quoted strings, and do not therefore introduce comments,
- because a program (particularly one which is generating C code
- as output) might want to print them.
-
- References: ANSI Appendix E p. 198, Rationale Sec. 3.1.9 p. 33.
-
- 17.11: How can I implement sets and/or arrays of bits?
-
- A: Use arrays of char or int, with a few macros to access the right
- bit at the right index (try using 8 for CHAR_BIT if you don't
- have <limits.h>):
-
- #include <limits.h> /* for CHAR_BIT */
-
- #define BITMASK(bit) (1 << ((bit) % CHAR_BIT))
- #define BITSLOT(bit) ((bit) / CHAR_BIT)
- #define BITSET(ary, bit) ((ary)[BITSLOT(bit)] |= BITMASK(bit))
- #define BITTEST(ary, bit) ((ary)[BITSLOT(bit)] & BITMASK(bit))
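-
- As a usage sketch, the macros can drive a small sieve that keeps
- one bit per number. BITCLEAR-style extras are omitted; BITNSLOTS
- and the function count_primes_below_50 are additions made up for
- this example and are not part of the list above:
-

```c
#include <limits.h>             /* for CHAR_BIT */
#include <string.h>

#define BITMASK(bit)       (1 << ((bit) % CHAR_BIT))
#define BITSLOT(bit)       ((bit) / CHAR_BIT)
#define BITSET(ary, bit)   ((ary)[BITSLOT(bit)] |= BITMASK(bit))
#define BITTEST(ary, bit)  ((ary)[BITSLOT(bit)] & BITMASK(bit))

/* Assumed addition: number of array elements needed for nbits bits. */
#define BITNSLOTS(nbits)   (((nbits) + CHAR_BIT - 1) / CHAR_BIT)

/* Count the primes below 50, marking composites in a bit array. */
int count_primes_below_50(void)
{
    unsigned char composite[BITNSLOTS(50)];
    int i, j, count = 0;

    memset(composite, 0, sizeof composite);
    for (i = 2; i < 50; i++) {
        if (!BITTEST(composite, i)) {
            count++;                        /* i is prime */
            for (j = i + i; j < 50; j += i)
                BITSET(composite, j);       /* mark its multiples */
        }
    }
    return count;
}
```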
-
- 17.12: What is the most efficient way to count the number of bits which
- are set in a value?
-
- A: This and many other similar bit-twiddling problems can often be
- sped up and streamlined using lookup tables (but see the next
- question).
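-
- One possible lookup-table version, shown as a sketch rather than
- as the fastest method, handles four bits at a time (the names
- nibble_bits and count_bits are invented here):
-

```c
/* Number of 1 bits in each 4-bit value, 0 through 15. */
static const unsigned char nibble_bits[16] = {
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
};

/* Count the bits set in value, one 4-bit group per iteration. */
int count_bits(unsigned long value)
{
    int count = 0;

    while (value != 0) {
        count += nibble_bits[value & 0xF];
        value >>= 4;
    }
    return count;
}
```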
-
- 17.13: How can I make this code more efficient?
-
- A: Efficiency, though a favorite comp.lang.c topic, is not
- important nearly as often as people tend to think it is. Most
- of the code in most programs is not time-critical. When code is
- not time-critical, it is far more important that it be written
- clearly and portably than that it be written maximally
- efficiently. (Remember that computers are very, very fast, and
- that even "inefficient" code can run without apparent delay.)
-
- It is notoriously difficult to predict what the "hot spots" in a
- program will be. When efficiency is a concern, it is important
- to use profiling software to determine which parts of the
- program deserve attention. Often, actual computation time is
- swamped by peripheral tasks such as I/O and memory allocation,
- which can be sped up by using buffering and caching techniques.
-
- For the small fraction of code that is time-critical, it is
- vital to pick a good algorithm; it is less important to
- "microoptimize" the coding details. Many of the "efficient
- coding tricks" which are frequently suggested (e.g. substituting
- shift operators for multiplication by powers of two) are
- performed automatically by even simpleminded compilers.
- Heavyhanded "optimization" attempts can make code so bulky that
- performance is degraded.
-
- For more discussion of efficiency tradeoffs, as well as good
- advice on how to increase efficiency when it is important, see
- chapter 7 of Kernighan and Plauger's The Elements of Programming
- Style, and Jon Bentley's Writing Efficient Programs.
-
- 17.14: Are pointers really faster than arrays? How much do function
- calls slow things down? Is ++i faster than i = i + 1?
-
- A: Precise answers to these and many similar questions depend of
- course on the processor and compiler in use. If you simply must
- know, you'll have to time test programs carefully. (Often the
- differences are so slight that hundreds of thousands of
- iterations are required even to see them. Check the compiler's
- assembly language output, if available, to see if two purported
- alternatives aren't compiled identically.)
-
- It is "usually" faster to march through large arrays with
- pointers rather than array subscripts, but for some processors
- the reverse is true.
-
- Function calls, though obviously incrementally slower than in-
- line code, contribute so much to modularity and code clarity
- that there is rarely good reason to avoid them.
-
- Before rearranging expressions such as i = i + 1, remember that
- you are dealing with a C compiler, not a keystroke-programmable
- calculator. Any decent compiler will generate identical code
- for ++i, i += 1, and i = i + 1. The reasons for using ++i or
- i += 1 over i = i + 1 have to do with style, not efficiency.
- (See also question 4.4.)
-
- 17.15: This program crashes before it even runs! (When single-stepping
- with a debugger, it dies before the first statement in main.)
-
- A: You probably have one or more very large (kilobyte or more)
- local arrays. Many systems have fixed-size stacks, and those
- which perform dynamic stack allocation automatically (e.g. Unix)
- can be confused when the stack tries to grow by a huge chunk all
- at once.
-
- It is often better to declare large arrays with static duration
- (unless of course you need a fresh set with each recursive
- call).
-
- (See also question 9.4.)
-
- 17.16: What do "Segmentation violation" and "Bus error" mean?
-
- A: These generally mean that your program tried to access memory it
- shouldn't have, invariably as a result of improper pointer use,
- often involving malloc (see question 17.17) or perhaps scanf
- (see question 11.2).
-
- 17.17: My program is crashing, apparently somewhere down inside malloc,
- but I can't see anything wrong with it.
-
- A: It is unfortunately very easy to corrupt malloc's internal data
- structures, and the resulting problems can be hard to track
- down. The most common source of problems is writing more to a
- malloc'ed region than it was allocated to hold; a particularly
- common bug is to malloc(strlen(s)) instead of strlen(s) + 1.
- Other problems involve freeing pointers not obtained from
- malloc, or trying to realloc a null pointer (see question 3.10).
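-
- The strlen(s) + 1 rule can be illustrated with a small string
- copying routine (copy_string is a name made up for the sketch):
-

```c
#include <stdlib.h>
#include <string.h>

/* Return a malloc'ed copy of s, or NULL if allocation fails.
   The caller is responsible for freeing the result. */
char *copy_string(const char *s)
{
    char *p = malloc(strlen(s) + 1);    /* + 1 for the terminating '\0' */

    if (p != NULL)
        strcpy(p, s);
    return p;
}
```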
-
- A number of debugging packages exist to help track down malloc
- problems; one popular one is Conor P. Cahill's "dbmalloc".
-
- 17.18: Does anyone have a C compiler test suite I can use?
-
- A: Plum Hall (1 Spruce Ave., Cardiff, NJ 08232, USA) sells one.
- The FSF's GNU C (gcc) distribution includes a c-torture-
- test.tar.Z which checks a number of common problems with
- compilers. Kahan's paranoia test, found in netlib on
- research.att.com, strenuously tests a C implementation's
- floating point capabilities.
-
- 17.19: Where can I get a YACC grammar for C?
-
- A: The definitive grammar is of course the one in the ANSI
- standard. Several copies are floating around; keep your eyes
- open. There is one (due to Jeff Lee) on uunet (see question
- 17.8) in usenet/net.sources/ansi.c.grammar.Z (including a
- companion lexer). Another one, by Jim Roskind, is in
- pub/*grammar* at ics.uci.edu . The FSF's GNU C compiler
- contains a grammar, as does the appendix to K&R II.
-
- References: ANSI Sec. A.2 .
-
- 17.20: How do you pronounce "char"?
-
- A: You can pronounce the C keyword "char" in at least three ways:
- like the English words "char," "care," or "car;" the choice is
- arbitrary.
-
- 17.21: What's a good book for learning C?
-
- A: Mitch Wright maintains an annotated bibliography of C and Unix
- books; it is available for anonymous ftp from ftp.rahul.net in
- directory pub/mitch/YABL.
-
- This FAQ list's editor maintains a collection of previous
- answers to this question, which is available upon request.
-
- 17.22: Where can I get extra copies of this list? What about back
- issues?
-
- A: For now, just pull it off the net; it is normally posted to
- comp.lang.c on the first of each month, with an Expires: line
- which should keep it around all month. It can also be found in
- the newsgroups comp.answers and news.answers . Several sites
- archive news.answers postings and other FAQ lists, including
- this one: two sites are pit-manager.mit.edu (directory
- pub/usenet), and ftp.uu.net (directory usenet). The archie
- server should help you find others. See the meta-FAQ list in
- news.answers for more information; see also question 17.8.
-
- This list is an evolving document of questions which have been
- Frequent since before the Great Renaming, not just a collection
- of this month's interesting questions. Older copies are
- obsolete and don't contain much, except the occasional typo,
- that the current list doesn't.
-
-
- Bibliography
-
- ANSI American National Standard for Information Systems --
- Programming Language -- C, ANSI X3.159-1989 (see question 5.2).
-
- JLB Jon Louis Bentley, Writing Efficient Programs, Prentice-Hall,
- 1982, ISBN 0-13-970244-X.
-
- H&S Samuel P. Harbison and Guy L. Steele, C: A Reference Manual,
- Second Edition, Prentice-Hall, 1987, ISBN 0-13-109802-0.
- (A third edition has recently been released.)
-
- PCS Mark R. Horton, Portable C Software, Prentice Hall, 1990,
- ISBN 0-13-868050-7.
-
- EoPS Brian W. Kernighan and P.J. Plauger, The Elements of Programming
- Style, Second Edition, McGraw-Hill, 1978, ISBN 0-07-034207-5.
-
- K&R I Brian W. Kernighan and Dennis M. Ritchie, The C Programming
- Language, Prentice Hall, 1978, ISBN 0-13-110163-3.
-
- K&R II Brian W. Kernighan and Dennis M. Ritchie, The C Programming
- Language, Second Edition, Prentice Hall, 1988, ISBN 0-13-
- 110362-8, 0-13-110370-9.
-
- Knuth Donald E. Knuth, The Art of Computer Programming, (3 vols.),
- Addison Wesley, 1981.
-
- CT&P Andrew Koenig, C Traps and Pitfalls, Addison-Wesley, 1989,
- ISBN 0-201-17928-8.
-
- P.J. Plauger, The Standard C Library, Prentice Hall, 1992,
- ISBN 0-13-131509-9.
-
- Harry Rabinowitz and Chaim Schaap, Portable C, Prentice-Hall,
- 1990, ISBN 0-13-685967-4.
-
- There is a more extensive bibliography in the revised Indian Hill style
- guide (see question 14.3). See also question 17.21.
-
-
- Acknowledgements
-
- Thanks to Jamshid Afshar, Sudheer Apte, Randall Atkinson, Dan Bernstein,
- Vincent Broman, Stan Brown, Joe Buehler, Gordon Burditt, Burkhard Burow,
- D'Arcy J.M. Cain, Raymond Chen, Christopher Calabrese, Paul Carter,
- James Davies, Jutta Degener, Norm Diamond, Ray Dunn, Stephen M. Dunn,
- Bjorn Engsig, Alexander Forst, Jeff Francis, Dave Gillespie, Samuel
- Goldstein, Alasdair Grant, Ron Guilmette, Doug Gwyn, Tony Hansen, Joe
- Harrington, Guy Harris, Jos Horsmeier, Blair Houghton, Kirk Johnson,
- Peter Klausler, Andrew Koenig, Tom Koenig, John Lauro, Felix Lee, Don
- Libes, Christopher Lott, Tim McDaniel, John R. MacMillan, Evan Manning,
- Barry Margolin, Brad Mears, Mark Moraes, Darren Morby, Landon Curt Noll,
- David O'Brien, Richard A. O'Keefe, Hans Olsson, Francois Pinard, Pat
- Rankin, Erkki Ruohtula, Rich Salz, Chip Salzenberg, Paul Sand, Doug
- Schmidt, Patricia Shanahan, Peter da Silva, Joshua Simons, Henry
- Spencer, David Spuler, Erik Talvola, Clarke Thatcher, Wayne Throop,
- Chris Torek, Goran Uddeborg, Wietse Venema, Ed Vielmetti, Larry Virden,
- Chris Volpe, Freek Wiedijk, Dave Wolverton, Mitch Wright, Conway Yee,
- and Zhuo Zang, who have contributed, directly or indirectly, to this
- article. Special thanks to Karl Heuer, and particularly to Mark Brader,
- who (to borrow a line from Steve Johnson) have goaded me beyond my
- inclination, and occasionally beyond my endurance, in relentless pursuit
- of a better FAQ list.
-
- Steve Summit
- scs@adam.mit.edu
- scs%adam.mit.edu@mit.edu
- mit-eddie!adam.mit.edu!scs
-
- This article is Copyright 1988, 1990-1993 by Steve Summit.
- It may be freely redistributed so long as the author's name, and this
- notice, are retained.
- The C code in this article (vstrcat(), error(), etc.) is public domain
- and may be used without restriction.
-
-
-
-