- Path: sparky!uunet!gatech!usenet.ins.cwru.edu!agate!dog.ee.lbl.gov!horse.ee.lbl.gov!torek
- From: torek@horse.ee.lbl.gov (Chris Torek)
- Newsgroups: comp.lang.c
- Subject: Re: Wanted: program to extract comments from C
- Date: 31 Dec 1992 23:18:36 GMT
- Organization: Lawrence Berkeley Laboratory, Berkeley CA
- Lines: 30
- Message-ID: <28184@dog.ee.lbl.gov>
- References: <1h580nINN11p@life.ai.mit.edu> <1992Dec26.095028.29823@netcom.com>
- NNTP-Posting-Host: 128.3.112.15
-
- In article <1992Dec26.095028.29823@netcom.com> uniteq@netcom.com
- (Uniteq Application Systems) writes:
- >... you could also do it with a lot fewer lines (and a much
- >more complicated engine, but we let lex worry about that stuff):
- >
- >%%
- >"//".* {printf("%s\n", yytext); /* C++ comments */ }
- >"/*"([^*]|\*+[^*/])*\*+\/ {printf("%s\n", yytext); /* std C comments */ }
- >.|\n { /* ignore everything else */ }
- >
- >I am uncertain about the correct semantics when mixing the two commenting
- >styles - the above code allows `//' to comment out a `/*' but not a `*/'.
- >The code Kenneth presented took the rather strange (IMO) stance of making
- >either type of comment able to override the other (`//' remains active
- >within a `/* ... */' pair).
-
- All of these points are valid, but there are two problems with this
- approach. The first is that lex (including its improved cousin flex)
- has a fairly small limit on the size of `yytext'. Comments often
- exceed this limit. One can raise it in any lex program (simply
- redefine YY_BUF_SIZE), but any fixed limit may prove too small (I have
- seen truly enormous comments in some code). The second is that regular
- expressions for recognizing /* ... */ comments are notoriously
- difficult. The one above appears correct, but anyone trying to
- reproduce this from memory may get it wrong. For these reasons it
- usually seems better to use start states (easier in flex, with its
- exclusive start states) to handle this.
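-
- A rough sketch of the start-state idea, hand-coded in C rather than
- written as a lex specification: each enum value below plays the role
- of one start state, and comment text is copied out a character at a
- time, so nothing depends on the size of `yytext'. The names
- (extract_comments, the state names) are made up for illustration.
- Like the quoted rules, it ignores string and character literals; unlike
- them, it copies an unterminated comment through to end of file.

```c
#include <stdio.h>

/* Scanner states, standing in for lex start states. */
enum state { CODE, SLASH, LINE_COMMENT, BLOCK_COMMENT, BLOCK_STAR };

/*
 * Copy every comment read from `in' to `out'.  Characters are written
 * as they are read, so a comment of any length is handled without a
 * fixed-size token buffer.
 */
void extract_comments(FILE *in, FILE *out)
{
    enum state st = CODE;
    int c;

    while ((c = getc(in)) != EOF) {
        switch (st) {
        case CODE:              /* ordinary text: look for a '/' */
            if (c == '/')
                st = SLASH;
            break;
        case SLASH:             /* saw '/', comment not yet confirmed */
            if (c == '/') {
                fputs("//", out);
                st = LINE_COMMENT;
            } else if (c == '*') {
                fputs("/*", out);
                st = BLOCK_COMMENT;
            } else {
                st = CODE;
            }
            break;
        case LINE_COMMENT:      /* `//' runs to end of line */
            putc(c, out);
            if (c == '\n')
                st = CODE;
            break;
        case BLOCK_COMMENT:     /* inside a block comment, no '*' pending */
            putc(c, out);
            if (c == '*')
                st = BLOCK_STAR;
            break;
        case BLOCK_STAR:        /* saw one or more '*' inside the comment */
            putc(c, out);
            if (c == '/') {
                putc('\n', out);    /* one comment per output line */
                st = CODE;
            } else if (c != '*') {
                st = BLOCK_COMMENT;
            }
            break;
        }
    }
}
```

- Called as extract_comments(stdin, stdout), this gives the same mixing
- semantics as the quoted rules: `//' can comment out a `/*', but a `//'
- inside a `/* ... */' pair has no effect.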
- --
- In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 510 486 5427)
- Berkeley, CA Domain: torek@ee.lbl.gov
-