Text File | 1993-09-11 | 48.3 KB | 1,205 lines |
- Newsgroups: comp.sources.misc
- From: amc@wuecl.wustl.edu (Adam Costello)
- Subject: v39i083: par131 - paragraph reformatter, v1.31, Part01/03
- Message-ID: <csm-v39i083=par131.093602@sparky.Sterling.COM>
- X-Md4-Signature: c35bc72f0101e735399de84c0b4c893b
- Sender: kent@sparky.sterling.com (Kent Landfield)
- Organization: Sterling Software
- Date: Sat, 11 Sep 1993 14:36:27 GMT
- Approved: kent@sparky.sterling.com
-
- Submitted-by: amc@wuecl.wustl.edu (Adam Costello)
- Posting-number: Volume 39, Issue 83
- Archive-name: par131/part01
- Environment: ANSI-C
- Supersedes: par: Volume 39, Issue 40-42
-
- Par 1.31 is a package containing documentation and ANSI C source code
- for the filter "par".
-
- par is a paragraph reformatter, vaguely similar to fmt, but better.
-
- For example, the command "par 44qgc", given the input:
-
- John Q. Public writes:
- > Jane Doe writes:
- > > May I remind people that this newsgroup
- > > is for posting binaries only. Please keep
- > > all discussion in .d where it belongs.
- > Who appointed you net.god? alt groups are
- > UNmoderated.
- Could you two please take this to e-mail?
-
- would produce the output:
-
- John Q. Public writes:
-
- > Jane Doe writes:
- >
- > > May I remind people that this
- > > newsgroup is for posting
- > > binaries only. Please keep
- > > all discussion in .d where it
- > > belongs.
- >
- > Who appointed you net.god? alt
- > groups are UNmoderated.
-
- Could you two please take this to
- e-mail?
-
- Be sure to read "par.doc".
-
- AMC
- amc@ecl.wustl.edu (Adam M. Costello)
-
- #! /bin/sh
- # This is a shell archive. Remove anything before this line, then unpack
- # it by saving it into a file and typing "sh file". To overwrite existing
- # files, type "sh file -c". You can also feed this as standard input via
- # unshar, or by typing "sh <file", e.g.. If this archive is complete, you
- # will see the following message at the end:
- # "End of shell archive."
- # Contents: Par131 Par131/par.doc Par131/protoMakefile
- # Wrapped by amc@wuecl on Fri Sep 10 18:46:17 1993
- PATH=/bin:/usr/bin:/usr/ucb ; export PATH
- if test ! -d 'Par131' ; then
- echo shar: Creating directory \"'Par131'\"
- mkdir 'Par131'
- fi
- if test -f 'Par131/par.doc' -a "${1}" != "-c" ; then
- echo shar: Will not clobber existing file \"'Par131/par.doc'\"
- else
- echo shar: Extracting \"'Par131/par.doc'\" \(43573 characters\)
- sed "s/^X//" >'Par131/par.doc' <<'END_OF_FILE'
- X *********************
- X * par.doc *
- X * for Par 1.31 *
- X * Copyright 1993 by *
- X * Adam M. Costello *
- X *********************
- X
- X
- X Par 1.31 is a package containing:
- X
- X + This doc file.
- X + A man page based on this doc file.
- X + The ANSI C source for the filter "par".
- X
- X
- XContents
- X
- X Contents
- X File List
- X Rights and Responsibilities
- X Release Notes
- X Compilation
- X Synopsis
- X Description
- X Terminology
- X Options
- X Environment
- X Details
- X Diagnostics
- X Examples
- X Limitations
- X Bugs
- X
- X
- XFile List
- X
- X The Par 1.31 package is always distributed with at least the following
- X files:
- X
- X buffer.c
- X buffer.h
- X charset.c
- X charset.h
- X failf.c
- X failf.h
- X par.1
- X par.c
- X par.doc
- X protoMakefile
- X reformat.c
- X reformat.h
- X
- X Each file is a text file which identifies itself on the second line, and
- X identifies the version of Par to which it belongs on the third line,
- X so you can always tell which file is which, even if the files have been
- X renamed.
- X
- X The file "par.1" is a man page for the filter par (not to be confused
- X with the package Par, which contains the source code for par). "par.1"
- X is based on this doc file, and conveys much (not all) of the same
- X information, but "par.doc" is the definitive documentation for both par
- X and Par.
- X
- X
- XRights and Responsibilities
- X
- X The files listed in the File List section above are each Copyright 1993
- X by Adam M. Costello (henceforth "I").
- X
- X I grant everyone permission to use these files in any way, subject to
- X the following two restrictions:
- X
- X 1) No one may distribute modifications of any of the files unless I am
- X the one who modified them.
- X
- X 2) No one may distribute any one of the files unless it is accompanied
- X by all of the other files.
- X
- X I cannot disallow the distribution of patches, but I would prefer that
- X users send me suggestions for changes so that I can incorporate them
- X into future versions of Par. See the Bugs section for my addresses.
- X
- X Though I have tried to make sure that Par is free of bugs, I make no
- X guarantees about its soundness. Therefore, I am not responsible for any
- X damage resulting from the use of these files.
- X
- X
- XRelease Notes
- X
- X Each entry below describes changes since the previous version.
- X
- X Par 1.00 released 25 July 1993
- X The first release.
- X
- X Par 1.10 released 2 August 1993
- X Fixed the following bugs:
- X In reformat.c I used sprintf() but forgot to #include <stdio.h>.
- X I forgot to verify that <width> > <prefix> + <suffix>.
- X The first word of a paragraph was expanded to include initial
- X white characters, not just spaces, contrary to par.doc.
- X Some invalid options were not complained about.
- X NUL characters in the input were not handled.
- X A pointer foul-up in freelines() in par.c could cause a crash.
- X Added the following features:
- X The f, j, and t options.
- X The PARBODY environment variable.
- X Multiple options may be concatenated into a single argument.
- X Removed the m option:
- X Its function is better performed by the f and t options.
- X Normally I would avoid making incompatible changes, unless I
- X were doing a complete overhaul of the whole program, in which
- X case I'd make the version number 2.00 to alert users to possible
- X incompatibilities. However, in this particular instance I
- X allowed an incompatibility in a minor upgrade because version
- X 1.00 was distributed to only four people.
- X Changed the handling of white characters:
- X par now changes all of them (except newlines) to spaces as they
- X are read. This is another incompatible change, excused for the
- X same reason.
- X Made all error messages begin with "par error:".
- X
- X Par 1.20 released 10 August 1993
- X Since Par 1.10 was distributed to no one, I've made some more
- X incompatible changes in Par 1.20.
- X Added the following features:
- X The d option.
- X Paragraphs are now separated by vacant lines, not just blank
- X lines.
- X <hang> now affects not only <prefix> but also <suffix>.
- X
- X Par 1.30 released 18 August 1993
- X Since Par 1.20 was posted to comp.sources.misc, I have made only
- X backward-compatible changes in Par 1.30.
- X Fixed the following bugs:
- X One wrong word in par.c sometimes caused par to crash. Thanks
- X go to vogelke@c-17igp.wpafb.af.mil (Contr Karl Vogel) for
- X sending me an input file that caused a crash.
- X Too-long words were chopped up before the first word in a
- X paragraph was expanded to include initial spaces, allowing
- X impossibility #1 to occur. The order of the two operations
- X has been reversed. Thanks go to splat@deakin.oz.au (Andrew
- X Cashin) for reporting the error message.
- X Added the following features:
- X The g option (motivated by suggestions from several people).
- X The q option (inspired by a suggestion from splat@deakin.oz.au
- X (Andrew Cashin)).
- X The R option (my attempt to squash a bad idea from Par 1.00).
- X The PARQUOTE environment variable (comes with the q option).
- X The PARPROTECT environment variable (inspired by a suggestion
- X from dennisf@se01.elk.miles.com (Dennis Flaherty)).
- X Altered the terminology:
- X Several terms have been added, and the meaning of some terms
- X has been slightly modified. This is a change in the language
- X used to describe par's behavior, not a change in par's actual
- X behavior.
- X Added a clean target to protoMakefile (suggested by hlj@posix.com
- X (Hal Jespersen)).
- X
- X Par 1.31 released 7 September 1993
- X The version number is 1.31 rather than 1.40 because all added
- X features are really just enhancements of existing features.
- X Fixed the following bug:
- X In par.doc, in the example of a paragraph produced by a greedy
- X algorithm, the word "establish" appeared twice in a row.
- X Thanks go to niel@astro.rug.nl (Daniel Kussendrager) for
- X first pointing this out. (The example is now even better
- X because the paragraph looks even worse than before.)
- X Added the following features:
- X A usage message to accompany command line or environment
- X variable syntax errors (first suggested by
- X qarl@ecl.wustl.edu (Karl Stiefvater)).
- X The help and c options.
- X The B, P, and Q options, which render PARBODY, PARPROTECT, and
- X PARQUOTE no longer necessary. They are retained, though,
- X for compatibility and convenience.
- X The _b, _q, and _Q escape sequences for charset syntax.
- X Added the term "charset syntax".
- X Isolated the character set code in charset.c and charset.h.
- X
- X
- XCompilation
- X
- X To compile par, you need an ANSI C compiler. Copy protoMakefile to
- X Makefile and edit it, following the instructions in the comments. Then
- X use make (or the equivalent on your system) to compile par.
- X
- X If you have no make, compile each .c file into an object file and link
- X all the object files together by whatever method works on your system.
- X Then go look for a version of make that works on your system, since it
- X will come in handy in the future.
- X
- X If your compiler warns you about a pointer to a constant being converted
- X to a pointer to a non-constant in line 507 of reformat.c, ignore it.
- X Your compiler (like mine) is in error. What it thinks is a pointer to
- X a constant is actually a pointer to a pointer to a constant, which is
- X something quite different. The conversion is legal, and I don't think a
- X correct ANSI C compiler would complain.
- X
- X If your compiler generates any other warnings that you think are
- X legitimate, please tell me about them (see the Bugs section).
- X
- X Note that all variables in par are either constant or automatic (or
- X both), which means that par can be made reentrant (if your compiler
- X supports it). Given the right operating system, it should be possible
- X for several par processes to share the same code space and the same data
- X space (but not the same stack, of course) in memory.
- X
- X
- XSynopsis
- X par [help] [version] [B<op><set>] [P<op><set>] [Q<op><set>] [h[<hang>]]
- X [p<prefix>] [s<suffix>] [w<width>] [c[<cap>]] [d[<div>]] [f[<fit>]]
- X [g[<guess>]] [j[<just>]] [l[<last>]] [q[<quote>]] [R[<Report>]]
- X [t[<touch>]]
- X
- X Things enclosed in [square brackets] are optional. Things enclosed in
- X <angle brackets> are parameters.
- X
- X
- XDescription
- X
- X par is a filter which copies its input to its output, changing all white
- X characters (except newlines) to spaces, and reformatting each paragraph.
- X Paragraphs are separated by protected, blank, and vacant lines (see
- X the Terminology section for definitions), and optionally delimited by
- X indentation (see the d option in the Options section).
- X
- X Each output paragraph is generated from the corresponding input
- X paragraph as follows:
- X
- X 1) An optional prefix and/or suffix is removed from each input line.
- X 2) The remainder is divided into words (separated by spaces).
- X 3) The words are joined into lines to make an eye-pleasing paragraph.
- X 4) The prefixes and suffixes are reattached.
- X
- X If there are suffixes, spaces are inserted before them so that they all
- X end in the same column.
- X
- X
- XTerminology
- X
- X Miscellaneous terms:
- X
- X charset syntax
- X A way of representing a set of characters as a string. The set
- X includes exactly those characters which appear in the string,
- X except that the underscore (_) is an escape character. Whenever
- X it appears, it must begin one of the following escape sequences:
- X
- X __ = an underscore
- X _s = a space
- X _b = a backslash (\)
- X _q = a single quote (')
- X _Q = a double quote (")
- X _A = all upper case letters
- X _a = all lower case letters
- X _0 = all decimal digits
- X _xhh = the character represented by the two hexadecimal
- X digits hh (which may be upper or lower case)
- X
- X The NUL character must not appear in the string but it may be
- X included in the set with the _x00 sequence.
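The charset syntax above can be read as a small parser. The following is a minimal sketch, not par's own code: the name parse_charset is hypothetical, and "letters" and "digits" are assumed to mean the ASCII ones.

```python
import string

# Escape codes that stand for one character, and codes that stand for a class.
SINGLE = {"_": "_", "s": " ", "b": "\\", "q": "'", "Q": '"'}
CLASSES = {
    "A": string.ascii_uppercase,  # assumption: ASCII letters only
    "a": string.ascii_lowercase,
    "0": string.digits,
}


def parse_charset(spec):
    """Return the set of characters described by a charset-syntax string."""
    chars = set()
    i = 0
    while i < len(spec):
        if spec[i] != "_":
            chars.add(spec[i])  # ordinary characters stand for themselves
            i += 1
            continue
        code = spec[i + 1:i + 2]  # "" if the underscore is the last character
        if code in SINGLE:
            chars.add(SINGLE[code])
            i += 2
        elif code in CLASSES:
            chars.update(CLASSES[code])
            i += 2
        elif code == "x":
            digits = spec[i + 2:i + 4]
            if len(digits) != 2 or any(d not in string.hexdigits for d in digits):
                raise ValueError("_x must be followed by two hexadecimal digits")
            chars.add(chr(int(digits, 16)))
            i += 4
        else:
            raise ValueError("underscore does not begin a valid escape sequence")
    return chars
```

For instance, parse_charset("_A_a.") yields the 52 ASCII letters plus the period.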
- X
- X error
- X A condition which causes par to abort. See the Diagnostics
- X section.
- X
- X IP Input paragraph.
- X
- X OP Output paragraph.
- X
- X parameter
- X A symbol which may take on unsigned integral values. There are
- X several parameters whose values affect the behavior of par.
- X Parameters can be assigned values using command line options.
- X
- X
- X Types of characters:
- X
- X alphanumeric character
- X An upper case letter, lower case letter, or decimal digit.
- X
- X body character
- X A member of the set of characters defined by the PARBODY
- X environment variable (see the Environment section).
- X
- X protective character
- X A member of the set of characters defined by the PARPROTECT
- X environment variable (see the Environment section).
- X
- X quote character
- X A member of the set of characters defined by the PARQUOTE
- X environment variable (see the Environment section).
- X
- X terminal character
- X A period, question mark, exclamation point, or colon.
- X
- X white character
- X A space, formfeed, newline, carriage return, tab, or vertical
- X tab.
- X
- X Functions:
- X
- X comprelen
- X The comprelen of a non-empty set of lines is the length of
- X the longest string of non-body characters appearing at the
- X beginning of every line in the set.
- X
- X comsuflen
- X Given a non-empty set <S> of lines, let <p> be the comprelen of
- X <S>. Let <T> be the set of lines which result from stripping
- X the first <p> characters from each line in <S>. Let <c> be the
- X longest string of non-body characters appearing at the end of
- X every line in <T>. Strip <c> of all initial spaces except the
- X last. The length of <c> is the comsuflen of <S>.
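The two definitions above translate fairly directly into code. This is a sketch under assumptions: the function names simply reuse the terms, lines are Python strings, and body_chars is any set of characters (par's internal representation is not described here).

```python
def comprelen(lines, body_chars):
    """Length of the longest run of non-body characters starting every line."""
    first = lines[0]
    n = 0
    while n < len(first) and first[n] not in body_chars:
        # Stop as soon as some line is too short or disagrees at position n.
        if any(len(line) <= n or line[n] != first[n] for line in lines):
            break
        n += 1
    return n


def comsuflen(lines, body_chars):
    """Length of the common non-body suffix, after removing the common prefix."""
    p = comprelen(lines, body_chars)
    stripped = [line[p:] for line in lines]
    first = stripped[0]
    n = 0
    while n < len(first) and first[-1 - n] not in body_chars:
        if any(len(line) <= n or line[-1 - n] != first[-1 - n] for line in stripped):
            break
        n += 1
    common = first[len(first) - n:]
    # Strip the common suffix of all initial spaces except the last one.
    leading = len(common) - len(common.lstrip(" "))
    if leading > 1:
        common = common[leading - 1:]
    return len(common)
```

With an empty set of body characters, the lines "/* ab   */" and "/* cde  */" have comprelen 3 ("/* ") and comsuflen 3 (" */").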
- X
- X quoteprefix
- X The quoteprefix of a line is the longest string of quote
- X characters appearing at the beginning of the line, after this
- X string has been stripped of any trailing spaces.
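As a rough illustration of the quoteprefix definition, assuming the default quote characters described in the Environment section (the greater-than sign and the space) and a hypothetical function name:

```python
def quoteprefix(line, quote_chars=frozenset("> ")):
    """Longest leading run of quote characters, minus any trailing spaces."""
    n = 0
    while n < len(line) and line[n] in quote_chars:
        n += 1
    return line[:n].rstrip(" ")
```

So the quoteprefix of "> > May I remind" is "> >", and that of an unquoted line is the empty string.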
- X
- X Types of lines:
- X
- X blank line
- X An empty line, or a line whose first character is not protective
- X and which contains only spaces.
- X
- X protected line
- X An input line whose first character is protective.
- X
- X vacant line
- X Any line which can be shown to be vacant by a finite number of
- X applications of the following recursive rule: Suppose <S> is a
- X subsequence of a segment (see below) bounded above and below by
- X vacant lines or by the beginning/end of the segment. Let <p>
- X and <s> be the comprelen and comsuflen of <S>. Any member of
- X <S> which, if stripped of its first <p> characters and last <s>
- X characters, would be blank, is vacant.
- X
- X Groups of lines:
- X
- X segment
- X A contiguous sequence of input lines containing no protected or
- X blank lines, bounded above and below by protected lines, blank
- X lines, and/or the beginning/end of the input.
- X
- X block
- X A contiguous subsequence of a segment containing no vacant lines,
- X bounded above and below by vacant lines and/or the beginning/end
- X of the segment.
- X
- X Types of words:
- X
- X capitalized word
- X If <cap> is 0, a capitalized word is one which contains at
- X least one alphanumeric character, whose first alphanumeric
- X character is not a lower case letter. If <cap> is 1, every word
- X is considered a capitalized word.
- X
- X curious word
- X A word which contains a terminal character <c> such that there
- X are no alphanumeric characters in the word after <c>, but there
- X is at least one alphanumeric character in the word before <c>.
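The two word types can be checked with a few lines of code. This sketch assumes ASCII alphanumerics; is_capitalized and is_curious are illustrative names, not functions taken from par's source.

```python
import string

ALNUM = set(string.ascii_letters + string.digits)  # assumption: ASCII only
TERMINAL = set(".?!:")


def is_capitalized(word, cap=0):
    """True if the word counts as capitalized for the given <cap> value."""
    if cap == 1:
        return True
    for ch in word:
        if ch in ALNUM:
            # Only the first alphanumeric character matters.
            return ch not in string.ascii_lowercase
    return False  # no alphanumeric character at all


def is_curious(word):
    """True if some terminal character has alphanumerics before it but none after."""
    for i, ch in enumerate(word):
        if ch in TERMINAL:
            before = any(c in ALNUM for c in word[:i])
            after = any(c in ALNUM for c in word[i + 1:])
            if before and not after:
                return True
    return False
```

Under these definitions "Q." and "net.god?" are curious, while ".d" and "e-mail" are not.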
- X
- X
- XOptions
- X
- X Any command line argument may begin with one minus sign (-) which is
- X ignored. Generally, more than one option may appear in a single command
- X line argument, but there are exceptions: The help, version, B, P, and Q
- X options must have whole arguments all to themselves.
- X
- X help Causes all remaining arguments to be ignored. No input
- X is read. A usage message is printed on the output briefly
- X describing the options used by par.
- X
- X version Causes all remaining arguments to be ignored. No input is
- X read. "par 1.31" is printed on the output. Of course, this
- X will change in future releases of Par.
- X
- X B<op><set> <op> is a single character, either an equal sign (=), a
- X plus sign (+), or a minus sign (-). <set> is a string using
- X charset syntax. If <op> is an equal sign, the set of body
- X characters is set to the character set defined by <set>. If
- X <op> is a plus/minus sign, the characters in the set defined
- X by <set> are added/removed to/from the existing set of body
- X characters defined by the PARBODY environment variable and
- X any previous B options. It is okay to add characters that
- X are already in the set or to remove characters that are not
- X in the set.
- X
- X P<op><set> Just like the B option, except that it applies to the set of
- X protective characters.
- X
- X Q<op><set> Just like the B option, except that it applies to the set of
- X quote characters.
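The effect of the three <op> characters on a character set can be sketched as follows. The function name is hypothetical, and the <set> argument is assumed to have been expanded from charset syntax already.

```python
def apply_set_option(current, op, chars):
    """Apply a B, P, or Q style option to an existing set of characters."""
    chars = set(chars)
    if op == "=":
        return chars              # replace the set outright
    if op == "+":
        return current | chars    # adding existing members is harmless
    if op == "-":
        return current - chars    # removing absent members is harmless
    raise ValueError("<op> must be one of = + -")
```

Applying the options in command line order, starting from the set given by the environment variable, gives the final set.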
- X
- X All remaining options are used to set values of parameters. Values set
- X by command line options hold for all paragraphs. Unset parameters are
- X given default values. Any unset parameters whose default values depend
- X on the IP are recomputed separately for each paragraph.
- X
- X The approximate role of each parameter is described here. See the
- X Details section for the rest of the story.
- X
- X The first four parameters, <hang>, <prefix>, <suffix>, and <width>, may
- X be set to any unsigned decimal integer less than 10000.
- X
- X h[<hang>] Mainly affects the default values of <prefix> and <suffix>.
- X Defaults to 0. If the h option is given without a number,
- X the value 1 is inferred. (See also the p and s options.)
- X
- X p<prefix> The first <prefix> characters of each line of the OP
- X are copied from the first <prefix> characters of the
- X corresponding line of the IP. If there are more than
- X <hang> + 1 lines in the IP, the default value is the
- X comprelen of all the lines in the IP except the first <hang>
- X of them. If there are exactly <hang> + 1 lines in the IP
- X and <quote> is 1, the default value is the number of leading
- X quote characters in the last line. Otherwise the default
- X value is 0. (See also the h and q options.)
- X
- X s<suffix> The last <suffix> characters of each line of the OP
- X are copied from the last <suffix> characters of the
- X corresponding line of the IP. If there are more than
- X <hang> + 1 lines in the IP, the default value is the
- X comsuflen of all the lines of the IP except the first <hang>
- X of them. Otherwise the default value is 0. (See also the h
- X option.)
- X
- X w<width> No line in the OP will contain more than <width> characters,
- X not including the trailing newlines. Defaults to 72.
- X
- X The remaining nine parameters, <cap>, <div>, <fit>, <guess>, <just>,
- X <last>, <quote>, <Report>, and <touch>, may be set to either 0 or 1. If
- X the number is absent in the option, the value 1 is inferred.
- X
- X c[<cap>] If <cap> is 1, all words are considered capitalized. This
- X currently affects only the application of the g option.
- X
- X d[<div>] If <div> is 0, each block becomes an IP. If <div> is 1,
- X each block is subdivided into IPs as follows: Let <p> be
- X the comprelen of the block. Let a line's status be 1 if its
- X (<p> + 1)st character is a space, 0 otherwise. Every line
- X in the block whose status is the same as the status of the
- X first line will begin a new paragraph. Defaults to 0.
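The subdivision rule for <div> = 1 can be sketched like this. The function name is illustrative, and the comprelen <p> of the block is assumed to have been computed beforehand.

```python
def split_block(block, p):
    """Divide a block (a list of lines) into IPs as described for <div> = 1."""

    def status(line):
        # 1 if the (p + 1)st character is a space, 0 otherwise.
        return 1 if len(line) > p and line[p] == " " else 0

    first_status = status(block[0])
    paragraphs = []
    for line in block:
        if not paragraphs or status(line) == first_status:
            paragraphs.append([line])      # this line begins a new paragraph
        else:
            paragraphs[-1].append(line)    # continuation of the current one
    return paragraphs
```

The same rule covers both indented first lines and hanging indentation, since it only compares each line's status with that of the first line.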
- X
- X f[<fit>] If <fit> is 1 and <just> is 0, par will try to make the
- X lines in the OP as nearly the same length as possible, even
- X if it means making the OP narrower. Defaults to 0. (See
- X also the j option.)
- X
- X g[<guess>] If <guess> is 1, then when par is choosing line breaks,
- X whenever it encounters a curious word followed by a
- X capitalized word, it takes one of two special actions.
- X If the two words are separated by a single space in the
- X input, they will be merged into one word with an embedded
- X non-breaking space. If the two words are separated by more
- X than one space, or by a line break, par will ensure that
- X they are separated by two spaces, or by a line break, in the
- X output. Defaults to 0.
- X
- X j[<just>] If <just> is 1, par justifies the OP, inserting spaces
- X between words so that all lines in the OP have length
- X <width> (except the last, if <last> is 0). <fit> has no
- X effect if <just> is 1. Defaults to 0. (See also the w, l,
- X and f options.)
- X
- X l[<last>] If <last> is 1, par tries to make the last line of the OP
- X about the same length as the others. Defaults to 0.
- X
- X q[<quote>] If <quote> is 1, then before each segment is scanned for
- X vacant lines, par will insert some new lines as follows:
- X For each pair of adjacent lines in the segment, if the
- X quoteprefix of one is a prefix of (but not the same as) the
- X quoteprefix of the other, and each of the two lines contains
- X at least one non-quote character, then a line consisting of
- X the smaller quoteprefix will be inserted between the two
- X lines. <quote> also affects the default value of <prefix>.
- X Defaults to 0. (See also the p option.)
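The line insertion performed when <quote> is 1 can be sketched as below. This is an illustration only: the names are hypothetical and the default quote characters (greater-than sign and space) are assumed.

```python
def insert_quote_lines(segment, quote_chars=frozenset("> ")):
    """Return the segment with quoteprefix-only lines inserted where nesting changes."""

    def quoteprefix(line):
        n = 0
        while n < len(line) and line[n] in quote_chars:
            n += 1
        return line[:n].rstrip(" ")

    def has_text(line):
        return any(ch not in quote_chars for ch in line)

    out = []
    for i, line in enumerate(segment):
        if i > 0:
            prev = segment[i - 1]
            a, b = quoteprefix(prev), quoteprefix(line)
            if a != b and has_text(prev) and has_text(line):
                if b.startswith(a):
                    out.append(a)   # nesting gets deeper: insert the smaller prefix
                elif a.startswith(b):
                    out.append(b)   # nesting gets shallower
        out.append(line)
    return out
```

Applied to a message with quoting depths 0, 1, 2, 2, 2, 1, 1, 0, this inserts four lines: two empty ones and two consisting of ">".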
- X
- X R[<Report>] If <Report> is 1, it will be considered an error for an
- X input word to contain more than <L> = (<width> - <prefix> -
- X <suffix>) characters. Otherwise, such words will be chopped
- X after each <L>th character into shorter words. Defaults
- X to 0. It is recommended that this option be included in
- X PARINIT (see the Environment section).
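The two behaviors selected by <Report> can be sketched as follows, with a hypothetical function name and a Python exception standing in for par's error.

```python
def chop_word(word, limit, report=0):
    """Split a too-long word after every <limit>th character, or fail if report is 1."""
    if len(word) <= limit:
        return [word]
    if report == 1:
        raise ValueError("word longer than <L> characters")
    return [word[i:i + limit] for i in range(0, len(word), limit)]
```

Here limit plays the role of <L> = <width> - <prefix> - <suffix>.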
- X
- X t[<touch>] Has no effect if <suffix> is 0 or <just> is 1. Otherwise,
- X if <touch> is 0, all lines in the OP have length <width>.
- X If <touch> is 1, the length of the lines is decreased until
- X the suffixes touch the body of the OP. Defaults to the
- X logical OR of <fit> and <last>. (See also the s, j, w, f,
- X and l options.)
- X
- X If the value of any parameter is set more than once, the last value is
- X used. When unset parameters are assigned default values, <hang> and
- X <quote> are assigned before <prefix>, and <fit> and <last> are assigned
- X before <touch> (because of the dependencies).
- X
- X It is an error if <width> <= <prefix> + <suffix>.
- X
- X
- XEnvironment
- X
- X PARBODY Determines the initial set of body characters (which are
- X used for determining comprelens and comsuflens), using
- X charset syntax. If PARBODY is not set, the set of body
- X characters is initially empty. A good value for PARBODY
- X might be "_A_a.", but it depends on the application.
- X
- X PARINIT If set, par will read command line arguments from PARINIT
- X before it reads them from the command line. Within
- X the value of PARINIT, arguments are separated by white
- X characters.
- X
- X PARPROTECT Determines the set of protective characters, using charset
- X syntax. If PARPROTECT is not set, the set of protective
- X characters is initially empty.
- X
- X PARQUOTE Determines the set of quote characters, using charset
- X syntax. If PARQUOTE is not set, the set of quote characters
- X initially contains only the greater-than sign (>) and the
- X space.
- X
- X If a NUL character appears in the value of an environment variable, it
- X and the rest of the string will not be seen by par.
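The order in which arguments are gathered from PARINIT and the command line can be sketched as below. The function name is hypothetical; the separator set is the list of white characters from the Terminology section.

```python
import os
import re
import sys


def gather_arguments(argv, environ):
    """Return PARINIT's arguments followed by the command line arguments."""
    init = environ.get("PARINIT", "")
    # Split on runs of white characters: space, formfeed, newline,
    # carriage return, tab, vertical tab.
    init_args = [a for a in re.split(r"[ \f\n\r\t\v]+", init) if a]
    return init_args + list(argv)


if __name__ == "__main__":
    print(gather_arguments(sys.argv[1:], os.environ))
```

Because the last value assigned to a parameter is the one used, anything on the command line overrides the same parameter in PARINIT.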
- X
- X
- XDetails
- X
- X Lines are terminated by newline characters, but the newlines are not
- X considered to be included in the lines. If the last character of the
- X input is a non-newline, a newline will be inferred immediately after
- X it (but if the input is empty, no newline will be inferred; the number
- X of input lines will be 0). Thus, the input can always be viewed as a
- X sequence of lines.
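The rule for viewing the input as a sequence of lines can be sketched as follows (split_lines is an illustrative name):

```python
def split_lines(text):
    """Split input into lines, inferring a final newline if one is missing."""
    if text == "":
        return []            # empty input: zero lines, no newline inferred
    if not text.endswith("\n"):
        text += "\n"         # inferred newline after a final non-newline
    return text[:-1].split("\n")
```

Note that "a\n\n" is two lines (the second one empty), while "a\n" and "a" are each one line.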
- X
- X Protected lines are copied unchanged from the input to the output. All
- X other input lines, as they are read, have any NUL characters removed,
- X and every white character (except newlines) turned into a space.
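The per-line treatment at read time can be sketched like this, assuming each line has already had its newline removed and that the set of protective characters is supplied by the caller (the function name is hypothetical):

```python
WHITE_EXCEPT_NEWLINE = " \f\r\t\v"


def normalize_line(line, protective_chars=frozenset()):
    """Copy protected lines unchanged; otherwise drop NULs and turn white into spaces."""
    if line and line[0] in protective_chars:
        return line
    return "".join(
        " " if ch in WHITE_EXCEPT_NEWLINE else ch
        for ch in line
        if ch != "\0"
    )
```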
- X
- X Blank lines in the input are transformed into empty lines in the output.
- X Vacant lines in the input are stripped of trailing spaces before being
- X output.
- X
- X The input is divided into segments, which are divided into blocks,
- X which are divided into IPs. The exact process depends on the values of
- X <quote> and <div> (see q and d in the Options section). The remainder
- X of this section describes the process which is applied independently to
- X each IP to construct the corresponding OP.
- X
- X After the values of the parameters are determined (see the Options
- X section), the first <prefix> characters and the last <suffix> characters
- X of each input line are removed and remembered. It is an error for any
- X line to contain fewer than <prefix> + <suffix> characters.
- X
- X The remaining text is treated as a sequence of characters, not lines.
- X The text is broken into words, which are separated by spaces. That is,
- X a word is a maximal sub-sequence of non-spaces. If <guess> is 1, some
- X words might be merged (see g in the Options section). The first word
- X includes any spaces that precede it on the same line.
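Word splitting as described above can be sketched as follows. The input is assumed to be the lines of one IP after prefixes and suffixes have been removed; merging for <guess> = 1 is left out, and the function name is illustrative.

```python
def split_words(lines):
    """Break the text of an IP into words; the first word keeps its leading spaces."""
    # Treat the text as one sequence of characters, not as lines.
    words = [w for w in " ".join(lines).split(" ") if w]
    if words:
        for line in lines:
            if line.strip(" "):
                # Spaces preceding the first word on its own line belong to it.
                lead = len(line) - len(line.lstrip(" "))
                words[0] = " " * lead + words[0]
                break
    return words
```

Keeping the leading spaces inside the first word is what lets an indented first line stay indented in the OP.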
- X
- X Let <L> = <width> - <prefix> - <suffix>.
- X
- X If <Report> is 0, some words may get chopped up at this point (see R in
- X the Options section).
- X
- X The words are reassembled, preserving their order, into lines. If
- X <just> is 0, adjacent words within a line are separated by a single
- X space (or sometimes two if <guess> is 1), and line breaks are chosen so
- X that the paragraph satisfies the following properties:
- X
- X 1) No line contains more than <L> characters.
- X
- X 2) If <fit> is 1, the difference between the lengths of the
- X shortest and longest lines is as small as possible.
- X
- X 3) The shortest line is as long as possible, subject to properties
- X 1 and 2.
- X
- X 4) Let <target> be <L> if <fit> is 0, or the length of the longest
- X line if <fit> is 1. The sum of the squares of the differences
- X between <target> and the lengths of the lines is as small as
- X possible, subject to properties 1, 2, and 3.
- X
- X If <last> is 0, the last line does not count as a line for the
- X purposes of properties 2, 3, and 4 above.
- X
- X If all the words fit on a single line, the properties as worded
- X above don't make much sense. In that case, no line breaks are
- X inserted.
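The properties above can be met with a two-pass dynamic program. The sketch below covers only the case <just> = 0 and <fit> = 0, and is not taken from par's source: choose_breaks is a hypothetical name, and words are assumed to be no longer than <L>. The first pass finds the best achievable shortest line (property 3); the second minimizes the sum of squares among layouts that achieve it (property 4).

```python
import math

PREAMBLE = (
    "We the people of the United States, in order to form a more perfect "
    "union, establish justice, insure domestic tranquility, provide for the "
    "common defense, promote the general welfare, and secure the blessing of "
    "liberty to ourselves and our posterity, do ordain and establish the "
    "Constitution of the United States of America."
)


def choose_breaks(words, L, last=0):
    """Return the lines of a paragraph satisfying properties 1, 3, and 4."""
    n = len(words)
    if n == 0:
        return []

    def fitting_lines(i):
        # Yield (j, length) for every line words[i..j] of at most L characters.
        length = -1
        for j in range(i, n):
            length += 1 + len(words[j])
            if length > L:
                break
            yield j, length

    # Pass 1: best_short[i] is the largest possible "shortest counted line"
    # over all ways of breaking words[i:]. The last line counts only if last is 1.
    best_short = [-math.inf] * (n + 1)
    for i in range(n - 1, -1, -1):
        for j, length in fitting_lines(i):
            if j == n - 1:
                cand = length if last else math.inf
            else:
                cand = min(length, best_short[j + 1])
            best_short[i] = max(best_short[i], cand)
    shortest = best_short[0]
    if shortest == -math.inf:
        raise ValueError("some word is longer than L")

    # Pass 2: among layouts whose counted lines all reach `shortest`,
    # minimize the sum of squares of (L - length).
    cost = [math.inf] * (n + 1)
    nxt = [n] * (n + 1)
    for i in range(n - 1, -1, -1):
        for j, length in fitting_lines(i):
            counted = bool(last) or j < n - 1
            if counted and length < shortest:
                continue
            rest = 0 if j == n - 1 else cost[j + 1]
            total = ((L - length) ** 2 if counted else 0) + rest
            if total < cost[i]:
                cost[i], nxt[i] = total, j + 1

    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:nxt[i]]))
        i = nxt[i]
    return lines


if __name__ == "__main__":
    for line in choose_breaks(PREAMBLE.split(" "), 31):
        print(line)
```

If all the words fit on one line, the first pass already prefers the single-line layout, so no special case is needed. Handling <fit> = 1 would add property 2 as a further pass.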
- X
- X If <just> is 1, adjacent words within a line are separated by one space
- X (or sometimes two if <guess> is 1) plus zero or more extra spaces. The
- X value of <fit> is disregarded, and line breaks are chosen so that the
- X paragraph satisfies the following properties:
- X
- X 1) Every line contains exactly <L> characters.
- X
- X 2) The largest inter-word gap is as small as possible, subject
- X to property 1. (An inter-word gap consists only of the extra
- X spaces, not the regular spaces.)
- X
- X 3) The sum of the squares of the lengths of the inter-word gaps is
- X as small as possible, subject to properties 1 and 2.
- X
- X If <last> is 0, the last line does not count as a line for the
- X purposes of property 1, and it does not require or contain any extra
- X spaces.
- X
- X Extra spaces are distributed as uniformly as possible among the
- X inter-word gaps in each line.
- X
- X In a justified paragraph, every line must contain at least two
- X words, but that's not always possible to accomplish. If the
- X paragraph cannot be justified, it is considered an error.
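Once line breaks are chosen, filling one justified line might look like the sketch below. The text says only that extra spaces are spread as uniformly as possible; giving the leftover spaces to the leftmost gaps is an assumption made here, and justify_line is a hypothetical name.

```python
def justify_line(words, L):
    """Pad the gaps between words so the line has exactly L characters."""
    gaps = len(words) - 1
    if gaps < 1:
        raise ValueError("a justified line needs at least two words")
    extra = L - (sum(len(w) for w in words) + gaps)
    if extra < 0:
        raise ValueError("words do not fit in L characters")
    base, leftover = divmod(extra, gaps)
    out = words[0]
    for k, word in enumerate(words[1:]):
        # One regular space, plus this gap's share of the extra spaces.
        out += " " * (1 + base + (1 if k < leftover else 0)) + word
    return out
```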
- X
- X If the number of lines in the resulting paragraph is less than <hang>,
- X empty lines are added at the end to bring the number of lines up to
- X <hang>.
- X
- X If <just> is 0 and <touch> is 1, <L> is changed to be the length of the
- X longest line.
- X
- X If <suffix> is not 0, each line is padded at the end with spaces to
- X bring its length up to <L>.
- X
- X To each line is prepended <prefix> characters. Let <n> be the number of
- X lines in the IP. The characters which are prepended to the <i>th line
- X are chosen as follows:
- X
- X 1) If <i> <= <n>, the characters are copied from the ones that were
- X removed from the beginning of the <i>th input line.
- X
- X 2) If <i> > <n> > <hang>, the characters are copied from the ones that
- X were removed from the beginning of the last input line.
- X
- X 3) If <i> > <n> and <n> <= <hang>, the characters are all spaces.
- X
- X Then to each line is appended <suffix> characters. The characters which
- X are appended to the <i>th line are chosen as follows:
- X
- X 1) If <i> <= <n>, the characters are copied from the ones that were
- X removed from the end of the <i>th input line.
- X
- X 2) If <i> > <n> > 0, the characters are copied from the ones that were
- X removed from the end of the last input line.
- X
- X 3) If <n> = 0, the characters are all spaces.
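The reattachment rules can be sketched as below. Rule 1 is read here as referring to the corresponding (<i>th) input line, which is an interpretation. The function name is hypothetical, and the body lines are assumed to be padded already where a suffix is present.

```python
def attach_affixes(body_lines, in_prefixes, in_suffixes, prefix, suffix, hang):
    """Reattach prefixes and suffixes to the lines of an OP.

    in_prefixes and in_suffixes hold the strings removed from the n input lines;
    prefix and suffix are the values of <prefix> and <suffix>.
    """
    n = len(in_prefixes)
    out = []
    for idx, body in enumerate(body_lines):
        i = idx + 1                      # 1-based output line number
        if i <= n:
            pre, suf = in_prefixes[i - 1], in_suffixes[i - 1]
        else:
            # More output lines than input lines.
            pre = in_prefixes[-1] if n > hang else " " * prefix
            suf = in_suffixes[-1] if n > 0 else " " * suffix
        out.append(pre + body + suf)
    return out
```

The <hang> test in the prefix rule keeps a hanging first-line prefix, such as a list number, from being repeated on lines that had no input counterpart.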
- X
- X Finally, the lines are printed to the output as the OP.
- X
- X
- XDiagnostics
- X
- X If there are no errors, par returns EXIT_SUCCESS (see <stdlib.h>).
- X
- X If there is an error, an error message will be printed to the output,
- X and par will return EXIT_FAILURE. If the error is local to a single
- X paragraph, the preceding paragraphs will have been output before
- X the error was detected. Line numbers in error messages are local to
- X the IP in which the error occurred. All error messages begin with
- X "par error:" on a line by itself. Error messages concerning command
- X line or environment variable syntax are accompanied by the same usage
- X message that the help option produces.
- X
- X Of course, trying to print an error message would be futile if an error
- X resulted from an output function, so par doesn't bother doing any error
- X checking on output functions.
- X
- X
- XExamples
- X
- X The superiority of par's dynamic programming algorithm over a greedy
- X algorithm (such as the one used by fmt) can be seen in the following
- X example:
- X
- X Original paragraph:
- X
- X We the people of the United States,
- X in order to form a more perfect union,
- X establish justice,
- X insure domestic tranquility,
- X provide for the common defense,
- X promote the general welfare,
- X and secure the blessing of liberty
- X to ourselves and our posterity,
- X do ordain and establish the Constitution
- X of the United States of America.
- X
- X After a greedy algorithm with width = 39:
- X
- X We the people of the United
- X States, in order to form a more
- X perfect union, establish
- X justice, insure domestic
- X tranquility, provide for the
- X common defense, promote the
- X general welfare, and secure the
- X blessing of liberty to
- X ourselves and our posterity, do
- X ordain and establish the
- X Constitution of the United
- X States of America.
- X
- X After "par 39":
- X
- X We the people of the United
- X States, in order to form a
- X more perfect union, establish
- X justice, insure domestic
- X tranquility, provide for the
- X common defense, promote the
- X general welfare, and secure
- X the blessing of liberty to
- X ourselves and our posterity,
- X do ordain and establish the
- X Constitution of the United
- X States of America.
- X
- X The line breaks chosen by par are clearly more eye-pleasing.
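For comparison, a greedy (first-fit) filler is only a few lines long. This is a generic sketch and not fmt's actual code, which is not described here. The limit of 31 characters per line is chosen because it reproduces the greedy listing above; how it relates to the width of 39 depends on indentation that is not part of the words themselves.

```python
PREAMBLE = (
    "We the people of the United States, in order to form a more perfect "
    "union, establish justice, insure domestic tranquility, provide for the "
    "common defense, promote the general welfare, and secure the blessing of "
    "liberty to ourselves and our posterity, do ordain and establish the "
    "Constitution of the United States of America."
)


def greedy_fill(words, L):
    """Put each word on the current line if it fits, else start a new line."""
    lines, current = [], ""
    for word in words:
        if not current:
            current = word
        elif len(current) + 1 + len(word) <= L:
            current += " " + word
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    for line in greedy_fill(PREAMBLE.split(" "), 31):
        print(line)
```

Because each line is filled without regard to the lines that follow, short lines such as "blessing of liberty to" (22 characters) end up next to full 31-character lines.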
- X
- X par is most useful in conjunction with the text-filtering features of an
- X editor, such as the ! commands of vi.
- X
- X The rest of this section is a series of before-and-after pictures
- X showing some typical uses of par.
- X
- X Before:
- X
- X /* We the people of the United States, */
- X /* in order to form a more perfect union, */
- X /* establish justice, */
- X /* insure domestic tranquility, */
- X /* provide for the common defense, */
- X /* promote the general welfare, */
- X /* and secure the blessing of liberty */
- X /* to ourselves and our posterity, */
- X /* do ordain and establish the Constitution */
- X /* of the United States of America. */
- X
- X After "par 59":
- X
- X /* We the people of the United States, in */
- X /* order to form a more perfect union, establish */
- X /* justice, insure domestic tranquility, provide */
- X /* for the common defense, promote the general */
- X /* welfare, and secure the blessing of liberty */
- X /* to ourselves and our posterity, do ordain */
- X /* and establish the Constitution of the United */
- X /* States of America. */
- X
- X Or after "par 59f":
- X
- X /* We the people of the United States, */
- X /* in order to form a more perfect union, */
- X /* establish justice, insure domestic */
- X /* tranquility, provide for the common */
- X /* defense, promote the general welfare, */
- X /* and secure the blessing of liberty to */
- X /* ourselves and our posterity, do ordain */
- X /* and establish the Constitution of the */
- X /* United States of America. */
- X
- X Or after "par 59l":
- X
- X /* We the people of the United States, in */
- X /* order to form a more perfect union, establish */
- X /* justice, insure domestic tranquility, */
- X /* provide for the common defense, promote */
- X /* the general welfare, and secure the */
- X /* blessing of liberty to ourselves and our */
- X /* posterity, do ordain and establish the */
- X /* Constitution of the United States of America. */
- X
- X Or after "par 59lf":
- X
- X /* We the people of the United States, */
- X /* in order to form a more perfect union, */
- X /* establish justice, insure domestic */
- X /* tranquility, provide for the common */
- X /* defense, promote the general welfare, */
- X /* and secure the blessing of liberty */
- X /* to ourselves and our posterity, do */
- X /* ordain and establish the Constitution */
- X /* of the United States of America. */
- X
- X Or after "par 59lft0":
- X
- X /* We the people of the United States, */
- X /* in order to form a more perfect union, */
- X /* establish justice, insure domestic */
- X /* tranquility, provide for the common */
- X /* defense, promote the general welfare, */
- X /* and secure the blessing of liberty */
- X /* to ourselves and our posterity, do */
- X /* ordain and establish the Constitution */
- X /* of the United States of America. */
- X
- X Or after "par 59j":
- X
- X /* We the people of the United States, in */
- X /* order to form a more perfect union, establish */
- X /* justice, insure domestic tranquility, provide */
- X /* for the common defense, promote the general */
- X /* welfare, and secure the blessing of liberty */
- X /* to ourselves and our posterity, do ordain and */
- X /* establish the Constitution of the United */
- X /* States of America. */
- X
- X Or after "par 59jl":
- X
- X /* We the people of the United States, */
- X /* in order to form a more perfect */
- X /* union, establish justice, insure domestic */
- X /* tranquility, provide for the common defense, */
- X /* promote the general welfare, and secure */
- X /* the blessing of liberty to ourselves and */
- X /* our posterity, do ordain and establish the */
- X /* Constitution of the United States of America. */
- X
- X Before:
- X
- X Preamble We the people of the United States,
- X to the US in order to form
- X Constitution a more perfect union,
- X establish justice,
- X insure domestic tranquility,
- X provide for the common defense,
- X promote the general welfare,
- X and secure the blessing of liberty
- X to ourselves and our posterity,
- X do ordain and establish
- X the Constitution
- X of the United States of America.
- X
- X After "par 52h3":
- X
- X Preamble We the people of the United
- X to the US States, in order to form a
- X Constitution more perfect union, establish
- X justice, insure domestic
- X tranquility, provide for the
- X common defense, promote the
- X general welfare, and secure
- X the blessing of liberty to
- X ourselves and our posterity,
- X do ordain and establish the
- X Constitution of the United
- X States of America.
- X
- X Before:
- X
- X 1 We the people of the United States,
- X 2 in order to form a more perfect union,
- X 3 establish justice,
- X 4 insure domestic tranquility,
- X 5 provide for the common defense,
- X 6 promote the general welfare,
- X 7 and secure the blessing of liberty
- X 8 to ourselves and our posterity,
- X 9 do ordain and establish the Constitution
- X 10 of the United States of America.
- X
- X After "par 59p12l":
- X
- X 1 We the people of the United States, in order to
- X 2 form a more perfect union, establish justice,
- X 3 insure domestic tranquility, provide for the
- X 4 common defense, promote the general welfare,
- X 5 and secure the blessing of liberty to ourselves
- X 6 and our posterity, do ordain and establish the
- X 7 Constitution of the United States of America.
- X
- X Before:
- X
- X > > We the people
- X > > of the United States,
- X > > in order to form a more perfect union,
- X > > establish justice,
- X > > ensure domestic tranquility,
- X > > provide for the common defense,
- X >
- X > Promote the general welfare,
- X > and secure the blessing of liberty
- X > to ourselves and our posterity,
- X > do ordain and establish
- X > the Constitution of the United States of America.
- X
- X After "par 52":
- X
- X > > We the people of the United States, in
- X > > order to form a more perfect union,
- X > > establish justice, ensure domestic
- X > > tranquility, provide for the common
- X > > defense,
- X >
- X > Promote the general welfare, and secure
- X > the blessing of liberty to ourselves and
- X > our posterity, do ordain and establish
- X > the Constitution of the United States of
- X > America.
- X
- X Before:
- X
- X > We the people
- X > of the United States,
- X > in order to form a more perfect union,
- X > establish justice,
- X > ensure domestic tranquility,
- X > provide for the common defense,
- X > Promote the general welfare,
- X > and secure the blessing of liberty
- X > to ourselves and our posterity,
- X > do ordain and establish
- X > the Constitution of the United States of America.
- X
- X After "par 52d":
- X
- X > We the people of the United States,
- X > in order to form a more perfect union,
- X > establish justice, ensure domestic
- X > tranquility, provide for the common
- X > defense,
- X > Promote the general welfare, and secure
- X > the blessing of liberty to ourselves and
- X > our posterity, do ordain and establish
- X > the Constitution of the United States of
- X > America.
- X
- X Before:
- X
- X Joe Public writes:
- X > Jane Doe writes:
- X > > I can't find the source for uncompress.
- X > Oh no, not again!!!
- X >
- X > Isn't there a FAQ for this?
- X That wasn't very helpful, Joe. Jane,
- X just make a link from uncompress to compress.
- X
- X After "par 40q":
- X
- X Joe Public writes:
- X
- X > Jane Doe writes:
- X >
- X > > I can't find the source for
- X > > uncompress.
- X >
- X > Oh no, not again!!!
- X >
- X > Isn't there a FAQ for this?
- X
- X That wasn't very helpful, Joe.
- X Jane, just make a link from
- X uncompress to compress.
- X
- X Before:
- X
- X I sure hope there's still room
- X in Dr. Jones' section of archaeology.
- X I've heard he's the bestest. [sic]
- X
- X After "par 50g":
- X
- X I sure hope there's still room in
- X Dr. Jones' section of archaeology. I've
- X heard he's the bestest. [sic]
- X
- X Or after "par 50gc":
- X
- X I sure hope there's still room in
- X Dr. Jones' section of archaeology. I've
- X heard he's the bestest. [sic]
- X
- X
- XLimitations
- X
- X The <guess> feature guesses wrong in cases like the following:
- X
- X I calc'd the approx.
- X Fermi level to 3 sig. digits.
- X
- X With <guess> = 1, par will incorrectly assume that "approx." ends a
- X sentence. If the input were:
- X
- X I calc'd the approx. Fermi
- X level to 3 sig. digits.
- X
- X then par would refuse to put a line break between "approx." and "Fermi"
- X in the output, mainly to avoid creating the first situation (in case the
- X paragraph were to be fed back through par again). This non-breaking
- X space policy does come in handy for cases like "Mr. Johnson" and
- X "Jan. 1", though.
- X
- X The <guess> feature only goes one way. par can preserve wide sentence
- X breaks in a paragraph, or remove them, but it can't insert them if they
- X aren't already in the input.
- X
- X If you use tabs, you probably won't like the way par handles (or doesn't
- X handle) them. It turns them into spaces. I didn't bother trying to
- X make sense of tabs because they don't make sense to begin with. Not
- X everyone's terminal has the same tab settings, so text files containing
- X tabs are sometimes mangled. In fact, almost every text file containing
- X tabs gets mangled when something is inserted at the beginning of each
- X line (when quoting e-mail or commenting out a section of a shell script,
- X for example), making them a pain to edit. In my opinion, the world
- X would be a nicer place if everyone stopped using tabs (so I'm doing my
- X part by not supporting them in par). (Thanks to ets1@cs.wustl.edu (Eric
- X T. Stuebe) for showing me the light about tabs.)
- X
- X There is currently no way for the length of the output prefix to differ
- X from the length of the input prefix. Ditto for the suffix. I may
- X consider adding this capability in a future release, but right now I'm
- X not sure how I'd want it to work.
- X
- X
- XBugs
- X
- X If I knew of any bugs, I wouldn't release the package. Of course, there
- X may be bugs that I haven't yet discovered.
- X
- X If you find any bugs (in the program or in the documentation), or if you
- X have any suggestions, please send e-mail to:
- X
- X amc@ecl.wustl.edu
- X
- X or send paper mail to:
- X
- X Adam M. Costello
- X Campus Box 1045
- X Washington University
- X One Brookings Dr.
- X St. Louis, MO 63130
- X USA
- X
- X Note that both addresses could change anytime after June 1994.
- X
- X When reporting a bug, please include the exact input and command line
- X options used, and the version number of par, so that I can reproduce it.
- END_OF_FILE
- if test 43573 -ne `wc -c <'Par131/par.doc'`; then
- echo shar: \"'Par131/par.doc'\" unpacked with wrong size!
- fi
- # end of 'Par131/par.doc'
- fi
- if test -f 'Par131/protoMakefile' -a "${1}" != "-c" ; then
- echo shar: Will not clobber existing file \"'Par131/protoMakefile'\"
- else
- echo shar: Extracting \"'Par131/protoMakefile'\" \(1592 characters\)
- sed "s/^X//" >'Par131/protoMakefile' <<'END_OF_FILE'
- X# *********************
- X# * protoMakefile *
- X# * for Par 1.31 *
- X# * Copyright 1993 by *
- X# * Adam M. Costello *
- X# *********************
- X
- X
- X# Define CC so that the command
- X# $(CC) foo.c
- X# compiles the ANSI C source file "foo.c" into the object file "foo.o".
- X# You may assume that foo.c uses no floating point math.
- X#
- X# If your operating system or your compiler's exit() function automatically
- X# frees all memory allocated by malloc() when a process terminates, then you
- X# can choose to trade away space efficiency for time efficiency by defining
- X# DONTFREE.
- X#
- X# Example (for Solaris 2.2):
- X# CC = cc -c -O -s -Xc -DDONTFREE
- X
- XCC =
- X
- X# Define LINK1 and LINK2 so that the command
- X# $(LINK1) foo1.o foo2.o foo3.o $(LINK2) foo
- X# links the object files "foo1.o", "foo2.o", "foo3.o"
- X# into the executable file "foo".
- X# You may assume that none of the .o files use floating point math.
- X#
- X# Example (for Solaris 2.2):
- X# LINK1 = cc -s
- X# LINK2 = -o
- X
- XLINK1 =
- XLINK2 =
- X
- X# Define RM so that the command
- X# $(RM) foo1 foo2 foo3
- X# removes the files "foo1", "foo2", and "foo3", and
- X# preferably doesn't complain if they don't exist.
- X#
- X# Example (for Solaris 2.2):
- X# RM = rm -f
- X
- XRM =
- X
- X# You shouldn't need to modify anything below this line.
- X
- XOBJS = buffer.o charset.o errmsg.o par.o reformat.o
- X
- X.c.o:
- X $(CC) $<
- X
- Xpar: $(OBJS)
- X $(LINK1) $(OBJS) $(LINK2) par
- X
- Xbuffer.o: buffer.c buffer.h errmsg.h
- X
- Xcharset.o: charset.c charset.h errmsg.h buffer.h
- X
- Xerrmsg.o: errmsg.c errmsg.h
- X
- Xpar.o: par.c charset.h errmsg.h buffer.h reformat.h
- X
- Xreformat.o: reformat.c reformat.h errmsg.h buffer.h
- X
- Xclean:
- X $(RM) par $(OBJS)
- END_OF_FILE
- if test 1592 -ne `wc -c <'Par131/protoMakefile'`; then
- echo shar: \"'Par131/protoMakefile'\" unpacked with wrong size!
- fi
- # end of 'Par131/protoMakefile'
- fi
- echo shar: End of shell archive.
- exit 0
-
- exit 0 # Just in case...
-