-
-
-
- SED(1) USER DOCUMENTATION SED(1)
-
-
- NAME
- sed - the stream editor
-
- SYNOPSIS
- sed [-n] [-g] [-e script] [-f sfilename] [filename ]
-
- DESCRIPTION
- sed reads each filename line by line, edits each line according to a script
- of commands as specified by the -e and -f arguments and then copies the
- edited line to the standard output.
-
- OPTIONS
- The -e option supplies a single edit command from the next argument; if
- there are several of these they are executed in the order in which they
- appear. If there is just one -e option and no -f's, the -e flag may be
- omitted. An -f option causes commands to be taken from the file sfilename;
- if there are several of these they are executed in the order in which
- they appear; -e and -f commands may be mixed. The script or sfilename can
- be adjacent to the -e or -f or can be the next argument on the command
- line. The -g option causes sed to act as though every substitute command
- in the following script has a g suffix. The -n option suppresses the
- default output.
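
As a rough illustration of the options above, the following shell sketch chains two -e commands and then uses -n with an explicit p command. It assumes a POSIX-style shell and a sed that accepts these standard options; the variable names are only for illustration, and the -g option is left out because other sed versions may not provide it.

```shell
# Two -e commands run in the order given: "one" -> "1", then "two" -> "2".
both=$(printf 'one\ntwo\n' | sed -e 's/one/1/' -e 's/two/2/')
printf '%s\n' "$both"

# -n suppresses the default output, so only the explicit p command prints.
only_two=$(printf 'one\ntwo\n' | sed -n -e '/two/p')
printf '%s\n' "$only_two"
```

The first command prints both edited lines; the second prints only the line selected by /two/.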
-
- SCRIPTS
- A script consists of one or more sed commands of the following form:
- [address[,address]] function [arguments]
- Normally sed cyclically copies a line of input into a current text buffer,
- then applies in sequence all commands whose addresses select that line and
- then copies the buffer to standard output and clears the buffer. The -n
- option suppresses normal output so that only commands which do output (e.g.
- p) cause any writing to occur. Also, some commands (n, N) do their own
- line reads, and some others (c, d, D) cause all commands following in the
- script to be skipped (the D command also suppresses the clearing of the
- current text buffer that would normally occur before the next cycle).
- There is also a second buffer (called the 'hold space') that can be copied
- or appended to or from, or swapped with, the current text buffer.
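
The interplay between the current text buffer and the hold space can be sketched with the well-known line-reversal script. This is only an illustration, assuming a POSIX-style shell and a sed whose G, h and p commands behave as described in this manual; the variable name is hypothetical.

```shell
# Print the input in reverse line order using the hold space:
#   1!G  on every line but the first, append the hold space to the buffer
#   h    copy the buffer (this line plus all earlier ones) to the hold space
#   $p   on the last line, print the accumulated buffer
reversed=$(printf 'a\nb\nc\n' | sed -n '1!G;h;$p')
printf '%s\n' "$reversed"
```

With the three input lines a, b, c the script prints c, b, a. The -n option matters here: without it every intermediate buffer would also be written.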
-
- ADDRESSES
- An address is: a decimal number (which matches that numbered line where
- line numbers start at 1 and run cumulatively across files), or a '$' (which
- matches the last line of input), or a '/regular expression/' (which matches
- any line satisfying the expression). The following rules govern address
- matching:
-
- * A command line with no addresses selects every input line.
-
- * A command line with one address selects every input line that matches
- that address.
-
- * A command line with two addresses selects the inclusive range from the
- first input line that matches the first address up to and including
- the next input that matches the second. (If the second address is a
- number less than or equal to the line number first selected, only one
- line is selected.) Once the second address is matched, sed starts
- looking for the first one again; thus, any number of these ranges will
- be matched.
-
- h**2 Documentation 21 September 1991 1
-
- * The second address may be in the form of '+number'. This means that
- the command will stay selected for number lines after the first
- address is satisfied.
-
- * \?regular expression?, where ? is any character, is identical to
- /regular expression/.
-
- * The negation operator '!' preceding a function makes that function
- apply to every line not selected by the address(es).
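
The address rules above can be tried out with a few one-line scripts. The sketch below assumes a POSIX-style shell and a sed that follows the same rules; the helper function lines and the variable names are hypothetical, and the '+number' form is not shown because other sed versions may not accept it.

```shell
# Five sample input lines.
lines() { printf 'alpha\nbegin\nmid\nend\nomega\n'; }

numbered=$(lines | sed -n '2,3p')           # line-number range: begin, mid
ranged=$(lines | sed -n '/begin/,/end/p')   # RE range, inclusive: begin, mid, end
last=$(lines | sed -n '$p')                 # $ selects the last line: omega
single=$(lines | sed -n '4,2p')             # second number <= first: only line 4
outside=$(lines | sed -n '/begin/,/end/!p') # ! inverts the selection: alpha, omega

printf '%s\n' "$numbered" "$ranged" "$last" "$single" "$outside"
```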
-
- FUNCTIONS
- In the following list of functions, the maximum number of addresses
- permitted for each function is indicated in parentheses. An argument
- denoted 'text' consists of one or more lines, with all but the last ending
- with '\' to hide the newline. A command with this type of argument must be
- the last on any command line or -e argument. Otherwise multiple commands may
- appear on a line separated by ';' characters. A command may have a
- trailing comment indicated by a '#' character. Comment lines begin with a
- '#'. Backslashes in text are treated as described in 'Escape Sequences'
- below; they may be used to protect initial whitespace against the
- stripping that is done on every line of the script. An argument denoted
- 'label', 'rfile' or 'wfile' (which specify labels or file names) is not
- processed for 'escape sequences'. Therefore a ';' or '#' terminates the
- label or file name. This simplifies entering DOS style paths. Each 'wfile'
- is created before processing begins. There can be at most 10 distinct
- 'wfile' arguments.
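
A short sketch of a multi-line 'text' argument and of ';' separated commands follows. It uses the traditional form in which the a command is followed by '\' and a newline, which this version should also accept; a POSIX-style shell is assumed and the variable names are hypothetical.

```shell
# Append two lines of 'text' after line 1. Every text line except the
# last ends with '\' to hide the newline.
appended=$(printf 'first\nsecond\n' | sed '1a\
added one\
added two')
printf '%s\n' "$appended"

# A comment line, then two commands on one line separated by ';'.
picked=$(printf 'a\nb\nc\n' | sed -n '
# print line 2, then line 3
2p;3p')
printf '%s\n' "$picked"
```

The first script prints first, added one, added two, second; the second prints b and c.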
-
- (1) a text Append the 'text' on output before reading the next input
- line.
-
- (2) b [label] Branch to the ':' command with the given 'label'. If no
- 'label' is given, branch to the end of the script.
-
- (2) c text Change lines: delete the current text buffer and, at the end
- of the address range, place 'text' on the output. Start
- the next input cycle.
-
- (2) d Delete the current text buffer. Start the next input cycle.
-
- (2) D Delete the first line of the current text buffer (all
- characters up to the first newline). Start the next input
- cycle.
-
- (2) g Replace the contents of the current text buffer with the
- contents of the hold space.
-
- (2) G Append the contents of the hold space to the current text
- buffer.
-
- (2) h Copy the current text buffer into the hold space.
-
- (2) H Append a copy of the current text buffer to the hold space.
-
- (1) i text Insert the 'text' on the standard output.
-
- (2) l [w[file]] List current text buffer on standard output or to a file if
- the w option follows. Characters that are not printable
- ASCII are expanded as shown in the 'Escape Sequences'
- section below.
-
- (2) n Copy the current text buffer to standard output. Read the
- next line of input into it. The current line number
- changes.
-
- (2) N Append the next line of input to the current text buffer,
- inserting an embedded newline between the two. The current
- line number changes.
-
- (2) p Copy the current text buffer to the standard output.
-
- (2) P Copy the first line of the current text buffer (all
- characters up to the first newline) to standard output.
-
- (1) q Quit. Perform any pending outputs (a or r commands) and
- terminate sed.
-
- (1) r rfile Read the contents of 'rfile'. Place them on the output
- before reading the next input line.
-
- (2) s /regular expression/replacement/flags
- Substitute the 'replacement' for instances of the 'regular
- expression' in the current text buffer. Any character may
- be used instead of '/'. In the 'regular expression' and in
- the 'replacement' text \1 - \9 are used to indicate the nth
- subexpression indicated by a '\(...\)' expression in the
- 'regular expression'. In the replacement text an & may be
- used to indicate the entire matched expression. If the
- replacement text consists only of a single '%'
- character, then a copy of the replacement text for the
- previous s command is used as the replacement text for this
- command. 'Flags' are any of the following options, with
- the following provisos: if present w must be the last one;
- only the last of either p or P is used; and only the last
- 'n' is used.
-
- g - Global. Substitute for all nonoverlapping instances
- of the 'RE' rather than just the first one.
-
- p - Print the current text buffer if a replacement was
- made.
-
- P - Print the first line of the current text buffer if a
- replacement was made.
-
- w[wfile] - Append the current text buffer to the file
- argument as in a w command if a replacement is made.
- Standard output is used if no file argument is given.
-
- n - Where n can be 1 through 512. Perform only the nth
- replacement. If g is also set or the -g option is
- selected, this option means that the nth and all
- succeeding substitutions should be performed.
-
- (2) t [label] Branch to the ':' command with the given 'label' if any s
- commands made any substitutions since the most recent read
- of an input line or execution of a t or T. If no 'label'
- is given, branch to the end of the script.
-
- (2) T [label] Branch to the ':' command with the given 'label' if no s
- commands have succeeded since the last input line or t or T
- command. Branch to the end of the script if no 'label' is
- given.
-
- (2) w [wfile] Write the current text buffer to 'wfile'. If no 'wfile' is
- given standard output is used.
-
- (2) W [wfile] Write the first line of the current text buffer to 'wfile'.
- If no 'wfile' is given standard output is used.
-
- (2) x Exchange the contents of the current text buffer and hold
- space.
-
- (2) y /string1/string2/
- Translate. Replace each occurrence of a character in
- string1 with the corresponding character in string2. The
- '/' may be any character not in 'string1' or 'string2'.
- The lengths of the two strings must be equal.
-
- (2) ! function All-but. Apply the function (or group, if function is '{')
- only to lines not selected by the address(es).
-
- (0) : label This command defines a label for b, t and T commands.
-
- (1) = Write a line containing the current line number to the
- standard output.
-
- (2) { Execute the following commands through a matching '}' only
- when the current line matches the address or address range
- given.
-
- (0) } The } command marks the end of a grouping started by a '{'.
-
- (0) An empty command is ignored.
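
A few of the functions listed above can be sketched as follows. The examples assume a POSIX-style shell and stick to forms that a standard sed also accepts; the variable names and the label again are hypothetical. Each command is passed with its own -e so that the label is not followed by other text.

```shell
# s: & stands for the whole match; \1 and \2 for the \(...\) groups.
wrapped=$(printf 'cat\n' | sed 's/cat/[&]/')                    # [cat]
swapped=$(printf 'key=value\n' | sed 's/\(.*\)=\(.*\)/\2=\1/')  # value=key

# s flags: a number picks one occurrence, g replaces all of them.
second=$(printf 'banana\n' | sed 's/a/X/2')                     # banXna
every=$(printf 'banana\n' | sed 's/a/X/g')                      # bXnXnX

# y: translate e -> i and l -> p, character by character.
translated=$(printf 'hello\n' | sed 'y/el/ip/')                 # hippo

# t: branch back to the label while the s command keeps succeeding,
# which squeezes a run of spaces down to a single space.
squeezed=$(printf 'a    b\n' | sed -e ':again' -e 's/  / /' -e 't again')

printf '%s\n' "$wrapped" "$swapped" "$second" "$every" "$translated" "$squeezed"
```

The loop in the last command takes three branches for four spaces, well under the branch limit mentioned under ERROR MESSAGES.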
-
- ESCAPE SEQUENCES
- The following escape sequences are used to represent unprintable characters
- in 'text', 'regular expressions' and 'replacement' text. They are ignored in
- 'labels' and file names. If the character following the '\' is not listed
- below, the '\' causes the character to be quoted during script input. The l
- command also uses this convention.
-
- \a - bell (ASCII 07)
- \b - backspace (ASCII 08)
- \e - escape (ASCII 27)
- \f - formfeed (ASCII 12)
- \n - newline (ASCII 10)
- \r - return (ASCII 13)
- \t - tab (ASCII 09)
- \v - vertical tab (ASCII 11)
- \xhh - the ASCII character corresponding to 2 hex digits hh.
- \\ - the backslash itself.
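
Two of these sequences can be sketched in a way that should also work with other sed versions: \n for the newline that the N command embeds, and \\ for a literal backslash. Sequences such as \t, \e and \xhh are left out here because not every sed accepts them. A POSIX-style shell is assumed and the variable names are hypothetical.

```shell
# N joins two input lines with an embedded newline; \n in the RE matches it.
joined=$(printf 'a\nb\n' | sed 'N;s/\n/ /')
printf '%s\n' "$joined"

# \\ in the RE matches one backslash; \/ in the replacement is a plain '/'.
slashed=$(printf '%s\n' 'a\b' | sed 's/\\/\//')
printf '%s\n' "$slashed"
```

The first command prints "a b" and the second prints "a/b".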
-
- REGULAR EXPRESSIONS ('REs')
- Regular expressions can be built up from the following "single-character"
- 'RE's:
-
- c Any ordinary character not listed below. An ordinary character
- matches itself.
-
- \ Backslash. When followed by a special character the 'RE' matches
- the "quoted character" as listed in 'Escape Sequences' above. A
- backslash followed by one of <,>,(,),{,} or 0...9 represents an
- 'operator' in a regular expression, as described below.
-
- . Dot. Matches any single character except the NEWLINE at the end
- of a line.
-
- ^ Caret. As the leftmost character in an 'RE' this constrains the
- pattern to be an anchored match; that is, it must match starting
- at the first character in the line. In any other position the ^
- is an ordinary character.
-
- $ The dollar sign as the rightmost character in an 'RE' matches the
- NEWLINE at the end of the line. At any other position the $ is an
- ordinary character.
-
- ^RE$ This requires the 'RE' to match the entire buffer.
-
- [c...] A nonempty string of characters enclosed by square brackets
- matches any single character in the string except the NEWLINE at
- the end of the string. If the first character of the string is a
- caret (^), then the 'RE' matches any character not in the string
- except the NEWLINE at the end of the string. A '-' sign may be
- used to express ranges of characters. For example the range
- '[0-9]' is equivalent to the string '[0123456789]'. The '-' is
- treated as an ordinary character if it occurs in the string at a
- position that can not be part of a range. This construct is
- called 'set definition'.
-
- \{m\}
- \{m,\}
- \{m,n\} When any of these constructs follows an ordinary character, a dot,
- a 'set definition' or the '\n' construct, it matches a range of
- occurrences of that preceding construct: at least m and at most
- n. "\{m,\}" matches at least m occurrences and "\{m\}" matches
- exactly m.
-
- * When this follows an ordinary character, a dot, a 'set
- definition' or the '\n' construct, this 'RE' matches 0 or more
- occurrences of that construct. This pattern is called a
- 'closure'.
-
- + This pattern is similar to the star above but matches one or more
- occurrences of the previous construct.
-
- \< The sequence \< in an 'RE' requires that the scan position in the
- line must be immediately following a character that can not be
- part of a "word" and immediately preceding a character that can
- be part of a "word". In this context a "word" is any sequence of
- upper- and lowercase letters, digits [0-9] and underscore
- characters (_).
-
- \> The sequence \> in an 'RE' requires that the scan position in the
- line must be immediately following a character that can be part
- of a "word" and immediately preceding a character that can not be
- part of a "word".
-
- \(...\) An 'RE' enclosed between the character sequences \( and \)
- matches whatever the unadorned 'RE' matches, but saves the string
- matched by the enclosed RE in a numbered substring register.
- There can be up to nine such substrings in an 'RE', and the
- parenthesis operators can be nested.
-
- \n Match the contents of the nth substring register. When nested
- substrings are present, 'n' is determined by counting the
- occurrences of \( starting from the left.
-
- // The empty 'RE' (//) is equivalent to the last 'RE' encountered in
- the input processing.
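
Some of the constructs above are sketched below. The examples assume a POSIX-style shell and use only constructs that standard basic regular expressions share with this version; '+', \< and \> are not shown because other sed versions may treat them differently. The variable names are hypothetical.

```shell
# \{m\}: exactly two a's are matched and replaced; the third is left alone.
counted=$(printf 'aaa\n' | sed 's/a\{2\}/X/')               # Xa

# \(...\) saves a substring; \1 in the RE must match the same text again.
doubled=$(printf 'abab abcd\n' | sed 's/\(ab\)\1/X/')       # X abcd

# ^RE$ must match the entire buffer: only the all-digit line is printed.
digits=$(printf '123\n12a\n' | sed -n '/^[0-9][0-9]*$/p')   # 123

# The empty RE // reuses the last RE, here /foo/ from the address.
reused=$(printf 'foo bar\n' | sed -n '/foo/s//baz/p')       # baz bar

printf '%s\n' "$counted" "$doubled" "$digits" "$reused"
```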
-
- ERROR MESSAGES
- The following error messages may appear during the compilation phase of sed
- processing; all cause sed to terminate:
-
- sed: bad expression 'hh' -- The escape sequence "\x" was not followed by
- two hex digits
-
- sed: bad value for match count on s command 'command' -- A maximum value of
- 512 is allowed for 'n' on an s command.
-
- sed: cannot create 'file' -- The listed output file could not be opened
-
- sed: cannot open command-file 'file' -- The 'file' on an -f argument could
- not be opened
-
- sed: command "command" has trailing garbage -- Command was not terminated
- properly
-
- sed: duplicate label 'label' -- The indicated 'label' appeared on more than
- one ':' command
-
- sed: error processing: 'argument' -- The 'argument' is incorrect: either a
- file name was missing or the g or n options had trailing garbage
-
- sed: garbled address 'command' -- Improper 'regular expression' in an
- address, line number in an address or + used in first address
-
- sed: garbled command 'command' -- Error in the construction of the 'regular
- expression' or 'replacement' in an s command, an ill-formed y command
- or a 'null' character was found
-
- sed: no addresses allowed for 'command' -- The end of group (}) and label
- (:) commands can not have addresses
-
- sed: no argument for -e -- The -e option did not have a 'script'
-
- sed: no such command as 'command' -- The function in the 'command' was
- illegal
-
- sed: only one address allowed for 'command' -- The a i q r and = commands
- allow only one address
-
- sed: range error in set 'command' -- A [...'x'-'y'...] construct was found
- where y<x
-
- sed: RE too long: 'command' -- Internal buffer overflow while processing a
- 'character set'
-
- sed: too many commands, last was 'command' -- A maximum of 200 commands are
- allowed
-
- sed: too many labels: 'command' -- A maximum of 50 labels are allowed
-
- sed: too many line numbers 'command' -- More than 256 different line
- numbers were used or more than 50 + addresses were used
-
- sed: too many w files 'command' -- A maximum of 10 output files is allowed
-
- sed: too many {'s 'command' -- A '{' command did not have a matching '}'
- command
-
- sed: too many }'s 'command' -- A '}' command appeared before an opening '{'
- command
-
- sed: too much text: 'command' -- The internal command text buffer
- overflowed processing the command
-
- sed: undefined label 'label' -- The listed 'label' was never defined on a :
- command
-
- sed: unknown flag 'option' -- The listed 'option' is not allowed on the
- invoking line for sed
-
- The following warning may be displayed during compilation:
-
- sed: Label not used 'label' -- The listed 'label' was defined but never
- referenced.
-
- During the actual editing the following fatal errors can occur:
-
- sed: append too long after line 'number' -- An A, G, H, or N command
- created a line in the buffer longer than 4000 characters
-
- sed: cannot open 'file' -- The r command could not open 'file'
-
- sed: infinite branch loop at line 'number' -- More than 50 branches were
- taken without the editing of the line completing
-
- sed: line too long at line 'number' -- While the s command was performing a
- substitution the line length exceeded 4000 characters
-
- sed: RE bad code 'code' -- An internal processing error has occurred while
- matching a 'regular expression'
-
- sed: too many appends after line 'number' -- An append command caused more
- than 20 reads and appends for the given line
-
- sed: too many reads after line 'number' -- A read command caused more than
- 20 reads and appends for the given line
-
- BUGS
- I tried to fix every problem I could find, but I believe the following bugs
- still exist in this version:
-
- * The getline routine can overflow the buffer before checking for
- overflow
-
- * I still do not know exactly what the D command should do
-
- * Strange options on the s command are allowed
-
- * The handling of inrange for '{' commands
-
- * Error processing could be improved
-
- * All output files are overwritten even when there are errors
-
- COMPATIBILITY
- This version of sed is a modification of the Internet supplied GNU version.
- That version was reverse-engineered from BSD 4.1 UNIX sed. The following
- changes, modifications and improvements have been made:
-
- * There is no hidden length limit (40 in BSD sed) on 'wfile' names.
-
- * There is no limit (8 in BSD sed) on the length of 'labels'.
-
- * The exchange command now works for long pattern and hold spaces.
-
- * 'Escape sequences' are inhibited for both 'label's and 'filename's.
-
- * All commands not having a 'text' argument can be separated by ";" or
- can have trailing comments (#)
-
- * a, c and i commands don't insist on a leading backslash-newline ('\'
- followed by a newline) before the text.
-
- * r and w commands do not insist on whitespace before the filename.
-
- * The g, P, p and 'n' options on s commands may be given in any order.
-
- * Escape sequences are valid in all contexts except file names and
- labels.
-
- * The full range of character values (all 256) is allowed.
-
- * In an 'RE', '+' calls for 1...n repeats of the previous pattern.
-
- * The l command produces a different format than the UNIX sed.
-
- * The W command (write first line of pattern space to file).
-
- * The T command (branch on last substitution failed).
-
- * sed's error messages have been made more specific and informative and
- cause processing to halt.
-
- * + allowed in the second address.
-
- * The empty RE "//" is allowed as a first address if a previous RE has
- been compiled
-
- * The -e and -f command line options do not require their arguments to
- be separate options.
-
- * If no arguments are given sed prints usage data.
-
- * In all contexts a blank file name means 'stdout'.
-
- This version otherwise appears to be equivalent to the UNIX version on the
- Sun4 computer. That is, I believe that anything sed did on that system,
- this version of sed will also do on either a Sun4 or a PC under DOS. If
- anyone can really explain what sed is supposed to do as explained in the
- UNIX documentation, I would appreciate the information. The manual page
- refers to ed for further details, which is ambiguous at best, and the
- descriptions I read of the D command and of the options for the s command
- were not understandable to me.
-
- I would appreciate any comments, suggestions and even bug reports. I
- sometimes can be reached on INTERNET (the fiscal year is almost over), but
- you can always contact me by mail or phone:
- Howard Helman (helman@elm.sdd.trw.com)
- Box 340
- Manhattan Beach, CA 90266
- 213.372.5387 or after 11/1/91 310.372.538
-