- pdftotext(1) pdftotext(1)
-
-
- NAME
- pdftotext - Portable Document Format (PDF) to text
- converter (version 0.90)
-
- SYNOPSIS
- pdftotext [options] [PDF-file [text-file]]
-
- DESCRIPTION
- Pdftotext converts Portable Document Format (PDF) files to
- plain text.
-
- Pdftotext reads the PDF file, PDF-file, and writes a text
- file, text-file. If text-file is not specified, pdftotext
- converts file.pdf to file.txt. If text-file is '-', the
- text is sent to stdout.
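
The output rules above can be sketched as a small Python function. This is only an illustration of the documented behavior, not code from pdftotext itself: the name resolve_output and the "<stdout>" marker are hypothetical, and the handling of an input name without a .pdf suffix is an assumption, since this page does not describe that case.

```python
def resolve_output(pdf_file, text_file=None):
    """Return the destination implied by the DESCRIPTION rules.

    Hypothetical helper: it mirrors the documented rules only and
    never calls pdftotext.
    """
    if text_file == "-":
        return "<stdout>"              # '-' sends the text to stdout
    if text_file is not None:
        return text_file               # an explicit text-file is used as given
    if pdf_file.endswith(".pdf"):
        return pdf_file[:-4] + ".txt"  # file.pdf -> file.txt
    # Assumption: the man page does not say what happens without a
    # .pdf suffix; this sketch simply appends .txt.
    return pdf_file + ".txt"


print(resolve_output("file.pdf"))        # file.txt
print(resolve_output("file.pdf", "-"))   # <stdout>
```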
-
- OPTIONS
- -f number
- Specifies the first page to convert.
-
- -l number
- Specifies the last page to convert.
-
- -ascii7
- Convert the text to 7-bit ASCII; the default is to
- use the 8-bit ISO Latin-1 character set.
-
- -eucjp Convert Japanese text to EUC-JP. This is currently
- the only option for converting Japanese text. Its
- only effect on non-Japanese text is to switch it to
- 7-bit ASCII so that it fits into the EUC-JP
- encoding. (This option is only available if
- pdftotext was compiled with Japanese support.)
-
- -raw Keep the text in content stream order. This is a
- hack which often "undoes" column formatting, etc.
- This option will likely be replaced with something
- more sophisticated when pdftotext is rewritten to
- use a smarter text placement algorithm.
-
- -q Don't print any messages or errors.
-
- -h Print usage information. (-help is equivalent.)
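
As a rough illustration of how the options above combine with the SYNOPSIS, the following Python sketch assembles an argument list. The helper name build_command and the file name report.pdf are hypothetical. The -eucjp option is left out because it is only available in builds with Japanese support.

```python
def build_command(pdf_file, text_file=None, first=None, last=None,
                  ascii7=False, raw=False, quiet=False):
    """Assemble a pdftotext argument list from the options listed above."""
    cmd = ["pdftotext"]
    if first is not None:
        cmd += ["-f", str(first)]   # first page to convert
    if last is not None:
        cmd += ["-l", str(last)]    # last page to convert
    if ascii7:
        cmd.append("-ascii7")       # 7-bit ASCII instead of ISO Latin-1
    if raw:
        cmd.append("-raw")          # keep content stream order
    if quiet:
        cmd.append("-q")            # suppress messages and errors
    cmd.append(pdf_file)
    if text_file is not None:
        cmd.append(text_file)       # '-' means stdout
    return cmd


# Pages 2 through 5 of a hypothetical report.pdf, written to stdout.
cmd = build_command("report.pdf", "-", first=2, last=5, quiet=True)
print(" ".join(cmd))   # pdftotext -f 2 -l 5 -q report.pdf -
```

To run the command rather than print it, the list can be passed to subprocess.run(cmd, check=True) from the Python standard library, provided a pdftotext binary that accepts these options is on the PATH.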
-
- BUGS
- Some PDF files contain fonts whose encodings have been
- mangled beyond recognition. There is no way (short of
- OCR) to extract text from these files.
-
- AUTHOR
- The pdftotext software and documentation are copyright
- 1996-1999 Derek B. Noonburg (derekn@foolabs.com).
-
- 02 Aug 1999 1
-
- SEE ALSO
- xpdf(1), pdftops(1), pdfinfo(1), pdftopbm(1), pdfimages(1)
- http://www.foolabs.com/xpdf/