This document contains answers to some of the frequently asked questions about MHonArc. MHonArc is a Perl program for converting e-mail messages as specified in RFC 822 and RFC 1521 (MIME) to HTML. MHonArc can maintain an archive of converted messages, or it can be used as a basic e-mail->HTML converter.
The FAQ is intended to complement the documentation provided in the MHonArc distribution. Hence, the documentation is still the key source of answers to any questions you may have.
Earl Hood, ehood@convex.com
MHonArc is a Perl program for converting e-mail messages as specified in RFC 822 and RFC 1521 (MIME) to HTML. MHonArc can maintain an archive of converted messages, or it can be used as a basic e-mail->HTML converter.
1.2.1.
The latest information on MHonArc, and its availability, may be obtained at <URL:http://www.oac.uci.edu/indiv/ehood/mhonarc.html>.
It's FREE! MHonArc is distributed under the GNU General Public License. A copy of the license is included in the distribution. Please read it for more information.
The first place to try is the documentation that comes with MHonArc. The documentation is quite extensive, and may provide answers to most of your questions.
Second, you can read this FAQ.
Third, a mailing list, mhonarc@rosat.mpe-garching.mpg.de, is available to provide a discussion forum on the usage and development of MHonArc. Appropriate topics for the list include: usage questions, bug reports, behavioral enhancements, documentation bugs, and general help.
To subscribe to the mailing list, send mail to mhonarc-request@rosat.mpe-garching.mpg.de with the command,
subscribe
as the message body.
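As a minimal sketch of the subscription request described above, the following Python builds (but does not send) such a message. The function name and the sender address are hypothetical; only the request address and the "subscribe" body come from this FAQ.

```python
from email.message import EmailMessage

# Request address as given in this FAQ.
REQUEST_ADDRESS = "mhonarc-request@rosat.mpe-garching.mpg.de"


def build_subscribe_request(sender):
    """Build a message whose body is the single command "subscribe".

    `sender` is whatever address you want subscribed (hypothetical here).
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = REQUEST_ADDRESS
    msg.set_content("subscribe")
    return msg


if __name__ == "__main__":
    print(build_subscribe_request("you@example.com"))
```

Handing the message to a mail server is left out on purpose, since how you send mail depends on your own setup.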
If you send mail to mhonarc@rosat.mpe-garching.mpg.de, your message will be distributed to all subscribers on the list.
The mailing list is archived by Majordomo. You can also use the WWW to access the archive (with full text search using glimpse) at <URL:http://www.rosat.mpe-garching.mpg.de/mailing-lists/mhonarc/>
MHonArc is known to run under Unix, MS-DOS/Windows, WinNT, and Win95.
MacPerl support is in the works. Please notify the author if you are interested in testing MHonArc under MacPerl.
MHonArc can run under Perl 4 or 5.
MHonArc can convert mail that is stored in UUCP mailbox format (i.e., all messages are in a single file), or in the format used by the Rand Message Handler (MH) (messages are contained in separate files within a directory). MHonArc is known to work with the following MUAs: MH, mail, Mail, Elm, Eudora, WinVN, Windows Trumpet, and NUPop.
Supporting some MUAs may require redefining the MSGSEP resource.
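To illustrate the two storage layouts, here is a rough Python sketch that reads raw messages from either a single mailbox file or an MH-style directory. It is not MHonArc's code: the function name is hypothetical, and it assumes MH message files are the ones with purely numeric names and that mailbox messages start at any line beginning with "From ".

```python
import os


def read_raw_messages(path):
    """Return raw message texts from an MH folder or a mailbox file."""
    if os.path.isdir(path):
        # MH layout: one message per file, files named by message number.
        names = sorted(
            (n for n in os.listdir(path) if n.isdecimal()), key=int
        )
        messages = []
        for name in names:
            with open(os.path.join(path, name), encoding="latin-1") as fh:
                messages.append(fh.read())
        return messages

    # Mailbox layout: every message lives in the same file, and a line
    # starting with "From " opens a new message.
    with open(path, encoding="latin-1") as fh:
        lines = fh.read().splitlines(keepends=True)
    messages = []
    for line in lines:
        if line.startswith("From ") or not messages:
            messages.append("")
        messages[-1] += line
    return messages
```

Non-numeric files in a directory (such as MH's own bookkeeping files) are skipped by this sketch.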
If you are processing UUCP mailbox files, messages are separated by a line starting with "From " (i.e., the word "From" followed by a space). Some mail software will prefix such lines in message bodies with a `>' to prevent MUAs from incorrectly treating the line as a message separator. However, some mail software doesn't.
To avoid incorrect separator detection, many MUAs perform stricter detection of separators beyond "From ". MHonArc, by default, will treat lines starting with "From " as a message separator, which can lead to incorrect message termination if the From line has not been escaped with a `>'.
To fix the problem, use the MSGSEP resource to instruct MHonArc to use a stricter test for detecting a message separator. The following MSGSEP resource setting is known to work well:
<MSGSEP> ^From \S+\s+\S+\s+\S+\s+\d+\s+\d+:\d+:\d+\s+\d+ </MSGSEP>
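The difference between the two tests can be seen with a small Python sketch. It applies the default-style pattern and the stricter expression above, line by line, to a made-up mailbox whose first message body contains an unescaped line starting with "From ". The sample text and the helper name are illustrative assumptions, not part of MHonArc.

```python
import re

# Default-style test: any line starting with "From ".
DEFAULT_SEP = re.compile(r"^From ")
# The stricter expression from the MSGSEP setting above.
STRICT_SEP = re.compile(r"^From \S+\s+\S+\s+\S+\s+\d+\s+\d+:\d+:\d+\s+\d+")

# Made-up mailbox: two messages, one unescaped "From " line in a body.
SAMPLE = (
    "From alice@example.com Mon Jan  1 10:00:00 1996\n"
    "Subject: first\n"
    "\n"
    "From what I can tell, this works.\n"
    "\n"
    "From bob@example.com Tue Jan  2 11:30:00 1996\n"
    "Subject: second\n"
    "\n"
    "Thanks.\n"
)


def count_messages(text, separator):
    """Count the lines that the given pattern treats as separators."""
    return sum(1 for line in text.splitlines() if separator.search(line))


if __name__ == "__main__":
    print(count_messages(SAMPLE, DEFAULT_SEP))  # 3: the body line is miscounted
    print(count_messages(SAMPLE, STRICT_SEP))   # 2: only the real separators
```

The stricter pattern passes only lines that also carry a sender, a date, and a time, so ordinary sentences beginning with "From " no longer split a message.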
MHonArc utilizes the References and In-Reply-To fields of mail messages for generating threads. It is up to the mail user agents (MUAs) to define these fields. The References field is normally utilized by news software, while In-Reply-To is normally utilized by e-mail software.
If the mail you archive contains neither References nor In-Reply-To fields, MHonArc will not detect a thread, even though there are messages that are follow-ups to existing messages.
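The idea of field-based threading can be sketched in Python as follows. This is not MHonArc's actual algorithm: the function names, the dictionary representation of headers, and the choice to check In-Reply-To before References are all assumptions made for illustration.

```python
import re

# Message-ids are written in angle brackets, e.g. <1@example.com>.
MSGID = re.compile(r"<[^<>\s]+>")


def find_parent(headers, known_ids):
    """Return the id of the message this one follows up, or None."""
    candidates = MSGID.findall(headers.get("In-Reply-To", ""))
    # References lists ancestors oldest first, so try the newest first.
    candidates += reversed(MSGID.findall(headers.get("References", "")))
    for msgid in candidates:
        if msgid in known_ids:
            return msgid
    return None


def build_threads(messages):
    """Map each message-id to its parent's id (None for thread roots)."""
    known = {m["Message-Id"] for m in messages}
    return {
        m["Message-Id"]: find_parent(m, known - {m["Message-Id"]})
        for m in messages
    }


if __name__ == "__main__":
    archive = [
        {"Message-Id": "<1@example.com>", "Subject": "first"},
        {"Message-Id": "<2@example.com>", "In-Reply-To": "<1@example.com>"},
        {"Message-Id": "<3@example.com>", "Subject": "Re: first"},
    ]
    print(build_threads(archive))
```

In the example, the third message is a follow-up by subject only. Because it carries neither field, it ends up as a separate thread root, which is the behavior described above.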
Subject text is imperfect. Problems with subject based threading:
It is possible to use some heuristics combined with References and In-Reply-To fields, but I have not had the time to work on it.
The answer varies depending on your MUA. You'll need to look at the documentation of your MUA to find the answer.
For MH users, the following in your replcomps file will work:
%<{date}In-reply-to: Your message of "\
%<(nodate{date})%{date}%|%(pretty{date})%>."%<{message-id} %{message-id}%>\n%>\
Or, you can use the following if you prefer the References field format:
%<{message-id}References: \
%<{references}%(void{references})%(trim)%(putstr) %> %(void{message-id})%(trim)%(putstr)\n%>\
The author welcomes feedback from users on how to configure other MUAs.
MIME stands for Multipurpose Internet Mail Extensions. MIME is defined by RFC 1521 and 1522. HTML versions of the RFCs are available at <URL:http://www.oac.uci.edu/indiv/ehood/MIME/MIME.html>.
In sum, MIME "redefines the format of message bodies to allow multi-part textual and non-textual message bodies to be represented and exchanged without loss of information." [RFC 1521]
Not yet.
This question can be answered by reading the "MIME" section of the MHonArc documentation. The solution may require registering a pre-existing filter for the given content-type, or hooking in a new filter.
Yes. In version 1.2, a resource file element called OTHERINDEXES was added. With this element, you can define an arbitrary number of additional indexes. The additional indexes may be in any format that is supported by MHonArc. Refer to the documentation for the usage of OTHERINDEXES.
No. Since the existence of author names is not guaranteed, or consistent, sorting messages by author would not be perfect.
No. The archive database stores all resource settings. The only time you need to respecify the resource file is if changes are required in the layout of the archive.
When utilizing the OTHERINDEXES resource, the resource filenames listed in the main resource file are stored in the database, but the resources for each additional index are NOT. Hence, the resource files defining the additional indexes must be accessible.
No. In order to achieve the same effect, you must add the original, unprocessed, message to the destination archive, then remove the appropriate HTML version of the message from the source archive.
Maybe. There is currently no utility to perform this task, but it is possible to write one. The utility can scan each HTML message and extract the information required to restore the database. How well the database is reconstructed is heavily dependent on how the messages are formatted.
When I have more time, I may develop a recovery utility, but do not hold your breath.