April 1996
Adam Blum
Microsoft Corporation
1. Introduction to ISAPI Filters
2. Building the CVTDOC ISAPI Filter
2.1. What CVTDOC Does
2.2. Building the Filter
2.2.1. Requirements
2.2.2. GetFilterVersion
2.2.3. HttpFilterProc
2.2.4. Building and Testing
2.3. Building the Conversion Programs
2.3.1. Word to HTML Conversion: DOC2HTM.EXE
2.3.2. Excel to HTML Conversion: XL2HTM.EXE
2.3.3. Text to HTML Conversion: TXT2HTML.BAT
2.3.4. Creating Your Own Conversions
3. Using CVTDOC
3.1. Installation
3.2. Conversion Programs
3.3. Usage
This article details the programming required to successfully build Internet server application programming interface (ISAPI) filters, a powerful technology for extending the functionality of ISAPI-compliant Web servers such as the Microsoft® Internet Information Server (IIS). After explaining the ISAPI filter specification (part of the ActiveX server framework) in general, it describes an example ISAPI filter, CVTDOC, in detail. CVTDOC is an ISAPI filter that allows a Web server to perform automatic file publishing by converting files on the fly from their application's native format to HTML.
The Web's growing popularity for information publishing and retrieval has made many a custom-developed application obsolete. End users and Webmasters can easily create Web content that approximates what was done with custom applications. As a developer, does this mean that if a Web-based approach is chosen for building a solution, you are out of the loop? No way! Microsoft IIS provides a host of capabilities in the ActiveX server framework for using your Visual C++® development magic to provide advanced capabilities for Web-based applications. In this article, I'll explain one of these technologies, ISAPI filters, that will allow you to add some particularly cool features to your Web site. To get you hooked, I'll give you a free sample. Use the CVTDOC filter to generate HTML files dynamically. A filtered Web server provides smooth HTML conversion every time. (Warning: ISAPI programming is definitely addictive.)
The ISAPI Filter specification (included in the ActiveX Development Kit) provides the capability of registering a DLL to intercept specific server events and perform appropriate actions. Unlike ISAPI itself, which is an improvement over the Common Gateway Interface (CGI) that Web servers have used for years, ISAPI filters are an entirely new capability in the world of Web servers. In effect, ISAPI filters let you extend the capabilities of your Web server. The ISAPI filter you build says to the Web server, "Hey, when something like this happens, let me handle it." Your filter can then handle the event entirely, process the event and leave it available for the Web server and other filters to handle, or decide on-the-fly that it's not an event it needs to process at all. For example, you can create ISAPI filters to:
ISAPI filter authors must create two main functions for export: GetFilterVersion() and HttpFilterProc(). GetFilterVersion() is called just once by the Web server: On server startup when loading all filters. GetFilterVersion() should:
BOOL WINAPI GetFilterVersion( PHTTP_FILTER_VERSION pVer );
The HTTP_FILTER_VERSION structure looks like this:
typedef struct _HTTP_FILTER_VERSION
{
    DWORD dwServerFilterVersion;
    DWORD dwFilterVersion;
    CHAR  lpszFilterDesc[SF_MAX_FILTER_DESC_LEN+1];
    DWORD dwFlags;
} HTTP_FILTER_VERSION, *PHTTP_FILTER_VERSION;
The GetFilterVersion() function should fill in the dwFilterVersion, lpszFilterDesc, and dwFlags structure members. Most importantly, dwFlags must have the flag bit turned on for every event the filter wants to be notified about.
The events available to register are listed in Table 1.
Event ID | Description |
SF_NOTIFY_READ_RAW_DATA | Intercept data going to the server. |
SF_NOTIFY_SEND_RAW_DATA | Intercept data going from the server back to the client. |
SF_NOTIFY_AUTHENTICATION | Call your filter when authentication event occurs. Used to implement custom password schemes. |
SF_NOTIFY_LOG | Call your filter when the server is about to log a resource access or other event. Lets you implement your own custom logging schemes. |
SF_NOTIFY_URL_MAP | Call your filter when the server is mapping a logical path to a physical path. In effect, this is called every time a resource on your server is accessed. |
SF_NOTIFY_PREPROC_HEADERS | Called before server preprocesses headers coming from Web client. |
SF_NOTIFY_END_OF_NET_SESSION | Call your filter when the user's session is about to end. |
SF_NOTIFY_SECURE_PORT | Include with other flags if you want the filter called when running over a secure port (such as https://...). |
SF_NOTIFY_NONSECURE_PORT | Include with other flags when running over a normal HTTP connection (almost always included in your filter flags). |
Table 1.
The SF_NOTIFY_READ_RAW_DATA and SF_NOTIFY_SEND_RAW_DATA flags allow the ISAPI filter dynamic-link library (DLL) to intercept data going from the client to the server (READ) or from the server back to the client (SEND), and store and manipulate the data for its own purposes. Intercepting the SF_NOTIFY_AUTHENTICATION event allows the filter to insert its own authentication scheme for use with the server. The SF_NOTIFY_LOG event allows the filter to supplement or replace the IIS logging mechanism with its own logging method. The SF_NOTIFY_URL_MAP is a good event to intercept if you want to change how the server responds to a request for a URL resource. For example, we will intercept the SF_NOTIFY_URL_MAP in the CVTDOC filter to create the file requested by the URL. The SF_NOTIFY_SECURE_PORT and SF_NOTIFY_NONSECURE_PORT flags can be ORed with the events requested, to allow your filter to restrict its operation to situations where the HTTP server is running over a secure port or over a normal HTTP session.
The other exported function, HttpFilterProc(), is called by the HTTP server (for example, IIS) each time one of the events the filter registered for occurs.
DWORD WINAPI HttpFilterProc( PHTTP_FILTER_CONTEXT pfc, DWORD NotificationType, LPVOID pvNotification );
The first argument is an HTTP_FILTER_CONTEXT structure that has information about the server session, function pointers available that can get more information about the server session, and can add headers or data to the response going back to the client. In the CVTDOC sample, this argument is not used; you won't always need to use this argument. The next argument indicates the event notification type. This determines what event triggered the call of your filter. It is almost always used because, as good form, you will want to make sure that you are not processing events that do not interest you. Also, a single filter may be registered for multiple events, and HttpFilterProc() may have conditional logic based upon the event that triggered its call. For example, a filter may be registered for the SF_NOTIFY_READ_RAW_DATA and SF_NOTIFY_SEND_RAW_DATA events, where it processes some of the data passing from the client to the server or from the server to the client. But the details of its actions will likely vary slightly depending on the direction, so it needs to know the triggering event. The third argument stores data associated with an event in a structure. Available structure types are listed in Table 2.
Structure Type | Description |
HTTP_FILTER_RAW_DATA | Points to the data passed back by a READ or SEND event |
HTTP_FILTER_PREPROC_HEADERS | Accesses the client headers before the server processes them. |
HTTP_FILTER_AUTHENT | Provides user and password information from the server about to authenticate the client. |
HTTP_FILTER_URL_MAP | Provides the physical path resulting from the server mapping a logical path. |
HTTP_FILTER_LOG | Provides a variety of information about the client and its request that can be logged by the filter or changed to affect IIS' native logging. |
Table 2.
Once you have the information on the event type and the data associated with the event, your filter can do its work. Once the work is complete, the filter should return a valid return code. If you are not concerned with the event, you should immediately return SF_STATUS_REQ_NEXT_NOTIFICATION. If you handle an event and do not want any other filter or the server to handle it, return SF_STATUS_REQ_HANDLED_NOTIFICATION. If you handled an event, but it's all right for other filters and the server to deal with the event as well, return SF_STATUS_REQ_NEXT_NOTIFICATION. SF_STATUS_REQ_ERROR can be returned to indicate an error in the filter (reserve this for fairly serious problems). SF_STATUS_REQ_READ_NEXT can be returned to ask for more of the data being passed back to the client or received by the server; the filter should then expect to be called again with more data in the HTTP_FILTER_RAW_DATA structure. The return codes are summarized in Table 3.
Return Code | Use |
SF_STATUS_REQ_NEXT_NOTIFICATION | The filter is not concerned with the event. |
SF_STATUS_REQ_HANDLED_NOTIFICATION | The filter handled the event and will restrict other filters and the server from handling the event. |
SF_STATUS_REQ_NEXT_NOTIFICATION | Event is handled; it is all right for other filters and the server to handle the event now. |
SF_STATUS_REQ_ERROR | An error occurred in the filter. Reserve this return for fairly serious problems. |
SF_STATUS_REQ_READ_NEXT | Request to see more of the data being passed back to the client or received by the server. Expects to be called again. |
Table 3.
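To make the dispatch-and-return pattern concrete, here is a minimal sketch of the shape an HttpFilterProc() body usually takes. It is not ISAPI code: the DEMO_ names, their numeric values, and the demo_url_map structure are assumptions introduced so the fragment compiles without the SDK headers. In a real filter the corresponding SF_NOTIFY_ and SF_STATUS_REQ_ names come from httpfilt.h.

```c
#include <stddef.h>

/* Stand-ins for names a real filter gets from <httpfilt.h>. The DEMO_
   prefix and the numeric values are assumptions made so this sketch
   compiles anywhere; they are not the SDK's definitions. */
enum demo_event {
    DEMO_NOTIFY_URL_MAP,        /* plays the role of SF_NOTIFY_URL_MAP */
    DEMO_NOTIFY_LOG,            /* plays the role of SF_NOTIFY_LOG */
    DEMO_NOTIFY_SEND_RAW_DATA   /* plays the role of SF_NOTIFY_SEND_RAW_DATA */
};

enum demo_status {
    DEMO_REQ_NEXT_NOTIFICATION,     /* SF_STATUS_REQ_NEXT_NOTIFICATION */
    DEMO_REQ_HANDLED_NOTIFICATION,  /* SF_STATUS_REQ_HANDLED_NOTIFICATION */
    DEMO_REQ_ERROR                  /* SF_STATUS_REQ_ERROR */
};

/* Hypothetical, cut-down counterpart of HTTP_FILTER_URL_MAP. */
struct demo_url_map {
    const char *physical_path;
};

/* Hypothetical handler: claims any request with a non-empty path. */
static enum demo_status on_url_map(const struct demo_url_map *map)
{
    if (map == NULL || map->physical_path == NULL)
        return DEMO_REQ_ERROR;              /* reserve for serious problems */
    if (map->physical_path[0] == '\0')
        return DEMO_REQ_NEXT_NOTIFICATION;  /* nothing for us to do */
    return DEMO_REQ_HANDLED_NOTIFICATION;   /* we took care of it */
}

/* Same role as HttpFilterProc(): look at the event type first, and pass
   along anything the filter is not interested in. */
enum demo_status demo_filter_proc(enum demo_event event, const void *data)
{
    switch (event) {
    case DEMO_NOTIFY_URL_MAP:
        return on_url_map((const struct demo_url_map *)data);
    default:
        /* not our event: let the server and other filters handle it */
        return DEMO_REQ_NEXT_NOTIFICATION;
    }
}
```

The point of the design is that every path returns one of the documented codes, and that an event the filter did not ask for falls through to the "next notification" code without being touched.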
Once your filter is built, you can install it on IIS by running REGEDT32.EXE and adding the DLL name to the key: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\W3Svc\Parameters\Filter DLLs. Ideally, this should be done by a SETUP program that accompanies your filter.
This is the essence of what's required to build and use an ISAPI filter. To give you a better sense of what's involved in building an ISAPI filter, I'll describe the construction of an ISAPI filter sample that ships with the ActiveX server framework: CVTDOC. This should also give you a sense of the types of problems to which ISAPI filters can be applied.
First, I'll briefly describe the purpose of CVTDOC, then how I developed the ISAPI filter itself, and finally present how each of the supplied conversion programs was built.
CVTDOC is a simple ISAPI filter I wrote in response to a need from several clients for "automatic file publishing": Generating HTML on the fly for specific document types. CVTDOC uses the capability of an ISAPI filter to supplement server capabilities by registering itself as intercepting all URL map events, and then checking to see if the document type requested is one that it knows how to convert.
The following fragment from the CVTDOC documentation (CVTDOC.DOC when you download the sample) may explain this requirement better:
Web content creators and Webmasters often want to "publish" a document or data file on the Web. However, it can be very inconvenient to constantly run a conversion program to generate new HTML each time the document or data file is updated. Relying on the Webmaster to run the conversion program for data that is often updated is also prone to error. If you are positive that the user has the software to display the document in native form, no conversion is necessary, but this is dangerous to assume. It would be great to be able to leave the document in native form, and have the Web server (or a Web server add-in such as CVTDOC) convert the document to HTML on the fly as needed.
CVTDOC is an Internet Services API (ISAPI) filter that dynamically converts documents to HTML if required when the HTML file is accessed. If the HTML document is out of date (older than the source document) or missing, it is automatically generated from the ISAPI filter, based on "conversion programs" registered for the source document type in the Registry. I provide sample conversion programs for Word documents, Microsoft® Excel spreadsheets, and text files, but it's important to remember that this can be used for any document type. The primary purpose of CVTDOC is to demonstrate the powerful capabilities of ISAPI filters. Nevertheless, I think you will find it useful in its own right.
The following section describes in detail how the filter was constructed. It's relatively easy to lose sight of the forest for the trees here: A quick glance at Section 3 on installing and using the filter (both of which are really quite simple) may help avoid any disorientation as you plow through the minutiae of how this was built.
Now that we know what's required, we can proceed to develop the filter. The basic steps are:
The filter needs to be able to intercept URL requests ending with a reference in the format:
filename.extension.htm
...and convert the file filename.extension to filename.extension.htm if and only if the HTML file is missing or older than the source. For example, an HTML hyperlink reference such as:
<A HREF="specials.doc.htm">
...should result in CVTDOC conditionally converting the SPECIALS.DOC file to HTML. CVTDOC should first check whether the HTML for that document already exists. If not, or if the HTML file is older than the source data file, it is a candidate for automatic conversion to HTML. CVTDOC searches through a list of registered data file types and associated conversion programs stored in the Registry, looking for a conversion program for the given extension (such as .DOC, .XLS, and .TXT). If it finds a conversion program, that program is launched to generate the specified filename.extension.htm file (for example, SPECIALS.DOC.HTM).
Why does the HTML author need to use the strange syntax (SPECIALS.DOC.HTM), instead of just embedding the file reference (SPECIALS.DOC) and somehow configuring CVTDOC to know to convert all .DOC files to HTML? First of all, you may still want to embed references to a .DOC file and have it launch Word, or in general embed a reference to the native file format and have it launch a viewer for that format if present. Using the syntax presented, references to the native format are still possible. More fundamentally, the Web browser is always going to attempt to launch a helper application if the URL ends with an extension of the native file format and not .HTM or .HTML. The URL ending with .HTM makes the browser expect HTML back, which is what it gets.
What we need is a filter that intercepts every request for a file of a type for which our filter can perform a conversion. From looking at Table 1, it might seem that there is no explicit "file requested" event, but in fact there is. As long as the request is for a file on our local site, a URL mapping event (which can be intercepted with the SF_NOTIFY_URL_MAP flag) takes place. That is, if the URL reference is SPECIALS.DOC.HTM or any other URL that resolves to a local file, such as http://ourstore.com/specials.doc.htm, a URL mapping event will take place on the server to convert the logical URL to a physical file path. The filter should intercept each URL mapping event, by setting the SF_NOTIFY_URL_MAP flag in the HTTP_FILTER_VERSION structure on the GetFilterVersion() call. The other flag set should be SF_NOTIFY_ORDER_HIGH, to get the notification as early as possible and make the necessary conversion, before other filters that may need to use the resulting data try to access it.
As preparation for writing the HttpFilterProc call, the pseudo-code for doing actual filter processing is thus:
IF URL request is filename.ext.htm
    IF filename.ext EXISTS
        IF filename.ext.htm MISSING or OLDER than filename.ext
            LOOK FOR CONVERSION PROGRAM FOR ext
            IF FOUND
                CONVERT filename.ext TO filename.ext.htm
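The first three tests of the pseudo-code can be sketched in portable C as two small helpers. This is an illustration under stated assumptions, not the CVTDOC source: derive_source_path() and needs_conversion() are hypothetical names, and the caller is assumed to have already looked up whether each file exists and when it was last modified.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Case-insensitive equality for ASCII strings (a portable stand-in for
   the non-standard stricmp). Returns 1 when equal, 0 otherwise. */
static int equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* "IF URL request is filename.ext.htm": when it is, copy filename.ext
   into src and return 1; otherwise return 0. */
int derive_source_path(const char *requested, char *src, size_t src_size)
{
    const char *dot = strrchr(requested, '.');
    size_t stem_len;

    if (dot == NULL || !equals_ignore_case(dot, ".htm"))
        return 0;                           /* not a .htm request */

    stem_len = (size_t)(dot - requested);
    if (stem_len == 0 || stem_len >= src_size)
        return 0;                           /* empty stem or buffer too small */

    memcpy(src, requested, stem_len);
    src[stem_len] = '\0';

    /* the stem must itself end in an extension, such as specials.doc */
    dot = strrchr(src, '.');
    return dot != NULL && dot != src && dot[1] != '\0'
        && strpbrk(dot, "/\\") == NULL;
}

/* "IF filename.ext EXISTS" and "IF filename.ext.htm MISSING or OLDER". */
int needs_conversion(int src_exists, int htm_exists,
                     time_t src_mtime, time_t htm_mtime)
{
    if (!src_exists)
        return 0;                           /* nothing to convert from */
    if (!htm_exists)
        return 1;                           /* HTML missing: generate it */
    return difftime(src_mtime, htm_mtime) > 0.0;   /* source is newer */
}
```

Keeping the path parsing and the date decision free of file system calls makes both easy to check in isolation. A filter would supply the existence flags and timestamps from whatever file API it uses.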
With this information about the desired functionality, we're now ready to write the ISAPI filter.
The first function we need to write is GetFilterVersion(). This performs the three steps outlined in the discussion of GetFilterVersion responsibilities for all ISAPI filters identified earlier:
pFilterVersion->dwFilterVersion = HTTP_FILTER_REVISION;
strcpy (pFilterVersion->lpszFilterDesc,
        "CVTDOC - Converts document or data into HTML if HTML not present or older");
This first step provides the ISAPI filter revision number back to the server, as well as a text description of CVTDOC.
pFilterVersion->dwFlags=(SF_NOTIFY_ORDER_HIGH |      // be sure to intercept!
                         SF_NOTIFY_SECURE_PORT |
                         SF_NOTIFY_NONSECURE_PORT |
                         SF_NOTIFY_URL_MAP           // tell us about all URL requests
                        );
This sets the notification priority high, tells the server that we are interested in sessions over both secure and nonsecure ports, and registers the filter for all URL map events.
The entire GetFilterVersion() code is:
BOOL WINAPI GetFilterVersion (PHTTP_FILTER_VERSION pFilterVersion)
{
    pFilterVersion->dwFilterVersion = HTTP_FILTER_REVISION;
    strcpy (pFilterVersion->lpszFilterDesc,
            "CVTDOC - Converts document or data into HTML if HTML not present or older");
    // now register for events we're interested in
    pFilterVersion->dwFlags=(SF_NOTIFY_ORDER_HIGH |      // be sure to intercept!
                             SF_NOTIFY_SECURE_PORT |
                             SF_NOTIFY_NONSECURE_PORT |
                             SF_NOTIFY_URL_MAP           // tell us about all URL requests
                            );
    hEvtLog=RegisterEventSource(NULL,"CVTDOC"); // open up event log
    return TRUE;
}
Now we just need to write the HttpFilterProc() procedure and we're almost done. We've already developed the pseudo-code for what it needs to do. Here is the essence of the implemented HttpFilterProc:
// Make a copy of the supplied filename that was requested
// so that we can determine what the source file is
strcpy(szSrcFile,pURLMap->pszPhysicalPath);
// Check to see if there's an extension and then save a pointer to it
if (pszExt=strrchr(szSrcFile,'.')){ // check for extension
    // This is the request for a .htm or .html file
    if (!strnicmp(pszExt,".htm",3)){ // is it HTML?
        // Zap the extension on the copy of the file to get the source filename
        *pszExt='\0';
        // check for access() returning zero, indicating presence of source file
        if (!access(szSrcFile,0)){ // check for presence of file
            // This function checks to see if the source file is newer
            // than the requested file, or if the requested file is
            // just not present
            if (FileDateCompare(szSrcFile,pURLMap->pszPhysicalPath)>0)
                // This looks for a conversion program to run based on extension
                // then runs the conversion program
                if (CvtToHTML(szSrcFile,pURLMap->pszPhysicalPath)==TRUE)
                    // This indicates that the filter handled the request for
                    // the URL so no other filters process
                    return SF_STATUS_REQ_HANDLED_NOTIFICATION;
        } // end check for presence of file
    } // End is it HTML?
} // End check for extension
// .
// .
// If we didn't attempt conversion, control is passed to next filter
// by returning SF_STATUS_REQ_NEXT_NOTIFICATION
return SF_STATUS_REQ_NEXT_NOTIFICATION;
First, we parse out the source file from the full HTML file (pURLMap->pszPhysicalPath), by copying the physical file path into szSrcFile and stripping off the .HTM extension if it's there (if it's not then this is not a candidate URL for automatic conversion). Then we check for existence of the source file (access() returning 0 indicates presence). If it's there, then we check to see if the source file is newer or if the HTML file is missing (with the FileDateCompare() function that we write elsewhere). If so, we attempt to convert the source file into HTML using the CvtToHTML() function. This function checks for available conversions in a Registry subkey called Conversions, created just for CVTDOC, that contains extensions (such as .DOC, .XLS, .TXT) and their associated conversion programs. The Conversions key is located under the HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\W3SVC\Parameters key. Creating it and filling it with values (file extensions and corresponding conversion programs) is part of the installation process documented in Section 3. If the conversion is not attempted, then control is passed to the next filter or to the server itself by returning SF_STATUS_REQ_NEXT_NOTIFICATION. If the conversion fails or no conversion program is found, these failures are reported to the Windows NT event log. In this case, the likely message to the Web user is a "404 Not found" error showing on their Web browser, unless the HTML file is already present on the server.
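The lookup that CvtToHTML() performs is not shown in this article, so the following is only a sketch of the idea in portable C. The in-memory table stands in for the Conversions Registry subkey, and find_conversion() and build_command() are hypothetical names. The command templates reuse the example paths given elsewhere in this article. Launching the resulting command line is left out.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct conversion {
    const char *ext;        /* value name in the Conversions subkey */
    const char *command;    /* value data: program path followed by %s */
};

/* Hypothetical in-memory stand-in for the Conversions Registry subkey. */
static const struct conversion conversions[] = {
    { ".doc", "C:\\WWWROOT\\CGI-BIN\\DOC2HTM.EXE %s" },
    { ".xls", "C:\\WWWROOT\\CGI-BIN\\XL2HTM.EXE %s" },
    { ".txt", "C:\\WWWROOT\\CGI-BIN\\TXT2HTML.BAT %s" }
};

/* Case-insensitive equality for ASCII strings. */
static int same_text(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Return the command template registered for the file's extension,
   or NULL when no conversion program is registered. */
const char *find_conversion(const char *src_file)
{
    const char *ext = strrchr(src_file, '.');
    size_t i;

    if (ext == NULL)
        return NULL;
    for (i = 0; i < sizeof conversions / sizeof conversions[0]; i++) {
        if (same_text(ext, conversions[i].ext))
            return conversions[i].command;
    }
    return NULL;
}

/* Replace the first %s in the template with the source file name.
   Returns 1 on success, 0 if there is no %s or the buffer is too small. */
int build_command(const char *template_text, const char *src_file,
                  char *out, size_t out_size)
{
    const char *slot = strstr(template_text, "%s");
    int written;

    if (slot == NULL)
        return 0;
    written = snprintf(out, out_size, "%.*s%s%s",
                       (int)(slot - template_text), template_text,
                       src_file, slot + 2);
    return written >= 0 && (size_t)written < out_size;
}
```

The template is never handed to snprintf() as a format string. Because the Registry data is configuration that someone else typed in, substituting the %s by hand avoids surprises if the value contains other percent signs.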
This section provides some tips on building ISAPI filters that may illustrate to you how simple it really is, and thus encourage you. Make sure your source file contains the following headers:
#include <httpext.h>
#include <httpfilt.h>
Make sure that your INCLUDE environment variable contains the ISAPI header directory (such as C:\INETSDK\INCLUDE), and your LIB environment variable points to the ISAPI libraries (such as C:\INETSDK\LIB\I386). A makefile is supplied with CVTDOC on which you can model your ISAPI filter makefile, but it's worth a look to see how simple it is. ISAPI programs in general, and ISAPI filters in particular are really very lightweight.
CC=cl -c
CVARS=-DWIN32 -DNDEBUG
LINK=link
LINKOPT=/DLL
LIBS=wininet.lib user32.lib
OBJS=cvtdoc.obj
LINKOUT=/OUT:cvtdoc.dll
DEFS=cvtdoc.def

.cpp.obj:
	$(CC) $(CFLAGS) $(CVARS) $*.cpp

cvtdoc.dll: $(OBJS) $(DEFS)
	$(LINK) $(LINKOPT) /DEF:$(DEFS) $(LINKOUT) $(LIBS) $(OBJS)
To do initial testing on the created filter, we wanted to see that a conversion program actually got called. Running REGEDT32.EXE, we created the Conversions subkey below the W3SVC\Parameters key in the Registry and added a value of .TXT with data of TXT2HTML.BAT %s. We created a batch file TXT2HTML.BAT with one line:
COPY %1 %1.htm
This batch file also shows the primary requirement of any conversion program that will be registered with CVTDOC. It needs to take the source file as its argument and create a destination HTML file that has the same name as the source file, with an .HTM appended to it. This is a characteristic of all the conversion programs supplied with CVTDOC and should be the convention followed by your own conversion programs that you register with CVTDOC. We do supply a text-to-HTML converter with the delivered CVTDOC filter. This "real" text-to-HTML conversion program is another TXT2HTML.BAT file that invokes a Perl script called TXT2HTML.PL.
Now we need to install the filter as part of the running IIS Web server. Still running REGEDT32.EXE, add the full path for CVTDOC.DLL to the Filter DLLs parameter in HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\W3Svc\Parameters.
Now create an HTML file with contents of:
<A HREF="test.txt.htm">Quick test</A>
Create a file called TEST.TXT with contents of:
<HTML><HEAD><TITLE>CVTDOC Test</TITLE></HEAD><BODY>Test data.</BODY></HTML>
If you place the HTML file onto your IIS Web server, load the HTML file into your Web browser, and click on the Quick Test link, it should result in the TXT2HTML.BAT file being invoked, and you will see the contents of TEST.TXT on your Web browser. This means that the server is calling our ISAPI filter successfully. Of course, you don't need to do any of this testing for CVTDOC; that is already complete. But this should give you some idea of the testing process for your own ISAPI filters.
Now that we have a working CVTDOC ISAPI filter, we need to supply some conversion programs for it. As you'll see in Section 3, CVTDOC ships with conversion programs for Word .DOC files, Excel .XLS files, and text files (with a real text-to-HTML converter written in Perl rather than the stub batch file shown above). This is an immediately useful set of conversions, and you could just use the supplied conversion programs. However, CVTDOC is primarily meant to be a tool with which to register any data file type and associated conversion program. If you are planning to create conversion programs for other file types, a discussion of how these types were created may be useful. Note that the code for these conversion programs is not included with the CVTDOC sample as shipped with the ActiveX Development Kit, which concentrates on the CVTDOC ISAPI filter code itself, not the code for conversion programs.
DOC2HTM.EXE is invoked with the name of the source Word document as its argument. It will create an HTML file named with the source filename and an appended .HTM. To install it for use by CVTDOC, create a value under the W3Svc\Parameters\Conversions key with name of .DOC and data of (for example) C:\WWWROOT\CGI-BIN\DOC2HTM.EXE %s.
This was an easy conversion program to create. Microsoft Word combined with Word Internet Assistant allows you to load a Word document and convert it to HTML by selecting the Save As HTML option from the File menu. Creating the conversion program was just a matter of automating this with Visual Basic and the WordBasic OLE Automation interface. Here is the entire code for the Word-to-HTML converter supplied with CVTDOC.
Private Sub Form_Load()
    Dim X As Object
    Set X = CreateObject("Word.Basic")
    X.FileOpen Name:=Command
    NewFile = Command + ".htm"
    fmt = X.ConverterLookup("HTML")
    X.FileSaveAs Name:=NewFile, Format:=fmt
    Set X = Nothing
    Unload Me
End Sub
You must have Word 6.0 or later and Word Internet Assistant installed on the IIS server machine for this code to work. Once a new conversion program is built, registering it with CVTDOC is as simple as adding a new value to the Conversions subkey of the W3Svc\Parameters key, with data of the full path to the conversion program, followed by "%s".
XL2HTM.EXE is invoked with the name of the source Excel spreadsheet. It will create an HTML file named with the source filename and an appended .HTM. It will only take the data from a named range in your Excel spreadsheet entitled Export. If no Export range is available it will just export A1 through H20. To install it for use by CVTDOC, create a value under the W3Svc\Parameters\Conversions key with name of .XLS and data of (for example) C:\WWWROOT\CGI-BIN\XL2HTM.EXE %s.
Unfortunately, the Excel Internet Assistant cannot be driven through OLE Automation the way Word Internet Assistant can, indirectly, by saving the document as HTML (that is, you cannot save as HTML from within Excel). So I had to write this conversion program from scratch. The entire program, which handles character formatting and alignment, is quite long and not that interesting for the purpose at hand (to show you how to build your own conversions). Below is a grossly oversimplified (but functional) version of the code that just shows how to get the data from the Export range into an HTML table.
Private Sub Form_Load()
    Dim X As Object
    Set X = CreateObject("Excel.Sheet")
    Dim App as Object
    Set App = X.Application
    App.Workbooks.Open Command
    Dim CurSheet As Object
    Set CurSheet = App.ActiveWorkbook.Worksheets("Sheet1")
    Result =CurSheet.Range("Export").Select
    If (Result <> True) Then Result = CurSheet.Range("A1:H20").Select
    Dim OutputFile As String
    OutputFile = Command + ".htm"
    Open OutputFile For Output As #1
    Header = "Data From " + Command
    Line = "<HTML><HEAD><TITLE>" & Header & "</TITLE></HEAD><BODY>"
    Print #1, Line
    Line = "<H1>" & Header & "</H1>"
    Print #1, Line
    Print #1, "<TABLE>"
    NoRows = App.Selection.Rows.Count
    NoCols = App.Selection.Columns.Count
    ' now loop through all rows and columns printing out contents
    For Row = 1 to NoRows
        Print #1, "<TR>"
        For Col = 1 to NoCols
            Print #1, "<TD>"
            Print #1, App.Selection.Cells(Row, Col).Text
        Next Col
    Next Row
    Print #1, "</TABLE></BODY></HTML>"
    Set X = Nothing
    Set App = Nothing
    Set CurSheet = Nothing
    Unload Me
End Sub
TXT2HTML.BAT is invoked with the name of the source text file as its argument. It will create an HTML file named with the source filename and an appended .HTM. To install it for use by CVTDOC, create a value under the W3Svc\Parameters\Conversions key with name of .TXT and data of (for example) C:\WWWROOT\CGI-BIN\TXT2HTML.BAT %s. You will need to have Windows NT Perl installed on your IIS machine, executable in the PATH. You can find NT Perl on the Windows NT 3.51 Resource Kit, and at http://www.perl.hip.com.
This batch file is a wrapper around TXT2HTML.PL, a Perl script for text-to-HTML conversion written by Seth Golub of the University of Washington. The script is entirely too large to present here. The batch file is as follows:
perl txt2html.pl < %1 > %1.htm
The Perl script moves through the text file placing headers around logical breakpoints, and generally attempting to convert the content to HTML. It won't be perfect, but the result is a bit more attractive than a plain text file displayed on a Web browser.
You can get a pretty good idea from the discussion above of how to create your own conversion program. It should take an argument of the source file. It should generate an HTML file named with the source filename and an appended .HTM. If the program in question exposes an OLE Automation interface, this usually makes writing a small Visual Basic® program to do the work very easy. You should be able to use the Form_Load() subroutines presented as a model to build another VB-based conversion program.
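For file types with no OLE Automation interface, the same convention can be met by a small C program. The sketch below is an assumption-laden illustration rather than one of the supplied converters: convert_to_html() is a hypothetical name, and the conversion itself (escaping the text and wrapping it in a PRE block) is deliberately trivial. What matters is the contract: take the source file name, write the result to that name with .htm appended.

```c
#include <stdio.h>
#include <string.h>

/* Write one character, escaping the three that are special in HTML. */
static void put_escaped(int c, FILE *out)
{
    switch (c) {
    case '&': fputs("&amp;", out); break;
    case '<': fputs("&lt;", out);  break;
    case '>': fputs("&gt;", out);  break;
    default:  fputc(c, out);       break;
    }
}

/* Convert src_path into src_path + ".htm", following the CVTDOC naming
   convention. Returns 0 on success and -1 on any failure. */
int convert_to_html(const char *src_path)
{
    char dst_path[1024];
    const char *p;
    FILE *in, *out;
    int c, ok;

    /* sizeof ".htm" counts the terminating NUL as well */
    if (strlen(src_path) + sizeof ".htm" > sizeof dst_path)
        return -1;
    strcpy(dst_path, src_path);
    strcat(dst_path, ".htm");

    in = fopen(src_path, "r");
    if (in == NULL)
        return -1;
    out = fopen(dst_path, "w");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    fputs("<HTML><HEAD><TITLE>", out);
    for (p = src_path; *p != '\0'; p++)
        put_escaped((unsigned char)*p, out);    /* file name as the title */
    fputs("</TITLE></HEAD><BODY><PRE>\n", out);
    while ((c = fgetc(in)) != EOF)
        put_escaped(c, out);                    /* body, escaped verbatim */
    fputs("</PRE></BODY></HTML>\n", out);

    ok = !ferror(in) && !ferror(out);
    fclose(in);
    if (fclose(out) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

A main() that passes its first command-line argument to convert_to_html() and returns the result as the exit code would turn this into a program that can be registered under the Conversions key in the same way as the supplied converters.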
As mentioned, the primary purpose of CVTDOC is to demonstrate the capabilities of ISAPI filters. Hopefully, presenting how this filter was built has made it clear how to create your own filters. If you don't need the specific functionality offered by CVTDOC, you can stop here, fire up Developer Studio and start hacking your own ISAPI filters.
However, based on what you now know about the functionality available in CVTDOC, it may have value to you in and of itself. Assuming you now would like to use CVTDOC on your own Web site, here are the instructions to do so. CVTDOC is included in the ActiveX Development Kit in the directory \INETSDK\SAMPLES\ISAPI\CVTDOC. We recommend downloading and installing the ActiveX Development Kit to ensure you have what you need for the sample. Currently the ActiveX Development Kit can be downloaded from http://www.microsoft.com/intdev/sdk/.
Download the CVTDOC sample files (zipped, 30.3K). To work with these sample files, you'll need to have the ActiveX Development Kit, which you can download from the ActiveX Development Kit page on this site.
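There are three conversion programs supplied with this sample: DOC2HTM.EXE for Word documents, XL2HTM.EXE for Excel spreadsheets, and TXT2HTML.BAT for text files.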
Embed a reference in your referring HTML page to the document or data file name with an appended .HTM. For example:
<HTML> <HEAD><TITLE>Simple CVTDOC Example</TITLE></HEAD> <BODY> <H1>Welcome to the CyberStore</H1> For maximum savings, please check out our <A HREF="specials.doc.htm">daily specials!</A> </BODY> </HTML>
The SPECIALS.DOC file will be converted automatically by the CVTDOC ISAPI filter if either: The HTML file doesn't exist yet, or it's older than the updated SPECIALS.DOC file. This allows the Webmaster to keep the HTML content current with very little intervention.
The following files and URLs will be useful references in your ISAPI filter development efforts. The first reference is ActiveX Development Kit page on the Microsoft Web site. The remaining references are files and directories in the ActiveX Development Kit itself.