4 cows rating at TuCows!
4 stars rating at ZDNet!
Scout Report Selection

HTTrack
The Web Mirror Utility

* 1.20 soooon... *

French version

 

Overview

1.20beta19 RELEASE is available! (February 1999) - Final 1.20: March 1999... (we hope...) - What's new?

WinHTTrack

HTTrack is an easy-to-use offline browser utility. It allows you to transfer a World Wide Web site from the Internet to a local directory, recursively building all structures and getting HTML, images, and other files from the server to your computer. Links are rebuilt relatively so that you can freely browse the local site (works with any browser). You can mirror several sites together so that you can jump from one to another. You can also update an existing mirror site, or continue an interrupted transfer. The robot is fully configurable, with integrated help.
WinHTTrack is the Windows95/98 release of HTTrack. It is available in the HTTrack ZIP archive.
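
To illustrate what "links are rebuilt relatively" means, here is a minimal Python sketch. It is not HTTrack's own code (the robot is written in C and C++), and the function name relative_link and the local layout (one directory per host name) are assumptions made for this example only.

```python
import posixpath


def relative_link(from_page: str, to_file: str) -> str:
    """Return the link to write inside from_page so that it points at to_file.

    Both arguments are paths inside the local mirror directory, using '/'.
    The layout (host name as top-level directory) is assumed, not HTTrack's.
    """
    start_dir = posixpath.dirname(from_page) or "."
    return posixpath.relpath(to_file, start=start_dir)


# A link to an image in a subdirectory of the same site
print(relative_link("www.myweb.abc/mydir/index.html",
                    "www.myweb.abc/mydir/img/logo.gif"))

# A link from one mirrored site to another mirrored site
print(relative_link("www.myweb.abc/mydir/index.html",
                    "www.otherweb.abc/~friend/cool/index.html"))
```

Because the rewritten links never mention an absolute location, the mirror directory can be moved or copied and still be browsed.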

SUN SOLARIS AND IBM AIX VERSIONS ONLY:
On Sun Solaris and AIX, HTTrack consists of two programs: the graphic shell and the robot.
The shell is an easy way to control the robot through a graphic interface; it is available at the HTTrack shell page.
Here you can only find the robot, which can be used as a command-line program.

 


 

Features of HTTrack

 

Download version 1.20beta (10/26/98) - BETA RELEASE WITHOUT .hlp FILE-

Platform             Click on the proper file to download    Current version
Windows95/98         httrack.zip                             1.20BETA-19 (!!)
SUN Solaris (5.6)    httrack.tar.gz                          1.20BETA-19'
IBM AIX (4.0)        httrack.tar.gz                          1.20BETA-19'
Linux PC             httrack.tar.gz                          1.20BETA-10

Alternative sites: use Ftp search to find the latest release available on FTP servers.

 


 

Usage of HTTrack

The documentation is now available for WinHTTrack and HTTrack, and includes the frequently asked questions (FAQ).
 
  

On SUN/Solaris and IBM/AIX, the simplest way is to use the graphic shell, but you can also use the robot in a console window.
Type httrack (without any parameters) to show the list of options. Parameters and addresses can be given in any order.
There are also special commands, such as the Nx option.

Example:  

httrack www.myweb.abc/mydir/index.html
httrack www.myweb.abc/mydir/index.html www.otherweb.abc/~friend/cool/
httrack www.myweb.abc/mydir/index.html www.otherweb.abc/~friend/cool/ -N1 -P proxy.myweb.abc:1234
httrack www.myweb.abc/mydir/index.html +www.otherweb* +www.hisweb*.net* -*.com*

The first example will transfer the site starting from 'www.myweb.abc/mydir/index.html' (and, of course, not all of www.myweb.abc!).
The second example will mirror 'www.myweb.abc/mydir/index.html' and 'www.otherweb.abc/~friend/cool/' together.
The third example does the same, but stores HTML and images in two separate directories (option N1) and uses a proxy (option P).
The fourth example shows how to use wildcards to accept or refuse all URLs of a certain type. Note that if you specify a URL without any wildcard (*) after '+' or '-', that exact URL will be accepted or refused wherever it is found.
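
The way such '+' and '-' wildcard filters select URLs can be sketched in a few lines of Python. This is only an illustration, not the robot's real filter engine: the function names are hypothetical, and the rule that the last matching filter decides is an assumption of this sketch, since the order of precedence is not described here.

```python
import re


def compile_filter(rule):
    """Turn '+pattern' or '-pattern' into (accept, regex); '*' matches anything."""
    sign, pattern = rule[:1], rule[1:]
    if sign not in ("+", "-"):
        raise ValueError("a filter must start with '+' or '-': %r" % rule)
    # Escape everything literally, then let each '*' stand for any text.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return sign == "+", re.compile(regex)


def decide(url, rules):
    """Return True (accepted), False (refused) or None (no filter applies).

    ASSUMPTION: when several filters match, the last one wins.
    """
    verdict = None
    for rule in rules:
        accept, regex = compile_filter(rule)
        if regex.fullmatch(url):
            verdict = accept
    return verdict


rules = ["+www.otherweb*", "+www.hisweb*.net*", "-*.com*"]
for url in ("www.otherweb.abc/~friend/cool/",
            "www.hisweb1.net/page.html",
            "www.example.com/index.html",
            "www.unrelated.org/"):
    print(url, decide(url, rules))
```

A filter without any '*' only matches the exact URL, which corresponds to the note above about URLs given without wildcards.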

Default options are set so that you can easily use the robot in command-line mode.

 


 

Comments : SEND US AN EMAIL!

We hope you will enjoy this utility as much as we enjoyed developing it. If you like it, feel free to encourage us by sending comments and remarks. Problem and bug reports are also welcome, for the shell and for the robot.
 


 

Updates and bugs fixed

To do & Known bugs: (fixed soon)
- Well... some help files to do... we're something like... late... still waiting for FINAL 1.20 (be patient!!)
- We have some problems with the Unix release (buggy ftp protocol) and still no Linux version (disk crash...)

BETA RELEASE BEFORE FINAL, PLEASE NOTIFY US OF ANY BUG OR PROBLEM
1.20

+ Fixed: Random crashes (div by 0/illegal instruction) with null size files
+ New: Limited ftp protocol (files only), e.g. +ftp://* now works
+ Fixed: Some connect problems with several servers or proxies
+ New: New option, save html error report by default
+ Shell: Browse and see log files at the end of a mirror
+ New: Proxy authentication (ex: guest:star@myproxy.com:8080)
+ Shell: Interface improved (especially during mirror)
+ Fixed: Ambiguous files are renamed (asp,cgi->html/gif..)
+ Shell: New test link mode option
+ New: Site authentication (ex: guest:star@www.myweb.com/index.html)
+ Fixed: Minor bugs fixed
+ Shell: See log files during a mirror
+ Fixed: Some problems using CGI (different names now)
+ Fixed: Go down/up/both options and filters
+ Fixed: "Store html first" did not work
+ New: -F option ("Browser ID") disguises HTTrack as a browser
+ New: New filter system
+ Shell: New "Save as default" options
+ Fixed: "Build options" did NOT work properly! (files overwritten or missing)
+ Fixed: User agent ID fixed
+ Shell: Skip options
+ Shell: Better interface control during mirrors
+ Shell: InstallShield and Help files
+ Fixed: Some external links were not filtered sometimes
+ Fixed: Mirror crash at the end
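
The proxy authentication entry above uses the form user:password@host:port. As an illustration of how such an address breaks down, here is a small Python sketch; the function split_address is hypothetical and is not part of HTTrack.

```python
def split_address(spec):
    """Split 'user:password@host:port' into its parts.

    Missing parts (no credentials, no port) are returned as None.
    Only the proxy form is handled; site addresses with a path are not.
    """
    credentials, _, host_port = spec.rpartition("@")
    user, _, password = credentials.partition(":")
    host, _, port = host_port.partition(":")
    return {
        "user": user or None,
        "password": password or None,
        "host": host,
        "port": int(port) if port else None,
    }


print(split_address("guest:star@myproxy.com:8080"))
print(split_address("proxy.myweb.abc:1234"))
```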

1.16b
+ Shell: Really *stupid* bug fixed causing WinHTTrack to be slooow
+ Fixed: Crash if the first page has no title
+ Fixed: Bogus options like "Just scan" saved empty files
+ Fixed: Forbid all links (*) with manual accept did not work
+ Shell: Filters interface improved
1.16:
+ New: Java Classes and subclasses are now retrieved!
+ New: Better JavaScripts parsing
+ New: Option: Abandon slowest hosts if timeout/transfer too slow
+ Shell: Interface improved

1.15b
+ Fixed: Some bugs fixed
1.15:
+ Shell: Interface improved
+ New: Robot improved (some files through javascript are now detected!)
+ New: Improved jokers (for example, +www.*.com/*.zip)
+ New: 'config' file to configure proxy, path.. only once

1.11
+ New: Wait for specific time (begin transfer at specific hour)
+ New: Time limit option (stops transfer after x seconds)
+ Shell: Interface improved for an easy use

1.10e
+ Fixed: Maps were not correctly managed (stupid bug)
1.10d:
+ Fixed: Bogus index.html fixed
1.10c
+ Shell: "Time out" field needed "transfer rate" field
1.10b
+ Fixed: Better memory management
1.10
+ New: "Transfer rate out" option added (abandon slowest sites)
+ New: "Deaf" hosts do not freeze HTTrack any more
+ Fixed: Again problems with code/codebase tags
+ New: Broken links detection improved

1.04
+ Fixed: Some links were not correctly read (pages with "codebase" tags)
+ Shell: Interface improved

1.03 (No changes for the command-line robot)
+ Shell: Big bug fixed! (VERY slow transfer rates..)

1.02
+ Fixed: Some java files were not correctly transferred
+ New: Speed has been improved
+ Fixed: Log file more accurate
+ Shell: Interface has been improved

1.01
+ Fixed: Structure check error in some cases

1.00 -- The 1.00, Yeah!
+ New: base and codebase are now scanned

0.998 beta-2
+ Fixed: Multiple name bug (files having the same name in the same directory) with -O option fixed

0.997 beta-2
+ Fixed: Filenames with '%' were not correctly named
+ Fixed: Bug detected in 0.996: several files are not written on disk!!

0.996 beta-2
+ New: -O option (path for mirror and log)
+ New: Unmodified file time/date are not changed during an update

0.99 beta-2
+ New: User-agent field
+ New: Shortcuts (--spider etc.)
+ New: Links not retrieved are now rebuilt absolutely
+ New: The 'g' option (just get files in current directory) has been added
+ New: Primary links analysis has been improved
+ Fixed: "304" bug fixed

0.25 beta-2
+ Fixed: Freeze during several mirrors fixed!
+ New: More 'N' options (filenames type)

0.24 beta-2
+ Fixed: Restart/Update with cache did not work (really not..)
+ Fixed: Jokers now work properly (e.g. +www.abc.com* does work)
+ New: The 'n' option (get non-html files near a link) has been added!

0.23 beta-2
+ Fixed: The 'M' option (site size) did not work
+ Fixed: Files larger than 65Kb were not correctly written

older beta
+ Many, many bugs fixed
 


 

Credits 

Graphic shell developed by Yann Philippot
Robot developed by Xavier Roche
Project led by Patrick Ducrot and Daniel Carré

HTTrack has been developed using C and C++, with around 10,000 lines of code. We have spent many, many hours testing and debugging, so that it can be as reliable as possible. We think we have done a good job ;-)

Project developed at the ENSI Caen - ISMRa
©1998 Xavier Roche & Yann Philippot, all rights reserved.
