-
- <HTML>
- <HEAD>
- <TITLE>WWW::Search::WebCrawler - class for searching WebCrawler</TITLE>
- <LINK REL="stylesheet" HREF="../../../../Active.css" TYPE="text/css">
- <LINK REV="made" HREF="mailto:">
- </HEAD>
-
- <BODY>
- <TABLE BORDER=0 CELLPADDING=0 CELLSPACING=0 WIDTH=100%>
- <TR><TD CLASS=block VALIGN=MIDDLE WIDTH=100% BGCOLOR="#cccccc">
- <STRONG><P CLASS=block> WWW::Search::WebCrawler - class for searching WebCrawler</P></STRONG>
- </TD></TR>
- </TABLE>
-
- <A NAME="__index__"></A>
- <!-- INDEX BEGIN -->
-
- <UL>
-
- <LI><A HREF="#name">NAME</A></LI>
- <LI><A HREF="#supportedplatforms">SUPPORTED PLATFORMS</A></LI>
-
- <LI><A HREF="#synopsis">SYNOPSIS</A></LI>
- <LI><A HREF="#description">DESCRIPTION</A></LI>
- <LI><A HREF="#see also">SEE ALSO</A></LI>
- <LI><A HREF="#how does it work">HOW DOES IT WORK?</A></LI>
- <LI><A HREF="#bugs">BUGS</A></LI>
- <LI><A HREF="#testing">TESTING</A></LI>
- <LI><A HREF="#author">AUTHOR</A></LI>
- <LI><A HREF="#legalese">LEGALESE</A></LI>
- <LI><A HREF="#version history">VERSION HISTORY</A></LI>
- <UL>
-
- <LI><A HREF="#2.02, 19991005">2.02, 1999-10-05</A></LI>
- <LI><A HREF="#2.01, 19990713">2.01, 1999-07-13</A></LI>
- <LI><A HREF="#1.13, 19990329">1.13, 1999-03-29</A></LI>
- <LI><A HREF="#1.11, 19981009">1.11, 1998-10-09</A></LI>
- <LI><A HREF="#1.9">1.9</A></LI>
- <LI><A HREF="#1.7">1.7</A></LI>
- <LI><A HREF="#1.5">1.5</A></LI>
- <LI><A HREF="#1.3">1.3</A></LI>
- </UL>
-
- </UL>
- <!-- INDEX END -->
-
- <HR>
- <P>
- <H1><A NAME="name">NAME</A></H1>
- <P>WWW::Search::WebCrawler - class for searching WebCrawler</P>
- <P>
- <HR>
- <H1><A NAME="supportedplatforms">SUPPORTED PLATFORMS</A></H1>
- <UL>
- <LI>Linux</LI>
- <LI>Solaris</LI>
- <LI>Windows</LI>
- </UL>
- <HR>
- <H1><A NAME="synopsis">SYNOPSIS</A></H1>
- <PRE>
- use WWW::Search;
- my $oSearch = WWW::Search->new('WebCrawler');
- my $sQuery = WWW::Search::escape_query("+sushi restaurant +Columbus Ohio");
- $oSearch->native_query($sQuery);
- while (my $oResult = $oSearch->next_result())
- { print $oResult->url, "\n"; }</PRE>
- <P>
- <HR>
- <H1><A NAME="description">DESCRIPTION</A></H1>
- <P>This class is a WebCrawler specialization of WWW::Search.
- It handles making and interpreting WebCrawler searches
- <EM><A HREF="http://www.WebCrawler.com">http://www.WebCrawler.com</A></EM>.</P>
- <P>This class exports no public interface; all interaction should
- be done through <A HREF="../../../../site/lib/WWW/Search.html">the WWW::Search manpage</A> objects.</P>
- <P>
- <HR>
- <H1><A NAME="see also">SEE ALSO</A></H1>
- <P>To make new back-ends, see <A HREF="../../../../site/lib/WWW/Search.html">the WWW::Search manpage</A>.</P>
- <P>
- <HR>
- <H1><A NAME="how does it work">HOW DOES IT WORK?</A></H1>
- <P><CODE>native_setup_search</CODE> is called (from <CODE>WWW::Search::setup_search</CODE>)
- before we do anything. It initializes our private variables (which
- all begin with underscore) and sets up a URL to the first results
- page in <CODE>{_next_url}</CODE>.</P>
- <P><CODE>native_retrieve_some</CODE> is called (from <CODE>WWW::Search::retrieve_some</CODE>)
- whenever more hits are needed. It calls <CODE>WWW::Search::http_request</CODE>
- to fetch the page specified by <CODE>{_next_url}</CODE>.
- It then parses this page, appending any search hits it finds to
- <CODE>{cache}</CODE>. If it finds a "next" button in the text,
- it sets <CODE>{_next_url}</CODE> to point to the page for the next
- set of results; otherwise it sets it to undef to indicate we're done.</P>
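- <P>The two-step flow described above can be sketched roughly as follows. This is
- a minimal illustration, not the module's actual code: the package name
- <CODE>Sketch::Backend</CODE> and the in-memory <CODE>%FAKE_PAGES</CODE> table are
- hypothetical stand-ins, and the hash lookup takes the place of the real
- <CODE>WWW::Search::http_request</CODE> call and HTML parsing.</P>

```perl
use strict;
use warnings;

# Hypothetical stand-in for the remote site: each "URL" maps to the hits
# on that page and the URL of the following page (undef on the last page).
my %FAKE_PAGES = (
    'page1' => { hits => ['http://a.example/', 'http://b.example/'], next_url => 'page2' },
    'page2' => { hits => ['http://c.example/'],                      next_url => undef   },
);

package Sketch::Backend;

sub new {
    my ($class) = @_;
    return bless { cache => [] }, $class;
}

# Plays the role of native_setup_search: initialise private
# (underscore-prefixed) state and point {_next_url} at the first page.
sub native_setup_search {
    my ($self, $query) = @_;
    $self->{_query}    = $query;
    $self->{_next_url} = 'page1';
    return;
}

# Plays the role of native_retrieve_some: fetch one page, append its hits
# to {cache}, then advance {_next_url} or clear it when there is no next page.
sub native_retrieve_some {
    my ($self) = @_;
    return 0 unless defined $self->{_next_url};

    # Assumption: a hash lookup replaces the HTTP fetch and page parsing.
    my $page = $FAKE_PAGES{ $self->{_next_url} };
    if (!defined $page) {
        $self->{_next_url} = undef;
        return 0;
    }

    push @{ $self->{cache} }, @{ $page->{hits} };
    $self->{_next_url} = $page->{next_url};
    return scalar @{ $page->{hits} };
}

package main;

my $backend = Sketch::Backend->new;
$backend->native_setup_search('sushi restaurant');

# Keep asking for more hits until the backend signals it is done.
$backend->native_retrieve_some while defined $backend->{_next_url};

print "$_\n" for @{ $backend->{cache} };
```

- <P>In the real module the caller never drives this loop directly;
- <CODE>WWW::Search</CODE> invokes the backend as results are requested.</P>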
- <P>
- <HR>
- <H1><A NAME="bugs">BUGS</A></H1>
- <P>Please tell the author if you find any!</P>
- <P>
- <HR>
- <H1><A NAME="testing">TESTING</A></H1>
- <P>This module adheres to the <CODE>WWW::Search</CODE> test suite mechanism.
- See <CODE>$TEST_CASES</CODE> in the module source.</P>
- <P>
- <HR>
- <H1><A NAME="author">AUTHOR</A></H1>
- <P>As of 1998-03-16, <CODE>WWW::Search::WebCrawler</CODE> is maintained by Martin Thurn
- (<A HREF="mailto:MartinThurn@iname.com">MartinThurn@iname.com</A>).</P>
- <P><CODE>WWW::Search::WebCrawler</CODE> was originally written by Martin Thurn
- based on <CODE>WWW::Search::HotBot</CODE>.</P>
- <P>
- <HR>
- <H1><A NAME="legalese">LEGALESE</A></H1>
- <P>THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
- WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
- MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.</P>
- <P>
- <HR>
- <H1><A NAME="version history">VERSION HISTORY</A></H1>
- <P>If it's not listed here, then it wasn't a meaningful or released version.</P>
- <P>
- <H2><A NAME="2.02, 19991005">2.02, 1999-10-05</A></H2>
- <P>now uses <CODE>hash_to_cgi_string()</CODE></P>
- <P>
- <H2><A NAME="2.01, 19990713">2.01, 1999-07-13</A></H2>
- <P>
- <H2><A NAME="1.13, 19990329">1.13, 1999-03-29</A></H2>
- <P>Remove extraneous HTML from description (thanks to Jim Smyser <A HREF="mailto:jsmyser@bigfoot.com">jsmyser@bigfoot.com</A>)</P>
- <P>
- <H2><A NAME="1.11, 19981009">1.11, 1998-10-09</A></H2>
- <P>Now uses split_lines function</P>
- <P>
- <H2><A NAME="1.9">1.9</A></H2>
- <P>1998-08-20: New format of www.webcrawler.com output.</P>
- <P>
- <H2><A NAME="1.7">1.7</A></H2>
- <P>\n changed to \012 for MacPerl compatibility</P>
- <P>
- <H2><A NAME="1.5">1.5</A></H2>
- <P>1998-05-29: New format of www.webcrawler.com output.</P>
- <P>
- <H2><A NAME="1.3">1.3</A></H2>
- <P>First publicly-released version.</P>
-
- </BODY>
-
- </HTML>
-