WHAT IS THIS ?

This is Circa::Search, a module who provide functions to 
perform search on Circa, a www search engine running with 
Mysql. Circa is for your Web site, or for a list of sites. 
It indexes like Altavista does. It can read, add and 
parse all url's found in a page. It add url and word 
to MySQL for use it at search.

HOW DO I INSTALL IT ?

To install this module, cd to the directory that contains
this README file and type the following:

   perl Makefile.PL
   make
   make test
   make install

If this doesn't work for you, try:

   cp Search.pm /usr/local/lib/perl5/Circa

If you have trouble installing Search.pm because you have 
insufficient access privileges to add to the perl library 
directory, you can still use Search.pm.  See the docs in 
directory demo for details.

FEATURES ?

+ Full text indexing 

+ Different weights for title, keywords, description and 
rest of page HTML read can be given in configuration 

+ Boolean query language support : or (efault) and ("+") 
not ("-"). Ex perl + faq -cgi : Documents with faq, 
eventually perl and not cgi. 

+ Support protocol HTTP,FTP 

+ Make index in MySQL 

+ Read HTML and full text plain 

+ Can do indexation of filesystem without talk to Web Server 

+ Can browse site by directory / rubrique. 

+ Several kinds of indexing : full, incremental, only on 

a particular server. Documents not updated are not 
reindexed. All requests for a file are made first with 
a head http request, for information such as validate, 
last update, size, etc. 

+ Size of documents read can be restricted (Ex: don't get 
all documents > 5 MB). For use with low-bandwidth 
connections, or computers which do not have much memory. 

+ HTML template can be easily customized for your needs. 

+ Search for different criteria: news, last modified date, 
language, URL / site. 

+ Admin functions available by browser interface or 
command-line. 

+ Full support of standard robots exclusion (robots.txt). 

+ Identification with CircaIndexer/0.1, mail 
alian@alianwebserver.com. 

+ Delay requests to the same server for one minute. 
"It's not a bug, it's a feature!" Basic rule for 
HTTP serveur load. Index the different links found 
in a CGI (all after name_of_file?) 

+ Support proxy HTTP 

WHERE IS THE DOCUMENTATION ?

You'll find very verbose documentation in the file Search.pm 
in POD format. For read it, you can use pod2html.

When you install Circa::Search, the MakeMaker program will 
automatically install the manual pages for you (on Unix 
systems, type "man Circa::Search").

WHERE ARE THE EXAMPLES ?

A collection of examples demonstrating various features and
techniques are in the directory "demo".

First install and use Circa::Indexer.

Have fun, and let me know how it turns out!

Alain BARBET
alian@alianwebserver.com
