Sophie

Sophie

distrib > Mandriva > 8.2 > i586 > media > contrib > by-pkgid > 710ed0d687dee071f36b1a99483146be > files > 13

pavuk-0.9pl28-2mdk.i586.rpm

This is public beta release of Pavuk.
Pavuk in Slovak language means spider.

What this program does :
-recursive HTTP , HTTP over SSL , FTP, FTP over SSL and Gopher document
 retrieving
-supports HTTP/1.1 with persistent connections
-supports HTTP POST requests
-can automaticaly fill forms from HTML documents and make POST or GET
 requestes based on user input and form content
-synchronizing retrieved local copies of document with remote
-partial content retrieving on servers which suppots it (FTP and HTTP/1.1)
-automaticaly follows moved documents
-supports robots exclusion standard via "robots.txt"  and <META NAME=robots ..>
-supports HTTP, FTP, Gopher, SSL proxy servers
-supports HTTP authentification (user, Basic, Digest, NTLM)
-shows document tree
-have interface to "at" command for scheduling
-have optional GTK+ user interface
-may be built with or without X Window user interface
-can handle setup files (scenarios)
-supports NLS via GNU gettext (so messages are translatable to many native 
 languages)
-can be used for fetching documents to proxy/cache server (-mode dontstore)
-supports HTTP cookies
-supports SOCKS(4/5) proxy
-you can run as many as instances of pavuk in same tree without any loose
 of data, because pavuk locks documents while procesing
-you can limit transfer rate over network (speed throttling)
-have powerfull mechanism for mapping URLs to local filenames (-fnrules)
-can check URL wheter it is modified and can send list of modified URLs to any 
 script
-can load files from Netscape or MSIE browser cache directory
-can filter advertisement banners
-can automaticaly turn on/off output of messages to terminal when running
 in foreground or background
-can run users postprocessing scripts for each downloaded document
-can run user scripts for decision wheter links from current HTML document
 should be downloaded
-optionaly can generate statistical reports from download, usable for
 WEB site link checking
-supports different FTP directory listing formats (SYSV, BSD, EPFL, NOVEL,
 VMS, DOS/WINDOWS)
-multithreading support
-supports multiple HTTP proxies with round robin scheduling
-have simple support for javascript using URL patterns
-supports persistent HTTP/1.0 proxy connections
-have simple JavaScript bindings to allow much more flexible conditions
 for excluding URLs from transfer
....?

License :
Everything is licensed under GPL. See COPYING and COPYING.LIB.

Ported to:

-Linux (x86) (gcc)
	Single threaded: supported, works well
	Multi threaded: supported, works well with glibc2&LinuxThreads,
			glibc1&LinuxThreads and glibc1&pcthreads
	GUI: Gtk+

-Digital Unix 3.2 (alpha) (cc and gcc), 4.0 (alpha) (cc and egcs)
	Single threaded: supported, works well
	Multi threaded: supported, works well
	GUI: Gtk+

-Ultrix 4.4 (mips) (cc,egcs) (need to use bash instead of default sh to 
                              run configure script)
	Single threaded: supported, works well
	Multi threaded: not supported, I don't know about any POSIX threads
			implementation for Ultrix
	GUI: Gtk+

-NetBSD (sparc , mips) (gcc)
	latest versions not tested

-NetBSD-1.4.2 (x86) (gcc)
	Single threaded: supported, works well
	Multi threaded: not supported, I don't know about any POSIX threads
			implementation for this version of NetBSD
	GUI: Gtk+ (not tested)

-OpenBSD-2.7 (x86) (gcc)
	Single threaded: supported, works well
	Multi threaded: supported, works well (not much tested)
	GUI: Gtk+ (not tested)

-FreeBSD-4.0 (x86) (gcc)
	Single threaded: supported, works well
	Multi threaded: works, sometimes I get kernel error
			"microuptime() went backwards"
	GUI: Gtk+ (need to pass option --with-gtk-config=gtk12-config to
	     configure script)

-WIN32 (x86) (gcc + cygwin POSIX emulation layer)
	Single threaded: supported, works well
	Multi threaded: doesn't work, because POSIX threads implementation is
			not finished yet
	GUI: win32 Gtk+ for cygwin

-Solaris 7 (x86) (gcc)
	Single threaded: supported, works well
	Multi threaded: supported, works well with POSIX threads
	GUI: Gtk+

-QNX RtP (x86) (gcc)
	Single threaded: supported, works well
	Multi threaded: supported, works well with libc POSIX threads
	GUI: Gtk+ in XPhoton

-BeOS 5 PE (x86) (gcc)
	Single threaded: supported, works except document locking
	Multi threaded: not supported, missing POSIX threads
	GUI: not supported

-also several other UNIX-es, Mac OS X server, OS/2 reported to work

Author :
Ondrejicka Stefan (ondrej@idata.sk)

Home page:
http://www.idata.sk/~ondrej/pavuk/

If you find any bugs please send me report and or fix. 
If you have any suggestions, ideas or quetions please write me.