These are Frequently Asked Questions about using pavuk. There are only a few entries here because I am a bit lazy about updating this document. If you run into a problem and find a solution for it, I would very much appreciate it if you wrote an FAQ entry and sent it to my address. It will reduce the time I spend answering help requests and leave me more time to code.

---

Q: I am using a firewall to access the Internet. Can pavuk go through it?
A: Yes. You can use proxies for HTTP, HTTPS, FTP and Gopher.

For an HTTP proxy use -http_proxy host:port.

For an HTTPS proxy use -ssl_proxy host:port. Pavuk requires an HTTP proxy with the CONNECT request enabled.

For a Gopher proxy use -gopher_proxy host:port. Optionally, pavuk can use an HTTP gateway for accessing Gopher servers (option -gopher_httpgw), or an HTTP proxy with the CONNECT request enabled.

For an FTP proxy use -ftp_proxy host:port. Pavuk can use three different methods for going through a firewall: an HTTP gateway for FTP (option -ftp_httpgw), a native FTP proxy, or an HTTP proxy with the CONNECT request enabled (option -ftp_dirtyproxy).

If your firewall supports SOCKS 4 or SOCKS 5 proxies, you can compile pavuk with support for them. You only need the development libraries for these protocols.

---

Q: I am using FWTK for my firewall, but I can't download any files through the FTP proxy.
A: The FTP proxy included in FWTK doesn't support passive data transfers. Use the -ftp_active option to switch FTP data connections to active mode.

---

Q: I have several scenarios which I want to execute automatically. Is it possible to run them one after another with pavuk?
A: You can write a short script in the shell or any scripting language to do this. Here is an example for sh or bash:

    for scn in *.scn; do pavuk -scndir . -scenario "$scn"; done

---

Q: I see some files beginning with .in_. What do they mean?
A: Pavuk uses these files as temporary files while a document is being downloaded.
When the transfer of a file fails, this file contains the part already transferred, which is used for the next reget (if possible). These files are also used for locking documents.

---

Q: I want to always start pavuk with the GUI. Is there a way to set this up in the ~/.pavukrc file?
A: No. It cannot be set in ~/.pavukrc, but you can use the alias mechanism of your shell. For example:

    csh:  alias xpavuk 'pavuk -X'
    bash: alias xpavuk='pavuk -X'

---

Q: Is there a way to force pavuk not to build the whole directory hierarchy for the local document tree?
A: There are currently two ways to do this:

1) Use the -base_level option to cut some levels from the hierarchy. For example, if you are downloading http://www.site.tld/manual/automake/automake_toc.html and want to store it only in the automake directory, use -base_level 3.

2) Use the -fnrules option. For example, you can put all downloaded files into a single directory:

    -fnrules 'F' '*' '/directory/%n'

---

Q: When I am running pavuk with the GUI, is there a way to close or restart the X server without breaking pavuk?
A: Yes, it is possible, but with a lot of limitations. First, pavuk must be running as a background job (start it with pavuk -X & or pavuk -X -bg, or stop pavuk with CTRL-Z from the shell and then put it into the background with the bg shell command). Then you can use the "Go Bg" button, which removes all pavuk windows from the screen as soon as it is safe to do so (the transfer of the current document must finish) and then closes the connection to the X server.

---

Q: When I download documents whose names contain special characters such as ?&*, the stored document tree is not browsable. I want to convert these characters to some others.
A: This is possible with the -tr_chr_chr option. For example, use

    -tr_chr_chr '?&*' _

and every ?, & and * character becomes an _ character. If you want to make this the default behavior, put this in your ~/.pavukrc file:

    TrChrToChr: "?&*" "_"

---

Q: Does pavuk preserve symbolic links on FTP servers?
A: Yes, it does, but you have to use the -ftplist option to enable this feature.

---

Q: How can I download a complex site into a single directory without subdirectories?
A: Use one of the following sets of options:

a) (works with 0.9pl19 and higher)

    -store_info -fnrules F '*' '/directory/%n'

b) (works with 0.9pl20 and higher)

    -store_info -base_level 1000 -cdir /home/my/directory

The -store_info option is optional with version 0.9pl20 and higher, but it is required if you want to synchronize the mirror in the future (see the manual for a description).

---

Q: In sync mode I am using the -remove_old option, but pavuk does not remove documents which have disappeared from the remote server. Is this a bug?
A: No, this is not a bug. Pavuk needs to know which directory contains your mirror in order to find the files which belong to it. So together with -remove_old you also have to use the -subdir option to specify that directory. For example, if you are mirroring http://www.idata.sk/~ondrej/pavuk/ to the directory /home/my/mirror, use the command:

    pavuk -mode sync http://www.idata.sk/~ondrej/pavuk/ -dont_leave_dir \
        -remove_old -cdir /home/my/mirror/ \
        -subdir /home/my/mirror/http/www.idata.sk_80/~ondrej/pavuk/

and the removal of old documents will work.

---

(by Georg Rehm)
Q: Pavuk tells me "stat: no such file or directory", but all the files seem to be in the local document tree, just where they belong. What's going on?
A: This happens when you delete temporary files with an external program or script via the -post_cmd switch and pavuk then tries to rewrite links that are embedded in new incoming documents. With this error message pavuk tells you that something is wrong (i.e., the file referenced in an incoming document is no longer in the local document tree), but it is not critical, because pavuk rewrites the link to the remote destination anyway.
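---

Note on the -remove_old entry above: the -subdir path in that example appears to follow the pattern CDIR/SCHEME/HOST_PORT/PATH. The sh function below is only a sketch that rebuilds such a path from a URL under that assumption. mirror_subdir is a hypothetical helper, not part of pavuk, and the layout is inferred from that single example, so compare its output with your actual local tree before relying on it.

```shell
# mirror_subdir CDIR URL
# Hypothetical helper (not part of pavuk). Prints the directory that would
# hold URL below CDIR, ASSUMING the layout CDIR/SCHEME/HOST_PORT/PATH seen
# in the -remove_old example of this FAQ.
mirror_subdir() {
    cdir=${1%/}               # drop one trailing slash from the mirror root
    url=$2
    scheme=${url%%://*}       # text before "://"
    rest=${url#*://}          # host[:port]/path
    hostport=${rest%%/*}      # host[:port]
    case $rest in
        */*) path=${rest#*/} ;;   # everything after the first "/"
        *)   path= ;;
    esac
    host=${hostport%%:*}
    case $hostport in
        *:*) port=${hostport##*:} ;;   # explicit port given in the URL
        *)   case $scheme in
                 http) port=80 ;;      # default taken from the FAQ example
                 *)    echo "mirror_subdir: give an explicit port for $scheme URLs" >&2
                       return 1 ;;
             esac ;;
    esac
    printf '%s/%s/%s_%s/%s\n' "$cdir" "$scheme" "$host" "$port" "$path"
}

# Usage sketch (pavuk itself is not run here):
#   cdir=/home/my/mirror
#   url=http://www.idata.sk/~ondrej/pavuk/
#   pavuk -mode sync "$url" -dont_leave_dir -remove_old \
#       -cdir "$cdir" -subdir "$(mirror_subdir "$cdir" "$url")"
```

Deriving the path in one place keeps -cdir and -subdir consistent when the same command is reused for several mirrors. For schemes other than http the function refuses to guess a port, since this FAQ only shows the http case.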