is an aggregation of sub commands. The first argument names the sub command, and the remaining arguments are parsed according to that sub command. The argument
specifies the crawler root directory, which contains the configuration file and other crawler files.
is specified, crawling is restarted from the seed documents.
is specified, collected documents are revisited.
is specified, collected documents are revisited and then crawling is continued.</dd>
Fetch a document.
specifies the URL of a document.
specifies the host name and the port number of the proxy server.
specifies the timeout in seconds.
specifies the preferred language. By default, it is English.
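To make the roles of the URL, proxy, and timeout parameters concrete, here is a minimal Python sketch of an equivalent fetch using only the standard library. It is an illustration, not the tool's implementation: the names `build_fetcher` and `fetch`, the 30-second default timeout, and the use of an HTTP proxy URL are all assumptions, and the preferred-language setting is deliberately not modeled because the text does not say how it is applied.

```python
import urllib.request


def build_fetcher(proxy_host=None, proxy_port=None):
    """Return an opener, routed through a proxy when host and port are given.

    Hypothetical helper: the host/port pair mirrors the proxy parameter
    described above, but the wiring is an assumption.
    """
    handlers = []
    if proxy_host is not None and proxy_port is not None:
        proxy = f"http://{proxy_host}:{proxy_port}"
        handlers.append(urllib.request.ProxyHandler({"http": proxy, "https": proxy}))
    return urllib.request.build_opener(*handlers)


def fetch(url, proxy_host=None, proxy_port=None, timeout=30.0):
    """Fetch one document and return its raw bytes.

    The timeout is in seconds; the default of 30 is an assumed value.
    """
    opener = build_fetcher(proxy_host, proxy_port)
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

A call such as `fetch("http://example.com/", proxy_host="proxy.example", proxy_port=8080, timeout=10)` would then request the document through the given proxy and give up after ten seconds.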
All sub commands return 0 if the operation succeeds and 1 otherwise. A running crawler closes the database and exits when it catches signal 1 (SIGHUP), 2 (SIGINT), 3 (SIGQUIT), or 15 (SIGTERM).
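The exit-status and signal conventions above suggest how a supervising script might stop a running crawler and check the result. The following Python sketch shows the pattern on a POSIX system. Because it must run anywhere, it uses a stand-in child process (a small Python script that exits with status 0 on SIGTERM) rather than the real crawler; `CHILD`, `stop_gracefully`, and `succeeded` are hypothetical names, and the 10-second grace period is an assumption.

```python
import signal
import subprocess
import sys

# Stand-in for a long-running crawler (assumption, not the real command):
# it installs a SIGTERM handler, reports readiness, then sleeps.
CHILD = (
    "import signal, sys, time\n"
    "signal.signal(signal.SIGTERM, lambda signum, frame: sys.exit(0))\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def stop_gracefully(proc, sig=signal.SIGTERM, wait=10.0):
    """Send a termination signal and return the child's exit status.

    Falls back to a hard kill if the child does not exit within `wait` seconds.
    """
    proc.send_signal(sig)
    try:
        return proc.wait(timeout=wait)
    except subprocess.TimeoutExpired:
        proc.kill()
        return proc.wait()


def succeeded(status):
    """Apply the convention described above: 0 means success."""
    return status == 0


proc = subprocess.Popen(
    [sys.executable, "-c", CHILD], stdout=subprocess.PIPE, text=True
)
proc.stdout.readline()  # block until the child has installed its handler
status = stop_gracefully(proc)
proc.stdout.close()
print("clean shutdown" if succeeded(status) else f"failed with status {status}")
```

Waiting for the child's readiness line before signalling avoids a race in which the signal arrives before the handler is installed. With the real crawler, any of the four listed signals should trigger the same orderly shutdown.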
When crawling finishes, there is a directory
in the crawler root directory. It is an index that can be used by
and so on.