Thursday, 28th August 2008
GNU Wget is a free utility for non-interactive download of files from the Web. It supports HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies.
Wget is non-interactive, meaning that it can work in the background while the user is not logged on. This allows you to start a retrieval and disconnect from the system, letting Wget finish the work. By contrast, most Web browsers require the user's constant presence, which can be a great hindrance when transferring a lot of data.
Wget can follow links in HTML pages and create local versions of remote web sites, fully recreating the directory structure of the original site. This is sometimes referred to as "recursive downloading". While doing that, Wget respects the Robot Exclusion Standard (/robots.txt). Wget can be instructed to convert the links in downloaded HTML files to the local files for offline viewing.
Wget has been designed for robustness over slow or unstable network connections; if a download fails due to a network problem, it will keep retrying until the whole file has been retrieved. If the server supports regetting, it will instruct the server to continue the download from where it left off.
$ wget http://www.compsoc.nuigalway.ie/learning/downloads/putty.exe
# Downloads putty.exe to your current directory and shows you its progress,
# time remaining and transfer speed in kilobytes per second.
$ nohup wget http://www.compsoc.nuigalway.ie/learning/downloads/putty.exe &
# Does the same as above, but in the background, and continues even if
# you log off the terminal in which it was started.
In the last example, the nohup command makes the program ignore the hangup
signal sent when you log off, and keeps its output away from your shell.
The '&' runs the program in the background, so it doesn't take over your
shell. In this example the output goes into a file called nohup.out in the
directory in which the command was executed, should you wish to review it.
This applies to most non-interactive programs.
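Wget also has a background mode of its own, so for this one program you can skip nohup. Here is a minimal sketch, assuming the -b and -o options described in the Wget manual; the function name background_get and the default log file name are made up for illustration.

```shell
# Hypothetical helper: start a download in wget's own background mode.
# $1 is the URL, $2 an optional log file name (illustrative default).
background_get() {
    # -b  go to the background immediately after startup
    # -o  write wget's messages to the named log file
    wget -b -o "${2:-wget-download.log}" "$1"
}
```

$ background_get http://www.compsoc.nuigalway.ie/learning/downloads/putty.exe
# should start the download and hand your shell straight back; read
# wget-download.log to check on its progress.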
$ wget -c http://frink.nuigalway.ie/learning/downloads/putty.exe
# Resumes downloading putty.exe if the download was stopped midway for some
# reason, but only if run in the directory where the partial download is kept.
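Resuming with -c can be combined with explicit retry settings for flaky connections. The sketch below assumes the --tries, --waitretry, --timeout and --continue options from the Wget manual; the function name robust_get and the chosen numbers are only illustrative.

```shell
# Hypothetical helper: fetch one URL with explicit retry settings.
robust_get() {
    # --tries=0       retry without limit (wget's default is 20 attempts)
    # --waitretry=10  back off between retries, waiting up to 10 seconds
    # --timeout=30    treat a connection stalled for 30 seconds as failed
    # --continue      resume a partial file when the server supports it
    wget --tries=0 --waitretry=10 --timeout=30 --continue "$1"
}
```

$ robust_get http://www.compsoc.nuigalway.ie/learning/downloads/putty.exe
# Fatal errors such as "connection refused" or a 404 are not retried, so
# this should not loop forever on a URL that does not exist.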
$ wget -rnp http://frink.nuigalway.ie/learning/downloads/
# Recursively downloads everything readable in /learning/downloads/ and below
# it; -np (--no-parent) stops wget from climbing into parent directories.
# WARNING! This may fill your quota without you noticing! Use with caution!
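One way to take some of the danger out of a recursive download is to cap it. This is a sketch only, assuming the -l (depth) and -Q (download quota) options from the Wget manual; the function name capped_mirror, the depth of 2 and the 100 megabyte limit are arbitrary values picked for illustration.

```shell
# Hypothetical helper: recursive download with a depth and size cap.
capped_mirror() {
    # -r       recurse through links
    # -np      never ascend to the parent directory
    # -l 2     follow links at most two levels deep (wget's default is 5)
    # -Q 100m  stop fetching new files once about 100 megabytes are down
    wget -r -np -l 2 -Q 100m "$1"
}
```

$ capped_mirror http://frink.nuigalway.ie/learning/downloads/
# The -Q limit applies to recursive retrievals, not to a single file, and
# the file in progress is finished first, so the total can overshoot a bit.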
Happy Leeching :)