Here's a simple command that makes a mirrored clone of a site (-m = mirror, -k = convert links):
wget -mk --wait=9 --limit-rate=200K http://www.example.com/
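For reference, -m is shorthand for a bundle of mirroring options; per the wget manual it is equivalent to -r -N -l inf --no-remove-listing, so the command above can be spelled out as (same example URL):
wget --recursive --timestamping --level=inf --no-remove-listing \
    --convert-links --wait=9 --limit-rate=200K http://www.example.com/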
Here's a more complex wget command; an explanation of each option follows. For full details, see http://www.gnu.org/software/wget/manual/wget.html
wget \
    --recursive \
    --no-clobber \
    --page-requisites \
    --html-extension \
    --convert-links \
    --restrict-file-names=windows \
    --domains example.com \
    --no-parent \
    --wait=9 \
    --limit-rate=200K \
    --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36" \
    --reject=mov,pdf \
    --directory-prefix=./LOCAL-DIR \
    www.example.com/tutorials/html/
To restart a download that only partially finished, use wget -c.
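For example (hypothetical file URL), this resumes an interrupted single-file download from where it left off rather than starting over:
wget -c http://www.example.com/files/big-archive.tar.gz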
--recursive: download the entire Web site.
--no-clobber: don't overwrite existing files (useful if the download is interrupted and resumed; rerunning the same command skips files already on disk).
--page-requisites: get all the elements that compose the page (images, CSS and so on).
--html-extension: save files with the .html extension.
--convert-links: convert links so that they work locally, off-line.
--restrict-file-names=windows: modify filenames so that they will work in Windows as well.
--domains example.com: don't follow links outside example.com.
--no-parent: don't ascend to the parent directory when retrieving recursively.
The next two options throttle the download so that you don't get blacklisted by the site:
--wait: wait the specified number of seconds between retrievals.
--limit-rate: limit how much of the server's bandwidth you use.
--user-agent: identify as a browser rather than as wget, since some servers block wget's default user agent.
--reject: skip the listed file types; here, .mov and .pdf files.
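If the site still throttles or blocks you, wget's --random-wait option varies the pause between requests around the --wait value, which makes the traffic look less mechanical. A sketch, reusing the simple mirror command from above:
wget -mk --wait=9 --random-wait --limit-rate=200K http://www.example.com/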