
The Ultimate Wget Download Guide With 15 Examples November 10, 2010

Posted by Tournas Dimitrios in Linux.

Your browser does a good job of fetching web documents and displaying them, but there are times when you need an extra-strength download manager to get the tougher HTTP jobs done. Wget, a versatile, old-school Unix program, is a highly hackable, handy little tool that can take care of all your downloading needs. Whether you want to mirror an entire web site, automatically download music or movies from a set of favorite weblogs, or transfer huge files painlessly over a slow or intermittent network connection, Wget is for you.

Wget, the “non-interactive network retriever,” is called from the command line. The format of a Wget command is:

wget [option]... [URL]...

The URL is the address of the file(s) you want Wget to download. The magic in this little tool is the long menu of options available that make some really neat downloading tasks possible. Here are 15 examples of what you can do with Wget and a few dashes and letters in the [option] part of the command.

Download Single File with wget :
wget http://www.openss7.org/repos/tarballs/strx25-0.9.2.1.tar.bz2
The previous example downloads a single file from the internet and stores it in the current directory. While downloading, it shows a progress bar with the following information: percentage of the download completed, total bytes downloaded so far, current download speed, and remaining download time.
Download and Store With a Different File Name Using wget -O :
By default, wget takes the file name from whatever follows the last forward slash in the URL, which is not always appropriate. The following example downloads and stores the file (the URL is quoted so that the shell does not interpret the ? character):
wget "http://www.vim.org/scripts/download_script.php?src_id=7701"
It stores the file under the name download_script.php?src_id=7701. To correct this, specify the output file name with the -O option:
wget -O taglist.zip "http://www.vim.org/scripts/download_script.php?src_id=7701"
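To see where the awkward default name comes from, here is a minimal sketch of the rule described above: keep whatever follows the last forward slash. It only approximates the behaviour as stated in this post (wget itself handles more cases, such as URLs that end in a slash), and the variable names are illustrative.

```shell
#!/bin/sh
# Approximate wget's default output name: everything after the last "/".
# This is a simplification for illustration, not wget's full naming logic.
url='http://www.vim.org/scripts/download_script.php?src_id=7701'

# ${url##*/} strips the longest prefix that ends in "/".
default_name=${url##*/}

printf '%s\n' "$default_name"   # prints download_script.php?src_id=7701
```

Because the query string ends up in the name, passing -O with a clean file name is usually the simpler choice for URLs like this one.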
Specify Download Speed / Download Rate Using wget --limit-rate :
By default, wget tries to use all the available bandwidth. This might not be acceptable when you are downloading huge files on production servers. To avoid that, limit the download speed with the --limit-rate option. In the following example, the download speed is limited to 200k (roughly 200 kilobytes per second):
wget --limit-rate=200k http://www.openss7.org/repos/tarballs/strx25-0.9.2.1.tar.bz2
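Before throttling a big download, it helps to know roughly how long it will take at the chosen rate. The sketch below is plain arithmetic, not a wget feature: eta_seconds is a hypothetical helper, and it assumes that "k" means 1024 bytes per second and that the limit is held exactly, which in practice is only approximately true.

```shell
#!/bin/sh
# Estimate how long a download takes at a given --limit-rate value.
# Assumption: "k" means 1024 bytes per second; protocol overhead is ignored.
eta_seconds() {
    size_bytes=$1   # file size in bytes
    rate_k=$2       # the number given to --limit-rate, in k (e.g. 200)
    echo $(( size_bytes / (rate_k * 1024) ))
}

# A 100 MiB file (104857600 bytes) at 200k:
eta_seconds 104857600 200   # prints 512 (seconds, about 8.5 minutes)
```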
Continue the Incomplete Download Using wget -c :
Restart a download that was interrupted partway through with the -c option, as shown below.
wget -c http://www.openss7.org/repos/tarballs/strx25-0.9.2.1.tar.bz2

This is very helpful when a very large download gets interrupted. Instead of starting the whole download again, you can resume from where it stopped with the -c option.

Note: If you restart an interrupted download without -c, wget automatically appends .1 to the file name, because a file with the original name already exists. If a file ending in .1 also exists, it saves the download with .2 at the end, and so on.
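The numbering rule in the note can be sketched in a few lines of shell. This is only an illustration of the behaviour described above, not wget's own code, and next_free_name is a hypothetical helper name.

```shell
#!/bin/sh
# Print the name a new download would get under the rule described above:
# the original name if it is free, otherwise name.1, name.2, and so on.
next_free_name() {
    name=$1
    if [ ! -e "$name" ]; then
        printf '%s\n' "$name"
        return
    fi
    n=1
    while [ -e "$name.$n" ]; do
        n=$((n + 1))
    done
    printf '%s\n' "$name.$n"
}
```

For example, in a directory that already contains strx25-0.9.2.1.tar.bz2 and strx25-0.9.2.1.tar.bz2.1, calling next_free_name strx25-0.9.2.1.tar.bz2 prints the name with .2 at the end. Using -c avoids collecting these numbered partial copies.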
Download in the Background Using wget -b :
For a huge download, put the download in the background with the -b option, as shown below. It starts the download and gives the shell prompt back to you.
wget -b http://www.openss7.org/repos/tarballs/strx25-0.9.2.1.tar.bz2
You can check the status of the download at any time with tail -f:
tail -f wget-log
Mask User Agent and Display wget like Browser Using wget --user-agent :
Some websites refuse to serve a page when they detect that the user agent is not a browser. You can mask the user agent with the --user-agent option so that wget presents itself as a browser, as shown below.
wget --user-agent="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.3) Gecko/2008092416 Firefox/3.0.3" URL-TO-DOWNLOAD
Test Download URL Using wget --spider :
Before relying on a scheduled download, check that it will work at the scheduled time. To do so, copy the command line exactly from the schedule and add the --spider option, which checks that the URL is reachable without downloading it.
wget --spider DOWNLOAD-URL

You can use the spider option in the following scenarios:

  • Check a URL before scheduling a download.
  • Monitor whether a website is available at certain intervals.
  • Check a list of pages from your bookmarks and find out which pages still exist.
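The bookmark-checking scenario could look roughly like the sketch below. It assumes a plain text file with one URL per line, relies on wget returning a non-zero exit status when a URL cannot be retrieved, and uses check_urls as a hypothetical helper name.

```shell
#!/bin/sh
# Report which URLs in a list file still respond, using wget --spider.
# Assumption: the list file holds one URL per line.
check_urls() {
    list_file=$1
    while IFS= read -r url; do
        [ -n "$url" ] || continue      # skip blank lines
        # -q keeps wget quiet; only its exit status is used here.
        if wget --spider -q "$url" < /dev/null; then
            printf 'OK    %s\n' "$url"
        else
            printf 'FAIL  %s\n' "$url"
        fi
    done < "$list_file"
}
```

Calling check_urls bookmarks.txt (a hypothetical file name) prints one OK or FAIL line per URL. The same function could be run from a scheduler to monitor a site at intervals.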

Increase Total Number of Retry Attempts Using wget --tries :
If the internet connection is unreliable and the file is large, the download may fail partway through. By default, wget retries 20 times to complete the download. If needed, you can increase the number of retry attempts with the --tries option, as shown below.
wget --tries=75 DOWNLOAD-URL
Download Multiple Files / URLs Using wget -i :
First, store all the URLs to download in a text file, one per line:
$ cat > download-file-list.txt
URL1
URL2
URL3
URL4

Next, pass download-file-list.txt as the argument to the -i option, as shown below.
wget -i download-file-list.txt
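In a script, the list file can also be written without typing into cat interactively. The sketch below uses a here-document; URL1 to URL4 are the same placeholders as above and would need to be replaced with real addresses before running wget.

```shell
#!/bin/sh
# Write the URL list non-interactively with a here-document.
# URL1..URL4 are placeholders, exactly as in the example above.
cat > download-file-list.txt <<'EOF'
URL1
URL2
URL3
URL4
EOF

# Then hand the list to wget (commented out because the URLs are placeholders):
# wget -i download-file-list.txt
```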
Download a Full Website Using wget --mirror :
The following command line downloads a full website and makes it available for local viewing.
wget --mirror -p --convert-links -P ./LOCAL-DIR WEBSITE-URL

  • --mirror : turn on options suitable for mirroring.
  • -p : download all files that are necessary to properly display a given HTML page.
  • --convert-links : after the download, convert the links in the documents for local viewing.
  • -P ./LOCAL-DIR : save all the files and directories to the specified directory.
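For readers who want to see what --mirror switches on, the wget manual describes it as currently equivalent to -r -N -l inf --no-remove-listing (recursion, time-stamping, unlimited depth, and keeping FTP directory listings). The sketch below spells the mirror command out in that long-hand form; mirror_site is a hypothetical helper name, and both arguments are placeholders.

```shell
#!/bin/sh
# Long-hand form of the mirror command from this section.
# Per the wget manual, --mirror is currently equivalent to:
#   -r -N -l inf --no-remove-listing
mirror_site() {
    local_dir=$1    # where to save the copy, e.g. ./LOCAL-DIR
    site_url=$2     # the site to mirror, e.g. WEBSITE-URL
    wget -r -N -l inf --no-remove-listing \
         -p --convert-links -P "$local_dir" "$site_url"
}
```

Writing the options out like this makes it easier to adjust one of them, for example replacing -l inf with a fixed depth, if a full mirror turns out to be too large.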

Reject Certain File Types while Downloading Using wget --reject :
If you have found a useful website but don’t want to download its images, you can exclude them by file type, as shown below for GIF files. To download a whole site this way, combine the option with -r or --mirror.
wget --reject=gif WEBSITE-TO-BE-DOWNLOADED
Log Messages to a Log File Instead of stderr Using wget -o :
Use the -o option when you want the log messages written to a log file instead of the terminal.
wget -o download.log DOWNLOAD-URL
Quit Downloading When it Exceeds Certain Size Using wget -Q :
When you want downloading to stop once the total exceeds 5 MB, use the following wget command line.
wget -Q5m -i FILE-WHICH-HAS-URLS

Note: The quota has no effect when you download a single URL; a single file is always downloaded in full, regardless of the quota size. The quota applies only to recursive downloads and to URLs read from an input file with -i.
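As a quick check of what a quota value means in bytes, the sketch below converts the suffixed form into a plain number. It assumes that k and m stand for multiples of 1024, and quota_to_bytes is a hypothetical helper, not part of wget.

```shell
#!/bin/sh
# Convert a quota such as "5m" or "300k" into bytes.
# Assumption: k = 1024 bytes and m = 1024 * 1024 bytes.
quota_to_bytes() {
    value=$1
    case $value in
        *k) echo $(( ${value%k} * 1024 )) ;;
        *m) echo $(( ${value%m} * 1024 * 1024 )) ;;
        *)  echo "$value" ;;           # no suffix: already in bytes
    esac
}

quota_to_bytes 5m   # prints 5242880
```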
Download Only Certain File Types Using wget -r -A :
You can use this in the following situations:

  • Download all images from a website
  • Download all videos from a website
  • Download all PDF files from a website

wget -r -A.pdf http://url-to-webpage-with-pdfs/
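The -A option also accepts a comma-separated list, which covers the "all images" case in one run. The sketch below wraps that in a small function; fetch_types is a hypothetical helper name, and --no-parent is an addition not used in the example above that keeps the recursive download from climbing above the starting directory.

```shell
#!/bin/sh
# Download several file types in one recursive run.
# -A takes a comma-separated list of suffixes to accept.
# --no-parent (an addition to the example above) stops wget from
# ascending to the parent directory while recursing.
fetch_types() {
    accept_list=$1   # e.g. 'jpg,jpeg,png,gif'
    page_url=$2      # starting page for the recursive download
    wget -r --no-parent -A "$accept_list" "$page_url"
}
```

A call such as fetch_types 'jpg,jpeg,png,gif' followed by the page URL would then fetch only files with those suffixes.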
FTP Download With wget :
You can use wget to perform FTP downloads, as shown below.
Anonymous FTP download using wget:
wget ftp-url
FTP download using wget with username and password authentication:
wget --ftp-user=USERNAME --ftp-password=PASSWORD DOWNLOAD-URL

Comments»

1. giwrgos - May 17, 2013

Good evening! Very useful article… But how can I use wget to download a flash/streaming video, such as a YouTube one, when I have its address? I tried: wget -r -http://www.youtube.com/watch?v=ydvmIOhmmqw … It downloads only the .html file, not the video that plays. Could you send me a command I can use to download a specific streaming video? Thank you for your time…

tournasdimitrios1 - May 18, 2013

Hi George ,
As a reminder , the only Language spoken/written on this blog is English . As my audience is 95% from locations spread all over the globe , I had to adopt a commonly understandable Language . Please accept this rule as an entitlement for all visitors of this Blog .

I’ll translate your question :

How can I use wget to download a flash / streaming video, such as one from YouTube, when I have the URL? For instance: wget -r -http://www.youtube.com/watch?v=ydvmIOhmmqw … only downloads the .html file, and not the video that plays …

You can’t use your browser’s URL with wget to download videos directly from YouTube, because the address is just the web page of the video, not the video file. A simple method is to use an online service (http://en.savefrom.net/).
There are also Python/Perl based scripts that do the job very well (https://github.com/rg3/youtube-dl, written in Python). Alternatively, use a browser plugin (download helper) to reveal the actual URL of the video.
Chrome’s web-development console (CTRL + SHIFT + I -> console tab) could also be used to discover the URI of downloaded media; this option is a bit trickier, though.

2. giwrgos - May 19, 2013

First of all, I would like to apologize for not using English, as I hadn’t read the “rules for visitors”. Mea culpa… It would be great if wget could download flash videos directly, without the use of a browser plugin (I think only Firefox has a really reliable one, like FlashGot)… It is by far the best download manager, although it doesn’t achieve maximum download speeds… I’ll try the suggested solutions! Thank you for your quick reply, and keep up the good work!

tournasdimitrios1 - May 19, 2013

@George
You are welcome 🙂

