Difference between revisions of "Wget"

From ArchWiki
m (Move to sub category.)
(Installing: wget is not part of base)
 
(15 intermediate revisions by 10 users not shown)
Line 1: Line 1:
 +
[[Category:Internet applications]]
 +
[[ar:Wget]]
 
[[es:Wget]]
 
[[es:Wget]]
[[Category:Internet Applications]]
+
[[ja:Wget]]
GNU Wget is a free software package for retrieving files using HTTP, HTTPS and FTP, the most widely-used Internet protocols. It is a non-interactive commandline tool, so it may easily be called from scripts, cron jobs, terminals without X-Windows support, etc. [[http://www.gnu.org/software/wget/ source]]
+
[http://www.gnu.org/software/wget/ GNU Wget] is a free software package for retrieving files using HTTP, HTTPS and FTP, the most widely-used Internet protocols. It is a non-interactive commandline tool, so it may easily be called from scripts, cron jobs, terminals without X-Windows support, etc.
  
 
==Installing==
 
==Installing==
wget is normally installed as part of the base setup. If not present, install the {{Pkg|wget}} package using [[pacman]].
+
Install the {{Pkg|wget}} package using [[pacman]]. The git version is present in the AUR by the name {{AUR|wget-git}}.
  
 
==Configuring==
 
==Configuring==
Line 34: Line 36:
 
===pacman integration===
 
===pacman integration===
 
To have [[pacman]] automatically use Wget and a proxy with authentication, place the Wget command into {{ic|/etc/pacman.conf}}, in the {{Ic|[options]}} section:
 
To have [[pacman]] automatically use Wget and a proxy with authentication, place the Wget command into {{ic|/etc/pacman.conf}}, in the {{Ic|[options]}} section:
  XferCommand = /usr/bin/wget --proxy-user "domain\user" --proxy-password="password" --passive-ftp -c -O %o %u
+
  XferCommand = /usr/bin/wget --proxy-user "domain\user" --proxy-password="password" --passive-ftp -q --show-progress -c -O %o %u
 
{{Warning|be aware that storing passwords in plain text is not safe. Make sure that only root can read this file with {{Ic|chmod 600 /etc/pacman.conf}}.}}
 
{{Warning|be aware that storing passwords in plain text is not safe. Make sure that only root can read this file with {{Ic|chmod 600 /etc/pacman.conf}}.}}
 +
 +
==Usage==
 +
This section explains some of the use case scenarios for Wget.
 +
 +
===Basic usage===
 +
 +
One of the most basic and common use cases for Wget is to download a file from the internet. For example, to download [https://upload.wikimedia.org/wikipedia/commons/f/fb/Blue_Wildebeest%2C_Ngorongoro.jpg a picture of a gnu from Wikipedia], you can type:
 +
 +
<nowiki>$ wget https://upload.wikimedia.org/wikipedia/commons/f/fb/Blue_Wildebeest%2C_Ngorongoro.jpg</nowiki>
 +
 +
When you already know the URL of a file, this can be much faster than downloading it in your browser and then moving it to the correct directory manually. Even this simplest usage suggests ways to automate downloads.
 +
 +
===Archive a complete website===
 +
Wget can archive a complete website whilst preserving the correct link destinations by changing absolute links to relative links.
 +
 +
<nowiki>$ wget -np -r -k 'http://your-url-here'</nowiki>

Latest revision as of 19:52, 27 February 2016

GNU Wget is a free software package for retrieving files using HTTP, HTTPS and FTP, the most widely-used Internet protocols. It is a non-interactive commandline tool, so it may easily be called from scripts, cron jobs, terminals without X-Windows support, etc.

Installing

Install the wget package using pacman. The git version is available in the AUR as wget-git.

Configuring

Configuration is performed in /etc/wgetrc. Not only is the default configuration file well documented; altering it is seldom necessary. See the man page for more intricate options.
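As a minimal sketch of what such a configuration can look like, the following writes a few common settings to a separate file instead of editing /etc/wgetrc. The file path and the chosen values are assumptions for illustration only; tries, timeout and continue are wgetrc commands, and the WGETRC environment variable tells Wget which user configuration file to load.

```shell
# Hypothetical per-user settings file; the path and values are illustrative.
wgetrc_file="${TMPDIR:-/tmp}/wgetrc.example"

cat > "$wgetrc_file" <<'EOF'
# Retry each download up to 3 times
tries = 3
# Give up on a stalled connection after 30 seconds
timeout = 30
# Resume partially downloaded files
continue = on
EOF

# Point Wget at the file for a single invocation (URL is a placeholder):
#   WGETRC="$wgetrc_file" wget URL
cat "$wgetrc_file"
```

Keeping overrides in a separate file leaves the documented system-wide defaults untouched.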

FTP automation

Normally, SSH is used to securely transfer files across a network. However, FTP is lighter on resources than scp or rsync over SSH. FTP is not as secure, but when transferring large amounts of data inside a firewall-protected environment on CPU-bound systems, using FTP can prove beneficial.

$ wget ftp://root:somepassword@10.13.X.Y//ifs/home/test/big/"*.tar"

3,562,035,200 74.4M/s   in 47s

In this case, Wget transferred a 3.3 GiB file at a rate of 74.4 MB/s.

In short, this procedure is:

  • scriptable
  • faster than ssh
  • easily used by languages that can substitute string variables
  • globbing capable
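The scripting and globbing points can be sketched as follows. The helper name build_ftp_url, the user, the host and the directory are all hypothetical; the sketch only assembles and prints the URL, and the actual transfer is left as a comment. Credentials are assumed to come from elsewhere (for example ~/.netrc) rather than being embedded in the URL.

```shell
# Hypothetical helper: assemble an FTP URL with a glob pattern from variables.
build_ftp_url() {
    # $1 = user, $2 = host, $3 = absolute remote directory, $4 = glob pattern
    printf 'ftp://%s@%s/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Example values are placeholders, not real systems.
url=$(build_ftp_url backup 192.0.2.10 /srv/archive '*.tar')
echo "$url"

# Quote the URL so the local shell does not expand the pattern;
# Wget then performs the globbing on the FTP server side:
#   wget "$url"
```

Passing the directory with its leading slash yields the same double-slash form used in the command above.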

Proxy

Wget uses the standard proxy environment variables. See: Proxy settings

To use the proxy authentication feature:

$ wget --proxy-user "DOMAIN\USER" --proxy-password "PASSWORD" URL

Proxies that use HTML authentication forms are not covered.
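A minimal sketch of setting the standard proxy environment variables for the current shell session follows. The proxy host and port are hypothetical placeholders; substitute the values for your network.

```shell
# Hypothetical proxy endpoint; replace with your own.
proxy_host="proxy.example.org"
proxy_port=3128

# Wget reads these variables for the corresponding protocols.
export http_proxy="http://${proxy_host}:${proxy_port}/"
export https_proxy="$http_proxy"
export ftp_proxy="$http_proxy"

# Hosts that should be reached directly, bypassing the proxy.
export no_proxy="localhost,127.0.0.1"

echo "$http_proxy"
```

Because the variables are exported, any Wget process started from this shell inherits them.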

pacman integration

To have pacman automatically use Wget and a proxy with authentication, place the Wget command into /etc/pacman.conf, in the [options] section:

XferCommand = /usr/bin/wget --proxy-user "domain\user" --proxy-password="password" --passive-ftp -q --show-progress -c -O %o %u
Warning: Storing passwords in plain text is not safe. Restrict read access to this file to root with chmod 600 /etc/pacman.conf.
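One way to confirm the resulting permissions is sketched below. It runs against a temporary stand-in file rather than the real /etc/pacman.conf and assumes the stat command from GNU coreutils.

```shell
# Stand-in for /etc/pacman.conf so the check can be tried without root.
conf_copy=$(mktemp)

# Owner may read and write; group and others get no access.
chmod 600 "$conf_copy"

# %a prints the octal mode, %U the owning user (GNU stat).
mode=$(stat -c '%a' "$conf_copy")
owner=$(stat -c '%U' "$conf_copy")
echo "mode=$mode owner=$owner"
```

For the real file, the owner reported should be root.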

Usage

This section explains some of the use case scenarios for Wget.

Basic usage

One of the most basic and common use cases for Wget is to download a file from the internet. For example, to download a picture of a gnu from Wikipedia, you can type:

$ wget https://upload.wikimedia.org/wikipedia/commons/f/fb/Blue_Wildebeest%2C_Ngorongoro.jpg

When you already know the URL of a file, this can be much faster than downloading it in your browser and then moving it to the correct directory manually. Even this simplest usage suggests ways to automate downloads.
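As a sketch of such automation, Wget can read URLs from a file with --input-file and save them under a chosen directory with --directory-prefix. The function name fetch_list, the file name urls.txt, the directory downloads and the DRY_RUN switch are assumptions made for this example.

```shell
fetch_list() {
    # $1 = file with one URL per line, $2 = destination directory
    # With DRY_RUN set, the command is printed instead of executed.
    ${DRY_RUN:+echo} wget --directory-prefix="$2" --input-file="$1"
}

# Show the command that would run, without touching the network.
DRY_RUN=1 fetch_list urls.txt downloads
```

Leaving DRY_RUN unset runs the download for real, which makes the function suitable for a cron job or a larger script.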

Archive a complete website

Wget can archive a complete website whilst preserving the correct link destinations by changing absolute links to relative links.

$ wget -np -r -k 'http://your-url-here'
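The same command can be written with long options, which makes the role of each flag easier to read in scripts: -np is --no-parent (do not ascend above the start directory), -r is --recursive, and -k is --convert-links. The wrapper name archive_site and the DRY_RUN switch are hypothetical and only serve this sketch.

```shell
archive_site() {
    # $1 = start URL
    # With DRY_RUN set, the command is printed instead of executed.
    ${DRY_RUN:+echo} wget --no-parent --recursive --convert-links "$1"
}

# Print the command for the placeholder URL without running it.
DRY_RUN=1 archive_site 'http://your-url-here'
```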