USCAN
Section: User Commands (1)
Updated: Debian Utilities
NAME
uscan - scan/watch upstream sources for new releases of software
SYNOPSIS
uscan [options] [path-to-debian-source-packages ...]
DESCRIPTION
uscan scans the given directories (or the current directory if
none are specified) and all of their subdirectories for packages
containing a control file debian/watch. Parameters are then read
from those control files and upstream ftp or http sites are
inspected for newly available updates (as compared with the upstream
version number retrieved from the debian/changelog file in the same
directory). The newest updates are retrieved (as determined by their
version numbers) and, if specified in the watch file, a program may
then be executed on the newly downloaded source.
The traditional debian/watch files can still be used, but the
current format offers both simpler and more flexible services. We do
not describe the old format here; for its documentation, see the
source code for uscan.
FORMAT of debian/watch files
The following demonstrates the type of entries which can appear in a
debian/watch file. Obviously, not all of these would appear in
one such file; usually, one would have one line for the current
package.
# format version number, currently 3; this line is compulsory!
version=3
# Line continuations are performed with \
# This is the format for an FTP site:
# Full-site-with-pattern [Version [Action]]
ftp://ftp.tex.ac.uk/tex-archive/web/c_cpp/cweb/cweb-(.+)\.tar\.gz \
debian uupdate
# This is the format for an FTP site with regex special characters in
# the filename part
ftp://ftp.worldforge.org/pub/worldforge/libs/Atlas-C++/transitional/Atlas-C\+\+-(.+)\.tar\.gz
# This is the format for an FTP site with directory pattern matching
ftp://ftp.nessus.org/pub/nessus/nessus-([\d\.]+)/src/nessus-core-([\d\.]+)\.tar\.gz
# This can be used if you want to override the PASV setting
# for a specific site
# opts=pasv ftp://.../...
# This is one format for an HTTP site, which is the same
# as the FTP format. uscan starts by downloading the homepage,
# obtained by removing the last component of the URL; in this case,
# http://www.cpan.org/modules/by-module/Text/
http://www.cpan.org/modules/by-module/Text/Text-CSV_XS-(.+)\.tar\.gz
# This is a variant HTTP format which allows direct specification of
# the homepage:
# Homepage Pattern [Version [Action]]
http://www.dataway.ch/~lukasl/amph/amph.html \
files/amphetamine-([\d\.]*)\.tar\.bz2
# This one shows that recursive directory scanning works, in either of
# two forms, as long as the website can handle requests of the form
# http://site/inter/mediate/dir/
http://tmrc.mit.edu/mirror/twisted/Twisted/(\d\.\d)/ \
Twisted-([\d\.]*)\.tar\.bz2
http://tmrc.mit.edu/mirror/twisted/Twisted/(\d\.\d)/Twisted-([\d\.]*)\.tar\.bz2
# For maximum flexibility with upstream tarball formats, use this:
http://example.com/example-(\d[\d.]*)\.(?:zip|tgz|tbz2|txz|tar\.(?:gz|bz2|xz))
# qa.debian.org runs a redirector which allows a simpler form of URL
# for SourceForge based projects. The format below will automatically
# be rewritten to use the redirector.
http://sf.net/audacity/audacity-src-(.+)\.tar\.gz
# For GitHub projects you can use the tags or releases page. Since the archive
# URLs use only the version as the name, it is recommended to use a
# filenamemangle to adjust the name of the downloaded file:
opts="filenamemangle=s/(?:.*/)?v?(\d[\d\.]*)\.tar\.gz/<project>-$1.tar.gz/" \
https://github.com/<user>/<project>/tags (?:.*/)?v?(\d[\d\.]*)\.tar\.gz
# For Google Code projects you should use the downloads page like this:
https://code.google.com/p/<project>/downloads/list?can=1 \
.*/<project>-(\d[\d.]*)\.tar\.gz
# This is the format for a site which has funny version numbers;
# the parenthesised groups will be joined with dots to make a
# sanitised version number
http://www.site.com/pub/foobar/foobar_v(\d+)_(\d+)\.tar\.gz
# This is another way of handling site with funny version numbers,
# this time using mangling. (Note that multiple groups will be
# concatenated before mangling is performed, and that mangling will
# only be performed on the basename version number, not any path
# version numbers.)
opts="uversionmangle=s/^/0.0./" \
ftp://ftp.ibiblio.org/pub/Linux/ALPHA/wine/development/Wine-(.+)\.tar\.gz
# Similarly, the upstream part of the Debian version number can be
# mangled:
opts=dversionmangle=s/\+dfsg\d*$// \
http://some.site.org/some/path/foobar-(.+)\.tar\.gz
# The filename is found by taking the last component of the URL and
# removing everything after any '?'. If this would not make a usable
# filename, use filenamemangle. For example,
# <A href="http://foo.bar.org/download/?path=&download=foo-0.1.1.tar.gz">
# could be handled as:
# opts=filenamemangle=s/.*=(.*)/$1/ \
# http://foo.bar.org/download/\?path=&download=foo-(.+)\.tar\.gz
#
# <A href="http://foo.bar.org/download/?path=&download_version=0.1.1">
# could be handled as:
# opts=filenamemangle=s/.*=(.*)/foo-$1\.tar\.gz/ \
# http://foo.bar.org/download/\?path=&download_version=(.+)
# The option downloadurlmangle can be used to mangle the URL of the file
# to download. This can only be used with http:// URLs. This may be
# necessary if the link given on the web page needs to be transformed in
# some way into one which will work automatically, for example:
# opts=downloadurlmangle=s/prdownload/download/ \
# http://developer.berlios.de/project/showfiles.php?group_id=2051 \
# http://prdownload.berlios.de/softdevice/vdr-softdevice-(.+)\.tgz
Comment lines may be introduced with a `#' character. Continuation
lines may be indicated by terminating a line with a backslash
character.
The first (non-comment) line of the file must begin `version=3'. This
allows for future extensions without having to change the name of the
file.
There are two possibilities for the syntax of an HTTP watch file line,
and only one for an FTP line. We begin with the common (and simpler)
format. We describe the optional opts=... first field below, and
ignore it in what follows.
The first field gives the full pattern of URLs being searched for. In
the case of an FTP site, the directory listing for the requested
directory will be requested and this will be scanned for files
matching the basename (everything after the trailing `/'). In the
case of an HTTP site, the URL obtained by stripping everything after
the trailing slash will be downloaded and searched for hrefs (links of
the form <a href=...>) to either the full URL pattern given, or to the
absolute part (everything without the http://host.name/ part), or to
the basename (just the part after the final `/'). Everything up to
the final slash is taken as a verbatim URL, as long as there are no
parentheses (`(' and `)') in this part of the URL; if there are, the
directory name will be matched in the same way as the final component
of the URL as described below. (Note that regex metacharacters such
as `+' are regarded literally unless they are in a path component
containing parentheses; see the Atlas-C++ example above. Also, the
parentheses must match within each path component.)
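The three href forms accepted for an HTTP line can be illustrated with
a short Python sketch. This is not uscan's own code (uscan is written
in Perl): the helper names href_patterns and match_href are
hypothetical, Python's re module stands in for Perl regexps, and the
sketch assumes the directory part of the URL contains no parentheses.

```python
import re

def href_patterns(watch_url):
    """Return the full-URL, absolute-path and basename patterns for a watch URL.

    Assumes the directory part contains no parentheses, so it is taken
    verbatim (escaped) and only the basename is treated as a regexp.
    """
    m = re.match(r"(\w+://[^/]+)(/(?:.*/)?)([^/]*)$", watch_url)
    if m is None:
        raise ValueError("unsupported watch URL: " + watch_url)
    site, directory, basename = m.groups()
    return [
        re.escape(site + directory) + basename,  # full URL
        re.escape(directory) + basename,         # absolute path on the site
        basename,                                # basename only
    ]

def match_href(watch_url, href):
    """Return the match for the first pattern matching the whole href, or None."""
    for pattern in href_patterns(watch_url):
        m = re.fullmatch(pattern, href)
        if m is not None:
            return m
    return None

WATCH = r"http://www.cpan.org/modules/by-module/Text/Text-CSV_XS-(.+)\.tar\.gz"
```

With this sketch, an href of Text-CSV_XS-1.12.tar.gz (a made-up version
number) is accepted in any of the three forms, while
Text-CSV_XS-1.12.tar.gz.sig is rejected because the match is anchored.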
The pattern (after the final slash) is a Perl regexp (see
perlre(1) for details of these). You need to make the pattern
so tight that it matches only the upstream software you are interested
in and nothing else. Also, the pattern will be anchored at the
beginning and at the end, so it must match the full filename. (Note
that for HTTP URLs, the href may include the absolute path or full
site and path and still be accepted.) The pattern must contain at
least one Perl group as explained in the next paragraph.
Having got a list of `files' matching the pattern, their version
numbers are extracted by taking the parts matching the Perl regexp
groups, demarcated by `(...)', joining them with `.' as a separator,
and using the result as the version number of the file. The version
number will then be mangled if required by the uversionmangle option
described below. Finally, the file versions are then compared to find
the one with the greatest version number, as determined by dpkg
--compare-versions. Note that if you need Perl groups which are
not to be used in the version number, either use `(?:...)' or use the
uversionmangle option to clean up the mess!
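The extraction step described above might be sketched in Python as
follows. The function names are hypothetical, and simple_key is only a
simplified stand-in that compares the numeric runs of a version; it
does not implement the ordering of dpkg --compare-versions. Skipping
groups that did not take part in the match is likewise an assumption.

```python
import re

def extract_version(pattern, filename):
    """Join the capture groups of an anchored match with '.'; None if no match."""
    m = re.fullmatch(pattern, filename)
    if m is None:
        return None
    # Assumption: groups that did not take part in the match are skipped.
    return ".".join(g for g in m.groups() if g is not None)

def simple_key(version):
    """Simplified stand-in for dpkg --compare-versions: numeric runs only."""
    return tuple(int(n) for n in re.findall(r"\d+", version))

def newest(pattern, filenames):
    """Return the greatest extracted version, or None if nothing matches."""
    versions = [v for v in (extract_version(pattern, f) for f in filenames) if v]
    return max(versions, key=simple_key) if versions else None

PATTERN = r"foobar_v(\d+)_(\d+)\.tar\.gz"
FILES = ["foobar_v1_9.tar.gz", "foobar_v1_10.tar.gz", "foobar_v1_10.tar.gz.asc"]
```

Here the two groups of foobar_v1_10.tar.gz are joined into 1.10, which
ranks above 1.9; the .asc file is ignored because the pattern must match
the full filename.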
The current (upstream) version can be specified as the second
parameter in the watch file line. If this is debian or absent,
then the current Debian version (as determined by
debian/changelog) is used to determine the current upstream
version. The current upstream version may also be specified by the
command-line option --upstream-version, which specifies the
upstream version number of the currently installed package (i.e., the
Debian version number without epoch and Debian revision). The
upstream version number will then be mangled using the dversionmangle
option if one is specified, as described below. If the newest version
available is newer than the current version, then it is downloaded
into the parent directory, unless the --report or
--report-status option has been used. Once the file has been
downloaded, then a symlink to the file is made from
<package>_<version>.orig.tar.{gz|bz2|lzma|xz} as described by the help
for the --symlink option.
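As a small illustration of the naming scheme, the following Python
sketch derives the symlink name from a package name, a version and the
downloaded filename. The function orig_name is hypothetical, and the
sketch only covers the four tar suffixes listed above; how uscan treats
other archive types is not modelled here.

```python
import re

def orig_name(package, version, downloaded):
    """Return <package>_<version>.orig.tar.<ext>, or None for other suffixes."""
    m = re.search(r"\.tar\.(gz|bz2|lzma|xz)$", downloaded)
    if m is None:
        return None
    return "{}_{}.orig.tar.{}".format(package, version, m.group(1))
```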
Finally, if a third parameter (an action) is given in the watch file
line, this is taken as the name of a command, and the command
command --upstream-version version filename
is executed, using either the original file or the symlink name. A
common such command would be uupdate(1). (Note that the calling
syntax was slightly different when using a watch file without a
`version=...' line; there the command executed was `command filename
version'.) If the command is uupdate, then the --no-symlink option
is given to uupdate as a first option, since any requested symlinking
will already be done by uscan.
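The calling convention for the action can be sketched as the
construction of an argument list. The function action_argv is
hypothetical, and testing for the literal name uupdate is an assumption
made for illustration; uscan's actual test may differ.

```python
def action_argv(command, version, filename):
    """Build the argument list for the action of a version=3 watch file line."""
    argv = [command]
    if command == "uupdate":
        # uscan has already handled any symlinking itself.
        argv.append("--no-symlink")
    argv += ["--upstream-version", version, filename]
    return argv
```

A list built this way could be passed to subprocess.run() from the
package's source directory.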
The alternative version of the watch file syntax for HTTP URLs is as
follows. The first field is a homepage which should be downloaded and
then searched for hrefs matching the pattern given in the second
field. (Again, this pattern will be anchored at the beginning and the
end, so it must match the whole href. If you want to match just the
basename of the href, you can use a pattern like
".*/name-(.+)\.tar\.gz" if you know that there is a full URL, or
better still: "(?:.*/)?name-(.+)\.tar\.gz" if there may or may not
be. Note the use of (?:...) to avoid making a backreference.) If any
of the hrefs in the homepage which match the (anchored) pattern are
relative URLs, they will be taken as being relative to the base URL of
the homepage (i.e., with everything after the trailing slash removed),
or relative to the base URL specified in the homepage itself with a
<base href="..."> tag. The third and fourth fields are the version
number and action fields as before.
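A rough Python sketch of this alternative syntax follows: hrefs found
on the homepage are matched against the anchored pattern and relative
ones are resolved against the homepage, or against a <base href="...">
value if one is given. The function names, the version number 0.8.10
and the mirror URL used for testing are all made up for illustration;
urljoin from the Python standard library stands in for whatever URL
handling uscan performs.

```python
import re
from urllib.parse import urljoin

def resolve_href(homepage, href, base_href=None):
    """Resolve href against the <base href> value if given, else the homepage."""
    return urljoin(base_href or homepage, href)

def find_downloads(homepage, pattern, hrefs, base_href=None):
    """Return (version, absolute URL) pairs for hrefs matching the whole pattern."""
    found = []
    for href in hrefs:
        m = re.fullmatch(pattern, href)
        if m is not None:
            version = ".".join(g for g in m.groups() if g is not None)
            found.append((version, resolve_href(homepage, href, base_href)))
    return found

HOMEPAGE = "http://www.dataway.ch/~lukasl/amph/amph.html"
PATTERN = r"files/amphetamine-([\d\.]*)\.tar\.bz2"
HREFS = ["index.html", "files/amphetamine-0.8.10.tar.bz2"]
```

The pattern "(?:.*/)?name-(.+)\.tar\.gz" mentioned above behaves the
same way under re.fullmatch: it accepts both a bare basename and a full
URL ending in that basename.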
PER-SITE OPTIONS
A watch file line may be prefixed with `opts=options', where options
is a comma-separated list of options. The whole options string may be
enclosed in double quotes, which is necessary if options contains any
spaces. The recognised options are as follows:
- active and passive (or pasv)
-
If used on an FTP line, these override the choice of whether to use
PASV mode or not, and force the use of the specified mode for this
site.
- uversionmangle=rules
-
This is used to mangle the upstream version number as matched by the
ftp://... or http:// rules as follows. First, the rules string
is split into multiple rules at every `;'. Then the upstream version
number is mangled by applying rule to the version, in a similar
way to executing the Perl command:
$version =~ rule;
for each rule. Thus, suitable rules might be `s/^/0./' to prepend
`0.' to the version number and `s/_/./g' to change underscores into
periods. Note that the rule string may not contain commas;
this should not be a problem.
rule may only use the 's', 'tr' and 'y' operations. When the 's'
operation is used, only the 'g', 'i' and 'x' flags are available and
rule may not contain any expressions which have the potential to
execute code (i.e. the (?{}) and (??{}) constructs are not supported).
If the 's' operation is used, the replacement can contain
backreferences to expressions within parentheses in the matching regexp,
like `s/-alpha(\d*)/.a$1/'. These backreferences must use the
`$1' syntax, as the `\1' syntax is not supported.
- dversionmangle=rules
-
This is used to mangle the Debian version number of the currently
installed package in the same way as the uversionmangle option.
Thus, a suitable rule might be `s/\+dfsg\d*$//' to remove a
`+dfsg1' suffix from the Debian version number, or to handle `.pre6'
type version numbers. Again, the rules string may not contain
commas; this should not be a problem.
- versionmangle=rules
-
This is a syntactic shorthand for
uversionmangle=rules,dversionmangle=rules, applying the
same rules to both the upstream and Debian version numbers.
- filenamemangle=rules
-
This is used to mangle the filename with which the downloaded file
will be saved, and is parsed in the same way as the
uversionmangle option. Examples of its use are given in the
examples section above.
- downloadurlmangle=rules
-
This is used to mangle the URL to be used for the download. The URL
is first computed based on the homepage downloaded and the pattern
matched, then the version number is determined from this URL.
Finally, any rules given by this option are applied before the actual
download attempt is made. An example of its use is given in the
examples section above.
- pgpsigurlmangle=rules
-
If present, the supplied rules will be applied to the downloaded URL
(after any downloadurlmangle rules, if present) to craft a new URL
that will be used to fetch the detached OpenPGP signature file for the
upstream tarball. Some common rules might be `s/$/.asc/' or
`s/$/.pgp/' or `s/$/.gpg/'. This signature must be made
by a key found in the keyring debian/upstream/signing-key.pgp or
the armored keyring debian/upstream/signing-key.asc. If it is not
valid, or not made by one of the listed keys, uscan will report an
error.
- repacksuffix=suffix
-
If the upstream sources are modified because debian/copyright contains
the Files-Excluded field, suffix will be appended to the upstream
version of the repacked tar archive. Common suffixes might be +dfsg1 to
indicate the removal of non-DFSG code or +ds1 to indicate the removal of
embedded (DFSG) code copies.
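The option syntax and the mangling rules can be illustrated with the
following Python sketch. It is a simplified model rather than uscan's
implementation: parse_opts and apply_rules are hypothetical names, only
`s' rules delimited by `/' with no `/' inside the pattern or
replacement are handled (no `tr' or `y'), and Python's re module stands
in for Perl regexps.

```python
import re

def parse_opts(field):
    """Split an opts=... field into a dict mapping option name to value."""
    body = field[len("opts="):]
    if len(body) >= 2 and body[0] == '"' and body[-1] == '"':
        body = body[1:-1]          # strip the optional double quotes
    opts = {}
    for item in body.split(","):   # this split is why rules cannot hold commas
        name, _, value = item.partition("=")
        opts[name] = value
    return opts

def apply_rules(rules, text):
    """Apply ';'-separated s/pattern/replacement/flags rules to text in order."""
    for rule in rules.split(";"):
        m = re.fullmatch(r"s/([^/]*)/([^/]*)/([gix]*)", rule)
        if m is None:
            raise ValueError("rule not handled by this sketch: " + rule)
        pattern, replacement, flags = m.groups()
        re_flags = 0
        if "i" in flags:
            re_flags |= re.IGNORECASE
        if "x" in flags:
            re_flags |= re.VERBOSE

        def expand(match, replacement=replacement):
            # Substitute $1, $2, ... with the corresponding captured text.
            return re.sub(r"\$(\d+)",
                          lambda ref: match.group(int(ref.group(1))) or "",
                          replacement)

        count = 0 if "g" in flags else 1   # count=0 means "replace all"
        text = re.sub(pattern, expand, text, count=count, flags=re_flags)
    return text
```

Under these assumptions, apply_rules("s/_/./g;s/^/0./", "1_2_3") first
turns the underscores into periods and then prepends `0.', and
apply_rules("s/$/.asc/", url) appends `.asc' in the manner of a
pgpsigurlmangle rule.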
Directory name checking
Similarly to several other scripts in the devscripts package, uscan
explores the requested directory trees looking for debian/changelog
and debian/watch files. As a safeguard against stray files causing
potential problems, and in order to promote efficiency, it will
examine the name of the parent directory once it finds the
debian/changelog file, and check that the directory name corresponds
to the package name. It will only attempt to download newer versions
of the package and then perform any requested action if the directory
name matches the package name. Precisely how it does this is
controlled by two configuration file variables
DEVSCRIPTS_CHECK_DIRNAME_LEVEL and DEVSCRIPTS_CHECK_DIRNAME_REGEX, and
their corresponding command-line options --check-dirname-level and
--check-dirname-regex.
DEVSCRIPTS_CHECK_DIRNAME_LEVEL can take the following values:
- 0
-
Never check the directory name.
- 1
-
Only check the directory name if we have had to change directory in
our search for debian/changelog, that is, the directory
containing debian/changelog is not the directory from which
uscan was invoked. This is the default behaviour.
- 2
-
Always check the directory name.
The directory name is checked by testing whether the current directory
name (as determined by pwd(1)) matches the regex given by the
configuration file option DEVSCRIPTS_CHECK_DIRNAME_REGEX or by the
command line option --check-dirname-regex regex. Here
regex is a Perl regex (see perlre(1)), which will be
anchored at the beginning and the end. If regex contains a '/',
then it must match the full directory path. If not, then it must
match the full directory name. If regex contains the string
'PACKAGE', this will be replaced by the source package name, as
determined from the changelog. The default value for the regex is:
'PACKAGE(-.+)?', thus matching directory names such as PACKAGE and
PACKAGE-version.
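The check can be sketched in Python as follows. The function names are
hypothetical, and quoting the package name before substituting it for
PACKAGE is an assumption of this sketch, made so that names such as
libfoo++ are matched literally.

```python
import re

def should_check(level, changed_directory):
    """Decide whether the directory name is checked at the given level (0, 1, 2)."""
    return level == 2 or (level == 1 and changed_directory)

def dirname_matches(directory, package, regex=r"PACKAGE(-.+)?"):
    """Test an absolute directory path against the anchored dirname regex."""
    # Assumption: the package name is quoted so it is matched literally.
    pattern = regex.replace("PACKAGE", re.escape(package))
    if "/" in regex:
        target = directory                                # full path
    else:
        target = directory.rstrip("/").rsplit("/", 1)[-1]  # last component
    return re.fullmatch(pattern, target) is not None
```

With the default regex, /home/me/src/foo and /home/me/src/foo-1.2 match
the package foo, but /home/me/src/foobar does not. The paths are made
up for illustration.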
EXAMPLE
This script will perform a fully automatic upstream update.
#!/bin/sh -e
# called with '--upstream-version' <version> <file>
uupdate "$@"
package=$(dpkg-parsechangelog | sed -n 's/^Source: //p')
cd "../$package-$2"
debuild
Note that we don't call dupload or dput automatically, as
the maintainer should perform sanity checks on the software before
uploading it to Debian.
OPTIONS
- --report, --no-download
-
Only report about available newer versions but do not download anything.
- --report-status
-
Report on the status of all packages, even those which are up-to-date,
but do not download anything.
- --download
-
Report and download. (This is the default behaviour.)
- --destdir
-
Path of directory to which to download. If the specified path is not
absolute, it will be relative to either the current directory or, if
directory scanning is enabled, the package's source directory.
- --force-download
-
Download upstream even if up to date (will not overwrite local files, however).
- --pasv
-
Force PASV mode for FTP connections.
- --no-pasv
-
Do not use PASV mode for FTP connections.
- --timeout N
-
Set timeout to N seconds (default 20 seconds).
- --no-symlink
-
Do not call mk-origtargz.
The following options are passed to mk-origtargz:
-
- --symlink
-
Make orig.tar.gz (with the appropriate extension) symlinks to the
downloaded files.
(This is the default behaviour.)
- --copy
-
Instead of symlinking as described above, copy the downloaded files.
- --rename
-
Instead of symlinking as described above, rename the downloaded files.
- --repack
-
After having downloaded an lzma tar, xz tar, bzip tar or zip archive,
repack it to a gzip tar archive, if required.
The unzip package must be installed in order to repack .zip archives; the
xz-utils package must be installed to repack lzma or xz tar archives.
- --compression [ gzip | bzip2 | lzma | xz ]
-
In the case where the upstream sources are repacked (either because
the --repack option is given or debian/copyright contains the
field Files-Excluded), it is possible to control the compression
method via the parameter (defaults to gzip).
- --copyright-file copyright-file
-
Exclude files mentioned in Files-Excluded in the given copyright file.
This is useful when running uscan not within a source package directory.
- --dehs
-
Use an XML format for output, as required by the DEHS system.
- --no-dehs
-
Use the traditional uscan output format. (This is the default behaviour.)
- --package package
-
Specify the name of the package to check for rather than examining
debian/changelog; this requires the --upstream-version
(unless a version is specified in the watch file)
and --watchfile options as well. Furthermore, no directory
scanning will be done and nothing will be downloaded. This option is
probably most useful in conjunction with the DEHS system (and
--dehs).
- --upstream-version upstream-version
-
Specify the current upstream version rather than examine the watch file
or changelog to determine it. This is ignored if a directory scan is
being performed and more than one watch file is found.
- --watchfile watchfile
-
Specify the watchfile rather than perform a directory scan to
determine it. If this option is used without --package, then
uscan must be called from within the Debian package source tree
(so that debian/changelog can be found simply by stepping up
through the tree).
- --download-version version
-
Specify the version which the upstream release must match in order to be
considered, rather than using the release with the highest version.
- --download-current-version
-
Download the currently packaged version.
- --verbose
-
Give verbose output.
- --no-verbose
-
Don't give verbose output. (This is the default behaviour.)
- --no-exclusion
-
Do not automatically exclude files mentioned in the Files-Excluded
field of debian/copyright.
- --debug
-
Dump the downloaded web pages to stdout for debugging your watch file.
- --check-dirname-level N
-
See the above section Directory name checking for an explanation of
this option.
- --check-dirname-regex regex
-
See the above section Directory name checking for an explanation of
this option.
- --user-agent, --useragent
-
Override the default user agent header.
- --no-conf, --noconf
-
Do not read any configuration files. This can only be used as the
first option given on the command-line.
- --help
-
Give brief usage information.
- --version
-
Display version information.
CONFIGURATION VARIABLES
The two configuration files /etc/devscripts.conf and ~/.devscripts
are sourced by a shell in that order to set configuration variables.
These may be overridden by command line options. Environment variable
settings are ignored for this purpose. If the first command line
option given is --noconf, then these files will not be read. The
currently recognised variables are:
- USCAN_DOWNLOAD
-
If this is set to no, then newer upstream files will not be
downloaded; this is equivalent to the --report or
--no-download options.
- USCAN_PASV
-
If this is set to yes or no, this will force FTP
connections to use PASV mode or not to, respectively. If this is set
to default, then Net::FTP(3) makes the choice (primarily based on
the FTP_PASSIVE environment variable).
- USCAN_TIMEOUT
-
If set to a number N, then set the timeout to N seconds.
This is equivalent to the --timeout option.
- USCAN_SYMLINK
-
If this is set to no, then a pkg_version.orig.tar.{gz|bz2|lzma|xz}
symlink will not be made (equivalent to the --no-symlink
option). If it is set to yes or symlink, then the
symlinks will be made. If it is set to rename, then the files
are renamed (equivalent to the --rename option).
- USCAN_DEHS_OUTPUT
-
If this is set to yes, then DEHS-style output will be used.
This is equivalent to the --dehs option.
- USCAN_VERBOSE
-
If this is set to yes, then verbose output will be given. This
is equivalent to the --verbose option.
- USCAN_USER_AGENT
-
If set, the specified user agent string will be used in place of the
default. This is equivalent to the --user-agent option.
- USCAN_DESTDIR
-
If set, the downloaded files will be placed in this directory. This is
equivalent to the --destdir option.
- USCAN_REPACK
-
If this is set to yes, then after having downloaded a bzip tar,
lzma tar, xz tar, or zip archive, uscan will repack it to a gzip tar.
This is equivalent to the --repack option.
- USCAN_EXCLUSION
-
If this is set to no, files mentioned in the field Files-Excluded
of debian/copyright will be ignored and no exclusion of files will be
tried. This is equivalent to the --no-exclusion option.
EXIT STATUS
The exit status gives some indication of whether a newer version was
found or not; one is advised to read the output to determine exactly
what happened and whether there were any warnings to be noted.
- 0
-
Either --help or --version was used, or for some
watch file which was examined, a newer upstream version was located.
- 1
-
No newer upstream versions were located for any of the watch files
examined.
HISTORY AND UPGRADING
This section briefly describes the backwards-incompatible watch file
features which have been added in each watch file version, and the
first version of the devscripts package which understood them.
- Pre-version 2
-
The watch file syntax was significantly different in those days. Don't
use it. If you are upgrading from a pre-version 2 watch file, you are
advised to read this manpage and to start from scratch.
- Version 2
-
devscripts version 2.6.90: The first incarnation of the current style
of watch files.
- Version 3
-
devscripts version 2.8.12: Introduced the following: correct handling
of regex special characters in the path part, directory/path pattern
matching, version number in several parts, version number mangling.
Later versions have also introduced URL mangling.
If you are upgrading from version 2, the key incompatibility is if you
have multiple groups in the pattern part; whereas only the first one
would be used in version 2, they will all be used in version 3. To
avoid this behaviour, change the non-version-number groups to be
(?:...) instead of a plain (...) group.
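The difference can be illustrated with a short Python sketch. The
function names and the foo pattern are made up, and for simplicity the
sketch assumes that both watch file versions anchor the pattern in the
same way; only the handling of groups is modelled.

```python
import re

def version_v2(pattern, filename):
    """Version 2 behaviour as described above: only the first group is used."""
    m = re.fullmatch(pattern, filename)
    return m.group(1) if m else None

def version_v3(pattern, filename):
    """Version 3 behaviour: all groups are joined with '.'."""
    m = re.fullmatch(pattern, filename)
    return ".".join(g for g in m.groups() if g is not None) if m else None

OLD = r"foo-(\d[\d.]*)-(src|bin)\.tar\.gz"      # second group is not a version
FIXED = r"foo-(\d[\d.]*)-(?:src|bin)\.tar\.gz"  # made non-capturing
```

For foo-1.2-src.tar.gz, the old pattern gives 1.2 under version 2 rules
but 1.2.src under version 3 rules; the pattern with (?:...) gives 1.2
again.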
SEE ALSO
dpkg(1),
mk-origtargz(1),
perlre(1),
uupdate(1),
devscripts.conf(5)
AUTHOR
The original version of uscan was written by Christoph Lameter
<clameter@debian.org>. Significant improvements, changes and bugfixes
were made by Julian Gilbey <jdg@debian.org>. HTTP support was added
by Piotr Roszatycki <dexter@debian.org>. The program was rewritten
in Perl by Julian Gilbey.