Diffstat (limited to 'doc/bitbake-user-manual/bitbake-user-manual-fetching.xml')
-rw-r--r-- doc/bitbake-user-manual/bitbake-user-manual-fetching.xml 622
1 files changed, 622 insertions, 0 deletions
diff --git a/doc/bitbake-user-manual/bitbake-user-manual-fetching.xml b/doc/bitbake-user-manual/bitbake-user-manual-fetching.xml
new file mode 100644
index 000000000..5aa53defc
--- /dev/null
+++ b/doc/bitbake-user-manual/bitbake-user-manual-fetching.xml
@@ -0,0 +1,622 @@
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
+"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
+
+<chapter>
+<title>File Download Support</title>
+
+ <para>
+ BitBake's fetch module is a standalone piece of library code
+ that deals with the intricacies of downloading source code
+ and files from remote systems.
+ Fetching source code is one of the cornerstones of building software.
+ As such, this module forms an important part of BitBake.
+ </para>
+
+ <para>
+ The current fetch module is called "fetch2" and refers to the
+ fact that it is the second major version of the API.
+ The original version is obsolete and removed from the codebase.
+ Thus, in all cases, "fetch" refers to "fetch2" in this
+ manual.
+ </para>
+
+ <section id='the-download-fetch'>
+ <title>The Download (Fetch)</title>
+
+ <para>
+ BitBake takes several steps when fetching source code or files.
+ The fetcher codebase deals with two distinct processes in order:
+ obtaining the files from somewhere (cached or otherwise)
+ and then unpacking those files into a specific location and
+ perhaps in a specific way.
+ Getting and unpacking the files is often followed
+ by patching.
+ Patching, however, is not covered by this module.
+ </para>
+
+ <para>
+ The code to execute the first part of this process, a fetch,
+ looks something like the following:
+ <literallayout class='monospaced'>
+ src_uri = (d.getVar('SRC_URI', True) or "").split()
+ fetcher = bb.fetch2.Fetch(src_uri, d)
+ fetcher.download()
+ </literallayout>
+ This code sets up an instance of the fetch class.
+ The instance uses a space-separated list of URLs from the
+ <link linkend='var-SRC_URI'><filename>SRC_URI</filename></link>
+ variable and then calls the <filename>download</filename>
+ method to download the files.
+ </para>
+
+ <para>
+ The instantiation of the fetch class is usually followed by:
+ <literallayout class='monospaced'>
+ rootdir = d.getVar('WORKDIR', True)
+ fetcher.unpack(rootdir)
+ </literallayout>
+ This code unpacks the downloaded files to the directory
+ specified by <filename>WORKDIR</filename>.
+ <note>
+ For convenience, the naming in these examples matches
+ the variables used by OpenEmbedded.
+ </note>
+ The <filename>SRC_URI</filename> and <filename>WORKDIR</filename>
+ variables are not coded into the fetcher.
+ These methods can be (and are) called with different variable names.
+ In OpenEmbedded for example, the shared state (sstate) code uses
+ the fetch module to fetch the sstate files.
+ </para>
+
+ <para>
+ When the <filename>download()</filename> method is called,
+ BitBake tries to fulfill the URLs by looking for source files
+ in a specific search order:
+ <itemizedlist>
+ <listitem><para><emphasis>Pre-mirror Sites:</emphasis>
+ BitBake first uses pre-mirrors to try and find source files.
+ These locations are defined using the
+ <link linkend='var-PREMIRRORS'><filename>PREMIRRORS</filename></link>
+ variable.
+ </para></listitem>
+ <listitem><para><emphasis>Source URI:</emphasis>
+ If pre-mirrors fail, BitBake uses the original URL (e.g. from
+ <filename>SRC_URI</filename>).
+ </para></listitem>
+ <listitem><para><emphasis>Mirror Sites:</emphasis>
+ If fetch failures occur, BitBake next uses mirror locations as
+ defined by the
+ <link linkend='var-MIRRORS'><filename>MIRRORS</filename></link>
+ variable.
+ </para></listitem>
+ </itemizedlist>
+ </para>
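+
+ <para>
+ The search order above can be sketched in plain Python.
+ This is only an illustration under stated assumptions, not the
+ fetcher's actual code: the function names
+ <filename>candidate_order</filename> and
+ <filename>first_successful</filename>, and the
+ <filename>try_fetch</filename> callback, are hypothetical.
+ <literallayout class='monospaced'>
```python
from typing import Callable, Iterable, List, Optional


def candidate_order(premirrors: Iterable[str], source_url: str,
                    mirrors: Iterable[str]) -> List[str]:
    """Return locations in the order described: pre-mirrors, source, mirrors.

    Hypothetical helper for illustration; not part of bb.fetch2.
    """
    return list(premirrors) + [source_url] + list(mirrors)


def first_successful(candidates: Iterable[str],
                     try_fetch: Callable[[str], bool]) -> Optional[str]:
    """Return the first location for which try_fetch() reports success.

    try_fetch is a caller-supplied stand-in for a real download attempt.
    """
    for url in candidates:
        if try_fetch(url):
            return url
    return None
```
+ </literallayout>
+ A caller would build the candidate list once and stop at the
+ first location that yields the file.
+ </para>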
+
+ <para>
+ For each URL passed to the fetcher, the fetcher
+ calls the submodule that handles that particular URL type.
+ This behavior can be the source of some confusion when you
+ are providing URLs for the <filename>SRC_URI</filename>
+ variable.
+ Consider the following two URLs:
+ <literallayout class='monospaced'>
+ http://git.yoctoproject.org/git/poky;protocol=git
+ git://git.yoctoproject.org/git/poky;protocol=http
+ </literallayout>
+ In the former case, the URL is passed to the
+ <filename>wget</filename> fetcher, which does not
+ understand "git".
+ Therefore, the latter case is the correct form since the
+ Git fetcher does know how to use HTTP as a transport.
+ </para>
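+
+ <para>
+ To make the distinction concrete, the following Python sketch
+ splits a URL into its scheme, location, and semicolon-separated
+ parameters, and then picks a handler from the scheme alone.
+ The dispatch table and the function names are assumptions made
+ for this illustration; the real fetcher selects its submodules
+ differently.
+ <literallayout class='monospaced'>
```python
from typing import Dict, Tuple

# Hypothetical dispatch table, for illustration only.
FETCHER_BY_SCHEME = {
    "http": "wget", "https": "wget", "ftp": "wget",
    "git": "git", "svn": "svn", "cvs": "cvs", "file": "local",
}


def split_url(url: str) -> Tuple[str, str, Dict[str, str]]:
    """Split 'scheme://location;key=value;...' into its three parts."""
    scheme, _, rest = url.partition("://")
    location, *raw_params = rest.split(";")
    params = {}
    for item in raw_params:
        if item:
            key, _, value = item.partition("=")
            params[key] = value
    return scheme, location, params


def pick_fetcher(url: str) -> str:
    """Choose a handler from the scheme only; parameters play no part."""
    scheme, _, _ = split_url(url)
    return FETCHER_BY_SCHEME.get(scheme, "unknown")
```
+ </literallayout>
+ In this sketch the "protocol" parameter never influences which
+ handler is chosen, which mirrors the behavior described above.
+ </para>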
+
+ <para>
+ Here are some examples that show commonly used mirror
+ definitions:
+ <literallayout class='monospaced'>
+ PREMIRRORS ?= "\
+ bzr://.*/.* http://somemirror.org/sources/ \n \
+ cvs://.*/.* http://somemirror.org/sources/ \n \
+ git://.*/.* http://somemirror.org/sources/ \n \
+ hg://.*/.* http://somemirror.org/sources/ \n \
+ osc://.*/.* http://somemirror.org/sources/ \n \
+ p4://.*/.* http://somemirror.org/sources/ \n \
+ svn://.*/.* http://somemirror.org/sources/ \n"
+
+ MIRRORS =+ "\
+ ftp://.*/.* http://somemirror.org/sources/ \n \
+ http://.*/.* http://somemirror.org/sources/ \n \
+ https://.*/.* http://somemirror.org/sources/ \n"
+ </literallayout>
+ It is useful to note that BitBake supports
+ cross-URLs.
+ It is possible to mirror a Git repository on an HTTP
+ server as a tarball.
+ This is what the <filename>git://</filename> mapping in
+ the previous example does.
+ </para>
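+
+ <para>
+ The following Python sketch shows one way such a definition
+ could be read and applied.
+ It is deliberately simplified and rests on assumptions: entries
+ are taken to be separated by the two-character sequence "\n",
+ each pattern is treated as a regular expression matched against
+ the whole URL, and the mirrored file is assumed to keep its
+ original base name.
+ The tarball naming that a real Git mirror uses is not modeled,
+ and the function names are hypothetical.
+ <literallayout class='monospaced'>
```python
import posixpath
import re
from typing import List, Optional, Tuple


def parse_mirrors(spec: str) -> List[Tuple[str, str]]:
    """Split a mirror definition into (pattern, replacement) pairs.

    Assumes entries are separated by a literal backslash followed by 'n'.
    """
    pairs = []
    for entry in spec.split("\\n"):
        fields = entry.split()
        if len(fields) == 2:
            pairs.append((fields[0], fields[1]))
    return pairs


def mirror_url(url: str, pairs: List[Tuple[str, str]]) -> Optional[str]:
    """Return the mirror location for url, or None if no pattern matches.

    Simplified: URL parameters are dropped and only the base name is kept.
    """
    base = url.split(";")[0]
    for pattern, replacement in pairs:
        if re.fullmatch(pattern, base):
            return replacement.rstrip("/") + "/" + posixpath.basename(base)
    return None
```
+ </literallayout>
+ </para>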
+
+ <para>
+ Since network accesses are slow, BitBake maintains a
+ cache of files downloaded from the network.
+ Any source files that are not local (i.e.
+ downloaded from the Internet) are placed into the download
+ directory, which is specified by the
+ <link linkend='var-DL_DIR'><filename>DL_DIR</filename></link>
+ variable.
+ </para>
+
+ <para>
+ File integrity is of key importance for reproducing builds.
+ For non-local archive downloads, the fetcher code can verify
+ sha256 and md5 checksums to ensure the archives have been
+ downloaded correctly.
+ You can specify these checksums by using the
+ <filename>SRC_URI</filename> variable with the appropriate
+ varflags as follows:
+ <literallayout class='monospaced'>
+ SRC_URI[md5sum] = "value"
+ SRC_URI[sha256sum] = "value"
+ </literallayout>
+ You can also specify the checksums as parameters on the
+ <filename>SRC_URI</filename> as shown below:
+ <literallayout class='monospaced'>
+ SRC_URI = "http://example.com/foobar.tar.bz2;md5sum=4a8e0f237e961fd7785d19d07fdb994d"
+ </literallayout>
+ If multiple URIs exist, you can specify the checksums either
+ directly as in the previous example, or you can name the URLs.
+ The following syntax shows how you name the URIs:
+ <literallayout class='monospaced'>
+ SRC_URI = "http://example.com/foobar.tar.bz2;name=foo"
+ SRC_URI[foo.md5sum] = "4a8e0f237e961fd7785d19d07fdb994d"
+ </literallayout>
+ After a file has been downloaded and has had its checksum checked,
+ a ".done" stamp is placed in <filename>DL_DIR</filename>.
+ BitBake uses this stamp during subsequent builds to avoid
+ downloading or comparing a checksum for the file again.
+ <note>
+ It is assumed that local storage is safe from data corruption.
+ If this were not the case, there would be bigger issues to worry about.
+ </note>
+ </para>
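+
+ <para>
+ For readers who want to see what such a check amounts to, here
+ is a minimal Python sketch that uses only the standard
+ <filename>hashlib</filename> module.
+ It is not the fetcher's own implementation; the function names
+ <filename>file_checksum</filename> and
+ <filename>checksum_matches</filename> are hypothetical.
+ <literallayout class='monospaced'>
```python
import hashlib


def file_checksum(path: str, algorithm: str = "sha256") -> str:
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path: str, expected: str,
                     algorithm: str = "sha256") -> bool:
    """Compare a file's digest with the expected value, ignoring case."""
    return file_checksum(path, algorithm) == expected.lower()
```
+ </literallayout>
+ Passing "md5" as the algorithm gives the value that would be
+ compared against an md5sum entry.
+ </para>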
+
+ <para>
+ If
+ <link linkend='var-BB_STRICT_CHECKSUM'><filename>BB_STRICT_CHECKSUM</filename></link>
+ is set, any download without a checksum triggers an
+ error message.
+ The
+ <link linkend='var-BB_NO_NETWORK'><filename>BB_NO_NETWORK</filename></link>
+ variable can be used to make any attempted network access a fatal
+ error, which is useful for checking that mirrors are complete
+ as well as other things.
+ </para>
+ </section>
+
+ <section id='bb-the-unpack'>
+ <title>The Unpack</title>
+
+ <para>
+ The unpack process usually immediately follows the download.
+ For all URLs except Git URLs, BitBake uses the common
+ <filename>unpack</filename> method.
+ </para>
+
+ <para>
+ A number of parameters exist that you can specify within the
+ URL to govern the behavior of the unpack stage:
+ <itemizedlist>
+ <listitem><para><emphasis>unpack:</emphasis>
+ Controls whether the URL components are unpacked.
+ If set to "1", which is the default, the components
+ are unpacked.
+ If set to "0", the unpack stage leaves the file alone.
+ This parameter is useful when you want an archive to be
+ copied in and not be unpacked.
+ </para></listitem>
+ <listitem><para><emphasis>dos:</emphasis>
+ Applies to <filename>.zip</filename> and
+ <filename>.jar</filename> files and specifies whether to
+ use DOS line ending conversion on text files.
+ </para></listitem>
+ <listitem><para><emphasis>basepath:</emphasis>
+ Instructs the unpack stage to strip the specified
+ directories from the source path when unpacking.
+ </para></listitem>
+ <listitem><para><emphasis>subdir:</emphasis>
+ Unpacks the specific URL to the specified subdirectory
+ within the root directory.
+ </para></listitem>
+ </itemizedlist>
+ The unpack call automatically decompresses and extracts files
+ with ".Z", ".z", ".gz", ".xz", ".zip", ".jar", ".ipk", ".rpm",
+ ".srpm", ".deb" and ".bz2" extensions as well as various combinations
+ of tarball extensions.
+ </para>
+
+ <para>
+ As mentioned, the Git fetcher has its own unpack method that
+ is optimized to work with Git trees.
+ Basically, this method works by cloning the tree into the final
+ directory.
+ The process is completed using references so that there is
+ only one central copy of the Git metadata needed.
+ </para>
+ </section>
+
+ <section id='bb-fetchers'>
+ <title>Fetchers</title>
+
+ <para>
+ As mentioned earlier, the URL prefix determines which
+ fetcher submodule BitBake uses.
+ Each submodule can support different URL parameters,
+ which are described in the following sections.
+ </para>
+
+ <section id='local-file-fetcher'>
+ <title>Local file fetcher (<filename>file://</filename>)</title>
+
+ <para>
+ This submodule handles URLs that begin with
+ <filename>file://</filename>.
+ The filename you specify within the URL can
+ either be an absolute or relative path to a file.
+ If the filename is relative, the contents of the
+ <link linkend='var-FILESPATH'><filename>FILESPATH</filename></link>
+ variable are used in the same way
+ <filename>PATH</filename> is used to find executables.
+ Failing that,
+ <link linkend='var-FILESDIR'><filename>FILESDIR</filename></link>
+ is used to find the appropriate relative file.
+ <note>
+ <filename>FILESDIR</filename> is deprecated and can
+ be replaced with <filename>FILESPATH</filename>.
+ Because <filename>FILESDIR</filename> is likely to be
+ removed, you should not use this variable in any new code.
+ </note>
+ If the file cannot be found, it is assumed that it is available in
+ <link linkend='var-DL_DIR'><filename>DL_DIR</filename></link>
+ by the time the <filename>download()</filename> method is called.
+ </para>
+
+ <para>
+ If you specify a directory, the entire directory is
+ unpacked.
+ </para>
+
+ <para>
+ Here are some example URLs:
+ <literallayout class='monospaced'>
+ SRC_URI = "file://relativefile.patch"
+ SRC_URI = "file://relativefile.patch;this=ignored"
+ SRC_URI = "file:///Users/ich/very_important_software"
+ </literallayout>
+ </para>
+ </section>
+
+ <section id='cvs-fetcher'>
+ <title>CVS fetcher (<filename>cvs://</filename>)</title>
+
+ <para>
+ This submodule handles checking out files from the
+ CVS version control system.
+ You can configure it using a number of different variables:
+ <itemizedlist>
+ <listitem><para><emphasis><filename>FETCHCMD_cvs</filename>:</emphasis>
+ The name of the executable to use when running
+ the <filename>cvs</filename> command.
+ This name is usually "cvs".
+ </para></listitem>
+ <listitem><para><emphasis><filename>SRCDATE</filename>:</emphasis>
+ The date to use when fetching the CVS source code.
+ A special value of "now" causes the checkout to
+ be updated on every build.
+ </para></listitem>
+ <listitem><para><emphasis><filename>CVSDIR</filename>:</emphasis>
+ Specifies where a temporary checkout is saved.
+ The location is often <filename>DL_DIR/cvs</filename>.
+ </para></listitem>
+ <listitem><para><emphasis><filename>CVS_PROXY_HOST</filename>:</emphasis>
+ The name to use as a "proxy=" parameter to the
+ <filename>cvs</filename> command.
+ </para></listitem>
+ <listitem><para><emphasis><filename>CVS_PROXY_PORT</filename>:</emphasis>
+ The port number to use as a "proxyport=" parameter to
+ the <filename>cvs</filename> command.
+ </para></listitem>
+ </itemizedlist>
+ As well as the standard username and password URL syntax,
+ you can also configure the fetcher with various URL parameters.
+ </para>
+
+ <para>
+ The supported parameters are as follows:
+ <itemizedlist>
+ <listitem><para><emphasis>"method":</emphasis>
+ The protocol over which to communicate with the cvs server.
+ By default, this protocol is "pserver".
+ If "method" is set to "ext", BitBake examines the
+ "rsh" parameter and sets <filename>CVS_RSH</filename>.
+ You can use "dir" for local directories.
+ </para></listitem>
+ <listitem><para><emphasis>"module":</emphasis>
+ Specifies the module to check out.
+ You must supply this parameter.
+ </para></listitem>
+ <listitem><para><emphasis>"tag":</emphasis>
+ Describes which CVS TAG should be used for
+ the checkout.
+ By default, the TAG is empty.
+ </para></listitem>
+ <listitem><para><emphasis>"date":</emphasis>
+ Specifies a date.
+ If no "date" is specified, the
+ <link linkend='var-SRCDATE'><filename>SRCDATE</filename></link>
+ of the configuration is used to checkout a specific date.
+ The special value of "now" causes the checkout to be
+ updated on every build.
+ </para></listitem>
+ <listitem><para><emphasis>"localdir":</emphasis>
+ Used to rename the module.
+ Effectively, you are renaming the output directory
+ to which the module is unpacked.
+ You are forcing the module into a special
+ directory relative to <filename>CVSDIR</filename>.
+ </para></listitem>
+ <listitem><para><emphasis>"rsh":</emphasis>
+ Used in conjunction with the "method" parameter.
+ </para></listitem>
+ <listitem><para><emphasis>"scmdata":</emphasis>
+ Causes the CVS metadata to be maintained in the tarball
+ the fetcher creates when set to "keep".
+ The tarball is expanded into the work directory.
+ By default, the CVS metadata is removed.
+ </para></listitem>
+ <listitem><para><emphasis>"fullpath":</emphasis>
+ Controls whether the resulting checkout is at the
+ module level, which is the default, or is at deeper
+ paths.
+ </para></listitem>
+ <listitem><para><emphasis>"norecurse":</emphasis>
+ Causes the fetcher to check out only the specified
+ directory without recursing into any subdirectories.
+ </para></listitem>
+ <listitem><para><emphasis>"port":</emphasis>
+ The port to use when connecting to the CVS server.
+ </para></listitem>
+ </itemizedlist>
+ Some example URLs are as follows:
+ <literallayout class='monospaced'>
+ SRC_URI = "cvs://CVSROOT;module=mymodule;tag=some-version;method=ext"
+ SRC_URI = "cvs://CVSROOT;module=mymodule;date=20060126;localdir=usethat"
+ </literallayout>
+ </para>
+ </section>
+
+ <section id='http-ftp-fetcher'>
+ <title>HTTP/FTP wget fetcher (<filename>http://</filename>, <filename>ftp://</filename>, <filename>https://</filename>)</title>
+
+ <para>
+ This fetcher obtains files from web and FTP servers.
+ Internally, the fetcher uses the wget utility.
+ </para>
+
+ <para>
+ The executable and parameters used are specified by the
+ <filename>FETCHCMD_wget</filename> variable, which defaults
+ to sensible values.
+ The fetcher supports a parameter "downloadfilename" that
+ allows the name of the downloaded file to be specified.
+ Specifying the name of the downloaded file is useful
+ for avoiding collisions in
+ <link linkend='var-DL_DIR'><filename>DL_DIR</filename></link>
+ when dealing with multiple files that have the same name.
+ </para>
+
+ <para>
+ Some example URLs are as follows:
+ <literallayout class='monospaced'>
+ SRC_URI = "http://oe.handhelds.org/not_there.aac"
+ SRC_URI = "ftp://oe.handhelds.org/not_there_as_well.aac"
+ SRC_URI = "ftp://you@oe.handhelds.org/home/you/secret.plan"
+ </literallayout>
+ </para>
+ </section>
+
+ <section id='svn-fetcher'>
+ <title>Subversion (SVN) Fetcher (<filename>svn://</filename>)</title>
+
+ <para>
+ This fetcher submodule fetches code from the
+ Subversion source control system.
+ The executable used is specified by
+ <filename>FETCHCMD_svn</filename>, which defaults
+ to "svn".
+ The fetcher's temporary working directory is set
+ by <filename>SVNDIR</filename>, which is usually
+ <filename>DL_DIR/svn</filename>.
+ </para>
+
+ <para>
+ The supported parameters are as follows:
+ <itemizedlist>
+ <listitem><para><emphasis>"module":</emphasis>
+ The name of the svn module to checkout.
+ You must provide this parameter.
+ You can think of this parameter as the top-level
+ directory of the repository data you want.
+ </para></listitem>
+ <listitem><para><emphasis>"protocol":</emphasis>
+ The protocol to use, which defaults to "svn".
+ Other options are "svn+ssh" and "rsh".
+ For "rsh", the "rsh" parameter is also used.
+ </para></listitem>
+ <listitem><para><emphasis>"rev":</emphasis>
+ The revision of the source code to checkout.
+ </para></listitem>
+ <listitem><para><emphasis>"date":</emphasis>
+ The date of the source code to checkout.
+ Specific revisions are generally much safer to check out
+ than dates because they do not involve timezones
+ (i.e. they are much more deterministic).
+ </para></listitem>
+ <listitem><para><emphasis>"scmdata":</emphasis>
+ Causes the ".svn" directories to be available during
+ compile-time when set to "keep".
+ By default, these directories are removed.
+ </para></listitem>
+ </itemizedlist>
+ Following are two examples using svn:
+ <literallayout class='monospaced'>
+ SRC_URI = "svn://svn.oe.handhelds.org/svn;module=vip;protocol=http;rev=667"
+ SRC_URI = "svn://svn.oe.handhelds.org/svn/;module=opie;protocol=svn+ssh;date=20060126"
+ </literallayout>
+ </para>
+ </section>
+
+ <section id='git-fetcher'>
+ <title>Git Fetcher (<filename>git://</filename>)</title>
+
+ <para>
+ This fetcher submodule fetches code from the Git
+ source control system.
+ The fetcher works by creating a bare clone of the
+ remote into <filename>GITDIR</filename>, which is
+ usually <filename>DL_DIR/git</filename>.
+ This bare clone is then cloned into the work directory during the
+ unpack stage when a specific tree is checked out.
+ This is done using alternates and by reference to
+ minimize the amount of duplicate data on the disk and
+ make the unpack process fast.
+ The executable used can be set with
+ <filename>FETCHCMD_git</filename>.
+ </para>
+
+ <para>
+ This fetcher supports the following parameters:
+ <itemizedlist>
+ <listitem><para><emphasis>"protocol":</emphasis>
+ The protocol used to fetch the files.
+ The default is "git" when a hostname is set.
+ If a hostname is not set, the Git protocol is "file".
+ You can also use "http", "https", "ssh" and "rsync".
+ </para></listitem>
+ <listitem><para><emphasis>"nocheckout":</emphasis>
+ Tells the fetcher to not checkout source code when
+ unpacking when set to "1".
+ Set this option for the URL where there is a custom
+ routine to checkout code.
+ The default is "0".
+ </para></listitem>
+ <listitem><para><emphasis>"rebaseable":</emphasis>
+ Indicates that the upstream Git repository can be rebased.
+ You should set this parameter to "1" if
+ revisions can become detached from branches.
+ In this case, the source mirror tarball is done per
+ revision, which has a loss of efficiency.
+ Rebasing the upstream Git repository could cause the
+ current revision to disappear from the upstream repository.
+ This option reminds the fetcher to preserve the local cache
+ carefully for future use.
+ The default value for this parameter is "0".
+ </para></listitem>
+ <listitem><para><emphasis>"nobranch":</emphasis>
+ Tells the fetcher to not check the SHA validation
+ for the branch when set to "1".
+ The default is "0".
+ Set this option for the recipe that refers to
+ the commit that is valid for a tag instead of
+ the branch.
+ </para></listitem>
+ <listitem><para><emphasis>"bareclone":</emphasis>
+ Tells the fetcher to clone a bare clone into the
+ destination directory without checking out a working tree.
+ Only the raw Git metadata is provided.
+ This parameter implies the "nocheckout" parameter as well.
+ </para></listitem>
+ <listitem><para><emphasis>"branch":</emphasis>
+ The branch(es) of the Git tree to clone.
+ If unset, this is assumed to be "master".
+ The number of branch parameters must match the number of
+ name parameters.
+ </para></listitem>
+ <listitem><para><emphasis>"rev":</emphasis>
+ The revision to use for the checkout.
+ The default is "master".
+ </para></listitem>
+ <listitem><para><emphasis>"tag":</emphasis>
+ Specifies a tag to use for the checkout.
+ To correctly resolve tags, BitBake must access the
+ network.
+ For that reason, tags are often not used.
+ As far as Git is concerned, the "tag" parameter behaves
+ effectively the same as the "rev" parameter.
+ </para></listitem>
+ <listitem><para><emphasis>"subpath":</emphasis>
+ Limits the checkout to a specific subpath of the tree.
+ By default, the whole tree is checked out.
+ </para></listitem>
+ <listitem><para><emphasis>"destsuffix":</emphasis>
+ The name of the path in which to place the checkout.
+ By default, the path is <filename>git/</filename>.
+ </para></listitem>
+ </itemizedlist>
+ Here are some example URLs:
+ <literallayout class='monospaced'>
+ SRC_URI = "git://git.oe.handhelds.org/git/vip.git;tag=version-1"
+ SRC_URI = "git://git.oe.handhelds.org/git/vip.git;protocol=http"
+ </literallayout>
+ </para>
+ </section>
+
+ <section id='other-fetchers'>
+ <title>Other Fetchers</title>
+
+ <para>
+ Fetch submodules also exist for the following:
+ <itemizedlist>
+ <listitem><para>
+ Bazaar (<filename>bzr://</filename>)
+ </para></listitem>
+ <listitem><para>
+ Perforce (<filename>p4://</filename>)
+ </para></listitem>
+ <listitem><para>
+ Git Submodules (<filename>gitsm://</filename>)
+ </para></listitem>
+ <listitem><para>
+ Trees using Git Annex (<filename>gitannex://</filename>)
+ </para></listitem>
+ <listitem><para>
+ Secure FTP (<filename>sftp://</filename>)
+ </para></listitem>
+ <listitem><para>
+ Secure Shell (<filename>ssh://</filename>)
+ </para></listitem>
+ <listitem><para>
+ Repo (<filename>repo://</filename>)
+ </para></listitem>
+ <listitem><para>
+ OSC (<filename>osc://</filename>)
+ </para></listitem>
+ <listitem><para>
+ Mercurial (<filename>hg://</filename>)
+ </para></listitem>
+ </itemizedlist>
+ No documentation currently exists for these lesser-used
+ fetcher submodules.
+ However, you might find the code helpful and readable.
+ </para>
+ </section>
+ </section>
+
+ <section id='auto-revisions'>
+ <title>Auto Revisions</title>
+
+ <para>
+ We need to document <filename>AUTOREV</filename> and
+ <filename>SRCREV_FORMAT</filename> here.
+ </para>
+ </section>
+</chapter>