aboutsummaryrefslogtreecommitdiffstats
path: root/documentation/ref-manual/eclipse/html/poky-ref-manual/checksums.html
blob: 5dccce93b94720c978f6f6140fbe6b733243b2b5 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>3.2.2.�Checksums (Signatures)</title>
<link rel="stylesheet" type="text/css" href="../book.css">
<meta name="generator" content="DocBook XSL Stylesheets V1.76.1">
<link rel="home" href="index.html" title="The Yocto Project Reference Manual">
<link rel="up" href="shared-state-cache.html" title="3.2.�Shared State Cache">
<link rel="prev" href="overall-architecture.html" title="3.2.1.�Overall Architecture">
<link rel="next" href="shared-state.html" title="3.2.3.�Shared State">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="section" title="3.2.2.�Checksums (Signatures)">
<div class="titlepage"><div><div><h3 class="title">
<a name="checksums"></a>3.2.2.�Checksums (Signatures)</h3></div></div></div>
<p>
            The shared state code uses a checksum, which is a unique signature of a task's 
            inputs, to determine if a task needs to be run again. 
            Because it is a change in a task's inputs that triggers a rerun, the process
            needs to detect all the inputs to a given task. 
            For shell tasks, this turns out to be fairly easy because
            the build process generates a "run" shell script for each task and 
            it is possible to create a checksum that gives you a good idea of when 
            the task's data changes.
        </p>
<p>
            To complicate the problem, there are things that should not be included in 
            the checksum. 
            First, there is the actual specific build path of a given task - 
            the <code class="filename">WORKDIR</code>. 
            It does not matter if the working directory changes because it should not 
            affect the output for target packages.
            Also, the build process has the objective of making native/cross packages relocatable. 
            The checksum therefore needs to exclude <code class="filename">WORKDIR</code>.
            The simplistic approach for excluding the working directory is to set 
            <code class="filename">WORKDIR</code> to some fixed value and create the checksum
            for the "run" script. 
        </p>
<p>
            Another problem results from the "run" scripts containing functions that 
            might or might not get called.  
            The incremental build solution contains code that figures out dependencies 
            between shell functions.
            This code is used to prune the "run" scripts down to the minimum set, 
            thereby alleviating this problem and making the "run" scripts much more 
            readable as a bonus.
        </p>
<p>
            So far we have solutions for shell scripts.
            What about python tasks?
            The same approach applies even though these tasks are more difficult.
            The process needs to figure out what variables a python function accesses 
            and what functions it calls.
            Again, the incremental build solution contains code that first figures out 
            the variable and function dependencies, and then creates a checksum for the data 
            used as the input to the task.
        </p>
<p>
            Like the <code class="filename">WORKDIR</code> case, situations exist where dependencies 
            should be ignored.
            For these cases, you can instruct the build process to ignore a dependency
            by using a line like the following:
            </p>
<pre class="literallayout">
     PACKAGE_ARCHS[vardepsexclude] = "MACHINE"
            </pre>
<p>
            This example ensures that the <code class="filename">PACKAGE_ARCHS</code> variable does not 
            depend on the value of <code class="filename">MACHINE</code>, even if it does reference it.
        </p>
<p>
            Equally, there are cases where we need to add dependencies BitBake is not able to find.
            You can accomplish this by using a line like the following:
            </p>
<pre class="literallayout">
      PACKAGE_ARCHS[vardeps] = "MACHINE"
            </pre>
<p>
            This example explicitly adds the <code class="filename">MACHINE</code> variable as a 
            dependency for <code class="filename">PACKAGE_ARCHS</code>.
        </p>
<p> 
            Consider a case with inline python, for example, where BitBake is not
            able to figure out dependencies. 
            When running in debug mode (i.e. using <code class="filename">-DDD</code>), BitBake 
            produces output when it discovers something for which it cannot figure out
            dependencies. 
            The Yocto Project team has currently not managed to cover those dependencies 
            in detail and is aware of the need to fix this situation.
        </p>
<p>
            Thus far, this section has limited discussion to the direct inputs into a task.
            Information based on direct inputs is referred to as the "basehash" in the
            code. 
            However, there is still the question of a task's indirect inputs - the
            things that were already built and present in the Build Directory. 
            The checksum (or signature) for a particular task needs to add the hashes 
            of all the tasks on which the particular task depends. 
            Choosing which dependencies to add is a policy decision. 
            However, the effect is to generate a master checksum that combines the basehash 
            and the hashes of the task's dependencies.
        </p>
<p>
            At the code level, there are a variety of ways both the basehash and the
            dependent task hashes can be influenced. 
            Within the BitBake configuration file, we can give BitBake some extra information 
            to help it construct the basehash.
            The following statements effectively result in a list of global variable
            dependency excludes - variables never included in any checksum:
            </p>
<pre class="literallayout">
  BB_HASHBASE_WHITELIST ?= "TMPDIR FILE PATH PWD BB_TASKHASH BBPATH"
  BB_HASHBASE_WHITELIST += "DL_DIR SSTATE_DIR THISDIR FILESEXTRAPATHS"
  BB_HASHBASE_WHITELIST += "FILE_DIRNAME HOME LOGNAME SHELL TERM USER"
  BB_HASHBASE_WHITELIST += "FILESPATH USERNAME STAGING_DIR_HOST STAGING_DIR_TARGET"
            </pre>
<p>
            The previous example actually excludes 
            <a class="link" href="ref-variables-glos.html#var-WORKDIR" title="WORKDIR"><code class="filename">WORKDIR</code></a>
            since it is actually constructed as a path within 
            <a class="link" href="ref-variables-glos.html#var-TMPDIR" title="TMPDIR"><code class="filename">TMPDIR</code></a>, which is on 
            the whitelist. 
        </p>
<p>
            The rules for deciding which hashes of dependent tasks to include through
            dependency chains are more complex and are generally accomplished with a 
            python function. 
            The code in <code class="filename">meta/lib/oe/sstatesig.py</code> shows two examples
            of this and also illustrates how you can insert your own policy into the system 
            if so desired.
            This file defines the two basic signature generators <code class="filename">OE-Core</code>
            uses:  "OEBasic" and "OEBasicHash". 
            By default, there is a dummy "noop" signature handler enabled in BitBake. 
            This means that behavior is unchanged from previous versions. 
            <code class="filename">OE-Core</code> uses the "OEBasic" signature handler by default
            through this setting in the <code class="filename">bitbake.conf</code> file:
            </p>
<pre class="literallayout">
  BB_SIGNATURE_HANDLER ?= "OEBasic"
            </pre>
<p>
            The "OEBasicHash" <code class="filename">BB_SIGNATURE_HANDLER</code> is the same as the 
            "OEBasic" version but adds the task hash to the stamp files. 
            This results in any metadata change that changes the task hash, automatically 
            causing the task to be run again. 
            This removes the need to bump <a class="link" href="ref-variables-glos.html#var-PR" title="PR"><code class="filename">PR</code></a>
            values and changes to metadata automatically ripple across the build. 
            Currently, this behavior is not the default behavior for <code class="filename">OE-Core</code>
            but is the default in <code class="filename">poky</code>.
        </p>
<p>
            It is also worth noting that the end result of these signature generators is to
            make some dependency and hash information available to the build. 
            This information includes:
            </p>
<pre class="literallayout">
  BB_BASEHASH_task-&lt;taskname&gt; - the base hashes for each task in the recipe
  BB_BASEHASH_&lt;filename:taskname&gt; - the base hashes for each dependent task
  BBHASHDEPS_&lt;filename:taskname&gt; - The task dependencies for each task
  BB_TASKHASH - the hash of the currently running task
            </pre>
<p>
        </p>
</div></body>
</html>