path: root/lib/bb/runqueue.py
Age | Commit message | Author
2014-08-19 | runqueue.py: Fix typoes/grammar in comments. | Robert P. J. Day
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-08-19 | runqueue.py: Correct several misspellings of "notifing". | Robert P. J. Day
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-08-02 | runqueue: Add sceneQueueComplete event | Richard Purdie
It's useful to have an event emitted when all of the sceneQueue tasks have completed since the metadata can hook this for processing. Therefore add such an event. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-07-21 | command/runqueue: Fix shutdown logic | Richard Purdie
If you hit Ctrl+C at the right point, the system processes the request but merrily continues building. It turns out finish_runqueue() is called but this doesn't stop the later generation and execution of the runqueue. This patch adjusts some of the conditionals to ensure the build really does stop. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-04-21 | runqueue: Do not write out stamp files in dry_run mode | Richard Purdie
In dry run mode, stamps for noexec tasks are being written out which is incorrect. Avoid this. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-04-21 | runqueue: Fix task weighting algorithm | Richard Purdie
When looking at a list of tasks, do_patch and do_unpack were being given equal priority when one clearly depends on another. The reason for this was the default task weight of 0 being given to tasks. This is therefore changed to 1 to allow correct weighting of dependencies, which means the scheduler has better information available to it about tasks. Weight endpoints differently (10) for clearer debugging of priorities. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-04-21 | runqueue: Fix handling of zero priority task | Richard Purdie
The zero priority task should be run first but was being confused with the None value the priority field defaulted to. Check for None explicitly to avoid this error. In the real world this doesn't change much but it confused the debug output from the schedulers. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
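The None-versus-zero confusion described above can be sketched as follows. This is a minimal illustration, not the scheduler's actual code: the function names and the "lowest value runs first" convention are assumptions made for the example.

```python
def pick_best_buggy(priorities):
    """Truthiness test: a stored priority of 0 looks the same as 'unset'."""
    best = None
    for prio in priorities:
        if not best or prio < best:  # bug: `not 0` is True, so 0 gets overwritten
            best = prio
    return best


def pick_best(priorities):
    """Explicit None test: 0 is a real priority and sorts first."""
    best = None
    for prio in priorities:
        if best is None or prio < best:
            best = prio
    return best
```

With the input [0, 3], the buggy variant replaces the stored 0 with 3 on the second iteration, while the explicit check keeps 0.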
2014-04-01 | runqueue: Address issues with incomplete sstate sets | Richard Purdie
The first part of the sstate code checks en masse whether given checksums are available. The next part of the code then triggers those setscene tasks, either running them or skipping them if they've been covered by others. The problem was that this second part would always skip a task if it was unavailable in the first part, even if it would have otherwise been covered by other tasks. This meant the mere presence of an artefact (or lack of presence) could cause a different build failure. The issue reproduces if you run a build and populate an sstate feed, then run a second build off that feed, then run a third build off the sstate feed of the second build (which is reduced in size). The fix is, rather than immediately skipping tasks if the checksum is unavailable, to create a list of missing tasks; then, if a task cannot be covered by others, we can skip it later. The deferral makes the behaviour the same even when the cache is "incomplete". [YOCTO #6081] Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-04-01 | runqueue: Fix sstate task dependency problems | Richard Purdie
If a setscene task has [depends], it's possible they may still get executed out of order. The issue is that the dependencies are set to set() for all tasks involved. This patch adds back in explicit dependencies within these chains to avoid the setscene task failures. [YOCTO #6069] Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-03-26 | runqueue/siggen: Pass in commandline options to dump_sigs() | Richard Purdie
This allows the commandline options to be processed in the dump signature code. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-03-26 | bitbake: Force -S option to take a parameter | Richard Purdie
There is no easy way to make this change. We really need parameters for the -S (dump signatures) handling code. Such a parameter can then be used within the codebase to handle the signatures in different ways. For now, "none" is the recommended default and "printdiff" will execute the new (and more expensive) comparison algorithms. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-03-25 | runqueue: Fix sceneQueueEvent to use the correct hashes | Richard Purdie
The runqueue should be using the "realtask" ID to lookup the task hash, not the "task" ID. This patch resolves corruption issues where incorrect task hashes were displayed within toaster. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-03-19 | runqueue: Remove use of waitpid on worker processes | Richard Purdie
Use of waitpid on the worker processes is a bad idea since it conflicts directly with subprocess internals. Instead use the poll() method and returncode to determine if the process has exited; if it has, we can shut down the system. This should resolve the hangs once and for all, famous last words. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
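A minimal sketch of the poll()/returncode approach, using only the standard subprocess module. The helper names and the stand-in child process are hypothetical and are not taken from runqueue.py.

```python
import subprocess
import sys
import time


def wait_for_worker(proc, interval=0.01):
    """Poll a child until it exits and return its exit code.

    Popen.poll() reaps the child itself, so nothing here calls os.waitpid()
    behind the subprocess module's back.
    """
    while proc.poll() is None:  # None means the child is still running
        time.sleep(interval)
    return proc.returncode


def demo():
    # Stand-in for a worker: a child Python process that exits with status 3.
    proc = subprocess.Popen([sys.executable, "-c", "raise SystemExit(3)"])
    return wait_for_worker(proc)
```

A real server would check poll() from its idle loop rather than blocking in a dedicated loop; the loop here only keeps the example self-contained.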
2014-03-19 | runqueue: Revert child signal handler for now | Richard Purdie
We're running into processes using 100% cpu. It appears these are locked in a subprocess.poll() type loop where the process has exited but the code is looping as it's not handling the ECHILD error. http://bugs.python.org/issue14396 http://bugs.python.org/issue15756 This is likely due to one or both of the above bugs. The question is what actually grabbed the child exit code, as it wasn't this code. It's likely there is therefore some other code racing and taking that code; it may be some kind of race like: http://hg.python.org/cpython/rev/767420808a62/ where the fix effectively catches the child's codes in a different part of the system. We could try and get everyone onto python 2.7.4 where the above bugs are fixed; however, for now it's safer to admit defeat and go back to polling explicitly for our worker exit codes. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-03-19 | runqueue: Don't catch all child return codes | Richard Purdie
Catching all child exit status values is a bad idea. Setting an http sstate mirror is a great way to view that spectacularly break things. The previous change did have good code changes so don't revert those parts. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-03-18 | runqueue: Really fix sigchld handling | Richard Purdie
There are several problems. Firstly, a return value of "None" can mean there is a C signal handler installed so we need to better handle that case. signal.SIG_DFL is 0 which equates to false so we also need to handle that by testing explicitly for None. Finally, the signal handler *must* call waitpid on all child processes else it will just get called repeatedly, leading to the hanging behaviour we've been seeing. The solution is to only error for the worker children, we warn about any other stray children which we'll have to figure out the sources of in due course. Hopefully this patch gets things working again properly though. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
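The two handler-value pitfalls mentioned above can be illustrated with the standard signal module. This is a sketch under assumptions: the function names are hypothetical, and it only classifies a saved handler value (as returned by signal.signal() or signal.getsignal()) without installing anything.

```python
import signal


def classify_previous_handler(handler):
    """Describe a handler value returned by signal.signal()/signal.getsignal()."""
    if handler is None:
        # Documented meaning: the previous handler was not installed from Python
        # (for example a C-level handler), so it cannot be chained or restored.
        return "non-python"
    if handler == signal.SIG_DFL:
        return "default"
    if handler == signal.SIG_IGN:
        return "ignored"
    return "python-callable"


def should_restore(handler):
    """Decide whether a saved handler can be put back at teardown."""
    # `if handler:` would wrongly skip SIG_DFL, whose integer value is 0.
    return handler is not None
```

The point of should_restore() is that a truthiness test and a None test disagree for exactly one valid value, SIG_DFL.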
2014-03-18 | runqueue: Ensure handler does not recurse | Richard Purdie
Failures on the autobuilder look like this handler is recursing. That shouldn't be possible but it doesn't hurt to code as such. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-03-18 | runqueue: More carefully handle the sigchld handler | Richard Purdie
We've noticed hanging processes which appear to be looping around waitpid. It's possible multiple calls to teardown are causing problems or, in theory, multiple registrations (although the code should not allow that). Regardless, put better guards around signal handler registration. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-03-17 | runqueue: Don't error if we never setup workers | Richard Purdie
If we didn't setup any workers (such as bitbake -S), this would error since we're trying to set a signal handler to None. This patch avoids that problem. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-03-12 | runqueue: Improve sigchld handler | Richard Purdie
The sigchld handler was reaping any processes and this was leading to confusion with any other process handling code that could be active. This patch:
a) Ensures we only read any process results for the worker processes we want to monitor
b) Ensures we pass the event to any other sigchld handler if it isn't an event we're interested in so the functions are properly chained.
Together this should resolve some of the reports of unknown processes people have been making.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-03-09 | runqueue: Use SIGCHLD instead of polling waitpid | Richard Purdie
Instead of a significant number of calls to waitpid, register a SIGCHLD handler instead. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-03-09 | providers/runqueue/taskdata: Optimise logger.debug calls | Richard Purdie
A run of "bitbake bash -c unpack" when the task has already been completed resulted in about 9000 calls to logger.debug(). With this patch, which comments out some noisy/less useful logging and moves other logging calls outside loops, this number is reduced to 1000 calls. This results in cleaner logs and gives a small but measurable 0.15s speedup. The log size dropped from 900kb to 160kb. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-03-09 | runqueue.py: Gracefully handle a missing worker process | Richard Purdie
If the worker has already gone missing (e.g. SIGTERM), we should gracefully handle the write failures at exit time rather than throwing ugly tracebacks. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-03-09 | runqueue.py: Handle worker disappearing gracefully | Richard Purdie
If the worker (or fakeworker) process disappears for some reason, the system doesn't currently even notice. To fix this, we call waitpid periodically, looking for exit events of our children. If these occur, we can gracefully shutdown the server. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-03-06 | runqueue: Fix typo | Richard Purdie
slef.self is clearly meant to be self, fix typo. Otavio spotted and reported, thanks. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-02-24 | runqueue: Catch ValueError from pickle.loads | Martin Jansa
* An exception like this keeps spinning quite quickly, generating GBs of logs; better to kill it asap and show the invalid pickle. Signed-off-by: Martin Jansa <Martin.Jansa@gmail.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-02-15 | runqueue: Fix silly variable overlap | Richard Purdie
A previous commit of mine used the target variable for two different uses resulting in a lot more sstate being installed than is needed. Fix the variable to use two different names and unbreak the setscene behaviour. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-02-13 | runqueue: Ensure we do run 'masked' setscene tasks if specified as targets | Richard Purdie
If you specify multiple targets on bitbake's commandline and some of them are setscene tasks which are "masked" by other tasks, they may not get run. For example, with <image>:do_rootfs <kernel>:do_populate_sysroot the rootfs task "masks" the populate_sysroot task so bitbake would currently decide not to run it. In this case, we do really want it to be run. The fix is not to skip anything which has been given as an explicit target. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-02-11 | runqueue: Fix setscene hard dependency problems | Richard Purdie
Commit c54e738e2b5dc0d8e6fd8e93b284ed96e7a83051 added in the idea of hard dependencies such as the case a setscene has a hard dependency on pseudo-native and that dependency wasn't available from sstate for some reason. Unfortunately the implementation was a bit too enthusiastic, causing rebuilds of things when it wasn't necessary. A test case was:
bitbake quilt-native
bitbake quilt-native -c clean
bitbake <some-image>
and then you'd watch quilt-native get rebuilt for no good reason. The clue to the problem is in the for loop where it never depends on the item being iterated over. The fix is to include the exact list of hard dependencies rather than guessing. With these changes, the use case above works, the one in the original commit also works. This patch also adds in or cleans up various pieces of logging to allow issues like this to be more easily debugged in future.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-01-31 | runqueue: Fix race against tasks sharing stamp files | Richard Purdie
Shared work directories work by assuming bitbake will not run more than one task with a specific stamp name. Recent runqueue optimisations accidentally broke this meaning there could be races. This fixes the code. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-01-27 | runqueue: Simplify pointless len() usage | Richard Purdie
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-01-21 | runqueue: Only attempt to print closest matching task if there is a match | Richard Purdie
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2013-12-20 | runqueue: Further extend bitbake -S output to view signature differences | Richard Purdie
Based upon the list of difference starting points, we can use the siggen.find_siginfo() function call and the difference printing code to provide a list of differences between the current build target and whatever can be obtained from the sstate cache. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2013-12-19 | runqueue: Fix data being written into siginfo/sigdata files | Richard Purdie
The way hash_deps was being generated was different to the way siggen generated the data internally, which led to seemingly different sigdata/siginfo files for the same checksum. The -S output was correct but the files written during builds contained superfluous data which would look like a difference. This patch removes the badly duplicated data and uses it from the source, which ensures it's consistent. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2013-12-18 | runqueue: Add output for -S option for listing the changepoints compared with an sstate cache | Richard Purdie
It's useful to understand where the delta starts against an existing sstate cache for a given target. Adding this to the output of the -S option seems like a natural fit. We use the hashvalidate function to figure this out and assume it can find siginfo files for more than just the setscene tasks. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2013-11-26 | bitbake: Share BB_TASKDEPDATA with tasks | Richard Purdie
Currently tasks have no knowledge of which other tasks they depend upon. This makes it impossible to do at least two things which would be desirable/interesting:
a) Have the ability to create per recipe sysroots
b) Allow the aclocal files to be present only for the entries in DEPENDS (directly and indirectly)
By exporting task data through this new variable, tasks can inspect their dependencies and then take actions based upon this.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2013-11-26 | runqueue: Optimise next_buildable_task() | Richard Purdie
This unlikely looking function was found to be eating a lot of CPU time since it gets called once per trip through the idle loop if we're not running a maximum number of processes. This was particularly true in world builds of 13,000 tasks. Calling the computation code is pretty pointless because until some other task finishes nothing is going to become available to build. We can know when things become available so this patch teaches the scheduler this knowledge. It also:
* skips any computation when nothing can be built
* if there is only one available item to build, ignores the priority map
* precomputes the stamp filenames, rather than doing it every time
* saves the length of the array rather than calculating it each time (the extra function overhead is significant)
Timing wise, initially, 5000 iterations through here was 20s; with the patch, 200000 calls take the same time. The end result is that builds get up and running faster.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2013-11-22 | runqueue/bitbake-worker: Fix dry run fakeroot issues | Richard Purdie
When using the dry run option (-n), bitbake would still try and fire a specific fakeroot worker. This is doomed to failure since it might well not have been built. Add in some checks to prevent the failures. [YOCTO #5367] Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2013-11-20 | runqueue: Fix hole in setsceneverify skipped task logic | Richard Purdie
We have do_bundle_initramfs which is a task inserted after compile and before build. It is not covered by sstate. If we run a build with a valid sstate cache present, the setsceneverify function realises it will rerun the do_compile step (due to the bundle_initramfs task) and hence marks do_populate_sysroot to rerun. do_install, a dependency of do_populate_sysroot, is left marked as covered by sstate. What we need to do is traverse the dependency tree for any setsceneverify invalidated task and ensure any dependencies are also invalidated. We can stop at any point we reach another setscene task though. This means the do_populate_sysroot task has the data from do_install available and doesn't crash. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
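The traversal described above can be sketched as a small graph walk. Everything here is an assumption made for illustration (plain dicts and sets keyed by task name, the function name, the "covered" set); runqueue.py's real data structures differ.

```python
def invalidate_with_deps(start, depends, covered, setscene_tasks):
    """Return `covered` minus `start` and its transitive dependencies.

    The walk follows depends[task] edges and stops whenever it reaches
    another setscene task, which stays covered.
    """
    remaining = set(covered)
    stack = [start]
    seen = set()
    while stack:
        task = stack.pop()
        if task in seen:
            continue
        seen.add(task)
        remaining.discard(task)  # this task must now really run
        for dep in depends.get(task, ()):
            if dep in setscene_tasks:
                continue  # boundary: another setscene task covers this subtree
            stack.append(dep)
    return remaining
```

For a hypothetical chain a:do_populate_sysroot -> a:do_install -> a:do_compile -> b:do_populate_sysroot, invalidating a:do_populate_sysroot also uncovers a:do_install and a:do_compile but leaves b:do_populate_sysroot covered.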
2013-09-18 | bitbake: runqueue: add task hash to Queue events | Alexandru DAMIAN
Adding the sstate-related hash for all runqueue and scenequeue tasks, as it's needed in the WebHob data. Signed-off-by: Alexandru DAMIAN <alexandru.damian@intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2013-09-18 | bitbake: build, runqueue: adds info to the *runQueue* events | Alexandru DAMIAN
This patch adds task identifying information for all runQueue and sceneQueue events, and for bb.build.Task* events. This will allow matching events to specific tasks in the UI handlers processing these events. Adds RunQueueData functions to get the task name and task file for usage with the runQueue* events. Adds taskfile and taskname properties to bb.build.TaskBase. Adds taskfile and taskname properties to the *runQueue* events. Signed-off-by: Alexandru DAMIAN <alexandru.damian@intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2013-09-18 | bitbake: cooker,runqueue: send the task dependency tree | Alexandru DAMIAN
Adding a CookerFeature that allows UIs to enable receiving a dependency tree once the task data has been computed and the runQueue is ready to start. This will allow the clients to display dependency data in an efficient manner, and not recompute the runqueue specifically to get the dependency data. Signed-off-by: Alexandru DAMIAN <alexandru.damian@intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2013-09-16 | runqueue: add runQueueTaskSkipped event | Alexandru DAMIAN
Adding a runQueueTaskSkipped event to notify about tasks that are not run, either because they are set-scened or because they don't need an update (timestamp was ok). Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2013-09-10 | bitbake: runqueue: add sceneQueueTaskCompleted event | Alexandru DAMIAN
Adding an event to be fired when a scene task is completed. It is analogous to the run task completed event, and has been missing for some reason. Signed-off-by: Alexandru DAMIAN <alexandru.damian@intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2013-09-09 | runqueue.py: check whether multiple versions of the same PN are due to be built | Robert Yang
There would be a race issue if we:
$ bitbake make-3.81 make-3.82
This is because they are being built at the same time, which would cause unexpected problems, for example:
[snip]
ERROR: Package already staged (/path/to/tmp/sstate-control/manifest-qemux86-make.populate-sysroot)?!
ERROR: Function failed: sstate_task_postfunc
[snip]
Or there would be a Python stack trace such as:
[snip]
*** 0004: mfile = open(manifest)
0005: entries = mfile.readlines()
0006: mfile.close()
0007:
0008: for entry in entries:
Exception: IOError: [Errno 2] No such file or directory: xxx
[snip]
[YOCTO #5094]
We can quit earlier to avoid this kind of issue when two versions of the same PN are going to be built since this isn't supported.
Signed-off-by: Robert Yang <liezhi.yang@windriver.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
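An early check of this kind could look roughly like the following. It is a hedged sketch only: the function name and the (PN, version) tuple input are hypothetical, and the actual patch works on bitbake's own task data rather than a plain list.

```python
from collections import defaultdict


def find_duplicate_pns(targets):
    """Map each PN scheduled in more than one version to its sorted versions.

    `targets` is an iterable of (pn, version) pairs.
    """
    versions = defaultdict(set)
    for pn, pv in targets:
        versions[pn].add(pv)
    return {pn: sorted(pvs) for pn, pvs in versions.items() if len(pvs) > 1}
```

A caller would abort before starting any task if the returned mapping is non-empty, for instance when both make 3.81 and make 3.82 are requested.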
2013-09-06 | bitbake: Ensure ${DATE} and ${TIME} are consistent | Peter Kjellerstedt
Due to the worker split the ${DATE} and ${TIME} variables could end up with different values for different workers. E.g., a task like do_rootfs that is run within a fakeroot environment had a slightly different view of the time than another task that was not fakerooted which made it impossible to correctly refer to the image generated by do_rootfs from the other task. Signed-off-by: Peter Kjellerstedt <peter.kjellerstedt@axis.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2013-09-04 | bitbake-worker: ensure BUILDNAME is available during execution | Paul Eggleton
BUILDNAME is set from cooker by default, so since the worker split it will not be set when executing functions. In OpenEmbedded this results in /etc/version (which is populated from BUILDNAME) not having any content. Pass this variable value through to the worker explicitly to fix the issue. Fixes [YOCTO #4818]. Signed-off-by: Paul Eggleton <paul.eggleton@linux.intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2013-09-02 | runqueue: Fix scenequeue to pass file descriptors, not a float | Richard Purdie
This was missed off in a previous patch. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2013-09-01 | server/process, server/xmlrpc, runqueue: Use select.select() on fds, not time.sleep() | Richard Purdie
The existing backend server implementations were inefficient since they were sleeping for the full length of the timeouts rather than being woken when there was data ready for them. It was assumed they would wake, and perhaps did when we forked processes directly, but that is no longer the case. This updates both the process and xmlrpc backends to wait using select(). This does mean we need to pass the file descriptors to wait on from the internals who know which these file descriptors are, but this is a logical improvement. Tests of a pathological load on the process server of ~420 rapid tasks executed on a server with BB_NUMBER_THREAD=48 went from a wall clock measurement of the overall command execution time of 75s to a much more reasonable 24s. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
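The difference between sleeping for a full timeout and waiting on file descriptors can be sketched with the standard select module. The helper names and the pipe used as a stand-in for a worker's output are assumptions for the example, and os.pipe() with select() assumes a POSIX system.

```python
import os
import select
import time


def wait_for_fds(fds, timeout):
    """Block until one of `fds` is readable or `timeout` seconds pass."""
    readable, _, _ = select.select(fds, [], [], timeout)
    return readable


def demo():
    # A pipe stands in for a worker's output channel.
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, b"done")  # data is already pending
        start = time.monotonic()
        readable = wait_for_fds([read_fd], 5.0)
        elapsed = time.monotonic() - start
        return readable == [read_fd], elapsed
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

Because data is already pending, select() returns well before the 5 second timeout, whereas time.sleep(5.0) would always wait out the full interval.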
2013-08-16 | runqueue: report close matches for an invalid task name | Paul Eggleton
Help to pick up mistakes such as "bitbake -c cleanstate xyz" (instead of "bitbake -c cleansstate xyz".) Signed-off-by: Paul Eggleton <paul.eggleton@linux.intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
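One way to produce such suggestions is the standard library's difflib.get_close_matches(). The sketch below assumes that approach; the function name, the task list, and the 0.7 cutoff are illustrative choices, not values confirmed by this log.

```python
import difflib


def suggest_tasks(requested, known_tasks, cutoff=0.7):
    """Return known task names that closely resemble `requested`, best first."""
    return difflib.get_close_matches(requested, known_tasks, n=3, cutoff=cutoff)


# Hypothetical task list for illustration only.
KNOWN_TASKS = ["do_fetch", "do_unpack", "do_compile", "do_clean", "do_cleansstate"]
matches = suggest_tasks("do_cleanstate", KNOWN_TASKS)
```

Here the best-ranked suggestion for the mistyped do_cleanstate is do_cleansstate, which is the kind of hint the commit describes.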