path: root/lib/bb/server/process.py
Age | Commit message | Author
2017-08-31 | cooker: Change to consistent prefile/postfile handling | Richard Purdie
Currently the original prefile and postfile passed when starting bitbake server are 'sticky'. With the new memory resident model this doesn't make sense as the server the system is started with isn't special. This patch changes the code so the prefile/postfile are used if specified on the commandline and not used otherwise. This makes the behaviour much more predictable and expected and as an added bonus simplifies the code. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-08-24 | process: Clean up connection retry logic | Richard Purdie
It's possible for a client to connect to the server as it's shutting down but before it has removed the socket file. This patch:
a) Removes the socket file earlier to avoid connections.
b) Handles EOFError in initial connections gracefully. These occur if the socket is closed during the server shutdown.
c) Ensures duplicate events aren't shown on the console. Duplicate events make debugging these issues very confusing.
With these changes the backtrace that was concerning users is hidden and the server works as expected with a reconnect when it catches it in a bad state.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-08-21 | process: Ensure we call select() to know which fds to read | Richard Purdie
There is an interesting bug in the current code where a sync command is not seen until the current async command completes, by which time the UI may have shut down. The reason is that if there are idle commands, we may not end up sleeping in the select call at all, particularly under heavy load like parsing. Fix this by calling select with a zero timeout so that we see active fds and know to read from them. This fixes various problems toaster was having with the recent server changes. [YOCTO #11898] Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
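The zero-timeout idea described in this commit can be sketched as follows. This is an illustration only, not the BitBake code: `poll_ready`, the `busy` flag and the 0.1s idle sleep are assumptions made for the example.

```python
import os
import select


def poll_ready(fds, busy, idle_sleep=0.1):
    """Return the file descriptors that are readable.

    Hypothetical helper: when idle work is pending (busy=True) the loop
    must not sleep, but select() is still called with a zero timeout so
    that pending input on any fd is noticed.
    """
    timeout = 0 if busy else idle_sleep
    ready, _, _ = select.select(fds, [], [], timeout)
    return ready


# A pipe stands in for the channel a sync command would arrive on.
r, w = os.pipe()
os.write(w, b"sync-command")

# Even though the loop is busy, the readable fd is reported.
ready_fds = poll_ready([r], busy=True)

os.close(r)
os.close(w)
```

Skipping the `select()` call entirely while busy is what would hide the incoming command; the zero timeout keeps the loop non-blocking without losing that information.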
2017-08-15 | process: Increase server startup timeout | Richard Purdie
We're seeing the server fail to start within 8s on heavily loaded autobuilders so increase this timeout to 30s which should be more than enough time. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-08-14 | process: Improve client disconnects | Richard Purdie
There have been cases where the server could loop indefinitely and incorrectly handle client disconnects. In the EOFError case, ensure a full disconnect happens in the alternative disconnect path to avoid this. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-08-09 | server/process: Ensure we don't loop on client EOFError | Richard Purdie
The server currently crashes if we hit an EOFError due to controllersock still being in ready and the continue meaning ready isn't re-evaluated. Setting the value to False can mean the shutdown code doesn't handle the situation cleanly. Clear ready to avoid the crash/loop instead and handle any OSError whilst we're in here. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-08-08 | main: Handle BB_SERVER_TIMEOUT = -1 for no server timeout | Robert Yang
Make BB_SERVER_TIMEOUT = -1 mean the server never unloads. Signed-off-by: Robert Yang <liezhi.yang@windriver.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-08-08 | process: Fix disconnect when BB_SERVER_TIMEOUT | Robert Yang
Fixed:
    $ export BB_SERVER_TIMEOUT=10000
    $ bitbake --server-only
    $ bitbake --status-only
    [snip]
    File "/buildarea/lyang1/poky/bitbake/lib/bb/server/process.py", line 472, in recvfds
      msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(bytes_size))
    OSError: [Errno 9] Bad file descriptor
And:
    $ export BB_SERVER_TIMEOUT=10000
    $ bitbake --server-only -B localhost:-1
    $ bitbake --status-only # Everything is fine in first run
    $ bitbake --status-only
    [snip]
    File "/buildarea/lyang1/poky/bitbake/lib/bb/server/process.py", line 472, in recvfds
      msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(bytes_size))
    OSError: [Errno 9] Bad file descriptor
This was because self.controllersock was not set to False, so it still ran sock.recvmsg() when sock was closed. command_channel also needs to be set to False, otherwise self.command_channel.get() will always run on EOF and cause an infinite loop.
Signed-off-by: Robert Yang <liezhi.yang@windriver.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-31 | process: Add some extra server startup logs | Richard Purdie
We have cases where the server is being started but we're not seeing any messages from it. Add some earlier logging so we can try and better understand where issues may be occurring. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-31 | process: Reorder server command processing and handle EOFError | Richard Purdie
If the connection control socket and the command channel close together, we can race and hit EOFError exceptions before we close the channel. Reorder the code to handle this in the correct order and ignore the EOFError exceptions as they mean the client is disconnecting and shouldn't terminate the server. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-28 | process: Clean up server communication timeout errors | Richard Purdie
This timeout path was commonly hit due to errors starting the server. Now we have a better way to handle that, the retry logic can be improved and cleaned up. This patch:
* Makes the timeout 5s rather than intervals of 1s with a message. Paul noted some commands can take around 1s to run on a server which has just been started on a loaded system.
* Allows a broken connection to exit immediately rather than retrying something which will never work.
* Drops the Ctrl+C masking, we shouldn't need that anymore and any issues would be better handled in other ways.
This should make things clearer and less confusing for users and is much cleaner code too.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-28 | process: Don't leak open pipes upon reconnection | Richard Purdie
If we reconnect to the server, stop leaking pipes and clean up after ourselves. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-28 | process/cooker: Allow UI process to know if the cooker was started successfully | Richard Purdie
Currently if the server fails to start, the user sees no error message and attempts to start the server are repeated until some longer timeouts expire. There are error messages in the cookerdaemon log but nobody thinks to look there. Add in a pipe which can be used to tell the starting process whether the cooker did actually start or not. If it fails to start, no further attempts can be made and if present, the log file can be shown to the user. [YOCTO #11834] Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-28 | process: Move socket keep alive into BitBakeProcessServerConnection | Richard Purdie
This cleans up the socket keep alive into better class structured code and adds cleanup of the open file descriptors upon shutdown. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-28 | process: Allow BBUIEventQueue to exit cleanly | Richard Purdie
Currently the monitoring thread exits with some error code or runs indefinitely. Allow closure of the pipe it's monitoring to have the thread exit cleanly/silently. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-28 | process: Ensure ConnectionReader/Writer have fileno() and close() methods | Richard Purdie
Expose the underlying close() and fileno() methods which allow connection monitoring and cleanup. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-24 | process: Change timeout warning to a note | Richard Purdie
The warning message currently shown can occur more frequently than previously if a previous bitbake server is shutting down and we're reconnecting to a new server. Change it to a note message to match the higher level connection logging retry messages and so as not to interfere with selftests. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-24 | cooker/process: Drop server_main function | Richard Purdie
Now that there is only one server, this abstraction is no longer needed and causes indirection/confusion. The server shutdown is also broken with the cooker post_server calls happening too late, leading to "lock held" warnings in the logs if PRServ is enabled. Remove the abstraction and put the shutdown calls in the right order with respect to the locking. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-21 | server/process.py: fix self.bitbake_lock.write() | Robert Yang
There is no global var "configuration", so the old code hung at self.bitbake_lock.write(), and nothing was written to bitbake.lock. I didn't figure out why it hung without printing errors. Reproducer: $ bitbake -B localhost:-1 world -k Check bitbake.log, there was nothing, now fixed. Signed-off-by: Robert Yang <liezhi.yang@windriver.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-21 | server: Rework the server API so process and xmlrpc servers coexist | Richard Purdie
This changes the way bitbake server works quite radically. Now, the server is always a process based server with the option of starting an XMLRPC listener on a specific interface/port. Behind the scenes this is done with a "bitbake.sock" file alongside the bitbake.lock file. If we can obtain the lock, we know we need to start a server. The server always listens on the socket and UIs can then connect to this. UIs connect by sending a set of three file descriptors over the domain socket, one for sending commands, one for receiving command results and the other for receiving events. These changes mean we can throw away all the horrid server abstraction code, the pluggable transport option to bitbake and the code becomes much more readable and debuggable. It also likely removes a ton of ways you could hang the UI/cooker in weird ways due to all the race conditions that existed with previous processes.
Changes:
* The foreground option for bitbake-server was dropped. Just tail the log if you really want this, the codepaths were complicated enough without adding one for this.
* BBSERVER="autodetect" was dropped. The server will autostart and autoconnect in process mode. You have to specify an xmlrpc server address since that can't be autodetected. I can't see a use case for autodetect now.
* The transport/servertype option to bitbake was dropped.
* A BB_SERVER_TIMEOUT variable is added which allows the server to stay resident for a period of time after the last client disconnects before unloading. This is used if the -T/--idle-timeout option is not passed to bitbake.
This change is invasive and may well introduce new issues however I believe the codebase is in a much better position for further development and debugging.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
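Passing file descriptors over a Unix domain socket, as this commit describes, can be sketched with the standard `SCM_RIGHTS` mechanism. The helpers `send_fds` and `recv_fds`, the use of `socketpair()` instead of a real "bitbake.sock" file, and the choice of which pipe ends are handed over are all assumptions for illustration; this is not the BitBake implementation. It requires a Unix-like system.

```python
import array
import os
import socket


def send_fds(sock, fds):
    """Send file descriptors as SCM_RIGHTS ancillary data (hypothetical helper)."""
    sock.sendmsg([b"x"],
                 [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", fds))])


def recv_fds(sock, count):
    """Receive up to `count` file descriptors (hypothetical helper)."""
    fds = array.array("i")
    msg, ancdata, flags, addr = sock.recvmsg(
        1, socket.CMSG_LEN(count * fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any truncated trailing bytes.
            fds.frombytes(data[:len(data) - (len(data) % fds.itemsize)])
    return list(fds)


# socketpair() stands in for a client connected to the server's socket.
ui_sock, server_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# Three pipes: commands, command results, events.
cmd_r, cmd_w = os.pipe()
reply_r, reply_w = os.pipe()
event_r, event_w = os.pipe()

# The UI hands the server the ends it needs and keeps the others.
send_fds(ui_sock, [cmd_r, reply_w, event_w])
srv_cmd_r, srv_reply_w, srv_event_w = recv_fds(server_sock, 3)

os.write(cmd_w, b"ping")                 # UI sends a command
received_command = os.read(srv_cmd_r, 4)  # server reads it
os.write(srv_event_w, b"event")          # server emits an event
received_event = os.read(event_r, 5)      # UI receives it

for fd in (cmd_r, cmd_w, reply_r, reply_w, event_r, event_w,
           srv_cmd_r, srv_reply_w, srv_event_w):
    os.close(fd)
ui_sock.close()
server_sock.close()
```

The received descriptors are new numbers in the receiving side that refer to the same underlying pipes, which is why the two sides can talk without any further transport code.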
2017-07-18 | server: Remove base classes and inline code | Richard Purdie
In preparation for rewriting this code, expand the relatively useless base classes into the code itself. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-18 | event/command: Allow UI to request the UI eventhander ID | Richard Purdie
The UI may want to change its event mask however to do this, it needs the event handler's ID. Tweak the code to allow this to be stored and add a command to query it. Use the new command in the process server backend. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-18 | bb/server/process: Handle EINTR on idle_commands select | Aníbal Limón
If a signal such as SIGWINCH is sent, the select could be interrupted, so ignore the InterruptedError as in the XMLRPC server [1]. [1] http://git.yoctoproject.org/cgit/cgit.cgi/poky/tree/bitbake/lib/bb/server/xmlrpc.py#n307 Signed-off-by: Aníbal Limón <anibal.limon@linux.intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-07 | event: Queue offline events for the UI | Richard Purdie
Messages printed when no UI is connected (e.g. memres) are currently lost. Use the existing queue mechanism to queue these until a UI attaches, then replay them. This isn't ideal but better than the current situation of losing them entirely. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-07 | server/process: Fix waitEvent() calls with 0 timeout | Richard Purdie
You might think Queue.Queue.get(True, 0) would return an event immediately if present and otherwise return. It doesn't, it immediately "times out" and returns with nothing from the queue. The behaviour we want is not to wait but return anything present which is what .get(False) does so map to this. This fixes some odd behaviour observed in some of the tinfoil selftests. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
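The mapping this commit describes, where a zero timeout becomes a non-blocking `get(False)`, might look roughly like the sketch below. `wait_event` is a hypothetical stand-in for `waitEvent()`, and the standard library `queue.Queue` is used here only to keep the example self-contained; how a blocking `get()` with a zero timeout behaves can differ between queue implementations, so the sketch avoids relying on it at all.

```python
import queue


def wait_event(event_queue, timeout):
    """Return the next event, or None if none arrives in time.

    Hypothetical stand-in for waitEvent(): a timeout of 0 means
    "do not wait, but return anything already queued", which is
    expressed as a non-blocking get(False).
    """
    try:
        if timeout == 0:
            return event_queue.get(False)
        return event_queue.get(True, timeout)
    except queue.Empty:
        return None


events = queue.Queue()
events.put("BuildStarted")

first = wait_event(events, 0)    # an event is present: returned at once
second = wait_event(events, 0)   # queue now empty: returns None, no wait
```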
2017-01-23 | bb/server/process.py: ProcessEventQueue add close of _writer pipe | Aníbal Limón
Explicitly call close() on _writer to avoid fd leakage, because it isn't called by Queue.close(). [YOCTO #10873] Signed-off-by: Aníbal Limón <anibal.limon@linux.intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-12-16 | bitbake: remove True option to getVar calls (take 2) | Joshua Lock
getVar() now expands by default, thus remove the True option from getVar() calls with a regex search and replace. Search made with the following regex: getVar ?\(( ?[^,()]*), True\) (a follow on patch to fix up a few recent introductions) Signed-off-by: Joshua Lock <joshua.g.lock@intel.com> Signed-off-by: Ross Burton <ross.burton@intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
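The search pattern quoted in this commit can be tried out with Python's `re` module. The pattern is taken from the commit message; the replacement string `getVar(\1)` and the helper name `drop_true` are assumptions, since the message only gives the search side.

```python
import re

# Search pattern as quoted in the commit message.
PATTERN = re.compile(r"getVar ?\(( ?[^,()]*), True\)")


def drop_true(source):
    """Remove a trailing ', True' argument from simple getVar() calls.

    The replacement is an assumption; the commit only states the regex
    that was searched for.
    """
    return PATTERN.sub(r"getVar(\1)", source)


before = (
    "pn = d.getVar('PN', True)\n"
    "raw = d.getVar('SRC_URI', False)\n"
)
after = drop_true(before)
```

Because the captured group excludes commas and parentheses, calls whose first argument is itself a nested call or that pass `False` are left untouched.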
2016-12-14 | server/process: don't change UI process signal handler on terminate | Paul Eggleton
On terminating the connection to the server, we were disabling SIGINT - and this is executed on the UI side. I'm not sure whether the intention here was to undo the SIGINT disabling we did in the server, and it was just a mistake that it disabled rather than restored and it's run on the wrong side, or whether we wanted to stop the user from breaking out of the shutdown code - the commit message provides no clues either way. Regardless, we do not want to permanently disable Ctrl+C here - it's legitimate to terminate the connection to the server and then re-establish it within the same process; at least currently, devtool modify by virtue of using tinfoil in two separate parts of the code does this, and the result of this disabling is that during the second tinfoil usage we can potentially be parsing all recipes without the ability to easily interrupt the process. Signed-off-by: Paul Eggleton <paul.eggleton@linux.intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-12-07 | cooker process: fire heartbeat event at regular time intervals | Patrick Ohly
The intended usage is for recording current system statistics from /proc in buildstats.bbclass during a build and for improving the BB_DISKMON_DIRS implementation. All other existing hooks are less suitable because they trigger at unpredictable rates: too often can be handled by doing rate-limiting in the event handler, but not often enough (for example, when there is only one long-running task) cannot because the handler does not get called at all. The implementation of the new heartbeat event hooks into the cooker process event queue. The process already wakes up every 0.1s, which is often enough for the intentionally coarse 1s delay between heartbeats. That value was chosen to keep the overhead low while still being frequent enough for the intended usage. If necessary, BB_HEARTBEAT_EVENT can be set to a float specifying the delay in seconds between these heartbeat events. Signed-off-by: Patrick Ohly <patrick.ohly@intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
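The heartbeat scheme described here, a loop that wakes every 0.1s and fires an event roughly once per second, could be sketched like this. The `Heartbeat` class and its `poll` method are hypothetical names; the timestamps are passed in explicitly so the behaviour is easy to follow, whereas a real loop would read a clock.

```python
class Heartbeat:
    """Decide when a coarse periodic event is due (illustrative only)."""

    def __init__(self, interval=1.0):
        self.interval = interval   # seconds between heartbeats
        self.next_beat = None      # time at which the next one is due

    def poll(self, now):
        """Return True if a heartbeat should fire at time `now`."""
        if self.next_beat is None:
            # First call: schedule the first heartbeat, do not fire yet.
            self.next_beat = now + self.interval
            return False
        if now >= self.next_beat:
            self.next_beat = now + self.interval
            return True
        return False


hb = Heartbeat(interval=1.0)

# Simulate an event loop waking every 0.1s for three seconds.
wakeups = [i / 10 for i in range(31)]
beats = [t for t in wakeups if hb.poll(t)]
```

Checking on every wakeup keeps the heartbeat independent of how busy the rest of the loop is, which is the property the commit message asks for.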
2016-06-01 | bitbake: Convert to python 3 | Richard Purdie
Various misc changes to convert bitbake to python3 which don't warrant separation into separate commits. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-05-13 | server/process: Fix missing log messages issue | Richard Purdie
Currently if the server dies, it's possible that log messages are never displayed which is particularly problematic if one of those messages is the exception and backtrace the server died with. Rather than having the event queue exit as soon as the server disappears, we should pop events from the queue until it's empty before exiting. This patch tweaks that code so that even if the server is dead and we're going to exit, we return any events left in the pipe. This makes debugging certain failures much easier. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-03-24 | bitbake: xmlrpc: set single use mode differently | Ed Bartosh
Currently the xmlrpc server implicitly sets itself into single use mode when the bitbake server is started with an anonymous port (0) or no port is provided on the command line. In this mode bitbake shuts down the xmlrpc server after the build is done. This assumption is incorrect in some cases. For example Toaster uses bitbake in this mode and expects the xmlrpc server to stay in memory. Until recent changes, single use mode was always unset due to a bug. When the bug was fixed it broke toaster builds as Toaster couldn't communicate with the bitbake server in single use mode. Reimplemented the logic for setting single use mode. The mode is explicitly set when the --server-only command line parameter is not provided to bitbake. It doesn't depend on the port number anymore. [YOCTO #9275] [YOCTO #9240] [YOCTO #9252] Signed-off-by: Ed Bartosh <ed.bartosh@linux.intel.com> Signed-off-by: Elliot Smith <elliot.smith@intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-03-09 | server/process: Try connecting 4 times before giving up | Lucas Dutra Nunes
Instead of trying one time with a timeout of 20 seconds try 4 times with a timeout of 5 seconds, to account for a slow server start. Signed-off-by: Lucas Dutra Nunes <ldnunes@ossystems.com.br> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
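A retry loop of the shape this commit describes, four attempts with a 5 second timeout each, might be sketched as follows. `connect_with_retries` and `flaky_connect` are hypothetical; the real code connects to the server, whereas this example uses a fake connect function so it can run anywhere.

```python
def connect_with_retries(connect, attempts=4, timeout=5):
    """Call connect(timeout) up to `attempts` times (attempts >= 1).

    Illustrative helper: returns the first successful result and
    re-raises the last OSError if every attempt fails.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return connect(timeout)
        except OSError as exc:
            last_error = exc
    raise last_error


# Fake connection function: the "server" is ready on the third try.
calls = []


def flaky_connect(timeout):
    calls.append(timeout)
    if len(calls) < 3:
        raise ConnectionRefusedError("server not ready yet")
    return "connected"


result = connect_with_retries(flaky_connect)
```

Several short attempts give a slow-starting server more than one chance while keeping the total wait the same as a single 20 second timeout.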
2016-01-29 | bitbake: Set process names to be meaninful | Richard Purdie
This means that when you view the process tree, the processes have meaningful names, aiding debugging:
    $ pstree -p 30021
    bash(30021)───KnottyUI(115579)───Cooker(115590)─┬─PRServ(115592)───{PRServ Handler}(115593)
                                                    ├─Worker(115630)───bash:sleep(115631)───run.do_sleep.11(115633)───sleep(115634)
                                                    └─{ProcessEQueue}(115591)
    $ pstree -p 30021
    bash(30021)───KnottyUI(117319)───Cooker(117330)─┬─Cooker(117335)
                                                    ├─PRServ(117332)───{PRServ Handler}(117333)
                                                    ├─Parser-1:2(117336)
                                                    └─{ProcessEQueue}(117331)
Applies to parse threads, PR Server, cooker, the workers and execution threads, working within the 16 character limit as best we can. Needed to tweak the bitbake-worker magic values to tell the workers apart.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2015-09-08 | server/process: Handle SIGTERM more gracefully | Richard Purdie
Currently if you send a SIGTERM to the bitbake UI process, the system basically hangs if tasks are executing. This is because the server process doesn't actually try any kind of shutdown before exiting. This patch tries executing a stateForceShutdown command first, which is enough to stop any active tasks before the system exits. I also noticed that terminate can execute multiple times, once at SIGTERM from the handler and once from the real exit. Double execution leads to stack traces and potential hangs (writes to dead pipes), so ensure the code only can run once. With these fixes, bitbake much more correctly deals with SIGTERM to the UI process. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
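The "only run once" guard mentioned in this commit can be illustrated with a small sketch. The `Connection` class, its `terminated` flag and the `log` list are hypothetical; only the `stateForceShutdown` command name comes from the commit message.

```python
class Connection:
    """Illustrative connection whose terminate() is safe to call twice."""

    def __init__(self):
        self.terminated = False
        self.log = []   # records what the shutdown path did

    def terminate(self):
        # A second call (e.g. from both a SIGTERM handler and the normal
        # exit path) becomes a no-op instead of writing to dead pipes.
        if self.terminated:
            return
        self.terminated = True
        self.log.append("stateForceShutdown")   # stop active tasks first
        self.log.append("close-channels")


conn = Connection()
conn.terminate()   # from the signal handler
conn.terminate()   # from the real exit: does nothing
```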
2015-09-03 | event/server: Add _uiready flag to handle missing error messages | Richard Purdie
If you start and suspend a bitbake execution so the bitbake lock is held, then try and run "bitbake -w '' X", you will see bitbake return an error exit code but print no message about what happened at all. The reason is that the -w option creates a "UI" which swallows the messages. The code which handles this exit failure mode thinks a UI has printed the messages and therefore doesn't do so. This adds in an extra parameter to the UI registration code so that we can figure out whether it's a primary UI or not and base decisions on whether to display information on that instead. This fixes the error shown above and some bizarre failures on the Yocto Project Autobuilder. [YOCTO #8239] Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2015-08-17 | Fix default function parameter assignment to a list | Paul Eggleton
With python you should not assign a list as the default value of a function parameter - because a list is mutable, the result will be that the first time a value is passed it will actually modify the default. Reference: http://docs.python-guide.org/en/latest/writing/gotchas/#mutable-default-arguments Signed-off-by: Paul Eggleton <paul.eggleton@linux.intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
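The pitfall this commit fixes is a general Python one: a default value is evaluated once, when the function is defined, so a mutable default is shared by every call that relies on it. The function names below are made up for the example.

```python
def add_bad(item, bucket=[]):
    """Buggy: the same default list object is reused across calls."""
    bucket.append(item)
    return bucket


def add_good(item, bucket=None):
    """Fixed: use None as a sentinel and create a fresh list per call."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


# Copies are taken so later calls cannot change what was recorded.
bad_first = list(add_bad("a"))
bad_second = list(add_bad("b"))    # still contains "a" from the first call

good_first = add_good("a")
good_second = add_good("b")        # independent of the first call
```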
2015-06-19 | server/process: Don't log BBHandledException | Richard Purdie
If we see a BBHandledException in the idle handler, the understanding is the system handled it, printing a log and traceback is just confusing. Therefore only print these in the cases where it's an unknown/unhandled exception. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2015-03-10 | cooker/server: Fix up 100% CPU usage at idle | Richard Purdie
The recent inotify changes are causing a 100% cpu usage issue in the idle handlers. To avoid this, we update the idle functions to optionally report a float value which is the delay before the function needs to be called again. 1 second is fine for the inotify handler, in reality it's more like 0.1s due to the default idle function sleep. This reverts performance regressions of 1.5 minutes on a kernel build and ~5-6 minutes on an image from scratch. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
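The idea of idle functions reporting a delay can be sketched as below. Only the float return value comes from the commit message; the meaning given here to `True` (call again immediately) and `False` (finished), the helper name `run_idle_functions` and the 0.1s default sleep are assumptions made for the illustration.

```python
def run_idle_functions(functions, default_delay=0.1):
    """Call each idle function once and return how long the loop may sleep.

    Assumed conventions for the return value of an idle function:
      False -> finished, remove it
      True  -> wants to be called again immediately
      float -> seconds until it needs to be called again
    """
    next_sleep = default_delay
    for func in list(functions):
        retval = func()
        if retval is False:
            functions.remove(func)
        elif retval is True:
            next_sleep = 0
        elif isinstance(retval, float):
            next_sleep = min(next_sleep, retval)
    return next_sleep


# A handler asking for 1s is still polled at the 0.1s default.
sleep_slow = run_idle_functions([lambda: 1.0])
# A handler with more work pending removes the sleep entirely.
sleep_busy = run_idle_functions([lambda: True])
# A handler wanting a shorter delay than the default wins.
sleep_short = run_idle_functions([lambda: 0.05])
```

The first case mirrors the remark in the commit message that a 1 second request ends up closer to 0.1s in practice because of the default sleep.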
2015-01-23 | server/process: Fix select call | Richard Purdie
There was a report that bitbake -e | less would use 100% cpu when it shouldn't really. The issue appears to be a bogus file descriptor in the select call. We shouldn't be blocking if there is event data pending to a *reader* from server context. [YOCTO #7138] Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-09-02 | process: Ensure abnormal exits set an error level | Richard Purdie
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-08-22 | process: Further improve robustness against server shutdown | Richard Purdie
Currently, if an exception occurs in an event handler, the server shuts down but the UI simply hangs. This happens in two places, firstly waiting for events and secondly, sending events to a server which no longer exists. The latter does time out, the former does not. These patches improve both code sections to check if the main server process is alive and if not, trigger things to shut down gracefully. This avoids the timeout in the command sending case too. This resolves various cases where the UI would simply hang indefinitely. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-08-20 | process: Deal with infinite looping of the server | Richard Purdie
Currently if an exception occurs, we just run the idle handler again and again, usually looping indefinitely. Chances are the exception that occurred will keep occurring and this is not a good place to be. This was breaking the autobuilders with gigabytes of logs. At least improve things so the cooker shuts down gracefully when this happens. Some trace of the original problem may still be present on the console too! Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-03-09 | server/process: Optimise latency when finishing idle functions | Richard Purdie
When idle functions finish, it's likely we have some other work to do, so don't sleep in the select call but instead, skip it. This removes small amounts of latency in common commands. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-03-09 | server/process: Drop unnecessary exit delay | Richard Purdie
When the server exits, we no longer appear to need this delay. This is likely due to improvements in the various exit codepaths. There is therefore no longer any point in taking the latency hit. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-03-09 | server/process: Use a pipe for quit events instead of Event() | Richard Purdie
It's not possible to notice the change of status of an Event() in the select call we sleep in. It would be possible in python 3.3 but for now use a pipe instead. This removes small latency when bitbake commands finish since the system doesn't sit in the select call. (Issues of this kind become apparent when debugging if a long sleep is set for the select call.) Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
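Using a pipe as a quit signal that `select()` can see might look like the sketch below. The names `request_quit` and `wait_for_work`, the timer thread and the 5 second sleep are assumptions chosen to show the effect; this is not the BitBake code and it assumes a Unix-like system.

```python
import os
import select
import threading
import time

# The read end is watched by select(); writing to the other end wakes it.
quit_r, quit_w = os.pipe()


def request_quit():
    """Signal the loop to stop by making the pipe readable."""
    os.write(quit_w, b"q")


def wait_for_work(fds, timeout):
    """Sleep in select() until an fd is ready or quit is requested.

    Returns True if the quit pipe became readable.
    """
    ready, _, _ = select.select(list(fds) + [quit_r], [], [], timeout)
    return quit_r in ready


# Ask for a quit shortly after the loop starts a long sleep.
timer = threading.Timer(0.05, request_quit)
timer.start()

start = time.monotonic()
quit_seen = wait_for_work([], timeout=5.0)
elapsed = time.monotonic() - start
timer.join()

os.close(quit_r)
os.close(quit_w)
```

With a flag that `select()` cannot observe, the loop would sit out the whole timeout before noticing the request; with the pipe it returns as soon as the byte is written.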
2014-03-09 | server/process: Deal more gracefully with SIGTERM | Richard Purdie
Currently a SIGTERM to the UI process causes the UI simply to lock up. By setting an exit flag, the waitEvent can raise a SIGINT, allowing the UI to break out of the event loop and exit. Currently this results in a traceback but that is more desirable than a hanging process. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-03-09 | server/process: Use the setFeatures command on the server instead of a manger | Richard Purdie
The use of a manager in the process server causes some issues since it remains around for the lifetime of the server even though it's only used during initialisation and the system doesn't respond well to SIGTERM events to the extra process (and two threads) the implementation involves. Switching to a dedicated command simplifies the server process structure. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2013-09-18 | bitbake: cooker,xmlrpc,servers: implement CookerFeatures | Alexandru DAMIAN
Implementing feature set selection that allows a client to enable specific features in the server at connection time. Only enabling of features is supported, as there is no way to safely remove data loaded into the cooker. Once enabled, a feature will remain enabled for the life of the cooker. Client-server connection now supports specifying the feature set required by the client. This is implemented in the Process server using a managed proxy list, so the server cooker will now load dynamically needed features based on what client connects to it. In the XMLRPC server the feature set is requested by using a parameter for registerUIHandler function. This allows observer-only clients to also specify features for the server. The server code configuration now is completely separated from the client code. All hardcoding of client knowledge is removed from the server. The extra_caches is removed as the client can now specify the caches it needs using the feature. The UI modules now need to specify the desired featureSet. HOB is modified to conform to the featureSet specification. The only feature available is CookerFeatures.HOB_EXTRA_CACHES which forces loading the bb.cache_extra:HobRecipeInfo class. Signed-off-by: Alexandru DAMIAN <alexandru.damian@intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2013-09-13 | cooker: Rename confusing 'stop' state to 'forceshutdown' | Richard Purdie
The shutdown state causes the server to finish what it's doing, stop was then meant to completely stop it. It doesn't mean the server is stopped though. Renaming the current stop event for forceshutdown gives more meaning to what it actually does. The stopped namespace then becomes available to indicate a completely stopped server. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>