path: root/lib/hashserv
Age | Commit message | Author
2021-02-09 | hashserv: Add get-outhash message | Paul Barker
The get-outhash message can be sent via the get_outhash client method. This works in a similar way to the get message but looks up a db entry by outhash rather than by taskhash. It is intended to be used as a read-only form of the report message. As both handle_get_outhash and handle_report use the same query string we can factor this out. Signed-off-by: Paul Barker <pbarker@konsulko.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
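A minimal sketch of the idea described above: one query string shared by the lookup that a read-only get-outhash handler would perform. The table layout, column names, and `lookup_by_outhash` are assumptions made for illustration and are not taken from the hashserv source.

```python
import sqlite3

# Hypothetical, simplified schema; the real hashserv table layout may differ.
SCHEMA = """
CREATE TABLE tasks (
    method TEXT NOT NULL,
    taskhash TEXT NOT NULL,
    outhash TEXT NOT NULL,
    unihash TEXT NOT NULL
)
"""

# Shared query string, factored out so that more than one handler can reuse it.
OUTHASH_QUERY = """
SELECT taskhash, unihash FROM tasks
WHERE method = :method AND outhash = :outhash
ORDER BY rowid ASC LIMIT 1
"""


def lookup_by_outhash(db, method, outhash):
    """Read-only lookup keyed on outhash rather than taskhash."""
    row = db.execute(OUTHASH_QUERY, {"method": method, "outhash": outhash}).fetchone()
    if row is None:
        return None
    return {"taskhash": row[0], "unihash": row[1]}


db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
db.execute("INSERT INTO tasks VALUES ('m', 'task1', 'out1', 'uni1')")
```

A report-style handler could run the same `OUTHASH_QUERY` before deciding whether to insert a new row, which is the factoring the commit message refers to.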
2021-02-09 | hashserv: server: Support searching upstream for outhash | Paul Barker
Use the new get-outhash message to perform a read-only query against an upstream server (if present) when a reported taskhash/outhash combination is not found in the current database. If a matching entry is found upstream it is copied into the current database so it can be found by future queries. Signed-off-by: Paul Barker <pbarker@konsulko.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2021-02-09 | hashserv: Support read-only server | Paul Barker
The -r/--readonly argument is added to the bitbake-hashserv app. If this argument is given then clients may only perform read operations against the server. The read-only mode is implemented by simply not installing handlers for write operations; this keeps the permission model simple and reduces the risk of accidentally allowing write operations. As a sqlite database can be safely opened by multiple processes in parallel, it's possible to start two hashserv instances against a single database if you wish to export both a read-only port and a read-write port. Signed-off-by: Paul Barker <pbarker@konsulko.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
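The "do not install write handlers" approach can be sketched as follows. The `Server` class, its in-memory store, and the message names used as dictionary keys are hypothetical simplifications, not the actual hashserv server.

```python
class Server:
    """Toy dispatcher: a read-only instance never registers write handlers."""

    def __init__(self, read_only=False):
        self.store = {}
        # Read handlers are always installed.
        self.handlers = {"get": self.handle_get}
        # Write handlers are installed only when the server is writable, so a
        # read-only server has no code path that can modify the store.
        if not read_only:
            self.handlers["report"] = self.handle_report

    def handle_get(self, msg):
        return self.store.get(msg["taskhash"])

    def handle_report(self, msg):
        self.store[msg["taskhash"]] = msg["unihash"]
        return msg["unihash"]

    def dispatch(self, name, msg):
        handler = self.handlers.get(name)
        if handler is None:
            raise ValueError("unsupported message: %s" % name)
        return handler(msg)
```

Because permission is decided once at construction time, no individual handler needs its own read-only check.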
2021-02-05 | hashserv: client: Fix handling of null responses | Paul Barker
If the server returns an empty response ("null" in json), this cannot be iterated to check for the presence of the "chunk-stream" key. Signed-off-by: Paul Barker <pbarker@konsulko.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
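The failure mode is easy to reproduce: `json.loads("null")` returns `None`, and a membership test against `None` raises `TypeError`. A minimal sketch of a guard, where `is_chunk_stream` is a hypothetical helper name:

```python
import json


def is_chunk_stream(raw):
    """Return True if the decoded response announces a chunked stream."""
    response = json.loads(raw)
    # json.loads("null") yields None, and `"chunk-stream" in None` would raise
    # TypeError, so rule out None before the membership test.
    return response is not None and "chunk-stream" in response
```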
2020-12-10 | hashserv: Fix broken AF_UNIX path length limit | Joshua Watt
Fixes the bug where long paths would break Unix domain socket clients (for real this time; the previous attempt was missing os.path.basename). Adds some tests to prevent regressions. Signed-off-by: Joshua Watt <JPEWhacker@gmail.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2020-12-03 | hashserv: client: Fix AF_UNIX path length limits | Joshua Watt
Restores a fix for unix domain socket path length limits when using the synchronous hash equivalence client that was accidentally removed when the async client was added. Unfortunately, it's much more difficult to fix the same problem when using the async client directly due to the interaction of chdir() and async code, but this will at least restore the old behavior in the synchronous case. Signed-off-by: Joshua Watt <JPEWhacker@gmail.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2020-11-20 | bitbake: hashserve: Add support for readonly upstream | Joshua Watt
Adds support for an upstream server to be specified. The upstream server will be queried for equivalent hashes whenever a miss is found in the local server. If the server returns a match, it is merged into the local database. In order to keep the get stream queries as fast as possible since they are the critical path when bitbake is preparing the run queue, missing tasks provided by the server are not immediately pulled from the upstream server, but instead are put into a queue to be backfilled by a worker task later. Signed-off-by: Joshua Watt <JPEWhacker@gmail.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
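The deferred backfill can be sketched with an `asyncio.Queue` and a worker task. This is a deliberately simplified illustration: plain dictionaries stand in for the local database and the upstream server, all function names are hypothetical, and the fast path here simply reports a miss, which may not match what the real server returns to the client.

```python
import asyncio


async def backfill_worker(queue, upstream, local):
    """Copy queued misses from the upstream mapping into the local one."""
    while True:
        taskhash = await queue.get()
        try:
            if taskhash is None:  # sentinel: stop the worker
                return
            unihash = upstream.get(taskhash)
            if unihash is not None:
                local.setdefault(taskhash, unihash)
        finally:
            queue.task_done()


async def get_stream(taskhashes, local, queue):
    """Answer from local data only; defer upstream copies to the worker."""
    results = []
    for taskhash in taskhashes:
        unihash = local.get(taskhash)
        if unihash is None:
            queue.put_nowait(taskhash)  # never blocks the fast path
        results.append(unihash)
    return results


async def demo():
    upstream = {"t1": "u1"}
    local = {}
    queue = asyncio.Queue()
    worker = asyncio.ensure_future(backfill_worker(queue, upstream, local))
    first = await get_stream(["t1", "t2"], local, queue)
    await queue.join()  # wait for the backfill to drain
    second = await get_stream(["t1"], local, queue)
    queue.put_nowait(None)
    await worker
    return first, second, local
```

The point of the design is that the stream handler only enqueues work, so its latency does not depend on the upstream round trip.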
2020-11-20 | bitbake: hashserve: Add async client | Joshua Watt
Adds support for creating a client that operates using Python asynchronous I/O. Signed-off-by: Joshua Watt <JPEWhacker@gmail.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2020-09-18 | bitbake: hashserv: Fix localhost sometimes resolved to a wrong IP | Anatol Belski
From: Anatol Belski <anbelski@linux.microsoft.com> Using localhost for direct builds on the host is fine, but misbehavior has been seen in a Docker build. When the host supports IPv6 but Docker is not configured for it, some versions of the asyncio Python module seem to misbehave and try to use IPv6 where it is not supported in the container. This happens at least on some Ubuntu 18.04 based containers; resolving the IP explicitly appears to be the fix. Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
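One way to resolve a host name explicitly, so that later code connects to a concrete IPv4 address rather than a name, is to restrict `socket.getaddrinfo` to `AF_INET`. This is a sketch of the general technique; `resolve_ipv4` is a hypothetical helper and the commit may resolve the address differently.

```python
import socket


def resolve_ipv4(host):
    """Return one IPv4 address for host, ignoring any IPv6 results."""
    infos = socket.getaddrinfo(host, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); for AF_INET
    # the sockaddr is an (address, port) pair.
    return infos[0][4][0]
```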
2020-08-25 | lib: fix most undefined code picked up by pylint | Frazer Clews
Correctly import and inherit functions and variables. Also fix some typos and remove some Python 2 code that isn't recognised. Signed-off-by: Frazer Clews <frazerleslieclews@gmail.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2020-06-26 | hashserv: Chunkify large messages | Joshua Watt
The hash equivalence client and server can occasionally send messages that are too large for the server to fit in the receive buffer (64 KB). To prevent this, support is added to the protocol to "chunkify" the stream and break it up into manageable pieces that each side can put back together. Ideally, this would be negotiated by the client and server, but it's currently hard coded to 32 KB to prevent the round-trip delay. Signed-off-by: Joshua Watt <JPEWhacker@gmail.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
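A rough sketch of how such chunking could work, assuming messages are JSON text sent as newline-terminated lines and that a `{"chunk-stream": null}` header announces a chunked message ended by an empty line. The framing details and the `reassemble` helper are assumptions for illustration, not a description of the actual wire format.

```python
import json


def chunkify(msg, max_chunk):
    """Yield newline-terminated lines, none longer than max_chunk characters.

    Assumes msg contains no raw newlines (true for json.dumps output) and
    that max_chunk is large enough to hold the header line.
    """
    if len(msg) < max_chunk:
        yield msg + "\n"
        return
    # Announce a chunked stream, send the pieces, then an empty line.
    yield json.dumps({"chunk-stream": None}) + "\n"
    step = max_chunk - 1  # leave room for the trailing newline
    for i in range(0, len(msg), step):
        yield msg[i:i + step] + "\n"
    yield "\n"


def reassemble(lines):
    """Rebuild the original message from an iterable of received lines."""
    lines = iter(lines)
    first = next(lines).rstrip("\n")
    header = json.loads(first)
    if not isinstance(header, dict) or "chunk-stream" not in header:
        return first  # message was small enough to be sent whole
    parts = []
    for line in lines:
        piece = line.rstrip("\n")
        if not piece:  # empty line terminates the chunked stream
            break
        parts.append(piece)
    return "".join(parts)
```

With a fixed chunk size on both sides, no negotiation round trip is needed before the first message, which is the trade-off the commit message mentions.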
2020-01-19 | lib: remove unused imports | Frazer Clews
Removed unused imports, which made the code harder to read and slightly less efficient. Signed-off-by: Frazer Clews <frazer.clews@codethink.co.uk> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2019-12-04 | hashserv: Add support for equivalent hash reporting | Richard Purdie
The reason for this should be recorded in the commit logs. Imagine you have a target recipe (e.g. meta-extsdk-toolchain) which depends on gdb-cross. sstate in OE-Core allows gdb-cross to have the same hash regardless of whether it's built on x86 or arm. The outhash will be different. We need hashequiv to be able to adapt to the presence of sstate artefacts for meta-extsdk-toolchain and allow the hashes to re-intersect, rather than trying to force a rebuild of meta-extsdk-toolchain. By this point in the build, it would have already been installed from sstate so the build needs to adapt. Equivalent hashes should be reported to the server as a taskhash that needs to map to a specific unihash. This patch adds API to the hashserv client/server to allow this. [Thanks to Joshua Watt for help with this patch] Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2019-09-27 | hashserv: Don't daemonize server process | Joshua Watt
The hash server process is terminated and waited on with join(), so it should not be a daemon. Daemonizing it causes races with the server cleanup, especially in the selftest, because the process may not have terminated and cleaned up its socket before the test cleanup runs and tries to do it. [YOCTO #13542] Signed-off-by: Joshua Watt <JPEWhacker@gmail.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2019-09-27 | hashserve: Add missing import | Joshua Watt
The os module is required to connect to a Unix domain socket. Signed-off-by: Joshua Watt <JPEWhacker@gmail.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2019-09-18 | bitbake: Rework hash equivalence | Joshua Watt
Reworks the hash equivalence server to address performance issues that were encountered with the REST mechanism used previously, particularly during the heavy request load encountered during signature generation. Notable changes are:
1) The server protocol is no longer HTTP based. Instead, it uses a simpler JSON over a streaming protocol link. This protocol has much lower overhead than HTTP since it eliminates the HTTP headers.
2) The hash equivalence server can either bind to a TCP port, or a Unix domain socket. Unix domain sockets are more efficient for local communication, and so are preferred if the user enables hash equivalence only for the local build. The arguments to the 'bitbake-hashserve' command have been updated accordingly.
3) The value to which BB_HASHSERVE should be set to enable a local hash equivalence server is changed to "auto" instead of "localhost:0". The latter didn't make sense when the local server was using a Unix domain socket.
4) Clients are expected to keep a persistent connection to the server instead of creating a new connection each time a request is made for optimal performance.
5) Most of the client logic has been moved to the hashserve module in bitbake. This makes it easier to share the client code.
6) A new bitbake command has been added called 'bitbake-hashclient'. This command can be used to query a hash equivalence server, including fetching the statistics and running a performance stress test.
7) The table indexes in the SQLite database have been updated to optimize hash lookups. This change is backward compatible, as the database will delete the old indexes first if they exist.
8) The server has been reworked to use python async to maximize performance with persistently connected clients. This requires Python 3.5 or later.
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2019-08-16 | hashserv: Ensure we don't accumulate sockets in TIME_WAIT state | Richard Purdie
This can cause a huge backlog of closing sockets on the server and in our case we don't really want/need the protection TCP is trying to give us so work around it. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
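The commit message does not say which workaround is used. One widely used technique for avoiding TIME_WAIT, shown here purely as an assumption, is to enable `SO_LINGER` with a zero timeout so that `close()` resets the connection instead of performing the normal shutdown. `disable_time_wait` is a hypothetical helper name.

```python
import socket
import struct


def disable_time_wait(sock):
    """Make close() reset the connection so the socket skips TIME_WAIT."""
    # struct linger { l_onoff = 1, l_linger = 0 }: closing discards any unsent
    # data and sends RST instead of going through the usual FIN handshake.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
```

The cost is that TCP's protection against stray late segments is given up, which the commit message describes as acceptable in this case.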
2019-08-14 | cooker: Improve hash server startup code to avoid exit tracebacks | Richard Purdie
At exit the hashserv code was causing tracebacks as join() wasn't being called from the thread that started the process. Ensure that the hashserver is started from the pre_serve hook which is the final thread the cooker runs in. This avoids the traceback at the expense of some horrific poking into data stores which will ultimately need improving through a proper API. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2019-08-06 | hashserv: Switch from threads to multiprocessing | Richard Purdie
There were hard to debug lockups when trying to use threading to start hashserv as a thread. Switch to multiprocessing which doesn't show the same locking problems. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2019-08-06 | hashserv: Use separate threads for answering requests and handling them | Richard Purdie
Experience with the prserv shows that having two threads, one accepting and queueing connections and one handling the requests leads to much more reliable behaviour than having everything in a single thread. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2019-08-06 | hashserv: Turn off sqlite synchronous mode | Richard Purdie
We're seeing performance problems with hashserv running on a normal build system. The cause seems to be the large amounts of file IO that builds involve blocking writes to the database. Since sqlite blocks on the sync calls, this causes a significant problem. Since if we lose power we have bigger problems, run with synchronous=off to avoid locking and put the journal into memory to avoid any write issues there too. This took writes from 120s down to negligible in my tests, which means hashserv then responds promptly to requests. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
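In Python's `sqlite3` module these two settings correspond to the `synchronous` and `journal_mode` pragmas. A minimal sketch, where `open_fast_db` is a hypothetical helper name:

```python
import sqlite3


def open_fast_db(path):
    """Open a database that trades crash durability for write speed."""
    db = sqlite3.connect(path)
    # Do not wait for data to reach the disk after each transaction.
    db.execute("PRAGMA synchronous = OFF")
    # Keep the rollback journal in memory rather than in a file.
    db.execute("PRAGMA journal_mode = MEMORY")
    return db


db = open_fast_db(":memory:")
```

With these settings a power loss or crash in the middle of a write can corrupt the database, which is the trade-off the commit message accepts.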
2019-08-06 | cooker/hashserv: Allow autostarting of a local hash server using BB_HASHSERVE | Richard Purdie
It's useful, particularly in the local developer model of usage, for bitbake to start and stop a hash equivalence server on a local port, rather than relying on one being started by the user before the build. The new BB_HASHSERVE variable supports this. The database handling is moved internally into the hashserv code so that different threads/processes can be used for the server without errors. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2019-08-02 | hashserv: SQL Optimizations | Joshua Watt
Implements a number of optimizations to the SQL used in the hash equivalence server:
1) Two indexes are created for the two column pairs, (method, taskhash) and (method, outhash), by which rows are found, in order to speed up the lookup.
2) An extra SELECT to look up the just-inserted row was removed. This SELECT is unnecessary since all of the information about the newly inserted row is already available.
3) A uniqueness constraint was added to the table. This should allow the server to be multithreaded in the future since duplicate inserts can be detected (and ignored). This change requires bumping the database version to '2', since a uniqueness constraint can't be added to an existing table.
4) Some comments are added to clarify the trick SELECT statement used when inserting new equivalent hashes.
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
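The indexes and the uniqueness constraint from the list above can be sketched as below. The table name, index names, column set, and `insert_task` are assumptions for illustration; the real schema has more columns and may name things differently.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE tasks_v2 (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    method TEXT NOT NULL,
    outhash TEXT NOT NULL,
    taskhash TEXT NOT NULL,
    unihash TEXT NOT NULL,
    -- Duplicate reports of the same combination are rejected by the database.
    UNIQUE(method, outhash, taskhash)
);
-- One index per lookup path.
CREATE INDEX taskhash_lookup ON tasks_v2 (method, taskhash);
CREATE INDEX outhash_lookup ON tasks_v2 (method, outhash);
""")


def insert_task(db, method, outhash, taskhash, unihash):
    """Insert a row; return True if it was new, False if it was a duplicate."""
    cur = db.execute(
        "INSERT OR IGNORE INTO tasks_v2 (method, outhash, taskhash, unihash) "
        "VALUES (?, ?, ?, ?)",
        (method, outhash, taskhash, unihash),
    )
    # rowcount is 0 when the UNIQUE constraint caused the insert to be ignored.
    return cur.rowcount == 1
```

Because the caller already holds every value it inserted, no follow-up SELECT is needed to learn what the new row contains.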
2019-05-04 | bitbake: Drop duplicate license boilerplace text | Richard Purdie
With the introduction of SPDX-License-Identifier headers, we don't need a ton of header boilerplate in every file. Simplify the files and rely on the top level for the full licence text. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2019-05-04 | bitbake: Add initial pass of SPDX license headers to source code | Richard Purdie
This adds the SPDX-License-Identifier license headers to the majority of our source files to make it clearer exactly which license files are under. The bulk of the files are under GPL v2.0 with one found to be under V2.0 or later, some under MIT and some have dual license. There are some files which are potentially harder to classify where we've imported upstream code and those can be handled specifically in later commits. The COPYING file is replaced with LICENSE.X files which contain the full license texts. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2019-01-08 | bitbake: hashserv: Add hash equivalence reference server | Joshua Watt
Implements a reference implementation of the hash equivalence server. This server has minimal dependencies (and no dependencies outside of the standard Python library), and implements the minimum required to be a conforming hash equivalence server. [YOCTO #13030] Signed-off-by: Joshua Watt <JPEWhacker@gmail.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>