author Richard Purdie <richard.purdie@linuxfoundation.org> 2021-12-13 22:59:23 +0000
committer Richard Purdie <richard.purdie@linuxfoundation.org> 2021-12-14 22:45:40 +0000
commit c8f845a8f391fa5f3f69a987b3977abdb4959db8
tree 4386b50d788fc8066a36ceee96ab1fb0473a99fe
parent 8adf941a8a2b5b3fe5e4e3313856b725e28d5370
lttng-tools: Backport ptest fix

Add a backport and a dependency from upstream to help address one of the
lttng-tools ptest relayd hangs we've been seeing on the autobuilder.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
-rw-r--r-- meta/recipes-kernel/lttng/lttng-tools/87250ba19aec78f36e301494a03f5678fcb6fbb4.patch 218
-rw-r--r-- meta/recipes-kernel/lttng/lttng-tools/8f0646a03fbf31c19b85ec367dc2c3db56e6dbf7.patch 113
-rw-r--r-- meta/recipes-kernel/lttng/lttng-tools_2.13.1.bb 2
3 files changed, 333 insertions, 0 deletions
diff --git a/meta/recipes-kernel/lttng/lttng-tools/87250ba19aec78f36e301494a03f5678fcb6fbb4.patch b/meta/recipes-kernel/lttng/lttng-tools/87250ba19aec78f36e301494a03f5678fcb6fbb4.patch
new file mode 100644
index 0000000000..f4db4f86fe
--- /dev/null
+++ b/meta/recipes-kernel/lttng/lttng-tools/87250ba19aec78f36e301494a03f5678fcb6fbb4.patch
@@ -0,0 +1,218 @@
+Upstream-Status: Backport
+
+From 87250ba19aec78f36e301494a03f5678fcb6fbb4 Mon Sep 17 00:00:00 2001
+From: =?UTF-8?q?J=C3=A9r=C3=A9mie=20Galarneau?=
+ <jeremie.galarneau@efficios.com>
+Date: Mon, 1 Nov 2021 15:43:55 -0400
+Subject: [PATCH] Fix: relayd: live: mishandled initial null trace chunk
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+Observed issue
+==============
+
+As reported in #1323 (https://bugs.lttng.org/issues/1323), crashes of
+the relay daemon are observed when running the user space clear tests.
+
+The crash occurs with the following stack trace:
+ #0 0x000055fbb861d6ae in urcu_ref_get_unless_zero (ref=0x28) at /usr/local/include/urcu/ref.h:85
+ #1 lttng_trace_chunk_get (chunk=0x0) at trace-chunk.c:1836
+ #2 0x000055fbb86051e2 in make_viewer_streams (relay_session=relay_session@entry=0x7f6ea002d540, viewer_session=<optimized out>, seek_t=seek_t@entry=LTTNG_VIEWER_SEEK_BEGINNING, nb_total=nb_total@entry=0x7f6ea9607b00, nb_unsent=nb_unsent@entry=0x7f6ea9607aec, nb_created=nb_created@entry=0x7f6ea9607ae8, closed=<optimized out>) at live.c:405
+ #3 0x000055fbb86061d9 in viewer_get_new_streams (conn=0x7f6e94000fc0) at live.c:1155
+ #4 process_control (conn=0x7f6e94000fc0, recv_hdr=0x7f6ea9607af0) at live.c:2353
+ #5 thread_worker (data=<optimized out>) at live.c:2515
+ #6 0x00007f6eae86a609 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
+ #7 0x00007f6eae78f293 in clone () from /lib/x86_64-linux-gnu/libc.so.6
+
+The race window during which this occurs seems very small as it can take
+hours to reproduce this crash. However, a minimal reproducer could be
+identified, as stated in the bug report.
+
+Essentially, the same crash can be reproduced by attaching a live viewer
+to a session that has seen events being produced, been stopped and been
+cleared.
+
+Cause
+=====
+
+The crash occurs as an attempt is made to take a reference to a viewer
+session’s trace chunk as viewer streams are created. The crux of the
+problem is that the code doesn’t expect a viewer session’s trace chunk
+to be NULL.
+
+The viewer session’s current trace chunk is initially set, when a viewer
+attaches to the viewer session, to a copy of the corresponding
+relay_session’s current trace chunk.
+
+A live session always attempts to "catch-up" to the newest available
+trace chunk. This means that when a viewer reaches the end of a trace
+chunk, the viewer session may not transition to the "next" one: it jumps
+to the most recent trace chunk available (the one being produced by the
+relay_session). Hence, if the producer performs multiple rotations
+before a viewer completes the consumption of a trace chunk, it will skip
+over those "intermediary" trace chunks.
+
+A viewer session updates its current trace chunk when:
+ 1) new viewer streams are created,
+ 2) a new index is requested,
+ 3) metadata is requested.
+
+Hence, as a general principle, the viewer session will reference the
+most recent trace chunk available _even if its streams do not point to
+it_. It indicates which trace chunk viewer streams should transition to
+when the end of their current trace chunk is reached.
+
+The live code properly handles transitions to a null chunk. This can be
+verified by attaching a viewer to a live session, stopping the session,
+clearing it (thus entering a null trace chunk), and resuming tracing.
+
+The only issue is that the case where the first trace chunk of a viewer
+session is "null" (no active trace chunk) is mishandled in two places:
+ 1) in make_viewer_streams(), where the crash is observed,
+ 2) in viewer_get_metadata().
+
+Solution
+========
+
+In make_viewer_streams(), it is assumed that a viewer session will have
+a non-null trace chunk whenever a rotation is not ongoing. This is
+reflected by the fact that a reference is always acquired on the viewer
+session’s trace chunk.
+
+That code is one of the three places that can cause a viewer session’s
+trace chunk to be updated. We still want to update the viewer session to
+the most recently seen trace chunk (null, in this case). However, there
+is no reference to acquire and the trace chunk to use for the creation
+of the viewer stream is NULL. This is properly handled by
+viewer_stream_create().
+
+The second site to change is viewer_get_metadata() which doesn’t handle
+a viewer metadata stream not having an active trace chunk at all.
+Thankfully, the protocol allows us to express this condition by
+returning the LTTNG_VIEWER_NO_NEW_METADATA status code when a viewer
+metadata stream doesn’t have an open file and doesn’t have a current
+trace chunk.
+
+Surprisingly, this bug didn’t trigger in the case where a transition to
+a null chunk occurred _after_ attaching to a viewer session.
+
+This is because viewers will typically ask for metadata as a result of an
+LTTNG_VIEWER_FLAG_NEW_METADATA reply to the GET_NEXT_INDEX command. When
+a session is stopped and all data was consumed, this command returns
+that no new data is available, causing the viewers to wait and ask again
+later.
+
+However, when attaching, babeltrace2 (at least, and probably babeltrace 1.x)
+always asks for an initial segment of metadata before asking for an
+index.
+
+Known drawbacks
+===============
+
+None.
+
+Fixes: #1323
+
+Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
+Change-Id: I516fca60755e6897f6b7170c12d706ef57ad61a5
+---
+ src/bin/lttng-relayd/live.c | 47 ++++++++++++++++++++++++-----------
+ src/bin/lttng-relayd/stream.h | 5 ++++
+ 2 files changed, 38 insertions(+), 14 deletions(-)
+
+Index: lttng-tools-2.13.1/src/bin/lttng-relayd/live.c
+===================================================================
+--- lttng-tools-2.13.1.orig/src/bin/lttng-relayd/live.c
++++ lttng-tools-2.13.1/src/bin/lttng-relayd/live.c
+@@ -384,8 +384,6 @@ static int make_viewer_streams(struct re
+ goto error_unlock;
+ }
+ } else {
+- bool reference_acquired;
+-
+ /*
+ * Transition the viewer session into the newest trace chunk available.
+ */
+@@ -402,11 +400,26 @@ static int make_viewer_streams(struct re
+ }
+ }
+
+- reference_acquired = lttng_trace_chunk_get(
+- viewer_session->current_trace_chunk);
+- assert(reference_acquired);
+- viewer_stream_trace_chunk =
+- viewer_session->current_trace_chunk;
++ if (relay_stream->trace_chunk) {
++ /*
++ * If the corresponding relay
++ * stream's trace chunk is set,
++ * the viewer stream will be
++ * created under it.
++ *
++ * Note that a relay stream can
++ * have a NULL output trace
++ * chunk (for instance, after a
++ * clear against a stopped
++ * session).
++ */
++ const bool reference_acquired = lttng_trace_chunk_get(
++ viewer_session->current_trace_chunk);
++
++ assert(reference_acquired);
++ viewer_stream_trace_chunk =
++ viewer_session->current_trace_chunk;
++ }
+ }
+
+ viewer_stream = viewer_stream_create(
+@@ -2016,8 +2029,9 @@ int viewer_get_metadata(struct relay_con
+ }
+ }
+
+- if (conn->viewer_session->current_trace_chunk !=
+- vstream->stream_file.trace_chunk) {
++ if (conn->viewer_session->current_trace_chunk &&
++ conn->viewer_session->current_trace_chunk !=
++ vstream->stream_file.trace_chunk) {
+ bool acquired_reference;
+
+ DBG("Viewer session and viewer stream chunk differ: "
+@@ -2034,11 +2048,16 @@ int viewer_get_metadata(struct relay_con
+
+ len = vstream->stream->metadata_received - vstream->metadata_sent;
+
+- /*
+- * Either this is the first time the metadata file is read, or a
+- * rotation of the corresponding relay stream has occurred.
+- */
+- if (!vstream->stream_file.handle && len > 0) {
++ if (!vstream->stream_file.trace_chunk) {
++ reply.status = htobe32(LTTNG_VIEWER_NO_NEW_METADATA);
++ len = 0;
++ goto send_reply;
++ } else if (vstream->stream_file.trace_chunk &&
++ !vstream->stream_file.handle && len > 0) {
++ /*
++ * Either this is the first time the metadata file is read, or a
++ * rotation of the corresponding relay stream has occurred.
++ */
+ struct fs_handle *fs_handle;
+ char file_path[LTTNG_PATH_MAX];
+ enum lttng_trace_chunk_status status;
+Index: lttng-tools-2.13.1/src/bin/lttng-relayd/stream.h
+===================================================================
+--- lttng-tools-2.13.1.orig/src/bin/lttng-relayd/stream.h
++++ lttng-tools-2.13.1/src/bin/lttng-relayd/stream.h
+@@ -174,6 +174,11 @@ struct relay_stream {
+ /*
+ * The trace chunk to which the file currently being produced (if any)
+ * belongs.
++ *
++ * Note that a relay stream can have no output trace chunk. For
++ * instance, after a session stop followed by a session clear,
++ * streams will not have an output trace chunk until the session
++ * is resumed.
+ */
+ struct lttng_trace_chunk *trace_chunk;
+ LTTNG_OPTIONAL(struct relay_stream_rotation) ongoing_rotation;
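
Review note: the make_viewer_streams() change in the patch above can be read as "only take a reference when there is a chunk to reference". The following is a minimal sketch of that guard, not the relayd code. The names toy_chunk, toy_chunk_get() and pick_viewer_stream_chunk() are hypothetical stand-ins, and the reference count is a plain integer rather than the urcu_ref used upstream.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct lttng_trace_chunk: only a reference count. */
struct toy_chunk {
	long refcount;
};

/*
 * Toy "get unless zero" helper. Like the helper seen in the stack trace, it
 * dereferences its argument, so calling it with NULL is invalid.
 */
static bool toy_chunk_get(struct toy_chunk *chunk)
{
	if (chunk->refcount == 0) {
		return false;
	}
	chunk->refcount++;
	return true;
}

/*
 * Choose the chunk under which a viewer stream would be created.
 *
 * Assumption mirrored from the patch: when the relay stream has a trace
 * chunk, the viewer session's current chunk is non-NULL. When the relay
 * stream has no chunk (for example after a clear against a stopped
 * session), no reference is taken and NULL is returned, which the caller
 * must treat as "no active trace chunk" rather than as an error.
 */
static struct toy_chunk *pick_viewer_stream_chunk(
		const struct toy_chunk *relay_stream_chunk,
		struct toy_chunk *viewer_session_chunk)
{
	struct toy_chunk *viewer_stream_chunk = NULL;

	if (relay_stream_chunk) {
		const bool reference_acquired =
				toy_chunk_get(viewer_session_chunk);

		assert(reference_acquired);
		(void) reference_acquired; /* Silence NDEBUG builds. */
		viewer_stream_chunk = viewer_session_chunk;
	}

	return viewer_stream_chunk;
}
```

Calling pick_viewer_stream_chunk(NULL, NULL) returns NULL without touching any reference count, which is the case the unpatched code crashed on. Calling it with a live chunk bumps that chunk's count by one.
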
diff --git a/meta/recipes-kernel/lttng/lttng-tools/8f0646a03fbf31c19b85ec367dc2c3db56e6dbf7.patch b/meta/recipes-kernel/lttng/lttng-tools/8f0646a03fbf31c19b85ec367dc2c3db56e6dbf7.patch
new file mode 100644
index 0000000000..db2fca03fe
--- /dev/null
+++ b/meta/recipes-kernel/lttng/lttng-tools/8f0646a03fbf31c19b85ec367dc2c3db56e6dbf7.patch
@@ -0,0 +1,113 @@
+Upstream-Status: Backport
+
+From 8f0646a03fbf31c19b85ec367dc2c3db56e6dbf7 Mon Sep 17 00:00:00 2001
+From: Francis Deslauriers <francis.deslauriers@efficios.com>
+Date: Mon, 25 Oct 2021 11:32:24 -0400
+Subject: [PATCH] Typo: occurences -> occurrences
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
+Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
+Change-Id: I719e26febd639f3b047b6aa6361fc6734088e871
+---
+ configure.ac | 2 +-
+ src/bin/lttng-relayd/live.c | 2 +-
+ src/bin/lttng-sessiond/event-notifier-error-accounting.c | 2 +-
+ src/bin/lttng-sessiond/ust-app.c | 2 +-
+ tests/utils/utils.sh | 8 ++++----
+ 5 files changed, 8 insertions(+), 8 deletions(-)
+
+diff --git a/configure.ac b/configure.ac
+index 12cc7a17e..27148c105 100644
+--- a/configure.ac
++++ b/configure.ac
+@@ -253,7 +253,7 @@ AS_IF([test "x$libtool_fixup" = "xyes"],
+ [
+ libtool_m4="$srcdir/m4/libtool.m4"
+ libtool_flag_pattern=".*link_all_deplibs\s*,\s*\$1\s*)"
+- AC_MSG_CHECKING([for occurence(s) of link_all_deplibs = no in $libtool_m4])
++ AC_MSG_CHECKING([for occurrence(s) of link_all_deplibs = no in $libtool_m4])
+ libtool_flag_pattern_count=$($GREP -c "$libtool_flag_pattern\s*=\s*no" $libtool_m4)
+ AS_IF([test $libtool_flag_pattern_count -ne 0],
+ [
+diff --git a/src/bin/lttng-relayd/live.c b/src/bin/lttng-relayd/live.c
+index 13078026b..42b0d947e 100644
+--- a/src/bin/lttng-relayd/live.c
++++ b/src/bin/lttng-relayd/live.c
+@@ -2036,7 +2036,7 @@ int viewer_get_metadata(struct relay_connection *conn)
+
+ /*
+ * Either this is the first time the metadata file is read, or a
+- * rotation of the corresponding relay stream has occured.
++ * rotation of the corresponding relay stream has occurred.
+ */
+ if (!vstream->stream_file.handle && len > 0) {
+ struct fs_handle *fs_handle;
+diff --git a/src/bin/lttng-sessiond/event-notifier-error-accounting.c b/src/bin/lttng-sessiond/event-notifier-error-accounting.c
+index d3e3692f5..1488d801c 100644
+--- a/src/bin/lttng-sessiond/event-notifier-error-accounting.c
++++ b/src/bin/lttng-sessiond/event-notifier-error-accounting.c
+@@ -488,7 +488,7 @@ struct ust_error_accounting_entry *ust_error_accounting_entry_create(
+ lttng_ust_ctl_destroy_counter(daemon_counter);
+ error_create_daemon_counter:
+ error_shm_alloc:
+- /* Error occured before per-cpu SHMs were handed-off to ustctl. */
++ /* Error occurred before per-cpu SHMs were handed-off to ustctl. */
+ if (cpu_counter_fds) {
+ for (i = 0; i < entry->nr_counter_cpu_fds; i++) {
+ if (cpu_counter_fds[i] < 0) {
+diff --git a/src/bin/lttng-sessiond/ust-app.c b/src/bin/lttng-sessiond/ust-app.c
+index b18988560..28c63e70c 100644
+--- a/src/bin/lttng-sessiond/ust-app.c
++++ b/src/bin/lttng-sessiond/ust-app.c
+@@ -1342,7 +1342,7 @@ static struct ust_app_event_notifier_rule *alloc_ust_app_event_notifier_rule(
+ case LTTNG_EVENT_RULE_GENERATE_EXCLUSIONS_STATUS_NONE:
+ break;
+ default:
+- /* Error occured. */
++ /* Error occurred. */
+ ERR("Failed to generate exclusions from trigger while allocating an event notifier rule");
+ goto error_put_trigger;
+ }
+diff --git a/tests/utils/utils.sh b/tests/utils/utils.sh
+index e463e4fe3..42d99444f 100644
+--- a/tests/utils/utils.sh
++++ b/tests/utils/utils.sh
+@@ -1921,7 +1921,7 @@ function validate_trace
+ pass "Validate trace for event $i, $traced events"
+ else
+ fail "Validate trace for event $i"
+- diag "Found $traced occurences of $i"
++ diag "Found $traced occurrences of $i"
+ fi
+ done
+ ret=$?
+@@ -1949,7 +1949,7 @@ function validate_trace_count
+ pass "Validate trace for event $i, $traced events"
+ else
+ fail "Validate trace for event $i"
+- diag "Found $traced occurences of $i"
++ diag "Found $traced occurrences of $i"
+ fi
+ cnt=$(($cnt + $traced))
+ done
+@@ -1979,7 +1979,7 @@ function validate_trace_count_range_incl_min_excl_max
+ pass "Validate trace for event $i, $traced events"
+ else
+ fail "Validate trace for event $i"
+- diag "Found $traced occurences of $i"
++ diag "Found $traced occurrences of $i"
+ fi
+ cnt=$(($cnt + $traced))
+ done
+@@ -2013,7 +2013,7 @@ function validate_trace_exp()
+ pass "Validate trace for expression '${event_exp}', $traced events"
+ else
+ fail "Validate trace for expression '${event_exp}'"
+- diag "Found $traced occurences of '${event_exp}'"
++ diag "Found $traced occurrences of '${event_exp}'"
+ fi
+ ret=$?
+ return $ret
diff --git a/meta/recipes-kernel/lttng/lttng-tools_2.13.1.bb b/meta/recipes-kernel/lttng/lttng-tools_2.13.1.bb
index 063d8e8c2d..187eff9619 100644
--- a/meta/recipes-kernel/lttng/lttng-tools_2.13.1.bb
+++ b/meta/recipes-kernel/lttng/lttng-tools_2.13.1.bb
@@ -37,6 +37,8 @@ SRC_URI = "https://lttng.org/files/lttng-tools/lttng-tools-${PV}.tar.bz2 \
file://lttng-sessiond.service \
file://determinism.patch \
file://0001-src-common-correct-header-location.patch \
+ file://8f0646a03fbf31c19b85ec367dc2c3db56e6dbf7.patch \
+ file://87250ba19aec78f36e301494a03f5678fcb6fbb4.patch \
"
SRC_URI[sha256sum] = "cfe6df7da831fc07fd07ce46b442c2ec1074c167af73f3a1b1d2fba0c453c8b5"
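
Review note: the second change in the relayd backport, in viewer_get_metadata(), boils down to a three-way decision made before the metadata file is opened. The sketch below models only the branch structure visible in the patch. The enum, the struct and decide_metadata_action() are hypothetical names introduced for illustration, and the "continue" outcome stands for whatever the existing code does after the patched condition.

```c
#include <stddef.h>

/* Hypothetical outcomes; the real code sets LTTNG_VIEWER_* reply statuses. */
enum toy_metadata_action {
	TOY_REPLY_NO_NEW_METADATA, /* No active chunk: viewer should ask again later. */
	TOY_OPEN_METADATA_FILE,    /* First read, or a rotation has occurred. */
	TOY_CONTINUE,              /* Fall through to the pre-existing handling. */
};

/* Hypothetical reduction of a viewer metadata stream to the two fields tested. */
struct toy_viewer_metadata_stream {
	const void *trace_chunk; /* NULL when there is no active trace chunk. */
	const void *file_handle; /* NULL when no metadata file is open. */
};

/*
 * Mirror of the patched condition: a stream without a trace chunk is
 * answered with "no new metadata" regardless of how many bytes are pending
 * (len). A stream with a chunk but no open file, and with pending data,
 * needs its metadata file opened.
 */
static enum toy_metadata_action decide_metadata_action(
		const struct toy_viewer_metadata_stream *vstream, long len)
{
	if (!vstream->trace_chunk) {
		return TOY_REPLY_NO_NEW_METADATA;
	} else if (!vstream->file_handle && len > 0) {
		return TOY_OPEN_METADATA_FILE;
	}

	return TOY_CONTINUE;
}
```

Placing the null-chunk check first is what lets a viewer attach to a stopped and cleared session: it receives a "nothing new yet" answer instead of the daemon trying to open a file under a chunk that does not exist.
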