path: root/meta/recipes-devtools/git/git/CVE-2021-21300.patch
blob: ec5d98395d89764f663f004fb643c17a697cc48d (plain)
From 464431b4155e3ff918709de663aa0195d73c99fd Mon Sep 17 00:00:00 2001
From: Matheus Tavares <matheus.bernardino@usp.br>
Date: Sat, 27 Mar 2021 11:50:05 +0900
Subject: [PATCH] checkout: fix bug that makes checkout follow symlinks in
 leading path

Before checking out a file, we have to confirm that all of its leading
components are real existing directories. And to reduce the number of
lstat() calls in this process, we cache the last leading path known to
contain only directories. However, when a path collision occurs (e.g.
when checking out case-sensitive files in case-insensitive file
systems), a cached path might have its file type changed on disk,
leaving the cache in an invalid state. Normally, this doesn't bring
any bad consequences as we usually check out files in index order, and
therefore, by the time the cached path becomes outdated, we no longer
need it anyway (because all files in that directory would have already
been written).

But, there are some users of the checkout machinery that do not always
follow the index order. In particular: checkout-index writes the paths
in the same order that they appear on the CLI (or stdin); and the
delayed checkout feature -- used when a long-running filter process
replies with "status=delayed" -- postpones the checkout of some entries,
thus modifying the checkout order.

When we have to check out an out-of-order entry and the lstat() cache is
invalid (due to a previous path collision), checkout_entry() may end up
using the invalid data and trusting that the leading components are
real directories when, in reality, they are not. In the best case
scenario, where the directory was replaced by a regular file, the user
will get an error: "fatal: unable to create file 'foo/bar': Not a
directory". But if the directory was replaced by a symlink, checkout
could actually end up following the symlink and writing the file at a
wrong place, even outside the repository. Since delayed checkout is
affected by this bug, it could be used by an attacker to write
arbitrary files during the clone of a maliciously crafted repository.
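
The stale-cache scenario described above can be reproduced outside Git
with a toy one-entry cache. This is only a sketch under assumptions: the
names `cached_dir`, `is_real_dir_cached()` and `stale_cache_demo()` are
hypothetical and are not Git's lstat cache code, and the demo assumes a
POSIX system with a writable /tmp.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical one-entry cache: the last path known to be a real directory. */
static char cached_dir[PATH_MAX];

/* Returns 1 if `path` is a real directory, consulting the cache first. */
static int is_real_dir_cached(const char *path)
{
	struct stat st;

	if (!strcmp(cached_dir, path))
		return 1; /* cache hit: no lstat(), so the answer may be stale */
	if (lstat(path, &st) || !S_ISDIR(st.st_mode))
		return 0;
	snprintf(cached_dir, sizeof(cached_dir), "%s", path);
	return 1;
}

/*
 * Verifies a directory, replaces it with a symlink, then asks again.
 * Returns the cache's answer after the swap (1 means the stale entry is
 * still trusted), or -1 if the setup failed.
 */
int stale_cache_demo(void)
{
	char root[] = "/tmp/lstat-demo-XXXXXX";
	char dir[PATH_MAX], target[PATH_MAX];
	int after;

	if (!mkdtemp(root))
		return -1;
	snprintf(dir, sizeof(dir), "%s/A", root);
	snprintf(target, sizeof(target), "%s/target-dir", root);
	if (mkdir(dir, 0700) || mkdir(target, 0700))
		return -1;

	cached_dir[0] = '\0';
	assert(is_real_dir_cached(dir) == 1); /* real lstat(), fills the cache */

	/* Path collision: the directory goes away and a symlink takes its place. */
	if (rmdir(dir) || symlink(target, dir))
		return -1;

	after = is_real_dir_cached(dir); /* answered from the cache, no lstat() */

	unlink(dir);
	rmdir(target);
	rmdir(root);
	return after;
}
```

A caller that trusts the second answer would create files below the
symlink's target instead of below a real directory.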

Some candidate solutions considered were to disable the lstat() cache
during unordered checkouts or sort the entries before passing them to
the checkout machinery. But both ideas include some performance penalty
and they don't future-proof the code against new unordered use cases.

Instead, we now manually reset the lstat cache whenever we successfully
remove a directory. Note: We are not even checking whether the directory
was the same as the lstat cache points to because we might face a
scenario where the paths refer to the same location but differ due to
case folding, precomposed UTF-8 issues, or the presence of `..`
components in the path. Two regression tests, with case-collisions and
utf8-collisions, are also added for both checkout-index and delayed
checkout.
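
To illustrate why the reset is unconditional, here is a minimal sketch
that reuses a toy one-entry cache (hypothetical names again, not Git's
implementation, and assuming a POSIX system with a writable /tmp). The
directory is removed through a different spelling of the same path, so
comparing strings against the cached entry would miss it, while a reset
on every successful rmdir() does not.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical one-entry cache: the last path known to be a real directory. */
static char cached_dir[PATH_MAX];

static void reset_cache(void)
{
	cached_dir[0] = '\0';
}

static int is_real_dir_cached(const char *path)
{
	struct stat st;

	if (!strcmp(cached_dir, path))
		return 1; /* cache hit: no lstat() */
	if (lstat(path, &st) || !S_ISDIR(st.st_mode))
		return 0;
	snprintf(cached_dir, sizeof(cached_dir), "%s", path);
	return 1;
}

/* Drops the cache after any successful removal, whatever the spelling. */
static int cache_aware_rmdir(const char *path)
{
	int ret = rmdir(path);

	if (!ret)
		reset_cache();
	return ret;
}

/*
 * Returns the cache's answer after the directory was swapped for a
 * symlink (0 means the symlink was noticed), or -1 if the setup failed.
 */
int fixed_cache_demo(void)
{
	char root[] = "/tmp/lstat-fix-XXXXXX";
	char dir[PATH_MAX], alias[PATH_MAX], target[PATH_MAX];
	int after;

	if (!mkdtemp(root))
		return -1;
	snprintf(dir, sizeof(dir), "%s/A", root);
	snprintf(alias, sizeof(alias), "%s/./A", root); /* same location, other string */
	snprintf(target, sizeof(target), "%s/target-dir", root);
	if (mkdir(dir, 0700) || mkdir(target, 0700))
		return -1;

	reset_cache();
	assert(is_real_dir_cached(dir) == 1); /* fills the cache with `dir` */

	if (cache_aware_rmdir(alias) || symlink(target, dir))
		return -1;

	after = is_real_dir_cached(dir); /* cache was reset: fresh lstat() */

	unlink(dir);
	rmdir(target);
	rmdir(root);
	return after;
}
```

The cost of this design is an occasional extra lstat() after a removal,
in exchange for not having to reason about path equivalence at all.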

Note: to make the previously mentioned clone attack unfeasible, it would
be sufficient to reset the lstat cache only after the remove_subtree()
call inside checkout_entry(). This is the place where we would remove a
directory whose path collides with the path of another entry that we are
currently trying to check out (possibly a symlink). However, in the
interest of a thorough fix that does not leave Git open to
similar-but-not-identical attack vectors, we decided to intercept
all `rmdir()` calls in one fell swoop.

This addresses CVE-2021-21300.

Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>

Upstream-Status: Accepted [https://github.com/git/git/commit/684dd4c2b414bcf648505e74498a608f28de4592]
CVE: CVE-2021-21300
Signed-off-by: Minjae Kim <flowergom@gmail.com>
---
 cache.h                         |  1 +
 compat/mingw.c                  |  2 ++
 git-compat-util.h               |  5 +++++
 symlinks.c                      | 24 ++++++++++++++++++++
 t/t0021-conversion.sh           | 39 ++++++++++++++++++++++++++++++++
 t/t0021/rot13-filter.pl         | 21 ++++++++++++++---
 t/t2006-checkout-index-basic.sh | 40 +++++++++++++++++++++++++++++++++
 7 files changed, 129 insertions(+), 3 deletions(-)

diff --git a/cache.h b/cache.h
index 7109765..83776f3 100644
--- a/cache.h
+++ b/cache.h
@@ -1657,6 +1657,7 @@ int has_symlink_leading_path(const char *name, int len);
 int threaded_has_symlink_leading_path(struct cache_def *, const char *, int);
 int check_leading_path(const char *name, int len);
 int has_dirs_only_path(const char *name, int len, int prefix_len);
+extern void invalidate_lstat_cache(void);
 void schedule_dir_for_removal(const char *name, int len);
 void remove_scheduled_dirs(void);
 
diff --git a/compat/mingw.c b/compat/mingw.c
index a00f331..a435998 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -367,6 +367,8 @@ int mingw_rmdir(const char *pathname)
 	       ask_yes_no_if_possible("Deletion of directory '%s' failed. "
 			"Should I try again?", pathname))
 	       ret = _wrmdir(wpathname);
+	if (!ret)
+		invalidate_lstat_cache();
 	return ret;
 }
 
diff --git a/git-compat-util.h b/git-compat-util.h
index 104993b..7d3db43 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -349,6 +349,11 @@ static inline int noop_core_config(const char *var, const char *value, void *cb)
 #define platform_core_config noop_core_config
 #endif
 
+int lstat_cache_aware_rmdir(const char *path);
+#if !defined(__MINGW32__) && !defined(_MSC_VER)
+#define rmdir lstat_cache_aware_rmdir
+#endif
+
 #ifndef has_dos_drive_prefix
 static inline int git_has_dos_drive_prefix(const char *path)
 {
diff --git a/symlinks.c b/symlinks.c
index 69d458a..7dbb6b2 100644
--- a/symlinks.c
+++ b/symlinks.c
@@ -267,6 +267,13 @@ int has_dirs_only_path(const char *name, int len, int prefix_len)
  */
 static int threaded_has_dirs_only_path(struct cache_def *cache, const char *name, int len, int prefix_len)
 {
+	/*
+	 * Note: this function is used by the checkout machinery, which also
+	 * takes care to properly reset the cache when it performs an operation
+	 * that would leave the cache outdated. If this function starts caching
+	 * anything else besides FL_DIR, remember to also invalidate the cache
+	 * when creating or deleting paths that might be in the cache.
+	 */
 	return lstat_cache(cache, name, len,
 			   FL_DIR|FL_FULLPATH, prefix_len) &
 		FL_DIR;
@@ -321,3 +328,20 @@ void remove_scheduled_dirs(void)
 {
 	do_remove_scheduled_dirs(0);
 }
+
+void invalidate_lstat_cache(void)
+{
+	reset_lstat_cache(&default_cache);
+}
+
+#undef rmdir
+int lstat_cache_aware_rmdir(const char *path)
+{
+	/* Any change in this function must be made also in `mingw_rmdir()` */
+	int ret = rmdir(path);
+
+	if (!ret)
+		invalidate_lstat_cache();
+
+	return ret;
+}
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index f6deaf4..60d34fd 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -953,4 +953,43 @@ test_expect_success PERL 'invalid file in delayed checkout' '
 	grep "error: external filter .* signaled that .unfiltered. is now available although it has not been delayed earlier" git-stderr.log
 '
 
+for mode in 'case' 'utf-8'
+do
+	case "$mode" in
+	case)	dir='A' symlink='a' mode_prereq='CASE_INSENSITIVE_FS' ;;
+	utf-8)
+		dir=$(printf "\141\314\210") symlink=$(printf "\303\244")
+		mode_prereq='UTF8_NFD_TO_NFC' ;;
+	esac
+
+	test_expect_success PERL,SYMLINKS,$mode_prereq \
+	"delayed checkout with $mode-collision don't write to the wrong place" '
+		test_config_global filter.delay.process \
+			"\"$TEST_ROOT/rot13-filter.pl\" --always-delay delayed.log clean smudge delay" &&
+		test_config_global filter.delay.required true &&
+		git init $mode-collision &&
+		(
+			cd $mode-collision &&
+			mkdir target-dir &&
+			empty_oid=$(printf "" | git hash-object -w --stdin) &&
+			symlink_oid=$(printf "%s" "$PWD/target-dir" | git hash-object -w --stdin) &&
+			attr_oid=$(echo "$dir/z filter=delay" | git hash-object -w --stdin) &&
+			cat >objs <<-EOF &&
+			100644 blob $empty_oid	$dir/x
+			100644 blob $empty_oid	$dir/y
+			100644 blob $empty_oid	$dir/z
+			120000 blob $symlink_oid	$symlink
+			100644 blob $attr_oid	.gitattributes
+			EOF
+			git update-index --index-info <objs &&
+			git commit -m "test commit"
+		) &&
+		git clone $mode-collision $mode-collision-cloned &&
+		# Make sure z was really delayed
+		grep "IN: smudge $dir/z .* \\[DELAYED\\]" $mode-collision-cloned/delayed.log &&
+		# Should not create $dir/z at $symlink/z
+		test_path_is_missing $mode-collision/target-dir/z
+	'
+done
+
 test_done
diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index cd32a82..7bb9376 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -2,9 +2,15 @@
 # Example implementation for the Git filter protocol version 2
 # See Documentation/gitattributes.txt, section "Filter Protocol"
 #
-# The first argument defines a debug log file that the script write to.
-# All remaining arguments define a list of supported protocol
-# capabilities ("clean", "smudge", etc).
+# Usage: rot13-filter.pl [--always-delay] <log path> <capabilities>
+#
+# Log path defines a debug log file that the script writes to. The
+# subsequent arguments define a list of supported protocol capabilities
+# ("clean", "smudge", etc).
+#
+# When --always-delay is given all pathnames with the "can-delay" flag
+# that don't appear on the list below are delayed with a count of 1
+# (see more below).
 #
 # This implementation supports special test cases:
 # (1) If data with the pathname "clean-write-fail.r" is processed with
@@ -53,6 +59,13 @@ sub gitperllib {
 use Git::Packet;
 
 my $MAX_PACKET_CONTENT_SIZE = 65516;
+
+my $always_delay = 0;
+if ( $ARGV[0] eq '--always-delay' ) {
+	$always_delay = 1;
+	shift @ARGV;
+}
+
 my $log_file                = shift @ARGV;
 my @capabilities            = @ARGV;
 
@@ -134,6 +147,8 @@ sub rot13 {
 			if ( $buffer eq "can-delay=1" ) {
 				if ( exists $DELAY{$pathname} and $DELAY{$pathname}{"requested"} == 0 ) {
 					$DELAY{$pathname}{"requested"} = 1;
+				} elsif ( !exists $DELAY{$pathname} and $always_delay ) {
+					$DELAY{$pathname} = { "requested" => 1, "count" => 1 };
 				}
 			} elsif ($buffer =~ /^(ref|treeish|blob)=/) {
 				print $debug " $buffer";
diff --git a/t/t2006-checkout-index-basic.sh b/t/t2006-checkout-index-basic.sh
index 8e181db..602d8fe 100755
--- a/t/t2006-checkout-index-basic.sh
+++ b/t/t2006-checkout-index-basic.sh
@@ -32,4 +32,44 @@ test_expect_success 'checkout-index reports errors (stdin)' '
 	test_i18ngrep not.in.the.cache stderr
 '
 
+for mode in 'case' 'utf-8'
+do
+	case "$mode" in
+	case)	dir='A' symlink='a' mode_prereq='CASE_INSENSITIVE_FS' ;;
+	utf-8)
+		dir=$(printf "\141\314\210") symlink=$(printf "\303\244")
+		mode_prereq='UTF8_NFD_TO_NFC' ;;
+	esac
+
+	test_expect_success SYMLINKS,$mode_prereq \
+	"checkout-index with $mode-collision don't write to the wrong place" '
+		git init $mode-collision &&
+		(
+			cd $mode-collision &&
+			mkdir target-dir &&
+			empty_obj_hex=$(git hash-object -w --stdin </dev/null) &&
+			symlink_hex=$(printf "%s" "$PWD/target-dir" | git hash-object -w --stdin) &&
+			cat >objs <<-EOF &&
+			100644 blob ${empty_obj_hex}	${dir}/x
+			100644 blob ${empty_obj_hex}	${dir}/y
+			100644 blob ${empty_obj_hex}	${dir}/z
+			120000 blob ${symlink_hex}	${symlink}
+			EOF
+			git update-index --index-info <objs &&
+			# Note: the order is important here to exercise the
+			# case where the file at ${dir} has its type changed by
+			# the time Git tries to check out ${dir}/z.
+			#
+			# Also, we use core.precomposeUnicode=false because we
+			# want Git to treat the UTF-8 paths transparently on
+			# Mac OS, matching what is in the index.
+			#
+			git -c core.precomposeUnicode=false checkout-index -f \
+				${dir}/x ${dir}/y ${symlink} ${dir}/z &&
+			# Should not create ${dir}/z at ${symlink}/z
+			test_path_is_missing target-dir/z
+		)
+	'
+done
+
 test_done
-- 
2.17.1