diff options
author | Vicent Marti <tanoku@gmail.com> | 2016-05-23 16:43:13 +0200 |
---|---|---|
committer | John MacFarlane <jgm@berkeley.edu> | 2016-06-06 15:39:05 -0700 |
commit | b2107ae4f05d7fe4388d37025105005592d32ef7 (patch) | |
tree | 5d9f37aaf75c8f6352a2f68a2f4ae2d82e7f65e8 /src/buffer.h | |
parent | 258a88d6c903234831281d4495eff7382932b617 (diff) |
buffer: proper safety checks for unbounded memory
The previous work for unbounded memory usage and overflows on the buffer
API had several shortcomings:
1. The total size of the buffer was limited by arbitrarily small
precision on the storage type for buffer indexes (typedef'd as
`bufsize_t`). This is not a good design pattern in secure applications,
particualarly since it requires the addition of helper functions to cast
to/from the native `size` types and the custom type for the buffer, and
check for overflows.
2. The library was calling `abort` on overflow and memory allocation
failures. This is not a good practice for production libraries, since it
turns a potential RCE into a trivial, guaranteed DoS to the whole
application that is linked against the library. It defeats the whole
point of performing overflow or allocation checks when the checks will
crash the library and the enclosing program anyway.
3. The default size limits for buffers were essentially unbounded
(capped to the precision of the storage type) and could lead to DoS
attacks by simple memory exhaustion (particularly critical in 32-bit
platforms). This is not a good practice for a library that handles
arbitrary user input.
Hence, this patchset provides slight (but in my opinion critical)
improvements on this area, copying some of the patterns we've used in
the past for high throughput, security sensitive Markdown parsers:
1. The storage type for buffer sizes is now platform native (`ssize_t`).
Ideally, this would be a `size_t`, but several parts of the code expect
buffer indexes to be possibly negative. Either way, switching to a
`size` type is an strict improvement, particularly in 64-bit platforms.
All the helpers that assured that values cannot escape the `size` range
have been removed, since they are superfluous.
2. The overflow checks have been removed. Instead, the maximum size for
a buffer has been set to a safe value for production usage (32mb) that
can be proven not to overflow in practice. Users that need to parse
particularly large Markdown documents can increase this value. A static,
compile-time check has been added to ensure that the maximum buffer size
cannot overflow on any growth operations.
3. The library no longer aborts on buffer overflow. The CMark library
now follows the convention of other Markdown implementations (such as
Hoedown and Sundown) and silently handles buffer overflows and
allocation failures by dropping data from the buffer. The result is
that pathological Markdown documents that try to exploit the library
will instead generate truncated (but valid, and safe) outputs.
All tests after these small refactorings have been verified to pass.
---
NOTE: Regarding 32 bit overflows, generating test cases that crash the
library is trivial (any input document larger than 2gb will crash
CMark), but most Python implementations have issues with large strings
to begin with, so a test case cannot be added to the pathological tests
suite, since it's written in Python.
Diffstat (limited to 'src/buffer.h')
-rw-r--r-- | src/buffer.h | 37 |
1 files changed, 20 insertions, 17 deletions
diff --git a/src/buffer.h b/src/buffer.h index 6fd0cae..ad4f341 100644 --- a/src/buffer.h +++ b/src/buffer.h @@ -5,13 +5,15 @@ #include <stdarg.h> #include <string.h> #include <limits.h> +#include <unistd.h> +#include <stdbool.h> #include "config.h" #ifdef __cplusplus extern "C" { #endif -typedef int bufsize_t; +typedef ssize_t bufsize_t; typedef struct { unsigned char *ptr; @@ -22,7 +24,22 @@ extern unsigned char cmark_strbuf__initbuf[]; #define GH_BUF_INIT \ { cmark_strbuf__initbuf, 0, 0 } -#define BUFSIZE_MAX INT_MAX + +/* + * Maximum size for memory storage on any given `cmark_strbuf` object. + * + * This is a "safe" value to prevent unbounded memory growth when + * parsing arbitrarily large (and potentially malicious) documents. + * + * It is currently set to 32mb, which is a reasonable default for + * production applications. If you need to parse documents larger than + * that, you can increase this value up to `SSIZE_MAX / 2` (which in + * practice resolves to 1/4th of the total address space for the program). + * + * Anything larger than that is a security threat and hence static checks + * will prevent CMark from compiling. + */ +#define BUFSIZE_MAX (32 * 1024 * 1024) /** * Initialize a cmark_strbuf structure. @@ -35,7 +52,7 @@ void cmark_strbuf_init(cmark_strbuf *buf, bufsize_t initial_size); /** * Grow the buffer to hold at least `target_size` bytes. */ -void cmark_strbuf_grow(cmark_strbuf *buf, bufsize_t target_size); +bool cmark_strbuf_grow(cmark_strbuf *buf, bufsize_t target_size); void cmark_strbuf_free(cmark_strbuf *buf); void cmark_strbuf_swap(cmark_strbuf *buf_a, cmark_strbuf *buf_b); @@ -72,20 +89,6 @@ void cmark_strbuf_trim(cmark_strbuf *buf); void cmark_strbuf_normalize_whitespace(cmark_strbuf *s); void cmark_strbuf_unescape(cmark_strbuf *s); -/* Print error and abort. */ -void cmark_strbuf_overflow_err(void); - -static CMARK_INLINE bufsize_t cmark_strbuf_check_bufsize(size_t size) { - if (size > BUFSIZE_MAX) { - cmark_strbuf_overflow_err(); - } - return (bufsize_t)size; -} - -static CMARK_INLINE bufsize_t cmark_strbuf_safe_strlen(const char *str) { - return cmark_strbuf_check_bufsize(strlen(str)); -} - #ifdef __cplusplus } #endif |