cmark - My own fork of cmark for commonmark conversion

Age	Commit message	Author
2020-01-23	Rearrange struct cmark_node	Nick Wellnhofer
	Introduce multi-purpose data/len members in struct cmark_node. This is mainly used to store literal text for inlines, code and HTML blocks. Move the content strbuf for blocks from cmark_node to cmark_parser. When finalizing nodes that allow inlines (paragraphs and headings), detach the strbuf and store the block content in the node's data/len members. Free the block content after processing inlines. Reduces size of struct cmark_node by 8 bytes.
2020-01-23	Use C string instead of chunk in renderer	Nick Wellnhofer
	Fix another place where an "allocated" cmark_chunk was used.
2020-01-10	Add options field to cmark_renderer.	John MacFarlane
	This is an internal change, as this isn't part of the public API.
2019-04-06	render: only emit actual newline when escape mode is LITERAL.	John MacFarlane
	In other contexts (in markdown content, for example) we want some kind of escaping, not a literal newline.
2019-03-17	Merge pull request #254 from github/empty-input	John MacFarlane
	Check for empty buffer when rendering
2018-10-31	render.c: reset last_breakable after cr.	John MacFarlane
	Fixes jgm/pandoc#5033.
2018-02-20	Check for empty buffer when rendering	Phil Turnbull
	For empty documents, `->size` is zero so `renderer.buffer->ptr[renderer.buffer->size - 1]` will cause an out-of-bounds read. Empty buffers always point to the global `cmark_strbuf__initbuf` buffer so we read `cmark_strbuf__initbuf[-1]`.
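The guard described in this commit can be sketched roughly as follows. This is a minimal illustration under stated assumptions: `sketch_buf` and `sketch_ends_with_newline` are hypothetical names, not cmark's API, and only the `ptr`/`size` field names follow the message above.

```c
#include <stddef.h>

/* Hypothetical stand-in for a string buffer; not cmark's actual type. */
typedef struct {
  const unsigned char *ptr; /* buffer contents */
  ptrdiff_t size;           /* number of bytes in use; 0 when empty */
} sketch_buf;

/* Returns 1 if the last byte in the buffer is a newline, 0 otherwise.
 * The size test comes first, so an empty buffer never reads ptr[-1]. */
static int sketch_ends_with_newline(const sketch_buf *buf) {
  if (buf->size <= 0) {
    return 0;
  }
  return buf->ptr[buf->size - 1] == '\n';
}
```

A caller that wants to ensure a final newline would append one whenever this function returns 0, which also covers the empty-document case.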
2016-06-24	Reformatted.	John MacFarlane

2016-06-06	cmark: Implement support for custom allocators	Vicent Marti

2016-06-06	buffer: proper safety checks for unbounded memory	Vicent Marti
	The previous work for unbounded memory usage and overflows on the buffer API had several shortcomings:
	1. The total size of the buffer was limited by arbitrarily small precision on the storage type for buffer indexes (typedef'd as `bufsize_t`). This is not a good design pattern in secure applications, particularly since it requires the addition of helper functions to cast to/from the native `size` types and the custom type for the buffer, and check for overflows.
	2. The library was calling `abort` on overflow and memory allocation failures. This is not a good practice for production libraries, since it turns a potential RCE into a trivial, guaranteed DoS to the whole application that is linked against the library. It defeats the whole point of performing overflow or allocation checks when the checks will crash the library and the enclosing program anyway.
	3. The default size limits for buffers were essentially unbounded (capped to the precision of the storage type) and could lead to DoS attacks by simple memory exhaustion (particularly critical in 32-bit platforms). This is not a good practice for a library that handles arbitrary user input.
	Hence, this patchset provides slight (but in my opinion critical) improvements in this area, copying some of the patterns we've used in the past for high throughput, security sensitive Markdown parsers:
	1. The storage type for buffer sizes is now platform native (`ssize_t`). Ideally, this would be a `size_t`, but several parts of the code expect buffer indexes to be possibly negative. Either way, switching to a `size` type is a strict improvement, particularly in 64-bit platforms. All the helpers that assured that values cannot escape the `size` range have been removed, since they are superfluous.
	2. The overflow checks have been removed. Instead, the maximum size for a buffer has been set to a safe value for production usage (32 MB) that can be proven not to overflow in practice. Users that need to parse particularly large Markdown documents can increase this value. A static, compile-time check has been added to ensure that the maximum buffer size cannot overflow on any growth operations.
	3. The library no longer aborts on buffer overflow. The CMark library now follows the convention of other Markdown implementations (such as Hoedown and Sundown) and silently handles buffer overflows and allocation failures by dropping data from the buffer. The result is that pathological Markdown documents that try to exploit the library will instead generate truncated (but valid, and safe) outputs.
	All tests after these small refactorings have been verified to pass.
	NOTE: Regarding 32-bit overflows, generating test cases that crash the library is trivial (any input document larger than 2 GB will crash CMark), but most Python implementations have issues with large strings to begin with, so a test case cannot be added to the pathological test suite, since it's written in Python.
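The combination described in this commit (a fixed maximum size, a compile-time check on that maximum, and dropping data instead of aborting) might look roughly like the sketch below. All names here (`sketch_buf`, `sketch_buf_put`, `SKETCH_MAX_SIZE`) are hypothetical, `ptrdiff_t` stands in for `ssize_t` to stay within standard C, and the growth policy is an assumption rather than the patch's actual code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cap for illustration; the commit message mentions 32 MB. */
#define SKETCH_MAX_SIZE ((ptrdiff_t)32 * 1024 * 1024)

/* Compile-time check: the array size becomes -1, and the build fails,
 * if doubling a capacity below the cap could overflow ptrdiff_t. */
typedef char sketch_cap_is_safe[(SKETCH_MAX_SIZE <= PTRDIFF_MAX / 2) ? 1 : -1];

typedef struct {
  unsigned char *ptr;
  ptrdiff_t size;  /* bytes in use */
  ptrdiff_t asize; /* bytes allocated */
} sketch_buf;

/* Appends up to len bytes. Bytes that would exceed the cap are dropped,
 * and nothing is appended if allocation fails. Never aborts.
 * Returns the number of bytes actually appended. */
static ptrdiff_t sketch_buf_put(sketch_buf *buf, const void *data,
                                ptrdiff_t len) {
  ptrdiff_t room = SKETCH_MAX_SIZE - buf->size;
  if (len <= 0 || room <= 0) {
    return 0;
  }
  if (len > room) {
    len = room; /* truncate instead of overflowing */
  }
  if (buf->size + len > buf->asize) {
    ptrdiff_t new_asize = buf->asize > 0 ? buf->asize : 64;
    unsigned char *p;
    while (new_asize < buf->size + len) {
      new_asize *= 2; /* cannot overflow, per the static check above */
    }
    if (new_asize > SKETCH_MAX_SIZE) {
      new_asize = SKETCH_MAX_SIZE;
    }
    p = realloc(buf->ptr, (size_t)new_asize);
    if (p == NULL) {
      return 0; /* allocation failure: drop the data, keep the old buffer */
    }
    buf->ptr = p;
    buf->asize = new_asize;
  }
  memcpy(buf->ptr + buf->size, data, (size_t)len);
  buf->size += len;
  return len;
}
```

The design trade-off is the one the message argues for: a caller gets truncated output for oversized input rather than a crashed process.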
2016-06-01	renderer: no_linebreaks instead of no_wrap.	John MacFarlane
	We generally want this option to prohibit any breaking in things like headers (not just wraps, but softbreaks).
2016-03-12	Switch from "inline" to "CMARK_INLINE"	Nick Wellnhofer
	Newer MSVC versions support enough of C99 to be able to compile cmark in plain C mode. Only the "inline" keyword is still unsupported. We have to use "__inline" instead.
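One common way to define such a macro is sketched below. The exact condition used in cmark may differ; `sketch_is_digit` is a hypothetical helper added only to show the macro in use.

```c
/* Pick an inline keyword the compiler understands.
 * _MSC_VER is defined by MSVC; __cplusplus is defined when compiling as C++,
 * where plain "inline" is always available. */
#if defined(_MSC_VER) && !defined(__cplusplus)
#define CMARK_INLINE __inline
#else
#define CMARK_INLINE inline
#endif

/* Hypothetical helper declared with the macro instead of bare "inline". */
static CMARK_INLINE int sketch_is_digit(int c) {
  return c >= '0' && c <= '9';
}
```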
2016-01-18	Automatic code reformat.	John MacFarlane

2016-01-17	Improved escaping in commonmark renderer.	John MacFarlane
	We try not to escape punctuation unless we absolutely have to. So, `)` and `.` are no longer escaped whenever they occur after digits; now they are only escaped if they are genuinely in a position where they'd cause a list item. This required a couple of changes to render.c. - `renderer->begin_content` is only set to false AFTER a string of digits at the beginning of the line. (This is slightly unprincipled.) - We never break before a numeral (also slightly unprincipled).
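The escaping rule described here can be illustrated with a simplified, string-based check. This is a sketch under assumptions, not the renderer's code: `sketch_needs_escape` is a hypothetical function, it ignores leading indentation and the limit on the number of digits in a list marker, and the caller must pass a `pos` that is inside the string.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if line[pos] is a '.' or ')' that would start an ordered list
 * item: everything before it on the line is one or more digits, and it is
 * followed by whitespace or the end of the string. Returns 0 otherwise. */
static int sketch_needs_escape(const char *line, size_t pos) {
  size_t i;
  char c = line[pos];
  char next;

  if (c != '.' && c != ')') {
    return 0;
  }
  if (pos == 0) {
    return 0; /* no digits before the delimiter */
  }
  for (i = 0; i < pos; i++) {
    if (!isdigit((unsigned char)line[i])) {
      return 0; /* something other than digits precedes it */
    }
  }
  next = line[pos + 1];
  return next == '\0' || isspace((unsigned char)next);
}
```

With this rule, the `.` in `1. item` is escaped, while the `.` in `3.14` and the one in `see 1. x` are left alone.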
2016-01-17	render: initialize begin_content to true.	John MacFarlane

2015-12-28	render: added begin_content field.	John MacFarlane
	This is like `begin_line` except that it doesn't trigger production of the prefix. So it can be set after an initial prefix (say `> `) is printed by the renderer, and consulted in determining whether to escape content that has a special meaning at the beginning of a line. Used in the commonmark renderer.
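The difference between the two flags can be sketched as follows. The struct and functions are hypothetical simplifications (a fixed-size output array, and `#` as the only character with a special meaning at the start of a line); only the field names `begin_line` and `begin_content` come from the message above.

```c
#include <string.h>

typedef struct {
  char out[128];      /* rendered output (fixed size for the sketch) */
  const char *prefix; /* e.g. "> " inside a block quote */
  int begin_line;     /* next output starts a line, so the prefix is due */
  int begin_content;  /* no content has been written on this line yet */
} sketch_renderer;

/* Appends s to the output, silently truncating if the array is full. */
static void sketch_append(sketch_renderer *r, const char *s) {
  strncat(r->out, s, sizeof(r->out) - strlen(r->out) - 1);
}

static void sketch_out(sketch_renderer *r, const char *s) {
  if (r->begin_line) {
    sketch_append(r, r->prefix);
    r->begin_line = 0; /* prefix emitted; begin_content stays set */
  }
  /* Escape a leading '#' only where it could be read as a heading marker. */
  if (r->begin_content && s[0] == '#') {
    sketch_append(r, "\\");
  }
  sketch_append(r, s);
  if (s[0] != '\0') {
    r->begin_content = 0;
  }
}
```

Starting from `begin_line = 1` and `begin_content = 1` with prefix `> `, writing `#1` and then `#2` escapes only the first `#`, because only that one sits at the beginning of the content.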
2015-08-06	Prefix utf8proc functions to avoid conflict with existing library	Kevin Wojniak

2015-07-27	Use clang-format, llvm style, for formatting.	John MacFarlane
	* Reformatted all source files. * Added 'format' target to Makefile. * Removed 'astyle' target. * Updated .editorconfig.
2015-07-14	astyle reformatting.	John MacFarlane

2015-07-12	Small cleanups.	John MacFarlane
	Moved begin_line setting into render.c, so you don't need to worry about it in outc.
2015-07-12	Fixed type on cmark_render_code_point.	John MacFarlane

2015-07-12	Added cmark_render_code_point.	John MacFarlane

2015-07-12	render: added cmark_render_ascii, to be used in char escapers.	John MacFarlane

2015-07-12	Removed options field from renderer struct.	John MacFarlane
	Added options argument to render_node function, and rearranged argument order.
2015-07-12	Removed enumlevel field of renderer.	John MacFarlane
	Now we just calculate this in the latex renderer.
2015-07-12	cmark_render: ensure final newline.	John MacFarlane
	This allows us to remove direct manipulation of buffer from the latex and commonmark renderers.
2015-07-11	Restructured common renderer code.	John MacFarlane
	* Added functions for cr, blankline, out to renderer object. * Removed lit (we'll handle this with a macro). * Changed type of out so it takes a regular string instead of a chunk. * Use macros LIT, OUT, BLANKLINE, CR in renderers to simplify code. (Not sure about this, but `renderer->out(renderer, ...)` sure is verbose.)
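The macro approach mentioned in this commit might look roughly like the sketch below. The macro names LIT, OUT, BLANKLINE and CR come from the message; everything else is an assumption made for illustration: the struct layout, the function signatures, the single `escape` flag, and the eager newline handling are hypothetical and simpler than a real renderer.

```c
#include <stddef.h>
#include <string.h>

typedef struct sketch_renderer sketch_renderer;
struct sketch_renderer {
  char buf[256];
  void (*cr)(sketch_renderer *);
  void (*blankline)(sketch_renderer *);
  void (*out)(sketch_renderer *, const char *, int escape);
};

/* The macros assume a variable named "renderer" is in scope. */
#define OUT(s) renderer->out(renderer, (s), 1)
#define LIT(s) renderer->out(renderer, (s), 0)
#define CR() renderer->cr(renderer)
#define BLANKLINE() renderer->blankline(renderer)

/* Appends s; when escape is nonzero, '*' is backslash-escaped. */
static void sketch_out(sketch_renderer *r, const char *s, int escape) {
  size_t n = strlen(r->buf);
  for (; *s != '\0' && n + 2 < sizeof(r->buf); s++) {
    if (escape && *s == '*') {
      r->buf[n++] = '\\';
    }
    r->buf[n++] = *s;
  }
  r->buf[n] = '\0';
}

/* Ends the current line unless the output is already at a line start. */
static void sketch_cr(sketch_renderer *r) {
  size_t n = strlen(r->buf);
  if (n > 0 && r->buf[n - 1] != '\n') {
    sketch_out(r, "\n", 0);
  }
}

/* Ensures non-empty output ends with an empty line. */
static void sketch_blankline(sketch_renderer *r) {
  size_t n;
  if (r->buf[0] == '\0') {
    return;
  }
  sketch_cr(r);
  n = strlen(r->buf);
  if (n < 2 || r->buf[n - 2] != '\n') {
    sketch_out(r, "\n", 0);
  }
}

/* A node renderer written with the macros instead of
 * renderer->out(renderer, ...) at every call site. */
static void sketch_render_heading(sketch_renderer *renderer,
                                  const char *text) {
  LIT("# ");
  OUT(text);
  BLANKLINE();
}
```

Rendering a heading with the text `a*b` through this sketch produces `# a\*b` followed by a blank line.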
2015-07-11	Rename cmark_render_state -> cmark_renderer.	John MacFarlane

2015-07-11	render: Simplified code, avoiding some allocations.	John MacFarlane

2015-07-11	Factored out common bits of rendering into separate render module.	John MacFarlane
	* Added render.c, render.h. * Moved common functions and definitions from latex.c and commonmark.c to render.c, render.h. * Added a wrapper, cmark_render, that creates a renderer given a character-escaper and a node renderer. Closes #63.
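The idea of a wrapper that takes a character-escaper and a node renderer can be sketched with function pointers. This is an illustration under assumptions, not the signature of `cmark_render`: the types, the flat array of text "nodes", and the sample LaTeX-style escaper are all hypothetical.

```c
#include <stddef.h>

typedef struct sketch_renderer sketch_renderer;
typedef void (*sketch_outc_fn)(sketch_renderer *r, char c);
typedef void (*sketch_node_fn)(sketch_renderer *r, const char *text);

struct sketch_renderer {
  char buf[128];
  size_t len;
  sketch_outc_fn outc; /* format-specific character escaper */
};

/* Appends one raw character, dropping it if the buffer is full. */
static void sketch_putc(sketch_renderer *r, char c) {
  if (r->len + 1 < sizeof(r->buf)) {
    r->buf[r->len++] = c;
    r->buf[r->len] = '\0';
  }
}

/* Shared driver: sets up the renderer, calls the format-specific node
 * renderer for each node, and ensures the output ends with a newline. */
static const char *sketch_render(sketch_renderer *r,
                                 const char *const *nodes, size_t count,
                                 sketch_outc_fn outc,
                                 sketch_node_fn render_node) {
  size_t i;
  r->len = 0;
  r->buf[0] = '\0';
  r->outc = outc;
  for (i = 0; i < count; i++) {
    render_node(r, nodes[i]);
  }
  if (r->len == 0 || r->buf[r->len - 1] != '\n') {
    sketch_putc(r, '\n');
  }
  return r->buf;
}

/* Sample escaper: backslash-escape '%', as a LaTeX writer would. */
static void sketch_latex_outc(sketch_renderer *r, char c) {
  if (c == '%') {
    sketch_putc(r, '\\');
  }
  sketch_putc(r, c);
}

/* Sample node renderer: one paragraph per line, escaped via outc. */
static void sketch_paragraph(sketch_renderer *r, const char *text) {
  for (; *text != '\0'; text++) {
    r->outc(r, *text);
  }
  sketch_putc(r, '\n');
}
```

Each output format then supplies only its escaper and node renderer, while buffer setup and the final-newline rule live in one place.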