cmark - My own fork of cmark for commonmark conversion

Age	Commit message	Author
2016-06-06	buffer: proper safety checks for unbounded memory	Vicent Marti
	The previous work for unbounded memory usage and overflows on the buffer API had several shortcomings:
	1. The total size of the buffer was limited by arbitrarily small precision on the storage type for buffer indexes (typedef'd as `bufsize_t`). This is not a good design pattern in secure applications, particularly since it requires the addition of helper functions to cast to/from the native `size` types and the custom type for the buffer, and check for overflows.
	2. The library was calling `abort` on overflow and memory allocation failures. This is not a good practice for production libraries, since it turns a potential RCE into a trivial, guaranteed DoS to the whole application that is linked against the library. It defeats the whole point of performing overflow or allocation checks when the checks will crash the library and the enclosing program anyway.
	3. The default size limits for buffers were essentially unbounded (capped to the precision of the storage type) and could lead to DoS attacks by simple memory exhaustion (particularly critical on 32-bit platforms). This is not a good practice for a library that handles arbitrary user input.
	Hence, this patchset provides slight (but in my opinion critical) improvements in this area, copying some of the patterns we've used in the past for high-throughput, security-sensitive Markdown parsers:
	1. The storage type for buffer sizes is now platform native (`ssize_t`). Ideally, this would be a `size_t`, but several parts of the code expect buffer indexes to be possibly negative. Either way, switching to a `size` type is a strict improvement, particularly on 64-bit platforms. All the helpers that ensured that values cannot escape the `size` range have been removed, since they are superfluous.
	2. The overflow checks have been removed. Instead, the maximum size for a buffer has been set to a safe value for production usage (32 MB) that can be proven not to overflow in practice. Users that need to parse particularly large Markdown documents can increase this value. A static, compile-time check has been added to ensure that the maximum buffer size cannot overflow on any growth operations.
	3. The library no longer aborts on buffer overflow. The CMark library now follows the convention of other Markdown implementations (such as Hoedown and Sundown) and silently handles buffer overflows and allocation failures by dropping data from the buffer. The result is that pathological Markdown documents that try to exploit the library will instead generate truncated (but valid, and safe) outputs.
	All tests after these small refactorings have been verified to pass.
	---
	NOTE: Regarding 32-bit overflows, generating test cases that crash the library is trivial (any input document larger than 2 GB will crash CMark), but most Python implementations have issues with large strings to begin with, so a test case cannot be added to the pathological test suite, since it's written in Python.
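The truncating behaviour described in this message can be sketched roughly as follows. This is only an illustration under stated assumptions: `sketch_buf`, `sketch_buf_put` and `SKETCH_BUF_MAX` are hypothetical names, not cmark's actual buffer API, and the growth policy (doubling from 64 bytes) is an assumption as well.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cap (32 MiB), mirroring the limit mentioned in the message. */
#define SKETCH_BUF_MAX ((size_t)32 * 1024 * 1024)

/* Compile-time check: doubling up to the cap can never overflow size_t. */
static_assert(SKETCH_BUF_MAX <= SIZE_MAX / 2, "buffer cap too large");

/* Hypothetical growable buffer, not cmark's real struct. */
typedef struct {
  unsigned char *ptr;
  size_t size;  /* bytes in use */
  size_t asize; /* bytes allocated */
} sketch_buf;

/* Append up to len bytes. When the cap is reached or allocation fails,
 * the excess data is dropped instead of aborting.
 * Returns the number of bytes actually stored. */
static size_t sketch_buf_put(sketch_buf *buf, const void *data, size_t len) {
  size_t room = SKETCH_BUF_MAX - buf->size;
  if (len > room)
    len = room; /* truncate to what still fits under the cap */
  if (len == 0)
    return 0;

  size_t needed = buf->size + len; /* cannot exceed SKETCH_BUF_MAX */
  if (needed > buf->asize) {
    size_t new_asize = buf->asize ? buf->asize : 64;
    while (new_asize < needed)
      new_asize *= 2; /* safe: new_asize < needed <= SIZE_MAX / 2 */
    if (new_asize > SKETCH_BUF_MAX)
      new_asize = SKETCH_BUF_MAX;
    unsigned char *p = realloc(buf->ptr, new_asize);
    if (p == NULL)
      return 0; /* allocation failed: drop the data, keep old contents */
    buf->ptr = p;
    buf->asize = new_asize;
  }
  memcpy(buf->ptr + buf->size, data, len);
  buf->size += len;
  return len;
}
```

A caller that cares about truncation can compare the return value with the requested length; a caller that does not can ignore it and still end up with a valid, shorter buffer.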
2016-06-06	Fix character type detection in commonmark.c	Nick Wellnhofer
	- Implement cmark_isalpha.
	- Check for ASCII character before implicit cast to char.
	- Use internal ctype functions in commonmark.c.
	Fixes test failures on Windows and undefined behavior.
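As a rough illustration of why the ASCII check has to come before the narrowing cast, here is a minimal sketch. The function names are hypothetical and this is not the code from the commit; it only assumes that the value being classified is a Unicode code point held in a wider integer.

```c
#include <stdint.h>

/* Hypothetical locale-independent test: only ASCII letters qualify. */
static int sketch_isalpha(unsigned char c) {
  return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Classify a code point. Anything outside ASCII is rejected before the
 * value is narrowed, so U+0141 is never mistaken for 'A' (0x41). */
static int sketch_codepoint_is_ascii_alpha(int32_t cp) {
  if (cp < 0 || cp >= 0x80)
    return 0;
  return sketch_isalpha((unsigned char)cp);
}
```

Without the range check, narrowing U+0141 to a single byte would yield 0x41 and the letter test would wrongly succeed.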
2016-06-02	commonmark renderer: fixed code block as first in list item.	John MacFarlane
	We don't want a blank line before a code block when it's the first thing in a list item.
2016-06-01	renderer: no_linebreaks instead of no_wrap.	John MacFarlane
	We generally want this option to prohibit any breaking in things like headers (not just wraps, but softbreaks).
2016-06-01	Coerce realurllen to int.	John MacFarlane
	This is an alternate solution for pull request #132, which introduced a new warning on the comparison:
	latex.c:191:20: warning: comparison of integers of different signs: 'size_t' (aka 'unsigned long') and 'bufsize_t' (aka 'int') [-Wsign-compare]
	if (realurllen == link_text->as.literal.len &&
	    ~~~~~~~~~~ ^  ~~~~~~~~~~~~~~~~~~~~~~~~~
2016-06-01	Merge pull request #130 from MathieuDuponchelle/fix_unused_variable	John MacFarlane
	inlines: Remove unused variable "link_text"
2016-06-01	Merge pull request #132 from BenedictC/master	John MacFarlane
	Changed type from int to size_t to fix implicit type conversion warning
2016-06-01	- Changed type from int to size_t to fix implicit type conversion warning	Benedict Cohen

2016-06-01	inlines: Remove unused variable "link_text"	Mathieu Duponchelle

2016-05-26	Add 2016 to copyright	Kevin Burke
	I thought I had an outdated version of the binary because it printed 2015 for the version string.
2016-05-14	Better documentation of memory-freeing responsibilities.	John MacFarlane
	in cmark.h and its man page. Closes #124.
2016-04-26	Clarify that it's the caller's responsibility to free the buffer...	John MacFarlane
	returned by cmark_render_html etc. Closes #124.
2016-04-09	Reformatted.	John MacFarlane

2016-04-09	Fixed whitespace.	John MacFarlane

2016-04-09	Use library functions to insert nodes in emphasis/link processing.	John MacFarlane
	Previously we did this manually, which introduces many places where errors can creep in.
2016-04-09	Correctly handle list marker followed only by spaces.	John MacFarlane
	This change allows us to pass the new test introduced in 75f231503d2b5854f1ff517402d2751811295bf7. Previously when a list marker was followed only by spaces, cmark expected the following content to be indented by the same number of spaces. But in this case we should treat the line just like a blank line and set list padding accordingly.
2016-04-09	Fixed a number of issues relating to line wrapping.	John MacFarlane
	- Extend CMARK_OPT_NOBREAKS to all renderers and add `--nobreaks`.
	- Do not autowrap, regardless of width parameter, if CMARK_OPT_NOBREAKS is set.
	- Fixed CMARK_OPT_HARDBREAKS for LaTeX and man renderers.
	- Ensure that no auto-wrapping occurs if CMARK_OPT_NOBREAKS is enabled, or if output is CommonMark and CMARK_OPT_HARDBREAKS is enabled.
	- Updated man pages.
2016-04-09	Merge pull request #111 from PavloKapyshin/master	John MacFarlane
	Add library option to render softbreaks as spaces
2016-03-27	Merge pull request #118 from nwellnhof/win-eol-fix2	John MacFarlane
	Set stdin to binary mode on Windows
2016-03-27	Note that NOBREAKS option is HTML-only	Pavlo Kapyshin

2016-03-27	Set stdin to binary mode on Windows	Nick Wellnhofer
	Fixes EOLs when reading from stdin. Fully fixes issue #113.
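For readers unfamiliar with the Windows text-mode translation, switching stdin to binary mode usually looks something like the sketch below. The wrapper name is hypothetical and this is not necessarily how the commit does it; `_setmode`, `_fileno` and `_O_BINARY` are the standard Microsoft CRT facilities for this.

```c
#include <stdio.h>
#if defined(_WIN32)
#include <fcntl.h>
#include <io.h>
#endif

/* Hypothetical helper: put stdin into binary mode on Windows so that
 * CRLF sequences reach the parser untranslated. On other platforms
 * there is no text/binary distinction, so this does nothing.
 * Returns 0 on success, -1 on failure. */
static int sketch_set_stdin_binary(void) {
#if defined(_WIN32)
  return _setmode(_fileno(stdin), _O_BINARY) == -1 ? -1 : 0;
#else
  return 0;
#endif
}
```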
2016-03-26	Handle buffer split across a CRLF line ending (closes #117).	John MacFarlane
	Adds an internal field to the parser struct to keep track of last_buffer_ended_with_cr.
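The problem being fixed can be shown with a small sketch that counts line endings across separately fed chunks. Only the field name `last_buffer_ended_with_cr` comes from the commit message; the struct, the function and the counting logic are assumptions made for illustration, not cmark's parser code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical parser state for the sketch. */
typedef struct {
  bool last_buffer_ended_with_cr; /* name taken from the commit message */
  size_t line_endings;
} sketch_parser;

/* Count line endings in one chunk, treating CR, LF and CRLF as a single
 * ending each, even when the CR and the LF arrive in different chunks. */
static void sketch_feed(sketch_parser *p, const char *buf, size_t len) {
  size_t i = 0;
  /* An LF that completes a CRLF begun in the previous chunk was already
   * counted when the CR was seen, so skip it. */
  if (p->last_buffer_ended_with_cr && len > 0 && buf[0] == '\n')
    i = 1;
  if (len > 0)
    p->last_buffer_ended_with_cr = false;

  for (; i < len; i++) {
    if (buf[i] == '\r') {
      p->line_endings++;
      if (i + 1 < len && buf[i + 1] == '\n')
        i++; /* CRLF fully inside this chunk */
      else if (i + 1 == len)
        p->last_buffer_ended_with_cr = true; /* LF may follow later */
    } else if (buf[i] == '\n') {
      p->line_endings++;
    }
  }
}
```

Feeding `"a\r"` followed by `"\nb\r\n"` yields two line endings with this approach, where a parser without the flag would count three.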
2016-03-26	Merge pull request #115 from nwellnhof/tab-fix	John MacFarlane
	Reset partially_consumed_tab on every new line
2016-03-26	Open files in binary mode	Nick Wellnhofer
	Now that cmark supports different line endings, files must be opened in binary mode on Windows. Fixes issue #113.
2016-03-26	Reset partially_consumed_tab on every new line	Nick Wellnhofer
	Fixes issue #114.
2016-03-23	Doc: clarify that cmark_node_free frees a node's children too.	John MacFarlane

2016-03-18	Add library option to render softbreaks as spaces	Pavlo Kapyshin

2016-03-12	Compile in plain C mode with MSVC 12.0 or newer	Nick Wellnhofer
	Under MSVC, we used to compile in C++ mode to get some C99 features like mixing declarations and code. With newer MSVC versions, it's possible to build in plain C mode.
2016-03-12	Don't use variable length arrays	Nick Wellnhofer
	They're not supported by MSVC.
2016-03-12	Switch from "inline" to "CMARK_INLINE"	Nick Wellnhofer
	Newer MSVC versions support enough of C99 to be able to compile cmark in plain C mode. Only the "inline" keyword is still unsupported. We have to use "__inline" instead.
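A portability macro of this kind is typically defined along the following lines. The macro name `SKETCH_INLINE` and the helper function are hypothetical, and the exact preprocessor condition used for CMARK_INLINE in cmark may differ.

```c
/* Hypothetical equivalent of an inline portability macro: MSVC in plain
 * C mode spells the keyword "__inline"; other compilers use "inline". */
#if defined(_MSC_VER) && !defined(__cplusplus)
#define SKETCH_INLINE __inline
#else
#define SKETCH_INLINE inline
#endif

/* Example of a small helper declared with the macro. */
static SKETCH_INLINE int sketch_is_space_or_tab(int c) {
  return c == ' ' || c == '\t';
}
```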
2016-02-28	Fix ctype(3) usage on NetBSD	Kamil Rytarowski
	We need to cast value passed to isspace(3) to unsigned char to explicitly prevent possibly undefined behavior.
	/tmp/pkgsrc-tmp/wip/cmark/work/cmark-0.24.1/src/commonmark.c: In function 'S_render_node':
	/tmp/pkgsrc-tmp/wip/cmark/work/cmark-0.24.1/src/commonmark.c:273:9: warning: array subscript has type 'char' [-Wchar-subscripts]
	(code_len > 2 && !isspace(code[0]) &&
	^
	/tmp/pkgsrc-tmp/wip/cmark/work/cmark-0.24.1/src/commonmark.c:274:10: warning: array subscript has type 'char' [-Wchar-subscripts]
	!(isspace(code[code_len - 1]) && isspace(code[code_len - 2]))) &&
	^
	/tmp/pkgsrc-tmp/wip/cmark/work/cmark-0.24.1/src/commonmark.c:274:10: warning: array subscript has type 'char' [-Wchar-subscripts]
	CTYPE(3) Library Functions Manual CTYPE(3)
	NAME
	isalpha, isupper, islower, isdigit, isxdigit, isalnum, isspace, ispunct, isprint, isgraph, iscntrl, isblank, toupper, tolower, - character classification and mapping functions
	LIBRARY
	Standard C Library (libc, -lc)
	CAVEATS
	The first argument of these functions is of type int, but only a very restricted subset of values are actually valid. The argument must either be the value of the macro EOF (which has a negative value), or must be a non-negative value within the range representable as unsigned char. Passing invalid values leads to undefined behavior.
	NetBSD 7.99 February 25, 2015 NetBSD 7.99
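The rule quoted from the man page boils down to casting through `unsigned char` before calling any ctype(3) function. A minimal sketch, using a hypothetical helper rather than the code touched by the commit:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count leading whitespace bytes in a string.
 * The cast keeps bytes >= 0x80 from being passed to isspace() as
 * negative ints on platforms where plain char is signed. */
static size_t sketch_leading_spaces(const char *s) {
  size_t n = 0;
  while (s[n] != '\0' && isspace((unsigned char)s[n]))
    n++;
  return n;
}
```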
2016-02-17	Update cmark.h	Chris Eidhof

2016-02-12	blocks: More documentation and refactoring	Mathieu Duponchelle

2016-02-10	Removed unnecessary check for empty string_content.	John MacFarlane

2016-02-10	Revert "Simplified condition for lazy line."	John MacFarlane
	This reverts commit 4d2d486333c358eb3adf3d0649163e319a3b8b69.
	This commit caused a valgrind invalid read.
	==29731== Invalid read of size 4
	==29731==    at 0x40500E: S_process_line (blocks.c:1050)
	==29731==    by 0x403CF7: S_parser_feed (blocks.c:526)
	==29731==    by 0x403BC9: cmark_parser_feed (blocks.c:494)
	==29731==    by 0x433A95: main (main.c:168)
	==29731==  Address 0x51d5b60 is 64 bytes inside a block of size 128 free'd
	==29731==    at 0x4C27D4E: free (vg_replace_malloc.c:427)
	==29731==    by 0x4015F0: S_free_nodes (node.c:134)
	==29731==    by 0x401634: cmark_node_free (node.c:142)
	==29731==    by 0x4033B1: finalize (blocks.c:259)
	==29731==    by 0x40365E: add_child (blocks.c:337)
	==29731==    by 0x4046D8: try_new_container_starts (blocks.c:836)
	==29731==    by 0x404F12: S_process_line (blocks.c:1015)
	==29731==    by 0x403CF7: S_parser_feed (blocks.c:526)
	==29731==    by 0x403BC9: cmark_parser_feed (blocks.c:494)
	==29731==    by 0x433A95: main (main.c:168)
2016-02-09	Factored out contains_inlines.	John MacFarlane

2016-02-09	Simplified condition for lazy line.	John MacFarlane

2016-02-09	Added code comments.	John MacFarlane

2016-02-09	Added code comment.	John MacFarlane

2016-02-06	Code cleanup: add function to test for space or tab.	John MacFarlane

2016-02-06	Use an assertion to check for in-range html_block_type.	John MacFarlane
	It's a programming error if the type is out of range.
2016-02-06	Merge branch 'refactor-S_processLine' of https://github.com/MathieuDuponchelle/cmark into MathieuDuponchelle-refactor-S_processLine	John MacFarlane
2016-02-06	Fixed handling of tabs in lists.	John MacFarlane

2016-02-07	blocks: Factorize S_processLines	Mathieu Duponchelle
	It's the core of the program and I had too much trouble making sense of it, two loops with many cases and other code interspersed hurt my head. All the tests still passed before rebasing, now I've got the exact same set of issues as master.
2016-02-06	Properly handle tabs with blockquotes and fenced blocks.	John MacFarlane

2016-02-06	Clarify logic in S_advance_offset.	John MacFarlane

2016-02-06	Generated scanners.c with more recent re2c.	John MacFarlane

2016-02-06	S_advance_offset: Only set partially_consumed_tab in columns mode.	John MacFarlane

2016-02-05	Simplified add_line (only need parser parameter).	John MacFarlane

2016-02-05	Properly handle partially consumed tab.	John MacFarlane
	E.g. in
	```
	- foo
	<TAB><TAB>bar
	```
	we should consume two spaces from the second tab, including two spaces in the code block.
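The column arithmetic behind this example can be sketched as below. This is a simplified model assuming a tab stop of 4 and hypothetical names (`sketch_pos`, `sketch_advance_columns`, `sketch_remaining_tab_columns`); it is not cmark's S_advance_offset, only an illustration of how a tab can be consumed in part.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_TAB_STOP 4 /* assumed tab stop width */

/* Hypothetical cursor into a line. */
typedef struct {
  size_t offset; /* byte offset into the line */
  size_t column; /* virtual column, with tabs expanded */
  bool partially_consumed_tab;
} sketch_pos;

/* Advance by `count` columns. A tab that spans more columns than are
 * still requested stays in place and is flagged as partially consumed. */
static void sketch_advance_columns(sketch_pos *pos, const char *line,
                                   size_t count) {
  while (count > 0 && line[pos->offset] != '\0') {
    if (line[pos->offset] == '\t') {
      size_t to_tab = SKETCH_TAB_STOP - (pos->column % SKETCH_TAB_STOP);
      if (to_tab > count) {
        pos->partially_consumed_tab = true;
        pos->column += count;
        count = 0;
      } else {
        pos->partially_consumed_tab = false;
        pos->column += to_tab;
        pos->offset++;
        count -= to_tab;
      }
    } else {
      pos->partially_consumed_tab = false;
      pos->column++;
      pos->offset++;
      count--;
    }
  }
}

/* Columns of whitespace still pending from a partially consumed tab. */
static size_t sketch_remaining_tab_columns(const sketch_pos *pos) {
  return pos->partially_consumed_tab
             ? SKETCH_TAB_STOP - (pos->column % SKETCH_TAB_STOP)
             : 0;
}
```

For the line `<TAB><TAB>bar`, advancing 2 columns for the list item and then 4 for the code block indentation leaves the cursor on the second tab with two columns pending, which become the two leading spaces of the code block content.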