Age | Commit message (Collapse) | Author |
|
The old one had many errors.
The new one is derived from the list in the npm entities package.
Since the sequences can now be longer (multi-code-point), we
have bumped the length limit from 4 to 8, which also affects
houdini_html_u.c.
An example of the kind of error that was fixed is given
in jgm/commonmark.js#47: `&ngE;` should be rendered as "≧̸" (U+02267
U+00338), but it's actually rendered as "≧" (which is the same as
`&gE;`).
|
|
This isn't actually needed.
|
|
It breaks on Windows.
|
|
|
|
Removed sundown, because the reading was anomalous.
This commit in hoedown caused the speed difference between
sundown and hoedown that I was measuring before (on 32-bit
machines):
https://github.com/hoedown/hoedown/commit/ca829ff83580ed52cc56c09a67c80119026bae20
As Nick Wellnhofer explains: "The commit removes a rather arbitrary
limit of 16MB for buffers. Your benchmark input probably results in
a buffer larger than 16MB. It also seems that hoedown didn't check
error returns thoroughly at the time of the commit. This basically means
that large input files could produce any kind of random behavior before
that commit, and that any benchmark that results in a too large buffer
can't be relied on."
|
|
Now we have an array of pointers (`potential_openers`),
keyed to the delim char.
When we've failed to match a potential opener prior to point X
in the delimiter stack, we reset `potential_openers` for that opener
type to X, and thus avoid having to look again through all the openers
we've already rejected.
See jgm/commonmark#43.
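
The lower-bound idea described above can be sketched roughly as follows. This is a simplified illustration, not the parser's actual code: the `delim` struct, `slot_for`, and `match_delims` are hypothetical names, the stack is a plain array instead of a linked list, and the sketch omits removing delimiters that sit between a matched pair. The `lower_bound` table plays the role of `potential_openers`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified delimiter record; a real parser tracks more. */
typedef struct {
  char ch;       /* '*' or '_' */
  int can_open;  /* may act as an opener */
  int can_close; /* may act as a closer */
  int active;    /* cleared once the delimiter has been matched */
} delim;

/* Map a delimiter character to a slot in the lower-bound table. */
static int slot_for(char ch) { return ch == '*' ? 0 : 1; }

/* Returns the number of matched pairs. *steps receives how many stack
 * entries were inspected while searching for openers. */
static size_t match_delims(delim *stack, size_t n, size_t *steps) {
  size_t lower_bound[2] = {0, 0}; /* stands in for potential_openers */
  size_t pairs = 0;

  assert(stack != NULL && steps != NULL);
  *steps = 0;

  for (size_t i = 0; i < n; i++) {
    if (!stack[i].active || !stack[i].can_close)
      continue;

    int slot = slot_for(stack[i].ch);
    int found = 0;

    /* Walk back from the closer, but never below the recorded bound. */
    for (size_t j = i; j > lower_bound[slot]; j--) {
      delim *cand = &stack[j - 1];
      (*steps)++;
      if (cand->active && cand->can_open && cand->ch == stack[i].ch) {
        cand->active = 0;
        stack[i].active = 0;
        pairs++;
        found = 1;
        break;
      }
    }

    /* Nothing below position i can open this delimiter type, so later
     * closers of the same type need not look there again. */
    if (!found)
      lower_bound[slot] = i;
  }
  return pairs;
}
```

With an input shaped like the pathological case (alternating `*` openers and `_` closers), each closer inspects at most two entries in this sketch, so the total work grows linearly with the input instead of quadratically.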
|
|
"*a_ " * 20000
See jgm/commonmark#43.
|
|
This way tests fail instead of just hanging.
Currently we use a 1 sec timeout.
Added a failing test from jgm/commonmark#43.
|
|
|
|
|
|
Many link closers with no openers.
Many link openers with no closers.
Many emph openers with no closers.
|
|
Many closers with no openers.
|
|
When they have no matching openers and cannot be openers themselves,
we can safely remove them.
This helps with a performance case: "a_ " * 20000.
See jgm/commonmark.js#43.
|
|
This reverts commit 54d1249c2caebf45a24d691dc765fb93c9a5e594, reversing
changes made to bc14d869323650e936c7143dcf941b28ccd5b57d.
|
|
|
|
Further optimize utf8proc_valid
|
|
Assume a multi-byte sequence and rework switch statement into if/else
for another 2% speedup.
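
For illustration, an if/else chain over the lead byte might look like the sketch below. It is not the actual `utf8proc_valid` code: `utf8_seq_len` and `is_cont` are hypothetical names, and unlike the function described above this version also accepts ASCII so that it can be used on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* True for UTF-8 continuation bytes (10xxxxxx). */
static int is_cont(uint8_t b) { return (b & 0xC0) == 0x80; }

/* Returns the byte length of the valid UTF-8 sequence at the start of s,
 * 0 for empty input, or -1 if the sequence is invalid or truncated. */
static int utf8_seq_len(const uint8_t *s, size_t len) {
  assert(s != NULL);
  if (len == 0)
    return 0;

  uint8_t b0 = s[0];
  if (b0 < 0x80) {
    return 1; /* ASCII */
  } else if (b0 < 0xC2) {
    return -1; /* stray continuation byte or overlong 2-byte lead */
  } else if (b0 < 0xE0) {
    if (len < 2 || !is_cont(s[1]))
      return -1;
    return 2;
  } else if (b0 < 0xF0) {
    if (len < 3 || !is_cont(s[1]) || !is_cont(s[2]))
      return -1;
    if (b0 == 0xE0 && s[1] < 0xA0)
      return -1; /* overlong encoding */
    if (b0 == 0xED && s[1] > 0x9F)
      return -1; /* UTF-16 surrogate range */
    return 3;
  } else if (b0 < 0xF5) {
    if (len < 4 || !is_cont(s[1]) || !is_cont(s[2]) || !is_cont(s[3]))
      return -1;
    if (b0 == 0xF0 && s[1] < 0x90)
      return -1; /* overlong encoding */
    if (b0 == 0xF4 && s[1] > 0x8F)
      return -1; /* above U+10FFFF */
    return 4;
  }
  return -1; /* 0xF5..0xFF never appear in UTF-8 */
}
```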
|
|
Optimize utf8proc_detab
|
|
Speeds up "make bench" by another percent.
|
|
Handle valid UTF-8 chars inside the main loop and avoid a call to
strbuf_put for every UTF-8 char.
Results in an 8% speedup in the UTF-8-heavy "make bench" on my system.
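
A rough sketch of the batching idea: scan over every byte that needs no special handling and copy the whole run at once, instead of appending one character at a time. `detab_sketch` and `TAB_STOP` are hypothetical names, the output is a caller-supplied array rather than a strbuf, and the sketch assumes the input is already valid UTF-8 (it does not replace invalid sequences).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TAB_STOP 4 /* assumed tab width for this sketch */

/* Expands tabs in `in` to spaces, writing to `out` (capacity `cap`).
 * Returns the number of bytes written, or -1 if `out` is too small. */
static long detab_sketch(const unsigned char *in, size_t len,
                         unsigned char *out, size_t cap) {
  size_t i = 0, o = 0, col = 0;

  assert(in != NULL && out != NULL);
  while (i < len) {
    size_t start = i;

    /* Stay in the inner loop for every byte that is not a tab. */
    while (i < len && in[i] != '\t') {
      if ((in[i] & 0xC0) != 0x80)
        col++; /* count characters, not continuation bytes */
      i++;
    }

    /* One bulk copy for the whole run. */
    if (i - start > cap - o)
      return -1;
    memcpy(out + o, in + start, i - start);
    o += i - start;

    if (i < len) { /* in[i] is a tab: pad to the next tab stop */
      size_t spaces = TAB_STOP - (col % TAB_STOP);
      if (spaces > cap - o)
        return -1;
      memset(out + o, ' ', spaces);
      o += spaces;
      col += spaces;
      i++;
    }
  }
  return (long)o;
}
```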
|
|
|
|
|
|
Safer handling of string buffer sizes and indices
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Guard against too large chunks passed via the API.
|
|
There are probably a couple of places I missed. But this will only
be a problem if we use a 64-bit bufsize_t at some point. Then, we'll
get warnings from -Wshorten-64-to-32.
|
|
|
|
|
|
This function was missing a couple of range checks that I'm too lazy
to fix.
|
|
Avoid potential overflow and allow for different bufsize types.
|
|
|
|
Replace macro ENSURE_SIZE with inline function S_strbuf_grow_by that
checks for overflow.
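
The kind of check such a function needs can be sketched as below. This is not the real `S_strbuf_grow_by`: `checked_grow_target` is a hypothetical helper, and the 32-bit signed `bufsize_t` with its `BUFSIZE_MAX` limit is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t bufsize_t;    /* assumed width, for illustration only */
#define BUFSIZE_MAX INT32_MAX /* assumed limit matching the typedef */

/* Stores size + add in *target and returns 1 if the sum fits in
 * bufsize_t; returns 0 without touching *target otherwise. */
static int checked_grow_target(bufsize_t size, bufsize_t add,
                               bufsize_t *target) {
  assert(size >= 0 && target != NULL);
  /* Compare against the remaining headroom so the addition itself
   * can never overflow. */
  if (add < 0 || add > BUFSIZE_MAX - size)
    return 0;
  *target = size + add;
  return 1;
}
```

An inline function like this evaluates each argument exactly once and makes the failure path explicit, which is harder to guarantee with a macro.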
|
|
cmark_strbuf_grow will never truncate a buffer.
|
|
This simplifies overflow checks.
|
|
|
|
Always add 50% on top of target size. No need for a loop.
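
As a minimal sketch, the new capacity can be computed in one step from the target size. `grown_capacity` is a hypothetical name, `size_t` is used here instead of the library's own size type, and any rounding the real code may apply is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Computes target plus 50% headroom. Returns 1 and stores the result
 * in *out, or returns 0 if the result would not fit in size_t. */
static int grown_capacity(size_t target, size_t *out) {
  size_t extra = target / 2;

  assert(out != NULL);
  if (extra > SIZE_MAX - target)
    return 0;
  *out = target + extra;
  return 1;
}
```

Because the result depends only on the target, there is nothing to iterate over, unlike a scheme that repeatedly enlarges the current capacity until it is big enough.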
|
|
This makes it easier to change the type later.
No functional change. The rest of the code base still has to be
adjusted to use the new type.
Also add some TODO comments in buffer.c.
|
|
|
|
Note that hoedown doesn't show the 32/64 bit difference that
sundown does -- so it was probably a bug in sundown. Removed
the comments from benchmarks.md about this.
|
|
Abort on strbuf errors
|
|
|
|
Users of the strbuf API are supposed to check for an OOM condition
after appending to strbufs, but:
* This is never done in the whole code base.
* The implementation was flawed because only `ptr` was set to the
OOM value without adjusting `size` and `asize`. After an error,
subsequent calls could very well lead to segfaults, contrary to the
documentation.
Change the code to always abort on errors with a message printed to
stderr. The only alternative is to propagate errors throughout the
whole library which seems infeasible.
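
The abort-on-error policy can be sketched with a small allocation wrapper. `xrealloc_or_abort` is a hypothetical name and the message text is made up; the real change lives inside the strbuf code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Grows or allocates a block; never returns NULL. On failure it prints
 * a message to stderr and aborts, so callers need no OOM checks. */
static void *xrealloc_or_abort(void *ptr, size_t size) {
  void *p;

  assert(size > 0); /* zero-sized requests are out of scope here */
  p = realloc(ptr, size);
  if (p == NULL) {
    fprintf(stderr, "out of memory while growing buffer, aborting\n");
    abort();
  }
  return p;
}
```

The trade-off is that a library user cannot recover from an allocation failure, but no code path ever continues with a buffer in an inconsistent state.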
|
|
fix ENSURE_SIZE to actually check left arg length.
|
|
|
|
|