|
Closes #334.
|
|
This is kivikakk's commit 62166fe3b6b07068ed4c4207113e3c4b060ad4a8
in cmark-gfm.
|
|
This commit ports Vicent Marti's fix in cmark-gfm.
(384cc9db4cd7a90f59c0751e58eb7b3023d38b85)
His commit message follows:
As explained on the previous commit, it is trivial to DoS the CMark
parser by generating a document where all the link reference names hash
to the same bucket in the hash table.
This will cause the lookup process for each reference to take linear
time on the amount of references in the document, and with enough link
references to lookup, the end result is a pathological O(N^2) that
causes medium-sized documents to finish parsing in 5+ minutes.
To avoid this issue, we propose the present commit.
Based on the fact that all reference lookup/resolution in a Markdown
document is always performed as a last step during the parse process,
we've reimplemented reference storage as follows:
1. New references are always inserted at the end of a linked list. This
is an O(1) operation, and does not check whether an existing (duplicate)
reference with the same label already exists in the document.
2. Upon the first call to `cmark_reference_lookup` (when it is expected
that no further references will be added to the reference map), the
linked list of references is written into a fixed-size array.
3. The fixed size array can then be efficiently sorted in-place in O(n
log n). This operation only happens once. We perform this sort in a
_stable_ manner to ensure that the earliest link reference in the
document always has preference, as the spec dictates. To accomplish
this, every reference is tagged with a generation number when initially
inserted in the linked list.
4. The sorted array is then compacted in O(n). Since it was sorted in a
stable way, the first reference for each label is preserved and the
duplicates are removed, matching the spec.
5. We can now simply perform a binary search for the current
`cmark_reference_lookup` query in O(log n). Any further lookup calls
will also be O(log n), since the sorted references table only needs to
be generated once.
The resulting implementation is notably simple (as it uses standard
library builtins `qsort` and `bsearch`), whilst performing better than
the fixed size hash table in documents that have a high number of
references and never becoming pathological regardless of the input.
|
|
This is taken from GitHub's fix:
https://github.com/github/cmark-gfm/commit/66a0836dc91e1653f7931e1218446664493da520
|
|
|
|
Closes #332.
|
|
See #332
|
|
API change: This adds a new exported function in cmark.h.
Closes #330.
|
|
In a recent commit, the check was changed to strcmp, but we really
have to use strncmp.
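
The affected check is not shown here, so the following is only a sketch of
the general difference, assuming the check is a prefix match against a
buffer that may not be NUL-terminated. `starts_with` is a hypothetical
helper, not a cmark function.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: do the first len bytes of data begin with prefix?
   data need not be NUL-terminated, so the comparison is bounded. */
int starts_with(const char *data, size_t len, const char *prefix) {
  size_t plen = strlen(prefix);
  /* strncmp inspects at most plen bytes; strcmp would keep reading until
     it hits a NUL and would also reject any input longer than prefix. */
  return len >= plen && strncmp(data, prefix, plen) == 0;
}
```

Under that assumption, `strcmp` only reports equality when the whole string
matches, which is why it cannot replace a bounded prefix comparison.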
|
|
Introduced by a recent commit. Found by OSS-Fuzz.
|
|
This uses variable substitution to ensure that the embedded path is
correct. Without this, the path in effect at configuration time is used. In
the case of the Swift project, this ended up searching in the *source*
directory rather than the *build* directory. This ensures that we
export the file to an absolute location and use the same location in
the `cmarkConfig.cmake` file by means of CMake's `configure_file`
substitution.
|
|
Adjust the include of the CMake file to use a cmarkConfig.cmake relative
location which enables use without considerations for the path.
|
|
Introduce multi-purpose data/len members in struct cmark_node. This
is mainly used to store literal text for inlines, code and HTML blocks.
Move the content strbuf for blocks from cmark_node to cmark_parser.
When finalizing nodes that allow inlines (paragraphs and headings),
detach the strbuf and store the block content in the node's data/len
members. Free the block content after processing inlines.
Reduces size of struct cmark_node by 8 bytes.
|
|
Allows reducing the size of struct cmark_node later.
|
|
Fix another place where an "allocated" cmark_chunk was used.
|
|
Use zero-terminated C strings and a separate length field instead of
cmark_chunks. Literal inline text will now be copied from the parent
block's content buffer, slowing the benchmark down by 10-15%.
The node struct never references memory of other nodes now, fixing #309.
Node accessors don't have to check for delayed creation of C strings,
so parsing and iterating all literals using the public API should
actually be faster than before.
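
A rough sketch of the storage scheme described above, assuming a
hypothetical `node_literal` type and `node_literal_set` helper rather than
cmark's real struct and internals: the literal is copied out of the parent
block's buffer into an owned, zero-terminated string with a separate length.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the data/len members of a node. */
typedef struct {
  unsigned char *data; /* owned, NUL-terminated copy of the literal */
  size_t len;          /* length in bytes, excluding the terminator */
} node_literal;

/* Copy len bytes from src (e.g. a slice of the parent block's content)
   into freshly allocated storage. Returns 0 on allocation failure. */
int node_literal_set(node_literal *lit, const unsigned char *src, size_t len) {
  unsigned char *copy = malloc(len + 1);
  if (!copy) return 0;
  if (len > 0) memcpy(copy, src, len);
  copy[len] = '\0';
  free(lit->data); /* release any previous literal */
  lit->data = copy;
  lit->len = len;
  return 1;
}
```

Because the node owns its copy, later changes to (or freeing of) the parent
buffer do not affect it, and an accessor can return `data` directly without
creating a C string on demand.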
|
|
Reduces size of struct cmark_node by 8 bytes.
|
|
Use zero-terminated C strings instead of cmark_chunks without storing
the length. This introduces a few additional strlen computations,
but overhead should be low.
Allows reducing the size of struct cmark_node later.
|
|
Use zero-terminated C strings instead of cmark_chunks without storing
the length. The length of code literals will be re-added in a later
commit. strlen overhead for code info should be negligible.
Reduces size of struct cmark_node by 8 bytes.
|
|
|
|
When using multiprocessing on Windows, the main program must be
guarded with a __name__ check.
|
|
These checks don't seem to be required and broke pathological_tests.py
on Windows where multiprocessing sets __name__ to "__mp_main__".
|
|
|
|
The flag is only required for old MSVC versions.
|
|
|
|
When CMARK_OPT_SMART is enabled, we escape literal `-`,
`.`, and quote characters when needed to avoid their
being "smartified."
See e.g. jgm/pandoc#6041 for an application.
|
|
This is an internal change, as this isn't part of the
public API.
|
|
This reverts a change by @compnerd in commit
b6ffaca93e2b539ec407aeb4fd588c7f9441e7a9.
We don't want this for api_tests, as it triggers this warning:
```
CMake Warning (dev) at api_test/CMakeLists.txt:1 (add_executable):
Policy CMP0063 is not set: Honor visibility properties for all target
types. Run "cmake --help-policy CMP0063" for policy details. Use the
cmake_policy command to set the policy and suppress this warning.
Target "api_test" of type "EXECUTABLE" has the following visibility
properties set for C:
C_VISIBILITY_PRESET
For compatibility CMake is not honoring them for this target.
This warning is for project developers. Use -Wno-dev to suppress it.
```
|
|
|
|
Recommended by build log at
https://oss-fuzz-build-logs.storage.googleapis.com/log-6a7500a1-8617-42c6-b8e4-78cab009b5b5.txt
|
|
The string literal being assigned is const, but the assignment loses
the constness of this string. This enables building with `/Zc:strictStrings`
with MSVC as well.
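
The changed line is not shown here; as a generic illustration of the
pattern, with a hypothetical function name, a string literal is kept behind
a pointer to const:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical example. Writing `char *label = "(none)";` would drop the
   constness of the literal, which strict string checks reject when the
   source is compiled as C++. */
const char *default_label(void) {
  const char *label = "(none)"; /* pointer to const matches the literal */
  return label;
}
```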
|
|
This enables the use of the export targets from the build tree to allow
easy use of the CMark library in other projects.
Resolves: #307
|
|
This configures the target to setup the include paths publicly for the
library targets in the build interface. This enables use of the
targets in the build tree without having to specify the include
directories. This is particularly useful for use in the export targets,
but also simplifies the rules for the API tests. The install interface
does not need the include directories as `cmark.h` is installed into
`include` which is a default include path.
|
|
Remove the unnecessary execute permission on CMakeLists.txt.
|
|
This reduces the work that CMake needs to do to configure the libraries
by setting all the properties at once.
|
|
This uses the CMake mechanism for including the current source and
binary directories. This avoids the custom handling for this.
|
|
man pages are extremely useful, but are not generally available on
Windows. This changes the install condition to check for the Windows
cross-compile rather than the toolchain in use. It is possible to build
for Windows using clang in the GNU driver.
|
|
Avoid including the utility more than once, which should avoid some
unnecessary CMake checks and reduce duplication.
|
|
Replace `add_compile_definitions` with `add_compile_options` since the
former was introduced in 3.12.
|
|
* build: inline a variable
* build: use `LINKER_LANGUAGE` property for C++ runtime
Rather than explicitly name the C++ runtime, use the `LINKER_LANGUAGE`
property to use the driver to spell the C++ runtime appropriately.
* build: use CMake to control C standard
Rather than use compiler specific flags to control the language
standard, indicate to CMake the desired standard.
* build: use the correct variable
These flags are being applied to the *C* compiler, check the C compiler,
not the C++ compiler.
* build: loosen the compiler check
This loosens the compiler identifier check to enable matching AppleClang
which is the identifier for the Xcode compiler.
* build: hoist shared flags to top-level CMakeLists
This hoists the common shared flags handling to the top-level CMakeLists
from sub-layers. This prevents the duplication of the handling.
* build: remove duplicated flags
This is unnecessary, `/TP` is forced on all MSVC builds, no need to
duplicate the flag for older versions.
* build: loosen C compiler identifier check
Loosen the check to a match rather than equality check, this allows it
to match AppleClang which is the identifier for the Apple vended clang
compiler part of Xcode.
* build: use `add_compile_options`
Use `add_compile_options` rather than modify `CMAKE_C_FLAGS`. The
latter is meant to be only modified by the user, not the package
developer.
* build: hoist sanitizer flags to global state
This moves the CMAKE_C_FLAGS handling to the top-level and uses
`add_compile_options` rather than modifying the user controlled flags.
* build: hoist `-fvisibility` flags to top-level
These are global settings, hoist them to the top level.
* build: hoist the debug flag handling
Use a generator expression and hoist the flag handling for the debug
build.
* build: hoist the profile flag handling
This is a global flag, hoist it to the top level and use
`add_compile_options` rather than modify the user controlled flags.
* build: remove incorrect variable handling
This seemed to be attempting to set the linker not the linker flags for
the profile configuration. This variable is not used, do not set it.
* build: remove unused CMake includes
|
|
This solves problems with adjacent code blocks being
merged.
|
|
by inserting an HTML comment. Closes #317.
I think I'll follow up with a change to use fenced code
blocks, but this was the minimal fix.
|
|
Closes #316.
|
|
Closes #313.
|
|
This modifies unescaping in houdini_html_u.c rather than
the entity handling in inlines.c. Unlike the other approach,
this one also works in e.g. link titles.
|
|
|
|
|
|
See commonmark/commonmark.js#177.
|
|
|
|
|