path: root/src
Age / Commit message / Author
2020-03-03Skip UTF-8 BOM if present at beginning of buffer.John MacFarlane
Closes #334.
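A minimal sketch of the idea, not cmark's actual code: the UTF-8 byte order mark is the three bytes EF BB BF, and a parser can step past it when it appears at the very start of the input. The helper name `skip_utf8_bom` is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of cmark's API: returns the number of
 * leading bytes to skip (3 if the buffer starts with a UTF-8 BOM, else 0). */
size_t skip_utf8_bom(const unsigned char *buf, size_t len) {
  static const unsigned char bom[3] = {0xEF, 0xBB, 0xBF};
  if (len >= 3 && memcmp(buf, bom, 3) == 0) {
    return 3;
  }
  return 0;
}
```

A caller would advance its read position by the returned count before handing the buffer to the parser.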
2020-02-16Add casts for MSVC10.John MacFarlane
This is kivikakk's commit 62166fe3b6b07068ed4c4207113e3c4b060ad4a8 in cmark-gfm.
2020-02-16Fix #220 (hash collisions for references).John MacFarlane
This commit ports Vicent Marti's fix in cmark-gfm. (384cc9db4cd7a90f59c0751e58eb7b3023d38b85) His commit message follows:

As explained on the previous commit, it is trivial to DoS the CMark parser by generating a document where all the link reference names hash to the same bucket in the hash table. This will cause the lookup process for each reference to take linear time on the amount of references in the document, and with enough link references to lookup, the end result is a pathological O(N^2) that causes medium-sized documents to finish parsing in 5+ minutes.

To avoid this issue, we propose the present commit. Based on the fact that all reference lookup/resolution in a Markdown document is always performed as a last step during the parse process, we've reimplemented reference storage as follows:

1. New references are always inserted at the end of a linked list. This is an O(1) operation, and does not check whether an existing (duplicate) reference with the same label already exists in the document.
2. Upon the first call to `cmark_reference_lookup` (when it is expected that no further references will be added to the reference map), the linked list of references is written into a fixed-size array.
3. The fixed size array can then be efficiently sorted in-place in O(n log n). This operation only happens once. We perform this sort in a _stable_ manner to ensure that the earliest link reference in the document always has preference, as the spec dictates. To accomplish this, every reference is tagged with a generation number when initially inserted in the linked list.
4. The sorted array is then compacted in O(n). Since it was sorted in a stable way, the first reference for each label is preserved and the duplicates are removed, matching the spec.
5. We can now simply perform a binary search for the current `cmark_reference_lookup` query in O(log n). Any further lookup calls will also be O(log n), since the sorted references table only needs to be generated once.

The resulting implementation is notably simple (as it uses standard library builtins `qsort` and `bsearch`), whilst performing better than the fixed size hash table in documents that have a high number of references and never becoming pathological regardless of the input.
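The sort, compact, and search steps described in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the code from cmark or cmark-gfm: the `ref_entry` type and all function names are hypothetical, the linked-list stage is omitted, and label normalization (case folding, whitespace collapsing) is left out. Because `qsort` is not guaranteed to be stable, the comparator breaks ties on the insertion counter, which plays the role of the generation number.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference record; `age` is the insertion order. */
typedef struct {
  const char *label;
  const char *url;
  unsigned age;
} ref_entry;

/* Order by label, then by age, so the earliest definition of each
 * label ends up first even though qsort itself is not stable. */
static int ref_cmp(const void *a, const void *b) {
  const ref_entry *ra = a, *rb = b;
  int c = strcmp(ra->label, rb->label);
  if (c != 0) return c;
  return (ra->age > rb->age) - (ra->age < rb->age);
}

/* Comparator for bsearch: the key is a plain label string. */
static int ref_search(const void *key, const void *elem) {
  return strcmp((const char *)key, ((const ref_entry *)elem)->label);
}

/* Sort once, then drop later duplicates in a single O(n) pass.
 * Returns the number of entries that remain. */
size_t ref_sort_and_compact(ref_entry *refs, size_t n) {
  size_t i, last = 0;
  if (n == 0) return 0;
  qsort(refs, n, sizeof(ref_entry), ref_cmp);
  for (i = 1; i < n; i++) {
    if (strcmp(refs[i].label, refs[last].label) != 0)
      refs[++last] = refs[i];
  }
  return last + 1;
}

/* O(log n) lookup in the sorted, compacted array; NULL if absent. */
const ref_entry *ref_lookup(const ref_entry *refs, size_t n,
                            const char *label) {
  return bsearch(label, refs, n, sizeof(ref_entry), ref_search);
}
```

With two definitions of `foo`, the one with the lower `age` survives compaction, so a later lookup returns the first definition in document order.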
2020-02-09Add cmark_get_default_mem_allocator().John MacFarlane
API change: This adds a new exported function in cmark.h. Closes #330.
2020-01-25Fix URL check in is_autolinkNick Wellnhofer
In a recent commit, the check was changed to strcmp, but we really have to use strncmp.
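The commit message does not show the comparison itself, so the following is only a general illustration of the difference: `strcmp` succeeds only when the two strings are equal in full, whereas `strncmp` compares at most `n` bytes and is therefore the right tool for a prefix or bounded comparison. The helper `has_prefix` is hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if `s` begins with `prefix`.
 * strcmp(s, prefix) would only be 0 if the strings matched entirely. */
bool has_prefix(const char *s, const char *prefix) {
  return strncmp(s, prefix, strlen(prefix)) == 0;
}
```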
2020-01-25Fix null pointer deref in is_autolinkNick Wellnhofer
Introduced by a recent commit. Found by OSS-Fuzz.
2020-01-24build: substitute the path into the generate filesSaleem Abdulrasool
This resorts to variable substitution to ensure that the embedded path is correct. Without this, the path at the time of configuration is used. In the case of the Swift project, this ended up searching in the *source* directory rather than the *build* directory. This will ensure that we export the file to an absolute location and we use the same location in the `cmarkConfig.cmake` file by means of CMake's `configure_file` substitution.
2020-01-23build: use absolute path for cmarkTargets.cmakeSaleem Abdulrasool
Adjust the include of the CMake file to use a location relative to cmarkConfig.cmake, which enables use without regard to the path.
2020-01-23Rearrange struct cmark_nodeNick Wellnhofer
Introduce multi-purpose data/len members in struct cmark_node. This is mainly used to store literal text for inlines, code and HTML blocks. Move the content strbuf for blocks from cmark_node to cmark_parser. When finalizing nodes that allow inlines (paragraphs and headings), detach the strbuf and store the block content in the node's data/len members. Free the block content after processing inlines. Reduces size of struct cmark_node by 8 bytes.
2020-01-23Improve packing of struct cmark_listNick Wellnhofer
Allows reducing the size of struct cmark_node later.
2020-01-23Use C string instead of chunk in rendererNick Wellnhofer
Fix another place where an "allocated" cmark_chunk was used.
2020-01-23Use C string instead of chunk for literal textNick Wellnhofer
Use zero-terminated C strings and a separate length field instead of cmark_chunks. Literal inline text will now be copied from the parent block's content buffer, slowing the benchmark down by 10-15%. The node struct never references memory of other nodes now, fixing #309. Node accessors don't have to check for delayed creation of C strings, so parsing and iterating all literals using the public API should actually be faster than before.
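A rough sketch of the ownership model this message describes, with hypothetical names (`literal_text`, `literal_set`, `literal_free`) rather than cmark's real struct members: the text is copied out of the parent buffer into a zero-terminated string the node owns, and the length is kept alongside it so accessors need no `strlen`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node fragment: owned, zero-terminated text plus its length. */
typedef struct {
  char *data;
  size_t len;
} literal_text;

/* Copy `len` bytes out of a larger buffer so the node owns its text.
 * Returns 0 on success, -1 if allocation fails. */
int literal_set(literal_text *lit, const char *src, size_t len) {
  char *copy = malloc(len + 1);
  if (copy == NULL) return -1;
  memcpy(copy, src, len);
  copy[len] = '\0';
  lit->data = copy;
  lit->len = len;
  return 0;
}

/* Release the owned copy; the parent buffer is unaffected. */
void literal_free(literal_text *lit) {
  free(lit->data);
  lit->data = NULL;
  lit->len = 0;
}
```

The copy is the cost the message mentions; the benefit is that freeing or changing the parent block's buffer can no longer invalidate the node's text.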
2020-01-23Use C string instead of chunk for custom block contentsNick Wellnhofer
Reduces size of struct cmark_node by 8 bytes.
2020-01-23Use C string instead of chunk for link URL and titleNick Wellnhofer
Use zero-terminated C strings instead of cmark_chunks without storing the length. This introduces a few additional strlen computations, but overhead should be low. Allows reducing the size of struct cmark_node later.
2020-01-23Use C string instead of chunk for code info and literalNick Wellnhofer
Use zero-terminated C strings instead of cmark_chunks without storing the length. The length of code literals will be re-added in a later commit. strlen overhead for code info should be negligible. Reduces size of struct cmark_node by 8 bytes.
2020-01-23Helper function to set C strings in nodesNick Wellnhofer
2020-01-15Remove unused variableNick Wellnhofer
2020-01-11Fix CMake generator expression checking for MSVCNick Wellnhofer
2020-01-10commonmark renderer: better escaping in smart mode.John MacFarlane
When CMARK_OPT_SMART is enabled, we escape literal `-`, `.`, and quote characters when needed to avoid their being "smartified." See e.g. jgm/pandoc#6041 for an application.
2020-01-10Add options field to cmark_renderer.John MacFarlane
This is an internal change, as this isn't part of the public API.
2020-01-05Move C_VISIBILITY_PRESET back to src/CMakeLists.txt.John MacFarlane
This reverts a change by @compnerd in commit b6ffaca93e2b539ec407aeb4fd588c7f9441e7a9. We don't want this for api_tests, as it triggers this warning:

```
CMake Warning (dev) at api_test/CMakeLists.txt:1 (add_executable):
  Policy CMP0063 is not set: Honor visibility properties for all target
  types.  Run "cmake --help-policy CMP0063" for policy details.  Use the
  cmake_policy command to set the policy and suppress this warning.

  Target "api_test" of type "EXECUTABLE" has the following visibility
  properties set for C:

    C_VISIBILITY_PRESET

  For compatibility CMake is not honoring them for this target.
This warning is for project developers.  Use -Wno-dev to suppress it.
```
2020-01-05commonmark.c - use size_t instead of int.John MacFarlane
2020-01-03fix -Wconst-qual warningSaleem Abdulrasool
The string literal being assigned is const, but the assignment loses the constness of this string. This enables building with `/Zc:strictString` with MSVC as well.
2020-01-02build: add exports targets for build tree usageSaleem Abdulrasool
This enables the use of the export targets from the build tree to allow easy use of the CMark library in other projects. Resolves: #307
2020-01-02build: use target properties for include pathsSaleem Abdulrasool
This configures the target to set up the include paths publicly for the library targets in the build interface. This enables use of the targets in the build tree without having to specify the include directories. This is particularly useful for use in the export targets, but also simplifies the rules for the API tests. The install interface does not need the include directories as `cmark.h` is installed into `include`, which is a default include path.
2020-01-02build: reduce property computation in CMakeSaleem Abdulrasool
This reduces the work that CMake needs to do to configure the libraries by setting all the properties at once.
2020-01-02build: use `CMAKE_INCLUDE_CURRENT_DIRECTORY`Saleem Abdulrasool
This uses the CMake mechanism for including the current source and binary directories. This avoids the custom handling for this.
2020-01-02build: only include GNUInstallDirs onceSaleem Abdulrasool
Avoid including the utility more than once, which should avoid some unnecessary CMake checks and reduces duplication.
2019-12-22build: cleanup CMake (#319)Saleem Abdulrasool
* build: inline a variable
* build: use `LINKER_LANGUAGE` property for C++ runtime
  Rather than explicitly name the C++ runtime, use the `LINKER_LANGUAGE` property to use the driver to spell the C++ runtime appropriately.
* build: use CMake to control C standard
  Rather than use compiler specific flags to control the language standard, indicate to CMake the desired standard.
* build: use the correct variable
  These flags are being applied to the *C* compiler, check the C compiler, not the C++ compiler.
* build: loosen the compiler check
  This loosens the compiler identifier check to enable matching AppleClang which is the identifier for the Xcode compiler.
* build: hoist shared flags to top-level CMakeLists
  This hoists the common shared flags handling to the top-level CMakeLists from sub-layers. This prevents the duplication of the handling.
* build: remove duplicated flags
  This is unnecessary, `/TP` is forced on all MSVC builds, no need to duplicate the flag for older versions.
* build: loosen C compiler identifier check
  Loosen the check to a match rather than equality check, this allows it to match AppleClang which is the identifier for the Apple vended clang compiler part of Xcode.
* build: use `add_compile_options`
  Use `add_compile_options` rather than modify `CMAKE_C_FLAGS`. The latter is meant to be only modified by the user, not the package developer.
* build: hoist sanitizer flags to global state
  This moves the CMAKE_C_FLAGS handling to the top-level and uses `add_compile_options` rather than modifying the user controlled flags.
* build: hoist `-fvisibility` flags to top-level
  These are global settings, hoist them to the top level.
* build: hoist the debug flag handling
  Use a generator expression and hoist the flag handling for the debug build.
* build: hoist the profile flag handling
  This is a global flag, hoist it to the top level and use `add_compile_options` rather than modify the user controlled flags.
* build: remove incorrect variable handling
  This seemed to be attempting to set the linker not the linker flags for the profile configuration. This variable is not used, do not set it.
* build: remove unused CMake includes
2019-12-21Commonmark renderer: always use fences for code (#317).John MacFarlane
This solves problems with adjacent code blocks being merged.
2019-12-21Ensure that consecutive indented code blocks aren't merged...John MacFarlane
by inserting an HTML comment. Closes #317. I think I'll follow up with a change to use fenced code blocks, but this was the minimal fix.
2019-12-19Improve rendering of commonmark code spans with spaces.John MacFarlane
Closes #316.
2019-11-11Cleaner approach to max digits for numeric entities.John MacFarlane
This modifies unescaping in houdini_html_u.c rather than the entity handling in inlines.c. Unlike the other approach, this one also works in, e.g., link titles.
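For illustration, a length-limited scan of a numeric character reference might look like the sketch below. It is not the code in houdini_html_u.c; the function name is hypothetical, and the limits used (at most 7 decimal digits, at most 6 hexadecimal digits) are the ones given in the CommonMark spec.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical scanner, not cmark's code. `s` points just past "&#".
 * Returns the number of bytes consumed, including the trailing ';',
 * or 0 if the text is not a valid numeric character reference. */
size_t scan_numeric_entity(const char *s, size_t len) {
  size_t i = 0, digits = 0, max_digits;
  int hex = 0;
  if (len > 0 && (s[0] == 'x' || s[0] == 'X')) {
    hex = 1;
    i = 1;
  }
  max_digits = hex ? 6 : 7; /* limits from the CommonMark spec */
  while (i < len && (hex ? isxdigit((unsigned char)s[i])
                         : isdigit((unsigned char)s[i]))) {
    i++;
    digits++;
  }
  if (digits == 0 || digits > max_digits) return 0;
  if (i >= len || s[i] != ';') return 0;
  return i + 1;
}
```

Putting the limit in the shared unescaping routine means every caller gets the same behavior, which is the point the commit message makes about link titles.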
2019-11-11Fix entity parser (and api test) to respect length limit on numeric entities.John MacFarlane
2019-11-11Code reformatJohn MacFarlane
2019-11-11Don't allow link destinations with unbalanced unescaped parentheses.John MacFarlane
See commonmark/commonmark.js#177.
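A simplified sketch of such a check, assuming a hypothetical helper `parens_balanced` and treating a backslash as escaping whatever byte follows it (the spec only lets it escape ASCII punctuation). A real parser would stop the destination at an unmatched `)` rather than reject it; this version only answers whether a candidate string is balanced.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check: true if every unescaped '(' in `s` has a matching
 * unescaped ')' and no unescaped ')' appears without an opener. */
bool parens_balanced(const char *s, size_t len) {
  size_t i, depth = 0;
  for (i = 0; i < len; i++) {
    if (s[i] == '\\' && i + 1 < len) {
      i++; /* skip the escaped byte */
    } else if (s[i] == '(') {
      depth++;
    } else if (s[i] == ')') {
      if (depth == 0) return false;
      depth--;
    }
  }
  return depth == 0;
}
```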
2019-07-05print_usage(): Minor grammar fix, swap two words (#305)Øyvind A. Holm
2019-04-23Link executable with static or shared libraryNick Wellnhofer
If CMARK_STATIC is on (default), link the executable with the static library. This produces exactly the same result as compiling the library sources again and linking with the object files. If CMARK_STATIC is off, link the executable with the shared library. This wasn't supported before and should be the preferred way to package cmark on Linux distros. Building only a shared library and a statically linked executable isn't supported anymore but this doesn't seem useful.
2019-04-06Resolve link references before creating setext header.John MacFarlane
A setext header line after a link reference should not create a header, according to the spec. See commonmark/commonmark-spec#395.
2019-04-06commonmark renderer: improve escaping.John MacFarlane
URL-escape special characters when escape mode is URL, and not otherwise. Entity-escape control characters (< 0x20) in non-literal escape modes.
2019-04-06render: only emit actual newline when escape mode is LITERAL.John MacFarlane
In other contexts, e.g. for markdown content, we want some kind of escaping, not a literal newline.
2019-04-04Update code span normalization...John MacFarlane
to conform with spec change.
2019-04-03Allow empty `<>` link destination in reference link.John MacFarlane
2019-03-28Remove leftover includes of memory.h.John MacFarlane
Closes #290.
2019-03-26Merge pull request #269 from foonathan/masterJohn MacFarlane
Fix cmake warning about CMP0048, again
2019-03-26Fix #289.John MacFarlane
A link destination can't start with `<` unless it is an angle-bracket link that also ends with `>`. (If your URL really starts with `<`, URL-escape it.)
2019-03-23Update spec; allow internal delimiter runs to match if...John MacFarlane
both have lengths that are multiples of 3. See commonmark/commonmark#528.
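As a sketch of the rule as worded in the spec (if one of the two delimiters can both open and close emphasis, the sum of the run lengths must not be a multiple of 3 unless both lengths are multiples of 3), the test could be written as a small predicate. The function and parameter names are hypothetical and this is not the code in inlines.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical predicate for the "multiple of 3" rule: may an opener run
 * of length `opener_len` pair with a closer run of length `closer_len`?
 * `either_can_both` is true when one of the runs can both open and close. */
bool runs_may_match(size_t opener_len, size_t closer_len,
                    bool either_can_both) {
  if (!either_can_both) return true;
  if ((opener_len + closer_len) % 3 != 0) return true;
  /* Sum is a multiple of 3: allowed only if both lengths are as well. */
  return opener_len % 3 == 0 && closer_len % 3 == 0;
}
```

The last line is the relaxation this commit adds: two runs of length 3 sum to 6, but they may still match because each length is itself a multiple of 3.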
2019-03-22Include references.h in parser.hJohn MacFarlane
Closes #287.
2019-03-19Update spec. Fix `[link](<foo\>)`.John MacFarlane
2019-03-19Define CMARK_OPT_SAFE for API compatibility.John MacFarlane
It doesn't do anything; this is documented.