path: root/src/utf8.c
Age | Commit message | Author
2015-06-07 | Convert code base to strbuf_t | Nick Wellnhofer
There are probably a couple of places I missed. But this will only be a problem if we use a 64-bit bufsize_t at some point. Then, we'll get warnings from -Wshorten-64-to-32.
2015-04-16 | Pass-through Unicode non-characters | Nick Wellnhofer
Despite their name, Unicode non-characters are valid code points. They should be passed through by a library like libcmark.
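A minimal sketch of what "pass through" could mean in practice, assuming a validity check that rejects only out-of-range values and surrogates. The helper names `is_noncharacter` and `is_acceptable_code_point` are hypothetical and are not taken from utf8.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true for the 66 Unicode non-characters,
   i.e. U+FDD0..U+FDEF plus the last two code points of each of
   the 17 planes (U+xxFFFE and U+xxFFFF). */
static bool is_noncharacter(uint32_t cp)
{
    if (cp > 0x10FFFF)
        return false;
    return (cp >= 0xFDD0 && cp <= 0xFDEF) || (cp & 0xFFFE) == 0xFFFE;
}

/* Hypothetical acceptance check under a pass-through policy:
   a code point is accepted when it is in range and not a
   surrogate. Non-characters are deliberately NOT excluded. */
static bool is_acceptable_code_point(uint32_t cp)
{
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}
```

Under these assumptions U+FFFE is a non-character and is still accepted, while a surrogate such as U+D800 is rejected.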
2015-01-05 | Reformatted code consistently with astyle. | John MacFarlane
2014-12-29 | Added cmark_ prefix to functions in cmark_ctype. | John MacFarlane
2014-12-29 | Added cmark_ctype.h with locale-independent isspace, ispunct, etc. | John MacFarlane
Otherwise cmark's behavior varies unpredictably with the locale. `is_punctuation` in utf8.h has also been adjusted so that all ASCII symbol characters count as punctuation, even though some are not in P* character classes.
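A minimal sketch of locale-independent classification, assuming the check is done on the ASCII value alone. The names `ascii_isspace` and `ascii_ispunct` are hypothetical and this is not the code in cmark_ctype.c; it only illustrates why the result no longer depends on `setlocale()`.

```c
#include <stdbool.h>

/* Hypothetical helpers, not cmark's actual implementation.
   Both look only at the byte value, so the current C locale
   cannot change the answer. */

/* ASCII whitespace: space, \t, \n, \v, \f, \r. */
static bool ascii_isspace(unsigned char c)
{
    return c == ' ' || (c >= 0x09 && c <= 0x0D);
}

/* ASCII punctuation and symbols: every printable ASCII character
   that is not a letter, a digit, or the space. This includes
   symbols such as '$' and '+', which are not in Unicode's P*
   classes. */
static bool ascii_ispunct(unsigned char c)
{
    return (c >= 0x21 && c <= 0x2F) || (c >= 0x3A && c <= 0x40) ||
           (c >= 0x5B && c <= 0x60) || (c >= 0x7B && c <= 0x7E);
}
```

Bytes at or above 0x80 are never classified as space or punctuation here, whereas the standard `isspace` and `ispunct` may classify them differently depending on the active locale.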
2014-12-15 | Re-added cmark_ prefix to strbuf and chunk. | John MacFarlane
Reverts 225d720.
2014-11-24 | Validate UTF-8 input | Nick Wellnhofer
Invalid UTF-8 byte sequences are replaced with the Unicode replacement character U+FFFD. Fixes #213.
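The replacement behavior can be sketched as follows. This is an illustrative simplification, not the code from this commit: the function names `utf8_valid_len` and `utf8_sanitize` are hypothetical, and the sketch assumes one U+FFFD is emitted per invalid byte, which may differ from how the library groups invalid bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the length (1-4) of the valid UTF-8
   sequence starting at s, or 0 if the bytes there are invalid or
   truncated. Rejects overlong forms, surrogates, and values above
   U+10FFFF. */
static size_t utf8_valid_len(const uint8_t *s, size_t n)
{
    size_t len, i;
    uint32_t cp, min;

    if (n == 0)
        return 0;
    if (s[0] < 0x80)
        return 1;
    if ((s[0] & 0xE0) == 0xC0) {
        len = 2; cp = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {
        len = 3; cp = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {
        len = 4; cp = s[0] & 0x07; min = 0x10000;
    } else {
        return 0; /* stray continuation byte or 0xF8..0xFF */
    }
    if (n < len)
        return 0; /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0; /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0; /* overlong, out of range, or surrogate */
    return len;
}

/* Hypothetical helper: copies src to dst, writing U+FFFD
   (EF BF BD) in place of each invalid byte. dst must have room
   for 3 * n bytes. Returns the number of bytes written. */
static size_t utf8_sanitize(const uint8_t *src, size_t n, uint8_t *dst)
{
    size_t i = 0, o = 0, k, len;

    assert(src != NULL && dst != NULL);
    while (i < n) {
        len = utf8_valid_len(src + i, n - i);
        if (len == 0) {
            dst[o++] = 0xEF;
            dst[o++] = 0xBF;
            dst[o++] = 0xBD;
            i++; /* skip one bad byte and resynchronize */
        } else {
            for (k = 0; k < len; k++)
                dst[o++] = src[i++];
        }
    }
    return o;
}
```

Sizing the output buffer at three times the input length covers the worst case, where every input byte is invalid and expands to the three-byte encoding of U+FFFD.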
2014-11-24 | Off-by-one error in utf8proc_detab | Nick Wellnhofer
2014-11-20 | Added utf8proc_is_space. | John MacFarlane
2014-11-20 | Added utf8proc_is_punctuation. | John MacFarlane
We'll probably need this when the spec for emph/strong gets revised.
2014-11-16 | Remove unneeded #includes | Nick Wellnhofer
Fixes cross-platform issues.
2014-10-18 | Reindented c sources. | John MacFarlane
2014-09-10 | Improve invalid UTF8 codepoint skipping | Vicent Marti
2014-09-10 | Fix infinite loop when case folding invalid UTF8 chars | Vicent Marti
2014-09-10 | Cleanup reference implementation | Vicent Marti
2014-09-09 | UTF8-aware detabbing and entity handling | Vicent Marti
2014-09-09 | Rename to strbuf | Vicent Marti
2014-09-09 | It buiiiilds | Vicent Marti
2014-09-09 | ffffix | Vicent Marti
2014-09-09 | lol | Vicent Marti
2014-08-13 | Initial commit | John MacFarlane