-rw-r--r--  Makefile          2
-rw-r--r--  README.md        30
-rwxr-xr-x  js/markdown       2
-rwxr-xr-x  js/stmd.js       19
-rwxr-xr-x  js/test.js       19
-rw-r--r--  man/man1/stmd.1   2
-rw-r--r--  man/stmd.1.md     4
-rw-r--r--  narrative.md      4
-rw-r--r--  runtests.pl       3
-rw-r--r--  spec.txt        174
-rw-r--r--  src/html.c        6
-rw-r--r--  src/main.c        2
12 files changed, 136 insertions, 131 deletions
diff --git a/Makefile b/Makefile
index c1decfc..55b6645 100644
--- a/Makefile
+++ b/Makefile
@@ -66,7 +66,7 @@ update-site: spec.html narrative.html
cp spec.html _site/
cp narrative.html _site/index.html
cp -r js/* _site/js/
- (cd _site ; git commit -a -m "Updated site for latest spec, narrative, js" ; git push; cd ..)
+ (cd _site ; git pull ; git commit -a -m "Updated site for latest spec, narrative, js" ; git push; cd ..)
clean:
-rm test $(SRCDIR)/*.o $(SRCDIR)/scanners.c
diff --git a/README.md b/README.md
index 889cc4e..78fc837 100644
--- a/README.md
+++ b/README.md
@@ -1,14 +1,14 @@
-Standard markdown
-=================
+CommonMark
+==========
-Standard markdown is a [specification of markdown syntax][the spec],
+CommonMark is a [specification of Markdown syntax][the spec],
together with BSD3-licensed implementations (`stmd`) in C and javascript.
The implementations
-------------------
The C implementation provides both a library and a standalone program
-`stmd` that converts markdown to HTML. It is written in standard C99
+`stmd` that converts Markdown to HTML. It is written in standard C99
and has no library dependencies. (However, if you check it out from the
repository, you'll need [`re2c`](http://re2c.org) to generate
`scanners.c` from `scanners.re`. This is only a build dependency for
@@ -30,7 +30,7 @@ this.)
[The spec] contains over 400 embedded examples which serve as conformance
tests. To run the tests for `stmd`, do `make test`. To run them for
-another markdown program, say `myprog`, do `make test PROG=myprog`. To
+another Markdown program, say `myprog`, do `make test PROG=myprog`. To
run the tests for `stmd.js`, do `make testjs`.
[The spec]: http://jgm.github.io/stmd/spec.html
@@ -38,11 +38,11 @@ run the tests for `stmd.js`, do `make testjs`.
The spec
--------
-The source of [the spec] is `spec.txt`. This is basically a markdown
+The source of [the spec] is `spec.txt`. This is basically a Markdown
file, with code examples written in a shorthand form:
.
- markdown source
+ Markdown source
.
expected HTML output
.
@@ -55,7 +55,7 @@ The spec is written from the point of view of the human writer, not
the computer reader. It is not an algorithm---an English translation of
a computer program---but a declarative description of what counts as a block
quote, a code block, and each of the other structural elements that can
-make up a markdown document.
+make up a Markdown document.
Because John Gruber's [canonical syntax
description](http://daringfireball.net/projects/markdown/syntax) leaves
@@ -64,13 +64,13 @@ making a large number of decisions, many of them somewhat arbitrary.
In making them, I have appealed to existing conventions and
considerations of simplicity, readability, expressive power, and
consistency. I have tried to ensure that "normal" documents in the many
-incompatible existing implementations of markdown will render, as far as
+incompatible existing implementations of Markdown will render, as far as
possible, as their authors intended. And I have tried to make the rules
for different elements work together harmoniously. In places where
different decisions could have been made (for example, the rules
governing list indentation), I have explained the rationale for
my choices. In a few cases, I have departed slightly from the canonical
-syntax description, in ways that I think further the goals of markdown
+syntax description, in ways that I think further the goals of Markdown
as stated in that description.
For the most part, I have limited myself to the basic elements
@@ -80,17 +80,17 @@ right before considering such things. However, I have included a visible
syntax for line breaks and fenced code blocks.
In all of this, I have been guided by eight years experience writing
-markdown implementations in several languages, including the first
-markdown parser not based on regular expression substitutions
+Markdown implementations in several languages, including the first
+Markdown parser not based on regular expression substitutions
([pandoc](http://github.com/jgm/pandoc)) and the first markdown parsers
based on PEG grammars
([peg-markdown](http://github.com/jgm/peg-markdown),
[lunamark](http://github.com/jgm/lunamark)). Maintaining these projects
and responding to years of user feedback have given me a good sense of
-the complexities involved in parsing markdown, and of the various design
+the complexities involved in parsing Markdown, and of the various design
decisions that can be made. I have also explored differences between
-markdown implementations extensively using [babelmark
+Markdown implementations extensively using [babelmark
2](http://johnmacfarlane.net/babelmark2/). In the early phases of
working out the spec, I benefited greatly from collaboration with David
-Greenspan, and from feedback from several industrial users of markdown,
+Greenspan, and from feedback from several industrial users of Markdown,
including Jeff Atwood, Vincent Marti, and Neil Williams.
diff --git a/js/markdown b/js/markdown
index 05a372a..2b23d54 100755
--- a/js/markdown
+++ b/js/markdown
@@ -11,5 +11,5 @@ fs.readFile(file, 'utf8', function(err, data) {
}
var parser = new stmd.DocParser();
var renderer = new stmd.HtmlRenderer();
- console.log(renderer.render(parser.parse(data)));
+ process.stdout.write(renderer.render(parser.parse(data)));
});
diff --git a/js/stmd.js b/js/stmd.js
index 16baa59..6895008 100755
--- a/js/stmd.js
+++ b/js/stmd.js
@@ -1,4 +1,4 @@
-// stmd.js - "standard markdown" in javascript
+// stmd.js - CommomMark in javascript
// Copyright (C) 2014 John MacFarlane
// License: BSD3.
@@ -373,7 +373,7 @@ var parseEmphasis = function(inlines) {
return (this.pos - startpos);
default:
- return result;
+ return res;
}
return 0;
@@ -382,7 +382,7 @@ var parseEmphasis = function(inlines) {
// Attempt to parse link title (sans quotes), returning the string
// or null if no match.
var parseLinkTitle = function() {
- title = this.match(reLinkTitle);
+ var title = this.match(reLinkTitle);
if (title) {
// chop off quotes from title and unescape:
return unescape(title.substr(1, title.length - 2));
@@ -861,7 +861,7 @@ var parseListMarker = function(ln, offset) {
} else {
return null;
}
- blank_item = match[0].length === rest.length;
+ var blank_item = match[0].length === rest.length;
if (spaces_after_marker >= 5 ||
spaces_after_marker < 1 ||
blank_item) {
@@ -926,7 +926,7 @@ var incorporateLine = function(ln, line_number) {
switch (container.t) {
case 'BlockQuote':
- matched = indent <= 3 && ln[first_nonspace] === '>';
+ var matched = indent <= 3 && ln[first_nonspace] === '>';
if (matched) {
offset = first_nonspace + 1;
if (ln[offset] === ' ') {
@@ -1234,7 +1234,7 @@ var finalize = function(block, line_number) {
if (line_number > block.start_line) {
block.end_line = line_number - 1;
} else {
- block_end_line = line_number;
+ block.end_line = line_number;
}
switch (block.t) {
@@ -1478,9 +1478,10 @@ var renderBlock = function(block, in_tight_list) {
case 'FencedCode':
info_words = block.info.split(/ +/);
attr = info_words.length === 0 || info_words[0].length === 0 ?
- [] : [['class',this.escape(info_words[0],true)]];
- return inTags('pre', attr,
- inTags('code', [], this.escape(block.string_content)));
+ [] : [['class','language-' +
+ this.escape(info_words[0],true)]];
+ return inTags('pre', [],
+ inTags('code', attr, this.escape(block.string_content)));
case 'HtmlBlock':
return block.string_content;
case 'ReferenceDef':
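
Note on the `FencedCode` hunk above: the `class` attribute moves from the `pre` element to the `code` element and gains a `language-` prefix. The following is a minimal standalone sketch of that output shape, not the stmd.js code itself. `escapeHtml` and `renderFencedCode` are hypothetical names introduced here for illustration, and the escaping rules are an assumption.

```javascript
// Hypothetical helper (not part of stmd.js): escape the characters that
// are unsafe in HTML text and attribute values.
function escapeHtml(s) {
  return s.replace(/&/g, '&amp;')
          .replace(/</g, '&lt;')
          .replace(/>/g, '&gt;')
          .replace(/"/g, '&quot;');
}

// Hypothetical renderer: takes the info string and the code block content.
function renderFencedCode(info, content) {
  // Strip surrounding spaces and keep only the first word of the info string.
  var firstWord = info.trim().split(/ +/)[0];
  // No class attribute at all when the info string is empty.
  var classAttr = firstWord.length === 0 ?
      '' : ' class="language-' + escapeHtml(firstWord) + '"';
  // The attribute goes on <code>, and <pre> stays bare.
  return '<pre><code' + classAttr + '>' + escapeHtml(content) + '</code></pre>';
}

console.log(renderFencedCode('ruby', 'def foo(x)\n  return 3\nend\n'));
```

With an empty info string the sketch emits a bare `<pre><code>` pair, which matches the `[]` attribute branch in the hunk.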
diff --git a/js/test.js b/js/test.js
index b16b2f1..19c0c92 100755
--- a/js/test.js
+++ b/js/test.js
@@ -1,9 +1,8 @@
#!/usr/bin/env node
var fs = require('fs');
-var util = require('util');
var stmd = require('./stmd');
-var ansi = require('./ansi/ansi')
+var ansi = require('./ansi/ansi');
var cursor = ansi(process.stdout);
var writer = new stmd.HtmlRenderer();
@@ -15,19 +14,23 @@ var failed = 0;
var showSpaces = function(s) {
var t = s;
return t.replace(/\t/g,'→')
- .replace(/ /g,'␣');
-}
+ .replace(/ /g,'␣');
+};
fs.readFile('spec.txt', 'utf8', function(err, data) {
if (err) {
return console.log(err);
}
+ var i;
var examples = [];
var current_section = "";
var example_number = 0;
- tests = data.replace(/^<!-- END TESTS -->(.|[\n])*/m,'');
+ var tests = data
+ .replace(/\r\n?/g, "\n") // Normalize newlines for platform independence
+ .replace(/^<!-- END TESTS -->(.|[\n])*/m, '');
+
tests.replace(/^\.\n([\s\S]*?)^\.\n([\s\S]*?)^\.$|^#{1,6} *(.*)$/gm,
- function(_,x,y,z,w){
+ function(_,x,y,z){
if (z) {
current_section = z;
} else {
@@ -45,7 +48,7 @@ fs.readFile('spec.txt', 'utf8', function(err, data) {
for (i = 0; i < examples.length; i++) {
var example = examples[i];
- if (example.section != current_section) {
+ if (example.section !== current_section) {
if (current_section !== '') {
cursor.write('\n');
}
@@ -53,7 +56,7 @@ fs.readFile('spec.txt', 'utf8', function(err, data) {
cursor.reset().write(current_section).reset().write(' ');
}
var actual = writer.renderBlock(reader.parse(example.markdown.replace(/→/g, '\t')));
- if (actual == example.html) {
+ if (actual === example.html) {
passed++;
cursor.green().write('✓').reset();
} else {
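
Note on the `js/test.js` changes above: the test runner now normalizes line endings before it extracts examples, so a `spec.txt` checked out with CRLF endings should still match the `^\.\n` patterns. A minimal sketch of that step in isolation follows. `normalizeNewlines` is a hypothetical name used here for illustration; the regular expression is the one added in the diff.

```javascript
// Hypothetical helper wrapping the replace call added in js/test.js.
// CRLF is matched first as a pair, and a lone CR is matched by the
// optional \n, so both become a single LF.
function normalizeNewlines(text) {
  return text.replace(/\r\n?/g, '\n');
}

// Windows-style, old Mac-style, and Unix-style endings in one string.
var mixed = 'line one\r\nline two\rline three\n';
console.log(JSON.stringify(normalizeNewlines(mixed)));
```

Because `\r\n?` consumes the pair as one match, a CRLF never turns into two newlines.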
diff --git a/man/man1/stmd.1 b/man/man1/stmd.1
index 913d5a7..6bfdd80 100644
--- a/man/man1/stmd.1
+++ b/man/man1/stmd.1
@@ -10,7 +10,7 @@ stmd [\f[I]options\f[]] [file*]
\f[C]stmd\f[] acts as a pipe, reading from stdin or from the specified
files and writing to stdout.
It converts markdown formatted plain text to HTML, using the conventions
-described in the standard markdown spec.
+described in the CommonMark spec.
.SH OPTIONS
.TP
.B \f[C]\-\-ast\f[]
diff --git a/man/stmd.1.md b/man/stmd.1.md
index 6e38afc..3947a79 100644
--- a/man/stmd.1.md
+++ b/man/stmd.1.md
@@ -17,8 +17,8 @@ stmd [*options*] [file\*]
`stmd` acts as a pipe, reading from stdin or from the specified
files and writing to stdout. It converts markdown formatted plain
-text to HTML, using the conventions described in the standard
-markdown spec.
+text to HTML, using the conventions described in the CommonMark
+spec.
# OPTIONS
diff --git a/narrative.md b/narrative.md
index 12bf780..315c47b 100644
--- a/narrative.md
+++ b/narrative.md
@@ -1,8 +1,8 @@
---
-title: Standard markdown
+title: CommonMark
...
-Standard markdown is a [specification of markdown
+CommonMark is a [specification of markdown
syntax](http://jgm.github.io/stmd/spec.html), together with
BSD3-licensed implementations (`stmd`) in C and javascript. The source
for the spec and the two implementations can be found in [this
diff --git a/runtests.pl b/runtests.pl
index 2d80f14..2e2b795 100644
--- a/runtests.pl
+++ b/runtests.pl
@@ -16,9 +16,6 @@ if (!(@PROG && defined $SPEC)) {
exit 1;
}
-# Disable ANSI colors if we're not hooked up to a terminal
-$ENV{ANSI_COLORS_DISABLED} ||= !-t *STDOUT;
-
my $passed = 0;
my $failed = 0;
my $skipped = 0;
diff --git a/spec.txt b/spec.txt
index 525cb74..5fc1dac 100644
--- a/spec.txt
+++ b/spec.txt
@@ -1,9 +1,9 @@
---
-title: Standard Markdown Spec
+title: CommonMark Spec
author:
- John MacFarlane
version: 1
-date: 2014-07-21
+date: 2014-09-06
...
# Introduction
@@ -191,7 +191,7 @@ In the examples, the `→` character is used to represent tabs.
# Preprocessing
-A [line](#line) <a id="line"/>
+A [line](#line) <a id="line"></a>
is a sequence of zero or more characters followed by a line
ending (CR, LF, or CRLF) or by the end of
file.
@@ -203,29 +203,33 @@ to a certain encoding.
Tabs in lines are expanded to spaces, with a tab stop of 4 characters:
.
-foo→baz→→bim
+→foo→baz→→bim
.
-<p>foo baz bim</p>
+<pre><code>foo baz bim
+</code></pre>
.
.
-οὐ→χρῆν
+ a→a
+ ὐ→a
.
-<p>οὐ χρῆν</p>
+<pre><code>a a
+ὐ a
+</code></pre>
.
Line endings are replaced by newline characters (LF).
A line containing only spaces (after tab expansion) followed by
-a line ending is called a [blank line](#blank-line). <a
-id="blank-line"/>
+a line ending is called a [blank line](#blank-line).
+<a id="blank-line"></a>
# Blocks and inlines
We can think of a document as a sequence of [blocks](#block)<a
-id="block"/>---structural elements like paragraphs, block quotations,
+id="block"></a>---structural elements like paragraphs, block quotations,
lists, headers, rules, and code blocks. Blocks can contain other
-blocks, or they can contain [inline](#inline)<a id="inline"/> content:
+blocks, or they can contain [inline](#inline)<a id="inline"></a> content:
words, spaces, links, emphasized text, images, and inline code.
## Precedence
@@ -256,9 +260,9 @@ one block element does not affect the inline parsing of any other.
## Container blocks and leaf blocks
We can divide blocks into two types:
-[container blocks](#container-block), <a id="container-block"/>
+[container blocks](#container-block), <a id="container-block"></a>
which can contain other blocks, and [leaf blocks](#leaf-block),
-<a id="leaf-block"/> which cannot.
+<a id="leaf-block"></a> which cannot.
# Leaf blocks
@@ -269,8 +273,8 @@ Markdown document.
A line consisting of 0-3 spaces of indentation, followed by a sequence
of three or more matching `-`, `_`, or `*` characters, each followed
-optionally any number of spaces, forms a [horizontal
-rule](#horizontal-rule). <a id="horizontal-rule"/>
+optionally by any number of spaces, forms a [horizontal
+rule](#horizontal-rule). <a id="horizontal-rule"></a>
.
***
@@ -465,7 +469,7 @@ If you want a horizontal rule in a list item, use a different bullet:
## ATX headers
-An [ATX header](#atx-header) <a id="atx-header"/>
+An [ATX header](#atx-header) <a id="atx-header"></a>
consists of a string of characters, parsed as inline content, between an
opening sequence of 1--6 unescaped `#` characters and an optional
closing sequence of any number of `#` characters. The opening sequence
@@ -655,11 +659,11 @@ ATX headers can be empty:
## Setext headers
-A [setext header](#setext-header) <a id="setext-header"/>
+A [setext header](#setext-header) <a id="setext-header"></a>
consists of a line of text, containing at least one nonspace character,
with no more than 3 spaces indentation, followed by a [setext header
underline](#setext-header-underline). A [setext header
-underline](#setext-header-underline) <a id="setext-header-underline"/>
+underline](#setext-header-underline) <a id="setext-header-underline"></a>
is a sequence of `=` characters or a sequence of `-` characters, with no
more than 3 spaces indentation and any number of trailing
spaces. The header is a level 1 header if `=` characters are used, and
@@ -863,9 +867,9 @@ Setext headers cannot be empty:
## Indented code blocks
An [indented code block](#indented-code-block)
-<a id="indented-code-block"/> is composed of one or more
+<a id="indented-code-block"></a> is composed of one or more
[indented chunks](#indented-chunk) separated by blank lines.
-An [indented chunk](#indented-chunk) <a id="indented-chunk"/>
+An [indented chunk](#indented-chunk) <a id="indented-chunk"></a>
is a sequence of non-blank lines, each indented four or more
spaces. An indented code block cannot interrupt a paragraph, so
if it occurs before or after a paragraph, there must be an
@@ -1015,16 +1019,16 @@ Trailing spaces are included in the code block's content:
## Fenced code blocks
-A [code fence](#code-fence) <a id="code-fence"/> is a sequence
+A [code fence](#code-fence) <a id="code-fence"></a> is a sequence
of at least three consecutive backtick characters (`` ` ``) or
tildes (`~`). (Tildes and backticks cannot be mixed.)
-A [fenced code block](#fenced-code-block) <a id="fenced-code-block"/>
+A [fenced code block](#fenced-code-block) <a id="fenced-code-block"></a>
begins with a code fence, indented no more than three spaces.
The line with the opening code fence may optionally contain some text
following the code fence; this is trimmed of leading and trailing
-spaces and called the [info string](#info-string). <a
-id="info-string"/> The info string may not contain any backtick
+spaces and called the [info string](#info-string).
+<a id="info-string"></a> The info string may not contain any backtick
characters. (The reason for this restriction is that otherwise
some inline code would be incorrectly interpreted as the
beginning of a fenced code block.)
@@ -1282,9 +1286,9 @@ bar
.
An [info string](#info-string) can be provided after the opening code fence.
-Opening and closing spaces will be stripped, and the first word
-is used here to populate the `class` attribute of the enclosing
-`pre` tag.
+Opening and closing spaces will be stripped, and the first word, prefixed
+with `language-`, is used as the value for the `class` attribute of the
+`code` element within the enclosing `pre` element.
.
```ruby
@@ -1293,7 +1297,7 @@ def foo(x)
end
```
.
-<pre class="ruby"><code>def foo(x)
+<pre><code class="language-ruby">def foo(x)
return 3
end
</code></pre>
@@ -1306,7 +1310,7 @@ def foo(x)
end
~~~~~~~
.
-<pre class="ruby"><code>def foo(x)
+<pre><code class="language-ruby">def foo(x)
return 3
end
</code></pre>
@@ -1316,7 +1320,7 @@ end
````;
````
.
-<pre class=";"><code></code></pre>
+<pre><code class="language-;"></code></pre>
.
Info strings for backtick code blocks cannot contain backticks:
@@ -1343,7 +1347,7 @@ Closing code fences cannot have info strings:
## HTML blocks
-An [HTML block tag](#html-block-tag) <a id="html-block-tag"/> is
+An [HTML block tag](#html-block-tag) <a id="html-block-tag"></a> is
an [open tag](#open-tag) or [closing tag](#closing-tag) whose tag
name is one of the following (case-insensitive):
`article`, `header`, `aside`, `hgroup`, `blockquote`, `hr`, `body`,
@@ -1354,7 +1358,7 @@ name is one of the following (case-insensitive):
`footer`, `tr`, `form`, `ul`, `h1`, `h2`, `h3`, `h4`, `h5`, `h6`,
`video`, `script`, `style`.
-An [HTML block](#html-block) <a id="html-block"/> begins with an
+An [HTML block](#html-block) <a id="html-block"></a> begins with an
[HTML block tag](#html-block-tag), [HTML comment](#html-comment),
[processing instruction](#processing-instruction),
[declaration](#declaration), or [CDATA section](#cdata-section).
@@ -1629,7 +1633,7 @@ So there is no important loss of expressive power with the new rule.
## Link reference definitions
A [link reference definition](#link-reference-definition)
-<a id="link-reference-definition"/> consists of a [link
+<a id="link-reference-definition"></a> consists of a [link
label](#link-label), indented up to three spaces, followed
by a colon (`:`), optional blank space (including up to one
newline), a [link destination](#link-destination), optional
@@ -1854,7 +1858,7 @@ are defined:
## Paragraphs
A sequence of non-blank lines that cannot be interpreted as other
-kinds of blocks forms a [paragraph](#paragraph) <a id="paragraph"/>.
+kinds of blocks forms a [paragraph](#paragraph).<a id="paragraph"></a>
The contents of the paragraph are the result of parsing the
paragraph's raw content as inlines. The paragraph's raw content
is formed by concatenating the lines and removing initial and final
@@ -1998,12 +2002,12 @@ provided below in the section entitled [A parsing strategy].)
## Block quotes
-A [block quote marker](#block-quote-marker) <a id="block-quote-marker"/>
+A [block quote marker](#block-quote-marker) <a id="block-quote-marker"></a>
consists of 0-3 spaces of initial indent, plus (a) the character `>` together
with a following space, or (b) a single character `>` not followed by a space.
The following rules define [block quotes](#block-quote):
-<a id="block-quote"/>
+<a id="block-quote"></a>
1. **Basic case.** If a string of lines *Ls* constitute a sequence
of blocks *Bs*, then the result of appending a [block quote marker]
@@ -2016,7 +2020,7 @@ The following rules define [block quotes](#block-quote):
more lines in which the next non-space character after the [block
quote marker](#block-quote-marker) is [paragraph continuation
text](#paragraph-continuation-text) is a block quote with *Bs* as
- its content. <a id="paragraph-continuation-text"/>
+ its content. <a id="paragraph-continuation-text"></a>
[Paragraph continuation text](#paragraph-continuation-text) is text
that will be parsed as part of the content of a paragraph, but does
not occur at the beginning of the paragraph.
@@ -2360,14 +2364,14 @@ the `>`:
## List items
-A [list marker](#list-marker) <a id="list-marker"/> is a
+A [list marker](#list-marker) <a id="list-marker"></a> is a
[bullet list marker](#bullet-list-marker) or an [ordered list
marker](#ordered-list-marker).
-A [bullet list marker](#bullet-list-marker) <a id="bullet-list-marker"/>
+A [bullet list marker](#bullet-list-marker) <a id="bullet-list-marker"></a>
is a `-`, `+`, or `*` character.
-An [ordered list marker](#ordered-list-marker) <a id="ordered-list-marker"/>
+An [ordered list marker](#ordered-list-marker) <a id="ordered-list-marker"></a>
is a sequence of one of more digits (`0-9`), followed by either a
`.` character or a `)` character.
@@ -2889,7 +2893,7 @@ continued here.</p>
5. **That's all.** Nothing that is not counted as a list item by rules
- #1--4 counts as a [list item](#block-quote).
+ #1--4 counts as a [list item](#list-item).
The rules for sublists follow from the general rules above. A sublist
must be indented the same number of spaces a paragraph would need to be
@@ -3183,25 +3187,25 @@ takes four spaces (a common case), but diverge in other cases.
## Lists
-A [list](#list) <a id="list"/> is a sequence of one or more
+A [list](#list) <a id="list"></a> is a sequence of one or more
list items [of the same type](#of-the-same-type). The list items
may be separated by single [blank lines](#blank-line), but two
blank lines end all containing lists.
Two list items are [of the same type](#of-the-same-type)
-<a id="of-the-same-type"/> if they begin with a [list
+<a id="of-the-same-type"></a> if they begin with a [list
marker](#list-marker) of the same type. Two list markers are of the
same type if (a) they are bullet list markers using the same character
(`-`, `+`, or `*`) or (b) they are ordered list numbers with the same
delimiter (either `.` or `)`).
-A list is an [ordered list](#ordered-list) <a id="ordered-list"/>
+A list is an [ordered list](#ordered-list) <a id="ordered-list"></a>
if its constituent list items begin with
[ordered list markers](#ordered-list-marker), and a [bullet
-list](#bullet-list) <a id="bullet-list"/> if its constituent list
+list](#bullet-list) <a id="bullet-list"></a> if its constituent list
items begin with [bullet list markers](#bullet-list-marker).
-The [start number](#start-number) <a id="start-number"/>
+The [start number](#start-number) <a id="start-number"></a>
of an [ordered list](#ordered-list) is determined by the list number of
its initial list item. The numbers of subsequent list items are
disregarded.
@@ -3716,7 +3720,7 @@ blocks](#fenced-code-block):
foo
```
.
-<pre class="foo+bar"><code>foo
+<pre><code class="language-foo+bar">foo
</code></pre>
.
@@ -3726,7 +3730,7 @@ foo
Entities are parsed as entities, not as literal text, in all contexts
except code spans and code blocks. Three kinds of entities are recognized.
-[Named entities](#name-entities) <a id="named-entities"/> consist of `&`
+[Named entities](#name-entities) <a id="named-entities"></a> consist of `&`
+ a string of 2-32 alphanumerics beginning with a letter + `;`.
.
@@ -3735,7 +3739,7 @@ except code spans and code blocks. Three kinds of entities are recognized.
<p>&nbsp; &amp; &copy; &AElig; &Dcaron; &frac34; &HilbertSpace; &DifferentialD; &ClockwiseContourIntegral;</p>
.
-[Decimal entities](#decimal-entities) <a id="decimal-entities"/>
+[Decimal entities](#decimal-entities) <a id="decimal-entities"></a>
consist of `&#` + a string of 1--8 arabic digits + `;`.
.
@@ -3744,7 +3748,7 @@ consist of `&#` + a string of 1--8 arabic digits + `;`.
<p>&#1; &#35; &#1234; &#992; &#98765432;</p>
.
-[Hexadecimal entities](#hexadecimal-entities) <a id="hexadecimal-entities"/>
+[Hexadecimal entities](#hexadecimal-entities) <a id="hexadecimal-entities"></a>
consist of `&#` + either `X` or `x` + a string of 1-8 hexadecimal digits
+ `;`.
@@ -3809,7 +3813,7 @@ code blocks, including raw HTML, URLs, [link titles](#link-title), and
foo
```
.
-<pre class="f&ouml;&ouml;"><code>foo
+<pre><code class="language-f&ouml;&ouml;">foo
</code></pre>
.
@@ -3830,7 +3834,7 @@ Entities are treated as literal text in code spans and code blocks:
## Code span
-A [backtick string](#backtick-string) <a id="backtick-string"/>
+A [backtick string](#backtick-string) <a id="backtick-string"></a>
is a string of one or more backtick characters (`` ` ``) that is neither
preceded nor followed by a backtick.
@@ -4015,7 +4019,7 @@ The following rules capture all of these patterns, while allowing
for efficient parsing strategies that do not backtrack:
1. A single `*` character [can open emphasis](#can-open-emphasis)
- <a id="can-open-emphasis"/> iff
+ <a id="can-open-emphasis"></a> iff
(a) it is not part of a sequence of four or more unescaped `*`s,
(b) it is not followed by whitespace, and
@@ -4031,7 +4035,7 @@ for efficient parsing strategies that do not backtrack:
followed immediately by strong emphasis.
3. A single `*` character [can close emphasis](#can-close-emphasis)
- <a id="can-close-emphasis"/> iff
+ <a id="can-close-emphasis"></a> iff
(a) it is not part of a sequence of four or more unescaped `*`s, and
(b) it is not preceded by whitespace.
@@ -4043,7 +4047,7 @@ for efficient parsing strategies that do not backtrack:
(c) it is not followed by an ASCII alphanumeric character.
5. A double `**` [can open strong emphasis](#can-open-strong-emphasis)
- <a id="can-open-strong-emphasis" /> iff
+ <a id="can-open-strong-emphasis" ></a> iff
(a) it is not part of a sequence of four or more unescaped `*`s,
(b) it is not followed by whitespace, and
@@ -4060,7 +4064,7 @@ for efficient parsing strategies that do not backtrack:
followed immediately by emphasis.
7. A double `**` [can close strong emphasis](#can-close-strong-emphasis)
- <a id="can-close-strong-emphasis" /> iff
+ <a id="can-close-strong-emphasis" ></a> iff
(a) it is not part of a sequence of four or more unescaped `*`s, and
(b) it is not preceded by whitespace.
@@ -4642,7 +4646,7 @@ and title are given immediately after the label. In [reference
links](#reference-links) the destination and title are defined elsewhere
in the document.
-A [link label](#link-label) <a id="link-label"/> consists of
+A [link label](#link-label) <a id="link-label"></a> consists of
- an opening `[`, followed by
- zero or more backtick code spans, autolinks, HTML tags, link labels,
@@ -4657,7 +4661,7 @@ These rules are motivated by the following intuitive ideas:
but less tightly than `<>` or `` ` ``.
- Link labels may contain material in matching square brackets.
-A [link destination](#link-destination) <a id="link-destination"/>
+A [link destination](#link-destination) <a id="link-destination"></a>
consists of either
- a sequence of zero or more characters between an opening `<` and a
@@ -4670,7 +4674,7 @@ consists of either
a balanced pair of unescaped parentheses that is not itself
inside a balanced pair of unescaped paretheses.
-A [link title](#link-title) <a id="link-title"/> consists of either
+A [link title](#link-title) <a id="link-title"></a> consists of either
- a sequence of zero or more characters between straight double-quote
characters (`"`), including a `"` character only if it is
@@ -4683,7 +4687,7 @@ A [link title](#link-title) <a id="link-title"/> consists of either
- a sequence of zero or more characters between matching parentheses
(`(...)`), including a `)` character only if it is backslash-escaped.
-An [inline link](#inline-link) <a id="inline-link"/>
+An [inline link](#inline-link) <a id="inline-link"></a>
consists of a [link label](#link-label) followed immediately
by a left parenthesis `(`, optional whitespace,
an optional [link destination](#link-destination),
@@ -4887,15 +4891,15 @@ an HTML tag:
There are three kinds of [reference links](#reference-link):
-<a id="reference-link"/>
+<a id="reference-link"></a>
-A [full reference link](#full-reference-link) <a id="full-reference-link"/>
+A [full reference link](#full-reference-link) <a id="full-reference-link"></a>
consists of a [link label](#link-label), optional whitespace, and
another [link label](#link-label) that [matches](#matches) a
[link reference definition](#link-reference-definition) elsewhere in the
document.
-One label [matches](#matches) <a id="matches"/>
+One label [matches](#matches) <a id="matches"></a>
another just in case their normalized forms are equal. To normalize a
label, perform the *unicode case fold* and collapse consecutive internal
whitespace to a single space. If there are multiple matching reference
@@ -5003,7 +5007,7 @@ labels define equivalent inline content:
.
A [collapsed reference link](#collapsed-reference-link)
-<a id="collapsed-reference-link"/> consists of a [link
+<a id="collapsed-reference-link"></a> consists of a [link
label](#link-label) that [matches](#matches) a [link reference
definition](#link-reference-definition) elsewhere in the
document, optional whitespace, and the string `[]`. The contents of the
@@ -5051,7 +5055,7 @@ between the two sets of brackets:
.
A [shortcut reference link](#shortcut-reference-link)
-<a id="shortcut-reference-link"/> consists of a [link
+<a id="shortcut-reference-link"></a> consists of a [link
label](#link-label) that [matches](#matches) a [link reference
definition](#link-reference-definition) elsewhere in the
document and is not followed by `[]` or a link label.
@@ -5386,18 +5390,18 @@ Autolinks are absolute URIs and email addresses inside `<` and `>`.
They are parsed as links, with the URL or email address as the link
label.
-A [URI autolink](#uri-autolink) <a id="uri-autolink"/>
+A [URI autolink](#uri-autolink) <a id="uri-autolink"></a>
consists of `<`, followed by an [absolute
URI](#absolute-uri) not containing `<`, followed by `>`. It is parsed
as a link to the URI, with the URI as the link's label.
-An [absolute URI](#absolute-uri), <a id="absolute-uri"/>
+An [absolute URI](#absolute-uri), <a id="absolute-uri"></a>
for these purposes, consists of a [scheme](#scheme) followed by a colon (`:`)
followed by zero or more characters other than ASCII whitespace and
control characters, `<`, and `>`. If the URI includes these characters,
you must use percent-encoding (e.g. `%20` for a space).
-The following [schemes](#scheme) <a id="scheme"/>
+The following [schemes](#scheme) <a id="scheme"></a>
are recognized (case-insensitive):
`coap`, `doi`, `javascript`, `aaa`, `aaas`, `about`, `acap`, `cap`,
`cid`, `crid`, `data`, `dav`, `dict`, `dns`, `file`, `ftp`, `geo`, `go`,
@@ -5459,12 +5463,12 @@ Spaces are not allowed in autolinks:
<p>&lt;http://foo.bar/baz bim&gt;</p>
.
-An [email autolink](#email-autolink) <a id="email-autolink"/>
+An [email autolink](#email-autolink) <a id="email-autolink"></a>
consists of `<`, followed by an [email address](#email-address),
followed by `>`. The link's label is the email address,
and the URL is `mailto:` followed by the email address.
-An [email address](#email-address), <a id="email-address"/>
+An [email address](#email-address), <a id="email-address"></a>
for these purposes, is anything that matches
the [non-normative regex from the HTML5
spec](http://www.whatwg.org/specs/web-apps/current-work/multipage/forms.html#e-mail-state-%28type=email%29):
@@ -5539,67 +5543,67 @@ so custom tags (and even, say, DocBook tags) may be used.
Here is the grammar for tags:
-A [tag name](#tag-name) <a id="tag-name"/> consists of an ASCII letter
+A [tag name](#tag-name) <a id="tag-name"></a> consists of an ASCII letter
followed by zero or more ASCII letters or digits.
-An [attribute](#attribute) <a id="attribute"/> consists of whitespace,
+An [attribute](#attribute) <a id="attribute"></a> consists of whitespace,
an **attribute name**, and an optional **attribute value
specification**.
-An [attribute name](#attribute-name) <a id="attribute-name"/>
+An [attribute name](#attribute-name) <a id="attribute-name"></a>
consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII
letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML
specification restricted to ASCII. HTML5 is laxer.)
An [attribute value specification](#attribute-value-specification)
-<a id="attribute-value-specification"/> consists of optional whitespace,
+<a id="attribute-value-specification"></a> consists of optional whitespace,
a `=` character, optional whitespace, and an [attribute
value](#attribute-value).
-An [attribute value](#attribute-value) <a id="attribute-value"/>
+An [attribute value](#attribute-value) <a id="attribute-value"></a>
consists of an [unquoted attribute value](#unquoted-attribute-value),
a [single-quoted attribute value](#single-quoted-attribute-value),
or a [double-quoted attribute value](#double-quoted-attribute-value).
An [unquoted attribute value](#unquoted-attribute-value)
-<a id="unquoted-attribute-value"/> is a nonempty string of characters not
+<a id="unquoted-attribute-value"></a> is a nonempty string of characters not
including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``.
A [single-quoted attribute value](#single-quoted-attribute-value)
-<a id="single-quoted-attribute-value"/> consists of `'`, zero or more
+<a id="single-quoted-attribute-value"></a> consists of `'`, zero or more
characters not including `'`, and a final `'`.
A [double-quoted attribute value](#double-quoted-attribute-value)
-<a id="double-quoted-attribute-value"/> consists of `"`, zero or more
+<a id="double-quoted-attribute-value"></a> consists of `"`, zero or more
characters not including `"`, and a final `"`.
-An [open tag](#open-tag) <a id="open-tag"/> consists of a `<` character,
+An [open tag](#open-tag) <a id="open-tag"></a> consists of a `<` character,
a [tag name](#tag-name), zero or more [attributes](#attribute),
optional whitespace, an optional `/` character, and a `>` character.
-A [closing tag](#closing-tag) <a id="closing-tag"/> consists of the
+A [closing tag](#closing-tag) <a id="closing-tag"></a> consists of the
string `</`, a [tag name](#tag-name), optional whitespace, and the
character `>`.
-An [HTML comment](#html-comment) <a id="html-comment"/> consists of the
+An [HTML comment](#html-comment) <a id="html-comment"></a> consists of the
string `<!--`, a string of characters not including the string `--`, and
the string `-->`.
A [processing instruction](#processing-instruction)
-<a id="processing-instruction"/> consists of the string `<?`, a string
+<a id="processing-instruction"></a> consists of the string `<?`, a string
of characters not including the string `?>`, and the string
`?>`.
-A [declaration](#declaration) <a id="declaration"/> consists of the
+A [declaration](#declaration) <a id="declaration"></a> consists of the
string `<!`, a name consisting of one or more uppercase ASCII letters,
whitespace, a string of characters not including the character `>`, and
the character `>`.
-A [CDATA section](#cdata-section) <a id="cdata-section"/> consists of
+A [CDATA section](#cdata-section) <a id="cdata-section"></a> consists of
the string `<![CDATA[`, a string of characters not including the string
`]]>`, and the string `]]>`.
-An [HTML tag](#html-tag) <a id="html-tag"/> consists of an [open
+An [HTML tag](#html-tag) <a id="html-tag"></a> consists of an [open
tag](#open-tag), a [closing tag](#closing-tag), an [HTML
comment](#html-comment), a [processing
instruction](#processing-instruction), an [element type
diff --git a/src/html.c b/src/html.c
index 56d5dbb..aeec5f1 100644
--- a/src/html.c
+++ b/src/html.c
@@ -156,15 +156,15 @@ extern int blocks_to_html(block* b, bstring* result, bool tight)
case fenced_code:
escaped = escape_html(b->string_content, false);
cr(html);
- bformata(html, "<pre");
+ bformata(html, "<pre><code");
if (blength(b->attributes.fenced_code_data.info) > 0) {
escaped2 = escape_html(b->attributes.fenced_code_data.info, true);
info_words = bsplit(escaped2, ' ');
- bformata(html, " class=\"%s\"", info_words->entry[0]->data);
+ bformata(html, " class=\"language-%s\"", info_words->entry[0]->data);
bdestroy(escaped2);
bstrListDestroy(info_words);
}
- bformata(html, "><code>%s</code></pre>", escaped->data);
+ bformata(html, ">%s</code></pre>", escaped->data);
cr(html);
bdestroy(escaped);
break;
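Note on the src/html.c hunk above: the change moves the class attribute from `<pre>` to `<code>` and prefixes the first word of the info string with `language-`. The following is a minimal standalone sketch of that output shape, not the stmd code itself. It assumes plain C strings instead of bstrlib, and `render_fenced_code` is a hypothetical helper name. Both inputs are assumed to be HTML-escaped already. Unlike the hunk, the sketch omits the attribute when the first word is empty, which is a simplification.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of stmd: writes the HTML for a fenced
 * code block into out. info and escaped_content are assumed to be
 * HTML-escaped by the caller. Returns the snprintf result, i.e. the
 * number of characters the full output needs (excluding the NUL). */
int render_fenced_code(char *out, size_t outsz,
                       const char *info, const char *escaped_content)
{
    /* Length of the first space-delimited word of the info string. */
    size_t first_word_len = info ? strcspn(info, " ") : 0;

    if (first_word_len > 0) {
        /* The class goes on <code>, prefixed with "language-". */
        return snprintf(out, outsz,
                        "<pre><code class=\"language-%.*s\">%s</code></pre>\n",
                        (int)first_word_len, info, escaped_content);
    }
    /* No info string: emit the bare <pre><code> wrapper. */
    return snprintf(out, outsz, "<pre><code>%s</code></pre>\n",
                    escaped_content);
}
```

Under these assumptions, an info string of `ruby startline=3` would yield `<pre><code class="language-ruby">`, with the remaining words of the info string ignored, as in the hunk.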
diff --git a/src/main.c b/src/main.c
index fa334b3..f0ecb82 100644
--- a/src/main.c
+++ b/src/main.c
@@ -22,7 +22,7 @@ int main(int argc, char *argv[]) {
for (i=1; i < argc; i++) {
if (strcmp(argv[i], "--version") == 0) {
printf("stmd %s", VERSION);
- printf(" - standard markdown converter (c) 2014 John MacFarlane\n");
+ printf(" - CommonMark converter (c) 2014 John MacFarlane\n");
exit(0);
} else if ((strcmp(argv[i], "--help") == 0) ||
(strcmp(argv[i], "-h") == 0)) {