-rw-r--r--  Makefile         4
-rw-r--r--  js/index.html    1
-rwxr-xr-x  js/stmd.js       2
-rw-r--r--  narrative.md     2
-rw-r--r--  runtests.pl     10
-rw-r--r--  spec.txt       108
6 files changed, 64 insertions, 63 deletions
diff --git a/Makefile b/Makefile
index ee3c204..c1decfc 100644
--- a/Makefile
+++ b/Makefile
@@ -32,7 +32,7 @@ oldtests:
make -C oldtests --quiet clean all
test: spec.txt
- perl runtests.pl $(PROG) $<
+ perl runtests.pl $< $(PROG)
testjs: spec.txt
node js/test.js
@@ -45,7 +45,7 @@ $(PROG): $(SRCDIR)/main.c $(SRCDIR)/inlines.o $(SRCDIR)/blocks.o $(SRCDIR)/detab
$(CC) $(LDFLAGS) -o $@ $^
$(SRCDIR)/scanners.c: $(SRCDIR)/scanners.re
- re2c --case-insensitive -bis $< > $@
+ re2c --case-insensitive -bis $< > $@ || (rm $@ && false)
$(SRCDIR)/case_fold_switch.c: $(DATADIR)/CaseFolding-3.2.0.txt
perl mkcasefold.pl < $< > $@
diff --git a/js/index.html b/js/index.html
index 7ba5a86..994b147 100644
--- a/js/index.html
+++ b/js/index.html
@@ -63,7 +63,6 @@ $(document).ready(function() {
div#preview { height: 400px; overflow: scroll; }
div.row { margin-top: 1em; }
blockquote { font-size: 100%; }
- h3 { margin-top: 0; margin-bottom: 0; padding: 0; font-size: 100%; }
footer { color: #555; text-align: center; margin: 1em; }
pre { display: block; padding: 0.5em; color: #333; background: #f8f8ff }
#warnings li { color: red; font-weight: bold; }
diff --git a/js/stmd.js b/js/stmd.js
index 4635b10..16baa59 100755
--- a/js/stmd.js
+++ b/js/stmd.js
@@ -1064,7 +1064,7 @@ var incorporateLine = function(ln, line_number) {
container.level = match[0].trim().length; // number of #s
// remove trailing ###s:
container.strings =
- [ln.slice(offset).replace(/(?:(\\#) *#+| *#+) *$/,'$1')];
+ [ln.slice(offset).replace(/(?:(\\#) *#*| *#+) *$/,'$1')];
break;
} else if ((match = ln.slice(first_nonspace).match(/^`{3,}(?!.*`)|^~{3,}(?!.*~)/))) {
diff --git a/narrative.md b/narrative.md
index 73daf5c..12bf780 100644
--- a/narrative.md
+++ b/narrative.md
@@ -47,7 +47,7 @@ description.
There are only a few places where this spec says things that contradict
the canonical syntax description:
-- It [allows all puncutation symbols to be
+- It [allows all punctuation symbols to be
backslash-escaped](http://jgm.github.io/stmd/spec.html#backslash-escapes),
not just the symbols with special meanings in markdown. I found
that it was just too hard to remember which symbols could be
diff --git a/runtests.pl b/runtests.pl
index 370b43c..1dfdcf6 100644
--- a/runtests.pl
+++ b/runtests.pl
@@ -8,13 +8,13 @@ use IO::Handle;
use IPC::Open2;
$|++;
-my $usage="runtests.pl PROGRAM SPEC\nSet ANSI_COLORS_DISABLED=1 if you redirect to a file.\nSet PATT='...' to restrict tests to sections matching a regex.\n";
+my $usage="runtests.pl SPEC PROGRAM\nSet ANSI_COLORS_DISABLED=1 if you redirect to a file.\nSet PATT='...' to restrict tests to sections matching a regex.\n";
-my $PROG=$ARGV[0];
-my $SPEC=$ARGV[1];
+my $SPEC = shift @ARGV;
+my @PROG = @ARGV;
my $PATT=$ENV{'PATT'};
-if (!(defined $PROG && defined $SPEC)) {
+if (!(@PROG && defined $SPEC)) {
print STDERR $usage;
exit 1;
}
@@ -72,7 +72,7 @@ sub dotest
# We use → to indicate tab and ␣ space in the spec
$markdown =~ s/→/\t/g;s/␣/ /g;
$html =~ s/→/\t/g;s/␣/ /g;
- open2(my $out, my $in, $PROG);
+ open2(my $out, my $in, @PROG);
print $in $markdown;
close $in;
flush $out;
diff --git a/spec.txt b/spec.txt
index 569ada8..cbe58a4 100644
--- a/spec.txt
+++ b/spec.txt
@@ -8,21 +8,21 @@ date: 2014-07-21
# Introduction
-## What is markdown?
+## What is Markdown?
Markdown is a plain text format for writing structured documents,
based on conventions used for indicating formatting in email and
usenet posts. It was developed in 2004 by John Gruber, who wrote
-the first markdown-to-HTML converter in perl, and it soon became
+the first Markdown-to-HTML converter in perl, and it soon became
widely used in websites. By 2014 there were dozens of
implementations in many languages. Some of them extended basic
-markdown syntax with conventions for footnotes, definition lists,
+Markdown syntax with conventions for footnotes, definition lists,
tables, and other constructs, and some allowed output not just in
HTML but in LaTeX and many other formats.
## Why is a spec needed?
-John Gruber's [canonical description of markdown's
+John Gruber's [canonical description of Markdown's
syntax](http://daringfireball.net/projects/markdown/syntax)
does not specify the syntax unambiguously. Here are some examples of
questions it does not answer:
@@ -95,7 +95,7 @@ questions it does not answer:
```
7. When list markers change from numbers to bullets, do we have
- two lists or one? (The markdown syntax description suggests two,
+ two lists or one? (The Markdown syntax description suggests two,
but the perl scripts and many other implementations produce one.)
``` markdown
@@ -162,20 +162,20 @@ Because there is no unambiguous spec, implementations have diverged
considerably. As a result, users are often surprised to find that
a document that renders one way on one system (say, a github wiki)
renders differently on another (say, converting to docbook using
-pandoc). To make matters worse, because nothing in markdown counts
+pandoc). To make matters worse, because nothing in Markdown counts
as a "syntax error," the divergence often isn't discovered right away.
## About this document
-This document attempts to specify markdown syntax unambiguously.
-It contains many examples with side-by-side markdown and
+This document attempts to specify Markdown syntax unambiguously.
+It contains many examples with side-by-side Markdown and
HTML. These are intended to double as conformance tests. An
accompanying script `runtests.pl` can be used to run the tests
-against any markdown program:
+against any Markdown program:
perl runtests.pl PROGRAM spec.html
-Since this document describes how markdown is to be parsed into
+Since this document describes how Markdown is to be parsed into
an abstract syntax tree, it would have made sense to use an abstract
representation of the syntax tree instead of HTML. But HTML is capable
of representing the structural distinctions we need to make, and the
@@ -183,17 +183,17 @@ choice of HTML for the tests makes it possible to run the tests against
an implementation without writing an abstract syntax tree renderer.
This document is generated from a text file, `spec.txt`, written
-in markdown with a small extension for the side-by-side tests.
+in Markdown with a small extension for the side-by-side tests.
The script `spec2md.pl` can be used to turn `spec.txt` into pandoc
-markdown, which can then be converted into other formats.
+Markdown, which can then be converted into other formats.
In the examples, the `→` character is used to represent tabs.
# Preprocessing
A [line](#line) <a id="line"/>
-is a sequence of one or more characters followed by a line
-ending (CR, LF, or CRLF, depending on the platform) or by the end of
+is a sequence of zero or more characters followed by a line
+ending (CR, LF, or CRLF) or by the end of
file.
This spec does not specify an encoding; it thinks of lines as composed
@@ -263,7 +263,7 @@ which can contain other blocks, and [leaf blocks](#leaf-block),
# Leaf blocks
This section describes the different kinds of leaf block that make up a
-markdown document.
+Markdown document.
## Horizontal rules
@@ -611,9 +611,11 @@ of the closing sequence:
.
### foo \###
## foo \#\##
+# foo \#
.
<h3>foo #</h3>
<h2>foo ##</h2>
+<h1>foo #</h1>
.
ATX headers need not be separated from surrounding content by blank
@@ -659,10 +661,10 @@ with no more than 3 spaces indentation, followed by a [setext header
underline](#setext-header-underline). A [setext header
underline](#setext-header-underline) <a id="setext-header-underline"/>
is a sequence of `=` characters or a sequence of `-` characters, with no
-more than 3 spaces indentation and any number of leading or trailing
+more than 3 spaces indentation and any number of trailing
spaces. The header is a level 1 header if `=` characters are used, and
a level 2 header if `-` characters are used. The contents of the header
-are the result of parsing the first line as markdown inline content.
+are the result of parsing the first line as Markdown inline content.
In general, a setext header need not be preceded or followed by a
blank line. However, it cannot interrupt a paragraph, so when a
@@ -881,7 +883,7 @@ attributes.
</code></pre>
.
-The contents are literal text, and do not get parsed as markdown:
+The contents are literal text, and do not get parsed as Markdown:
.
<a/>
@@ -931,7 +933,7 @@ in interior blank lines:
</code></pre>
.
-An indented code code block cannot interrupt a paragraph. (This
+An indented code block cannot interrupt a paragraph. (This
allows hanging indents and the like.)
.
@@ -1015,14 +1017,14 @@ Trailing spaces are included in the code block's content:
A [code fence](#code-fence) <a id="code-fence"/> is a sequence
of at least three consecutive backtick characters (`` ` ``) or
-tildes (`~`). (Tildes and backticks cannot be mixed.).
+tildes (`~`). (Tildes and backticks cannot be mixed.)
A [fenced code block](#fenced-code-block) <a id="fenced-code-block"/>
begins with a code fence, indented no more than three spaces.
The line with the opening code fence may optionally contain some text
following the code fence; this is trimmed of leading and trailing
spaces and called the [info string](#info-string). <a
-id="info-string"/> The [info string] may not contain any backtick
+id="info-string"/> The info string may not contain any backtick
characters. (The reason for this restriction is that otherwise
some inline code would be incorrectly interpreted as the
beginning of a fenced code block.)
@@ -1395,7 +1397,7 @@ okay.
<foo><a>
.
-Here we have two code blocks with a markdown paragraph between them:
+Here we have two code blocks with a Markdown paragraph between them:
.
<DIV CLASS="foo">
@@ -1409,7 +1411,7 @@ Here we have two code blocks with a markdown paragraph between them:
</DIV>
.
-In the following example, what looks like a markdown code block
+In the following example, what looks like a Markdown code block
is actually part of the HTML block, which continues until a blank
line or the end of the document is reached:
@@ -1533,7 +1535,7 @@ foo
foo
.
-This rule differs from John Gruber's original markdown syntax
+This rule differs from John Gruber's original Markdown syntax
specification, which says:
> The only restrictions are that block-level HTML elements —
@@ -1549,7 +1551,7 @@ here:
- It requires a matching end tag, which it also does not allow to
be indented.
-Indeed, most markdown implementations, including some of Gruber's
+Indeed, most Markdown implementations, including some of Gruber's
own perl implementations, do not impose these restrictions.
There is one respect, however, in which Gruber's rule is more liberal
@@ -1558,8 +1560,8 @@ an HTML block. There are two reasons for disallowing them here.
First, it removes the need to parse balanced tags, which is
expensive and can require backtracking from the end of the document
if no matching end tag is found. Second, it provides a very simple
-and flexible way of including markdown content inside HTML tags:
-simply separate the markdown from the HTML using blank lines:
+and flexible way of including Markdown content inside HTML tags:
+simply separate the Markdown from the HTML using blank lines:
.
<div>
@@ -1585,14 +1587,14 @@ Compare:
</div>
.
-Some markdown implementations have adopted a convention of
+Some Markdown implementations have adopted a convention of
interpreting content inside tags as text if the open tag has
the attribute `markdown=1`. The rule given above seems a simpler and
more elegant way of achieving the same expressive power, which is also
much simpler to parse.
The main potential drawback is that one can no longer paste HTML
-blocks into markdown documents with 100% reliability. However,
+blocks into Markdown documents with 100% reliability. However,
*in most cases* this will work fine, because the blank lines in
HTML are usually followed by HTML block tags. For example:
@@ -2014,10 +2016,10 @@ The following rules define [block quotes](#block-quote):
more lines in which the next non-space character after the [block
quote marker](#block-quote-marker) is [paragraph continuation
text](#paragraph-continuation-text) is a block quote with *Bs* as
- its content. [Paragraph continuation
- text](#paragraph-continuation-text) is text that will be parsed as
- part of the content of a paragraph, but does not occur at the
- beginning of the paragraph.
+ its content. <a id="paragraph-continuation-text"/>
+ [Paragraph continuation text](#paragraph-continuation-text) is text
+ that will be parsed as part of the content of a paragraph, but does
+ not occur at the beginning of the paragraph.
3. **Consecutiveness.** A document cannot contain two [block
quotes](#block-quote) in a row unless there is a [blank
@@ -2207,8 +2209,8 @@ A blank line always separates block quotes:
</blockquote>
.
-(Most current markdown implementations, including John Gruber's
-original `Markdown.pl`, will parse this eample as a single block quote
+(Most current Markdown implementations, including John Gruber's
+original `Markdown.pl`, will parse this example as a single block quote
with two paragraphs. But it seems better to allow the author to decide
whether two block quotes or one are wanted.)
@@ -2887,7 +2889,7 @@ continued here.</p>
5. **That's all.** Nothing that is not counted as a list item by rules
- #1--4 counts as a [list item](#block-quote).
+ #1--4 counts as a [list item](#list-item).
The rules for sublists follow from the general rules above. A sublist
must be indented the same number of spaces a paragraph would need to be
@@ -3001,7 +3003,7 @@ A list item may be empty:
### Motivation
-John Gruber's markdown spec says the following about list items:
+John Gruber's Markdown spec says the following about list items:
1. "List markers typically start at the left margin, but may be indented
by up to three spaces. List markers must be followed by one or more
@@ -3041,10 +3043,10 @@ sublists to start with only two spaces indentation, at least on the
outer level. Worse, its behavior was inconsistent: a sublist of an
outer-level list needed two spaces indentation, but a sublist of this
sublist needed three spaces. It is not surprising, then, that different
-implementations of markdown have developed very different rules for
-determining what comes under a list item. (Pandoc and python-markdown,
+implementations of Markdown have developed very different rules for
+determining what comes under a list item. (Pandoc and python-Markdown,
for example, stuck with Gruber's syntax description and the four-space
-rule, while discount, redcarpet, marked, PHP markdown, and others
+rule, while discount, redcarpet, marked, PHP Markdown, and others
followed `Markdown.pl`'s behavior more closely.)
Unfortunately, given the divergences between implementations, there
@@ -3159,7 +3161,7 @@ is not indented as far as the first paragraph `foo`:
Arguably this text does read like a list item with `bar` as a subparagraph,
which may count in favor of the proposal. However, on this proposal indented
code would have to be indented six spaces after the list marker. And this
-would break a lot of existing markdown, which has the pattern:
+would break a lot of existing Markdown, which has the pattern:
``` markdown
1. foo
@@ -3614,7 +3616,7 @@ backslashes:
.
Escaped characters are treated as regular characters and do
-not have their usual markdown meanings:
+not have their usual Markdown meanings:
.
\*not emphasized*
@@ -3778,7 +3780,7 @@ named entities are recognized as entities here:
<p>&MadeUpEntity;</p>
.
-Entities are recognized in any any context besides code spans or
+Entities are recognized in any context besides code spans or
code blocks, including raw HTML, URLs, [link titles](#link-title), and
[fenced code block](#fenced-code-block) info strings:
@@ -3968,7 +3970,7 @@ we just have literal backticks:
## Emphasis and strong emphasis
-John Gruber's original [markdown syntax
+John Gruber's original [Markdown syntax
description](http://daringfireball.net/projects/markdown/syntax#em) says:
> Markdown treats asterisks (`*`) and underscores (`_`) as indicators of
@@ -4635,8 +4637,8 @@ More cases with mismatched delimiters:
A link contains a [link label](#link-label) (the visible text),
a [destination](#destination) (the URI that is the link destination),
and optionally a [link title](#link-title). There are two basic kinds
-of links in markdown. In [inline links](#inline-links) the destination
-and title are given immediately after the lable. In [reference
+of links in Markdown. In [inline links](#inline-links) the destination
+and title are given immediately after the label. In [reference
links](#reference-links) the destination and title are defined elsewhere
in the document.
@@ -4780,7 +4782,7 @@ or use the `<...>` form:
.
Parentheses and other symbols can also be escaped, as usual
-in markdown:
+in Markdown:
.
[link](foo\)\:)
@@ -5114,7 +5116,7 @@ than emphasis:
<p>*<a href="/url">foo*</a></p>
.
-However, this is not, because link labels bind tight less
+However, this is not, because link labels bind less
tightly than code backticks:
.
@@ -5941,7 +5943,7 @@ blocks but not parsed. Link reference definitions are parsed and a
map of links is constructed.
2. In the second phase, the raw text contents of paragraphs and headers
-are parsed into sequences of markdown inline elements (strings,
+are parsed into sequences of Markdown inline elements (strings,
code spans, links, emphasis, and so on), using the map of link
references constructed in phase 1.
@@ -5950,7 +5952,7 @@ references constructed in phase 1.
At each point in processing, the document is represented as a tree of
**blocks**. The root of the tree is a `document` block. The `document`
may have any number of other blocks as **children**. These children
-may, in turn, have other blocks a children. The last child of a block
+may, in turn, have other blocks as children. The last child of a block
is normally considered **open**, meaning that subsequent lines of input
can alter its contents. (Blocks that are not open are **closed**.)
Here, for example, is a possible document tree, with the open blocks
@@ -5986,7 +5988,7 @@ Once a line has been incorporated into the tree in this way,
it can be discarded, so input can be read in a stream.
We can see how this works by considering how the tree above is
-generated by four lines of markdown:
+generated by four lines of Markdown:
``` markdown
> Lorem ipsum dolor
@@ -6043,8 +6045,8 @@ The third line,
causes the `paragraph` block to be closed, and a new `list` block
opened as a child of the `block_quote`. A `list_item` is also
-added as a child of the `list`, and a `paragraph` as a chid of
-the `list_item`. The text is then added to the `paragraph`:
+added as a child of the `list`, and a `paragraph` as a child of
+the `list_item`. The text is then added to the new `paragraph`:
``` tree
-> document