From 45c1d9fadb3e8aab4a01bb27a4e2ece379902d1a Mon Sep 17 00:00:00 2001 From: Vicent Marti Date: Thu, 4 Sep 2014 17:26:11 +0200 Subject: 426/15 --- spec.txt | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index 82ae0b6..d7e70f5 100644 --- a/spec.txt +++ b/spec.txt @@ -1682,7 +1682,7 @@ them. [Foo bar] . -

Foo bar

+

Foo bar

. The title may be omitted: @@ -1745,7 +1745,7 @@ case-insensitive (see [matches](#matches)). [αγω] . -

αγω

+

αγω

. Here is a link reference definition with no corresponding link. @@ -3688,7 +3688,7 @@ raw HTML: . . -

http://google.com?find=\*

+

http://google.com?find=\*

. . -- cgit v1.2.3 From d8f44f1e4f0bd944ab43e6434a1579d670ed66cf Mon Sep 17 00:00:00 2001 From: Vicent Marti Date: Thu, 4 Sep 2014 17:49:13 +0200 Subject: 433/8 --- spec.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index d7e70f5..cfda2a3 100644 --- a/spec.txt +++ b/spec.txt @@ -3946,7 +3946,7 @@ But this is a link: . ` . -

http://foo.bar.`baz`

+

http://foo.bar.`baz`

. And this is an HTML tag: -- cgit v1.2.3 From 38220c56c9a888a0c00ff22fb82ba156fec1f6a8 Mon Sep 17 00:00:00 2001 From: Vicent Marti Date: Thu, 4 Sep 2014 17:54:37 +0200 Subject: 5 failed --- spec.txt | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index cfda2a3..a353d56 100644 --- a/spec.txt +++ b/spec.txt @@ -3688,7 +3688,7 @@ raw HTML: . . -

http://google.com?find=\*

+

http://google.com?find=\*

. . @@ -4755,7 +4755,7 @@ braces: . [link]() . -

link

+

link

. The destination cannot contain line breaks, even with pointy braces: @@ -4821,7 +4821,7 @@ get unexpected results: . [link]("title") . -

link

+

link

. Titles may be in single quotes, double quotes, or parentheses: -- cgit v1.2.3 From d260c800c90e024714a6d84e28ac2caea70866e7 Mon Sep 17 00:00:00 2001 From: Vicent Marti Date: Thu, 4 Sep 2014 20:04:12 +0200 Subject: This spec was correct --- spec.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index a353d56..616cb96 100644 --- a/spec.txt +++ b/spec.txt @@ -3688,7 +3688,7 @@ raw HTML: . . -

http://google.com?find=\*

+

http://google.com?find=\*

. . -- cgit v1.2.3 From 798f58a2b614280201141b398c8e498cecc8ab5e Mon Sep 17 00:00:00 2001 From: Vicent Marti Date: Sat, 6 Sep 2014 21:17:23 +0200 Subject: This is going well --- spec.txt | 35 +++++++++++++++++++++++------------ 1 file changed, 23 insertions(+), 12 deletions(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index 616cb96..ebd6d98 100644 --- a/spec.txt +++ b/spec.txt @@ -3688,7 +3688,7 @@ raw HTML: . . -

http://google.com?find=\*

+

http://google.com?find=\*

. . @@ -3727,25 +3727,37 @@ foo ## Entities -Entities are parsed as entities, not as literal text, in all contexts -except code spans and code blocks. Three kinds of entities are recognized. +With the goal of making this standard as HTML-agnostic as possible, all HTML valid HTML Entities in any +context are recognized as such and converted into their actual values (i.e. the UTF8 characters representing +the entity itself) before they are stored in the AST. + +This allows implementations that target HTML output to trivially escape the entities when generating HTML, +and simplifies the job of implementations targetting other languages, as these will only need to handle the +UTF8 chars and need not be HTML-entity aware. [Named entities](#name-entities) consist of `&` -+ a string of 2-32 alphanumerics beginning with a letter + `;`. ++ any of the valid HTML5 entity names + `;`. The [following document](http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json) +is used as an authoritative source of the valid entity names and their corresponding codepoints. + +Conforming implementations that target Markdown don't need to generate entities for all the valid +named entities that exist, with the exception of `&quot;` (`"`), `&amp;` (`&`), `&lt;` (`<`) and `&gt;` (`>`), +which always need to be written as entities for security reasons. . &nbsp; &amp; &copy; &AElig; &Dcaron; &frac34; &HilbertSpace; &DifferentialD; &ClockwiseContourIntegral; . -

  & © Æ Ď ¾ ℋ ⅆ ∲

+

  & © Æ Ď ¾ ℋ ⅆ ∲

. [Decimal entities](#decimal-entities) -consist of `&#` + a string of 1--8 arabic digits + `;`. +consist of `&#` + a string of 1--8 arabic digits + `;`. Again, these entities need to be recognised +and tranformed into their corresponding UTF8 codepoints. Invalid Unicode codepoints will be written +as the "unknown codepoint" character (`0xFFFD`) . - # Ӓ Ϡ � +# Ӓ Ϡ � . -

 # Ӓ Ϡ �

+

# Ӓ Ϡ �

. [Hexadecimal entities](#hexadecimal-entities) @@ -3767,7 +3779,7 @@ Here are some nonentities: . Although HTML5 does accept some entities without a trailing semicolon -(such as `&copy`), these are not recognized as entities here: +(such as `&copy`), these are not recognized as entities here, because it makes the grammar too ambiguous: . &copy @@ -3775,13 +3787,12 @@ Although HTML5 does accept some entities without a trailing semicolon

&copy

. -On the other hand, many strings that are not on the list of HTML5 -named entities are recognized as entities here: +Strings that are not on the list of HTML5 named entities are not recognized as entities either: . &MadeUpEntity; . -

&MadeUpEntity;

+

&MadeUpEntity;

. Entities are recognized in any context besides code spans or -- cgit v1.2.3 From 9d86d2f32303ae0048f6a5daa552bacceb9b12ea Mon Sep 17 00:00:00 2001 From: Vicent Marti Date: Tue, 9 Sep 2014 04:00:36 +0200 Subject: Update the spec with better entity handling --- spec.txt | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index ebd6d98..112dccc 100644 --- a/spec.txt +++ b/spec.txt @@ -3762,20 +3762,20 @@ as the "unknown codepoint" character (`0xFFFD`) [Hexadecimal entities](#hexadecimal-entities) consist of `&#` + either `X` or `x` + a string of 1-8 hexadecimal digits -+ `;`. ++ `;`. They will also be parsed and turned into their corresponding UTF8 values in the AST. . - " ആ ಫ +" ആ ಫ . -

 " ആ ಫ

+

" ആ ಫ

. Here are some nonentities: . -&nbsp &x; &#; &#x; &#123456789; &ThisIsWayTooLongToBeAnEntityIsntIt; &hi?; +&nbsp &x; &#; &#x; &ThisIsWayTooLongToBeAnEntityIsntIt; &hi?; . -

&nbsp &x; &#; &#x; &#123456789; &ThisIsWayTooLongToBeAnEntityIsntIt; &hi?;

+

&nbsp &x; &#; &#x; &ThisIsWayTooLongToBeAnEntityIsntIt; &hi?;

. Although HTML5 does accept some entities without a trailing semicolon @@ -3808,7 +3808,7 @@ code blocks, including raw HTML, URLs, [link titles](#link-title), and . [foo](/f&ouml;&ouml; "f&ouml;&ouml;") . -

foo

+

foo

. . @@ -3816,7 +3816,7 @@ code blocks, including raw HTML, URLs, [link titles](#link-title), and [foo]: /f&ouml;&ouml; "f&ouml;&ouml;" . -

foo

+

foo

. . @@ -3824,7 +3824,7 @@ code blocks, including raw HTML, URLs, [link titles](#link-title), and foo ``` . -
foo
+
foo
 
. @@ -4817,12 +4817,14 @@ in Markdown:

link

. -URL-escaping and entities should be left alone inside the destination: +URL-escaping and should be left alone inside the destination, as all URL-escaped characters +are also valid URL characters. HTML entities in the destination will be parsed into their UTF8 +codepoints, as usual, and optionally URL-escaped when written as HTML. . [link](foo%20b&auml;) . -

link

+

link

. Note that, because titles can often be parsed as destinations, -- cgit v1.2.3 From df58eee1f127f5c24631032792672bfe5120e6a3 Mon Sep 17 00:00:00 2001 From: Artyom Kazak Date: Tue, 9 Sep 2014 21:43:23 +0400 Subject: `code`, not `pre`. --- spec.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index 82ae0b6..c06f750 100644 --- a/spec.txt +++ b/spec.txt @@ -1058,7 +1058,7 @@ a blank line either before or after. The content of a code fence is treated as literal text, not parsed as inlines. The first word of the info string is typically used to specify the language of the code sample, and rendered in the `class` -attribute of the `pre` tag. However, this spec does not mandate any +attribute of the `code` tag. However, this spec does not mandate any particular treatment of the info string. Here is a simple example with backticks: -- cgit v1.2.3 From 5b16a88558f74eee5b4c93e43e895e98f4ea86d6 Mon Sep 17 00:00:00 2001 From: Artyom Kazak Date: Thu, 11 Sep 2014 04:19:01 +0400 Subject: =?UTF-8?q?Fix=20a=20broken=20link=20to=20the=20=E2=80=9CA=20parsi?= =?UTF-8?q?ng=20strategy=E2=80=9D=20section.?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit (Line lengths changed so that the link wouldn't have to be broken.) --- spec.txt | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index c06f750..c4e77b2 100644 --- a/spec.txt +++ b/spec.txt @@ -1994,11 +1994,11 @@ form of the definition is: > transforming X in such-and-such a way is a container of type Y > with these blocks as its content. -So, we explain what counts as a block quote or list item by -explaining how these can be *generated* from their contents. -This should suffice to define the syntax, although it does not -give a recipe for *parsing* these constructions. (A recipe is -provided below in the section entitled [A parsing strategy].) 
+So, we explain what counts as a block quote or list item by explaining +how these can be *generated* from their contents. This should suffice +to define the syntax, although it does not give a recipe for *parsing* +these constructions. (A recipe is provided below in the section entitled +[A parsing strategy](#appendix-a-a-parsing-strategy).) ## Block quotes -- cgit v1.2.3 From a6722b8a737eaefdf3d757227036deb4f10492db Mon Sep 17 00:00:00 2001 From: Artyom Kazak Date: Thu, 11 Sep 2014 04:30:08 +0400 Subject: Fix another broken link. --- spec.txt | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index c4e77b2..4a9e9fd 100644 --- a/spec.txt +++ b/spec.txt @@ -2010,9 +2010,9 @@ The following rules define [block quotes](#block-quote): 1. **Basic case.** If a string of lines *Ls* constitute a sequence - of blocks *Bs*, then the result of appending a [block quote marker] - to the beginning of each line in *Ls* is a [block quote](#block-quote) - containing *Bs*. + of blocks *Bs*, then the result of appending a [block quote + marker](#block-quote-marker) to the beginning of each line in *Ls* + is a [block quote](#block-quote) containing *Bs*. 2. **Laziness.** If a string of lines *Ls* constitute a [block quote](#block-quote) with contents *Bs*, then the result of deleting -- cgit v1.2.3 From bd271515770a17f3c320eb394f2012ccd51a417b Mon Sep 17 00:00:00 2001 From: John MacFarlane Date: Tue, 9 Sep 2014 22:30:54 -0700 Subject: spec: change nesting order of strong/emph in ***a***. --- spec.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index 4a9e9fd..88c8dea 100644 --- a/spec.txt +++ b/spec.txt @@ -4392,13 +4392,13 @@ The rules are sufficient for the following nesting patterns: . ***foo bar*** . -

foo bar

+

foo bar

. . ___foo bar___ . -

foo bar

+

foo bar

. . -- cgit v1.2.3 From 905b5d4d11cf1e56137fea1e68eb503863f1b113 Mon Sep 17 00:00:00 2001 From: John MacFarlane Date: Wed, 10 Sep 2014 08:42:39 -0700 Subject: Revert "spec: change nesting order of strong/emph in ***a***." This reverts commit 49a03b7666e2901d1ab2813fc0bdd23968d22979. --- spec.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index 88c8dea..4a9e9fd 100644 --- a/spec.txt +++ b/spec.txt @@ -4392,13 +4392,13 @@ The rules are sufficient for the following nesting patterns: . ***foo bar*** . -

foo bar

+

foo bar

. . ___foo bar___ . -

foo bar

+

foo bar

. . -- cgit v1.2.3 From e245f1a2d5ec76807633806a5af1ebe52fe5bd6d Mon Sep 17 00:00:00 2001 From: John MacFarlane Date: Wed, 10 Sep 2014 08:56:20 -0700 Subject: Updated spec (but not yet examples) with new rules. These reflect the current parsing algorithm. We now get a symmetry that we lacked before: **a* b* *a *b** are both emphasis within emphasis. One asymmetry remains: **a* has no emphasis, while *a** has emphasis. Further tweaking of the algorithm could regularize this. --- spec.txt | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index 4a9e9fd..37f92c5 100644 --- a/spec.txt +++ b/spec.txt @@ -4024,7 +4024,7 @@ for efficient parsing strategies that do not backtrack: (a) it is not part of a sequence of four or more unescaped `*`s, (b) it is not followed by whitespace, and (c) either it is not followed by a `*` character or it is - followed immediately by strong emphasis. + followed immediately by emphasis or strong emphasis. 2. A single `_` character [can open emphasis](#can-open-emphasis) iff @@ -4032,7 +4032,7 @@ for efficient parsing strategies that do not backtrack: (b) it is not followed by whitespace, (c) is is not preceded by an ASCII alphanumeric character, and (d) either it is not followed by a `_` character or it is - followed immediately by strong emphasis. + followed immediately by emphasis or strong emphasis. 3. A single `*` character [can close emphasis](#can-close-emphasis) iff @@ -4088,6 +4088,11 @@ for efficient parsing strategies that do not backtrack: emphasis](#can-close-strong-emphasis), and that uses the same character (`_` or `*`) as the opening delimiter, is reached. +11. In case of ambiguity, strong emphasis takes precedence. Thus, + `**foo**` is `<strong>foo</strong>`, not `<em><em>foo</em></em>`, + and `***foo***` is `<strong><em>foo</em></strong>`, not + `<em><strong>foo</strong></em>` or `<em><em><em>foo</em></em></em>`. + These rules can be illustrated through a series of examples.
Simple emphasis: -- cgit v1.2.3 From 5cd513026fe49e83cfd544a7b375bf4fa1466b21 Mon Sep 17 00:00:00 2001 From: John MacFarlane Date: Wed, 10 Sep 2014 09:00:40 -0700 Subject: Updated test cases in spec to reflect last change. --- spec.txt | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index 37f92c5..e1aa502 100644 --- a/spec.txt +++ b/spec.txt @@ -4612,17 +4612,11 @@ Note that there are some asymmetries here: **foo* bar* .

foo bar

-

**foo* bar*

+

foo bar

. More cases with mismatched delimiters: -. -**foo* bar* -. -

**foo* bar*

-. - . *bar*** . -- cgit v1.2.3 From 9c08b31793f269e4b5902908282034618ee66eef Mon Sep 17 00:00:00 2001 From: Alex Kocharin Date: Tue, 16 Sep 2014 00:44:52 +0400 Subject: typo fix --- spec.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index 4a9e9fd..40d04f2 100644 --- a/spec.txt +++ b/spec.txt @@ -4030,7 +4030,7 @@ for efficient parsing strategies that do not backtrack: (a) it is not part of a sequence of four or more unescaped `_`s, (b) it is not followed by whitespace, - (c) is is not preceded by an ASCII alphanumeric character, and + (c) it is not preceded by an ASCII alphanumeric character, and (d) either it is not followed by a `_` character or it is followed immediately by strong emphasis. -- cgit v1.2.3 From c4b76cf93c8c54b6a33bab82056dc542c6630d92 Mon Sep 17 00:00:00 2001 From: John MacFarlane Date: Fri, 19 Sep 2014 18:11:33 -0700 Subject: spec: Fixed date, version. Closes #133. --- spec.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index 040c060..fce8792 100644 --- a/spec.txt +++ b/spec.txt @@ -2,8 +2,8 @@ title: CommonMark Spec author: - John MacFarlane -version: 1 -date: 2014-09-06 +version: 2 +date: 2014-09-19 ... # Introduction -- cgit v1.2.3 From efc3e5d7a234587c79ac847213437f936de2499b Mon Sep 17 00:00:00 2001 From: Andrew January Date: Mon, 29 Sep 2014 13:12:29 +0100 Subject: Changes append to prepend When adding something to the beginning it is "prepending", not "appending" --- spec.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index fce8792..b89105f 100644 --- a/spec.txt +++ b/spec.txt @@ -2010,7 +2010,7 @@ The following rules define [block quotes](#block-quote): 1. 
**Basic case.** If a string of lines *Ls* constitute a sequence - of blocks *Bs*, then the result of appending a [block quote + of blocks *Bs*, then the result of prepending a [block quote marker](#block-quote-marker) to the beginning of each line in *Ls* is a [block quote](#block-quote) containing *Bs*. -- cgit v1.2.3 From 749b3000e8cc3202c52e30f2cd5e585175e9e17d Mon Sep 17 00:00:00 2001 From: Andrew January Date: Mon, 29 Sep 2014 13:24:54 +0100 Subject: Changes urls to use example.com As per RFC 2606 it is recommended to use example.com for sample urls in specifications. One example is left using "foo+special@Bar.baz-bar0.com" because it is designed to demonstrate the complexity of email addresses that should be permitted. --- spec.txt | 18 ++++++++---------- 1 file changed, 8 insertions(+), 10 deletions(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index fce8792..9a7e675 100644 --- a/spec.txt +++ b/spec.txt @@ -3686,9 +3686,9 @@ raw HTML: . . -<http://google.com?find=\*> +<http://example.com?find=\*> . -

http://google.com?find=\*

+

http://example.com?find=\*

. . @@ -5504,9 +5504,9 @@ spec](http://www.whatwg.org/specs/web-apps/current-work/multipage/forms.html#e-m Examples of email autolinks: . -<foo@bar.baz.com> +<foo@bar.example.com> . -

foo@bar.baz.com

+

foo@bar.example.com

. . @@ -5548,15 +5548,15 @@ These are not autolinks: . . -http://google.com +http://example.com . -

http://google.com

+

http://example.com

. . -foo@bar.baz.com +foo@bar.example.com . -

foo@bar.baz.com

+

foo@bar.example.com

. ## Raw HTML @@ -6146,5 +6146,3 @@ an `emph`. The document can be rendered as HTML, or in any other format, given an appropriate renderer. - - -- cgit v1.2.3 From 205b4aafe8c4aeb03700b450d2805f6f5b9fdc3f Mon Sep 17 00:00:00 2001 From: Andrew January Date: Mon, 29 Sep 2014 13:27:12 +0100 Subject: Adds missing newlines --- spec.txt | 2 ++ 1 file changed, 2 insertions(+) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index 9a7e675..c9d207a 100644 --- a/spec.txt +++ b/spec.txt @@ -6146,3 +6146,5 @@ an `emph`. The document can be rendered as HTML, or in any other format, given an appropriate renderer. + + -- cgit v1.2.3 From 8a2b85da34e1de10abaf55b212b0660a7917b5d8 Mon Sep 17 00:00:00 2001 From: John MacFarlane Date: Tue, 7 Oct 2014 09:05:27 -0700 Subject: Removed spurious 'and', reflowed. --- spec.txt | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index bc2e381..c520272 100644 --- a/spec.txt +++ b/spec.txt @@ -4817,9 +4817,10 @@ in Markdown:

link

. -URL-escaping and should be left alone inside the destination, as all URL-escaped characters -are also valid URL characters. HTML entities in the destination will be parsed into their UTF8 -codepoints, as usual, and optionally URL-escaped when written as HTML. +URL-escaping should be left alone inside the destination, as all +URL-escaped characters are also valid URL characters. HTML entities in +the destination will be parsed into their UTF8 codepoints, as usual, and +optionally URL-escaped when written as HTML. . [link](foo%20b&auml;) -- cgit v1.2.3 From 4dc7bbb0c3fb1057c921dedc2f83786caaa6f0ad Mon Sep 17 00:00:00 2001 From: John MacFarlane Date: Tue, 7 Oct 2014 09:05:27 -0700 Subject: Removed spurious 'and', reflowed. --- spec.txt | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index 0a62b80..990ae8c 100644 --- a/spec.txt +++ b/spec.txt @@ -4816,9 +4816,10 @@ in Markdown:

link

. -URL-escaping and should be left alone inside the destination, as all URL-escaped characters -are also valid URL characters. HTML entities in the destination will be parsed into their UTF8 -codepoints, as usual, and optionally URL-escaped when written as HTML. +URL-escaping should be left alone inside the destination, as all +URL-escaped characters are also valid URL characters. HTML entities in +the destination will be parsed into their UTF8 codepoints, as usual, and +optionally URL-escaped when written as HTML. . [link](foo%20b&auml;) -- cgit v1.2.3 From 3d99baba064091f74b9da78eaed38fcf4875af46 Mon Sep 17 00:00:00 2001 From: John MacFarlane Date: Tue, 7 Oct 2014 22:21:03 -0700 Subject: Adjusted tests for new js parser. --- spec.txt | 26 ++++++++++++++++++++++---- 1 file changed, 22 insertions(+), 4 deletions(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index 990ae8c..db62f53 100644 --- a/spec.txt +++ b/spec.txt @@ -4525,6 +4525,24 @@ __foo _bar_ baz__

foo bar baz

. +But note: + +. +*foo**bar**baz* +. +

foobarbaz

+. + +. +**foo*bar*baz** +. +

foobarbaz**

+. + +The difference is that in the two preceding cases, +the internal delimiters [can close emphasis](#can-close-emphasis), +while in the cases with spaces, they cannot. + Note that you cannot nest emphasis directly inside emphasis using the same delimeter, or strong emphasis directly inside strong emphasis: @@ -4606,7 +4624,7 @@ However, a string of four or more `****` can never close emphasis:

*foo****

. -Note that there are some asymmetries here: +We retain symmetry in these cases: . *foo** @@ -4614,7 +4632,7 @@ Note that there are some asymmetries here: **foo* .

foo*

-

**foo*

+

*foo

. . @@ -4637,7 +4655,7 @@ More cases with mismatched delimiters: . ***foo* . -

***foo*

+

**foo

. . @@ -4649,7 +4667,7 @@ More cases with mismatched delimiters: . ***foo** . -

***foo**

+

*foo

. . -- cgit v1.2.3 From d3c3e749f4f7b95a9604f751cf993fd488a15b19 Mon Sep 17 00:00:00 2001 From: John MacFarlane Date: Tue, 7 Oct 2014 22:24:53 -0700 Subject: Cleaned up entity section of spec. We convert entities to unicode characters, not UTF-8 sequences. (Though they might ultimately be output that way.) --- spec.txt | 41 ++++++++++++++++++++++++----------------- 1 file changed, 24 insertions(+), 17 deletions(-) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index db62f53..489b9c0 100644 --- a/spec.txt +++ b/spec.txt @@ -3727,21 +3727,25 @@ foo ## Entities -With the goal of making this standard as HTML-agnostic as possible, all HTML valid HTML Entities in any -context are recognized as such and converted into their actual values (i.e. the UTF8 characters representing -the entity itself) before they are stored in the AST. +With the goal of making this standard as HTML-agnostic as possible, all +valid HTML entities in any context are recognized as such and +converted into unicode characters before they are stored in the AST. -This allows implementations that target HTML output to trivially escape the entities when generating HTML, -and simplifies the job of implementations targetting other languages, as these will only need to handle the -UTF8 chars and need not be HTML-entity aware. +This allows implementations that target HTML output to trivially escape +the entities when generating HTML, and simplifies the job of +implementations targetting other languages, as these will only need to +handle the unicode chars and need not be HTML-entity aware. [Named entities](#name-entities) consist of `&` -+ any of the valid HTML5 entity names + `;`. The [following document](http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json) -is used as an authoritative source of the valid entity names and their corresponding codepoints. ++ any of the valid HTML5 entity names + `;`. 
The +[following document](http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json) +is used as an authoritative source of the valid entity names and their +corresponding codepoints. -Conforming implementations that target Markdown don't need to generate entities for all the valid -named entities that exist, with the exception of `&quot;` (`"`), `&amp;` (`&`), `&lt;` (`<`) and `&gt;` (`>`), -which always need to be written as entities for security reasons. +Conforming implementations that target HTML don't need to generate +entities for all the valid named entities that exist, with the exception +of `&quot;` (`"`), `&amp;` (`&`), `&lt;` (`<`) and `&gt;` (`>`), which +always need to be written as entities for security reasons. . &nbsp; &amp; &copy; &AElig; &Dcaron; &frac34; &HilbertSpace; &DifferentialD; &ClockwiseContourIntegral; @@ -3750,9 +3754,10 @@ which always need to be written as entities for security reasons. . [Decimal entities](#decimal-entities) -consist of `&#` + a string of 1--8 arabic digits + `;`. Again, these entities need to be recognised -and tranformed into their corresponding UTF8 codepoints. Invalid Unicode codepoints will be written -as the "unknown codepoint" character (`0xFFFD`) +consist of `&#` + a string of 1--8 arabic digits + `;`. Again, these +entities need to be recognised and tranformed into their corresponding +UTF8 codepoints. Invalid Unicode codepoints will be written as the +"unknown codepoint" character (`0xFFFD`) . # Ӓ Ϡ � @@ -3779,7 +3784,8 @@ Here are some nonentities: . Although HTML5 does accept some entities without a trailing semicolon -(such as `&copy`), these are not recognized as entities here, because it makes the grammar too ambiguous: +(such as `&copy`), these are not recognized as entities here, because it +makes the grammar too ambiguous: . &copy @@ -3787,7 +3793,8 @@ Although HTML5 does accept some entities without a trailing semicolon

&copy

. -Strings that are not on the list of HTML5 named entities are not recognized as entities either: +Strings that are not on the list of HTML5 named entities are not +recognized as entities either: . &MadeUpEntity; @@ -4836,7 +4843,7 @@ in Markdown: URL-escaping should be left alone inside the destination, as all URL-escaped characters are also valid URL characters. HTML entities in -the destination will be parsed into their UTF8 codepoints, as usual, and +the destination will be parsed into their UTF-8 codepoints, as usual, and optionally URL-escaped when written as HTML. . -- cgit v1.2.3 From 8122177e49f9d28b6606ce8168788113508e3306 Mon Sep 17 00:00:00 2001 From: John MacFarlane Date: Tue, 7 Oct 2014 22:45:19 -0700 Subject: Added test case from issue #147. --- spec.txt | 12 ++++++++++++ 1 file changed, 12 insertions(+) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index 2a7e3de..fa2a877 100644 --- a/spec.txt +++ b/spec.txt @@ -4532,6 +4532,18 @@ __foo _bar_ baz__

foo bar baz

. +. +**foo, *bar*, baz** +. +

foo, bar, baz

+. + +. +__foo, _bar_, baz__ +. +

foo, bar, baz

+. + But note: . -- cgit v1.2.3 From 735f77b2a6a016abd56dfd1717de5a4b14528c36 Mon Sep 17 00:00:00 2001 From: John MacFarlane Date: Tue, 7 Oct 2014 23:00:56 -0700 Subject: Added cases from #51 to spec. Closes #51. --- spec.txt | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) (limited to 'spec.txt') diff --git a/spec.txt b/spec.txt index fa2a877..7b447f1 100644 --- a/spec.txt +++ b/spec.txt @@ -4357,6 +4357,32 @@ __this is a double underscore (`__`)__

this is a double underscore (__)

. +Or use the other emphasis character: + +. +*_* +. +

_

+. + +. +_*_ +. +

*

+. + +. +*__* +. +

__

+. + +. +_**_ +. +

**

+. + `*` delimiters allow intra-word emphasis; `_` delimiters do not: . -- cgit v1.2.3
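
Note on the entity patches above: the rules they introduce (named entities checked against the HTML5 list, 1-8 decimal or hexadecimal digits, a mandatory trailing `;`, and U+FFFD for invalid codepoints) can be sketched roughly as follows. This is only an illustration in Python, not code from the repository. `decode_entities` and `_from_codepoint` are hypothetical names, Python's `html.entities.html5` table stands in for the WHATWG `entities.json` document, and treating zero, surrogates, and values above U+10FFFF as the "invalid" codepoints is an assumption the patches do not spell out.

```python
import re
from html.entities import html5  # keys include the trailing ";", e.g. "amp;"

# Hexadecimal (1-8 hex digits), decimal (1-8 digits) and named forms.
# All three require the trailing semicolon, as the patched spec text does.
_ENTITY = re.compile(
    r"&(?:#[xX]([0-9a-fA-F]{1,8})|#([0-9]{1,8})|([A-Za-z][A-Za-z0-9]*));"
)

REPLACEMENT = "\ufffd"  # the "unknown codepoint" character


def _from_codepoint(cp: int) -> str:
    # Assumption: invalid means zero, a surrogate, or beyond U+10FFFF.
    if cp == 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        return REPLACEMENT
    return chr(cp)


def decode_entities(text: str) -> str:
    """Replace recognized entities with the characters they denote."""

    def repl(match: re.Match) -> str:
        hex_digits, dec_digits, name = match.groups()
        if hex_digits is not None:
            return _from_codepoint(int(hex_digits, 16))
        if dec_digits is not None:
            return _from_codepoint(int(dec_digits))
        # Names that are not in the HTML5 table stay as literal text.
        return html5.get(name + ";", match.group(0))

    return _ENTITY.sub(repl, text)
```

Under these assumptions, `&copy` (no semicolon), `&MadeUpEntity;` and `&#123456789;` (nine digits) pass through unchanged, which matches the nonentity examples in the patches. An HTML renderer would still need to re-escape `"`, `&`, `<` and `>` on output.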