From c778892529949b8bed880babab20ef3f8a0adc03 Mon Sep 17 00:00:00 2001 From: John MacFarlane Date: Tue, 4 Nov 2014 11:02:20 -0800 Subject: Rewrote emph/strong part of spec, with more systematic examples. --- spec.txt | 690 ++++++++++++++++++++++++++++++++++++--------------------------- 1 file changed, 390 insertions(+), 300 deletions(-) diff --git a/spec.txt b/spec.txt index 8c0a30e..5dff831 100644 --- a/spec.txt +++ b/spec.txt @@ -4248,79 +4248,71 @@ The following rules capture all of these patterns, while allowing for efficient parsing strategies that do not backtrack: 1. A single `*` character [can open emphasis](#can-open-emphasis) - iff - - (a) it is not followed by whitespace, and - (b) either it is not followed by a `*` character or it is - followed immediately by emphasis or strong emphasis. + iff it is not followed by + whitespace. 2. A single `_` character [can open emphasis](#can-open-emphasis) iff - - (a) it is not followed by whitespace, - (b) it is not preceded by an ASCII alphanumeric character, and - (c) either it is not followed by a `_` character or it is - followed immediately by emphasis or strong emphasis. + it is not followed by whitespace and it is not preceded by an + ASCII alphanumeric character. 3. A single `*` character [can close emphasis](#can-close-emphasis) - iff - - (b) it is not preceded by whitespace. + iff it is not preceded by whitespace. 4. A single `_` character [can close emphasis](#can-close-emphasis) iff - - (a) it is not preceded by whitespace, and - (b) it is not followed by an ASCII alphanumeric character. + it is not preceded by whitespace and it is not followed by an + ASCII alphanumeric character. 5. A double `**` [can open strong emphasis](#can-open-strong-emphasis) - iff - - (a) it is not followed by whitespace, and - (b) either it is not followed by a `*` character or it is - followed immediately by emphasis. + iff it is not followed by + whitespace. 6. 
A double `__` [can open strong emphasis](#can-open-strong-emphasis) - iff - - (a) it is not followed by whitespace, and - (b) it is not preceded by an ASCII alphanumeric character, and - (c) either it is not followed by a `_` character or it is - followed immediately by emphasis. + iff it is not followed by whitespace and it is not preceded by an + ASCII alphanumeric character. 7. A double `**` [can close strong emphasis](#can-close-strong-emphasis) - iff - - (a) it is not preceded by whitespace. + iff it is not preceded by + whitespace. 8. A double `__` [can close strong emphasis](#can-close-strong-emphasis) - iff - - (a) it is not preceded by whitespace, and - (b) it is not followed by an ASCII alphanumeric character. + iff it is not preceded by whitespace and it is not followed by an + ASCII alphanumeric character. 9. Emphasis begins with a delimiter that [can open emphasis](#can-open-emphasis) and ends with a delimiter that [can close emphasis](#can-close-emphasis), and that uses the same - character (`_` or `*`) as the opening delimiter. The inlines - between the open delimiter and the closing delimiter are the - contents of the emphasis inline. + character (`_` or `*`) as the opening delimiter. There must + be a nonempty sequence of inlines between the open delimiter + and the closing delimiter; these form the contents of the emphasis + inline. 10. Strong emphasis begins with a delimiter that [can open strong emphasis](#can-open-strong-emphasis) and ends with a delimiter that - [can close strong emphasis](#can-close-strong-emphasis), and that uses the - same character (`_` or `*`) as the opening delimiter. The inlines - between the open delimiter and the closing delimiter are the - contents of the strong emphasis inline. + [can close strong emphasis](#can-close-strong-emphasis), and that + uses the same character (`_` or `*`) as the opening delimiter. 
+ There must be a nonempty sequence of inlines between the open + delimiter and the closing delimiter; these form the contents of + the strong emphasis inline. + +11. A literal `*` character cannot occur at the beginning or end of + `*`-delimited emphasis or `**`-delimited strong emphasis, unless it + is backslash-escaped. + +12. A literal `_` character cannot occur at the beginning or end of + `_`-delimited emphasis or `__`-delimited strong emphasis, unless it + is backslash-escaped. -Where rules 1--10 above are compatible with multiple parsings, +Where rules 1--12 above are compatible with multiple parsings, the following principles resolve ambiguity: -11. An interpretation `<strong>...</strong>` is always preferred to +13. The number of nestings should be minimized. Thus, for example, + an interpretation `<strong>...</strong>` is always preferred to `<em><em>...</em></em>`. -12. An interpretation `<strong><em>...</em></strong>` is always +14. An interpretation `<strong><em>...</em></strong>` is always preferred to `<em><strong>..</strong></em>`. -13. When two potential emphasis or strong emphasis spans overlap, +15. When two potential emphasis or strong emphasis spans overlap, so that the second begins before the first ends and ends after the first ends, the first is preferred. Thus, for example, `*foo _bar* baz_` is parsed as `<em>foo _bar</em> baz_` rather @@ -4328,13 +4320,13 @@ the following principles resolve ambiguity: `**foo*bar**` is parsed as `<em><em>foo</em>bar</em>*` rather than `<strong>foo*bar</strong>`. -14. When there are two potential emphasis or strong emphasis spans +16. When there are two potential emphasis or strong emphasis spans with the same closing delimiter, the shorter one (the one that opens later) is preferred. Thus, for example, `**foo **bar baz**` is parsed as `**foo <strong>bar baz</strong>` rather than `<strong>foo **bar baz</strong>`. -15. Inline code spans, links, images, and HTML tags group more tightly +17. Inline code spans, links, images, and HTML tags group more tightly than emphasis. So, when there is a choice between an interpretation that contains one of these elements and one that does not, the former always wins. 
Thus, for example, `*[foo*](bar)` is @@ -4343,7 +4335,7 @@ the following principles resolve ambiguity: These rules can be illustrated through a series of examples. -Simple emphasis: +Rule 1: . *foo bar* @@ -4351,363 +4343,424 @@ Simple emphasis:

foo bar

. +This is not emphasis, because the opening `*` is followed by +whitespace: + . -_foo bar_ +a * foo bar* . -

foo bar

+

a * foo bar*

. -Simple strong emphasis: +Intraword emphasis with `*` is permitted: . -**foo bar** +foo*bar* . -

foo bar

+

foobar

. . -__foo bar__ +5*6*78 . -

foo bar

+

5678

. -Emphasis can continue over line breaks: +Rule 2: . -*foo -bar* +_foo bar_ . -

foo -bar

+

foo bar

. +This is not emphasis, because the opening `_` is followed by +whitespace: + . -_foo -bar_ +_ foo bar_ . -

foo -bar

+

_ foo bar_

. +Emphasis with `_` is not allowed inside ASCII words: + . -**foo -bar** +foo_bar_ . -

foo -bar

+

foo_bar_

. . -__foo -bar__ +5_6_78 . -

foo -bar

+

5_6_78

. -Emphasis can contain other inline constructs: +But it is permitted inside non-ASCII words: . -*foo [bar](/url)* +пристаням_стремятся_ . -

foo bar

+

пристанямстремятся

. +Rule 3: + +This is not emphasis, because the closing `*` is preceded by +whitespace: + . -_foo [bar](/url)_ +*foo bar * . -

foo bar

+

*foo bar *

. +Intraword emphasis with `*` is allowed: + . -**foo [bar](/url)** +*foo*bar . -

foo bar

+

foobar

. + +Rule 4: + +This is not emphasis, because the closing `_` is preceded by +whitespace: + . -__foo [bar](/url)__ +_foo bar _ . -

foo bar

+

_foo bar _

. -Symbols contained in other inline constructs will not -close emphasis: +Intraword emphasis: . -*foo [bar*](/url) +_foo_bar . -

*foo bar*

+

_foo_bar

. . -_foo [bar_](/url) +_пристаням_стремятся . -

_foo bar_

+

пристанямстремятся

. . -** +_foo_bar_baz_ . -

**

+

foo_bar_baz

. +Rule 5: + . -__
+**foo bar** . -

__

+

foo bar

. +This is not strong emphasis, because the opening delimiter is +followed by whitespace: + . -*a `*`* +** foo bar** . -

a *

+

** foo bar**

. +Intraword strong emphasis with `**` is permitted: + . -_a `_`_ +foo**bar** . -

a _

+

foobar

. +Rule 6: + . -**a +__foo bar__ . -

**ahttp://foo.bar?q=**

+

foo bar

. +This is not strong emphasis, because the opening delimiter is +followed by whitespace: + . -__a +__ foo bar__ . -

__ahttp://foo.bar?q=__

+

__ foo bar__

. -This is not emphasis, because the opening delimiter is -followed by white space: +Intraword emphasis examples: . -and * foo bar* +foo__bar__ . -

and * foo bar*

+

foo__bar__

. . -_ foo bar_ +5__6__78 . -

_ foo bar_

+

5__6__78

. . -and ** foo bar** +пристаням__стремятся__ . -

and ** foo bar**

+

пристанямстремятся

. . -__ foo bar__ +__foo, __bar__, baz__ . -

__ foo bar__

+

foo, bar, baz

. -This is not emphasis, because the closing delimiter is -preceded by white space: +Rule 7: + +This is not strong emphasis, because the closing delimiter is preceded +by whitespace: . -and *foo bar * +**foo bar ** . -

and *foo bar *

+

**foo bar **

. +(Nor can it be interpreted as an emphasized `*foo bar *`, because of +Rule 11.) + +Intraword emphasis: + . -and _foo bar _ +**foo**bar . -

and _foo bar _

+

foobar

. +Rule 8: + +This is not strong emphasis, because the closing delimiter is +preceded by whitespace: + . -and **foo bar ** +__foo bar __ . -

and **foo bar **

+

__foo bar __

. +Intraword strong emphasis examples: + . -and __foo bar __ +__foo__bar . -

and __foo bar __

+

__foo__bar

. -The rules imply that a sequence of `*` or `_` characters -surrounded by whitespace will be parsed as a literal string: - . -foo ******** +__пристаням__стремятся . -

foo ********

+

пристанямстремятся

. . -Sign here: _________ +__foo__bar__baz__ . -

Sign here: _________

+

foo__bar__baz

. -The rules also imply that there can be no empty emphasis or strong -emphasis: +Rule 9: + +Any nonempty sequence of inline elements can be the contents of an +emphasized span. . -** is not an empty emphasis +*foo [bar](/url)* . -

** is not an empty emphasis

+

foo bar

. . -**** is not an empty strong emphasis +*foo +bar* . -

**** is not an empty strong emphasis

+

foo +bar

. -To include `*` or `_` in emphasized sections, use backslash escapes -or code spans: +In particular, emphasis and strong emphasis can be nested +inside emphasis: . -*here is a \** +_foo __bar__ baz_ . -

here is a *

+

foo bar baz

. . -__this is a double underscore (`__`)__ +_foo _bar_ baz_ . -

this is a double underscore (__)

+

foo bar baz

. -Or use the other emphasis character: - . -*_* +__foo_ bar_ . -

_

+

foo bar

. . -_*_ +*foo *bar** . -

*

+

foo bar

. . -*__* +*foo **bar** baz* . -

__

+

foo bar baz

. +But note: + . -_**_ +*foo**bar**baz* . -

**

+

foobarbaz

. -`*` delimiters allow intra-word emphasis; `_` delimiters do not: +The difference is that in the preceding case, +the internal delimiters [can close emphasis](#can-close-emphasis), +while in the cases with spaces, they cannot. . -foo*bar*baz +***foo** bar* . -

foobarbaz

+

foo bar

. . -foo_bar_baz +*foo **bar*** . -

foo_bar_baz

+

foo bar

. +Note, however, that in the following case we get no strong +emphasis, because the opening delimiter is closed by the first +`*` before `bar`: + . -foo__bar__baz +*foo**bar*** . -

foo__bar__baz

+

foobar**

. + +Indefinite levels of nesting are possible: + . -_foo_bar_baz_ +*foo **bar *baz* bim** bop* . -

foo_bar_baz

+

foo bar baz bim bop

. . -11*15*32 +*foo [*bar*](/url)* . -

111532

+

foo bar

. +There can be no empty emphasis or strong emphasis: + . -11_15_32 +** is not an empty emphasis . -

11_15_32

+

** is not an empty emphasis

. -Internal underscores will be ignored in underscore-delimited -emphasis: - . -_foo_bar_baz_ +**** is not an empty strong emphasis . -

foo_bar_baz

+

**** is not an empty strong emphasis

. + +Rule 10: + +Any nonempty sequence of inline elements can be the contents of a +strongly emphasized span. + . -__foo__bar__baz__ +**foo [bar](/url)** . -

foo__bar__baz

+

foo bar

. -The rules are sufficient for the following nesting patterns: - . -***foo bar*** +**foo +bar** . -

foo bar

+

foo +bar

. +In particular, emphasis and strong emphasis can be nested +inside strong emphasis: + . -___foo bar___ +__foo _bar_ baz__ . -

foo bar

+

foo bar baz

. . -***foo** bar* +__foo __bar__ baz__ . -

foo bar

+

foo bar baz

. . -___foo__ bar_ +____foo__ bar__ . -

foo bar

+

foo bar

. . -***foo* bar** +**foo **bar**** . -

foo bar

+

foo bar

. . -___foo_ bar__ +**foo *bar* baz** . -

foo bar

+

foo bar baz

. +But note: + . -*foo **bar*** +**foo*bar*baz** . -

foo bar

+

foobarbaz**

. +The difference is that in the preceding case, +the internal delimiters [can close emphasis](#can-close-emphasis), +while in the cases with spaces, they cannot. + . -_foo __bar___ +***foo* bar** . -

foo bar

+

foo bar

. . @@ -4716,256 +4769,261 @@ _foo __bar___

foo bar

. +Indefinite levels of nesting are possible: + . -__foo _bar___ +**foo *bar **baz** +bim* bop** . -

foo bar

+

foo bar baz +bim bop

. . -*foo **bar*** +**foo [*bar*](/url)** . -

foo bar

+

foo bar

. +There can be no empty emphasis or strong emphasis: + . -_foo __bar___ +__ is not an empty emphasis . -

foo bar

+

__ is not an empty emphasis

. . -*foo *bar* baz* +____ is not an empty strong emphasis . -

foo bar baz

+

____ is not an empty strong emphasis

. + +Rule 11: + . -_foo _bar_ baz_ +foo *** . -

foo bar baz

+

foo ***

. . -**foo **bar** baz** +foo *\** . -

foo bar baz

+

foo *

. . -__foo __bar__ baz__ +foo *_* . -

foo bar baz

+

foo _

. . -*foo **bar** baz* +foo ***** . -

foo bar baz

+

foo *****

. . -_foo __bar__ baz_ +foo **\*** . -

foo bar baz

+

foo *

. . -**foo *bar* baz** +foo **_** . -

foo bar baz

+

foo _

. +Note that when delimiters do not match evenly, Rule 11 determines +that the excess literal `*` characters will appear outside of the +emphasis, rather than inside it: + . -__foo _bar_ baz__ +**foo* . -

foo bar baz

+

*foo

. . -**foo, *bar*, baz** +*foo** . -

foo, bar, baz

+

foo*

. . -__foo, _bar_, baz__ +***foo** . -

foo, bar, baz

+

*foo

. -But note: +. +****foo* +. +

***foo

+. . -*foo**bar**baz* +**foo*** . -

foobarbaz

+

foo*

. . -**foo*bar*baz** +*foo**** . -

foobarbaz**

+

foo***

. -The difference is that in the two preceding cases, -the internal delimiters [can close emphasis](#can-close-emphasis), -while in the cases with spaces, they cannot. -Note that you cannot nest emphasis directly inside emphasis -using the same delimeter: +Rule 12: . -**foo** +foo ___ . -

foo

+

foo ___

. -For this, you need to switch delimiters: - . -*_foo_* +foo _\__ . -

foo

+

foo _

. -Strong within strong is possible without switching -delimiters: - . -****foo**** +foo _*_ . -

foo

+

foo *

. . -____foo____ +foo _____ . -

foo

+

foo _____

. -Note that a `*` followed by a `*` can close emphasis, and -a `**` followed by a `*` can close strong emphasis (and -similarly for `_` and `__`): - . -*foo** +foo __\___ . -

foo*

+

foo _

. . -*foo *bar** +foo __*__ . -

foo bar

+

foo *

. . -**foo*** +__foo_ . -

foo*

+

_foo

. +Note that when delimiters do not match evenly, Rule 12 determines +that the excess literal `_` characters will appear outside of the +emphasis, rather than inside it: + . -***foo* bar*** +_foo__ . -

foo bar*

+

foo_

. . -***foo** bar*** +___foo__ . -

foo bar**

+

_foo

. . -*foo**** +____foo_ . -

foo***

+

___foo

. -The following contains no strong emphasis, because the opening -delimiter is closed by the first `*` before `bar`: +. +__foo___ +. +

foo_

+. . -*foo**bar*** +_foo____ . -

foobar**

+

foo___

. -We retain symmetry in these cases: +Rule 13 implies that if you want emphasis nested directly inside +emphasis, you must use different delimiters: . -*foo** - -**foo* +**foo** . -

foo*

-

*foo

+

foo

. . -*foo *bar** - -**foo* bar* +*_foo_* . -

foo bar

-

foo bar

+

foo

. . -**foo*** - -***foo** +__foo__ . -

foo*

-

*foo

+

foo

. . -**foo **bar**** - -****foo** bar** +_*foo*_ . -

foo bar

-

foo bar

+

foo

. - - -More cases with mismatched delimiters: +However, strong emphasis within strong emphasisis possible without +switching delimiters: . -*bar*** +****foo**** . -

bar**

+

foo

. . -***foo* +____foo____ . -

**foo

+

foo

. + +Rule 13 can be applied to arbitrarily long sequences of +delimiters: + . -**bar*** +******foo****** . -

bar*

+

foo

. +Rule 14: + . -***foo** +***foo*** . -

*foo

+

foo

. . -***foo *bar* +_____foo_____ . -

***foo bar

+

foo

. -The following cases illustrate rule 13: +Rule 15: . *foo _bar* baz_ @@ -4974,12 +5032,13 @@ The following cases illustrate rule 13: . . -**foo bar* baz** +**foo*bar** . -

foo bar baz*

+

foobar*

. -The following cases illustrate rule 14: + +Rule 16: . **foo **bar baz** @@ -4993,18 +5052,18 @@ The following cases illustrate rule 14:

*foo bar baz

. -The following cases illustrate rule 15: +Rule 17: . -*[foo*](bar) +*[bar*](/url) . -

*foo*

+

*bar*

. . -*![foo*](bar) +_foo [bar_](/url) . -

*foo*

+

_foo bar_

. . @@ -5014,11 +5073,42 @@ The following cases illustrate rule 15: . . -*a`a*` +** +. +

**

+. + +. +__ +. +

__

+. + +. +*a `*`* . -

*aa*

+

a *

. +. +_a `_`_ +. +

a _

+. + +. +**a +. +

**ahttp://foo.bar?q=**

+. + +. +__a +. +

__ahttp://foo.bar?q=__

+. + + ## Links A link contains a [link label](#link-label) (the visible text), -- cgit v1.2.3