From c818df9888d452f0ae54b3a504eefdd970fd73d8 Mon Sep 17 00:00:00 2001
From: John MacFarlane
Date: Fri, 24 Oct 2014 20:09:53 -0700
Subject: Spec: say explicitly that a character is a unicode code point.

---
 spec.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/spec.txt b/spec.txt
index 4d2a987..e3cf027 100644
--- a/spec.txt
+++ b/spec.txt
@@ -192,10 +192,10 @@ In the examples, the `→` character is used to represent tabs.
 # Preprocessing
 
 A [line](#line)
-is a sequence of zero or more characters followed by a line
-ending (CR, LF, or CRLF) or by the end of
-file.
+is a sequence of zero or more [characters](#character) followed by a
+line ending (CR, LF, or CRLF) or by the end of file.
 
+A [character](#character) is a unicode code point.
 This spec does not specify an encoding; it thinks of lines as composed
 of characters rather than bytes. A conforming parser may be limited
 to a certain encoding.
-- 
cgit v1.2.3
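
The definitions touched by this patch can be illustrated with a minimal sketch. It is not taken from the spec or from any reference implementation: `split_lines` is a hypothetical helper, and it assumes the input has already been decoded into a Python `str`, whose elements are Unicode code points. It also assumes that an empty input yields no lines, a case the patched text does not settle.

```python
import re

# Hypothetical helper for illustration only.
# A line ending is CRLF, CR, or LF. CRLF comes first in the pattern so that
# "\r\n" is consumed as a single line ending rather than as two.
_LINE_ENDING = re.compile(r"\r\n|\r|\n")


def split_lines(text: str) -> list[str]:
    """Split decoded text into lines, dropping the line endings."""
    lines = _LINE_ENDING.split(text)
    # A final line ending terminates the last line; it does not open a new,
    # empty one. (Assumption: empty input therefore yields zero lines.)
    if lines and lines[-1] == "":
        lines.pop()
    return lines


# Lines are made of code points, not bytes: U+00E9 is one character here,
# although it occupies two bytes once encoded as UTF-8.
line = split_lines("caf\u00e9\r\n")[0]
code_points = len(line)
utf8_bytes = len(line.encode("utf-8"))
```

Here `code_points` and `utf8_bytes` differ for the same line, which is the distinction the added sentence draws between characters and bytes.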