From 53ff724c3f363f9b914040395b5d1ffcec04668b Mon Sep 17 00:00:00 2001 From: jsmall-nvidia Date: Mon, 8 Feb 2021 17:53:02 -0500 Subject: Hotfix/doc typo lexical (#1696) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix typo --- docs/language-reference/02-lexcial-structure.md | 119 ------------------------ docs/language-reference/02-lexical-structure.md | 119 ++++++++++++++++++++++++ 2 files changed, 119 insertions(+), 119 deletions(-) delete mode 100644 docs/language-reference/02-lexcial-structure.md create mode 100644 docs/language-reference/02-lexical-structure.md (limited to 'docs/language-reference') diff --git a/docs/language-reference/02-lexcial-structure.md b/docs/language-reference/02-lexcial-structure.md deleted file mode 100644 index e22ea608c..000000000 --- a/docs/language-reference/02-lexcial-structure.md +++ /dev/null @@ -1,119 +0,0 @@ -Lexical Structure -================= - -Source Units ------------- - -A _source unit_ comprises a sequence of zero or more _characters_ which for purposes of this document are defined as Unicode scalars (code points). -Implementations *may* accept source units stored as files on disk, buffers in memory, or any appropriate implementation-specified means. - -Encoding --------- - -When encoding is required, source units *should* be encoded using UTF-8. -Implementations *may* support additional implemented-specified encodings. - -Whitespace ----------- - -_Horizontal whitespace_ consists of space (U+0020) and horizontal tab (U+0009). - -A _line break_ consists of a line feed (U+000A), carriage return (U+000D) or a carriage return followed by a line feed (U+000D, U+000A). -Line breaks are used as line separators rather than terminators; it is not necessary for a source unit to end with a line break. - -Escaped Line Breaks -------------------- - -An _escaped line break_ comprises a backslack (`\`, U+005C) follow immediately by a line break. - -Comments --------- - -A _comment_ is either a line comment or a block comment: - -```hlsl -// a line comment -/* a block comment */ -``` - -A _line comment_ comprises two forward slashes (`/`, U+002F) followed by zero or more characters that do not contain a line break. -A line comment extends up to, but does not include, a subsequent line break or the end of the source unit. - -A _block comment_ begins with a forward slash (`/`, U+002F) followed by an asterisk (`*`, U+0052). -A block comment is terminated by the next instance of an asterisk followed by a forward slash (`*/`). -A block comment contains all characters between where it begins and where it terminates, including any line breaks. -Block comments do not nest. -It is an error if a block comment that begins in a source unit is not terminated in that source unit. - -Phases ------- - -Compilation of a source unit proceeds _as if_ the following steps are executed in order: - -1. Line numbering (for subsequent diagnostic messages) is noted based on the locations of line breaks - -2. Escaped line breaks are eliminated. No new characters are inserted to replace them. Any new escaped line breaks introduced by this step are not eliminated. - -3. All comments are replaced with a single space (U+0020) - -4. The source unit is _lexed_ into a sequence of tokens according the lexical grammar in this chapter - -5. The lexed sequence of tokens is _preprocessed_ to produce a new sequence of tokens (Chapter 3) - -6. Subsequent processing is performed on the preprocessed sequence of tokens - -Identifiers ------------ - -An _identifier_ begins with an uppercase or lowercase ASCII letter (`A` through `Z`, `a` through `z`), or an underscore (`_`). -After the first character, ASCII digits (`0` through `9`) may also be used in an identifier. - -The identifier consistent of a single underscore (`_`) is reserved by the language and must not be used by programs. -Otherwise, there are no fixed keywords or reserved words. -Words that name a built-in language construct can also be used as user-defined identifiers and will shadow the built-in definitions in the scope of their definition. - -Literals --------- - -### Integer Literals - -An _integer literal_ consists of an optional radix specifier followed by digits and an optional suffix. - -The _radix specifier_ may be: - -* `0x` or `0X` to specify a hexadecimal literal (radix 16) -* `0b` or `0B` to specify a binary literal (radix 2) - -When no radix specifier is present a radix of 10 is used. - -Octal literals (radix 8) are not supported. -A `0` prefix on an integer literal does *not* specify an octal literal as it does in C. -Implementations *may* warn on integer literals with a `0` prefix in case users expect C behavior. - -The _digits_ of an integer literal may include ASCII `0` through `9`. -In the case of a hexadecimal literal, digits may include the letters `A` through `F` (and `a` through `f`) which represent digit values of 10 through 15. -It is an error for an integer literal to include a digit with a value greater than or equal to the radix. -The digits of an integer literal may also include underscore (`_`) characters, which are ignored and have no semantic impact. - -The _suffix_ on an integer literal may be used to indicate the desired type of the literal: - -* A `u` suffix indicates the `uint` type -* An `l` or `ll` suffix indicates the `int64_t` type -* A `ul` or `ull` suffix indicates the `uint64_t` type - -### Floating-Point Literals - -> Note: This section is not yet complete. - -### String Literals - -> Note: This section is not yet complete. - -### Character Literals - -> Note: This section is not yet complete. - -Operators and Punctuation -------------------------- - -> Note: This section is not yet complete. diff --git a/docs/language-reference/02-lexical-structure.md b/docs/language-reference/02-lexical-structure.md new file mode 100644 index 000000000..e22ea608c --- /dev/null +++ b/docs/language-reference/02-lexical-structure.md @@ -0,0 +1,119 @@ +Lexical Structure +================= + +Source Units +------------ + +A _source unit_ comprises a sequence of zero or more _characters_ which for purposes of this document are defined as Unicode scalars (code points). +Implementations *may* accept source units stored as files on disk, buffers in memory, or any appropriate implementation-specified means. + +Encoding +-------- + +When encoding is required, source units *should* be encoded using UTF-8. +Implementations *may* support additional implemented-specified encodings. + +Whitespace +---------- + +_Horizontal whitespace_ consists of space (U+0020) and horizontal tab (U+0009). + +A _line break_ consists of a line feed (U+000A), carriage return (U+000D) or a carriage return followed by a line feed (U+000D, U+000A). +Line breaks are used as line separators rather than terminators; it is not necessary for a source unit to end with a line break. + +Escaped Line Breaks +------------------- + +An _escaped line break_ comprises a backslack (`\`, U+005C) follow immediately by a line break. + +Comments +-------- + +A _comment_ is either a line comment or a block comment: + +```hlsl +// a line comment +/* a block comment */ +``` + +A _line comment_ comprises two forward slashes (`/`, U+002F) followed by zero or more characters that do not contain a line break. +A line comment extends up to, but does not include, a subsequent line break or the end of the source unit. + +A _block comment_ begins with a forward slash (`/`, U+002F) followed by an asterisk (`*`, U+0052). +A block comment is terminated by the next instance of an asterisk followed by a forward slash (`*/`). +A block comment contains all characters between where it begins and where it terminates, including any line breaks. +Block comments do not nest. +It is an error if a block comment that begins in a source unit is not terminated in that source unit. + +Phases +------ + +Compilation of a source unit proceeds _as if_ the following steps are executed in order: + +1. Line numbering (for subsequent diagnostic messages) is noted based on the locations of line breaks + +2. Escaped line breaks are eliminated. No new characters are inserted to replace them. Any new escaped line breaks introduced by this step are not eliminated. + +3. All comments are replaced with a single space (U+0020) + +4. The source unit is _lexed_ into a sequence of tokens according the lexical grammar in this chapter + +5. The lexed sequence of tokens is _preprocessed_ to produce a new sequence of tokens (Chapter 3) + +6. Subsequent processing is performed on the preprocessed sequence of tokens + +Identifiers +----------- + +An _identifier_ begins with an uppercase or lowercase ASCII letter (`A` through `Z`, `a` through `z`), or an underscore (`_`). +After the first character, ASCII digits (`0` through `9`) may also be used in an identifier. + +The identifier consistent of a single underscore (`_`) is reserved by the language and must not be used by programs. +Otherwise, there are no fixed keywords or reserved words. +Words that name a built-in language construct can also be used as user-defined identifiers and will shadow the built-in definitions in the scope of their definition. + +Literals +-------- + +### Integer Literals + +An _integer literal_ consists of an optional radix specifier followed by digits and an optional suffix. + +The _radix specifier_ may be: + +* `0x` or `0X` to specify a hexadecimal literal (radix 16) +* `0b` or `0B` to specify a binary literal (radix 2) + +When no radix specifier is present a radix of 10 is used. + +Octal literals (radix 8) are not supported. +A `0` prefix on an integer literal does *not* specify an octal literal as it does in C. +Implementations *may* warn on integer literals with a `0` prefix in case users expect C behavior. + +The _digits_ of an integer literal may include ASCII `0` through `9`. +In the case of a hexadecimal literal, digits may include the letters `A` through `F` (and `a` through `f`) which represent digit values of 10 through 15. +It is an error for an integer literal to include a digit with a value greater than or equal to the radix. +The digits of an integer literal may also include underscore (`_`) characters, which are ignored and have no semantic impact. + +The _suffix_ on an integer literal may be used to indicate the desired type of the literal: + +* A `u` suffix indicates the `uint` type +* An `l` or `ll` suffix indicates the `int64_t` type +* A `ul` or `ull` suffix indicates the `uint64_t` type + +### Floating-Point Literals + +> Note: This section is not yet complete. + +### String Literals + +> Note: This section is not yet complete. + +### Character Literals + +> Note: This section is not yet complete. + +Operators and Punctuation +------------------------- + +> Note: This section is not yet complete. -- cgit v1.2.3