Structured Document
PEG expression
structured_document = section_content*
section_content = section
                / section_delimiter
                / block
The base building block of the structured document is the flat block from the previous section. (Technically, in PEG we have to unpack the flat_block to match on its indent and base_block kinds.)
I'll explain why we have this kind of two-layer parsing in the macro system section.
Building on the basic components introduced earlier, this section introduces the following concepts:
Section / Section Delimiter
Indent Groups
Section / Section Delimiter
PEG expression
section = heading section_content*
section_delimiter = blank_heading
section_content = section
                / section_delimiter
                / block
A section is basically a renamed version of the heading from the v1 spec.
A section has a single heading and contains the following items as its children:
other sections at a deeper level (more asterisks)
section delimiters at a deeper level
blocks
A section delimiter is an unindented blank heading. Because it has a level and won't greedily consume the following blocks as content, we can use it to end a section:
* heading 1
paragraph under level 1 section
** heading 2
paragraph under level 2 section
*** heading 3
paragraph under level 3 section
**
paragraph under section 1

We just jumped from level 3 section to level 1 section.
*
paragraph outside of level 1 section
In CST (blank_line nodes omitted for readability):
(section ; level 1 section
  heading: (heading title: (title))
  (paragraph)
  (section
    heading: (heading title: (title))
    (paragraph)
    (section
      heading: (heading title: (title))
      (paragraph)))
  (section_delimiter)
  (paragraph)
  (paragraph))
(section_delimiter)
(paragraph)
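The way headings and section delimiters close open sections can be sketched in Python. This is only an illustration under stated assumptions, not the spec's parser: `parse_sections` and its tuple-based tree are hypothetical, only unindented lines are handled, and every non-heading line is treated as paragraph text.

```python
import re

# One or more asterisks, optionally followed by whitespace and a title.
HEADING = re.compile(r"^(\*+)(?:\s+(.*))?$")

def parse_sections(lines):
    """Build a nested tree from unindented lines (hypothetical helper)."""
    root = []               # children that sit outside every section
    stack = []              # open sections as (level, children) pairs
    paragraph_open = False  # True while consecutive text lines extend one paragraph

    def current():
        return stack[-1][1] if stack else root

    for line in lines:
        text = line.rstrip()
        match = HEADING.match(text)
        if match:
            level, title = len(match.group(1)), match.group(2)
            # A heading or delimiter closes every open section at its level or deeper.
            while stack and stack[-1][0] >= level:
                stack.pop()
            if title:
                children = []
                current().append(("section", level, children))
                stack.append((level, children))
            else:
                # A blank heading only closes sections; it never opens one.
                current().append(("section_delimiter", level))
            paragraph_open = False
        elif not text:
            paragraph_open = False
        elif not paragraph_open:
            current().append(("paragraph",))
            paragraph_open = True
    return root

doc = [
    "* heading 1", "paragraph under level 1 section",
    "** heading 2", "paragraph under level 2 section",
    "*** heading 3", "paragraph under level 3 section",
    "**", "paragraph under section 1", "",
    "We just jumped from level 3 section to level 1 section.",
    "*", "paragraph outside of level 1 section",
]
tree = parse_sections(doc)
```

With the example document above, the level 2 delimiter ends up as a child of the level 1 section, and the level 1 delimiter ends up at the top level, which is the same shape as the CST.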
Indent Groups
PEG expression
Indentation level matters here because the parsing is context-sensitive, so I'm using a special generic-style syntax like <A> to compare indentation levels.
indented_group = unordered_list
               / ordered_list
               / quote
unordered_list<A> = unordered_list_item<A>+
# where: B >= A && C > A
unordered_list_item<A> = unordered_indented_block<A>
                         (null_indented_block<B> / indented_group<C>)*
# L: indent level
unordered_indented_block<L> = unordered_indent<L>
                              whitespace
                              (attributes whitespace)?
                              base_block
# L: indent level
null_indented_block<L> = null_indent<L>
                         whitespace
                         (attributes whitespace)?
                         base_block
An indent group is a redesigned syntax for the nestable detached modifier from v1. Broadly speaking, an indent group is a group of flat_blocks that are indented with the same indent kind.
There are three types of indent groups:
unordered list
ordered list
quote
All indent groups are basically identical except for their indent kinds, so I'll explain using the unordered list here.
An unordered list is a list of unordered list items. An unordered list item starts with a single flat block that is indented with the unordered kind and can have the following items as children:
subsequent null-indented flat blocks at the same or a deeper level (B >= A)
subsequent indent groups at a deeper level (C > A)
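The child rule (B >= A for null-indented blocks, C > A for nested indent groups) can be sketched as a scan over flat blocks. This is a rough illustration only: `item_spans` is a hypothetical helper, and flat blocks are modelled as assumed `(kind, level)` pairs rather than parsed from real markers.

```python
def item_spans(blocks):
    """Split one unordered_list<A> into (start, end) spans, one per list item.

    `blocks` is a list of (kind, level) pairs, where kind is "unordered",
    "ordered", "quote" or "null". Scanning stops at the first block that
    cannot start another item at the list's level A.
    """
    if not blocks:
        return []
    list_level = blocks[0][1]
    spans = []
    i = 0
    while i < len(blocks) and blocks[i] == ("unordered", list_level):
        j = i + 1
        while j < len(blocks):
            kind, level = blocks[j]
            null_child = kind == "null" and level >= list_level    # B >= A
            nested_group = kind != "null" and level > list_level   # C > A
            if not (null_child or nested_group):
                break
            j += 1
        spans.append((i, j))
        i = j
    return spans

# One item holding two extra null-indented blocks.
single_item = item_spans([("unordered", 1), ("null", 1), ("null", 1)])

# A level 2 sublist stays inside the first item; the next level 1 block starts a new item.
two_items = item_spans([
    ("unordered", 1), ("null", 1),
    ("unordered", 2), ("null", 2),
    ("unordered", 1),
])
```

Here `single_item` is one span covering all three blocks, and `two_items` splits at the second level 1 unordered block.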
Example 1 - Usage of null indents
A null indent changes a base block's indent level without assigning any meaning to the indent kind. We can use this to include multiple blocks under a single list item, similar to putting multiple <p> tags under an <li> tag in HTML.
- paragraph 1
/ paragraph 2
/ @code
even with code blocks
@end
Example 2 - Indented group delimiter
An unindented blank line has level 0, so it cannot be contained in any indented group. An unindented blank line can therefore be used as an indented group delimiter.
- this is
- 1st unordered
- list

- this is
- 2nd unordered
- list
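The delimiter behaviour can be sketched as a simple split on unindented blank lines. `split_groups` is a hypothetical helper, and the sketch assumes that a line containing only an indent marker (such as `/` or `>`) counts as an indented blank line, so it does not split the group.

```python
def split_groups(lines):
    """Split lines into indented groups at unindented (level 0) blank lines."""
    groups, current = [], []
    for line in lines:
        if not line.strip():
            # Level 0 blank line: it cannot belong to any indented group.
            if current:
                groups.append(current)
                current = []
        else:
            current.append(line)
    if current:
        groups.append(current)
    return groups

two_lists = split_groups([
    "- this is", "- 1st unordered", "- list",
    "",
    "- this is", "- 2nd unordered", "- list",
])

# A null-indented blank line ("/") is still indented, so the item stays whole.
one_item = split_groups(["- paragraph 1", "/", "/ paragraph 2"])
```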
Example 3 - Visual separator between indented paragraphs
Remember that a blank line is actually one of the base blocks. That means we can write:
- paragraph 1
/
/ paragraph 2
This is a valid single unordered list item with two paragraphs.
Another example is a quote with two paragraphs:
> paragraph 1
>
> paragraph 2
This is a single block quote with two paragraphs.
Example 4 - List right under the quote
>
-- unordered list
-- right under
-- the quote
This corresponds to the > - syntax in Markdown. It is much easier to parse while retaining readability.
Carryover Tag
Let me explain how the carryover tag I mentioned earlier operates.
The next node that a carryover tag accepts can be defined as: "a following node that exists under the same indent group as the carryover tag".
I'm explaining this now because we can attach it to a sublist:
> #color red
-- this is a red level 2 list
-- under level 1 quote
-- (still red)
Here, #color red consumes the level 2 unordered list, which is under the level 1 quote item. Because the unordered list itself is a sibling of the carryover tag, the definition of the next node applies.
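The sibling rule can be sketched on an already-parsed list of children. Everything here is an assumption made for illustration: `attach_carryover`, the dict-based node shape, and the choice to leave a tag with no following sibling unattached are not taken from the spec.

```python
def attach_carryover(siblings):
    """Attach each carryover tag to the next sibling in the same indent group."""
    result, pending = [], []
    for node in siblings:
        if node["type"] == "carryover_tag":
            pending.append(node)
            continue
        node = dict(node)  # copy so the input list is left untouched
        if pending:
            node["tags"] = [tag["text"] for tag in pending]
            pending = []
        result.append(node)
    result.extend(pending)  # tags without a following sibling stay unattached
    return result

# Children of the level 1 quote item from the example above.
quote_item_children = [
    {"type": "carryover_tag", "text": "color red"},
    {"type": "unordered_list", "level": 2, "items": [
        "this is a red level 2 list",
        "under level 1 quote",
        "(still red)",
    ]},
]
attached = attach_carryover(quote_item_children)
```

The tag ends up on the whole level 2 list rather than on its first item, because the list, not the item, is the tag's sibling.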