Enhancements Beyond Spec
This document tracks djot-php enhancements that go beyond the current djot specification but align with its direction.
These are fixes or improvements for edge cases that the spec does not explicitly cover. They are either in the process of being incorporated upstream or may be incorporated into future spec versions.
Table of Contents
- Tab Indentation Support
- Multiple Footnote References
- Section ID Excludes Footnote Markers
- CSS-Safe Heading IDs
- Symbol Parsing in Time Formats
- Em/En Dash with Unmatched Braces
- Optional Modes
- Language Features Beyond Spec
- Task List Underscore Notation
- List Item Attributes
- Table Row and Cell Attributes
- Boolean Attribute Shorthand
- Fenced Comment Blocks
- Multiple Definition Terms
- Multiple Definition Definitions
- Definition List Element Attributes
- Table Multi-line Cells, Rowspan, and Colspan
- Captions for Images, Tables, and Block Quotes
- Abbreviations (PHP Markdown Extra Style)
- Testing
- Upstream Tracking
- Reporting Issues
Tab Indentation Support
Related: jgm/djot#255
Status: Implemented in djot-php
The djot spec doesn't explicitly define tab handling. We implemented consistent tab support:
Indentation (Leading Whitespace)
Tabs at the start of lines count as 2 spaces (one indentation level):
- Level 1
	- Level 2 (tab-indented)
		- Level 3 (two tabs)
This applies to:
- Nested lists
- List item continuation
- Footnote continuation
- Definition list content
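
As a rough illustration of the indentation rule above, the following sketch measures leading whitespace with each tab counted as two spaces. The function name `indentWidth` is hypothetical and is not part of the djot-php API; the real parser may handle this differently.

```php
<?php
declare(strict_types=1);

/**
 * Measure leading indentation, counting each tab as 2 spaces.
 * Hypothetical helper for illustration; not part of djot-php.
 */
function indentWidth(string $line): int
{
    $width = 0;
    $length = strlen($line);
    for ($i = 0; $i < $length; $i++) {
        if ($line[$i] === "\t") {
            $width += 2; // one tab = one indentation level
        } elseif ($line[$i] === ' ') {
            $width += 1;
        } else {
            break; // first non-whitespace character ends the indent
        }
    }

    return $width;
}

echo indentWidth("- Level 1"), "\n";     // 0
echo indentWidth("\t- Level 2"), "\n";   // 2
echo indentWidth("\t\t- Level 3"), "\n"; // 4
```

Only leading whitespace is measured, which matches the distinction drawn in this section: tabs matter for nesting depth, not as the delimiter after a block marker.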
Syntax Delimiters (Space After Markers)
The space after block markers (#, -, >, :, etc.) must be a space, not a tab:
# Heading ✓ (space after #)
# Heading ✗ (tab after # - not a heading)
- List item ✓ (space after -)
- Item ✗ (tab after - - not a list)
> Quote ✓ (space after >)
> Quote ✗ (tab after > - not a blockquote)
Rationale: The space after markers is a syntax delimiter (alignment), not indentation. Tabs are only meaningful for nesting depth at line start.
Multiple Footnote References
Related: jgm/djot#348
Status: Implemented in djot-php
When the same footnote is referenced multiple times, each reference gets a unique ID with multiple backlinks:
First reference[^note] and second reference[^note] and third[^note].
[^note]: This footnote is referenced three times.
Output:
<p>First reference<a id="fnref1" href="#fn1" role="doc-noteref"><sup>1</sup></a>
and second reference<a id="fnref1-2" href="#fn1" role="doc-noteref"><sup>1</sup></a>
and third<a id="fnref1-3" href="#fn1" role="doc-noteref"><sup>1</sup></a>.</p>
<section role="doc-endnotes">
<ol>
<li id="fn1">
<p>This footnote is referenced three times.
<a href="#fnref1" role="doc-backlink">↩︎</a>
<a href="#fnref1-2" role="doc-backlink">↩︎</a>
<a href="#fnref1-3" role="doc-backlink">↩︎</a></p>
</li>
</ol>
</section>
Features:
- Unique IDs: fnref1, fnref1-2, fnref1-3
- Multiple backlinks in footnote content
- Proper ARIA roles for accessibility
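
The ID scheme shown in the output can be sketched as follows. The helper names `footnoteRefId` and `assignRefIds` are hypothetical and only mirror the IDs shown above; they are not taken from the djot-php source.

```php
<?php
declare(strict_types=1);

/**
 * Return the reference ID for the Nth occurrence of footnote $number.
 * Hypothetical helper; the ID scheme mirrors the output shown above.
 */
function footnoteRefId(int $number, int $occurrence): string
{
    $id = 'fnref' . $number;

    // The first reference keeps the plain ID; later ones get a numeric suffix.
    return $occurrence === 1 ? $id : $id . '-' . $occurrence;
}

/** Assign reference IDs to a sequence of footnote numbers in document order. */
function assignRefIds(array $footnoteNumbers): array
{
    $seen = [];
    $ids = [];
    foreach ($footnoteNumbers as $number) {
        $seen[$number] = ($seen[$number] ?? 0) + 1;
        $ids[] = footnoteRefId($number, $seen[$number]);
    }

    return $ids;
}

print_r(assignRefIds([1, 1, 1])); // fnref1, fnref1-2, fnref1-3
```

The same list of IDs can then be used to emit one backlink per reference inside the footnote body.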
Section ID Excludes Footnote Markers
Related: jgm/djot#349
Status: Implemented in djot-php
Auto-generated section IDs correctly exclude footnote reference markers:
# Introduction[^1]
[^1]: A footnote in the heading.
Output:
<section id="Introduction">
<h1>Introduction<a href="#fn1"><sup>1</sup></a></h1>
</section>
The ID is Introduction, not Introduction1 or Introduction[^1].
CSS-Safe Heading IDs
Related: php-collective/djot-php#92
Status: Implemented in djot-php
Auto-generated heading IDs are normalized to be valid CSS selectors, ensuring compatibility with querySelector(), HTMX scroll restoration, and CSS attribute selectors.
Normalization Rules
- Strip # characters: prevents invalid selectors
- Trim whitespace: removes leading/trailing spaces
- Whitespace to dashes: spaces become a single -
- Invalid characters to dashes: only Unicode letters (\p{L}), numbers (\p{N}), hyphens, and underscores are preserved
- Collapse consecutive dashes: foo--bar becomes foo-bar
- Trim leading/trailing dashes: -foo- becomes foo
- Prefix digits: IDs starting with a number get an h- prefix (CSS requirement)
- Fallback: empty results become heading
Examples
| Heading | Generated ID |
|---|---|
| # Hello World | Hello-World |
| # Hello World! | Hello-World |
| # 日本語の見出し | 日本語の見出し |
| # Привет мир | Привет-мир |
| # E=mc^2 | E-mc-2 |
| # 123 Numbers First | h-123-Numbers-First |
| # $this->method() | this-method |
| # ### | heading |
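
The normalization rules can be approximated with a short chain of string operations. This is a minimal sketch, assuming the rules are applied in the order listed; `cssSafeId` is a hypothetical name and the code is not djot-php's actual implementation.

```php
<?php
declare(strict_types=1);

/**
 * Normalize heading text into a CSS-safe ID.
 * Hypothetical helper that follows the listed rules; not djot-php's code.
 */
function cssSafeId(string $text): string
{
    $id = str_replace('#', '', $text);                           // strip # characters
    $id = trim($id);                                             // trim whitespace
    $id = (string) preg_replace('/\s+/u', '-', $id);             // whitespace runs become one dash
    $id = (string) preg_replace('/[^\p{L}\p{N}_-]/u', '-', $id); // invalid characters to dashes
    $id = (string) preg_replace('/-{2,}/', '-', $id);            // collapse consecutive dashes
    $id = trim($id, '-');                                        // trim leading/trailing dashes

    if ($id === '') {
        return 'heading'; // fallback for empty results
    }
    if (preg_match('/^[0-9]/', $id) === 1) {
        $id = 'h-' . $id; // assumption: only ASCII digits trigger the prefix
    }

    return $id;
}

foreach (['Hello World!', 'E=mc^2', '123 Numbers First', '$this->method()', '###'] as $heading) {
    echo $heading, ' => ', cssSafeId($heading), "\n";
}
```

The `u` modifier keeps the Unicode property classes working on UTF-8 input, which is what lets headings such as 日本語の見出し pass through unchanged.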
Unicode Preservation
International characters are preserved while special characters are normalized:
# 日本語の見出し
# Cześć świecie
Output:
<h1 id="日本語の見出し">日本語の見出し</h1>
<h1 id="Cześć-świecie">Cześć świecie</h1>
Why This Matters
Without CSS-safe normalization, headings with special characters would break:
// This would throw SyntaxError with unsafe IDs
document.querySelector('#E=mc^2'); // Invalid selector
htmx.scrollToElement('#$this->foo'); // Invalid selector
With normalization, these work correctly:
document.querySelector('#E-mc-2'); // Works
htmx.scrollToElement('#this-foo'); // Works
Explicit IDs
You can always override with an explicit ID attribute:
# My Heading {#custom-id}
Explicit IDs are used as-is without normalization.
Symbol Parsing in Time Formats
Related: jgm/djot#350
Status: Implemented in djot-php
Colons in time formats are not parsed as symbol delimiters:
The meeting is at 10:30:00.
Output:
<p>The meeting is at 10:30:00.</p>
The colons are not incorrectly parsed as symbols like :30:.
Em/En Dash with Unmatched Braces
Related: jgm/djot#125
Status: Implemented in djot-php
Unmatched {- does not prevent em/en-dash conversion:
{--- produces em-dash
{-- produces en-dash
Output:
<p>{— produces em-dash
{– produces en-dash</p>
Optional Modes
These are optional parser modes that deviate from spec behavior for specific use cases.
Significant Newlines Mode
Related: jgm/djot#161
Status: Implemented in djot-php (opt-in)
An optional mode for chat messages, comments, and quick notes where markdown-like behavior is more intuitive.
Enable via:
// Factory method
$converter = DjotConverter::withSignificantNewlines();
// Constructor parameter
$converter = new DjotConverter(significantNewlines: true);
// Parser directly
$parser = new BlockParser(significantNewlines: true);
Changes from spec:
| Behavior | Standard Mode | Significant Newlines Mode |
|---|---|---|
| Block elements interrupt paragraphs | No (blank line required) | Yes |
| Nested lists need blank lines | Yes | No |
| Soft breaks render as | \n or space | <br> |
Example:
Here is a list:
- item one
- item two
Standard mode output:
<p>Here is a list:
- item one
- item two</p>
Significant newlines mode output:
<p>Here is a list:</p>
<ul>
<li>item one</li>
<li>item two</li>
</ul>
Escaping: In this mode, escape block markers to keep them literal:
They said:
\> This is not a blockquote
Language Features Beyond Spec
These are djot syntax features we've implemented that aren't yet in the upstream spec.
Task List Underscore Notation
Related: jgm/djot#305
Status: Implemented in djot-php
The underscore [_] can be used as an alternative to space [ ] for unchecked task list items:
- [_] unchecked with underscore
- [ ] unchecked with space
- [x] checked item
Output:
<ul class="task-list">
<li><input type="checkbox" disabled> unchecked with underscore</li>
<li><input type="checkbox" disabled> unchecked with space</li>
<li><input type="checkbox" disabled checked> checked item</li>
</ul>
Rationale: The underscore notation is useful when:
- Typing on mobile devices where spaces inside brackets can be difficult
- Using editors without monospaced fonts where [ ] may look ambiguous
- You want a marker that visually resembles an empty checkbox in source
Both notations are fully equivalent and can be mixed within the same list.
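
The equivalence of the two unchecked notations can be sketched with a single pattern. The function `taskState` is a hypothetical helper, not part of djot-php, and accepting an uppercase X as checked is an assumption based on common djot behavior.

```php
<?php
declare(strict_types=1);

/**
 * Read a task-list marker at the start of list item content.
 * Returns true (checked), false (unchecked), or null (not a task item).
 * Hypothetical helper; accepting uppercase X is an assumption.
 */
function taskState(string $content): ?bool
{
    // "[ ]" and "[_]" are both unchecked; the marker must be followed by a space.
    if (preg_match('/^\[([ _xX])\] /', $content, $m) !== 1) {
        return null;
    }

    return $m[1] === 'x' || $m[1] === 'X';
}

var_dump(taskState('[_] unchecked with underscore')); // bool(false)
var_dump(taskState('[ ] unchecked with space'));      // bool(false)
var_dump(taskState('[x] checked item'));              // bool(true)
var_dump(taskState('plain item'));                    // NULL
```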
List Item Attributes
Related: jgm/djot#262
Status: Implemented in djot-php (PR #5)
Attributes can be added to list items on the following indented line:
- item 1
  {.highlight #id1}
- item 2
  {data-value="test"}
- item 3
Output:
<ul>
<li class="highlight" id="id1">item 1</li>
<li data-value="test">item 2</li>
<li>item 3</li>
</ul>
Works with all list types:
1. First item
   {.important}
2. Second item
- [ ] Unchecked task
  {.pending}
- [x] Completed task
  {.done}
Rules:
- Attributes go on the next line, at the content indentation level
- Uses standard {.class #id key=value} syntax
- Works with unordered, ordered, and task lists
Table Row and Cell Attributes
Related: jgm/djot#250
Status: Implemented in djot-php (issue #18)
Attributes can be added to table rows and cells:
Row attributes (after final pipe):
| Name | Age |{.header-row}
|------|-----|
| John | 30 |{.highlight}
Cell attributes (after opening pipe):
|{.name} Name |{.age} Age |
|-------------|-----------|
|{.emphasis} John | 30 |
Output:
<table>
<tr class="header-row">
<th class="name">Name</th>
<th class="age">Age</th>
</tr>
<tr class="highlight">
<td class="emphasis">John</td>
<td>30</td>
</tr>
</table>
Rules:
- Row attributes: | cell | cell |{.class} (after final pipe)
- Cell attributes: |{.class} content | (after opening pipe)
- Separator row attributes are ignored: |---|---|{.ignored}
- Attributes are preserved when rows are converted to headers
- Works with alignment specifiers
Boolean Attribute Shorthand
Related: jgm/djot#257
Status: Implemented in djot-php
Boolean/flag attributes can be specified without a value for cleaner syntax:
{reversed}
1. Third
2. Second
3. First
::: details
{open}
This is expanded by default.
:::
[Download](file.zip){download .btn}
Output:
<ol reversed="">
<li>Third</li>
<li>Second</li>
<li>First</li>
</ol>
<details open="">
<p>This is expanded by default.</p>
</details>
<p><a href="file.zip" class="btn" download="">Download</a></p>
Supported syntax:
- {reversed} - bare attribute name (no = required)
- {hidden .class} - combinable with classes
- {#id open disabled} - multiple boolean attributes with ID
- {.alert hidden data-value="x"} - mixed with key=value attributes
- [text](url){download} - works on inline links too
Common use cases:
- {reversed} - reversed ordered lists
- {open} - expanded <details> elements
- {hidden} - hidden elements
- {download} - downloadable links
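
To make the shorthand concrete, here is a minimal sketch of an attribute parser that treats a bare name as a boolean attribute with an empty value. The functions `parseAttributes` and `renderAttributes` are hypothetical and simplified (no escapes inside quoted values, no comments); they are not the djot-php parser.

```php
<?php
declare(strict_types=1);

/**
 * Parse an attribute block such as {#id .class key="value" open}.
 * Hypothetical, simplified helper: bare names become boolean attributes ('').
 */
function parseAttributes(string $source): array
{
    $inner = trim($source, "{} \t\n");
    $attrs = [];
    $classes = [];

    // A token is either key="quoted value" or any run of non-space characters.
    preg_match_all('/[^\s"=]+="[^"]*"|\S+/', $inner, $matches);
    foreach ($matches[0] as $token) {
        if ($token[0] === '#') {
            $attrs['id'] = substr($token, 1);
        } elseif ($token[0] === '.') {
            $classes[] = substr($token, 1);
        } elseif (str_contains($token, '=')) {
            [$key, $value] = explode('=', $token, 2);
            $attrs[$key] = trim($value, '"');
        } else {
            $attrs[$token] = ''; // bare name: boolean attribute
        }
    }
    if ($classes !== []) {
        $attrs['class'] = implode(' ', $classes);
    }

    return $attrs;
}

/** Render attributes as HTML; boolean attributes come out as name="". */
function renderAttributes(array $attrs): string
{
    $out = '';
    foreach ($attrs as $name => $value) {
        $out .= sprintf(' %s="%s"', $name, htmlspecialchars($value, ENT_QUOTES));
    }

    return $out;
}

echo '<ol', renderAttributes(parseAttributes('{reversed}')), ">\n"; // <ol reversed="">
```

Storing boolean attributes as empty strings keeps them in the same map as ordinary key=value pairs, which is consistent with the `reversed=""` form shown in the output above.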
Fenced Comment Blocks
Related: jgm/djot#67
Status: Implemented in djot-php
Standard {% %} comments cannot contain blank lines (they act as paragraph separators). Fenced comment blocks using %%% solve this:
%%%
This comment can contain
blank lines
and multiple paragraphs.
%%%
Output:
<!-- nothing rendered -->
Features:
- Uses %%% (3+ percent signs) as delimiters
- Closing fence must have at least as many % as the opening fence
- Blank lines inside are preserved in the Comment node
- Like code fences, use more % to include %%% inside
%%%%
%%% this is not the end
still inside
%%%%
Rationale: The % character is already associated with comments in Djot ({% %}). This fenced syntax is consistent with code fences (```) and div fences (:::).
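
The fence-matching rule (a closing fence needs at least as many percent signs as the opening one) can be sketched as a small line filter. The names `fenceLength` and `stripFencedComments` are hypothetical, and the sketch assumes a fence line contains nothing but percent signs and optional surrounding whitespace.

```php
<?php
declare(strict_types=1);

/** Return the number of % signs if the line is a fence (3+), otherwise 0. */
function fenceLength(string $line): int
{
    // Assumption: a fence line holds only percent signs plus optional whitespace.
    return preg_match('/^\s*(%{3,})\s*$/', $line, $m) === 1 ? strlen($m[1]) : 0;
}

/**
 * Drop fenced comment blocks from a list of lines.
 * Hypothetical helper; an unclosed comment swallows the rest of the input.
 */
function stripFencedComments(array $lines): array
{
    $kept = [];
    $openLength = 0; // 0 means we are not inside a comment
    foreach ($lines as $line) {
        $length = fenceLength($line);
        if ($openLength === 0) {
            if ($length >= 3) {
                $openLength = $length; // opening fence
                continue;
            }
            $kept[] = $line;
        } elseif ($length >= $openLength) {
            $openLength = 0; // closing fence is at least as long as the opening one
        }
        // Any other line inside the comment is discarded.
    }

    return $kept;
}

$lines = ['before', '%%%%', '%%% this is not the end', 'still inside', '%%%', '%%%%', 'after'];
echo implode("\n", stripFencedComments($lines)), "\n"; // prints "before" and "after"
```

In this sketch the shorter `%%%` line inside the four-sign block is simply treated as comment content, which mirrors how longer code fences can contain shorter ones.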
Multiple Definition Terms
Status: Implemented in djot-php
Multiple terms can share definitions in definition lists:
: CLI
: Command Line Interface

  A text-based interface for interacting with computers.

: color
: colour

  The visual property of objects.

Output:
<dl>
<dt>CLI</dt>
<dt>Command Line Interface</dt>
<dd>
<p>A text-based interface for interacting with computers.</p>
</dd>
<dt>color</dt>
<dt>colour</dt>
<dd>
<p>The visual property of objects.</p>
</dd>
</dl>
Multiple definitions: When multiple terms share definitions, each indented paragraph block (separated by blank lines) becomes a separate <dd>:
: color
: colour

  The visual property of objects.

  Used in art and design.

Output:
<dl>
<dt>color</dt>
<dt>colour</dt>
<dd>
<p>The visual property of objects.</p>
</dd>
<dd>
<p>Used in art and design.</p>
</dd>
</dl>
Rules:
- Consecutive : term lines are grouped as multiple terms
- Blank lines between terms are allowed
- The definition follows after a blank line, with indentation
- Each paragraph block becomes a separate <dd> element
- Common in dictionaries for synonyms, abbreviations, and alternate spellings
Multiple Definition Definitions (: + Continuation)
Related: php-collective/djot-php#49
Status: Implemented in djot-php
HTML definition lists support multiple <dd> elements per term. While blank lines within definition content create paragraphs in the same <dd>, the : + continuation marker explicitly creates additional <dd> elements:
: term

  First definition.

: +

  Second definition (separate dd element).

: +

  Third definition.

Output:
<dl>
<dt>term</dt>
<dd>
<p>First definition.</p>
</dd>
<dd>
<p>Second definition (separate dd element).</p>
</dd>
<dd>
<p>Third definition.</p>
</dd>
</dl>
Comparison with blank lines:
: term

  First paragraph.

  Second paragraph (same dd).

This produces a single <dd> with two paragraphs, while : + creates distinct <dd> elements.
Features:
- Uses the : + marker to start a new definition for the same term
- Full roundtrip support in HtmlToDjot converter
- Works with definition list attributes
- Maintains compatibility with existing blank-line paragraph behavior
Definition List Element Attributes
Related: jgm/djot#323
Status: Implemented in djot-php
Attributes can be attached to individual <dl>, <dt>, and <dd> elements:
{.vocabulary}
: color
{.american}
: colour
{.british}
The visual property of objects.
{.primary}
Used in art and design.
{.secondary}
Output:
<dl class="vocabulary">
<dt class="american">color</dt>
<dt class="british">colour</dt>
<dd class="primary">
<p>The visual property of objects.</p>
</dd>
<dd class="secondary">
<p>Used in art and design.</p>
</dd>
</dl>
Syntax:
- {...} before first term → applies to <dl>
- {...} on line after term → applies to that <dt>
- {...} as last line in definition block → applies to that <dd> (consistent with list items)
Table Multi-line Cells, Rowspan, and Colspan
Related: jgm/djot#368
Status: Implemented in djot-php (PR #67)
Enhanced table features for complex data presentation:
1. Multi-line Cell Content (continuation rows)
Uses + prefix instead of | to signal content continuation:
| Name | Description |
|------|------------------|
| Item | Long description |
+ | continued here |
Output:
<table>
<tr><th>Name</th><th>Description</th></tr>
<tr><td>Item</td><td>Long description continued here</td></tr>
</table>
Content from continuation rows is merged with a space (like soft breaks).
2. Rowspan Support
The ^ marker indicates a cell is spanned from above (marker points UP):
| Category | Item |
|----------|--------|
| Fruits | Apple |
| ^ | Banana |
| ^ | Orange |
Output:
<table>
<tr><th>Category</th><th>Item</th></tr>
<tr><td rowspan="3">Fruits</td><td>Apple</td></tr>
<tr><td>Banana</td></tr>
<tr><td>Orange</td></tr>
</table>
Use \^ for literal ^ content.
3. Colspan Support
The < marker indicates a cell is spanned from left (marker points LEFT):
| Name | Contact Info | < |
|-------|--------------|-------|
| Alice | alice@ex.com | x5234 |
Output:
<table>
<tr><th>Name</th><th colspan="2">Contact Info</th></tr>
<tr><td>Alice</td><td>alice@ex.com</td><td>x5234</td></tr>
</table>
Use \< for literal < content. Content like a < b is NOT treated as a colspan marker.
4. Combined Rowspan + Colspan (2x2 blocks)
When a cell has both rowspan and colspan, it creates a rectangular block:
| | H1 | H2 |
|-----|-----|-----|
| L1 | A | < |
| L2 | ^ | ^ |
This creates a 2x2 block where cell A has colspan="2" rowspan="2".
5. Code Spans Across Continuation Lines
Code spans can span across continuation rows:
| aaa | `this is a really long |
+ | code span` |
Renders the second cell as: <code>this is a really long code span</code>
Edge Cases:
- Span markers in continuation rows are merged as content (not treated as spans)
- Multiple ^ under a colspan only extend rowspan once per row
- If intersection cells contain content instead of markers, that content is dropped
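
One way to picture how the ^ and < markers resolve into rowspan and colspan values is the sketch below. It assumes a rectangular grid of already-split cell strings with the separator row removed; `resolveSpans` is a hypothetical name, and the sketch does not model escaping (\^, \<) or the dropping of content in intersection cells.

```php
<?php
declare(strict_types=1);

/**
 * Resolve ^ (rowspan) and < (colspan) markers in a rectangular grid of cells.
 * Hypothetical sketch; returns only origin cells, indexed by [row][col].
 */
function resolveSpans(array $grid): array
{
    $owner = []; // $owner[$r][$c] = [row, col] of the origin cell covering this position
    $cells = []; // $cells[$r][$c] = ['text' => ..., 'rowspan' => ..., 'colspan' => ...]
    foreach ($grid as $r => $row) {
        $extended = []; // origin cells whose rowspan already grew in this row
        foreach ($row as $c => $text) {
            $text = trim($text);
            if ($text === '<' && $c > 0) {
                [$or, $oc] = $owner[$r][$c - 1];
                if ($or === $r) {
                    $cells[$or][$oc]['colspan']++; // widen an origin cell in this row
                }
                $owner[$r][$c] = [$or, $oc];
            } elseif ($text === '^' && $r > 0) {
                [$or, $oc] = $owner[$r - 1][$c];
                $key = $or . ':' . $oc;
                if (!isset($extended[$key])) {
                    $cells[$or][$oc]['rowspan']++; // extend only once per row
                    $extended[$key] = true;
                }
                $owner[$r][$c] = [$or, $oc];
            } else {
                $cells[$r][$c] = ['text' => $text, 'rowspan' => 1, 'colspan' => 1];
                $owner[$r][$c] = [$r, $c];
            }
        }
    }

    return $cells;
}

$cells = resolveSpans([
    ['',   'H1', 'H2'],
    ['L1', 'A',  '<'],
    ['L2', '^',  '^'],
]);
printf("A: rowspan=%d colspan=%d\n", $cells[1][1]['rowspan'], $cells[1][1]['colspan']);
```

Tracking which origin cell covers each grid position is what lets the two ^ markers under a colspan count as a single rowspan extension, as described in the edge cases above.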
Captions for Images, Tables, and Block Quotes
Related: php-collective/djot-php#37
Status: Implemented in djot-php
The ^ caption text syntax adds captions to images, tables, and block quotes:
Image captions (wrapped in <figure> with <figcaption>):

^ A beautiful sunset captured at the beach
Output:
<figure>
<img alt="Sunset over the ocean" src="sunset.jpg"><figcaption>A beautiful sunset captured at the beach</figcaption>
</figure>
Table captions (adds <caption> element):
| Product | Price |
|---------|-------|
| Widget | $10 |
^ Product pricing as of 2024
Output:
<table>
<caption>Product pricing as of 2024</caption>
<tr><th>Product</th><th>Price</th></tr>
<tr><td>Widget</td><td>$10</td></tr>
</table>
Block quote captions (wrapped in <figure> with <figcaption>, useful for attributions):
> To be or not to be, that is the question.
^ William Shakespeare, Hamlet
Output:
<figure>
<blockquote>
<p>To be or not to be, that is the question.</p>
</blockquote>
<figcaption>William Shakespeare, Hamlet</figcaption>
</figure>
Features:
- ^ marker at start of line triggers caption parsing
- Can interrupt paragraphs (no blank line required before caption)
- Blank line between element and caption is allowed for readability
- Multi-line captions supported (continues until blank line or new block)
- Full roundtrip support in HtmlToDjot converter
Multi-line caption example:

^ This photograph was taken in 1969
during the Apollo 11 mission.
Credit: NASA
Testing
All enhancements have dedicated test coverage:
# Tab indentation tests
vendor/bin/phpunit tests/TestCase/TabIndentationTest.php
# Run full test suite (800+ tests)
vendor/bin/phpunit
Upstream Tracking
Edge Case Fixes
| Enhancement | Upstream Issue | Status |
|---|---|---|
| Tab indentation | #255 | Open discussion |
| Multiple footnote refs | #348 | Open |
| Section ID footnotes | #349 | Open |
| Symbol time formats | #350 | Open |
| Em-dash with braces | #125 | Open |
Language Features
| Feature | Upstream PR/Issue | Status |
|---|---|---|
| Task list underscore notation | djot:305 | Open |
| List item attributes | djot:262 | Open PR |
| Table row/cell attributes | djot:250 | Open |
| Boolean attribute shorthand | djot:257 | Open |
| Multiple definition terms | djot:128 | djot-php |
| Multiple definition definitions | #49 | djot-php |
| Definition list attributes | djot:323 | Open |
| Fenced comment blocks | djot:67 | Open |
| Captions (image/table/blockquote) | #37 | djot-php |
| Table multi-line/rowspan/colspan | djot:368 | Open |
| Abbreviations (block, not inline) | djot:51 | djot-php |
Optional Modes
| Mode | Upstream Issue | Status |
|---|---|---|
| Significant newlines | #161 | djot-php (opt-in) |
These enhancements may be adopted into the official spec. We track upstream discussions and adjust our implementation accordingly.
Abbreviations (PHP Markdown Extra Style)
Status: djot-php extension
Abbreviation definitions using PHP Markdown Extra syntax for automatic <abbr> tag wrapping:
The HTML specification is maintained by the W3C.
*[HTML]: Hyper Text Markup Language
*[W3C]: World Wide Web Consortium
Output:
<p>The <abbr title="Hyper Text Markup Language">HTML</abbr> specification
is maintained by the <abbr title="World Wide Web Consortium">W3C</abbr>.</p>
Features:
- Definitions can appear anywhere in the document
- Case-sensitive matching (HTML ≠ html)
- Word-boundary aware (HTML won't match HTMLElement or XHTML)
- Multi-line definitions supported with indentation
- Works alongside the inline span approach ([HTML]{abbr="..."}) from the cookbook
Multi-line definition example:
*[HTML]: Hyper Text Markup Language,
the standard markup language for documents
designed to be displayed in a web browser
This is an extension feature that is not yet part of the djot spec.
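
The case-sensitive, word-boundary-aware matching described in this section can be sketched with a single regular expression over plain text. `wrapAbbreviations` is a hypothetical helper that works on a text string; djot-php itself presumably applies the definitions at the AST level, so treat this only as an illustration of the matching rules.

```php
<?php
declare(strict_types=1);

/**
 * Wrap known abbreviations in <abbr> tags.
 * Hypothetical helper operating on plain text; $definitions maps term => title.
 */
function wrapAbbreviations(string $text, array $definitions): string
{
    if ($definitions === []) {
        return $text;
    }

    // Try longer terms first so a longer abbreviation wins over its own prefix.
    $terms = array_map('strval', array_keys($definitions));
    usort($terms, fn (string $a, string $b): int => strlen($b) <=> strlen($a));
    $alternation = implode('|', array_map(fn (string $t): string => preg_quote($t, '/'), $terms));

    // \b enforces word boundaries; no "i" flag keeps the match case-sensitive.
    return (string) preg_replace_callback(
        '/\b(?:' . $alternation . ')\b/u',
        fn (array $m): string => sprintf(
            '<abbr title="%s">%s</abbr>',
            htmlspecialchars($definitions[$m[0]], ENT_QUOTES),
            $m[0]
        ),
        $text
    );
}

echo wrapAbbreviations(
    'The HTML specification is maintained by the W3C.',
    ['HTML' => 'Hyper Text Markup Language', 'W3C' => 'World Wide Web Consortium']
), "\n";
```

Because the boundaries are checked on both sides, HTMLElement, XHTML, and lowercase html are left untouched.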
Reporting Issues
If you find edge cases or inconsistencies:
- Check if it's covered by the djot spec
- Check upstream issues for existing discussions
- Report to djot-php issues