jgm/djot (https://github.com/jgm/djot): A light markup language. djot.net. License: MIT.
Djot

Djot is a light markup syntax. It derives most of its features from commonmark, but it fixes a few things that make commonmark's syntax complex and difficult to parse efficiently. It is also much fuller-featured than commonmark, with support for definition lists, footnotes, tables, several new kinds of inline formatting (insert, delete, highlight, superscript, subscript), math, smart punctuation, attributes that can be applied to any element, and generic containers for block-level, inline-level, and raw content.

The project began as an attempt to implement some of the ideas I suggested in my essay Beyond Markdown. (See Rationale, below.)

This repository contains a reference implementation, written in Lua, and a Syntax Description. There is also a Cheatsheet and a Quick Start for Markdown Users that outlines the main differences between djot and Markdown, as well as a Playground, originally designed by @dtinth, that allows experimenting with the current implementation.

Despite being written in an interpreted language, the reference implementation is very fast (converting a 260K test document in 141 ms on an M1 Mac using the standard Lua interpreter). It can produce an AST, rendered HTML, or a stream of match tokens that identify elements by source position, which could be used for syntax highlighting or a linting tool.
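To illustrate why position-based tokens are convenient for tooling, here is a minimal Python sketch of a highlighter driven by such a stream. The token layout used here, `(start, end, label)` with 0-based, end-exclusive character offsets, and the `annotate` helper are assumptions made for illustration only; they are not the reference implementation's actual output format or API.

```python
from typing import List, Tuple

# Hypothetical token layout: (start, end, label), with 0-based, end-exclusive
# character offsets into the source. This is an assumption for illustration,
# not the reference implementation's actual output format.
Token = Tuple[int, int, str]


def annotate(source: str, tokens: List[Token]) -> str:
    """Wrap each token's source span in [label:...] markers.

    Tokens are assumed to be sorted by position and non-overlapping.
    """
    out = []
    pos = 0
    for start, end, label in tokens:
        out.append(source[pos:start])                 # text before the token
        out.append(f"[{label}:{source[start:end]}]")  # the marked-up span
        pos = end
    out.append(source[pos:])                          # text after the last token
    return "".join(out)


source = "Hello _world_ and *friends*"
tokens = [(6, 13, "emph"), (18, 27, "strong")]
print(annotate(source, tokens))
# prints: Hello [emph:_world_] and [strong:*friends*]
```

Because each token carries its own source span, a tool like this never has to re-parse the text. A linter could report the same offsets directly as diagnostics.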
We also provide a custom pandoc writer for djot (djot-writer.lua), so that documents in other formats can be converted to djot format, and a custom pandoc reader (djot-reader.lua), so that djot documents can be converted to any format pandoc supports. To use these, just put them in your working directory and use pandoc -f djot-reader.lua to convert from djot, and pandoc -t djot-writer.lua to convert to djot. (You'll need pandoc version 2.18 or higher.)

Rationale

Here are some design goals:

1. It should be possible to parse djot markup in linear time, with no backtracking.

2. Parsing of inline elements should be "local" and not depend on what references are defined later. This is not the case in commonmark: [foo][bar] might be "[foo]" followed by a link with text "bar", or "[foo][bar]", or a link with text "foo", or a link with text "foo" followed by "[bar]", depending on whether the references [foo] and [bar] are defined elsewhere (perhaps later) in the document. This non-locality makes accurate syntax highlighting nearly impossible.

3. Rules for emphasis should be simpler. The fact that doubled characters are used for strong emphasis in commonmark leads to many potential ambiguities, which are resolved by a daunting list of 17 rules. It is hard to form a good mental model of these rules. Most of the time they interpret things the way a human would most naturally interpret them, but not always.

4. Expressive blind spots should be avoided. In commonmark, you're out of luck if you want to produce the HTML a<em>?</em>b, because the flanking rules classify the first asterisk in a*?*b as right-flanking. There is a way around this, but it's ugly (using a numerical entity instead of the literal "a"). In djot there should not be expressive blind spots of this kind.

5. Rules for what content belongs to a list item should be simple.
   In commonmark, content under a list item must be indented as far as the first non-space content after the list marker (or five spaces after the marker, in case the list item begins with indented code). Many people get confused when their indented content is not indented far enough and does not get included in the list item.

6. Parsers should not be forced to recognize Unicode character classes, HTML tags, or entities, or perform Unicode case folding. That adds a lot of complexity.

7. The syntax should be friendly to hard-wrapping: hard-wrapping a paragraph should not lead to different interpretations, e.g. when a number followed by a period ends up at the beginning of a line. (I anticipate that many will ask, why hard-wrap at all? Answer: so that your document is readable just as it is, without conversion to HTML and without special editor modes that soft-wrap long lines. Remember that source readability was one of the prime goals of Markdown and Commonmark.)

8. The syntax should compose uniformly, in the following sense: if a sequence of lines has a certain meaning outside a list item or block quote, it should have the same meaning inside it. This principle is articulated in the commonmark spec, but the spec doesn't completely abide by it (see commonmark/commonmark-spec#634).

9. It should be possible to attach arbitrary attributes to any element.

10. There should be generic containers for text, inline content, and block-level content, to which arbitrary attributes can be applied. This allows for extensibility using AST transformations.

11. The syntax should be kept as simple as possible, consistent with these goals. Thus, for example, we don't need two different styles of headings or code blocks.

These goals motivated the following decisions:

* Block-level elements can't interrupt paragraphs (or headings), because of goal 7.
  So in djot the following is a single paragraph, not (as commonmark sees it) a paragraph followed by an ordered list followed by a block quote followed by a section heading:

      My favorite number is probably the number
      1. It's the smallest natural number that is
      > 0. With pencils, though, I prefer a
      # 2.

  Commonmark does make some concessions to goal 7, by forbidding lists beginning with markers other than 1. from interrupting paragraphs. But this is a compromise and a sacrifice of regularity and predictability in the syntax. Better just to have a general rule.

* An implication of the last decision is that, although "tight" lists are still possible (without blank lines between items), a sublist must always be preceded by a blank line. Thus, instead of

      - Fruits
        - apple
        - orange

  you must write

      - Fruits

        - apple
        - orange

  (This blank line doesn't count against "tightness.") reStructuredText makes the same design decision.

* Also to promote goal 7, we allow headings to "lazily" span multiple lines:

      ## My excessively long section heading is too
      long to fit on one line.

  While we're at it, we'll simplify by removing setext-style (underlined) headings. We don't really need two heading syntaxes (goal 11).

* To meet goal 5, we have a very simple rule: anything that is indented beyond the start of the list marker belongs in the list item.

      1. list item

       > block quote inside item 1

      2. second item

  In commonmark, this would be parsed as two separate lists with a block quote between them, because the block quote is not indented far enough.

  What kept us from using this simple rule in commonmark was indented code blocks. If list items are going to contain an indented code block, we need to know at what column to start counting the indentation, so we fixed on the column that makes the list look best (the first column of non-space content after the marker):

      1.  A commonmark list item with an indented code block in it.

              code!

  In djot, we just get rid of indented code blocks.
  Most people prefer fenced code blocks anyway, and we don't need two different ways of writing code blocks (goal 11).

* To meet goal 6 and to avoid the complex rules commonmark adopted for handling raw HTML, we simply do not allow raw HTML, except in explicitly marked contexts, e.g. ``{=html} or ``` =html
| foo |