https://nestedtext.org/en/latest/alternatives.html#yaml
Alternatives
There is no shortage of well-established alternatives to NestedText
for storing data in a human-readable text file. The features and
shortcomings of some of these alternatives are discussed next.
NestedText is intended to be used in situations where people create,
modify, or consume the data directly. It is this perspective
that informs these comparisons.
JSON
JSON is a subset of JavaScript suitable for holding data. Like
NestedText, it consists of a hierarchical collection of objects
(dictionaries), lists, and strings, but also allows reals, integers,
Booleans and nulls. In practice, JSON is largely generated and
consumed by machines. The data is stored as text, and so can be read,
modified, and consumed directly by the end user, but the format is
not optimized for this use case and so is often cumbersome or
inefficient when used in this manner.
JSON supports all the native data types common to most languages.
Syntax is added to values to unambiguously indicate their type. For
example, 2, 2.0, and "2" are three different values with three
different types (integer, real, string). This adds two types of
complexity. First, the rules for distinguishing various types must be
learned and used. Second, all strings must be quoted, and with
quoting comes escaping, which is needed to allow quote characters to
be included in strings.
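The effect of these typing rules can be seen with Python's standard json module. The following is a minimal sketch; the variable names are chosen only for illustration.

```python
import json

# The same digit written three ways yields three different Python types.
values = json.loads('[2, 2.0, "2"]')
type_names = [type(value).__name__ for value in values]

# Quoting brings escaping with it: embedded quote characters must be
# backslash-escaped when the string is written out as JSON.
encoded = json.dumps('Say "hello"')

print(type_names)
print(encoded)
```

The person editing the file must make the same distinctions by hand that the parser makes here.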
JSON was derived as a subset of JavaScript, and so inherits a fair
amount of syntactic clutter that can be annoying for users to enter
and maintain. In addition, features that would improve clarity are
lacking. Comments are not allowed, multiline strings are not
supported, and whitespace is insignificant (leading to the
possibility that the appearance of the data may not match its true
structure).
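These restrictions can be checked with a strict parser such as Python's standard json module. The sketch below uses a hypothetical helper, is_valid_json, that is not part of any library; it simply reports whether a text parses.

```python
import json


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise (illustrative helper)."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# A trailing comment is rejected.
with_comment = '{"name": "Fumiko Purvis"}  // treasurer'

# A literal line break inside a string is rejected ...
raw_newline = '{"address": "3636 Buffalo Ave\nTopeka, Kansas 20692"}'

# ... so the line break must be written as the escape sequence \n.
escaped_newline = '{"address": "3636 Buffalo Ave\\nTopeka, Kansas 20692"}'

for text in (with_comment, raw_newline, escaped_newline):
    print(is_valid_json(text))
```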
NestedText supports only three data types (strings, lists and
dictionaries) and does not have the baggage of being a subset of a
general purpose programming language. The result is a simpler
language that has the following clear advantages over JSON as a human
readable and writable data file format:
* strings do not require quotes
* comments
* multiline strings
* no need to escape special characters
* commas are not used to separate dictionary and list items
The following examples illustrate the difference between JSON and
NestedText:
JSON:
{
    "treasurer": {
        "name": "Fumiko Purvis",
        "address": "3636 Buffalo Ave\nTopeka, Kansas 20692",
        "phone": "1-268-555-0280",
        "email": "fumiko.purvis@hotmail.com",
        "additional roles": [
            "accounting task force"
        ]
    }
}
NestedText:
treasurer:
    name: Fumiko Purvis
        # Fumiko's term is ending at the end of the year.
        # She will be replaced by Merrill Eldridge.
    address:
        > 3636 Buffalo Ave
        > Topeka, Kansas 20692
    phone: 1-268-555-0280
    email: fumiko.purvis@hotmail.com
    additional roles:
        - accounting task force
YAML
YAML is considered by many to be a human friendly alternative to JSON.
There is less syntactic clutter and the quoting of strings is
optional. However, it also supports a wide variety of data types and
formats. The optional quoting can result in the type of values being
ambiguous. To distinguish between the various types, a complicated
and non-intuitive set of rules was developed. YAML at first appears very
appealing when used with simple examples, but things can quickly
become complicated or provide unexpected results. A reaction to this
is the use of YAML subsets, such as StrictYAML. However, the subsets
still try to maintain compatibility with YAML and so inherit much of
its complexity. For example, both YAML and StrictYAML support nine
different ways of writing multiline strings.
YAML avoids excessive quoting and supports comments and multiline
strings, but the multitude of formats and disambiguation rules makes
YAML a difficult language to learn, and the ambiguities create traps
for the user. To illustrate these points, the following is a
condensation of a YAML document taken from the GitHub documentation
that describes how to configure continuous integration using Python:
YAML:
name: Python package
on: [push]
build:
  python-version: [3.6, 3.7, 3.8, 3.9, 3.10]
  steps:
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install pytest
        if [ -f 'requirements.txt' ]; then pip install -r requirements.txt; fi
    - name: Test with pytest
      run: |
        pytest
And here is the result of running that document through the YAML
reader and writer. One might expect the format to change a bit, but
the information conveyed to remain unchanged.
YAML (round-trip):
name: Python package
true:
- push
build:
  python-version:
  - 3.6
  - 3.7
  - 3.8
  - 3.9
  - 3.1
  steps:
  - name: Install dependencies
    run: 'python -m pip install --upgrade pip

      pip install pytest

      if [ -f ''requirements.txt'' ]; then pip install -r requirements.txt; fi

      '
  - name: Test with pytest
    run: 'pytest

      '
There are a few things to notice about this second version.
1. The on key was inappropriately converted to true, introducing an
error.
2. Python version 3.10 was inappropriately converted to 3.1,
introducing an error.
3. Blank lines were added to the multiline strings, which were also
converted to another of the nine possible formats.
4. Escaping was required for the quotes on 'requirements.txt'.
5. Indentation is not an accurate reflection of nesting (notice that
python-version and - 3.6 have the same indentation, but - 3.6 is
contained inside python-version).
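The first two conversions follow from implicit typing. The sketch below is not a YAML parser; it is a deliberately simplified, hypothetical resolver (resolve_scalar and the word lists are assumptions made for illustration) that mimics how an unquoted scalar might be matched against boolean and numeric patterns before being treated as a string.

```python
import re

# Assumption: a small subset of the boolean spellings recognized by YAML 1.1.
TRUE_WORDS = ("yes", "on", "true")
FALSE_WORDS = ("no", "off", "false")


def spellings(words):
    """Return each word in lower, Capitalized and UPPER case."""
    return {form for word in words
            for form in (word, word.capitalize(), word.upper())}


TRUE_SPELLINGS = spellings(TRUE_WORDS)
FALSE_SPELLINGS = spellings(FALSE_WORDS)


def resolve_scalar(text):
    """Guess a type for an unquoted scalar (illustrative only)."""
    if text in TRUE_SPELLINGS:
        return True
    if text in FALSE_SPELLINGS:
        return False
    if re.fullmatch(r"[-+]?\d+", text):
        return int(text)
    if re.fullmatch(r"[-+]?\d+\.\d*", text):
        return float(text)   # "3.10" and "3.1" become the same number
    return text


for scalar in ("on", "3.10", "push"):
    print(repr(scalar), "->", repr(resolve_scalar(scalar)))
```

Once "3.10" has been read as a real number, the trailing zero is gone and cannot be recovered when the data is written back out.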
Now consider the NestedText version; it is simpler and not subject to
misinterpretation.
NestedText:
name: Python package
on:
    - push
build:
    python-version:
        - 3.6
        - 3.7
        - 3.8
        - 3.9
        - 3.10
    steps:
        -
            name: Install dependencies
            run:
                > python -m pip install --upgrade pip
                > pip install pytest
                > if [ -f 'requirements.txt' ]; then pip install -r requirements.txt; fi
        -
            name: Test with pytest
            run: pytest
NestedText was inspired by YAML, but eschews its complexity. It has
the following clear advantages over YAML as a human readable and
writable data file format:
* simple
* unambiguous (no implicit typing)
* no unexpected conversions of the data
* syntax is insensitive to special characters within text
* safe, no risk of malicious code execution
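Because there is no implicit typing, every leaf value reaches the application as a string, and the application decides how to interpret it. The following is a minimal sketch of that division of labor. The literal named loaded stands in for the result of reading the example above (it is written out by hand here so that the sketch does not depend on any particular loader), and parse_version is a hypothetical helper.

```python
# Stand-in for data read from the NestedText example: all leaves are strings.
loaded = {
    "build": {
        "python-version": ["3.6", "3.7", "3.8", "3.9", "3.10"],
    },
}


def parse_version(text):
    """Convert a version string such as '3.10' to a tuple such as (3, 10)."""
    return tuple(int(part) for part in text.split("."))


# The conversion is explicit and chosen by the application, so '3.10' is
# never confused with '3.1'.
versions = [parse_version(text) for text in loaded["build"]["python-version"]]
newest = max(versions)

print(versions)
print(newest)
```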
TOML or INI
TOML is a configuration file format inspired by the well-known INI
syntax. It supports a number of basic data types (notably including
dates and times) using syntax that is more similar to JSON (explicit
but verbose) than to YAML (succinct but confusing). As discussed
previously, though, this makes it the responsibility of the user to
specify the correct type for each field.
Another flaw in TOML is that it is difficult to specify deeply nested
structures. The only way to specify a nested dictionary is to give
the full key to that dictionary, relative to the root of the entire
hierarchy. This is not much of a problem if the hierarchy only has 1-2
levels, but any more than that and you find yourself typing the same
long keys over and over. A corollary to this is that TOML-based
configurations do not scale well: increases in complexity are often
accompanied by disproportionate decreases in readability and
writability.
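The repetition can be made concrete with a short sketch. The function table_headers below is hypothetical, as is the five-level configuration it is applied to; it lists the header that would introduce each nested dictionary when written as a TOML table, assuming all keys are simple enough to need no quoting.

```python
def table_headers(data, prefix=()):
    """List a TOML-style table header for every nested dictionary."""
    headers = []
    for key, value in data.items():
        if isinstance(value, dict):
            path = prefix + (key,)
            headers.append("[" + ".".join(path) + "]")
            headers.extend(table_headers(value, path))
    return headers


# Hypothetical deeply nested configuration, used only for illustration.
config = {
    "publish": {
        "scp": {
            "retry": {
                "backoff": {"initial": "1", "maximum": "60"},
            },
        },
    },
}

for header in table_headers(config):
    print(header)
```

Each header repeats the full path from the root, so the leading keys are typed again at every level. TOML does allow purely intermediate tables to be left out, but every table that holds values still needs its complete key.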
Here is an example of a configuration file in TOML and NestedText:
TOML:
[plugins]
auth = ['avendesora']
archive = ['ssh', 'gpg', 'avendesora', 'emborg', 'file']
publish = ['scp', 'mount']
[auth.avendesora]
account = 'login'
field = 'passcode'
[archive.file]
src = ['~/src/nfo/contacts']
[archive.avendesora]
[archive.emborg]
config = 'rsync'
[publish.scp]
host = ['backups']
remote_dir = 'archives/{date:YYMMDD}'
[publish.mount]
drive = '/mnt/secrets'
remote_dir = 'sparekeys/{date:YYMMDD}'
NestedText:
plugins:
    auth:
        - avendesora
    archive:
        - ssh
        - gpg
        - avendesora
        - emborg
        - file
    publish:
        - scp
        - mount
auth:
    avendesora:
        account: login
        field: passcode
archive:
    file:
        src:
            - ~/src/nfo/contacts
    avendesora:
        {}
    emborg:
        config: rsync
publish:
    scp:
        host:
            - backups
        remote_dir: archives/{date:YYMMDD}
    mount:
        drive: /mnt/secrets
        remote_dir: sparekeys/{date:YYMMDD}
NestedText has the following clear advantages over TOML and INI as a
human readable and writable data file format:
* text does not require quoting or escaping
* data is left in its original form
* indentation used to succinctly represent nested data
* the structure of the file matches the structure of the data
* heavily nested data is represented efficiently
CSV or TSV
CSV (comma-separated values) and the closely related TSV
(tab-separated values) are exchange formats for tabular data. Tabular
data consists of multiple records where each record is made up of a
consistent set of fields. The format separates the records using line
breaks and separates the fields using commas or tabs. Quoting and
escaping are required when the fields contain line breaks or the
separator character (commas or tabs).
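The quoting requirement can be seen with Python's standard csv module. In this minimal sketch the two-column table is made up for illustration; the address is borrowed from the JSON example earlier on this page.

```python
import csv
import io

rows = [
    ["name", "address"],
    ["Fumiko Purvis", "3636 Buffalo Ave\nTopeka, Kansas 20692"],
]

# Write the rows; the address contains both a line break and a comma,
# so the writer must wrap that field in quotes.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
text = buffer.getvalue()
print(text)

# Reading the text back recovers the original fields.
recovered = list(csv.reader(io.StringIO(text)))
```

The quoted field now spans two physical lines, so a record can no longer be identified simply by splitting the file at line breaks.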
Here is an example data file in CSV and NestedText.
CSV:
Year,Agriculture,Architecture,Art and Performance,Biology,Business,Communications and Journalism,Computer Science,Education,Engineering,English,Foreign Languages,Health Professions,Math and Statistics,Physical Sciences,Psychology,Public Administration,Social Sciences and History
1970,4.22979798,11.92100539,59.7,29.08836297,9.064438975,35.3,13.6,74.53532758,0.8,65.57092343,73.8,77.1,38,13.8,44.4,68.4,36.8
1980,30.75938956,28.08038075,63.4,43.99925716,36.76572529,54.7,32.5,74.98103152,10.3,65.28413007,74.1,83.5,42.8,24.6,65.1,74.6,44.2
1990,32.70344407,40.82404662,62.6,50.81809432,47.20085084,60.8,29.4,78.86685859,14.1,66.92190193,71.2,83.9,47.3,31.6,72.6,77.6,45.1
2000,45.05776637,40.02358491,59.2,59.38985737,49.80361649,61.9,27.7,76.69214284,18.4,68.36599498,70.9,83.5,48.2,41,77.5,81.1,51.8
2010,48.73004227,42.06672091,61.3,59.01025521,48.75798769,62.5,17.6,79.61862451,17.2,67.92810557,69,85,43.1,40.2,77,81.7,49.3
NestedText:
-
    Year: 1970
    Agriculture: 4.22979798
    Architecture: 11.92100539
    Art and Performance: 59.7
    Biology: 29.08836297
    Business: 9.064438975
    Communications and Journalism: 35.3
    Computer Science: 13.6
    Education: 74.53532758
    Engineering: 0.8
    English: 65.57092343
    Foreign Languages: 73.8
    Health Professions: 77.1
    Math and Statistics: 38
    Physical Sciences: 13.8
    Psychology: 44.4
    Public Administration: 68.4
    Social Sciences and History: 36.8
-
    Year: 1980
    Agriculture: 30.75938956
    Architecture: 28.08038075
    Art and Performance: 63.4
    Biology: 43.99925716
    Business: 36.76572529
    Communications and Journalism: 54.7
    Computer Science: 32.5
    Education: 74.98103152
    Engineering: 10.3
    English: 65.28413007
    Foreign Languages: 74.1
    Health Professions: 83.5
    Math and Statistics: 42.8
    Physical Sciences: 24.6
    Psychology: 65.1
    Public Administration: 74.6
    Social Sciences and History: 44.2
-
    Year: 1990
    Agriculture: 32.70344407
    Architecture: 40.82404662
    Art and Performance: 62.6
    Biology: 50.81809432
    Business: 47.20085084
    Communications and Journalism: 60.8
    Computer Science: 29.4
    Education: 78.86685859
    Engineering: 14.1
    English: 66.92190193
    Foreign Languages: 71.2
    Health Professions: 83.9
    Math and Statistics: 47.3
    Physical Sciences: 31.6
    Psychology: 72.6
    Public Administration: 77.6
    Social Sciences and History: 45.1
-
    Year: 2000
    Agriculture: 45.05776637
    Architecture: 40.02358491
    Art and Performance: 59.2
    Biology: 59.38985737
    Business: 49.80361649
    Communications and Journalism: 61.9
    Computer Science: 27.7
    Education: 76.69214284
    Engineering: 18.4
    English: 68.36599498
    Foreign Languages: 70.9
    Health Professions: 83.5
    Math and Statistics: 48.2
    Physical Sciences: 41
    Psychology: 77.5
    Public Administration: 81.1
    Social Sciences and History: 51.8
-
    Year: 2010
    Agriculture: 48.73004227
    Architecture: 42.06672091
    Art and Performance: 61.3
    Biology: 59.01025521
    Business: 48.75798769
    Communications and Journalism: 62.5
    Computer Science: 17.6
    Education: 79.61862451
    Engineering: 17.2
    English: 67.92810557
    Foreign Languages: 69
    Health Professions: 85
    Math and Statistics: 43.1
    Physical Sciences: 40.2
    Psychology: 77
    Public Administration: 81.7
    Social Sciences and History: 49.3
NestedText has the following clear advantages over CSV and TSV as a
human readable and writable data file format:
* text does not require quoting or escaping
* arbitrary data hierarchies are supported
* file representation tends to be tall and skinny rather than short
and fat
* easier to read
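The NestedText version above corresponds to a list of dictionaries, one per record, with the column headings as keys. The following sketch builds that structure from CSV using Python's standard csv module; the sample is the first three columns and two rows of the table above.

```python
import csv
import io

sample = (
    "Year,Agriculture,Architecture\n"
    "1970,4.22979798,11.92100539\n"
    "1980,30.75938956,28.08038075\n"
)

# DictReader pairs each field with its column heading, which gives one
# dictionary per record. All values remain strings.
records = list(csv.DictReader(io.StringIO(sample)))

for record in records:
    print(record)
```

In the CSV form a value is tied to its heading only by column position, whereas in the list-of-dictionaries form every value sits next to its name.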
---------------------------------------------------------------------
(c) Copyright 2020-21, Ken and Kale Kundert Revision 15287459.