This project is part of the @thi.ng/umbrella monorepo.


Extensible Graph Format.

Striving to be easily readable and writable for both humans and machines, this line-based, plain-text data format and its accompanying package support:

  • Definition of different types of graph based data (e.g. RDF-style or Labeled Property Graph topologies)
  • Full support for cyclic references, arbitrary order (automatic forward declarations)
  • Choice of inlining referenced nodes for direct access or representing them via special node ref values
  • Arbitrary property values (extensible via tagged literals and custom tag parsers a la EDN)
  • Optionally prefixed node and property IDs with (also optional) auto-expansion via declared prefixes (for Linked Data use cases)
  • Inclusion of sub-graphs from external files
  • Loading of individual property values from referenced file paths
  • Optionally GPG encrypted property values (where needed)
  • Multi-line values
  • Line comments
  • Configurable parser behavior & syntax feature flags
  • Hand-optimized parser, largely regexp free
  • Configurable GraphViz DOT export

[Figure: example graph]

(Source for this example graph is further below)

Built-in tag parsers

The following parsers for tagged property values are available by default. Custom parsers can be provided via config options.

Tag      Description                                     Result
#base64  Base64 encoded binary data                      Uint8Array
#date    Date.parse() compatible string (e.g. ISO8601)   Date
#file    File path to read value from                    string
#gpg     Calls gpg to decrypt given armored string       string
#hex     Hex 32-bit int (no prefix)                      number
#json    Arbitrary JSON value                            any
#list    Whitespace separated list                       string[]
#num     Floating point value (IEEE754)                  number

Note: In this reference implementation, the #file and #gpg tag parsers are only available in NodeJS.
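
To illustrate the idea of tag parsers, here is a minimal, self-contained sketch in which each tag maps to a function that turns the raw string value into a JS value. The names `TagParser`, `TAG_PARSERS`, `parseTagged` and the `#upper` tag are hypothetical and not part of the package API; the real config option for registering custom parsers is not shown here.

```typescript
// Hypothetical sketch: a tag parser converts the raw text after `#tag` into a value.
type TagParser = (raw: string) => unknown;

// Keys mirror some of the built-in tags listed above; the bodies are illustrative only.
const TAG_PARSERS: Record<string, TagParser> = {
    // hex 32-bit int without `0x` prefix
    hex: (raw) => parseInt(raw, 16),
    // whitespace separated list
    list: (raw) => raw.trim().split(/\s+/),
    // floating point value
    num: (raw) => parseFloat(raw),
    // arbitrary JSON value
    json: (raw) => JSON.parse(raw),
    // any string accepted by Date.parse()
    date: (raw) => new Date(Date.parse(raw)),
};

// Looks up the parser for `tag` and applies it; unknown tags fail loudly.
const parseTagged = (
    tag: string,
    raw: string,
    parsers: Record<string, TagParser> = TAG_PARSERS
): unknown => {
    if (!Object.prototype.hasOwnProperty.call(parsers, tag)) {
        throw new Error("unknown tag: #" + tag);
    }
    return parsers[tag](raw);
};

// A custom parser is just another table entry (here: a made-up `#upper` tag).
const customParsers: Record<string, TagParser> = {
    ...TAG_PARSERS,
    upper: (raw) => raw.toUpperCase(),
};

const hexValue = parseTagged("hex", "ff"); // 255
const upperValue = parseTagged("upper", "abc", customParsers); // "ABC"
```

Keeping the parsers in a plain lookup table is one simple way to make the set of tags extensible; the package itself may organize this differently.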


Feature ideas

(Non-exhaustive list)

  • VSCode syntax highlighting
  • JSON -> EGF conversion
  • Async tag parsing
  • URL support for #file tag
  • Tag declarations & tag parser import from URL (needs trust config opts)
  • #md tag parser for markdown content
  • #gpg fallback behavior options


Generated API docs

TODO - Full docs forthcoming...

Basic example

; file: readme.egf

; prefix declaration (optional feature)
@prefix thi: thi.ng/

; a single node/subject definition
; properties are indented
; `thi:` prefix will be expanded
thi:egf
    type project
    ; tagged value property (here: node ref)
    part-of -> thi:umbrella
    status alpha
    description Extensible Graph Format
    url https://thi.ng/egf
    creator -> toxi
    ; multi-line value
    ; read as whitespace separated list/array (via #list)
    tag #list >>>

<<<

thi:umbrella
    type project
    url https://thi.ng/umbrella
    creator -> toxi

toxi
    type person
    name Karsten Schmidt
    location London
    account -> toxi@twitter
    account -> postspectacular@gh

toxi@twitter
    type account
    name @toxi
    url http://twitter.com/toxi

postspectacular@gh
    type account
    name @postspectacular
    url http://github.com/postspectacular

import { parseFile } from "@thi.ng/egf";

// enable prefix expansion in parser
const graph = parseFile("readme.egf", { opts: { prefixes: true } }).nodes;

// list all node IDs
Object.keys(graph);
// [
//  'thi.ng/egf',
//  'thi.ng/umbrella',
//  'toxi',
//  'toxi@twitter',
//  'postspectacular@gh'
// ]

// look up a single node
graph.toxi;
// {
//   '$id': 'toxi',
//   type: 'person',
//   name: 'Karsten Schmidt',
//   location: 'London',
//   account: [
//     {
//       '$ref': 'toxi@twitter',
//       deref: [Function: deref],
//       equiv: [Function: equiv]
//     },
//     {
//       '$ref': 'postspectacular@gh',
//       deref: [Function: deref],
//       equiv: [Function: equiv]
//     }
//   ]
// }

// in this example inlining of referenced nodes is disabled (default)
// therefore refs are encoded as objects implementing the `IDeref` interface
// to obtain the referenced node
graph.toxi.account[0].deref();
// {
//   '$id': 'toxi@twitter',
//   type: 'account',
//   name: '@toxi',
//   url: 'http://twitter.com/toxi'
// }


EGF is a UTF-8 plain text format and largely line based, though it supports multi-line values. An EGF file consists of node definitions, each with zero or more properties and their (optionally tagged) values. EGF does not prescribe any other schema or structure. It is entirely up to the user to, for example, define properties themselves as nodes with their own properties, which also allows the definition of LPG (Labeled Property Graph) topologies.

; Comment line

; First node definition
node1
    ; property with string value
    prop1 value
    ; property with reference to another node
    prop2 -> node2
    ; property with tagged value
    prop3 #tag value
    ; multi-line value
    prop4 >>> long, potentially
value <<<
    ; tagged multi-line value
    prop5 #tag >>> tagged multi-line value <<<

node2
    ; property comment
    prop1 value


A full grammar definition is forthcoming. In the meantime, please see the older, somewhat outdated version and the related comments in #234 for more details.
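
To make the line-based structure more concrete, the following is a rough sketch of a parser for a tiny subset of the syntax: `;` comment lines, node IDs starting in the first column, and indented `key value` or `key -> nodeid` properties. It is not the package's parser. `parseMini` and `MiniProps` are hypothetical names, and tags, multi-line values, prefixes and repeated properties are deliberately ignored (a repeated key simply overwrites the earlier value here).

```typescript
// Property values in this sketch are either plain strings or simple `$ref` markers.
type MiniProps = Record<string, string | { $ref: string }>;

// Parses a small EGF-like subset into a map of node ID -> properties.
const parseMini = (src: string): Map<string, MiniProps> => {
    const nodes = new Map<string, MiniProps>();
    let curr: MiniProps | undefined;
    for (const line of src.split("\n")) {
        const trimmed = line.trim();
        // skip empty lines and comment lines
        if (trimmed === "" || trimmed[0] === ";") continue;
        if (line[0] !== " " && line[0] !== "\t") {
            // unindented line: start (or continue) a node definition
            curr = nodes.get(trimmed);
            if (!curr) {
                curr = {};
                nodes.set(trimmed, curr);
            }
            continue;
        }
        if (!curr) throw new Error("property outside of a node: " + trimmed);
        // indented line: split into property name and value at first whitespace
        const idx = trimmed.search(/\s/);
        const key = idx < 0 ? trimmed : trimmed.substring(0, idx);
        const val = idx < 0 ? "" : trimmed.substring(idx + 1).trim();
        curr[key] =
            val.indexOf("-> ") === 0 ? { $ref: val.substring(3).trim() } : val;
    }
    return nodes;
};

const miniGraph = parseMini(
    [
        "; Comment line",
        "node1",
        "    prop1 value",
        "    prop2 -> node2",
        "",
        "node2",
        "    ; property comment",
        "    prop1 other value",
    ].join("\n")
);
```

Here `miniGraph.get("node1")` holds `prop1` as the string `"value"` and `prop2` as `{ $ref: "node2" }`.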

Node references

Properties with reference values to another node constitute edges in the graph. References are encoded via property -> nodeid.

The following graph defines two nodes with circular references between them. Each node has a literal (string, by default) property name and a reference property knows to another node (via its ID). The order of references is arbitrary and the parser will automatically produce forward declarations for nodes not yet known.

alice
    name Alice
    knows -> bob

bob
    name Robert
    knows -> alice

Using default parser options, this produces an object as follows. Note, the references are encoded as objects with a $ref property and implement the IDeref and IEquiv interfaces defined in the @thi.ng/api package.

{
  alice: {
    '$id': 'alice',
    name: 'Alice',
    knows: {
      '$ref': 'bob',
      deref: [Function: deref],
      equiv: [Function: equiv]
    }
  },
  bob: {
    '$id': 'bob',
    name: 'Robert',
    knows: {
      '$ref': 'alice',
      deref: [Function: deref],
      equiv: [Function: equiv]
    }
  }
}

// access bob's name via alice
graph.alice.knows.deref().name
// "Robert"
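
The combination of forward declarations and lazy references can be sketched as follows. This is only an illustration of the idea, not the package's implementation: `GraphNode`, `NodeRef`, `ensureNode` and `makeRef` are hypothetical names, and the `deref`/`equiv` methods merely mirror the shape of the `IDeref` and `IEquiv` interfaces mentioned above.

```typescript
interface GraphNode {
    $id: string;
    [prop: string]: unknown;
}

interface NodeRef {
    $ref: string;
    deref(): GraphNode | undefined;
    equiv(o: unknown): boolean;
}

type Graph = Map<string, GraphNode>;

// Returns the node for `id`, creating an empty placeholder
// (a forward declaration) if it is not known yet.
const ensureNode = (graph: Graph, id: string): GraphNode => {
    let node = graph.get(id);
    if (!node) {
        node = { $id: id };
        graph.set(id, node);
    }
    return node;
};

// Creates a lazy reference: the target is only looked up when deref() is called.
const makeRef = (graph: Graph, id: string): NodeRef => {
    ensureNode(graph, id);
    return {
        $ref: id,
        deref: () => graph.get(id),
        equiv: (o) =>
            typeof o === "object" &&
            o !== null &&
            (o as { $ref?: unknown }).$ref === id,
    };
};

const people: Graph = new Map();
const aliceNode = ensureNode(people, "alice");
aliceNode["name"] = "Alice";
// `bob` is referenced before it has been defined
aliceNode["knows"] = makeRef(people, "bob");
// returns the placeholder created by the reference above
const bobNode = ensureNode(people, "bob");
bobNode["name"] = "Robert";
bobNode["knows"] = makeRef(people, "alice");

const viaAlice = (aliceNode["knows"] as NodeRef).deref();
const bobsName = viaAlice ? viaAlice["name"] : undefined; // "Robert"
```

Because a reference only stores the target ID, the definition order of `alice` and `bob` does not matter and the cycle between them never has to be materialized.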

If node resolution is enabled (via the resolve option) in the parser, the referenced nodes will be inlined directly, which produces circular references in the JS result object. In many cases this is desirable; however, it prevents the graph from being serialized to JSON (for example).

{
  alice: <ref *1> {
    '$id': 'alice',
    name: 'Alice',
    knows: { '$id': 'bob', name: 'Robert', knows: [Circular *1] }
  },
  bob: <ref *2> {
    '$id': 'bob',
    name: 'Robert',
    knows: <ref *1> {
      '$id': 'alice',
      name: 'Alice',
      knows: [Circular *2]
    }
  }
}
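
The JSON limitation is a general property of circular JS objects and can be reproduced without the parser. In the sketch below the two objects are built by hand to mimic the inlined result above; the variable names are made up for this example.

```typescript
// Two hand-built nodes that inline each other, mimicking the resolved result.
const inlinedAlice: Record<string, unknown> = { $id: "alice", name: "Alice" };
const inlinedBob: Record<string, unknown> = {
    $id: "bob",
    name: "Robert",
    knows: inlinedAlice,
};
inlinedAlice["knows"] = inlinedBob;

// JSON.stringify() throws a TypeError for circular structures
let inlinedError = "";
try {
    JSON.stringify({ alice: inlinedAlice, bob: inlinedBob });
} catch (e) {
    inlinedError = (e as Error).name; // "TypeError"
}

// A `$ref` style value stays serializable: function-valued
// properties such as `deref` are simply omitted by JSON.stringify()
const refAlice = {
    $id: "alice",
    name: "Alice",
    knows: { $ref: "bob", deref: () => undefined },
};
const refJSON = JSON.stringify(refAlice);
// '{"$id":"alice","name":"Alice","knows":{"$ref":"bob"}}'
```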

Prefixed IDs

To enable namespacing and simplify re-use of existing data vocabularies, we're borrowing from existing Linked Data formats & tooling to allow node and property IDs to be defined in a prefix:name format alongside @prefix declarations. Such prefixed IDs will be expanded during parsing and usually form complete URIs, but could expand to any string. The 50+ commonly used Linked Data vocabulary prefixes bundled in @thi.ng/prefixes are available by default and can be overridden.

; prefix declaration
@prefix thi: http://thi.ng/

thi:toxi
    rdf:type -> foaf:person

{
  'http://thi.ng/toxi': {
    '$id': 'http://thi.ng/toxi',
    'http://www.w3.org/1999/02/22-rdf-syntax-ns#type': {
      '$id': 'http://xmlns.com/foaf/0.1/person'
    }
  },
  'http://xmlns.com/foaf/0.1/person': {
    '$id': 'http://xmlns.com/foaf/0.1/person'
  }
}
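
Prefix expansion itself is a simple string operation. The following sketch shows one possible way to expand a prefix:name ID against a table of declared prefixes. `expandID` and `KNOWN_PREFIXES` are hypothetical names, and leaving IDs with an unknown prefix unchanged is an assumption of this sketch, not documented parser behavior.

```typescript
type Prefixes = Record<string, string>;

// Expands `prefix:name` using `prefixes`. IDs without a colon or with an
// undeclared prefix are returned unchanged (an assumption of this sketch).
const expandID = (id: string, prefixes: Prefixes): string => {
    const idx = id.indexOf(":");
    if (idx < 0) return id;
    const prefix = id.substring(0, idx);
    return Object.prototype.hasOwnProperty.call(prefixes, prefix)
        ? prefixes[prefix] + id.substring(idx + 1)
        : id;
};

const KNOWN_PREFIXES: Prefixes = {
    thi: "http://thi.ng/",
    rdf: "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    foaf: "http://xmlns.com/foaf/0.1/",
};

const expandedNode = expandID("thi:toxi", KNOWN_PREFIXES);
// "http://thi.ng/toxi"
const expandedProp = expandID("rdf:type", KNOWN_PREFIXES);
// "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
const unprefixed = expandID("toxi", KNOWN_PREFIXES);
// "toxi"
```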


Currently in NodeJS only, external graph definitions can be included in the main graph via the @include directive. Any @prefix declarations in the included file are only available in that file; however, the included file inherits any prefixes already declared in the including file.

Relative file paths will be relative to the path of the currently processed file:

 |- include
 |  |- sub1.egf
 |  |- sub2.egf
 |- main.egf

(These examples make use of the schema.org ontology)

; main.egf
; declare an empty prefix
@prefix : http://thi.ng/

@include include/sub1.egf

; use empty prefix for this node
:toxi
    rdf:type -> schema:Person

; sub1.egf
@include sub2.egf

:sub1.egf
    rdf:type -> schema:Dataset
    schema:dateCreated #date 2020-07-19

; sub2.egf

:sub2.egf
    rdf:type -> schema:Dataset
    schema:creator -> :toxi

Parsing the main.egf file (with node resolution/inlining and pruning) produces:

{
  'http://thi.ng/sub2.egf': {
    '$id': 'http://thi.ng/sub2.egf',
    'http://www.w3.org/1999/02/22-rdf-syntax-ns#type': { '$id': 'http://schema.org/Dataset' },
    'http://schema.org/creator': {
      '$id': 'http://thi.ng/toxi',
      'http://www.w3.org/1999/02/22-rdf-syntax-ns#type': { '$id': 'http://schema.org/Person' }
    }
  },
  'http://thi.ng/toxi': {
    '$id': 'http://thi.ng/toxi',
    'http://www.w3.org/1999/02/22-rdf-syntax-ns#type': { '$id': 'http://schema.org/Person' }
  },
  'http://thi.ng/sub1.egf': {
    '$id': 'http://thi.ng/sub1.egf',
    'http://www.w3.org/1999/02/22-rdf-syntax-ns#type': { '$id': 'http://schema.org/Dataset' },
    'http://schema.org/dateCreated': 2020-07-19T00:00:00.000Z
  }
}
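
The two include rules described above (paths relative to the current file, and prefixes inherited downwards but never leaking upwards) can be sketched in a few lines. Both helpers are hypothetical and simplified: `resolveInclude` only handles forward-slash paths without `..` segments, and `childScope` models prefix inheritance as a shallow copy.

```typescript
// Resolves `includePath` against the directory of `currentFile`.
// Simplified: forward slashes only, no normalization of `..` or `.` segments.
const resolveInclude = (currentFile: string, includePath: string): string =>
    currentFile.substring(0, currentFile.lastIndexOf("/") + 1) + includePath;

type PrefixScope = Record<string, string>;

// An included file starts with a copy of the including file's prefixes,
// so its own declarations never affect the parent scope.
const childScope = (parent: PrefixScope): PrefixScope => ({ ...parent });

// paths from the directory layout shown above
const sub1Path = resolveInclude("main.egf", "include/sub1.egf");
// "include/sub1.egf"
const sub2Path = resolveInclude(sub1Path, "sub2.egf");
// "include/sub2.egf"

const mainScope: PrefixScope = { "": "http://thi.ng/" };
const sub1Scope = childScope(mainScope);
// hypothetical prefix, declared only inside the included file
sub1Scope["ex"] = "http://example.org/";
// `mainScope` still has no "ex" entry, while `sub1Scope`
// continues to see the empty prefix declared in main.egf
```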

EGF generation / serialization

Conforming JS objects can be converted to EGF using the toEGF() function. This function takes an iterable of Node objects, optional prefix mappings and an optional property serialization function to deal with custom tagged values. The default property formatter (toEGFProp()) handles various values for built-in tags and can be used in combination with any additional user-provided logic.

import { toEGF, toEGFProp } from "@thi.ng/egf";
import { rdf, schema } from "@thi.ng/prefixes";

const res = toEGF(
  // nodes to serialize
  [
    {
      $id: "thi:egf",
      "rdf:type": { $ref: "schema:SoftwareSourceCode" },
      "schema:isPartOf": { $id: "http://thi.ng/umbrella" },
      "schema:dateCreated": new Date("2020-02-16")
    },
    {
      $id: "thi:umbrella",
      "rdf:type": { $ref: "schema:SoftwareSourceCode" },
      "schema:programmingLanguage": "TypeScript"
    }
  ],
  // prefix mappings (optional)
  {
    thi: "http://thi.ng/",
    schema,
    rdf
  },
  // property serializer (optional)
  toEGFProp
);

@prefix thi: http://thi.ng/
@prefix schema: http://schema.org/
@prefix rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#

thi:egf
    rdf:type -> schema:SoftwareSourceCode
    schema:isPartOf -> thi:umbrella
    schema:dateCreated #date 2020-02-16T00:00:00.000Z

thi:umbrella
    rdf:type -> schema:SoftwareSourceCode
    schema:programmingLanguage TypeScript
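
To give an idea of what a property serialization function has to do, here is a simplified, self-contained sketch. It is not `toEGFProp()` and does not use its signature: `formatProp` and `formatWithLists` are hypothetical helpers which only cover strings, dates and node references, and which do not shorten IDs back to their prefixed form.

```typescript
type PropValue = string | Date | { $ref: string } | { $id: string };

// Formats a single indented property line, loosely following the output above.
const formatProp = (key: string, val: PropValue): string => {
    if (val instanceof Date) return `    ${key} #date ${val.toISOString()}`;
    if (typeof val === "string") return `    ${key} ${val}`;
    // node references, either as `$ref` value or as inlined node
    const target = "$ref" in val ? val.$ref : val.$id;
    return `    ${key} -> ${target}`;
};

// User-provided logic wrapping the default formatter:
// string arrays are emitted as `#list` values, everything else is delegated.
const formatWithLists = (key: string, val: PropValue | string[]): string =>
    Array.isArray(val)
        ? `    ${key} #list ${val.join(" ")}`
        : formatProp(key, val);

const dateLine = formatWithLists("schema:dateCreated", new Date("2020-02-16"));
// "    schema:dateCreated #date 2020-02-16T00:00:00.000Z"
const refLine = formatWithLists("rdf:type", { $ref: "schema:SoftwareSourceCode" });
// "    rdf:type -> schema:SoftwareSourceCode"
const listLine = formatWithLists("tag", ["graph", "format"]);
// "    tag #list graph format"
```

Wrapping a default formatter like this keeps the handling of built-in values in one place while custom value types are dealt with first.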


Karsten Schmidt


© 2020 Karsten Schmidt // Apache Software License 2.0

