Fast parser engine with helper functions for tokenizer and parser machines.

ILazyParserMachine

Available since version 1.0

not referred automatically

PROTOCOL with method -parse-seq

Extensions of this protocol on existing types are forbidden (for performance and/or backwards compatibility reasons).

A protocol for lazy parser machines. It is used in the lazy parser engine to turn a sequence of tokens into a sequence of parsed values.

-parse-seq

Available since version 1.0

not referred automatically

Usage:
  • (-parse-seq this seq parents)

Type signature:
  • (Any ⨯ ISeq ⨯ IPersistentList) → [Any ISeq]

Returns a [result rest-seq] pair: result is the (mostly lazy) result of parsing the input seq of values within the parents context (which includes this), and rest-seq is a lazy seq of the values left unparsed. The first item of the returned pair may be this, which signals that no value was produced.

See also: ILazyParserMachine
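
The [result rest-seq] contract can be illustrated with a minimal sketch in plain Clojure. The protocol below is a local stand-in declared only for this example, not the library's own definition, and VectorMachine, SkipMachine and the :close token are hypothetical names chosen for illustration.

```clojure
;; Local stand-in for the protocol described above; NOT the library definition.
(defprotocol ILazyParserMachine
  (-parse-seq [this tokens parents]))

;; Hypothetical machine: lazily collects tokens up to a :close token.
(defrecord VectorMachine []
  ILazyParserMachine
  (-parse-seq [this tokens parents]
    (let [[items more] (split-with #(not= :close %) tokens)]
      ;; result is a lazy seq of collected items; the :close token itself
      ;; is dropped from the unparsed rest
      [items (rest more)])))

;; Hypothetical machine that produces no value: it returns itself as the
;; first item of the pair and leaves the input untouched.
(defrecord SkipMachine []
  ILazyParserMachine
  (-parse-seq [this tokens parents]
    [this tokens]))

(defn parse-with
  "Calls -parse-seq with a parents list that contains only the machine itself."
  [machine tokens]
  (-parse-seq machine tokens (list machine)))
```

Under these assumptions, (parse-with (->VectorMachine) [1 2 :close 3]) yields a pair whose first item holds 1 and 2 and whose second item holds 3, while a SkipMachine hands back itself to signal that nothing was produced.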

ILazyParserMachineFactory

Available since version 1.0

not referred automatically

PROTOCOL with method -dispatch-lazy-parser

A factory protocol for lazy parser machines.

-dispatch-lazy-parser

Available since version 1.0

not referred automatically

Usage:
  • (-dispatch-lazy-parser this config state token parents)

Type signature:
  • (Any ⨯ {} ⨯ IReference ⨯ Any ⨯ IPersistentList) → Any

Returns a lazy machine based on the given token, in the context of the parents list of parent machines and the shared state map held in the state reference, or returns a value if processing the token did not require a dedicated machine. A dispatcher that wants to ignore a token should do so through the special parser machine ignore-token.

IParserMachine

Available since version 1.0

not referred automatically

PROTOCOL with methods -parse-value!, -parse-eof!

Extensions of this protocol on existing types are forbidden (for performance and/or backwards compatibility reasons).

A protocol for parser machines. It is used in the parser engine to process tokens into parsed values.

-parse-value!

Available since version 1.0

not referred automatically

Usage:
  • (-parse-value! this value parents)

Type signature:
  • (Any ⨯ Any ⨯ IPersistentList) → Any

Returns the result of parsing value in the context of the parents list of parent parser machines, which includes this. The value argument may have been produced by a parser dispatcher or by a child parser machine. The result is a parser machine if parsing has not yet finished, or a value if it has.

See also: IParserMachine

-parse-eof!

Available since version 1.0

not referred automatically

Usage:
  • (-parse-eof! this parents)

Type signature:
  • (Any ⨯ IPersistentList) → Any

Returns the result of the parser after the end of file has been reached, in the context of the parents list of parent parser machines, which includes this. The result is either this, if no value is produced, or a value, if parsing is finished. Throws if the input is incomplete and the parser is configured to throw on incomplete input.

See also: IParserMachine
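
The interplay of -parse-value! and -parse-eof! can be sketched in plain Clojure as follows. The protocol is a local stand-in declared for this example only; CollectMachine, the :end value and run-machine are hypothetical. The sketch uses an immutable record for simplicity, whereas the bang in the method names suggests that real machines may mutate internal state.

```clojure
;; Local stand-in for the protocol described above; NOT the library definition.
(defprotocol IParserMachine
  (-parse-value! [this value parents])
  (-parse-eof! [this parents]))

;; Hypothetical machine: collects values into a vector until it sees :end.
(defrecord CollectMachine [config acc]
  IParserMachine
  (-parse-value! [this value parents]
    (if (= :end value)
      acc                                  ; finished: return the parsed value
      (assoc this :acc (conj acc value)))) ; unfinished: return a machine
  (-parse-eof! [this parents]
    (case (:incomplete-mode config)
      :keep   acc   ; return the result so far
      :ignore this  ; no value produced
      ;; nil (and, in this sketch, any other value) throws
      (throw (ex-info "Incomplete input" {:acc acc})))))

(defn run-machine
  "Hypothetical driver: feeds values to the machine until it returns a
  non-machine value, or signals end of file when the input runs out."
  [machine values]
  (loop [m machine, vs (seq values)]
    (cond
      (not (satisfies? IParserMachine m)) m
      (nil? vs) (-parse-eof! m (list m))
      :else (recur (-parse-value! m (first vs) (list m)) (next vs)))))
```

With this driver, complete input such as [1 2 :end] produces the collected vector, while truncated input either throws, returns the partial vector, or returns the machine itself, depending on :incomplete-mode.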

IParserMachineFactory

Available since version 1.0

not referred automatically

A factory protocol for tokenizer and parser machines.

-parser-config

Available since version 1.0

not referred automatically

Usage:
  • (-parser-config this)

Type signature:
  • (Any) → {}

Returns a configuration map that is passed into machines created directly or indirectly with this parser machine factory. By convention, the configuration map is passed as the first argument of a machine constructor. The following keys are recognized by the parser engine and its helper fns:

  • :incomplete-mode - Controls how a machine should behave if the end of file is reached before the machine is done. Valid values are nil (throws), :keep (returns the result so far or the most probable result) or :ignore (behaves as if there was nothing to parse).

  • :token-item-limit - Specifies the maximum number of input items a token can have before a limit exception is triggered.

  • :container-level-limit - Specifies the maximum number of nesting levels a container can have before a limit exception is triggered.

  • :container-item-limit - Specifies the maximum number of items a container can have before a limit exception is triggered.

  • :initial-state - Specifies the initial state of a parser machine. Can be nil or a map.
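
As an illustration, a configuration map using the keys listed above might look like the sketch below. The name example-config and all numeric limits are assumptions picked for the example; they are not defaults taken from the library.

```clojure
;; Hypothetical configuration map; the concrete values are illustrative only.
(def example-config
  {:incomplete-mode       :keep   ; nil would throw, :ignore would skip
   :token-item-limit      1024    ; max input items per token
   :container-level-limit 32      ; max nesting depth of containers
   :container-item-limit  10000   ; max items per container
   :initial-state         {}})    ; nil or a map
```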

-parser-from-type

Available since version 1.0

not referred automatically

Usage:
  • (-parser-from-type this)

Type signature:
  • (Any) → (U nil Class+ Type)

Returns the type of the items that are to be parsed.

-parser-to-type

Available since version 1.0

not referred automatically

Usage:
  • (-parser-to-type this)

Type signature:
  • (Any) → (U nil Class+ Type)

Returns the type of the parsed values.

-dispatch-tokenizer

Available since version 1.0

not referred automatically

Usage:
  • (-dispatch-tokenizer this config state item)

Type signature:
  • (Any ⨯ {} ⨯ IReference ⨯ Any) → Any

Returns a token or a tokenizer machine as the result of dispatching on the given input item, according to the config configuration and the shared state map held in the state reference. Returns this if the item should be ignored (e.g. whitespace).
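
The three possible outcomes of tokenizer dispatch (ignore, immediate token, dedicated machine) can be sketched in plain Clojure. The protocol here is a local single-method stand-in, not the library's definition; SketchFactory and NumberTokenizer are hypothetical, an atom stands in for the state reference, and representing punctuation tokens as keywords is an assumption made for this example.

```clojure
;; Local stand-in with only the method under discussion; NOT the library definition.
(defprotocol IParserMachineFactory
  (-dispatch-tokenizer [this config state item]))

;; Hypothetical placeholder for a tokenizer machine that would go on to
;; consume the remaining digits of a number.
(defrecord NumberTokenizer [config digits])

(defrecord SketchFactory []
  IParserMachineFactory
  (-dispatch-tokenizer [this config state item]
    (cond
      ;; whitespace is ignored: signal this by returning the factory itself
      (Character/isWhitespace (char item)) this
      ;; a multi-item token needs a dedicated tokenizer machine
      (Character/isDigit (char item)) (->NumberTokenizer config [item])
      ;; a single-item token is returned directly
      :else (keyword (str item)))))
```

Calling the dispatcher on a space character returns the factory itself, on a digit returns a NumberTokenizer seeded with that digit, and on any other character returns a keyword token.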

-dispatch-parser

Available since version 1.0

not referred automatically

Usage:
  • (-dispatch-parser this config state token parents)

Type signature:
  • (Any ⨯ {} ⨯ IReference ⨯ Any ⨯ IPersistentList) → Any

Returns a machine based on the given token, according to the config configuration and the shared state map held in the state reference, in the context of the parents list of parent machines, or returns a value if processing the token did not require a dedicated machine. A dispatcher that wants to ignore a token should do so through the special parser machine called ignore-token.

ITokenizerMachine

Available since version 1.0

not referred automatically

PROTOCOL with methods -analyze-batch!, -analyze-eof!

Extensions of this protocol on existing types are forbidden (for performance and/or backwards compatibility reasons).

A protocol for tokenizer machines. It is used in the parser engine to convert an input collection (e.g. of chars or bytes) into a collection of tokens. Parsers should implement tokenizer machines for tokens that consist of multiple input values.

See also: tokenizer-machine?

-analyze-batch!

Available since version 1.0

not referred automatically

Usage:
  • (-analyze-batch! this bm batch)

Type signature:
  • (Any ⨯ BatchManager ⨯ AnyBatch) → Any

Returns the result of analyzing the input batch managed by the bm batch manager, setting the position of batch to the next item not yet analyzed, or to the batch’s current limit. The result is either a tokenizer machine, if the analysis is not yet finished, or a token, if it is.

Use the leftover function to return a leftover batch alongside the token if items outside the current batch were left unanalyzed (backtracking across batches).

See also: ITokenizerMachine

-analyze-eof!

Available since version 1.0

not referred automatically

Usage:
  • (-analyze-eof! this)

Type signature:
  • (Any) → Any

Returns the result of the tokenizer after the end of file has been reached. The result is either this, if no token is produced, or a token, if one could be produced.

Use the leftover function to return a leftover batch alongside the token if items outside the current batch were left unanalyzed (backtracking across batches). Throws if the input is incomplete and the tokenizer is configured to throw on incomplete input.

See also: ITokenizerMachine