API for protoflex.parse
by Panduranga Adusumilli
Full namespace name:
protoflex.parse
Overview
Clojure Parser Library.
Public Variables and Functions
ang-brackets
function
Usage: (ang-brackets parse-fn)
Returns the result of applying the specified parse function to the text
between the opening and closing angle brackets '<' and '>'.
any
function
Usage: (any & parse-fns)
Returns the result of the first successfully matching parse-function.
If none of the parse functions match, an exception is thrown.
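A minimal sketch of choosing between alternatives with any, assuming protoflex.parse is on the classpath. The alias p and the names int-or-quoted, from-int and from-str are illustrative and not part of the library.

```clojure
(require '[protoflex.parse :as p])

;; Parse functions take no arguments, so alternatives are passed as plain
;; function values; any tries them in the order given.
(defn int-or-quoted []
  (p/any p/integer p/dq-str))

;; The first alternative (integer) matches here.
(def from-int (p/parse int-or-quoted "42"))

;; integer fails on a quote character, so dq-str is tried next.
(def from-str (p/parse int-or-quoted "\"hi\""))
```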
any-string
function
Usage: (any-string sep)
Reads a single-quoted or double-quoted or a plain-string that is followed
by the specified separator sep; the separator is not part of the returned
string.
at-end?
function
Usage: (at-end?)
Returns true if no more input is left to be read; false otherwise.
attempt
function
Usage: (attempt parse-fn)
Tries to match the input at the current position with the provided
parse function. If the parse function matches successfully, the matched
text is returned and the input cursor advances by the length of the
matched text. Otherwise a nil is returned and the current position
in the input remains unchanged.
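The following sketch shows both outcomes of attempt, assuming protoflex.parse is on the classpath. The alias p and the names hit and miss are illustrative. The :eof false option is passed in the second call so that the unread input does not trigger an error.

```clojure
(require '[protoflex.parse :as p])

;; The keyword "let" is present, so attempt returns the match.
(def hit (p/parse #(p/attempt #(p/string "let")) "let"))

;; The keyword is absent; attempt returns nil instead of throwing and
;; leaves the cursor where it was.
(def miss (p/parse #(p/attempt #(p/string "let")) "var" :eof false))
```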
auto-trim-if
function
Usage: (auto-trim-if)
Automatically trims the leading input text if the :auto-trim option is set to true.
auto-trim-off
function
Usage: (auto-trim-off)
Turns off the auto-trim option.
auto-trim-on
function
Usage: (auto-trim-on)
Turns on the auto-trim feature, which cleans trailing white-space, comments,
or whatever the custom ws-reader, if any, is specified to clean.
back-to-mark
function
Usage: (back-to-mark mark)
Resets the positional parameters to a previously set mark.
between
function
Usage: (between start-fn parse-fn end-fn)
Applies the supplied start-fn, parse-fn and end-fn functions and returns
the result of parse-fn. This is typically used to parse content enclosed by
some delimiters on either side.
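As a sketch of the delimiter use case, the following parses an integer enclosed in square brackets, assuming protoflex.parse is on the classpath. The alias p and the names bracketed-int and n are illustrative.

```clojure
(require '[protoflex.parse :as p])

;; start-fn and end-fn match the delimiters; only the result of the
;; middle parse function is returned.
(defn bracketed-int []
  (p/between #(p/chr \[) p/integer #(p/chr \])))

(def n (p/parse bracketed-int "[42]"))
```

The helpers ang-brackets, braces, parens and sq-brackets documented in this namespace cover the common delimiter pairs.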
blk-cmt
function
Usage: (blk-cmt beg end)
Reads and returns a block comment as specified by the begin and end
markers. Throws an exception if the specified block-comment doesn't
occur at the current position.
blk-cmt?
function
Usage: (blk-cmt? beg end)
Similar to blk-cmt but returns a nil instead of throwing an exception
in case of a match failure.
braces
function
Usage: (braces parse-fn)
Returns the result of applying the specified parse function to the text
between the opening and closing braces '{' and '}'.
chr
function
Usage: (chr ch)
If the next character in the input matches the specified character ch,
returns it; otherwise throws an exception.
chr-
function
Usage: (chr- ch)
Same as chr but with auto-trimming turned off for the following input
chr-in
function
Usage: (chr-in chars)
If the next character in the input matches any character in the specified
string or character collection, the matching character is returned.
Otherwise throws an exception.
chr-in-
function
Usage: (chr-in- chars)
Same as chr-in but with auto-trimming turned off for the following input
decimal
function
Usage: (decimal)
Parses a decimal value and returns a Double.
dq-str
function
Usage: (dq-str)
Parses a double-quoted string and returns the matched string (minus the quotes)
eval-expr
function
Usage: (eval-expr & args)
Parses and evaluates an expression in infix notation.
Args: expression-string followed by parser options. See parse function for details.
eval-expr-tree
function
Usage: (eval-expr-tree ptree)
Evaluates the parse tree returned by expr parse function.
expect
function
Usage: (expect expected parse-fn)
Customize error message; if the specified parse function doesn't match
the current input text, the error message of the parse exception will include
the specified custom expected-message.
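A small sketch of a customized error message, assuming protoflex.parse is on the classpath and that parse failures surface as a java.lang.Exception (or a subclass). The alias p, the name msg and the message text are illustrative.

```clojure
(require '[protoflex.parse :as p])

;; integer cannot match "http", so the parse fails; the exception message
;; is captured so that the custom text can be inspected.
(def msg
  (try
    (p/parse #(p/expect "a port number" p/integer) "http")
    (catch Exception e
      (.getMessage e))))
```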
expr
function
Usage: (expr)
(expr nodes)
Parses expressions and returns the parse tree as nested vectors.
get-opt
function
Usage: (get-opt k)
(get-opt k d)
Returns the value for parser option k; if the optional default value
parameter d is specified, its value is returned if the option k is not set
in parser options.
integer
function
Usage: (integer)
Parses a long integer value and returns a Long.
lexeme
function
Usage: (lexeme parse-fn)
Applies the specified parse function for current input text, consumes any
following whitespace, comments and returns the result of the parse function
application.
line-cmt
function
Usage: (line-cmt beg)
Reads and returns a line comment as specified by the begin marker.
Throws an exception if the specified line comment doesn't occur at the
current position.
line-cmt?
function
Usage: (line-cmt? beg)
Similar to line-cmt but returns a nil instead of throwing an exception
in case of a match failure.
line-pos
function
Usage: (line-pos)
Returns [line column] vector representing the current cursor position
of the parser
line-pos-str
function
Usage: (line-pos-str)
Returns line position in a descriptive string.
look-ahead
function
Usage: (look-ahead [la pf & rest])
Takes a collection of look-ahead-string and parse-function pairs and applies
the first parse function that follows the matching look-ahead-string and
returns the result, or throws a parse exception if the parse function fails.
If none of the look-ahead strings match the current text, an exception is thrown.
To specify a default parse function, provide an empty string as look-ahead and
the default parse function at the end of the argument list.
Args: [la-str-1 parse-fn-1 la-str-2 parse-fn-2 ...]
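The dispatch described above can be sketched as follows, assuming protoflex.parse is on the classpath. The alias p and the names value, v1 and v2 are illustrative. Because look-ahead does not consume the look-ahead string, the quote is still available to dq-str and sq-str.

```clojure
(require '[protoflex.parse :as p])

;; Pick a parse function by peeking at the next character; the empty
;; string at the end of the vector acts as the default branch.
(defn value []
  (p/look-ahead ["\"" p/dq-str
                 "'"  p/sq-str
                 ""   p/number]))

(def v1 (p/parse value "\"abc\""))
(def v2 (p/parse value "3.5"))
```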
look-ahead*
function
Usage: (look-ahead* [la pf & rest])
Same as look-ahead, but consumes the matching look-ahead string before
applying the corresponding parse function.
mark-pos
function
Usage: (mark-pos)
Returns the current positional parameters of the parser.
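A sketch of manual backtracking with mark-pos and back-to-mark, assuming protoflex.parse is on the classpath. The alias p and the names read-twice and twice are illustrative.

```clojure
(require '[protoflex.parse :as p])

;; Remember the position, read an integer, rewind, and read it again.
(defn read-twice []
  (let [mark  (p/mark-pos)
        first-read (p/integer)]
    (p/back-to-mark mark)
    [first-read (p/integer)]))

(def twice (p/parse read-twice "7"))
```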
multi*
function
Usage: (multi* parse-fn)
Matches zero or more occurrences of the provided parse function and returns
the results in a vector.
multi+
function
Usage: (multi+ parse-fn)
Matches one or more occurrences of the provided parse function and returns
the results in a vector. If the parse function doesn't match even once, an
exception is thrown.
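The difference between multi* and multi+ can be sketched as follows, assuming protoflex.parse is on the classpath. The alias p and the names nums and no-match are illustrative. The :auto-trim option is passed explicitly so that the example does not depend on its default value.

```clojure
(require '[protoflex.parse :as p])

;; With auto-trim on, the whitespace between the numbers is skipped
;; after each match.
(def nums (p/parse #(p/multi* p/integer) "1 2 3" :auto-trim true))

;; multi+ requires at least one match, so input without a leading
;; integer makes it throw.
(def no-match
  (try
    (p/parse #(p/multi+ p/integer) "abc" :eof false)
    (catch Exception _ :no-match)))
```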
no-trim
function
Usage: (no-trim fn)
Similar to with-trim-off, but takes a function as a parameter instead of
the body
no-trim-nl
function
Usage: (no-trim-nl fn)
Turns off automatic trimming of newline characters (as part of white-space)
and executes the specified function. The earlier auto-trim options are restored
at the end of execution of the specified function.
number
function
Usage: (number)
Matches an integral or non-integral numeric value. While the function decimal
also matches both integer and non-integer values, it always returns a Double,
whereas number returns a Long for integers and a Double for non-integers.
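The return types described above can be checked with a short sketch, assuming protoflex.parse is on the classpath. The alias p and the names a, b and c are illustrative.

```clojure
(require '[protoflex.parse :as p])

(def a (p/parse p/number "12"))    ; integer input read with number
(def b (p/parse p/number "12.5"))  ; non-integer input read with number
(def c (p/parse p/decimal "12"))   ; integer input read with decimal

;; Collect the classes to compare number and decimal side by side.
(def types (mapv class [a b c]))
```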
opt
function
Usage: (opt parse-fn)
(opt parse-fn default-val)
Same as attempt, but accepts a default value argument to return in case the
specified parse function fails. Useful for matching optional text.
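A sketch of an optional value with a fallback, assuming protoflex.parse is on the classpath. The alias p, the names given and fallback, and the default of 80 are illustrative.

```clojure
(require '[protoflex.parse :as p])

;; The integer is present, so its value is returned.
(def given (p/parse #(p/opt p/integer 80) "8080"))

;; The integer is absent; the default value is returned and the input
;; is left unread, hence :eof false.
(def fallback (p/parse #(p/opt p/integer 80) "abc" :eof false))
```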
parens
function
Usage: (parens parse-fn)
Returns the result of applying the specified parse function to the text
between the opening and closing parentheses '(' and ')'.
parse
function
Usage: (parse parse-fn input-str & opts)
Starts parsing the provided input string using the specified parse function.
The following parser options may be provided to alter the comment and
white-space matching behavior:
  :blk-cmt-delim  - vector specifying start and end of block-comment markers
  :line-cmt-start - string specifying the begin marker of a line comment
  :ws-regex       - regular expression for matching (non-comment) white space
  :auto-trim      - whether to automatically remove the leading whitespace/comments
                    at the current position in the input text or immediately after
                    a parse action.
  :word-regex     - regular expression for matching words
  :operators      - a vector of vectors of operators in decreasing order of
                    precedence; see get-default-ops function for an example.
  :op-fn-map      - a map of operator and the function to call for that operator
                    when evaluating expressions
  :eof            - if true, the parse function must consume the entire input text
Args:
  parse-fn  - parse function to apply
  input-str - input text to be parsed
  opts      - key value options (listed above)
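As a sketch of passing options as key value pairs, the following contrasts the two settings of :eof, assuming protoflex.parse is on the classpath and that parse failures surface as a java.lang.Exception (or a subclass). The alias p and the names strict and lenient are illustrative.

```clojure
(require '[protoflex.parse :as p])

;; With :eof true the whole input must be consumed, so the text that
;; follows the integer makes the parse fail.
(def strict
  (try
    (p/parse p/integer "42 rest" :eof true)
    (catch Exception _ :extraneous)))

;; With :eof false the unread remainder is ignored.
(def lenient (p/parse p/integer "42 rest" :eof false))
```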
parser-init
function
Usage: (parser-init input-str)
(parser-init input-str opts)
Initializes the parser state with the specified input string and options.
read-ch
function
Usage: (read-ch)
(read-ch is-no-auto-trim)
Reads and returns the next input character. Throws an exception if the
current position is at the end of the input.
read-ch-in-set
function
Usage: (read-ch-in-set char-set)
(read-ch-in-set char-set is-no-auto-trim)
Reads and returns the next character if it matches any of the characters
specified in the provided set. An exception is thrown otherwise. The
optional is-no-auto-trim argument may be used to specify whether or not
to apply auto-trim after reading the next character.
read-n
function
Usage: (read-n n)
Reads and returns an n-character string at the current position.
read-re
function
Usage: (read-re re)
(read-re re grp)
Reads the string matching the specified regular expression. If a match-group
is specified, the corresponding text is returned; otherwise the entire matched
text is returned.
read-to
function
Usage: (read-to s)
The parser skips to the position where the text contains the string
specified by s. The string itself is not consumed, that is the cursor is
positioned at the beginning of the match. If the specified string is not
found, cursor position does not change and a parse exception is thrown.
read-to-re
function
Usage: (read-to-re re)
Reads and returns text up to but not including the text matched by the
specified regular expression. If the specified regular expression doesn't
occur in the remaining input text, an exception is thrown.
read-ws
function
Usage: (read-ws)
Reads whitespace (including comments) using a whitespace reader based
on parser options. If the :ws-reader option is not set, a default whitespace
reader based on other parser options such as :ws-regex, :blk-cmt-delim and
:line-cmt-start will be used. Returns the whitespace read.
regex
function
Usage: (regex re)
(regex re grp)
Returns the text matched by the specified regex. If a group is specified,
the returned text is for that group only. In either case, the cursor is
advanced by the length of the entire matched text (group 0).
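A short sketch of the optional group argument, assuming protoflex.parse is on the classpath. The alias p, the pattern and the names whole and port are illustrative.

```clojure
(require '[protoflex.parse :as p])

;; Without a group, the entire match is returned.
(def whole (p/parse #(p/regex #"(\w+)=(\d+)") "port=8080"))

;; With group 2, only the digits are returned; the cursor still moves
;; past the whole match, so the input is fully consumed.
(def port (p/parse #(p/regex #"(\w+)=(\d+)" 2) "port=8080"))
```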
sep-by
function
Usage: (sep-by fld-fn fld-sep-fn)
(sep-by fld-fn fld-sep-fn rec-sep-fn)
Reads a record using the specified field, field-separator and record-separator
parse functions. If no record-separator is specified, a newline character
is used as record separator. Returns the fields of the record in a vector.
series
function
Usage: (series & parse-fns)
Matches a sequence of parse functions and returns their results in
a vector. Each successful match by a parse function advances the cursor.
If any of the parse functions fails, an exception is thrown.
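A sketch of matching a fixed sequence, assuming protoflex.parse is on the classpath. The alias p and the names pair and result are illustrative.

```clojure
(require '[protoflex.parse :as p])

;; An integer, a comma, and another integer; the result vector holds
;; one element per parse function, including the separator.
(defn pair []
  (p/series p/integer #(p/chr \,) p/integer))

(def result (p/parse pair "1,2"))
```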
set-blk-cmt-opts
function
Usage: (set-blk-cmt-opts beg end)
Sets block comment begin and end markers.
set-line-cmt-opts
function
Usage: (set-line-cmt-opts beg)
Sets line comment begin marker.
set-opt
function
Usage: (set-opt k v)
Sets parser option k to value v
set-ws-reader
function
Usage: (set-ws-reader ws-reader)
This sets the white-space parser to be used when auto-trim is set.
If this is specified, it overrides the options set by set-blk-cmt-opts,
set-line-cmt-opts and set-ws-regex options.
set-ws-regex
function
Usage: (set-ws-regex ws-re)
Sets the regular expression to be used for matching non-comment white-space.
skip-over
function
Usage: (skip-over s)
Finds the specified string s in the input and skips over it. If the string
is not found, a parse exception is thrown.
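A sketch of skipping a prefix before reading a value, assuming protoflex.parse is on the classpath. The alias p and the names value-after-eq and answer are illustrative.

```clojure
(require '[protoflex.parse :as p])

;; Skip everything up to and including "=", then read the integer that
;; follows; the return value of skip-over is not used here.
(defn value-after-eq []
  (p/skip-over "=")
  (p/integer))

(def answer (p/parse value-after-eq "answer=42"))
```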
skip-over-re
function
Usage: (skip-over-re re)
Reads and returns text up to and including the text matched by the
specified regular expression. If the specified regular expression doesn't
occur in the remaining input text, an exception is thrown.
sq-brackets
function
Usage: (sq-brackets parse-fn)
Returns the result of applying the specified parse function to the text
between the opening and closing square brackets '[' and ']'.
sq-str
function
Usage: (sq-str)
Parses a single-quoted string and returns the matched string (minus the quotes)
starts-with-re?
function
Usage: (starts-with-re? re)
Returns a boolean value indicating whether the specified regular expression
matches the input at the current position.
starts-with?
function
Usage: (starts-with? s)
Returns a boolean value indicating whether the current input text matches
the specified string.
string
function
Usage: (string s)
If the input matches the specified string, the string is
returned. Otherwise, a parse exception is thrown.
string-in
function
Usage: (string-in strings)
Returns the longest string from the provided strings that matches text
at the current position. Throws an exception if none of the strings match.
string-in-ord
function
Usage: (string-in-ord strings)
Returns the first string from the provided strings that matches text
at the current position. Throws an exception if none of the strings match.
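The difference between string-in and string-in-ord shows up when one candidate is a prefix of another. A minimal sketch, assuming protoflex.parse is on the classpath; the alias p and the names ops, longest and first-listed are illustrative.

```clojure
(require '[protoflex.parse :as p])

(def ops ["<" "<="])

;; string-in prefers the longest candidate that matches.
(def longest (p/parse #(p/string-in ops) "<="))

;; string-in-ord takes the first candidate in the given order, leaving
;; the "=" unread, hence :eof false.
(def first-listed (p/parse #(p/string-in-ord ops) "<=" :eof false))
```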
throw-ex
function
Usage: (throw-ex)
(throw-ex msg)
Throws an exception; this is usually called to indicate a match
failure in a parse function.
unexpected
function
Usage: (unexpected actual expected)
Creates a message string for an unexpected-input exception.
with-trim-off
macro
Usage: (with-trim-off & body)
Executes the provided body with auto-trim option set to false. The earlier
value of the auto-trim option is restored after executing the body.
with-trim-on
macro
Usage: (with-trim-on & body)
Executes the provided body with auto-trim option set to true. The earlier
value of the auto-trim option is restored after executing the body.
word
function
Usage: (word w)
Returns the specified word if the word occurs at the current position in
the input text; an exception is thrown otherwise.
word-in
function
Usage: (word-in str-coll)
(word-in str-coll word-reader)
Returns the first word from the provided words that matches text
at the current position. Throws an exception if none of the words match.
An optional word-reader parse-function may be provided to read words.
ws
function
Usage: (ws)
(ws bcb bce lcb wsre)
Matches white space (including comments) at the current position. The
optional parameters bcb, bce, lcb and wsre specify block-comment-begin,
block-comment-end, line-comment-begin and white-space-regex respectively.
If they are not specified here, the options set for the parser are used.
Throws an exception if white space doesn't occur at the current position.
ws?
function
Usage: (ws? & args)
Similar to ws except that a nil value is returned instead of throwing
an exception in case of a match failure.