Parser.UTF8

is a submodule of codepoint-level parsers layered on the byte-level core. Each codepoint advances the cursor by one column, regardless of byte width.

any-char

defn

(Fn [] (Parser Char))

(any-char)

consumes one UTF-8 codepoint, returning it as a Char. Fails empty at end of input or on malformed UTF-8.
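The cursor behaviour described above can be modelled outside Carp. The following Python sketch is not the library's implementation: `State` and `any_char` are hypothetical names, and it assumes that "fails empty" means failing without consuming any input. It illustrates the byte offset moving by the encoded width while the column moves by exactly one.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class State:
    """Hypothetical cursor: byte offset into the input plus a column counter."""
    data: bytes
    pos: int = 0
    col: int = 0


def any_char(s: State) -> Optional[Tuple[str, State]]:
    """Model of any-char: decode one codepoint or fail without consuming input."""
    # A UTF-8 codepoint occupies 1 to 4 bytes, so try each width in turn.
    for width in range(1, 5):
        chunk = s.data[s.pos:s.pos + width]
        try:
            text = chunk.decode("utf-8")  # strict mode rejects malformed sequences
        except UnicodeDecodeError:
            continue
        if len(text) == 1:
            # The byte offset moves by the encoded width, the column by one.
            return text, State(s.data, s.pos + len(chunk), s.col + 1)
    return None  # end of input or malformed UTF-8; the caller's state is unchanged


# 'é' (U+00E9) is encoded as two bytes, '!' as one.
start = State("é!".encode("utf-8"))
first, after_first = any_char(start)
second, after_second = any_char(after_first)
```

In this model `after_first` sits at byte offset 2 but column 1, which is the distinction the module draws between byte width and column position.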

char

defn

(Fn [Char] (Parser Char))

(char c)

consumes the specific Unicode codepoint c. Fails empty if the next codepoint in the input is not c.

codepoint-satisfy

defn

(Fn [(Fn [Char] Bool a), String] (Parser Char))

(codepoint-satisfy pred lbl)

consumes one UTF-8 codepoint if pred holds for it. Fails empty if pred rejects the codepoint, at end of input, or on malformed UTF-8.
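As a rough illustration of the predicate-driven behaviour, here is a Python model rather than the Carp implementation. The names `decode_one` and `codepoint_satisfy`, and the tuple-shaped results, are hypothetical. The source does not say how `lbl` is used; the sketch assumes it is a label reported on failure, and that "fails empty" means the position is left unchanged.

```python
from typing import Callable, Optional, Tuple


def decode_one(data: bytes, pos: int) -> Optional[Tuple[str, int]]:
    """Decode one codepoint at pos; return (char, byte_width) or None."""
    for width in range(1, 5):  # UTF-8 codepoints occupy 1 to 4 bytes
        chunk = data[pos:pos + width]
        try:
            text = chunk.decode("utf-8")  # strict mode rejects malformed input
        except UnicodeDecodeError:
            continue
        if len(text) == 1:
            return text, len(chunk)
    return None  # end of input or malformed UTF-8


def codepoint_satisfy(pred: Callable[[str], bool], lbl: str):
    """Model of codepoint-satisfy: build a parser from a predicate and a label."""
    def parse(data: bytes, pos: int = 0):
        decoded = decode_one(data, pos)
        if decoded is None or not pred(decoded[0]):
            # Assumed failure shape: position unchanged, label attached.
            return ("fail", pos, lbl)
        ch, width = decoded
        return ("ok", pos + width, ch)
    return parse


upper = codepoint_satisfy(str.isupper, "uppercase letter")
accepted = upper("Éa".encode("utf-8"))   # predicate holds for 'É'
rejected = upper("éa".encode("utf-8"))   # predicate fails, nothing consumed
```

Under the same assumptions, matching one specific codepoint `c` is the special case where the predicate is `lambda ch: ch == c`.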