Haskell Parsers Part 2.

This post continues from Haskell Parsers Part 1 and here we look at elevating the parser first to a Functor then an Applicative and finally to a Monad. Creating Functor, Applicative and Monad instances for the parser will provide useful operations and allow parsers to be combined in interesting and useful ways.
For a Functor we need to implement the fmap function and in the context of a Parser the signature of fmap is

in other words “A function a -> b applied to the result of a parser of ‘a’ gives a parser of ‘b'” and the implementation is P with a lambda expression (\s -> …

Next is Applicative. Whereas Functor applies a function to a value in a context, Applicative takes a function in a context and applies it to a value in a context. Applicative has two key functions to implement.

Pure is “the minimum needed to put a value in a context” and <*> known as ‘apply’ or ‘tie-fighter’ and it “applies a parser that gives a function and applies that function to the result of applying another parser and the result of the function application is in the parser that’s returned” (Sounds convoluted but later a few examples will clarify things.) Anyway I think the implementation is much clearer than the ‘hand wavy’ description!

and notice that fmap fab pa is using fmap (from Functor) to apply the function (fab) ‘into’ the pa (the Parser a) which ends up being Parser b.
Finally we look at a Monad for parsers. Having a Monad will allow us to sequence parsers using ‘>>’, ‘>>=’ or with the ‘do’ notation.

The return function is the same as the pure function in Applicative. The bind function (>>=) takes a Parser of ‘a’ , a function that takes an ‘a’ returning a Parser of ‘b’ and returns a Parser of ‘b’.

With the scene set lets create some simple parsers and ‘mix n match’ them into more sophisticated parsers. This first one is ‘satisfy’, a parser that succeeds if a condition is met and fails otherwise.

remembering that ‘ch’, on line 3, is a parser that parses the first char from the input string, then the above applies ‘ch’ and sets the result to an empty parser if the predicate, p, fails and a parser of Char if the predicate succeeds.
This can also be written in a slightly more succinct way as

It is often a matter of personal preference as to which style is preferred.
So with the ‘satisfy’ parser in place we can make a parser for a specific character.

and trying these in ghci:

Here are a couple of predicates and a number of parsers that use a predicate as input to the ‘satisfy’ parser.

So we have ‘small’ parsers that will parse a single character with or without some restriction on the character to be parsed. Next we will look at how to parse for a string. The idea is to keep calling the ‘char’ parser with the head of the input whilst removing the head each time. Here’s an implementation in the ‘Monadic style’.

The same string parsing function can be written in the applicative style like this:

Here are some examples in ghci:

We have parsers that can parse a particular string or, using a predicate, parse for certain characters. However we don’t yet have a way of taking the input string and, for example, trying a number of parsers until one works or there are no parsers left. For example imagine we’re parsing some log file that has HTTP verbs at the start of a line and the verb could be one of “GET”, “POST”, “DELETE” followed by a ‘,’ and then the rest of the line.

Now we can parse any one of these by using a suitable string parser – e.g. string "GET" or string "POST" or string "DELETE"
So what we need to do is write code to give a choice between the three parsers. If the first for “GET” works then done otherwise try the next one and so on. Haskell has a typeclass to provide such behaviour!

And its Parser based incantation is

So using this we can define a parser for some of the HTTP verbs.

And some ghci examples:

Noticing that the hypothetical line of log output has a ‘,’ after the http verb we can parse the ‘,’ away in either monadic style

or in an applicative fashion

In both styles the ‘,’ is silently dropped.
Having the ‘or’ or ‘choice’ operator, ‘<|>’, will now allow us to expand the sophistication of our parsers. To add more capability to our parsers consider the motivating problem of how to parse a number given that our ‘digit’ parser can only take one char from the input string?. i.e. in ghci

The result of parsing is ‘1’ and the remainder is “234”. What we’d like to do is apply the digit parser repeatedly until it fails. To achieve this we define two mutually recursive parsers, ‘many’ i.e. zero or more and ‘many1’ – i.e. at least 1.

These two are a ‘push me pull you’ sort of thing. Start with ‘many‘ it calls ‘many1‘ which puts ‘:’ (list concatenation) into a parser context and applies ‘p’. Control then goes back to ‘many‘ – it then calls ‘many1‘ this continues and I imagine it as building up a bunch of ‘:’ operations until the parser ‘p’ fails and at that point the whole lot gets resolved.

Now we can create a parser for (unsigned) integers. The idea is to keep applying the ‘digit’ parser and then use the ‘read’ function to make an integer from a string. In Monadic and also Applicative form.

and some examples in ghci.

All the parsers we’ve looked at are typically Parser String or Parser Char
In the next (third) post of this series we’ll look at creating more parsers and parser combinations with a view, in the fourth part, to parsing into data structures.

Leave a Reply

Your email address will not be published. Required fields are marked *

ˆ Back To Top