Haskell Parsers Part 3.

This is the third part of a series on parsing in Haskell. The previous posts are available at Haskell Parsers Part 1 and Haskell Parsers Part 2. In part one we defined a parser as a type and derived a very simple parser, a parser of char. Part two extended the idea by giving the parser type Functor, Applicative and Monad instances. This lets us combine parsers monadically, using ‘do’ notation, or in an applicative style, using operators such as ‘<*>’. Which to use is often a stylistic choice.

In this post we’ll create a few extra functions that allow more varied combinations of parsers. We can then use them in a more practical setting, which I’ll describe in the final post. Here is the ‘finished’ module of parser functions.

Many of the details in the above Haskell file have been discussed but quite a few haven’t!
The ‘many’ and ‘many1’ parsers are particularly interesting. They’re like the ‘push-me-pull-you’ of parsers and are mutually recursive. ‘many’ tries to run a parser zero or more times and ‘many1’ tries to run the parser at least once.

If we call ‘many’, it first tries ‘many1’, and if that fails it puts [] in context and that is the result. When ‘many1’ is called it ‘loads up’ the list cons operator ‘:’ and applies the parser, so we have the parsed value held in a partially applied ‘:’ function. Then ‘many’ is called to supply the rest of the list, and they go back and forth until the parser fails inside ‘many1’. At that point ‘pure []’ is run and the list gets resolved, i.e. the result is a Parser [a].
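Since the module listing isn’t reproduced on this page, here is a minimal sketch of how the pair could look. The ‘Parser’ type, its instances, the choice operator ‘<|>’ and the ‘satisfy’ primitive are assumptions standing in for the definitions from parts one and two, not the module’s actual code.

```haskell
import Data.Char (isDigit)

-- Assumed stand-in for the parser type from parts one and two.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> fmap (\(a, rest) -> (f a, rest)) (p s)

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> do
    (f, rest)  <- pf s
    (a, rest') <- pa rest
    Just (f a, rest')

-- Hypothetical choice operator: try the left parser, fall back to the right.
infixl 3 <|>
(<|>) :: Parser a -> Parser a -> Parser a
Parser p <|> Parser q = Parser $ \s -> maybe (q s) Just (p s)

-- Hypothetical primitive: consume one character if it passes the test.
satisfy :: (Char -> Bool) -> Parser Char
satisfy ok = Parser $ \s -> case s of
  (c : cs) | ok c -> Just (c, cs)
  _               -> Nothing

-- Zero or more: try one-or-more, otherwise succeed with the empty list.
many :: Parser a -> Parser [a]
many p = many1 p <|> pure []

-- One or more: cons the first result onto whatever 'many' collects.
many1 :: Parser a -> Parser [a]
many1 p = (:) <$> p <*> many p

-- Small example parser built from the pair.
digits :: Parser String
digits = many (satisfy isDigit)
```

With these definitions, ‘runParser digits "123abc"’ gives ‘Just ("123","abc")’, while ‘runParser (many1 (satisfy isDigit)) "abc"’ gives ‘Nothing’. Note that ‘many’ itself never fails in this sketch: when the inner parser stops matching, it simply returns the empty list.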

Our first use of ‘many’/‘many1’ is parsing an identifier, where an identifier is, arbitrarily, defined as a lowercase letter followed by any number of letters or digits.

and then there is the corresponding applicative style:
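As a rough sketch of both styles side by side (the original listings aren’t shown here), assume the same stand-in ‘Parser’ type as before, repeated so the block compiles on its own. The names ‘identifier'’, ‘satisfy’ and ‘<|>’ are illustrative choices rather than the module’s own.

```haskell
import Data.Char (isAlphaNum, isLower)

-- Assumed stand-in for the parser type from parts one and two.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> fmap (\(a, rest) -> (f a, rest)) (p s)

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> do
    (f, rest)  <- pf s
    (a, rest') <- pa rest
    Just (f a, rest')

instance Monad Parser where
  Parser p >>= f = Parser $ \s -> do
    (a, rest) <- p s
    runParser (f a) rest

-- Hypothetical helpers, as in the 'many'/'many1' sketch.
infixl 3 <|>
(<|>) :: Parser a -> Parser a -> Parser a
Parser p <|> Parser q = Parser $ \s -> maybe (q s) Just (p s)

satisfy :: (Char -> Bool) -> Parser Char
satisfy ok = Parser $ \s -> case s of
  (c : cs) | ok c -> Just (c, cs)
  _               -> Nothing

many, many1 :: Parser a -> Parser [a]
many p  = many1 p <|> pure []
many1 p = (:) <$> p <*> many p

-- Monadic style: name each intermediate result, then combine them.
identifier :: Parser String
identifier = do
  c  <- satisfy isLower
  cs <- many (satisfy isAlphaNum)
  pure (c : cs)

-- Applicative style: lift (:) over the two parsers directly.
identifier' :: Parser String
identifier' = (:) <$> satisfy isLower <*> many (satisfy isAlphaNum)
```

Both versions behave the same: ‘runParser identifier "a1BC rest"’ gives ‘Just ("a1BC"," rest")’, and an input starting with an uppercase letter gives ‘Nothing’.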

Next we define a space parser that just consumes leading spaces from the input string. Here it is in monadic and applicative style.
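A minimal sketch of the two versions follows, again on the assumed stand-in ‘Parser’ type (repeated so the block stands alone). Treating every whitespace character as a space via ‘isSpace’ is my assumption. The module may well match only the literal space character.

```haskell
import Data.Char (isSpace)

-- Assumed stand-in for the parser type from parts one and two.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> fmap (\(a, rest) -> (f a, rest)) (p s)

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> do
    (f, rest)  <- pf s
    (a, rest') <- pa rest
    Just (f a, rest')

instance Monad Parser where
  Parser p >>= f = Parser $ \s -> do
    (a, rest) <- p s
    runParser (f a) rest

-- Hypothetical helpers, as in the 'many'/'many1' sketch.
infixl 3 <|>
(<|>) :: Parser a -> Parser a -> Parser a
Parser p <|> Parser q = Parser $ \s -> maybe (q s) Just (p s)

satisfy :: (Char -> Bool) -> Parser Char
satisfy ok = Parser $ \s -> case s of
  (c : cs) | ok c -> Just (c, cs)
  _               -> Nothing

many, many1 :: Parser a -> Parser [a]
many p  = many1 p <|> pure []
many1 p = (:) <$> p <*> many p

-- Monadic style: consume the whitespace and discard it.
space :: Parser ()
space = do
  _ <- many (satisfy isSpace)
  pure ()

-- Applicative style: replace the parsed list with ().
space' :: Parser ()
space' = () <$ many (satisfy isSpace)
```

Because it is built on ‘many’, this parser also succeeds when there is no whitespace at all: ‘runParser space "abc"’ gives ‘Just ((),"abc")’.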

In action on an input such as “ a1BC ”, the leading spaces are consumed and the trailing spaces are not: the parser stops, as it should, when it gets to a non-space character.
To handle leading and trailing spaces for any parser we define ‘dropSpaces’.
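One way this could be written is sketched below, using the same assumed stand-in ‘Parser’ type and hypothetical helpers as the earlier sketches. The applicative ‘identifier’ is included only so there is something to wrap.

```haskell
import Data.Char (isAlphaNum, isLower, isSpace)

-- Assumed stand-in for the parser type from parts one and two.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> fmap (\(a, rest) -> (f a, rest)) (p s)

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> do
    (f, rest)  <- pf s
    (a, rest') <- pa rest
    Just (f a, rest')

-- Hypothetical helpers, as in the 'many'/'many1' sketch.
infixl 3 <|>
(<|>) :: Parser a -> Parser a -> Parser a
Parser p <|> Parser q = Parser $ \s -> maybe (q s) Just (p s)

satisfy :: (Char -> Bool) -> Parser Char
satisfy ok = Parser $ \s -> case s of
  (c : cs) | ok c -> Just (c, cs)
  _               -> Nothing

many, many1 :: Parser a -> Parser [a]
many p  = many1 p <|> pure []
many1 p = (:) <$> p <*> many p

space :: Parser ()
space = () <$ many (satisfy isSpace)

identifier :: Parser String
identifier = (:) <$> satisfy isLower <*> many (satisfy isAlphaNum)

-- Skip spaces, run the wrapped parser, skip spaces again.
-- '*>' keeps the right-hand result, '<*' keeps the left-hand one,
-- so only the value produced by 'p' survives.
dropSpaces :: Parser a -> Parser a
dropSpaces p = space *> p <* space
```

With this sketch, ‘runParser (dropSpaces identifier) "  a1BC  "’ gives ‘Just ("a1BC","")’: both the leading and the trailing spaces are gone.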

As can be seen, ‘dropSpaces’ parses spaces, then runs the supplied parser, and then runs the space parser again. And in ghci…

Finally, ‘manyUntil’ is a slightly more involved parser.

It runs a parser ‘p’ until a ‘terminating’ parser ‘endp’ succeeds, i.e. ‘keep parsing for something until you’re able to parse something else’. Internally it uses the recursive ‘go’ function to alternate between trying the terminating parser ‘endp’ and building up the result by running the ‘p’ parser and calling ‘go’ again.
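The description above could be sketched roughly as follows, on the same assumed stand-in ‘Parser’ type. Whether the real ‘manyUntil’ consumes the terminator or leaves it in the input isn’t stated. This version consumes it, which is an assumption on my part, and ‘untilSemicolon’ is a made-up example.

```haskell
-- Assumed stand-in for the parser type from parts one and two.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> fmap (\(a, rest) -> (f a, rest)) (p s)

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> do
    (f, rest)  <- pf s
    (a, rest') <- pa rest
    Just (f a, rest')

-- Hypothetical choice operator: try the left parser, fall back to the right.
infixl 3 <|>
(<|>) :: Parser a -> Parser a -> Parser a
Parser p <|> Parser q = Parser $ \s -> maybe (q s) Just (p s)

-- Hypothetical primitive: consume one character if it passes the test.
satisfy :: (Char -> Bool) -> Parser Char
satisfy ok = Parser $ \s -> case s of
  (c : cs) | ok c -> Just (c, cs)
  _               -> Nothing

-- Run 'p' repeatedly until 'endp' succeeds, collecting the results of 'p'.
manyUntil :: Parser a -> Parser end -> Parser [a]
manyUntil p endp = go
  where
    -- First try the terminator (finishing with an empty list);
    -- otherwise parse one more item and recurse.
    go = ([] <$ endp) <|> ((:) <$> p <*> go)

-- Made-up example: everything up to (and consuming) the next semicolon.
untilSemicolon :: Parser String
untilSemicolon = manyUntil (satisfy (const True)) (satisfy (== ';'))
```

Here ‘runParser untilSemicolon "ab;cd"’ gives ‘Just ("ab","cd")’. If the terminator never appears, ‘p’ eventually fails on the empty input and the whole parser returns ‘Nothing’.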
Having ‘manyUntil’ now allows us to create parsers for block comments.

The comment is delimited by the start and end strings supplied to the parser.
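A block comment parser along these lines might look like the sketch below. The names ‘blockComment’ and ‘string’ are hypothetical, the ‘Parser’ scaffolding is the same assumed stand-in as before, and returning the comment body (rather than discarding it) is my choice for illustration.

```haskell
-- Assumed stand-in for the parser type from parts one and two.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> fmap (\(a, rest) -> (f a, rest)) (p s)

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> do
    (f, rest)  <- pf s
    (a, rest') <- pa rest
    Just (f a, rest')

-- Hypothetical helpers, as in the earlier sketches.
infixl 3 <|>
(<|>) :: Parser a -> Parser a -> Parser a
Parser p <|> Parser q = Parser $ \s -> maybe (q s) Just (p s)

satisfy :: (Char -> Bool) -> Parser Char
satisfy ok = Parser $ \s -> case s of
  (c : cs) | ok c -> Just (c, cs)
  _               -> Nothing

manyUntil :: Parser a -> Parser end -> Parser [a]
manyUntil p endp = go
  where
    go = ([] <$ endp) <|> ((:) <$> p <*> go)

-- Match an exact string, one character at a time.
string :: String -> Parser String
string = traverse (\c -> satisfy (== c))

-- Match the opening delimiter, then collect characters until the
-- closing delimiter succeeds; the result is the comment body.
blockComment :: String -> String -> Parser String
blockComment start end =
  string start *> manyUntil (satisfy (const True)) (string end)
```

For example, ‘runParser (blockComment "/*" "*/") "/* a * b */rest"’ gives ‘Just (" a * b ","rest")’. The lone ‘*’ in the body is not mistaken for the terminator, because a failed attempt at the end string backtracks and the character is consumed as ordinary content instead.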

Well, thanks for getting this far! In the next post we’ll look at using some of what’s been written and parsing strings into data and actually doing something with that data.
