Parsing


Last time, I showed that we can build a small parsec clone with packrat support.

This time I intend to implement packrat directly on top of Parsec 3.

One of the main topics of discussion when it comes to packrat parsing since Bryan Ford's initial release of Pappy has been the fact that in general you shouldn't use packrat to memoize every rule, and that instead you should apply Amdahl's law to look for the cases where the lookup time is paid back in terms of repetitive evaluation, computation time and the hit rate. This is great news for us, since, we only want to memoize a handful of expensive combinators.

(more...)

You never heard of the Millenium Falcon? It's the ship that made the Kessel Run in 12 parsecs.

I've been working on a parser combinator library called trifecta, and so I decided I'd share some thoughts on parsing.

Packrat parsing (as provided by frisby, pappy, rats! and the Scala parsing combinators) and more traditional recursive descent parsers (like Parsec) are often held up as somehow different.

Today I'll show that you can add monadic parsing to a packrat parser, sacrificing asymptotic guarantees in exchange for the convenient context sensitivity, and conversely how you can easily add packrat parsing to a traditional monadic parser combinator library.

(more...)

I'll be giving a talk tomorrow, Wednesday, September 16th, 2009 at the Boston Haskell User Group in the MIT CSAIL Reading Room (on the 8th floor of the William H. Gates tower of the Stata center) about mixing Oleg's iteratees with parsec and monoids to build practical parallel parsers and to cheaply reparse after local modifications are made to source code.

Ravi is trying to organize some time before hand during which people can get together and work on Haskell projects, or spend some time learning Haskell, so its not all scary academic stuff.

The meeting is scheduled from 7-9pm, and an ever growing number of us have been wandering down to the Cambridge Brewing Company afterwards to hang out and talk.

If you are curious about Haskell, or even an expert, or just happen to be interested in parallel programming and find yourself in the area, come on by.

I was asked to give two talks at the Boston Area Haskell User Group for this past Tuesday. The first was pitched at a more introductory level and the second was to go deeper into what I have been using monoids for lately.

The first talk covers an introduction to the mathematical notion of a monoid, introduces some of the features of my Haskell monoids library on hackage, and starts to motivate the use of monoidal parallel/incremental parsing, and the modification use of compression algorithms to recycle monoidal results.

The second talk covers a way to generate a locally-context sensitive parallel/incremental parser by modifying Iteratees to enable them to drive a Parsec 3 lexer, and then wrapping that in a monoid based on error productions in the grammar before recycling these techniques at a higher level to deal with parsing seemingly stateful structures, such as Haskell layout.

  1. Introduction To Monoids (PDF)
  2. Iteratees, Parsec and Monoids: A Parsing Trifecta (PDF)

Due to a late start, I was unable to give the second talk. However, I did give a quick run through to a few die-hards who stayed late and came to the Cambridge Brewing Company afterwards. As I promised some people that I would post the slides after the talk, here they are.

The current plan is to possibly give the second talk in full at either the September or October Boston Haskell User Group sessions, depending on scheduling and availability.

[ Iteratee.hs ]

Some people have requested my slides from the short talk I gave about monoids and monoidal parsing at Hac Phi. So, here they are.

There will be more to come at the next Boston Haskell User Group in August, where it looks like I'll be giving two short talks covering monoids. I may use the monoidal parsing engine from Kata as an example for the advanced talk if I have time and will start to cover parsing larger classes of grammars in general (regular languages, CFGs/TIGs, TAGs, PEGs, LALR, attribute-grammars, etc.)