Runtime, annotation-based lexer generator
Just a quick post to publish this class. It lets you easily create lexers (also known as lexical scanners or tokenizers) by either extending the class or pointing it at another class or object.
It takes advantage of PHP’s Reflection API to inspect the doc blocks of your class, looking for special annotations that describe patterns (regular expressions) and rules.
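To give an idea of how doc blocks can be read at runtime, here is a minimal sketch using ReflectionClass and getDocComment(). The DemoLexer class, the collectPatterns() helper and the "@pattern" tag are all assumptions made up for illustration; the published class may use different annotation names and a different internal layout.

```php
<?php
// Hypothetical example class: "@pattern" is an assumed annotation name,
// not necessarily the one the published class looks for.
class DemoLexer
{
    /** @pattern /\d+/ */
    public function number() {}

    /** @pattern /[a-z]+/ */
    public function word() {}

    /** No annotation here, so this method is skipped. */
    public function helper() {}
}

/**
 * Collect "method name => regex" pairs from the doc blocks of a class or object.
 */
function collectPatterns($classOrObject): array
{
    $patterns = [];
    $class = new ReflectionClass($classOrObject);
    foreach ($class->getMethods() as $method) {
        $doc = $method->getDocComment(); // false when there is no doc block
        if ($doc === false) {
            continue;
        }
        // Capture the rest of the line after "@pattern", minus a closing "*/".
        if (preg_match('/@pattern\s+(.+?)\s*(?:\*\/)?\s*$/m', $doc, $m)) {
            $patterns[$method->getName()] = $m[1];
        }
    }
    return $patterns;
}

$patterns = collectPatterns('DemoLexer');
```

Since ReflectionClass accepts either a class name or an instance, the same helper works whether the lexer is extended or pointed at an existing object.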
Some features:
- “First to match” and “longest match” lexing modes.
- State-based lexer with support for a state stack
- Rules can consume the match, ignore it, repeat it in a different state, or repeat it while skipping the rule that just matched
- Implements the Iterator interface, so it can be used in foreach() loops, for example
- Reports lexing failures, indicating the position in the text plus the line and column
- Caches the internal lexer information so the annotations are only parsed once
- Allows saving and restoring the internal lexer information so it can be persisted and cached between requests
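
The difference between the two lexing modes, and the way a failure position can be turned into a line and column, can be sketched as follows. This is an illustration only, not the class's actual code: the tokenize() function, its parameters and the rule names are hypothetical, and states, ignore rules and caching are left out.

```php
<?php
/**
 * Tokenize $text with an ordered list of rules (name => regex).
 * $longest = false: the first rule that matches wins.
 * $longest = true: the longest match wins (ties go to the earlier rule).
 */
function tokenize(string $text, array $rules, bool $longest = false): array
{
    $tokens = [];
    $offset = 0;
    $length = strlen($text);
    while ($offset < $length) {
        $best = null;
        foreach ($rules as $name => $regex) {
            // The "A" modifier anchors the match at the current offset.
            if (preg_match($regex . 'A', $text, $m, 0, $offset) && $m[0] !== '') {
                if ($best === null || strlen($m[0]) > strlen($best[1])) {
                    $best = [$name, $m[0]];
                }
                if (!$longest) {
                    break; // first-to-match mode stops at the first hit
                }
            }
        }
        if ($best === null) {
            // Derive a 1-based line and column from the byte offset.
            $before = substr($text, 0, $offset);
            $line = substr_count($before, "\n") + 1;
            $column = $offset - (int) strrpos("\n" . $before, "\n") + 1;
            throw new RuntimeException(
                "Unexpected input at offset $offset (line $line, column $column)"
            );
        }
        $tokens[] = $best;
        $offset += strlen($best[1]);
    }
    return $tokens;
}

// Hypothetical rules: "assign" is listed before "equals" on purpose.
$rules = ['ident' => '/[a-z]+/', 'assign' => '/=/', 'equals' => '/==/'];

$first   = tokenize('a==b', $rules);       // "==" becomes two "assign" tokens
$longest = tokenize('a==b', $rules, true); // "==" becomes one "equals" token
```

In first-to-match mode the order of the rules decides the outcome, so more specific patterns need to come first; longest-match mode makes the order matter only for ties.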