Compilerware
Open Source Fast LALR(k) Parser Generator
 
AMDG
INTRODUCTION Downloads Feedback From Users Release Information PRODUCTS LRSTAR DFASTAR DFASTAR/c THEORY Definition Of Terms Theoretical Information COMPANY About Us Contact and Support  
Donate SourceForge Page

LRSTAR Parser Generator 3.1
  • Creates LR parsers of type
LALR(k)
  • Creates LR parser tables of type
Matrix
  • Creates parser source code in
C++
  • Compiler front-end speed (lines/second)
1,190,000
yes
yes
  • Shows traces for conflicts in your grammar
yes
  • Parsers have automatic error recovery
no
  • Parsers have a symbol-table builder
yes
  • Parsers do AST construction and traversal
yes
yes
yes
   
DFASTAR Lexer Generator 3.1
  • Creates DFA lexers of type
Matrix, Direct Code
  • Creates lexer source code in
C++
  • Maximum lexer speed (tokens/second)
34,620,000
  • Number of keywords lexer can handle
250,000
  • Build time for DB2 (550 keywords)
5 seconds
yes
  • Lexers do keyword recognition
yes
  • Unicode support
no
yes
yes

DOWNLOADS
LRSTAR & DFASTAR 3.1.006 All Documentation
LRSTAR & DFASTAR 3.1.006 Windows Binaries, Projects, Grammars
LRSTAR & DFASTAR 3.1.006 Windows Source Code

DOCUMENTATION VERSION UPDATED
LRSTAR User's Manual 3.1 Mar 7 2012
Skeleton Notation 3.1 Mar 7 2012
Definition Of Terms 3.1 Mar 7 2011
TBNF Paper 3.1 Mar 7 2011

GRAMMARS COMMENTS AUTHOR DATE
? ? 1988
Early MS Basic Paul Mann 1987
Incomplete Paul Mann ?
? ? ?
Just for fun Paul Mann ?
Complete? Ed Dobies 1988
Complete & tested Paul Mann 1995
Complete Paul Mann 1988
Complete? Ed Dobies 1988
Small subset Paul Mann 2005
Complete? Taylor Hutt 1990
Complete & tested Paul Mann 1998
Complete? Eric Beser 1989
Complete? Ed Dobies 1988
Complete? Bill Spees 1988
Complete? Paul Mann 1988
Complete? Paul Mann 1997
Complete? Paul Mann 1988
Complete? Paul Mann 1985

FOOTNOTES: 
 

LALR(k) Parsing 

An LALR(k) parser can handle complex computer languages, such as COBOL and other languages that require more than one symbol of lookahead to resolve ambiguity in the language definition.  The parser goes beyond LALR(1) parsing, if necessary, by using a nondeterministic parsing algorithm for states that require it.  It will look ahead 'k' symbols to resolve ambiguities.  'k' can be as large as necessary, however grammars should be defined to make 'k' as small as possible for better performance and error recovery.

LBNF Regular Expressions 

LBNF is a Lexical BNF notation for defining the symbols (or tokens) of a computer language.  LBNF allows regular expressions, but offers better readability.  When defining a token, you are not restricted to one line of notation.  You may use several lines.  In fact, you may use BNF grammar notation to describe each token of your language.  It also gives you the freedom to define your own escape symbols (\z) to match Perl regular expressions or some other notation with which you are familiar.  Here is an LBNF specification for the C language: C lexical grammar.

Matrix Tables 

Matrix tables are used to provide very fast lexers and parsers.  LRSTAR and DFASTAR use compressed-matrix tables for their finite-state automata, which offer small tables without losing much speed.  Matrix tables compile faster than direct-code recognizers (recursive descent parsers, direct-code lexers).  So, with matrix tables, you get the best solution.

Direct-Code Lexers 

A direct-code lexer generator creates a switch statement for each state in the finite-state machine.  The speed can be 20% faster than table-driven lexers, when line counting is necessary.  However, in a compiler front end, that is only 4% faster at best.  The disadvantage is that direct-code lexers have many more lines of source code and compile much slower, as the number of keywords increases.  The limit on the number of keywords is about 3,000 (a compiler limitation).

Skeleton File 

A skeleton file is source code with special code added which tells the parser or lexer generator how to format the parser or lexer tables.  This skeleton-file concept makes it possible to have the parser or lexer generator output code in many different languages.  This way you are not locked into one language or one style of coding.  You can customize the code to fit your preferred language and style of coding.

TBNF Grammar Notation 

TBNF is a grammar notation that can define the complete translation process from input source code to intermediate code.  It goes beyond BNF and EBNF.  It makes the task of building compilers and translators easier, quicker and more reliable.  For more information, see the TBNF Paper.

© Copyright Compilerware 2012.  All rights reserved.