LR Parsing Definition/Meaning:
A bottom-up parsing technique, LR standing for Left- to-right
Rightmost derivation sequence. Originally developed by D. E. Knuth, it is the
most powerful left-to-right, no backtracking parsing method for context-free
grammars.
An LR parser consists of a pushdown stack, a parsing table, and a driving
routine. The driving routine is the same for all grammars. The stack is
manipulated by the driving routine using the information contained in the top
stack element and the next k symbols in the input stream (called the k lookahead);
k is an integer ≥0, but for most practical purposes k = 1. The stack consists
of a string
s0X0s1X1...SnXnSn+1
where each Xi, is a symbol of the input grammar and each si is called a state.
The parsing table is indexed by pairs (s,a) where s is a state and a is the
lookahead. Each entry in the table has two parts: (a) an action, which may be
shift, reduce p (for some production p), accept, or error, and (b) a state,
called the goto state. When the action is shift, the next input symbol and goto
state are pushed onto the stack (in that order). When the action is reduce p the
top 2l elements of the stack will spell the right-hand side of p but with goto
states interspersed, where l is the length of this right-hand side. These 2l
elements are popped from the stack and replaced by the left-hand side of p and
the new goto state. This operation corresponds to adding a new node to the
parse tree for the input string. The accept action is only encountered when the
start symbol S is the only symbol on the stack (i.e. the stack contains s0Ss1
for some states s0 and s1) and the lookahead is the end-of-input
symbol. It signifies that parsing has been successfully completed. On the other
hand an error entry in the parse table indicates an error in the input string.
A
grammar that can be parsed by an LR parser using k-symbol lookahead is called
an LR(k) grammar. The power of the LR parsing method conies from the fact that
the LR(1) grammars properly include other grammar types like precedence grammars
and LL(1) grammars. This and its efficiency make it a popular
choice of parsing method in compiler-compilers. If a grammar is not LR(1) this
will be evidenced as multiply defined entries in the parsing tables called
shift-reduce conflicts or reduce-reduce conflicts.
Many different parsing tables may be constructed for one grammar, each differing
in the number of states it defines. The so-called canonical LR table tends to be
too long for practical purposes and it is commonly replaced by an SLR (simple LR)
or LALR (lookahead LR) table. A grammar that is LR(1) may not however be SLR(1)
or LALR(l).
|