[ale] [OT] Writing a parser
- <li><em>date</em>: Sun Mar 21 17:02:42 2004</li>
- <li><em>from</em>: danscox at mindspring.com (Danny Cox)</li>
- <li><em>in-reply-to</em>: <<a href="msg00657.html">[email protected]</a>></li>
- <li><em>references</em>: <<a href="msg00657.html">[email protected]</a>></li>
- <li><em>subject</em>: [ale] [OT] Writing a parser</li>
On Thu, 2004-03-18 at 04:18, Kevin Krumwiede wrote:
> The gist of what I do understand is this (and somebody please tell me if
> I'm wrong): parsers generally take a string of text and return a numeric
> code signifying what pattern (if any) the text matches. A program can
> then use that code to decide how to process the text. I assume that
> sophisticated parsers take into account the preceding context of the
> text when evaluating its pattern.
>
> I am completely lost when it comes to the input languages of parser
> generators. Anyone know of a good tutorial?
At the risk of sounding like a broken record ("record"? what's that?),
I'll once again recommend Kernighan & Pike's book, "The UNIX
Programming Environment". Their big project, a "high order calculator"
(hoc), takes you from a rudimentary calculator up through one which
"compiles" the code for a "machine" and then executes it. Along the way,
you learn about yacc (the predecessor of bison) and take a brief foray
into lex, although at the time you could still write a faster lexical
analyzer yourself. I wouldn't try that now against flex, though. Still,
the book covers most aspects of working with grammars, parsers, and
lexers.
I'll also say this: when speaking of this class of parsers, one often
reads of a "stack". When running and given a new token, the engine will
either "shift" (push the token onto the stack), or "reduce" by some rule
(pop the N symbols that match the rule and push the symbol the rule
produces). There are really two stacks: one for the symbols (tokens),
and one for the values of those tokens. This was a major breakthrough
in my head, and I've never actually seen it stated explicitly anywhere
else. As an example, a token might be "INTEGER", but the value is "0".
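
To make the two-stack idea concrete, here is a minimal sketch in Python
(not yacc). Everything in it is an assumption made up for illustration:
the three-rule grammar, the names RULES, tokenize, and parse, and the
greedy "reduce whenever the top of the stack matches a rule" strategy.
A yacc-generated parser instead decides from a table and a lookahead
token, and as far as I know it stacks parser states rather than bare
symbols, so treat this as a toy that shows only the parallel symbol and
value stacks.

```python
import re

# Each rule: (symbol produced, symbols to match on top of the stack,
# action that computes the new value from the popped values).
RULES = [
    ("expr", ["INTEGER"], lambda v: v[0]),
    ("expr", ["expr", "+", "expr"], lambda v: v[0] + v[2]),
    ("expr", ["expr", "-", "expr"], lambda v: v[0] - v[2]),
]

def tokenize(text):
    """Yield (symbol, value) pairs; the text "0" becomes ("INTEGER", 0)."""
    # Toy lexer: anything that is not a digit run, '+' or '-' is skipped.
    for lexeme in re.findall(r"\d+|[+-]", text):
        if lexeme.isdigit():
            yield "INTEGER", int(lexeme)
        else:
            yield lexeme, None  # operators carry no value

def parse(text):
    symbols = []  # stack of token / nonterminal names
    values = []   # parallel stack of their values
    for symbol, value in tokenize(text):
        # Shift: push the new token onto both stacks.
        symbols.append(symbol)
        values.append(value)
        # Reduce: while some rule matches the top of the symbol stack,
        # pop its N entries from both stacks and push the result.
        reduced = True
        while reduced:
            reduced = False
            for lhs, rhs, action in RULES:
                n = len(rhs)
                if symbols[-n:] == rhs:
                    result = action(values[-n:])
                    del symbols[-n:]
                    del values[-n:]
                    symbols.append(lhs)
                    values.append(result)
                    reduced = True
                    break
    if symbols != ["expr"]:
        raise SyntaxError("could not reduce input to one expr: %r" % symbols)
    return values[0]

if __name__ == "__main__":
    print(parse("1 + 2 - 3"))
```

For "1 + 2 - 3" the symbol stack goes INTEGER, then expr, then
expr + INTEGER, and so on, while the value stack carries 1, then 1 and 2,
then their sum. Reducing greedily happens to give left-to-right
evaluation for this grammar; it would not handle operator precedence.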
Good luck!
--
kernel, n.: A part of an operating system that preserves the
medieval traditions of sorcery and black art.
Danny
</pre>
<!--X-Body-of-Message-End-->
<!--X-MsgBody-End-->
<!--X-Follow-Ups-->
<hr>
<!--X-Follow-Ups-End-->
<!--X-References-->
<ul><li><strong>References</strong>:
<ul>
<li><strong><a name="00657" href="msg00657.html">[ale] [OT] Writing a parser</a></strong>
<ul><li><em>From:</em> kjkrum at comcast.net (Kevin Krumwiede)</li></ul></li>
</ul></li></ul>
<!--X-References-End-->
</body>
</html>