
[no subject]



So if your language has "(" and ")", but "{" and "}" must be preceded by
a "\", the lexer would throw an error on a naked "}" but would pass "\}"
on to the parser. The parser, not the lexer, would then have to check that
each "(" has a corresponding ")". Assuming, of course, that your language
requires the "()"s to balance.
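
To make that split concrete, here is a rough sketch in Python rather than
flex/bison. It assumes a toy language with only the rules described above;
the names tokenize, check_balanced, LexError and ParseError are made up for
illustration and are not from any real tool.

```python
class LexError(Exception):
    """Raised by the lexer on input that is not a valid token."""


class ParseError(Exception):
    """Raised by the parser when the token stream is not well formed."""


def tokenize(text):
    """Cut raw text into (kind, value) tokens.

    Assumed toy rules: parentheses are tokens on their own, braces are
    only legal when backslash-escaped, and any other run of characters
    is a WORD. Whitespace separates tokens and is dropped.
    """
    tokens = []
    i = 0
    while i != len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch == "(":
            tokens.append(("LPAREN", ch))
            i += 1
        elif ch == ")":
            tokens.append(("RPAREN", ch))
            i += 1
        elif ch == "\\":
            # A backslash is only valid directly in front of a brace.
            nxt = text[i + 1:i + 2]
            if nxt not in ("{", "}"):
                raise LexError("bad escape at offset %d" % i)
            tokens.append(("BRACE", "\\" + nxt))
            i += 2
        elif ch in "{}":
            # A naked brace is a lexical error; the parser never sees it.
            raise LexError("naked brace at offset %d" % i)
        else:
            start = i
            while (i != len(text) and not text[i].isspace()
                   and text[i] not in "(){}\\"):
                i += 1
            tokens.append(("WORD", text[start:i]))
    return tokens


def check_balanced(tokens):
    """Parser-side check: every LPAREN needs a matching RPAREN."""
    depth = 0
    for kind, _value in tokens:
        if kind == "LPAREN":
            depth += 1
        elif kind == "RPAREN":
            if depth == 0:
                raise ParseError("unmatched )")
            depth -= 1
    if depth != 0:
        raise ParseError("%d unclosed (" % depth)
```

With this, tokenize("a } b") fails in the lexer, while tokenize("((a)")
lexes fine and only check_balanced complains about it.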

I better stop now.

--  CHS

On Thu, 2004-03-18 at 09:17, Charles Shapiro wrote:
> Oooh, me, me!!
> 
> Parsers generally take collections of elements of a language ("lexical
> tokens") and then translate them in some way.  I think what you might be
> trying to build is a lexer, which takes raw text (or other raw data) and
> cuts it into tokens which you can feed to a parser. Output from a lexer
> is supposed to be already correct, so you can concentrate on parsing it
> later without worrying about lexical errors such as unbalanced
> parentheses (if your language uses parentheses) or invalid tokens.
> 
> If the source you're looking at is generated, it's not the source.
> Parsers and lexers generally involve a lot of coding drudgery, so the
> smart way to build them is with code generators working from a language
> specification. That specification is the real source. The canonical
> lexer/parser generator tool set is lex(1) and yacc(1), with flex(1)
> and bison(1) as their counterparts in the Linux development tool set.  I learned to use them
> first by taking the standard demo program for these tools, a calculator
> program, and enhancing it. You can even find a couple of calculator 
> examples in the GNU Bison documentation
&gt; (<a  rel="nofollow" href="http://www.delorie.com/gnu/docs/bison/bison.html#SEC_Top">http://www.delorie.com/gnu/docs/bison/bison.html#SEC_Top</a>).
&gt; After I got that to work and understood  the concepts behind it, I was
&gt; able to use flex/bison to write  my own program, available
&gt; at <a  rel="nofollow" href="http://tomshiro.org/coldread">http://tomshiro.org/coldread</a>.
&gt; 
&gt; One crucial piece of advice which I haven't seen covered elsewhere:
&gt; 
&gt; First, build a really stupid parser which simply recognizes the lexical
&gt; tokens as they go by and prints them out (this is easy in yacc). Then,
&gt; build your lexer and make sure it works the way you want. Write some
&gt; unit tests here so you can change your lexer and be sure it still
&gt; works. Finally, go back and replace the stubbed back end with one which
&gt; does what you want it to.
&gt; 
&gt; This stuff is really cool and what seduced me into the sad pathetic life
&gt; I now live. Aho, Sethi and Ullman's _Compilers: Principles, Techniques,
&gt; and Tools_ (Addison-Wesley, 1986), aka the Dragon Book, is worth the
&gt; fabulous price you'll pay for it if you think this is interesting.
&gt; 
&gt; -- CHS
&gt; 
&gt; On Thu, 2004-03-18 at 04:18, Kevin Krumwiede wrote:
&gt; &gt; I am working on a program to capture data from a MUD (actually, TW2002).
&gt; &gt;  I've looked at the source for a couple parsers, including one made
&gt; &gt; specifically for that game, but because they are generated code I'm
&gt; &gt; having a lot of difficulty understanding how they work.  
&gt; &gt; 
&gt; &gt; The gist of what I do understand is this (and somebody please tell me if
&gt; &gt; I'm wrong): parsers generally take a string of text and return a numeric
&gt; &gt; code signifying what pattern (if any) the text matches.  A program can
&gt; &gt; then use that code to decide how to process the text.  I assume that
&gt; &gt; sophisticated parsers take into account the preceding context of the
&gt; &gt; text when evaluating its pattern.
&gt; &gt; 
&gt; &gt; I am completely lost when it comes to the input languages of parser
&gt; &gt; generators.  Anyone know of a good tutorial?
&gt; &gt; 
&gt; &gt; Thanks,
&gt; &gt; Krum
&gt; &gt; _______________________________________________
&gt; &gt; Ale mailing list
&gt; &gt; Ale at ale.org
&gt; &gt; <a  rel="nofollow" href="http://www.ale.org/mailman/listinfo/ale">http://www.ale.org/mailman/listinfo/ale</a>


</pre>
<!--X-Body-of-Message-End-->
<!--X-MsgBody-End-->
<!--X-Follow-Ups-->
<hr>
<!--X-Follow-Ups-End-->
<!--X-References-->
<ul><li><strong>References</strong>:
<ul>
<li><strong><a name="00657" href="msg00657.html">[ale] [OT] Writing a parser</a></strong>
<ul><li><em>From:</em> kjkrum at comcast.net (Kevin Krumwiede)</li></ul></li>
<li><strong><a name="00662" href="msg00662.html">[ale] [OT] Writing a parser</a></strong>
<ul><li><em>From:</em> cshapiro at nubridges.com (Charles Shapiro)</li></ul></li>
</ul></li></ul>
<!--X-References-End-->
<!--X-BotPNI-->

<!--X-BotPNI-End-->
<!--X-User-Footer-->
<!--X-User-Footer-End-->
</body>
</html>