
<li>date: Thu Mar 18 12:43:47 2004</li>
<li>from: jknapka at kneuro.net (Joe Knapka)</li>
<li>in-reply-to: <<a href="msg00657.html">[email protected]</a>></li>
<li>references: <<a href="msg00657.html">[email protected]</a>></li>
<li>subject: [ale] [OT] Writing a parser</li>

  The dog chased the cat.

A lexer produces a stream consisting of the syntactically-significant
tokens in the string:

&quot;The&quot;, &quot;dog&quot;, &quot;chased&quot;, &quot;the&quot;, &quot;cat&quot;
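
A minimal sketch of that lexing step in Python, assuming that only
runs of letters count as tokens and that punctuation can be thrown
away.  The name lex is made up for this illustration:

```python
import re

def lex(text):
    # Keep each run of letters as one token; whitespace and
    # punctuation are simply skipped (an assumption for this example).
    return re.findall(r"[A-Za-z]+", text)

tokens = lex("The dog chased the cat.")
print(tokens)
```

A real lexer usually also records what kind of token each piece is,
but for this sentence the bare words are enough.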

A parser takes the token stream and discovers its syntactic
structure:

               S
               |
      NP-------+------VP
      |               |
DET---+---N           |
 |        |      V----+--------NP
The      dog     |             |
               chased     DET--+---N
                           |       |
                          the     cat
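
One way to recover a tree like this is a small hand-written
recursive-descent parser.  The sketch below is only an illustration:
the grammar (S := NP VP, NP := DET N, VP := V NP), the LEXICON table
and all function names are assumptions chosen to fit this one
sentence, not a general English parser.

```python
# Hypothetical word classes, covering only the example sentence.
LEXICON = {
    "the": "DET",
    "dog": "N",
    "cat": "N",
    "chased": "V",
}

def expect(tokens, pos, category):
    # Consume one token of the given category, or fail.
    if pos == len(tokens):
        raise SyntaxError("ran out of tokens, expected " + category)
    word = tokens[pos]
    if LEXICON.get(word.lower()) != category:
        raise SyntaxError("expected " + category + ", got " + repr(word))
    return (category, word), pos + 1

def parse_np(tokens, pos):
    # NP := DET N
    det, pos = expect(tokens, pos, "DET")
    noun, pos = expect(tokens, pos, "N")
    return ("NP", det, noun), pos

def parse_vp(tokens, pos):
    # VP := V NP
    verb, pos = expect(tokens, pos, "V")
    obj, pos = parse_np(tokens, pos)
    return ("VP", verb, obj), pos

def parse_sentence(tokens):
    # S := NP VP, with no tokens left over afterwards.
    subject, pos = parse_np(tokens, 0)
    predicate, pos = parse_vp(tokens, pos)
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing token " + repr(tokens[pos]))
    return ("S", subject, predicate)

tree = parse_sentence(["The", "dog", "chased", "the", "cat"])
print(tree)
```

Each grammar rule becomes one function, and the nested tuples that
come back mirror the nesting of the tree in the diagram.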

Often something as simple as Python's str.split() method will work
fine as a lexer, and if the data you're looking at is formatted in a
tabular way, a lexer is all you need.  You only need a parser if the
data has nontrivial structural relationships between tokens.  In
general, parsing can be expensive (worst-case cubic time for arbitrary
context-free grammars, and worse for more expressive grammar classes),
but it's extremely unusual (when dealing with machine-generated data)
to encounter situations that involve the icky parts of that
complexity domain.
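
For the tabular case, something like the following is probably all
that is needed.  The row layout (a name, an integer count and a
decimal price separated by whitespace) and the name parse_row are
invented for this example, not taken from any real data source.

```python
def parse_row(line):
    # str.split() with no argument splits on any run of whitespace,
    # so the column widths do not matter.
    name, count, price = line.split()
    return {"name": name, "count": int(count), "price": float(price)}

# Hypothetical machine-generated line with three columns.
row = parse_row("widget   3   12.5")
print(row)
```

Here the position of each field tells you everything about its
meaning, so there is no structure left for a parser to discover.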

Cheers,

-- Joe

Kevin Krumwiede &lt;kjkrum at comcast.net&gt; writes:

&gt; I am working on a program to capture data from a MUD (actually, TW2002).
&gt;  I've looked at the source for a couple parsers, including one made
&gt; specifically for that game, but because they are generated code I'm
&gt; having a lot of difficulty understanding how they work.  
&gt; 
&gt; The gist of what I do understand is this (and somebody please tell me if
&gt; I'm wrong): parsers generally take a string of text and return a numeric
&gt; code signifying what pattern (if any) the text matches.  A program can
&gt; then use that code to decide how to process the text.  I assume that
&gt; sophisticated parsers take into account the preceding context of the
&gt; text when evaluating its pattern.
&gt; 
&gt; I am completely lost when it comes to the input languages of parser
&gt; generators.  Anyone know of a good tutorial?
&gt; 
&gt; Thanks,
&gt; Krum
&gt; _______________________________________________
&gt; Ale mailing list
&gt; Ale at ale.org
&gt; <a  rel="nofollow" href="http://www.ale.org/mailman/listinfo/ale">http://www.ale.org/mailman/listinfo/ale</a>
&gt; 
&gt; 

-- 
Resist the feed.
--
If you really want to get my attention, send mail to
jknapka .at. kneuro .dot. net.


</pre>
<!--X-Body-of-Message-End-->
<!--X-MsgBody-End-->
<!--X-Follow-Ups-->
<hr>
<!--X-Follow-Ups-End-->
<!--X-References-->
<ul><li><strong>References</strong>:
<ul>
<li><strong><a name="00657" href="msg00657.html">[ale] [OT] Writing a parser</a></strong>
<ul><li><em>From:</em> kjkrum at comcast.net (Kevin Krumwiede)</li></ul></li>
</ul></li></ul>
<!--X-References-End-->
<!--X-BotPNI-->
<ul>
<li>Prev by Date:
<strong><a href="msg00676.html">[ale] Selecting stuff in remote X sessions</a></strong>
</li>
<li>Next by Date:
<strong><a href="msg00678.html">[ale] slow memory test on boot</a></strong>
</li>
<li>Previous by thread:
<strong><a href="msg00667.html">[ale] [OT] Writing a parser</a></strong>
</li>
<li>Next by thread:
<strong><a href="msg00787.html">[ale] [OT] Writing a parser</a></strong>
</li>
<li>Index(es):
<ul>
<li><a href="maillist.html#00677"><strong>Date</strong></a></li>
<li><a href="threads.html#00677"><strong>Thread</strong></a></li>
</ul>
</li>
</ul>

<!--X-BotPNI-End-->
<!--X-User-Footer-->
<!--X-User-Footer-End-->
</body>
</html>
