[ale] [OT] Writing a parser
- <!--x-content-type: text/plain -->
- <!--x-date: Thu Mar 18 12:43:47 2004 -->
- <!--x-from-r13: wxancxn ng xarheb.arg (Xbr Yancxn) -->
- <!--x-message-id: [email protected] -->
- <!--x-reference: [email protected] -->
- <!--x-subject: [ale] [OT] Writing a parser -->
- <li><em>date</em>: Thu Mar 18 12:43:47 2004</li>
- <li><em>from</em>: jknapka at kneuro.net (Joe Knapka)</li>
- <li><em>in-reply-to</em>: &lt;<a href="msg00657.html">[email protected]</a>&gt;</li>
- <li><em>references</em>: &lt;<a href="msg00657.html">[email protected]</a>&gt;</li>
- <li><em>subject</em>: [ale] [OT] Writing a parser</li>
The dog chased the cat.
A lexer produces a stream consisting of the syntactically-significant
tokens in the string:
"The", "dog", "chased", "the", "cat"
A parser takes the token stream and discovers its syntactic
structure:
              S
              |
     NP-------+------VP
     |               |
DET--+---N           |
 |       |     V-----+------NP
The     dog    |            |
             chased    DET--+---N
                        |       |
                       the     cat
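
One way to recover that structure from the token stream is a small
hand-written recursive-descent parser. The sketch below is only an
illustration: the grammar (S = NP VP, NP = DET N, VP = V NP) is read
off the tree above, and the LEXICON table and function names are
hypothetical, chosen just for this sentence.

```python
# Hypothetical toy lexicon: maps a lowercased word to its part of speech.
LEXICON = {"the": "DET", "dog": "N", "cat": "N", "chased": "V"}

def parse_sentence(tokens):
    """Parse tokens with the grammar S = NP VP, NP = DET N, VP = V NP."""
    pos = 0  # index of the next unconsumed token

    def word(category):
        # Consume one token, checking that it belongs to the given category.
        nonlocal pos
        if pos == len(tokens) or LEXICON.get(tokens[pos].lower()) != category:
            raise ValueError("expected %s at token %d" % (category, pos))
        leaf = (category, tokens[pos])
        pos += 1
        return leaf

    def noun_phrase():
        return ("NP", word("DET"), word("N"))

    def verb_phrase():
        return ("VP", word("V"), noun_phrase())

    tree = ("S", noun_phrase(), verb_phrase())
    if pos != len(tokens):
        raise ValueError("unexpected trailing token %d" % pos)
    return tree

print(parse_sentence(["The", "dog", "chased", "the", "cat"]))
```

The result is a nested tuple mirroring the diagram: each node is
(label, children...) and each leaf is (category, word). Anything the
grammar does not cover raises ValueError.
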
Often something as simple as Python's str.split() method will work
fine as a lexer, and if the data you're looking at is formatted in a
tabular way, a lexer is all you need. You only need a parser if the
data has nontrivial structural relationships between tokens. In
general, parsing can be expensive: general-purpose algorithms for
arbitrary context-free grammars take roughly cubic time in the worst
case, and more expressive grammar classes are harder still. But it's
extremely unusual (when dealing with machine-generated data) to
encounter situations that involve the icky parts of that complexity
domain.
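
For the tabular case, a sketch of the "lexer is all you need" approach
might look like the following. The sample rows and the three-column
layout (name, quantity, price) are made up for illustration; real
captured data will have its own columns.

```python
# Hypothetical whitespace-delimited records: name, quantity, price.
ROWS = """\
widget     120   45
gadget      80   30
sprocket    15   92
"""

def read_table(text):
    """Turn each non-blank line into a (name, quantity, price) tuple."""
    records = []
    for line in text.splitlines():
        fields = line.split()  # the whole "lexer"
        if not fields:
            continue  # skip blank lines
        name, quantity, price = fields
        records.append((name, int(quantity), int(price)))
    return records

for record in read_table(ROWS):
    print(record)
```

Because every line has the same flat shape, there is no structure left
to discover once the fields are split, so no grammar is involved.
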
Cheers,
-- Joe
Kevin Krumwiede <kjkrum at comcast.net> writes:
> I am working on a program to capture data from a MUD (actually, TW2002).
> I've looked at the source for a couple parsers, including one made
> specifically for that game, but because they are generated code I'm
> having a lot of difficulty understanding how they work.
>
> The gist of what I do understand is this (and somebody please tell me if
> I'm wrong): parsers generally take a string of text and return a numeric
> code signifying what pattern (if any) the text matches. A program can
> then use that code to decide how to process the text. I assume that
> sophisticated parsers take into account the preceding context of the
> text when evaluating its pattern.
>
> I am completely lost when it comes to the input languages of parser
> generators. Anyone know of a good tutorial?
>
> Thanks,
> Krum
--
Resist the feed.
--
If you really want to get my attention, send mail to
jknapka .at. kneuro .dot. net.
</pre>
<!--X-Body-of-Message-End-->
<!--X-MsgBody-End-->
<!--X-Follow-Ups-->
<hr>
<!--X-Follow-Ups-End-->
<!--X-References-->
<ul><li><strong>References</strong>:
<ul>
<li><strong><a name="00657" href="msg00657.html">[ale] [OT] Writing a parser</a></strong>
<ul><li><em>From:</em> kjkrum at comcast.net (Kevin Krumwiede)</li></ul></li>
</ul></li></ul>
<!--X-References-End-->
<!--X-BotPNI-->
<!--X-BotPNI-End-->
<!--X-User-Footer-->
<!--X-User-Footer-End-->
</body>
</html>