Return-Path: megacz@cs.berkeley.edu
Received: from 216.237.119.186 (GODEL.MEGACZ.COM) by null (org.ibex.mail.protocol.SMTP) with ESMTP for ; Wed, 19 Jul 2006 00:18:58 -0700
Received: from 127.0.0.1 (GODEL.MEGACZ.COM) by null (org.ibex.mail.protocol.SMTP) with SMTP for ; Wed, 19 Jul 2006 00:17:34 -0700
Received: by godel.megacz.com (sSMTP sendmail emulation); Wed, 19 Jul 2006 00:17:33 -0700
To: Simon Hay
Cc: sbp-interest@research.cs.berkeley.edu
Subject: [sbp-interest] Re: SBP
References: <4DCD5332-BBCF-43F7-AA83-7D0F9FF325D7@lincoln.ox.ac.uk> <8B3BA505-38A3-49A1-A281-02B8C850F374@lincoln.ox.ac.uk>
From: Adam Megacz
Organization: UC Berkeley
X-Home-Page: http://www.megacz.com/
Date: Wed, 19 Jul 2006 00:17:33 -0700
In-Reply-To: (Simon Hay's message of "Mon, 17 Jul 2006 10:52:46 +0100")
Message-ID:
User-Agent: Gnus/5.1006 (Gnus v5.10.6) Emacs/21.4 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Envelope-To: sbp-interest@research.cs.berkeley.edu
List-Id: The Scannerless Boolean Parser

Hi, Simon!

BTW, I've set up a mailing list for SBP, and I'm cc'ing this message
to it:

  http://research.cs.berkeley.edu/project/sbp/list/

This reply might be a bit brief; I'll write more in the morning.

> Tree res = new CharParser(MetaGrammar.make()).parse(new
> FileInputStream(s[0])).expand1(); // meta.g
> Union meta = MetaGrammar.make(res, "s");
> SequenceInputStream sis = new SequenceInputStream(new FileInputStream
> (s[0]), new FileInputStream(s[1])); // grammar
> res = new CharParser(meta).parse(sis).expand1();
> Union mygrammar = MetaGrammar.make(res, "ts"); //, new TestCaseMaker());
> CharParser parser = new CharParser(mygrammar);
> Forest r2 = parser.parse(new FileInputStream(s[2])); // input
> Tree t = r2.expand1();
> System.out.println(t);
>
> Firstly, out of curiosity, what's the
> advantage of parsing meta.g and using that instead of just using the
> built-in meta grammar used to parse meta.g in the first place?

No advantage whatsoever.
RegressionTest only does that as an extra "sanity check" to make sure
I haven't broken anything.

> Also, previously I could call MetaGrammar.make(res, "ts") or similar
> without any mention of GrammarBindingResolvers; now calling that
> assumes I want an AnnotationGrammarBindingResolvers, which promptly
> gets unhappy when you try to use it (e.g. with the grammar
>
> ts = Expr
> Expr = [0-9]++
>      | Plus:: (left::Expra) "+" (right::Expr)
> Expra = Foo:: ("a" | "b")
>
>
> copied-and-pasted from regression.tc and input a+2 you get
>
> Exception in thread "main" java.lang.RuntimeException: could not find
> a Java method/class/ctor matching tag "Foo", nonterminal "Expra" with
> 1 arguments
> at
> edu.berkeley.sbp.meta.AnnotationGrammarBindingResolver.resolveTag
> (AnnotationGrammarBindingResolver.java:60)

The short version: try removing "left::", "right::", and "Foo::" from
your grammar; they're not really needed.

Ah yes. The idea with AnnotationGrammarBindingResolver (yes, I know,
that name is way too long) is this:

  Any sequence of grammar elements in which *two or more elements*
  are *not* dropped must "resolve".

That is, you have to explain to SBP how to make a tree out of the two
nodes. For example,

  MultiplyExpr = Expr "+" Expr

SBP needs to know how to turn those two "sub-exprs" into a tree. So it
"resolves" the sequence by looking at two things: the name of the
nonterminal ("MultiplyExpr") and the tag (in this case there is no
tag). You should avoid using tags unless you're forced to; for
example:

  SomeExpr = OtherExpr (Foo:: SecondExpr ThirdExpr)

The "Foo" is a tag (designated by the double-colon after it); it tells
SBP how to resolve the subsequence with two elements (SecondExpr and
ThirdExpr). The "outer" sequence (OtherExpr (...)) is resolved using
the name of the nonterminal (SomeExpr).

Okay, now back to the annotations.
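
(An aside on the resolution rule just described: it can be sketched in
a few lines of plain Java. This is only an illustrative model, not SBP
code. The class ResolutionSketch, its Element type, and both method
names are made up for this example, and it assumes that the "+"
literal is dropped and that a tag, when present, takes the place of
the nonterminal name.)

```java
import java.util.Arrays;
import java.util.List;

// Illustrative model only; none of these types are SBP classes.
public class ResolutionSketch {

    // One element of a sequence. A dropped element contributes no node to the tree.
    public static final class Element {
        final String name;
        final boolean dropped;
        public Element(String name, boolean dropped) {
            this.name = name;
            this.dropped = dropped;
        }
    }

    // A sequence must "resolve" when two or more of its elements are not dropped.
    public static boolean needsResolving(List<Element> sequence) {
        int kept = 0;
        for (Element e : sequence) {
            if (!e.dropped) kept++;
        }
        return kept >= 2;
    }

    // Assumption: the tag is used when there is one, else the nonterminal name.
    public static String resolverName(String tag, String nonterminal) {
        return tag != null ? tag : nonterminal;
    }

    public static void main(String[] args) {
        // MultiplyExpr = Expr "+" Expr   (the "+" literal is assumed to be dropped)
        List<Element> outer = Arrays.asList(
            new Element("Expr", false),
            new Element("\"+\"", true),
            new Element("Expr", false));
        System.out.println(needsResolving(outer));
        System.out.println(resolverName(null, "MultiplyExpr"));

        // SomeExpr = OtherExpr (Foo:: SecondExpr ThirdExpr): the inner pair uses its tag.
        System.out.println(resolverName("Foo", "SomeExpr"));
    }
}
```

In this model a sequence with only one kept element, such as
Expra = ("a" | "b"), never needs a tag at all, which is why dropping
"Foo::" from the grammar is harmless.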
When you call MetaGrammar.make(), you should always pass in a
GrammarBindingResolver as the third argument unless what you're
parsing is a grammar file. (I just pushed a new set of changes with
some much more sensible names for some of this stuff.)

There are basically two choices if you're trying to keep it simple:

  1. Plain old "new GrammarBindingResolver()" -- this just creates a
     tree using the "resolver" (tag or nonterminal name) as the head
     of the tree and the results of the subexpressions as the
     children of that tree.

  2. AnnotationGrammarBindingResolver() -- this takes a class as an
     argument and inspects that class's *static* inner classes and
     *static* methods to try to match the things it's resolving with
     members that have the @bind or @bind.as annotations.

Eventually, if you're trying to build grammars at runtime, you'll
almost certainly want to use the first one, since you can't change
annotations at runtime. That approach is more tedious but more
flexible.

Please keep pestering me to write that tutorial. I really need to do
it right away. I've finally got TibDoc working, so I don't have any
excuses left...

  - a
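
P.S. To make the difference between the two choices concrete, here is
a rough sketch in plain Java. It does not use SBP's classes: the
ResolverSketch class, the @Bind annotation, MyBindings, and both
resolve methods are hypothetical stand-ins, and the real resolvers
also handle inner classes and constructors, which this ignores. The
first method just builds a generic tree from a head and its children;
the second uses reflection to look for a static, annotated method with
a matching name and argument count.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.List;

// Illustrative model only; none of these types are SBP's own classes.
public class ResolverSketch {

    // Hypothetical stand-in for an annotation in the spirit of @bind.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface Bind {}

    // Choice 1: a generic tree, with the tag or nonterminal name as its head.
    public static String plainResolve(String head, List<?> children) {
        StringBuilder sb = new StringBuilder(head).append('(');
        for (int i = 0; i < children.size(); i++) {
            if (i > 0) sb.append(", ");
            sb.append(children.get(i));
        }
        return sb.append(')').toString();
    }

    // Choice 2: find a static @Bind method whose name and arity match, then call it.
    public static Object annotationResolve(Class<?> bindings, String head, Object... children) {
        for (Method m : bindings.getDeclaredMethods()) {
            if (m.isAnnotationPresent(Bind.class)
                    && Modifier.isStatic(m.getModifiers())
                    && m.getName().equals(head)
                    && m.getParameterCount() == children.length) {
                try {
                    return m.invoke(null, children);
                } catch (ReflectiveOperationException e) {
                    throw new RuntimeException(e);
                }
            }
        }
        throw new RuntimeException("no static @Bind method matching \"" + head
            + "\" with " + children.length + " arguments");
    }

    // Hypothetical user-supplied bindings class; fixed at compile time.
    public static class MyBindings {
        @Bind
        public static Integer Plus(Integer left, Integer right) {
            return left + right;
        }
    }

    public static void main(String[] args) {
        System.out.println(plainResolve("Plus", Arrays.asList("a", "2")));
        System.out.println(annotationResolve(MyBindings.class, "Plus", 1, 2));
    }
}
```

Asking the second method for a name that MyBindings doesn't define
(say "Foo" with one argument) fails in the same spirit as the
exception quoted above, while the first accepts any head, which is
what makes it usable for grammars that only exist at runtime.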