Saturday, March 13, 2004
XAST
So the task at hand is to use XSL operations on the AST of a java program to expand macros in that program. In order to do this we need an XML representation of an AST. Last time I mentioned using JSE's parse tree. Here's what I came up with last night.
Consider the following program, with the macro 'unless',
A reasonable macro expansion would be,
To get things started, I first created an XMLReader
which
serializes JSE's AST. Its output for the above program looks like so,
As you can imagine, the above XML leaves out much of what the AST contained. That is, I didn't bother outputting source coordinates, and some semantic information (for instance, nodes on the AST knew whether they represented the end of a clause or not). For now, I decided to just spit out enough to allow me to reconstituted the original java program.
Also notice that the tree is pretty well unstructured. Only the most fundamental bits of the syntax rules (braces, parenthesis and brackets must match) are reflected in the structure. Maybe this is the most we can hope for, since we don't know anything else about the semantics of program until we start matching macros?
Transforming
I decided to use XSL to convert the above tree into a similarly unstructured representation of the program, but with the macros expanded. That is, I decided that the XSL will output an XMLized AST that is identical to the AST of the expanded program. In other words, nothing fancy is going on..
So, by default I structured the XSL so it visits every node in order, from top
to bottom, depth first, and outputs the nodes as it finds them. The template
for the 'unless' macro matches at the point of reaching the unless
identifier, by using the following XPath,
This says, "match the identifier 'unless' which is followed immediately by a pair of parenthesis". The output of the template is like this,
If I was to write this template in JSE notation it would look like this,
Actually, my template is a little more general, since it doesn't care whether the 'unless' expression is actually an expression, but only that it is followed by parentheses. This points to something larger. Using the current representation of the source code, I can't easily do much better. That is, I can't easily express, in XPath, the fact that inside the parentheses I should find an expression. Why? because expressions aren't marked as expressions in the AST, instead they're represented as unstructured goo. This came up again when I tried to write a template for a more complicated macro...
Traversing properly
The last thing I want to point out about this approach is an implementation detail. In order to get the proper traversal behaviour I created a general template like so:
This captures the "top down, depth-first, and output each node" behaviour.
But notice that it does it in a peculiar way. Each time we match an element
we explicitly call apply-templates
to match its next sibling. This is
in opposition to calling apply-templates
on all of its children.
We do this is because there is an interplay between a macro transformation and the default traversal of the XSL. We don't know ahead of time whether a child of an element needs to be expanded until we've inspected the child's previous siblings. That is, when doing a macro transformation we need to add bits into the result tree and we need to control how and where different bits of the source tree are further expanded.
For instance, without the 'unless' macro, the default behaviour is to visit the 'unless' identifier, then visit the following 'Paren', then the 'Braces', and so on. But this isn't what we want to have happen when we transform the 'unless' macro. What we want is to output an 'if' and 'not' operator, then visit the inside of the 'Paren', and then visit the following 'Braces'.
So this is what the 'unless' template turned out looking like,
Blah.
Questions and thoughts
So, I have to say my head is slightly spinning with all of this. I could go on and on, but instead I'll just jot down a few questions and thoughts I have.Q: Matching on type information.. Hard to do. It's hard, because we don't have any type information in our tree, as I mentioned above. How would we write a template for the JSE macro template,
The best we could with an XPath seems to be something like,
Yuck. This doesn't work entirely because *[4] matches any single element, so it could be a set of Parens, Braces, Braces, Comma, Semicolon, Identifier, ... Yuck, again. And how do we match the following statement?
Q: Macro uses expanding to macro uses... Imagine the following macro (expressed as a JSE template),
Well, whatever, the basic idea is that the 'unless' macro transformer produces the use of a 'using' macro. How do we deal with that? In XSL-speak, the bit of the tree holding the 'unless' expansion is the result tree, and a template is not allowed to operate on the result tree. To operate on the result tree (conceptually) we could run the stylesheet again over its own output. But what if a macro use expanded to a macro use which expanded to a macro use...?