Wednesday, January 28, 2009

Xtext Corner #3 – M5: What's in the pipeline?

Some cool things are coming with the next milestone of TMF Xtext on February 6th, 2009.
  • The outline view has been redesigned. From now on, you can create a representation of your source file's structure that does not necessarily map to the semantic model one-to-one. You are free to include virtual nodes to emphasize aspects of special interest, or to group the objects differently from the way they appear in the textual source file.
  • Xtext resources observe their referenced models and can reload them to reflect recent changes transparently. This improves the overall user experience and provides faster feedback.
  • We will come up with a first draft of an Xtend API for the new Xtext. It is not as powerful as the corresponding Java API, but it is still very useful, especially for early prototyping. Most notably, Xtend is very convenient when you have to work with dynamic EMF models. Furthermore, it comes with a nice collection API.
There is one more thing ...

Due to an IP issue with Antlr, we are not allowed to use this mature parser generator if we want TMF Xtext to be part of the next Eclipse release. Unfortunately, we are not satisfied with the Eclipse-compatible alternatives. That's why we decided to build our own parser generator based on the packrat algorithm. Besides the effort of implementing yet another parser generator, there are some positive side effects:
  • We learned a lot about minimizing dependencies and therefore made the parser more pluggable. If you don't like our home-grown packrat parser and don't want to use Antlr either, you can theoretically plug in any generator of your choice.
    Attention: This feature comes without warranty.
  • We found a nice way to define terminal symbols. With Xtext M4 you could already write your own lexical rules, but in a somewhat awkward syntax: the whole body of the rule was a plain string without any syntax check at design time. Xtext M5 comes with terminal rules. At first glance, they look like any other parser rule. The trick is that they let you define a kind of lexer body with a rich syntax and known semantics. Unlike plain parser rules, however, they produce exactly one (leaf) node in the parse tree and may not be interrupted by any whitespace or comment.

    Terminal rules will supersede the old-school lexer rules.

  • Hidden tokens per rule were introduced: you can define terminal tokens as hidden on a per-rule basis. If you do not want to allow whitespace inside your fully qualified names, you can easily disallow it.

Monday, January 19, 2009

Xtext Corner #2 - Linking and scoping

I've been very busy these days with all this Christmas stuff, the M4 milestone of TMF Xtext and, obviously, the next milestone, which will be released in mid-February. That's why there was a rather long delay since my first post about linking in Xtext. In December I discussed some difficulties with multi-value references in combination with the default behavior of the linker. As I promised in my previous post, the workaround is not necessary in Xtext M4: you no longer have to avoid multi-value (non-containment) references in metamodels for the sake of validation.

Sven already wrote a very nice blog post about the concept of scopes that we introduced in the last milestone. The crucial point is to distinguish between visible objects and name matching when linking the model. We therefore split the default linking service implementation into two main stages.

The first part is about identifying all visible objects and creating the so-called scope. A scope is an abstraction over linkable EObjects and the algorithm that calculates their valid string representation. Any scope provides all reachable objects, each paired with its valid textual representation – the name – that will match this object starting from a given point in your model, the so-called context. Two facts are important:
  1. Scopes can be nested into each other and thus can hide their inherited elements.
  2. A scope does not export two objects with the same name in the same scope-level.




If you think about Java, a local variable foo might be declared in a method body, which makes the field of the same name unreachable by its simple name; you need the "this" qualifier to access it. A well-known tool that has to deal with scopes is the Eclipse compiler. It will not compile (read: "link") any two fields with the same name in a given class. Neither of these two fields would be in the scope.
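The two facts above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class NestedScope and its methods add and lookup are hypothetical names and not part of the Xtext API, and treating a duplicated name as unresolvable (rather than falling back to the outer scope) is my own simplification.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch of a nested scope; not the Xtext scoping API.
final class NestedScope {
    private final NestedScope parent;                       // outer scope, or null for the outermost one
    private final Map<String, Object> elements = new LinkedHashMap<>();
    private final Set<String> ambiguousNames = new HashSet<>();

    NestedScope(NestedScope parent) {
        this.parent = parent;
    }

    // Registers an element under a name on this scope level.
    // If the name is already taken on this level, neither element is exported.
    NestedScope add(String name, Object element) {
        if (ambiguousNames.contains(name)) {
            return this;
        }
        if (elements.containsKey(name)) {
            elements.remove(name);
            ambiguousNames.add(name);
        } else {
            elements.put(name, element);
        }
        return this;
    }

    // Looks a name up on this level first, so local elements hide inherited ones.
    Optional<Object> lookup(String name) {
        if (ambiguousNames.contains(name)) {
            return Optional.empty();                        // assumption: duplicates resolve to nothing
        }
        Object local = elements.get(name);
        if (local != null) {
            return Optional.of(local);
        }
        return parent == null ? Optional.empty() : parent.lookup(name);
    }
}
```

With a class-level scope containing a field foo and a nested method-level scope containing a local variable foo, lookup("foo") on the method scope yields the local variable, while any name declared only on the class level is still found through the parent.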



Because fields and methods with the same name can both be visible, in contrast to equally named variables, the scope provider has to take into account the type of the objects to be retrieved. Sven described the semantics of scopes in more detail and provided a neat example.

The second part of the linking service's job is to compare the input string with the names of the scoped elements that were retrieved from the scope provider. That's very easy in most cases, because a simple string match is sufficient.
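As a rough illustration of this second stage, here is a minimal Java sketch of the plain string match. ScopedElement and NameMatcher are hypothetical names made up for this example; the real linking service works on EObjects and is not shown here.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical pair of a name and the object it stands for; not an Xtext type.
final class ScopedElement {
    final String name;
    final Object target;

    ScopedElement(String name, Object target) {
        this.name = name;
        this.target = target;
    }
}

// Hypothetical helper that performs the name matching stage.
final class NameMatcher {
    // Returns the target of the first visible element whose name equals the reference text.
    static Optional<Object> resolve(String referenceText, List<ScopedElement> visibleElements) {
        for (ScopedElement candidate : visibleElements) {
            if (candidate.name.equals(referenceText)) {
                return Optional.of(candidate.target);
            }
        }
        return Optional.empty();                            // nothing visible under that name
    }
}
```

Keeping the match separate from the scope computation means a language with different naming rules (case-insensitive names, for instance) would only need to swap the comparison, not the visibility logic.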

But the object to be linked is not necessarily one that was retrieved from the scope service. In the Xtext grammar itself, we link imported metamodels to EPackages by a given URI. It is practically impossible to come up with a scope implementation that iterates over each and every metamodel reachable via any given URI. But the linking service is still able to establish this cross-reference. This indicates that we found some nice abstractions for the common cases, but did not introduce overly tight restrictions for the rare ones.

Let's come back to the example from my previous post. Using the M4 milestone of TMF Xtext, it is now possible to write the grammar for the given example language in the most intuitive way. By default, the linker will only cross-reference objects that have a unique name. Even for multi-value references, we get very good error indicators for free. The last thing to do is to provide a check that every object is linked only once per reference, meaning that the list contains distinct values. Should really be a no-brainer with Xtend ...
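For readers who prefer to see the idea spelled out, here is a minimal sketch of such a distinctness check, written in plain Java rather than Xtend. The class and method names are hypothetical, and comparing by object identity is an assumption on my side (it matches how linked model objects are usually compared, but your metamodel may need something else).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical validation helper; not part of Xtext or Xtend.
final class DistinctReferenceCheck {
    // Returns the positions of all entries that repeat an earlier entry of the list.
    // Entries are compared by identity, not by equals().
    static <T> List<Integer> duplicateIndices(List<T> references) {
        Set<T> seen = Collections.newSetFromMap(new IdentityHashMap<T, Boolean>());
        List<Integer> duplicates = new ArrayList<>();
        for (int i = 0; i < references.size(); i++) {
            if (!seen.add(references.get(i))) {
                duplicates.add(i);                          // already linked earlier in this list
            }
        }
        return duplicates;
    }
}
```

Returning the indices instead of a plain boolean makes it possible to attach an error marker to exactly the repeated entries of the multi-value reference.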