Microsoft Expands .Net With Xen
Robyn Peterson
In the professional world, most programming can be summed up in two words: data manipulation. And these days, much of the data being manipulated comes prepackaged in XML documents or SQL tables.
So why do languages like C# force programmers to use obtuse APIs to access those data structures? It would be much more convenient if C# had a notion of XML and SQL built directly into the language--and if Microsoft has its way, it soon will.
Xen, a new programming language coming out of Microsoft Research and developed in conjunction with the University of Cambridge, promises to bring together three disparate but integral components of programming, wrapping them together in .Net. Xen's creators use a geometric metaphor to illustrate this conjoining, calling the language a means to program with "circles, triangles, and rectangles."
The circle represents object-oriented programming. In .Net, that's C# and the Common Language Runtime (CLR). The CLR, which sits at the base of .Net, manages the execution of code, whether it's C#, VB, or F#. This piece is not changing.
The triangle represents data in a hierarchical structure, namely, XML. Programming with XML in C# can be tiresome today. The APIs needed to access XML data structures tend to obfuscate the code and lead to security holes, poor type-safety, and logic problems. With Xen, Microsoft proposes to encompass XML in the C# language, giving it first-class, native support.
The rectangle represents relational data, or data stored in tables in a database. Today's code tends to be riddled with verbose strings containing SQL and redundant ADO.Net API calls. According to Microsoft, Xen will incorporate relational data manipulation constructs directly into the language, solving that problem as well.
For those of you already familiar with C#, it's clear that Xen is simply C# with additional features and capabilities. In fact, it's just C# with two of its most used APIs -- XML and database manipulation -- now built directly into the language.
The architects behind Xen believe that if an API is used frequently, it should be considered for incorporation into the primary language. The popularity of XML and relational data structures makes them the most likely candidates for inclusion.
Along with native support for XML and SQL, Xen will include the entirety of the C# language. According to the creators, that will lead to simplified programming and increased productivity. Are they correct, or do these new features simply add to the bloat of the C# language?
What follows is our short preview of Microsoft's new Xen language, and what it means for you.
Note: Since Microsoft has not yet publicly released a Xen compiler and information appears to be restricted, the code examples in this article are primarily from the only published paper on Xen, Programming with Circles, Triangles and Rectangles.
Adopting a simple view of XML, Xen allows programmers to define classes using syntax similar to that of XML 1.0, while not straying far from its roots in C#. The keywords for XML schemas can be used when defining classes in Xen. In this article, we'll use the following keywords--note that these keywords are not new to Xen, but rather have been used in XML schemas for some time:
Sequence: A list of data members that must appear in the same order. If you define a person's name as a three-part sequence containing first name, middle name, and last name, then the data would have to appear in that order, such as: Fred, William, Lasternamerson. (Please excuse the example, I've had a lot of coffee.)
Choice: This element allows only one of its children to appear in an instance. If you have a choice for dessert between ice cream and frozen yogurt, that means you can only choose one.
Attribute: A characteristic of the described object. For instance, hair color is an attribute of a person.
For a refresher on XML schemas, the XML Schema Part 0: Primer from the W3C is quite extensive.
Now, back to Xen. You can make use of XML schema keywords when defining a class. For instance, the following Xen code (given by the authors) defines a class 'book', a representation of a library book that has either at least one editor or at least one author, as denoted by the 'choice' keyword. 'book' also has a publisher, a price, and a year attribute.
    public class book {
      sequence{
        string title;
        choice{
          sequence{ editor editor; }+;
          sequence{ author author; }+;
        }
        string publisher;
        int price;
      }
      attribute int year;
    }
For any object-oriented programmer, this example code is fairly easy to read. The class 'book' is defined as a sequence of variables (similar to columns in a table, or data members in a class). The sequence contains, for each object, a title, either at least one editor or at least one author (one or the other, not both), a publisher, and a price. The year the book was published is stored as an attribute.
Note that Xen can use the stream types ?, +, and *, which can be defined as follows:
? Optional
+ One or more
* Zero or more
So editor+ means one or more editors.
Writing the book data structure using just C#, or Java or C++ for that matter, would require much more coding. You'd have to include data integrity checks as methods, or somehow enforce your data rules such that each book contains at least one editor or at least one author.
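To make that comparison concrete, here is a minimal sketch of the hand-written checks a language without these constructs needs. It is written in Python rather than Xen or C#, since no Xen compiler is publicly available. The field names mirror the book example; the validation in __post_init__ and the sample values are our own assumptions, not code from the Xen paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Book:
    """Hand-written stand-in for the Xen 'book' class (illustrative only)."""
    title: str
    publisher: str
    price: int
    year: int
    editors: List[str] = field(default_factory=list)
    authors: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # 'choice' allows exactly one branch: editors or authors, never both.
        # '+' means the chosen branch must hold at least one entry.
        if bool(self.editors) == bool(self.authors):
            raise ValueError("a book needs editors or authors, but not both")


# Placeholder data, invented for the example.
ok = Book("Example Book", "Example Press", 30, 2004, authors=["A. Author"])

try:
    Book("Untitled", "Example Press", 0, 2004)  # neither branch filled in
    rejected = False
except ValueError:
    rejected = True
```

In Xen, the same rule is carried by the type declaration itself, so no separate check has to be written or kept in sync with the class.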
An XML developer could argue that this class can be defined with fewer characters in a DTD. However, the code in a DTD is far less readable. The following DTD, in XML, defines an equivalent 'book':
    <!ELEMENT bib (book* )>
    <!ELEMENT book (title,(author+ | editor+), publisher, price)>
    <!ATTLIST book year CDATA #REQUIRED >
Keep in mind, the purpose of adding this XML functionality to C# is to improve the language, not to replace DTDs or XML. Since most C# coders are writing Web services and applications, they frequently need to work with XML. Including XML constructs in the language will make programming Web services easier and should result in greater type-safety as well as increased code optimization--since the compiler will have intimate knowledge of the XML data structures and the functionality used to manipulate them.
For instance, when using .Net's System.Xml library, something as complex and interesting as a regular expression is often passed as a simple string. When an expression is passed as a string, the compiler cannot, or will not, parse it or check it for syntactic accuracy (with regard to the grammar of a regular expression). And because the expression is just a string, the application will have a much more difficult time detecting oddities, such as code-injection attacks.
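The same weakness applies to any query handed to an API as a plain string, including the SQL strings mentioned earlier. The sketch below illustrates it in Python with the standard sqlite3 module, used here only as a stand-in for the .Net APIs under discussion. The customer table and its rows are invented for the example.

```python
import sqlite3

# In-memory database with a made-up table (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, custid INTEGER)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)", [("Fred", 101), ("Wilma", 102)]
)

# A typo inside the query string goes unnoticed until the statement runs.
try:
    conn.execute("SELEC name FROM customer")
    typo_caught_at_runtime = False
except sqlite3.OperationalError:
    typo_caught_at_runtime = True

# Building the query by concatenation lets the input rewrite the query itself.
user_input = "nobody' OR '1'='1"
unsafe = conn.execute(
    "SELECT custid FROM customer WHERE name = '" + user_input + "'"
).fetchall()

# A bound parameter keeps the input as data, so nothing matches.
safe = conn.execute(
    "SELECT custid FROM customer WHERE name = ?", (user_input,)
).fetchall()
```

Here the concatenated query returns every row, while the parameterized one returns none. A language that understands its queries natively can catch the first problem at compile time and make the second harder to write by accident.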
With Xen, the simplest strategy for iteratively processing a list is a foreach loop, which performs an action on each item in a series. For example, using the book class we developed earlier, we can declare a book and then write each of the author names to the console with the following code:
    //Instantiating the book object, b
    book b;
    [… here we can ask the user, or some other source, to fill in data for the book, like author or editor, price, year of publication, etc. …]
    //Accessing the data held within the book object
    author* authors = b.author;
    //Iterating over the list of authors, and writing names to the console
    foreach(author a in authors)
      Console.WriteLine(a);
The declaration author* authors = b.author says that book b may contain zero or more authors (remember that, according to our definition, book b can have an editor instead of an author) and assigns them to the authors variable.
Then, the foreach loop simply iterates over the elements in authors, printing each to the console.
Convenient Coding
Theoretical computer science is fascinating, but for a new language to take hold in the modern workplace, it has to provide a higher level of convenience than what currently exists. In these early stages, Xen appears to do just that. Here's another example of code, given by the authors, that's easier in Xen than it is in C#. In this example, we're creating a bibliography for our books. In essence, it's a simple class that contains an array-like structure of books, where each slot in the array is a book object.
Given the following class:
    public class bibliography {
      book* books;
    }
Remember that in Xen, you can use the stream types ?, +, and *, which are defined as follows:
? Optional
+ One or more
* Zero or more
We can then create a bibliography called bib:
    bibliography bib;
    [… assuming we already have a list of books, "listOfBooks" (not including this step here in order to keep the code easier to understand) …]
    bib.books = listOfBooks;
Then we can acquire all of the titles of the books in our bibliography with this code in Xen:
    string* titles = bib.books.title;
In this line, we're creating something like a string array (with zero or more strings) that will contain the title of each book in books. That's much more elegant than the explicit iterator that a C#-style approach requires (written here with Xen's stream types so the two versions are easy to compare):
    string* getTitles(book* bs){
      foreach(book it in bs) yield it.title;
    }
    string* titles = getTitles(bib.books);
It's important to note that this code could be written many different ways, in numerous different languages. The code was written in this manner for ease of illustration. If you have a better way to write it, let us know in the forum.
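As one such alternative, here is a rough Python sketch of both styles. The Book and Bibliography classes and the sample titles are stand-ins we made up for the example, and get_titles is simply our rendering of the explicit helper shown above.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Book:
    title: str  # only the field needed for this example


@dataclass
class Bibliography:
    books: List[Book]


def get_titles(books: List[Book]) -> Iterator[str]:
    # Explicit version: loop over the books and yield each title in turn.
    for it in books:
        yield it.title


# Placeholder data, invented for the example.
bib = Bibliography(books=[Book("First Title"), Book("Second Title")])

titles_explicit = list(get_titles(bib.books))

# Closest plain-Python equivalent of Xen's one-line bib.books.title.
titles_inline = [book.title for book in bib.books]
```

Both forms produce the same list of titles. The Xen version goes one step further by letting the member access itself apply to every element of the stream.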
Table-like structures can be built in Xen, similar to how classes are created. For example, the following SQL code will construct a table called Customer in which each record contains a custid (an integer) and an optional name (a string).
    CREATE TABLE Customer (
      name string NULL,
      custid int);
In Xen, this table could be created with the following line of code:
    sequence{ string? name; int custid; }* Customer;
Note: For a definition of 'sequence' or other keywords, see the keyword list earlier in this article.
The collection is made up of a sequence that contains an optional name (string followed by the symbol denoting optional, '?') and an integer custid. The name of the table, Customer, is declared at the end of the line. Notice that the sequence declaration is followed by a '*', which simply means that zero or more sequences can be stored in Customer (in other words, the table can be empty, or may contain any number of records).
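For readers who think in terms of ordinary classes, the declaration can be approximated as follows. This is a loose Python analogy, not a translation: CustomerRow is a hypothetical name, and a plain list stands in for the '*' stream.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CustomerRow:
    """One record of the table (hypothetical name, for illustration)."""
    custid: int
    name: Optional[str] = None  # 'string?': the name may be absent


# 'sequence{...}*': zero or more rows, so an empty list is a valid table.
customer: List[CustomerRow] = []
was_empty = len(customer) == 0

customer.append(CustomerRow(custid=1, name="Fred"))
customer.append(CustomerRow(custid=2))  # no name supplied
```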
Built-In Select Statements
In Xen, you can also write code that performs filtering tasks just like select statements in SQL. Using our just-created Customer table, we can retrieve all of the customers who have the name "Fred" and a custid greater than 100.
    //Instantiating a Customer table called CustomerTable
    Customer CustomerTable;
    [Let's assume CustomerTable has some data that was filled in already by a user.]
    //Selecting all customers named Fred with a custid greater than 100
    Customer* Freds = CustomerTable[it.name=="Fred" && it.custid>100];
For a more complex example, let's use the bibliography we created earlier. If we instantiate this object with the name 'bib', we can retrieve all of the books that are published by Addison-Wesley and published after 1991 in the following manner.
    book* OldBooks = bib.books[it.publisher == "Addison-Wesley" && it.year > 1991];
Note that the implicit variable 'it' is not defined outright, but is understood to refer to the current book as the filter iterates over all of the books in bib.
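The role of 'it' is easiest to see next to a filter written with an explicit loop variable. The following Python sketch reproduces the "Fred" query under our own assumptions: CustomerRow and the sample rows are invented, and the variable row plays the part of 'it'.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CustomerRow:
    """One record of the table (hypothetical name, for illustration)."""
    custid: int
    name: Optional[str] = None


# Placeholder data, invented for the example.
customer_table: List[CustomerRow] = [
    CustomerRow(50, "Fred"),
    CustomerRow(150, "Fred"),
    CustomerRow(200, "Wilma"),
    CustomerRow(300),
]

# 'row' is named explicitly here; Xen supplies 'it' implicitly instead.
freds = [
    row for row in customer_table
    if row.name == "Fred" and row.custid > 100
]
```

Only the second row passes both conditions. In Xen the loop variable disappears and the condition sits directly inside the square brackets.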
At this point, if you're a C# programmer, you're either worried or drooling. If you're in the former state, don't worry--C# isn't going away anytime soon. According to the authors of the only published work on Xen, "[Xen] does support the entirety of the C# language."
The goal of Xen is to provide native support for APIs that are used most frequently in the business world. The creation of the language is an attempt to simplify the life of the average coder. Whether or not you accept Xen as a quality initiative, you have to agree with the inherent logic of that goal.
It's necessary to realize that no language is better than another in every regard. Some are easier to use to accomplish certain tasks than others. To us, Xen appears to be a very natural extension to C#, and one that we envision becoming very useful for Web and application programming. Let us know what you think of Xen in our forum.
Where is Xen Now?
Xen is currently still in the research and development phase at Microsoft Research and the University of Cambridge. With the defining research paper, The Meaning of Xen, still under development, Xen looks to be firmly grounded in the laboratory for now. Although a functioning compiler was shown at XML 2003, according to reports from Microsoft Weblogger Dare Obasanjo, it hasn't been distributed publicly, nor has an availability date been announced.
According to Erik Meijer, Technical Lead of Microsoft Research's Webdata division, Xen replaced X#, the XML-related programming language project that has been frequently speculated about over the last year. He further explained that what really matters is that Microsoft is doing research in data integration and programming.
This article is simply a preview of Xen based on available information in the market. For more, we recommend that you visit the following sites. In particular, the first link leads to a very well written technical paper on Xen. However, since Mary Jo Foley first posted a story on Xen on Microsoft Watch, links on the language have been disappearing around the Web. Coincidence? Possibly. We'll let you decide.
Programming with Circles, Triangles and Rectangles - Erik Meijer and Wolfram Schulte (Microsoft), and Gavin Bierman (University of Cambridge)
Unifying Tables, Objects, and Documents - Erik Meijer and Wolfram Schulte (Microsoft)
Dare Obasanjo's Blog
Devhawk Blog
Copyright © 2004 Ziff Davis Media Inc. All Rights Reserved. Originally appearing in Dev Source.