Binary XML vs. Excel

02 Aug, 2006

In this post I’d like to share some ideas regarding Binary XML and Excel.
The W3C’s XML Binary Characterization (XBC) Working Group started in the summer of 2003, and its first activity sparked protest from XML experts around the world. The Binary XML concept had been discussed before, with people exchanging ideas on why a less verbose XML is necessary.
One year ago the XBC Working Group published its documents and continued its efforts in the Efficient XML Interchange Working Group, sadly without attracting much interest. The problem is that Binary XML would be a whole new set of specifications, creating new problems: it would not be human-readable; new and updated tools, parsers and editors would be necessary; and a new set of agreements would have to be made for, e.g., well-formedness validation.
Why invent a Binary XML when we already have a perfectly usable alternative: Microsoft Excel?
Excel is used for all kinds of purposes, including many it was not designed for: requirements, structured specifications, messages, issue tracking, timesheets, project management, time series data and, of course, calculations. That makes Excel as much ‘general purpose’ as XML is.
What’s more, it is already used by countless users worldwide. Nowadays non-Microsoft tools can read and write the Excel file format, so the usefulness of Excel spreadsheets extends beyond the Microsoft Windows platform.
An Excel sheet is a binary format which ordinary human beings know how to deal with: we can open those spreadsheet files in our favourite office suite and look at, analyse and modify what’s in there. It’s even more user-friendly than XML itself! No more tag balancing, no more character escaping! Spreadsheets lend themselves perfectly to mixed human-computer message exchange. For instance, a questionnaire or a timesheet could be provided as an Excel template, filled in by a user with their favourite spreadsheet application, and then processed by an administrative system using software like Jakarta POI.
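To make the processing half of that exchange a bit more concrete, here is a minimal sketch in Python. It assumes the filled-in timesheet has already been read into a plain grid of cell values by some spreadsheet library (that reading step is not shown), and the names `SHEET`, `find_header_row` and `hours_per_project`, as well as the `Project` and `Hours` column headers, are made up for illustration. The columns are located by their header text rather than by fixed position, so the sketch still works if the user has moved columns around.

```python
from collections import defaultdict

# Hypothetical grid: roughly what a spreadsheet library might hand back
# after reading a filled-in timesheet (one list per row, one value per cell).
SHEET = [
    ["Timesheet", None, None],
    [None, None, None],
    ["Date", "Project", "Hours"],
    ["2006-07-31", "Intranet", 6.0],
    ["2006-07-31", "Support", 2.0],
    ["2006-08-01", "Intranet", 7.5],
    [None, None, None],
]


def find_header_row(rows, required):
    """Return (row index, {header text: column index}) for the first row
    that contains every required header."""
    for i, row in enumerate(rows):
        cols = {str(v).strip(): j for j, v in enumerate(row) if v is not None}
        if all(name in cols for name in required):
            return i, cols
    raise ValueError("no row contains the headers %r" % (required,))


def hours_per_project(rows):
    """Sum the 'Hours' column per 'Project', skipping incomplete rows."""
    header_idx, cols = find_header_row(rows, ("Project", "Hours"))
    totals = defaultdict(float)
    for row in rows[header_idx + 1:]:
        project = row[cols["Project"]]
        hours = row[cols["Hours"]]
        if project is None or hours is None:
            continue  # blank or partially filled row
        totals[project] += float(hours)
    return dict(totals)


if __name__ == "__main__":
    print(hours_per_project(SHEET))
```

Looking up columns by header is only one possible convention; a real template would still need some agreement on header names, much like an XML message needs an agreed schema.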
It’s a shame that most spreadsheets are only ever handled by people and almost never processed by software, even though processing them programmatically is a very real and practical option…
When will businesses go all the way and use spreadsheets to implement C2B or B2B messaging?

Silvester Van der Bijl
15 years ago

Unfortunately, hardened Excel users have a tendency to move cells around, making it impossible to extract information through code. Other issues are differences between versions, user locales, etc.
Microsoft itself apparently experienced the same problems, so newer Excel versions have the capability to export the data to XML (optionally with an XSD) 😉
The POI libraries have some annoying issues which usually don’t show up until it’s too late to switch to a structured markup (maximum number of rows, for starters), and POI doesn’t seem to be in active development anymore.
In short, I don’t think it’s as practical as you suggest to use Excel as an interchange format.

Silvester Van der Bijl
15 years ago

Yes, you can protect Excel cells, but then we would also have to know in advance how many rows of data to expect from e.g. a table in a sheet. The user cannot add rows, since the cells are protected.
Have you also looked at using a combination of the two? You can add an XSD to an Excel sheet, allowing the user to enter data, move cells, whatever. Excel remembers the mapping to the XSD, and allows for export to XML.

Wilfred Springer
15 years ago

…I just hope you’re kidding. Fast Infoset allows you to continue to use the existing APIs, like StAX, SAX and DOM, so it’s not that intrusive at all.

14 years ago

I just want to know: when formats such as ‘Excel’ and ‘Text File (comma delimited)’ are already available for data exchange and representation, why use XML over these standards?

John Taylor
13 years ago

I found your blog on Google. I’ve bookmarked it and will watch out for your next blog post.
