…or software patent troubles again

The weirdest news i read yesterday: I4i is not out to destroy Microsoft? In the article is described that some small company, i4i, has apparently won a case against Microsoft, claiming the Microsoft Word violates a patent they hold.
More specifically: patent 5787449.

The Texas judge ruled that Microsoft has to pay a fee/fine of 200 million dollar, AND that Word versions 2003, 2007 and future versions can no longer be sold. And the problem only arises when clients or customers use a feature called “custom XML”. What?

Let’s rewind here for a bit. What does that patent actually say? Let me summarize it for you. It describes two things: 1) it describes a system where the content is seperated from the actual representation, and 2) using a dialect of SGML. Now note: this patent was filed in 1994, before XML even existed (only since 1996), and the patent was finally issued in 1998. I tried to read the patent-document as some kind of specification, but i am not even sure what it is actually protecting. An algorithm? The idea? Because the idea of seperating content and representation is a common idiom. We all do it. Websites do it. We use css all the time. We use templates in Word … The idea of using metacodes (like Title, Chapter, …) and then looking up their definition the replace in the actual document (this is the example from the patent).

So it must have something to do with the specific implementation: they use an XML file. Now comes the really scary part. XML is really powerful, and Word is using a set of XML files, packaged in one zipped archive as their file format since Word 2003. This is really awesome. This is more open than any binary format. Now anyone who can read can actually create or process Word-files.

Microsoft calls it Open Office XML or OOXML, but that doesn’t really mean the standard is really open. Only Microsoft can amend the standard, and the other big standard, OpenDocument (ODF), is the fileformat originally used in OpenOffice.org (you see the deliberate confusion in the naming?), is totally different, not supported. A missed opportunity. But that is another discussion all together.

Now, i4i has no problem that Microsoft uses an XML file, just as long as the format is fixed. Meaning: you specify which tags you can read, are defined (for instance using DTD), and those tags are fixed. But Word allows you to add Custom XML, you can add your own fields into any Office document. And i4i claims that is the problem.

So now i am lost. This has nothing to do with seperating content and presentation (that was what the patent was about, right?). It could have something to do with how the extra tags are added, the custom XML. If they would use a lookup-table, to identify the custom tags, then they use a technique that is described in that patent.

But hey, wait, does that mean you can not use custom xml tags anymore? I have used tons and tons of xml, with no predefined DTD, so that the file format is extensible, and users can add their own extensions, and keeping the definition of the tags seperate.

In my personal opinion, and i think i am not alone, Microsoft is falsely convicted. There is no technological ground this patent can hold. This all comes back to the old software patent discussion. What is patentable and what is not. It seems so unnatural to patent software, algorithms, they are the building blocks upon which we learn and build the actual stuff. And the actual stuff (software/websites) is most of the times so general on the one hand, but very specific to the customer, i don’t see how these could be patented. I read that the way a website operates can be patented in the USA.

Back to the patent ’449. In the press release found on i4i’s website, they claim that Microsoft infringes claims 14,18 and 20 from the patent. Those claims refer to the ability of using a lookup-table, being able to generate it, being able to read and replace the values, and being able to use more than one. That is so general! They could sue anyone who is using xml and some kind of look-up mechanism. I think the patent should just be declared invalid. But i guess that is easier said than done.

Speaking about the seperation between content and representation, speaking of metacodes, reminded me of Donald Knuth, inventor of TeX, and author of the greatest series of books of all time: The Art of Computer Programming. So let me point you to a letter by that same man, claiming mathematical ideas or algorithms should not be patented.

Final thoughts: i4i LP is a company that guards the patents from i4i. So this could be some kind of patent troll? Secondly, they choose Texas, where a large number of these kind of cases have been won, because of their technological inaptitude.

I am looking forward to Microsoft’s next step.

What are your thoughts?