Google Wave is not about text, it’s about data infrastructure

Google unveiled Google Wave, a new platform for live communication and collaboration. The announcement drew attention, and plenty of speculation followed about how it would affect e-mail, instant messaging, and even Twitter.

Most of those articles were superficial and focused too much on “immediate” (as in chat) communication and collaboration between people, as if instant messaging weren’t real-time enough and Google had simply shipped an upgrade as a new web application.

IMHO that’s only the tip of the iceberg. Wave is not simply an improved way of communicating. Put directly: Google Wave is not about text, it’s about data!

In order to understand Google Wave, we need to take a look at Google’s mission. If you visit the company’s website you can read it: “Google’s mission is to organize the world’s information and make it universally accessible and useful.”

Today Google already knows the current state of most mostly static data. Its search engine robots crawl the web, gathering data from pages, processing the HTML and related resources, and separating presentation data and scripts from semantic data. They do a good job, and as time passes they can identify trends and people’s behavior.

To fulfill its mission, Google needs more. It must know all information, or at least all publicly available information, at the moment it is created, so that it can shorten the lead time needed to organize it and make it useful and available.

This is where Google Wave comes in: it can act as a distributed data processing pipeline.

Google announced Wave as a new platform for communication and collaboration. But Wave can go deeper into the definition of “communication”, beyond the familiar layer of exchanging text messages to describe ideas, and build the representation of those ideas: a document.

This is not the first time Google has tried something like this. Google Base was an attempt to group data and let people publish it. But Google Base wasn’t “fluid” or collaborative enough. A lot of data has to stay private, and it is stored in specific in-house formats. That works as a barrier to entry and makes Google Base integration a task with little appeal.

Google Wave solves this problem by being an open source platform divided into client, server, and protocol, built on a real-time publishing strategy that uses operational transformation.
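To make operational transformation a bit more concrete, here is a minimal sketch of the idea for two clients that insert text at the same time. It is not Wave’s actual algorithm: it handles inserts only, and the `Insert` type, the `site` tie-breaker, and the `transform` function are names made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    """Hypothetical operation: insert `text` at index `pos`, issued by `site`."""
    pos: int
    text: str
    site: int  # tie-breaker when two sites insert at the same position


def apply(doc: str, op: Insert) -> str:
    """Return the document with the insert applied."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, other: Insert) -> Insert:
    """Rewrite `op` so it can be applied after the concurrent `other`."""
    if other.pos < op.pos or (other.pos == op.pos and other.site < op.site):
        # `other` landed earlier in the text, so `op` shifts to the right.
        return Insert(op.pos + len(other.text), op.text, op.site)
    return op


base = "Google Wave"
a = Insert(0, "Hello, ", site=1)  # client A edits the start
b = Insert(11, "!", site=2)       # client B edits the end, concurrently

# Each side applies its own operation first, then the transformed remote one.
doc_at_a = apply(apply(base, a), transform(b, a))
doc_at_b = apply(apply(base, b), transform(a, b))

print(doc_at_a)
print(doc_at_b)
```

Both clients end up with the same document even though they applied the two edits in a different order, which is the property a real-time publishing platform needs.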

This lets companies run their own Wave servers, where some waves are kept private and others can be shared with other Wave servers through the Google Wave Federation Protocol.

So basically any person’s action in the real world could generate a Wave transformation. Imagine that you buy something with your credit card. Your bank’s robot could publish this transaction, then your company’s ERP system could have a robot that monitors that wave and updates the company’s private cash flow wave, hosted on your own wave server.

Another internal robot then updates the company’s public balance, kept in a shared wave that is visible to the internet through the Wave Federation Protocol. This gives investors transparent, real-time information, and they could run robots of their own that use Wave’s Playback feature to analyze the history of the balance document and how it evolved over time.
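The credit card scenario can be sketched as a small chain of robots. This is only a toy, in-memory model under my own assumptions: the `Wave` class, its `publish` and `playback` methods, and the wave names are hypothetical and have nothing to do with the real Wave robot API or the federation protocol.

```python
class Wave:
    """Hypothetical in-memory wave: an append-only history plus subscribed robots."""

    def __init__(self, name):
        self.name = name
        self.history = []  # every published change, in order
        self.robots = []   # callables invoked on each new change

    def publish(self, change):
        """Record a change and notify every robot watching this wave."""
        self.history.append(change)
        for robot in self.robots:
            robot(change)

    def playback(self, upto):
        """Replay the first `upto` changes and return the resulting balance."""
        return sum(change["amount"] for change in self.history[:upto])


bank = Wave("bank-transactions")         # published by the bank's robot
cash_flow = Wave("private-cash-flow")    # kept on the company's own server
public_balance = Wave("public-balance")  # the wave shared with investors

# ERP robot: mirrors each card purchase into the private cash flow as an expense.
bank.robots.append(
    lambda c: cash_flow.publish({"amount": -c["amount"], "memo": c["memo"]})
)
# Internal robot: republishes only the amount, never the private memo.
cash_flow.robots.append(
    lambda c: public_balance.publish({"amount": c["amount"]})
)

cash_flow.publish({"amount": 1000, "memo": "opening balance"})
bank.publish({"amount": 120, "memo": "office supplies"})
bank.publish({"amount": 80, "memo": "hosting"})

print(public_balance.playback(upto=1))  # balance after the first change
print(public_balance.playback(upto=3))  # balance after all three changes
```

An investor’s robot would only ever see `public_balance`, but because the wave keeps its full history it can still replay how the balance got to where it is.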

This new way of moving and persisting data makes a lot of sense if you look at Google’s efforts to develop its distributed storage system, Bigtable, and to make it available to users through Google App Engine. You can also see a strong movement in the web development community to study and improve persistence strategies, with initiatives like CouchDB.

It’s a new way to think about the dynamics of the data life cycle. It’s a shift from an imperative way of computing and storing data, where you define how it is persisted down to the lowest structural level, to a more declarative way of publishing data changes and letting the structure evolve as needed.
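As a rough illustration of that last point, assume a document is nothing more than the result of folding a list of published changes. The `materialize` function and the order fields below are invented for the example, not taken from Wave.

```python
def materialize(changes):
    """Fold published changes into the current state of the document."""
    doc = {}
    for change in changes:
        doc.update(change)  # new keys simply appear; there is no schema to migrate
    return doc


changes = [
    {"customer": "ACME"},
    {"total": 250},
    {"currency": "USD"},  # a field nobody planned for when the wave started
    {"total": 300},       # a later change supersedes the earlier total
]

order = materialize(changes)
print(order)
```

Nothing here declares up front what an order looks like. The structure is whatever the published changes add up to.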