Saturday, May 31, 2008

GORDA: the day after!

May 30th officially marks the end of the GORDA project. I had the privilege of being involved in it from the very beginning. I was at the kickoff meeting, back in October 2004, and I was at the final review meeting, last Friday. Both meetings took place in Braga, at the University of Minho.

In this last review, everything went smoothly. Still, I want to highlight one of the project deliverables: the prototype demonstration. It was a live demonstration of a replicated database using all GORDA software packages. The demo ran without any glitch whatsoever, and I felt very proud as I watched the concepts and ideas we have had over the past three and a half years implemented, deployed and executing nicely for the reviewers.

This demonstration presented two replication scenarios: i) master-slave replication based on Sequoia and MySQL; ii) PostgreSQL-based multi-master, update-everywhere replication using certification, plus additional autonomic cluster management tools. Pretty much all the software developed is hosted or referenced on the GORDA website, so if you feel curious, feel free to sneak a peek. We have GORDA implementations for PostgreSQL, MySQL (roughly), Sequoia and Apache Derby. Not all of them implement the full GAPI (GORDA API) set as defined in the API reference, but they still show that the concept/model is feasible.
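For readers who have not met certification-based replication before, here is a minimal sketch of the general idea. It is not GORDA code and does not use GAPI; the names `Transaction`, `Replica` and `deliver_in_total_order` are made up for illustration, and the total order broadcast that a real system would get from a group communication layer is assumed and simulated by a plain Python list. Each transaction executes at one replica, its write set is delivered to every replica in the same order, and every replica runs the same deterministic test: abort if some key was written by a transaction that committed after this one started.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    tx_id: str
    start_version: int  # number of commits the transaction had seen when it began
    write_set: dict     # key -> new value


@dataclass
class Replica:
    data: dict = field(default_factory=dict)
    last_writer: dict = field(default_factory=dict)  # key -> commit version that last wrote it
    version: int = 0

    def certify_and_apply(self, tx: Transaction) -> bool:
        # Certification test: a write-write conflict with any transaction
        # that committed after tx took its snapshot means tx must abort.
        for key in tx.write_set:
            if self.last_writer.get(key, 0) > tx.start_version:
                return False
        # No conflict: commit and apply the write set locally.
        self.version += 1
        for key, value in tx.write_set.items():
            self.data[key] = value
            self.last_writer[key] = self.version
        return True


def deliver_in_total_order(replicas, transactions):
    # Assumption: every replica sees the same sequence of transactions.
    # Because certification is deterministic, all replicas reach the same
    # decision without any further coordination.
    outcomes = {}
    for tx in transactions:
        decisions = {replica.certify_and_apply(tx) for replica in replicas}
        assert len(decisions) == 1, "replicas diverged"
        outcomes[tx.tx_id] = decisions.pop()
    return outcomes


replicas = [Replica(), Replica()]
t1 = Transaction("t1", start_version=0, write_set={"x": 1})
t2 = Transaction("t2", start_version=0, write_set={"x": 2})  # concurrent with t1, same key
t3 = Transaction("t3", start_version=0, write_set={"y": 3})  # concurrent, different key
outcomes = deliver_in_total_order(replicas, [t1, t2, t3])
print(outcomes)
print(replicas[0].data)
```

In this toy run `t1` and `t3` commit everywhere, while `t2` aborts everywhere because it conflicts with `t1`. There is no master: any replica can accept updates, and conflicts are resolved at commit time rather than by routing all writes to one node.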

Now that the project is over, I am wondering what will happen to the GORDA legacy. I believe that at least one of the project partners will merge GORDA contributions into some of its products. As for the open source database communities, I am still not sure what impact GORDA will have on their concerns about replication, at least in the long run. Currently, every time I engage in a database replication discussion outside academic circles, the conversation almost instantly turns to "master-slave". It is a kind of tunnel vision around primary-backup replication. People have been in this mindset for a long time, and it is hard to make them understand that there are other ways of doing things (possibly with different trade-offs). Regarding GORDA, I am sometimes afraid that after preaching to people about GAPI they will just come back to me with something like: "So... can we do master-slave on top of it?" Perhaps the industry is not ready for anything different yet. I mean, GORDA has prototypes of multi-master update-everywhere replication using certification; although sub-optimal, they are proof-of-concept implementations that demonstrate the feasibility of these "other" approaches. So I guess my question is: "If you are a database replication solution provider, would it be interesting for you to offer solutions other than master-slave replication (for instance: row-based, no data partitioning, master-master replication)? Apart from very specific situations, is there any user demand for anything other than primary-backup?"

Personally, I believe that some of GORDA's ideas will make it into the market, but to what extent and within which time frame is not that clear to me. If at least the GAPI model gets embraced by open source databases (PostgreSQL, MySQL and Apache Derby), it will be a major achievement and a major breakthrough. Honestly, I like to think that the demonstration we did on Friday was the first of many. Additionally, I will continue to maintain and support parts of the GORDA software, either because I need them (in my PhD thesis, for instance) or because I have sensed some interest from the community (which has already resulted in a trip to California for me and Alfrânio to present some of this at the MySQL Conference).

If you are still reading this post, you should check the GORDA website for details and software. The final versions of the public deliverables, with all the documentation, will eventually be uploaded and published, but the software is already available. Feel free to provide feedback, and if you have anything to add about user demand for solutions other than master-slave replication, I would be delighted to hear about it.

Final remark: inevitably, GORDA's end felt like we had "finished writing a book", but also like we "had begun writing a new one."
