If we look back at what has happened in the data integration market over the last 10 years we can see a lot of change. In the first half of that decade there was an explosion in the number of data integration tools and in the second half there was a big wave of consolidations. This consolidation wave put an ever growing amount of data integration power in the hands of only a few large billion dollar companies. For any person, company or project in need of data integration, this meant either paying large amounts of money or doing hand-coding of their solution.
During that exact same period, we saw web servers, programming languages, operating systems, and even relational databases turn into a commodity in the ICT market place. This was driven among other things by the availability of open source software such as Apache, GNU, Linux, MySQL, and many others. For the ICT market, this meant that more services could be deployed at a lower cost. If you look closely at what has been going on in those last 10 years, you will notice that most companies increasingly deployed more ICT services to end-users. These services get more and more connected over an ever growing network. Pretty much anything ranging from tiny mobile devices to huge cloud-based infrastructure is being deployed and all those can contain data that is valuable to an organization.
The job of any person that needs to integrate all this data is not easy. Complexity of information services technology usually increases exponentially with the number of systems involved. Because of this, integrating all these systems can be a daunting and scary task that is never complete. Any piece of code lives in what can be described as a software ecosystem that is always in a state of flux. Like in nature, certain ecosystems evolve extremely fast where others change very slowly over time. However, like in nature all ICT systems change. What is needed is another wave of commodification in the area of data integration and business intelligence in general. This is where Pentaho comes in.
Pentaho tries to provide answers to these problems by making the integration software available as open source, accessible, easy to use, and easy to maintain for users and developers alike. Every release of our software we try to make things easier, better, and faster. However, even if things can be done with nice user interfaces, there are still a huge amount of possibilities and options to choose from.
As the founder of the project I've always liked the fact that Kettle users had a lot of choice. Choice translates into creativity, and creativity often delivers good solutions that are comfortable to the person implementing them. However, this choice can be daunting to any beginning Kettle developer. With thousands of options to choose from, it can be very hard to get started.
This is above all others the reason why I'm very happy to see this book come to life. It will be a great and indispensable help for everyone that is taking steps into the wonderful world of data integration with Kettle. As such, I hope you see this book as an open invitation to get started with Kettle in the wonderful world of data integration.
Matt Casters
Chief Data Integration at Pentaho
Kettle founder