Tuesday, 2 March 2010

Piwik – an open source web analytics tool

Piwik is a project that has gained a lot of momentum over the last year. Its objective is to offer a valuable open source alternative to Google Analytics. It is supported by a very active community of developers who issue releases at a frantic pace. At the time of writing the current release is 0.5.4, and release 1.0 is already forecast for the second half of 2010.

What is Piwik?

Piwik is a clickstream web analytics tool. It relies on PHP and MySQL, just like many other open source projects such as Joomla and WordPress. As with WordPress, you can install the application on your web server in a matter of minutes. Like Google Analytics, Piwik tracks pages with JavaScript tags that you place in your web pages.

The JavaScript tags can be installed manually on your pages or, if you are using a CMS (content management system; Joomla and WordPress are good examples), you can install the tags on your site via a module or a plug-in, making the whole process extremely quick and user friendly.


The application architecture is quite interesting. I see Piwik as a data-gathering platform to which you can add features by installing plug-ins (a bit like you add functionality to Firefox by installing plug-ins). If you need a feature that the standard plug-ins do not offer and you are able to write PHP applications, you might want to develop your own plug-ins.

What are the advantages of Piwik over GA?

1 With Piwik you are in total control of your data

The data is stored in a MySQL database residing on your server and will never fall into the hands of a third party. If you read the Google Analytics terms of service closely, you will find things that many companies might find disturbing, like:

6. INFORMATION RIGHTS AND PUBLICITY. Google and its wholly owned subsidiaries may retain and use, subject to the terms of its Privacy Policy (located at http://www.google.com/privacy.html, or such other URL as Google may provide from time to time), information collected in Your use of the Service. Google will not share information associated with You or your Site with any third parties unless Google (i) has Your consent; (ii) concludes that it is required by law or has a good faith belief that access, preservation or disclosure of such information is reasonably necessary to protect the rights, property or safety of Google, its users or the public; or (iii) provides such information in certain limited circumstances to third parties to carry out tasks on Google's behalf (e.g., billing or data storage) with strict restrictions that prevent the data from being used or shared except as directed by Google. When this is done, it is subject to agreements that oblige those parties to process such information only on Google's instructions and in compliance with this Agreement and appropriate confidentiality and security measures.

In practice this clause states that Google has the right to use your analytics data. This is fine for many small companies or individuals, but for large companies, or companies where analytics data is highly confidential, this clause will be a showstopper for Google Analytics.

2 Piwik is real time

With Piwik, analytics data is available on the spot. Once a visitor finishes loading a page (i.e. the JavaScript tag is executed) the data is already available for reports. With Google Analytics you have to wait several hours at best (sometimes more than 24 hours) before you can see changes in your reports.

3 You can develop new features

As mentioned before, if you are a developer or if your organization has PHP developers, you can extend the tool's functionality yourself. The framework is designed for easy development and you will be surprised how easy it is to develop plug-ins.

4 Data export features

You can export all your statistics with a set of open APIs that enable the user to extract and export data in many different and advanced ways.
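
As a rough sketch of what such an export might look like, the Python snippet below builds a URL for Piwik's HTTP reporting API without sending any request. The host name, site id and token are placeholders, and the parameter names (module, method, idSite, period, date, format, token_auth) reflect my understanding of the API rather than anything verified against a specific release, so check them against the documentation of the version you install.

```python
from urllib.parse import urlencode


def build_report_url(base_url, method, site_id, period, date, token, fmt="JSON"):
    """Return a reporting API URL for one report (no network call is made).

    Assumption: the API is served from index.php with module=API and the
    parameter names used below; verify against your Piwik release.
    """
    params = {
        "module": "API",
        "method": method,        # e.g. "VisitsSummary.get" (assumed report name)
        "idSite": site_id,       # numeric id of the tracked web site
        "period": period,        # e.g. "day", "week", "month"
        "date": date,            # e.g. "today" or "2010-03-02"
        "format": fmt,           # e.g. "JSON", "XML", "CSV"
        "token_auth": token,     # placeholder for your own API token
    }
    return base_url.rstrip("/") + "/index.php?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical host and token, for illustration only.
    print(build_report_url("http://piwik.example.com", "VisitsSummary.get",
                           1, "day", "today", "YOUR_TOKEN"))
```

The resulting URL can be opened in a browser or fetched from a script, which is what makes it easy to feed the figures into a spreadsheet or another reporting tool.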

Disadvantages of Piwik

1 Lack of ecommerce and advanced segmentation features

Piwik is still a tool in its infancy and lacks several advanced features, such as advanced segmentation and funnel analysis. This doesn’t mean that these features won’t be present in the future, but if you need ecommerce features NOW, this won’t be the tool for you.

2 Size of databases and load on servers

I haven’t been able to make reliable measurements so far, but I have the feeling that if you receive a medium to high number of daily visits (more than 5000 visits per day, for example), this might have an impact on your web server's performance. The database could potentially grow very large, even though you could place the database on another server (different from your web server).

Conclusions

Comparing the two products is quite difficult, especially because Piwik and Google Analytics approach web site analysis in very different ways. At this moment Piwik is not a replacement for Google Analytics, especially if you rely heavily on AdWords campaigns (due to the tight AdWords/Google Analytics integration) or if you are dealing with ecommerce. While it is a promising product (and I am sure that in time it will become a fierce competitor for GA), at present it lacks sophisticated visitor segmentation features (or plug-ins). Even though it is possible to develop your own plug-ins, many GA-like features are not available off the shelf.

I noticed a systematic difference in the measured number of visits, with Piwik figures between 5% and 7% higher than GA. Differences like this are normal between different tools, so if you are planning to run Google Analytics and Piwik side by side you should expect this kind of discrepancy.
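
If you want to track this gap on your own site, a minimal sketch of the calculation follows. The function name and the visit counts are made up for illustration; they are not measurements from my tests.

```python
def relative_difference(piwik_visits, ga_visits):
    """Percentage by which the Piwik visit count exceeds the GA count."""
    if ga_visits <= 0:
        raise ValueError("ga_visits must be positive")
    return (piwik_visits - ga_visits) * 100.0 / ga_visits


if __name__ == "__main__":
    # Hypothetical daily figures, for illustration only.
    print(relative_difference(1060, 1000))
```

A positive result means Piwik counted more visits than GA for the same period, a negative one means it counted fewer.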

On the whole Piwik needs to mature a little bit more before becoming appealing to most Web Analytics practitioners. I am really eager to see the evolution of this tool and I am sure that we will have pleasant surprises in the near future.