Affordable data mining now!
One thing is for sure: data mining is expensive. For example, this study shows that the average initial annual cost of an SPSS deployment is $342,061 (including $153,500 in licensing fees). Also, this study shows that SAS licensing fees alone come to $123,669 per year (for a quad-core machine, excluding training and consulting fees). The cost of deploying data mining solutions remains extremely high, and we think large companies tolerate it because of the resulting ROI: using predictive analytics to achieve a 0.5% decrease in churn rates (or a 0.5% increase in campaign response) may yield savings high enough to quickly offset the initial investment. That said, does it make sense to pay this much for data mining? Licensing models that charge more as the number of processors grows, for example, seem a bit ridiculous at a time when even laptops come with dual or quad cores.
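To make the break-even argument above a little more concrete, here is a rough sketch of the arithmetic. Only the $342,061 first-year cost comes from the SPSS study cited above. Everything else is an assumption made for illustration: the 0.5% is read as 0.5 percentage points of the customer base retained per year, and the $600 annual revenue per customer is a made-up figure, as are the function names.

```python
# Back-of-the-envelope break-even sketch (illustrative only).
# FIRST_YEAR_COST is the SPSS figure quoted in the text; the rest are assumptions.

FIRST_YEAR_COST = 342_061   # USD, average initial annual cost from the cited study
CHURN_REDUCTION = 0.005     # assumption: 0.5 percentage points of customers retained


def retained_revenue(customers: int, revenue_per_customer: float) -> float:
    """Annual revenue kept because fewer customers churn."""
    return customers * CHURN_REDUCTION * revenue_per_customer


def break_even_customers(revenue_per_customer: float) -> float:
    """Customer base at which retained revenue equals the first-year cost."""
    return FIRST_YEAR_COST / (CHURN_REDUCTION * revenue_per_customer)


if __name__ == "__main__":
    revenue = 600.0  # hypothetical annual revenue per customer, in USD
    print(f"Break-even customer base: {break_even_customers(revenue):,.0f}")
    print(f"Retained revenue at 1M customers: ${retained_revenue(1_000_000, revenue):,.0f}")
```

Under these assumed numbers, a company needs a customer base in the low hundreds of thousands before the deployment pays for itself in the first year, which is consistent with the idea that it is mostly large companies who tolerate these prices.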
1. Same algorithms, different prices:
The reality is that cheaper alternatives are available. For example, MineQuest offers both SAS and WPS consulting services. WPS is essentially a SAS clone that replicates SAS functions and the SAS language, which means that most scripts written for SAS can be executed by WPS at roughly one tenth of the cost. Another example: our data mining platform implements the same scalable algorithms as other products, often with some improvements. SPSS’s two-step clustering algorithm, for instance, is nothing more than a variant of the BIRCH algorithm, published in 1996. While we agree that implementing this algorithm properly is no piece of cake, we offer a robust, improved implementation at a fraction of the cost (the same goes for decision trees, outlier detection, association rule mining, etc.). So why are the big players still standing?
2. Entrenched at the top:
There are several reasons why SAS / SPSS shops aren’t moving to cheaper alternatives. As I learned from a brilliant Data Mining architect from Microsoft, one key factor is budgets. Managers typically exert an influence proportional to their budget, and unspent budgets usually result in budget reductions the following year. So most managers’ instinct is to spend their budget on maintenance fees, even when cheaper alternatives exist. Also, given the price difference, shifting to a cheaper alternative would be like falling off a cliff: someone would be bound to question the manager’s initial decision to acquire such expensive software. Finally, I suspect most SAS consultants are happy to perpetuate the status quo because it favors their skills and expertise.
3. A shifting landscape:
In our opinion, the only way to “put a ding in the universe” is to offer ubiquitous data mining at ridiculously low cost. This will require a combination of things: a/ high-quality implementations of data mining algorithms, b/ support for low-cost storage solutions (e.g. MySQL, SQL Server), c/ on-demand data mining, either cloud-based or on-premise, and d/ a rich user interface that brings data mining techniques within reach of savvy business analysts. We’re not quite there yet, but we’re getting closer every day. What do you think?