Our technology might be on the cutting edge, but we envision a world where data is routinely analyzed using more powerful methods than simple pie-chart reporting and executive dashboards.
What kind of data? Survey data, marketing data, sales data, inventory data, employee data, engineering data, social data, Salesforce data, eBay data – any type of data!
At Data Applied, we use Google Analytics to monitor incoming traffic to our website. We then crunch this data using our own product to extract more meaningful information than we could using Google’s UI alone. For example, we use clustering to automatically categorize visits into groups based on characteristics such as visit duration, page views, or location. We use association rule mining to identify hidden associations between visit time, keywords, and network names. We perform outlier detection to get a list of visits that may be out of the ordinary. Finally, we use our own super pivots to better visualize this information. In fact, if you have already created a free account, you should notice a new (anonymized) clickstream dataset that we uploaded to the demo workspace.
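To make two of these steps concrete, here is a minimal sketch of outlier detection followed by clustering on clickstream-style data. It is not how our product is implemented: the visit values are made up for illustration, and a simple z-score threshold and plain k-means stand in for the real algorithms. All function names (`standardize`, `flag_outliers`, `kmeans`) are hypothetical.

```python
from statistics import mean, pstdev

# Made-up visits for illustration: (duration in seconds, page views).
visits = [
    (30, 1), (45, 2), (40, 1), (35, 2),      # short visits
    (300, 7), (320, 8), (280, 6), (310, 7),  # long visits
    (5000, 60),                              # suspiciously heavy visit
]

def standardize(rows):
    """Rescale each column to zero mean and unit (population) variance."""
    columns = list(zip(*rows))
    stats = [(mean(col), pstdev(col) or 1.0) for col in columns]
    return [tuple((v - m) / s for v, (m, s) in zip(row, stats)) for row in rows]

def flag_outliers(rows, threshold=2.0):
    """Return indices of rows where any standardized value exceeds the threshold."""
    return [
        i for i, row in enumerate(standardize(rows))
        if any(abs(z) > threshold for z in row)
    ]

def kmeans(rows, k, iterations=20):
    """Plain k-means; initial centroids are evenly spaced rows (an assumption,
    chosen only to keep the example deterministic)."""
    centroids = [rows[i * len(rows) // k] for i in range(k)]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assign each row to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])))
            for row in rows
        ]
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        for c in range(k):
            members = [row for row, label in zip(rows, labels) if label == c]
            if members:
                centroids[c] = tuple(mean(col) for col in zip(*members))
    return labels

outliers = flag_outliers(visits)
normal = [v for i, v in enumerate(visits) if i not in outliers]
labels = kmeans(standardize(normal), k=2)

print("outlier indices:", outliers)
print("cluster labels:", labels)
```

Standardizing the columns first keeps visit duration, which has a much larger numeric range, from dominating the distance calculation. Removing the flagged visit before clustering prevents a single extreme row from pulling a centroid toward itself.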
Regarding network names, there is a kind of asymmetric information warfare at play here. Because we are a small startup, we do not have the luxury of maintaining a private network. This means that when we connect to any website, we appear under a generic network name (e.g., “Comcast customer”). The same, however, does not apply to large Business Intelligence companies when they pay us a visit. Here is some summary clickstream data on recent visits to our website. We’re publishing this information because we can, or more precisely because we find it interesting that we can see who visits us, while our visitors cannot do the same. But of course, we’d still like to say thank you for paying us a visit!
|sas institute inc.||42||6.666666667||294.7142857|
PS: we reviewed the Google Analytics terms of service to make sure it is OK to publish this type of information.
Update: for some unknown reason, we have since received many more visits (over 700 from Microsoft).
Perhaps you haven’t yet heard of Dr. Sandro Saitta. He not only works on grid computing and analytics for FinScore, but also runs a successful data mining blog (http://www.dataminingblog.com) in his spare time. Here is his profile if you want to know more. We’re fans of his blog because it contains a lot of practical advice, including book recommendations.
Sandro recently invited us to write a guest post, so we obliged. You’ll have to follow this link for more details, but in this post we discuss the broken promises of data mining and how the community should respond. Let us know what you think!
We just launched and received some great press! Two disappointments, however:
- We wrote to the local press (TechFlash, Xconomy), but they chose to ignore us
- We’re #1 for “data applied” on Bing.com and Yahoo.com, but not yet on Google.com
Here are some of the highlights…
ReadWriteWeb + New York Times:
It’s pretty hard to beat this combination in terms of audience! As ReadWriteWeb explains on their web site, they are among the top 10 blogs, while the New York Times is the largest metropolitan newspaper in the US. It’s awesome that the journalist (Marshall Kirkpatrick) was in a good mood, but also that he immediately understood what we were trying to do. These guys are also super efficient in terms of turnaround.
Here is an article we published on the Seattle 2.0 website:
So You Want to Process Credit Cards?
The entire industry is a mess. We hope this information will save a fellow entrepreneur some time (and money). Check out the excellent links from the Seattle startup community in the comments as well.