Data Mining in the Real World

Yesterday evening I went to a talk on `Data Mining in the Real World' given by Nicholas Radcliffe of Quadstone Ltd. It was organised by the Edinburgh Group of the Royal Statistical Society, meeting in the International Centre for Mathematical Sciences (a rather attractive Georgian property in the New Town and the birthplace of James Clerk Maxwell, but I digress). The flyer for the talk read:

Business is spending billions of pounds on acquiring the physical capability to offer different experiences to different customers - the key goal of Customer Relationship Management (CRM). It has now started to realise that deciding which experiences to offer which customers is not simple, and that statistical approaches can improve the effectiveness of CRM initiatives. This talk will discuss what happens when business meets analysis.

Firms implement CRM by performing `data mining' on the large databases of customer information that they have amassed. Most of the talk was a series of anecdotes about how data mining exercises had enabled firms to improve significantly either their business processes or the services that they offered to customers. It was not obvious that the approaches adopted had much in common with those needed for the sort of problems encountered in astronomy. The main interest seemed to be in finding correlations and clusters rather than outliers. For the sort of problems encountered in business the very simplest techniques for finding patterns, almost always some variation of linear regression, are usually adequate; using a more sophisticated technique rarely gives a significant improvement. However, scientific-style visualisation techniques are useful (though currently under-used in business) and, used sensibly, can yield enormous insight.

Towards the end of the talk the speaker noted that `data mining is just another name for statistics' and that many problems could be successfully tackled using basic statistics, applied properly and coupled with a deep understanding of the problem domain. I have some sympathy with both observations. All told, it was an interesting and often amusing talk, but it was not obvious that the data mining techniques used in CRM had much immediate relevance to astronomy.

Clive Davenhall,
13/3/02.