Wednesday, 8 February 2012

Using Geo-Spatial Awareness to Get That Extra Edge Out of Predictive Analytics

The way to get that extra edge out of an analysis is to get your hands on the key drivers, transform them wisely and exploit the correlations. Data mining tools are very good at the first steps for most types of data. However, two main gaps still await a proper answer: temporal correlation and spatial correlation. An experienced statistician can bridge these gaps with clever data manipulation and by returning to the good old sas-stat & sas-ets for advanced modelling approaches such as nlmixed and arima.

However, it is important to be able to clean and transform spatial information such as the location of practices a sales rep has visited, the geo-demographic profile of the practice catchment, the regulatory environment for this practice, or the influence of the nearest hospitals and the specialists working in them. Sas has very elementary tools to handle mappable information, such as kriging, point-in-polygon and map rendering procedures. However, it feels like sas did not push the development of this aspect of analytics very hard, especially after the agreement with ESRI [http://www.esri.com/] (the sas-bridge to ESRI - http://www.sas.com/products/bridgeforesri/). I found an announcement from 2002 - http://www.esri.com/news/arcnews/winter0203articles/sas-and-esri.html.

I got to try out the bridge around 2004 and was bitterly disappointed: it was very clunky and did not give a properly seamless feel. At the time I also experimented with sending queries to MS-sql-server (augmented with the spatial analysis pack) and with writing MapBasic code on the fly within a sas-session, compiling it and calling MapInfo to execute it on data exported from sas to csv (Ha!). The latter is my current preferred mode of work, but it has obvious shortcomings. The unexpected one is that one cannot automate drive time calculations in MapInfo, and boy do I need to do this now.
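For readers who have not met the point-in-polygon operation mentioned above, here is a minimal Python sketch of the idea using the classic ray-casting test. It is an illustration only, not how sas or MapInfo implement it; the function name and the toy catchment coordinates are made up, and it treats coordinates as planar (no map projection handling).

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    `polygon` is a list of (x, y) vertices in order, without repeating
    the first vertex at the end. Points exactly on an edge may fall
    on either side; this sketch does not treat them specially.
    """
    inside = False
    j = len(polygon) - 1  # index of the previous vertex
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Does this edge straddle the horizontal line through y?
        if (yi > y) != (yj > y):
            # x-coordinate where the edge crosses that line
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside  # one more crossing to the right
        j = i
    return inside


# Hypothetical square catchment and two practice locations
catchment = [(0, 0), (4, 0), (4, 4), (0, 4)]
practice_inside = point_in_polygon(2, 2, catchment)
practice_outside = point_in_polygon(5, 2, catchment)
```

A point is inside when a ray drawn from it to the right crosses the boundary an odd number of times, which is all the loop is counting.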

Blair Freebairn of GeoLytix (http://geolytix.co.uk/) stopped over a few days ago and we had an interesting discussion exploring the need for dynamic interaction between an analytical package such as sas and GIS software such as Arc-View (ESRI). Many of the applications we thought up really only need once-in-a-while processing, such as identifying drive time catchments, joining in Mosaic (geo-demographic - http://www.experian.co.uk/business-strategies/mosaic-uk-2009.html) and aggregating up using appropriate weights. That could be done once a quarter and presented to sas as a csv to augment any analysis mart. Another example is fitting a predictive model once a week and implementing the real-time scoring in the GIS software. However, I can envision a situation where data should go back and forth seamlessly to use the strengths of both sas and a GIS platform effectively, not just for reporting purposes. Please feel free to share your experience and thoughts in the comments stream.
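To make the quarterly batch idea a bit more concrete, here is a minimal Python sketch of the "aggregating up using appropriate weights" step, ending in a csv that an analysis mart could pick up. Everything specific in it is an assumption for illustration: the column names (catchment_id, pct_group_a, population), the choice of population as the weight, and the toy numbers. A real job would take its rows from the GIS export.

```python
import csv
import io
from collections import defaultdict


def aggregate_to_catchment(rows, value_field, weight_field):
    """Weighted mean of value_field per catchment_id."""
    totals = defaultdict(float)
    weights = defaultdict(float)
    for row in rows:
        w = float(row[weight_field])
        totals[row["catchment_id"]] += w * float(row[value_field])
        weights[row["catchment_id"]] += w
    # Skip catchments whose total weight is zero to avoid dividing by zero
    return {c: totals[c] / weights[c] for c in totals if weights[c] > 0}


def to_csv(result, value_field):
    """Render the aggregated values as csv text, one row per catchment."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["catchment_id", value_field])
    for catchment_id in sorted(result):
        writer.writerow([catchment_id, f"{result[catchment_id]:.4f}"])
    return buf.getvalue()


# Hypothetical small areas, each already assigned to a drive time catchment
rows = [
    {"catchment_id": "A", "pct_group_a": 0.2, "population": 100},
    {"catchment_id": "A", "pct_group_a": 0.4, "population": 300},
    {"catchment_id": "B", "pct_group_a": 0.5, "population": 50},
]
profile = aggregate_to_catchment(rows, "pct_group_a", "population")
csv_text = to_csv(profile, "pct_group_a")
```

The weight matters: a simple average would give catchment A a share of 0.3, while weighting by population pulls it towards the larger area.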

I hear there is a new version of the Bridge to ESRI. Has anybody out there tried it?
