Posted by Shane Gibson on September 21, 2008
Interesting post over at blogs.sas.com pointing to an article by Bob Garfield that dispels as urban legend the Data Mining anecdote that putting beer and nappies together increases beer sales due to the fact that Dad’s pop in to buy nappies on the way home from work and pick up some beer as well.
Must admit I have used that line a few times in the past, but it does highlight the idea that Data Mining can identify differences or similarities that would not normally be pciked up via simple data exploraton techniques
A while ago I was luck enough to be involved in a Data Mining exercise for a large supermarket chain, and a few things came out of the analysis that we probably wouldn’t have picked up manually:
- One of the stores had a high proportion of shopping carts with meat in them, the other was primarily vegetarian
- Banana’s were the major linkage product for all carts in all stores
- There was a certain customer segment that regularly brought cat food, chocolate and red wine
(I still think naming market segments is a fun task!)
- House brands vs non house brands were dependent on a certain percentage price difference
- People would regularly buy house brands and non house brands in the same cart
So would changing the layout of the store based on some of these preferences increase sales?
Good question…
All I know is that this insight took a few days of effort, it probably would have taken months to do it manually (and that is assuming we would find these results).
Thats the power of Data Mining!
Posted by Shane Gibson on September 5, 2008
Working with a customer where we have decided to utilise the current Oracle Data Warehouse as much as possible as the source of data for reporting, given the significant investment in the data warehouse over the last few years.
Rather than extract the Oracle data into SAS we were testing using SAS Information Maps and Web Report Studio over the top of the Oracle Star Schema’s. I have to say I was impressed with the performance of the Web Report Studio reports given the number of layerswe were traversing, i.e
Web Report Studio > Information Map > SAS Workspace Server > Proc Sql >SAS Access to Oracle > Oracle (and back again).
Anyway one issue we found was that we couldn’t apply SAS Date Formats against the Oracle date fields, if we did it would just show ******* instead of a date.
We could use a datepart function to substring the date field and that worked ok, but that could only be applied in the Information Map, not against the Metadata definition which means other tools such as SAS OLAP Cubes and Enterprise Guide etc wouldn’t pick it up.
We tried changing the Metadata definition for datefields from DATETIME20. to say DATE9. , but then our friends ****** would appear. Even worse if we reapplied the DATETIME20. format to the metadata definition it set all the dates to 1960. We had to delete the metadata definition and reimport it to get the correct dates again.
The solution was in the “Usage Note 6413: SASDATEFMT option can be defined globally in the config file or as an environment variable with SAS/ACCESS Interface to Oracle” here:
http://support.sas.com/kb/6/413.html
Although it talks about SAS 8 we applied the following statement to the sasv9.cfg file:
-set SASDATEFMT date
Rebooted the workspace server, deleted and re-imported the Oracle Table Metadata and wallah the Oracle date fields were know registered as Date. and aplpying date formats in the Metadata definition, Information Maps or Enterprise Guide worked the way they should!
As an aside in working through the options we worked out that thwre is a magic number that if you take it away from the numeric Oracle data value you get the numeric SAS date value, whihc makes sense I suppose given they both juts start counting days to manage dates, they just start counting from different points.