<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Blogging about all things SAS &#187; Dataflux</title>
	<atom:link href="http://blog.saasinct.com/tag/dataflux/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.saasinct.com</link>
	<description>::       Sharing with the world everything we discover about SAS.</description>
	<lastBuildDate>Wed, 01 Sep 2010 06:56:10 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Jigging with Dataflux</title>
		<link>http://blog.saasinct.com/2009/12/17/jigging-with-dataflux/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=jigging-with-dataflux</link>
		<comments>http://blog.saasinct.com/2009/12/17/jigging-with-dataflux/#comments</comments>
		<pubDate>Thu, 17 Dec 2009 09:03:30 +0000</pubDate>
		<dc:creator>Shane Gibson</dc:creator>
				<category><![CDATA[Dataflux]]></category>
		<category><![CDATA[SAS Data Quality]]></category>
		<category><![CDATA[SAS Dataflux]]></category>

		<guid isPermaLink="false">http://blog.sasinct.com/?p=222</guid>
		<description><![CDATA[I have read a few &#8220;how to&#8221; and &#8220;case study&#8221; books on Data Warehousing over the last few years, and they all pretty much state that if the quality of your data is rubbish, then the success of your Data Warehouse will be limited. However, it is often difficult to get an organisation to rectify all [...]]]></description>
			<content:encoded><![CDATA[<p>I have read a few &#8220;how to&#8221; and &#8220;case study&#8221; books on Data Warehousing over the last few years, and they all pretty much state that if the quality of your data is rubbish, then the success of your Data Warehouse will be limited.</p>
<p>However, it is often difficult to get an organisation to rectify all its Data Quality issues before it embarks on delivering reports and information to the business users who need them.</p>
<p>One of the interesting sessions at next year&#8217;s SAS Users Conference in New Zealand is by Zeeman van der Merwe, who is talking about the work he is doing at ACC.  I had the pleasure of meeting Zeeman a while ago and talking to him about his project, and he is definitely taking the recommended approach of sponsorship from the top, covering areas such as Data Governance, Data Stewardship and Data Quality reporting.</p>
<p>One of the Data Warehouse projects we are working on has a sister project dealing with Data Quality.  It is fair to say that we have yet to get the organisation to fully understand the impact data quality has on the business and the necessity of rectifying the issues.  Everybody does of course agree there are a lot of issues with the quality of the data, which is a good start.</p>
<p>I always remember, from my presales days at SAS, the words customers invariably uttered: &#8220;yes we have major data quality issues&#8221;, shortly followed by &#8220;but we don&#8217;t have any money to pay to fix them&#8221;.</p>
<p>Anyway, on this project we are lucky enough to have SAS Enterprise Data Integration Server at our beck and call, and so have the ability to use Dataflux on the Warehouse data.  So we have done a number of tactical Data Discovery and Data Validation pieces of work.</p>
<p>So far we have completed:</p>
<ul>
<li>Validation of Phone Numbers</li>
<li>Validation of Addresses</li>
<li>Customer/Person matching</li>
</ul>
<p>The Phone Number validation was the first one we did. We picked it because it was a discrete piece of work we could time-box while we worked through the process of using Dataflux.  We are now looking to close the loop by loading the augmented phone number data Dataflux produced back into the source system, and changing the business rules in the source system to rectify some data entry issues we identified.</p>
<p>I really recommend the idea of picking something small to start out with.</p>
<p>We are now looking at how we productionise the Data Quality routines into our standard Data Warehouse load and reporting processes.  So far the options (in 9.1.3) look like:</p>
<ul>
<li>Purchase the full use Dataflux Integration Server</li>
<li>Schedule Dataflux routines to run on a PC</li>
<li>Manually run the jobs</li>
<li>Rewrite the Dataflux jobs in SAS DI Studio</li>
</ul>
<p>An interesting thing to note is that in SAS 9.2 the Dataflux Integration Server component is bundled in eDI, so you can just deploy the Dataflux Architect jobs and run them in your Warehouse&#8217;s standard process flows.</p>
<p>We still haven&#8217;t decided which option will work best, but we think it is going to be the DI Studio option in the interim, as consistency and stability of loads is one of our major focuses.</p>
<p>I have to say I love Dataflux and all that it does (I even believe the Dataflux team now have a stronger presence in the development of SAS Data Integration Server under the &#8220;Project Unity&#8221; banner).</p>
<p>I note that Dataflux jumped to the top of the Gartner Magic Quadrant in 2009.  I always struggle to find these reports when I need them, so here are the Data Quality ones for 2008 and 2009.</p>
<p><a href="http://www.broadstreetdata.com/images/pdf/DI-DQ/Magic-Quadrant-for-Data-Quality-Tools.pdf" target="_blank">Gartner 2008 Data Quality Magic Quadrant</a></p>
<p><a href="http://www.gartner.com/technology/media-products/reprints/dataflux/167657.html" target="_blank">Gartner 2009 Data Quality Magic Quadrant</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.saasinct.com/2009/12/17/jigging-with-dataflux/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Definition of Metadata</title>
		<link>http://blog.saasinct.com/2008/02/15/definition-of-metadata/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=definition-of-metadata</link>
		<comments>http://blog.saasinct.com/2008/02/15/definition-of-metadata/#comments</comments>
		<pubDate>Fri, 15 Feb 2008 07:40:16 +0000</pubDate>
		<dc:creator>Shane Gibson</dc:creator>
				<category><![CDATA[All Things Metadata]]></category>
		<category><![CDATA[Dataflux]]></category>
		<category><![CDATA[Metadata]]></category>

		<guid isPermaLink="false">http://blog.sasinct.com/2008/02/15/definition-of-metadata/</guid>
		<description><![CDATA[Great post by Mike Ferguson called Data Management Needs A Solid Foundation &#8211; The World of Metadata over on the Dataflux Blog I agree with him when he says that he struggled with the definition of Metadata as &#8220;data about data&#8221;, so I really like his version of: &#8220;When it comes to data, does the [...]]]></description>
			<content:encoded><![CDATA[<p>Great post by <a href="http://www.dataflux.com/blog/archives/author/mikeferguson/" target="_blank" title="Posts by Mike Ferguson">Mike Ferguson</a> called <a href="http://www.dataflux.com/blog/archives/2008/01/29/data-management-needs-a-solid-foundation-the-world-of-metadata/" target="_blank" title="Permanent Link to Data Management Needs A Solid Foundation - The World of Metadata" rel="bookmark">Data Management Needs A Solid Foundation &#8211; The World of Metadata</a> over on the <a href="http://www.dataflux.com/blog/" target="_blank" title="Dataflux Blog">Dataflux Blog</a>.</p>
<p>I agree with him when he says that he struggled with the definition of Metadata as &#8220;data about data&#8221;, so I really like his version of:</p>
<p><a href="http://www.dataflux.com/blog/" target="_blank" title="Dataflux Blog">&#8220;When it comes to data, does the user know what it means?”</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.saasinct.com/2008/02/15/definition-of-metadata/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What is Dataflux?</title>
		<link>http://blog.saasinct.com/2008/02/14/14/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=14</link>
		<comments>http://blog.saasinct.com/2008/02/14/14/#comments</comments>
		<pubDate>Thu, 14 Feb 2008 06:45:53 +0000</pubDate>
		<dc:creator>Shane Gibson</dc:creator>
				<category><![CDATA[SAS Products]]></category>
		<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Dataflux]]></category>

		<guid isPermaLink="false">http://blog.sasinct.com/2008/02/14/14/</guid>
		<description><![CDATA[With a hat tip to Joyce Norris-Montanari there is a great blog post over at Victor Fehlberg’s Tech Postings called What is DataFlux? Victor walks us through cleaning up address data using Dataflux, with screenshots and all! I am always amazed at how many companies recognise they have a data quality [...]]]></description>
			<content:encoded><![CDATA[<p><font face="Arial" size="2"><span class="428215123-13022008">With a hat tip to <a href="http://www.dataflux.com/blog/archives/author/joyce-norris-montanari/" title="Posts by Joyce Norris-Montanari"><font face="Times New Roman" size="3">Joyce Norris-Montanari</font></a> there is a great blog post over at <a href="http://fehlberg.wordpress.com/"><font color="#80904f">Victor Fehlberg’s Tech Postings</font></a> called <a href="http://fehlberg.wordpress.com/2008/01/12/what-is-dataflux/" title="What is Dataflux?">What is DataFlux?</a> </span></font></p>
<p><font face="Arial" size="2"><span class="428215123-13022008">Victor walks us through cleaning up address data using Dataflux, with screenshots and all!</span></font></p>
<p><font face="Arial" size="2"><span class="428215123-13022008">I am always amazed at how many companies recognise they have a data quality issue, and then relegate it to a &#8216;spare time&#8217; project, rather than a priority.</span></font></p>
<p><font face="Arial" size="2"><span class="428215123-13022008">As they say, &#8220;you can&#8217;t manage what you don&#8217;t measure&#8221;. Just as important is my own: &#8220;any decision based on crap data is likely to be a crap decision&#8221;.</span></font></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.saasinct.com/2008/02/14/14/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
