• Data Profiling & Scorecarding with Informatica Data Quality

    by  • April 30, 2011 • 1 Comment

    In my opinion, profiling and scoring data is a fundamental part of a sound data quality assessment.  I routinely use these processes to build my “current state” report for clients.  I recently used Informatica’s Data Quality developer and analyst tools to put together such a package.  I am of the opinion that these tools represent [...]

    Read more →

    Soundex for String Matching

    by  • April 30, 2011 • 5 Comments

    Soundex is a useful function for performing data matching. While you can use a Soundex function in the process of identifying potential duplicate strings, I don’t recommend it. Here’s why: the algorithm encodes consonants; a vowel is not encoded unless it is the first letter; consonants to the right of a vowel are not coded [...]

    Read more →
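
    For readers who have not seen the algorithm, here is a minimal Python sketch of the commonly described American Soundex rules. It is an illustration only, not the function shipped with any particular data quality tool, and the names `CODES` and `soundex` are made up for this example.

```python
# Letter groups and their digits in the commonly described American Soundex.
CODES = {}
for letters, digit in (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return a four-character Soundex code, or "" if name has no letters."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    encoded = first
    prev = CODES.get(first, "")  # code of the last letter considered
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W are skipped and do not separate equal codes
        code = CODES.get(ch, "")  # vowels (and Y) have no code
        if code and code != prev:
            encoded += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (encoded + "000")[:4]  # pad with zeros, keep letter + 3 digits


if __name__ == "__main__":
    for a, b in (("Robert", "Rupert"), ("Catherine", "Kathryn")):
        print(a, soundex(a), "|", b, soundex(b))
```

    In this sketch, Robert and Rupert share a code even though they are different names, while Catherine and Kathryn get different codes only because the first letter is kept as-is. Both effects matter when the codes are used to flag potential duplicates.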

    Data Quality: where does it belong?

    by  • April 27, 2011 • 3 Comments

      Data Quality is not a technology issue, it’s a business issue. Here is my opinion on why people think it is about technology. Business initiatives like MDM/BI/DQ and the like are being presented, sold on, and driven by technology experts. Information technology has carried business forward to the point where we are the chauffeurs [...]

    Read more →

    Data Quality: to whom does it belong?

    by  • April 27, 2011 • 1 Comment

     How should data ownership be addressed? In my opinion a governance committee is the best option.  There should be at least one, probably two representatives each from the business, from technology and from budgeting.  I’d suggest budgeting be the head of the committee so that solid cost-based decisions can be made.  Business and technology can present their [...]

    Read more →

    Data Cleansing every quarter?

    by  • April 24, 2011 • 1 Comment

    @jschwa1 Data cleansing every 3 months? http://ow.ly/1i0vd - Someones not addressing the right problem! This is a clip from a recent tweet from Julian Schwarzenbach of Data and Process Advantage Limited (DPA).  My response to his tweet was “I can see validity [of quarterly cleansing] esp. if the data is from external sources like customers”.  I can see where [...]

    Read more →

    Data Quality & Cloud-based services

    by  • April 23, 2011 • 1 Comment

    Software as a Service (SaaS) will help proliferate data quality solutions. I agree with this assertion for a few reasons, not the least of which is the ease with which “front-end” data quality solutions will be included in the suite of services in a Service Oriented Architecture (SOA). In my opinion, data quality’s true promise [...]

    Read more →

    Data Quality ROI = Address Validation and Duplication Consolidation

    by  • April 22, 2011 • 1 Comment

      I have had conversations recently with fellow data quality gurus which centered around DQ ROI.  We all know how important it is to tie a DQ initiative to a return on the investment.  This is even more true of an initiative with long-term implementation objectives.  During the course of the conversation I pointed out that [...]

    Read more →

    Data Discovery. The first step toward data management.

    by  • April 20, 2011 • 2 Comments

    Introduction: Recently, on a data discovery project, I observed something that I wanted to share.  Data discovery efforts, and the tools that support them, are well suited for organizations that have had explosive data growth.  With this kind of growth the data landscape expands to the point where in-depth knowledge of data details, and more importantly metadata details, becomes unobtainable.  [...]

    Read more →

    Data Quality Resource

    by  • April 20, 2011 • 0 Comments

      Recently a reader, Richard Ordowich, posted this resource in a comment so I thought I’d pass it along. The most comprehensive list I have seen is in the book Managing Information Quality by Martin Eppler, in which he lists 70 typical information quality criteria that were compiled from various other sources (and referenced). Thanks for taking [...]

    Read more →

    It’s a date!

    by  • April 18, 2011 • 0 Comments

    I’ve started using the date-related functions in the data quality developer tool. I’ve found some fun ways to implement them and wanted to share. Is_Date: Before you use any date function you need to be sure you’re dealing with a date string. The Is_Date function, available in the Expression transform, is how you test [...]

    Read more →
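
    The idea of testing a string before treating it as a date can be sketched outside the tool as well. The Python below is an analogue, not the Expression transform's Is_Date function. The helper name `is_date` and the default `MM/DD/YYYY` format are assumptions made for this example.

```python
from datetime import datetime


def is_date(value, fmt="%m/%d/%Y"):
    """Return True if value parses as a calendar date in the given format."""
    try:
        datetime.strptime(value.strip(), fmt)
    except (ValueError, AttributeError):
        # ValueError: wrong format or an impossible date such as 02/30/2011.
        # AttributeError: value is not a string (for example None).
        return False
    return True


if __name__ == "__main__":
    for candidate in ("04/18/2011", "02/30/2011", "not a date", None):
        print(repr(candidate), is_date(candidate))
```

    Running a check like this first means later date arithmetic or reformatting only ever sees values that are known to parse.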