Further to my previous entry about migrating my sites to a new hosting provider, I have now resurrected the “Project Dedupe” wiki: http://dedupe.tickett.net

I started the project in 2006 while I was working on a data migration and trying to dedupe a large dataset. From the SourceForge description:

“The project will be looking at data (the intention is to begin looking at customer name/address data but this may widen over time) and ways to intelligently detect duplicates using fuzzy matching methods and algorithms.”
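
To give a flavour of what fuzzy matching on name/address data can look like, here is a minimal Python sketch. It is not code from the project: the function names (`normalize`, `similarity`, `find_likely_duplicates`), the 0.85 threshold and the sample records are all made up for illustration, and it uses the standard library's `difflib.SequenceMatcher` as just one of many possible similarity measures.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(record):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in record.lower()
    )
    return " ".join(cleaned.split())


def similarity(a, b):
    """Return a similarity score between 0.0 and 1.0 for two records."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_likely_duplicates(records, threshold=0.85):
    """Compare every pair of records and return those scoring at or above
    the (arbitrary, illustrative) threshold as (index_a, index_b, score)."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((i, j, score))
    return pairs


# Hypothetical sample data, not taken from any real dataset.
records = [
    "John Smith, 12 High St.",
    "john smith 12 high st",
    "Alice Jones, 4 Park Road",
]

for i, j, score in find_likely_duplicates(records):
    print(f"{records[i]!r} ~ {records[j]!r} (score {score:.2f})")
```

Comparing every pair is fine for a handful of records but scales quadratically, so a large dataset would need some way of narrowing down the candidate pairs first (for example, only comparing records that share a postcode).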


I have no immediate plans to resume work on the project because of other commitments, but it’s out there for anyone who wants to get involved.