by Harold Burt-Gerrans
Welcome to Part 5. As promised in Part 4, I’ll start by discussing recursive de-duplication.
I can’t count the number of times that clients have complained about X.400/X.500 addresses in emails. Unfortunately, if the collected data comes with those address structures rather than the familiar name@domain form, we’re stuck with using them. Relativity and Ringtail have both introduced “People” or “Entity” type data structures (I’m sure other review tools have as well), but I think these structures are not yet used to their full potential.
Part of the de-duplication process should be to recursively substitute alias values in place of address strings so that multiple copies of the same email can be matched even when one copy has addresses formatted differently from the other copies.
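To make the idea concrete, here is a minimal Python sketch of recursive alias substitution feeding a de-duplication hash. Everything in it is an assumption for illustration: the `ALIASES` table, the sample addresses, the `resolve_alias` and `dedup_key` helpers, and the choice of fields that go into the hash are all hypothetical, and none of it reflects how Relativity, Ringtail, or any other review tool actually implements its “People” or “Entity” structures.

```python
import hashlib

# Hypothetical alias table: each raw address string points at a preferred alias.
# Keys are stored lower-cased so lookups are case-insensitive.
ALIASES = {
    "/o=acme/ou=exchange/cn=recipients/cn=jsmith": "jsmith@acme.example",
    "jsmith@acme.example": "John Smith",
    "john.smith@acme.example": "John Smith",
}


def resolve_alias(address, aliases, _seen=None):
    """Recursively follow alias links until no further substitution applies."""
    key = address.strip().lower()
    seen = set() if _seen is None else _seen
    if key in seen:
        # A loop in the alias table has no well-defined answer, so fail loudly.
        raise ValueError(f"alias cycle detected at {key!r}")
    seen.add(key)
    target = aliases.get(key)
    if target is None:
        return key  # nothing left to substitute: this is the final value
    return resolve_alias(target, aliases, seen)


def dedup_key(email, aliases):
    """Hash an email after replacing every address with its resolved alias.

    Assumption: sender, recipients, subject, and body are the hashed fields,
    and recipient order is ignored. Real tools may use different fields.
    """
    sender = resolve_alias(email["from"], aliases)
    recipients = sorted({resolve_alias(a, aliases) for a in email["to"]})
    payload = "\n".join(
        [sender, ";".join(recipients), email["subject"], email["body"]]
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Two copies of the same message, collected with different address formats.
copy_a = {
    "from": "/O=ACME/OU=EXCHANGE/CN=RECIPIENTS/CN=JSMITH",
    "to": ["mary@other.example"],
    "subject": "Q3 forecast",
    "body": "Numbers attached.",
}
copy_b = {
    "from": "john.smith@acme.example",
    "to": ["mary@other.example"],
    "subject": "Q3 forecast",
    "body": "Numbers attached.",
}

same = dedup_key(copy_a, ALIASES) == dedup_key(copy_b, ALIASES)
```

In this sketch the X.500-style sender resolves in two hops (address string, then SMTP address, then person), while the second copy resolves in one, so both copies end up with the same key and `same` is true. Without the substitution step the two sender strings differ, and the copies would hash as distinct documents.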