Channel: Data Preparation & Blending discussions

Cleansing / Mapping using Soundex?


Hi,

 

This is my first look at Alteryx! I have a large CSV file of over a million lines. One of the fields is an item ID of very poor quality, with several different issues:

 

  • leading zeros
  • special characters within the item ID
  • item description added to the item ID
  • unit of measure added to the item ID
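To make the first two issues concrete, here is a minimal sketch in Python (not Alteryx) of the kind of normalization I have in mind. The function name `normalize_item_id` is hypothetical, and it assumes that IDs are case-insensitive and that every non-alphanumeric character is noise, which may not hold for every source. It does not try to remove appended descriptions or units of measure, since those would need rules specific to the data.

```python
import re


def normalize_item_id(raw: str) -> str:
    """Hypothetical helper: normalize one raw item ID string.

    Assumptions (not confirmed by the data): IDs are case-insensitive
    and any character that is not a letter or digit is noise.
    """
    # Drop everything except ASCII letters and digits, then upper-case.
    cleaned = re.sub(r"[^A-Za-z0-9]", "", raw).upper()
    # Remove leading zeros, but keep a single "0" if nothing else is left.
    stripped = cleaned.lstrip("0")
    return stripped or "0"


if __name__ == "__main__":
    for sample in ["000123-45", " 00ab/12 ", "000"]:
        print(repr(sample), "->", normalize_item_id(sample))
```

With this sketch, `000123-45` and `12345` would end up as the same key, which is the behaviour I am after for leading zeros and special characters.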

 

Can anyone help with creating a final, clean ID field?

 

The first issue is that I don’t know which ID is the correct one: I’m getting data from multiple sources on a weekly basis, so a master lookup would need to be updated constantly. Is it possible to group the IDs using Soundex as a starting point, to see the most likely matches?
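To show what I mean by grouping on Soundex, here is a rough Python sketch of the standard American Soundex code plus a grouping step. It is only an illustration under stated assumptions, not Alteryx functionality: the names `soundex` and `group_by_soundex` are hypothetical, and the sample values are made up. Note that Soundex encodes letters only, so digits in an ID are ignored.

```python
from collections import defaultdict

# Letter-to-digit table for American Soundex; vowels, H, W and Y have no code.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(text: str) -> str:
    """Return the 4-character Soundex code, or "" if text has no A-Z letters."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (result + "000")[:4]


def group_by_soundex(values):
    """Group raw values by their Soundex code (hypothetical helper)."""
    groups = defaultdict(list)
    for value in values:
        groups[soundex(value)].append(value)
    return dict(groups)


if __name__ == "__main__":
    print(group_by_soundex(["Robert", "Rupert", "AB-123", "AB-999", "12345"]))
```

One caveat I can already see: because digits are skipped, `AB-123` and `AB-999` get the same code and a purely numeric ID gets an empty code, so this kind of grouping may only be useful for the description part that has leaked into the ID field.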

 

Any help or advice is really appreciated!

 

Many thanks

