Quantcast
Channel: Data Preparation & Blending discussions
Viewing all articles
Browse latest Browse all 4999

Classify each customer based on purchasing history

$
0
0

Hello all,

 

First off I’m sorry about posting my second question on such a short time but I can’t seem to find a solution to my problem and if anyone can give me some hints I would be truly thankful.

 

I have a very large dataset with Client ID’s, date and Items they purchased. It looks like this:

 

Client CodeDateItem Purchased
7788ii201207A
7788ii201208A
7788ii201209A
7788ii201210A
7788ii201211 
7788ii201212 
7788ii201301 
7788ii201302 
7788ii201303 
21easad201207 
21easad201208 
21easad201209B
21easad201210B
21easad201211B
21easad201212B
21easad201301A
21easad201302A
21easad201303A
upoo9201207C
upoo9201208C
upoo9201209C
upoo9201210 
upoo9201211 
upoo9201212 
upoo9201301 
upoo9201302C
upoo9201303C
4r32rrer201207 
4r32rrer201208 
4r32rrer201209B
4r32rrer201210B
4r32rrer201211B
4r32rrer201212C
4r32rrer201301 
4r32rrer201302 
4r32rrer201303 

 

 

Now I would like to create a new column with a classification for each customer per date. There are five possible classifications:

 

  • New Customer: When a customer purchases for the first time. In order to be classified as such, the "Item Purchased" column of this specific customer must be empty before the date he made the first purchase.
  • Re-Buying: This is basically a recurring customer. This classification should appear when a customer purchases the same item as the previous month.
  • Lost customer: When a customer stops buying. This classification should appear when the customer stops buying (when the following cells are empty).
  • Switch: When a customer stops buying one product but immediately starts buying a different product.
  • Regained Customer: When a customer that had stopped purchasing before (and had had the “Lost customer” classification) starts purchasing again.

 

An example of the output I’m trying to achieve:

 

Client CodeDateItem PurchasedClassification
7788ii201207ARe-Buying
7788ii201208ARe-Buying
7788ii201209ARe-Buying
7788ii201210ARe-Buying
7788ii201211 Lost customer
7788ii201212  
7788ii201301  
7788ii201302  
7788ii201303  
21easad201207  
21easad201208  
21easad201209BNew Customer
21easad201210BRe-Buying
21easad201211BRe-Buying
21easad201212BRe-Buying
21easad201301ASwitch
21easad201302ARe-Buying
21easad201303ARe-Buying
upoo9201207  
upoo9201208CNew Customer
upoo9201209CRe-Buying
upoo9201210 Lost Customer
upoo9201211  
upoo9201212  
upoo9201301  
upoo9201302CRegained Customer
upoo9201303CRe-Buying
4r32rrer201207  
4r32rrer201208  
4r32rrer201209BNew Customer
4r32rrer201210BRe-Buying
4r32rrer201211BRe-Buying
4r32rrer201212CSwitch
4r32rrer201301 Lost Customer
4r32rrer201302  
4r32rrer201303  

 

Is it possible to create a function or use a tool to classify automatically in a large dataset? I’ve been trying to use both Multi Row and Multi Field formulas to write a nested IF function that can fit all of those conditions, but I can’t seem to specify the correct cells or even write the function correctly. If anyone knows how to do this, I would really appreciate some help!

 

Thank you.


Viewing all articles
Browse latest Browse all 4999

Trending Articles