Hello all,
First off, I’m sorry for posting my second question in such a short time, but I can’t seem to find a solution to my problem, and if anyone can give me some hints I would be truly thankful.
I have a very large dataset with client IDs, dates, and the items each client purchased. It looks like this:
Client Code | Date | Item Purchased |
7788ii | 201207 | A |
7788ii | 201208 | A |
7788ii | 201209 | A |
7788ii | 201210 | A |
7788ii | 201211 | |
7788ii | 201212 | |
7788ii | 201301 | |
7788ii | 201302 | |
7788ii | 201303 | |
21easad | 201207 | |
21easad | 201208 | |
21easad | 201209 | B |
21easad | 201210 | B |
21easad | 201211 | B |
21easad | 201212 | B |
21easad | 201301 | A |
21easad | 201302 | A |
21easad | 201303 | A |
upoo9 | 201207 | C |
upoo9 | 201208 | C |
upoo9 | 201209 | C |
upoo9 | 201210 | |
upoo9 | 201211 | |
upoo9 | 201212 | |
upoo9 | 201301 | |
upoo9 | 201302 | C |
upoo9 | 201303 | C |
4r32rrer | 201207 | |
4r32rrer | 201208 | |
4r32rrer | 201209 | B |
4r32rrer | 201210 | B |
4r32rrer | 201211 | B |
4r32rrer | 201212 | C |
4r32rrer | 201301 | |
4r32rrer | 201302 | |
4r32rrer | 201303 | |
Now I would like to create a new column with a classification for each customer per date. There are five possible classifications:
- New Customer: When a customer purchases for the first time. To be classified as such, the "Item Purchased" column for this specific customer must be empty on every date before their first purchase.
- Re-Buying: This is basically a recurring customer. This classification should appear when a customer purchases the same item as in the previous month.
- Lost Customer: When a customer stops buying. This classification should appear on the first date with an empty "Item Purchased" cell after a purchase (the following cells stay empty).
- Switch: When a customer stops buying one product but immediately starts buying a different product.
- Regained Customer: When a customer who had stopped purchasing before (and had received the “Lost Customer” classification) starts purchasing again.
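To make the rules above concrete, here is a rough sketch of how I read them, written in Python rather than as a Multi Row formula. The names `Row` and `classify_rows` are made up for illustration. The sketch assumes the rows are sorted by client code and then by date, and it assumes that a purchase on a customer's very first row is labelled "Re-Buying" (as in my expected output for 7788ii), since there is no earlier history to check.

```python
from typing import Iterable, Iterator, Optional, Tuple

# (client_code, date as YYYYMM, item purchased or "" when nothing was bought)
Row = Tuple[str, str, str]


def classify_rows(rows: Iterable[Row]) -> Iterator[Tuple[str, str, str, str]]:
    """Yield (client, date, item, classification) for rows sorted by client, date.

    Only the previous row and one running flag are kept, which is roughly
    what a row-by-row formula grouped by client would have access to.
    """
    prev_client: Optional[str] = None
    prev_item: Optional[str] = None   # None = no earlier row for this client
    bought_before = False             # has this client purchased anything so far?

    for client, date, item in rows:
        item = item.strip()
        if client != prev_client:
            # New client: forget the previous client's history.
            prev_item = None
            bought_before = False

        if not item:
            # Empty cell: "Lost Customer" only on the first empty month.
            label = "Lost Customer" if prev_item else ""
        elif prev_item:
            # Bought last month too: same item or a different one.
            label = "Re-Buying" if item == prev_item else "Switch"
        elif bought_before:
            # Buying again after at least one empty month.
            label = "Regained Customer"
        elif prev_item is None:
            # ASSUMPTION: no history at all, treated as "Re-Buying".
            label = "Re-Buying"
        else:
            # Only empty months before this purchase.
            label = "New Customer"

        yield client, date, item, label
        prev_client, prev_item = client, item
        bought_before = bought_before or bool(item)


sample = [
    ("4r32rrer", "201207", ""),
    ("4r32rrer", "201208", ""),
    ("4r32rrer", "201209", "B"),
    ("4r32rrer", "201210", "B"),
    ("4r32rrer", "201211", "B"),
    ("4r32rrer", "201212", "C"),
    ("4r32rrer", "201301", ""),
    ("4r32rrer", "201302", ""),
    ("4r32rrer", "201303", ""),
]

# YYYYMM strings sort chronologically, so a plain sort gives client/date order.
for row in classify_rows(sorted(sample)):
    print(" | ".join(row))
```

For customer 4r32rrer this gives New Customer, Re-Buying, Re-Buying, Switch and Lost Customer in that order, with empty classifications on the other dates, which is what I am hoping to reproduce.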
An example of the output I’m trying to achieve:
Client Code | Date | Item Purchased | Classification |
7788ii | 201207 | A | Re-Buying |
7788ii | 201208 | A | Re-Buying |
7788ii | 201209 | A | Re-Buying |
7788ii | 201210 | A | Re-Buying |
7788ii | 201211 | | Lost Customer |
7788ii | 201212 | | |
7788ii | 201301 | | |
7788ii | 201302 | | |
7788ii | 201303 | | |
21easad | 201207 | | |
21easad | 201208 | | |
21easad | 201209 | B | New Customer |
21easad | 201210 | B | Re-Buying |
21easad | 201211 | B | Re-Buying |
21easad | 201212 | B | Re-Buying |
21easad | 201301 | A | Switch |
21easad | 201302 | A | Re-Buying |
21easad | 201303 | A | Re-Buying |
upoo9 | 201207 | | |
upoo9 | 201208 | C | New Customer |
upoo9 | 201209 | C | Re-Buying |
upoo9 | 201210 | | Lost Customer |
upoo9 | 201211 | | |
upoo9 | 201212 | | |
upoo9 | 201301 | | |
upoo9 | 201302 | C | Regained Customer |
upoo9 | 201303 | C | Re-Buying |
4r32rrer | 201207 | | |
4r32rrer | 201208 | | |
4r32rrer | 201209 | B | New Customer |
4r32rrer | 201210 | B | Re-Buying |
4r32rrer | 201211 | B | Re-Buying |
4r32rrer | 201212 | C | Switch |
4r32rrer | 201301 | | Lost Customer |
4r32rrer | 201302 | | |
4r32rrer | 201303 | | |
Is it possible to create a function or use a tool that applies this classification automatically to a large dataset? I’ve been trying to use both Multi Row and Multi Field formulas to write a nested IF function that covers all of those conditions, but I can’t seem to reference the correct cells or even write the function correctly. If anyone knows how to do this, I would really appreciate some help!
Thank you.