Hi, Below is my data
Clr doc_# | Clearing_Date | Invoice_Amount |
85212889 | 20-Oct-14 | -205.34 |
85212889 | 20-Oct-14 | -205.34 |
482032929 | 5-Nov-14 | 319.81 |
482032929 | 5-Nov-14 | -319.81 |
835181114 | 3-Nov-14 | 331.65 |
257495 | 18-Dec-14 | -10475.70 |
257495 | 18-Dec-14 | 10475.70 |
257495 | 18-Dec-14 | 10475.70 |
343898098 | 19-Dec-14 | 2000.00 |
343898098 | 19-Dec-14 | -1000.00 |
343898098 | 19-Dec-14 | -1000.00 |
343898098 | 19-Dec-14 | 500.00 |
2370 | 9-Dec-14 | 11132.00 |
2370 | 9-Dec-14 | 20.00 |
3308 | 8-Oct-14 | 2000.00 |
3308 | 8-Oct-14 | -1000.00 |
3308 | 8-Oct-14 | -1000.00 |
1234 | 8-Oct-14 | -2000.00 |
1234 | 8-Oct-14 | 1000.00 |
1234 | 8-Oct-14 | 1000.00 |
Expected Output:
Clr doc_# | Clearing_Date | Invoice_Amount | Manual flag |
85212889 | 20-Oct-14 | -205.34 | I |
85212889 | 20-Oct-14 | -205.34 | I |
835181114 | 3-Nov-14 | 331.65 | I |
257495 | 18-Dec-14 | 10475.70 | I |
343898098 | 19-Dec-14 | 500.00 | I |
2370 | 9-Dec-14 | 11132.00 | I |
2370 | 9-Dec-14 | 20.00 | I |
Logic is groupby doc_num and date: eliminate those records where sum of amount of n rows for each doc# is zero
for Example: for doc_num 257495: we have 3 records but if we sum up invoice amount (-10475.70)+(10475.70)+(10475.70) = 10475.70, so instead of 3 records, I should get only one record that has data like below and create a column as Manual flag that has I=Include, E= Exclude
257495 18-Dec-14 10475.70 I