Channel: Data Preparation & Blending discussions
Viewing all 4999 articles
Browse latest View live

Writing an expression to choose one data source over another when both are populated

I am currently building a workspace that takes in data from multiple Access databases and blends it into one file. I am doing this because data is missing from the old files and I need to merge and update the information. My current dilemma is that I have duplicate columns with differing information (blanks, unknowns, inaccuracies, etc.) and I need to merge the two. For example, I have columns titled "Customer ID" and "Revised Customer ID". I need to write an expression that looks at both columns and, if both are populated, returns the "Revised Customer ID" value. In some cases only one of the two is populated, but whenever both fields are filled I need the Revised data. Any help would be greatly appreciated.
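The rule described above can be sketched in Python (not Alteryx expression syntax): take the revised value whenever it is populated, otherwise fall back to the original. The function names and the set of placeholder values treated as "not populated" are assumptions for illustration only.

```python
from typing import Optional

# Hypothetical placeholders treated as "not populated"; adjust to the real data.
BLANK_MARKERS = {"", "unknown"}


def is_populated(value: Optional[str]) -> bool:
    """Return True when the value holds real data (not None, blank, or a placeholder)."""
    return value is not None and value.strip().lower() not in BLANK_MARKERS


def pick_customer_id(original: Optional[str], revised: Optional[str]) -> Optional[str]:
    """Prefer the revised ID when it is populated, otherwise fall back to the original."""
    if is_populated(revised):
        return revised
    if is_populated(original):
        return original
    return None
```

Because the revised value wins whenever it is present, the "both populated" case and the "only revised populated" case need no separate branches.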


Output to Excel: Simultaneous sheets renaming and time stamp to file name

Dear friends, could you please advise me on how to do the following in the Excel output? I want to:

- Have multiple tables output to different sheets, named in a way I choose

- Have the output filename stamped with the date and time (e.g. "Output 20160629 1400.xlsx")

 

To do the first part I use the "Take file/table name from field" option in the Output Data tool (with the standard logic shown in the screenshot). As far as I understand, time stamping is done via the same option. But how do I do both simultaneously?

 

Thanks a lot!
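As a rough illustration of the idea, both pieces can live in one string per row: a time-stamped file name plus the sheet name. The Python sketch below only shows how such a string could be built; the function names are hypothetical, and the separator between file and sheet is left as a parameter because the exact syntax the Output Data tool expects is not stated in this thread.

```python
from datetime import datetime


def stamped_filename(prefix: str, when: datetime, extension: str = ".xlsx") -> str:
    """Build a name such as 'Output 20160629 1400.xlsx' from a prefix and a timestamp."""
    return f"{prefix} {when.strftime('%Y%m%d %H%M')}{extension}"


def output_target(filename: str, sheet: str, separator: str) -> str:
    """Combine file name and sheet name into one field value.

    The separator is an assumption; use whatever the output tool expects.
    """
    return f"{filename}{separator}{sheet}"
```

Computing the timestamp once and reusing it for every row keeps all sheets in the same file.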

 

SheetsNaming.jpg

Custom Join tool which allows for correct multiple field joins

Hi all,

 

I'm looking at creating a custom Join tool but was wondering if it was possible to allow the user to select multiple fields per dataset and ensure that they correctly join?

 

Example:

 

Dataset A Fields: Key, Subkey, Data

Dataset B Fields: Data, Subkey, Key

 

Using a list box and ticking Key and Subkey in both datasets does not work because the field orders are different, so Key_A attempts to pair with Subkey_B and Subkey_A with Key_B. Other than manually reordering the fields, is there a way to ensure that the correct pairings are attempted, whether dynamically or via user input?
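One way to avoid the positional mismatch is to pair the selected fields by name rather than by their order in each list. The Python sketch below illustrates that idea on plain dictionaries; it is not a macro implementation, and `pair_by_name`, `join_on_pairs`, and the `A_`/`B_` prefixes are hypothetical names.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, object]


def pair_by_name(left_fields: Sequence[str], right_fields: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair selected fields by matching name instead of list position."""
    right_set = set(right_fields)
    return [(name, name) for name in left_fields if name in right_set]


def join_on_pairs(left: List[Row], right: List[Row], pairs: Sequence[Tuple[str, str]]) -> List[Row]:
    """Inner-join two row lists on explicit (left_field, right_field) pairs."""
    index: Dict[tuple, List[Row]] = {}
    for right_row in right:
        key = tuple(right_row[right_field] for _, right_field in pairs)
        index.setdefault(key, []).append(right_row)

    joined: List[Row] = []
    for left_row in left:
        key = tuple(left_row[left_field] for left_field, _ in pairs)
        for right_row in index.get(key, []):
            merged = {f"A_{name}": value for name, value in left_row.items()}
            merged.update({f"B_{name}": value for name, value in right_row.items()})
            joined.append(merged)
    return joined
```

If the field names differ between datasets, the pairs would have to come from user input instead of `pair_by_name`.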

 

Thanks in advance,

 

Timur_O

Load gdb or shp file in alteryx

If I load a shp file with the Input Data tool in Alteryx, I get the error message below. Could anyone tell me what I can do to solve this?

 

Error: Input Data (76): No available conversion between projections.
Source is:
PROJCS["WGS_1984_Web_Mercator_Auxiliary_Sphere",GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.017453292519943295]],PROJECTION["Mercator_Auxiliary_Sphere"],PARAMETER["False_Easting",0.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",0.0],PARAMETER["Standard_Parallel_1",0.0],PARAMETER["Auxiliary_Sphere_Type",0.0],UNIT["Meter",1.0]]
Target is:
GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]]
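For reference, the source projection in the message is spherical Web Mercator and the target is plain WGS 84 longitude/latitude. The Python sketch below shows the standard inverse formula between the two, using the sphere radius from the WKT above. Converting coordinates outside Alteryx is only an assumed workaround, not a confirmed fix for this error.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius, taken from the SPHEROID entry in the WKT


def web_mercator_to_wgs84(x: float, y: float) -> tuple:
    """Convert Web Mercator metres (x, y) to WGS 84 degrees (longitude, latitude)."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2.0 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2.0)
    return lon, lat
```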

CrossTab does not support renaming of duplicate fields.

I am getting this error and I'm not exactly sure what it means. I am trying to pull my data together, so I am using a Basic Data Profile tool and then going straight into a Cross Tab tool. What I was expecting is that each row would hold one column's worth of data. The error appears to occur when I try to group in my Cross Tab.

 

 

Date conversion from excel

Hi Friends,

 

I extract an Excel sheet from an application in the format below. I need to convert the Date02 field to a readable format (ISO or US date format) using Alteryx. I would appreciate your help in solving this issue.

 

S.No  Name    Date01      Date02
1     Holly   2009-09-30  40757
2     Peter   2009-08-25  40940
3     Sanson  2009-07-17  41306
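The Date02 values look like Excel serial day numbers. A minimal Python sketch of the conversion follows, assuming the workbook uses Excel's default 1900 date system (day 0 is 1899-12-30) and that the values are whole days; the function name is hypothetical.

```python
from datetime import date, timedelta

EXCEL_EPOCH = date(1899, 12, 30)  # day 0 in Excel's 1900 date system (assumed)


def excel_serial_to_date(serial: int) -> date:
    """Convert an Excel serial day number to a calendar date."""
    return EXCEL_EPOCH + timedelta(days=serial)


# The three Date02 values from the table above.
for serial in (40757, 40940, 41306):
    print(serial, excel_serial_to_date(serial).isoformat())
```

Under that assumption, 40757 corresponds to 2011-08-02. The same "add N days to 1899-12-30" logic can be expressed with any date-add function.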

Unable to do a Date parse for a Null Column

Hi Friends,

 

I have a scenario with some null values in the date field. When I try to parse it, I get error messages. Please help me parse date columns that also contain null values.

 

S.No  Name      Date
1     Harper    2002-04-09
2     Stringer
3     Martin    2008-08-08
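The usual pattern is to test for a null or blank value first and only parse when something is there. A small Python sketch of that guard, with a hypothetical function name and the date format taken from the sample rows:

```python
from datetime import date, datetime
from typing import Optional


def parse_date_or_none(text: Optional[str], fmt: str = "%Y-%m-%d") -> Optional[date]:
    """Parse a date string, passing null or blank values through as None."""
    if text is None or not text.strip():
        return None  # nothing to parse, keep the null
    return datetime.strptime(text.strip(), fmt).date()
```

In an expression language, the same guard would be an if-null-then-null-else-parse conditional.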

FINDER - workflow to search all your workflows for keywords

Hi, I finally got around to making my 'search all your workflows' tool public facing.

 

The problem:

I create a lot of workflows and often need to find the workflows that created a specific yxdb or csv file. Or I need to find all workflows that use a specific tool, like the Download Tool (e.g. download.download), so I can remember how I set it up for an API and reuse it in another service.

 

The Need:

Crawl all my directories with yxmd/yxmc files and then search the file name and file XML for keywords.

 

The Solution:

This workflow takes the paths you specify and creates a list (via a .bat file, i.e. Windows only) of all your yxmd/yxmc files and paths, then runs through a batch macro to look for the keyword you are searching for and returns the file name and path so you can copy, open, and get on with your business.

 

There are 2 inputs:

  1. The Paths - the path to save data is where your list of files will be stored. This was added for another feature that failed. This must be a real directory.
  2. The Search Term - NOTE: this uses the contain function so it is broad match.
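For readers who want the same idea outside Alteryx, the crawl-then-search steps can be sketched in a few lines of Python. This is not the workflow described here, only an illustration of the technique; the function names are hypothetical, and the matching is a case-insensitive "contains" test, in the spirit of the broad match noted above.

```python
import os
from typing import Dict, Iterator

WORKFLOW_EXTENSIONS = (".yxmd", ".yxmc")


def find_workflows(root: str) -> Iterator[str]:
    """Yield the path of every yxmd/yxmc file under the root directory."""
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(WORKFLOW_EXTENSIONS):
                yield os.path.join(folder, name)


def search_workflows(root: str, keyword: str) -> Dict[str, int]:
    """Map each matching workflow path to the keyword count in its XML.

    A file is included when the keyword occurs in its contents or in its
    file name (the count can be 0 for a file-name-only match).
    """
    needle = keyword.lower()
    hits: Dict[str, int] = {}
    for path in find_workflows(root):
        with open(path, encoding="utf-8", errors="ignore") as handle:
            count = handle.read().lower().count(needle)
        if count or needle in os.path.basename(path).lower():
            hits[path] = count
    return hits
```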

 

Capture.PNG

Capture.PNG

 

Once you enter your information, RUN to see your results by:

  1. Keyword in File Name (excluding path)
  2. Keyword in Files, broken out by yxmd or yxmc, with counts of occurrences of the keyword in each file. (Side note: I discovered text files stored in my workflows, which makes the files large. Just search for <r> and you will see some very large numbers if you manage to store data in your workflows.)

Capture.PNG

 

YXMD example - e.g. the keyword shopify was also in some of the files not listed above.

Capture.PNG

 

Anyway, I hope this helps someone, as this was more brain damage than I had intended (my local hacked version worked great for me). I had hoped to build in a cache file for quicker turnaround on a second search term, but that is when I discovered some of the massive line counts, 8MM total, in the XML for things like <r>, so I just left the process to run through the batch macro every time instead of trying to use a cache.

 

Anthony


Alteryx and Excel style countifs/sumifs with dates criteria

Hey, I have a simple problem as I want to use countifs and sumifs functions in Alteryx. I have two datasets and I need to find out how many times a certain corporation ID appears in other dataset in a given time period.

Dataset1 is like this:

Column1=CorporationID, Column2= date1, Column3=date2.

 

Dataset2 is like this:

Column1=CorporationID,Column2=date3,Column3=price. 

 

Now I first want to know how many times corporationID in dataset1 appears in dataset2 based on a condition that date3 falls between date1 and date2.

Second, I want to know what is the sum of "price" based on corporationID and certain daterange. I attached an excel and I would like to reproduce those results in Alteryx. 

Excel has both example datasets.
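The countifs/sumifs logic amounts to: for each row of dataset1, keep the dataset2 rows with the same CorporationID whose date3 lies between date1 and date2, then count them and sum their price. A Python sketch of that, assuming inclusive date bounds and hypothetical type and function names:

```python
from datetime import date
from typing import List, Tuple

Period = Tuple[str, date, date]  # (CorporationID, date1, date2) from dataset1
Sale = Tuple[str, date, float]   # (CorporationID, date3, price) from dataset2


def count_and_sum(periods: List[Period], sales: List[Sale]) -> List[Tuple[str, int, float]]:
    """For each period, count matching sales and sum their prices (bounds inclusive)."""
    results = []
    for corp_id, start, end in periods:
        prices = [
            price
            for sale_corp, sale_date, price in sales
            if sale_corp == corp_id and start <= sale_date <= end
        ]
        results.append((corp_id, len(prices), sum(prices)))
    return results
```

In workflow terms this corresponds to joining on CorporationID, filtering on the date condition, and then summarizing with a count and a sum.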

 

 

"Last" function for Summarize In-DB

Hi, and thank you for your help.

 

I have been trying to find the "Last" function in the Summarize In-DB tool to no avail. The "Last" aggregation in the normal Summarize tool works fine, but in Summarize In-DB it disappears. All my data fields are of V_String type. Somehow the only available functions are Group, Count(s), Min, Max and some disabled Numeric functions.

 

And when I try to change it in the XML instead, this error occurs:  Summarize In-DB XML: Unknown action "Last"

 

Please help me if you have any solution. Again thank you very much.

Distinguish Qualitative / Quantitative variable

Hello!

 

I would like to know if there is a procedure that can separate qualitative and quantitative variables.

I have one hundred variables and I want to pick out the quantitative ones to do a PCA.
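One simple heuristic is to call a variable quantitative when every non-missing value can be read as a number. The Python sketch below illustrates that test on a table held as a dictionary of columns; the function names are hypothetical, and the heuristic itself is an assumption (numeric codes such as postal codes would be classed as quantitative and need manual review).

```python
from typing import Dict, Iterable, List, Tuple


def is_quantitative(values: Iterable[object]) -> bool:
    """Return True when every non-missing value parses as a number."""
    seen_value = False
    for value in values:
        if value is None or str(value).strip() == "":
            continue  # ignore missing entries
        try:
            float(str(value))
        except ValueError:
            return False
        seen_value = True
    return seen_value  # a column with no values at all is not treated as numeric


def split_variables(table: Dict[str, List[object]]) -> Tuple[List[str], List[str]]:
    """Split column names into (quantitative, qualitative) lists."""
    quantitative = [name for name, values in table.items() if is_quantitative(values)]
    qualitative = [name for name in table if name not in quantitative]
    return quantitative, qualitative
```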

 

 

Thanks for your help !

 

Best regards.

 

 

Need expression that will indicate whether two similar columns are both populated

I need to write an expression that will look at two columns of data and tell me whether both are populated or only one is populated and the other is blank. I assume I will use the Formula or Filter tool. The two fields are titled Customer Segment and Revised Customer Segment. Thanks
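The check comes down to testing each field for "has a value" and combining the two results. A Python sketch of that logic follows; the function names and the four status labels are hypothetical, and "populated" is assumed to mean neither null nor blank.

```python
from typing import Optional


def has_value(value: Optional[str]) -> bool:
    """Return True when the field is neither null nor blank."""
    return value is not None and value.strip() != ""


def population_status(segment: Optional[str], revised_segment: Optional[str]) -> str:
    """Describe which of the two fields are populated."""
    original_filled = has_value(segment)
    revised_filled = has_value(revised_segment)
    if original_filled and revised_filled:
        return "both"
    if original_filled:
        return "original only"
    if revised_filled:
        return "revised only"
    return "neither"
```

A flag column like this can then drive a filter that keeps only the "both" rows.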

Multi-field <if-then>

Greetings:

 

I'm coming from an R background, where I do all my data cleaning and preparation, and I am trying to find my legs with Alteryx. I'm a single day into the experience, so my apologies for a very basic question.  I have used a dynamic selection to grab 9 unique fields of interest. These fields all contain a specific value that denotes a missing response.  The value for each of these fields is 9999.  I want to apply a formula to all these fields that replaces 9999 with a null value.  It seems that I need a "multi-field" formula using a conditional if-then statement.

 

The formula has the following pattern:

 

IF c THEN t ELSE f ENDIF

 

I'm not sure how to specify the condition (9999) for all the variables!  Seems straightforward if it is a single field, but I am not seeing the specification for the multi-field option.  Any guidance would be greatly appreciated!
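Conceptually, the multi-field version applies the same condition to whichever field is currently being processed: if the current field equals 9999 then null, else the current field. The Python sketch below shows that logic over a list of rows; the function and field names are hypothetical, and this is not Alteryx syntax.

```python
from typing import Dict, Iterable, List, Optional

Row = Dict[str, Optional[int]]


def null_out_sentinel(rows: List[Row], fields: Iterable[str], sentinel: int = 9999) -> List[Row]:
    """Replace the sentinel with None in the selected fields only."""
    targets = set(fields)
    return [
        {
            name: (None if name in targets and value == sentinel else value)
            for name, value in row.items()
        }
        for row in rows
    ]
```

Fields that are not selected keep their values, even if they happen to contain 9999.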

 

Brian

 

 

Data generation : Generate set months

Hey guys,

 

Below is what my data looks like:

id mo_id  c1 c2 c3

1 201401 3 4 5
1 201402 5 2 5
1 201403 1 5 1
1 201404 5 2 7
2 201401 4 2 5
2 201402 5 1 4
2 201403 4 1 4

 

What I am trying to do is generate null values for the rest of the months in the year. For example, we have data through April for id 1. I want to generate equal grids for all the ids, so in short I want 24 rows for ids 1 and 2.
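One way to read this is: build the full grid of every id crossed with the 12 months of the year, then fill in the existing values and leave the rest null. The Python sketch below follows that reading, assuming mo_id is an integer in YYYYMM form and 12 rows per id (24 in total for two ids); the function name is hypothetical.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[int]]


def fill_months(rows: List[Row], year: int) -> List[Row]:
    """Return one row per id and month of the year, with None where data is missing."""
    if not rows:
        return []
    existing = {(row["id"], row["mo_id"]): row for row in rows}
    ids = sorted({row["id"] for row in rows})
    value_fields = [name for name in rows[0] if name not in ("id", "mo_id")]

    grid: List[Row] = []
    for id_value in ids:
        for month in range(1, 13):
            mo_id = year * 100 + month
            row = existing.get((id_value, mo_id))
            if row is None:
                # No data for this month: emit a row of nulls.
                row = {"id": id_value, "mo_id": mo_id}
                row.update({name: None for name in value_fields})
            grid.append(row)
    return grid
```

In workflow terms this is a cross join of distinct ids with a generated month list, followed by a left join back to the data.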

 

Thanks in advance

Remove the last Row while processing

Hi Friends,

 

I have the scenario below. The last record needs to be removed since it is a summary (total) of the Value field. I need to load the data into a database where all the fields are declared as key fields and nulls are not allowed. Please help me solve this issue.

 

S.No  Name   Value  Country
1     Mike   200    USA
2     Jason  300    USA
             500
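Since the summary row is the only one with empty key fields, one option is to drop every row in which a key field is null or blank, rather than relying on row position. A Python sketch of that filter, with a hypothetical function name and the field names from the sample table:

```python
from typing import Dict, Iterable, List, Optional

Row = Dict[str, Optional[str]]


def drop_incomplete_rows(rows: List[Row], key_fields: Iterable[str]) -> List[Row]:
    """Keep only rows where every key field is populated (not None and not blank)."""
    keys = list(key_fields)
    return [
        row
        for row in rows
        if all(row.get(name) is not None and str(row.get(name)).strip() != "" for name in keys)
    ]
```

If the total row can only ever be the last one, simply skipping the final record works as well. The filter above is an alternative that does not depend on row order.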

Using Regex to parse before/after varying sets of digits

I have a set of addresses I am trying to parse and am having some difficulty because the number of words and digits I am looking for varies. Some addresses even have other names before them, so I was wondering if anyone who knows Regex better than I do could tell me whether there is a way to eliminate all words that come before a set of 1 to 4 digits ({1,4}).

 

For example, some addresses say "ABC Company 1234 MAIN ST", and I would like to eliminate the company name in a way that works with different company names and address numbers.
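A possible pattern is a lazy match of everything up to the first standalone run of 1 to 4 digits, keeping the rest. The Python sketch below shows it with the `re` module; the pattern and function name are suggestions only, and a company name that itself contains a standalone number (e.g. "Studio 54 ...") would be cut at that number instead.

```python
import re

# Lazily skip leading text, then capture from the first standalone 1-4 digit number onward.
LEADING_TEXT = re.compile(r"^.*?\b(\d{1,4}\b.*)$")


def strip_before_number(address: str) -> str:
    """Remove everything before the first standalone 1-4 digit number, if there is one."""
    match = LEADING_TEXT.match(address)
    return match.group(1) if match else address
```

The word boundaries (`\b`) keep digits embedded in a word, such as the 3 in "3M", from being treated as the house number.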

 

Appreciate the help!

 

Matching transactions that occur within a 30 day window

Hello everyone,

 

I am currently trying to match transactions for the same product (the dataset has thousands of different products) that occurred within a 30-day window. Picture the following example:

 

Transaction 1

Product A

30 April 2015

 

Transaction 2

Product A

1 May 2015

 

Transaction 3

Product A

1 August 2015

 

Here, only Transactions 1 and 2 would be relevant. In other words, I assume I need to filter out transactions that had no other transaction for the same product in the 30 days before or after them. Is this the best way to approach the problem?
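The approach described, keeping a transaction only if the same product has another transaction within 30 days before or after it, can be sketched in Python as below. Once each product's transactions are sorted by date, only the immediate neighbours need checking. The tuple layout and function name are assumptions for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta
from typing import Dict, List, Tuple

Transaction = Tuple[int, str, date]  # (transaction id, product, transaction date)


def transactions_with_neighbour(transactions: List[Transaction], window_days: int = 30) -> List[Transaction]:
    """Keep transactions that have another one for the same product within the window."""
    by_product: Dict[str, List[Transaction]] = defaultdict(list)
    for transaction in transactions:
        by_product[transaction[1]].append(transaction)

    window = timedelta(days=window_days)
    kept: List[Transaction] = []
    for group in by_product.values():
        group.sort(key=lambda item: item[2])  # order by date within the product
        for i, item in enumerate(group):
            near_previous = i > 0 and item[2] - group[i - 1][2] <= window
            near_next = i + 1 < len(group) and group[i + 1][2] - item[2] <= window
            if near_previous or near_next:
                kept.append(item)
    return kept
```

The same neighbour comparison maps onto a sort by product and date followed by a row-to-row comparison grouped by product.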

 

Thank you for your help!

 

Best

Output association analysis matrix to a file

Greetings:

 

I successfully put together a correlation matrix using the Association Analysis tool.  My interest is to export the matrix into a standalone file so I can create my own heatmap in Tableau.  However, neither of the output options from the Association Analysis allow me to create this matrix -- most likely a user error (I'm a noob).  Any suggestions would be greatly appreciated!

 

Brian

Turn fields into different columns

Hello,

 

I have a dataset with the following characteristics:

 

Row #1: Transaction A - Location New York

Row #2: Transaction B - Location Seattle

Row #3: Transaction A - Location Washington

Row #4: Transaction B - Location Detroit

...

 

I would like my dataset to look as follows:

 

Row #1 - Transaction A - Location 1 New York - Location 2 Washington

Row #2 - Transaction B - Location 1 Seattle - Location 2 Detroit

 

Basically, I would like to move from two columns (Transaction and Location), to three columns (Transaction, Location 1 and Location 2). What is the easiest way to do this?
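The reshaping can be thought of as: number the locations within each transaction (1, 2, ...) and then turn those numbers into column names. A Python sketch of that idea, with a hypothetical function name and the column labels from the example:

```python
from typing import Dict, List, Tuple


def spread_locations(rows: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Turn (transaction, location) rows into one record per transaction."""
    grouped: Dict[str, List[str]] = {}
    for transaction, location in rows:
        grouped.setdefault(transaction, []).append(location)

    records: List[Dict[str, str]] = []
    for transaction, locations in grouped.items():
        record = {"Transaction": transaction}
        # Number locations in the order they appeared for this transaction.
        for number, location in enumerate(locations, start=1):
            record[f"Location {number}"] = location
        records.append(record)
    return records
```

In workflow terms, the numbering step is a running count per transaction and the second step is a cross tab on that count.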

 

Thank you in advance for your help!

 

Best

Assigning a value based on column pattern

Hi,

 

I have an employee biometric dataset which has a record for every In and Out of the employee.

There are times when the system does not capture one of the entries, so I might end up with an In-In or Out-Out consecutively, or miss a record for the employee.

Something like this:

 

Employee ID  In/Out  Time
123          In      06-05-2016 00:00
123          Out     06-05-2016 16:00
123          In      07-05-2016 00:05
124          Out     06-05-2016 13:50
124          In      06-05-2016 14:00
124          Out     06-05-2016 20:00
124          Out     06-05-2016 22:00

 

Now I want to have the corresponding In and Out together in one row. Before that, I plan to number the pairs with the same number so that I can join them based on Employee ID and the pair number.

 

Employee ID  In/Out  Time              Pattern Number
123          In      06-05-2016 00:00  1
123          Out     06-05-2016 16:00  1
123          In      07-05-2016 00:05  2
124          Out     06-05-2016 13:50  3
124          In      06-05-2016 14:00  4
124          Out     06-05-2016 20:00  4
124          Out     06-05-2016 22:00  5

 

so that when I join it, I would get this 

 

Employee ID  In Flag  Time In           Pattern Number  Time Out          Out Flag
123          In       06-05-2016 00:00  1               06-05-2016 16:00  Out
123          In       07-05-2016 00:05  2
124                                     3               06-05-2016 13:50  Out
124          In       06-05-2016 14:00  4               06-05-2016 20:00  Out
124                                     5               06-05-2016 22:00  Out

 

The logic, I think, is to increment a global variable by 1 for every different employee ID and, for a particular employee ID, check the next row's flag: if it is different from the current row's, assign it the same pattern number as the current row, then increment the number for row+2.

 

I have been trying out quite a few ways to achieve this:

1)Batch Macro

2)Multi-row formula

Also, it's sorted by Employee ID and Time.

 

I don't know if I'm missing something really silly, but I can't get the Pattern Number to increment properly.

 

Does anyone have a better way to solve this in Alteryx? Please let me know.
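One rule that reproduces the pattern numbers in the example: a row reuses the previous row's number only when it is an Out that directly follows an In for the same employee, and every other row starts a new number. The Python sketch below implements that rule, assuming the records are already sorted by Employee ID and Time. The function name and tuple layout are hypothetical, and this is a sketch of the logic rather than a multi-row formula.

```python
from typing import List, Optional, Tuple

Record = Tuple[int, str, str]  # (employee id, "In" or "Out", timestamp text)


def assign_pattern_numbers(records: List[Record]) -> List[int]:
    """Number In/Out pairs; records must already be sorted by employee and time."""
    numbers: List[int] = []
    counter = 0
    previous: Optional[Tuple[int, str]] = None
    for employee_id, flag, _time in records:
        closes_pair = (
            previous is not None
            and previous[0] == employee_id
            and previous[1] == "In"
            and flag == "Out"
        )
        if not closes_pair:
            counter += 1  # unmatched In, unmatched Out, or new employee: new number
        numbers.append(counter)
        previous = (employee_id, flag)
    return numbers
```

Because the rule only looks one row back, it should translate to a single row-to-row comparison without a batch macro.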

 
