SOLVED

Are there limitations or things to avoid when using delimiters?


Level 1

We are receiving values and want to separate them using a delimiter. We've found that a colon isn't suitable, nor is a single pipe. Other single characters like semicolon, comma and exclamation mark aren't suitable either, because they already appear in some of the values.

 

Could a double pipe be OK? I've not seen it in any existing data, but wanted to check in case there is a technical reason to avoid it.


1 Accepted Solution


Correct answer by
Community Advisor

Sometimes finding the right delimiter can be challenging, since it relies so much on what data is being collected.

 

One trick that I have used in the past is to sanitize the data being collected. For instance, if the data we're collecting doesn't have to keep exactly the same format, I will strip the character that I need to use as a delimiter out of the values.

 

For instance, let's say I want to use colon as my delimiter, but my values are:

  • "value"
  • "value: something"
  • "another example with my offending character of :"

I might convert these to:

  • "value"
  • "value something"
  • "another example with my offending character of "

This just removes the character entirely:

"value:value something:another example with my offending character of " (now I know the only : are my delimiters, but technically I've lost a little data)
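As a rough sketch of the strip-out approach in Python (the function name `strip_and_join` is made up for illustration, and in practice this logic would live wherever the combined value gets built):

```python
DELIMITER = ":"  # the delimiter used in the example above


def strip_and_join(values, delimiter=DELIMITER):
    """Remove every occurrence of the delimiter from each value, then join.

    This is lossy: the stripped characters cannot be recovered later.
    """
    cleaned = [value.replace(delimiter, "") for value in values]
    return delimiter.join(cleaned)


values = [
    "value",
    "value: something",
    "another example with my offending character of :",
]

combined = strip_and_join(values)
print(combined)

# Splitting now gives back exactly one part per original value.
parts = combined.split(DELIMITER)
print(len(parts))
```

The trade-off is the one mentioned above: the split is reliable, but the original colons are gone for good.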

 

or I might get fancy and do something like:

  • "value"
  • "value[colon] something"
  • "another example with my offending character of [colon]"

Then use classifications to swap "[colon]" back to the original ":" in the text.

 

"value:value[colon] something:another example with my offending character of [colon]"

(again, ensuring that the only ":" are my delimiters, and then after the data is split out, I put the original characters back into the individual values)
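Here is a minimal sketch of that round trip in Python. The `encode` and `decode` names are hypothetical, and `decode` only stands in for the restore step that would be done with classifications. It also assumes the literal text "[colon]" never occurs naturally in the data.

```python
DELIMITER = ":"
PLACEHOLDER = "[colon]"  # assumed never to appear in the raw values


def encode(values):
    """Swap each delimiter inside a value for the placeholder, then join."""
    return DELIMITER.join(v.replace(DELIMITER, PLACEHOLDER) for v in values)


def decode(combined):
    """Split on the delimiter, then restore the original character."""
    return [part.replace(PLACEHOLDER, DELIMITER) for part in combined.split(DELIMITER)]


values = [
    "value",
    "value: something",
    "another example with my offending character of :",
]

combined = encode(values)
print(combined)

restored = decode(combined)
print(restored == values)
```

If a raw value could ever contain "[colon]" itself, the restore step would turn it into ":" by mistake, so pick a placeholder that is as unlikely as possible.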


3 Replies


Level 1

We are getting the data from an API from a Firmographic vendor. They wanted us to have an eVar for each value, but there are about 15 values and we didn't want to waste the eVars.

 

Double pipe, it transpires, doesn't work for us because we also use Alteryx, and Alteryx can't handle splitting on double delimiters.

We push our data into a datamart daily via the FTP daily feed. All these values sit in 3 eVars, and within each eVar the values are separated by a delimiter. Currently that is ":", but since it also appears in some names in the data, it caused some issues.

Today I was advised that one thing we could use is "^", because it doesn't exist at all in the data.
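One way to double-check a candidate like "^" before committing to it is to scan a sample of the values for it. A small Python sketch follows; the sample values and the function names are invented for illustration and are not from our actual feed.

```python
def delimiter_is_safe(values, candidate):
    """Return True if the candidate delimiter appears in none of the values."""
    return not any(candidate in value for value in values)


def first_safe_delimiter(values, candidates):
    """Return the first candidate that is absent from every value, or None."""
    for candidate in candidates:
        if delimiter_is_safe(values, candidate):
            return candidate
    return None


# Hypothetical sample of firmographic-style values.
sample_values = ["Acme: Holdings", "Smith, Jones & Co", "Widgets!"]

choice = first_safe_delimiter(sample_values, [":", ",", "!", "^"])
print(choice)
```

This only says the character is absent from the data checked so far. A future value could still contain it, so sanitizing on the way in remains the safer option.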


Level 7

Fully agree with @Jennifer_Dungan, data sanitization is important. I sometimes use a combination of different characters like "+_+" to limit the chances of interference with actual values.
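To illustrate with a quick Python sketch (the sample values are made up): a multi-character delimiter such as "+_+" still splits cleanly even when the values contain "+" or "_" on their own. Whether a downstream tool can split on a multi-character delimiter is something to verify for that tool.

```python
DELIMITER = "+_+"

# Hypothetical values that contain the single characters, but never the full sequence.
values = ["A+ Rated", "snake_case_name", "plain"]

combined = DELIMITER.join(values)
print(combined)

# str.split treats a multi-character separator as one unit.
parts = combined.split(DELIMITER)
print(parts == values)
```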

cheers

Björn