Unicode apostrophe types and Citizen hub

Yesterday we were made aware of an issue that a resident had raised with a Customer Services Agent (CSA).

When the resident (and the CSA) tried to create a “New Service Request” from their person record, the requested form wouldn’t load. On investigation we noted that the resident’s surname contained an apostrophe; when we removed it, the form loaded. We are aware of other residents whose surnames include an apostrophe and who have not had this issue, so we tried retyping the apostrophe, and the form loaded.

Inspecting the originally submitted surname versus the new version, we found that the original apostrophe was a “straight” apostrophe and the newly entered one was a “curly” apostrophe.

Checking detective we found the following error:

[error] => 400
[message] => invalid json body: Control character error, possibly incorrectly encoded

Here is an explanation by one of my (incredible) colleagues:

Microsoft Word – type in a word with an apostrophe

  1. Select the apostrophe and press Alt-X, and it shows you it is Unicode 2019

  2. Open Excel, and copy and paste your word into two cells

  3. In the second cell, delete and retype the apostrophe

  4. You can see it is a different character.

  5. Copy and paste this second cell back into Word, select this new apostrophe and press Alt-X, and it will show you it is Unicode 0027.
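The same check can be done outside Word. This is a minimal Python sketch, not anything from Create itself; the helper name code_points and the sample surname are made up for illustration. It lists the code point and official Unicode name of each character, so the two apostrophes can be told apart:

```python
import unicodedata


def code_points(text: str) -> list[str]:
    """Return 'U+XXXX NAME' for every character in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}" for ch in text]


# Hypothetical sample surname, once with each apostrophe.
curly = "O\u2019Neil"     # the character Word reports as Unicode 2019
straight = "O\u0027Neil"  # the character Word reports as Unicode 0027

print(code_points(curly)[1])     # U+2019 RIGHT SINGLE QUOTATION MARK
print(code_points(straight)[1])  # U+0027 APOSTROPHE
print(curly == straight)         # False: they look alike but are different strings
```

The two surnames look almost identical on screen but are different strings, so any system comparing or encoding them will treat them differently.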

My only suggestion is that perhaps Create doesn’t handle specific types of Unicode apostrophes. It is possible to use either of these apostrophes in an auto-fill field, so maybe that is how it happened?

Is there a universal fix that can be applied for this? We can do it locally, but we felt that it should be examined by Netcall.

We have found this issue with an API call that was raising cases on a citizen hub host based on emails sent into a citizen hub central server. I’ve actually collected about a dozen bad characters that break the API and have had to set up a system for replacing them. In the end we found some code that strips anything except a small subset of Unicode characters and replaces them with the \u2019 code, and we modified that into a tidy-up script.
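The tidy-up script itself isn't shown here, so the following is only a sketch of one way to read that approach, assuming the request body is built in Python. The function name build_json_body and the field name surname are hypothetical. Python's standard json.dumps, with ensure_ascii left at its default of True, writes every non-ASCII character as a \uXXXX escape, so the body sent to the API is plain ASCII:

```python
import json


def build_json_body(fields: dict) -> str:
    """Serialise fields to JSON, writing non-ASCII characters as \\uXXXX escapes."""
    # ensure_ascii=True is the default; it is spelled out here to make the intent clear.
    return json.dumps(fields, ensure_ascii=True)


# Hypothetical payload containing a curly apostrophe (U+2019).
body = build_json_body({"surname": "O\u2019Neil"})

print(body.isascii())               # True: only ASCII characters are sent
print(json.loads(body)["surname"])  # the original surname survives a round trip
```

Escaping rather than deleting keeps the resident's name intact, provided the receiving end decodes JSON escapes in the usual way.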

Some of the bad characters were zero width spaces, some were em or en dashes. We even had an issue with a £ at one point.
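For anyone building a similar list of bad characters, a small scanner can help. This is a sketch under an assumption: it treats printable ASCII (U+0020 to U+007E) as the safe subset, which may not match the subset our script actually uses. The function name find_suspect_characters and the sample text are made up for illustration:

```python
import unicodedata


def find_suspect_characters(text: str) -> list[tuple[int, str, str]]:
    """Return (position, code point, Unicode name) for each character outside printable ASCII."""
    suspects = []
    for index, ch in enumerate(text):
        # Assumed safe subset: printable ASCII only.
        if not 0x20 <= ord(ch) <= 0x7E:
            suspects.append((index, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNNAMED")))
    return suspects


# Sample text with a zero width space, a pound sign, an en dash and an em dash.
sample = "Fee\u200b: \u00a35 \u2013 \u2014"
for position, code_point, name in find_suspect_characters(sample):
    print(position, code_point, name)
```

Running it over incoming email text before the API call shows which characters would need replacing or escaping.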

We’ve also just noticed that the issue is happening with the portal user sync from chcentral to chhosts.