How Can I Fix Character Set Errors When A Summernote Textarea Value is Saved To The Database?

My app uses MySQL and has a table column which is used to enter a description of an event.

My users tend to paste existing text into this field, and it sometimes generates a character set error when inserting their description the database when they have included less than common characters.

My database in general is set to be utf8, but I have set this specific field to be utf8mb4:

ALTER TABLE activities MODIFY description TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

The error that gets generated looks like this:

"SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\\xF0\\x9F\\x8C\\x9F\\xE2\\x9C...' for column 'description' at row 1",
 "trace": "#0 D:\\workshop-angel-v6b\\dmxConnectLib\\lib\\db\\Connection.php(144): PDOStatement->execute()\n#1 D:\\workshop-angel-v6b\\dmxConnectLib\\modules\\dbupdater.php(350): lib\\db\\Connection->execute('UPDATE `activit...', Array)\n#2 D:\\workshop-angel-v6b\\dmxConnectLib\\lib\\App.php(204): modules\\dbupdater->update(Object(stdClass), 'update_activity', Array)\n#3 D:\\workshop-angel-v6b\\dmxConnectLib\\lib\\App.php(176): lib\\App->execSteps(Object(stdClass))\n#4 D:\\workshop-angel-v6b\\dmxConnectLib\\lib\\App.php(144): lib\\App->execSteps(Array)\n#5 D:\\workshop-angel-v6b\\dmxConnectLib\\lib\\App.php(133): lib\\App->exec(Object(stdClass), false)\n#6 D:\\workshop-angel-v6b\\dmxConnectLib\\lib\\App.php(111): lib\\App->exec(Object(stdClass))\n#7 D:\\workshop-angel-v6b\\dmxConnect\\api\\activity\\update_activity_description.php(8): lib\\App->define(Object(stdClass))\n#8 {main}"
}

I am testing this with a piece of odd character filled text that good old chatGPT created for me, which I have included below.

What is the best way to fix this issue?

  1. Is there a filter I can apply to the text before it is saved to the database that will remove the offending characters?
  2. Is there a different character set I can assign to the column to accept these odd characters?

Best wishes,
Antony.


Test text:

Sure! Here's a paragraph of text that includes some unusual ASCII characters:

"🌟✨ Welcome to the mystical realm of 🌙☁️ ASCII art! Behold the enchanted creatures as they dance across the screen, ✨🦄 sprinkling magic wherever they go. 🎩✨ With a wave of my wand, I summon the elusive ╰(⸝⸝⸝´꒳`⸝⸝⸝)╯ ASCII fox, known for its cunning wit and mischievous charm. 🦊💫 As the moonlight cascades through the /人◕ ‿‿ ◕人\ mystical trees, the 🌳🌟 ASCII owl perches silently, guarding ancient wisdom. 🦉📚 Step into this ethereal realm and let your imagination ✨✍️ roam freely among the twinkling stars and enchanting creatures! 🌌🔮✨"

I hope you find this paragraph delightful and whimsical!

How it appears when output in the saving server action:

{
    "p_description": "<p>Sure! Here's a paragraph of text that includes some unusual ASCII characters:<\/p><p><br><\/p><p>\"\ud83c\udf1f\u2728 Welcome to the mystical realm of \ud83c\udf19\u2601\ufe0f ASCII art! Behold the enchanted creatures as they dance across the screen, \u2728\ud83e\udd84 sprinkling magic wherever they go. \ud83c\udfa9\u2728 With a wave of my wand, I summon the elusive \u2570(\u2e1d\u2e1d\u2e1d\u00b4\ua4b3`\u2e1d\u2e1d\u2e1d)\u256f ASCII fox, known for its cunning wit and mischievous charm. \ud83e\udd8a\ud83d\udcab As the moonlight cascades through the \uff0f\u4eba\u25d5 \u203f\u203f \u25d5\u4eba\uff3c mystical trees, the \ud83c\udf33\ud83c\udf1f ASCII owl perches silently, guarding ancient wisdom. \ud83e\udd89\ud83d\udcda Step into this ethereal realm and let your imagination \u2728\u270d\ufe0f roam freely among the twinkling stars and enchanting creatures! \ud83c\udf0c\ud83d\udd2e\u2728\"<\/p><p><br><\/p><p>I hope you find this paragraph delightful and whimsical!<\/p>",
}

I’ve drilled down into this a bit more…

It seems like I can save the :sunny: emoji which has a code of &#9728; but I can’t save the :star2:emoji which has a code of &#127775;

And also it doesn’t seem to matter if my database column is individually set to be the mb4 variety of utf8 or not. (my database as a whole is still utf8)

I’d love some guidance on this! :pray:

Thanks to the forum’s auto-bump feature, this article came up today which answers everything! :tada:

My issues were

  1. The column was set to utf8mb4 but the database wasn’t
  2. The global database connection wasn’t set to be utf8mb4.

With step 2, remember to set it for each of your targets!