My app uses MySQL and has a table column which is used to enter a description of an event.
My users tend to paste existing text into this field, and it sometimes generates a character set error when inserting their description the database when they have included less than common characters.
My database in general is set to be utf8
, but I have set this specific field to be utf8mb4
:
ALTER TABLE activities MODIFY description TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
The error that gets generated looks like this:
"SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\\xF0\\x9F\\x8C\\x9F\\xE2\\x9C...' for column 'description' at row 1",
"trace": "#0 D:\\workshop-angel-v6b\\dmxConnectLib\\lib\\db\\Connection.php(144): PDOStatement->execute()\n#1 D:\\workshop-angel-v6b\\dmxConnectLib\\modules\\dbupdater.php(350): lib\\db\\Connection->execute('UPDATE `activit...', Array)\n#2 D:\\workshop-angel-v6b\\dmxConnectLib\\lib\\App.php(204): modules\\dbupdater->update(Object(stdClass), 'update_activity', Array)\n#3 D:\\workshop-angel-v6b\\dmxConnectLib\\lib\\App.php(176): lib\\App->execSteps(Object(stdClass))\n#4 D:\\workshop-angel-v6b\\dmxConnectLib\\lib\\App.php(144): lib\\App->execSteps(Array)\n#5 D:\\workshop-angel-v6b\\dmxConnectLib\\lib\\App.php(133): lib\\App->exec(Object(stdClass), false)\n#6 D:\\workshop-angel-v6b\\dmxConnectLib\\lib\\App.php(111): lib\\App->exec(Object(stdClass))\n#7 D:\\workshop-angel-v6b\\dmxConnect\\api\\activity\\update_activity_description.php(8): lib\\App->define(Object(stdClass))\n#8 {main}"
}
I am testing this with a piece of odd character filled text that good old chatGPT created for me, which I have included below.
What is the best way to fix this issue?
- Is there a filter I can apply to the text before it is saved to the database that will remove the offending characters?
- Is there a different character set I can assign to the column to accept these odd characters?
Best wishes,
Antony.
Test text:
Sure! Here's a paragraph of text that includes some unusual ASCII characters:
"🌟✨ Welcome to the mystical realm of 🌙☁️ ASCII art! Behold the enchanted creatures as they dance across the screen, ✨🦄 sprinkling magic wherever they go. 🎩✨ With a wave of my wand, I summon the elusive ╰(⸝⸝⸝´꒳`⸝⸝⸝)╯ ASCII fox, known for its cunning wit and mischievous charm. 🦊💫 As the moonlight cascades through the /人◕ ‿‿ ◕人\ mystical trees, the 🌳🌟 ASCII owl perches silently, guarding ancient wisdom. 🦉📚 Step into this ethereal realm and let your imagination ✨✍️ roam freely among the twinkling stars and enchanting creatures! 🌌🔮✨"
I hope you find this paragraph delightful and whimsical!
How it appears when output in the saving server action:
{
"p_description": "<p>Sure! Here's a paragraph of text that includes some unusual ASCII characters:<\/p><p><br><\/p><p>\"\ud83c\udf1f\u2728 Welcome to the mystical realm of \ud83c\udf19\u2601\ufe0f ASCII art! Behold the enchanted creatures as they dance across the screen, \u2728\ud83e\udd84 sprinkling magic wherever they go. \ud83c\udfa9\u2728 With a wave of my wand, I summon the elusive \u2570(\u2e1d\u2e1d\u2e1d\u00b4\ua4b3`\u2e1d\u2e1d\u2e1d)\u256f ASCII fox, known for its cunning wit and mischievous charm. \ud83e\udd8a\ud83d\udcab As the moonlight cascades through the \uff0f\u4eba\u25d5 \u203f\u203f \u25d5\u4eba\uff3c mystical trees, the \ud83c\udf33\ud83c\udf1f ASCII owl perches silently, guarding ancient wisdom. \ud83e\udd89\ud83d\udcda Step into this ethereal realm and let your imagination \u2728\u270d\ufe0f roam freely among the twinkling stars and enchanting creatures! \ud83c\udf0c\ud83d\udd2e\u2728\"<\/p><p><br><\/p><p>I hope you find this paragraph delightful and whimsical!<\/p>",
}