I can add the funny é char to the cell inside phpMyAdmin; it only errors when I try to add it from a text field on the form.
Can I change that collation thing that I have never used before to something different to make it work?
Thanks, I have never touched those things, I always just left them as whatever they were already. I had no idea what they were for, so I figured I'd rather not touch them, hahaha. Thanks guys, I will change it quickly. Luckily South Africa really does not use many, if any, of those strange é thingymabobs.
Holy pooh, I changed to utf8_general_ci and my field, which was type text, has now changed to blob, and instead of data in the columns I have these [BLOB -1.4 KiB] links. I'm a little nervous; I should have backed up my database first. That will teach me.
If I turn on Show Blob Contents then some are correct with the expected text that was there and others are just strings and strings of code.
What I had to do to change the collation was this: first I fixed what phpMyAdmin did. Then, once it was all back to latin1_swedish_ci (which for some reason is the default of my phpMyAdmin, my MySQL server or InnoDB), I did a full database export.
I opened the file in BBEdit, did a find and replace of latin1 with utf8mb4, dropped all tables from my database, imported my BBEdit file, and wham, all converted and perfect.
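For anyone who wants to script that find-and-replace instead of doing it in an editor, here is a rough Python sketch. The table definition in `sample` and the function name `convert_dump` are made up for illustration; a real dump would be read from and written back to a file. Note that a blind replace also turns `latin1_swedish_ci` into `utf8mb4_swedish_ci`, and it would touch the word "latin1" inside your data too, so check the result before importing (and keep the original export as a backup).

```python
def convert_dump(sql_text: str) -> str:
    """Blindly replace every 'latin1' with 'utf8mb4', like an editor's find and replace."""
    return sql_text.replace("latin1", "utf8mb4")


# Hypothetical fragment of an exported dump, only for demonstration.
sample = (
    "CREATE TABLE `people` (\n"
    "  `surname` text COLLATE latin1_swedish_ci\n"
    ") ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_swedish_ci;\n"
)

converted = convert_dump(sample)
print(converted)

# With real files (file names are assumptions) it could look like:
#   from pathlib import Path
#   text = Path("export.sql").read_text(encoding="utf-8")
#   Path("export_utf8mb4.sql").write_text(convert_dump(text), encoding="utf-8")
```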
My surname ends in an 'é' (but I'm not French).
For a while I've been using utf8mb4. I can't quite remember why, but it seemed like the best option at the time. I would need to use this character set if I wanted to store this message in a database, because it supports emojis (but that probably wasn't the reason I chose it).
General, unicode and language locales refer to the collation, that is, how the database sorts and compares the data.
I was referring to using the charset utf8mb4 instead of plain utf8, which is a bit old and doesn't fully support Unicode. utf8 uses up to 3 bytes per character while utf8mb4 uses up to 4.
Of course, once you have decided to use utf8mb4 (right now it is the recommended charset), you should decide your collation type.
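To make the 3-byte versus 4-byte point concrete, here is a small Python sketch that counts how many bytes a few characters take in UTF-8. The helper names are my own; the check `fits_in_3_byte_utf8` is only a stand-in for "would plain MySQL utf8 be able to store this", assuming plain utf8 is limited to 3 bytes per character.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes one character takes when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_3_byte_utf8(text: str) -> bool:
    """True if every character needs at most 3 bytes (no emoji and the like)."""
    return all(utf8_len(ch) <= 3 for ch in text)


samples = {
    "plain a": "a",
    "e acute": "\u00e9",       # é
    "euro sign": "\u20ac",     # €
    "emoji": "\U0001F600",     # grinning face
}

for name, ch in samples.items():
    print(name, utf8_len(ch), fits_in_3_byte_utf8(ch))
```

So the é only needs 2 bytes and would be fine in either charset; it is the emoji-type characters that need the fourth byte.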
As for going for general or unicode collations: general is usually faster than unicode but “less correct” when comparing or sorting.
In the latest MySQL versions (8+ I think) the recommended charset and collation is “utf8mb4_0900_ai_ci”. Nonetheless, general and unicode will still do the job.
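In that collation name, "ai" stands for accent-insensitive and "ci" for case-insensitive. As a rough illustration of what that means for comparing and sorting, here is a Python sketch that strips accents and case before comparing. This is only an approximation with made-up helper names; the real MySQL collation follows the Unicode Collation Algorithm and handles far more than this.

```python
import unicodedata


def ai_ci_key(text: str) -> str:
    """Approximate an accent- and case-insensitive sort key."""
    # Split accented letters into base letter + combining mark, then drop the marks.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() is a more aggressive lower() meant for caseless matching.
    return stripped.casefold()


def ai_ci_equal(a: str, b: str) -> bool:
    """Compare two strings ignoring accents and case."""
    return ai_ci_key(a) == ai_ci_key(b)


words = ["zebra", "\u00e9clair", "Apple"]   # "\u00e9clair" is "éclair"
print(sorted(words, key=ai_ci_key))
print(ai_ci_equal("Caf\u00e9", "CAFE"))
```

Under a comparison like this, a search for "cafe" would also match "Café", which is usually what you want for names.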
I am going to bug you for one more piece of info on this, sorry, I just can’t seem to find a direct answer for this.
If the latin collation I had was 1 byte, utf8 is 3 bytes and utf8mb4 is 4 bytes, does that mean a tinytext of 256 char length in utf8mb4 is only (256/4)=64 char length now, or is tinytext always 256 chars regardless of its collation?
The maximum number of bytes per character of the charset (not the collation) defines how many different characters it can represent. To represent all the characters that Unicode has now, you need up to 4 bytes per character.
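The size of the data types is a separate thing, and it is defined by the database engine. As far as I know, in MySQL the TEXT types are limited in bytes rather than characters: a tinytext holds up to 255 bytes. utf8mb4 is variable-length, so plain ASCII text still takes 1 byte per character and you still get 255 of those; only text made up entirely of 4-byte characters would be cut down to about 63.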
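To put numbers on the tinytext question, here is a small Python sketch. It assumes the MySQL tinytext limit is 255 bytes (not characters) and that utf8mb4 stores each character in as many bytes as UTF-8 needs; the constant and function names are my own.

```python
TINYTEXT_MAX_BYTES = 255  # assumed MySQL limit for tinytext, counted in bytes


def max_repeats(ch: str, limit: int = TINYTEXT_MAX_BYTES) -> int:
    """How many copies of one character fit within the byte limit."""
    return limit // len(ch.encode("utf-8"))


def fits(text: str, limit: int = TINYTEXT_MAX_BYTES) -> bool:
    """True if the UTF-8 encoded text is within the byte limit."""
    return len(text.encode("utf-8")) <= limit


print(max_repeats("a"))            # 1 byte per character
print(max_repeats("\u00e9"))       # é, 2 bytes per character
print(max_repeats("\U0001F600"))   # emoji, 4 bytes per character
```

In other words, the capacity depends on what you actually store, not on the charset alone: ordinary English text in a utf8mb4 column still gets the full 255 characters.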