Max. bytes in a UTF-8 char?

4.

There are a maximum of 4 bytes in a single UTF-8 encoded Unicode character.

And this is how the encoding scheme works in a nutshell.

Code point bits  First code point  Last code point  Bytes in sequence  Byte 1    Byte 2    Byte 3    Byte 4
7                U+0000            U+007F           1                  0xxxxxxx
11               U+0080            U+07FF           2                  110xxxxx  10xxxxxx
16               U+0800            U+FFFF           3                  1110xxxx  10xxxxxx  10xxxxxx
21               U+10000           U+10FFFF         4                  11110xxx  10xxxxxx  10xxxxxx  10xxxxxx

Source: Wikipedia (also confusingly showing 6 possible bytes when truly 4 is the maximum)
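
To make the table concrete, here is a minimal sketch in Python of an encoder that follows it row by row. The function name utf8_encode is made up for illustration, and the checks for the U+10FFFF ceiling and the surrogate range (U+D800 to U+DFFF) are my additions based on the current UTF-8 definition (RFC 3629), not something the table itself spells out. In real code you would just call str.encode("utf-8").

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 (illustrative helper, not a library API)."""
    # Assumption: reject values outside Unicode and the surrogate range,
    # which UTF-8 as defined today does not allow.
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp:#x}")

    if cp <= 0x7F:
        # 1 byte: 0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


if __name__ == "__main__":
    # One sample per table row, compared against Python's built-in encoder.
    for ch in ["A", "\u00e9", "\u20ac", "\U0001F600", "\U0010FFFF"]:
        encoded = utf8_encode(ord(ch))
        assert encoded == ch.encode("utf-8")
        print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex(' ')}")
```

Even the highest code point, U+10FFFF, lands in the last branch, so nothing ever needs more than 4 bytes.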

Wait, I heard there could be 6?

No.
You heard wrong.
