Who cares? It’s going to be hashed anyway. If the same user can generate the same input, it will result in the same hash. If another user can’t generate the same input, well, that’s really rather the point. And I can’t think of a single backend, language, or framework that doesn’t treat a single Unicode character as one character. Byte length of the character is irrelevant as long as you’re not doing something ridiculous like intentionally parsing your input in binary and blithely assuming that every character must be 8 bits in length.
If the same user can generate the same input, it will result in the same hash.
Yes, if. I don’t know if you can guarantee that. It’s all fun and games as long as you’re doing English. In other languages, you get characters that can be encoded in more than 1 way. User at home has a localized keyboard with a dedicated key for such a character. User travels across the border and has a different language keyboard and uses a different way to create the character. Euro problems.
Byte length of the character is irrelevant as long as you’re not doing something ridiculous like intentionally parsing your input in binary and blithely assuming that every character must be 8 bits in length.
There is always some son-of-a-bitch who doesn’t get the word.
Hmm. I wonder about this one. Different ways to encode the same character. Different ways to calculate the length. No obvious max byte size.
Who cares? It’s going to be hashed anyway. If the same user can generate the same input, it will result in the same hash. If another user can’t generate the same input, well, that’s really rather the point. And I can’t think of a single backend, language, or framework that doesn’t treat a single Unicode character as one character. Byte length of the character is irrelevant as long as you’re not doing something ridiculous like intentionally parsing your input in binary and blithely assuming that every character must be 8 bits in length.
Yes, if. I don’t know if you can guarantee that. It’s all fun and games as long as you’re doing English. In other languages, you get characters that can be encoded in more than 1 way. User at home has a localized keyboard with a dedicated key for such a character. User travels across the border and has a different language keyboard and uses a different way to create the character. Euro problems.
https://en.wikipedia.org/wiki/Unicode_equivalence
There is always some son-of-a-bitch who doesn’t get the word.