The problem with that line of reasoning is that it ruins what’s arguably the most important feature of DNS: providing human-readable names.
Using lookalike characters to deceive people has been a problem since long before anyone first got the idea to register paypa1.com, but no one ever seriously suggested abandoning human-readable names in order to avoid that problem.
I’m unsure how that’d be useful to any normal user. Let’s say the UI shows something like this:
A.com
Α.com (xn--mxa.com)
А.com (xn--80a.com)
What’s the user supposed to do with that information? How would showing the Punycode here help any normal user determine which one of these domains is the one they want to visit?
Helping users identify the right domain name and avoid being deceived is surely a very important thing to do; I just find it hard to see how having users read Punycode would ever be a practically useful way to achieve that.
Let’s say that I go to google.com. The UI shows https://google.com/. No Punycode, because it is plain ASCII. Everything is as expected.
Now let’s say I click on a link for googӏe.com. The UI shows https://xn--googe-hof.com/ (googӏe.com). I’d be like, holy shit, that is a shady URL!
That’s how I imagine it helping, although I am not a UI expert. There could be a better way. But that googӏe.com scares me – I can’t visually tell that it is not a normal lowercase “l”.
P.S. For the URL in question, https://xn--gckvb8fzb.com/ (マリウス.com), I imagine that if I went to it frequently, I might begin to recognize the Punycode, sorta like how people recognize rickroll URLs.
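For anyone who wants to reproduce the labels quoted above, here is a minimal Python sketch of the "ASCII form first, Unicode form in parentheses" display. It is only an illustration, not how any browser actually does it: the helper names `to_ascii_label` and `display_form` are made up for this example, and it relies on Python's built-in `punycode` codec, which implements only the RFC 3492 encoding and performs none of the IDNA/UTS #46 mapping or validation a real client would need.

```python
def to_ascii_label(label: str) -> str:
    """Return the ASCII form of a single DNS label (hypothetical helper)."""
    label = label.lower()
    if label.isascii():
        return label
    # The "punycode" codec only performs the RFC 3492 encoding;
    # the "xn--" prefix has to be added by hand.
    return "xn--" + label.encode("punycode").decode("ascii")


def display_form(hostname: str) -> str:
    """Hypothetical UI string: ASCII form, then the Unicode form if it differs."""
    ascii_name = ".".join(to_ascii_label(part) for part in hostname.split("."))
    if ascii_name == hostname.lower():
        return ascii_name  # plain ASCII name, nothing extra to show
    return f"{ascii_name} ({hostname})"


# \u04cf is the Cyrillic palochka that looks like a lowercase "l",
# \u03b1 is Greek alpha, \u0430 is Cyrillic a.
for name in ["google.com", "goog\u04cfe.com", "\u03b1.com", "\u0430.com"]:
    print(display_form(name))
```

A plain ASCII name comes back unchanged, while the lookalike names get their `xn--` form put in front, which is the effect described in the comment above.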
But that line of reasoning presupposes both that the right name is in ASCII and that the user knows this. As soon as either one of those isn’t true, showing the Punycode is no longer of any help in determining which one is the right one.
For most security-centric websites, the right name is ASCII only.
For any that aren’t, people would have the opportunity to become familiar with the correct fingerprint over time and have a chance to notice a difference.
I’m curious to hear if you think there is a better way. What I’m saying is unlikely to ever be implemented in a browser and I’m not trying to convince you or anything, just saying why I personally would appreciate it.
For most security-centric websites, the right name is ASCII only.
Are you perhaps by any chance American?
The problem with that line of reasoning is that it ruins what’s arguably the most important feature of DNS: providing human-readable names.
Using lookalike characters to deceive people has been a problem since long before anyone first got the idea to register paypa1.com, but no one ever seriously suggested abandoning human-readable names in order to avoid that problem.
The term “human” does not include people who primarily read non-Latin-based languages, silly.
Note that everything outside of ASCII gets encoded in Punycode, so this also includes most languages written in the Latin script.
Shit, I forgot that Human now just means the native English-speaking world.
Ideally they should show both side by side.
But how would an average user know that xn--googe-hof.com isn’t the right one?
Because it does not match google.com.
I think a much better solution would be to shield end users from this problem entirely, by having all registries refuse to register such confusable names, as recommended by Unicode:
https://www.unicode.org/reports/tr46/tr46-34.html#Registries