
The main reason why it wouldn't have happened is that CSS selectors predate XPath by a few years. CSS was first proposed in 1994 and the CSS1 spec was released in 1996. I don't know when XPath was originally proposed, but the first public draft was in late 1998 and the release was in 1999.

CSS 2 actually predates XPath 1.0.

XPath would also have needed more work to replace CSS selectors. Aside from being a bigger performance concern (it is more capable and does not work in a strictly top-down manner, so you can easily get very inefficient selectors), it lacks facilities which are quite critical to CSS selectors, like the shortcut id and class selectors, as well as priority.

In fact talking about class selectors, those are absolute hell to replicate in XPath 1.0 if you don't have extensions to lean on. To replicate the humble `.foo` you need something along the lines of

    //*[contains(concat(' ', normalize-space(@class), ' '), ' foo ')]
And don't miss the spaces around the name of the class you're looking for, they're quite critical. Good fucking luck if you need to combine multiple classes.
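
To see why those padding spaces matter, here is a minimal Python sketch that mimics the string operations of the XPath 1.0 expression above. It is not an XPath engine, and the function names (`normalize_space`, `has_class_xpath1`, `has_class_unpadded`, `has_all_classes`) are made up for illustration.

```python
import re

# Whitespace as XPath 1.0 sees it: space, tab, carriage return, line feed.
_XPATH_WS = re.compile(r"[ \t\r\n]+")


def normalize_space(value):
    # Mimics normalize-space(): trim, then collapse whitespace runs to one space.
    return " ".join(tok for tok in _XPATH_WS.split(value) if tok)


def has_class_xpath1(class_attr, name):
    # Mimics contains(concat(' ', normalize-space(@class), ' '), ' name ').
    padded = " " + normalize_space(class_attr) + " "
    return (" " + name + " ") in padded


def has_class_unpadded(class_attr, name):
    # What you get when the spaces are forgotten: contains(@class, 'name').
    return name in class_attr


def has_all_classes(class_attr, *names):
    # Combining classes means repeating the whole predicate once per class.
    return all(has_class_xpath1(class_attr, n) for n in names)
```

With these, `has_class_unpadded("foobar", "foo")` is true (a false positive), while `has_class_xpath1("foobar", "foo")` is false. In XPath itself, the multi-class case means writing the full `contains(concat(...))` predicate once per class and joining them with `and`.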

EXSLT and XPath 2.0 have `tokenize`, which makes it much more convenient, although IIRC the way it's used is weird. I think it's

    //*[tokenize(@class, '\s+') = 'foo']
because "=" on a nodeset is really a containment operation? Not sure. There's also `matches`, but that's error-prone because class names tend to be caterpillar-separated (hyphenated), and your friendly neighborhood `\b` will match at those hyphens, so you need to mess around with `(^|\s+)` bullshit instead.
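
On the "=" question: XPath's general comparison is existential, so `sequence = 'foo'` is true when any item in the sequence equals `'foo'`. The Python sketch below illustrates that, plus the `\b` pitfall. It uses Python's `re` module as a stand-in, which is a different regex dialect from the one XPath uses, and all function names are made up for illustration.

```python
import re


def general_equals(items, value):
    # Existential comparison: true if ANY item of the sequence equals value.
    return any(item == value for item in items)


def has_class_tokenize(class_attr, name):
    # Rough analogue of: tokenize(@class, '\s+') = 'name'
    return general_equals(re.split(r"\s+", class_attr), name)


def has_class_word_boundary(class_attr, name):
    # The tempting \b version: a hyphen counts as a word boundary too.
    return re.search(rf"\b{re.escape(name)}\b", class_attr) is not None


def has_class_anchored(class_attr, name):
    # The (^|\s)...(\s|$) version: only whole whitespace-separated tokens match.
    return re.search(rf"(^|\s){re.escape(name)}(\s|$)", class_attr) is not None
```

For the class attribute `"foo-bar"`, `has_class_word_boundary(..., "foo")` reports a match because `\b` sits between `o` and `-`, while the tokenized and anchored versions correctly report none.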

And finally I believe xpath 3.1 has a straightforward "contains-token" which does what the CSS "~=" operator does.

XPath 3.1 was released in 2017. "~=" was part of CSS2 (CSS1 didn't have "arbitrary" attribute selection, only classes and ids).



XPath works with XML, class is a mini language. We had the same problem in JavaScript until classList

    $(element).attr("class").split(/\s+/)
It is a side effect of one attribute name per element

    <a class=foo class=bar>
    //*[@class = 'foo']
    //*[@class = 'bar']
or of having attributes at all

    <a>
      <class>foo</class>
      <class>bar</class>
    </a>
    //*[class = 'foo']
    //*[class = 'bar']
Of course `class` access would be optimized like it is today.


> XPath works with XML, class is mini language.

So? Also class is just a token list.

> We had same problem in JavaScript until classList

1. it's much less common to select using javascript than it would be using an xpath selector, hence the issue.

2. because JS is a full-blown programming language, splitting the class into a list is not difficult, and can furthermore be trivially factored into a helper function, or a set thereof (as class modification would really be what you'd want to do)

> It is side effect of one attribute name per element

It's a side-effect of token lists not being much of a use-case at the time for XPath, despite having been a CSS use-case for 3 years at that point.

As my comment notes, XPath 3.1 literally has a contains-token function, and that function is compatible with XPath 1.0. If you have an XML processor which allows extensions (which would be the case for more or less all of them outside browsers), you can trivially implement your own, or an even more specialised `has-class` function.
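
As a rough idea of how little such an extension needs to do, here is a Python sketch loosely modeled on what contains-token does. Both `contains_token` and `has_class` are hypothetical helpers, the attribute dictionary is an assumed stand-in for an element, and the processor-specific step of registering the function as an XPath extension is left out.

```python
def contains_token(value, token):
    # Trim the token being searched for; an empty token never matches.
    token = token.strip()
    if not token:
        return False
    # str.split() with no arguments splits on any whitespace run
    # and drops empty tokens.
    return token in value.split()


def has_class(attrs, name):
    # attrs is a plain dict of attribute names to values (an assumption
    # made for this sketch); a missing class attribute means no classes.
    return contains_token(attrs.get("class", ""), name)
```

For example, `has_class({"class": "btn btn-primary"}, "btn")` is true, while `contains_token("btn-primary", "btn")` is false.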

But contains-token was added in 2017, not in 1999, to say nothing of 1996.


It looks strange to me: the language is designed to work with trees, so turn everything into a tree. No need to change the serialization; we know the formats, so just parse them: class, URL, CSS. That's what we do in JavaScript [1], [2], [3].

    location.host === 'example.com'
    span.style.color === 'blue'
    date.getFullYear() === 2020

    //a[href/host = 'example.com']
    //span[xstyle/color = 'blue']
    //date[datetime/year = '2020']
Just like your proposal, but the node knows how to parse itself

    //a[url-host(@href) = 'example.com']
    //span[css-color(@style) = 'blue']
    //time[datetime-year(@datetime) = '2020']
And it is actual syntax; I've checked it on

    <a><href><host>example.com</host></href></a>
    <span><xstyle><color>blue</color></xstyle></span>
    <date><datetime><year>2020</year></datetime></date>
    
* <style> is CDATA, emulated with <xstyle>
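
For a feel of what the hypothetical `url-host`, `css-color` and `datetime-year` functions above would have to compute, here is a small Python sketch using only the standard library. The function names simply mirror the ones in the XPath snippets, and the inline-style parsing is deliberately naive (it does not handle semicolons inside values, for instance).

```python
from datetime import datetime
from urllib.parse import urlsplit


def url_host(href):
    # Hostname part of a URL; unlike location.host, the port is not included.
    return urlsplit(href).hostname


def css_color(style):
    # Naive inline-style parsing: split declarations on ';', then on ':'.
    color = None
    for declaration in style.split(";"):
        prop, sep, value = declaration.partition(":")
        if sep and prop.strip().lower() == "color":
            color = value.strip()  # a later declaration overrides an earlier one
    return color


def datetime_year(value):
    # Accepts ISO 8601 strings such as '2020-05-17' or '2020-05-17T10:30:00'.
    return datetime.fromisoformat(value).year
```

Each helper takes the raw attribute string and returns one parsed component, which is what either syntax (child nodes or extension functions) would be exposing.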

[1] https://developer.mozilla.org/en-US/docs/Web/API/DOMTokenLis...

[2] https://developer.mozilla.org/en-US/docs/Web/API/URL

[3] https://developer.mozilla.org/en-US/docs/Web/API/CSS_Object_...


Yeah maybe not the best idea, XPath has a pretty different use case than CSS. But it does seem a shame that more complex "nth-child" selectors do not just use something more flexible/programmable like xpath.

As it stands CSS keeps adding one-off selectors such as nth-child, only-child, only-of-type. Such is the feature creep of web standards as they are, a continual addition of one-off APIs for specific use cases rather than a robust orthogonal programming platform that people can build anything on top of, like WebCrypto instead of adding integers.


well, I know XPath 1.0 was released in 1999 because I was doing stuff with it.

CSS 2 as I remember was some years after that, and googling it seems to be August 2002.

At any rate in this example

    //*[contains(concat(' ', normalize-space(@class), ' '), ' foo ')]

I don't think the interest in using XPath is at all related to being able to check what class an element has, and in fact the linked discussion does not have anything to do with that, so I mean yeah, that's pretty bad but not anything anyone is asking to do in this scenario.


> CSS 2 as I remember was some years after that, and googling it seems to be August 2002.

You're thinking about CSS 2 revision 1, aka CSS 2.1. CSS 2 itself became a W3C Recommendation in May 1998.

> I don't think the interest in using XPath is at all related to being able to check what class an element has

The original comment in the thread was

> It would have made sense to use XPath for CSS selectors

Being able to check what class an element has would be absolutely critical to that use-case.


ah yeah, forgot we were in a thread and was thinking about the post itself.



