If everything were arbitrary, there would be no agreed-upon standards or semantics.
By having an agreed-upon structure you allow the internet to get smarter: data mining, search engines, usability, etc.
E.g. a <phone> tag that is always expected to contain a phone number. That way, when it's rendered on certain devices, it will always hold a phone number that can be interacted with.
Sure, you can achieve that now with text parsing, but there are better ways to evolve the web, and that's what standards are for, along with the introduction of new elements that are semantically accurate.
A phone number is better represented by a URI (such as the tel: URI scheme). Though I agree with you, I wonder if there is no better way to represent time as well, rather than a new HTML element.
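To make the tel: point a bit more concrete, here is a rough sketch of how a client could pick phone numbers out of tel: links with no text parsing at all. It only uses Python's standard html.parser module; the TelLinkCollector class, the collect_tel_numbers helper, and the sample markup are made up for illustration and not taken from any real browser or device.

```python
from html.parser import HTMLParser


class TelLinkCollector(HTMLParser):
    """Hypothetical helper: collects the targets of <a href="tel:..."> links."""

    def __init__(self):
        super().__init__()
        self.numbers = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # URI schemes are case-insensitive, so compare in lowercase.
        if href.lower().startswith("tel:"):
            self.numbers.append(href[len("tel:"):])


def collect_tel_numbers(html):
    """Return the phone numbers found in tel: links, in document order."""
    parser = TelLinkCollector()
    parser.feed(html)
    parser.close()
    return parser.numbers


# Made-up sample markup: one tel: link and one ordinary link.
sample = (
    '<p>Call <a href="tel:+1-555-0100">our office</a> '
    'or <a href="/contact">write to us</a>.</p>'
)
print(collect_tel_numbers(sample))
```

The client never has to guess which digit runs are phone numbers; the markup already says so.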
Pardon the devil's advocate question, but to what extent is semantic markup (other than with microformats) relevant to today's web? I mean, how is it used by anyone? I've tried to use semantic markup and it's not really a very effective abstraction for many kinds of data and UX.
Ask anyone who accesses the web through a screen reader and they’ll tell you it’s extremely useful. Not to be flippant, but well structured markup allows a screen reader to:
* Emphasise headings
* Allow users to skip past header sections automatically
* Ignore information unrelated to the content (aside sections)
* Allow users to navigate by heading structure
* Voice certain parts differently - links are an obvious candidate, but screen readers could put on different voices for quotes
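The "navigate by heading structure" item can be sketched in a few lines. This is not how any actual screen reader is implemented; it is only meant to show that heading tags alone give a program a navigable outline, while a styled div gives it nothing. HeadingOutline, heading_outline, and the sample page are hypothetical names made up for this example.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingOutline(HTMLParser):
    """Hypothetical helper: records (level, text) for every heading element."""

    def __init__(self):
        super().__init__()
        self.outline = []    # list of (level, text) tuples in document order
        self._level = None   # level of the heading currently open, if any
        self._text = []      # text fragments seen inside the open heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a heading.
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def heading_outline(html):
    """Return the page's headings as a list of (level, text) tuples."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return parser.outline


# Made-up page: three real headings and one div that only looks like one.
page = """
<h1>Weather</h1>
<p>Intro text.</p>
<h2>Today</h2>
<p>Sunny.</p>
<h2>Tomorrow</h2>
<div class="title">Not a heading</div>
"""

# Print an indented outline, one line per heading.
for level, text in heading_outline(page):
    print("  " * (level - 1) + text)
```

The div with class "title" never shows up in the outline, which is the whole argument for using the heading elements in the first place.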
Hang on a sec, that's what CSS does for the sighted. We don't expect the HTML to tell the browser how to lay out the page, so why are you expecting the HTML to be able to tell a screen reader how to present it?
The semantic web is supposed to elicit meaning from content, not presentation.
The whole concept of semantic markup is extremely fishy to me. On the one hand you're supposed to remove presentation directions but on the other it's supposed to give presentation clues to screen readers?
When you build a site with CSS you’re overriding the browser’s built-in stylesheet. If you don’t provide your own CSS then the site is still readable without it, because the browser lends basic visual meaning to semantic markup.
Think of screen readers as having internal styling. Links are read in a different voice because the screen reader is applying a default audio style to the page.
Audio styling is part of CSS2.1 [1] but isn’t widely supported or widely written (classic chicken/egg problem). If you want to, you can (at least theoretically) override a screen reader’s default presentation of your markup.
I know, although I disagree that almost any site is really readable without its CSS stylesheet. But the original question was how semantic HTML is useful.
If usability is the only answer and it's doing precisely the opposite of what semantic HTML is supposed to achieve, namely separation of content and presentation, it appears to me that we've not really decided what semantic HTML is, it just sounds kinda cool and has a happy side effect of making it easier for screen readers to parse.
So many of these new tags seem too narrow, a victim of the old-fashioned thinking of the W3C where everything, to them, is still a document.
Why use article instead of div? Will it really make mining easier or will you get loads of false positives? If my site creates hourly weather reports, are they articles? Why can't I define my own tag of weatherreport? Is there any point?
To quote ryan's original comment:
This isn't 'bringing the web forward' in any sense. I don't know what the answer is but it certainly isn't this.
HTML shouldn't need CSS + JavaScript to incorporate basic functions. The point of these new elements is to DEFINE semantics that we can take advantage of TOMORROW. This goes far beyond screen readers.
The HTML doesn't tell the screen reader how to present the content; it just tells it what the content is, and the screen reader then decides the best way to present it (web devs just aren't qualified to decide that).
Microformats are the one case that does seem obvious/sensible to me.
It would seem to me that decentralized, ad-hoc microformat micro-standards would make more sense than a slow, bureaucratic process. The web will adopt good ideas, useful semantics, etc., so I don't really see the benefit of a 5+ year process of introducing a few new tags.