Showing posts with label whatwg. Show all posts

Sunday, January 11, 2015

On Use of the Lang Attribute

HTML5 Logo with character for Chinese number 5.

Way back in October I noticed WHATWG HTML bug 26942, where someone asked why these examples of <html> lack the lang attribute. I thought the answer from Hixie was a bit dismissive and not based on any data or on the real-world benefits of using it, particularly in the context of screen readers:

Why not? Realistically, few people include it. It just means the language is unknown.

At the time, I could not get the latest archive to download from WebDevData.org (though that has changed, see below), so I fell back to asking for help on why the lang attribute is valuable.

How the lang Attribute on <html> Is Used

I got lots of good bits of feedback, which I collected into a Storify. I've distilled all that great information to these key points:

  • VoiceOver on iOS uses the attribute to auto-switch voices.
  • VoiceOver can speak a particular language using a different accent when specified.
  • Leaving out the lang attribute may require the user to manually switch to the correct language for proper pronunciation.
  • JAWS uses it to load the correct phonetic engine / phonologic dictionary — Handy for sites with multiple languages.
  • NVDA (Windows) uses it in the same way as VoiceOver and JAWS.
  • When used in HTML that is used to form an ePub or Apple iBooks document, it affects how VoiceOver will read the book.
  • Firefox, IE10, and Safari (as of a year ago) only support CSS hyphens: auto when the lang attribute is set (not from Twitter; source).

In the absence of setting a lang attribute on the <html> element, screen readers will fall back to the user's default system setting (barring any custom overrides) when speaking content.

How Many Pages Use lang

On January 8, WebDevData.org (from a W3C Community Group) posted its latest archive (which did not error on download, woo!). It consists of the HTML from 87,000 web pages.

I pulled down the 780MB file and re-taught myself the skills necessary to parse the files. For those who are regular expression geniuses, you are welcome to suggest an alternate approach, but I used the following pattern to return all the <html> elements: <html([^>]+)>. It fails for any <html> with no attributes at all, but for what I am doing that's ok.

Of the 84,054 pages I parsed (I excluded XML, ISO files, and so on), I found that 39,433 use the lang attribute on the <html> element. That's just about 47% (46.914% if I understand significant digits correctly).
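
For anyone who wants to try a similar count, here is a minimal Python sketch of the approach described above. It reuses the pattern <html([^>]+)> quoted in this post. Everything else is an assumption made for illustration: the function names, the case-insensitive matching, and the rule used to tell lang apart from xml:lang. It is not the script that produced the numbers above, and it will be fooled by attribute values that contain ">" or the text "lang=".

```python
import re

# Pattern quoted in the post: captures the attribute text of an <html> tag.
# As the post notes, it does not match a bare <html> with no attributes.
# Case-insensitive matching is an assumption, not something the post states.
HTML_TAG = re.compile(r"<html([^>]+)>", re.IGNORECASE)

# Assumption: a lang attribute is "lang" followed by "=", and not preceded by
# a word character, ":" or "-". That keeps xml:lang and data-lang from
# being counted as lang.
LANG_ATTR = re.compile(r"(?<![\w:-])lang\s*=", re.IGNORECASE)


def has_html_lang(source: str) -> bool:
    """Return True if the first <html ...> tag carries a lang attribute."""
    match = HTML_TAG.search(source)
    return bool(match and LANG_ATTR.search(match.group(1)))


def lang_share(pages: list[str]) -> float:
    """Percentage of pages whose <html> element has a lang attribute."""
    if not pages:
        return 0.0
    with_lang = sum(has_html_lang(page) for page in pages)
    return 100 * with_lang / len(pages)
```

Given a list of page sources, lang_share returns the percentage that declare lang on <html>. The same arithmetic on the figures above, 100 * 39433 / 84054, comes out at roughly 46.914.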

What that tells me is that it is not the case that few people include it. Nearly half the web does.

There are 12,672 instances of xml:lang, though at a quick scan they appear alongside lang. If anyone with better regex skills would like to help me further parse, please let me know.

Why You Should Use the lang Attribute on the <html> Element

Hyphens

By using lang, you get the benefits of hyphen support in your (modern) browser that you otherwise would not get (assuming you use hyphens: auto in your CSS).

Accessibility

At the very least, lang is a benefit for screen reader users, particularly when your users don't have the same primary language as your site. It allows proper pronunciation and inflection when the page is spoken.

WCAG Compliance

Including the lang attribute is a Level A requirement of the Web Content Accessibility Guidelines 2.0 (specifically item 3.1.1 Language of Page). Technique H57 identifies the lang attribute specifically.

Internationalization

The W3C Internationalization (I18n) Activity has a great Q&A on why you should use lang, which was updated less than two months ago. I'll reprint the start of the answer, but there is far more detail and I strongly recommend you go read it.

Identifying the language of your content allows you to automatically do a number of things, from changing the look and behavior of a page, to extracting information, to changing the way that an application works. Some language applications work at the level of the document as a whole, some work on appropriately labeled document fragments.

We list here a few of the ways that language information is useful at the moment, however, as specifications and browsers evolve in the future there could be numerous additional applications for language information.

Interesting Aside

If you go to the WHATWG HTML5 specification today and view the page source, you'll see the following language declaration in the code:

<html class=split data-revision="$Revision: 8877 $" lang=en-GB-x-hixie>

Not to be outdone, the W3C HTML5 spec has the same language declaration.

If anybody has the en-GB-x-hixie phonologic dictionary in his or her screen reader, I'd love to hear it.

While technically allowed (the -x puts it in the private use sub-tag category), it's bad form:

Private-use subtags do not appear in the subtag registry, and are chosen and maintained by private agreement amongst parties.

Because these subtags are only meaningful within private agreements and cannot be used interoperably across the Web, they should be used with great care, and avoided whenever possible.
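
To make the private-use part concrete, here is a small Python sketch that pulls out whatever follows the "x" singleton in a language tag. The function name is hypothetical and the logic is deliberately simplified: it only splits on hyphens and looks for "x". It does not validate the tag against the subtag registry and should not be mistaken for a full language tag parser.

```python
def private_use_subtags(tag: str) -> list[str]:
    """Return the subtags that follow a private-use "x" singleton, if any."""
    # Language tags are case-insensitive, so compare in lowercase.
    subtags = tag.lower().split("-")
    if "x" not in subtags:
        return []
    # Everything after the first "x" is private use.
    return subtags[subtags.index("x") + 1:]
```

Called with the tag from the spec source, private_use_subtags("en-GB-x-hixie") returns ["hixie"], while a plain "en-GB" returns an empty list.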

Update: January 1, 2015

For what it's worth, I've filed bugs against the W3C HTML5 spec and the WHATWG HTML5 spec.

Update: February 25, 2015

Another case where a lang attribute is important, though in this case on a specific element, is outlined in the piece HTML5 number inputs – Comma and period as decimal marks:

<input type="number"> will open a numeric software keyboard on modern mobile operating systems. Not every user can input decimal numbers into this convenient field without proper localization.

[…]

Half the world uses a comma and the other half uses a period as their decimal mark. (In Latin scripts.) Does your web application take that into consideration? Do the browsers?

Tuesday, January 7, 2014

The HTML Star Is Ignored (and Shouldn't Be)

On Friday Jeff Croft posted a piece titled Web Standards Killed the HTML Star where he makes the argument that just knowing HTML and CSS is no longer enough to get a job. He states that the web standards movement has effectively rendered the need for specialized knowledge of browser quirks meaningless, something he feels an HTML/CSS author used to bring to the table. In short, he maintains that the HTML/CSS dev needs to develop new skills.

And So I Ramble Through a Response

It's a compelling argument. With the advent of frameworks and pre-built templates, most web developers don't necessarily feel the need to write HTML or CSS anymore. With WYSIWYG editors built into CMSes, word processors that output to HTML, and all manner of other output-for-web features in traditional software, non-web developers never even need to see HTML if they don't want to.

Jeff's post is addressing standards from the perspective of what skills get you hired. Sadly, it's hiring practices that continue to perpetuate the lack of standards and a need for truly talented HTML/CSS coders. There are plenty of reasons why we still need skilled HTML/CSS coders.

Hip Technologies

Too many developer job requirements do in fact treat HTML/CSS as a given. It is assumed you have those skills if you apply for a software developer or graphic designer job, so prospects list HTML and CSS alongside MS Word and Excel on resumes. Instead, job listings look for the hip language or tool du jour (Node instead of ActionScript, Twitter Bootstrap instead of Dreamweaver, etc.).

In the last decade, many cool technologies have come and gone. Few people are asking for Flash features on their sites, for example. HTML and CSS, however, are still there. They are the bedrock on which all these new tools are built. Staying abreast of everything going on in HTML is perhaps the best way to understand and evaluate the latest JavaScript library or CSS pre-processor. Except that skill-set has been commoditized.

Bad Advice

There are many tech outlets on the web, and they all want to get eyeballs to feed the ad banners paying their salaries. As such, some of the advice on how to properly code HTML and CSS is dubious at best. It doesn't help that some of the people reviewing these resources are also not HTML/CSS experts and so cannot identify when advice is outright wrong.

Let's not forget that the HTML specification is a fluid thing. HTML 5 is not final, HTML 5.1 is coming on its heels, and browser support is still a thing we have to consider. As such, you can't really casually know HTML or CSS.

Bandwidth

Remember the adage that everyone is surfing without JavaScript until the JS file is downloaded and processed. This is particularly important over slow or bad connections. It's what drives the "offline first" trend.

Perhaps you are smarter than generating nothing but a body and letting the JS fill in the HTML. If so, then knowing proper HTML is even more important, as that is what the browser will be displaying until your script is parsed.

Starting Assumptions and Pre-Built Platforms

A few weeks back I attended a local WordPress meet-up targeted at developers. As I shared my own development practices someone asked what framework I use for responsive design. I explained I use none. The reaction was bafflement. The idea of starting a responsive site with nothing but a blank Notepad window was completely foreign. The same discussion happened regarding my preferred CSS reset (none) or CSS pre-processor (also none).

Now we live in a time when many developers don't know the fundamental HTML and CSS behind their own pages. They are aware there are resets and frameworks, that they need to muddle through some markup or styles to customize it, that a module or add-on will unlock more features. They all have their choice of shiny hammer, so every problem is just a nail. They are limited by their tools.

These tools are often wrong. Their application of HTML and CSS is against best practices, a barrier to accessibility, or generally inconsistent. These tools were developed by people who also take HTML and CSS for granted, and it shows. When web developers use these tools and themselves don't know HTML or CSS, they simply carry bad habits forward, encoding them across the web.

Accessibility

Let's not forget accessibility in general. The short-sighted may roll their eyes, but only because they forget that they, too, will continue to age and will benefit from accessibility features.

It's my experience that just trying to get developers who "know HTML" to create a simple heading structure on a page is a painful process, but fold ARIA into the mix and you've blown someone's buffer. Even if we keep it simple, I challenge you to find someone who can adequately explain the difference between article and section, let alone what the accessibility implications are.

There are tutorials and frameworks dedicated to styling a button to look like a link, or a link to look like a button. These make it into frameworks, software products, and even best practice guides. The specialists without HTML or CSS knowledge are making the web harder to use for those with disabilities.

My Advice

Jeff's advice is to diversify. That is quite good advice. Ideally the more you know about assorted technologies the more quickly you can pivot when they are replaced.

My advice is a bit different. Learn the fundamentals. Learn HTML and CSS and how to best apply them. If it interests you enough to specialize, then be prepared to make your case when looking for a job.

Be the person who validates what everyone else on your team is building so they can stick with their preferred language and you can make sure it renders clean, valid HTML. Be the person who reviews frameworks and tools and can best guide decisions and what needs to be done to fix these third-party codebases.

Another Take

Rewind to Jeffrey Zeldman's article, To Hell With Bad Browsers, from February 2001. That article pretty much kicked off the web standards push, the core of the standards movement. A few days after that post, I wrote a quasi-response piece, To Hell With Bad Editors, where I took web developers and WYSIWYG editors to task.

As you can see from my rant above and my position thirteen (13!) years ago, I still don't think the web standards movement has achieved its goals. In that regard, I think Jeff Croft's piece already starts from a false assumption.

Others' Thoughts

Others have stated their opinions in the comments of the original piece, and yet others in their own posts:


Friday, November 8, 2013

Tables as Responsive Image Containers

If you've been following the latest chaos in the responsive image debate, you may know that there is a battle afoot between supporters of src-n, srcset and picture. If you don't believe me, I refer you to this WHATWG post, a polite round-up of today's bar fight. Key is that it links to multiple discussions in its round-up:

[1] http://tabatkins.github.io/specs/respimg/Overview.html
[2] https://groups.google.com/a/chromium.org/d/msg/blink-dev/tV3T1wHuXqE/SvWKxIyG6IIJ
[3] https://lists.webkit.org/pipermail/webkit-dev/2013-October/025763.html
[4] https://lists.webkit.org/pipermail/webkit-dev/2013-November/025809.html
[5] https://bugzilla.mozilla.org/show_bug.cgi?id=870021

Granted, picture has been mostly abandoned by the Responsive Images Community Group (RICG), but after today's fight it looks like it might have legs again.

All of these, however, neglect a responsive image solution that we've had since 2005: tables.

Consider the RICG logo:

RICG logo.

And consider this version encoded as a table, with none of those troublesome pixels:

[RICG logo rendered as an HTML table.]
Look, I even added transparency!

If you look through the Use Cases and Requirements for Standardizing Responsive Images, you'll see this approach satisfies none of the requirements and is made of incredibly complex code. This, in effect, guarantees employment for responsive web developers for years to come. You can make your own tabled images, though you'll want to tweak the code a bit.

I tried to make this clear earlier today:

I can only hope this example sets us all on the right path.

Update: November 14, 2013

If you came here actually looking for something useful about the status of responsive images in general, check out Mat Marquis' responsive images standings cheat-sheet, which he plans to update regularly.

Update: January 2, 2013

Other responsive image options: