Monday, October 31, 2011

HTML5 kills [time], Resurrects [u]

HTML5 logo -- I am the 'alt,' not the 'title'

The HTML5 specification as managed by both W3C and WHATWG is an unfinished, incomplete specification that can change at any time. That isn't a criticism, it's just a statement of fact. It's a fact often ignored by people and companies who choose to implement it and then cry foul when something changes.

<i> and <b>

So far HTML5 has done things like convert the deprecated (in HTML 4) <i> and <b> elements to impart stylistic meaning with specific results. Their purely stylistic effect was well known, and it is why I stopped using them years ago, instead imparting semantic meaning with <strong> and <em> and applying style via CSS. HTML5 Doctor covers this in more detail in The i, b, em, & strong elements. This caused me much consternation.

<small> and <hr>

Then they restored the <small> and <hr> elements by giving them semantic and structural meaning, repurposing yet more mis-used elements for reasons different than how people were already mis-using them. Once again, HTML5 Doctor addressed this in the post The small & hr elements.

The <img> alt Attribute

Then, in what I consider to be a profoundly anti-accessible move that can only make it easier for CMS vendors to cut corners, the requirement for an alt attribute on every <img> element was dropped under specific conditions. I know I wasn't alone in feeling this was a terrible decision when my two posts on this topic were widely distributed and still see a good deal of traffic six months later: Image alt Attributes Not Always Required in HTML5 and More on Image alt Requirement in HTML5.

And Now <u> Returns

I suppose, then, that I should not have been surprised when word came down that the <u> element was back. This is an element we blocked in our CMS, telling our clients that an underline meant text was clickable and there was no good reason to use it in regular copy. Even so, I still had a use case that I thought made sense — to indicate what character for a form field is accessible with the keyboard via the accesskey attribute. So of course the real reason for its resurrection surprised me. Let me reprint the entire entry for <u> in the WHATWG HTML5 specification:

4.6.18 The u element

The u element represents a span of text with an unarticulated, though explicitly rendered, non-textual annotation, such as labeling the text as being a proper name in Chinese text (a Chinese proper name mark), or labeling the text as being misspelt.

In most cases, another element is likely to be more appropriate: for marking stress emphasis, the em element should be used; for marking key words or phrases either the b element or the mark element should be used, depending on the context; for marking book titles, the cite element should be used; for labeling text with explicit textual annotations, the ruby element should be used; for labeling ship names in Western texts, the i element should be used.

The default rendering of the u element in visual presentations clashes with the conventional rendering of hyperlinks (underlining). Authors are encouraged to avoid using the u element where it could be confused for a hyperlink.

Proper names in Chinese and misspelled words. The u has returned and we get little detail for how to use it but plenty of detail on how not to use it.

Once again HTML5 Doctor came to the rescue and provided explanations and use cases for the zombie <u> element in the post The return of the u element. To distill it here, some examples:

  • Chinese proper name marks, such as the names of people, places, dynasties, or organizations. It's akin to a capital letter in English. Eg: 屈原放逐,乃賦離騒左丘失明,厥有國語
  • For indicating family names in Asian languages. Given that the family name often precedes the given name, this can be confusing to Western readers. This is an alternative to the capitalization we often see when translated to English. Eg: Lynn Minmay versus LYNN Minmay.
  • To indicate potential spelling errors, like you might see in MS Word or word processors and browsers with spell checkers. Eg: This is not my bootiful house.

In each of these cases you could use CSS to modify the <u> display to put a wavy line under the text (for the first and third examples) or to shift the characters to uppercase (the second example). The argument here is that the <u> underline will still render if the CSS is lost.

The specification lists the browsers that support this element, showing Internet Explorer, Firefox, Opera and WebKit. These browsers support it from its prior incarnation, however, as they have been supporting the <u> element since its presence in prior versions of HTML. Whether they support this new re-imagining is a different story. Since this new explanation is still primarily about display, the browsers may support its new meaning by accident. The wavy line CSS, however, is only in Firefox right now.

The End of <time>

Oh for f*cks sake. We may as well just 'consider replacing all HTML tags with <derp>'

Just found a bug in #html5! There is a <data> element, that basically does nothing. We got <div> and <span> for that.

Over the weekend I saw a tweet-storm from people who are closely tied to the pulse of HTML5. The <time> element was slated for destruction and quite a lot of people who have come to rely on it immediately swung into action — partly to educate those of us who aren't so intimately familiar with the element. The #occupyHTML5 hashtag on Twitter is on fire with people deriding the decision, though I defer to two experts on the subject to give a little explanation.

Ian Devlin, author of the new book HTML5 Multimedia: Develop and Design, wrote up his reaction in "On the disappearance of HTML5 <time>." To quote:

Many documents and pages that get posted on the web have some sort of timestamp attached to them. Think of every news and blog article that’s written, this very post included, that indicate somewhere when it was posted. Having a machine readable element that encapsulates this special case piece of information is very useful for both machines and humans alike to read and understand.

He notes that the new <data> element, which is intended to replace <time> as a more generic element, isn't a bad idea in itself, but is also a little late to the game.

Bruce Lawson, one of the HTML5 Doctors and one author of Introducing HTML5 (along with his role as standards evangelist at Opera), has received quite a lot of support for his post Goodbye HTML5 <time>, hello <data>! Awful Cher video notwithstanding, he raises good points:

<time> (or its precursor, <date>) has an obvious semantic (easy to learn, easy to read). Because it's restricted to dates and times, the datetime attribute has a specific syntax that can be checked by a validator. Conversely, <data value=""> has no such built-in syntax, as it's for arbitrary lumps of data, so can't be machine validated. This will lead to more erroneous dates being published. Therefore, the reliability and thus the utility of the information being communicated in machine-readable format diminishes.

He goes on to point out that Reddit, The Boston Globe, the default WordPress theme, and now parts of Drupal's core have built <time> into them.
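To make the validation point in that quote a bit more concrete, here is a rough Python sketch. It uses Python's own date.fromisoformat parser as a stand-in for the date syntax a validator would check in a datetime attribute (that is my assumption; the HTML5 syntax is not identical to what Python accepts), and both function names are made up for illustration. A date-specific value can be checked, while a free-form value like the one <data> carries has nothing to be checked against.

```python
from datetime import date


def is_valid_date(value: str) -> bool:
    """Return True if value parses as a real calendar date (e.g. 2011-10-31)."""
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def is_valid_data(value: str) -> bool:
    """A generic value has no defined syntax, so there is nothing to reject."""
    return True


if __name__ == "__main__":
    for candidate in ("2011-10-31", "October 31, 2011", "2011-02-30"):
        print(candidate, is_valid_date(candidate), is_valid_data(candidate))
```

The impossible date 2011-02-30 is caught by the date check but sails through the generic one, which is the kind of erroneous published date the quote warns about.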

A request to reverse the decision to drop <time> has already been filed. There is also a shared document for people to log their arguments for keeping <time>, ostensibly for taking back to Hixie to reconsider the decision.

The Takeaway

Given all the ongoing tweaks to the HTML5 specification by WHATWG, my opinion is that dropping the version number and making it a living document with no set versions is flawed. As developers who want to be on the bleeding edge start to integrate elements and attributes from a specification that's not fully baked, this moving target just means more refactoring and ultimately cost or delays to the end users and/or clients.

Updated November 3, 2011

The <time> element has been restored. Read more: Well, It's about <time>

In case you had not read my piece following up this one with a broader look at the whole process around removing and restoring an element, go read End of <time> Is Not Helping the Case for HTML5

Saturday, October 29, 2011

Twitter's t.co Continues UX Failure of Link Shorteners

Twitter stamp image created for Tutorial9 by Dawghouse Design Studio

It's been a few weeks since Twitter moved to its own link shortening service for tweets. Originally the shortener only kicked in for URLs over 18 characters, but Twitter recently moved to have it affect all URLs in tweets. Twitter's argument was that this allows Twitter to reduce the number of spam and phishing URLs embedded in tweets. In Twitter's own words (from the t.co site):

Twitter uses the t.co domain as part of a service to protect users from harmful activity, to provide value for the developer ecosystem, and as a quality signal for surfacing relevant, interesting Tweets.

Twitter's Reasons for the t.co Shortener

Twitter's explanation sounds reasonable but doesn't bear itself out now that I've had some time to try it out. Let me explore...

As Protection from Spam/Phishing

I still get the same number of spammers on Twitter; I did not expect a link shortener to change that. Those spammers also use link shorteners, so whether or not the t.co service came into play wouldn't matter much: the link is still obfuscated. If the t.co service was doing its job, however, then those tweets would be caught or flagged. Even if it wasn't a pro-active service (because we know people can change the destination of a link shortened by many services, rendering it malicious from an initial innocuous configuration), I would expect it to do its job when I follow a link. It doesn't. Just this morning I received a spam tweet, and for the scope of this post opted to click the link. The t.co address showed up in my browser, it spent time processing, and then sent me to the phishing site. Twitter's first claim is false.

A couple weeks ago I followed a tweeted link from a local business that fed through the t.co service. The t.co service told me that the link was to a spam or malware site. It was not. The link was to an online petition for a local issue. While I was motivated to just grab the original link from the tweet, there's no way to tell how many others may have had the same experience and were unable to weigh in on the petition. The risk here is that t.co can also produce false positives, damaging anyone's reliance on Twitter as a link dissemination tool.

As Value for the Developer Ecosystem

Let's be clear here — as a user I don't care how much easier it is for developers. I don't let my web team just throw a bunch of fields on a web form without regard to the end user, no matter how much more quickly they can do it. But Twitter has a model that is less about the end user and more about driving organizations to rely on Twitter through its API and reporting features. The t.co shortener provides a boon to Twitter because it makes it easier for web masters to see how much traffic came to their site from a Twitter-shortened link.

Sometimes that t.co link isn't from Twitter (it's been shared elsewhere such as a blog, though originally from a tweet). Sometimes the same web page address has more than one t.co address. Sometimes that t.co link is to a bit.ly (or other shortener) link, which are often created for use on Twitter anyway.

As a Quality Signal for Relevant or Interesting Tweets

Given that I have no confidence in Twitter's ability to filter malware, spam and phishing sites, I certainly cannot believe that quality is an appropriate word. Just seeing the t.co address in a tweet doesn't tell me that it is relevant or interesting. Twitter may decide a link or tweet is relevant or interesting simply by measuring how many clicks it gets. In the absence of a clear explanation, those two metrics are also suspect.

Other Factors

Twitter Clients

I use TweetDeck on my computers and Seesmic on my phone. I have used Hootsuite and sometimes I use Twitterfall for a Twitter wall at events. They all display the t.co address instead of the full address underneath. These apps don't update at the same pace as the Twitter web site, and not all users will update their Twitter clients frequently (whether because of corporate IT policies or lack of interest), even once those clients do support expanding the t.co addresses.

This means I regularly see a t.co address. This wouldn't be an issue except I rely on the URL to know what will happen when I click a link: youtu.be means my Twitter client will play a YouTube video, twitpic.com means my client will show a picture, and so on. That link scent is now gone and I click fewer links as an end user because I don't know what I will get.

Twitter-Provided Tweet Streams

Screen shot of my Twitter stream with t.co shorteners both from Twitter RSS and Twitter JavaScript widget.

I use the Twitter-provided JavaScript code to embed a Twitter feed on my personal site. This does not expand the t.co URLs to show the full address. I push the Twitter-provided RSS feed of my tweets to my blog. This does not expand the t.co URLs to show the full address. Tweets pushed to Facebook or other services come with the t.co, again hiding the full address and link scent.

Extra Bandwidth Burden

At peak times I have found a link that goes through t.co takes longer to redirect me to my destination address. Often that destination address is being hit by only a few users, maybe a few thousand. The destination site can typically handle the traffic. The t.co service is taking the brunt of all the traffic from all the users on Twitter. This increases the bandwidth used across the web (and on my phone) and results in a longer click-to-destination time.

Copy/Paste Hassle

This may sound like a minor issue, but I regularly copy an entire tweet or just the URL from a tweet, often wanting to share via email, in my blog, or elsewhere. When I do that I typically get the t.co link, and I think it's obvious by now that I do not want that. If this affects me, then others who may do the same but not know how to tease the expanded URL out of a tweet could end up pushing traffic to my site from a blog that reports itself as a t.co referrer. In my reporting I will now be unable to distinguish traffic from a link and from a source that I may want to otherwise engage.

Blocking

I have read from a few sources that t.co is blocked in China. Given Twitter's prominence in recent events such as the Arab Spring, London riots, and now even Occupy Wall Street, creating a single point of failure with t.co means not only are the links blocked, but the expanded link may be blocked easily by any organization or government that wants to quell activity on Twitter.

My Former Reliance

I use a photo sharing service that pushes a link to the photo to Twitter. I used to take the RSS feed, along with the geolocation of the tweets, and pull the URL from the photo service to quickly hack up into a path to the thumbnail. I would then embed this modified RSS feed into a Google Map to show my activities and travels — most recently for a trip to Italy.

Because I was not using the Twitter API I could not be considered a developer, so I don't fall into Twitter's stated support for developers. As such, when the RSS feed from Twitter converted the URL from my photo sharing service my maps didn't display images and my followers stopped clicking links to my photos.

I also regularly craft URLs in tweets to remove the query string nonsense and unnecessary "www" prefixes (among other bits). I also regularly craft a tweet with the intent to make the URL visible to the end user because the address often feeds into my point or joke. Instead I find I exclude the "http://" from the address to get my point across, but it also means the link is not clickable for many users depending on their Twitter client.

A common annoyance is that Twitter now encodes URLs that are already far shorter than the Twitter-encoded t.co address. My own tweets have seen URLs double in character count after Twitter applies its link shortener.
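The arithmetic behind that annoyance is simple enough to sketch in Python. I have not stated how long a wrapped t.co link is, so the 20-character figure below is purely an assumed placeholder, and the function names and sample addresses are hypothetical.

```python
# Assumed for illustration only: the post does not give the length of a
# wrapped t.co link, so treat this number as a placeholder.
ASSUMED_WRAPPED_LENGTH = 20


def length_change(url: str, wrapped_length: int = ASSUMED_WRAPPED_LENGTH) -> int:
    """Characters a tweet gains (positive) or saves (negative) after wrapping."""
    return wrapped_length - len(url)


def growth_ratio(url: str, wrapped_length: int = ASSUMED_WRAPPED_LENGTH) -> float:
    """How many times longer the wrapped link is than the address as typed."""
    return wrapped_length / len(url)


if __name__ == "__main__":
    samples = (
        "evolt.org",  # a short address typed without "http://"
        "http://example.com/a/very/long/path?with=query&strings=1",
    )
    for url in samples:
        print(url, length_change(url), round(growth_ratio(url), 2))
```

Under that assumed length, wrapping only saves characters for long addresses; a short, hand-trimmed address more than doubles in size.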

Why Did Twitter Do This?

Three key reasons that I can contrive:

  1. Twitter owes much of its success to developers building apps and integrating it into other services. Shortening URLs reduces the effort an end user has to make in a third-party tool to stay under the 140-character limit.
  2. Twitter's importance as a driver of web site traffic is reinforced when webmasters see the t.co links in their logs.
  3. All the t.co links track information, which puts Twitter in a position to monetize the data it captures from each shortening and each click.

For all the reasons I state above, Twitter isn't really helping the end user (either content consumer or non-developer). Twitter's goals here are more for its own gains. In the end, since it's a free service they have the right to do that. I would certainly appreciate a more direct and honest explanation and a consistent implementation across its API and own services (RSS, JavaScript widget).

Sadly Twitter has continued to set the mark for other developers and, like its infinite scroll and other user annoyances, it will continue to enable developers to make poor decisions that are counter to a good user experience.

Thursday, October 13, 2011

More Samples of Responsive Web Design ≠ Print

When the guy who coined the term "Responsive Web Design," who has written a book about it and is well regarded throughout the industry, is asked to name his 20 favorite responsive sites, you should expect top-notch examples of sites that use CSS to respond to nearly any medium.

Except that isn't the case. Given that I wrote a piece for evolt.org about a week before his interview, I think I am within my rights to call attention to how these sites are not responsive insofar as they do not adapt to the printed page — a capability that has existed for years prior to the CSS media queries we are swooning over for responsive web design.

I am taking the first five sites listed in the article (which he says are not in any particular order, but I don't want to be accused of cherry picking and I don't have time to do all twenty) and presenting the printed versions of each as black and white PDF files (I figure not everyone has a color printer, unlike the assumption I clearly made in the evolt.org article). Some might suggest that if I want to make it in this industry* I should not attack someone so prominent, but I want to be clear this isn't an attack on Ethan Marcotte, this is a recurring oversight by nearly everyone purporting to make sites that are responsive.

Sample Sites

Elliot Jay Stocks

Screen shot of PDF file.

I wouldn't print the home page, but I might print the About page to include in my presentation to a boss or client about potential vendors. The navigation and content on the right could go away, and better margins can clean this up.

Ampersand Conference

Screen shot of PDF file.

Having attended my share of conferences, and knowing internet access is spotty and my battery can run dry quickly, I always print a schedule to carry with me. Boy would that be a mistake with this site. Of the six pages, only two have the schedule, and those could easily fit on one page. The white-on-white doesn't help much, either.

New Adventures in Web Design 2012

Screen shot of PDF file.

Another conference site; this time Mr. Marcotte sings the praises of the conference schedule. Owing to my need to print, I cannot agree with his praise. Three pages of sponsor logos, nearly a page of branding and navigation, and the remainder for the speakers? Couldn't the necessary bits all fit on one page?

SimpleBits

Screen shot of PDF file.

I've met Dan Cederholm and presented at a conference with him (not the same presentation) in Toronto a few years back. He's a good egg and so it hurts a little to list his site, but his About page, while full of valuable content, just doesn't seem to adapt to print. I know he uses bullet lists for a lot of content, we all do, but at least remove the bullets from the pictures.

Made By Hand

Screen shot of PDF file.

Printing the home page would be unfair — it's a site about a video, and so there is a video. Instead I visited the About page to learn about the project, which spends nearly as much space talking about the site's typefaces as it does the project, but makes no effort to adjust margins, hide navigation, remove the form, or otherwise adapt to the printed page.

How to Be Better Than This

Make print styles.

Now go read my original article at evolt.org: Print Styles Forgotten by Responsive Web Developers


* I should probably qualify that I have made it in this industry (despite my attempt at humor above). I am a co-founder of evolt.org, created the evolt.org browser archive, have co-written four web development books and edited another, am referenced in other books, am cited in best practices references and college courses, and have been running a business for 13+ years. I also make sure all the sites we build get print styles because it's just daft not to.

Tuesday, October 11, 2011

Detecting Mobile Devices — Don't Bother

Image of mobile phone showing this site.

Since I started working on the web (and was slowly coaxed to the world of Netscape from Mosaic and HotJava), clients have asked me to find ways to adjust how a page behaves based on what browser the end user has. Before campaigns like the Web Standards Project (WaSP) took hold and slowly convinced web developers, and by extension clients, that the right approach is to build for standards first, web developers struggled with everything from clunky JavaScript user agent sniffers to server-side components like the browscap.ini file for IIS. These all took time to maintain and were never 100% effective.

I am thrilled we've gotten to the point in the web where progressive enhancement is in vogue, finally falling in line with our own practices of the last decade or so. With the advent of mobile devices and plunging screen resolutions, we have support in the form of CSS media queries to adapt a single page to multiple devices, now referred to as responsive web design. Yes, we are still struggling with the best practices and design differences (such as forgetting print styles), but the overall concept is solid. No longer must you code a text-only page, a mobile page, a printable page, and a regular page (or the templates for each if you are using a web content management system). You can build one page and let it handle all those scenarios.

Except sometimes you find yourself in a situation where you have been asked to develop a different experience for a mobile user that lies outside the ideal of responsive sites. That different experience can be as simple as sending a user to a different page on the site if he or she is surfing on a mobile device. All those years of progress are swept away in one moment and we are back to struggling with user agents. I'd like to provide a little context on why such a simple-sounding request can be such an effort to implement.

Techniques?

If we fall back to user agent sniffing (reading the browser's User Agent as it reports to the server), then we have an uphill battle. Just finding a comprehensive list is an effort. One site lists hundreds of user agent strings, and there is even a SourceForge project dedicated to staying on top of them all. When you consider how many different phones and browsers there are, and how often new ones come out (such as Amazon Silk), your clients need to understand that this approach is doomed to failure without ongoing updates (and fees).

If all you do is follow Google's advice on its Webmaster Central Blog to simply look for the word "mobile" in the string, you'll fail immediately — user agents on Android devices do not need to conform (and often don't) to what Google says you will find. Opera doesn't include "mobile" in its user agent (Opera/9.80 (Android 2.3.3; Linux; Opera Mobi/ADR-1109081720; U; en) Presto/2.8.149 Version/11.10), and the browser Dolphin doesn't even include its name in the user agent string (Mozilla/5.0 (Linux; U; Android 2.3.3; en-us; PC36100 Build/GRI40) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1 ).
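Here is a quick Python sketch of that failure, using the two user agent strings quoted above. The function names are my own and the check is deliberately the naive one: a case-insensitive search for the word "mobile," which is how I am reading that advice for this example.

```python
# The two user agent strings quoted in the paragraph above.
OPERA_MOBILE = ("Opera/9.80 (Android 2.3.3; Linux; Opera Mobi/ADR-1109081720; "
                "U; en) Presto/2.8.149 Version/11.10")
DOLPHIN = ("Mozilla/5.0 (Linux; U; Android 2.3.3; en-us; PC36100 Build/GRI40) "
           "AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1")


def looks_mobile(user_agent: str) -> bool:
    """Naive sniff: does the string contain the word "mobile" in any case?"""
    return "mobile" in user_agent.lower()


def names_browser(user_agent: str, browser: str) -> bool:
    """Does the string mention the given browser name at all?"""
    return browser.lower() in user_agent.lower()


if __name__ == "__main__":
    print("Opera flagged as mobile:", looks_mobile(OPERA_MOBILE))
    print("Dolphin flagged as mobile:", looks_mobile(DOLPHIN))
    print("Dolphin names itself:", names_browser(DOLPHIN, "Dolphin"))
```

Opera on Android says "Mobi," not "mobile," so the naive check misses a phone browser entirely, and nothing in the Dolphin string would let you single that browser out.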

You can take the inverse approach and instead detect for desktop browsers. It's smart and simple as far as user agent sniffing goes, but still falls prey to the same problem of the constantly changing landscape of browsers. Given that the next version of Windows is intended to quickly switch its interface back and forth between desktop and mobile (keyboard and touch), unless the user agent for all the browsers installed on that device changes as the user changes the device orientation, that technique is also doomed.

Serving different content based on screen resolution gets you around the user agent sniffing, but isn't any more effective. With tablets approaching desktop screen resolution, and smartphone resolution approaching tablet resolution, there is no clear method for determining what kind of device a user has. An iPhone 4S held horizontally has 960 pixels of resolution and the Dell Streak tablet has 800 pixels (to clarify, the smaller device has more pixels, which is contrary to what most might expect). If you want a tablet to have a different experience than a phone, then serving it based on screen resolution won't do it. As it is, the resolution of many tablets matches that of my netbook (1,024 x 600), which is definitely not the same type of device (it has a keyboard, for example).
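The resolution problem can be shown with a small Python sketch built on the pixel counts above. The rule it tests ("wider means tablet") and the function names are my own simplification, and the 480-pixel phone in the usage note is a hypothetical older handset, not a device from this post.

```python
# Pixel widths quoted in the paragraph above.
DEVICES = {
    "iPhone 4S (phone)": 960,
    "Dell Streak (tablet)": 800,
    "netbook": 1024,
}


def classify_by_width(width: int, tablet_min: int) -> str:
    """Label a device "tablet" at or above tablet_min pixels, else "phone"."""
    return "tablet" if width >= tablet_min else "phone"


def threshold_that_separates(phone_width: int, tablet_width: int):
    """Return the lowest cutoff that labels both devices correctly, or None."""
    for cutoff in range(0, max(phone_width, tablet_width) + 2):
        if (classify_by_width(phone_width, cutoff) == "phone"
                and classify_by_width(tablet_width, cutoff) == "tablet"):
            return cutoff
    return None


if __name__ == "__main__":
    phone = DEVICES["iPhone 4S (phone)"]
    tablet = DEVICES["Dell Streak (tablet)"]
    print("Usable cutoff:", threshold_that_separates(phone, tablet))
```

With a 480-pixel phone a cutoff exists, but once the phone has more pixels than the tablet no cutoff can label both correctly, and the netbook lands in the tablet bucket either way.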

What To Do?

Try to solve the objective earlier in the overall process — generate a different URL for mobile, embed it in different QR codes, look into feature detection, look at using CSS media queries to display or hide alternate content, and so on. Every case may require a different solution, but falling back to methods that were never reliable certainly isn't the right default approach.

Update: November 13, 2013

I'm just going to leave this link here for you to read at your leisure: Internet Explorer 11’s Many User-Agent Strings

Tuesday, October 4, 2011

Print Styles Forgotten by Responsive Web Developers (at evolt.org)

Image of a printed web page with a QR code.

As web browsing technology continues to change at a rapid pace, budgets to update web sites for these changes often don't match that same pace. Responsive web design has become the de facto answer to preemptively adapt sites to this constant shift, typically relying on CSS3 media queries to do the bulk of the work. CSS3 media queries allow web browsers to choose stylesheets, and as a result, layouts, that fit the display resolution of the current device.

Web developers and designers brag about their ability to craft designs that work across platforms, some even integrating transitions that only they can see when they change their window sizes. Sites have sprung up to catalog these achievements and article after article expounds the benefits and wrong-mindedness of any other approach. I tend to agree.

I am surprised, however, at the utter disregard for a media format that has existed before the web — the printed page.

Read the full article at evolt.org

The rest of this article was just published at evolt.org. Go read Print Styles Forgotten by Responsive Web Developers and post your comments or thoughts, which you can also do below.