Saturday, February 26, 2011

Don't Choose Between Mobile Web and Mobile Apps

Image of mobile phone showing this site.

When Adobe released InDesign, it included a feature that was novel to the average agency: the ability to import content into its page templates from XML data. Having developed a web content management system (QuantumCMS, for those of you interested in hiring us), we had selected XML as an output option for content, allowing us to deliver that content into any medium that could support XML (such as web pages via XSLT). The idea we proposed to our agency clients was simple: author your content in one place, one data store, and push it out to the web, print, and other media.

None of them got it. Fear of the technology and comfort with an existing platform made it impossible for them to step back and evaluate whether there was a good business case in play. This approach (or lack thereof) is not limited to agencies. Unskilled web developers spent years building text-only versions of sites, feeling terribly proud of themselves when they realized they could wrap different templates around the same content instead of maintaining duplicate sites. Even now in 2011 I still see web developers take advantage of clients by selling the text-only site as an add-on service.

We're in the same place in the world of mobile. Organizations and developers (clients and vendors) are struggling with the right way to deploy their products to the masses (customers, end users). For reasons perhaps grounded in technology assumptions (preferences, fears, lack of understanding) they tend to look at two options for mobile — apps and web. Assuming that there are only two options is the first mistake they make.

With the recent marketing push behind HTML5, CSS3, and new APIs (such as geolocation), coupled with the most compliant browsers we've seen in years appearing on mobile in far greater percentages than on the desktop, we have a viable platform on which to develop web-based experiences that are functional and effective. The idea that a developer can build one web-based application and deploy it to the web and to iPhone, Android, Palm, BlackBerry and Windows Mobile devices should be a compelling reason to consider that a starting point.
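As a rough sketch of what that platform offers, here is how a page might ask for the visitor's location through the geolocation API mentioned above; the fallback and loader function names are hypothetical stand-ins for whatever your application would do with (or without) coordinates:

    // Minimal sketch: request the visitor's position only if the
    // geolocation API exists, and degrade gracefully if it doesn't.
    function showNearbyContent() {
      if (!navigator.geolocation) {
        showManualLocationForm(); // hypothetical fallback
        return;
      }
      navigator.geolocation.getCurrentPosition(function (position) {
        var lat = position.coords.latitude;
        var lon = position.coords.longitude;
        loadLocationsNear(lat, lon); // hypothetical application function
      }, function () {
        // The visitor declined or the lookup failed.
        showManualLocationForm(); // hypothetical fallback
      });
    }

The same script runs unchanged on a desktop browser that supports the API, which is the whole point of starting from the web.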

There are many things, however, that we cannot yet do via the web, such as access your mobile phone's camera, or take advantage of more robust touch-based interfaces. These are features best utilized directly through the mobile device, ideally via its API.

It is possible to develop an app for a mobile phone that is nothing more than an embedded web browser (by instantiating the platform's own browser control) along with elements that access the phone's hardware features. You risk the loss of a more robust user interface for your web-based content, but you gain the benefit of developing simpler apps for each platform while retaining control over your core service and content in your own hosted environment.

With the new payment models shaking out in the mobile apps market, the web-first approach might be a more appropriate way to build your business model. Given the uproar over Apple's recent requirement that subscription-based app developers offer subscriptions through the Apple App Store in exchange for a hefty 30% cut (and not offer them any cheaper on their own sites), relying on the mobile device itself to deliver your content suddenly becomes a more costly proposition. Instead, a mobile-friendly site can be sufficient to handle sign-ins and subscriptions, and it carries over to other devices (iPad, Android) and platforms (desktop, tablets).

Certainly not everything can be deployed via the web. A music service, for example, is tough to do even with excellent support for the unfinished HTML5 audio element. But looking at a web-first or hybrid approach allows you to reduce your development and deployment costs as you support fewer platforms, and share your content with nearly any web-enabled device.

Related

Posts on this blog that help lay some of the groundwork for these thoughts. There is so much content out there about the Apple App Store rules that, while I had planned to write a post about it on my own, I felt that adding to the noise wouldn't help. Instead, here are some of the links I cultivated for the post I never wrote.

Update: March 11, 2011

If you read An Open Letter to Apple from Readability, then you know that it submitted an app to the Apple App Store that ended up in limbo. The issue came down to the Readability business model (70% of subscriptions go to writers) bumping up against Apple's new store rules (30% to Apple). Even though Readability already had a mobile version of its site, on Wednesday it released updated apps for multiple devices that are really just wrappers for web content. In the end, Readability can bypass the Apple App Store entirely if it can deploy everything via the web, removing the need for an in-app purchase that pushes 30% directly to Apple. Yesterday ZDNet covered this in the article Readability goes HTML 5 on iOS, expect others to follow, which suggests others are starting to see the viability of this business and technology model, partly spurred by Apple's new pricing rules.

Thursday, February 24, 2011

W3C Starts Mobile Web App Standards Roadmap

Illustration showing the Web as an application development platform.

The W3C has taken a step toward gathering all the information relevant to the mobile web, as authored by its disparate working groups, under one umbrella as a singular (recurring) reference source: Standards for Web Applications on Mobile: February 2011 current state and roadmap (read the accompanying W3C blog post).

You may recall when the W3C released its Mobile Web Application Best Practices "cards" with tips and techniques for web developers diving into mobile. You may also recall that it was very focused on mobile and didn't address how those best practices are handled via the standard desktop browser (such as the tel: URI scheme).
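As a reminder of the sort of thing those cards covered, the tel: URI scheme is just an ordinary link; the open question is what a desktop browser with no dialer should do with it (the number below is a placeholder):

    <!-- A mobile browser can hand this to the dialer; a desktop
         browser may have nothing useful to do with it. -->
    <a href="tel:+1-555-555-0100">Call us at +1 555 555 0100</a>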

The new roadmap, however, promises to be a bit different. Instead of being pushed out by the monolithic W3C, one person has made the effort to take on this responsibility and to state clearly that he can't possibly get it all right all the time:

...[T]he data in this report have not received wide-review and should be used with caution. Feedback on every aspect of this document should be sent to the author (dom@w3.org) and will serve as input for the next iteration of the document.

The document outlines a series of technology categories that apply to the web, including the relevant specifications for each. These categories are:

  • Graphics, which includes SVG, CSS, WOFF and other acronyms.
  • Multimedia, such as audio and video.
  • Forms, including the new input types and attributes like pattern and placeholder (a short example follows this list).
  • User interactions such as touch and speech events and even a vibration API (no specs, just working groups).
  • Data storage such as the different file APIs and local vs. web storage.
  • Sensors and hardware integration which will lean on the geolocation API and eventually APIs for using your phone's camera and microphone.
  • Network, which covers XMLHttpRequest (level 1 and 2) and the WebSocket API.
  • Communication using the messaging API to allow SMS and emails from web apps.
  • Packaging, consisting of HTML5's ApplicationCache and W3C Widgets.
  • Performance & Optimization such as the Mobile Web Application Best Practices along with support timing and threading.

Each of these items includes a table that outlines the appropriate specification, working group, maturity, stability, draft status, current implementations and test suites — if any — for each feature.

When it's all gathered in one place for us to review, it's a pretty compelling list of features in the pipe, even if we all know it will be quite some time before most of them shake out. I look at this roadmap not just as a reference source, but as a path to shedding the costs and limitations imposed by building custom apps for each device (iOS, Android, desktop, television, etc.).


Monday, February 21, 2011

WebM, H.264 Debate Still Going

Terrible illustration of Chrome dropping H.264.

On February 2, Microsoft released a plug-in for Chrome on Windows 7 to allow users to play H.264 video directly in Chrome. In addition, Microsoft has said that it will support WebM (VP8) when a user has the codec installed. And so began the fragmentation of the HTML video model, back to relying on plug-ins to support what was otherwise intended to be supported natively within browsers.
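For context, the markup at the center of this fight looks something like the sketch below: the browser plays the first source it can decode natively, and anything else falls through to a plug-in or a plain download link (the file names are placeholders):

    <!-- The HTML video element offering both contested codecs.
         A browser picks the first source it supports; the content
         inside the element is the fallback for everything else. -->
    <video controls width="640" height="360">
      <source src="clip.webm" type='video/webm; codecs="vp8, vorbis"'>
      <source src="clip.mp4" type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'>
      <p>No native playback available; <a href="clip.mp4">download the video</a>
         or use a plug-in.</p>
    </video>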

Microsoft went on to ask three broad questions in a separate post (HTML5 and Web Video: Questions for the Industry from the Community), taking Google to task for what Microsoft considers inconsistent application of its own patent concerns and openness (emphasis Microsoft's):

  1. Who bears the liability and risk for consumers, businesses, and developers until the legal system resolves the intellectual property issues;
  2. When and how does Google make room for the Open Web Standards community to engage genuinely;
  3. What is the plan for restoring consistency across devices, Web services, and the PC.

The same day Microsoft was announcing its plug-in approach to the WebM and H.264 battle, the post On WebM again: freedom, quality, patents came out addressing what its author felt were the five most common issues raised with WebM (which I have paraphrased):

  1. Quality: the argument here is that quality is a function of the encoder, and that WebM can match H.264;
  2. Patent Risk: examining the 164 unexpired U.S. patents used in a single encoder, he finds that 126 of them are there for the encoder's H.264 support; the remaining (used by WebM) are in a library released by Google.
  3. Not open enough: there is little argument here, given that it's currently in Google's hands to manage and develop.
  4. H.264 is not so encumbered: but only for non-commercial use for freely-distributed web video.
  5. Google provides no protection from infringing patents: nor does MPEG-LA.

Changing the Nature of the Battle

On February 11, the post MPEG LA puts Google's WebM video format VP8 under patent scrutiny outlined how MPEG-LA, the licensing entity for multimedia codecs such as H.264, has put out a call for patents related to VP8, the underlying technology in WebM. The deadline for submissions is March 18, less than a month away as of this writing. From there, MPEG-LA will create a patent pool of contributing patent holders for any items deemed essential to the codec. That patent pool can then be used to negotiate licensing. In short, VP8/WebM could soon be more patent encumbered than it has been. This puts Google on the defensive, as it will have to show that any such patents are either invalid or not infringed.

The author of that post followed up on the 14th at The Guardian with Royalty-free MPEG video codec ups the ante for Google's WebM/VP8. In case that title isn't clear enough, a new royalty-free video standard may be in the pipes. MPEG, a standards body separate from MPEG-LA (the licensing body), has called for proposals toward a royalty-free MPEG video coding standard. One of the goals is to make this new standard comparable to the baseline profile used in H.264.

If this pans out, it puts another barrier in front of the WebM offering from Google, namely that for WebM to be adopted it will have to best any new royalty-free MPEG codec. Three things could bring WebM (VP8) down:

  1. If the MPEG-LA call for patents related to VP8 nets some patents, MPEG-LA will form a patent pool to push for licensing agreements, which Google will have to fight at each step.
  2. If MPEG can genuinely develop a royalty-free video coding standard, it can beat WebM either on the patent front or by forcing WebM to prove itself technically superior.
  3. Assuming WebM can get past the first two items, it's still back where it started — in a battle for adoption and endorsement against the already entrenched H.264 standard.

Real-World Needs

Considering Google is the company that delivers so much of the video viewed on the web via YouTube, it makes sense that Google presumes it can take the lead on video codec standards. Netflix, however, has its entire business model built around video, which is now moving inexorably to the web. Netflix commented back in December (HTML5 and Video Streaming) that it needs some things sorted out before it can even rely on HTML5 and video to deliver its content (the first 6 of which it has resolved through its own proprietary technology):

  1. The acceptable A/V container formats (e.g. Fragmented MP4, WebM, etc.);
  2. The acceptable audio and video codecs (e.g. H.264, VP8, AAC, etc.);
  3. The streaming protocol (e.g. HTTP, RTP, etc.);
  4. A way for the streaming protocol to adapt to available bandwidth;
  5. A way of conveying information about available streams and other parameters to the streaming player module;
  6. A way of supporting protected content (e.g. with DRM systems);
  7. A way of exposing all this functionality into HTML5.

It's clear that the web video debate extends far beyond the academics of HTML5. As long as issues related to patents and licensing are unresolved, or perceived as unresolved, we will see more proprietary solutions gain more ground, further balkanizing the future of video on the web.

If you think this doesn't affect the end user, all that time and effort put into creating proprietary solutions ultimately costs you as the consumer in the form of increased fees for your content. Granted, it will take some time for browsers to catch up with whatever codec is selected, and yet more time for users to catch up with the supporting browsers, but the longer this debate continues, the longer before we can even start that long road of user adoption.

Possibly the best outcome we can hope for is that this battle results in a royalty-free MPEG standard, nullifying many of the arguments against both H.264 and WebM.


Wednesday, February 9, 2011

Beyond Hash-Bangs: Reliance on JavaScript Is a Bad Idea

Graph of percent of users with JavaScript disabled.

In November I wrote up a post (How Many Users Support JavaScript?) outlining the process and results from Yahoo's study about how many users have JavaScript disabled (How many users have JavaScript disabled? and Followup: How many users have JavaScript disabled?).

The Numbers

In those articles, Yahoo stated that even the meager percentage it found corresponds to 20-40 million users across the internet. At 2009 census numbers, that's the entire population of New York State on the low end, or more than the entire population of California on the high end. That tiny percentage is no small number of users.

Before you think that all those users are sitting in parts of the world that don't care about your product or service, Yahoo itself fields 6 million visits per month from users without JavaScript (whether it's disabled or the browser doesn't support it). That easily justifies Yahoo's work to make the home page accessible to everyone.
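For what it's worth, measuring this doesn't take much. A common approach (a sketch of the general technique, not necessarily Yahoo's exact implementation, with a placeholder beacon URL) is to request one beacon from plain HTML and a second from script, then compare the counts on the server:

    <!-- Requested by every visitor whose browser loads images. -->
    <img src="/beacon.gif?js=any" alt="" width="1" height="1">

    <script>
      // Requested only when JavaScript actually runs. The gap between
      // the two counts approximates visitors without working JavaScript.
      document.write('<img src="/beacon.gif?js=yes" alt="" width="1" height="1">');
    </script>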

Gawker and Twitter

In a very recent example of what an over-reliance on JavaScript can cause, on Monday the sites under the Gawker umbrella (Gizmodo, Lifehacker, io9, and Gawker itself) all failed (Gawker Outage Causing Twitter Stir). And not in the typical way we're used to seeing, such as server 500 errors, timeouts, and other artifacts of generating too much traffic or pushing the wrong button. In this case the failure was a function of a complete reliance on JavaScript to render the page. The sites are back up and functional, unless of course you have JavaScript disabled (this screen shot is from today):

Lifehacker.com home page with JavaScript disabled.

The Gawker sites aren't the only culprits. Twitter is another example. When I attempt to link to a tweet from last night about these issues, I can't access the page at all with JavaScript disabled (turn off JavaScript and try it yourself). I just end up at the Twitter home page. Some other sites rely on so much JavaScript on top of the HTML that the page is constantly reflowing. One culprit is Mashable.com, requiring me to wait patiently while the page finishes drawing before I risk any interaction with it for fear of an errant click.

How JavaScript Is Breaking the Web

Last night I came across a post, Breaking the Web with hash-bangs, which specifically references the Gawker downtime along with Twitter's confounding page addresses. You might recognize the site (or author) because it's the very same site that took issue with Yahoo's methodology for reporting on users with JavaScript disabled (Disabling JavaScript: Asking the wrong question). In that post he outlines other ways a visitor can end up without working JavaScript:

  • The site serving the JavaScript file might be unreachable, a risk when relying on JS libraries hosted on third-party sites.
  • The URL for the JavaScript file needs to be correct.
  • The connectivity to the JavaScript file must exist, which means network lag, DNS issues, bad jumps, etc. can all break that path.
  • An intermediary server passing that JavaScript file along could decide it is spam or malware, based on its own rules, and just purge the contents of the file.
  • Other policies at the destination may strip the file or block it in some way.

In last night's post he draws attention to the new reliance on page addresses that really only exist as fragment identifiers, which are then parsed by JavaScript to return a particular piece of content. What this means is that in order to see the page, you cannot be among the 20-40 million users without JavaScript support, and you have to make it past all five of the items I outline above. On top of that, the JavaScript cannot contain any errors. Some examples of JavaScript-related errors (from the article; I sketch the hash-bang pattern itself just after this list):

  • JavaScript fails to load led to a 5 hour outage on all Gawker media properties on Monday. (Yes, Sproutcore and Cappucino fans, empty divs are not an appropriate fallback.)
  • A trailing comma in an array or object literal will cause a JavaScript error in Internet Explorer - for Gawker, this will translate into a complete site-outage for IE users
  • A debugging console.log line accidentally left in the source will cause Gawker’s site to fail when the visitor’s browser doesn’t have the developer tools installed and enabled (Firefox, Safari, Internet Explorer)
  • Adverts regularly trip up with errors. So Gawker’s site availability is completely within the hands of advert JavaScript. Experienced web-developers know that Javascript from advertisers are the worst lumps of code out there on the web.
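To see why this is so fragile, here is roughly what the hash-bang pattern asks the browser to do. The server only ever sees the bare address; everything after the #! exists solely for a script to interpret (a simplified sketch with a hypothetical content endpoint and element id, not Gawker's or Twitter's actual code):

    // Simplified hash-bang routing. For a URL like
    // http://example.com/#!/2011/02/some-article the server receives a
    // request for http://example.com/ only -- the fragment never leaves
    // the browser.
    function loadFromHashBang() {
      var hash = window.location.hash;          // "#!/2011/02/some-article"
      if (hash.indexOf('#!') !== 0) { return; } // no hash-bang, nothing to do

      var path = hash.substring(2);             // "/2011/02/some-article"
      var xhr = new XMLHttpRequest();
      xhr.open('GET', '/content' + path, true); // hypothetical endpoint
      xhr.onreadystatechange = function () {
        if (xhr.readyState === 4 && xhr.status === 200) {
          // If this script never loads, or throws before this point,
          // the visitor is left staring at an empty shell of a page.
          document.getElementById('main').innerHTML = xhr.responseText;
        }
      };
      xhr.send();
    }

    window.onhashchange = loadFromHashBang;
    loadFromHashBang();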

I feel so strongly about how poorly Gawker implemented its new sites, about how Twitter relies on the same approach, and about how it all relies on a hack in Google's spidering technology just to make it into the search engines, that you really do need to go read this post, to which I am linking again: Breaking the Web with hash-bangs.

For years I have tried to talk people out of relying on JavaScript for everything. I usually end up trying to explain the concept to someone who thinks it takes more time to build a site that doesn't rely on JavaScript. Which is totally wrong. If all your form validation is done via JavaScript, for example, then you are really putting yourself at risk (do you sniff for SQL injection attacks solely via JavaScript?).

JavaScript is a method to enhance your site (progressive enhancement), not replace features you should be building on the server and in your HTML. Relying on the end user's browser (and there are so many variations) to execute your JavaScript without generating an error (which all too often brings everything to a halt) just isn't a very resilient approach to development. On top of that, how many developers really know how to build support for WCAG 2.0 into their JavaScript?
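As a sketch of what I mean by enhancement rather than replacement: the form below works with nothing but HTML and a server-side handler (which does the real validation, including any injection checks), and the script merely saves a round trip when it happens to run. The URL and field names are placeholders:

    <!-- Works everywhere: the server behind /contact validates for real. -->
    <form id="contact" action="/contact" method="post">
      <label>Email <input type="email" name="email" required></label>
      <input type="submit" value="Send">
    </form>

    <script>
      // Progressive enhancement: if this never runs, nothing is lost.
      var form = document.getElementById('contact');
      if (form && form.addEventListener) {
        form.addEventListener('submit', function (event) {
          if (form.elements.email.value.indexOf('@') === -1) {
            alert('Please enter a valid email address.');
            event.preventDefault(); // a courtesy check, not the real one
          }
        }, false);
      }
    </script>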

Without JavaScript there are a few sites on the web I cannot use at all, owing to their shoddy implementation practices and poor business decisions. On the bright side, surfing sites like Mashable without JavaScript means I don't see any ads and the pages render dramatically faster. However, even with JavaScript enabled, poor scripting techniques still spare me the sight of those ads:

Broken JavaScript in Mashable ad banner.

I would be willing to bet that this simple JavaScript error would have brought any of the Gawker family of sites down. And if the JavaScript is passed in from the ad service, the only recourse is to cut the ad altogether and lose the revenue.

Even at that most basic level you can see how poor a decision JavaScript reliance is.

Updates: February 11, 2011

In the post Hash, Bang, Wallop the author provides some alternate views and arguments on how the hash-bang approach isn't necessarily the wrong way to do things. His argument is compelling, until the end when he wraps up discussing users without JavaScript and users without quality internet access. His statement that JavaScript errors breaking execution can be almost entirely avoided with good development practices presumes that end-users have any control and that the amount of errors we see now is an aberration — something my 15+ years of experience tells me isn't true. He also talks about poor connections by saying that there are people daring to live somewhere in the world where packet loss is a problem. I somehow suspect that Twitter users in Iran or Egypt might disagree that they dared to live there, or had a choice.

Two more articles railing against the reliance on hash-bang URLs, AJAX and JavaScript:

And for a little self-congratulation, this post was referenced in this week's Web Design Update, something I have read for nearly its entire existence and hold in rather high regard. If you consider yourself a web developer and aren't on this mailing list, you're not really a web developer.

Updates: February 12, 2011

I found this script this morning which is designed to end the reliance on hash-bangs and the resultant URLs they force into page addresses: History.js.

History.js gracefully supports the HTML5 History/State APIs (pushState, replaceState, onPopState) in all browsers. [...] For HTML5 browsers this means that you can modify the URL directly, without needing to use hashes anymore. [...]

While this looks like a nice alternative to the hash-bang approach (and it is an alternative, not a solution), it relies on two things: the user having an HTML5-capable browser, and the JavaScript still executing despite everything I've outlined above.
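For reference, the raw HTML5 History API that History.js wraps looks roughly like this; the address bar changes without any hash, but the server must still be able to answer for that URL on its own (the article path and loader function are hypothetical):

    // The bare History API underneath History.js.
    if (window.history && history.pushState) {
      // Put a real, crawlable address in the location bar.
      history.pushState({ id: 42 }, '', '/2011/02/some-article');

      // Handle the back and forward buttons.
      window.onpopstate = function (event) {
        if (event.state) {
          loadArticle(event.state.id); // hypothetical loader
        }
      };
    }
    // Browsers without pushState should get ordinary links and full
    // page loads, not a hash-bang imitation.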

In short, the wrong problem is being addressed here. The reliance on JavaScript is the issue, not just the methodology of that reliance.

Updates: June 1, 2011

The post It's About The Hashbangs points out that using hash-bangs as a stop-gap until browsers support pushState isn't a valid reason to deploy them. He also addresses how hash-bangs break standard URL rules, confusing and confounding tried-and-true server-side processing methods. It's worth a read.

Update: May 7, 2015

It's 2015 and yes, this is still a thing. So I'll just leave this here for your review: Everyone has JavaScript, right?

Saturday, February 5, 2011

Apps Are Not Killing the Web

iPad in use with a meatstick.

Forrester Research is a source often cited by businesses when making decisions or declarations about trends and technologies. In many circles Forrester is something of a de facto standard for analysis. As such, I fully expect to have to deal with a recent statement from its CEO claiming that the web is dead when I sit down to talk with clients.

On Thursday morning at the DeSilva + Phillips Media Dealmakers Summit, George F. Colony, CEO of Forrester Research, was on a panel discussing the future of media in light of tablets and e-readers. Expanding on an answer to a question he fielded, Colony said, "We think the Web is dead."

When he says "we," he means the folks at Forrester. Back in October another Forrester staffer was quoted as writing that the golden age of the Web is coming to an end. This was in an article by The New York Times covering the hubbub over Wired Magazine's over-hyped death-of-the-web article. Wired's article relied heavily, but not totally, on a graph showing the decline of the web as we know it, but Boing Boing quickly refactored that graph into a more accurate visualization using the same numbers. You might recall that I wrote up my own response to Wired's argument in my post Enough about the Death of the Web.

In May of 2009 Forrester also claimed that the smartphone was dead. To be fair, the full title of the study was The Smartphone Is Dead: Long Live Smart Phones and Smart Gadgets, showing that Forrester didn't really think the concept of the smart phone was dead, but that it was no longer worth breaking into its own category given the ubiquity of capable phones and devices. That may very well be the logic the Forrester CEO was using in his comments, although I don't think so.

The web is not dying (again). If anything, the advent of tablet and tablet-like devices coupled with support for HTML5 and CSS3 (really just CSS3) in the browsers that are coming on those devices (Webkit-powered Safari and Chrome on iPads, iPhones, and Android devices) is going to ensure that the web will be around for quite a while longer. The Webkit engine (along with the mobile version of Opera that many are grabbing) does a good job of supporting the newest still-in-development standards, creating opportunities for far more interactivity and style than could be achieved in browsers targeting just CSS2.

Angry Birds is a great example of an app that you cannot replicate as easily in HTML/CSS, if at all. But so many other apps are geared toward media, such as eReaders and photo sharing utilities, that delivering much of that content through a browser is a more cost-effective approach. For example, an app like Picplz has to be built separately for both iPhone and Android devices in order to use the cameras built into each. The method you use to browse your photos, profile, and the photos of others, however, is delivered through an embedded browser. This allows the app developers to focus on the features unique to each device while the universal elements are maintained back on their web server, removing the need to push an updated app to users for each minor tweak.

Soon we can expect that an app delivering content of any sort will really be a wrapper for a web browser, handling just the interaction with the hardware and operating system that is necessary for things like user validation, preferences, and so on. This will reduce the cost of app development as the heavy lifting is done via the web server and CSS3 (HTML5 if you've bought into the hype). You can expect to see RIM and Microsoft move to catch up by deploying more capable browsers to lower the bar for app developers to deploy to Blackberry and Windows Mobile devices. This closes the gap in available apps for each device, making them more appealing to more users.

This approach also supports users who don't have these new devices, allowing your typical desktop user with a capable browser to access the same content, even if he/she uses a mouse instead of a finger. A good example of this is the Marvel Comics comic book reader for Chrome. Originally built for the iPad with a swipe interface, it gives you a similar experience in Chrome, just with clicks instead of a more tactile interaction.

The new model is build it once, deploy it across the web and your apps. Except it's not a new model. It just makes sense.

As I was wrapping up this post I stumbled across this article from The Nieman Journalism Lab, The Newsonomics of apps and HTML5. I think that, even though it takes a different path, it reaches conclusions similar to mine.

Update: February 10, 2011

Go read Robert Scoble's take on the new HP TouchPad. If he's right and it takes off, building web-based apps seems that much more like a good idea now, doesn't it? Supporting compiled apps for each of iOS, Android, WinMo, RIM and WebOS seems far less compelling to me.

Update: February 11, 2011

John Dowdell explains the concepts a bit more succinctly over at his Adobe blog: Blends of native and global. It's pretty much the same argument I make — some bits will have to be developed to run on the device, some bits can be best delivered via the web. Automatically excluding either of those options isn't a sound business or technical decision.

Friday, February 4, 2011

URL Shortener Spam Overrunning Blogger Stats

URL shorteners are in your web logs, stealing your clicks. Ok, maybe not in all your logs, but they can certainly show up in reports, tricking you into clicking on them and potentially exposing you to spam or viruses.

If you aren't familiar with URL shorteners, they evolved as a method to take those terribly unwieldy web page addresses (you know, the ones with all the random-seeming numbers and letters that go on forever) and replace them with short, simple addresses. Just being able to paste these shortened addresses into emails without them wrapping was already a good enough reason to use them. Twitter has made them a necessity thanks to its character limit coupled with people's desire to include some commentary on links they tweet. The process by which they work is simple — you provide one web address, and the shortener service provides another simple address that redirects all traffic to the address you provided.
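Under the hood a shortener is nothing more than an HTTP redirect, which also means you can peek at the destination without visiting it. A rough sketch in Node.js (the short code below is made up) that asks the service for just its headers:

    // Ask a shortener where a short link really points without
    // following the redirect.
    var http = require('http');

    var request = http.request({
      method: 'HEAD',
      host: 'bit.ly',
      path: '/abc123'   // hypothetical short code
    }, function (response) {
      // A shortener answers with a 301 (or 302) and a Location header
      // naming the true destination.
      console.log(response.statusCode);        // e.g. 301
      console.log(response.headers.location);  // the real address
    });

    request.end();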

But there is a risk. This obfuscation of the destination address makes it hard to quickly evaluate whether or not this is a link you want to follow. You can use these to RickRoll your friends, for example, hiding the true destination of the link. You can also choose links that are far less innocuous. There are many more issues with link shorteners, ranging from link rot to reporting, topics which I have discussed here before, but this time I want to focus on the destination address obfuscation, or what I am now calling Link Lying. And now I'm done calling it that.

There is a new trend that's been annoying me for some time now, but has picked up dramatically in the last few days. When I go to my Blogger Stats tab to see what kind of traffic this site is getting (which may be anemic, but is still valuable to me) I no longer see pages of value in the Traffic Sources section ("referrers" to us in the know, or "referers" if you misspell it like the HTTP spec does). Instead I see a stack of shortened links, nearly all of which point to the practically-prostitution site Adult Friend Finder (if I thought it was run by a bunch of whores before, this game of lying through linking just solidifies that opinion).

I am prepared to take the risk with shortened URLs from my Twitter stream, but I follow a list of people that I trust not to spam me. If they did, I simply wouldn't follow them. I know not to click on links from DM spam (such as when a friend's account is hacked). I know not to click shortened links in unsolicited emails. I had not considered that I'd have to apply the same wariness to my Blogger reports. This image demonstrates how these shorteners have taken over my stats, successfully tricking me into clicking twice now (this image shows just traffic from today).

It's not like the spamming in my stats is a new trend. I see links to sites show up regularly that clearly do not link to me in any way, but manage to get themselves in there regardless. But I don't click on those. I can tell by looking that they are spam. These next images show a week of stats, a month of stats and a year of stats. You can see how the value of this report has dropped off dramatically now that spammers have figured out how to overtake it.

Of the 7 (out of 10) links the other day that were from shorteners, only one presented me with a warning message from Bit.ly that the link itself might be a bad idea to follow. As you can see, this one points to the same site of liars that I reference above. Apparently this one has been reported by another user who was spammed and Bit.ly has flagged it.

But what's telling about this message is the insight it gives into how this bait and switch is possible (Bit.ly's language, not mine):

  • Some URL-shorteners re-use their links, so bit.ly can't guarantee the validity of this link.
  • Some URL-shorteners allow their links to be edited, so bit.ly can't tell where this link will lead you.
  • Spam and malware is very often propagated by exploiting these loopholes, neither of which bit.ly allows for.

For those of you young studs looking to break into the cheaters and liars world of spamming via link obfuscation, that's all you are going to get from me out of this post. I think, however, it's pretty clear how this happens.

How You Can Avoid This

In this example, until Blogger fixes how these are reported by pre-filtering the links, you really can't avoid them. I recommend installing a link previewer in your browser. For example, in Google Chrome I have installed ChromeMUSE, which allows me to see the destination of the link before clicking it. Now I can see where the link goes without the risk of infecting my computer or otherwise visiting the site of a pack of liars.

Rely on a more robust service such as Google Analytics, or even something that reads your web server logs like WebTrends (which captures data on all browsers, not just the ones that can run the JavaScript that Google Analytics uses). Leaning on your Blogger dashboard is nice for a quick review, but when the referrers are spam links, you have to wonder how much of that represents real traffic and not just an attempt to show up high enough in your logs for you to click the link.

Net Effect

I already have a problem with link shorteners. I've said as much in previous posts:

I don't trust that a shortened URL will bring me to a safe page. Certainly not when the link comes from a third party, an untrusted source. So if I get an email or a forwarded tweet, for example, and I see a shortened address you can be confident I won't click it. I am not the only one who feels this way. More people are coming over to this camp all the time. Eventually people will not trust shortened links on the whole.

In time we'll see more organizations that have rolled their own shorteners or branded a service like Bit.ly with their own address, letting Bit.ly perform all the technical work (redirections, reporting). Web content management systems are finally catching up to the trend, offering aliasing features to allow organizations to create shorter addresses for pages, sometimes bypassing the need for a shortener altogether. In time you may come to recognize a branded URL shortener, like nyti.ms or 4sq.com, and if you trust that organization then you may be comfortable clicking the link.


Update: Feb. 26, 2011

While I didn't think this was unique to Blogger (which has improved dramatically as of late, thankfully), Vox (Scott) posts his own Wordpress stat frustrations and provides a nice link back to me.

Update: June 26, 2012

Found a post from May 2012 dealing with a similar issue, Link Shorteners and Referral Spam Suck. It has been over a year since I posted this article and the problem still isn't going away. We may all just be getting used to it.