
Tuesday, July 8, 2014

Changing YouTube Playback Speed

This post originally appeared on the Algonquin Studios blog.

YouTube gives users the option to modify the playback speed of some videos. This is particularly useful for videos that you are obligated to watch (training videos, terrible fan videos, the occasional conference talk, etc.) and want to get through quickly. You can speed a video up to one-and-a-half or two times normal speed. You can also slow a video to half speed or quarter speed, which can be handy when trying to draw out a training-over-lunch session.

In order to make a go of this, you’ll need to use the YouTube HTML5 player, which you can activate at http://www.youtube.com/html5 while logged into your Google account. If you worry about browser support (for both the HTML5 video element and the various codecs), the YouTube page will show you what your browser supports. In general, if you are using a current version of your favorite browser then you should be fine.

The opening image shows where the option lives. Sadly, that awesome video of Morrissey and George Michael doing film reviews has been pulled, so instead you can try it out on this video of Hitchcock’s The Lady Vanishes (I reference it in slide 58 of my Selfish Accessibility talk). The video also has closed captions and an audio description so it’s a great example of the accessibility features available for YouTube.

When at a video, click the gear icon at the bottom right and look for the Speed menu. If the video allows you to change its playback speed, it will be there with available options. This will only apply to the selected video. If you know of a setting to have it apply to all videos, please let me know.
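
For comparison, if you are embedding your own video with the HTML5 video element rather than YouTube's player, the same feature is one line of script. This is just a minimal sketch; the element id and file name are invented:

  <video id="demo" src="training.webm" controls></video>
  <script>
    // 1.5 is one-and-a-half speed; 0.5 is half, 0.25 is quarter, 2 is double.
    document.getElementById('demo').playbackRate = 1.5;
  </script>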

If you still aren’t sure where this can come in handy, just try listening to Thundercats dialogue (particularly Panthro) at normal speed and then again at 1.5× normal speed. To me the difference is dramatic.

Wednesday, November 13, 2013

Captions in Everyday Use

Yesterday Henny Swan asked a simple question on the Twitters about who uses captions and why.

Adam Banks put together a Storify of the responses, which shows there are plenty of cases where people who are not hard of hearing get value from closed captioning.

In general, captions have value for all users in any context where the audio track is loud enough that the viewer doesn't want to disrupt those nearby, or where the background noise is too much to hear the audio track clearly. Other cases that popped up include multi-tasking, working in a new language, or just deciphering tough accents.

In short, closed captions have value for all users.

There is also no reason to panic about providing them, particularly if you use a video service that can do them for you. For example, back in 2010 YouTube committed to enabling auto-captioning for everyone, and Google has documents to help plus tutorials from others, such as this step-by-step or this video.

Image of the captions in use on President Obama's speech about the Chile earthquake.

Of course, as I was writing this post, Henny posted her own reference to the Twitter conversation: The weird and wonderful reasons why people use subtitles / captions

The Storify of responses I mentioned above is embedded here to spare you all the hassle of clicking the link and to bloat my page with unnecessary script blocks:

Update: November 14, 2013

While I was writing this, Dave Rupert was putting together a very neat experiment, Caption Everything: Using HTML5 to create a real-time closed captioning system.

It's a proof-of-concept showing that real-time closed captioning is possible with current technology, albeit imprecise and cumbersome. If nothing else, hopefully it brings more attention to a technique that, as demonstrated above, can benefit all users in everyday situations.

It's such a nifty experiment, I am embedding it here (remember, this isn't mine, this is Dave Rupert's code):

See the Pen Closed Captioning with HTML5 by Dave Rupert (@davatron5000) on CodePen
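
To give a sense of how it works: this is not Dave's actual code, just a rough sketch of the approach, using the WebKit-prefixed speech recognition API with made-up element ids. It listens to the microphone and writes what it hears into a caption area:

  <div id="caption" aria-live="polite"></div>
  <script>
    // A sketch of real-time captioning: write speech recognition
    // results into the caption area as they arrive.
    var recognition = new webkitSpeechRecognition(); // WebKit-only for now
    recognition.continuous = true;     // keep listening between phrases
    recognition.interimResults = true; // update while words are still tentative
    recognition.onresult = function (event) {
      var text = '';
      for (var i = 0; i < event.results.length; i++) {
        text += event.results[i][0].transcript;
      }
      document.getElementById('caption').textContent = text;
    };
    recognition.start(); // the user must grant microphone access
  </script>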

Monday, February 21, 2011

WebM, H.264 Debate Still Going

Terrible illustration of Chrome dropping H.264.

On February 2, Microsoft released a plug-in for Chrome on Windows 7 to allow users to play H.264 video directly in Chrome. In addition, Microsoft has said that it will support WebM (VP8) when a user has the codec installed. And so began the fragmentation of the HTML video model, back to relying on plug-ins to support what was otherwise intended to be supported natively within browsers.

Microsoft went on to ask three broad questions in a separate post (HTML5 and Web Video: Questions for the Industry from the Community), taking Google to task for what Microsoft considers inconsistent application of its own patent concerns and openness (emphasis Microsoft's):

  1. Who bears the liability and risk for consumers, businesses, and developers until the legal system resolves the intellectual property issues;
  2. When and how does Google make room for the Open Web Standards community to engage genuinely;
  3. What is the plan for restoring consistency across devices, Web services, and the PC.

The same day Microsoft announced its plug-in approach to the WebM and H.264 battle, the post On WebM again: freedom, quality, patents appeared, addressing what its author felt were the five most common complaints about WebM (which I have paraphrased):

  1. Quality: the argument here is that quality is a function of the encoder, and WebM can match H.264;
  2. Patent risk: comparing the 164 unexpired U.S. patents used in a single encoder, the author finds that 126 of them cover the encoder's H.264 support; the remaining 38 (used by WebM) are in a library released by Google;
  3. Not open enough: there is little argument here, given that WebM is currently in Google's hands to manage and develop;
  4. H.264 is not so encumbered: but only for non-commercial use of freely-distributed web video;
  5. Google provides no protection from patent infringement: nor does MPEG-LA.

Changing the Nature of the Battle

On February 11, the post MPEG LA puts Google's WebM video format VP8 under patent scrutiny outlined how MPEG-LA, the licensing entity for multimedia codecs such as H.264, has put out a call for patents related to VP8, the underlying technology in WebM. The deadline for submissions is March 18, less than a month away as of this writing. From there, MPEG-LA will create a patent pool of contributing patent holders for any patents deemed essential to the codec. That pool can then be used to negotiate licensing. In short, VP8/WebM could soon be more patent-encumbered than it has been. This puts Google on the defensive, as it will have to show that each of those patents is either invalid or not infringed.

The author of that last post followed up on the 14th at The Guardian with Royalty-free MPEG video codec ups the ante for Google's WebM/VP8. In case that title isn't clear enough, a new royalty-free video standard may be in the pipeline. MPEG, a standards body separate from MPEG-LA (the licensing body), has called for proposals toward a royalty-free MPEG video coding standard. One of the goals is to make this new standard comparable to the baseline profile of H.264.

If this pans out, it puts another barrier in front of the WebM offering from Google, namely that for WebM to be adopted it will have to best any new royalty-free MPEG codec. Three things could bring WebM (VP8) down:

  1. If the MPEG-LA call for patents on VP8 nets some patents, MPEG-LA will form a patent pool to push for licensing agreements, which Google will have to fight at each step.
  2. If MPEG can genuinely develop a royalty-free video coding standard, it can beat WebM either from the patent perspective or by forcing WebM to be technically superior.
  3. Assuming WebM can get past the first two items, it's still back where it started — in a battle for adoption and endorsement against the already entrenched H.264 standard.

Real-World Needs

Considering Google is the company that delivers so much of the video viewed on the web via YouTube, it makes sense that Google presumes it can take the lead on video codec standards. Netflix, however, has its entire business model built around video, which is now moving inexorably to the web. Netflix commented back in December (HTML5 and Video Streaming) that it needs several things sorted out before it can rely on HTML5 video to deliver its content (the first six of which it has resolved through its own proprietary technology):

  1. The acceptable A/V container formats (e.g. Fragmented MP4, WebM, etc.);
  2. The acceptable audio and video codecs (e.g. H.264, VP8, AAC, etc.);
  3. The streaming protocol (e.g. HTTP, RTP, etc.);
  4. A way for the streaming protocol to adapt to available bandwidth;
  5. A way of conveying information about available streams and other parameters to the streaming player module;
  6. A way of supporting protected content (e.g. with DRM systems);
  7. A way of exposing all this functionality into HTML5.

It's clear that the web video debate extends far beyond the academics of HTML5. As long as issues related to patents and licensing are unresolved, or perceived as unresolved, we will see more proprietary solutions gain more ground, further balkanizing the future of video on the web.

If you think this doesn't affect the end user, all that time and effort put into creating proprietary solutions ultimately costs you as the consumer in the form of increased fees for your content. Granted, it will take some time for browsers to catch up with whatever codec is selected, and yet more time for users to catch up with the supporting browsers, but the longer this debate continues, the longer before we can even start down that long road of user adoption.

Possibly the best outcome we can hope for is that this battle results in a royalty-free MPEG standard, nullifying many of the arguments against both H.264 and WebM.


Thursday, January 13, 2011

H.264 Getting Dropped from Chrome

Terrible illustration of Chrome dropping H.264.

If you pay any attention to the plodding chaos that is the development of HTML5, then you've probably seen the discussions around the video element and how best to encode videos. Over a year and a half ago Ian Hickson gutted the video and audio portions of the HTML5 specification to remove all references to codecs, disconnecting the two competing standards, H.264 and Ogg Theora, from the next HTML standard. He did this in an email announcement to the WHATWG list, explaining the issues with licensing and browser support for both options.

At the time Safari refused to implement support for Ogg Theora, Opera and Mozilla refused to support H.264, Internet Explorer was silent, and only Google Chrome implemented both (though Google said it could not provide the H.264 codec license to Chromium third-party distributors).

A year and a half later, Google has dropped support for H.264 from Chrome as of two days ago. While Google has hung its argument on the hook of license restrictions, it's probable that Google is really just pushing its own WebM format.

The licensing argument is simple — the compression techniques in H.264 are covered by patents licensed through MPEG-LA. While MPEG-LA has opened up the H.264 license for the web (after originally saying it wouldn't collect royalties until 2016), it's conceivable that move was intended to get the web hooked on it. And then it's fair to assume the patent trolling might begin (history indicates the odds are good).

This announcement and the logic behind it have set off a mini-firestorm among developers on the leading edge of HTML5.

Ars Technica wrote up a pretty scathing review of Google's move in the article Google's dropping H.264 from Chrome a step backward for openness, suggesting Google's real issue is about control over its own WebM format. The article goes into more detail about Google's acquisition of the company responsible for developing what is now WebM and compares and contrasts the licenses of these and other standards.

Haavard Moen, who works for Opera, takes some time to disassemble the argument made by Ars Technica in his post Is the removal of H.264 from Chrome a step backward for openness? He breaks it up into 11 points, corrects or contextualizes them, and then suggests that the bulk of the points aren't even relevant to the discussion at hand.

The chart below (and its comments) was unabashedly stolen and marked up from a graphic by Bruce Lawson (of Opera Software fame). He uses it to outline which browsers support which codec. I have added a column to list Ogg Theora.

Browser               Ogg Theora   Native WebM Support   H.264 Support   MPEG-LA H.264 Licensing Pool Member
Opera                 Yes          Yes                   No              No
Firefox               Yes          Yes                   No              No
Chrome                Yes          Yes                   No              No
Internet Explorer 9   No           No                    Yes             Yes
Safari                No           No                    Yes             Yes
The two browsers that only support H.264 video are Internet Explorer 9 and Apple Safari, the vendors of which have a financial stake in the codec: www.mpegla.com/main/programs/AVC/Pages/Licensors.aspx

The column in the chart asking about the MPEG-LA licensing pool is intended to show that the only browsers still supporting H.264 are those with a financial stake in it. Bruce updated his post with a link from a site visitor claiming that Microsoft gets far less from its financial commitment to H.264 than it pays in.
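
If you would rather not memorize that chart, you can ask the browser directly. Here is a quick sketch using the canPlayType method on the video element, which answers with an empty string, "maybe", or "probably":

  <script>
    // Feature-detect codec support instead of sniffing browser names.
    var probe = document.createElement('video');
    if (probe.canPlayType) { // the video element is supported at all
      var webm = probe.canPlayType('video/webm; codecs="vp8, vorbis"');
      var h264 = probe.canPlayType('video/mp4; codecs="avc1.42E01E, mp4a.40.2"');
      var ogg = probe.canPlayType('video/ogg; codecs="theora, vorbis"');
      // Each variable now holds "", "maybe", or "probably".
    }
  </script>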

An argument that keeps popping up is that Google should also drop support for Flash, given that it is not an open standard. I dismiss this immediately on the grounds that Flash has been here for a very long time, and it's not practical to drop support for a technology that already drives millions of sites, at least not without the myopic world view that Apple lugs around.

That argument, however, made it into the post from John Gruber's blog, Daring Fireball, titled Simple Questions for Google Regarding Chrome’s Dropping of H.264. Remy Sharp was quick to respond on his own blog with My take on Google dropping H.264.

While Flash is only a part of that debate, it's further insight into the arguments we'll hear from both sides. Some of the arguments will lean on a perceived double standard, as we see with the Flash example; some will lean on the license debate, as we see with the constant references to MPEG-LA versus WebM and its background; some will lean on the quality of the video, which I intentionally left out of this post; and some will lean on who wants to be the next big monopoly for the burgeoning growth of video on the web.

For the average developer, it might be best to wait until the dust settles. You'll still be making decisions based on browser support in the end anyway.
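
In the meantime, hedging your bets means encoding more than once and listing multiple sources; each browser plays the first one it can. A sketch with invented file names:

  <video controls>
    <source src="clip.webm" type='video/webm; codecs="vp8, vorbis"'>
    <source src="clip.mp4" type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'>
    <source src="clip.ogv" type='video/ogg; codecs="theora, vorbis"'>
    <p>Your browser does not support the video element.</p>
  </video>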


UPDATE: More Links from around the Innertubes

These links came rolling out this past weekend (January 14-16) and offer more arguments for and against H.264 and Google's decision.


Friday, December 17, 2010

You Get What You Pay For

We're just shutting down delicious, not selling your children to gypsies. Get the f-ck over it.

First off, let me apologize for ending the title of this post with a preposition. I am playing off an idiom, so I think I have some leeway. Besides, "You get that for which you pay" just doesn't roll off the tongue.

In the last week I have watched two free web services I use announce (in some fashion) that they are going away. This has caused a good deal of frustration and anger among users. And it's all just a repeat of things I have seen on the web for 15 years now.

I have watched the Brightkite blog, Facebook page and Brightkite/Twitter accounts get hammered with angry and abusive comments from users (Brightkite Yields to Foursquare, Gowalla, Etc.).

I have watched on Twitter as people have derided Yahoo's decision to shut down del.icio.us, the place where they have shared and stored bookmarks for years (Leaked Slide Shows Yahoo Is Killing Delicious & Other Web Apps at Mashable).

I felt vindicated when Google decided to pull the plug on Google Wave, partly owing to the fact that nobody could quite figure out how to wield something that was a floor wax and a dessert topping all in one (Google Wave is Dead at ReadWriteWeb).

I have watched as some of the URL shorteners on which we have come to rely for services like Twitter have announced that they are going away, or have just disappeared (List of URL Shorteners Grows Shortener).

I, and perhaps the entire web, breathed a sigh of relief when Geocities announced it was going to take a dirt nap — and finally did (Wait - GeoCities Still Exists?).

I remember when both Hotmail and Yahoo decided it was time to start charging for access to some of the more enhanced features of the free email they offered users (Say Goodbye to Free Email).

I saw people panic when they might lose access to all sorts of free video, photos, and even text content from CNN, Salon, and others (End of the Free Content Ride?).

We Get It; You've Been There, What's Your Point?

These services all have a couple key things in common:

  1. Users have put a lot of time, energy, and apparently emotion into these services.
  2. They are free.

The second point, in my opinion, should mitigate the first point. If you as a user are not paying to use a service, then is it a wise decision to build your social life or your business around it? Do you as a user not realize that these organizations owe you nothing?

As Brightkite announced the shuttering of its core service with only a week's heads-up, it was kind enough to allow users to grab their data via RSS feeds. Yahoo hasn't even formalized the future of del.icio.us, but already fans have found a way to grab the data. But in both of these cases, if you as a user aren't backing up your data, keeping an archive, or storing it elsewhere, whose fault is it really that you might lose it all?

Is it wise to build a social media marketing campaign on Facebook, a platform notorious for changing the rules (features, privacy controls, layout, etc.) on a whim? Is relying on a free URL shortener service a good idea as the only method to present links to your highly developed web marketing campaigns? Should you really run your entire business on the features offered by Gmail, Google Calendar, Google Docs, etc? If you have to alert staff/friends/partners to something important in a timely fashion, can you really trust Twitter to do what you need?

The culture of the web (née Internet) has always been one of an open and sharing environment, where people and organizations post information that they understand will be shared/borrowed/stolen/derided. Somehow users of the web have come to expect that everything is, or should be, free. Look at the proliferation of sites to steal movies and music as an example on one end of the spectrum. On the other end is the reliance on Wikipedia by every school kid across the country instead of a purchased encyclopedia.

Let's all take some time to evaluate our plans and what we are doing. When that vendor who builds Facebook campaigns comes back to tell you that what he/she built last year won't work this year due to a Facebook change, there is your cost. When you have to take time from your real work to download all your bookmarks just so you can try to find a way to share them again or even get them into your browser, there is your cost. When you build a business on the back of a Twitter API and have to retool your entire platform due to an arbitrary change in how you call the service, there is your cost. When your Google Doc is sitting in "the cloud" and you're sitting in a meeting without wifi just before you have to present it, there is your cost.

This cost, however, ignores something that can't be measured on your end with dollars. Sharing your personal information, your activities, and your habits is the daily cost of using many of these services.

You may be under the impression that I have something against these free services. The use of this very blog should tell you otherwise. Instead I have something against users who have an expectation of free, top-notch service from organizations who are really only around as far as their cash flow can sustain them.

I keep my bookmarks on my local machine and just share the files between computers. I have been archiving my Brightkite photos since I started using the service, and archiving the posts to Twitter and Facebook, all the while backing up my Twitter stream. I use locally-installed software (MS Word, OpenOffice) writing to generic formats (RTF, etc.) and keep the files I need where I can access them (file vault on my site). I pay for a personal email service in addition to maintaining a free one. Other than Twitter, with its character limits, I avoid URL shorteners (and have no interest in rolling my own). I signed up for Diaspora in the hopes that I can funnel all my social media chaos to the one place I can take it with me. I keep a landline in my house so when the power goes out I can still make a phone call to 911.

I don't tweet my disgust when Facebook changes its layout. I don't post angry comments on Brightkite's wall when they kill a service. I don't try to organize people to take their time to rebuild Google Wave when I cannot. I don't punch my co-worker when he buys me a sandwich and the deli failed to exclude the mayo.

Let's all take some personal responsibility and stop relying solely on something simply because it's free. Your favorite free thing is different or gone (or will be). Suck it up and move on.

Update: January 10, 2011

Alex Williams at ReadWriteWeb echoes the general theme of expecting free stuff in the post "Dimdim: The Risk of Using A Free Service."

Update: January 12, 2011

Free sometimes means "full of malware and viruses," even when you are just installing free themes for your blog: Why You Should Never Search For Free WordPress Themes in Google or Anywhere Else

Update: January 2, 2014

Jeffrey Zeldman explains the process in a narrative: The Black Hole of The Valley

Wednesday, November 24, 2010

Current Internet Use, from Assorted Sources

Image of this blog on a BlackBerry, showing a post with an image of this blog on an HTC phone.

Today Opera Software released data about how users of its Opera Mini mobile web browser use the web. Opera does this periodically to give some insight into how its users may be surfing, but what we don't know is how closely Opera Mini users correspond to web users in general. Opera is certainly motivated to capture as much of the mobile market as it can, given its low appearance numbers on desktops. Regardless, the title of the report really distills Opera's findings: Generation Y chooses the mobile Web. You can get all the details from this and prior surveys at Opera's State of the Mobile Web site. Some of the highlights:

  • Almost 90% of respondents in the United States aged 18-27 have used their phones to share pictures. Of the profiled countries, Vietnam, at 67%, had the lowest use of mobile phones to share pictures.
  • Respondents in the United States are least likely to have asked someone out on a date via SMS (44%). Respondents in China (84%), Germany (84%) and Vietnam (83%) are most likely to have used SMS texts to ask someone out on a date.
  • Generation Y in both China and the United States share a disdain for printed newspapers. 53% of respondents in the United States and 57% of respondents in China rarely or never read physical newspapers.
  • Watch your privacy policies. Respondents in South Africa (49%) and the United States (44%) were somewhat to very uncomfortable sharing their personal information online.

Last week ReadWriteWeb reported that YouTube use on mobile devices has been on the rise, with 75% of surveyed mobile YouTube users saying that their mobile device is the primary way of accessing YouTube (YouTube Mobile Use Exploding: 75% Report Mobile is Primary Way of Watching YouTube). This number, however, should be considered in context. Only users of the mobile version of YouTube (typically the YouTube app installed on a phone) were surveyed, so you can expect a far larger percentage of respondents to rely on the mobile version than the general public does. This doesn't, for example, track the users who might come across a page on your site with your corporate YouTube video. Since YouTube is often used for casual surfing, not so much business use, it makes sense that a meme discussed over beers with friends might result in a smartphone popping out to track down the video everyone is referencing.

Brian Solis was kind enough to take the data from the Ad-ology report, Twitter Users in the United States, and distill it down to some manageable chunks of data in his post Who are All of These Tweeple? In short, Twitter users tend to range between 18 and 34 (which is a big range), are white, and have at least some college education. Again, cross-referencing with the data we've gathered from other surveys, we see a continuation of some trends toward younger, more savvy users. There aren't lots of surprises in the report, but there are some numbers that can at least provide a little more detail to what we already expect. For example:

57.7% of Twitter users use the Internet more than three hours per day for personal use (outside of school or work) and are considered "heavy Internet users."

Back in June Nielsen released a report with a telling title: Social Networks/Blogs Now Account for One in Every Four and a Half Minutes Online. Four of the most popular destinations on the web are Google, Facebook, YouTube and Wikipedia. All of these enjoy a lot of use from users on mobile devices (well, perhaps not so much Wikipedia, but people are still looking things up in bars after tracking down the YouTube video). While the article is silent on mobile use, a skilled reader can apply mobile trends to the overall traffic and begin to see part of the reason mobile has been climbing.

If you believe this article from June, Social Media is the 3rd Era of the Web, then you can expect the numbers on social media sites to continue to climb and the ages of users to stay young, even as older users get on board. As part of that, mobile use will continue to climb as people want to stay socially connected wherever they are.

The trick among reports and studies is to figure out how the data was gathered, who performed the gathering, why they did it and who participated. If you can validate that a study has any merit, then you can start to cross-reference it with other reports and piles of data to tease out some meaning.


UPDATE

It seems the day after Thanksgiving is a good day for people to post more details about Internet use. I won't distill them here (I haven't had a chance to read them in detail), but here are a couple more chunks of stats and data to review while you digest.

Monday, March 8, 2010

YouTube Opens Auto-Captioning to All

Image of the captions in use on President Obama's speech about the Chile earthquake.

If you've been reading my blog for a while now then you may have noticed my post back in November titled YouTube Will Automatically Caption Your Video. In that post I talked about YouTube leveraging Google Voice and its speech recognition features to automatically generate video captions. That feature was only available to a subset of all YouTube users, a "small, select group of partners."

On Thursday YouTube announced it will open that feature to all YouTube users (The Future Will Be Captioned: Improving Accessibility on YouTube). In reading the post on the YouTube blog, it looks like they will be working through all the videos on YouTube. Video owners who want to speed up the availability of auto-captions on their videos can click a "request processing" button, hopefully dropping it into a queue. However, since it's a free service, I wouldn't expect them to set a deadline with you.

The blog post lists a few things to keep in mind:

  • While we plan to broaden the feature to include more languages in the months to come, currently, auto-captioning is only for videos where English is spoken.
  • Just like any speech recognition application, auto-captions require a clearly spoken audio track. Videos with background noise or a muffled voice can't be auto-captioned. President Obama's speech on the recent Chilean Earthquake is a good example of the kind of audio that works for auto-captions.
  • Auto-captions aren't perfect and just like any other transcription, the owner of the video needs to check to make sure they're accurate. In other cases, the audio file may not be good enough to generate auto-captions. But please be patient — our speech recognition technology gets better every day.
  • Auto-captions should be available to everyone who's interested in using them. We're also working to provide auto-captions for all past user uploads that fit the above mentioned requirements. If you're having trouble enabling them for your video, please visit our Help Center: this article is for uploaders and this article is for viewers.

The obvious benefit here is the ability to satisfy accessibility requirements and open your content to a broader audience. It can also help with searching for videos by their content, since the captions live in a separate text file that is pulled into the video, all powered by Google. In addition, with Google's free translation services, it will be far easier to translate videos into multiple languages, reaching an even larger audience.

Wednesday, January 20, 2010

Accessible Video and Transcripts

With HTML5 on the horizon, it is becoming far easier to embed video on a web page than it has been. Sure, you can drop some code copied from YouTube, but you have little control over the HTML or the video output. Once you do have your video, you also need to bear in mind that not only are video (and audio) transcripts good practice, they are required by law for many organizations.

HTML5 Video Captions

Bruce Lawson has been kind enough to put together an example and code (Accessible HTML5 Video with JavaScripted captions) for his method to embed synchronized captions, using JavaScript for the synchronization, with a video embedded using the HTML5 video element. The caveat here is that you need to have a browser that supports the open Ogg Theora codec. You can check out the sample video if you have such a capable and sweet browser.
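
This is not Bruce's code, but the core of the technique is small enough to sketch: keep the cues in script, listen for the video's timeupdate event, and swap the visible text. The cue times, file name, and ids here are invented:

  <video id="player" src="clip.ogv" controls></video>
  <div id="subtitle"></div>
  <script>
    // Invented cues: start/end times in seconds plus caption text.
    var cues = [
      { start: 0, end: 4, text: 'Hello, and welcome.' },
      { start: 4, end: 9, text: 'Today we look at captions.' }
    ];
    var player = document.getElementById('player');
    var subtitle = document.getElementById('subtitle');
    // timeupdate fires a few times a second as the video plays.
    player.addEventListener('timeupdate', function () {
      var now = player.currentTime;
      var line = '';
      for (var i = 0; i < cues.length; i++) {
        if (now >= cues[i].start && now < cues[i].end) {
          line = cues[i].text;
          break;
        }
      }
      subtitle.textContent = line;
    }, false);
  </script>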

YouTube Automatic Captions

Some of you may remember my post about how YouTube can now automatically caption your videos using speech recognition software. While the results can be interesting, you can at least edit the captions coming out of YouTube. Terrill Thompson describes how he grabbed the .sbv caption file from his video and updated it, then uploaded it back to YouTube for a much more useful result.
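
If you haven't worked with one, a .sbv file is plain text: a start and end timestamp on one line, the caption text below it, and a blank line between cues. A made-up two-cue example:

  0:00:00.000,0:00:03.500
  Welcome to the demonstration.

  0:00:03.500,0:00:07.250
  Captions are timed to the audio track.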

Why Transcripts?

Shawn Henry leads the education and outreach effort promoting web accessibility at the W3C Web Accessibility Initiative (WAI). She recently wrote an article titled Transcripts on the Web: Getting people to your podcasts and videos that explains why you would want to do this, some best practices, and even some resources. For example, in addition to supporting users who are deaf or hard of hearing, she points out that search engines can index a transcript where they cannot index a video or audio file. This adds to the overall SEO strategy for any site. Her best practices go beyond what the captions mentioned above provide: she points out that sometimes we need to state how many people in a video raised their hands in response to a question. She is good to not browbeat the reader with a reminder that WCAG 1.0, WCAG 2.0 and Section 508 all require transcripts — at least not until the end of the article.

If you have a few minutes after reading all these other links, consider reading the interview with Shawn Henry, W3C's Shawn Henry on Website Accessibility, from last week over at Freelance Review.

Friday, November 20, 2009

YouTube Will Automatically Caption Your Video

Three years ago YouTube/Google added the ability for video authors to add captions to videos. Over time it added support for multiple caption tracks, expanded search to consider the text in captions, and even added machine translation support for the captions (see my other post about machine translation risks).

Even with hundreds of thousands of captioned videos on YouTube, new videos are posted at a rate of 20 hours of video per minute. For many companies (and not-for-profits and government agencies), YouTube provides the most cost-effective and ubiquitous method to distribute video content to users. Many of these organizations (particularly not-for-profits and government agencies) are required by law (in the US and elsewhere) to provide captions for video, but don't have the experience or tools to do so. As a result, users who are deaf are excluded from fully understanding this content.

This is where the speech recognition features (ASR) of Google Voice come into play. This technology can parse the audio track of your videos and create captions automatically. Much like machine translation, the quality of these captions may not be the best, but it can at least provide enough information for a user who could not otherwise understand the video at all to glean some meaning and value.

In addition, Google is launching "automatic caption timing," essentially allowing authors to easily make captions using a text file. As the video creator, an author will be able to create a text file with all the words in the video and Google's speech recognition software will figure out where those words are spoken and take care of the timing. This technique can greatly increase the quality of captions on videos with very little effort (or cash outlay for tools) on the part of the video creator.
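
By contrast, the transcript file you would feed to automatic caption timing is even simpler than a caption file: no timestamps at all, just the words spoken, something like this invented example:

  Welcome to the demonstration.
  In this video we look at closed captions.
  Thanks for watching.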

You can read more at the YouTube Help Center article. You can also read the blog post announcing this feature at the Google Blog. The video below shows a short demo of the auto-captioning and auto-timing features.

Update (August 25, 2010): Paul Bukhovko of FatCow was kind enough to translate this entry into Belorussian: YouTube аўтаматычна захоплівае сваё відэа