VTT Format Guide - Complete Documentation

Comprehensive guide to the VTT (WebVTT) format - the web standard for HTML5 video subtitles, captions, and time-synchronized text tracks.

What is VTT Format?

VTT (Web Video Text Tracks, usually written WebVTT) is a text track format designed for the HTML5 video and audio elements. Specified by the W3C, WebVTT provides a simple, text-based way to add subtitles, captions, descriptions, and other time-synchronized text to web videos.

Unlike more complex subtitle formats, VTT focuses on web compatibility and ease of use while still supporting essential features like positioning, styling, and accessibility requirements for modern web applications.

VTT Format Specification

File Structure

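A VTT file must start with the "WEBVTT" signature on the first line, followed by a blank line. Each cue consists of a timing line and one or more lines of text, and cues are separated from each other by blank lines:

# Basic VTT file structure
WEBVTT

00:00:12.340 --> 00:00:15.670
Hello World!

00:00:16.000 --> 00:00:19.500
This is a VTT subtitle cue.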

Main Components

WEBVTT Header

Required signature that must appear at the start of every VTT file, optionally followed on the same line by a space or tab and free-form header text.
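As a rough illustration of the header rule, the Python sketch below checks whether a string begins with a valid signature line. The function name `has_webvtt_signature` is made up for this example, and the check is deliberately minimal: it looks only at the first line and does not validate the rest of the file.

```python
def has_webvtt_signature(text: str) -> bool:
    """Return True if the text starts with a WEBVTT signature line."""
    # An optional byte order mark may precede the signature.
    if text.startswith("\ufeff"):
        text = text[1:]
    first_line = text.splitlines()[0] if text else ""
    if first_line == "WEBVTT":
        return True
    # Header text is only allowed after a space or tab following the signature.
    return first_line.startswith("WEBVTT ") or first_line.startswith("WEBVTT\t")


print(has_webvtt_signature("WEBVTT - Example subtitle file\n"))  # True
print(has_webvtt_signature("WEBVTTX\n"))                         # False
```

A check like this is a cheap way to tell a VTT file apart from an SRT file before handing it to a real parser.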

Cue Timing

Start and end timestamps in HH:MM:SS.mmm format (the hours part is optional), separated by the --> arrow.

Cue Text

The actual subtitle content, which supports a small set of HTML-like tags and WebVTT-specific markup.

Cue Format

Each cue in VTT follows this structure:

[cue identifier]
start --> end [cue settings]
cue text

  • Cue identifier: Optional unique ID for the cue
  • Start/End: Timing in HH:MM:SS.mmm or MM:SS.mmm format
  • Cue settings: Optional positioning and alignment
  • Cue text: Subtitle content with optional HTML/VTT markup
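The cue structure above can be sketched as a small Python parser that splits one cue block into its parts. The `Cue` type and the `parse_cue_block` function are hypothetical names for this example. The sketch assumes the block has already been separated from its neighbours, and it does not validate the timestamps or the settings.

```python
from typing import NamedTuple, Optional


class Cue(NamedTuple):
    identifier: Optional[str]
    start: str
    end: str
    settings: str
    text: str


def parse_cue_block(block: str) -> Cue:
    """Split one cue block into identifier, timing, settings, and text."""
    lines = block.strip("\n").split("\n")
    identifier = None
    # A first line without "-->" is treated as the optional cue identifier.
    if "-->" not in lines[0]:
        identifier = lines[0]
        lines = lines[1:]
    if not lines or "-->" not in lines[0]:
        raise ValueError("cue block has no timing line")
    start, _, rest = lines[0].partition("-->")
    # The end timestamp is the first token after the arrow; the rest is settings.
    parts = rest.split(None, 1)
    if not start.strip() or not parts:
        raise ValueError("timing line is missing a timestamp")
    end = parts[0]
    settings = parts[1].strip() if len(parts) > 1 else ""
    return Cue(identifier, start.strip(), end, settings, "\n".join(lines[1:]))


cue = parse_cue_block(
    "speaker-intro\n00:00:09.100 --> 00:00:12.800 line:85%\nHello, I'm John!"
)
print(cue.identifier, cue.start, cue.end, cue.settings)
```

Because an identifier line cannot contain "-->", the presence of the arrow is enough to tell the timing line from the identifier.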

Timing Syntax

Valid time formats:

  • MM:SS.mmm - Minutes:Seconds.Milliseconds (e.g., 12:34.567)
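  • HH:MM:SS.mmm - Hours:Minutes:Seconds.Milliseconds (e.g., 01:23:45.678)

Hours need at least two digits, minutes and seconds take exactly two digits, and milliseconds exactly three. The arrow --> separates the start and end times, with whitespace on both sides. Each cue must end after it starts, and cues must be listed in order of start time: a cue may not start earlier than the cue before it.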
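To make the timing syntax concrete, here is a minimal Python sketch that converts a timestamp into milliseconds. The name `parse_timestamp` is an assumption of this example. It follows the stricter reading in the WebVTT specification, where hours are optional but need at least two digits when present, and minutes and seconds must stay below 60.

```python
import re

# Optional hours (two or more digits), then MM:SS.mmm.
_TIMESTAMP = re.compile(r"(?:(\d{2,}):)?([0-5]\d):([0-5]\d)\.(\d{3})")


def parse_timestamp(value: str) -> int:
    """Convert a WebVTT timestamp into a number of milliseconds."""
    match = _TIMESTAMP.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid WebVTT timestamp: {value!r}")
    hours, minutes, seconds, millis = match.groups()
    total_seconds = (int(hours or 0) * 60 + int(minutes)) * 60 + int(seconds)
    return total_seconds * 1000 + int(millis)


print(parse_timestamp("12:34.567"))     # 754567
print(parse_timestamp("01:23:45.678"))  # 5025678
```

With both timestamps as integers, checking that a cue ends after it starts, or that cues appear in start-time order, becomes a plain numeric comparison.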

Advanced Features and Styling

Cue Settings

VTT supports various cue settings for positioning and alignment:

Positioning

  • vertical:rl - Vertical text, with lines stacked from right to left
  • vertical:lr - Vertical text, with lines stacked from left to right
  • line:N - Line position (percentage or line number)
  • position:N% - Horizontal position
  • size:N% - Cue box width

Alignment

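  • align:start - Aligned to the start of the line (left for left-to-right text)
  • align:center - Center alignment (older drafts and some players use align:middle)
  • align:end - Aligned to the end of the line (right for left-to-right text)
  • align:left - Left alignment, regardless of text direction
  • align:right - Right alignment, regardless of text direction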
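Cue settings are written as space-separated name:value pairs after the end timestamp. The Python sketch below turns such a string into a dictionary. The function name `parse_cue_settings` and the choice to silently skip unknown or malformed settings are assumptions made for illustration. The values themselves are not validated, and if a name is repeated the last value wins.

```python
from typing import Dict

# Setting names covered in this guide.
KNOWN_SETTINGS = {"vertical", "line", "position", "size", "align"}


def parse_cue_settings(settings: str) -> Dict[str, str]:
    """Parse a cue settings string into a name -> value mapping."""
    result: Dict[str, str] = {}
    for token in settings.split():
        name, separator, value = token.partition(":")
        # Skip tokens that are not name:value pairs or use an unknown name.
        if not separator or not value or name not in KNOWN_SETTINGS:
            continue
        result[name] = value
    return result


print(parse_cue_settings("position:25% size:50%"))
# {'position': '25%', 'size': '50%'}
```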

HTML Tags and Formatting

VTT cue text supports a small set of tags for formatting, some borrowed from HTML and some specific to WebVTT:

Text Styling

  • <b> - Bold text
  • <i> - Italic text
  • <u> - Underlined text
  • <c> - CSS class styling

Semantic Tags

  • <v> - Voice/speaker label
  • <ruby> - Ruby annotations
  • <rt> - Ruby text
  • <lang> - Language spans
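When cue text has to be shown or indexed as plain text, the tags need to be removed. The following Python sketch uses regular expressions for that and also pulls the speaker name out of a voice tag. Both function names are hypothetical, and the approach is a simplification: it does not decode character escapes such as &amp; and does not check that tags are properly nested.

```python
import re
from typing import Optional

# Matches any opening or closing cue tag, e.g. <b>, </c>, <v.speaker John>.
_CUE_TAG = re.compile(r"</?[^<>]*>")
# Matches a voice tag with optional classes and captures the speaker name.
_VOICE_TAG = re.compile(r"<v(?:\.[^\s>]+)?\s+([^>]+)>")


def strip_cue_tags(text: str) -> str:
    """Remove all cue tags and keep only the text between them."""
    return _CUE_TAG.sub("", text)


def speaker_of(text: str) -> Optional[str]:
    """Return the speaker named in the first <v> tag, if there is one."""
    match = _VOICE_TAG.search(text)
    return match.group(1).strip() if match else None


line = "<v.speaker John>Hello, I'm <b>John</b>!"
print(strip_cue_tags(line))  # Hello, I'm John!
print(speaker_of(line))      # John
```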

CSS Styling

VTT files can include CSS styles in a STYLE block, placed after the header and before the first cue:

CSS styling example:

WEBVTT

STYLE
::cue {
  background-color: black;
  color: white;
  font-family: Arial;
}
::cue(.highlight) {
  color: yellow;
  font-weight: bold;
}

00:00:12.340 --> 00:00:15.670
<c.highlight>This text is highlighted</c>

Complete VTT File Example

Here's a comprehensive example showing various VTT features:

# Complete WebVTT file with styling and positioning
WEBVTT - Example subtitle file

NOTE CSS styling

STYLE
::cue {
  background-color: rgba(0,0,0,0.8);
  color: white;
  font-family: Arial, sans-serif;
  font-size: 18px;
}
::cue(.speaker) {
  color: #ffff00;
  font-weight: bold;
}

NOTE Subtitle cues

1
00:00:01.200 --> 00:00:04.500
Welcome to WebVTT subtitles!

2
00:00:05.000 --> 00:00:08.300 align:center
This subtitle is <b>centered</b> and bold.

speaker-intro
00:00:09.100 --> 00:00:12.800 line:85%
<v.speaker John>Hello, I'm John!

4
00:00:13.500 --> 00:00:16.200 position:25% size:50%
<c.speaker>This text has custom positioning</c>

5
00:00:17.000 --> 00:00:20.400 vertical:rl
Vertical text example
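Files like the one above are often generated by a program rather than written by hand. As a minimal sketch, the Python function below builds a VTT document from a list of (start, end, text) tuples with times in milliseconds. The names `format_timestamp` and `build_webvtt` are assumptions of this example. It writes no identifiers, settings, or STYLE block, and it assumes the cue text contains no blank lines and no "-->" sequence.

```python
from typing import Iterable, Tuple


def format_timestamp(ms: int) -> str:
    """Format a millisecond count as HH:MM:SS.mmm."""
    if ms < 0:
        raise ValueError("timestamps cannot be negative")
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def build_webvtt(cues: Iterable[Tuple[int, int, str]]) -> str:
    """Serialize (start_ms, end_ms, text) tuples as a WebVTT document."""
    blocks = ["WEBVTT"]
    for start_ms, end_ms, text in cues:
        if end_ms <= start_ms:
            raise ValueError("a cue must end after it starts")
        timing = f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}"
        blocks.append(f"{timing}\n{text}")
    # Blank lines separate the header and the individual cues.
    return "\n\n".join(blocks) + "\n"


print(build_webvtt([(1200, 4500, "Welcome to WebVTT subtitles!")]))
```

Joining the blocks with a blank line is what keeps the header and each cue separate.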

Use Cases and Applications

Web Video Subtitles

Primary format for HTML5 video subtitles on websites, web apps, and streaming platforms.

Accessibility Compliance

Captions that help video content meet WCAG requirements for deaf and hard-of-hearing users, with speaker labels and styling.

E-Learning Platforms

Educational video content with synchronized transcripts and multi-language support.

Streaming Services

Web-based streaming platforms using HTML5 video with professional subtitle presentation.

Corporate Training

Business training videos with professional captions and multi-language localization.

Interactive Media

Interactive video experiences with styled captions and synchronized text overlays.

Software Compatibility

VTT enjoys excellent support across modern web browsers and many media players. Here's a comprehensive compatibility overview:

Web Browsers

Chrome (Full support)
Firefox (Full support)
Safari (Full support)
Edge (Full support)
Internet Explorer (IE10+ partial)

Video Players

VLC Media Player
MPV
PotPlayer
MPC-HC
QuickTime (Basic support)

Mobile Browsers

Chrome Mobile (Android/iOS)
Safari Mobile (iOS)
Firefox Mobile
Samsung Internet

Streaming & Platforms

YouTube (Full support)
Vimeo (Full support)
HTML5 Players (Native support)
Netflix (Internal format)

Web Standards Compliance

WebVTT is specified by the W3C, where it is currently published as a Candidate Recommendation, and it is the caption format that browsers implement natively for HTML5 video. That makes it the usual choice for web accessibility and HTML5 video applications.

W3C Standard
WCAG Compliant
HTML5 Native
