The experience of porting my spec tests to unit testst for the parser has been interesting. Google Test , unlike ctypes in Python or Node-FFI in Node, actually works and has allowed us to be quite productive over the past few weeks.

The first thing I noticed while porting my tests over is that our parser currently doesn’t recognize a cue with no payload as a cue! According to the spec (as of this post date: December 11, 2012):

A WebVTT cue consists of the following components, in the given order:

  1. Optionally, a WebVTT cue identifier followed by a WebVTT line terminator.
  2. WebVTT cue timings.
  3. Optionally, one or more U+0020 SPACE characters or U+0009 CHARACTER TABULATION (tab) characters followed by WebVTT cue settings.
  4. A WebVTT line terminator.
  5. The cue payload: either WebVTT cue text, WebVTT chapter title text, or WebVTT metadata text, but it must not contain the substring “–>” (U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E GREATER-THAN SIGN).

Since the components that make up a cue payload (cue text, chapter title text or metadata) are optional, a WebVTT file can be as simple as the following:

WEBVTT

00:01.000 –> 00:02.000

When we wrote these tests against the validator in September, this was considered valid. The WebVTT spec considers it valid. However, our parser is currently not seeing it as a cue. This is an issue that we’ll have to fix next semester. I find it fascinating that converting such a simple test, a task I thought would be walk-in-the-park, ended shedding some light on where our parser is currently falling short. Testing is fun!

Another test which seemed to break the parser was this one:

WEBVTT

00/00/000 –> 00:00.001

This causes a segmentation fault. I guess the parser isn’t expecting strange characters. In fact, anything the parser doesn’t expect seems to break here. If my timestamp looks like 00:ee:32n, boom! Segfault. This has been fixed recently from what I’ve heard so I’ll have to post a follow up soon.