The experience of porting my spec tests to unit tests for the parser has been interesting. Google Test, unlike ctypes in Python or Node-FFI in Node, actually works, and it has allowed us to be quite productive over the past few weeks.
The first thing I noticed while porting my tests over is that our parser currently doesn’t recognize a cue with no payload as a cue! According to the spec (as of this post date: December 11, 2012):
A WebVTT cue consists of the following components, in the given order:
- Optionally, a WebVTT cue identifier followed by a WebVTT line terminator.
- WebVTT cue timings.
- Optionally, one or more U+0020 SPACE characters or U+0009 CHARACTER TABULATION (tab) characters followed by WebVTT cue settings.
- A WebVTT line terminator.
- The cue payload: either WebVTT cue text, WebVTT chapter title text, or WebVTT metadata text, but it must not contain the substring “-->” (U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E GREATER-THAN SIGN).
Since the components that make up a cue payload (cue text, chapter title text or metadata) are optional, a WebVTT file can be as simple as the following:
WEBVTT
00:01.000 --> 00:02.000
When we wrote these tests against the validator in September, this was considered valid. The WebVTT spec considers it valid. However, our parser currently does not see it as a cue. This is an issue that we’ll have to fix next semester. I find it fascinating that converting such a simple test, a task I thought would be a walk in the park, ended up shedding some light on where our parser is currently falling short. Testing is fun!
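To make the expectation concrete, here is a minimal sketch of the behaviour I would expect, written against the standard library only. The count_cues helper is hypothetical and is not part of our parser; it assumes LF or CRLF line endings and leans on the spec rule that neither a cue identifier nor a cue payload may contain "-->", so any line containing it has to be a cue timings line.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper, not our parser's API: counts the cues in a WebVTT
// document by counting cue timings lines. Identifiers and payloads must not
// contain "-->", so a line that contains it is treated as a timings line.
std::size_t count_cues(const std::string& document) {
    std::istringstream lines(document);
    std::string line;
    std::size_t cues = 0;
    while (std::getline(lines, line)) {
        if (line.find("-->") != std::string::npos) {
            ++cues;  // counted whether or not any payload lines follow
        }
    }
    return cues;
}
```

With a helper like this, a file containing only the header, a blank line, and a timings line would count as one cue, which is the result I would want our real parser to agree with.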
Another test which seemed to break the parser was this one:
WEBVTT
00/00/000 --> 00:00.001
This causes a segmentation fault. I guess the parser isn’t expecting strange characters.
In fact, anything the parser doesn’t expect seems to break here. If my timestamp looks like 00:ee:32n, boom! Segfault. From what I’ve heard, this has been fixed recently, so I’ll have to post a follow-up soon.
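For comparison, this is roughly what I would expect defensive timestamp handling to look like: every character is checked before it is used, and malformed input produces a failure result instead of a crash. This is only a sketch under my own assumptions, not the actual fix that went into our parser. The names read_digits and parse_timestamp_ms are made up for illustration, and the accepted forms are mm:ss.ttt and hh:mm:ss.ttt with two or more hour digits.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: reads exactly `count` ASCII digits starting at `pos`.
// On success it stores the number in `value`, advances `pos`, returns true.
bool read_digits(const std::string& s, std::size_t& pos,
                 std::size_t count, long long& value) {
    if (pos > s.size() || count > s.size() - pos) return false;
    long long result = 0;
    for (std::size_t i = 0; i < count; ++i) {
        const unsigned char c = static_cast<unsigned char>(s[pos + i]);
        if (!std::isdigit(c)) return false;
        result = result * 10 + (c - '0');
    }
    value = result;
    pos += count;
    return true;
}

// Hypothetical parser: converts "mm:ss.ttt" or "hh:mm:ss.ttt" to milliseconds.
// Returns false on any malformed input and leaves `out_ms` untouched.
bool parse_timestamp_ms(const std::string& text, long long& out_ms) {
    std::size_t pos = 0;
    long long first = 0;
    // Leading run of digits: minutes (exactly 2) or hours (2 or more).
    while (pos < text.size() &&
           std::isdigit(static_cast<unsigned char>(text[pos]))) {
        if (pos >= 9) return false;  // cap the run at 9 digits to avoid overflow
        first = first * 10 + (text[pos] - '0');
        ++pos;
    }
    const std::size_t first_len = pos;
    if (first_len < 2) return false;
    if (pos >= text.size() || text[pos] != ':') return false;
    ++pos;

    long long second = 0;
    if (!read_digits(text, pos, 2, second)) return false;

    long long hours = 0, minutes = 0, seconds = 0;
    if (pos < text.size() && text[pos] == ':') {
        // hh:mm:ss.ttt form
        ++pos;
        long long third = 0;
        if (!read_digits(text, pos, 2, third)) return false;
        hours = first;
        minutes = second;
        seconds = third;
    } else {
        // mm:ss.ttt form: the leading field must be exactly two digits.
        if (first_len != 2) return false;
        minutes = first;
        seconds = second;
    }

    if (pos >= text.size() || text[pos] != '.') return false;
    ++pos;
    long long millis = 0;
    if (!read_digits(text, pos, 3, millis)) return false;
    if (pos != text.size()) return false;  // reject trailing junk
    if (minutes > 59 || seconds > 59) return false;

    out_ms = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis;
    return true;
}
```

Both of the inputs that crashed for me, 00/00/000 and 00:ee:32n, would simply be rejected by a parser written this way, which is the kind of behaviour a unit test can then assert on.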