How did I miss that? (Probably because the synopsis only used add_instance(), and I skimmed the rest too fast.) SVMLight format is pretty simple, so it's not too hard to dump your data in that format and then call read_instances(). So one minor suggestion -- adding instances in bulk, particularly for training, is far more common than adding them individually, so it should be in the synopsis.
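To give an idea of how little work the dump step is, here is a rough Python sketch. It assumes the usual SVM-Light line layout of `<label> <index>:<value> ...`, with feature indices starting at 1 and listed in increasing order. The helper names `format_instance` and `dump_instances` are made up for illustration and are not part of the module.

```python
def format_instance(label, features):
    """Return one SVM-Light line for a label and a {index: value} dict.

    Hypothetical helper: assumes integer class labels (+1 / -1) and
    integer feature indices starting at 1.
    """
    # SVM-Light expects feature indices in increasing order.
    pairs = sorted(features.items())
    if any(index < 1 for index, _ in pairs):
        raise ValueError("feature indices must be >= 1")
    body = " ".join(f"{index}:{value}" for index, value in pairs)
    return f"{label:+d} {body}".rstrip()


def dump_instances(instances, path):
    """Write an iterable of (label, features) pairs, one instance per line."""
    with open(path, "w") as handle:
        for label, features in instances:
            handle.write(format_instance(label, features) + "\n")


if __name__ == "__main__":
    training = [
        (1, {3: 0.5, 1: 2.0}),
        (-1, {2: 1}),
    ]
    dump_instances(training, "train.dat")
```

The resulting file is what you would then hand to read_instances().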
FWIW, when I wrote an Octave binding to SVM-Light some years back, I used direct calls to the SVM-Light C interface (init_doc(), custom_kernel, etc.) to add a whole batch of instances. It was more work, but way more efficient (and flexible!) than serializing and going through the file system.