| Commit message (Collapse) | Author | Age |
| |
|
|
|
|
|
|
|
|
|
| |
Add a `matchStrings` which does basically the same thing as
`matchStringList` except it doesn't split the input at space boundaries.
I think this should work better for Japanese and Chinese, since they
don't use spaces.
Doesn't seem to cause any accuracy regressions for English.
Also update the README.
|
| |
|
|
|
| |
Stitching new uses 6 word sliding window instead of 4 word. Seems to
dramatically improve transcription quality.
|
|
|
Transcription stitching now occurs in word space, rather than in text
space. This avoids problems where we accidentally duplicate or delete
letters in the middle of words.
Factor out stitching into its own module and add a small handful of
test cases. Hopefully if we hit problems in production, we can just
grow this list and avoid regressions if we reimplement.
|