| author | yum <yum.food.vr@gmail.com> | 2022-11-14 21:30:50 -0800 |
|---|---|---|
| committer | yum <yum.food.vr@gmail.com> | 2022-11-14 21:36:13 -0800 |
| commit | 2505a5cc486cd913db50a475e45c3701b9710282 (patch) | |
| tree | 86855b5772cc6400205926ed8d935227a574a7e6 /README.md | |
| parent | 9921697816c9f9473bac54444793f702e54d24a6 (diff) | |
Another transcription rework
After re-reading the paper, I noticed that they apply a couple of
optimizations I wasn't using. Use the top-level `whisper.transcribe`
method, which is a little slower but more accurate than the one I was
using.
Although this method is slower, it has better temporal stability due to
the increased quality, which I think should make for an overall more
responsive UX. Lower transcription quality means the paging layer has to
waste time updating earlier cells.
Also, drop the auto-commit stuff and go back to string stitching. I
think it's better to let the user manually commit. A rework of the hand
controls is probably coming soon.
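
The commit message does not spell out what string stitching does. One plausible reading is that each new transcript window is merged onto the text already shown by matching the words the two share. A minimal sketch under that assumption follows; `stitch` is a hypothetical name for illustration, not a function from this repository.

```python
def stitch(committed: str, update: str) -> str:
    """Append `update` to `committed`, keeping shared words only once.

    Assumption: the overlap is the longest run of whole words that both
    ends `committed` and starts `update`.
    """
    old_words = committed.split()
    new_words = update.split()
    max_len = min(len(old_words), len(new_words))
    # Try the longest candidate overlap first, then shrink.
    for size in range(max_len, 0, -1):
        if old_words[-size:] == new_words[:size]:
            return " ".join(old_words + new_words[size:])
    # No shared run: treat the update as a plain continuation.
    return " ".join(old_words + new_words)


print(stitch("the quick brown fox", "brown fox jumps over"))
```

Exact word matching is the simplest choice here. A real transcriber may revise earlier words between windows, in which case a fuzzy match would likely be needed.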
Finally, update README.
Diffstat (limited to 'README.md')
| -rw-r--r-- | README.md | 3 |
1 files changed, 3 insertions, 0 deletions
@@ -157,6 +157,9 @@ To use the STT:
    3. ~~Speech-to-text interface. Speak out loud, show in game.~~ DONE
    4. Translation into non-English. Whisper natively supports translating N
       languages into English, but not the other way around.
+   5. Display text in overlay. Enables (1) lower latency view of TaSTT's
+      transcription state; (2) checking transcriptions ahead of time; (3)
+      checking transcriptions without having to see the board in game.
 4. Optimization
    1. ~~Utilize the avatar 3.0 SDK's ability to drive parameters to reduce the
       total # of parameters (and therefore OSC messages & sync events). Note
