| author | yum <yum.food.vr@gmail.com> | 2022-11-06 21:22:54 -0800 |
|---|---|---|
| committer | yum <yum.food.vr@gmail.com> | 2022-11-06 21:22:54 -0800 |
| commit | 629c0f611de1622131bb0fa364c170219f6252ed (patch) | |
| tree | 605ef13186df9a21d9e6fc48ef02a2a391317c0e | |
| parent | fe7e51db4c341f9510351e9b3942430f6d44edf2 (diff) | |
Update README
| -rw-r--r-- | README.md | 13 |
| -rw-r--r-- | osc_ctrl.py | 3 |
2 files changed, 11 insertions, 5 deletions
@@ -1,16 +1,23 @@
 ## TaSTT: A deliciously free STT
 
 TaSTT (pronounced "tasty") is a free speech-to-text tool for VRChat. It uses
-local machine translation to turn your voice into text, then sends it into
+local machine transcription to turn your voice into text, then sends it into
 VRChat via OSC. A few parameters, a machine-generated FX layer, and a custom
 shader display the text in game.
 
 Features:
+
 * 4x44 grid, 256 or 65536 characters per slot.
 * Text-to-text interface.
 * Speech-to-text interface.
+* Multiple language support.
+  * Transcription within the same language works for many languages.
+  * Translation from N languages to English is supported.
+  * Translation from English into other languages is added case by case. This
+    is a limitation of the state of the art in machine translation: fine-tuned
+    English->other language models far outperform English->many language models.
 * Free as in beer.
 * Free as in freedom.
 * Privacy-respecting: transcription is done on your GPU, not in the cloud.
@@ -35,7 +42,7 @@ reliable as possible.
 There are existing tools which help here, but they are all imperfect for one
 reason or another:
 
-1. RabidCrab's STT costs money and relies on cloud-based translation. I have
+1. RabidCrab's STT costs money and relies on cloud-based transcription. I have
    struggled with latency, quality, and reliability issues. It's also
    closed-source.
 2. The in-game text box is only visible to your friends list, making it
@@ -148,6 +155,8 @@ To use the STT:
    1. Error detection & correction.
    2. ~~Text-to-text interface. Type in terminal, show in game.~~ DONE
    3. ~~Speech-to-text interface. Speak out loud, show in game.~~ DONE
+   4. Translation into non-English. Whisper natively supports translating N
+      languages into English, but not the other way around.
 4. Optimization
    1. ~~Utilize the avatar 3.0 SDK's ability to drive parameters to reduce the
       total # of parameters (and therefore OSC messages & sync events). Note
diff --git a/osc_ctrl.py b/osc_ctrl.py
index bb6dd87..5ab65de 100644
--- a/osc_ctrl.py
+++ b/osc_ctrl.py
@@ -77,9 +77,6 @@ def disable(client):
 # `which_cell` is an integer in the range [0,2**INDEX_BITS).
 def sendMessageCellDiscrete(client, msg_cell, which_cell):
     empty_cell = [state.encoding[' ']] * NUM_LAYERS
-    if msg_cell != state.encoding[' '] * BOARD_COLS:
-        addr="/avatar/parameters/" + generate_utils.getSpeechNoiseToggleParam()
-        client.send_message(addr, False)
     if msg_cell != empty_cell:
         addr="/avatar/parameters/" + generate_utils.getSpeechNoiseToggleParam()
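
The commit message does not say why the three lines in `osc_ctrl.py` were removed. One plausible reading is sketched below, assuming `msg_cell` is a list and `state.encoding` maps characters to integers (neither is confirmed by this diff). The removed condition compares the list against `state.encoding[' '] * BOARD_COLS`, which under that assumption is a single integer, so the inequality holds for every cell. The kept condition compares against a list of blanks instead. The constants and the `encoding` table here are hypothetical stand-ins, not the repository's values.

```python
# Hypothetical stand-ins; the real values live in the repository's modules.
NUM_LAYERS = 4
BOARD_COLS = 44
encoding = {' ': 0, 'a': 1}  # assumed: characters map to integers

# Same shape as the kept check: a list with one blank per layer.
empty_cell = [encoding[' ']] * NUM_LAYERS

# Same shape as the removed check: int * int is one integer, not a list.
scalar = encoding[' '] * BOARD_COLS


def removed_check(msg_cell):
    """Mirror of the removed condition: list compared against an integer."""
    return msg_cell != scalar


def kept_check(msg_cell):
    """Mirror of the kept condition: list compared against a list of blanks."""
    return msg_cell != empty_cell


blank = [encoding[' ']] * NUM_LAYERS
text = [encoding['a']] + [encoding[' ']] * (NUM_LAYERS - 1)

for name, cell in (("blank", blank), ("text", text)):
    print(name, "removed:", removed_check(cell), "kept:", kept_check(cell))
```

Under these assumptions the removed branch would run for every cell, blank or not, while the kept branch runs only for cells that contain text.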
