TaSTT.git/Scripts, branch v0.11.0

Free self-hosted STT for VRChat.
https://git.yummers.dev/TaSTT.git/atom?h=v0.11.0
2023-04-24T03:52:36+00:00

Begin integrating faster-whisper
2023-04-24T03:52:36+00:00 yum yum.food.vr@gmail.com 2023-04-24T03:52:36+00:00 urn:sha1:b4bb6524652e0f76834ca26a4afa232855ca1348
This is a much faster, lower-VRAM reimplementation of Whisper in Python. Early testing is extremely promising: fast transcription speed, extremely low resource usage (CPU/RAM/VRAM), high accuracy.

Reduce texture memory usage for English speakers
2023-03-22T00:13:32+00:00 yum yum.food.vr@gmail.com 2023-03-22T00:13:32+00:00 urn:sha1:1f15133dd985442af20d42a96fbcd0007f03bd2b
We used to populate 7 4k textures + 1 2k texture for all users. Now if the user has configured `bytes_per_char=1` in the Unity panel, we just populate a single 512x512 texture containing the first 128 ASCII characters. This reduces texture memory usage by 99.74%, from 134.67 MB to 340 KB.

Fix _socket module not found issue
2023-03-21T22:02:29+00:00 yum yum.food.vr@gmail.com 2023-03-21T21:28:46+00:00 urn:sha1:656d7c2092545b18d981acfac000c73fb2128e4a
Need python310._pth, specifically 'import site' line, for embedded python + pip to get along.

Set PYTHONPATH in synchronous multiprocessing layer
2023-03-08T23:36:46+00:00 yum yum.food.vr@gmail.com 2023-03-08T23:36:46+00:00 urn:sha1:12b6447a87da8077c7dd12b92eefc27dcf7f0818
A user saw an error like `ModuleNotFoundError: No module named _socket`. StackOverflow blames this on PYTHONPATH, so let's try setting it.
* Fix latent bug in Scripts/transcribe.py. PyAudio.open() positional parameters must be specified in correct order, even when telling it which parameter is which. *shrug*

Bugfix: C++ transcription engine should not launch OSC layer
2023-02-27T00:24:10+00:00 yum yum.food.vr@gmail.com 2023-02-27T00:24:10+00:00 urn:sha1:d0d5eedb4e6c56d81ae2135a50212f2091ee65d7
Not ready yet.

Begin work on C++ custom chatbox
2023-02-26T22:27:22+00:00 yum yum.food.vr@gmail.com 2023-02-26T22:21:18+00:00 urn:sha1:f7d7858a9ff270380f5407e48d6afaf6a3a97de3
Sort of a misnomer. The idea is to use C++ for transcription and Python for steamvr and OSC. Having issues getting output from multithreaded Python code. Not in the mood to figure this out today.
* Hide unimplemented parts of C++ panel.

Revert "Apply previous window conditioning to decoding layer"
2023-02-23T05:54:39+00:00 yum yum.food.vr@gmail.com 2023-02-23T05:49:58+00:00 urn:sha1:718319ee7b79d7cdbead5d765769b50c25e968f4
This reverts commit cece1ee8f1b985c2a89adb661dd02c6d44787f67. This does *not* in fact result in improved temporal stability. It makes things so unstable that even single-sentence messages fail to ever stabilize.

Begin work on C++ implementation
2023-02-23T05:49:29+00:00 yum yum.food.vr@gmail.com 2023-02-21T21:19:43+00:00 urn:sha1:9a97fbc3c583ccd518d838faaaa36ed9aa5558e1
Use Const-me/Whisper to perform transcription. This implementation is vastly more efficient: CPU usage, memory usage, and VRAM usage are all dramatically reduced. It's slightly less accurate when comparing the same model (due to the lack of beam search decoding), but since you can use larger models, the impact is largely a wash.

Apply previous window conditioning to decoding layer
2023-02-23T05:49:22+00:00 yum yum.food.vr@gmail.com 2023-02-19T22:15:30+00:00 urn:sha1:cece1ee8f1b985c2a89adb661dd02c6d44787f67
Per the Whisper source code, this should result in better temporal stability.

Remove exponential backoff cap
2023-02-19T20:10:13+00:00 yum yum.food.vr@gmail.com 2023-02-19T19:46:43+00:00 urn:sha1:52f743e43a9ef582e04d7a363fbda19824db6cc7
Allows sustained exponential backoff when not transcribing. Used to cap out at 1s.
* Add more items to README TODO list
* Adjust emote metadata
* Emotes bugfix: Non-existent emote map doesn't cause transcription engine to bail out.
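
For readers unfamiliar with the "Remove exponential backoff cap" entry, here is a minimal sketch of the idea, not the code from Scripts/transcribe.py. The function names, the starting delay, and the doubling factor are all hypothetical; only the 1s cap comes from the commit message.

```python
def next_delay(current_s, transcribing, base_s=0.25, factor=2.0, cap_s=None):
    """Return the next polling delay in seconds (hypothetical helper).

    While transcribing, the delay resets to base_s. While idle, it grows
    by `factor` each step. cap_s=None models the uncapped behaviour; a
    number models the old behaviour of capping the delay.
    """
    if transcribing:
        return base_s
    grown = current_s * factor
    return grown if cap_s is None else min(grown, cap_s)


def idle_delays(steps, base_s=0.25, factor=2.0, cap_s=None):
    """List the delays seen over `steps` consecutive idle polls."""
    delays = []
    delay = base_s
    for _ in range(steps):
        delays.append(delay)
        delay = next_delay(delay, transcribing=False,
                           base_s=base_s, factor=factor, cap_s=cap_s)
    return delays


if __name__ == "__main__":
    print("capped at 1s:", idle_delays(5, cap_s=1.0))
    print("uncapped:    ", idle_delays(5))
```

With a cap, the idle loop keeps polling once per second indefinitely; without one, the interval keeps doubling, so an idle engine wakes up less and less often.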