TaSTT.git/Scripts/requirements.txt, branch v0.18.1

Free self-hosted STT for VRChat.
https://git.yummers.dev/TaSTT.git/atom?h=v0.18.1
2024-01-09T02:59:27+00:00

Revert "Begin experimenting with flash-attention"
2024-01-09T02:59:27+00:00 yum yum.food.vr@gmail.com 2024-01-09T02:59:27+00:00
urn:sha1:33db3dcc23a45cae611bcf839c33d6615ccbf59e
This reverts commit 921b92a69f36502dc5eefd14ba3487c1bb49bb9d.

Begin experimenting with flash-attention
2023-12-13T21:54:57+00:00 yum yum.food.vr@gmail.com 2023-12-13T21:54:55+00:00
urn:sha1:921b92a69f36502dc5eefd14ba3487c1bb49bb9d
Seems much faster than faster-whisper. There are two issues:
* Requires NVIDIA 3000 series or higher.
* Incompatible with faster-whisper dependencies.
So it seems like we'll either need to toggle between two sets of dependencies at runtime or have two environments.

Pin huggingface_hub to 0.16.4
2023-09-11T20:53:14+00:00 yum yum.food.vr@gmail.com 2023-09-11T20:46:57+00:00
urn:sha1:0447f37fb744a1b350f6b92e4d140dbdb1c8d3ec
0.17.x are breaking faster_whisper's ability to download models.
Also:
* Start using frozen requirements.txt.
* Conditionally install torch & legacy whisper only when doing mechanical optimization.

Wire transcribe_v2.py into GUI
2023-09-04T02:29:44+00:00 yum yum.food.vr@gmail.com 2023-09-04T02:29:44+00:00
urn:sha1:6020bc056d8992523ae62feb4edfbae10b169880
Also:
* Enable SO_REUSEADDR on browser src socket
* Temporarily add evaluation dependencies to requirements.txt
* Fix browser src. It's now looking for a prefix that the python app actually uses.

Begin rewriting transcribe.py
2023-09-03T03:43:18+00:00 yum yum.food.vr@gmail.com 2023-09-03T03:43:18+00:00
urn:sha1:e9b5b4f1da2a8ff07b2d13e5e63dae491325251d
A set of proper interfaces is called for. See #dev-update-spam in discord for drawing of design.
Also add code to mechanically optimize committer parameters using an audio file. Not perfectly repeatable since it depends on the performance characteristics of the machine, but prob better than what we had before (nothing).

Switch back to openvr
2023-08-29T03:09:35+00:00 yum yum.food.vr@gmail.com 2023-08-29T03:09:35+00:00
urn:sha1:2daa2c8057cf036357a64e09925487e6f5c0025e
openxr doesn't have any notion of background process, making it unusable trash :)

Put audio feedback into its own thread
2023-08-25T19:50:59+00:00 yum yum.food.vr@gmail.com 2023-08-25T19:50:59+00:00
urn:sha1:302f7ba09f2ee115d0ee4b8f0841f6ffcd50ec57
I think this improves the code structure of the controller input thread and leads to some deduplication, so I'm going to keep it. However, the intended purpose was to decrease lag when pressing buttons, and in that regard it failed. The lag goes all the way down to the input layer, implying that the input thread is not able to consistently run at its intended 100 Hz sample rate. I suspect that the Python global interpreter lock (GIL) is at fault.
Since we can't realistically move all our functionality into one thread in a non-blocking model, I think multiprocessing is the logical choice going forward. Each thread in transcribe.py would become its own process, and the processes would pub/sub through an intermediary process sitting in the middle.

Finish pyopenvr -> pyopenxr migration
2023-08-25T19:08:07+00:00 yum yum.food.vr@gmail.com 2023-08-25T15:21:56+00:00
urn:sha1:9e43487c1bf62402e96cb6139b24cd8446515673
pyopenvr is both deprecated and buggy, so switch to pyopenxr.

Enforce a stricter avg_logprob than default
2023-07-07T09:35:51+00:00 yum yum.food.vr@gmail.com 2023-07-07T09:30:18+00:00
urn:sha1:7a576bcac1c37c3c5a59fadf172aa70b15ff83c8
Common hallucinations sneak in around -0.9 avg_logprob.
Also:
* Limit temperatures to just 0.0. Multiple values cause latency to occasionally spike.
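
The avg_logprob entry above describes rejecting low-confidence output. A minimal sketch of that idea as a post-filter follows. It is not the commit's actual code: the `Segment` class, the `keep_confident` helper, and the -0.8 cutoff are all hypothetical, chosen only because the message says hallucinations appear around -0.9.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical cutoff: the commit message does not state the value it used,
# only that hallucinations show up around -0.9.
AVG_LOGPROB_CUTOFF = -0.8


@dataclass
class Segment:
    """Hypothetical stand-in for a transcription segment."""
    text: str
    avg_logprob: float


def keep_confident(segments: List[Segment],
                   cutoff: float = AVG_LOGPROB_CUTOFF) -> List[Segment]:
    """Return only the segments whose avg_logprob is strictly above cutoff."""
    return [s for s in segments if s.avg_logprob > cutoff]


segments = [
    Segment("hello world", -0.25),
    Segment("thanks for watching", -0.95),  # below the cutoff, so dropped
]
kept = keep_confident(segments)
```

A stricter (higher) cutoff drops more hallucinations at the cost of occasionally discarding real but quiet or unclear speech.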

Finish translation for Western European language speakers
2023-05-31T02:13:25+00:00 yum yum.food.vr@gmail.com 2023-05-31T02:01:56+00:00
urn:sha1:0bda49279ec80187d49a922ff2a47141ffb2fd8f
NLLB needs its input to be split up into sentences. I use the sentence_splitter Python package to do this. It supports ~20 Western European languages, but notably, no Asian languages.
* Sort spoken language list. English is still at the top.
* Remove 'Translation source' dropdown. Infer this from the spoken language.
* Add lang_compat.py to map language codes between the various libraries (whisper, nllb, sentence_splitter).
* Fix bug where old text would appear in textbox when you first bring it up.
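
The language-code mapping mentioned in the last entry could look roughly like the sketch below. The contents of lang_compat.py are not shown in this log, so the table layout and the `to_nllb` and `can_split` helpers are hypothetical. The NLLB values are FLORES-200 style codes, and `None` marks a language assumed to have no sentence splitter support.

```python
from typing import Dict, Optional, Tuple

# Hypothetical table: whisper code -> (NLLB code, sentence_splitter code).
# None means the splitter is assumed not to support that language.
LANG_TABLE: Dict[str, Tuple[str, Optional[str]]] = {
    "en": ("eng_Latn", "en"),
    "fr": ("fra_Latn", "fr"),
    "de": ("deu_Latn", "de"),
    "es": ("spa_Latn", "es"),
    "ja": ("jpn_Jpan", None),
}


def to_nllb(whisper_code: str) -> str:
    """Map a whisper language code to its NLLB code (KeyError if unknown)."""
    return LANG_TABLE[whisper_code][0]


def can_split(whisper_code: str) -> bool:
    """True if sentence splitting is available for this spoken language."""
    entry = LANG_TABLE.get(whisper_code)
    return entry is not None and entry[1] is not None
```

Keeping all three code systems in one table means the translation source can be inferred from the spoken language alone, which is what removing the 'Translation source' dropdown requires.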