<feed xmlns='http://www.w3.org/2005/Atom'>
<title>TaSTT.git/Scripts/requirements.txt, branch v0.17.0</title>
<subtitle>Free self-hosted STT for VRChat.</subtitle>
<id>https://git.yummers.dev/TaSTT.git/atom?h=v0.17.0</id>
<link rel='self' href='https://git.yummers.dev/TaSTT.git/atom?h=v0.17.0'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/'/>
<updated>2023-09-11T20:53:14+00:00</updated>
<entry>
<title>Pin huggingface_hub to 0.16.4</title>
<updated>2023-09-11T20:53:14+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-09-11T20:46:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=0447f37fb744a1b350f6b92e4d140dbdb1c8d3ec'/>
<id>urn:sha1:0447f37fb744a1b350f6b92e4d140dbdb1c8d3ec</id>
<content type='text'>
huggingface_hub 0.17.x releases break faster_whisper's ability to
download models.

Also:
* Start using frozen requirements.txt.
* Install torch &amp; legacy whisper only when doing mechanical
  optimization.
</content>
</entry>
<entry>
<title>Wire transcribe_v2.py into GUI</title>
<updated>2023-09-04T02:29:44+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-09-04T02:29:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=6020bc056d8992523ae62feb4edfbae10b169880'/>
<id>urn:sha1:6020bc056d8992523ae62feb4edfbae10b169880</id>
<content type='text'>
Also:
* Enable SO_REUSEADDR on browser src socket
* Temporarily add evaluation dependencies to requirements.txt
* Fix browser src. It now looks for a prefix that the Python app
  actually uses.
</content>
</entry>
<entry>
<title>Begin rewriting transcribe.py</title>
<updated>2023-09-03T03:43:18+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-09-03T03:43:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=e9b5b4f1da2a8ff07b2d13e5e63dae491325251d'/>
<id>urn:sha1:e9b5b4f1da2a8ff07b2d13e5e63dae491325251d</id>
<content type='text'>
A set of proper interfaces is called for. See #dev-update-spam in
Discord for a drawing of the design.

Also add code to mechanically optimize committer parameters using an
audio file. This is not perfectly repeatable, since it depends on the
performance characteristics of the machine, but it is probably better
than what we had before (nothing).
</content>
</entry>
<entry>
<title>Switch back to openvr</title>
<updated>2023-08-29T03:09:35+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-08-29T03:09:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=2daa2c8057cf036357a64e09925487e6f5c0025e'/>
<id>urn:sha1:2daa2c8057cf036357a64e09925487e6f5c0025e</id>
<content type='text'>
openxr doesn't have any notion of a background process, making it
unusable trash :)
</content>
</entry>
<entry>
<title>Put audio feedback into its own thread</title>
<updated>2023-08-25T19:50:59+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-08-25T19:50:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=302f7ba09f2ee115d0ee4b8f0841f6ffcd50ec57'/>
<id>urn:sha1:302f7ba09f2ee115d0ee4b8f0841f6ffcd50ec57</id>
<content type='text'>
I think this improves the code structure of the controller input thread
and leads to some deduplication, so I'm going to keep it. However, the
intended purpose was to decrease lag when pressing buttons, and in that
regard it failed.

The lag goes all the way down to the input layer, implying that the
input thread is not able to consistently run at its intended 100 Hz
sample rate. I suspect that the Python global interpreter lock (GIL) is
at fault.

Since we can't realistically move all our functionality into one thread
in a non-blocking model, I think multiprocessing is the logical choice
going forward. Each thread in transcribe.py would become its own
process, communicating via pub/sub through an intermediary process
sitting in the middle.
</content>
</entry>
<entry>
<title>Finish pyopenvr -&gt; pyopenxr migration</title>
<updated>2023-08-25T19:08:07+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-08-25T15:21:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=9e43487c1bf62402e96cb6139b24cd8446515673'/>
<id>urn:sha1:9e43487c1bf62402e96cb6139b24cd8446515673</id>
<content type='text'>
pyopenvr is both deprecated and buggy, so switch to pyopenxr.
</content>
</entry>
<entry>
<title>Enforce a stricter avg_logprob than default</title>
<updated>2023-07-07T09:35:51+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-07-07T09:30:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=7a576bcac1c37c3c5a59fadf172aa70b15ff83c8'/>
<id>urn:sha1:7a576bcac1c37c3c5a59fadf172aa70b15ff83c8</id>
<content type='text'>
Common hallucinations sneak in around -0.9 avg_logprob.

Also:
* Limit temperatures to just 0.0. Multiple values cause latency to
  occasionally spike.
</content>
</entry>
<entry>
<title>Finish translation for Western European language speakers</title>
<updated>2023-05-31T02:13:25+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-05-31T02:01:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=0bda49279ec80187d49a922ff2a47141ffb2fd8f'/>
<id>urn:sha1:0bda49279ec80187d49a922ff2a47141ffb2fd8f</id>
<content type='text'>
NLLB needs its input to be split up into sentences. I use the
sentence_splitter Python package to do this. It supports ~20 Western
European languages, but notably, no Asian languages.

* Sort spoken language list. English is still at the top.
* Remove 'Translation source' dropdown. Infer this from the spoken
  language.
* Add lang_compat.py to map language codes between the various libraries
  (whisper, nllb, sentence_splitter).
* Fix bug where old text would appear in textbox when you first bring it
  up.
</content>
</entry>
<entry>
<title>Add ability to translate into 200 languages</title>
<updated>2023-05-26T05:00:56+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-05-26T04:45:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=84f09e1fdf15644d1ea5f955889581932e4f6a8e'/>
<id>urn:sha1:84f09e1fdf15644d1ea5f955889581932e4f6a8e</id>
<content type='text'>
Use Meta's No Language Left Behind (NLLB) model to provide translation
capabilities into 200 languages. Obviously most are largely untested.

This requires either 4.1 or 7.1 GB of RAM and significantly increases
transcription latency.
</content>
</entry>
<entry>
<title>Add keyboard toggle</title>
<updated>2023-05-22T11:04:09+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-05-22T10:59:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=8fafea9d026b2b65599456e70d3f5aa61ef073d1'/>
<id>urn:sha1:8fafea9d026b2b65599456e70d3f5aa61ef073d1</id>
<content type='text'>
Users can now configure a keybind to start/stop/dismiss the STT when in
desktop mode. The default keybind is ctrl+x, since by default VRC
doesn't use 'x' for anything.
</content>
</entry>
</feed>
