<feed xmlns='http://www.w3.org/2005/Atom'>
<title>TaSTT.git/Scripts/requirements.txt, branch v0.18.1</title>
<subtitle>Free self-hosted STT for VRChat.</subtitle>
<id>https://git.yummers.dev/TaSTT.git/atom?h=v0.18.1</id>
<link rel='self' href='https://git.yummers.dev/TaSTT.git/atom?h=v0.18.1'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/'/>
<updated>2024-01-09T02:59:27+00:00</updated>
<entry>
<title>Revert "Begin experimenting with flash-attention"</title>
<updated>2024-01-09T02:59:27+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2024-01-09T02:59:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=33db3dcc23a45cae611bcf839c33d6615ccbf59e'/>
<id>urn:sha1:33db3dcc23a45cae611bcf839c33d6615ccbf59e</id>
<content type='text'>
This reverts commit 921b92a69f36502dc5eefd14ba3487c1bb49bb9d.
</content>
</entry>
<entry>
<title>Begin experimenting with flash-attention</title>
<updated>2023-12-13T21:54:57+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-12-13T21:54:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=921b92a69f36502dc5eefd14ba3487c1bb49bb9d'/>
<id>urn:sha1:921b92a69f36502dc5eefd14ba3487c1bb49bb9d</id>
<content type='text'>
Seems much faster than faster-whisper.

There are two issues:
* Requires NVIDIA 3000 series or higher.
* Incompatible with faster-whisper dependencies.

So it seems like we'll either need to toggle between two sets of
dependencies at runtime or have two environments.
</content>
</entry>
<entry>
<title>Pin huggingface_hub to 0.16.4</title>
<updated>2023-09-11T20:53:14+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-09-11T20:46:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=0447f37fb744a1b350f6b92e4d140dbdb1c8d3ec'/>
<id>urn:sha1:0447f37fb744a1b350f6b92e4d140dbdb1c8d3ec</id>
<content type='text'>
0.17.x releases break faster_whisper's ability to download models.

Also:
* Start using frozen requirements.txt.
* Conditionally install torch &amp; legacy whisper only when doing
  mechanical optimization.
</content>
</entry>
<entry>
<title>Wire transcribe_v2.py into GUI</title>
<updated>2023-09-04T02:29:44+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-09-04T02:29:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=6020bc056d8992523ae62feb4edfbae10b169880'/>
<id>urn:sha1:6020bc056d8992523ae62feb4edfbae10b169880</id>
<content type='text'>
Also:
* Enable SO_REUSEADDR on browser src socket
* Temporarily add evaluation dependencies to requirements.txt
* Fix browser src. It's now looking for a prefix that the python app
  actually uses.
</content>
</entry>
<entry>
<title>Begin rewriting transcribe.py</title>
<updated>2023-09-03T03:43:18+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-09-03T03:43:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=e9b5b4f1da2a8ff07b2d13e5e63dae491325251d'/>
<id>urn:sha1:e9b5b4f1da2a8ff07b2d13e5e63dae491325251d</id>
<content type='text'>
A set of proper interfaces is called for. See #dev-update-spam in
Discord for a drawing of the design.

Also add code to mechanically optimize committer parameters using an
audio file. Not perfectly repeatable since it depends on the performance
characteristics of the machine, but prob better than what we had before
(nothing).
</content>
</entry>
<entry>
<title>Switch back to openvr</title>
<updated>2023-08-29T03:09:35+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-08-29T03:09:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=2daa2c8057cf036357a64e09925487e6f5c0025e'/>
<id>urn:sha1:2daa2c8057cf036357a64e09925487e6f5c0025e</id>
<content type='text'>
openxr doesn't have any notion of background process, making it unusable
trash :)
</content>
</entry>
<entry>
<title>Put audio feedback into its own thread</title>
<updated>2023-08-25T19:50:59+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-08-25T19:50:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=302f7ba09f2ee115d0ee4b8f0841f6ffcd50ec57'/>
<id>urn:sha1:302f7ba09f2ee115d0ee4b8f0841f6ffcd50ec57</id>
<content type='text'>
I think this improves the code structure of the controller input thread and
leads to some deduplication, so I'm going to keep it. However, the
intended purpose was to decrease lag when pressing buttons, and in that
regard it failed.

The lag goes all the way down to the input layer, implying that the
input thread is not able to consistently run at its intended 100 Hz
sample rate. I suspect that the Python global interpreter lock (GIL) is
at fault.

Since we can't realistically move all our functionality into one thread
in a non-blocking model, I think multiprocessing is the logical choice
going forward. Each thread in transcribe.py would become its own
process, communicating via pub/sub through an intermediary process
sitting in the middle.
</content>
</entry>
<entry>
<title>Finish pyopenvr -&gt; pyopenxr migration</title>
<updated>2023-08-25T19:08:07+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-08-25T15:21:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=9e43487c1bf62402e96cb6139b24cd8446515673'/>
<id>urn:sha1:9e43487c1bf62402e96cb6139b24cd8446515673</id>
<content type='text'>
pyopenvr is both deprecated and buggy, so switch to pyopenxr.
</content>
</entry>
<entry>
<title>Enforce a stricter avg_logprob than default</title>
<updated>2023-07-07T09:35:51+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-07-07T09:30:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=7a576bcac1c37c3c5a59fadf172aa70b15ff83c8'/>
<id>urn:sha1:7a576bcac1c37c3c5a59fadf172aa70b15ff83c8</id>
<content type='text'>
Common hallucinations sneak in around -0.9 avg_logprob.

Also:
* Limit temperatures to just 0.0. Multiple values cause latency to
  occasionally spike.
</content>
</entry>
<entry>
<title>Finish translation for Western European language speakers</title>
<updated>2023-05-31T02:13:25+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-05-31T02:01:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=0bda49279ec80187d49a922ff2a47141ffb2fd8f'/>
<id>urn:sha1:0bda49279ec80187d49a922ff2a47141ffb2fd8f</id>
<content type='text'>
NLLB needs its input to be split up into sentences. I use the
sentence_splitter Python package to do this. It supports ~20 Western
European languages, but notably, no Asian languages.

* Sort spoken language list. English is still at the top.
* Remove 'Translation source' dropdown. Infer this from the spoken
  language.
* Add lang_compat.py to map language codes between the various libraries
  (whisper, nllb, sentence_splitter).
* Fix bug where old text would appear in textbox when you first bring it
  up.
</content>
</entry>
</feed>
