<feed xmlns='http://www.w3.org/2005/Atom'>
<title>TaSTT.git/Scripts, branch v0.19.1</title>
<subtitle>Free self-hosted STT for VRChat.</subtitle>
<id>https://git.yummers.dev/TaSTT.git/atom?h=v0.19.1</id>
<link rel='self' href='https://git.yummers.dev/TaSTT.git/atom?h=v0.19.1'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/'/>
<updated>2024-06-09T23:43:34+00:00</updated>
<entry>
<title>Bump CUDNN to v8.9.7</title>
<updated>2024-06-09T23:43:34+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2024-06-09T23:43:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=4fec36c3cc00bd649dfb3c9d7e9079b5c8685a0e'/>
<id>urn:sha1:4fec36c3cc00bd649dfb3c9d7e9079b5c8685a0e</id>
<content type='text'>
Also disable flash-attention when CPU mode is selected
</content>
</entry>
<entry>
<title>Add checkbox for flash-attention</title>
<updated>2024-06-09T22:54:30+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2024-06-09T22:54:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=f2b21dd5afebd6b76b5835168f7d1bd3bec21f5d'/>
<id>urn:sha1:f2b21dd5afebd6b76b5835168f7d1bd3bec21f5d</id>
<content type='text'>
Pre-3000 series GPUs don't support it. Oops!
</content>
</entry>
<entry>
<title>Upgrade faster-whisper with flash-attention2</title>
<updated>2024-06-06T01:15:47+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2024-06-06T01:15:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=4f0fb5b17de990517e3c1de7ffee5d0f3c9a8961'/>
<id>urn:sha1:4f0fb5b17de990517e3c1de7ffee5d0f3c9a8961</id>
<content type='text'>
This should be significantly more efficient than prior versions.

* add large-v3 &amp; distilled variant
* simplify model acquisition code now that distilled models are part of
  faster-whisper.
</content>
</entry>
<entry>
<title>Fix distilled models</title>
<updated>2024-03-15T01:03:54+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2024-03-15T01:03:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=5638d86c97041de31217e058e411034143e9c882'/>
<id>urn:sha1:5638d86c97041de31217e058e411034143e9c882</id>
<content type='text'>
These were broken by logic errors in the code path that
acquires models from Hugging Face.

Distilled large-v2 seems promising as a new default model.
</content>
</entry>
<entry>
<title>Add "simple" text-to-text demo for the modular avatar chatbox</title>
<updated>2024-03-09T02:21:22+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2024-03-09T02:21:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=cdc079fb59832fce46708df36ac80ede6d2bd046'/>
<id>urn:sha1:cdc079fb59832fce46708df36ac80ede6d2bd046</id>
<content type='text'>
To use it:
$ python3 -m pip install python-osc pillow
$ cd Scripts
$ python3 ./text_to_text_demo.py
</content>
</entry>
<entry>
<title>Finish plumbing GPU compute type</title>
<updated>2024-02-10T01:51:53+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2024-02-10T01:51:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=5ef207d28f2a9d943384b9ec6872aedae2917ac0'/>
<id>urn:sha1:5ef207d28f2a9d943384b9ec6872aedae2917ac0</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Add dropdown for GPU compute type</title>
<updated>2024-02-10T01:21:46+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2024-02-10T01:21:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=3b84b185d1286e1b954f5ad636b26188efa141e4'/>
<id>urn:sha1:3b84b185d1286e1b954f5ad636b26188efa141e4</id>
<content type='text'>
Should enable compatibility with older GPUs.
</content>
</entry>
<entry>
<title>Add another threshold to filter out common hallucinations</title>
<updated>2024-02-06T01:40:37+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2024-02-06T01:40:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=e58c718cb115c44ef3a546bea245e05e50d24c55'/>
<id>urn:sha1:e58c718cb115c44ef3a546bea245e05e50d24c55</id>
<content type='text'>
The Whisper paper recommends filtering out segments with no_speech_prob &gt; 0.6
and avg_logprob &lt; -1. This is too loose a bound for short-form audio
that is not guaranteed to contain speech.

I already have a tighter bound:

  no_speech_prob &gt; 0.6 and avg_logprob &lt; -0.5

While listening to instrumental music I find that a lot of
hallucinations sneak past that bound. So I added a second bound:

  no_speech_prob &gt; 0.15 and avg_logprob &lt; -0.7

Basically we also filter out segments that look more like speech (lower
no_speech_prob) but have a worse avg_logprob. Seems to not have false
negatives. Requires testing.
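
A minimal sketch of how the two bounds combine, assuming each segment
exposes no_speech_prob and avg_logprob as floats. The constant and
function names are hypothetical and are not the ones used in this repo:

```python
# Hypothetical sketch; names are illustrative and not taken from TaSTT.
# Each bound is (no_speech_prob floor, avg_logprob ceiling).
HALLUCINATION_BOUNDS = [
    (0.6, -0.5),   # the existing bound, tighter than the paper's
    (0.15, -0.7),  # the second bound added in this commit
]


def is_probable_hallucination(no_speech_prob, avg_logprob):
    """Return True if the segment trips either bound."""
    # Comparisons use only '>' so the snippet needs no XML escaping.
    return any(
        no_speech_prob > min_no_speech and max_logprob > avg_logprob
        for min_no_speech, max_logprob in HALLUCINATION_BOUNDS
    )
```

A segment is dropped if either bound matches, so the second bound only
ever removes more segments than the first one alone.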

Also: dial back the default max segment length from 15 seconds to 10
seconds, based on performance observations on desktop.
</content>
</entry>
<entry>
<title>Verify that audio is clean after VAD segmentation</title>
<updated>2024-02-06T01:02:23+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2024-02-06T01:01:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=acccf8ebcff0f7cc2b26e45e497f8b12ab73d8e1'/>
<id>urn:sha1:acccf8ebcff0f7cc2b26e45e497f8b12ab73d8e1</id>
<content type='text'>
Indeed it is. Bumped up the default max segment length to decrease
error.

Also add mic presets for beyond (the vr headset) and motu (my mic
interface).
</content>
</entry>
<entry>
<title>Revert "Begin experimenting with flash-attention"</title>
<updated>2024-01-09T02:59:27+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2024-01-09T02:59:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=33db3dcc23a45cae611bcf839c33d6615ccbf59e'/>
<id>urn:sha1:33db3dcc23a45cae611bcf839c33d6615ccbf59e</id>
<content type='text'>
This reverts commit 921b92a69f36502dc5eefd14ba3487c1bb49bb9d.
</content>
</entry>
</feed>
