<feed xmlns='http://www.w3.org/2005/Atom'>
<title>TaSTT.git/Scripts/transcribe.py, branch v0.11.0</title>
<subtitle>Free self-hosted STT for VRChat.</subtitle>
<id>https://git.yummers.dev/TaSTT.git/atom?h=v0.11.0</id>
<link rel='self' href='https://git.yummers.dev/TaSTT.git/atom?h=v0.11.0'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/'/>
<updated>2023-04-24T03:52:36+00:00</updated>
<entry>
<title>Begin integrating faster-whisper</title>
<updated>2023-04-24T03:52:36+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-04-24T03:52:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=b4bb6524652e0f76834ca26a4afa232855ca1348'/>
<id>urn:sha1:b4bb6524652e0f76834ca26a4afa232855ca1348</id>
<content type='text'>
This is a much faster, lower-VRAM reimplementation of Whisper in Python.
Early testing is extremely promising: fast transcription speed,
extremely low resource usage (CPU/RAM/VRAM), high accuracy.
</content>
</entry>
<entry>
<title>Set PYTHONPATH in synchronous multiprocessing layer</title>
<updated>2023-03-08T23:36:46+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-03-08T23:36:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=12b6447a87da8077c7dd12b92eefc27dcf7f0818'/>
<id>urn:sha1:12b6447a87da8077c7dd12b92eefc27dcf7f0818</id>
<content type='text'>
A user saw an error like `ModuleNotFoundError: No module named _socket`.
StackOverflow blames this on PYTHONPATH, so let's try setting it.

* Fix latent bug in Scripts/transcribe.py. PyAudio.open() parameters
  must be given in the correct order, even when passed by keyword.
  *shrug*
</content>
</entry>
<entry>
<title>Revert "Apply previous window conditioning to decoding layer"</title>
<updated>2023-02-23T05:54:39+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-02-23T05:49:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=718319ee7b79d7cdbead5d765769b50c25e968f4'/>
<id>urn:sha1:718319ee7b79d7cdbead5d765769b50c25e968f4</id>
<content type='text'>
This reverts commit cece1ee8f1b985c2a89adb661dd02c6d44787f67.

This does *not* in fact result in improved temporal stability. It makes
things so unstable that even single-sentence messages fail to ever
stabilize.
</content>
</entry>
<entry>
<title>Apply previous window conditioning to decoding layer</title>
<updated>2023-02-23T05:49:22+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-02-19T22:15:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=cece1ee8f1b985c2a89adb661dd02c6d44787f67'/>
<id>urn:sha1:cece1ee8f1b985c2a89adb661dd02c6d44787f67</id>
<content type='text'>
Per the Whisper source code, this should result in better temporal
stability.
</content>
</entry>
<entry>
<title>Remove exponential backoff cap</title>
<updated>2023-02-19T20:10:13+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-02-19T19:46:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=52f743e43a9ef582e04d7a363fbda19824db6cc7'/>
<id>urn:sha1:52f743e43a9ef582e04d7a363fbda19824db6cc7</id>
<content type='text'>
Allows sustained exponential backoff when not transcribing. Used to cap
out at 1s.
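
A minimal sketch of the difference between capped and uncapped backoff.
This is not the code from Scripts/transcribe.py: the function name, the
0.25s starting delay and the doubling factor are all assumptions made
for illustration. Only the old 1s cap comes from this commit.

```python
import itertools

def backoff_delays(initial_s=0.25, factor=2.0, cap_s=None):
    """Yield sleep delays that grow geometrically.

    All names and defaults here are hypothetical. cap_s=None means the
    delay keeps growing; a number clamps every delay to that ceiling.
    """
    delay = initial_s
    while True:
        yield delay if cap_s is None else min(delay, cap_s)
        delay *= factor

# Old behavior: delays stop growing once they reach the 1s cap.
capped = list(itertools.islice(backoff_delays(cap_s=1.0), 5))
# New behavior: delays keep doubling while nothing is being transcribed.
uncapped = list(itertools.islice(backoff_delays(), 5))
```

With these assumed values the capped sequence flattens at 1s, while the
uncapped one continues to 2s, 4s and beyond.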

* Add more items to README TODO list
* Adjust emote metadata
* Emotes bugfix: A non-existent emote map no longer causes the
  transcription engine to bail out.
</content>
</entry>
<entry>
<title>Finish emotes</title>
<updated>2023-02-13T22:36:25+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-02-03T02:00:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=1cb5bdfe8cba6fe4647448cd3cf0c63ecbd7dfc2'/>
<id>urn:sha1:1cb5bdfe8cba6fe4647448cd3cf0c63ecbd7dfc2</id>
<content type='text'>
Emotes require 2 bytes per char. They're encoded into the region
[0xE000, infinity). The texture is 4k, and uses 1k vertical pixels
per emote segment, for a maximum of 32 segments.

* Reduce volume of noise indicator by 90%. Quiet is probably better.
  Might want to add a volume slider idk.
* Bugfix: emotes without a transparency channel now work
* Address a couple of Unity performance complaints about the shader
</content>
</entry>
<entry>
<title>Begin work adding emotes</title>
<updated>2023-02-13T22:36:20+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-02-02T09:02:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=7c6894614dcc3ebc5d4c8839b64f4da761b5ccf0'/>
<id>urn:sha1:7c6894614dcc3ebc5d4c8839b64f4da761b5ccf0</id>
<content type='text'>
Done:
* Users can add images to Fonts/Emotes/
* The basename of that image ('clueless.png' becomes 'clueless') is the
  keyword to make the image show up in game.
* Fix a bug in the shader where letters on the 2nd texture and later
  would have UV outside of [0.0, 1.0]

Not yet implemented:
* Transcribed words are encoded using the emote mapping
</content>
</entry>
<entry>
<title>Built-in chatbox no longer shows empty messages</title>
<updated>2023-02-04T23:26:41+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-02-04T21:16:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=92fea304613bacfa014e1fbaf9fddb82e4f33d62'/>
<id>urn:sha1:92fea304613bacfa014e1fbaf9fddb82e4f33d62</id>
<content type='text'>
* Reduce noise on/off indicator volume by 50%
</content>
</entry>
<entry>
<title>GUI: Add ability to choose button</title>
<updated>2023-01-25T20:38:28+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-01-25T20:38:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=227ff7aa0ed2fd03c54ae53aa01430012ff3f7d0'/>
<id>urn:sha1:227ff7aa0ed2fd03c54ae53aa01430012ff3f7d0</id>
<content type='text'>
We use a button to start/stop transcription. Previously this was
hardcoded to the left joystick. Now users can pick from {left, right} x
{joystick, a, b}.
</content>
</entry>
<entry>
<title>Enable using built-in chatbox</title>
<updated>2023-01-22T23:35:00+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-01-22T23:05:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=1c056bf385d2c48f6e4f77da513060c04415252c'/>
<id>urn:sha1:1c056bf385d2c48f6e4f77da513060c04415252c</id>
<content type='text'>
VRChat exposes a built-in chatbox which can be seen by anyone who has
it enabled. This was not the case when I started this project: the
chatbox was only visible to friends. Since this is clearly useful
(it enables STT on public models), let's enable sending data to it.

Caveats:

* The built-in chatbox has anti-spam tech which limits us to updating
  about once every 2 seconds. The custom chatbox has no such limitation
  and is thus typically much faster.
</content>
</entry>
</feed>
