<feed xmlns='http://www.w3.org/2005/Atom'>
<title>TaSTT.git/Scripts/transcribe_v2.py, branch v0.17.0</title>
<subtitle>Free self-hosted STT for VRChat.</subtitle>
<id>https://git.yummers.dev/TaSTT.git/atom?h=v0.17.0</id>
<link rel='self' href='https://git.yummers.dev/TaSTT.git/atom?h=v0.17.0'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/'/>
<updated>2023-12-09T02:13:56+00:00</updated>
<entry>
<title>Add distilled whisper large-v2 model</title>
<updated>2023-12-09T02:13:56+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-12-09T02:13:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=15368618a109eeec69209a6693839eb359ecd190'/>
<id>urn:sha1:15368618a109eeec69209a6693839eb359ecd190</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Add distilled whisper-medium model</title>
<updated>2023-11-07T23:05:29+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-11-07T23:05:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=dbb2f72792e2af3ff220313f84bf76a9a1ddbeb4'/>
<id>urn:sha1:dbb2f72792e2af3ff220313f84bf76a9a1ddbeb4</id>
<content type='text'>
I converted distil-whisper-medium.en to CTranslate2 format and uploaded
it to huggingface. This model is exceptionally fast and light compared
to the non-distilled version, at the cost of some accuracy.
</content>
</entry>
<entry>
<title>Transcripts preceding long pauses now drop</title>
<updated>2023-10-06T01:28:42+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-10-06T01:22:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=add7bd8ef86ec21cd1327eb45bcb739aa54f7db8'/>
<id>urn:sha1:add7bd8ef86ec21cd1327eb45bcb739aa54f7db8</id>
<content type='text'>
When hot-miking into the built-in chatbox, there are sometimes long
pauses in conversation. After these pauses, it's undesirable to show the
transcript generated before the pause. This feature makes it so that
those transcripts can be dropped.

Also:

* Limit number of segments sent to browser source to 10. The list can
  grow to 10 segments before the first 5 segments are dropped.
* Silence warnings generated by `install_in_venv`, used by e.g.
  translation codepath.
* Enable audio normalization to improve accuracy when speaking softly,
  at the cost of some accuracy when speaking normally.

Credit: user endo0269 on Discord suggested this feature.
</content>
</entry>
<entry>
<title>Reimplement BrowserSource as a StreamingPlugin</title>
<updated>2023-09-19T04:23:14+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-09-19T04:00:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=c2bc70c18d2fd1c3601b32f2a93b3b4a704786a5'/>
<id>urn:sha1:c2bc70c18d2fd1c3601b32f2a93b3b4a704786a5</id>
<content type='text'>
BrowserSource now fades text out continuously over time.

TODO

* Delete C++ webserver, browsersource, transcript code
* Add UI for text age fading
</content>
</entry>
<entry>
<title>Bugfixes</title>
<updated>2023-09-16T22:49:55+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-09-16T22:49:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=d4c85f4ac4cb627e2611359d18615d76eda29c90'/>
<id>urn:sha1:d4c85f4ac4cb627e2611359d18615d76eda29c90</id>
<content type='text'>
* uwu filter no longer adds extra whitespace before/after segments. This
  would defeat commit logic.
* disabling phonemes works again - path to prefab was being quoted
  twice, breaking the codepath.
</content>
</entry>
<entry>
<title>Bugfix: list input devices works again</title>
<updated>2023-09-12T22:33:18+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-09-12T22:33:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=8fcc6c248554a0b08ecd4b43cc0971b78810c080'/>
<id>urn:sha1:8fcc6c248554a0b08ecd4b43cc0971b78810c080</id>
<content type='text'>
Oops :)
</content>
</entry>
<entry>
<title>Pin huggingface_hub to 0.16.4</title>
<updated>2023-09-11T20:53:14+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-09-11T20:46:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=0447f37fb744a1b350f6b92e4d140dbdb1c8d3ec'/>
<id>urn:sha1:0447f37fb744a1b350f6b92e4d140dbdb1c8d3ec</id>
<content type='text'>
The 0.17.x releases break faster_whisper's ability to download models.

Also:
* Start using frozen requirements.txt.
* Conditionally install torch &amp; legacy whisper only when doing
  mechanical optimization.
</content>
</entry>
<entry>
<title>Introduce notion of PresentationFilter</title>
<updated>2023-09-11T05:52:52+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-09-11T05:51:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=d3c325c4c4dd954a75267b013f33f5f3c5d041bc'/>
<id>urn:sha1:d3c325c4c4dd954a75267b013f33f5f3c5d041bc</id>
<content type='text'>
... and restructure RemoveTrailingPeriod as a filter instead of as a
plugin.

Plugins have the power to change transcription data as it comes along,
but don't have access to the entire transcript. Filters have access to
the entire transcript but can't durably change it.

TODO

* This does not work with data passed through OSC
</content>
</entry>
<entry>
<title>Fix paging bug</title>
<updated>2023-09-11T01:33:08+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-09-11T01:29:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=920d6dfeeac132488c85311512fe9e5da505c4a8'/>
<id>urn:sha1:920d6dfeeac132488c85311512fe9e5da505c4a8</id>
<content type='text'>
OSC was paging using incorrect board resolution. Use cfg to provide this
data.
</content>
</entry>
<entry>
<title>Fix local audio indicators</title>
<updated>2023-09-10T22:41:25+00:00</updated>
<author>
<name>yum</name>
<email>yum.food.vr@gmail.com</email>
</author>
<published>2023-09-10T22:41:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/TaSTT.git/commit/?id=4a4909919223a7446944c6248472c7f71a30307c'/>
<id>urn:sha1:4a4909919223a7446944c6248472c7f71a30307c</id>
<content type='text'>
</content>
</entry>
</feed>
