TaSTT.git - Free self-hosted STT for VRChat.

	Commit message	Author	Age
*	Quiet down transcribe.py	yum	2022-10-20
	Also adjust continuous transcription algorithm to use leftmost minimum instead of rightmost. This prevents some cases where we generate longer and longer text.
*	Add continuous transcription mode	yum	2022-10-17
	Algorithm:
	* look at last 20 chars of last committed transcription
	* scan new transcription using 10-char sliding window
	* find spot where distance is minimized
	* stitch two messages together

	Thus we're able to maintain a continuously growing transcription without having to feed the AI more than 30 seconds of data at a time. Seems to work reasonably well in bench tests.

	Also fix silence detection. AI exposes a probability that nothing was said. Hand-pick a probability of 0.1. Sometimes the AI still goes sicko mode with this setting, but going higher occasionally results in no transcription.
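A rough sketch of the stitching steps above, not the actual transcribe.py code. Assumptions: "distance" is taken to be Levenshtein edit distance, a single `window` size is used for both the committed tail and the sliding window, and the names `edit_distance` and `stitch` are made up for illustration. Ties are broken toward the leftmost minimum.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def stitch(committed: str, new: str, window: int = 10) -> str:
    """Append to `committed` the part of `new` that follows the best overlap."""
    # Assumption: with too little text to align, fall back to whichever
    # side we have instead of guessing an overlap.
    if len(committed) < window:
        return new
    if len(new) < window:
        return committed

    tail = committed[-window:]
    best_pos, best_dist = 0, None
    for pos in range(len(new) - window + 1):
        dist = edit_distance(tail, new[pos:pos + window])
        # Strict '<' keeps the leftmost minimum when several windows tie.
        if best_dist is None or dist < best_dist:
            best_pos, best_dist = pos, dist

    # Everything after the matched window is treated as new text.
    return committed + new[best_pos + window:]
```

With `committed = "the quick brown fox jumps"` and `new = "brown fox jumps over the lazy dog"`, the tail `" fox jumps"` matches exactly at offset 5 of `new`, so only `" over the lazy dog"` is appended.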
*	Add libunity.addTransition	yum	2022-10-15
	* Implement basic board toggle using new transition logic
	* Metadata can now restore from file
*	Transcribe.py now pages	yum	2022-10-15
	Messages longer than a board will automatically write over the top.

	TODO
	* Real cell-based message diffing
	* Cumulative transcription
	  * this would completely mitigate the effects of trim events
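One way to picture paging, as a sketch only: cut the message into board-sized pages, where each page after the first is written starting again from the top of the board. The function name `page_message` and the `board_size` parameter (number of character cells on the board) are assumptions, not taken from transcribe.py.

```python
from typing import List


def page_message(message: str, board_size: int) -> List[str]:
    """Split `message` into chunks that each fit on one board."""
    if board_size <= 0:
        raise ValueError("board_size must be positive")
    pages = [message[i:i + board_size]
             for i in range(0, len(message), board_size)]
    # An empty message still occupies one (blank) page.
    return pages or [""]
```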
*	Further improve transcribe.py responsiveness	yum	2022-10-15
	Add a third heuristic. If the transcription is relatively long and the first bit differs from the previous transcription, immediately overwrite. Because the transcription is long, it's a bit less likely to be a complete mistranscription.
*	Tweak transcribe.py	yum	2022-10-15
	Slightly improve temporal stability and responsiveness at the cost of limiting to a 30-second recording.

	Before committing to a transcription, wait for two consecutive transcriptions such that they are identical, or the former is a prefix of the latter. This helps with temporal stability by eliminating most one-off wildly inaccurate transcriptions.

	Also make osc_ctrl.sendMessageLazy a little lazier, limiting it to 2 consecutive non-empty cells per call. This allows us to recover from mistranscriptions faster.
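The commit rule described above (two consecutive transcriptions that are identical, or where the former is a prefix of the latter) could look roughly like this. `StabilityFilter` and `feed` are hypothetical names, and ignoring an empty previous transcription is an added assumption, since an empty string is a prefix of everything.

```python
from typing import Optional


class StabilityFilter:
    """Only release a transcription once it agrees with the previous one."""

    def __init__(self) -> None:
        self._previous: Optional[str] = None

    def feed(self, transcription: str) -> Optional[str]:
        """Return the text to commit, or None if it is not yet stable."""
        previous, self._previous = self._previous, transcription
        # startswith() covers both cases: identical text and strict prefix.
        if previous and transcription.startswith(previous):
            return transcription
        return None
```

A one-off wild transcription never commits under this rule, because the next transcription will not start with it.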
*	Fix animations: renamed prefab from CustomSTT to TaSTT	yum	2022-10-15
	Also:
	* Check in toggle on/off animations
	* Add toggle parameter
	* libunity bug: getUniqueId() was calling allocateId() incorrectly
	* Remove osc_ctrl `client` global
	* Fix transcribe.py text encoding
*	Add ability to leave board in world	yum	2022-10-11
	* Add VRLabs' World Constraint as a submodule
	* Add animations for world constraint
	* Add toggles for board
	* Add libunity.py (no content yet)
	* Support >30s transcription
	* Add board FBX
*	Introduce STT proof-of-concept	yum	2022-10-03
	Using OpenAI's whisper neural network, we can do local STT. Transcription quality is good, system resource usage is minimal (1 GB VRAM), and latency is much lower than cloud-based transcription.

	* Add transcribe.py
	  * Creates 3 threads:
	    * One saves mic audio to a buffer
	    * One passes mic audio to the STT
	    * One sends the transcribed text to the board
	  * Main thread listens for input. Press enter to start a new message.
	* Add osc_ctrl.sendMessageLazy, a simple diff-based message sending utility.
	  * A little complexity: it only sends 1 empty cell per call, allowing us to quickly say new things without having to wait for the whole buffer to clear.
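A minimal sketch of the diff-based lazy send idea, not the real osc_ctrl code. Assumptions: the board is modeled as a list of one-character cells, a space means "empty", and `send` is a hypothetical callback standing in for whatever actually transmits a cell. Cells that already match are skipped, changed non-empty cells are all sent, and at most `max_empty` clearing writes go out per call.

```python
from typing import Callable, List

EMPTY = " "  # assumption: a blank cell is represented by a space


def send_message_lazy(board: List[str], message: str,
                      send: Callable[[int, str], None],
                      max_empty: int = 1) -> bool:
    """Push `message` toward `board`; return True once they fully match."""
    size = len(board)
    target = message[:size].ljust(size, EMPTY)
    empties_sent = 0
    done = True
    for i, want in enumerate(target):
        if board[i] == want:
            continue  # diff-based: unchanged cells cost nothing
        if want == EMPTY:
            if empties_sent >= max_empty:
                done = False  # leave this stale cell for a later call
                continue
            empties_sent += 1
        send(i, want)
        board[i] = want
    return done
```

Calling it in a loop until it returns True clears a long old message gradually, while the new text appears on the first call.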