TaSTT.git - Free self-hosted STT for VRChat.

	Commit message	Author	Age
*	Update demo in README	yum	2023-09-11
*	Update README	yum	2023-07-07
	Mostly updating roadmap stuff. Non-VRC use cases are "complete" since I was mostly targeting streaming. The ability to type into arbitrary text fields is still somewhat nascent & could be improved. Also update some other random stuff to be more up to date. KillFrenzy Avatar Text is now MIT, pog!
*	Add grey background to browser src	yum	2023-06-27
	Should improve legibility.
	* Update README
*	Add ability to type using STT	yum	2023-05-23
	To use it, do a medium hold + long hold. Keep the long hold depressed until you're done speaking. The transcription will be typed into the currently selected input field.
	* Add more audio feedback
	* Make audio feedback play asynchronously so it doesn't slow down the controller input state machine as much.
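The asynchronous audio feedback described above can be pictured with a small sketch. This is not the project's code: `play_async` and the `play_fn` callable are hypothetical names, and a plain background thread is assumed as the mechanism. The point is only that the caller (here, the controller input loop) returns immediately instead of waiting for the sound to finish.

```python
import threading


def play_async(play_fn, *args):
    """Run a blocking playback function on a background thread.

    `play_fn` is a hypothetical stand-in for whatever actually plays the
    sound; the caller gets the thread back and is never blocked by it.
    """
    worker = threading.Thread(target=play_fn, args=args, daemon=True)
    worker.start()
    return worker
```

A daemon thread is used so that a sound still playing does not keep the process alive at shutdown.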
*	Fix noop animations on current creator companion build (tag: v0.11.3)	yum	2023-05-09
	See comment for details.
	* Update README
*	~Finish integrating faster-whisper	yum	2023-04-24
	I'm able to use the new code to show text in game. Not yet play-tested.
*	Begin integrating faster-whisper (tag: v0.11.0)	yum	2023-04-23
	This is a much faster, lower-VRAM reimplementation of Whisper in Python. Early testing is extremely promising: fast transcription speed, extremely low resource usage (CPU/RAM/VRAM), high accuracy.
*	Custom chatbox shader writes depth	yum	2023-03-23
	This fixes issues where the transparent corners of the textbox render in front of other materials, causing those other materials to skip rendering.
	* Update README.md with roadmap and avatar resource usage.
*	Update README.txt	yum	2023-03-02
*	Remove exponential backoff cap (tag: v0.7.0)	yum	2023-02-19
	Allows sustained exponential backoff when not transcribing. Used to cap out at 1s.
	* Add more items to README TODO list
	* Adjust emote metadata
	* Emotes bugfix: Non-existent emote map doesn't cause transcription engine to bail out.
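A minimal sketch of what removing the cap means, assuming a simple doubling schedule. The function name, the growth factor of 2, and the starting delay are illustrative assumptions; only the old 1s cap comes from the commit message.

```python
def next_delay(current_s, factor=2.0, cap_s=None):
    """Return the next polling delay in seconds.

    With cap_s=1.0 this models the old behaviour (delay stops growing at 1s);
    with cap_s=None the delay keeps growing while nothing is being transcribed.
    The factor of 2 is an assumption made for illustration.
    """
    grown = current_s * factor
    return grown if cap_s is None else min(grown, cap_s)


def schedule(start_s, steps, cap_s=None):
    """List the delays produced by `steps` consecutive idle polls."""
    delays, delay = [], start_s
    for _ in range(steps):
        delay = next_delay(delay, cap_s=cap_s)
        delays.append(delay)
    return delays
```

For example, `schedule(0.25, 4, cap_s=1.0)` levels off at 1 second, while `schedule(0.25, 4)` keeps doubling.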
*	Update hardware requirements (tag: v0.4.1)	yum	2023-01-28
	Deleting python310._pth causes a few more things to be installed in the venv.
*	Bugfixes (tag: v0.4.0)	yum	2023-01-27
	* Fix prefab: bounding box & position are now set to 0
	* Fix shader: text is no longer upside down
	* Update README
*	Update README.md	yum	2023-01-26
	Document recent features, better explain basis of transcription.
*	Enable more shader customization	yum	2023-01-23
	* Text color, background color, and margin color are all customizable now
	* Better organize shader parameters. User-facing params are exposed Like_This; internal params are exposed _Like_This.
	* Update README. More wordsmithing.
*	Update README	yum	2023-01-23
	* Point to a more up-to-date demo.
	* Improve wordsmithing/flow
*	package.ps1 now fetches all dependencies	yum	2023-01-23
	Don't literally check in Python since it looks dodgy (rightfully so). Instead the build script just fetches it.
	* Update README, simplifying language and documenting other projects
*	Bugfix: user-provided paths may now contain spaces	yum	2023-01-04
	Previously, paths containing spaces would be interpreted by python's argument parser as multiple separate arguments, causing it to fail. Now we escape paths inside PythonWrapper using std::quoted().
	* Improve PII filtering. Python output would contain multiple path separators (like C:\\Users\\foo\\), defeating the PII regex.
	* Silence compiler warning in PII filter.
	* Document usability improvements.
	* Transcription layer exponential backoff goes to ~infinity when paused. This is a hack, since we really don't need to transcribe at all when paused, but it lets us keep the code simple. Good enough until the next rewrite.
	* Shader only samples background when necessary.
	* Limit matchStrings() print()s to DEBUG mode
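The PII-filter fix above can be illustrated with a short sketch. The project's actual filter is not shown here, so the pattern, the function name `redact_user`, and the five-asterisk mask are all assumptions. The idea is simply to accept one or more separators after `Users`, so that doubled backslashes in Python output no longer slip past the regex.

```python
import re

# Hypothetical pattern: "Users", then one or more path separators
# (backslash or forward slash), then the user name up to the next separator.
_USER_DIR = re.compile(r"(Users[\\/]+)([^\\/]+)", re.IGNORECASE)


def redact_user(text):
    """Mask the user name in Windows-style home directory paths.

    A function is used as the replacement so that backslashes in the
    matched prefix are copied literally rather than parsed as escapes.
    """
    return _USER_DIR.sub(lambda m: m.group(1) + "*****", text)
```

The same call handles both `C:\Users\foo\` and the doubled form `C:\\Users\\foo\\`.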
*	Add Discord to README.md	yum	2023-01-01
	This seems like the best way to support users.
*	Touch up TaSTT.shader	yum	2022-12-25
	Add a new shader to make the box a little prettier.
	* Reduce material slots required from 2 to 1
	* Add rounding to edge of box
*	Document encoding optimization	yum	2022-12-22
	By sending encoded words rather than letters, we could speed up the English paging rate by 2.5x over an optimized char-based implementation.
	* Word-encoded implementation: 16 bits per word (capped at 64k possible words).
	* Optimized char-based implementation: (5.7 chars per word) * (7 bits per char) == 39.9 bits per word, 2.5x slower than word encoding.
	* Today's char-based implementation: (5.7 chars per word) * (16 bits per char) == 91.2 bits per word, 5.7x slower than word encoding.
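The arithmetic in this commit message can be reproduced directly. The 5.7 chars-per-word figure and the bit widths are taken from the message itself; the function and constant names below are only illustrative.

```python
AVG_CHARS_PER_WORD = 5.7   # figure quoted in the commit message
WORD_ENCODING_BITS = 16    # one 16-bit code per word (up to 64k words)


def char_encoding_bits_per_word(bits_per_char):
    """Average bits needed to send one word, one character at a time."""
    return AVG_CHARS_PER_WORD * bits_per_char


def slowdown_vs_word_encoding(bits_per_char):
    """How many times more bits char encoding needs than word encoding."""
    return char_encoding_bits_per_word(bits_per_char) / WORD_ENCODING_BITS


for label, bits in (("optimized (7-bit chars)", 7), ("current (16-bit chars)", 16)):
    print(
        f"{label}: {char_encoding_bits_per_word(bits):.1f} bits/word, "
        f"{slowdown_vs_word_encoding(bits):.1f}x slower than word encoding"
    )
```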
*	Update README.md (tag: v0.0)	yum	2022-12-22
*	Update README.md	yum	2022-12-01
	Also decrease sync params & add a few more emotes.
*	Another transcription rework	yum	2022-11-14
	After re-reading the paper, I noticed that they apply a couple optimizations I wasn't using. Use the top-level `whisper.transcribe` method, which is a little slower, but more accurate than the one I was using. Although this method is slower, it has better temporal stability due to the increased quality, which I think should make for an overall more responsive UX. Lower transcription quality means the paging layer has to waste time updating earlier cells.
	Also, drop the auto-commit stuff and go back to string stitching. I think it's better to let the user manually commit. A rework of the hand controls is probably coming soon.
	Finally, update README.
*	Update README	yum	2022-11-06
*	String matching no longer relies on spaces	yum	2022-11-06
	Add a `matchStrings` which does basically the same thing as `matchStringList` except it doesn't split the input at space boundaries. I think this should work better for Japanese and Chinese, since they don't use spaces. Doesn't seem to cause any accuracy regressions for English. Also update the README.
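Why splitting on spaces hurts Japanese and Chinese can be shown with a small sketch. This is not the project's `matchStrings` or `matchStringList`: the helper below is hypothetical and uses the standard library's `difflib.SequenceMatcher` purely to contrast character-level matching with word-level matching.

```python
from difflib import SequenceMatcher


def longest_overlap(old, new):
    """Return the longest run of elements shared by `old` and `new`.

    Works on strings (character-level) and on lists of words (word-level).
    """
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    match = matcher.find_longest_match(0, len(old), 0, len(new))
    return old[match.a:match.a + match.size]


old_text, new_text = "こんにちは世界", "世界の皆さん"

# Character-level: the shared "世界" is found.
print(longest_overlap(old_text, new_text))

# Word-level: with no spaces, each transcript is a single "word",
# so nothing matches and the result is an empty list.
print(longest_overlap(old_text.split(), new_text.split()))
```

For space-delimited text both approaches find an overlap, which is consistent with the commit's note that English accuracy did not regress.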
*	Reduce dimensionality of animator by factor of 80	yum	2022-11-05
	Instead of generating one animation for every single character in our character set, we just generate 2: the lowest and the highest. We use blend trees to interpolate between these two extremes. This reduces the number of animations we have to generate by a factor of 80. It also clears the way for multi-language support (coming soon). It also means we don't have to reopen unity every time we generate a new animator.
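The two-animation trick can be sketched as plain linear interpolation. This assumes a 1D blend between a "lowest" clip and a "highest" clip driven by a weight in [0, 1]; the function names and the character-set size used in the usage note are hypothetical and not taken from the project.

```python
def char_to_weight(index, charset_size):
    """Map a character index to a blend weight in [0, 1]."""
    if charset_size < 2 or not 0 <= index < charset_size:
        raise ValueError("index out of range for this character set")
    return index / (charset_size - 1)


def blend(lowest, highest, weight):
    """Linearly interpolate between the values set by the two extreme clips."""
    return lowest + (highest - lowest) * weight


def weight_to_char(weight, charset_size):
    """Recover the character index that the blended value represents."""
    return round(blend(0, charset_size - 1, weight))
```

With any character-set size of 2 or more, `weight_to_char(char_to_weight(i, n), n)` returns `i`, so two clips are enough to address every character.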
*	Add speech-to-text demo	yum	2022-11-04
*	Update README	yum	2022-10-30
*	Saying the word "clear" clears the board	yum	2022-10-24
	While the board is clearing, you can keep talking, and it will be rendered when the board finishes clearing.
	* bugfix: STT only beeps when it's out
*	Update backlog	yum	2022-10-16
*	Add libunity.addTransition	yum	2022-10-15
	* Implement basic board toggle using new transition logic
	* Metadata can now restore from file
*	Board is now blank by default	yum	2022-10-03
	Also update README.
*	Update README	yum	2022-10-03
*	Add LICENSE	yum	2022-10-02
	* Update README with contribution instructions & design details.
	* Add text-to-text demo gif
	* Document known Unity landmines in generate.sh.
*	Use a single Enable parameter instead of one per layer	yum	2022-10-02
	Even more reliable now.
*	Redo FX layer	yum	2022-09-30
	Apparently the same avatar parameter can only be updated so quickly before VRChat starts dropping messages. So now we divide the board into "groups" of 8 characters. Each group can be updated relatively slowly, but all groups can be updated in parallel. Thus we can update the board group-by-group, pausing between each group.
	* Fix shader bugs - now there are Row05 parameters, and row00 refers to the topmost row instead of the bottom-most.
	* Remove outdated layer/group names files
	* Extend osc_ctrl.py to support encoding & sending messages
	* Add generate_params.py to handle creating TaSTT_params.asset
	* Add generate_utils.py for common code generation facilities & parameters.
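The group-by-group update can be sketched as follows. Only the group size of 8 comes from the commit message; `split_into_groups`, `send_board`, the `send_group` callback, and the pause length are hypothetical, and the real osc_ctrl.py may be organised quite differently.

```python
import time

GROUP_SIZE = 8  # characters per group, per the commit message


def split_into_groups(cells, group_size=GROUP_SIZE):
    """Split the board's cells into (group_index, characters) pairs."""
    return [
        (start // group_size, cells[start:start + group_size])
        for start in range(0, len(cells), group_size)
    ]


def send_board(cells, send_group, pause_s=0.1, sleep=time.sleep):
    """Send one group at a time, pausing between groups.

    `send_group(index, chars)` is a hypothetical callback that would write
    one group's avatar parameters; the pause keeps any single parameter
    from being updated too quickly.
    """
    for index, chars in split_into_groups(cells):
        send_group(index, chars)
        sleep(pause_s)
```

Passing `sleep` as an argument keeps the pacing testable without real delays.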
*	Add README.md	yum	2022-09-29