TaSTT.git - Free self-hosted STT for VRChat.

	Commit message	Author	Age
*	GUI: Add debug panel [v0.5.0]	yum	2023-02-04
	Add debug panel with options to show installed packages, clear the pip cache, reset venv, and clear OSC configs.
	* Refactor synchronous command execution + logging pattern inside PythonWrapper
*	Delete python310._pth	yum	2023-01-28
	I was using this file to constrain the set of paths that Python can see, but since `future` doesn't have a wheel, it will fail to install on a fresh system. If you set pip's --cache-dir to some new directory, you'll see it fail to install.
	The _pth doesn't really seem to matter, since without it, packages are still installed under the virtual environment.
*	Finish basic PBR shading	yum	2023-01-25
	TaSTT shader now uses physically based rendering (PBR). Users can pick smoothness, metallic, and emissive.
	This implementation borrows heavily from catlikecoding.com's excellent tutorials, which are released under MIT No Attribution (MIT-0).
	https://catlikecoding.com/unity/tutorials/license/
	To retain what little clarity remains in the shader, I have chosen not to attribute the code in the source itself.
*	GUI: Add ability to choose button	yum	2023-01-25
	We use a button to start/stop transcription. Previously this was hardcoded to left joystick. Now users can pick from {left, right} x {joystick, a, b}.
*	Use requirements.txt for Scripts/	yum	2023-01-25
	This seems to be the canonical way of listing a Python app's dependencies.
	* Installing dependencies no longer hangs the GUI
*	Bugfix: Use future 0.18.2 instead of 0.18.3	yum	2023-01-23
	Whisper doesn't like 0.18.3, so downgrade to the last version.
*	package.ps1 now fetches all dependencies	yum	2023-01-23
	Don't literally check in Python since it looks dodgy (rightfully so). Instead the build script just fetches it.
	* Update README, simplifying language and documenting other projects
*	Bugfix: shader now respects bytes per char [v0.3.1]	yum	2023-01-22
*	Enable using built-in chatbox [v0.3]	yum	2023-01-22
	VRChat exposes a built-in chatbox which can be seen by anyone who has it enabled. This was not the case when I started this project: the chatbox would only be visible to friends. Since this is clearly useful, enabling the STT on public models, let's enable sending data to it.
	Caveats:
	* The built-in chatbox has anti-spam tech which limits us to updating about once every 2 seconds. The custom chatbox has no such limitation and is thus typically much faster.
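A minimal sketch of how a sender might respect the roughly 2-second chatbox limit described above. The class name, the `send` callback, and the injectable clock are hypothetical and are not taken from the TaSTT source; only the 2-second figure comes from the commit message.

```python
import time

# From the commit message: about one update every 2 seconds.
MIN_INTERVAL_SECONDS = 2.0


class ChatboxThrottle:
    """Drop updates that arrive sooner than min_interval after the last one sent."""

    def __init__(self, min_interval=MIN_INTERVAL_SECONDS, clock=time.monotonic):
        self._min_interval = min_interval
        self._clock = clock      # injectable so the throttle can be tested
        self._last_sent = None   # time of the last successful send, if any

    def try_send(self, text, send):
        """Call send(text) if enough time has passed; return True if it was sent."""
        now = self._clock()
        if self._last_sent is not None and now - self._last_sent < self._min_interval:
            return False
        send(text)
        self._last_sent = now
        return True
```

Dropping early updates (rather than queueing them) suits a transcript that is re-sent in full on each update, since a later update supersedes an earlier one; that is an assumption about the sender, not something the commit states.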
*	GUI: Save Unity input fields across app restarts	yum	2023-01-22
	I found that I tend to regenerate the animator on the same avatar a lot, requiring me to re-enter the same paths and parameters over and over again. Persist them across restarts.
	* Refactor Config classes
	* Use safe `get_if` instead of the exception-throwing `operator>>` when deserializing from YAML
	* Begin sketching out Log singleton
	* Put Quote() and Unquote() into their own little lib; they shouldn't hide inside PythonApp
*	GUI: Persist transcription app config [v0.2]	yum	2023-01-06
	The configuration of the transcription app, such as the number of rows and columns in the text box, now persists across app restarts. I found that I would have to change from the defaults to my preferred config every time I started up in VR, which was annoying. Now we just start with the config that was set last time.
	* Add dependency on rapidyaml (MIT)
	* Serialize transcription config to file under Resources/
	* Add Config class to wrap serializing/deserializing
	* Update build instructions
	* Simplify StartApp() API, taking Config struct instead of a ton of arguments
*	Bugfix: user-provided paths may now contain spaces	yum	2023-01-04
	Previously, paths containing spaces would be interpreted by python's argument parser as multiple separate arguments, causing it to fail. Now we escape paths inside PythonWrapper using std::quoted().
	* Improve PII filtering. Python output would contain multiple path separators (like C:\\Users\\foo\\), defeating the PII regex.
	* Silence compiler warning in PII filter.
	* Document usability improvements.
	* Transcription layer exponential backoff goes to ~infinity when paused. This is a hack, since we really don't need to transcribe at all when paused, but it lets us keep the code simple. Good enough until the next rewrite.
	* Shader only samples background when necessary.
	* Limit matchStrings() print()s to DEBUG mode
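A minimal sketch of a PII filter that tolerates repeated path separators, as the bullet above describes. The pattern, the function name, and the five-asterisk mask are hypothetical; the actual TaSTT filter lives in the C++ GUI and may look quite different.

```python
import re

# Hypothetical pattern: "Users", then ONE OR MORE separators (so both
# C:\Users\foo and C:\\Users\\foo match), then the user name up to the
# next separator.
_USER_DIR = re.compile(r"(Users[\\/]+)([^\\/]+)", re.IGNORECASE)


def mask_pii(line: str) -> str:
    """Replace the user name in Windows-style home paths with asterisks."""
    return _USER_DIR.sub(lambda m: m.group(1) + "*****", line)
```

The `+` after the separator class is the relevant part: a pattern that expects exactly one separator would fail on escaped output such as `C:\\Users\\foo\\`.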
*	Portability bugfixes	yum	2023-01-01
	* Expose option to run transcription engine on CPU instead of GPU
	* Use embedded git when setting up the Python virtual environment
*	Embed git in package	yum	2023-01-01
	package.ps1 fetches PortableGit and embeds it in the package. This eliminates all but one runtime dependency (MSVC++ Redistributable).
	* Move Python into a new FOSS folder.
*	Statically link binary	yum	2023-01-01
	Update build instructions.
*	Bugfix: regions truncate correctly at page boundaries	yum	2022-12-30
	Boards whose size is an even multiple of CHARS_PER_SYNC would lose the entire last region.
	* Attempt to fix runaway memory usage of GUI text frames, but this needs more work
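A minimal sketch of region arithmetic that behaves correctly at page boundaries, both when the board size is an exact multiple of the sync size and when it is not. The function names and the sample sizes are hypothetical; this is not the project's numRegions() implementation.

```python
def num_regions(board_size: int, chars_per_sync: int) -> int:
    """Ceiling division: a final partial region still counts as a region."""
    return (board_size + chars_per_sync - 1) // chars_per_sync


def region_bounds(board_size: int, chars_per_sync: int):
    """Return (start, end) cell indices for each region; end is exclusive."""
    return [
        (start, min(start + chars_per_sync, board_size))
        for start in range(0, board_size, chars_per_sync)
    ]
```

Clamping `end` to `board_size` keeps the last region from addressing cells that do not exist, and deriving the regions from `range(0, board_size, chars_per_sync)` yields neither one too many nor one too few when the size divides evenly.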
*	GUI: Update chars per sync default	yum	2022-12-30
	The defaults now reflect what I typically use.
*	GUI: Expose transcription window duration	yum	2022-12-30
	Users can pick longer transcription durations for accuracy-critical tasks, or shorter durations for latency-critical tasks.
*	GUI: Users can now control board dimensions	yum	2022-12-29
	Users can now control how many letters wide and tall the board is. Tested at 4x48, 5x60, 10x120, and 20x240.
	At 20x240, Unity freezes and does not make forward progress. Perhaps creating 4800 float parameters isn't a truly scalable interface.
*	GUI: preview number of parameter bits the config will use	yum	2022-12-29
	Users can now see the number of avatar parameter bits they'll use prior to committing.
*	First letter no longer disappears	yum	2022-12-29
	An off-by-one issue in numRegions() would result in one extra layer trying to drive a letter in the last region, which would wrap back around to the 0th character slot (cell).
	* GUI explicitly logs when it's done generating avatar stuff
	* OSC layer no longer tries to update cells which don't exist
*	Users can disable local beep	yum	2022-12-29
	The transcription engine beeps when you start/stop transcribing so you know that it's listening. Users can now disable this.
	* add help text to all input fields in GUI
	* make TaSTT generated file textctrls readonly, since I haven't tested them being reassigned
	* document idea to configure unity & transcription apps with config files
	* controller input thread no longer crashes if steamvr isn't running, it just slowly spins and waits
	* when you stop transcribing, the transcription engine re-transcribes a few times. I think this should improve end-of-transcription tail latencies
	* transcribe.py now prints out its args
*	Bugfix: transcribe panel respects chars per sync etc.	yum	2022-12-26
	The transcribe panel was grabbing data from the unity panel, causing the bytes per char / chars per sync parameters to be ignored.
*	GUI: expose chars per sync, bytes per char	yum	2022-12-24
	Users can now control how many characters they send per sync event, as well as the number of bytes used to represent each character. This gives them the power to pick between faster paging and fewer sync params. International users must use 2 bytes per char (at least for now).
	* package.ps1: don't distribute the gigantic TTF files, just the bitmaps
*	Don't delete TaSTT_Generated	yum	2022-12-21
	This makes incremental workflows much more efficient, since you don't have to reassign the FX controller, params, and menu.
*	GUI: Add better logging interface	yum	2022-12-21
	Create printf-like interface for writing to wxTextCtrl objects. Also mask out PII.
	I wanted a way to not dox myself when recording demos, but I wound up making a second user on my PC to serve the same purpose. Maybe I'll delete the code later idk.
*	Control tweak: introduce long/short hold behavior	yum	2022-12-20
	The typical use pattern is now possible without entering radial. Leaving mounted to the world for a long time is no longer possible. Maybe I need an override param?
	Left joystick controls:
	* Short press toggle 1: show board, lock to hand, start transcribing
	* Short press toggle 2: lock to world, stop transcribing
	* Long press: hide board, stop transcribing
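The three control behaviors listed above can be sketched as a small state transition function. This is only an illustration: the 1-second threshold, the `BoardState` type, and the `on_release` name are hypothetical, and the state of the hand lock after a long press is an assumption the commit does not specify.

```python
from dataclasses import dataclass

# Hypothetical threshold; the commit message does not state one.
LONG_HOLD_SECONDS = 1.0


@dataclass
class BoardState:
    visible: bool = False
    locked_to_hand: bool = False
    transcribing: bool = False


def on_release(state: BoardState, held_seconds: float) -> BoardState:
    """Return the next state when the button is released after held_seconds."""
    if held_seconds >= LONG_HOLD_SECONDS:
        # Long press: hide board, stop transcribing.
        # (Clearing locked_to_hand here is an assumption.)
        return BoardState(visible=False, locked_to_hand=False, transcribing=False)
    if not state.transcribing:
        # Short press toggle 1: show board, lock to hand, start transcribing.
        return BoardState(visible=True, locked_to_hand=True, transcribing=True)
    # Short press toggle 2: lock to world, stop transcribing.
    return BoardState(visible=True, locked_to_hand=False, transcribing=False)
```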
*	GUI: "Finish" avatar generation workflow	yum	2022-12-20
	GUI now generates parameters & menu. Still need to handle write defaults.
	* Add capability to append to avatar parameters & menu
	* Install canned Unity assets, shaders, and fonts in avatar folder
	* Check in materials for ease of use
	* Bugfix: correctly label menu/parameters file pickers
*	GUI can now generate animator	yum	2022-12-20
	Still need to generate params & merge menus. Getting close....
*	GUI: Begin work generating animator	yum	2022-12-20
	The GUI can now generate guid.map and animations.
*	GUI: Fix transcription output	yum	2022-12-19
	Output now shows up in the textbox in ~real time. We do this by disabling Python's output buffering. This has a performance impact, but it should be negligible.
	* Fix crash when setting up python environment
	* UI tweak: text displays now expand with window
	* Fix how we merge transcribe.py; usually don't have to resort to SIGKILL, which loses stdout/stderr.
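A minimal sketch of launching a Python child with output buffering disabled so that its lines can be read as they are produced. It uses the interpreter's `-u` flag; setting the `PYTHONUNBUFFERED` environment variable is an alternative. The commit does not say which mechanism TaSTT uses, and the real launcher is the C++ PythonWrapper rather than Python, so the `stream_lines` helper is purely illustrative.

```python
import subprocess
import sys


def stream_lines(args):
    """Run a Python child unbuffered and yield its output lines as they arrive."""
    proc = subprocess.Popen(
        [sys.executable, "-u", *args],   # -u: unbuffered stdout/stderr in the child
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,        # interleave stderr with stdout
        text=True,
    )
    with proc.stdout:
        for line in proc.stdout:
            yield line.rstrip("\n")
    proc.wait()
```

Without `-u`, a child whose stdout is a pipe typically block-buffers its output, so a GUI reading that pipe would see text arrive in large delayed chunks.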
*	GUI: Improve error logging	yum	2022-12-19
	PythonWrapper correctly captures wxProcess stdout & stderr in sync and async execution modes.
*	GUI: Sketch out Unity panel	yum	2022-12-19
	Now there are two panels: one to run transcription, one to generate avatar assets.
	Also, getting mics & python version can no longer crash the app.
*	Now it's possible to build the app from Powershell	yum	2022-12-18
	No more WSL dependencies!
*	Add resource file header	yum	2022-12-18
*	Add ability to select model	yum	2022-12-18
	* icon now works when pinned to taskbar
	* add model selection
	* add script to dump mic devices
	* whisper models now download into the virtual environment
*	GUI: Add mic, language selection	yum	2022-12-18
	Users can now select their mic & spoken language in the GUI.
	* pyaudio now samples at the mic rate, fixing an issue where frames would drop. We downsample in the callback by dropping frames.
	* add Sounds folder to package
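A minimal sketch of downsampling by dropping frames, as described in the pyaudio bullet above. It assumes mono signed 16-bit PCM and an integer ratio between the two rates; the `decimate` name and those assumptions are hypothetical and are not taken from transcribe.py.

```python
from array import array


def decimate(pcm: bytes, mic_rate: int, target_rate: int) -> bytes:
    """Keep every Nth 16-bit sample, where N = mic_rate // target_rate."""
    if mic_rate % target_rate != 0:
        raise ValueError("this sketch only handles integer rate ratios")
    step = mic_rate // target_rate
    samples = array("h", pcm)  # signed 16-bit samples, native byte order
    return samples[::step].tobytes()
```

For example, a 48000 Hz capture reduced to 16000 Hz keeps every third sample. Dropping samples without low-pass filtering first can alias high frequencies into the result, which is the usual trade-off of this simple approach.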
*	GUI: Add ability to start & stop transcription engine	yum	2022-12-17
*	Finish python virtual env	yum	2022-12-17
	GUI can now download all TaSTT dependencies and install them into a virtual environment.
	* Add buttons to check embedded python version & install dependencies
	* Add class to wrap interacting with embedded Python
	* Put all TaSTT python scripts into a folder
*	Check in `future` package	yum	2022-12-17
	I hit some issues installing Whisper and had to embed this package. I haven't taken the time to deeply understand what's going on. I think that embedded Python follows different rules about resolving module paths than regular system Python.
	Basically, `future`'s setup.py has a line like `import src`, where `src` is a module inside future (like `future/src/__init__.py`). This doesn't work unless we put that directory on the search path.
*	Document embedded venv hack	yum	2022-12-16
	Check in pip & modify embedded python to install to Lib and Lib/site-packages.
	Experimentally, packages may be installed with pip and do reside in Lib/site-packages. Hard to tell if this is also touching files outside the venv.
*	Check in python 3.11	yum	2022-12-16
	License is included in source & distributable package.
*	Refactor app	yum	2022-12-16
	Create headers & implementation files for App and Frame.
*	Add logo	yum	2022-12-16
	* GUI now shows logo
	* Add package.ps1 to generate distributable application bundle
	* Rename ~GUI to GUI
	* Add ScopeGuard class