| Commit message (Collapse) | Author | Age |
| | |
|
| |
|
|
|
|
|
|
|
| |
Mostly updating roadmap stuff. Non-VRC use cases are "complete" since I
was mostly targeting streaming. The ability to type into arbitrary text
fields is still somewhat nascent & could be improved.
Also update some other random stuff to be more up to date. KillFrenzy
Avatar Text is now MIT, pog!
|
| |
|
|
|
|
| |
Should improve legibility.
* Update README
|
| |
|
|
|
|
|
|
|
|
| |
To use it, do a medium hold + long hold. Keep the long hold depressed
until you're done speaking. The transcription will be typed into the
currently selected input field.
* Add more audio feedback
* Make audio feedback play asynchronously so it doesn't slow down the
controller input state machine as much.
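One way to keep feedback sounds off the input loop is a background worker fed by a queue. This is a minimal sketch, not the project's actual code: `FeedbackPlayer`, `request`, and the `play` callback are hypothetical names.

```python
import queue
import threading


class FeedbackPlayer:
    """Plays feedback sounds on a worker thread (hypothetical helper)."""

    def __init__(self, play):
        # `play` is an assumed blocking callable that plays one sound.
        self._play = play
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def request(self, sound):
        # Returns immediately, so the input state machine is not blocked.
        self._queue.put(sound)

    def wait_idle(self):
        # Blocks until every queued sound has been played.
        self._queue.join()

    def _run(self):
        while True:
            sound = self._queue.get()
            self._play(sound)
            self._queue.task_done()
```

Sounds still play in the order requested, but the caller only pays for a queue insertion.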
|
| |
|
|
|
|
| |
See comment for details.
* Update README
|
| |
|
|
| |
I'm able to use the new code to show text in game. Not yet play-tested.
|
| |
|
|
|
|
| |
This is a much faster, lower-VRAM reimplementation of Whisper in Python.
Early testing is extremely promising: fast transcription speed,
extremely low resource usage (CPU/RAM/VRAM), high accuracy.
|
| |
|
|
|
|
|
|
| |
This fixes issues where the transparent corners of the textbox render
in front of other materials, causing those other materials to skip
rendering.
* Update README.md with roadmap and avatar resource usage.
|
| | |
|
| |
|
|
|
|
|
|
|
|
| |
Allows sustained exponential backoff when not transcribing. The backoff
used to cap out at 1s.
* Add more items to README TODO list
* Adjust emote metadata
* Emotes bugfix: Non-existent emote map doesn't cause transcription
engine to bail out.
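A rough sketch of the backoff idea, assuming a polling loop that sleeps between transcription attempts. The function name and the minimum, maximum, and factor values are illustrative assumptions, not the project's settings.

```python
def next_delay(current_s, transcribing, min_s=0.05, max_s=60.0, factor=2.0):
    """Return the next sleep interval in seconds (hypothetical helper).

    While transcribing, poll at the minimum interval. Otherwise keep
    multiplying the interval until it reaches max_s. Raising max_s is what
    lets the backoff continue past an old 1s ceiling.
    """
    if transcribing:
        return min_s
    return min(current_s * factor, max_s)
```

For example, starting at 0.5s while idle, successive calls return 1.0, 2.0, 4.0, and so on until the cap is reached.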
|
| |
|
|
|
| |
Deleting python310._pth causes a few more things to be installed in the
venv.
|
| |
|
|
|
|
| |
* Fix prefab: bounding box & position are now set to 0
* Fix shader: text is no longer upside down
* Update README
|
| |
|
|
| |
Document recent features, better explain basis of transcription.
|
| |
|
|
|
|
|
|
| |
* Text color, background color, and margin color are all customizable
now
* Better organize shader parameters. User-facing params are exposed
Like_This; internal params are exposed _Like_This.
* Update README. More wordsmithing.
|
| |
|
|
|
| |
* Point to a more up-to-date demo.
* Improve wordsmithing/flow
|
| |
|
|
|
|
|
| |
Don't literally check in Python since it looks dodgy (rightfully so).
Instead the build script just fetches it.
* Update README, simplifying language and documenting other projects
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, paths containing spaces would be interpreted by Python's argument
parser as multiple separate arguments, causing it to fail. Now we escape paths
inside PythonWrapper using std::quoted().
* Improve PII filtering. Python output would contain multiple path separators
(like C:\\Users\\foo\\), defeating the PII regex.
* Silence compiler warning in PII filter.
* Document usability improvements.
* Transcription layer exponential backoff goes to ~infinity when paused.
This is a hack, since we really don't need to transcribe at all when paused,
but it lets us keep the code simple. Good enough until the next rewrite.
* Shader only samples background when necessary.
* Limit matchStrings() print()s to DEBUG mode
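To illustrate the PII filtering fix, here is a sketch of a pattern that tolerates repeated path separators. The commit's real filter is not shown here and is probably not written in Python; the pattern, the `scrub` name, and the `*****` replacement are all assumptions.

```python
import re

# Hypothetical pattern: drive letter, "Users", then the user name, with one
# or more backslashes or slashes between components.
USER_DIR = re.compile(r"([A-Za-z]:[\\/]+Users[\\/]+)[^\\/]+")


def scrub(text):
    """Replace the user name in Windows home-directory paths."""
    # A function replacement avoids backslash-escape processing of group 1.
    return USER_DIR.sub(lambda m: m.group(1) + "*****", text)
```

Matching `[\\/]+` rather than a single separator is what lets the doubled backslashes in Python's output still hit the filter.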
|
| |
|
|
| |
This seems like the best way to support users.
|
| |
|
|
|
|
|
| |
Add a new shader to make the box a little prettier.
* Reduce material slots required from 2 to 1
* Add rounding to edge of box
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By sending encoded words rather than letters, we could speed up the
English paging rate by 2.5x over an optimized char-based implementation.
Word-encoded implementation: 16 bits per word
(capped at 64k possible words).
Optimized char-based implementation:
(5.7 chars per word) * (7 bits per char) == 39.9 bits per word
2.5x slower than word encoding.
Today's char-based implementation:
(5.7 chars per word) * (16 bits per char) == 91.2 bits per word
5.7x slower than word encoding.
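The arithmetic above can be checked with a few lines of Python. The figures (5.7 chars per word, 7 and 16 bits per char, 16 bits per word) come from the message itself; the names are only for illustration.

```python
AVG_CHARS_PER_WORD = 5.7  # average English word length used in the message
WORD_CODE_BITS = 16       # one 16-bit code per word, so at most 2**16 words


def bits_per_word(bits_per_char):
    """Cost of sending one average word character by character."""
    return AVG_CHARS_PER_WORD * bits_per_char


optimized = bits_per_word(7)   # optimized char-based encoding
current = bits_per_word(16)    # today's char-based encoding

# Slowdown of each char-based scheme relative to word encoding.
optimized_slowdown = optimized / WORD_CODE_BITS
current_slowdown = current / WORD_CODE_BITS
```

Rounded to one decimal place, the two slowdown ratios match the 2.5x and 5.7x quoted above.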
|
| | |
|
| |
|
|
| |
Also decrease sync params & add a few more emotes.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After re-reading the paper, I noticed that they apply a couple of
optimizations I wasn't using. Use the top-level `whisper.transcribe`
method, which is a little slower, but more accurate than the one I was
using.
Although this method is slower, it has better temporal stability due to
the increased quality, which I think should make for an overall more
responsive UX. Lower transcription quality means the paging layer has to
waste time updating earlier cells.
Also, drop the auto-commit stuff and go back to string stitching. I
think it's better to let the user manually commit. A rework of the hand
controls is probably coming soon.
Finally, update README.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
| |
Add a `matchStrings` which does basically the same thing as
`matchStringList` except it doesn't split the input at space boundaries.
I think this should work better for Japanese and Chinese, since they
don't use spaces.
Doesn't seem to cause any accuracy regressions for English.
Also update the README.
|
| |
|
|
|
|
|
|
|
|
|
| |
Instead of generating one animation for every single character in our
character set, we just generate 2: the lowest and the highest. We use
blend trees to interpolate between these two extremes.
This reduces the number of animations we have to generate by a factor
of 80. It also clears the way for multi-language support (coming soon).
It also means we don't have to reopen Unity every time we generate a new
animator.
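The interpolation idea can be sketched outside of Unity as plain linear blending. This assumes the blend tree mixes the two extreme animations linearly; `blend_weight` and `blended_value` are hypothetical names, and the character count is a parameter rather than the project's real value.

```python
def blend_weight(char_index, num_chars):
    """Map a character index in [0, num_chars - 1] to a weight in [0, 1]."""
    return char_index / (num_chars - 1)


def blended_value(char_index, num_chars, lowest, highest):
    """Interpolate between the values set by the two extreme animations."""
    t = blend_weight(char_index, num_chars)
    return (1.0 - t) * lowest + t * highest
```

With only the lowest and highest animations authored, every intermediate character is selected by the weight alone, so growing the character set changes `num_chars` rather than the number of animations.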
|
| | |
|
| | |
|
| |
|
|
|
|
|
| |
While the board is clearing, you can keep talking, and it will be
rendered when the board finishes clearing.
* bugfix: STT only beeps when it's out
|
| | |
|
| |
|
|
|
| |
* Implement basic board toggle using new transition logic
* Metadata can now restore from file
|
| |
|
|
| |
Also update README.
|
| | |
|
| |
|
|
|
|
| |
* Update README with contribution instructions & design details.
* Add text-to-text demo gif
* Document known Unity landmines in generate.sh.
|
| |
|
|
| |
Even more reliable now.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Apparently the same avatar parameter can only be updated so quickly
before VRChat starts dropping messages. So now we divide the board
into "groups" of 8 characters. Each group can be updated relatively
slowly, but all groups can be updated in parallel. Thus we can update
the board group-by-group, pausing between each group.
* Fix shader bugs - now there are Row05 parameters, and row00 refers
to the topmost row instead of the bottom-most.
* Remove outdated layer/group names files
* Extend osc_ctrl.py to support encoding & sending messages
* Add generate_params.py to handle creating TaSTT_params.asset
* Add generate_utils.py for common code generation facilities &
parameters.
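The group-by-group update described above might look like the following sketch. The group size of 8 comes from the message; `split_into_groups`, `send_board`, the `send_group` callback, and the pause length are assumptions and do not reflect the actual osc_ctrl.py interface.

```python
import time
from typing import Callable, List

GROUP_SIZE = 8  # characters per group, per the commit message


def split_into_groups(cells: str, group_size: int = GROUP_SIZE) -> List[str]:
    """Split the board's characters into consecutive fixed-size groups."""
    return [cells[i:i + group_size] for i in range(0, len(cells), group_size)]


def send_board(cells: str,
               send_group: Callable[[int, str], None],
               pause_s: float = 0.1,
               sleep: Callable[[float], None] = time.sleep) -> None:
    """Send one group at a time, pausing between groups.

    `send_group(index, text)` is a hypothetical callback that would update
    the parameters belonging to a single group.
    """
    for index, group in enumerate(split_into_groups(cells)):
        send_group(index, group)
        sleep(pause_s)
```

Passing `sleep` as a parameter keeps the pacing easy to stub out when testing.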
|
| |
|