| Commit message | Author | Age |
| |
Also adjust continuous transcription algorithm to use leftmost minimum
instead of rightmost. This prevents some cases where we generate longer
and longer text.
|
Algorithm:
* look at last 20 chars of last committed transcription
* scan new transcription using 10-char sliding window
* find spot where distance is minimized
* stitch two messages together
Thus we're able to maintain a continuously growing transcription
without having to feed the AI more than 30 seconds of data at a
time. Seems to work reasonably well in bench tests.
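A minimal sketch of the stitching steps above, not the actual transcribe.py code. It assumes plain Levenshtein distance as the "distance" and a single window size for both the committed tail and the sliding window (the message mentions 20 and 10 chars without saying how they relate). Ties go to the leftmost minimum, as a later commit describes. `edit_distance` and `stitch` are hypothetical names.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (assumed metric)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def stitch(committed: str, new: str, window: int = 10) -> str:
    """Append the part of `new` that follows the best match of the
    committed tail. `window` is a hypothetical parameter."""
    tail = committed[-window:]
    if not tail or len(new) < len(tail):
        return committed + new
    w = len(tail)
    best_i, best_d = 0, None
    for i in range(len(new) - w + 1):
        d = edit_distance(tail, new[i:i + w])
        # Strict '<' keeps the leftmost position among equal minima.
        if best_d is None or d < best_d:
            best_i, best_d = i, d
    return committed + new[best_i + w:]
```

For example, `stitch("the quick brown fox jumps", "brown fox jumps over the lazy dog")` finds the committed tail inside the new text and appends only what follows it.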
Also fix silence detection. AI exposes a probability that nothing
was said. Hand-pick a probability of 0.1. Sometimes the AI still
goes sicko mode with this setting but going higher occasionally
results in no transcription.
|
Messages longer than a board will automatically write over the top.
TODO:
* Real cell-based message diffing
* Cumulative transcription
  * this would completely mitigate the effects of trim events
|
Slightly improve temporal stability and responsiveness at the cost of
limiting to a 30 second recording.
Before committing to a transcription, wait for two consecutive
transcriptions such that they are identical, or the former is a
prefix of the latter. This helps with temporal stability by eliminating
most one-off wildly inaccurate transcriptions.
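A minimal sketch of the commit rule above, assuming transcriptions arrive one at a time as strings. `StabilityGate` and `offer` are hypothetical names, and treating an empty previous transcription as "not stable" is an assumption; the commit message does not say which of the two transcriptions is then committed.

```python
from typing import Optional


class StabilityGate:
    """Reports when two consecutive transcriptions agree."""

    def __init__(self) -> None:
        self._previous: Optional[str] = None

    def offer(self, text: str) -> bool:
        """Return True if `text` equals the previous transcription or
        extends it (the previous one is a prefix of `text`)."""
        previous = self._previous
        self._previous = text
        # startswith() covers both the identical and the prefix case.
        return bool(previous) and text.startswith(previous)
```

A one-off wild transcription fails the check on both sides: it does not extend its predecessor, and its successor does not extend it.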
Also make osc_ctrl.sendMessageLazy a little lazier, limiting it to 2
consecutive non-empty cells per call. This allows us to recover from
mistranscriptions faster.
|
Also:
* Check in toggle on/off animations
* Add toggle parameter
* libunity bug: getUniqueId() was calling allocateId() incorrectly
* Remove osc_ctrl `client` global
* Fix transcribe.py text encoding
|
* Add VRLabs' World Constraint as a submodule
* Add animations for world constraint
* Add toggles for board
* Add libunity.py (no content yet)
* Support >30s transcription
* Add board FBX
|
It's a little buggy; it likes to overwrite cells on the board. No idea
why.
|
Using OpenAI's Whisper neural network, we can do local STT. Transcription
quality is good, system resource usage is minimal (1 GB VRAM), and latency
is much lower than cloud-based transcription.
* Add transcribe.py
  * Creates 3 threads:
    * One saves mic audio to a buffer
    * One passes mic audio to the STT
    * One sends the transcribed text to the board
  * Main thread listens for input. Press enter to start a new message.
* Add osc_ctrl.sendMessageLazy, a simple diff-based message sending utility.
  * A little complexity: it only sends 1 empty cell per call, allowing us to
    quickly say new things without having to wait for the whole buffer to
    clear.
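One possible reading of the lazy diff described above, sketched under assumptions: each cell holds a single character, a space is the empty cell, and `lazy_diff` (a hypothetical name, not the real osc_ctrl.sendMessageLazy) returns the updates to send rather than sending them.

```python
EMPTY = " "  # assumed representation of an empty cell


def lazy_diff(shown: str, target: str, max_empty: int = 1):
    """Return (cell_index, char) updates that move `shown` toward
    `target`, blanking at most `max_empty` cells per call."""
    updates = []
    empties = 0
    for i, (old, new) in enumerate(zip(shown, target)):
        if old == new:
            continue  # cell already shows the right character
        if new == EMPTY:
            if empties >= max_empty:
                continue  # leave stale cells for a later call
            empties += 1
        updates.append((i, new))
    return updates
```

With this reading, new text is written immediately while leftover characters from the old message are cleared a little at a time over subsequent calls.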
|
* Grow board size from 6x16 to 8x22 (96 to 176 cells)
* Reduce parameter bits used (thanks to extra layer of indexing)
* Rename template.anim to template.anim.txt to prevent Unity from
constantly rewriting it
* osc_ctrl.encodeMessage now pads the message so that all empty space is
overwritten
* Delete osc_ctrl.sendMessageCellContinuous. Now that we use a single 'Enable'
  bit, this idea is sidelined.
  * We can probably achieve the same effect by making TaSTT.shader a little
    more clever. For example, if we pass it the current cell number, it could
    render a time-based 'fade-in' effect which simulates smooth streaming.
|
Even more reliable now.
|
* Shorten animations to 1 frame
* Eliminate fx internal transition delays
  * These were causing the shader parameters to interpolate, causing the
    inconsistent / flickering letters I was seeing
|
Per the VRC docs, state behaviors may not execute if the total length of
time in the state is < 0.02 seconds. Adding a 2-frame 'Do Nothing'
animation to the top of every layer seems to help with stability.
*shrug*
More cleanup:
* Generate a unique return-home transition for each terminal state
instead of reusing the same one.
* Use globally unique state names in animator.
* All animations are at least 2 frames long.
|
... and a bunch of bugfixes:
* Shader is now transparent
* Simplify shader row/column calculation
* Add punctuation to texture
* Fix generate.sh
* Add lorum_ipsum.txt
* Fix how long text is scrolled
* Simplify encoding logic in osc_ctrl.py
|
Add trivial line wrapping algorithm. Words are only added to
a line if they don't put it over the column limit, and only broken if
they alone exceed the column limit.
Extend board size to 16x6, using 145 bits of parameter memory.
Add simple generate.sh script, which generates everything needed to
use the text-to-text board.
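The line wrapping rule described above can be sketched roughly as follows. This is an illustration, not the code from the commit; `wrap` is a hypothetical name, and splitting on whitespace is an assumption.

```python
def wrap(text: str, cols: int) -> list:
    """Greedy wrap: a word joins the current line only if it fits, and
    is broken only if it alone exceeds the column limit."""
    lines = []
    line = ""
    for word in text.split():
        # A word longer than a whole line must be broken across lines.
        while len(word) > cols:
            if line:
                lines.append(line)
                line = ""
            lines.append(word[:cols])
            word = word[cols:]
        candidate = word if not line else line + " " + word
        if len(candidate) <= cols:
            line = candidate
        else:
            lines.append(line)
            line = word
    if line:
        lines.append(line)
    return lines
```

With a 16-column board, `wrap(message, 16)` would give one string per row; rows beyond the board height still need separate handling.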
|
Apparently the same avatar parameter can only be updated so quickly
before VRChat starts dropping messages. So now we divide the board
into "groups" of 8 characters. Each group can be updated relatively
slowly, but all groups can be updated in parallel. Thus we can update
the board group-by-group, pausing between each group.
* Fix shader bugs - now there are Row05 parameters, and row00 refers
to the topmost row instead of the bottom-most.
* Remove outdated layer/group names files
* Extend osc_ctrl.py to support encoding & sending messages
* Add generate_params.py to handle creating TaSTT_params.asset
* Add generate_utils.py for common code generation facilities &
parameters.
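A rough sketch of the group-by-group update described above. The names `split_into_groups` and `send_by_group`, the `send_group` callback, and the 0.1 second default pause are all hypothetical; the real osc_ctrl.py may structure this differently.

```python
import time


def split_into_groups(message: str, group_size: int = 8) -> list:
    """Cut the message into consecutive groups of `group_size` chars."""
    return [message[i:i + group_size]
            for i in range(0, len(message), group_size)]


def send_by_group(message: str, send_group, pause_s: float = 0.1,
                  group_size: int = 8) -> None:
    """Send one group at a time, pausing between groups so the same
    avatar parameter is never updated too quickly."""
    for index, chunk in enumerate(split_into_groups(message, group_size)):
        send_group(index, chunk)  # caller-supplied sender (assumed)
        time.sleep(pause_s)
```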
|
Can't get much faster than 0.1 seconds per character with the current
design. Still, a good first step!
* Simplify parameters: only use 3 8-bit ints + 1 boolean.
* Rewrite FX generator according to new params.
* Rewrite osc_ctrl.py to test in-game display.
|
Doesn't work in game.
Also change # of characters per slot to 80, down from 128.
Also realize that VRChat supports 256 BITS of parameter, not 256 BYTES.
Next design idea:
* 3 8-bit parameters: letter, row, col
* 1 boolean parameter: active
* one animation for each slot/letter combo, as usual
* one fx layer like this:
    if !active:
      do nothing
    if row == 0:
      if col == 0:
        if letter == 0:
          play row00_col00_letter00 animation
* because write defaults are off, we should be able to "save" letters
by simply setting active = false
* thus we don't need to simultaneously address the entire board, saving
memory
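The nested checks in the design above amount to a lookup from (row, col, letter) to an animation name. A small sketch, assuming the naming pattern generalizes from `row00_col00_letter00`; `select_animation` is a hypothetical helper, not part of the FX generator.

```python
from typing import Optional


def select_animation(active: bool, row: int, col: int,
                     letter: int) -> Optional[str]:
    """Return the animation to play, or None when the layer is idle."""
    if not active:
        # Do nothing. With write defaults off, cells already written
        # should keep their letters.
        return None
    return f"row{row:02d}_col{col:02d}_letter{letter:02d}"
```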
|
Simply sends numbers to a parameter's OSC address.
Of course, nothing is showing up in game. More debugging is needed.
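For reference, a minimal sketch of what sending a number to a parameter's OSC address involves, using only the standard library. The encoding follows OSC 1.0 (null-terminated strings padded to 4 bytes, big-endian int32). The default host and port, and any address passed in, are assumptions rather than values taken from this repository.

```python
import socket
import struct


def _osc_string(s: str) -> bytes:
    """OSC string: ASCII, null-terminated, padded to a multiple of 4."""
    data = s.encode("ascii") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def encode_osc_int(address: str, value: int) -> bytes:
    """Encode a single-int OSC message: address, type tag ',i', int32."""
    return _osc_string(address) + _osc_string(",i") + struct.pack(">i", value)


def send_osc_int(address: str, value: int,
                 host: str = "127.0.0.1", port: int = 9000) -> None:
    """Send the message over UDP. Host and port are assumed defaults."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_osc_int(address, value), (host, port))
```

Inspecting the bytes from `encode_osc_int` is one way to rule out encoding problems when nothing shows up in game.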
|