| Commit message | Author | Age |
| |
Also adjust continuous transcription algorithm to use leftmost minimum
instead of rightmost. This prevents some cases where we generate longer
and longer text.
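
The leftmost/rightmost distinction only matters when several window positions tie for the smallest distance. A minimal sketch of the two tie-breaking rules follows. The function names and the sample distances are made up for illustration and are not taken from the repository; both functions assume a non-empty list.

```python
def leftmost_argmin(distances):
    """Index of the first occurrence of the smallest value."""
    best = 0
    for i, d in enumerate(distances):
        if d < distances[best]:   # strict comparison: the earliest tie wins
            best = i
    return best


def rightmost_argmin(distances):
    """Index of the last occurrence of the smallest value."""
    best = 0
    for i, d in enumerate(distances):
        if d <= distances[best]:  # non-strict comparison: the latest tie wins
            best = i
    return best


# Hypothetical distances for five window positions; positions 1 and 3 tie.
distances = [4, 1, 3, 1, 5]
print(leftmost_argmin(distances), rightmost_argmin(distances))
```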
|
Algorithm:
* look at last 20 chars of last committed transcription
* scan new transcription using 10-char sliding window
* find spot where distance is minimized
* stitch two messages together
Thus we're able to maintain a continuously growing transcription
without having to feed the AI more than 30 seconds of data at a
time. Seems to work reasonably well in bench tests.
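
The stitching steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in transcribe.py: the commit message does not say which distance metric is used or how the 20-char tail is compared against the 10-char window, so the sketch assumes Levenshtein distance and uses one length (`window`) for both. The names `edit_distance` and `stitch`, and the concatenation fallback for short inputs, are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


def stitch(committed: str, new: str, window: int = 10) -> str:
    """Append to `committed` the part of `new` that follows the best match."""
    if len(committed) < window or len(new) < window:
        # Assumed fallback: not enough text to compare, so just concatenate.
        return committed + new
    tail = committed[-window:]
    best_pos, best_dist = 0, None
    for pos in range(len(new) - window + 1):
        d = edit_distance(tail, new[pos:pos + window])
        if best_dist is None or d < best_dist:  # strict '<' keeps the leftmost minimum
            best_pos, best_dist = pos, d
    # Keep everything already committed, then whatever follows the matched window.
    return committed + new[best_pos + window:]


print(stitch("the quick brown fox jumps over", "fox jumps over the lazy dog"))
```

Because only the tail of the committed text is compared, the cost per update depends on the length of the new transcription rather than on the length of everything said so far.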
Also fix silence detection. The AI exposes a probability that nothing
was said. Hand-pick a threshold of 0.1. Sometimes the AI still
goes sicko mode with this setting, but going higher occasionally
results in no transcription.
|
* Implement basic board toggle using new transition logic
* Metadata can now restore from file
|
Messages longer than a board will automatically write over the top.
TODO:
* Real cell-based message diffing
* Cumulative transcription
  * this would completely mitigate the effects of trim events
|
Add a third heuristic. If the transcription is relatively long and the
first bit differs from the previous transcription, immediately
overwrite. Because the transcription is long, it's a bit less likely to
be a complete mistranscription.
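
A minimal sketch of this third heuristic, for illustration only. The commit message does not say how long "relatively long" is or how much of the text counts as "the first bit", so `min_len` and `head` are hypothetical values, as is the function name.

```python
def should_overwrite(previous: str, new: str, min_len: int = 40, head: int = 10) -> bool:
    """True when `new` is long and its opening differs from `previous`."""
    is_long = len(new) >= min_len                 # assumed cutoff for "relatively long"
    head_differs = new[:head] != previous[:head]  # assumed size of "the first bit"
    return is_long and head_differs


previous = "hello everyone and welcome"
print(should_overwrite(previous, "no"))  # short text: not overwritten immediately
print(should_overwrite(previous, "today we are going to talk about speech to text"))
```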
|
Slightly improve temporal stability and responsiveness at the cost of
limiting to a 30 second recording.
Before committing to a transcription, wait for two consecutive
transcriptions such that they are identical, or the former is a
prefix of the latter. This helps with temporal stability by eliminating
most one-off wildly inaccurate transcriptions.
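
The commit rule described above could look roughly like this. The class and method names are hypothetical, and ignoring an empty previous transcription is an added assumption (otherwise every text would count as having the empty string as a prefix).

```python
class CommitGate:
    """Commit a transcription only when the one before it agrees with it."""

    def __init__(self):
        self.previous = None

    def offer(self, text: str):
        """Return `text` if it should be committed, otherwise None."""
        prev, self.previous = self.previous, text
        # startswith() covers both cases from the rule: identical strings,
        # and the former being a prefix of the latter.
        if prev and text.startswith(prev):
            return text
        return None


gate = CommitGate()
for candidate in ["hello", "hello world", "purple monkey", "hello world again"]:
    print(repr(candidate), "->", gate.offer(candidate))
```

In this run the one-off "purple monkey" is never committed, because neither the transcription before it nor the one after it agrees with it.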
Also make osc_ctrl.sendMessageLazy a little lazier, limiting it to 2
consecutive non-empty cells per call. This allows us to recover from
mistranscriptions faster.
|
Also:
* Check in toggle on/off animations
* Add toggle parameter
* Fix libunity bug: getUniqueId() was calling allocateId() incorrectly
* Remove osc_ctrl `client` global
* Fix transcribe.py text encoding
|
* Add VRLabs' World Constraint as a submodule
* Add animations for world constraint
* Add toggles for board
* Add libunity.py (no content yet)
* Support >30s transcription
* Add board FBX
|
Using OpenAI's whisper neural network, we can do local STT. Transcription
quality is good, system resource usage is minimal (1 GB VRAM), and latency
is much lower than cloud-based STT.
* Add transcribe.py
  * Creates 3 threads:
    * One saves mic audio to a buffer
    * One passes mic audio to the STT
    * One sends the transcribed text to the board
  * Main thread listens for input. Press enter to start a new message.
* Add osc_ctrl.sendMessageLazy, a simple diff-based message sending utility.
  * A little complexity: it only sends 1 empty cell per call, allowing us to
    quickly say new things without having to wait for the whole buffer to
    clear.
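
A rough sketch of the diff-based sending idea follows. The real osc_ctrl.sendMessageLazy is not shown here, so everything below is hypothetical: the function names, the use of a space for an empty cell, and the assumption that both strings are already padded to the board size. No OSC traffic is sent; the sketch only computes which cells would be updated on each call.

```python
EMPTY = " "  # assumed representation of an empty board cell


def lazy_updates(shown: str, target: str, max_empty: int = 1):
    """Return (index, char) pairs to send on this call.

    Cells that already match are skipped. Changed cells are sent, except
    that at most `max_empty` cells are cleared per call.
    """
    updates = []
    empties_sent = 0
    for i, (old, new) in enumerate(zip(shown, target)):
        if old == new:
            continue                  # diff: nothing to send for this cell
        if new == EMPTY:
            if empties_sent >= max_empty:
                continue              # defer further clearing to a later call
            empties_sent += 1
        updates.append((i, new))
    return updates


def apply_updates(shown: str, updates) -> str:
    """Simulate the board state after the updates arrive."""
    cells = list(shown)
    for i, ch in updates:
        cells[i] = ch
    return "".join(cells)


board = "hello"
for _ in range(3):
    board = apply_updates(board, lazy_updates(board, "hi   "))
    print(repr(board))
```

The new text shows up on the first call, while the stale cells after it are cleared one per call, which matches the trade-off the bullet describes.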
|