| Commit message (Collapse) | Author | Age |
|
Also:
* Double # of audio device slots
* Fetch cuDNN from NVIDIA at runtime instead of vendoring
|
This should be significantly more efficient than prior versions.
* add large-v3 & distilled variant
* simplify model acquisition code now that distilled models are part of
faster-whisper.
|
This reverts commit 921b92a69f36502dc5eefd14ba3487c1bb49bb9d.
|
Seems much faster than faster-whisper.
There are two issues:
* Requires NVIDIA 3000 series or higher.
* Incompatible with faster-whisper dependencies.
So it seems like we'll either need to toggle between two sets of
dependencies at runtime or have two environments.
|
0.17.x are breaking faster_whisper's ability to download models.
Also:
* Start using frozen requirements.txt.
* Conditionally install torch & legacy whisper only when doing
mechanical optimization.
|
Also:
* Enable SO_REUSEADDR on browser src socket
* Temporarily add evaluation dependencies to requirements.txt
* Fix browser src. It now looks for a prefix that the Python app
actually uses.
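
A minimal sketch of enabling SO_REUSEADDR on a listening socket with Python's standard `socket` module. The helper name `make_listener`, the loopback host, and the port are illustrative assumptions, not taken from the actual browser src code.

```python
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a TCP listener with SO_REUSEADDR set (hypothetical helper).

    Setting the option before bind() lets a restarted app rebind the same
    port while old connections are still in TIME_WAIT.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))  # port=0 asks the OS for any free port
    sock.listen()
    return sock
```

The option has to be set before `bind()`; setting it afterwards has no effect on that bind.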
|
A set of proper interfaces is called for. See #dev-update-spam in
discord for drawing of design.
Also add code to mechanically optimize committer parameters using an
audio file. Not perfectly repeatable since it depends on the performance
characteristics of the machine, but prob better than what we had before
(nothing).
|
openxr doesn't have any notion of background process, making it unusable
trash :)
|
I think this improves the code structure of the controller input thread
and removes some duplication, so I'm going to keep it. However, the
intended purpose was to decrease lag when pressing buttons, and in that
regard it failed.
The lag goes all the way down to the input layer, implying that the
input thread is not able to consistently run at its intended 100 Hz
sample rate. I suspect that the Python global interpreter lock (GIL) is
at fault.
Since we can't realistically move all our functionality into one thread
in a non-blocking model, I think multiprocessing is the logical choice
going forward. Each thread in transcribe.py would become its own
process, and the processes would communicate via pub/sub through an
intermediary process sitting in the middle.
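
A rough sketch of the proposed layout, assuming a broker process that forwards `(topic, payload)` tuples between `multiprocessing` queues. The names `broker`, `print_worker`, `run_demo`, the topic strings, and the `None` sentinel are all hypothetical; nothing here is taken from transcribe.py.

```python
import multiprocessing as mp

STOP = None  # sentinel: tells the broker and every subscriber to shut down


def broker(inbox, subscribers):
    """Forward each (topic, payload) from inbox to that topic's subscribers.

    `subscribers` maps a topic name to a list of queue-like objects.
    """
    while True:
        msg = inbox.get()
        if msg is STOP:
            # Propagate shutdown to every subscriber queue, then exit.
            for queues in subscribers.values():
                for q in queues:
                    q.put(STOP)
            return
        topic, _payload = msg
        for q in subscribers.get(topic, []):
            q.put(msg)


def print_worker(name, inbox):
    """Hypothetical subscriber: print messages until the sentinel arrives."""
    while (msg := inbox.get()) is not STOP:
        print(name, msg)


def run_demo():
    """Wire a publisher (this process), the broker, and one subscriber."""
    to_broker, to_worker = mp.Queue(), mp.Queue()
    procs = [
        mp.Process(target=broker, args=(to_broker, {"input": [to_worker]})),
        mp.Process(target=print_worker, args=("ui", to_worker)),
    ]
    for p in procs:
        p.start()
    to_broker.put(("input", "button_down"))
    to_broker.put(STOP)
    for p in procs:
        p.join()
```

To try it, call `run_demo()` under an `if __name__ == "__main__":` guard, which `multiprocessing` needs on platforms that spawn rather than fork. Because each process has its own interpreter and its own GIL, a busy transcription process would not stall the input sampler in this layout, at the cost of pickling every message.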
|
pyopenvr is both deprecated and buggy, so switch to pyopenxr.
|
Common hallucinations sneak in around -0.9 avg_logprob.
Also:
* Limit temperatures to just 0.0. Multiple values cause latency to
occasionally spike.
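
A minimal sketch of filtering segments by average log-probability. The `Segment` dataclass here is a stand-in with the same `text` and `avg_logprob` fields that whisper-style segments carry, and the default cutoff of -0.8 is an assumed value chosen to sit just above the -0.9 region mentioned above, not the value the commit actually uses.

```python
from dataclasses import dataclass

# Assumed cutoff: slightly stricter than the ~-0.9 region where
# hallucinations were observed. The real value may differ.
MIN_AVG_LOGPROB = -0.8


@dataclass
class Segment:
    """Stand-in for a transcription segment (hypothetical type)."""
    text: str
    avg_logprob: float


def drop_likely_hallucinations(segments, min_avg_logprob=MIN_AVG_LOGPROB):
    """Keep only segments whose avg_logprob is strictly above the cutoff."""
    return [s for s in segments if s.avg_logprob > min_avg_logprob]
```

Raising the cutoff trades a few dropped low-confidence but correct segments for fewer hallucinated ones, so it is worth tuning against real recordings.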
|
NLLB needs its input to be split up into sentences. I use the
sentence_splitter Python package to do this. It supports ~20 Western
European languages, but notably, no Asian languages.
* Sort spoken language list. English is still at the top.
* Remove 'Translation source' dropdown. Infer this from the spoken
language.
* Add lang_compat.py to map language codes between the various libraries
(whisper, nllb, sentence_splitter).
* Fix bug where old text would appear in textbox when you first bring it
up.
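
A small sketch of what a language-code compatibility table could look like. The table contents and the function names `to_nllb` and `can_split` are illustrative assumptions, not the contents of lang_compat.py; the NLLB values follow the FLORES-200 code style that NLLB uses.

```python
# Whisper-style two-letter code -> NLLB (FLORES-200) code.
# Only a few illustrative entries; a real table would be much longer.
WHISPER_TO_NLLB = {
    "en": "eng_Latn",
    "fr": "fra_Latn",
    "de": "deu_Latn",
    "ja": "jpn_Jpan",
}

# Codes assumed to be supported by the sentence splitter.
# Japanese is deliberately absent, matching the limitation noted above.
SENTENCE_SPLITTER_LANGS = {"en", "fr", "de"}


def to_nllb(whisper_code):
    """Translate a whisper-style code to an NLLB code, or raise KeyError."""
    try:
        return WHISPER_TO_NLLB[whisper_code]
    except KeyError:
        raise KeyError(f"no NLLB mapping for {whisper_code!r}") from None


def can_split(whisper_code):
    """Report whether sentence splitting is available for this language."""
    return whisper_code in SENTENCE_SPLITTER_LANGS
```

With a table like this, inferring the translation source from the spoken language is a single `to_nllb()` lookup, and `can_split()` tells the caller when it has to pass text to the translator unsplit.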
|
Use Meta's No Language Left Behind (NLLB) model to provide
translation capabilities into 200 languages. Obviously most are very
untested.
This requires either 4.1 or 7.1 GB of RAM and significantly increases
transcription latency.
|
Users can now configure a keybind to start/stop/dismiss the STT when in
desktop mode. The default keybind is ctrl+x, since by default VRC
doesn't use 'x' for anything.
|
faster-whisper doesn't need it. This reduces install size from 6.00GB
with base.en model to 1.70GB.
* Use a single sampler in shader (enables using more than 16 textures)
* Minor legibility regression - need to improve AA.
* Enable backface culling in shader (minor performance win)
|
I'm able to use the new code to show text in game. Not yet play-tested.
|
This is a much faster, lower-VRAM reimplementation of Whisper in Python.
Early testing is extremely promising: fast transcription speed,
extremely low resource usage (CPU/RAM/VRAM), high accuracy.
|
Need python310._pth, specifically its 'import site' line, for
embedded Python + pip to get along.
|
Ruling out possibilities for a user-reported bug.
|
The --extra-index-url must appear *before* the dependency in this file.
|
This seems to be the canonical way of listing a Python app's
dependencies.
* Installing dependencies no longer hangs the GUI
|