TaSTT.git/string_matcher.py, branch master

Free self-hosted STT for VRChat.
https://git.yummers.dev/TaSTT.git/atom?h=master
2022-12-18T01:51:12+00:00

Finish python virtual env
2022-12-18T01:51:12+00:00 yum yum.food.vr@gmail.com 2022-12-18T01:51:12+00:00
urn:sha1:ee8213d1d2c2008d2d996929500c9e87dac325a3

GUI can now download all TaSTT dependencies and install them into a virtual environment.
* Add buttons to check embedded python version & install dependencies
* Add class to wrap interacting with embedded Python
* Put all TaSTT python scripts into a folder

Rework input controls
2022-11-23T02:13:18+00:00 yum yum.food.vr@gmail.com 2022-11-22T23:36:19+00:00
urn:sha1:bd8b63a357bb374f5875f0fedf2d677589419810

Press joystick once to start recording, again to stop. When you start recording, any previous text on the board is cleared.
Add 2 visual indicators: one to indicate speech, another to indicate that audio is paging.

Tweak transcription again
2022-11-16T08:45:09+00:00 yum yum.food.vr@gmail.com 2022-11-16T08:45:09+00:00
urn:sha1:d2e06445c42b22d2b75f5da1980b7a8d833a9c5b

Works a little better on longer transcriptions while maintaining the same improved performance on short transcriptions.
We really need a benchmark to evaluate performance mechanically.

Clicking the left joystick resets the board.
2022-11-12T22:14:49+00:00 yum yum.food.vr@gmail.com 2022-11-12T22:14:49+00:00
urn:sha1:3b038d23ec7621e0164c1901b416bf77a27d8cf3

* Increase no speech probability threshold. This is what was preventing short transcriptions from working. We rely more on the avg logprob filter now.
* Remove string matching logic from transcribe. Now when we get 2 consecutive identical transcriptions, we commit the transcription. This *could* cause words to get cut off but in practice it doesn't seem to happen.
* Fix steamvr joystick click detection. Moving the joystick would also fire the event, which is not correct.
* Combine locks in transcribe.py.
* Remove "clear" vocal control.
* osc_ctrl.clear() resets last_message_encoded
* Remove osc_ctrl.sendMessage (unused)

License scrub
2022-11-11T05:21:07+00:00 yum yum.food.vr@gmail.com 2022-11-11T05:21:07+00:00
urn:sha1:772b44806a5f5da11cca74c99b59c3cf7d5ceae5

Begin auditing dependencies' licenses.

Update fonts
2022-11-08T08:09:59+00:00 yum yum.food.vr@gmail.com 2022-11-08T08:09:59+00:00
urn:sha1:2efc87a7180ec6e92127d22d1a3eb8c44fd392db

English, Japanese, Chinese, and Korean should look much better now. French, German, and Spanish look like shit now, because I haven't figured out how to best make Noto Sans stay within its bounding box.
* Use Noto Sans for most things
* Simplify how we enable unicode blocks & assign fonts to them
* Increase string matching window to 300. Works better in real-world test.

Fix matchStrings O(n^2) loop
2022-11-08T04:31:16+00:00 yum yum.food.vr@gmail.com 2022-11-08T04:31:16+00:00
urn:sha1:77c6f366b2f81c60ed67e2fa6dc92df451e4229c

This slides 2 windows across input strings, looking for a region where they are most similar. It then uses that region to stitch the strings together.

Since transcribe.py passes in a continuous transcription as the `old_text` argument, we can wind up spending a lot of time here. Constrain the area of the `old_text` argument that we look at to the most recent 50 characters. This should be good enough.

Also fix how we calculate levenshtein_distance. Uh... yeah, let's not talk about how it was before.

String matching no longer relies on spaces
2022-11-06T20:50:38+00:00 yum yum.food.vr@gmail.com 2022-11-06T20:50:38+00:00
urn:sha1:7146acb9d4ad751fc5ced411a2990d0aad17d08f

Add a `matchStrings` which does basically the same thing as `matchStringList` except it doesn't split the input at space boundaries. I think this should work better for Japanese and Chinese, since they don't use spaces. Doesn't seem to cause any accuracy regressions for English.

Also update the README.
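
The character-level matching described in the two `matchStrings` commits above can be illustrated with a minimal sketch. This is not the repository's code: `stitch_chars`, its `window` size of 8, and the concatenation fallback for short inputs are all hypothetical choices made for illustration. Only the idea of sliding two windows, scoring them by Levenshtein distance, and looking at just the most recent 50 characters of `old_text` comes from the commit messages.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via row-by-row DP."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def stitch_chars(old_text: str, new_text: str,
                 window: int = 8, tail: int = 50) -> str:
    """Hypothetical stitcher: join old_text and new_text where they overlap.

    Slides a `window`-character window over the last `tail` characters of
    old_text and over all of new_text, and joins at the pair of positions
    with the smallest edit distance (first such pair wins on ties).
    """
    if len(old_text) < window or len(new_text) < window:
        # Assumption: with too little text to compare, just concatenate.
        return old_text + new_text

    start = max(0, len(old_text) - tail)  # only look at the recent tail
    best = None  # (distance, index into old_text, index into new_text)
    for i in range(start, len(old_text) - window + 1):
        for j in range(len(new_text) - window + 1):
            d = levenshtein_distance(old_text[i:i + window],
                                     new_text[j:j + window])
            if best is None or d < best[0]:
                best = (d, i, j)

    _, i, j = best
    # Keep old text up to the match, then trust the new text from there on.
    return old_text[:i] + new_text[j:]


print(stitch_chars("the quick brown fox jumps over",
                   "brown fox jumps over the lazy dog"))
```

Capping the search at `tail` characters bounds the outer loop no matter how long the running transcription gets, which is the point of the O(n^2) fix. Because nothing here splits on spaces, the same sketch applies to text without word boundaries.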
Tweak continuous transcription
2022-10-28T02:15:48+00:00 yum yum.food.vr@gmail.com 2022-10-28T02:15:48+00:00
urn:sha1:113f2858016c252b97cac96eab454ee16b2dcda2

Stitching now uses a 6-word sliding window instead of a 4-word one. Seems to dramatically improve transcription quality.

De-scuff continuous transcription
2022-10-26T00:46:44+00:00 yum yum.food.vr@gmail.com 2022-10-26T00:46:44+00:00
urn:sha1:eefa14c431efa4e3bc16cafbcb004e41622c2411

Transcription stitching now occurs in word space, rather than in text space. This avoids problems where we accidentally duplicate or delete letters in the middle of words.

Factor out stitching into its own module and add a small handful of test cases. Hopefully if we hit problems in production, we can just grow this list and avoid regressions if we reimplement.
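
As a rough illustration of the word-space stitching and the growable test list mentioned in the two commits above, here is a small sketch. It is an assumption-laden stand-in, not the module from the repository: `stitch_words`, the mismatch-count scoring, the concatenation fallback, and the `CASES` table are all hypothetical. Only the 6-word window and the idea of matching whole words come from the commit messages.

```python
def stitch_words(old_text: str, new_text: str, window: int = 6) -> str:
    """Hypothetical word-space stitcher.

    Compares every `window`-word run of old_text against every
    `window`-word run of new_text, counts mismatched words, and joins
    the two texts at the best-matching pair (first pair wins on ties).
    """
    old_words = old_text.split()
    new_words = new_text.split()
    if len(old_words) < window or len(new_words) < window:
        # Assumption: too few words to align, so just append.
        return " ".join(old_words + new_words)

    best = None  # (mismatches, index into old_words, index into new_words)
    for i in range(len(old_words) - window + 1):
        for j in range(len(new_words) - window + 1):
            mismatches = sum(
                a != b
                for a, b in zip(old_words[i:i + window],
                                new_words[j:j + window])
            )
            if best is None or mismatches < best[0]:
                best = (mismatches, i, j)

    _, i, j = best
    # Whole words are kept or dropped, so letters inside a word are
    # never duplicated or deleted by the join.
    return " ".join(old_words[:i] + new_words[j:])


# A table of (old, new, expected) cases that can grow over time.
CASES = [
    ("one two three four five six seven eight",
     "three four five six seven eight nine ten",
     "one two three four five six seven eight nine ten"),
    # One word differs inside the overlap; the newer text wins.
    ("a b c d e f g h",
     "c d x f g h i j",
     "a b c d x f g h i j"),
    ("hello", "world", "hello world"),
]

for old, new, expected in CASES:
    assert stitch_words(old, new) == expected, (old, new)
```

Keeping the cases in a plain list means a failure seen in production can be added as one more tuple, and any reimplementation can be checked against the same list.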