From 6f2c1dace46a68620bc61a732a2f43252bd5d3ba Mon Sep 17 00:00:00 2001
From: yum
Date: Thu, 22 Dec 2022 14:49:44 -0800
Subject: Document encoding optimization

By sending encoded words rather than letters, we could speed up the
English paging rate by 2.5x over an optimized char-based implementation.

Word-encoded implementation:
  16 bits per word (capped at 64k possible words).

Optimized char-based implementation:
  (5.7 chars per word) * (7 bits per char) == 39.9 bits per word
  2.5x slower than word encoding.

Today's char-based implementation:
  (5.7 chars per word) * (16 bits per char) == 91.2 bits per word
  5.7x slower than word encoding.

Here 5.7 chars per word is the 4.7-letter average word plus one space.
---
 README.md | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/README.md b/README.md
index 98a900a..1850b8d 100644
--- a/README.md
+++ b/README.md
@@ -8,6 +8,7 @@ custom shader display the text in game.
 ![Speech-to-text demo](Images/speech_to_text_demo.gif)
 
 Contents:
+
 0. [Usage and setup](#usage-and-setup)
 1. [Features](#features)
 2. [Motivation](#motivation)
@@ -179,6 +180,15 @@ Contributions welcome. Send a pull request to this repository.
       This should significantly cut down on idle resource consumption. Perhaps
       there's even a more efficient way to detect the odds that anything is being
       said, which we could use to gate transcription.~~ DONE
+   5. There are ~64k words in the English language. We could encode each word
+      using a 16-bit int. On the other hand, suppose you represented each
+      character using 7 bits per character and transmitted words
+      character-by-character. The average word length is 4.7 characters, and we
+      send ~1 space character per word. Thus the expected bits per word in an
+      optimized version of today's encoding scheme is (5.7 * 7) == 39.9 bits.
+      The other encoding scheme is thus ~2.5 times more efficient. This could
+      be used to significantly speed up sync times. (Thanks, Noppers for the
+      idea!)
 5. Bugfixes
    1. ~~The whisper STT says "Thank you." when there's no audio?~~ DONE
 6. Shine
-- 
cgit v1.2.3
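
The bits-per-word arithmetic in the commit message can be checked with a short Python sketch. It also shows one possible way to pack words as 16-bit vocabulary indices. The names `encode_words`, `decode_words`, `WORDS`, and `VOCAB` are hypothetical, and the little-endian byte layout is an assumption; nothing here is taken from the project's actual paging code.

```python
import struct

AVG_WORD_LETTERS = 4.7                  # average word length quoted in the README entry
CHARS_PER_WORD = AVG_WORD_LETTERS + 1   # plus ~1 space per word, so 5.7 chars
WORD_BITS = 16                          # one 16-bit index per word (at most 2**16 words)


def bits_per_word_char_based(bits_per_char: int) -> float:
    """Expected bits needed to send one word character by character."""
    return CHARS_PER_WORD * bits_per_char


optimized = bits_per_word_char_based(7)   # about 39.9 bits per word
current = bits_per_word_char_based(16)    # about 91.2 bits per word
print(f"optimized char-based: {optimized:.1f} bits, "
      f"{optimized / WORD_BITS:.1f}x the cost of word encoding")
print(f"current char-based:   {current:.1f} bits, "
      f"{current / WORD_BITS:.1f}x the cost of word encoding")


def encode_words(text: str, vocab: dict[str, int]) -> bytes:
    """Pack each word as a little-endian unsigned 16-bit vocabulary index.

    Raises KeyError for words missing from the vocabulary; a real scheme
    would need a fallback (for example, spelling the word out).
    """
    indices = [vocab[word] for word in text.split()]
    return struct.pack(f"<{len(indices)}H", *indices)


def decode_words(payload: bytes, words: list[str]) -> str:
    """Inverse of encode_words: unpack the indices and look the words up."""
    count = len(payload) // 2
    return " ".join(words[i] for i in struct.unpack(f"<{count}H", payload))


# Hypothetical toy vocabulary; a real one could hold up to 65,536 entries.
WORDS = ["hello", "world", "thank", "you"]
VOCAB = {word: index for index, word in enumerate(WORDS)}
```

With this toy vocabulary, `encode_words("thank you", VOCAB)` produces 4 bytes (two 16-bit indices) where the same text sent as 16-bit characters would take 18 bytes. The space between words is implied by the word boundary, which is why the char-based estimates count one extra character per word.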