| | Commit message | Author | Age |
|---|---|---|---|
| * | Fix audio normalization (HEAD, master) | yum | 2023-04-04 |
| | Normalization was putting audio onto range [0, 255], while it should have been on range [0, 1]. Also add AudioBuffer::save() to enable debugging audio issues. | | |
| * | begin work disabling vad | yum | 2023-03-17 |
| | | |||
| * | Fix beam search previous window conditioning | yum | 2023-03-07 |
| | Not all contexts had `prev_prompt`, causing most beams to misbehave. | | |
| * | Use logprobs, fix beam candidate selection | yum | 2023-03-03 |
| | Incorrect sort condition resulted in worst 5 beams being picked instead of best 5. Use log probabilities for joint probability calculation instead of linear probabilities. Long beams would have probabilities converge exponentially towards zero; now they converge linearly towards -INFINITY. Using both transcripts in Evaluation/setup.ps1, I see a small edit distance regression (~5%) using beam search vs. greedy. | | |
| * | Begin work on evaluation framework | yum | 2023-03-03 |
| | Need a way to verify that beam search is actually working better than greedy. | | |
| * | Finish beam search rough draft | yum | 2023-03-02 |
| | Seems to work. Doesn't crash. Lots of room for optimization and cleanup. | | |
| * | Continue work on beam search | yum | 2023-03-02 |
| | Define ContextImpl::Context, wrapping all the data used in decoding. Using a vector of these is much simpler than using N vectors of all the random stuff we need. | | |
| * | Begin work on beam search decoding | yum | 2023-02-27 |
| | ContextImpl.h puts prompts, previous prompts, probabilities, and probability IDs into vectors of size 1 or N_BEAMS, depending on the decoding strategy. Extend sampleBest and friends to return top N tokens, instead of just the top 1 token. | | |
| * | Add retainDuration option to CaptureParams | yum | 2023-02-26 |
| | This allows users to retain a suffix of the PCM buffer after a VAD segmentation event, reducing some instances of words being lost at the start of the next VAD window. | | |
| * | Normalize audio before sending to transcription layer | yum | 2023-02-26 |
| | Helps in cases where the speaker is speaking softly, or their mic gain is set low. | | |
| * | Frames with no VAD are shortened, not dropped | yum | 2023-02-26 |
| | On PCM buffers of length >= captureParams.dropStartSilence, a "no voice" VAD verdict would result in the PCM buffer being entirely cleared. The emergent behavior is that when VAD segments speech, words right after the segmentation window can frequently be dropped. By removing a prefix from the PCM buffer and clearing the VAD buffers, the transcription algorithm has access to "leading" frames before the frames which triggered VAD. This reduces cases where words are omitted in the middle of long statements. | | |
| * | Readme | Konstantin | 2023-02-18 |
| | | |||
| * | When token timestamps are requested, disabled streaming in C++ CLI example | Konstantin | 2023-02-14 |
| | | |||
| * | Restored missing token-level timestamps experimental feature | Konstantin | 2023-02-14 |
| | | |||
| * | Build instructions | Konstantin | 2023-02-14 |
| | | |||
| * | Readme | Konstantin | 2023-02-08 |
| | | |||
| * | API documentation | Konstantin | 2023-02-08 |
| | | |||
| * | Release automation | Konstantin | 2023-02-07 |
| | | |||
| * | Version 1.7 | Konstantin | 2023-02-07 |
| | | |||
| * | Cleaned up unused dialog template | Konstantin | 2023-02-07 |
| | | |||
| * | Comments | Konstantin | 2023-02-06 |
| | | |||
| * | Comments | Konstantin | 2023-02-03 |
| | | |||
| * | Minor, code generation | Konstantin | 2023-02-03 |
| | | |||
| * | Comments | Konstantin | 2023-02-03 |
| | | |||
| * | Refactor, removed a redundant function | Konstantin | 2023-02-03 |
| | | |||
| * | Bugfix, incorrect output of command-line examples when launched with multiple input files | Konstantin | 2023-02-03 |
| * | C++ console example, static link to C++ runtime | Konstantin | 2023-02-03 |
| | | |||
| * | C++ console example now outputs text files when asked | Konstantin | 2023-02-03 |
| | | |||
| * | Minor, C++ example | Konstantin | 2023-02-03 |
| | | |||
| * | UX bugfix, microphone C# example | Konstantin | 2023-02-03 |
| | | |||
| * | Bugfix, addRepeatEx compute shader | Konstantin | 2023-02-03 |
| | | |||
| * | Minor, .NET wrapper | Konstantin | 2023-02-03 |
| | | |||
| * | Minor, API documentation | Konstantin | 2023-02-03 |
| | | |||
| * | Comments | Konstantin | 2023-02-03 |
| | | |||
| * | Desktop app version 1.6.1 | Konstantin | 2023-01-30 |
| | | |||
| * | “Text with timestamps” output format option | Konstantin | 2023-01-30 |
| | | |||
| * | Added `*.m4a` file extension to the browse dialog | Konstantin | 2023-01-30 |
| | | |||
| * | Better performance of C++ samples on laptops with two graphics cards | Konstantin | 2023-01-30 |
| | Untested | | |
| * | Readme | Konstantin | 2023-01-29 |
| | | |||
| * | Comments | Konstantin | 2023-01-29 |
| | | |||
| * | Version 1.6 | Konstantin | 2023-01-29 |
| | | |||
| * | C# microphone example, diarize integration | Konstantin | 2023-01-29 |
| | | |||
| * | C# console example, diarize feature | Konstantin | 2023-01-29 |
| | | |||
| * | Minor, C++ console example | Konstantin | 2023-01-29 |
| | | |||
| * | Diarize feature for buffered audio | Konstantin | 2023-01-28 |
| | | |||
| * | Minor, micro-optimization | Konstantin | 2023-01-28 |
| | | |||
| * | Diarize feature, initial version | Konstantin | 2023-01-28 |
| | | |||
| * | Bugfix, stereo PCM handling | Konstantin | 2023-01-28 |
| | | |||
| * | DLL API for diarize feature | Konstantin | 2023-01-28 |
| | | |||
| * | Bugfix, compilation error in C++ console example | Konstantin | 2023-01-28 |
| | | |||
