| | Commit message | Author | Age |
|---|---|---|---|
| * | Fix audio normalization (HEAD, master) | yum | 2023-04-04 |
| | Normalization was putting audio onto range [0, 255], while it should have been on range [0, 1]. Also add AudioBuffer::save() to enable debugging audio issues. | | |
| * | begin work disabling vad | yum | 2023-03-17 |
| | | |||
| * | Fix beam search previous window conditioning | yum | 2023-03-07 |
| | Not all contexts had `prev_prompt`, causing most beams to misbehave. | | |
| * | Use logprobs, fix beam candidate selection | yum | 2023-03-03 |
| | Incorrect sort condition resulted in worst 5 beams being picked instead of best 5. Use log probabilities for joint probability calculation instead of linear probabilities. Long beams would have probabilities converge exponentially towards zero; now they converge linearly towards -INFINITY. Using both transcripts in Evaluation/setup.ps1, I see a small edit distance regression (~5%) using beam search vs. greedy. | | |
| * | Begin work on evaluation framework | yum | 2023-03-03 |
| | Need a way to verify that beam search is actually working better than greedy. | | |
| * | Finish beam search rough draft | yum | 2023-03-02 |
| | Seems to work. Doesn't crash. Lots of room for optimization and cleanup. | | |
| * | Continue work on beam search | yum | 2023-03-02 |
| | Define ContextImpl::Context, wrapping all the data used in decoding. Using a vector of these is much simpler than using N vectors of all the random stuff we need. | | |
| * | Begin work on beam search decoding | yum | 2023-02-27 |
| | ContextImpl.h puts prompts, previous prompts, probabilities, and probability IDs into vectors of size 1 or N_BEAMS, depending on the decoding strategy. Extend sampleBest and friends to return the top N tokens, instead of just the top 1 token. | | |
| * | Add retainDuration option to CaptureParams | yum | 2023-02-26 |
| | This allows users to retain a suffix of the PCM buffer after a VAD segmentation event, reducing some instances of words being lost at the start of the next VAD window. | | |
| * | Normalize audio before sending to transcription layer | yum | 2023-02-26 |
| | Helps in cases where the speaker is speaking softly, or their mic gain is set low. | | |
| * | Frames with no VAD are shortened, not dropped | yum | 2023-02-26 |
| | On PCM buffers of length >= captureParams.dropStartSilence, a "no voice" VAD verdict would result in the PCM buffer being entirely cleared. The emergent behavior is that when VAD segments speech, words right after the segmentation window can frequently be dropped. By removing a prefix from the PCM buffer and clearing the VAD buffers, the transcription algorithm has access to "leading" frames before the frames which triggered VAD. This reduces cases where words are omitted in the middle of long statements. | | |
| * | Restored missing token-level timestamps experimental feature | Konstantin | 2023-02-14 |
| | | |||
| * | Version 1.7 | Konstantin | 2023-02-07 |
| | | |||
| * | Comments | Konstantin | 2023-02-03 |
| | | |||
| * | Bugfix, incorrect output of command-line examples when launched with ↵ | Konstantin | 2023-02-03 |
| | | | | | multiple input files | ||
| * | Version 1.6 | Konstantin | 2023-01-29 |
| | | |||
| * | Diarize feature for buffered audio | Konstantin | 2023-01-28 |
| | | |||
| * | Minor, micro-optimization | Konstantin | 2023-01-28 |
| | | |||
| * | Diarize feature, initial version | Konstantin | 2023-01-28 |
| | | |||
| * | Bugfix, stereo PCM handling | Konstantin | 2023-01-28 |
| | | |||
| * | DLL API for diarize feature | Konstantin | 2023-01-28 |
| | | |||
| * | Version 1.5 | Konstantin | 2023-01-24 |
| | | |||
| * | Performance tuning on AMD iGPU | Konstantin | 2023-01-24 |
| | | |||
| * | Minor, micro-optimization | Konstantin | 2023-01-23 |
| | | |||
| * | Performance improvement, no longer destroying temporary buffers in ↵ | Konstantin | 2023-01-23 |
| | | | | | `encode()` method | ||
| * | Improved VRAM memory management, both speed and memory usage | Konstantin | 2023-01-23 |
| | | |||
| * | Minor, performance and VRAM use | Konstantin | 2023-01-23 |
| | | |||
| * | Minor, micro-optimization | Konstantin | 2023-01-23 |
| | | |||
| * | Performance improvement, `softMax` shader | Konstantin | 2023-01-23 |
| | | |||
| * | Minor, profiler tags | Konstantin | 2023-01-23 |
| | | |||
| * | VAD CPU performance, slightly better code generation | Konstantin | 2023-01-23 |
| | | |||
| * | GPU performance, optimized away a few shader dispatches | Konstantin | 2023-01-22 |
| | | |||
| * | Experimental, alternative busy wait implementation | Konstantin | 2023-01-21 |
| | Disabled with a `constexpr` flag because on a desktop with discrete GPU this slowed down by about 20%. But the CPU load is about zero. Need to test on iGPUs, thermal shenanigans might make a difference there. | | |
| * | Minor, CPU performance | Konstantin | 2023-01-21 |
| | | |||
| * | CPU performance, SSE vectorization for MEL spectrogram | Konstantin | 2023-01-21 |
| | | |||
| * | Version 1.4 | Konstantin | 2023-01-20 |
| | | |||
| * | Minor, error handling | Konstantin | 2023-01-20 |
| | | |||
| * | Version 1.3 | Konstantin | 2023-01-19 |
| | | |||
| * | Workaround for the Microsoft’s bug in their MP3 decoder MFT | Konstantin | 2023-01-19 |
| | | |||
| * | Version 1.2 | Konstantin | 2023-01-18 |
| | | |||
| * | Minor, logging and UX | Konstantin | 2023-01-18 |
| | | |||
| * | Optional startup flags to override performance-related defaults for the ↵ | Konstantin | 2023-01-18 |
| | | | | | compute shaders | ||
| * | Consistent cancellation API across the library: S_OK = continue, S_FALSE = stop | Konstantin | 2023-01-18 |
| | | |||
| * | Minor, optimized away memcpy() when running audio capture | Konstantin | 2023-01-17 |
| | | |||
| * | Comment | Konstantin | 2023-01-16 |
| | | |||
| * | Comments | Konstantin | 2023-01-16 |
| | | |||
| * | DLL version 1.1 | Konstantin | 2023-01-16 |
| | | |||
| * | Bugfix: when processing files, “Run” CPU block was erroneously measured ↵ | Konstantin | 2023-01-16 |
| | | | | | twice | ||
| * | Bugfix, failed C++ with the lack of move constructor in the CPU profiler ↵ | Konstantin | 2023-01-16 |
| | | | | | RAII class | ||
| * | Comment | Konstantin | 2023-01-16 |
| | | |||

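The two audio normalization commits (2023-02-26 and 2023-04-04) say that audio is normalized before transcription and that the target range should be [0, 1] rather than [0, 255]. The log does not show how the normalization is computed, so the sketch below assumes simple peak normalization of floating-point PCM, where the loudest sample is scaled to magnitude 1.0. The function name `normalizePeak` is hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Scale samples in place so the largest absolute value becomes 1.0.
// Assumption: peak normalization; the project's actual method may differ.
// A buffer of pure silence is left unchanged to avoid dividing by zero.
inline void normalizePeak(std::vector<float>& pcm) {
    float peak = 0.0f;
    for (float s : pcm)
        peak = std::max(peak, std::fabs(s));
    if (peak <= 0.0f)
        return;
    const float gain = 1.0f / peak;
    for (float& s : pcm)
        s *= gain;
}
```

With this approach a quiet recording, for example one peaking at 0.25, is boosted by a factor of 4, which matches the stated motivation of helping soft speakers and low mic gain.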