author yum <yum.food.vr@gmail.com> 2023-09-03 13:23:50 -0700
committer yum <yum.food.vr@gmail.com> 2023-09-03 13:23:50 -0700
commit 606d223f8ba9174a2984d7cb15e6e94ef6e48228
tree afd1b19fe801d9aac54b4e5bbe4a671e5df2217c /GUI/Libraries
parent e9b5b4f1da2a8ff07b2d13e5e63dae491325251d
Experiment with Collector filters
Try adding two filters on top of the usual AudioCollector:
* Minimum length preservation: never report fewer than N seconds' worth of audio data. Pad with silence as needed.
* Volume normalizing: normalize audio volume.

Using my benchmark of 30-second audio clips from 3 speakers (lower is better):
  length enf + norm = 87.118
  nothing           = 90.917
  norm              = 94.538
  length            = 111.402

Both filters together are a slight improvement over no filtering, but each one on its own degrades the result by a lot. I also observed more hallucinations in a conversational pattern when using them vs. not. So I'll phase them out.

I'm still curious about *compression* as opposed to normalization.
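For readers who want a concrete picture of the two filters, here is a minimal sketch. It is not the code from this commit: the function names `pad_to_min_length` and `normalize_peak` are hypothetical, audio is assumed to be a list of float samples in [-1.0, 1.0], silence is assumed to be appended at the end, and "normalize" is assumed to mean peak normalization (the commit message does not say which kind was used).

```python
from typing import List


def pad_to_min_length(samples: List[float], sample_rate: int,
                      min_seconds: float) -> List[float]:
    """Return a copy of samples holding at least min_seconds of audio.

    Assumption: silence (0.0) is appended at the end. The commit does not
    say where the padding goes.
    """
    min_samples = int(sample_rate * min_seconds)
    shortfall = max(0, min_samples - len(samples))
    return list(samples) + [0.0] * shortfall


def normalize_peak(samples: List[float],
                   target_peak: float = 1.0) -> List[float]:
    """Scale samples so the largest absolute value equals target_peak.

    Assumption: peak normalization. RMS or loudness normalization would be
    equally plausible readings of "normalize audio volume".
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Pure silence or empty input: nothing to scale.
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]


# Applying both filters, in one possible order (normalize, then pad).
clip = [0.5, -0.25]
filtered = pad_to_min_length(normalize_peak(clip), sample_rate=4,
                             min_seconds=1.0)
```

Normalizing before padding keeps the appended silence from affecting the gain; with peak normalization the order makes no difference, but it would for an RMS-based variant.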
Diffstat (limited to 'GUI/Libraries')
0 files changed, 0 insertions, 0 deletions