From 600cb13f9f4b02d9030f99fc379bdebebb64b65d Mon Sep 17 00:00:00 2001
From: Konstantin
Date: Sun, 29 Jan 2023 16:34:03 +0100
Subject: Readme

---
 Readme.md | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/Readme.md b/Readme.md
index 59d875d..8da9d30 100644
--- a/Readme.md
+++ b/Readme.md
@@ -132,9 +132,6 @@ and [explicit FP16](https://github.com/microsoft/DirectXShaderCompiler/wiki/16-B
 
 Automatic language detection is not implemented.
 
-The original version implements “diarize” feature, they analyze stereo PCM to detect speaker based on the difference between left/right channels.
-Despite my version preserves stereo PCM data over the pipeline, it doesn’t expose that data.
-
 In the current version there’s high latency for realtime audio capture.
 Specifically, depending on voice detection the figure is about 5-10 seconds.
 At least in my tests, the model wasn’t happy when I supplied too short pieces of the audio.
-- 
cgit v1.2.3