author yum <yum.food.vr@gmail.com> 2023-12-20 22:38:24 -0800
committer yum <yum.food.vr@gmail.com> 2023-12-20 22:38:24 -0800
commit ca55539295c6d533f0d38ed579483555390cde9b (patch)
tree 03fc8aa015e653d7840a33c3977a4df1b9a6e043 /README.md
Initial commit
Check in a shit ton of code. Most of the audio processing logic in `app.py` is lifted/ported from github.com/yum_food/TaSTT. I made some adjustments to make it work better (removing normalization, adding volume filters) and also increase fidelity.
Diffstat (limited to 'README.md')
-rw-r--r-- README.md 65
1 files changed, 65 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..afc1131
--- /dev/null
+++ b/README.md
@@ -0,0 +1,65 @@
+## yapBox
+
+A black box for your yapping.
+
+This app records your mic. It uses silero-vad to split audio into contiguous
+segments of speech, and saves them to disk as .wav files. Metadata is
+saved to a corresponding .yaml file.
+
+What's a black box? Wikipedia says this:
+```
+A flight recorder is an electronic recording device placed in an aircraft for
+the purpose of facilitating the investigation of aviation accidents and
+incidents. The device may often be referred to colloquially as a "black box",
+an outdated name which has become a misnomer—they are now required to be
+painted bright orange, to aid in their recovery after accidents.
+```
+
+This is a CLI app. It is not polished and requires a little elbow grease to
+use properly. The intent is to assist people who want to gather high-quality
+training data of human voices. Use responsibly.
+
+## Compatibility
+
+This application is designed for Windows 10. Functionality on any other
+platform is purely coincidental.
+
+## Running
+
+Download the latest release and double click `app.bat` in File Explorer.
+
+Read the output and change the mic to whatever you're using. To change mics,
+edit `app.py`. Any text editor works, including Notepad.
+
+## Building from source
+
+First install Python 3.10.9. Make sure that PowerShell is using that version by
+typing this (leave out the `$`; it only marks which lines are commands and
+which are output):
+```
+$ python.exe --version
+Python 3.10.9
+```
+
+Then open PowerShell and run `package.ps1`:
+```
+$ cd pkg
+$ ./package.ps1
+```
+
+All dependencies should download themselves. The script uses the host Python
+to install dependencies into the app's environment.
+
+## Ethics
+
+We are living in the wild west of AI. You can clone anyone's voice and
+plausibly reproduce it using projects like
+[rvc-beta](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/releases).
+Legislation has not caught
+up to this yet. Cloning someone's voice without their consent is, at best,
+ethically dubious. This tool makes that process easier. In the absence of a
+legal framework, you must make your own choices as to what is right. Take this
+seriously. When in doubt, follow Kant's [universalization
+principle](https://en.wikipedia.org/wiki/Universalizability) and the [golden
+rule](https://en.wikipedia.org/wiki/Golden_Rule).
+
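
The README above describes a pipeline: split recorded audio into contiguous segments of speech, then save each segment as a .wav file with a matching .yaml metadata file. The following is a minimal Python sketch of that idea, not the code in `app.py`, and it does not call silero-vad. It assumes some VAD has already produced one speech probability per audio frame. The names `speech_segments`, `save_segment`, `threshold`, `max_gap`, and the metadata fields are hypothetical, chosen only for illustration.

```python
import wave
from pathlib import Path


def speech_segments(probs, threshold=0.5, max_gap=2):
    """Group per-frame speech probabilities into (start, end) frame ranges.

    `end` is exclusive. Runs of up to `max_gap` non-speech frames are bridged
    so a short pause does not split one utterance in two.
    (Hypothetical helper; thresholds are illustrative assumptions.)
    """
    segments = []
    start = None        # first frame of the segment being built
    last_speech = None  # most recent frame classified as speech
    for i, p in enumerate(probs):
        if p >= threshold:
            if start is None:
                start = i
            last_speech = i
        elif start is not None and i - last_speech > max_gap:
            # The pause is too long: close the current segment.
            segments.append((start, last_speech + 1))
            start = None
    if start is not None:
        segments.append((start, last_speech + 1))
    return segments


def save_segment(pcm_bytes, out_dir, name, sample_rate=16000):
    """Write 16-bit mono PCM to <name>.wav plus a flat <name>.yaml sidecar.

    The metadata fields are assumptions, not the format yapBox uses.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    wav_path = out_dir / f"{name}.wav"
    with wave.open(str(wav_path), "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wf.setframerate(sample_rate)
        wf.writeframes(pcm_bytes)
    duration_s = len(pcm_bytes) / 2 / sample_rate
    meta = {
        "file": wav_path.name,
        "sample_rate": sample_rate,
        "duration_s": round(duration_s, 3),
    }
    # Hand-written YAML: only safe for flat keys with simple scalar values.
    yaml_path = out_dir / f"{name}.yaml"
    yaml_path.write_text("".join(f"{k}: {v}\n" for k, v in meta.items()))
    return wav_path, yaml_path
```

Bridging short gaps keeps natural pauses inside one clip, which tends to matter for voice training data. A real implementation would more likely use a YAML library than format the sidecar by hand.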