Whisper
These notes cover using whisper and whisper.cpp, and how to train a new model that can be used by either of them.
Creating a Custom Model
Creating the Training Data
Training needs audio files in the correct format, text files (CSV) with the text for each audio file, and a model file which lists how to match them up.
Making Audio Files
Record the files on whatever device you are likely to use. Recognition should improve if the recording source is the same one used for real speech (this is an assumption). The recordings can be converted using ffmpeg, e.g.
ffmpeg -i ~/Documents/_record_filename.???_ -ar 16k ExampleText.wav
The output must be a WAV file with a 16 kHz sample rate so that whisper.cpp can be run on it to test the speech recognition (useful for creating training data).
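With more than a handful of recordings, the conversion can be looped. The following is a minimal sketch, not a tested workflow: the function names wav_name and convert_all and the two directory arguments are hypothetical, and it assumes every regular file in the source directory is a recording that ffmpeg can read.

```shell
#!/bin/sh
# Sketch: convert every recording in a directory to a 16 kHz WAV file.
# wav_name, convert_all and both directory arguments are illustrative names.

# Print the output file name: same base name, extension replaced by .wav
wav_name() {
    base=$(basename "$1")
    printf '%s.wav\n' "${base%.*}"
}

# convert_all SRC_DIR OUT_DIR
convert_all() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir"
    for f in "$src_dir"/*; do
        # Skip subdirectories and the unexpanded pattern of an empty directory
        [ -f "$f" ] || continue
        # -nostdin stops ffmpeg reading the terminal inside the loop;
        # -y overwrites an existing output file without asking
        ffmpeg -nostdin -loglevel error -y -i "$f" -ar 16k "$out_dir/$(wav_name "$f")"
    done
}
```

After sourcing the file, a call such as convert_all ~/Documents/recordings wav (both paths are placeholders) would write one .wav per recording into the wav directory.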
cd WebDev/whisper.cpp
./main -f ExampleText.wav --model models/ggml-small.en.bin
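To check a whole directory of converted files, the same command can be run in a loop. This is a sketch under assumptions: transcribe_all, WHISPER_BIN and MODEL are hypothetical names introduced here, it is meant to be run from the whisper.cpp directory, and it uses -f (the input-file option listed in whisper.cpp's usage text) together with -otxt, which asks whisper.cpp to also save each transcript as a text file.

```shell
#!/bin/sh
# Sketch: run whisper.cpp on every .wav file in a directory.
# transcribe_all, WHISPER_BIN and MODEL are illustrative names.

# transcribe_all WAV_DIR
transcribe_all() {
    wav_dir=$1
    # Defaults mirror the single-file command in these notes
    bin=${WHISPER_BIN:-./main}
    model=${MODEL:-models/ggml-small.en.bin}
    for wav in "$wav_dir"/*.wav; do
        # Skip the unexpanded pattern when no .wav files are present
        [ -f "$wav" ] || continue
        # -otxt additionally writes the transcript to a text file
        "$bin" --model "$model" -f "$wav" -otxt
    done
}
```

The saved transcripts could then be corrected by hand and used as the text for each audio file when building the training data.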