Releases: patrickenfuego/Chapterize-Audiobooks
Future Support for Multiple Languages
What's New
I recently realized that the project supports downloading multiple language models, but the actual programming logic was written for English. I fixed that in this release by making excluded phrases and chapter markers dynamic based on the language specified (either via CLI or in defaults.toml).
For example: if German is specified as the language, the script will search for and import German-specific excluded phrases and chapter markers if they exist. If they don't exist (meaning nothing has been contributed), an error is thrown.
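The lookup described above can be pictured with a minimal sketch. The dictionaries, the German entries, and the function name `load_language_data` are illustrative assumptions, not the project's actual data or API:

```python
# Hypothetical per-language data; the real project may store and name these differently.
EXCLUDED_PHRASES = {
    "en-us": ["this chapter", "chapter and verse"],
    "de": ["dieses kapitel"],
}
CHAPTER_MARKERS = {
    "en-us": ["prologue", "chapter", "epilogue"],
    "de": ["prolog", "kapitel", "epilog"],
}


def load_language_data(language: str) -> tuple[list[str], list[str]]:
    """Return (excluded_phrases, chapter_markers) for a language code.

    Raises ValueError when nothing has been contributed for that language.
    """
    try:
        return EXCLUDED_PHRASES[language], CHAPTER_MARKERS[language]
    except KeyError:
        raise ValueError(
            f"No excluded phrases or chapter markers found for '{language}'"
        ) from None
```

Failing loudly for a language with no contributed data mirrors the behavior described in the release note, rather than silently falling back to English.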
Adding More Language Support
I speak a little German, so I added some German chapter markers and a few excluded phrases (although I'm far from fluent). However, I'd love to see contributions from the community so non-English speakers can use this for their audiobooks. If you speak one of the vosk-supported languages, please consider contributing excluded phrases (i.e., phrases that falsely trigger a chapter break, like "this chapter" or "chapter and verse") and chapter markers ("prologue", "chapter", "epilogue", etc.) to this project! If you aren't a programmer, send them to me in a GitHub issue and I'll add them for you.
Generate Cue Files
What's New
This release comes with a few community requested features, as well as some minor bug fixes and other quality-of-life enhancements.
Cue Files
The script can now generate .cue files, which can be used to edit chapter markers and timecodes. The files differ slightly from a standard .cue file for compatibility and parsing reasons, but they are close enough that converting them would not be difficult.
If a .cue file exists within the audiobook directory, it will be used in every subsequent run for parsing timecodes. To stop this behavior, simply rename/move/delete the .cue file and generate a new one (or don't, up to you).
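As a rough sketch of the "use an existing .cue file if present" check, assuming the script simply looks for a `*.cue` file in the audiobook directory (the helper name `find_cue_file` is hypothetical):

```python
from pathlib import Path
from typing import Optional


def find_cue_file(audiobook_dir: Path) -> Optional[Path]:
    """Return the first .cue file in the directory, or None if there is none."""
    cue_files = sorted(audiobook_dir.glob("*.cue"))
    return cue_files[0] if cue_files else None
```

With a check like this, renaming, moving, or deleting the .cue file makes the lookup return `None`, so timecodes would be detected from scratch on the next run.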
File Numbering & Sorting
I received some feedback regarding the file identifiers not sorting properly. I've added a leading 0 to every number below 10, which should help sorting in Explorer, Finder, etc.
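The effect of the leading zero can be illustrated with a short sketch. The filename pattern below is an assumption for illustration only, not the script's actual naming scheme:

```python
def numbered_filename(index: int, title: str, ext: str = "mp3") -> str:
    """Build a filename whose number is zero-padded to two digits."""
    return f"{index:02d} - {title}.{ext}"


# Plain string sorting now matches numeric order for the padded names.
names = sorted(numbered_filename(i, "Chapter") for i in (10, 2, 1))
```

Without the padding, a name starting with "10" would sort before one starting with "2" in a plain alphabetical listing.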
defaults.toml
Updates
I've added a few things to the configuration file to support the new .cue feature, with the goal of making things easier for those who always want .cue files while continuing to leave the default behavior disabled in the script for those who don't.
Narrator Argument
What's New
A small update which adds a new argument, --narrator/-n. This sets the "Composer" ID3 tag, which is recognized by software like Apple Audiobooks and Prologue when parsing audiobook metadata. It should work with other players, too, although I can't guarantee it.
Also added were some additional warning messages when certain arguments are (or are not) used, and the documentation has been updated.
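One way the narrator could end up in the "Composer" tag is through ffmpeg's `-metadata composer=...` option, which ffmpeg writes as the composer frame for .mp3 output. This is a sketch under that assumption; the function name is hypothetical and the script itself may set the tag differently:

```python
def build_tag_command(src: str, dst: str, narrator: str) -> list[str]:
    """Build (but do not run) an ffmpeg command that copies the audio
    stream and sets the composer tag to the narrator's name."""
    return [
        "ffmpeg",
        "-i", src,
        "-c", "copy",                         # no re-encoding
        "-metadata", f"composer={narrator}",  # narrator stored as composer
        dst,
    ]
```

The returned list can be passed to `subprocess.run` once the paths are known to be valid.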
Download Models
What's New
Now you can download new models directly from the script and keep multiple different models at the same time.
Download Models
To download a new model, use the --download_model/-dm parameter and specify a size, either large or small. You must also specify a language using --language/-l if the desired language isn't US English.
PS > python .\chapterize_ab.py 'C:\path\to\audiobook.mp3' --download_model large --language 'en-us'
Using Different Models
If you already have multiple models downloaded and wish to use a different one, you can use the --model/-m parameter and specify large or small (the default is small). If you want to use a different language, use the --language parameter. The script will search the /model directory and pick the closest match to your selection.
PS > python .\chapterize_ab.py 'C:\path\to\audiobook.mp3' --model large --language 'en-us'
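The "closest match" search could work along these lines. This is only a sketch: it assumes model directories follow vosk's usual naming (for example `vosk-model-small-en-us-0.15`), and `pick_model` is a hypothetical helper rather than the script's real function:

```python
from typing import Optional


def pick_model(model_dirs: list[str], size: str, language: str) -> Optional[str]:
    """Pick a model directory for the language, preferring the requested size."""
    candidates = [d for d in model_dirs if f"-{language}-" in d]
    if not candidates:
        return None
    wants_small = size == "small"
    # Prefer a directory whose size matches the request.
    for name in candidates:
        if ("small" in name) == wants_small:
            return name
    # Otherwise fall back to any model for the requested language.
    return candidates[0]
```

Here a request for a large German model falls back to the small one when only that is present, which is one plausible reading of "closest match".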
More Flexible Language Selection
To specify a language, you can use either the language code (for example, it) or the full spelling (for example, italian).
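Accepting either form amounts to normalizing the input to a single code. A minimal sketch, with a deliberately tiny and hypothetical language table:

```python
# Hypothetical subset; the script supports every language vosk provides.
LANGUAGES = {"en-us": "english", "it": "italian", "de": "german"}


def normalize_language(value: str) -> str:
    """Map a language code or full name to its language code."""
    value = value.strip().lower()
    if value in LANGUAGES:
        return value
    for code, name in LANGUAGES.items():
        if name == value:
            return code
    raise ValueError(f"Unsupported language: {value!r}")
```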
Add Multi-Language Support
What's New
The script now supports all languages provided by vosk. 'en-us' is provided by default in the repository, but additional language models will require manual downloads (for now). Only one language model should be placed in the /model directory at any given time. See the README for full details.
New Parameters
To support multiple languages, two new parameters have been added:
- --language, -l: Specify the language code for the model you wish to use
- --list_languages, -ll: List available languages supported by vosk and exit the script
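For readers curious how two such flags can be declared, here is a sketch using Python's standard `argparse`. The flag names come from the notes above; the default value, help text, and the assumption that the script uses `argparse` at all are illustrative:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the two language-related options described above."""
    parser = argparse.ArgumentParser(prog="chapterize_ab.py")
    parser.add_argument(
        "--language", "-l",
        default="en-us",  # assumed default, matching the bundled model
        help="language code for the model you wish to use",
    )
    parser.add_argument(
        "--list_languages", "-ll",
        action="store_true",
        help="list available languages supported by vosk and exit",
    )
    return parser
```

Because `argparse` checks for an exact option match first, `-ll` resolves to `--list_languages` and is not read as `-l` with the value `l`.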
Initial Release
What's New
The first stable release. Chapterize-Audiobooks uses speech-to-text Machine Learning to discover chapter timecodes in .mp3 audiobook files and ffmpeg to extract the metadata and split the file. The user may also pass various ID3-compliant metadata tags to the script, which will always take precedence over any existing ID3 tags extracted from the audiobook file.
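The precedence rule for metadata can be sketched as a dictionary merge in which user-supplied values win. The function name and tag keys are hypothetical:

```python
def merge_tags(extracted: dict, user_supplied: dict) -> dict:
    """Combine tags so that user-supplied values override extracted ones.

    Options the user did not pass (None) leave the extracted value intact.
    """
    overrides = {k: v for k, v in user_supplied.items() if v is not None}
    return {**extracted, **overrides}
```

For example, merging `{"title": "Old", "artist": "A"}` with `{"title": "New", "artist": None}` keeps the extracted artist and replaces only the title.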
Improvement
The script is still being improved, and I encourage anyone who uses it to report problems, particularly false positive words or phrases that trigger a chapter marker.
So far, I've found the following words or phrases (enclosed in "") which cause a false positive chapter marker to be created:
- "chapters"
- "chapter and verse"
- "chapter and verse"
My goal is to grow this list over time, improving the accuracy of detection.
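A simple way to picture how such a list could suppress false positives is shown below. It is a sketch only: the matching logic and the name `is_chapter_marker` are assumptions, not the script's actual detection code:

```python
import re

# The two false positives listed above.
EXCLUDED = ["chapters", "chapter and verse"]


def is_chapter_marker(text: str, marker: str = "chapter",
                      excluded: list[str] = EXCLUDED) -> bool:
    """Return True if the text contains the marker as a whole word
    and does not contain any excluded phrase."""
    text = text.lower()
    if any(phrase in text for phrase in excluded):
        return False
    return re.search(rf"\b{re.escape(marker)}\b", text) is not None
```

Growing the excluded list over time tightens this filter without touching the detection logic itself.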