&#128218; [Contribution] ebook2audiobook roadmap

## All Features open to public Contributions &#11088;
- [ ] Preview blocks/chapters before starting the conversion
- [ ] Edit converted sentences individually for surgical text and SML tag changes
- [x] Change voice per chapter or sentence with SML tags
- [x] -h -help parameter info in different languages
- [x] OCR scanning for PDF / JPG / BMP / PNG / TIFF
- [x] Notebooks Folder [Talked about here](https://github.com/DrewThomasson/ebook2audiobookXTTS/issues/5#issuecomment-2408773254)
- [x] Make Chinese text splitting not split words and improve pause timing [Talked about here](https://github.com/DrewThomasson/ebook2audiobookXTTS/issues/18#issuecomment-2401154894)
- [x] Get Kaggle Notebook working
- [x] Get Working Google Colab Notebook [Talked about here](https://github.com/DrewThomasson/ebook2audiobookXTTS/issues/5#issuecomment-2408773254)
- [ ] [Make an iOS app](https://github.com/DrewThomasson/ebook2audiobook/pull/35#issuecomment-2496495212)
- [ ] [Make an Android app](https://github.com/DrewThomasson/ebook2audiobook/pull/35#issuecomment-2496495212)
- [ ] Audiobookshelf integration
## Wanted Extra Parameters
- [ ] Ebook Translation option
- [x] Output format choice
- [x] Batch ebook folder
- [x] Multiprocessing conversion
- [x] Make ebook input parameter accept a folder containing ebook files to auto-run through.
- [x] GPU device detection and installation of the matching torch/torchaudio package
- [x] Denoise any reference audio uploaded for voice cloning
- [x] Custom model dir input that points to a single folder containing all of the custom model files, if available, instead of pointing to each model file individually
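As a rough illustration of the folder-input item above, an input path could be resolved to a list of files as follows. This is only a sketch, not the project's actual code: the function name `collect_ebooks` and the extension set `EBOOK_EXTENSIONS` are hypothetical, and the real list of supported formats may differ.

```python
from pathlib import Path

# Hypothetical extension list for illustration; the project's real list may differ.
EBOOK_EXTENSIONS:set[str] = {'.epub', '.pdf', '.mobi', '.txt'}

def collect_ebooks(input_path:str)->list[Path]:
    path = Path(input_path)
    # A single file is passed through unchanged.
    if path.is_file():
        return [path]
    # A folder is expanded to its ebook files, sorted so a batch runs in a stable order.
    if path.is_dir():
        return sorted(p for p in path.iterdir() if p.is_file() and p.suffix.lower() in EBOOK_EXTENSIONS)
    raise FileNotFoundError(f'No such file or folder: {input_path}')
```

The caller can then loop over the returned list whether the user passed one ebook or a whole folder.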

## TTS engines integration
- [x] XTTSv2
- [x] Bark
- [x] Fairseq
- [x] VITS
- [x] Tacotron2
- [x] YourTTS
- [x] Tortoise
- [x] GlowTTS
- [ ] Piper-TTS
- [ ] CosyVoice (https://github.com/FunAudioLLM/CosyVoice)
- [ ] Kokoro-TTS
- [ ] Orpheus-TTS
- [ ] Zonos
- [ ] Style-TTS2
- [ ] GPT-SoVITS
- [ ] F5-TTS (https://github.com/DrewThomasson/ebook2audiobookXTTS/issues/38#issuecomment-2453224267)
- [ ] VibeVoice (https://github.com/vibevoice-community/VibeVoice)
- [ ] Qwen3-TTS (https://huggingface.co/spaces/Qwen/Qwen3-TTS)
- [ ] NeuTTS (https://github.com/neuphonic/neutts?tab=readme-ov-file)
- [ ] Speedy-Speech
- [ ] Supertonic (https://github.com/supertone-inc/supertonic)
- [ ] Align-TTS
- [ ] Delightful-TTS
- [ ] Spark-TTS

## Create Readme in these languages

- [ ] Arabic (ara)
- [ ] Chinese (zho)
- [x] English (eng)
- [ ] Spanish (spa)
- [ ] French (fra)
- [ ] German (deu)
- [ ] Italian (ita)
- [ ] Portuguese (por)
- [ ] Polish (pol)
- [ ] Turkish (tur)
- [ ] Russian (rus)
- [ ] Dutch (nld)
- [ ] Czech (ces)
- [ ] Japanese (jpn)
- [ ] Hindi (hin)
- [ ] Bengali (ben)
- [ ] Hungarian (hun)
- [ ] Korean (kor)
- [ ] Vietnamese (vie)
- [ ] Swedish (swe)
- [ ] Persian (fas)
- [ ] Yoruba (yor)
- [ ] Swahili (swa)
- [ ] Indonesian (ind)
- [ ] Slovak (slk)
- [ ] Croatian (hrv)   

## &#128013; Compatibility
- [x] &#127822; Mac Intel x86
- [x] &#129695; Windows x86
- [x] &#128039; Linux x86
- [x] &#128421;&#65039;&#127823; Apple Silicon Mac
- [x] &#129695;&#128170; ARM Windows
- [x] &#128039;&#128170; ARM Linux

## Extra Overkill for training models and such (All supported Coqui-tts models and piper-tts in one easy command) 
- For information about this, ask @DrewThomasson, who is currently developing it: [work-in-progress repo here](https://github.com/DrewThomasson/Universal_TTS_Finetune)
- [ ] Make an easy-to-use training GUI for all coqui-tts models, based on the LJSpeech-format training recipes [here from coqui tts](https://github.com/coqui-ai/TTS/tree/dev/recipes/ljspeech)

## Auto-testing scripts for development

- [x] Standard model headless run through every language sample [Samples located here](https://github.com/DrewThomasson/ebook2audiobookXTTS/tree/main)

## Python Code normalization information for contributors
- No blank lines inside code, except between functions and classes.
- Single quotes for all keys and strings, except in dict and JSON literals, which use double quotes. Key access such as `dict['key']` always uses single quotes.
- 4-space indentation; never tabs.
- Strict typing for every function: annotate all arguments and the return value.
- No space between an argument and its type annotation (`user_id:int`), and no spaces around the `->` between the signature and the return type (`)->str:`).

Example:

```python
import json
from typing import Optional

def get_user(user_id:int, users:list[dict])->Optional[dict]:
    for user in users:
        if user['id'] == user_id:
            return user
    return None

def summarize(user:dict)->str:
    return f"User {user['name']} is {'active' if user['is_active'] else 'inactive'}."

def to_json(user:dict)->str:
    return json.dumps({"id": user['id'], "name": user['name'], "email": user['email']})

users:list = [
    dict(id=1, name='alice', email='alice@example.com', role='admin', is_active=True),
    dict(id=2, name='bob', email='bob@example.com', role='editor', is_active=False),
    dict(id=3, name='carol', email='carol@example.com', role='viewer', is_active=True),
]
config = {
    "max_users": 100,
    "default_role": "viewer",
    "allow_signup": True,
}
roles = ['admin', 'editor', 'viewer']
found = get_user(1, users)
if found:
    print(summarize(found))
    print(found['email'])
    print(to_json(found))
if config['default_role'] in roles:
    print(config['default_role'])
```
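Contributors who want a quick self-check before opening a PR could use something like the sketch below. It is not part of the project: `check_style` is a hypothetical helper that uses only the standard `ast` module and covers just two of the rules above (no tabs in indentation, and type annotations on arguments and return values). It does not check quoting, blank lines, or spacing around annotations, and it ignores `*args`/`**kwargs`.

```python
import ast

def check_style(source:str)->list[str]:
    problems:list[str] = []
    # Rule: 4-space indentation, so any tab in the leading whitespace is flagged.
    for number, line in enumerate(source.splitlines(), start=1):
        indent = line[:len(line) - len(line.lstrip())]
        if '\t' in indent:
            problems.append(f'line {number}: tab used for indentation')
    # Rule: strict typing, so every function needs annotated arguments and a return type.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.returns is None:
                problems.append(f'line {node.lineno}: {node.name}() has no return annotation')
            for arg in node.args.args + node.args.kwonlyargs:
                if arg.annotation is None and arg.arg not in ('self', 'cls'):
                    problems.append(f"line {node.lineno}: argument '{arg.arg}' of {node.name}() has no type annotation")
    return problems
```

Calling `check_style(open('some_file.py').read())` returns a list of messages, which is empty when neither rule is violated.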

## Hardware donation for beta tests wanted
We accept any kind of hardware for testing our development builds, such as:
- NVIDIA cards supporting CUDA >= 11.8
- Intel XPU cards
- AMD cards supporting ROCm >= 5.7
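For anyone unsure which of these backends their machine exposes, the following sketch reports what an installed PyTorch build can see. It is an illustration only, not the project's own detection code: `detect_backend` is a hypothetical name, and the sketch assumes a PyTorch version where ROCm builds report through `torch.cuda` with `torch.version.hip` set, and where Intel support lives under `torch.xpu` (absent in older releases, hence the `getattr`).

```python
def detect_backend()->str:
    try:
        import torch
    except ImportError:
        return 'cpu (torch not installed)'
    if torch.cuda.is_available():
        # ROCm builds also answer through torch.cuda; torch.version.hip is only set on those builds.
        if getattr(torch.version, 'hip', None):
            return f'rocm {torch.version.hip}'
        return f'cuda {torch.version.cuda}'
    # torch.xpu may not exist on older PyTorch releases.
    xpu = getattr(torch, 'xpu', None)
    if xpu is not None and xpu.is_available():
        return 'xpu'
    return 'cpu'

print(detect_backend())
```

Including this output in a test report makes it easier to tell which torch build was actually exercised.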

@DrewThomasson if you want to help out at all! &#128515;
