Lexilux is a unified LLM API client library that makes calling Chat, Embedding, Rerank, and Tokenizer APIs as simple as calling a function.
- 🎯 Function-like API: Call APIs like functions (`chat("hi")`, `embed(["text"])`)
- 🔄 Streaming Support: Built-in streaming for Chat with usage tracking
- 📊 Unified Usage: Consistent usage statistics across all APIs
- 🔧 Flexible Input: Support multiple input formats (string, list, dict)
- 🚫 Optional Dependencies: Tokenizer requires transformers only when needed
- 🌐 OpenAI-Compatible: Works with OpenAI-compatible APIs
```bash
# Basic installation
pip install lexilux

# With tokenizer support, using the full extra name
pip install lexilux[tokenizer]
# Or using the short name
pip install lexilux[token]

# Development installation
pip install -e ".[dev]"
# Or using the Makefile
make dev-install
```

```python
from lexilux import Chat

chat = Chat(base_url="https://api.example.com/v1", api_key="your-key", model="gpt-4")

# Simple call
result = chat("Hello, world!")
print(result.text)                # "Hello! How can I help you?"
print(result.usage.total_tokens)  # 42

# With system message
result = chat("What is Python?", system="You are a helpful assistant")

# Streaming
for chunk in chat.stream("Tell me a joke"):
    print(chunk.delta, end="")
    if chunk.done:
        print(f"\nUsage: {chunk.usage.total_tokens}")
```

```python
from lexilux import Embed

embed = Embed(base_url="https://api.example.com/v1", api_key="your-key", model="text-embedding-ada-002")

# Single text
result = embed("Hello, world!")
vector = result.vectors  # List[float]

# Batch
result = embed(["text1", "text2"])
vectors = result.vectors  # List[List[float]]
```

```python
from lexilux import Rerank

rerank = Rerank(base_url="https://api.example.com/v1", api_key="your-key", model="rerank-model")

result = rerank("python http", ["urllib", "requests", "httpx"])
ranked = result.results  # List[Tuple[int, float]] - (index, score)

# With documents included
result = rerank("query", ["doc1", "doc2"], include_docs=True)
ranked = result.results  # List[Tuple[int, float, str]] - (index, score, doc)
```

```python
from lexilux import Tokenizer

# Auto-offline mode (use the local cache if available, download if not)
tokenizer = Tokenizer("Qwen/Qwen2.5-7B-Instruct", mode="auto_offline")
result = tokenizer("Hello, world!")
print(result.usage.input_tokens)  # 4
print(result.input_ids)           # [[15496, 11, 1917, 0]]

# Force-offline mode (for air-gapped environments)
tokenizer = Tokenizer("Qwen/Qwen2.5-7B-Instruct", mode="force_offline", cache_dir="/models/hf")
```
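The streaming Chat example above prints deltas as they arrive; often you also want the fully assembled response. A minimal sketch of accumulating deltas into one string, using a stub chunk stream in place of a live `chat.stream(...)` call (the `Chunk` class here is a stand-in for illustration, not part of Lexilux):

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    """Stand-in for a stream chunk: delta text plus a done flag."""
    delta: str
    done: bool = False

def collect(stream):
    """Accumulate streaming deltas into the full response text."""
    parts = []
    for chunk in stream:
        parts.append(chunk.delta)
    return "".join(parts)

# Stub stream standing in for chat.stream("Tell me a joke")
stub = [Chunk("Why did "), Chunk("the chicken..."), Chunk("", done=True)]
print(collect(stub))  # Why did the chicken...
```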
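Embedding vectors returned by Embed are plain lists of floats, so downstream similarity math needs no extra dependencies. A sketch of cosine similarity over two such vectors (the sample vectors below are placeholders, not real model output):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# In practice these would come from embed(["text1", "text2"]).vectors
v1 = [0.2, 0.1, 0.4]
v2 = [0.2, 0.1, 0.4]
print(round(cosine_similarity(v1, v2), 6))  # 1.0
```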
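Rerank results are `(index, score)` tuples, which makes it straightforward to reorder the original documents by relevance. A sketch, assuming higher scores mean more relevant (the scores below are made up for illustration):

```python
def docs_by_score(results, docs):
    """Reorder documents by descending rerank score.

    results: list of (index, score) tuples, as in result.results
    docs:    the original document list passed to rerank()
    """
    ranked = sorted(results, key=lambda r: r[1], reverse=True)
    return [docs[i] for i, _ in ranked]

docs = ["urllib", "requests", "httpx"]
results = [(0, 0.31), (1, 0.92), (2, 0.88)]  # hypothetical scores
print(docs_by_score(results, docs))  # ['requests', 'httpx', 'urllib']
```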
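One common use of the token counts from Tokenizer is checking that a prompt fits a model's context window before sending it. A minimal sketch; the 8192-token context size and 512-token reply reserve are illustrative values, not Lexilux defaults:

```python
def fits_context(num_tokens, max_context=8192, reserve=512):
    """Return True if the prompt leaves `reserve` tokens for the reply."""
    return num_tokens + reserve <= max_context

# In practice: num_tokens = tokenizer("...").usage.input_tokens
print(fits_context(4000))  # True
print(fits_context(8000))  # False
```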
Check out the examples/ directory for practical examples:

- `basic_chat.py` - Simple chat completion
- `chat_streaming.py` - Streaming chat
- `embedding_demo.py` - Text embedding
- `rerank_demo.py` - Document reranking
- `tokenizer_demo.py` - Tokenization

Run examples:

```bash
python examples/basic_chat.py
```

```bash
# Run all tests
make test
# Run with coverage
make test-cov
# Run linting
make lint
# Format code
make format
```

Full documentation available at: lexilux.readthedocs.io
Build documentation locally:
```bash
pip install -e ".[docs]"
cd docs && make html
```

Lexilux is part of the Agentsmith open-source ecosystem. Agentsmith is a ToB (business-facing) AI agent and algorithm development platform, currently deployed at multiple highway management companies, securities firms, and regulatory agencies in China. The Agentsmith team is gradually open-sourcing the platform by removing proprietary code, algorithm modules, and enterprise-specific customizations, while decoupling the system for modular use by the open-source community.
- Varlord ⚙️ - Configuration management library with multi-source support
- Routilux ⚡ - Event-driven workflow orchestration framework
- Serilux 📦 - Flexible serialization framework for Python objects
- Lexilux 🚀 - Unified LLM API client library
These projects are modular components extracted from the Agentsmith platform, designed to be used independently or together to build powerful applications.
Lexilux is licensed under the Apache License 2.0. See LICENSE for details.
- 📦 PyPI: pypi.org/project/lexilux
- 📚 Documentation: lexilux.readthedocs.io
- 🐙 GitHub: github.com/lzjever/lexilux
Built with ❤️ by the Lexilux Team