A token-efficient, AI-native data serialization format designed for machine consumption.
AISL (Artificial Intelligence Serialization Language) is a serialization format purpose-built for AI systems. Unlike JSON, YAML, or XML—which prioritize human readability—AISL optimizes for token efficiency and deterministic parsing by large language models.
- Token-efficient: Eliminates quotes, braces, and redundant syntax that consume tokens without adding semantic value.
- Human-optional: Optimized for AI consumption; human readability is a secondary concern.
- Flattened hierarchy: Represents nested structures as dot-notation key paths (`contact.address.city`).
- Linear structure: Each record is a single logical unit with optional multi-line continuation.
- Deterministic parsing: Predictable syntax reduces hallucination risk and parsing errors.
- Lossless conversion: Any AISL document can be converted back to JSON without data loss.
Modern AI systems process structured data constantly—API responses, configuration files, database records, and more. The formats we use today were designed decades ago for human developers:
| Format | Primary Design Goal |
|---|---|
| JSON | Human-readable data interchange |
| YAML | Human-friendly configuration |
| XML | Document markup with extensibility |
| TOML | Human-readable configuration |
These formats introduce inefficiencies when consumed by AI:
- Repeated structural tokens: Every `{`, `}`, `[`, `]`, and `"` consumes tokens without semantic value.
- Redundant key repetition: In arrays of objects, the same keys appear repeatedly.
- Whitespace and indentation: Formatting for humans wastes context window space.
- Ambiguous nesting: Deep hierarchies increase parsing complexity and hallucination risk.
- Irregular structure: Mixed types and optional fields confuse pattern recognition.
AISL addresses these issues by:
- Linearizing hierarchical data into flat key-value pairs
- Using minimal delimiters (`|`, `%`, `~`)
- Eliminating quotes around strings by default
- Providing optional type hints for unambiguous parsing
- Creating predictable, scannable record structures
The result: fewer tokens, clearer structure, more reliable AI processing.
AISL removes syntactic overhead that provides no semantic value to AI systems:
# JSON (43 tokens typical)
{"user": {"name": "Alice", "age": 32, "active": true}}
# AISL (approximately 15 tokens)
record:user|name=Alice|age:int=32|active:bool=true%
Nested structures become dot-notation paths:
# Instead of nested objects:
contact.address.street=123 Main St|contact.address.city=Boston|contact.address.zip=02101%
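The flattening step can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the helper name `flatten` is hypothetical, it handles nested dictionaries only (arrays are not covered), and it writes every value without a type hint.

```python
from typing import Any, Dict


def flatten(obj: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into dot-notation key paths (lists are not handled)."""
    flat: Dict[str, Any] = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse, carrying the path built so far as the new prefix.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


nested = {"contact": {"address": {"street": "123 Main St", "city": "Boston", "zip": "02101"}}}
flat = flatten(nested)
line = "|".join(f"{k}={v}" for k, v in flat.items()) + "%"
print(line)
# contact.address.street=123 Main St|contact.address.city=Boston|contact.address.zip=02101%
```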
| Symbol | Purpose | Example |
|---|---|---|
| `\|` | Field separator | `name=Alice\|age=32` |
| `%` | Record terminator | `...active=true%` |
| `~` | Array element marker | `tags~python\|tags~ai` |
| `\` | Line continuation | `name=Alice \` (continues next line) |
| `.` | Hierarchy separator | `user.contact.email` |
Explicit types eliminate ambiguity:
| Hint | Meaning | Example |
|---|---|---|
| `:int` | Integer | `age:int=32` |
| `:float` | Floating point | `price:float=19.99` |
| `:bool` | Boolean | `active:bool=true` |
| `:str` | String (explicit) | `code:str=007` |
| `:null` | Null value | `deleted:null=` |
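As a rough sketch of how a consumer might apply these hints, the function below splits one field into key, hint, and raw value, then coerces the value. The name `parse_field` is hypothetical, escape sequences are ignored, and unknown hints are rejected; none of this is mandated by the text above.

```python
from typing import Any, Tuple


def parse_field(field: str) -> Tuple[str, Any]:
    """Split 'key[:type]=value' and coerce the value according to the type hint."""
    left, _, raw = field.partition("=")   # split at the first '='
    key, _, hint = left.partition(":")    # hint is '' when no type is given
    if hint == "int":
        return key, int(raw)
    if hint == "float":
        return key, float(raw)
    if hint == "bool":
        if raw not in ("true", "false"):
            raise ValueError(f"invalid bool value: {raw!r}")
        return key, raw == "true"
    if hint == "null":
        return key, None
    if hint in ("", "str"):
        return key, raw                   # strings are the default type
    raise ValueError(f"unknown type hint: {hint!r}")


print(parse_field("age:int=32"))      # ('age', 32)
print(parse_field("code:str=007"))    # ('code', '007')
print(parse_field("deleted:null="))   # ('deleted', None)
```

Note that `code:str=007` keeps its leading zeros, which is the case the explicit string hint exists for.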
Every AISL document has exactly one valid interpretation. There are no optional commas, no flexible whitespace rules, and no ambiguous type coercion. This predictability is intended to reduce model hallucinations and missing-field errors.
The most common pattern—a complete record on one line:
record:user-001|name=Alice|age:int=32|status=active%
Structure:
- `record:user-001`: Record identifier (optional but recommended)
- `|`: Field separator
- `key=value`: Field assignment
- `%`: Record terminator
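The structure above maps onto a short splitting routine. The sketch below assumes the record contains no escaped delimiters, and `split_record` is a hypothetical helper name; it returns the optional identifier plus the raw field strings without interpreting them.

```python
from typing import List, Optional, Tuple


def split_record(line: str) -> Tuple[Optional[str], List[str]]:
    """Return (record id or None, raw field strings) for a single-line record."""
    line = line.strip()
    if not line.endswith("%"):
        raise ValueError("record must end with the '%' terminator")
    parts = line[:-1].split("|")          # drop the terminator, then split fields
    record_id = None
    if parts and parts[0].startswith("record:"):
        record_id = parts.pop(0)[len("record:"):]
    return record_id, parts


record_id, fields = split_record("record:user-001|name=Alice|age:int=32|status=active%")
print(record_id)  # user-001
print(fields)     # ['name=Alice', 'age:int=32', 'status=active']
```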
For records with many fields, use line continuation:
record:user-001\
name=Alice Johnson\
age:int=32\
[email protected]\
contact.phone=555-0123\
preferences.theme=dark\
preferences.notifications:bool=true%
The \ at line end indicates continuation. The record terminates at %.
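One way to process the multi-line form is to normalize it to the single-line form first. The sketch below rests on an assumption the text does not state outright: a backslash followed by a newline also acts as the boundary between two fields, so it can be replaced by `|`. The helper name `join_continuations` is hypothetical, and the sample is an abridged version of the record above.

```python
def join_continuations(text: str) -> str:
    """Collapse a multi-line record into the single-line form.

    Assumption: '\\' + newline ends one field and starts the next.
    """
    pieces = [piece.strip() for piece in text.split("\\\n")]
    return "|".join(piece for piece in pieces if piece)


multi = (
    "record:user-001\\\n"
    "name=Alice Johnson\\\n"
    "age:int=32\\\n"
    "contact.phone=555-0123\\\n"
    "preferences.theme=dark\\\n"
    "preferences.notifications:bool=true%"
)
single = join_continuations(multi)
print(single)
```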
Use the ~ operator to denote array membership:
record:user-001|name=Alice|emails~[email protected]|emails~[email protected]|emails~[email protected]%
This represents:
{
"name": "Alice",
"emails": ["[email protected]", "[email protected]", "[email protected]"]
}
Combine dot-notation with array markers:
record:order-500\
items~0.product=Widget\
items~0.qty:int=2\
items~0.price:float=9.99\
items~1.product=Gadget\
items~1.qty:int=1\
items~1.price:float=24.99%
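Going the other way, indexed array paths can be rebuilt into nested lists and objects. The following is a sketch under stated assumptions: the input is a mapping of already-parsed key paths to values, array segments always take the form `name~index` with a non-negative integer index, and the helper name `unflatten` is hypothetical. The unindexed `key~value` form is not handled here.

```python
from typing import Any, Dict


def unflatten(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Rebuild nested dicts and lists from paths such as 'items~0.product'."""
    root: Dict[str, Any] = {}
    for path, value in flat.items():
        node: Any = root
        segments = path.split(".")
        for i, segment in enumerate(segments):
            last = i == len(segments) - 1
            name, tilde, index = segment.partition("~")
            if tilde:                              # array segment, e.g. 'items~0'
                items = node.setdefault(name, [])
                idx = int(index)
                while len(items) <= idx:           # grow the list up to idx
                    items.append({})
                if last:
                    items[idx] = value
                else:
                    node = items[idx]
            elif last:
                node[name] = value
            else:
                node = node.setdefault(name, {})
    return root


flat_order = {
    "items~0.product": "Widget", "items~0.qty": 2, "items~0.price": 9.99,
    "items~1.product": "Gadget", "items~1.qty": 1, "items~1.price": 24.99,
}
order = unflatten(flat_order)
print(order)
```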
middle_name=|deleted_at:null=%
- `middle_name=`: Empty string
- `deleted_at:null=`: Explicit null
If values contain delimiters, escape them with a backslash and use the explicit string type:
bio:str=Works at Acme \| Loves coding%
| Aspect | JSON | YAML | XML | TOML | AISL |
|---|---|---|---|---|---|
| Primary audience | Humans | Humans | Documents | Humans | AI |
| Verbosity | Medium | Low | High | Low | Very Low |
| Token efficiency | Poor | Medium | Very Poor | Medium | Excellent |
| Nested data | Native | Native | Native | Limited | Flattened |
| Schema support | External | External | XSD/DTD | Native types | Type hints |
| Human readability | Good | Excellent | Fair | Excellent | Minimal |
| AI parse reliability | Good | Poor | Fair | Good | Excellent |
| Whitespace sensitive | No | Yes | No | No | No |
Representing the same data across formats:
Data: A user with name, email, age, and two tags.
| Format | Approximate Tokens (o200k) |
|---|---|
| XML | 65-75 |
| JSON | 35-42 |
| YAML | 28-35 |
| TOML | 25-32 |
| AISL | 18-24 |
Actual savings depend on data shape, key length, and value types.
To validate AISL's efficiency for your use case, test with OpenAI's o200k_base tokenizer (used by GPT-4o and similar models):
Using Python:
import tiktoken
enc = tiktoken.get_encoding("o200k_base")
json_data = '{"name": "Alice", "age": 32, "active": true}'
aisl_data = 'record:user|name=Alice|age:int=32|active:bool=true%'
print(f"JSON tokens: {len(enc.encode(json_data))}")
print(f"AISL tokens: {len(enc.encode(aisl_data))}")
Using the command line:
# Install tiktoken
pip install tiktoken
# Compare files
python -c "
import tiktoken
enc = tiktoken.get_encoding('o200k_base')
with open('data.json') as f: print('JSON:', len(enc.encode(f.read())))
with open('data.aisl') as f: print('AISL:', len(enc.encode(f.read())))
"
AISL is optimized for structured records. Token savings may be minimal or negative for:
- Very short, simple values
- Data with extremely long string values
- Deeply irregular schemas
- Single-field records
The primary goal is AI-friendliness, not token minimization in all cases.
JSON TO AISL CONVERTER - https://aisl-web.github.io/AISL/
| Type | AISL Representation | Notes |
|---|---|---|
| String | `key=value` | Default type, no quotes needed |
| Integer | `key:int=42` | Whole numbers |
| Float | `key:float=3.14` | Decimal numbers |
| Boolean | `key:bool=true` | `true` or `false` |
| Null | `key:null=` | Explicit null value |
| Array | `key~value` | Multiple entries with same key |
| Object | `key.subkey=value` | Dot-notation flattening |
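The mapping in this table can be turned into a small encoder. The sketch below is illustrative only: `encode_field` and `encode_record` are hypothetical names, nested objects are expected to be flattened into dot-notation keys beforehand, array elements are assumed to be plain strings, and values are written without escaping.

```python
from typing import Any, Dict


def encode_field(key: str, value: Any) -> str:
    """Render one key/value pair following the type mapping table."""
    if value is None:
        return f"{key}:null="
    if isinstance(value, bool):           # test bool first: bool is a subclass of int
        return f"{key}:bool={'true' if value else 'false'}"
    if isinstance(value, int):
        return f"{key}:int={value}"
    if isinstance(value, float):
        return f"{key}:float={value}"
    if isinstance(value, list):           # one 'key~item' entry per element
        return "|".join(f"{key}~{item}" for item in value)
    return f"{key}={value}"               # strings need no hint or quotes


def encode_record(record_id: str, fields: Dict[str, Any]) -> str:
    parts = [f"record:{record_id}"] + [encode_field(k, v) for k, v in fields.items()]
    return "|".join(parts) + "%"


print(encode_record("user", {"name": "Alice", "age": 32, "active": True}))
# record:user|name=Alice|age:int=32|active:bool=true%
```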
- No comments: AISL does not support inline comments
- No multi-line strings: Use `\n` escape sequences
- Reserved characters: `|`, `%`, `~`, `\` require escaping in values
- Key restrictions: Individual key segments cannot contain `=`, `.`, or delimiter characters
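The key restrictions can be checked with a regular expression. This sketch follows the `identifier` and `key` rules from the grammar in this document (a letter followed by letters, digits, `-`, or `_`, with segments joined by `.`); the name `is_valid_key` is hypothetical. Since that grammar has no rule for array markers, keys such as `items~0.product` are rejected by this check.

```python
import re

# identifier = ALPHA *(ALPHA / DIGIT / "-" / "_")
IDENT = r"[A-Za-z][A-Za-z0-9_-]*"
# key = identifier *("." identifier)
KEY_RE = re.compile(rf"{IDENT}(?:\.{IDENT})*")


def is_valid_key(key: str) -> bool:
    """Return True if the whole key matches the grammar's key rule."""
    return KEY_RE.fullmatch(key) is not None


print(is_valid_key("contact.address.city"))  # True
print(is_valid_key("a=b"))                   # False
print(is_valid_key("a..b"))                  # False
```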
document = *record
record = [record-id] fields "%"
record-id = "record:" identifier "|"
fields = field *("|" field)
field = key [":" type] "=" value
key = identifier *("." identifier)
type = "int" / "bool" / "float" / "str" / "null"
value = *VCHAR
identifier = ALPHA *(ALPHA / DIGIT / "-" / "_")
| Character | Escape Sequence |
|---|---|
| `\|` (pipe) | `\\|` |
| `%` | `\%` |
| `~` | `\~` |
| `\` | `\\` |
| Newline | `\n` |
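A minimal sketch of escaping and unescaping values, assuming each escape sequence in the table is a single backslash followed by the character (`\|`, `\%`, `\~`, `\\`) and that `\n` stands for a newline. The function names are hypothetical, and rejecting unknown escapes is a design choice of this sketch rather than something the table specifies.

```python
# Raw character -> escaped form, and escaped character -> raw form.
ESCAPES = {"\\": "\\\\", "|": "\\|", "%": "\\%", "~": "\\~", "\n": "\\n"}
UNESCAPES = {"\\": "\\", "|": "|", "%": "%", "~": "~", "n": "\n"}


def escape_value(value: str) -> str:
    """Escape reserved characters so the value can sit inside a record."""
    return "".join(ESCAPES.get(ch, ch) for ch in value)


def unescape_value(value: str) -> str:
    """Reverse escape_value; raise on an unknown or dangling escape."""
    out = []
    chars = iter(value)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, None)       # None means a trailing backslash
            if nxt not in UNESCAPES:
                raise ValueError("invalid escape sequence in " + repr(value))
            out.append(UNESCAPES[nxt])
        else:
            out.append(ch)
    return "".join(out)


escaped = escape_value("Works at Acme | Loves coding")
print(escaped)                  # Works at Acme \| Loves coding
print(unescape_value(escaped))  # Works at Acme | Loves coding
```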
- Improved array syntax for complex nested arrays
- Binary data encoding (base64 type hint)
- Date/time type hints (`:date`, `:datetime`)
- Schema definition language (AISL-Schema)
- Streaming parser for large documents
- Compression hints for repeated values
- Reference syntax for deduplication
- Official CLI tool with validation
- VSCode extension with syntax highlighting
- Language bindings: JavaScript, Go, Rust, Java
- JSON Schema to AISL-Schema converter
- Online playground and converter
- OpenAI function calling format support
- LangChain document loader
- Database export plugins
We welcome contributions from the community.
- Search existing issues to avoid duplicates
- Use the issue template
- Include minimal reproduction examples
- Specify your environment (Python version, OS)
- Open a discussion in the Ideas category
- Provide rationale with token-efficiency analysis
- Include before/after examples
- Consider backward compatibility
- Fork the repository
- Create a feature branch (`git checkout -b feature/improvement`)
- Follow the code style (Black formatting, type hints)
- Add tests for new functionality
- Update documentation
- Submit a pull request
- Use lowercase keys with hyphens for multi-word names
- Include record identifiers for debugging
- Group related fields together
- Use type hints for non-string values
- One record per logical entity
Q: Is AISL meant to replace JSON?
No. JSON remains excellent for human-facing APIs and configuration. AISL is designed for AI-to-AI communication and contexts where token efficiency matters.
Q: Can I use AISL for configuration files?
You can, but TOML or YAML may be better choices if humans will edit the files directly.
Q: How does AISL handle schema validation?
Version 1.0 relies on type hints for basic validation. Full schema support is planned for v2.0.
Q: Is AISL whitespace-sensitive?
No. Whitespace around delimiters is ignored, except within values.
MIT License
Copyright (c) 2025 AISL Contributors
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
AISL — Data serialization for the age of AI.