#interning #string-interning

simple-interner

A simple append-only interner

4 releases (2 breaking)

0.3.4 Feb 4, 2023
0.3.3 Feb 4, 2023
0.3.1 Aug 16, 2022
0.2.0 Apr 7, 2018
0.1.0 Apr 7, 2018

#241 in Caching

Apache-2.0 OR MIT

23KB
335 lines

simple-interner

A very simple interner that hands out references rather than placeholder symbols. This means you can add interning to a system mostly transparently, without rewriting all of the code to work on a new Symbol type and without asking the interner to concretize the symbols.
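To make the "references instead of symbols" idea concrete, here is a minimal std-only sketch. It is not this crate's API or implementation: `LeakyInterner`, `intern`, and `describe` are hypothetical names, and the sketch leaks each interned string with `Box::leak` purely to keep the code short and safe.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical toy interner (NOT the simple-interner API).
/// Append-only: entries are never removed, so handed-out references stay valid.
struct LeakyInterner {
    seen: Mutex<HashSet<&'static str>>,
}

impl LeakyInterner {
    fn new() -> Self {
        LeakyInterner { seen: Mutex::new(HashSet::new()) }
    }

    /// Returns the canonical reference for `s`, allocating only the first time
    /// a given string is seen.
    fn intern(&self, s: &str) -> &'static str {
        let mut seen = self.seen.lock().unwrap();
        if let Some(existing) = seen.get(s).copied() {
            return existing;
        }
        // Assumption for this sketch: leaking is acceptable because the
        // interner never frees anything anyway.
        let leaked: &'static str = Box::leak(s.to_owned().into_boxed_str());
        seen.insert(leaked);
        leaked
    }
}

/// Pre-existing code that takes `&str` needs no changes to accept interned values.
fn describe(name: &str) -> String {
    format!("{} ({} bytes)", name, name.len())
}

fn main() {
    let interner = LeakyInterner::new();
    let a = interner.intern("ident");
    let b = interner.intern(&String::from("ident"));

    // Both calls return the same allocation.
    assert!(std::ptr::eq(a, b));
    // The interned value is an ordinary reference, so it can be passed along directly.
    println!("{}", describe(a));
}
```

Because the interned value is a plain reference, callers such as `describe` never need to go back to the interner to resolve it.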

The typical use case is processing text in chunks, where chunks are very likely to be repeated. For example, when parsing source code, identifiers are likely to come up multiple times. Rather than allocating a separate String for every occurrence of an identifier, an interner lets you store a Symbol instead. Comparing symbols is also much quicker than comparing the full interned strings.
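The following sketch illustrates both points for a toy token stream: repeated identifiers share one allocation, and symbol equality becomes an address comparison instead of a byte-by-byte string comparison. `Sym` and `intern_tokens` are hypothetical names used only for illustration, and splitting on whitespace stands in for a real lexer.

```rust
use std::collections::HashSet;
use std::ops::Deref;

/// Hypothetical symbol type: a reference that is compared by address.
#[derive(Clone, Copy, Debug)]
struct Sym<'a>(&'a str);

impl PartialEq for Sym<'_> {
    fn eq(&self, other: &Self) -> bool {
        // Address (and length) comparison; the string contents are not read.
        std::ptr::eq(self.0, other.0)
    }
}
impl Eq for Sym<'_> {}

impl Deref for Sym<'_> {
    type Target = str;
    fn deref(&self) -> &str {
        self.0
    }
}

/// Interns every whitespace-separated token of `source`.
/// Returns the symbols in order plus the number of distinct allocations made.
fn intern_tokens(source: &str) -> (Vec<Sym<'static>>, usize) {
    let mut pool: HashSet<&'static str> = HashSet::new();
    let mut syms = Vec::new();
    for tok in source.split_whitespace() {
        let canonical = match pool.get(tok).copied() {
            Some(existing) => existing,
            None => {
                // Leaked for brevity in this sketch; allocated once per distinct token.
                let fresh: &'static str = Box::leak(tok.to_owned().into_boxed_str());
                pool.insert(fresh);
                fresh
            }
        };
        syms.push(Sym(canonical));
    }
    (syms, pool.len())
}

fn main() {
    let (syms, unique) = intern_tokens("x = x + y");
    assert_eq!(syms.len(), 5); // five tokens in the stream
    assert_eq!(unique, 4); // but only four allocations: x, =, +, y
    assert!(syms[0] == syms[2]); // both occurrences of `x` are the same symbol
    assert!(syms[0] != syms[4]); // `x` and `y` differ
    assert_eq!(syms[0].len(), 1); // Deref gives access to the underlying str
}
```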

This crate exists to give you the option of using the simplest possible interface. For a more featureful interner, consider one of the other crates compared in the table below:

| crate | global | local | 'static opt[^1] | str-only | symbol size | symbols deref |
| --- | --- | --- | --- | --- | --- | --- |
| simple-interner | manual[^2] | yes | no | no | &T | yes |
| intaglio | no | yes | yes | yes | u32 | no |
| internment | rc[^3] | yes | no | no | &T | yes |
| lasso | no | yes | yes | yes | u8–usize | no |
| string-interner | no | yes | optionally | yes | u16–usize | no |
| string_cache | static only | rc[^3] | buildscript | yes | u64 | yes |
| symbol_table | yes | yes | no | yes | u32 | global only |
| ustr | yes | no | no | yes | usize | yes |

(PRs to this table are welcome!)

[^1]: The interner stores &'static references without copying the pointee into the store, e.g. storing Cow<'static, str> instead of Box<str>.

[^2]: At the moment, creating the Interner inside a static, using Interner::with_hasher, requires the hashbrown feature to be enabled.

[^3]: Uses reference counting to collect globally unused symbols.

Dependencies

~0–4.5MB