Manticore 6.pdf

Manticore Buddy
● Written in PHP 8
● Supports multithreading
● Comes as a self-extracting PHAR
● Requires manticore-executor
● manticore-executor is just statically built PHP with all needed
modules
● apt/yum install manticore manticore-extra installs
everything including:
○ Manticore Server
○ Manticore Buddy
○ Manticore-executor
● Works on Linux, Windows and macOS
● Manticore Search starts Buddy on startup
○ And restarts it in case it stops
○ Can be disabled
● Pluggable architecture in progress
● Applications: SHOW QUERIES, auto schema, mysqldump, shard
orchestration, Apache Superset, Grafana,
OpenSearch/Elasticsearch, Kibana, Logstash, new /cli
and many more

SHOW QUERIES and KILL
● SHOW THREADS shows threads with thread ids
● SHOW QUERIES shows queries with session ids
● KILL <session id> kills the SELECT running in that
session
● Only SELECTs can be killed
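A minimal Python sketch of how the two commands might be combined: given rows already fetched from SHOW QUERIES, it builds one KILL statement per running SELECT. The row layout (dicts with `id` and `query` keys) and the helper name `kill_statements` are assumptions for illustration; the sketch does not connect to a server.

```python
from typing import Iterable, List, Mapping


def kill_statements(rows: Iterable[Mapping[str, object]]) -> List[str]:
    """Build KILL statements for rows that look like running SELECTs.

    `rows` is assumed to be a SHOW QUERIES result converted to dicts with
    hypothetical keys "id" (session id) and "query" (query text).
    """
    statements = []
    for row in rows:
        query = str(row["query"]).lstrip().lower()
        # Only SELECTs can be killed, so everything else is skipped.
        if query.startswith("select"):
            statements.append(f"KILL {int(row['id'])}")
    return statements


sample = [
    {"id": 111, "query": "select * from t where a > 5"},
    {"id": 112, "query": "SHOW QUERIES"},
]
print(kill_statements(sample))
```

The generated statements would then be sent over whatever SQL connection is already in use.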

Elasticsearch-compatible writes
New endpoints
● POST /<table name>/_create/<id>
● POST /<table name>/_create/ (auto id)
● PUT /<table name>/_doc/<id>
● POST /_bulk
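As a rough sketch, the requests for these endpoints could be assembled as below. The helper names are hypothetical, and the bulk body assumes the usual Elasticsearch convention of an action line followed by a document line (newline-delimited JSON); nothing is actually sent over HTTP.

```python
import json
from typing import Dict, Iterable, Optional, Tuple


def create_request(table: str, doc: Dict, doc_id: Optional[int] = None) -> Tuple[str, str, str]:
    """Return (method, path, body) for POST /<table name>/_create/<id>."""
    suffix = "" if doc_id is None else str(doc_id)  # empty suffix means auto id
    return "POST", f"/{table}/_create/{suffix}", json.dumps(doc)


def bulk_body(table: str, docs: Iterable[Tuple[int, Dict]]) -> str:
    """Build a newline-delimited JSON body for POST /_bulk.

    Assumes the Elasticsearch bulk layout: one action line, then the document.
    """
    lines = []
    for doc_id, doc in docs:
        lines.append(json.dumps({"index": {"_index": table, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"


print(create_request("products", {"title": "Yellow bag"}, 1))
print(bulk_body("products", [(2, {"title": "Red bag"})]), end="")
```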

Secondary indexes are ON by default
● Manticore 5: secondary indexes are OFF for searching, but ON for
indexing
● Manticore 6: ON for searching too
● SI index format has been changed
● The algorithms have been optimized too

Auto-schema
● You don’t have to create a table before
writing to it
● Manticore can do auto type-detection
○ Textual data => text
○ Email => string
○ Numeric => int/bigint/float
○ JSON => json
○ Array => multi/multi64
● ON by default
● searchd.auto_schema=0 disables it
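The mapping above can be sketched in Python as follows. This is only an illustration of the listed rules, not Manticore's actual detection code: the email pattern, the 32-bit bound used to split int from bigint (and multi from multi64), and the fallback of mixed arrays to json are all assumptions.

```python
import re
from typing import Any

# Deliberately simple pattern (assumption); real email validation is stricter.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
UINT32_MAX = 2**32 - 1  # assumed upper bound for "int" and "multi"


def guess_type(value: Any) -> str:
    """Map a Python value to one of the column types named on the slide."""
    if isinstance(value, int):
        return "int" if 0 <= value <= UINT32_MAX else "bigint"
    if isinstance(value, float):
        return "float"
    if isinstance(value, dict):
        return "json"
    if isinstance(value, list):
        if all(isinstance(v, int) for v in value):
            fits = all(0 <= v <= UINT32_MAX for v in value)
            return "multi" if fits else "multi64"
        return "json"  # assumption: non-integer arrays are kept as JSON
    if isinstance(value, str):
        return "string" if EMAIL_RE.match(value) else "text"
    raise TypeError(f"unsupported value: {value!r}")


for sample in ["hello world", "a@b.com", 5, 2**40, 1.5, {"k": 1}, [1, 2], [1, 2**40]]:
    print(repr(sample), "=>", guess_type(sample))
```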

Revamp of cost-based query optimizer
● Manticore uses CBO to decide what to use for non-full-text queries:
○ Just do plain scan
○ Additional data/algorithms:
■ Docid index
■ Columnar scan
■ Secondary indexes
● CBO estimates execution cost using attribute statistics like histograms, PGM and
columnar storage statistics
● Execution cost is calculated for every filter in the query
● Multithreaded query execution is considered
○ E.g. queries using secondary/docid indexes always run in a single thread
● Optimizer hints:
○ /*+ DocidIndex(id) */
○ /*+ SecondaryIndex(<attr_name1>) */
○ /*+ ColumnarScan(<attr_name1>) */
○ /*+ NO_ColumnarScan(id) */
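A toy Python illustration of the idea of cost-based selection: pick the access path with the lowest estimated cost and render the matching optimizer hint. The cost numbers, the path keys and the helper names are invented for the example; the real optimizer derives its estimates from attribute statistics and is far more involved.

```python
from typing import Dict, Optional

# Hint names as listed on the slide; a plain scan needs no hint.
HINTS = {
    "docid_index": "DocidIndex",
    "secondary_index": "SecondaryIndex",
    "columnar_scan": "ColumnarScan",
}


def cheapest_path(costs: Dict[str, float]) -> str:
    """Pick the access path with the lowest estimated cost."""
    return min(costs, key=costs.get)


def hint_for(path: str, attr: str) -> Optional[str]:
    """Render the hint that forces `path` on `attr`, or None for a plain scan."""
    name = HINTS.get(path)
    return None if name is None else f"/*+ {name}({attr}) */"


# Hypothetical cost estimates for a single filter on "price".
estimates = {"plain_scan": 1000.0, "secondary_index": 12.5, "columnar_scan": 80.0}
best = cheapest_path(estimates)
print(best, hint_for(best, "price"))
```

A hint is only needed when the automatic choice should be overridden; by default the optimizer decides on its own.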

Telemetry
● What for: to understand what features we need to prioritize working on
● How to disable:
○ searchd.telemetry = 0
○ Env. var. TELEMETRY=0
● All metrics are completely anonymous and no sensitive information is transmitted
● Ex. of what we don’t collect:
○ Any data
○ Table names, field names, hostnames
● Ex. of what we collect:
○ Depersonalized machine ID (hash(machine id))
○ OS name
○ Manticore version information
○ Instance uptime
○ Whether the instance is running in RT mode or plain mode
○ Whether MCL is used or not
○ Whether crash happened or not
○ Whether backup was called
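A short Python sketch of the two mechanisms mentioned here: hashing a machine id so the raw value is never sent, and honoring the TELEMETRY=0 environment variable. The choice of SHA-256 and the function names are assumptions; the slides only say hash(machine id).

```python
import hashlib
import os
from typing import Mapping


def depersonalized_id(machine_id: str) -> str:
    """Hash the machine id so the raw value never leaves the host.

    SHA-256 is an assumption; the actual hash function is not stated.
    """
    return hashlib.sha256(machine_id.encode("utf-8")).hexdigest()


def telemetry_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Treat TELEMETRY=0 as an opt-out; anything else leaves telemetry on."""
    return env.get("TELEMETRY") != "0"


print(depersonalized_id("example-machine-id"))
print(telemetry_enabled({"TELEMETRY": "0"}))
```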

FREEZE/UNFREEZE and manticore-backup
● FREEZE is not LOCK. After FREEZE you can:
○ Read from the table.
○ And write to the table to some extent.
● FREEZE returns the list of table files that are
immutable until you UNFREEZE
● Eventually the table will be locked, but chances
are the backup will already be done by then
● Manticore-backup uses
FREEZE/UNFREEZE
● Use manticore-backup --restore to
restore the whole instance
● Use IMPORT TABLE to restore a specific table
● manticore-backup is written in PHP and
uses manticore-executor
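The FREEZE, copy, UNFREEZE sequence can be sketched in Python as below. This is not how manticore-backup itself is implemented (it is written in PHP); `execute` is a hypothetical callable that runs one SQL statement and returns the file paths reported by FREEZE.

```python
import shutil
from pathlib import Path
from typing import Callable, List


def backup_table(execute: Callable[[str], List[str]], table: str, dest: Path) -> List[Path]:
    """Copy a table's files while the table is frozen.

    `execute` is a hypothetical helper: it runs a statement and returns the
    list of file paths for FREEZE (and an empty list for anything else).
    """
    dest.mkdir(parents=True, exist_ok=True)
    files = execute(f"FREEZE {table}")
    try:
        # The listed files stay immutable until UNFREEZE, so a plain copy works.
        return [Path(shutil.copy2(f, dest)) for f in files]
    finally:
        # Always unfreeze, even if copying fails part-way.
        execute(f"UNFREEZE {table}")
```

Putting UNFREEZE in a `finally` block keeps the table from staying frozen if the copy raises.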

SQL BACKUP
● SQL BACKUP command does the
same as manticore-backup
and uses the same code
● BACKUP … OPTION async=1
● No RESTORE command yet
Plans:
● Plain tables backup
● Backup to an external storage
(e.g. S3)

Dynamic max_matches and accurate aggregation
● Default max_matches (1000) can be automatically
increased up to max_matches_increase_threshold
(16384)
○ Useful when pseudo sharding is on (which it is by
default)
● SELECT .. OPTION accurate_aggregation=1:
○ May disable query parallelization to guarantee
accurate aggregation
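A toy Python illustration of why a per-chunk match limit can skew aggregates when a query runs in parallel: each shard keeps only its top groups, and the merged counts can undercount a group that was cut off in one shard. The data and the helper are invented for the example and do not reflect Manticore's internals.

```python
from collections import Counter
from typing import List


def merged_top_groups(shards: List[List[str]], per_shard_limit: int) -> Counter:
    """Count groups per shard, keep each shard's top groups only, then merge."""
    merged: Counter = Counter()
    for shard in shards:
        merged.update(dict(Counter(shard).most_common(per_shard_limit)))
    return merged


shards = [
    ["a"] * 5 + ["b"] * 4 + ["c"] * 3,
    ["c"] * 5 + ["b"] * 4 + ["a"] * 1,
]
exact = Counter(shards[0] + shards[1])                 # single-threaded view
approx = merged_top_groups(shards, per_shard_limit=2)  # limited per-shard view
print("exact c:", exact["c"], "approx c:", approx["c"])
```

Raising the limit (as the dynamic max_matches does) or processing everything in one pass removes the discrepancy in this toy case.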

Arm64 support
● Manticore Search
● Manticore Columnar Library
● Official docker image

64-bit ID
● Manticore 5: from 1 to 2^63 - 1
● Manticore 6: from -2^63 + 1 to 2^63 - 1
● Next version: from 1 to 2^64 - 1 + UINT64()
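For concreteness, the ranges (1 to 2^63 - 1 in Manticore 5, -2^63 + 1 to 2^63 - 1 in Manticore 6, and 1 to 2^64 - 1 planned next) can be written out in Python. The dictionary keys and the `id_allowed` helper are made up for this sketch.

```python
from typing import Dict, Tuple

# Inclusive document id ranges per version, as given on the slide.
ID_RANGES: Dict[str, Tuple[int, int]] = {
    "manticore5": (1, 2**63 - 1),
    "manticore6": (-(2**63) + 1, 2**63 - 1),
    "next": (1, 2**64 - 1),
}


def id_allowed(version: str, doc_id: int) -> bool:
    """Check whether `doc_id` falls inside the range for `version`."""
    low, high = ID_RANGES[version]
    return low <= doc_id <= high


print(ID_RANGES["manticore6"])
print(id_allowed("manticore5", -5), id_allowed("manticore6", -5))
```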

Index -> table
● History: “full-text index”
● Then we added attributes, docid index,
secondary index
● “Index” became confusing
● So we renamed “index” to “table”:
○ in the docs
○ in all the commands
● The old commands are still working

Breaking changes in Manticore 6
● New columnar storage format
○ Have to rebuild the tables
● New secondary indexes file format
○ ALTER TABLE <table name> REBUILD SECONDARY
○ Since it’s on by default now, backwards compatibility will be
maintained
● New binlog file format
○ Have to stop cleanly
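When many tables need their secondary indexes rebuilt after the upgrade, the statements can be generated rather than typed by hand. A small Python sketch, with hypothetical table names and a hypothetical helper name:

```python
from typing import Iterable, List


def rebuild_statements(tables: Iterable[str]) -> List[str]:
    """Return one ALTER TABLE ... REBUILD SECONDARY statement per table."""
    return [f"ALTER TABLE {name} REBUILD SECONDARY" for name in tables]


# Table names here are placeholders for illustration.
for statement in rebuild_statements(["products", "orders"]):
    print(statement)
```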

Bugs
● 80+ bugs fixed in Manticore 6
● New known issue: memory leak related to Buddy
● Preparing Manticore 6.0.4 maintenance
release
● Updated release process: will release more
frequently

Plans
● Performance:
○ FT + secondary: ✅
○ select count(*) where attr… (in progress)
○ Weak-AND for FT (near-term plans)
○ Per-index binlog
○ Parallel merging
● Buddy-related:
○ Mysqldump ✅
○ /cli ✅
○ Auto-schema for Elastic-like writes ✅
○ Various clients integrations (in progress)
○ Kibana / Opensearch dashboards
● Features:
○ Auto-sharding and orchestration
