This document provides an overview of Apache Cassandra, including its origins in Amazon's Dynamo and Google's BigTable, its ring-based cluster topology and column-family data model, and how it provides horizontal scalability and eventual consistency through replication. It discusses Cassandra's write path, which uses commit logs and memtables, and its read path, which relies on caching. It also covers client access, practical considerations, and Cassandra's future direction.
6. SQL
- Specialized data structures (think B-trees)
- Shines with complicated queries
- Focus on fast query & analysis
- Not necessarily on large datasets
17. Replication
- Replication factor: how many nodes data is replicated on
- Consistency level: Zero, One, Quorum, All
  - Sync or async for writes
  - Reliability of reads
- Read repair
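The interplay of replication factor and consistency level can be sketched with simple arithmetic (this is an illustrative model, not Cassandra code): a read is guaranteed to see the latest write whenever the read and write replica sets must overlap.

```python
# A minimal sketch (not Cassandra's implementation) of why QUORUM writes
# plus QUORUM reads yield up-to-date reads: the two replica sets overlap.

def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas: floor(RF/2) + 1."""
    return replication_factor // 2 + 1

def read_sees_latest_write(n: int, w: int, r: int) -> bool:
    """Consistent when write and read replica counts must overlap: W + R > N."""
    return w + r > n

n = 3
w = r = quorum(n)  # 2 of 3 replicas for both writes and reads
print(w, read_sees_latest_write(n, w, r))  # 2 True
```

With consistency level One on both sides (W = R = 1, N = 3), the same check fails: reads may return stale data until read repair catches up.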
37. Inserting: Writes
- Commit log for durability
  - Configurable fsync
  - Sequential writes only
- Memtable – no disk access (no reads or seeks)
- Sstables are final (become read-only)
  - Indexes
  - Bloom filter
  - Raw data
- Bottom line: FAST!!!
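The write path above can be sketched in a few lines (all names here are assumptions for illustration, not Cassandra's actual classes): append to a commit log for durability, update an in-memory memtable with no reads or seeks, then flush the memtable as an immutable sorted sstable.

```python
# Illustrative toy store mirroring the write path described above.
class TinyStore:
    def __init__(self):
        self.commit_log = []   # sequential, append-only log for durability
        self.memtable = {}     # in-memory; writes need no disk reads or seeks
        self.sstables = []     # immutable, sorted once flushed

    def write(self, key, value):
        self.commit_log.append((key, value))  # durability first
        self.memtable[key] = value            # then memory; no read required

    def flush(self):
        # The memtable becomes a read-only, key-sorted sstable.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

store = TinyStore()
store.write("b", 2)
store.write("a", 1)
store.flush()
print(store.sstables)  # [[('a', 1), ('b', 2)]]
```

Because every step is an append or an in-memory update, writes never block on reads, which is the source of the "bottom line: FAST" claim.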
39. Querying: Overview
- You need a key or keys:
  - Single: key=‘a’
  - Range: key=‘a’ through ’f’
- And columns to retrieve:
  - Slice: cols={bar through kite}
  - By name: key=‘b’ cols={bar, cat, llama}
- Nothing like SQL “WHERE col=‘faz’”
  - But secondary indexes are being worked on (see CASSANDRA-749)
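The query shapes above can be mimicked with hypothetical helpers (these are not the real Thrift API, just a sketch of the semantics): fetch by single key, by key range, and either a column slice or named columns.

```python
# Toy data and helpers sketching Cassandra's query shapes (assumed names).
rows = {
    "a": {"bar": 1, "cat": 2, "kite": 3, "llama": 4},
    "b": {"bar": 5, "cat": 6, "llama": 7},
}

def get(key, col_names=None, col_range=None):
    cols = rows.get(key, {})
    if col_names:                                   # by name
        return {c: cols[c] for c in col_names if c in cols}
    if col_range:                                   # slice: start..end
        lo, hi = col_range
        return {c: v for c, v in cols.items() if lo <= c <= hi}
    return cols                                     # whole row

def get_range(start, end):                          # key range: 'a'..'f'
    return {k: v for k, v in rows.items() if start <= k <= end}

print(get("b", col_names=["bar", "cat", "llama"]))  # {'bar': 5, 'cat': 6, 'llama': 7}
print(get("a", col_range=("bar", "kite")))          # {'bar': 1, 'cat': 2, 'kite': 3}
```

Note what is missing: there is no way to ask for rows *by column value* (the SQL `WHERE col='faz'` shape), which is exactly what the secondary-index work targets.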
40. Querying: Reads
- Practically lock-free
- Sstable proliferation
- New in 0.6:
  - Row cache (avoid sstable lookup, not write-through)
  - Key cache (avoid index scan)
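The two 0.6 caches short-circuit different parts of the read path. A rough sketch (assumed structure, not Cassandra internals): the row cache skips the sstable entirely, and the key cache skips the index scan by remembering where the row lives.

```python
# Sketch of the read-path shortcuts above (illustrative names and layout).
row_cache = {}   # key -> whole row; skips the sstable lookup entirely
key_cache = {}   # key -> sstable offset; skips the index scan

def read(key, sstable_index, sstable_data):
    if key in row_cache:                 # best case: no sstable access at all
        return row_cache[key]
    offset = key_cache.get(key)
    if offset is None:
        offset = sstable_index[key]      # the "index scan" we want to avoid
        key_cache[key] = offset
    row = sstable_data[offset]
    row_cache[key] = row                 # populated on read, not on write
    return row

index = {"a": 0, "b": 1}
data = [{"col": 1}, {"col": 2}]
print(read("b", index, data))  # {'col': 2}
print(key_cache)               # {'b': 1}
```

"Not write-through" matters: since writes bypass the row cache, a cached row must be invalidated or refreshed on update, a real-world complication this sketch omits.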
46. Practical Considerations
- Partitioner: Random or Order-Preserving
  - Affects range queries
- Provisioning: virtual or bare metal
- Cluster size
- Data model: think in terms of access patterns
- Giving up transactions, ad-hoc queries, arbitrary indexes, and joins (you may already do this with an RDBMS!)
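The partitioner trade-off can be shown concretely. Cassandra's RandomPartitioner derives tokens from an MD5 hash of the key, spreading load evenly but destroying key order; an order-preserving partitioner keeps keys sorted, which is what makes key-range queries practical. This is a sketch of the idea, not the production token format:

```python
# Sketch: how the two partitioner families order keys on the ring.
import hashlib

def random_token(key: str) -> int:
    # RandomPartitioner-style: token from an MD5 hash of the key.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def order_preserving_token(key: str) -> str:
    # Order-preserving: tokens sort exactly like the keys themselves.
    return key

keys = ["apple", "banana", "cherry"]
print(sorted(keys, key=random_token))            # hash order: effectively arbitrary
print(sorted(keys, key=order_preserving_token))  # ['apple', 'banana', 'cherry']
```

The flip side: with order-preserving tokens, skewed key distributions produce hot spots, which is why random partitioning is the safer default when range queries are not needed.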
48. Future Direction
- Vector clocks (server-side conflict resolution)
- Alter keyspace/column families on a live cluster
- Compression
- Multi-tenant features
- Fewer memory restrictions
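To make the vector-clock item concrete, here is an illustrative sketch of the core idea (not Cassandra code): each node keeps a per-node counter, and one clock supersedes another only if it is at least as large on every entry; otherwise the versions are concurrent and the server has a conflict to resolve.

```python
# Minimal vector-clock sketch for server-side conflict detection.
def increment(clock: dict, node: str) -> dict:
    out = dict(clock)
    out[node] = out.get(node, 0) + 1
    return out

def descends(a: dict, b: dict) -> bool:
    """True if clock a has seen everything clock b has."""
    return all(a.get(node, 0) >= n for node, n in b.items())

v1 = increment({}, "node1")     # {'node1': 1}
v2 = increment(v1, "node2")     # a later version of v1
fork = increment(v1, "node3")   # a concurrent edit of v1
print(descends(v2, v1))                         # True: v2 supersedes v1
print(descends(v2, fork), descends(fork, v2))   # False False -> conflict
```

The payoff over plain timestamps is that concurrent writes are *detected* rather than silently resolved by last-write-wins.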
49. Wrapping Up
Use Cassandra if you want/need:
- High write throughput
- Near-linear scalability
- Automated replication/fault tolerance
- …and can tolerate missing RDBMS features
32-core machines are expensive, and costs go way up when you try to scale these databases. There is also instability.
- Terabytes of data
- ~1,000,000 ops/second
- Schema changes are difficult (or impossible)
- Manual sharding takes a lot of effort
- Automated sharding + replication is difficult
100 M users, 25 TB data
Horizontal – commodity hardware, not specialized boxes
The cluster is a logical storage ring. Node placement divides the ring into ranges that represent start/stop points for keys. Token assignment can be automatic or manual. Nodes placed closer together on the ring take on less responsibility and less data.
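Ownership on the ring can be sketched in a few lines (the token values and node names here are invented for illustration): each node owns the range from the previous node's token up to its own, so a key belongs to the first node whose token is greater than or equal to the key's token, wrapping around at the top.

```python
# Minimal ring-ownership sketch with assumed tokens.
import bisect

node_tokens = [(10, "A"), (40, "B"), (90, "C")]  # sorted (token, node) pairs
tokens = [t for t, _ in node_tokens]

def owner(key_token: int) -> str:
    i = bisect.bisect_left(tokens, key_token)
    return node_tokens[i % len(node_tokens)][1]  # wrap around the ring

print(owner(15), owner(40), owner(95))  # B B A
```

This also shows why spacing matters: node B owns tokens 11–40 while node C owns 41–90, so a node whose token sits close behind its neighbor's holds a smaller slice of the data.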
Token
Bootstrapping
Hinting (hinted handoff) is not designed for long outages.
An RDBMS focuses on consistency, which limits scale.
No multi-key transactions
Sstable proliferation degrades performance.
- Distributed
- Scalable
- Schema-free
- Sparse table
- Eventually consistent
- Tunable (throughput and fault-tolerance)