The document discusses using MongoDB for ecommerce applications. It notes that a relational database requires a rigid schema that cannot adapt when new product types are introduced, such as adding music or jeans to a site originally built for books. MongoDB provides a flexible schema that can accommodate unknown product attributes. While MongoDB lacks transactions across multiple documents, its atomic updates to single documents can still support many common ecommerce workflows. The document provides examples of flexible and complex queries enabled by MongoDB's document-oriented data model.
10. Let’s Use an Example
How about we start with books
11. Book Product Schema
id:
sku:
product dimensions: General Product
shipping weight:
MSRP:
attributes
price:
description:
...
author: Orson Scott Card
title: Enders Game
binding: Hardcover
publication date: July 15, 1994 Book Specific
publisher name: Tor Science Fiction attributes
number of pages: 352
ISBN: 0812550706
language: English
...
14. Album Product Schema
id:
sku:
product dimensions: General Product
shipping weight:
MSRP:
attributes stay the
price: same
description:
...
artist: MxPx
title: Panic Album Specific
release date: June 7, 2005
label: Side One Dummy attributes are
track listing: [ The Darkest ... different
language: English
format: CD
21. Common approaches dealing with
unknown unknowns in RDBMS world
• EAV (Entity Attribute Value popularized by
Magento)
22. Common approaches dealing with
unknown unknowns in RDBMS world
• EAV (Entity Attribute Value popularized by
Magento)
• Single table inheritance aka. Insanely wides tables
(field 1, field 2, ...)
23. Common approaches dealing with
unknown unknowns in RDBMS world
• EAV (Entity Attribute Value popularized by
Magento)
• Single table inheritance aka. Insanely wides tables
(field 1, field 2, ...)
• Generic columns ( ‘attribute1’, ‘attribute2’, ...)
24. Common approaches dealing with
unknown unknowns in RDBMS world
• EAV (Entity Attribute Value popularized by
Magento)
• Single table inheritance aka. Insanely wides tables
(field 1, field 2, ...)
• Generic columns ( ‘attribute1’, ‘attribute2’, ...)
• Serialized data in blob (or similar)
25. Common approaches dealing with
unknown unknowns in RDBMS world
• EAV (Entity Attribute Value popularized by
Magento)
• Single table inheritance aka. Insanely wides tables
(field 1, field 2, ...)
• Generic columns ( ‘attribute1’, ‘attribute2’, ...)
• Serialized data in blob (or similar)
• Concrete table inheritance
26. Common approaches dealing with
unknown unknowns in RDBMS world
• EAV (Entity Attribute Value popularized by
Magento)
• Single table inheritance aka. Insanely wides tables
(field 1, field 2, ...)
• Generic columns ( ‘attribute1’, ‘attribute2’, ...)
• Serialized data in blob (or similar)
• Concrete table inheritance
• Class table inheritance
28. EAV
as popularized by Magento
“For purposes of flexibility, the Magneto database heavily utilizes
an Entity-Attribute-Value (EAV) data model.
As is often the case, the cost of flexibility is complexity -
Magento is no exception.
The process of manipulating data in Magento is often more
“involved” than that typically experienced using traditional
relational tables.”
- Varien
29. EAV
• Crazy SQL queries.
• Hundreds of joins in a
query... or
• Hundreds of queries
joined in the application
• No database enforced
integrity
33. Single Table Inheritance
(insanely wide tables)
• No data integrity
enforcement
• Only can use FK for
common elements
• Very wasteful (but disk is
cheap!)
• Can’t effectively index
34. Generic Columns
• No data integrity
enforcement
• No data type enforcement
• Only can use FK for
common elements
• Wasteful (but disk is
cheap!)
• Can’t index
35. Serialized in Blob
• Not searchable
• No integrity
• All the disadvantages of a
document store, but none
of the advantages
• Never should be used
• One exception is Oracle
XML which operates
similar to a document store
36. Concrete Table Inheritance
(a table for each product attribute set)
• Allows for data integrity
• Querying across
attribute sets quite hard
to do (lots of joins, OR
statements and full table
scanning)
• New table needs to be
created for each new
attribute set
37. Class table inheritance
(single product table,
each attribute set in own table)
• Likely best solution within the
constraint of SQL
• Supports data type enforcement
• No data integrity enforcement
• Easy querying across categories
(for browse pages) since
common data in single table
• Every set needs a new table
• Requires a ton of forsight, as
changes are very complicated
45. Embedded documents
great for orders
• Ordered items need to be fixed at the time
of purchase
• Embed them right in the order
• db.order.find( { 'items.sku': '00e8da9f' } );
• db.order.find( { 'items.details.actor': 'James
Stewart' } ).count()
48. Stricter data
requirements for $$
• For financial systems any data inconsistency is
unacceptable
49. Stricter data
requirements for $$
• For financial systems any data inconsistency is
unacceptable
• Mongo supports atomic single document writes
50. Stricter data
requirements for $$
• For financial systems any data inconsistency is
unacceptable
• Mongo supports atomic single document writes
• Mongo doesn’t support multi-document
transactions
51. Stricter data
requirements for $$
• For financial systems any data inconsistency is
unacceptable
• Mongo supports atomic single document writes
• Mongo doesn’t support multi-document
transactions
• If you need atomic writes across multi-document
transactions don’t use Mongo.
52. Stricter data
requirements for $$
• For financial systems any data inconsistency is
unacceptable
• Mongo supports atomic single document writes
• Mongo doesn’t support multi-document
transactions
• If you need atomic writes across multi-document
transactions don’t use Mongo.
• Many ecommerce transactions could be
accomplished within a single document write.
54. What about ACID?
• ACID compliance is difficult to implement
with a distributed system. For the most
sensitive data you may want the reliability
ACID guarantees.
55. What about ACID?
• ACID compliance is difficult to implement
with a distributed system. For the most
sensitive data you may want the reliability
ACID guarantees.
• Mongo is ACI for single collection writes.
56. What about ACID?
• ACID compliance is difficult to implement
with a distributed system. For the most
sensitive data you may want the reliability
ACID guarantees.
• Mongo is ACI for single collection writes.
• For many systems ACID compliance for
recording a purchase is overkill and
unnecessary.
57. About Me
• VP Engineering @ OpenSky
• @spf13
• github.com/spf13
• Hacker, Father, Husband, Entrepreneur,
Skate Punk
58. About OpenSky
OpenSky is a better way to shop where you buy from people you trust.
OpenSky is a free online platform that helps people discover
amazing products; share them with their friends, family and followers;
and earn money.
Launching in July 2010, the OpenSky is the first ecommerce platform built
on MongoDB. The OpenSky Platform has been created using MongoDB,
Symfony 2, Doctrine 2, PHP 5.3, jQuery, node.js, Git (with gitflow) and a
touch of Java and Python. We support and contribute to many of these
open source projects.
OpenSky was founded by John Caplan, former CEO of Ford Models and
former President of the About.com Network. OpenSky is well funded by
Highland Capital Partners, Canaan Partners and Ron Conway and is based
in NYC near Union Square.
62. OpenSky currently
uses a hybrid approach
• OpenSky uses a 3rd party library to
process transactions it is written to only
use SQL for transactions. In 2011 we will
transition off this library.
• OpenSky uses Solr for search.
• All other functions of the site currently on
MongoDB.
63. Note on Ecommerce
and Mongo
Concerning transactions, it is true that MongoDB doesn’t support
the multi-object kind; however, it does support atomic operations on
individual documents. And this, combined with the document-
oriented modeling just described and a little creativity, is adequate to
many e-commerce problems. Certainly, if we needed to literally debit
one account and credit another in the same operation, or if we
wanted rollback, we’d need full-fledged transactions. However, the
transactionality provided by MongoDB may be sufficient for most, if
not all, e-commerce operations.
- Kyle Banker 10gen
Editor's Notes
actually, just the first 1/3 of it.
Ironically this is how magento solves the performance problems associated with EAV, by caching the data into insanely wide tables.
Can’t create a FK as each set references a different table. “Key” really made of attribute table name id and attribute table name
Notice how it doesn’t matter if it’s an array or not.
Whenever you use a inter system coordination you need to implement your own atomic checks in the application... But SOAP does have transactions.. so not quite accurate.
kyle idea... but we are fairly atomic with authorize.net
Whenever you use a inter system coordination you need to implement your own atomic checks in the application... But SOAP does have transactions.. so not quite accurate.
kyle idea... but we are fairly atomic with authorize.net
Whenever you use a inter system coordination you need to implement your own atomic checks in the application... But SOAP does have transactions.. so not quite accurate.
kyle idea... but we are fairly atomic with authorize.net
Whenever you use a inter system coordination you need to implement your own atomic checks in the application... But SOAP does have transactions.. so not quite accurate.
kyle idea... but we are fairly atomic with authorize.net
Whenever you use a inter system coordination you need to implement your own atomic checks in the application... But SOAP does have transactions.. so not quite accurate.
kyle idea... but we are fairly atomic with authorize.net