What You Need To Know About Amazon SimpleDB
December 13th, 2007
Well, after being under NDA for so long, I’m glad to be able to say that Amazon SimpleDB has gone into limited beta. Congratulations to everyone on the SDS / SimpleDB team; their several years of work on SimpleDB (formerly called SDS) have produced a brilliant piece of engineering.
What’s cool about SimpleDB
- Really large data sets
- Really Fast
- Highly Available – It’s Amazon. Running Erlang. Whoa.
- On demand scaling – Like S3, EC2, with a sensible data metering pricing model
- Schemaless – major cool factor for me here; items are little hash tables containing sets of key, value pairs
Considerations you’ll want to think about
- Eventual Consistency – Data is not immediately propagated across all nodes… the latency is usually around a second, but for large data sets or heavy loads you may experience more latency. On the plus side, your data isn’t lost!
- Queries are lexicographical – You’ll need to store data in lexicographically ordered form (zero-pad your integers, add positive offsets to negative integer sets, and convert dates into something like ISO 8601)
- Search Indexes – The SimpleDB query expressions don’t support text search, so you’ll have to construct your own inverted indexes to do it properly. This is actually a really great lightweight way to do text search, and I’m sure many interesting indexing schemes will be possible.
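To make the lexicographical point concrete, here is a minimal Python sketch of the three encodings mentioned above: zero-padding, a positive offset for negative numbers, and ISO 8601 dates. The names `OFFSET`, `WIDTH`, `encode_int`, `decode_int`, and `encode_date` are my own for illustration and are not part of any SimpleDB API; the offset and width are assumptions you would size to your own data.

```python
from datetime import datetime, timezone

# Assumed bounds for this sketch; size them to your own data.
OFFSET = 10**9   # shifts values in [-OFFSET, OFFSET) into the non-negative range
WIDTH = 10       # digits needed to hold 2 * OFFSET - 1


def encode_int(n: int) -> str:
    """Offset and zero-pad an integer so string order matches numeric order."""
    if not -OFFSET <= n < OFFSET:
        raise ValueError("value outside the range this encoding supports")
    return str(n + OFFSET).zfill(WIDTH)


def decode_int(s: str) -> int:
    """Reverse encode_int."""
    return int(s) - OFFSET


def encode_date(dt: datetime) -> str:
    """Render a timezone-aware datetime as a UTC ISO 8601 string, which sorts chronologically."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    values = [42, -5, 0, 7, -1000]
    encoded = sorted(encode_int(v) for v in values)
    # Sorting the encoded strings gives the same order as sorting the numbers.
    print([decode_int(s) for s in encoded])
    print(encode_date(datetime(2007, 12, 13, 9, 30, tzinfo=timezone.utc)))
```

Every value of an attribute has to be stored with the same offset and width, otherwise string comparisons silently stop matching numeric order.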
Under the hood
According to the SimpleDB team, SimpleDB is built on top of Erlang. One of the developers, Jim Larson, and I worked together at Sendmail, and he was part of a team doing some amazing stuff with an Erlang message store way back in 2000.
While you don’t need to know Erlang to use SimpleDB, many people have visited here interested in its Erlang roots. If you are interested in learning Erlang, I can recommend Programming Erlang, written by Erlang’s creator – the best introduction you can find. I’ve associate-linked to it on Amazon; just for a little meta-fun.
The data model is simply:
- Large collections of items organized into domains.
- Items are little hash tables containing attributes of key, value pairs.
- Attributes can be searched with various lexicographical queries.
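If it helps to picture it, here is a toy in-memory sketch of that data model in Python. It is only an illustration under my own assumptions: the `Domain` class and its `put` and `query_ge` methods are hypothetical names, not the SimpleDB API, and the real service is reached over REST or SOAP rather than through local objects.

```python
from collections import defaultdict


class Domain:
    """Toy stand-in for a domain: item name -> attribute name -> set of string values."""

    def __init__(self, name):
        self.name = name
        self.items = defaultdict(lambda: defaultdict(set))

    def put(self, item_name, attribute, value):
        """Add a value to an attribute; an attribute may hold several values."""
        self.items[item_name][attribute].add(str(value))

    def query_ge(self, attribute, lower):
        """Names of items with some value of `attribute` >= `lower`, compared as strings."""
        return sorted(
            name
            for name, attrs in self.items.items()
            if any(value >= lower for value in attrs.get(attribute, ()))
        )


if __name__ == "__main__":
    books = Domain("books")
    books.put("item1", "title", "Programming Erlang")
    books.put("item1", "tag", "erlang")
    books.put("item1", "tag", "concurrency")   # a second value for the same key
    books.put("item1", "price", "0045")
    books.put("item2", "price", "0120")
    books.put("item3", "price", "9")           # deliberately not zero-padded
    # item3 matches too, because "9" >= "0100" when compared as strings.
    print(books.query_ge("price", "0100"))
```

The unpadded `"9"` matching a query for prices of at least `"0100"` is exactly the lexicographical trap: comparisons are on strings, so values need a consistent, sortable encoding.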
Now you can easily build:
- Search indexes
- Log databases / analysis tools
- Data mining stores
- Tools for World Domination
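As a rough idea of what the first item on that list involves, here is a minimal inverted index sketch in Python. It runs purely in memory; the function names `tokenize`, `build_inverted_index`, and `search` are hypothetical, and how you would lay the index out in SimpleDB (for example, one item per term with the matching item names as attribute values) is an assumption left to the reader, not something the service prescribes.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into a set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_inverted_index(documents):
    """Map each term to the set of item names whose text contains it."""
    index = defaultdict(set)
    for item_name, text in documents.items():
        for term in tokenize(text):
            index[term].add(item_name)
    return index


def search(index, query):
    """Return item names containing every term in the query (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


if __name__ == "__main__":
    docs = {
        "post1": "Amazon SimpleDB runs Erlang",
        "post2": "Erlang message store at Sendmail",
        "post3": "Amazon S3 and EC2",
    }
    index = build_inverted_index(docs)
    print(sorted(search(index, "erlang")))
    print(sorted(search(index, "amazon erlang")))
```

Each query term becomes a simple equality lookup, which is the kind of query the service already handles; the intersection across terms happens in your own code.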
Further Reading
I also wrote a very basic Python module for SimpleDB to handle the XML and REST stuff (too bad it’s not JSON, at least for now), which I’ll release as soon as I figure out how much of the NDA is now lifted. There are a few floating around, so it shouldn’t be too long before they appear publicly.
Updates:
- Added a link to Nick Christenson’s paper on Sendmail’s Erlang message store – A great read for those of you building large scale messaging systems or anything in Erlang.
- Added a link to Werner Vogels’ article on eventual consistency – a great background behind SimpleDB’s consistency design choice.
- Whether or not SimpleDB and Dynamo are the same underlying technology has never been confirmed by an authoritative source. That’s all I’m allowed to say.
December 14th, 2007 at 1:03 am
This is one of the cooler things that I’ve read about. I’m loving these web services–keep ’em coming! Thanks for posting about it.
December 14th, 2007 at 1:38 am
Hi there
We’ve just started using EC2 and S3 and the one thing that has been holding us up is the Mysql/database side of things and our concern for our data. What you describe looks great and I’d be very interested in seeing your Python script as we’re Python guys too. Thanks for the post
John
December 14th, 2007 at 4:43 am
This is the great news we at Folknology have been waiting for: an AWS db cloud facility. It means we can fast forward our migrations and new apps on AWS.
One question: given we are building in Erlang on EC2, is there an Erlang module/library we can use to access (pretty please) SimpleDB?
We are looking at using such a module and maybe adding mnesia caching to it.
(al at folknology)
regards Al
December 14th, 2007 at 7:21 am
Google is going to be all over this thing like a fat kid on a Twinkie.
December 14th, 2007 at 9:50 am
Urgh! Erlang is horrible. Any rumours of a PHP interface? : )
December 14th, 2007 at 9:52 am
Oops!
Didn’t see the developer’s guide. REST / SOAP interface? Awesome!
Very excited about this : )
December 14th, 2007 at 10:19 am
I agree that inverted indices seem like they might be a good way to do text search, but the limit of 256 key/value pairs means that it might turn out rather contrived (I’m guessing you could have one data element plus one or more term elements that point to the data element to do inverted indices). The only other slightly annoying thing about the current system is that Query does not return a total number of matches (only a token to the next batch), which is unfortunately de rigueur for web apps thanks to Google, but I’m expecting that Amazon might be able to correct this in future releases… otherwise, it’s pretty cool, and having played with it a bit so far, it’s a lot of fun to boot…
December 14th, 2007 at 10:46 am
Didn’t find any mention of sorting… are app developers supposed to grab any given thousand objects and sort them themselves? S3 also has this “small” issue. Any ideas if this will be implemented in the future?
December 14th, 2007 at 11:10 am
Here is a different perspective on SimpleDB: http://marcelo.sampasite.com/brave-tech-world/Amazon-SimpleDB-What-nobody-is-t.htm
December 14th, 2007 at 12:10 pm
I can’t wait. I’m a big fan of all of AWS services. Well, I haven’t used Turk yet…lol
http://codershangout.com
December 14th, 2007 at 12:27 pm
Running Erlang? Citation, please…
December 14th, 2007 at 3:01 pm
The fact that SimpleDB is built using Erlang comes direct from the A2Z team that built it. It’s pretty nifty!
December 14th, 2007 at 3:24 pm
What impact will the 1 second lag have on the actual code-writing? It will be tremendously inefficient from a development perspective if we have to maintain tonnes of try-catch blocks just to ensure that the data is the most up to date.
December 14th, 2007 at 7:45 pm
This is good news, I’m really looking forward to playing around with it once the beta opens up.
More specifically, I needed a tool like this for the eventual world taking over of. I’m pretty thrilled to see that it finally has arrived.
December 14th, 2007 at 10:16 pm
I am hoping that it really was written in Erlang.
What sucks is that there is NO support for Erlang from Amazon’s forums. Hell, most people never even heard of it.
That’s a shame.
Me, I’ve devoted an entire forum section to Erlang…lol
http://codershangout.com
December 15th, 2007 at 7:44 am
This seems like a really cool way for yet another big company to make money off of every transaction I make and every breath I take. Another way for a big company to know everything about me and all of my customers. Another way for a big company to make small companies dependent on them for their survival.
I don’t care how cool the technology is, If I can’t run it on my own server then I will not have anything to do with it.
December 15th, 2007 at 10:40 am
Grant, you just made yourself something to do with it.
December 16th, 2007 at 8:18 am
Where did you see that (erlang)? This seems to be a public version of their Dynamo system described here:
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
Where it says:
“In Dynamo, each storage node has three main software components: request coordination, membership and failure detection, and a local persistence engine. All these components are implemented in Java.”
December 17th, 2007 at 3:25 am
simpledb will be altenative way of RDBMS, right?
December 17th, 2007 at 6:52 am
@Grant
A little paranoid are we? lol
Besides, small companies always depend on big companies for survival. Name one business where this is not true.
Hosting, electric, water, sewage, telecommunications, etc. Chances are that all small companies will use one of the above in some form. They are all big companies (or big governments).
http://codershangout.com
January 3rd, 2008 at 3:08 pm
I read that overview of Dynamo and it sure doesn’t sound like it is based on Erlang. Is it possible that SimpleDB has been confused with CouchDB, which is definitely written on top of Erlang?
http://couchdb.org/
March 26th, 2008 at 4:52 am
Someone asked, “simpledb will be altenative way of RDBMS, right?” Yes, it will be the alternative: the alternative for illiterate programmers.
Some people just don’t get it. RDBMSs are based on set theory and, as such, support many operations that simpledb requires you to do manually. RDBMSs are to mathematics what simpledb is to basic math skills. Sure, you can do many things in the world only knowing addition and subtraction, but if you actually knew algebra, then you could do a lot more.
Do you guys really want simpledb because it actually is a good fit for your application, or are you just scared of learning SQL?
March 28th, 2008 at 9:29 am
Personally, I’ve been working with SQL for ten years, building corporate OLTP databases. I definitely would not use something like SimpleDB for the stuff I do at work.
But that stuff has a lot of complex, interrelated data, has intricate reporting requirements including complex ad hoc queries, and doesn’t need huge scalability.
On the other hand, I’m working on some personal web projects. I’m hoping to need a lot of scalability, and I don’t have a lot of money to spend on it. My requirements are relatively simple, and I don’t have a fickle client imposing them on me. For these projects, I’m very interested in SimpleDB and the rest of AWS.
Right tools for the jobs, that’s all.
July 5th, 2008 at 8:15 am
[…] that has all the right properties and mechanisms in place to do what utility computing requires. Amazon SimpleDB is built upon Erlang. IMDB (owned by Amazon) is switching from Perl to Erlang. Google Gears is using Erlang-style […]
October 7th, 2008 at 9:30 pm
[…] It’s written in Erlang […]
October 13th, 2008 at 12:33 am
[…] What You Need To Know About Amazon SimpleDB: a very brief look at Amazon’s new service for developers […]
November 20th, 2008 at 8:27 am
I keep hearing great things about SimpleDB. We have been utilizing SQL as well but Amazon keeps coming out with terrific solutions. Great work
January 9th, 2009 at 1:17 pm
Have you ever used erlang?
Didn’t think so.
January 19th, 2009 at 9:15 am
[…] Amazon’s SimpleDB Service, and some commentary […]
April 29th, 2010 at 1:05 am
[…] some more technical details, the Inside Looking Out blog has some, and Amazon has a SimpleDB developer […]
May 2nd, 2010 at 3:01 pm
Really a very cool blog! Thanks, and… of course, keep writing!
May 9th, 2010 at 7:57 pm
Is there anyone else who can’t view the last part of this page? I believe the writer needs to check the source code on this post maybe?
May 16th, 2010 at 10:45 pm
amazon’s api has always been challenging to deal with. Great info on the article though.
http://www.maccsl.org
September 28th, 2010 at 12:16 am
This is the post I was looking for long time and finally got it.. thanks for sharing
October 16th, 2010 at 1:56 am
Thanks for sharing..loved ur post..how do i subscribe ur blog