Wednesday, 5 November 2014
NoSQL databases, Hadoop, Big Data: Pinned tabs Nov.5th
01: A brief overview or rather a cheatsheet of MongoDB’s index types and commands. ★
02: I didn’t know that replicating data from Couchbase Lite to Couchbase requires an extra tool, the Sync Gateway. ★
03: A very nice read about how to make some of the most popular sequential clustering algorithms (k-means, single-linkage, correlation) scale to large amounts of data using the map-reduce massively parallel computation model. ★
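To make the idea concrete, here is a minimal sketch of one k-means iteration expressed as a map step and a reduce step. This is the generic textbook formulation, not necessarily the approach taken in the linked article, and the function names (`map_phase`, `reduce_phase`, `kmeans_iteration`) are my own. Everything runs in a single process; in a real map-reduce job the emitted pairs would be shuffled across machines by key.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_phase(points, centroids):
    """Map: emit (centroid_index, (point, 1)) for every input point."""
    for point in points:
        yield nearest(point, centroids), (point, 1)


def reduce_phase(pairs, centroids):
    """Reduce: sum points and counts per key, then average into new centroids."""
    sums = {}
    for key, (point, count) in pairs:
        if key not in sums:
            sums[key] = (list(point), count)
        else:
            acc, n = sums[key]
            sums[key] = ([a + p for a, p in zip(acc, point)], n + count)
    # A centroid that received no points keeps its previous position.
    new_centroids = list(centroids)
    for key, (acc, n) in sums.items():
        new_centroids[key] = tuple(a / n for a in acc)
    return new_centroids


def kmeans_iteration(points, centroids):
    """One full iteration: assign points (map), recompute centroids (reduce)."""
    return reduce_phase(map_phase(points, centroids), centroids)


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
centroids = kmeans_iteration(points, [(0.0, 0.0), (10.0, 10.0)])
```

The map step is embarrassingly parallel because each point only needs the current centroids, and the reduce step only needs per-cluster sums and counts, which is what makes the algorithm a natural fit for this model.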
04: An intro to using Spark Streaming with some HBase and data visualization. ★
05: Benchmarking Amazon EBS options, spinning vs SSD vs Provisioned IOPS SSD, using Redis. No surprises here. ★
06: Researchers from MIT and the Technion (Israel Institute of Technology) have proved that, for a large class of non-blocking parallel algorithms, lock-free and wait-free implementations perform equally well. ★
Lock-free algorithms guarantee that some concurrent operation will make progress. Wait-free algorithms guarantee that all threads make progress.
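A rough sketch of what the lock-free guarantee looks like in code: a counter incremented with a compare-and-swap (CAS) retry loop. Python has no user-level CAS instruction, so the `SimulatedAtomicInt` class below is a hypothetical stand-in that emulates one with an internal lock; only the shape of the retry loop matters here.

```python
import threading


class SimulatedAtomicInt:
    """Hypothetical stand-in for a hardware atomic integer.

    The internal lock only emulates the atomicity of a single CAS
    instruction; it is not part of the algorithm being illustrated.
    """

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        with self._guard:
            return self._value

    def compare_and_swap(self, expected, new):
        """Set the value to new only if it still equals expected."""
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False


def lock_free_increment(cell):
    """Retry until the CAS succeeds; return the number of failed attempts."""
    retries = 0
    while True:
        current = cell.load()
        if cell.compare_and_swap(current, current + 1):
            return retries
        # A failed CAS means another thread's CAS succeeded in between,
        # so the system as a whole made progress even though we did not.
        retries += 1


def run(num_threads=4, increments=500):
    """Increment a shared counter from several threads; return the final value."""
    cell = SimulatedAtomicInt()

    def worker():
        for _ in range(increments):
            lock_free_increment(cell)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell.load()
```

Nothing bounds the number of retries a single thread may need, which is why this loop is lock-free but not wait-free. A wait-free version would have to guarantee that every call finishes within a bounded number of its own steps, regardless of what the other threads do.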
07: Facebook organized a summit to discuss their storage engines and the challenges they are facing across small and big data, as well as hardware. ★
Facebook’s storage is based on Tao and Memcached. Tao operates at a rate of billions of queries per second. The Memcached caching layer has a critical impact on service availability.
The problems Facebook would like to address at both small data and big data layers are quite challenging. A couple of examples:
- how to deal with geographically distributed caches
- how to deal with huge volumes of log data, which are quite difficult to store in their entirety for analysis
- how to deal with a data warehouse that must be partitioned globally, which has important implications for the types of queries that can be executed
Original title and link: NoSQL databases, Hadoop, Big Data: Pinned tabs Nov.5th (©myNoSQL)