Google's BigTable is a highly scalable, high-performance database used in more than 60 Google products. It gives clients dynamic control over data layout and format, and stores data as a multidimensional sorted map indexed by row key, column name, and timestamp. BigTable is column-oriented and uses a decentralized architecture of tablets and tablet servers for scalability and high availability. Google App Engine lets developers build applications on BigTable using Python and common web frameworks, then deploy them to Google's cloud infrastructure.
2. The BigTable Goals
    Wide applicability
        Used in more than 60 Google products
    Scalability
    High performance
    High availability
3. The BigTable Arena
    Internet scale
    Google :: BigTable and GFS
    Apache :: HBase and HDFS
    Amazon :: SimpleDB and S3
    Facebook :: Cachr and Haystacks
4. The BigTable Features
    Dynamic control over data layout and format
    Data is uninterpreted strings
    “Does not support a full relational model”
    Locality of data
    Dynamic control over serving data from memory or disk
    Sparse, distributed, persistent multidimensional sorted map
    The map is indexed by:
        A row key
        A column name
        A timestamp
    Each value in the map is an uninterpreted array of bytes
    Column oriented
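The data model on this slide can be pictured with a small in-memory sketch. This is only a toy illustration of a sorted map keyed by (row key, column name, timestamp) with byte-string values; the class name `SortedCellMap`, its methods, and the sample row keys are hypothetical and are not part of BigTable or App Engine.

```python
import bisect


class SortedCellMap:
    """Toy model of a sparse sorted map keyed by (row key, column name, timestamp)."""

    def __init__(self):
        self._keys = []    # sorted list of (row, column, -timestamp)
        self._values = {}  # key tuple -> uninterpreted bytes

    def put(self, row, column, timestamp, value):
        if not isinstance(value, bytes):
            raise TypeError("values are uninterpreted byte strings")
        # Negating the timestamp makes newer versions sort before older ones.
        key = (row, column, -timestamp)
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, row, column):
        """Return the newest value stored for (row, column), or None."""
        i = bisect.bisect_left(self._keys, (row, column, float("-inf")))
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._values[self._keys[i]]
        return None

    def scan_rows(self, start_row, end_row):
        """Yield (row, column, timestamp, value) for start_row <= row < end_row."""
        i = bisect.bisect_left(self._keys, (start_row,))
        while i < len(self._keys) and self._keys[i][0] < end_row:
            row, column, neg_ts = self._keys[i]
            yield row, column, -neg_ts, self._values[self._keys[i]]
            i += 1


if __name__ == "__main__":
    cells = SortedCellMap()
    cells.put("patient/0002", "lastName", 1, b"James")
    cells.put("patient/0002", "lastName", 2, b"Jameson")  # newer version
    cells.put("patient/0001", "lastName", 1, b"Smith")
    print(cells.get("patient/0002", "lastName"))
    for cell in cells.scan_rows("patient/", "patient0"):
        print(cell)
```

Because the keys are kept sorted, rows with adjacent keys sit next to each other, which is one way to read the "locality of data" point: a range scan over a key prefix touches a contiguous run of entries. Cells that were never written take no space, which is what "sparse" means here.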
8. App Engine
    BigTable + Python + App Engine SDK
    Choice of web frameworks:
        webapp (pre-installed)
        Django
        CherryPy
        Pylons
        Web.py
    Google Accounts integration
    App Engine SDK for offline development
    Offline development environment
    Online runtime environment
    Free to get started
    Priced similar to Amazon S3
9. Getting Started
    Sign up for an account
    Download Python 2.5
    Download the App Engine SDK:
        Local version of BigTable
        Web server
        Google user account simulator
        webapp framework
    Getting started tutorial
    Write your application
    Upload to Google
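10. Class Definition
    Python code to declare a datastore class:

        from google.appengine.ext import db

        class Patient(db.Model):
            # Names and sex are plain text, so they use StringProperty;
            # UserProperty is meant for Google Accounts users.
            firstName = db.StringProperty()
            lastName = db.StringProperty()
            dateOfBirth = db.DateTimeProperty()
            sex = db.StringProperty()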
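11. Create
    Python code to create and store an object:

        import datetime

        patient = Patient()
        patient.firstName = "George"
        patient.lastName = "James"
        # DateTimeProperty expects a datetime object, not a string.
        patient.dateOfBirth = datetime.datetime(2008, 1, 1)
        patient.sex = "Male"  # same value the queries filter on
        patient.put()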
12. Query
    Python code to query a class (inside a webapp request handler):

        patients = Patient.all()
        for patient in patients:
            # Format the string first; write() takes a single argument.
            self.response.out.write('Name %s %s.' % (patient.firstName, patient.lastName))
13. More complex query
    Python code to select the 100 youngest male patients:

        allPatients = Patient.all()
        allPatients.filter('sex =', 'Male')
        # Descending order puts the latest birth dates (youngest patients) first.
        allPatients.order('-dateOfBirth')
        patients = allPatients.fetch(100)
14. Query using GQL
    GQL = Google Query Language
    GQL code to select the 100 youngest male patients:

        SELECT * FROM Patient WHERE sex = 'Male' ORDER BY dateOfBirth DESC LIMIT 100

    Cannot select specific columns
    No joins
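To make the intended result of the "100 youngest male patients" query concrete without the App Engine runtime, here is a plain-Python sketch of the same semantics: an equality filter, a sort with the latest birth dates first, and a limit. The `Patient` tuple and the `youngest` helper are stand-ins invented for this illustration; they are not the datastore API.

```python
from collections import namedtuple
from datetime import date

# Stand-in for the datastore class; a simple immutable record.
Patient = namedtuple("Patient", ["firstName", "lastName", "dateOfBirth", "sex"])


def youngest(patients, sex, limit):
    """Return up to `limit` patients of the given sex, youngest first."""
    matching = [p for p in patients if p.sex == sex]             # WHERE sex = ...
    matching.sort(key=lambda p: p.dateOfBirth, reverse=True)     # ORDER BY dateOfBirth DESC
    return matching[:limit]                                      # LIMIT


if __name__ == "__main__":
    sample = [
        Patient("George", "James", date(2008, 1, 1), "Male"),
        Patient("Alan", "Smith", date(1950, 6, 30), "Male"),
        Patient("Mary", "Jones", date(2010, 3, 15), "Female"),
    ]
    for p in youngest(sample, "Male", 100):
        print(p.firstName, p.lastName, p.dateOfBirth)
```

The point of the sketch is the sort direction: "youngest" means the most recent dates of birth, so the ordering has to be descending. An ascending sort on the same property would return the oldest patients instead.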
15. Indexes
    Development SDK
        Index definitions generated automatically based on data access within your application
    Index definitions uploaded to the Google server
    Index definition for the "100 youngest male patients" query (equality filter on sex, then dateOfBirth descending):

        - kind: Patient
          properties:
          - name: sex
          - name: dateOfBirth
            direction: desc
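One way to picture why a query with a filter and a sort order needs an index definition is to model the index as a sorted list of entries and the query as a scan over one contiguous slice of it. The sketch below is a toy model under that assumption, with the filtered property placed first and the birth date sorted newest first; `build_index`, `query_index`, and the entity keys are hypothetical names, and the real datastore's storage format is not shown here.

```python
import bisect
from datetime import date


def build_index(entities):
    """Build sorted index entries (sex, -birth ordinal, entity key).

    `entities` maps an entity key to a dict with 'sex' and 'dateOfBirth'.
    Negating the date ordinal sorts the latest birth dates first.
    """
    return sorted(
        (e["sex"], -e["dateOfBirth"].toordinal(), key)
        for key, e in entities.items()
    )


def query_index(index, sex, limit):
    """Return up to `limit` entity keys for `sex`, youngest first."""
    start = bisect.bisect_left(index, (sex,))  # first entry for this sex
    keys = []
    for entry_sex, _, key in index[start:]:
        if entry_sex != sex or len(keys) == limit:
            break  # left the contiguous run, or collected enough
        keys.append(key)
    return keys


if __name__ == "__main__":
    entities = {
        "p1": {"sex": "Male", "dateOfBirth": date(1950, 6, 30)},
        "p2": {"sex": "Male", "dateOfBirth": date(2008, 1, 1)},
        "p3": {"sex": "Female", "dateOfBirth": date(2010, 3, 15)},
    }
    index = build_index(entities)
    print(query_index(index, "Male", 100))
```

In this model the query never sorts at read time: the entries for one `sex` value are already adjacent and already in the requested order, so answering the query is a seek followed by a short sequential read.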