# 第äºç« ï¼è³ææ¨¡åèæ¥è©¢èªè¨

> èªè¨çéçå°±æ¯ææ³çéçã
>
> ââ 路德ç¶å¥ã»ç¶ç¹æ ¹æ¯å¦ï¼ãé輯å²å¸ãï¼1922ï¼
>
-------------------
[TOC]
è³ææ¨¡åå¯è½æ¯è»é«éç¼ä¸æéè¦çé¨åäºï¼å çºå®åçå½±é¿å¦æ¤æ·±é ï¼ä¸å
å
å½±é¿èè»é«ç編寫æ¹å¼ï¼èä¸å½±é¿èæåç **è§£é¡æè·¯**ã
夿¸æç¨ä½¿ç¨å±¤å±¤çå çè³ææ¨¡åæ§å»ºãå°æ¼æ¯å±¤è³ææ¨¡åçééµå顿¯ï¼å®æ¯å¦ä½ç¨ä½ä¸å±¤è³ææ¨¡åä¾ **表示** çï¼ä¾å¦ï¼
1. ä½çºä¸åæç¨éç¼äººå¡ï¼ä½ è§å¯ç¾å¯¦ä¸çï¼è£¡é¢æäººå¡ãçµç¹ã貨ç©ãè¡çºãè³éæµåãææ¸¬å¨çï¼ï¼ä¸¦æ¡ç¨ç©ä»¶æè³æçµæ§ï¼ä»¥åææ§é£äºè³æçµæ§ç API ä¾é²è¡å»ºæ¨¡ãé£äºçµæ§é常æ¯ç¹å®æ¼æç¨ç¨å¼çã
2. ç¶è¦å²åé£äºè³æçµæ§æï¼ä½ å¯ä»¥å©ç¨éç¨è³ææ¨¡åä¾è¡¨ç¤ºå®åï¼å¦ JSON æ XML æä»¶ãéä¿è³æåº«ä¸ç表æå模åã
3. è³æåº«è»é«çå·¥ç¨å¸«é¸å®å¦ä½ä»¥è¨æ¶é«ãç£ç¢æç¶²è·¯ä¸çä½å
çµä¾è¡¨ç¤º JSON / XML/ éä¿ / åè³æãéé¡è¡¨ç¤ºå½¢å¼ä½¿è³ææå¯è½ä»¥å種æ¹å¼ä¾æ¥è©¢ï¼æå°ï¼æç¸±åèçã
4. 卿´ä½ç層次ä¸ï¼ç¡¬é«å·¥ç¨å¸«å·²ç¶æ³åºäºä½¿ç¨é»æµãå
èè¡ãç£å ´æè
å
¶ä»æ±è¥¿ä¾è¡¨ç¤ºä½å
çµçæ¹æ³ã
ä¸åè¤éçæç¨ç¨å¼å¯è½æææ´å¤çä¸éå±¤æ¬¡ï¼æ¯å¦åºæ¼ API ç APIï¼ä¸éåºæ¬ææ³ä»ç¶æ¯ä¸æ¨£çï¼æ¯å層é½ééæä¾ä¸åæç¢ºçè³ææ¨¡åä¾é±èæ´ä½å±¤æ¬¡ä¸çè¤éæ§ãéäºæ½è±¡å
許ä¸åç人群ææå°åä½ï¼ä¾å¦è³æåº«å» åçå·¥ç¨å¸«å使ç¨è³æåº«çæç¨ç¨å¼éç¼äººå¡ï¼ã
è³ææ¨¡å種é¡ç¹å¤ï¼æ¯åè³ææ¨¡åé½å¸¶æå¦ä½ä½¿ç¨çè¨æ³ãæäºç¨æ³å¾å®¹æï¼æäºå䏿¯æ´å¦æ¤ï¼æäºæä½å·è¡å¾å¿«ï¼æäºå表ç¾å¾å·®ï¼æäºè³æè½æé常èªç¶ï¼æäºåå¾éº»ç
©ã
ææ¡ä¸åæ¸ææ¨¡åéè¦è±è²»å¾å¤ç²¾åï¼æ³æ³éä¿è³æå»ºæ¨¡æå¤å°æ¬æ¸ï¼ãå³ä¾¿åªä½¿ç¨ä¸åæ¸ææ¨¡åï¼ä¸ç¨æå¿å
¶å
§é¨å·¥ä½æ©å¶ï¼æ§å»ºè»é«ä¹æ¯é常å°é£çãç¶èï¼å çºè³ææ¨¡åå°ä¸å±¤è»é«çåè½ï¼è½åä»éº¼ï¼ä¸è½åä»éº¼ï¼æèè³æ·±çå½±é¿ï¼æä»¥é¸æä¸åé©åçè³ææ¨¡åæ¯é常éè¦çã
卿¬ç« ä¸ï¼æåå°ç ç©¶ä¸ç³»åç¨æ¼è³æå²ååæ¥è©¢çéç¨è³ææ¨¡åï¼åé¢å表ä¸ç第 2 é»ï¼ãç¹å¥å°ï¼æåå°æ¯è¼éä¿æ¨¡åï¼æä»¶æ¨¡ååå°éåºæ¼åå½¢çè³ææ¨¡åãæåéå°æª¢è¦å種æ¥è©¢èªè¨ä¸¦æ¯è¼å®åçç¨ä¾ãå¨ [第ä¸ç« ](ch3.md) ä¸ï¼æåå°è¨è«å²å弿æ¯å¦ä½å·¥ä½çãä¹å°±æ¯èªªï¼éäºè³ææ¨¡å實é䏿¯å¦ä½å¯¦ç¾çï¼å表ä¸ç第 3 é»ï¼ã
## éä¿æ¨¡åèæä»¶æ¨¡å
ç¾å¨æèåçè³ææ¨¡åå¯è½æ¯ SQLãå®åºæ¼ Edgar Codd å¨ 1970 å¹´æåºçéä¿æ¨¡åã1ãï¼è³æè¢«çµç¹æ **éä¿**ï¼SQL ä¸ç¨±ä½ **表**ï¼ï¼å
¶ä¸æ¯åéä¿æ¯ **å
çµ**ï¼SQL ä¸ç¨±ä½ **è¡**) çç¡åºéåã
éä¿æ¨¡åæ¾æ¯ä¸åçè«æ§çæè°ï¼ç¶æå¾å¤äººé½æ·çæ¯å¦è½å¤ ææå¯¦ç¾å®ãç¶èå°äº 20 ä¸ç´ 80 年代䏿ï¼éä¿è³æåº«ç®¡ç系統ï¼RDBMSesï¼å SQL å·²æçºå¤§å¤æ¸äººåå²ååæ¥è©¢æäºå¸¸è¦çµæ§çè³æçé¦é¸å·¥å
·ãéä¿è³æåº«å·²ç¶æçºç¨±é¸äºå¤§ç´ 25~30 å¹´ ââ éå°è¨ç®æ©å²ä¾èªªæ¯æ¥µå
¶æ¼«é·çæéã
éä¿è³æåº«èµ·æºæ¼åæ¥è³æèçï¼å¨ 20 ä¸ç´ 60 年代å 70 年代ç¨å¤§åè¨ç®æ©ä¾å·è¡ãå¾ä»å¤©çè§åº¦ä¾çï¼é£äºç¨ä¾é¡¯å¾å¾å¹³å¸¸ï¼å
¸åç **äºåèç**ï¼å°é·å®æéè¡äº¤æï¼èªç©ºå
¬å¸é è¨ï¼åº«å管çè³è¨è¨éå¨åº«ï¼å **æ¹èç**ï¼å®¢æ¶ç¼ç¥¨ï¼å·¥è³å®ï¼å ±åï¼ã
ç¶æçå
¶ä»è³æåº«è¿«ä½¿æç¨ç¨å¼éç¼äººå¡å¿
é èæ
®è³æåº«å
§é¨çè³æè¡¨ç¤ºå½¢å¼ãéä¿æ¨¡åè´åæ¼å°ä¸è¿°å¯¦ç¾ç´°ç¯é±è卿´ç°¡æ½çä»é¢ä¹å¾ã
å¤å¹´ä¾ï¼å¨è³æå²ååæ¥è©¢æ¹é¢åå¨è許å¤ç¸äºç«¶ççæ¹æ³ãå¨ 20 ä¸ç´ 70 年代å 80 年代åï¼ç¶²ç模åï¼network modelï¼å層次模åï¼hierarchical modelï¼æ¾æ¯ä¸»è¦ç鏿ï¼ä½éä¿æ¨¡åï¼relational modelï¼é¨å¾ä½æäºä¸»å°å°ä½ãç©ä»¶è³æåº«å¨ 20 ä¸ç´ 80 年代æ«å 90 年代åä¾äºåå»ãXML è³æåº«å¨äºåä¸ä¸ç´ååºç¾ï¼ä½åªæå°ç¾æ¡ç¨éãéä¿æ¨¡åçæ¯åç«¶çè
é½å¨å
¶æä»£ç¢çäºå¤§éççä½ï¼ä½å¾ä¾æ²ææçºã2ãã
é¨èé»è
¦è¶ä¾è¶å¼·å¤§åäºè¯ï¼å®åéå§ç¨æ¼æ¥ç夿¨£åçç®çãéä¿è³æåº«é常æåå°è¢«æ¨å»£å°æ¥åè³æèççåå§ç¯åä¹å¤æ´çºå»£æ³çç¨ä¾ä¸ãä½ ä»å¤©å¨ç¶²ä¸çå°ç大é¨åå
§å®¹ä¾èæ¯ç±éä¿è³æåº«ä¾æä¾æ¯æ´ï¼ç¡è«æ¯ç·ä¸éåºãè¨è«ã社交網路ãé»åååãéæ²ãè»é«å³æåçç¢åæç¨ç¨å¼çå
§å®¹ã
### NoSQL çèªç
ç¾å¨ - 2010 年代ï¼NoSQL éå§äºææ°ä¸è¼ªå試ï¼è©¦åæ¨ç¿»éä¿æ¨¡åç統治å°ä½ãâNoSQLâ éåååè®äººéºæ¾ï¼å çºå¯¦éä¸å®ä¸¦æ²ææ¶åå°ä»»ä½ç¹å®çæè¡ãæåå®åªæ¯ä½çºä¸åéç®ç Twitter æ¨ç±¤ï¼ç¨å¨ 2009 å¹´ä¸å鿼忣å¼ï¼ééä¿è³æåº«ä¸çéæºèæä¸ãç¡è«å¦ä½ï¼éåè¡èªè§¸åäºæäºç¥ç¶ï¼ä¸¦è¿
éå¨ç¶²è·¯åµæ¥ç¤¾ç¾¤å
§å¤å³æéä¾ãå¥½äºæè¶£çè³æåº«ç³»çµ±ç¾å¨é½è *#NoSQL* æ¨ç±¤ç¸éè¯ï¼ä¸¦ä¸ NoSQL 被追溯æ§å°éæ°è§£éçº **ä¸å
æ¯ SQLï¼Not Only SQLï¼** ã4ãã
æ¡ç¨ NoSQL è³æåº«çè徿幾åé©
åå ç´ ï¼å
¶ä¸å
æ¬ï¼
* éè¦æ¯éä¿è³æåº«æ´å¥½çå¯ä¼¸ç¸®æ§ï¼å
æ¬é常大çè³æéæé常é«ç寫å
¥ååé
* ç¸æ¯åæ¥è³æåº«ç¢åï¼å
è²»åéæºè»é«æ´ååæ
* éä¿æ¨¡åä¸è½å¾å¥½å°æ¯æ´ä¸äºç¹æ®çæ¥è©¢æä½
* åæ«æ¼éä¿æ¨¡åçéå¶æ§ï¼æ¸´æä¸ç¨®æ´å
·å¤åæ
æ§è表ç¾åçè³ææ¨¡åã5ã
ä¸åçæç¨ç¨å¼æä¸åçéæ±ï¼ä¸åç¨ä¾çæä½³æè¡é¸æå¯è½ä¸åæ¼å¦ä¸åç¨ä¾çæä½³æè¡é¸æãå æ¤ï¼å¨å¯é è¦çæªä¾ï¼éä¿è³æåº«ä¼¼ä¹å¯è½æç¹¼çºèå種ééä¿è³æåº«ä¸èµ·ä½¿ç¨ - éç¨®æ³æ³ææä¹è¢«ç¨±çº **æ··åæä¹
åï¼polyglot persistenceï¼**ã
### ç©ä»¶éä¿ä¸å¹é
ç®å大夿¸æç¨ç¨å¼éç¼é½ä½¿ç¨ç©ä»¶å°åçç¨å¼èªè¨ä¾éç¼ï¼éå°è´äºå° SQL è³ææ¨¡åçæ®éæ¹è©ï¼å¦æè³æå²åå¨éä¿è¡¨ä¸ï¼é£éº¼éè¦ä¸å笨æçè½æå±¤ï¼èæ¼æç¨ç¨å¼ç¨å¼ç¢¼ä¸çç©ä»¶å表ï¼è¡ï¼åçè³æåº«æ¨¡åä¹éãæ¨¡åä¹éçä¸é£è²«ææè¢«ç¨±çº **黿ä¸å¹é
ï¼impedance mismatchï¼**[^i]ã
[^i]: ä¸åå¾é»åå¸åç¨çè¡èªãæ¯åé»è·¯ç輸å
¥å輸åºé½æä¸å®ç黿ï¼äº¤æµé»é»ï¼ãç¶ä½ å°ä¸åé»è·¯ç輸åºé£ç·å°å¦ä¸åé»è·¯ç輸å
¥æï¼å¦æå
©åé»è·¯ç輸åºå輸å
¥é»æå¹é
ï¼åé£ç·ä¸çåçå³è¼¸å°è¢«æå¤§åã黿ä¸å¹é
æå°è´è¨èåå°åå
¶ä»åé¡ã
å ActiveRecord å Hibernate 鿍£ç **ç©ä»¶éä¿å°æ ï¼ORM object-relational mappingï¼** æ¡æ¶å¯ä»¥æ¸å°éåè½æå±¤æéçæ¨£æ¿ç¨å¼ç¢¼çæ¸éï¼ä½æ¯å®åä¸è½å®å
¨é±èéå
©å模åä¹éçå·®ç°ã

**å 2-1 使ç¨éä¿å模å¼ä¾è¡¨ç¤ºé è±ç°¡ä»**
ä¾å¦ï¼[å 2-1](../img/fig2-1.png) å±ç¤ºçå¦ä½å¨éä¿æ¨¡å¼ä¸è¡¨ç¤ºç°¡æ·ï¼ä¸å LinkedIn ç°¡ä»ï¼ãæ´åç°¡ä»å¯ä»¥ééä¸åå¯ä¸çèå¥ç¬¦è `user_id` 便¨èãå `first_name` å `last_name` 鿍£çæ¬ä½æ¯å使ç¨è
åªåºç¾ä¸æ¬¡ï¼æä»¥å¯ä»¥å¨ User 表ä¸å°å
¶å»ºæ¨¡çºåã使¯ï¼å¤§å¤æ¸äººå¨è·æ¥çæ¶¯ä¸ææå¤æ¼ä¸ä»½çå·¥ä½ï¼äººåå¯è½æä¸å樣çæè²é段åä»»ææ¸éçè¯çµ¡è³è¨ãå¾ä½¿ç¨è
å°éäºå°æ¡ä¹éåå¨ä¸å°å¤çéä¿ï¼å¯ä»¥ç¨å¤ç¨®æ¹å¼ä¾è¡¨ç¤ºï¼
* å³çµ± SQL 模åï¼SQLï¼1999 ä¹åï¼ä¸ï¼æå¸¸è¦çè¦ç¯åè¡¨ç¤ºå½¢å¼æ¯å°è·ä½ï¼æè²åè¯çµ¡è³è¨æ¾å¨å®ç¨ç表ä¸ï¼å° User 表æä¾å¤é¨ç´¢å¼éµå¼ç¨ï¼å¦ [å 2-1](../img/fig2-1.png) æç¤ºã
* å¾çºç SQL æ¨æºå¢å äºå°çµæ§åè³æåå¥å XML è³æçæ¯æ´ï¼éå
許å°å¤å¼è³æå²åå¨å®è¡å
§ï¼ä¸¦æ¯æ´å¨éäºæä»¶å
§æ¥è©¢åç´¢å¼ãéäºåè½å¨ Oracleï¼IBM DB2ï¼MS SQL Server å PostgreSQL ä¸é½æä¸åç¨åº¦çæ¯æ´ã6,7ããJSON è³æåå¥ä¹å¾å°å¤åæ¸æåº«çæ¯æ´ï¼å
æ¬ IBM DB2ï¼MySQL å PostgreSQL ã8ãã
* 第ä¸ç¨®é¸ææ¯å°è·æ¥ï¼æè²åè¯çµ¡è³è¨ç·¨ç¢¼çº JSON æ XML æä»¶ï¼å°å
¶å²åå¨è³æåº«çæååä¸ï¼ä¸¦è®æç¨ç¨å¼è§£æå
¶çµæ§åå
§å®¹ãé種é
ç½®ä¸ï¼é常ä¸è½ä½¿ç¨è³æåº«ä¾æ¥è©¢è©²ç·¨ç¢¼åä¸çå¼ã
å°æ¼ä¸ååç°¡æ·é樣èªå
嫿件çè³æçµæ§èè¨ï¼JSON 表示æ¯é常åé©çï¼è«åé± [ä¾ 2-1]()ãJSON æ¯ XML æ´ç°¡å®ãé¢åæä»¶çè³æåº«ï¼å¦ MongoDB ã9ãï¼RethinkDB ã10ãï¼CouchDB ã11ãå Espressoã12ãï¼æ¯æ´éç¨®è³ææ¨¡åã
**ä¾ 2-1. ç¨ JSON æä»¶è¡¨ç¤ºä¸å LinkedIn ç°¡ä»**
```json
{
"user_id": 251,
"first_name": "Bill",
"last_name": "Gates",
"summary": "Co-chair of the Bill & Melinda Gates... Active blogger.",
"region_id": "us:91",
"industry_id": 131,
"photo_url": "/p/7/000/253/05b/308dd6e.jpg",
"positions": [
{
"job_title": "Co-chair",
"organization": "Bill & Melinda Gates Foundation"
},
{
"job_title": "Co-founder, Chairman",
"organization": "Microsoft"
}
],
"education": [
{
"school_name": "Harvard University",
"start": 1973,
"end": 1975
},
{
"school_name": "Lakeside School, Seattle",
"start": null,
"end": null
}
],
"contact_info": {
"blog": "http://thegatesnotes.com",
"twitter": "http://twitter.com/BillGates"
}
}
```
æä¸äºéç¼äººå¡èªçº JSON æ¨¡åæ¸å°äºæç¨ç¨å¼ç¨å¼ç¢¼åå²å層ä¹éç黿ä¸å¹é
ãä¸éï¼æ£å¦æåå°å¨ [第åç« ](ch4.md) ä¸çå°ç飿¨£ï¼JSON ä½çºè³æç·¨ç¢¼æ ¼å¼ä¹åå¨åé¡ãç¡æ¨¡å¼å° JSON 模åä¾èªªå¾å¾è¢«èªçºæ¯ä¸ååªå¢ï¼æåå°å¨ â[æä»¶æ¨¡åä¸ç模å¼éæ´»æ§](#æä»¶æ¨¡åä¸ç模å¼éæ´»æ§)â ä¸è¨è«éååé¡ã
JSON è¡¨ç¤ºæ¯ [å 2-1](../img/fig2-1.png) ä¸çå¤è¡¨æ¨¡å¼å
·ææ´å¥½ç **ååæ§ï¼localityï¼**ã妿å¨åé¢çéä¿å示ä¾ä¸ç²åç°¡ä»ï¼é£éè¦å·è¡å¤åæ¥è©¢ï¼éé `user_id` æ¥è©¢æ¯å表ï¼ï¼æè
å¨ User 表èå
¶ä¸å±¬è¡¨ä¹éæ··äºå°å·è¡å¤è·¯é£ç·ãèå¨ JSON 表示ä¸ï¼ææç¸éè³è¨é½å¨åä¸åå°æ¹ï¼ä¸åæ¥è©¢å°±è¶³å¤ äºã
å¾ä½¿ç¨è
ç°¡ä»æªæ¡å°ä½¿ç¨è
è·ä½ï¼æè²æ·å²åè¯çµ¡è³è¨ï¼é種ä¸å°å¤éä¿é±å«äºè³æä¸çä¸å樹ççµæ§ï¼è JSON 表示使å¾é忍¹ççµæ§è®å¾æç¢ºï¼è¦ [å 2-2](../img/fig2-2.png)ï¼ã

**å 2-2 ä¸å°å¤éä¿æ§å»ºäºä¸åæ¨¹çµæ§**
### å¤å°ä¸åå¤å°å¤çéä¿
å¨ä¸ä¸ç¯ç [ä¾ 2-1]() ä¸ï¼`region_id` å `industry_id` æ¯ä»¥ IDï¼è䏿¯ç´å串 âGreater Seattle Areaâ å âPhilanthropyâ çå½¢å¼çµ¦åºçãçºä»éº¼ï¼
å¦æä½¿ç¨è
ä»é¢ç¨ä¸åèªç±æåæ¬ä½ä¾è¼¸å
¥åååè¡æ¥ï¼é£éº¼å°ä»åå²åçºç´æåå串æ¯åççãå¦ä¸æ¹å¼æ¯çµ¦åºå°çåååè¡æ¥çæ¨æºåçå表ï¼ä¸¦è®ä½¿ç¨è
å¾ä¸æå表æèªåå¡«å
å¨ä¸é²è¡é¸æï¼å
¶åªå¢å¦ä¸ï¼
* ååç°¡ä»ä¹é樣å¼åæ¼å¯«çµ±ä¸
* é¿å
æ§ç¾©ï¼ä¾å¦ï¼å¦ææå¹¾åååçåå¸ï¼
* ææ¼æ´æ° ââ å稱åªå²åå¨ä¸åå°æ¹ï¼å¦æéè¦æ´æ¹ï¼ä¾å¦ï¼ç±æ¼æ¿æ²»äºä»¶èæ¹è®åå¸å稱ï¼ï¼å¾å®¹æé²è¡å
¨é¢æ´æ°ã
* æ¬å°åæ¯æ´ ââ ç¶ç¶²ç«ç¿»è¯æå
¶ä»èªè¨æï¼æ¨æºåçå表å¯ä»¥è¢«æ¬å°åï¼ä½¿å¾å°ååè¡æ¥å¯ä»¥ä½¿ç¨ä½¿ç¨è
çèªè¨ä¾é¡¯ç¤º
* æ´å¥½çæå° ââ ä¾å¦ï¼æå°è¯çé å·çæ
åå®¶å°±æå¹é
é份簡ä»ï¼å çºå°åå表å¯ä»¥ç·¨ç¢¼è¨é西é
åå¨è¯çé éä¸äºå¯¦ï¼å¾ âGreater Seattle Areaâ éåå串ä¸çä¸åºä¾ï¼
å²å ID 鿝æåå串ï¼éæ¯å **坿¬ï¼duplicationï¼** åé¡ãç¶ä½¿ç¨ ID æï¼å°äººé¡ææç¾©çè³è¨ï¼æ¯å¦å®è©ï¼Philanthropyï¼åªå²åå¨ä¸èï¼ææå¼ç¨å®çå°æ¹ä½¿ç¨ IDï¼ID åªå¨è³æåº«ä¸ææç¾©ï¼ãç¶ç´æ¥å²åæåæï¼å°äººé¡ææç¾©çè³è¨æè¤è£½å¨æ¯è使ç¨è¨éä¸ã
ä½¿ç¨ ID çå¥½èæ¯ï¼ID å°äººé¡æ²æä»»ä½æç¾©ï¼å èæ°¸é ä¸éè¦æ¹è®ï¼ID å¯ä»¥ä¿æä¸è®ï¼å³ä½¿å®æ¨èçè³è¨ç¼çè®åãä»»ä½å°äººé¡ææç¾©çæ±è¥¿é½å¯è½éè¦å¨å°ä¾æåæåæ¹è® ââ 妿éäºè³è¨è¢«è¤å¶ï¼ææçåé¤å¯æ¬é½éè¦æ´æ°ãéæå°è´å¯«å
¥éé·ï¼ä¹åå¨ä¸ä¸è´ç風éªï¼ä¸äºå¯æ¬è¢«æ´æ°äºï¼éæäºå¯æ¬æ²æè¢«æ´æ°ï¼ãå»é¤æ¤é¡éè¤æ¯è³æåº« **è¦ç¯åï¼normalizationï¼** çé鵿æ³ã[^ii]
[^ii]: éæ¼éä¿æ¨¡åçæç»ååäºå¹¾ç¨®ä¸åçè¦ç¯å½¢å¼ï¼ä½éäºåå¥å¹¾ä¹æ²æå¯¦éæç¾©ãä¸åç¶é©æ³åæ¯ï¼å¦æéè¤å²åäºå¯ä»¥å²åå¨ä¸åå°æ¹çå¼ï¼å模å¼å°±ä¸æ¯ **è¦ç¯åï¼normalizedï¼** çã
> è³æåº«ç®¡çå¡åéç¼äººå¡åæ¡çè«è¦ç¯ååéè¦ç¯åï¼è®æåæ«æä¿ç夿·å§ã卿¬æ¸ç [第ä¸é¨å](part-iii.md)ï¼æåå°åå°éå話é¡ï¼æ¢è¨ç³»çµ±çæ¹æ³ç¨ä»¥èçå¿«åï¼éè¦ç¯ååè¡çè³æã
ä¸å¹¸çæ¯ï¼å°éäºè³æé²è¡è¦ç¯åéè¦å¤å°ä¸çéä¿ï¼è¨±å¤äººçæ´»å¨ä¸åç¹å®çå°åï¼è¨±å¤äººå¨ä¸åç¹å®çè¡æ¥å·¥ä½ï¼ï¼éèæä»¶æ¨¡åä¸å¤ªå»åãå¨éä¿è³æåº«ä¸ï¼éé ID ä¾å¼ç¨å
¶ä»è¡¨ä¸çè¡æ¯æ£å¸¸çï¼å çºé£ç·å¾å®¹æãå¨æä»¶è³æåº«ä¸ï¼ä¸å°å¤æ¨¹çµæ§æ²æå¿
è¦ç¨é£ç·ï¼å°é£ç·çæ¯æ´é常å¾å¼± [^iii]ã
[^iii]: 卿°å¯«æ¬ææï¼RethinkDB æ¯æ´é£ç·ï¼MongoDB 䏿¯æ´é£ç·ï¼è CouchDB åªæ¯æ´é å
宣åçæª¢è¦ã
å¦æè³æåº«æ¬èº«ä¸æ¯æ´é£ç·ï¼åå¿
é 卿ç¨ç¨å¼ç¨å¼ç¢¼ä¸ééå°è³æåº«é²è¡å¤åæ¥è©¢ä¾æ¨¡æ¬é£ç·ãï¼å¨é種æ
æ³ä¸ï¼å°ååè¡æ¥çå表å¯è½å¾å°ï¼æ¹åå¾å°ï¼æç¨ç¨å¼å¯ä»¥ç°¡å®å°å°å
¶å²åå¨è¨æ¶é«ä¸ãä¸éï¼å·è¡é£ç·çå·¥ä½å¾è³æåº«è¢«è½ç§»å°æç¨ç¨å¼ç¨å¼ç¢¼ä¸ãï¼
æ¤å¤ï¼å³ä¾¿æç¨ç¨å¼çæåçæ¬é©åç¡é£ç·çæä»¶æ¨¡åï¼é¨èåè½æ°å¢å°æç¨ç¨å¼ä¸ï¼è³ææè®å¾æ´å äºè¯ãä¾å¦ï¼èæ
®ä¸ä¸å°ç°¡æ·ä¾åé²è¡çä¸äºä¿®æ¹ï¼
* çµç¹å叿 ¡ä½çºå¯¦é«
å¨åé¢çæè¿°ä¸ï¼`organization`ï¼ä½¿ç¨è
å·¥ä½çå
¬å¸ï¼å `school_name`ï¼ä»åå¸ç¿çå°æ¹ï¼åªæ¯å串ãä¹è¨±ä»åæè©²æ¯å°å¯¦é«çå¼ç¨å¢ï¼ç¶å¾ï¼æ¯åçµç¹ã叿 ¡æå¤§å¸é½å¯ä»¥ææèªå·±çç¶²é ï¼æ¨èãæ°èæè¦çï¼ãæ¯åç°¡æ·å¯ä»¥é£çµå°å®ææå°ççµç¹å叿 ¡ï¼ä¸¦ä¸å
æ¬ä»åçå示åå
¶ä»è³è¨ï¼è«åé± [å 2-3](../img/fig2-3.png)ï¼ä¾èª LinkedIn çä¸åä¾åï¼ã
* æ¨è¦
åè¨ä½ æ³æ°å¢ä¸åæ°çåè½ï¼ä¸å使ç¨è
å¯ä»¥çºå¦ä¸å使ç¨è
寫ä¸åæ¨è¦ãå¨ä½¿ç¨è
çç°¡æ·ä¸é¡¯ç¤ºæ¨è¦ï¼ä¸¦é䏿¨è¦ä½¿ç¨è
çå§ååç
§çã妿æ¨è¦äººæ´æ°ä»åçç
§çï¼é£ä»å寫ç任使¨è¦é½éè¦é¡¯ç¤ºæ°çç
§çãå æ¤ï¼æ¨è¦æè©²ææä½è
å人簡ä»çå¼ç¨ã

**å 2-3 å
¬å¸åä¸å
æ¯å串ï¼éæ¯ä¸åæåå
¬å¸å¯¦é«çé£çµï¼LinkedIn æªåï¼**
[å 2-4](../img/fig2-4.png) 顿çéäºæ°åè½éè¦å¦ä½ä½¿ç¨å¤å°å¤éä¿ãæ¯åèç·ç©å½¢å
§çè³æå¯ä»¥åçµæä¸åæä»¶ï¼ä½æ¯å°å®ä½ï¼å¸æ ¡åå
¶ä»ä½¿ç¨è
çå¼ç¨éè¦è¡¨ç¤ºæå¼ç¨ï¼ä¸¦ä¸å¨æ¥è©¢æéè¦é£ç·ã

**å 2-4 使ç¨å¤å°å¤éä¿æ´å
å¥ä»¶ç°¡æ·**
### æä»¶è³æåº«æ¯å¦å¨éè¹è¦è½ï¼
å¨å¤å°å¤çéä¿åé£ç·å·²å¸¸è¦ç¨å¨éä¿è³æåº«æï¼æä»¶è³æåº«å NoSQL éåäºè¾¯è«ï¼å¦ä½ä»¥æä½³æ¹å¼å¨è³æåº«ä¸è¡¨ç¤ºå¤å°å¤éä¿ãé£å ´è¾¯è«å¯æ¯ NoSQL å¤èå¾å¤ï¼äºå¯¦ä¸ï¼ææ©å¯ä»¥è¿½æº¯å°è¨ç®æ©åè³æåº«ç³»çµ±ã
20 ä¸ç´ 70 年代æåæ¡è¿çæ¥åè³æèçè³æåº«æ¯ IBM çè³è¨ç®¡ç系統ï¼IMSï¼ï¼æåæ¯çºäºé¿æ³¢ç¾
太空è¨åç庫å管çèéç¼çï¼ä¸¦æ¼ 1968 å¹´æäºé¦æ¬¡åæ¥éåºã13ããç®åå®ä»å¨ä½¿ç¨åç¶è·ï¼å·è¡å¨ IBM 大忩ç OS/390 ä¸ã14ãã
IMS çè¨è¨ä¸ä½¿ç¨äºä¸åç¸ç¶ç°¡å®çè³ææ¨¡åï¼ç¨±çº **層次模åï¼hierarchical modelï¼**ï¼å®èæä»¶è³æåº«ä½¿ç¨ç JSON 模åæä¸äºé©äººçç¸ä¼¼ä¹èã2ããå®å°ææè³æè¡¨ç¤ºçºå·¢çå¨è¨éä¸çè¨é樹ï¼éå¾å [å 2-2](../img/fig2-2.png) ç JSON çµæ§ã
åææªè³æåº«ä¸æ¨£ï¼IMS è½è¯å¥½èçä¸å°å¤çéä¿ï¼ä½æ¯å¾é£æå°å¤å°å¤çéä¿ï¼ä¸¦ä¸ä¸æ¯æ´é£ç·ãéç¼äººå¡å¿
é æ±ºå®æ¯å¦è¤è£½ï¼éè¦ç¯åï¼è³æææå解決å¾ä¸åè¨éå°å¦ä¸åè¨éçå¼ç¨ãéäºäºåä¸ç´å
ä¸å年代çåé¡èç¾å¨éç¼äººå¡éå°çæä»¶è³æåº«åé¡é常ç¸ä¼¼ã15ãã
é£æäººåæåºäºå種ä¸åçè§£æ±ºæ¹æ¡ä¾è§£æ±ºå±¤æ¬¡æ¨¡åçä¾·éæ§ãå
¶ä¸æçªåºçå
©åæ¯ **éä¿æ¨¡å**ï¼relational modelï¼å®è®æäº SQLï¼ä¸¦çµ±æ²»äºä¸çï¼å **ç¶²çæ¨¡å**ï¼network modelï¼æåå¾åéæ³¨ï¼ä½æçµè®å¾å·éï¼ãéå
©åé£çä¹éç â大辯è«â å¨ 70 年代æçºäºå¾ä¹
æéã2ãã
é£å
©å模å¼è§£æ±ºçåé¡èç¶åçåé¡ç¸éï¼å æ¤å¼å¾ç°¡è¦åé¡§ä¸ä¸é£å ´è¾¯è«ã
#### ç¶²çæ¨¡å
ç¶²çæ¨¡åç±ä¸å稱çºè³æç³»çµ±èªè¨æè°ï¼CODASYLï¼çå§å¡æé²è¡äºæ¨æºåï¼ä¸¦è¢«æ¸åä¸åçè³æåº«å» å實ç¾ï¼å®ä¹è¢«ç¨±çº CODASYL 模åã16ãã
CODASYL æ¨¡åæ¯å±¤æ¬¡æ¨¡åçæ¨å»£ãå¨å±¤æ¬¡æ¨¡åçæ¨¹çµæ§ä¸ï¼æ¯æ¢è¨éåªæä¸åç¶ç¯é»ï¼å¨ç¶²è·¯æ¨¡å¼ä¸ï¼æ¯æ¢è¨éå¯è½æå¤åç¶ç¯é»ãä¾å¦ï¼âGreater Seattle Areaâ å°åå¯è½æ¯ä¸æ¢è¨éï¼æ¯åå±
ä½å¨è©²å°åç使ç¨è
é½å¯ä»¥èä¹ç¸éè¯ãéå
許å°å¤å°ä¸åå¤å°å¤çéä¿é²è¡å»ºæ¨¡ã
ç¶²çæ¨¡åä¸è¨éä¹éçé£çµä¸æ¯å¤é¨ç´¢å¼éµï¼èæ´åç¨å¼èªè¨ä¸çææ¨ï¼åæä»ç¶å²åå¨ç£ç¢ä¸ï¼ã訪åè¨éçå¯ä¸æ¹æ³æ¯è·é¨å¾æ ¹è¨é起沿éäºéè·¯æå½¢æçè·¯å¾ãéè¢«ç¨±çº **訪åè·¯å¾ï¼access pathï¼**ã
æç°¡å®çæ
æ³ä¸ï¼è¨ªåè·¯å¾é¡ä¼¼éæ·é£çµä¸²åï¼å¾å表é éå§ï¼æ¯æ¬¡æª¢è¦ä¸æ¢è¨éï¼ç´å°æ¾å°æéçè¨éãä½å¨å¤å°å¤éä¿çæ
æ³ä¸ï¼æ¸æ¢ä¸åçè·¯å¾å¯ä»¥å°éç¸åçè¨éï¼ç¶²ç模åçç¨å¼è¨è¨å¸«å¿
é è·è¹¤éäºä¸åç訪åè·¯å¾ã
CODASYL ä¸çæ¥è©¢æ¯ééå©ç¨éæ·è¨éååè·é¨è¨ªåè·¯å¾è¡¨å¨è³æåº«ä¸ç§»å鿍ä¾å·è¡çã妿è¨éæå¤åç¶çµé»ï¼å³å¤åä¾èªå
¶ä»è¨éçå³å
¥ææ¨ï¼ï¼åæç¨ç¨å¼ç¨å¼ç¢¼å¿
é è·è¹¤ææçå種éä¿ãçè³ CODASYL å§å¡ææå¡ä¹æ¿èªï¼éå°±åå¨ n ç¶è³æç©ºéä¸é²è¡å°èªã17ãã
å管æåé¸æè¨ªåè·¯å¾è½å¤ æææå°å©ç¨ 20 ä¸ç´ 70 年代é常æéç硬é«åè½ï¼å¦ç£å¸¶é©
åå¨ï¼å
¶æå°é度é常æ
¢ï¼ï¼ä½éä½¿å¾æ¥è©¢åæ´æ°è³æåº«çç¨å¼ç¢¼è®å¾è¤éä¸éæ´»ãç¡è«æ¯åå±¤éæ¯ç¶²çæ¨¡åï¼å¦æä½ æ²ææéè³æçè·¯å¾ï¼å°±æé·å
¥å°å¢ãä½ å¯ä»¥æ¹è®è¨ªåè·¯å¾ï¼ä½æ¯å¿
é ç覽大éæå¯«è³æåº«æ¥è©¢ç¨å¼ç¢¼ï¼ä¸¦é寫ä¾èçæ°ç訪åè·¯å¾ãæ´æ¹æç¨ç¨å¼çè³ææ¨¡åæ¯å¾é£çã
#### éä¿æ¨¡å
ç¸æ¯ä¹ä¸ï¼éä¿æ¨¡ååçå°±æ¯å°ææçè³ææ¾å¨å
天忥ä¹ä¸ï¼ä¸å **éä¿ï¼è¡¨ï¼** åªæ¯ä¸å **å
çµï¼è¡ï¼** çéåï¼å
æ¤èå·²ãå¦æä½ æ³è®åè³æï¼å®æ²æè¿·å®®ä¼¼çå·¢ççµæ§ï¼ä¹æ²æè¤éç訪åè·¯å¾ãä½ å¯ä»¥é¸ä¸ç¬¦åä»»ææ¢ä»¶çè¡ï¼è®å表ä¸ç任使ææè¡ãä½ å¯ä»¥ééæå®æäºåä½çºå¹é
ééµåä¾è®åç¹å®è¡ãä½ å¯ä»¥å¨ä»»ä½è¡¨ä¸æå
¥ä¸åæ°çè¡ï¼èä¸å¿
æå¿èå
¶ä»è¡¨çå¤é¨ç´¢å¼éµéä¿ [^iv]ã
[^iv]: å¤é¨ç´¢å¼éµç´æå
許å°ä¿®æ¹é²è¡éå¶ï¼ä½å°æ¼éä¿æ¨¡åé䏦䏿¯å¿
é¸é
ãå³ä½¿æç´æï¼å¤é¨ç´¢å¼éµé£ç·å¨æ¥è©¢æå·è¡ï¼èå¨ CODASYL ä¸ï¼é£ç·å¨æå
¥æé«æå®æã
å¨éä¿è³æåº«ä¸ï¼æ¥è©¢æä½³åå¨èªåæ±ºå®æ¥è©¢çåªäºé¨å以åªåé åºå·è¡ï¼ä»¥å使ç¨åªäºç´¢å¼ãéäºé¸æå¯¦é䏿¯ â訪åè·¯å¾âï¼ä½æå¤§çåå¥å¨æ¼å®åæ¯ç±æ¥è©¢æä½³åå¨èªåçæçï¼è䏿¯ç±ç¨å¼è¨è¨å¸«çæï¼æä»¥æåå¾å°éè¦èæ
®å®åã
å¦ææ³ææ°çæ¹å¼æ¥è©¢è³æï¼ä½ å¯ä»¥å®£åä¸åæ°çç´¢å¼ï¼æ¥è©¢æèªåä½¿ç¨æåé©çé£äºç´¢å¼ãç¡éæ´æ¹æ¥è©¢ä¾å©ç¨æ°çç´¢å¼ï¼è«åé± â[è³ææ¥è©¢èªè¨](#è³ææ¥è©¢èªè¨)âï¼ãéä¿æ¨¡åå æ¤ä½¿æ°å¢æç¨ç¨å¼æ°åè½è®å¾æ´å 容æã
éä¿è³æåº«çæ¥è©¢æä½³å卿¯è¤éçï¼å·²èè²»äºå¤å¹´çç ç©¶åéç¼ç²¾åã18ããéä¿æ¨¡åçä¸åé鵿´å¯æ¯ï¼åªéæ§å»ºä¸æ¬¡æ¥è©¢æä½³åå¨ï¼é¨å¾ä½¿ç¨è©²è³æåº«çæææç¨ç¨å¼é½å¯ä»¥å¾ä¸åçãå¦æä½ æ²ææ¥è©¢æä½³åå¨ç話ï¼é£éº¼çºç¹å®æ¥è©¢æå編寫訪åè·¯å¾æ¯ç·¨å¯«éç¨æä½³å卿´å®¹æ ââ ä¸éå¾é·æçéç¨è§£æ±ºæ¹æ¡æ´å¥½ã
#### èæä»¶è³æåº«ç¸æ¯
å¨ä¸åæ¹é¢ï¼æä»¶è³æåº«éåçºå±¤æ¬¡æ¨¡åï¼å¨å
¶ç¶è¨éä¸å²åå·¢çè¨éï¼[å 2-1](../img/fig2-1.png) ä¸çä¸å°å¤éä¿ï¼å¦ `positions`ï¼`education` å `contact_info`ï¼ï¼è䏿¯å¨å®ç¨ç表ä¸ã
使¯ï¼å¨è¡¨ç¤ºå¤å°ä¸åå¤å°å¤çéä¿æï¼éä¿è³æåº«åæä»¶è³æåº«ä¸¦æ²ææ ¹æ¬çä¸åï¼å¨éå
©ç¨®æ
æ³ä¸ï¼ç¸éå°æ¡é½è¢«ä¸åå¯ä¸çèå¥ç¬¦èå¼ç¨ï¼éåèå¥ç¬¦èå¨éä¿æ¨¡åä¸è¢«ç¨±çº **å¤é¨ç´¢å¼éµ**ï¼å¨æä»¶æ¨¡åä¸ç¨±çº **æä»¶å¼ç¨**ã9ãã該èå¥ç¬¦èå¨è®åæééé£ç·æå¾çºæ¥è©¢ä¾è§£æãè¿ä»çºæ¢ï¼æä»¶è³æåº«æ²æèµ° CODASYL çèè·¯ã
### éä¿åè³æåº«èæä»¶è³æåº«å¨ä»æ¥çå°æ¯
å°éä¿è³æåº«èæä»¶è³æåº«é²è¡æ¯è¼æï¼å¯ä»¥èæ
®è¨±å¤æ¹é¢çå·®ç°ï¼å
æ¬å®åç容é¯å±¬æ§ï¼è«åé± [第äºç« ](ch5.md)ï¼åèçä½µç¼æ§ï¼è«åé± [第ä¸ç« ](ch7.md)ï¼ãæ¬ç« å°åªéæ³¨è³ææ¨¡åä¸çå·®ç°ã
æ¯æ´æä»¶è³ææ¨¡åç主è¦è«ææ¯æ¶æ§éæ´»æ§ï¼å ååæ§èæææ´å¥½çæè½ï¼ä»¥åå°æ¼æäºæç¨ç¨å¼èè¨æ´æ¥è¿æ¼æç¨ç¨å¼ä½¿ç¨çè³æçµæ§ãéä¿æ¨¡åééçºé£ç·æä¾æ´å¥½çæ¯æ´ä»¥åæ¯æ´å¤å°ä¸åå¤å°å¤çéä¿ä¾åæã
#### åªç¨®è³ææ¨¡åæ´æå©æ¼ç°¡åæç¨ç¨å¼ç¢¼ï¼
妿æç¨ç¨å¼ä¸çè³æå
·æé¡ä¼¼æä»¶ççµæ§ï¼å³ï¼ä¸å°å¤éä¿æ¨¹ï¼é叏䏿¬¡æ§è¼å
¥æ´å樹ï¼ï¼é£éº¼ä½¿ç¨æä»¶æ¨¡åå¯è½æ¯ä¸å好主æãå°é¡ä¼¼æä»¶ççµæ§åè§£æå¤å表ï¼å¦ [å 2-1](../img/fig2-1.png) ä¸ç `positions`ã`education` å `contact_info`ï¼çéä¿æè¡å¯è½å°è´ç¹ç£ç模å¼åä¸å¿
è¦çè¤éçæç¨ç¨å¼ç¨å¼ç¢¼ã
æä»¶æ¨¡åæä¸å®çä¾·éæ§ï¼ä¾å¦ï¼ä¸è½ç´æ¥å¼ç¨æä»¶ä¸çå·¢ççå°æ¡ï¼èæ¯éè¦èªª â使ç¨è
251 çä½ç½®å表ä¸ç第äºé
âï¼å¾å層次模åä¸ç訪åè·¯å¾ï¼ã使¯ï¼åªè¦æªæ¡å·¢çä¸å¤ªæ·±ï¼éé叏䏿¯åé¡ã
æä»¶è³æåº«å°é£ç·çç³ç³æ¯æ´å¯è½æ¯ååé¡ï¼ä¹å¯è½ä¸æ¯åé¡ï¼éåæ±ºæ¼æç¨ç¨å¼ãä¾å¦ï¼å¦ææåæåæç¨ç¨å¼ä½¿ç¨ä¸åæä»¶è³æåº«ä¾è¨é使ä½å°ç¼çäºä½äºï¼é£éº¼å¤å°å¤éä¿å¯è½æ°¸é ä¹ç¨ä¸ä¸ãã19ãã
ä½å¦æä½ çæç¨ç¨å¼ç¢ºå¯¦æç¨å°å¤å°å¤éä¿ï¼é£éº¼æä»¶æ¨¡åå°±æ²æé£éº¼èªäººäºãå管å¯ä»¥ééåè¦ç¯å便¶é¤å°é£ç·çéæ±ï¼ä½ééè¦æç¨ç¨å¼ç¨å¼ç¢¼ä¾åé¡å¤çå·¥ä½ä»¥ç¢ºä¿è³æä¸è´æ§ãå管æç¨ç¨å¼ç¨å¼ç¢¼å¯ä»¥ééåè³æåº«ç¼åºå¤åè«æ±çæ¹å¼ä¾æ¨¡æ¬é£ç·ï¼ä½éä¹å°è¤éæ§è½ç§»å°æç¨ç¨å¼ä¸ï¼èä¸é叏乿æ¯ç±è³æåº«å
§çå°ç¨ç¨å¼ç¢¼æ´æ
¢ãå¨é種æ
æ³ä¸ï¼ä½¿ç¨æä»¶æ¨¡åå¯è½æå°è´æ´è¤éçæç¨ç¨å¼ç¢¼èæ´å·®çæè½ã15ãã
æåæ²æè¾¦æ³èªªåªç¨®è³ææ¨¡åæ´æå©æ¼ç°¡åæç¨ç¨å¼ç¢¼ï¼å çºå®å決æ¼è³æé
ä¹éçéä¿ç¨®é¡ãå°é«åº¦éè¯çè³æèè¨ï¼æä»¶æ¨¡åæ¯æ¥µå
¶ç³ç³çï¼éä¿æ¨¡åæ¯å¯ä»¥æ¥åçï¼èé¸ç¨å形模åï¼è«åé± â[åè³ææ¨¡å](#åè³ææ¨¡å)âï¼æ¯æèªç¶çã
#### æä»¶æ¨¡åä¸ç模å¼éæ´»æ§
大夿¸æä»¶è³æåº«ä»¥åéä¿è³æåº«ä¸ç JSON æ¯æ´é½ä¸æå¼·å¶æä»¶ä¸çè³ææ¡ç¨ä½ç¨®æ¨¡å¼ãéä¿è³æåº«ç XML æ¯æ´é常帶æå¯é¸ç模å¼é©èãæ²ææ¨¡å¼æå³èå¯ä»¥å°ä»»æçéµå弿°å¢å°æä»¶ä¸ï¼ä¸¦ä¸ç¶è®åæï¼å®¢æ¶ç«¯ç¡æ³ä¿èæä»¶å¯è½å
å«çæ¬ä½ã
æä»¶è³æåº«ææç¨±çº **ç¡æ¨¡å¼ï¼schemalessï¼**ï¼ä½éå
·æèª¤å°æ§ï¼å çºè®åè³æçç¨å¼ç¢¼é常åå®æç¨®çµæ§ ââ å³åå¨é±å¼æ¨¡å¼ï¼ä½ä¸ç±è³æåº«å¼·å¶å·è¡ã20ããä¸åæ´ç²¾ç¢ºçè¡èªæ¯ **è®ææ¨¡å¼**ï¼å³ schema-on-readï¼è³æççµæ§æ¯é±å«çï¼åªæå¨è³æè¢«è®åææè¢«è§£éï¼ï¼ç¸æçæ¯ **å¯«ææ¨¡å¼**ï¼å³ schema-on-writeï¼å³çµ±çéä¿è³æåº«æ¹æ³ä¸ï¼æ¨¡å¼æç¢ºï¼ä¸è³æåº«ç¢ºä¿ææçè³æé½ç¬¦åå
¶æ¨¡å¼ï¼ã21ãã
è®ææ¨¡å¼é¡ä¼¼æ¼ç¨å¼èªè¨ä¸çåæ
ï¼å·è¡æï¼å奿ª¢æ¥ï¼èå¯«ææ¨¡å¼é¡ä¼¼æ¼éæ
ï¼ç·¨è¯æï¼å奿ª¢æ¥ãå°±åéæ
ååæ
å奿ª¢æ¥çç¸å°åªé»å
·æå¾å¤§ççè°æ§ä¸æ¨£ã22ãï¼è³æåº«ä¸æ¨¡å¼çå¼·å¶æ§æ¯ä¸åå
·æçè°ç話é¡ï¼ä¸è¬ä¾èªªæ²ææ£ç¢ºæé¯èª¤ççæ¡ã
卿ç¨ç¨å¼æ³è¦æ¹è®å
¶è³ææ ¼å¼çæ
æ³ä¸ï¼éäºæ¹æ³ä¹éçåå¥å°¤å
¶æé¡¯ãä¾å¦ï¼åè¨ä½ ææ¯å使ç¨è
çå
¨åå²åå¨ä¸åæ¬ä½ä¸ï¼èç¾å¨æ³åå¥å²ååååå§æ°ã23ããå¨æä»¶è³æåº«ä¸ï¼åªééå§å¯«å
¥å
·ææ°æ¬ä½çæ°æä»¶ï¼ä¸¦å¨æç¨ç¨å¼ä¸ä½¿ç¨ç¨å¼ç¢¼ä¾èçè®åèæä»¶çæ
æ³ãä¾å¦ï¼
```go
if (user && user.name && !user.first_name) {
// Documents written before Dec 8, 2013 don't have first_name
user.first_name = user.name.split(" ")[0];
}
```
å¦ä¸æ¹é¢ï¼å¨ âéæ
åå¥â è³æåº«æ¨¡å¼ä¸ï¼é常æå·è¡ä»¥ä¸ **é·ç§»ï¼migrationï¼** æä½ï¼
```sql
ALTER TABLE users ADD COLUMN first_name text;
UPDATE users SET first_name = split_part(name, ' ', 1); -- PostgreSQL
UPDATE users SET first_name = substring_index(name, ' ', 1); -- MySQL
```
模å¼è®æ´çéåº¦å¾æ
¢ï¼èä¸è¦æ±åéãå®çé種å£åè½ä¸¦ä¸æ¯å®å
¨æå¾çï¼å¤§å¤æ¸éä¿è³æåº«ç³»çµ±å¯å¨å¹¾æ¯«ç§å
§å·è¡ `ALTER TABLE` èªå¥ãMySQL æ¯ä¸åå¼å¾æ³¨æçä¾å¤ï¼å®å·è¡ `ALTER TABLE` ææè¤è£½æ´å表ï¼éå¯è½æå³è卿´æ¹ä¸å大å表ææè±è²»å¹¾åéçè³å¹¾åå°æçåæ©æéï¼å管åå¨å種工å
·ä¾è§£æ±ºéåéå¶ã24,25,26ãã
大å表ä¸å·è¡ `UPDATE` èªå¥å¨ä»»ä½è³æåº«ä¸é½å¯è½æå¾æ
¢ï¼å çºæ¯ä¸è¡é½éè¦é寫ãè¦æ¯ä¸å¯æ¥åçè©±ï¼æç¨ç¨å¼å¯ä»¥å° `first_name` è¨å®çºé è¨å¼ `NULL`ï¼ä¸¦å¨è®åæåå¡«å
ï¼å°±åä½¿ç¨æä»¶è³æåº«ä¸æ¨£ã
ç¶ç±æ¼æç¨®åå ï¼ä¾å¦ï¼è³ææ¯ç°æ§çï¼éåä¸çå°æ¡ä¸¦ä¸é½å
·æç¸åççµæ§æï¼è®ææ¨¡å¼æ´å
·åªå¢ãä¾å¦ï¼å¦æï¼
* åå¨è¨±å¤ä¸ååå¥çç©ä»¶ï¼å°æ¯ç¨®åå¥çç©ä»¶æ¾å¨èªå·±çè¡¨ä¸æ¯ä¸ç¾å¯¦çã
* è³æççµæ§ç±å¤é¨ç³»çµ±æ±ºå®ãä½ ç¡æ³æ§å¶å¤é¨ç³»çµ±ä¸å®é¨æå¯è½è®åã
å¨ä¸è¿°æ
æ³ä¸ï¼æ¨¡å¼çå£èé 大æ¼å®ç幫å©ï¼ç¡æ¨¡å¼æä»¶å¯è½æ¯ä¸åæ´å èªç¶çè³ææ¨¡åã使¯ï¼è¦æ¯ææè¨éé½å
·æç¸åççµæ§ï¼é£éº¼æ¨¡å¼æ¯è¨é並強å¶éç¨®çµæ§çæææ©å¶ã第åç« å°æ´è©³ç´°å°è¨è«æ¨¡å¼åæ¨¡å¼æ¼åã
#### æ¥è©¢çè³æååæ§
æä»¶é常以å®åé£çºå串形å¼é²è¡å²åï¼ç·¨ç¢¼çº JSONãXML æå
¶äºé²ä½å¶è®é«ï¼å¦ MongoDB ç BSONï¼ã妿æç¨ç¨å¼ç¶å¸¸éè¦è¨ªåæ´åæä»¶ï¼ä¾å¦ï¼å°å
¶æ¸²æè³ç¶²é ï¼ï¼é£éº¼å²åååæ§æå¸¶ä¾æè½åªå¢ã妿å°è³æåå²å°å¤å表ä¸ï¼å¦ [å 2-1](../img/fig2-1.png) æç¤ºï¼ï¼åéè¦é²è¡å¤æ¬¡ç´¢å¼æ¥è©¢æè½å°å
¶å
¨é¨æª¢ç´¢åºä¾ï¼éå¯è½éè¦æ´å¤çç£ç¢æ¥è©¢ä¸¦è±è²»æ´å¤çæéã
ååæ§å
å
é©ç¨æ¼åæéè¦æä»¶çµå¤§é¨åå
§å®¹çæ
æ³ãå³ä½¿åªè¨ªåæä»¶å
¶ä¸çä¸å°é¨åï¼è³æåº«é常éè¦è¼å
¥æ´åæä»¶ï¼å°æ¼å¤§åæä»¶ä¾èªªé種è¼å
¥è¡çºæ¯å¾æµªè²»çãæ´æ°æä»¶æï¼é常éè¦æ´åé寫ãåªæä¸æ¹è®æä»¶å¤§å°çä¿®æ¹æå¯ä»¥å®¹æå°åå°å·è¡ãå æ¤ï¼é常建è°ä¿æç¸å°å°çæä»¶ï¼ä¸¦é¿å
å¢å æä»¶å¤§å°ç寫å
¥ã9ããéäºæè½éå¶å¤§å¤§æ¸å°äºæä»¶è³æåº«ç實ç¨å ´æ¯ã
å¼å¾æåºçæ¯ï¼çºäºååæ§èåçµéåç¸éè³æçæ³æ³ä¸¦ä¸ä¾·éæ¼æä»¶æ¨¡åãä¾å¦ï¼Google ç Spanner è³æåº«å¨éä¿è³ææ¨¡å䏿ä¾äºå樣çå忧屬æ§ï¼å
許模å¼å®£åä¸å表çè¡æè©²äº¤é¯ï¼å·¢çï¼å¨ç¶è¡¨å
§ã27ããOracle é¡ä¼¼å°å
許使ç¨ä¸åç¨±çº **å¤è¡¨ç´¢å¼å¢é表ï¼multi-table index cluster tablesï¼** çé¡ä¼¼ç¹æ§ã28ããBigtable è³ææ¨¡åï¼ç¨æ¼ Cassandra å HBaseï¼ä¸ç **åæï¼column-familyï¼** æ¦å¿µè管çååæ§çç®çé¡ä¼¼ã29ãã
å¨ [第ä¸ç« ](ch3.md) å°éæçå°æ´å¤éæ¼ååæ§çå
§å®¹ã
#### æä»¶åéä¿è³æåº«çèå
èª 2000 年代䏿以ä¾ï¼å¤§å¤æ¸éä¿è³æåº«ç³»çµ±ï¼MySQL é¤å¤ï¼é½å·²æ¯æ´ XMLãéå
æ¬å° XML æä»¶é²è¡æ¬å°ä¿®æ¹çåè½ï¼ä»¥åå¨ XML æä»¶ä¸é²è¡ç´¢å¼åæ¥è©¢çåè½ãéå
許æç¨ç¨å¼ä½¿ç¨é£ç¨®èæä»¶è³æåº«æç¶ä½¿ç¨çé常é¡ä¼¼çè³ææ¨¡åã
å¾ 9.3 çæ¬éå§ç PostgreSQL ã8ãï¼å¾ 5.7 çæ¬éå§ç MySQL 以åå¾çæ¬ 10.5 éå§ç IBM DB2ã30ãä¹å° JSON æä»¶æä¾äºé¡ä¼¼çæ¯æ´ç´å¥ãéæ¼ç¨å¨ Web APIs ç JSON æµè¡è¶¨å¢ï¼å
¶ä»éä¿è³æåº«å¾å¯è½æè·é¨ä»åçè
³æ¥ä¸¦æ°å¢ JSON æ¯æ´ã
å¨æä»¶è³æåº«ä¸ï¼RethinkDB å¨å
¶æ¥è©¢èªè¨ä¸æ¯æ´é¡ä¼¼éä¿çé£ç·ï¼ä¸äº MongoDB é©
åç¨å¼å¯ä»¥èªåè§£æè³æåº«å¼ç¨ï¼ææå°å·è¡å®¢æ¶ç«¯é£ç·ï¼å管éå¯è½æ¯å¨è³æåº«ä¸å·è¡çé£ç·æ
¢ï¼éè¦é¡å¤ç網路å¾è¿ï¼ä¸¦ä¸æä½³åæ´å°ï¼ã
é¨èæéçæ¨ç§»ï¼éä¿è³æåº«åæä»¶è³æåº«ä¼¼ä¹è®å¾è¶ä¾è¶ç¸ä¼¼ï¼éæ¯ä¸ä»¶å¥½äºï¼è³ææ¨¡åç¸äºè£å
[^v]ï¼å¦æä¸åæ¸æåº«è½å¤ èçé¡ä¼¼æä»¶çè³æï¼ä¸¦è½å¤ å°å
¶å·è¡éä¿æ¥è©¢ï¼é£éº¼æç¨ç¨å¼å°±å¯ä»¥ä½¿ç¨æç¬¦åå
¶éæ±çåè½çµåã
éä¿æ¨¡ååæä»¶æ¨¡åçæ··åæ¯æªä¾è³æåº«ä¸æ¢å¾å¥½çè·¯ç·ã
[^v]: Codd å°éä¿æ¨¡åã1ãçåå§æè¿°å¯¦éä¸å
許å¨éä¿æ¨¡å¼ä¸è JSON æä»¶é常ç¸ä¼¼ãä»ç¨±ä¹çº **éç°¡å®åï¼nonsimple domainsï¼**ãéåæ³æ³æ¯ï¼ä¸è¡ä¸çå¼ä¸ä¸å®æ¯ä¸å忏åæå䏲䏿¨£çåå§è³æåå¥ï¼ä¹å¯ä»¥æ¯ä¸åå·¢ççéä¿ï¼è¡¨ï¼ï¼å æ¤å¯ä»¥æä¸åä»»æå·¢ççæ¨¹çµæ§ä½çºä¸åå¼ï¼éå¾å 30 年徿°å¢å° SQL ä¸ç JSON æ XML æ¯æ´ã
## è³ææ¥è©¢èªè¨
ç¶å¼å
¥éä¿æ¨¡åæï¼éä¿æ¨¡åå
å«äºä¸ç¨®æ¥è©¢è³æçæ°æ¹æ³ï¼SQL æ¯ä¸ç¨® **宣åå¼** æ¥è©¢èªè¨ï¼è IMS å CODASYL ä½¿ç¨ **å½ä»¤å¼** ç¨å¼ç¢¼ä¾æ¥è©¢è³æåº«ã飿¯ä»éº¼ææï¼
許å¤å¸¸ç¨çç¨å¼èªè¨æ¯å½ä»¤å¼çãä¾å¦ï¼çµ¦å®ä¸ååç©ç©ç¨®çå表ï¼è¿åå表ä¸çé¯éå¯ä»¥é樣寫ï¼
```js
function getSharks() {
var sharks = [];
for (var i = 0; i < animals.length; i++) {
if (animals[i].family === "Sharks") {
sharks.push(animals[i]);
}
}
return sharks;
}
```
èå¨éä¿ä»£æ¸ä¸ï¼ä½ å¯ä»¥é樣寫ï¼
$$
sharks = \sigma_{family = "sharks"}(animals)
$$
å
¶ä¸ $\sigma$ï¼å¸èåæ¯è¥¿æ ¼çªï¼æ¯é¸æéç®åï¼åªè¿å符å `family="shark"` æ¢ä»¶çåç©ã
å®ç¾© SQL æï¼å®ç·å¯å°éµå¾ªéä¿ä»£æ¸ççµæ§ï¼
```sql
SELECT * FROM animals WHERE family ='Sharks';
```
å½ä»¤å¼èªè¨å訴è¨ç®æ©ä»¥ç¹å®é åºå·è¡æäºæä½ãå¯ä»¥æ³è±¡ä¸ä¸ï¼éè¡å°éæ·ç¨å¼ç¢¼ï¼è©ä¼°æ¢ä»¶ï¼æ´æ°è®æ¸ï¼ä¸¦æ±ºå®æ¯å¦åè¿´åä¸éã
å¨å®£å弿¥è©¢èªè¨ï¼å¦ SQL æéä¿ä»£æ¸ï¼ä¸ï¼ä½ åªéæå®æéè³æçæ¨¡å¼ - çµæå¿
é 符ååªäºæ¢ä»¶ï¼ä»¥åå¦ä½å°è³æè½æï¼ä¾å¦ï¼æåºï¼åçµåéåï¼ - ä½ä¸æ¯å¦ä½å¯¦ç¾éä¸ç®æ¨ãè³æåº«ç³»çµ±çæ¥è©¢æä½³å卿±ºå®ä½¿ç¨åªäºç´¢å¼ååªäºé£ç·æ¹æ³ï¼ä»¥å以ä½ç¨®é åºå·è¡æ¥è©¢çååé¨åã
宣å弿¥è©¢èªè¨æ¯è¿·äººçï¼å çºå®é常æ¯å½ä»¤å¼ API æ´å ç°¡æ½å容æã使´éè¦çæ¯ï¼å®éé±èäºè³æåº«å¼æç實ç¾ç´°ç¯ï¼é使å¾è³æåº«ç³»çµ±å¯ä»¥å¨ç¡éå°æ¥è©¢å任使´æ¹çæ
æ³ä¸é²è¡æè½æåã
ä¾å¦ï¼å¨æ¬ç¯éé æç¤ºçå½ä»¤ç¨å¼ç¢¼ä¸ï¼åç©å表以ç¹å®é åºåºç¾ãå¦æè³æåº«æ³è¦å¨å¾èºåæ¶æªä½¿ç¨çç£ç¢ç©ºéï¼åå¯è½éè¦ç§»åè¨éï¼éææ¹è®åç©åºç¾çé åºãè³æåº«è½å¦å®å
¨å°å·è¡ï¼è䏿䏿·æ¥è©¢ï¼
SQL 示ä¾ä¸ç¢ºä¿ä»»ä½ç¹å®çé åºï¼å æ¤ä¸å¨æé åºæ¯å¦æ¹è®ã使¯å¦ææ¥è©¢ç¨å½ä»¤å¼çç¨å¼ç¢¼ä¾å¯«ç話ï¼é£éº¼è³æåº«å°±æ°¸é ä¸å¯è½ç¢ºå®ç¨å¼ç¢¼æ¯å¦ä¾è³´æ¼æåºãSQL ç¸ç¶æéçåè½æ§çºè³æåº«æä¾äºæ´å¤èªåæä½³åç空éã
æå¾ï¼å®£åå¼èªè¨å¾å¾é©å並è¡å·è¡ãç¾å¨ï¼CPU çé度ééæ ¸å¿ï¼coreï¼çå¢å è®å¾æ´å¿«ï¼è䏿¯ä»¥æ¯ä»¥åæ´é«çæéé度å·è¡ã31ããå½ä»¤ç¨å¼ç¢¼å¾é£å¨å¤åæ ¸å¿åå¤åæ©å¨ä¹é並è¡åï¼å çºå®æå®äºæä»¤å¿
é 以ç¹å®é åºå·è¡ã宣åå¼èªè¨æ´å
·æä¸¦è¡å·è¡çæ½åï¼å çºå®åå
æå®çµæç模å¼ï¼è䏿å®ç¨æ¼ç¢ºå®çµæçæ¼ç®æ³ãå¨é©ç¶æ
æ³ä¸ï¼è³æåº«å¯ä»¥èªç±ä½¿ç¨æ¥è©¢èªè¨ç並è¡å¯¦ç¾ã32ãã
### Web ä¸ç宣å弿¥è©¢
宣å弿¥è©¢èªè¨çåªå¢ä¸å
éæ¼è³æåº«ãçºäºèªªæéä¸é»ï¼è®æåå¨ä¸åå®å
¨ä¸åçç°å¢ä¸æ¯è¼å®£åå¼åå½ä»¤å¼æ¹æ³ï¼ä¸å Web ç覽å¨ã
åè¨ä½ æä¸åéæ¼æµ·æ´åç©çç¶²ç«ã使ç¨è
ç¶åæ£å¨æª¢è¦é¯éé é¢ï¼å æ¤ä½ å°ç¶åæé¸çå°èªå°æ¡ âé¯éâ æ¨è¨çºç¶åé¸ä¸å°æ¡ã
```html
-
Sharks
- Great White Shark
- Tiger Shark
- Hammerhead Shark
-
Whales
- Blue Whale
- Humpback Whale
- Fin Whale
```
ç¾å¨æ³è®ç¶åæé¸é é¢çæ¨é¡å
·æä¸åèè²çèæ¯ï¼ä»¥ä¾¿å¨è¦è¦ºä¸çªåºé¡¯ç¤ºãä½¿ç¨ CSS 實ç¾èµ·ä¾é常簡å®ï¼
```css
li.selected > p {
background-color: blue;
}
```
é裡ç CSS é¸æå¨ `li.selected > p` è²æçæåæ³è¦æç¨èè²æ¨£å¼çå
ç´ çæ¨¡å¼ï¼å³å
¶ç´æ¥ç¶å
ç´ æ¯å
·æ CSS é¡ `selected` ç `` å
ç´ çææ `` å
ç´ ã示ä¾ä¸çå
ç´ `
Sharks
` å¹é
æ¤æ¨¡å¼ï¼ä½ `Whales
` ä¸å¹é
ï¼å çºå
¶ `` ç¶å
ç´ ç¼ºå° `class="selected"`ã
å¦æä½¿ç¨ XSL è䏿¯ CSSï¼ä½ å¯ä»¥åé¡ä¼¼çäºæ
ï¼
```xml
```
é裡ç XPath è¡¨ç¤ºå¼ `li[@class='selected']/p` ç¸ç¶æ¼ä¸ä¾ä¸ç CSS é¸æå¨ `li.selected > p`ãCSS å XSL çå
±åä¹è卿¼ï¼å®å齿¯ç¨æ¼æå®æä»¶æ¨£å¼ç宣åå¼èªè¨ã
æ³è±¡ä¸ä¸ï¼å¿
é 使ç¨å½ä»¤å¼æ¹æ³çæ
æ³ææ¯å¦ä½ãå¨ Javascript ä¸ï¼ä½¿ç¨ **æä»¶ç©ä»¶æ¨¡åï¼DOMï¼** APIï¼å
¶çµæå¯è½å¦ä¸æç¤ºï¼
```js
var liElements = document.getElementsByTagName("li");
for (var i = 0; i < liElements.length; i++) {
if (liElements[i].className === "selected") {
var children = liElements[i].childNodes;
for (var j = 0; j < children.length; j++) {
var child = children[j];
if (child.nodeType === Node.ELEMENT_NODE && child.tagName === "P") {
child.setAttribute("style", "background-color: blue");
}
}
}
}
```
éæ®µ JavaScript ç¨å¼ç¢¼å½ä»¤å¼å°å°å
ç´ è¨å®çºèè²èæ¯ï¼ä½æ¯ç¨å¼ç¢¼çèµ·ä¾å¾ç³ç³ãä¸å
æ¯ CSS å XSL çå¹ç©æ´é·ï¼æ´é£çè§£ï¼èä¸éæä¸äºå´éçåé¡ï¼
* 妿é¸å®çé¡è¢«ç§»é¤ï¼ä¾å¦ï¼å çºä½¿ç¨è
é»é¸äºä¸åçé é¢ï¼ï¼å³ä½¿ç¨å¼ç¢¼éæ°å·è¡ï¼èè²èæ¯ä¹ä¸æè¢«ç§»é¤ - å æ¤è©²å°æ¡å°ä¿æçªåºé¡¯ç¤ºï¼ç´å°æ´åé é¢è¢«éæ°è¼å
¥ãä½¿ç¨ CSSï¼çè¦½å¨æèªå檢測 `li.selected > p` è¦å使ä¸åé©ç¨ï¼ä¸¦å¨é¸å®çé¡è¢«ç§»é¤å¾ç«å³ç§»é¤èè²èæ¯ã
* å¦æä½ æ³è¦å©ç¨æ°ç APIï¼ä¾å¦ `document.getElementsByClassName("selected")` çè³ `document.evaluate()`ï¼ä¾æé«æè½ï¼åå¿
é é寫ç¨å¼ç¢¼ãå¦ä¸æ¹é¢ï¼ç覽å¨ä¾æåå¯ä»¥å¨ä¸ç ´å£ç¸å®¹æ§çæ
æ³ä¸æé« CSS å XPath çæè½ã
å¨ Web ç覽å¨ä¸ï¼ä½¿ç¨å®£åå¼ CSS æ¨£å¼æ¯ä½¿ç¨ JavaScript å½ä»¤å¼å°æä½æ¨£å¼è¦å¥½å¾å¤ãé¡ä¼¼å°ï¼å¨è³æåº«ä¸ï¼ä½¿ç¨å SQL 鿍£ç宣å弿¥è©¢èªè¨æ¯ä½¿ç¨å½ä»¤å¼æ¥è©¢ API è¦å¥½å¾å¤ [^vi]ã
[^vi]: IMS å CODASYL é½ä½¿ç¨å½ä»¤å¼ APIãæç¨ç¨å¼éå¸¸ä½¿ç¨ COBOL ç¨å¼ç¢¼éæ·è³æåº«ä¸çè¨éï¼ä¸æ¬¡ä¸æ¢è¨éã2,16ãã
### MapReduceæ¥è©¢
MapReduce æ¯ä¸åç± Google æ¨å»£çç¨å¼è¨è¨æ¨¡åï¼ç¨æ¼å¨å¤èºæ©å¨ä¸æ¹æ¬¡èçå¤§è¦æ¨¡çè³æã33ããä¸äº NoSQL è³æå²åï¼å
æ¬ MongoDB å CouchDBï¼æ¯æ´æéå½¢å¼ç MapReduceï¼ä½çºå¨å¤åæä»¶ä¸å·è¡åªè®æ¥è©¢çæ©å¶ã
éæ¼ MapReduce æ´è©³ç´°çä»ç´¹å¨ [第åç« ](ch10.md)ãç¾å¨æååªç°¡è¦è¨è«ä¸ä¸ MongoDB 使ç¨ç模åã
MapReduce æ¢ä¸æ¯ä¸å宣åå¼çæ¥è©¢èªè¨ï¼ä¹ä¸æ¯ä¸åå®å
¨å½ä»¤å¼çæ¥è©¢ APIï¼èæ¯èæ¼å
©è
ä¹éï¼æ¥è©¢çé輯ç¨ç¨å¼ç¢¼ç段ä¾è¡¨ç¤ºï¼éäºç¨å¼ç¢¼ç段æè¢«èçæ¡æ¶éè¤æ§å¼å«ãå®åºæ¼ `map`ï¼ä¹ç¨±çº `collect`ï¼å `reduce`ï¼ä¹ç¨±çº `fold` æ `inject`ï¼å½å¼ï¼å
©åå½å¼å卿¼è¨±å¤å½æ¸èªè¨ç¨å¼è¨è¨èªè¨ä¸ã
æå¥½èä¾ä¾è§£é MapReduce 模åãåè¨ä½ æ¯ä¸åæµ·æ´çç©å¸å®¶ï¼æ¯ç¶ä½ çå°æµ·æ´ä¸çåç©æï¼ä½ 齿å¨è³æåº«ä¸æ°å¢ä¸æ¢è§å¯è¨éãç¾å¨ä½ æ³çæä¸åå ±åï¼èªªæä½ æ¯æçå°å¤å°é¯éã
å¨ PostgreSQL ä¸ï¼ä½ å¯ä»¥å鿍£è¡¨è¿°éåæ¥è©¢ï¼
```sql
SELECT
date_trunc('month', observation_timestamp) AS observation_month,
sum(num_animals) AS total_animals
FROM observations
WHERE family = 'Sharks'
GROUP BY observation_month;
```
`date_trunc('month', timestamp)` å½å¼ç¨æ¼ç¢ºå®å
å« `timestamp` çæ¥ææä»½ï¼ä¸¦è¿å代表該æä»½éå§çå¦ä¸åæéæ³ãæå¥è©±èªªï¼å®å°æéæ³èå
¥ææè¿çæä»½ã
éåæ¥è©¢é¦å
éæ¿¾è§å¯è¨éï¼ä»¥åªé¡¯ç¤ºé¯éå®¶æçç©ç¨®ï¼ç¶å¾æ ¹æå®åç¼ççæ¥ææä»½å°è§å¯è¨éæé²è¡åçµï¼æå¾å°å¨è©²æçææè§å¯è¨éä¸çå°çåç©æ¸ç®å èµ·ä¾ã
忍£çæ¥è©¢ç¨ MongoDB ç MapReduce åè½å¯ä»¥æå¦ä¸ä¾è¡¨è¿°ï¼
```js
db.observations.mapReduce(function map() {
var year = this.observationTimestamp.getFullYear();
var month = this.observationTimestamp.getMonth() + 1;
emit(year + "-" + month, this.numAnimals);
},
function reduce(key, values) {
return Array.sum(values);
},
{
query: {
family: "Sharks"
},
out: "monthlySharkReport"
});
```
* å¯ä»¥å®£åå¼å°æå®ä¸ååªèæ
®é¯é種é¡çéæ¿¾å¨ï¼éæ¯ MongoDB ç¹å®ç MapReduce æ´å
å¥ä»¶ï¼ã
* æ¯åå¹é
æ¥è©¢çæä»¶é½æå¼å«ä¸æ¬¡ JavaScript å½å¼ `map`ï¼å° `this` è¨å®çºæä»¶ç©ä»¶ã
* `map` å½å¼ç¼åºä¸åéµï¼å
æ¬å¹´ä»½åæä»½çå串ï¼å¦ `"2013-12"` æ `"2014-1"`ï¼åä¸åå¼ï¼è©²è§å¯è¨éä¸çåç©æ¸éï¼ã
* `map` ç¼åºçéµå¼å°æéµä¾åçµãå°æ¼å
·æç¸åéµï¼å³ï¼ç¸åçæä»½å年份ï¼çææéµå¼å°ï¼å¼å«ä¸æ¬¡ `reduce` å½å¼ã
* `reduce` å½å¼å°ç¹å®æä»½å
§ææè§æ¸¬è¨éä¸çåç©æ¸éç¸å ã
* å°æçµç輸åºå¯«å
¥å° `monthlySharkReport` éåä¸ã
ä¾å¦ï¼åè¨ `observations` éåå
å«éå
©åæä»¶ï¼
```json
{
observationTimestamp: Date.parse( "Mon, 25 Dec 1995 12:34:56 GMT"),
family: "Sharks",
species: "Carcharodon carcharias",
numAnimals: 3
}
{
observationTimestamp: Date.parse("Tue, 12 Dec 1995 16:17:18 GMT"),
family: "Sharks",
species: "Carcharias taurus",
numAnimals: 4
}
```
å°æ¯åæä»¶é½æå¼å«ä¸æ¬¡ `map` å½å¼ï¼çµæå°æ¯ `emit("1995-12",3)` å `emit("1995-12",4)`ãé¨å¾ï¼ä»¥ `reduce("1995-12",[3,4])` å¼å« `reduce` å½å¼ï¼å°è¿å `7`ã
map å reduce å½å¼å¨åè½ä¸ææéå¶ï¼å®åå¿
é æ¯ **ç´** å½å¼ï¼éæå³èå®ååªä½¿ç¨å³é給å®åçè³æä½çºè¼¸å
¥ï¼å®åä¸è½å·è¡é¡å¤çè³æåº«æ¥è©¢ï¼ä¹ä¸è½æä»»ä½å¯ä½ç¨ãéäºéå¶å
è¨±è³æåº«ä»¥ä»»ä½é åºå·è¡ä»»ä½åè½ï¼ä¸¦å¨å¤±ææéæ°å·è¡å®åãç¶èï¼map å reduce å½å¼ä»ç¶æ¯å¼·å¤§çï¼å®åå¯ä»¥è§£æå串ãå¼å«åº«å½å¼ãå·è¡è¨ç®ççã
MapReduce æ¯ä¸åç¸ç¶åºå±¤çç¨å¼è¨è¨æ¨¡åï¼ç¨æ¼è¨ç®æ©å¢éä¸ç忣å¼å·è¡ãå SQL 鿍£çæ´é«éçæ¥è©¢èªè¨å¯ä»¥ç¨ä¸ç³»åç MapReduce æä½ä¾å¯¦ç¾ï¼è¦ [第åç« ](ch10.md)ï¼ï¼ä½æ¯ä¹æå¾å¤ä¸ä½¿ç¨ MapReduce çåæ£å¼ SQL 實ç¾ãé æ³¨æï¼SQL ä¸¦æ²æéå¶å®åªè½å¨å®ä¸æ©å¨ä¸å·è¡ï¼è MapReduce ä¹ä¸¦æ²æå£æ·ææç忣弿¥è©¢å·è¡ã
è½å¤ 卿¥è©¢ä¸ä½¿ç¨ JavaScript ç¨å¼ç¢¼æ¯é«éæ¥è©¢çä¸åéè¦ç¹æ§ï¼ä½éä¸éæ¼ MapReduceï¼ä¸äº SQL è³æåº«ä¹å¯ä»¥ç¨ JavaScript å½å¼é²è¡æ´å
å¥ä»¶ã34ãã
MapReduce çä¸åå¯ç¨æ§å顿¯ï¼å¿
é 編寫å
©åå¯ååä½ç JavaScript å½å¼ï¼éé常æ¯ç·¨å¯«å®åæ¥è©¢æ´å°é£ãæ¤å¤ï¼å®£å弿¥è©¢èªè¨çºæ¥è©¢æä½³å卿ä¾äºæ´å¤æ©æä¾æé«æ¥è©¢çæè½ãåºæ¼éäºåå ï¼MongoDB 2.2 æ·»å äºä¸ç¨®å«å **èå管é** ç宣å弿¥è©¢èªè¨çæ¯æ´ã9ããç¨é種èªè¨è¡¨è¿°é¯éè¨æ¸æ¥è©¢å¦ä¸æç¤ºï¼
```js
db.observations.aggregate([
{ $match: { family: "Sharks" } },
{ $group: {
_id: {
year: { $year: "$observationTimestamp" },
month: { $month: "$observationTimestamp" }
},
totalAnimals: { $sum: "$numAnimals" } }}
]);
```
èå管éèªè¨ç表ç¾åèï¼åè¿° PostgreSQL ä¾åçï¼SQL åéç¸ç¶ï¼ä½æ¯å®ä½¿ç¨åºæ¼ JSON çèªæ³è䏿¯ SQL é£ç¨®æ¥è¿è±æå¥å¼çèªæ³ï¼é種差ç°ä¹è¨±åªæ¯å£å³åé¡ãéåæ
äºç坿æ¯ï¼NoSQL 系統å¯è½ææå¤ç¼ç¾èªå·±åªæ¯éæ°ç¼æäºä¸å¥ç¶éå¬è£æ¹æ®ç SQLã
## åè³ææ¨¡å
妿åä¹åæè¦ï¼å¤å°å¤éä¿æ¯ä¸åè³ææ¨¡åä¹éå
·æå奿§çéè¦ç¹å¾µãå¦æä½ çæç¨ç¨å¼å¤§å¤æ¸çéä¿æ¯ä¸å°å¤éä¿ï¼æ¨¹ççµæ§åè³æï¼ï¼æè
大夿¸è¨éä¹éä¸åå¨éä¿ï¼é£éº¼ä½¿ç¨æä»¶æ¨¡åæ¯åé©çã
使¯ï¼è¦æ¯å¤å°å¤éä¿å¨ä½ çè³æä¸å¾å¸¸è¦å¢ï¼éä¿æ¨¡åå¯ä»¥èçå¤å°å¤éä¿çç°¡å®æ
æ³ï¼ä½æ¯é¨èè³æä¹éçé£ç·è®å¾æ´å è¤éï¼å°è³æå»ºæ¨¡çºå形顝徿´å èªç¶ã
ä¸ååç±å
©ç¨®ç©ä»¶çµæï¼**é é»**ï¼verticesï¼ä¹ç¨±çº **ç¯é»**ï¼å³ nodesï¼æ **實é«**ï¼å³ entitiesï¼ï¼å **é**ï¼edgesï¼ä¹ç¨±çº **éä¿**ï¼å³ relationshipsï¼æ **å¼§**ï¼å³ arcsï¼ãå¤ç¨®è³æå¯ä»¥è¢«å»ºæ¨¡çºä¸ååå½¢ãå
¸åçä¾åå
æ¬ï¼
* 社交åè
é 黿¯äººï¼éæç¤ºåªäºäººå½¼æ¤èªèã
* 網路åè
é 黿¯ç¶²é ï¼é緣表示æåå
¶ä»é é¢ç HTML é£çµã
* å
¬è·¯æéµè·¯ç¶²è·¯
é 黿¯äº¤åè·¯å£ï¼éç·ä»£è¡¨å®åä¹éçéè·¯æéµè·¯ç·ã
å¯ä»¥å°é£äºç¾æå¨ç¥çæ¼ç®æ³éç¨å°éäºåä¸ï¼ä¾å¦ï¼æ±½è»å°èªç³»çµ±æå°é路網路ä¸å
©é»ä¹éçæçè·¯å¾ï¼PageRank å¯ä»¥ç¨å¨ç¶²è·¯åä¸ä¾ç¢ºå®ç¶²é çæµè¡ç¨åº¦ï¼å¾è確å®è©²ç¶²é 卿å°çµæä¸çæåã
å¨åå給åºçä¾åä¸ï¼åä¸çææé é»ä»£è¡¨äºç¸ååå¥çäºç©ï¼äººãç¶²é æäº¤åè·¯å£ï¼ãä¸éï¼å並ä¸ä¾·éæ¼é樣çåé¡è³æï¼åæ¨£å¼·å¤§å°æ¯ï¼åæä¾äºä¸ç¨®ä¸è´çæ¹å¼ï¼ç¨ä¾å¨å®åæ¸æå²åä¸å²åå®å
¨ä¸ååå¥çç©ä»¶ãä¾å¦ï¼Facebook ç¶è·ä¸åå
å«è¨±å¤ä¸ååå¥çé é»åéçå®ååï¼é é»è¡¨ç¤ºäººãå°é»ãäºä»¶ãç°½å°å使ç¨è
çè©è«ï¼é表示åªäºäººæ¯å¥½åãç°½å°ç¼çå¨åªè£¡ã誰è©è«äºä»éº¼å¸åã誰åèäºä»éº¼äºä»¶ççã35ãã
卿¬ç¯ä¸ï¼æåå°ä½¿ç¨ [å 2-5](../img/fig2-5.png) æç¤ºç示ä¾ãå®å¯ä»¥å¾ç¤¾äº¤ç¶²è·¯æç³»èè³æåº«ä¸ç²å¾ï¼å®é¡¯ç¤ºäºå
©å人ï¼ä¾èªæéè·å·ç Lucy åä¾èªæ³å Beaune ç Alainãä»åå·²å©ï¼ä½å¨å«æ¦ã

**å 2-5 åè³æçµæ§ç¤ºä¾ï¼æ¡ä»£è¡¨é é»ï¼ç®é 代表éï¼**
æå¹¾ç¨®ä¸åä½ç¸éçæ¹æ³ç¨ä¾æ§å»ºåæ¥è©¢å表ä¸çè³æã卿¬ç¯ä¸ï¼æåå°è¨è«å±¬æ§å模åï¼ç± Neo4jï¼Titan å InfiniteGraph 實ç¾ï¼åä¸å
çµå²åï¼triple-storeï¼æ¨¡åï¼ç± DatomicãAllegroGraph ç實ç¾ï¼ãæåå°æª¢è¦åçä¸ç¨®å®£å弿¥è©¢èªè¨ï¼Cypherï¼SPARQL å Datalogã餿¤ä¹å¤ï¼éæå Gremlin ã36ã鿍£çåå½¢æ¥è©¢èªè¨åå Pregel 鿍£çåå½¢èçæ¡æ¶ï¼è¦ [第åç« ](ch10.md)ï¼ã
### 屬æ§å
å¨å±¬æ§å模åä¸ï¼æ¯åé é»ï¼vertexï¼å
æ¬ï¼
* å¯ä¸çèå¥ç¬¦è
* ä¸çµåºéï¼outgoing edgesï¼
* ä¸çµå
¥éï¼ingoing edgesï¼
* ä¸çµå±¬æ§ï¼éµå¼å°ï¼
æ¯æ¢éï¼edgeï¼å
æ¬ï¼
* å¯ä¸èå¥ç¬¦è
* éçèµ·é»ï¼**å°¾é¨é é»**ï¼å³ tail vertexï¼
* éççµé»ï¼**é é¨é é»**ï¼å³ head vertexï¼
* æè¿°å
©åé é»ä¹ééä¿åå¥çæ¨ç±¤
* ä¸çµå±¬æ§ï¼éµå¼å°ï¼
å¯ä»¥å°åå²åçä½ç±å
©åéä¿è¡¨çµæï¼ä¸åå²åé é»ï¼å¦ä¸åå²åéï¼å¦ [ä¾ 2-2]() æç¤ºï¼è©²æ¨¡å¼ä½¿ç¨ PostgreSQL JSON è³æåå¥ä¾å²åæ¯åé é»ææ¯æ¢éç屬æ§ï¼ãé é¨åå°¾é¨é é»ç¨ä¾å²åæ¯æ¢éï¼å¦æä½ æ³è¦ä¸çµé é»ç輸å
¥æè¼¸åºéï¼ä½ å¯ä»¥åå¥éé `head_vertex` æ `tail_vertex` 便¥è©¢ `edges` 表ã
**ä¾ 2-2 使ç¨éä¿æ¨¡å¼ä¾è¡¨ç¤ºå±¬æ§å**
```sql
CREATE TABLE vertices (
vertex_id INTEGER PRIMARY KEY,
properties JSON
);
CREATE TABLE edges (
edge_id INTEGER PRIMARY KEY,
tail_vertex INTEGER REFERENCES vertices (vertex_id),
head_vertex INTEGER REFERENCES vertices (vertex_id),
label TEXT,
properties JSON
);
CREATE INDEX edges_tails ON edges (tail_vertex);
CREATE INDEX edges_heads ON edges (head_vertex);
```
éæ¼é忍¡åçä¸äºéè¦æ¹é¢æ¯ï¼
1. ä»»ä½é é»é½å¯ä»¥æä¸æ¢éé£ç·å°ä»»ä½å
¶ä»é é»ãæ²ææ¨¡å¼éå¶åªç¨®äºç©å¯ä¸å¯ä»¥éè¯ã
2. 給å®ä»»ä½é é»ï¼å¯ä»¥é«æå°æ¾å°å®çå
¥éååºéï¼å¾èéæ·åï¼å³æ²¿èä¸ç³»åé é»çè·¯å¾åå¾ç§»åï¼éå°±æ¯çºä»éº¼ [ä¾ 2-2]() å¨ `tail_vertex` å `head_vertex` åä¸é½æç´¢å¼çåå ï¼ã
3. ééå°ä¸ååå¥çéä¿ä½¿ç¨ä¸åçæ¨ç±¤ï¼å¯ä»¥å¨ä¸ååä¸å²å幾種ä¸åçè³è¨ï¼åæä»ç¶ä¿æä¸åæ¸
æ°çè³ææ¨¡åã
éäºç¹æ§çºè³æå»ºæ¨¡æä¾äºå¾å¤§çéæ´»æ§ï¼å¦ [å 2-5](../img/fig2-5.png) æç¤ºãåä¸é¡¯ç¤ºäºä¸äºå³çµ±éä¿æ¨¡å¼é£ä»¥è¡¨éçäºæ
ï¼ä¾å¦ä¸ååå®¶çä¸åå°åçµæ§ï¼æ³åæçå大åï¼ç¾åæç¸£åå·ï¼ï¼åä¸åçæªäºï¼å
忽ç¥ä¸»æ¬åå®¶åæ°æé¯ç¶è¤éççæ¤åï¼ï¼ä¸åçè³æç²åº¦ï¼Lucy ç¾å¨ç使è¨éå
·é«å°åå¸ï¼è她çåºçå°é»åªæ¯å¨ä¸åå·çç´å¥ï¼ã
ä½ å¯ä»¥æ³è±¡è©²åéè½å»¶ä¼¸åºè¨±å¤éæ¼ Lucy å Alain çäºå¯¦ï¼æå
¶ä»äººçå
¶ä»æ´å¤çäºå¯¦ãä¾å¦ï¼ä½ å¯ä»¥ç¨å®ä¾è¡¨ç¤ºé£ç©éæï¼çºæ¯åéææºå¢å ä¸åé é»ï¼ä¸¦å¢å 人èéææºä¹éç䏿¢éä¾æç¤ºä¸ç¨®éææ
æ³ï¼ï¼ä¸¦é£çµå°éææºï¼æ¯åéææºå
·æä¸çµé é»ç¨ä¾é¡¯ç¤ºåªäºé£ç©å«æåªäºç©è³ªãç¶å¾ï¼ä½ å¯ä»¥å¯«ä¸åæ¥è©¢ï¼æ¾åºæ¯å人åä»éº¼æ¯å®å
¨çãåå¨å¯æ¼åæ§æ¹é¢æ¯å¯æåªå¢çï¼ç¶ä½ 宿ç¨ç¨å¼æ°å¢åè½æï¼å¯ä»¥è¼é¬æ´å
å¥ä»¶å以驿ç¨å¼è³æçµæ§çè®åã
### Cypher æ¥è©¢èªè¨
Cypher æ¯å±¬æ§åç宣å弿¥è©¢èªè¨ï¼çº Neo4j åå½¢è³æåº«èç¼æã37ãï¼å®æ¯ä»¥é»å½± âé§å®¢å¸åâ ä¸çä¸åè§è²ä¾å½åçï¼èèå¯ç¢¼å¸ä¸çå 坿¼ç®æ³ç¡éã38ãï¼ã
[ä¾ 2-3]() 顯示äºå° [å 2-5](../img/fig2-5.png) çå·¦éé¨åæå
¥åå½¢è³æåº«ç Cypher æ¥è©¢ãä½ å¯ä»¥ä»¥é¡ä¼¼çæ¹å¼æåçå©é¤é¨åæ°å¢é²å»ï¼ä½é裡çºäºæç« å¯é±è®æ§èçç¥éé¨åç示ä¾ãæ¯åé é»é½æä¸åå `USA` æ `Idaho` 鿍£ç符èåç¨±ï¼æ¥è©¢çå
¶ä»é¨åå¯ä»¥ä½¿ç¨éäºå稱å¨é é»ä¹é建ç«éï¼ä½¿ç¨ç®é 符èï¼`ï¼Idahoï¼ - [ï¼WITHIN] ->ï¼USAï¼` 建ç«ä¸æ¢æ¨è¨çº `WITHIN` çéï¼`Idaho` çºå°¾ç¯é»ï¼`USA` çºé ç¯é»ã
**ä¾ 2-3 å°å 2-5 ä¸çè³æåéè¡¨ç¤ºçº Cypher æ¥è©¢**
```cypher
CREATE
(NAmerica:Location {name:'North America', type:'continent'}),
(USA:Location {name:'United States', type:'country' }),
(Idaho:Location {name:'Idaho', type:'state' }),
(Lucy:Person {name:'Lucy' }),
(Idaho) -[:WITHIN]-> (USA) -[:WITHIN]-> (NAmerica),
(Lucy) -[:BORN_IN]-> (Idaho)
```
ç¶ [å 2-5](../img/fig2-5.png) çææé é»åé被æ°å¢å°è³æåº«å¾ï¼è®æåæäºæè¶£çåé¡ï¼ä¾å¦ï¼æ¾å°ææå¾ç¾åç§»æ°å°ææ´²ç人çååãæ´ç¢ºåå°èªªï¼é裡æåæ³è¦æ¾å°ç¬¦åä¸é¢æ¢ä»¶çææé é»ï¼ä¸¦ä¸è¿åéäºé é»ç `name` 屬æ§ï¼è©²é 黿æä¸æ¢é£å°ç¾åä»»ä¸ä½ç½®ç `BORN_IN` éï¼å䏿¢é£å°ææ´²çä»»ä¸ä½ç½®ç `LIVING_IN` éã
[ä¾ 2-4]() å±ç¤ºçå¦ä½å¨ Cypher ä¸è¡¨ééåæ¥è©¢ãå¨ MATCH åå¥ä¸ä½¿ç¨ç¸åçç®é 符è便¥è©¢åä¸ç模å¼ï¼`(person) -[:BORN_IN]-> ()` å¯ä»¥å¹é
`BORN_IN` éçä»»æå
©åé é»ã該éçå°¾ç¯é»è¢«ç¶å®äºè®æ¸ `person`ï¼é ç¯é»åæªè¢«ç¹«çµã
**ä¾ 2-4 æ¥è©¢ææå¾ç¾åç§»æ°å°ææ´²ç人ç Cypher æ¥è©¢ï¼**
```cypher
MATCH
(person) -[:BORN_IN]-> () -[:WITHIN*0..]-> (us:Location {name:'United States'}),
(person) -[:LIVES_IN]-> () -[:WITHIN*0..]-> (eu:Location {name:'Europe'})
RETURN person.name
```
æ¥è©¢æå¦ä¸ä¾è§£è®ï¼
> æ¾å°æ»¿è¶³ä»¥ä¸å
©åæ¢ä»¶çææé é»ï¼ç¨±ä¹çº person é é»ï¼ï¼
> 1. `person` é 黿æä¸æ¢å°æåé é»ç `BORN_IN` åºéãå¾é£åé é»éå§ï¼æ²¿èä¸ç³»å `WITHIN` åºéæçµå°éä¸ååå¥çº `Location`ï¼`name` 屬æ§çº `United States` çé é»ã
>
> 2. `person` é é»éææä¸æ¢ `LIVES_IN` åºéãæ²¿è鿢éï¼å¯ä»¥ééä¸ç³»å `WITHIN` åºéæçµå°éä¸ååå¥çº `Location`ï¼`name` 屬æ§çº `Europe` çé é»ã
>
> å°æ¼é樣ç `Person` é é»ï¼è¿åå
¶ `name` 屬æ§ã
å·è¡éæ¢æ¥è©¢å¯è½ææå¹¾ç¨®å¯è¡çæ¥è©¢è·¯å¾ãé裡給åºçæè¿°å»ºè°é¦å
ææè³æåº«ä¸çææäººï¼æª¢æ¥æ¯å人çåºçå°åå±
ä½å°ï¼ç¶å¾åªè¿åç¬¦åæ¢ä»¶çé£äºäººã
çå¹å°ï¼ä¹å¯ä»¥å¾å
©å `Location` é é»éå§ååå°æ¥è©¢ãåå¦ `name` 屬æ§ä¸æç´¢å¼ï¼åå¯ä»¥é«æå°æ¾å°ä»£è¡¨ç¾ååææ´²çå
©åé é»ãç¶å¾ï¼æ²¿èææ `WITHIN` å
¥éï¼å¯ä»¥ç¹¼çºæ¥æ¾åºææå¨ç¾ååææ´²çä½ç½®ï¼å·ãå°åãåå¸çï¼ãæå¾ï¼æ¥æ¾åºé£äºå¯ä»¥ç± `BORN_IN` æ `LIVES_IN` å
¥éå°é£äºä½ç½®é é»ç人ã
é叏尿¼å®£å弿¥è©¢èªè¨ä¾èªªï¼å¨ç·¨å¯«æ¥è©¢èªå¥æï¼ä¸éè¦æå®å·è¡ç´°ç¯ï¼æ¥è©¢æä½³åç¨å¼æèªåé¸æé æ¸¬æçæé«ççç¥ï¼å æ¤ä½ å¯ä»¥å°æ³¨æ¼ç·¨å¯«æç¨ç¨å¼çå
¶ä»é¨åã
### SQL ä¸çåæ¥è©¢
[ä¾ 2-2]() æåºï¼å¯ä»¥å¨éä¿è³æåº«ä¸è¡¨ç¤ºåè³æã使¯ï¼å¦æåè³æå·²ç¶ä»¥éä¿çµæ§å²åï¼æåæ¯å¦ä¹å¯ä»¥ä½¿ç¨ SQL æ¥è©¢å®ï¼
çæ¡æ¯è¯å®çï¼ä½æäºå°é£ãå¨éä¿è³æåº«ä¸ï¼ä½ é常æäºå
ç¥é卿¥è©¢ä¸éè¦åªäºé£ç·ãå¨åæ¥è©¢ä¸ï¼ä½ å¯è½éè¦å¨æ¾å°å¾
æ¥è©¢çé é»ä¹åï¼éæ·å¯è®æ¸éçéãä¹å°±æ¯èªªï¼é£ç·çæ¸éäºå
並ä¸ç¢ºå®ã
卿åçä¾åä¸ï¼éç¼çå¨ Cypher æ¥è©¢ä¸ç `() -[:WITHIN*0..]-> ()` è¦åä¸ãä¸å人ç `LIVES_IN` éå¯ä»¥æåä»»ä½åå¥çä½ç½®ï¼è¡éãåå¸ãå°åãåå®¶çãä¸ååå¸å¯ä»¥å¨ï¼WITHINï¼ä¸åå°åå
§ï¼ä¸åå°åå¯ä»¥å¨ï¼WITHINï¼å¨ä¸åå·å
§ï¼ä¸åå·å¯ä»¥å¨ï¼WITHINï¼ä¸ååå®¶å
§ï¼ççã`LIVES_IN` éå¯ä»¥ç´æ¥æåæ£å¨æ¥è©¢çä½ç½®ï¼æè
ä¸åå¨ä½ç½®å±¤æ¬¡çµæ§ä¸éäºæ¸å±¤çä½ç½®ã
å¨ Cypher ä¸ï¼ç¨ `WITHIN*0..` é常簡æ½å°è¡¨è¿°äºä¸è¿°äºå¯¦ï¼â沿è `WITHIN` éï¼é¶æ¬¡æå¤æ¬¡âãå®å¾åæ£å表示å¼ä¸ç `*` éç®åã
èª SQL:1999ï¼æ¥è©¢å¯è®é·åº¦éæ·è·¯å¾çææ³å¯ä»¥ä½¿ç¨ç¨±çº **éè¿´å
¬ç¨è¡¨è¡¨éå¼**ï¼`WITH RECURSIVE` èªæ³ï¼çæ±è¥¿ä¾è¡¨ç¤ºã[ä¾ 2-5]() 顯示äºåæ¨£çæ¥è©¢ - æ¥è©¢å¾ç¾åç§»æ°å°ææ´²ç人çå§å - å¨ SQL 使ç¨é種æè¡ï¼PostgreSQLãIBM DB2ãOracle å SQL Server 忝æ´ï¼ä¾è¡¨è¿°ã使¯ï¼è Cypher ç¸æ¯ï¼å
¶èªæ³é常笨æã
**ä¾ 2-5 èç¤ºä¾ 2-4 忍£çæ¥è©¢ï¼å¨ SQL ä¸ä½¿ç¨éè¿´å
¬ç¨è¡¨è¡¨éå¼è¡¨ç¤º**
```sql
WITH RECURSIVE
-- in_usa å
嫿æçç¾åå¢å
§çä½ç½® ID
in_usa(vertex_id) AS (
SELECT vertex_id FROM vertices WHERE properties ->> 'name' = 'United States'
UNION
SELECT edges.tail_vertex FROM edges
JOIN in_usa ON edges.head_vertex = in_usa.vertex_id
WHERE edges.label = 'within'
),
-- in_europe å
嫿æçææ´²å¢å
§çä½ç½® ID
in_europe(vertex_id) AS (
SELECT vertex_id FROM vertices WHERE properties ->> 'name' = 'Europe'
UNION
SELECT edges.tail_vertex FROM edges
JOIN in_europe ON edges.head_vertex = in_europe.vertex_id
WHERE edges.label = 'within' ),
-- born_in_usa å
å«äºææåå¥çº Personï¼ä¸åºçå¨ç¾åçé é»
born_in_usa(vertex_id) AS (
SELECT edges.tail_vertex FROM edges
JOIN in_usa ON edges.head_vertex = in_usa.vertex_id
WHERE edges.label = 'born_in' ),
-- lives_in_europe å
å«äºææåå¥çº Personï¼ä¸å±
ä½å¨ææ´²çé é»ã
lives_in_europe(vertex_id) AS (
SELECT edges.tail_vertex FROM edges
JOIN in_europe ON edges.head_vertex = in_europe.vertex_id
WHERE edges.label = 'lives_in')
SELECT vertices.properties ->> 'name'
FROM vertices
JOIN born_in_usa ON vertices.vertex_id = born_in_usa.vertex_id
JOIN lives_in_europe ON vertices.vertex_id = lives_in_europe.vertex_id;
```
* é¦å
ï¼æ¥è©¢ `name` 屬æ§çº `United States` çé é»ï¼å°å
¶ä½çº `in_usa` é é»çéåç第ä¸åå
ç´ ã
* å¾ `in_usa` éåçé é»åºç¼ï¼æ²¿èææç `with_in` å
¥éï¼å°å
¶å°¾é é»å å
¥åä¸éåï¼ä¸æ·éè¿´ç´å°ææ `with_in` å
¥éé½è¢«è¨ªåå®ç¢ã
* åçï¼å¾ `name` 屬æ§çº `Europe` çé é»åºç¼ï¼å»ºç« `in_europe` é é»çéåã
* å°æ¼ `in_usa` éåä¸çæ¯åé é»ï¼æ ¹æ `born_in` å
¥é便¥æ¾åºçå¨ç¾åæåå°æ¹ç人ã
* 忍£ï¼å°æ¼ `in_europe` éåä¸çæ¯åé é»ï¼æ ¹æ `lives_in` å
¥é便¥è©¢å±
ä½å¨ææ´²ç人ã
* æå¾ï¼æå¨ç¾ååºçç人çéåèå¨ææ´²å±
ä½ç人çéåç¸äº¤ã
åä¸åæ¥è©¢ï¼ç¨æä¸åæ¥è©¢èªè¨å¯ä»¥å¯«æ 4 è¡ï¼èç¨å¦ä¸åæ¥è©¢èªè¨éè¦ 29 è¡ï¼éæ°æ°èªªæäºä¸åçè³ææ¨¡åæ¯çºä¸åçæç¨å ´æ¯èè¨è¨çã鏿é©åæç¨ç¨å¼çè³ææ¨¡åé常éè¦ã
### ä¸å
çµå²åå SPARQL
ä¸å
çµå²å模å¼å¤§é«ä¸è屬æ§å模åç¸åï¼ç¨ä¸åçè©ä¾æè¿°ç¸åçæ³æ³ãä¸éä»ç¶å¼å¾è¨è«ï¼å çºä¸å
çµå²åæå¾å¤ç¾æçå·¥å
·åèªè¨ï¼éäºå·¥å
·åèªè¨å°æ¼æ§å»ºæç¨ç¨å¼çå·¥å
·ç®±å¯è½æ¯å¯¶è²´çè£å
ã
å¨ä¸å
çµå²åä¸ï¼ææè³è¨é½ä»¥é常簡å®çä¸é¨å表示形å¼å²åï¼**主èª**ï¼**è¬èª**ï¼**è³èª**ï¼ãä¾å¦ï¼ä¸å
çµ **(åå§, åæ¡, é¦è)** ä¸ï¼**åå§** æ¯ä¸»èªï¼**åæ¡** æ¯è¬èªï¼åè©ï¼ï¼**é¦è** æ¯ç©ä»¶ã
ä¸å
çµç主èªç¸ç¶æ¼åä¸çä¸åé é»ãèè³èªæ¯ä¸é¢å
©è
ä¹ä¸ï¼
1. åå§è³æåå¥ä¸çå¼ï¼ä¾å¦åä¸²ææ¸åãå¨é種æ
æ³ä¸ï¼ä¸å
çµçè¬èªåè³èªç¸ç¶æ¼ä¸»èªé é»ä¸ç屬æ§çéµåå¼ãä¾å¦ï¼`(lucy, age, 33)` å°±åå±¬æ§ `{âageâï¼33}` çé é» lucyã
2. åä¸çå¦ä¸åé é»ãå¨é種æ
æ³ä¸ï¼è¬èªæ¯åä¸ç䏿¢éï¼ä¸»èªæ¯å
¶å°¾é¨é é»ï¼èè³èªæ¯å
¶é é¨é é»ãä¾å¦ï¼å¨ `(lucy, marriedTo, alain)` ä¸ä¸»èªåè³èª `lucy` å `alain` 齿¯é é»ï¼ä¸¦ä¸è¬èª `marriedTo` æ¯é£ç·ä»åçéçæ¨ç±¤ã
[ä¾ 2-6]() å±ç¤ºäºè [ä¾ 2-3]() ç¸åçè³æï¼ä»¥ç¨±çº Turtle çæ ¼å¼ï¼Notation3ï¼N3ï¼ã39ãçä¸ååéï¼å¯«æä¸å
çµã
**ä¾ 2-6 å 2-5 ä¸çè³æåéï¼è¡¨ç¤ºçº Turtle ä¸å
çµ**
```reStructuredText
@prefix : .
_:lucy a :Person.
_:lucy :name "Lucy".
_:lucy :bornIn _:idaho.
_:idaho a :Location.
_:idaho :name "Idaho".
_:idaho :type "state".
_:idaho :within _:usa.
_:usa a :Location
_:usa :name "United States"
_:usa :type "country".
_:usa :within _:namerica.
_:namerica a :Location
_:namerica :name "North America"
_:namerica :type :"continent"
```
å¨éåä¾åä¸ï¼åçé é»è¢«å¯«çºï¼`_ï¼someName`ãéååå䏦䏿å³èéåæªæ¡ä»¥å¤ç任使±è¥¿ãå®çåå¨åªæ¯å¹«å©æåæç¢ºåªäºä¸å
çµå¼ç¨äºåä¸é é»ãç¶è¬èªè¡¨ç¤ºéæï¼è©²è³èªæ¯ä¸åé é»ï¼å¦ `_:idaho :within _:usa.`ãç¶è¬èªæ¯ä¸åå±¬æ§æï¼è©²è³èªæ¯ä¸åå串ï¼å¦ `_:usa :name"United States"`
ä¸éåä¸éå°éè¤ç¸åç主èªçèµ·ä¾ç¸ç¶éè¤ï¼ä½å¹¸éçæ¯ï¼å¯ä»¥ä½¿ç¨åèä¾èªªæéæ¼åä¸ä¸»èªçå¤åäºæ
ãéä½¿å¾ Turtle æ ¼å¼ç¸ç¶ä¸é¯ï¼å¯è®æ§å¼·ï¼è«åé± [ä¾ 2-7]()ã
**ä¾ 2-7 ä¸ç¨®ç¸å°ä¾ 2-6 寫å
¥è³æçæ´çºç°¡æ½çæ¹æ³ã**
```
@prefix : .
_:lucy a :Person; :name "Lucy"; :bornIn _:idaho.
_:idaho a :Location; :name "Idaho"; :type "state"; :within _:usa
_:usa a :Loaction; :name "United States"; :type "country"; :within _:namerica.
_:namerica a :Location; :name "North America"; :type "continent".
```
#### èªç¾©ç¶²
å¦æä½ æ·±å
¥çè§£éæ¼ä¸å
çµå²åçè³è¨ï¼å¯è½æé·å
¥éæ¼**èªç¾©ç¶²**çè¨è«æ¼©æ¸¦ä¸ãä¸å
çµå²å模åå
¶å¯¦æ¯å®å
¨ç¨ç«æ¼èªç¾©ç¶²åå¨çï¼ä¾å¦ï¼Datomicã40ãä½çºä¸ç¨®ä¸å
çµå²åè³æåº« [^vii]ï¼å¾æªè¢«ç¨æ¼èªç¾©ç¶²ä¸ã使¯ï¼ç±æ¼å¨å¾å¤äººç¼ä¸éå
©è
ç·å¯ç¸é£ï¼æåæè©²ç°¡è¦å°è¨è«ä¸ä¸ã
[^vii]: 徿è¡ä¸è¬ï¼Datomic 使ç¨çæ¯äºå
çµè䏿¯ä¸å
çµï¼å
©åé¡å¤çæ¬ä½æ¯ç¨æ¼çæ¬æ§å¶çå
è³æ
徿¬è³ªä¸è¬ï¼èªç¾©ç¶²æ¯ä¸åç°¡å®ä¸åççæ³æ³ï¼ç¶²ç«å·²ç¶å°è³è¨éåºçºæåååçä¾äººé¡é±è®ï¼çºä»éº¼ä¸å°è³è¨ä½çºæ©å¨å¯è®çè³æä¹éåºçµ¦è¨ç®æ©å¢ï¼ï¼åºæ¼ä¸å
çµæ¨¡åçï¼**è³æºæè¿°æ¡æ¶**ï¼**RDF**ï¼ã41ãï¼è¢«ç¨ä½ä¸åç¶²ç«ä»¥çµ±ä¸çæ ¼å¼éåºè³æçä¸ç¨®æ©å¶ï¼å
許ä¾èªä¸åç¶²ç«çè³æèªååä½µæ **ä¸åæ¸æç¶²è·¯** ââ æçºä¸ç¨®ç¶²é網路ç¯åå
§ç âéç¨èªç¾©ç¶²è³æåº«âã
ä¸å¹¸çæ¯ï¼èªç¾©ç¶²å¨äºåä¸ä¸ç´å被é度çä½ï¼ä½å°ç®åçºæ¢æ²æä»»ä½è·¡è±¡è¡¨æå·²å¨å¯¦è¸ä¸æç¨ï¼é使å¾è¨±å¤äººå¤ä¹ä»¥é¼»ãå®é飽åç¼è±ç¹äºç縮ç¥è©ãéæ¼è¤éçæ¨æºææ¡åçå¦èªå¤§çå°æ¾ã
ç¶èï¼å¦æå¾éå»ç失æä¸æ±²åæè¨ï¼èªç¾©ç¶²å°æ¡éæ¯ææå¾å¤åªç§çææãå³ä½¿ä½ æ²æèè¶£å¨èªç¾©ç¶²ä¸éåº RDF è³æï¼ä¸å
çµé種模å乿¯ä¸ç¨®å¥½çæç¨ç¨å¼å
§é¨è³ææ¨¡åã
#### RDF è³ææ¨¡å
[ä¾ 2-7]() ä¸ä½¿ç¨ç Turtle èªè¨æ¯ä¸ç¨®ç¨æ¼ RDF è³æç人é¡å¯è®æ ¼å¼ãææåï¼RDF ä¹å¯ä»¥ä»¥ XML æ ¼å¼ç·¨å¯«ï¼ä¸é宿忍£çäºæ
æç¸å°åå¦ï¼è«åé± [ä¾ 2-8]()ãTurtle/N3 æ¯æ´å¯åçï¼å çºå®æ´å®¹æé±è®ï¼å Apache Jena ã42ã鿍£çå·¥å
·å¯ä»¥æ ¹æéè¦å¨ä¸åç RDF æ ¼å¼ä¹éé²è¡èªåè½æã
**ä¾ 2-8 ç¨ RDF/XML èªæ³è¡¨ç¤ºä¾ 2-7 çè³æ**
```xml
Idaho
state
United States
country
North America
continent
Lucy
```
RDF æä¸äºå¥æªä¹èï¼å çºå®æ¯çºäºå¨ç¶²é網路ä¸äº¤æè³æèè¨è¨çãä¸å
çµç主èªï¼è¬èªåè³èªéå¸¸æ¯ URIãä¾å¦ï¼è¬èªå¯è½æ¯ä¸å URIï¼å¦ `` æ ``ï¼èä¸å
å
æ¯ `WITHIN` æ `LIVES_IN`ãéåè¨è¨èå¾çåå çºäºè®ä½ è½å¤ æä½ çè³æåå
¶ä»äººçè³æçµåèµ·ä¾ï¼å¦æä»å賦äºå®è© `within` æè
`lives_in` ä¸åçå«ç¾©ï¼å
©è
ä¹ä¸æè¡çªï¼å çºå®åçè¬èªå¯¦é䏿¯ `` å ``ã
å¾ RDF çè§åº¦ä¾çï¼URL `` ä¸ä¸å®éè¦è½è§£ææä»éº¼æ±è¥¿ï¼å®åªæ¯ä¸åå稱空éãçºé¿å
è `http://URL` æ··æ·ï¼æ¬ç¯ä¸ç示ä¾ä½¿ç¨ä¸å¯è§£æç URIï¼å¦ `urnï¼exampleï¼within`ã幸éçæ¯ï¼ä½ åªé卿ªæ¡é é¨å°éååé¦å䏿¬¡å®£åï¼å¾çºå°±ä¸ç¨å管äºã
### SPARQL æ¥è©¢èªè¨
**SPARQL** æ¯ä¸ç¨®ç¨æ¼ä¸å
çµå²åçé¢å RDF è³ææ¨¡åçæ¥è©¢èªè¨ã43ãï¼å®æ¯ SPARQL åè°å RDF æ¥è©¢èªè¨ç縮寫ï¼ç¼é³çº âsparkleâï¼ãSPARQL æ©æ¼ Cypherï¼ä¸¦ä¸ç±æ¼ Cypher çæ¨¡å¼å¹é
åéæ¼ SPARQLï¼é使å¾å®åçèµ·ä¾é常ç¸ä¼¼ã37ãã
èä¹åç¸åçæ¥è©¢ ââ æ¥è©¢å¾ç¾åç§»æ°å°ææ´²ç人 ââ ä½¿ç¨ SPARQL æ¯ä½¿ç¨ Cypher çè³æ´çºç°¡æ½ï¼è«åé± [ä¾ 2-9]()ï¼ã
**ä¾ 2-9 èç¤ºä¾ 2-4 ç¸åçæ¥è©¢ï¼ç¨ SPARQL 表示**
```sparql
PREFIX :
SELECT ?personName WHERE {
?person :name ?personName.
?person :bornIn / :within* / :name "United States".
?person :livesIn / :within* / :name "Europe".
}
```
çµæ§é常ç¸ä¼¼ã以ä¸å
©åè¡¨ç¤ºå¼æ¯çå¹çï¼SPARQL ä¸çè®æ¸ä»¥åèéé ï¼ï¼
```
(person) -[:BORN_IN]-> () -[:WITHIN*0..]-> (location) # Cypher
?person :bornIn / :within* ?location. # SPARQL
```
å çº RDF ä¸åå屬æ§åéï¼èåªæ¯å°å®åä½çºè¬èªï¼æä»¥å¯ä»¥ä½¿ç¨ç¸åçèªæ³ä¾å¹é
屬æ§ãå¨ä¸é¢ç表示å¼ä¸ï¼è®æ¸ `usa` 被繫çµå°ä»»æ `name` 屬æ§çºåä¸²å¼ `"United States"` çé é»ï¼
```
(usa {name:'United States'}) # Cypher
?usa :name "United States". # SPARQL
```
SPARQL æ¯ä¸ç¨®å¾å¥½çæ¥è©¢èªè¨ ââ åç®¡å®æ§æ³çèªç¾©ç¶²å¾æªå¯¦ç¾ï¼ä½å®ä»ç¶æ¯ä¸ç¨®å¯ç¨æ¼æç¨ç¨å¼å
§é¨ç強大工å
·ã
> #### åå½¢è³æåº«èç¶²çæ¨¡åç¸æ¯è¼
>
> å¨ â[æä»¶è³æåº«æ¯å¦å¨éè¹è¦è½ï¼](#æä»¶è³æåº«æ¯å¦å¨éè¹è¦è½ï¼)â ä¸ï¼æåè¨è«äº CODASYL åéä¿æ¨¡åå¦ä½ç«¶ç¸è§£æ±º IMS ä¸çå¤å°å¤éä¿åé¡ãä¹ä¸çï¼CODASYL çç¶²çæ¨¡åçèµ·ä¾è忍¡åç¸ä¼¼ãCODASYL æ¯å¦æ¯åå½¢è³æåº«ç第äºåè®ç¨®ï¼
>
> ä¸ï¼ä»åå¨å¹¾åéè¦æ¹é¢ææä¸åï¼
>
> * å¨ CODASYL ä¸ï¼è³æåº«æä¸å模å¼ï¼ç¨æ¼æå®åªç¨®è¨éåå¥å¯ä»¥å·¢çå¨å
¶ä»è¨éåå¥ä¸ãå¨åå½¢è³æåº«ä¸ï¼ä¸åå¨é樣çéå¶ï¼ä»»ä½é é»é½å¯ä»¥å
·æå°å
¶ä»ä»»ä½é é»çéãéçºæç¨ç¨å¼é©æä¸æ·è®åçéæ±æä¾äºæ´å¤§çéæ´»æ§ã
> * å¨ CODASYL ä¸ï¼éå°ç¹å®è¨éçå¯ä¸æ¹æ³æ¯éæ·å
¶ä¸çä¸å訪åè·¯å¾ãå¨åå½¢è³æåº«ä¸ï¼å¯ä»¥ééå
¶å¯ä¸ ID ç´æ¥å¼ç¨ä»»ä½é é»ï¼ä¹å¯ä»¥ä½¿ç¨ç´¢å¼ä¾æ¥è©¢å
·æç¹å®å¼çé é»ã
> * å¨ CODASYL ä¸ï¼è¨éçåå°æ¡æ¯ä¸åæåºéåï¼æä»¥è³æåº«å¿
é å»ç®¡çå®åçæ¬¡åºï¼éæå½±é¿å²åä½å±ï¼ï¼ä¸¦ä¸æç¨ç¨å¼å¨æå
¥æ°è¨éå°è³æåº«æå¿
é éæ³¨æ°è¨éå¨éäºéåä¸çä½ç½®ãå¨åå½¢è³æåº«ä¸ï¼é é»å鿝ç¡åºçï¼åªè½å¨æ¥è©¢æå°çµæé²è¡æåºï¼ã
> * å¨ CODASYL ä¸ï¼æææ¥è©¢é½æ¯å½ä»¤å¼çï¼é£ä»¥ç·¨å¯«ï¼ä¸¦ä¸å¾å®¹æå æ¶æ§è®åèåå°ç ´å£ãå¨åå½¢è³æåº«ä¸ï¼ä½ å¯ä»¥å¨å½ä»¤å¼ç¨å¼ç¢¼ä¸æå¯«éæ·éç¨ï¼ä½å¤§å¤æ¸åå½¢è³æåº«é½æ¯æ´é«é宣å弿¥è©¢ï¼å¦ Cypher æ SPARQLã
>
>
### åºç¤ï¼Datalog
**Datalog** æ¯æ¯ SPARQLãCypher æ´å¤èçèªè¨ï¼å¨ 20 ä¸ç´ 80 年代被å¸è
廣æ³ç ç©¶ã44,45,46ããå®å¨è»é«å·¥ç¨å¸«ä¸ä¸å¤ªç¥åï¼ä½æ¯å®æ¯éè¦çï¼å çºå®çºä»¥å¾çæ¥è©¢èªè¨æä¾äºåºç¤ã
實è¸ä¸ï¼Datalog 卿éçå¹¾åè³æç³»çµ±ä¸ä½¿ç¨ï¼ä¾å¦ï¼å®æ¯ Datomic ã40ãçæ¥è©¢èªè¨ï¼Cascalog ã47ãæ¯ä¸ç¨®ç¨æ¼æ¥è©¢ Hadoop å¤§è³æéç Datalog å¯¦ç¾ [^viii]ã
[^viii]: Datomic å Cascalog ä½¿ç¨ Datalog ç Clojure S 表示å¼èªæ³ãå¨ä¸é¢çä¾åä¸ä½¿ç¨äºä¸åæ´å®¹æé±è®ç Prolog èªæ³ï¼ä½å
©è
æ²æä»»ä½åè½å·®ç°ã
Datalog çè³ææ¨¡åé¡ä¼¼æ¼ä¸å
çµæ¨¡å¼ï¼ä½é²è¡äºä¸é»æ³åãæä¸å
çµå¯«æ **è¬èª**ï¼**主èªï¼è³èª**ï¼ï¼è䏿¯å¯«ä¸å
èªï¼**主èªï¼è¬èªï¼è³èª**ï¼ã[ä¾ 2-10]() 顯示çå¦ä½ç¨ Datalog 寫å
¥æåçä¾åä¸çè³æã
**ä¾ 2-10 ç¨ Datalog ä¾è¡¨ç¤ºå 2-5 ä¸çè³æåé**
```prolog
name(namerica, 'North America').
type(namerica, continent).
name(usa, 'United States').
type(usa, country).
within(usa, namerica).
name(idaho, 'Idaho').
type(idaho, state).
within(idaho, usa).
name(lucy, 'Lucy').
born_in(lucy, idaho).
```
æ¢ç¶å·²ç¶å®ç¾©äºè³æï¼æåå¯ä»¥åä¹å䏿¨£ç·¨å¯«ç¸åçæ¥è©¢ï¼å¦ [ä¾ 2-11]() æç¤ºãå®çèµ·ä¾è Cypher æ SPARQL çèªæ³å·®ç°è¼å¤§ï¼ä½è«ä¸è¦ææå®ãDatalog æ¯ Prolog çä¸ååéï¼å¦æä½ æ¯è¨ç®æ©ç§å¸å°æ¥çå¸çï¼å¯è½å·²ç¶è¦é Prologã
**ä¾ 2-11 èç¤ºä¾ 2-4 ç¸åçæ¥è©¢ï¼ç¨ Datalog 表示**
```
within_recursive(Location, Name) :- name(Location, Name). /* Rule 1 */
within_recursive(Location, Name) :- within(Location, Via), /* Rule 2 */
within_recursive(Via, Name).
migrated(Name, BornIn, LivingIn) :- name(Person, Name), /* Rule 3 */
born_in(Person, BornLoc),
within_recursive(BornLoc, BornIn),
lives_in(Person, LivingLoc),
within_recursive(LivingLoc, LivingIn).
?- migrated(Who, 'United States', 'Europe'). /* Who = 'Lucy'. */
```
Cypher å SPARQL ä½¿ç¨ SELECT ç«å³è·³è½ï¼ä½æ¯ Datalog 䏿¬¡åªé²è¡ä¸å°æ¥ãæåå®ç¾© **è¦å**ï¼ä»¥å°æ°è¬èªåè¨´è³æåº«ï¼å¨éè£¡ï¼æåå®ç¾©äºå
©åæ°çè¬èªï¼`within_recursive` å `migrated`ãéäºè¬èªä¸æ¯å²åå¨è³æåº«ä¸çä¸å
çµä¸ï¼èæ¯å¾è³ææå
¶ä»è¦åæ´¾çèä¾çãè¦åå¯ä»¥å¼ç¨å
¶ä»è¦åï¼å°±åå½å¼å¯ä»¥å¼å«å
¶ä»å½å¼æè
éè¿´å°å¼å«èªå·±ä¸æ¨£ãå鿍£ï¼è¤éçæ¥è©¢å¯ä»¥èç±å°çç£ç¦æ§å»ºèµ·ä¾ã
å¨è¦åä¸ï¼ä»¥å¤§å¯«åæ¯éé çå®è©æ¯è®æ¸ï¼è¬èªåç¨ Cypher å SPARQL çæ¹å¼ä¸æ¨£ä¾å¹é
ãä¾å¦ï¼`name(Location, Name)` ééè®æ¸ç¹«çµ `Location = namerica` å `Name ='North America'` å¯ä»¥å¹é
ä¸å
çµ `name(namerica, 'North America')`ã
è¦æ¯ç³»çµ±å¯ä»¥å¨ `:-` éç®åçå³å´æ¾å°èææè¬èªçä¸åå¹é
ï¼å°±éç¨è©²è¦åãç¶è¦åéç¨æï¼å°±å¥½åéé `:-` çå·¦å´å°å
¶æ°å¢å°è³æåº«ï¼å°è®æ¸æ¿ææå®åå¹é
çå¼ï¼ã
å æ¤ï¼ä¸ç¨®å¯è½çæç¨è¦åçæ¹å¼æ¯ï¼
1. è³æåº«åå¨ `name (namerica, 'North America')`ï¼æ
éç¨è¦å 1ãå®çæ `within_recursive (namerica, 'North America')`ã
2. è³æåº«åå¨ `within (usa, namerica)`ï¼å¨ä¸ä¸æ¥é©ä¸çæ `within_recursive (namerica, 'North America')`ï¼æ
éç¨è¦å 2ã宿ç¢ç `within_recursive (usa, 'North America')`ã
3. è³æåº«åå¨ `within (idaho, usa)`ï¼å¨ä¸ä¸æ¥çæ `within_recursive (usa, 'North America')`ï¼æ
éç¨è¦å 2ãå®ç¢ç `within_recursive (idaho, 'North America')`ã
éééè¤æç¨è¦å 1 å 2ï¼`within_recursive` è¬èªå¯ä»¥å訴æåå¨è³æåº«ä¸å
å«åç¾ï¼æä»»ä½å
¶ä»ä½ç½®å稱ï¼çææä½ç½®ãéåéç¨å¦ [å 2-6](../img/fig2-6.png) æç¤ºã

**å 2-6 使ç¨ç¤ºä¾ 2-11 ä¸ç Datalog è¦åä¾ç¢ºå®æéè·å·å¨åç¾ã**
ç¾å¨è¦å 3 å¯ä»¥æ¾å°åºç卿åå°æ¹ `BornIn` ç人ï¼ä¸¦ä½å¨æåå°æ¹ `LivingIn`ãééæ¥è©¢ `BornIn ='United States'` å `LivingIn ='Europe'`ï¼ä¸¦å°æ¤äººä½çºè®æ¸ `Who`ï¼è® Datalog 系統æ¾åºè®æ¸ `Who` æåºç¾åªäºå¼ãå æ¤ï¼æå¾å¾å°äºèæ©å
ç Cypher å SPARQL æ¥è©¢ç¸åççæ¡ã
ç¸å°æ¼æ¬ç« è¨è«çå
¶ä»æ¥è©¢èªè¨ï¼æåéè¦æ¡åä¸åçæç¶æ¹å¼ä¾æè Datalog æ¹æ³ï¼ä½éæ¯ä¸ç¨®éå¸¸å¼·å¤§çæ¹æ³ï¼å çºè¦åå¯ä»¥å¨ä¸åçæ¥è©¢ä¸é²è¡çµååéç¨ãéç¶å°æ¼ç°¡å®ç䏿¬¡æ§æ¥è©¢ï¼é¡¯å¾ä¸å¤ªæ¹ä¾¿ï¼ä½æ¯å®å¯ä»¥æ´å¥½å°èçè³æå¾è¤éçæ
æ³ã
## æ¬ç« å°çµ
è³ææ¨¡åæ¯ä¸å巨大ç課é¡ï¼å¨æ¬ç« ä¸ï¼æåå¿«éç覽äºå種ä¸åçæ¨¡åãæåæ²æè¶³å¤ çç¯å¹
ä¾è©³è¿°æ¯å模åçç´°ç¯ï¼ä½æ¯å¸æéåæ¦è¿°è¶³ä»¥æ¿èµ·ä½ çèè¶£ï¼ä»¥æ´å¤å°çè§£æé©åä½ çæç¨éæ±ç模åã
卿·å²ä¸ï¼è³ææéå§è¢«è¡¨ç¤ºçºä¸æ£µå¤§æ¨¹ï¼å±¤æ¬¡è³ææ¨¡åï¼ï¼ä½æ¯éä¸å©æ¼è¡¨ç¤ºå¤å°å¤çéä¿ï¼æä»¥ç¼æäºéä¿æ¨¡åä¾è§£æ±ºéååé¡ãæè¿ï¼éç¼äººå¡ç¼ç¾ä¸äºæç¨ç¨å¼ä¹ä¸é©åæ¡ç¨éä¿æ¨¡åãæ°çééä¿å âNoSQLâ è³æå²åååçºå
©åä¸»è¦æ¹åï¼
1. **æä»¶è³æåº«** 主è¦éæ³¨èªæå
å«çè³ææä»¶ï¼è䏿件ä¹éçéä¿é常ç¨å°ã
2. **åå½¢è³æåº«** ç¨æ¼ç¸åçå ´æ¯ï¼ä»»æäºç©ä¹éé½å¯è½å卿½å¨çéè¯ã
éä¸ç¨®æ¨¡åï¼æä»¶ï¼éä¿ååå½¢ï¼å¨ä»å¤©é½è¢«å»£æ³ä½¿ç¨ï¼ä¸¦ä¸å¨åèªçé åé½ç¼æ®å¾å¥½ãä¸å模åå¯ä»¥ç¨å¦ä¸å模å便¨¡æ¬ ââ ä¾å¦ï¼åè³æå¯ä»¥å¨éä¿è³æåº«ä¸è¡¨ç¤º ââ ä½çµæå¾å¾æ¯ç³ç³çãéå°±æ¯çºä»éº¼æåæèéå°ä¸åç®ççä¸å系統ï¼è䏿¯ä¸åå®ä¸çè¬è½è§£æ±ºæ¹æ¡ã
æä»¶è³æåº«ååè³æåº«æä¸åå
±åé»ï¼é£å°±æ¯å®åé叏䏿å°å²åçè³æå¼·å¶ç´æçºç¹å®æ¨¡å¼ï¼éå¯ä»¥ä½¿æç¨ç¨å¼æ´å®¹æé©æä¸æ·è®åçéæ±ã使¯æç¨ç¨å¼å¾å¯è½ä»æåå®è³æå
·æä¸å®ççµæ§ï¼åå¥å
卿¼æ¨¡å¼æ¯**æç¢ºç**ï¼å¯«å
¥æå¼·å¶ï¼éæ¯**é±å«ç**ï¼è®åæèçï¼ã
æ¯åè³ææ¨¡åé½å
·æåèªçæ¥è©¢èªè¨ææ¡æ¶ï¼æåè¨è«äºå¹¾åä¾åï¼SQLï¼MapReduceï¼MongoDB çèå管éï¼Cypherï¼SPARQL å Datalogãæåä¹è«å°äº CSS å XSL/XPathï¼å®å䏿¯è³æåº«æ¥è©¢èªè¨ï¼èå
å«æè¶£çç¸ä¼¼ä¹èã
éç¶æåå·²ç¶è¦èäºå¾å¤å±¤é¢ï¼ä½ä»ç¶æè¨±å¤è³ææ¨¡åæ²ææå°ãèå¹¾åç°¡å®çä¾åï¼
* 使ç¨åºå çµè³æçç 究人å¡é常éè¦å·è¡ **åºåç¸ä¼¼æ§æå°**ï¼éæå³èéè¦ä¸åå¾é·çå串ï¼ä»£è¡¨ä¸å DNA åºåï¼ï¼ä¸¦å¨ä¸åææé¡ä¼¼ä½ä¸å®å
¨ç¸åçå串ç大åè³æåº«ä¸å°æ¾å¹é
ãé裡ææè¿°çè³æåº«é½ä¸è½èçéç¨®ç¨æ³ï¼éå°±æ¯çºä»éº¼ç 究人å¡ç·¨å¯«äºå GenBank 鿍£çå°éçåºå çµè³æåº«è»é«çåå ã48ãã
* ç²åç©çå¸å®¶æ¸åå¹´ä¾ä¸ç´å¨é²è¡å¤§è³æåå¥çå¤§è¦æ¨¡è³æåæï¼å大åå¼·åå°ææ©ï¼LHCï¼é樣çå°æ¡ç¾å¨æèçæ¸ç¾ PB çè³æï¼å¨é樣çè¦æ¨¡ä¸ï¼éè¦å®è£½è§£æ±ºæ¹æ¡ä¾é»æ¢ç¡¬é«ææ¬ç失æ§ã49ãã
* **å
¨ææå°** å¯ä»¥èªªæ¯ä¸ç¨®ç¶å¸¸èè³æåº«ä¸èµ·ä½¿ç¨çè³ææ¨¡åãè³è¨æª¢ç´¢æ¯ä¸åå¾å¤§çå°æ¥èª²é¡ï¼æå䏿卿¬æ¸ä¸è©³ç´°ä»ç´¹ï¼ä½æ¯æåå°å¨ç¬¬ä¸ç« å第ä¸é¨åä¸ä»ç´¹æå°ç´¢å¼ã
è®æåæ«æå°å
¶æ¾å¨ä¸éãå¨ [ä¸ä¸ç« ](ch3.md) ä¸ï¼æåå°è¨è«å¨ **實ç¾** æ¬ç« æè¿°çè³ææ¨¡åææéå°çä¸äºæ¬è¡¡ã
## åèæç»
1. Edgar F. Codd: â[A Relational Model of Data for Large Shared Data Banks](https://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf),â *Communications of the ACM*, volume 13, number 6, pages 377â387, June 1970. [doi:10.1145/362384.362685](http://dx.doi.org/10.1145/362384.362685)
1. Michael Stonebraker and Joseph M. Hellerstein: â[What Goes Around Comes Around](http://mitpress2.mit.edu/books/chapters/0262693143chapm1.pdf),â in *Readings in Database Systems*, 4th edition, MIT Press, pages 2â41, 2005. ISBN: 978-0-262-69314-1
1. Pramod J. Sadalage and Martin Fowler: *NoSQL Distilled*. Addison-Wesley, August 2012. ISBN: 978-0-321-82662-6
1. Eric Evans: â[NoSQL: What's in a Name?](https://web.archive.org/web/20190623045155/http://blog.sym-link.com/2009/10/30/nosql_whats_in_a_name.html),â *blog.sym-link.com*, October 30, 2009.
1. James Phillips: â[Surprises in Our NoSQL Adoption Survey](http://blog.couchbase.com/nosql-adoption-survey-surprises),â *blog.couchbase.com*, February 8, 2012.
1. Michael Wagner: *SQL/XML:2006 â Evaluierung der Standardkonformität ausgewählter Datenbanksysteme*. Diplomica Verlag, Hamburg, 2010. ISBN: 978-3-836-64609-3
1. â[XML Data (SQL Server)](https://docs.microsoft.com/en-us/sql/relational-databases/xml/xml-data-sql-server?view=sql-server-ver15),â SQL Server documentation, *docs.microsoft.com*, 2013.
1. â[PostgreSQL 9.3.1 Documentation](http://www.postgresql.org/docs/9.3/static/index.html),â The PostgreSQL Global Development Group, 2013.
1. â[The MongoDB 2.4 Manual](http://docs.mongodb.org/manual/),â MongoDB, Inc., 2013.
1. â[RethinkDB 1.11 Documentation](http://www.rethinkdb.com/docs/),â *rethinkdb.com*, 2013.
1. â[Apache CouchDB 1.6 Documentation](http://docs.couchdb.org/en/latest/),â *docs.couchdb.org*, 2014.
1. Lin Qiao, Kapil Surlaker, Shirshanka Das, et al.: â[On Brewing Fresh Espresso: LinkedInâs Distributed Data Serving Platform](http://www.slideshare.net/amywtang/espresso-20952131),â at *ACM International Conference on Management of Data* (SIGMOD), June 2013.
1. Rick Long, Mark Harrington, Robert Hain, and Geoff Nicholls: [*IMS Primer*](http://www.redbooks.ibm.com/redbooks/pdfs/sg245352.pdf). IBM Redbook SG24-5352-00, IBM International Technical Support Organization, January 2000.
1. Stephen D. Bartlett: â[IBMâs IMSâMyths, Realities, and Opportunities](https://public.dhe.ibm.com/software/data/ims/pdf/TCG2013015LI.pdf),â The Clipper Group Navigator, TCG2013015LI, July 2013.
1. Sarah Mei: â[Why You Should Never Use MongoDB](http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/),â *sarahmei.com*, November 11, 2013.
1. J. S. Knowles and D. M. R. Bell: âThe CODASYL Model,â in *DatabasesâRole and Structure: An Advanced Course*, edited by P. M. Stocker, P. M. D. Gray, and M. P. Atkinson, pages 19â56, Cambridge University Press, 1984. ISBN: 978-0-521-25430-4
1. Charles W. Bachman: â[The Programmer as Navigator](http://dl.acm.org/citation.cfm?id=362534),â *Communications of the ACM*, volume 16, number 11, pages 653â658, November 1973. [doi:10.1145/355611.362534](http://dx.doi.org/10.1145/355611.362534)
1. Joseph M. Hellerstein, Michael Stonebraker, and James Hamilton: â[Architecture of a Database System](http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf),â *Foundations and Trends in Databases*, volume 1, number 2, pages 141â259, November 2007. [doi:10.1561/1900000002](http://dx.doi.org/10.1561/1900000002)
1. Sandeep Parikh and Kelly Stirman: â[Schema Design for Time Series Data in MongoDB](http://blog.mongodb.org/post/65517193370/schema-design-for-time-series-data-in-mongodb),â *blog.mongodb.org*, October 30, 2013.
1. Martin Fowler: â[Schemaless Data Structures](http://martinfowler.com/articles/schemaless/),â *martinfowler.com*, January 7, 2013.
1. Amr Awadallah: â[Schema-on-Read vs. Schema-on-Write](http://www.slideshare.net/awadallah/schemaonread-vs-schemaonwrite),â at *Berkeley EECS RAD Lab Retreat*, Santa Cruz, CA, May 2009.
1. Martin Odersky: â[The Trouble with Types](http://www.infoq.com/presentations/data-types-issues),â at *Strange Loop*, September 2013.
1. Conrad Irwin: â[MongoDBâConfessions of a PostgreSQL Lover](https://speakerdeck.com/conradirwin/mongodb-confessions-of-a-postgresql-lover),â at *HTML5DevConf*, October 2013.
1. â[Percona Toolkit Documentation: pt-online-schema-change](http://www.percona.com/doc/percona-toolkit/2.2/pt-online-schema-change.html),â Percona Ireland Ltd., 2013.
1. Rany Keddo, Tobias Bielohlawek, and Tobias Schmidt: â[Large Hadron Migrator](https://github.com/soundcloud/lhm),â SoundCloud, 2013.
1. Shlomi Noach: â[gh-ost: GitHub's Online Schema Migration Tool for MySQL](http://githubengineering.com/gh-ost-github-s-online-migration-tool-for-mysql/),â *githubengineering.com*, August 1, 2016.
1. James C. Corbett, Jeffrey Dean, Michael Epstein, et al.: â[Spanner: Googleâs Globally-Distributed Database](https://research.google/pubs/pub39966/),â at *10th USENIX Symposium on Operating System Design and Implementation* (OSDI), October 2012.
1. Donald K. Burleson: â[Reduce I/O with Oracle Cluster Tables](https://web.archive.org/web/20231207233228/http://www.dba-oracle.com/oracle_tip_hash_index_cluster_table.htm),â *dba-oracle.com*.
1. Fay Chang, Jeffrey Dean, Sanjay Ghemawat, et al.: â[Bigtable: A Distributed Storage System for Structured Data](https://research.google/pubs/pub27898/),â at *7th USENIX Symposium on Operating System Design and Implementation* (OSDI), November 2006.
1. Bobbie J. Cochrane and Kathy A. McKnight: â[DB2 JSON Capabilities, Part 1: Introduction to DB2 JSON](https://web.archive.org/web/20180516203043/https://www.ibm.com/developerworks/data/library/techarticle/dm-1306nosqlforjson1/),â IBM developerWorks, June 20, 2013.
1. Herb Sutter: â[The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software](http://www.gotw.ca/publications/concurrency-ddj.htm),â *Dr. Dobb's Journal*, volume 30, number 3, pages 202-210, March 2005.
1. Joseph M. Hellerstein: â[The Declarative Imperative: Experiences and Conjectures in Distributed Logic](http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-90.pdf),â Electrical Engineering and Computer Sciences, University of California at Berkeley, Tech report UCB/EECS-2010-90, June 2010.
1. Jeffrey Dean and Sanjay Ghemawat: â[MapReduce: Simplified Data Processing on Large Clusters](https://research.google/pubs/pub62/),â at *6th USENIX Symposium on Operating System Design and Implementation* (OSDI), December 2004.
1. Craig Kerstiens: â[JavaScript in Your Postgres](https://blog.heroku.com/javascript_in_your_postgres),â *blog.heroku.com*, June 5, 2013.
1. Nathan Bronson, Zach Amsden, George Cabrera, et al.: â[TAO: Facebookâs Distributed Data Store for the Social Graph](https://www.usenix.org/conference/atc13/technical-sessions/presentation/bronson),â at *USENIX Annual Technical Conference* (USENIX ATC), June 2013.
1. â[Apache TinkerPop3.2.3 Documentation](http://tinkerpop.apache.org/docs/3.2.3/reference/),â *tinkerpop.apache.org*, October 2016.
1. â[The Neo4j Manual v2.0.0](http://docs.neo4j.org/chunked/2.0.0/index.html),â Neo Technology, 2013.
1. Emil Eifrem: [Twitter correspondence](https://twitter.com/emileifrem/status/419107961512804352), January 3, 2014.
1. David Beckett and Tim Berners-Lee: â[Turtle â Terse RDF Triple Language](http://www.w3.org/TeamSubmission/turtle/),â W3C Team Submission, March 28, 2011.
1. â[Datomic Development Resources](http://docs.datomic.com/),â Metadata Partners, LLC, 2013.
1. W3C RDF Working Group: â[Resource Description Framework (RDF)](http://www.w3.org/RDF/),â *w3.org*, 10 February 2004.
1. â[Apache Jena](http://jena.apache.org/),â Apache Software Foundation.
1. Steve Harris, Andy Seaborne, and Eric Prud'hommeaux: â[SPARQL 1.1 Query Language](http://www.w3.org/TR/sparql11-query/),â W3C Recommendation, March 2013.
1. Todd J. Green, Shan Shan Huang, Boon Thau Loo, and Wenchao Zhou: â[Datalog and Recursive Query Processing](http://blogs.evergreen.edu/sosw/files/2014/04/Green-Vol5-DBS-017.pdf),â *Foundations and Trends in Databases*, volume 5, number 2, pages 105â195, November 2013. [doi:10.1561/1900000017](http://dx.doi.org/10.1561/1900000017)
1. Stefano Ceri, Georg Gottlob, and Letizia Tanca: â[What You Always Wanted to Know About Datalog (And Never Dared to Ask)](https://www.researchgate.net/profile/Letizia_Tanca/publication/3296132_What_you_always_wanted_to_know_about_Datalog_and_never_dared_to_ask/links/0fcfd50ca2d20473ca000000.pdf),â *IEEE Transactions on Knowledge and Data Engineering*, volume 1, number 1, pages 146â166, March 1989. [doi:10.1109/69.43410](http://dx.doi.org/10.1109/69.43410)
1. Serge Abiteboul, Richard Hull, and Victor Vianu: [*Foundations of Databases*](http://webdam.inria.fr/Alice/). Addison-Wesley, 1995. ISBN: 978-0-201-53771-0, available online at *webdam.inria.fr/Alice*
1. Nathan Marz: â[Cascalog](https://github.com/nathanmarz/cascalog)," *github.com*.
1. Dennis A. Benson, Ilene Karsch-Mizrachi, David J. Lipman, et al.: â[GenBank](https://academic.oup.com/nar/article/36/suppl_1/D25/2507746),â *Nucleic Acids Research*, volume 36, Database issue, pages D25âD30, December 2007. [doi:10.1093/nar/gkm929](http://dx.doi.org/10.1093/nar/gkm929)
1. Fons Rademakers: â[ROOT for Big Data Analysis](https://indico.cern.ch/event/246453/contributions/1566610/attachments/423154/587535/ROOT-BigData-Analysis-London-2013.pdf),â at *Workshop on the Future of Big Data Management*, London, UK, June 2013.
------
| ä¸ä¸ç« | ç®é | ä¸ä¸ç« |
| -------------------------------------------- | ------------------------------- | ---------------------------- |
| [第ä¸ç« ï¼å¯é æ§ãå¯ä¼¸ç¸®æ§åå¯ç¶è·æ§](ch1.md) | [è¨è¨è³æå¯éåæç¨](README.md) | [第ä¸ç« ï¼å²åèæª¢ç´¢](ch3.md) |