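Data Analysis Serious-Study Advent Calendar, day 23.
So far I have been studying how to process data and how to tackle various tasks, but this time I want to step back to the fundamentals. Fundamental does not mean easy: precisely because it is fundamental, it is hard, and it gets at the essence.
I will gather up the techniques used in data analysis and add a short explanation to each. That said, I am not good at the math in this area*1, so I will write down my own intuitive picture wherever possible.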
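- Are the space and distance we live in "correct" in the first place?
- Distances between points
- Distances between distributions
- Kernels (reproducing kernel Hilbert space)
- Topological Data Analysis (TDA)
- Dimensionality reduction / Embedding
- Summary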
Are the space and distance we live in "correct" in the first place?
By "correct" I mean "appropriate for use in data analysis". Suppose you see a sign that says "1.2 km to Shinjuku". It means that, in the coordinate space called Earth, the distance to Shinjuku is 1.2 km. In everyday life we never question this, because everyday distance takes Euclidean space as its basic premise.
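Euclidean space / Euclidean distance
Put simply, this is the space we lived in up through high school*2. Ask most people for the distance from (0,0) to (3,4) and they will think "Pythagorean theorem!" and come up with 5.
But take a look at the figure. A space is not necessarily described by orthogonal x, y coordinates. Strictly speaking, the Earth is round, so we ought to measure distance in a curved space. There is also room for variation in how distance is measured. Euclidean distance makes us think of the length of the hypotenuse, but in a city, for example, you often cannot cut diagonally across blocks, so distance is measured on the assumption that you move in a zigzag along the grid*3. In this way, both the space and the way of measuring should change with the situation.
Data analysis covers many fields and many tasks, and whether a given problem really should be treated in Euclidean space with Euclidean distance is a question worth asking. Precisely because this is so fundamental, methods that work across many spaces and distances are elegant and general-purpose.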
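To make the point concrete, here is a minimal sketch comparing three ways of measuring "how far": straight-line (Euclidean), grid-walking (Manhattan), and great-circle distance on a sphere. The function names and the Earth radius of 6371 km are my own illustrative choices, not something from the original post.

```python
import math

def euclidean(p, q):
    """Straight-line distance (the hypotenuse)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """Distance when you can only move along the grid axes."""
    return sum(abs(a - b) for a, b in zip(p, q))

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance on a sphere; radius_km is an assumed mean Earth radius."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

print(euclidean((0, 0), (3, 4)))  # 5.0
print(manhattan((0, 0), (3, 4)))  # 7
```

The same pair of points is 5 apart or 7 apart depending on which rule you adopt, which is exactly why the choice of distance deserves a moment of thought.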
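Distances between points
Name | Formula | Notes | Reference (if any) |
---|---|---|---|
Euclid | $\sqrt{\sum_i (x_i - y_i)^2}$ | The ordinary distance | |
Manhattan | $\sum_i \lvert x_i - y_i \rvert$ | Less sensitive to outliers | |
Minkowski | $\left(\sum_i \lvert x_i - y_i \rvert^p\right)^{1/p}$ | Generalizes Euclid ($p=2$), Manhattan ($p=1$), and Chebyshev ($p \to \infty$) | |
Chebyshev | $\max_i \lvert x_i - y_i \rvert$ | Picks out only the dimension where the components differ the most | |
Mahalanobis | $\sqrt{(x - y)^\top \Sigma^{-1} (x - y)}$ | Distance computed to match the shape of a normal distribution's covariance $\Sigma$. Used a lot | |
Hellinger | $\frac{1}{\sqrt{2}} \sqrt{\sum_i (\sqrt{p_i} - \sqrt{q_i})^2}$ | Less sensitive to outliers; defined for probability vectors $p$, $q$ | |
Hamming | $\sum_i \mathbb{1}[x_i \neq y_i]$ | Number of positions at which the vector elements disagree. Used for categorical variables | |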
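The rows of the table can be sketched in a few lines of plain Python. This is only an illustration under my own naming; in particular `mahalanobis` assumes the caller passes an already inverted covariance matrix as nested lists.

```python
import math

def minkowski(x, y, p):
    """p=1 gives Manhattan, p=2 gives Euclidean, large p approaches Chebyshev."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def chebyshev(x, y):
    """Largest difference along any single coordinate."""
    return max(abs(a - b) for a, b in zip(x, y))

def hamming(x, y):
    """Number of positions where the two sequences disagree."""
    return sum(1 for a, b in zip(x, y) if a != b)

def mahalanobis(x, y, inv_cov):
    """sqrt(d^T * inv_cov * d); inv_cov is the inverse covariance matrix (assumed given)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    quad = sum(d[i] * inv_cov[i][j] * d[j] for i in range(n) for j in range(n))
    return math.sqrt(quad)

x, y = (0.0, 0.0), (3.0, 4.0)
print(minkowski(x, y, 1))             # 7.0
print(minkowski(x, y, 2))             # 5.0
print(chebyshev(x, y))                # 4.0
print(hamming("karolin", "kathrin"))  # 3
```

With the identity matrix as `inv_cov`, Mahalanobis reduces to Euclidean distance; a covariance stretched along one axis shrinks distances measured along that axis.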
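Distances between distributions
- Estimating distances between probability distributions: recent trends in machine learning (2013)
- [Visualization] How measures of distributional difference such as Kullback-Leibler differ from one another
When we talk about distance between distributions, there are many distances / distance measures. Strictly speaking, some of them do not satisfy the axioms of a metric (non-negativity, zero for identical points, symmetry, and the triangle inequality)*4, so they cannot be called distances. They are nevertheless widely used to express difference, and so they serve as distance measures.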
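Name | Formula | Notes | Reference (if any) |
---|---|---|---|
Histogram Intersection | $\sum_i \min(p_i, q_i)$ | (A similarity, not a distance.) Used for discrete values such as histograms. The region the two distributions have in common | |
KL divergence | $\sum_i p_i \log \frac{p_i}{q_i}$ | Computes the distance between two distributions based on relative entropy (asymmetric) | |
JS divergence | $\frac{1}{2} D_{\mathrm{KL}}(p \parallel m) + \frac{1}{2} D_{\mathrm{KL}}(q \parallel m)$, $m = \frac{1}{2}(p + q)$ | A modification of KL that makes it symmetric | |
L1 norm | $\int \lvert p(x) - q(x) \rvert \, dx$ | Sum of absolute errors over a continuous distribution | |
L2 norm | $\sqrt{\int (p(x) - q(x))^2 \, dx}$ | Based on the sum of squared errors over a continuous distribution | |
Wasserstein distance | $\inf_{\gamma \in \Pi(p, q)} \mathbb{E}_{(x, y) \sim \gamma}[\lVert x - y \rVert]$ | Treats a distribution as a pile of cargo; the cost of moving that cargo to form the other distribution | |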
I have put implementations showing how each of these differs on GitHub (day23).
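Independently of that repository, the discrete versions of these measures fit in a few lines. The sketch below is my own and assumes `p` and `q` are probability vectors of equal length on the same bins; the sample values are made up purely for illustration.

```python
import math

def histogram_intersection(p, q):
    """Shared area of two histograms (a similarity, larger means closer)."""
    return sum(min(a, b) for a, b in zip(p, q))

def kl_divergence(p, q):
    """KL(p || q); terms with p_i == 0 contribute 0. Assumes q_i > 0 wherever p_i > 0."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

def js_divergence(p, q):
    """Symmetrized KL measured against the midpoint distribution m."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def l1_distance(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def l2_distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

p = [0.1, 0.4, 0.5]  # illustrative values only
q = [0.3, 0.3, 0.4]
print(kl_divergence(p, q), kl_divergence(q, p))  # the two directions differ
print(js_divergence(p, q), js_divergence(q, p))  # these agree
```

Swapping the arguments changes the KL value but not the JS value, which is the symmetry axiom failing and holding, respectively.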
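Wasserstein metric
In physics this has long existed as a transport distance, but with the arrival of WGAN it has started to attract attention as a way of expressing distance between distributions.
- It is the minimum transport cost obtained by solving a transport-cost minimization problem
- Because the minimum cost has to be computed each time, computation takes a very long time. In exchange, it is a distance that is physically meaningful and essential.
- For the implementation I use a library called POT (Optimal Transportation) (GitHub)
    pip install Cython
    pip install POT
- An explanation of Earth Mover's Distance*5 (intuitive and easy to follow)
- An article on Wasserstein GAN (includes a detailed explanation of the Wasserstein distance)
- Machine learning with Wasserstein
- A PyTorch implementation of Wasserstein GAN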
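For intuition about "moving cargo", the one-dimensional case has a closed form that needs no solver: the cost equals the accumulated gap between the two cumulative distributions. The sketch below is that special case only, written by me; it is not how POT computes the general problem, and it assumes both histograms live on the same equally spaced grid and carry the same total mass.

```python
def wasserstein_1d(p, q, bin_width=1.0):
    """1-Wasserstein (earth mover's) distance between two histograms
    on the same equally spaced 1D grid, via the CDF-difference formula."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    cdf_gap = 0.0  # running difference between the two cumulative sums
    cost = 0.0
    for a, b in zip(p, q):
        cdf_gap += a - b
        # Mass equal to |cdf_gap| still has to cross into the next bin.
        cost += abs(cdf_gap) * bin_width
    return cost

print(wasserstein_1d([1, 0, 0], [0, 0, 1]))          # 2.0
print(wasserstein_1d([0.5, 0.5, 0], [0, 0.5, 0.5]))  # 1.0
```

In the first example all the mass moves two bins, so the cost is 2. Note that KL divergence would be infinite or undefined for that pair because the supports do not overlap, while the transport cost stays finite and meaningful.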
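Kernels (reproducing kernel Hilbert space)
A fairly traditional approach to nonlinear problems. Leaving out the formulas and explaining in words only, it goes like this:
- We want to combine the linear features to build nonlinear features.
- There are countless such combinations (for example, products and powers of the original features, ...)
- The more of them you use, the more nonlinearity you capture, but the number of combinations explodes
- Using a suitable kernel function (the Gaussian kernel, for instance) is equivalent to using an infinite-dimensional feature vector*6
- Even so, the computational cost is not infinite; it depends on the number of data points*7
- Generalized linear models can be rewritten in terms of kernel functions, so nonlinearity can be learned in a form that depends on the number of data points

So the upshot is that you can learn in an infinite-dimensional space at a cost that scales with the number of data points. One caveat: as the number of input data points grows, the computation gets heavy (you have to invert a matrix of size number of data points × number of data points).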
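The equivalence between a kernel and an explicit feature vector is easiest to see on a finite example. Below is a minimal sketch of my own: a degree-2 polynomial kernel in two dimensions, whose feature map has only three components, plus a Gaussian (RBF) kernel whose Gram matrix is always n × n no matter how large the implicit feature space is. All function names and the `gamma` parameter are illustrative assumptions.

```python
import math

def poly2_kernel(x, y):
    """k(x, y) = (x . y)^2 for 2D inputs."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def poly2_features(x):
    """Explicit feature map whose dot product reproduces poly2_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian kernel; its implicit feature space is infinite-dimensional."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)

def gram_matrix(points, kernel):
    """n x n matrix of pairwise kernel values; this is all a kernel method needs."""
    return [[kernel(a, b) for b in points] for a in points]

x, y = (1.0, 2.0), (3.0, 4.0)
explicit = sum(a * b for a, b in zip(poly2_features(x), poly2_features(y)))
print(poly2_kernel(x, y), explicit)  # both are 121 up to rounding

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
K = gram_matrix(points, rbf_kernel)
print(len(K), len(K[0]))  # 3 3
```

The kernel never builds the feature vector, yet returns the same number as the explicit dot product. The price is the Gram matrix, which grows with the square of the number of data points.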
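Below is an explanation that is easy to follow and also lays out the formulas properly.
- Kernel methods and support vector machines: what you were afraid to ask at this point
- http://enakai00.hatenablog.com/entry/2017/10/13/145337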
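Topological Data Analysis (TDA)

An image of the birth and death of generators in persistent homology

As in the figure above, grow a circle around each point. Plotting the birth time and death time of each generator as a 2D scatter plot gives the persistence diagram.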
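The simplest generators to track are connected components (dimension 0): every point is born as its own component at scale 0, and a component dies when the growing circles make it merge with another one. The sketch below, which is my own and covers only this 0-dimensional case (no loops or voids), computes those birth/death pairs with a union-find. As an assumed convention it uses the pairwise distance itself as the scale at which two points connect, rather than the circle radius.

```python
import math
from itertools import combinations

def h0_persistence(points):
    """(birth, death) pairs for connected components as the scale grows."""
    n = len(points)
    if n == 0:
        return []
    parent = list(range(n))

    def find(i):
        # Follow parent links to the component representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    pairs = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj        # two components merge: one of them dies at scale d
            pairs.append((0.0, d))
    pairs.append((0.0, math.inf))  # the last surviving component never dies
    return pairs

print(h0_persistence([(0, 0), (1, 0), (10, 0)]))
# [(0.0, 1.0), (0.0, 9.0), (0.0, inf)]
```

Plotting these pairs with birth on one axis and death on the other gives the 0-dimensional part of the diagram. The pair that dies at 9.0 sits far from the diagonal, signalling that the far-away point forms a genuinely separate cluster.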
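Below are easy-to-follow explanations and papers
- Topological Data Analysis and 3D Image Recognition - ニンゲンメモ
- Statistical Topological Data Analysis - A Kernel Perspective
- A Machine Learning Approach to Data Science (Institute of Statistical Mathematics: Kenji Fukumizu)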
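Dimensionality reduction / Embedding
When there are very many features, it is doubtful that every one of them matters as data. So we often extract only the important part of the features (Dimension Reduction), or embed the messy data we have into a space that suits it well (Embedding). Below I list the main / promising methods with short explanations and references (updated as needed).
PCA (principal component analysis)
The typical dimensionality reduction method. It computes axes along which as little information as possible is lost when the dimension is reduced.
Incidentally, it can be shown mathematically that an AutoEncoder can act as a straightforward extension of PCA (a linear autoencoder trained with squared error recovers the same subspace as PCA) *8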
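As a rough illustration of "finding the axis that keeps the most information", the sketch below computes the first principal axis by power iteration on the covariance matrix. It is my own minimal version in plain Python: real code would normally use an eigendecomposition or SVD from a numerical library, and this version assumes the data has nonzero variance and that the starting vector is not orthogonal to the leading axis.

```python
import math

def first_principal_axis(data, iters=100):
    """Return (mean, unit vector along the direction of largest variance)."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # arbitrary starting direction (an assumption of this sketch)
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def project(data, mean, axis):
    """1D coordinates of each point along the principal axis."""
    return [sum((row[j] - mean[j]) * axis[j] for j in range(len(axis))) for row in data]

data = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # points on the line y = 2x
mean, axis = first_principal_axis(data)
print(axis)                       # parallel to (1, 2), normalized to unit length
print(project(data, mean, axis))  # the 2D points reduced to one coordinate each
```

Because the sample points lie exactly on a line, a single coordinate along that line describes them without any loss, which is the ideal case for PCA.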
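t-SNE (t-Distributed Stochastic Neighbor Embedding)
Whereas PCA looks only at the points themselves, t-SNE looks at them in terms of distributions. It finds the embedding by minimizing the KL divergence between the probability distribution before compression and the one after compression.

Quoted from [How t-SNE works (ALBERT)](https://blog.albert2005.co.jp/2015/12/02/tsne/)
- t-SNE (Slideshare)
- LLE and a little T-SNE
- Shows the performance difference from PCA: TSNE vs PCA | Kaggle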
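The quantity being minimized can be written down directly. The sketch below evaluates a simplified version of the t-SNE objective for a given embedding: Gaussian similarities in the original space, Student-t similarities in the embedded space, and the KL divergence between the two. It is my own simplification; in particular it uses one fixed `sigma` for all points instead of the per-point, perplexity-tuned bandwidths of the actual algorithm, and it only scores an embedding rather than optimizing it.

```python
import math

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def joint_probs(points, weight):
    """Pairwise similarities normalized to sum to 1 over all ordered pairs i != j."""
    n = len(points)
    w = [[weight(sq_dist(points[i], points[j])) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    total = sum(sum(row) for row in w)
    return [[v / total for v in row] for row in w]

def tsne_kl(high, low, sigma=1.0):
    """KL(P || Q): P from a Gaussian in the original space, Q from a Student-t in the embedding."""
    p = joint_probs(high, lambda d2: math.exp(-d2 / (2 * sigma ** 2)))
    q = joint_probs(low, lambda d2: 1.0 / (1.0 + d2))
    return sum(pij * math.log(pij / qij)
               for prow, qrow in zip(p, q)
               for pij, qij in zip(prow, qrow) if pij > 0)

high = [(0, 0), (0, 1), (10, 0), (10, 1)]  # two tight pairs, far apart
good = [(0,), (1,), (10,), (11,)]          # keeps each pair together
bad = [(0,), (10,), (1,), (11,)]           # splits each pair
print(tsne_kl(high, good) < tsne_kl(high, bad))  # True
```

An embedding that keeps neighbors together scores a much lower divergence than one that tears them apart, and gradient descent on the low-dimensional coordinates is what drives the real algorithm toward the former.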
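Word2Vec
Using CBoW (Continuous Bag-of-Words), training proceeds so as to maximize the conditional probability that the target word appears given the words in its context.
Doing so embeds words in a coordinate space. This opened up the possibility of adding and subtracting words.

The male-female relationship is represented by vectors with similar values (image quoted from the reference article)
These days there is not only word2vec but also doc2vec and others, and the range of possible applications keeps growing.
- Word2Vec: the astonishing power of word vectors that surprised even their inventor
- gensim: word2vec in Python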
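What "adding and subtracting words" means mechanically can be shown with hand-made vectors. Everything in the sketch below is a toy of my own: the five two-dimensional vectors are invented so that one axis loosely stands for "royalty" and the other for "gender", and they are not the output of any trained model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Hypothetical hand-made embeddings: (royalty, gender). Not learned from data.
toy_vectors = {
    "king": (1.0, 1.0),
    "queen": (1.0, -1.0),
    "man": (0.0, 1.0),
    "woman": (0.0, -1.0),
    "apple": (-1.0, 0.0),
}

def analogy(a, b, c, vectors):
    """Word closest (by cosine) to vectors[a] - vectors[b] + vectors[c], excluding a, b, c."""
    target = tuple(x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c]))
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

print(analogy("king", "man", "woman", toy_vectors))  # queen
```

With vectors learned by word2vec the same nearest-neighbor lookup is used, only in a space of a few hundred dimensions where the "gender" direction emerges from the training data instead of being written in by hand.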
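Poincare Embeddings
Roughly speaking, a version of Word2Vec with an improved space. It embeds the data in hyperbolic space, and it is attracting attention for reasons such as the following:
- Hierarchical / tree structures can be embedded naturally in the space, which makes it a good fit for data with that kind of structure.
- Because the spatial representation is rich, the number of dimensions can be reduced
- It can be used by adding only a small correction to the space, so it is easy to extend

Hyperbolic space is a curved space like the one in the figure below. It gets denser the farther out you go (= there is more room out there).

[Fig] Hyperbolic space. The distance scale changes depending on location
In practice, words often form hierarchies, and as in the results in the figure below, words can be placed efficiently in the space.

[Fig] Experimental results from the paper. The hierarchical relations among neighboring words show up properly
- Embedding into another space! New developments in representation learning opened up by Poincare Embeddings
- Using Poincare embeddings in Python (gensim)
- Original paper: Poincaré Embeddings for Learning Hierarchical Representations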
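The statement that the distance scale changes with location can be checked numerically using the distance of the Poincaré ball model, $d(u, v) = \operatorname{arcosh}\left(1 + \frac{2 \lVert u - v \rVert^2}{(1 - \lVert u \rVert^2)(1 - \lVert v \rVert^2)}\right)$. The sketch below is a minimal plain-Python version of that formula; the function name is mine, and it assumes both points lie strictly inside the unit ball.

```python
import math

def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the unit ball."""
    sq_norm = lambda w: sum(x * x for x in w)
    diff = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1 - sq_norm(u)) * (1 - sq_norm(v))
    return math.acosh(1 + 2 * diff / denom)

# Two pairs with the same Euclidean gap, one pair near the center and one near the rim.
near_center = poincare_distance((0.0, 0.0), (0.1, 0.0))
near_edge = poincare_distance((0.8, 0.0), (0.9, 0.0))
print(near_center, near_edge)  # the second value is clearly larger
```

The same step of 0.1 is several times longer near the rim than at the center. That extra room toward the boundary is what lets a tree, whose number of nodes grows exponentially with depth, fit comfortably in few dimensions.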
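Wasserstein Embeddings
A paper I read the other day embeds data into a space whose distances approximate those of Wasserstein space.
LEARNING WASSERSTEIN EMBEDDINGS
In summary:
- Wasserstein is an excellent metric. It expresses the difference between distributions well
- But the computational cost is brutal: on the order of the cube of the dimension
- So let's use deep learning to build a function that approximates it well
- A Siamese network looks effective for the approximation, so train one to build the function
- The approximation worked well. Theoretical justification remains an open issue.

What the paper uses is a Siamese network, which is often used for metric learning.

Two networks are prepared, and the parameters are updated so that
- the squared error between the vectors produced by the network gets close to the Wasserstein distance, and
- the error between the input and its reconstruction (the reconstruction error) becomes small.

Going forward, in cases where computational cost is the bottleneck even in a rich space, approaches like this one that approximate the metric of the space with a neural network may become mainstream*9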
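The two-part objective described above can be sketched as a plain function, without committing to any particular network. Everything here is an assumption of mine for illustration: `encode` and `decode` are stand-ins for the trained networks, the target is a precomputed squared Wasserstein distance supplied by the caller, the reconstruction error is taken to be squared error, and `recon_weight` is a hypothetical balancing parameter. It should not be read as the paper's exact loss.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def siamese_embedding_loss(encode, decode, x1, x2, wasserstein_sq, recon_weight=1.0):
    """Loss for one pair: match embedded distance to the target, and keep inputs reconstructible."""
    z1, z2 = encode(x1), encode(x2)  # the same encoder is applied to both inputs
    metric_term = (squared_distance(z1, z2) - wasserstein_sq) ** 2
    recon_term = squared_distance(decode(z1), x1) + squared_distance(decode(z2), x2)
    return metric_term + recon_weight * recon_term

identity = lambda v: list(v)  # trivial stand-in for a network
x1, x2 = [0.0, 0.0], [3.0, 4.0]
print(siamese_embedding_loss(identity, identity, x1, x2, wasserstein_sq=25.0))  # 0.0
print(siamese_embedding_loss(identity, identity, x1, x2, wasserstein_sq=20.0))  # 25.0
```

Once such a loss has been minimized over many pairs, the expensive transport problem is replaced at query time by one encoder pass per input and a cheap Euclidean distance.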
Summary
Today's content turned out to be quite fundamental. Personally, I suspect that new algorithms at this level are going to appear from here on. Changes at the foundations often trigger drastic transformations, so I want to keep following this area.
Only two days left at last. Tomorrow I plan to pull together everything about data analysis / machine learning I have written on this site so far. Don't miss it! See you.
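*1: Spaces, metrics, measure theory, and so on...
*3: Manhattan distance
*4: Omitted because it would get long. See here for details
*5: This is what it is called in physics terminology
*7: Known as the representer theorem
*8: See, for example, Deep Learning (Machine Learning Professional Series)
*9: ...or so I am personally watching with interest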