[DL Paper Reading Group] Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets
Introduction

This article uses Python 2.7, numpy 1.11, scipy 0.17, scikit-learn 0.18, matplotlib 1.5, seaborn 0.7, and pandas 0.17. Everything has been tested in a Jupyter notebook (adjust %matplotlib inline as appropriate for your environment). The article draws on scikit-learn's Manifold learning documentation.

Using scikit-learn's digits sample dataset, we explain the family of techniques known as manifold learning. t-SNE in particular sees occasional use on Kaggle and is well suited to visualizing high-dimensional data. It is useful beyond visualization: concatenating the compressed data with the original data can also improve the accuracy of a simple classification problem.

Table of contents

Generating the data
Dimensionality reduction focusing on linear components
Random Proj
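The two uses of t-SNE described above (visualization, and concatenating the embedding onto the original features for classification) can be sketched as follows. This is a minimal illustration, not the article's exact pipeline; the classifier choice and cross-validation setup here are assumptions.

```python
# Sketch: embed the digits dataset with t-SNE, then append the 2-D
# embedding to the original 64-D features for a simple classifier.
# (Illustrative only -- the article's exact pipeline may differ, and
# embedding the full dataset before cross-validation leaks information
# across folds; this is just a quick demonstration of the idea.)
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.manifold import TSNE
from sklearn.model_selection import cross_val_score

digits = load_digits()
X, y = digits.data, digits.target          # X has shape (1797, 64)

# Compress to 2 dimensions for visualization.
emb = TSNE(n_components=2, random_state=0).fit_transform(X)

# Concatenate the original features with the embedding.
X_aug = np.hstack([X, emb])                # shape (1797, 66)

clf = LogisticRegression(max_iter=1000)
base = cross_val_score(clf, X, y, cv=3).mean()
aug = cross_val_score(clf, X_aug, y, cv=3).mean()
print("baseline: %.3f  augmented: %.3f" % (base, aug))
```

For a proper evaluation, the embedding would have to be fit on the training folds only; t-SNE has no `transform` for new points, which is one reason it is mainly a visualization tool.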
t-Distributed Stochastic Neighbor Embedding (t-SNE) is a technique for dimensionality reduction that is particularly well suited for the visualization of high-dimensional datasets. The technique can be implemented via Barnes-Hut approximations, allowing it to be applied on large real-world datasets. We applied it on data sets with up to 30 million examples. The technique and its variants are introduced in the original papers.
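In scikit-learn, the Barnes-Hut approximation mentioned above is exposed through the `method` parameter of `TSNE` (`'barnes_hut'` runs in roughly O(N log N) but supports only 2-D or 3-D embeddings; `'exact'` is O(N²)), with `angle` trading speed against accuracy. A small sketch on random toy data, assuming a reasonably recent scikit-learn:

```python
# Sketch: Barnes-Hut t-SNE vs. exact t-SNE in scikit-learn.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.RandomState(0)
X = rng.randn(500, 50)                     # toy high-dimensional data

# Approximate O(N log N) variant; angle controls the approximation.
Y_bh = TSNE(n_components=2, method='barnes_hut', angle=0.5,
            random_state=0).fit_transform(X)

# Exact O(N^2) variant, feasible only for small N.
Y_exact = TSNE(n_components=2, method='exact',
               random_state=0).fit_transform(X)

print(Y_bh.shape, Y_exact.shape)           # both (500, 2)
```

For datasets in the millions of examples, only the Barnes-Hut variant is practical.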
(Notes, e.g. [a], appear below.)

Input  X: D-by-N matrix consisting of N data items in D dimensions.
Output Y: d-by-N matrix consisting of d < D dimensional embedding coordinates for the input points.

Find neighbours in X space [b,c].
  for i = 1:N
    compute the distance from Xi to every other point Xj
    find the K smallest distances
    assign the corresponding points to be neighbours of Xi
  end

Solve for reconstruction weights W.
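The neighbour-search step of the pseudocode above can be written directly in NumPy. This is a sketch under the stated conventions (columns of X are the data points); the function name and test data are illustrative, not from the original notes.

```python
# Sketch of the neighbour-search loop: for each column Xi of the
# D-by-N input, find the indices of its K nearest neighbours by
# Euclidean distance, excluding the point itself.
import numpy as np

def find_neighbours(X, K):
    """X: D-by-N data matrix. Returns an (N, K) array of neighbour indices."""
    D, N = X.shape
    neighbours = np.empty((N, K), dtype=int)
    for i in range(N):
        # distance from Xi to every other point Xj
        dist = np.linalg.norm(X - X[:, [i]], axis=0)
        dist[i] = np.inf                   # a point is not its own neighbour
        # indices of the K smallest distances
        neighbours[i] = np.argsort(dist)[:K]
    return neighbours

rng = np.random.RandomState(0)
X = rng.randn(3, 20)                       # D = 3, N = 20
print(find_neighbours(X, K=5).shape)       # (20, 5)
```

The subsequent step, solving for the reconstruction weights W, is a constrained least-squares problem per point and is what distinguishes Locally Linear Embedding from a plain nearest-neighbour search.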