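The Gemini 2.0 Flash preview is out. Reading through the API documentation, the feature that caught my interest was bounding box detection: you can get the coordinates of a target object from an image, which looks like it could enable some fun things. You can try it in Google AI Studio.
Looking at the prompt that Google AI Studio sends, it is: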
Detect 料理, with no more than 20 items. Output a json list where each entry contains the 2D bounding box in "box_2d" and a text label in "label".
So the JSON format is specified in natural language. If you are used to OpenAI's structured_output, that is a little surprising. What the response actually looks like is covered later.
This spatial understanding and bounding box detection is also available through the existing SDK for JavaScript, which makes it easy to try out. The new SDK does not seem to be offered for JavaScript yet, so the existing one is what I used:
const model = vertexai.getGenerativeModel({ model: "gemini-2.0-flash-exp", });
Pleasantly, it seems to work just by specifying the new model name.
After that, you simply call `generateContent` as usual to generate text.
As mentioned above, including a request for bounding box detection in the prompt gets you back something JSON-like in a fixed format.
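As a rough sketch of what that call could look like, the snippet below builds a request that pairs an image with a detection prompt. The helper name `buildDetectionRequest` and its parameters are hypothetical, and the request shape (`contents`, `parts`, `inlineData`) is my assumption about what the Vertex AI SDK for JavaScript accepts, so check it against the SDK version you use.

```javascript
// Hypothetical helper: builds a generateContent request with one image and one prompt.
// Assumption: the SDK accepts { contents: [{ role, parts }] } with inlineData image parts.
function buildDetectionRequest(prompt, imageBase64, mimeType = "image/jpeg") {
  return {
    contents: [
      {
        role: "user",
        parts: [
          // The image is passed as base64-encoded bytes.
          { inlineData: { mimeType, data: imageBase64 } },
          // The detection instruction is plain natural language.
          { text: prompt },
        ],
      },
    ],
  };
}

// Usage sketch (needs a configured `model` and credentials, so it stays commented out):
// const result = await model.generateContent(buildDetectionRequest(prompt, base64Image));
// const text = result.response.candidates[0].content.parts[0].text;
```

Keeping the request construction in a pure function makes it easy to inspect what is sent without calling the API.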
Let's implement 2D first. I tweaked the Google AI Studio prompt slightly, like this:
`Detect ${target}, with no more than 20 items. Output a json list where each entry contains the 2D bounding box in "box_2d" and name in "label".`
Calling `generateContent` with this prompt generates text like the following.
```json
[
  {"box_2d": [13, 432, 457, 879], "label": "料理"},
  {"box_2d": [309, 0, 892, 509], "label": "料理"},
  {"box_2d": [535, 390, 892, 773], "label": "料理"}
]
```
Note that this is not pure JSON but Markdown containing JSON: the array comes wrapped in a `json` code fence. Extract just the JSON inside before working with it.
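One way to pull the JSON out is a small regular expression, sketched below. The function name `extractJson` is hypothetical, and the code assumes the response contains at most one fenced block; if no fence is found, it falls back to parsing the whole text.

```javascript
// Three backticks, built at runtime so this snippet never contains a literal fence.
const FENCE = "`".repeat(3);

// Extracts and parses the JSON payload from a model response that may be
// wrapped in a Markdown code fence (with or without a "json" language tag).
function extractJson(text) {
  const pattern = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE);
  const fenced = text.match(pattern);
  // Fall back to the raw text when the model returned bare JSON.
  const payload = fenced ? fenced[1] : text;
  return JSON.parse(payload.trim());
}
```

`JSON.parse` still throws on malformed output, so in practice you may want to wrap the call in a `try`/`catch`.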
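The meaning of the integer arrays is described in the documentation.
Taking the top-left corner of the image as (0, 0) and the bottom-right corner as (1000, 1000), the values are the coordinates of the box's top-left and bottom-right corners.
Note that the y coordinate comes first and the x coordinate second.
In other words, [13, 432, 457, 879] means top-left x=432, y=13 and bottom-right x=879, y=457.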
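To make the ordering concrete, here is a minimal sketch that converts a `box_2d` array into pixel coordinates. The function name `toPixelBox` and the `imageWidth`/`imageHeight` parameters are hypothetical; the only thing taken from the description above is the `[ymin, xmin, ymax, xmax]` order on a 0 to 1000 scale.

```javascript
// Converts a box_2d array [ymin, xmin, ymax, xmax] on a 0-1000 scale
// into pixel coordinates for an image of the given size.
function toPixelBox(box2d, imageWidth, imageHeight) {
  const [ymin, xmin, ymax, xmax] = box2d; // y comes first, then x
  return {
    left: (xmin * imageWidth) / 1000,
    top: (ymin * imageHeight) / 1000,
    right: (xmax * imageWidth) / 1000,
    bottom: (ymax * imageHeight) / 1000,
  };
}

// Example: the first box from the response on a 2000x1000 image.
console.log(toPixelBox([13, 432, 457, 879], 2000, 1000));
```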
Simply displaying the boxes in the browser is easy: position each one with CSS `position: absolute` and specify top, left, width, and height as percentages.
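A sketch of that percentage calculation follows. The function name `boxToStyle` is hypothetical, and it assumes the parent element is `position: relative` and exactly the size of the displayed image. The returned object can be passed as a `style` prop in React.

```javascript
// Turns a box_2d array [ymin, xmin, ymax, xmax] (0-1000 scale) into
// percentage-based CSS for an absolutely positioned overlay.
function boxToStyle(box2d) {
  const [ymin, xmin, ymax, xmax] = box2d;
  return {
    position: "absolute",
    top: `${ymin / 10}%`, // 1000 units = 100%, so divide by 10
    left: `${xmin / 10}%`,
    width: `${(xmax - xmin) / 10}%`,
    height: `${(ymax - ymin) / 10}%`,
  };
}
```

Because everything is expressed in percentages, the overlay keeps lining up with the image when the image is resized.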
3D is mostly the same. I tweaked the Google AI Studio prompt like this:
`Detect the 3D bounding boxes of ${target} , output no more than 10 items. Output a json list where each entry contains the object name in "label" and its 3D bounding box in "box_3d".`
The generated text looks like this.
```json
[
  {"label": "料理", "box_3d": [0.41,1.54,0.22,0.71,0.74,0.87,-34,-13,10]},
  {"label": "料理", "box_3d": [-0.31,1.55,-0.1,0.71,0.5,0.76,-34,-12,9]},
  {"label": "料理", "box_3d": [0.06,1.54,-0.4,0.46,0.36,0.37,-33,-2,2]}
]
```
The values mean the following:
- First three: x_center, y_center, z_center
- Middle three: x_size, y_size, z_size
- Last three: rotation around the x axis, y axis, and z axis (in degrees)
According to the documentation, x_center, x_size, and so on appear to be lengths in meters. So when drawing on screen, you probably need to zoom in or out to adjust how the boxes overlap the original image. I think.
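The layout above can be unpacked with a small helper like the following sketch. The name `parseBox3d` is hypothetical, and the conversion to radians is my addition, on the assumption that the rendering library expects radians.

```javascript
// Splits a 9-element box_3d array into center, size, and rotation.
// Rotation comes in as degrees and is converted to radians here.
function parseBox3d(box3d) {
  if (box3d.length !== 9) {
    throw new Error("box_3d must contain exactly 9 numbers");
  }
  const [cx, cy, cz, sx, sy, sz, rx, ry, rz] = box3d;
  const toRadians = (deg) => (deg * Math.PI) / 180;
  return {
    center: [cx, cy, cz], // x_center, y_center, z_center (apparently meters)
    size: [sx, sy, sz], // x_size, y_size, z_size (apparently meters)
    rotation: [rx, ry, rz].map(toRadians),
  };
}
```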
This time I used Three.js from React, via React Three Fiber.
Incidentally, React Three Fiber 8.17.10 did not work well with Next.js 15. With reference to a related issue, I switched to React Three Fiber 9.0.0-rc.1, and that worked.
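Another tricky point, and a source of confusion, is that the coordinate system of Three.js differs from the one Gemini returns. In Three.js, the vertical direction relative to the screen is the y axis and depth is the z axis. In Gemini, the vertical direction appears to be the z axis and depth the y axis.
Keeping that in mind, if you build the edges of a box with React Three Fiber, the 3D boxes render as expected.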
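A minimal sketch of that axis swap is below. The function name `geminiToThree` is hypothetical. Negating the depth value is my own assumption (Three.js cameras look down the negative z axis by default), not something I confirmed in the documentation, and rotations are deliberately left out.

```javascript
// Maps a Gemini-style center and size (z = vertical, y = depth) to a
// Three.js-style position and size (y = vertical, z = depth).
// Assumption: larger Gemini y means farther from the camera, so it maps to -z.
function geminiToThree(center, size) {
  const [cx, cy, cz] = center;
  const [sx, sy, sz] = size;
  return {
    position: [cx, cz, -cy], // swap y and z, flip the depth sign
    size: [sx, sz, sy], // sizes are lengths, so only the order changes
  };
}

// Example: the first detected box from the response above.
console.log(geminiToThree([0.41, 1.54, 0.22], [0.71, 0.74, 0.87]));
```

If the boxes end up mirrored or behind the camera, the sign on the depth axis is the first thing to revisit.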
Note that the 3D detection is still experimental, and its accuracy does not seem to be high.
By the way, and completely unrelated: the photos used here are from つじ田 (Tsujita), one of my favorite tsukemen shops.