
OpenCV Advent Calendar 2023

@Kazuhito(高橋かずひと)

Running OpenCV on Pydroid 3 to try out image processing and AI

Last updated at 2023-12-16 / Posted at 2023-12-16

This article is the day 17 entry of the OpenCV Advent Calendar 2023.

The content relies on a paid app, which I feel a little bad about...

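What is Pydroid 3

A Python 3 IDE for Android.
Several architecture-dependent packages, such as OpenCV and PyTorch, are compiled specifically for it and can be used.

Note, however, that it is a paid app.
As of August 2023, it was a one-time purchase priced at 1,850 yen.
Note: because of how the store works, checking the current price after you have bought the app is a hassle, so I have not looked it up 🙇

If you want to do something similar on an iPhone, Pythonista 3 should be able to do it?
I have not touched Pythonista in over a year and I am wary of saying something careless, so I will leave it at that 🦔

As described later, OpenCV, PyTorch, TensorFlow, and others are available, so it is also handy when you just want to quickly try out image processing or AI.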

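Features of Pydroid 3

These are the features as described by the developer.
The text in parentheses is my own loose reading, so feel free to call out any mistakes.

Loaded with modern educational libraries and assets (packed with the latest educational libraries and assets)
Interactive terminal mode for both casual and advanced usage (an interactive terminal usable for both casual and advanced work)
Multiple graphical interface libraries support (support for multiple graphical interface libraries)
Custom pip repository if bundled C compiler is not fast enough (a custom pip repository for packages that do not work well enough with the bundled C compiler)
Use your phone sensors with Kivy and Qt (phone sensors can be used from Kivy and Qt)

The fourth item, "Custom pip repository if bundled C compiler is not fast enough", probably refers to the packages built specifically for Pydroid 3 🤔
It is phrased positively, but...
put the other way around, for architecture-dependent packages you can only use the ones the developer provides.
Well, OpenCV is provided, so that is fine by me 👀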

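Using Pydroid 3 for image processing

Some packages, OpenCV among them, can be installed from the tab labeled "QUICK INSTALL".
(Packages with no architecture dependency can be installed from the "INSTALL" tab.)
Packages that look useful for image processing include the following.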

OpenCV
Pillow
NumPy
PyTorch
TensorFlow

ONNX Runtime was not available. Too bad.
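To confirm which of these packages are actually present after installation, something like the following can be run inside Pydroid 3. This is a minimal sketch using only the standard library; the import names in `CANDIDATES` (for example `cv2` for OpenCV and `PIL` for Pillow) and the helper name `check_packages` are my own choices for illustration, not part of Pydroid 3.

```python
from importlib.util import find_spec

# Import names (not pip names) for the packages listed above.
# "onnxruntime" is included only to confirm that it is missing.
CANDIDATES = ["cv2", "PIL", "numpy", "torch", "tensorflow", "onnxruntime"]


def check_packages(names):
    """Return {import_name: True/False} without importing the packages."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages(CANDIDATES).items():
        print(f"{name:12s} {'available' if found else 'missing'}")
```

Because `find_spec` only looks the module up, heavy packages such as PyTorch are not loaded just to check for them.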

Things to watch out for when using OpenCV on Pydroid 3

Many OpenCV features work on Pydroid 3, but cv2.imshow(), the usual way to display images with OpenCV, is not available.
Anything related to display therefore has to go through Tkinter or Qt.

cv2.imshow("Sample", debug_image)

Rewrite that call so the image is displayed through Tkinter, as follows.

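import cv2
import tkinter as tk
from PIL import ImageTk, Image

# Set up Tk
root = tk.Tk()
root.rowconfigure(0, weight=1)
root.columnconfigure(0, weight=1)
label = tk.Label(root)
label.grid()

# debug_image is the BGR image that was previously passed to cv2.imshow()
rgb_image = cv2.cvtColor(debug_image, cv2.COLOR_BGR2RGB)  # convert channels to RGB for Tk
tk_image = ImageTk.PhotoImage(image=Image.fromarray(rgb_image))
label.configure(image=tk_image)
label.update()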

Several OpenCV samples follow 🦔
I save each one as a .py file, copy it to the Android device over USB, and run it there.

Sample 1: Checking the version

In a sense, this is the Hello World 👀

sample_01.py

import cv2

print(cv2.__version__)

Running it prints the version (as of December 16, 2023).

It was 4.8.0.
So a reasonably recent version of OpenCV is available 👀
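The version string can also be checked in code, for example before relying on an API that only exists in newer releases. The sketch below is an illustration only: `version_tuple` and `has_nms_boxes_batched` are hypothetical helper names, and the 4.7.0 threshold is taken from the comment on cv2.dnn.NMSBoxesBatched in Sample 7.

```python
def version_tuple(version):
    """Turn '4.8.0' (or '4.8.0-dev') into (4, 8, 0) for comparison."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def has_nms_boxes_batched(version):
    # Sample 7 notes that cv2.dnn.NMSBoxesBatched needs OpenCV 4.7.0 or later.
    return version_tuple(version) >= (4, 7, 0)
```

In a script this would be called as `has_nms_boxes_batched(cv2.__version__)`. Comparing tuples avoids the trap of comparing strings, where "4.10.0" would sort before "4.7.0".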

Sample 2: Loading an image

A sample that loads an image and displays it.
Because it uses Tk, the code that follows the image loading gets a bit long 👀

sample_02.py

import cv2
import tkinter as tk
from PIL import ImageTk, Image

# Load the image
image = cv2.imread('sample.jpg')
if image is None:
    # cv2.imread returns None instead of raising when the file cannot be read
    raise FileNotFoundError('sample.jpg could not be read')
# Convert channels to RGB for Tk
rgb_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Set up Tk
root = tk.Tk()
root.rowconfigure(0, weight=1)
root.columnconfigure(0, weight=1)
label = tk.Label(root)
label.grid()

# Correct the display orientation
label_width = label.winfo_screenwidth()
label_height = label.winfo_screenheight()
image_height, image_width = rgb_image.shape[:2]
if (label_width > label_height) != (image_width > image_height):
    image_width, image_height = image_height, image_width
    rgb_image = cv2.rotate(rgb_image, cv2.ROTATE_90_COUNTERCLOCKWISE)

# Resize so the image fits within the whole screen
label_width = int(min(image_width * label_height / image_height, label_width))
label_height = int(min(image_height * label_width / image_width, label_height))
rgb_image = cv2.resize(
    rgb_image,
    (label_width, label_height),
    interpolation=cv2.INTER_LINEAR,
)

# Update the Tk label
tk_image = ImageTk.PhotoImage(image=Image.fromarray(rgb_image))
label.configure(image=tk_image)
label.update()

# Start the Tk loop
root.mainloop()

Running it displays the image on the screen.
The image was landscape, so it is shown rotated 🦔
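The orientation correction and the fit-to-screen resize are repeated in every sample that follows, so it may help to look at the arithmetic on its own. The function below is a sketch of the same logic with plain integers; the name `fit_to_screen` is hypothetical and does not appear in the samples.

```python
def fit_to_screen(image_width, image_height, screen_width, screen_height):
    """Return (rotate, width, height) for showing an image on a screen.

    rotate is True when the image and the screen have different
    orientations (landscape vs. portrait) and the image should be
    turned by 90 degrees first.
    """
    rotate = (screen_width > screen_height) != (image_width > image_height)
    if rotate:
        image_width, image_height = image_height, image_width

    # Scale to the largest size that fits while keeping the aspect ratio.
    width = int(min(image_width * screen_height / image_height, screen_width))
    height = int(min(image_height * width / image_width, screen_height))
    return rotate, width, height
```

For a 1920x1080 landscape image on a 1080x2340 portrait screen, `fit_to_screen(1920, 1080, 1080, 2340)` gives `(True, 1080, 1920)`: the image is rotated and then limited by the screen width.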
　

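Sample 3: Reading from the camera

A sample that reads from the phone camera and displays the frames.
On most phones, passing 0 as the device number to VideoCapture() probably selects the rear camera and 1 the front camera.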
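Since the mapping of device numbers to cameras is only a guess, it can be checked on the device. The sketch below tries a few indices and reports which ones deliver a frame. `find_cameras` is a hypothetical helper; it takes the capture class as an argument so that it can be exercised without a camera, and the upper bound of 4 indices is an arbitrary assumption.

```python
def find_cameras(open_capture, indices=range(4)):
    """Return the device indices that open and deliver at least one frame.

    open_capture is any callable that behaves like cv2.VideoCapture:
    the returned object needs isOpened(), read() and release().
    """
    working = []
    for index in indices:
        cap = open_capture(index)
        try:
            if cap.isOpened():
                ok, _frame = cap.read()
                if ok:
                    working.append(index)
        finally:
            # Always free the device, even if read() raised
            cap.release()
    return working
```

On the phone this would be called as `find_cameras(cv2.VideoCapture)` after `import cv2`.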

sample_03.py

import cv2
import tkinter as tk
from PIL import ImageTk, Image

# Set up Tk
root = tk.Tk()
root.rowconfigure(0, weight=1)
root.columnconfigure(0, weight=1)
label = tk.Label(root)
label.grid()

# Open the camera
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1920)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 1080)


def label_loop():
    global label, cap

    # Read a camera frame
    ret, frame = cap.read()
    if not ret:
        label.after(0, label_loop)
        return
    rgb_image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Correct the display orientation
    label_width = label.winfo_screenwidth()
    label_height = label.winfo_screenheight()
    image_height, image_width = rgb_image.shape[:2]
    if (label_width > label_height) != (image_width > image_height):
        image_width, image_height = image_height, image_width
        rgb_image = cv2.rotate(rgb_image, cv2.ROTATE_90_COUNTERCLOCKWISE)

    # Resize so the image fits within the whole screen
    label_width = int(
        min(image_width * label_height / image_height, label_width))
    label_height = int(
        min(image_height * label_width / image_width, label_height))
    rgb_image = cv2.resize(
        rgb_image,
        (label_width, label_height),
        interpolation=cv2.INTER_LINEAR,
    )

    # Update the Tk label
    tk_image = ImageTk.PhotoImage(image=Image.fromarray(rgb_image))
    label.configure(image=tk_image)
    label.image = tk_image  # keep a reference so the image is not garbage collected
    label.update()
    label.after(0, label_loop)


label_loop()
root.mainloop()

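Sample 4: ASCII camera (official sample)

This is an official sample, but I found it fun, so I am including it here with comments added.
It converts the camera image to grayscale, quantizes it to N levels, and displays it as ASCII characters.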

sample_04.py

import curses

import cv2
import numpy as np
from PIL import Image

chars = np.asarray(list(' .,:;irsXA253hMHGS#9B&@'))
scale = 0.15
scale_x = 1.75

# Open the camera
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 320)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 240)

# Initialize curses
scr = curses.initscr()

try:
    while True:
        # Read a camera frame
        ret, frame = cap.read()
        if not ret:
            continue

        # Convert to grayscale and quantize to N levels
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        image = Image.fromarray(frame)
        S = (round(image.size[0] * scale * scale_x),
             round(image.size[1] * scale))
        image = np.asarray(image.resize(S))
        image = (image / 255.) * (chars.size - 1)

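        # Draw to the console
        for i, r in enumerate(chars[image.astype(int)]):
            try:
                scr.addstr(i, 0, "".join(r))
            except curses.error:
                # Rows that do not fit in the terminal are skipped
                pass
        scr.refresh()
finally:
    # endwin() is a module-level function, not a method of the window object
    curses.endwin()
    cap.release()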
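The core of this sample is the mapping from a gray level to a character. The sketch below isolates that step in plain Python, without a camera or NumPy, using the same character ramp. `to_ascii_rows` is a hypothetical helper name, and the tiny hand-made image is only for illustration.

```python
CHARS = " .,:;irsXA253hMHGS#9B&@"  # same sparse-to-dense ramp as the sample


def to_ascii_rows(gray_rows, chars=CHARS):
    """Map rows of 0-255 gray values to strings of ramp characters."""
    top = len(chars) - 1
    return [
        "".join(chars[int(value / 255.0 * top)] for value in row)
        for row in gray_rows
    ]


if __name__ == "__main__":
    # A tiny hand-made 2x4 "image": a horizontal gradient, twice.
    image = [[0, 85, 170, 255], [0, 85, 170, 255]]
    print("\n".join(to_ascii_rows(image)))
```

A gray value of 0 maps to a space and 255 maps to `@`, so brighter pixels are drawn with denser characters.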

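Sample 5: Classification with PyTorch MobileNet v2

This is also an official sample, but I fixed a few bits of style I did not quite like and added comments.

Source code
Note: the label definitions are also written in this single file, so it is long.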

sample_05.py

import copy
import threading
import tkinter as tk

import cv2
import numpy as np
from PIL import Image, ImageTk

import torch
from torchvision import transforms
from torchvision.models.quantization import mobilenet

# ImageNet labels
imagenet_labels = [
    "tench", "goldfish", "great white shark", "tiger shark",
    "hammerhead shark", "electric ray", "stingray", "cock", "hen", "ostrich",
    "brambling", "goldfinch", "house finch", "junco", "indigo bunting",
    "American robin", "bulbul", "jay", "magpie", "chickadee",
    "American dipper", "kite", "bald eagle", "vulture", "great grey owl",
    "fire salamander", "smooth newt", "newt", "spotted salamander", "axolotl",
    "American bullfrog", "tree frog", "tailed frog", "loggerhead sea turtle",
    "leatherback sea turtle", "mud turtle", "terrapin", "box turtle",
    "banded gecko", "green iguana", "Carolina anole",
    "desert grassland whiptail lizard", "agama", "frilled-necked lizard",
    "alligator lizard", "Gila monster", "European green lizard", "chameleon",
    "Komodo dragon", "Nile crocodile", "American alligator", "triceratops",
    "worm snake", "ring-necked snake", "eastern hog-nosed snake",
    "smooth green snake", "kingsnake", "garter snake", "water snake",
    "vine snake", "night snake", "boa constrictor", "African rock python",
    "Indian cobra", "green mamba", "sea snake", "Saharan horned viper",
    "eastern diamondback rattlesnake", "sidewinder", "trilobite", "harvestman",
    "scorpion", "yellow garden spider", "barn spider",
    "European garden spider", "southern black widow", "tarantula",
    "wolf spider", "tick", "centipede", "black grouse", "ptarmigan",
    "ruffed grouse", "prairie grouse", "peacock", "quail", "partridge",
    "grey parrot", "macaw", "sulphur-crested cockatoo", "lorikeet", "coucal",
    "bee eater", "hornbill", "hummingbird", "jacamar", "toucan", "duck",
    "red-breasted merganser", "goose", "black swan", "tusker", "echidna",
    "platypus", "wallaby", "koala", "wombat", "jellyfish", "sea anemone",
    "brain coral", "flatworm", "nematode", "conch", "snail", "slug",
    "sea slug", "chiton", "chambered nautilus", "Dungeness crab", "rock crab",
    "fiddler crab", "red king crab", "American lobster", "spiny lobster",
    "crayfish", "hermit crab", "isopod", "white stork", "black stork",
    "spoonbill", "flamingo", "little blue heron", "great egret", "bittern",
    "crane (bird)", "limpkin", "common gallinule", "American coot", "bustard",
    "ruddy turnstone", "dunlin", "common redshank", "dowitcher",
    "oystercatcher", "pelican", "king penguin", "albatross", "grey whale",
    "killer whale", "dugong", "sea lion", "Chihuahua", "Japanese Chin",
    "Maltese", "Pekingese", "Shih Tzu", "King Charles Spaniel", "Papillon",
    "toy terrier", "Rhodesian Ridgeback", "Afghan Hound", "Basset Hound",
    "Beagle", "Bloodhound", "Bluetick Coonhound", "Black and Tan Coonhound",
    "Treeing Walker Coonhound", "English foxhound", "Redbone Coonhound",
    "borzoi", "Irish Wolfhound", "Italian Greyhound", "Whippet",
    "Ibizan Hound", "Norwegian Elkhound", "Otterhound", "Saluki",
    "Scottish Deerhound", "Weimaraner", "Staffordshire Bull Terrier",
    "American Staffordshire Terrier", "Bedlington Terrier", "Border Terrier",
    "Kerry Blue Terrier", "Irish Terrier", "Norfolk Terrier",
    "Norwich Terrier", "Yorkshire Terrier", "Wire Fox Terrier",
    "Lakeland Terrier", "Sealyham Terrier", "Airedale Terrier",
    "Cairn Terrier", "Australian Terrier", "Dandie Dinmont Terrier",
    "Boston Terrier", "Miniature Schnauzer", "Giant Schnauzer",
    "Standard Schnauzer", "Scottish Terrier", "Tibetan Terrier",
    "Australian Silky Terrier", "Soft-coated Wheaten Terrier",
    "West Highland White Terrier", "Lhasa Apso", "Flat-Coated Retriever",
    "Curly-coated Retriever", "Golden Retriever", "Labrador Retriever",
    "Chesapeake Bay Retriever", "German Shorthaired Pointer", "Vizsla",
    "English Setter", "Irish Setter", "Gordon Setter", "Brittany",
    "Clumber Spaniel", "English Springer Spaniel", "Welsh Springer Spaniel",
    "Cocker Spaniels", "Sussex Spaniel", "Irish Water Spaniel", "Kuvasz",
    "Schipperke", "Groenendael", "Malinois", "Briard", "Australian Kelpie",
    "Komondor", "Old English Sheepdog", "Shetland Sheepdog", "collie",
    "Border Collie", "Bouvier des Flandres", "Rottweiler",
    "German Shepherd Dog", "Dobermann", "Miniature Pinscher",
    "Greater Swiss Mountain Dog", "Bernese Mountain Dog",
    "Appenzeller Sennenhund", "Entlebucher Sennenhund", "Boxer", "Bullmastiff",
    "Tibetan Mastiff", "French Bulldog", "Great Dane", "St. Bernard", "husky",
    "Alaskan Malamute", "Siberian Husky", "Dalmatian", "Affenpinscher",
    "Basenji", "pug", "Leonberger", "Newfoundland", "Pyrenean Mountain Dog",
    "Samoyed", "Pomeranian", "Chow Chow", "Keeshond", "Griffon Bruxellois",
    "Pembroke Welsh Corgi", "Cardigan Welsh Corgi", "Toy Poodle",
    "Miniature Poodle", "Standard Poodle", "Mexican hairless dog", "grey wolf",
    "Alaskan tundra wolf", "red wolf", "coyote", "dingo", "dhole",
    "African wild dog", "hyena", "red fox", "kit fox", "Arctic fox",
    "grey fox", "tabby cat", "tiger cat", "Persian cat", "Siamese cat",
    "Egyptian Mau", "cougar", "lynx", "leopard", "snow leopard", "jaguar",
    "lion", "tiger", "cheetah", "brown bear", "American black bear",
    "polar bear", "sloth bear", "mongoose", "meerkat", "tiger beetle",
    "ladybug", "ground beetle", "longhorn beetle", "leaf beetle",
    "dung beetle", "rhinoceros beetle", "weevil", "fly", "bee", "ant",
    "grasshopper", "cricket", "stick insect", "cockroach", "mantis", "cicada",
    "leafhopper", "lacewing", "dragonfly", "damselfly", "red admiral",
    "ringlet", "monarch butterfly", "small white", "sulphur butterfly",
    "gossamer-winged butterfly", "starfish", "sea urchin", "sea cucumber",
    "cottontail rabbit", "hare", "Angora rabbit", "hamster", "porcupine",
    "fox squirrel", "marmot", "beaver", "guinea pig", "common sorrel", "zebra",
    "pig", "wild boar", "warthog", "hippopotamus", "ox", "water buffalo",
    "bison", "ram", "bighorn sheep", "Alpine ibex", "hartebeest", "impala",
    "gazelle", "dromedary", "llama", "weasel", "mink", "European polecat",
    "black-footed ferret", "otter", "skunk", "badger", "armadillo",
    "three-toed sloth", "orangutan", "gorilla", "chimpanzee", "gibbon",
    "siamang", "guenon", "patas monkey", "baboon", "macaque", "langur",
    "black-and-white colobus", "proboscis monkey", "marmoset",
    "white-headed capuchin", "howler monkey", "titi",
    "Geoffroy's spider monkey", "common squirrel monkey", "ring-tailed lemur",
    "indri", "Asian elephant", "African bush elephant", "red panda",
    "giant panda", "snoek", "eel", "coho salmon", "rock beauty", "clownfish",
    "sturgeon", "garfish", "lionfish", "pufferfish", "abacus", "abaya",
    "academic gown", "accordion", "acoustic guitar", "aircraft carrier",
    "airliner", "airship", "altar", "ambulance", "amphibious vehicle",
    "analog clock", "apiary", "apron", "waste container", "assault rifle",
    "backpack", "bakery", "balance beam", "balloon", "ballpoint pen",
    "Band-Aid", "banjo", "baluster", "barbell", "barber chair", "barbershop",
    "barn", "barometer", "barrel", "wheelbarrow", "baseball", "basketball",
    "bassinet", "bassoon", "swimming cap", "bath towel", "bathtub",
    "station wagon", "lighthouse", "beaker", "military cap", "beer bottle",
    "beer glass", "bell-cot", "bib", "tandem bicycle", "bikini", "ring binder",
    "binoculars", "birdhouse", "boathouse", "bobsleigh", "bolo tie",
    "poke bonnet", "bookcase", "bookstore", "bottle cap", "bow", "bow tie",
    "brass", "bra", "breakwater", "breastplate", "broom", "bucket", "buckle",
    "bulletproof vest", "high-speed train", "butcher shop", "taxicab",
    "cauldron", "candle", "cannon", "canoe", "can opener", "cardigan",
    "car mirror", "carousel", "tool kit", "carton", "car wheel",
    "automated teller machine", "cassette", "cassette player", "castle",
    "catamaran", "CD player", "cello", "mobile phone", "chain",
    "chain-link fence", "chain mail", "chainsaw", "chest", "chiffonier",
    "chime", "china cabinet", "Christmas stocking", "church", "movie theater",
    "cleaver", "cliff dwelling", "cloak", "clogs", "cocktail shaker",
    "coffee mug", "coffeemaker", "coil", "combination lock",
    "computer keyboard", "confectionery store", "container ship",
    "convertible", "corkscrew", "cornet", "cowboy boot", "cowboy hat",
    "cradle", "crane (machine)", "crash helmet", "crate", "infant bed",
    "Crock Pot", "croquet ball", "crutch", "cuirass", "dam", "desk",
    "desktop computer", "rotary dial telephone", "diaper", "digital clock",
    "digital watch", "dining table", "dishcloth", "dishwasher", "disc brake",
    "dock", "dog sled", "dome", "doormat", "drilling rig", "drum", "drumstick",
    "dumbbell", "Dutch oven", "electric fan", "electric guitar",
    "electric locomotive", "entertainment center", "envelope",
    "espresso machine", "face powder", "feather boa", "filing cabinet",
    "fireboat", "fire engine", "fire screen sheet", "flagpole", "flute",
    "folding chair", "football helmet", "forklift", "fountain", "fountain pen",
    "four-poster bed", "freight car", "French horn", "frying pan", "fur coat",
    "garbage truck", "gas mask", "gas pump", "goblet", "go-kart", "golf ball",
    "golf cart", "gondola", "gong", "gown", "grand piano", "greenhouse",
    "grille", "grocery store", "guillotine", "barrette", "hair spray",
    "half-track", "hammer", "hamper", "hair dryer", "hand-held computer",
    "handkerchief", "hard disk drive", "harmonica", "harp", "harvester",
    "hatchet", "holster", "home theater", "honeycomb", "hook", "hoop skirt",
    "horizontal bar", "horse-drawn vehicle", "hourglass", "iPod",
    "clothes iron", "jack-o'-lantern", "jeans", "jeep", "T-shirt",
    "jigsaw puzzle", "pulled rickshaw", "joystick", "kimono", "knee pad",
    "knot", "lab coat", "ladle", "lampshade", "laptop computer", "lawn mower",
    "lens cap", "paper knife", "library", "lifeboat", "lighter", "limousine",
    "ocean liner", "lipstick", "slip-on shoe", "lotion", "speaker", "loupe",
    "sawmill", "magnetic compass", "mail bag", "mailbox", "tights",
    "tank suit", "manhole cover", "maraca", "marimba", "mask", "match",
    "maypole", "maze", "measuring cup", "medicine chest", "megalith",
    "microphone", "microwave oven", "military uniform", "milk can", "minibus",
    "miniskirt", "minivan", "missile", "mitten", "mixing bowl", "mobile home",
    "Model T", "modem", "monastery", "monitor", "moped", "mortar",
    "square academic cap", "mosque", "mosquito net", "scooter",
    "mountain bike", "tent", "computer mouse", "mousetrap", "moving van",
    "muzzle", "nail", "neck brace", "necklace", "nipple", "notebook computer",
    "obelisk", "oboe", "ocarina", "odometer", "oil filter", "organ",
    "oscilloscope", "overskirt", "bullock cart", "oxygen mask", "packet",
    "paddle", "paddle wheel", "padlock", "paintbrush", "pajamas", "palace",
    "pan flute", "paper towel", "parachute", "parallel bars", "park bench",
    "parking meter", "passenger car", "patio", "payphone", "pedestal",
    "pencil case", "pencil sharpener", "perfume", "Petri dish", "photocopier",
    "plectrum", "Pickelhaube", "picket fence", "pickup truck", "pier",
    "piggy bank", "pill bottle", "pillow", "ping-pong ball", "pinwheel",
    "pirate ship", "pitcher", "hand plane", "planetarium", "plastic bag",
    "plate rack", "plow", "plunger", "Polaroid camera", "pole", "police van",
    "poncho", "billiard table", "soda bottle", "pot", "potter's wheel",
    "power drill", "prayer rug", "printer", "prison", "projectile",
    "projector", "hockey puck", "punching bag", "purse", "quill", "quilt",
    "race car", "racket", "radiator", "radio", "radio telescope",
    "rain barrel", "recreational vehicle", "reel", "reflex camera",
    "refrigerator", "remote control", "restaurant", "revolver", "rifle",
    "rocking chair", "rotisserie", "eraser", "rugby ball", "ruler",
    "running shoe", "safe", "safety pin", "salt shaker", "sandal", "sarong",
    "saxophone", "scabbard", "weighing scale", "school bus", "schooner",
    "scoreboard", "CRT screen", "screw", "screwdriver", "seat belt",
    "sewing machine", "shield", "shoe store", "shoji", "shopping basket",
    "shopping cart", "shovel", "shower cap", "shower curtain", "ski",
    "ski mask", "sleeping bag", "slide rule", "sliding door", "slot machine",
    "snorkel", "snowmobile", "snowplow", "soap dispenser", "soccer ball",
    "sock", "solar thermal collector", "sombrero", "soup bowl", "space bar",
    "space heater", "space shuttle", "spatula", "motorboat", "spider web",
    "spindle", "sports car", "spotlight", "stage", "steam locomotive",
    "through arch bridge", "steel drum", "stethoscope", "scarf", "stone wall",
    "stopwatch", "stove", "strainer", "tram", "stretcher", "couch", "stupa",
    "submarine", "suit", "sundial", "sunglass", "sunglasses", "sunscreen",
    "suspension bridge", "mop", "sweatshirt", "swimsuit", "swing", "switch",
    "syringe", "table lamp", "tank", "tape player", "teapot", "teddy bear",
    "television", "tennis ball", "thatched roof", "front curtain", "thimble",
    "threshing machine", "throne", "tile roof", "toaster", "tobacco shop",
    "toilet seat", "torch", "totem pole", "tow truck", "toy store", "tractor",
    "semi-trailer truck", "tray", "trench coat", "tricycle", "trimaran",
    "tripod", "triumphal arch", "trolleybus", "trombone", "tub", "turnstile",
    "typewriter keyboard", "umbrella", "unicycle", "upright piano",
    "vacuum cleaner", "vase", "vault", "velvet", "vending machine", "vestment",
    "viaduct", "violin", "volleyball", "waffle iron", "wall clock", "wallet",
    "wardrobe", "military aircraft", "sink", "washing machine", "water bottle",
    "water jug", "water tower", "whiskey jug", "whistle", "wig",
    "window screen", "window shade", "Windsor tie", "wine bottle", "wing",
    "wok", "wooden spoon", "wool", "split-rail fence", "shipwreck", "yawl",
    "yurt", "website", "comic book", "crossword", "traffic sign",
    "traffic light", "dust jacket", "menu", "plate", "guacamole", "consomme",
    "hot pot", "trifle", "ice cream", "ice pop", "baguette", "bagel",
    "pretzel", "cheeseburger", "hot dog", "mashed potato", "cabbage",
    "broccoli", "cauliflower", "zucchini", "spaghetti squash", "acorn squash",
    "butternut squash", "cucumber", "artichoke", "bell pepper", "cardoon",
    "mushroom", "Granny Smith", "strawberry", "orange", "lemon", "fig",
    "pineapple", "banana", "jackfruit", "custard apple", "pomegranate", "hay",
    "carbonara", "chocolate syrup", "dough", "meatloaf", "pizza", "pot pie",
    "burrito", "red wine", "espresso", "cup", "eggnog", "alp", "bubble",
    "cliff", "coral reef", "geyser", "lakeshore", "promontory", "shoal",
    "seashore", "valley", "volcano", "baseball player", "bridegroom",
    "scuba diver", "rapeseed", "daisy", "yellow lady's slipper", "corn",
    "acorn", "rose hip", "horse chestnut seed", "coral fungus", "agaric",
    "gyromitra", "stinkhorn mushroom", "earth star", "hen-of-the-woods",
    "bolete", "ear", "toilet paper"
]
MIN_SCORE = 0.5

# Initialize MobileNet v2
model = mobilenet.mobilenet_v2(pretrained=True, quantize=True)
model.eval()

# Define the preprocessing
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    ),
])

# Set up Tk
root = tk.Tk()
root.rowconfigure(0, weight=1)
root.columnconfigure(0, weight=1)
label = tk.Label(root)
label.grid()

# Open the camera
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# Image variable handed over to the classification thread
image_for_classification = None


def label_loop():
    global root, label, image_for_classification, tk_image

    # Read a camera frame
    ret, frame = cap.read()
    if not ret:
        label.after(0, label_loop)
        return
    rgb_image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Correct the display orientation
    label_width = label.winfo_screenwidth()
    label_height = label.winfo_screenheight()
    image_height, image_width = rgb_image.shape[:2]
    if (label_width > label_height) != (image_width > image_height):
        image_width, image_height = image_height, image_width
        rgb_image = cv2.rotate(rgb_image, cv2.ROTATE_90_COUNTERCLOCKWISE)

    # Copy the image for the classification thread
    image_for_classification = copy.deepcopy(rgb_image)

    # Resize so the image fits within the whole screen
    label_width = int(
        min(image_width * label_height / image_height, label_width))
    label_height = int(
        min(image_height * label_width / image_width, label_height))
    rgb_image = cv2.resize(
        rgb_image,
        (label_width, label_height),
        interpolation=cv2.INTER_LINEAR,
    )

    # Update the Tk label
    tk_image = ImageTk.PhotoImage(
        image=Image.fromarray(rgb_image),
        master=root,
    )
    label.configure(image=tk_image)
    label.update()
    label.after(0, label_loop)


def classification_thread():
    global label, image_for_classification
    while True:
        if image_for_classification is not None:
            temp_image = copy.deepcopy(image_for_classification)

            # Preprocessing
            input_tensor = preprocess(Image.fromarray(temp_image))
            input_batch = input_tensor.unsqueeze(0)

            # Inference
            with torch.no_grad():
                output = model(input_batch)

            # Postprocessing
            class_scores = torch.nn.functional.softmax(output[0], dim=0)
            class_id = int(np.argmax(class_scores))

            # Update the screen
            if class_scores[class_id] < MIN_SCORE:
                text = "No object detected"
            else:
                text = "Detected the following object: "
                text += imagenet_labels[class_id]
            label.config(
                text=text,
                wraplength=label.winfo_screenwidth(),
                compound=tk.BOTTOM,
            )


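# Start the Tk main processing
label_loop()
# Start the classification thread
# (daemon=True so the thread does not keep the app alive after the window closes)
threading.Thread(target=classification_thread, daemon=True).start()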
root.mainloop()
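The postprocessing step in this sample (softmax, pick the top class, compare against MIN_SCORE) can be tried without PyTorch. The following is a plain-Python sketch of that logic; `softmax` and `classify` are hypothetical helpers written for illustration, and the three-label list in the usage note is made up.

```python
import math

MIN_SCORE = 0.5  # same threshold as the sample


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, labels, min_score=MIN_SCORE):
    """Return the top label, or None when its probability is too low."""
    scores = softmax(logits)
    class_id = max(range(len(scores)), key=scores.__getitem__)
    if scores[class_id] < min_score:
        return None
    return labels[class_id]
```

With labels `["cat", "dog", "bird"]`, the logits `[0.0, 0.0, 5.0]` are classified as `"bird"`, while `[1.0, 1.0, 1.0]` returns `None` because each class only gets a probability of one third.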

Sample 6: Object detection with TensorFlow Lite
Note: borrows PINTO_model_zoo/426_YOLOX-Body-Head-Hand

A demo using PINTO's human body detection model.
The model I borrowed is yolox_n_body_head_hand_post_0461_0.4428_1x3x256x320_float32.tflite 👀

sample_06.py

import copy
import time

import cv2
import numpy as np
import tkinter as tk
import tensorflow as tf
from PIL import ImageTk, Image

# Set up Tk
root = tk.Tk()
root.rowconfigure(0, weight=1)
root.columnconfigure(0, weight=1)
label = tk.Label(root)
label.grid()

# Open the camera
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 360)

# Load the model
interpreter = tf.lite.Interpreter(
    model_path=
    'yolox_n_body_head_hand_post_0461_0.4428_1x3x256x320_float32.tflite')
interpreter.allocate_tensors()
model = interpreter.get_signature_runner()

input_details = interpreter.get_input_details()
input_shape = [detail.get('shape', None) for detail in input_details]
input_width, input_height = input_shape[0][2], input_shape[0][1]


def label_loop():
    global label, cap, model, input_width, input_height

    start_time = time.time()

    # Read a camera frame
    ret, frame = cap.read()
    if not ret:
        label.after(0, label_loop)
        return
    debug_image = copy.deepcopy(frame)

    # Inference
    input_image = cv2.resize(frame, (input_width, input_height))
    input_data = {'input': np.asarray([input_image], dtype=np.float32)}

    outputs = model(**input_data)

    results = outputs['batchno_classid_score_x1y1x2y2']

    elapsed_time = time.time() - start_time

    # Debug drawing
    frame_width, frame_height = frame.shape[1], frame.shape[0]
    for result in results:
        class_id = int(result[1])
        score = result[2]
        x1 = int(result[3] / input_width * frame_width)
        y1 = int(result[4] / input_height * frame_height)
        x2 = int(result[5] / input_width * frame_width)
        y2 = int(result[6] / input_height * frame_height)

        if score < 0.5:
            continue

        if class_id == 0:
            color = (255, 0, 0)
        elif class_id == 1:
            color = (0, 255, 0)
        else:
            color = (0, 0, 255)
        debug_image = cv2.rectangle(
            debug_image,
            (x1, y1),
            (x2, y2),
            color,
            thickness=1,
        )
    text = 'Elapsed time:' + '%.0f' % (elapsed_time * 1000)
    text = text + 'ms'
    debug_image = cv2.putText(
        debug_image,
        text,
        (10, 30),
        cv2.FONT_HERSHEY_SIMPLEX,
        0.75,
        (255, 0, 0),
        thickness=2,
    )

    rgb_image = cv2.cvtColor(debug_image, cv2.COLOR_BGR2RGB)

    # Correct the display orientation
    label_width = label.winfo_screenwidth()
    label_height = label.winfo_screenheight()
    image_height, image_width = rgb_image.shape[:2]
    if (label_width > label_height) != (image_width > image_height):
        image_width, image_height = image_height, image_width
        rgb_image = cv2.rotate(rgb_image, cv2.ROTATE_90_COUNTERCLOCKWISE)

    # Resize so the image fits within the whole screen
    label_width = int(
        min(image_width * label_height / image_height, label_width))
    label_height = int(
        min(image_height * label_width / image_width, label_height))
    rgb_image = cv2.resize(
        rgb_image,
        (label_width, label_height),
        interpolation=cv2.INTER_LINEAR,
    )

    # Update the Tk label
    tk_image = ImageTk.PhotoImage(image=Image.fromarray(rgb_image))
    label.configure(image=tk_image)
    label.image = tk_image  # keep a reference so the image is not garbage collected
    label.update()
    label.after(0, label_loop)


label_loop()
root.mainloop()
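The debug drawing in this sample maps box coordinates from the model input size back to the camera frame and drops low-score rows. The sketch below shows that arithmetic by itself. `scale_box` and `keep_confident` are hypothetical helper names, and the row layout `[batch_no, class_id, score, x1, y1, x2, y2]` is inferred from the output name used in the sample.

```python
def scale_box(box, input_size, frame_size):
    """Map (x1, y1, x2, y2) from model-input pixels to frame pixels.

    input_size and frame_size are (width, height) tuples.
    """
    input_width, input_height = input_size
    frame_width, frame_height = frame_size
    x1, y1, x2, y2 = box
    return (
        int(x1 / input_width * frame_width),
        int(y1 / input_height * frame_height),
        int(x2 / input_width * frame_width),
        int(y2 / input_height * frame_height),
    )


def keep_confident(results, min_score=0.5):
    """Filter rows shaped like [batch_no, class_id, score, x1, y1, x2, y2]."""
    return [row for row in results if row[2] >= min_score]
```

For a 320x256 model input and a 640x360 frame, `scale_box((160, 128, 320, 256), (320, 256), (640, 360))` gives `(320, 180, 640, 360)`. Filtering before scaling, rather than after as in the sample, avoids computing coordinates for boxes that are never drawn.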

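Sample 7: ONNX inference with the dnn module

Pydroid 3 cannot install ONNX Runtime, so I figured I could not try ONNX.
Then I realized that OpenCV's dnn module can read ONNX in the first place...
I posted about the dnn module on the 6th, and it still slipped my mind completely 😇

Since I was at it, I tried it out with reference to Una's Advent Calendar post.

The model used is yolox_nano.onnx.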
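The detection class in this sample decodes raw YOLOX outputs against a grid: each prediction belongs to one grid cell at one stride, the center is `(raw_xy + grid) * stride`, and the size is `exp(raw_wh) * stride`. The following plain-Python sketch mirrors that decoding for a single prediction. `make_grid_cells` and `decode_box` are hypothetical names, and the strides 8, 16, 32 are the ones used in the sample.

```python
import math


def make_grid_cells(input_height, input_width, strides=(8, 16, 32)):
    """List (grid_x, grid_y, stride) for every prediction, x varying fastest."""
    cells = []
    for stride in strides:
        rows = input_height // stride
        cols = input_width // stride
        for grid_y in range(rows):
            for grid_x in range(cols):
                cells.append((grid_x, grid_y, stride))
    return cells


def decode_box(raw, cell):
    """Turn raw (tx, ty, tw, th) into a top-left (x, y, w, h) box in input pixels."""
    tx, ty, tw, th = raw
    grid_x, grid_y, stride = cell
    center_x = (tx + grid_x) * stride
    center_y = (ty + grid_y) * stride
    width = math.exp(tw) * stride
    height = math.exp(th) * stride
    return (center_x - width / 2, center_y - height / 2, width, height)
```

For a 416x416 input, `make_grid_cells(416, 416)` has 3549 entries (52x52 + 26x26 + 13x13), which should match the number of rows in the model output.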

sample_07.py

import copy
import time

import cv2
import numpy as np
import tkinter as tk
from PIL import ImageTk, Image


# detection model class for yolox
class DetectionModel:
    # constructor
    def __init__(self, weight, input_size=(640, 640)):
        self.__initialize(weight, input_size)

    # initialize
    def __initialize(self, weight, input_size):
        self.net = cv2.dnn.readNet(weight)
        self.input_size = input_size

        self.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
        self.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)

        strides = [8, 16, 32]
        self.grids, self.expanded_strides = self.__create_grids_and_expanded_strides(
            strides)

    # create grids and expanded strides
    def __create_grids_and_expanded_strides(self, strides):
        grids = []
        expanded_strides = []

        hsizes = [self.input_size[0] // stride for stride in strides]
        wsizes = [self.input_size[1] // stride for stride in strides]

        for hsize, wsize, stride in zip(hsizes, wsizes, strides):
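            # np.meshgrid takes the x (width) range first, then the y (height) range;
            # the result is identical for square inputs
            xv, yv = np.meshgrid(np.arange(wsize), np.arange(hsize))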
            grid = np.stack((xv, yv), 2).reshape(1, -1, 2)
            grids.append(grid)
            shape = grid.shape[:2]
            expanded_strides.append(np.full((*shape, 1), stride))

        grids = np.concatenate(grids, 1)
        expanded_strides = np.concatenate(expanded_strides, 1)

        return grids, expanded_strides

    # set preferable backend
    def setPreferableBackend(self, backend):
        self.net.setPreferableBackend(backend)

    # set preferable target
    def setPreferableTarget(self, target):
        self.net.setPreferableTarget(target)

    # detect objects
    def detect(self, image, score_threshold, iou_threshold):
        self.image_shape = image.shape
        input_blob, resize_ratio = self.__preprocess(image)
        output_blob = self.__predict(input_blob)
        boxes, scores, class_ids = self.__postprocess(output_blob,
                                                      resize_ratio)
        boxes, scores, class_ids = self.__nms(boxes, scores, class_ids,
                                              score_threshold, iou_threshold)

        return class_ids, scores, boxes

    # preprocess
    def __preprocess(self, image):
        resize_ratio = min(self.input_size[0] / self.image_shape[0],
                           self.input_size[1] / self.image_shape[1])
        resized_image = cv2.resize(image,
                                   dsize=None,
                                   fx=resize_ratio,
                                   fy=resize_ratio)

        padded_image = np.ones(
            (self.input_size[0], self.input_size[1], 3), dtype=np.uint8) * 114
        padded_image[:resized_image.shape[0], :resized_image.
                     shape[1]] = resized_image

        input_blob = cv2.dnn.blobFromImage(padded_image, 1.0, self.input_size,
                                           (0.0, 0.0, 0.0), True, False)

        return input_blob, resize_ratio

    # predict
    def __predict(self, input_blob):
        self.net.setInput(input_blob)

        output_layer = self.net.getUnconnectedOutLayersNames()[0]  # "output"
        output_blob = self.net.forward(output_layer)

        return output_blob

    # postprocess
    def __postprocess(self, output_blob, resize_ratio):
        output_blob[..., :2] = (output_blob[..., :2] +
                                self.grids) * self.expanded_strides
        output_blob[..., 2:4] = np.exp(
            output_blob[..., 2:4]) * self.expanded_strides

        predictions = output_blob[0]

        boxes = predictions[:, :4]
        boxes_xywh = np.ones_like(boxes)
        boxes_xywh[:, 0] = boxes[:, 0] - boxes[:, 2] * 0.5
        boxes_xywh[:, 1] = boxes[:, 1] - boxes[:, 3] * 0.5
        boxes_xywh[:, 2] = (boxes[:, 0] + boxes[:, 2] * 0.5) - boxes_xywh[:, 0]
        boxes_xywh[:, 3] = (boxes[:, 1] + boxes[:, 3] * 0.5) - boxes_xywh[:, 1]
        boxes_xywh /= resize_ratio

        scores = predictions[:, 4:5] * predictions[:, 5:]
        class_ids = scores.argmax(1)
        scores = scores[np.arange(len(class_ids)), class_ids]

        return boxes_xywh, scores, class_ids

    # non maximum suppression
    def __nms(self, boxes, scores, class_ids, score_threshold, iou_threshold):
        indices = cv2.dnn.NMSBoxesBatched(
            boxes, scores, class_ids, score_threshold,
            iou_threshold)  # OpenCV 4.7.0 or later

        keep_boxes = []
        keep_scores = []
        keep_class_ids = []
        for index in indices:
            keep_boxes.append(boxes[index])
            keep_scores.append(scores[index])
            keep_class_ids.append(class_ids[index])

        if len(keep_boxes) > 0:
            keep_boxes = np.vectorize(int)(keep_boxes)

        return keep_boxes, keep_scores, keep_class_ids


def get_id_color(index):
    temp_index = abs(int(index + 1)) * 3
    color = ((37 * temp_index) % 255, (17 * temp_index) % 255,
             (29 * temp_index) % 255)
    return color


# Set up Tk
root = tk.Tk()
root.rowconfigure(0, weight=1)
root.columnconfigure(0, weight=1)
label = tk.Label(root)
label.grid()

# Open the camera
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 360)

# Load the model
input_width = 416
input_height = 416
model = DetectionModel(
    'yolox_nano.onnx',
    (input_height, input_width),
)


def label_loop():
    global label, cap, model
    start_time = time.time()

    # Read a camera frame
    ret, frame = cap.read()
    if not ret:
        label.after(0, label_loop)
        return
    debug_image = copy.deepcopy(frame)

    # Inference
    class_ids, scores, boxes = model.detect(
        image=frame,
        score_threshold=0.6,
        iou_threshold=0.4,
    )

    elapsed_time = time.time() - start_time

    # Debug drawing
    for box, score, class_id in zip(boxes, scores, class_ids):
        cv2.rectangle(debug_image, box, get_id_color(class_id), 2)

    text = 'Elapsed time:' + '%.0f' % (elapsed_time * 1000)
    text = text + 'ms'
    debug_image = cv2.putText(
        debug_image,
        text,
        (10, 30),
        cv2.FONT_HERSHEY_SIMPLEX,
        0.75,
        (255, 0, 0),
        thickness=2,
    )

    rgb_image = cv2.cvtColor(debug_image, cv2.COLOR_BGR2RGB)

    # Correct the display orientation
    label_width = label.winfo_screenwidth()
    label_height = label.winfo_screenheight()
    image_height, image_width = rgb_image.shape[:2]  # shape is (rows, cols, channels)
    if (label_width > label_height) != (image_width > image_height):
        image_width, image_height = image_height, image_width
        rgb_image = cv2.rotate(rgb_image, cv2.ROTATE_90_COUNTERCLOCKWISE)

    # Resize to fit within the whole screen
    label_width = int(
        min(image_width * label_height / image_height, label_width))
    label_height = int(
        min(image_height * label_width / image_width, label_height))
    rgb_image = cv2.resize(
        rgb_image,
        (label_width, label_height),
        interpolation=cv2.INTER_LINEAR,
    )

    # Update the Tk label
    tk_image = ImageTk.PhotoImage(image=Image.fromarray(rgb_image))
    label.configure(image=tk_image)
    label.image = tk_image  # keep a reference so Tk does not lose the image
    label.update()
    label.after(0, label_loop)


label_loop()
root.mainloop()
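
As a side note, the coordinate handling in `__preprocess` and `__postprocess` can be checked on its own without a camera or a model. The following is only a minimal sketch using NumPy: the helper names `letterbox_ratio` and `center_to_xywh` and the sample numbers are my own assumptions for illustration, not part of the script above. It shows how the resize ratio is chosen and how a center-format box in network-input pixels is mapped back to a top-left `(x, y, w, h)` box in the original frame.

```python
import numpy as np


def letterbox_ratio(image_shape, input_size):
    """Scale factor that fits an image (h, w) inside input_size (h, w).

    Hypothetical helper mirroring the ratio computed in __preprocess.
    """
    return min(input_size[0] / image_shape[0], input_size[1] / image_shape[1])


def center_to_xywh(boxes_cxcywh, resize_ratio):
    """Convert (cx, cy, w, h) boxes in network-input pixels to
    top-left (x, y, w, h) boxes in original-image pixels.

    Hypothetical helper mirroring the box conversion in __postprocess.
    """
    boxes = np.asarray(boxes_cxcywh, dtype=np.float32)
    xywh = np.empty_like(boxes)
    xywh[:, 0] = boxes[:, 0] - boxes[:, 2] * 0.5  # left edge
    xywh[:, 1] = boxes[:, 1] - boxes[:, 3] * 0.5  # top edge
    xywh[:, 2] = boxes[:, 2]  # width is unchanged by the conversion
    xywh[:, 3] = boxes[:, 3]  # height is unchanged by the conversion
    # Undo the resize applied before padding
    return xywh / resize_ratio


# Assumed example: a 360x640 (h, w) frame and a 416x416 model input
ratio = letterbox_ratio((360, 640), (416, 416))
boxes = center_to_xywh([[208.0, 117.0, 104.0, 52.0]], ratio)
print(ratio, boxes)
```

Because the resized image is pasted into the top-left corner of the padded input, there is no offset to subtract and dividing by the ratio is enough to get back to frame coordinates.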

That's all.
