Medical images such as MRI and CT scans (3D images) are very similar to videos: both encode 2D spatial information over a third dimension. Much like diagnosing abnormalities from 3D images, action recognition from videos requires capturing context from the entire video rather than just capturing information from each frame.

Fig 1: Left: Example Head CT scan. Right: Example video from an action recogn
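To make the analogy concrete, here is a minimal NumPy sketch. It assumes both a CT volume and a grayscale video can be stored as an array of shape (slices or frames, height, width). The shapes, the random data, and the helper names `per_slice_features` and `cross_slice_features` are illustrative assumptions, not taken from any particular dataset or model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
ct_volume = rng.random((32, 64, 64))  # (slices, height, width)
video = rng.random((32, 64, 64))      # (frames, height, width), grayscale


def per_slice_features(stack):
    """One value per slice/frame; ignores ordering along axis 0."""
    return stack.mean(axis=(1, 2))


def cross_slice_features(stack):
    """Mean absolute change between neighbouring slices/frames along axis 0."""
    return np.abs(np.diff(stack, axis=0)).mean(axis=(1, 2))


# Shuffling the frames keeps the same set of per-frame values,
# but the cross-frame values depend on the order.
shuffled = video[rng.permutation(video.shape[0])]
same_per_frame = np.allclose(
    np.sort(per_slice_features(video)), np.sort(per_slice_features(shuffled))
)
same_cross_frame = np.allclose(
    cross_slice_features(video), cross_slice_features(shuffled)
)
print(same_per_frame, same_cross_frame)
```

The per-frame summary cannot tell the original video from a shuffled one, which is a rough illustration of why frame-by-frame processing misses context that lives along the third dimension.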