GeneratorVideo does not work #1014

Open
@BielStela

Description

VideoDataset using a GeneratorVideo does not work

Context

I'm trying to create a video using GeneratorVideo to see if I can free up some memory. I already tried successfully with SequenceVideo (which works quite well btw) and then refactored to use a generator that yields frames, wrapped in a GeneratorVideo.

Steps to Reproduce

# catalog.yml
test_video:
  type: video.VideoDataset
  filepath: data/03_primary/test.mp4

# nodes.py
from collections.abc import Generator

from PIL import Image
from kedro_datasets.video.video_dataset import GeneratorVideo


def make_video() -> GeneratorVideo:
    """Makes a video with three frames: one red, one green and one blue at 1 fps"""
    def frames() -> Generator[Image.Image, None, None]:
        w, h = 256, 256
        red_frame = Image.new("RGB", (w, h), (255, 0, 0))
        green_frame = Image.new("RGB", (w, h), (0, 255, 0))
        blue_frame = Image.new("RGB", (w, h), (0, 0, 255))
        frames = [red_frame, green_frame, blue_frame]
        yield from frames

    return GeneratorVideo(frames(), length=None, fps=1)

# pipeline.py
from kedro.pipeline import Pipeline, pipeline, node

from .nodes import make_video


def create_pipeline(**kwargs) -> Pipeline:
    return pipeline([node(make_video, inputs=None, outputs="test_video")])

Expected Result

A colorful video similar to this one (it does not play in the preview; hopefully it does once published):

test.mp4

Actual Result

This error!

kedro.io.core.DatasetError: Failed while saving data to dataset VideoDataset(filepath=<removed>, protocol=file).
'Image' object has no attribute 'fps'

If one changes the node to use a SequenceVideo like so:

from PIL import Image
from kedro_datasets.video.video_dataset import SequenceVideo


def make_video() -> SequenceVideo:
    """Makes a video from red, green and blue frames at 1 fps"""
    def frames() -> list:
        w, h = 256, 256
        red_frame = Image.new("RGB", (w, h), (254, 0, 0))
        green_frame = Image.new("RGB", (w, h), (0, 254, 0))
        blue_frame = Image.new("RGB", (w, h), (0, 0, 254))
        frames = [red_frame, green_frame, blue_frame, blue_frame]
        return frames

    return SequenceVideo(frames(), fps=1)

It works well.

Here is my debugging report:
While the pipeline runs, there is a point, at
kedro.runner._run_node_sequential:528, where the code does

        items = zip(it.cycle(keys), interleave(*streams))

where streams is a list containing my GeneratorVideo, which gets iterated during the interleaving. The problem is that the GeneratorVideo is itself an Iterator, so it is unrolled into a stream of Image.Image frames, and catalog.save(name, data) is then called on each frame as the stream is consumed. VideoDataset then takes control and fails instantly because its input is no longer a GeneratorVideo or a SequenceVideo; it is now a single Image.
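The behavior described above can be imitated without Kedro. The following is only a sketch under stated assumptions: FakeGeneratorVideo, interleave_sketch and items_to_save are hypothetical stand-ins written for illustration, not Kedro or kedro-datasets code, and the round-robin helper is assumed to approximate what interleave does in the quoted line. Strings stand in for the Image frames.

```python
import itertools as it
from collections.abc import Iterator


class FakeGeneratorVideo(Iterator):
    """Hypothetical stand-in for GeneratorVideo: an iterator of frames that carries fps."""

    def __init__(self, frames, fps):
        self._frames = iter(frames)
        self.fps = fps

    def __next__(self):
        return next(self._frames)


def interleave_sketch(*streams):
    """Assumed round-robin behavior: one item from each stream in turn."""
    for group in zip(*streams):
        yield from group


def items_to_save(outputs):
    """Simplified imitation of the dispatch quoted above (not Kedro's actual code)."""
    if all(isinstance(d, Iterator) for d in outputs.values()):
        keys = list(outputs.keys())
        streams = list(outputs.values())
        # Each output is consumed chunk by chunk; the wrapper object is lost.
        return list(zip(it.cycle(keys), interleave_sketch(*streams)))
    # Otherwise each output object would be saved whole.
    return list(outputs.items())


video = FakeGeneratorVideo(["red", "green", "blue"], fps=1)
saved = items_to_save({"test_video": video})
print(saved)
# [('test_video', 'red'), ('test_video', 'green'), ('test_video', 'blue')]
```

In this sketch every item handed to the save step is a bare frame, so an attribute such as fps that lives on the wrapper is no longer reachable, which matches the 'Image' object has no attribute 'fps' message. A list-backed object is not an Iterator, so it would take the second branch and be saved whole, which would be consistent with SequenceVideo working.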

From here I have no more clue about how this can be fixed tho :_)

Your Environment

  • Kedro version used (pip show kedro or kedro -V): 0.19.9
  • Python version used (python -V): Python 3.11.9
  • Operating system and version: Linux 6.8.0-48-generic 22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Oct 7 11:24:13 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
