wuffs significantly slower than OpenCV 4.9.0 when decoding PNGs for 7680x4320 image #149

Open
@zchrissirhcz

Description

Problem

When decoding a large image (height=4320, width=7680, channels=4, data type uint8_t), wuffs is much slower than OpenCV 4.9.0 on an Apple M1 (Mac mini).

Time cost

decoder (7680x4320 image)         time cost
opencv 4.9.0                      270 ms
wuffs latest ("unsupported.c")    370 ms
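
To put the two timings on a common scale, here is a small sketch that converts a wall time into decoded-pixel throughput. It assumes the 4-channel, 8-bit output described in the problem statement (7680 x 4320 x 4 = 132,710,400 bytes per frame). The function names are illustrative only and are not part of wuffs or OpenCV.

```cpp
#include <cassert>
#include <cstdint>

// Bytes in one decoded frame, assuming one byte per channel (uint8_t).
constexpr std::uint64_t decoded_bytes(std::uint64_t width,
                                      std::uint64_t height,
                                      std::uint64_t channels) {
  return width * height * channels;
}

// Decoded megabytes (10^6 bytes) produced per second for a given wall time.
inline double throughput_mb_per_s(std::uint64_t bytes, double milliseconds) {
  assert(milliseconds > 0.0);
  return (static_cast<double>(bytes) / 1e6) / (milliseconds / 1e3);
}

// Ratio of two wall times; a value above 1 means `slower_ms` took longer.
inline double slowdown(double slower_ms, double faster_ms) {
  assert(faster_ms > 0.0);
  return slower_ms / faster_ms;
}
```

Calling `throughput_mb_per_s(decoded_bytes(7680, 4320, 4), 270.0)` and the same with `370.0` gives the two decoders' throughput for this image, and `slowdown(370.0, 270.0)` gives the relative gap reported here.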

OpenCV 4.9.0 details

Installed with Homebrew:

brew install opencv

This build links against libpng 1.6.43, as reported by cv::getBuildInformation():

  Media I/O: 
    ZLib:                        /Library/Developer/CommandLineTools/SDKs/MacOSX14.sdk/usr/lib/libz.tbd (ver 1.2.12)
    JPEG:                        /opt/homebrew/lib/libjpeg.dylib (ver 80)
    WEBP:                        /opt/homebrew/lib/libwebp.dylib (ver encoder: 0x020f)
    PNG:                         /opt/homebrew/lib/libpng.dylib (ver 1.6.43)
    TIFF:                        /opt/homebrew/lib/libtiff.dylib (ver 42 / 4.6.0)
    JPEG 2000:                   OpenJPEG (ver 2.5.2)
    OpenEXR:                     OpenEXR::OpenEXR (ver 3.2.4)
    HDR:                         YES
    SUNRASTER:                   YES
    PXM:                         YES
    PFM:                         YES

The exact code I used

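#include <algorithm>  // std::copy_n
#include <cstdio>     // printf
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

#include <opencv2/opencv.hpp>  // cv::Mat, cv::imread, cv::getBuildInformation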
// Copyright 2023 The Wuffs Authors.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// https://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or https://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.
//
// SPDX-License-Identifier: Apache-2.0 OR MIT

// ----------------

/*
toy-aux-image demonstrates using the wuffs_aux::DecodeImage C++ function to
decode an in-memory compressed image. In this example, the compressed image is
hard-coded to a specific image: a JPEG encoding of the first frame of the
test/data/muybridge.gif animated image.

To run:

$CXX toy-aux-image.cc && ./a.out; rm -f a.out

for a C++ compiler $CXX, such as clang++ or g++.

The expected output:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@X@@@@XX@@@@@@@@@@X
XXXXX@@XXX@@@@@@@II@@@X@X@@@@@
XXXXX@@XX@@X@@@XO+XXX@XX@@@X@@
XXXXXXXX@XX@X@XI=I@@XXI+OXX@XX
XXXXXXXXXXXXXXX+=+OXO+=::OXX@X
XXXXXXXXXXXXXXXXXX=+==:::=XXXX
XXXXXXXXO+:::::+OO+===+OI=+XXX
XXXO::=++:::==+++XI+++X@XXO@XX
XXXO=X@X+::=::::+O++=I@XX@XXXX
XXXXX@XXX=:::::::::=+@XXXX@XXX
XXXXXXXX@O::IXO=::::O@@XXXXXXX
XXXXXXXXO=X+X@@XX::O@@XXXXXXXX
XXXXXXXXXOO=X@X@X+OIXXXXXXXXXX
XXXXXXXXXXX+IIXX+X@OX@XXXXXXXX
XXXXXXXXX@XXOI+IIOOOXXXXXXXXXX
XXXXXXXXXXX@XXXXX@XXXXXXXXXXXX
XXXXXXXXXXXXXXXXX@XXXXXXXXXXXX
OOOOXXXXXXXXXXOXXXXXXXXXXXXOOO
=+++IIIIIIIOOOOOOOOOOIIIIIIII+
*/

// Wuffs ships as a "single file C library" or "header file library" as per
// https://github.com/nothings/stb/blob/master/docs/stb_howto.txt
//
// To use that single file as a "foo.c"-like implementation, instead of a
// "foo.h"-like header, #define WUFFS_IMPLEMENTATION before #include'ing or
// compiling it.
#define WUFFS_IMPLEMENTATION

// Defining the WUFFS_CONFIG__STATIC_FUNCTIONS macro is optional, but when
// combined with WUFFS_IMPLEMENTATION, it demonstrates making all of Wuffs'
// functions have static storage.
//
// This can help the compiler ignore or discard unused code, which can produce
// faster compiles and smaller binaries. Other motivations are discussed in the
// "ALLOW STATIC IMPLEMENTATION" section of
// https://raw.githubusercontent.com/nothings/stb/master/docs/stb_howto.txt
#define WUFFS_CONFIG__STATIC_FUNCTIONS

// Defining the WUFFS_CONFIG__MODULE* macros are optional, but it lets users of
// release/c/etc.c choose which parts of Wuffs to build. That file contains the
// entire Wuffs standard library, implementing a variety of codecs and file
// formats. Without this macro definition, an optimizing compiler or linker may
// very well discard Wuffs code for unused codecs, but listing the Wuffs
// modules we use makes that process explicit. Preprocessing means that such
// code simply isn't compiled.
/*
#define WUFFS_CONFIG__MODULES
#define WUFFS_CONFIG__MODULE__AUX__BASE
#define WUFFS_CONFIG__MODULE__AUX__IMAGE
#define WUFFS_CONFIG__MODULE__BASE
#define WUFFS_CONFIG__MODULE__JPEG
*/
#define WUFFS_CONFIG__MODULES
#define WUFFS_CONFIG__MODULE__AUX__BASE
#define WUFFS_CONFIG__MODULE__AUX__IMAGE
#define WUFFS_CONFIG__MODULE__ADLER32
#define WUFFS_CONFIG__MODULE__BASE
#define WUFFS_CONFIG__MODULE__CRC32
#define WUFFS_CONFIG__MODULE__DEFLATE
#define WUFFS_CONFIG__MODULE__PNG
#define WUFFS_CONFIG__MODULE__ZLIB

// Defining the WUFFS_CONFIG__DST_PIXEL_FORMAT__ENABLE_ALLOWLIST (and the
// associated ETC__ALLOW_FOO) macros are optional, but can lead to smaller
// programs (in terms of binary size). By default (without these macros),
// Wuffs' standard library can decode images to a variety of pixel formats,
// such as BGR_565, BGRA_PREMUL or RGBA_NONPREMUL. The destination pixel format
// is selectable at runtime. Using these macros essentially makes the selection
// at compile time, by narrowing the list of supported destination pixel
// formats. The FOO in ETC__ALLOW_FOO should match the pixel format passed (as
// part of the wuffs_base__image_config argument) to the decode_frame method.
//
// If using the wuffs_aux C++ API, without overriding the SelectPixfmt method,
// the implicit destination pixel format is BGRA_PREMUL.
#define WUFFS_CONFIG__DST_PIXEL_FORMAT__ENABLE_ALLOWLIST
#define WUFFS_CONFIG__DST_PIXEL_FORMAT__ALLOW_BGRA_PREMUL

// If building this program in an environment that doesn't easily accommodate
// relative includes, you can use the script/inline-c-relative-includes.go
// program to generate a stand-alone C file.
//#include "wuffs-v0.4.c"
//#include "wuffs-v0.3.c"
#include "wuffs-unsupported-snapshot.c"

//static std::string decode()
cv::Mat ncv::read_png(const std::string filename)
{
  // Call wuffs_aux::DecodeImage, which is the entry point to Wuffs' high-level
  // C++ API for decoding images. This API is easier to use than Wuffs'
  // low-level C API but the low-level one (1) handles animation, (2) handles
  // asynchronous I/O, (3) handles metadata and (4) does no dynamic memory
  // allocation, so it can run under a `SECCOMP_MODE_STRICT` sandbox.
  // Obviously, if you don't need any of those features, then these simple
  // lines of code here suffice.
  //
  // This example program doesn't explicitly use Wuffs' low-level C API but, if
  // you're curious to learn more, the wuffs_aux::DecodeImage implementation in
  // internal/cgen/auxiliary/*.cc uses it, as does the example/convert-to-nia C
  // program. There's also documentation at doc/std/image-decoders.md
  //
  // If you also want metadata like EXIF orientation and ICC color profiles,
  // script/print-image-metadata.cc has some example code. It uses Wuffs'
  // low-level API but it's a C++ program to use Wuffs' shorter convenience
  // methods: `decoder->decode_frame_config(NULL, &src)` instead of C's
  // `wuffs_base__image_decoder__decode_frame_config(decoder, NULL, &src)`.
  std::ifstream file(filename, std::ios::binary | std::ios::ate);
  if (!file.is_open())
  {
    std::cerr << "failed to open file " << filename << "\n";
    return cv::Mat();
  }
  std::streampos filesize = file.tellg();
  file.seekg(0, std::ios::beg);
  std::vector<char> buffer(filesize);
  if (!file.read(buffer.data(), filesize))
  {
    std::cerr << "error: could not read file content.\n";
    return cv::Mat();
  }
  file.close();

  wuffs_aux::DecodeImageCallbacks callbacks;
  wuffs_aux::sync_io::MemoryInput input(buffer.data(), buffer.size());
  wuffs_aux::DecodeImageResult result =
      wuffs_aux::DecodeImage(callbacks, input);
  if (!result.error_message.empty()) {
    std::cerr << "error: " << result.error_message << "\n";
    return cv::Mat();
  }
  // If result.error_message is empty then the DecodeImage call succeeded. The
  // decoded image is held in result.pixbuf, backed by memory that is released
  // when result.pixbuf_mem_owner (a std::unique_ptr) is destroyed. In this
  // example program, this happens at the end of this function.

  wuffs_base__table_u8 table = result.pixbuf.plane(0);
  //printf("table: %p, %zu, %zu, %zu\n", table.ptr, table.width, table.height, table.stride);

  // print result.pixbuf.pixcfg
//   printf("bpp: %d\n", result.pixbuf.pixcfg.pixel_format().bits_per_pixel());
//   printf("human readable: height=%zu, width=%zu, channel=%zu\n",
//     result.pixbuf.pixcfg.height(),
//     result.pixbuf.pixcfg.width(),
//     result.pixbuf.pixcfg.pixel_format().bits_per_pixel() / 8
//   );

  cv::Size size;
  size.height = result.pixbuf.pixcfg.height();
  size.width = result.pixbuf.pixcfg.width();
  int channels = result.pixbuf.pixcfg.pixel_format().bits_per_pixel() / 8;
  cv::Mat image(size, CV_8UC(channels));
  std::copy_n(table.ptr, size.width * size.height * channels, image.data);

  return image;
}





int main()
{
    std::cout << "OpenCV version (runtime): " << cv::getVersionString() << std::endl;

    //const std::string filename = "/Users/zz/data/peppers.png";
    const std::string filename = "/Users/zz/data/ASRDebug_0_7680x4320.png";
    cv::Mat src2;
    {
        birch::AutoTimer timer1("cv::imread");
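        // Note: the default flag is cv::IMREAD_COLOR, which returns a
        // 3-channel BGR image (any alpha channel is dropped).
        src2 = cv::imread(filename);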
    }
    printf("src2: rows=%d, cols=%d\n", src2.rows, src2.cols);
    //cv::imwrite("result2.png", src2);

    cv::Mat src1;
    {
        birch::AutoTimer timer1("ncv::read_png");
        src1 = ncv::read_png(filename);
    }
    //cv::imwrite("result1.png", src1);
    printf("src1: rows=%d, cols=%d\n", src1.rows, src1.cols);

    std::cout << cv::getBuildInformation() << std::endl;

    return 0;
}
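
The `birch::AutoTimer` class used in `main` is not shown in this issue. For anyone trying to reproduce the numbers, a minimal sketch of a scope-based timer follows. `ScopedTimer` and `best_of_ms` are hypothetical names, not the real `birch` API. The `best_of_ms` helper is an addition that is not in the code above: it repeats a call and keeps the fastest run, which is one common way to reduce noise from a cold file cache or a single unlucky run.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdio>
#include <limits>
#include <string>
#include <utility>

// Hypothetical stand-in for birch::AutoTimer: prints the elapsed wall time
// when it goes out of scope.
class ScopedTimer {
 public:
  explicit ScopedTimer(std::string label)
      : label_(std::move(label)), start_(std::chrono::steady_clock::now()) {}
  ScopedTimer(const ScopedTimer&) = delete;
  ScopedTimer& operator=(const ScopedTimer&) = delete;
  ~ScopedTimer() { std::printf("%s: %.3f ms\n", label_.c_str(), elapsed_ms()); }

  // Milliseconds since construction, measured on a monotonic clock.
  double elapsed_ms() const {
    const auto now = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(now - start_).count();
  }

 private:
  std::string label_;
  std::chrono::steady_clock::time_point start_;
};

// Runs `fn` `runs` times and returns the fastest wall time in milliseconds.
template <typename Fn>
double best_of_ms(int runs, Fn&& fn) {
  assert(runs > 0);
  double best = std::numeric_limits<double>::infinity();
  for (int i = 0; i < runs; ++i) {
    const auto t0 = std::chrono::steady_clock::now();
    fn();
    const auto t1 = std::chrono::steady_clock::now();
    best = std::min(
        best, std::chrono::duration<double, std::milli>(t1 - t0).count());
  }
  return best;
}
```

With this in place, something like `best_of_ms(5, [&] { src1 = ncv::read_png(filename); })` would time the wuffs path over several runs, and the same call around `cv::imread` would give a comparable figure for OpenCV.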
