Problems regarding ssim/psnr output #191

Closed
Selur opened this issue Jan 19, 2020 · 15 comments

Comments

@Selur

Selur commented Jan 19, 2020

When using --ssim / --psnr inside the Windows command prompt I get something like:

encoded 182 frames, 92.06 fps, 12698.48 kbps, 11.49 MB
ssim/psnr: SSIM YUV: 0.993957 (22.187781), 0.997201 (25.529748), 0.998155 (27.340250), All: 0.995198 (23.185425), (Frames: 182)
encode time 0:00:01, CPU: 1.7%, GPU: 11.0%, GPUClock: 1607MHz, VEClock: 1442MHz
ssim/psnr: PSNR YUV: 50.492561, 55.531229, 57.256833, Avg: 51.718770, (Frames: 182)
frame type IDR 2
frame type I 2, total size 0.43 MB
frame type P 180, total size 11.06 MB

at the end of the encode. However, when calling NVEncC through Qt and capturing the standard error output of the NVEncC process, I don't get the ssim/psnr lines; all other output is there.
Those lines also do not show up when piping to NVEncC (even in the Windows command prompt).

Did you change the output method somehow or is the SSIM/PSNR output by another process?
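
For anyone capturing NVEncC's stderr from another program, here is a minimal Python sketch of picking the two summary lines out of the captured text. The regular expressions are written against the sample lines quoted above, so the line format is an assumption that may differ in other builds; `parse_quality_lines` and `run_encoder` are hypothetical helper names, not part of NVEncC.

```python
import re
import subprocess

# Written against: "ssim/psnr: PSNR YUV: 50.492561, 55.531229, 57.256833, Avg: 51.718770, (Frames: 182)"
PSNR_RE = re.compile(
    r"ssim/psnr: PSNR YUV: (\S+), (\S+), (\S+), Avg: (\S+), \(Frames: (\d+)\)"
)
# Written against: "ssim/psnr: SSIM YUV: 0.993957 (22.187781), ..., All: 0.995198 (23.185425), (Frames: 182)"
SSIM_RE = re.compile(
    r"ssim/psnr: SSIM YUV: (\S+) \(\S+\), (\S+) \(\S+\), (\S+) \(\S+\), "
    r"All: (\S+) \(\S+\), \(Frames: (\d+)\)"
)


def parse_quality_lines(stderr_text):
    """Return a dict with 'psnr' and/or 'ssim' entries for the lines that are present."""
    result = {}
    # splitlines() also splits on '\r', which progress lines typically use.
    for line in stderr_text.splitlines():
        m = PSNR_RE.search(line)
        if m:
            y, u, v, avg = (float(x) for x in m.groups()[:4])  # float("inf") is valid
            result["psnr"] = {"y": y, "u": u, "v": v, "avg": avg,
                              "frames": int(m.group(5))}
        m = SSIM_RE.search(line)
        if m:
            y, u, v, total = (float(x) for x in m.groups()[:4])
            result["ssim"] = {"y": y, "u": u, "v": v, "all": total,
                              "frames": int(m.group(5))}
    return result


def run_encoder(args):
    """Run the encoder, wait for it to exit, and parse its complete stderr."""
    proc = subprocess.run(args, stdout=subprocess.DEVNULL,
                          stderr=subprocess.PIPE, text=True, errors="replace")
    return proc.returncode, parse_quality_lines(proc.stderr)
```

An empty dict from `parse_quality_lines` means neither summary line was found in the captured text, which is the symptom described in this report.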

@rigaya
Owner

rigaya commented Jan 19, 2020

So far no problem here...

> Did you change the output method somehow or is the SSIM/PSNR output by another process?

The output method is unchanged; ssim/psnr uses the same output method as everything else. The ssim/psnr lines were present both in the --log output of NVEncC and in stderr redirected from the command prompt.

x64\NVEncC64.exe -i sakura_op.mpg -o F:\temp\test.mp4 --ssim 2>> F:\temp\test2.log

Also, I have a .NET plugin which captures and shows the stderr of NVEncC64, and I was able to get the results there as well.

image

@Selur
Author

Selur commented Jan 20, 2020

Does ssim/psnr work for you if you feed NVEncC through a pipe (for example from ffmpeg)?

@rigaya
Owner

rigaya commented Jan 20, 2020

No problem here, at least in the command prompt...

Y:\QSVTest>x64\ffmpeg -loglevel error -y -i "sakura_op.mpg" -an -pix_fmt yuv420p -f yuv4mpegpipe - | x64\NVEncC64.exe --y4m -i - -o F:\temp\test.mp4 --ssim
NVEncC (x64) 4.61 test3 (r1308) by rigaya, Jan 20 2020 00:01:42 (VC 1924/Win/avx2)
OS Version     Windows 10 x64 (18362)
CPU            Intel Core i9-7980XE @ 2.60GHz [TB: 4.10GHz] (18C/36T)
GPU            #0: GeForce RTX 2070 (2304 cores, 1710 MHz)[PCIe3x16][436.2]
NVENC / CUDA   NVENC API 9.1, CUDA 10.1, schedule mode: auto
Input Buffers  CUDA, 20 frames
Input Info     y4m(yv12)->nv12 [AVX2], 1280x720, 30/1 fps
Vpp Filters    copyHtoD
               ssim (yv12)
Output Info    H.264/AVC high @ Level auto
               1280x720p 1:1 30.000fps (30/1fps)
               avwriter: h264 => mp4
Encoder Preset default
Rate Control   CQP  I:20  P:23  B:25
Lookahead      off
GOP length     300 frames
B frames       3 frames [ref mode: disabled]
Ref frames     3 frames, MultiRef L0:auto L1:auto
AQ             off
Others         mv:auto cabac deblock adapt-transform:auto bdirect:auto
ssim/psnr: SSIM YUV: 0.989677 (19.861812), 0.989543 (19.805884), 0.988034 (19.220593), All: 0.989381 (19.739020), (Frames: 3501)

encoded 3501 frames, 350.70 fps, 2783.27 kbps, 38.72 MB
encode time 0:00:09, CPU: 1.5%, GPU: 20.8%, VE: 33.9%, VD: 26.1%, GPUClock: 1455MHz, VEClock: 1345MHz
frame type IDR   12
frame type I     12,  avgQP  20.00,  total size   0.67 MB
frame type P    875,  avgQP  23.00,  total size  16.31 MB
frame type B   2614,  avgQP  25.00,  total size  21.74 MB
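
The thread does not explain the numbers in parentheses on the SSIM line. The printed values are consistent with a decibel form, -10*log10(1 - SSIM), and the "All" value is consistent with a 4:1:1 weighted mean of Y, U and V. Both are inferences from the numbers in these logs, not something confirmed from the NVEnc source; the function names below are hypothetical. A short Python sketch that reproduces the line above:

```python
import math


def ssim_to_db(ssim):
    """Decibel form of an SSIM score; a perfect score of 1.0 maps to infinity."""
    if ssim >= 1.0:
        return math.inf
    return -10.0 * math.log10(1.0 - ssim)


def weighted_all(y, u, v):
    """Assumed 4:1:1 weighting of the Y, U and V planes."""
    return (4.0 * y + u + v) / 6.0


if __name__ == "__main__":
    # Y, U, V values taken from the SSIM line in the log above.
    y, u, v = 0.989677, 0.989543, 0.988034
    print("Y in dB:", round(ssim_to_db(y), 3))
    print("All:", round(weighted_all(y, u, v), 6))
```

Small differences in the last digits are expected, because the logged SSIM values are already rounded to six decimals.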

@Selur
Author

Selur commented Jan 20, 2020

Okay, this is getting stranger and stranger:

using:

I:\Hybrid\64bit>NVEncC64 --avhw  -i "F:\TestClips&Co\files\5000frames.mp4" --fps 25.000 --codec h265 --profile main10 --level auto --tier high --sar 1:1 --lookahead 32 --output-depth 10 --vbrhq 0 --vbr-quality 18.00 --max-bitrate 240000 --aq --aq-strength 3 --gop-len 0 --ref 3 --multiref-l0 3 --multiref-l1 3 --nonrefp --bframes 0 --no-b-adapt --mv-precision Q-pel --preset quality --colorprim undef --transfer undef --colormatrix bt470bg --vpp-resize spline36 --output-res 640x480 --vpp-gauss disabled --vpp-unsharp radius=2,weight=0.5,threshold=10 --cuda-schedule sync --psnr --ssim --output "E:\Output\5000frames_18_15_28_2310_01.265"

I got:

Multiple Refs unsupported.
NVEncC (x64) 4.61 test3 (r1308) by rigaya, Jan 20 2020 00:01:42 (VC 1924/Win/avx2)
OS Version     Windows 10 x64 (18363)
CPU            AMD Ryzen 7 1800X Eight-Core Processor (8C/16T)
GPU            #0: GeForce GTX 1070 Ti (2432 cores, 1683 MHz)[PCIe3x16][441.87]
NVENC / CUDA   NVENC API 9.1, CUDA 10.2, schedule mode: sync
Input Buffers  CUDA, 41 frames
Input Info     avcuvid: H.265/HEVC, 640x480, 25/1 fps
Vpp Filters    cspconv(p010 -> yv12(16bit))
               unsharp: radius 2, weight 0.5, threshold 10.0
               cspconv(yv12(16bit) -> p010)
               ssim psnr (yv12(10bit))
Output Info    H.265/HEVC main10 @ Level auto
               640x480p 1:1 25.000fps (25/1fps)
Encoder Preset quality
Rate Control   VBRHQ
Bitrate        0 kbps (Max: 240000 kbps)
Target Quality 18.00
Initial QP     I:20  P:23  B:25
VBV buf size   auto
Lookahead      on, 32 frames, Adaptive I Insert
GOP length     250 frames
B frames       0 frames [ref mode: disabled]
Ref frames     3 frames
AQ             on
CU max / min   auto / auto
Others         mv:Q-pel nonrefp

encoded 5000 frames, 930.41 fps, 35.19 kbps, 0.84 MB
encode time 0:00:05, CPU: 9.0%, GPU: 48.7%, VE: 32.5%, VD: 34.0%, GPUClock: 1829MHz, VEClock: 1643MHz
frame type IDR   20
frame type I     20,  total size  0.01 MB
frame type P   4980,  total size  0.83 MB

After removing '--cuda-schedule sync' I got:

I:\Hybrid\64bit>NVEncC64 --avhw  -i "F:\TestClips&Co\files\5000frames.mp4" --fps 25.000 --codec h265 --profile main10 --level auto --tier high --sar 1:1 --lookahead 32 --output-depth 10 --vbrhq 0 --vbr-quality 18.00 --max-bitrate 240000 --aq --aq-strength 3 --gop-len 0 --ref 3 --multiref-l0 3 --multiref-l1 3 --nonrefp --bframes 0 --no-b-adapt --mv-precision Q-pel --preset quality --colorprim undef --transfer undef --colormatrix bt470bg --vpp-resize spline36 --output-res 640x480 --vpp-gauss disabled --vpp-unsharp radius=2,weight=0.5,threshold=10 --psnr --ssim --output "E:\Output\5000frames_18_15_28_2310_01.265"
Multiple Refs unsupported.
NVEncC (x64) 4.61 test3 (r1308) by rigaya, Jan 20 2020 00:01:42 (VC 1924/Win/avx2)
OS Version     Windows 10 x64 (18363)
CPU            AMD Ryzen 7 1800X Eight-Core Processor (8C/16T)
GPU            #0: GeForce GTX 1070 Ti (2432 cores, 1683 MHz)[PCIe3x16][441.87]
NVENC / CUDA   NVENC API 9.1, CUDA 10.2, schedule mode: auto
Input Buffers  CUDA, 41 frames
Input Info     avcuvid: H.265/HEVC, 640x480, 25/1 fps
Vpp Filters    cspconv(p010 -> yv12(16bit))
               unsharp: radius 2, weight 0.5, threshold 10.0
               cspconv(yv12(16bit) -> p010)
               ssim psnr (yv12(10bit))
Output Info    H.265/HEVC main10 @ Level auto
               640x480p 1:1 25.000fps (25/1fps)
Encoder Preset quality
Rate Control   VBRHQ
Bitrate        0 kbps (Max: 240000 kbps)
Target Quality 18.00
Initial QP     I:20  P:23  B:25
VBV buf size   auto
Lookahead      on, 32 frames, Adaptive I Insert
GOP length     250 frames
B frames       0 frames [ref mode: disabled]
Ref frames     3 frames
AQ             on
CU max / min   auto / auto
Others         mv:Q-pel nonrefp

encoded 5000 frames, 925.07 fps, 35.19 kbps, 0.84 MB
ssim/psnr: SSIM YUV: 0.999997 (55.504931), 1.000000 (inf), 1.000000 (inf), All: 0.999998 (57.265844), (Frames: 5000)
encode time 0:00:05, CPU: 9.9%, GPU: 47.3%, VE: 32.2%, VD: 34.2%, GPUClock: 1829MHz, VEClock: 1643MHz
ssim/psnr: PSNR YUV: 74.984393, inf, inf, Avg: 76.745305, (Frames: 5000)
frame type IDR   20
frame type I     20,  total size  0.01 MB
frame type P   4980,  total size  0.83 MB

rerunning the initial call:

I:\Hybrid\64bit>NVEncC64 --avhw  -i "F:\TestClips&Co\files\5000frames.mp4" --fps 25.000 --codec h265 --profile main10 --level auto --tier high --sar 1:1 --lookahead 32 --output-depth 10 --vbrhq 0 --vbr-quality 18.00 --max-bitrate 240000 --aq --aq-strength 3 --gop-len 0 --ref 3 --multiref-l0 3 --multiref-l1 3 --nonrefp --bframes 0 --no-b-adapt --mv-precision Q-pel --preset quality --colorprim undef --transfer undef --colormatrix bt470bg --vpp-resize spline36 --output-res 640x480 --vpp-gauss disabled --vpp-unsharp radius=2,weight=0.5,threshold=10 --cuda-schedule sync --psnr --ssim --output "E:\Output\5000frames_18_15_28_2310_01.265"

I got

Multiple Refs unsupported.
NVEncC (x64) 4.61 test3 (r1308) by rigaya, Jan 20 2020 00:01:42 (VC 1924/Win/avx2)
OS Version     Windows 10 x64 (18363)
CPU            AMD Ryzen 7 1800X Eight-Core Processor (8C/16T)
GPU            #0: GeForce GTX 1070 Ti (2432 cores, 1683 MHz)[PCIe3x16][441.87]
NVENC / CUDA   NVENC API 9.1, CUDA 10.2, schedule mode: sync
Input Buffers  CUDA, 41 frames
Input Info     avcuvid: H.265/HEVC, 640x480, 25/1 fps
Vpp Filters    cspconv(p010 -> yv12(16bit))
               unsharp: radius 2, weight 0.5, threshold 10.0
               cspconv(yv12(16bit) -> p010)
               ssim psnr (yv12(10bit))
Output Info    H.265/HEVC main10 @ Level auto
               640x480p 1:1 25.000fps (25/1fps)
Encoder Preset quality
Rate Control   VBRHQ
Bitrate        0 kbps (Max: 240000 kbps)
Target Quality 18.00
Initial QP     I:20  P:23  B:25
VBV buf size   auto
Lookahead      on, 32 frames, Adaptive I Insert
GOP length     250 frames
B frames       0 frames [ref mode: disabled]
Ref frames     3 frames
AQ             on
CU max / min   auto / auto
Others         mv:Q-pel nonrefp

encoded 5000 frames, 971.06 fps, 35.19 kbps, 0.84 MB
encode time 0:00:05, CPU: 9.5%, GPU: 49.3%, VE: 33.3%, VD: 35.3%, GPUClock: 1873MHz, VEClock: 1683MHz
ssim/psnr: SSIM YUV: 0.999997 (55.504931), 1.000000 (inf), 1.000000 (inf), All: 0.999998 (57.265844), (Frames: 5000)
frame type IDR   20
ssim/psnr: PSNR YUV: 74.984393, inf, inf, Avg: 76.745305, (Frames: 5000)
frame type I     20,  total size  0.01 MB
frame type P   4980,  total size  0.83 MB

-> No clue why, but:
a. whether ssim/psnr values are displayed
and
b. the order of the output
seem to be random.
Also, the order of the output is always different from the order you got. :/

@SiV44

SiV44 commented Jan 20, 2020

Hello,
I'm new here, and first of all I want to thank @rigaya for the great job he does with NVEnc.

Back to the topic.
@Selur, when using --avhw, try again with the lookahead option turned off; maybe it will help you.
For me, with the option enabled, it ends with:
"Video encoding using NVEnc 4.61 (r1307) failed with exit code: -1073740940 (0xC0000374)".
With the option disabled, the encoding proceeds OK but does not display the PSNR / SSIM results. The only combination that works for me is --avsw without lookahead; then everything works fine.
I use GTX 1050 Ti and StaxRip.

@Selur
Author

Selur commented Jan 21, 2020

Removing the lookahead and calling it three times:

I:\Hybrid\64bit>NVEncC64 --avhw  -i "F:\TestClips&Co\files\5000frames.mp4" --fps 25.000 --codec h265 --profile main10 --level auto --tier high --sar 1:1 --output-depth 10 --vbrhq 0 --vbr-quality 18.00 --max-bitrate 240000 --aq --aq-strength 3 --gop-len 0 --ref 3 --multiref-l0 3 --multiref-l1 3 --nonrefp --bframes 0 --no-b-adapt --mv-precision Q-pel --preset quality --colorprim undef --transfer undef --colormatrix bt470bg --vpp-resize spline36 --output-res 640x480 --vpp-gauss disabled --vpp-unsharp radius=2,weight=0.5,threshold=10 --cuda-schedule sync --psnr --ssim --output "E:\Output\5000frames_18_15_28_2310_01.265"
Multiple Refs unsupported.
NVEncC (x64) 4.61 test3 (r1308) by rigaya, Jan 20 2020 00:01:42 (VC 1924/Win/avx2)
OS Version     Windows 10 x64 (18363)
CPU            AMD Ryzen 7 1800X Eight-Core Processor (8C/16T)
GPU            #0: GeForce GTX 1070 Ti (2432 cores, 1683 MHz)[PCIe3x16][441.87]
NVENC / CUDA   NVENC API 9.1, CUDA 10.2, schedule mode: sync
Input Buffers  CUDA, 17 frames
Input Info     avcuvid: H.265/HEVC, 640x480, 25/1 fps
Vpp Filters    cspconv(p010 -> yv12(16bit))
               unsharp: radius 2, weight 0.5, threshold 10.0
               cspconv(yv12(16bit) -> p010)
               ssim psnr (yv12(10bit))
Output Info    H.265/HEVC main10 @ Level auto
               640x480p 1:1 25.000fps (25/1fps)
Encoder Preset quality
Rate Control   VBRHQ
Bitrate        0 kbps (Max: 240000 kbps)
Target Quality 18.00
Initial QP     I:20  P:23  B:25
VBV buf size   auto
Lookahead      off
GOP length     250 frames
B frames       0 frames [ref mode: disabled]
Ref frames     3 frames
AQ             on
CU max / min   auto / auto
Others         mv:Q-pel nonrefp

encoded 5000 frames, 950.75 fps, 35.19 kbps, 0.84 MB
encode time 0:00:05, CPU: 10.0%, GPU: 47.7%, VE: 30.2%, VD: 34.2%, GPUClock: 1839MHz, VEClock: 1653MHz
frame type IDR   20
frame type I     20,  total size  0.01 MB
frame type P   4980,  total size  0.83 MB

I:\Hybrid\64bit>NVEncC64 --avhw  -i "F:\TestClips&Co\files\5000frames.mp4" --fps 25.000 --codec h265 --profile main10 --level auto --tier high --sar 1:1 --output-depth 10 --vbrhq 0 --vbr-quality 18.00 --max-bitrate 240000 --aq --aq-strength 3 --gop-len 0 --ref 3 --multiref-l0 3 --multiref-l1 3 --nonrefp --bframes 0 --no-b-adapt --mv-precision Q-pel --preset quality --colorprim undef --transfer undef --colormatrix bt470bg --vpp-resize spline36 --output-res 640x480 --vpp-gauss disabled --vpp-unsharp radius=2,weight=0.5,threshold=10 --cuda-schedule sync --psnr --ssim --output "E:\Output\5000frames_18_15_28_2310_01.265"
Multiple Refs unsupported.
NVEncC (x64) 4.61 test3 (r1308) by rigaya, Jan 20 2020 00:01:42 (VC 1924/Win/avx2)
OS Version     Windows 10 x64 (18363)
CPU            AMD Ryzen 7 1800X Eight-Core Processor (8C/16T)
GPU            #0: GeForce GTX 1070 Ti (2432 cores, 1683 MHz)[PCIe3x16][441.87]
NVENC / CUDA   NVENC API 9.1, CUDA 10.2, schedule mode: sync
Input Buffers  CUDA, 17 frames
Input Info     avcuvid: H.265/HEVC, 640x480, 25/1 fps
Vpp Filters    cspconv(p010 -> yv12(16bit))
               unsharp: radius 2, weight 0.5, threshold 10.0
               cspconv(yv12(16bit) -> p010)
               ssim psnr (yv12(10bit))
Output Info    H.265/HEVC main10 @ Level auto
               640x480p 1:1 25.000fps (25/1fps)
Encoder Preset quality
Rate Control   VBRHQ
Bitrate        0 kbps (Max: 240000 kbps)
Target Quality 18.00
Initial QP     I:20  P:23  B:25
VBV buf size   auto
Lookahead      off
GOP length     250 frames
B frames       0 frames [ref mode: disabled]
Ref frames     3 frames
AQ             on
CU max / min   auto / auto
Others         mv:Q-pel nonrefp

encoded 5000 frames, 985.61 fps, 35.19 kbps, 0.84 MB
encode time 0:00:05, CPU: 9.9%, GPU: 48.0%, VE: 31.5%, VD: 35.7%, GPUClock: 1839MHz, VEClock: 1653MHz
frame type IDR   20
frame type I     20,  total size  0.01 MB
frame type P   4980,  total size  0.83 MB

I:\Hybrid\64bit>NVEncC64 --avhw  -i "F:\TestClips&Co\files\5000frames.mp4" --fps 25.000 --codec h265 --profile main10 --level auto --tier high --sar 1:1 --output-depth 10 --vbrhq 0 --vbr-quality 18.00 --max-bitrate 240000 --aq --aq-strength 3 --gop-len 0 --ref 3 --multiref-l0 3 --multiref-l1 3 --nonrefp --bframes 0 --no-b-adapt --mv-precision Q-pel --preset quality --colorprim undef --transfer undef --colormatrix bt470bg --vpp-resize spline36 --output-res 640x480 --vpp-gauss disabled --vpp-unsharp radius=2,weight=0.5,threshold=10 --cuda-schedule sync --psnr --ssim --output "E:\Output\5000frames_18_15_28_2310_01.265"
Multiple Refs unsupported.
NVEncC (x64) 4.61 test3 (r1308) by rigaya, Jan 20 2020 00:01:42 (VC 1924/Win/avx2)
OS Version     Windows 10 x64 (18363)
CPU            AMD Ryzen 7 1800X Eight-Core Processor (8C/16T)
GPU            #0: GeForce GTX 1070 Ti (2432 cores, 1683 MHz)[PCIe3x16][441.87]
NVENC / CUDA   NVENC API 9.1, CUDA 10.2, schedule mode: sync
Input Buffers  CUDA, 17 frames
Input Info     avcuvid: H.265/HEVC, 640x480, 25/1 fps
Vpp Filters    cspconv(p010 -> yv12(16bit))
               unsharp: radius 2, weight 0.5, threshold 10.0
               cspconv(yv12(16bit) -> p010)
               ssim psnr (yv12(10bit))
Output Info    H.265/HEVC main10 @ Level auto
               640x480p 1:1 25.000fps (25/1fps)
Encoder Preset quality
Rate Control   VBRHQ
Bitrate        0 kbps (Max: 240000 kbps)
Target Quality 18.00
Initial QP     I:20  P:23  B:25
VBV buf size   auto
Lookahead      off
GOP length     250 frames
B frames       0 frames [ref mode: disabled]
Ref frames     3 frames
AQ             on
CU max / min   auto / auto
Others         mv:Q-pel nonrefp

encoded 5000 frames, 937.56 fps, 35.19 kbps, 0.84 MB
encode time 0:00:05, CPU: 9.9%, GPU: 51.8%, VE: 29.5%, VD: 33.5%, GPUClock: 1839MHz, VEClock: 1653MHz
frame type IDR   20
frame type I     20,  total size  0.01 MB
frame type P   4980,  total size  0.83 MB

I:\Hybrid\64bit>

The ssim/psnr values are not shown even once. :/

@Selur
Author

Selur commented Jan 22, 2020

using:

NVEncC64  --log-level debug -i "F:\TestClips&Co\files\5000frames.mp4" --psnr --output "E:\Output\5000frames.mp4"

I get:

....
encoded 5000 frames, 1004.22 fps, 20.49 kbps, 0.49 MB
encode time 0:00:04, CPU: 7.0%, GPU: 26.0%, VE: 11.7%, VD: 37.0%, GPUClock: 1792MHz, VEClock: 1611MHz
frame type IDR   20
frame type I     20,  avgQP  20.00,  total size  0.01 MB
frame type P   1260,  avgQP  23.00,  total size  0.15 MB
frame type B   3720,  avgQP  25.00,  total size  0.33 MB
ssim/psnr: Waiting for ssim/psnr calculation thread to finish.
ssim/psnr: Freed CUDA resources.
ssim/psnr: closed ssim/psnr filter.
avcuvid: Closing...
avcuvid: Closed Stream Packet Buffer.
avcuvid: Closed caption handler.
avcuvid: Closed format.
avcuvid: Closed video.
avcuvid: Cleared frame pos list.
avcuvid: Closed.
avcuvid: Closing...
avcuvid: Close...
avout: Closing...
avout: closed queues...
avout: Closed format.
avout: Closed video.
avout: Closed.
nvEncDestroyEncoder: success.
cuvid: Closing decoder...
cuvid: cuvidDestroyDecoder: Fin.
cuvid: cuvidDestroyVideoParser: Fin.
cuvid: Closed decoder.
cuvidCtxLockDestroy...
cuvidCtxLockDestroy: Fin.
Closing EncodeStatus...
Closed EncodeStatus.
cuCtxDestroy...
cuCtxDestroy: Fin.
Closing perf monitor...
perf monitor: Closing thread...
perf monitor: Closed thread.
perf monitor: Closing perf counter...
perf monitor: Closed perf counter.
Closing logger...

So it seems the psnr/ssim calculation finishes, but no result is shown.

@rigaya
Owner

rigaya commented Jan 26, 2020

Thanks for the log. I might have missed the synchronization of the ssim calculation thread.

I made a new build with improved synchronization between the main thread and the ssim calculation thread; would you please give it a try? With this build the order of the output should always be the same, as shown below.
NVEncC64_4.61_test4.zip

encoded 6462 frames, 319.73 fps, 18463.89 kbps, 474.58 MB
encode time 0:00:20, CPU: 1.1%, GPU: 10.7%, VE: 60.1%, VD: 95.2%, GPUClock: 1882MHz, VEClock: 1748MHz
frame type IDR   22
frame type I     22,  avgQP  20.00,  total size    4.87 MB
frame type P   1616,  avgQP  23.00,  total size  176.13 MB
frame type B   4824,  avgQP  25.00,  total size  293.58 MB
ssim/psnr: Waiting for ssim/psnr calculation thread to finish.
ssim/psnr: SSIM YUV: 0.984365 (18.058884), 0.992402 (21.192886), 0.992699 (21.366440), All: 0.987093 (18.891817), (Frames: 6462)
ssim/psnr: closed ssim/psnr filter.
avcuvid: Closing...
avcuvid: Closed Stream Packet Buffer.
avcuvid: Closed caption handler.
avcuvid: Closed format.
avcuvid: Closed video.
avcuvid: Cleared frame pos list.
avcuvid: Closed.
avcuvid: Closing...
avcuvid: Close...
avout: Closing...
avout: closed queues...
avout: Closed format.
avout: Closed video.
avout: Closed.
nvEncDestroyEncoder: success.
cuvid: Closing decoder...
cuvid: cuvidDestroyDecoder: Fin.
cuvid: cuvidDestroyVideoParser: Fin.
cuvid: Closed decoder.
cuvidCtxLockDestroy...
cuvidCtxLockDestroy: Fin.
Closing EncodeStatus...
Closed EncodeStatus.
cuCtxDestroy...
cuCtxDestroy: Fin.
Closing perf monitor...
perf monitor: Closing thread...
perf monitor: Closed thread.
perf monitor: Closing perf counter...
perf monitor: Closed perf counter.
Closing logger...
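
The ordering shown in this log can be illustrated with a small Python sketch: the main thread joins the calculation thread before it prints the result, so the result line can be neither lost nor printed out of order. This is only an illustration of the idea under assumed names (`encode_with_quality_report`, `score_frames` and the "score" value are hypothetical); NVEncC itself is written in C++ and its actual implementation is not shown in this thread.

```python
import threading


def encode_with_quality_report(frames, log):
    """Report in a fixed order: encode summary first, then the worker's result."""
    result = {}

    def score_frames():
        # Stand-in for the ssim/psnr calculation thread.
        result["frames"] = len(frames)
        result["score"] = sum(frames)

    worker = threading.Thread(target=score_frames)
    worker.start()

    log("encoded %d frames" % len(frames))  # main-thread summary
    log("ssim/psnr: Waiting for ssim/psnr calculation thread to finish.")
    # Without this join, the result could still be missing when it is printed.
    worker.join()
    log("ssim/psnr: score %d, (Frames: %d)" % (result["score"], result["frames"]))
    log("ssim/psnr: closed ssim/psnr filter.")


if __name__ == "__main__":
    encode_with_quality_report([1, 2, 3], print)
```

Because the only thread that writes log lines is the main thread, and it writes the result after `join()`, repeated runs produce the same line order.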

@Selur
Author

Selur commented Jan 26, 2020

Hmm, something is still not okay.
Calling:

I:\Hybrid\64bit>NVEncC64 -i "F:\TestClips&Co\files\5000frames.mp4" --psnr --output "E:\Output\5000frames.mp4"

the call gets stuck:

NVEncC (x64) 4.61 test4 (r1322) by rigaya, Jan 26 2020 23:17:31 (VC 1924/Win/avx2)
OS Version     Windows 10 x64 (18363)
CPU            AMD Ryzen 7 1800X Eight-Core Processor (8C/16T)
GPU            #0: GeForce GTX 1070 Ti (2432 cores, 1683 MHz)[PCIe3x16][441.87]
NVENC / CUDA   NVENC API 9.1, CUDA 10.2, schedule mode: auto
Input Buffers  CUDA, 20 frames
Input Info     avcuvid: H.265/HEVC, 640x480, 25/1 fps
Vpp Filters    cspconv(p010 -> nv12)
               psnr (yv12)
Output Info    H.264/AVC high @ Level auto
               640x480p 1:1 25.000fps (25/1fps)
               avwriter: h264 => mp4
Encoder Preset default
Rate Control   CQP  I:20  P:23  B:25
Lookahead      off
GOP length     250 frames
B frames       3 frames [ref mode: disabled]
Ref frames     3 frames
AQ             off
Others         mv:auto cabac deblock adapt-transform:auto bdirect:auto
[79.0%] 4060 frames: 1824.72 fps, 21 kb/s, remain 0:00:01, GPU 15%, VE 28%, VD 85%, est out size 0.5MB

waited a minute, then I aborted.
With '--log-level debug' I see it ends with:

[hevc @ 00000201cdff8300] nal_unit_type: 0(TRAIL_N), nuh_layer_id: 0, temporal_id: 0
avcuvid: 5000 frames, End of file
Flushing Decoderames: 1474.61 fps, 21 kb/s, remain 0:00:00, GPU 21%, VE 23%, VD 70%, est out size 0.5MB
Flushed Decoder

and then nothing happens.
In the Windows Task Manager I can see that NVEncC still uses some CPU (0.1-0.2 %), but nothing happens in the command-line window.
(I waited 5 minutes and nothing changed: RAM usage stayed the same and CPU usage was 0-0.1 % before I aborted.)
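
A caller that launches the encoder from a script or GUI can guard against this kind of hang with a wall-clock timeout. The following is a minimal Python sketch; `run_with_timeout` is a hypothetical helper, and the timeout covers the whole run, so it has to be generous enough for a legitimately long encode.

```python
import subprocess
import sys


def run_with_timeout(args, timeout_s):
    """Return (exit_code, stderr_text); exit_code is None if the process was killed."""
    proc = subprocess.Popen(args, stdout=subprocess.DEVNULL,
                            stderr=subprocess.PIPE, text=True)
    try:
        _, err = proc.communicate(timeout=timeout_s)
        return proc.returncode, err
    except subprocess.TimeoutExpired:
        proc.kill()
        # Collect whatever the process wrote to stderr before it stalled.
        _, err = proc.communicate()
        return None, err


if __name__ == "__main__":
    # Demo with a short-lived child process instead of a real encoder.
    code, err = run_with_timeout(
        [sys.executable, "-c", "import sys; sys.stderr.write('done')"], 10)
    print(code, err)
```

Returning the partial stderr on timeout keeps the last log lines available, which is what made the stall after "Flushed Decoder" visible here.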

@rigaya
Owner

rigaya commented Jan 28, 2020

Thanks for testing. Please try the new test build; it should fix the lock you ran into at the end of encoding.
NVEncC64_4.61_test5.zip

@Selur
Author

Selur commented Jan 28, 2020

That's better:

I:\Hybrid\64bit>NVEncC64 -i "F:\TestClips&Co\files\5000frames.mp4" --psnr --output "E:\Output\5000frames.mp4"
--------------------------------------------------------------------------------
E:\Output\5000frames.mp4
--------------------------------------------------------------------------------
NVEncC (x64) 4.61 test5 (r1330) by rigaya, Jan 28 2020 22:16:29 (VC 1924/Win/avx2)
OS Version     Windows 10 x64 (18363)
CPU            AMD Ryzen 7 1800X Eight-Core Processor (8C/16T)
GPU            #0: GeForce GTX 1070 Ti (2432 cores, 1683 MHz)[PCIe3x16][441.87]
NVENC / CUDA   NVENC API 9.1, CUDA 10.2, schedule mode: auto
Input Buffers  CUDA, 20 frames
Input Info     avcuvid: H.265/HEVC, 640x480, 25/1 fps
Vpp Filters    cspconv(p010 -> nv12)
               psnr (yv12)
Output Info    H.264/AVC high @ Level auto
               640x480p 1:1 25.000fps (25/1fps)
               avwriter: h264 => mp4
Encoder Preset default
Rate Control   CQP  I:20  P:23  B:25
Lookahead      off
GOP length     250 frames
B frames       3 frames [ref mode: disabled]
Ref frames     3 frames
AQ             off
Others         mv:auto cabac deblock adapt-transform:auto bdirect:auto

encoded 5000 frames, 1833.52 fps, 20.49 kbps, 0.49 MB
encode time 0:00:02, CPU: 6.5%, GPU: 12.0%, VE: 14.5%, VD: 48.0%, GPUClock: 1740MHz, VEClock: 1563MHz
frame type IDR   20
frame type I     20,  avgQP  20.00,  total size  0.01 MB
frame type P   1260,  avgQP  23.00,  total size  0.15 MB
frame type B   3720,  avgQP  25.00,  total size  0.33 MB
ssim/psnr: PSNR YUV: 66.319038, inf, inf, Avg: 68.079951, (Frames: 5000)

The strange thing is the two 'inf' values.

@rigaya
Owner

rigaya commented Jan 28, 2020

> The strange thing is the two 'inf' values.

"inf" means there is no difference between the original and the encoded video in that plane. I think it's because the input file 5000frames.mp4 has almost no chroma content.

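Why a difference of zero prints as "inf" follows from the usual PSNR definition, PSNR = 10*log10(MAX^2 / MSE): with MSE = 0 the ratio is unbounded. The Python sketch below also shows that the "Avg" values in these logs are consistent with a 4:1:1 weighted mean of the per-plane MSE. That weighting is inferred from the printed numbers, not confirmed from the NVEnc source, and the function names are hypothetical.

```python
import math


def psnr_from_mse(mse, max_value=255.0):
    """Standard PSNR in dB; identical planes (MSE of zero) give infinity."""
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value * max_value / mse)


def mse_from_psnr(psnr, max_value=255.0):
    """Inverse of psnr_from_mse; an infinite PSNR maps back to an MSE of zero."""
    if math.isinf(psnr):
        return 0.0
    return max_value * max_value / (10.0 ** (psnr / 10.0))


def weighted_avg_psnr(psnr_y, psnr_u, psnr_v):
    """Assumed 4:1:1 weighting of the per-plane MSE (max_value cancels out)."""
    mse = (4.0 * mse_from_psnr(psnr_y)
           + mse_from_psnr(psnr_u)
           + mse_from_psnr(psnr_v)) / 6.0
    return psnr_from_mse(mse)


if __name__ == "__main__":
    # Values from the "PSNR YUV: 66.319038, inf, inf" line in this thread.
    print(round(weighted_avg_psnr(66.319038, math.inf, math.inf), 4))
```

With both chroma planes at zero error, the weighted MSE is two thirds of the luma MSE, which raises the average by 10*log10(1.5), about 1.76 dB, above the Y value. That matches the gap between 66.319038 and 68.079951 in the log.
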
@Selur
Author

Selur commented Jan 29, 2020

You are probably right; I tested with a short sample:

NVEncC64.exe -i "F:\TestClips&Co\files\10bitTest.mkv" --psnr --output "E:\Output\test.mp4"
--------------------------------------------------------------------------------
E:\Output\test.mp4
--------------------------------------------------------------------------------
NVEncC (x64) 4.61 test5 (r1330) by rigaya, Jan 28 2020 22:16:29 (VC 1924/Win/avx2)
OS Version     Windows 10 x64 (18363)
CPU            AMD Ryzen 7 1800X Eight-Core Processor (8C/16T)
GPU            #0: GeForce GTX 1070 Ti (2432 cores, 1683 MHz)[PCIe3x16][441.87]
NVENC / CUDA   NVENC API 9.1, CUDA 10.2, schedule mode: auto
Input Buffers  CUDA, 20 frames
Input Info     avsw: h264(yv12(10bit))->nv12 [AVX2], 640x352, 25/1 fps
Vpp Filters    copyHtoD
               psnr (yv12)
Output Info    H.264/AVC high @ Level auto
               640x352p 1:1 25.000fps (25/1fps)
               avwriter: h264 => mp4
Encoder Preset default
Rate Control   CQP  I:20  P:23  B:25
Lookahead      off
GOP length     250 frames
B frames       3 frames [ref mode: disabled]
Ref frames     3 frames
AQ             off
Others         mv:auto cabac deblock adapt-transform:auto bdirect:auto

encoded 429 frames, 1284.43 fps, 326.25 kbps, 0.67 MB
encode time 0:00:00, CPULoad: 14.0%
frame type IDR   2
frame type I     2,  avgQP  20.00,  total size  0.02 MB
frame type P   108,  avgQP  23.00,  total size  0.30 MB
frame type B   319,  avgQP  25.00,  total size  0.35 MB
ssim/psnr: PSNR YUV: 48.464830, 52.419647, 53.092397, Avg: 49.482321, (Frames: 429)

@rigaya
Owner

rigaya commented Jan 29, 2020

Thanks for checking, I'll add this fix in the next release.

@Selur
Author

Selur commented Jan 29, 2020

Happy to help, especially since that fixed not only the command line but also the problem of me not capturing the output in Qt. :)

@Selur Selur closed this as completed Jan 29, 2020
rigaya added a commit that referenced this issue Feb 1, 2020
Because passing frames across contexts may be causing instability.
rigaya added a commit that referenced this issue Feb 1, 2020
Optimized the mutual-exclusion locking and adjusted the timing of the result display.