Sometimes inconsistent values in hist conversion from TProfile read via uproot #1175

Open
@raymondEhlers

Description

When I read TProfile objects from a ROOT file with uproot and convert them to hist, the values of the converted histogram sometimes differ from the values uproot reports for the same profile. Whether they agree depends on the input file. Unfortunately, I haven't figured out how to reproduce this in a simple way, so I've attached two files that were produced in the same way and exhibit this behavior: failing.root.txt (11 MB) and passing.root.txt (< 1 MB). Both appear to be valid output files apart from this issue. The code below illustrates the behavior:

def test_uproot_profile_consistency_with_hist() -> None:
    from pathlib import Path

    import hist
    import numpy as np
    import pytest
    import uproot

    input_file = Path("failing.root")

    # Open and extract with uproot
    with uproot.open(input_file) as f:
        # Cast to a base class. Actual class is here: https://github.com/alisw/AliPhysics/blob/master/PWG/EMCAL/EMCALbase/AliEmcalList.h
        output_list = f["AliAnalysisTaskTrackSkim_pythia"].bases[0]
        x_sec_uproot = next(h for h in output_list if h.name == "fHistXsection")
        x_sec_uproot_hist = x_sec_uproot.to_hist()

    # Cross check with ROOT
    ROOT = pytest.importorskip("ROOT")
    with ROOT.TFile.Open(str(input_file), "READ") as f_ROOT:
        output_list = f_ROOT.Get("AliAnalysisTaskTrackSkim_pythia")
        # Cast to a base class. Actual class is here: https://github.com/alisw/AliPhysics/blob/master/PWG/EMCAL/EMCALbase/AliEmcalList.h
        output_list = ROOT.bind_object(ROOT.addressof(output_list), "TList")

        x_sec_temp = output_list.FindObject("fHistXsection")
        x_sec_temp.SetDirectory(0)
    x_sec_ROOT_values = np.array([x_sec_temp.GetBinContent(i) for i in range(1, x_sec_temp.GetNbinsX() + 1)], dtype=np.float64)

    print(f"Uproot: {uproot.__version__}, hist: {hist.__version__}")

    # The standard uproot values before conversion appear correct
    np.testing.assert_allclose(x_sec_uproot.values(), x_sec_ROOT_values)
    # Fails here for `failing.root`, but fine for `passing.root`
    np.testing.assert_allclose(x_sec_uproot.values(), x_sec_uproot_hist.values())

It strikes me as odd that the result depends on the particular file when both were generated the same way (with different inputs). I wonder if it's due to some issue with accessing the base class, i.e. the cast that I do. The cast is required because the profile is stored in a TList-derived class that's part of our experimental software stack. I don't care about any of the additional info stored in the derived class; its definition is linked in the code comments above if helpful. I'm concerned the cast is perhaps corrupting an expected memory layout or otherwise causing an issue. Since this class is unfortunately imposed by an experiment software stack constraint, it's difficult for me to avoid. If there's a better way for me to work around this, I'd be happy to use that instead.
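To narrow down where the two arrays diverge, a small helper along the following lines could list the individual bins that disagree, rather than only failing the whole assertion. This is a minimal sketch: `mismatched_bins` is a hypothetical name, not part of uproot or hist, and the tolerance check uses `math.isclose`, which is similar in spirit to but not identical with `np.testing.assert_allclose`.

```python
import math
from typing import List, Sequence, Tuple


def mismatched_bins(
    expected: Sequence[float],
    actual: Sequence[float],
    rtol: float = 1e-7,
) -> List[Tuple[int, float, float]]:
    """Return (bin index, expected, actual) for every bin that disagrees.

    Hypothetical diagnostic helper; works on lists or 1D NumPy arrays.
    """
    if len(expected) != len(actual):
        raise ValueError(f"length mismatch: {len(expected)} vs {len(actual)}")

    mismatches = []
    for i, (e, a) in enumerate(zip(expected, actual)):
        e, a = float(e), float(a)
        # Treat a pair of NaNs as agreeing so they are not reported as noise.
        if math.isnan(e) and math.isnan(a):
            continue
        # Purely relative comparison; a NaN on one side only is a mismatch.
        if not math.isclose(e, a, rel_tol=rtol, abs_tol=0.0):
            mismatches.append((i, e, a))
    return mismatches
```

In the test above it could be called as `mismatched_bins(x_sec_uproot.values(), x_sec_uproot_hist.values())`, which would show whether all bins are affected or only some of them.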

Bottom line: I suspect there might be an uproot issue here, but I'm concerned that it's instead due to an experimental software stack quirk. Help here is greatly appreciated - thanks!

>>> import uproot, hist
>>> print(f"Uproot: {uproot.__version__}, hist: {hist.__version__}")
Uproot: 5.3.1, hist: 2.7.2
