Patch release 2.9.1 #21954

Draft
stelfrag wants to merge 23 commits into netdata:v2.9 from stelfrag:patch_release

Conversation

Collaborator

@stelfrag stelfrag commented Mar 16, 2026

Summary
  1. Create journal directory if missing before watching (#22019)
  2. Fix uninitialized vnode stale timeout field in pluginsd parser (#21983)
  3. Correctly prefetch IBM MQ libraries in DEB package build CI jobs. (#21959)
  4. Fix windows config editor (#21957)
  5. Ensure thread safety and proper cleanup in GetHardwareInfo (#21958)
  6. Fix permissions for systemd-journal.plugin on offline installs (#21953)
  7. Fix health api call (#21952)
  8. Remove .inf extension from file filter in build workflow (#21941)
  9. Fix initialization handling in GetHardwareInfo function (#21885)
  10. Refactor UID/GID cache updates in apps plugin aggregation logic (#21864)
  11. Improve print parsed (#21790)
  12. Add revalidation for clean pages under lock to ensure queue integrity (#21793)
  13. Fix crash when processing a corrupted journalfile (#21794)
  14. Fix url check (#21805)
  15. Fix potential use after free in RAM mode (#21809)
  16. Fix data race in ML training during host stop (#21844)
  17. Improve installer (Windows.plugin) (#21911)
  18. Update Windows.plugin (#21797)
  19. Add Fedora 44 to CI and package builds. (#21943)
  20. Add Ubuntu 26.04 to CI and package builds. (#21939)
  21. Handle fetching IBM MQ libraries in CI package build jobs outside of CMake (#21862)
  22. Fix handling of control files for DEB packages. (#21940)
  23. Fix coverity issue (#21777)

Summary by cubic

Patch release 2.9.1 updates packaging and CI (adds Fedora 44 and Ubuntu 26.04), pre-fetches IBM MQ libs in CI to improve builds, and fixes DEB control files so only the netdata package includes its scripts. It also updates the Windows plugin/installer (driver INF packaging, better service handling, GetHardwareInfo enabled), fixes the Windows config editor check, and includes stability fixes across ML, RAM mode, journal replay, page cache, URL handling, and a Coverity issue.

  • Bug Fixes
    • Health API: proper acquire/release for instances to prevent leaks and crashes.
    • Windows GetHardwareInfo: add MSR device locking, safer thread shutdown, self-heal driver install, and clearer signature error logging.
    • Offline installs: correct permissions for systemd-journal.plugin (try setcap before SUID).
    • Windows installer: fix editor profile check in packaging script.
    • pluginsd parser: initialize node_stale_after_seconds to avoid uninitialized reads.
    • OTEL signal viewer: create journal directory before watching to avoid missed logs on startup.

Written for commit f39d9a9. Summary will update on new commits.

stelfrag and others added 5 commits March 16, 2026 15:14
Properly init found variable

(cherry picked from commit a0f79da)
Due to a typo in the CPack configuration, the main `netdata` package
control files were being included in any package that did not explicitly
define its own control files, causing uninstallation of those packages to
mask the netdata service.

Fix this so that we set the control files for the main package itself
instead of specifying them as the default for any package that doesn’t
define its own control files.

(cherry picked from commit 1bec654)
…CMake (netdata#21862)

* Support FETCHCONTENT_SOURCE_DIR for IBM MQ libs.

This allows us to handle fetching them as a separate step in CI, letting
us better handle the errors resulting from IBM’s questionable
infrastructure.

* Pre-fetch IBM MQ libraries in package build CI jobs.

This lets us better handle infrastructure issues when trying to fetch
the libraries.

* Fix duplicate CONFIGURE_COMMAND.

* Address review comments from copilot.

* Fix additional bugs.

* Don’t use loop for file handling.

(cherry picked from commit 15ffa4f)
Expected release date is 2026-04-23.

(cherry picked from commit 5bc4512)
Expected release date is 2026-04-14.

(cherry picked from commit cd8d489)
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 1 file

Confidence score: 3/5

  • There is a concrete correctness risk in src/database/sqlite/sqlite_aclk_alert.c: initializing found to true can cause SQLite bind failures to be interpreted as a successful transition check.
  • Given the reported severity (7/10) and high confidence (10/10), this is more than a minor code-quality issue and could mask real database errors with user-visible alert-state behavior.
  • Pay close attention to src/database/sqlite/sqlite_aclk_alert.c - transition-check logic may return success on bind failure paths.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/database/sqlite/sqlite_aclk_alert.c">

<violation number="1" location="src/database/sqlite/sqlite_aclk_alert.c:780">
P1: Initializing `found` to `true` makes bind failures look like a successful transition check.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
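
The risk flagged in this review can be illustrated with a minimal C sketch. It does not reproduce the code in `sqlite_aclk_alert.c`: `fake_stmt`, `transition_exists_optimistic`, and `transition_exists_pessimistic` are hypothetical names, and the bind and step calls are replaced by plain struct fields so the example builds without SQLite. It only shows how the initial value of a result flag decides what an early-exit error path reports.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a prepare/bind/step sequence.
 * This is not the SQLite API and not Netdata's code. */
typedef enum { STEP_ROW, STEP_DONE } step_result;

typedef struct {
    bool bind_ok;      /* whether parameter binding succeeds */
    step_result step;  /* what stepping the statement would return */
} fake_stmt;

/* Flag starts as true: an early exit on bind failure reports "found". */
bool transition_exists_optimistic(const fake_stmt *st) {
    bool found = true;
    if (!st->bind_ok)
        goto done;     /* error path leaves found == true */
    found = (st->step == STEP_ROW);
done:
    return found;
}

/* Flag starts as false: only a returned row can set it to true. */
bool transition_exists_pessimistic(const fake_stmt *st) {
    bool found = false;
    if (!st->bind_ok)
        goto done;     /* error path leaves found == false */
    found = (st->step == STEP_ROW);
done:
    return found;
}
```

Which default is appropriate depends on whether callers should treat a database error as "transition present" or "transition absent"; the sketch only makes that choice visible.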

thiagoftsm and others added 10 commits March 16, 2026 16:04
Prevent concurrent ML activity during host reset and improve thread safety in k-means dimension handling

(cherry picked from commit cee7787)
* cache uuid in mem_metric_handle to avoid dereferencing potentially freed RRDDIM during metric release (ram mode)

* release rrddim metrics after querying oldest and latest times to prevent potential resource leaks (ram mode)

(cherry picked from commit c21e5ce)
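
A minimal sketch of the idea behind caching the UUID in the handle, under stated assumptions: `dimension` stands in for `RRDDIM`, `metric_handle` for `mem_metric_handle`, and `metric_get`/`metric_release` are hypothetical helpers, not Netdata functions. The point is that the release path reads only data owned by the handle, so it stays valid even if the dimension was freed first.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for RRDDIM; only the field needed for the example. */
typedef struct { unsigned char uuid[16]; } dimension;

/* Stand-in for the metric handle. */
typedef struct {
    dimension *rd;           /* may be freed before the handle is released */
    unsigned char uuid[16];  /* copy taken while rd was still valid */
} metric_handle;

/* Acquire: copy the uuid into the handle up front. */
metric_handle *metric_get(dimension *rd) {
    metric_handle *h = calloc(1, sizeof(*h));
    if (!h)
        return NULL;
    h->rd = rd;
    memcpy(h->uuid, rd->uuid, sizeof(h->uuid));
    return h;
}

/* Release: uses the cached uuid and never dereferences h->rd.
 * The uuid is copied to out so the caller can see what was released. */
void metric_release(metric_handle *h, unsigned char out[16]) {
    memcpy(out, h->uuid, sizeof(h->uuid));
    free(h);
}
```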
Fix check and some formatting

(cherry picked from commit d3831c4)
Fix incorrect calculation of `max_size` in journal transaction replay logic

(cherry picked from commit ee4fd63)
…netdata#21793)

* Prevent potential race conditions by revalidating page status after acquiring the clean lock.
* Handle inconsistent states by logging and updating flags appropriately.

(cherry picked from commit 2733e6f)
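
The revalidation described in this commit follows a common check, lock, re-check pattern. The sketch below is an illustration only: `page`, `clean_queue`, and `remove_if_still_clean` are hypothetical names, and a POSIX mutex stands in for whatever lock the page cache actually uses.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical page and queue types for illustration. */
typedef struct {
    atomic_bool is_clean;    /* may be changed by other threads */
    bool in_clean_queue;     /* protected by the queue lock */
} page;

typedef struct {
    pthread_mutex_t lock;
    int entries;             /* protected by lock */
} clean_queue;

/* Returns true only if the page was removed from the clean queue. */
bool remove_if_still_clean(clean_queue *q, page *pg) {
    if (!atomic_load(&pg->is_clean))
        return false;        /* cheap pre-check without the lock */

    bool removed = false;
    pthread_mutex_lock(&q->lock);
    /* Revalidate: the state may have changed while waiting for the lock. */
    if (atomic_load(&pg->is_clean) && pg->in_clean_queue) {
        pg->in_clean_queue = false;
        q->entries--;
        removed = true;
    }
    pthread_mutex_unlock(&q->lock);
    return removed;
}
```

Calling the function twice on the same page removes it once; the second call passes the unlocked pre-check but is rejected by the check under the lock, so the queue count is not decremented twice.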
* Refactor double formatting in eval-utils.c: replace custom logic with `print_netdata_double` for clarity and maintainability.

* Add null checks for `dst` and `fmt` in `vsnprintfz` to prevent potential issues

* Add null check for `fmt` in `vsnprintfz` and ensure `dst` is properly initialized

(cherry picked from commit 22962fc)
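
As a rough illustration of the null checks mentioned above, here is a guarded wrapper around the standard `vsnprintf`. `safe_vsnprintf` and `safe_snprintf` are hypothetical names; this is not Netdata's `vsnprintfz`, and the exact return-value conventions of the real function are an assumption.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Returns the number of characters stored in dst, excluding the terminator. */
int safe_vsnprintf(char *dst, size_t n, const char *fmt, va_list args) {
    if (!dst || n == 0)
        return 0;            /* nowhere to write */
    if (!fmt) {
        dst[0] = '\0';       /* leave dst as a valid empty string */
        return 0;
    }
    int written = vsnprintf(dst, n, fmt, args);
    if (written < 0) {
        dst[0] = '\0';       /* encoding error: empty result */
        return 0;
    }
    if ((size_t)written >= n)
        written = (int)(n - 1);  /* output was truncated */
    return written;
}

/* Variadic convenience wrapper. */
int safe_snprintf(char *dst, size_t n, const char *fmt, ...) {
    va_list args;
    va_start(args, fmt);
    int r = safe_vsnprintf(dst, n, fmt, args);
    va_end(args);
    return r;
}
```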
…ata#21864)

Refactor UID/GID cache updates in aggregation logic

- Relocate `update_cached_host_users()` and `update_cached_host_groups()` calls outside per-PID loops to optimize and simplify code structure.
- Remove redundant updates within conditional blocks.

(cherry picked from commit b88a35c)
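
The effect of moving the cache updates out of the per-PID loop can be sketched as follows. Everything here is hypothetical: `refresh_user_cache` stands in for `update_cached_host_users()`/`update_cached_host_groups()`, and a call counter replaces the real work so the difference is observable.

```c
#include <stddef.h>

/* Counter standing in for an expensive cache refresh. */
static int refresh_calls = 0;
static void refresh_user_cache(void) { refresh_calls++; }

typedef struct { int pid; int uid; } proc_entry;

/* Before: the refresh runs once per process. */
int aggregate_refresh_per_pid(const proc_entry *procs, size_t n) {
    int uid_sum = 0;
    for (size_t i = 0; i < n; i++) {
        refresh_user_cache();
        uid_sum += procs[i].uid;   /* placeholder for real aggregation */
    }
    return uid_sum;
}

/* After: the refresh runs once per aggregation pass. */
int aggregate_refresh_once(const proc_entry *procs, size_t n) {
    int uid_sum = 0;
    refresh_user_cache();
    for (size_t i = 0; i < n; i++)
        uid_sum += procs[i].uid;   /* same aggregation, no repeated refresh */
    return uid_sum;
}
```

Both functions compute the same result; only the number of refresh calls differs, growing with the process count in the first and staying at one in the second.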
)

- Add checks for uninitialized `cpus` and `cpus_lock` in `netdata_loop_cpu_chart` to prevent potential issues.
- Introduce `init_failed` flag to handle initialization failures gracefully and prevent redundant attempts.

(cherry picked from commit 1024b69)
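
A minimal sketch of how an `init_failed` flag can stop repeated initialization attempts. The struct layout, `collect_once`, and the two example initializers are assumptions made for illustration; they are not the code in `GetHardwareInfo.c`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical collector state; field names are illustrative only. */
typedef struct {
    int   *cpus;           /* per-CPU data, NULL until initialized */
    size_t ncpus;
    bool   init_failed;    /* set once so later iterations do not retry */
    int    init_attempts;  /* how many times initialization was tried */
} hw_collector;

typedef bool (*init_fn)(hw_collector *c);

/* Returns the number of CPUs handled, or -1 if the collector is unusable. */
int collect_once(hw_collector *c, init_fn init) {
    if (c->init_failed)
        return -1;                  /* earlier failure: skip without retrying */

    if (!c->cpus) {                 /* guard against uninitialized state */
        c->init_attempts++;
        if (!init(c) || !c->cpus) {
            c->init_failed = true;  /* remember the failure */
            return -1;
        }
    }
    return (int)c->ncpus;           /* placeholder for the per-CPU work */
}

/* Example initializers used to exercise both paths. */
static int example_cpus[2];
bool init_always_fails(hw_collector *c) { (void)c; return false; }
bool init_two_cpus(hw_collector *c) {
    c->cpus = example_cpus;
    c->ncpus = 2;
    return true;
}
```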
@github-actions github-actions bot added labels on Mar 16, 2026: area/ci; area/packaging (Packaging and operating systems support); area/docs; area/collectors (Everything related to data collection); area/claim; area/ml (Machine Learning Related Issues); area/build (Build system, autotools and cmake); collectors/apps; area/metadata (Integrations metadata); collectors/windows
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

8 issues found across 31 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packaging/windows/generate-driver-catalog.ps1">

<violation number="1" location="packaging/windows/generate-driver-catalog.ps1:34">
P1: Add the x64 match to `$candidates`; otherwise the normal Windows Kits layout is treated as "not found" and catalog generation fails.</violation>
</file>

<file name="src/collectors/windows.plugin/driver/netdata_driver.inf">

<violation number="1" location="src/collectors/windows.plugin/driver/netdata_driver.inf:24">
P2: Add a `DefaultUninstall.NTamd64.Services`/`DelService` entry. Right now the uninstall path removes the `.sys` file but leaves the `NetdataDriver` service registered on the legacy uninstall path.</violation>
</file>

<file name="packaging/windows/package-windows.sh">

<violation number="1" location="packaging/windows/package-windows.sh:48">
P2: This file-existence check is inverted, so the Windows package will skip writing `EDITOR` to the installed profile in the normal case.</violation>
</file>

<file name="src/collectors/windows.plugin/metadata.yaml">

<violation number="1" location="src/collectors/windows.plugin/metadata.yaml:46">
P1: Use `title` instead of `name` for prerequisite items; `name` is not valid for `setup.prerequisites.list[]` and makes these new metadata entries schema-invalid.</violation>
</file>

<file name=".github/workflows/packaging.yml">

<violation number="1" location=".github/workflows/packaging.yml:251">
P1: This condition misses the `amd64` x64 package jobs, so Debian/Ubuntu builds won't use the pre-fetched IBM MQ libs.</violation>
</file>

<file name="src/collectors/windows.plugin/GetHardwareInfo.c">

<violation number="1" location="src/collectors/windows.plugin/GetHardwareInfo.c:393">
P2: `device_lock` is initialized and destroyed but never actually entered/left anywhere. The `msr_device` handle—which this lock was presumably introduced to protect—is accessed unsynchronized from the hardware-info thread (`netdata_reopen_device_if_needed`, `netdata_read_msr`) and closed on the main thread in cleanup. If the thread join fails, this is a use-after-close race.</violation>
</file>

<file name="packaging/windows/netdata.wxs.in">

<violation number="1" location="packaging/windows/netdata.wxs.in:271">
P1: The driver package is incomplete here: the INF references `netdata_driver.cat`, but the MSI only installs the `.sys` and `.inf` files.</violation>

<violation number="2" location="packaging/windows/netdata.wxs.in:271">
P1: Copying the INF into `%windir%\INF` is not enough to install or stage the driver package. Use `PnPUtil`/SetupAPI to add the package to the driver store instead of treating the INF as a normal file.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
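
The `device_lock` finding above (a lock that is created and destroyed but never taken) can be illustrated with a short sketch. On Windows the relevant calls would be `EnterCriticalSection`/`LeaveCriticalSection` on a `CRITICAL_SECTION`; the example uses a POSIX mutex as a portable analog. `guarded_device` and its functions are hypothetical, and the integer handle is a placeholder for the MSR device handle.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical device wrapper; handle == -1 means closed. */
typedef struct {
    pthread_mutex_t lock;
    int handle;
} guarded_device;

void device_init(guarded_device *d, int handle) {
    pthread_mutex_init(&d->lock, NULL);
    d->handle = handle;
}

/* Reader side: hold the lock for the whole use of the handle. */
bool device_read(guarded_device *d, int *out) {
    bool ok = false;
    pthread_mutex_lock(&d->lock);
    if (d->handle >= 0) {
        *out = d->handle * 2;   /* placeholder for the real device I/O */
        ok = true;
    }
    pthread_mutex_unlock(&d->lock);
    return ok;
}

/* Cleanup side: close under the same lock, so a reader either
 * finishes before the close or sees the handle as closed. */
void device_close(guarded_device *d) {
    pthread_mutex_lock(&d->lock);
    d->handle = -1;
    pthread_mutex_unlock(&d->lock);
}

void device_destroy(guarded_device *d) {
    pthread_mutex_destroy(&d->lock);
}
```

Initializing and destroying the lock alone gives no protection; both the reading thread and the cleanup path have to enter it around every access to the shared handle.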

# Prefer the typical WDK/SDK layout: Windows Kits\10\bin\<version>\x64\Inf2Cat.exe
$patternX64 = Join-Path $root '*\x64\Inf2Cat.exe'
$found = Get-ChildItem -Path $patternX64 -File -ErrorAction SilentlyContinue
if (-not $found) {
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 16, 2026

P1: Add the x64 match to $candidates; otherwise the normal Windows Kits layout is treated as "not found" and catalog generation fails.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packaging/windows/generate-driver-catalog.ps1, line 34:

<comment>Add the x64 match to `$candidates`; otherwise the normal Windows Kits layout is treated as "not found" and catalog generation fails.</comment>

<file context>
@@ -0,0 +1,95 @@
+            # Prefer the typical WDK/SDK layout: Windows Kits\10\bin\<version>\x64\Inf2Cat.exe
+            $patternX64 = Join-Path $root '*\x64\Inf2Cat.exe'
+            $found = Get-ChildItem -Path $patternX64 -File -ErrorAction SilentlyContinue
+            if (-not $found) {
+                # Fallback: any versioned subfolder directly under bin containing Inf2Cat.exe
+                $patternAnyArch = Join-Path $root '*\Inf2Cat.exe'
</file context>

       prerequisites:
-        list: []
+        list:
+          - name: Netdata installation
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 16, 2026

P1: Use title instead of name for prerequisite items; name is not valid for setup.prerequisites.list[] and makes these new metadata entries schema-invalid.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/collectors/windows.plugin/metadata.yaml, line 46:

<comment>Use `title` instead of `name` for prerequisite items; `name` is not valid for `setup.prerequisites.list[]` and makes these new metadata entries schema-invalid.</comment>

<file context>
@@ -42,7 +42,13 @@ modules:
       prerequisites:
-        list: []
+        list:
+          - name: Netdata installation
+            description: |
+              When Netdata is installed on Windows, it automatically registers as a Windows Service
</file context>

-<File Id="NetdataDrv" Name="netdata_driver.sys" Directory="System64Folder" Source="C:\msys64\opt\netdata\usr\bin\netdata_driver.sys" Condition="NDDRVINST=1">
+<File Id="NetdataDrv" Name="netdata_driver.sys" Directory="DRIVERDIR" Source="C:\msys64\opt\netdata\usr\bin\netdata_driver.sys" Condition="NDDRVINST=1">
+</File>
+<File Id="NetdataDrvInf" Name="netdata_driver.inf" Directory="INFDIR" Source="C:\msys64\opt\netdata\usr\bin\netdata_driver.inf" Condition="NDDRVINST=1">
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 16, 2026

P1: The driver package is incomplete here: the INF references netdata_driver.cat, but the MSI only installs the .sys and .inf files.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packaging/windows/netdata.wxs.in, line 271:

<comment>The driver package is incomplete here: the INF references `netdata_driver.cat`, but the MSI only installs the `.sys` and `.inf` files.</comment>

<file context>
@@ -254,7 +266,9 @@
-                <File Id="NetdataDrv" Name="netdata_driver.sys" Directory="System64Folder" Source="C:\msys64\opt\netdata\usr\bin\netdata_driver.sys" Condition="NDDRVINST=1">
+                <File Id="NetdataDrv" Name="netdata_driver.sys" Directory="DRIVERDIR" Source="C:\msys64\opt\netdata\usr\bin\netdata_driver.sys" Condition="NDDRVINST=1">
+                </File>
+                <File Id="NetdataDrvInf" Name="netdata_driver.inf" Directory="INFDIR" Source="C:\msys64\opt\netdata\usr\bin\netdata_driver.inf" Condition="NDDRVINST=1">
                 </File>
             </FeatureGroup>
</file context>

-<File Id="NetdataDrv" Name="netdata_driver.sys" Directory="System64Folder" Source="C:\msys64\opt\netdata\usr\bin\netdata_driver.sys" Condition="NDDRVINST=1">
+<File Id="NetdataDrv" Name="netdata_driver.sys" Directory="DRIVERDIR" Source="C:\msys64\opt\netdata\usr\bin\netdata_driver.sys" Condition="NDDRVINST=1">
+</File>
+<File Id="NetdataDrvInf" Name="netdata_driver.inf" Directory="INFDIR" Source="C:\msys64\opt\netdata\usr\bin\netdata_driver.inf" Condition="NDDRVINST=1">
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 16, 2026

P1: Copying the INF into %windir%\INF is not enough to install or stage the driver package. Use PnPUtil/SetupAPI to add the package to the driver store instead of treating the INF as a normal file.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packaging/windows/netdata.wxs.in, line 271:

<comment>Copying the INF into `%windir%\INF` is not enough to install or stage the driver package. Use `PnPUtil`/SetupAPI to add the package to the driver store instead of treating the INF as a normal file.</comment>

<file context>
@@ -254,7 +266,9 @@
-                <File Id="NetdataDrv" Name="netdata_driver.sys" Directory="System64Folder" Source="C:\msys64\opt\netdata\usr\bin\netdata_driver.sys" Condition="NDDRVINST=1">
+                <File Id="NetdataDrv" Name="netdata_driver.sys" Directory="DRIVERDIR" Source="C:\msys64\opt\netdata\usr\bin\netdata_driver.sys" Condition="NDDRVINST=1">
+                </File>
+                <File Id="NetdataDrvInf" Name="netdata_driver.inf" Directory="INFDIR" Source="C:\msys64\opt\netdata\usr\bin\netdata_driver.inf" Condition="NDDRVINST=1">
                 </File>
             </FeatureGroup>
</file context>


[DefaultUninstall.NTamd64]
LegacyUninstall=1
DelFiles = DriverCopyFiles
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 16, 2026

P2: Add a DefaultUninstall.NTamd64.Services/DelService entry. Right now the uninstall path removes the .sys file but leaves the NetdataDriver service registered on the legacy uninstall path.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/collectors/windows.plugin/driver/netdata_driver.inf, line 24:

<comment>Add a `DefaultUninstall.NTamd64.Services`/`DelService` entry. Right now the uninstall path removes the `.sys` file but leaves the `NetdataDriver` service registered on the legacy uninstall path.</comment>

<file context>
@@ -0,0 +1,45 @@
+
+[DefaultUninstall.NTamd64]
+LegacyUninstall=1
+DelFiles = DriverCopyFiles
+
+[DriverCopyFiles]
</file context>

stelfrag and others added 6 commits March 16, 2026 16:30
- Add proper acquire / release logic for instances

(cherry picked from commit 7bbee1f)
…ta#21953)

* Fix permissions for systemd-journal.plugin on offline installs

* Try setcap before suid

* systemd-journal.plugin is optional

* systemd-journal.plugin is optional for the fallback, too

* Use customary order of cap flags

(cherry picked from commit aa77c3a)
…#21958)

* Ensure thread safety and proper cleanup in `GetHardwareInfo`

- Add critical section locks for MSR device operations to prevent race conditions.
- Improve error handling during thread join and hardware info cleanup.
- Refine resource lifecycle management to avoid teardown races.

* Simplify `do_GetHardwareInfo_cleanup` by removing redundant thread join checks and refining resource teardown logic.

* Further improvements

* Address review comments

* Address comments, add retry count

* Add `THREAD_JOIN_FALLBACK_WAIT_MS` and align retry logic in `GetHardwareInfo`

(cherry picked from commit e4aeead)
Fix conditional check for editor configuration in Windows packaging script

(cherry picked from commit dfbf1a5)
…ta#21983)

Initialize `node_stale_after_seconds` field in `pluginsd_parser`

(cherry picked from commit baf8ccb)
If the otel-signal-viewer plugin starts before the otel plugin, the
journal directory (e.g. /var/log/netdata/otel/v1) might not yet exist.
This causes inotify watch setup to fail, and the viewer never picks up
logs even after the otel plugin creates the directory and starts writing.

Create the directory with create_dir_all before setting up the watch.
This is safe because the otel plugin also uses create_dir_all (to create
a machine-id subdirectory inside it), which is idempotent.

(cherry picked from commit 6851d51)

Labels

area/aclk; area/build (Build system, autotools and cmake); area/ci; area/claim; area/collectors (Everything related to data collection); area/database; area/docs; area/metadata (Integrations metadata); area/ml (Machine Learning Related Issues); area/packaging (Packaging and operating systems support); area/plugins.d; collectors/apps; collectors/windows

6 participants