-
Notifications
You must be signed in to change notification settings - Fork 1
update documentation #80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Warning Rate limit exceeded@PythonFZ has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 18 minutes and 34 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (1)
WalkthroughThis PR shifts substructure matching to an RDKit-centered API, adds fragment-aware grouping and visualization, bundles/compiles a Packmol binary with a hatch build hook and wrapper/CLI, extends utilities (graph drawing, density, unwrap, bond detection), updates docs/notebooks, tests, CI, and packaging metadata. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Substruct as match_substructure
participant RDKit
participant Mol as Chem.Mol
User->>Substruct: match_substructure(Mol, "SMARTS/SMILES", hydrogens, mapped_only)
activate Substruct
Substruct->>Substruct: parse pattern & validate maps
Substruct->>RDKit: run substructure search
RDKit-->>Substruct: matches (atom index tuples)
Substruct->>Substruct: apply hydrogen mode & mapped ordering
Substruct-->>User: tuple of matches
deactivate Substruct
sequenceDiagram
participant User
participant PackAPI as molify.pack
participant PackMod as molify.packmol
participant Bin as packmol binary
User->>PackAPI: pack(..., packmol=None)
activate PackAPI
PackAPI->>PackMod: get_packmol_binary()
PackMod-->>PackAPI: /.../src/molify/binaries/packmol
PackAPI->>Bin: subprocess(stdin = pack input)
Bin-->>PackAPI: stdout/stderr
PackAPI-->>User: packed ASE Atoms
deactivate PackAPI
sequenceDiagram
participant HatchBuild
participant Hook as CustomBuildHook
participant Make as make (external/packmol)
participant BinDest as src/molify/binaries/
HatchBuild->>Hook: initialize(version, build_data)
activate Hook
Hook->>Hook: locate gfortran (candidates)
alt gfortran found
Hook->>Make: run make FORTRAN=<gfortran>
Make-->>Hook: compiled binary
Hook->>BinDest: copy binary into package
Hook->>HatchBuild: mark wheel platform-specific
else not found
Hook-->>HatchBuild: raise RuntimeError (guidance)
end
deactivate Hook
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60–75 minutes Key areas requiring extra attention:
Possibly related PRs
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #80 +/- ##
==========================================
- Coverage 96.76% 92.36% -4.40%
==========================================
Files 25 26 +1
Lines 1853 1873 +20
==========================================
- Hits 1793 1730 -63
- Misses 60 143 +83 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
molify/utils.py (2)
76-85: calculate_box_dimensions has unit/dimension errors.total_mass is in amu; dividing by density (kg/m³) and “/1000” yields inconsistent units. Return value is used as Å, but the math doesn’t convert kg→ų. Fix to compute volume in m³, convert to ų, then cube‑root.
def calculate_box_dimensions(images: list[ase.Atoms], density: float) -> list[float]: - """Calculates the dimensions of the simulation box - - based on the molar volume and target density. - """ - total_mass = sum(sum(atoms.get_masses()) for atoms in images) - molar_volume = total_mass / density / 1000 # m^3 / mol - volume_per_mol = molar_volume * ase.units.m**3 / ase.units.mol - box_edge = volume_per_mol ** (1 / 3) - return [box_edge] * 3 + """Compute cubic box edges (Å) for given images at target density (kg/m³). + + Assumes `images` is the list of all molecules to be packed in one box. + """ + # Total mass in kg (get_masses returns amu) + total_mass_kg = sum(sum(atoms.get_masses()) for atoms in images) * ase.units._amu + # Volume in m³ from density (kg/m³) + volume_m3 = total_mass_kg / density + # Convert to ų + volume_A3 = volume_m3 * 1e30 + box_edge_A = volume_A3 ** (1 / 3) + return [box_edge_A] * 3
169-173: Comment says BFS, code uses DFS.Update the comment for accuracy.
- # now do a bfs traversal to unwrap the molecule + # now do a DFS traversal to unwrap the molecule
🧹 Nitpick comments (7)
docs/source/index.rst (1)
175-178: Section numbering jumps from “2.” to “4.”Change “4. Full Reference” to “3. Full Reference” for consistency.
-**4. Full Reference** +**3. Full Reference**docs/source/packmol_tools.ipynb (1)
307-329: Runtime/version-dependent output in docs.“Packmol.jl version: v0.1.12” will drift. Consider:
- Mark cell as “do not execute” for published docs, or
- Print a generic success message and hide exact version, or
- Guard with a short timeout and fallback text.
This keeps the docs resilient across environments.
molify/utils.py (2)
239-260: Fragile parsing of Julia Packmol version.Splitting on spaces and indexing parts[2] may break with format changes. Prefer a regex search for “Packmol” and a semver token, or expose the query in docs and handle absence gracefully.
- for line in result.stdout.splitlines(): - if "Packmol" in line: - parts = line.split() - if len(parts) >= 3: - return parts[2] + import re + for line in result.stdout.splitlines(): + if "Packmol" in line: + m = re.search(r'Packmol.*\b(v?\d+(?:\.\d+){1,3})\b', line) + if m: + return m.group(1)
70-73: Minor: uses private ASE constant.ase.units._amu is private; consider hard-coding the CODATA value or referencing public constants to reduce churn risk. Low priority.
molify/__init__.py (1)
35-35: Consider documenting draw_molecular_graph as optional.If you keep the public symbol, please note in README/docs that plotting requires extra deps (matplotlib), or provide an extras_require pin.
docs/source/atom_selection.ipynb (2)
244-260: Strip committed outputs (large images/printed dumps).Keep notebooks lean; outputs drift across environments and increase repo size. Use nbstripout or pre-commit to clear outputs.
Also applies to: 248-260, 393-401
133-134: Notebook lint: useless expression warning.Static analysis flagged a useless expression; remove stray bare expressions that don’t affect execution.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
docs/source/atom_selection.ipynb(9 hunks)docs/source/index.rst(1 hunks)docs/source/packmol_tools.ipynb(2 hunks)molify/__init__.py(2 hunks)molify/utils.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
molify/__init__.py (1)
molify/utils.py (2)
draw_molecular_graph(263-417)unwrap_structures(88-178)
🪛 Ruff (0.14.1)
molify/utils.py
352-352: Avoid specifying long messages outside the exception class
(TRY003)
369-372: Avoid specifying long messages outside the exception class
(TRY003)
docs/source/atom_selection.ipynb
133-133: Found useless expression. Either assign it to a variable or remove it.
(B018)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: pytest (3.11, ubuntu-latest, 20.15.1)
- GitHub Check: pytest (3.10, ubuntu-latest, 20.15.1)
- GitHub Check: pytest (3.13, ubuntu-latest, 20.15.1)
- GitHub Check: pytest (3.12, ubuntu-latest, 20.15.1)
🔇 Additional comments (3)
docs/source/index.rst (2)
4-8: No branding mismatch—the documentation is correct. The official package name is"molify"(perpyproject.toml), and the docs consistently use it for imports and function calls. The GitHub repository is namedrdkit2ase, but repository names and package names commonly differ and is not an issue. Users will correctly installpip install molifyand importimport molify.
27-40: All required Mermaid configuration is already in place—no action needed.Verification confirms that
sphinxcontrib.mermaidis configured indocs/source/conf.py(line 25) andsphinxcontrib-mermaid>=1.0.0is included in the docs dependency group inpyproject.toml. The Mermaid diagrams should render correctly as-is.docs/source/packmol_tools.ipynb (1)
40-49: Strip notebook outputs and configure nbstripout in pre-commit to prevent recurrence.The notebook contains 8 output cells with embedded base64-encoded images (2 images, ~3KB each encoded), resulting in unnecessary file bloat. This is part of a broader pattern: all 5 notebooks in
docs/source/carry embedded outputs (13–22 cells each), inflating the repository unnecessarily.Recommendation remains valid. Strip outputs from this notebook (and ideally all others) and configure nbstripout in
.git/.pre-commit-config.yamlto enforce clean notebooks going forward.
molify/__init__.py
Outdated
| visualize_selected_molecules, | ||
| ) | ||
| from molify.utils import unwrap_structures | ||
| from molify.utils import draw_molecular_graph, unwrap_structures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eager import adds a hard matplotlib dependency at import time.
Importing draw_molecular_graph from init forces matplotlib even when users only need core conversions. Make it lazy (try/except shim) or move to a viz submodule to keep base install slim.
-from molify.utils import draw_molecular_graph, unwrap_structures
+from molify.utils import unwrap_structures
+try:
+ # Optional: available only if plotting deps are installed
+ from molify.utils import draw_molecular_graph # type: ignore
+except Exception: # ImportError and others (e.g., backend issues)
+ def draw_molecular_graph(*_, **__): # pragma: no cover
+ raise RuntimeError(
+ "molify.draw_molecular_graph requires matplotlib. "
+ "Install optional plotting deps, e.g. `pip install molify[plot]`."
+ )📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| from molify.utils import draw_molecular_graph, unwrap_structures | |
| from molify.utils import unwrap_structures | |
| try: | |
| # Optional: available only if plotting deps are installed | |
| from molify.utils import draw_molecular_graph # type: ignore | |
| except Exception: # ImportError and others (e.g., backend issues) | |
| def draw_molecular_graph(*_, **__): # pragma: no cover | |
| raise RuntimeError( | |
| "molify.draw_molecular_graph requires matplotlib. " | |
| "Install optional plotting deps, e.g. `pip install molify[plot]`." | |
| ) |
🤖 Prompt for AI Agents
molify/__init__.py around line 18: the file currently does a top-level import of
draw_molecular_graph which forces a matplotlib dependency on import; remove that
eager import and instead either (A) move draw_molecular_graph (and any viz
helpers) into a new submodule molify.viz and export it from there, updating
documentation, or (B) replace the top-level import with a lazy shim: wrap the
import in a try/except ImportError and set draw_molecular_graph to a small
function that raises a clear error suggesting installation of the optional viz
extras (e.g., "pip install molify[viz]") or dynamically import the real function
inside draw_molecular_graph when called; ensure unwrap_structures (core) remains
imported at top-level.
| if layout == "spring": | ||
| if weight_by_bond_order: | ||
| # Use bond_order as weights - higher order = stronger spring | ||
| pos = nx.spring_layout(graph, weight="bond_order", seed=42) | ||
| else: | ||
| pos = nx.spring_layout(graph, seed=42) | ||
| elif layout == "circular": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Guard spring_layout weights when bond_order is None.
NetworkX expects numeric weights; None may error or be ignored inconsistently. Precompute numeric weights (fallback to 1.0).
- if layout == "spring":
- if weight_by_bond_order:
- # Use bond_order as weights - higher order = stronger spring
- pos = nx.spring_layout(graph, weight="bond_order", seed=42)
+ if layout == "spring":
+ if weight_by_bond_order:
+ # Build a view with numeric weights: None -> 1.0
+ g_w = nx.Graph()
+ g_w.add_nodes_from(graph.nodes(data=True))
+ for u, v, data in graph.edges(data=True):
+ w = data.get("bond_order")
+ g_w.add_edge(u, v, weight=float(w) if isinstance(w, (int, float)) else 1.0)
+ pos = nx.spring_layout(g_w, weight="weight", seed=42)
else:
pos = nx.spring_layout(graph, seed=42)🤖 Prompt for AI Agents
In molify/utils.py around lines 358 to 364, the spring_layout call uses edge
attribute "bond_order" directly which may contain None values and NetworkX
requires numeric weights; ensure all edges have numeric bond_order before
calling spring_layout by iterating edges and replacing None (or non-numeric)
bond_order values with a numeric fallback (e.g., 1.0), then call
nx.spring_layout(graph, weight="bond_order", seed=42). If weight_by_bond_order
is False, leave current behavior unchanged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (12)
docs/source/index.rst (1)
149-161: Tighten uv install steps.Consider adding explicit uv venv/create step to avoid confusion for new contributors. For example: “uv venv && . .venv/bin/activate && uv sync”.
docs/source/packmol_tools.ipynb (3)
41-87: Clear notebook outputs before committing.Large base64 images/prints bloat the repo and go stale. Strip outputs (e.g., nbconvert --clear-output or nbstripout) and keep execution_count null.
Also applies to: 248-268
321-329: Guard Packmol discovery.The try/except is good; consider also showing a short preflight helper (e.g., checking shutil.which for “packmol.jl”/“packmol”) so users know why cells are skipped.
75-87: Reproducibility note.You pass default seed=42 in pack(); consider calling it explicitly in examples to make determinism obvious.
tests/test_substructure.py (3)
17-21: Skip tests when Packmol is unavailable.Gate Packmol-dependent tests to improve local dev ergonomics.
+import shutil + @pytest.mark.parametrize("packmol", ["packmol.jl"]) -def test_match_substructure_box(packmol): +@pytest.mark.skipif( + shutil.which("packmol.jl") is None and shutil.which("packmol") is None, + reason="Packmol not installed", +) +def test_match_substructure_box(packmol): @@ @pytest.mark.parametrize("packmol", ["packmol.jl"]) -def test_get_substructure_box(packmol): +@pytest.mark.skipif( + shutil.which("packmol.jl") is None and shutil.which("packmol") is None, + reason="Packmol not installed", +) +def test_get_substructure_box(packmol): @@ @pytest.mark.parametrize("packmol", ["packmol.jl"]) -def test_iter_fragments(packmol, remove_connectivity): +@pytest.mark.skipif( + shutil.which("packmol.jl") is None and shutil.which("packmol") is None, + reason="Packmol not installed", +) +def test_iter_fragments(packmol, remove_connectivity): @@ @pytest.mark.parametrize("packmol", ["packmol.jl"]) -def test_match_substructure_ions_with_none_bond_orders(packmol): +@pytest.mark.skipif( + shutil.which("packmol.jl") is None and shutil.which("packmol") is None, + reason="Packmol not installed", +) +def test_match_substructure_ions_with_none_bond_orders(packmol):Also applies to: 43-49, 61-72, 417-439
24-27: Reduce brittleness of exact index assertions.Packmol ordering can vary; prefer semantic checks (element sets/counts) over fixed indices to avoid flakes across platforms.
Also applies to: 25-27
260-266: Add a negative test for mapped_only without maps.Documented behavior is implicit; either enforce an error or test current behavior. Example new test:
+def test_mapped_only_without_maps_returns_full_match(): + mol = Chem.MolFromSmiles("CCO") + mol = Chem.AddHs(mol) + # Current behavior: returns full match when no maps are present + matches = molify.match_substructure(mol, "CO", mapped_only=True) + assert matches == ((1, 2),)docs/source/atom_selection.ipynb (1)
98-101: Clarify mapped_only semantics.Add a short note: if mapped_only=True and no atom maps exist in the pattern, the full match is returned. This avoids surprises.
molify/substructure.py (4)
108-121: Consider stricter mapped_only behavior.If mapped_only=True but the pattern has no atom maps, returning full matches can surprise users. Either raise a ValueError or document explicitly.
- if mapped_only and mapped_pattern_indices: + if mapped_only and not mapped_pattern_indices: + raise ValueError("mapped_only=True requires mapped atoms in the pattern") + if mapped_only: core_atoms = tuple(match[idx] for idx in mapped_pattern_indices) else: core_atoms = match
236-273: Allow passing match options through get_substructures.Expose hydrogens/mapped_only to make this utility more flexible.
-def get_substructures( - atoms: ase.Atoms, - pattern: str, - **kwargs, -) -> list[ase.Atoms]: +def get_substructures( + atoms: ase.Atoms, + pattern: str, + *, + hydrogens: tp.Literal["include", "exclude", "isolated"] = "exclude", + mapped_only: bool = False, + **kwargs, +) -> list[ase.Atoms]: @@ - mol = ase2rdkit(atoms, **kwargs) - matches = match_substructure(mol, pattern) + mol = ase2rdkit(atoms, **kwargs) + matches = match_substructure( + mol, pattern, hydrogens=hydrogens, mapped_only=mapped_only + )
387-396: Lazy-import pyplot and avoid prints.Reduce import overhead and use logging instead of print to keep libraries quiet.
-import matplotlib.pyplot as plt +from typing import TYPE_CHECKING +if TYPE_CHECKING: # for type checkers only + import matplotlib.pyplot as plt @@ - colors = plt.cm.tab10.colors + # Lazy import to avoid heavy import at module import time + import matplotlib.pyplot as plt # type: ignore + colors = plt.cm.tab10.colors @@ - if not candidate_mols: - print("No molecules to draw with the given selections.") - return None + if not candidate_mols: + import logging + logging.getLogger(__name__).info( + "No molecules to draw with the given selections." + ) + return NoneAlso applies to: 445-461, 474-482
81-87: Tidy exception messages (TRY003).Shorten messages or use dedicated exception classes to satisfy linters. Optional, but keeps noise down.
- raise ValueError(f"Invalid SMARTS/SMILES: {smarts_or_smiles}") + raise ValueError("Invalid SMARTS/SMILES pattern") - raise ValueError(f"Atom map label '{map_num}' is used multiple times") + raise ValueError("Duplicate atom map label found") - raise ValueError( - f"Invalid value for `hydrogens`: {hydrogens!r}. " - "Expected one of 'include', 'exclude', 'isolated'." - ) + raise ValueError("hydrogens must be one of: include, exclude, isolated")Also applies to: 95-96, 162-165
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
docs/source/atom_selection.ipynb(2 hunks)docs/source/index.rst(1 hunks)docs/source/packmol_tools.ipynb(2 hunks)molify/__init__.py(2 hunks)molify/substructure.py(2 hunks)tests/test_substructure.py(9 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- molify/init.py
🧰 Additional context used
🧬 Code graph analysis (2)
molify/substructure.py (2)
molify/ase2x.py (1)
ase2rdkit(187-220)molify/utils.py (1)
find_connected_components(32-62)
tests/test_substructure.py (5)
molify/ase2x.py (1)
ase2rdkit(187-220)molify/substructure.py (4)
match_substructure(13-170)get_substructures(236-272)group_matches_by_fragment(173-233)visualize_selected_molecules(387-482)molify/smiles2x.py (2)
smiles2conformers(64-149)smiles2atoms(33-61)molify/pack.py (1)
pack(158-263)tests/conftest.py (3)
ethanol_water(8-14)alanine_dipeptide(18-20)alanine_dipeptide_box(24-29)
🪛 Ruff (0.14.1)
molify/substructure.py
86-86: Avoid specifying long messages outside the exception class
(TRY003)
95-95: Avoid specifying long messages outside the exception class
(TRY003)
162-165: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: pytest (3.11, ubuntu-latest, 20.15.1)
- GitHub Check: pytest (3.12, ubuntu-latest, 20.15.1)
- GitHub Check: pytest (3.10, ubuntu-latest, 20.15.1)
- GitHub Check: pytest (3.13, ubuntu-latest, 20.15.1)
🔇 Additional comments (5)
docs/source/index.rst (2)
22-40: Mermaid blocks need Sphinx support configured.Confirm sphinxcontrib-mermaid (or equivalent) is enabled in docs/conf.py and on your docs builder (e.g., RTD). Otherwise these diagrams won’t render.
47-53: Nice: documents connectivity schema and None bond orders.Matches current ase2rdkit behavior that tolerates None bond orders.
docs/source/atom_selection.ipynb (1)
32-44: LGTM; SMARTS usage and RDKit-first flow are clear.Examples correctly use SMARTS and explicit ase2rdkit(suggestions=[...]).
Also applies to: 55-60, 72-77
molify/substructure.py (2)
172-233: Group-by-fragment logic is clear and correct.Deduplication with order preservation is a nice touch.
275-314: iter_fragments: robust fallback without networkx.Good defensive implementation using adjacency if networkx is missing.
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
docs/source/index.rst(1 hunks)docs/source/packmol_tools.ipynb(2 hunks)
🔇 Additional comments (4)
docs/source/packmol_tools.ipynb (1)
380-380: Verify the nbformat_minor downgrade is intentional.The notebook format was downgraded from
nbformat_minor: 5tonbformat_minor: 4. This change moves to an older Jupyter Notebook format version (4.4 instead of 4.5). While this may improve compatibility with older Jupyter installations, please confirm this was intentional and not an inadvertent change during editing.docs/source/index.rst (3)
172-172: The review comment is incorrect; no issues found.The
modulestoctree entry on line 172 is valid. The filedocs/source/modules.rstexists in the repository, andsphinx.ext.autodocis properly configured inconf.py. The documentation build will not fail due to this reference.
64-98: Code examples correctly reference exported public APIs — verified.All four conversion functions used in the Quick Start examples (
rdkit2ase,ase2rdkit,rdkit2networkx,ase2networkx) are properly exported in molify's__init__.py, listed in__all__, and defined in their respective modules. The documentation examples accurately use the public API.
27-39: No issues found with Mermaid configuration or documentation structure.sphinxcontrib-mermaid (>=1.0.0) is properly configured in
conf.pyand has appropriate version constraints inpyproject.toml. The Mermaid diagram syntax in both flowcharts is valid (graph LR with proper node/edge formatting), and all six conversion functions referenced in the diagrams and code examples exist in the molify API. Themodules.rstreference is valid, and documentation structure is complete.
| import molify | ||
| # Generate 10 different 3D conformers of water | ||
| water = molify.smiles2conformers("O", numConfs=10) | ||
| **Molecular System Building** | ||
|
|
||
| For many applications, it is useful to solvate molecules or create mixtures. | ||
| molify provides a convenient interface to Packmol_ for building molecular systems. | ||
|
|
||
| .. code-block:: python | ||
| import molify | ||
| # Generate conformers for water and ethanol | ||
| water = molify.smiles2conformers("O", numConfs=10) | ||
| etoh = molify.smiles2conformers("CCO", numConfs=10) | ||
| # Pack 5 water and 5 ethanol molecules into a box at 800 kg/m³ | ||
| # Note: Requires Packmol.jl (Julia) or Packmol to be installed | ||
| box = molify.pack( | ||
| data=[water, etoh], counts=[5, 5], density=800, packmol="packmol.jl" | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Verify Advanced Features functions and documentation links.
The Advanced Features section references molify.smiles2conformers() and molify.pack() functions. Verify these are exported in the public API and that the documentation link targets (:doc:packmol_tools, :doc:ase_tools, etc. on line 135) exist and resolve correctly during documentation build.
Run the following script to verify function exports and documentation references:
🏁 Script executed:
#!/bin/bash
# Description: Verify Advanced Features functions are exported and doc links exist
echo "=== Checking for function definitions ==="
rg -n "def (smiles2conformers|pack)\s*\(" --type py
echo -e "\n=== Checking for referenced documentation files ==="
fd -e rst "^(packmol_tools|ase_tools|rdkit_tools|atom_selection|networkx_tools)\.rst$" docs/source/Length of output: 168
---
🏁 Script executed:
```shell
#!/bin/bash
# Description: Verify function exports and find actual documentation references
echo "=== Checking molify/__init__.py for function exports ==="
rg -n "smiles2conformers|pack" molify/__init__.py -C 3
echo -e "\n=== Checking index.rst line 135 for documentation link targets ==="
sed -n '135,140p' docs/source/index.rst
echo -e "\n=== Finding all .rst files in docs/source/ ==="
fd -e rst docs/source/ | head -20
Length of output: 296
---
🏁 Script executed:
```shell
#!/bin/bash
# Description: Verify function exports and find actual documentation references
echo "=== Checking molify/__init__.py for function exports ==="
rg -n "smiles2conformers|pack" molify/__init__.py -C 3
echo -e "\n=== Checking index.rst line 135 for documentation link targets ==="
sed -n '135,140p' docs/source/index.rst
echo -e "\n=== Finding all .rst files in docs/source/ ==="
fd -e rst docs/source/
Length of output: 1277
---
🏁 Script executed:
```shell
#!/bin/bash
# Description: Verify documentation link targets exist
echo "=== Finding all .rst files in docs/source/ ==="
fd . docs/source/ -e rst
echo -e "\n=== Checking specifically for referenced doc files ==="
for docname in packmol_tools ase_tools rdkit_tools atom_selection networkx_tools; do
if [ -f "docs/source/${docname}.rst" ]; then
echo "✓ Found: docs/source/${docname}.rst"
else
echo "✗ Missing: docs/source/${docname}.rst"
fi
done
Length of output: 349
---
🏁 Script executed:
```shell
#!/bin/bash
# Description: Check full docs structure and verify this is the issue
echo "=== Full directory structure of docs/ ==="
find docs -type f -name "*.rst" | sort
echo -e "\n=== Check if documentation is incomplete or links are wrong ==="
echo "Searching for any mention of packmol, ase, rdkit in .rst files..."
rg -l "packmol|ase|rdkit" docs/source/
Length of output: 416
Fix broken documentation links in Advanced Features section.
The functions smiles2conformers and pack are correctly exported in the public API. However, the documentation links on line 135 are broken: they reference :doc: targets (which expect .rst files) but the actual documentation files are .ipynb (Jupyter notebooks). The referenced files packmol_tools.rst, ase_tools.rst, rdkit_tools.rst, and atom_selection.rst do not exist; they are named packmol_tools.ipynb, ase_tools.ipynb, etc. This will cause broken links during documentation builds. Update the references to use the correct format (e.g., link to the .ipynb files directly or convert the documentation source files to .rst).
🤖 Prompt for AI Agents
In docs/source/index.rst around lines 110 to 133 (and the broken link at ~135),
the docs reference non-existent :doc: targets (packmol_tools.rst, ase_tools.rst,
rdkit_tools.rst, atom_selection.rst); update those links to point to the actual
notebook files (packmol_tools.ipynb, ase_tools.ipynb, rdkit_tools.ipynb,
atom_selection.ipynb) or convert the notebooks to .rst and update the targets
accordingly — specifically replace the :doc: references with direct
links/appropriate Sphinx roles that reference the .ipynb files (or change the
source filenames to .rst and keep :doc:) so the documentation builder resolves
them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (3)
molify/__init__.py (1)
17-21: Matplotlib dependency still forced at import time.The past review comment regarding eager import of
draw_molecular_graphremains unresolved. Importing it at the top level forces matplotlib even when users only need core conversions.molify/utils.py (2)
336-336: Documentation still incorrect for node_size default.The past review comment remains unresolved. The docstring states "Default is 3000" but the signature defaults to 500.
401-407: Bond order None handling still missing for spring_layout.The past review comment remains unresolved. When
weight_by_bond_order=True, edges withbond_order=Nonemay cause NetworkX errors or inconsistent behavior inspring_layout.
🧹 Nitpick comments (2)
molify/utils.py (2)
264-303: LGTM!The auto-detection logic correctly prioritizes Julia wrapper over system binary and provides helpful error messages. The implementation is clean and well-documented.
Optional style improvements (per Ruff hints):
- Line 290: Consider moving
passto anelseblock for clarity- Line 300-303: Consider extracting the error message to a constant if it's reused elsewhere
382-398: LGTM!The implementation correctly:
- Handles both provided and new axes
- Delegates bond order visualization appropriately
- Uses Jmol color scheme with proper fallback
- Manages figure lifecycle to prevent double display in notebooks
Optional: Extract error messages at lines 395 and 412-415 to constants per Ruff TRY003 hint.
Also applies to: 417-460
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
docs/source/packmol_tools.ipynb(2 hunks)molify/__init__.py(2 hunks)molify/pack.py(4 hunks)molify/utils.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
molify/pack.py (1)
molify/utils.py (3)
calculate_box_dimensions(77-86)find_packmol_executable(264-303)get_packmol_julia_version(232-261)
molify/__init__.py (2)
molify/substructure.py (2)
group_matches_by_fragment(173-233)match_substructure(13-170)molify/utils.py (2)
draw_molecular_graph(306-460)find_packmol_executable(264-303)
🪛 Ruff (0.14.3)
molify/utils.py
290-290: Consider moving this statement to an else block
(TRY300)
300-303: Avoid specifying long messages outside the exception class
(TRY003)
395-395: Avoid specifying long messages outside the exception class
(TRY003)
412-415: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: pytest (3.10, ubuntu-latest, 20.15.1)
- GitHub Check: pytest (3.13, ubuntu-latest, 20.15.1)
- GitHub Check: pytest (3.11, ubuntu-latest, 20.15.1)
- GitHub Check: pytest (3.12, ubuntu-latest, 20.15.1)
🔇 Additional comments (13)
docs/source/packmol_tools.ipynb (8)
7-11: LGTM! Clear introduction with proper prerequisites.The updated introduction clearly explains what Packmol is and links to the external installation requirement. This helps users understand the prerequisites before proceeding.
42-65: LGTM! Clear and well-structured example.The water box example demonstrates the workflow effectively:
- Generate conformers
- Pack with appropriate density (1000 kg/m³ for water)
- Display results with clear print statements
The code is clean and educational.
81-84: LGTM! Connectivity verification is helpful.The connectivity check demonstrates that bond information is preserved through the packing process, which is important for users to understand. The expected bond count calculation (10 molecules × 2 O-H bonds) is correct.
184-227: LGTM! Comprehensive mixture example.The water-ethanol mixture example effectively demonstrates:
- Creating conformers for multiple molecule types
- Packing mixed systems with appropriate density
- Visualizing the result with RDKit
The code is clear and the density choice (800 kg/m³) is reasonable for such a mixture.
245-249: LGTM! Useful verification example.The density calculation using
molify.utils.calculate_density()provides a practical way for users to verify their packed systems match the target density.
305-305: Verify nbformat_minor downgrade.The
nbformat_minorwas changed from 5 to 4, which represents a downgrade to an older notebook format. Please verify:
- Is this downgrade intentional for backward compatibility?
- Are there any notebook features (from nbformat 5) that might break with this change?
- Was the notebook edited with an older Jupyter version?
Modern Jupyter installations (6.x+) use nbformat 5, so downgrading might indicate compatibility requirements or an editing tool issue.
1-306: Consider WIP status and coverage before merging.While the notebook documentation is comprehensive and well-structured, please note:
WIP Status: The PR description indicates this is "WIP" (Work In Progress). Confirm the PR is ready for review and merge.
Coverage Concerns: Per the PR objectives, coverage dropped significantly (96.76% → 92.69%, −4.07%). The
molify/utils.pyfile has only 7.77% patch coverage with 83 missing lines. Consider addressing test coverage before finalizing this documentation update.The tutorial content itself is excellent—clear examples, good structure, and comprehensive parameter explanations.
22-22: The importimport molifyis correct; no inconsistency exists.The original concern misunderstood Python naming conventions. The package name in
pyproject.tomlis "molify"—this is what must be imported. The kernel'sdisplay_nameof "rdkit2ase" and the repository name "zincware/rdkit2ase" are separate identities. Additionally,rdkit2aseis a function exported from the molify package, not a package name. All imports across the codebase correctly useimport molify, consistent with the actual package name.Likely an incorrect or invalid review comment.
molify/pack.py (3)
12-16: LGTM!The import statement correctly includes
find_packmol_executableto support the new auto-detection feature.
223-225: LGTM!The auto-detection logic is correctly placed early in the function to fail fast if no packmol installation is found, before any expensive conformer selection.
169-169: Backward compatibility verified—change is safe and improves robustness.The auto-detection implementation at
find_packmol_executable()properly handles both fallback paths: Julia wrapper (preferred) and native binary in PATH. Existing tests with explicitpackmolparameter are unaffected; tests relying on the default will now auto-detect and benefit from Julia support if available. ClearRuntimeErroris raised if neither installation is found.molify/utils.py (2)
2-2: LGTM!The new imports support the visualization and executable detection features. All are appropriately used in the new functions.
Also applies to: 5-5, 9-9, 14-16
463-571: LGTM!The bond order visualization correctly:
- Handles None and unknown bond orders with appropriate fallbacks
- Draws single/double/triple bonds with parallel lines
- Renders aromatic bonds with solid+dashed style
- Uses proper perpendicular offset calculations
docs/source/packmol_tools.ipynb
Outdated
| "By default, `pack()` automatically detects your Packmol installation. It searches for one of the following:\n", | ||
| "- **Native binary** (`packmol`)\n", | ||
| "- **Julia version** (`packmol.jl`)\n", | ||
| "\n", | ||
| "You can explicitly specify which version to use:\n", | ||
| "\n", | ||
| "```python\n", | ||
| "# Let auto-detection choose (recommended)\n", | ||
| "molify.pack([water], [10], 1000)\n", | ||
| "\n", | ||
| "# Explicitly use Julia version\n", | ||
| "molify.pack([water], [10], 1000, packmol=\"packmol.jl\")\n", | ||
| "\n", | ||
| "# Use custom path\n", | ||
| "molify.pack([water], [10], 1000, packmol=\"/path/to/packmol\")\n", | ||
| "```\n", | ||
| "\n", | ||
| "To check which executable will be used:\n", | ||
| "```python\n", | ||
| "executable = molify.find_packmol_executable()\n", | ||
| "print(f\"Using: {executable}\")\n", | ||
| "```" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Verify auto-detection preference order.
The documentation states that auto-detection searches for native binary first (line 136), then Julia version (line 137). However, line 272 in "Key Takeaways" says it "Prefers Julia version (packmol.jl) for better compatibility."
Please clarify and ensure consistency:
- Which version is actually preferred by the auto-detection?
- Update the documentation to be consistent throughout.
🏁 Script executed:
#!/bin/bash
# Find the implementation of find_packmol_executable to verify preference order
echo "=== Searching for find_packmol_executable implementation ==="
ast-grep --pattern 'def find_packmol_executable($$$) {
$$$
}'
echo -e "\n=== Alternative search with rg ==="
rg -n "def find_packmol_executable" -A 20 --type pyLength of output: 1145
Update documentation to match actual auto-detection order.
The notebook documentation incorrectly lists native binary first (line 136), then Julia version (line 137). However, the actual implementation in molify/utils.py:268 prioritizes Julia version first for cross-platform compatibility, then the native binary.
Swap the order in the bulleted list:
- Julia version (
packmol.jl) — preferred for cross-platform compatibility - Native binary (
packmol)
This will align the documentation with the real behavior implemented in find_packmol_executable().
🤖 Prompt for AI Agents
In docs/source/packmol_tools.ipynb around lines 134 to 155, the bulleted
auto-detection order is reversed relative to the implementation: update the list
so **Julia version** (`packmol.jl`) appears first and **Native binary**
(`packmol`) second, and add the note that the Julia version is preferred for
cross-platform compatibility so the documentation matches molify/utils.py:268
behavior.
* build wheels * `build_data['pure_python'] = False` * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * brew install gcc before build * search for gfortran * Handle packmol.exe on Windows platforms - Check for both 'packmol' and 'packmol.exe' after compilation - Windows builds produce packmol.exe, Unix/Linux/macOS produce packmol - Always install as 'packmol' (no extension) for cross-platform consistency 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]> * update * remove `packmol.jl` * Fix macOS deployment target and Windows Unicode error - Set MACOSX_DEPLOYMENT_TARGET=14.0 for macOS wheel compatibility - Remove Unicode checkmark (✓) that causes UnicodeEncodeError on Windows - Add CIBW_BEFORE_ALL_MACOS to install gcc before building 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]> * set wheel dependens * Add PyPI publishing and improve workflow - Remove gcc installation (pre-installed on runners) - Rename build_sdist → make_sdist with fetch-depth: 0 - Add upload_all job for PyPI publishing on releases - Use trusted publishing with pypa/gh-action-pypi-publish - Download all cibw-* artifacts before publishing 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]> * run pytest on all versions * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update build targets * move files / fix tests * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update packmol docs * run notebook * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * set `MACOSX_DEPLOYMENT_TARGET=15.0` * remove `windows-11-arm` tests --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Claude <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/source/index.rst (1)
134-177: Recheck the external Packmol link targetThe narrative and build hook now bundle the Packmol Fortran source under
external/packmol, but thePackmol_reference currently points to thePackmol.jlJulia wrapper on GitHub. Consider pointing this link to the actual upstream Packmol project you vendor (e.g., the Fortran repository) or clarifying that you are specifically referencing the Julia wrapper if that’s intentional.
♻️ Duplicate comments (3)
src/molify/utils.py (2)
258-264: Doc default vs signature: node_size mismatch.The docstring states
node_sizedefault is 3000 (line 260), but the function signature usesnode_size=500(line 234). These should match.Apply this diff to align the docstring:
node_size : int, optional - Size of the nodes. Default is 3000. + Size of the nodes. Default is 500.
325-331: Guard spring_layout weights when bond_order is None.When
weight_by_bond_order=Trueandlayout="spring", the code passesweight="bond_order"tonx.spring_layout. However, edgebond_orderattributes can beNone(as seen inase2networkxoutput when connectivity is distance-based). NetworkX's spring layout expects numeric weights and may error or behave inconsistently withNonevalues.Consider this approach to ensure numeric weights:
if layout == "spring": if weight_by_bond_order: - # Use bond_order as weights - higher order = stronger spring - pos = nx.spring_layout(graph, weight="bond_order", seed=42) + # Ensure all bond_order values are numeric (fallback None to 1.0) + for u, v, data in graph.edges(data=True): + if data.get("bond_order") is None or not isinstance(data.get("bond_order"), (int, float)): + data["bond_order"] = 1.0 + pos = nx.spring_layout(graph, weight="bond_order", seed=42) else: pos = nx.spring_layout(graph, seed=42)tests/test_substructure.py (1)
70-134: Fix “explict” typos in comments for explicit‑hydrogen SMARTS testsIn the explicit‑hydrogen SMARTS section, the comments still read “explict only hydrogens are returned” (Lines 122 and 132). Please correct these to “explicit” (e.g., “only explicit hydrogens are returned”).
This was already flagged in a previous review; updating the comments will avoid confusion.
🧹 Nitpick comments (7)
src/molify/utils.py (1)
31-61: Consider removing the NetworkX fallback.NetworkX is already a hard dependency (imported at line 8), so the
ImportErrorfallback path in this function is unreachable. The manual DFS implementation is well-written but appears to be dead code.If you want to simplify:
def find_connected_components(connectivity: list[tuple[int, int, float]]): - try: - import networkx as nx - - graph = nx.Graph() - for i, j, _ in connectivity: - graph.add_edge(i, j) - for component in nx.connected_components(graph): - yield component - except ImportError: - adjacency = defaultdict(list) - for i, j, _ in connectivity: - adjacency[i].append(j) - adjacency[j].append(i) - - visited = set() - for start in adjacency: - if start in visited: - continue - - component = [] - stack = [start] - while stack: - node = stack.pop() - if node in visited: - continue - visited.add(node) - component.append(node) - stack.extend(n for n in adjacency[node] if n not in visited) - - yield component + graph = nx.Graph() + for i, j, _ in connectivity: + graph.add_edge(i, j) + for component in nx.connected_components(graph): + yield componenttests/test_compress.py (1)
24-58: Freeze test is solid; you can slightly tighten the OH-distance checksThe
freeze_molecules=Truetest nicely checks density, fragment count, and that water geometries are preserved. Currently you only assert one O–H distance per molecule depending on the O index; if you want stronger coverage, you could explicitly identify the O and both H indices and assert both O–H distances againstref_water. Not required, but would make the intent even clearer.hatch_build.py (2)
14-41:_find_gfortranis pragmatic; version selection could be more robust (optional)The approach of preferring plain
gfortranand then scanning common versioned locations is practical. If you ever need to distinguish between multiple installed versions (e.g.,gfortran-9,gfortran-11,gfortran-13), the current lexicographicmatches.sort(reverse=True)may not pick the numerically highest version. Not urgent, but if you hit such setups, consider parsing the suffix as an integer and sorting numerically.
43-116: Minor lint nits: unusedversionargument and long error stringsRuff’s ARG002/TRY003 hints here are mostly cosmetic:
versionininitializeis required by the hook interface but unused (you could rename it to_version), and the multi-line error messages are fine but could be moved into a dedicated exception class if you want to satisfy TRY003. These are optional cleanups and do not affect behavior.src/molify/packmol.py (1)
21-29: Consider Windows.exehandling inget_packmol_binary
get_packmol_binaryalways looks for a file namedbinaries/packmol. On Windows, the compiled executable is typicallypackmol.exe; unless your build hook explicitly names the file without an extension, this will raise the “not found” error even when the binary exists.It’s safer to either:
- Ensure the build hook always produces a consistent cross‑platform filename, or
- Add platform‑aware lookup (e.g., try
packmol.exeon Windows before failing).src/molify/pack.py (1)
64-78: Consider adding a timeout or reusing the higher‑levelrun_packmolhelper
_run_packmolcallssubprocess.runwithout a timeout and letsCalledProcessErrorbubble up directly. For long‑running or stuck Packmol runs, this can hang the caller indefinitely, unlike therun_packmol(..., timeout=...)helper which already centralizes nicer error handling.Longer‑term, it may be worth either:
- Threading an optional
timeoutparameter through to_run_packmol, or- Reusing
molify.packmol.run_packmolhere so Packmol invocation is handled in one place.src/molify/substructure.py (1)
13-170: Robust implementation with comprehensive hydrogen handling.The function correctly handles pattern parsing, atom mapping validation, and three hydrogen inclusion modes. The duplicate detection in "include" mode (lines 134-136) and the map number sorting (lines 99-106) are particularly well-thought-out.
Static analysis suggests extracting long error messages to constants or class-level attributes (TRY003). This is a minor style preference:
# At module level _INVALID_PATTERN_MSG = "Invalid SMARTS/SMILES: {}" _DUPLICATE_MAP_MSG = "Atom map label '{}' is used multiple times" _INVALID_HYDROGENS_MSG = "Invalid value for `hydrogens`: {}. Expected one of 'include', 'exclude', 'isolated'." # Then in the function: raise ValueError(_INVALID_PATTERN_MSG.format(smarts_or_smiles))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (24)
.github/workflows/build-wheels.yaml(1 hunks).github/workflows/pages.yaml(1 hunks).github/workflows/publish.yaml(0 hunks).github/workflows/pytest.yaml(1 hunks).gitmodules(1 hunks)README.md(1 hunks)docs/source/index.rst(1 hunks)docs/source/packmol_tools.ipynb(2 hunks)external/packmol(1 hunks)hatch_build.py(1 hunks)pyproject.toml(4 hunks)src/molify/__init__.py(2 hunks)src/molify/pack.py(6 hunks)src/molify/packmol.py(1 hunks)src/molify/substructure.py(1 hunks)src/molify/utils.py(1 hunks)tests/conftest.py(2 hunks)tests/integration/test_ase2x.py(0 hunks)tests/integration/test_networkx2x.py(0 hunks)tests/test_com.py(1 hunks)tests/test_compress.py(2 hunks)tests/test_solvate.py(4 hunks)tests/test_substructure.py(8 hunks)tests/test_utils.py(0 hunks)
💤 Files with no reviewable changes (4)
- tests/test_utils.py
- tests/integration/test_networkx2x.py
- tests/integration/test_ase2x.py
- .github/workflows/publish.yaml
✅ Files skipped from review due to trivial changes (1)
- external/packmol
🧰 Additional context used
🧬 Code graph analysis (8)
tests/conftest.py (1)
src/molify/pack.py (1)
pack(143-247)
tests/test_solvate.py (2)
src/molify/smiles2x.py (1)
smiles2conformers(64-149)src/molify/pack.py (1)
pack(143-247)
src/molify/utils.py (2)
src/molify/ase2x.py (1)
ase2networkx(97-184)src/molify/rdkit2x.py (1)
rdkit2networkx(61-125)
src/molify/__init__.py (2)
src/molify/substructure.py (2)
group_matches_by_fragment(173-233)match_substructure(13-170)src/molify/utils.py (1)
draw_molecular_graph(230-384)
tests/test_compress.py (1)
src/molify/pack.py (1)
pack(143-247)
tests/test_substructure.py (5)
src/molify/ase2x.py (1)
ase2rdkit(187-220)src/molify/substructure.py (3)
match_substructure(13-170)get_substructures(236-272)group_matches_by_fragment(173-233)src/molify/smiles2x.py (2)
smiles2conformers(64-149)smiles2atoms(33-61)src/molify/pack.py (1)
pack(143-247)tests/conftest.py (3)
ethanol_water(8-12)alanine_dipeptide(16-18)alanine_dipeptide_box(22-25)
src/molify/pack.py (2)
src/molify/packmol.py (1)
get_packmol_binary(8-29)src/molify/utils.py (1)
calculate_box_dimensions(75-84)
src/molify/substructure.py (2)
src/molify/ase2x.py (1)
ase2rdkit(187-220)src/molify/utils.py (1)
find_connected_components(31-61)
🪛 LanguageTool
README.md
[style] ~46-~46: Using many exclamation marks might seem excessive (in this case: 6 exclamation marks for a text that’s 1405 characters long)
Context: ...u like packmol, give it a star on GitHub! ```py from molify import pack, smiles2...
(EN_EXCESSIVE_EXCLAMATION)
🪛 Ruff (0.14.4)
hatch_build.py
36-41: Avoid specifying long messages outside the exception class
(TRY003)
43-43: Unused method argument: version
(ARG002)
56-59: Avoid specifying long messages outside the exception class
(TRY003)
70-70: Starting a process with a partial executable path
(S607)
78-78: subprocess call: check for execution of untrusted input
(S603)
79-79: Starting a process with a partial executable path
(S607)
88-88: Avoid specifying long messages outside the exception class
(TRY003)
96-100: Avoid specifying long messages outside the exception class
(TRY003)
src/molify/utils.py
28-28: Avoid specifying long messages outside the exception class
(TRY003)
182-182: Avoid specifying long messages outside the exception class
(TRY003)
208-208: Consider moving this statement to an else block
(TRY300)
212-216: Avoid specifying long messages outside the exception class
(TRY003)
319-319: Avoid specifying long messages outside the exception class
(TRY003)
336-339: Avoid specifying long messages outside the exception class
(TRY003)
src/molify/packmol.py
24-27: Avoid specifying long messages outside the exception class
(TRY003)
56-56: Avoid specifying long messages outside the exception class
(TRY003)
61-61: subprocess call: check for execution of untrusted input
(S603)
71-75: Avoid specifying long messages outside the exception class
(TRY003)
77-79: Avoid specifying long messages outside the exception class
(TRY003)
93-93: subprocess call: check for execution of untrusted input
(S603)
102-102: __all__ is not sorted
Apply an isort-style sorting to __all__
(RUF022)
src/molify/pack.py
72-72: subprocess call: check for execution of untrusted input
(S603)
237-237: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
src/molify/substructure.py
86-86: Avoid specifying long messages outside the exception class
(TRY003)
95-95: Avoid specifying long messages outside the exception class
(TRY003)
162-165: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
- GitHub Check: pytest (3.10, windows-latest)
- GitHub Check: pytest (3.11, windows-latest)
- GitHub Check: pytest (3.12, windows-latest)
- GitHub Check: Build wheels on windows-11-arm
- GitHub Check: Build wheels on macos-latest
- GitHub Check: Build wheels on macos-15-intel
- GitHub Check: Build wheels on windows-latest
- GitHub Check: pytest (3.13, windows-latest)
- GitHub Check: pytest (3.13, macos-15-intel)
- GitHub Check: Build wheels on ubuntu-latest
- GitHub Check: Build wheels on ubuntu-24.04-arm
- GitHub Check: Build wheels on macos-latest
- GitHub Check: Build wheels on macos-15-intel
- GitHub Check: Build wheels on ubuntu-latest
- GitHub Check: Build wheels on windows-latest
- GitHub Check: Build wheels on windows-11-arm
- GitHub Check: Build wheels on ubuntu-24.04-arm
🔇 Additional comments (36)
.github/workflows/pages.yaml (1)
29-31: LGTM! Checkout action updated and submodules enabled.The upgrade from
actions/checkout@v2tov5brings the workflow up to date, and thesubmodules: recursiveconfiguration correctly enables fetching theexternal/packmolsubmodule introduced in this PR.src/molify/utils.py (7)
18-28: LGTM! Bond order mapping is correct.The function correctly maps numeric bond orders (1.0, 2.0, 3.0, 1.5) to RDKit BondType enums, including proper handling of aromatic bonds (1.5).
64-72: LGTM! Density calculation is correct.The conversion from atomic mass units to kg and from ų to m³ is properly implemented.
75-84: LGTM! Box dimension calculation is correct.The function correctly computes cubic box dimensions from total mass and target density.
87-177: LGTM! Unwrapping algorithm is well-designed.The DFS-based approach to unwrapping molecular structures across periodic boundaries is sound, and the comprehensive docstring with examples is excellent.
180-216: LGTM! Bond determination logic handles edge cases well.The special cases for single-atom ions (alkali metals, halogens) and PF6⁻ are reasonable workarounds for RDKit's bond perception limitations. The multi-charge-state fallback strategy is robust.
219-227: LGTM! Simple conversion wrapper.The function correctly converts a list of SMILES strings to NetworkX graphs via RDKit molecules.
387-495: LGTM! Bond order visualization is well-implemented.The edge drawing logic correctly handles single, double, triple, and aromatic bonds with appropriate line styles and offsets. The perpendicular offset calculation is mathematically sound.
tests/conftest.py (2)
11-11: LGTM! Test updated for default Packmol backend.The removal of the explicit
packmolparameter aligns with the new bundled Packmol approach, where the default behavior uses the built-in binary.
24-24: LGTM! Test updated for default Packmol backend.Consistent with the changes throughout the test suite to rely on the bundled Packmol binary by default.
.gitmodules (1)
1-3: LGTM! Packmol submodule properly configured.The submodule addition points to the official m3g/packmol repository and enables the bundled Packmol integration described throughout this PR.
src/molify/__init__.py (3)
10-16: LGTM! Public API updated with new substructure functionality.The addition of
group_matches_by_fragmentexposes the new fragment-aware matching capability introduced in this PR.
17-20: LGTM! New visualization function exposed.The addition of
draw_molecular_graphto the public API provides users with bond-order-aware molecular graph visualization.
24-47: LGTM! all list updated consistently.The exports correctly reflect the API additions (
group_matches_by_fragment,draw_molecular_graph) and are consistent with the imports.README.md (1)
44-46: LGTM! Documentation accurately reflects bundled Packmol.The updated README correctly states that molify now ships with Packmol, consistent with the submodule integration and build changes in this PR. The acknowledgment of the upstream project is appropriate.
tests/test_com.py (1)
106-108: LGTM! Test updated for default Packmol backend.The removal of the
packmolparameter is consistent with the new default bundled behavior throughout the test suite.tests/test_solvate.py (1)
9-126: LGTM! Tests updated to use bundled Packmol backend.All
pack()calls throughout this test file have been consistently updated to remove the explicitpackmolparameter, relying on the new default bundled binary. Test function names have been appropriately simplified to remove backend-specific references. The test logic and assertions remain unchanged.tests/test_compress.py (1)
7-22: Default-pack backend usage intest_compresslooks goodUsing
molify.pack(..., density=500)without an explicit backend and then validating both pre- and post-compression densities is consistent with the new default/bundled Packmol behavior; the test is clear and appropriately strict..github/workflows/pytest.yaml (1)
24-30: Broader OS matrix is reasonable; just confirm runner/tooling supportThe expanded matrix plus
actions/checkout@v5with recursive submodules is a nice improvement. Given the new hatch build hook compiles Packmol withgfortran/make, please double-check that all configured runners (especiallyubuntu-24.04-arm,macos-15-intel, andwindows-latest) either (a) have the necessary toolchain preinstalled or (b) install it beforeuv sync, and that the runner labels all exist as expected in GitHub Actions.docs/source/index.rst (1)
4-132: Index rewrite is clear and aligned with the current APIThe new introduction, conversion diagrams, and quick-start examples (RDKit↔ASE↔NetworkX, conformer generation, and packing) read well and match the current public API (
rdkit2ase,ase2rdkit,smiles2conformers,pack, etc.). The hidden toctree plus explicit sections give a good high-level overview without overloading the front page.hatch_build.py (1)
90-116: Based on the web search result, Hatchling'sforce_includedocumentation confirms that the mapping keys support both relative and absolute paths. Relative paths are interpreted from the project root, while absolute paths are supported as external sources.Looking at the current code,
target_binaryis constructed as an absolute filesystem path (via Path objects), and the current implementation usesstr(target_binary)as the key. According to Hatchling's documentation, this approach is valid and supported—there is no inherent "path-resolution issue" with absolute paths as the review comment suggests.The suggested refactor to use relative paths would also work correctly, but it is not necessary to avoid issues. The review comment's premise that absolute paths cause problems on different machines or CI environments is not supported by Hatchling's documented behavior.
The review comment mischaracterizes how Hatchling handles
force_includepaths. Absolute paths (which the current code uses) are supported and properly handled by Hatchling's builder. While changing to relative paths would also be correct, the current implementation is not problematic and does not require fixing.Likely an incorrect or invalid review comment.
pyproject.toml (1)
3-79: Project metadata, docs extra, and build hook wiring verified as correctVerification confirms all configuration is properly wired:
molify.packmol:packmol_cliexists atsrc/molify/packmol.py:82and is exportedhatch_build.pyis present with complete logic to compile packmol, place the binary insrc/molify/binaries/, and include it in the wheel viaforce_include- Hatch build hook is properly registered and wheel target correctly specifies
packages = ["src/molify"]for the src layout- Build hook marks the wheel as platform-specific, ensuring correct wheel tags
.github/workflows/build-wheels.yaml (2)
15-30: Overall workflow structure for wheels and sdists looks solidCheckout with recursive submodules, cibuildwheel usage, uv SDist build, and artifact aggregation for PyPI publishing are wired together cleanly and align with the new bundled Packmol setup.
Also applies to: 31-51, 60-68
12-13: The runner labels are valid GitHub-hosted labels and will not fail scheduling.All three runner labels (
ubuntu-24.04-arm,windows-11-arm, andmacos-15-intel) are documented as available GitHub-hosted runner labels in 2025. The review comment's claim that these "do not match the usual GitHub-hosted runner labels" and are "likely to fail scheduling" is incorrect; these are standard, supported labels.Regarding
MACOSX_DEPLOYMENT_TARGET=15.0: This is a deliberate policy choice to target macOS 15+, not a misconfiguration. While it does restrict compatibility to newer macOS versions, this may be intentional based on project requirements and is not a technical error.Likely an incorrect or invalid review comment.
src/molify/packmol.py (1)
82-99: CLI shim is straightforward and aligns with the bundled‑binary designForwarding stdin/stdout/stderr and propagating the Packmol return code makes the
packmolconsole entry behave like the underlying binary, while still benefiting from your bundled distribution. Exportingpackmol_clivia__all__keeps the module surface tidy.Also applies to: 102-102
tests/test_substructure.py (5)
155-215: Hydrogen‑handling tests exercise the new semantics wellThe tests around
hydrogens="exclude","include","isolated"and explicit[H]SMARTS onethanol_molclosely mirror the documented behavior inmatch_substructureand should catch regressions in ordering and hydrogen selection.
219-251: Mapped SMILES + hydrogen filtering coverage looks strongThe
mapped_only=Truetests with and without hydrogens for"C[O:1]"validate both map‑number ordering and hydrogen inclusion/exclusion/isolated behavior. This is a good, targeted regression suite for the mapping logic inmatch_substructure.
268-386: Fragment‑grouping tests align withgroup_matches_by_fragmentdesignThe
test_group_matches_by_fragmentand*_mapped_order_boxcases verify:
- Grouping by disconnected fragments,
- Stable ordering of indices inside groups, and
- Consistent behavior under different mapping orders and hydrogen modes.
These tests nicely pin down the fragment logic and mapped‑atom ordering.
412-450: Good regression test forase2rdkithandling ofNonebond orders
test_match_substructure_ions_with_none_bond_ordersfaithfully reproduces the reported failure mode by injectingNonebond orders intoconnectivityand then asserting thatmatch_substructurestill finds Na⁺, Cl⁻, and water. This should prevent regressions in the newase2rdkitrobustness logic.
452-510: Visualization tests cover the main usage patternsThe
visualize_selected_moleculestests (basic, empty, overlapping, multiple selections, custom alpha) give good smoke coverage of the visualization API and color‑precedence behavior without over‑specifying image internals.src/molify/pack.py (1)
150-177: Bundled‑binary default and improved logging look correctUpdating
pack()to acceptpackmol: str | Noneand default tostr(get_packmol_binary())cleanly switches the public API to the bundled Packmol binary while still allowing a custom executable path. The added logging of the Packmol path whenmixture.<format>is missing should make debugging misconfigurations significantly easier, without changing the happy path behavior.Also applies to: 202-227, 235-242
docs/source/packmol_tools.ipynb (1)
7-12: Notebook now matches the bundled‑Packmol API and examples are consistentThe updated notebook correctly documents that
molifyships with a pre‑compiled Packmol binary and demonstratespack()usage without backend arguments, along with connectivity preservation, mixture creation, RDKit visualization, and density calculation. This aligns well with the newpack()/get_packmol_binary()behavior.Just ensure
molify.utils.calculate_densityis present and public so the density example keeps working as shown.Also applies to: 29-36, 75-86, 219-249, 274-279
src/molify/substructure.py (4)
1-11: LGTM!Imports are well-organized and appropriate for the module's functionality.
173-233: LGTM! Clean grouping logic with proper deduplication.The fragment assignment logic (lines 218-222) correctly assumes that substructure matches are within connected components. The duplicate removal (lines 229-230) preserves order efficiently using the
seenset pattern.
236-272: LGTM! Clean integration between ASE and RDKit.The function correctly bridges ASE and RDKit worlds, with clear documentation that
**kwargsare passed toase2rdkitfor bond detection control.
316-384: LGTM! Well-structured helper functions.The fragment collection logic correctly maps indices (lines 342-358), and the deduplication via canonical SMILES (line 376) is the right approach for identifying structurally identical molecules.
| def initialize(self, version, build_data): | ||
| """Compile packmol before building the wheel.""" | ||
| # Paths | ||
| project_root = Path(self.root) | ||
| packmol_source = project_root / "external" / "packmol" | ||
| binaries_dir = project_root / "src" / "molify" / "binaries" | ||
| target_binary = binaries_dir / "packmol" | ||
|
|
||
| # Ensure binaries directory exists | ||
| binaries_dir.mkdir(parents=True, exist_ok=True) | ||
|
|
||
| # Verify packmol source exists | ||
| if not packmol_source.exists(): | ||
| raise RuntimeError( | ||
| f"Packmol source not found at {packmol_source}. " | ||
| "Run: git submodule update --init --recursive" | ||
| ) | ||
|
|
||
| # Find gfortran compiler | ||
| gfortran_path = self._find_gfortran() | ||
| print(f"Using gfortran: {gfortran_path}") | ||
|
|
||
| # Compile packmol | ||
| print("Compiling packmol...") | ||
| try: | ||
| # Clean previous build artifacts | ||
| subprocess.run( | ||
| ["make", "clean"], | ||
| cwd=packmol_source, | ||
| check=False, # Don't fail if clean fails | ||
| capture_output=True, | ||
| ) | ||
|
|
||
| # Override FORTRAN variable to use detected gfortran | ||
| # Note: Not using -j4 due to Fortran module dependencies | ||
| result = subprocess.run( | ||
| ["make", f"FORTRAN={gfortran_path}"], | ||
| cwd=packmol_source, | ||
| check=True, | ||
| capture_output=True, | ||
| text=True, | ||
| ) | ||
| print(result.stdout) | ||
| except subprocess.CalledProcessError as e: | ||
| print(f"Compilation failed!\nstdout: {e.stdout}\nstderr: {e.stderr}") | ||
| raise RuntimeError(f"Failed to compile packmol: {e}") from e | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Build hook behavior is good but currently hard-requires toolchain on all build platforms
The hook correctly ensures Packmol is built from external/packmol and surfaces useful diagnostics on failure. However, it unconditionally raises if the submodule is missing or gfortran/make compilation fails, which will cause pip/uv installs (and CI on platforms without a Fortran toolchain, especially Windows) to fail outright. If you intend Packmol to be strictly required, this is fine; otherwise, you might want to (a) gate compilation on an env var, (b) skip/soft-fail on unsupported platforms, or (c) provide a clearer, platform-specific message about missing compilers.
🌐 Web query:
How does Hatchling’s build hook lifecycle handle failures in `initialize` for `BuildHookInterface`? Is failing the hook the recommended way to enforce a hard dependency on native tooling, or is there a documented pattern for making such tooling optional?
💡 Result:
Short answer: If your BuildHookInterface.initialize raises an exception, the hook failure aborts the build — initialize runs immediately before each build and unhandled exceptions will stop the build. That makes failing the hook an effective way to force a hard failure, but it’s not the recommended pattern for expressing tooling requirements. Instead you should either declare the tool as a build dependency (so the build frontend will install it) or make the hook detect the tool and (when optional) mutate build_data (e.g. add deps or skip actions). The Hatch build-hook docs show initialize/finalize behavior and the CLI supports build hooks adding build-time dependencies; an existing issue/discussion also demonstrates using hooks to add dependencies for editable builds rather than hard-failing. [1][2][3]
Sources
- Hatch build-hook reference (initialize/finalize semantics). [1]
- Hatch CLI / build command (hooks, --hooks-only / --no-hooks options). [2]
- Issue/discussion demonstrating using build hooks to add dependencies for editable builds (example of making a hook add dependencies rather than relying on hard failures). [3]
I'll verify the codebase to confirm whether there are any fallback mechanisms, optional tooling handling, or build dependency declarations that might affect the severity of this review comment.
---
🏁 Script executed:
```shell
#!/bin/bash
# Check for pyproject.toml and build configuration
fd -e "pyproject.toml" -e "setup.py" -e "setup.cfg" | head -20
Length of output: 0
🏁 Script executed:
#!/bin/bash
# Look for any conditional/optional compilation logic in Python build files
rg -n "optional|windows|platform|skip|soft.fail" --type=py -i $(fd -e "*.py" | grep -E "(build|setup|hatch)" | head -10)Length of output: 2749
🏁 Script executed:
#!/bin/bash
# Check if there's any documentation about platform support or compilation requirements
fd -e "README*" -e "INSTALL*" -e "CONTRIBUTING*" -e "docs" -type f | head -20Length of output: 170
🏁 Script executed:
#!/bin/bash
# Examine pyproject.toml for build dependencies and hook configuration
fd "pyproject.toml" -x cat {}Length of output: 1809
🏁 Script executed:
#!/bin/bash
# Look at the entire hatch_build.py to see if there's any handling we missed
cat hatch_build.pyLength of output: 4322
Hard requirement on Fortran toolchain will fail builds on unsupported platforms (especially Windows)
The hook unconditionally raises on missing gfortran, make, or packmol source, which conflicts with Hatchling's documented patterns for handling native tooling. This will break builds on Windows and any CI environment without a Fortran compiler, despite the codebase showing intent to support multiple platforms (evidenced by the commented-out cibuildwheel config that explicitly notes Windows requires special setup).
Instead of raising unconditionally:
- Declare gfortran/make as build dependencies so the build frontend can install them, OR
- Detect unavailable tooling and mutate
build_datato skip compilation or add conditional dependencies, OR - Gate compilation behind an environment variable for optional builds
See hatch_build.py lines 36–38 (_find_gfortran), 56–58 (submodule check), and 73 (compilation failure) — all currently unconditional raises.
🧰 Tools
🪛 Ruff (0.14.4)
43-43: Unused method argument: version
(ARG002)
56-59: Avoid specifying long messages outside the exception class
(TRY003)
70-70: Starting a process with a partial executable path
(S607)
78-78: subprocess call: check for execution of untrusted input
(S603)
79-79: Starting a process with a partial executable path
(S607)
88-88: Avoid specifying long messages outside the exception class
(TRY003)
| input_path = Path(input_file) | ||
| if not input_path.exists(): | ||
| raise FileNotFoundError(f"Input file not found: {input_path}") | ||
|
|
||
| binary = get_packmol_binary() | ||
|
|
||
| try: | ||
| result = subprocess.run( | ||
| [str(binary)], | ||
| stdin=open(input_path), | ||
| capture_output=True, | ||
| text=True, | ||
| timeout=timeout, | ||
| check=True, | ||
| ) | ||
| return result.stdout + result.stderr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Close the input file used for stdin in run_packmol
run_packmol passes stdin=open(input_path) directly into subprocess.run without closing the file object. This can leak file descriptors if run_packmol is called many times in a process.
Wrap the open call in a context manager:
- try:
- result = subprocess.run(
- [str(binary)],
- stdin=open(input_path),
- capture_output=True,
- text=True,
- timeout=timeout,
- check=True,
- )
+ try:
+ with input_path.open() as fin:
+ result = subprocess.run(
+ [str(binary)],
+ stdin=fin,
+ capture_output=True,
+ text=True,
+ timeout=timeout,
+ check=True,
+ )I'll verify the review comment by examining the actual code and checking for similar patterns in the codebase.
---
🏁 Script executed:
```shell
#!/bin/bash
# Examine the actual code at the specified lines in src/molify/packmol.py
if [ -f "src/molify/packmol.py" ]; then
echo "=== Lines 45-75 of src/molify/packmol.py ==="
sed -n '45,75p' src/molify/packmol.py
echo ""
echo "=== Full run_packmol function ==="
grep -n "def run_packmol" src/molify/packmol.py
else
echo "File not found"
fi
Length of output: 1035
🏁 Script executed:
#!/bin/bash
# Search for similar subprocess.run calls with stdin=open pattern
echo "=== Searching for stdin=open patterns in codebase ==="
rg -n "stdin\s*=\s*open" --type py
echo ""
echo "=== Searching for all subprocess.run calls ==="
rg -n "subprocess\.run" --type py -A 5 -B 2Length of output: 2356
🏁 Script executed:
#!/bin/bash
# Check for imports and context at the top of packmol.py
echo "=== First 40 lines of src/molify/packmol.py ==="
head -40 src/molify/packmol.pyLength of output: 965
Close the file descriptor leak in run_packmol at line 63
The code passes stdin=open(input_path) to subprocess.run() without closing the file. This creates a file descriptor leak when run_packmol is called repeatedly, potentially exhausting system resources. The codebase already demonstrates the correct pattern in src/molify/pack.py (lines 71-77).
Apply the suggested context manager fix:
- try:
- result = subprocess.run(
- [str(binary)],
- stdin=open(input_path),
- capture_output=True,
- text=True,
- timeout=timeout,
- check=True,
- )
+ try:
+ with input_path.open() as fin:
+ result = subprocess.run(
+ [str(binary)],
+ stdin=fin,
+ capture_output=True,
+ text=True,
+ timeout=timeout,
+ check=True,
+ )📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| input_path = Path(input_file) | |
| if not input_path.exists(): | |
| raise FileNotFoundError(f"Input file not found: {input_path}") | |
| binary = get_packmol_binary() | |
| try: | |
| result = subprocess.run( | |
| [str(binary)], | |
| stdin=open(input_path), | |
| capture_output=True, | |
| text=True, | |
| timeout=timeout, | |
| check=True, | |
| ) | |
| return result.stdout + result.stderr | |
| input_path = Path(input_file) | |
| if not input_path.exists(): | |
| raise FileNotFoundError(f"Input file not found: {input_path}") | |
| binary = get_packmol_binary() | |
| try: | |
| with input_path.open() as fin: | |
| result = subprocess.run( | |
| [str(binary)], | |
| stdin=fin, | |
| capture_output=True, | |
| text=True, | |
| timeout=timeout, | |
| check=True, | |
| ) | |
| return result.stdout + result.stderr |
🧰 Tools
🪛 Ruff (0.14.4)
56-56: Avoid specifying long messages outside the exception class
(TRY003)
61-61: subprocess call: check for execution of untrusted input
(S603)
🤖 Prompt for AI Agents
In src/molify/packmol.py around lines 54 to 69, run_packmol opens the input file
with open(input_path) and passes that file object to subprocess.run as stdin
without closing it, causing a file descriptor leak; change it to open the input
file using a context manager (with open(input_path, "r") as f:) and pass that f
to subprocess.run so the file is automatically closed after the call, preserving
the same subprocess.run arguments and returning result.stdout + result.stderr.
| def iter_fragments(atoms: ase.Atoms) -> list[ase.Atoms]: | ||
| """Iterate over connected molecular fragments in an ASE Atoms object. | ||
| If a 'connectivity' field is present in ``atoms.info``, it will be used | ||
| to determine fragments. Otherwise, ``ase.build.separate`` will be used. | ||
| Parameters | ||
| ---------- | ||
| atoms : ase.Atoms | ||
| A structure that may contain one or more molecular fragments. | ||
| Yields | ||
| ------ | ||
| ase.Atoms | ||
| Each connected component (fragment) in the input structure. | ||
| Examples | ||
| -------- | ||
| >>> from molify import smiles2atoms | ||
| >>> from rdkit.Chem import CombineMols | ||
| >>> | ||
| >>> # Create multi-fragment system | ||
| >>> ethanol = smiles2atoms("CCO") | ||
| >>> methanol = smiles2atoms("CO") | ||
| >>> combined = ethanol + methanol | ||
| >>> | ||
| >>> # Iterate over fragments | ||
| >>> fragments = list(iter_fragments(combined)) | ||
| >>> len(fragments) | ||
| 2 | ||
| """ | ||
| if "connectivity" in atoms.info: | ||
| # connectivity is a list of tuples (i, j, bond_type) | ||
| connectivity = atoms.info["connectivity"] | ||
| for component in find_connected_components(connectivity): | ||
| yield atoms[list(component)] | ||
| else: | ||
| for molecule in separate(atoms): | ||
| yield molecule |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix incorrect return type annotation for generator function.
The return type annotation states -> list[ase.Atoms] but the function uses yield (lines 310, 313), making it a generator. This will cause type checking failures.
Apply this diff:
-def iter_fragments(atoms: ase.Atoms) -> list[ase.Atoms]:
+def iter_fragments(atoms: ase.Atoms) -> tp.Generator[ase.Atoms, None, None]:
"""Iterate over connected molecular fragments in an ASE Atoms object.Alternatively, if you want to return a list instead of a generator, replace yield with list accumulation:
def iter_fragments(atoms: ase.Atoms) -> list[ase.Atoms]:
"""Iterate over connected molecular fragments in an ASE Atoms object.
...
- Yields
+ Returns
------
- ase.Atoms
- Each connected component (fragment) in the input structure.
+ list[ase.Atoms]
+ List of connected components (fragments) in the input structure.
...
"""
+ result = []
if "connectivity" in atoms.info:
connectivity = atoms.info["connectivity"]
for component in find_connected_components(connectivity):
- yield atoms[list(component)]
+ result.append(atoms[list(component)])
else:
for molecule in separate(atoms):
- yield molecule
+ result.append(molecule)
+ return resultCommittable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/molify/substructure.py around lines 275 to 313, the function
iter_fragments is annotated as returning list[ase.Atoms] but uses yield, so
update the signature to a generator/iterator type (e.g., Iterator[ase.Atoms] or
Iterable[ase.Atoms]) and add the corresponding import from typing at the top of
the file; alternatively, if you prefer a concrete list return, collect fragments
into a list and return it instead of yielding. Ensure the chosen change is
consistent with callers and type hints elsewhere.
| def visualize_selected_molecules( | ||
| mol: Chem.Mol, | ||
| *args, | ||
| mols_per_row: int = 4, | ||
| sub_img_size: tuple[int, int] = (200, 200), | ||
| legends: list[str] | None = None, | ||
| alpha: float = 0.5, | ||
| ): | ||
| """Visualize molecules with optional atom highlighting. | ||
| If no atom selections are provided, displays the molecule without highlights. | ||
| Duplicate molecular structures will only be plotted once. | ||
| Parameters | ||
| ---------- | ||
| mol : Chem.Mol | ||
| The RDKit molecule object, which may contain multiple fragments. | ||
| *args : list[int] or tuple[int] | ||
| Variable number of lists/tuples containing atom indices to be highlighted. | ||
| Each selection will be assigned a different color from matplotlib's tab10 | ||
| colormap. If no arguments provided, displays the molecule without highlights. | ||
| mols_per_row : int, default=4 | ||
| Number of molecules per row in the grid. | ||
| sub_img_size : tuple[int, int], default=(200, 200) | ||
| Size of each molecule image in pixels. | ||
| legends : list[str], optional | ||
| Custom legends for each molecule. If None, default legends will be used. | ||
| alpha : float, default=0.5 | ||
| Transparency level for the highlighted atoms (0.0 = fully transparent, | ||
| 1.0 = opaque). | ||
| Returns | ||
| ------- | ||
| PIL.Image | ||
| A PIL image object of the grid. | ||
| Examples | ||
| -------- | ||
| >>> from molify import smiles2atoms, ase2rdkit, match_substructure | ||
| >>> from molify import visualize_selected_molecules | ||
| >>> | ||
| >>> # Create and convert molecule | ||
| >>> mol = ase2rdkit(smiles2atoms("Cc1ccccc1")) | ||
| >>> | ||
| >>> # Find aromatic and aliphatic carbons | ||
| >>> aromatic = match_substructure(mol, "c") | ||
| >>> aliphatic = match_substructure(mol, "[C;!c]") | ||
| >>> | ||
| >>> # Flatten matches for visualization | ||
| >>> aromatic_flat = [idx for match in aromatic for idx in match] | ||
| >>> aliphatic_flat = [idx for match in aliphatic for idx in match] | ||
| >>> | ||
| >>> # Visualize with highlights | ||
| >>> visualize_selected_molecules(mol, aromatic_flat, aliphatic_flat) | ||
| <PIL.Image> | ||
| """ | ||
| # Handle empty args case - display molecule without highlights | ||
| if not args: | ||
| img = Draw.MolsToGridImage( | ||
| [mol], | ||
| molsPerRow=mols_per_row, | ||
| subImgSize=sub_img_size, | ||
| legends=legends if legends is not None else ["Molecule 0"], | ||
| ) | ||
| return img | ||
|
|
||
| # Collect highlighted fragments | ||
| candidate_mols, candidate_highlights, candidate_colors = ( | ||
| _collect_highlighted_fragments(mol, args, alpha) | ||
| ) | ||
|
|
||
| if not candidate_mols: | ||
| print("No molecules to draw with the given selections.") | ||
| return None | ||
|
|
||
| # Filter for unique molecules | ||
| mols_to_draw, highlight_lists, highlight_colors = _filter_unique_molecules( | ||
| candidate_mols, candidate_highlights, candidate_colors | ||
| ) | ||
|
|
||
| # Draw the grid | ||
| final_legends = ( | ||
| legends | ||
| if legends is not None | ||
| else [f"Molecule {i}" for i in range(len(mols_to_draw))] | ||
| ) | ||
|
|
||
| img = Draw.MolsToGridImage( | ||
| mols_to_draw, | ||
| molsPerRow=mols_per_row, | ||
| subImgSize=sub_img_size, | ||
| legends=final_legends, | ||
| highlightAtomLists=highlight_lists, | ||
| highlightAtomColors=highlight_colors, | ||
| ) | ||
| return img |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document or handle the None return case.
The docstring states the function returns PIL.Image, but line 460 returns None when no molecules match the selections. This inconsistency could cause issues for callers expecting an image.
Consider one of these approaches:
Option 1: Document the None return (simplest)
Returns
-------
-PIL.Image
- A PIL image object of the grid.
+PIL.Image or None
+ A PIL image object of the grid, or None if no molecules contain
+ any of the selected atoms.Option 2: Raise an exception instead (fail fast)
if not candidate_mols:
- print("No molecules to draw with the given selections.")
- return None
+ raise ValueError(
+ "No molecules contain any of the selected atoms. "
+ "Check that atom indices are valid and correspond to the molecule."
+ )Option 3: Return a blank/placeholder image (always returns an image)
if not candidate_mols:
- print("No molecules to draw with the given selections.")
- return None
+ # Return placeholder image when no matches
+ from PIL import Image, ImageDraw
+ img = Image.new('RGB', sub_img_size, 'white')
+ draw = ImageDraw.Draw(img)
+ draw.text((10, sub_img_size[1]//2), "No matches found", fill='black')
+ return img📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def visualize_selected_molecules( | |
| mol: Chem.Mol, | |
| *args, | |
| mols_per_row: int = 4, | |
| sub_img_size: tuple[int, int] = (200, 200), | |
| legends: list[str] | None = None, | |
| alpha: float = 0.5, | |
| ): | |
| """Visualize molecules with optional atom highlighting. | |
| If no atom selections are provided, displays the molecule without highlights. | |
| Duplicate molecular structures will only be plotted once. | |
| Parameters | |
| ---------- | |
| mol : Chem.Mol | |
| The RDKit molecule object, which may contain multiple fragments. | |
| *args : list[int] or tuple[int] | |
| Variable number of lists/tuples containing atom indices to be highlighted. | |
| Each selection will be assigned a different color from matplotlib's tab10 | |
| colormap. If no arguments provided, displays the molecule without highlights. | |
| mols_per_row : int, default=4 | |
| Number of molecules per row in the grid. | |
| sub_img_size : tuple[int, int], default=(200, 200) | |
| Size of each molecule image in pixels. | |
| legends : list[str], optional | |
| Custom legends for each molecule. If None, default legends will be used. | |
| alpha : float, default=0.5 | |
| Transparency level for the highlighted atoms (0.0 = fully transparent, | |
| 1.0 = opaque). | |
| Returns | |
| ------- | |
| PIL.Image | |
| A PIL image object of the grid. | |
| Examples | |
| -------- | |
| >>> from molify import smiles2atoms, ase2rdkit, match_substructure | |
| >>> from molify import visualize_selected_molecules | |
| >>> | |
| >>> # Create and convert molecule | |
| >>> mol = ase2rdkit(smiles2atoms("Cc1ccccc1")) | |
| >>> | |
| >>> # Find aromatic and aliphatic carbons | |
| >>> aromatic = match_substructure(mol, "c") | |
| >>> aliphatic = match_substructure(mol, "[C;!c]") | |
| >>> | |
| >>> # Flatten matches for visualization | |
| >>> aromatic_flat = [idx for match in aromatic for idx in match] | |
| >>> aliphatic_flat = [idx for match in aliphatic for idx in match] | |
| >>> | |
| >>> # Visualize with highlights | |
| >>> visualize_selected_molecules(mol, aromatic_flat, aliphatic_flat) | |
| <PIL.Image> | |
| """ | |
| # Handle empty args case - display molecule without highlights | |
| if not args: | |
| img = Draw.MolsToGridImage( | |
| [mol], | |
| molsPerRow=mols_per_row, | |
| subImgSize=sub_img_size, | |
| legends=legends if legends is not None else ["Molecule 0"], | |
| ) | |
| return img | |
| # Collect highlighted fragments | |
| candidate_mols, candidate_highlights, candidate_colors = ( | |
| _collect_highlighted_fragments(mol, args, alpha) | |
| ) | |
| if not candidate_mols: | |
| print("No molecules to draw with the given selections.") | |
| return None | |
| # Filter for unique molecules | |
| mols_to_draw, highlight_lists, highlight_colors = _filter_unique_molecules( | |
| candidate_mols, candidate_highlights, candidate_colors | |
| ) | |
| # Draw the grid | |
| final_legends = ( | |
| legends | |
| if legends is not None | |
| else [f"Molecule {i}" for i in range(len(mols_to_draw))] | |
| ) | |
| img = Draw.MolsToGridImage( | |
| mols_to_draw, | |
| molsPerRow=mols_per_row, | |
| subImgSize=sub_img_size, | |
| legends=final_legends, | |
| highlightAtomLists=highlight_lists, | |
| highlightAtomColors=highlight_colors, | |
| ) | |
| return img | |
| def visualize_selected_molecules( | |
| mol: Chem.Mol, | |
| *args, | |
| mols_per_row: int = 4, | |
| sub_img_size: tuple[int, int] = (200, 200), | |
| legends: list[str] | None = None, | |
| alpha: float = 0.5, | |
| ): | |
| """Visualize molecules with optional atom highlighting. | |
| If no atom selections are provided, displays the molecule without highlights. | |
| Duplicate molecular structures will only be plotted once. | |
| Parameters | |
| ---------- | |
| mol : Chem.Mol | |
| The RDKit molecule object, which may contain multiple fragments. | |
| *args : list[int] or tuple[int] | |
| Variable number of lists/tuples containing atom indices to be highlighted. | |
| Each selection will be assigned a different color from matplotlib's tab10 | |
| colormap. If no arguments provided, displays the molecule without highlights. | |
| mols_per_row : int, default=4 | |
| Number of molecules per row in the grid. | |
| sub_img_size : tuple[int, int], default=(200, 200) | |
| Size of each molecule image in pixels. | |
| legends : list[str], optional | |
| Custom legends for each molecule. If None, default legends will be used. | |
| alpha : float, default=0.5 | |
| Transparency level for the highlighted atoms (0.0 = fully transparent, | |
| 1.0 = opaque). | |
| Returns | |
| ------- | |
| PIL.Image or None | |
| A PIL image object of the grid, or None if no molecules contain | |
| any of the selected atoms. | |
| Examples | |
| -------- | |
| >>> from molify import smiles2atoms, ase2rdkit, match_substructure | |
| >>> from molify import visualize_selected_molecules | |
| >>> | |
| >>> # Create and convert molecule | |
| >>> mol = ase2rdkit(smiles2atoms("Cc1ccccc1")) | |
| >>> | |
| >>> # Find aromatic and aliphatic carbons | |
| >>> aromatic = match_substructure(mol, "c") | |
| >>> aliphatic = match_substructure(mol, "[C;!c]") | |
| >>> | |
| >>> # Flatten matches for visualization | |
| >>> aromatic_flat = [idx for match in aromatic for idx in match] | |
| >>> aliphatic_flat = [idx for match in aliphatic for idx in match] | |
| >>> | |
| >>> # Visualize with highlights | |
| >>> visualize_selected_molecules(mol, aromatic_flat, aliphatic_flat) | |
| <PIL.Image> | |
| """ | |
| # Handle empty args case - display molecule without highlights | |
| if not args: | |
| img = Draw.MolsToGridImage( | |
| [mol], | |
| molsPerRow=mols_per_row, | |
| subImgSize=sub_img_size, | |
| legends=legends if legends is not None else ["Molecule 0"], | |
| ) | |
| return img | |
| # Collect highlighted fragments | |
| candidate_mols, candidate_highlights, candidate_colors = ( | |
| _collect_highlighted_fragments(mol, args, alpha) | |
| ) | |
| if not candidate_mols: | |
| print("No molecules to draw with the given selections.") | |
| return None | |
| # Filter for unique molecules | |
| mols_to_draw, highlight_lists, highlight_colors = _filter_unique_molecules( | |
| candidate_mols, candidate_highlights, candidate_colors | |
| ) | |
| # Draw the grid | |
| final_legends = ( | |
| legends | |
| if legends is not None | |
| else [f"Molecule {i}" for i in range(len(mols_to_draw))] | |
| ) | |
| img = Draw.MolsToGridImage( | |
| mols_to_draw, | |
| molsPerRow=mols_per_row, | |
| subImgSize=sub_img_size, | |
| legends=final_legends, | |
| highlightAtomLists=highlight_lists, | |
| highlightAtomColors=highlight_colors, | |
| ) | |
| return img |
| def visualize_selected_molecules( | |
| mol: Chem.Mol, | |
| *args, | |
| mols_per_row: int = 4, | |
| sub_img_size: tuple[int, int] = (200, 200), | |
| legends: list[str] | None = None, | |
| alpha: float = 0.5, | |
| ): | |
| """Visualize molecules with optional atom highlighting. | |
| If no atom selections are provided, displays the molecule without highlights. | |
| Duplicate molecular structures will only be plotted once. | |
| Parameters | |
| ---------- | |
| mol : Chem.Mol | |
| The RDKit molecule object, which may contain multiple fragments. | |
| *args : list[int] or tuple[int] | |
| Variable number of lists/tuples containing atom indices to be highlighted. | |
| Each selection will be assigned a different color from matplotlib's tab10 | |
| colormap. If no arguments provided, displays the molecule without highlights. | |
| mols_per_row : int, default=4 | |
| Number of molecules per row in the grid. | |
| sub_img_size : tuple[int, int], default=(200, 200) | |
| Size of each molecule image in pixels. | |
| legends : list[str], optional | |
| Custom legends for each molecule. If None, default legends will be used. | |
| alpha : float, default=0.5 | |
| Transparency level for the highlighted atoms (0.0 = fully transparent, | |
| 1.0 = opaque). | |
| Returns | |
| ------- | |
| PIL.Image | |
| A PIL image object of the grid. | |
| Examples | |
| -------- | |
| >>> from molify import smiles2atoms, ase2rdkit, match_substructure | |
| >>> from molify import visualize_selected_molecules | |
| >>> | |
| >>> # Create and convert molecule | |
| >>> mol = ase2rdkit(smiles2atoms("Cc1ccccc1")) | |
| >>> | |
| >>> # Find aromatic and aliphatic carbons | |
| >>> aromatic = match_substructure(mol, "c") | |
| >>> aliphatic = match_substructure(mol, "[C;!c]") | |
| >>> | |
| >>> # Flatten matches for visualization | |
| >>> aromatic_flat = [idx for match in aromatic for idx in match] | |
| >>> aliphatic_flat = [idx for match in aliphatic for idx in match] | |
| >>> | |
| >>> # Visualize with highlights | |
| >>> visualize_selected_molecules(mol, aromatic_flat, aliphatic_flat) | |
| <PIL.Image> | |
| """ | |
| # Handle empty args case - display molecule without highlights | |
| if not args: | |
| img = Draw.MolsToGridImage( | |
| [mol], | |
| molsPerRow=mols_per_row, | |
| subImgSize=sub_img_size, | |
| legends=legends if legends is not None else ["Molecule 0"], | |
| ) | |
| return img | |
| # Collect highlighted fragments | |
| candidate_mols, candidate_highlights, candidate_colors = ( | |
| _collect_highlighted_fragments(mol, args, alpha) | |
| ) | |
| if not candidate_mols: | |
| print("No molecules to draw with the given selections.") | |
| return None | |
| # Filter for unique molecules | |
| mols_to_draw, highlight_lists, highlight_colors = _filter_unique_molecules( | |
| candidate_mols, candidate_highlights, candidate_colors | |
| ) | |
| # Draw the grid | |
| final_legends = ( | |
| legends | |
| if legends is not None | |
| else [f"Molecule {i}" for i in range(len(mols_to_draw))] | |
| ) | |
| img = Draw.MolsToGridImage( | |
| mols_to_draw, | |
| molsPerRow=mols_per_row, | |
| subImgSize=sub_img_size, | |
| legends=final_legends, | |
| highlightAtomLists=highlight_lists, | |
| highlightAtomColors=highlight_colors, | |
| ) | |
| return img | |
| def visualize_selected_molecules( | |
| mol: Chem.Mol, | |
| *args, | |
| mols_per_row: int = 4, | |
| sub_img_size: tuple[int, int] = (200, 200), | |
| legends: list[str] | None = None, | |
| alpha: float = 0.5, | |
| ): | |
| """Visualize molecules with optional atom highlighting. | |
| If no atom selections are provided, displays the molecule without highlights. | |
| Duplicate molecular structures will only be plotted once. | |
| Parameters | |
| ---------- | |
| mol : Chem.Mol | |
| The RDKit molecule object, which may contain multiple fragments. | |
| *args : list[int] or tuple[int] | |
| Variable number of lists/tuples containing atom indices to be highlighted. | |
| Each selection will be assigned a different color from matplotlib's tab10 | |
| colormap. If no arguments provided, displays the molecule without highlights. | |
| mols_per_row : int, default=4 | |
| Number of molecules per row in the grid. | |
| sub_img_size : tuple[int, int], default=(200, 200) | |
| Size of each molecule image in pixels. | |
| legends : list[str], optional | |
| Custom legends for each molecule. If None, default legends will be used. | |
| alpha : float, default=0.5 | |
| Transparency level for the highlighted atoms (0.0 = fully transparent, | |
| 1.0 = opaque). | |
| Returns | |
| ------- | |
| PIL.Image | |
| A PIL image object of the grid. | |
| Examples | |
| -------- | |
| >>> from molify import smiles2atoms, ase2rdkit, match_substructure | |
| >>> from molify import visualize_selected_molecules | |
| >>> | |
| >>> # Create and convert molecule | |
| >>> mol = ase2rdkit(smiles2atoms("Cc1ccccc1")) | |
| >>> | |
| >>> # Find aromatic and aliphatic carbons | |
| >>> aromatic = match_substructure(mol, "c") | |
| >>> aliphatic = match_substructure(mol, "[C;!c]") | |
| >>> | |
| >>> # Flatten matches for visualization | |
| >>> aromatic_flat = [idx for match in aromatic for idx in match] | |
| >>> aliphatic_flat = [idx for match in aliphatic for idx in match] | |
| >>> | |
| >>> # Visualize with highlights | |
| >>> visualize_selected_molecules(mol, aromatic_flat, aliphatic_flat) | |
| <PIL.Image> | |
| """ | |
| # Handle empty args case - display molecule without highlights | |
| if not args: | |
| img = Draw.MolsToGridImage( | |
| [mol], | |
| molsPerRow=mols_per_row, | |
| subImgSize=sub_img_size, | |
| legends=legends if legends is not None else ["Molecule 0"], | |
| ) | |
| return img | |
| # Collect highlighted fragments | |
| candidate_mols, candidate_highlights, candidate_colors = ( | |
| _collect_highlighted_fragments(mol, args, alpha) | |
| ) | |
| if not candidate_mols: | |
| raise ValueError( | |
| "No molecules contain any of the selected atoms. " | |
| "Check that atom indices are valid and correspond to the molecule." | |
| ) | |
| # Filter for unique molecules | |
| mols_to_draw, highlight_lists, highlight_colors = _filter_unique_molecules( | |
| candidate_mols, candidate_highlights, candidate_colors | |
| ) | |
| # Draw the grid | |
| final_legends = ( | |
| legends | |
| if legends is not None | |
| else [f"Molecule {i}" for i in range(len(mols_to_draw))] | |
| ) | |
| img = Draw.MolsToGridImage( | |
| mols_to_draw, | |
| molsPerRow=mols_per_row, | |
| subImgSize=sub_img_size, | |
| legends=final_legends, | |
| highlightAtomLists=highlight_lists, | |
| highlightAtomColors=highlight_colors, | |
| ) | |
| return img |
| def visualize_selected_molecules( | |
| mol: Chem.Mol, | |
| *args, | |
| mols_per_row: int = 4, | |
| sub_img_size: tuple[int, int] = (200, 200), | |
| legends: list[str] | None = None, | |
| alpha: float = 0.5, | |
| ): | |
| """Visualize molecules with optional atom highlighting. | |
| If no atom selections are provided, displays the molecule without highlights. | |
| Duplicate molecular structures will only be plotted once. | |
| Parameters | |
| ---------- | |
| mol : Chem.Mol | |
| The RDKit molecule object, which may contain multiple fragments. | |
| *args : list[int] or tuple[int] | |
| Variable number of lists/tuples containing atom indices to be highlighted. | |
| Each selection will be assigned a different color from matplotlib's tab10 | |
| colormap. If no arguments provided, displays the molecule without highlights. | |
| mols_per_row : int, default=4 | |
| Number of molecules per row in the grid. | |
| sub_img_size : tuple[int, int], default=(200, 200) | |
| Size of each molecule image in pixels. | |
| legends : list[str], optional | |
| Custom legends for each molecule. If None, default legends will be used. | |
| alpha : float, default=0.5 | |
| Transparency level for the highlighted atoms (0.0 = fully transparent, | |
| 1.0 = opaque). | |
| Returns | |
| ------- | |
| PIL.Image | |
| A PIL image object of the grid. | |
| Examples | |
| -------- | |
| >>> from molify import smiles2atoms, ase2rdkit, match_substructure | |
| >>> from molify import visualize_selected_molecules | |
| >>> | |
| >>> # Create and convert molecule | |
| >>> mol = ase2rdkit(smiles2atoms("Cc1ccccc1")) | |
| >>> | |
| >>> # Find aromatic and aliphatic carbons | |
| >>> aromatic = match_substructure(mol, "c") | |
| >>> aliphatic = match_substructure(mol, "[C;!c]") | |
| >>> | |
| >>> # Flatten matches for visualization | |
| >>> aromatic_flat = [idx for match in aromatic for idx in match] | |
| >>> aliphatic_flat = [idx for match in aliphatic for idx in match] | |
| >>> | |
| >>> # Visualize with highlights | |
| >>> visualize_selected_molecules(mol, aromatic_flat, aliphatic_flat) | |
| <PIL.Image> | |
| """ | |
| # Handle empty args case - display molecule without highlights | |
| if not args: | |
| img = Draw.MolsToGridImage( | |
| [mol], | |
| molsPerRow=mols_per_row, | |
| subImgSize=sub_img_size, | |
| legends=legends if legends is not None else ["Molecule 0"], | |
| ) | |
| return img | |
| # Collect highlighted fragments | |
| candidate_mols, candidate_highlights, candidate_colors = ( | |
| _collect_highlighted_fragments(mol, args, alpha) | |
| ) | |
| if not candidate_mols: | |
| print("No molecules to draw with the given selections.") | |
| return None | |
| # Filter for unique molecules | |
| mols_to_draw, highlight_lists, highlight_colors = _filter_unique_molecules( | |
| candidate_mols, candidate_highlights, candidate_colors | |
| ) | |
| # Draw the grid | |
| final_legends = ( | |
| legends | |
| if legends is not None | |
| else [f"Molecule {i}" for i in range(len(mols_to_draw))] | |
| ) | |
| img = Draw.MolsToGridImage( | |
| mols_to_draw, | |
| molsPerRow=mols_per_row, | |
| subImgSize=sub_img_size, | |
| legends=final_legends, | |
| highlightAtomLists=highlight_lists, | |
| highlightAtomColors=highlight_colors, | |
| ) | |
| return img | |
| def visualize_selected_molecules( | |
| mol: Chem.Mol, | |
| *args, | |
| mols_per_row: int = 4, | |
| sub_img_size: tuple[int, int] = (200, 200), | |
| legends: list[str] | None = None, | |
| alpha: float = 0.5, | |
| ): | |
| """Visualize molecules with optional atom highlighting. | |
| If no atom selections are provided, displays the molecule without highlights. | |
| Duplicate molecular structures will only be plotted once. | |
| Parameters | |
| ---------- | |
| mol : Chem.Mol | |
| The RDKit molecule object, which may contain multiple fragments. | |
| *args : list[int] or tuple[int] | |
| Variable number of lists/tuples containing atom indices to be highlighted. | |
| Each selection will be assigned a different color from matplotlib's tab10 | |
| colormap. If no arguments provided, displays the molecule without highlights. | |
| mols_per_row : int, default=4 | |
| Number of molecules per row in the grid. | |
| sub_img_size : tuple[int, int], default=(200, 200) | |
| Size of each molecule image in pixels. | |
| legends : list[str], optional | |
| Custom legends for each molecule. If None, default legends will be used. | |
| alpha : float, default=0.5 | |
| Transparency level for the highlighted atoms (0.0 = fully transparent, | |
| 1.0 = opaque). | |
| Returns | |
| ------- | |
| PIL.Image | |
| A PIL image object of the grid. | |
| Examples | |
| -------- | |
| >>> from molify import smiles2atoms, ase2rdkit, match_substructure | |
| >>> from molify import visualize_selected_molecules | |
| >>> | |
| >>> # Create and convert molecule | |
| >>> mol = ase2rdkit(smiles2atoms("Cc1ccccc1")) | |
| >>> | |
| >>> # Find aromatic and aliphatic carbons | |
| >>> aromatic = match_substructure(mol, "c") | |
| >>> aliphatic = match_substructure(mol, "[C;!c]") | |
| >>> | |
| >>> # Flatten matches for visualization | |
| >>> aromatic_flat = [idx for match in aromatic for idx in match] | |
| >>> aliphatic_flat = [idx for match in aliphatic for idx in match] | |
| >>> | |
| >>> # Visualize with highlights | |
| >>> visualize_selected_molecules(mol, aromatic_flat, aliphatic_flat) | |
| <PIL.Image> | |
| """ | |
| # Handle empty args case - display molecule without highlights | |
| if not args: | |
| img = Draw.MolsToGridImage( | |
| [mol], | |
| molsPerRow=mols_per_row, | |
| subImgSize=sub_img_size, | |
| legends=legends if legends is not None else ["Molecule 0"], | |
| ) | |
| return img | |
| # Collect highlighted fragments | |
| candidate_mols, candidate_highlights, candidate_colors = ( | |
| _collect_highlighted_fragments(mol, args, alpha) | |
| ) | |
| if not candidate_mols: | |
| # Return placeholder image when no matches | |
| from PIL import Image, ImageDraw | |
| img = Image.new('RGB', sub_img_size, 'white') | |
| draw = ImageDraw.Draw(img) | |
| draw.text((10, sub_img_size[1]//2), "No matches found", fill='black') | |
| return img | |
| # Filter for unique molecules | |
| mols_to_draw, highlight_lists, highlight_colors = _filter_unique_molecules( | |
| candidate_mols, candidate_highlights, candidate_colors | |
| ) | |
| # Draw the grid | |
| final_legends = ( | |
| legends | |
| if legends is not None | |
| else [f"Molecule {i}" for i in range(len(mols_to_draw))] | |
| ) | |
| img = Draw.MolsToGridImage( | |
| mols_to_draw, | |
| molsPerRow=mols_per_row, | |
| subImgSize=sub_img_size, | |
| legends=final_legends, | |
| highlightAtomLists=highlight_lists, | |
| highlightAtomColors=highlight_colors, | |
| ) | |
| return img |
🤖 Prompt for AI Agents
In src/molify/substructure.py around lines 387 to 482, the function docstring
claims it returns a PIL.Image but the code returns None at line ~460 when no
molecules match; update the docstring and type information to reflect that the
function may return None (i.e., return type Optional[PIL.Image]) and document
the None case explicitly (or alternatively choose to raise a ValueError or
return a placeholder image—if you pick that, implement the chosen behavior and
update the docstring accordingly). Ensure imports/types reflect Optional and
that the docstring describes when None is returned and what callers should do.
for more information, see https://pre-commit.ci
…nto update-docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/test_substructure.py (1)
70-135: Explicit-[H] SMARTS tests are valuable; consider clarifying the comments.The BMIM/BF₄⁻ tests (including the explicit-[H] SMARTS cases) are a strong regression suite for
ase2rdkitandmatch_substructure. The only nit is the comments:
- Lines 113–114 and 133–134 currently say “explicit only hydrogens are returned”, which reads awkwardly and is slightly misleading for the
hydrogens="include"case (heavy atoms are also returned).A clearer wording would be:
- # When pattern has explicit [H], only heavy atoms are returned by default + # When the pattern has explicit [H], only heavy atoms are returned by default @@ - # explicit only hydrogens are returned + # Only hydrogens from the explicit-[H] pattern are returned @@ - # explicit only hydrogens are returned - assert len(match) == 25 # 10 heavy + 15 H + # All heavy atoms plus their explicit hydrogens are returned + assert len(match) == 25 # 10 heavy + 15 H
♻️ Duplicate comments (2)
src/molify/substructure.py (2)
275-313: iter_fragments return type annotation is inconsistent with its generator behavior.
iter_fragmentsusesyieldin both branches, so it’s a generator, but the signature advertises-> list[ase.Atoms]and the docstring still uses a “Yields” section. This will confuse type checkers and readers, and it was already flagged previously.A minimal fix is to adjust the return type (and possibly the docstring heading):
-def iter_fragments(atoms: ase.Atoms) -> list[ase.Atoms]: +def iter_fragments(atoms: ase.Atoms) -> tp.Generator[ase.Atoms, None, None]: @@ - Yields - ------ - ase.Atoms - Each connected component (fragment) in the input structure. + Yields + ------ + ase.Atoms + Each connected component (fragment) in the input structure.Callers like
list(iter_fragments(...))remain valid.
387-482: visualize_selected_molecules docstring should acknowledge the None return.When there are no candidate molecules to draw (no selections intersect any fragment), the function prints a message and returns
None, but the docstring claims it always returns aPIL.Image. This mismatch was noted earlier and can surprise callers.If you keep the current behavior, please update the docstring “Returns” section, e.g.:
- Returns - ------- - PIL.Image - A PIL image object of the grid. + Returns + ------- + PIL.Image or None + A PIL image object of the grid, or None if no molecules contain + any of the selected atoms.Optionally, consider replacing the
ValueErrorif you’d rather fail fast.
🧹 Nitpick comments (2)
src/molify/substructure.py (1)
13-170: match_substructure logic matches the tests; consider dropping the unusednoqa.The SMARTS/SMILES parsing, atom-map validation, map-ordering, and hydrogen handling modes all align with the test expectations and the docstring examples. The
# noqa: C901on Line 13 is no longer needed per Ruff (RUF100: unusednoqa), so you can safely remove it to keep the file lint-clean.src/molify/utils.py (1)
180-217: rdkit_determine_bonds heuristic is useful; consider guarding the MolFromXYZ failure path.The charge-scan loop plus special cases for monoatomic ions and PF₆⁻ are practical. One edge case to consider is when
rdkit.Chem.MolFromXYZBlock(xyz)returnsNone(e.g., malformed XYZ); in that caseDetermineBonds(mol, ...)will raise a non-ValueError exception that isn’t caught, and the finalValueErrorwon’t be reached. A simple defensive check before the charge loop (e.g.,if mol is None: raise ValueError("Failed to parse XYZ block")) would make failures easier to interpret.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (6)
docs/source/index.rst(2 hunks)docs/source/packmol_tools.ipynb(2 hunks)pyproject.toml(4 hunks)src/molify/substructure.py(1 hunks)src/molify/utils.py(1 hunks)tests/test_substructure.py(8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- pyproject.toml
🧰 Additional context used
🧬 Code graph analysis (3)
tests/test_substructure.py (5)
src/molify/ase2x.py (1)
ase2rdkit(187-220)src/molify/substructure.py (4)
match_substructure(13-170)get_substructures(236-272)group_matches_by_fragment(173-233)visualize_selected_molecules(387-482)src/molify/smiles2x.py (2)
smiles2conformers(64-149)smiles2atoms(33-61)src/molify/pack.py (1)
pack(143-247)tests/conftest.py (3)
ethanol_water(8-12)alanine_dipeptide(16-18)alanine_dipeptide_box(22-25)
src/molify/substructure.py (2)
src/molify/ase2x.py (1)
ase2rdkit(187-220)src/molify/utils.py (1)
find_connected_components(31-61)
src/molify/utils.py (2)
src/molify/ase2x.py (1)
ase2networkx(97-184)src/molify/rdkit2x.py (1)
rdkit2networkx(61-125)
🪛 Ruff (0.14.4)
src/molify/substructure.py
13-13: Unused noqa directive (non-enabled: C901)
Remove unused noqa directive
(RUF100)
86-86: Avoid specifying long messages outside the exception class
(TRY003)
95-95: Avoid specifying long messages outside the exception class
(TRY003)
162-165: Avoid specifying long messages outside the exception class
(TRY003)
src/molify/utils.py
28-28: Avoid specifying long messages outside the exception class
(TRY003)
182-182: Avoid specifying long messages outside the exception class
(TRY003)
208-208: Consider moving this statement to an else block
(TRY300)
212-216: Avoid specifying long messages outside the exception class
(TRY003)
230-230: Unused noqa directive (non-enabled: C901)
Remove unused noqa directive
(RUF100)
321-321: Avoid specifying long messages outside the exception class
(TRY003)
338-341: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: pytest (3.12, ubuntu-latest)
- GitHub Check: Build wheels on windows-11-arm
- GitHub Check: Build wheels on windows-latest
- GitHub Check: Build wheels on ubuntu-latest
- GitHub Check: Build wheels on ubuntu-24.04-arm
- GitHub Check: Build wheels on ubuntu-latest
- GitHub Check: Build wheels on windows-11-arm
- GitHub Check: Build wheels on windows-latest
- GitHub Check: Build wheels on ubuntu-24.04-arm
🔇 Additional comments (12)
docs/source/index.rst (1)
4-180: Index page restructure and cross-links look consistent with the new API.The updated overview, quick start, advanced features, and the toctree/doc links (including
smiles2conformers,pack, and thepackmol_tools/ase_tools/rdkit_tools/atom_selectiontargets) are coherent with the current public API and the notebook-based docs layout.docs/source/packmol_tools.ipynb (1)
4-307: Packmol notebook narrative matches the current packing API.The water box and water+ethanol examples, connectivity discussion, and the density analysis via
calculate_densityall line up with the newpackbehavior and the RDKit/ASE-based workflow; this is a good, concrete companion to the updated docs.tests/test_substructure.py (6)
7-56: Core substructure and get_substructures tests are well-aligned with the new API.These tests validate CH₃ matching, multi-molecule boxes, and NO₃ extraction (both single structure and packed box) via
match_substructureandget_substructureswith suggestions. They give good coverage of the RDKit-centric path and atom-index expectations.
58-68: iter_fragments tests correctly cover connectivity-present and -absent cases.Using
remove_connectivityto toggle between explicit connectivity andase.build.separateensures both code paths initer_fragmentsare exercised and confirm the expected 10 water trimers in all cases.
181-261: Hydrogen-handling and mapped-SMILES tests give strong behavioral guarantees.The parametrized tests for
hydrogens="exclude","include", and"isolated"plus the mapped-SMILES cases (with and without hydrogens) tightly pin down the ordering and contents of returned indices, which is crucial for users relying on deterministic mapping semantics.
269-372: Fragment grouping and map-order tests thoroughly exercise grouping logic.The
group_matches_by_fragmenttests with ethanol+water and alanine dipeptide (including error-on-duplicate-label and map-order invariants) robustly validate fragment detection and map-number sorting, both with and without hydrogens. This matches the documented behavior ofgroup_matches_by_fragmentandmatch_substructure(mapped_only=True).
414-452: None bond-order regression test correctly protects the ase2rdkit fix.
test_match_substructure_ions_with_none_bond_ordersaccurately reproduces the reported failure mode by forcing allbond_ordervalues toNoneand then verifyingase2rdkitandmatch_substructurework for Na⁺, Cl⁻, and water. This is an excellent high-value regression test.
459-512: Visualization tests sensibly cover basic usage patterns.The visualization tests (basic, empty selections, overlapping/multiple selections, and varying alpha) assert that
visualize_selected_moleculesreturns a non-None image under valid selections, which is a pragmatic smoke test for the new highlighting interface.src/molify/utils.py (4)
18-62: Connectivity helpers are straightforward and robust.
bond_type_from_orderandfind_connected_componentsare simple and clear; the latter’s NetworkX-first, DFS-fallback approach avoids hard dependency issues while only touching thei, jparts of the connectivity tuples, soNonebond orders are harmless here.
64-178: Density and unwrapping utilities look consistent with ASE conventions.
calculate_densityandcalculate_box_dimensionsfollow ASE units, andunwrap_structures’ DFS over the ase2networkx graph (with image offsets viacell.scaled_positions) matches the documented behavior of unwrapping bonded fragments without mutating the originalatoms. The example in the docstring reflects the implemented algorithm.
219-227: suggestions2networkx is a clean bridge from SMILES to graphs.The function correctly normalizes each SMILES string via RDKit (including
Chem.AddHs) and reusesrdkit2networkx, keeping SMILES-to-graph logic centralized.
389-497: Bond-order edge rendering is well thought out.
_draw_edges_with_bond_orderscorrectly encodes single/double/triple/aromatic bonds as parallel lines with perpendicular offsets, and gracefully falls back to thin or single lines when bond_order isNoneor unknown. The geometry math is simple and guards against zero-length edges.
| def draw_molecular_graph( # noqa: C901 | ||
| graph: nx.Graph, | ||
| layout: Literal["spring", "circular", "kamada_kawai"] = "kamada_kawai", | ||
| figsize: tuple[float, float] = (8, 8), | ||
| node_size: int = 500, | ||
| font_size: Optional[int] = None, | ||
| show_bond_orders: bool = True, | ||
| weight_by_bond_order: bool = True, | ||
| ax: Optional[Axes] = None, | ||
| ) -> Figure: | ||
| """Draw a molecular graph with bond order visualization. | ||
| This function visualizes NetworkX molecular graphs with proper representation | ||
| of chemical bonds. Single, double, triple, and aromatic bonds are displayed | ||
| with appropriate line styles. | ||
| Parameters | ||
| ---------- | ||
| graph : nx.Graph | ||
| NetworkX graph with node attribute 'atomic_number' and edge attribute | ||
| 'bond_order'. Can be created from molify conversion functions like | ||
| ase2networkx() or rdkit2networkx(). | ||
| layout : Literal["spring", "circular", "kamada_kawai"], optional | ||
| Layout algorithm to use for node positioning. Options are: | ||
| * `spring`: Force-directed layout | ||
| * `circular`: Nodes arranged in a circle | ||
| * `kamada_kawai`: Force-directed using Kamada-Kawai algorithm (default) | ||
| figsize : tuple[float, float], optional | ||
| Figure size as (width, height) in inches. Default is (8, 8). | ||
| node_size : int, optional | ||
| Size of the nodes. Default is 500. | ||
| font_size : int, optional | ||
| Font size for node labels (atomic numbers). If None, automatically scaled | ||
| based on node_size (approximately 0.66 * sqrt(node_size)). Default is None. | ||
| show_bond_orders : bool, optional | ||
| If True, visualize different bond orders with multiple parallel lines. | ||
| Default is True. | ||
| weight_by_bond_order : bool, optional | ||
| If True and layout is 'spring', use bond_order as edge weights in the | ||
| spring layout algorithm. Higher bond orders pull atoms closer together. | ||
| This makes double/triple bonds appear shorter than single bonds. | ||
| Default is True. | ||
| ax : Axes, optional | ||
| Matplotlib axes to draw on. If None, creates a new figure. | ||
| Returns | ||
| ------- | ||
| Figure | ||
| Matplotlib figure containing the molecular graph visualization. | ||
| Notes | ||
| ----- | ||
| **Node Colors:** | ||
| - Nodes are colored using Jmol color scheme based on atomic numbers | ||
| - Common colors: Carbon (gray), Hydrogen (white), Oxygen (red), | ||
| Nitrogen (blue) | ||
| **Bond Order Visualization:** | ||
| - Single bond (1.0): 1 solid line | ||
| - Double bond (2.0): 2 parallel solid lines | ||
| - Triple bond (3.0): 3 parallel solid lines | ||
| - Aromatic bond (1.5): 1 solid line + 1 dashed line (parallel) | ||
| - Unknown (None): 1 thin solid line | ||
| Examples | ||
| -------- | ||
| >>> from molify import smiles2atoms, ase2networkx, draw_molecular_graph | ||
| >>> atoms = smiles2atoms("C=C=O") # Ketene | ||
| >>> graph = ase2networkx(atoms) | ||
| >>> fig = draw_molecular_graph(graph, layout='spring') | ||
| >>> # For benzene showing aromatic bonds | ||
| >>> benzene = smiles2atoms("c1ccccc1") | ||
| >>> graph_benzene = ase2networkx(benzene) | ||
| >>> fig = draw_molecular_graph(graph_benzene, show_bond_orders=True) | ||
| """ | ||
| # Calculate font_size and linewidth based on node_size if not provided | ||
| if font_size is None: | ||
| font_size = int(0.66 * np.sqrt(node_size)) | ||
|
|
||
| linewidth = 0.091 * np.sqrt(node_size) | ||
|
|
||
| # Create figure if axes not provided | ||
| if ax is None: | ||
| fig, ax = plt.subplots(figsize=figsize) | ||
| created_fig = True | ||
| else: | ||
| fig_or_none = ax.get_figure() | ||
| if fig_or_none is None: | ||
| raise ValueError("Provided axes is not attached to a figure") | ||
| # Cast to Figure (assuming user provides proper Axes, not SubFigure axes) | ||
| fig = cast(Figure, fig_or_none) | ||
| created_fig = False | ||
|
|
||
| # Compute node positions using the specified layout | ||
| if layout == "spring": | ||
| if weight_by_bond_order: | ||
| # Use bond_order as weights - higher order = stronger spring | ||
| pos = nx.spring_layout(graph, weight="bond_order", seed=42) | ||
| else: | ||
| pos = nx.spring_layout(graph, seed=42) | ||
| elif layout == "circular": | ||
| pos = nx.circular_layout(graph) | ||
| elif layout == "kamada_kawai": | ||
| pos = nx.kamada_kawai_layout(graph) | ||
| else: | ||
| raise ValueError( | ||
| f"Unknown layout '{layout}'. " | ||
| f"Choose from 'spring', 'circular', 'kamada_kawai'." | ||
| ) | ||
|
|
||
| # Draw edges with bond order visualization | ||
| if show_bond_orders: | ||
| _draw_edges_with_bond_orders(graph, pos, ax) | ||
| else: | ||
| nx.draw_networkx_edges(graph, pos, ax=ax, width=2) | ||
|
|
||
| # Get node colors based on atomic numbers using Jmol colors | ||
| node_colors = [] | ||
| for node in graph.nodes(): | ||
| atomic_number = graph.nodes[node].get("atomic_number", 0) | ||
| # jmol_colors is indexed by atomic number (1-indexed for elements) | ||
| if atomic_number > 0 and atomic_number < len(jmol_colors): | ||
| node_colors.append(jmol_colors[atomic_number]) | ||
| else: | ||
| node_colors.append([0.5, 0.5, 0.5]) # Gray for unknown | ||
|
|
||
| # Draw nodes with Jmol colors | ||
| nx.draw_networkx_nodes( | ||
| graph, | ||
| pos, | ||
| node_size=node_size, | ||
| node_color=node_colors, | ||
| edgecolors="black", | ||
| linewidths=linewidth, | ||
| ax=ax, | ||
| ) | ||
|
|
||
| # Draw labels (atomic numbers) | ||
| labels = nx.get_node_attributes(graph, "atomic_number") | ||
| nx.draw_networkx_labels( | ||
| graph, pos, labels=labels, font_size=font_size, font_color="black", ax=ax | ||
| ) | ||
|
|
||
| # Add margins to prevent node clipping at edges | ||
| ax.margins(0.15) | ||
| ax.axis("off") | ||
| fig.tight_layout() | ||
|
|
||
| # Close figure to prevent double display in notebooks | ||
| # (only if we created it; otherwise it's managed externally) | ||
| if created_fig: | ||
| plt.close(fig) | ||
|
|
||
| return fig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
spring layout weighting breaks when bond_order is None; normalize weights before calling NetworkX.
draw_molecular_graph advertises support for graphs from ase2networkx() and rdkit2networkx(), but edges from ase2networkx have bond_order=None. With layout="spring" and weight_by_bond_order=True (the default), this leads to:
pos = nx.spring_layout(graph, weight="bond_order", seed=42)which will pass None weights into NetworkX’s spring layout. Depending on NetworkX version, this is likely to raise a TypeError or produce undefined behavior, and it contradicts the docstring example that uses draw_molecular_graph(ase2networkx(atoms), layout='spring').
To make this robust, normalize weights before calling spring_layout, e.g.:
- if layout == "spring":
- if weight_by_bond_order:
- # Use bond_order as weights - higher order = stronger spring
- pos = nx.spring_layout(graph, weight="bond_order", seed=42)
- else:
- pos = nx.spring_layout(graph, seed=42)
+ if layout == "spring":
+ if weight_by_bond_order:
+ # Build a view with numeric weights: None (or missing) -> 1.0
+ g_w = nx.Graph()
+ g_w.add_nodes_from(graph.nodes(data=True))
+ for u, v, data in graph.edges(data=True):
+ w = data.get("bond_order")
+ w_num = float(w) if isinstance(w, (int, float)) else 1.0
+ g_w.add_edge(u, v, weight=w_num)
+ pos = nx.spring_layout(g_w, weight="weight", seed=42)
+ else:
+ pos = nx.spring_layout(graph, seed=42)Also, Ruff reports the # noqa: C901 on the function signature as unused; you can drop it once complexity is acceptable to your linter.
I'll verify the critical claims in this review comment by examining the codebase.
🏁 Script executed:
#!/bin/bash
# Find and examine ase2networkx and rdkit2networkx implementations
fd -e py -type f | xargs rg -l "def ase2networkx|def rdkit2networkx" 2>/dev/null | head -5Length of output: 212
🏁 Script executed:
#!/bin/bash
# Search for ase2networkx function definition and check bond_order assignment
rg -A 30 "def ase2networkx" --type pyLength of output: 1562
🏁 Script executed:
#!/bin/bash
# Search for rdkit2networkx function definition
rg -A 30 "def rdkit2networkx" --type pyLength of output: 1465
🏁 Script executed:
#!/bin/bash
# Check NetworkX version in dependencies
fd -e txt -o -e toml -o -e cfg | xargs rg -i "networkx|nx" 2>/dev/null | head -20Length of output: 1488
🏁 Script executed:
#!/bin/bash
# Search for tests using draw_molecular_graph with spring layout
rg -B 5 -A 5 "draw_molecular_graph.*spring|spring.*draw_molecular_graph" --type pyLength of output: 672
🏁 Script executed:
#!/bin/bash
# Check if noqa comment C901 is actually unused by examining function complexity
rg -C 10 "def draw_molecular_graph" --type py | head -40Length of output: 1069
🏁 Script executed:
#!/bin/bash
# Verify if there are test cases exercising the specific docstring example
rg -B 2 -A 10 "smiles2atoms.*ase2networkx.*draw_molecular_graph" --type pyLength of output: 795
🏁 Script executed:
#!/bin/bash
# Check what happens when bond_order is None in edge data
rg -B 5 -A 10 "bond_order.*None|None.*bond_order" --type pyLength of output: 27836
🏁 Script executed:
#!/bin/bash
# Look for version pin on networkx
cat pyproject.toml requirements.txt setup.py 2>/dev/null | grep -i networkxLength of output: 45
🏁 Script executed:
#!/bin/bash
# Check if there's actual error handling or validation for bond_order values
rg -B 3 -A 3 "weight_by_bond_order" --type pyLength of output: 1358
#!/usr/bin/env python3
# Test what NetworkX 3.4.2+ does with None values in edge weights
import networkx as nx
# Create a simple graph with None edge weight
g = nx.Graph()
g.add_node(0)
g.add_node(1)
g.add_edge(0, 1, bond_order=None)
try:
# Try to use spring_layout with bond_order as weight
pos = nx.spring_layout(g, weight="bond_order", seed=42)
print("SUCCESS: spring_layout handled None weights without error")
print(f"Positions: {pos}")
except TypeError as e:
print(f"TypeError caught: {e}")
except Exception as e:
print(f"Exception caught ({type(e).__name__}): {e}")🏁 Script executed:
#!/bin/bash
# Search for actual tests that call draw_molecular_graph with spring layout
rg -B 5 -A 10 "draw_molecular_graph" tests/ --type pyLength of output: 0
🏁 Script executed:
#!/bin/bash
# Find _draw_edges_with_bond_orders implementation
rg -B 5 -A 40 "def _draw_edges_with_bond_orders" --type pyLength of output: 2138
🏁 Script executed:
#!/bin/bash
# Check if there are any tests that exercise spring layout with None bond_order
rg -B 10 -A 10 "spring.*ase2networkx|ase2networkx.*spring" tests/ --type pyLength of output: 0
🏁 Script executed:
#!/bin/bash
# Look for any error handling around NetworkX spring_layout calls
rg -B 5 -A 5 "spring_layout" src/ --type pyLength of output: 746
🏁 Script executed:
#!/bin/bash
# Check actual implementations and edge handling in spring layout path
rg -B 2 -A 2 "weight_by_bond_order.*True" src/ --type pyLength of output: 254
🌐 Web query:
NetworkX spring_layout None edge weight values behavior
💡 Result:
Short answer: spring_layout treats the parameter weight as the name of the edge attribute that scales the attractive spring force; if you pass weight=None then the algorithm treats the graph as unweighted (every edge weight = 1). [1][2]
References
- spring_layout docstring (weight: string or None — "If None, then all edge weights are 1"). [1]
- NetworkX release notes / API: "Use weight=None to consider all weights equally (unweighted case)". [2]
🌐 Web query:
NetworkX spring_layout TypeError None values edge attributes weights
💡 Result:
Cause
- spring_layout expects the edge attribute named by weight to contain numeric values (or weight=None to treat all edges as weight=1). If some edges have the attribute but its value is None, the layout code will try numeric operations / comparisons on None and raise a TypeError. See the spring_layout docs which state weight is a string or None and that None means all edge weights are 1. [1][2]
Fixes (pick one)
- Tell spring_layout to ignore edge attributes and use uniform weights:
- nx.spring_layout(G, weight=None)
- Replace missing/None edge weights with numbers before calling spring_layout:
- for u,v,d in G.edges(data=True):
if d.get('weight') is None:
d['weight'] = 1.0
- for u,v,d in G.edges(data=True):
- Or set a default for all edges:
- nx.set_edge_attributes(G, 1.0, 'weight')
References
- NetworkX spring_layout (documentation). [1]
- NetworkX layout source/docs (weight parameter behavior). [2]
🏁 Script executed:
#!/bin/bash
# Check Ruff configuration for C901 (McCabe complexity)
fd -name "ruff.toml" -o -name "pyproject.toml" -o -name ".ruff.toml" | xargs cat 2>/dev/null | grep -A 5 -B 5 "C901\|complexity\|mccabe"Length of output: 230
🏁 Script executed:
#!/bin/bash
# Count cyclomatic complexity indicators in draw_molecular_graph (if, elif, for, etc.)
rg -A 160 "def draw_molecular_graph" src/molify/utils.py | grep -E "^\s+(if|elif|for|while|except)" | wc -lLength of output: 2
Fix broken spring layout with None bond_order values from ase2networkx; normalize weights before calling spring_layout.
The review comment is accurate. When spring_layout expects the edge attribute named by weight to contain numeric values, if some edges have the attribute but its value is None, the layout code will try numeric operations on comparisons on None and raise a TypeError. The docstring example calls draw_molecular_graph(ase2networkx(atoms), layout='spring'), and since ase2networkx() produces edges with bond_order=None, this will fail at runtime with the current code.
The suggested fix—normalizing None weights to 1.0 before passing to spring_layout—is the correct approach. The secondary point about the unused # noqa: C901 comment is less critical but reasonable to address once the complexity is reduced or confirmed acceptable by your linter.
🧰 Tools
🪛 Ruff (0.14.4)
230-230: Unused noqa directive (non-enabled: C901)
Remove unused noqa directive
(RUF100)
321-321: Avoid specifying long messages outside the exception class
(TRY003)
338-341: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In src/molify/utils.py around lines 230-386, the spring_layout call breaks when
some edges have bond_order=None; normalize edge weights before calling
spring_layout by creating a numeric edge attribute (e.g., "spring_weight") where
you set spring_weight = float(bond_order) for numeric bond_order values and
spring_weight = 1.0 for None/unknown, then call nx.spring_layout(graph,
weight="spring_weight", seed=42); ensure you either remove the temporary
attribute after layout or set it on a shallow copy of the graph so original
bond_order values remain unchanged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
.github/workflows/build-wheels.yaml (1)
25-25: Consider normalizing submodule fetch strategy for consistency.The
build_wheelsjob usessubmodules: recursive(line 25) whilebuild_sdistusessubmodules: true(line 46). This inconsistency is unlikely to cause issues, but since the comment flags packmol as critical, both jobs should fetch submodules identically to ensure the SDist includes all dependencies.Update line 46 to match line 25 for clarity:
- uses: actions/checkout@v5 with: fetch-depth: 0 - submodules: true + submodules: recursiveAlso applies to: 46-46
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
.github/workflows/build-wheels.yaml(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: pytest (3.10, windows-latest)
- GitHub Check: pytest (3.11, windows-latest)
- GitHub Check: pytest (3.12, macos-15-intel)
- GitHub Check: pytest (3.12, macos-latest)
- GitHub Check: pytest (3.13, windows-latest)
- GitHub Check: pytest (3.12, windows-latest)
- GitHub Check: Build wheels on windows-11-arm
- GitHub Check: Build wheels on macos-15-intel
- GitHub Check: Build wheels on windows-latest
- GitHub Check: Build wheels on macos-latest
- GitHub Check: Build wheels on ubuntu-latest
- GitHub Check: Build wheels on ubuntu-24.04-arm
🔇 Additional comments (3)
.github/workflows/build-wheels.yaml (3)
3-11: ✓ Critical blocker resolved: release trigger now present.The previous review flagged that the
upload_pypijob was unreachable because the workflow lacked areleasetrigger. This has now been fixed—the workflow correctly triggers onrelease: { types: [published] }, so PyPI uploads will execute when you publish a GitHub Release.
31-31: Verify that macOS 15.0 deployment target is intentional.Setting
MACOSX_DEPLOYMENT_TARGET=15.0(macOS Sequoia, released Sept 2024) is extremely restrictive and will exclude all users on macOS 11–14. This is unusually aggressive for package distribution.Confirm this is intentional and aligns with your project's minimum macOS version requirements. If you need broader compatibility, consider a lower deployment target (e.g., 11.0 or 10.9).
20-20: Verify ARM runner availability in your GitHub environment.The matrix includes
ubuntu-24.04-armandwindows-11-armrunners. These are not standard GitHub-hosted runners; they require either GitHub Large Runners (paid tier) or custom self-hosted runner configuration. If these runners are unavailable, the matrix will fail to expand or jobs will hang waiting for runners.Confirm:
- Are you using GitHub Large Runners?
- Are the runner labels (
ubuntu-24.04-arm,windows-11-arm) correctly registered?- If not needed, consider removing ARM variants to reduce cost/complexity.
WIP
Summary by CodeRabbit
New Features
Documentation
Infrastructure
Tests