feat: use __doc__ as dataset description #3021
Draft
dhairya-pandya wants to merge 2 commits into pgmpy:dev from
Contributor
Pull request overview
This PR adds a human-readable dataset description sourced directly from the dataset class docstring (__doc__) and surfaces it via the Dataset object.
Changes:
- Add description field to Dataset and include it in Dataset.__str__.
- Populate Dataset.description in load_dataset() from target_cls.__doc__ and update docs example.
- Refactor load_model() lookup via a helper and align the “available models” error message + tests.
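
A minimal sketch of the docstring-to-description mechanism listed above, under stated assumptions: `ToyDataset`, `ExampleData`, and `load_toy_dataset` are hypothetical stand-ins and are not pgmpy's API. The sketch also passes the docstring through `inspect.cleandoc` to strip indentation, which the PR summary does not mention.

```python
import inspect
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyDataset:
    """Hypothetical stand-in for the real Dataset container."""

    name: str
    description: Optional[str] = None

    def __str__(self) -> str:
        return f"ToyDataset(name={self.name}, \n description={self.description})"


class ExampleData:
    """
    A small illustrative dataset.

    Used only to show how a docstring becomes a description.
    """

    name = "example"


def load_toy_dataset(target_cls: type) -> ToyDataset:
    # A class without its own docstring has __doc__ set to None,
    # so the description simply stays None in that case.
    doc = target_cls.__doc__
    description = inspect.cleandoc(doc) if doc else None
    return ToyDataset(name=target_cls.name, description=description)


ds = load_toy_dataset(ExampleData)
print(ds)
```

Reading `__doc__` keeps the description next to the class definition, so there is no second copy of the text to keep in sync.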
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pgmpy/datasets/_base.py | Introduces Dataset.description, prints it, and assigns it from __doc__ in load_dataset(). |
| pgmpy/example_models/_base.py | Adds _find_model_class() helper and updates load_model() error handling/message. |
| pgmpy/tests/test_datasets/test_datasets.py | Minor formatting-only change (blank line). |
| pgmpy/tests/test_example_models/test_example_models.py | Updates expected error message string for load_model(). |
| pgmpy/utils/utils.py | Formatting-only change (blank line). |
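
The lookup-helper refactor summarized in the table could look roughly like the sketch below. This is an illustration only: the registry, the class names, and the exact wording of the error message are assumptions, and the real `_find_model_class()` may differ.

```python
from typing import Dict


class Asia:
    name = "asia"


class Cancer:
    name = "cancer"


# Hypothetical registry mapping model names to classes (not pgmpy's own).
_REGISTRY: Dict[str, type] = {cls.name: cls for cls in (Asia, Cancer)}


def find_model_class(name: str) -> type:
    """Return the class registered under `name`, or raise a ValueError."""
    if name not in _REGISTRY:
        # List the valid names so the caller can correct the request.
        available = ", ".join(sorted(_REGISTRY))
        raise ValueError(f"Unknown model {name!r}. Available models: {available}")
    return _REGISTRY[name]
```

Keeping the lookup in one helper means the error message is built in a single place, which makes it easier for the tests to assert on one exact string.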
Comment on lines 138 to 143

    def test_invalid_tag():
        with pytest.raises(ValueError, match="Unrecognized filter argument"):
            list_datasets(is_paraterized=True)  # typo

        with pytest.raises(ValueError, match="Unrecognized filter argument"):
            list_datasets(num_samples=100)  # wrong key name entirely
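
For context, the behavior this test exercises can be sketched as follows. The filter names, the toy catalogue, and `list_toy_datasets` are hypothetical; only the "Unrecognized filter argument" message prefix is taken from the test above.

```python
from typing import Dict, List

# Hypothetical tag names and catalogue, for illustration only.
_ALLOWED_FILTERS = {"is_parameterized", "has_ground_truth"}
_CATALOGUE: Dict[str, Dict[str, bool]] = {
    "toy_a": {"is_parameterized": True, "has_ground_truth": True},
    "toy_b": {"is_parameterized": False, "has_ground_truth": True},
}


def list_toy_datasets(**filters: bool) -> List[str]:
    """Return the names of datasets whose tags match every given filter."""
    unknown = set(filters) - _ALLOWED_FILTERS
    if unknown:
        # Reject misspelled or unsupported keys instead of silently ignoring them.
        raise ValueError(f"Unrecognized filter argument(s): {sorted(unknown)}")
    return [
        name
        for name, tags in _CATALOGUE.items()
        if all(tags[key] == value for key, value in filters.items())
    ]
```

Rejecting unknown keys up front means a misspelled filter fails loudly rather than returning an unfiltered list.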

    data: pd.DataFrame
    expert_knowledge: Optional[ExpertKnowledge] = None
    ground_truth: Optional[DAG] = None
    description: str | None = None

Comment on lines 32 to 37

     def __str__(self) -> str:
         return (
             f"Dataset(name={self.name}, \n data=DataFrame of size: {self.data.shape}, \n "
    -        f"expert_knowledge={self.expert_knowledge}, \n ground_truth={self.ground_truth}, \n tags={self.tags})"
    +        f"expert_knowledge={self.expert_knowledge}, \n ground_truth={self.ground_truth}, \n "
    +        f"description={self.description}, \n tags={self.tags})"
         )

b2ecbbc to 2d652fe
Member
Marking this as a draft until it is ready for review.
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files

Coverage Diff

| Metric | dev | #3021 | +/- |
|---|---|---|---|
| Coverage | 95.64% | 95.64% | |
| Files | 504 | 504 | |
| Lines | 29111 | 29117 | +6 |
| Hits | 27844 | 27850 | +6 |
| Misses | 1267 | 1267 | |
The following checklist is mandatory.
Your PR will be closed if you remove the checklist or do not answer the questions to a satisfactory level. Use of LLMs is strictly forbidden for any part of this checklist (including for improving language), and will result in a ban if we find any use of LLMs.
Your checklist for this pull request
Please answer the following questions:
Did you use an LLM for any assistance with this PR? Please describe in detail (around a paragraph) how and what you used it for?
[Please Answer Here]
What steps have you taken to verify that the changes correctly address the issue? And what edge cases have you considered? Other than running tests, what else have you verified?
[Please Answer Here]
Has the LLM added try-except blocks? They will need to be removed; any error handling must be explicit.
[Please Answer Here]
Have you used an LLM for generating tests? They need to be compressed into a smaller number of tests without reducing coverage.
[Please Answer Here]
Issue number(s) that this pull request fixes
List of changes to the codebase in this pull request
Dataset object as discussed in the PR [ENH]: Add get_reference() for programmatic access to dataset/model citations #2684 inside the load_datase