[ENH] ICA LiNGAM algorithm in casual_discovery#2998
[ENH] ICA LiNGAM algorithm in casual_discovery#2998Jatinbhardwaj-093 wants to merge 11 commits intopgmpy:devfrom
Conversation
Signed-off-by: Jatin Bhardwaj <[email protected]>
Signed-off-by: Jatin Bhardwaj <[email protected]>
Signed-off-by: Jatin Bhardwaj <[email protected]>
Signed-off-by: Jatin Bhardwaj <[email protected]>
23a75bd to
6570834
Compare
Remaining Tasks
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## dev #2998 +/- ##
========================================
Coverage 95.57% 95.57%
========================================
Files 504 506 +2
Lines 29122 29295 +173
========================================
+ Hits 27833 27999 +166
- Misses 1289 1296 +7
|
ankurankan
left a comment
There was a problem hiding this comment.
@Jatinbhardwaj-093 I see that you are also trying to include expert knowledge in the implementation. Did you find a reference paper for it? If not, I would suggest keeping the initial implementation without it.
I initially assumed that expert knowledge was a native feature in pgmpy, which is why I didn’t explore it in depth during the first draft. I will look for relevant reference papers to support its inclusion. If I’m unable to find sufficient backing, I’ll remove or revise this part accordingly. |
|
I was thinking about the variant parameter you suggested. Please correct me if I’m wrong, but my understanding is that this would correspond to different types of LiNGAM algorithms, such as ICA, Direct, VAR, Kernel, and others. If supporting multiple variants is a primary goal, would it make sense to create a single master wrapper class that selects the appropriate LiNGAM implementation based on the variant parameter? Which will have this parameter. |
|
@Jatinbhardwaj-093 In the case of LiNGAM, I think the two main algorithms are Direct and ICA. I don't think there is a significant overlap between the two, so might be better to implement them in separate classes and not have a variant argument. Sorry, if I suggested to add that earlier. |
pgmpy/causal_discovery/LiNGAM.py
Outdated
| # Step 1: Apply an ICA algorithm to obtain a decomposition X = AS where S has | ||
| # the same size as X and contains in its rows the independent components. | ||
| # From here on, we will exclusively work with W = A^-1. | ||
| ica = FastICA(random_state=self.random_state, max_iter=1000) |
There was a problem hiding this comment.
I would suggest taking an instance of the FastICA as the argument itself. This makes the method composable, and users can fine-tune the FastICA hyperparameters without us having to design our method in a way that allows users to specify that.
Signed-off-by: Jatin Bhardwaj <[email protected]>
Signed-off-by: Jatin Bhardwaj <[email protected]>
Signed-off-by: Jatin Bhardwaj <[email protected]>
Signed-off-by: Jatin Bhardwaj <[email protected]>
The following checklist is mandatory.
Your PR will be closed if you remove the checklist or do not answer the questions to a satisfactory level. Use of LLM is strictly forbidden for any part of this checklist (even for improving language).
Your checklist for this pull request
Please answer the following questions:
Yes. I used Gemini 3.1 primarily for assistance in understanding and summarizing parts of the LiNGAM research paper. In particular, it helped clarify
The algorithm for permutation of the weight matrix.
The overall steps of the LiNGAM algorithm.
The statistical pruning methods described in Section 6 of the paper (Wald test and A Chi-Square Test for Evaluating the Overall Fit of the Estimated Model), which are not yet implemented in the current code.
The implementation itself was written by me based on the original paper and my understanding of the algorithm.
The implementation was written based on the LiNGAM paper and verified by running the algorithm on synthetic datasets with known causal structures.
A small number of tests were generated with the help of an AI tool and then manually reviewed. Currently, two simple test cases are included to verify that the discovered causal graph matches the expected structure.
No.
Some initial test scaffolding was generated with the help of an AI tool and then reviewed manually. However, the test suite is not finalized in this PR. A more complete and compact set of tests will be added in a subsequent update.
Issue number(s) that this pull request fixes
List of changes to the codebase in this pull request