SiblingAssociationHook causes NullPointerException in ESSearchDAO when using standalone MAE consumer #10041

Closed
Masterchen09 opened this issue Mar 13, 2024 · 2 comments
Labels
bug Bug report

Comments

@Masterchen09
Contributor

Masterchen09 commented Mar 13, 2024

Describe the bug
When using the standalone deployment of the MAE consumer, the SiblingAssociationHook causes a NullPointerException in ESSearchDAO when searching for siblings. With the non-standalone deployment, where the MAE consumer is integrated in the GMS, no NullPointerException occurs and everything seems to work fine.

To Reproduce
Steps to reproduce the behavior:

  1. Use the standalone deployment of the MAE consumer
  2. Produce a Metadata Change Log event that is eligible to be processed by the SiblingAssociationHook, e.g. a dataset key aspect. We noticed the issue when restoring the search indices, because the restore produces the corresponding Metadata Change Log events, which are then processed by the SiblingAssociationHook.
  3. The MAE consumer logs a NullPointerException when searching for siblings, because aspectRetriever in ESSearchDAO is null:
2024-03-12 22:18:54,617 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.h.s.SiblingAssociationHook - Urn urn:li:dataset:(urn:li:dataPlatform:XYZ,XYZ,PROD) received by Sibling Hook.
2024-03-12 22:18:54,617 [ThreadPoolTaskExecutor-1] ERROR c.l.m.k.MetadataChangeLogProcessor - Failed to execute MCL hook with name com.linkedin.metadata.kafka.hook.siblings.SiblingAssociationHook
java.lang.NullPointerException: Cannot invoke "com.linkedin.metadata.aspect.AspectRetriever.getEntityRegistry()" because "this.aspectRetriever" is null
	at com.linkedin.metadata.search.elasticsearch.query.ESSearchDAO.lambda$search$2(ESSearchDAO.java:244)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
	at java.base/java.util.AbstractList$RandomAccessSpliterator.forEachRemaining(AbstractList.java:720)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
	at com.linkedin.metadata.search.elasticsearch.query.ESSearchDAO.search(ESSearchDAO.java:245)
	at com.linkedin.metadata.search.elasticsearch.ElasticSearchService.search(ElasticSearchService.java:171)
	at com.linkedin.metadata.search.elasticsearch.ElasticSearchService.search(ElasticSearchService.java:153)
	at com.linkedin.metadata.kafka.hook.siblings.SiblingAssociationHook.handleEntityKeyEvent(SiblingAssociationHook.java:140)
	at com.linkedin.metadata.kafka.hook.siblings.SiblingAssociationHook.invoke(SiblingAssociationHook.java:128)
	at com.linkedin.metadata.kafka.MetadataChangeLogProcessor.consume(MetadataChangeLogProcessor.java:113)
	at jdk.internal.reflect.GeneratedMethodAccessor75.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.springframework.messaging.handler.invocation.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:169)
	at org.springframework.messaging.handler.invocation.InvocableHandlerMethod.invoke(InvocableHandlerMethod.java:119)
	at org.springframework.kafka.listener.adapter.HandlerAdapter.invoke(HandlerAdapter.java:56)
	at org.springframework.kafka.listener.adapter.MessagingMessageListenerAdapter.invokeHandler(MessagingMessageListenerAdapter.java:376)
	at org.springframework.kafka.listener.adapter.RecordMessagingMessageListenerAdapter.onMessage(RecordMessagingMessageListenerAdapter.java:92)
	at org.springframework.kafka.listener.adapter.RecordMessagingMessageListenerAdapter.onMessage(RecordMessagingMessageListenerAdapter.java:53)
	at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doInvokeOnMessage(KafkaMessageListenerContainer.java:2848)
	at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeOnMessage(KafkaMessageListenerContainer.java:2826)
	at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.lambda$doInvokeRecordListener$56(KafkaMessageListenerContainer.java:2744)
	at io.micrometer.observation.Observation.observe(Observation.java:565)
	at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doInvokeRecordListener(KafkaMessageListenerContainer.java:2742)
	at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doInvokeWithRecords(KafkaMessageListenerContainer.java:2595)
	at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeRecordListener(KafkaMessageListenerContainer.java:2481)
	at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeListener(KafkaMessageListenerContainer.java:2123)
	at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeIfHaveRecords(KafkaMessageListenerContainer.java:1478)
	at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollAndInvoke(KafkaMessageListenerContainer.java:1442)
	at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:1313)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)

Expected behavior
The standalone MAE consumer should have the same behavior as the non-standalone MAE consumer integrated in the GMS - i.e. no NullPointerException should be thrown.

Screenshots
not relevant

Desktop (please complete the following information):
not relevant

Additional context

It seems that the aspectRetriever in ESSearchDAO is set as part of the postConstruct method of the ElasticSearchService. The postConstruct method of the ElasticSearchService is called by the postConstruct method of the JavaEntityClient or the postConstruct method of SystemUpdateConfig (see also the @PostConstruct annotation there). The postConstruct method of the JavaEntityClient is called by the postConstruct method of the EntityClientAspectRetriever (there is also a @PostConstruct annotation).

The instance of the EntityClientAspectRetriever is created using the AspectRetrieverFactory. It seems this is only needed/used for the UpdateIndicesServiceFactory, which might only be used as part of the GMS and not the standalone MAE consumer. The SystemUpdateConfig seems only to be relevant for the datahub-upgrade and not the GMS or the standalone MAE consumer.

Long story short - the aspectRetriever in ESSearchDAO is only assigned as a side effect of the initialization of the GMS and also needs to be set for the standalone MAE consumer somehow. Unfortunately I do not know the architecture of the Java stack or the Spring Framework itself well enough to fix it myself - I hope the (pre-)analysis of the issue might help a little bit anyway. ;)
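To make the suspected failure mode concrete, here is a minimal plain-Java sketch of a field that is only assigned as a side effect of another object's initialization. It does not use Spring and is not the DataHub code: `SearchDao`, `SearchService`, `ClientRetriever`, and `Retriever` are hypothetical stand-ins for ESSearchDAO, ElasticSearchService, EntityClientAspectRetriever, and AspectRetriever, and the explicit `postConstruct()` calls only imitate what a `@PostConstruct` callback would do.

```java
public class SideEffectInitDemo {

    /** Hypothetical stand-in for the AspectRetriever dependency. */
    interface Retriever {
        String registryName();
    }

    /** Hypothetical stand-in for ESSearchDAO: the field is assigned after construction. */
    static class SearchDao {
        private Retriever retriever; // stays null until setRetriever is called

        void setRetriever(Retriever retriever) {
            this.retriever = retriever;
        }

        String search(String query) {
            // Throws NullPointerException if setRetriever was never called.
            return retriever.registryName() + ":" + query;
        }
    }

    /** Hypothetical stand-in for ElasticSearchService. */
    static class SearchService {
        private final SearchDao dao;

        SearchService(SearchDao dao) {
            this.dao = dao;
        }

        // Imitates a post-construct step that forwards the retriever to the DAO.
        void postConstruct(Retriever retriever) {
            dao.setRetriever(retriever);
        }

        String search(String query) {
            return dao.search(query);
        }
    }

    /** Hypothetical stand-in for the object that only one deployment creates. */
    static class ClientRetriever implements Retriever {
        private final SearchService service;

        ClientRetriever(SearchService service) {
            this.service = service;
        }

        // Imitates a @PostConstruct callback: wiring happens here, as a side effect.
        void postConstruct() {
            service.postConstruct(this);
        }

        @Override
        public String registryName() {
            return "registry";
        }
    }

    /** "Integrated" wiring: the retriever object exists and initializes the DAO. */
    public static String runIntegrated() {
        SearchService service = new SearchService(new SearchDao());
        new ClientRetriever(service).postConstruct();
        return service.search("siblings");
    }

    /** "Standalone" wiring: nobody creates the retriever, so the DAO field stays null. */
    public static String runStandalone() {
        SearchService service = new SearchService(new SearchDao());
        try {
            return service.search("siblings");
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        System.out.println("integrated: " + runIntegrated());
        System.out.println("standalone: " + runStandalone());
    }
}
```

In this sketch the search works only when something happens to create `ClientRetriever`, which mirrors the analysis above: the search path itself never ensures that its own dependency is set.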

Potential PRs that could have caused this issue, as there were changes related to the constructor of the SiblingAssociationHook: #9626 (or maybe also #9892)

edit:

I totally forgot to mention it: the exception was not thrown with version 0.12.1, but it is thrown with version 0.13.0.

@Masterchen09 Masterchen09 added the bug Bug report label Mar 13, 2024
@Masterchen09
Contributor Author

@david-leifker I would assume your PR #10125 should solve this issue, right?

@Masterchen09
Contributor Author

I have tested it with version 0.13.2; the issue was presumably fixed by #10125.
