(feat/extract) New re-ranker + multi entity extraction #1061

Merged
nickscamara merged 53 commits into main from nsc/semantic-index-extract
Jan 14, 2025
@nickscamara
Member

No description provided.

@nickscamara nickscamara merged commit 5e5b5ee into main Jan 14, 2025
1 check failed
timoa pushed a commit to timoa/firecrawl that referenced this pull request Feb 2, 2025
* agent that decides if splits schema or not

* split and merge properties done

* wip

* wip

* changes

* ch

* array merge working!

* comment

* wip

* dereferentiate schema

* dereference schemas

* Nick: new re-ranker

* Create llm-links.txt

* Nick: format

* Update extraction-service.ts

* wip: cooking schema mix and spread functions

* wip

* wip getting there!!!

* nick:

* moved functions to helpers

* nick:

* cant reproduce the error anymore

* error handling all scrapes failed

* fix

* Nick: added the sitemap index

* Update sitemap-index.ts

* Update map.ts

* deduplicate and merge arrays

* added error handler for object transformations

* Update url-processor.ts

* Nick:

* Nick: fixes

* Nick: big improvements to rerank of multi-entity

* Nick: working

* Update reranker.ts

* fixed transformations for nested objs

* fix merge nulls

* Nick: fixed error piping

* Update queue-worker.ts

* Update extraction-service.ts

* Nick: format

* Update queue-worker.ts

* Update pnpm-lock.yaml

* Update queue-worker.ts

---------

Co-authored-by: rafaelmmiller <[email protected]>
Co-authored-by: Thomas Kosmas <[email protected]>
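
Several of the commits above mention merging partial extraction results ("deduplicate and merge arrays", "fix merge nulls", "fixed transformations for nested objs"). The PR has no description, so the following is only a minimal sketch of what such a merge step could look like, not the code from this PR. The names `Json`, `isPlainObject`, and `mergeResults` are hypothetical, and the policy for conflicting scalar values (keep the first one) is an assumption.

```typescript
// Hypothetical sketch: NOT taken from the PR's source files.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function isPlainObject(value: Json | undefined): value is { [key: string]: Json } {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Merge two partial extraction results into one.
function mergeResults(a: Json | undefined, b: Json | undefined): Json {
  // Null or missing values never overwrite real data.
  if (a === null || a === undefined) return b ?? null;
  if (b === null || b === undefined) return a;

  // Arrays are concatenated, then deduplicated by their serialized form.
  // Note: objects that differ only in key order are treated as distinct.
  if (Array.isArray(a) && Array.isArray(b)) {
    const seen = new Set<string>();
    const out: Json[] = [];
    for (const item of [...a, ...b]) {
      const key = JSON.stringify(item);
      if (!seen.has(key)) {
        seen.add(key);
        out.push(item);
      }
    }
    return out;
  }

  // Objects are merged key by key, recursing into nested values.
  if (isPlainObject(a) && isPlainObject(b)) {
    const out: { [key: string]: Json } = { ...a };
    for (const key of Object.keys(b)) {
      out[key] = mergeResults(a[key], b[key]);
    }
    return out;
  }

  // Conflicting scalars or mismatched types: keep the first value (assumed policy).
  return a;
}

// Usage: two partial results for the same entity.
const merged = mergeResults(
  { name: "Acme", tags: ["a", "b"], ceo: null },
  { name: null, tags: ["b", "c"], ceo: "Jane" },
);
console.log(JSON.stringify(merged));
// prints {"name":"Acme","tags":["a","b","c"],"ceo":"Jane"}
```

Deduplicating by `JSON.stringify` keeps the sketch short; a real implementation would likely need a key-order-insensitive comparison or a domain-specific identity key.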
3 participants