[Bug] Structured Extract with French Language #870

Closed as not planned
@KuriaMaingi

Description

Describe the Bug
I am attempting to use structured extraction on a French-language site with the following schema:

from pydantic import BaseModel

# Fields requested from the product page
class ExtractSchema(BaseModel):
    image: str
    product_title: str
    product_description: str
    price: float
    age: str
    ean_or_productcode: str
    brand: str
    format: str
    number_of_players: str
    length_or_width: str
    height: str
    depth: str
    playing_time: str
    mechanisms: str
    price_currency: str

1st Link Fails:
Link 1
Results: 'extract': 'ogLocaleAlternate:|google:notranslate'

2nd Link Successful:

Link 2
Results: 'extract': "ogTitle:Acheter Nexcube 3x3 Classic - MoYu - Casse-têtes|ogDescription:'Avec Nexcube 3x3 Classic, faites tourner les cases de ce Cube jusqu''à ce que chaque côté du cube ait une couleur uniforme. Un casse-tête ergonomique conçu pour la compétition.'|ogImage:https://cdn1.philibertnet.com/517165-large_default/nexcube-3x3-classic.jpg|ogLocaleAlternate:|ogSiteName:Philibert|og:title:Acheter Nexcube 3x3 Classic - MoYu - Casse-têtes|og:site_name:Philibert|og:description:'Avec Nexcube 3x3 Classic, faites tourner les cases de ce Cube jusqu''à ce que chaque côté du cube ait une couleur uniforme. Un casse-tête ergonomique conçu pour la compétition.'|og:type:product|og:image:https://cdn1.philibertnet.com/517165-large_default/nexcube-3x3-classic.jpg|google-site-verification:eOyJ7NyAZOoDK45PX0O9qnGLhUd3ebBikLzZOD7D-Ic"},

To Reproduce
Steps to reproduce the issue:
firecrawl_client.scrape_url(url, params={'formats': ['extract'], 'extract': {'schema': extract_schema}, 'location': {'country': 'FR'}})

Expected Behavior
Given the location param, I would expect the LLM to be able to translate between the site's French content and the English field names in the schema.

If the issue isn't the language but rather a mismatch between the site and the schema, that would be good to know as well.
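One way to narrow this down is to check what shape the returned 'extract' value has before looking at language at all. Both results quoted above appear to be pipe-delimited page metadata strings rather than objects keyed by the schema fields. The sketch below is a hypothetical diagnostic (diagnose_extract and EXPECTED_FIELDS are illustrative names, not part of the Firecrawl SDK) and assumes a well-formed structured result would be a dict keyed by the schema's field names.

```python
# Hypothetical diagnostic helper; not part of the Firecrawl SDK.
# Field names are copied from ExtractSchema in this report.
EXPECTED_FIELDS = [
    "image", "product_title", "product_description", "price", "age",
    "ean_or_productcode", "brand", "format", "number_of_players",
    "length_or_width", "height", "depth", "playing_time", "mechanisms",
    "price_currency",
]


def diagnose_extract(extract, expected=EXPECTED_FIELDS):
    """Classify an 'extract' value and list the schema fields it lacks."""
    if isinstance(extract, str):
        # A plain string (such as the metadata dumps quoted above) carries
        # none of the schema fields as keys.
        return {"kind": "metadata_string", "missing": list(expected)}
    if isinstance(extract, dict):
        # Treat absent, None, and empty-string values as missing.
        missing = [name for name in expected if extract.get(name) in (None, "")]
        kind = "complete" if not missing else "partial"
        return {"kind": kind, "missing": missing}
    return {"kind": "unexpected_type", "missing": list(expected)}


# The failing result from the first link, as quoted in this report.
failing = "ogLocaleAlternate:|google:notranslate"
print(diagnose_extract(failing)["kind"])
```

If both links come back as "metadata_string", the difference between them is more likely in what the page exposes than in translation. A "partial" result with many missing fields would point toward a site vs. schema mismatch instead.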

Environment (please complete the following information):

  • OS: [Windows]
  • Firecrawl Version: [e.g. 1.4.0]

Metadata

Labels

blocked, bug (Something isn't working), question (Further information is requested)
