Describe the Bug
I am attempting to use structured extraction on a French-language site with the following schema:
from pydantic import BaseModel

class ExtractSchema(BaseModel):
    image: str
    product_title: str
    product_description: str
    price: float
    age: str
    ean_or_productcode: str
    brand: str
    format: str
    number_of_players: str
    length_or_width: str
    height: str
    depth: str
    playing_time: str
    mechanisms: str
    price_currency: str
1st Link Fails:
Link 1
Results: 'extract': 'ogLocaleAlternate:|google:notranslate'
2nd Link Successful:
Link 2
Results: 'extract': "ogTitle:Acheter Nexcube 3x3 Classic - MoYu - Casse-têtes|ogDescription:'Avec Nexcube 3x3 Classic, faites tourner les cases de ce Cube jusqu''à ce que chaque côté du cube ait une couleur uniforme. Un casse-tête ergonomique conçu pour la compétition.'|ogImage:https://cdn1.philibertnet.com/517165-large_default/nexcube-3x3-classic.jpg|ogLocaleAlternate:|ogSiteName:Philibert|og:title:Acheter Nexcube 3x3 Classic - MoYu - Casse-têtes|og:site_name:Philibert|og:description:'Avec Nexcube 3x3 Classic, faites tourner les cases de ce Cube jusqu''à ce que chaque côté du cube ait une couleur uniforme. Un casse-tête ergonomique conçu pour la compétition.'|og:type:product|og:image:https://cdn1.philibertnet.com/517165-large_default/nexcube-3x3-classic.jpg|google-site-verification:eOyJ7NyAZOoDK45PX0O9qnGLhUd3ebBikLzZOD7D-Ic"},
To Reproduce
Steps to reproduce the issue:
firecrawl_client.scrape_url(url, params={'formats': ['extract'], 'extract': {'schema': extract_schema}, 'location': {'country': 'FR'}})
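For readability, the same params dictionary can be assembled by a small helper. This is only a sketch: build_extract_params is a hypothetical name, not part of the Firecrawl SDK, and the schema shown is a hand-written JSON Schema fragment standing in for whatever extract_schema holds in the report.

```python
from typing import Any


def build_extract_params(schema: dict[str, Any], country: str) -> dict[str, Any]:
    """Assemble the params dict used in the report (hypothetical helper)."""
    return {
        "formats": ["extract"],
        "extract": {"schema": schema},
        "location": {"country": country},
    }


# Hand-written JSON Schema fragment covering two of the schema fields.
# Assumption: extract_schema in the report is a dict of this general shape.
example_schema = {
    "type": "object",
    "properties": {
        "product_title": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["product_title", "price"],
}

params = build_extract_params(example_schema, "FR")
# The call from the report would then read:
# firecrawl_client.scrape_url(url, params=params)
```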
Expected Behavior
I would expect the LLM to map the French page content onto the English schema fields, given the location param.
If the issue is not the language but rather a mismatch between the site and the schema, that would be good to know as well.
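One way to narrow this down might be to check which schema fields actually come back for each link. The sketch below assumes a working extraction would be returned as a dictionary keyed by schema field names, whereas the results quoted in this report are plain metadata strings. missing_schema_fields is a hypothetical helper, not a Firecrawl API, and the sample dictionary only reuses the title and brand visible in the second result.

```python
from typing import Any

# Field names copied from ExtractSchema in this report.
SCHEMA_FIELDS = [
    "image", "product_title", "product_description", "price", "age",
    "ean_or_productcode", "brand", "format", "number_of_players",
    "length_or_width", "height", "depth", "playing_time", "mechanisms",
    "price_currency",
]


def missing_schema_fields(extract: Any, fields: list[str] = SCHEMA_FIELDS) -> list[str]:
    """Return the schema fields absent (or empty) in an extract result.

    A non-dict result, such as a metadata string, counts as missing
    every field.
    """
    if not isinstance(extract, dict):
        return list(fields)
    return [name for name in fields if extract.get(name) in (None, "")]


# The first result from the report is a string, so every field is missing.
failed = missing_schema_fields("ogLocaleAlternate:|google:notranslate")

# Illustrative dict result (assumed shape) with only two fields filled in.
partial = missing_schema_fields({"product_title": "Nexcube 3x3 Classic", "brand": "MoYu"})
```

If the same fields are missing for both links, the schema is the more likely culprit; if only one link is affected, the page itself is.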
Environment:
- OS: Windows
- Firecrawl Version: [e.g. 1.4.0]