Describe the bug
We currently, and consistently, have to wait a long time (~10 minutes or longer) for even simple files to finish parsing. This is a new phenomenon, and it does not happen when using the vendor multimodal model option (for example GPT-4o).
When looking at the related job details inside cloud.llamaindex.ai, we see the following error messages:
Page 3 [warning] - OCR_ERROR : OCR failed on image /home/user/dist/worker/pipeline/../../../tmp/8c1b928c-9066-44b4-a7cd-795dc09edbe8/img/img_p2_2.png. Details: socket hang up
Page 3 [warning] - OCR_ERROR : OCR failed on image /home/user/dist/worker/pipeline/../../../tmp/8c1b928c-9066-44b4-a7cd-795dc09edbe8/img/img_p2_1.png. Details: socket hang up
Page 6 [warning] - OCR_ERROR : OCR failed on image /home/user/dist/worker/pipeline/../../../tmp/8c1b928c-9066-44b4-a7cd-795dc09edbe8/img/img_p5_5.png. Details: socket hang up
Files
One of the documents we've been testing with is the public Airbnb pitch deck; please find it attached.
Job ID
- 31513454-e41c-41e9-8117-4aed16203ea5
- b7964a02-c44e-435e-99a3-6231d98dff8f
- 5ed77e71-a68b-421b-9856-809569f84311
(and many more)
Client:
- API
Additional context
The problem does not occur in multimodal mode, i.e. when richContent is set and the vendor multimodal fields below are sent.
This is the data we submit to your API via POST:
const formData = createFormData({
  file,
  parsing_instruction,
  do_not_cache: true,
  invalidate_cache: true,
  ...(richContent && {
    use_vendor_multimodal_model: true,
    vendor_multimodal_model_name: 'openai-gpt4o',
    vendor_multimodal_api_key: OPENAI_API_KEY,
  }),
})
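To make it easier to compare what reaches the API in each mode, here is a minimal sketch of the field selection in isolation. `buildParseFields` and its parameter names are hypothetical (they are not part of our codebase or of any LlamaParse SDK), and the file upload itself is left out. Only the field names and values are taken from the snippet above.

```typescript
// Hypothetical helper for illustration only: returns the plain (non-file)
// fields sent with the POST request, so both modes can be compared.
type ParseFields = Record<string, string | boolean>;

function buildParseFields(
  parsingInstruction: string,
  richContent: boolean,
  openAiApiKey: string,
): ParseFields {
  return {
    parsing_instruction: parsingInstruction,
    do_not_cache: true,
    invalidate_cache: true,
    // Same effect as the `...(richContent && {...})` spread above:
    // the vendor fields are only included when richContent is true.
    ...(richContent
      ? {
          use_vendor_multimodal_model: true,
          vendor_multimodal_model_name: 'openai-gpt4o',
          vendor_multimodal_api_key: openAiApiKey,
        }
      : {}),
  };
}

// Placeholder values; the slow OCR path corresponds to richContent = false.
const ocrFields = buildParseFields('Extract all text', false, 'sk-placeholder');
const multimodalFields = buildParseFields('Extract all text', true, 'sk-placeholder');
console.log(Object.keys(ocrFields));
console.log(Object.keys(multimodalFields));
```

In the failing case (richContent is false) only parsing_instruction, do_not_cache and invalidate_cache are sent alongside the file, so the request carries no vendor-specific settings.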