Data
Fictor Alimentos
JOX Pipeline
Intelligent ETL pipeline for structured data extraction from agricultural bulletins in PDF using LLMs.
The Problem
Daily agricultural bulletins arrive as unstructured PDFs, so extracting the quotes and news relevant to the business requires manual reading.
The Solution
LLM structured output with automatic retry and repair of invalid JSON, date-based idempotency, complete auditing of every extraction, and a typed Gold layer for BI.
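The retry-and-repair step can be illustrated with a minimal sketch. This is not the pipeline's actual code: `call_llm`, `repair_json`, `extract_with_retry`, and the required keys are hypothetical names, and the repair heuristics (keep the outermost braces, drop trailing commas) are only examples of the kind of fix such a step might apply.

```python
import json
import re
from typing import Callable, Optional


def repair_json(raw: str) -> str:
    """Apply two conservative fixes for common LLM JSON mistakes (illustrative only)."""
    text = raw.strip()
    # Keep only the outermost {...}, discarding prose or fences around it.
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        text = text[start:end + 1]
    # Drop trailing commas before a closing brace or bracket.
    # Heuristic: this could also touch a ",}" that sits inside a string value.
    return re.sub(r",\s*([}\]])", r"\1", text)


def extract_with_retry(
    call_llm: Callable[[str], str],  # hypothetical: prompt in, raw model text out
    prompt: str,
    required_keys: set,
    max_attempts: int = 3,
) -> dict:
    """Call the model until it yields a JSON object containing required_keys."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        raw = call_llm(prompt)
        # Try the raw text first, then the repaired version.
        for candidate in (raw, repair_json(raw)):
            try:
                data = json.loads(candidate)
            except json.JSONDecodeError as exc:
                last_error = exc
                continue
            if isinstance(data, dict) and required_keys.issubset(data):
                return data
            last_error = ValueError("response is missing required keys")
    raise RuntimeError(f"extraction failed after {max_attempts} attempts") from last_error
```

Passing the model call in as a function keeps the sketch independent of any particular LLM client and makes the retry logic easy to test with canned responses.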
Result
Automated daily extraction that is fully idempotent, with structured data in Delta tables ready for BI consumption.
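Date-based idempotency can be sketched as "replace everything for that bulletin date in one transaction", so that re-running a day leaves the table in the same state. The example below uses SQLite from the Python standard library purely as a stand-in for a Delta table; the table name, columns, and sample rows are assumptions made for illustration. On Delta Lake a similar effect is commonly achieved with a partition overwrite or a MERGE keyed on the date.

```python
import sqlite3
from datetime import date


def make_store() -> sqlite3.Connection:
    """Create an in-memory table standing in for the real target table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE quotes (bulletin_date TEXT, commodity TEXT, price REAL)"
    )
    return conn


def write_bulletin(conn: sqlite3.Connection, bulletin_date: date, quotes: list) -> None:
    """Replace all rows for bulletin_date, so reruns do not duplicate data."""
    day = bulletin_date.isoformat()
    with conn:  # single transaction: commits on success, rolls back on error
        conn.execute("DELETE FROM quotes WHERE bulletin_date = ?", (day,))
        conn.executemany(
            "INSERT INTO quotes (bulletin_date, commodity, price) VALUES (?, ?, ?)",
            [(day, commodity, price) for commodity, price in quotes],
        )


def count_rows(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM quotes").fetchone()[0]
```

Running `write_bulletin` twice for the same date with the same rows leaves the row count unchanged, which is the property the date key is meant to guarantee.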
Related Projects
Fictor Alimentos
Data
RPA Suite Fictor
Suite of 50+ RPA pipelines automating critical logistics, supply chain and sales reports for 5 subsidiaries.
Python, Selenium, BeautifulSoup, FastAPI, +1