Input
Drop a PDF here or click to upload
PDF only, max 10MB
Output
Parsed results will appear here
| # | Description | Qty | Unit | Specs | Mfr |
|---|
How Frontier works — Architecture & Technical Deep-Dive
The Problem
Industrial procurement still relies heavily on PDF-based RFQs sent via email. A single RFQ can contain dozens of line items with part numbers, specifications, quantities, and delivery requirements — all in inconsistent formats across different companies. Procurement teams spend hours manually transcribing this data into quoting systems, introducing errors that can cascade into costly mistakes. A unit-of-measure mismatch on a valve order can mean a $50,000 error.
The Solution
This tool uses AI to parse unstructured RFQ documents into clean, structured data in seconds. It handles varying formats, extracts metadata and line items, and returns everything in a standardized schema ready for downstream systems.
Architecture
Browser (upload PDF) │ ▼ FastAPI Backend (Python) ├── Validate: file type, size, magic bytes ├── Extract text: pdfplumber (in memory, never stored) ├── Validate: page count, text length │ ▼ LLM API (NLP Extraction) ├── System prompt anchored against injection ├── Structured extraction with JSON schema ├── Response validation before returning │ ▼ Browser (render table + CSV export)
Key Technical Decisions
- Python + FastAPI — pdfplumber is the best PDF text extraction library available, and it's Python-only. FastAPI gives us async handling and auto-generated API docs.
- Stateless design — No database, no user accounts, no file storage. PDFs are processed in memory and immediately discarded. This eliminates entire categories of security vulnerabilities.
- Prompt engineering — The system prompt is anchored to treat all document text as data, not instructions. This mitigates prompt injection via malicious PDF content.
- Schema validation — The AI response is validated against an expected JSON structure before being returned to the frontend. Malformed responses are rejected.
Security
- PDF magic byte validation + 10MB file size limit
- 50-page and 100KB text extraction limits
- 30-second processing timeout
- Rate limiting: 10 requests/hour per IP, 100/day global
- API keys server-side only, never exposed to browser
- CORS restricted to demo subdomain
About
Built by Andres Tobacia — Industrial Engineer with 17+ years in manufacturing, supply chain, and space exploration. This demo combines deep domain expertise in industrial procurement with applied AI to solve a real operational pain point.
Uploaded files are processed in memory and immediately discarded. No data is stored. AI processing powered by a large language model API. No training is performed on uploaded data.