Context
FDOT's Professional Services FTP portal contains public procurement records and supporting materials that can be valuable for analysis, proposal strategy, and institutional knowledge. However, the information is spread across archived files in formats that AI-enabled search tools cannot use directly.
Challenge
The source archive included different document types, file structures, and media formats. Before any RAG dashboard or chatbot could be useful, the underlying records needed to be collected, normalized, converted, and organized into a structure suitable for downstream AI processing.
My Role
I led the prototype workflow design and development. I built the data collection and preparation process and defined how procurement materials could be transformed into AI-ready inputs.
Approach
The workflow crawled and downloaded public archive materials, converted PDF files into Markdown for LLM-based context analysis, and transcribed audio files so their contents could be searched as text. The process treated data preparation as the foundation of the AI system rather than an afterthought.
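As a rough illustration of how each downloaded file could be routed to the right preparation step, here is a minimal Python sketch. It is not the prototype's actual code: the extension list, the step names, and the functions plan_step and plan_archive are all hypothetical, and the conversion and transcription tools themselves are left out because the write-up does not name them.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to preparation step.
# The real workflow's file types and tool choices are not stated.
STEP_BY_EXTENSION = {
    ".pdf": "convert_to_markdown",
    ".mp3": "transcribe_audio",
    ".wav": "transcribe_audio",
    ".m4a": "transcribe_audio",
}


def plan_step(remote_path: str) -> str:
    """Return the preparation step for one archived file."""
    suffix = PurePosixPath(remote_path).suffix.lower()
    # Anything unrecognized is kept untouched rather than dropped.
    return STEP_BY_EXTENSION.get(suffix, "keep_as_is")


def plan_archive(remote_paths):
    """Group archive paths by the step that should handle them."""
    plan = {}
    for path in remote_paths:
        plan.setdefault(plan_step(path), []).append(path)
    return plan
```

Planning the steps separately from running them keeps the crawl repeatable: the same plan can be inspected, rerun, or extended with new file types without downloading the archive again.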
Output
The project produced an automated archive collection and preparation workflow. It created structured inputs that could later support retrieval-augmented generation, dashboard search, and chatbot-based procurement information access.
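One way to picture a "structured input" for later retrieval is a set of text chunks, each carrying metadata about where it came from. The Python sketch below is an assumption about what such a record might look like, not the prototype's actual schema: the PreparedChunk fields, the paragraph-based chunking rule, and the 800-character default are all illustrative.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class PreparedChunk:
    # Hypothetical fields; the real schema is not described in the write-up.
    source_path: str  # location of the original file in the archive
    media_type: str   # e.g. "pdf" or "audio"
    chunk_index: int  # position of this chunk within the source document
    text: str         # Markdown or transcript text


def chunk_paragraphs(text: str, max_chars: int = 800):
    """Pack blank-line-separated paragraphs into chunks up to max_chars.

    A single paragraph longer than max_chars is kept whole, not split.
    """
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def to_jsonl(source_path: str, media_type: str, text: str, max_chars: int = 800) -> str:
    """Serialize one prepared document as JSON Lines, one chunk per line."""
    records = [
        PreparedChunk(source_path, media_type, i, chunk)
        for i, chunk in enumerate(chunk_paragraphs(text, max_chars))
    ]
    return "\n".join(json.dumps(asdict(r)) for r in records)
```

Keeping the source path and media type on every chunk would let a dashboard or chatbot cite the original procurement record alongside any retrieved passage.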
Impact
The prototype turned a static public archive into a more usable knowledge source. It established the data foundation needed for future AI applications in transportation procurement analysis and business development workflows.