This project builds a fault-tolerant, resumable data pipeline that scrapes public issue data from the Apache Jira instance and converts it into structured JSONL format suitable for LLM training, data ...
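As an illustration only, the resumable fetch-and-convert loop described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the project's actual implementation. The names `issue_to_record`, `load_checkpoint`, `save_checkpoint`, and `run_pipeline` are hypothetical, as are the checkpoint file layout and the issue field names (`key`, `fields.summary`, `fields.description`, `fields.status.name`). The network call is injected as a `fetch_page(start, page_size)` callable, so the sketch assumes nothing about the scraper's endpoints.

```python
import json
import os
from typing import Callable, Dict, List


def issue_to_record(issue: Dict) -> Dict:
    """Flatten one raw issue dict into a flat record (field layout is assumed)."""
    fields = issue.get("fields") or {}
    return {
        "key": issue.get("key"),
        "summary": fields.get("summary"),
        "description": fields.get("description"),
        "status": (fields.get("status") or {}).get("name"),
    }


def load_checkpoint(path: str) -> int:
    """Return the next offset to fetch, or 0 when no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0
    with open(path, "r", encoding="utf-8") as fh:
        return int(json.load(fh)["next_start"])


def save_checkpoint(path: str, next_start: int) -> None:
    """Write to a temp file, then rename, so a crash cannot leave a partial file."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump({"next_start": next_start}, fh)
    os.replace(tmp, path)


def run_pipeline(
    fetch_page: Callable[[int, int], List[Dict]],  # hypothetical injected fetcher
    out_path: str,
    checkpoint_path: str,
    page_size: int = 50,
) -> int:
    """Fetch pages from the saved offset, append JSONL, checkpoint after each page."""
    start = load_checkpoint(checkpoint_path)
    written = 0
    while True:
        issues = fetch_page(start, page_size)
        if not issues:
            break  # an empty page means there is nothing left to fetch
        with open(out_path, "a", encoding="utf-8") as out:
            for issue in issues:
                record = issue_to_record(issue)
                out.write(json.dumps(record, ensure_ascii=False) + "\n")
        start += len(issues)
        save_checkpoint(checkpoint_path, start)
        written += len(issues)
    return written
```

If a run fails partway through, calling `run_pipeline` again with the same paths continues from the last saved offset. Because the checkpoint is written after the page is appended, a crash between those two steps can repeat one page on resume, so a sketch like this would need downstream deduplication by issue key.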