AI Job Search Assistant: From PDFs to Career Intelligence

What this project is

AI Job Search Assistant is a TypeScript CLI that turns job posting PDFs and a resume PDF into structured career intelligence. Instead of manually reading postings and guessing what employers want, the tool extracts requirements, aggregates market signals, compares them against a resume, and writes reports that can be reused during applications.

The project is built around a practical workflow: drop PDFs into a folder, run CLI commands, and get Markdown plus JSON outputs that explain market demand, skill gaps, and a readiness score. Application guidance is the job of the planned advise command.

Pipeline design

The pipeline has three main stages.

First, the market command reads job posting PDFs, extracts structured text, asks an LLM to convert that text into a validated job schema, optionally enriches the company with Tavily search data, then aggregates skills, salary ranges, remote-work patterns, and experience levels.
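The aggregation step at the end of the market command can be sketched roughly as follows. This is a minimal illustration, not the project's code: the JobPosting shape and the aggregateSkills function are hypothetical names, and the real schema likely carries more fields (salary, experience level, and so on).

```typescript
// Hypothetical shape of one validated posting; the real schema is not shown in this article.
interface JobPosting {
  title: string;
  skills: string[];
  remote: boolean;
}

interface SkillDemand {
  skill: string;
  postings: number; // how many postings mention the skill
  share: number;    // fraction of all postings, 0..1
}

// Count each skill at most once per posting, then rank by frequency.
function aggregateSkills(postings: JobPosting[]): SkillDemand[] {
  const counts = new Map<string, number>();
  for (const posting of postings) {
    // Normalize so "TypeScript" and " typescript" are counted together.
    const unique = new Set(posting.skills.map((s) => s.trim().toLowerCase()));
    unique.forEach((skill) => {
      if (skill !== "") counts.set(skill, (counts.get(skill) ?? 0) + 1);
    });
  }
  const total = postings.length;
  return Array.from(counts.entries())
    .map(([skill, n]) => ({ skill, postings: n, share: total === 0 ? 0 : n / total }))
    // Most-demanded first; ties are broken alphabetically for stable output.
    .sort((a, b) => b.postings - a.postings || a.skill.localeCompare(b.skill));
}
```

Counting a skill once per posting keeps a single keyword-stuffed posting from dominating the ranking, which matters when the input set is only a handful of PDFs.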

Second, the gaps command reads a resume PDF and compares the resume against the generated market analysis. It produces a readiness score, strengths, gaps, quick wins, medium-term priorities, and longer-term recommendations.
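The article does not spell out how the readiness score is calculated, so the sketch below is only one plausible approach: demand-weighted coverage, where a skill that appears in more postings counts for more. The compareResume function, its input shapes, and the formula itself are all assumptions made for illustration.

```typescript
// Hypothetical inputs: market demand per skill (share of postings, 0..1) and resume skills.
interface SkillDemand {
  skill: string;
  share: number;
}

interface GapReport {
  readiness: number;   // 0..100, demand-weighted coverage (an assumed formula)
  strengths: string[]; // in-demand skills found on the resume
  gaps: string[];      // in-demand skills missing from the resume, highest demand first
}

function compareResume(market: SkillDemand[], resumeSkills: string[]): GapReport {
  const have = new Set(resumeSkills.map((s) => s.trim().toLowerCase()));
  // Copy before sorting so the caller's array is left untouched.
  const ranked = market.slice().sort((a, b) => b.share - a.share);
  let covered = 0;
  let total = 0;
  const strengths: string[] = [];
  const gaps: string[] = [];
  ranked.forEach(({ skill, share }) => {
    total += share;
    if (have.has(skill.trim().toLowerCase())) {
      covered += share;
      strengths.push(skill);
    } else {
      gaps.push(skill);
    }
  });
  const readiness = total === 0 ? 0 : Math.round((covered / total) * 100);
  return { readiness, strengths, gaps };
}
```

Because the gaps come out ordered by demand, the first few entries are natural candidates for "quick wins", while the tail maps to longer-term recommendations.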

Third, the advise command is planned as the application-specific layer. It will take one target job and generate resume adaptation guidance, cover letter direction, interview preparation, and fit analysis.

Why validation matters

The important engineering choice is treating LLM output as untrusted data. Every model response is validated with Zod before it is saved. If a model returns malformed JSON, misses required fields, or invents an unusable structure, the pipeline rejects that output instead of silently carrying bad data into the final report.
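To make the "untrusted data" idea concrete, here is a minimal sketch of a parse-then-validate gate. In the project this role is played by a Zod schema; the hand-written checks below are a dependency-free stand-in so the example runs on its own. The Job shape and the parseJob function are hypothetical.

```typescript
// Hypothetical, simplified job shape; the project's real Zod schema is not shown in this article.
interface Job {
  title: string;
  company: string;
  skills: string[];
}

type ParseResult =
  | { ok: true; job: Job }
  | { ok: false; errors: string[] };

// Treat the raw model response as untrusted: parse it, then check every field.
function parseJob(raw: string): ParseResult {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, errors: ["response is not valid JSON"] };
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    return { ok: false, errors: ["expected a JSON object"] };
  }
  const { title, company, skills } = data as Record<string, unknown>;
  const errors: string[] = [];
  if (typeof title !== "string" || title.trim() === "") {
    errors.push("title: expected a non-empty string");
  }
  if (typeof company !== "string" || company.trim() === "") {
    errors.push("company: expected a non-empty string");
  }
  if (!Array.isArray(skills) || !skills.every((s) => typeof s === "string")) {
    errors.push("skills: expected an array of strings");
  }
  if (errors.length > 0) return { ok: false, errors };
  // Every field was checked above, so these casts are safe.
  return {
    ok: true,
    job: { title: title as string, company: company as string, skills: skills as string[] },
  };
}
```

Returning a result object instead of throwing lets the caller decide what a rejection means: skip the posting, log the errors, or hand the prompt to another model.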

That makes the tool more dependable. The CLI does not just "ask AI" and print whatever comes back. It uses models as extraction and reasoning components inside a typed workflow with clear intermediate artifacts.

Model fallback

All LLM calls go through a shared retry wrapper. The assistant tries a primary OpenRouter model first, then falls through to fallback models if the first call fails, is rate-limited, or returns invalid output.
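A fallback chain of this kind might look something like the sketch below. It is not the project's wrapper: withFallback, ModelCall, and the model entries are hypothetical names, the model calls are plain functions rather than real OpenRouter requests, and each model gets a single attempt (a real wrapper may also retry the same model with a delay).

```typescript
// A model call takes a prompt and resolves with raw text.
// Real calls would go to OpenRouter; here any async function will do.
type ModelCall = (prompt: string) => Promise<string>;

interface Attempt {
  model: string;
  error: string;
}

// Try each model in order. A thrown error (network failure, rate limit) or a
// validation failure both move on to the next model in the list.
async function withFallback<T>(
  models: { name: string; call: ModelCall }[],
  prompt: string,
  validate: (raw: string) => T, // expected to throw on invalid output
): Promise<{ model: string; value: T; failed: Attempt[] }> {
  const failed: Attempt[] = [];
  for (const { name, call } of models) {
    try {
      const raw = await call(prompt);
      return { model: name, value: validate(raw), failed };
    } catch (err) {
      failed.push({ model: name, error: err instanceof Error ? err.message : String(err) });
    }
  }
  const summary = failed.map((a) => `${a.model} (${a.error})`).join("; ");
  throw new Error(`all models failed: ${summary}`);
}
```

Running validation inside the same try block is the key design choice: invalid output is handled exactly like a failed request, so the caller only ever sees a validated value or one final error listing every attempt.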

That fallback chain helps in two ways. It makes the CLI less fragile during transient provider issues, and it keeps the command-line experience simple for the user. The user runs npm run market or npm run gaps; the tool handles retries and model routing internally.

Output artifacts

The project intentionally writes both JSON and Markdown.

JSON outputs are useful for chaining commands, testing, and future automation. Markdown outputs are useful for reading, sharing, and publishing. For example, market analysis can be stored under data/analysis/market-analysis.json while a human-readable version is saved to reports/market-analysis.md.
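One way to keep the two outputs in sync is to render both from the same in-memory result. The sketch below assumes a hypothetical MarketAnalysis shape and a renderArtifacts helper; only the two file paths come from the example above.

```typescript
// Hypothetical market-analysis shape; field names are illustrative.
interface MarketAnalysis {
  postingsAnalyzed: number;
  topSkills: { skill: string; share: number }[];
}

interface Artifact {
  path: string;
  content: string;
}

// Produce both artifacts from one result object so they cannot drift apart.
function renderArtifacts(analysis: MarketAnalysis): Artifact[] {
  const lines = [
    "# Market Analysis",
    "",
    `Postings analyzed: ${analysis.postingsAnalyzed}`,
    "",
    "| Skill | Share of postings |",
    "| --- | --- |",
    ...analysis.topSkills.map((s) => `| ${s.skill} | ${Math.round(s.share * 100)}% |`),
  ];
  return [
    // Machine-readable copy for chaining commands and tests.
    { path: "data/analysis/market-analysis.json", content: JSON.stringify(analysis, null, 2) + "\n" },
    // Human-readable copy for reading and sharing.
    { path: "reports/market-analysis.md", content: lines.join("\n") + "\n" },
  ];
}
```

Keeping the rendering pure (strings in, strings out) makes it easy to test without touching the disk; the actual write can then be a small loop over the returned artifacts using Node's file system API.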

That split is useful because the same pipeline can serve two audiences: the machine that needs structured data and the person who wants an understandable report.

What this demonstrates

This project shows more than prompt writing. It demonstrates CLI design, PDF ingestion, typed schemas, model retry logic, external research enrichment, report generation, and resilient file-based workflows.

The core lesson is that useful AI tooling usually needs boring engineering around it: input normalization, schema validation, retries, logging, saved artifacts, and clear commands. Without those pieces, the output may look impressive once but is hard to trust across repeated runs.