Paste a git log. Choose your audience. Get back polished, structured Markdown — ready to drop into GitHub Releases, a client email, or a Notion page.

Source Code (GitHub): https://github.com/bbornino/claude-changelog-release-notes-generator


The Problem It Solves

Every software team ships code. Almost none of them enjoy writing the changelog.

The commit history exists. The information is all there. But transforming e4f5a6b fix: resolve login redirect loop on session timeout into something a client can actually read — and care about — requires a context switch that developers hate making at the end of a sprint. So it either doesn’t happen, or it happens badly, or it happens three days late.

This tool closes that gap. Feed it the raw git log, tell it who’s reading the output, and Claude does the translation.


What It Does

curl -X POST http://localhost:8000/generate/ \
  -H "Content-Type: application/json" \
  -d '{
    "git_log": "a1b2c3d feat: add dark mode\ne4f5a6b fix: login redirect loop\n...",
    "version": "v2.4.1",
    "audience": "client",
    "tone_instructions": "Keep it friendly and non-technical. Focus on user benefit.",
    "date": "2026-06-07"
  }'

The API returns structured Markdown release notes, shaped for the audience you specified:

developer — Technical, grouped by commit type (Features, Bug Fixes, Performance, Breaking Changes, Chores). The kind of thing you’d put in a GitHub Release or a PR description.

client — Plain English, benefit-focused. No jargon. Grouped into “What’s New,” “Fixes,” and “Under the Hood.” The kind of thing you paste into a client-facing sprint review or a Notion page.

executive — Three to five bullets, maximum. Business impact first. No implementation details. The kind of thing you drop into a stakeholder update when you have thirty seconds of their attention.

The same commit history. Three completely different documents. One API call each.


Why the Audience Split Matters

The temptation with this kind of tool is to just summarize the commits and call it done. That’s not release notes — that’s a slightly more readable version of the git log. Real release notes require a translation layer, and the translation depends entirely on who’s reading.

A developer reading a changelog wants to know: what changed, what broke, what do I need to update. They want the type prefix. They might even want the SHA.

A client reading a changelog wants to know: does this sprint make my product better. They don’t know what a commit is. They don’t care about the implementation. They want “you can now export reports as CSV” — not 1a2b3c4 feat: implement CSV export for all report types.

An executive reading a changelog wants to know: are we moving forward. Two sentences. Maybe a bullet point. Done.

The audience mode isn’t a cosmetic toggle. It changes the framing, the vocabulary, the level of detail, and what gets included at all.


Tools & Tech Stack

  • Language: Python 3.11+
  • API Framework: FastAPI + uvicorn
  • AI Backend: Anthropic Claude API (claude-sonnet-4-20250514)
  • Validation: Pydantic v2
  • Output Format: Markdown
  • Testing: pytest + pytest-asyncio + httpx (55 tests, 7 classes — all mocked, no API key required)
  • Linting: ruff

API Features Demonstrated

Prompt Engineering — Modular Audience Injection

The interesting file in this project is app/prompts/templates.py. The system prompt establishes a tight persona: “You are a senior technical writer who specializes in developer-facing release documentation.” That’s it — no hedging, no “you can also help with other things.”

The audience instruction is then injected as a separate block, assembled at request time. This means you can A/B test tone independently from the persona, add new audience modes without touching the system prompt, and tune each block in isolation when the output isn’t quite right.

The git log is passed last, after all instructions, and wrapped in a clear delimiter. This isn’t accidental — it’s a structural choice to prevent commit messages from being interpreted as additional instructions. One of the test classes specifically exercises adversarial inputs: commits that contain “Ignore previous instructions,” fake delimiters, script tags, and system prompt keywords. The delimiter-last structure handles them cleanly.

Input Resilience — 16 Fixture Files of Commit Chaos

Real git logs are not clean. Teams don’t agree on commit conventions. Repos have decades of accumulated style drift. Any tool that only works on well-formatted conventional commits isn’t a tool — it’s a demo.

The test suite was built around 16 fixture files designed to cover every style pathology a real client repo might have:

  • Clean conventional commits (feat:fix:chore:)
  • Plain English with no type labels (“Fixed the broken search results on mobile”)
  • ALL CAPS (“FIXED THE LOGIN BUG THAT WAS BREAKING EVERYTHING”)
  • all lowercase no punctuation (“add user auth”)
  • Emoji prefixes (✨ 🐛 ⚡️ 💥)
  • Ticket number prefixes (PROJ-1042, [JIRA-204], GH-512, #881)
  • Mixed styles in a single log — all of the above, together
  • Merge commits and revert commits
  • WIP noise: “WIP,” “asdf,” “.” “save,” “DO NOT MERGE”
  • 200+ character commit messages
  • Special characters: backticks, HTML entities, null bytes, Unicode arrows, emoji
  • Code snippets in messages (fix: replace forEach() with map(), SQL injection patches)
  • Non-English commits: Spanish, French, Japanese, Arabic, German, Russian, Chinese
  • A 120-commit large log combining all of the above

Every one of those fixtures runs through the full endpoint and must return a 200 with non-empty Markdown. The service doesn’t get to refuse because the commits are ugly.


The Test Suite

pytest tests/ -v
========================= 55 passed in 0.64s ==========================

55 tests across 7 classes. All Claude API calls are mocked with AsyncMock — no API key needed to run the suite.

  • Happy path — all three audience modes, optional fields, version echoing, defaults
  • Input validation — every field boundary: min_lengthmax_length, invalid enum values, null fields, missing fields, extra fields
  • Commit style chaos — all 16 fixture files, one test each
  • Prompt injection & adversarial input — injection in git_log and tone_instructions, fake delimiters, script tags, system prompt keywords
  • API error handling — 502 on APIStatusErrorAPIConnectionError, and RateLimitError; 500 on unexpected exceptions
  • Response contract — shape completeness, type checking on token counts, null safety, case-exact field echoing
  • HTTP & routing — 405 on wrong method, 404 on unknown route, 422 on wrong content-type, 422 on empty body

Project Structure

claude-changelog-release-notes-generator/
├── app/
│   ├── main.py                # FastAPI app, health check, logging setup
│   ├── routers/
│   │   └── changelog.py       # POST /generate/ endpoint
│   ├── services/
│   │   └── claude_service.py  # Anthropic API calls, lazy lru_cache client
│   ├── models/
│   │   └── schemas.py         # Pydantic v2 request/response models
│   └── prompts/
│       └── templates.py       # System prompt + per-audience instruction blocks
├── tests/
│   ├── conftest.py            # Shared fixtures (mock_claude, async_client)
│   ├── test_changelog.py      # 55 tests, 7 classes
│   └── fixtures/              # 16 commit-style fixture files
├── CLAUDE.md                  # Build spec — the handoff doc for Claude Code
├── SESSION_LOG.md             # Build diary with timing and wrong turns
├── requirements.txt
└── pyproject.toml

Build Notes — What the Session Log Captured

This project took about 1 hour 45 minutes of active work with Claude Code. The core application — schemas, prompt templates, Claude service, router, and FastAPI wiring — was done in 7 minutes across five sequential commits. Clean, fast, no surprises.

The interesting part of the session was what went wrong.

Wrong directory. The scaffold command asked Claude Code to “create a new Python project called ai-changelog-generator.” It interpreted that as an instruction to create a new sibling directory — and wrote all 14 scaffold files there instead of into the existing project folder. The mistake wasn’t caught until after the schema step was committed. Recovery required recreating everything in the right path and deleting the wrong directory. Cost: about 15 minutes.

Context window overload. The first attempt at Step 7 combined fixture generation (16 files, including a 120-commit stress log) with test writing (55 test cases across 7 classes) into a single prompt. Claude Code stalled in a “thinking” loop for 19 minutes and produced nothing. Splitting it into two focused commands — fixtures first, tests second — completed both in under 15 minutes total.

The lesson isn’t about file count. It’s about cognitive load. Generating content and reasoning about that content simultaneously are two different kinds of work. When a prompt asks for both at once, the quality of each suffers. The fix is simple: split the prompt. The rule now lives in CLAUDE.md for every project that follows.

The full session log — including commit timestamps, bug descriptions, exact error messages, and fixes — is committed to the repo as SESSION_LOG.md.


A Bug That Tests Caught Before It Could Matter

Before writing a single test, Claude Code identified a latent initialization bug in claude_service.py. The Anthropic client was being instantiated at module load time — before load_dotenv() had run. In any environment where the API key lived in .env rather than the OS environment, this would fail at import, before any test could even start.

The fix was a lazy getter with @functools.lru_cache:

@functools.lru_cache(maxsize=1)
def _get_client() -> anthropic.Anthropic:
    return anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

The client is now created on first use, after the environment is loaded. This is the kind of thing that would have been a confusing production failure — and the kind of thing a well-structured test setup surfaces before it gets that far.


Setup & Installation

git clone https://github.com/bbornino/claude-changelog-release-notes-generator.git
cd claude-changelog-release-notes-generator
python -m venv venv
source venv/bin/activate        # Windows: venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env
# Add your ANTHROPIC_API_KEY to .env

You’ll need an Anthropic API key from console.anthropic.com. New accounts include free credits.

Running the Server

uvicorn app.main:app --reload

Interactive API docs at http://localhost:8000/docs. Health check at http://localhost:8000/health.

Running the Tests

pytest tests/ -v

No API key required. All 55 tests use mocked Claude responses and run in under one second.


What’s Not Built Yet

  • Web UI — a paste-and-click frontend (planned as a future project)
  • GitHub Actions integration — auto-generate and commit CHANGELOG.md on tag push
  • Style profiles — save named tone presets per client or project
  • Slack / email delivery — pipe output directly to a webhook
  • Multi-repo aggregation — merge logs from microservices into one unified changelog

Background

Seven years at CalPERS building internal workflow automation means I’ve watched good teams ship great software and then write terrible changelogs — or none at all. The gap between “we shipped a lot this sprint” and “here’s what changed and why it matters” is a communication problem, and communication problems are exactly what language models are built for.

This is Portfolio Project #2 in the Claude API series. Each project adds a new capability on top of the last.

[1] Legacy Code Explainer     — streaming, prompt caching, extended thinking
[2] Changelog Generator       ← you are here — prompt engineering, audience modes, FastAPI
[3] ...