Tag: SoftwareEngineering

  • [FastAPI + Ollama] Hunting the Real 404

    🛠️ Debugging Story

    The Route Exists. So Why Is It Returning 404?
    A FastAPI + Ollama Deep Dive 🔍

    I finally got my mini AI-RAG and chatbot service talking to my app. It is still a prototype: text only, nothing fancy to look at. But when it all clicked together, that feeling made every late night worth it.

    ⚡ FastAPI · 🐳 Docker · 🤖 Ollama · ☁️ Cloud Run

    🏗️ Service Architecture

    📱 App → ☁️ Cloud Run → 🔒 Cloudflare → 🏠 Home Router → 🖥️ Prod Server → 💻 Dev Server (RTX 3060)

    😰 As the layers stacked up, strange things started happening

    iptables, NordVPN, and Docker were interfering with each other. Packets were vanishing, and tcpdump was the only thing giving me any direction. My MacBook’s SSH tunnel kept dropping and reconnecting, and at some point the dev server started silently swallowing requests: nothing in the logs, yet the server was still running. I rebooted the MacBook and the problem went away immediately.

    Once I got through all that, a 404 was waiting for me.

    🚨 The route exists. So why 404?

    POST /internal/llm/chat kept returning 404 inside Docker while working perfectly on the dev server. I printed the route table directly, and the endpoint was clearly registered. It wasn’t a proxy issue either, since requests were showing up in the FastAPI logs.

    💡 From inside the container (127.0.0.1): 401 Unauthorized. From outside via nginx: 404 Not Found. Same container, same process, different results.

    🔎 The key clue: response size

    The response sizes in the nginx access log looked wrong.

    ⚠️ FastAPI default 404 {"detail":"Not Found"} = 22 bytes
    Actual response sizes = 64–89 bytes
    Something else was generating that 404.
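    The size check above can be automated. The following is a minimal sketch, assuming nginx writes its standard "combined" access log format; the sample log lines, the IP address, and the helper name suspicious_404s are invented for illustration and are not taken from the real logs.

```python
import re

# Body FastAPI sends for an unmatched route (compact JSON, no spaces).
FASTAPI_DEFAULT_404 = b'{"detail":"Not Found"}'

# Request line, status, and body-bytes fields as laid out in nginx's
# "combined" log format (assumed; adjust if a custom log_format is used).
LOG_FIELDS = re.compile(r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+)')


def suspicious_404s(lines):
    """Yield (request, size) for 404s whose body size differs from the default."""
    for line in lines:
        match = LOG_FIELDS.search(line)
        if match is None:
            continue  # not an access-log line we understand
        size = int(match["size"])
        if match["status"] == "404" and size != len(FASTAPI_DEFAULT_404):
            yield match["request"], size


# Hypothetical sample lines, made up for this example.
SAMPLE_LOG = [
    '203.0.113.7 - - [01/Jan/2025:12:00:00 +0000] '
    '"POST /internal/llm/chat HTTP/1.1" 404 64 "-" "client/1.0"',
    '203.0.113.7 - - [01/Jan/2025:12:00:05 +0000] '
    '"GET /nope HTTP/1.1" 404 22 "-" "client/1.0"',
]

if __name__ == "__main__":
    print(len(FASTAPI_DEFAULT_404))
    for request, size in suspicious_404s(SAMPLE_LOG):
        print(request, size)
```

    A 404 that is not 22 bytes long did not come from FastAPI's router, so it is worth reading the body before touching any routing code.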

    Working backwards, it was an Ollama error message:

    {"detail":"model 'llama3' not found, try pulling it first"}

    🕵️ Root cause: upstream error propagation

    A client service was reading OLLAMA_MODEL=llama3 from GCP Secret Manager and passing it in the request body. The model actually installed was llama3.1:8b. Ollama’s 404 was propagating straight through FastAPI to the client, which made it look as if the route didn’t exist. It was not a routing problem. It was an upstream error propagation problem.
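    A mismatch like this can be caught at startup instead of at request time. Below is a rough sketch that compares the configured model name with the payload returned by Ollama's model-listing endpoint (GET /api/tags). The function names and the sample payload are hypothetical, and the assumption that a name without a tag means ":latest" should be checked against the Ollama version in use.

```python
def normalize(name):
    """Add the ':latest' tag when none is given (assumed Ollama behaviour)."""
    return name if ":" in name else name + ":latest"


def installed_models(tags_response):
    """Collect model names from a parsed GET /api/tags response."""
    return {model["name"] for model in tags_response.get("models", [])}


def check_model(configured, tags_response):
    """Return None if the model is installed, else a readable error message."""
    names = installed_models(tags_response)
    if normalize(configured) in names:
        return None
    return (
        f"configured model {configured!r} is not installed; "
        f"available: {sorted(names)}"
    )


# Hypothetical payload shaped like an /api/tags response, trimmed to one field.
SAMPLE_TAGS = {"models": [{"name": "llama3.1:8b"}]}

if __name__ == "__main__":
    print(check_model("llama3", SAMPLE_TAGS))
    print(check_model("llama3.1:8b", SAMPLE_TAGS))
```

    Failing fast with a message that names both the configured and the installed models points at the configuration rather than at the router.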

    🤖 Debugging with AI

    I had been relying on AI to analyze the flood of logs. At some point, it started nudging me toward tearing down the architecture itself. I stepped back, worked through it with Claude from scratch, and eventually tracked down the real cause.

    AI compressed what could have been weeks of debugging into days. But holding the design together was still a human job. The more network layers you add, the easier it is for AI to lose the thread too.


    📝 What I took away

    1️⃣ Check the response size

    22 bytes vs 64 bytes. Two 404s can mean completely different things. That single number was the decisive clue.

    2️⃣ 404 doesn’t always mean routing failure

    Upstream errors propagate silently. If you don’t handle them explicitly, they’re very hard to trace.

    3️⃣ Wrap upstream errors as 502/503

    Returning upstream errors and routing errors with the same status code multiplies your debugging time.
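    One way to apply this is a small translation step between the upstream call and the response. The sketch below is one possible mapping, not the service's actual code: any HTTP error from Ollama becomes 502, and a failure to reach Ollama at all (represented here by None) becomes 503. The function name and the fields in the detail payload are assumptions made for illustration.

```python
def translate_upstream_error(upstream_status, upstream_message):
    """Map an upstream failure to the (status, body) this service returns.

    upstream_status is the HTTP status Ollama answered with, or None when
    the request never got a response (connection refused, timeout).
    """
    # 503: the dependency is unreachable. 502: it answered, but with an error.
    status = 503 if upstream_status is None else 502
    body = {
        "detail": {
            "source": "ollama",  # says which layer actually failed
            "upstream_status": upstream_status,
            "upstream_error": upstream_message,
        }
    }
    return status, body


if __name__ == "__main__":
    print(translate_upstream_error(
        404, "model 'llama3' not found, try pulling it first"))
    print(translate_upstream_error(None, "connection refused"))
```

    In a FastAPI handler the result could be passed to HTTPException(status_code=..., detail=...), so that a 404 from this service only ever means the route itself is missing.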

    4️⃣ When nothing makes sense, reboot. Seriously.

    Sometimes the most powerful debugging tool is turning it off and back on. Your mental health will thank you.

    When the logs go quiet, widen your view. 🔭
    And when nothing makes sense, reboot first.

  • [FastAPI + Ollama] In Search of the 404's Real Culprit

    The Route Exists. So Why the 404?
    A FastAPI + Ollama Debugging Slog 🔍

    At home, I just barely managed to connect my mini AI-RAG and chatbot service to my app. It is still a prototype with nothing but text, so it isn't much to look at, but when everything finally fit together and worked, it felt really rewarding. This is the story of what happened along the way.

    ⚡ FastAPI · 🐳 Docker · 🤖 Ollama · ☁️ Cloud Run

    🏗️ Service Architecture

    📱 App → ☁️ Cloud Run → 🔒 Cloudflare → 🏠 Home Router → 🖥️ Prod Server → 💻 Dev Server (RTX 3060)

    😰 As the layers stacked up, strange things started happening

    iptables, NordVPN, and Docker got tangled together and packets disappeared; tcpdump was the only way I managed to get my bearings. The MacBook's SSH tunnel kept dropping and reconnecting, and there was also a symptom where the dev server swallowed requests whole. Nothing showed up in the logs, yet the server was alive. Turning it off and on again fixed it right away.

    Once I got through that, a 404 was waiting for me.

    🚨 Symptom: the route exists, but 404

    POST /internal/llm/chat requests kept returning 404 in the Docker container. On the dev server they worked normally. When I printed the route list directly, the endpoint was clearly registered.

    💡 From inside (127.0.0.1): 401 Unauthorized. From outside (nginx): 404 Not Found. Same container, same process, different results.

    🔎 The decisive clue: response size

    The response sizes in the nginx access log looked wrong.

    ⚠️ FastAPI default 404 response {"detail":"Not Found"} = 22 bytes
    Actual response sizes = 64–89 bytes
    Something else was producing a different 404.

    Working backwards, it turned out to be an Ollama error message.

    {"detail":"model 'llama3' not found, try pulling it first"}

    🕵️ Cause: upstream error propagation

    The client service was reading OLLAMA_MODEL=llama3 from GCP Secret Manager and sending it in the request body, while the model actually installed on the server was llama3.1:8b. Ollama's 404 was passed straight through FastAPI. It was not a routing problem but an upstream error propagation problem.

    🤖 Digging in with AI

    I handed the analysis of a large volume of logs to AI, and at some point it started steering me toward tearing down the design itself. I dug in again from the beginning with Claude and just barely found the real cause.

    AI cut weeks of flailing down to days, but protecting the design was ultimately a human's job. The more network points there are to manage, the more likely AI is to lose its way too.


    📝 Lessons from this experience

    1️⃣ Check the response size

    22 bytes vs 64 bytes. Two responses with the same 404 status can have different contents. The response size alone can be the decisive clue.

    2️⃣ A 404 is not always a routing problem

    Upstream errors propagate quietly. If you don't handle them explicitly, they are very hard to trace.

    3️⃣ Convert upstream errors to 502/503

    Returning internal errors and routing errors with the same code multiplies debugging time several times over.

    4️⃣ If it makes no sense, turn it off and on again. Seriously.

    When you hit something you truly cannot make sense of, wrapping up and restarting the computer is good for your mental health.

    The quieter the logs, the wider you need to look. 🔭
    And if something seems off, start with a reboot.