{"id":9,"date":"2026-04-12T23:22:33","date_gmt":"2026-04-12T23:22:33","guid":{"rendered":"https:\/\/blog-api.minpox.com\/?p=9"},"modified":"2026-04-12T23:22:33","modified_gmt":"2026-04-12T23:22:33","slug":"fastapi-ollama-hunting-the-real-404","status":"publish","type":"post","link":"https:\/\/blog-api.minpox.com\/?p=9","title":{"rendered":"[FastAPI + Ollama] Hunting the Real 404"},"content":{"rendered":"\n<p>\ud83d\udee0\ufe0f Debugging Story<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The Route Exists. So Why Is It Returning 404?<br>A FastAPI + Ollama Deep Dive \ud83d\udd0d<\/h2>\n\n\n\n<p>I finally got my mini AI-RAG and chatbot service talking to my app. Still a prototype \u2014 text only, nothing fancy to look at. But when it all clicked together, that feeling made every late night worth it.<\/p>\n\n\n\n<p>\u26a1 FastAPI\ud83d\udc33 Docker\ud83e\udd16 Ollama\u2601\ufe0f Cloud Run<\/p>\n\n\n\n<p>\ud83c\udfd7\ufe0f Service Architecture<\/p>\n\n\n\n<p>\ud83d\udcf1 App\u2192\u2601\ufe0f Cloud Run\u2192\ud83d\udd12 Cloudflare\u2192\ud83c\udfe0 Home Router\u2192\ud83d\udda5\ufe0f Prod Server\u2192\ud83d\udcbb Dev Server (RTX 3060)<\/p>\n\n\n\n<p>\ud83d\ude30As the layers stacked up, strange things started happening<\/p>\n\n\n\n<p>iptables + NordVPN + Docker were interfering with each other \u2014 packets were vanishing, and tcpdump was the only thing giving me any direction. My MacBook&#8217;s SSH tunnel kept dropping and reconnecting, and at some point the dev server started silently swallowing requests. Nothing in the logs, server still running. Rebooted the MacBook. Fixed immediately.<\/p>\n\n\n\n<p>Once I got through all that, a 404 was waiting for me.<\/p>\n\n\n\n<p>\ud83d\udea8The route exists. So why 404?<\/p>\n\n\n\n<p><code>POST \/internal\/llm\/chat<\/code>&nbsp;kept returning 404 inside Docker while working perfectly on the dev server. I printed the route table directly \u2014 the endpoint was clearly registered. 
It wasn&#8217;t a proxy issue either, since requests were showing up in the FastAPI logs.<\/p>\n\n\n\n<p>\ud83d\udca1 From inside the container (127.0.0.1):&nbsp;<strong>401 Unauthorized<\/strong>. From outside via nginx:&nbsp;<strong>404 Not Found<\/strong>. Same container, same process, different results.<\/p>\n\n\n\n<p>\ud83d\udd0eThe key clue: response size<\/p>\n\n\n\n<p>The response sizes in the nginx access log looked wrong.<\/p>\n\n\n\n<p>\u26a0\ufe0f FastAPI default 404&nbsp;<code>{\"detail\":\"Not Found\"}<\/code>&nbsp;=&nbsp;<strong>22 bytes<\/strong><br>Actual response sizes =&nbsp;<strong>64\u201389 bytes<\/strong><br>Something else was generating that 404.<\/p>\n\n\n\n<p>Working backwards, it was an Ollama error message:<\/p>\n\n\n\n<p><code>{\"detail\":\"model 'llama3' not found, try pulling it first\"}<\/code><\/p>\n\n\n\n<p>\ud83d\udd75\ufe0fRoot cause: upstream error propagation<\/p>\n\n\n\n<p>A client service was reading&nbsp;<code>OLLAMA_MODEL=llama3<\/code>&nbsp;from GCP Secret Manager and passing it in the request body. The model actually installed was&nbsp;<code>llama3.1:8b<\/code>. Ollama&#8217;s 404 was propagating straight through FastAPI to the client \u2014 making it look like the route didn&#8217;t exist. Not a routing problem. An upstream error propagation problem.<\/p>\n\n\n\n<p>\ud83e\udd16Debugging with AI<\/p>\n\n\n\n<p>I had been relying on AI to analyze the flood of logs. At some point, it started nudging me toward tearing down the architecture itself. I stepped back, worked through it with Claude from scratch, and eventually tracked down the real cause.<\/p>\n\n\n\n<p>AI compressed what could have been weeks of debugging into days. But holding the design together was still a human job. 
The more network layers you add, the easier it is for AI to lose the thread too.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p>\ud83d\udcddWhat I took away<\/p>\n\n\n\n<p>1\ufe0f\u20e3<\/p>\n\n\n\n<p>Check the response size<\/p>\n\n\n\n<p>22 bytes vs 64 bytes. Two 404s can mean completely different things. That single number was the decisive clue.<\/p>\n\n\n\n<p>2\ufe0f\u20e3<\/p>\n\n\n\n<p>404 doesn&#8217;t always mean routing failure<\/p>\n\n\n\n<p>Upstream errors propagate silently. If you don&#8217;t handle them explicitly, they&#8217;re very hard to trace.<\/p>\n\n\n\n<p>3\ufe0f\u20e3<\/p>\n\n\n\n<p>Wrap upstream errors as 502\/503<\/p>\n\n\n\n<p>Returning internal errors and routing errors with the same status code multiplies your debugging time significantly.<\/p>\n\n\n\n<p>4\ufe0f\u20e3<\/p>\n\n\n\n<p>When nothing makes sense \u2014 reboot. Seriously.<\/p>\n\n\n\n<p>Sometimes the most powerful debugging tool is turning it off and back on. Your mental health will thank you.<\/p>\n\n\n\n<p>When the logs go quiet,&nbsp;widen your view.&nbsp;\ud83d\udd2d<br>And when nothing makes sense \u2014 reboot first.<\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>\ud83d\udee0\ufe0f Debugging Story The Route Exists. So Why Is It Returning 404?A FastAPI + Ollama Deep Dive \ud83d\udd0d I finally got my mini AI-RAG and chatbot service talking to my app. Still a prototype \u2014 text only, nothing fancy to look at. 
But when it all clicked together, that feeling made every late night worth [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[12],"tags":[7,3,9,5,4,6,8,10],"class_list":["post-9","post","type-post","status-publish","format-standard","hentry","category-it","tag-ai","tag-backend","tag-debugging","tag-docker","tag-fastapi","tag-ollama","tag-rag","tag-softwareengineering"],"_links":{"self":[{"href":"https:\/\/blog-api.minpox.com\/index.php?rest_route=\/wp\/v2\/posts\/9","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog-api.minpox.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog-api.minpox.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog-api.minpox.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog-api.minpox.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=9"}],"version-history":[{"count":1,"href":"https:\/\/blog-api.minpox.com\/index.php?rest_route=\/wp\/v2\/posts\/9\/revisions"}],"predecessor-version":[{"id":10,"href":"https:\/\/blog-api.minpox.com\/index.php?rest_route=\/wp\/v2\/posts\/9\/revisions\/10"}],"wp:attachment":[{"href":"https:\/\/blog-api.minpox.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=9"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog-api.minpox.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=9"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog-api.minpox.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=9"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}