Tracking outbound API calls from your application: why, what worked (and what didn’t)
We recently had to do an on-prem deployment for a particularly paranoid customer (read: fintech 🏦). On-prem here means our SaaS is installed and run on the customer's own hardware and data centers rather than in our cloud.
Our application is containerized, and we already had scripts and artifacts for different deployment scenarios (Helm charts for Kubernetes, CloudFormation for AWS ECS, etc.). But there was one thing we didn’t have ready:
👉 A list of all outbound API calls our application makes.
This customer had a strict outbound firewall policy, and they needed an exhaustive whitelist of API endpoints.
The Problem
Until this point, we hadn’t really thought about outbound calls. Our own infra had no outbound restrictions, so we happily used libraries, SDKs, and third-party services without tracking what domains they pinged. But for on-prem deployments with locked-down firewalls, this becomes a critical requirement.
Our first instinct was: “How hard can it be?”
- Just go through requirements.txt,
- Look at the SDKs we use,
- Identify direct API calls in our code (e.g., requests),
- Then compile the list of domains.
Sounds reasonable, right?
Except… it didn’t work.
Why? Because SDKs don’t always map cleanly to a single domain.
- Many SDKs call multiple endpoints.
- Some use alternative domains for redundancy.
- Others make hidden calls to services you never expected.
Example: One LLM library suddenly made a call to Azure Blob Storage to fetch a file used for token calculation. Not exactly obvious from the docs.
So the manual approach = incomplete.
Exploring Approaches
We turned to ChatGPT (of course) to ask how to track all API calls made by a Python backend. We got a bunch of suggestions that sounded good in theory, but didn’t fully solve the problem in practice.
Here’s a summary:
❌ Approach 1: Monkey Patch requests
- Idea: Patch the requests module to log every call.
- Issue: Not all SDKs use requests. Some use httpx, urllib3, or even custom HTTP clients.
- Result: We only saw a subset of calls.
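For reference, the patching idea looks roughly like this. To keep the sketch dependency-free it wraps the standard library's urllib.request.urlopen rather than requests, so it is a stand-in for what we tried, not the exact patch. The names seen_hosts, log_host, and install_patch are illustrative.

```python
import functools
import urllib.request
from urllib.parse import urlsplit

# Hosts observed by the patched function (illustrative global)
seen_hosts = set()


def log_host(func):
    """Wrap a urlopen-style callable so the target host is recorded before the call."""
    @functools.wraps(func)
    def wrapper(url, *args, **kwargs):
        # urlopen accepts either a URL string or a urllib.request.Request
        target = url.full_url if isinstance(url, urllib.request.Request) else url
        seen_hosts.add(urlsplit(target).hostname)
        return func(url, *args, **kwargs)
    return wrapper


def install_patch():
    """Replace urllib.request.urlopen with the logging wrapper."""
    urllib.request.urlopen = log_host(urllib.request.urlopen)
```

Anything that opens connections through a different client (httpx, urllib3 used directly, a vendored HTTP stack) never touches the wrapped function, which is exactly the gap described above.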
❌ Approach 2: Use httpretty or wrapt
- Idea: Intercept calls by decorating functions dynamically.
- Issue: Same limitation—only works if the library is using something we patched. Anything else bypassed this.
❌ Approach 3: Logging Middleware
- Idea: Wrap the app with middleware that logs outbound traffic.
- Issue: Works if we control the code making requests, but useless for opaque SDKs that bypass middleware.
❌ Other Attempts
- Packet capture tools like tcpdump or wireshark: noisy, low-level, and painful to parse.
- Debugging proxies tied only to requests: again, incomplete.
None of these gave us the full picture.
The Solution: A Proxy for Everything
The only reliable approach turned out to be:
➡️ Force all outbound calls to go through a proxy, and log everything at the proxy.
We used mitmproxy. It’s lightweight, easy to run, and gives a clean log of every outbound request.
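A minimal sketch of how the logging side can look, assuming mitmproxy's addon mechanism (a script loaded with `mitmdump -s log_hosts.py`). The class name HostLogger, the script name, and the output file outbound_hosts.txt are our own placeholders, not something mitmproxy ships with.

```python
from pathlib import Path


class HostLogger:
    """mitmproxy addon: append each newly seen request host to a text file."""

    def __init__(self, out_path="outbound_hosts.txt"):
        self.out_path = Path(out_path)
        self.hosts = set()

    def request(self, flow):
        # mitmproxy calls this hook once per client request;
        # pretty_host prefers the Host header over the raw IP address
        host = flow.request.pretty_host
        if host not in self.hosts:
            self.hosts.add(host)
            with self.out_path.open("a") as fh:
                fh.write(host + "\n")


# mitmproxy picks up addon instances from this module-level list
addons = [HostLogger()]
```

One caveat: to see HTTPS requests this way, the application has to trust mitmproxy's CA certificate. Otherwise TLS handshakes through the proxy fail instead of being logged.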
Making Sure All Calls Go Through the Proxy
Specifying a proxy in just the requests client wasn’t enough, because many SDKs bypass it. The trick was to use environment variables that most HTTP clients respect by default:
import os
os.environ["HTTP_PROXY"] = "http://your-proxy:port"
os.environ["HTTPS_PROXY"] = "http://your-proxy:port"
With this, requests, httpx, and many other libraries automatically route traffic through the proxy, without touching our code.
Once we had mitmproxy logging everything, we finally got a complete list of domains our backend was calling. ✅
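If you want a quick sanity check that the variables are actually visible to the process, something like the following can help. It is a sketch: configure_proxy is an illustrative helper, and the address assumes mitmproxy is listening locally on port 8080. It sets both upper- and lowercase names because some tools only read the lowercase form.

```python
import os
import urllib.request

# Assumption: mitmproxy running locally on port 8080
PROXY_URL = "http://127.0.0.1:8080"


def configure_proxy(proxy_url=PROXY_URL):
    """Set the proxy variables and return what the stdlib resolves from them."""
    for name in ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy"):
        os.environ[name] = proxy_url
    # getproxies() reads the *_proxy environment variables
    return urllib.request.getproxies()


if __name__ == "__main__":
    print(configure_proxy())
```

This only confirms what the standard library sees. A client that ignores proxy environment variables will still bypass the proxy, so it is worth comparing the proxy log against what you expect to appear.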
What About Frontend Calls?
Backend calls are only half the story. The frontend may also directly hit third-party services—for example:
- Intercom (support chat)
- Posthog / Clarity (analytics)
- Sentry (error tracking)
How do we capture these?
Super simple:
- Open Chrome DevTools → Network tab.
- Start recording.
- Walk through all major pages and flows in the app.
- Export the HAR file at the end.
That HAR file contains every outbound request the frontend made. We then parse it (with a quick script) to extract unique domains.
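The quick script can be as small as this. It is a sketch that relies on the standard HAR layout (log → entries → request → url); domains_from_har, load_har, and the file name are illustrative.

```python
import json
from urllib.parse import urlsplit


def domains_from_har(har):
    """Return the sorted unique hostnames requested in a parsed HAR document."""
    entries = har.get("log", {}).get("entries", [])
    hosts = {urlsplit(entry["request"]["url"]).hostname for entry in entries}
    hosts.discard(None)  # data: and blob: URLs have no hostname
    return sorted(hosts)


def load_har(path):
    """Read a HAR file exported from DevTools (it is plain JSON)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    # "app.har" is a placeholder for the exported file
    for domain in domains_from_har(load_har("app.har")):
        print(domain)
```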
⚠️ The tricky part: making sure we triggered all flows. One approach is to leave the recording on while using the app normally for a day, then export the file. That way we’re more likely to catch edge cases.
Takeaways
- Don’t assume you know what domains your app calls. SDKs are sneaky.
- Manual inspection isn’t enough.
- A proxy-based approach (e.g., mitmproxy + environment variables) gives you a complete picture for the backend, as long as every HTTP client in use honors the proxy settings.
- For frontend, HAR files from DevTools are the simplest solution.
If you’re doing on-prem deployments with strict firewall rules, it’s worth setting this up early—before the customer is blocked and waiting on you for a whitelist.