Problem
Before releases, the team had no reliable way to measure how the system behaved under parallel call load. Release decisions depended on assumptions, not data.
- memory, CPU and thread usage under load were not measured
- stability during batch calls was unknown until production
- signs of memory leaks only visible after deployment
- performance regressions reached users before they were identified internally
Action
Built a repeatable load test setup that runs against the real API and captures system metrics for each run.
- load tests against the real backend endpoint — not synthetic simulations
- parallel collection of memory, CPU, thread, handle and active-call metrics
- results stored as charts and CSV for comparison across runs
- configurable load intensity and scenario parameters
Result
The team can now make release decisions based on actual performance data instead of assumptions.
- memory and CPU impact visible before each release
- degradation trends identified by comparing runs over time
- first issue found: memory usage did not stabilize under sustained load — caught before production
When it makes sense
- voicebots and callbots handling parallel sessions
- backend services sensitive to memory, CPU or thread load
- teams that need performance evidence before a release decision
Technologies
Python · requests · OAuth2 · pandas · matplotlib · CSV · Bash