Optimizing Cloud Infrastructure: How Former03 Achieved Operational Excellence with AWS

Overview
Former03 GmbH is a Munich-based digital agency specialising in sophisticated web development, UX/UI design, and multimedia solutions for enterprise clients across Germany. Founded in 2003, this established SMB has built a reputation serving high-profile clients including DATEV (financial software), Volkswagen's Elli (EV charging solutions), Dallmayr (premium retail), and PONS (publishing). Former03 delivers mission-critical web applications and APIs that demand enterprise-grade reliability, scalability and performance.
As their client portfolio expanded to include more complex, high-traffic applications, their AWS infrastructure began showing critical inefficiencies in auto-scaling behaviour and API rate limiting that threatened service reliability and client satisfaction.
Challenges
Former03 faced infrastructure reliability and cost optimisation challenges that threatened their enterprise client relationships. Their auto-scaling configuration was triggering unnecessary scaling events because the CloudWatch alarm threshold was set at 6 MB of network output, which raised false alarms whenever the platform processed legitimate 8 MB JSON payloads from client applications. This resulted in unpredictable infrastructure costs fluctuating by €3,500 monthly and inefficient resource allocation across staging (1-3 instances) and production (1-5 instances) environments.
The company's rate limiting architecture, based solely on IP addresses, became ineffective when multiple users connected through shared VPN connections, because all of their traffic appeared to originate from a single NAT IP. Without granular user identification, throttling was unfair: one heavy user could exhaust the limit for everyone behind the same address, degrading service for legitimate client requests and risking €500,000 in annual recurring revenue from major accounts.
Their lean technical team was spending 120 hours monthly troubleshooting false alarms and performing manual infrastructure interventions, diverting critical resources from billable client work. Additionally, application crashes caused by undersized instances with 2 GB of RAM were impacting service availability for enterprise clients who demanded 99.9% uptime SLAs.
Solution
Ankercloud partnered with Former03 to implement a comprehensive infrastructure optimisation strategy addressing operational excellence, reliability, performance efficiency, and cost optimisation, all pillars of the AWS Well-Architected Framework.
We transitioned from network-based to CPU and memory-based auto-scaling policies, deploying the CloudWatch agent to collect hardware metrics at 1-minute granularity. CloudWatch alarm thresholds were recalibrated from 6 MB to more than 8 MB of network output, eliminating false triggers while maintaining responsiveness to genuine traffic surges. We also integrated Grafana and Prometheus for enhanced observability and proactive monitoring.
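The effect of the threshold change, and the shape of a CPU-based policy, can be illustrated with a minimal Python sketch. It is not Former03's actual configuration: the 9 MB threshold (standing in for "more than 8 MB"), the 60% CPU target, the group name, and the decimal definition of a megabyte are all assumptions made for illustration. The dictionary follows the keyword arguments accepted by the boto3 Auto Scaling `put_scaling_policy` call, but nothing is sent to AWS here.

```python
MB = 1_000_000  # assumption: decimal megabytes; the source does not specify


def breaches(network_out_bytes: int, threshold_mb: float) -> bool:
    """Return True when a NetworkOut datapoint exceeds the alarm threshold."""
    return network_out_bytes > threshold_mb * MB


def cpu_target_tracking_policy(group_name: str, target_percent: float) -> dict:
    """Build keyword arguments for a CPU target-tracking scaling policy.

    The result could be passed as
    boto3.client("autoscaling").put_scaling_policy(**kwargs).
    """
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-cpu-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": target_percent,
        },
    }


if __name__ == "__main__":
    payload = 8 * MB  # a legitimate 8 MB JSON response
    print("old 6 MB threshold fires:", breaches(payload, 6))
    print("assumed 9 MB threshold fires:", breaches(payload, 9))
    print(cpu_target_tracking_policy("production-asg", 60.0))  # hypothetical values
```

Memory utilisation is not one of the predefined target-tracking metrics, so a memory-based policy would need a customised metric specification fed by the CloudWatch agent rather than the predefined type shown above.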
For rate limiting, we implemented a header-based architecture using Lambda authorizers that inspect a custom X-Client-ID header, with DynamoDB storing per-client request counters. This solution enabled granular rate limiting (100 requests per minute per client) regardless of VPN configuration, ensuring fair API access for all users.
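The per-client counting logic can be sketched as a fixed-window limiter in Python. This is an illustration under stated assumptions, not the deployed authorizer: an in-memory dictionary stands in for the DynamoDB table, the fixed one-minute window is one of several possible windowing schemes, and the class and function names are hypothetical. Only the header name and the limit of 100 requests per minute come from the text.

```python
import time
from typing import Callable, Dict, Mapping, Optional, Tuple

LIMIT_PER_MINUTE = 100  # from the text: 100 requests per minute per client
WINDOW_SECONDS = 60


def client_id_from(headers: Mapping[str, str]) -> Optional[str]:
    """Find X-Client-ID case-insensitively; return None if absent or empty."""
    for name, value in headers.items():
        if name.lower() == "x-client-id" and value:
            return value
    return None


class ClientRateLimiter:
    """Fixed-window counter keyed by client ID instead of source IP."""

    def __init__(self, limit: int = LIMIT_PER_MINUTE,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit
        self.clock = clock
        # Stand-in for the DynamoDB table: (client_id, window) -> request count.
        self.counters: Dict[Tuple[str, int], int] = {}

    def allow(self, headers: Mapping[str, str]) -> bool:
        client_id = client_id_from(headers)
        if client_id is None:
            return False  # assumption: requests without the header are rejected
        window = int(self.clock() // WINDOW_SECONDS)
        key = (client_id, window)
        self.counters[key] = self.counters.get(key, 0) + 1
        return self.counters[key] <= self.limit


if __name__ == "__main__":
    limiter = ClientRateLimiter()
    # Two clients behind the same VPN exit IP are counted separately.
    print(limiter.allow({"X-Client-ID": "client-a"}))
    print(limiter.allow({"X-Client-ID": "client-b"}))
```

Because the key is the client ID, a heavy user exhausts only their own budget. In a real table the read-and-increment would need to be a single atomic update, and old window entries could be expired with DynamoDB's TTL feature; both are omitted here for brevity.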
Infrastructure right-sizing upgraded production instances from an inadequate 2 GB of RAM to 8 GB (c7g.xlarge compute-optimized instances), eliminating the application crashes. We also implemented lifecycle hooks for graceful job draining during scale-down events, preventing stuck jobs in the scheduler.
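The draining step behind a termination lifecycle hook can be sketched in Python as follows. This is a hypothetical outline, not the implemented handler: the function name, the 300-second timeout, and the injected callables are assumptions. In a real setup, `running_jobs` would query the scheduler and `complete_action` would wrap the boto3 Auto Scaling `complete_lifecycle_action` call.

```python
import time
from typing import Callable


def drain_before_termination(
    running_jobs: Callable[[], int],
    complete_action: Callable[[str], None],
    timeout_s: float = 300.0,   # assumed drain budget
    poll_s: float = 5.0,        # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Wait for in-flight jobs to finish, then release the instance.

    Returns True if the instance drained fully, False if the timeout hit.
    """
    deadline = clock() + timeout_s
    while running_jobs() > 0 and clock() < deadline:
        sleep(poll_s)
    drained = running_jobs() == 0
    # For a terminating instance the hook is completed either way, so the
    # Auto Scaling group can proceed; the return value records the outcome.
    complete_action("CONTINUE")
    return drained


if __name__ == "__main__":
    jobs = [2]

    def finish_one(_seconds: float) -> None:
        jobs[0] -= 1  # simulate one job finishing per poll

    ok = drain_before_termination(lambda: jobs[0], print, sleep=finish_one)
    print("drained:", ok)
```

Passing the clock and sleep functions in as parameters keeps the sketch testable without real waiting.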
Business Outcome
The transformation delivered measurable results across operational efficiency and infrastructure reliability:
Operational Excellence
- False auto-scaling events reduced by 80%, achieving 95% cost predictability
- Engineering troubleshooting time decreased by 70% (from 120 to 36 hours monthly)
- Platform stability improved with zero application crashes post-upgrade
- Comprehensive monitoring enabled proactive issue detection
Financial Impact
- €42,000 annual savings from eliminated unnecessary scaling events
- 84 engineering hours monthly redirected to billable client work (€8,400 monthly value at €100/hour)
- Infrastructure costs stabilised with predictable monthly spending
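The headline figures can be cross-checked with a few lines of Python. The inputs are taken from the text; the link between the €3,500 monthly fluctuation and the €42,000 annual savings is an inference, since the source does not state how the savings figure was derived.

```python
# Figures quoted in the case study
baseline_hours = 120          # monthly troubleshooting hours before the project
reduction_percent = 70        # reported decrease in troubleshooting time
hourly_rate_eur = 100         # billable rate used in the text
monthly_fluctuation_eur = 3_500

hours_saved = baseline_hours * reduction_percent // 100
hours_remaining = baseline_hours - hours_saved
monthly_value_eur = hours_saved * hourly_rate_eur
# Inference: twelve months of the quoted fluctuation equals the savings figure.
annualised_fluctuation_eur = monthly_fluctuation_eur * 12

print(hours_saved, hours_remaining, monthly_value_eur, annualised_fluctuation_eur)
# prints: 84 36 8400 42000
```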
Strategic Advancement
- Enterprise-grade rate limiting protected against service abuse and DDoS attacks
- Enhanced monitoring capabilities improved client SLA compliance
- Maintained €500,000 in at-risk annual recurring revenue from DATEV, Elli, and Dallmayr
- Positioned for 30% revenue growth through improved service reliability

