Software Engineer, Production Infrastructure, 11/2021 - Present
Zone Aware Routing: Improved intra availability zone routing with ROI of $2 million.
- Reduced inter-AZ production traffic by 40%.
- Migrated 1416 microservices to envoy load balancing subsets.
- Deprecated error prone load balancing components in favor of configuring load balancing subsets in Envoy.
- Wrote design spec and grafana/kibana dashboards. Communicated with customer teams to debug load balancing edge cases.
No More Yaml (NoMoYa): Decreased time-to-deploy networking settings from 15 minutes to less than 30 seconds.
- Implemented new configuration API server (GoLang) and user interface (TypeScript + React) handling circuit breakers, health check endpoints, traffic migrations, and network dependency allow lists.
- Improved team operations through self-service SEV mitigation, preventing context switching from team members.
Control Plane Backend Sharding: Collaborated with tech lead to simplify endpoint discovery.
- Transitioned from leader-elected writers to independent writers, enhancing service reliability and simplifying the deployment pipeline.
- Implemented new data layer on control plane frontend and backend with 0 downtime, and a 99.95% mesh availability SLA.
- Parallelized service discovery queue reducing endpoint query latency by roughly 50%.
Networking Day-to-Day
- Participated in debugging and mitigating more than 100 incidents through analyzing various kibana and prometheus queries, and SSHing into hosts themselves to validate networking components.
- Performed technical deep dives on load balancing, did 30+ candidate interviews, actively helped in team planning, and acted as mentor for both interns and new-hires.