Engineering Manager, Platform Infrastructure
Decagon
Other Engineering
San Francisco, CA, USA · New York, NY, USA
USD 280k-430k / year + Equity
Location: San Francisco; New York City
Employment Type: Full time
Location Type: On-site
Department: Engineering
Compensation: Base Salary $280K – $430K • Offers Equity
About Decagon
Decagon is the leading conversational AI platform empowering every brand to deliver concierge customer experiences.
Our technology enables industry-defining enterprises like Avis Budget Group, Block’s Cash App and Square, Chime, Oura Health, and Hunter Douglas to deploy AI agents that power personalized, deeply satisfying interactions across voice, chat, email, SMS, and every other channel.
We’re building a future where customer experiences are redefined: from support tickets and hold music to faster resolutions, richer conversations, and deeper relationships. We’re proud to be backed by world-class investors who share that vision, including a16z, Accel, Bain Capital Ventures, Coatue, and Index Ventures, among others.
We’re an in-office company, driven by a shared commitment to excellence and velocity. Our values — Just Get It Done, Invent What Customers Want, Winner’s Mindset, and The Polymath Principle — shape how we work and grow as a team.
About the Team
The Infrastructure team builds and operates the foundations that power Decagon: platform, model inference, compute, data, and developer experience. We partner closely with product, research, and applied AI teams to deliver high-scale, low-latency systems with clear SLOs and great developer ergonomics.
We organize around two focus areas:
Platform: The foundational cloud stack — networking, compute, storage, security, and infrastructure-as-code — to ensure reliability, scale, and cost efficiency. CI/CD, paved paths, and core services that make shipping fast, safe, and consistent across teams.
ML & Data: Streaming/batch data platforms powering analytics/BI and customer-facing telemetry, including for customer-managed and on-prem environments. Realtime databases that enable low-latency agents. GPU and model-serving platforms for LLM inference with multi-provider routing.
Our mission is to deliver magical support experiences — AI agents working alongside humans to resolve issues quickly and accurately.
About the Role
We're looking for a hands-on Engineering Manager to lead the Platform team. This is a deeply technical player/coach role that sits at the foundation of everything Decagon ships. You'll lead the team responsible for the compute, networking, CI/CD, and deployment systems that every other engineering team builds on — from our multi-cloud SaaS environments to the single-tenant VPC and on-prem deployments we operate for regulated enterprise customers like major financial institutions.
You'll stay close to the code and systems — reviewing designs, participating in incident response, and contributing directly when it helps the team move faster. You'll also lead by example on AI-assisted engineering, setting the standard for how the team uses AI coding tools to ship higher-quality work more quickly.
You'll hire and develop a high-performing team while partnering closely with Security, Product Engineering, AI & Data Infrastructure, and customer-facing teams to make shipping fast and safe across a wide range of environments — from our primary cloud to air-gapped customer deployments. Success requires strong people leadership, crisp execution across concurrent enterprise commitments, and the technical depth to make sound architectural calls under real constraints.
In this role, you will
Build, lead, and develop a high-performing team of infrastructure engineers, including hiring, coaching, and performance management.
Own the technical strategy and roadmap for Decagon's Platform — compute, networking, CI/CD, IaC, and the deployment systems that underpin both SaaS and enterprise environments.
Stay hands-on: review designs and PRs with depth, lead architecture for hard problems, and contribute code directly when the team needs it — whether that's a critical migration, an on-call escalation, or an enterprise deployment under time pressure.
Drive architecture for multi-cloud and on-prem/cloud-prem deployments, including single-tenant VPC topologies, private connectivity, and air-gapped environments for regulated customers.
Set reliability, security, and cost standards across the platform, and build an operating cadence (on-call, incident review, capacity planning) that prevents repeated incidents and keeps the platform healthy as we scale.
Invest in developer experience — paved paths, golden templates, and CI/CD systems that let product teams ship quickly without compromising safety or consistency.
Raise the bar on AI-assisted engineering: define how your team uses AI coding tools, agents, and internal tooling to deliver faster with higher quality, and build the workflows, evals, and guardrails that make this durable.
Partner with Security, Product Engineering, and customer-facing teams to deliver enterprise deployments on aggressive timelines, navigate compliance requirements, and translate customer constraints into durable platform capabilities.
Your background looks something like this
2+ years of engineering management experience leading high-performing infrastructure, platform, or SRE teams in fast-moving environments, with a strong IC background before that.
Technical depth across infrastructure: you can design, review, and, when needed, build core systems in compute, networking, CI/CD, or deployment orchestration. You're comfortable dropping into the codebase and shipping a PR.
Hands-on experience with cloud platforms (AWS, GCP, or Azure), Kubernetes, infrastructure-as-code (Terraform or similar), and modern CI/CD systems.
A track record of delivering multi-quarter infrastructure initiatives — migrations, platform rebuilds, or capability launches — through ambiguity, creating clarity for your team and stakeholders.
A strong point of view on AI-assisted engineering: you actively use AI coding tools yourself, have opinions on where they work and where they don't, and see it as a core part of how modern infrastructure teams should operate.
Deep care for engineering craft and operational excellence, including reliability engineering, observability, incident learning, and cost discipline.
Clear communication and strong collaboration across Security, Product Engineering, and customer-facing functions.
Even better if you have
Experience delivering on-premises, air-gapped, or single-tenant deployments for regulated enterprise customers (financial services, healthcare, government).
Experience with multi-cloud or cloud-to-cloud migrations at scale.
Background in security and compliance frameworks (SOC 2, PCI DSS, FedRAMP, or similar).
Experience building developer platforms or paved-path systems that meaningfully raised engineering velocity.
Experience building internal tooling, agents, or workflows that use LLMs to accelerate engineering work.
Compensation
$280,000 - $430,000 + Equity
Benefits
We proudly offer the following benefits for our full-time employees:
Take-what-you-need vacation policy (subject to local requirements; UK employees receive 25 days of statutory leave)
Medical, Dental, and Vision benefits for you and your family
Life Insurance and Disability Benefits
Retirement Plan (e.g., 401(k), pension)
Parental Leave
Fertility and family building benefits through Carrot
Daily lunches and snacks in the office to keep you at your best
These benefits are described in more detail in Decagon’s policies, may vary by location, and can change at any time according to applicable compensation and benefits plans.