Lead Platform Engineer

Relish

Software Engineering

Remote

USD 130k-170k / year

Posted on Apr 30, 2026

Job details

Pay

  • $130,000 - $170,000 a year

Job type

  • Full-time

Work setting

  • Remote

Benefits
Pulled from the full job description

  • Referral program
  • 401(k)
  • Health insurance
  • 401(k) matching
  • Vision insurance
  • Dental insurance

Full job description

Lead Platform Engineer

About Us

We are a high-impact startup in the procurement space, growing 50% YoY. We process massive volumes of sensitive invoice and supplier data using advanced OCR, structured data extraction, matching, validation, and anomaly detection. Because we handle critical financial data for our clients, security, compliance, and trust are foundational to everything we build. As we expand our product capabilities into Agentic AI workflows, we are overhauling our infrastructure to support our next phase of growth securely and efficiently.

The Opportunity

We are looking for a Lead Platform Engineer to spearhead the modernization of our cloud infrastructure. Today, we operate heavily in an AWS Serverless environment (Lambda, Step Functions, DynamoDB). Tomorrow, with your leadership, we will migrate to a robust, highly scalable, and compliant platform built on Kubernetes, Temporal.io, and a mix of modern databases (PostgreSQL, MySQL, NoSQL).

In this role, you will own the transition, lay down the foundational infrastructure using Terraform CDK, and build out modern CI/CD pipelines from scratch. A major mandate for this role is to design the new platform to be as cloud-agnostic as possible, preventing vendor lock-in while maintaining high performance. You will ensure the platform is secure by design—adhering to frameworks like SOC 2 and GDPR—while designing the infrastructure required to run cutting-edge Agentic AI workloads.

Key Responsibilities

· Architect the Future: Lead the migration from our current AWS Serverless stack to a modern, cloud-agnostic Kubernetes architecture capable of supporting diverse database workloads (PostgreSQL, MySQL, NoSQL).

· Security & Compliance by Design: Build the new infrastructure to strictly adhere to compliance frameworks (e.g., SOC 2, ISO 27001, GDPR). Implement DevSecOps principles, zero-trust networking, secure secrets management, and robust IAM policies.

· Workflow Orchestration: Implement and manage Temporal.io to replace AWS Step Functions, ensuring highly reliable, scalable, and resilient distributed workflows.

· Advanced Observability: Design and implement comprehensive observability using OpenTelemetry and leading log management/APM platforms (e.g., Datadog, Splunk, Elastic, Grafana Loki). You will ensure we have deep, distributed tracing and visibility across microservices, Temporal workflows, and AI agents.

· Infrastructure as Code (IaC): Fully automate our cloud infrastructure provisioning and management using Terraform CDK, ensuring everything is version-controlled, auditable, repeatable, and easily portable across cloud providers.

· CI/CD Pipeline Design: Build and maintain fast, automated deployment pipelines using GitHub Actions or Azure DevOps, with automated security, testing, and compliance guardrails built in.

· AI Infrastructure: Collaborate closely with our Data and AI engineering teams to provision and optimize secure infrastructure for Agentic AI workflows (e.g., managing GPU compute, vector databases, and AI model deployments).

· Cloud FinOps: Actively monitor, analyze, and optimize infrastructure costs without sacrificing performance, security, or reliability.

What We're Looking For

· Experience: 7+ years of experience in DevOps, Cloud Infrastructure, or Platform Engineering, with a proven track record of leading large-scale architectural migrations.

· Cloud & Agnosticism: Deep expertise in AWS (EKS, networking, IAM), paired with a strong architectural mindset for building portable, cloud-agnostic systems that abstract away underlying provider dependencies.

· Security & Compliance: Proven experience designing and managing cloud environments subject to rigorous compliance audits (SOC 2, GDPR, etc.). You know how to secure a Kubernetes cluster and manage sensitive data at scale.

· Kubernetes Expert: Extensive hands-on experience designing, deploying, securing, and managing production Kubernetes clusters.

· Modern IaC: Strong proficiency in Infrastructure as Code. Experience with Terraform CDK (using TypeScript or Python) is highly preferred.

· Observability Champions: Deep understanding of distributed tracing, metrics, and logging. Hands-on experience with OpenTelemetry and major observability platforms.

· Programming Skills: You are a strong coder, not just a scripter. Proficiency in TypeScript, Python, or Go is required to effectively use Terraform CDK (CDKTF) and support our engineering teams.

· Database Knowledge: Experience with database migrations and optimizing a mix of relational (PostgreSQL, MySQL) and modern NoSQL databases for high-throughput, data-intensive workloads.

Bonus Points

· Hands-on experience with Temporal.io or similar workflow orchestration engines (e.g., Airflow, Cadence).

· Experience supporting AI/ML infrastructure, MLOps, or running LLM agents in production.

· Background in B2B SaaS, fintech, or procurement data processing.

Pay: $130,000.00 - $170,000.00 per year

Benefits:

  • 401(k)
  • 401(k) matching
  • Dental insurance
  • Health insurance
  • Referral program
  • Vision insurance

Experience:

  • Infrastructure as code: 3 years (Required)
  • Platform Engineering: 1 year (Required)
  • Terraform CDK: 3 years (Required)
  • AWS: 4 years (Required)

Work Location: Remote