We are looking for an SRE / DevOps Engineer to build and scale enterprise-grade cloud platforms. The role is weighted toward engineering (70% engineering, 30% operations) and focuses on:

Building reliable, scalable infrastructure
Driving automation and platform engineering
Enabling system resilience through testing and chaos engineering

You will not just operate systems; you will design and build the platform itself. In this role, you will:

Build and operate Tier-0 / Tier-1 systems where reliability is critical
Design infrastructure that scales predictably without runaway costs
Develop automation frameworks for load testing, validation, and resilience
Enable secure and compliant environments (FedRAMP-aligned systems)
Contribute to internal platforms and developer tooling (IDP mindset)
Modernize and refactor legacy systems without disrupting production

Responsibilities

Design and implement scalable AWS infrastructure for production systems
Build Infrastructure-as-Code modules for consistent, reproducible environments
Develop and maintain CI/CD pipelines for deployment, testing, and validation

Build automation for:

Load testing and system readiness
Snapshot validation and recovery checks
Smoke testing and health verification

Define and improve:

Monitoring, alerting, and observability systems
Incident prevention (not just response)

Collaborate across teams to build shared platform capabilities
Contribute to architecture decisions and platform evolution

Requirements

2-4 years of experience in DevOps / SRE / Cloud Engineering roles

Strong hands-on experience in:

AWS production environments (enterprise scale preferred)
Infrastructure-as-Code (Terraform or CloudFormation)
CI/CD pipelines (Jenkins, GitHub Actions)

Strong coding/scripting skills in Python or Bash (required)

Proven experience with:

Designing and operating scalable, reliable systems
Debugging production issues and improving system stability
Automating infrastructure and workflows

Solid understanding of:

Distributed systems and cloud architecture
Performance, scalability, and cost optimization

Tech Stack

AWS (RDS, Lambda, EventBridge, ECS/Kubernetes, FIS*)
Terraform / CloudFormation (IaC)
CI/CD: Jenkins, GitHub Actions
Observability: CloudWatch, Prometheus, Grafana
Scripting/Development: Python, Bash (Node.js a plus)

*Chaos engineering tools are good to have, not mandatory.

Good to Have

Experience with chaos engineering (AWS FIS, Gremlin, etc.)
Exposure to FedRAMP or regulated environments
Experience with Kubernetes or ECS
Background in database operations and disaster recovery
Experience transitioning from backend engineering to SRE

Benefits

Opportunity to work with a dynamic and fast-paced IT organization.
Make a real impact on the company's success by shaping a positive and engaging work culture.
Work with a talented and collaborative team.
Be part of a company that is passionate about making a difference through technology.
