Job Summary
Technical Skill
- DevOps
- AWS
- Kubernetes
- DNS
- PostgreSQL
- MySQL
- Python
- Networking
- TCP/IP
- API
- Jenkins
- GitHub
- Apache cluster
- Elasticsearch
- CDN
- Docker
- Firewall
- Observability
- MS Azure
- Grafana
- Golang
- Amazon RDS
- Vault
- Ecommerce
- DevSecOps
- GCP
- DigitalOcean
- IAM
- FinOps
- IaC
- Terraform
- Pulumi
- AWS CloudFormation
- GitOps
- FluxCD
- ArgoCD
- CI/CD
- GitLab
- SAST
- DAST
- Helm
- Prometheus
- Loki
- OpenTelemetry
- Datadog
- Elastic Stack
- ELK
- Bash
- CLI
- RBAC
- CKA
- Service Mesh
- Istio
- Amazon Aurora
- VPC
Job description
Overview of job
About Us
We're not just building products; we're building a place where people grow. Personal development isn't a perk here; it's a priority. Every team member levels up alongside the company through hands-on projects, cutting-edge tools, and industry best practices.
Our clients expect excellence, and that standard shapes everything we do, from the infrastructure we design to the culture we cultivate.
The Role
We're looking for a Senior DevOps Engineer who's ready to take ownership of our infrastructure, solve tough problems at scale, and build the foundation that lets our teams ship products people actually love using.
This isn't a "keep the lights on" role. You'll shape how we build, deploy, and operate by influencing architecture decisions, championing reliability, and raising the bar for the entire engineering organization.
What You'll Be Doing
- Architect & Automate: Design, build, and maintain scalable, secure, and cost-effective cloud infrastructure (AWS, DigitalOcean, or similar) using Infrastructure as Code. You'll own the full lifecycle, from initial provisioning to ongoing drift detection and iterative improvement.
- CI/CD Ownership: Own and continuously evolve our CI/CD pipelines to maximize developer productivity. This includes ensuring fast, reliable, and automated releases while integrating quality gates, security scanning, and progressive delivery strategies (canary, blue/green, feature flags).
- Container Orchestration: Manage and scale containerized workloads on Kubernetes, including cluster lifecycle management, resource optimization (HPA, VPA, pod disruption budgets), networking, and storage. The goal is not just deploying workloads, but deeply understanding how they run in production.
- Observability & Reliability: Build and maintain a comprehensive observability stack (metrics, logs, traces) using tools like Prometheus, Grafana, Loki, or OpenTelemetry. Define and operate against SLOs/SLIs to make reliability measurable, not just aspirational.
- Security Integration (DevSecOps): Embed security into every layer — from image scanning and secrets management to RBAC hardening and network policies. You'll champion a shift-left security culture rather than treating it as an afterthought.
- Collaboration & Mentorship: Partner closely with development teams to optimize application performance, deployment strategies, and developer experience. Mentor junior and mid-level engineers, and advocate for DevOps principles across the organization.
- Incident Response & Continuous Improvement: Participate in an on-call rotation and lead incident response efforts. Drive blameless post-mortems that result in actionable improvements, not just documentation.
- Cost Optimization & FinOps: Proactively monitor cloud resource usage, implement right-sizing strategies, and establish tagging and budget controls. You'll ensure we scale efficiently without burning through budget.
What We Offer
- Competitive pay — Above-industry-standard salaries with a 13th-month bonus.
- Great benefits — Full insurance coverage and regular team-building activities.
- Flexibility — Friendly working environment and flexible working hours.
- Growth-focused — We invest in you as much as in our products. Whether it’s improving your English, mastering a new design tool, developing your design thinking, or exploring a new skill, we’ll support your growth every step of the way.
- Meaningful work — We work hand-in-hand with clients, fully investing in their product development to create solutions with true market potential.
- Balanced culture — We respect both products and technology equally, ensuring every team member understands, believes in, and contributes to what they’re building.
- Work-life respect — We stick to a sustainable pace with standard 8-hour days, Monday to Friday — no burnout culture here.
- Agile and adaptive — We work in sprints with realistic goals, improving processes continuously to match our true velocity.
- Ownership & autonomy — You’ll have the freedom to experiment, innovate, and shape your role, with a team that’s always open to better ways of working.
Job Requirement
Must-Haves
- Experience: 5+ years of progressive experience in DevOps, SRE, or Platform Engineering roles, with at least 2 years operating at a senior or lead level (mentoring, architectural decision-making, cross-team collaboration).
- Cloud Proficiency: Proven, hands-on experience with at least one major cloud provider (AWS preferred; otherwise Azure, GCP, or DigitalOcean), including compute, storage, networking, IAM design, cost optimization/FinOps awareness, and multi-account or multi-project governance (e.g., AWS Organizations, Landing Zones).
- Infrastructure as Code (IaC): Expertise in IaC tools such as Terraform (preferred), Pulumi, or CloudFormation, with experience writing modular, reusable, version-controlled infrastructure code with proper state management and drift detection. Familiarity with GitOps delivery models using tools like FluxCD or ArgoCD.
- CI/CD: Strong experience designing, scaling, and maintaining CI/CD pipelines (GitLab CI, GitHub Actions, or Jenkins), including pipeline security (secret management, SAST/DAST integration, artifact signing, supply-chain security) and progressive delivery strategies (canary, blue/green, feature flags).
- Containerization & Orchestration: Deep proficiency with Docker and Kubernetes in production, including cluster lifecycle management, Helm charts/Kustomize/operator patterns, resource management (requests/limits, HPA, VPA, pod disruption budgets), and debugging production workloads at the scheduling, networking (CNI), and storage (CSI) level.
- Observability & Reliability: Hands-on experience building observability stacks (metrics, logs, traces) with tools like Prometheus, Grafana, Loki, OpenTelemetry, Datadog, or the ELK stack. Experience defining and operating against SLOs/SLIs/error budgets, with proven incident response and blameless post-mortem experience.
- Scripting & Automation: Strong scripting skills in at least two of Python, Go, and Bash, with the ability to build internal tooling, custom controllers/operators, or CLI utilities that improve developer experience.
- Security Mindset: Working knowledge of DevSecOps practices — image scanning, runtime security (Falco, etc.), secrets management (Vault, SOPS, Sealed Secrets), RBAC hardening, network policies, and understanding of compliance-as-code principles and policy engines (OPA/Gatekeeper, Kyverno).
- Collaboration & Communication: Excellent analytical and troubleshooting skills with a systematic, documented approach to resolving complex issues. Ability to translate infrastructure decisions into business impact for non-technical stakeholders, with a track record of improving Developer Experience (DX) through internal platforms, self-service tooling, and golden paths.
Nice-to-Haves
- Professional cloud certifications (e.g., AWS Certified DevOps Engineer Professional, CKA/CKAD/CKS).
- Experience with service mesh technologies (Istio, Linkerd, Cilium Service Mesh).
- Experience managing and automating relational databases in production (PostgreSQL, MySQL, RDS/Aurora), including backup strategies, replication, and migrations.
- Strong understanding of networking principles — VPC design, DNS, TCP/IP, load balancers, firewalls, CDN, and zero-trust networking concepts.
- Exposure to chaos engineering practices (Litmus, Chaos Monkey, Gremlin).
- Experience with multi-cluster or multi-region Kubernetes architectures.
- Familiarity with platform engineering concepts — internal developer portals (Backstage), self-service infrastructure, and API-driven provisioning.
- Contributions to open-source DevOps/infrastructure projects.
Languages
- English: Speaking: Intermediate, Reading: Intermediate, Writing: Intermediate
Competencies
- Teamwork
- Reliability
- Communication Skills
- Analytical Skills