Money Forward Vietnam

E-Town Central, 11 Doan Van Bo, Ho Chi Minh City

Company Size: 25-99

Job Summary

  • 25-99
  • Product
  • Vietnam

Agent Ops Engineer

Money Forward Vietnam

  • District 4, Ho Chi Minh City
  • Negotiable
  •  Full Time
  •  English
  •  Experienced (Non-Manager)

  • Posted: 24/06/2026

Technical Skills: AI (Artificial Intelligence), DevOps, AWS, Java, Spring, Python, Regression Testing, API, Architecture, Observability, Protocol, OpenAI, Apache Kafka, Caching, Spring Boot, Kotlin, Kubernetes, Cloud Infrastructure, IaC, MLOps, CI/CD, LLM, LangChain, RAG, LangGraph, OpenTelemetry, MCP

Job description

Overview of job

We’re hiring an Agent Ops Engineer to scale AI agent capabilities across the HRS domain and its products. This is a high-impact role at the intersection of AI engineering, platform operations, and knowledge enablement. You’ll set direction and make AI agents reliable in production across teams by owning the lifecycle, quality gates, observability, and operational standards, while embedding with teams to accelerate adoption. The larger goal of this centralized Agent Ops model is to equip the AI enablers and product builders within each product team to develop agents, while contributing common best practices and guardrails to MFBS adoption across other domains such as ERP and SMB.

What you will do:

1) Agent Engineering & Operations

  • Design, build, and maintain production-grade AI agent systems, including: context engineering and instruction architecture, prompt hardening and safe execution boundaries, tool integrations and multi-step orchestration, memory strategies and reliability patterns.
  • Own the full agent lifecycle: prototype → evaluate → deploy → monitor → iterate.
  • Build and maintain an evaluation pipeline to measure agent quality, catch regressions, and enforce deployment gates (golden datasets, scenario suites, automated checks).
  • Instrument agents and agent platforms for production observability: structured logging, tracing, and metrics; latency and cost monitoring; tool-call success rates and failure analysis.
  • Define operational readiness standards including: rollback criteria, incident response playbooks, recovery paths for common failure modes.

2) Team Enablement & Coaching

  • Embed with product engineering teams to identify high-value use cases ready for agent automation. We operate as a central Agent Ops function, enabling AI product builders through AI enablers.
  • Translate business workflows into agent-executable tasks with clear contract boundaries/interfaces, assumptions and inputs/outputs, and failure modes and safe fallbacks.
  • Deliver targeted coaching to engineers on: context engineering best practices, harness design and regression testing patterns, agent skill design and tool-contract discipline.
  • Reduce onboarding time for teams adopting AI capabilities—from first conversation to a production-ready agent.
  • Train product engineers to extend and maintain agent skills independently.

3) Standards & Knowledge Operations

  • Author and maintain org-level standards for agents, including: naming conventions, context file structures and ownership rules, skill interface contracts (inputs/outputs, invariants, error handling), evaluation criteria and release quality bars.
  • Establish and enforce “repo-as-discipline” practices so agent knowledge is: versioned, reviewable, discoverable, reusable; not trapped in prompt snippets or individual heads.
  • Build and grow a shared agent skills library that teams can reuse and extend.
  • Track and aggregate AI tooling/framework updates and external best practices, serving as a central intake so product teams don’t each have to follow the entire AI landscape.
  • Run internal knowledge-sharing sessions, showcases, and retrospectives to propagate learnings efficiently.

Caring Mental & Physical Recreation: 

  • Hybrid working
  • Full salary during probation & 13th-month salary
  • Social insurance on full salary from probation
  • Premium Health insurance from probation
  • Flexible start 8AM-9AM from Mon-Fri
  • 16 days off annually + 1 Birthday Leave 
  • 5 extra days of paternity leave
  • Annual company trip; Quarterly team building activities
  • Club activities
  • Annual health check

Caring Career & Development: 

  • Clear career path
  • Sponsorship for foreign-language and international technology certifications
  • Well-equipped workstation: MacBook Pro, additional monitor, etc.
  • Soft skill workshops
  • Tech seminars
  • Monthly and biannual recognition awards
  • Performance review twice/year

Job Requirement

What you bring:

Must have

  • 12+ years of experience in the software development industry
  • Hands-on experience building and deploying production AI agents using modern frameworks (LangGraph, LangChain, OpenAI Agents SDK, trueAI, or equivalent).
  • Strong understanding of context engineering, including instruction architecture, token management, caching strategies, and latency-aware design.
  • Experience building evaluation pipelines: golden datasets and scenario libraries; automated quality gates and regression detection.
  • Familiarity with agent observability: tracing, structured logging, latency, and cost monitoring; tool-call reliability metrics and failure analysis.
  • Ability to design guardrails: output validation; prompt injection mitigation; safe execution boundaries for tools/actions.
  • Solid backend engineering skills; comfortable owning services/APIs end-to-end.
  • Strong communicator who can coach engineers, facilitate cross-team discussions, and write clear technical documentation.
  • Experience with production reliability and platform operations, including: event-driven architectures (Kafka and/or message queues); retries/backoff, DLQs, idempotency, ordering, backpressure; CDC/outbox-style patterns (or similar asynchronous reliability patterns); Kubernetes-based deployment and day-2 operations; CI/CD pipelines and infrastructure as code; on-call, incident response, postmortems, and SRE-style practices (SLOs/SLIs, runbooks).

Nice to have

  • Experience with RAG systems: ingestion, chunking, embeddings, hybrid search, retrieval evaluation.
  • Familiarity with MCP / Model Context Protocol or similar agent tooling standards (e.g., “MPTV”), and tool integration ecosystems.
  • Proficiency across Java/Kotlin (Spring Boot) and Python in production environments.

Who thrives in this role?

  • Engineers with an SRE/DevOps background pivoting into AI who naturally think about reliability, observability, and incident response.
  • Backend engineers with hands-on LLM/agent framework experience who want to work cross-functionally and enable multiple teams.
  • MLOps/LLM engineers who want to embed in product orgs and ship applied systems (not only model infrastructure).
  • Engineers who treat documentation, standards, and knowledge transfer as first-class engineering outputs.

What you can expect

  • A greenfield mandate to define what “good AI operations” looks like at scale inside an engineering organization.
  • Direct influence on the standards, patterns, and tooling multiple product teams will adopt.
  • A role that grows from team-level impact to organization-wide impact as the practice matures.
  • Work at the frontier of applied AI engineering, where best practices are still being written.

Our stack

Agent frameworks and LLM APIs, OpenTelemetry, Kafka/event-driven systems, Kubernetes, Spring Boot, Java, Kotlin, Python, CI/CD pipelines, AWS/cloud infrastructure.

Languages

    • English

    • Speaking: Intermediate - Reading: Intermediate - Writing: Intermediate

COMPETENCES

  • Reliable
  • Analytical Skills
  • Documentation

BUSINESS PROFILE

Money Forward Vietnam aims to solve money-related issues of all individuals and businesses through building an open and fair financial platform and providing essential services.

We contribute to building a better society by providing services that enable users to “see money in a positive light and broaden their range of opportunities,” thereby significantly enriching their lives.
