
IT Jobs


Techcombank

Techcombank Tower, 191 Ba Trieu, Hà Nội

Company Size: 100-499


Job Summary

  • 100-499
  • Product
  • Việt Nam

Senior Site Reliability Engineer

Techcombank

  • Hai Bà Trưng, Hà Nội
  • Negotiable
  • Full Time
  • Experienced (Non-Manager)
  • Posted: 31/10/2025
  • Expired
Technical Skill: Java, Python, Apache Spark, Unity, AWS Lambda, Observability, Scala, DevOps, Grafana, OpenAI, Amazon S3, Apache Kafka, ITIL, DataDog, Terraform, AWS CloudFormation, Prometheus, Apache Airflow, Apache Flink, Databricks, AWS Glue, IaC, FAISS, LLM, Vector, GenAI, RAG, Claude

Job description

Overview of job

1. About the Role:

We are seeking a highly skilled Site Reliability Engineer with experience applying GenAI to automate and enhance the reliability of complex data platforms in the Data Division. You will be responsible for building self-healing infrastructure and AI-powered observability, and for automating incident response across data pipelines (e.g., Databricks, Glue, Kafka, Flink). This is a high-impact role in which you will shape the future of data reliability at Techcombank, mentor engineers, and lead initiatives that span multiple teams and domains.

2. Key Responsibilities:

Platform Reliability & Automation
• Design, implement, and operate reliable, scalable, and observable data platforms.
• Automate incident triage, remediation, and postmortems using GenAI-powered tools.
• Develop intelligent runbooks and self-healing workflows using LLMs.
GenAI-Enabled SRE Practices
• Build and integrate GenAI copilots for on-call support, anomaly detection, and RCA (root cause analysis).
• Fine-tune or prompt-engineer LLMs for specific use cases such as summarizing logs, interpreting metrics, or generating remediation steps.
• Leverage vector databases (e.g., FAISS, Weaviate) to retrieve telemetry and incident history for GenAI prompts.
Observability & Anomaly Detection
• Integrate GenAI with observability tools (e.g., Datadog, Prometheus, Grafana, OpenTelemetry).
• Build systems for natural language querying of platform health and pipeline performance.
• Collaborate with data engineers to monitor SLIs/SLOs across ingestion, transformation, and delivery layers.
CI/CD & Risk Management
• Integrate GenAI into CI/CD pipelines to generate blast radius analyses and deployment guardrails.
• Use LLMs to assess the risk of configuration or schema changes before production rollout.
• Automate validation and rollback strategies based on historical outcomes.

WHY BECOME IT/DATA EXPERTS AT TECHCOMBANK?

  • Investing over 500 million USD to develop large-scale IT projects, Techcombank is one of the leading banks in technology trends in Vietnam.
  • You will grow with Techcombank by having the opportunity to learn from top experts from across the world.
  • Techcombank provides a rewarding remuneration structure that is commensurate with your achievements and contributions.
  • Techcombank ranks among the Top 2 best places to work in the banking industry, where you can experience various exciting activities throughout the year: Company Anniversary, Team Building, Active Saturday, Year End Party, etc.

Job Requirement

• Bachelor's degree in computer science, software engineering or information technology
• Good command of English

• 5+ years in SRE, DevOps, or Data Engineering roles with a strong focus on automation and observability.
• Solid experience in cloud-native data platforms (e.g., Databricks, Glue, Kafka, Flink, S3, Lambda).
• Proven experience using or integrating GenAI tools (OpenAI, Claude, HuggingFace Transformers).

• Proficiency in Python or Scala; experience with Spark and Airflow is a plus.
• Familiarity with LLM techniques: prompt engineering, embeddings, retrieval-augmented generation (RAG).
• Hands-on experience with monitoring and alerting tools (e.g., Prometheus, Grafana, Datadog).
• Experience with Infrastructure as Code (e.g., Terraform, CloudFormation).
Preferred:
• Experience fine-tuning LLMs or integrating GenAI agents into production systems.
• Familiarity with vector databases (e.g., Pinecone, Qdrant, FAISS).
• Knowledge of data quality frameworks and lineage tools (e.g., Deequ, Great Expectations, Amundsen, Unity Catalog).
• Understanding of ITIL/incident management frameworks.
• Strong communication and documentation skills, especially in on-call and postmortem environments.

Languages

    • English

    • Speaking: Intermediate - Reading: Intermediate - Writing: Intermediate

COMPETENCES

  • Reliable
  • Communication Skills
  • Documentation


BUSINESS PROFILE

Techcombank aspires to be the best bank and a leading business in Vietnam.

MISSION:

• To be the preferred and most trusted financial partner of our customers, providing them with a full range of financial products and services through a personalized, customer-centric relationship.

• To provide our employees with a great working environment where they have multiple opportunities to develop, contribute and build a successful career.

• To offer our shareholders superior long-term returns by executing a fast growth strategy while enforcing rigorous corporate governance and risk management best practices.

CORE VALUES:

1. Customer first: what we do is only valued if it is truly beneficial to our customers and colleagues.

2. Innovation: Make improvements to lead the way.

3. Teamwork: At Techcombank, you cannot perform well without cooperation.

4. People development: People with proven capability will bring the organization competitive advantages and remarkable successes.

5. Accountability: Be committed to overcoming difficulties and achieving great successes.
