One Mount Group

Times City, 458 Minh Khai, Hà Nội

Company Size: 25-99

Job Summary

25-99

Product

Việt Nam

AI Infrastructure Network Engineer

One Mount Group

Hoàn Kiếm, Hà Nội

  • English
  • Experienced (Non-Manager)
  • Full Time
  • Negotiable
  • Posted: 21/03/2026
  • 1

Job description

Overview of job

We are looking for a highly specialized AI Infrastructure Network Engineer to design, implement, and optimize the high-speed data fabric that powers our supercomputing and AI clusters. You will be responsible for the low-latency, high-throughput interconnects that allow thousands of GPUs to work as a single unit. Your expertise in InfiniBand (IB), RDMA, and advanced network topologies will be critical in scaling our AI training and inference capabilities.

  • Fabric Design & Architecture: Design and scale high-performance InfiniBand (IB) fabrics using advanced topologies such as Fat-Tree, Dragonfly, and Torus to support massive AI workloads.
  • Interconnect Optimization: Manage and optimize NVLink (NVL) domains and multi-GPU communication across nodes to ensure maximum throughput and minimal collective communication overhead.
  • High-Speed Data Transmission: Implement and fine-tune RDMA (Remote Direct Memory Access), including RoCE and InfiniBand Verbs, to reduce CPU overhead and latency in data transfers.
  • Supercomputer Networking: Configure and maintain the backend "Compute Fabric" specifically tailored for distributed deep learning and large-scale parallel processing.
  • Performance Tuning: Monitor and troubleshoot congestion, adaptive routing, and quality of service (QoS) within the IB fabric to prevent bottlenecks during large-scale model training.
  • Collaboration: Work closely with AI Systems Engineers to align network performance with the requirements of frameworks like PyTorch and distributed training libraries.

Salary & Allowances

  • 13-month salary with annual performance bonus, project incentives, sales incentives (based on position)
  • Lunch allowance: 730.000 VND/month
  • Special occasion bonus: 3.000.000 - 5.000.000 VND/year
  • Annual leave: Up to 20 days/year (based on level)
  • Health: Social insurance, premium health insurance, yearly health check
  • Laptop, monitor, and other facilities, accounts, and tools needed for work

Career Growth

  • Yearly salary review and promotion
  • Diverse career paths: management or expert track, with opportunities for functional rotation
  • Free learning resources on Udemy, Coursera, and O'Reilly; internal workshops, certification sponsorship, and exclusive mentoring from C-level executives
  • Recognition and awards at team and organizational levels.

Working Environment

  • Open, collaborative workspace that fosters both individual focus and teamwork
  • Young, dynamic, and collaborative working atmosphere
  • Unwind zones: gaming, table tennis, yoga, gym, bathrooms, sleep corner.
  • Quarterly and yearly team-building activities and engaging internal events.

Job Requirement

  • Expertise in HPC Networking: Deep understanding of data transmission mechanics within supercomputers and AI clusters.
  • Network Topologies: Practical experience or strong theoretical knowledge of Fat-Tree, Dragonfly, and Slim Fly architectures.
  • Protocol Mastery: Advanced knowledge of the InfiniBand stack, RDMA, and Ethernet-based high-speed networking.
  • Hardware Knowledge: Familiarity with NVIDIA/Mellanox Quantum switches, ConnectX NICs, and NVLink/NVSwitch technologies.
  • Systems Proficiency: Strong Linux networking skills, including experience with OFED (OpenFabrics Enterprise Distribution) and subnet managers.
  • Education: Relevant experience in AI infrastructure or honors programs is highly valued. No degree required, so long as you can prove your knowledge and value.

Preferred Skills

  • Experience in Fintech or large-scale AI production environments.
  • Knowledge of GPU-aware MPI and collective communication libraries (NCCL).
  • Experience managing networking for NVIDIA Jetson or GPU clusters.

Languages

  • English

    Speaking: Intermediate - Reading: Intermediate - Writing: Intermediate

Technical Skill

  • Networking
  • HPC
  • Protocol
  • Quantum
  • Linux
  • Ethernet
  • Fintech
  • GPU
  • Switches
  • Jetson

COMPETENCES

  • Communication Skills

BUSINESS PROFILE

One Mount Group’s (1MG) goal is to build Vietnam’s largest-scale technological ecosystem.

1MG was established with the vision of improving the economy’s efficiency, creating a technology infrastructure that helps Vietnamese businesses accelerate their value creation, and providing products to consumers at more competitive costs.

1MG is committed to building a strong and sustainable Vietnamese business and to creating a broad playing field that nurtures and grows future start-ups. We believe that Vietnam’s next “giant” businesses will emerge from our core infrastructure. The goal of 1MG is to build Vietnam’s largest-scale technological ecosystem, with solutions that link, optimize, and close gaps in the value chains of key economic sectors with strong growth in Vietnam.

With a sound financial position and strong business administration, 1MG has the competitive advantages needed to attract and retain the best Vietnamese talent from around the world.