Advanced Compute Infrastructure Specialist

5 days ago


Bade, Taiwan beBeeCloudReliabilityEngineer Full time
Job Overview
The Cloud Reliability Engineer is responsible for the high availability, performance, scalability, and security of our Linux-based AI cloud platforms. The role involves deploying, scaling, and maintaining GPU-accelerated compute clusters, Kubernetes workloads, and the supporting storage and network infrastructure. The ideal candidate has expertise in automating infrastructure deployment, improving observability, and applying SRE best practices to support reliable AI and MLOps environments.

Key Responsibilities:
1. Design and deploy infrastructure on bare metal or cloud using Terraform, Ansible, or Helm. Automate workflows with Python or Go.
2. Maintain and scale GPU clusters, Kubernetes, and AI-optimized storage (Ceph, BeeGFS, Weka) to ensure stability and performance.
3. Use tools such as Prometheus, Grafana, and ELK to monitor system health and trigger alerts on anomalies.
4. Analyze usage patterns and forecast infrastructure needs for AI workloads.
5. Lead root cause analysis and manage SLOs/SLIs/SLAs to maintain high availability.
6. Work with DevOps/MLOps teams on CI/CD pipelines using GitLab, ArgoCD, or similar tools.
7. Secure Linux systems, manage certificates, and enforce access controls (RBAC, LDAP SSO, TLS, segmentation).

  • Bade, Taiwan beBeeHardware Full time

    Job Title: Infrastructure Support Specialist
    Job Summary: Support system evaluation during New Product Introduction (NPI), POC, and tradeshow. Provide solutions and integrations for storage systems. Resolve hardware issues by writing debug programs using shell scripts. Collaborate with production and R&D teams to identify root causes and implement fixes. Utilize...


  • Bade, Taiwan beBeePowerSupplyEngineer Full time

    Job Overview
    We are seeking a highly skilled Power Supply Engineer to join our team. This individual will be responsible for assisting in the development and validation of server power supply and related products. The ideal candidate will have experience working with cross-functional teams and possess strong communication skills. They will also have a strong...


  • Bade, Taiwan beBeeProduction Full time

    Job Title
    We are seeking a skilled Production Specialist to join our team. In this role, you will collaborate with the R&D team to optimize manufacturing and NPI processes at CMs.
    About the Role:
    Balancing the workload of production engineers across all CMs
    Working with the Test Engineering (TE) team to ensure comprehensive testing coverage for new...


  • Bade, Taiwan beBeesoftware Full time

    Software Developer Roles
    About Us
    We are a Top Tier provider of advanced server, storage, and networking solutions for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, Hyperscale, HPC and IoT/Embedded customers worldwide. We seek talented, passionate, and committed engineers, technologists, and business leaders to join us.
    Key...


  • Bade, Taiwan beBeeTestEngineer Full time

    Job Opportunity: Software Test Engineer
    The ideal candidate for this role will possess a strong attention to detail and the ability to think critically. You will be responsible for thoroughly examining all aspects of products or applications, conducting tests, and implementing test automation strategies. This is an exciting opportunity to join our software...