AI Evaluation Specialist

8 hours ago


Taiwan Binance Full time NT$900,000 - NT$1,200,000 per year

Binance is a leading global blockchain ecosystem behind the world's largest cryptocurrency exchange by trading volume and registered users. We are trusted by over 280 million people in 100 countries for our industry-leading security, user fund transparency, trading engine speed, deep liquidity, and an unmatched portfolio of digital-asset products. Binance offerings range from trading and finance to education, research, payments, institutional services, Web3 features, and more. We leverage the power of digital assets and blockchain to build an inclusive financial ecosystem to advance the freedom of money and improve financial access for people around the world.

We are seeking a dedicated AI Evaluation Specialist responsible for designing, implementing, and managing comprehensive evaluation frameworks that span the entire lifecycle of LLM agents—from pre-deployment testing to post-deployment monitoring and iterative refinement. Your work will directly influence Binance's AI adoption journey by ensuring the reliability, adaptability, and governance compliance of AI agents operating across various domains such as Customer Service, Growth, and Compliance.

Responsibilities:
  • Participate in the entire software development lifecycle, from requirements analysis through test planning, execution, and defect tracking to product release and maintenance.
  • Serve as the go-to person for AI agent evaluation and continuous monitoring.
  • Create comprehensive, effective test strategies and perform hands-on testing to ensure the accuracy, reliability, and performance of AI and data applications.
  • Perform effective root cause analysis of test failures and product issues, and drive optimizations for future enhancements.
  • Design and develop internal tools leveraging AI technology to improve engineering and testing work efficiency.
Requirements:
  • Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Data Science, or a related field.
  • Strong understanding of Large Language Models (LLMs), autonomous AI agents, and their system architectures.
  • Experience with AI evaluation methodologies, including offline benchmarking, online monitoring, and hybrid human-AI evaluation approaches.
  • Familiarity with software engineering best practices such as Test-Driven Development (TDD), Behavior-Driven Development (BDD), and their limitations in AI contexts.
  • Proficiency in designing adaptive, lifecycle-spanning evaluation frameworks that incorporate both quantitative and qualitative metrics.
  • Experience with evaluation tools and frameworks (e.g., Opik, LangSmith) is a plus.
  • Ability to analyze complex system-level behaviors, including reasoning pipelines, tool integrations, and emergent agent actions.
  • Strong analytical skills with experience in data-driven diagnostics and root cause analysis.
  • Excellent communication skills to document evaluation plans, results, and recommendations clearly.
  • Experience working in cross-functional teams and managing feedback loops between evaluation and development.
  • Experience collaborating with infrastructure or platform teams to improve AI tooling and automation platforms.

Why Binance


• Shape the future with the world's leading blockchain ecosystem
• Collaborate with world-class talent in a user-centric global organization with a flat structure
• Tackle unique, fast-paced projects with autonomy in an innovative environment
• Thrive in a results-driven workplace with opportunities for career growth and continuous learning
• Competitive salary and company benefits
• Work-from-home arrangement (the arrangement may vary depending on the work nature of the business team)

Binance is committed to being an equal opportunity employer. We believe that having a diverse workforce is fundamental to our success.

By submitting a job application, you confirm that you have read and agree to our Candidate Privacy Notice.


