AI Evaluation Specialist
Binance, Taipei, Taipei City, Taiwan, Full time
1 day ago
As an A.I. Agent Evaluation and Optimisation Specialist, you will play a critical role in ensuring both the outstanding performance and continuous improvement of large language model (LLM)-driven autonomous agents. Your responsibilities span from designing and implementing robust evaluation frameworks to proactively identifying and executing optimisation strategies that enhance reliability, adaptability, and compliance across the agent lifecycle.
Responsibilities:
- Design, Develop & Optimise Evaluation Plans:
  - Create structured, risk-aware, and adaptive evaluation and optimisation plans. Align these with user goals, governance requirements, and system architectures. Translate objectives into measurable criteria, scenarios, and optimisation targets.
- Test Suite Development & Performance Tuning:
  - Develop and curate tests covering standard, edge, and emergent agent behaviours. Collaborate to generate synthetic data, incorporate domain expertise, and apply hands-on optimisation techniques to improve agent robustness.
- Multi-Stage Evaluation & Optimisation:
  - Execute controlled (offline) and real-world (online) evaluations, assessing not just outputs but also reasoning steps, tool usage, and workflow execution. Identify and resolve performance bottlenecks, drive fine-tuning, and recommend systemic improvements.
- Analyse, Diagnose & Optimise:
  - Conduct deep analysis of evaluation results to find performance gaps, failure modes, and optimisation opportunities at both the model and system level. Provide clear, actionable recommendations to directly improve agent efficiency, accuracy, and reliability.
- Drive Continuous Improvement:
  - Collaborate closely with development teams to translate evaluation and optimisation findings into runtime adaptations, code performance enhancements, architectural upgrades, and targeted model retraining, including prompt engineering and reinforcement learning from human feedback (RLHF) methodologies.
- Implement Feedback Loops:
  - Establish feedback mechanisms that combine human and machine evaluator input for ongoing monitoring, anomaly detection, and dynamic agent behaviour adjustment, integrating optimisation insights into deployment pipelines.
- Ensure Compliance and Safety:
  - Maintain up-to-date governance documentation and safety cases, overseeing regulatory, ethical, and operational compliance through both evaluation and optimisation cycles.
- Cross-Functional Collaboration:
  - Work with A.I. researchers, engineers, and domain experts to align evaluation and optimisation strategies with product objectives and user needs.
Requirements:
- Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Data Science, or a related field.
- Demonstrated hands-on A.I. agent development experience, with a track record of identifying and implementing agent performance improvements.
- In-depth understanding of large language models (LLMs), their optimisation, and agent system architectures.
- Experience in both A.I. evaluation methodologies (such as benchmarking and online/offline analysis) and direct agent optimisation, such as model fine-tuning or prompt design.
- Familiarity with software engineering best practices (e.g. TDD, BDD), and deep exposure to AI-specific frameworks, observability, and lifecycle analytics.
- Proven ability to perform data-driven diagnostics and root cause analysis, with direct contributions to measurable improvement in A.I. agent performance.
- Strong communication skills, especially for documenting evaluation plans, optimisation strategies, result rationales, and technical recommendations.
- Experience working effectively in teams and in cross-functional feedback processes that bridge evaluation, development, and operations.
- Programming skills in Python plus experience with major A.I./ML libraries and APIs, including hands-on development of LLM agents.
• Shape the future with the world's leading blockchain ecosystem
• Collaborate with world-class talent in a user-centric global organization with a flat structure
• Tackle unique, fast-paced projects with autonomy in an innovative environment
• Thrive in a results-driven workplace with opportunities for career growth and continuous learning
• Competitive salary and company benefits
• Work-from-home arrangement (the arrangement may vary depending on the nature of the business team's work)
Binance is committed to being an equal opportunity employer. We believe that having a diverse workforce is fundamental to our success. By submitting a job application, you confirm that you have read and agree to our Candidate Privacy Notice. We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.