Job Title – AI Platform Lead
Company – TCS (MEA)
Location – Dubai
Job type – Full time

About Us:
Tata Consultancy Services (TCS) is an IT services, consulting and business solutions organization that has been partnering with many of the world’s largest businesses in their transformation journeys for over 50 years. TCS offers a consulting-led, cognitive powered, integrated portfolio of business, technology and engineering services and solutions. This is delivered through its unique Location Independent Agile™ delivery model, recognized as a benchmark of excellence in software development.

A part of the Tata group, India's largest multinational business group, TCS has over 616,171 of the world’s best-trained consultants with 157 nationalities in 53 countries. For more information, visit www.tcs.com and follow TCS news at @TCS_News.

Job Description:

Key Accountabilities:

AI Platform Operations

Incident and Problem Management

AI Agent and Workflow Support

LLM and Model Service Support

API, AI Gateway and MCP Support
Operate and troubleshoot the Azure API Management layer that serves as both the enterprise API gateway and the MCP Gateway for Gernas. Diagnose issues with API policies, authentication, authorisation, routing, rate limiting, quotas, caching, and backend connectivity. Provide deep support for MCP servers, MCP tools, and the MCP Gateway pattern, including tool discovery, schema validation, protocol-level failures, and the coordination between MCP clients hosted in agents and the underlying tool endpoints. Ensure that the gateway remains a secure, observable, and policy-compliant control plane for all AI traffic.

Observability and Performance Management
Use Comet Opik as the primary observability surface for agent and LLM execution, working with traces, prompts, agent execution paths, latency breakdowns, token usage, errors, and model quality indicators.
Build and maintain operational dashboards, alert rules, and correlation views that combine Opik telemetry with Azure Monitor, Application Insights, Log Analytics, and CloudWatch data. Lead performance optimisation initiatives where trace evidence shows hotspots in prompts, tools, retrieval steps, or model selection, and ensure that observability coverage keeps pace with platform evolution.

Voice AI Support
Support the production operation of ElevenLabs-based voice AI capabilities, including speech generation, voice-agent connectivity, real-time audio session handling, and API consumption patterns. Investigate latency, audio quality, dropped sessions, and integration failures across the voice channel and its dependent platforms, and coordinate with ElevenLabs and integration partners on upstream issues.

Release and Change Management
Validate releases prior to and immediately following deployment, exercising production verification scripts, smoke tests, and rollback procedures. Maintain release readiness through clear configuration management, environment parity checks, and pre-deployment risk reviews. Operate within FAB's change-management framework, ensuring that all production changes including patches, upgrades, configuration adjustments, and model or prompt rotations pass through the appropriate change controls and post-implementation review.

Security, Risk and Compliance
Uphold the security and compliance posture of Gernas and dependent AI products. Manage identity and access controls, secrets, certificates, and managed identities across the platform; coordinate vulnerability remediation and patching cycles; and maintain audit evidence for internal audit, supervisory reviews, and external assurance.
Operate responsible-AI controls including content filtering, PII and PCI detection, data egress controls, and model-access governance, and ensure that secure integration patterns are followed across every API, MCP tool, and external dependency.

Service Improvement and Automation
Drive continuous reduction of manual support effort through automation of routine operational tasks, self-healing patterns, monitoring enhancements, and proactive remediation. Maintain a current and high-quality library of support playbooks, runbooks, knowledge articles, and standard operating procedures. Identify and lead service-improvement initiatives that lift platform reliability metrics, reduce incident volume, and shorten mean time to resolution.

Stakeholder and Vendor Coordination
Operate as a credible technical counterpart to business units, engineering teams, the Cloud Platform team, Cybersecurity, Architecture, AI Governance, and Service Management. Lead vendor engagement with Microsoft, AWS, Core42, ElevenLabs, and other technology partners on incidents, capacity reviews, roadmap items, and product issues, ensuring that vendor accountability is exercised and that escalations are progressed effectively.

On-Call and Operational Readiness
Participate in a 24×7 support model, including a structured on-call rotation, major incident leadership, disaster-recovery exercises, business-continuity testing, and production-readiness assessments for new agents, models, and integrations entering the platform.
Treat operational readiness as a release gate rather than an afterthought, and ensure that nothing reaches production without explicit operational sign-off.

TECHNICAL SKILLS: minimum 8-10 yrs of working experience mandatory
Cloud AI Services
Agentic AI Frameworks
LLM Operations
Kubernetes and Containers
API and MCP Gateway
Observability
DevOps and Automation
Voice AI
Security and Governance
Service Management

KNOWLEDGE, SKILLS, & EXPERIENCE

Education
Bachelor's degree in Computer Science, Artificial Intelligence, Information Technology, Engineering, or a closely related discipline. A relevant master's degree in AI, Machine Learning, or Cloud Computing will be considered an advantage.

Professional Experience
Approximately 8–10 years of overall IT experience, with significant time spent in cloud application support, platform engineering, DevOps, Site Reliability Engineering, production operations, or senior technical support roles. Of this, a minimum of 3–5 years of relevant experience supporting AI, machine learning, Generative AI, conversational AI, cloud-native platforms, or other data-intensive enterprise systems is required.

Banking and Regulated Environment
Experience operating within banking, financial services, government, telecommunications, or another highly regulated enterprise environment is strongly preferred. Familiarity with regulatory expectations around data protection, model governance, audit evidence, and operational resilience is highly valued.

Technical Troubleshooting
Strong analytical and diagnostic skills, with demonstrated ability to troubleshoot complex distributed systems across applications, APIs, AI models, agents, cloud services, Kubernetes workloads, networking, identity, and third-party integrations.
Capable of reasoning from symptom to root cause across stack boundaries without losing fidelity.

Operational Skills
Proven track record of producing high-quality runbooks, support procedures, monitoring standards, operational dashboards, knowledge articles, root-cause analysis reports, and service-improvement plans. Comfort operating with formal SLAs, OLAs, and change-management discipline.

Communication and Leadership
Strong written and verbal communication skills, with the ability to lead incidents under pressure, coordinate vendors and cross-functional stakeholders, mentor junior engineers in the AI operations function, and translate complex technical issues into clear narratives for both technical and senior business audiences.

Thank you for your interest in applying for this position with TCS. We will review your application and will get back to you if we are considering your interest in this opportunity.

Application Deadline: 30 July 2026

Privacy Note: https://www.tcs.com/connect-with-tcs/privacy-policy