Here at OCI, we're building the world's largest AI clusters, and we're the fastest at bringing them to market. The AI Infrastructure organization at OCI is leading this effort by creating a GPU-focused cloud and integrating with frontier model providers such as OpenAI, xAI and Gemini. This is your chance to be part of the AI revolution by creating systems that allow customers to scale from tens to thousands of millions of tokens without compromising performance. You will have the opportunity to work with cutting-edge technologies and make a significant impact on our organization's success.
We are looking for an experienced engineering manager to lead a software team responsible for developing solutions to scale and optimize AI infrastructure for customers' AI workloads. In this role, you will set and communicate priorities and expectations to a team of strong developers working on cutting-edge technologies. You will collaborate with cross-functional teams to enhance our AI infrastructure and deliver an exceptional customer experience.
Responsibilities
Own and build solutions to scale and optimize partner model integration, with the goal of improving customer experience and customer workload performance.
Set and communicate individual expectations and team goals that align with the broader organization's goals.
Model and coach team members, and drive modern software engineering practices such as using data and telemetry to make decisions, well-defined interfaces across components, design reviews, coding standards, code reviews, and comprehensive coverage through unit tests, integration tests, and active production monitoring.
Prioritize the team's work with a focus on customer issues and requirements.
Ensure that team solutions are well-defined and modularized, secure, reliable, diagnosable, actively monitored, compliant and reusable.
Create the roadmap, define SMART goals, and track team progress against committed OKRs.
Qualifications & Skills
BS (or equivalent experience) in Computer Science, Engineering, or related field.
6 years of experience in software development with programming languages including, but not limited to, C, C++, C#, Java, Go, Rust.
2 years of experience in a people management or leadership role while working on cross-functional projects.
3 years of experience designing and developing large-scale distributed systems, services, and infrastructure.
Strong communication, collaboration, and project management skills.
Ability to adapt to a fast-paced, dynamic environment and manage multiple tasks and priorities effectively.
Preferred Qualifications
Experience in managing cloud infrastructure with hundreds of thousands of servers.
Experience in containerization technologies such as Docker and Kubernetes.