: Design and implement scalable and automated operational processes for incident management, change execution, security operations, capacity planning, monitoring, and disaster recovery.
Drive Reliability Engineering
: Collaborate with Operations and Development teams to ensure that operational workflows align with reliability and scalability goals.
Operational Excellence
: Define and implement KPIs and SLAs for operational performance, and develop continuous improvement programs to meet and exceed them.
Automation First
: Lead efforts to automate repetitive and manual operational tasks using tools, scripts, and platforms to improve efficiency and reduce risk.
Incident Management Leadership
: Develop and refine incident management and response strategies, ensuring rapid resolution and root cause analysis for critical issues.
Capacity and Performance Management
: Architect and implement systems to monitor, predict, and optimize infrastructure utilization at global scale.
Cross-Functional Collaboration
: Partner with engineering and product teams to ensure operational readiness for new services and features.
Mentorship and Knowledge Sharing
: Act as a thought leader and mentor within the operations team, sharing best practices and driving operational excellence across the organization.
US Citizenship AND active TS/SCI w/Poly US Government Security Clearance required.
Technical Expertise
Expertise in Exadata and in Oracle databases running on the Exadata platform.
Proven experience in designing and managing large-scale cloud infrastructure operations in environments like OCI, AWS, Azure, GCP, or similar platforms.
Strong knowledge of automation and orchestration tools (e.g., Terraform, Ansible, Kubernetes).
Expertise in monitoring and observability tools (e.g., Prometheus, Grafana, Datadog, New Relic).
Deep understanding of operational frameworks such as ITIL, SRE principles, and DevOps methodologies.
Problem-Solving and Strategy
Demonstrated ability to analyze complex operational challenges and design innovative solutions.
Strategic mindset with experience in aligning operational processes with business goals.
Communication and Leadership
Strong collaboration and communication skills to work effectively with cross-functional teams and stakeholders.
Ability to lead through influence and drive consensus on technical and process improvements.
Education and Experience
10+ years of experience in cloud infrastructure operations, SRE, or similar roles.
Bachelor's degree in Computer Science, Engineering, or equivalent practical experience.
Preferred Qualifications
Experience in multi-cloud or hybrid cloud environments.
Certifications such as AWS Certified Solutions Architect, Google Cloud Professional Cloud Architect, or similar are desirable.
Familiarity with regulatory and compliance frameworks like SOC 2, ISO 27001, or PCI-DSS is highly desirable.
Previous experience mentoring teams or acting as a technical lead.