|| Prefer Candidates from SaaS Cloud Companies ||
Role: NOC Manager
Exp Band: 8-12 Years
Work Location: Bangalore Karnataka, India (Vector Prestige Marathahalli)
Summary of Responsibilities:
Supervise 24x7 Tier1 Network Operations Center (NOC) shift operations and personnel.
Responsible for managing production incidents, outages, SLAs, uptime, service availability, and root cause analysis.
Act as Ops Program Manager, working closely with the remote counterpart Ops Program Manager in the US to deliver projects on time.
Manage the process to restore normal service operation as quickly as possible to minimize the impact on business operations.
Oversee day-to-day Tier1 NOC operations, escalations, ticketing, and communications with all stakeholders, primarily Tier2 SRE.
The NOC is a fast-paced environment where critical thinking is essential and ideas must be extrapolated from one situation to another. Strict adherence to defined SLAs, production runbooks, MOPs (Methods of Procedure), and company policies, along with confidentiality and mature judgment, must be demonstrated at all times.
Primary Responsibilities & Objectives:
Manage the outsourced NOC team and ensure it runs operations in accordance with the defined SLAs, policies, and procedures.
Lead technical bridges and coordinate service restoration efforts with stakeholders including SRE, Customer Support, Engineering, vendors, and third-party service providers to ensure continued QoS and stable operations.
Responsible for the quality and integrity of the Incident Management process and for end-to-end management of the major incident lifecycle, including identifying and capturing impact and urgency.
Manage break-fix activities during incidents to provide a workaround or an incident resolution.
Experience with root cause analysis of critical business and production issues
Manage the Ops program cycle from initiation through delivery, interfacing with external customers and Vendors as needed.
Formulate and monitor execution plans for interconnected Ops initiatives and projects. Work closely with the remote counterpart Ops Program Manager in the US to deliver projects on time.
Run scrum meetings, track milestones, and keep stakeholders updated on progress.
Review, refine, and further develop support documentation, processes, procedures, and system requirements within the NOC.
Generate key reports for management, including but not limited to system availability, service level agreements, ticket resolution, production incidents, and root cause analysis.
Manage releases, hotfixes, and other production deployments, including critical script runs, and ensure the NOC team adheres to release deployment MOPs.
Report deployment status to leadership. Work closely with SRE and Engineering for successful release deployments in production.
Expert in proactive monitoring, alerting, trend analysis and self-healing systems
Continuously improve and manage the systems that proactively monitor the infrastructure. This demands deep troubleshooting and scripting skills to improve the availability, capacity, and security of Client Cloud Services.
Participate in on-call rotations, driving restoration and repair of service-impacting issues
Strong mentoring and coaching skills that encourage growth for more junior members
Clear understanding of SRE and NOC best practices, and the product development lifecycle.
Subject to call 24 hours a day, 7 days a week
Education, Experience, & Skills Required:
9-12 years of experience as a manager working on a SaaS product running in private and public data centers
5+ years of experience managing a NOC environment
Bachelor's degree or equivalent
Hands-on experience in the public cloud, specifically Amazon Web Services (AWS)
Strong scripting experience in Python/Bash and good understanding of scripts
Experience in Linux System Administration
Experience in technical program management
Experience with monitoring solutions (e.g. DataDog, Icinga, New Relic)
Experience in Infrastructure as Code using Chef, Terraform, Ansible, CloudFormation
Experience with Continuous Integration and Continuous Delivery concepts using Jenkins and Rundeck
Experience with elastic scalability, fault tolerance, and other cloud architecture patterns
Experience with modern cloud development practices (Microservices architectures, REST interfaces, etc.)
Ability to design roadmaps and relevant technical documentation
Awareness of standard network best practices and integration of all tools
Knowledge of H.323, SIP, Microsoft Lync and any other technologies in video / voice conferencing
Virtualization management and integration (ESX, OpenStack)
Familiarity with containerization and orchestration concepts (Docker, Kubernetes) and with NoSQL
CCNA/CCNP/CCIE or AWS/Azure certification is a big plus
Soft Skills Required:
Good communicator and highly adaptive
Ability to interact efficiently with peers and customers is required
Ability to multitask effectively and be an effective mentor and technical leader to team members
Proven success working cross-functionally, demonstrating effective teamwork and interpersonal skills
Takes responsibility and ownership for decisions, actions and results. Accountable for both how and what is accomplished
Self-starter and quick learner with strong attention to detail who works well independently
Proactiveness and resourcefulness
Possess strong mentoring and coaching skills
Details Required for processing:
Current Employer/Work Location:
Education/Year of Passing:
Visa (if any):
Salary: INR 15,00,000 - 25,00,000 P.A. DOE
Industry: IT-Software / Software Services
Functional Area: IT Software - Network Administration, Security
Kaizen SRA Technologies Pvt. Ltd.
Recruiter Name: Pradeep Krishnan
Contact Company: Kaizen SRA Technologies Pvt. Ltd.