Principal Incident Manager, Prime Video

Job Description

Amazon

DESCRIPTION

Prime Video is a first-stop entertainment destination offering customers a vast collection of premium programming in one app available across thousands of devices. Prime members can customize their viewing experience and find their favorite movies, series, documentaries, and live sports – including Amazon MGM Studios-produced series and movies; licensed fan favorites; and programming from Prime Video add-on subscriptions such as Apple TV+, Max, Crunchyroll and MGM+. All customers, regardless of whether they have a Prime membership or not, can rent or buy titles via the Prime Video Store, and can enjoy even more content for free with ads.

As a Prime Video technologist, you’ll have end-to-end ownership of the product, user experience, design, and technology required to deliver state-of-the-art experiences for our customers. You’ll get to work on projects that are fast-paced, challenging, and varied. You’ll also be able to experiment with new possibilities, take risks, and collaborate with remarkable people.



This is a senior-level incident management role responsible for leading the incident response function for Prime Video’s video-on-demand platform. The key responsibilities include:

– Defining the strategy and operating model for the incident response team to minimize the duration and severity of customer-impacting incidents.


– Leveraging technical expertise to develop the vision for incident management tooling and capabilities to improve observability and triage.

– Owning operational metrics and goals for incident response quality, and fostering a culture of continuous improvement.


– Directly managing high-severity incidents, coordinating cross-functional teams, and driving resolution for complex/ambiguous issues.

– Serving as the point of escalation for critical customer issues, and building relationships with incident response teams across Amazon.


– Educating leaders and engineers on incident response best practices and capabilities.

The ideal candidate has 10+ years of incident management experience, including incident response for a large-scale enterprise. They have strong technical, analytical, and communication skills to liaise effectively with engineering and executive teams.



Key job responsibilities
Define the strategy for evolving Prime Video’s response to incidents impacting video on demand. Establish the operating model for how the new, dedicated function will operate within Prime Video.

As a technical lead, leverage your domain expertise to develop the vision for the incident response tooling and capabilities needed to minimise the duration of customer-impacting incidents. Increase the scope of team-maintained dashboards. Influence the roadmaps of the engineering teams developing incident management, observability and triage tooling.



Lead the Incident Response function. Ensure the globally distributed team is ready to respond 24×7. Actively mentor and develop the junior Incident Managers.

Own operational metrics. Set clear, measurable goals for the quality of incident response and establish mechanisms to drive continual improvement. Foster a culture of continuous improvement through mentoring, feedback and metrics.



Lead incident response for high-severity incidents. Drive towards resolution by co-ordinating efforts across multiple engineering and operational teams, including for ambiguous problems we might not have seen before. Decompose complex incidents into work streams that can be managed by multiple incident responders in parallel. Manage communications and be the single point of contact for executive leaders.

Drive critical, complex customer escalations, some of them technically challenging, in collaboration with Engineering Teams.


Build relationships with the other Incident Response teams across Amazon to share best practices and enable effective collaboration during cross-organizational outages/incidents.

Educate leaders and engineers across Prime Video on advances in incident response capabilities, the role they play in enabling improved incident response, and how they can leverage READI tooling to reduce the time to mitigate lower-severity incidents.


Communicate ideas effectively, both verbally and in writing, to all types of audiences.

Perform other duties as required by the organization.


About the team
The Prime Video platform is complex and constantly changing. It consists of thousands of cloud-based services, is built and maintained by thousands of engineers, and serves hundreds of millions of customers. We are establishing a team of dedicated incident managers who will be front-and-centre in driving down the duration of incidents impacting customers’ ability to watch video on demand, drawing on their operational experience, knowledge of best practices, and effective use of incident management tools. We’re looking for an expert in incident response, who has owned operational and/or incident management for at least one large-scale enterprise, to shape the incident response function, define the operational framework and drive delivery of incident response tooling. The team will provide incident response 24x7x365 from two locations.

BASIC QUALIFICATIONS



– 10+ years of experience in incident management
– Have owned operational and/or incident management for at least one large-scale enterprise.
– 3+ years of team lead / people management experience
– Bachelor’s degree in Computer Science or a related field
– Exceptional written and verbal communication skills including executive communications.
– Proven analytical skills in identifying customer-impacting issues
– Experience developing and implementing standard operating procedures, and developing or driving development of the tools to support them
– Experience partnering with Engineering Teams
– Strong critical thinking. Problem Management expertise in identifying root causes and developing reporting to track mitigation status
– Excellent interpersonal and customer relationship skills
– Demonstrated skill and passion for operational excellence
– Development/scripting skills in at least one programming language (e.g. Java, Python) as well as shell
– Experience automating tasks through creation and maintenance of scripts and tools

PREFERRED QUALIFICATIONS


– Experience as a Mission Control Center leader who acts as the single point of contact (POC) during large-scale customer-impacting incidents
– Strong experience driving critical, complex customer escalations, some of them technically challenging, in collaboration with Engineering Teams
– Experience influencing Senior Leaders
– Ability to work in ambiguous environments
– Project/Program management experience managing small to mid-size projects and programs from inception to delivery

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.



Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.


Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $149,200/year in our lowest geographic market up to $299,100/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit https://www.aboutamazon.com/workplace/employee-benefits. This position will remain posted until filled. Applicants should apply via our internal or external career site.

Source

To apply, please visit the following URL: https://www.jobmonkeyjobs.com/career/26646869/Principal-Incident-Manager-Prime-Video-Washington-Seattle-7375/