Job Search
We can help you build an exceptional career.
Site Reliability and Operations Lead-US IRC251934
Job: IRC251934
Location: United States - Basking Ridge NJ, Irving TX
Designation: Specialist Engineer
Experience: 10-15 years
Function: Engineering
Skills: AWS, Cloud Security and Networking, Incident Management, Infrastructure Management, Operations/Project Management
Work Model: On-Site/Office
Description:
- At least 10 years of relevant experience, including 5+ years of people management
- Comfortable collaborating with developers, product managers, and other key stakeholders
- Able to establish and document methodologies/processes and training materials to help new team members onboard more quickly
- Process-oriented and accustomed to Agile work environments and a start-up pace on projects
- Experience managing a high-availability platform operations team responsible for site reliability including logging, monitoring, reporting, and alerting
Requirements:
- Experience using, building, or maintaining REST APIs in production
- Experience with GCP cloud resources in a production environment
- Experience with mentoring and knowledge sharing between team members
- The team lead will be a key contributor to the Engineering and Delivery team, managing the complex processes and systems needed to ensure site reliability.
- In addition to managing the ongoing operational aspects of software delivery, they will collaborate with our design, product, and engineering teams to ensure success for the entire program.
Job Responsibilities:
- Experience with AWS Cloud (EKS/EC2/S3/Route 53/Lambda/ALBs), Kubernetes, CI/CD systems, Linux, and modern cloud networking is a must
- Experience managing an on-call rotation, incident management, and root cause analysis is required
- A strong understanding of security best practices, including keeping cloud infrastructure patched with security updates on an ongoing basis and managing secrets securely
- The ability to prioritize and delegate tasks, and to run a daily stand-up meeting for the operations team members
- Experience with build and deployment automation with Jenkins, GitHub Actions, GitLab, and/or ArgoCD
- Bachelor’s or Master’s degree in Computer Science, Computer or Electrical Engineering, Mathematics, or a related field.
We Offer
Exciting Projects: Come take your place at the forefront of digital transformation! With clients across all industries and sectors, we offer an opportunity to work on market-defining products using the latest technologies.
Collaborative Environment: You can expand your skills by collaborating with a diverse team of highly talented people in an open, laid-back environment, or even abroad in one of our global centers or client facilities!
Work-Life Balance: GlobalLogic prioritizes work-life balance, which is why we offer flexible work schedules and opportunities to work from home.
Professional Development: We provide continuing education classes, professional certification and training (technical, soft skills, language, and communication skills) to help you realize your professional goals. Being part of a global organization, there are additional learning opportunities through international knowledge exchanges.
Excellent Benefits: We provide our employees with competitive salaries, health and life insurance, short-term and long-term disability insurance, a matched-contribution 401(k) plan, flexible spending accounts, and PTO and holidays.