Site Reliability Engineering Expert

Job overview

Location

Łódź, Łódzkie

Employment type

Full-time

Date posted

3 years ago

Details

Job ID

4090

Work mode

Hybrid

Company size

Over 200

Salary

Not specified

Technologies used

Bash, PowerShell, Jenkins, GitLab, Terraform, Puppet, ServiceNow

Contract type

Employment contract

Recruitment

Online

Recruitment language

English

Benefits

Private medical care, Course funding, Additional insurance, Lunch card, Flexible working hours

Experience level

Senior

Required

Other

Offer description

Company Description

At Bosch, we shape the future by inventing high-quality technologies and services that spark enthusiasm and enrich people's lives. Our promise to our associates is rock-solid: we grow together, we enjoy our work, and we inspire each other.

Job Description

Bosch is seeking a highly motivated individual to step into the senior expert role of Site Reliability Engineer (SRE) in Łódź or Warsaw.

This role will influence IT teams globally to apply a software engineering mindset to the technical operations of IT products and solutions.

You will support teams in achieving higher levels of service reliability, scalability, and performance for Bosch IT products, services, and solutions.

You will also design and help codify a high degree of technical automation spanning CI/CD pipelines, provisioning, configuration, monitoring/alerting, and system ops (extending to emerging AIOps).

You will design and help to realize monitoring concepts based on Service Level Objectives (SLOs) attached to customer-oriented SLAs (e.g., availability SLA). You will work in the context of IT projects and activities, complete tasks with a practical SRE mindset, and help influence teams to learn and adopt such practices.

Technology stack:

Experience with Continuous Integration and Continuous Delivery toolchains: e.g., Jenkins, ArgoCD, GitLab CI, GitHub Actions, Azure DevOps, AWS CodePipeline, or similar

Experience with infrastructure-as-code (e.g., Terraform, Ansible) and configuration management (e.g., GitOps, Puppet)

Solid experience introducing automation to reduce manual ops tasks (or toil) using shell scripting techniques (e.g., Bash, PowerShell), Ansible playbooks, and beyond

Monitoring design/realization experience with an APM (e.g., AppDynamics) and/or open-source monitoring offerings such as Prometheus, Grafana, and Kibana

Your responsibilities will be to:

Install SRE practices into operational teams that will run the next-generation Bosch IT service automation platform

Design and realize an integrated ops model for real-time service monitoring, alerting (on service levels), SRE-oriented incident handling, and on-call procedures

Help to establish runbooks and playbooks, ideally enhanced with technical automation

Coach team members, work hands-on, and lead by example to concretely install SRE into the daily routine of the team

Contribute to the design and realization of automated integrations linked to the new Bosch IT service automation platform (based on ServiceNow)

Influence ops models for IT service automation according to SRE principles

Engage supporting Bosch IT teams to participate in an SRE-oriented integrated ops model for the global Bosch IT service automation portfolio. This engagement spans backend services involving teams who operate Linux, Docker, Kubernetes (or other orchestration technologies), networking services such as load balancers, database backend systems, a variety of hybrid cloud services, etc.

Design and work to install release strategies across the service automation portfolio that transition in new service features via progressive deployment techniques (leveraging feature flags, canary deployments, automated rollbacks for errors, etc.)

Be comfortable splitting time between coaching teams on SRE practices and remaining proficient with hands-on technical tasks

Act as a thought leader in all things DevOps and SRE

Hold a positive attitude and a solid commitment to delivering quality output while leading independently (self, others)

Take ownership of projects and gain professional satisfaction in achieving stable operational environments with high degrees of technical automation and remaining within the SRE-calculated error budget

Qualifications

Profile:

7+ years of experience in Site Reliability Engineering, DevOps, or Software Engineering roles

Experience in working with SRE practices

Advanced knowledge of concepts such as critical user journeys, error budgets, and blameless postmortems

Hands-on technical experience with many open-source and cloud technologies in the DevOps space

Coding skills in one of the following: Python, Node.js, Java (as well as skills with Bash, Ansible coding, and/or PowerShell scripting)

Solid understanding of enterprise IT integration architectures (e.g., ETLs, real-time interface endpoints, messaging architectures, technical approaches to scalable message streaming)

Experience with various microservice implementation styles that can realize a robust, scalable production configuration

Experience with backend integration involving relational databases, NoSQL databases, cache subsystems (e.g., Redis), cloud databases (e.g., Azure CosmosDB), and cloud storage offerings (e.g., Azure storage accounts, S3)

Experience designing end-to-end monitoring solutions with log aggregation, metrics collection (and visualization), and best-practice know-how for alerting and automated incident generation and handling

Experience with time-series databases for production metrics data collection and management (e.g., InfluxDB, Prometheus)

Ability to review IT solution designs for scale and performance concerns and, as needed, to apply this to troubleshooting incident scenarios under pressure

Good understanding of what Service Management means in a large enterprise

Experience working with large, complex, and diverse IT systems under constant change and with a global ops practice

Technical experience with at least one leading cloud platform: Microsoft Azure, AWS, or GCP (cloud certifications are significant pluses)

Technical exposure to the ServiceNow platform is a significant plus

The personal attributes of a technical leader, coach, and professional consultant (independent, convincing, committed)

Builds close relationships with local/regional DevOps teams and global IT teams

Open to business trips (e.g., Germany), including international travel (China, India, Brazil, United States)

Proficiency in English

Additional information

Benefits:

We would like to offer you a number of amenities for you and your loved ones.

Work #LikeABosch:

Contract of employment and a competitive salary (together with annual bonus)

Flexible working hours, with home office available after the pandemic as well

Referral Bonus Program

Copyright costs for IT employees

Canteen in the office with co-financed lunches

Grow #LikeABosch:

A complex working environment, professional support, and the opportunity to share knowledge and best practices

Ongoing development opportunities in a multinational environment

Broad access to professional training, conferences, and webinars

Language courses

Live #LikeABosch:

Private medical care and life insurance

Multisport card and sports teams

A number of benefits for families (for instance, summer camps for kids)

Non-working days on the 24th and 31st of December

Discounts for Bosch products