Site Reliability Engineer - Enterprise Console (Consultant)
The Bloomberg Enterprise Console group designs scalable Big Data solutions that have a deep impact on enterprise-level applications for B2B products essential to the entire global financial market. Our engineers are responsible for providing cloud-based infrastructure and a suite of self-service applications for our clients to detect, diagnose, and remediate issues affecting clients' data flows with Bloomberg's various enterprise software solutions.
We are a team of diverse and self-motivated engineers with expertise in different parts of our stack, which is built to the latest industry standards with open-source technologies. We are currently seeking a motivated site reliability engineer who can help improve the reliability, stability, and scalability of the Bloomberg Enterprise Console product and who has an interest in fault-tolerant distributed system design.
What's in it for you:
As a Site Reliability Engineer (SRE) at Bloomberg Enterprise Console, your mission is to improve the platform's reliability, scalability, and performance on cloud-based infrastructure. As part of the team, you will have the opportunity to work alongside engineers with the same goals in mind and be exposed to many open-source solutions and tools. Our team also plans to upgrade our platform to the latest versions of some open-source technologies and to adopt new ones, so there is rewarding work for you to explore with us.
We'll trust you to:
- Own the production environment running our PaaS product
- Plan, prioritize, and manage migrations while working with multiple stakeholders and ensuring continuous service availability
- Improve overall observability by implementing monitoring, metrics, logs, and Service Level Objectives (SLOs)
- Write scripts in Python to automate tasks and interact with APIs
- Troubleshoot production problems as they occur, and drive the post-mortem process
- Measure current capacity, predict future capacity needs, and make recommendations accordingly
You need to have:
- Proficiency in developing code in at least one high-level programming language (Java, Python, C/C++, or C#)
- 2+ years of experience working on highly available, fault-tolerant distributed systems
- Experience in all phases of the Agile and test-driven SDLC
We'd love to see:
- 3+ years of Java/Scala experience
- 1+ years of hands-on experience working with Kafka, HBase, Hadoop, and streaming frameworks
- Familiarity with Kubernetes, Docker, and containers
- Expertise in designing, analyzing, and troubleshooting large-scale distributed systems
- A keen interest in technological advances and the ability to incorporate new technology into existing systems
- The ability to create project ideas and implement them through effective collaboration and communication
If this sounds like something you would be passionate about, please apply!
Bloomberg is an equal opportunity employer, and we value diversity at our company. We do not discriminate on the basis of race, religion, colour, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.