We have an exciting opportunity for a Site Reliability Engineer to join our expanding development team, reporting to the Head of Engineering.
Within this role you will oversee production environments, identify production issues, and implement integrations that improve the customer experience on both sides of our marketplace.
You’ll support and monitor production and network environments by developing and deploying logging and monitoring tools that support disaster recovery, backup, redundancy and capacity planning. You will support heavy releases, ensuring uptime and automatically scaling the platform in response to increased traffic at peak times.
The ideal candidate will have experience with Google Cloud Platform, along with experience working with monitoring tools such as Prometheus, Grafana or similar.
Support and assist with expansion of technical growth through Google Cloud Platform
Reduce cost of automations by utilising platforms in the most effective and efficient way
Support performance optimisation by monitoring distributed systems
Provide structure and help to our release process, suggesting and making improvements where possible
Work in a fast-paced environment, monitoring alerts, reducing downtime, and improving site configurations and uptime
A high-level understanding of, and experience with, optimising Google Cloud Platform and configuring it for scale, performance, and cost
Experience with monitoring tools, such as Prometheus, Grafana, or similar
Working experience with alerting tools (PagerDuty preferred)
Knowledge of Kubernetes and Docker
Experience working with infrastructure as code (Terraform preferred)
Work hard, take breaks – 25 days annual leave plus bank holidays
Flexible working – we are a remote-first organisation that believes in asynchronous working practices. You won’t find back-to-back Zoom meetings here!
Generous parental leave policies to support you as your family grows.
Company pension that allows you to save for the future.
Equipment – We’ll hook you up with a brand new MacBook, monitor, and any other accessories you need to do your best work.
Learning and development – we fully support your professional development, whether that’s paying for new tools, books, courses or coaches.
Mental health support – we want to ensure everyone in the company has access to the mental health support they need, so we provide free therapy sessions via Spill.
Regular socials – we’re a social bunch, which is why we invest in bringing the team together through regular socials and annual employee retreats. Check out our first annual retreat here. (Link)
Statement of Diversity and Inclusion
Diversity and Inclusion is a priority for us all at Add to Event. We are committed to cultivating a fair and healthy working environment, where our employees can be themselves and thrive in their work. We treat one another with humility, respect and dignity at all times.
As a remote-first business, we have a flexible and collaborative approach to all aspects of our work. As such, we are open to discussions about flexible working for all our employees. If you are the right person for us, we will do our very best to make it work for you.
We work to ensure that our recruitment processes are as inclusive as possible to everyone. This includes making adjustments for people who have a disability or long-term condition. If you would like us to make adjustments during the application process, please contact us at firstname.lastname@example.org