Tabby makes shopping more rewarding by giving people the flexibility and freedom to get what they want. Tabby lets you shop now, pay later and earn cash, without the interest, fees or debt traps. Tabby serves more than 1,000,000 loyal shoppers, and more than 3,000 retailers, including the likes of IKEA, SHEIN, and Marks & Spencer, use it to accelerate growth by offering easy and flexible payments online and in stores. Tabby is the Middle East’s first and largest buy now, pay later provider and has raised more than $180m in funding from global and regional investors.
Tabby creates financial products designed to inspire and create financial freedom. In a few years, retail checkout will look vastly different and we want you to be part of that change.
We are looking for a Software Developer with an SRE background to take on this role.
Team: 4 DevOps engineers, 1 SRE, and several office IT admins.
Projects you will work on
- Infrastructure automation with Terraform;
- Improving monitoring, tuning alerts, and developing new metrics in DataDog;
- Working with product managers and teams to define SLOs for their services and improve reliability;
- Automating auxiliary on-call processes;
- Setting up and tuning the OpsGenie alert system.
About the role
You’ll be working in a dynamic, rapidly evolving environment with the following responsibilities:
- Advise developers on choosing service level indicators (SLIs) and their target values (SLOs);
- Take part in the SRE team's on-call rotation (solve user problems, respond to alerts, and prevent incidents);
- Manage incidents and conduct incident reviews;
- Collaborate with development teams to ensure the stable and reliable operation of services;
- Set up monitoring that responds to the symptoms of problems;
- Help instrument services with tracing, metrics and logs;
- Write documentation (runbooks) for the actions you perform, in order to define repeatable processes and automate them;
- Ensure the observability of services and systems;
- Improve operational processes in cooperation with the DevOps team;
- Debug and investigate production issues in services and at different levels of the stack;
- Improve team practices through code review and incident handling.
What we expect
- Use Grafana or Prometheus in your work;
- Have skills in finding and solving problems in distributed systems;
- Know what percentiles are and understand charts;
- Are comfortable with relational databases and know SQL (can write complex queries);
- Understand how HTTP APIs and gRPC work;
- Have experience with a monitoring and logging system (we use DataDog) and with manual troubleshooting;
- Solve tasks with Docker, Kubernetes, Helm or similar technologies on a daily basis;
- Know OpsGenie, PagerDuty, or a similar alerting system;
- Are familiar with Agile methodologies (have worked as part of a Scrum team).
As a plus
- Have worked with Google Cloud Platform;
- Can develop Golang microservices (commercial experience or pet projects);
- Can read and understand HTML, JS, TypeScript, Python, Shell/Bash;
- Have experience with SLO/Error Budget;
- Can use API tools: curl, Postman/Bloom or similar;
- English: Upper-Intermediate.
What you can expect
- A competitive salary dependent upon your experience
- We offer flexible working hours and trust you to work enough hours to do your job well, at times that suit you and your team
- A working environment that gives you autonomy and responsibility from day one
- You should be comfortable with the idea that the quality of your work will influence the shape of your career
- Flexible vacation policy
We are passionate about creating an equitable, high-performing workplace that gives people from all backgrounds the support they need to thrive, grow and meet their goals (whatever they may be).
If this sounds exciting to you, we’d love to hear from you 😊