Full Job Description
The SRE is responsible for guiding the reliability of our applications and infrastructure so that we avoid service disruptions or, when we cannot avoid them, resolve them quickly.
You will be an evangelist of operational excellence by educating others and guiding strategic reliability improvement activities, as well as acting as a point of escalation within the incident response process.
In addition, you’ll get to work on ensuring the following: availability, latency, performance, observability, alerting, auto-healing, monitoring and capacity planning for our systems.
10 to 12 years of hands-on experience
Bachelor of Engineering or higher education in software development or engineering, or equivalent experience.
Experience in a relevant SRE or DevOps role with exposure to a wide span of control.
Experience supporting private or public cloud infrastructure
Experience in handling servers (Windows & Linux)
Experience with server-side technologies such as Docker, Kubernetes, OpenShift, etc.
Any scripting or programming language (Bash, Python, Perl, JavaScript, etc.).
Monitoring tools such as Grafana, Prometheus, ELK Stack, Nagios, Zabbix, etc.
Application Performance Monitoring tools and techniques such as Zipkin, Jaeger, Dynatrace, AppDynamics, Datadog, etc.
Knowledge of Agile software development principles
Experience with CI/CD pipelines, including Bitbucket and Jenkins.
Expertise in designing, developing, testing and deploying applications.
Good-to-Have Experience (any of the below)
Experience with networking concepts (SSH, FTP, DNS, Firewalls, Load balancing, etc.).
Experience with the TCP protocol and packet capture analysis
Knowledge of messaging and streaming frameworks such as Redis, RabbitMQ and Kafka.
Personal Skills
Proven investigative, communication, teamwork and interpersonal skills
Good reasoning, troubleshooting and problem-solving skills.
Creatively seeking and suggesting improvements and best solutions
Eager to share and expand your knowledge and expertise
Take on complex operational challenges and naturally go the extra mile in delivery
Value teamwork and treat failures as improvement opportunities
Focused and efficient
Delivery Focus