Resilience Engineer Interview Flashcards

Question 1

Q

How do you decide whether to build a custom solution or use an existing SaaS tool?

Answer

A

Evaluate cost, team expertise, time to market, scalability, and maintenance burden. At Eficens, I chose Sage Intacct over a custom Spring Boot build because it offered faster integration, compliance readiness, and better use of team resources.

Question 2

Q

Describe a scenario where you integrated legacy systems with new cloud-native solutions.

Answer

A

Integrated an on-prem HRIS with Azure AD using Terraform and Python middleware. Deployed changes incrementally with feature flags and achieved seamless SSO without service disruption.

Question 3

Q

Describe your experience automating data flows using APIs or scripting.

Answer

A

Used AWS Lambda and Python to automate pulling logs from CloudTrail and Security Hub into Elasticsearch for real-time analysis. This eliminated manual log processing entirely.
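A minimal sketch of the transform step in a pipeline like this, assuming log records arrive as Python dicts and are sent to Elasticsearch through its bulk API. The function name to_bulk_payload and the index name are hypothetical; fetching from CloudTrail or Security Hub and the HTTP call to the cluster are left out.

```python
import json
from typing import Any, Dict, Iterable


def to_bulk_payload(records: Iterable[Dict[str, Any]], index: str) -> str:
    """Render records as an Elasticsearch _bulk request body (NDJSON).

    Hypothetical helper for illustration; not taken from the original project.
    """
    lines = []
    for record in records:
        action: Dict[str, Any] = {"index": {"_index": index}}
        # Reuse CloudTrail's eventID as the document ID (when present)
        # so that re-processing the same event overwrites instead of duplicating.
        if "eventID" in record:
            action["index"]["_id"] = record["eventID"]
        lines.append(json.dumps(action))
        lines.append(json.dumps(record))
    # The bulk API expects one JSON object per line and a trailing newline.
    return "\n".join(lines) + "\n" if lines else ""
```

Inside a Lambda handler, the returned string would be posted to the cluster's _bulk endpoint with the application/x-ndjson content type.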

Question 4

Q

How have you contributed to system resilience and fault tolerance?

Answer

A

Containerized services with Docker, deployed them on AWS Lambda, and implemented retries with exponential backoff. Added Splunk alerts for failures. Result: 99.9% uptime and 60% faster recovery.
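Retries with exponential backoff can be sketched as below. This is an illustrative version, not the original implementation: the function name, the default delays, and the use of full jitter are assumptions. The sleep parameter is injectable so the logic can be exercised without real waiting.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation until it succeeds or max_attempts is exhausted."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            # Delay ceiling doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random time up to the ceiling so that
            # many clients do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
    raise ValueError("max_attempts must be at least 1")
```

In production code the except clause would usually be narrowed to errors known to be transient (timeouts, throttling), so that permanent failures are not retried.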

Question 5

Q

How have you used PostgreSQL, Elasticsearch, or Snowflake in your work?

Answer

A

Used PostgreSQL for microservice data with optimized queries and indexing, and Elasticsearch in an AWS SOC project for log analysis. No direct experience with Snowflake, but a strong SQL/ETL background.

Question 6

Q

How do you ensure backward compatibility in system changes?

Answer

A

Use feature flags, schema versioning, regression tests, and sandbox testing, and deploy with a blue-green or canary strategy to avoid breaking changes.
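One way to combine a feature flag with a canary-style rollout is a deterministic percentage gate, sketched below. The function name in_rollout and the flag name are hypothetical; real flag services provide their own mechanisms for this.

```python
import hashlib


def in_rollout(flag_name: str, user_id: str, percent: int) -> bool:
    """Return True if this user falls inside the rollout percentage.

    Hashing flag and user together gives each user a stable bucket (0-99)
    per flag, so raising percent only ever adds users to the rollout.
    """
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


def handle_request(user_id: str) -> str:
    # Old code path stays in place until the rollout reaches 100%.
    if in_rollout("new-schema-reader", user_id, percent=10):
        return "v2"
    return "v1"
```

Because the bucket is derived from the user ID rather than chosen at random per request, a given user sees consistent behavior for the duration of the rollout.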

Question 7

Q

Tell me about a time you worked with non-technical stakeholders.

Answer

A

Worked with HR to simplify IAM role assignment. Built a Flask-based web GUI that managed AWS IAM roles via Python scripts. Reduced IT dependency by 70% and cut provisioning time from 2 days to 1 hour.

Question 8

Q

Describe a time you had to manage competing priorities.

Answer

A

Balanced backend optimization against a compliance deadline at TCS. Prioritized based on business impact, split tasks across sprints, and maintained open communication. Delivered both with minimal delay.

Question 9

Q

If your MVP solution starts failing in production, how would you handle it?

Answer

A

Diagnose with logs (Splunk), isolate the issue, implement rollback or retry, notify stakeholders, and write a postmortem with long-term fixes.

Question 10

Q

What’s your approach to identifying and clearing technical roadblocks?

Answer

A

Analyze performance metrics, use tracing (such as AWS X-Ray), refactor for efficiency, consult documentation, and validate fixes with stress testing. Example: resolved a Lambda memory bottleneck by batching work and refactoring code.
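The batching idea can be sketched as a generator that holds only one fixed-size chunk in memory at a time. This is an illustration under assumptions, not the original fix: the function name batched and the batch size are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        # islice pulls only the next `size` items, so the full input
        # is never materialized at once.
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

A handler would then loop over batched(records, 500) and process or ship each chunk before reading the next, keeping peak memory proportional to the batch size rather than the input size.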