We are seeking an experienced Staff Data Engineer to lead the development and use of our data systems. Reporting to the Sr. Manager – Data Engineering, you will join our dynamic team in the foreign exchange payments processing industry. The ideal candidate will define and implement our ETL pipelines and data models and ensure robust data governance across the organization. This role requires a deep understanding of business processes, technology, data management, and regulatory compliance. You will work closely with business and IT leaders to ensure that the enterprise data platform supports business goals and that data governance policies and standards are followed across the organization. Your responsibilities will include partnering with product managers, data architects, analysts, and other cross-functional stakeholders so that our data platform meets the organization's needs and supports our data-driven initiatives, as well as building a new data platform, integrating data from various sources, and ensuring data availability for application and reporting needs.
In your role as a Staff Data Engineer, you will:
- Architect and Develop Data Solutions: Lead the end-to-end design and development of robust data pipelines and data architectures using AWS tools and platforms, including AWS Glue, S3, RDS, Lambda, EMR, and Redshift.
- Optimize ETL Processes: Design and optimize ETL workflows to facilitate the efficient extraction, transformation, and loading of data between diverse source and target systems, including data warehouses, data lakes, and both internal and external platforms.
- Collaborate and Design Data Models: Partner with stakeholders and business units to develop data models that align with business needs, analytical requirements, and industry standards.
- Data Integration and Architecture Maintenance: Collaborate with internal and external teams to design, implement, and maintain data integration solutions, ensuring high data integrity, consistency, and accuracy.
- Implementation and Troubleshooting: Oversee the implementation of data solutions from initial concept through to production. Troubleshoot and resolve complex technical issues to ensure data pipeline stability and high performance.
- Leadership and Mentorship: Provide guidance and leadership to engineering teams, promoting a culture of continuous improvement, knowledge sharing, and technical excellence. Mentor junior engineers and foster their professional growth.
- Innovation and Strategy: Drive technical innovation by staying abreast of industry trends and emerging technologies. Influence technical strategies and decisions to align with organizational goals and objectives.
- Documentation and Best Practices: Develop and maintain comprehensive documentation for data architectures, pipelines, and processes. Establish and enforce best practices for data engineering and quality assurance.
Required technical skills and experience:
- Strong experience delivering data engineering solutions on AWS and Databricks, with AWS DevOps tooling for CI/CD and AWS services such as S3, EC2, EMR, Glue, Lambda, Redshift, CloudWatch, SNS, Auto Scaling, Step Functions, and EventBridge, plus PySpark and Snowflake.
- Strong experience delivering batch data engineering solutions using Apache Spark/PySpark and Spark SQL, applying test-driven development, building reusable plugins with PyBuilder for the Python toolchain, and automating deployment through Jenkins multi-branch pipelines (see the batch ETL sketch after this list).
- Strong experience delivering real-time data engineering solutions using Spark Streaming, Confluent Kafka, StreamSets (Data Collector, Transformer, and Control Hub), DynamoDB, and MongoDB to provide end-to-end data ingestion/ETL/integration across disparate origins and destinations (see the streaming sketch after this list).
- Strong experience in data modelling and developing solutions involving SQL/NoSQL databases such as HBase, DynamoDB, and MongoDB. Strong experience working with SOAP and REST APIs hosted on Apigee (OAuth 2.0) and Tyk gateways, and strong data integration experience with internal and third-party systems, including designing, developing, and implementing automated solutions to source and sink data via APIs (see the API-sourcing sketch after this list).
- Experience building platforms supporting Customer/User journey orchestration and personalized recommendations through Microservices, APIs, Streaming and Event Driven Architecture.
- Strong development, testing (unit, system, integration, performance testing) and debugging skills with detail-oriented documentation skills.
- Strong data modelling experience across SQL and NoSQL databases: Oracle, DB2, PostgreSQL, MySQL, DynamoDB, MongoDB, and HBase.
- Strong understanding of data quality best practices, including automated quality checks with libraries such as Great Expectations and a test-driven methodology (illustrated in the batch sketch below), as well as data governance principles, object-oriented programming, and design and architectural patterns.
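To illustrate the batch work and automated quality gates referenced above, here is a minimal PySpark sketch. The S3 paths, column names, and checks are hypothetical, and a production pipeline would typically express such checks through a library like Great Expectations rather than inline assertions:

```python
# Minimal PySpark batch ETL sketch: reads raw payment records from S3,
# applies basic transformations and an automated quality gate, then writes
# partitioned Parquet. Paths, columns, and thresholds are illustrative only.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("fx-payments-batch-etl").getOrCreate()

# Extract: raw CSV landed by an upstream process (hypothetical path).
raw = spark.read.option("header", "true").csv("s3://example-raw/payments/")

# Transform: normalise types, derive a settlement date, drop duplicates.
clean = (
    raw.withColumn("amount", F.col("amount").cast("decimal(18,2)"))
       .withColumn("settlement_date", F.to_date("settlement_ts"))
       .dropDuplicates(["payment_id"])
)

# Automated quality gate: fail the job rather than load bad data.
null_ids = clean.filter(F.col("payment_id").isNull()).count()
if null_ids > 0:
    raise ValueError(f"Quality check failed: {null_ids} rows with null payment_id")

# Load: partitioned Parquet for downstream warehouse/lake consumers.
clean.write.mode("overwrite").partitionBy("settlement_date").parquet(
    "s3://example-curated/payments/"
)
```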
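Similarly, here is a minimal sketch of the real-time pattern referenced above, using Spark Structured Streaming against Kafka. The broker address, topic name, and event schema are hypothetical, and the job assumes the spark-sql-kafka connector package is available:

```python
# Minimal Spark Structured Streaming sketch: consumes payment events from
# Kafka, parses the JSON payload, and appends to Parquet with checkpointing.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DecimalType

spark = SparkSession.builder.appName("fx-payments-streaming").getOrCreate()

# Hypothetical event schema for illustration.
event_schema = StructType([
    StructField("payment_id", StringType()),
    StructField("currency_pair", StringType()),
    StructField("amount", DecimalType(18, 2)),
])

events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker1:9092")  # hypothetical broker
         .option("subscribe", "payment-events")              # hypothetical topic
         .load()
         .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
         .select("e.*")
)

# Sink: append-only Parquet; the checkpoint location enables restart recovery.
query = (
    events.writeStream.format("parquet")
          .option("path", "s3://example-stream-sink/payments/")
          .option("checkpointLocation", "s3://example-checkpoints/payments/")
          .outputMode("append")
          .start()
)
query.awaitTermination()
```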
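Finally, a minimal sketch of sourcing data from a gateway-hosted REST API with an OAuth 2.0 client-credentials grant, as referenced above. All URLs, credentials, and parameters are hypothetical placeholders:

```python
# Minimal sketch of sourcing data from a REST API secured with OAuth 2.0
# client credentials (a common setup behind Apigee/Tyk gateways).
import requests

TOKEN_URL = "https://api.example.com/oauth/token"   # hypothetical
DATA_URL = "https://api.example.com/v1/fx-rates"    # hypothetical

# Obtain a bearer token via the client-credentials grant.
token_resp = requests.post(
    TOKEN_URL,
    data={"grant_type": "client_credentials"},
    auth=("my-client-id", "my-client-secret"),      # placeholders
    timeout=30,
)
token_resp.raise_for_status()
access_token = token_resp.json()["access_token"]

# Source a page of records with the token; a real job would paginate,
# retry with backoff, and land the payload in S3 for downstream ETL.
data_resp = requests.get(
    DATA_URL,
    headers={"Authorization": f"Bearer {access_token}"},
    params={"date": "2024-01-02"},
    timeout=30,
)
data_resp.raise_for_status()
records = data_resp.json()
print(f"Fetched {len(records)} records")
```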
A successful candidate for this position should have:
- Bachelor's degree or equivalent in Engineering or a related field, with proven experience in designing, deploying, and managing cloud-based infrastructure, preferably for data platforms
- 12-15 years of experience in enterprise data architecture, data management, and data governance, or a related field
- Strong proficiency in AWS, including data services (a must), compute, storage, networking, and security services
- Proficiency in programming languages such as Python, Java, or Scala, with a focus on data processing frameworks (e.g., Apache Spark, Kafka)
- Strong experience with data architecture principles, including data modelling, ETL/ELT processes, and data management, along with hands-on experience with big data technologies such as Apache Hadoop, Apache Spark, and Apache Kafka
- Familiarity with database systems such as Snowflake, SQL Server, PostgreSQL, or NoSQL databases
Nice-to-have qualifications:
- Experience with regulatory compliance related to data management (e.g., GDPR, HIPAA).
- Knowledge of emerging technologies such as AI, machine learning, and data analytics.
- Certifications in cloud platforms (e.g., AWS).