We are seeking a Senior Python Data Engineer with deep expertise in large-scale data processing, distributed computing, and backend application development. This role is ideal for an experienced engineer who has built data-intensive applications, implemented data science and machine learning solutions, and developed highly scalable APIs capable of handling large data volumes.
The ideal candidate will possess strong Python development skills, hands-on experience with PySpark and big data ecosystems, expertise in both relational and NoSQL databases, and a solid understanding of data science and machine learning workflows. This individual will collaborate closely with data scientists, architects, product teams, and global engineering teams to deliver scalable, high-performance solutions that drive business outcomes.
This role requires both strong technical execution and the ability to collaborate effectively across distributed teams, including offshore and international stakeholders.
Key Responsibilities
Python Application Development
- Design, develop, and maintain scalable Python applications and services.
- Build reusable, maintainable, and well-documented code components.
- Develop backend services and APIs using Django and/or Flask.
- Participate in architecture reviews and technical design discussions.
- Troubleshoot and optimize production systems.
Big Data Engineering
- Develop and optimize large-scale data processing pipelines using PySpark.
- Process, transform, and analyze large structured and unstructured datasets.
- Build scalable ETL and data ingestion frameworks.
- Optimize distributed data processing workloads for performance and reliability.
- Handle large-volume datasets across enterprise environments.
Data Science & Machine Learning
- Implement data science algorithms and predictive models in production environments.
- Collaborate with data scientists to operationalize machine learning solutions.
- Develop and optimize ML pipelines using scikit-learn and related libraries.
- Support model deployment, validation, and monitoring processes.
- Translate analytical requirements into scalable engineering solutions.
Data Processing & Analytics
- Perform complex data wrangling, cleansing, and transformation activities.
- Build data pipelines utilizing:
  - Pandas
  - NumPy
  - PyArrow
  - PySpark
- Improve data quality, consistency, and processing efficiency.
- Develop solutions for handling large-scale datasets and analytical workloads.
API Development
- Design and develop RESTful APIs supporting high-volume data operations.
- Build secure, scalable services capable of handling significant throughput.
- Implement API monitoring, logging, and performance optimization.
- Support integrations with internal and external systems.
Database Development
Software Engineering Excellence
- Follow established coding standards and software development best practices.
- Develop automated unit, integration, and regression tests.
- Participate in peer code reviews and architecture reviews.
- Ensure solutions meet performance, scalability, and maintainability requirements.
- Contribute to CI/CD and DevOps best practices.
Collaboration & Leadership
- Partner with product managers, architects, data scientists, business stakeholders, and engineering teams.
- Collaborate with onsite and offshore teams across multiple time zones.
- Participate in Agile ceremonies and planning activities.
- Mentor junior developers and provide technical guidance.
- Drive technical discussions and influence engineering direction.
Required Qualifications
Experience
- 8+ years of professional Python development experience.
- Proven experience developing enterprise-scale applications and data processing solutions.
- Experience working with large-scale datasets and big data platforms.
- Experience collaborating across global and distributed teams.
Python Development
- Expert-level Python programming skills.
- Strong understanding of object-oriented programming principles.
- Experience developing scalable and maintainable applications.
Frameworks
Strong experience with Django and/or Flask.
Experience building production-grade REST APIs.
Big Data Technologies
- Strong hands-on experience with:
  - PySpark
  - Distributed data processing
  - Large-scale ETL pipelines
- Experience handling high-volume datasets.
Data Science & Machine Learning
Data Processing Libraries
Experience using:
- Pandas
- NumPy
- PyArrow
- Other Python data processing libraries
Databases
Strong experience with relational databases:
- PostgreSQL
- SQL Server
- Oracle
- MySQL
Strong SQL development skills.
NoSQL Databases
Experience with:
- Cassandra
- HBase
- Other distributed NoSQL platforms
Engineering Practices
Strong commitment to:
- Automated testing
- Code reviews
- CI/CD
- Source control
- Software engineering best practices
Engagement & Logistics
- Engagement Length: 12+ months.
- Time Zone: PST, 8:00 AM - 5:00 PM.
- Holiday Calendar: Client Holidays (USA – Mandatory).
- Equipment: Provided by the client.
Selection Process
- Screening meeting with the Resilient Co. team, including knockout (KO) questions.
- Technical interview.
- Two client interviews.