Data Architect

Work set-up: Full Remote
Experience: Senior (5-10 years)
Offer summary

Qualifications:

  • Bachelor's degree in Computer Science, Mathematics, Electrical Engineering, or a related field.
  • 7+ years of experience in data architecture or data engineering roles.
  • Expertise in vector databases and their application in Generative AI.
  • Proficiency in programming languages such as Python and Java, with hands-on experience in data streaming and real-time data pipelines.

Key responsibilities:

  • Design and implement scalable, high-performance data infrastructure for AI applications.
  • Develop and maintain event-driven data solutions using tools like Apache Kafka.
  • Manage and optimize various databases, including vector databases, for performance and scalability.
  • Collaborate with cross-functional teams to support AI and machine learning workloads.

Turtle Trax S.A. (startup, 2-10 employees): https://www.turtle-trax.com/

Job description

SUMMARY

As a hands-on Data Architect, you will be crucial in designing, building, and optimizing the data architecture for our next-generation SaaS platform. This position requires expertise in event-driven data architectures (e.g., Apache Kafka) and emerging technologies such as vector databases to support Generative AI applications. You will be deeply involved in implementing scalable, high-performance data systems that drive real-time analytics, AI applications, and dynamic data processing. Experience in the Utility industry and knowledge of AWS is preferred, as we seek to optimize data systems that cater to the unique demands of this sector.
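To make the event-driven pattern concrete, here is a minimal, broker-free sketch in Python of topic-based publish/subscribe, the core idea behind the Kafka-style pipelines described above. The `EventBus` class, topic name, and meter-reading payload are illustrative assumptions, not part of this role's actual stack; a production system would use a real broker with partitioning, consumer offsets, and durable storage.

```python
from collections import defaultdict
from typing import Any, Callable

# Illustrative in-memory event bus mimicking Kafka-style topics and consumers.
# A real deployment would use Apache Kafka (or similar) for durability,
# partitioning, and replay; this sketch only shows the publish/subscribe flow.
class EventBus:
    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Any) -> None:
        # Deliver the event to every handler subscribed to this topic.
        for handler in self._subscribers[topic]:
            handler(event)

bus = EventBus()
enriched: list[dict] = []

# Hypothetical pipeline step: enrich a raw utility meter reading in real time.
def enrich(reading: dict) -> None:
    reading["kwh"] = reading["wh"] / 1000
    enriched.append(reading)

bus.subscribe("meter.readings", enrich)
bus.publish("meter.readings", {"meter_id": "m-1", "wh": 2500})
```

The same consumer function could later be pointed at a real Kafka topic; only the transport changes, not the processing logic.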

The development organization leverages Java, Spring Boot, AWS RDS (Postgres, SQL Server), Oracle, AWS Serverless technologies (Lambda, SQS), REST, JavaScript, and Mobile development with React Native, hosted in AWS and using Atlassian tools (Jira, Bitbucket, and Confluence).

Dealbreakers:

Hands-on experience with Python and/or Java is required. Must have practical experience with data streaming use cases. Strong verbal and written communication skills are essential for engaging with stakeholders in technical roles.

Highlight Responsibilities:

This person is directly responsible for maintaining a consistent focus on data concerns and for collaborating with business and technical stakeholders to implement a robust set of data capabilities that meet our functional and non-functional requirements.

JOB FUNCTIONS

Duties and Responsibilities

  • Design & Build Data Infrastructure: Architect and implement scalable, high-performance data infrastructure with a focus on event-driven architectures, real-time data streaming, and advanced AI-driven applications.
  • Event-Driven Data Solutions: Develop event-driven systems leveraging tools like Apache Kafka or similar technologies to support real-time data processing and low-latency pipelines.
  • Hands-on Development: Actively develop and maintain data pipelines, ETL/ELT processes, and event-streaming solutions using Apache Kafka, Apache Flink, Apache Spark, or similar tools, as well as AI-specific data systems.
  • Database Management: Manage and optimize SQL, NoSQL, OLAP, and vector databases to ensure high availability, scalability, and performance. Apply deep knowledge of database internals, including partitioning, sharding, embeddings, distributed database systems, and change data capture (CDC) techniques, to drive efficiency and reliability across complex, large-scale environments.
  • Data Integration: Build real-time and batch data pipelines that integrate structured and unstructured data from various sources, including AI models and third-party data sources.
  • Performance Tuning: Continuously monitor and optimize data systems for performance, ensuring that AI workloads are supported by highly efficient data pipelines and storage solutions.
  • Collaboration: Work closely with product managers, software engineers, and data scientists to align event-driven architectures, vector databases, and data pipelines with the needs of AI and machine learning models.
  • Cloud Architecture: Architect and manage cloud-based data solutions (AWS preferred) that support distributed data processing, AI workloads, and real-time data streaming.
  • Vector Databases: Design and implement vector databases (e.g., Pinecone, pg_vector, Milvus) to support machine learning models, including Generative AI applications, efficiently handling high-dimensional data such as embeddings and unstructured data.
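The vector-database duties above center on similarity search over embeddings. The following Python sketch shows the core operation, cosine-similarity nearest-neighbor lookup, in plain Python rather than a real vector store; the toy 3-dimensional "embeddings" and document ids are invented for illustration. Engines like Pinecone, pg_vector, or Milvus perform this lookup over millions of high-dimensional vectors using approximate indexes instead of a brute-force scan.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Cosine of the angle between two vectors: 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def nearest(query: list[float], index: dict[str, list[float]]) -> str:
    # Brute-force scan: return the id of the most similar stored embedding.
    # Vector databases replace this with approximate indexes (e.g., HNSW, IVF).
    return max(index, key=lambda key: cosine_similarity(query, index[key]))

# Toy 3-dimensional embeddings; real embeddings have hundreds of dimensions.
index = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.0, 1.0, 0.0],
    "doc-c": [0.7, 0.7, 0.0],
}
best = nearest([0.9, 0.1, 0.0], index)  # closest in direction to doc-a
```

In a Generative AI retrieval pipeline, the query vector would come from an embedding model and the result ids would identify documents fed back into the model's context.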
Requirements:

  • Bachelor's degree in Computer Science, Mathematics, Electrical Engineering, or equivalent knowledge and experience.
  • 7+ years' experience as a Data Architect or in a similar data engineering role, with direct involvement in designing and implementing event-driven architectures.
  • Expertise in vector databases (e.g., Pinecone, Weaviate, Milvus) and their application in Generative AI and other machine learning models, including managing high-dimensional data and embeddings.
  • Strong understanding of Generative AI applications and how to build data pipelines and infrastructure to support them.
  • Proficiency in programming (Python, Java, or similar languages), with the ability to write clean, efficient code for event-driven data pipelines and AI-driven data architectures.
  • Experience with real-time data streaming, ETL/ELT processes, and tools like Apache Kafka, Apache Flink, Kinesis, etc.
  • Extensive experience with cloud-based data architectures and distributed systems.
  • Deep understanding of database technologies (SQL, NoSQL, OLAP, vector) and performance optimization for AI workloads.
  • Strong problem-solving skills and a hands-on approach to addressing technical challenges.
  • Experience in the SaaS industry or building scalable data systems for AI-powered products.

Preferred Qualifications:

  • Experience in the Utility industry is a plus.
  • Familiarity with modern data visualization tools (e.g., Tableau, Looker) and BI platforms.

Production Support/On-Call Duties:

As a key member of our engineering team, you will address escalated production issues from customer support. Your responsibilities will include:

  • Participating in a rotational on-call schedule to handle significant production issues.
  • Rapidly diagnosing and resolving technical challenges that arise in production.
  • Collaborating with customer support and engineering teams for seamless issue resolution.
  • Maintaining clear communication and documentation during and after incidents.
  • Leveraging these experiences to contribute to continuous process improvement.

Compensatory Time for On-Call Work:

We value work-life balance and recognize the extra effort required during on-call rotations. For hours spent actively working on-call, compensatory time off is provided, unless the law requires otherwise. This ensures your commitment is appropriately acknowledged. Please coordinate with your manager regarding the approval and scheduling of compensatory time so that it aligns with team needs and workload.

Your contribution is essential in maintaining the smooth operation of our systems and in upholding high standards of customer satisfaction.

Required profile

Experience

Level of experience: Senior (5-10 years)
Spoken language(s):
English

Other Skills

  • Communication
  • Problem Solving
