On behalf of Atlas Invest, SD Solutions is looking for a talented, research-oriented Senior Data Engineer / Data-Focused Backend Developer who can take a feature idea from concept through research, data validation, modeling approach, and full implementation. You will play a key role in designing, developing, and maintaining our core services, with a focus on performance, reliability, and scalability.
SD Solutions is a staffing company operating globally. Contact us to get more details about the benefits we offer.
As a Data-Focused Backend Developer, you will own the full arc from idea to impact. "End-to-end" here isn't just a buzzword: it means you translate abstract problems into testable hypotheses, and it means the same person who reads a paper on hybrid document classification prototypes it in a notebook, evaluates it with DSPy metrics, wires it into a LangGraph node, and deploys it into our production Python/TypeScript monorepo.
You will bridge the gap between abstract research and concrete engineering. You won't stop at a "notebook win" or building isolated models; you will build the pipelines, FastAPI services, and TypeScript integrations that serve them to the real world, ensuring reliability and measurable business value. We are looking for an engineer who moves seamlessly across the boundary between high-level AI orchestration and low-level system reliability.
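To make that final step concrete, here is a minimal sketch of turning a notebook prototype into a typed service. It is an illustration under assumptions, not our actual API: classify_document, the /classify endpoint, and the response fields are hypothetical stand-ins.

```python
# Minimal sketch: promoting a prototyped classifier into a typed FastAPI service.
# Hypothetical names throughout: classify_document, /classify, and the fields
# below are illustrative stand-ins, not the real Atlas service.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ClassifyRequest(BaseModel):
    document_text: str

class ClassifyResponse(BaseModel):
    doc_type: str
    confidence: float

def classify_document(text: str) -> tuple[str, float]:
    # Placeholder for the real model call (e.g., a DSPy module optimized offline).
    return ("rent_roll", 0.93)

@app.post("/classify", response_model=ClassifyResponse)
def classify(req: ClassifyRequest) -> ClassifyResponse:
    doc_type, confidence = classify_document(req.document_text)
    return ClassifyResponse(doc_type=doc_type, confidence=confidence)
```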
Your First 90 Days
Month 1: Codebase Mastery & First Shipped Wins
- Get fully onboarded: run the monorepo locally and trace a live data request through our core AI and data services within your first few days.
- Ship your first pipeline improvement to production (e.g., an extraction fix or a schema normalization) by the end of Week 1.
- Reproduce a notebook experiment, publish a short gap analysis, and transition your first DSPy or LangGraph prototype into a tested FastAPI service.
Month 2: Pipeline Ownership & The Research Flywheel
- Take end-to-end ownership of a complex pipeline component (like due diligence intelligence or multi-source data fusion).
- Deliver a new evaluation harness tied to a live pipeline, and immediately use it to measure and drive a real-world performance increase.
- Productionize a research-driven upgrade (like a new DSPy optimizer strategy) with clear before/after metrics.
Month 3: Architecture & Scale
- Lead the architecture of a next-generation research initiative (e.g., advanced GraphRAG or a new autonomous diligence agent) from abstract idea to production deployment.
- Define a repeatable “research-to-release” playbook for your domain and accelerate its adoption, setting the standard for how we bridge AI research and production engineering.
What You Will Own
- AI Extraction Pipelines: Design and ship improvements to the OCR → Classify → Extract pipeline (using PaddleOCR, LangGraph, DSPy) to reduce extraction error and latency for complex document types like T12 financials, rent rolls, and appraisals.
- Scale Data Normalization: Expand our property data aggregation layer. You will pull data from various top-tier real estate and demographic APIs, optimizing schema normalization and conflict resolution to unify external datasets with our internal systems.
- Strengthen Automated Risk Engines: Improve the underlying engine to generate smarter, cleaner, and higher-quality risk assessments.
- Optimize Property Intelligence Pipelines: Enhance automated data enrichment to deliver instantaneous, actionable insights on asset-specific attributes and external risk factors.
- External Provider Resilience: Expand and maintain our TypeScript-based provider ecosystem, ensuring reliability against third-party outages via robust caching, retries, and observability.
- Drive the Research Flywheel: Conduct systematic gap analyses using custom evaluation suites (accuracy, precision/recall) on current modules. You will identify the next 2-3 bottlenecks, feed them back into the engineering loop, and implement academic approaches (e.g., state-of-the-art chunking, multi-step RLM reasoning) to continuously boost precision and recall.
- Orchestrate Agentic Workflows: Use LangGraph to build complex, fault-tolerant state machines that connect our document classification, OCR, and schema extraction modules.
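To give that last bullet some shape, here is a minimal LangGraph sketch of such a state machine. Only the graph-wiring API is real LangGraph; the node bodies are hypothetical placeholders for the actual classifier, PaddleOCR call, and schema extractor.

```python
# Minimal sketch of a classify -> OCR -> extract pipeline as a LangGraph graph.
# Node bodies are hypothetical placeholders; real nodes would call the
# classifier, PaddleOCR, and a schema extractor, plus retry/error handling.
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class DocState(TypedDict):
    raw_bytes: bytes
    doc_type: str
    text: str
    fields: dict

def classify(state: DocState) -> dict:
    return {"doc_type": "rent_roll"}           # placeholder decision

def run_ocr(state: DocState) -> dict:
    return {"text": "Unit 1A  $1,450/mo ..."}  # placeholder OCR output

def extract(state: DocState) -> dict:
    return {"fields": {"units": 42}}           # placeholder schema extraction

builder = StateGraph(DocState)
builder.add_node("classify", classify)
builder.add_node("ocr", run_ocr)
builder.add_node("extract", extract)
builder.add_edge(START, "classify")
builder.add_edge("classify", "ocr")
builder.add_edge("ocr", "extract")
builder.add_edge("extract", END)

graph = builder.compile()
result = graph.invoke({"raw_bytes": b"", "doc_type": "", "text": "", "fields": {}})
```

In production the linear edges would become conditional edges (routing by doc_type, retry-on-failure branches), which is exactly where LangGraph's state-machine model earns its keep.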
What hard skills do we need?
Note: We don't expect you to have every single skill listed below; that's nearly impossible. We value equivalent skills and a proven ability to learn fast, especially when it comes to specific technologies like DSPy or Neo4j Cypher.
- Languages: Python 3.12+ (FastAPI/Pydantic), TypeScript (strict mode/Zod), SQL/Cypher, and the newest programming language: English.
- AI/ML/LLM Systems: Prompts/DSPy optimization, LangGraph orchestration, vector retrieval (Weaviate, Elastic, or alternatives), prompt/eval loops, and multi-model integrations (OpenAI, Gemini, vLLM).
- Data & Graphs: Neo4j modeling, schema design, multi-source data fusion, and ORMs (SQLAlchemy, Prisma, or Drizzle are an advantage).
- Document Intelligence: Working with pre-implemented OCR pipelines, document parsing, and classification under noisy, real-world inputs/files/tables.
- Production Engineering: Monorepo tooling, Docker/Docker Compose, message queues (RabbitMQ or others), and observability (tracing, structured logging).
- Experimentation: Comfortable in Jupyter Notebooks for rapid prototyping, benchmark/evaluation harnesses, reproducible experiments, and A/B metric tracking.
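As a taste of what "prompt/eval loops" and "evaluation harnesses" mean here day to day, below is a minimal DSPy sketch. The model name, dev examples, and metric are assumptions made for illustration; dspy.LM, dspy.Predict, dspy.Example, and dspy.Evaluate are standard DSPy APIs.

```python
# Minimal DSPy evaluation-harness sketch (assumes an OpenAI key is configured;
# the model name and the two dev examples are illustrative assumptions).
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# A bare classifier program; in practice this is the module under study.
classify = dspy.Predict("document_text -> doc_type")

devset = [
    dspy.Example(document_text="Unit 1A  monthly rent $1,450  lease ends 06/30",
                 doc_type="rent_roll").with_inputs("document_text"),
    dspy.Example(document_text="Trailing twelve month income statement, NOI ...",
                 doc_type="t12").with_inputs("document_text"),
]

def exact_match(example, prediction, trace=None):
    # Simple accuracy metric; swap in precision/recall for richer harnesses.
    return example.doc_type == prediction.doc_type

evaluate = dspy.Evaluate(devset=devset, metric=exact_match, display_progress=True)
baseline = evaluate(classify)  # score the program before any optimizer run
```

The same harness then measures a DSPy optimizer's before/after delta, which is the research flywheel described above.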
Core Responsibilities:
- Identify and onboard new data sources
- Perform data comparisons & validation
- Assess data quality and usability
- Define the modeling approach
- Implement and productionize solutions
- Work independently with minimal structure
Team X @ Atlas: Mission & Culture
Atlas Invest’s Team X is building the intelligence layer for real estate. We ingest, normalize, and reason over the messiest data in one of the world's largest asset classes: property records scattered across multiple external providers, complex ownership networks buried in public filings, and financial details locked inside massive, unstructured rent rolls and appraisals.
Team X is a diverse, high-performing squad of engineers and researchers within Atlas. We value ownership, velocity, and craftsmanship. We ship from a polyglot monorepo and treat the boundary between research and production as a feature, not friction. You will join a culture where people are trusted to run with ambiguity, publish Jupyter experiments on Monday, and deploy those results to production by Friday.
About the company:
Atlas Invest is transforming the bridge-loan landscape, seamlessly connecting investors with real estate developers and using advanced big-data analytics to deliver a personalized investment experience.
By applying for this position, you agree to the terms outlined in our Privacy Policy. Please take a moment to review our Privacy Policy (https://sd-solutions.breezy.hr/privacy-notice) and make sure you understand its contents. If you have any questions or concerns regarding our Privacy Policy, please feel free to contact us.