Atlas Invest | Senior Data-Focused Backend Developer at SD Solutions

Requirements

  • Proficiency in Python 3.12+ (FastAPI/Pydantic) and TypeScript (strict mode/Zod); fluent English.
  • Strong SQL/Cypher querying and data-modeling experience.
  • Experience with AI/ML/LLM systems: DSPy optimization, LangGraph orchestration, vector retrieval, and multi-model integrations.
  • Production engineering expertise: monorepo tooling, Docker/Docker Compose, message queues, and observability; comfortable experimenting in Jupyter notebooks for rapid prototyping and benchmarks.

Roles & Responsibilities

  • Own end-to-end data-focused backend development from idea to production, including designing, implementing, and deploying AI extraction pipelines and FastAPI services.
  • Identify, onboard, and validate new data sources; assess data quality and usability; perform data comparisons and validation.
  • Define modeling approaches and productionize research-driven upgrades; build evaluation harnesses and measure impact.
  • Architect scalable research initiatives and bridge AI research with production engineering; improve reliability, observability, and speed to value.
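
To give a flavor of the data-comparison and conflict-resolution work these responsibilities describe, here is a minimal, hypothetical sketch of reconciling one property record from two providers, preferring the more recently observed non-null value for each field. The field names, providers, and merge policy are invented for illustration; a production version would carry provenance and validation rules per field.

```python
from datetime import date

def merge_records(records: list[dict]) -> dict:
    """Merge provider records field by field, keeping the most
    recently observed non-null value for each field."""
    merged: dict = {}
    seen_on: dict = {}  # field -> observation date of the value currently kept
    for rec in records:
        observed = rec["observed_on"]
        for field, value in rec["fields"].items():
            if value is None:
                continue  # never let a missing value overwrite a real one
            if field not in merged or observed > seen_on[field]:
                merged[field] = value
                seen_on[field] = observed
    return merged

# two snapshots of the same asset from different (invented) providers
a = {"observed_on": date(2024, 3, 1),
     "fields": {"sqft": 12000, "year_built": 1987, "owner": None}}
b = {"observed_on": date(2024, 6, 1),
     "fields": {"sqft": 12450, "year_built": None, "owner": "Acme LLC"}}

merged = merge_records([a, b])
print(merged)  # {'sqft': 12450, 'year_built': 1987, 'owner': 'Acme LLC'}
```

The freshest-wins policy is only one option; per-field source rankings or agreement thresholds are common alternatives when providers disagree.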

Job description

On behalf of Atlas Invest, SD Solutions is looking for a talented senior, research-oriented Data Engineer / Data-Focused Backend Developer who can take a feature idea from concept through research, data validation, and modeling to full implementation. You will play a key role in designing, developing, and maintaining our core services, with a focus on performance, reliability, and scalability.

SD Solutions is a staffing company operating globally. Contact us to get more details about the benefits we offer.

As a Data-Focused Backend Developer, you will own the full arc from idea to impact. "End-to-end" here isn't just a buzzword; it means you translate abstract problems into testable hypotheses. It means that the same person who reads a paper on hybrid document classification also prototypes it in a notebook, evaluates it with DSPy metrics, wires it into a LangGraph node, and deploys it into our production Python/TypeScript monorepo.

You will bridge the gap between abstract research and concrete engineering. You won't stop at a "notebook win" or build isolated models; you will build the pipelines, FastAPI services, and TypeScript integrations that serve them to the real world, ensuring reliability and measurable business value. We are looking for an engineer who can move seamlessly between high-level AI orchestration and low-level system reliability.
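
To make the notebook-to-production arc concrete, here is a minimal, stdlib-only sketch of a document flow in the OCR → Classify → Extract shape, modeled as stages passing a shared state dict. Everything here is invented for illustration (the stage logic is a toy); in production a LangGraph graph with real OCR and extraction models would replace this loop.

```python
from typing import Callable

State = dict  # shared pipeline state, passed stage to stage

def ocr(state: State) -> State:
    # stand-in for a real OCR engine: pretend we recovered raw text
    state["text"] = state["document"].upper()
    return state

def classify(state: State) -> State:
    # toy classifier: route by a keyword in the recovered text
    state["doc_type"] = "rent_roll" if "RENT" in state["text"] else "other"
    return state

def extract(state: State) -> State:
    # schema extraction would go here; record which path was taken
    state["extracted"] = {"doc_type": state["doc_type"],
                          "chars": len(state["text"])}
    return state

PIPELINE: list[Callable[[State], State]] = [ocr, classify, extract]

def run(document: str) -> State:
    state: State = {"document": document}
    for stage in PIPELINE:
        state = stage(state)
    return state

result = run("rent roll for 12 Main St")
print(result["extracted"])  # {'doc_type': 'rent_roll', 'chars': 24}
```

The value of the state-machine framing is that each stage stays independently testable and swappable, which is exactly what lets a notebook prototype replace one node without touching the rest of the graph.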

Your First 90 Days

Month 1: Codebase Mastery & First Shipped Wins

  • Get fully onboarded by successfully running the monorepo locally and tracing a 'live' data request through our core AI and data services within your first few days.
  • Ship your first pipeline improvement to production (e.g., an extraction fix or a schema normalization) by the end of Week 1.
  • Reproduce a notebook experiment, publish a short gap analysis, and transition your first DSPy or LangGraph prototype into a tested FastAPI service.

Month 2: Pipeline Ownership & The Research Flywheel

  • Take end-to-end ownership of a complex pipeline component (like due diligence intelligence or multi-source data fusion).
  • Deliver a new evaluation harness tied to a live pipeline, and immediately use it to measure and drive a real-world performance increase.
  • Productionize a research-driven upgrade (like a new DSPy optimizer strategy) with clear before/after metrics.

Month 3: Architecture & Scale

  • Lead the architecture of a next-generation research initiative (e.g., advanced GraphRAG or a new autonomous diligence agent) from abstract idea to production deployment.
  • Define and accelerate a repeatable “research-to-release” playbook for your domain, setting the standard for how we bridge AI research and production engineering.

What You Will Own

  • AI Extraction Pipelines: Design and ship improvements to the OCR → Classify → Extract pipeline (using PaddleOCR, LangGraph, DSPy) to reduce extraction error and latency for complex document types like T12 financials, rent rolls, and appraisals.
  • Scale Data Normalization: Expand our property data aggregation layer. You will pull data from various top-tier real estate and demographic APIs, optimizing schema normalizations and conflict resolution to unify external datasets with our internal systems.
  • Strengthen Automated Risk Engines: Improve the underlying engine to generate smarter, cleaner, and higher-quality risk assessments.
  • Optimize Property Intelligence Pipelines: Enhance automated data enrichment to deliver instantaneous, actionable insights on asset-specific attributes and external risk factors.
  • External Provider Resilience: Expand and maintain our TypeScript-based provider ecosystem, ensuring reliability against third-party outages via robust caching, retries, and observability.
  • Drive the Research Flywheel: Conduct systematic gap analyses using custom evaluation suites (accuracy/precision-recall) on current modules. You will identify the next 2-3 bottlenecks, feed them back into the engineering loop, and implement academic approaches (e.g., SOTA advanced chunking, multi-step RLM reasoning) to continuously boost precision and recall.
  • Orchestrate Agentic Workflows: Use LangGraph to build complex, fault-tolerant state machines that connect our document classification, OCR, and schema extraction modules.
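
As a sketch of the evaluation side of the research flywheel above, here is a minimal precision/recall harness over extracted field sets. The labeled example and field tuples are invented; a real harness would run over a corpus of labeled documents and track metrics per document type.

```python
def precision_recall(predicted: set, actual: set) -> tuple[float, float]:
    """Precision and recall of predicted extracted fields
    against a hand-labeled gold set."""
    if not predicted or not actual:
        return 0.0, 0.0
    true_pos = len(predicted & actual)
    return true_pos / len(predicted), true_pos / len(actual)

# toy gold labels vs. what a pipeline pulled from one document
gold = {("rent", 1200), ("unit", "4B"), ("tenant", "Acme LLC")}
pred = {("rent", 1200), ("unit", "4B"), ("tenant", "ACME")}

p, r = precision_recall(pred, gold)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.67 recall=0.67
```

Note that exact-match tuples make "ACME" vs. "Acme LLC" a miss on both metrics; deciding where fuzzy matching is acceptable is itself part of designing the harness.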

What hard skills do we need?

Note: We don't expect you to have every single skill listed below; that's nearly impossible. We value equivalent skills and a proven ability to learn fast, especially when it comes to specific technologies like DSPy or Neo4j Cypher.

  • Languages: Python 3.12+ (FastAPI/Pydantic), TypeScript (strict mode/Zod), SQL/Cypher, and the newest programming language: English.
  • AI/ML/LLM Systems: Prompts/DSPy optimization, LangGraph orchestration, vector retrieval (Weaviate, Elastic, or alternatives), prompt/eval loops, and multi-model integrations (OpenAI, Gemini, vLLM).
  • Data & Graphs: Neo4j modeling, schema design, multi-source data fusion, and ORMs (SQLAlchemy, Prisma, or Drizzle is an advantage).
  • Document Intelligence: Working with pre-implemented OCR pipelines, document parsing, and classification under noisy, real-world inputs/files/tables.
  • Production Engineering: Monorepo tooling, Docker/Docker Compose, message queues (RabbitMQ or similar), and observability (tracing, structured logging).
  • Experimentation: Comfortable in Jupyter Notebooks for rapid prototyping, benchmark/evaluation harnesses, reproducible experiments, and A/B metric tracking.
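
As one small illustration of the production-engineering and observability skills above, here is a stdlib-only sketch of a provider call wrapped with exponential-backoff retries and one structured (JSON) log line per attempt. The `flaky` provider and field names are invented; real code would add jitter, timeouts, and a circuit breaker.

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("providers")

def call_with_retries(fn, *, attempts: int = 3, base_delay: float = 0.01):
    """Retry a flaky provider call with exponential backoff,
    emitting one structured log line per attempt."""
    for attempt in range(1, attempts + 1):
        try:
            result = fn()
            log.info(json.dumps({"event": "provider_ok", "attempt": attempt}))
            return result
        except Exception as exc:
            log.info(json.dumps({"event": "provider_error",
                                 "attempt": attempt, "error": str(exc)}))
            if attempt == attempts:
                raise  # exhausted: surface the failure to the caller
            time.sleep(base_delay * 2 ** (attempt - 1))

# simulate a provider that fails twice before succeeding
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("provider timeout")
    return {"status": 200}

result = call_with_retries(flaky)
print(result)  # {'status': 200}
```

Emitting machine-parseable log lines (rather than free text) is what makes the retry behavior observable: attempts and error rates per provider can be aggregated directly from the logs.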

Core Responsibilities:

  • Identify and onboard new data sources
  • Perform data comparisons & validation
  • Assess data quality and usability
  • Define the modeling approach
  • Implement and productionize solutions
  • Work independently with minimal structure

Team X @ Atlas: Mission & Culture

Atlas Invest’s Team X is building the intelligence layer for real estate. We ingest, normalize, and reason over the messiest data in one of the world's largest asset classes – property records scattered across multiple external providers, complex ownership networks buried in public filings, and financial details locked inside massive, unstructured rent rolls and appraisals.

Team X is a diverse, high-performing squad of engineers and researchers within Atlas. We value ownership, velocity, and craftsmanship. We work in a polyglot monorepo and treat the boundary between research and production as a feature, not friction. You will join a culture where people are trusted to run with ambiguity, publish Jupyter experiments on Monday, and deploy those results to production by Friday.

About the company:

Atlas Invest is transforming the bridge loan landscape, seamlessly connecting investors with real estate developers using advanced big data analytics for a personalized investment experience.

By applying for this position, you agree to the terms outlined in our Privacy Policy. Please take a moment to review our Privacy Policy https://sd-solutions.breezy.hr/privacy-notice, and make sure you understand its contents. If you have any questions or concerns regarding our Privacy Policy, please feel free to contact us.
