Softgic

Data Scraping 1743

Requirements:

  • Strong experience with web scraping and data extraction.
  • Practical programming experience using Python or similar scripting languages.
  • Experience working with HTML parsing, APIs, HTTP requests, FTP sources, and structured or unstructured data.
  • Strong analytical and problem-solving skills.

Roles & Responsibilities

  • Research and identify public and government data sources.
  • Extract and normalize data from websites, APIs, feeds, and online repositories.
  • Build reusable, maintainable, and re-runnable scripts and scraping workflows.
  • Document data sources, extraction methodologies, challenges encountered, and re-run procedures.

Job description

This is a remote position.

We are seeking a Data Scraping specialist to help collect, organize, and normalize data from public and government sources into a consistent, structured format. This role focuses on solving complex data acquisition challenges, researching unfamiliar sources, extracting information from websites and feeds, and transforming it into predefined formats that can be consumed by downstream systems. The ideal candidate enjoys working with messy datasets, investigating how websites and data sources are structured, and creating reusable solutions that can be executed repeatedly with consistent results. This position requires strong problem-solving skills, attention to detail, and the ability to work independently while documenting findings and processes clearly.

Schedule: Monday to Friday - 12:00 PM – 8:00 PM CST

Responsibilities:

  • Research and identify public and government data sources.
  • Extract and normalize data from websites, APIs, feeds, and online repositories.
  • Build reusable, maintainable, and re-runnable scripts and scraping workflows.
  • Deliver structured outputs in predefined formats.
  • Provide sample outputs for review before processing larger datasets.
  • Document data sources, extraction methodologies, challenges encountered, and re-run procedures.
  • Capture and report any relevant information discovered during extraction, including inconsistencies, amendments, effective dates, repeal notes, or related metadata.
  • Troubleshoot data acquisition issues and propose alternative approaches when needed.
  • Collaborate with stakeholders through regular check-ins and written communication.
  • Maintain version-controlled code repositories and follow standard development practices.

Requirements

  • Strong experience with web scraping and data extraction.
  • Practical programming experience using Python or similar scripting languages.
  • Experience working with HTML parsing, APIs, HTTP requests, FTP sources, and structured or unstructured data.
  • Ability to evaluate, debug, and improve scraping solutions.
  • Strong analytical and problem-solving skills.
  • Experience building reusable automation workflows rather than one-off scripts.
  • Familiarity with relational databases (PostgreSQL preferred) and a normal Git workflow.
  • Strong documentation and communication skills.
  • Ability to work independently and take ownership of technical challenges.
  • High attention to detail and commitment to data accuracy.

Nice to Have:

  • Experience working with government, regulatory, compliance, or public-sector datasets.
  • Experience with Playwright, Selenium, Puppeteer, Scrapy, or similar scraping frameworks.
  • Experience with data versioning, change detection, or document lineage.
  • Familiarity with AI-assisted development tools and workflows.
