Breakthrough Energy Ventures Portfolio Company Career Opportunities

Data and Platform Engineer

Prolific Machines

Software Engineering
Los Angeles, CA, USA · Emeryville, CA, USA
Posted on Saturday, June 1, 2024

About This Job

We are seeking a passionate builder with experience in data pipeline development, platform integration, and data management to enhance Prolific Machines’ data infrastructure and build the foundations of our AI Platform. You will be responsible for developing and maintaining efficient data pipelines and processes to transform, migrate, analyze, and integrate data and metadata across various scientific systems. Your work will play a crucial role in ensuring the integrity, accuracy, and availability of our scientific and engineering data, directly impacting our core business operations and AI platform development.

Responsibilities

  • Design and develop scalable data pipelines for transforming and migrating data from diverse, multi-modal sources to target systems, ensuring data quality, accuracy, and consistency throughout the process
  • Own and manage integrations with the Ganymede platform and other relevant systems (ELN: Benchling, IoT: Particle), ensuring seamless data flow and system interoperability
  • Ensure that structured, high-quality data is accessible to support model development, bioprocess capabilities and analysis, industrial-scale demonstrations, and customer-facing solutions
  • Work closely with Biology, BPD, Engineering, and AI teams to understand data needs and provide solutions that enhance research, development, and commercialization efforts
  • Provide infrastructure and development support for data analysis, model development, and hardware/instrument interaction
  • Document data transformation and migration processes, including data mappings, transformations, and dependencies, and maintain comprehensive documentation for future reference


About You

  • Excellent programmer, primarily in Python, with strong proficiency in SQL
  • Excellent understanding of relational and non-relational databases, data modeling principles, and query optimization techniques
  • Exposure to and a good understanding of biology/bioprocess data and analysis
  • Experience with scientific data management and documentation tools, including electronic lab notebook (ELN) systems (Benchling) and/or laboratory information management systems (LIMS)
  • Experience with cloud platforms such as AWS, Azure, or Google Cloud, and their data services
  • 2+ years in industry and a B.S. or M.S. degree in Computer Science, Engineering, or a related field
  • Enjoy getting hands-on with data infrastructure and platform development in all stages of development
  • Have strong technical communication skills, are excited about working with interdisciplinary teams, and are eager to work in a high-growth, fast-paced startup environment
  • Experience with infrastructure as code (IaC) principles and technologies is a plus
  • Knowledge of data governance, data security, and data privacy practices is a plus
  • Experience with or understanding of AI is a plus

Compensation

  • Equity in the form of common stock (subject to vesting)
  • Performance-based bonus