Lifescience Large Language Model Research Intern

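算秩未来 | Beijing/Shanghai · 5F, Saier Building, Block B, Building 8, Tsinghua Science Park, No. 1 Zhongguancun East Road, Haidian District, Beijing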

Internship · Internet / Electronics / Online Games · Bachelor's degree

Posted on 2025-07-22

Job Description

About the Role

We are building next-generation language models for life sciences. This internship focuses on developing and evaluating large-scale sequence models that can learn rich representations across DNA, RNA, amino acids, small molecules, and human-curated biomedical knowledge. You will explore multi-modal and multi-sequence fusion, helping to bridge biological data with domain knowledge using foundation model techniques.

What You'll Do

- Design and fine-tune transformer-based language models for biological and chemical sequences.
- Integrate multiple biological modalities: genomic sequences, protein sequences, SMILES strings, and related annotations.
- Build and experiment with methods for cross-modal embedding, contrastive learning, or retrieval.
- Analyze biological sequence datasets (e.g., genome references, UniProt, PubChem).
- Collaborate with a multi-disciplinary team of ML engineers, bioinformaticians, and computational biologists.

Who You Are

✅ Must-Have

- Familiarity with large language models, transformer architectures, and sequence modeling.
- Hands-on experience with PyTorch, TensorFlow, or JAX.
- Solid coding skills for data preprocessing and model training.
- Basic understanding of biological sequences (DNA, RNA, proteins) or chemical representations (SMILES).

✅ Nice-to-Have

- Experience training or fine-tuning LLMs with domain-specific corpora.
- Exposure to multi-modal learning techniques.
- Knowledge of relevant open datasets: PDB, UniProt, GenBank, or chemical libraries.
- Basic grasp of computational biology or cheminformatics concepts.
- Familiarity with NLP evaluation pipelines and retrieval tasks.

✅ Mindset

- You are curious about applying LLMs to new scientific frontiers.
- You enjoy learning from both ML and life science literature.
- You’re comfortable working in an interdisciplinary research setting.

Why Join Us

- Be part of a team pushing the boundaries of LLMs for biology and chemistry.
- Access high-performance computing and cutting-edge models.
- Collaborate with experts at the intersection of NLP, bioinformatics, and drug discovery.
- Contribute to open science and next-generation bio-foundation models.

Job Requirements

No requirements listed.

Related Positions

Large Model Data Engineering Intern

2026-04-21

算秩未来 · Beijing/Shanghai

Class of 2026 · R&D

Data Platform R&D Intern

2026-04-08

算秩未来 · Beijing/Shanghai

Class of 2026 · Internet / Electronics / Online Games

Life Science Large Language Model Data Science Intern

2026-03-23

算秩未来 · Beijing

Class of 2026 · Internet / Electronics / Online Games

Dexterous Hand Algorithm Engineer

2026-07-01

自变量机器人科技（深圳）有限公司 · Shenzhen

Class of 2026 · Internet / Electronics / Online Games

Game Service Operations Intern

2026-07-01

回响科技 · Shanghai

Class of 2026 · Internet / Electronics / Online Games

Product Operations Intern

2026-07-01

回响科技 · Shanghai

Class of 2026 · Internet / Electronics / Online Games
