- Company’s multi-year biodiscovery programme, driven by strategic economic partnerships across 26 countries, has resulted in BaseData™, the largest and fastest-growing database of new biological protein sequences - 9.8 billion - ever built
- New species data gives scientists potential new therapeutic solutions in areas including antimicrobial resistance, gene editing, and environmental sustainability
- Scientific preprint demonstrates how BaseData breaks through biology’s data wall and now exceeds the size, growth and diversity of public datasets that serve as the foundations for AI applications in the life sciences industry
LONDON and CAMBRIDGE, Mass., June 11, 2025 (GLOBE NEWSWIRE) -- Basecamp Research, an AI company dedicated to using nature to solve the most pressing challenges in the life sciences, today announced the discovery of over 1 million new species as part of BaseData™, the world’s largest biological protein sequence database, and the first to be purpose-built to power the next era of generative biology.
“The rise of generative biology – using AI foundation models to design, generate, and annotate proteins, pathways and therapeutics – creates unprecedented demand for large, diverse biological sequence databases,” said Glen Gowers, Ph.D., CEO and Co-Founder of Basecamp Research. “The million plus new species we’ve identified to date gives us an unrivaled understanding of how life on Earth has evolved over billions of years. We believe it will help the life sciences sector overcome today’s critical data bottleneck and build a truly generative biology ecosystem with applications across nearly every area of human and planetary health.”
Today, research relies heavily on public, open access biological sequence databases. Designed originally as academic collaboration tools, these databases are accessed over 100 million times per day and are the source of over 50% of all life science patents. This usage has grown by 10x in the last decade and continues to accelerate thanks in large part to the increasing use of data-hungry AI models in biopharma research.
However, compared to all life on Earth, these databases are incomplete, riddled with redundancies, and highly biased – 70% of all sequence data comes from just 10 species. Furthermore, the information in these databases is growing slowly, and companies using these databases for commercial purposes face increasing scrutiny on various legal fronts.
This lack of growth and diversity holds back model performance and is a fundamental problem that slows research advancements in the life sciences: the data wall.
Basecamp Research has broken through this data wall by pioneering an economic partnership-based model that incentivises the collection of samples across the planet’s most extreme and biodiverse environments, working with 125+ communities in 26 countries. In a milestone disclosure, the company is unveiling the results of its biodiscovery initiative — BaseData has identified 9.8 billion new protein sequences. Once redundant sequences are removed, BaseData is currently over 10x bigger than all public databases combined. It continues to grow rapidly, offering a fundamentally new understanding of life on Earth.
“Biological foundation models are the key to continued progress in the life sciences, but their growth is slowing and performance is suffering,” said Oliver Vince, Ph.D., Co-Founder of Basecamp Research. “It’s this data wall that discourages life science teams from building and using huge biological foundation models. At Basecamp, we’ve demonstrated a new, scalable economic framework for expanding our knowledge of life on Earth and we show it is possible to overcome the data wall that’s limiting progress for others in bioAI. Training our own foundation models on this data is enabling collaborations across the life sciences sector to help solve pressing challenges in drug discovery.”
Basecamp Research scientists have uncovered more than one million new microbial species to date in some of the planet’s most remote and extreme environments, expanding the view of the tree of life by over 10 times. The previously hidden biological insights in this data, combined with foundational AI models, will help shape everything from environmental sustainability and repair to therapeutics development. Examples of such newly discovered species include:
- On a World War II shipwreck, a new species of Burkholderia, a type of bacteria known for its ability to remove heavy metals from the environment. This could help improve pollution control and deepen our understanding of how bacteria resist antibiotics.
- In acidic hot springs near an active volcano, a new member of the Sulfolobaceae family that thrives in searing temperatures. Its specialized proteins, stress-response systems and stability near boiling point could help develop new ways to deliver medicine or preserve biological materials under harsh conditions.
- In Antarctic soil, a new species of Candidatus Eremiobacterota, a bacterium that survives by drawing nourishment from the air, generating its own water using hydrogen as an energy source — a finding that could inform novel gas-based drug delivery systems or therapeutic approaches.
Basecamp Research is sharing further details in a preprint paper and plans to offer early access to its unique data to life sciences researchers who express interest via its website, www.basecamp-research.com, to help them further their research.
About Basecamp Research
Basecamp Research is solving the most pressing challenges in the life sciences by exploring Beyond Known Biology™. We build foundational AI models on top of the world’s largest, ethically sourced database of biological information to give AI the most complete understanding of biology ever. This allows us to design more complex biological systems with performance improving dramatically as AI sees more diversity and context. Basecamp Research collaborates with biopharma companies and academic research institutions to design novel protein sequences and biological systems that can transform therapeutic research and development. Our team of explorers, scientists and policy experts proudly work with more than 100 biodiversity partners across the globe, allowing us to deliver breakthroughs that can have a profound impact on healthcare and the lives of patients. Fast Company named Basecamp Research one of biotech’s top 10 Most Innovative Companies in 2025. For more information, visit basecamp-research.com.
BaseData™ and Beyond Known Biology™ are brand names and technologies of Basecamp Research.
Media Contact:
Adam Silverstein
SCIENT PR
adam@scientpr.com
Photos accompanying this announcement are available at
https://www.globenewswire.com/NewsRoom/AttachmentNg/5597a489-463d-40e5-9fce-0064cc18a55e
https://www.globenewswire.com/NewsRoom/AttachmentNg/98e6c5ce-da20-4bae-ad20-76687f9ddbc6
https://www.globenewswire.com/NewsRoom/AttachmentNg/61aeb561-580d-4009-bd50-95e4db6cfebf
https://www.globenewswire.com/NewsRoom/AttachmentNg/fd0cde34-a7f2-4b1e-9d32-5d728a23e880
https://www.globenewswire.com/NewsRoom/AttachmentNg/44187ee1-3843-4ddf-91e7-2ce1979017b4
https://www.globenewswire.com/NewsRoom/AttachmentNg/ad51db49-1170-47da-b785-b96762ce198a
