Microsoft’s BioGPT recorded 45,315 monthly downloads on Hugging Face as of December 2025, with 129 tagged model variants now hosted on the platform. The 347-million-parameter model, trained on 15 million PubMed abstracts, remains one of the most downloaded domain-specific language models in biomedical research. This page compiles the latest BioGPT statistics for 2026, covering model architecture, benchmark performance, community adoption, and the broader AI healthcare market.
BioGPT Statistics 2026 — TL;DR
BioGPT has 347 million parameters distributed across 24 transformer layers with 1,024 hidden units.
The model recorded 45,315 monthly downloads on Hugging Face as of December 2025.
BioGPT achieved 78.2% accuracy on PubMedQA; BioGPT-Large reached 81.0%.
Microsoft trained BioGPT on 15 million PubMed abstracts using 8 NVIDIA V100 GPUs over approximately 10 days.
The biomedical AI community has created 63 fine-tuned BioGPT model derivatives for specialized applications.
BioGPT is a generative pre-trained Transformer model developed by Microsoft Research, designed specifically for biomedical text generation and mining. Unlike BERT-based biomedical models that only handle classification and extraction tasks, BioGPT can generate new biomedical text, answer research questions, and extract relations between drugs, diseases, and genes from published literature. The model is released under the MIT license, making it available for both commercial and academic use. With the growing adoption of AI tools across research and enterprise settings, BioGPT’s open-access approach has attracted a sizable community of developers and researchers building on top of it.
BioGPT Model Architecture and Training Data
BioGPT is built on the GPT-2 decoder architecture, adapted specifically for biomedical text. The base model contains 347 million parameters organized into 24 transformer layers. Each layer uses 16 attention heads spread across 1,024 hidden units. The vocabulary consists of 42,384 tokens generated through byte pair encoding tuned for medical terminology.
Microsoft trained BioGPT on 15 million PubMed abstracts spanning publications from the 1960s through 2021. Training ran for 200,000 steps on 8 NVIDIA V100 GPUs with a peak learning rate of 2 × 10⁻⁴ and 20,000 warm-up steps. Abstracts averaged 200 tokens each. The larger variant, BioGPT-Large, follows the GPT-2 XL architecture with 1.5 billion parameters.
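As a rough illustration of the training schedule described above, the sketch below ramps the learning rate linearly to the 2 × 10⁻⁴ peak over 20,000 warm-up steps. The shape of the decay after warm-up is not stated here, so inverse square root decay is assumed purely for illustration, and the function name `learning_rate` is hypothetical.

```python
import math

PEAK_LR = 2e-4          # peak learning rate quoted above
WARMUP_STEPS = 20_000   # warm-up steps quoted above
TOTAL_STEPS = 200_000   # total training steps quoted above


def learning_rate(step: int,
                  peak_lr: float = PEAK_LR,
                  warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at a given step: linear warm-up, then decay.

    Assumption: inverse square root decay after warm-up. The article only
    gives the peak value and the number of warm-up steps.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak.
        return peak_lr * step / warmup_steps
    # Decay proportional to 1/sqrt(step), equal to the peak at warmup_steps.
    return peak_lr * math.sqrt(warmup_steps / step)


# Print the assumed schedule at a few points of the 200,000-step run.
for s in (0, 10_000, 20_000, 100_000, TOTAL_STEPS):
    print(f"step {s:>7}: lr = {learning_rate(s):.2e}")
```

Under this assumed decay, the rate at the final step would be a little under one third of the peak; a different decay shape would change that figure.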
| Specification | BioGPT (Base) | BioGPT-Large |
|---|---|---|
| Parameters | 347 million | 1.5 billion |
| Transformer Layers | 24 | 48 |
| Hidden Units | 1,024 | 1,600 |
| Attention Heads | 16 | 25 |
| Vocabulary Size | 42,384 | 42,384 |
| Training Data | 15M PubMed Abstracts | 15M PubMed Abstracts |
| Architecture Base | GPT-2 | GPT-2 XL |
| License | MIT | MIT |
Source: Microsoft Research, arXiv:2210.10341
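The parameter counts in the table can be roughly sanity-checked from the layer count, hidden size, and vocabulary size. The sketch below is a back-of-the-envelope estimate for a GPT-2 style decoder. It assumes a feed-forward width of four times the hidden size, a 1,024-token context window, and output weights tied to the input embeddings; none of these three assumptions comes from the table, and the function name is hypothetical.

```python
def estimate_decoder_params(layers: int,
                            hidden: int,
                            vocab: int,
                            max_positions: int = 1024,  # assumed context length
                            ffn_multiplier: int = 4) -> int:  # assumed FFN width
    """Approximate parameter count of a GPT-2 style decoder-only model."""
    # Self-attention: query, key, value and output projections (weights + biases).
    attention = 4 * (hidden * hidden + hidden)

    # Feed-forward block: two linear layers (weights + biases).
    ffn_dim = ffn_multiplier * hidden
    feed_forward = (hidden * ffn_dim + ffn_dim) + (ffn_dim * hidden + hidden)

    # Two layer norms per block, each with a scale and a bias vector.
    layer_norms = 2 * 2 * hidden

    per_layer = attention + feed_forward + layer_norms

    # Token and learned position embeddings; the output layer is assumed tied.
    embeddings = vocab * hidden + max_positions * hidden
    final_norm = 2 * hidden

    return layers * per_layer + embeddings + final_norm


base = estimate_decoder_params(layers=24, hidden=1024, vocab=42_384)
large = estimate_decoder_params(layers=48, hidden=1600, vocab=42_384)
print(f"BioGPT (base) estimate:  {base / 1e6:.1f}M parameters")
print(f"BioGPT-Large estimate:   {large / 1e9:.2f}B parameters")
```

Under these assumptions the estimate lands near 347 million for the base configuration and a little above 1.5 billion for the large one. Exact counts depend on implementation details the table does not capture.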
How Does BioGPT Perform on Benchmarks?
Microsoft evaluated BioGPT across six biomedical NLP datasets. The base model reached 78.2% accuracy on PubMedQA, a benchmark for answering biomedical research questions. BioGPT-Large pushed that number to 81.0%. On relation extraction tasks, BioGPT scored 44.98% F1 on BC5CDR (chemical-disease relations), 38.42% on KD-DTI (drug-target interactions), and 40.76% on DDI (drug-drug interactions).
These results put BioGPT ahead of earlier biomedical models at the time of release, though newer large-scale models like Med-PaLM 2 (81.8% on PubMedQA) and MedGemma 27B (87.7% on MedQA) have since reached higher scores on some benchmarks. BioGPT’s advantage is its open-source availability and relatively small computational footprint compared to models with hundreds of billions of parameters.
| Benchmark | Task Type | BioGPT Score | BioGPT-Large Score |
|---|---|---|---|
| PubMedQA | Question Answering | 78.2% (Accuracy) | 81.0% (Accuracy) |
| BC5CDR | Relation Extraction | 44.98% (F1) | 50.12% (F1) |
| KD-DTI | Relation Extraction | 38.42% (F1) | 38.39% (F1) |
| DDI | Relation Extraction | 40.76% (F1) | 44.89% (F1) |
| HoC | Document Classification | — | 84.40% (F1) |
Source: Microsoft Research, Briefings in Bioinformatics (2022)
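For readers unfamiliar with the metric, F1 on relation extraction is typically computed by comparing predicted relation triples against gold annotations. The sketch below is a minimal micro-averaged version using made-up example triples. It is not Microsoft's evaluation script, and matching rules (for example, entity normalization) vary by benchmark.

```python
from typing import Iterable, Tuple

# A relation is modeled as (head entity, relation type, tail entity).
Triple = Tuple[str, str, str]


def micro_f1(predicted: Iterable[Triple],
             gold: Iterable[Triple]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using exact-match triples."""
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)

    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example data, for illustration only.
gold = [
    ("aspirin", "induces", "gastric ulcer"),
    ("cisplatin", "induces", "nephrotoxicity"),
    ("warfarin", "induces", "bleeding"),
    ("metformin", "induces", "lactic acidosis"),
]
predicted = [
    ("aspirin", "induces", "gastric ulcer"),
    ("warfarin", "induces", "bleeding"),
    ("ibuprofen", "induces", "headache"),  # not in gold: a false positive
]

p, r, f1 = micro_f1(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall, a model that extracts few relations very accurately can still score low if it misses most of the gold relations.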
BioGPT Community Adoption on Hugging Face and GitHub
As of December 2025, the BioGPT base model on Hugging Face recorded 45,315 monthly downloads. The platform hosts 129 BioGPT-tagged models, including the base model, BioGPT-Large, and 63 community-developed fine-tuned derivatives. The model page has accumulated 291 likes, and over 85 Hugging Face Spaces use BioGPT in some capacity.
On GitHub, Microsoft’s BioGPT repository has collected 4,500+ stars and 475 forks with 74 active watchers. The model’s MIT license, which permits both research and commercial applications, contributes to its steady adoption. Researchers have adapted BioGPT variants for tasks ranging from drug discovery literature mining to privacy-sensitive clinical documentation.
| Metric | Count | Platform |
|---|---|---|
| Monthly Downloads | 45,315 | Hugging Face |
| Tagged Models | 129 | Hugging Face |
| Fine-tuned Derivatives | 63 | Hugging Face |
| Likes | 291 | Hugging Face |
| Spaces Using BioGPT | 85+ | Hugging Face |
| GitHub Stars | 4,500+ | GitHub |
| GitHub Forks | 475 | GitHub |
| GitHub Watchers | 74 | GitHub |
Source: Hugging Face, GitHub (December 2025)
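Figures like these change over time and can be re-checked programmatically. The sketch below shows one possible way to do so with the `huggingface_hub` client library, assuming it is installed and network access is available. Whether the numbers above were gathered this way is not known, and the `biogpt` tag filter and the helper names are assumptions made for illustration.

```python
from typing import Any, Dict


def fetch_model_info(model_id: str = "microsoft/biogpt") -> Any:
    """Fetch live metadata for a model from the Hugging Face Hub."""
    # Imported lazily so summarize() works without the library installed.
    from huggingface_hub import HfApi

    return HfApi().model_info(model_id)


def count_tagged_models(tag: str = "biogpt") -> int:
    """Count Hub models carrying a given tag (assumed tag name)."""
    from huggingface_hub import HfApi

    return sum(1 for _ in HfApi().list_models(filter=tag))


def summarize(info: Any) -> Dict[str, int]:
    """Extract the adoption figures from a model-info object.

    The Hub reports `downloads` as a rolling count over roughly the last
    30 days, which is assumed here to match the monthly figure above.
    """
    return {
        "downloads_last_30_days": info.downloads,
        "likes": info.likes,
    }


# Example (requires network access):
#   print(summarize(fetch_model_info()))
#   print(count_tagged_models())
```

Live values will generally differ from the December 2025 snapshot reported in the table.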
How Does BioGPT Compare to Other Biomedical AI Models?
BioGPT sits in a crowded field of biomedical language models. BERT-based models like BioBERT and PubMedBERT handle discriminative tasks (classification, named entity recognition, extraction) but cannot generate text. BioGPT fills that gap as a generative model. BioMedLM, also a GPT-style model, uses 2.7 billion parameters and scored 95.7% on BioASQ. Larger proprietary models like Med-PaLM 2 (built on PaLM 2) reached 86.5% on MedQA and 81.8% on PubMedQA with self-consistency prompting.
BioGPT’s edge over these bigger models is accessibility. At 347 million parameters, it runs on a single consumer GPU. All checkpoints are open-source under MIT. For organizations adopting generative AI for specialized biomedical workflows, that low barrier matters. Newer entrants like MedGemma (released by Google in 2025) have pushed accuracy higher on clinical benchmarks, but BioGPT remains the go-to for text generation tasks built on PubMed literature.
| Model | Developer | Parameters | PubMedQA Score | Open Source |
|---|---|---|---|---|
| BioGPT | Microsoft | 347M | 78.2% | Yes (MIT) |
| BioGPT-Large | Microsoft | 1.5B | 81.0% | Yes (MIT) |
| BioMedLM | MosaicML | 2.7B | — | Yes |
| Med-PaLM 2 | Google | ~340B | 81.8% | No |
| MedGemma 27B | Google | 27B | — | Partial |
| PubMedBERT | Microsoft | 110M | — | Yes |
Source: arXiv, Google Research, Hugging Face
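The accessibility argument comes down to simple arithmetic: the memory needed to hold a model's weights scales with parameter count and numeric precision. The sketch below estimates weight memory only, ignoring activations, the attention cache, and framework overhead, so real usage will be higher. The helper name and the chosen precisions are illustrative assumptions.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "fp16") -> float:
    """Approximate memory (in GB, 1 GB = 1e9 bytes) for model weights alone."""
    if dtype not in BYTES_PER_PARAM:
        raise ValueError(f"unknown dtype: {dtype}")
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Parameter counts taken from the comparison table above.
models = {
    "BioGPT": 347e6,
    "BioGPT-Large": 1.5e9,
    "BioMedLM": 2.7e9,
    "MedGemma 27B": 27e9,
}

for name, params in models.items():
    fp16 = weight_memory_gb(params, "fp16")
    fp32 = weight_memory_gb(params, "fp32")
    print(f"{name:<14} fp16 ~ {fp16:6.2f} GB   fp32 ~ {fp32:6.2f} GB")
```

By this estimate the base model's weights need well under 2 GB even at full precision, while a 27-billion-parameter model needs tens of gigabytes at half precision.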
BioGPT and the AI Healthcare Market in 2026
The global AI in healthcare market reached $39.34 billion in 2025 and is projected to hit $56.01 billion in 2026, according to Fortune Business Insights. Long-term forecasts place the market at $613.81 billion by 2034, growing at a 36.83% CAGR. Software solutions hold the largest share at 44.60%, and language models like BioGPT are part of that software layer.
Generative AI specifically within healthcare is a $4.7 billion segment in 2026, on track for $39.8 billion by 2035 at a 26.7% CAGR. North America accounts for roughly 56% of that spend. The US FDA has cleared over 340 AI-enabled medical devices, with diagnostics leading adoption. For researchers and pharmaceutical companies building AI-powered workflows, domain-specific models like BioGPT handle specialized text tasks that general-purpose models often miss.
| Year | Market Size (USD Billions) |
|---|---|
| 2024 | $26.69 |
| 2025 | $39.34 |
| 2026 (Projected) | $56.01 |
| 2030 (Projected) | $187.69 |
| 2034 (Projected) | $613.81 |
Source: Fortune Business Insights, Precedence Research
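The growth rates quoted in this section can be cross-checked against the endpoint values with the standard compound annual growth rate formula. The sketch below is a minimal version; the function names are hypothetical, and the choice of base year for each check is an assumption, since the source reports do not all state it here.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate between two values over a period."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: float) -> float:
    """Value after compounding `rate` annually for `years` years."""
    return start_value * (1 + rate) ** years


# AI in healthcare market, assuming the 2024 figure as the base year.
overall = cagr(26.69, 613.81, years=10)
print(f"2024 to 2034 implied CAGR: {overall:.2%}")

# Generative AI in healthcare, 2026 to 2035.
generative = cagr(4.7, 39.8, years=9)
print(f"2026 to 2035 implied CAGR: {generative:.2%}")
```

With 2024 as the base year, the implied rate for the overall market comes out close to the quoted 36.83%, and the generative AI segment lands close to the quoted 26.7%. Starting from the 2025 or 2026 figures instead gives a somewhat lower implied rate.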
BioGPT Statistics on Model Variants and Use Cases
Microsoft released seven BioGPT checkpoints optimized for different downstream tasks. These include the base pre-trained model, BioGPT-Large, and task-specific versions for question answering (PubMedQA), relation extraction (BC5CDR, KD-DTI, DDI), and document classification (HoC). All variants are available through Hugging Face Hub and Microsoft’s official download channels.
Community fine-tuned derivatives span drug discovery literature mining, clinical trial document summarization, bias detection in biomedical datasets, and gene-disease interaction mapping. Researchers have also used BioGPT for automated extraction of drug-target relationships from newly published papers, reducing manual review time. The rapid scaling of general-purpose chatbots like ChatGPT has pushed more organizations toward specialized models when accuracy on domain-specific literature is the priority.
| BioGPT Variant | Task | Availability |
|---|---|---|
| Pre-trained BioGPT | General biomedical text generation | Hugging Face / Microsoft |
| BioGPT-Large | Improved QA and generation | Hugging Face / Microsoft |
| RE-BC5CDR | Chemical-disease relation extraction | Microsoft Download |
| RE-DTI | Drug-target interaction extraction | Microsoft Download |
| RE-DDI | Drug-drug interaction extraction | Microsoft Download |
| QA-PubMedQA | Biomedical question answering | Hugging Face / Microsoft |
| DC-HoC | Document classification | Microsoft Download |
Source: Microsoft Research, GitHub
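For the checkpoints hosted on Hugging Face, loading the pre-trained model might look like the sketch below. It uses the `transformers` text-generation pipeline with the `microsoft/biogpt` model ID from the Sources list and assumes `transformers`, `torch`, and `sacremoses` are installed. The helper names, prompt, and generation settings are illustrative choices, not recommendations from Microsoft.

```python
from typing import Any, Callable, Dict, List

# A generator maps a prompt to [{"generated_text": ...}], the shape
# returned by a transformers text-generation pipeline.
Generator = Callable[..., List[Dict[str, Any]]]


def load_generator(model_id: str = "microsoft/biogpt") -> Generator:
    """Build a text-generation pipeline (downloads weights on first use)."""
    # Imported lazily so complete() can be used without transformers.
    from transformers import pipeline

    return pipeline("text-generation", model=model_id)


def complete(prompt: str,
             generator: Generator,
             max_new_tokens: int = 40) -> str:
    """Return a greedy (non-sampled) continuation of a biomedical prompt."""
    outputs = generator(prompt, max_new_tokens=max_new_tokens, do_sample=False)
    return outputs[0]["generated_text"]


# Example (requires network access and several hundred MB of download):
#   generator = load_generator()
#   print(complete("Metformin is a drug used to treat", generator))
```

Generated text from a model of this kind can be fluent but factually wrong, so output intended for research use would normally be checked against the underlying literature.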
BioGPT Statistics — Physician AI Adoption Context
BioGPT’s adoption sits within a broader wave of physician AI use. According to a Doximity survey from November 2025 to January 2026, 63% of US physicians reported using AI tools, up from 47% in March 2025. AI captured 46% of all healthcare venture investment in 2025, totaling more than $18 billion out of $46.8 billion in total healthcare VC.
Q1 2026 digital health funding hit $4 billion, the strongest first quarter since the pandemic-era peak. Average deal size climbed to $36.7 million, and 12 megadeals at $100 million or higher captured 59% of that quarterly total. For enterprise technology buyers evaluating biomedical AI tools, this investment data suggests continued growth across the sector.
FAQ
What is BioGPT?
BioGPT is a generative pre-trained Transformer language model developed by Microsoft Research, trained on 15 million PubMed abstracts for biomedical text generation and mining tasks.
How many parameters does BioGPT have?
The base BioGPT model has 347 million parameters across 24 transformer layers. BioGPT-Large scales to 1.5 billion parameters with 48 layers.
What accuracy does BioGPT achieve on PubMedQA?
BioGPT scored 78.2% accuracy on PubMedQA. The larger variant, BioGPT-Large, reached 81.0% accuracy on the same benchmark.
Is BioGPT open source?
Yes. Microsoft released all BioGPT model checkpoints under the MIT license, which permits both commercial and research use provided the copyright and license notice is retained.
How many downloads does BioGPT get on Hugging Face?
BioGPT recorded 45,315 monthly downloads on Hugging Face as of December 2025, with 129 tagged model variants on the platform.
Sources
https://huggingface.co/microsoft/biogpt
https://github.com/microsoft/BioGPT
