SciBERT, the domain-adapted BERT model for scientific text, records over 219,000 monthly downloads on Hugging Face and has crossed 3,400 academic citations since its 2019 release. Built by the Allen Institute for AI and trained on 1.14 million Semantic Scholar papers, SciBERT remains a standard baseline in scientific NLP even as larger models crowd the space. This post breaks down its current usage numbers, benchmark performance across NLP tasks, and how it compares to competing domain-specific models in 2026.
SciBERT Statistics 2026 – TL;DR
SciBERT has 219,161 monthly downloads on Hugging Face as of May 2026, based on the allenai/scibert_scivocab_uncased model page.
The original SciBERT paper has accumulated 3,474 citations on Semantic Scholar, with 575 of those classified as highly influential.
SciBERT was pretrained on 3.1 billion tokens from 1.14 million scientific papers, roughly 82% from biomedical domains and 18% from computer science.
On NLP benchmarks, SciBERT outperforms BERT-base by an average of +2.11 F1 across scientific tasks when fine-tuned.
The GitHub repository has 1,700 stars and 231 forks, with 94 community fine-tuned models listed on Hugging Face.
How Many People Use SciBERT?
The primary SciBERT model (scibert_scivocab_uncased) logged 219,161 downloads in the last month on Hugging Face. A cased variant is also available but used less frequently. The model has 170 community likes and 57 active Spaces on Hugging Face that depend on it.
On GitHub, the allenai/scibert repository has attracted 1,700 stars and 231 forks since its release. The repo contains evaluation code for NER, text classification, relation extraction, dependency parsing, and PICO extraction tasks.
| Metric | Value |
|---|---|
| Monthly Hugging Face Downloads | 219,161 |
| Semantic Scholar Citations | 3,474 |
| Highly Influential Citations | 575 |
| GitHub Stars | 1,700 |
| GitHub Forks | 231 |
| Hugging Face Fine-tuned Models | 94 |
| Hugging Face Spaces | 57 |
| Hugging Face Community Likes | 170 |
Source: Hugging Face, Semantic Scholar, GitHub (May 2026)
Figure: SciBERT Usage Overview
SciBERT Model Architecture and Training Data
SciBERT uses the same 12-layer Transformer encoder as BERT-base, with 768-dimensional hidden states, 12 attention heads, and a feed-forward size of 3,072. Total parameter count is 110 million. The key difference from BERT-base is the vocabulary: SciBERT uses a custom WordPiece vocabulary, SciVocab, built from scientific text, so technical terms are split into fewer subword pieces than they would be with BERT’s general-purpose vocabulary.
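To illustrate why a domain vocabulary matters, here is a minimal sketch of WordPiece-style greedy longest-match tokenization. The two tiny vocabularies are invented for illustration and are not excerpts from BERT's vocabulary or from SciVocab; the only point is that a term present in the vocabulary stays whole, while a term absent from it is split into several pieces.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first subword split, in the style of WordPiece."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation-piece marker
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits, so the whole word maps to unknown
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabularies (assumptions, not real model vocabularies).
general_vocab = {"chrom", "##ato", "##graphy", "protein"}
science_vocab = general_vocab | {"chromatography"}

general_pieces = wordpiece("chromatography", general_vocab)
science_pieces = wordpiece("chromatography", science_vocab)
print(general_pieces)
print(science_pieces)
```

With the toy general vocabulary the term breaks into three pieces, while the toy scientific vocabulary keeps it as one token, which leaves more of the model's fixed input length for actual content.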
The training corpus was drawn from Semantic Scholar and consists of 1.14 million papers with full text, totaling 3.1 billion tokens. About 82% of the papers come from biomedical fields, and 18% from computer science. The model was released in both cased and uncased variants, with uncased performing better on most tasks.
| Specification | Detail |
|---|---|
| Architecture | BERT-base (12-layer Transformer) |
| Parameters | 110 million |
| Hidden Dimensions | 768 |
| Attention Heads | 12 |
| Training Corpus Size | 1.14M papers / 3.1B tokens |
| Corpus Source | Semantic Scholar |
| Vocabulary | SciVocab (31K tokens) |
| Biomedical Papers Share | ~82% |
| Computer Science Papers Share | ~18% |
Source: Beltagy et al. (2019), Allen Institute for AI
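As a rough sanity check on the 110 million figure, the parameter count can be reconstructed from the specifications in the table. The sketch below assumes a standard BERT layout that the table does not spell out: 512 position embeddings, 2 segment types, a pooler layer, and a vocabulary of exactly 31,000 entries standing in for the table's "31K".

```python
def bert_param_count(vocab_size, hidden=768, layers=12, ffn=3072,
                     max_positions=512, type_vocab=2):
    """Count weights and biases of a BERT-style encoder with a pooler."""
    layer_norm = 2 * hidden  # scale and shift vectors
    # Token, position, and segment embeddings plus one LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value, and output projections, each with a bias.
    attention = 4 * (hidden * hidden + hidden)
    # Two dense layers: hidden -> ffn and ffn -> hidden.
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)
    per_layer = attention + feed_forward + 2 * layer_norm
    pooler = hidden * hidden + hidden
    return embeddings + layers * per_layer + pooler


# vocab_size=31_000 is an assumption approximating the "31K" in the table.
total = bert_param_count(vocab_size=31_000)
print(f"{total / 1e6:.1f}M parameters")
```

Under these assumptions the total comes out just under 110 million. The number of attention heads does not appear in the formula because the heads partition the same 768-dimensional projections rather than adding weights.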
SciBERT NLP Benchmark Performance
SciBERT outperforms BERT-base on most biomedical and computer science NLP tasks. The largest gains appear on domain-heavy datasets like ACL-ARC (citation intent classification), where SciBERT scores 70.98 F1 versus 63.91 for BERT-base, a gap of +7.07 points. On ChemProt relation extraction, it scores 83.64 versus 79.14 (+4.50). The one exception in the table below is SciCite, where SciBERT trails BERT-base by 0.31 F1.
In biomedical tasks specifically, the fine-tuned improvement averages +1.92 F1. For computer science tasks, that average is +3.55 F1. With frozen embeddings (no fine-tuning), the picture differs by domain: the gap widens to +3.59 F1 on biomedical tasks but narrows to +1.13 on CS tasks, according to the original evaluation.
| Task / Dataset | SciBERT F1 | BERT-base F1 | Difference |
|---|---|---|---|
| ACL-ARC (Citation Intent) | 70.98 | 63.91 | +7.07 |
| ChemProt (Relation Extraction) | 83.64 | 79.14 | +4.50 |
| SciERC (NER) | 67.57 | 65.24 | +2.33 |
| BC5CDR (NER) | 90.01 | 88.85 | +1.16 |
| SciCite (Classification) | 84.00 | 84.31 | -0.31 |
| JNLPBA (NER) | 77.28 | 76.99 | +0.29 |
| NCBI-disease (NER) | 88.57 | 86.72 | +1.85 |
Source: Beltagy et al. (2019), emergentmind.com
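The Difference column can be rechecked directly from the two score columns. The short sketch below recomputes each gap and the mean over the seven listed tasks; the scores are copied from the table, and the variable names are illustrative.

```python
# (task, SciBERT F1, BERT-base F1) copied from the table above.
scores = [
    ("ACL-ARC", 70.98, 63.91),
    ("ChemProt", 83.64, 79.14),
    ("SciERC", 67.57, 65.24),
    ("BC5CDR", 90.01, 88.85),
    ("SciCite", 84.00, 84.31),
    ("JNLPBA", 77.28, 76.99),
    ("NCBI-disease", 88.57, 86.72),
]

# Per-task gap, rounded to two decimals to match the table's precision.
diffs = {task: round(sci - base, 2) for task, sci, base in scores}
mean_gain = round(sum(diffs.values()) / len(diffs), 2)
wins = [task for task, gap in diffs.items() if gap > 0]

for task, gap in diffs.items():
    print(f"{task:14s} {gap:+.2f}")
print(f"mean over listed tasks: {mean_gain:+.2f}")
print(f"SciBERT ahead on {len(wins)} of {len(diffs)} tasks")
```

The mean over these seven rows need not equal the +2.11 overall average quoted elsewhere, since the table covers only a subset of the evaluated tasks (dependency parsing and PICO extraction, for example, are not listed).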
Figure: SciBERT vs BERT-base F1 Scores by Task
SciBERT Statistics by Domain Improvement
The F1 gains break down differently by domain. Biomedical tasks see a +1.92 average improvement with fine-tuning, while computer science tasks get +3.55 on average. When embeddings are frozen and not fine-tuned, SciBERT gains +3.59 F1 on biomedical tasks, suggesting the pretrained representations carry strong domain signal even without task-specific updates.
| Domain | Fine-tuned F1 Gain | Frozen Embeddings F1 Gain |
|---|---|---|
| Biomedical | +1.92 | +3.59 |
| Computer Science | +3.55 | +1.13 |
| Multi-domain | +0.72 | +2.47 |
| Overall Average | +2.11 | +2.43 |
Source: Beltagy et al. (2019)
Figure: F1 Score Improvement Over BERT-base by Domain
How Does SciBERT Compare to Other Scientific NLP Models?
PubMedBERT, developed by Microsoft Research and trained exclusively on PubMed abstracts, generally outperforms SciBERT on biomedical benchmarks. PubMedBERT scored 82.91 on the BLURB benchmark and recorded roughly 522,000 monthly Hugging Face downloads as of late 2024, more than double SciBERT’s May 2026 figure of about 219,000, although the two counts were taken at different times. BioBERT, which continues pretraining from BERT-base on PubMed text, falls between the two on most biomedical tasks.
SciBERT’s advantage over PubMedBERT is its dual-domain coverage. Because the training data includes both biomedical and computer science papers, SciBERT performs better on CS-specific benchmarks like ACL-ARC and SciERC. For teams working across scientific disciplines rather than strictly in biomedicine, SciBERT remains a practical choice, especially since its 110-million-parameter size keeps fine-tuning costs low compared with larger general-purpose models.
| Model | Training Corpus | Parameters | Monthly Downloads (HF) | Citations |
|---|---|---|---|---|
| SciBERT | 1.14M Semantic Scholar papers | 110M | ~219K | 3,474 |
| PubMedBERT | 14M PubMed abstracts | 110M | ~522K | 1,000+ |
| BioBERT | PubMed + PMC articles | 110M | ~300K | 5,000+ |
| MatSciBERT | Materials science literature | 110M | Moderate | 200+ |
Source: Hugging Face, Semantic Scholar, respective model papers
Figure: Monthly Hugging Face Downloads: Domain-Specific BERT Models
SciBERT Academic Citations Over Time
The original SciBERT paper was published at EMNLP 2019. It crossed 1,000 citations by mid-2021 and 2,000 by early 2023. As of May 2026, the count on Semantic Scholar stands at 3,474, with 575 classified as highly influential and 818 as methods citations, meaning those papers used SciBERT as part of their methodology.
Citation growth has slowed compared to the 2021-2023 period, which is typical for a model that has become a standard baseline. Newer models trained on larger or more focused corpora now draw some of the citations SciBERT would once have received, but the estimates below suggest it still adds roughly 250-400 citations per year.
| Year | Estimated Cumulative Citations |
|---|---|
| 2019 | ~50 |
| 2020 | ~450 |
| 2021 | ~1,200 |
| 2022 | ~2,000 |
| 2023 | ~2,700 |
| 2024 | ~3,100 |
| 2025 | ~3,350 |
| 2026 (May) | 3,474 |
Source: Semantic Scholar
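The slowdown is easier to see as year-over-year additions than as a cumulative total. The sketch below differences the full-year estimates from the table; the partial 2026 row is left out so that every interval covers twelve months, and because the inputs are themselves estimates the results are only approximate.

```python
# Estimated cumulative citations at each year end, from the table above.
cumulative = {
    2019: 50,
    2020: 450,
    2021: 1200,
    2022: 2000,
    2023: 2700,
    2024: 3100,
    2025: 3350,
}

years = sorted(cumulative)
# Citations added during each year = this year's total minus last year's.
added = {
    year: cumulative[year] - cumulative[prev]
    for prev, year in zip(years, years[1:])
}
peak_year = max(added, key=added.get)

for year, count in added.items():
    print(f"{year}: +{count}")
print(f"peak year: {peak_year}")
```

On these estimates the annual additions peak in 2022 and fall to a few hundred per year by 2024 and 2025.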
Figure: SciBERT Cumulative Academic Citations (2019-2026)
SciBERT Derivative and Fine-tuned Models
SciBERT has spawned a range of domain-specific derivatives. MatSciBERT was trained on materials science literature and outperforms SciBERT on materials NER and classification tasks. COVID-SciBERT was fine-tuned during the pandemic for biomedical text related to SARS-CoV-2. SsciBERT targets social science text, and NukeBERT covers the nuclear domain.
On Hugging Face alone, 94 community fine-tuned models list SciBERT as their base. These cover tasks like drug adverse effect extraction, citation intent classification, scientific claim identification, and research paper topic tagging. An additional 3 adapter models and 3 quantized versions are available.
| Derivative Model | Domain | Base |
|---|---|---|
| MatSciBERT | Materials Science | SciBERT |
| COVID-SciBERT | COVID-19 / Biomedical | SciBERT |
| SsciBERT | Social Sciences | SciBERT |
| NukeBERT / NukeLM | Nuclear Domain | SciBERT / RoBERTa |
| SciEdBERT | Science Education | SciBERT |
| BatteryBERT | Battery / Energy | SciBERT |
Source: Hugging Face, arxiv.org
SciBERT Statistics in Clinical and Biomedical NLP
A 2025 study by Rubio-Martín et al. applied SciBERT to clinical note classification using hospital electronic health records. The model reached 0.96 accuracy and 0.97 F1 on that task, outperforming most traditional methods after hyperparameter tuning. On biomedical NER specifically, SciBERT with a CRF head scored 0.82 F1 on the NCBI Disease Corpus after grid search optimization, per a 2025 study indexed in DOAJ.
For medical abbreviation disambiguation on the MeDAL dataset, SciBERT hit 77.3% macro-F1 and 90.5% weighted F1. In citation intent classification within ensemble setups, SciBERT-based systems achieved macro-F1 above 89%. These numbers position SciBERT as a strong general-purpose scientific encoder, though PubMedBERT remains the top pick for purely biomedical pipelines where accuracy on medical terminology is the priority.
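The gap between macro-F1 and weighted F1 reported for MeDAL is what one would expect when some classes are rare: macro-F1 averages per-class scores equally, while weighted F1 weights each class by its number of true examples. The sketch below shows the effect on a small invented label set; the labels and predictions are hypothetical and unrelated to the MeDAL data.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """F1 per label, computed as 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_and_weighted_f1(y_true, y_pred):
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)  # number of true examples per label
    macro = sum(scores.values()) / len(scores)
    weighted = sum(scores[c] * support[c] for c in scores) / len(y_true)
    return macro, weighted


# Hypothetical data: one frequent sense and one rare sense of an abbreviation.
y_true = ["common"] * 8 + ["rare"] * 2
y_pred = ["common"] * 8 + ["rare", "common"]  # one rare example is missed

macro, weighted = macro_and_weighted_f1(y_true, y_pred)
print(f"macro-F1:    {macro:.3f}")
print(f"weighted F1: {weighted:.3f}")
```

A single miss on the rare class pulls macro-F1 down much more than weighted F1, so a model can look strong on the weighted metric while still struggling with infrequent senses.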
Scientific NLP Market Context for SciBERT
The global NLP market is projected at $70.11 billion in 2026, growing at 29% annually to reach $249.97 billion by 2031, according to MarketsandMarkets. Within that, the biomedical NLP segment reached $8.97 billion in 2025 and is expected to grow at a 34.74% CAGR through 2034.
The domain-specific AI language models market, which covers models like SciBERT and its peers, hit an estimated $6.62 billion in 2026, according to MarketIntelo. That figure is expected to reach $66.2 billion by 2034 at a 38% CAGR. Demand is driven by regulatory requirements in healthcare, finance, and legal sectors that favor auditable domain-constrained models over opaque general-purpose systems.
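The growth figures above can be cross-checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below assumes the quoted start and end years bound each forecast period, which the sources may define differently.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by two endpoints."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value after compounding start_value at rate for the given years."""
    return start_value * (1 + rate) ** years


# Global NLP market: $70.11B in 2026 compounding at 29% through 2031.
nlp_2031 = project(70.11, 0.29, 2031 - 2026)

# Domain-specific language models: $6.62B in 2026 to $66.2B in 2034.
domain_rate = cagr(6.62, 66.2, 2034 - 2026)

print(f"NLP market projected for 2031: ${nlp_2031:.2f}B")
print(f"Implied domain-specific CAGR: {domain_rate:.1%}")
```

Under these assumptions the NLP projection lands within about half a billion dollars of the quoted $249.97 billion, while the domain-specific endpoints imply a rate of roughly 33% per year rather than 38%, so that forecast's endpoints and stated rate do not line up exactly.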
Figure: Domain-Specific AI Language Models Market Size ($B)
SciBERT Key Use Cases in 2026
SciBERT is used across a range of academic and industrial NLP tasks. Named entity recognition for genes, chemicals, and diseases is the most common application. Relation extraction, especially for drug-protein interactions (ChemProt), is another frequent use case. Citation intent classification, where the model identifies whether a paper cites another for background, method, or result, is a strong fit given SciBERT’s training on full paper text.
Other uses include automated systematic review filtering in evidence-based medicine, bioassay semantic annotation, scientific document classification for library systems, and clinical note processing. Research teams working on knowledge graphs from academic literature also rely on SciBERT embeddings for entity linking and document clustering.
FAQ
What is SciBERT used for?
SciBERT is used for scientific text processing tasks including named entity recognition, relation extraction, citation classification, and clinical note analysis in academic and biomedical research.
How many downloads does SciBERT have?
SciBERT’s primary model (scibert_scivocab_uncased) records 219,161 monthly downloads on Hugging Face as of May 2026.
Is SciBERT better than BERT for scientific text?
Generally, yes. SciBERT outperforms BERT-base by an average of +2.11 F1 on scientific NLP benchmarks, with gains up to +7.07 F1 on ACL-ARC citation intent classification, though it trails BERT-base slightly on SciCite (-0.31 F1).
How does SciBERT compare to PubMedBERT?
PubMedBERT outperforms SciBERT on biomedical-only benchmarks. SciBERT covers both biomedical and computer science domains, making it more versatile across scientific disciplines.
How many citations does the SciBERT paper have?
The SciBERT paper has 3,474 citations on Semantic Scholar as of May 2026, with 575 classified as highly influential.
Sources:
https://huggingface.co/allenai/scibert_scivocab_uncased
https://www.emergentmind.com/topics/scibert
https://www.marketsandmarkets.com/Market-Reports/natural-language-processing-nlp-825.html
