SeqHub
Product · May 8, 2026 · SeqHub Team · 4 min read

The SeqHub API Is Now in Beta

A lot of bioinformatics work happens in scripts and pipelines that run as part of a larger automated workflow. The SeqHub API makes it possible to include SeqHub annotations in that kind of workflow: the same embedding-based annotations, now also accessible programmatically.

Why we built it

API access has been one of our most-requested features since we launched SeqHub. The ask comes from two places: researchers who use the platform regularly and want to pull annotations into a script or notebook without exporting manually, and researchers who work primarily in command-line environments and need programmatic access to fit SeqHub into how they already work.

Annotation and genomic context data are only useful if they flow to where the analysis is happening. For researchers working outside the browser, the API is how we get it there.

Starting this week, it's available in public beta and free for non-commercial use.

What the API does

Submit a protein sequence or a batch of sequences, and get back functional annotations for your query protein and its genomic neighbors.

Annotations are derived from gLM2 embeddings (the same genomic language model that powers SeqHub search) rather than alignment alone. That means you get functional predictions for sequences that BLAST, eggNOG, and HMM-based tools often miss, particularly for hypothetical and poorly characterized proteins and organisms.

Responses can include:

  • Predicted functional annotations for proteins of interest (query proteins)
  • Predicted functional annotations across contigs containing proteins similar to the query

Figure: The SeqHub API offers two endpoints: functional annotation of proteins, and protein context search, which returns contigs containing similar proteins with annotated coding sequences and taxonomic information.

Who it's for

The API is most useful for researchers working at scale: annotating full proteomes, running MAGs through an automated pipeline, or integrating SeqHub output into a multi-step analysis where the web interface isn't practical.

The API uses standard REST endpoints and can be called directly from Python scripts and Jupyter notebooks, or wrapped as a step in Snakemake or Nextflow pipelines.
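
As a rough illustration of what such a call could look like from a Python script or notebook, the sketch below builds (but does not send) a request using only the standard library. The URL, the `sequences` JSON field, and the bearer-token header are placeholders assumed for illustration, not the documented SeqHub interface; the real values are in the documentation at docs.seqhub.org.

```python
import json
import urllib.request

# ASSUMPTION: placeholder URL for illustration, not the real SeqHub endpoint.
HYPOTHETICAL_ANNOTATE_URL = "https://api.example.org/v1/annotate"


def build_annotation_request(sequences, token, url=HYPOTHETICAL_ANNOTATE_URL):
    """Build a POST request carrying protein sequences as a JSON body.

    ASSUMPTION: the "sequences" field name and bearer-token
    authentication are illustrative, not taken from the SeqHub docs.
    """
    body = json.dumps({"sequences": list(sequences)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_annotation_request(
    ["MKTAYIAKQRQISFVKSHFSRQ"], token="YOUR_API_TOKEN"
)

# Sending is left out on purpose. With a real URL and token it would be:
#   with urllib.request.urlopen(request, timeout=30) as response:
#       annotations = json.load(response)
```

Keeping request construction in its own function makes it easy to reuse the same step inside a Snakemake rule or a Nextflow process script.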

Figure: The SeqHub API sits alongside other annotation tools in a standard bioinformatics pipeline, accepting protein sequences and returning functional and genomic context annotations.

Specific use cases the API now enables:

  • Annotating more than 5,000 sequences at once
  • Running SeqHub annotations in parallel with eggNOG, Prokka/Bakta, or antiSMASH for comparison
  • Building SeqHub annotations into other automated pipelines
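
For large inputs such as a full proteome, a pipeline step would typically read a FASTA file and group the records into batches before submitting them. The sketch below shows only that client-side preparation. The batch size is a toy value chosen for the example; any real per-request limit would come from the SeqHub documentation.

```python
from itertools import islice


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def batched(records, batch_size):
    """Yield lists containing at most batch_size records each."""
    iterator = iter(records)
    while batch := list(islice(iterator, batch_size)):
        yield batch


fasta_text = """>protein_1
MKTAYIAKQR
QISFVKSHFS
>protein_2
MSLLTEVETY
>protein_3
MADEEKLPPG
"""

# ASSUMPTION: a batch size of 2 is for demonstration only.
batches = list(batched(read_fasta(fasta_text.splitlines()), batch_size=2))
```

Because both functions are generators, the same code streams a multi-thousand-sequence file without loading every record into memory at once.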

Beta scope and what comes next

This release covers protein sequence annotation and genomic context; more details are available in our Documentation. We're actively working on expanding endpoint coverage.

Feedback from beta users will directly shape what we build next. If you have thoughts on other data you'd like to retrieve, or something is missing, broken, or confusing, email us at team@tatta.bio or open a thread in our Discord.

Pricing

API access is included in all SeqHub plans. Non-commercial use is free. Commercial users can try the API as part of SeqHub's 14-day free trial, after which it's included in the paid plan. See our pricing page for details and reach out at team@tatta.bio to discuss any custom requests.

Frequently asked questions

Can I annotate hypothetical proteins with the SeqHub API?

Yes. The SeqHub API uses gLM2 embeddings rather than sequence alignment, so it can return functional predictions for proteins that return no useful hits from BLAST, eggNOG, or HMM-based tools. Hypothetical and poorly characterized proteins are a primary use case.

What does the SeqHub API return for a protein sequence?

The API has two endpoints. The functional annotation endpoint returns predicted functional annotations for each query protein. The protein context search endpoint takes a single query protein and returns annotated contigs containing similar proteins from the OG database, which includes 130,000+ microbial genomes.

Is the SeqHub API free?

The API is free for non-commercial use up to monthly limits; see our Documentation for the current limits. Commercial users can access it as part of SeqHub's 14-day free trial, after which it's included in the paid plan. See our pricing page for details.

How do I get an API key for SeqHub?

Generate an API token directly from your SeqHub dashboard. Documentation is at docs.seqhub.org.

What are the SeqHub API rate limits?

Rate limits apply per endpoint. See docs.seqhub.org for current limits.
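
A client that may hit a rate limit usually retries with increasing delays. The sketch below is one generic way to do that and is independent of any HTTP library. How SeqHub actually signals a rate limit is not stated here, so the `RateLimited` exception and its optional `retry_after` value are hypothetical stand-ins; many HTTP APIs use status 429 with a `Retry-After` header, but whether SeqHub does is an assumption to check against docs.seqhub.org.

```python
import time


class RateLimited(Exception):
    """Hypothetical error the caller's send function raises when throttled."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds to wait, if the server said so


def call_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds, waiting longer after each throttled try."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited as error:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            if error.retry_after is not None:
                delay = error.retry_after
            else:
                delay = base_delay * 2 ** attempt  # 1s, 2s, 4s, ...
            sleep(delay)


# Demonstration with a fake send function that is throttled twice.
attempts = []
waits = []


def fake_send():
    attempts.append(1)
    if len(attempts) < 3:
        raise RateLimited()
    return {"status": "ok"}


# Passing waits.append as sleep records the delays instead of pausing.
result = call_with_backoff(fake_send, sleep=waits.append)
```

In real use, `send` would wrap the actual HTTP call and raise `RateLimited` when the response indicates throttling.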

SeqHub is built by Tatta Bio, a scientific nonprofit. The platform and API are free for non-commercial academic use. The underlying gLM2 model and OG dataset are publicly available.
