SeqHub
Product | February 2, 2026 | SeqHub Team | 5 min read

Understanding protein co-occurrence in genomic context: Introducing CoSearch

CoSearch is a new, user-requested analysis in SeqHub that lets you explore where multiple proteins co-occur across genomes. It complements single-protein search and batch annotation by enabling relational, context-driven exploration of protein systems.

Biology is inherently relational, but exploring how proteins work together often requires stitching together results across literature and multiple analyses.

Today, we're introducing CoSearch, a user-requested capability in SeqHub that enables exploration of protein co-occurrence in shared genomic context. CoSearch complements existing single-protein search and batch annotation workflows by adding a new, relational lens for protein analysis.

What is CoSearch?

CoSearch (co-occurrence search) is a relational analysis that lets you explore where multiple proteins co-occur across genomes, making their shared genomic context explicit.

Instead of analyzing each protein independently, CoSearch treats a set of proteins as a unit of inquiry and asks: where do these proteins appear together?

Example: Imagine you're studying a suspected biosynthetic pathway and have identified three enzymes that may work together. With single-protein search or batch annotation, you can analyze each enzyme individually. With CoSearch, you can search for all three proteins together and directly see:

  • Which genomes contain all three proteins
  • Their relative genomic proximity and neighborhood context
  • How their genomic neighborhoods compare across species

This makes it easier to discover novel associations, understand evolutionary relationships across diverse taxa, and identify patterns that relate to functional subcategories.
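To make the three questions above concrete, here is a minimal Python sketch of the underlying idea. It is not SeqHub's implementation or API. The genome names, enzyme labels, and gene-index positions are made-up illustrative data, and `max_span` is a hypothetical proximity threshold chosen only for the example.

```python
from typing import Dict, List

# Hypothetical toy data: genome -> protein -> gene index along a contig.
# Real data would also track contig and strand; this sketch ignores both.
hits: Dict[str, Dict[str, int]] = {
    "genome_A": {"enzyme1": 10, "enzyme2": 11, "enzyme3": 13},
    "genome_B": {"enzyme1": 5, "enzyme2": 240, "enzyme3": 7},
    "genome_C": {"enzyme1": 42, "enzyme3": 44},  # enzyme2 is missing
}


def co_occurring(hits: Dict[str, Dict[str, int]], query: List[str]) -> Dict[str, int]:
    """Map each genome that contains every query protein to the span
    (in gene positions) between the first and last query hit."""
    spans = {}
    for genome, positions in hits.items():
        if all(protein in positions for protein in query):
            indices = [positions[protein] for protein in query]
            spans[genome] = max(indices) - min(indices)
    return spans


def clustered(spans: Dict[str, int], max_span: int) -> List[str]:
    """Genomes where all query proteins fall within max_span gene positions."""
    return sorted(g for g, span in spans.items() if span <= max_span)


query = ["enzyme1", "enzyme2", "enzyme3"]
spans = co_occurring(hits, query)
print(spans)                         # {'genome_A': 3, 'genome_B': 235}
print(clustered(spans, max_span=5))  # ['genome_A']
```

In this toy data, all three enzymes are present in two genomes, but only one genome has them close enough to suggest a shared neighborhood.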

Another example:

When studying CRISPR-associated transposon (CAST) systems, researchers are often interested in identifying novel CAST architectures. Using CoSearch, you can search for TnsB and Cas12 together and explore the genomes where these proteins co-occur, along with their genomic organization and potential subtypes.

When should you use CoSearch?

SeqHub's single-protein search and batch annotation workflows are well-suited for questions focused on individual sequences or higher-throughput analysis, respectively.

CoSearch adds a complementary analytical lens for questions that are inherently about relationships between proteins. It is particularly useful when you want to understand how proteins behave together, rather than in isolation.

This relational view can be especially helpful when studying:

  • Hypothetical proteins whose function emerges from their genomic neighbors
  • Multi-protein complexes, systems and pathways
  • Operons and biosynthetic gene clusters (BGCs)

Together, these workflows allow scientists to move fluidly between sequence-level analysis and system-level context, depending on the question at hand.

Some use cases unlocked by CoSearch

Many of the use cases below reflect workflows our users shared with us and the questions they wanted SeqHub to help answer more directly.

1. Faster, better annotation of hypothetical proteins

Hypothetical proteins are often conserved by association, not by sequence.

With CoSearch, scientists can:

  • Search a hypothetical protein alongside known neighbors
  • Identify consistent co-occurrence patterns across genomes
  • Generate stronger functional hypotheses more quickly

This can substantially reduce the time spent triangulating across separate analyses.
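One simple way to think about "consistent co-occurrence" is as the overlap between the sets of genomes in which two proteins are found. The sketch below uses the Jaccard index for this. It is an illustration of the concept only, not how CoSearch scores results, and the protein names and genome identifiers are hypothetical.

```python
from typing import Dict, Set

# Hypothetical presence data: protein -> genomes in which it was detected.
presence: Dict[str, Set[str]] = {
    "hypo_X": {"g1", "g2", "g3", "g4"},
    "known_transporter": {"g1", "g2", "g3", "g5"},
    "known_kinase": {"g4", "g6", "g7", "g8"},
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Shared genomes divided by all genomes containing either protein."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Score the hypothetical protein against each known candidate neighbor.
scores = {
    name: jaccard(presence["hypo_X"], genomes)
    for name, genomes in presence.items()
    if name != "hypo_X"
}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")
# known_transporter: 0.60
# known_kinase: 0.14
```

A higher score means the two proteins tend to appear in the same genomes, which is one line of evidence for a functional link. It does not establish function on its own.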

2. Exploring operons and multi-gene systems

Many biological functions emerge from multi-gene assemblies rather than single proteins.

CoSearch enables:

  • Operon-level conservation analysis
  • Detection of partial systems across genomes
  • Comparative genomics of modular protein systems

All within a single, contextualized workflow.
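As a rough illustration of what detecting partial systems means, the sketch below labels each genome as containing a complete, partial, or absent copy of a multi-gene system. The gene names and genome contents are hypothetical, and the three labels are an assumption made for this example rather than CoSearch output.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical four-gene system and per-genome gene content.
system: List[str] = ["geneA", "geneB", "geneC", "geneD"]
genome_content: Dict[str, Set[str]] = {
    "g1": {"geneA", "geneB", "geneC", "geneD", "geneQ"},
    "g2": {"geneA", "geneC"},
    "g3": {"geneZ"},
}


def classify(genes: Set[str], system: List[str]) -> Tuple[str, List[str]]:
    """Return a status label and the list of system genes that are missing."""
    missing = [g for g in system if g not in genes]
    if not missing:
        return "complete", missing
    if len(missing) == len(system):
        return "absent", missing
    return "partial", missing


statuses = {}
for genome, genes in genome_content.items():
    status, missing = classify(genes, system)
    statuses[genome] = status
    print(genome, status, "missing:", missing)
```

Here `g2` carries only two of the four genes, so it is reported as partial along with the genes it lacks.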

3. Comparing conserved modules across diverse taxa

If you're looking for a module that recurs across distantly related organisms, CoSearch (combined with Diversity Search) provides a direct way to explore this, helping you move beyond taxonomy-local patterns.

Why this matters

At a higher level, CoSearch unlocks a new way of reasoning about protein function.

Instead of asking: "What looks similar to this protein?"

You can now ask: "Which proteins appear together, and what does that imply?"

Many questions in genomics aren't about individual proteins, but about how proteins behave together. While sequence-based analysis remains foundational, functional insight often emerges from patterns of co-occurrence and shared genomic context.

CoSearch adds a relational layer to SeqHub, making these patterns easier to explore directly. This supports faster, more confident reasoning about individual proteins and multi-protein systems, especially when studying complex pathways or hypothetical proteins where sequence evidence alone is incomplete.

Try CoSearch in SeqHub

CoSearch is now available in SeqHub.

We're excited to see how scientists use it. As always, we'd love your feedback as you try it out, and we'll continue refining CoSearch based on how it's used in real scientific workflows.

Ready to explore protein co-occurrence?

Join scientists using SeqHub to discover new insights through relational protein analysis.
