
Next-Generation Genomic Sequencing Analysis

Roles
  • Product Design Lead
Company/Client
  • Seven Bridges

The Seven Bridges Genomics Platform resolves the information bottleneck of genomics research by providing a central hub for teams to store, analyze, and jointly interpret their bioinformatic data. The NIH’s National Cancer Institute uses SBG to facilitate research and collaboration on petabyte-scale datasets like The Cancer Genome Atlas (TCGA) in an environment that supports rapid discovery. Scientists use the Platform to collaborate on their research from anywhere in the world, bringing their questions to the data.

  • Amazon Web Services: Big Data & High-Performance Computing (Finalist)
  • Amazon Web Services: Life Sciences
  • Bio-IT: Best in Show

The Product

Seven Bridges is a leading biomedical data analysis company that specializes in genomics and bioinformatics. By providing powerful cloud-based solutions, Seven Bridges enables researchers, scientists, and healthcare professionals to accelerate discoveries, enhance precision medicine, and drive advancements in biomedical research.
Meet IGOR
IGOR is Seven Bridges’ no-code bioinformatics platform that simplifies complex genomic data analysis. It allows researchers to process and interpret DNA sequences through a user-friendly interface without needing to write any code. By automating tasks like sequence alignment and variant analysis, IGOR makes advanced genomic research accessible to biologists, eliminating the need for specialized programming skills. This democratizes bioinformatics, enabling scientists to focus on discoveries and insights rather than technical details.

Delivering Unique Value

As the lead product designer for Seven Bridges, my role in the creation of IGOR was to ensure that the platform outperformed its competitors by delivering a user experience that was intuitive, efficient, and user-friendly.
Seven Bridges was in competition with other platforms offering similar cloud-based computing capabilities. User research revealed three opportunities for us to deliver value that set us apart. The first was to address the perception of genomic analysis tools as prohibitively expensive in terms of time, effort, and money. The second was to deliver an experience that let scientists focus their energy on analysis rather than computation. The third was to address a common frustration that research revealed among scientists working with complex datasets: data fragmentation.
These three challenges had something in common: each was an expression of the distance, perceived or actual, between data collection and discovery. My goal was to shorten this distance to the point where new users would run and complete tasks in their first session, fewer tasks would fail or be abandoned, and committed users would report greater satisfaction with the results of their analysis.

No-Code Bioinformatics

The user experience with IGOR centers on empowering biologists to perform complex genomic analyses without writing any code. It features a graphical user interface where biologists can easily upload data, select analysis pipelines, and interpret results through intuitive visualizations and reports. By automating bioinformatics workflows such as sequence alignment, variant calling, and annotation, IGOR eliminates the need for programming skills. This accessibility enables researchers to conduct sophisticated data analyses independently, promoting a more inclusive approach to genomic research and accelerating scientific discovery by reducing barriers to advanced computational techniques.

Perception of Effort

One success metric we set for ourselves was the number of new users who completed a computation task in their first session. A review of site analytics revealed that new users were spending time exploring the platform but leaving before they started creating a task. This happened even though new accounts received 300 free computation hours. When we contacted these new users, they told us they believed they would not be able to complete a task successfully.

Don't Leave New Users Hanging

Dumping new users into an unfamiliar product with no clear calls to action will leave them baffled as to what they should do next. This was contributing to the low number of regular users relative to the total number of accounts.
My strategy for addressing the issue of abandonment centered on anticipating new users’ interests and directing them to the areas most likely to deliver meaningful value immediately following sign-up.
Research showed that scientists placed the most value on positive experiences with four key features: project management, computation, data discovery, and funding. In response, I created a new landing page with clear pathways and CTAs focused on casual exploration and experimentation.
Presenting these as simple steps lowered the level of perceived commitment for users, increasing the likelihood that casual browsing would lead them to discover content and features that made them say, “Aha! This is what I need.”

Data Management & Data Integrity

Data fragmentation is a significant issue in genomic sequencing analysis because it leads to incomplete and disjointed data sets, complicating the assembly and interpretation of genetic information. Fragmented data can result in gaps and overlaps that hinder the accurate reconstruction of genomes, affecting the reliability of variant detection and gene annotation. This fragmentation requires additional computational resources and complex algorithms to piece together sequences, increasing time and cost. Moreover, it complicates the comparison of genomic data across different studies, reducing the effectiveness of collaborative research and the development of comprehensive genomic databases.
If we were going to resolve the issue of fragmentation we would need to accomplish three goals:

  1. Make data organization more intuitive for researchers
  2. Make it easier for researchers to find related files
  3. Make the process of updating metadata more humane, with fewer opportunities for error

Describing the Data

Sequencing analysis involves using a host of different files in concert with each other. First is the “raw read” - the text file of A’s, T’s, G’s, and C’s produced by the sequencing machine when it reads a sample. Then there are files researchers use to transform and parse the raw read into forms that can be analyzed: alignment files, index files, annotation files, call files, and reference files.
Each file may be used in multiple analysis tasks. What’s more, the contents of a file will evolve as analysis progresses, adding a new layer of complexity when it comes to repeatability. To make things worse, the metadata describing this provenance is traditionally stored in the filename, complicating data management even further by creating collections of files with long, largely indistinguishable strings for names.
The less work that is required to describe a file, the more likely the user is to do it
The first step I took in designing a solution was to address the question of how to distinguish files from one another and from different versions of themselves. While many values can be pulled from a filename, the difficulty of accurately parsing the string, combined with the need to respect the user’s particular mental model of resource organization, means that some values are best supplied manually by the user.
The solution I settled on was to add an automation step immediately following upload. The system inferred the expected metadata structure based on filetype and then supplied the remaining values by parsing the filename, reading the file contents, and evaluating the context of the file (e.g. project name, sample ID, task ID, etc.).
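To make the automation step concrete, here is a minimal Python sketch of how such inference might work. It is an illustration under stated assumptions, not the platform's implementation: the field lists in EXPECTED_FIELDS, the filename convention behind FILENAME_PATTERNS, and the infer_metadata function are all hypothetical, and the step that reads file contents is left out for brevity.

```python
import re
from pathlib import Path

# Hypothetical schemas: the metadata fields expected for each file type.
EXPECTED_FIELDS = {
    ".fastq": ["sample_id", "lane", "read"],
    ".bam": ["sample_id", "reference"],
    ".vcf": ["sample_id", "reference"],
}

# Hypothetical filename convention: <sample>_L<lane>_R<read>.<ext>
FILENAME_PATTERNS = {
    "sample_id": re.compile(r"^([A-Za-z0-9-]+)"),
    "lane": re.compile(r"_L(\d+)"),
    "read": re.compile(r"_R([12])"),
}


def infer_metadata(filename, context=None):
    """Return (metadata, missing) for an uploaded file.

    The expected fields come from the file type. Each value is taken from
    the filename when a pattern matches, otherwise from the upload context
    (e.g. project or task settings). Anything still unknown is listed in
    `missing` so the user can be asked to supply it manually.
    """
    context = context or {}
    path = Path(filename)
    metadata, missing = {}, []
    for field in EXPECTED_FIELDS.get(path.suffix.lower(), []):
        pattern = FILENAME_PATTERNS.get(field)
        match = pattern.search(path.stem) if pattern else None
        if match:
            metadata[field] = match.group(1)
        elif field in context:
            metadata[field] = context[field]
        else:
            missing.append(field)
    return metadata, missing


fastq_meta, fastq_missing = infer_metadata("S01_L002_R1.fastq")
bam_meta, bam_missing = infer_metadata("tumor.bam", {"reference": "GRCh38"})
vcf_meta, vcf_missing = infer_metadata("tumor.vcf")
```

In this sketch only the fields the system cannot infer are handed back to the user, which keeps the manual work of describing a file as small as possible.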
Derive meaning from composition
Analysis of data requires processing multiple files. For an experiment to be repeatable, these relationships need to be captured in some way. Traditionally, this was done by grouping files by directory. The solution I designed introduced a relational database that worked in concert with an open taxonomy to maintain these relationships without duplicating assets.
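One way to picture relationships maintained without duplicating assets is a many-to-many link between files and taxonomy terms. The sketch below uses Python's built-in sqlite3 module; the table names, the category and label columns, and the tag_file and files_with helpers are assumptions made for illustration and do not describe the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE files (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- Open taxonomy: any (category, label) pair can be added at any time.
CREATE TABLE tags (
    id       INTEGER PRIMARY KEY,
    category TEXT NOT NULL,
    label    TEXT NOT NULL,
    UNIQUE (category, label)
);
-- Link table: one stored file can belong to many groupings.
CREATE TABLE file_tags (
    file_id INTEGER NOT NULL REFERENCES files(id),
    tag_id  INTEGER NOT NULL REFERENCES tags(id),
    PRIMARY KEY (file_id, tag_id)
);
""")


def tag_file(conn, name, category, label):
    """Attach a taxonomy term to a file, creating either one if needed."""
    conn.execute("INSERT OR IGNORE INTO files (name) VALUES (?)", (name,))
    conn.execute(
        "INSERT OR IGNORE INTO tags (category, label) VALUES (?, ?)",
        (category, label),
    )
    conn.execute(
        """INSERT OR IGNORE INTO file_tags (file_id, tag_id)
           SELECT f.id, t.id FROM files f, tags t
           WHERE f.name = ? AND t.category = ? AND t.label = ?""",
        (name, category, label),
    )


def files_with(conn, category, label):
    """List the names of all files that carry the given taxonomy term."""
    rows = conn.execute(
        """SELECT f.name FROM files f
           JOIN file_tags ft ON ft.file_id = f.id
           JOIN tags t ON t.id = ft.tag_id
           WHERE t.category = ? AND t.label = ?
           ORDER BY f.name""",
        (category, label),
    )
    return [row[0] for row in rows]


# The same reference file takes part in two tasks but is stored only once.
tag_file(conn, "reference.fa", "task", "alignment-01")
tag_file(conn, "reference.fa", "task", "variant-calling-02")
tag_file(conn, "sample_S01.bam", "task", "variant-calling-02")
print(files_with(conn, "task", "variant-calling-02"))
```

With directories, using one reference file in two tasks usually means copying it or linking to it by hand. Here the grouping lives in the link table, so the file row exists once no matter how many tasks refer to it.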

A Transformative Solution to Complex Problems

IGOR by Seven Bridges has significantly transformed the landscape of genomic data analysis. By addressing key challenges such as the complexity of genomic tools, user abandonment, and data fragmentation, IGOR has made advanced bioinformatics accessible to a broader audience. Through thoughtful UX design, the platform empowers researchers to focus on their scientific discoveries rather than computational hurdles.
Biologists using IGOR have identified novel genetic variants associated with rare diseases by leveraging the platform’s automated variant calling and annotation features. Cancer researchers have also benefited, as IGOR’s intuitive data management and analysis tools have enabled them to track tumor mutations more effectively, leading to personalized treatment strategies. Additionally, evolutionary biologists have utilized IGOR to compare large genomic datasets, uncovering new insights into species evolution.
This democratization of bioinformatics not only enhances individual research capabilities but also accelerates advancements in the field of genomics. By prioritizing user experience and integrating innovative solutions, IGOR sets a new standard for no-code bioinformatics platforms, fostering a more inclusive and efficient scientific community.
