Depositing your research data in a public repository is a critical step in the scientific publication process. Many peer-reviewed journals now require researchers to deposit raw sequencing data in a publicly accessible database before a manuscript is accepted. Knowing how to submit data correctly saves time and helps you avoid delays in your publication timeline.

The most widely used public repositories for genomics and transcriptomics data are the Sequence Read Archive (SRA) and the Gene Expression Omnibus (GEO), both hosted by NCBI, and the European Nucleotide Archive (ENA). Each repository has its own submission requirements, accepted data formats, and metadata standards that you must follow for a successful deposit.

Submitting to NCBI SRA (Sequence Read Archive)

The NCBI Sequence Read Archive (SRA) is the largest publicly available repository for high-throughput sequencing data. It accepts raw sequencing data from all NGS platforms including Illumina, PacBio, and Oxford Nanopore.

  • Create an NCBI account at ncbi.nlm.nih.gov
  • Register your BioProject and BioSample before data upload
  • Prepare your sequencing files in FASTQ or BAM format
  • Use the SRA Submission Portal or Aspera for large file uploads
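
Before uploading, it can help to run a quick sanity check on each FASTQ file, since a truncated or malformed file is likely to fail repository processing. The sketch below is one possible check, not an official NCBI tool: it assumes standard four-line FASTQ records, and the function names `open_text` and `check_fastq` are hypothetical helpers introduced here for illustration.

```python
import gzip
from pathlib import Path


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def check_fastq(path, max_records=10000):
    """Return a list of problems found in the first max_records records.

    Assumes four-line FASTQ records: header, sequence, separator, quality.
    """
    problems = []
    with open_text(path) as handle:
        for number in range(1, max_records + 1):
            lines = [handle.readline() for _ in range(4)]
            if lines[0] == "":
                break  # readline() returns "" only at end of file
            if lines[3] == "":
                problems.append(f"record {number}: truncated, fewer than 4 lines")
                break
            header, sequence, separator, quality = (
                line.rstrip("\r\n") for line in lines
            )
            if not header.startswith("@"):
                problems.append(f"record {number}: header does not start with '@'")
            if not separator.startswith("+"):
                problems.append(f"record {number}: separator does not start with '+'")
            if len(sequence) != len(quality):
                problems.append(
                    f"record {number}: sequence and quality lengths differ"
                )
    return problems
```

Calling `check_fastq("sample_R1.fastq.gz")` (a placeholder file name) returns an empty list when no problems are found in the records it inspected. It only samples the start of the file by default, so it does not replace the repository's own validation.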

Submitting to GEO (Gene Expression Omnibus)

GEO is the primary repository for gene expression data including microarray and RNA sequencing datasets. It requires both raw data files and processed data along with detailed experimental metadata.

  • Prepare raw data files in FASTQ format for RNA-seq submissions
  • Include processed data such as count matrices and normalized expression values
  • Complete the GEO metadata spreadsheet with full experimental metadata
  • Submit via GEO submission portal at ncbi.nlm.nih.gov/geo/submission
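
Because GEO asks for both processed data and sample metadata, the sample names in your count matrix should line up with the samples you describe in the metadata. The sketch below shows one way to compare the two. It assumes a tab-delimited matrix whose first column holds gene identifiers and whose remaining header cells are sample names; the function name `compare_samples` and the example sample names are hypothetical.

```python
import csv
import io


def compare_samples(count_matrix_text, metadata_samples, delimiter="\t"):
    """Compare count-matrix column names with the metadata sample list.

    Returns two sorted lists: samples listed in the metadata but absent
    from the matrix, and matrix columns that the metadata does not list.
    """
    reader = csv.reader(io.StringIO(count_matrix_text), delimiter=delimiter)
    header = next(reader)
    # Assumption: the first header cell labels the gene identifier column.
    matrix_samples = set(header[1:])
    listed = set(metadata_samples)
    missing_from_matrix = sorted(listed - matrix_samples)
    missing_from_metadata = sorted(matrix_samples - listed)
    return missing_from_matrix, missing_from_metadata


# Example with made-up data: "treated_2" is described but has no column.
matrix = "gene_id\tcontrol_1\ttreated_1\nGENE_A\t10\t25\n"
print(compare_samples(matrix, ["control_1", "treated_1", "treated_2"]))
```

Both lists should be empty before you submit. Any name that appears in either list points to a sample that is mislabeled or missing on one side.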

Submitting to ENA (European Nucleotide Archive)

The European Nucleotide Archive, operated by EMBL-EBI, is Europe's primary nucleotide sequence repository. It is part of the International Nucleotide Sequence Database Collaboration (INSDC) along with NCBI and DDBJ.

  • Register at the ENA Webin submission portal
  • Create a study, sample, and experiment before uploading files
  • Upload raw data files using the Webin-CLI command-line tool
  • Submit metadata as XML or as a spreadsheet
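
Webin-CLI reads the details of each submission from a plain-text manifest file. The sketch below writes such a manifest for a paired-end read submission. The field names (STUDY, SAMPLE, NAME, INSTRUMENT, LIBRARY_SOURCE, LIBRARY_SELECTION, LIBRARY_STRATEGY, FASTQ) follow the read-submission manifest as we understand it, so check them against the current ENA documentation before relying on this. The function `build_read_manifest` is a hypothetical helper, and every accession and file name in the example is a placeholder.

```python
def build_read_manifest(study, sample, name, instrument, library_source,
                        library_selection, library_strategy, fastq_files):
    """Return manifest text with one 'FIELD<TAB>value' pair per line."""
    fields = [
        ("STUDY", study),
        ("SAMPLE", sample),
        ("NAME", name),
        ("INSTRUMENT", instrument),
        ("LIBRARY_SOURCE", library_source),
        ("LIBRARY_SELECTION", library_selection),
        ("LIBRARY_STRATEGY", library_strategy),
    ]
    # One FASTQ line per file, so paired-end data contributes two lines.
    fields.extend(("FASTQ", path) for path in fastq_files)
    for key, value in fields:
        if not value:
            raise ValueError(f"manifest field {key} is empty")
    return "\n".join(f"{key}\t{value}" for key, value in fields) + "\n"


# Placeholder values only: replace them with your own accessions and files.
manifest_text = build_read_manifest(
    study="PRJEB00000",
    sample="ERS0000000",
    name="example_rnaseq_run_1",
    instrument="Illumina NovaSeq 6000",
    library_source="TRANSCRIPTOMIC",
    library_selection="cDNA",
    library_strategy="RNA-Seq",
    fastq_files=["sample_R1.fastq.gz", "sample_R2.fastq.gz"],
)
with open("manifest.txt", "w") as handle:
    handle.write(manifest_text)
```

The resulting file is normally passed to Webin-CLI through its -context reads and -manifest options. Running with -validate first lets you check the files without submitting anything.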

Common Submission Mistakes to Avoid

Data submission errors are a common cause of publication delays. Following best practices and double-checking your files and metadata before uploading can save significant time and effort.

Always ensure your metadata is complete, accurate, and consistent across all submission forms. Missing or incorrect metadata is one of the most common reasons repository curators return a submission for correction.
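
A simple script can catch the most basic metadata gaps before a curator does. The sketch below flags empty required fields and duplicated sample names in a list of metadata rows. The field names used here (`sample_name`, `organism`, `tissue`) are hypothetical examples, since each repository defines its own required attributes, and `find_metadata_issues` is an illustrative helper rather than part of any repository's tooling.

```python
from collections import Counter


def find_metadata_issues(rows, required_fields, name_field="sample_name"):
    """Return a list of human-readable problems found in metadata rows.

    rows is a list of dicts, one per sample, for example as produced by
    csv.DictReader from a metadata spreadsheet exported as CSV.
    """
    issues = []
    for index, row in enumerate(rows, start=1):
        for field in required_fields:
            value = str(row.get(field) or "").strip()
            if not value:
                issues.append(f"row {index}: missing value for '{field}'")
    # Sample names are expected to be unique within one submission.
    counts = Counter(str(row.get(name_field) or "").strip() for row in rows)
    for name, count in sorted(counts.items()):
        if name and count > 1:
            issues.append(f"sample name '{name}' appears {count} times")
    return issues


# Made-up rows: the second one reuses a name and lacks a tissue value.
rows = [
    {"sample_name": "S1", "organism": "Homo sapiens", "tissue": "liver"},
    {"sample_name": "S1", "organism": "Homo sapiens", "tissue": ""},
]
for issue in find_metadata_issues(rows, ["sample_name", "organism", "tissue"]):
    print(issue)
```

A check like this only confirms that values are present and unique. Whether they are accurate still needs a manual review against your lab records.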

At BioinformaticsNext, we assist researchers in preparing and submitting their data to the major public repositories, including NCBI's SRA and GEO as well as ENA, in full compliance with journal and repository requirements.

Need Help with Data Submission?

Our expert bioinformatics team at BioinformaticsNext handles complete data submission to NCBI (SRA and GEO) and ENA on your behalf. We ensure your data meets all repository and journal requirements, saving you time and helping you avoid costly submission errors. Contact us today for a free consultation.