Resources

Here you will find resources for several steps in the reference genome generation pipeline in EBP-Nor.

Species selection

Thinking about sequencing the genome of your favourite species? First check GoaT!

GoaT - Genomes on a Tree presents genome-relevant metadata for all eukaryotic taxa across the tree of life. Metadata in GoaT include genome assembly attributes, genome sizes, C values, and chromosome numbers from multiple sources.

GoaT is helpful for checking the availability and quality of genomes of the species of interest and their relatives and for evaluating the feasibility of sequencing a particular genome (and much more!). To learn how to search GoaT, check out the GoaT help page and Illustrated User Guide. For more details and sample use cases, check out the GoaT flagship publication.

A goat looking at a leaf on Darwin's classic tree of life sketch.

Sampling

Hands holding plant material with tweezers and flowers in the foreground.

The ultimate goal of EBP-Nor is to sequence the genomes of all eukaryotic species that occur in Norway and to make the genome sequences freely available. To reach the quality required for a reference genome, specimens must be sampled in the way that best preserves the DNA - for most species this means flash-freezing tissue in liquid nitrogen (about -196 degrees C) and storing the sample at -80 degrees C.

In addition, we require that metadata have been registered for all specimens before we sequence them. Metadata include geolocation and time, collector, how the species was identified and by whom, and also photos of the specimen, along with a physical voucher that will be deposited in the natural history museum collections.
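
As an illustration of the metadata requirement above, the following minimal Python sketch checks a specimen record for missing entries before submission. The field names are hypothetical and chosen only for this example; they are not the column names of the EBP-Nor Sampling schema, which remains the authoritative list.

```python
# Hypothetical field names for illustration only; the EBP-Nor Sampling
# schema defines the real ones.
REQUIRED_FIELDS = (
    "latitude",
    "longitude",
    "collection_time",
    "collector",
    "identification_method",
    "identified_by",
    "photos",
    "voucher_id",
)


def missing_metadata(record):
    """Return the required fields that are absent or empty in a record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        # None, empty strings and empty lists all count as "not registered".
        if value is None or value == "" or value == []:
            missing.append(field)
    return missing


# Made-up example record: no photos attached and no voucher registered yet.
example_record = {
    "latitude": 59.94,
    "longitude": 10.72,
    "collection_time": "2023-06-15T10:30",
    "collector": "A. Collector",
    "identification_method": "morphology",
    "identified_by": "B. Taxonomist",
    "photos": [],
}

print(missing_metadata(example_record))
```

A record is ready to submit when the returned list is empty.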

For a general overview of the sampling standard that we are using, have a look at the Earth Biogenome Project's guidelines.

In most cases, EBP-Nor will first make an informal agreement with species sample providers. Sample providers for EBP-Nor should then follow these steps:

1. First specify which species you plan to sample and when. This is done by filling in the EBP-Nor Sampling commitment form and submitting it to ebpnor-sampling@ibv.uio.no. EBP-Nor can provide sampling equipment or assistance if needed.

2. When you have acquired a sample, please register its metadata in the EBP-Nor Sampling schema and submit it to ebpnor-sampling@ibv.uio.no. The sample is then ready to be shipped for DNA isolation, after agreement with EBP-Nor. For samples that are to be sent to NSC at UiO (for DNA isolation / Hi-C / genome sequencing), the EBP-Nor Sample submission form must also be filled in.

The information/procedure above is likely to change, so please check for updates before sampling!

Sequencing

All EBP-Nor samples are sequenced on one of two long-read instruments: the Sequel IIe from Pacific Biosciences (https://www.sequencing.uio.no/pacbio-services/) or the PromethION 48 from Oxford Nanopore Technologies (https://cigene.no/lab-and-infrastructure/molecular-biology-lab/).

For both technologies, the integrity and purity of the DNA sample are of utmost importance. Please check the DNA requirements for PacBio sequencing here: https://www.sequencing.uio.no/pacbio-services/dna-requirements/, or contact the molecular biology lab at Cigene for the DNA requirements for ONT sequencing.

Please let us know if you need help with DNA isolation (ebpnor-sampling@ibv.uio.no).

In addition to long-read data, Hi-C data is generated for scaffolding of long-read contigs. Hi-C sample prep starts directly from the tissue; again, high sample quality is essential. Please check the requirements here: https://www.sequencing.uio.no/pacbio-services/dna-sequencing/hi-c-sample-requirements/.

Genome assembly

By combining HiFi and Hi-C sequencing data, we can create haplotype-resolved assemblies, meaning we can separate reads by maternal and paternal origin without having access to parental data. In diploid or polyploid organisms, this adds another level of information and creates more accurate assemblies than a primary and alternate assembly would.

Testing by us, and earlier by Darwin Tree of Life and the Vertebrate Genomes Project, among others, has shown that the combination of HiFi and Hi-C, at appropriate coverage, usually generates assemblies that fulfill the Earth Biogenome Project's criteria for assembly standards. There are other ways to reach these standards, for example by combining Oxford Nanopore Technologies and Illumina sequencing data, but these are often less straightforward and involve more steps to a final assembly than the strategy we outline here.
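
To make the notion of coverage concrete: sequencing coverage is commonly estimated as the total number of sequenced bases divided by the genome size. The short Python sketch below shows that arithmetic only. The function name and the toy numbers are ours, and no target coverage is implied; what counts as appropriate coverage is not specified on this page. An expected genome size can often be looked up in GoaT.

```python
def estimated_coverage(read_lengths, genome_size):
    """Estimate coverage as total sequenced bases divided by genome size.

    read_lengths: iterable of read lengths in base pairs
    genome_size:  expected haploid genome size in base pairs
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be a positive number of base pairs")
    return sum(read_lengths) / genome_size


# Toy numbers, not real data: three reads totalling 60,000 bp
# against a 2,000 bp "genome".
toy_reads = [15_000, 20_000, 25_000]
print(estimated_coverage(toy_reads, 2_000))  # 30.0
```

The same calculation applies to HiFi and Hi-C data separately, since each data type has its own coverage.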

We have given workshops where we go through the process from raw sequencing reads to finished and curated genome assemblies. For one example, take a look at https://github.com/ebp-nor/genome-assembly-workshop-2023.