An insight into structure and composition of the fig genome
BARGHINI, ELENA (co-first author); MASCAGNI, FLAVIA (co-first author); GIORDANI, TOMMASO; SOLORZANO ZAMBRANO, LICETH JANINA; NATALI, LUCIA (penultimate author); CAVALLINI, ANDREA (last author)
2017-01-01
Abstract
Ficus carica L. is a diploid species with a genome size of 0.36 pg/2C that is still poorly characterized at the genetic and genomic levels. To analyse the structure of the fig genome, we used Illumina technology to produce 25.64 genome equivalents of MiSeq sequences, 35-511 nt long, and 12.96 genome equivalents of HiSeq paired-end reads, 25-100 nt long. The two libraries were first assembled separately, then a hybrid assembly was performed; finally, contigs and supercontigs were scaffolded. This first rough assembly is composed of 264,088 scaffolds, up to 41,760 nt in length, covering 323,708,138 nt, which corresponds to 87.5% of the fig genome, with N50 = 2,523. Masking the scaffolds with a Rosaceae transcriptome, from which sequences related to repetitive elements had been removed, allowed us to establish that coding genes account for at least 6.8% of the fig genome. Gene prediction analysis produced 44,419 putative genes. A sample of around 5,000 predicted genes was annotated with regard to gene ontology and function. As for the repetitive component, 58.3% of the fig genome was found to consist of repeated sequences, none of which was especially redundant. Among the identified repeats, the most represented were LTR-retrotransposons, with Gypsy elements more frequent than Copia.
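The assembly statistics quoted in the abstract (N50 and the fraction of the genome covered) can be illustrated with a minimal Python sketch. It assumes the common definition of N50 as the length of the shortest scaffold in the smallest set of longest scaffolds that together span at least half of the assembly. The function names and the toy scaffold lengths are hypothetical and are not taken from the paper; only the figures 323,708,138 nt and 87.5% come from the abstract.

```python
def n50(lengths):
    """Return the N50 of a list of scaffold lengths.

    Assumed definition: sort scaffolds from longest to shortest and
    return the length at which the running total first reaches half
    of the summed assembly length.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


def implied_genome_size(covered_nt, covered_fraction):
    """Genome length implied by a covered length and its stated fraction."""
    return covered_nt / covered_fraction


# Toy scaffold lengths (hypothetical, for illustration only).
toy_scaffolds = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_scaffolds)

# Figures from the abstract: 323,708,138 nt covering 87.5% of the genome.
genome_nt = implied_genome_size(323_708_138, 0.875)

print(f"toy N50: {toy_n50}")
print(f"implied genome length: {genome_nt / 1e6:.1f} Mb")
```

On real data, the list of lengths would be read from the scaffold FASTA file; the reported value N50 = 2,523 cannot be reproduced here because the scaffold lengths themselves are not part of this record.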
File: 2015 Barghini et al Fig Symposium postprint.pdf
Access: open access
Description: Article
Type: Post-print document
License: All rights reserved
Size: 628.35 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.