Detection and analysis of spliced chimeric mRNAs in sequence databanks

Guerra, Emanuela; Trerotola, Marco; Alberti, Saverio
2003-01-01

Abstract

We have developed a databank screening procedure, the In Silico Trans-splicing Retrieval System (ISTReS), to identify heterologous spliced mRNAs that may originate from chromosomal translocations, mRNA trans-splicing or multi-locus transcription. A parsing algorithm was implemented to screen cDNA-versus-genome Blast outputs. The key filtering criteria were Blast scores of ≥300, match lengths of ≥95% of the query sequence, junction of the two partners at exon-exon borders and concordant 'sense/sense' reading orientation. ISTReS was validated by the successful identification of bona fide chromosomal translocation-derived fusion transcripts in the HGI and RefSeq databanks. Its performance was then tested against recently identified chimeric antisense transcripts: it found essentially no independent evidence of antisense transcription and no exon-exon borders at the chimeric join, consistent with an artefactual origin. Analysis of the UNIGENE database revealed 21 742 chimeric sequences overall, corresponding to approximately 1% of the database transcripts. Novel FOP-Rho GAP and methionyl tRNA synthetase-advillin chimeric mRNAs with the canonical features of spliced transcripts from heterologous genes were identified among 246 chimeras from the RefSeq databank. This suggests that canonically spliced chimeras make up approximately 1% of all the hybrid sequences in current databanks. These findings demonstrate the efficiency of ISTReS and the overall feasibility of sequence/structure-based strategies to search for chimeric mRNAs that are candidates to derive from the splicing of heterologous transcripts.
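
The four filtering criteria listed in the abstract can be illustrated with a short Python sketch. This is not the ISTReS implementation, which the abstract does not describe in detail. The `Hit` record, its field names and the `is_candidate_chimera` function are hypothetical, and two points are assumptions: the score threshold is applied to each partner separately, and the 95% match length is read as the combined coverage of the query by the two partner hits.

```python
from dataclasses import dataclass

MIN_SCORE = 300       # Blast score threshold quoted in the abstract
MIN_COVERAGE = 0.95   # matched fraction of the query quoted in the abstract


@dataclass(frozen=True)
class Hit:
    """One cDNA-versus-genome Blast hit (hypothetical, simplified record)."""
    locus: str            # genomic locus or gene the hit maps to
    q_start: int          # 1-based inclusive start on the query cDNA
    q_end: int            # 1-based inclusive end on the query cDNA
    score: float          # Blast score of the hit
    strand: str           # '+' if the cDNA matches the gene in sense
    at_exon_border: bool  # True if the hit ends at an annotated exon boundary at the join


def is_candidate_chimera(query_len: int, five_prime: Hit, three_prime: Hit) -> bool:
    """Apply the four abstract-level criteria to a pair of partner hits."""
    # The two partners must come from different (heterologous) loci.
    if five_prime.locus == three_prime.locus:
        return False
    # Assumption: each partner must reach the score threshold on its own.
    if min(five_prime.score, three_prime.score) < MIN_SCORE:
        return False
    # Assumption: coverage is the union of both hits over the query length.
    len_a = five_prime.q_end - five_prime.q_start + 1
    len_b = three_prime.q_end - three_prime.q_start + 1
    overlap = max(0, min(five_prime.q_end, three_prime.q_end)
                  - max(five_prime.q_start, three_prime.q_start) + 1)
    if (len_a + len_b - overlap) / query_len < MIN_COVERAGE:
        return False
    # Concordant 'sense/sense' orientation of both partners.
    if five_prime.strand != "+" or three_prime.strand != "+":
        return False
    # The chimeric join must sit at exon-exon borders on both sides.
    return five_prime.at_exon_border and three_prime.at_exon_border


# Illustrative values only; these are not data from the study.
a = Hit("geneA", 1, 600, 900.0, "+", True)
b = Hit("geneB", 601, 1200, 850.0, "+", True)
print(is_candidate_chimera(1200, a, b))  # True
```

Under these assumptions, a pair in which one partner maps in antisense, or in which the join does not fall on an exon boundary, is rejected. This mirrors the way the abstract describes the chimeric antisense transcripts as likely artefacts.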
Use this identifier to cite or link to this document: https://hdl.handle.net/11564/640053