Squeeze and Learn: Compressing Long Sequences with Fourier Transformers for Gene Expression Prediction

Vittorio Pipoli; Giuseppe Attanasio
2024-01-01

Abstract

Genes regulate fundamental processes in living cells, such as the synthesis of proteins and other functional molecules. Studying gene expression is hence crucial for both diagnostic and therapeutic purposes. State-of-the-art deep learning techniques such as Xpresso have been proposed to predict gene expression from raw DNA sequences. However, DNA sequences challenge computational approaches because of their length, typically on the order of thousands of nucleotides, and their sparsity, requiring models to capture both short- and long-range dependencies. Indeed, applying recent techniques like Transformers is prohibitive with common hardware resources. This paper proposes FNetCompression, a novel gene-expression prediction method. Crucially, FNetCompression combines convolutional encoders and memory-efficient Transformers to compress the sequence by up to 95% with minimal performance tradeoff. Experiments on the Xpresso dataset show that FNetCompression outperforms our baselines by a statistically significant margin. Moreover, FNetCompression is 88% faster than a classical transformer-based architecture with minimal performance tradeoff.
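
The abstract describes the core idea at a high level: a convolutional encoder first compresses the raw DNA sequence, and memory-efficient Fourier Transformer (FNet-style) blocks then model long-range dependencies over the shortened sequence. The sketch below illustrates that idea only; it is not the authors' implementation, and every module name, layer size, and hyperparameter (e.g. ConvFourierRegressor, dim=128, pool=20) is an illustrative assumption.

# Minimal sketch of the architecture described in the abstract, not the authors' code.
# A strided conv/pooling encoder shortens the one-hot DNA sequence (pool=20 gives a
# ~95% reduction), and FNet-style Fourier mixing replaces quadratic self-attention.
import torch
import torch.nn as nn


class FourierMixingBlock(nn.Module):
    """FNet-style block: token mixing via a 2D FFT instead of self-attention."""

    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ff = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

    def forward(self, x):  # x: (batch, seq_len, dim)
        # FFT over the sequence and feature axes; keep only the real part.
        mixed = torch.fft.fft2(x).real
        x = self.norm1(x + mixed)
        return self.norm2(x + self.ff(x))


class ConvFourierRegressor(nn.Module):
    """Conv encoder compresses the sequence; Fourier blocks model long-range context."""

    def __init__(self, seq_channels=4, dim=128, n_blocks=4, pool=20):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(seq_channels, dim, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.MaxPool1d(pool),  # downsample the sequence by a factor of `pool`
        )
        self.blocks = nn.Sequential(*[FourierMixingBlock(dim, 4 * dim) for _ in range(n_blocks)])
        self.head = nn.Linear(dim, 1)  # scalar gene-expression value

    def forward(self, x):  # x: (batch, 4, seq_len) one-hot DNA
        z = self.encoder(x).transpose(1, 2)  # (batch, seq_len / pool, dim)
        z = self.blocks(z).mean(dim=1)       # global average over positions
        return self.head(z).squeeze(-1)


# Example: a batch of 2 sequences of length 10,000 -> 2 expression scores.
model = ConvFourierRegressor()
print(model(torch.randn(2, 4, 10_000)).shape)  # torch.Size([2])

Replacing self-attention with an FFT-based mixing step removes the quadratic attention matrices, which is consistent with the memory savings and the reported speedup over a classical transformer-based architecture.
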
Files in this record:
There are no files associated with this record.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this record: https://hdl.handle.net/11568/1324628