
Matrix factorisation techniques for data integration

Multi angle
Authors: Lê Cao, Kim-Anh (Speaker)
CIRM (Publisher)


Abstract: Gene module detection methods aim to group genes with similar expression profiles in order to shed light on functional relationships and co-regulation, and to infer gene regulatory networks. Methods proposed so far use clustering to group genes based on global similarity in their expression profiles (co-expression), bi-clustering to group genes and samples simultaneously, or network inference to model regulatory relationships between genes. In this talk I will focus on multivariate matrix decomposition techniques that enable dimension reduction and the identification of molecular signatures.
We will consider two different types of assays: bulk and single cell assays. Bulk transcriptomics assays use RNA-sequencing techniques to monitor the average expression profile of all the constituent cells, but fail to identify the distinct transcriptional profiles of different cell types. Single cell assays use RNA-seq techniques (scRNA-seq) similar to those used for bulk cell populations, but provide unprecedented resolution at the cell level to understand cellular heterogeneity and uncover new biology. However, scRNA-seq data present new computational and analytical challenges because of their sheer size (100K to 500K cells are sequenced) and their zero-inflated distribution due to technical drop-outs.
I will illustrate how we can use matrix factorisation techniques to mine these data and identify gene modules that underpin the molecular mechanisms of cell identity in scRNA-seq data. I will also give further perspectives on how we could extend similar concepts to integrate different omics data types (e.g. bulk transcriptomics, proteomics, metabolomics) to identify tightly connected multi-omics signatures that holistically describe a biological system.
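As a rough illustration of the idea in the abstract, the sketch below factorises a small simulated expression matrix with a plain singular value decomposition and reads a candidate gene module off the first set of loadings. Everything in it is an assumption made for illustration: the data are simulated, the function names `simulate_counts` and `top_module` are hypothetical, and an unpenalised SVD in NumPy stands in for the sparse multivariate methods discussed in the talk.

```python
import numpy as np


def simulate_counts(n_cells=200, n_genes=50, module_size=10, dropout=0.3, seed=0):
    """Simulate a toy zero-inflated count matrix (cells x genes).

    Hypothetical setup: the first `module_size` genes form one module that is
    up-regulated in the second half of the cells.
    """
    rng = np.random.default_rng(seed)
    cell_type = np.repeat([0, 1], n_cells // 2)
    mean = np.full((n_cells, n_genes), 2.0)
    mean[cell_type == 1, :module_size] = 20.0
    counts = rng.poisson(mean).astype(float)
    # Mimic technical drop-outs by zeroing a random fraction of the entries.
    counts[rng.random(counts.shape) < dropout] = 0.0
    return counts, cell_type


def top_module(counts, n_top=10):
    """Factorise the log-transformed, gene-centred matrix as X = U S V^T.

    Returns the indices of the `n_top` genes with the largest absolute
    loadings on the first component, and the cell scores on that component.
    """
    X = np.log1p(counts)
    X = X - X.mean(axis=0)  # centre each gene (column)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    loadings = Vt[0]        # gene weights of component 1
    scores = U[:, 0] * s[0]  # cell coordinates on component 1
    genes = np.sort(np.argsort(-np.abs(loadings))[:n_top])
    return genes, scores


counts, cell_type = simulate_counts()
module_genes, scores = top_module(counts)
print("Candidate module genes:", module_genes)
```

In this toy setting the genes with the largest loadings are the ones planted in the module, and the cell scores separate the two simulated cell types. Real scRNA-seq data would need normalisation, far larger matrices, and usually a sparsity penalty on the loadings to select genes.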

Keywords: biomathematics; dimension reduction; data integration

MSC codes:
15A23 - Factorization of matrices
92B15 - General biostatistics, See also {62P10}

    Video Information

    Director: Hennenfent, Guillaume
    Language: English
    Publication date: 23/03/2020
    Recording date: 05/03/2020
    Sub-collection: Research talks
    arXiv category: Machine Learning; Quantitative Biology
    Field(s): Probability & Statistics
    Format: MP4 (.mp4) - HD
    Duration: 01:26:46
    Audience: Researchers
    Download: https://videos.cirm-math.fr/2020-03-05_Le Cao.mp4

Event Information

Event name: Thematic Month Week 5: Networks and Molecular Biology / Mois thématique Semaine 5 : Réseaux et biologie moléculaire
Event organizers: Baudot, Anais; Hubert, Florence; Moss, Brigitte; Rémy, Elisabeth; Tichit, Laurent; Vignes, Matthieu
Dates: 02/03/2020 - 06/03/2020
Event year: 2020
Event URL: https://conferences.cirm-math.fr/2305.html

Citation Data

DOI: 10.24350/CIRM.V.19620803
Cite this video: Lê Cao, Kim-Anh (2020). Matrix factorisation techniques for data integration. CIRM. Audiovisual resource. doi:10.24350/CIRM.V.19620803
URI : http://dx.doi.org/10.24350/CIRM.V.19620803

See Also

Bibliography

  • DRIER, Yotam, SHEFFER, Michal, and DOMANY, Eytan. Pathway-based personalized analysis of cancer. Proceedings of the National Academy of Sciences, 2013, vol. 110, no 16, p. 6388-6393. - https://doi.org/10.1073/pnas.1219651110

  • LIU, Chao, SRIHARI, Sriganesh, LÊ CAO, Kim-Anh, et al. A fine-scale dissection of the DNA double-strand break repair machinery and its implications for breast cancer therapy. Nucleic acids research, 2014, vol. 42, no 10, p. 6106-6127. - https://doi.org/10.1093/nar/gku284

  • LIU, Chao, SRIHARI, Sriganesh, LAL, Samir, et al. Personalised pathway analysis reveals association between DNA repair pathway dysregulation and chromosomal instability in sporadic breast cancer. Molecular oncology, 2016, vol. 10, no 1, p. 179-193. - https://doi.org/10.1016/j.molonc.2015.09.007

  • HASTIE, Trevor and STUETZLE, Werner. Principal curves. Journal of the American Statistical Association, 1989, vol. 84, no 406, p. 502-516. - https://www.tandfonline.com/doi/abs/10.1080/01621459.1989.10478797

  • SAELENS, Wouter, CANNOODT, Robrecht, and SAEYS, Yvan. A comprehensive evaluation of module detection methods for gene expression data. Nature communications, 2018, vol. 9, no 1, p. 1-12. - https://doi.org/10.1038/s41467-018-03424-4

  • COMON, Pierre. Independent component analysis, a new concept?. Signal processing, 1994, vol. 36, no 3, p. 287-314. - https://doi.org/10.1016/0165-1684(94)90029-9

  • YAO, Fangzhou, COQUERY, Jeff, and LÊ CAO, Kim-Anh. Independent principal component analysis for biologically meaningful dimension reduction of large biological data sets. BMC bioinformatics, 2012, vol. 13, no 1, p. 24. - http://dx.doi.org/10.1186/1471-2105-13-24

  • SCHAUM, Nicholas, KARKANIAS, Jim, NEFF, Norma F., et al. Single-cell transcriptomics of 20 mouse organs creates a Tabula Muris: The Tabula Muris Consortium. Nature, 2018, vol. 562, no 7727, p. 367. - https://dx.doi.org/10.1038%2Fs41586-018-0590-4

  • LÊ CAO, Kim-Anh, ROSSOUW, Debra, ROBERT-GRANIÉ, Christèle, et al. A sparse PLS for variable selection when integrating omics data. Statistical Applications in Genetics & Molecular Biology, 2008, vol. 7, no 1, p. 1-29. - https://doi.org/10.2202/1544-6115.1390

  • LÊ CAO, Kim-Anh, BOITARD, Simon, and BESSE, Philippe. Sparse PLS discriminant analysis: biologically relevant feature selection and graphical displays for multiclass problems. BMC Bioinformatics, 2011, vol. 12, no 1, p. 253. - https://doi.org/10.1186/1471-2105-12-253

  • TENENHAUS, Arthur, PHILIPPE, Cathy, GUILLEMOT, Vincent, et al. Variable selection for generalized canonical correlation analysis. Biostatistics, 2014, vol. 15, no 3, p. 569-583. - https://doi.org/10.1093/biostatistics/kxu001

  • SINGH, Amrit, SHANNON, Casey P., GAUTIER, Benoît, et al. DIABLO: an integrative approach for identifying key molecular drivers from multi-omics assays. Bioinformatics, 2019, vol. 35, no 17, p. 3055-3062. - https://doi.org/10.1093/bioinformatics/bty1054

  • ROHART, Florian, GAUTIER, Benoit, SINGH, Amrit, et al. mixOmics: An R package for ‘omics feature selection and multiple data integration. PLoS computational biology, 2017, vol. 13, no 11, p. e1005752. - https://doi.org/10.1371/journal.pcbi.1005752

  • LEE, Amy H., SHANNON, Casey P., AMENYOGBE, Nelly, et al. Dynamic molecular changes during the first week of human life follow a robust developmental trajectory. Nature communications, 2019, vol. 10, no 1, p. 1-14. - https://doi.org/10.1038/s41467-019-08794-x

  • LÊ CAO, Kim-Anh, COSTELLO, Mary-Ellen, LAKIS, Vanessa Anne, et al. MixMC: a multivariate statistical framework to gain insight into microbial communities. PloS one, 2016, vol. 11, no 8. - https://dx.doi.org/10.1371%2Fjournal.pone.0160169

  • WANG, Yiwen and LÊ CAO, Kim-Anh. Managing batch effects in microbiome data. Briefings in bioinformatics, 2019. - https://doi.org/10.1093/bib/bbz105

  • BODEIN, Antoine, CHAPLEUR, Olivier, DROIT, Arnaud, et al. A generic multivariate framework for the integration of microbiome longitudinal studies with other data types. Frontiers in Genetics, 2019, vol. 10. - https://dx.doi.org/10.3389%2Ffgene.2019.00963
