Showing 200 of 303 results

bioc

mixOmics:Omics Data Integration Project

Multivariate methods are well suited to large omics data sets where the number of variables (e.g. genes, proteins, metabolites) is much larger than the number of samples (patients, cells, mice). They have the appealing property of reducing the dimension of the data by using instrumental variables (components), which are defined as combinations of all variables. Those components are then used to produce useful graphical outputs that enable better understanding of the relationships and correlation structures between the different data sets that are integrated. mixOmics offers a wide range of multivariate methods for the exploration and integration of biological datasets, with a particular focus on variable selection. The package proposes several sparse multivariate models we have developed to identify the key variables that are highly correlated and/or explain the biological outcome of interest. The data that can be analysed with mixOmics may come from high-throughput sequencing technologies, such as omics data (transcriptomics, metabolomics, proteomics, metagenomics, etc.), but also from beyond the realm of omics (e.g. spectral imaging). The methods implemented in mixOmics can also handle missing values without having to delete entire rows with missing data. A non-exhaustive list of methods includes variants of generalised Canonical Correlation Analysis, sparse Partial Least Squares and sparse Discriminant Analysis. Recently we implemented integrative methods to combine multiple data sets: N-integration with variants of Generalised Canonical Correlation Analysis and P-integration with variants of multi-group Partial Least Squares.

Maintained by Eva Hamrud. Last updated 2 days ago.

immunooncology microarray sequencing metabolomics metagenomics proteomics geneprediction multiplecomparison classification regression bioconductor genomics genomics-data genomics-visualization multivariate-analysis multivariate-statistics omics r-pkg r-project

185 stars 13.75 score 1.3k scripts 22 dependents

maxwell-geospatial

geodl:Geospatial Semantic Segmentation with Torch and Terra

Provides tools for semantic segmentation of geospatial data using convolutional neural network-based deep learning. Utility functions allow for creating masks, image chips, data frames listing image chips in a directory, and DataSets for use within DataLoaders. Additional functions are provided to serve as checks during the data preparation and training process. A UNet architecture can be defined with 4 blocks in the encoder, a bottleneck block, and 4 blocks in the decoder. The UNet can accept a variable number of input channels, and the user can define the number of feature maps produced in each encoder and decoder block and the bottleneck. Users can also choose to (1) replace all rectified linear unit (ReLU) activation functions with leaky ReLU or swish, (2) implement attention gates along the skip connections, (3) implement squeeze and excitation modules within the encoder blocks, (4) add residual connections within all blocks, (5) replace the bottleneck with a modified atrous spatial pyramid pooling (ASPP) module, and/or (6) implement deep supervision using predictions generated at each stage in the decoder. A unified focal loss framework is implemented after Yeung et al. (2022) <https://doi.org/10.1016/j.compmedimag.2021.102026>. We have also implemented assessment metrics using the 'luz' package including F1-score, recall, and precision. Trained models can be used to predict to spatial data without the need to generate chips from larger spatial extents. Functions are available for performing accuracy assessment. The package relies on 'torch' for implementing deep learning, which does not require the installation of a 'Python' environment. Raster geospatial data are handled with 'terra'. Models can be trained using a Compute Unified Device Architecture (CUDA)-enabled graphics processing unit (GPU); however, multi-GPU training is not supported by 'torch' in 'R'.

Maintained by Aaron Maxwell. Last updated 8 months ago.

12 stars 6.86 score 20 scripts
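
The assessment metrics named in the description above (precision, recall and F1-score) can be illustrated with a short sketch. It is written in plain Python rather than R and does not use the 'geodl' or 'luz' APIs; the function name `precision_recall_f1` and the per-class counting approach are assumptions made for illustration only.

```python
def precision_recall_f1(reference, predicted, positive):
    """Precision, recall and F1 for one class, from two equal-length label sequences."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    pairs = list(zip(reference, predicted))
    # Count true positives, false positives and false negatives for the chosen class.
    tp = sum(1 for r, p in pairs if r == positive and p == positive)
    fp = sum(1 for r, p in pairs if r != positive and p == positive)
    fn = sum(1 for r, p in pairs if r == positive and p != positive)
    # Guard against empty denominators (class never predicted or never present).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

For a segmentation result, `reference` and `predicted` would be the flattened per-pixel class labels of the reference mask and the model prediction.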

bioc

MoonlightR:Identify oncogenes and tumor suppressor genes from omics data

Motivation: The understanding of cancer mechanism requires the identification of genes playing a role in the development of the pathology and the characterization of their role (notably oncogenes and tumor suppressors). Results: We present an R/bioconductor package called MoonlightR which returns a list of candidate driver genes for specific cancer types on the basis of TCGA expression data. The method first infers gene regulatory networks and then carries out a functional enrichment analysis (FEA) (implementing an upstream regulator analysis, URA) to score the importance of well-known biological processes with respect to the studied cancer type. Eventually, by means of random forests, MoonlightR predicts two specific roles for the candidate driver genes: i) tumor suppressor genes (TSGs) and ii) oncogenes (OCGs). As a consequence, this methodology does not only identify genes playing a dual role (e.g. TSG in one cancer type and OCG in another) but also helps in elucidating the biological processes underlying their specific roles. In particular, MoonlightR can be used to discover OCGs and TSGs in the same cancer type. This may help in answering the question whether some genes change role between early stages (I, II) and late stages (III, IV) in breast cancer. In the future, this analysis could be useful to determine the causes of different resistances to chemotherapeutic treatments.

Maintained by Matteo Tiberti. Last updated 5 months ago.

dnamethylation differentialmethylation generegulation geneexpression methylationarray differentialexpression pathways network survival genesetenrichment networkenrichment

17 stars 6.57 score

esteban-alfaro

adabag:Applies Multiclass AdaBoost.M1, SAMME and Bagging

It implements Freund and Schapire's Adaboost.M1 algorithm and Breiman's Bagging algorithm using classification trees as individual classifiers. Once these classifiers have been trained, they can be used to predict on new data. Cross-validation estimation of the error is also available. Since version 2.0 the function margins() is available to calculate the margins for these classifiers. Greater flexibility is also achieved by giving access to the rpart.control() argument of 'rpart'. Four important new features were introduced in version 3.0: AdaBoost-SAMME (Zhu et al., 2009) is implemented, and a new function errorevol() shows the error of the ensembles as a function of the number of iterations. In addition, the ensembles can be pruned using the option 'newmfinal' in the predict.bagging() and predict.boosting() functions, and the posterior probability of each class for observations can be obtained. Version 3.1 modifies the relative importance measure to take into account the gain of the Gini index given by a variable in each tree and the weights of these trees. Version 4.0 includes the margin-based ordered aggregation for Bagging pruning (Guo and Boukir, 2013) and a function to auto-prune the 'rpart' tree. Moreover, three new plots are available: importanceplot(), plot.errorevol() and plot.margins(). Version 4.1 allows prediction on unlabeled data. Version 4.2 includes the parallel computation option for some of the functions. Version 5.0 includes the Boosting and Bagging algorithms for label ranking (Albano, Sciandra and Plaia, 2023).

Maintained by Esteban Alfaro. Last updated 2 years ago.

5 stars 6.46 score 720 scripts 6 dependents
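
The difference between the two boosting variants named above lies in how much weight each tree receives in the final vote. The following is a minimal Python sketch of the textbook formulas, not the 'adabag' code: the function name `classifier_weight` and its arguments are hypothetical, and the coefficient options actually offered by the package may differ.

```python
import math

def classifier_weight(error, n_classes, method="samme"):
    """Vote weight (alpha) of one weak classifier given its weighted error.

    method="m1":    alpha = ln((1 - err) / err)             (AdaBoost.M1)
    method="samme": alpha = ln((1 - err) / err) + ln(K - 1) (SAMME, Zhu et al., 2009)
    """
    if not 0.0 < error < 1.0:
        raise ValueError("error must lie strictly between 0 and 1")
    if n_classes < 2:
        raise ValueError("n_classes must be at least 2")
    alpha = math.log((1.0 - error) / error)
    if method == "samme":
        # The extra term keeps alpha positive whenever err < 1 - 1/K,
        # i.e. whenever the tree beats random guessing among K classes.
        alpha += math.log(n_classes - 1)
    elif method != "m1":
        raise ValueError("method must be 'm1' or 'samme'")
    return alpha
```

With three classes and a weighted error of 0.5, the M1 weight is zero while the SAMME weight is still positive, which is why SAMME tolerates weaker trees in multiclass problems. With two classes the extra term vanishes and the two formulas coincide.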

bioc

Moonlight2R:Identify oncogenes and tumor suppressor genes from omics data

The understanding of cancer mechanism requires the identification of genes playing a role in the development of the pathology and the characterization of their role (notably oncogenes and tumor suppressors). We present an updated version of the R/bioconductor package called MoonlightR, namely Moonlight2R, which returns a list of candidate driver genes for specific cancer types on the basis of omics data integration. The Moonlight framework contains a primary layer where gene expression data and information about biological processes are integrated to predict genes called oncogenic mediators, divided into putative tumor suppressors and putative oncogenes. This is done through functional enrichment analyses, gene regulatory networks and upstream regulator analyses to score the importance of well-known biological processes with respect to the studied cancer type. By evaluating the effect of the oncogenic mediators on biological processes or through random forests, the primary layer predicts two putative roles for the oncogenic mediators: i) tumor suppressor genes (TSGs) and ii) oncogenes (OCGs). As gene expression data alone is not enough to explain the deregulation of the genes, a second layer of evidence is needed. We have automated the integration of a secondary mutational layer through new functionalities in Moonlight2R. These functionalities analyze mutations in the cancer cohort and classify these into driver and passenger mutations using the driver mutation prediction tool, CScape-somatic. Those oncogenic mediators with at least one driver mutation are retained as the driver genes. As a consequence, this methodology does not only identify genes playing a dual role (e.g. TSG in one cancer type and OCG in another) but also helps in elucidating the biological processes underlying their specific roles. In particular, Moonlight2R can be used to discover OCGs and TSGs in the same cancer type. This may for instance help in answering the question of whether some genes change role between early stages (I, II) and late stages (III, IV). In the future, this analysis could be useful to determine the causes of different resistances to chemotherapeutic treatments. An additional mechanistic layer evaluates if there are mutations affecting the protein stability of the transcription factors (TFs) of the TSGs and OCGs, as that may have an effect on the expression of the genes.

Maintained by Matteo Tiberti. Last updated 2 months ago.

dnamethylation differentialmethylation generegulation geneexpression methylationarray differentialexpression pathways network survival genesetenrichment networkenrichment

5 stars 6.41 score 43 scripts

stla

uniformly:Uniform Sampling

Uniform sampling on various geometric shapes, such as spheres, ellipsoids, simplices.

Maintained by Stéphane Laurent. Last updated 2 years ago.

simulations

10 stars 6.21 score 109 scripts 1 dependent
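
Two of the shapes mentioned above have well-known sampling constructions, sketched here in plain Python. This illustrates the general technique only and is not the 'uniformly' package's R interface; the function names `sample_unit_sphere` and `sample_simplex` are hypothetical.

```python
import math
import random

def sample_unit_sphere(dim, rng=random):
    """One point uniform on the surface of the unit sphere in `dim` dimensions.

    A vector of independent standard normals has a rotation-invariant
    distribution, so normalising it gives a uniform direction.
    """
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:  # retry in the (practically impossible) all-zero case
            return [x / norm for x in v]

def sample_simplex(dim, rng=random):
    """One point uniform on the standard simplex (non-negative, sums to 1).

    Normalised independent Exp(1) draws follow a Dirichlet(1, ..., 1)
    distribution, which is uniform on the simplex.
    """
    e = [rng.expovariate(1.0) for _ in range(dim)]
    total = sum(e)
    return [x / total for x in e]
```

Passing a seeded `random.Random` instance as `rng` makes the draws reproducible. Naive alternatives, such as normalising points drawn uniformly from a cube, do not give a uniform distribution on the sphere.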

cran

Directional:A Collection of Functions for Directional Data Analysis

A collection of functions for directional data (including massive data, with millions of observations) analysis. Hypothesis testing, discriminant and regression analysis, MLE of distributions and more are included. The standard textbook for such data is "Directional Statistics" by Mardia, K. V. and Jupp, P. E. (2000). Other references include: a) Paine J.P., Preston S.P., Tsagris M. and Wood A.T.A. (2018). "An elliptically symmetric angular Gaussian distribution". Statistics and Computing 28(3): 689-697. <doi:10.1007/s11222-017-9756-4>. b) Tsagris M. and Alenazi A. (2019). "Comparison of discriminant analysis methods on the sphere". Communications in Statistics: Case Studies, Data Analysis and Applications 5(4): 467-491. <doi:10.1080/23737484.2019.1684854>. c) Paine J.P., Preston S.P., Tsagris M. and Wood A.T.A. (2020). "Spherical regression models with general covariates and anisotropic errors". Statistics and Computing 30(1): 153-165. <doi:10.1007/s11222-019-09872-2>. d) Tsagris M. and Alenazi A. (2024). "An investigation of hypothesis testing procedures for circular and spherical mean vectors". Communications in Statistics-Simulation and Computation 53(3): 1387-1408. <doi:10.1080/03610918.2022.2045499>. e) Yu Z. and Huang X. (2024). "A new parameterization for elliptically symmetric angular Gaussian distributions of arbitrary dimension". Electronic Journal of Statistics 18(1): 301-334. <doi:10.1214/23-EJS2210>. f) Tsagris M. and Alzeley O. (2024). "Circular and spherical projected Cauchy distributions: A Novel Framework for Circular and Directional Data Modeling". Australian & New Zealand Journal of Statistics (Accepted for publication). <doi:10.1111/anzs.12434>. g) Tsagris M., Papastamoulis P. and Kato S. (2024). "Directional data analysis: spherical Cauchy or Poisson kernel-based distribution". Statistics and Computing (Accepted for publication). <doi:10.48550/arXiv.2409.03292>.

Maintained by Michail Tsagris. Last updated 2 months ago.

3 stars 4.06 score 3 dependents

karlines

plot3Drgl:Plotting Multi-Dimensional Data - Using 'rgl'

The 'rgl' implementation of plot3D functions.

Maintained by Karline Soetaert. Last updated 2 years ago.

3.62 score 133 scripts 5 dependents

christophe314

longitudinalData:Longitudinal Data

Tools for longitudinal data and joint longitudinal data (used by packages kml and kml3d).

Maintained by Christophe Genolini. Last updated 6 months ago.

1 star 3.55 score 65 scripts 11 dependents

asgr

sphereplot:Spherical Plotting

Various functions for creating spherical coordinate system plots via extensions to rgl.

Maintained by Aaron Robotham. Last updated 2 years ago.

3 stars 3.52 score 37 scripts 1 dependent

havishaj

Greymodels:Shiny App for Grey Forecasting Model

The 'Greymodels' Shiny app is an interactive interface for statistical modelling and forecasting using grey-based models. It covers several state-of-the-art univariate and multivariate grey models. A user-friendly interface allows users to easily compare the prediction performance of different models and, among other things, visualize plots of predicted values within user-chosen confidence intervals. Chang, C. (2019) <doi:10.24818/18423264/53.1.19.11>, Li, K., Zhang, T. (2019) <doi:10.1007/s12667-019-00344-0>, Ou, S. (2012) <doi:10.1016/j.compag.2012.03.007>, Li, S., Zhou, M., Meng, W., Zhou, W. (2019) <doi:10.1080/23307706.2019.1666310>, Xie, N., Liu, S. (2009) <doi:10.1016/j.apm.2008.01.011>, Shao, Y., Su, H. (2012) <doi:10.1016/j.aasri.2012.06.003>, Xie, N., Liu, S., Yang, Y., Yuan, C. (2013) <doi:10.1016/j.apm.2012.10.037>, Li, S., Miao, Y., Li, G., Ikram, M. (2020) <doi:10.1016/j.matcom.2019.12.020>, Che, X., Luo, Y., He, Z. (2013) <doi:10.4028/www.scientific.net/AMM.364.207>, Zhu, J., Xu, Y., Leng, H., Tang, H., Gong, H., Zhang, Z. (2016) <doi:10.1109/appeec.2016.7779929>, Luo, Y., Liao, D. (2012) <doi:10.4028/www.scientific.net/AMR.507.265>, Bilgil, H. (2020) <doi:10.3934/math.2021091>, Li, D., Chang, C., Chen, W., Chen, C. (2011) <doi:10.1016/j.apm.2011.04.006>, Chen, C. (2008) <doi:10.1016/j.chaos.2006.08.024>, Zhou, W., Pei, L. (2020) <doi:10.1007/s00500-019-04248-0>, Xiao, X., Duan, H. (2020) <doi:10.1016/j.engappai.2019.103350>, Xu, N., Dang, Y. (2015) <doi:10.1155/2015/606707>, Chen, P., Yu, H. (2014) <doi:10.1155/2014/242809>, Zeng, B., Li, S., Meng, W., Zhang, D. (2019) <doi:10.1371/journal.pone.0221333>, Liu, L., Wu, L. (2021) <doi:10.1016/j.apm.2020.08.080>, Hu, Y. (2020) <doi:10.1007/s00500-020-04765-3>, Zhou, P., Ang, B., Poh, K. (2006) <doi:10.1016/j.energy.2005.12.002>, Cheng, M., Li, J., Liu, Y., Liu, B. (2020) <doi:10.3390/su12020698>, Wang, H., Wang, P., Senel, M., Li, T. (2019) <doi:10.1155/2019/9049815>, Ding, S., Li, R. (2020) <doi:10.1155/2020/4564653>, Zeng, B., Li, C. (2018) <doi:10.1016/j.cie.2018.02.042>, Xie, N., Liu, S. (2015) <doi:10.1109/JSEE.2015.00013>, Zeng, X., Yan, S., He, F., Shi, Y. (2019) <doi:10.1016/j.apm.2019.11.032>.

Maintained by Jahajeeah Havisha. Last updated 2 years ago.

4 stars 3.30 score

cran

ashapesampler:Generating Alpha Shapes

Understanding morphological variation is an important task in many applications. Recent studies in computational biology have focused on developing computational tools for the task of sub-image selection which aims at identifying structural features that best describe the variation between classes of shapes. A major part in assessing the utility of these approaches is to demonstrate their performance on both simulated and real datasets. However, when creating a model for shape statistics, real data can be difficult to access and the sample sizes for these data are often small due to them being expensive to collect. Meanwhile, the landscape of current shape simulation methods has been mostly limited to approaches that use black-box inference---making it difficult to systematically assess the power and calibration of sub-image models. In this R package, we introduce the alpha-shape sampler: a probabilistic framework for simulating realistic 2D and 3D shapes based on probability distributions which can be learned from real data or explicitly stated by the user. The 'ashapesampler' package supports two mechanisms for sampling shapes in two and three dimensions. The first, empirically sampling based on an existing data set, was highlighted in the original main text of the paper. The second, probabilistic sampling from a known distribution, is the computational implementation of the theory derived in that paper. Work based on Winn-Nunez et al. (2024) <doi:10.1101/2024.01.09.574919>.

Maintained by Emily Winn-Nunez. Last updated 1 year ago.

3.30 score

mondrus96

fabisearch:Change Point Detection in High-Dimensional Time Series Networks

Implementation of the Factorized Binary Search (FaBiSearch) methodology for the estimation of the number and the location of multiple change points in the network (or clustering) structure of multivariate high-dimensional time series. The method is motivated by the detection of change points in functional connectivity networks for functional magnetic resonance imaging (fMRI) data. FaBiSearch uses non-negative matrix factorization (NMF), an unsupervised dimension reduction technique, and a new binary search algorithm to identify multiple change points. It requires minimal assumptions. Lastly, we provide interactive, 3-dimensional, brain-specific network visualization capability in a flexible, stand-alone function. This function can be conveniently used with any node coordinate atlas, and nodes can be color coded according to community membership, if applicable. The output is an elegantly displayed network laid over a cortical surface, which can be rotated in the 3-dimensional space. The main routines of the package are detect.cps(), for multiple change point detection, est.net(), for estimating a network between stationary multivariate time series, net.3dplot(), for plotting the estimated functional connectivity networks, and opt.rank(), for finding the optimal rank in NMF for a given data set. The functions have been extensively tested on simulated multivariate high-dimensional time series data and fMRI data. For details on the FaBiSearch methodology, please see Ondrus et al. (2021) <arXiv:2103.06347>. For a more detailed explanation and applied examples of the fabisearch package, please see Ondrus and Cribben (2022), preprint.

Maintained by Martin Ondrus. Last updated 7 months ago.

1 star 3.00 score 2 scripts

z267xu

vennplot:Venn Diagrams in 2D and 3D

Calculate and plot Venn diagrams in 2D and 3D.

Maintained by Zehao Xu. Last updated 7 years ago.

cpp

2 stars 3.00 score 3 scripts

nialsig

doolkit:Exploration of Dental Surface Topography

Tools for exploring the topography of 3d triangle meshes. The functions were developed with dental surfaces in mind, but could be applied to any triangle mesh of class 'mesh3d'. More specifically, 'doolkit' can isolate the border of a mesh, or a subpart of the mesh using the polygon networks method; crop a mesh; compute basic descriptors (elevation, orientation, footprint area); compute slope, angularity and relief index (Ungar and Williamson (2000) <https://palaeo-electronica.org/2000_1/gorilla/issue1_00.htm>; Boyer (2008) <doi:10.1016/j.jhevol.2008.08.002>), inclination and occlusal relief index or gamma (Guy et al. (2013) <doi:10.1371/journal.pone.0066142>), OPC (Evans et al. (2007) <doi:10.1038/nature05433>), OPCR (Wilson et al. (2012) <doi:10.1038/nature10880>), DNE (Bunn et al. (2011) <doi:10.1002/ajpa.21489>; Pampush et al. (2016) <doi:10.1007/s10914-016-9326-0>), form factor (Horton (1932) <doi:10.1029/TR013i001p00350>), basin elongation (Schumm (1956) <doi:10.1130/0016-7606(1956)67[597:EODSAS]2.0.CO;2>), lemniscate ratio (Chorley et al. (1957) <doi:10.2475/ajs.255.2.138>), enamel-dentine distance (Guy et al. (2015) <doi:10.1371/journal.pone.0138802>; Thiery et al. (2017) <doi:10.3389/fphys.2017.00524>), absolute crown strength (Schwartz et al. (2020) <doi:10.1098/rsbl.2019.0671>), relief rate (Thiery et al. (2019) <doi:10.1002/ajpa.23916>) and area-relative curvature; draw cumulative profiles of a topographic variable; and map a variable over a 3d triangle mesh.

Maintained by Ghislain Thiery. Last updated 1 year ago.

1 star 2.70 score 1 script