R-universe search: neighbors

jlmelville

rnndescent:Nearest Neighbor Descent Method for Approximate Nearest Neighbors

The Nearest Neighbor Descent method for finding approximate nearest neighbors by Dong and co-workers (2010) <doi:10.1145/1963405.1963487>. Based on the 'Python' package 'PyNNDescent' <https://github.com/lmcinnes/pynndescent>.

Maintained by James Melville. Last updated 8 months ago.

approximate-nearest-neighbor-search cpp

57.6 match 11 stars 7.31 score 75 scripts

satijalab

Seurat:Tools for Single Cell Genomics

A toolkit for quality control, analysis, and exploration of single cell RNA sequencing data. 'Seurat' aims to enable users to identify and interpret sources of heterogeneity from single cell transcriptomic measurements, and to integrate diverse types of single cell data. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031>, and Hao, Hao, et al (2020) <doi:10.1101/2020.10.12.335331> for more details.

Maintained by Paul Hoffman. Last updated 1 year ago.

human-cell-atlas single-cell-genomics single-cell-rna-seq cpp

19.3 match 2.4k stars 16.86 score 50k scripts 73 dependents

jlmelville

RcppHNSW:'Rcpp' Bindings for 'hnswlib', a Library for Approximate Nearest Neighbors

'Hnswlib' is a C++ library for Approximate Nearest Neighbors. This package provides a minimal R interface by relying on the 'Rcpp' package. See <https://github.com/nmslib/hnswlib> for more on 'hnswlib'. 'hnswlib' is released under Version 2.0 of the Apache License.

Maintained by James Melville. Last updated 3 months ago.

approximate-nearest-neighbor-search hnsw k-nearest-neighbors knn nearest-neighbor-search nmslib rcpp cpp

30.1 match 36 stars 10.07 score 63 scripts 77 dependents

statnet

ergm:Fit, Simulate and Diagnose Exponential-Family Models for Networks

An integrated set of tools to analyze and simulate networks based on exponential-family random graph models (ERGMs). 'ergm' is a part of the Statnet suite of packages for network analysis. See Hunter, Handcock, Butts, Goodreau, and Morris (2008) <doi:10.18637/jss.v024.i03> and Krivitsky, Hunter, Morris, and Klumb (2023) <doi:10.18637/jss.v105.i06>.

Maintained by Pavel N. Krivitsky. Last updated 7 days ago.

18.0 match 100 stars 15.36 score 1.4k scripts 36 dependents

bioc

BiocNeighbors:Nearest Neighbor Detection for Bioconductor Packages

Implements exact and approximate methods for nearest neighbor detection, in a framework that allows them to be easily switched within Bioconductor packages or workflows. Exact searches can be performed using the k-means for k-nearest neighbors algorithm or with vantage point trees. Approximate searches can be performed using the Annoy or HNSW libraries. Searching on either Euclidean or Manhattan distances is supported. Parallelization is achieved for all methods by using BiocParallel. Functions are also provided to search for all neighbors within a given distance.

Maintained by Aaron Lun. Last updated 12 days ago.

clustering classification cpp

25.9 match 10.14 score 646 scripts 89 dependents

satijalab

SeuratObject:Data Structures for Single Cell Data

Defines S4 classes for single-cell genomic data and associated information, such as dimensionality reduction embeddings, nearest-neighbor graphs, and spatially-resolved coordinates. Provides data access methods and R-native hooks to ensure the Seurat object is familiar to other R users. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, and Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031> for more details.

Maintained by Paul Hoffman. Last updated 1 year ago.

cpp

20.9 match 25 stars 11.69 score 1.2k scripts 88 dependents

josiahparry

sfdep:Spatial Dependence for Simple Features

An interface to 'spdep' to integrate with 'sf' objects and the 'tidyverse'.

Maintained by Dexter Locke. Last updated 6 months ago.

r-spatial spatial

30.7 match 130 stars 7.01 score 130 scripts

mhahsler

dbscan:Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Related Algorithms

A fast reimplementation of several density-based algorithms of the DBSCAN family. Includes the clustering algorithms DBSCAN (density-based spatial clustering of applications with noise) and HDBSCAN (hierarchical DBSCAN), the ordering algorithm OPTICS (ordering points to identify the clustering structure), shared nearest neighbor clustering, and the outlier detection algorithms LOF (local outlier factor) and GLOSH (global-local outlier score from hierarchies). The implementations use the kd-tree data structure (from library ANN) for faster k-nearest neighbor search. An R interface to fast kNN and fixed-radius NN search is also provided. Hahsler, Piekenbrock and Doran (2019) <doi:10.18637/jss.v091.i01>.

Maintained by Michael Hahsler. Last updated 2 months ago.

clustering dbscan density-based-clustering hdbscan lof optics cpp

13.6 match 321 stars 15.62 score 1.6k scripts 84 dependents

geodacenter

rgeoda:R Library for Spatial Data Analysis

Provides spatial data analysis functionalities including Exploratory Spatial Data Analysis, Spatial Cluster Detection and Clustering Analysis, Regionalization, etc. based on the C++ source code of 'GeoDa', which is an open-source software tool that serves as an introduction to spatial data analysis. The 'GeoDa' software and its documentation are available at <https://geodacenter.github.io>.

Maintained by Xun Li. Last updated 9 days ago.

dataanalysis geoda geospatial cpp

22.3 match 73 stars 7.85 score 179 scripts 1 dependent

eddelbuettel

RcppAnnoy:'Rcpp' Bindings for 'Annoy', a Library for Approximate Nearest Neighbors

'Annoy' is a small C++ library for Approximate Nearest Neighbors written for efficient memory usage as well as an ability to load from / save to disk. This package provides an R interface by relying on the 'Rcpp' package, exposing the same interface as the original Python wrapper to 'Annoy'. See <https://github.com/spotify/annoy> for more on 'Annoy'. 'Annoy' is released under Version 2.0 of the Apache License. Also included is a small Windows port of 'mmap' which is released under the MIT license.

Maintained by Dirk Eddelbuettel. Last updated 8 days ago.

annoy nearest nearest-neighbors cpp

14.4 match 72 stars 11.97 score 57 scripts 147 dependents

klausvigo

kknn:Weighted k-Nearest Neighbors

Weighted k-Nearest Neighbors for Classification, Regression and Clustering.

Maintained by Klaus Schliep. Last updated 4 years ago.

nearest-neighbor

14.7 match 23 stars 11.08 score 4.6k scripts 41 dependents

prodriguezsosa

conText:'a la Carte' on Text (ConText) Embedding Regression

A fast, flexible and transparent framework to estimate context-specific word and short document embeddings using the 'a la carte' embeddings approach developed by Khodak et al. (2018) <arXiv:1805.05388> and evaluate hypotheses about covariate effects on embeddings using the regression framework developed by Rodriguez et al. (2021) <https://github.com/prodriguezsosa/EmbeddingRegression>.

Maintained by Pedro L. Rodriguez. Last updated 11 months ago.

17.0 match 104 stars 9.40 score 1.7k scripts

igraph

igraph:Network Analysis and Visualization

Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.

Maintained by Kirill Müller. Last updated 5 hours ago.

complex-networks graph-algorithms graph-theory mathematics network-analysis network-graph fortran libxml2 glpk openblas cpp

6.8 match 582 stars 21.11 score 31k scripts 1.9k dependents

rich-iannone

DiagrammeR:Graph/Network Visualization

Build graph/network structures using functions for stepwise addition and deletion of nodes and edges. Work with data available in tables for bulk addition of nodes, edges, and associated metadata. Use graph selections and traversals to apply changes to specific nodes or edges. A wide selection of graph algorithms allows for the analysis of graphs. Visualize the graphs and take advantage of any aesthetic properties assigned to nodes and edges.

Maintained by Richard Iannone. Last updated 2 months ago.

graph graph-functions network-graph property-graph visualization

8.9 match 1.7k stars 15.18 score 3.8k scripts 87 dependents

paulnorthrop

donut:Nearest Neighbour Search with Variables on a Torus

Finds the k nearest neighbours in a dataset of specified points, adding the option to wrap certain variables on a torus. The user chooses the algorithm to use to find the nearest neighbours. Two such algorithms, provided by the packages 'RANN' <https://cran.r-project.org/package=RANN>, and 'nabor' <https://cran.r-project.org/package=nabor>, are suggested.

Maintained by Paul J. Northrop. Last updated 2 years ago.

degrees donut edges knn-algorithm knn-search nabor nearest nearest-neighbor nearest-neighbor-search nearest-neighbors nearest-neighbour-algorithm nearest-neighbours neighbors periodicity rann torus wrap

31.7 match 1 star 4.18 score 5 scripts 1 dependent
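To illustrate what wrapping a variable on a torus means for a neighbour search, here is a minimal brute-force sketch in Python. It is not the 'donut' implementation, which delegates the search to 'RANN' or 'nabor'; the function names `torus_distance` and `knn_torus` and the `ranges` argument are hypothetical, chosen only for this example.

```python
import math


def torus_distance(a, b, ranges):
    """Euclidean distance in which selected coordinates wrap around.

    ranges[i] is None for an ordinary coordinate, or a (low, high) pair
    for a coordinate that is periodic on the interval [low, high).
    """
    total = 0.0
    for x, y, rng in zip(a, b, ranges):
        d = abs(x - y)
        if rng is not None:
            period = rng[1] - rng[0]
            d = d % period
            d = min(d, period - d)  # take the shorter way around
        total += d * d
    return math.sqrt(total)


def knn_torus(data, query, k, ranges):
    """Return (index, distance) pairs for the k nearest points (brute force)."""
    scored = [(i, torus_distance(p, query, ranges)) for i, p in enumerate(data)]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# First coordinate is an angle in degrees (periodic), second is ordinary.
points = [(10.0, 0.0), (350.0, 0.0), (180.0, 0.0)]
ranges = [(0.0, 360.0), None]
nearest = knn_torus(points, (355.0, 0.0), 2, ranges)
```

With wrapping, the point at 10 degrees is only 15 degrees from the query at 355 degrees, so it ranks second instead of last.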

cran

FNN:Fast Nearest Neighbor Search Algorithms and Applications

Cover-tree and kd-tree fast k-nearest neighbor search algorithms and related applications including KNN classification, regression and information measures are implemented.

Maintained by Shengqiao Li. Last updated 6 months ago.

cpp

13.0 match 5 stars 9.94 score 2.4k scripts 594 dependents

frbcesab

chessboard:Create Network Connections Based on Chess Moves

Provides functions to work with directed (asymmetric) and undirected (symmetric) spatial networks. It makes the creation of connectivity matrices easier, i.e. a binary matrix of dimension n x n, where n is the number of nodes (sampling units) indicating the presence (1) or the absence (0) of an edge (link) between pairs of nodes. Different network objects can be produced by 'chessboard': node list, neighbor list, edge list, connectivity matrix. It can also produce objects that will be used later in Moran's Eigenvector Maps (Dray et al. (2006) <doi:10.1016/j.ecolmodel.2006.02.015>) and Asymmetric Eigenvector Maps (Blanchet et al. (2008) <doi:10.1016/j.ecolmodel.2008.04.001>), methods available in the package 'adespatial' (Dray et al. (2023) <https://CRAN.R-project.org/package=adespatial>). This work is part of the FRB-CESAB working group Bridge <https://www.fondationbiodiversite.fr/en/the-frb-in-action/programs-and-projects/le-cesab/bridge/>.

Maintained by Nicolas Casajus. Last updated 1 year ago.

connectivity-matrix directed-networks neighborhood network one-dimensional-networks spatial-networks two-dimensional-networks undirected-networks

25.5 match 4 stars 4.78 score
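As a rough illustration of the n x n binary connectivity matrix described above, the following Python sketch links cells of a regular grid to their rook-style neighbours (one step right or down). It is not the 'chessboard' API; the function name `rook_connectivity`, the row-major node numbering, and the `directed` flag are assumptions made for this example.

```python
def rook_connectivity(n_rows, n_cols, directed=False):
    """Build an n x n 0/1 connectivity matrix for a grid of n_rows * n_cols nodes.

    Nodes are numbered row by row. Each node is linked to the node on its
    right and the node below it; when directed is False the reverse links
    are added too, which makes the matrix symmetric.
    """
    n = n_rows * n_cols
    matrix = [[0] * n for _ in range(n)]

    def node(r, c):
        return r * n_cols + c

    for r in range(n_rows):
        for c in range(n_cols):
            for dr, dc in ((0, 1), (1, 0)):  # right, then down
                rr, cc = r + dr, c + dc
                if rr < n_rows and cc < n_cols:
                    matrix[node(r, c)][node(rr, cc)] = 1
                    if not directed:
                        matrix[node(rr, cc)][node(r, c)] = 1
    return matrix


undirected = rook_connectivity(2, 2)
directed = rook_connectivity(2, 2, directed=True)
```

The undirected matrix equals its own transpose, while the directed one records each link in one direction only.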

mlampros

KernelKnn:Kernel k Nearest Neighbors

Extends the simple k-nearest neighbors algorithm by incorporating numerous kernel functions and a variety of distance metrics. The package takes advantage of 'RcppArmadillo' to speed up the calculation of distances between observations.

Maintained by Lampros Mouselimis. Last updated 2 years ago.

cpp11 distance-metric kernel-methods knn rcpparmadillo openblas cpp openmp

12.4 match 17 stars 9.16 score 54 scripts 13 dependents

gmcmacran

dann:Discriminant Adaptive Nearest Neighbor Classification

Discriminant Adaptive Nearest Neighbor Classification is a variation of k nearest neighbors where the shape of the neighborhood is data driven. This package implements dann and sub_dann from Hastie (1996) <https://web.stanford.edu/~hastie/Papers/dann_IEEE.pdf>.

Maintained by Greg McMahan. Last updated 8 months ago.

openblas cpp

28.7 match 3.74 score 37 scripts

bioc

CatsCradle:This package provides methods for analysing spatial transcriptomics data and for discovering gene clusters

This package addresses two broad areas. It allows for in-depth analysis of spatial transcriptomic data by identifying tissue neighbourhoods. These are contiguous regions of tissue surrounding individual cells. 'CatsCradle' allows for the categorisation of neighbourhoods by the cell types contained in them and the genes expressed in them. In particular, it produces Seurat objects whose individual elements are neighbourhoods rather than cells. In addition, it enables the categorisation and annotation of genes by producing Seurat objects whose elements are genes.

Maintained by Michael Shapiro. Last updated 1 month ago.

biologicalquestion statisticalmethod geneexpression singlecell transcriptomics spatial

15.9 match 3 stars 6.50 score

tidymodels

dials:Tools for Creating Tuning Parameter Values

Many models contain tuning parameters (i.e. parameters that cannot be directly estimated from the data). These tools can be used to define objects for creating, simulating, or validating values for such parameters.

Maintained by Hannah Frick. Last updated 30 days ago.

7.1 match 114 stars 14.31 score 426 scripts 52 dependents

hannameyer

CAST:'caret' Applications for Spatial-Temporal Models

Supporting functionality to run 'caret' with spatial or spatial-temporal data. 'caret' is a frequently used package for model training and prediction using machine learning. CAST includes functions to improve spatial or spatial-temporal modelling tasks using 'caret'. It includes the newly suggested 'Nearest neighbor distance matching' cross-validation to estimate the performance of spatial prediction models and allows for spatial variable selection to select suitable predictor variables in view of their contribution to the spatial model performance. CAST further includes functionality to estimate the (spatial) area of applicability of prediction models. Methods are described in Meyer et al. (2018) <doi:10.1016/j.envsoft.2017.12.001>; Meyer et al. (2019) <doi:10.1016/j.ecolmodel.2019.108815>; Meyer and Pebesma (2021) <doi:10.1111/2041-210X.13650>; Milà et al. (2022) <doi:10.1111/2041-210X.13851>; Meyer and Pebesma (2022) <doi:10.1038/s41467-022-29838-9>; Linnenbrink et al. (2023) <doi:10.5194/egusphere-2023-1308>; Schumacher et al. (2024) <doi:10.5194/egusphere-2024-2730>. The package is described in detail in Meyer et al. (2024) <doi:10.48550/arXiv.2404.06978>.

Maintained by Hanna Meyer. Last updated 2 months ago.

autocorrelation caret feature-selection machine-learning overfitting predictive-modeling spatial spatio-temporal variable-selection

8.0 match 114 stars 11.97 score 298 scripts 1 dependent

jefferislab

RANN:Fast Nearest Neighbour Search (Wraps ANN Library) Using L2 Metric

Finds the k nearest neighbours for every point in a given dataset in O(N log N) time using Arya and Mount's ANN library (v1.1.3). There is support for approximate as well as exact searches, fixed radius searches and 'bd' as well as 'kd' trees. The distance is computed using the L2 (Euclidean) metric. Please see package 'RANN.L1' for the same functionality using the L1 (Manhattan, taxicab) metric.

Maintained by Gregory Jefferis. Last updated 7 months ago.

ann-library nearest-neighbors nearest-neighbours cpp

7.5 match 58 stars 12.21 score 1.3k scripts 190 dependents

jeffreyevans

yaImpute:Nearest Neighbor Observation Imputation and Evaluation Tools

Performs nearest neighbor-based imputation using one or more alternative approaches to processing multivariate data. These include methods based on canonical correlation analysis, canonical correspondence analysis, and a multivariate adaptation of the random forest classification and regression techniques of Leo Breiman and Adele Cutler. Additional methods are also offered. The package includes functions for comparing the results from running alternative techniques, detecting imputation targets that are notably distant from reference observations, detecting and correcting for bias, bootstrapping and building ensemble imputations, and mapping results.

Maintained by Jeffrey S. Evans. Last updated 6 months ago.

imputation cpp

12.3 match 3 stars 7.40 score 94 scripts 12 dependents

jfrench

smerc:Statistical Methods for Regional Counts

Implements statistical methods for analyzing the counts of areal data, with a focus on the detection of spatial clusters and clustering. The package has a heavy emphasis on spatial scan methods, which were first introduced by Kulldorff and Nagarwalla (1995) <doi:10.1002/sim.4780140809> and Kulldorff (1997) <doi:10.1080/03610929708831995>.

Maintained by Joshua French. Last updated 5 months ago.

cpp

12.9 match 3 stars 6.11 score 45 scripts 3 dependents

kisungyou

Rdimtools:Dimension Reduction and Estimation Methods

We provide linear and nonlinear dimension reduction techniques. Intrinsic dimension estimation methods for exploratory analysis are also provided. For more details on the package, see the paper by You and Shung (2022) <doi:10.1016/j.simpa.2022.100414>.

Maintained by Kisung You. Last updated 2 years ago.

dimension-estimation dimension-reduction manifold-learning subspace-learning openblas cpp openmp

9.3 match 52 stars 8.37 score 186 scripts 8 dependents

bioc

ChemmineR:Cheminformatics Toolkit for R

ChemmineR is a cheminformatics package for analyzing drug-like small molecule data in R. Its latest version contains functions for efficient processing of large numbers of molecules, physicochemical/structural property predictions, structural similarity searching, classification and clustering of compound libraries with a wide spectrum of algorithms. In addition, it offers visualization functions for compound clustering results and chemical structures.

Maintained by Thomas Girke. Last updated 5 months ago.

cheminformatics biomedicalinformatics pharmacogenetics pharmacogenomics microtitreplateassay cellbasedassays visualization infrastructure dataimport clustering proteomics metabolomics cpp

8.3 match 14 stars 9.42 score 253 scripts 12 dependents

michaeldorman

nngeo:k-Nearest Neighbor Join for Spatial Data

K-nearest neighbor search for projected and non-projected 'sf' spatial layers. Nearest neighbor search uses (1) C code from 'GeographicLib' for lon-lat point layers, (2) function knn() from package 'nabor' for projected point layers, or (3) function st_distance() from package 'sf' for line or polygon layers. The package also includes several other utility functions for spatial analysis.

Maintained by Michael Dorman. Last updated 11 months ago.

8.0 match 81 stars 9.70 score 600 scripts 6 dependents

bioc

GenomicDistributions:GenomicDistributions: fast analysis of genomic intervals with Bioconductor

If you have a set of genomic ranges, this package can help you with visualization and comparison. It produces several kinds of plots, for example: chromosome distribution plots, which visualize how your regions are distributed over chromosomes; feature distance distribution plots, which visualize how your regions are distributed relative to a feature of interest, like Transcription Start Sites (TSSs); genomic partition plots, which visualize how your regions overlap given genomic features such as promoters, introns, exons, or intergenic regions. It also makes it easy to compare one set of ranges to another.

Maintained by Kristyna Kupkova. Last updated 5 months ago.

software genomeannotation genomeassembly datarepresentation sequencing coverage functionalgenomics visualization

10.1 match 26 stars 7.44 score 25 scripts

lkremer

ggpointdensity:A Cross Between a 2D Density Plot and a Scatter Plot

A cross between a 2D density plot and a scatter plot, implemented as a 'ggplot2' geom. Points in the scatter plot are colored by the number of neighboring points. This is useful to visualize the 2D-distribution of points in case of overplotting.

Maintained by Lukas P. M. Kremer. Last updated 10 months ago.

2d-density-plot density-visualization geom ggplot-extension ggplot2 ggplot2-enhancements ggplot2-geoms neighboring-points scatter-plot visualization

8.0 match 424 stars 9.30 score 1.1k scripts 4 dependents

welch-lab

cytosignal:What the Package Does (One Line, Title Case)

What the package does (one paragraph).

Maintained by Jialin Liu. Last updated 7 days ago.

openblas cpp

11.8 match 16 stars 5.95 score 6 scripts

topepo

caret:Classification and Regression Training

Misc functions for training and plotting classification and regression models.

Maintained by Max Kuhn. Last updated 3 months ago.

3.6 match 1.6k stars 19.24 score 61k scripts 303 dependents

jkrijthe

Rtsne:T-Distributed Stochastic Neighbor Embedding using a Barnes-Hut Implementation

An R wrapper around the fast T-distributed Stochastic Neighbor Embedding implementation by Van der Maaten (see <https://github.com/lvdmaaten/bhtsne/> for more information on the original implementation).

Maintained by Jesse Krijthe. Last updated 9 months ago.

openblas cpp openmp

5.0 match 256 stars 13.95 score 4.4k scripts 231 dependents

ssnn-airr

shazam:Immunoglobulin Somatic Hypermutation Analysis

Provides a computational framework for analyzing mutations in immunoglobulin (Ig) sequences. Includes methods for Bayesian estimation of antigen-driven selection pressure, mutational load quantification, building of somatic hypermutation (SHM) models, and model-dependent distance calculations. Also includes empirically derived models of SHM for both mice and humans. Citations: Gupta and Vander Heiden, et al (2015) <doi:10.1093/bioinformatics/btv359>, Yaari, et al (2012) <doi:10.1093/nar/gks457>, Yaari, et al (2013) <doi:10.3389/fimmu.2013.00358>, Cui, et al (2016) <doi:10.4049/jimmunol.1502263>.

Maintained by Susanna Marquez. Last updated 2 months ago.

8.4 match 7.43 score 222 scripts 2 dependents

jefferis

nabor:Wraps 'libnabo', a Fast K Nearest Neighbour Library for Low Dimensions

An R wrapper for 'libnabo', an exact or approximate k nearest neighbour library which is optimised for low dimensional spaces (e.g. 3D). 'libnabo' has speed and space advantages over the 'ANN' library wrapped by package 'RANN'. 'nabor' includes a knn function that is designed as a drop-in replacement for 'RANN' function nn2. In addition, objects which include the k-d tree search structure can be returned to speed up repeated queries of the same set of target points.

Maintained by Gregory Jefferis. Last updated 5 years ago.

libnabo nearest-neighbors cpp

7.5 match 22 stars 8.21 score 104 scripts 34 dependents

kjhealy

gssrdoc:Document General Social Survey Variable

The General Social Survey (GSS) is a long-running, mostly annual survey of US households. It is administered by the National Opinion Research Center (NORC). This package contains a tibble with information on the survey variables, together with every variable documented as an R help page. For more information on the GSS see <http://gss.norc.org>.

Maintained by Kieran Healy. Last updated 11 months ago.

26.7 match 2.28 score 38 scripts

dvrbts

labdsv:Ordination and Multivariate Analysis for Ecology

A variety of ordination and community analyses useful in analysis of data sets in community ecology. Includes many of the common ordination methods, with graphical routines to facilitate their interpretation, as well as several novel analyses.

Maintained by David W. Roberts. Last updated 2 years ago.

fortran

9.6 match 3 stars 6.08 score 452 scripts 13 dependents

ludovikcoba

rrecsys:Environment for Evaluating Recommender Systems

Processes standard recommendation datasets (e.g., a user-item rating matrix) as input and generates rating predictions and lists of recommended items. Standard algorithm implementations which are included in this package are the following: Global/Item/User-Average baselines, Weighted Slope One, Item-Based KNN, User-Based KNN, FunkSVD, BPR and weighted ALS. They can be assessed according to the standard offline evaluation methodology (Shani, et al. (2011) <doi:10.1007/978-0-387-85820-3_8>) for recommender systems using measures such as MAE, RMSE, Precision, Recall, F1, AUC, NDCG, RankScore and coverage measures. The package (Coba, et al. (2017) <doi:10.1007/978-3-319-60042-0_36>) is intended for rapid prototyping of recommendation algorithms and education purposes.

Maintained by Ludovik Çoba. Last updated 3 years ago.

cpp

8.3 match 23 stars 6.84 score 25 scripts

bioc

recountmethylation:Access and analyze public DNA methylation array data compilations

Resources for cross-study analyses of public DNAm array data from NCBI GEO repo, produced using Illumina's Infinium HumanMethylation450K (HM450K) and MethylationEPIC (EPIC) platforms. Provided functions enable download, summary, and filtering of large compilation files. Vignettes detail background about file formats, example analyses, and more. Note the disclaimer on package load and consult the main manuscripts for further info.

Maintained by Sean K Maden. Last updated 5 months ago.

dnamethylation epigenetics microarray methylationarray experimenthub

8.8 match 9 stars 6.28 score 9 scripts

r-lidar

lidR:Airborne LiDAR Data Manipulation and Visualization for Forestry Applications

Airborne LiDAR (Light Detection and Ranging) interface for data manipulation and visualization. Read/write 'las' and 'laz' files, computation of metrics in area based approach, point filtering, artificial point reduction, classification from geographic data, normalization, individual tree segmentation and other manipulations.

Maintained by Jean-Romain Roussel. Last updated 1 month ago.

als forestry las laz lidar point-cloud remote-sensing openblas cpp openmp

3.8 match 623 stars 14.47 score 844 scripts 8 dependents

mlampros

nmslibR:Non Metric Space (Approximate) Library

A Non-Metric Space Library ('NMSLIB' <https://github.com/nmslib/nmslib>) wrapper, which according to the authors "is an efficient cross-platform similarity search library and a toolkit for evaluation of similarity search methods. The goal of the 'NMSLIB' <https://github.com/nmslib/nmslib> Library is to create an effective and comprehensive toolkit for searching in generic non-metric spaces. Being comprehensive is important, because no single method is likely to be sufficient in all cases. Also note that exact solutions are hardly efficient in high dimensions and/or non-metric spaces. Hence, the main focus is on approximate methods". The wrapper also includes Approximate Kernel k-Nearest-Neighbor functions based on the 'NMSLIB' <https://github.com/nmslib/nmslib> 'Python' Library.

Maintained by Lampros Mouselimis. Last updated 2 years ago.

approximate-nearest-neighbor-search nmslib non-metric python reticulate cpp openmp

10.1 match 12 stars 5.14 score 23 scripts

bioc

scrapper:Bindings to C++ Libraries for Single-Cell Analysis

Implements R bindings to C++ code for analyzing single-cell (expression) data, mostly from various libscran libraries. Each function performs an individual step in the single-cell analysis workflow, ranging from quality control to clustering and marker detection. It is mostly intended for other Bioconductor package developers to build more user-friendly end-to-end workflows.

Maintained by Aaron Lun. Last updated 5 days ago.

normalization rnaseq software geneexpression transcriptomics singlecell batcheffect qualitycontrol differentialexpression featureextraction principalcomponent clustering openblas cpp

9.2 match 5.55 score 32 scripts

bioc

matter:Out-of-core statistical computing and signal processing

Toolbox for larger-than-memory scientific computing and visualization, providing efficient out-of-core data structures using files or shared memory, for dense and sparse vectors, matrices, and arrays, with applications to nonuniformly sampled signals and images.

Maintained by Kylie A. Bemis. Last updated 3 months ago.

infrastructure datarepresentation dataimport dimensionreduction preprocessing cpp

5.3 match 57 stars 9.52 score 64 scripts 2 dependents

bioc

batchelor:Single-Cell Batch Correction Methods

Implements a variety of methods for batch correction of single-cell (RNA sequencing) data. This includes methods based on detecting mutually nearest neighbors, as well as several efficient variants of linear regression of the log-expression values. Functions are also provided to perform global rescaling to remove differences in depth between batches, and to perform a principal components analysis that is robust to differences in the numbers of cells across batches.

Maintained by Aaron Lun. Last updated 3 days ago.

sequencing rnaseq software geneexpression transcriptomics singlecell batcheffect normalization cpp

5.5 match 9.10 score 1.2k scripts 10 dependents

bioc

singleCellTK:Comprehensive and Interactive Analysis of Single Cell RNA-Seq Data

The Single Cell Toolkit (SCTK) in the singleCellTK package provides an interface to popular tools for importing, quality control, analysis, and visualization of single cell RNA-seq data. SCTK allows users to seamlessly integrate tools from various packages at different stages of the analysis workflow. A general "a la carte" workflow gives users access to multiple methods for data importing, calculation of general QC metrics, doublet detection, ambient RNA estimation and removal, filtering, normalization, batch correction or integration, dimensionality reduction, 2-D embedding, clustering, marker detection, differential expression, cell type labeling, pathway analysis, and data exporting. Curated workflows can be used to run Seurat and Celda. Streamlined quality control can be performed on the command line using the SCTK-QC pipeline. Users can analyze their data using commands in the R console or by using an interactive Shiny Graphical User Interface (GUI). Specific analyses or entire workflows can be summarized and shared with comprehensive HTML reports generated by Rmarkdown. Additional documentation and vignettes can be found at camplab.net/sctk.

Maintained by Joshua David Campbell. Last updated 24 days ago.

singlecell geneexpression differentialexpression alignment clustering immunooncology batcheffect normalization qualitycontrol dataimport gui

4.9 match 181 stars 10.16 score 252 scripts

bioc

RCy3:Functions to Access and Control Cytoscape

Visualize, analyze and explore networks using Cytoscape via R. Anything you can do using the graphical user interface of Cytoscape, you can now do with a single RCy3 function.

Maintained by Alex Pico. Last updated 5 months ago.

visualization graphandnetwork thirdpartyclient network

3.6 match 52 stars 13.39 score 628 scripts 15 dependents

bioc

tidytof:Analyze High-dimensional Cytometry Data Using Tidy Data Principles

This package implements an interactive, scientific analysis pipeline for high-dimensional cytometry data built using tidy data principles. It is specifically designed to play well with both the tidyverse and Bioconductor software ecosystems, with functionality for reading/writing data files, data cleaning, preprocessing, clustering, visualization, modeling, and other quality-of-life functions. tidytof implements a "grammar" of high-dimensional cytometry data analysis.

Maintained by Timothy Keyes. Last updated 5 months ago.

singlecell flowcytometry bioinformatics cytometry data-science single-cell tidyverse cpp

6.7 match 18 stars 7.24 score 35 scripts

kosukeimai

MatchIt:Nonparametric Preprocessing for Parametric Causal Inference

Selects matched samples of the original treated and control groups with similar covariate distributions -- can be used to match exactly on covariates, to match on propensity scores, or perform a variety of other matching procedures. The package also implements a series of recommendations offered in Ho, Imai, King, and Stuart (2007) <DOI:10.1093/pan/mpl013>. (The 'gurobi' package, which is not on CRAN, is optional and comes with an installation of the Gurobi Optimizer, available at <https://www.gurobi.com>.)

Maintained by Noah Greifer. Last updated 2 days ago.

cpp openmp

3.2 match 220 stars 15.03 score 2.4k scripts 21 dependents

jhmadsen

DDoutlier:Distance & Density-Based Outlier Detection

Outlier detection in multidimensional domains. Implementation of notable distance and density-based outlier algorithms. Allows users to identify local outliers by comparing observations to their nearest neighbors, reverse nearest neighbors, shared neighbors or natural neighbors. For distance-based approaches, see Knorr, M., & Ng, R. T. (1997) <doi:10.1145/782010.782021>, Angiulli, F., & Pizzuti, C. (2002) <doi:10.1007/3-540-45681-3_2>, Hautamaki, V., & Ismo, K. (2004) <doi:10.1109/ICPR.2004.1334558> and Zhang, K., Hutter, M. & Jin, H. (2009) <doi:10.1007/978-3-642-01307-2_84>. For density-based approaches, see Tang, J., Chen, Z., Fu, A. W. C., & Cheung, D. W. (2002) <doi:10.1007/3-540-47887-6_53>, Jin, W., Tung, A. K. H., Han, J., & Wang, W. (2006) <doi:10.1007/11731139_68>, Schubert, E., Zimek, A. & Kriegel, H-P. (2014) <doi:10.1137/1.9781611973440.63>, Latecki, L., Lazarevic, A. & Prokrajac, D. (2007) <doi:10.1007/978-3-540-73499-4_6>, Papadimitriou, S., Gibbons, P. B., & Faloutsos, C. (2003) <doi:10.1109/ICDE.2003.1260802>, Breunig, M. M., Kriegel, H.-P., Ng, R. T., & Sander, J. (2000) <doi:10.1145/342009.335388>, Kriegel, H.-P., Kröger, P., Schubert, E., & Zimek, A. (2009) <doi:10.1145/1645953.1646195>, Zhu, Q., Feng, Ji. & Huang, J. (2016) <doi:10.1016/j.patrec.2016.05.007>, Huang, J., Zhu, Q., Yang, L. & Feng, J. (2015) <doi:10.1016/j.knosys.2015.10.014>, Tang, B. & Haibo, He. (2017) <doi:10.1016/j.neucom.2017.02.039> and Gao, J., Hu, W., Zhang, X. & Wu, Ou. (2011) <doi:10.1007/978-3-642-20847-8_23>.

Maintained by Jacob H. Madsen. Last updated 6 years ago.

9.6 match 12 stars 5.00 score 56 scripts 1 dependents

mlr-org

mlr3learners:Recommended Learners for 'mlr3'

Recommended Learners for 'mlr3'. Extends 'mlr3' with interfaces to essential machine learning packages on CRAN. This includes, but is not limited to: (penalized) linear and logistic regression, linear and quadratic discriminant analysis, k-nearest neighbors, naive Bayes, support vector machines, and gradient boosting.

Maintained by Marc Becker. Last updated 4 months ago.

classification learners machine-learning mlr3 regression

4.1 match 91 stars 11.51 score 1.5k scripts 10 dependents

bioimaginggroup

bioimagetools:Tools for Microscopy Imaging

Tools for 3D imaging, mostly for biology/microscopy. Read and write TIFF stacks. Functions for segmentation, filtering and analyzing 3D point patterns.

Maintained by Volker Schmid. Last updated 3 years ago.

biology microscopy

8.8 match 4 stars 5.30 score 33 scripts 1 dependents

juanmartinsantos

rgnoisefilt:Elimination of Noisy Samples in Regression Datasets using Noise Filters

Traditional noise filtering methods aim at removing noisy samples from a classification dataset. This package adapts classic and recent filtering techniques for use in regression problems, and it also incorporates methods specifically designed for regression data. In order to do this, it uses approaches proposed in the specialized literature, such as Martin et al. (2021) [<doi:10.1109/ACCESS.2021.3123151>] and Arnaiz-Gonzalez et al. (2016) [<doi:10.1016/j.eswa.2015.12.046>]. Thus, the goal of the implemented noise filters is to eliminate samples with noise in regression datasets.

Maintained by Juan Martin. Last updated 1 years ago.

11.0 match 2 stars 4.00 score 3 scripts

cran

rNeighborQTL:Interval Mapping for Quantitative Trait Loci Underlying Neighbor Effects

To enable quantitative trait loci mapping of neighbor effects, this package extends a single-marker regression to interval mapping. The theoretical background of the method is described in Sato et al. (2021) <doi:10.1093/g3journal/jkab017>.

Maintained by Yasuhiro Sato. Last updated 4 years ago.

22.0 match 2.00 score 3 scripts

cran

rNeighborGWAS:Testing Neighbor Effects in Marker-Based Regressions

To incorporate neighbor genotypic identity into genome-wide association studies, the package provides a set of functions for variation partitioning and association mapping. The theoretical background of the method is described in Sato et al. (2021) <doi:10.1038/s41437-020-00401-w>.

Maintained by Yasuhiro Sato. Last updated 4 years ago.

17.7 match 2.48 score 15 scripts

prioritizr

prioritizr:Systematic Conservation Prioritization in R

Systematic conservation prioritization using mixed integer linear programming (MILP). It provides a flexible interface for building and solving conservation planning problems. Once built, conservation planning problems can be solved using a variety of commercial and open-source exact algorithm solvers. By using exact algorithm solvers, solutions can be generated that are guaranteed to be optimal (or within a pre-specified optimality gap). Furthermore, conservation problems can be constructed to optimize the spatial allocation of different management actions or zones, meaning that conservation practitioners can identify solutions that benefit multiple stakeholders. To solve large-scale or complex conservation planning problems, users should install the Gurobi optimization software (available from <https://www.gurobi.com/>) and the 'gurobi' R package (see Gurobi Installation Guide vignette for details). Users can also install the IBM CPLEX software (<https://www.ibm.com/products/ilog-cplex-optimization-studio/cplex-optimizer>) and the 'cplexAPI' R package (available at <https://github.com/cran/cplexAPI>). Additionally, the 'rcbc' R package (available at <https://github.com/dirkschumacher/rcbc>) can be used to generate solutions using the CBC optimization software (<https://github.com/coin-or/Cbc>). For further details, see Hanson et al. (2025) <doi:10.1111/cobi.14376>.

Maintained by Richard Schuster. Last updated 12 days ago.

biodiversity conservation conservation-planner optimization prioritization solver spatial cpp

3.5 match 124 stars 11.82 score 584 scripts 2 dependents

rcurtin

mlpack:'Rcpp' Integration for the 'mlpack' Library

A fast, flexible machine learning library, written in C++, that aims to provide fast, extensible implementations of cutting-edge machine learning algorithms. See also Curtin et al. (2023) <doi:10.21105/joss.05026>.

Maintained by Ryan Curtin. Last updated 3 months ago.

openblas cpp openmp

10.9 match 3.71 score 20 scripts 8 dependents

pachadotdev

analogsea:Interface to 'DigitalOcean'

Provides a set of functions for interacting with the 'DigitalOcean' API <https://www.digitalocean.com/>, including creating images, destroying them, rebooting, getting details on regions, and available images.

Maintained by Mauricio Vargas. Last updated 2 years ago.

cloud-computing droplet ssh

5.3 match 159 stars 7.56 score 100 scripts 1 dependents

klausvigo

phangorn:Phylogenetic Reconstruction and Analysis

Allows for estimation of phylogenetic trees and networks using Maximum Likelihood, Maximum Parsimony, distance methods and Hadamard conjugation (Schliep 2011). Offers methods for tree comparison, model selection and visualization of phylogenetic networks as described in Schliep et al. (2017).

Maintained by Klaus Schliep. Last updated 1 months ago.

software technology qualitycontrol phylogenetic-analysis phylogenetics openblas cpp

2.3 match 206 stars 16.69 score 2.5k scripts 135 dependents

nsaph-software

GPCERF:Gaussian Processes for Estimating Causal Exposure Response Curves

Provides a non-parametric Bayesian framework based on Gaussian process priors for estimating causal effects of a continuous exposure and detecting change points in the causal exposure response curves using observational data. Ren, B., Wu, X., Braun, D., Pillai, N., & Dominici, F. (2021). "Bayesian modeling for exposure response curve via gaussian processes: Causal effects of exposure to air pollution on health outcomes." arXiv preprint <doi:10.48550/arXiv.2105.03454>.

Maintained by Boyu Ren. Last updated 11 months ago.

cpp

5.9 match 9 stars 6.33 score 16 scripts

fridleylab

spatialTIME:Spatial Analysis of Vectra Immunofluorescent Data

Visualization and analysis of Vectra Immunofluorescent data. Options for calculating both the univariate and bivariate Ripley's K are included. Calculations are performed using a permutation-based approach presented by Wilson et al. <doi:10.1101/2021.04.27.21256104>.

Maintained by Fridley Lab. Last updated 7 months ago.

6.1 match 4 stars 6.08 score 30 scripts

ghtaranto

scapesClassification:User-Defined Classification of Raster Surfaces

Series of algorithms to translate users' mental models of seascapes, landscapes and, more generally, of geographic features into computer representations (classifications). Spaces and geographic objects are classified with user-defined rules taking into account spatial data as well as spatial relationships among different classes and objects.

Maintained by Gerald H. Taranto. Last updated 3 years ago.

classification-algorithm object-detection raster spatial

8.7 match 1 stars 4.22 score 33 scripts

gpiras

sphet:Estimation of Spatial Autoregressive Models with and without Heteroskedastic Innovations

Functions for fitting Cliff-Ord-type spatial autoregressive models with and without heteroskedastic innovations using Generalized Method of Moments estimation are provided. Some support is available for fitting spatial HAC models, and for fitting with non-spatial endogenous variables using instrumental variables.

Maintained by Gianfranco Piras. Last updated 7 days ago.

4.9 match 8 stars 7.43 score 188 scripts 3 dependents

predictiveecology

NetLogoR:Build and Run Spatially Explicit Agent-Based Models

Build and run spatially explicit agent-based models using only the R platform. 'NetLogoR' follows the same framework as the 'NetLogo' software (Wilensky (1999) <http://ccl.northwestern.edu/netlogo/>) and is a translation into R of the structure and functions of 'NetLogo'. 'NetLogoR' provides new R classes to define model agents and functions to implement spatially explicit agent-based models in the R environment. This package lets users benefit from the fast and easy coding phase of the highly developed 'NetLogo' framework, coupled with the versatility, power and massive resources of the R software. Examples of two models from the NetLogo software repository, Ants (<http://ccl.northwestern.edu/netlogo/models/Ants>) and Wolf-Sheep-Predation (<http://ccl.northwestern.edu/netlogo/models/WolfSheepPredation>), and a third, Butterfly, from Railsback and Grimm (2012) <https://www.railsback-grimm-abm-book.com/>, all written using 'NetLogoR', are available. The 'NetLogo' code of the original version of these models is provided alongside. A programming guide inspired by the 'NetLogo' Programming Guide (<https://ccl.northwestern.edu/netlogo/docs/programming.html>) and a dictionary of equivalences for 'NetLogo' primitives (<https://ccl.northwestern.edu/netlogo/docs/dictionary.html>) are also available. NOTE: To increment 'time', these functions can use a for loop or can be integrated with a discrete event simulator, such as 'SpaDES' (<https://cran.r-project.org/package=SpaDES>). The suggested package 'fastshp' can be installed with 'install.packages("fastshp", repos = ("<https://rforge.net>"), type = "source")'.

Maintained by Eliot J B McIntire. Last updated 4 months ago.

5.3 match 38 stars 6.94 score 19 scripts

grasia

knnp:Time Series Prediction using K-Nearest Neighbors Algorithm (Parallel)

Two main functionalities are provided. One of them is predicting values with the k-nearest neighbors algorithm and the other is optimizing the parameters k and d of the algorithm. These are carried out in parallel using multiple threads.

Maintained by Daniel Bastarrica Lacalle. Last updated 5 years ago.

knearest-neighbor-algorithm parallel time-series-forecasting

13.1 match 1 stars 2.70 score 8 scripts

julianfaraway

faraway:Datasets and Functions for Books by Julian Faraway

Books are "Linear Models with R" published 1st Ed. August 2004, 2nd Ed. July 2014, 3rd Ed. February 2025 by CRC press, ISBN 9781439887332, and "Extending the Linear Model with R" published by CRC press in 1st Ed. December 2005 and 2nd Ed. March 2016, ISBN 9781584884248 and "Practical Regression and ANOVA in R" contributed documentation on CRAN (now very dated).

Maintained by Julian Faraway. Last updated 1 months ago.

data

3.8 match 29 stars 9.43 score 1.7k scripts 1 dependents

rspatial

terra:Spatial Data Analysis

Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).

Maintained by Robert J. Hijmans. Last updated 4 hours ago.

geospatial raster spatial vector onetbb proj gdal geos cpp

2.0 match 559 stars 17.64 score 17k scripts 851 dependents

bioc

HGC:A fast hierarchical graph-based clustering method

HGC (short for Hierarchical Graph-based Clustering) is an R package for conducting hierarchical clustering on large-scale single-cell RNA-seq (scRNA-seq) data. The key idea is to construct a dendrogram of cells on their shared nearest neighbor (SNN) graph. HGC provides functions for building graphs and for conducting hierarchical clustering on the graph. Users with an older R version can visit https://github.com/XuegongLab/HGC/tree/HGC4oldRVersion to get an HGC package built for R 3.6.

Maintained by XGlab. Last updated 5 months ago.

singlecell software clustering rnaseq graphandnetwork dnaseq cpp

7.5 match 4.70 score 25 scripts

jeffreyevans

spatialEco:Spatial Analysis and Modelling Utilities

Utilities to support spatial data manipulation, query, sampling and modelling in ecological applications. Functions include models for species population density, spatial smoothing, multivariate separability, point process model for creating pseudo-absences and sub-sampling, Quadrant-based sampling and analysis, auto-logistic modeling, sampling models, cluster optimization, statistical exploratory tools and raster-based metrics.

Maintained by Jeffrey S. Evans. Last updated 14 days ago.

biodiversity conservation ecology r-spatial raster spatial vector

3.7 match 110 stars 9.55 score 736 scripts 2 dependents

kharchenkolab

N2R:Fast and Scalable Approximate k-Nearest Neighbor Search Methods using 'N2' Library

Implements methods to perform fast approximate K-nearest neighbor search on an input matrix. Algorithm based on the 'N2' implementation of an approximate nearest neighbor search using hierarchical Navigable Small World (NSW) graphs. The original algorithm is described in "Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs", Y. Malkov and D. Yashunin, <doi:10.1109/TPAMI.2018.2889473>, <arXiv:1603.09320>.

Maintained by Evan Biederstedt. Last updated 1 years ago.

cpp

6.9 match 10 stars 5.08 score 3 scripts 2 dependents

bioc

IRanges:Foundation of integer range manipulation in Bioconductor

Provides efficient low-level and highly reusable S4 classes for storing, manipulating and aggregating over annotated ranges of integers. Implements an algebra of range operations, including efficient algorithms for finding overlaps and nearest neighbors. Defines efficient list-like classes for storing, transforming and aggregating large grouped data, i.e., collections of atomic vectors and DataFrames.

Maintained by Hervé Pagès. Last updated 1 months ago.

infrastructure datarepresentation bioconductor-package core-package

2.3 match 22 stars 15.09 score 2.1k scripts 1.8k dependents

biooss

sensitivity:Global Sensitivity Analysis of Model Outputs and Importance Measures

A collection of functions for sensitivity analysis of model outputs (factor screening, global sensitivity analysis and robustness analysis), for variable importance measures of data, as well as for interpretability of machine learning models. Most of the functions have to be applied on scalar output, but several functions support multi-dimensional outputs.

Maintained by Bertrand Iooss. Last updated 7 months ago.

cpp

5.0 match 17 stars 6.74 score 472 scripts 8 dependents

tidymodels

recipes:Preprocessing and Feature Engineering Steps for Modeling

A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps that provide preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.

Maintained by Max Kuhn. Last updated 6 days ago.

1.8 match 584 stars 18.71 score 7.2k scripts 380 dependents

reddertar

smotefamily:A Collection of Oversampling Techniques for Class Imbalance Problem Based on SMOTE

A collection of various oversampling techniques developed from SMOTE is provided. SMOTE is an oversampling technique that synthesizes a new minority instance between a minority instance and one of its K nearest neighbors. Other techniques adopt this concept with other criteria in order to generate a balanced dataset for the class imbalance problem.

Maintained by Wacharasak Siriseriwan. Last updated 1 years ago.

5.7 match 2 stars 5.93 score 512 scripts 8 dependents

plangfelder

WGCNA:Weighted Correlation Network Analysis

Functions necessary to perform Weighted Correlation Network Analysis on high-dimensional data as originally described in Horvath and Zhang (2005) <doi:10.2202/1544-6115.1128> and Langfelder and Horvath (2008) <doi:10.1186/1471-2105-9-559>. Includes functions for rudimentary data cleaning, construction of correlation networks, module identification, summarization, and relating of variables and modules to sample traits. Also includes a number of utility functions for data manipulation and visualization.

Maintained by Peter Langfelder. Last updated 6 months ago.

cpp

3.5 match 54 stars 9.65 score 5.3k scripts 32 dependents

usepa

spsurvey:Spatial Sampling Design and Analysis

A design-based approach to statistical inference, with a focus on spatial data. Spatially balanced samples are selected using the Generalized Random Tessellation Stratified (GRTS) algorithm. The GRTS algorithm can be applied to finite resources (point geometries) and infinite resources (linear / linestring and areal / polygon geometries) and flexibly accommodates a diverse set of sampling design features, including stratification, unequal inclusion probabilities, proportional (to size) inclusion probabilities, legacy (historical) sites, a minimum distance between sites, and two options for replacement sites (reverse hierarchical order and nearest neighbor). Data are analyzed using a wide range of analysis functions that perform categorical variable analysis, continuous variable analysis, attributable risk analysis, risk difference analysis, relative risk analysis, change analysis, and trend analysis. spsurvey can also be used to summarize objects, visualize objects, select samples that are not spatially balanced, select panel samples, measure the amount of spatial balance in a sample, adjust design weights, and more. For additional details, see Dumelle et al. (2023) <doi:10.18637/jss.v105.i03>.

Maintained by Michael Dumelle. Last updated 7 months ago.

ord

3.7 match 15 stars 8.98 score 241 scripts 1 dependents

bioc

Biobase:Biobase: Base functions for Bioconductor

Functions that are needed by many other packages or which replace R functions.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

infrastructure bioconductor-package core-package

2.0 match 9 stars 16.45 score 6.6k scripts 1.8k dependents

tidymodels

parsnip:A Common API to Modeling and Analysis Functions

A common interface is provided to allow users to specify a model without having to remember the different argument names across different functions or computational engines (e.g. 'R', 'Spark', 'Stan', 'H2O', etc).

Maintained by Max Kuhn. Last updated 5 days ago.

2.0 match 612 stars 16.37 score 3.4k scripts 69 dependents

emmanuelparadis

ape:Analyses of Phylogenetics and Evolution

Functions for reading, writing, plotting, and manipulating phylogenetic trees, analyses of comparative data in a phylogenetic framework, ancestral character analyses, analyses of diversification and macroevolution, computing distances from DNA sequences, reading and writing nucleotide sequences as well as importing from BioConductor, and several tools such as Mantel's test, generalized skyline plots, graphical exploration of phylogenetic data (alex, trex, kronoviz), estimation of absolute evolutionary rates and clock-like trees using mean path lengths and penalized likelihood, dating trees with non-contemporaneous sequences, translating DNA into AA sequences, and assessing sequence alignments. Phylogeny estimation can be done with the NJ, BIONJ, ME, MVR, SDM, and triangle methods, and several methods handling incomplete distance matrices (NJ*, BIONJ*, MVR*, and the corresponding triangle method). Some functions call external applications (PhyML, Clustal, T-Coffee, Muscle) whose results are returned into R.

Maintained by Emmanuel Paradis. Last updated 23 hours ago.

openblas cpp

1.9 match 64 stars 17.22 score 13k scripts 599 dependents

janusza

RoughSets:Data Analysis Using Rough Set and Fuzzy Rough Set Theories

Implementations of algorithms for data analysis based on the rough set theory (RST) and the fuzzy rough set theory (FRST). We not only provide implementations for the basic concepts of RST and FRST but also popular algorithms that derive from those theories. The methods included in the package can be divided into several categories based on their functionality: discretization, feature selection, instance selection, rule induction and classification based on nearest neighbors. RST was introduced by Zdzisław Pawlak in 1982 as a sophisticated mathematical tool to model and process imprecise or incomplete information. By using the indiscernibility relation for objects/instances, RST does not require additional parameters to analyze the data. FRST is an extension of RST. The FRST combines concepts of vagueness and indiscernibility that are expressed with fuzzy sets (as proposed by Zadeh, in 1965) and RST.

Maintained by Christoph Bergmeir. Last updated 5 years ago.

cpp

5.7 match 37 stars 5.61 score 185 scripts

freezenik

BayesX:R Utilities Accompanying the Software Package BayesX

Functions for exploring and visualising estimation results obtained with BayesX, a free software for estimating structured additive regression models (<https://www.uni-goettingen.de/de/bayesx/550513.html>). In addition, the package provides functions to read, write and manipulate map objects that are required in spatial analyses performed with BayesX.

Maintained by Nikolaus Umlauf. Last updated 1 years ago.

8.6 match 3.71 score 48 scripts 3 dependents

elvanceyhan

nnspat:Nearest Neighbor Methods for Spatial Patterns

Contains the functions for testing the spatial patterns (of segregation, spatial symmetry, association, disease clustering, species correspondence and reflexivity) based on nearest neighbor relations, especially using contingency tables such as nearest neighbor contingency tables (Ceyhan (2010) <doi:10.1007/s10651-008-0104-x> and Ceyhan (2017) <doi:10.1016/j.jkss.2016.10.002> and references therein), nearest neighbor symmetry contingency tables (Ceyhan (2014) <doi:10.1155/2014/698296>), species correspondence contingency tables and reflexivity contingency tables (Ceyhan (2018) <doi:10.2436/20.8080.02.72>) for two (or higher) dimensional data. Also contains functions for generating patterns of segregation, association, uniformity in a multi-class setting (Ceyhan (2014) <doi:10.1007/s00477-013-0824-9>), and various non-random labeling patterns for disease clustering in two dimensional cases (Ceyhan (2014) <doi:10.1002/sim.6053>), and for visualization of all these patterns for the two dimensional data. The tests are usually (asymptotic) normal z-tests and chi-square tests.

Maintained by Elvan Ceyhan. Last updated 3 years ago.

10.9 match 2.90 score 16 scripts

r-forge

RobAStBase:Robust Asymptotic Statistics

Base S4-classes and functions for robust asymptotic statistics.

Maintained by Matthias Kohl. Last updated 2 months ago.

6.3 match 4.96 score 64 scripts 4 dependents

bioc

GenomicRanges:Representation and manipulation of genomic intervals

The ability to efficiently represent and manipulate genomic annotations and alignments is playing a central role when it comes to analyzing high-throughput sequencing data (a.k.a. NGS data). The GenomicRanges package defines general purpose containers for storing and manipulating genomic intervals and variables defined along a genome. More specialized containers for representing and manipulating short alignments against a reference genome, or a matrix-like summarization of an experiment, are defined in the GenomicAlignments and SummarizedExperiment packages, respectively. Both packages build on top of the GenomicRanges infrastructure.

Maintained by Hervé Pagès. Last updated 4 months ago.

genetics infrastructure datarepresentation sequencing annotation genomeannotation coverage bioconductor-package core-package

1.8 match 44 stars 17.75 score 13k scripts 1.3k dependents

bioc

RAIDS:Accurate Inference of Genetic Ancestry from Cancer Sequences

This package implements specialized algorithms that enable genetic ancestry inference from various cancer sequence sources (RNA, Exome and Whole-Genome sequences). This package also implements a simulation algorithm that generates synthetic cancer-derived data. This code and analysis pipeline was designed and developed for the following publication: Belleau, P et al. Genetic Ancestry Inference from Cancer-Derived Molecular Data across Genomic and Transcriptomic Platforms. Cancer Res 1 January 2023; 83 (1): 49–58.

Maintained by Pascal Belleau. Last updated 5 months ago.

genetics software sequencing wholegenome principalcomponent geneticvariability dimensionreduction biocviews ancestry cancer-genomics exome-sequencing genomics inference r-language rna-seq rna-sequencing whole-genome-sequencing

5.0 match 5 stars 6.23 score 19 scripts

franciscomartinezdelrio

tsfknn:Time Series Forecasting Using Nearest Neighbors

Allows forecasting time series using nearest neighbors regression; see Francisco Martinez, Maria P. Frias, Maria D. Perez-Godoy and Antonio J. Rivera (2019) <doi:10.1007/s10462-017-9593-z>. When the forecasting horizon is higher than 1, two multi-step ahead forecasting strategies can be used. The model built is autoregressive, that is, it is only based on the observations of the time series. The nearest neighbors used in a prediction can be consulted and plotted.

Maintained by Francisco Martinez. Last updated 1 years ago.

cpp

5.6 match 11 stars 5.54 score 63 scripts

freezenik

R2BayesX:Estimate Structured Additive Regression Models with 'BayesX'

An R interface to estimate structured additive regression (STAR) models with 'BayesX'.

Maintained by Nikolaus Umlauf. Last updated 1 years ago.

8.6 match 1 stars 3.55 score 118 scripts 1 dependents

l-ramirez-lopez

resemble:Memory-Based Learning in Spectral Chemometrics

Functions for dissimilarity analysis and memory-based learning (MBL, a.k.a. local modeling) in complex spectral data sets. Most of these functions are based on the methods presented in Ramirez-Lopez et al. (2013) <doi:10.1016/j.geoderma.2012.12.014>.

Maintained by Leonardo Ramirez-Lopez. Last updated 2 years ago.

chemoinformatics chemometrics infrared-spectroscopy lazy-learning local-regression machine-learning memory-based-learning nir pedometrics soil-spectroscopy spectral-data spectral-library spectroscopy openblas cpp openmp

5.2 match 20 stars 5.91 score 27 scripts

animint

animint2:Animated Interactive Grammar of Graphics

Functions are provided for defining animated, interactive data visualizations in R code, and rendering on a web page. The 2018 Journal of Computational and Graphical Statistics paper, <doi:10.1080/10618600.2018.1513367> describes the concepts implemented.

Maintained by Toby Hocking. Last updated 28 days ago.

3.4 match 64 stars 8.87 score 173 scripts

r-spatial

spdep:Spatial Dependence: Weighting Schemes, Statistics

A collection of functions to create spatial weights matrix objects from polygon 'contiguities', from point patterns by distance and tessellations, for summarizing these objects, and for permitting their use in spatial data analysis, including regional aggregation by minimum spanning tree; a collection of tests for spatial 'autocorrelation', including global 'Morans I' and 'Gearys C' proposed by 'Cliff' and 'Ord' (1973, ISBN: 0850860369) and (1981, ISBN: 0850860814), 'Hubert/Mantel' general cross product statistic, Empirical Bayes estimates and 'Assunção/Reis' (1999) <doi:10.1002/(SICI)1097-0258(19990830)18:16%3C2147::AID-SIM179%3E3.0.CO;2-I> Index, 'Getis/Ord' G ('Getis' and 'Ord' 1992) <doi:10.1111/j.1538-4632.1992.tb00261.x> and multicoloured join count statistics, 'APLE' ('Li et al.') <doi:10.1111/j.1538-4632.2007.00708.x>, local 'Moran's I', 'Gearys C' ('Anselin' 1995) <doi:10.1111/j.1538-4632.1995.tb00338.x> and 'Getis/Ord' G ('Ord' and 'Getis' 1995) <doi:10.1111/j.1538-4632.1995.tb00912.x>, 'saddlepoint' approximations ('Tiefelsdorf' 2002) <doi:10.1111/j.1538-4632.2002.tb01084.x> and exact tests for global and local 'Moran's I' ('Bivand et al.' 2009) <doi:10.1016/j.csda.2008.07.021> and 'LOSH' local indicators of spatial heteroscedasticity ('Ord' and 'Getis') <doi:10.1007/s00168-011-0492-y>. The implementation of most of these measures is described in 'Bivand' and 'Wong' (2018) <doi:10.1007/s11749-018-0599-x>, with further extensions in 'Bivand' (2022) <doi:10.1111/gean.12319>. 'Lagrange' multiplier tests for spatial dependence in linear models are provided ('Anselin et al.' 1996) <doi:10.1016/0166-0462(95)02111-6>, as are 'Rao' score tests for hypothesised spatial 'Durbin' models based on linear models ('Koley' and 'Bera' 2023) <doi:10.1080/17421772.2023.2256810>. A local indicators for categorical data (LICD) implementation based on 'Carrer et al.' (2021) <doi:10.1016/j.jas.2020.105306> and 'Bivand et al.' (2017) <doi:10.1016/j.spasta.2017.03.003> was added in 1.3-7. From 'spdep' and 'spatialreg' versions >= 1.2-1, the model fitting functions previously present in this package are defunct in 'spdep' and may be found in 'spatialreg'.

Maintained by Roger Bivand. Last updated 19 days ago.

spatial-autocorrelation spatial-dependence spatial-weights

1.8 match 131 stars 16.62 score 6.0k scripts 107 dependents

madr0008

mldr.resampling:Resampling Algorithms for Multi-Label Datasets

Collection of state-of-the-art multi-label resampling algorithms. The objective of these algorithms is to achieve balance in multi-label datasets.

Maintained by Miguel Ángel Dávila. Last updated 1 years ago.

11.0 match 1 stars 2.70 score 7 scripts

bioc

SummarizedExperiment:A container (S4 class) for matrix-like assays

The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.

Maintained by Hervé Pagès. Last updated 5 months ago.

genetics infrastructure sequencing annotation coverage genomeannotation bioconductor-package core-package

1.8 match 34 stars 16.85 score 8.6k scripts 1.2k dependents

mobiodiv

mobr:Measurement of Biodiversity

Functions for calculating metrics for the measurement of biodiversity and its changes across scales, treatments, and gradients. The methods implemented in this package are described in: Chase, J.M., et al. (2018) <doi:10.1111/ele.13151>, McGlinn, D.J., et al. (2019) <doi:10.1111/2041-210X.13102>, McGlinn, D.J., et al. (2020) <doi:10.1101/851717>, and McGlinn, D.J., et al. (2023) <doi:10.1101/2023.09.19.558467>.

Maintained by Daniel McGlinn. Last updated 5 months ago.

biodiversity conservation ecology rarefaction species statistics

3.4 match 23 stars 8.59 score 93 scripts

tkonopka

umap:Uniform Manifold Approximation and Projection

Uniform manifold approximation and projection is a technique for dimension reduction. The algorithm was described by McInnes and Healy (2018) in <arXiv:1802.03426>. This package provides an interface for two implementations. One is written from scratch, including components for nearest-neighbor search and for embedding. The second implementation is a wrapper for 'python' package 'umap-learn' (requires separate installation, see vignette for more details).

Maintained by Tomasz Konopka. Last updated 11 months ago.

dimensionality-reduction umap cpp

2.2 match 132 stars 12.74 score 3.6k scripts 43 dependents

thomasp85

densityClust:Clustering by Fast Search and Find of Density Peaks

An improved implementation (based on k-nearest neighbors) of the density peak clustering algorithm, originally described by Alex Rodriguez and Alessandro Laio (Science, 2014 vol. 344). It can handle large datasets (> 100,000 samples) very efficiently. It was initially implemented by Thomas Lin Pedersen, with inputs from Sean Hughes and later improved by Xiaojie Qiu to handle large datasets with kNNs.

Maintained by Thomas Lin Pedersen. Last updated 1 years ago.

cpp

3.9 match 153 stars 7.14 score 75 scripts

bioc

GenomicFeatures:Query the gene models of a given organism/assembly

Extract the genomic locations of genes, transcripts, exons, introns, and CDS, for the gene models stored in a TxDb object. A TxDb object is a small database that contains the gene models of a given organism/assembly. Bioconductor provides a small collection of TxDb objects in the form of ready-to-install TxDb packages for the most commonly studied organisms. Additionally, the user can easily make a TxDb object (or package) for the organism/assembly of their choice by using the tools from the txdbmaker package.

Maintained by H. Pagès. Last updated 4 months ago.

genetics infrastructure annotation sequencing genomeannotation bioconductor-package core-package

1.8 match 26 stars 15.34 score 5.3k scripts 339 dependents

besanson

FastKNN:Fast k Nearest Neighbor

These are various functions related to the k-nearest-neighbor classifier. The distance matrix is an input, which makes the computation faster and allows distances other than Euclidean.

Maintained by Gaston Besanson. Last updated 10 years ago.

6.7 match 3.97 score 62 scripts 1 dependents
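
The idea in the description above, classifying from a precomputed distance matrix so that any distance can be used, can be sketched as follows. This is a minimal illustration in Python, not the 'FastKNN' R API: the function names `knn_predict_from_distances` and `manhattan`, the majority-vote rule, and the toy data are all assumptions made for the example.

```python
from collections import Counter


def knn_predict_from_distances(dist_row, train_labels, k):
    """Classify one query point from its precomputed distances to the training points.

    dist_row[i] is the distance from the query to training point i, computed
    with any metric the caller likes (hypothetical helper, not a package function).
    """
    # Indices of the k smallest distances; sorted() is stable, so ties keep index order.
    nearest = sorted(range(len(dist_row)), key=lambda i: dist_row[i])[:k]
    # Majority vote among the labels of the k nearest training points.
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


def manhattan(a, b):
    """A non-Euclidean distance, used only to show that the metric is pluggable."""
    return sum(abs(x - y) for x, y in zip(a, b))


train = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels = ["a", "a", "b", "b"]
query = (5.5, 4.0)

# The distance row is computed once, outside the classifier.
dist_row = [manhattan(query, p) for p in train]
print(knn_predict_from_distances(dist_row, labels, k=3))
```

Because the classifier only sees distances, the same row can be reused for several values of `k` without recomputing anything.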

jdonaldson

tsne:T-Distributed Stochastic Neighbor Embedding for R (t-SNE)

A "pure R" implementation of the t-SNE algorithm.

Maintained by Justin Donaldson. Last updated 6 years ago.

2.8 match 58 stars 9.35 score 656 scripts 13 dependents

bbbruce

nncc:Nearest Neighbors Matching of Case-Control Data

Provides nearest-neighbors matching and analysis of case-control data. Cui, Z., Marder, E. P., Click, E. S., Hoekstra, R. M., & Bruce, B. B. (2022) <doi:10.1097/EDE.0000000000001504>.

Maintained by Beau Bruce. Last updated 1 years ago.

9.7 match 2.70 score 3 scripts

juanv66x

viralx:Explainers for Regression Models in HIV Research

A dedicated viral-explainer model tool designed to empower researchers in the field of HIV research, particularly in viral load and CD4 (Cluster of Differentiation 4) lymphocytes regression modeling. It draws inspiration from the 'tidymodels' framework for rigorous model building of Max Kuhn and Hadley Wickham (2020) <https://www.tidymodels.org> and the 'DALEXtra' tool for explainability by Przemyslaw Biecek (2020) <doi:10.48550/arXiv.2009.13248>. It aims to facilitate interpretable and reproducible research in biostatistics and computational biology for the benefit of understanding HIV dynamics.

Maintained by Juan Pablo Acuña González. Last updated 4 months ago.

8.6 match 3.00 score 1 scripts

joeguinness

GpGp:Fast Gaussian Process Computation Using Vecchia's Approximation

Functions for fitting and doing predictions with Gaussian process models using Vecchia's (1988) approximation. Package also includes functions for reordering input locations, finding ordered nearest neighbors (with help from 'FNN' package), grouping operations, and conditional simulations. Covariance functions for spatial and spatial-temporal data on Euclidean domains and spheres are provided. The original approximation is due to Vecchia (1988) <http://www.jstor.org/stable/2345768>, and the reordering and grouping methods are from Guinness (2018) <doi:10.1080/00401706.2018.1437476>. Model fitting employs a Fisher scoring algorithm described in Guinness (2019) <doi:10.48550/arXiv.1905.08374>.

Maintained by Joseph Guinness. Last updated 5 months ago.

openblas cpp openmp

4.1 match 10 stars 6.16 score 160 scripts 6 dependents

bioc

wpm:Well Plate Maker

The Well-Plate Maker (WPM) is a shiny application deployed as an R package. Functions for command-line/script use are also available. The WPM allows users to generate well plate maps to carry out their experiments while improving the handling of batch effects. In particular, it helps control the "plate effect" thanks to its ability to randomize samples over multiple well plates. The algorithm for placing the samples is inspired by the backtracking algorithm: the samples are placed at random while respecting specific spatial constraints.

Maintained by Helene Borges. Last updated 5 months ago.

gui proteomics massspectrometry batcheffect experimentaldesign

5.3 match 6 stars 4.78 score 7 scripts
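
Random placement with backtracking, as mentioned in the description above, can be sketched as follows. This is an illustration in Python under stated assumptions, not the WPM implementation: the function `place_samples`, the single-plate setup, and the particular constraint (no two horizontally or vertically adjacent wells may hold samples from the same group) are hypothetical choices made for the example.

```python
import random


def place_samples(groups, n_rows, n_cols, seed=0):
    """Randomly assign samples to wells of one plate, backtracking when stuck.

    groups[i] is the group of sample i. Assumed constraint: adjacent wells
    (up, down, left, right) never hold two samples from the same group.
    Returns {(row, col): sample_index}, or None if no valid layout exists.
    """
    rng = random.Random(seed)
    layout = {}  # well -> index of the sample placed there

    def allowed(well, sample):
        r, c = well
        for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if nb in layout and groups[layout[nb]] == groups[sample]:
                return False
        return True

    def solve(sample, free):
        if sample == len(groups):
            return True  # every sample has a well
        candidates = sorted(free)
        rng.shuffle(candidates)  # try the free wells in random order
        for well in candidates:
            if allowed(well, sample):
                layout[well] = sample
                free.remove(well)
                if solve(sample + 1, free):
                    return True
                # Dead end further down: undo this choice and try another well.
                del layout[well]
                free.add(well)
        return False

    wells = {(r, c) for r in range(n_rows) for c in range(n_cols)}
    return layout if solve(0, wells) else None


groups = ["A", "A", "A", "B", "B", "B"]
plate = place_samples(groups, n_rows=2, n_cols=3)
for well in sorted(plate):
    print(well, groups[plate[well]])
```

The exhaustive search is exponential in the worst case, so a sketch like this is only practical for small plates or loose constraints.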

dfsp-spirit

fsbrain:Managing and Visualizing Brain Surface Data

Provides high-level access to neuroimaging data from standard software packages like 'FreeSurfer' <http://freesurfer.net/> on the level of subjects and groups. Load morphometry data, surfaces and brain parcellations based on atlases. Mask data using labels, load data for specific atlas regions only, and visualize data and statistical results directly in 'R'.

Maintained by Tim Schäfer. Last updated 4 months ago.

3d brain dti freesurfer mesh mri neuroimaging research surface visualization voxel

3.9 match 66 stars 6.47 score 15 scripts

egenn

rtemis:Machine Learning and Visualization

Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.

Maintained by E.D. Gennatas. Last updated 1 months ago.

data-science data-visualization machine-learning machine-learning-library visualization

3.5 match 145 stars 7.09 score 50 scripts 2 dependents

spatlyu

tidyrgeoda:A tidy interface for rgeoda

An interface for 'rgeoda' to integrate with 'sf' objects and the 'tidyverse'.

Maintained by Wenbo Lv. Last updated 7 months ago.

geocomputation geoinformatics giscience spatial-analysis spatial-statistics

4.9 match 16 stars 5.11 score 5 scripts

promerpr

scanstatistics:Space-Time Anomaly Detection using Scan Statistics

Detection of anomalous space-time clusters using the scan statistics methodology. Focuses on prospective surveillance of data streams, scanning for clusters with ongoing anomalies. Hypothesis testing is made possible by Monte Carlo simulation. Allévius (2018) <doi:10.21105/joss.00515>.

Maintained by Paul Romer Present. Last updated 2 years ago.

cpp

5.1 match 1 stars 4.81 score 43 scripts

bioc

scran:Methods for Single-Cell RNA-Seq Data Analysis

Implements miscellaneous functions for interpretation of single-cell RNA-seq data. Methods are provided for assignment of cell cycle phase, detection of highly variable and significantly correlated genes, identification of marker genes, and other common tasks in routine single-cell analysis workflows.

Maintained by Aaron Lun. Last updated 5 months ago.

immunooncology normalization sequencing rnaseq software geneexpression transcriptomics singlecell clustering bioconductor-package human-cell-atlas single-cell-rna-seq openblas cpp

1.9 match 41 stars 13.14 score 7.6k scripts 36 dependents

chrhennig

prabclus:Functions for Clustering and Testing of Presence-Absence, Abundance and Multilocus Genetic Data

Distance-based parametric bootstrap tests for clustering with spatial neighborhood information. Some distance measures, clustering of presence-absence, abundance and multilocus genetic data for species delimitation, nearest neighbor based noise detection. Genetic distances between communities. Tests whether various distance-based regressions are equal. Try package?prabclus for an overview.

Maintained by Christian Hennig. Last updated 6 months ago.

4.1 match 1 stars 5.99 score 90 scripts 71 dependents

bioc

MetaNeighbor:Single cell replicability analysis

MetaNeighbor allows users to quantify cell type replicability across datasets using neighbor voting.

Maintained by Stephan Fischer. Last updated 5 months ago.

immunooncology geneexpression go multiplecomparison singlecell transcriptomics

4.1 match 5.89 score 78 scripts

dbolotov

neighbr:Classification, Regression, Clustering with K Nearest Neighbors

Classification, regression, and clustering with k nearest neighbors algorithm. Implements several distance and similarity measures, covering continuous and logical features. Outputs ranked neighbors. Most features of this package are directly based on the PMML specification for KNN.

Maintained by Dmitriy Bolotov. Last updated 5 years ago.

9.8 match 2.48 score 30 scripts

rozetasimonovska

SDPDmod:Spatial Dynamic Panel Data Modeling

Spatial model calculation for static and dynamic panel data models, weights matrix creation and Bayesian model comparison. Bayesian model comparison methods were described by 'LeSage' (2014) <doi:10.1016/j.spasta.2014.02.002>. The 'Lee'-'Yu' transformation approach is described in 'Yu', 'De Jong' and 'Lee' (2008) <doi:10.1016/j.jeconom.2008.08.002>, 'Lee' and 'Yu' (2010) <doi:10.1016/j.jeconom.2009.08.001> and 'Lee' and 'Yu' (2010) <doi:10.1017/S0266466609100099>.

Maintained by Rozeta Simonovska. Last updated 11 months ago.

4.8 match 5 stars 4.98 score 19 scripts

tanaylab

misha:Toolkit for Analysis of Genomic Data

A toolkit for analysis of genomic data. The 'misha' package implements an efficient data structure for storing genomic data, and provides a set of functions for data extraction, manipulation and analysis. Some of the 2D genome algorithms were described in Yaffe and Tanay (2011) <doi:10.1038/ng.947>.

Maintained by Aviezer Lifshitz. Last updated 6 days ago.

genomic-data-analysis cpp

4.0 match 4 stars 5.86 score

bioc

xcms:LC-MS and GC-MS Data Analysis

Framework for processing and visualization of chromatographically separated and single-spectra mass spectral data. Imports from AIA/ANDI NetCDF, mzXML, mzData and mzML files. Preprocesses data for high-throughput, untargeted analyte profiling.

Maintained by Steffen Neumann. Last updated 3 days ago.

immunooncology massspectrometry metabolomics bioconductor feature-detection mass-spectrometry peak-detection cpp

1.7 match 196 stars 14.31 score 984 scripts 11 dependents

mnwright

bnnSurvival:Bagged k-Nearest Neighbors Survival Prediction

Implements a bootstrap aggregated (bagged) version of the k-nearest neighbors survival probability prediction method (Lowsky et al. 2013). In addition to the bootstrapping of training samples, the features can be subsampled in each baselearner to break the correlation between them. The Rcpp package is used to speed up the computation.

Maintained by Marvin N. Wright. Last updated 8 years ago.

cpp

8.7 match 1 stars 2.70 score 5 scripts

sfcheung

modelbpp:Model BIC Posterior Probability

Fits the neighboring models of a fitted structural equation model and assesses the model uncertainty of the fitted model based on BIC posterior probabilities, using the method presented in Wu, Cheung, and Leung (2020) <doi:10.1080/00273171.2019.1574546>.

Maintained by Shu Fai Cheung. Last updated 6 months ago.

lavaan model-comparison model-comparison-and-selection model-selection structural-equation-modeling

5.2 match 4.54 score 2 scripts

paobranco

UBL:An Implementation of Re-Sampling Approaches to Utility-Based Learning for Both Classification and Regression Tasks

Provides a set of functions that can be used to obtain better predictive performance on cost-sensitive and cost/benefits tasks (for both regression and classification). This includes re-sampling approaches that modify the original data set biasing it towards the user preferences.

Maintained by Paula Branco. Last updated 3 months ago.

fortran

3.5 match 33 stars 6.39 score 165 scripts 1 dependents

fguenther

LSAfun:Applied Latent Semantic Analysis (LSA) Functions

Provides functions that allow for convenient working with vector space models of semantics/distributional semantic models/word embeddings. Originally built for LSA models (hence the name), but can be used for all such vector-based models. For actually building a vector semantic space, use the package 'lsa' or other specialized software. Downloadable semantic spaces can be found at <https://sites.google.com/site/fritzgntr/software-resources>.

Maintained by Fritz Guenther. Last updated 1 years ago.

6.9 match 1 stars 3.18 score 85 scripts 1 dependents

mthrun

ProjectionBasedClustering:Projection Based Clustering

A clustering approach applicable to every projection method is proposed here. The two-dimensional scatter plot of any projection method can construct a topographic map which displays unapparent data structures by using distance and density information of the data. The generalized U*-matrix renders this visualization in the form of a topographic map, which can be used to automatically define the clusters of high-dimensional data. The whole system is based on Thrun and Ultsch, "Using Projection based Clustering to Find Distance and Density based Clusters in High-Dimensional Data" <DOI:10.1007/s00357-020-09373-2>. Selecting the correct projection method will result in a visualization in which mountains surround each cluster. The number of clusters can be determined by counting valleys on the topographic map. Most projection methods are wrappers for already available methods in R. By contrast, the neighbor retrieval visualizer (NeRV) is based on C++ source code of the 'dredviz' software package, and the Curvilinear Component Analysis (CCA) is translated from 'MATLAB' ('SOM Toolbox' 2.0) to R.

Maintained by Michael Thrun. Last updated 3 years ago.

cpp

4.1 match 7 stars 5.33 score 34 scripts 3 dependents

kharchenkolab

sccore:Core Utilities for Single-Cell RNA-Seq

Core utilities for single-cell RNA-seq data analysis. Contained within are utility functions for working with differential expression (DE) matrices and count matrices, a collection of functions for manipulating and plotting data via 'ggplot2', and functions to work with cell graphs and cell embeddings. Graph-based methods include embedding kNN cell graphs into a UMAP <doi:10.21105/joss.00861>, collapsing vertices of each cluster in the graph, and propagating graph labels.

Maintained by Evan Biederstedt. Last updated 1 years ago.

cpp

3.4 match 12 stars 6.44 score 36 scripts 9 dependents

stuart-lab

Signac:Analysis of Single-Cell Chromatin Data

A framework for the analysis and exploration of single-cell chromatin data. The 'Signac' package contains functions for quantifying single-cell chromatin data, computing per-cell quality control metrics, dimension reduction and normalization, visualization, and DNA sequence motif analysis. Reference: Stuart et al. (2021) <doi:10.1038/s41592-021-01282-5>.

Maintained by Tim Stuart. Last updated 7 months ago.

atac bioinformatics single-cell zlib cpp

1.8 match 349 stars 12.19 score 3.7k scripts 1 dependents

jlmelville

uwot:The Uniform Manifold Approximation and Projection (UMAP) Method for Dimensionality Reduction

An implementation of the Uniform Manifold Approximation and Projection dimensionality reduction by McInnes et al. (2018) <doi:10.48550/arXiv.1802.03426>. It also provides means to transform new data and to carry out supervised dimensionality reduction. An implementation of the related LargeVis method of Tang et al. (2016) <doi:10.48550/arXiv.1602.00370> is also provided. This is a complete re-implementation in R (and C++, via the 'Rcpp' package): no Python installation is required. See the uwot website (<https://github.com/jlmelville/uwot>) for more documentation and examples.

Maintained by James Melville. Last updated 20 days ago.

dimensionality-reduction umap cpp

1.3 match 328 stars 15.74 score 2.0k scripts 140 dependents

felixthestudent

cellpypes:Cell Type Pipes for Single-Cell RNA Sequencing Data

Annotate single-cell RNA sequencing data manually based on marker gene thresholds. Find cell type rules (gene+threshold) through exploration, use the popular piping operator '%>%' to reconstruct complex cell type hierarchies. 'cellpypes' models technical noise to find positive and negative cells for a given expression threshold and returns cell type labels or pseudobulks. Cite this package as Frauhammer (2022) <doi:10.5281/zenodo.6555728> and visit <https://github.com/FelixTheStudent/cellpypes> for tutorials and newest features.

Maintained by Felix Frauhammer. Last updated 1 years ago.

celltype-annotation classification-algorithm scrnaseq single-cell-rna-seq

4.7 match 51 stars 4.41 score 8 scripts

bioc

Cardinal:A mass spectrometry imaging toolbox for statistical analysis

Implements statistical & computational tools for analyzing mass spectrometry imaging datasets, including methods for efficient pre-processing, spatial segmentation, and classification.

Maintained by Kylie Ariel Bemis. Last updated 3 months ago.

software infrastructure proteomics lipidomics massspectrometry imagingmassspectrometry immunooncology normalization clustering classification regression

2.0 match 47 stars 10.34 score 200 scripts

bioc

Banksy:Spatial transcriptomic clustering

Banksy is an R package that incorporates spatial information to cluster cells in a feature space (e.g. gene expression). To incorporate spatial information, BANKSY computes the mean neighborhood expression and azimuthal Gabor filters that capture gene expression gradients. These features are combined with the cell's own expression to embed cells in a neighbor-augmented product space which can then be clustered, allowing for accurate and spatially-aware cell typing and tissue domain segmentation.

Maintained by Joseph Lee. Last updated 13 days ago.

clustering spatial singlecell geneexpression dimensionreduction clustering-algorithm single-cell-omics spatial-omics

2.3 match 90 stars 9.03 score 248 scripts

dfriend21

quadtree:Region Quadtrees for Spatial Data

Provides functionality for working with raster-like quadtrees (also called “region quadtrees”), which allow for variable-sized cells. The package allows for flexibility in the quadtree creation process. Several functions defining how to split and aggregate cells are provided, and custom functions can be written for both of these processes. In addition, quadtrees can be created using other quadtrees as “templates”, so that the new quadtree's structure is identical to the template quadtree. The package also includes functionality for modifying quadtrees, querying values, saving quadtrees to a file, and calculating least-cost paths using the quadtree as a resistance surface.

Maintained by Derek Friend. Last updated 2 years ago.

cpp

3.2 match 19 stars 6.34 score 58 scripts

thothorn

ipred:Improved Predictors

Improved predictive models by indirect classification and bagging for classification, regression and survival problems as well as resampling based estimators of prediction error.

Maintained by Torsten Hothorn. Last updated 8 months ago.

1.9 match 10.76 score 3.3k scripts 411 dependents

cogbrainhealthlab

VertexWiseR:Simplified Vertex-Wise Analyses of Whole-Brain and Hippocampal Surface

Provides functions to run statistical analyses on surface-based neuroimaging data, computing measures including cortical thickness and surface area of the whole-brain and of the hippocampi. It can make use of 'FreeSurfer', 'fMRIprep' and 'HCP' preprocessed datasets and 'HippUnfold' hippocampal segmentation outputs for a given sample by restructuring the data values into a single file. The single file can then be used by the package for analyses independently from its base dataset and without need for its access.

Maintained by Charly Billaud. Last updated 7 days ago.

3.4 match 1 stars 5.84 score 12 scripts

cran

e1071:Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien

Functions for latent class analysis, short time Fourier transform, fuzzy clustering, support vector machines, shortest path computation, bagged clustering, naive Bayes classifier, generalized k-nearest neighbour ...

Maintained by David Meyer. Last updated 6 months ago.

cpp

1.8 match 29 stars 11.26 score 2.0k dependents

cmlmagneville

mFD:Compute and Illustrate the Multiple Facets of Functional Diversity

Computes functional trait-based distances between pairs of species for species gathered in assemblages, allowing several functional spaces to be built. The package computes functional diversity indices that assess the distribution of species (and of their dominance) in a given functional space for each assemblage, and the overlap between assemblages in a given functional space; see: Chao et al. (2018) <doi:10.1002/ecm.1343>, Maire et al. (2015) <doi:10.1111/geb.12299>, Mouillot et al. (2013) <doi:10.1016/j.tree.2012.10.004>, Mouillot et al. (2014) <doi:10.1073/pnas.1317625111>, Ricotta and Szeidl (2009) <doi:10.1016/j.tpb.2009.10.001>. Graphical outputs are included. Visit the 'mFD' website for more information, documentation and examples.

Maintained by Camille Magneville. Last updated 3 months ago.

2.7 match 26 stars 7.35 score 61 scripts

bioc

DepecheR:Determination of essential phenotypic elements of clusters in high-dimensional entities

The purpose of this package is to identify traits in a dataset that can separate groups. This is done on two levels. First, clustering is performed, using an implementation of sparse K-means. Secondly, the generated clusters are used to predict outcomes of groups of individuals based on their distribution of observations in the different clusters. As certain clusters with separating information will be identified, and these clusters are defined by a sparse number of variables, this method can reduce the complexity of data, to only emphasize the data that actually matters.

Maintained by Jakob Theorell. Last updated 5 months ago.

software cellbasedassays transcription differentialexpression datarepresentation immunooncology transcriptomics classification clustering dimensionreduction featureextraction flowcytometry rnaseq singlecell visualization cpp

3.8 match 5.18 score 15 scripts

mikeasilva

simplegraphdb:A Simple Graph Database

This is a graph database in 'SQLite'. It is inspired by Denis Papathanasiou's Python simple-graph project on 'GitHub'.

Maintained by Michael Silva. Last updated 4 years ago.

graph sqlite sqlite-database

5.2 match 7 stars 3.75 score 16 scripts

rnabioco

valr:Genome Interval Arithmetic

Read and manipulate genome intervals and signals. Provides functionality similar to command-line tool suites within R, enabling interactive analysis and visualization of genome-scale data. Riemondy et al. (2017) <doi:10.12688/f1000research.11997.1>.

Maintained by Kent Riemondy. Last updated 8 days ago.

bedtools genome interval-arithmetic cpp

2.0 match 90 stars 9.69 score 227 scripts

yuelyu21

SCIntRuler:Guiding the Integration of Multiple Single-Cell RNA-Seq Datasets

The accumulation of single-cell RNA-seq (scRNA-seq) studies highlights the potential benefits of integrating multiple datasets. By augmenting sample sizes and enhancing analytical robustness, integration can lead to more insightful biological conclusions. However, challenges arise due to the inherent diversity and batch discrepancies within and across studies. SCIntRuler, a novel R package, addresses these challenges by guiding the integration of multiple scRNA-seq datasets.

Maintained by Yue Lyu. Last updated 5 months ago.

sequencing geneticvariability singlecell cpp

4.0 match 2 stars 4.85 score 3 scripts

tnagler

kdecopula:Kernel Smoothing for Bivariate Copula Densities

Provides fast implementations of kernel smoothing techniques for bivariate copula densities, in particular density estimation and resampling.

Maintained by Thomas Nagler. Last updated 7 years ago.

openblas cpp

3.4 match 8 stars 5.63 score 31 scripts 1 dependents

bioc

annotate:Annotation for microarrays

Using R environments for annotation.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

annotation pathways go

1.6 match 11.41 score 812 scripts 243 dependents

emilhvitfeldt

fastTextR:An Interface to the 'fastText' Library

An interface to the 'fastText' library <https://github.com/facebookresearch/fastText>. The package can be used for text classification and to learn word vectors. An example of how to use 'fastTextR' can be found in the 'README' file.

Maintained by Emil Hvitfeldt. Last updated 1 years ago.

cpp

3.3 match 4 stars 5.50 score 44 scripts 2 dependents

jcatwood

nntmvn:Draw Samples of Truncated Multivariate Normal Distributions

Draw samples from a truncated multivariate normal distribution using the sequential nearest neighbor (SNN) method introduced in "Scalable Sampling of Truncated Multivariate Normals Using Sequential Nearest-Neighbor Approximation" <doi:10.48550/arXiv.2406.17307>.

Maintained by Jian Cao. Last updated 1 months ago.

cpp

6.4 match 2.85 score 3 scripts

rezakj

iCellR:Analyzing High-Throughput Single Cell Sequencing Data

A toolkit that allows scientists to work with data from single cell sequencing technologies such as scRNA-seq, scVDJ-seq, scATAC-seq, CITE-Seq and Spatial Transcriptomics (ST). Single (i) Cell R package ('iCellR') provides unprecedented flexibility at every step of the analysis pipeline, including normalization, clustering, dimensionality reduction, imputation, visualization, and so on. Users can design both unsupervised and supervised models to best suit their research. In addition, the toolkit provides 2D and 3D interactive visualizations, differential expression analysis, filters based on cells, genes and clusters, data merging, normalizing for dropouts, data imputation methods, correcting for batch differences, pathway analysis, tools to find marker genes for clusters and conditions, predict cell types and pseudotime analysis. See Khodadadi-Jamayran, et al (2020) <doi:10.1101/2020.05.05.078550> and Khodadadi-Jamayran, et al (2020) <doi:10.1101/2020.03.31.019109> for more details.

Maintained by Alireza Khodadadi-Jamayran. Last updated 8 months ago.

10xgenomics 3d batch-normalization cell-type-classification cite-seq clustering clustering-algorithm diffusion-maps dropout icellr imputation intractive-graph normalization pseudotime scrna-seq scvdj-seq singel-cell-sequencing umap cpp

3.3 match 121 stars 5.56 score 7 scripts 1 dependents

luukvdmeer

sfnetworks:Tidy Geospatial Networks

Provides a tidy approach to spatial network analysis, in the form of classes and functions that enable a seamless interaction between the network analysis package 'tidygraph' and the spatial analysis package 'sf'.

Maintained by Lucas van der Meer. Last updated 3 months ago.

geospatial-networks network-analysis rspatial simple-features spatial-analysis spatial-data-science spatial-networks tidygraph tidyverse

1.9 match 372 stars 9.63 score 332 scripts 6 dependents

bioc

bluster:Clustering Algorithms for Bioconductor

Wraps common clustering algorithms in an easily extended S4 framework. Backends are implemented for hierarchical, k-means and graph-based clustering. Several utilities are also provided to compare and evaluate clustering results.

Maintained by Aaron Lun. Last updated 5 months ago.

immunooncology software geneexpression transcriptomics singlecell clustering cpp

1.9 match 9.43 score 636 scripts 51 dependents

bioc

celda:CEllular Latent Dirichlet Allocation

Celda is a suite of Bayesian hierarchical models for clustering single-cell RNA-sequencing (scRNA-seq) data. It is able to perform "bi-clustering" and simultaneously cluster genes into gene modules and cells into cell subpopulations. It also contains DecontX, a novel Bayesian method to computationally estimate and remove RNA contamination in individual cells without empty droplet information. A variety of scRNA-seq data visualization functions is also included.

Maintained by Joshua Campbell. Last updated 28 days ago.

singlecell geneexpression clustering sequencing bayesian immunooncology dataimport cpp openmp

1.6 match 147 stars 10.47 score 256 scripts 2 dependents

cran

CornerstoneR:Collection of Scripts for Interface Between 'Cornerstone' and 'R'

Collection of generic 'R' scripts which enable you to use existing 'R' routines in 'Cornerstone'. The desktop application 'Cornerstone' (<https://www.camline.com/en/products/cornerstone/cornerstone-core.html>) is a data analysis software provided by 'camLine' that empowers engineering teams to find solutions even faster. The engineers incorporate intensified hands-on statistics into their projects. They benefit from an intuitive and uniquely designed graphical Workmap concept: you design experiments (DoE) and explore data, analyze dependencies, and find answers you can act upon, immediately, interactively, and without any programming. While 'Cornerstone's' interface to the statistical programming language 'R' has been available since version 6.0, the latest interface with 'R' is much more efficient. 'Cornerstone' release 7.1.1 allows you to integrate user defined 'R' packages directly into the standard 'Cornerstone' GUI. Your engineering team stays in 'Cornerstone's' graphical working environment and can apply 'R' routines, immediately and without the need to deal with programming code. Additionally, your 'R' programming team develops corresponding 'R' packages detached from 'Cornerstone' in their favorite 'R' environment. Learn how to use 'R' packages in 'Cornerstone' 7.1.1 on 'camLineTV' YouTube channel (<https://www.youtube.com/watch?v=HEQHwq_laXU>) (available in German).

Maintained by Gerrith Djaja. Last updated 5 years ago.

4.8 match 3.54 score

mlampros

OpenImageR:An Image Processing Toolkit

Incorporates functions for image preprocessing, filtering and image recognition. The package takes advantage of 'RcppArmadillo' to speed up computationally intensive functions. The histogram of oriented gradients descriptor is a modification of the 'findHOGFeatures' function of the 'SimpleCV' computer vision platform, the average_hash(), dhash() and phash() functions are based on the 'ImageHash' python library. The Gabor Feature Extraction functions are based on 'Matlab' code of the paper, "CloudID: Trustworthy cloud-based and cross-enterprise biometric identification" by M. Haghighat, S. Zonouz, M. Abdel-Mottaleb, Expert Systems with Applications, vol. 42, no. 21, pp. 7905-7916, 2015, <doi:10.1016/j.eswa.2015.06.025>. The 'SLIC' and 'SLICO' superpixel algorithms were explained in detail in (i) "SLIC Superpixels Compared to State-of-the-art Superpixel Methods", Radhakrishna Achanta, Appu Shaji, Kevin Smith, Aurelien Lucchi, Pascal Fua, and Sabine Suesstrunk, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, num. 11, p. 2274-2282, May 2012, <doi:10.1109/TPAMI.2012.120> and (ii) "SLIC Superpixels", Radhakrishna Achanta, Appu Shaji, Kevin Smith, Aurelien Lucchi, Pascal Fua, and Sabine Suesstrunk, EPFL Technical Report no. 149300, June 2010.

Maintained by Lampros Mouselimis. Last updated 2 years ago.

filtering gabor-feature-extraction gabor-filters hog-features image image-hashing processing rcpparmadillo recognition slic slico superpixels openblas cpp openmp

1.7 match 60 stars 9.86 score 358 scripts 8 dependents

networkgroupr

fastnet:Large-Scale Social Network Analysis

We present an implementation of the algorithms required to simulate large-scale social networks and retrieve their most relevant metrics.

Maintained by Nazrul Shaikh. Last updated 8 years ago.

5.0 match 5 stars 3.37 score 47 scripts

michaelhallquist

ggbrain:Create Images of Volumetric Brain Data in NIfTI Format Using 'ggplot2' Syntax

A 'ggplot2'-consistent approach to generating 2D displays of volumetric brain imaging data. Display data from multiple NIfTI images using standard 'ggplot2' conventions such as scales, limits, and themes to control the appearance of displays. The resulting plots are returned as 'patchwork' objects, inheriting from 'ggplot', allowing for any standard modifications of display aesthetics supported by 'ggplot2'.

Maintained by Michael Hallquist. Last updated 25 days ago.

cpp

3.3 match 2 stars 5.03 score 18 scripts

r-spatial

rgee:R Bindings for Calling the 'Earth Engine' API

Earth Engine <https://earthengine.google.com/> client library for R. All of the 'Earth Engine' API classes, modules, and functions are made available. Additional functions implemented include importing (exporting) of Earth Engine spatial objects, extraction of time series, interactive map display, assets management interface, and metadata display. See <https://r-spatial.github.io/rgee/> for further details.

Maintained by Cesar Aybar. Last updated 4 days ago.

earth-engine earthengine google-earth-engine googleearthengine spatial-analysis spatial-data

1.2 match 715 stars 13.77 score 1.9k scripts 3 dependents

jfrench

smacpod:Statistical Methods for the Analysis of Case-Control Point Data

Statistical methods for analyzing case-control point data. Methods include the ratio of kernel densities, the difference in K Functions, the spatial scan statistic, and q nearest neighbors of cases.

Maintained by Joshua French. Last updated 5 months ago.

4.4 match 3.69 score 49 scripts

computationalstylistics

stylo:Stylometric Multivariate Analyses

Supervised and unsupervised multivariate methods, supplemented by GUI and some visualizations, to perform various analyses in the field of computational stylistics, authorship attribution, etc. For further reference, see Eder et al. (2016), <https://journal.r-project.org/archive/2016/RJ-2016-007/index.html>. You are also encouraged to visit the Computational Stylistics Group's website <https://computationalstylistics.github.io/>, where a reasonable amount of information about the package and related projects is provided.

Maintained by Maciej Eder. Last updated 2 months ago.

1.9 match 187 stars 8.58 score 462 scripts

mlr-org

mlr3extralearners:Extra Learners For mlr3

Extra learners for use in mlr3.

Maintained by Sebastian Fischer. Last updated 4 months ago.

machine-learning mlr3

1.8 match 94 stars 9.16 score 474 scripts

sunweisurrey

snn:Stabilized Nearest Neighbor Classifier

Implement K-nearest neighbor classifier, weighted nearest neighbor classifier, bagged nearest neighbor classifier, optimal weighted nearest neighbor classifier and stabilized nearest neighbor classifier, and perform model selection via 5 fold cross-validation for them. This package also provides functions for computing the classification error and classification instability of a classification procedure.

Maintained by Wei Sun. Last updated 10 years ago.

15.3 match 1.04 score 11 scripts

bioc

sesame:SEnsible Step-wise Analysis of DNA MEthylation BeadChips

Tools for analyzing Illumina Infinium DNA methylation arrays. SeSAMe provides utilities to support analyses of multiple generations of Infinium DNA methylation BeadChips, including preprocessing, quality control, visualization and inference. SeSAMe features accurate detection calling, intelligent inference of ethnicity, sex and advanced quality control routines.

Maintained by Wanding Zhou. Last updated 2 months ago.

dnamethylation methylationarray preprocessing qualitycontrol bioinformatics dna-methylation microarray

1.8 match 69 stars 9.08 score 258 scripts 1 dependents

cran

SKNN:A Super K-Nearest Neighbor (SKNN) Classification Algorithm

A Super K-Nearest Neighbor classification method that uses kernel density to describe the weight of the distance between a training observation and the testing sample.

Maintained by Yarong Yang. Last updated 5 months ago.

8.8 match 1.78 score
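
The SKNN entry above describes weighting neighbors by a kernel of their distance to the test sample. The following is a minimal, language-agnostic sketch of that general idea in Python; it is not the SKNN package's algorithm or API. The function name `kernel_weighted_knn`, the Gaussian kernel, and the `bandwidth` parameter are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def kernel_weighted_knn(train, labels, query, k=3, bandwidth=1.0):
    """Classify `query` by a kernel-weighted vote of its k nearest training points.

    Hypothetical helper: a Gaussian kernel is assumed here; the SKNN
    package may define its weights differently.
    """
    # Euclidean distance from the query to every training observation.
    dists = sorted(
        ((math.dist(x, query), y) for x, y in zip(train, labels)),
        key=lambda t: t[0],
    )
    votes = defaultdict(float)
    for d, y in dists[:k]:
        # Closer neighbors contribute larger weights to their class.
        votes[y] += math.exp(-0.5 * (d / bandwidth) ** 2)
    return max(votes, key=votes.get)

train = [(0.0,), (3.0,), (3.1,)]
labels = ["a", "b", "b"]
print(kernel_weighted_knn(train, labels, (0.5,), k=3))
```

In this toy data set two of the three neighbors belong to class "b", so an unweighted majority vote would pick "b"; the kernel weights let the single much closer "a" observation dominate instead.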

bioc

qmtools:Quantitative Metabolomics Data Processing Tools

The qmtools (quantitative metabolomics tools) package provides basic tools for processing quantitative metabolomics data with the standard SummarizedExperiment class. This includes functions for imputation, normalization, feature filtering, feature clustering, dimension-reduction, and visualization to help users prepare data for statistical analysis. This package also offers a convenient way to compute empirical Bayes statistics for which metabolic features are different between two sets of study samples. Several functions in this package could also be used in other types of omics data.

Maintained by Jaehyun Joo. Last updated 5 months ago.

metabolomics preprocessing normalization dimensionreduction massspectrometry

3.6 match 1 stars 4.30 score 5 scripts

bioc

rifi:'rifi' analyses data from rifampicin time series created by microarray or RNAseq

'rifi' analyses data from rifampicin time series created by microarray or RNAseq. 'rifi' is a transcriptome data analysis tool for the holistic identification of transcription and decay associated processes. The decay constants and the delay of the onset of decay are fitted for each probe/bin. Subsequently, probes/bins of equal properties are combined into segments by dynamic programming, independent of an existing genome annotation. This allows detection of transcript segments of different stability or transcriptional events within one annotated gene. In addition to the classic decay constant/half-life analysis, 'rifi' detects processing sites, transcription pausing sites, internal transcription start sites in operons, sites of partial transcription termination in operons, identifies areas of likely transcriptional interference by the collision mechanism and gives an estimate of the transcription velocity. All data are integrated to give an estimate of continuous transcriptional units, i.e. operons. Comprehensive output tables and visualizations of the full genome result and the individual fits for all probes/bins are produced.

Maintained by Jens Georg. Last updated 5 months ago.

rnaseq differentialexpression generegulation transcriptomics regression microarray software

3.3 match 4.60 score 1 scripts

cpanse

protViz:Visualizing and Analyzing Mass Spectrometry Related Data in Proteomics

Helps with quality checks, visualizations and analysis of mass spectrometry data, coming from proteomics experiments. The package is developed, tested and used at the Functional Genomics Center Zurich <https://fgcz.ch>. We use this package mainly for prototyping, teaching, and having fun with proteomics data. But it can also be used to do data analysis for small scale data sets.

Maintained by Christian Panse. Last updated 1 years ago.

fun mass-spectrometry peptide-identification proteomics quantification visualization cpp

1.9 match 11 stars 7.88 score 72 scripts 2 dependents

bioc

Sconify:A toolkit for performing KNN-based statistics for flow and mass cytometry data

This package does k-nearest neighbor based statistics and visualizations with flow and mass cytometry data. This gives tSNE maps "fold change" functionality and provides a data quality metric by assessing manifold overlap between fcs files expected to be the same. Other applications using this package include imputation, marker redundancy, and testing the relative information loss of lower dimension embeddings compared to the original manifold.

Maintained by Tyler J Burns. Last updated 5 months ago.

immunooncology singlecell flowcytometry software multiplecomparison visualization

3.1 match 4.74 score 11 scripts

christopherkenny

geomander:Geographic Tools for Studying Gerrymandering

A compilation of tools to complete common tasks for studying gerrymandering. This focuses on the geographic tool side of common problems, such as linking different levels of spatial units or estimating how to break up units. Functions exist for creating redistricting-focused data for the US.

Maintained by Christopher T. Kenny. Last updated 19 days ago.

cpp

1.9 match 14 stars 7.81 score 191 scripts 1 dependents

bioc

SPIAT:Spatial Image Analysis of Tissues

SPIAT (**Sp**atial **I**mage **A**nalysis of **T**issues) is an R package with a suite of data processing, quality control, visualization and data analysis tools. SPIAT is compatible with data generated from single-cell spatial proteomics platforms (e.g. OPAL, CODEX, MIBI, cellprofiler). SPIAT reads spatial data in the form of X and Y coordinates of cells, marker intensities and cell phenotypes. SPIAT includes six analysis modules that allow visualization, calculation of cell colocalization, categorization of the immune microenvironment relative to tumor areas, analysis of cellular neighborhoods, and the quantification of spatial heterogeneity, providing a comprehensive toolkit for spatial data analysis.

Maintained by Yuzhou Feng. Last updated 1 days ago.

biomedicalinformatics cellbiology spatial clustering dataimport immunooncology qualitycontrol singlecell software visualization

1.7 match 22 stars 8.59 score 69 scripts

bioc

scDblFinder:scDblFinder

The scDblFinder package gathers various methods for the detection and handling of doublets/multiplets in single-cell sequencing data (i.e. multiple cells captured within the same droplet or reaction volume). It includes methods formerly found in the scran package, the new fast and comprehensive scDblFinder method, and a reimplementation of the Amulet detection method for single-cell ATAC-seq.

Maintained by Pierre-Luc Germain. Last updated 2 months ago.

preprocessing singlecell rnaseq atacseq doublets single-cell

1.2 match 184 stars 12.34 score 888 scripts 1 dependents

brian-j-smith

MachineShop:Machine Learning Models and Tools

Meta-package for statistical and machine learning with a unified interface for model fitting, prediction, performance assessment, and presentation of results. Approaches for model fitting and prediction of numerical, categorical, or censored time-to-event outcomes include traditional regression models, regularization methods, tree-based methods, support vector machines, neural networks, ensembles, data preprocessing, filtering, and model tuning and selection. Performance metrics are provided for model assessment and can be estimated with independent test sets, split sampling, cross-validation, or bootstrap resampling. Resample estimation can be executed in parallel for faster processing and nested in cases of model tuning and selection. Modeling results can be summarized with descriptive statistics; calibration curves; variable importance; partial dependence plots; confusion matrices; and ROC, lift, and other performance curves.

Maintained by Brian J Smith. Last updated 7 months ago.

classification-models machine-learning predictive-modeling regression-models survival-models

1.8 match 62 stars 7.95 score 121 scripts

tpetzoldt

simecol:Simulation of Ecological (and Other) Dynamic Systems

An object oriented framework to simulate ecological (and other) dynamic systems. It can be used for differential equations, individual-based (or agent-based) and other models as well. It supports structuring of simulation scenarios (to avoid copy and paste) and aims to improve readability and re-usability of code.

Maintained by Thomas Petzoldt. Last updated 7 months ago.

3.0 match 4.76 score 190 scripts

bioc

imcRtools:Methods for imaging mass cytometry data analysis

This R package supports the handling and analysis of imaging mass cytometry and other highly multiplexed imaging data. The main functionality includes reading in single-cell data after image segmentation and measurement, data formatting to perform channel spillover correction and a number of spatial analysis approaches. First, cell-cell interactions are detected via spatial graph construction; these graphs can be visualized with cells representing nodes and interactions representing edges. Furthermore, per cell, its direct neighbours are summarized to allow spatial clustering. Per image/grouping level, interactions between types of cells are counted, averaged and compared against random permutations. In that way, types of cells that interact more (attraction) or less (avoidance) frequently than expected by chance are detected.

Maintained by Daniel Schulz. Last updated 5 months ago.

immunooncology singlecell spatial dataimport clustering imc single-cell

1.9 match 24 stars 7.58 score 126 scripts
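
The imcRtools entry above mentions counting interactions between cell types and comparing them against random permutations. A minimal sketch of that kind of permutation test is shown below in Python; it is not the imcRtools API. The function name, the edge-list input, and the simple edge count (without the per-cell averaging the entry mentions) are assumptions made for illustration.

```python
import random

def interaction_permutation_test(edges, cell_types, type_a, type_b,
                                 n_perm=1000, seed=0):
    """Compare the observed number of type_a/type_b edges with label permutations.

    Hypothetical helper: `edges` is a list of (cell, cell) index pairs from a
    spatial graph and `cell_types` is a list of labels indexed by cell.
    """
    def count(types):
        # An undirected edge counts when its endpoints carry the two types.
        return sum(1 for u, v in edges
                   if {types[u], types[v]} == {type_a, type_b})

    observed = count(cell_types)
    rng = random.Random(seed)
    shuffled = list(cell_types)
    at_least = at_most = 0
    for _ in range(n_perm):
        # Shuffling labels keeps the graph fixed and breaks any spatial pattern.
        rng.shuffle(shuffled)
        c = count(shuffled)
        at_least += c >= observed
        at_most += c <= observed
    return {
        "observed": observed,
        "p_attraction": (at_least + 1) / (n_perm + 1),
        "p_avoidance": (at_most + 1) / (n_perm + 1),
    }

# Toy graph: every "A" cell touches every "B" cell.
types = ["A", "A", "A", "B", "B", "B"]
edges = [(i, j) for i in range(3) for j in range(3, 6)]
print(interaction_permutation_test(edges, types, "A", "B", n_perm=200))
```

A small `p_attraction` suggests the two types interact more often than expected by chance, and a small `p_avoidance` suggests less often; the add-one correction keeps both estimates strictly above zero.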

tidymodels

spatialsample:Spatial Resampling Infrastructure

Functions and classes for spatial resampling to use with the 'rsample' package, such as spatial cross-validation (Brenning, 2012) <doi:10.1109/IGARSS.2012.6352393>. The scope of 'rsample' and 'spatialsample' is to provide the basic building blocks for creating and analyzing resamples of a spatial data set, but neither package includes functions for modeling or computing statistics. The resampled spatial data sets created by 'spatialsample' do not contain much overhead in memory.

Maintained by Michael Mahoney. Last updated 6 months ago.

cpp

1.7 match 73 stars 8.19 score 118 scripts 2 dependents

bioc

dandelionR:Single-cell Immune Repertoire Trajectory Analysis in R

dandelionR is an R package for performing single-cell immune repertoire trajectory analysis, based on the original python implementation. It provides the necessary functions to interface with scRepertoire and a custom implementation of an absorbing Markov chain for pseudotime inference, inspired by the Palantir Python package.

Maintained by Kelvin Tuong. Last updated 14 days ago.

software immunooncology singlecell

2.4 match 8 stars 5.81 score 7 scripts

rpatin

segclust2d:Bivariate Segmentation/Clustering Methods and Tools

Provides two methods for segmentation and joint segmentation/clustering of bivariate time-series. Originally intended for ecological segmentation (home-range and behavioural modes) but easily applied on other series, the package also provides tools for analysing outputs from R packages 'moveHMM' and 'marcher'. The segmentation method is a bivariate extension of Lavielle's method available in 'adehabitatLT' (Lavielle, 1999 <doi:10.1016/S0304-4149(99)00023-X> and 2005 <doi:10.1016/j.sigpro.2005.01.012>). This method relies on dynamic programming for efficient segmentation. The segmentation/clustering method alternates steps of dynamic programming with an Expectation-Maximization algorithm. This is an extension of Picard et al (2007) <doi:10.1111/j.1541-0420.2006.00729.x> method (formerly available in 'cghseg' package) to the bivariate case. The method is fully described in Patin et al (2018) <doi:10.1101/444794>.

Maintained by Remi Patin. Last updated 11 months ago.

cpp

2.5 match 7 stars 5.50 score 30 scripts

fsavje

distances:Tools for Distance Metrics

Provides tools for constructing, manipulating and using distance metrics.

Maintained by Fredrik Savje. Last updated 1 years ago.

cpp

2.0 match 17 stars 6.92 score 117 scripts 12 dependents

adeverse

adegraphics:An S4 Lattice-Based Package for the Representation of Multivariate Data

Graphical functionalities for the representation of multivariate data. It is a complete re-implementation of the functions available in the 'ade4' package.

Maintained by Aurélie Siberchicot. Last updated 8 months ago.

1.3 match 9 stars 10.37 score 386 scripts 6 dependents

zcolburn

Bioi:Biological Image Analysis

Single linkage clustering and connected component analyses are often performed on biological images. 'Bioi' provides a set of functions for performing these tasks. This functionality is implemented in several key functions that can extend from 1 to many dimensions. The single linkage clustering method implemented here can be used on n-dimensional data sets, while connected component analyses are limited to 3 or fewer dimensions.

Maintained by Zachary Colburn. Last updated 5 years ago.

biological-data-analysis biology cell cpp image-analysis microscopy cpp

3.6 match 3.81 score 13 scripts
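
The Bioi entry above refers to single linkage clustering on n-dimensional data. As a rough illustration of the technique (not of Bioi's own functions), the Python sketch below links any two points closer than a distance threshold and labels the resulting connected groups. The function name `single_linkage_clusters` and the `max_dist` parameter are assumptions made for illustration.

```python
import math

def single_linkage_clusters(points, max_dist):
    """Label points so that any two within `max_dist` share a cluster.

    Hypothetical helper: uses a union-find structure over all point pairs,
    which is O(n^2) and meant only for small examples.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= max_dist:
                parent[find(i)] = find(j)  # merge the two groups

    # Relabel roots as consecutive integers in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(points))]

pts = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0), (10, 1, 0)]
print(single_linkage_clusters(pts, max_dist=1.5))
```

The first and third points are 2 units apart, beyond the threshold, yet they end up in the same cluster because each is within range of the second point; this chaining behavior is characteristic of single linkage.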

mtrupiano1

knnwtsim:K Nearest Neighbor Forecasting with a Tailored Similarity Metric

Functions to implement K Nearest Neighbor forecasting using a weighted similarity metric tailored to the problem of forecasting univariate time series where recent observations, seasonal patterns, and exogenous predictors are all relevant in predicting future observations of the series in question. For more information on the formulation of this similarity metric please see Trupiano (2021) <arXiv:2112.06266>.

Maintained by Matthew Trupiano. Last updated 3 years ago.

forecasting knn-regression machine-learning time-series

5.1 match 1 stars 2.70 score 2 scripts

kenaho1

asbio:A Collection of Statistical Tools for Biologists

Contains functions from: Aho, K. (2014) Foundational and Applied Statistics for Biologists using R. CRC/Taylor and Francis, Boca Raton, FL, ISBN: 978-1-4398-7338-0.

Maintained by Ken Aho. Last updated 2 months ago.

1.9 match 5 stars 7.32 score 310 scripts 3 dependents

bioc

systemPipeTools:Tools for data visualization

systemPipeTools package extends the widely used systemPipeR (SPR) workflow environment with an enhanced toolkit for data visualization, including utilities to automate the data visualization for analysis of differentially expressed genes (DEGs). systemPipeTools provides data transformation and data exploration functions via scatterplots, hierarchical clustering heatMaps, principal component analysis, multidimensional scaling, generalized principal components, t-Distributed Stochastic Neighbor embedding (t-SNE), and MA and volcano plots. All these utilities can be integrated with the modular design of the systemPipeR environment that allows users to easily substitute any of these features and/or custom with alternatives.

Maintained by Daniela Cassol. Last updated 5 months ago.

infrastructure dataimport sequencing qualitycontrol reportwriting experimentaldesign clustering differentialexpression multidimensionalscaling principalcomponent

3.4 match 4.00 score 4 scripts

kisungyou

Riemann:Learning with Data on Riemannian Manifolds

We provide a variety of algorithms for manifold-valued data, including Fréchet summaries, hypothesis testing, clustering, visualization, and other learning tasks. See Bhattacharya and Bhattacharya (2012) <doi:10.1017/CBO9781139094764> for general exposition to statistics on manifolds.

Maintained by Kisung You. Last updated 2 years ago.

openblas cpp openmp

3.7 match 10 stars 3.70 score 8 scripts

bnprks

BPCells:Single Cell Counts Matrices to PCA

Efficient operations for single cell ATAC-seq fragments and RNA counts matrices. Interoperable with standard file formats, and introduces efficient bit-packed formats that allow large storage savings and increased read speeds.

Maintained by Benjamin Parks. Last updated 1 months ago.

zlib hdf5 cpp

1.8 match 184 stars 7.48 score 172 scripts

sharifrahmanie

MBMethPred:Medulloblastoma Subgroups Prediction

Utilizing a combination of machine learning models (Random Forest, Naive Bayes, K-Nearest Neighbor, Support Vector Machines, Extreme Gradient Boosting, and Linear Discriminant Analysis) and a deep Artificial Neural Network model, 'MBMethPred' can predict medulloblastoma subgroups, including wingless (WNT), sonic hedgehog (SHH), Group 3, and Group 4 from DNA methylation beta values. See Sharif Rahmani E, Lawarde A, Lingasamy P, Moreno SV, Salumets A and Modhukur V (2023), MBMethPred: a computational framework for the accurate classification of childhood medulloblastoma subgroups using data integration and AI-based approaches. Front. Genet. 14:1233657. <doi:10.3389/fgene.2023.1233657> for more details.

Maintained by Edris Sharif Rahmani. Last updated 1 years ago.

3.6 match 3.70 score 1 scripts

bioc

ncGTW:Alignment of LC-MS Profiles by Neighbor-wise Compound-specific Graphical Time Warping with Misalignment Detection

The purpose of ncGTW is to assist XCMS with LC-MS data alignment. Currently, ncGTW can detect the feature groups misaligned by XCMS, and the user can choose whether to realign these feature groups with ncGTW.

Maintained by Chiung-Ting Wu. Last updated 5 months ago.

software massspectrometry metabolomics alignment cpp

2.7 match 8 stars 4.90 score 3 scripts

dwbapst

paleotree:Paleontological and Phylogenetic Analyses of Evolution

Provides tools for transforming, a posteriori time-scaling, and modifying phylogenies containing extinct (i.e. fossil) lineages. In particular, most users are interested in the functions timePaleoPhy, bin_timePaleoPhy, cal3TimePaleoPhy and bin_cal3TimePaleoPhy, which date cladograms of fossil taxa using stratigraphic data. This package also contains a large number of likelihood functions for estimating sampling and diversification rates from different types of data available from the fossil record (e.g. range data, occurrence data, etc). paleotree users can also simulate diversification and sampling in the fossil record using the function simFossilRecord, which is a detailed simulator for branching birth-death-sampling processes composed of discrete taxonomic units arranged in ancestor-descendant relationships. Users can use simFossilRecord to simulate diversification in incompletely sampled fossil records, under various models of morphological differentiation (i.e. the various patterns by which morphotaxa originate from one another), and with time-dependent, longevity-dependent and/or diversity-dependent rates of diversification, extinction and sampling. Additional functions allow users to translate simulated ancestor-descendant data from simFossilRecord into standard time-scaled phylogenies or unscaled cladograms that reflect the relationships among taxon units.

Maintained by David W. Bapst. Last updated 8 months ago.

1.8 match 21 stars 7.53 score 216 scripts 2 dependents

bioc

BioNERO:Biological Network Reconstruction Omnibus

BioNERO aims to integrate all aspects of biological network inference in a single package, including data preprocessing, exploratory analyses, network inference, and analyses for biological interpretations. BioNERO can be used to infer gene coexpression networks (GCNs) and gene regulatory networks (GRNs) from gene expression data. Additionally, it can be used to explore topological properties of protein-protein interaction (PPI) networks. GCN inference relies on the popular WGCNA algorithm. GRN inference is based on the "wisdom of the crowds" principle, which consists in inferring GRNs with multiple algorithms (here, CLR, GENIE3 and ARACNE) and calculating the average rank for each interaction pair. As all steps of network analyses are included in this package, BioNERO spares users from having to learn the syntaxes of several packages and how to communicate between them. Finally, users can also identify consensus modules across independent expression sets and calculate intra and interspecies module preservation statistics between different networks.

Maintained by Fabricio Almeida-Silva. Last updated 5 months ago.

software geneexpression generegulation systemsbiology graphandnetwork preprocessing network networkinference

1.7 match 27 stars 7.78 score 50 scripts 1 dependents

chavent

ClustOfVar:Clustering of Variables

Cluster analysis of a set of variables. Variables can be quantitative, qualitative or a mixture of both.

Maintained by Marie Chavent. Last updated 5 years ago.

2.0 match 7 stars 6.47 score 142 scripts 2 dependents

thomasp85

particles:A Graph Based Particle Simulator Based on D3-Force

Simulating particle movement in 2D space has many applications. The 'particles' package implements a particle simulator based on the ideas behind the 'd3-force' 'JavaScript' library. 'particles' implements all forces defined in 'd3-force' as well as others such as vector fields, traps, and attractors.

Maintained by Thomas Lin Pedersen. Last updated 3 months ago.

d3js graph-layout network network-visualization particles simulation cpp

1.8 match 119 stars 7.19 score 43 scripts

sweinand

pricelevels:Spatial Price Level Comparisons

Price comparisons within or between countries provide an overall measure of the relative difference in prices, often denoted as price levels. This package provides index number methods for such price comparisons (e.g., The World Bank, 2011, <doi:10.1596/978-0-8213-9728-2>). Moreover, it contains functions for sampling and characterizing price data.

Maintained by Sebastian Weinand. Last updated 10 months ago.

index-numbers price-comparison spatial-analysis

3.0 match 4.30 score 2 scripts

rmgpanw

gtexr:Query the GTEx Portal API

A convenient R interface to the Genotype-Tissue Expression (GTEx) Portal API. For more information on the API, see <https://gtexportal.org/api/v2/redoc>.

Maintained by Alasdair Warwick. Last updated 6 months ago.

api-wrapper bioinformatics eqtl gtex sqtl

2.0 match 5 stars 6.41 score 5 scripts

cefet-rj-dal

daltoolbox:Leveraging Experiment Lines to Data Analytics

The natural increase in the complexity of current research experiments and data demands better tools to enhance productivity in Data Analytics. The package is a framework designed to address the modern challenges in data analytics workflows. The package is inspired by Experiment Line concepts. It aims to provide seamless support for users in developing their data mining workflows by offering a uniform data model and method API. It enables the integration of various data mining activities, including data preprocessing, classification, regression, clustering, and time series prediction. It also offers options for hyper-parameter tuning and supports integration with existing libraries and languages. Overall, the package provides researchers with a comprehensive set of functionalities for data science, promoting ease of use, extensibility, and integration with various tools and libraries. Information on Experiment Line is based on Ogasawara et al. (2009) <doi:10.1007/978-3-642-02279-1_20>.

Maintained by Eduardo Ogasawara. Last updated 1 months ago.

1.9 match 1 stars 6.65 score 536 scripts 4 dependents

swarm-lab

swaRm:Processing Collective Movement Data

Function library for processing collective movement data (e.g. fish schools, ungulate herds, baboon troops) collected from GPS trackers or computer vision tracking software.

Maintained by Simon Garnier. Last updated 1 years ago.

animal-behavior animal-behaviour collective-behavior collective-behaviour

2.3 match 21 stars 5.50 score 8 scripts 1 dependents

cran

RSDA:R to Symbolic Data Analysis

Symbolic Data Analysis (SDA) was proposed by professor Edwin Diday in 1987. The main purpose of SDA is to substitute the set of rows (cases) in the data table for a concept (second order statistical unit). This package implements, to the symbolic case, certain techniques of automatic classification, as well as some linear models.

Maintained by Oldemar Rodriguez. Last updated 1 years ago.

3.8 match 1 stars 3.26 score 3 dependents

bioc

scde:Single Cell Differential Expression

The scde package implements a set of statistical methods for analyzing single-cell RNA-seq data. scde fits individual error models for single-cell RNA-seq measurements. These models can then be used for assessment of differential expression between groups of cells, as well as other types of analysis. The scde package also contains the pagoda framework which applies pathway and gene set overdispersion analysis to identify and characterize putative cell subpopulations based on transcriptional signatures. The overall approach to the differential expression analysis is detailed in the following publication: "Bayesian approach to single-cell differential expression analysis" (Kharchenko PV, Silberstein L, Scadden DT, Nature Methods, doi: 10.1038/nmeth.2967). The overall approach to subpopulation identification and characterization is detailed in the following pre-print: "Characterizing transcriptional heterogeneity through pathway and gene set overdispersion analysis" (Fan J, Salathia N, Liu R, Kaeser G, Yung Y, Herman J, Kaper F, Fan JB, Zhang K, Chun J, and Kharchenko PV, Nature Methods, doi:10.1038/nmeth.3734).

Maintained by Evan Biederstedt. Last updated 5 months ago.

immunooncology rnaseq statisticalmethod differentialexpression bayesian transcription software analysis bioinformatics heterogenity ngs single-cell transcriptomics openblas cpp openmp

1.6 match 173 stars 7.53 score 141 scripts

jkim82133

TDA:Statistical Tools for Topological Data Analysis

Tools for Topological Data Analysis. The package focuses on statistical analysis of persistent homology and density clustering. For that, this package provides an R interface for the efficient algorithms of the C++ libraries 'GUDHI' <https://project.inria.fr/gudhi/software/>, 'Dionysus' <https://www.mrzv.org/software/dionysus/>, and 'PHAT' <https://bitbucket.org/phat-code/phat/>. This package also implements methods from Fasy et al. (2014) <doi:10.1214/14-AOS1252> and Chazal et al. (2015) <doi:10.20382/jocg.v6i2a8> for analyzing the statistical significance of persistent homology features.

Maintained by Jisu Kim. Last updated 1 months ago.

gmp cpp

1.7 match 9 stars 7.18 score 204 scripts 5 dependents

matrix-profile-foundation

tsmp:Time Series with Matrix Profile

A toolkit implementing the Matrix Profile concept that was created by CS-UCR <http://www.cs.ucr.edu/~eamonn/MatrixProfile.html>.

Maintained by Francisco Bischoff. Last updated 3 years ago.

algorithm matrix-profile motif-search time-series cpp

1.7 match 72 stars 7.29 score 179 scripts 1 dependents

yuimaproject

yuima:The YUIMA Project Package for SDEs

Simulation and Inference for SDEs and Other Stochastic Processes.

Maintained by Stefano M. Iacus. Last updated 3 days ago.

openblas cpp

1.7 match 9 stars 7.26 score 92 scripts 2 dependents

toduckhanh

bcROCsurface:Bias-Corrected Methods for Estimating the ROC Surface of Continuous Diagnostic Tests

Bias-corrected estimation methods for the receiver operating characteristic (ROC) surface and the volume under the ROC surface (VUS) under the missing at random (MAR) assumption.

Maintained by Duc-Khanh To. Last updated 1 years ago.

openblas cpp

3.5 match 3.45 score 14 scripts

msq-123

CovidMutations:Mutation Analysis and Assay Validation Toolkit for COVID-19 (Coronavirus Disease 2019)

A feasible framework for mutation analysis and reverse transcription polymerase chain reaction (RT-PCR) assay evaluation of COVID-19, including mutation profile visualization, statistics and mutation ratio of each assay. The mutation ratio is conducive to evaluating the coverage of RT-PCR assays in large-sized samples <doi:10.20944/preprints202004.0529.v1>.

Maintained by Shaoqian Ma. Last updated 5 years ago.

2.8 match 4 stars 4.30 score 6 scripts

joycekang

symphony:Efficient and Precise Single-Cell Reference Atlas Mapping

Implements the Symphony single-cell reference building and query mapping algorithms and additional functions described in Kang et al <https://www.nature.com/articles/s41467-021-25957-x>.

Maintained by Joyce Kang. Last updated 2 years ago.

openblas cpp

3.1 match 3.83 score 134 scripts

connordonegan

geostan:Bayesian Spatial Analysis

For spatial data analysis; provides exploratory spatial analysis tools, spatial regression, spatial econometric, and disease mapping models, model diagnostics, and special methods for inference with small area survey data (e.g., the American Community Survey (ACS)) and censored population health monitoring data. Models are pre-specified using the Stan programming language, a platform for Bayesian inference using Markov chain Monte Carlo (MCMC). References: Carpenter et al. (2017) <doi:10.18637/jss.v076.i01>; Donegan (2021) <doi:10.31219/osf.io/3ey65>; Donegan (2022) <doi:10.21105/joss.04716>; Donegan, Chun and Hughes (2020) <doi:10.1016/j.spasta.2020.100450>; Donegan, Chun and Griffith (2021) <doi:10.3390/ijerph18136856>; Morris et al. (2019) <doi:10.1016/j.sste.2019.100301>.

Maintained by Connor Donegan. Last updated 3 months ago.

bayesian bayesian-inference bayesian-statistics epidemiology modeling public-health rspatial spatial stan cpp

1.3 match 80 stars 8.80 score 46 scripts

kdonnay

geomerge:Geospatial Data Integration

Geospatial data integration framework that merges raster, spatial polygon, and (dynamic) spatial points data into a spatial (panel) data frame at any geographical resolution.

Maintained by Karsten Donnay. Last updated 1 years ago.

3.9 match 2.93 score 17 scripts

bioc

VarCon:VarCon: an R package for retrieving neighboring nucleotides of an SNV

VarCon is an R package which converts the positional information from the annotation of a single nucleotide variation (SNV) (either referring to the coding sequence or the reference genomic sequence). It retrieves the genomic reference sequence around the position of the single nucleotide variation. To assess whether the SNV could potentially influence binding of splicing regulatory proteins, VarCon calculates the HEXplorer score as an estimation. VarCon additionally reports splice site strengths of splice sites within the retrieved genomic sequence and any changes due to the SNV.

Maintained by Johannes Ptok. Last updated 5 months ago.

functionalgenomics alternativesplicing

2.9 match 4.00 score 5 scripts

kwilliams83

ldbod:Local Density-Based Outlier Detection

Flexible procedures to compute local density-based outlier scores for ranking outliers. Both exact and approximate nearest neighbor search are supported, along with multiple neighborhood sizes and four different local density-based methods. Outlier scores can be computed against a random subsample of the input data or a user-specified reference data set, so both unsupervised and semi-supervised outlier detection can be implemented.

Maintained by Kristopher Williams. Last updated 8 years ago.

3.8 match 2 stars 3.00 score 3 scripts

bioc

cydar:Using Mass Cytometry for Differential Abundance Analyses

Identifies differentially abundant populations between samples and groups in mass cytometry data. Provides methods for counting cells into hyperspheres, controlling the spatial false discovery rate, and visualizing changes in abundance in the high-dimensional marker space.

Maintained by Aaron Lun. Last updated 5 months ago.

immunooncology flowcytometry multiplecomparison proteomics singlecell cpp

2.0 match 5.64 score 48 scripts

matloff

qeML:Quick and Easy Machine Learning Tools

The letters 'qe' in the package title stand for "quick and easy," alluding to the convenience goal of the package. We bring together a variety of machine learning (ML) tools from standard R packages, providing wrappers with a simple, convenient, and uniform interface.

Maintained by Norm Matloff. Last updated 26 days ago.

1.3 match 41 stars 8.41 score 48 scripts 1 dependents

pfh

langevitour:Langevin Tour

An HTML widget that randomly tours 2D projections of numerical data. A random walk through projections of the data is shown. The user can manipulate the plot to use specified axes, or turn on Guided Tour mode to find an informative projection of the data. Groups within the data can be hidden or shown, as can particular axes. Points can be brushed, and the selection can be linked to other widgets using crosstalk. The underlying method to produce the random walk and projection pursuit uses Langevin dynamics. The widget can be used from within R, or included in a self-contained R Markdown or Quarto document or presentation, or used in a Shiny app.

Maintained by Paul Harrison. Last updated 2 months ago.

javascript-applications langevin-dynamics tour visualization

1.8 match 26 stars 6.41 score 22 scripts 1 dependents

michaeldorman

starsExtra:Miscellaneous Functions for Working with 'stars' Rasters

Miscellaneous functions for working with 'stars' objects, mainly single-band rasters. Currently includes functions for: (1) focal filtering, (2) detrending of Digital Elevation Models, (3) calculating flow length, (4) calculating the Convergence Index, (5) calculating topographic aspect and topographic slope.

Maintained by Michael Dorman. Last updated 1 years ago.

1.7 match 25 stars 6.53 score 45 scripts 2 dependents

cran

JGL:Performs the Joint Graphical Lasso for Sparse Inverse Covariance Estimation on Multiple Classes

The Joint Graphical Lasso is a generalized method for estimating Gaussian graphical models / sparse inverse covariance matrices / biological networks on multiple classes of data. We solve JGL under two penalty functions: the Fused Graphical Lasso (FGL), which employs a fused penalty to encourage inverse covariance matrices to be similar across classes, and the Group Graphical Lasso (GGL), which encourages similar network structure between classes. FGL is recommended over GGL for most applications. Reference: Danaher P, Wang P, Witten DM. (2013) <doi:10.1111/rssb.12033>.

Maintained by Patrick Danaher. Last updated 1 years ago.

4.1 match 1 stars 2.65 score 1 dependents

minoo-asty

CINNA:Deciphering Central Informative Nodes in Network Analysis

Computing, comparing, and demonstrating top informative centrality measures within a network. "CINNA: an R/CRAN package to decipher Central Informative Nodes in Network Analysis" provides a comprehensive overview of the package functionality; see Ashtiani et al. (2018) <doi:10.1093/bioinformatics/bty819>.

Maintained by Minoo Ashtiani. Last updated 2 years ago.

3.3 match 1 stars 3.29 score 98 scripts