Showing 188 of total 188 results
braverock
PortfolioAnalytics:Portfolio Analysis, Including Numerical Methods for Optimization of Portfolios
Portfolio optimization and analysis routines and graphics.
Maintained by Brian G. Peterson. Last updated 4 hours ago.
16.9 match 87 stars 11.60 score 626 scripts 2 dependents
bnaras
pamr:Pam: Prediction Analysis for Microarrays
Some functions for sample classification in microarrays.
Maintained by Balasubramanian Narasimhan. Last updated 9 months ago.
22.8 match 7.98 score 256 scripts 14 dependents
asardaes
dtwclust:Time Series Clustering Along with Optimizations for the Dynamic Time Warping Distance
Time series clustering along with optimized techniques related to the Dynamic Time Warping distance and its corresponding lower bounds. Implementations of partitional, hierarchical, fuzzy, k-Shape and TADPole clustering are available. Functionality can be easily extended with custom distance measures and centroid definitions. Implementations of DTW barycenter averaging, a distance based on global alignment kernels, and the soft-DTW distance and centroid routines are also provided. All included distance functions have custom loops optimized for the calculation of cross-distance matrices, including parallelization support. Several cluster validity indices are included.
Maintained by Alexis Sarda. Last updated 8 months ago.
clustering dtw time-series openblas cpp
11.7 match 262 stars 12.35 score 406 scripts 14 dependents
bioc
Spectra:Spectra Infrastructure for Mass Spectrometry Data
The Spectra package defines an efficient infrastructure for storing and handling mass spectrometry spectra and functionality to subset, process, visualize and compare spectra data. It provides different implementations (backends) to store mass spectrometry data. These comprise backends tuned for fast data access and processing and backends for very large data sets ensuring a small memory footprint.
Maintained by RforMassSpectrometry Package Maintainer. Last updated 25 days ago.
infrastructure proteomics massspectrometry metabolomics bioconductor hacktoberfest mass-spectrometry
10.0 match 41 stars 13.01 score 254 scripts 35 dependents
itsleeds
od:Manipulate and Map Origin-Destination Data
The aim of 'od' is to provide tools and example datasets for working with origin-destination ('OD') datasets of the type used to describe aggregate urban mobility patterns (Carey et al. 1981) <doi:10.1287/trsc.15.1.32>. The package builds on functions for working with 'OD' data in the package 'stplanr', (Lovelace and Ellison 2018) <doi:10.32614/RJ-2018-053> with a focus on computational efficiency and support for the 'sf' class system (Pebesma 2018) <doi:10.32614/RJ-2018-009>. With few dependencies and a simple class system based on data frames, the package is intended to facilitate efficient analysis of 'OD' datasets and to provide a place for developing new functions. The package enables the creation and analysis of geographic entities representing large scale mobility patterns, from daily travel between zones in cities to migration between countries.
Maintained by Robin Lovelace. Last updated 6 months ago.
15.0 match 33 stars 8.50 score 96 scripts 6 dependents
satijalab
SeuratObject:Data Structures for Single Cell Data
Defines S4 classes for single-cell genomic data and associated information, such as dimensionality reduction embeddings, nearest-neighbor graphs, and spatially-resolved coordinates. Provides data access methods and R-native hooks to ensure the Seurat object is familiar to other R users. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, and Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031> for more details.
Maintained by Paul Hoffman. Last updated 2 years ago.
9.5 match 25 stars 11.69 score 1.2k scripts 88 dependents
ropensci
spatsoc:Group Animal Relocation Data by Spatial and Temporal Relationship
Detects spatial and temporal groups in GPS relocations (Robitaille et al. (2019) <doi:10.1111/2041-210X.13215>). It can be used to convert GPS relocations to gambit-of-the-group format to build proximity-based social networks. In addition, the randomizations function provides data-stream randomization methods suitable for GPS data.
Maintained by Alec L. Robitaille. Last updated 2 months ago.
10.8 match 24 stars 9.97 score 145 scripts 3 dependents
bioc
Cardinal:A mass spectrometry imaging toolbox for statistical analysis
Implements statistical & computational tools for analyzing mass spectrometry imaging datasets, including methods for efficient pre-processing, spatial segmentation, and classification.
Maintained by Kylie Ariel Bemis. Last updated 3 months ago.
software infrastructure proteomics lipidomics massspectrometry imagingmassspectrometry immunooncology normalization clustering classification regression
10.3 match 48 stars 10.32 score 200 scripts
rspatial
terra:Spatial Data Analysis
Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).
Maintained by Robert J. Hijmans. Last updated 3 hours ago.
geospatial raster spatial vector onetbb proj gdal geos cpp
6.0 match 560 stars 17.62 score 17k scripts 857 dependents
ludvigolsen
rearrr:Rearranging Data
Arrange data by a set of methods. Use rearrangers to reorder data points and mutators to change their values. From basic utilities, to centering the greatest value, to swirling in 3-dimensional space, 'rearrr' enables creativity when plotting and experimenting with data.
Maintained by Ludvig Renbo Olsen. Last updated 26 days ago.
arrange cluster expand forming generate ggplot2 order plotting-in-r roll rotate shaping swirl transformations
10.6 match 24 stars 7.26 score 128 scripts 8 dependents
andrewljackson
SIBER:Stable Isotope Bayesian Ellipses in R
Fits bi-variate ellipses to stable isotope data using Bayesian inference with the aim being to describe and compare their isotopic niche.
Maintained by Andrew Jackson. Last updated 10 months ago.
community-ecology ecology niche-modelling stable-isotopes jags cpp
8.4 match 37 stars 9.15 score 187 scripts 1 dependent
bioc
MSnbase:Base Functions and Classes for Mass Spectrometry and Proteomics
MSnbase provides infrastructure for manipulation, processing and visualisation of mass spectrometry and proteomics data, ranging from raw to quantitative and annotated data.
Maintained by Laurent Gatto. Last updated 18 days ago.
immunooncology infrastructure proteomics massspectrometry qualitycontrol dataimport bioconductor bioinformatics mass-spectrometry proteomics-data visualisation cpp
6.0 match 131 stars 12.76 score 772 scripts 36 dependents
iandryden
shapes:Statistical Shape Analysis
Routines for the statistical analysis of landmark shapes, including Procrustes analysis, graphical displays, principal components analysis, permutation and bootstrap tests, thin-plate spline transformation grids and comparing covariance matrices. See Dryden, I.L. and Mardia, K.V. (2016). Statistical shape analysis, with Applications in R (2nd Edition), John Wiley and Sons.
Maintained by Ian Dryden. Last updated 5 days ago.
8.5 match 7 stars 8.61 score 225 scripts 24 dependents
rspatial
geosphere:Spherical Trigonometry
Spherical trigonometry for geographic applications. That is, compute distances and related measures for angular (longitude/latitude) locations.
Maintained by Robert J. Hijmans. Last updated 6 months ago.
5.3 match 36 stars 13.80 score 5.7k scripts 119 dependents
tidymodels
recipes:Preprocessing and Feature Engineering Steps for Modeling
A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps that provide preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.
Maintained by Max Kuhn. Last updated 5 hours ago.
3.8 match 586 stars 18.79 score 7.2k scripts 381 dependents
ropensci
CoordinateCleaner:Automated Cleaning of Occurrence Records from Biological Collections
Automated flagging of common spatial and temporal errors in biological and paleontological collection data, for the use in conservation, ecology and paleontology. Includes automated tests to easily flag (and exclude) records assigned to country or province centroid, the open ocean, the headquarters of the Global Biodiversity Information Facility, urban areas or the location of biodiversity institutions (museums, zoos, botanical gardens, universities). Furthermore identifies per species outlier coordinates, zero coordinates, identical latitude/longitude and invalid coordinates. Also implements an algorithm to identify data sets with a significant proportion of rounded coordinates. Especially suited for large data sets. The reference for the methodology is: Zizka et al. (2019) <doi:10.1111/2041-210X.13152>.
Maintained by Alexander Zizka. Last updated 1 year ago.
6.0 match 82 stars 10.93 score 306 scripts 3 dependents
mikeblazanin
gcplyr:Wrangle and Analyze Growth Curve Data
Easy wrangling and model-free analysis of microbial growth curve data, as commonly output by plate readers. Tools for reshaping common plate reader outputs into 'tidy' formats and merging them with design information, making data easy to work with using 'gcplyr' and other packages. Also streamlines common growth curve processing steps, like smoothing and calculating derivatives, and facilitates model-free characterization and analysis of growth data. See methods at <https://mikeblazanin.github.io/gcplyr/>.
Maintained by Mike Blazanin. Last updated 2 months ago.
8.2 match 30 stars 7.53 score 75 scripts
itsleeds
pct:Propensity to Cycle Tool
Functions and example data to teach and increase the reproducibility of the methods and code underlying the Propensity to Cycle Tool (PCT), a research project and web application hosted at <https://www.pct.bike/>. For an academic paper on the methods, see Lovelace et al (2017) <doi:10.5198/jtlu.2016.862>.
Maintained by Robin Lovelace. Last updated 27 days ago.
8.9 match 20 stars 6.54 score
ropensci
geojsonio:Convert Data from and to 'GeoJSON' or 'TopoJSON'
Convert data to 'GeoJSON' or 'TopoJSON' from various R classes, including vectors, lists, data frames, shape files, and spatial classes. 'geojsonio' does not aim to replace packages like 'sp', 'rgdal', 'rgeos', but rather aims to be a high level client to simplify conversions of data from and to 'GeoJSON' and 'TopoJSON'.
Maintained by Michael Mahoney. Last updated 1 year ago.
geojson topojson geospatial conversion data input-output io
5.3 match 151 stars 10.83 score 2.9k scripts 13 dependents
bioc
SpatialFeatureExperiment:Integrating SpatialExperiment with Simple Features in sf
A new S4 class integrating Simple Features with the R package sf to bring geospatial data analysis methods based on vector data to spatial transcriptomics. Also implements management of spatial neighborhood graphs and geometric operations. This package builds upon SpatialExperiment and SingleCellExperiment, hence methods for these parent classes can still be used.
Maintained by Lambda Moses. Last updated 2 months ago.
datarepresentation transcriptomics spatial
6.0 match 49 stars 9.40 score 322 scripts 1 dependent
bioc
ProtGenerics:Generic infrastructure for Bioconductor mass spectrometry packages
S4 generic functions and classes needed by Bioconductor proteomics packages.
Maintained by Laurent Gatto. Last updated 3 months ago.
infrastructure proteomics massspectrometry bioconductor mass-spectrometry metabolomics
6.0 match 8 stars 9.36 score 4 scripts 188 dependents
momx
Momocs:Morphometrics using R
The goal of 'Momocs' is to provide a complete, convenient, reproducible and open-source toolkit for 2D morphometrics. It includes most common 2D morphometrics approaches on outlines, open outlines, configurations of landmarks, traditional morphometrics, and facilities for data preparation, manipulation and visualization with a consistent grammar throughout. It allows reproducible, complex morphometrics analyses and other morphometrics approaches should be easy to plug in, or develop from, on top of this canvas.
Maintained by Vincent Bonhomme. Last updated 1 year ago.
7.6 match 51 stars 7.42 score 346 scripts
jmsigner
amt:Animal Movement Tools
Manage and analyze animal movement data. The functionality of 'amt' includes methods to calculate home ranges, track statistics (e.g. step lengths, speed, or turning angles), prepare data for fitting habitat selection analyses, and simulation of space-use from fitted step-selection functions.
Maintained by Johannes Signer. Last updated 5 months ago.
5.3 match 41 stars 10.54 score 418 scripts
blosloos
enviPat:Isotope Pattern, Profile and Centroid Calculation for Mass Spectrometry
Fast and very memory-efficient calculation of isotope patterns, subsequent convolution to theoretical envelopes (profiles) plus valley detection and centroidization or intensoid calculation. Batch processing, resolution interpolation, wrapper, adduct calculations and molecular formula parsing. Loos, M., Gerber, C., Corona, F., Hollender, J., Singer, H. (2015) <doi:10.1021/acs.analchem.5b00941>.
Maintained by Martin Loos. Last updated 8 months ago.
8.7 match 7 stars 6.35 score 48 scripts 7 dependents
spatstat
spatstat.geom:Geometrical Functionality of the 'spatstat' Family
Defines spatial data types and supports geometrical operations on them. Data types include point patterns, windows (domains), pixel images, line segment patterns, tessellations and hyperframes. Capabilities include creation and manipulation of data (using command line or graphical interaction), plotting, geometrical operations (rotation, shift, rescale, affine transformation), convex hull, discretisation and pixellation, Dirichlet tessellation, Delaunay triangulation, pairwise distances, nearest-neighbour distances, distance transform, morphological operations (erosion, dilation, closing, opening), quadrat counting, geometrical measurement, geometrical covariance, colour maps, calculus on spatial domains, Gaussian blur, level sets of images, transects of images, intersections between objects, minimum distance matching. (Excludes spatial data on a network, which are supported by the package 'spatstat.linnet'.)
Maintained by Adrian Baddeley. Last updated 7 days ago.
classes-and-objects distance-calculation geometry geometry-processing images mensuration plotting point-patterns spatial-data spatial-data-analysis
4.5 match 7 stars 12.14 score 241 scripts 229 dependents
opengeos
whitebox:'WhiteboxTools' R Frontend
An R frontend for the 'WhiteboxTools' library, which is an advanced geospatial data analysis platform developed by Prof. John Lindsay at the University of Guelph's Geomorphometry and Hydrogeomatics Research Group. 'WhiteboxTools' can be used to perform common geographical information systems (GIS) analysis operations, such as cost-distance analysis, distance buffering, and raster reclassification. Remote sensing and image processing tasks include image enhancement (e.g. panchromatic sharpening, contrast adjustments), image mosaicing, numerous filtering operations, simple classification (k-means), and common image transformations. 'WhiteboxTools' also contains advanced tooling for spatial hydrological analysis (e.g. flow-accumulation, watershed delineation, stream network analysis, sink removal), terrain analysis (e.g. common terrain indices such as slope, curvatures, wetness index, hillshading; hypsometric analysis; multi-scale topographic position analysis), and LiDAR data processing. Suggested citation: Lindsay (2016) <doi:10.1016/j.cageo.2016.07.003>.
Maintained by Andrew Brown. Last updated 6 months ago.
geomorphometry geoprocessing geospatial gis hydrology remote-sensing rstudio
5.6 match 173 stars 9.65 score 203 scripts 2 dependents
srkobakian
sugarbag:Create Tessellated Hexagon Maps
Create a hexagon tile map display from spatial polygons. Each polygon is represented by a hexagon tile, placed as close to its original centroid as possible, with a focus on maintaining spatial relationship to a focal point. Developed to aid visualisation and analysis of spatial distributions across Australia, which can be challenging due to the concentration of the population on the coast and wide open interior.
Maintained by Dianne Cook. Last updated 2 years ago.
7.8 match 42 stars 6.52 score 53 scripts
bioc
GSgalgoR:An Evolutionary Framework for the Identification and Study of Prognostic Gene Expression Signatures in Cancer
A multi-objective optimization algorithm for disease sub-type discovery based on a non-dominated sorting genetic algorithm. The 'Galgo' framework combines the advantages of clustering algorithms for grouping heterogeneous 'omics' data and the searching properties of genetic algorithms for feature selection. The algorithm searches for the optimal number of clusters, considering the features that maximize the survival difference between sub-types while keeping cluster consistency high.
Maintained by Carlos Catania. Last updated 5 months ago.
geneexpression transcription clustering classification survival
9.1 match 15 stars 5.48 score 6 scripts
swarm-lab
swaRm:Processing Collective Movement Data
Function library for processing collective movement data (e.g. fish schools, ungulate herds, baboon troops) collected from GPS trackers or computer vision tracking software.
Maintained by Simon Garnier. Last updated 1 year ago.
animal-behavior animal-behaviour collective-behavior collective-behaviour
8.9 match 21 stars 5.50 score 8 scripts 1 dependent
dadongz
OncoSubtype:Predict Cancer Subtypes Based on TCGA Data using Machine Learning Method
Provide functionality for cancer subtyping using nearest centroids or machine learning methods based on TCGA data.
Maintained by Dadong Zhang. Last updated 1 year ago.
12.5 match 1 star 3.70 score 1 script
roelandkindt
BiodiversityR:Package for Community Ecology and Suitability Analysis
Graphical User Interface (via the R-Commander) and utility functions (often based on the vegan package) for statistical analysis of biodiversity and ecological communities, including species accumulation curves, diversity indices, Renyi profiles, GLMs for analysis of species abundance and presence-absence, distance matrices, Mantel tests, and cluster, constrained and unconstrained ordination analysis. A book on biodiversity and community ecology analysis is available for free download from the website. In 2012, methods for (ensemble) suitability modelling and mapping were expanded in the package.
Maintained by Roeland Kindt. Last updated 2 months ago.
6.2 match 17 stars 7.13 score 390 scripts 2 dependents
bioc
tidytof:Analyze High-dimensional Cytometry Data Using Tidy Data Principles
This package implements an interactive, scientific analysis pipeline for high-dimensional cytometry data built using tidy data principles. It is specifically designed to play well with both the tidyverse and Bioconductor software ecosystems, with functionality for reading/writing data files, data cleaning, preprocessing, clustering, visualization, modeling, and other quality-of-life functions. tidytof implements a "grammar" of high-dimensional cytometry data analysis.
Maintained by Timothy Keyes. Last updated 5 months ago.
singlecell flowcytometry bioinformatics cytometry data-science single-cell tidyverse cpp
6.0 match 18 stars 7.24 score 35 scripts
jeffreyevans
spatialEco:Spatial Analysis and Modelling Utilities
Utilities to support spatial data manipulation, query, sampling and modelling in ecological applications. Functions include models for species population density, spatial smoothing, multivariate separability, point process model for creating pseudo-absences and sub-sampling, Quadrant-based sampling and analysis, auto-logistic modeling, sampling models, cluster optimization, statistical exploratory tools and raster-based metrics.
Maintained by Jeffrey S. Evans. Last updated 29 days ago.
biodiversity conservation ecology r-spatial raster spatial vector
4.5 match 110 stars 9.55 score 736 scripts 2 dependents
robinlovelace
simodels:Flexible Framework for Developing Spatial Interaction Models
Develop spatial interaction models (SIMs). SIMs predict the amount of interaction, for example number of trips per day, between geographic entities representing trip origins and destinations. Contains functions for creating origin-destination datasets from geographic input datasets and calculating movement between origin-destination pairs with constrained, production-constrained, and attraction-constrained models (Wilson 1979) <doi:10.1068/a030001>.
Maintained by Robin Lovelace. Last updated 25 days ago.
6.0 match 18 stars 6.90 score 11 scripts
cpanse
protViz:Visualizing and Analyzing Mass Spectrometry Related Data in Proteomics
Helps with quality checks, visualizations and analysis of mass spectrometry data, coming from proteomics experiments. The package is developed, tested and used at the Functional Genomics Center Zurich <https://fgcz.ch>. We use this package mainly for prototyping, teaching, and having fun with proteomics data. But it can also be used to do data analysis for small scale data sets.
Maintained by Christian Panse. Last updated 1 year ago.
fun mass-spectrometry peptide-identification proteomics quantification visualization cpp
5.1 match 11 stars 7.88 score 72 scripts 2 dependents
cran
flexclust:Flexible Cluster Algorithms
The main function kcca implements a general framework for k-centroids cluster analysis supporting arbitrary distance measures and centroid computation. Further cluster methods include hard competitive learning, neural gas, and QT clustering. There are numerous visualization methods for cluster results (neighborhood graphs, convex cluster hulls, barcharts of centroids, ...), and bootstrap methods for the analysis of cluster stability.
Maintained by Bettina Grün. Last updated 1 month ago.
6.7 match 3 stars 5.99 score 53 dependents
barnhilldave
TML:Tropical Geometry Tools for Machine Learning
Suite of tropical geometric tools for use in machine learning applications. These methods may be summarized in the following references: Yoshida, et al. (2022) <arxiv:2209.15045>, Barnhill et al. (2023) <arxiv:2303.02539>, Barnhill and Yoshida (2023) <doi:10.3390/math11153433>, Aliatimis et al. (2023) <arXiv:2306.08796>, Yoshida et al. (2022) <arXiv:2206.04206>, and Yoshida et al. (2019) <doi:10.1007/s11538-018-0493-4>.
Maintained by David Barnhill. Last updated 8 months ago.
11.4 match 3 stars 3.48 score 1 script
adamlilith
fasterRaster:Faster Raster and Spatial Vector Processing Using 'GRASS GIS'
Processing of large-in-memory/large-on-disk rasters and spatial vectors using 'GRASS GIS' <https://grass.osgeo.org/>. Most functions in the 'terra' package are recreated. Processing of medium-sized and smaller spatial objects will nearly always be faster using 'terra' or 'sf', but for large-in-memory/large-on-disk objects, 'fasterRaster' may be faster. To use most of the functions, you must have the stand-alone version (not the 'OSGeoW4' installer version) of 'GRASS GIS' 8.0 or higher.
Maintained by Adam B. Smith. Last updated 5 days ago.
aspect distance fragmentation fragmentation-indices gis grass grass-gis raster raster-projection rasterize slope topography vectorization
5.0 match 57 stars 7.68 score 8 scripts
ecospat
ecospat:Spatial Ecology Miscellaneous Methods
Collection of R functions and data sets for the support of spatial ecology analyses with a focus on pre, core and post modelling analyses of species distribution, niche quantification and community assembly. Written by current and former members and collaborators of the ecospat group of Antoine Guisan, Department of Ecology and Evolution (DEE) and Institute of Earth Surface Dynamics (IDYST), University of Lausanne, Switzerland. Read Di Cola et al. (2016) <doi:10.1111/ecog.02671> for details.
Maintained by Olivier Broennimann. Last updated 2 months ago.
4.0 match 32 stars 9.35 score 418 scripts 1 dependent
dernst
flexord:Flexible Clustering of Ordinal and Mixed-with-Ordinal Data
Extends the capabilities for flexible partitioning and model-based clustering available in the packages 'flexclust' and 'flexmix' to handle ordinal and mixed-with-ordinal data types via new distance, centroid and driver functions that make various assumptions regarding ordinality. Using them within the flex-scheme allows for easy comparisons across methods.
Maintained by Lena Ortega Menjivar. Last updated 6 days ago.
6.6 match 2 stars 5.51 score
vegandevs
vegan:Community Ecology Package
Ordination methods, diversity analysis and other functions for community and vegetation ecologists.
Maintained by Jari Oksanen. Last updated 1 month ago.
ecological-modelling ecology ordination fortran openblas
1.9 match 476 stars 19.40 score 15k scripts 445 dependents
emf-creaf
vegclust:Fuzzy Clustering of Vegetation Data
A set of functions to: (1) perform fuzzy clustering of vegetation data (De Caceres et al, 2010) <doi:10.1111/j.1654-1103.2010.01211.x>; (2) to assess ecological community similarity on the basis of structure and composition (De Caceres et al, 2013) <doi:10.1111/2041-210X.12116>.
Maintained by Miquel De Cáceres. Last updated 8 months ago.
5.7 match 2 stars 6.27 score 52 scripts 6 dependents
kylebittinger
usedist:Distance Matrix Utilities
Functions to re-arrange, extract, and work with distances.
Maintained by Kyle Bittinger. Last updated 10 months ago.
5.4 match 14 stars 6.63 score 169 scripts 6 dependents
topepo
caret:Classification and Regression Training
Misc functions for training and plotting classification and regression models.
Maintained by Max Kuhn. Last updated 4 months ago.
1.8 match 1.6k stars 19.24 score 61k scripts 303 dependents
bblonder
hypervolume:High Dimensional Geometry, Set Operations, Projection, and Inference Using Kernel Density Estimation, Support Vector Machines, and Convex Hulls
Estimates the shape and volume of high-dimensional datasets and performs set operations: intersection / overlap, union, unique components, inclusion test, and hole detection. Uses stochastic geometry approach to high-dimensional kernel density estimation, support vector machine delineation, and convex hull generation. Applications include modeling trait and niche hypervolumes and species distribution modeling.
Maintained by Benjamin Blonder. Last updated 2 months ago.
3.5 match 23 stars 9.69 score 211 scripts 7 dependents
adafede
CentroidR:CentroidR
CentroidR provides the infrastructure to centroid profile spectra.
Maintained by Adriano Rutz. Last updated 3 days ago.
12.4 match 2.60 score
epivec
TDLM:Systematic Comparison of Trip Distribution Laws and Models
The main purpose of this package is to propose a rigorous framework to fairly compare trip distribution laws and models as described in Lenormand et al. (2016) <doi:10.1016/j.jtrangeo.2015.12.008>.
Maintained by Maxime Lenormand. Last updated 26 days ago.
6.6 match 2 stars 4.85 score 3 scripts
jmadinlab
habtools:Tools and Metrics for 3D Surfaces and Objects
A collection of functions for sampling and simulating 3D surfaces and objects and estimating metrics like rugosity, fractal dimension, convexity, sphericity, circularity, second moments of area and volume, and more.
Maintained by Nina Schiettekatte. Last updated 26 days ago.
5.2 match 12 stars 6.10 score 9 scripts
murrayefford
secr:Spatially Explicit Capture-Recapture
Functions to estimate the density and size of a spatially distributed animal population sampled with an array of passive detectors, such as traps, or by searching polygons or transects. Models incorporating distance-dependent detection are fitted by maximizing the likelihood. Tools are included for data manipulation and model selection.
Maintained by Murray Efford. Last updated 5 days ago.
3.0 match 3 stars 10.06 score 410 scripts 5 dependents
ropensci
eph:Argentina's Permanent Household Survey Data and Manipulation Utilities
Tools to download and manipulate the Permanent Household Survey from Argentina (EPH is the Spanish acronym for Permanent Household Survey), e.g. get_microdata() for downloading the datasets, get_poverty_lines() for downloading the official poverty baskets, and calculate_poverty() for determining whether a household is in poverty, following the official methodology. organize_panels() is used to concatenate observations from different periods, and organize_labels() adds the official labels to the data. The implemented methods are based on INDEC (2016) <http://www.estadistica.ec.gba.gov.ar/dpe/images/SOCIEDAD/EPH_metodologia_22_pobreza.pdf>. As this package works with the Argentinian Permanent Household Survey and its main audience is from this country, the documentation was written in Spanish.
Maintained by Carolina Pradier. Last updated 8 months ago.
eph indec mercado-de-trabajo rstatses
3.5 match 59 stars 8.38 score 255 scripts
christopherkenny
geomander:Geographic Tools for Studying Gerrymandering
A compilation of tools to complete common tasks for studying gerrymandering. This focuses on the geographic tool side of common problems, such as linking different levels of spatial units or estimating how to break up units. Functions exist for creating redistricting-focused data for the US.
Maintained by Christopher T. Kenny. Last updated 1 month ago.
3.6 match 14 stars 7.81 score 191 scripts 1 dependent
roaldarbol
animovement:An R toolbox for analysing animal movement across space and time
An R toolbox for analysing animal movement across space and time.
Maintained by Mikkel Roald-Arbøl. Last updated 3 months ago.
animal-behaviour animal-movement neuroethology neuroscience
5.5 match 10 stars 4.81 score 8 scripts
tguillerme
dispRity:Measuring Disparity
A modular package for measuring disparity (multidimensional space occupancy). Disparity can be calculated from any matrix defining a multidimensional space. The package provides a set of implemented metrics to measure properties of the space and allows users to provide and test their own metrics. The package also provides functions for looking at disparity in a serial way (e.g. disparity through time) or per groups as well as visualising the results. Finally, this package provides several statistical tests for disparity analysis.
Maintained by Thomas Guillerme. Last updated 14 days ago.
disparity ecology multidimensionality palaeobiology
3.0 match 26 stars 8.65 score 220 scripts 1 dependent
jayanilakshika
quollr:Visualising How Nonlinear Dimension Reduction Warps Your Data
Constructs a model in 2D space from 2D embedding data and then lifts it to the high-dimensional space. Additionally, it provides tools to visualize the model in 2D space and to overlay the fitted model on data using the tour technique. Furthermore, it facilitates the generation of summaries of high-dimensional distributions.
Maintained by Jayani P.G. Lakshika. Last updated 2 days ago.
5.7 match 3 stars 4.48 score 7 scripts
cran
smfishHmrf:Hidden Markov Random Field for Spatial Transcriptomic Data
Discovery of spatial patterns with Hidden Markov Random Field. This package is designed for spatial transcriptomic data and single molecule fluorescent in situ hybridization (FISH) data such as sequential fluorescence in situ hybridization (seqFISH) and multiplexed error-robust fluorescence in situ hybridization (MERFISH). The methods implemented in this package are described in Zhu et al. (2018) <doi:10.1038/nbt.4260>.
Maintained by Qian Zhu. Last updated 4 years ago.
15.0 match 1.70 score
sapfluxnet
sapfluxnetr:Working with 'Sapfluxnet' Project Data
Access, modify, aggregate and plot data from the 'Sapfluxnet' project (<http://sapfluxnet.creaf.cat>), the first global database of sap flow measurements.
Maintained by Victor Granda. Last updated 2 years ago.
3.9 match 25 stars 6.57 score 49 scripts
benbruyneel
proteinDiscover:ProteinDiscover
Provides an interface to the data contained in Proteome Discoverer (Thermo Scientific) results.
Maintained by Ben Bruyneel. Last updated 1 year ago.
mass-spectrometry, proteomics, proteomics-data-analysis
7.5 match 2 stars 3.00 score 2 scripts
trelliscope
trelliscope:Create Interactive Multi-Panel Displays
Trelliscope enables interactive exploration of data frames of visualizations.
Maintained by Ryan Hafen. Last updated 7 months ago.
3.5 match 29 stars 6.43 score 117 scripts
ropengov
geofi:Access Finnish Geospatial Data
Designed to simplify geospatial data access from the Statistics Finland Web Feature Service API <https://geo.stat.fi/geoserver/index.html>, the geofi package offers researchers and analysts a set of tools to obtain and harmonize administrative spatial data for a wide range of applications, from urban planning to environmental research. The package contains annually updated time series of municipality key datasets that can be used for data aggregation and language translations.
Maintained by Markus Kainu. Last updated 2 months ago.
2.8 match 20 stars 8.17 score 61 scripts
kaiaragaki
classifyBLCA:What the Package Does (One Line, Title Case)
What the package does (one paragraph).
Maintained by Kai Aragaki. Last updated 2 years ago.
13.1 match 1.70 score
inlabru-org
fmesher:Triangle Meshes and Related Geometry Tools
Generate planar and spherical triangle meshes, compute finite element calculations for 1- and 2-dimensional flat and curved manifolds with associated basis function spaces, methods for lines and polygons, and transparent handling of coordinate reference systems and coordinate transformation, including 'sf' and 'sp' geometries. The core 'fmesher' library code was originally part of the 'INLA' package, and implements parts of "Triangulations and Applications" by Hjelle and Daehlen (2006) <doi:10.1007/3-540-33261-8>.
Maintained by Finn Lindgren. Last updated 18 hours ago.
1.9 match 16 stars 11.28 score 261 scripts 26 dependents
bioc
MsCoreUtils:Core Utils for Mass Spectrometry Data
MsCoreUtils defines low-level functions for mass spectrometry data and is independent of any high-level data structures. These functions include mass spectra processing functions (noise estimation, smoothing, binning, baseline estimation), quantitative aggregation functions (median polish, robust summarisation, ...), missing data imputation, data normalisation (quantiles, vsn, ...), and miscellaneous helper functions that are used across high-level data structures within the R for Mass Spectrometry packages.
Maintained by RforMassSpectrometry Package Maintainer. Last updated 11 days ago.
infrastructure, proteomics, massspectrometry, metabolomics, bioconductor, mass-spectrometry, utils
2.0 match 16 stars 10.57 score 41 scripts 71 dependents
teunbrand
ggh4x:Hacks for 'ggplot2'
A 'ggplot2' extension that does a variety of little helpful things. The package extends 'ggplot2' facets through customisation, by setting individual scales per panel, resizing panels and providing nested facets. Also allows multiple colour and fill scales per plot. Also hosts a smaller collection of stats, geoms and axis guides.
Maintained by Teun van den Brand. Last updated 13 days ago.
1.5 match 617 stars 14.06 score 4.4k scripts 21 dependents
josiahparry
rsgeo:An Interface to Rust's 'geo' Library
An R interface to the GeoRust crates 'geo' and 'geo-types' providing access to geometry primitives and algorithms.
Maintained by Josiah Parry. Last updated 8 months ago.
5.3 match 47 stars 3.96 score 13 scripts
bioc
CelliD:Unbiased Extraction of Single Cell gene signatures using Multiple Correspondence Analysis
CelliD is a clustering-free multivariate statistical method for the robust extraction of per-cell gene signatures from single-cell RNA-seq. CelliD allows unbiased cell identity recognition across different donors, tissues-of-origin, model organisms and single-cell omics protocols. The package can also be used to explore functional pathways enrichment in single cell data.
Maintained by Akira Cortal. Last updated 5 months ago.
rnaseq, singlecell, dimensionreduction, clustering, genesetenrichment, geneexpression, atacseq, openblas, cpp, openmp
4.3 match 4.85 score 70 scripts
pbs-software
PBSmapping:Mapping Fisheries Data and Spatial Analysis Tools
This software has evolved from fisheries research conducted at the Pacific Biological Station (PBS) in 'Nanaimo', British Columbia, Canada. It extends the R language to include two-dimensional plotting features similar to those commonly available in a Geographic Information System (GIS). Embedded C code speeds algorithms from computational geometry, such as finding polygons that contain specified point events or converting between longitude-latitude and Universal Transverse Mercator (UTM) coordinates. Additionally, we include 'C++' code developed by Angus Johnson for the 'Clipper' library, data for a global shoreline, and other data sets in the public domain. Under the user's R library directory '.libPaths()', specifically in './PBSmapping/doc', a complete user's guide is offered and should be consulted to use package functions effectively.
Maintained by Rowan Haigh. Last updated 6 months ago.
2.0 match 11 stars 10.16 score 652 scripts 9 dependents
mikejohnson51
AOI:Areas of Interest
A consistent tool kit for forward and reverse geocoding and defining boundaries for spatial analysis.
Maintained by Mike Johnson. Last updated 1 year ago.
aoi, area-of-interest, bounding-boxes, gis, spatial, subset
4.0 match 37 stars 4.98 score 174 scripts 1 dependent
dusadrian
venn:Draw Venn Diagrams
A close to zero dependency package to draw and display Venn diagrams up to 7 sets, and any Boolean union of set intersections.
Maintained by Adrian Dusa. Last updated 6 months ago.
2.0 match 30 stars 9.90 score 508 scripts 13 dependents
plangfelder
WGCNA:Weighted Correlation Network Analysis
Functions necessary to perform Weighted Correlation Network Analysis on high-dimensional data as originally described in Horvath and Zhang (2005) <doi:10.2202/1544-6115.1128> and Langfelder and Horvath (2008) <doi:10.1186/1471-2105-9-559>. Includes functions for rudimentary data cleaning, construction of correlation networks, module identification, summarization, and relating of variables and modules to sample traits. Also includes a number of utility functions for data manipulation and visualization.
Maintained by Peter Langfelder. Last updated 6 months ago.
2.0 match 54 stars 9.65 score 5.3k scripts 32 dependents
bioc
CMA:Synthesis of microarray-based classification
This package provides a comprehensive collection of various microarray-based classification algorithms both from Machine Learning and Statistics. Variable Selection, Hyperparameter tuning, Evaluation and Comparison can be performed combined or stepwise in a user-friendly environment.
Maintained by Roman Hornung. Last updated 5 months ago.
3.8 match 5.09 score 61 scripts
bioc
matter:Out-of-core statistical computing and signal processing
Toolbox for larger-than-memory scientific computing and visualization, providing efficient out-of-core data structures using files or shared memory, for dense and sparse vectors, matrices, and arrays, with applications to nonuniformly sampled signals and images.
Maintained by Kylie A. Bemis. Last updated 4 months ago.
infrastructure, datarepresentation, dataimport, dimensionreduction, preprocessing, cpp
2.0 match 57 stars 9.52 score 64 scripts 2 dependents
qile0317
APackOfTheClones:Visualization of Clonal Expansion for Single Cell Immune Profiles
Visualize clonal expansion via circle-packing. 'APackOfTheClones' extends 'scRepertoire' to produce a publication-ready visualization of clonal expansion at a single cell resolution, by representing expanded clones as differently sized circles. The method was originally implemented by Murray Christian and Ben Murrell in the following immunology study: Ma et al. (2021) <doi:10.1126/sciimmunol.abg6356>.
Maintained by Qile Yang. Last updated 4 months ago.
clonal-analysis, immune-repertoire, immune-system, scrna-seq, scrnaseq, seurat, single-cell, single-cell-genomics, cpp
2.9 match 15 stars 6.45 score 15 scripts
bioc
ggcyto:Visualize Cytometry data with ggplot
With the dedicated fortify method implemented for flowSet, ncdfFlowSet and GatingSet classes, both raw and gated flow cytometry data can be plotted directly with ggplot. The ggcyto wrapper and some custom layers also make it easy to add gates and population statistics to the plot.
Maintained by Mike Jiang. Last updated 5 months ago.
immunooncology, flowcytometry, cellbasedassays, infrastructure, visualization
1.7 match 58 stars 11.25 score 362 scripts 5 dependents
welch-lab
rliger:Linked Inference of Genomic Experimental Relationships
Uses an extension of nonnegative matrix factorization to identify shared and dataset-specific factors. See Welch J, Kozareva V, et al (2019) <doi:10.1016/j.cell.2019.05.006>, and Liu J, Gao C, Sodicoff J, et al (2020) <doi:10.1038/s41596-020-0391-8> for more details.
Maintained by Yichen Wang. Last updated 3 months ago.
nonnegative-matrix-factorization, single-cell, openblas, cpp
1.7 match 408 stars 10.77 score 334 scripts 1 dependent
immunogenomics
harmony:Fast, Sensitive, and Accurate Integration of Single Cell Data
Implementation of the Harmony algorithm for single cell integration, described in Korsunsky et al <doi:10.1038/s41592-019-0619-0>. Package includes a standalone Harmony function and interfaces to external frameworks.
Maintained by Ilya Korsunsky. Last updated 5 months ago.
algorithm, data-integration, scrna-seq, openblas, cpp
1.3 match 554 stars 13.74 score 5.5k scripts 8 dependents
admahood
neonPlantEcology:Process NEON Plant Data for Ecological Analysis
Downloading and organizing plant presence and percent cover data from the National Ecological Observatory Network <https://www.neonscience.org>.
Maintained by Adam Mahood. Last updated 3 months ago.
3.6 match 8 stars 5.08 score 7 scripts
zarquon42b
Morpho:Calculations and Visualisations Related to Geometric Morphometrics
A toolset for Geometric Morphometrics and mesh processing. This includes (among other stuff) mesh deformations based on reference points, permutation tests, detection of outliers, processing of sliding semi-landmarks and semi-automated surface landmark placement.
Maintained by Stefan Schlager. Last updated 5 months ago.
1.8 match 51 stars 10.01 score 218 scripts 13 dependents
rozetasimonovska
SDPDmod:Spatial Dynamic Panel Data Modeling
Spatial model calculation for static and dynamic panel data models, weights matrix creation and Bayesian model comparison. Bayesian model comparison methods were described by 'LeSage' (2014) <doi:10.1016/j.spasta.2014.02.002>. The 'Lee'-'Yu' transformation approach is described in 'Yu', 'De Jong' and 'Lee' (2008) <doi:10.1016/j.jeconom.2008.08.002>, 'Lee' and 'Yu' (2010) <doi:10.1016/j.jeconom.2009.08.001> and 'Lee' and 'Yu' (2010) <doi:10.1017/S0266466609100099>.
Maintained by Rozeta Simonovska. Last updated 12 months ago.
3.6 match 5 stars 4.98 score 19 scripts
bioc
smoppix:Analyze Single Molecule Spatial Omics Data Using the Probabilistic Index
Test for univariate and bivariate spatial patterns in spatial omics data with single-molecule resolution. The tests implemented allow for analysis of nested designs and are automatically calibrated to different biological specimens. Tests for aggregation, colocalization, gradients and vicinity to cell edge or centroid are provided.
Maintained by Stijn Hawinkel. Last updated 1 month ago.
transcriptomics, spatial, singlecell, cpp
3.3 match 1 star 5.10 score 4 scripts
chrhennig
fpc:Flexible Procedures for Clustering
Various methods for clustering and cluster validation. Fixed point clustering. Linear regression clustering. Clustering by merging Gaussian mixture components. Symmetric and asymmetric discriminant projections for visualisation of the separation of groupings. Cluster validation statistics for distance based clustering including corrected Rand index. Standardisation of cluster validation statistics by random clusterings and comparison between many clustering methods and numbers of clusters based on this. Cluster-wise cluster stability assessment. Methods for estimation of the number of clusters: Calinski-Harabasz, Tibshirani and Walther's prediction strength, Fang and Wang's bootstrap stability. Gaussian/multinomial mixture fitting for mixed continuous/categorical variables. Variable-wise statistics for cluster interpretation. DBSCAN clustering. Interface functions for many clustering methods implemented in R, including estimating the number of clusters with kmeans, pam and clara. Modality diagnosis for Gaussian mixtures. For an overview see package?fpc.
Maintained by Christian Hennig. Last updated 6 months ago.
1.8 match 11 stars 9.32 score 2.6k scripts 69 dependents
tomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 year ago.
2.0 match 3 stars 8.20 score 7.8k scripts 11 dependents
bart1
move:Visualizing and Analyzing Animal Track Data
Contains functions to access movement data stored in 'movebank.org' as well as tools to visualize and statistically analyze animal movement data, including functions to calculate dynamic Brownian Bridge Movement Models. Move helps address movement ecology questions.
Maintained by Bart Kranstauber. Last updated 4 months ago.
1.9 match 8.70 score 690 scripts 3 dependents
bioc
MsBackendSql:SQL-based Mass Spectrometry Data Backend
SQL-based mass spectrometry (MS) data backend that also supports storage and handling of very large data sets. Objects from this package are supposed to be used with the Spectra Bioconductor package. With its minimal memory footprint, MsBackendSql thus provides an alternative MS data representation for very large or remote MS data sets.
Maintained by Johannes Rainer. Last updated 15 days ago.
infrastructure, massspectrometry, metabolomics, dataimport, proteomics
3.0 match 4 stars 5.41 score 16 scripts
computationalstylistics
stylo:Stylometric Multivariate Analyses
Supervised and unsupervised multivariate methods, supplemented by GUI and some visualizations, to perform various analyses in the field of computational stylistics, authorship attribution, etc. For further reference, see Eder et al. (2016), <https://journal.r-project.org/archive/2016/RJ-2016-007/index.html>. You are also encouraged to visit the Computational Stylistics Group's website <https://computationalstylistics.github.io/>, where a reasonable amount of information about the package and related projects is provided.
Maintained by Maciej Eder. Last updated 3 months ago.
1.9 match 187 stars 8.58 score 462 scripts
freezenik
BayesX:R Utilities Accompanying the Software Package BayesX
Functions for exploring and visualising estimation results obtained with BayesX, a free software for estimating structured additive regression models (<https://www.uni-goettingen.de/de/bayesx/550513.html>). In addition, it provides functions to read, write and manipulate map objects that are required in spatial analyses performed with BayesX.
Maintained by Nikolaus Umlauf. Last updated 1 year ago.
4.3 match 3.71 score 48 scripts 3 dependents
cran
sda:Shrinkage Discriminant Analysis and CAT Score Variable Selection
Provides an efficient framework for high-dimensional linear and diagonal discriminant analysis with variable selection. The classifier is trained using James-Stein-type shrinkage estimators and predictor variables are ranked using correlation-adjusted t-scores (CAT scores). Variable selection error is controlled using false non-discovery rates or higher criticism.
Maintained by Korbinian Strimmer. Last updated 3 years ago.
4.9 match 3.21 score 3 dependents
jdtuck
fdasrvf:Elastic Functional Data Analysis
Performs alignment, PCA, and modeling of multidimensional and unidimensional functions using the square-root velocity framework (Srivastava et al., 2011 <doi:10.48550/arXiv.1103.3817> and Tucker et al., 2014 <DOI:10.1016/j.csda.2012.12.001>). This framework allows for elastic analysis of functional data through phase and amplitude separation.
Maintained by J. Derek Tucker. Last updated 1 month ago.
2.0 match 13 stars 7.79 score 83 scripts 3 dependents
glenndavis52
munsellinterpol:Interpolate Munsell Renotation Data from Hue Value/Chroma to CIE/RGB
Methods for interpolating data in the Munsell color system following the ASTM D-1535 standard. Hues and chromas with decimal values can be interpolated and converted to/from the Munsell color system and CIE xyY, CIE XYZ, CIE Lab, CIE Luv, or RGB. Includes ISCC-NBS color block lookup. Based on the work by Paul Centore, "The Munsell and Kubelka-Munk Toolbox".
Maintained by Glenn Davis. Last updated 2 months ago.
3.4 match 2 stars 4.31 score 43 scripts 2 dependents
cran
NCSampling:Nearest Centroid (NC) Sampling
Provides functionality for performing Nearest Centroid (NC) Sampling. The NC sampling procedure was developed for forestry applications and selects plots for ground measurement so as to maximize the efficiency of imputation estimates. It uses multiple auxiliary variables and multivariate clustering to search for an optimal sample. Further details are given in Melville G. & Stone C. (2016) <doi:10.1080/00049158.2016.1218265>.
Maintained by Gavin Melville. Last updated 8 years ago.
14.4 match 1.00 score
tgoodbody
sgsR:Structurally Guided Sampling
Structurally guided sampling (SGS) approaches for airborne laser scanning (ALS; LIDAR). Primary functions provide means to generate data-driven stratifications & methods for allocating samples. Intermediate functions for calculating and extracting important information about input covariates and samples are also included. Processing outcomes are intended to help forest and environmental management practitioners better optimize field sample placement as well as assess and augment existing sample networks in the context of data distributions and conditions. ALS data is the primary intended use case, however any rasterized remote sensing data can be used, enabling data-driven stratifications and sampling approaches.
Maintained by Tristan RH Goodbody. Last updated 29 days ago.
1.9 match 46 stars 7.50 score 34 scripts
trevorld
affiner:A Finer Way to Render 3D Illustrated Objects in 'grid' Using Affine Transformations
Dilate, permute, project, reflect, rotate, shear, and translate 2D and 3D points. Supports parallel projections including oblique projections such as the cabinet projection as well as axonometric projections such as the isometric projection. Use 'grid's "affine transformation" feature to render illustrated flat surfaces.
Maintained by Trevor L. Davis. Last updated 4 months ago.
2.0 match 9 stars 6.91 score 1 script 5 dependents
mobiodiv
mobr:Measurement of Biodiversity
Functions for calculating metrics for the measurement of biodiversity and its changes across scales, treatments, and gradients. The methods implemented in this package are described in: Chase, J.M., et al. (2018) <doi:10.1111/ele.13151>, McGlinn, D.J., et al. (2019) <doi:10.1111/2041-210X.13102>, McGlinn, D.J., et al. (2020) <doi:10.1101/851717>, and McGlinn, D.J., et al. (2023) <doi:10.1101/2023.09.19.558467>.
Maintained by Daniel McGlinn. Last updated 12 days ago.
biodiversity, conservation, ecology, rarefaction, species, statistics
1.6 match 23 stars 8.65 score 93 scripts
tidymodels
tidyclust:A Common API to Clustering
A common interface for specifying clustering models, in the same style as 'parsnip'. Creates a unified interface across different functions and computational engines.
Maintained by Emil Hvitfeldt. Last updated 2 months ago.
1.9 match 112 stars 7.21 score 139 scripts
vtshen
AHM:Additive Heredity Model: Method for the Mixture-of-Mixtures Experiments
An implementation of the additive heredity model for the mixture-of-mixtures experiments of Shen et al. (2019) in Technometrics <doi:10.1080/00401706.2019.1630010>. The additive heredity model considers an additive structure to inherently connect the major components with the minor components. The additive heredity model has a meaningful interpretation for the estimated model because of the hierarchical and heredity principles applied and the nonnegative garrote technique used for variable selection.
Maintained by Sumin Shen. Last updated 6 years ago.
5.0 match 2.70 score 2 scripts
bioc
scmap:A tool for unsupervised projection of single cell RNA-seq data
Single-cell RNA-seq (scRNA-seq) is widely used to investigate the composition of complex tissues since the technology allows researchers to define cell-types using unsupervised clustering of the transcriptome. However, due to differences in experimental methods and computational analyses, it is often challenging to directly compare the cells identified in two different experiments. scmap is a method for projecting cells from a scRNA-seq experiment on to the cell-types or individual cells identified in a different experiment.
Maintained by Vladimir Kiselev. Last updated 5 months ago.
immunooncology, singlecell, software, classification, supportvectormachine, rnaseq, visualization, transcriptomics, datarepresentation, transcription, sequencing, preprocessing, geneexpression, dataimport, bioconductor-package, human-cell-atlas, projection-mapping, single-cell-rna-seq, openblas, cpp
1.5 match 95 stars 8.82 score 172 scripts
pdil
usmapdata:Mapping Data for 'usmap' Package
Provides a container for data used by the 'usmap' package. The data used by 'usmap' has been extracted into this package so that the file size of the 'usmap' package can be reduced greatly. The data in this package will be updated roughly once per year as new map data files are provided by the US Census Bureau.
Maintained by Paolo Di Lorenzo. Last updated 25 days ago.
counties, data, fips, mapping, states, usa
2.0 match 5 stars 6.59 score 35 scripts 3 dependents
bioc
HiCDOC:A/B compartment detection and differential analysis
HiCDOC normalizes intrachromosomal Hi-C matrices, uses unsupervised learning to predict A/B compartments from multiple replicates, and detects significant compartment changes between experiment conditions. It provides a collection of functions assembled into a pipeline to filter and normalize the data, predict the compartments and visualize the results. It accepts several types of data: tabular `.tsv` files, Cooler `.cool` or `.mcool` files, Juicer `.hic` files or HiC-Pro `.matrix` and `.bed` files.
Maintained by Maigné Élise. Last updated 4 months ago.
hic, dna3dstructure, normalization, sequencing, software, clustering, cpp
2.3 match 4 stars 5.86 score 6 scripts 1 dependent
bioc
geva:Gene Expression Variation Analysis (GEVA)
Statistical methods to evaluate variations of differential expression (DE) between multiple biological conditions. It takes into account the fold-changes and p-values from previous differential expression (DE) results that use large-scale data (*e.g.*, microarray and RNA-seq) and evaluates which genes would react in response to the distinct experiments. This evaluation involves a unique pipeline of statistical methods, including weighted summarization, quantile detection, cluster analysis, and ANOVA tests, in order to classify a subset of relevant genes whose DE is similar or dependent on certain biological factors.
Maintained by Itamar José Guimarães Nunes. Last updated 5 months ago.
classification, differentialexpression, geneexpression, microarray, multiplecomparison, rnaseq, systemsbiology, transcriptomics
3.0 match 2 stars 4.30 score 4 scripts
bioc
cola:A Framework for Consensus Partitioning
Subgroup classification is a basic task in genomic data analysis, especially for gene expression and DNA methylation data analysis. It can also be used to test the agreement to known clinical annotations, or to test whether there exist significant batch effects. The cola package provides a general framework for subgroup classification by consensus partitioning. It has the following features: 1. It modularizes the consensus partitioning processes so that various methods can be easily integrated. 2. It provides rich visualizations for interpreting the results. 3. It allows running multiple methods at the same time and provides functionalities to compare results straightforwardly. 4. It provides a new method to extract features which are more efficient to separate subgroups. 5. It automatically generates detailed reports for the complete analysis. 6. It allows applying consensus partitioning in a hierarchical manner.
Maintained by Zuguang Gu. Last updated 2 months ago.
clustering, geneexpression, classification, software, consensus-clustering, cpp
1.7 match 61 stars 7.49 score 112 scripts
bioc
Mfuzz:Soft clustering of omics time series data
The Mfuzz package implements noise-robust soft clustering of omics time-series data, including transcriptomic, proteomic or metabolomic data. It is based on the use of c-means clustering. For convenience, it includes a graphical user interface.
Maintained by Matthias Futschik. Last updated 5 months ago.
microarray, clustering, timecourse, preprocessing, visualization
1.6 match 7.64 score 338 scripts 4 dependents
gavinrozzi
zipcodeR:Data & Functions for Working with US ZIP Codes
Make working with ZIP codes in R painless with an integrated dataset of U.S. ZIP codes and functions for working with them. Search ZIP codes by multiple geographies, including state, county, city & across time zones. Also included are functions for relating ZIP codes to Census data, geocoding & distance calculations.
Maintained by Gavin Rozzi. Last updated 1 year ago.
1.7 match 80 stars 7.31 score 176 scripts
pgiraudoux
pgirmess:Spatial Analysis and Data Mining for Field Ecologists
Set of tools for reading, writing and transforming spatial and seasonal data, model selection and specific statistical tests for ecologists. It includes functions to interpolate regular positions of points between landmarks, to discretize polylines into regular point positions, link distant observations to points and convert a bounding box into a spatial object. It also provides miscellaneous functions for field ecologists such as spatial statistics and inference on diversity indexes, and writing a data.frame with Chinese characters.
Maintained by Patrick Giraudoux. Last updated 1 year ago.
1.7 match 5 stars 7.29 score 422 scripts 2 dependents
tscnlab
LightLogR:Process Data from Wearable Light Loggers and Optical Radiation Dosimeters
Import, processing, validation, and visualization of personal light exposure measurement data from wearable devices. The package implements features such as the import of data and metadata files, conversion of common file formats, validation of light logging data, verification of crucial metadata, calculation of common parameters, and semi-automated analysis and visualization.
Maintained by Johannes Zauner. Last updated 1 month ago.
dosimetry, light, time-series-analysis, wearable-devices, wearable-sensors
2.0 match 12 stars 5.88 score 28 scripts
swfsc
eSDM:Ensemble Tool for Predictions from Species Distribution Models
A tool which allows users to create and evaluate ensembles of species distribution model (SDM) predictions. Functionality is offered through R functions or a GUI (R Shiny app). This tool can assist users in identifying spatial uncertainties and making informed conservation and management decisions. The package is further described in Woodman et al (2019) <doi:10.1111/2041-210X.13283>.
Maintained by Sam Woodman. Last updated 6 months ago.
1.9 match 11 stars 6.07 score 24 scripts
ptarroso
phylin:Spatial Interpolation of Genetic Data
The spatial interpolation of genetic distances between samples is based on a modified kriging method that accepts a genetic distance matrix and generates a map of probability of lineage presence. This package also offers tools to generate a map of potential contact zones between groups with user-defined thresholds in the tree to account for old and recent divergence. Additionally, it has functions for IDW interpolation using genetic data and midpoints.
Maintained by Pedro Tarroso. Last updated 5 years ago.
3.8 match 2.99 score 49 scripts
cepardot
FactoClass:Combination of Factorial Methods and Cluster Analysis
Some functions of 'ade4' and 'stats' are combined in order to obtain a partition of the rows of a data table, with columns representing variables of scales: quantitative, qualitative or frequency. First, a principal axes method is performed and then, a combination of Ward agglomerative hierarchical classification and K-means is performed, using some of the first coordinates obtained from the previous principal axes method. In order to permit different weights of the elements to be clustered, the function 'kmeansW', programmed in C++, is included. It is a modification of 'kmeans'. Some graphical functions include the option: 'gg=FALSE'. When 'gg=TRUE', they use the 'ggplot2' and 'ggrepel' packages to avoid the super-position of the labels.
Maintained by Campo Elias Pardo. Last updated 1 year ago.
5.0 match 2.21 score 163 scripts
szymonnowakowski
hclust1d:Hierarchical Clustering of Univariate (1d) Data
Univariate agglomerative hierarchical clustering with a comprehensive list of choices of a linkage function in O(n*log n) time. The better algorithmic time complexity is paired with an efficient 'C++' implementation.
Maintained by Szymon Nowakowski. Last updated 2 years ago.
2.2 match 3 stars 4.95 score 9 scripts 1 dependent
cran
PracTools:Designing and Weighting Survey Samples
Functions and datasets to support Valliant, Dever, and Kreuter (2018), <doi:10.1007/978-3-319-93632-1>, "Practical Tools for Designing and Weighting Survey Samples". Contains functions for sample size calculation for survey samples using stratified or clustered one-, two-, and three-stage sample designs, and single-stage audit sample designs. Functions are included that will group geographic units accounting for distances apart and measures of size. Other functions compute variance components for multistage designs and sample sizes in two-phase designs. A number of example data sets are included.
Maintained by Richard Valliant. Last updated 9 months ago.
3.4 match 1 star 3.18 score 1 dependent
biometris
douconca:Double Constrained Correspondence Analysis for Trait-Environment Analysis in Ecology
Double constrained correspondence analysis (dc-CA) analyzes (multi-)trait (multi-)environment ecological data by using the 'vegan' package and native R code. The two-step algorithm of ter Braak et al. (2018) is used throughout. This algorithm combines and extends community- (sample-) and species-level analyses, i.e. the usual community weighted means (CWM)-based regression analysis and the species-level analysis of species-niche centroids (SNC)-based regression analysis. The two steps use canonical correspondence analysis to regress the abundance data on to the traits and (weighted) redundancy analysis to regress the CWM of the orthonormalized traits on to the environmental predictors. The function dc_CA() has an option to divide the abundance data of a site by the site total, giving equal site weights. This division has the advantage that the multivariate analysis corresponds with an unweighted (multi-trait) community-level analysis, instead of being weighted. The first step of the algorithm uses vegan::cca(). The second step uses wrda() but vegan::rda() if the site weights are equal. This version has a predict() function. For details see ter Braak et al. 2018 <doi:10.1007/s10651-017-0395-x>.
Maintained by Bart-Jan van Rossum. Last updated 4 months ago.
correspondence-analysisecologyecology-modelingmulti-environmentmulti-trait
2.1 match 5.00 score 6 scriptsvalentint
rda:Shrunken Centroids Regularized Discriminant Analysis
Provides functions implementing the shrunken centroids regularized discriminant analysis for classification purposes in high dimensional data. The method is described in Guo et al. (2013) <doi:10.1093/biostatistics/kxj035>.
Maintained by Valentin Todorov. Last updated 2 years ago.
3.5 match 3.02 score 21 scriptsbioc
TrajectoryUtils:Single-Cell Trajectory Analysis Utilities
Implements low-level utilities for single-cell trajectory analysis, primarily intended for re-use inside higher-level packages. Includes a function to create a cluster-level minimum spanning tree and data structures to hold pseudotime inference results.
Maintained by Aaron Lun. Last updated 5 months ago.
1.8 match 5.91 score 16 scripts 9 dependentscran
centiserve:Find Graph Centrality Indices
Calculates centrality indices additional to the 'igraph' package centrality functions.
Maintained by Mahdi Jalili. Last updated 8 years ago.
5.1 match 1 stars 2.08 score 1 dependentsmatildabrown
rWCVP:Generating Summaries, Reports and Plots from the World Checklist of Vascular Plants
A companion to the World Checklist of Vascular Plants (WCVP). It includes functions to generate maps and species lists, as well as match names to the WCVP. For more details and to cite the package, see: Brown M.J.M., Walker B.E., Black N., Govaerts R., Ondo I., Turner R., Nic Lughadha E. (in press). "rWCVP: A companion R package to the World Checklist of Vascular Plants". New Phytologist.
Maintained by Matilda Brown. Last updated 1 years ago.
1.7 match 22 stars 6.17 score 45 scripts 1 dependentsjosiahparry
sdf:What the Package Does (One Line, Title Case)
What the package does (one paragraph).
Maintained by Josiah Parry. Last updated 2 years ago.
3.3 match 27 stars 3.13 score 6 scriptsbioc
CatsCradle:This package provides methods for analysing spatial transcriptomics data and for discovering gene clusters
This package addresses two broad areas. It allows for in-depth analysis of spatial transcriptomic data by identifying tissue neighbourhoods. These are contiguous regions of tissue surrounding individual cells. 'CatsCradle' allows for the categorisation of neighbourhoods by the cell types contained in them and the genes expressed in them. In particular, it produces Seurat objects whose individual elements are neighbourhoods rather than cells. In addition, it enables the categorisation and annotation of genes by producing Seurat objects whose elements are genes.
Maintained by Michael Shapiro. Last updated 15 days ago.
biologicalquestionstatisticalmethodgeneexpressionsinglecelltranscriptomicsspatial
1.6 match 3 stars 6.52 scorecidm-ph
ggautomap:Create Maps from a Column of Place Names
Mapping tools that convert place names to coordinates on the fly. These 'ggplot2' extensions make maps from a data frame where one of the columns contains place names, without having to directly work with the underlying geospatial data and tools. The corresponding map data must be registered with 'cartographer' either by the user or by another package.
Maintained by Carl Suster. Last updated 1 years ago.
data-visualizationgeospatialggplot-extensionggplot2
2.0 match 24 stars 5.08 score 5 scriptskaiaragaki
reclanc:A Revival of the ClaNC Algorithm
Classification of microarrays to nearest centroids (ClaNC) <doi:10.1093/bioinformatics/bti756> selects optimal genes for centroids, similar to Prediction Analysis for Microarrays (PAM) but using fewer corrective factors, resulting in greater sensitivity and accuracy. Unfortunately, the original source of ClaNC can no longer be found. 'reclanc' reimplements this algorithm, with the additional benefit of increased interoperability with standard data structures and modeling ecosystems.
Maintained by Kai Aragaki. Last updated 8 months ago.
2.6 match 3.85 score 5 scriptsnikkrieger
USpopcenters:United States Centers of Population (Centroids)
Centers of population (centroid) data for census areas in the United States.
Maintained by Nik Krieger. Last updated 2 years ago.
3.6 match 1 stars 2.70 score 2 scriptsropensci
weatherOz:An API Client for Australian Weather and Climate Data Resources
Provides automated downloading, parsing and formatting of weather data for Australia through API endpoints provided by the Department of Primary Industries and Regional Development ('DPIRD') of Western Australia and by the Science and Technology Division of the Queensland Government's Department of Environment and Science ('DES'). Also provides the Bureau of Meteorology ('BOM') of the Australian government precis and coastal forecasts, as well as downloading and importing of radar and satellite imagery files. 'DPIRD' weather data are accessed through public 'APIs' provided by 'DPIRD', <https://www.agric.wa.gov.au/weather-api-20>, providing access to weather station data from the 'DPIRD' weather station network. Australia-wide weather data are based on data from the Australian Bureau of Meteorology ('BOM') and accessed through 'SILO' (Scientific Information for Land Owners) Jeffrey et al. (2001) <doi:10.1016/S1364-8152(01)00008-1>. 'DPIRD' data are made available under a Creative Commons Attribution 3.0 Licence (CC BY 3.0 AU) <https://creativecommons.org/licenses/by/3.0/au/deed.en>. SILO data are released under a Creative Commons Attribution 4.0 International licence (CC BY 4.0) <https://creativecommons.org/licenses/by/4.0/>. 'BOM' data are (c) Australian Government Bureau of Meteorology and released under a Creative Commons (CC) Attribution 3.0 licence or Public Access Licence ('PAL') as appropriate, see <http://www.bom.gov.au/other/copyright.shtml> for further details.
Maintained by Rodrigo Pires. Last updated 1 months ago.
dpirdbommeteorological-dataweather-forecastaustraliaweatherweather-datameteorologywestern-australiaaustralia-bureau-of-meteorologywestern-australia-agricultureaustralia-agricultureaustralia-climateaustralia-weatherapi-clientclimatedatarainfallweather-api
1.1 match 31 stars 8.47 score 40 scriptsicosa-grid
icosa:Global Triangular and Penta-Hexagonal Grids Based on Tessellated Icosahedra
Implementation of icosahedral grids in three dimensions. The spherical-triangular tessellation can be set to create grids with custom resolutions. Both the primary triangular and their inverted penta-hexagonal grids can be calculated. Additional functions are provided that allow plotting of the grids and associated data, the interaction of the grids with other raster and vector objects, and treating the grids as graphs.
Maintained by Adam T. Kocsis. Last updated 8 months ago.
1.8 match 4 stars 5.41 score 65 scriptsflaviomoc
divraster:Diversity Metrics Calculations for Rasterized Data
Alpha and beta diversity for taxonomic (TD), functional (FD), and phylogenetic (PD) dimensions based on rasters. Spatial and temporal beta diversity can be partitioned into replacement and richness difference components. It also calculates standardized effect size for FD and PD alpha diversity and the average individual traits across multilayer rasters. The layers of the raster represent species, while the cells represent communities. Methods details can be found at Cardoso et al. 2022 <https://CRAN.R-project.org/package=BAT> and Heming et al. 2023 <https://CRAN.R-project.org/package=SESraster>.
Maintained by Flávio M. M. Mota. Last updated 15 days ago.
1.8 match 10 stars 5.40 score 7 scriptsberndbischl
tspmeta:Instance Feature Calculation and Evolutionary Instance Generation for the Traveling Salesman Problem
Instance feature calculation and evolutionary instance generation for the traveling salesman problem. Also contains code to "morph" two TSP instances into each other and to conveniently run a couple of solvers on TSP instances.
Maintained by Bernd Bischl. Last updated 9 years ago.
2.3 match 5 stars 4.08 score 24 scriptsdatalowe
synr:Explore and Process Synesthesia Consistency Test Data
Explore synesthesia consistency test data, calculate consistency scores, and classify participant data as valid or invalid.
Maintained by Lowe Wilsson. Last updated 1 years ago.
1.7 match 5.32 score 139 scriptsweksi-budiaji
kmed:Distance-Based k-Medoids
Algorithms of distance-based k-medoids clustering: simple and fast k-medoids, ranked k-medoids, and increasing number of clusters in k-medoids. Calculate distances for mixed variable data such as Gower, Podani, Wishart, Huang, Harikumar-PV, and Ahmad-Dey. Cluster validation applies internal and relative criteria. The internal criteria include silhouette index and shadow values. The relative criterion applies a bootstrap procedure producing a heatmap with a flexible reordering matrix algorithm such as complete, ward, or average linkages. The cluster result can be plotted in a marked barplot or PCA biplot.
Maintained by Weksi Budiaji. Last updated 3 years ago.
2.9 match 3.15 score 141 scriptsclancylabuiuc
moRphomenses:Geometric Morphometric Tools to Align, Scale, and Compare "Shape" of Menstrual Cycle Hormones
Mitteroecker & Gunz (2009) <doi:10.1007/s11692-009-9055-x> describe how geometric morphometric methods allow researchers to quantify the size and shape of physical biological structures. We provide tools to extend geometric morphometric principles to the study of non-physical structures, hormone profiles, as outlined in Ehrlich et al (2021) <doi:10.1002/ajpa.24514>. Easily transform daily measures into multivariate landmark-based data. Includes custom functions to apply multivariate methods for data exploration as well as hypothesis testing. Also includes 'shiny' web app to streamline data exploration. Developed to study menstrual cycle hormones but functions have been generalized and should be applicable to any biomarker over any time period.
Maintained by Daniel Ehrlich. Last updated 3 months ago.
2.3 match 2 stars 4.04 score 4 scriptsachubaty
grainscape:Landscape Connectivity, Habitat, and Protected Area Networks
Given a landscape resistance surface, creates minimum planar graph (Fall et al. (2007) <doi:10.1007/s10021-007-9038-7>) and grains of connectivity (Galpern et al. (2012) <doi:10.1111/j.1365-294X.2012.05677.x>) models that can be used to calculate effective distances for landscape connectivity at multiple scales. Documentation is provided by several vignettes, and a paper (Chubaty, Galpern & Doctolero (2020) <doi:10.1111/2041-210X.13350>).
Maintained by Alex M Chubaty. Last updated 2 months ago.
habitat-connectivitylandscape-connectivityspatial-graphscpp
1.3 match 19 stars 6.76 score 20 scriptso1iv3r
ClustImpute:K-Means Clustering with Build-in Missing Data Imputation
This k-means algorithm is able to cluster data with missing values and as a by-product completes the data set. The implementation can deal with missing values in multiple variables and is computationally efficient since it iteratively uses the current cluster assignment to define a plausible distribution for missing value imputation. Weights are used to shrink early random draws for missing values (i.e., draws based on the cluster assignments after few iterations) towards the global mean of each feature. This shrinkage slowly fades out after a fixed number of iterations to reflect the increasing credibility of cluster assignments. See the vignette for details.
Maintained by Oliver Pfaffel. Last updated 4 years ago.
1.8 match 7 stars 4.96 score 13 scriptsjrosen48
prcr:Person-Centered Analysis
Provides an easy-to-use yet adaptable set of tools to conduct person-centered analysis using a two-step clustering procedure. As described in Bergman and El-Khouri (1999) <DOI:10.1002/(SICI)1521-4036(199910)41:6%3C753::AID-BIMJ753%3E3.0.CO;2-K>, hierarchical clustering is performed to determine the initial partition for the subsequent k-means clustering procedure.
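The two-step procedure described above can be illustrated with a minimal sketch. This is not the package's implementation: it is written in Python rather than R, the function names are hypothetical, and the centroid-linkage merge rule in the first step is an assumption chosen for brevity (the linkage actually used by the package is not stated here).

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def hierarchical_partition(points, k):
    """Step 1: agglomerative clustering, cut when k clusters remain.

    Assumption: clusters are merged by smallest distance between their
    centroids (centroid linkage); other linkages would also fit the text.
    """
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def kmeans(points, centers, max_iter=100):
    """Step 2: Lloyd's k-means, started from the supplied centers."""
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = centroid(members)
    return labels, centers

def two_step_cluster(points, k):
    """Use the hierarchical partition's centroids to seed k-means."""
    initial = hierarchical_partition(points, k)
    return kmeans(points, [centroid(c) for c in initial])
```

Seeding k-means from a hierarchical partition removes the dependence on random starting centers, which is the usual motivation for a two-step design of this kind.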
Maintained by Joshua M Rosenberg. Last updated 5 years ago.
1.9 match 5 stars 4.65 score 18 scriptsazvoleff
gfcanalysis:Tools for Working with Hansen et al. Global Forest Change Dataset
Supports analyses using the Global Forest Change dataset released by Hansen et al. gfcanalysis was originally written for the Tropical Ecology Assessment and Monitoring (TEAM) Network. For additional details on the Global Forest Change dataset, see: Hansen, M. et al. 2013. "High-Resolution Global Maps of 21st-Century Forest Cover Change." Science 342 (15 November): 850-53. The forest change data and more information on the product is available at <http://earthenginepartners.appspot.com>.
Maintained by Matthew Cooper. Last updated 1 years ago.
1.7 match 17 stars 4.93 score 33 scriptsapwheele
ptools:Tools for Poisson Data
Functions used for analyzing count data, mostly crime counts. Includes checking difference in two Poisson counts (e-test), checking the fit for a Poisson distribution, small sample tests for counts in bins, Weighted Displacement Difference test (Wheeler and Ratcliffe, 2018) <doi:10.1186/s40163-018-0085-5>, to evaluate crime changes over time in treated/control areas. Additionally includes functions for aggregating spatial data and spatial feature engineering.
Maintained by Andrew Wheeler. Last updated 1 years ago.
crime-analysiscriminal-justicecriminology
1.9 match 5 stars 4.44 score 11 scriptsalfodefalco
dPCP:Automated Analysis of Multiplex Digital PCR Data
The automated clustering and quantification of the digital PCR data is based on the combination of 'DBSCAN' (Hahsler et al. (2019) <doi:10.18637/jss.v091.i01>) and 'c-means' (Bezdek et al. (1981) <doi:10.1007/978-1-4757-0450-1>) algorithms. The analysis is independent of multiplexing geometry, dPCR system, and input amount. The details about input data and parameters are available in the vignette.
Maintained by Alfonso De Falco. Last updated 2 years ago.
1.9 match 2 stars 4.36 score 23 scriptscran
cba:Clustering for Business Analytics
Implements clustering techniques such as Proximus and Rock, utility functions for efficient computation of cross distances and data manipulation.
Maintained by Christian Buchta. Last updated 8 months ago.
2.3 match 3.62 score 3 dependentskaerosen
tilemaps:Generate Tile Maps
Implements an algorithm for generating maps, known as tile maps, in which each region is represented by a single tile of the same shape and size. The algorithm was first proposed in "Generating Tile Maps" by Graham McNeill and Scott Hale (2017) <doi:10.1111/cgf.13200>. Functions allow users to generate, plot, and compare square or hexagon tile maps.
Maintained by Kaelyn Rosenberg. Last updated 1 years ago.
1.5 match 45 stars 5.35 score 8 scriptscran
spcosa:Spatial Coverage Sampling and Random Sampling from Compact Geographical Strata
Spatial coverage sampling and random sampling from compact geographical strata created by k-means. See Walvoort et al. (2010) <doi:10.1016/j.cageo.2010.04.005> for details.
Maintained by Dennis Walvoort. Last updated 2 years ago.
2.3 match 2 stars 3.48 score 1 dependentsahfoss
kamila:Methods for Clustering Mixed-Type Data
Implements methods for clustering mixed-type data, specifically combinations of continuous and nominal data. Special attention is paid to the often-overlooked problem of equitably balancing the contribution of the continuous and categorical variables. This package implements KAMILA clustering, a novel method for clustering mixed-type data in the spirit of k-means clustering. It does not require dummy coding of variables, and is efficient enough to scale to rather large data sets. Also implemented is Modha-Spangler clustering, which uses a brute-force strategy to maximize the cluster separation simultaneously in the continuous and categorical variables. For more information, see Foss, Markatou, Ray, & Heching (2016) <doi:10.1007/s10994-016-5575-7> and Foss & Markatou (2018) <doi:10.18637/jss.v083.i13>.
Maintained by Alexander Foss. Last updated 2 years ago.
1.8 match 16 stars 4.25 score 22 scriptsandriyprotsak5
UAHDataScienceUC:Learn Clustering Techniques Through Examples and Code
A comprehensive educational package combining clustering algorithms with detailed step-by-step explanations. Provides implementations of both traditional (hierarchical, k-means) and modern (Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Gaussian Mixture Models (GMM), genetic k-means) clustering methods as described in Ezugwu et al. (2022) <doi:10.1016/j.engappai.2022.104743>. Includes educational datasets highlighting different clustering challenges, based on 'scikit-learn' examples (Pedregosa et al., 2011) <https://jmlr.csail.mit.edu/papers/v12/pedregosa11a.html>. Features detailed algorithm explanations, visualizations, and weighted distance calculations for enhanced learning.
Maintained by Andriy Protsak Protsak. Last updated 1 months ago.
2.3 match 3.30 scoreshubhamdutta26
mapindiatools:Mapping Data for 'mapindia' Package
Provides a container for data used by the 'mapindia' package. The data used by 'mapindia' has been extracted into this package so that the file size of the 'mapindia' package can be reduced considerably. The data in this package will be updated when latest data is available.
Maintained by Shubham Dutta. Last updated 5 months ago.
2.0 match 3.65 score 1 dependentsbioc
xenLite:Simple classes and methods for managing Xenium datasets
Define a relatively light class for managing Xenium data using Bioconductor. Address use of parquet for coordinates, SpatialExperiment for assay and sample data. Address serialization and use of cloud storage.
Maintained by Vincent Carey. Last updated 5 months ago.
1.6 match 1 stars 4.48 score 4 scriptsbioc
Rmagpie:MicroArray Gene-expression-based Program In Error rate estimation
Microarray Classification is designed for both biologists and statisticians. It offers the ability to train a classifier on a labelled microarray dataset and to then use that classifier to predict the class of new observations. A range of modern classifiers are available, including support vector machines (SVMs), nearest shrunken centroids (NSCs)... Advanced methods are provided to estimate the predictive error rate and to report the subset of genes which appear essential in discriminating between classes.
Maintained by Camille Maumet. Last updated 5 months ago.
2.2 match 3.30 score 1 scriptsdustinstoltz
text2map:R Tools for Text Matrices, Embeddings, and Networks
This is a collection of functions optimized for working with various kinds of text matrices. Focusing on the text matrix as the primary object - represented either as a base R dense matrix or a 'Matrix' package sparse matrix - allows for a consistent and intuitive interface that stays close to the underlying mathematical foundation of computational text analysis. In particular, the package includes functions for working with word embeddings, text networks, and document-term matrices. Methods developed in Stoltz and Taylor (2019) <doi:10.1007/s42001-019-00048-6>, Taylor and Stoltz (2020) <doi:10.1007/s42001-020-00075-8>, Taylor and Stoltz (2020) <doi:10.15195/v7.a23>, and Stoltz and Taylor (2021) <doi:10.1016/j.poetic.2021.101567>.
Maintained by Dustin Stoltz. Last updated 4 months ago.
1.8 match 3.82 score 22 scriptsjackdunnnz
iai:Interface to 'Interpretable AI' Modules
An interface to the algorithms of 'Interpretable AI' <https://www.interpretable.ai> from the R programming language. 'Interpretable AI' provides various modules, including 'Optimal Trees' for classification, regression, prescription and survival analysis, 'Optimal Imputation' for missing data imputation and outlier detection, and 'Optimal Feature Selection' for exact sparse regression. The 'iai' package is an open-source project. The 'Interpretable AI' software modules are proprietary products, but free academic and evaluation licenses are available.
Maintained by Jack Dunn. Last updated 5 months ago.
3.4 match 1 stars 2.00 score 7 scriptsdieghernan
arcgeocoder:Geocoding with the 'ArcGIS' REST API Service
Lite interface for finding locations of addresses or businesses around the world using the 'ArcGIS' REST API service <https://developers.arcgis.com/rest/geocode/api-reference/overview-world-geocoding-service.htm>. Address text can be converted to location candidates and a location can be converted into an address. No API key required.
Maintained by Diego Hernangómez. Last updated 6 days ago.
geocodingarcgisaddressreverse-geocodingapi-wrapperapi-restarcgis-apigis
1.2 match 2 stars 5.59 score 15 scriptscran
bangladesh:Provides Ready to Use Shapefiles for Geographical Map of Bangladesh
Usually, it is difficult to plot choropleth maps for Bangladesh in 'R'. The 'bangladesh' package provides ready-to-use shapefiles for different administrative regions of Bangladesh (e.g., Division, District, Upazila, and Union). This package helps users to draw thematic maps of administrative regions of Bangladesh easily as it comes with the 'sf' objects for the boundaries. It also provides functions allowing users to efficiently get specific area maps and center coordinates for regions. Users can also search for a specific area and calculate the centroids of those areas.
Maintained by Musaddiqur Rahman Ovi. Last updated 2 years ago.
2.4 match 1 stars 2.70 scorecran
OasisR:Outright Tool for the Analysis of Spatial Inequalities and Segregation
A comprehensive set of indexes and tests for social segregation analysis, as described in Tivadar (2019) - 'OasisR': An R Package to Bring Some Order to the World of Segregation Measurement <doi:10.18637/jss.v089.i07>. The package is the most complete existing tool and it clarifies many ambiguities and errors regarding the definition of segregation indices. Additionally, 'OasisR' introduces several resampling methods that enable testing their statistical significance (randomization tests, bootstrapping, and jackknife methods).
Maintained by Mihai Tivadar. Last updated 5 months ago.
3.4 match 2 stars 1.78 score 1 dependentstimcdlucas
paleomorph:Geometric Morphometric Tools for Paleobiology
Fill missing symmetrical data with mirroring, calculate Procrustes alignments with or without scaling, and compute standard or vector correlation and covariance matrices (congruence coefficients) of 3D landmarks. Tolerates missing data for all analyses.
Maintained by Tim Lucas. Last updated 8 years ago.
morphometricspaleobiologyprocrustesstatistical-analysis
1.7 match 4 stars 3.60 score 20 scriptsjoycekang
symphony:Efficient and Precise Single-Cell Reference Atlas Mapping
Implements the Symphony single-cell reference building and query mapping algorithms and additional functions described in Kang et al <https://www.nature.com/articles/s41467-021-25957-x>.
Maintained by Joyce Kang. Last updated 2 years ago.
1.5 match 3.83 score 134 scriptsamessbee
rsdepth:Ray Shooting Depth (i.e. RS Depth) Functions for Bivariate Analysis
Ray Shooting Depth functions are provided for bivariate analysis. This mainly includes functions for computing the bivariate depth as well as RS median. Drawing functions for depth bags are also provided.
Maintained by Mudassir Shabbir. Last updated 3 years ago.
5.3 match 1.04 score 11 scriptsyhenryli
PAC:Partition-Assisted Clustering and Multiple Alignments of Networks
Implements partition-assisted clustering and multiple alignments of networks. It 1) utilizes partition-assisted clustering to find robust and accurate clusters and 2) discovers coherent relationships of clusters across multiple samples. It is particularly useful for analyzing single-cell data sets. Please see Li et al. (2017) <doi:10.1371/journal.pcbi.1005875> for a detailed method description.
Maintained by Ye Henry Li. Last updated 4 years ago.
1.6 match 3.30 score 7 scriptscran
IDmeasurer:Assessment of Individual Identity in Animal Signals
Provides tools for assessment and quantification of individual identity information in animal signals. This package accompanies a research article by Linhart et al. (2019) <doi:10.1101/546143>: "Measuring individual identity information in animal signals: Overview and performance of available identity metrics".
Maintained by Pavel Linhart. Last updated 6 years ago.
1.8 match 2.70 scorejsl5-code
mixexp:Design and Analysis of Mixture Experiments
Functions for creating designs for mixture experiments, making ternary contour plots, and making mixture effect plots.
Maintained by John Lawson. Last updated 5 months ago.
1.8 match 1 stars 2.75 score 31 scripts 2 dependentsfedericogiorgi
corto:Inference of Gene Regulatory Networks
We present 'corto' (Correlation Tool), a simple package to infer gene regulatory networks and visualize master regulators from gene expression data using DPI (Data Processing Inequality) and bootstrapping to recover edges. An initial step is performed to calculate all significant edges between a list of source nodes (centroids) and target genes. Then all triplets containing two centroids and one target are tested in a DPI step which removes edges. A bootstrapping process then calculates the robustness of the network, eventually re-adding edges previously removed by DPI. The algorithm has been optimized to run outside a computing cluster, using a fast correlation implementation. The package finally provides functions to calculate network enrichment analysis from RNA-Seq and ATAC-Seq signatures as described in the article by Giorgi lab (2020) <doi:10.1093/bioinformatics/btaa223>.
Maintained by Federico M. Giorgi. Last updated 2 years ago.
0.8 match 20 stars 6.25 score 59 scriptstravis-barton
LilRhino:For Implementation of Feed Reduction, Learning Examples, NLP and Code Management
This is for code management functions, NLP tools, a Monty Hall simulator, and for implementing my own variable reduction technique called Feed Reduction. The Feed Reduction technique is not yet published, but is merely a tool for implementing a series of binary neural networks meant for reducing data into N dimensions, where N is the number of possible values of the response variable.
Maintained by Travis Barton. Last updated 3 years ago.
1.7 match 1 stars 2.78 score 12 scriptscran
overlapptest:Test Overlapping of Polygons Against Random Rotation
Tests the observed overlapping polygon area in a collection of polygons against a null model of random rotation, as explained in De la Cruz et al. (2017) <doi:10.13140/RG.2.2.12825.72801>.
Maintained by Marcelino de la Cruz. Last updated 2 years ago.
2.0 match 2.00 scorebioc
TSCAN:Tools for Single-Cell Analysis
Provides methods to perform trajectory analysis based on a minimum spanning tree constructed from cluster centroids. Computes pseudotemporal cell orderings by mapping cells in each cluster (or new cells) to the closest edge in the tree. Uses linear modelling to identify differentially expressed genes along each path through the tree. Several plotting and interactive visualization functions are also implemented.
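As a rough illustration of the first step named above (a minimum spanning tree built over cluster centroids), the sketch below uses Prim's algorithm with Euclidean distances. It is written in Python rather than R, it is not the package's code, and the function names and the choice of Prim's algorithm are assumptions made for the example. The later steps (mapping cells to edges, pseudotime ordering) are not covered.

```python
import math

def cluster_centroids(points, labels):
    """Mean coordinate vector of the points in each cluster label."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    return {
        lab: tuple(sum(col) / len(ps) for col in zip(*ps))
        for lab, ps in groups.items()
    }

def minimum_spanning_tree(centroids):
    """Prim's algorithm on the complete graph over cluster centroids.

    Edge weights are Euclidean distances between centroids.
    Returns a list of (cluster_a, cluster_b, distance) edges.
    """
    names = sorted(centroids)
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge joining a cluster in the tree to one outside it.
        dist, a, b = min(
            (math.dist(centroids[a], centroids[b]), a, b)
            for a in in_tree
            for b in names
            if b not in in_tree
        )
        edges.append((a, b, dist))
        in_tree.add(b)
    return edges
```

For example, `minimum_spanning_tree(cluster_centroids(points, labels))` returns one edge fewer than the number of clusters, and each path through those edges is a candidate trajectory.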
Maintained by Zhicheng Ji. Last updated 5 months ago.
geneexpressionvisualizationgui
0.5 match 7.58 score 207 scripts 3 dependentsbewicklab
HybridMicrobiomes:Analysis of Host-Associated Microbiomes from Hybrid Organisms
A set of tools to analyze and visualize the relationships between host-associated microbiomes of hybrid organisms and those of their progenitor species. Though not necessary, installing the microViz package is recommended as a check for phyloseq objects. To install microViz from R Universe use the following command: install.packages("microViz", repos = c(davidbarnett = "https://david-barnett.r-universe.dev", getOption("repos"))). To install microViz from GitHub use the following commands: install.packages("devtools") followed by devtools::install_github("david-barnett/microViz").
Maintained by Sharon Bewick. Last updated 1 years ago.
3.7 match 1.00 scores-u
fastshp:Fast routines for handling large ESRI shapefiles (.shp)
Routines for handling of large ESRI shapefiles (.shp). This includes reading, thinning of points and matching of points to containing shapes. The main aim for this package is to provide the speed to support large shapefiles (millions of points). It is several orders of magnitude faster than some other shapefile packages.
Maintained by Simon Urbanek. Last updated 7 years ago.
1.9 match 9 stars 1.95 score 8 scriptscran
SpatialAcc:Spatial Accessibility Measures
Provides a set of spatial accessibility measures from a set of locations (demand) to another set of locations (supply). It aims, among others, to support research on spatial accessibility to health care facilities. Includes the locations and some characteristics of major public hospitals in Greece.
Maintained by Stamatis Kalogirou. Last updated 12 months ago.
3.6 match 1.00 scoreviroli
quantileDA:Quantile Classifier
Code for centroid, median and quantile classifiers.
Maintained by Cinzia Viroli. Last updated 1 years ago.
2.5 match 1.00 score 10 scriptsmusajajorge
mapsPERU:Maps of Peru
Information of the centroids and geographical limits of the regions, departments, provinces and districts of Peru.
Maintained by Jorge L. C. Musaja. Last updated 2 years ago.
0.6 match 17 stars 4.04 score 13 scriptsandrefujita
cemco:Fit 'CemCO' Algorithm
'CemCO' algorithm, a model-based (Gaussian) clustering algorithm that removes/minimizes the effects of undesirable covariates during the clustering process both in cluster centroids and in cluster covariance structures (Relvas C. & Fujita A., (2020) <arXiv:2004.02333>).
Maintained by Andre Fujita. Last updated 2 years ago.
2.2 match 1.00 scorebioc
PDATK:Pancreatic Ductal Adenocarcinoma Tool-Kit
Pancreatic ductal adenocarcinoma (PDA) has a relatively poor prognosis and is one of the most lethal cancers. Molecular classification of gene expression profiles holds the potential to identify meaningful subtypes which can inform therapeutic strategy in the clinical setting. The Pancreatic Cancer Adenocarcinoma Tool-Kit (PDATK) provides an S4 class-based interface for performing unsupervised subtype discovery, cross-cohort meta-clustering, gene-expression-based classification, and subsequent survival analysis to identify prognostically useful subtypes in pancreatic cancer and beyond. Two novel methods, Consensus Subtypes in Pancreatic Cancer (CSPC) and Pancreatic Cancer Overall Survival Predictor (PCOSP) are included for consensus-based meta-clustering and overall-survival prediction, respectively. Additionally, four published subtype classifiers and three published prognostic gene signatures are included to allow users to easily recreate published results, apply existing classifiers to new data, and benchmark the relative performance of new methods. The use of existing Bioconductor classes as input to all PDATK classes and methods enables integration with existing Bioconductor datasets, including the 21 pancreatic cancer patient cohorts available in the MetaGxPancreas data package. PDATK has been used to replicate results from Sandhu et al (2019) [https://doi.org/10.1200/cci.18.00102] and an additional paper is in the works using CSPC to validate subtypes from the included published classifiers, both of which use the data available in MetaGxPancreas. The inclusion of subtype centroids and prognostic gene signatures from these and other publications will enable researchers and clinicians to classify novel patient gene expression data, allowing the direct clinical application of the classifiers included in PDATK. 
Overall, PDATK provides a rich set of tools to identify and validate useful prognostic and molecular subtypes based on gene-expression data, benchmark new classifiers against existing ones, and apply discovered classifiers on novel patient data to inform clinical decision making.
Maintained by Benjamin Haibe-Kains. Last updated 5 months ago.
geneexpressionpharmacogeneticspharmacogenomicssoftwareclassificationsurvivalclusteringgeneprediction
0.5 match 1 stars 4.31 score 17 scriptscran
EcotoneFinder:Characterising and Locating Ecotones and Communities
Analytical methods to locate and characterise ecotones, ecosystems and environmental patchiness along ecological gradients. Methods are implemented for isolated sampling or for space/time series. It includes Detrended Correspondence Analysis (Hill & Gauch (1980) <doi:10.1007/BF00048870>), fuzzy clustering (De Cáceres et al. (2010) <doi:10.1080/01621459.1963.10500845>), biodiversity indices (Jost (2006) <doi:10.1111/j.2006.0030-1299.14714.x>), and network analyses (Epskamp et al. (2012) <doi:10.18637/jss.v048.i04>) - as well as tools to explore the number of clusters in the data. Functions to produce synthetic ecological datasets are also provided.
Maintained by Antoine Bagnaro. Last updated 4 years ago.
2.0 match 1.00 score
cran
SpatialVx:Spatial Forecast Verification
Spatial forecast verification refers to verifying weather forecasts when the verification set (forecast and observations) is on a spatial field, usually a high-resolution gridded spatial field. Most of the functions here require the forecast and observed fields to be gridded and on the same grid. For a thorough review of most of the methods in this package, please see Gilleland et al. (2009) <doi: 10.1175/2009WAF2222269.1> and for a tutorial on some of the main functions available here, see Gilleland (2022) <doi: 10.5065/4px3-5a05>.
Maintained by Eric Gilleland. Last updated 4 months ago.
1.8 match 1 stars 1.00 score
r-forge
anacor:Simple and Canonical Correspondence Analysis
Performs simple and canonical CA (covariates on rows/columns) on a two-way frequency table (with missings) by means of SVD. Different scaling methods (standard, centroid, Benzecri, Goodman) as well as various plots including confidence ellipsoids are provided.
Maintained by Patrick Mair. Last updated 7 days ago.
0.5 match 3.40 score 21 scripts
bioc
cancerclass:Development and validation of diagnostic tests from high-dimensional molecular data
The classification protocol starts with a feature selection step and continues with nearest-centroid classification. The accuracy of the predictor can be evaluated using training and test set validation, leave-one-out cross-validation, or a multiple random validation protocol. Methods for calculation and visualization of continuous prediction scores make it possible to balance sensitivity and specificity and to define a cutoff value according to clinical requirements.
Maintained by Daniel Kosztyla. Last updated 5 months ago.
cancer microarray classification visualization
0.5 match 3.30 score 10 scripts
rajkumpismb
PCAPAM50:Enhanced 'PAM50' Subtyping of Breast Cancer
Accurate classification of breast cancer tumors based on gene expression data is not a trivial task, and it lacks standard practices. The 'PAM50' classifier, which uses 50 gene centroid correlation distances to classify tumors, faces challenges with balancing estrogen receptor (ER) status and gene centering. The 'PCAPAM50' package leverages principal component analysis and iterative 'PAM50' calls to create a gene expression-based ER-balanced subset for gene centering, avoiding the use of protein expression-based ER data and resulting in enhanced breast cancer subtyping.
Maintained by Praveen-Kumar Raj-Kumar. Last updated 3 months ago.
0.5 match 2.48 score 3 scripts
pavlomozharovskyi
TukeyRegion:Tukey Region and Median
Tukey regions are polytopes in the Euclidean space, viz. upper-level sets of the Tukey depth function on given data. The bordering hyperplanes of a Tukey region are computed as well as its vertices, facets, centroid, and volume. In addition, the Tukey median set, which is the non-empty Tukey region having highest depth level, and its barycenter (= Tukey median) are calculated. Tukey regions are visualized in dimension two and three. For details see Liu, Mosler, and Mozharovskyi (2019, <doi:10.1080/10618600.2018.1546595>). See file LICENSE.note for additional license information.
Maintained by Pavlo Mozharovskyi. Last updated 2 years ago.
0.5 match 1.00 score
hdraisma
represent:Determine How Representative Two Multidimensional Data Sets are
Compute the values of various parameters evaluating how similar two multidimensional datasets' structures are in multidimensional space, as described in: Jouan-Rimbaud, D., Massart, D. L., Saby, C. A., Puel, C. (1998), <doi:10.1016/S0169-7439(98)00005-7>. The computed parameters evaluate three properties, namely, the direction of the data sets, the variance-covariance of the data points, and the location of the data sets' centroids. The package contains workhorse function jrparams(), as well as two helper functions Mboxtest() and JRsMahaldist(), and four example data sets.
Maintained by Harmen Draisma. Last updated 1 year ago.
0.5 match 1.00 score 5 scripts