Showing 68 of 68 results
bioc
GenomicRanges:Representation and manipulation of genomic intervals
The ability to efficiently represent and manipulate genomic annotations and alignments plays a central role in analyzing high-throughput sequencing data (a.k.a. NGS data). The GenomicRanges package defines general purpose containers for storing and manipulating genomic intervals and variables defined along a genome. More specialized containers for representing and manipulating short alignments against a reference genome, or a matrix-like summarization of an experiment, are defined in the GenomicAlignments and SummarizedExperiment packages, respectively. Both packages build on top of the GenomicRanges infrastructure.
Maintained by Hervé Pagès. Last updated 4 months ago.
genetics, infrastructure, datarepresentation, sequencing, annotation, genomeannotation, coverage, bioconductor-package, core-package
44 stars 17.68 score 13k scripts 1.3k dependents
rspatial
terra:Spatial Data Analysis
Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).
Maintained by Robert J. Hijmans. Last updated 3 hours ago.
geospatial, raster, spatial, vector, onetbb, proj, gdal, geos, cpp
560 stars 17.65 score 17k scripts 856 dependents
rspatial
raster:Geographic Data Analysis and Modeling
Reading, writing, manipulating, analyzing and modeling of spatial data. This package has been superseded by the "terra" package <https://CRAN.R-project.org/package=terra>.
Maintained by Robert J. Hijmans. Last updated 1 day ago.
163 stars 17.23 score 58k scripts 562 dependents
bioc
SummarizedExperiment:A container (S4 class) for matrix-like assays
The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.
Maintained by Hervé Pagès. Last updated 5 months ago.
genetics, infrastructure, sequencing, annotation, coverage, genomeannotation, bioconductor-package, core-package
34 stars 16.84 score 8.6k scripts 1.2k dependents
bioc
IRanges:Foundation of integer range manipulation in Bioconductor
Provides efficient low-level and highly reusable S4 classes for storing, manipulating and aggregating over annotated ranges of integers. Implements an algebra of range operations, including efficient algorithms for finding overlaps and nearest neighbors. Defines efficient list-like classes for storing, transforming and aggregating large grouped data, i.e., collections of atomic vectors and DataFrames.
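To make the idea of "finding overlaps" between integer ranges concrete, here is a naive sketch in Python. It is not the IRanges API and not its algorithm; the function name `find_overlaps` and the closed-interval convention are assumptions made for illustration only.

```python
from typing import List, Tuple

# A closed integer range [start, end]; both ends are included (assumed convention).
Range = Tuple[int, int]


def find_overlaps(query: List[Range], subject: List[Range]) -> List[Tuple[int, int]]:
    """Return (query_index, subject_index) pairs whose ranges share at least one integer."""
    # Visit subjects in order of increasing start so a scan can stop early.
    order = sorted(range(len(subject)), key=lambda i: subject[i][0])
    hits = []
    for qi, (q_start, q_end) in enumerate(query):
        for si in order:
            s_start, s_end = subject[si]
            if s_start > q_end:
                break  # every remaining subject starts after the query ends
            if s_end >= q_start:
                hits.append((qi, si))
    return hits


if __name__ == "__main__":
    print(find_overlaps([(1, 5), (10, 12)], [(4, 8), (6, 9), (12, 20)]))
```

The scan is quadratic in the worst case; it only illustrates what an overlap query returns, not how an efficient implementation would compute it.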
Maintained by Hervé Pagès. Last updated 2 months ago.
infrastructure, datarepresentation, bioconductor-package, core-package
22 stars 16.09 score 2.1k scripts 1.8k dependents
bioc
GenomicFeatures:Query the gene models of a given organism/assembly
Extract the genomic locations of genes, transcripts, exons, introns, and CDS, for the gene models stored in a TxDb object. A TxDb object is a small database that contains the gene models of a given organism/assembly. Bioconductor provides a small collection of TxDb objects in the form of ready-to-install TxDb packages for the most commonly studied organisms. Additionally, the user can easily make a TxDb object (or package) for the organism/assembly of their choice by using the tools from the txdbmaker package.
Maintained by H. Pagès. Last updated 5 months ago.
genetics, infrastructure, annotation, sequencing, genomeannotation, bioconductor-package, core-package
26 stars 15.34 score 5.3k scripts 339 dependents
bioc
xcms:LC-MS and GC-MS Data Analysis
Framework for processing and visualization of chromatographically separated and single-spectra mass spectral data. Imports from AIA/ANDI NetCDF, mzXML, mzData and mzML files. Preprocesses data for high-throughput, untargeted analyte profiling.
Maintained by Steffen Neumann. Last updated 15 days ago.
immunooncology, massspectrometry, metabolomics, bioconductor, feature-detection, mass-spectrometry, peak-detection, cpp
196 stars 14.31 score 984 scripts 11 dependents
bioc
phyloseq:Handling and analysis of high-throughput microbiome census data
phyloseq provides a set of classes and tools to facilitate the import, storage, analysis, and graphical display of microbiome census data.
Maintained by Paul J. McMurdie. Last updated 5 months ago.
immunooncology, sequencing, microbiome, metagenomics, clustering, classification, multiplecomparison, geneticvariability
597 stars 13.90 score 8.4k scripts 37 dependents
drostlab
philentropy:Similarity and Distance Quantification Between Probability Functions
Computes 46 optimized distance and similarity measures for comparing probability functions (Drost (2018) <doi:10.21105/joss.00765>). These comparisons between probability functions have their foundations in a broad range of scientific disciplines from mathematics to ecology. The aim of this package is to provide a core framework for clustering, classification, statistical inference, goodness-of-fit, non-parametric statistics, information theory, and machine learning tasks that are based on comparing univariate or multivariate probability functions.
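As an illustration of what one such measure computes, the sketch below implements the Jensen-Shannon divergence (one of the measures named in this entry's tags) for two discrete probability vectors. It is written in Python rather than R, does not use or reproduce the philentropy API, and the choice of base-2 logarithms is an assumption.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence: the average KL divergence of p and q to their midpoint m."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    # m_i > 0 whenever p_i > 0 or q_i > 0, so the KL terms below never divide by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    print(jensen_shannon([0.5, 0.5], [0.9, 0.1]))
```

With base-2 logarithms the value lies between 0 (identical distributions) and 1 (distributions with disjoint support).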
Maintained by Hajk-Georg Drost. Last updated 4 months ago.
distance-measures, distance-quantification, information-theory, jensen-shannon-divergence, parametric-distributions, similarity-measures, statistics, cpp
137 stars 12.44 score 484 scripts 24 dependents
stuart-lab
Signac:Analysis of Single-Cell Chromatin Data
A framework for the analysis and exploration of single-cell chromatin data. The 'Signac' package contains functions for quantifying single-cell chromatin data, computing per-cell quality control metrics, dimension reduction and normalization, visualization, and DNA sequence motif analysis. Reference: Stuart et al. (2021) <doi:10.1038/s41592-021-01282-5>.
Maintained by Tim Stuart. Last updated 7 months ago.
atac, bioinformatics, single-cell, zlib, cpp
355 stars 12.18 score 3.7k scripts 1 dependent
bioc
destiny:Creates diffusion maps
Create and plot diffusion maps.
Maintained by Philipp Angerer. Last updated 4 months ago.
cellbiology, cellbasedassays, clustering, software, visualization, diffusion-maps, dimensionality-reduction, cpp
82 stars 11.44 score 792 scripts 1 dependent
qinwf
jiebaR:Chinese Text Segmentation
Chinese text segmentation, keyword extraction and part-of-speech tagging for R.
Maintained by Qin Wenfeng. Last updated 5 years ago.
chinese, chinese-text-segmentation, cppjieba, jieba, lexical-analysis, nlp, cpp
352 stars 10.46 score 456 scripts 6 dependents
gforge
Gmisc:Descriptive Statistics, Transition Plots, and More
Tools for making the descriptive "Table 1" used in medical articles, a transition plot for showing changes between categories (also known as a Sankey diagram), flow charts by extending the grid package, a method for variable selection based on the SVD, Bézier lines with arrows complementing the ones in the 'grid' package, and more.
Maintained by Max Gordon. Last updated 2 years ago.
50 stars 10.40 score 233 scripts 2 dependents
phiala
ecodist:Dissimilarity-Based Functions for Ecological Analysis
Dissimilarity-based analysis functions including ordination and Mantel test functions, intended for use with spatial and community ecological data. The original package description is in Goslee and Urban (2007) <doi:10.18637/jss.v022.i07>, with further statistical detail in Goslee (2010) <doi:10.1007/s11258-009-9641-0>.
Maintained by Sarah Goslee. Last updated 1 year ago.
9 stars 9.84 score 566 scripts 9 dependents
jonclayden
shades:Simple Colour Manipulation
Functions for easily manipulating colours, creating colour scales and calculating colour distances.
Maintained by Jon Clayden. Last updated 6 months ago.
color, color-manipulation, colour, colour-manipulation, colour-spaces
83 stars 9.58 score 178 scripts 37 dependents
brry
berryFunctions:Function Collection Related to Plotting and Hydrology
Draw horizontal histograms, color scattered points by 3rd dimension, enhance date- and log-axis plots, zoom in X11 graphics, trace errors and warnings, use the unit hydrograph in a linear storage cascade, convert lists to data.frames and arrays, fit multiple functions.
Maintained by Berry Boessenkool. Last updated 2 months ago.
13 stars 9.43 score 350 scripts 16 dependents
bodkan
slendr:A Simulation Framework for Spatiotemporal Population Genetics
A framework for simulating spatially explicit genomic data which leverages real cartographic information for programmatic and visual encoding of spatiotemporal population dynamics on real geographic landscapes. Population genetic models are then automatically executed by the 'SLiM' software by Haller et al. (2019) <doi:10.1093/molbev/msy228> behind the scenes, using a custom built-in simulation 'SLiM' script. Additionally, fully abstract spatial models not tied to a specific geographic location are supported, and users can also simulate data from standard, non-spatial, random-mating models. These can be simulated either with the 'SLiM' built-in back-end script, or using an efficient coalescent population genetics simulator 'msprime' by Baumdicker et al. (2022) <doi:10.1093/genetics/iyab229> with a custom-built 'Python' script bundled with the R package. Simulated genomic data is saved in a tree-sequence format and can be loaded, manipulated, and summarised using tree-sequence functionality via an R interface to the 'Python' module 'tskit' by Kelleher et al. (2019) <doi:10.1038/s41588-019-0483-y>. Complete model configuration, simulation and analysis pipelines can therefore be constructed without a need to leave the R environment, eliminating friction between disparate tools for population genetic simulations and data analysis.
Maintained by Martin Petr. Last updated 9 hours ago.
popgen, population-genetics, simulations, spatial-statistics
56 stars 9.13 score 88 scripts
gavinsimpson
analogue:Analogue and Weighted Averaging Methods for Palaeoecology
Fits Modern Analogue Technique and Weighted Averaging transfer function models for prediction of environmental data from species data, and related methods used in palaeoecology.
Maintained by Gavin L. Simpson. Last updated 6 months ago.
14 stars 8.87 score 185 scripts 4 dependents
pecanproject
PEcAn.emulator:Gaussian Process Emulator
Implementation of a Gaussian Process model (both likelihood and Bayesian approaches) for kriging and model emulation. Includes functions for sampling design and prediction.
Maintained by Mike Dietze. Last updated 2 days ago.
bayesian, cyberinfrastructure, data-assimilation, data-science, ecosystem-model, ecosystem-science, forecasting, meta-analysis, national-science-foundation, pecan, plants
216 stars 8.83 score 1 script 6 dependents
bart1
move:Visualizing and Analyzing Animal Track Data
Contains functions to access movement data stored in 'movebank.org' as well as tools to visualize and statistically analyze animal movement data, including functions to calculate dynamic Brownian Bridge Movement Models. 'move' helps address movement ecology questions.
Maintained by Bart Kranstauber. Last updated 4 months ago.
8.76 score 690 scripts 3 dependents
adamlilith
fasterRaster:Faster Raster and Spatial Vector Processing Using 'GRASS GIS'
Processing of large-in-memory/large-on-disk rasters and spatial vectors using 'GRASS GIS' <https://grass.osgeo.org/>. Most functions in the 'terra' package are recreated. Processing of medium-sized and smaller spatial objects will nearly always be faster using 'terra' or 'sf', but for large-in-memory/large-on-disk objects, 'fasterRaster' may be faster. To use most of the functions, you must have the stand-alone version (not the 'OSGeo4W' installer version) of 'GRASS GIS' 8.0 or higher.
Maintained by Adam B. Smith. Last updated 2 days ago.
aspect, distance, fragmentation, fragmentation-indices, gis, grass, grass-gis, raster, raster-projection, rasterize, slope, topography, vectorization
57 stars 7.68 score 8 scripts
gpiras
sphet:Estimation of Spatial Autoregressive Models with and without Heteroskedastic Innovations
Functions for fitting Cliff-Ord-type spatial autoregressive models with and without heteroskedastic innovations using Generalized Method of Moments estimation are provided. Some support is available for fitting spatial HAC models, and for fitting with non-spatial endogenous variables using instrumental variables.
Maintained by Gianfranco Piras. Last updated 19 days ago.
8 stars 7.43 score 188 scripts 3 dependents
gagolews
FuzzyNumbers:Tools to Deal with Fuzzy Numbers
S4 classes and methods to deal with fuzzy numbers. They allow for computing any arithmetic operations (e.g., by using the Zadeh extension principle), performing approximation of arbitrary fuzzy numbers by trapezoidal and piecewise linear ones, preparing plots for publications, computing possibility and necessity values for comparisons, etc.
Maintained by Marek Gagolewski. Last updated 3 years ago.
10 stars 7.37 score 91 scripts 17 dependents
bioc
OrganismDbi:Software to enable the smooth interfacing of different database packages
The package enables a simple unified interface to several annotation packages, each of which has its own schema, by taking advantage of the fact that each of these packages implements a select method.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
7.26 score 34 scripts 34 dependents
ludvigolsen
rearrr:Rearranging Data
Arrange data by a set of methods. Use rearrangers to reorder data points and mutators to change their values. From basic utilities, to centering the greatest value, to swirling in 3-dimensional space, 'rearrr' enables creativity when plotting and experimenting with data.
Maintained by Ludvig Renbo Olsen. Last updated 23 days ago.
arrange, cluster, expand, forming, generate, ggplot2, order, plotting-in-r, roll, rotate, shaping, swirl, transformations
24 stars 7.26 score 128 scripts 8 dependents
alexchristensen
NetworkToolbox:Methods and Measures for Brain, Cognitive, and Psychometric Network Analysis
Implements network analysis and graph theory measures used in neuroscience, cognitive science, and psychology. Methods include various filtering methods and approaches such as threshold, dependency (Kenett, Tumminello, Madi, Gur-Gershgoren, Mantegna, & Ben-Jacob, 2010 <doi:10.1371/journal.pone.0015032>), Information Filtering Networks (Barfuss, Massara, Di Matteo, & Aste, 2016 <doi:10.1103/PhysRevE.94.062306>), and Efficiency-Cost Optimization (Fallani, Latora, & Chavez, 2017 <doi:10.1371/journal.pcbi.1005305>). Brain methods include the recently developed Connectome Predictive Modeling (see references in package). Also implements several network measures including local network characteristics (e.g., centrality), community-level network characteristics (e.g., community centrality), global network characteristics (e.g., clustering coefficient), and various other measures associated with the reliability and reproducibility of network analysis.
Maintained by Alexander Christensen. Last updated 2 years ago.
23 stars 7.04 score 101 scripts 4 dependents
ips-lmu
emuR:Main Package of the EMU Speech Database Management System
Provide the EMU Speech Database Management System (EMU-SDMS) with database management, data extraction, data preparation and data visualization facilities. See <https://ips-lmu.github.io/The-EMU-SDMS-Manual/> for more details.
Maintained by Markus Jochim. Last updated 1 year ago.
24 stars 6.89 score 135 scripts 1 dependent
boydorr
rdiversity:Measurement and Partitioning of Similarity-Sensitive Biodiversity
Provides a framework for the measurement and partitioning of the (similarity-sensitive) biodiversity of a metacommunity and its constituent subcommunities. Richard Reeve, et al. (2016) <arXiv:1404.6520v3>.
Maintained by Richard Reeve. Last updated 3 years ago.
biodiversity, diversity-measurement, partitioning-diversity
8 stars 6.85 score 66 scripts 1 dependent
nepem-ufsc
pliman:Tools for Plant Image Analysis
Tools for both single and batch image manipulation and analysis (Olivoto, 2022 <doi:10.1111/2041-210X.13803>) and phytopathometry (Olivoto et al., 2022 <doi:10.1007/S40858-021-00487-5>). The tools can be used for the quantification of leaf area, object counting, extraction of image indexes, shape measurement, object landmark identification, and Elliptical Fourier Analysis of object outlines (Claude (2008) <doi:10.1007/978-0-387-77789-4>). The package also provides a comprehensive pipeline for generating shapefiles with complex layouts and supports high-throughput phenotyping of RGB, multispectral, and hyperspectral orthomosaics. This functionality facilitates field phenotyping using UAV- or satellite-based imagery.
Maintained by Tiago Olivoto. Last updated 1 day ago.
11 stars 6.76 score 476 scripts
achubaty
grainscape:Landscape Connectivity, Habitat, and Protected Area Networks
Given a landscape resistance surface, creates minimum planar graph (Fall et al. (2007) <doi:10.1007/s10021-007-9038-7>) and grains of connectivity (Galpern et al. (2012) <doi:10.1111/j.1365-294X.2012.05677.x>) models that can be used to calculate effective distances for landscape connectivity at multiple scales. Documentation is provided by several vignettes, and a paper (Chubaty, Galpern & Doctolero (2020) <doi:10.1111/2041-210X.13350>).
Maintained by Alex M Chubaty. Last updated 2 months ago.
habitat-connectivity, landscape-connectivity, spatial-graphs, cpp
19 stars 6.76 score 20 scripts
markheckmann
OpenRepGrid:Tools to Analyze Repertory Grid Data
Analyze repertory grids, a qualitative-quantitative data collection technique devised by George A. Kelly in the 1950s. Today, grids are used across various domains ranging from clinical psychology to marketing. The package contains functions to quantitatively analyze and visualize repertory grid data (e.g. 'Fransella', 'Bell', & 'Bannister', 2004, ISBN: 978-0-470-09080-0). The package is part of the <https://openrepgrid.org/> project.
Maintained by Mark Heckmann. Last updated 27 days ago.
19 stars 6.69 score 156 scripts
ericarcher
swfscMisc:Miscellaneous Functions for Southwest Fisheries Science Center
Collection of conversion, analytical, geodesic, mapping, and plotting functions. Used to support packages and code written by researchers at the Southwest Fisheries Science Center of the National Oceanic and Atmospheric Administration.
Maintained by Eric Archer. Last updated 3 days ago.
2 stars 6.64 score 101 scripts 20 dependents
pachadotdev
economiccomplexity:Computational Methods for Economic Complexity
A wrapper of different methods from Linear Algebra for the equations introduced in The Atlas of Economic Complexity and related literature. This package provides standard matrix and graph output that can be used seamlessly with other packages. See <doi:10.21105/joss.01866> for a summary of these methods and their evolution in the literature.
Maintained by Mauricio Vargas Sepulveda. Last updated 3 months ago.
economic-complexity, eigenvalues, eigenvectors, graphs, international-trade, matrix, networks, recursive-algorithm, openblas, cpp, openmp
39 stars 6.32 score 18 scripts
inseefr
disaggR:Two-Steps Benchmarks for Time Series Disaggregation
The twoStepsBenchmark() and threeRuleSmooth() functions allow you to disaggregate a low-frequency time series with higher frequency time series, using the French National Accounts methodology. The aggregated sum of the resulting time series is strictly equal to the low-frequency time series within the benchmarking window. Typically, the low-frequency time series is an annual one, unknown for the last year, and the high frequency one is either quarterly or monthly. See "Methodology of quarterly national accounts", Insee Méthodes N°126, by Insee (2012, ISBN:978-2-11-068613-8, <https://www.insee.fr/en/information/2579410>).
Maintained by Pauline Meinzel. Last updated 9 months ago.
disaggregation, statistical-package, time-series
11 stars 6.01 score 31 scripts
blasbenito
distantia:Advanced Toolset for Efficient Time Series Dissimilarity Analysis
Fast C++ implementation of Dynamic Time Warping for time series dissimilarity analysis, with applications in environmental monitoring and sensor data analysis, climate science, signal processing and pattern recognition, and financial data analysis. Built upon the ideas presented in Benito and Birks (2020) <doi:10.1111/ecog.04895>, provides tools for analyzing time series of varying lengths and structures, including irregular multivariate time series. Key features include individual variable contribution analysis, restricted permutation tests for statistical significance, and imputation of missing data via GAMs. Additionally, the package provides an ample set of tools to prepare and manage time series data.
Maintained by Blas M. Benito. Last updated 1 month ago.
dissimilarity, dynamic-time-warping, lock-step, time-series, cpp
23 stars 5.73 score 11 scripts
rbgramacy
laGP:Local Approximate Gaussian Process Regression
Performs approximate GP regression for large computer experiments and spatial datasets. The approximation is based on finding small local designs for prediction (independently) at particular inputs. OpenMP and SNOW parallelization are supported for prediction over a vast out-of-sample testing set; GPU acceleration is also supported for an important subroutine. OpenMP and GPU features may require special compilation. An interface to lower-level (full) GP inference and prediction is provided. Wrapper routines for blackbox optimization under mixed equality and inequality constraints via an augmented Lagrangian scheme, and for large scale computer model calibration, are also provided. For details and tutorial, see Gramacy (2016) <doi:10.18637/jss.v072.i01>.
Maintained by Robert B. Gramacy. Last updated 2 years ago.
8 stars 5.47 score 166 scripts 2 dependents
wjschne
ggdiagram:Object-Oriented Diagram Plots with ggplot2
The ggdiagram package creates path diagrams with an object-oriented approach and plots diagrams with ggplot2.
Maintained by W. Joel Schneider. Last updated 5 days ago.
diagrams, factor-analysis, ggplot2, path-analysis, s7, structural-equation-modeling
32 stars 5.43 score
jqveenstra
arfima:Fractional ARIMA (and Other Long Memory) Time Series Modeling
Simulates, fits, and predicts long-memory and anti-persistent time series, possibly mixed with ARMA, regression, and transfer-function components. Exact methods (MLE, forecasting, simulation) are used.
Maintained by JQ Veenstra. Last updated 1 year ago.
14 stars 5.31 score 81 scripts 1 dependent
goldingn
pop:A Flexible Syntax for Population Dynamic Modelling
Population dynamic models underpin a range of analyses and applications in ecology and epidemiology. The various approaches for analysing population dynamics models (MPMs, IPMs, ODEs, POMPs, PVA) each require the model to be defined in a different way. This makes it difficult to combine different modelling approaches and data types to solve a given problem. 'pop' aims to provide a flexible and easy to use common interface for constructing population dynamic models and enabling them to be fitted and analysed in lots of different ways.
Maintained by Nick Golding. Last updated 9 years ago.
10 stars 4.88 score 15 scripts
grundy95
changepoint.geo:Geometrically Inspired Multivariate Changepoint Detection
Implements the high-dimensional changepoint detection method GeomCP and the related mappings used for changepoint detection. These methods view the changepoint problem from a geometrical viewpoint and aim to extract relevant geometrical features in order to detect changepoints. The geomcp() function should be your first port of call. References: Grundy et al. (2020) <doi:10.1007/s11222-020-09940-y>.
Maintained by Thomas Grundy. Last updated 4 years ago.
11 stars 4.74 score 10 scripts
teazrq
orthoDr:Semi-Parametric Dimension Reduction Models Using Orthogonality Constrained Optimization
Utilize an orthogonality constrained optimization algorithm of Wen & Yin (2013) <DOI:10.1007/s10107-012-0584-1> to solve a variety of dimension reduction problems in the semiparametric framework, such as Ma & Zhu (2012) <DOI:10.1080/01621459.2011.646925>, Ma & Zhu (2013) <DOI:10.1214/12-AOS1072>, Sun, Zhu, Wang & Zeng (2019) <arXiv:1704.05046> and Zhou, Zhu & Zeng (2021) <arXiv:1802.06156>. It also serves as a general purpose optimization solver for problems with orthogonality constraints. Parallel computing for approximating the gradient is enabled through 'OpenMP'.
Maintained by Ruoqing Zhu. Last updated 3 years ago.
8 stars 4.53 score 14 scripts 2 dependents
bioc
DNABarcodes:A tool for creating and analysing DNA barcodes used in Next Generation Sequencing multiplexing experiments
The package offers a function to create DNA barcode sets capable of correcting insertion, deletion, and substitution errors. Existing barcodes can be analysed regarding their minimal, maximal and average distances between barcodes. Finally, reads that start with a (possibly mutated) barcode can be demultiplexed, i.e., assigned to their original reference barcode.
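The demultiplexing step described above can be sketched as a nearest-barcode lookup. The Python sketch below is a simplification under stated assumptions: it handles substitution errors only (Hamming distance), assumes all barcodes have equal length, and uses hypothetical names (`hamming`, `demultiplex`, `max_dist`). It does not reproduce the DNABarcodes API or its handling of insertions and deletions.

```python
from typing import List, Optional


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def demultiplex(read: str, barcodes: List[str], max_dist: int = 1) -> Optional[str]:
    """Assign a read to the barcode closest to its prefix.

    Returns None when the read is too short, when the best match is further
    away than max_dist, or when two barcodes tie for the best distance.
    """
    length = len(barcodes[0])  # assumption: all barcodes share this length
    prefix = read[:length]
    if len(prefix) < length:
        return None
    scored = sorted((hamming(prefix, bc), bc) for bc in barcodes)
    best_dist, best = scored[0]
    if best_dist > max_dist:
        return None
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None  # ambiguous assignment
    return best


if __name__ == "__main__":
    print(demultiplex("ACGAGGTTCA", ["ACGT", "TGCA", "GATC"]))
```

A barcode set whose members are pairwise at least 3 substitutions apart can correct any single substitution this way, which is why the minimal distance within a set matters.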
Maintained by Tilo Buschmann. Last updated 5 months ago.
preprocessing, sequencing, cpp, openmp
4.51 score 27 scripts
bioc
DNABarcodeCompatibility:A Tool for Optimizing Combinations of DNA Barcodes Used in Multiplexed Experiments on Next Generation Sequencing Platforms
The package allows one to obtain optimised combinations of DNA barcodes to be used for multiplex sequencing. In each barcode combination, barcodes are pooled with respect to Illumina chemistry constraints. Combinations can be filtered to keep those that are robust against substitution and insertion/deletion errors thereby facilitating the demultiplexing step. In addition, the package provides an optimiser function to further favor the selection of barcode combinations with least heterogeneity in barcode usage.
Maintained by Céline Trébeau. Last updated 5 months ago.
4.30 score 2 scripts
rajarshi
fingerprint:Functions to Operate on Binary Fingerprint Data
Functions to manipulate binary fingerprints of arbitrary length. A fingerprint is represented by an object of S4 class 'fingerprint' which is internally represented as a vector of integers, such that each element represents the position in the fingerprint that is set to 1. The bitwise logical functions in R are overridden so that they can be used directly with 'fingerprint' objects. A number of distance metrics are also available (many contributed by Michael Fadock). Fingerprints can be converted to Euclidean vectors (i.e., points on the unit hypersphere) and can also be folded using OR. Arbitrary fingerprint formats can be handled via line handlers. Currently handlers are provided for CDK, MOE and BCI fingerprint data.
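The sparse representation described above, where a fingerprint is stored as the positions of its set bits, can be sketched as follows. This is a Python illustration, not the package's S4 class or R API; the function names, the 1-based positions, the use of the Tanimoto coefficient as the example metric, and the value returned for two empty fingerprints are all assumptions.

```python
from typing import Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity: shared set bits divided by bits set in either fingerprint."""
    union = len(a | b)
    # Convention chosen for this sketch: two empty fingerprints get similarity 0.0.
    return len(a & b) / union if union else 0.0


def fold(bits: Set[int], nbit: int) -> Set[int]:
    """Fold a fingerprint of even length nbit in half using OR.

    Position i and position i + nbit/2 (1-based) map to the same folded bit,
    which is set if either of the two original bits was set.
    """
    if nbit % 2:
        raise ValueError("nbit must be even to fold in half")
    half = nbit // 2
    return {((pos - 1) % half) + 1 for pos in bits}


if __name__ == "__main__":
    print(tanimoto({1, 2, 3}, {2, 3, 4}))
    print(sorted(fold({1, 5, 8}, nbit=8)))
```

Storing only the set positions keeps long, sparse fingerprints compact, and set intersection and union then play the role of bitwise AND and OR.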
Maintained by Rajarshi Guha. Last updated 7 years ago.
4.27 score 82 scripts 12 dependents
tobiasschoch
wbacon:Weighted BACON Algorithms
The BACON algorithms are methods for multivariate outlier nomination (detection) and robust linear regression by Billor, Hadi, and Velleman (2000) <doi:10.1016/S0167-9473(99)00101-2>. The extension to weighted problems is due to Beguin and Hulliger (2008) <https://www150.statcan.gc.ca/n1/en/catalogue/12-001-X200800110616>; see also <doi:10.21105/joss.03238>.
Maintained by Tobias Schoch. Last updated 7 months ago.
outlier, outlier-detection, robust-regression, statistics, openblas, openmp
2 stars 4.00 score 8 scripts
mrshoenel
mmb:Arbitrary Dependency Mixed Multivariate Bayesian Models
Supports Bayesian models with full and partial (hence arbitrary) dependencies between random variables. Discrete and continuous variables are supported, and conditional joint probabilities and probability densities are estimated using Kernel Density Estimation (KDE). The full general form, which implements an extension to Bayes' theorem, as well as the simple form, which is just a Bayesian network, both support regression through segmentation and KDE and estimation of probability or relative likelihood of discrete or continuous target random variables. This package also provides true statistical distance measures based on Bayesian models. Furthermore, these measures can be used in neighborhood searches and to estimate the similarity and distance between data points. Related work is by Bayes (1763) <doi:10.1098/rstl.1763.0053> and by Scutari (2010) <doi:10.18637/jss.v035.i03>.
Maintained by Sebastian Hönel. Last updated 4 years ago.
bayes-classifier, kernel-density-estimation, neighborhood-search, regression-models
3.70 score 5 scripts
cnrakt
haplotypes:Manipulating DNA Sequences and Estimating Unambiguous Haplotype Network with Statistical Parsimony
Provides S4 classes and methods for reading and manipulating aligned DNA sequences, supporting indel coding methods (only the simple indel coding method is available in the current version), showing base substitutions and indels, calculating absolute pairwise distances between DNA sequences, and collapsing identical DNA sequences into haplotypes or inferring haplotypes using a user-provided absolute pairwise character difference matrix. This package also includes S4 classes and methods for estimating genealogical relationships among haplotypes using statistical parsimony and plotting parsimony networks.
Maintained by Caner Aktas. Last updated 2 years ago.
1 star 3.43 score 54 scripts
vascobranco
gecko:Geographical Ecology and Conservation Knowledge Online
Includes a collection of geographical analysis functions aimed primarily at ecology and conservation science studies, allowing processing of both point and raster data. Now integrates SPECTRE (<https://biodiversityresearch.org/spectre/>), a dataset of global geospatial threat data, developed by the authors.
Maintained by Vasco V. Branco. Last updated 3 months ago.
conservation-science, ecology, spatial-analysis
5 stars 3.40 score 4 scripts
fguenther
LSAfun:Applied Latent Semantic Analysis (LSA) Functions
Provides functions that allow for convenient working with vector space models of semantics/distributional semantic models/word embeddings. Originally built for LSA models (hence the name), but can be used for all such vector-based models. For actually building a vector semantic space, use the package 'lsa' or other specialized software. Downloadable semantic spaces can be found at <https://sites.google.com/site/fritzgntr/software-resources>.
Maintained by Fritz Guenther. Last updated 1 year ago.
1 star 3.18 score 85 scripts 1 dependent
rbgramacy
plgp:Particle Learning of Gaussian Processes
Sequential Monte Carlo (SMC) inference for fully Bayesian Gaussian process (GP) regression and classification models by particle learning (PL) following Gramacy & Polson (2011) <arXiv:0909.5262>. The sequential nature of inference and the active learning (AL) hooks provided facilitate thrifty sequential design (by entropy) and optimization (by improvement) for classification and regression models, respectively. This package essentially provides a generic PL interface, and functions (arguments to the interface) which implement the GP models and AL heuristics. Functions for a special, linked, regression/classification GP model and an integrated expected conditional improvement (IECI) statistic provide for optimization in the presence of unknown constraints. Separable and isotropic Gaussian, and single-index correlation functions are supported. See the examples section of ?plgp and demo(package="plgp") for an index of demos.
Maintained by Robert B. Gramacy. Last updated 2 years ago.
1 star 2.96 score 102 scripts 3 dependents
davezes
widals:Weighting by Inverse Distance with Adaptive Least Squares
Computationally easy modeling, interpolation, and forecasting of massive temporal-spatial data.
Maintained by Dave Zes. Last updated 4 days ago.
2.89 score 39 scripts
milesott
strategicplayers:Strategic Players
Identifies individuals in a social network who should be the intervention subjects for a network intervention in which you have a group of targets, a group of avoiders, and a group that is neither.
Maintained by Miles Ott. Last updated 1 year ago.
2.70 score 1 script
matutosi
ecan:Ecological Analysis and Visualization
Supports ecological analyses such as ordination and clustering. Contains consistent and easy wrapper functions for the 'stats', 'vegan', and 'labdsv' packages, and visualisation functions for ordination and clustering.
Maintained by Toshikazu Matsumura. Last updated 6 months ago.
2.70 score 5 scripts
dbolotov
neighbr:Classification, Regression, Clustering with K Nearest Neighbors
Classification, regression, and clustering with the k-nearest-neighbors algorithm. Implements several distance and similarity measures, covering continuous and logical features. Outputs ranked neighbors. Most features of this package are directly based on the PMML specification for KNN.
Maintained by Dmitriy Bolotov. Last updated 5 years ago.
2.48 score 30 scripts
cran
argosfilter:Argos Locations Filter
Filters animal satellite tracking data obtained from the Argos system (<https://www.argos-system.org/>), following the algorithm described in Freitas et al. (2008) <doi:10.1111/j.1748-7692.2007.00180.x>. It is especially indicated for telemetry studies of marine animals, where Argos locations are predominantly of low quality.
Maintained by Carla Freitas. Last updated 3 years ago.
2.00 score
cran
OasisR:Outright Tool for the Analysis of Spatial Inequalities and Segregation
A comprehensive set of indexes and tests for social segregation analysis, as described in Tivadar (2019) - 'OasisR': An R Package to Bring Some Order to the World of Segregation Measurement <doi:10.18637/jss.v089.i07>. The package is the most complete existing tool and it clarifies many ambiguities and errors regarding the definition of segregation indices. Additionally, 'OasisR' introduces several resampling methods that enable testing their statistical significance (randomization tests, bootstrapping, and jackknife methods).
Maintained by Mihai Tivadar. Last updated 5 months ago.
2 stars 1.78 score 1 dependent
cran
manifold:Operations for Riemannian Manifolds
Implements operations for Riemannian manifolds, e.g., geodesic distance, Riemannian metric, exponential and logarithm maps, etc. Also incorporates a random object generator on the manifolds. See Dai, Lin, and Müller (2021) <doi:10.1111/biom.13385>.
Maintained by Xiongtao Dai. Last updated 2 years ago.
1 star 1.48 score 1 dependent
cran
PerMallows:Permutations and Mallows Distributions
Includes functions to work with the Mallows and Generalized Mallows Models. The considered distances are Kendall's-tau, Cayley, Hamming and Ulam, and it includes functions for making inference, sampling and learning such distributions, some of which are novel in the literature. As a by-product, PerMallows also includes operations for permutations, paying special attention to those related to the Kendall's-tau, Cayley, Ulam and Hamming distances. It is also possible to generate random permutations at a given distance, or with a given number of inversions, cycles, or fixed points, or even with a given length of the LIS (longest increasing subsequence).
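For readers unfamiliar with the four distances named above, the following Python sketch computes each of them between two permutations of 0..n-1 using their standard textbook definitions. It is not the PerMallows API; all function names are hypothetical, and the quadratic inversion count is chosen for clarity rather than speed.

```python
from bisect import bisect_left
from typing import List


def _relabel(a: List[int], b: List[int]) -> List[int]:
    """Rewrite a in the labelling where b is the identity permutation."""
    rank_in_b = {value: index for index, value in enumerate(b)}
    return [rank_in_b[value] for value in a]


def kendall_tau(a: List[int], b: List[int]) -> int:
    """Number of item pairs that a and b place in opposite order (inversions)."""
    c = _relabel(a, b)
    n = len(c)
    return sum(c[i] > c[j] for i in range(n) for j in range(i + 1, n))


def hamming(a: List[int], b: List[int]) -> int:
    """Number of positions at which the two permutations disagree."""
    return sum(x != y for x, y in zip(a, b))


def cayley(a: List[int], b: List[int]) -> int:
    """Minimum number of transpositions turning a into b: n minus the cycle count."""
    c = _relabel(a, b)
    seen = [False] * len(c)
    cycles = 0
    for start in range(len(c)):
        if not seen[start]:
            cycles += 1
            j = start
            while not seen[j]:
                seen[j] = True
                j = c[j]
    return len(c) - cycles


def ulam(a: List[int], b: List[int]) -> int:
    """n minus the length of the longest increasing subsequence of the relabelled a."""
    tails: List[int] = []  # tails[k] = smallest tail of an increasing run of length k + 1
    for x in _relabel(a, b):
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
        else:
            tails[k] = x
    return len(a) - len(tails)


if __name__ == "__main__":
    a, b = [2, 0, 1], [0, 1, 2]
    print(kendall_tau(a, b), hamming(a, b), cayley(a, b), ulam(a, b))
```

All four functions return 0 exactly when the two permutations are equal, and they can rank the same pair of permutations differently, as the example in the `__main__` block shows.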
Maintained by Ekhine Irurozki. Last updated 1 month ago.
1 star 1.00 score
cran
SpatialAcc:Spatial Accessibility Measures
Provides a set of spatial accessibility measures from a set of locations (demand) to another set of locations (supply). It aims, among others, to support research on spatial accessibility to health care facilities. Includes the locations and some characteristics of major public hospitals in Greece.
Maintained by Stamatis Kalogirou. Last updated 12 months ago.
1.00 score