Showing 200 of total 609 results (show query)
r-lib
bit:Classes and Methods for Fast Memory-Efficient Boolean Selections
Provided are classes for boolean and skewed boolean vectors, fast boolean methods, fast unique and non-unique integer sorting, fast set operations on sorted and unsorted sets of integers, and foundations for ff (range index, compression, chunked processing).
Maintained by Michael Chirico. Last updated 7 days ago.
30.3 match 12 stars 15.15 score 131 scripts 3.2k dependentsrpolars
polars:Lightning-Fast 'DataFrame' Library
Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.
Maintained by Soren Welling. Last updated 4 days ago.
26.2 match 499 stars 12.01 score 1.0k scripts 2 dependentsbioc
S4Vectors:Foundation of vector-like and list-like containers in Bioconductor
The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantic of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).
Maintained by Hervé Pagès. Last updated 1 months ago.
infrastructuredatarepresentationbioconductor-packagecore-package
14.1 match 18 stars 16.05 score 1.0k scripts 1.9k dependentsinsightsengineering
rtables:Reporting Tables
Reporting tables often have structure that goes beyond simple rectangular data. The 'rtables' package provides a framework for declaring complex multi-level tabulations and then applying them to data. This framework models both tabulation and the resulting tables as hierarchical, tree-like objects which support sibling sub-tables, arbitrary splitting or grouping of data in row and column dimensions, cells containing multiple values, and the concept of contextual summary computations. A convenient pipe-able interface is provided for declaring table layouts and the corresponding computations, and then applying them to data.
Maintained by Joe Zhu. Last updated 2 months ago.
15.6 match 232 stars 13.65 score 238 scripts 17 dependentsr-lib
bit64:A S3 Class for Vectors of 64bit Integers
Package 'bit64' provides serializable S3 atomic 64bit (signed) integers. These are useful for handling database keys and exact counting in +-2^63. WARNING: do not use them as replacement for 32bit integers, integer64 are not supported for subscripting by R-core and they have different semantics when combined with double, e.g. integer64 + double => integer64. Class integer64 can be used in vectors, matrices, arrays and data.frames. Methods are available for coercion from and to logicals, integers, doubles, characters and factors as well as many elementwise and summary functions. Many fast algorithmic operations such as 'match' and 'order' support inter- active data exploration and manipulation and optionally leverage caching.
Maintained by Michael Chirico. Last updated 5 days ago.
11.6 match 35 stars 14.91 score 1.5k scripts 3.2k dependentsgpilgrim2670
SwimmeR:Data Import, Cleaning, and Conversions for Swimming Results
The goal of the 'SwimmeR' package is to provide means of acquiring, and then analyzing, data from swimming (and diving) competitions. To that end 'SwimmeR' allows results to be read in from .html sources, like 'Hy-Tek' real time results pages, '.pdf' files, 'ISL' results, 'Omega' results, and (on a development basis) '.hy3' files. Once read in, 'SwimmeR' can convert swimming times (performances) between the computationally useful format of seconds reported to the '100ths' place (e.g. 95.37), and the conventional reporting format (1:35.37) used in the swimming community. 'SwimmeR' can also score meets in a variety of formats with user defined point values, convert times between courses ('LCM', 'SCM', 'SCY') and draw single elimination brackets, as well as providing a suite of tools for working cleaning swimming data. This is a developmental package, not yet mature.
Maintained by Greg Pilgrim. Last updated 2 years ago.
32.6 match 4 stars 4.53 score 17 scriptsbioc
GenomicRanges:Representation and manipulation of genomic intervals
The ability to efficiently represent and manipulate genomic annotations and alignments is playing a central role when it comes to analyzing high-throughput sequencing data (a.k.a. NGS data). The GenomicRanges package defines general purpose containers for storing and manipulating genomic intervals and variables defined along a genome. More specialized containers for representing and manipulating short alignments against a reference genome, or a matrix-like summarization of an experiment, are defined in the GenomicAlignments and SummarizedExperiment packages, respectively. Both packages build on top of the GenomicRanges infrastructure.
Maintained by Hervé Pagès. Last updated 4 months ago.
geneticsinfrastructuredatarepresentationsequencingannotationgenomeannotationcoveragebioconductor-packagecore-package
7.8 match 44 stars 17.75 score 13k scripts 1.3k dependentsherveabdi
DistatisR:DiSTATIS Three Way Metric Multidimensional Scaling
Implement DiSTATIS and CovSTATIS (three-way multidimensional scaling). DiSTATIS and CovSTATIS are used to analyze multiple distance/covariance matrices collected on the same set of observations. These methods are based on Abdi, H., Williams, L.J., Valentin, D., & Bennani-Dosse, M. (2012) <doi:10.1002/wics.198>.
Maintained by Herve Abdi. Last updated 1 years ago.
3-way-mdsdistatismetric-multidimensional-scaling
29.9 match 4 stars 4.42 score 44 scriptshusson
SensoMineR:Sensory Data Analysis
Statistical Methods to Analyse Sensory Data. SensoMineR: A package for sensory data analysis. S. Le and F. Husson (2008).
Maintained by Francois Husson. Last updated 1 years ago.
22.8 match 5.72 score 108 scripts 3 dependentseitsupi
neopolars:R Bindings for the 'polars' Rust Library
Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.
Maintained by Tatsuya Shima. Last updated 6 hours ago.
24.9 match 40 stars 4.87 score 1 scriptsgluc
data.tree:General Purpose Hierarchical Data Structure
Create tree structures from hierarchical data, and traverse the tree in various orders. Aggregate, cumulate, print, plot, convert to and from data.frame and more. Useful for decision trees, machine learning, finance, conversion from and to JSON, and many other applications.
Maintained by Christoph Glur. Last updated 5 months ago.
9.0 match 209 stars 12.84 score 1.1k scripts 88 dependentsrdatatable
data.table:Extension of `data.frame`
Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns, friendly and fast character-separated-value read/write. Offers a natural and flexible syntax, for faster development.
Maintained by Tyson Barrett. Last updated 13 hours ago.
4.4 match 3.7k stars 23.53 score 230k scripts 4.6k dependentsbioc
BiocGenerics:S4 generic functions used in Bioconductor
The package defines many S4 generic functions used in Bioconductor.
Maintained by Hervé Pagès. Last updated 1 months ago.
infrastructurebioconductor-packagecore-package
7.0 match 12 stars 14.22 score 612 scripts 2.2k dependentsr-forge
Matrix:Sparse and Dense Matrix Classes and Methods
A rich hierarchy of sparse and dense matrix classes, including general, symmetric, triangular, and diagonal matrices with numeric, logical, or pattern entries. Efficient methods for operating on such matrices, often wrapping the 'BLAS', 'LAPACK', and 'SuiteSparse' libraries.
Maintained by Martin Maechler. Last updated 8 days ago.
5.6 match 1 stars 17.23 score 33k scripts 12k dependentsbioc
SingleMoleculeFootprinting:Analysis tools for Single Molecule Footprinting (SMF) data
SingleMoleculeFootprinting provides functions to analyze Single Molecule Footprinting (SMF) data. Following the workflow exemplified in its vignette, the user will be able to perform basic data analysis of SMF data with minimal coding effort. Starting from an aligned bam file, we show how to perform quality controls over sequencing libraries, extract methylation information at the single molecule level accounting for the two possible kind of SMF experiments (single enzyme or double enzyme), classify single molecules based on their patterns of molecular occupancy, plot SMF information at a given genomic location.
Maintained by Guido Barzaghi. Last updated 29 days ago.
dnamethylationcoveragenucleosomepositioningdatarepresentationepigeneticsmethylseqqualitycontrolsequencing
14.7 match 2 stars 6.43 score 27 scriptsrspatial
terra:Spatial Data Analysis
Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).
Maintained by Robert J. Hijmans. Last updated 6 hours ago.
geospatialrasterspatialvectoronetbbprojgdalgeoscpp
5.3 match 559 stars 17.64 score 17k scripts 851 dependentslarmarange
labelled:Manipulating Labelled Data
Work with labelled data imported from 'SPSS' or 'Stata' with 'haven' or 'foreign'. This package provides useful functions to deal with "haven_labelled" and "haven_labelled_spss" classes introduced by 'haven' package.
Maintained by Joseph Larmarange. Last updated 27 days ago.
havenlabelsmetadatasasspssstata
6.2 match 76 stars 15.02 score 2.4k scripts 96 dependentstalgalili
dendextend:Extending 'dendrogram' Functionality in R
Offers a set of functions for extending 'dendrogram' objects in R, letting you visualize and compare trees of 'hierarchical clusterings'. You can (1) Adjust a tree's graphical parameters - the color, size, type, etc of its branches, nodes and labels. (2) Visually and statistically compare different 'dendrograms' to one another.
Maintained by Tal Galili. Last updated 2 months ago.
5.4 match 154 stars 17.02 score 6.0k scripts 164 dependentsgagolews
stringi:Fast and Portable Character String Processing Facilities
A collection of character string/text/natural language processing tools for pattern searching (e.g., with 'Java'-like regular expressions or the 'Unicode' collation algorithm), random string generation, case mapping, string transliteration, concatenation, sorting, padding, wrapping, Unicode normalisation, date-time formatting and parsing, and many more. They are fast, consistent, convenient, and - thanks to 'ICU' (International Components for Unicode) - portable across all locales and platforms. Documentation about 'stringi' is provided via its website at <https://stringi.gagolewski.com/> and the paper by Gagolewski (2022, <doi:10.18637/jss.v103.i02>).
Maintained by Marek Gagolewski. Last updated 1 months ago.
icuicu4cnatural-language-processingnlpregexregexpstring-manipulationstringistringrtexttext-processingtidy-dataunicodecpp
5.0 match 309 stars 18.31 score 10k scripts 8.6k dependentsmlverse
torch:Tensors and Neural Networks with 'GPU' Acceleration
Provides functionality to define and train neural networks similar to 'PyTorch' by Paszke et al (2019) <doi:10.48550/arXiv.1912.01703> but written entirely in R using the 'libtorch' library. Also supports low-level tensor operations and 'GPU' acceleration.
Maintained by Daniel Falbel. Last updated 7 days ago.
5.3 match 520 stars 16.52 score 1.4k scripts 38 dependentsigraph
igraph:Network Analysis and Visualization
Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.
Maintained by Kirill Müller. Last updated 13 hours ago.
complex-networksgraph-algorithmsgraph-theorymathematicsnetwork-analysisnetwork-graphfortranlibxml2glpkopenblascpp
4.1 match 582 stars 21.11 score 31k scripts 1.9k dependentshwborchers
pracma:Practical Numerical Math Functions
Provides a large number of functions from numerical analysis and linear algebra, numerical optimization, differential equations, time series, plus some well-known special mathematical functions. Uses 'MATLAB' function names where appropriate to simplify porting.
Maintained by Hans W. Borchers. Last updated 1 years ago.
7.0 match 29 stars 12.34 score 6.6k scripts 931 dependentsjchiquet
aricode:Efficient Computations of Standard Clustering Comparison Measures
Implements an efficient O(n) algorithm based on bucket-sorting for fast computation of standard clustering comparison measures. Available measures include adjusted Rand index (ARI), normalized information distance (NID), normalized mutual information (NMI), adjusted mutual information (AMI), normalized variation information (NVI) and entropy, as described in Vinh et al (2009) <doi:10.1145/1553374.1553511>. Include AMI (Adjusted Mutual Information) since version 0.1.2, a modified version of ARI (MARI), as described in Sundqvist et al. <doi:10.1007/s00180-022-01230-7> and simple Chi-square distance since version 1.0.0.
Maintained by Julien Chiquet. Last updated 1 years ago.
bucket-sortclusteringclustering-comparison-measurescpp
10.3 match 25 stars 8.15 score 542 scripts 14 dependentswinvector
WVPlots:Common Plots for Analysis
Select data analysis plots, under a standardized calling interface implemented on top of 'ggplot2' and 'plotly'. Plots of interest include: 'ROC', gain curve, scatter plot with marginal distributions, conditioned scatter plot with marginal densities, box and stem with matching theoretical distribution, and density with matching theoretical distribution.
Maintained by John Mount. Last updated 11 months ago.
10.4 match 85 stars 8.00 score 280 scriptsbioc
microbiome:Microbiome Analytics
Utilities for microbiome analysis.
Maintained by Leo Lahti. Last updated 5 months ago.
metagenomicsmicrobiomesequencingsystemsbiologyhitchiphitchip-atlashuman-microbiomemicrobiologymicrobiome-analysisphyloseqpopulation-study
6.4 match 290 stars 12.50 score 2.0k scripts 5 dependentseddelbuettel
nanotime:Nanosecond-Resolution Time Support for R
Full 64-bit resolution date and time functionality with nanosecond granularity is provided, with easy transition to and from the standard 'POSIXct' type. Three additional classes offer interval, period and duration functionality for nanosecond-resolution timestamps.
Maintained by Dirk Eddelbuettel. Last updated 1 months ago.
datetimedatetimesnanosecond-resolutionnanosecondscpp
7.2 match 53 stars 10.91 score 134 scripts 17 dependentsmhahsler
arules:Mining Association Rules and Frequent Itemsets
Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides C implementations of the association mining algorithms Apriori and Eclat. Hahsler, Gruen and Hornik (2005) <doi:10.18637/jss.v014.i15>.
Maintained by Michael Hahsler. Last updated 1 months ago.
arulesassociation-rulesfrequent-itemsets
5.6 match 194 stars 13.99 score 3.3k scripts 28 dependentsdbosak01
procs:Recreates Some 'SAS®' Procedures in 'R'
Contains functions to simulate the most commonly used 'SAS®' procedures. Specifically, the package aims to simulate the functionality of 'proc freq', 'proc means', 'proc ttest', 'proc reg', 'proc transpose', 'proc sort', and 'proc print'. The simulation will include recreating all statistics with the highest fidelity possible.
Maintained by David Bosak. Last updated 10 months ago.
9.8 match 6 stars 7.57 score 37 scripts 2 dependentsmatthewheun
matsbyname:An Implementation of Matrix Mathematics that Respects Row and Column Names
An implementation of matrix mathematics wherein operations are performed "by name."
Maintained by Matthew Heun. Last updated 11 days ago.
11.0 match 2 stars 6.65 score 150 scripts 1 dependentsjolars
SLOPE:Sorted L1 Penalized Estimation
Efficient implementations for Sorted L-One Penalized Estimation (SLOPE): generalized linear models regularized with the sorted L1-norm (Bogdan et al. 2015). Supported models include ordinary least-squares regression, binomial regression, multinomial regression, and Poisson regression. Both dense and sparse predictor matrices are supported. In addition, the package features predictor screening rules that enable fast and efficient solutions to high-dimensional problems.
Maintained by Johan Larsson. Last updated 3 days ago.
generalized-linear-modelsslopesparse-regressioncppopenmp
7.6 match 17 stars 9.62 score 75 scripts 3 dependentssnoweye
pbdMPI:R Interface to MPI for HPC Clusters (Programming with Big Data Project)
A simplified, efficient, interface to MPI for HPC clusters. It is a derivation and rethinking of the Rmpi package. pbdMPI embraces the prevalent parallel programming style on HPC clusters. Beyond the interface, a collection of functions for global work with distributed data and resource-independent RNG reproducibility is included. It is based on S4 classes and methods.
Maintained by Wei-Chen Chen. Last updated 6 months ago.
10.0 match 2 stars 7.11 score 179 scripts 3 dependentsmelff
memisc:Management of Survey Data and Presentation of Analysis Results
An infrastructure for the management of survey data including value labels, definable missing values, recoding of variables, production of code books, and import of (subsets of) 'SPSS' and 'Stata' files is provided. Further, the package allows to produce tables and data frames of arbitrary descriptive statistics and (almost) publication-ready tables of regression model estimates, which can be exported to 'LaTeX' and HTML.
Maintained by Martin Elff. Last updated 12 days ago.
5.7 match 46 stars 12.34 score 1.2k scripts 13 dependentskkholst
mets:Analysis of Multivariate Event Times
Implementation of various statistical models for multivariate event history data <doi:10.1007/s10985-013-9244-x>. Including multivariate cumulative incidence models <doi:10.1002/sim.6016>, and bivariate random effects probit models (Liability models) <doi:10.1016/j.csda.2015.01.014>. Modern methods for survival analysis, including regression modelling (Cox, Fine-Gray, Ghosh-Lin, Binomial regression) with fast computation of influence functions.
Maintained by Klaus K. Holst. Last updated 10 hours ago.
multivariate-time-to-eventsurvival-analysistime-to-eventfortranopenblascpp
5.2 match 14 stars 13.47 score 236 scripts 42 dependentspboutros
bedr:Genomic Region Processing using Tools Such as 'BEDTools', 'BEDOPS' and 'Tabix'
Genomic regions processing using open-source command line tools such as 'BEDTools', 'BEDOPS' and 'Tabix'. These tools offer scalable and efficient utilities to perform genome arithmetic e.g indexing, formatting and merging. bedr API enhances access to these tools as well as offers additional utilities for genomic regions processing.
Maintained by Paul C. Boutros. Last updated 6 years ago.
13.9 match 4.98 score 264 scripts 2 dependentsr-lib
vctrs:Vector Helpers
Defines new notions of prototype and size that are used to provide tools for consistent and well-founded type-coercion and size-recycling, and are in turn connected to ideas of type- and size-stability useful for analysing function interfaces.
Maintained by Davis Vaughan. Last updated 5 months ago.
3.6 match 290 stars 18.97 score 1.1k scripts 13k dependentsforkonlp
N2H4:Handling Methods for Naver News Text Crawling
Provides some functions to get Korean text sample from news articles in Naver which is popular news portal service <https://news.naver.com/> in Korea.
Maintained by Chanyub Park. Last updated 1 years ago.
crawlercrawlinggetcommentshacktoberfesthacktoberfest2021koreannavernewssort
11.0 match 216 stars 6.11 score 20 scriptstidyverse
stringr:Simple, Consistent Wrappers for Common String Operations
A consistent, simple and easy to use set of wrappers around the fantastic 'stringi' package. All function and argument names (and positions) are consistent, all functions deal with "NA"'s and zero length vectors in the same way, and the output from one function is easy to feed into the input of another.
Maintained by Hadley Wickham. Last updated 7 months ago.
3.0 match 628 stars 21.99 score 164k scripts 8.3k dependentsbioc
PhyloProfile:PhyloProfile
PhyloProfile is a tool for exploring complex phylogenetic profiles. Phylogenetic profiles, presence/absence patterns of genes over a set of species, are commonly used to trace the functional and evolutionary history of genes across species and time. With PhyloProfile we can enrich regular phylogenetic profiles with further data like sequence/structure similarity, to make phylogenetic profiling more meaningful. Besides the interactive visualisation powered by R-Shiny, the package offers a set of further analysis features to gain insights like the gene age estimation or core gene identification.
Maintained by Vinh Tran. Last updated 7 days ago.
softwarevisualizationdatarepresentationmultiplecomparisonfunctionalpredictiondimensionreductionbioinformaticsheatmapinteractive-visualizationsorthologsphylogenetic-profileshiny
8.4 match 33 stars 7.77 score 10 scriptsrenkun-ken
rlist:A Toolbox for Non-Tabular Data Manipulation
Provides a set of functions for data manipulation with list objects, including mapping, filtering, grouping, sorting, updating, searching, and other useful functions. Most functions are designed to be pipeline friendly so that data processing with lists can be chained.
Maintained by Kun Ren. Last updated 2 years ago.
4.6 match 206 stars 13.73 score 2.2k scripts 123 dependentseagerai
fastai:Interface to 'fastai'
The 'fastai' <https://docs.fast.ai/index.html> library simplifies training fast and accurate neural networks using modern best practices. It is based on research in to deep learning best practices undertaken at 'fast.ai', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.
Maintained by Turgut Abdullayev. Last updated 11 months ago.
audiocollaborative-filteringdarknetdarknet-image-classificationfastaimedicalobject-detectiontabulartextvision
6.6 match 118 stars 9.40 score 76 scriptspachadotdev
cpp11armadillo:An 'Armadillo' Interface
Provides function declarations and inline function definitions that facilitate communication between R and the 'Armadillo' 'C++' library for linear algebra and scientific computing. This implementation is detailed in Vargas Sepulveda and Schneider Malamud (2024) <doi:10.48550/arXiv.2408.11074>.
Maintained by Mauricio Vargas Sepulveda. Last updated 27 days ago.
armadillocppcpp11hacktoberfestlinear-algebra
6.7 match 9 stars 9.14 score 1 scripts 16 dependentsquanteda
quanteda:Quantitative Analysis of Textual Data
A fast, flexible, and comprehensive framework for quantitative text analysis in R. Provides functionality for corpus management, creating and manipulating tokens and n-grams, exploring keywords in context, forming and manipulating sparse matrices of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and distances, applying content dictionaries, applying supervised and unsupervised machine learning, visually representing text and text analyses, and more.
Maintained by Kenneth Benoit. Last updated 2 months ago.
corpusnatural-language-processingquantedatext-analyticsonetbbcpp
3.6 match 851 stars 16.68 score 5.4k scripts 51 dependentspaterijk
MCDA:Support for the Multicriteria Decision Aiding Process
Support for the analyst in a Multicriteria Decision Aiding (MCDA) process with algorithms, preference elicitation and data visualisation functions. Sébastien Bigaret, Richard Hodgett, Patrick Meyer, Tatyana Mironova, Alexandru Olteanu (2017) Supporting the multi-criteria decision aiding process : R and the MCDA package, Euro Journal On Decision Processes, Volume 5, Issue 1 - 4, pages 169 - 194 <doi:10.1007/s40070-017-0064-1>.
Maintained by Patrick Meyer. Last updated 2 years ago.
9.8 match 30 stars 6.04 score 182 scriptsbioc
Biostrings:Efficient manipulation of biological strings
Memory efficient string containers, string matching algorithms, and other utilities, for fast manipulation of large biological sequences or sets of sequences.
Maintained by Hervé Pagès. Last updated 25 days ago.
sequencematchingalignmentsequencinggeneticsdataimportdatarepresentationinfrastructurebioconductor-packagecore-package
3.3 match 61 stars 17.83 score 8.6k scripts 1.2k dependentssparklyr
sparklyr:R Interface to Apache Spark
R interface to Apache Spark, a fast and general engine for big data processing, see <https://spark.apache.org/>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.
Maintained by Edgar Ruiz. Last updated 11 days ago.
apache-sparkdistributeddplyridelivymachine-learningremote-clusterssparksparklyr
3.8 match 959 stars 15.16 score 4.0k scripts 21 dependentsbioc
SummarizedExperiment:A container (S4 class) for matrix-like assays
The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.
Maintained by Hervé Pagès. Last updated 5 months ago.
geneticsinfrastructuresequencingannotationcoveragegenomeannotationbioconductor-packagecore-package
3.3 match 34 stars 16.85 score 8.6k scripts 1.2k dependentstomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 years ago.
6.8 match 3 stars 8.20 score 7.8k scripts 11 dependentsropensci
beautier:'BEAUti' from R
'BEAST2' (<https://www.beast2.org>) is a widely used Bayesian phylogenetic tool, that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. 'BEAUti 2' (which is part of 'BEAST2') is a GUI tool that allows users to specify the many possible setups and generates the XML file 'BEAST2' needs to run. This package provides a way to create 'BEAST2' input files without active user input, but using R function calls instead.
Maintained by Richèl J.C. Bilderbeek. Last updated 24 days ago.
bayesianbeastbeast2beautiphylogenetic-inferencephylogenetics
6.3 match 13 stars 8.76 score 198 scripts 5 dependentsbioc
uSORT:uSORT: A self-refining ordering pipeline for gene selection
This package is designed to uncover the intrinsic cell progression path from single-cell RNA-seq data. It incorporates data pre-processing, preliminary PCA gene selection, preliminary cell ordering, feature selection, refined cell ordering, and post-analysis interpretation and visualization.
Maintained by Hao Chen. Last updated 5 months ago.
immunooncologyrnaseqguicellbiologydnaseq
16.2 match 3.30 scorearchaeostat
ArchaeoPhases:Post-Processing of Markov Chain Monte Carlo Simulations for Chronological Modelling
Statistical analysis of archaeological dates and groups of dates. This package allows to post-process Markov Chain Monte Carlo (MCMC) simulations from 'ChronoModel' <https://chronomodel.com/>, 'Oxcal' <https://c14.arch.ox.ac.uk/oxcal.html> or 'BCal' <https://bcal.shef.ac.uk/>. It provides functions for the study of rhythms of the long term from the posterior distribution of a series of dates (tempo and activity plot). It also allows the estimation and visualization of time ranges from the posterior distribution of groups of dates (e.g. duration, transition and hiatus between successive phases) as described in Philippe and Vibet (2020) <doi:10.18637/jss.v093.c01>.
Maintained by Anne Philippe. Last updated 11 months ago.
archaeologybayesian-statisticsgeochronologymarkov-chainradiocarbon-dates
7.6 match 10 stars 6.90 score 66 scriptsgdemin
expss:Tables, Labels and Some Useful Functions from Spreadsheets and 'SPSS' Statistics
Package computes and displays tables with support for 'SPSS'-style labels, multiple and nested banners, weights, multiple-response variables and significance testing. There are facilities for nice output of tables in 'knitr', 'Shiny', '*.xlsx' files, R and 'Jupyter' notebooks. Methods for labelled variables add value labels support to base R functions and to some functions from other packages. Additionally, the package brings popular data transformation functions from 'SPSS' Statistics and 'Excel': 'RECODE', 'COUNT', 'COUNTIF', 'VLOOKUP' and etc. These functions are very useful for data processing in marketing research surveys. Package intended to help people to move data processing from 'Excel' and 'SPSS' to R.
Maintained by Gregory Demin. Last updated 11 months ago.
excellabelslabels-supportmsexcelpivot-tablesrecodespssspss-statisticstablesvariable-labelsvlookup
4.7 match 84 stars 11.00 score 1.8k scripts 4 dependentsturtletopia
versionsort:Sort and Order Version Codes
A lightweight package for sorting version codes in various forms. No strong dependencies guaranteed.
Maintained by Laura Bakala. Last updated 3 years ago.
13.2 match 5 stars 3.88 score 4 scripts 1 dependentsmatloff
partools:Tools for the 'Parallel' Package
Miscellaneous utilities for parallelizing large computations. Alternative to MapReduce. File splitting and distributed operations such as sort and aggregate. "Software Alchemy" method for parallelizing most statistical methods, presented in N. Matloff, Parallel Computation for Data Science, Chapman and Hall, 2015. Includes a debugging aid.
Maintained by Norm Matloff. Last updated 2 years ago.
6.6 match 40 stars 7.51 score 30 scripts 3 dependentscdueben
cppcontainers:'C++' Standard Template Library Containers
Use 'C++' Standard Template Library containers interactively in R. Includes sets, unordered sets, multisets, unordered multisets, maps, unordered maps, multimaps, unordered multimaps, stacks, queues, priority queues, vectors, deques, forward lists, and lists.
Maintained by Christian Düben. Last updated 2 months ago.
10.6 match 4.70 score 1 scriptsberndbischl
BBmisc:Miscellaneous Helper Functions for B. Bischl
Miscellaneous helper functions for and from B. Bischl and some other guys, mainly for package development.
Maintained by Bernd Bischl. Last updated 2 years ago.
4.6 match 20 stars 10.59 score 980 scripts 69 dependentsrstudio
keras3:R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.
Maintained by Tomasz Kalinowski. Last updated 21 hours ago.
3.6 match 845 stars 13.60 score 264 scripts 2 dependentsinsightsengineering
tern:Create Common TLGs Used in Clinical Trials
Table, Listings, and Graphs (TLG) library for common outputs used in clinical trials.
Maintained by Joe Zhu. Last updated 2 months ago.
clinical-trialsgraphslistingsnestoutputstables
3.8 match 79 stars 12.62 score 186 scripts 9 dependentsa-dickerson
portsort:Factor-Based Portfolio Sorts
Designed to aid both academic researchers and asset managers in conducting factor based portfolio sorts. Provides functionality to sort assets into portfolios for up to three factors via a conditional or unconditional sorting procedure.
Maintained by Alex Dickerson. Last updated 6 years ago.
20.5 match 2.32 score 21 scriptssimongrund1
mitml:Tools for Multiple Imputation in Multilevel Modeling
Provides tools for multiple imputation of missing data in multilevel modeling. Includes a user-friendly interface to the packages 'pan' and 'jomo', and several functions for visualization, data management and the analysis of multiply imputed data sets.
Maintained by Simon Grund. Last updated 1 years ago.
imputationmissing-datamixed-effectsmultilevel-datamultilevel-models
3.8 match 29 stars 12.36 score 246 scripts 153 dependentskjhealy
gssrdoc:Document General Social Survey Variable
The General Social Survey (GSS) is a long-running, mostly annual survey of US households. It is administered by the National Opinion Research Center (NORC). This package contains the a tibble with information on the survey variables, together with every variable documented as an R help page. For more information on the GSS see \url{http://gss.norc.org}.
Maintained by Kieran Healy. Last updated 11 months ago.
20.4 match 2.28 score 38 scriptsbioc
ShortRead:FASTQ input and manipulation
This package implements sampling, iteration, and input of FASTQ files. The package includes functions for filtering and trimming reads, and for generating a quality assessment report. Data are represented as DNAStringSet-derived objects, and easily manipulated for a diversity of purposes. The package also contains legacy support for early single-end, ungapped alignment formats.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
dataimportsequencingqualitycontrolbioconductor-packagecore-packagezlibcpp
3.8 match 8 stars 12.08 score 1.8k scripts 49 dependentsbioc
GenomicAlignments:Representation and manipulation of short genomic alignments
Provides efficient containers for storing and manipulating short genomic alignments (typically obtained by aligning short reads to a reference genome). This includes read counting, computing the coverage, junction detection, and working with the nucleotide content of the alignments.
Maintained by Hervé Pagès. Last updated 5 months ago.
infrastructuredataimportgeneticssequencingrnaseqsnpcoveragealignmentimmunooncologybioconductor-packagecore-package
3.3 match 10 stars 13.61 score 3.1k scripts 529 dependentsagisga
grpSLOPE:Group Sorted L1 Penalized Estimation
Group SLOPE (Group Sorted L1 Penalized Estimation) is a penalized linear regression method that is used for adaptive selection of groups of significant predictors in a high-dimensional linear model. The Group SLOPE method can control the (group) false discovery rate at a user-specified level (i.e., control the expected proportion of irrelevant among all selected groups of predictors). For additional information about the implemented methods please see Brzyski, Gossmann, Su, Bogdan (2018) <doi:10.1080/01621459.2017.1411269>.
Maintained by Alexej Gossmann. Last updated 2 years ago.
9.1 match 4 stars 4.93 score 43 scriptsdoi-usgs
hydroloom:Utilities to Weave Hydrologic Fabrics
A collection of utilities that support creation of network attributes for hydrologic networks. Methods and algorithms implemented are documented in Moore et al. (2019) <doi:10.3133/ofr20191096>), Cormen and Leiserson (2022) <ISBN:9780262046305> and Verdin and Verdin (1999) <doi:10.1016/S0022-1694(99)00011-6>.
Maintained by David Blodgett. Last updated 2 months ago.
5.3 match 28 stars 8.53 score 19 scripts 6 dependentsbioc
seqsetvis:Set Based Visualizations for Next-Gen Sequencing Data
seqsetvis enables the visualization and analysis of sets of genomic sites in next gen sequencing data. Although seqsetvis was designed for the comparison of mulitple ChIP-seq samples, this package is domain-agnostic and allows the processing of multiple genomic coordinate files (bed-like files) and signal files (bigwig files pileups from bam file). seqsetvis has multiple functions for fetching data from regions into a tidy format for analysis in data.table or tidyverse and visualization via ggplot2.
Maintained by Joseph R Boyd. Last updated 3 months ago.
softwarechipseqmultiplecomparisonsequencingvisualization
7.3 match 5.82 score 82 scriptschristopherkenny
name:Tools for Working with Names
A system for organizing column names in data. Aimed at supporting a prefix-based and suffix-based column naming scheme. Extends 'dplyr' functionality to add ordering by function and more explicit renaming.
Maintained by Christopher T. Kenny. Last updated 3 years ago.
6.8 match 2 stars 6.28 score 19k scriptsmartin3141
spant:MR Spectroscopy Analysis Tools
Tools for reading, visualising and processing Magnetic Resonance Spectroscopy data. The package includes methods for spectral fitting: Wilson (2021) <DOI:10.1002/mrm.28385> and spectral alignment: Wilson (2018) <DOI:10.1002/mrm.27605>.
Maintained by Martin Wilson. Last updated 1 months ago.
brainmrimrsmrshubspectroscopyfortran
4.9 match 25 stars 8.52 score 81 scriptsdata-cleaning
validate:Data Validation Infrastructure
Declare data validation rules and data quality indicators; confront data with them and analyze or visualize the results. The package supports rules that are per-field, in-record, cross-record or cross-dataset. Rules can be automatically analyzed for rule type and connectivity. Supports checks implied by an SDMX DSD file as well. See also Van der Loo and De Jonge (2018) <doi:10.1002/9781118897126>, Chapter 6 and the JSS paper (2021) <doi:10.18637/jss.v097.i10>.
Maintained by Mark van der Loo. Last updated 13 days ago.
3.3 match 418 stars 12.50 score 448 scripts 9 dependentsopengeos
whitebox:'WhiteboxTools' R Frontend
An R frontend for the 'WhiteboxTools' library, which is an advanced geospatial data analysis platform developed by Prof. John Lindsay at the University of Guelph's Geomorphometry and Hydrogeomatics Research Group. 'WhiteboxTools' can be used to perform common geographical information systems (GIS) analysis operations, such as cost-distance analysis, distance buffering, and raster reclassification. Remote sensing and image processing tasks include image enhancement (e.g. panchromatic sharpening, contrast adjustments), image mosaicing, numerous filtering operations, simple classification (k-means), and common image transformations. 'WhiteboxTools' also contains advanced tooling for spatial hydrological analysis (e.g. flow-accumulation, watershed delineation, stream network analysis, sink removal), terrain analysis (e.g. common terrain indices such as slope, curvatures, wetness index, hillshading; hypsometric analysis; multi-scale topographic position analysis), and LiDAR data processing. Suggested citation: Lindsay (2016) <doi:10.1016/j.cageo.2016.07.003>.
Maintained by Andrew Brown. Last updated 5 months ago.
geomorphometrygeoprocessinggeospatialgishydrologyremote-sensingrstudio
4.3 match 173 stars 9.65 score 203 scripts 2 dependentswinvector
wrapr:Wrap R Tools for Debugging and Parametric Programming
Tools for writing and debugging R code. Provides: '%.>%' dot-pipe (an 'S3' configurable pipe), unpack/to (R style multiple assignment/return), 'build_frame()'/'draw_frame()' ('data.frame' example tools), 'qc()' (quoting concatenate), ':=' (named map builder), 'let()' (converts non-standard evaluation interfaces to parametric standard evaluation interfaces, inspired by 'gtools::strmacro()' and 'base::bquote()'), and more.
Maintained by John Mount. Last updated 2 years ago.
3.7 match 137 stars 11.11 score 390 scripts 12 dependentsms609
TreeTools:Create, Modify and Analyse Phylogenetic Trees
Efficient implementations of functions for the creation, modification and analysis of phylogenetic trees. Applications include: generation of trees with specified shapes; tree rearrangement; analysis of tree shape; rooting of trees and extraction of subtrees; calculation and depiction of split support; plotting the position of rogue taxa (Klopfstein & Spasojevic 2019) <doi:10.1371/journal.pone.0212942>; calculation of ancestor-descendant relationships, of 'stemwardness' (Asher & Smith, 2022) <doi:10.1093/sysbio/syab072>, and of tree balance (Mir et al. 2013, Lemant et al. 2022) <doi:10.1016/j.mbs.2012.10.005>, <doi:10.1093/sysbio/syac027>; artificial extinction (Asher & Smith, 2022) <doi:10.1093/sysbio/syab072>; import and export of trees from Newick, Nexus (Maddison et al. 1997) <doi:10.1093/sysbio/46.4.590>, and TNT <https://www.lillo.org.ar/phylogeny/tnt/> formats; and analysis of splits and cladistic information.
Maintained by Martin R. Smith. Last updated 1 months ago.
evolutionary-biologyphylogenetic-treesphylogeneticscpp
4.1 match 21 stars 9.92 score 124 scripts 10 dependentseriqande
rubias:Bayesian Inference from the Conditional Genetic Stock Identification Model
Implements Bayesian inference for the conditional genetic stock identification model. It allows inference of mixed fisheries and also simulation of mixtures to predict accuracy. A full description of the underlying methods is available in a recently published article in the Canadian Journal of Fisheries and Aquatic Sciences: <doi:10.1139/cjfas-2018-0016>.
Maintained by Eric C. Anderson. Last updated 1 years ago.
6.9 match 3 stars 5.90 score 89 scriptstidy-finance
tidyfinance:Tidy Finance Helper Functions
Helper functions for empirical research in financial economics, addressing a variety of topics covered in Scheuch, Voigt, and Weiss (2023) <doi:10.1201/b23237>. The package is designed to provide shortcuts for issues extensively discussed in the book, facilitating easier application of its concepts. For more information and resources related to the book, visit <https://www.tidy-finance.org/r/index.html>.
Maintained by Christoph Scheuch. Last updated 3 months ago.
5.4 match 15 stars 7.48 score 24 scriptsjakobbossek
ecr:Evolutionary Computation in R
Framework for building evolutionary algorithms for both single- and multi-objective continuous or discrete optimization problems. A set of predefined evolutionary building blocks and operators is included. Moreover, the user can easily set up custom objective functions, operators, building blocks and representations sticking to few conventions. The package allows both a black-box approach for standard tasks (plug-and-play style) and a much more flexible white-box approach where the evolutionary cycle is written by hand.
Maintained by Jakob Bossek. Last updated 1 years ago.
combinatorial-optimizationevolutionary-algorithmevolutionary-algorithmsevolutionary-strategygenetic-algorithm-frameworkmetaheuristicsmulti-objective-optimizationoptimizationoptimization-frameworkcpp
5.5 match 43 stars 7.36 score 89 scripts 2 dependentsbioc
scifer:Scifer: Single-Cell Immunoglobulin Filtering of Sanger Sequences
Have you ever index sorted cells in a 96 or 384-well plate and then sequenced using Sanger sequencing? If so, you probably had some struggles to either check the electropherogram of each cell sequenced manually, or when you tried to identify which cell was sorted where after sequencing the plate. Scifer was developed to solve this issue by performing basic quality control of Sanger sequences and merging flow cytometry data from probed single-cell sorted B cells with sequencing data. scifer can export summary tables, 'fasta' files, electropherograms for visual inspection, and generate reports.
Maintained by Rodrigo Arcoverde Cerveira. Last updated 3 months ago.
preprocessingqualitycontrolsangerseqsequencingsoftwareflowcytometrysinglecell
7.1 match 5 stars 5.54 score 9 scriptslarmarange
ggstats:Extension to 'ggplot2' for Plotting Stats
Provides new statistics, new geometries and new positions for 'ggplot2' and a suite of functions to facilitate the creation of statistical plots.
Maintained by Joseph Larmarange. Last updated 7 days ago.
3.0 match 37 stars 13.08 score 190 scripts 156 dependentsbiometry
bipartite:Visualising Bipartite Networks and Calculating Some (Ecological) Indices
Functions to visualise webs and calculate a series of indices commonly used to describe pattern in (ecological) webs. It focuses on webs consisting of only two levels (bipartite), e.g. pollination webs or predator-prey-webs. Visualisation is important to get an idea of what we are actually looking at, while the indices summarise different aspects of the web's topology.
Maintained by Carsten F. Dormann. Last updated 8 days ago.
3.6 match 37 stars 10.93 score 592 scripts 15 dependentslrberge
stringmagic:Character String Operations and Interpolation, Magic Edition
Performs complex string operations compactly and efficiently. Supports string interpolation jointly with over 50 string operations. Also enhances regular string functions (like grep() and co). See an introduction at <https://lrberge.github.io/stringmagic/>.
Maintained by Laurent R Berge. Last updated 7 months ago.
3.7 match 15 stars 10.56 score 37 scripts 33 dependentsshackett
romic:R for High-Dimensional Omic Data
Represents high-dimensional data as tables of features, samples and measurements, and a design list for tracking the meaning of individual variables. Using this format, filtering, normalization, and other transformations of a dataset can be carried out in a flexible manner. 'romic' takes advantage of these transformations to create interactive 'shiny' apps for exploratory data analysis such as an interactive heatmap.
Maintained by Sean Hackett. Last updated 1 years ago.
14.4 match 1 stars 2.70 score 10 scriptsbioc
InteractionSet:Base Classes for Storing Genomic Interaction Data
Provides the GInteractions, InteractionSet and ContactMatrix objects and associated methods for storing and manipulating genomic interaction data from Hi-C and ChIA-PET experiments.
Maintained by Aaron Lun. Last updated 5 months ago.
infrastructuredatarepresentationsoftwarehiccpp
4.8 match 7.97 score 250 scripts 36 dependentstherneau
survival:Survival Analysis
Contains the core survival analysis routines, including definition of Surv objects, Kaplan-Meier and Aalen-Johansen (multi-state) curves, Cox models, and parametric accelerated failure time models.
Maintained by Terry M Therneau. Last updated 3 months ago.
1.9 match 400 stars 20.43 score 29k scripts 3.9k dependentsmsberends
plot2:A Plotting Assistant for Fast 'ggplot2' Visualisations
A streamlined extension of 'ggplot2' designed to simplify and accelerate the creation of data visualisations. 'plot2' automates common tasks such as axis handling, plot type selection, and data transformation, allowing users to create complex, publication-ready plots with minimal code. It integrates seamlessly with the tidyverse and retains full compatibility with 'ggplot2', while offering additional conveniences like enhanced sorting, faceting, and custom theming.
Maintained by Matthijs S. Berends. Last updated 9 days ago.
ggplot2helperplottingtidyverse
8.8 match 1 stars 4.32 score 8 scripts 1 dependentsjmw86069
jamba:Just Analysis Methods Base
Just analysis methods ('jam') base functions focused on bioinformatics. Version- and gene-centric alphanumeric sort, unique name and version assignment, colorized console and 'HTML' output, color ramp and palette manipulation, 'Rmarkdown' cache import, styled 'Excel' worksheet import and export, interpolated raster output from smooth scatter and image plots, list to delimited vector, efficient list tools.
Maintained by James M. Ward. Last updated 10 days ago.
7.0 match 6 stars 5.43 scorepecanproject
PEcAn.assim.batch:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PECAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by Istem Fer. Last updated 3 days ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
3.8 match 216 stars 9.94 score 20 scripts 2 dependentsevolutionary-optimization-laboratory
rmoo:Multi-Objective Optimization in R
The 'rmoo' package is a framework for multi- and many-objective optimization, which allows researchers and users versatility in parameter configuration, as well as tools for analysis, replication and visualization of results. The 'rmoo' package was built as a fork of the 'GA' package by Luca Scrucca(2017) <DOI:10.32614/RJ-2017-008> and implementing the Non-Dominated Sorting Genetic Algorithms proposed by K. Deb's.
Maintained by Francisco Benitez. Last updated 5 months ago.
metaheuristicsmultiobjectivemultiobjective-optimizationnsgansga2nsga3optimizationpareto-front
7.5 match 30 stars 5.01 score 23 scriptssbgraves237
sos:Search Contributed R Packages, Sort by Package
Search contributed R packages, sort by package.
Maintained by Spencer Graves. Last updated 9 months ago.
5.5 match 2 stars 6.82 score 241 scripts 3 dependentsguokai8
fctutils:Advanced Factor Manipulation Utilities
Provides a collection of utility functions for manipulating and analyzing factor vectors in R. It offers tools for filtering, splitting, combining, and reordering factor levels based on various criteria. The package is designed to simplify common tasks in categorical data analysis, making it easier to work with factors in a flexible and efficient manner.
Maintained by Kai Guo. Last updated 5 months ago.
8.1 match 2 stars 4.60 score 4 scriptsleonawicz
tabr:Music Notation Syntax, Manipulation, Analysis and Transcription in R
Provides a music notation syntax and a collection of music programming functions for generating, manipulating, organizing, and analyzing musical information in R. Music syntax can be entered directly in character strings, for example to quickly transcribe short pieces of music. The package contains functions for directly performing various mathematical, logical and organizational operations and musical transformations on special object classes that facilitate working with music data and notation. The same music data can be organized in tidy data frames for a familiar and powerful approach to the analysis of large amounts of structured music data. Functions are available for mapping seamlessly between these formats and their representations of musical information. The package also provides an API to 'LilyPond' (<https://lilypond.org/>) for transcribing musical representations in R into tablature ("tabs") and sheet music. 'LilyPond' is open source music engraving software for generating high quality sheet music based on markup syntax. The package generates 'LilyPond' files from R code and can pass them to the 'LilyPond' command line interface to be rendered into sheet music PDF files or inserted into R markdown documents. The package offers nominal MIDI file output support in conjunction with rendering sheet music. The package can read MIDI files and attempts to structure the MIDI data to integrate as best as possible with the data structures and functionality found throughout the package.
Maintained by Matthew Leonawicz. Last updated 6 months ago.
guitar-tablaturelilypondlilypond-apimusic-analysismusic-datamusic-notationmusic-programmingmusic-syntaxmusic-transcriptionsheet-music
4.8 match 132 stars 7.87 score 94 scriptsbioc
VariantFiltering:Filtering of coding and non-coding genetic variants
Filter genetic variants using different criteria such as inheritance model, amino acid change consequence, minor allele frequencies across human populations, splice site strength, conservation, etc.
Maintained by Robert Castelo. Last updated 2 months ago.
geneticshomo_sapiensannotationsnpsequencinghighthroughputsequencing
6.0 match 4 stars 6.23 score 21 scriptslbb220
GWmodel:Geographically-Weighted Models
Techniques from a particular branch of spatial statistics,termed geographically-weighted (GW) models. GW models suit situations when data are not described well by some global model, but where there are spatial regions where a suitably localised calibration provides a better description. 'GWmodel' includes functions to calibrate: GW summary statistics (Brunsdon et al., 2002)<doi: 10.1016/s0198-9715(01)00009-6>, GW principal components analysis (Harris et al., 2011)<doi: 10.1080/13658816.2011.554838>, GW discriminant analysis (Brunsdon et al., 2007)<doi: 10.1111/j.1538-4632.2007.00709.x> and various forms of GW regression (Brunsdon et al., 1996)<doi: 10.1111/j.1538-4632.1996.tb00936.x>; some of which are provided in basic and robust (outlier resistant) forms.
Maintained by Binbin Lu. Last updated 6 months ago.
5.7 match 18 stars 6.38 score 266 scripts 4 dependentscalvagone
campsismod:Generic Implementation of a PK/PD Model
A generic, easy-to-use and expandable implementation of a pharmacokinetic (PK) / pharmacodynamic (PD) model based on the S4 class system. This package allows the user to read/write a pharmacometric model from/to files and adapt it further on the fly in the R environment. For this purpose, this package provides an intuitive API to add, modify or delete equations, ordinary differential equations (ODE's), model parameters or compartment properties (like infusion duration or rate, bioavailability and initial values). Finally, this package also provides a useful export of the model for use with simulation packages 'rxode2' and 'mrgsolve'. This package is designed and intended to be used with package 'campsis', a PK/PD simulation platform built on top of 'rxode2' and 'mrgsolve'.
Maintained by Nicolas Luyckx. Last updated 12 hours ago.
5.3 match 5 stars 6.82 score 42 scripts 1 dependentsajrgodfrey
BrailleR:Improved Access for Blind Users
Blind users do not have access to the graphical output from R without printing the content of graphics windows to an embosser of some kind. This is not as immediate as is required for efficient access to statistical output. The functions here are created so that blind people can make even better use of R. This includes the text descriptions of graphs, convenience functions to replace the functionality offered in many GUI front ends, and experimental functionality for optimising graphical content to prepare it for embossing as tactile images.
Maintained by A. Jonathan R. Godfrey. Last updated 11 months ago.
4.0 match 123 stars 8.90 score 143 scriptstguillerme
dispRity:Measuring Disparity
A modular package for measuring disparity (multidimensional space occupancy). Disparity can be calculated from any matrix defining a multidimensional space. The package provides a set of implemented metrics to measure properties of the space and allows users to provide and test their own metrics. The package also provides functions for looking at disparity in a serial way (e.g. disparity through time) or per groups as well as visualising the results. Finally, this package provides several statistical tests for disparity analysis.
Maintained by Thomas Guillerme. Last updated 3 days ago.
disparityecologymultidimensionalitypalaeobiology
4.1 match 26 stars 8.69 score 220 scripts 1 dependentszarquon42b
Morpho:Calculations and Visualisations Related to Geometric Morphometrics
A toolset for Geometric Morphometrics and mesh processing. This includes (among other stuff) mesh deformations based on reference points, permutation tests, detection of outliers, processing of sliding semi-landmarks and semi-automated surface landmark placement.
Maintained by Stefan Schlager. Last updated 5 months ago.
3.5 match 51 stars 10.00 score 218 scripts 13 dependentsbioc
MACSQuantifyR:Fast treatment of MACSQuantify FACS data
Automatically process the metadata of MACSQuantify FACS sorter. It runs multiple modules: i) imports of raw file and graphical selection of duplicates in well plate, ii) computes statistics on data and iii) can compute combination index.
Maintained by Raphaël Bonnet. Last updated 5 months ago.
dataimportpreprocessingnormalizationflowcytometrydatarepresentationgui
7.8 match 4.48 score 3 scriptstidymodels
recipes:Preprocessing and Feature Engineering Steps for Modeling
A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps provides preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.
Maintained by Max Kuhn. Last updated 7 days ago.
1.9 match 584 stars 18.71 score 7.2k scripts 380 dependentszdebruine
RcppML:Rcpp Machine Learning Library
Fast machine learning algorithms including matrix factorization and divisive clustering for large sparse and dense matrices.
Maintained by Zach DeBruine. Last updated 2 years ago.
clusteringmatrix-factorizationnmfrcpprcppeigensparse-matrixcppopenmp
3.3 match 104 stars 10.53 score 125 scripts 46 dependentsgrunwaldlab
metacoder:Tools for Parsing, Manipulating, and Graphing Taxonomic Abundance Data
Reads, plots, and manipulates large taxonomic data sets, like those generated from modern high-throughput sequencing, such as metabarcoding (i.e. amplification metagenomics, 16S metagenomics, etc). It provides a tree-based visualization called "heat trees" used to depict statistics for every taxon in a taxonomy using color and size. It also provides various functions to do common tasks in microbiome bioinformatics on data in the 'taxmap' format defined by the 'taxa' package. The 'metacoder' package is described in the publication by Foster et al. (2017) <doi:10.1371/journal.pcbi.1005404>.
Maintained by Zachary Foster. Last updated 1 months ago.
community-diversityhierarchicalmetabarcodingpcrtaxonomytreescpp
3.6 match 140 stars 9.64 score 328 scriptsdankelley
oce:Analysis of Oceanographic Data
Supports the analysis of Oceanographic data, including 'ADCP' measurements, measurements made with 'argo' floats, 'CTD' measurements, sectional data, sea-level time series, coastline and topographic data, etc. Provides specialized functions for calculating seawater properties such as potential temperature in either the 'UNESCO' or 'TEOS-10' equation of state. Produces graphical displays that conform to the conventions of the Oceanographic literature. This package is discussed extensively by Kelley (2018) "Oceanographic Analysis with R" <doi:10.1007/978-1-4939-8844-0>.
Maintained by Dan Kelley. Last updated 2 days ago.
2.3 match 146 stars 15.42 score 4.2k scripts 18 dependentshrbrmstr
vegalite:Tools to Encode Visualizations with the 'Grammar of Graphics'-Like 'Vega-Lite' 'Spec'
The 'Vega-Lite' 'JavaScript' framework provides a higher-level grammar for visual analysis, akin to 'ggplot' or 'Tableau', that generates complete 'Vega' specifications. Functions exist which enable building a valid 'spec' from scratch or importing a previously created 'spec' file. Functions also exist to export 'spec' files and to generate code which will enable plots to be embedded in properly configured web pages. The default behavior is to generate an 'htmlwidget'.
Maintained by Bob Rudis. Last updated 7 years ago.
data-visualizationdatavisualizationvega-litevega-lite-specvisualizationwidget
4.5 match 158 stars 7.60 score 84 scriptsbioc
methylKit:DNA methylation analysis from high-throughput bisulfite sequencing results
methylKit is an R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. The package is designed to deal with sequencing data from RRBS and its variants, but also target-capture methods and whole genome bisulfite sequencing. It also has functions to analyze base-pair resolution 5hmC data from experimental protocols such as oxBS-Seq and TAB-Seq. Methylation calling can be performed directly from Bismark aligned BAM files.
Maintained by Altuna Akalin. Last updated 17 days ago.
dnamethylationsequencingmethylseqgenome-biologymethylationstatistical-analysisvisualizationcurlbzip2xz-utilszlibcpp
2.9 match 220 stars 11.80 score 578 scripts 3 dependentstingtingzhan
fmx:Finite Mixture Parametrization
A parametrization framework for finite mixture distribution using S4 objects. Density, cumulative density, quantile and simulation functions are defined. Currently normal, Tukey g-&-h, skew-normal and skew-t distributions are well tested. The gamma, negative binomial distributions are being tested.
Maintained by Tingting Zhan. Last updated 2 days ago.
10.8 match 3.18 score 7 scripts 1 dependentskos59125
naturalsort:Natural Ordering
Provides functions related to human natural ordering. It handles adjacent digits in a character sequence as a number so that natural sort function arranges a character vector by their numbers, not digit characters. It is typically seen when operating systems lists file names. For example, a sequence a-1.png, a-2.png, a-10.png looks naturally ordered because 1 < 2 < 10 and natural sort algorithm arranges so whereas general sort algorithms arrange it into a-1.png, a-10.png, a-2.png owing to their third and fourth characters.
Maintained by Kosei Abe. Last updated 9 years ago.
4.9 match 9 stars 6.86 score 201 scripts 19 dependentsdavid-barnett
microViz:Microbiome Data Analysis and Visualization
Microbiome data visualization and statistics tools built upon phyloseq.
Maintained by David Barnett. Last updated 3 months ago.
microbiomemicrobiome-analysismicrobiota
5.3 match 114 stars 6.22 score 480 scriptstidyverse
forcats:Tools for Working with Categorical Variables (Factors)
Helpers for reordering factor levels (including moving specified levels to front, ordering by first appearance, reversing, and randomly shuffling), and tools for modifying factor levels (including collapsing rare levels into other, 'anonymising', and manually 'recoding').
Maintained by Hadley Wickham. Last updated 1 years ago.
1.7 match 555 stars 18.77 score 21k scripts 1.2k dependentsddsjoberg
gtsummary:Presentation-Ready Data Summary and Analytic Result Tables
Creates presentation-ready tables summarizing data sets, regression models, and more. The code to create the tables is concise and highly customizable. Data frames can be summarized with any function, e.g. mean(), median(), even user-written functions. Regression models are summarized and include the reference rows for categorical variables. Common regression models, such as logistic regression and Cox proportional hazards regression, are automatically identified and the tables are pre-filled with appropriate column headers.
Maintained by Daniel D. Sjoberg. Last updated 3 days ago.
easy-to-usegthtml5regression-modelsreproducibilityreproducible-researchstatisticssummary-statisticssummary-tablestable1tableone
1.9 match 1.1k stars 17.00 score 8.2k scripts 15 dependentsharrelfe
Hmisc:Harrell Miscellaneous
Contains many functions useful for data analysis, high-level graphics, utility operations, functions for computing sample size and power, simulation, importing and annotating datasets, imputing missing values, advanced table making, variable clustering, character string manipulation, conversion of R objects to LaTeX and html code, recoding variables, caching, simplified parallel computing, encrypting and decrypting data using a safe workflow, general moving window statistical estimation, and assistance in interpreting principal component analysis.
Maintained by Frank E Harrell Jr. Last updated 1 days ago.
1.8 match 210 stars 17.61 score 17k scripts 750 dependentsstatnet
statnet.common:Common R Scripts and Utilities Used by the Statnet Project Software
Non-statistical utilities used by the software developed by the Statnet Project. They may also be of use to others.
Maintained by Pavel N. Krivitsky. Last updated 28 days ago.
2.7 match 8 stars 11.42 score 197 scripts 148 dependentsbioc
GenomicTuples:Representation and Manipulation of Genomic Tuples
GenomicTuples defines general purpose containers for storing genomic tuples. It aims to provide functionality for tuples of genomic co-ordinates that are analogous to those available for genomic ranges in the GenomicRanges Bioconductor package.
Maintained by Peter Hickey. Last updated 4 months ago.
infrastructuredatarepresentationsequencingcpp
5.6 match 4 stars 5.48 score 7 scriptsropensci
RefManageR:Straightforward 'BibTeX' and 'BibLaTeX' Bibliography Management
Provides tools for importing and working with bibliographic references. It greatly enhances the 'bibentry' class by providing a class 'BibEntry' which stores 'BibTeX' and 'BibLaTeX' references, supports 'UTF-8' encoding, and can be easily searched by any field, by date ranges, and by various formats for name lists (author by last names, translator by full names, etc.). Entries can be updated, combined, sorted, printed in a number of styles, and exported. 'BibTeX' and 'BibLaTeX' '.bib' files can be read into 'R' and converted to 'BibEntry' objects. Interfaces to 'NCBI Entrez', 'CrossRef', and 'Zotero' are provided for importing references and references can be created from locally stored 'PDF' files using 'Poppler'. Includes functions for citing and generating a bibliography with hyperlinks for documents prepared with 'RMarkdown' or 'RHTML'.
Maintained by Mathew W. McLean. Last updated 4 months ago.
2.5 match 115 stars 12.06 score 2.3k scripts 16 dependentsropensci
git2rdata:Store and Retrieve Data.frames in a Git Repository
The git2rdata package is an R package for writing and reading dataframes as plain text files. A metadata file stores important information. 1) Storing metadata allows to maintain the classes of variables. By default, git2rdata optimizes the data for file storage. The optimization is most effective on data containing factors. The optimization makes the data less human readable. The user can turn this off when they prefer a human readable format over smaller files. Details on the implementation are available in vignette("plain_text", package = "git2rdata"). 2) Storing metadata also allows smaller row based diffs between two consecutive commits. This is a useful feature when storing data as plain text files under version control. Details on this part of the implementation are available in vignette("version_control", package = "git2rdata"). Although we envisioned git2rdata with a git workflow in mind, you can use it in combination with other version control systems like subversion or mercurial. 3) git2rdata is a useful tool in a reproducible and traceable workflow. vignette("workflow", package = "git2rdata") gives a toy example. 4) vignette("efficiency", package = "git2rdata") provides some insight into the efficiency of file storage, git repository size and speed for writing and reading.
Maintained by Thierry Onkelinx. Last updated 2 months ago.
reproducible-researchversion-control
3.0 match 99 stars 10.03 score 216 scripts 4 dependentsr-lib
lintr:A 'Linter' for R Code
Checks adherence to a given style, syntax errors and possible semantic issues. Supports on the fly checking of R code edited with 'RStudio IDE', 'Emacs', 'Vim', 'Sublime Text', 'Atom' and 'Visual Studio Code'.
Maintained by Michael Chirico. Last updated 6 hours ago.
1.8 match 1.2k stars 17.00 score 916 scripts 33 dependentsmayoverse
arsenal:An Arsenal of 'R' Functions for Large-Scale Statistical Summaries
An Arsenal of 'R' functions for large-scale statistical summaries, which are streamlined to work within the latest reporting tools in 'R' and 'RStudio' and which use formulas and versatile summary statistics for summary tables and models. The primary functions include tableby(), a Table-1-like summary of multiple variable types 'by' the levels of one or more categorical variables; paired(), a Table-1-like summary of multiple variable types paired across two time points; modelsum(), which performs simple model fits on one or more endpoints for many variables (univariate or adjusted for covariates); freqlist(), a powerful frequency table across many categorical variables; comparedf(), a function for comparing data.frames; and write2(), a function to output tables to a document.
Maintained by Ethan Heinzen. Last updated 7 months ago.
baseline-characteristicsdescriptive-statisticsmodelingpaired-comparisonsreportingstatisticstableone
2.2 match 225 stars 13.45 score 1.2k scripts 16 dependentseasystats
parameters:Processing of Model Parameters
Utilities for processing the parameters of various statistical models. Beyond computing p values, CIs, and other indices for a wide variety of models (see list of supported models using the function 'insight::supported_models()'), this package implements features like bootstrapping or simulating of parameters and models, feature reduction (feature extraction and variable selection) as well as functions to describe data and variable characteristics (e.g. skewness, kurtosis, smoothness or distribution).
Maintained by Daniel Lüdecke. Last updated 3 days ago.
betabootstrapciconfidence-intervalsdata-reductioneasystatsfafeature-extractionfeature-reductionhacktoberfestparameterspcapvaluesregression-modelsrobust-statisticsstandardizestandardized-estimatesstatistical-models
1.9 match 453 stars 15.65 score 1.8k scripts 56 dependentsdavidnsousa
qsort:Scoring Q-Sort Data
Computes scores from Q-sort data, using criteria sorts and derived scales from subsets of items. The 'qsort' package includes descriptions and scoring procedures for four different Q-sets commonly used in developmental psychology research: Attachment Q-set (version 3.0) (Waters, 1995, <doi:10.1111/j.1540-5834.1995.tb00214.x>); California Child Q-set (Block and Block, 1969, <doi:10.1037/0012-1649.21.3.508>); Maternal Behaviour Q-set (version 3.1) (Pederson et al., 1999, <https://ir.lib.uwo.ca/cgi/viewcontent.cgi?article=1000&context=psychologypub>); Preschool Q-set (Baumrind, 1968 revised by Wanda Bronson, <doi:10.1111/j.1540-5834.1995.tb00214.x>).
Maintained by David N Sousa. Last updated 6 years ago.
10.6 match 2.70 score 8 scriptsalan9956
nsga2R:Elitist Non-Dominated Sorting Genetic Algorithm
Box-constrained multiobjective optimization using the elitist non-dominated sorting genetic algorithm - NSGA-II. Fast non-dominated sorting, crowding distance, tournament selection, simulated binary crossover, and polynomial mutation are called in the main program. The methods are described in Deb et al. (2002) <doi:10.1109/4235.996017>.
Maintained by Ming-Chang (Alan) Lee. Last updated 3 years ago.
9.0 match 3.12 score 29 scripts 5 dependentsbioc
STRINGdb:STRINGdb - Protein-Protein Interaction Networks and Functional Enrichment Analysis
The STRINGdb package provides a R interface to the STRING protein-protein interactions database (https://string-db.org).
Maintained by Damian Szklarczyk. Last updated 5 months ago.
3.4 match 8.08 score 344 scripts 9 dependentskcuilla
reactablefmtr:Streamlined Table Styling and Formatting for Reactable
Provides various features to streamline and enhance the styling of interactive reactable tables with easy-to-use and highly-customizable functions and themes. Apply conditional formatting to cells with data bars, color scales, color tiles, and icon sets. Utilize custom table themes inspired by popular websites such and bootstrap themes. Apply sparkline line & bar charts (note this feature requires the 'dataui' package which can be downloaded from <https://github.com/timelyportfolio/dataui>). Increase the portability and reproducibility of reactable tables by embedding images from the web directly into cells. Save the final table output as a static image or interactive file.
Maintained by Kyle Cuilla. Last updated 2 years ago.
customizationdata-visualizationeasy-to-usereproducibletables
3.1 match 209 stars 8.79 score 460 scripts 4 dependentsspkaluzny
splusTimeDate:Times and Dates from 'S-PLUS'
A collection of classes and methods for working with times and dates. The code was originally available in 'S-PLUS'.
Maintained by Stephen Kaluzny. Last updated 2 months ago.
5.6 match 4.94 score 58 scripts 2 dependentsdbosak01
common:Solutions for Common Problems in Base R
Contains functions for solving commonly encountered problems while programming in R. This package is intended to provide a lightweight supplement to Base R, and will be useful for almost any R user.
Maintained by David Bosak. Last updated 12 months ago.
3.3 match 6 stars 8.21 score 193 scripts 11 dependentsspatstat
spatstat.geom:Geometrical Functionality of the 'spatstat' Family
Defines spatial data types and supports geometrical operations on them. Data types include point patterns, windows (domains), pixel images, line segment patterns, tessellations and hyperframes. Capabilities include creation and manipulation of data (using command line or graphical interaction), plotting, geometrical operations (rotation, shift, rescale, affine transformation), convex hull, discretisation and pixellation, Dirichlet tessellation, Delaunay triangulation, pairwise distances, nearest-neighbour distances, distance transform, morphological operations (erosion, dilation, closing, opening), quadrat counting, geometrical measurement, geometrical covariance, colour maps, calculus on spatial domains, Gaussian blur, level sets of images, transects of images, intersections between objects, minimum distance matching. (Excludes spatial data on a network, which are supported by the package 'spatstat.linnet'.)
Maintained by Adrian Baddeley. Last updated 15 hours ago.
classes-and-objectsdistance-calculationgeometrygeometry-processingimagesmensurationplottingpoint-patternsspatial-dataspatial-data-analysis
2.3 match 7 stars 12.11 score 241 scripts 227 dependentshzhanghenry
RCircos:Circos 2D Track Plot
A simple and flexible way to generate Circos 2D track plot images for genomic data visualization is implemented in this package. The types of plots include: heatmap, histogram, lines, scatterplot, tiles and plot items for further decorations include connector, link (lines and ribbons), and text (gene) label. All functions require only R graphics package that comes with R base installation.
Maintained by Hongen Zhang. Last updated 3 years ago.
3.8 match 6 stars 7.21 score 298 scripts 3 dependentshojsgaard
doBy:Groupwise Statistics, LSmeans, Linear Estimates, Utilities
Utility package containing: 1) Facilities for working with grouped data: 'do' something to data stratified 'by' some variables. 2) LSmeans (least-squares means), general linear estimates. 3) Restrict functions to a smaller domain. 4) Miscellaneous other utilities.
Maintained by Søren Højsgaard. Last updated 5 days ago.
1.8 match 1 stars 14.94 score 3.2k scripts 939 dependentspoissonconsulting
chk:Check User-Supplied Function Arguments
For developers to check user-supplied function arguments. It is designed to be simple, fast and customizable. Error messages follow the tidyverse style guide.
Maintained by Joe Thorley. Last updated 2 months ago.
2.3 match 48 stars 11.89 score 22 scripts 95 dependentsskoestlmeier
monotonicity:Test for Monotonicity in Expected Asset Returns, Sorted by Portfolios
Test for monotonicity in financial variables sorted by portfolios. It is conventional practice in empirical research to form portfolios of assets ranked by a certain sort variable. A t-test is then used to consider the mean return spread between the portfolios with the highest and lowest values of the sort variable. Yet comparing only the average returns on the top and bottom portfolios does not provide a sufficient way to test for a monotonic relation between expected returns and the sort variable. This package provides nonparametric tests for the full set of monotonic patterns by Patton, A. and Timmermann, A. (2010) <doi:10.1016/j.jfineco.2010.06.006> and compares the proposed results with extant alternatives such as t-tests, Bonferroni bounds, and multivariate inequality tests through empirical applications and simulations.
Maintained by Siegfried Köstlmeier. Last updated 3 years ago.
monotonicityportfolio-analysis
7.1 match 10 stars 3.70 score 10 scriptspolmine
polmineR:Verbs and Nouns for Corpus Analysis
Package for corpus analysis using the Corpus Workbench ('CWB', <https://cwb.sourceforge.io>) as an efficient back end for indexing and querying large corpora. The package offers functionality to flexibly create subcorpora and to carry out basic statistical operations (count, co-occurrences etc.). The original full text of documents can be reconstructed and inspected at any time. Beyond that, the package is intended to serve as an interface to packages implementing advanced statistical procedures. Respective data structures (document-term matrices, term-co-occurrence matrices etc.) can be created based on the indexed corpora.
Maintained by Andreas Blaette. Last updated 1 years ago.
3.3 match 49 stars 7.96 score 311 scriptsimpaug
UpAndDownPlots:Displays Percentage and Absolute Changes
Displays percentage changes by height and absolute changes by area for up to three nested or non-nested levels. The plots visualise changes in indices and markets, showing how the changes for sectors or for individual components contribute to the overall change. Data can be classified by up to three levels of grouping variables in a layered, hierarchical plot. Each level can be ordered in several ways including by baseline, by percentage change, and by absolute change. The vignettes give examples.
Maintained by Antony Unwin. Last updated 12 months ago.
8.0 match 3.30 score 6 scriptsdmurdoch
rgl:3D Visualization Using OpenGL
Provides medium to high level functions for 3D interactive graphics, including functions modelled on base graphics (plot3d(), etc.) as well as functions for constructing representations of geometric objects (cube3d(), etc.). Output may be on screen using OpenGL, or to various standard 3D file formats including WebGL, PLY, OBJ, STL as well as 2D image formats, including PNG, Postscript, SVG, PGF.
Maintained by Duncan Murdoch. Last updated 2 months ago.
graphicsopenglrglwebgllibglulibglvndlibpnglibx11freetypecpp
1.5 match 91 stars 17.49 score 7.3k scripts 300 dependentsludvigolsen
rearrr:Rearranging Data
Arrange data by a set of methods. Use rearrangers to reorder data points and mutators to change their values. From basic utilities, to centering the greatest value, to swirling in 3-dimensional space, 'rearrr' enables creativity when plotting and experimenting with data.
Maintained by Ludvig Renbo Olsen. Last updated 11 days ago.
arrangeclusterexpandforminggenerateggplot2orderplotting-in-rrollrotateshapingswirltransformations
3.6 match 24 stars 7.26 score 128 scripts 8 dependentsbranchlab
metasnf:Meta Clustering with Similarity Network Fusion
Framework to facilitate patient subtyping with similarity network fusion and meta clustering. The similarity network fusion (SNF) algorithm was introduced by Wang et al. (2014) in <doi:10.1038/nmeth.2810>. SNF is a data integration approach that can transform high-dimensional and diverse data types into a single similarity network suitable for clustering with minimal loss of information from each initial data source. The meta clustering approach was introduced by Caruana et al. (2006) in <doi:10.1109/ICDM.2006.103>. Meta clustering involves generating a wide range of cluster solutions by adjusting clustering hyperparameters, then clustering the solutions themselves into a manageable number of qualitatively similar solutions, and finally characterizing representative solutions to find ones that are best for the user's specific context. This package provides a framework to easily transform multi-modal data into a wide range of similarity network fusion-derived cluster solutions as well as to visualize, characterize, and validate those solutions. Core package functionality includes easy customization of distance metrics, clustering algorithms, and SNF hyperparameters to generate diverse clustering solutions; calculation and plotting of associations between features, between patients, and between cluster solutions; and standard cluster validation approaches including resampled measures of cluster stability, standard metrics of cluster quality, and label propagation to evaluate generalizability in unseen data. Associated vignettes guide the user through using the package to identify patient subtypes while adhering to best practices for unsupervised learning.
Maintained by Prashanth S Velayudhan. Last updated 6 days ago.
bioinformaticsclusteringmetaclusteringsnf
3.2 match 8 stars 8.21 score 30 scriptsjmbarbone
mark:Miscellaneous, Analytic R Kernels
Miscellaneous functions and wrappers for development in other packages created, maintained by Jordan Mark Barbone.
Maintained by Jordan Mark Barbone. Last updated 1 months ago.
5.3 match 6 stars 4.95 score 9 scriptsbioc
podkat:Position-Dependent Kernel Association Test
This package provides an association test that is capable of dealing with very rare and even private variants. This is accomplished by a kernel-based approach that takes the positions of the variants into account. The test can be used for pre-processed matrix data, but also directly for variant data stored in VCF files. Association testing can be performed whole-genome, whole-exome, or restricted to pre-defined regions of interest. The test is complemented by tools for analyzing and visualizing the results.
Maintained by Ulrich Bodenhofer. Last updated 5 months ago.
geneticswholegenomeannotationvariantannotationsequencingdataimportcurlbzip2xz-utilszlibcpp
5.2 match 5.02 score 6 scriptsmb4511
DTwrappers:Simplified Data Analysis with Wrapper Functions for the 'Data.Table' Package
Provides functionality for users who are learning R or the techniques of data analysis. Written as a collection of wrapper functions, the 'DTwrapper' package facilitates many core operations of data processing. This is achieved with relatively few requirements about the order of the processing steps or knowledge of specialized syntax. 'DTwrappers' creates coding results along with translations to data.table's code. This enables users to benefit from the speed and efficiency of data.table's calculations. Furthermore, the package also provides the translated code for educational purposes so that users can review working examples of coding syntax and calculations.
Maintained by Mayur Bansal. Last updated 4 years ago.
9.2 match 2.78 score 9 scripts 2 dependentsips-lmu
emuR:Main Package of the EMU Speech Database Management System
Provide the EMU Speech Database Management System (EMU-SDMS) with database management, data extraction, data preparation and data visualization facilities. See <https://ips-lmu.github.io/The-EMU-SDMS-Manual/> for more details.
Maintained by Markus Jochim. Last updated 1 years ago.
3.7 match 24 stars 6.89 score 135 scripts 1 dependentssherrillmix
taxonomizr:Functions to Work with NCBI Accessions and Taxonomy
Functions for assigning taxonomy to NCBI accession numbers and taxon IDs based on NCBI's accession2taxid and taxdump files. This package allows the user to download NCBI data dumps and create a local database for fast and local taxonomic assignment.
Maintained by Scott Sherrill-Mix. Last updated 5 days ago.
2.9 match 72 stars 8.85 score 255 scripts 2 dependentsbioc
Rsamtools:Binary alignment (BAM), FASTA, variant call (BCF), and tabix file import
This package provides an interface to the 'samtools', 'bcftools', and 'tabix' utilities for manipulating SAM (Sequence Alignment / Map), FASTA, binary variant call (BCF) and compressed indexed tab-delimited (tabix) files.
Maintained by Bioconductor Package Maintainer. Last updated 4 months ago.
dataimportsequencingcoveragealignmentqualitycontrolbioconductor-packagecore-packagecurlbzip2xz-utilszlibcpp
1.7 match 28 stars 15.42 score 3.2k scripts 566 dependentsjosherrickson
rlemon:R Access to LEMON Graph Algorithms
Allows easy access to the LEMON Graph Library set of algorithms, written in C++. See the LEMON project page at <https://lemon.cs.elte.hu/trac/lemon>. Current LEMON version is 1.3.1.
Maintained by Josh Errickson. Last updated 2 months ago.
3.6 match 8 stars 7.04 score 1 scripts 13 dependentsbioc
plyranges:A fluent interface for manipulating GenomicRanges
A dplyr-like interface for interacting with the common Bioconductor classes Ranges and GenomicRanges. By providing a grammatical and consistent way of manipulating these classes their accessiblity for new Bioconductor users is hopefully increased.
Maintained by Michael Love. Last updated 5 months ago.
infrastructuredatarepresentationworkflowstepcoveragebioconductordata-analysisdplyrgenomic-rangesgenomicstidy-data
2.0 match 143 stars 12.60 score 1.9k scripts 20 dependentsdmpe
urlshorteneR:R Wrapper for the 'Bit.ly' and 'Is.gd'/'v.gd' URL Shortening Services
Allows using two URL shortening services, which also provide expanding and analytic functions. Specifically developed for 'Bit.ly' (which requires OAuth 2.0) and 'is.gd' (no API key).
Maintained by John Malc. Last updated 29 days ago.
bitlyisgdshorten-urlsshortenershorturlurl
3.8 match 21 stars 6.70 score 53 scripts 1 dependentspik-piam
magclass:Data Class and Tools for Handling Spatial-Temporal Data
Data class for increased interoperability working with spatial-temporal data together with corresponding functions and methods (conversions, basic calculations and basic data manipulation). The class distinguishes between spatial, temporal and other dimensions to facilitate the development and interoperability of tools build for it. Additional features are name-based addressing of data and internal consistency checks (e.g. checking for the right data order in calculations).
Maintained by Jan Philipp Dietrich. Last updated 11 days ago.
2.3 match 5 stars 11.16 score 412 scripts 56 dependentsjrowen
rhandsontable:Interface to the 'Handsontable.js' Library
An R interface to the 'Handsontable' JavaScript library, which is a minimalist Excel-like data grid editor. See <https://handsontable.com/> for details.
Maintained by Jonathan Owen. Last updated 3 years ago.
handsontablehtmlwidgetsjavascriptshinysparkline
2.0 match 389 stars 12.31 score 1.0k scripts 46 dependentsbioc
pcaMethods:A collection of PCA methods
Provides Bayesian PCA, Probabilistic PCA, Nipals PCA, Inverse Non-Linear PCA and the conventional SVD PCA. A cluster based method for missing value estimation is included for comparison. BPCA, PPCA and NipalsPCA may be used to perform PCA on incomplete data as well as for accurate missing value estimation. A set of methods for printing and plotting the results is also provided. All PCA methods make use of the same data structure (pcaRes) to provide a common interface to the PCA results. Initiated at the Max-Planck Institute for Molecular Plant Physiology, Golm, Germany.
Maintained by Henning Redestig. Last updated 5 months ago.
1.9 match 49 stars 13.10 score 538 scripts 73 dependentseasystats
correlation:Methods for Correlation Analysis
Lightweight package for computing different kinds of correlations, such as partial correlations, Bayesian correlations, multilevel correlations, polychoric correlations, biweight correlations, distance correlations and more. Part of the 'easystats' ecosystem. References: Makowski et al. (2020) <doi:10.21105/joss.02306>.
Maintained by Brenton M. Wiernik. Last updated 14 days ago.
bayesianbayesian-correlationsbiserialcorcorrelationcorrelation-analysiscorrelationseasystatsgammagaussian-graphical-modelshacktoberfestmatrixmultilevel-correlationsoutlierspartialpartial-correlationsregressionrobustspearman
1.7 match 439 stars 14.23 score 672 scripts 10 dependentspharmaverse
metatools:Enable the Use of 'metacore' to Help Create and Check Dataset
Uses the metadata information stored in 'metacore' objects to check and build metadata associated columns.
Maintained by Christina Fillmore. Last updated 2 months ago.
3.9 match 10 stars 6.26 score 120 scriptsbioc
HarmonizR:Handles missing values and makes more data available
An implementation, which takes input data and makes it available for proper batch effect removal by ComBat or Limma. The implementation appropriately handles missing values by dissecting the input matrix into smaller matrices with sufficient data to feed the ComBat or limma algorithm. The adjusted data is returned to the user as a rebuild matrix. The implementation is meant to make as much data available as possible with minimal data loss.
Maintained by Simon Schlumbohm. Last updated 5 months ago.
5.8 match 4.20 score 16 scriptsbioc
ORFik:Open Reading Frames in Genomics
R package for analysis of transcript and translation features through manipulation of sequence data and NGS data like Ribo-Seq, RNA-Seq, TCP-Seq and CAGE. It is generalized in the sense that any transcript region can be analysed, as the name hints to it was made with investigation of ribosomal patterns over Open Reading Frames (ORFs) as it's primary use case. ORFik is extremely fast through use of C++, data.table and GenomicRanges. Package allows to reassign starts of the transcripts with the use of CAGE-Seq data, automatic shifting of RiboSeq reads, finding of Open Reading Frames for whole genomes and much more.
Maintained by Haakon Tjeldnes. Last updated 29 days ago.
immunooncologysoftwaresequencingriboseqrnaseqfunctionalgenomicscoveragealignmentdataimportcpp
2.3 match 33 stars 10.63 score 115 scripts 2 dependentscran
timeSeries:Financial Time Series Objects (Rmetrics)
'S4' classes and various tools for financial time series: Basic functions such as scaling and sorting, subsetting, mathematical operations and statistical functions.
Maintained by Georgi N. Boshnakov. Last updated 6 months ago.
2.4 match 2 stars 9.90 score 1.3k scripts 145 dependentsrspatial
dismo:Species Distribution Modeling
Methods for species distribution modeling, that is, predicting the environmental similarity of any site to that of the locations of known occurrences of a species.
Maintained by Robert J. Hijmans. Last updated 4 months ago.
2.0 match 26 stars 11.88 score 2.8k scripts 21 dependentsbilldenney
PKNCA:Perform Pharmacokinetic Non-Compartmental Analysis
Compute standard Non-Compartmental Analysis (NCA) parameters for typical pharmacokinetic analyses and summarize them.
Maintained by Bill Denney. Last updated 17 days ago.
ncanoncompartmental-analysispharmacokinetics
1.9 match 73 stars 12.61 score 214 scripts 4 dependentsmablab
sftrack:Modern Classes for Tracking and Movement Data
Modern classes for tracking and movement data, building on 'sf' spatial infrastructure, and early theoretical work from Turchin (1998, ISBN: 9780878938476), and Calenge et al. (2009) <doi:10.1016/j.ecoinf.2008.10.002>. Tracking data are series of locations with at least 2-dimensional spatial coordinates (x,y), a time index (t), and individual identification (id) of the object being monitored; movement data are made of trajectories, i.e. the line representation of the path, composed by steps (the straight-line segments connecting successive locations). 'sftrack' is designed to handle movement of both living organisms and inanimate objects.
Maintained by Mathieu Basille. Last updated 2 years ago.
movementtracking-datatrajectory
3.7 match 53 stars 6.42 score 5 scriptsrfastofficial
Rfast2:A Collection of Efficient and Extremely Fast R Functions II
A collection of fast statistical and utility functions for data analysis. Functions for regression, maximum likelihood, column-wise statistics and many more have been included. C++ has been utilized to speed up the functions. References: Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>.
Maintained by Manos Papadakis. Last updated 1 years ago.
2.9 match 38 stars 8.09 score 75 scripts 26 dependentsrobinhankin
freealg:The Free Algebra
The free algebra in R with non-commuting indeterminates. Uses 'disordR' discipline (Hankin, 2022, <doi:10.48550/ARXIV.2210.03856>). To cite the package in publications please use Hankin (2022) <doi:10.48550/ARXIV.2211.04002>.
Maintained by Robin K. S. Hankin. Last updated 6 days ago.
3.3 match 3 stars 7.07 score 10 dependentsbioc
globaltest:Testing Groups of Covariates/Features for Association with a Response Variable, with Applications to Gene Set Testing
The global test tests groups of covariates (or features) for association with a response variable. This package implements the test with diagnostic plots and multiple testing utilities, along with several functions to facilitate the use of this test for gene set testing of GO and KEGG terms.
Maintained by Jelle Goeman. Last updated 5 months ago.
microarrayonechannelbioinformaticsdifferentialexpressiongopathways
3.3 match 6.96 score 79 scripts 7 dependentsmarkmfredrickson
optmatch:Functions for Optimal Matching
Distance based bipartite matching using minimum cost flow, oriented to matching of treatment and control groups in observational studies ('Hansen' and 'Klopfer' 2006 <doi:10.1198/106186006X137047>). Routines are provided to generate distances from generalised linear models (propensity score matching), formulas giving variables on which to limit matched distances, stratified or exact matching directives, or calipers, alone or in combination.
Maintained by Josh Errickson. Last updated 3 months ago.
1.9 match 47 stars 12.22 score 588 scripts 5 dependentsdoi-usgs
nhdplusTools:NHDPlus Tools
Tools for traversing and working with National Hydrography Dataset Plus (NHDPlus) data. All methods implemented in 'nhdplusTools' are available in the NHDPlus documentation available from the US Environmental Protection Agency <https://www.epa.gov/waterdata/basic-information>.
Maintained by David Blodgett. Last updated 26 days ago.
2.0 match 87 stars 11.38 score 348 scripts 5 dependentsropensci
jqr:Client for 'jq', a 'JSON' Processor
Client for 'jq', a 'JSON' processor (<https://jqlang.github.io/jq/>), written in C. 'jq' allows the following with 'JSON' data: index into, parse, do calculations, cut up and filter, change key names and values, perform conditionals and comparisons, and more.
Maintained by Jeroen Ooms. Last updated 3 months ago.
2.3 match 144 stars 10.04 score 95 scripts 28 dependentsrobinhankin
mvp:Fast Symbolic Multivariate Polynomials
Fast manipulation of symbolic multivariate polynomials using the 'Map' class of the Standard Template Library. The package uses print and coercion methods from the 'mpoly' package but offers speed improvements. It is comparable in speed to the 'spray' package for sparse arrays, but retains the symbolic benefits of 'mpoly'. To cite the package in publications, use Hankin 2022 <doi:10.48550/ARXIV.2210.15991>. Uses 'disordR' discipline.
Maintained by Robin K. S. Hankin. Last updated 4 days ago.
3.3 match 9 stars 6.83 score 36 scripts 2 dependentsvochr
TapeS:Tree Taper Curves and Sorting Based on 'TapeR'
Providing new german-wide 'TapeR' Models and functions for their evaluation. Included are the most common tree species in Germany (Norway spruce, Scots pine, European larch, Douglas fir, Silver fir as well as European beech, Common/Sessile oak and Red oak). Many other species are mapped to them so that 36 tree species / groups can be processed. Single trees are defined by species code, one or multiple diameters in arbitrary measuring height and tree height. The functions then provide information on diameters along the stem, bark thickness, height of diameters, volume of the total or parts of the trunk and total and component above-ground biomass. It is also possible to calculate assortments from the taper curves. Uncertainty information is provided for diameter, volume and component biomass estimation.
Maintained by Christian Vonderach. Last updated 1 months ago.
5.7 match 3.90 score 1 scriptsrobinhankin
clifford:Arbitrary Dimensional Clifford Algebras
A suite of routines for Clifford algebras, using the 'Map' class of the Standard Template Library. Canonical reference: Hestenes (1987, ISBN 90-277-1673-0, "Clifford algebra to geometric calculus"). Special cases including Lorentz transforms, quaternion multiplication, and Grassmann algebra, are discussed. Vignettes presenting conformal geometric algebra, quaternions and split quaternions, dual numbers, and Lorentz transforms are included. The package follows 'disordR' discipline.
Maintained by Robin K. S. Hankin. Last updated 1 months ago.
3.3 match 5 stars 6.71 score 4 scriptsubod
apcluster:Affinity Propagation Clustering
Implements Affinity Propagation clustering introduced by Frey and Dueck (2007) <DOI:10.1126/science.1136800>. The algorithms are largely analogous to the 'Matlab' code published by Frey and Dueck. The package further provides leveraged affinity propagation and an algorithm for exemplar-based agglomerative clustering that can also be used to join clusters obtained from affinity propagation. Various plotting functions are available for analyzing clustering results.
Maintained by Ulrich Bodenhofer. Last updated 11 months ago.
2.3 match 10 stars 9.82 score 270 scripts 25 dependentsspkaluzny
splusTimeSeries:Time Series from 'S-PLUS'
A collection of classes and methods for working with indexed rectangular data. The index values can be calendar (timeSeries class) or numeric (signalSeries class). Methods are included for aggregation, alignment, merging, and summaries. The code was originally available in 'S-PLUS'.
Maintained by Stephen Kaluzny. Last updated 6 months ago.
5.6 match 3.95 score 20 scripts 1 dependentsbioc
variancePartition:Quantify and interpret drivers of variation in multilevel gene expression experiments
Quantify and interpret multiple sources of biological and technical variation in gene expression experiments. Uses a linear mixed model to quantify variation in gene expression attributable to individual, tissue, time point, or technical variables. Includes dream differential expression analysis for repeated measures.
Maintained by Gabriel E. Hoffman. Last updated 2 months ago.
rnaseqgeneexpressiongenesetenrichmentdifferentialexpressionbatcheffectqualitycontrolregressionepigeneticsfunctionalgenomicstranscriptomicsnormalizationpreprocessingmicroarrayimmunooncologysoftware
1.9 match 7 stars 11.69 score 1.1k scripts 3 dependentsrobinhankin
stokes:The Exterior Calculus
Provides functionality for working with tensors, alternating forms, wedge products, Stokes's theorem, and related concepts from the exterior calculus. Uses 'disordR' discipline (Hankin, 2022, <doi:10.48550/arXiv.2210.03856>). The canonical reference would be M. Spivak (1965, ISBN:0-8053-9021-9) "Calculus on Manifolds". To cite the package in publications please use Hankin (2022) <doi:10.48550/arXiv.2210.17008>.
Maintained by Robin K. S. Hankin. Last updated 2 months ago.
3.3 match 3 stars 6.62 scoreinrae
airGRiwrm:'airGR' Integrated Water Resource Management
Semi-distributed Precipitation-Runoff Modeling based on 'airGR' package models integrating human infrastructures and their managements.
Maintained by David Dorchies. Last updated 6 months ago.
3.4 match 6.34 score 45 scriptsmathewchamberlain
SignacX:Cell Type Identification and Discovery from Single Cell Gene Expression Data
An implementation of neural networks trained with flow-sorted gene expression data to classify cellular phenotypes in single cell RNA-sequencing data. See Chamberlain M et al. (2021) <doi:10.1101/2021.02.01.429207> for more details.
Maintained by Mathew Chamberlain. Last updated 2 years ago.
cellular-phenotypesseuratsingle-cell-rna-seq
3.4 match 24 stars 6.46 score 34 scriptsrobinhankin
disordR:Non-Ordered Vectors
Functionality for manipulating values of associative maps. The package is a dependency for mvp-type packages that use the STL map class: it traps plausible idiom that is ill-defined (implementation-specific) and returns an informative error, rather than returning a possibly incorrect result. To cite the package in publications please use Hankin (2022) <doi:10.48550/ARXIV.2210.03856>.
Maintained by Robin K. S. Hankin. Last updated 4 months ago.
3.3 match 1 stars 6.59 score 20 dependentstoxpi
toxpiR:Create ToxPi Prioritization Models
Enables users to build 'ToxPi' prioritization models and provides functionality within the grid framework for plotting ToxPi graphs. 'toxpiR' allows for more customization than the 'ToxPi GUI' (<https://toxpi.org>) and integration into existing workflows for greater ease-of-use, reproducibility, and transparency. toxpiR package behaves nearly identically to the GUI; the package documentation includes notes about all differences. The vignettes download example files from <https://github.com/ToxPi/ToxPi-example-files>.
Maintained by Jonathon F Fleming. Last updated 7 months ago.
data-sciencemodelingtoxicology
3.3 match 11 stars 6.58 score 19 scriptsspkaluzny
splus2R:Supplemental S-PLUS Functionality in R
Currently there are many functions in S-PLUS that are missing in R. To facilitate the conversion of S-PLUS packages to R packages, this package provides some missing S-PLUS functionality in R.
Maintained by Stephen Kaluzny. Last updated 1 years ago.
3.3 match 1 stars 6.56 score 82 scripts 30 dependentsr-forge
Polychrome:Qualitative Palettes with Many Colors
Tools for creating, viewing, and assessing qualitative palettes with many (20-30 or more) colors. See Coombes and colleagues (2019) <doi:10.18637/jss.v090.c01>.
Maintained by Kevin R. Coombes. Last updated 1 months ago.
2.3 match 9.56 score 1.0k scripts 27 dependentsishidamgm
TreeRingShape:Recording Tree-Ring Shapes of Tree Disks with Manual Digitizing and Interpolating Model
Record all tree-ring Shapefile of tree disk with GIS soft ('Qgis'<https://www.qgis.org/en/site/>) and interpolating model from high resolution tree disk image.
Maintained by Megumi ISHIDA. Last updated 4 months ago.
4.8 match 4.48 scorebernd-mueller
epos:Epilepsy Ontologies' Similarities
Analysis and visualization of similarities between epilepsy ontologies based on text mining results by comparing ranked lists of co-occurring drug terms in the BioASQ corpus. The ranked result lists of neurological drug terms co-occurring with terms from the epilepsy ontologies EpSO, ESSO, EPILONT, EPISEM and FENICS undergo further analysis. The source data to create the ranked lists of drug names is produced using the text mining workflows described in Mueller, Bernd and Hagelstein, Alexandra (2016) <doi:10.4126/FRL01-006408558>, Mueller, Bernd et al. (2017) <doi:10.1007/978-3-319-58694-6_22>, Mueller, Bernd and Rebholz-Schuhmann, Dietrich (2020) <doi:10.1007/978-3-030-43887-6_52>, and Mueller, Bernd et al. (2022) <doi:10.1186/s13326-021-00258-w>.
Maintained by Bernd Mueller. Last updated 1 years ago.
5.3 match 4.03 score 53 scriptsgagolews
stringx:Replacements for Base String Functions Powered by 'stringi'
English is the native language for only 5% of the World population. Also, only 17% of us can understand this text. Moreover, the Latin alphabet is the main one for merely 36% of the total. The early computer era, now a very long time ago, was dominated by the US. Due to the proliferation of the internet, smartphones, social media, and other technologies and communication platforms, this is no longer the case. This package replaces base R string functions (such as grep(), tolower(), sprintf(), and strptime()) with ones that fully support the Unicode standards related to natural language and date-time processing. It also fixes some long-standing inconsistencies, and introduces some new, useful features. Thanks to 'ICU' (International Components for Unicode) and 'stringi', they are fast, reliable, and portable across different platforms.
Maintained by Marek Gagolewski. Last updated 2 months ago.
icuicu4cnatural-language-processingnlpregexregexpstring-manipulationstringitexttext-processingunicode
4.5 match 28 stars 4.75 score 1 scriptsbioc
unifiedWMWqPCR:Unified Wilcoxon-Mann Whitney Test for testing differential expression in qPCR data
This packages implements the unified Wilcoxon-Mann-Whitney Test for qPCR data. This modified test allows for testing differential expression in qPCR data.
Maintained by Joris Meys. Last updated 5 months ago.
differentialexpressiongeneexpressionmicrotitreplateassaymultiplecomparisonqualitycontrolsoftwarevisualizationqpcr
5.1 match 4.18 scoredmurdoch
plotrix:Various Plotting Functions
Lots of plots, various labeling, axis and color scaling functions. The author/maintainer died in September 2023.
Maintained by Duncan Murdoch. Last updated 1 years ago.
1.9 match 5 stars 11.31 score 9.2k scripts 361 dependentspharmaverse
admiral:ADaM in R Asset Library
A toolbox for programming Clinical Data Interchange Standards Consortium (CDISC) compliant Analysis Data Model (ADaM) datasets in R. ADaM datasets are a mandatory part of any New Drug or Biologics License Application submitted to the United States Food and Drug Administration (FDA). Analysis derivations are implemented in accordance with the "Analysis Data Model Implementation Guide" (CDISC Analysis Data Model Team, 2021, <https://www.cdisc.org/standards/foundational/adam>).
Maintained by Ben Straub. Last updated 6 hours ago.
cdiscclinical-trialsopen-source
1.5 match 238 stars 13.92 score 486 scripts 4 dependentssciviews
pastecs:Package for Analysis of Space-Time Ecological Series
Regularisation, decomposition and analysis of space-time series. The pastecs R package is a PNEC-Art4 and IFREMER (Benoit Beliaeff <Benoit.Beliaeff@ifremer.fr>) initiative to bring PASSTEC 2000 functionalities to R.
Maintained by Philippe Grosjean. Last updated 1 years ago.
2.0 match 4 stars 10.34 score 2.1k scripts 13 dependentsmarce10
warbleR:Streamline Bioacoustic Analysis
Functions aiming to facilitate the analysis of the structure of animal acoustic signals in 'R'. 'warbleR' makes use of the basic sound analysis tools from the packages 'tuneR' and 'seewave', and offers new tools for explore and quantify acoustic signal structure. The package allows to organize and manipulate multiple sound files, create spectrograms of complete recordings or individual signals in different formats, run several measures of acoustic structure, and characterize different structural levels in acoustic signals.
Maintained by Marcelo Araya-Salas. Last updated 2 months ago.
animal-acoustic-signalsaudio-processingbioacousticsspectrogramstreamline-analysiscpp
1.9 match 54 stars 11.01 score 270 scripts 4 dependentsstuart-lab
Signac:Analysis of Single-Cell Chromatin Data
A framework for the analysis and exploration of single-cell chromatin data. The 'Signac' package contains functions for quantifying single-cell chromatin data, computing per-cell quality control metrics, dimension reduction and normalization, visualization, and DNA sequence motif analysis. Reference: Stuart et al. (2021) <doi:10.1038/s41592-021-01282-5>.
Maintained by Tim Stuart. Last updated 7 months ago.
atacbioinformaticssingle-cellzlibcpp
1.7 match 349 stars 12.19 score 3.7k scripts 1 dependentscrsh
papaja:Prepare American Psychological Association Journal Articles with R Markdown
Tools to create dynamic, submission-ready manuscripts, which conform to American Psychological Association manuscript guidelines. We provide R Markdown document formats for manuscripts (PDF and Word) and revision letters (PDF). Helper functions facilitate reporting statistical analyses or create publication-ready tables and plots.
Maintained by Frederik Aust. Last updated 19 days ago.
apaapa-guidelinesjournalmanuscriptpsychologyreproducible-paperreproducible-researchrmarkdown
1.8 match 662 stars 11.74 score 1.7k scripts 1 dependentsbioc
MANOR:CGH Micro-Array NORmalization
Importation, normalization, visualization, and quality control functions to correct identified sources of variability in array-CGH experiments.
Maintained by Pierre Neuvial. Last updated 5 days ago.
microarraytwochanneldataimportqualitycontrolpreprocessingcopynumbervariationnormalization
4.1 match 4.98 score 1 scriptsbioc
twoddpcr:Classify 2-d Droplet Digital PCR (ddPCR) data and quantify the number of starting molecules
The twoddpcr package takes Droplet Digital PCR (ddPCR) droplet amplitude data from Bio-Rad's QuantaSoft and can classify the droplets. A summary of the positive/negative droplet counts can be generated, which can then be used to estimate the number of molecules using the Poisson distribution. This is the first open source package that facilitates the automatic classification of general two channel ddPCR data. Previous work includes 'definetherain' (Jones et al., 2014) and 'ddpcRquant' (Trypsteen et al., 2015) which both handle one channel ddPCR experiments only. The 'ddpcr' package available on CRAN (Attali et al., 2016) supports automatic gating of a specific class of two channel ddPCR experiments only.
Maintained by Anthony Chiu. Last updated 5 months ago.
3.5 match 10 stars 5.78 score 4 scripts2005m
kit:Data Manipulation Functions Implemented in C
Basic functions, implemented in C, for large data manipulation. Fast vectorised ifelse()/nested if()/switch() functions, psum()/pprod() functions equivalent to pmin()/pmax() plus others which are missing from base R. Most of these functions are callable at C level.
Maintained by Morgan Jacob. Last updated 6 months ago.
2.3 match 58 stars 9.11 score 92 scripts 5 dependentsdavid-cortes
MatrixExtra:Extra Methods for Sparse Matrices
Extends sparse matrix and vector classes from the 'Matrix' package by providing: (a) Methods and operators that work natively on CSR formats (compressed sparse row, a.k.a. 'RsparseMatrix') such as slicing/sub-setting, assignment, rbind(), mathematical operators for CSR and COO such as addition ("+") or sqrt(), and methods such as diag(); (b) Multi-threaded matrix multiplication and cross-product for many <sparse, dense> types, including the 'float32' type from 'float'; (c) Coercion methods between pairs of classes which are not present in 'Matrix', such as 'dgCMatrix' -> 'ngRMatrix', as well as convenience conversion functions; (d) Utility functions for sparse matrices such as sorting the indices or removing zero-valued entries; (e) Fast transposes that work by outputting in the opposite storage format; (f) Faster replacements for many 'Matrix' methods for all sparse types, such as slicing and elementwise multiplication. (g) Convenience functions for sparse objects, such as 'mapSparse' or a shorter 'show' method.
Maintained by David Cortes. Last updated 9 months ago.
csrsparse-matrixopenblascppopenmp
2.3 match 20 stars 9.08 score 84 scripts 29 dependentslvclark
polyRAD:Genotype Calling with Uncertainty from Sequencing Data in Polyploids and Diploids
Read depth data from genotyping-by-sequencing (GBS) or restriction site-associated DNA sequencing (RAD-seq) are imported and used to make Bayesian probability estimates of genotypes in polyploids or diploids. The genotype probabilities, posterior mean genotypes, or most probable genotypes can then be exported for downstream analysis. 'polyRAD' is described by Clark et al. (2019) <doi:10.1534/g3.118.200913>, and the Hind/He statistic for marker filtering is described by Clark et al. (2022) <doi:10.1186/s12859-022-04635-9>. A variant calling pipeline for highly duplicated genomes is also included and is described by Clark et al. (2020, Version 1) <doi:10.1101/2020.01.11.902890>.
Maintained by Lindsay V. Clark. Last updated 9 days ago.
bioinformaticsdna-sequencinggenotype-likelihoodsgenotyping-by-sequencinghacktoberfestrad-seqrad-sequencingsnp-genotypingcpp
2.9 match 28 stars 6.98 score 85 scriptsxfim
ggmcmc:Tools for Analyzing MCMC Simulations from Bayesian Inference
Tools for assessing and diagnosing convergence of Markov Chain Monte Carlo simulations, as well as for graphically display results from full MCMC analysis. The package also facilitates the graphical interpretation of models by providing flexible functions to plot the results against observed variables, and functions to work with hierarchical/multilevel batches of parameters (Fernández-i-Marín, 2016 <doi:10.18637/jss.v070.i09>).
Maintained by Xavier Fernández i Marín. Last updated 2 years ago.
bayesian-data-analysisggplot2graphicaljagsmcmcstan
1.7 match 112 stars 12.02 score 1.6k scripts 8 dependentsmariarizzo
energy:E-Statistics: Multivariate Inference via the Energy of Data
E-statistics (energy) tests and statistics for multivariate and univariate inference, including distance correlation, one-sample, two-sample, and multi-sample tests for comparing multivariate distributions, are implemented. Measuring and testing multivariate independence based on distance correlation, partial distance correlation, multivariate goodness-of-fit tests, k-groups and hierarchical clustering based on energy distance, testing for multivariate normality, distance components (disco) for non-parametric analysis of structured data, and other energy statistics/methods are implemented.
Maintained by Maria Rizzo. Last updated 7 months ago.
distance-correlationenergymultivariate-analysisstatisticscpp
1.9 match 45 stars 10.60 score 634 scripts 45 dependentsrhartmano
labelr:Label Data Frames, Variables, and Values
Create and use data frame labels for data frame objects (frame labels), their columns (name labels), and individual values of a column (value labels). Value labels include one-to-one and many-to-one labels for nominal and ordinal variables, as well as numerical range-based value labels for continuous variables. Convert value-labeled variables so each value is replaced by its corresponding value label. Add values-converted-to-labels columns to a value-labeled data frame while preserving parent columns. Filter and subset a value-labeled data frame using labels, while returning results in terms of values. Overlay labels in place of values in common R commands to increase interpretability. Generate tables of value frequencies, with categories expressed as raw values or as labels. Access data frames that show value-to-label mappings for easy reference.
Maintained by Robert Hartman. Last updated 7 months ago.
3.5 match 3 stars 5.65 score 10 scriptsbioc
QSutils:Quasispecies Diversity
Set of utility functions for viral quasispecies analysis with NGS data. Most functions are equally useful for metagenomic studies. There are three main types: (1) data manipulation and exploration—functions useful for converting reads to haplotypes and frequencies, repairing reads, intersecting strand haplotypes, and visualizing haplotype alignments. (2) diversity indices—functions to compute diversity and entropy, in which incidence, abundance, and functional indices are considered. (3) data simulation—functions useful for generating random viral quasispecies data.
Maintained by Mercedes Guerrero-Murillo. Last updated 5 months ago.
softwaregeneticsdnaseqgeneticvariabilitysequencingalignmentsequencematchingdataimport
3.6 match 5.56 score 8 scripts 1 dependents