Showing 200 of 485 results
usepa
ctxR: Utilities for Interacting with the 'CTX' APIs
Access chemical, hazard, bioactivity, and exposure data from the Computational Toxicology and Exposure ('CTX') APIs <https://www.epa.gov/comptox-tools/computational-toxicology-and-exposure-apis>. 'ctxR' was developed to streamline the process of accessing the information available through the 'CTX' APIs without requiring prior knowledge of how to use APIs. Most data is also available on the CompTox Chemical Dashboard ('CCD') <https://comptox.epa.gov/dashboard/> and other resources found at the EPA Computational Toxicology and Exposure Online Resources <https://www.epa.gov/comptox-tools>.
Maintained by Paul Kruse. Last updated 2 months ago.
76.0 match 10 stars 8.02 score 13 scripts 1 dependent
usepa
ccdR: Utilities for Interacting with the 'CTX' APIs
Access chemical, hazard, bioactivity, and exposure data from the Computational Toxicology and Exposure ('CTX') APIs <https://api-ccte.epa.gov/docs/>. 'ccdR' was developed to streamline the process of accessing the information available through the 'CTX' APIs without requiring prior knowledge of how to use APIs. Most data is also available on the CompTox Chemical Dashboard ('CCD') <https://comptox.epa.gov/dashboard/> and other resources found at the EPA Computational Toxicology and Exposure Online Resources <https://www.epa.gov/comptox-tools>.
Maintained by Paul Kruse. Last updated 8 months ago.
78.7 match 2 stars 6.38 score 7 scripts
pecanproject
PEcAn.assim.batch: PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by Istem Fer. Last updated 3 days ago.
bayesian, cyberinfrastructure, data-assimilation, data-science, ecosystem-model, ecosystem-science, forecasting, meta-analysis, national-science-foundation, pecan, plants, jags, cpp
44.7 match 216 stars 9.94 score 20 scripts 2 dependents
edubruell
tidyllm: Tidy Integration of Large Language Models
A tidy interface for integrating large language model (LLM) APIs such as 'Claude', 'OpenAI', 'Groq', 'Mistral' and local models via 'Ollama' into R workflows. The package supports text and media-based interactions, interactive message history, batch request APIs, and a tidy, pipeline-oriented interface for streamlined integration into data workflows. Web services are available at <https://www.anthropic.com>, <https://openai.com>, <https://groq.com>, <https://mistral.ai/> and <https://ollama.com>.
Maintained by Eduard Brüll. Last updated 5 days ago.
52.2 match 68 stars 7.82 score 26 scripts
wlandau
crew.aws.batch: A Crew Launcher Plugin for AWS Batch
In computationally demanding analysis projects, statisticians and data scientists asynchronously deploy long-running tasks to distributed systems, ranging from traditional clusters to cloud services. The 'crew.aws.batch' package extends the 'mirai'-powered 'crew' package with a worker launcher plugin for AWS Batch. Inspiration also comes from packages 'mirai' by Gao (2023) <https://github.com/shikokuchuo/mirai>, 'future' by Bengtsson (2021) <doi:10.32614/RJ-2021-048>, 'rrq' by FitzJohn and Ashton (2023) <https://github.com/mrc-ide/rrq>, 'clustermq' by Schubert (2019) <doi:10.1093/bioinformatics/btz284>, and 'batchtools' by Lang, Bischl, and Surmann (2017) <doi:10.21105/joss.00135>.
Maintained by William Michael Landau. Last updated 1 month ago.
aws-batch, crew, high-performance-computing
62.2 match 15 stars 4.99 score 6 scripts
t-kalinowski
keras: R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks 'API'. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.
Maintained by Tomasz Kalinowski. Last updated 11 months ago.
27.2 match 10.93 score 10k scripts 55 dependents
bioc
batchelor: Single-Cell Batch Correction Methods
Implements a variety of methods for batch correction of single-cell (RNA sequencing) data. This includes methods based on detecting mutually nearest neighbors, as well as several efficient variants of linear regression of the log-expression values. Functions are also provided to perform global rescaling to remove differences in depth between batches, and to perform a principal components analysis that is robust to differences in the numbers of cells across batches.
Maintained by Aaron Lun. Last updated 4 days ago.
sequencing, rnaseq, software, geneexpression, transcriptomics, singlecell, batcheffect, normalization, cpp
31.6 match 9.10 score 1.2k scripts 10 dependents
bioc
singleCellTK: Comprehensive and Interactive Analysis of Single Cell RNA-Seq Data
The Single Cell Toolkit (SCTK) in the singleCellTK package provides an interface to popular tools for importing, quality control, analysis, and visualization of single cell RNA-seq data. SCTK allows users to seamlessly integrate tools from various packages at different stages of the analysis workflow. A general "a la carte" workflow gives users access to multiple methods for data importing, calculation of general QC metrics, doublet detection, ambient RNA estimation and removal, filtering, normalization, batch correction or integration, dimensionality reduction, 2-D embedding, clustering, marker detection, differential expression, cell type labeling, pathway analysis, and data exporting. Curated workflows can be used to run Seurat and Celda. Streamlined quality control can be performed on the command line using the SCTK-QC pipeline. Users can analyze their data using commands in the R console or by using an interactive Shiny Graphical User Interface (GUI). Specific analyses or entire workflows can be summarized and shared with comprehensive HTML reports generated by Rmarkdown. Additional documentation and vignettes can be found at camplab.net/sctk.
Maintained by Joshua David Campbell. Last updated 25 days ago.
singlecell, geneexpression, differentialexpression, alignment, clustering, immunooncology, batcheffect, normalization, qualitycontrol, dataimport, gui
25.6 match 181 stars 10.16 score 252 scripts
bioc
BatchQC: Batch Effects Quality Control Software
Sequencing and microarray samples often are collected or processed in multiple batches or at different times. This often produces technical biases that can lead to incorrect results in the downstream analysis. BatchQC is a software tool that streamlines batch preprocessing and evaluation by providing interactive diagnostics, visualizations, and statistical analyses to explore the extent to which batch variation impacts the data. BatchQC diagnostics help determine whether batch adjustment needs to be done, and how correction should be applied before proceeding with a downstream analysis. Moreover, BatchQC interactively applies multiple common batch effect approaches to the data and the user can quickly see the benefits of each method. BatchQC is developed as a Shiny App. The output is organized into multiple tabs and each tab features an important part of the batch effect analysis and visualization of the data. The BatchQC interface has the following analysis groups: Summary, Differential Expression, Median Correlations, Heatmaps, Circular Dendrogram, PCA Analysis, Shape, ComBat and SVA.
Maintained by Jessica McClintock. Last updated 5 months ago.
batcheffect, graphandnetwork, microarray, normalization, principalcomponent, sequencing, software, visualization, qualitycontrol, rnaseq, preprocessing, differentialexpression, immunooncology
26.1 match 7 stars 8.99 score 54 scripts
mlampros
ClusterR: Gaussian Mixture Models, K-Means, Mini-Batch-Kmeans, K-Medoids and Affinity Propagation Clustering
Gaussian mixture models, k-means, mini-batch-kmeans, k-medoids and affinity propagation clustering with the option to plot, validate, predict (new data) and estimate the optimal number of clusters. The package takes advantage of 'RcppArmadillo' to speed up the computationally intensive parts of the functions. For more information, see (i) "Clustering in an Object-Oriented Environment" by Anja Struyf, Mia Hubert, Peter Rousseeuw (1997), Journal of Statistical Software, <doi:10.18637/jss.v001.i04>; (ii) "Web-scale k-means clustering" by D. Sculley (2010), ACM Digital Library, <doi:10.1145/1772690.1772862>; (iii) "Armadillo: a template-based C++ library for linear algebra" by Sanderson et al (2016), The Journal of Open Source Software, <doi:10.21105/joss.00026>; (iv) "Clustering by Passing Messages Between Data Points" by Brendan J. Frey and Delbert Dueck, Science 16 Feb 2007: Vol. 315, Issue 5814, pp. 972-976, <doi:10.1126/science.1136800>.
Maintained by Lampros Mouselimis. Last updated 9 months ago.
affinity-propagation, cpp11, gmm, kmeans, kmedoids-clustering, mini-batch-kmeans, rcpparmadillo, openblas, cpp, openmp
18.7 match 84 stars 11.08 score 640 scripts 24 dependents
lme4
lme4: Linear Mixed-Effects Models using 'Eigen' and S4
Fit linear and generalized linear mixed-effects models. The models and their components are represented using S4 classes and methods. The core computational algorithms are implemented using the 'Eigen' C++ library for numerical linear algebra and 'RcppEigen' "glue".
Maintained by Ben Bolker. Last updated 4 days ago.
9.8 match 647 stars 20.69 score 35k scripts 1.5k dependents
dylanpieper
hellmer: Batch Processing for Chat Models
Batch processing framework for 'ellmer' chat model interactions. Enables sequential and parallel processing of chat completions. Core capabilities include error handling with backoff, state persistence, progress tracking, and retry management. Parallel processing is implemented via the 'future' framework. Additional features include structured data extraction, tool integration, timeout handling, verbosity control, and sound notifications. Includes methods for returning chat texts, chat objects, progress status, and structured data.
Maintained by Dylan Pieper. Last updated 3 days ago.
batch, batch-processing, ellmer, llm
36.3 match 6 stars 5.18 score
llrs
experDesign: Design Experiments for Batches
Distributes samples in batches while making batches homogeneous according to their description. Allows for an arbitrary number of variables, both numeric and categorical. For quality control it provides functions to subset a representative sample.
Maintained by Lluís Revilla Sancho. Last updated 3 months ago.
31.1 match 10 stars 5.54 score 1 script
mlr-org
bbotk: Black-Box Optimization Toolkit
Features highly configurable search spaces via the 'paradox' package and optimizes every user-defined objective function. The package includes several optimization algorithms e.g. Random Search, Iterated Racing, Bayesian Optimization (in 'mlr3mbo') and Hyperband (in 'mlr3hyperband'). bbotk is the base package of 'mlr3tuning', 'mlr3fselect' and 'miesmuschel'.
Maintained by Marc Becker. Last updated 3 months ago.
bbotk, black-box-optimization, data-science, hyperparameter-optimization, hyperparameter-tuning, machine-learning, mlr3, optimization
16.9 match 22 stars 9.87 score 166 scripts 14 dependents
bioc
BERT: High Performance Data Integration for Large-Scale Analyses of Incomplete Omic Profiles Using Batch-Effect Reduction Trees (BERT)
Provides efficient batch-effect adjustment of data with missing values. BERT arranges all batch-effect corrections into a tree of pairwise computations and allows parallelization over sub-trees.
Maintained by Yannis Schumann. Last updated 2 months ago.
batcheffect, preprocessing, experimentaldesign, qualitycontrol, batch-effect, bioconductor-package, bioinformatics, data-integration, data-science
29.4 match 2 stars 5.40 score 18 scripts
poissonconsulting
batchr: Batch Process Files
Processes multiple files with a user-supplied function. The key design principle is that only files which were last modified before the directory was configured are processed. A hidden file stores the configuration time, function, etc., while successfully processed files are automatically touched to update their modification date. As a result, batch processing can be stopped and restarted, and any files created (or modified or deleted) during processing are ignored.
Maintained by Joe Thorley. Last updated 2 months ago.
31.8 match 6 stars 4.56 score 8 scripts
rstudio
keras3: R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.
Maintained by Tomasz Kalinowski. Last updated 21 hours ago.
10.5 match 845 stars 13.60 score 264 scripts 2 dependents
mllg
batchtools: Tools for Computation on Batch Systems
As a successor of the packages 'BatchJobs' and 'BatchExperiments', this package provides a parallel implementation of the Map function for high performance computing systems managed by schedulers 'IBM Spectrum LSF' (<https://www.ibm.com/products/hpc-workload-management>), 'OpenLava' (<https://www.openlava.org/>), 'Univa Grid Engine'/'Oracle Grid Engine' (<https://www.univa.com/>), 'Slurm' (<https://slurm.schedmd.com/>), 'TORQUE/PBS' (<https://adaptivecomputing.com/cherry-services/torque-resource-manager/>), or 'Docker Swarm' (<https://docs.docker.com/engine/swarm/>). Multicore and socket modes allow parallelization on a local machine, and multiple machines can be hooked up via SSH to create a makeshift cluster. Moreover, the package provides an abstraction mechanism to define large-scale computer experiments in a well-organized and reproducible way.
Maintained by Michel Lang. Last updated 2 years ago.
batchexperiments, batchjobs, docker-swarm, high-performance-computing, hpc, hpc-clusters, lsf, openlava, parallel-computing, reproducibility, sge, slurm, torque
12.3 match 175 stars 11.39 score 772 scripts 14 dependents
markedmondson1234
googleAuthR: Authenticate and Create Google APIs
Create R functions that interact with OAuth2 Google APIs <https://developers.google.com/apis-explorer/> easily, with auto-refresh and Shiny compatibility.
Maintained by Erik Grönroos. Last updated 10 months ago.
api, authentication, google, googleauthr, oauth2-flow, shiny
10.8 match 178 stars 12.84 score 804 scripts 13 dependents
bioc
CDI: Clustering Deviation Index (CDI)
Single-cell RNA-sequencing (scRNA-seq) is widely used to explore cellular variation. The analysis of scRNA-seq data often starts from clustering cells into subpopulations. This initial step has a high impact on downstream analyses, and hence it is important to be accurate. However, no unsupervised metric has been designed for scRNA-seq to evaluate clustering performance. Hence, we propose clustering deviation index (CDI), an unsupervised metric based on the modeling of scRNA-seq UMI counts to evaluate clustering of cells.
Maintained by Jiyuan Fang. Last updated 5 months ago.
singlecell, software, clustering, visualization, sequencing, rnaseq, cellbasedassays
27.6 match 5 stars 5.00 score 4 scripts
bedapub
designit: Blocking and Randomization for Experimental Design
Intelligently assign samples to batches in order to reduce batch effects. Batch effects can have a significant impact on data analysis, especially when the assignment of samples to batches coincides with the contrast groups being studied. By defining a batch container and a scoring function that reflects the contrasts, this package allows users to assign samples in a way that minimizes the potential impact of batch effects on the comparison of interest. Among other functionality, we provide an implementation for OSAT score by Yan et al. (2012, <doi:10.1186/1471-2164-13-689>).
Maintained by Iakov I. Davydov. Last updated 4 months ago.
design-of-experiments, randomization
18.3 match 8 stars 7.28 score 24 scripts
bioc
BEclear: Correction of batch effects in DNA methylation data
Provides functions to detect and correct for batch effects in DNA methylation data. The core function is based on latent factor models and can also be used to predict missing values in any other matrix containing real numbers.
Maintained by Livia Rasp. Last updated 5 months ago.
batcheffect, dnamethylation, software, preprocessing, statisticalmethod, batch-effects, bioconductor-package, dna-methylation, latent-factor-model, methylation, missing-data, missing-values, stochastic-gradient-descent, cpp
22.4 match 4 stars 5.90 score 11 scripts
mlr-org
mlr3tuning: Hyperparameter Optimization for 'mlr3'
Hyperparameter optimization package of the 'mlr3' ecosystem. It features highly configurable search spaces via the 'paradox' package and finds optimal hyperparameter configurations for any 'mlr3' learner. 'mlr3tuning' works with several optimization algorithms e.g. Random Search, Iterated Racing, Bayesian Optimization (in 'mlr3mbo') and Hyperband (in 'mlr3hyperband'). Moreover, it can automatically optimize learners and estimate the performance of optimized models with nested resampling.
Maintained by Marc Becker. Last updated 3 months ago.
bbotk, hyperparameter-optimization, hyperparameter-tuning, machine-learning, mlr3, optimization, tune, tuning
11.1 match 55 stars 11.59 score 384 scripts 11 dependents
shixiangwang
ezcox: Easily Process a Batch of Cox Models
A tool to run a batch of univariate or multivariate Cox models and return tidy results.
Maintained by Shixiang Wang. Last updated 1 year ago.
17.5 match 21 stars 7.22 score 44 scripts 1 dependent
gitdemont
IFC: Tools for Imaging Flow Cytometry
Contains several tools to treat imaging flow cytometry data from 'ImageStream®' and 'FlowSight®' cytometers ('Amnis®' 'Cytek®'). Provides an easy and simple way to read and write .fcs, .rif, .cif and .daf files. Information such as masks, features, regions and populations set within these files can be retrieved for each single cell. In addition, raw data such as stored images can also be accessed. Users may increase their productivity thanks to dedicated functions to extract, visualize, manipulate and export 'IFC' data. A toy data example can be installed through the 'IFCdata' package of approximately 32 MB, which is available in a 'drat' repository <https://gitdemont.github.io/IFCdata/>. See file 'COPYRIGHTS' and file 'AUTHORS' for a list of copyright holders and authors.
Maintained by Yohann Demont. Last updated 11 days ago.
cytometry, cytometry-data, flow, flow-cytometry, flow-cytometry-analysis, flow-cytometry-data, flow-cytometry-files, ifc, image, imaging-flow-cytometry, imaging-flow-cytometry-data, microscopy, cpp
22.1 match 4 stars 5.34 score 12 scripts
stcolema
batchmix: Semi-Supervised Bayesian Mixture Models Incorporating Batch Correction
Semi-supervised and unsupervised Bayesian mixture models that simultaneously infer the cluster/class structure and a batch correction. Densities available are the multivariate normal and the multivariate t. The model sampler is implemented in C++. This package is aimed at analysis of low-dimensional data generated across several batches. See Coleman et al. (2022) <doi:10.1101/2022.01.14.476352> for details of the model.
Maintained by Stephen Coleman. Last updated 10 months ago.
27.7 match 4.00 score 3 scripts
bioc
PLSDAbatch: PLSDA-batch
A novel framework to correct for batch effects prior to any downstream analysis in microbiome data based on Projection to Latent Structures Discriminant Analysis. The main method is named “PLSDA-batch”. It first estimates treatment and batch variation with latent components, then subtracts batch-associated components from the data whilst preserving biological variation of interest. PLSDA-batch is highly suitable for microbiome data as it is non-parametric, multivariate and allows for ordination and data visualisation. Combined with centered log-ratio transformation for addressing uneven library sizes and compositional structure, PLSDA-batch addresses all characteristics of microbiome data that existing correction methods have ignored so far. Two other variants are proposed for 1/ unbalanced batch x treatment designs that are commonly encountered in studies with small sample sizes, and for 2/ selection of discriminative variables amongst treatment groups to avoid overfitting in classification problems. These two variants have widened the scope of applicability of PLSDA-batch to different data settings.
Maintained by Yiwen (Eva) Wang. Last updated 5 months ago.
statisticalmethod, dimensionreduction, principalcomponent, classification, microbiome, batcheffect, normalization, visualization
19.9 match 13 stars 5.37 score 18 scripts
bioc
debrowser: Interactive Differential Expression Analysis Browser
Bioinformatics platform containing interactive plots and tables for differential gene and region expression studies. Allows visualizing expression data in greater depth, interactively and faster. By changing the parameters, users can easily explore parts of the data in ways not possible before. Manually creating and inspecting these plots takes time. With DEBrowser, users can prepare plots without writing any code. Differential expression, PCA and clustering analyses are performed on site and the results are shown in various plots such as scatter, bar, box, volcano, and MA plots and heatmaps.
Maintained by Alper Kucukural. Last updated 5 months ago.
sequencing, chipseq, rnaseq, differentialexpression, geneexpression, clustering, immunooncology
13.2 match 61 stars 7.80 score 65 scripts
stopsack
batchtma: Batch Effect Adjustments
Different adjustment methods for batch effects in biomarker data, such as from tissue microarrays. Some methods attempt to retain differences between batches that may be due to between-batch differences in "biological" factors that influence biomarker values.
Maintained by Konrad Stopsack. Last updated 9 months ago.
batch-effects, measurement-error, tissue-microarray-analysis
27.2 match 1 star 3.70 score 3 scripts
ropensci
tarchetypes: Archetypes for Targets
Function-oriented Make-like declarative pipelines for statistics and data science are supported in the 'targets' R package. As an extension to 'targets', the 'tarchetypes' package provides convenient user-side functions to make 'targets' easier to use. By establishing reusable archetypes for common kinds of targets and pipelines, these functions help express complicated reproducible pipelines concisely and compactly. The methods in this package were influenced by the 'targets' R package by Will Landau (2018) <doi:10.21105/joss.00550>.
Maintained by William Michael Landau. Last updated 21 days ago.
data-science, high-performance-computing, peer-reviewed, pipeline, r-targetopia, reproducibility, targets, workflow
8.7 match 141 stars 11.43 score 1.7k scripts 10 dependents
gojiplus
captr: Client for the Captricity API
Get text from images of text using the Captricity Optical Character Recognition (OCR) API. Captricity allows you to get text from handwritten forms (think surveys) and other structured paper documents. It can also output data in the form of a delimited file, keeping field information intact. For more information, read <https://shreddr.captricity.com/developer/overview/>.
Maintained by Gaurav Sood. Last updated 7 years ago.
18.8 match 14 stars 5.29 score 28 scripts
rstudio
tfdatasets: Interface to 'TensorFlow' Datasets
Interface to 'TensorFlow' Datasets, a high-level library for building complex input pipelines from simple, re-usable pieces. See <https://www.tensorflow.org/guide> for additional details.
Maintained by Tomasz Kalinowski. Last updated 5 days ago.
10.3 match 34 stars 9.32 score 656 scripts 3 dependents
cran
batch: Batching Routines in Parallel and Passing Command-Line Arguments to R
Functions to allow you to easily pass command-line arguments into R, and functions to aid in submitting your R code in parallel on a cluster and joining the results afterward (e.g. multiple parameter values for simulations running in parallel, splitting up a permutation test in parallel, etc.). See 'parseCommandArgs(...)' for the main example of how to use this package.
Maintained by Thomas Hoffmann. Last updated 7 years ago.
59.8 match 1.60 score
tudo-r
BatchJobs: Batch Computing with R
Provides Map, Reduce and Filter variants to generate jobs on batch computing systems like PBS/Torque, LSF, SLURM and Sun Grid Engine. Multicore and SSH systems are also supported. For further details see the project web page.
Maintained by Bernd Bischl. Last updated 3 years ago.
10.9 match 85 stars 8.57 score 616 scripts 3 dependents
tiledb-inc
tiledb: Modern Database Engine for Complex Data Based on Multi-Dimensional Arrays
The modern database 'TileDB' introduces a powerful on-disk format for storing and accessing any complex data based on multi-dimensional arrays. It supports dense and sparse arrays, dataframes and key-values stores, cloud storage ('S3', 'GCS', 'Azure'), chunked arrays, multiple compression, encryption and checksum filters, uses a fully multi-threaded implementation, supports parallel I/O, data versioning ('time travel'), metadata and groups. It is implemented as an embeddable cross-platform C++ library with APIs from several languages, and integrations. This package provides the R support.
Maintained by Isaiah Norton. Last updated 5 days ago.
array, hdfs, s3, storage-manager, tiledb, cpp
7.8 match 107 stars 11.96 score 306 scripts 4 dependents
bioc
sevenbridges: Seven Bridges Platform API Client and Common Workflow Language Tool Builder in R
R client and utilities for Seven Bridges platform API, from Cancer Genomics Cloud to other Seven Bridges supported platforms.
Maintained by Phil Webster. Last updated 5 months ago.
software, dataimport, thirdpartyclient, api-client, bioconductor, bioinformatics, cloud, common-workflow-language, sevenbridges
11.5 match 35 stars 7.40 score 24 scripts
r-simmer
simmer: Discrete-Event Simulation for R
A process-oriented and trajectory-based Discrete-Event Simulation (DES) package for R. It is designed as a generic yet powerful framework. The architecture encloses a robust and fast simulation core written in 'C++' with automatic monitoring capabilities. It provides a rich and flexible R API that revolves around the concept of trajectory, a common path in the simulation model for entities of the same type. Documentation about 'simmer' is provided by several vignettes included in this package, via the paper by Ucar, Smeets & Azcorra (2019, <doi:10.18637/jss.v090.i02>), and the paper by Ucar, Hernández, Serrano & Azcorra (2018, <doi:10.1109/MCOM.2018.1700960>); see 'citation("simmer")' for details.
Maintained by Iñaki Ucar. Last updated 6 months ago.
7.3 match 223 stars 11.47 score 440 scripts 6 dependents
stevenmmortimer
salesforcer: An Implementation of 'Salesforce' APIs Using Tidy Principles
Functions connecting to the 'Salesforce' Platform APIs (REST, SOAP, Bulk 1.0, Bulk 2.0, Metadata, Reports and Dashboards) <https://trailhead.salesforce.com/content/learn/modules/api_basics/api_basics_overview>. "API" is an acronym for "application programming interface". Almost all calls from these APIs are supported as they use CSV, XML or JSON data that can be parsed into R data structures. For more information, documentation, and examples, please see the 'Salesforce' API documentation and this package's website <https://stevenmmortimer.github.io/salesforcer/>.
Maintained by Steven M. Mortimer. Last updated 4 months ago.
api-wrappers, r-language, r-programming, salesforce, salesforce-apis
8.8 match 82 stars 9.27 score 191 scripts
zheng206
ComBatFamQC: Comprehensive Batch Effect Diagnostics and Harmonization
Provides a comprehensive framework for batch effect diagnostics, harmonization, and post-harmonization downstream analysis. Features include interactive visualization tools, robust statistical tests, and a range of harmonization techniques. Additionally, 'ComBatFamQC' enables the creation of life-span age trend plots with estimated age-adjusted centiles and facilitates the generation of covariate-corrected residuals for analytical purposes. Methods for harmonization are based on approaches described in Johnson et al., (2007) <doi:10.1093/biostatistics/kxj037>, Beer et al., (2020) <doi:10.1016/j.neuroimage.2020.117129>, Pomponio et al., (2020) <doi:10.1016/j.neuroimage.2019.116450>, and Chen et al., (2021) <doi:10.1002/hbm.25688>.
Maintained by Zheng Ren. Last updated 19 hours ago.
diagnostic-tool, harmonization, rshinyapp
14.9 match 2 stars 5.41 score 16 scripts
bioc
standR: Spatial transcriptome analyses of Nanostring's DSP data in R
standR is a user-friendly R package providing functions to assist in conducting good-practice analysis of Nanostring's GeoMX DSP data. All functions in the package are built based on the SpatialExperiment object, allowing integration into various spatial transcriptomics-related packages from Bioconductor. standR allows data inspection, quality control, normalization, batch correction and evaluation with informative visualizations.
Maintained by Ning Liu. Last updated 1 month ago.
spatial, transcriptomics, geneexpression, differentialexpression, qualitycontrol, normalization, experimenthubsoftware
10.8 match 18 stars 7.39 score 45 scripts
davidcsterratt
retistruct: Retinal Reconstruction Program
Reconstructs retinae by morphing a flat surface with cuts (a dissected flat-mount retina) onto a curvilinear surface (the standard retinal shape). It can estimate the position of a point on the intact adult retina to within 8 degrees of arc (3.6% of nasotemporal axis). The coordinates in reconstructed retinae can be transformed to visuotopic coordinates. For more details see Sterratt, D. C., Lyngholm, D., Willshaw, D. J. and Thompson, I. D. (2013) <doi:10.1371/journal.pcbi.1002921>.
Maintained by David C. Sterratt. Last updated 10 days ago.
17.4 match 8 stars 4.60 score
azure
AzureGraph: Simple Interface to 'Microsoft Graph'
A simple interface to the 'Microsoft Graph' API <https://learn.microsoft.com/en-us/graph/overview>. 'Graph' is a comprehensive framework for accessing data in various online Microsoft services. This package was originally intended to provide an R interface only to the 'Azure Active Directory' part, with a view to supporting interoperability of R and 'Azure': users, groups, registered apps and service principals. However, it has since been expanded into a more general tool for interacting with Graph. Part of the 'AzureR' family of packages.
Maintained by Hong Ooi. Last updated 2 years ago.
azure-active-directory-graph-api, azure-sdk-r, microsoft-graph-api
7.5 match 32 stars 10.30 score 36 scripts 21 dependents
bioc
MBECS: Evaluation and correction of batch effects in microbiome data-sets
The Microbiome Batch Effect Correction Suite (MBECS) provides a set of functions to evaluate and mitigate unwanted noise due to processing in batches. To that end, it incorporates a host of batch correcting algorithms (BECA) from various packages. In addition, it offers a correction and reporting pipeline that provides a preliminary look at the characteristics of a data-set before and after correcting for batch effects.
Maintained by Michael Olbrich. Last updated 5 months ago.
batcheffect, microbiome, reportwriting, visualization, normalization, qualitycontrol
16.6 match 4 stars 4.60 score 4 scripts
bioc
mbkmeans: Mini-batch K-means Clustering for Single-Cell RNA-seq
Implements the mini-batch k-means algorithm for large datasets, including support for on-disk data representation.
Maintained by Davide Risso. Last updated 5 months ago.
clustering, geneexpression, rnaseq, software, transcriptomics, sequencing, singlecell, human-cell-atlas, cpp
10.0 match 10 stars 7.41 score 54 scripts 2 dependents
martinloza
Canek: Batch Correction of Single Cell Transcriptome Data
Non-linear/linear hybrid method for batch-effect correction that uses Mutual Nearest Neighbors (MNNs) to identify similar cells between datasets. Reference: Loza M. et al. (NAR Genomics and Bioinformatics, 2020) <doi:10.1093/nargab/lqac022>.
Maintained by Martin Loza. Last updated 1 years ago.
batch-effectsbioinformaticssingle-cell-rna-seqtranscriptomics
14.3 match 5 stars 5.06 score 23 scripts
crunch-io
crunch:Crunch.io Data Tools
The Crunch.io service <https://crunch.io/> provides a cloud-based data store and analytic engine, as well as an intuitive web interface. Using this package, analysts can interact with and manipulate Crunch datasets from within R. Importantly, this allows technical researchers to collaborate naturally with team members, managers, and clients who prefer a point-and-click interface.
Maintained by Greg Freedman Ellis. Last updated 12 days ago.
6.9 match 9 stars 10.53 score 200 scripts 2 dependents
yaoxiangli
cmmr:CEU Mass Mediator RESTful API
CEU (CEU San Pablo University) Mass Mediator is an on-line tool for aiding researchers in performing metabolite annotation. 'cmmr' (CEU Mass Mediator RESTful API) allows for programmatic access in R: batch search, batch advanced search, MS/MS (tandem mass spectrometry) search, etc. For more information about the API Endpoint please go to <https://github.com/YaoxiangLi/cmmr>.
Maintained by Yaoxiang Li. Last updated 5 months ago.
batch-searchceu-mass-mediatormetablomicsms-search
15.1 match 15 stars 4.73 score 12 scripts
ouhscbbmc
REDCapR:Interaction Between R and REDCap
Encapsulates functions to streamline calls from R to the REDCap API. REDCap (Research Electronic Data CAPture) is a web application for building and managing online surveys and databases developed at Vanderbilt University. The Application Programming Interface (API) offers an avenue to access and modify data programmatically, improving the capacity for literate and reproducible programming.
Maintained by Will Beasley. Last updated 2 months ago.
5.7 match 118 stars 12.36 score 438 scripts 6 dependents
bioc
RPA:RPA: Robust Probabilistic Averaging for probe-level analysis
Probabilistic analysis of probe reliability and differential gene expression on short oligonucleotide arrays.
Maintained by Leo Lahti. Last updated 5 months ago.
geneexpressionmicroarraypreprocessingqualitycontrol
12.2 match 5.78 score 20 scripts 1 dependent
mrcieu
ieugwasr:Interface to the 'OpenGWAS' Database API
Interface to the 'OpenGWAS' database API <https://api.opengwas.io/api/>. Includes a wrapper to make generic calls to the API, plus convenience functions for specific queries.
Maintained by Gibran Hemani. Last updated 4 days ago.
6.5 match 89 stars 10.71 score 404 scripts 6 dependents
kharchenkolab
conos:Clustering on Network of Samples
Wires together large collections of single-cell RNA-seq datasets, which allows for both the identification of recurrent cell clusters and the propagation of information between datasets in multi-sample or atlas-scale collections. 'Conos' focuses on the uniform mapping of homologous cell types across heterogeneous sample collections. For instance, users could investigate a collection of dozens of peripheral blood samples from cancer patients combined with dozens of controls, which perhaps includes samples of a related tissue such as lymph nodes. This package interacts with data available through the 'conosPanel' package, which is available in a 'drat' repository. To access this data package, see the instructions at <https://github.com/kharchenkolab/conos>. The size of the 'conosPanel' package is approximately 12 MB.
Maintained by Evan Biederstedt. Last updated 1 year ago.
batch-correctionscrna-seqsingle-cell-rna-seqopenblascppopenmp
9.1 match 204 stars 7.32 score 258 scripts
bioc
NewWave:Negative binomial model for scRNA-seq
A model designed for dimensionality reduction and batch effect removal for scRNA-seq data. It is designed to be massively parallelizable using shared objects that prevent memory duplication, and it can be used with different mini-batch approaches in order to reduce time consumption. It assumes a negative binomial distribution for the data, with a dispersion parameter that can be either common across genes or gene-wise.
Maintained by Federico Agostinis. Last updated 5 months ago.
softwaregeneexpressiontranscriptomicssinglecellbatcheffectsequencingcoverageregressionbatch-effectsdimensionality-reductionnegative-binomialscrna-seq
12.4 match 4 stars 5.33 score 27 scripts
bioc
sccomp:Tests differences in cell-type proportion for single-cell data, robust to outliers
A robust and outlier-aware method for testing differences in cell-type proportion in single-cell data. This model can infer changes in tissue composition and heterogeneity, and can produce realistic data simulations based on any existing dataset. This model can also transfer knowledge from a large set of integrated datasets to increase accuracy further.
Maintained by Stefano Mangiola. Last updated 2 days ago.
bayesianregressiondifferentialexpressionsinglecellmetagenomicsflowcytometryspatialbatch-correctioncompositioncytofdifferential-proportionmicrobiomemultilevelproportionsrandom-effectssingle-cellunwanted-variation
7.5 match 99 stars 8.43 score 69 scripts
paws-r
paws:Amazon Web Services Software Development Kit
Interface to Amazon Web Services <https://aws.amazon.com>, including storage, database, and compute services, such as 'Simple Storage Service' ('S3'), 'DynamoDB' 'NoSQL' database, and 'Lambda' functions-as-a-service.
Maintained by Dyfan Jones. Last updated 5 days ago.
5.6 match 332 stars 11.25 score 177 scripts 12 dependents
berrij
profoc:Probabilistic Forecast Combination Using CRPS Learning
Combine probabilistic forecasts using CRPS learning algorithms proposed in Berrisch, Ziel (2021) <doi:10.48550/arXiv.2102.00968> <doi:10.1016/j.jeconom.2021.11.008>. The package implements multiple online learning algorithms like Bernstein online aggregation; see Wintenberger (2014) <doi:10.48550/arXiv.1404.1356>. Quantile regression is also implemented for comparison purposes. Model parameters can be tuned automatically with respect to the loss of the forecast combination. Methods like predict(), update(), plot() and print() are available for convenience. This package utilizes the optim C++ library for numeric optimization <https://github.com/kthohr/optim>.
Maintained by Jonathan Berrisch. Last updated 6 months ago.
10.8 match 14 stars 5.74 score 13 scripts
bioc
ChemmineR:Cheminformatics Toolkit for R
ChemmineR is a cheminformatics package for analyzing drug-like small molecule data in R. Its latest version contains functions for efficient processing of large numbers of molecules, physicochemical/structural property predictions, structural similarity searching, classification and clustering of compound libraries with a wide spectrum of algorithms. In addition, it offers visualization functions for compound clustering results and chemical structures.
Maintained by Thomas Girke. Last updated 5 months ago.
cheminformaticsbiomedicalinformaticspharmacogeneticspharmacogenomicsmicrotitreplateassaycellbasedassaysvisualizationinfrastructuredataimportclusteringproteomicsmetabolomicscpp
6.5 match 14 stars 9.42 score 253 scripts 12 dependents
bioc
scMerge:scMerge: Merging multiple batches of scRNA-seq data
Like all gene expression data, single-cell data suffers from batch effects and other unwanted variations that make accurate biological interpretations difficult. The scMerge method leverages factor analysis, stably expressed genes (SEGs) and (pseudo-) replicates to remove unwanted variations and merge multiple single-cell datasets. This package contains all the necessary functions in the scMerge pipeline, including the identification of SEGs, replication-identification methods, and merging of single-cell data.
Maintained by Yingxin Lin. Last updated 5 months ago.
batcheffectgeneexpressionnormalizationrnaseqsequencingsinglecellsoftwaretranscriptomicsbioinformaticssingle-cell
6.4 match 67 stars 9.52 score 137 scripts 1 dependent
sciurus365
simlandr:Simulation-Based Landscape Construction for Dynamical Systems
A toolbox for constructing potential landscapes for dynamical systems using Monte Carlo simulation. The method is based on the potential landscape definition by Wang et al. (2008) <doi:10.1073/pnas.0800579105> (also see Zhou & Li, 2016 <doi:10.1063/1.4943096> for further mathematical discussions) and can be used for a large variety of models.
Maintained by Jingmeng Cui. Last updated 1 month ago.
9.4 match 6 stars 6.41 score 12 scripts 2 dependents
mikejareds
hermiter:Efficient Sequential and Batch Estimation of Univariate and Bivariate Probability Density Functions and Cumulative Distribution Functions along with Quantiles (Univariate) and Nonparametric Correlation (Bivariate)
Facilitates estimation of full univariate and bivariate probability density functions and cumulative distribution functions along with full quantile functions (univariate) and nonparametric correlation (bivariate) using Hermite series based estimators. These estimators are particularly useful in the sequential setting (both stationary and non-stationary) and one-pass batch estimation setting for large data sets. Based on: Stephanou, Michael, Varughese, Melvin and Macdonald, Iain. "Sequential quantiles via Hermite series density estimation." Electronic Journal of Statistics 11.1 (2017): 570-607 <doi:10.1214/17-EJS1245>, Stephanou, Michael and Varughese, Melvin. "On the properties of Hermite series based distribution function estimators." Metrika (2020) <doi:10.1007/s00184-020-00785-z> and Stephanou, Michael and Varughese, Melvin. "Sequential estimation of Spearman rank correlation using Hermite series estimators." Journal of Multivariate Analysis (2021) <doi:10.1016/j.jmva.2021.104783>.
Maintained by Michael Stephanou. Last updated 7 months ago.
cumulative-distribution-functionkendall-correlation-coefficientonline-algorithmsprobability-density-functionquantilespearman-correlation-coefficientstatisticsstreaming-algorithmsstreaming-datacpp
10.7 match 15 stars 5.58 score 17 scripts
bioc
CellMixS:Evaluate Cellspecific Mixing
CellMixS provides metrics and functions to evaluate batch effects, data integration and batch effect correction in single cell transcriptome data with single cell resolution. Results can be visualized and summarised on different levels, e.g. on cell, celltype or dataset level.
Maintained by Almut Lütge. Last updated 5 months ago.
singlecelltranscriptomicsgeneexpressionbatcheffect
9.4 match 7 stars 6.35 score 64 scripts
bioc
pmp:Peak Matrix Processing and signal batch correction for metabolomics datasets
Methods and tools for (pre-)processing of metabolomics datasets (i.e. peak matrices), including filtering, normalisation, missing value imputation, scaling, and signal drift and batch effect correction methods. Filtering methods are based on: the fraction of missing values (across samples or features); Relative Standard Deviation (RSD) calculated from the Quality Control (QC) samples; the blank samples. Normalisation methods include Probabilistic Quotient Normalisation (PQN) and normalisation to total signal intensity. A unified user interface for several commonly used missing value imputation algorithms is also provided. Supported methods are: k-nearest neighbours (knn), random forests (rf), Bayesian PCA missing value estimator (bpca), mean or median value of the given feature and a constant small value. The generalised logarithm (glog) transformation algorithm is available to stabilise the variance across low and high intensity mass spectral features. Finally, this package provides an implementation of the Quality Control-Robust Spline Correction (QCRSC) algorithm for signal drift and batch effect correction of mass spectrometry-based datasets.
Maintained by Gavin Rhys Lloyd. Last updated 5 months ago.
massspectrometrymetabolomicssoftwarequalitycontrolbatcheffect
12.6 match 4.60 score 33 scripts
bioc
BUScorrect:Batch Effects Correction with Unknown Subtypes
High-throughput experimental data are accumulating exponentially in public databases. However, mining valid scientific discoveries from these abundant resources is hampered by technical artifacts and inherent biological heterogeneity. The former are usually termed "batch effects," and the latter is often modelled by "subtypes." The R package BUScorrect fits a Bayesian hierarchical model, the Batch-effects-correction-with-Unknown-Subtypes model (BUS), to correct batch effects in the presence of unknown subtypes. BUS is capable of (a) correcting batch effects explicitly, (b) grouping samples that share similar characteristics into subtypes, (c) identifying features that distinguish subtypes, and (d) achieving linear-order computational complexity.
Maintained by Xiangyu Luo. Last updated 5 months ago.
geneexpressionstatisticalmethodbayesianclusteringfeatureextractionbatcheffect
14.4 match 4.00 score 2 scripts
bioc
MetaCyto:MetaCyto: A package for meta-analysis of cytometry data
This package provides functions for preprocessing, automated gating and meta-analysis of cytometry data. It also provides functions that facilitate the collection of cytometry data from the ImmPort database.
Maintained by Zicheng Hu. Last updated 5 months ago.
immunooncologycellbiologyflowcytometryclusteringstatisticalmethodsoftwarecellbasedassayspreprocessing
11.9 match 4.73 score 18 scripts
paws-r
paws.compute:'Amazon Web Services' Compute Services
Interface to 'Amazon Web Services' compute services, including 'Elastic Compute Cloud' ('EC2'), 'Lambda' functions-as-a-service, containers, batch processing, and more <https://aws.amazon.com/>.
Maintained by Dyfan Jones. Last updated 5 days ago.
6.1 match 332 stars 9.12 score 16 dependents
rezakj
iCellR:Analyzing High-Throughput Single Cell Sequencing Data
A toolkit that allows scientists to work with data from single cell sequencing technologies such as scRNA-seq, scVDJ-seq, scATAC-seq, CITE-Seq and Spatial Transcriptomics (ST). Single (i) Cell R package ('iCellR') provides unprecedented flexibility at every step of the analysis pipeline, including normalization, clustering, dimensionality reduction, imputation, visualization, and so on. Users can design both unsupervised and supervised models to best suit their research. In addition, the toolkit provides 2D and 3D interactive visualizations, differential expression analysis, filters based on cells, genes and clusters, data merging, normalizing for dropouts, data imputation methods, correcting for batch differences, pathway analysis, tools to find marker genes for clusters and conditions, predict cell types and pseudotime analysis. See Khodadadi-Jamayran, et al (2020) <doi:10.1101/2020.05.05.078550> and Khodadadi-Jamayran, et al (2020) <doi:10.1101/2020.03.31.019109> for more details.
Maintained by Alireza Khodadadi-Jamayran. Last updated 8 months ago.
10xgenomics3dbatch-normalizationcell-type-classificationcite-seqclusteringclustering-algorithmdiffusion-mapsdropouticellrimputationintractive-graphnormalizationpseudotimescrna-seqscvdj-seqsingel-cell-sequencingumapcpp
9.9 match 121 stars 5.56 score 7 scripts 1 dependent
bioc
onlineFDR:Online error rate control
This package allows users to control the false discovery rate (FDR) or familywise error rate (FWER) for online multiple hypothesis testing, where hypotheses arrive in a stream. In this framework, a null hypothesis is rejected based on the evidence against it and on the previous rejection decisions.
Maintained by David S. Robertson. Last updated 5 months ago.
multiplecomparisonsoftwarestatisticalmethoderror-rate-controlfdrfwerhypothesis-testingcpp
7.9 match 14 stars 6.88 score 26 scripts
sciviews
svMisc:Miscellaneous Functions for 'SciViews::R'
Functions required for the 'SciViews::R' dialect or for general use: manage a temporary environment attached to the search path, define synonyms for R functions using aka(), test whether the platform is 'Aqua', 'Mac', 'Win', ..., show a progress bar, etc.
Maintained by Philippe Grosjean. Last updated 4 months ago.
6.4 match 3 stars 8.35 score 380 scripts 16 dependents
bioc
oligoClasses:Classes for high-throughput arrays supported by oligo and crlmm
This package contains class definitions, validity checks, and initialization methods for classes used by the oligo and crlmm packages.
Maintained by Benilton Carvalho. Last updated 5 months ago.
8.8 match 5.85 score 93 scripts 17 dependents
piusdahinden
expirest:Expiry Estimation Procedures
The Australian Regulatory Guidelines for Prescription Medicines (ARGPM), guidance on "Stability testing for prescription medicines", recommends predicting the shelf life of chemically derived medicines from stability data by taking the worst-case situation at batch release into account. Consequently, if a change over time is observed, a release limit needs to be specified. Finding a release limit and the associated shelf life is supported, as well as the standard approach that is recommended by guidance Q1E "Evaluation of stability data" from the International Council for Harmonisation (ICH).
Maintained by Pius Dahinden. Last updated 21 days ago.
15.0 match 3.40 score 6 scripts
bioc
Xeva:Analysis of patient-derived xenograft (PDX) data
The Xeva package provides efficient and powerful functions for patient-derived xenograft (PDX) based pharmacogenomic data analysis. This package contains a set of functions to perform analysis of patient-derived xenograft data. This package was developed by the BHKLab; for further information, please see our documentation.
Maintained by Benjamin Haibe-Kains. Last updated 4 months ago.
geneexpressionpharmacogeneticspharmacogenomicssoftwareclassification
8.0 match 11 stars 6.35 score 17 scripts
tianshu129
foqat:Field Observation Quick Analysis Toolkit
Tools for quickly processing and analyzing field observation data and air quality data. These tools contain functions that facilitate analysis in atmospheric chemistry (especially in ozone pollution). Some time series functions are also applicable to other fields. For details, please see the homepage <https://github.com/tianshu129/foqat>. Scientific Reference: 1. The Hydroxyl Radical (OH) Reactivity: Roger Atkinson and Janet Arey (2003) <doi:10.1021/cr0206420>. 2. Ozone Formation Potential (OFP): <https://ww2.arb.ca.gov/sites/default/files/classic/regact/2009/mir2009/mir10.pdf>, Zhang et al. (2021) <doi:10.5194/acp-21-11053-2021>. 3. Aerosol Formation Potential (AFP): Wenjing Wu et al. (2016) <doi:10.1016/j.jes.2016.03.025>. 4. TUV model: <https://www2.acom.ucar.edu/modeling/tropospheric-ultraviolet-and-visible-tuv-radiation-model>.
Maintained by Tianshu Chen. Last updated 6 months ago.
air-pollutionair-qualityair-quality-dataair-quality-measurementsair-quality-monitorair-quality-reportsair-quality-sensoratmospheric-chemistryatmospheric-modellingatmospheric-sciencedaily-maximum-8-hour-ozonefield-observationmirofpozone-formation-potentialphotolysis-rate-coefficientstime-seriestime-series-analysistuv
11.1 match 35 stars 4.54 score 20 scripts
ropensci
tacmagic:Positron Emission Tomography Time-Activity Curve Analysis
To facilitate the analysis of positron emission tomography (PET) time activity curve (TAC) data, and to encourage open science and replicability, this package supports data loading and analysis of multiple TAC file formats. Functions are available to analyze loaded TAC data for individual participants or in batches. Major functionality includes weighted TAC merging by region of interest (ROI), calculating models including standardized uptake value ratio (SUVR) and distribution volume ratio (DVR, Logan et al. 1996 <doi:10.1097/00004647-199609000-00008>), basic plotting functions and calculation of cut-off values (Aizenstein et al. 2008 <doi:10.1001/archneur.65.11.1509>). Please see the walkthrough vignette for a detailed overview of 'tacmagic' functions.
Maintained by Eric Brown. Last updated 5 years ago.
mrineuroimagingneuroscienceneuroscience-methodspetpet-mrpositronpositron-emission-tomographystatistics
10.3 match 5 stars 4.76 score 23 scripts
avi-kenny
SimEngine:A Modular Framework for Statistical Simulations in R
An open-source R package for structuring, maintaining, running, and debugging statistical simulations on both local and cluster-based computing environments. See full documentation at <https://avi-kenny.github.io/SimEngine/>.
Maintained by Avi Kenny. Last updated 24 days ago.
6.4 match 12 stars 7.18 score 50 scripts
michaelhallquist
MplusAutomation:An R Package for Facilitating Large-Scale Latent Variable Analyses in Mplus
Leverages the R language to automate latent variable model estimation and interpretation using 'Mplus', a powerful latent variable modeling program developed by Muthen and Muthen (<https://www.statmodel.com>). Specifically, this package provides routines for creating related groups of models, running batches of models, and extracting and tabulating model parameters and fit statistics.
Maintained by Michael Hallquist. Last updated 2 months ago.
3.6 match 86 stars 12.96 score 664 scripts 13 dependents
mlr-org
mlr3torch:Deep Learning with 'mlr3'
Deep Learning library that extends the mlr3 framework by building upon the 'torch' package. It allows users to conveniently build, train, and evaluate deep learning models without having to worry about low-level details. Custom architectures can be created using the graph language defined in 'mlr3pipelines'.
Maintained by Sebastian Fischer. Last updated 1 month ago.
data-sciencedeep-learningmachine-learningmlr3torch
6.0 match 42 stars 7.63 score 78 scripts
tengfei-emory
QuantNorm:Mitigating the Adverse Impact of Batch Effects in Sample Pattern Detection
Modifies the distance matrix obtained from data with batch effects, so as to improve the performance of sample pattern detection, such as clustering, dimension reduction, and construction of networks between subjects. The method has been published in Bioinformatics (Fei et al, 2018, <doi:10.1093/bioinformatics/bty117>). Also available on 'GitHub' <https://github.com/tengfei-emory/QuantNorm>.
Maintained by Teng Fei. Last updated 5 years ago.
12.5 match 9 stars 3.65 score 9 scripts
ohdsi
Andromeda:Asynchronous Disk-Based Representation of Massive Data
Storing very large data objects on a local drive, while still making it possible to manipulate the data in an efficient manner.
Maintained by Martijn Schuemie. Last updated 7 months ago.
4.9 match 11 stars 9.18 score 57 scripts 7 dependents
mmaechler
cluster:"Finding Groups in Data": Cluster Analysis Extended Rousseeuw et al.
Methods for Cluster analysis. Much extended the original from Peter Rousseeuw, Anja Struyf and Mia Hubert, based on Kaufman and Rousseeuw (1990) "Finding Groups in Data".
Maintained by Martin Maechler. Last updated 5 days ago.
3.8 match 3 stars 11.98 score 14k scripts 2.2k dependents
us-bea
bea.R:Bureau of Economic Analysis API
Provides an R interface for the Bureau of Economic Analysis (BEA) API (see <http://www.bea.gov/API/bea_web_service_api_user_guide.htm> for more information) that serves two core purposes - 1. To Extract/Transform/Load data [beaGet()] from the BEA API as R-friendly formats in the user's work space [transformation done by default in beaGet() can be modified using optional parameters; see, too, bea2List(), bea2Tab()]. 2. To enable the search of descriptive meta data [beaSearch()]. Other features of the library exist mainly as intermediate methods or are in early stages of development. Important Note - You must have an API key to use this library. Register for a key at <http://www.bea.gov/API/signup/index.cfm>.
Maintained by Andrea Batch. Last updated 2 months ago.
9.2 match 118 stars 4.77 score
bioc
sva:Surrogate Variable Analysis
The sva package contains functions for removing batch effects and other unwanted variation in high-throughput experiments. Specifically, the sva package contains functions for identifying and building surrogate variables for high-dimensional data sets. Surrogate variables are covariates constructed directly from high-dimensional data (like gene expression/RNA sequencing/methylation/brain imaging data) that can be used in subsequent analyses to adjust for unknown, unmodeled, or latent sources of noise. The sva package can be used to remove artifacts in three ways: (1) identifying and estimating surrogate variables for unknown sources of variation in high-throughput experiments (Leek and Storey 2007 PLoS Genetics, 2008 PNAS), (2) directly removing known batch effects using ComBat (Johnson et al. 2007 Biostatistics) and (3) removing batch effects with known control probes (Leek 2014 bioRxiv). Removing batch effects and using surrogate variables in differential expression analysis have been shown to reduce dependence, stabilize error rate estimates, and improve reproducibility; see Leek and Storey 2007 PLoS Genetics, 2008 PNAS, or Leek et al. 2011 Nat. Reviews Genetics.
Maintained by Jeffrey T. Leek. Last updated 5 months ago.
immunooncologymicroarraystatisticalmethodpreprocessingmultiplecomparisonsequencingrnaseqbatcheffectnormalization
4.3 match 10.05 score 3.2k scripts 50 dependents
yufree
enviGCMS:GC/LC-MS Data Analysis for Environmental Science
Gas/Liquid Chromatography-Mass Spectrometer (GC/LC-MS) Data Analysis for Environmental Science. This package covers topics such as molecular isotope ratios, matrix effects, and Short-Chain Chlorinated Paraffins analysis in environmental analysis.
Maintained by Miao YU. Last updated 2 months ago.
environmentmass-spectrometrymetabolomics
6.7 match 17 stars 6.49 score 30 scripts 1 dependent
cvasi-tktd
cvasi:Calibration, Validation, and Simulation of TKTD Models
Eases the use of ecotoxicological effect models. Can simulate common toxicokinetic-toxicodynamic (TK/TD) models such as General Unified Threshold models of Survival (GUTS) and Lemna. It can derive effects and effect profiles (EPx) from scenarios. It supports the use of 'tidyr' workflows employing the pipe symbol. Time-consuming tasks can be parallelized.
Maintained by Nils Kehrein. Last updated 5 days ago.
ecotoxicologymodelingsimulation
6.8 match 2 stars 6.26 score 12 scripts
jessecambon
tidygeocoder:Geocoding Made Easy
An intuitive interface for getting data from geocoding services.
Maintained by Jesse Cambon. Last updated 4 months ago.
3.8 match 287 stars 11.35 score 1.0k scripts 9 dependents
cbailiss
pivottabler:Create Pivot Tables
Create regular pivot tables with just a few lines of R. More complex pivot tables can also be created, e.g. pivot tables with irregular layouts, multiple calculations and/or derived calculations based on multiple data frames. Pivot tables are constructed using R only and can be written to a range of output formats (plain text, 'HTML', 'Latex' and 'Excel'), including with styling/formatting.
Maintained by Christopher Bailiss. Last updated 1 year ago.
calculationshtmlhtmlwidgetlatexpivot-tablesvisualization
5.2 match 122 stars 8.08 score 358 scripts 1 dependent
bioc
TCGAbiolinks:TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data
TCGAbiolinks aims to: i) facilitate GDC open-access data retrieval, ii) prepare the data using appropriate pre-processing strategies, iii) provide the means to carry out different standard analyses, and iv) easily reproduce earlier research results. In more detail, the package provides multiple methods for analysis (e.g., differential expression analysis, identifying differentially methylated regions) and methods for visualization (e.g., survival plots, volcano plots, starburst plots) in order to easily develop complete analysis pipelines.
Maintained by Tiago Chedraoui Silva. Last updated 28 days ago.
dnamethylationdifferentialmethylationgeneregulationgeneexpressionmethylationarraydifferentialexpressionpathwaysnetworksequencingsurvivalsoftwarebiocbioconductorgdcintegrative-analysistcgatcga-datatcgabiolinks
2.8 match 305 stars 14.45 score 1.6k scripts 6 dependents
zdk123
pulsar:Parallel Utilities for Lambda Selection along a Regularization Path
Model selection for penalized graphical models using the Stability Approach to Regularization Selection ('StARS'), with options for speed-ups including Bounded StARS (B-StARS), batch computing, and other stability metrics (e.g., graphlet stability G-StARS). Christian L. Müller, Richard Bonneau, Zachary Kurtz (2016) <arXiv:1605.07072>.
Maintained by Zachary Kurtz. Last updated 1 year ago.
6.5 match 10 stars 6.16 score 65 scripts
bioc
scMultiSim:Simulation of Multi-Modality Single Cell Data Guided By Gene Regulatory Networks and Cell-Cell Interactions
scMultiSim simulates paired single cell RNA-seq, single cell ATAC-seq and RNA velocity data, while incorporating mechanisms of gene regulatory networks, chromatin accessibility and cell-cell interactions. It allows users to tune various parameters controlling the amount of each biological factor, variation of gene-expression levels, the influence of chromatin accessibility on RNA sequence data, and so on. It can be used to benchmark various computational methods for single cell multi-omics data, and to assist in experimental design of wet-lab experiments.
Maintained by Hechen Li. Last updated 5 months ago.
singlecelltranscriptomicsgeneexpressionsequencingexperimentaldesign
5.6 match 23 stars 7.15 score 11 scripts
ropensci
workloopR:Analysis of Work Loops and Other Data from Muscle Physiology Experiments
Functions for the import, transformation, and analysis of data from muscle physiology experiments. The work loop technique is used to evaluate the mechanical work and power output of muscle. Josephson (1985) <doi:10.1242/jeb.114.1.493> modernized the technique for application in comparative biomechanics. Although our initial motivation was to provide functions to analyze work loop experiment data, as we developed the package we incorporated the ability to analyze data from experiments that are often complementary to work loops. There are currently three supported experiment types: work loops, simple twitches, and tetanus trials. Data can be imported directly from .ddf files or via an object constructor function. Through either method, data can then be cleaned or transformed via methods typically used in studies of muscle physiology. Data can then be analyzed to determine the timing and magnitude of force development and relaxation (for isometric trials) or the magnitude of work, net power, and instantaneous power among other things (for work loops). Although we do not provide plotting functions, all resultant objects are designed to be friendly to visualization via either base-R plotting or 'tidyverse' functions. This package has been peer-reviewed by rOpenSci (v. 1.1.0).
Maintained by Vikram B. Baliga. Last updated 8 months ago.
ddfmuscle-forcemuscle-physiology-experimentstetanuswork-loopworkloop
6.6 match 3 stars 5.92 score 46 scripts
cran
zoomGroupStats:Analyze Text, Audio, and Video from 'Zoom' Meetings
Provides utilities for processing and analyzing the files that are exported from a recorded 'Zoom' Meeting. This includes analyzing data captured through video cameras and microphones, the text-based chat, and meta-data. You can analyze aspects of the conversation among meeting participants and their emotional expressions throughout the meeting.
Maintained by Andrew Knight. Last updated 4 years ago.
11.7 match 3.30 score 10 scripts
cyclestreets
cyclestreets:Cycle Routing and Data for Cycling Advocacy
An interface to the cycle routing/data services provided by 'CycleStreets', a not-for-profit social enterprise and advocacy organisation. The application programming interfaces (APIs) provided by 'CycleStreets' are documented at <https://www.cyclestreets.net/api/>. The focus of this package is the journey planning API, which aims to emulate the routes taken by a knowledgeable cyclist. An innovative feature of the routing service is its provision of fastest, quietest and balanced profiles. These represent routes taken to minimise time, avoid traffic and compromise between the two, respectively.
Maintained by Robin Lovelace. Last updated 3 months ago.
cyclingroutingtransporttransportation-planning
6.8 match 27 stars 5.62 score 31 scripts
bioc
randRotation:Random Rotation Methods for High Dimensional Data with Batch Structure
A collection of methods for performing random rotations on high-dimensional, normally distributed data (e.g. microarray or RNA-seq data) with batch structure. The random rotation approach allows exact testing of dependent test statistics with linear models following arbitrary batch effect correction methods.
Maintained by Peter Hettegger. Last updated 5 months ago.
softwaresequencingbatcheffectbiomedicalinformaticsrnaseqpreprocessingmicroarraydifferentialexpressiongeneexpressiongeneticsmicrornaarraynormalizationstatisticalmethod
10.5 match 3.60 score 3 scripts
bioc
ChAMP:Chip Analysis Methylation Pipeline for Illumina HumanMethylation450 and EPIC
The package includes quality control metrics, a selection of normalization methods and novel methods to identify differentially methylated regions and to highlight copy number alterations.
Maintained by Yuan Tian. Last updated 5 months ago.
microarraymethylationarraynormalizationtwochannelcopynumberdnamethylation
5.6 match 6.54 score 278 scripts
cristianetaniguti
onemap:Construction of Genetic Maps in Experimental Crosses
Analysis of molecular marker data from model (backcrosses, F2 and recombinant inbred lines) and non-model systems (i.e. outcrossing species). For the latter, it allows statistical analysis by simultaneously estimating linkage and linkage phases (genetic map construction) according to Wu et al. (2002) <doi:10.1006/tpbi.2002.1577>. All analyses are based on multipoint approaches using hidden Markov models.
Maintained by Cristiane Taniguti. Last updated 2 months ago.
5.5 match 3 stars 6.58 score 183 scripts
rqtl
qtl2:Quantitative Trait Locus Mapping in Experimental Crosses
Provides a set of tools to perform quantitative trait locus (QTL) analysis in experimental crosses. It is a reimplementation of the 'R/qtl' package to better handle high-dimensional data and complex cross designs. Broman et al. (2019) <doi:10.1534/genetics.118.301595>.
Maintained by Karl W Broman. Last updated 9 days ago.
3.8 match 34 stars 9.48 score 1.1k scripts 5 dependents
ropensci
pathviewr:Wrangle, Analyze, and Visualize Animal Movement Data
Tools to import, clean, and visualize movement data, particularly from motion capture systems such as Optitrack's 'Motive', the Straw Lab's 'Flydra', or from other sources. We provide functions to remove artifacts, standardize tunnel position and tunnel axes, select a region of interest, isolate specific trajectories, fill gaps in trajectory data, and calculate 3D and per-axis velocity. For experiments of visual guidance, we also provide functions that use subject position to estimate perception of visual stimuli.
Maintained by Vikram B. Baliga. Last updated 2 years ago.
animal-movementflydramotionmovement-dataoptitracktrajectoriestrajectory-analysisvisual-guidancevisual-perception
5.5 match 8 stars 6.56 score 102 scripts
bioc
gwasurvivr:gwasurvivr: an R package for genome wide survival analysis
gwasurvivr is a package to perform survival analysis using Cox proportional hazard models on imputed genetic data.
Maintained by Abbas Rizvi. Last updated 5 months ago.
genomewideassociationsurvivalregressiongeneticssnpgeneticvariabilitypharmacogenomicsbiomedicalinformatics
5.5 match 12 stars 6.43 score 75 scripts
bioc
Harman:The removal of batch effects from datasets using a PCA and constrained optimisation based technique
Harman is a PCA and constrained optimisation based technique that maximises the removal of batch effects from datasets, with the constraint that the probability of overcorrection (i.e. removing genuine biological signal along with batch noise) is kept to a fraction which is set by the end-user.
Maintained by Jason Ross. Last updated 5 months ago.
batcheffectmicroarraymultiplecomparisonprincipalcomponentnormalizationpreprocessingdnamethylationtranscriptionsoftwarestatisticalmethodcpp
7.1 match 4.97 score 31 scripts 1 dependents
dylanpieper
batchLLM:Batch Process LLM Text Completions Using a Data Frame
Batch process large language model (LLM) text completions using data frame rows, with support for OpenAI's 'GPT' (<https://chat.openai.com>), Anthropic's 'Claude' (<https://claude.ai>), and Google's 'Gemini' (<https://gemini.google.com>). Includes features such as local storage, metadata logging, API rate limiting delays, and a 'shiny' app addin.
Maintained by Dylan Pieper. Last updated 1 month ago.
7.3 match 11 stars 4.85 score 6 scripts
csafe-isu
handwriter:Handwriting Analysis in R
Perform statistical writership analysis of scanned handwritten documents. Webpage provided at: <https://github.com/CSAFE-ISU/handwriter>.
Maintained by Stephanie Reinders. Last updated 1 month ago.
4.0 match 24 stars 8.70 score 27 scripts 2 dependents
r-dbi
DBI:R Database Interface
A database interface definition for communication between R and relational database management systems. All classes in this package are virtual and need to be extended by the various R/DBMS implementations.
Maintained by Kirill Müller. Last updated 3 months ago.
1.7 match 302 stars 20.88 score 19k scripts 2.9k dependents
bioc
MultiBaC:Multiomic Batch effect Correction
MultiBaC is a strategy to correct batch effects from multiomic datasets distributed across different labs or data acquisition events. MultiBaC is the first batch effect correction algorithm that deals with batch effects in multiomic datasets. MultiBaC is able to remove batch effects across different omics generated within separate batches provided that at least one common omic data type is included in all the batches considered.
Maintained by The package maintainer. Last updated 5 months ago.
softwarestatisticalmethodprincipalcomponentdatarepresentationgeneexpressiontranscriptionbatcheffect
10.5 match 3.30 score 7 scripts
mlr-org
mlr3batchmark:Batch Experiments for 'mlr3'
Extends the 'mlr3' package with a connector to the package 'batchtools'. This allows running large-scale benchmark experiments on scheduled high-performance computing clusters.
Maintained by Marc Becker. Last updated 1 year ago.
batchtoolscluster-computinghigh-performance-computinghpcmlr3
7.1 match 5 stars 4.85 score 57 scripts
neurodata
causalBatch:Causal Batch Effects
Software which provides numerous functionalities for detecting and removing group-level effects from high-dimensional scientific data which, when combined with additional assumptions, allow for causal conclusions, as described in our manuscripts Bridgeford et al. (2024) <doi:10.1101/2021.09.03.458920> and Bridgeford et al. (2023) <doi:10.48550/arXiv.2307.13868>. Also provides a number of useful utilities for generating simulations and balancing covariates across multiple groups/batches of data via matching and propensity trimming for more than two groups.
Maintained by Eric W. Bridgeford. Last updated 4 days ago.
7.3 match 4 stars 4.70 score 23 scripts
cran
ELISAtools:ELISA Data Analysis with Batch Correction
Runs data analysis for enzyme-linked immunosorbent assays (ELISAs). Either the five- or four-parameter logistic model is fitted to the data of a single ELISA. Moreover, batch effect correction/normalization is carried out when there is more than one batch of ELISAs. Feng (2018) <doi:10.1101/483800>.
Maintained by Feng Feng. Last updated 4 years ago.
10.4 match 1 stars 3.29 score 39 scripts
thej022214
corHMM:Hidden Markov Models of Character Evolution
Fits hidden Markov models of discrete character evolution which allow different transition rate classes on different portions of a phylogeny. Beaulieu et al (2013) <doi:10.1093/sysbio/syt034>.
Maintained by Jeremy Beaulieu. Last updated 28 days ago.
3.6 match 12 stars 9.48 score 422 scripts 2 dependents
cjbarrie
academictwitteR:Access the Twitter Academic Research Product Track V2 API Endpoint
Package to query the Twitter Academic Research Product Track, providing access to full-archive search and other v2 API endpoints. Functions are written with academic research in mind. They provide flexibility in how the user wishes to store collected data, and encourage regular storage of data to mitigate loss when collecting large volumes of tweets. They also provide workarounds to manage and reshape the format in which data is provided on the client side.
Maintained by Christopher Barrie. Last updated 2 years ago.
3.8 match 275 stars 8.94 score 177 scripts
rstudio
tfprobability:Interface to 'TensorFlow Probability'
Interface to 'TensorFlow Probability', a 'Python' library built on 'TensorFlow' that makes it easy to combine probabilistic models and deep learning on modern hardware ('TPU', 'GPU'). 'TensorFlow Probability' includes a wide selection of probability distributions and bijectors, probabilistic layers, variational inference, Markov chain Monte Carlo, and optimizers such as Nelder-Mead, BFGS, and SGLD.
Maintained by Tomasz Kalinowski. Last updated 3 years ago.
3.9 match 54 stars 8.63 score 221 scripts 3 dependents
cran
agena.ai:R Wrapper for 'agena.ai' API
An R wrapper for 'agena.ai' <https://www.agena.ai> which provides users capabilities to work with 'agena.ai' using the R environment. Users can create Bayesian network models from scratch or import existing models in R and export to 'agena.ai' cloud or local API for calculations. Note: running calculations requires a valid 'agena.ai' API license (past the initial trial period of the local API).
Maintained by Eugene Dementiev. Last updated 1 year ago.
9.3 match 3.54 score
bioc
BiocParallel:Bioconductor facilities for parallel evaluation
This package provides modified versions and novel implementations of functions for parallel evaluation, tailored to use with Bioconductor objects.
Maintained by Martin Morgan. Last updated 27 days ago.
infrastructurebioconductor-packagecore-packageu24ca289073cpp
1.9 match 67 stars 17.40 score 7.3k scripts 1.1k dependents
bioc
BUSseq:Batch Effect Correction with Unknown Subtypes for scRNA-seq data
The BUSseq R package fits an interpretable Bayesian hierarchical model---the Batch Effects Correction with Unknown Subtypes for scRNA-seq Data (BUSseq)---to correct batch effects in the presence of unknown cell types. BUSseq is able to simultaneously correct batch effects, cluster cell types, and take care of the count data nature, the overdispersion, the dropout events, and the cell-specific sequencing depth of scRNA-seq data. After correcting the batch effects with BUSseq, the corrected values can be used for downstream analysis as if all cells were sequenced in a single batch. BUSseq can integrate read count matrices obtained from different scRNA-seq platforms and allows cell types to be measured in some but not all of the batches as long as the experimental design fulfills the conditions listed in our manuscript.
Maintained by Fangda Song. Last updated 5 months ago.
experimentaldesigngeneexpressionstatisticalmethodbayesianclusteringfeatureextractionbatcheffectsinglecellsequencingcppopenmp
7.3 match 4.48 score 30 scripts
cmstatr
cmstatr:Statistical Methods for Composite Material Data
An implementation of the statistical methods commonly used for advanced composite materials in aerospace applications. This package focuses on calculating basis values (lower tolerance bounds) for material strength properties, as well as performing the associated diagnostic tests. This package provides functions for calculating basis values assuming several different distributions, as well as providing functions for non-parametric methods of computing basis values. Functions are also provided for testing the hypothesis that there is no difference between strength and modulus data from an alternate sample and that from a "qualification" or "baseline" sample. For a discussion of these statistical methods and their use, see the Composite Materials Handbook, Volume 1 (2012, ISBN: 978-0-7680-7811-4). Additional details about this package are available in the paper by Kloppenborg (2020, <doi:10.21105/joss.02265>).
Maintained by Stefan Kloppenborg. Last updated 4 months ago.
composite-material-datadatamaterials-sciencestatistical-analysisstatistics
5.2 match 4 stars 6.26 score 23 scripts
rorynolan
detrendr:Detrend Images
Detrend fluorescence microscopy image series for fluorescence fluctuation and correlation spectroscopy ('FCS' and 'FFS') analysis. This package contains functionality published in a 2016 paper <doi:10.1093/bioinformatics/btx434> but it has been extended since then with the Robin Hood algorithm and thus contains unpublished work.
Maintained by Rory Nolan. Last updated 2 months ago.
5.3 match 3 stars 6.08 score 25 scripts 1 dependents
bioc
msmsEDA:Exploratory Data Analysis of LC-MS/MS data by spectral counts
Exploratory data analysis to assess the quality of a set of LC-MS/MS experiments, and visualize the influence of the involved factors.
Maintained by Josep Gregori. Last updated 5 months ago.
immunooncologysoftwaremassspectrometryproteomics
7.1 match 4.38 score 4 scripts 2 dependents
tuomonieminen
read.gt3x:Parse 'ActiGraph' 'GT3X'/'GT3X+' 'Accelerometer' Data
Implements a high-performance C++ parser for the 'ActiGraph' 'GT3X'/'GT3X+' data format (with extension '.gt3x') for 'accelerometer' samples. Activity samples can be easily read into a matrix or data.frame. This allows for storing the raw 'accelerometer' samples in the original binary format to conserve space.
Maintained by Tuomo Nieminen. Last updated 3 years ago.
7.0 match 4.46 score 24 scripts 4 dependents
bioc
pvca:Principal Variance Component Analysis (PVCA)
This package contains the function to assess the batch sources by fitting all "sources" as random effects, including two-way interaction terms, in the mixed model (depends on the lme4 package) to selected principal components, which were obtained from the original data correlation matrix. This package accompanies the book "Batch Effects and Noise in Microarray Experiments", chapter 12.
Maintained by Jianying LI. Last updated 5 months ago.
5.4 match 5.67 score 111 scripts 1 dependents
mlampros
SuperpixelImageSegmentation:Superpixel Image Segmentation
Image Segmentation using Superpixels, Affinity Propagation and Kmeans Clustering. The R code is based primarily on the article "Image Segmentation using SLIC Superpixels and Affinity Propagation Clustering, Bao Zhou, International Journal of Science and Research (IJSR), 2013" <https://www.ijsr.net/archive/v4i4/SUB152869.pdf>.
Maintained by Lampros Mouselimis. Last updated 2 years ago.
affinity-propagationkmeansmini-batch-kmeansslicsuperpixelsopenblascppopenmp
6.7 match 18 stars 4.61 score 15 scripts 1 dependents
allenzhuaz
PPQplan:Process Performance Qualification (PPQ) Plans in Chemistry, Manufacturing and Controls (CMC) Statistical Analysis
Assessment for statistically-based PPQ sampling plan, including calculating the passing probability, optimizing the baseline and high performance cutoff points, visualizing the PPQ plan and power dynamically. The analytical idea is based on the simulation methods from the textbook Burdick, R. K., LeBlond, D. J., Pfahler, L. B., Quiroz, J., Sidor, L., Vukovinsky, K., & Zhang, L. (2017). Statistical Methods for CMC Applications. In Statistical Applications for Chemistry, Manufacturing and Controls (CMC) in the Pharmaceutical Industry (pp. 227-250). Springer, Cham.
Maintained by Yalin Zhu. Last updated 3 years ago.
biostatisticspharmaceuticalssampling-methods
7.4 match 1 stars 4.11 score 13 scripts
ravengan
SCIBER:Single-Cell Integrator and Batch Effect Remover
Remove batch effects by projecting query batches into the reference batch space.
Maintained by Dailin Gan. Last updated 2 years ago.
7.1 match 4 stars 4.30 score 8 scripts
alanarnholt
BSDA:Basic Statistics and Data Analysis
Data sets for the book "Basic Statistics and Data Analysis" by Larry J. Kitchens.
Maintained by Alan T. Arnholt. Last updated 2 years ago.
3.3 match 7 stars 9.11 score 1.3k scripts 6 dependents
civisanalytics
civis:R Client for the 'Civis Platform API'
A convenient interface for making requests directly to the 'Civis Platform API' <https://www.civisanalytics.com/platform/>. Full documentation available 'here' <https://civisanalytics.github.io/civis-r/>.
Maintained by Peter Cooman. Last updated 2 months ago.
3.9 match 16 stars 7.84 score 144 scripts
r-lib
bit:Classes and Methods for Fast Memory-Efficient Boolean Selections
Provided are classes for boolean and skewed boolean vectors, fast boolean methods, fast unique and non-unique integer sorting, fast set operations on sorted and unsorted sets of integers, and foundations for ff (range index, compression, chunked processing).
Maintained by Michael Chirico. Last updated 7 days ago.
2.0 match 12 stars 15.15 score 131 scripts 3.2k dependents
bioc
metabCombiner:Method for Combining LC-MS Metabolomics Feature Measurements
This package aligns LC-HRMS metabolomics datasets acquired from biologically similar specimens analyzed under similar, but not necessarily identical, conditions. Peak-picked and simply aligned metabolomics feature tables (consisting of m/z, rt, and per-sample abundance measurements, plus optional identifiers & adduct annotations) are accepted as input. The package outputs a combined table of feature pair alignments, organized into groups of similar m/z, and ranked by a similarity score. Input tables are assumed to be acquired using similar (but not necessarily identical) analytical methods.
Maintained by Hani Habra. Last updated 5 months ago.
softwaremassspectrometrymetabolomicsmass-spectrometry
5.3 match 10 stars 5.65 score 5 scripts
broccolito
bolt4jr:Interface for the 'Neo4j Bolt' Protocol
Querying, extracting, and processing large-scale network data from Neo4j databases using the 'Neo4j Bolt' <https://neo4j.com/docs/bolt/current/bolt/> protocol. This interface supports efficient data retrieval, batch processing for large datasets, and seamless conversion of query results into R data frames, making it ideal for bioinformatics, computational biology, and other graph-based applications.
Maintained by Wanjun Gu. Last updated 5 days ago.
6.3 match 4.65 score 2 scripts
roelandkindt
BiodiversityR:Package for Community Ecology and Suitability Analysis
Graphical User Interface (via the R-Commander) and utility functions (often based on the vegan package) for statistical analysis of biodiversity and ecological communities, including species accumulation curves, diversity indices, Renyi profiles, GLMs for analysis of species abundance and presence-absence, distance matrices, Mantel tests, and cluster, constrained and unconstrained ordination analysis. A book on biodiversity and community ecology analysis is available for free download from the website. In 2012, methods for (ensemble) suitability modelling and mapping were expanded in the package.
Maintained by Roeland Kindt. Last updated 2 months ago.
3.9 match 16 stars 7.42 score 390 scripts 2 dependents
bioc
PAA:PAA (Protein Array Analyzer)
PAA imports single color (protein) microarray data that has been saved in gpr file format - esp. ProtoArray data. After preprocessing (background correction, batch filtering, normalization) univariate feature preselection is performed (e.g., using the "minimum M statistic" approach - hereinafter referred to as "mMs"). Subsequently, a multivariate feature selection is conducted to discover biomarker candidates. For this, either a frequency-based backwards elimination approach or ensemble feature selection can be used. PAA provides a complete toolbox of analysis tools including several different plots for results examination and evaluation.
Maintained by Michael Turewicz. Last updated 5 months ago.
classificationmicroarrayonechannelproteomicscpp
6.7 match 4.34 score 11 scripts
pdwaggoner
hdImpute:A Batch Process for High Dimensional Imputation
A correlation-based batch process for fast, accurate imputation for high dimensional missing data problems via chained random forests. See Waggoner (2023) <doi:10.1007/s00180-023-01325-9> for more on 'hdImpute', Stekhoven and Bühlmann (2012) <doi:10.1093/bioinformatics/btr597> for more on 'missForest', and Mayer (2022) <https://github.com/mayer79/missRanger> for more on 'missRanger'.
Maintained by Philip Waggoner. Last updated 2 months ago.
8.5 match 2 stars 3.41 score 13 scripts
mlverse
torch:Tensors and Neural Networks with 'GPU' Acceleration
Provides functionality to define and train neural networks similar to 'PyTorch' by Paszke et al (2019) <doi:10.48550/arXiv.1912.01703> but written entirely in R using the 'libtorch' library. Also supports low-level tensor operations and 'GPU' acceleration.
Maintained by Daniel Falbel. Last updated 7 days ago.
1.8 match 520 stars 16.52 score 1.4k scripts 38 dependents
poissonconsulting
dbflobr:Read and Write Files to SQLite Databases
Reads and writes files to SQLite databases <https://www.sqlite.org/index.html> as flobs (a flob is a blob that preserves the file extension).
Maintained by Evan Amies-Galonski. Last updated 2 months ago.
4.9 match 6 stars 5.86 score 5 scripts
r-dbi
RSQLite:SQLite Interface for R
Embeds the SQLite database engine in R and provides an interface compliant with the DBI package. The source for the SQLite engine and for various extensions in a recent version is included. System libraries will never be consulted because this package relies on static linking for the plugins it includes; this also ensures a consistent experience across all installations.
Maintained by Kirill Müller. Last updated 26 days ago.
1.5 match 327 stars 18.73 score 8.1k scripts 1.1k dependents
bioc
systemPipeR:systemPipeR: Workflow Environment for Data Analysis and Report Generation
systemPipeR is a multipurpose data analysis workflow environment that unifies R with command-line tools. It enables scientists to analyze many types of large- or small-scale data on local or distributed computer systems with a high level of reproducibility, scalability and portability. At its core is a command-line interface (CLI) that adopts the Common Workflow Language (CWL). This design allows users to choose for each analysis step the optimal R or command-line software. It supports both end-to-end and partial execution of workflows with built-in restart functionalities. Efficient management of complex analysis tasks is accomplished by a flexible workflow control container class. Handling of large numbers of input samples and experimental designs is facilitated by consistent sample annotation mechanisms. As a multi-purpose workflow toolkit, systemPipeR enables users to run existing workflows, customize them or design entirely new ones while taking advantage of widely adopted data structures within the Bioconductor ecosystem. Another important core functionality is the generation of reproducible scientific analysis and technical reports. For result interpretation, systemPipeR offers a wide range of plotting functionality, while an associated Shiny App offers many useful functionalities for interactive result exploration. The vignettes linked from this page include (1) a general introduction, (2) a description of technical details, and (3) a collection of workflow templates.
Maintained by Thomas Girke. Last updated 5 months ago.
geneticsinfrastructuredataimportsequencingrnaseqriboseqchipseqmethylseqsnpgeneexpressioncoveragegenesetenrichmentalignmentqualitycontrolimmunooncologyreportwritingworkflowstepworkflowmanagement
2.4 match 53 stars 11.56 score 344 scripts 3 dependents
bioc
limma:Linear Models for Microarray and Omics Data
Data analysis, linear models and differential expression for omics data.
Maintained by Gordon Smyth. Last updated 7 days ago.
exonarraygeneexpressiontranscriptionalternativesplicingdifferentialexpressiondifferentialsplicinggenesetenrichmentdataimportbayesianclusteringregressiontimecoursemicroarraymicrornaarraymrnamicroarrayonechannelproprietaryplatformstwochannelsequencingrnaseqbatcheffectmultiplecomparisonnormalizationpreprocessingqualitycontrolbiomedicalinformaticscellbiologycheminformaticsepigeneticsfunctionalgenomicsgeneticsimmunooncologymetabolomicsproteomicssystemsbiologytranscriptomics
2.0 match 13.81 score 16k scripts 585 dependents
rorynolan
nandb:Number and Brightness Image Analysis
Calculation of molecular number and brightness from fluorescence microscopy image series. The software was published in a 2016 paper <doi:10.1093/bioinformatics/btx434>. The seminal paper for the technique is Digman et al. 2008 <doi:10.1529/biophysj.107.114645>. A review of the technique was published in 2017 <doi:10.1016/j.ymeth.2017.12.001>.
Maintained by Rory Nolan. Last updated 2 months ago.
5.3 match 2 stars 5.24 score 29 scripts
bioc
scone:Single Cell Overview of Normalized Expression data
SCONE is an R package for comparing and ranking the performance of different normalization schemes for single-cell RNA-seq and other high-throughput analyses.
Maintained by Davide Risso. Last updated 26 days ago.
immunooncologynormalizationpreprocessingqualitycontrolgeneexpressionrnaseqsoftwaretranscriptomicssequencingsinglecellcoverage
3.0 match 53 stars 9.12 score 104 scripts
bluegreen-labs
daymetr:Interface to the 'Daymet' Web Services
Programmatic interface to the 'Daymet' web services (<http://daymet.ornl.gov>). Allows for easy downloads of 'Daymet' climate data directly to your R workspace or your computer. Routines for both single pixel data downloads and gridded (netCDF) data are provided.
Maintained by Koen Hufkens. Last updated 1 year ago.
climate-datadata-sciencedaymetgridded-datanetcdfornl-daac
3.4 match 31 stars 8.13 score 242 scripts 2 dependents
bioc
scp:Mass Spectrometry-Based Single-Cell Proteomics Data Analysis
Utility functions for manipulating, processing, and analyzing mass spectrometry-based single-cell proteomics data. The package is an extension to the 'QFeatures' package and relies on 'SingleCellExperiment' to enable single-cell proteomics analyses. The package offers the user the functionality to process quantitative tables (as generated by MaxQuant, Proteome Discoverer, and more) into data tables ready for downstream analysis and data visualization.
Maintained by Christophe Vanderaa. Last updated 18 days ago.
geneexpressionproteomicssinglecellmassspectrometrypreprocessingcellbasedassaysbioconductormass-spectrometrysingle-cellsoftware
3.0 match 25 stars 8.94 score 115 scripts
maialba3
LipidMS:Lipid Annotation for LC-MS/MS DDA or DIA Data
Lipid annotation in untargeted LC-MS lipidomics based on fragmentation rules. Alcoriza-Balaguer MI, Garcia-Canaveras JC, Lopez A, Conde I, Juan O, Carretero J, Lahoz A (2019) <doi:10.1021/acs.analchem.8b03409>.
Maintained by M Isabel Alcoriza-Balaguer. Last updated 7 months ago.
5.0 match 2 stars 5.33 score 12 scripts 1 dependents
talhouklab
nanostringr:Performs Quality Control, Data Normalization, and Batch Effect Correction for 'NanoString nCounter' Data
Provides quality control (QC), normalization, and batch effect correction operations for 'NanoString nCounter' data, Talhouk et al. (2016) <doi:10.1371/journal.pone.0153844>. Various metrics are used to determine which samples passed or failed QC. Gene expression should first be normalized to housekeeping genes, before a reference-based approach is used to adjust for batch effects. Raw NanoString data can be imported in the form of Reporter Code Count (RCC) files.
Maintained by Derek Chiu. Last updated 1 month ago.
5.3 match 5 stars 4.95 score 12 scripts
shixiangwang
regport:Regression Model Processing Port
Provides R6 classes, methods and utilities to construct, analyze, summarize, and visualize regression models.
Maintained by Shixiang Wang. Last updated 26 days ago.
batch-processingregression-models
7.5 match 6 stars 3.48 score 4 scripts
bioc
RnBeads:RnBeads
RnBeads facilitates comprehensive analysis of various types of DNA methylation data at the genome scale.
Maintained by Fabian Mueller. Last updated 1 month ago.
dnamethylationmethylationarraymethylseqepigeneticsqualitycontrolpreprocessingbatcheffectdifferentialmethylationsequencingcpgislandimmunooncologytwochanneldataimport
3.8 match 6.85 score 169 scripts 1 dependents
bluegreen-labs
ecmwfr:Interface to 'ECMWF' and 'CDS' Data Web Services
Programmatic interface to the European Centre for Medium-Range Weather Forecasts dataset web services (ECMWF; <https://www.ecmwf.int/>) and Copernicus's Data Stores. Allows for easy downloads of weather forecasts and climate reanalysis data in R. Data stores covered include the Climate Data Store (CDS; <https://cds.climate.copernicus.eu>), Atmosphere Data Store (ADS; <https://ads.atmosphere.copernicus.eu>) and Early Warning Data Store (CEMS; <https://ewds.climate.copernicus.eu>).
Maintained by Koen Hufkens. Last updated 1 month ago.
cdsclimate-datacopernicusecmwf-apiecmwf-services
2.5 match 111 stars 10.08 score 156 scripts 3 dependents
bioc
lumi:BeadArray Specific Methods for Illumina Methylation and Expression Microarrays
The lumi package provides an integrated solution for Illumina microarray data analysis. It includes functions for Illumina BeadStudio (GenomeStudio) data input, quality control, BeadArray-specific variance stabilization, normalization and gene annotation at the probe level. It also includes functions for processing Illumina methylation microarrays, especially Illumina Infinium methylation microarrays.
Maintained by Lei Huang. Last updated 5 months ago.
microarrayonechannelpreprocessingdnamethylationqualitycontroltwochannel
4.0 match 6.27 score 294 scripts 5 dependents
cbroeckl
RAMClustR:Mass Spectrometry Metabolomics Feature Clustering and Interpretation
A feature clustering algorithm for non-targeted mass spectrometric metabolomics data. This method is compatible with gas and liquid chromatography coupled mass spectrometry, including indiscriminant tandem mass spectrometry <DOI: 10.1021/ac501530d> data.
Maintained by Helge Hecht. Last updated 7 months ago.
3.6 match 12 stars 6.78 score 20 scripts
bioc
Omixer:Omixer: multivariate and reproducible sample randomization to proactively counter batch effects in omics studies
Omixer - a Bioconductor package for multivariate and reproducible sample randomization, which ensures optimal sample distribution across batches with well-documented methods. It outputs lab-friendly sample layouts, reducing the risk of sample mixups when manually pipetting randomized samples.
Maintained by Lucy Sinke. Last updated 5 months ago.
datarepresentationexperimentaldesignqualitycontrolsoftwarevisualization
6.0 match 4.00 score 2 scripts
bioc
tidytof:Analyze High-dimensional Cytometry Data Using Tidy Data Principles
This package implements an interactive, scientific analysis pipeline for high-dimensional cytometry data built using tidy data principles. It is specifically designed to play well with both the tidyverse and Bioconductor software ecosystems, with functionality for reading/writing data files, data cleaning, preprocessing, clustering, visualization, modeling, and other quality-of-life functions. tidytof implements a "grammar" of high-dimensional cytometry data analysis.
Maintained by Timothy Keyes. Last updated 5 months ago.
singlecellflowcytometrybioinformaticscytometrydata-sciencesingle-celltidyversecpp
3.3 match 18 stars 7.24 score 35 scripts
sbg
sevenbridges2:The 'Seven Bridges Platform' API Client
R client and utilities for 'Seven Bridges Platform' API, from 'Cancer Genomics Cloud' to other 'Seven Bridges' supported platforms. API documentation is hosted publicly at <https://docs.sevenbridges.com/docs/the-api>.
Maintained by Marko Trifunovic. Last updated 21 days ago.
api-clientbioinformaticscloudsevenbridges
4.0 match 2 stars 5.90 score 4 scripts
apache
nanoarrow:Interface to the 'nanoarrow' 'C' Library
Provides an 'R' interface to the 'nanoarrow' 'C' library and the 'Apache Arrow' application binary interface. Functions to import and export 'ArrowArray', 'ArrowSchema', and 'ArrowArrayStream' 'C' structures to and from 'R' objects are provided alongside helpers to facilitate zero-copy data transfer among 'R' bindings to libraries implementing the 'Arrow' 'C' data interface.
Maintained by Dewey Dunnington. Last updated 3 days ago.
2.0 match 183 stars 11.79 score 37 scripts 27 dependents
hiweller
recolorize:Color-Based Image Segmentation
Automatic, semi-automatic, and manual functions for generating color maps from images. The idea is to simplify the colors of an image according to a metric that is useful for the user, using deterministic methods whenever possible. Many images will be clustered well using the out-of-the-box functions, but the package also includes a toolbox of functions for making manual adjustments (layer merging/isolation, blurring, fitting to provided color clusters or those from another image, etc). Also includes export methods for other color/pattern analysis packages (pavo, patternize, colordistance).
Maintained by Hannah Weller. Last updated 14 days ago.
3.0 match 39 stars 7.68 score 87 scripts
kerschke
flacco:Feature-Based Landscape Analysis of Continuous and Constrained Optimization Problems
Tools and features for "Exploratory Landscape Analysis (ELA)" of single-objective continuous optimization problems. Those features are able to quantify rather complex properties, such as the global structure, separability, etc., of the optimization problems.
Maintained by Pascal Kerschke. Last updated 2 years ago.
exploratory-landscape-analysisguioptimization
3.4 match 61 stars 6.70 score 41 scripts
martynplummer
coda:Output Analysis and Diagnostics for MCMC
Provides functions for summarizing and plotting the output from Markov Chain Monte Carlo (MCMC) simulations, as well as diagnostic tests of convergence to the equilibrium distribution of the Markov chain.
Maintained by Martyn Plummer. Last updated 1 year ago.
2.0 match 6 stars 11.33 score 8.3k scripts 1.1k dependents
ropensci
lightr:Read Spectrometric Data and Metadata
Parse various reflectance/transmittance/absorbance spectra file formats to extract spectral data and metadata, as described in Gruson, White & Maia (2019) <doi:10.21105/joss.01857>. Among other formats, it can import files from 'Avantes' <https://www.avantes.com/>, 'CRAIC' <https://www.microspectra.com/>, and 'OceanOptics'/'OceanInsight' <https://www.oceanoptics.com/> brands.
Maintained by Hugo Gruson. Last updated 1 month ago.
file-importreproducibilityreproducible-researchreproducible-sciencespectral-dataspectroscopy
3.1 match 13 stars 7.11 score 11 scripts 2 dependents
ohdsi
DatabaseConnector:Connecting to Various Database Platforms
An R 'DataBase Interface' ('DBI') compatible interface to various database platforms ('PostgreSQL', 'Oracle', 'Microsoft SQL Server', 'Amazon Redshift', 'Microsoft Parallel Database Warehouse', 'IBM Netezza', 'Apache Impala', 'Google BigQuery', 'Snowflake', 'Spark', 'SQLite', and 'InterSystems IRIS'). Also includes support for fetching data as 'Andromeda' objects. Uses either 'Java Database Connectivity' ('JDBC') or other 'DBI' drivers to connect to databases.
Maintained by Martijn Schuemie. Last updated 2 months ago.
1.8 match 56 stars 12.63 score 772 scripts 11 dependents
marce10
Rraven:Connecting R and 'Raven' Sound Analysis Software
A tool to exchange data between R and 'Raven' sound analysis software (Cornell Lab of Ornithology). Functions work on data formats compatible with the R package 'warbleR'.
Maintained by Marcelo Araya-Salas. Last updated 2 months ago.
3.7 match 10 stars 6.00 score 50 scripts
bioc
structToolbox:Data processing & analysis tools for Metabolomics and other omics
An extensive set of data (pre-)processing and analysis methods and tools for metabolomics and other omics, with a strong emphasis on statistics and machine learning. This toolbox allows the user to build extensive and standardised workflows for data analysis. The methods and tools have been implemented using class-based templates provided by the struct (Statistics in R Using Class-based Templates) package. The toolbox includes pre-processing methods (e.g. signal drift and batch correction, normalisation, missing value imputation and scaling), univariate (e.g. ttest, various forms of ANOVA, Kruskal–Wallis test and more) and multivariate statistical methods (e.g. PCA and PLS, including cross-validation and permutation testing) as well as machine learning methods (e.g. Support Vector Machines). The STATistics Ontology (STATO) has been integrated and implemented to provide standardised definitions for the different methods, inputs and outputs.
Maintained by Gavin Rhys Lloyd. Last updated 26 days ago.
workflowstepmetabolomicsbioconductor-packagedimslc-msmachine-learningmultivariate-analysisstatisticsunivariate
3.5 match 10 stars 6.26 score 12 scripts
lau-mel
swamp:Visualization, Analysis and Adjustment of High-Dimensional Data in Respect to Sample Annotations
Collection of functions to connect the structure of the data with the information on the samples. Three types of associations are covered: 1. linear model of principal components. 2. hierarchical clustering analysis. 3. distribution of features-sample annotation associations. Additionally, the inter-relation between sample annotations can be analyzed. Simple methods are provided for the correction of batch effects and removal of principal components.
Maintained by Martin Lauss. Last updated 5 years ago.
9.1 match 2.42 score 29 scripts 1 dependents
bioc
cogeqc:Systematic quality checks on comparative genomics analyses
cogeqc aims to facilitate systematic quality checks on standard comparative genomics analyses to help researchers detect issues and select the most suitable parameters for each data set. cogeqc can be used to assess: i. genome assembly and annotation quality with BUSCOs and comparisons of statistics with publicly available genomes on the NCBI; ii. orthogroup inference using a protein domain-based approach; and iii. synteny detection using synteny network properties. There are also data visualization functions to explore QC summary statistics.
Maintained by Fabrício Almeida-Silva. Last updated 5 months ago.
softwaregenomeassemblycomparativegenomicsfunctionalgenomicsphylogeneticsqualitycontrolnetworkcomparative-genomicsevolutionary-genomics
3.6 match 10 stars 6.08 score 20 scripts
eagerai
fastai:Interface to 'fastai'
The 'fastai' <https://docs.fast.ai/index.html> library simplifies training fast and accurate neural networks using modern best practices. It is based on research into deep learning best practices undertaken at 'fast.ai', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.
Maintained by Turgut Abdullayev. Last updated 11 months ago.
audiocollaborative-filteringdarknetdarknet-image-classificationfastaimedicalobject-detectiontabulartextvision
2.3 match 118 stars 9.40 score 76 scripts
bioc
GWASTools:Tools for Genome Wide Association Studies
Classes for storing very large GWAS data sets and annotation, and functions for GWAS data cleaning and analysis.
Maintained by Stephanie M. Gogarten. Last updated 5 months ago.
snpgeneticvariabilityqualitycontrolmicroarray
2.0 match 17 stars 10.50 score 396 scripts 5 dependents
cran
qpcR:Modelling and Analysis of Real-Time PCR Data
Model fitting, optimal model selection and calculation of various features that are essential in the analysis of quantitative real-time polymerase chain reaction (qPCR).
Maintained by Andrej-Nikolai Spiess. Last updated 7 years ago.
6.8 match 2 stars 3.06 score 1 dependents
bioc
chevreulProcess:Tools for managing SingleCellExperiment objects as projects
Tools for analyzing SingleCellExperiment objects as projects, for input into the Chevreul app downstream. Includes functions for analysis of single cell RNA sequencing data. Supported by NIH grants R01CA137124 and R01EY026661 to David Cobrinik.
Maintained by Kevin Stachelek. Last updated 1 month ago.
coveragernaseqsequencingvisualizationgeneexpressiontranscriptionsinglecelltranscriptomicsnormalizationpreprocessingqualitycontroldimensionreductiondataimport
3.8 match 5.38 score 2 scripts 2 dependents
felixfan
FinCal:Time Value of Money, Time Series Analysis and Computational Finance
Package for time value of money calculation, time series analysis and computational finance.
Maintained by Felix Yanhui Fan. Last updated 8 years ago.
3.3 match 23 stars 6.02 score 203 scripts 1 dependents
bioc
PhosR:A set of methods and tools for comprehensive analysis of phosphoproteomics data
PhosR is a package for the comprehensive analysis of phosphoproteomic data. There are two major components to PhosR: processing and downstream analysis. PhosR consists of various processing tools for phosphoproteomics data including filtering, imputation, normalisation, and functional analysis for inferring active kinases and signalling pathways.
Maintained by Taiyun Kim. Last updated 5 months ago.
softwareresearchfieldproteomics
4.2 match 4.71 score 51 scripts
palderman
DSSAT:A Comprehensive R Interface for the DSSAT Cropping Systems Model
The purpose of this package is to provide a comprehensive R interface to the Decision Support System for Agrotechnology Transfer Cropping Systems Model (DSSAT-CSM; see <https://dssat.net> for more information). The package provides cross-platform functions to read and write input files, run DSSAT-CSM, and read output files.
Maintained by Phillip D. Alderman. Last updated 1 year ago.
3.5 match 22 stars 5.57 score 34 scripts
danchaltiel
crosstable:Crosstables for Descriptive Analyses
Create descriptive tables for continuous and categorical variables. Apply summary statistics and counting function, with or without a grouping variable, and create beautiful reports using 'rmarkdown' or 'officer'. You can also compute effect sizes and statistical tests if needed.
Maintained by Dan Chaltiel. Last updated 2 months ago.
descriptive-statisticsflextablefrequency-tablehtml-reportmswordofficer
1.9 match 116 stars 10.37 score 340 scripts
bioc
DESeq2:Differential gene expression analysis based on the negative binomial distribution
Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.
Maintained by Michael Love. Last updated 12 days ago.
sequencingrnaseqchipseqgeneexpressiontranscriptionnormalizationdifferentialexpressionbayesianregressionprincipalcomponentclusteringimmunooncologyopenblascpp
1.2 match 375 stars 16.11 score 17k scripts 115 dependents
bioc
MAPFX:MAssively Parallel Flow cytometry Xplorer (MAPFX): A Toolbox for Analysing Data from the Massively-Parallel Cytometry Experiments
MAPFX is an end-to-end toolbox that pre-processes the raw data from MPC experiments (e.g., BioLegend's LEGENDScreen and BD Lyoplates assays), and further imputes the ‘missing’ infinity markers in the wells without those measurements. The pipeline starts by performing background correction on raw intensities to remove the noise from electronic baseline restoration and fluorescence compensation by adapting a normal-exponential convolution model. Unwanted technical variation, from sources such as well effects, is then removed using a log-normal model with plate, column, and row factors, after which infinity markers are imputed using the informative backbone markers as predictors. The completed dataset can then be used for clustering and other statistical analyses. Additionally, MAPFX can be used to normalise data from FFC assays as well.
Maintained by Hsiao-Chi Liao. Last updated 5 months ago.
softwareflowcytometrycellbasedassayssinglecellproteomicsclustering
4.3 match 1 star 4.54 score
marce10
warbleR:Streamline Bioacoustic Analysis
Functions aiming to facilitate the analysis of the structure of animal acoustic signals in 'R'. 'warbleR' makes use of the basic sound analysis tools from the packages 'tuneR' and 'seewave', and offers new tools to explore and quantify acoustic signal structure. The package allows users to organize and manipulate multiple sound files, create spectrograms of complete recordings or individual signals in different formats, run several measures of acoustic structure, and characterize different structural levels in acoustic signals.
Maintained by Marcelo Araya-Salas. Last updated 2 months ago.
animal-acoustic-signalsaudio-processingbioacousticsspectrogramstreamline-analysiscpp
1.8 match 54 stars 11.01 score 270 scripts 4 dependents
vimc
orderly:Lightweight Reproducible Reporting
Order, create and store reports from R. By defining a lightweight interface around the inputs and outputs of an analysis, a lot of the repetitive work for reproducible research can be automated. We define a simple format for organising and describing work that facilitates collaborative reproducible research and acknowledges that all analyses are run multiple times over their lifespans.
Maintained by Rich FitzJohn. Last updated 2 years ago.
2.0 match 117 stars 9.63 score 94 scripts 4 dependents
bioc
scuttle:Single-Cell RNA-Seq Analysis Utilities
Provides basic utility functions for performing single-cell analyses, focusing on simple normalization, quality control and data transformations. Also provides some helper functions to assist development of other packages.
Maintained by Aaron Lun. Last updated 5 months ago.
immunooncologysinglecellrnaseqqualitycontrolpreprocessingnormalizationtranscriptomicsgeneexpressionsequencingsoftwaredataimportopenblascpp
1.9 match 10.21 score 1.7k scripts 80 dependents
r-arcgis
arcgisgeocode:A Robust Interface to ArcGIS 'Geocoding Services'
A very fast and robust interface to ArcGIS 'Geocoding Services'. Provides capabilities for reverse geocoding, finding address candidates, character-by-character search autosuggestion, and batch geocoding. The public 'ArcGIS World Geocoder' is accessible for free use via 'arcgisgeocode' for all services except batch geocoding. 'arcgisgeocode' also integrates with 'arcgisutils' to provide access to custom locators or private 'ArcGIS World Geocoder' hosted on 'ArcGIS Enterprise'. Learn more in the 'Geocode service' API reference <https://developers.arcgis.com/rest/geocode/api-reference/overview-world-geocoding-service.htm>.
Maintained by Josiah Parry. Last updated 2 months ago.
2.8 match 41 stars 6.82 score 20 scripts 1 dependents
schochastics
networkdata:Repository of Network Datasets
The package contains a large collection of network datasets from different contexts. This includes social networks, animal networks and movie networks. All datasets are in 'igraph' format.
Maintained by David Schoch. Last updated 12 months ago.
3.8 match 143 stars 5.01 score 143 scripts
zarquon42b
Morpho:Calculations and Visualisations Related to Geometric Morphometrics
A toolset for Geometric Morphometrics and mesh processing. This includes (among other stuff) mesh deformations based on reference points, permutation tests, detection of outliers, processing of sliding semi-landmarks and semi-automated surface landmark placement.
Maintained by Stefan Schlager. Last updated 5 months ago.
1.9 match 51 stars 10.00 score 218 scripts 13 dependents
bioc
POMA:Tools for Omics Data Analysis
The POMA package offers a comprehensive toolkit designed for omics data analysis, streamlining the process from initial visualization to final statistical analysis. Its primary goal is to simplify and unify the various steps involved in omics data processing, making it more accessible and manageable within a single, intuitive R package. Emphasizing reproducibility and user-friendliness, POMA leverages the standardized SummarizedExperiment class from Bioconductor, ensuring seamless integration and compatibility with a wide array of Bioconductor tools. This approach guarantees maximum flexibility and replicability, making POMA an essential asset for researchers handling omics datasets. See https://github.com/pcastellanoescuder/POMAShiny. Paper: Castellano-Escuder et al. (2021) <doi:10.1371/journal.pcbi.1009148> for more details.
Maintained by Pol Castellano-Escuder. Last updated 4 months ago.
batcheffectclassificationclusteringdecisiontreedimensionreductionmultidimensionalscalingnormalizationpreprocessingprincipalcomponentregressionrnaseqsoftwarestatisticalmethodvisualizationbioconductorbioinformaticsdata-visualizationdimension-reductionexploratory-data-analysismachine-learningomics-data-integrationpipelinepre-processingstatistical-analysisuser-friendlyworkflow
2.3 match 11 stars 8.23 score 20 scripts 1 dependents
welch-lab
rliger:Linked Inference of Genomic Experimental Relationships
Uses an extension of nonnegative matrix factorization to identify shared and dataset-specific factors. See Welch J, Kozareva V, et al (2019) <doi:10.1016/j.cell.2019.05.006>, and Liu J, Gao C, Sodicoff J, et al (2020) <doi:10.1038/s41596-020-0391-8> for more details.
Maintained by Yichen Wang. Last updated 2 months ago.
nonnegative-matrix-factorizationsingle-cellopenblascpp
1.7 match 408 stars 10.77 score 334 scripts 1 dependents
ha-pu
globaltrends:Download and Measure Global Trends Through Google Search Volumes
Google offers public access to global search volumes from its search engine through the Google Trends portal. The package downloads these search volumes provided by Google Trends and uses them to measure and analyze the distribution of search scores across countries or within countries. The package allows researchers and analysts to use these search scores to investigate global trends based on patterns within these scores. This offers insights such as degree of internationalization of firms and organizations or dissemination of political, social, or technological trends across the globe or within single countries. An outline of the package's methodological foundations and potential applications is available as a working paper: <https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3969013>.
Maintained by Harald Puhr. Last updated 2 years ago.
google-trendsinternationalization
3.7 match 18 stars 5.00 score 11 scripts
rfastofficial
Rfast2:A Collection of Efficient and Extremely Fast R Functions II
A collection of fast statistical and utility functions for data analysis. Functions for regression, maximum likelihood, column-wise statistics and many more have been included. C++ has been utilized to speed up the functions. References: Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>.
Maintained by Manos Papadakis. Last updated 1 year ago.
2.3 match 38 stars 8.09 score 75 scripts 26 dependents
ohdsi
ResultModelManager:Result Model Manager
Database data model management utilities for R packages in the Observational Health Data Sciences and Informatics program <https://ohdsi.org>. 'ResultModelManager' provides utility functions to allow package maintainers to migrate existing SQL database models, export and import results in consistent patterns.
Maintained by Jamie Gilbert. Last updated 6 months ago.
2.4 match 4 stars 7.38 score 9 scripts 3 dependents
bioc
bnbc:Bandwise normalization and batch correction of Hi-C data
Tools to normalize (several) Hi-C data from replicates.
Maintained by Kipper Fletez-Brant. Last updated 5 months ago.
hicpreprocessingnormalizationsoftwarecpp
4.6 match 1 star 3.88 score 15 scripts
bioc
methylKit:DNA methylation analysis from high-throughput bisulfite sequencing results
methylKit is an R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. The package is designed to deal with sequencing data from RRBS and its variants, but also target-capture methods and whole genome bisulfite sequencing. It also has functions to analyze base-pair resolution 5hmC data from experimental protocols such as oxBS-Seq and TAB-Seq. Methylation calling can be performed directly from Bismark aligned BAM files.
Maintained by Altuna Akalin. Last updated 17 days ago.
dnamethylationsequencingmethylseqgenome-biologymethylationstatistical-analysisvisualizationcurlbzip2xz-utilszlibcpp
1.5 match 220 stars 11.80 score 578 scripts 3 dependents
bioc
cydar:Using Mass Cytometry for Differential Abundance Analyses
Identifies differentially abundant populations between samples and groups in mass cytometry data. Provides methods for counting cells into hyperspheres, controlling the spatial false discovery rate, and visualizing changes in abundance in the high-dimensional marker space.
Maintained by Aaron Lun. Last updated 5 months ago.
immunooncologyflowcytometrymultiplecomparisonproteomicssinglecellcpp
3.1 match 5.64 score 48 scripts
tjaki
PK:Basic Non-Compartmental Pharmacokinetics
Estimation of pharmacokinetic parameters using non-compartmental theory.
Maintained by Thomas Jaki. Last updated 2 years ago.
6.8 match 2.59 score 13 scripts 1 dependents
psegaert
mrfDepth:Depth Measures in Multivariate, Regression and Functional Settings
Tools to compute depth measures and implementations of related tasks such as outlier detection, data exploration and classification of multivariate, regression and functional data.
Maintained by Jakob Raymaekers. Last updated 6 years ago.
3.5 match 3 stars 4.99 score 72 scripts 3 dependents
polar-fhir
fhircrackr:Handling HL7 FHIR® Resources in R
Useful tools for conveniently downloading FHIR resources in xml format and converting them to R data.frames. The package uses FHIR-search to download bundles from a FHIR server, provides functions to save and read xml-files containing such bundles and allows flattening the bundles to data.frames using XPath expressions. FHIR® is the registered trademark of HL7 and is used with the permission of HL7. Use of the FHIR trademark does not constitute endorsement of this product by HL7.
Maintained by Julia Palm. Last updated 12 days ago.
2.3 match 33 stars 7.63 score 46 scripts
bioc
wateRmelon:Illumina DNA methylation array normalization and metrics
15 flavours of betas and three performance metrics, with methods for objects produced by methylumi and minfi packages.
Maintained by Leo C Schalkwyk. Last updated 4 months ago.
dnamethylationmicroarraytwochannelpreprocessingqualitycontrol
2.3 match 7.75 score 247 scripts 2 dependents
bioc
scrapper:Bindings to C++ Libraries for Single-Cell Analysis
Implements R bindings to C++ code for analyzing single-cell (expression) data, mostly from various libscran libraries. Each function performs an individual step in the single-cell analysis workflow, ranging from quality control to clustering and marker detection. It is mostly intended for other Bioconductor package developers to build more user-friendly end-to-end workflows.
Maintained by Aaron Lun. Last updated 6 days ago.
normalizationrnaseqsoftwaregeneexpressiontranscriptomicssinglecellbatcheffectqualitycontroldifferentialexpressionfeatureextractionprincipalcomponentclusteringopenblascpp
3.1 match 5.55 score 32 scripts
tudo-r
BatchExperiments:Statistical Experiments on Batch Computing Clusters
Extends the BatchJobs package to run statistical experiments on batch computing clusters. For further details see the project web page.
Maintained by Michel Lang. Last updated 3 years ago.
3.5 match 17 stars 4.90 score 47 scripts
mlampros
textTinyR:Text Processing for Small or Big Data Files
It offers functions for splitting, parsing, tokenizing and creating a vocabulary for big text data files. Moreover, it includes functions for building a document-term matrix and extracting information from those (term-associations, most frequent terms). It also embodies functions for calculating token statistics (collocations, look-up tables, string dissimilarities) and functions to work with sparse matrices. Lastly, it includes functions for Word Vector Representations (i.e. 'GloVe', 'fasttext') and incorporates functions for the calculation of (pairwise) text document dissimilarities. The source code is based on 'C++11' and exported in R through the 'Rcpp', 'RcppArmadillo' and 'BH' packages.
Maintained by Lampros Mouselimis. Last updated 1 year ago.
bhboostcpp11processingrcpprcpparmadillotextopenblascppopenmp
2.3 match 38 stars 7.64 score 244 scripts 1 dependents
bioc
bluster:Clustering Algorithms for Bioconductor
Wraps common clustering algorithms in an easily extended S4 framework. Backends are implemented for hierarchical, k-means and graph-based clustering. Several utilities are also provided to compare and evaluate clustering results.
Maintained by Aaron Lun. Last updated 5 months ago.
immunooncologysoftwaregeneexpressiontranscriptomicssinglecellclusteringcpp
1.8 match 9.43 score 636 scripts 51 dependents