R-universe search: comprehensions

dirkschumacher

listcomp:List Comprehensions

An implementation of list comprehensions as purely syntactic sugar with a minor runtime overhead. It constructs nested for-loops and executes the byte-compiled loops to collect the results.

Maintained by Dirk Schumacher. Last updated 3 years ago.

comprehensions list-comprehensions listcomprehensions

24.0 match 19 stars 5.33 score 3 scripts 7 dependents
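
The listcomp description above says a comprehension is expanded into nested for-loops whose results are collected. As a language-agnostic illustration only (this is Python, not listcomp's R interface, and the function names are hypothetical), the following sketch shows a comprehension next to the explicit nested loops it is equivalent to:

```python
# Illustration of "comprehension as sugar for nested loops".
# Hypothetical function names; this does not use or mirror the listcomp API.

def pairs_comprehension(n):
    """Collect (x, y) pairs with x <= y <= n and an even sum, as a comprehension."""
    return [
        (x, y)
        for x in range(1, n + 1)
        for y in range(x, n + 1)
        if (x + y) % 2 == 0
    ]


def pairs_nested_loops(n):
    """The same computation written as the nested loops the sugar stands for."""
    result = []
    for x in range(1, n + 1):          # outer generator
        for y in range(x, n + 1):      # inner generator may depend on x
            if (x + y) % 2 == 0:       # filter condition
                result.append((x, y))  # collect the element
    return result


if __name__ == "__main__":
    print(pairs_comprehension(3))
    print(pairs_nested_loops(3))
```

Both functions return the same list; the comprehension only changes how the loops are written, not what is computed.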

cmann3

eList:List Comprehension and Tools

Create list comprehensions (and other types of comprehension) similar to those in 'python', 'haskell', and other languages. List comprehension in 'R' converts a regular for() loop into a vectorized lapply() function. Support for looping with multiple variables, parallelization, and across non-standard objects is included. The package also contains a variety of functions to help with list comprehension.

Maintained by Chris Mann. Last updated 4 years ago.

20.3 match 2 stars 4.48 score 9 scripts 1 dependents

patrickroocks

listcompr:List Comprehension for R

Syntactic shortcuts for creating synthetic lists, vectors, data frames, and matrices using list comprehension.

Maintained by Patrick Roocks. Last updated 3 years ago.

data-frames list-comprehension matrix syntactic-sugar vector

19.5 match 5 stars 4.40 score 5 scripts

gdemin

comprehenr:List Comprehensions

Provides 'Python'-style list comprehensions. List comprehension expressions use usual loops (for(), while() and repeat()) and usual if() as list producers. In many cases this gives a more concise notation than the standard "*apply + filter" strategy.

Maintained by Gregory Demin. Last updated 2 years ago.

9.9 match 20 stars 7.45 score 228 scripts 4 dependents
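
The comprehenr description above contrasts comprehensions with an "*apply + filter" strategy. A minimal sketch of that contrast, written in Python because the package describes itself as 'Python'-style (the function names are hypothetical and this is not comprehenr's R syntax):

```python
# Comprehension versus the "map/apply + filter" style for the same task:
# keep the even values and square them.

def even_squares_comprehension(values):
    """Filter and transform in a single comprehension expression."""
    return [v * v for v in values if v % 2 == 0]


def even_squares_map_filter(values):
    """The same result as a filter step followed by a map (apply) step."""
    evens = filter(lambda v: v % 2 == 0, values)
    return list(map(lambda v: v * v, evens))


if __name__ == "__main__":
    data = range(1, 7)
    print(even_squares_comprehension(data))
    print(even_squares_map_filter(data))
```

The two versions produce identical output; the comprehension keeps the condition and the transformation in one expression.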

bioc

singleCellTK:Comprehensive and Interactive Analysis of Single Cell RNA-Seq Data

The Single Cell Toolkit (SCTK) in the singleCellTK package provides an interface to popular tools for importing, quality control, analysis, and visualization of single cell RNA-seq data. SCTK allows users to seamlessly integrate tools from various packages at different stages of the analysis workflow. A general "a la carte" workflow gives users access to multiple methods for data importing, calculation of general QC metrics, doublet detection, ambient RNA estimation and removal, filtering, normalization, batch correction or integration, dimensionality reduction, 2-D embedding, clustering, marker detection, differential expression, cell type labeling, pathway analysis, and data exporting. Curated workflows can be used to run Seurat and Celda. Streamlined quality control can be performed on the command line using the SCTK-QC pipeline. Users can analyze their data using commands in the R console or by using an interactive Shiny Graphical User Interface (GUI). Specific analyses or entire workflows can be summarized and shared with comprehensive HTML reports generated by Rmarkdown. Additional documentation and vignettes can be found at camplab.net/sctk.

Maintained by Joshua David Campbell. Last updated 24 days ago.

singlecell geneexpression differentialexpression alignment clustering immunooncology batcheffect normalization qualitycontrol dataimport gui

7.0 match 181 stars 10.16 score 252 scripts

massimoaria

bibliometrix:Comprehensive Science Mapping Analysis

Tool for quantitative research in scientometrics and bibliometrics. It implements the comprehensive workflow for science mapping analysis proposed in Aria M. and Cuccurullo C. (2017) <doi:10.1016/j.joi.2017.08.007>. 'bibliometrix' provides various routines for importing bibliographic data from 'SCOPUS', 'Clarivate Analytics Web of Science' (<https://www.webofknowledge.com/>), 'Digital Science Dimensions' (<https://www.dimensions.ai/>), 'OpenAlex' (<https://openalex.org/>), 'Cochrane Library' (<https://www.cochranelibrary.com/>), 'Lens' (<https://lens.org>), and 'PubMed' (<https://pubmed.ncbi.nlm.nih.gov/>) databases, performing bibliometric analysis and building networks for co-citation, coupling, scientific collaboration and co-word analysis.

Maintained by Massimo Aria. Last updated 8 days ago.

bibliometric-analysis bibliometrics citation citation-network citations co-authors co-occurence co-word-analysis correspondence-analysis coupling isi-web journal manuscript quantitative-analysis scholars science science-mapping scientific scientometrics scopus

5.5 match 545 stars 12.54 score 518 scripts 2 dependents

r-lum

Luminescence:Comprehensive Luminescence Dating Data Analysis

A collection of various R functions for the purpose of Luminescence dating data analysis. This includes, amongst others, data import, export, application of age models, curve deconvolution, sequence analysis and plotting of equivalent dose distributions.

Maintained by Sebastian Kreutzer. Last updated 1 day ago.

bayesian-statistics data-science geochronology luminescence luminescence-dating open-science osl plotting radiofluorescence tl xsyg cpp

4.8 match 15 stars 10.77 score 178 scripts 8 dependents

r-forge

carData:Companion to Applied Regression Data Sets

Datasets to Accompany J. Fox and S. Weisberg, An R Companion to Applied Regression, Third Edition, Sage (2019).

Maintained by John Fox. Last updated 5 months ago.

3.8 match 12.41 score 944 scripts 919 dependents

lightbluetitan

usdatasets:A Comprehensive Collection of U.S. Datasets

Provides a diverse collection of U.S. datasets encompassing various fields such as crime, economics, education, finance, energy, healthcare, and more. It serves as a valuable resource for researchers and analysts seeking to perform in-depth analyses and derive insights from U.S.-specific data.

Maintained by Renzo Caceres Rossi. Last updated 5 months ago.

7.7 match 7 stars 5.99 score 141 scripts

bioc

musicatk:Mutational Signature Comprehensive Analysis Toolkit

Mutational signatures are carcinogenic exposures or aberrant cellular processes that can cause alterations to the genome. We created musicatk (MUtational SIgnature Comprehensive Analysis ToolKit) to address shortcomings in versatility and ease of use in other pre-existing computational tools. Although many different types of mutational data have been generated, current software packages do not have a flexible framework to allow users to mix and match different types of mutations in the mutational signature inference process. Musicatk enables users to count and combine multiple mutation types, including SBS, DBS, and indels. Musicatk calculates replication strand, transcription strand and combinations of these features along with discovery from unique and proprietary genomic features associated with any mutation type. Musicatk also implements several methods for discovery of new signatures as well as methods to infer exposure given an existing set of signatures. Musicatk provides functions for visualization and downstream exploratory analysis including the ability to compare signatures between cohorts and find matching signatures in COSMIC V2 or COSMIC V3.

Maintained by Joshua D. Campbell. Last updated 5 months ago.

software biologicalquestion somaticmutation variantannotation

6.5 match 13 stars 7.02 score 20 scripts

emilhvitfeldt

paletteer:Comprehensive Collection of Color Palettes

The choices of color palettes in R can be quite overwhelming with palettes spread over many packages with many different APIs. This package aims to collect all color palettes across the R ecosystem under the same package with a streamlined API.

Maintained by Emil Hvitfeldt. Last updated 9 months ago.

color-palette palettes

3.1 match 957 stars 13.50 score 6.9k scripts 23 dependents

rstudio

reticulate:Interface to 'Python'

Interface to 'Python' modules, classes, and functions. When calling into 'Python', R data types are automatically converted to their equivalent 'Python' types. When values are returned from 'Python' to R they are converted back to R types. Compatible with all versions of 'Python' >= 2.7.

Maintained by Tomasz Kalinowski. Last updated 2 days ago.

cpp

2.0 match 1.7k stars 21.07 score 18k scripts 427 dependents

bioc

Informeasure:R implementation of information measures

This package consolidates a comprehensive set of information measurements, encompassing mutual information, conditional mutual information, interaction information, partial information decomposition, and part mutual information.

Maintained by Chu Pan. Last updated 5 months ago.

geneexpression networkinference network software

9.4 match 3 stars 4.48 score 4 scripts

lightbluetitan

crimedatasets:A Comprehensive Collection of Crime-Related Datasets

A comprehensive collection of datasets exclusively focused on crimes, criminal activities, and related topics. This package serves as a valuable resource for researchers, analysts, and students interested in crime analysis, criminology, social and economic studies related to criminal behavior. Datasets span global and local contexts, with a mix of tabular and spatial data.

Maintained by Renzo Caceres Rossi. Last updated 3 months ago.

8.3 match 8 stars 4.90 score 3 scripts

bioc

miRBaseConverter:A comprehensive and high-efficiency tool for converting and retrieving the information of miRNAs in different miRBase versions

A comprehensive tool for converting and retrieving the miRNA Name, Accession, Sequence, Version, History and Family information in different miRBase versions. It can process a huge number of miRNAs in a short time without other dependencies.

Maintained by Taosheng Xu. Last updated 5 months ago.

software mirna

6.0 match 1 stars 6.50 score 70 scripts

acharaakshit

rminizinc:R Interface to 'MiniZinc'

Constraint optimization, or constraint programming, is the name given to identifying feasible solutions out of a very large set of candidates, where the problem can be modeled in terms of arbitrary constraints. 'MiniZinc' is a free and open-source constraint modeling language. Constraint satisfaction and discrete optimization problems can be formulated in a high-level modeling language. Models are compiled into an intermediate representation that is understood by a wide range of solvers. 'MiniZinc' itself provides several solvers, for instance 'GeCode'. R users can use the package to solve constraint programming problems without using 'MiniZinc' directly, modify existing 'MiniZinc' models and also create their own models.

Maintained by Akshit Achara. Last updated 3 years ago.

cpp

8.0 match 13 stars 4.81 score 5 scripts

bioc

Qtlizer:Comprehensive QTL annotation of GWAS results

This R package provides access to the Qtlizer web server. Qtlizer annotates lists of common small variants (mainly SNPs) and genes in humans with associated changes in gene expression using the most comprehensive database of published quantitative trait loci (QTLs).

Maintained by Matthias Munz. Last updated 16 days ago.

genomewideassociation snp genetics linkagedisequilibrium eqtl gwas variant-annotation

6.4 match 3 stars 5.73 score 2 scripts

lightbluetitan

educationR:A Comprehensive Collection of Educational Datasets

Provides a comprehensive collection of datasets related to education, covering topics such as student performance, learning methods, test scores, absenteeism, and other educational metrics. This package is designed as a resource for educational researchers, data analysts, and statisticians to explore and analyze data in the field of education.

Maintained by Renzo Caceres Rossi. Last updated 3 months ago.

8.4 match 4 stars 4.30 score 3 scripts

openintrostat

openintro:Datasets and Supplemental Functions from 'OpenIntro' Textbooks and Labs

Supplemental functions and data for 'OpenIntro' resources, which includes open-source textbooks and resources for introductory statistics (<https://www.openintro.org/>). The package contains datasets used in our open-source textbooks along with custom plotting functions for reproducing book figures. Note that many functions and examples include color transparency; some plotting elements may not show up properly (or at all) when run in some versions of Windows operating system.

Maintained by Mine Çetinkaya-Rundel. Last updated 3 months ago.

data openintro

3.1 match 240 stars 11.39 score 6.0k scripts

snoweye

phyclust:Phylogenetic Clustering (Phyloclustering)

Phylogenetic clustering (phyloclustering) is an evolutionary Continuous Time Markov Chain model-based approach to identify population structure from molecular data without assuming linkage equilibrium. The package phyclust (Chen 2011) provides a convenient implementation of phyloclustering for DNA and SNP data, capable of clustering individuals into subpopulations and identifying molecular sequences representative of those subpopulations. It is designed in C for performance, interfaced with R for visualization, and incorporates other popular open source programs including ms (Hudson 2002) <doi:10.1093/bioinformatics/18.2.337>, seq-gen (Rambaut and Grassly 1997) <doi:10.1093/bioinformatics/13.3.235>, Hap-Clustering (Tzeng 2005) <doi:10.1002/gepi.20063> and PAML baseml (Yang 1997, 2007) <doi:10.1093/bioinformatics/13.5.555>, <doi:10.1093/molbev/msm088>, for simulating data, additional analyses, and searching the best tree. See the phyclust website for more information, documentation and examples.

Maintained by Wei-Chen Chen. Last updated 2 years ago.

4.0 match 9 stars 8.45 score 126 scripts 8 dependents

jacob-long

interactions:Comprehensive, User-Friendly Toolkit for Probing Interactions

A suite of functions for conducting and interpreting analysis of statistical interaction in regression models that was formerly part of the 'jtools' package. Functionality includes visualization of two- and three-way interactions among continuous and/or categorical variables as well as calculation of "simple slopes" and Johnson-Neyman intervals (see e.g., Bauer & Curran, 2005 <doi:10.1207/s15327906mbr4003_5>). These capabilities are implemented for generalized linear models in addition to the standard linear regression context.

Maintained by Jacob A. Long. Last updated 8 months ago.

interactions moderation social-sciences statistics

2.9 match 131 stars 11.39 score 1.2k scripts 5 dependents

bioc

EnrichedHeatmap:Making Enriched Heatmaps

Enriched heatmap is a special type of heatmap which visualizes the enrichment of genomic signals on specific target regions. Here we implement enriched heatmap using the ComplexHeatmap package. Since this type of heatmap is just a normal heatmap with some special settings, the functionality of ComplexHeatmap makes it much easier to customize the heatmap as well as to concatenate it to a list of heatmaps to show correspondence between different data sources.

Maintained by Zuguang Gu. Last updated 5 months ago.

software visualization sequencing genomeannotation coverage cpp

3.0 match 190 stars 10.87 score 330 scripts 1 dependents

lightbluetitan

OncoDataSets:A Comprehensive Collection of Cancer Types and Cancer-related DataSets

Offers a rich collection of data focused on cancer research, covering survival rates, genetic studies, biomarkers, and epidemiological insights. Designed for researchers, analysts, and bioinformatics practitioners, the package includes datasets on various cancer types such as melanoma, leukemia, breast, ovarian, and lung cancer, among others. It aims to facilitate advanced research, analysis, and understanding of cancer epidemiology, genetics, and treatment outcomes.

Maintained by Renzo Caceres Rossi. Last updated 3 months ago.

7.5 match 3 stars 4.18 score 6 scripts

bioc

motifbreakR:A Package For Predicting The Disruptiveness Of Single Nucleotide Polymorphisms On Transcription Factor Binding Sites

We introduce motifbreakR, which allows the biologist to judge in the first place whether the sequence surrounding the polymorphism is a good match, and in the second place how much information is gained or lost in one allele of the polymorphism relative to another. MotifbreakR is both flexible and extensible over previous offerings; giving a choice of algorithms for interrogation of genomes with motifs from public sources that users can choose from; these are 1) a weighted-sum probability matrix, 2) log-probabilities, and 3) weighted by relative entropy. MotifbreakR can predict effects for novel or previously described variants in public databases, making it suitable for tasks beyond the scope of its original design. Lastly, it can be used to interrogate any genome curated within Bioconductor (currently there are 32 species, a total of 109 versions).

Maintained by Simon Gert Coetzee. Last updated 5 months ago.

chipseq visualization motifannotation transcription

3.2 match 28 stars 8.96 score 103 scripts

bioc

crisprDesign:Comprehensive design of CRISPR gRNAs for nucleases and base editors

Provides a comprehensive suite of functions to design and annotate CRISPR guide RNA (gRNA) sequences. This includes on- and off-target search, on-target efficiency scoring, off-target scoring, full gene and TSS contextual annotations, and SNP annotation (human only). It currently supports five types of CRISPR modalities (modes of perturbations): CRISPR knockout, CRISPR activation, CRISPR inhibition, CRISPR base editing, and CRISPR knockdown. All types of CRISPR nucleases are supported, including DNA- and RNA-target nucleases such as Cas9, Cas12a, and Cas13d. All types of base editors are also supported. gRNA design can be performed on reference genomes, transcriptomes, and custom DNA and RNA sequences. Both unpaired and paired gRNA designs are enabled.

Maintained by Jean-Philippe Fortin. Last updated 11 days ago.

crispr functionalgenomics genetarget bioconductor bioconductor-package crispr-cas9 crispr-design crispr-target genomics-analysis grna grna-sequence grna-sequences sgrna sgrna-design

3.4 match 22 stars 8.28 score 80 scripts 3 dependents

bioc

deconvR:Simulation and Deconvolution of Omic Profiles

This package provides a collection of functions designed for analyzing deconvolution of the bulk sample(s) using an atlas of reference omic signature profiles and a user-selected model. Users are given the option to create or extend a reference atlas and also simulate the desired size of the bulk signature profile of the reference cell types. The package includes the cell-type-specific methylation atlas and Illumina Epic B5 probe ids that can be used in deconvolution. Additionally, we included BSmeth2Probe to make mapping WGBS data to their probe IDs easier.

Maintained by Irem B. Gündüz. Last updated 5 months ago.

dnamethylation regression geneexpression rnaseq singlecell statisticalmethod transcriptomics bioconductor-package deconvolution dna-methylation omics

4.8 match 10 stars 5.78 score 15 scripts

cjendres1

nhanesA:NHANES Data Retrieval

Utility to retrieve data from the National Health and Nutrition Examination Survey (NHANES) website <https://www.cdc.gov/nchs/nhanes/>.

Maintained by Christopher Endres. Last updated 2 months ago.

nhanes

2.9 match 59 stars 9.37 score 239 scripts

bioc

MicrobiotaProcess:A comprehensive R package for managing and analyzing microbiome and other ecological data within the tidy framework

MicrobiotaProcess is an R package for analysis, visualization and biomarker discovery of microbial datasets. It introduces the MPSE class, which makes it more interoperable with the existing computing ecosystem. Moreover, it introduces a tidy microbiome data structure paradigm and analysis grammar. It provides a wide variety of microbiome data analysis procedures under a unified and common framework (tidy-like framework).

Maintained by Shuangbin Xu. Last updated 5 months ago.

visualization microbiome software multiplecomparison featureextraction microbiome-analysis microbiome-data

2.7 match 183 stars 9.70 score 126 scripts 1 dependents

lightbluetitan

MedDataSets:Comprehensive Medical, Disease, Treatment, and Drug Datasets

Provides an extensive collection of datasets related to medicine, diseases, treatments, drugs, and public health. This package covers topics such as drug effectiveness, vaccine trials, survival rates, infectious disease outbreaks, and medical treatments. The included datasets span various health conditions, including AIDS, cancer, bacterial infections, and COVID-19, along with information on pharmaceuticals and vaccines. These datasets are sourced from the R ecosystem and other R packages, remaining unaltered to ensure data integrity. This package serves as a valuable resource for researchers, analysts, and healthcare professionals interested in conducting medical and public health data analysis in R.

Maintained by Renzo Caceres Rossi. Last updated 5 months ago.

4.6 match 8 stars 5.68 score 60 scripts

ccsosa

GOCompare:Comprehensive GO Terms Comparison Between Species

Supports the assessment of functional enrichment analyses obtained for several lists of genes and provides a workflow to analyze them between two species via weighted graphs. Methods are described in Sosa et al. (2023) <doi:10.1016/j.ygeno.2022.110528>.

Maintained by Chrystian Camilo Sosa. Last updated 4 months ago.

6.3 match 9 stars 4.13 score 1 scripts

revolutionanalytics

foreach:Provides Foreach Looping Construct

Support for the foreach looping construct. Foreach is an idiom that allows for iterating over elements in a collection, without the use of an explicit loop counter. This package in particular is intended to be used for its return value, rather than for its side effects. In that sense, it is similar to the standard lapply function, but doesn't require the evaluation of a function. Using foreach without side effects also facilitates executing the loop in parallel.

Maintained by Folashade Daniel. Last updated 3 years ago.

foreach parallel-computing

1.5 match 54 stars 17.16 score 43k scripts 2.8k dependents
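
The foreach description above notes that a loop used for its return value, with no side effects, is close to lapply and is easy to run in parallel. A small sketch of that idea in Python, using the standard-library `concurrent.futures` module (the helper names are hypothetical, and this is an analogy rather than the foreach API):

```python
# A side-effect-free loop body can be collected sequentially or in parallel
# with the same result, because each iteration depends only on its own input.
from concurrent.futures import ThreadPoolExecutor


def cube(n):
    """Pure loop body: the result depends only on the argument."""
    return n ** 3


def collect_sequential(items):
    """Collect return values one after another, like an lapply-style loop."""
    return [cube(n) for n in items]


def collect_parallel(items, workers=4):
    """Collect the same values using a pool of worker threads.

    Executor.map returns results in input order, so the output matches
    the sequential version.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(cube, items))


if __name__ == "__main__":
    print(collect_sequential([1, 2, 3]))
    print(collect_parallel([1, 2, 3]))
```

Because `cube` has no side effects, switching between the two collectors changes only how the work is scheduled, not the values returned.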

kwstat

pals:Color Palettes, Colormaps, and Tools to Evaluate Them

A comprehensive collection of color palettes, colormaps, and tools to evaluate them. See Kovesi (2015) <doi:10.48550/arXiv.1509.03700>.

Maintained by Kevin Wright. Last updated 9 days ago.

2.2 match 83 stars 11.39 score 2.1k scripts 8 dependents

bioc

rCGH:Comprehensive Pipeline for Analyzing and Visualizing Array-Based CGH Data

A comprehensive pipeline for analyzing and interactively visualizing genomic profiles generated through commercial or custom aCGH arrays. As inputs, rCGH supports Agilent dual-color Feature Extraction files (.txt), from 44 to 400K, Affymetrix SNP6.0 and cytoScanHD probeset.txt, cychp.txt, and cnchp.txt files exported from ChAS or Affymetrix Power Tools. rCGH also supports custom arrays, provided data complies with the expected format. This package takes over all the steps required for the analysis of individual genomic profiles, from reading files to profile segmentation and gene annotation. This package also provides several visualization functions (static or interactive) which facilitate the interpretation of individual profiles. Input files can be in compressed format, e.g. .bz2 or .gz.

Maintained by Frederic Commo. Last updated 5 months ago.

acgh copynumbervariation preprocessing featureextraction

5.0 match 4 stars 5.10 score 26 scripts 1 dependents

ropensci

ijtiff:Comprehensive TIFF I/O with Full Support for 'ImageJ' TIFF Files

General purpose TIFF file I/O for R users. Currently the only such package with read and write support for TIFF files with floating point (real-numbered) pixels, and the only package that can correctly import TIFF files that were saved from 'ImageJ' and write TIFF files that can be correctly read by 'ImageJ' <https://imagej.net/ij/>. Also supports text image I/O.

Maintained by Rory Nolan. Last updated 6 days ago.

image-manipulation imagej peer-reviewed tiff-files tiff-images tiff

2.8 match 18 stars 8.97 score 36 scripts 7 dependents

bioc

MutationalPatterns:Comprehensive genome-wide analysis of mutational processes

Mutational processes leave characteristic footprints in genomic DNA. This package provides a comprehensive set of flexible functions that allows researchers to easily evaluate and visualize a multitude of mutational patterns in base substitution catalogues of e.g. healthy samples, tumour samples, or DNA-repair deficient cells. The package covers a wide range of patterns including: mutational signatures, transcriptional and replicative strand bias, lesion segregation, genomic distribution and association with genomic features, which are collectively meaningful for studying the activity of mutational processes. The package works with single nucleotide variants (SNVs), insertions and deletions (Indels), double base substitutions (DBSs) and larger multi base substitutions (MBSs). The package provides functionalities for both extracting mutational signatures de novo and determining the contribution of previously identified mutational signatures on a single sample level. MutationalPatterns integrates with common R genomic analysis workflows and allows easy association with (publicly available) annotation data.

Maintained by Mark van Roosmalen. Last updated 5 months ago.

genetics somaticmutation

3.4 match 7.27 score 251 scripts 1 dependents

ropensci

ckanr:Client for the Comprehensive Knowledge Archive Network ('CKAN') API

Client for 'CKAN' API (<https://ckan.org/>). Includes interface to 'CKAN' 'APIs' for search, list, show for packages, organizations, and resources. In addition, provides an interface to the 'datastore' API.

Maintained by Francisco Alves. Last updated 2 years ago.

database open-data ckan api data dataset api-wrapper ckan-api

2.9 match 100 stars 8.67 score 448 scripts 4 dependents

bioc

RnBeads:RnBeads

RnBeads facilitates comprehensive analysis of various types of DNA methylation data at the genome scale.

Maintained by Fabian Mueller. Last updated 1 month ago.

dnamethylation methylationarray methylseq epigenetics qualitycontrol preprocessing batcheffect differentialmethylation sequencing cpgisland immunooncology twochannel dataimport

3.5 match 6.85 score 169 scripts 1 dependents

alsguimaraes

MUS:Monetary Unit Sampling and Estimation Methods, Widely Used in Auditing

Sampling and evaluation methods to apply Monetary Unit Sampling (or in older literature Dollar Unit Sampling) during an audit of financial statements.

Maintained by Henning Prömpers. Last updated 6 years ago.

audit mus r-project

5.3 match 5 stars 4.60 score 16 scripts

erhard-lab

grandR:Comprehensive Analysis of Nucleotide Conversion Sequencing Data

Nucleotide conversion sequencing experiments have been developed to add a temporal dimension to RNA-seq and single-cell RNA-seq. Such experiments require specialized tools for primary processing such as GRAND-SLAM (see 'Jürges et al' <doi:10.1093/bioinformatics/bty256>) and specialized tools for downstream analyses. 'grandR' provides a comprehensive toolbox for quality control, kinetic modeling, differential gene expression analysis and visualization of such data.

Maintained by Florian Erhard. Last updated 1 month ago.

3.4 match 11 stars 7.03 score 18 scripts 1 dependents

adamlilith

fasterRaster:Faster Raster and Spatial Vector Processing Using 'GRASS GIS'

Processing of large-in-memory/large-on-disk rasters and spatial vectors using 'GRASS GIS' <https://grass.osgeo.org/>. Most functions in the 'terra' package are recreated. Processing of medium-sized and smaller spatial objects will nearly always be faster using 'terra' or 'sf', but for large-in-memory/large-on-disk objects, 'fasterRaster' may be faster. To use most of the functions, you must have the stand-alone version (not the 'OSGeoW4' installer version) of 'GRASS GIS' 8.0 or higher.

Maintained by Adam B. Smith. Last updated 19 days ago.

aspect distance fragmentation fragmentation-indices gis grass grass-gis raster raster-projection rasterize slope topography vectorization

3.1 match 58 stars 7.69 score 8 scripts

homerhanumat

tigerstats:R Functions for Elementary Statistics

A collection of data sets and functions that are useful in the teaching of statistics at an elementary level to students who may have little or no previous experience with the command line. The functions for elementary inferential procedures follow a uniform interface for user input. Some of the functions are instructional applets that can only be run on the R Studio integrated development environment with package 'manipulate' installed. Other instructional applets are Shiny apps that may be run locally. In teaching, the package is used alongside the packages 'mosaic', 'mosaicData' and 'abd', which are therefore listed as dependencies.

Maintained by Homer White. Last updated 4 years ago.

4.0 match 16 stars 5.77 score 327 scripts

beckerbenj

eatGADS:Data Management of Large Hierarchical Data

Import 'SPSS' data, handle and change 'SPSS' meta data, store and access large hierarchical data in 'SQLite' data bases.

Maintained by Benjamin Becker. Last updated 23 days ago.

3.1 match 1 stars 7.36 score 34 scripts 1 dependents

sonsoleslp

tna:Transition Network Analysis (TNA)

Provides tools for performing Transition Network Analysis (TNA) to study relational dynamics, including functions for building and plotting TNA models, calculating centrality measures, and identifying dominant events and patterns. TNA statistical techniques (e.g., bootstrapping and permutation tests) ensure the reliability of observed insights and confirm that identified dynamics are meaningful. See (Saqr et al., 2025) <doi:10.1145/3706468.3706513> for more details on TNA.

Maintained by Sonsoles López-Pernas. Last updated 3 days ago.

educational-data-mining learning-analytics markov-model temporal-analysis

3.5 match 4 stars 6.48 score 5 scripts

arnaudgallou

plume:A Simple Author Handler for Scientific Writing

Handles and formats author information in scientific writing in 'R Markdown' and 'Quarto'. 'plume' provides easy-to-use and flexible tools for injecting author metadata in 'YAML' headers as well as generating author and contribution lists (among others) as strings from tabular data.

Maintained by Arnaud Gallou. Last updated 30 days ago.

authors contribution contributions list lists markdown paper preprint quarto role roles

3.3 match 21 stars 6.80 score 15 scripts

ecor

RMAWGEN:Multi-Site Auto-Regressive Weather GENerator

S3 and S4 functions are implemented for spatial multi-site stochastic generation of daily time series of temperature and precipitation. These tools make use of Vector AutoRegressive models (VARs). The weather generator model is then saved as an object and is calibrated by daily instrumental "Gaussianized" time series through the 'vars' package tools. Once obtained, this model can be used for weather generation and be adapted to work with several climatic monthly time series.

Maintained by Emanuele Cordano. Last updated 26 days ago.

4.0 match 3 stars 5.62 score 115 scripts 4 dependents

bioc

oposSOM:Comprehensive analysis of transcriptome data

This package translates microarray expression data into metadata of reduced dimension. It provides various sample-centered and group-centered visualizations, sample similarity analyses and functional enrichment analyses. The underlying SOM algorithm combines feature clustering, multidimensional scaling and dimension reduction, along with strong visualization capabilities. It enables extraction and description of functional expression modules inherent in the data.

Maintained by Henry Loeffler-Wirth. Last updated 5 months ago.

geneexpression differentialexpression genesetenrichment datarepresentation visualization cpp

5.0 match 4.48 score 7 scripts

person-c

easybio:Comprehensive Single-Cell Annotation and Transcriptomic Analysis Toolkit

Provides a comprehensive toolkit for single-cell annotation with the 'CellMarker2.0' database (see Xia Li, Peng Wang, Yunpeng Zhang (2023) <doi:10.1093/nar/gkac947>). Streamlines biological label assignment in single-cell RNA-seq data and facilitates transcriptomic analysis, including preparation of TCGA <https://portal.gdc.cancer.gov/> and GEO <https://www.ncbi.nlm.nih.gov/geo/> datasets, differential expression analysis and visualization of enrichment analysis results. Additional utility functions support various bioinformatics workflows. See Wei Cui (2024) <doi:10.1101/2024.09.14.609619> for more details.

Maintained by Wei Cui. Last updated 13 days ago.

limma geoquery edger fgsea bioinformatics cellmarker2 gsea rna-seq single-cell

3.4 match 10 stars 6.62 score 35 scripts

ropensci

rredlist:'IUCN' Red List Client

'IUCN' Red List (<https://api.iucnredlist.org/>) client. The 'IUCN' Red List is a global list of threatened and endangered species. Functions cover all of the Red List 'API' routes. An 'API' key is required.

Maintained by William Gearty. Last updated 1 month ago.

iucn biodiversity api web-services traits habitat species conservation api-wrapper iucn-red-list taxize

1.9 match 53 stars 11.49 score 195 scripts 24 dependents

asa12138

pctax:Professional Comprehensive Omics Data Analysis

Provides a comprehensive suite of tools for analyzing omics data. It includes functionalities for alpha diversity analysis, beta diversity analysis, differential abundance analysis, community assembly analysis, visualization of phylogenetic tree, and functional enrichment analysis. With a progressive approach, the package offers a range of analysis methods to explore and understand the complex communities. It is designed to support researchers and practitioners in conducting in-depth and professional omics data analysis.

Maintained by Chen Peng. Last updated 4 months ago.

microbiome software visualization omics

3.5 match 14 stars 5.89 score 14 scripts

mjlajeunesse

metagear:Comprehensive Research Synthesis Tools for Systematic Reviews and Meta-Analysis

Functionalities for facilitating systematic reviews, data extractions, and meta-analyses. It includes a GUI (graphical user interface) to help screen the abstracts and titles of bibliographic data; tools to assign screening effort across multiple collaborators/reviewers and to assess inter-reviewer reliability; tools to help automate the download and retrieval of journal PDF articles from online databases; figure and image extractions from PDFs; web scraping of citations; automated and manual data extraction from scatter-plot and bar-plot images; PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagrams; simple imputation tools to fill gaps in incomplete or missing study parameters; generation of random effects sizes for Hedges' d, log response ratio, odds ratio, and correlation coefficients for Monte Carlo experiments; covariance equations for modelling dependencies among multiple effect sizes (e.g., effect sizes with a common control); and finally summaries that replicate analyses and outputs from widely used but no longer updated meta-analysis software (i.e., metawin). Funding for this package was supported by National Science Foundation (NSF) grants DBI-1262545 and DEB-1451031. CITE: Lajeunesse, M.J. (2016) Facilitating systematic reviews, data extraction and meta-analysis with the metagear package for R. Methods in Ecology and Evolution 7, 323-330 <doi:10.1111/2041-210X.12472>.

Maintained by Marc J. Lajeunesse. Last updated 4 years ago.

2.8 match 14 stars 6.71 score 91 scripts

palderman

DSSAT:A Comprehensive R Interface for the DSSAT Cropping Systems Model

The purpose of this package is to provide a comprehensive R interface to the Decision Support System for Agrotechnology Transfer Cropping Systems Model (DSSAT-CSM; see <https://dssat.net> for more information). The package provides cross-platform functions to read and write input files, run DSSAT-CSM, and read output files.

Maintained by Phillip D. Alderman. Last updated 1 year ago.

3.4 match 22 stars 5.57 score 34 scripts

zheng206

ComBatFamQC:Comprehensive Batch Effect Diagnostics and Harmonization

Provides a comprehensive framework for batch effect diagnostics, harmonization, and post-harmonization downstream analysis. Features include interactive visualization tools, robust statistical tests, and a range of harmonization techniques. Additionally, 'ComBatFamQC' enables the creation of life-span age trend plots with estimated age-adjusted centiles and facilitates the generation of covariate-corrected residuals for analytical purposes. Methods for harmonization are based on approaches described in Johnson et al., (2007) <doi:10.1093/biostatistics/kxj037>, Beer et al., (2020) <doi:10.1016/j.neuroimage.2020.117129>, Pomponio et al., (2020) <doi:10.1016/j.neuroimage.2019.116450>, and Chen et al., (2021) <doi:10.1002/hbm.25688>.

Maintained by Zheng Ren. Last updated 2 months ago.

diagnostic-tool harmonization rshinyapp

3.5 match 2 stars 5.35 score 16 scripts

tonigi

dtw:Dynamic Time Warping Algorithms

A comprehensive implementation of dynamic time warping (DTW) algorithms in R. DTW computes the optimal (least cumulative distance) alignment between points of two time series. Common DTW variants covered include local (slope) and global (window) constraints, subsequence matches, arbitrary distance definitions, normalizations, minimum variance matching, and so on. Provides cumulative distances, alignments, specialized plot styles, etc., as described in Giorgino (2009) <doi:10.18637/jss.v031.i07>.

Maintained by Toni Giorgino. Last updated 2 years ago.

2.2 match 5 stars 8.48 score 582 scripts 49 dependents
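
The alignment idea behind the dtw entry above (optimal, least cumulative distance alignment between two time series) can be illustrated with a minimal sketch. This is plain Python, not the 'dtw' package API: the function name `dtw`, the default absolute-difference distance, and the unconstrained symmetric step pattern are assumptions made for illustration only.

```python
from math import inf


def dtw(x, y, dist=lambda a, b: abs(a - b)):
    """Hypothetical helper: return (cumulative distance, alignment path).

    Plain dynamic-programming DTW with no window or slope constraints.
    """
    n, m = len(x), len(y)
    # cost[i][j] = least cumulative distance aligning x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(x[i - 1], y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # advance in x only
                                 cost[i][j - 1],      # advance in y only
                                 cost[i - 1][j - 1])  # advance in both
    # Backtrack from the end to recover the index pairs of the alignment.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


distance, path = dtw([1, 2, 3], [1, 2, 2, 3])
print(distance, path)
```

Here the repeated 2 in the second series is matched twice against the single 2 in the first, so the two series align at zero cumulative distance.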

statleila

priorityelasticnet:Comprehensive Analysis of Multi-Omics Data Using an Offset-Based Method

Priority-ElasticNet extends the Priority-LASSO method (Klau et al. (2018) <doi:10.1186/s12859-018-2344-6>) by incorporating the ElasticNet penalty, allowing for both L1 and L2 regularization. This approach fits successive ElasticNet models for several blocks of (omics) data with different priorities, using the predicted values from each block as an offset for the subsequent block. It also offers robust options to handle block-wise missingness in multi-omics data, improving the flexibility and applicability of the model in the presence of incomplete datasets.

Maintained by Laila Qadir Musib. Last updated 2 months ago.

5.5 match 3.34 score

welch-lab

rliger:Linked Inference of Genomic Experimental Relationships

Uses an extension of nonnegative matrix factorization to identify shared and dataset-specific factors. See Welch J, Kozareva V, et al (2019) <doi:10.1016/j.cell.2019.05.006>, and Liu J, Gao C, Sodicoff J, et al (2020) <doi:10.1038/s41596-020-0391-8> for more details.

Maintained by Yichen Wang. Last updated 2 months ago.

nonnegative-matrix-factorization single-cell openblas cpp

1.7 match 408 stars 10.77 score 334 scripts 1 dependents

cran

CopulaREMADA:Copula Mixed Models for Multivariate Meta-Analysis of Diagnostic Test Accuracy Studies

The bivariate copula mixed model for meta-analysis of diagnostic test accuracy studies in Nikoloulopoulos (2015) <doi:10.1002/sim.6595> and Nikoloulopoulos (2018) <doi:10.1007/s10182-017-0299-y>. The vine copula mixed model for meta-analysis of diagnostic test accuracy studies accounting for disease prevalence in Nikoloulopoulos (2017) <doi:10.1177/0962280215596769> and also accounting for non-evaluable subjects in Nikoloulopoulos (2020) <doi:10.1515/ijb-2019-0107>. The hybrid vine copula mixed model for meta-analysis of diagnostic test accuracy case-control and cohort studies in Nikoloulopoulos (2018) <doi:10.1177/0962280216682376>. The D-vine copula mixed model for meta-analysis and comparison of two diagnostic tests in Nikoloulopoulos (2019) <doi:10.1177/0962280218796685>. The multinomial quadrivariate D-vine copula mixed model for meta-analysis of diagnostic tests with non-evaluable subjects in Nikoloulopoulos (2020) <doi:10.1177/0962280220913898>. The one-factor copula mixed model for joint meta-analysis of multiple diagnostic tests in Nikoloulopoulos (2022) <doi:10.1111/rssa.12838>. The multinomial six-variate 1-truncated D-vine copula mixed model for meta-analysis of two diagnostic tests accounting for within and between studies dependence in Nikoloulopoulos (2024) <doi:10.1177/09622802241269645>. The 1-truncated D-vine copula mixed models for meta-analysis of diagnostic accuracy studies without a gold standard (Nikoloulopoulos, 2024).

Maintained by Aristidis K. Nikoloulopoulos. Last updated 5 months ago.

11.3 match 2 stars 1.60 score 10 scripts

biogenies

countfitteR:Comprehensive Automatized Evaluation of Distribution Models for Count Data

A large number of measurements generate count data. This is a statistical data type that only assumes non-negative integer values and is generated by counting. Typically, count data can be found in biomedical applications, such as the analysis of DNA double-strand breaks. The number of DNA double-strand breaks can be counted in individual cells using various bioanalytical methods. For diagnostic applications, it is relevant to record the distribution of the count data in order to determine their biomedical significance (Roediger, S. et al., 2018. Journal of Laboratory and Precision Medicine. <doi:10.21037/jlpm.2018.04.10>). The software offers functions for a comprehensive automated evaluation of distribution models of count data. In addition to programmatic interaction, a graphical user interface (web server) is included, which enables fast and interactive data-scientific analyses. The user is supported in selecting the most suitable count distribution for their own data set.

Maintained by Jaroslaw Chilimoniuk. Last updated 2 years ago.

cancer cancer-imaging-research count-data count-distribution foci

3.4 match 4 stars 5.33 score 27 scripts

dzhakparov

GeneSelectR:Comprehensive Feature Selection Workflow for Bulk RNAseq Datasets

GeneSelectR is a versatile R package designed for efficient RNA sequencing data analysis. Its key innovation lies in the seamless integration of the Python sklearn machine learning framework with R-based bioinformatics tools. This integration enables GeneSelectR to perform robust ML-driven feature selection while simultaneously leveraging the power of Gene Ontology (GO) enrichment and semantic similarity analyses. By combining these diverse methodologies, GeneSelectR offers a comprehensive workflow that optimizes both the computational aspects of ML and the biological insights afforded by advanced bioinformatics analyses. Ideal for researchers in bioinformatics, GeneSelectR stands out as a unique tool for analyzing complex RNAseq datasets with enhanced precision and relevance.

Maintained by Damir Zhakparov. Last updated 10 months ago.

3.4 match 19 stars 5.28 score 7 scripts

nandp1

gpbStat:Comprehensive Statistical Analysis of Plant Breeding Experiments

Performs statistical data analysis of various Plant Breeding experiments. Contains functions for Line by Tester analysis as per Arunachalam, V.(1974) <http://repository.ias.ac.in/89299/> and Diallel analysis as per Griffing, B. (1956) <https://www.publish.csiro.au/bi/pdf/BI9560463>.

Maintained by Nandan Patil. Last updated 4 months ago.

biometrics genetics plantbreeding

2.9 match 3 stars 6.08 score 27 scripts

bioc

decoupleR:decoupleR: Ensemble of computational methods to infer biological activities from omics data

Many methods allow us to extract biological activities from omics data using information from prior knowledge resources, reducing the dimensionality for increased statistical power and better interpretability. Here, we present decoupleR, a Bioconductor package containing different statistical methods to extract these signatures within a unified framework. decoupleR allows the user to flexibly test any method with any resource. It incorporates methods that take into account the sign and weight of network interactions. decoupleR can be used with any omic, as long as its features can be linked to a biological process based on prior knowledge. For example, in transcriptomics gene sets regulated by a transcription factor, or in phospho-proteomics phosphosites that are targeted by a kinase.

Maintained by Pau Badia-i-Mompel. Last updated 5 months ago.

differentialexpression functionalgenomics geneexpression generegulation network software statisticalmethod transcription

1.6 match 230 stars 11.27 score 316 scripts 3 dependents

insightsengineering

teal.data:Data Model for 'teal' Applications

Provides a 'teal_data' class as a unified data model for 'teal' applications focusing on reproducibility and relational data.

Maintained by Dawid Kaledkowski. Last updated 2 months ago.

data-model nest

1.8 match 11 stars 9.93 score 44 scripts 8 dependents

duncantl

CodeDepends:Analysis of R Code for Reproducible Research and Code Comprehension

Tools for analyzing R expressions or blocks of code and determining the dependencies between them. It focuses on R scripts, but can be used on the bodies of functions. There are many facilities, including the ability to summarize or get a high-level view of code, determine dependencies between variables, and suggest code improvements.

Maintained by Gabriel Becker. Last updated 1 year ago.

2.9 match 89 stars 5.87 score 70 scripts 1 dependents

cjvanlissa

worcs:Workflow for Open Reproducible Code in Science

Create reproducible and transparent research projects in 'R'. This package is based on the Workflow for Open Reproducible Code in Science (WORCS), a step-by-step procedure based on best practices for Open Science. It includes an 'RStudio' project template, several convenience functions, and all dependencies required to make your project reproducible and transparent. WORCS is explained in the tutorial paper by Van Lissa, Brandmaier, Brinkman, Lamprecht, Struiksma, & Vreede (2021). <doi:10.3233/DS-210031>.

Maintained by Caspar J. Van Lissa. Last updated 11 days ago.

1.8 match 83 stars 9.26 score 59 scripts

bioc

CaMutQC:An R Package for Comprehensive Filtration and Selection of Cancer Somatic Mutations

CaMutQC filters false positive mutations generated by technical issues and selects candidate cancer mutations through a series of well-structured functions that label mutations with various flags. A detailed and vivid filter report is offered after a whole filtration or selection section is completed. CaMutQC also integrates several methods and gene panels for Tumor Mutational Burden (TMB) estimation.

Maintained by Xin Wang. Last updated 5 months ago.

software qualitycontrol genetarget cancer-genomics somatic-mutations

2.8 match 7 stars 5.92 score 1 scripts

rodivinity

mbreaks:Estimation and Inference for Structural Breaks in Linear Regression Models

Functions provide comprehensive treatments for estimating, inferring, testing and model selecting in linear regression models with structural breaks. The tests, estimation methods, inference and information criteria implemented are discussed in Bai and Perron (1998) "Estimating and Testing Linear Models with Multiple Structural Changes" <doi:10.2307/2998540>.

Maintained by Linh Nguyen. Last updated 4 months ago.

4.1 match 4.04 score 11 scripts

psychmeta

psychmeta:Psychometric Meta-Analysis Toolkit

Tools for computing bare-bones and psychometric meta-analyses and for generating psychometric data for use in meta-analysis simulations. Supports bare-bones, individual-correction, and artifact-distribution methods for meta-analyzing correlations and d values. Includes tools for converting effect sizes, computing sporadic artifact corrections, reshaping meta-analytic databases, computing multivariate corrections for range variation, and more. Bugs can be reported to <https://github.com/psychmeta/psychmeta/issues> or <issues@psychmeta.com>.

Maintained by Jeffrey A. Dahlke. Last updated 9 months ago.

hacktoberfest meta-analysis psychology psychometric psychometrics

2.0 match 57 stars 8.25 score 151 scripts

talegari

pkggraph:A Consistent and Intuitive Platform to Explore the Dependencies of Packages on the Comprehensive R Archive Network Like Repositories

Interactively explore various dependencies of a package(s) (on the Comprehensive R Archive Network Like repositories) and perform analysis using tidy philosophy. Most of the functions return a 'tibble' object (enhancement of 'dataframe') which can be used for further analysis. The package offers functions to produce 'network' and 'igraph' dependency graphs. The 'plot' method produces a static plot based on 'ggnetwork' and 'plotd3' function produces an interactive D3 plot based on 'networkD3'.

Maintained by KS Srikanth. Last updated 6 years ago.

graphs

3.2 match 9 stars 5.12 score 29 scripts

bioc

HDF5Array:HDF5 datasets as array-like objects in R

The HDF5Array package is an HDF5 backend for DelayedArray objects. It implements the HDF5Array, H5SparseMatrix, H5ADMatrix, and TENxMatrix classes, 4 convenient and memory-efficient array-like containers for representing and manipulating either: (1) a conventional (a.k.a. dense) HDF5 dataset, (2) an HDF5 sparse matrix (stored in CSR/CSC/Yale format), (3) the central matrix of an h5ad file (or any matrix in the /layers group), or (4) a 10x Genomics sparse matrix. All these containers are DelayedArray extensions and thus support all operations (delayed or block-processed) supported by DelayedArray objects.

Maintained by Hervé Pagès. Last updated 26 days ago.

infrastructure datarepresentation dataimport sequencing rnaseq coverage annotation genomeannotation singlecell immunooncology bioconductor-package core-package u24ca289073

1.2 match 12 stars 13.19 score 844 scripts 123 dependents

gaospecial

ggVennDiagram:A 'ggplot2' Implement of Venn Diagram

Easy-to-use functions to generate publication-quality Venn or upset plots for 2-7 sets. 'ggVennDiagram' plots Venn or upset diagrams using a well-defined geometry dataset and 'ggplot2'. The shapes for 2-4 set Venn diagrams use circles and ellipses, while the shapes for 4-7 set Venn diagrams use irregular polygons (4 has both forms), which are developed and imported from another package, 'venn', authored by Adrian Dusa. Internal functions integrate the shape data with user-provided set data and calculate the geometry of every region/intersection, then plot the Venn diagram in four separate components: set edges, set labels, region edges, and region labels. From version 1.0, these components can be customized using ordinary 'ggplot2' grammar. From version 1.4.4, an unlimited number of sets is supported, as a plain upset plot is drawn automatically when the number of sets is more than 7.

Maintained by Chun-Hui Gao. Last updated 5 months ago.

set-operations upset upsetplot venn-diagram venn-plot

1.2 match 289 stars 12.67 score 1.3k scripts 4 dependents

jacgoldsm

peruse:A Tidy API for Sequence Iteration and Set Comprehension

A friendly API for sequence iteration and set comprehension.

Maintained by Jacob Goldsmith. Last updated 4 years ago.

5.5 match 1 stars 2.70 score 2 scripts

nathanael-g-durst

KrakenR:Comprehensive R Interface for Accessing Kraken Cryptocurrency Exchange REST API

A comprehensive R interface to access data from the Kraken cryptocurrency exchange REST API <https://docs.kraken.com/api/>. It allows users to retrieve various market data, such as asset information, trading pairs, and price data. The package is designed to facilitate efficient data access for analysis, strategy development, and monitoring of cryptocurrency market trends.

Maintained by Nathanaël Dürst. Last updated 3 days ago.

3.3 match 4.48 score 10 scripts

denironyx

tidycountries:Access and Manipulate Comprehensive Country Level Data in Tidy Format

A comprehensive and user-friendly interface for accessing, manipulating, and analyzing country-level data from around the world. It allows users to retrieve detailed information on countries, including names, regions, continents, populations, currencies, calling codes, and more, all in a tidy data format. The package is designed to work seamlessly within the 'tidyverse' ecosystem, making it easy to filter, arrange, and visualize country-level data in R.

Maintained by Dennis Irorere. Last updated 5 months ago.

3.3 match 9 stars 4.35 score 7 scripts

zyang2k

bluebike:Blue Bike Comprehensive Data

Facilitates the importation of the Boston Blue Bike trip data since 2015. Functions include the computation of trip distances of given trip data. It can also map the location of stations within a given radius and calculate the distance to nearby stations. Data is from <https://www.bluebikes.com/system-data>.

Maintained by Ziyue Yang. Last updated 3 years ago.

3.1 match 4 stars 4.60 score 7 scripts

silentspringinstitute

RNHANES:Facilitates Analysis of CDC NHANES Data

Tools for downloading and analyzing CDC NHANES data, with a focus on analytical laboratory data.

Maintained by Herb Susmann. Last updated 2 days ago.

nhanes publichealth

1.8 match 77 stars 7.58 score 83 scripts

decisionpatterns

na.tools:Comprehensive Library for Working with Missing (NA) Values in Vectors

This comprehensive toolkit provides a consistent and extensible framework for working with missing values in vectors. The companion package 'tidyimpute' provides similar functionality for list-like and table-like structures. Functions exist for detection, removal, replacement, imputation, recollection, etc. of 'NAs'.

Maintained by Christopher Brown. Last updated 6 years ago.

3.4 match 2 stars 4.04 score 109 scripts

bioc

PhosR:A set of methods and tools for comprehensive analysis of phosphoproteomics data

PhosR is a package for the comprehensive analysis of phosphoproteomic data. There are two major components to PhosR: processing and downstream analysis. PhosR consists of various processing tools for phosphoproteomics data including filtering, imputation, normalisation, and functional analysis for inferring active kinases and signalling pathways.

Maintained by Taiyun Kim. Last updated 5 months ago.

software researchfield proteomics

2.9 match 4.71 score 51 scripts

bioc

seq.hotSPOT:Targeted sequencing panel design based on mutation hotspots

seq.hotSPOT provides a resource for designing effective sequencing panels to help improve mutation capture efficacy for ultradeep sequencing projects. Using SNV datasets, this package designs custom panels for any tissue of interest and identifies the genomic regions likely to contain the most mutations. Establishing efficient targeted sequencing panels can allow researchers to study mutation burden in tissues at high depth without the economic burden of whole-exome or whole-genome sequencing. This tool was developed to make high-depth sequencing panels to study low-frequency clonal mutations in clinically normal and cancerous tissues.

Maintained by Sydney Grant. Last updated 5 months ago.

software technology sequencing dnaseq wholegenome

3.2 match 4.00 score 3 scripts

ozancanozdemir

turkeyelections:The Most Comprehensive R Package for Turkish Election Results

Includes the results of general, local, and presidential elections held in Turkey between 1995 and 2023, broken down by provinces and overall national results. It facilitates easy processing of this data and the creation of visual representations based on these election results.

Maintained by Ozancan Ozdemir. Last updated 9 months ago.

2.9 match 15 stars 4.18 score 1 scripts

kathbaum

LoopDetectR:Comprehensive Feedback Loop Detection in ODE Models

Detect feedback loops (cycles, circuits) between species (nodes) in ordinary differential equation (ODE) models. Feedback loops are paths from a node to itself without visiting any other node twice, and they have important regulatory functions. Loops are reported with their order of participating nodes and their length, and whether the loop is a positive or a negative feedback loop. An upper limit of the number of feedback loops limits runtime (which scales with feedback loop count). Model parametrizations and values of the modelled variables are accounted for. Computation uses the characteristics of the Jacobian matrix as described e.g. in Thomas and Kaufman (2002) <doi:10.1016/s1631-0691(02)01452-x>. Input can be the Jacobian matrix of the ODE model or the ODE function definition; in the latter case, the Jacobian matrix is determined using 'numDeriv'. Graph-based algorithms from 'igraph' are employed for path detection.

Maintained by Katharina Baum. Last updated 5 years ago.

5.8 match 2.00 score 1 scripts
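
The loop detection described in the LoopDetectR entry above (cycles in the interaction graph of a Jacobian matrix, reported with node order, length, and sign, up to a maximum count) can be sketched as follows. This is plain Python, not the package's interface: the function name `feedback_loops`, the `max_loops` parameter, and the brute-force depth-first search are assumptions for illustration, whereas the entry states that the package itself relies on 'igraph' algorithms. The sketch also assumes the usual convention that a non-zero entry `jac[i][j]` means variable j influences variable i.

```python
def feedback_loops(jac, max_loops=1000):
    """Hypothetical helper: list feedback loops of a Jacobian's graph.

    Returns (nodes, length, sign) tuples, where sign is +1 for a positive
    loop and -1 for a negative one (product of the edge signs).
    """
    n = len(jac)
    # succ[j] = nodes that j acts on, i.e. rows i with a non-zero jac[i][j]
    succ = [[i for i in range(n) if jac[i][j] != 0] for j in range(n)]
    loops = []

    def extend(start, node, path, sign):
        for nxt in succ[node]:
            if len(loops) >= max_loops:
                return  # upper limit on loop count bounds the runtime
            s = sign * (1 if jac[nxt][node] > 0 else -1)
            if nxt == start:
                loops.append((path[:], len(path), s))
            elif nxt > start and nxt not in path:
                # Only visit nodes above the start node, so each loop is
                # reported once, rooted at its smallest node index.
                path.append(nxt)
                extend(start, nxt, path, s)
                path.pop()

    for start in range(n):
        extend(start, start, [start], 1)
    return loops


# Variable 0 inhibits itself and activates variable 1; variable 1 inhibits 0.
jac = [[-1, -2],
       [3, 0]]
for nodes, length, sign in feedback_loops(jac):
    print(nodes, length, "positive" if sign > 0 else "negative")
```

Enumerating every loop is exponential in the worst case, which is why a cap such as `max_loops` is useful even in a toy version.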

zzz1990771

geeVerse:A Comprehensive Analysis of High Dimensional Longitudinal Data

To provide a comprehensive analysis of high dimensional longitudinal data, this package provides analysis for any combination of 1) simultaneous variable selection and estimation, 2) mean regression or quantile regression for heterogeneous data, 3) cross-sectional or longitudinal data, 4) balanced or imbalanced data, 5) moderate, high or even ultra-high dimensional data, via computationally efficient implementations of penalized generalized estimating equations.

Maintained by Tianhai Zu. Last updated 4 months ago.

cpp

3.4 match 3.30 score 5 scripts

isaakiel

mortAAR:Analysis of Archaeological Mortality Data

A collection of functions for the analysis of archaeological mortality data (on the topic see e.g. Chamberlain 2006 <https://books.google.de/books?id=nG5FoO_becAC&lpg=PA27&ots=LG0b_xrx6O&dq=life%20table%20archaeology&pg=PA27#v=onepage&q&f=false>). It takes demographic data in different formats and displays the result in a standard life table as well as plots the relevant indices (percentage of deaths, survivorship, probability of death, life expectancy, percentage of population). It also checks for possible biases in the age structure and applies corrections to life tables.

Maintained by Nils Mueller-Scheessel. Last updated 2 months ago.

anthropology archaeology demography statistics

1.5 match 15 stars 7.49 score 23 scripts

bioc

YAPSA:Yet Another Package for Signature Analysis

This package provides functions and routines for supervised analyses of mutational signatures (i.e., the signatures have to be known, cf. L. Alexandrov et al., Nature 2013 and L. Alexandrov et al., bioRxiv 2018). In particular, the family of functions LCD (LCD = linear combination decomposition) can use optimal signature-specific cutoffs, which take care of the different detectability of the different signatures. Moreover, the package provides different sets of mutational signatures, including the COSMIC and PCAWG SNV signatures and the PCAWG Indel signatures; with the latter, YAPSA extends the concept of supervised analysis of mutational signatures to Indel signatures. YAPSA also provides confidence intervals as computed by profile likelihoods and can perform signature analysis on a stratified mutational catalogue (SMC = stratify mutational catalogue) in order to analyze enrichment and depletion patterns for the signatures in different strata.

Maintained by Zuguang Gu. Last updated 5 months ago.

sequencing dnaseq somaticmutation visualization clustering genomicvariation statisticalmethod biologicalquestion

1.8 match 6.41 score 57 scripts

masurp

specr:Conducting and Visualizing Specification Curve Analyses

Provides utilities for conducting specification curve analyses (Simonsohn, Simmons & Nelson, 2020, <doi: 10.1038/s41562-020-0912-z>) or multiverse analyses (Steegen, Tuerlinckx, Gelman & Vanpaemel, 2016, <doi: 10.1177/1745691616658637>), including functions to set up, run, evaluate, and plot all specifications.

Maintained by Philipp K. Masur. Last updated 10 months ago.

multiverse specification-curve

1.3 match 68 stars 8.02 score 85 scripts

cran

GWASinspector:Comprehensive and Easy to Use Quality Control of GWAS Results

When evaluating the results of a genome-wide association study (GWAS), it is important to perform a quality control to ensure that the results are valid, complete, correctly formatted, and, in case of meta-analysis, consistent with other studies that have applied the same analysis. This package was developed to facilitate and streamline this process and provide the user with a comprehensive report.

Maintained by Alireza Ani. Last updated 10 months ago.

5.1 match 2.00 score

eddelbuettel

digest:Create Compact Hash Digests of R Objects

Implementation of a function 'digest()' for the creation of hash digests of arbitrary R objects (using the 'md5', 'sha-1', 'sha-256', 'crc32', 'xxhash', 'murmurhash', 'spookyhash', 'blake3', 'crc32c', 'xxh3_64', and 'xxh3_128' algorithms) permitting easy comparison of R language objects, as well as functions such as 'hmac()' to create hash-based message authentication code. Please note that this package is not meant to be deployed for cryptographic purposes for which more comprehensive (and widely tested) libraries such as 'OpenSSL' should be used.

Maintained by Dirk Eddelbuettel. Last updated 2 months ago.

hash-digest

0.5 match 114 stars 19.82 score 11k scripts 6.9k dependents

ineelhere

clintrialx:Connect and Work with Clinical Trials Data Sources

Are you spending too much time fetching and managing clinical trial data? Struggling with complex queries and bulk data extraction? What if you could simplify this process with just a few lines of code? Introducing 'clintrialx' - Fetch clinical trial data from sources like 'ClinicalTrials.gov' <https://clinicaltrials.gov/> and the 'Clinical Trials Transformation Initiative - Access to Aggregate Content of ClinicalTrials.gov' database <https://aact.ctti-clinicaltrials.org/>, supporting pagination and bulk downloads. Also, you can generate HTML reports based on the data obtained from the sources!

Maintained by Indraneel Chakraborty. Last updated 5 days ago.

aact bioinformatics clinical-data clinical-trials clinicaltrialsgov ctti data data-management medical-informatics r-language trials

1.8 match 15 stars 5.76 score 11 scripts

bioc

abseqR:Reporting and data analysis functionalities for Rep-Seq datasets of antibody libraries

AbSeq is a comprehensive bioinformatic pipeline for the analysis of sequencing datasets generated from antibody libraries and abseqR is one of its packages. abseqR empowers the users of abseqPy (https://github.com/malhamdoosh/abseqPy) with plotting and reporting capabilities and allows them to generate interactive HTML reports for the convenience of viewing and sharing with other researchers. Additionally, abseqR extends abseqPy to compare multiple repertoire analyses and perform further downstream analysis on its output.

Maintained by JiaHong Fong. Last updated 5 months ago.

sequencing visualization reportwriting qualitycontrol multiplecomparison

2.5 match 4.00 score 3 scripts

bioc

scmeth:Functions to conduct quality control analysis in methylation data

Functions to analyze methylation data can be found here. Some functions are relevant for single cell methylation data but most other functions can be used for any methylation data. The highlight of this workflow is the comprehensive quality control report.

Maintained by Divy Kangeyan. Last updated 5 months ago.

dnamethylation qualitycontrol preprocessing singlecell immunooncology bioconductor-package methylation single-cell-methylation

2.1 match 4.70 score 5 scripts

bioc

GeneExpressionSignature:Gene Expression Signature based Similarity Metric

This package implements gene expression signatures and a distance between them. A gene expression signature is represented as a list of genes whose expression is correlated with a biological state of interest. The distance is defined using a nonparametric, rank-based pattern-matching strategy based on the Kolmogorov-Smirnov statistic. Gene expression signatures and their distances can be used to detect similarities among the signatures of drugs, diseases, and biological states of interest.

Maintained by Yang Cao. Last updated 5 months ago.

geneexpression

1.9 match 1 stars 5.00 score 5 scripts

yufeng031

bestridge:A Comprehensive R Package for Best Subset Selection

The bestridge package is designed to provide a one-stop service for users to successfully carry out best ridge regression in various complex situations via the primal dual active set algorithm proposed by Wen, C., Zhang, A., Quan, S. and Wang, X. (2020) <doi:10.18637/jss.v094.i04>. This package allows users to perform regression, classification, count regression and censored regression for (ultra) high dimensional data, and it also supports advanced usages like group variable selection and nuisance variable selection.

Maintained by Liyuan Hu. Last updated 3 years ago.

cpp

4.6 match 2.00 score 6 scripts

bioc

SGCP:SGCP: A semi-supervised pipeline for gene clustering using self-training approach in gene co-expression networks

SGC is a semi-supervised pipeline for gene clustering in gene co-expression networks. SGC consists of multiple novel steps that enable the computation of highly enriched modules in an unsupervised manner. But unlike all existing frameworks, it further incorporates a novel step that leverages Gene Ontology information in a semi-supervised clustering method that further improves the quality of the computed modules.

Maintained by Niloofar AghaieAbiane. Last updated 5 months ago.

geneexpression genesetenrichment networkenrichment systemsbiology classification clustering dimensionreduction graphandnetwork neuralnetwork network mrnamicroarray rnaseq visualization bioinformatics genecoexpressionnetwork graphs networkclustering networks self-training semi-supervised-learning unsupervised-learning

1.8 match 2 stars 5.12 score 44 scripts

oobianom

r2symbols:Symbols for 'Markdown' and 'Shiny' Application

Direct insertion of over 1000 symbols (e.g. currencies, letters, emojis, arrows, mathematical symbols and so on) into 'Rmarkdown' documents and 'Shiny' applications by incorporating 'HTML' hex codes.

Maintained by Obinna Obianom. Last updated 2 years ago.

1.3 match 11 stars 6.67 score 94 scripts 1 dependents

ropensci

magick:Advanced Graphics and Image-Processing in R

Bindings to 'ImageMagick': the most comprehensive open-source image processing library available. Supports many common formats (png, jpeg, tiff, pdf, etc) and manipulations (rotate, scale, crop, trim, flip, blur, etc). All operations are vectorized via the Magick++ STL meaning they operate either on a single frame or a series of frames for working with layers, collages, or animation. In RStudio images are automatically previewed when printed to the console, resulting in an interactive editing environment. The latest version of the package includes a native graphics device for creating in-memory graphics or drawing onto images using pixel coordinates.

Maintained by Jeroen Ooms. Last updated 20 days ago.

image-manipulation image-processing imagemagick cpp

0.5 match 468 stars 17.31 score 9.0k scripts 256 dependents

quanteda

quanteda:Quantitative Analysis of Textual Data

A fast, flexible, and comprehensive framework for quantitative text analysis in R. Provides functionality for corpus management, creating and manipulating tokens and n-grams, exploring keywords in context, forming and manipulating sparse matrices of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and distances, applying content dictionaries, applying supervised and unsupervised machine learning, visually representing text and text analyses, and more.

Maintained by Kenneth Benoit. Last updated 2 months ago.

corpus natural-language-processing quanteda text-analytics onetbb cpp

0.5 match 851 stars 16.68 score 5.4k scripts 51 dependents

wviechtb

metafor:Meta-Analysis Package for R

A comprehensive collection of functions for conducting meta-analyses in R. The package includes functions to calculate various effect sizes or outcome measures, fit equal-, fixed-, random-, and mixed-effects models to such data, carry out moderator and meta-regression analyses, and create various types of meta-analytical plots (e.g., forest, funnel, radial, L'Abbe, Baujat, bubble, and GOSH plots). For meta-analyses of binomial and person-time data, the package also provides functions that implement specialized methods, including the Mantel-Haenszel method, Peto's method, and a variety of suitable generalized linear (mixed-effects) models (i.e., mixed-effects logistic and Poisson regression models). Finally, the package provides functionality for fitting meta-analytic multivariate/multilevel models that account for non-independent sampling errors and/or true effects (e.g., due to the inclusion of multiple treatment studies, multiple endpoints, or other forms of clustering). Network meta-analyses and meta-analyses accounting for known correlation structures (e.g., due to phylogenetic relatedness) can also be conducted. An introduction to the package can be found in Viechtbauer (2010) <doi:10.18637/jss.v036.i03>.

Maintained by Wolfgang Viechtbauer. Last updated 2 days ago.

meta-analysis mixed-effects multilevel-models multivariate

0.5 match 246 stars 16.30 score 4.9k scripts 92 dependents

spatstat

spatstat:Spatial Point Pattern Analysis, Model-Fitting, Simulation, Tests

Comprehensive open-source toolbox for analysing Spatial Point Patterns. Focused mainly on two-dimensional point patterns, including multitype/marked points, in any spatial region. Also supports three-dimensional point patterns, space-time point patterns in any number of dimensions, point patterns on a linear network, and patterns of other geometrical objects. Supports spatial covariate data such as pixel images. Contains over 3000 functions for plotting spatial data, exploratory data analysis, model-fitting, simulation, spatial sampling, model diagnostics, and formal inference. Data types include point patterns, line segment patterns, spatial windows, pixel images, tessellations, and linear networks. Exploratory methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported. Parametric models can be fitted to point pattern data using the functions ppm(), kppm(), slrm(), dppm() similar to glm(). Types of models include Poisson, Gibbs and Cox point processes, Neyman-Scott cluster processes, and determinantal point processes. Models may involve dependence on covariates, inter-point interaction, cluster formation and dependence on marks. Models are fitted by maximum likelihood, logistic regression, minimum contrast, and composite likelihood methods. A model can be fitted to a list of point patterns (replicated point pattern data) using the function mppm(). 
The model can include random effects and fixed effects depending on the experimental design, in addition to all the features listed above. Fitted point process models can be simulated, automatically. Formal hypothesis tests of a fitted model are supported (likelihood ratio test, analysis of deviance, Monte Carlo tests) along with basic tools for model selection (stepwise(), AIC()) and variable selection (sdr). Tools for validating the fitted model include simulation envelopes, residuals, residual plots and Q-Q plots, leverage and influence diagnostics, partial residuals, and added variable plots.

Maintained by Adrian Baddeley. Last updated 2 months ago.

cluster-process cox-point-process gibbs-process kernel-density network-analysis point-process poisson-process spatial-analysis spatial-data spatial-data-analysis spatial-statistics spatstat statistical-methods statistical-models statistical-tests statistics

0.5 match 200 stars 16.32 score 5.5k scripts 41 dependents

bioc

iSEEu:iSEE Universe

iSEEu (the iSEE universe) contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels, or modes allowing easy configuration of iSEE applications.

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

immunooncology visualization gui dimensionreduction featureextraction clustering transcription geneexpression transcriptomics singlecell cellbasedassays hacktoberfest

1.1 match 9 stars 7.15 score 35 scripts 1 dependents

utsavlamichhane

mbX:A Comprehensive Microbiome Data Processing Pipeline

Provides tools for cleaning, processing, and preparing microbiome sequencing data (e.g., 16S rRNA) for downstream analysis. Supports CSV, TXT, and 'Excel' file formats. The main function, ezclean(), automates microbiome data transformation, including format validation, transposition, numeric conversion, and metadata integration. Also ensures efficient handling of taxonomic levels, resolves duplicated taxa entries, and outputs a well-structured, analysis-ready dataset.

Maintained by Utsav Lamichhane. Last updated 13 days ago.

3.0 match 2.70 score

bioc

biomaRt:Interface to BioMart databases (i.e. Ensembl)

In recent years a wealth of biological data has become available in public data repositories. Easy access to these valuable data resources and firm integration with data analysis is needed for comprehensive bioinformatics data analysis. biomaRt provides an interface to a growing collection of databases implementing the BioMart software suite (<http://www.biomart.org>). The package enables retrieval of large amounts of data in a uniform way without the need to know the underlying database schemas or write complex SQL queries. The most prominent examples of BioMart databases are maintained by Ensembl, which provides biomaRt users direct access to a diverse set of data and enables a wide range of powerful online queries from gene annotation to database mining.

Maintained by Mike Smith. Last updated 2 days ago.

annotation bioconductor biomart ensembl

0.5 match 38 stars 15.99 score 13k scripts 230 dependents

andriyprotsak5

UAHDataScienceUC:Learn Clustering Techniques Through Examples and Code

A comprehensive educational package combining clustering algorithms with detailed step-by-step explanations. Provides implementations of both traditional (hierarchical, k-means) and modern (Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Gaussian Mixture Models (GMM), genetic k-means) clustering methods as described in Ezugwu et al. (2022) <doi:10.1016/j.engappai.2022.104743>. Includes educational datasets highlighting different clustering challenges, based on 'scikit-learn' examples (Pedregosa et al., 2011) <https://jmlr.csail.mit.edu/papers/v12/pedregosa11a.html>. Features detailed algorithm explanations, visualizations, and weighted distance calculations for enhanced learning.

Maintained by Andriy Protsak Protsak. Last updated 27 days ago.

3.5 match 2.30 score

gokmenzararsiz

dtComb:Statistical Combination of Diagnostic Tests

A system for combining two diagnostic tests using various approaches that include statistical and machine-learning-based methodologies. These approaches are divided into four groups: linear combination methods, non-linear combination methods, mathematical operators, and machine learning algorithms. See the <https://biotools.erciyes.edu.tr/dtComb/> website for more information, documentation, and examples.

Maintained by Gokmen Zararsiz. Last updated 5 months ago.

1.7 match 4.70 score 7 scripts

ecor

geotopbricks:An R Plug-in for the Distributed Hydrological Model GEOtop

It analyzes raster maps and other information as input/output files from the Hydrological Distributed Model GEOtop. It contains functions and methods to import maps and other keywords from geotop.inpts file. Some examples with simulation cases of GEOtop 2.x/3.x are presented in the package. Any information about the GEOtop Distributed Hydrological Model source code is available on www.geotop.org. Technical details about the model are available in Endrizzi et al (2014) <https://gmd.copernicus.org/articles/7/2831/2014/gmd-7-2831-2014.html>.

Maintained by Emanuele Cordano. Last updated 2 months ago.

1.6 match 4 stars 4.83 score 112 scripts

michael-cw

susographql:Comprehensive Interface to the Survey Solutions 'GraphQL' API

Provides a complete suite of tools for interacting with the Survey Solutions 'GraphQL' API <https://demo.mysurvey.solutions/graphql/>. This package encompasses all currently available queries and mutations, including the latest features for map uploads. It is built on the modern 'httr2' package, offering a streamlined and efficient interface without relying on external 'GraphQL' client packages. In addition to core API functionalities, the package includes a range of helper functions designed to facilitate the use of available query filters.

Maintained by Michael Wild. Last updated 1 year ago.

2.9 match 2.70 score 4 scripts

jacobkap

crimeutils:A Comprehensive Set of Functions to Clean, Analyze, and Present Crime Data

A collection of functions that make it easier to understand crime (or other) data, and assist others in understanding it. The package helps you read data from various sources, clean it, fix column names, and graph the data.

Maintained by Jacob Kaplan. Last updated 2 years ago.

2.8 match 1 stars 2.78 score 12 scripts

chaoliu-cl

conversim:Conversation Similarity Analysis Package

A comprehensive toolkit for analyzing and comparing conversations. This package provides functions to calculate various similarity measures between conversations, including topic, lexical, semantic, structural, stylistic, sentiment, participant, and timing similarities. It supports both pairwise conversation comparisons and analysis of multiple dyads.

Maintained by Chao Liu. Last updated 6 months ago.

1.8 match 4.30 score 10 scripts

coffeemuggler

eseis:Environmental Seismology Toolbox

Environmental seismology is a scientific field that studies the seismic signals emitted by Earth surface processes. This package provides all relevant functions to read/write seismic data files, prepare, analyse and visualise seismic data, and generate reports of the processing history.

Maintained by Michael Dietze. Last updated 4 months ago.

cpp

1.7 match 9 stars 4.42 score 58 scripts

bioc

bugsigdbr:R-side access to published microbial signatures from BugSigDB

The bugsigdbr package implements convenient access to bugsigdb.org from within R/Bioconductor. The goal of the package is to facilitate import of BugSigDB data into R/Bioconductor, provide utilities for extracting microbe signatures, and enable export of the extracted signatures to plain text files in standard file formats such as GMT.

Maintained by Ludwig Geistlinger. Last updated 9 days ago.

dataimport genesetenrichment metagenomics microbiome bioconductor-package

1.2 match 3 stars 6.46 score 48 scripts

bioc

ENmix:Quality control and analysis tools for Illumina DNA methylation BeadChip

Tools for quality control, analysis and visualization of Illumina DNA methylation array data.

Maintained by Zongli Xu. Last updated 2 days ago.

dnamethylation preprocessing qualitycontrol twochannel microarray onechannel methylationarray batcheffect normalization dataimport regression principalcomponent epigenetics multichannel differentialmethylation immunooncology

1.3 match 6.01 score 115 scripts

r-lib

clock:Date-Time Types and Tools

Provides a comprehensive library for date-time manipulations using a new family of orthogonal date-time classes (durations, time points, zoned-times, and calendars) that partition responsibilities so that the complexities of time zones are only considered when they are really needed. Capabilities include: date-time parsing, formatting, arithmetic, extraction and updating of components, and rounding.

Maintained by Davis Vaughan. Last updated 2 days ago.

cpp

0.5 match 106 stars 14.48 score 296 scripts 407 dependents

ssi-dk

aedseo:Automated and Early Detection of Seasonal Epidemic Onset and Burden Levels

A powerful tool for automating the early detection of seasonal epidemic onsets in time series data. It offers the ability to estimate growth rates across consecutive time intervals, calculate the sum of cases (SoC) within those intervals, and estimate seasonal onsets within user defined seasons. With use of a disease-specific threshold it also offers the possibility to estimate seasonal onset of epidemics. Additionally it offers the ability to estimate burden levels for seasons based on historical data. It is aimed towards epidemiologists, public health professionals, and researchers seeking to identify and respond to seasonal epidemics in a timely fashion. For reference on growth rate estimation, see Wallinga and Lipsitch (2007) <doi:10.1098/rspb.2006.3754> and Obadia et al. (2012) <doi:10.1186/1472-6947-12-147>. Seasonal burden level calculations have been inspired by The Moving Epidemic Method (MEM), see Vega and Lozano (2012) <doi:10.1111/j.1750-2659.2012.00422.x>.

Maintained by Lasse Engbo Christiansen. Last updated 4 days ago.

1.3 match 1 stars 5.72 score 8 scripts

my-jiang

vamc:A Monte Carlo Valuation Framework for Variable Annuities

Implementation of a Monte Carlo simulation engine for valuing synthetic portfolios of variable annuities, which reflect realistic features of common annuity contracts in practice. It aims to facilitate the development and dissemination of research related to the efficient valuation of a portfolio of large variable annuities. The main valuation methodology was proposed by Gan (2017) <doi:10.1515/demo-2017-0021>.

Maintained by Mingyi Jiang. Last updated 5 years ago.

2.9 match 1 stars 2.45 score 28 scripts

veronica0206

nlpsem:Linear and Nonlinear Longitudinal Process in Structural Equation Modeling Framework

Provides computational tools for nonlinear longitudinal models, in particular the intrinsically nonlinear models, in four scenarios: (1) univariate longitudinal processes with growth factors, with or without covariates including time-invariant covariates (TICs) and time-varying covariates (TVCs); (2) multivariate longitudinal processes that facilitate the assessment of correlation or causation between multiple longitudinal variables; (3) multiple-group models for scenarios (1) and (2) to evaluate differences among manifested groups, and (4) longitudinal mixture models for scenarios (1) and (2), with an assumption that trajectories are from multiple latent classes. The methods implemented are introduced in Jin Liu (2023) <arXiv:2302.03237v2>.

Maintained by Jin Liu. Last updated 4 months ago.

1.0 match 145 stars 6.91 score 16 scripts

falafel19

AutoPipe:Automated Transcriptome Classifier Pipeline: Comprehensive Transcriptome Analysis

An unsupervised fully-automated pipeline for transcriptome analysis or a supervised option to identify characteristic genes from predefined subclasses. We rely on the 'pamr' <http://www.bioconductor.org/packages//2.7/bioc/html/pamr.html> clustering algorithm to cluster the data and then draw a heatmap of the clusters with the most significant genes and the least significant genes according to the 'pamr' algorithm. This way we get easy-to-grasp heatmaps that show, for each cluster, which are the cluster's most defining genes.

Maintained by Karam Daka. Last updated 6 years ago.

2.9 match 2.48 score

cran

UAHDataScienceUC:Learn Clustering Techniques Through Examples and Code

A comprehensive educational package combining clustering algorithms with detailed step-by-step explanations. Provides implementations of both traditional (hierarchical, k-means) and modern (Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Gaussian Mixture Models (GMM), genetic k-means) clustering methods as described in Ezugwu et al. (2022) <doi:10.1016/j.engappai.2022.104743>. Includes educational datasets highlighting different clustering challenges, based on 'scikit-learn' examples (Pedregosa et al., 2011) <https://jmlr.csail.mit.edu/papers/v12/pedregosa11a.html>. Features detailed algorithm explanations, visualizations, and weighted distance calculations for enhanced learning.

Maintained by Andriy Protsak Protsak. Last updated 27 days ago.

3.5 match 2.00 score

cran

rPDBapi:A Comprehensive Interface for Accessing the Protein Data Bank

Streamlines the interaction with the 'RCSB' Protein Data Bank ('PDB') <https://www.rcsb.org/>. This interface offers an intuitive and powerful tool for searching and retrieving a diverse range of data types from the 'PDB'. It includes advanced functionalities like BLAST and sequence motif queries. Built upon the existing XML-based API of the 'PDB', it simplifies the creation of custom requests, thereby enhancing usability and flexibility for researchers.

Maintained by Selcuk Korkmaz. Last updated 5 months ago.

2.9 match 2.40 score 2 scripts

bioc

CINdex:Chromosome Instability Index

The CINdex package addresses an important area of high-throughput genomic analysis. It allows the automated processing and analysis of the experimental DNA copy number data generated by Affymetrix SNP 6.0 arrays or similar high throughput technologies. It calculates the chromosome instability (CIN) index, which quantitatively characterizes genome-wide DNA copy number alterations as a measure of chromosomal instability. This package calculates not only overall genomic instability, but also instability in terms of copy number gains and losses separately at the chromosome and cytoband level.

Maintained by Yuriy Gusev. Last updated 5 months ago.

software copynumbervariation genomicvariation acgh microarray genetics sequencing

1.7 match 4.08 score 2 scripts

bioc

wavClusteR:Sensitive and highly resolved identification of RNA-protein interaction sites in PAR-CLIP data

The package provides an integrated pipeline for the analysis of PAR-CLIP data. PAR-CLIP-induced transitions are first discriminated from sequencing errors, SNPs and additional non-experimental sources by a non-parametric mixture model. The protein binding sites (clusters) are then resolved at high resolution and cluster statistics are estimated using a rigorous Bayesian framework. Post-processing of the results, data export for UCSC genome browser visualization and motif search analysis are provided. In addition, the package allows integrating RNA-Seq data to estimate the False Discovery Rate of cluster detection. Key functions support parallel multicore computing. Note: while wavClusteR was designed for PAR-CLIP data analysis, it can be applied to the analysis of other NGS data obtained from experimental procedures that induce nucleotide substitutions (e.g. BisSeq).

Maintained by Federico Comoglio. Last updated 5 months ago.

immunooncology sequencing technology ripseq rnaseq bayesian

1.5 match 4.60 score 3 scripts

miraisolutions

XLConnect:Excel Connector for R

Provides comprehensive functionality to read, write and format Excel data.

Maintained by Martin Studer. Last updated 18 days ago.

cross-platform excel r-language xlconnect openjdk

0.6 match 130 stars 12.28 score 1.2k scripts 1 dependents

ocheab

FormulR:Comprehensive Tools for Drug Formulation Analysis and Visualization

This package presents a comprehensive set of tools for the analysis and visualization of drug formulation data. It includes functions for statistical analysis, regression modeling, hypothesis testing, and comparative analysis to assess the impact of formulation parameters on drug release and other critical attributes. Additionally, the package offers a variety of data visualization functions, such as scatterplots, histograms, and boxplots, to facilitate the interpretation of formulation data. With its focus on usability and efficiency, this package aims to streamline the drug formulation process and aid researchers in making informed decisions during formulation design and optimization.

Maintained by Oche Ambrose George. Last updated 12 months ago.

3.4 match 2.00 score

lcef97

SchoolDataIT:Retrieve, Harmonise and Map Open Data Regarding the Italian School System

Compiles and displays the available data sets regarding the Italian school system, with a focus on the infrastructural aspects. Input datasets are downloaded from the web, with the aim of updating everything to real time. The functions are divided into four main modules: 'Get', to scrape raw data from the web; 'Util', various utilities needed to process raw data; 'Group', to aggregate data at the municipality or province level; and 'Map', to visualize the output datasets.

Maintained by Leonardo Cefalo. Last updated 2 months ago.

1.8 match 3.88 score

kosukehamazaki

RAINBOWR:Genome-Wide Association Study with SNP-Set Methods

By using 'RAINBOWR' (Reliable Association INference By Optimizing Weights with R), users can test multiple SNPs (Single Nucleotide Polymorphisms) simultaneously by kernel-based (SNP-set) methods. This package can also be applied to haplotype-based GWAS (Genome-Wide Association Study). Users can test not only additive effects but also dominance and epistatic effects. For details, please see our paper in PLOS Computational Biology: Kosuke Hamazaki and Hiroyoshi Iwata (2020) <doi:10.1371/journal.pcbi.1007663>.

Maintained by Kosuke Hamazaki. Last updated 3 months ago.

cpp

1.1 match 22 stars 5.99 score 22 scripts

bioc

scDblFinder:scDblFinder

The scDblFinder package gathers various methods for the detection and handling of doublets/multiplets in single-cell sequencing data (i.e. multiple cells captured within the same droplet or reaction volume). It includes methods formerly found in the scran package, the new fast and comprehensive scDblFinder method, and a reimplementation of the Amulet detection method for single-cell ATAC-seq.

Maintained by Pierre-Luc Germain. Last updated 2 months ago.

preprocessing singlecell rnaseq atacseq doublets single-cell

0.5 match 184 stars 12.34 score 888 scripts 1 dependents

rsquaredacademy

olsrr:Tools for Building OLS Regression Models

Tools designed to make it easier for users, particularly beginner/intermediate R users to build ordinary least squares regression models. Includes comprehensive regression output, heteroskedasticity tests, collinearity diagnostics, residual diagnostics, measures of influence, model fit assessment and variable selection procedures.

Maintained by Aravind Hebbali. Last updated 4 months ago.

collinearity-diagnostics linear-models regression stepwise-regression

0.5 match 103 stars 12.19 score 1.4k scripts 4 dependents

frenchrh

klovan:Geostatistics Methods and Klovan Data

A comprehensive set of geostatistical, visual, and analytical methods, in conjunction with the expanded version of the acclaimed J.E. Klovan's mining dataset, are included in 'klovan'. This makes the package an excellent learning resource for Principal Component Analysis (PCA), Factor Analysis (FA), kriging, and other geostatistical techniques. Originally published in the 1976 book 'Geological Factor Analysis', the included mining dataset was assembled by Professor J. E. Klovan of the University of Calgary. Being one of the first applications of FA in the geosciences, this dataset has significant historical importance. As a well-regarded and published dataset, it is an excellent resource for demonstrating the capabilities of PCA, FA, kriging, and other geostatistical techniques in geosciences. For those interested in these methods, the 'klovan' datasets provide a valuable and illustrative resource. Note that some methods require the 'RGeostats' package. Please refer to the README or Additional_repositories for installation instructions. This material is based upon research in the Materials Data Science for Stockpile Stewardship Center of Excellence (MDS3-COE), and supported by the Department of Energy's National Nuclear Security Administration under Award Number DE-NA0004104.

Maintained by Roger H French. Last updated 1 year ago.

2.5 match 2.48 score

ropensci

rotl:Interface to the 'Open Tree of Life' API

An interface to the 'Open Tree of Life' API to retrieve phylogenetic trees, information about studies used to assemble the synthetic tree, and utilities to match taxonomic names to 'Open Tree identifiers'. The 'Open Tree of Life' aims at assembling a comprehensive phylogenetic tree for all named species.

Maintained by Francois Michonneau. Last updated 2 years ago.

metadata ropensci phylogenetics independant-contrasts biodiversity peer-reviewed phylogeny taxonomy

0.5 match 40 stars 12.05 score 356 scripts 29 dependents

bioc

iSEEhex:iSEE extension for summarising data points in hexagonal bins

This package provides panels summarising data points in hexagonal bins for `iSEE`. It is part of `iSEEu`, the iSEE universe of panels that extend the `iSEE` package.

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

software infrastructure bioconductor iseeu shiny-r

1.1 match 5.38 score 7 scripts 2 dependents

joemsong

DiffXTables:Pattern Analysis Across Contingency Tables

Statistical hypothesis testing of pattern heterogeneity via differences in underlying distributions across multiple contingency tables. Five tests are included: the comparative chi-squared test (Song et al. 2014) <doi:10.1093/nar/gku086> (Zhang et al. 2015) <doi:10.1093/nar/gkv358>, the Sharma-Song test (Sharma et al. 2021) <doi:10.1093/bioinformatics/btab240>, the heterogeneity test, the marginal-change test (Sharma et al. 2020) <doi:10.1145/3388440.3412485>, and the strength test (Sharma et al. 2020) <doi:10.1145/3388440.3412485>. Under the null hypothesis that row and column variables are statistically independent and joint distributions are equal, their test statistics all follow an asymptotically chi-squared distribution. A comprehensive type analysis categorizes the relation among the contingency tables into type null, 0, 1, and 2 (Sharma et al. 2020) <doi:10.1145/3388440.3412485>. They can identify heterogeneous patterns that differ in either the first order (marginal) or the second order (differential departure from independence). Second-order differences reveal more fundamental changes than first-order differences across heterogeneous patterns.

Maintained by Joe Song. Last updated 4 years ago.

2.2 match 2.70 score 2 scripts

guido-s

netmeta:Network Meta-Analysis using Frequentist Methods

A comprehensive set of functions providing frequentist methods for network meta-analysis (Balduzzi et al., 2023) <doi:10.18637/jss.v106.i02> and supporting Schwarzer et al. (2015) <doi:10.1007/978-3-319-21416-0>, Chapter 8 "Network Meta-Analysis": - frequentist network meta-analysis following Rücker (2012) <doi:10.1002/jrsm.1058>; - additive network meta-analysis for combinations of treatments (Rücker et al., 2020) <doi:10.1002/bimj.201800167>; - network meta-analysis of binary data using the Mantel-Haenszel or non-central hypergeometric distribution method (Efthimiou et al., 2019) <doi:10.1002/sim.8158>, or penalised logistic regression (Evrenoglou et al., 2022) <doi:10.1002/sim.9562>; - rankograms and ranking of treatments by the Surface under the cumulative ranking curve (SUCRA) (Salanti et al., 2013) <doi:10.1016/j.jclinepi.2010.03.016>; - ranking of treatments using P-scores (frequentist analogue of SUCRAs without resampling) according to Rücker & Schwarzer (2015) <doi:10.1186/s12874-015-0060-8>; - split direct and indirect evidence to check consistency (Dias et al., 2010) <doi:10.1002/sim.3767>, (Efthimiou et al., 2019) <doi:10.1002/sim.8158>; - league table with network meta-analysis results; - 'comparison-adjusted' funnel plot (Chaimani & Salanti, 2012) <doi:10.1002/jrsm.57>; - net heat plot and design-based decomposition of Cochran's Q according to Krahn et al. (2013) <doi:10.1186/1471-2288-13-35>; - measures characterizing the flow of evidence between two treatments by König et al. (2013) <doi:10.1002/sim.6001>; - automated drawing of network graphs described in Rücker & Schwarzer (2016) <doi:10.1002/jrsm.1143>; - partial order of treatment rankings ('poset') and Hasse diagram for 'poset' (Carlsen & Bruggemann, 2014) <doi:10.1002/cem.2569>; (Rücker & Schwarzer, 2017) <doi:10.1002/jrsm.1270>; - contribution matrix as described in Papakonstantinou et al. (2018) <doi:10.12688/f1000research.14770.3> and Davies et al. (2022) <doi:10.1002/sim.9346>; - subgroup network meta-analysis.

Maintained by Guido Schwarzer. Last updated 2 days ago.

meta-analysis network-meta-analysis rstudio

0.5 match 33 stars 11.82 score 199 scripts 10 dependents

bioc

synergyfinder:Calculate and Visualize Synergy Scores for Drug Combinations

Efficient implementations for analyzing pre-clinical multiple drug combination datasets. It provides efficient implementations for 1. the popular synergy scoring models, including HSA, Loewe, Bliss, and ZIP to quantify the degree of drug combination synergy; 2. higher order drug combination data analysis and synergy landscape visualization for an unlimited number of drugs in a combination; 3. statistical analysis of drug combination synergy and sensitivity with confidence intervals and p-values; 4. synergy barometer for harmonizing multiple synergy scoring methods to provide a consensus metric of synergy; 5. evaluation of synergy and sensitivity simultaneously to provide an unbiased interpretation of the clinical potential of the drug combinations. Based on this package, we also provide a web application (http://www.synergyfinder.org) for users who prefer a graphical user interface.

Maintained by Shuyu Zheng. Last updated 5 months ago.

software statisticalmethod

1.1 match 5.42 score 44 scripts

srivastavbudugutta

tvtools:Comprehensive Tools for Panel Data Analysis - 'tvtools'

Longitudinal data offers insights into population changes over time but often requires a flexible structure, especially with varying follow-up intervals. Panel data is one way to store such records, though it adds complexity to analysis. The 'tvtools' package for R simplifies exploring and analyzing panel data.

Maintained by Srivastav Budugutta. Last updated 5 months ago.

2.9 match 2.00 score 4 scripts

green-striped-gecko

PopGenReport:A Simple Framework to Analyse Population and Landscape Genetic Data

Provides a beginner-friendly framework to analyse population genetic data. Based on 'adegenet' objects, it uses 'knitr' to create comprehensive reports on spatial genetic data. For detailed information on how to use the package, refer to the comprehensive tutorials or visit <http://www.popgenreport.org/>.

Maintained by Bernd Gruber. Last updated 1 year ago.

0.8 match 5 stars 7.27 score 82 scripts 1 dependents

sssydysss

TransProR:Analysis and Visualization of Multi-Omics Data

A tool for comprehensive transcriptomic data analysis, with a focus on transcript-level data preprocessing, expression profiling, differential expression analysis, and functional enrichment. It enables researchers to identify key biological processes, disease biomarkers, and gene regulatory mechanisms. 'TransProR' is aimed at researchers and bioinformaticians working with RNA-Seq data, providing an intuitive framework for in-depth analysis and visualization of transcriptomic datasets. The package includes comprehensive documentation and usage examples to guide users through the entire analysis pipeline. The differential expression analysis methods incorporated in the package include 'limma' (Ritchie et al., 2015, <doi:10.1093/nar/gkv007>; Smyth, 2005, <doi:10.1007/0-387-29362-0_23>), 'edgeR' (Robinson et al., 2010, <doi:10.1093/bioinformatics/btp616>), 'DESeq2' (Love et al., 2014, <doi:10.1186/s13059-014-0550-8>), and Wilcoxon tests (Li et al., 2022, <doi:10.1186/s13059-022-02648-4>), providing flexible and robust approaches to RNA-Seq data analysis. For more information, refer to the package vignettes and related publications.

Maintained by Dongyue Yu. Last updated 20 days ago.

0.8 match 174 stars 7.55 score 34 scripts

cran

care4cmodel:Carbon-Related Assessment of Silvicultural Concepts

A simulation model and accompanying functions that support assessing silvicultural concepts on the forest estate level with a focus on the CO2 uptake by wood growth and CO2 emissions by forest operations. For achieving this, a virtual forest estate area is split into the areas covered by typical phases of the silvicultural concept of interest. Given initial area shares of these phases, the dynamics of these areas are simulated. The typical carbon stocks and flows which are known for all phases are attributed post-hoc to the areas and upscaled to the estate level. CO2 emissions by forest operations are estimated based on the amounts and dimensions of the harvested timber. Probabilities of damage events are taken into account.

Maintained by Peter Biber. Last updated 4 months ago.

1.8 match 3.18 score 3 scripts

lbaole17

superdiag:A Comprehensive Test Suite for Testing Markov Chain Nonconvergence

The 'superdiag' package provides a comprehensive test suite for testing Markov Chain nonconvergence. It integrates five standard empirical MCMC convergence diagnostics (Gelman-Rubin, Geweke, Heidelberger-Welch, Raftery-Lewis, and Hellinger distance) and plotting functions for trace plots and density histograms. The functions of the package can be used to present all diagnostic statistics and graphs at once for conveniently checking MCMC nonconvergence.

Maintained by Le Bao. Last updated 4 years ago.

3.4 match 1.68 score 16 scripts 1 dependents

bioc

Maaslin2:"Multivariable Association Discovery in Population-scale Meta-omics Studies"

MaAsLin2 is a comprehensive R package for efficiently determining multivariable association between clinical metadata and microbial meta'omic features. MaAsLin2 relies on general linear models to accommodate most modern epidemiological study designs, including cross-sectional and longitudinal, and offers a variety of data exploration, normalization, and transformation methods. MaAsLin2 is the next generation of MaAsLin.

Maintained by Lauren McIver. Last updated 5 months ago.

metagenomics software microbiome normalization biobakery bioconductor differential-abundance-analysis false-discovery-rate multiple-covariates public repeated-measures tools

0.5 match 133 stars 11.03 score 532 scripts 3 dependents

r-lib

tzdb:Time Zone Database Information

Provides an up-to-date copy of the Internet Assigned Numbers Authority (IANA) Time Zone Database. It is updated periodically to reflect changes made by political bodies to time zone boundaries, UTC offsets, and daylight saving time rules. Additionally, this package provides a C++ interface for working with the 'date' library. 'date' provides comprehensive support for working with dates and date-times, which this package exposes to make it easier for other R packages to utilize. Headers are provided for calendar specific calculations, along with a limited interface for time zone manipulations.

Maintained by Davis Vaughan. Last updated 2 days ago.

cpp

0.5 match 7 stars 10.90 score 38 scripts 2.4k dependents

azure

AzureGraph:Simple Interface to 'Microsoft Graph'

A simple interface to the 'Microsoft Graph' API <https://learn.microsoft.com/en-us/graph/overview>. 'Graph' is a comprehensive framework for accessing data in various online Microsoft services. This package was originally intended to provide an R interface only to the 'Azure Active Directory' part, with a view to supporting interoperability of R and 'Azure': users, groups, registered apps and service principals. However, it has since been expanded into a more general tool for interacting with Graph. Part of the 'AzureR' family of packages.

Maintained by Hong Ooi. Last updated 2 years ago.

azure-active-directory-graph-api azure-sdk-r microsoft-graph-api

0.5 match 32 stars 10.30 score 36 scripts 21 dependents

tarnduong

ks:Kernel Smoothing

Kernel smoothers for univariate and multivariate data, with comprehensive visualisation and bandwidth selection capabilities, including for densities, density derivatives, cumulative distributions, clustering, classification, density ridges, significant modal regions, and two-sample hypothesis tests. Chacon & Duong (2018) <doi:10.1201/9780429485572>.

Maintained by Tarn Duong. Last updated 6 months ago.

0.5 match 6 stars 10.14 score 920 scripts 262 dependents

fang-zhaoyuan

GANPAdata:The GANPA Datasets Package

This is a dataset package for GANPA, which implements a network-based gene weighting approach to pathway analysis. This package includes data useful for GANPA, such as a functional association network, pathways, an expression dataset and multi-subunit proteins.

Maintained by Zhaoyuan Fang. Last updated 14 years ago.

3.5 match 1.48 score 7 scripts 1 dependents

nanxstats

protr:Generating Various Numerical Representation Schemes for Protein Sequences

Comprehensive toolkit for generating various numerical features of protein sequences described in Xiao et al. (2015) <DOI:10.1093/bioinformatics/btv042>. For full functionality, the software 'ncbi-blast+' is needed, see <https://blast.ncbi.nlm.nih.gov/doc/blast-help/downloadblastdata.html> for more information.

Maintained by Nan Xiao. Last updated 6 months ago.

bioinformatics feature-engineering feature-extraction machine-learning peptides protein-sequences sequence-analysis

0.5 match 52 stars 10.02 score 173 scripts 3 dependents

egeulgen

pathfindR:Enrichment Analysis Utilizing Active Subnetworks

Enrichment analysis enables researchers to uncover mechanisms underlying a phenotype. However, conventional methods for enrichment analysis do not take into account protein-protein interaction information, resulting in incomplete conclusions. 'pathfindR' is a tool for enrichment analysis utilizing active subnetworks. The main function identifies active subnetworks in a protein-protein interaction network using a user-provided list of genes and associated p values. It then performs enrichment analyses on the identified subnetworks, identifying enriched terms (i.e. pathways or, more broadly, gene sets) that possibly underlie the phenotype of interest. 'pathfindR' also offers functionalities to cluster the enriched terms and identify representative terms in each cluster, to score the enriched terms per sample and to visualize analysis results. The enrichment, clustering and other methods implemented in 'pathfindR' are described in detail in Ulgen E, Ozisik O, Sezerman OU. 2019. 'pathfindR': An R Package for Comprehensive Identification of Enriched Pathways in Omics Data Through Active Subnetworks. Front. Genet. <doi:10.3389/fgene.2019.00858>.

Maintained by Ege Ulgen. Last updated 28 days ago.

active-subnetworks enrichment pathway pathway-enrichment-analysis subnetwork

0.5 match 186 stars 10.13 score 138 scripts

azure

AzureRMR:Interface to 'Azure Resource Manager'

A lightweight but powerful R interface to the 'Azure Resource Manager' REST API. The package exposes a comprehensive class framework and related tools for creating, updating and deleting 'Azure' resource groups, resources and templates. While 'AzureRMR' can be used to manage any 'Azure' service, it can also be extended by other packages to provide extra functionality for specific services. Part of the 'AzureR' family of packages.

Maintained by Hong Ooi. Last updated 1 year ago.

azure azure-resource-manager azure-sdk-r cloud

0.5 match 20 stars 9.94 score 51 scripts 12 dependents

immunomind

immunarch:Bioinformatics Analysis of T-Cell and B-Cell Immune Repertoires

A comprehensive framework for bioinformatics exploratory analysis of bulk and single-cell T-cell receptor and antibody repertoires. It provides seamless data loading, analysis and visualisation for AIRR (Adaptive Immune Receptor Repertoire) data, both bulk immunosequencing (RepSeq) and single-cell sequencing (scRNAseq). Immunarch implements most of the widely used AIRR analysis methods, such as: clonality analysis, estimation of repertoire similarities in distribution of clonotypes and gene segments, repertoire diversity analysis, annotation of clonotypes using external immune receptor databases and clonotype tracking in vaccination and cancer studies. A successor to our previously published 'tcR' immunoinformatics package (Nazarov 2015) <doi:10.1186/s12859-015-0613-1>.

Maintained by Vadim I. Nazarov. Last updated 12 months ago.

airr-analysis b-cell-receptor bcr bcr-repertoire bioinformatics ig ig-repertoire immune-repertoire immune-repertoire-analysis immune-repertoire-data immunoglobulin immunoinformatics immunology rep-seq repertoire-analysis single-cell single-cell-analysis t-cell-receptor tcr tcr-repertoire cpp

0.5 match 315 stars 9.49 score 203 scripts

bioc

RBGL:An interface to the BOOST graph library

A fairly extensive and comprehensive interface to the graph algorithms contained in the BOOST library.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

graphandnetwork network cpp

0.6 match 8.59 score 320 scripts 132 dependents

bupaverse

bupaR:Business Process Analysis in R

Comprehensive Business Process Analysis toolkit. Creates an S3 class for event log objects, and related handler functions. Imports related packages for filtering event data, computation of descriptive statistics, handling of 'Petri Net' objects and visualization of process maps. See also packages 'edeaR', 'processmapR', 'eventdataR' and 'processmonitR'.

Maintained by Gert Janssenswillen. Last updated 2 years ago.

0.5 match 55 stars 9.07 score 389 scripts 11 dependents

bluefoxr

COINr:Composite Indicator Construction and Analysis

A comprehensive high-level package for composite indicator construction and analysis. It is a "development environment" for composite indicators and scoreboards, which includes utilities for construction (indicator selection, denomination, imputation, data treatment, normalisation, weighting and aggregation) and analysis (multivariate analysis, correlation plotting, short cuts for principal component analysis, global sensitivity analysis, and more). A composite indicator is completely encapsulated inside a single hierarchical list called a "coin". This allows a fast and efficient workflow, as well as making quick copies, testing methodological variations and making comparisons. It also includes many plotting options, both statistical (scatter plots, distribution plots) and for presenting results.

Maintained by William Becker. Last updated 2 months ago.

0.5 match 26 stars 9.07 score 73 scripts 1 dependents

wjakethompson

taylor:Lyrics and Song Data for Taylor Swift's Discography

A comprehensive resource for data on Taylor Swift songs. Data is included for all officially released studio albums, extended plays (EPs), and individual singles. Data comes from 'Genius' (lyrics) and 'Spotify' (song characteristics). Additional functions are included for easily creating data visualizations with color palettes inspired by Taylor Swift's album covers.

Maintained by W. Jake Thompson. Last updated 1 months ago.

color-palettes data genius-lyrics ggplot2-themes lyrics spotify spotify-api taylor-swift

0.5 match 45 stars 8.79 score 105 scripts

magnusdv

pedtools:Creating and Working with Pedigrees and Marker Data

A comprehensive collection of tools for creating, manipulating and visualising pedigrees and genetic marker data. Pedigrees can be read from text files or created on the fly with built-in functions. A range of utilities enable modifications like adding or removing individuals, breaking loops, and merging pedigrees. An online tool for creating pedigrees interactively, based on 'pedtools', is available at <https://magnusdv.shinyapps.io/quickped>. 'pedtools' is the hub of the 'pedsuite', a collection of packages for pedigree analysis. A detailed presentation of the 'pedsuite' is given in the book 'Pedigree Analysis in R' (Vigeland, 2021, ISBN:9780128244302).

Maintained by Magnus Dehli Vigeland. Last updated 2 months ago.

0.5 match 25 stars 8.83 score 60 scripts 18 dependents

eddelbuettel

RQuantLib:R Interface to the 'QuantLib' Library

The 'RQuantLib' package makes parts of 'QuantLib' accessible from R. The 'QuantLib' project aims to provide a comprehensive software framework for quantitative finance. The goal is to provide a standard open source library for quantitative analysis, modeling, trading, and risk management of financial assets.

Maintained by Dirk Eddelbuettel. Last updated 2 months ago.

quantlib cpp

0.5 match 123 stars 8.52 score 194 scripts

jamesliley

SPARRAfairness:Analysis of Differential Behaviour of SPARRA Score Across Demographic Groups

The SPARRA risk score (Scottish Patients At Risk of admission and Re-Admission) estimates yearly risk of emergency hospital admission using electronic health records on a monthly basis for most of the Scottish population. This package implements a suite of functions used to analyse the behaviour and performance of the score, focusing particularly on differential performance over demographically-defined groups. It includes useful utility functions to plot receiver-operator-characteristic, precision-recall and calibration curves, draw stock human figures, estimate counterfactual quantities without the need to re-compute risk scores, and simulate a semi-realistic dataset.

Maintained by James Liley. Last updated 4 months ago.

1.6 match 2.70 score 4 scripts

bioc

SPIAT:Spatial Image Analysis of Tissues

SPIAT (Spatial Image Analysis of Tissues) is an R package with a suite of data processing, quality control, visualization and data analysis tools. SPIAT is compatible with data generated from single-cell spatial proteomics platforms (e.g. OPAL, CODEX, MIBI, cellprofiler). SPIAT reads spatial data in the form of X and Y coordinates of cells, marker intensities and cell phenotypes. SPIAT includes six analysis modules that allow visualization, calculation of cell colocalization, categorization of the immune microenvironment relative to tumor areas, analysis of cellular neighborhoods, and the quantification of spatial heterogeneity, providing a comprehensive toolkit for spatial data analysis.

Maintained by Yuzhou Feng. Last updated 16 hours ago.

biomedicalinformatics cellbiology spatial clustering dataimport immunooncology qualitycontrol singlecell software visualization

0.5 match 22 stars 8.59 score 69 scripts

axlehner

SpatialRDD:Conduct Multiple Types of Geographic Regression Discontinuity Designs

Spatial versions of Regression Discontinuity Designs (RDDs) are becoming increasingly popular as tools for causal inference. However, conducting state-of-the-art analyses often involves tedious and time-consuming steps. This package offers comprehensive functionalities for executing all required spatial and econometric tasks in a streamlined manner. Moreover, it equips researchers with tools for performing essential placebo and balancing checks comprehensively. The fact that researchers do not have to rely on 'APIs' of external 'GIS' software ensures replicability and raises the standard for spatial RDDs.

Maintained by Alexander Lehner. Last updated 12 months ago.

0.8 match 37 stars 5.57 score 8 scripts

bioc

UniProt.ws:R Interface to UniProt Web Services

The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data. This package provides a collection of functions for retrieving, processing, and re-packaging UniProt web services. The package makes use of UniProt's modernized REST API and allows mapping of identifiers across different databases.

Maintained by Marcel Ramos. Last updated 2 months ago.

annotation infrastructure go kegg biocarta bioconductor-package core-package

0.5 match 4 stars 8.38 score 167 scripts 4 dependents

asa12138

MetaNet:Network Analysis for Omics Data

Comprehensive network analysis package. Computes correlation networks quickly and accelerates many analyses through parallel computing. Supports multi-omics data and efficient sub-network search. Handles large data, with more than 10,000 nodes in each omics layer. Offers various layout methods for multi-omics networks and interfaces to other software ('Gephi', 'Cytoscape', 'ggplot2') for easy visualization. Provides comprehensive calculation of topology indexes, including ecological network stability.

Maintained by Chen Peng. Last updated 11 days ago.

dataimport network analysis omics software visualization

0.8 match 13 stars 5.51 score 9 scripts

insightsengineering

chevron:Standard TLGs for Clinical Trials Reporting

Provides standard tables, listings, and graphs (TLGs) libraries used in clinical trials. This package implements a structure to reformat the data with 'dunlin', and to create reporting tables using 'rtables' and 'tern' with standardized input arguments to enable quick generation of standard outputs. It also provides comprehensive data checks and script generation functionality.

Maintained by Joe Zhu. Last updated 24 days ago.

clinical-trials graphs listings nest reporting tables

0.5 match 12 stars 8.24 score 12 scripts

ax3man

phylopath:Perform Phylogenetic Path Analysis

A comprehensive and easy to use R implementation of confirmatory phylogenetic path analysis as described by Von Hardenberg and Gonzalez-Voyer (2012) <doi:10.1111/j.1558-5646.2012.01790.x>.

Maintained by Wouter van der Bijl. Last updated 6 months ago.

analysis comparative-methods path phylogenetics

0.5 match 13 stars 8.10 score 81 scripts 1 dependents

rrrlw

TDAstats:Pipeline for Topological Data Analysis

A comprehensive toolset for any useR conducting topological data analysis, specifically via the calculation of persistent homology in a Vietoris-Rips complex. The tools this package currently provides can be conveniently split into three main sections: (1) calculating persistent homology; (2) conducting statistical inference on persistent homology calculations; (3) visualizing persistent homology and statistical inference. The published form of TDAstats can be found in Wadhwa et al. (2018) <doi:10.21105/joss.00860>. For a general background on computing persistent homology for topological data analysis, see Otter et al. (2017) <doi:10.1140/epjds/s13688-017-0109-5>. To learn more about how the permutation test is used for nonparametric statistical inference in topological data analysis, read Robinson & Turner (2017) <doi:10.1007/s41468-017-0008-7>. To learn more about how TDAstats calculates persistent homology, you can visit the GitHub repository for Ripser, the software that works behind the scenes at <https://github.com/Ripser/ripser>.

Maintained by Raoul Wadhwa. Last updated 3 years ago.

data-science ggplot2 homology homology-calculations homology-computation joss persistent-homology pipeline ripser tda topological-data-analysis topology topology-visualization visualization cpp

0.5 match 40 stars 8.30 score 46 scripts 4 dependents

bioc

POMA:Tools for Omics Data Analysis

The POMA package offers a comprehensive toolkit designed for omics data analysis, streamlining the process from initial visualization to final statistical analysis. Its primary goal is to simplify and unify the various steps involved in omics data processing, making it more accessible and manageable within a single, intuitive R package. Emphasizing reproducibility and user-friendliness, POMA leverages the standardized SummarizedExperiment class from Bioconductor, ensuring seamless integration and compatibility with a wide array of Bioconductor tools. This approach guarantees maximum flexibility and replicability, making POMA an essential asset for researchers handling omics datasets. See https://github.com/pcastellanoescuder/POMAShiny. Paper: Castellano-Escuder et al. (2021) <doi:10.1371/journal.pcbi.1009148> for more details.

Maintained by Pol Castellano-Escuder. Last updated 4 months ago.

batcheffect classification clustering decisiontree dimensionreduction multidimensionalscaling normalization preprocessing principalcomponent regression rnaseq software statisticalmethod visualization bioconductor bioinformatics data-visualization dimension-reduction exploratory-data-analysis machine-learning omics-data-integration pipeline pre-processing statistical-analysis user-friendly workflow

0.5 match 11 stars 8.23 score 20 scripts 1 dependents

bioc

countsimQC:Compare Characteristic Features of Count Data Sets

countsimQC provides functionality to create a comprehensive report comparing a broad range of characteristics across a collection of count matrices. One important use case is the comparison of one or more synthetic count matrices to a real count matrix, possibly the one underlying the simulations. However, any collection of count matrices can be compared.

Maintained by Charlotte Soneson. Last updated 3 months ago.

microbiome rnaseq singlecell experimentaldesign qualitycontrol reportwriting visualization immunooncology

0.5 match 27 stars 7.69 score 24 scripts

mlampros

nmslibR:Non Metric Space (Approximate) Library

A Non-Metric Space Library ('NMSLIB' <https://github.com/nmslib/nmslib>) wrapper, which according to the authors "is an efficient cross-platform similarity search library and a toolkit for evaluation of similarity search methods. The goal of the 'NMSLIB' <https://github.com/nmslib/nmslib> Library is to create an effective and comprehensive toolkit for searching in generic non-metric spaces. Being comprehensive is important, because no single method is likely to be sufficient in all cases. Also note that exact solutions are hardly efficient in high dimensions and/or non-metric spaces. Hence, the main focus is on approximate methods". The wrapper also includes Approximate Kernel k-Nearest-Neighbor functions based on the 'NMSLIB' <https://github.com/nmslib/nmslib> 'Python' Library.

Maintained by Lampros Mouselimis. Last updated 2 years ago.

approximate-nearest-neighbor-search nmslib non-metric python reticulate cpp openmp

0.8 match 12 stars 5.14 score 23 scripts

cran

rocbc:Statistical Inference for Box-Cox Based Receiver Operating Characteristic Curves

Generation of Box-Cox based ROC curves and several aspects of inference and hypothesis testing. Can be used when inferences for one biomarker (Bantis LE, Nakas CT, Reiser B. (2018) <doi:10.1002/bimj.201700107>) are of interest or when comparisons of two correlated biomarkers (Bantis LE, Nakas CT, Reiser B. (2021) <doi:10.1002/bimj.202000128>) are of interest. Provides inferences and comparisons around the AUC, the Youden index, the sensitivity at a given specificity level (and vice versa), the optimal operating point of the ROC curve (in the Youden sense), and the Youden based cutoff.

Maintained by Benjamin Brewer. Last updated 11 months ago.

1.7 match 2.30 score

haghish

shapley:Weighted Mean SHAP and CI for Robust Feature Selection in ML Grid

This R package introduces Weighted Mean SHapley Additive exPlanations (WMSHAP), an innovative method for calculating SHAP values for a grid of fine-tuned base-learner machine learning models as well as stacked ensembles, a method not previously available due to the common reliance on single best-performing models. By integrating the weighted mean SHAP values from individual base-learners comprising the ensemble or individual base-learners in a tuning grid search, the package weights SHAP contributions according to each model's performance, assessed by R squared (for both regression and classification models). Alternatively, this software also offers weighting SHAP values based on the area under the precision-recall curve (AUCPR), the area under the curve (AUC), and F2 measures for binary classifiers. It further extends this framework to implement weighted confidence intervals for weighted mean SHAP values, offering a more comprehensive and robust feature importance evaluation over a grid of machine learning models, instead of solely computing SHAP values for the best model. This methodology is particularly beneficial for addressing the severe class imbalance (class rarity) problem by providing a transparent, generalized measure of feature importance that mitigates the risk of reporting SHAP values for an overfitted or biased model and maintains robustness under severe class imbalance, where there is no universal criterion for identifying the absolute best model. Furthermore, the package implements hypothesis testing to ascertain the statistical significance of SHAP values for individual features, as well as comparative significance testing of SHAP contributions between features.
Additionally, it tackles a critical gap in feature selection literature by presenting criteria for the automatic feature selection of the most important features across a grid of models or stacked ensembles, eliminating the need for arbitrary determination of the number of top features to be extracted. This utility is invaluable for researchers analyzing feature significance, particularly within severely imbalanced outcomes where conventional methods fall short. Moreover, it is also expected to report democratic feature importance across a grid of models, resulting in a more comprehensive and generalizable feature selection. The package further implements a novel method for visualizing SHAP values both at subject level and feature level as well as a plot for feature selection based on the weighted mean SHAP ratios.

Maintained by E. F. Haghish. Last updated 3 days ago.

class-imbalance class-imbalance-problem feature-extraction feature-importance feature-selection machine-learning machine-learning-algorithms shap shap-analysis shap-values shapely shapley-additive-explanations shapley-decomposition shapley-value shapley-values shapleyvalue weighted-shap weighted-shap-confidence-interval weighted-shapley weighted-shapley-ci

0.8 match 14 stars 5.19 score 17 scripts

ropengov

retroharmonize:Ex Post Survey Data Harmonization

Assists in reproducible retrospective (ex-post) harmonization of data, particularly individual level survey data, by providing tools for organizing metadata, standardizing the coding of variables, variable names and value labels (including missing values), and documenting the data transformations, with the help of comprehensive S3 classes.

Maintained by Daniel Antal. Last updated 2 months ago.

ropengov

0.5 match 10 stars 7.62 score 59 scripts

mccarthy-m-g

palettes:Methods for Colour Vectors and Colour Palettes

Provides a comprehensive library for colour vectors and colour palettes using a new family of colour classes (palettes_colour and palettes_palette) that always print as hex codes with colour previews. Capabilities include: formatting, casting and coercion, extraction and updating of components, plotting, colour mixing arithmetic, and colour interpolation.

Maintained by Michael McCarthy. Last updated 6 months ago.

color-palette colors colour-palette colours ggplot2 gt palettes vctrs

0.5 match 25 stars 7.58 score 42 scripts 1 dependents

robindenz1

simDAG:Simulate Data from a DAG and Associated Node Information

Simulate complex data from a given directed acyclic graph and information about each individual node. Root nodes are simply sampled from the specified distribution. Child nodes are simulated according to one of many implemented regressions, such as logistic regression, linear regression, Poisson regression and more. Also includes a comprehensive framework for discrete-time simulation, which can generate even more complex longitudinal data.

Maintained by Robin Denz. Last updated 21 days ago.

causal-inference directed-acyclic-graph simulation

0.5 match 10 stars 7.55 score 77 scripts

epiverse-trace

cleanepi:Clean and Standardize Epidemiological Data

A package for cleaning and standardizing tabular data, tailored specifically for curating epidemiological data. It streamlines various data cleaning tasks that are typically expected when working with datasets in epidemiology. It returns the processed data in the same format, and generates a comprehensive report detailing the outcomes of each cleaning task.

Maintained by Karim Mané. Last updated 3 days ago.

data-cleaning epidemiology epiverse

0.5 match 9 stars 7.44 score 19 scripts

dgerbing

lessR:Less Code, More Results

Each function replaces multiple standard R functions. For example, two function calls, Read() and CountAll(), generate summary statistics for all variables in the data frame, plus histograms and bar charts as appropriate. Other functions provide for summary statistics via pivot tables, a comprehensive regression analysis, ANOVA and t-test, visualizations including the Violin/Box/Scatter plot for a numerical variable, bar chart, histogram, box plot, density curves, calibrated power curve, reading multiple data formats with the same function call, variable labels, time series with aggregation and forecasting, color themes, and Trellis (facet) graphics. Also includes a confirmatory factor analysis of multiple indicator measurement models, pedagogical routines for data simulation such as for the Central Limit Theorem, generation and rendering of regression instructions for interpretative output, and interactive visualizations.

Maintained by David W. Gerbing. Last updated 1 month ago.

0.5 match 6 stars 7.47 score 394 scripts 3 dependents

jo-karl

ccpsyc:Methods for Cross-Cultural Psychology

With the development of new cross-cultural methods, this package is intended to combine multiple functions that automate and simplify analyses, providing a unified analysis approach for commonly employed methods.

Maintained by Johannes Karl. Last updated 2 years ago.

1.9 match 1 stars 2.00 score 1 scripts

eltebioinformatics

mulea:Enrichment Analysis Using Multiple Ontologies and False Discovery Rate

Background - Traditional gene set enrichment analyses are typically limited to a few ontologies and do not account for the interdependence of gene sets or terms, resulting in overcorrected p-values. To address these challenges, we introduce mulea, an R package offering comprehensive overrepresentation and functional enrichment analysis. Results - mulea employs a progressive empirical false discovery rate (eFDR) method, specifically designed for interconnected biological data, to accurately identify significant terms within diverse ontologies. mulea expands beyond traditional tools by incorporating a wide range of ontologies, encompassing Gene Ontology, pathways, regulatory elements, genomic locations, and protein domains. This flexibility enables researchers to tailor enrichment analysis to their specific questions, such as identifying enriched transcriptional regulators in gene expression data or overrepresented protein domains in protein sets. To facilitate seamless analysis, mulea provides gene sets (in standardised GMT format) for 27 model organisms, covering 22 ontology types from 16 databases and various identifiers resulting in almost 900 files. Additionally, the muleaData ExperimentData Bioconductor package simplifies access to these pre-defined ontologies. Finally, mulea's architecture allows for easy integration of user-defined ontologies, or GMT files from external sources (e.g., MSigDB or Enrichr), expanding its applicability across diverse research areas. Conclusions - mulea is distributed as a CRAN R package. It offers researchers a powerful and flexible toolkit for functional enrichment analysis, addressing limitations of traditional tools with its progressive eFDR and by supporting a variety of ontologies. Overall, mulea fosters the exploration of diverse biological questions across various model organisms.

Maintained by Tamas Stirling. Last updated 3 months ago.

annotation differentialexpression geneexpression genesetenrichment go graphandnetwork multiplecomparison pathways reactome software transcription visualization enrichment enrichment-analysis functional-enrichment-analysis gene-set-enrichment ontologies transcriptomics cpp

0.5 match 28 stars 7.36 score 34 scripts

rsquaredacademy

blorr:Tools for Developing Binary Logistic Regression Models

Tools designed to make it easier for beginner and intermediate users to build and validate binary logistic regression models. Includes bivariate analysis, comprehensive regression output, model fit statistics, variable selection procedures, model validation techniques and a 'shiny' app for interactive model building.

Maintained by Aravind Hebbali. Last updated 4 months ago.

logistic-regression-models regression cpp

0.5 match 17 stars 7.13 score 144 scripts 1 dependents

neurodata

lolR:Linear Optimal Low-Rank Projection

Supervised learning techniques designed for the situation when the dimensionality exceeds the sample size have a tendency to overfit as the dimensionality of the data increases. To remedy this high-dimensionality, low-sample-size (HDLSS) situation, we attempt to learn a lower-dimensional representation of the data before learning a classifier. That is, we project the data to a situation where the dimensionality is more manageable, and then are able to better apply standard classification or clustering techniques since we will have fewer dimensions to overfit. A number of previous works have focused on how to strategically reduce dimensionality in the unsupervised case, yet in the supervised HDLSS regime, few works have attempted to devise dimensionality reduction techniques that leverage the labels associated with the data. In this package and the associated manuscript Vogelstein et al. (2017) <arXiv:1709.01233>, we provide several methods for feature extraction, some utilizing labels and some not, along with easily extensible utilities to simplify cross-validative efforts to identify the best feature extraction method. Additionally, we include a series of adaptable benchmark simulations to serve as a standard for future investigative efforts into supervised HDLSS. Finally, we produce a comprehensive comparison of the included algorithms across a range of benchmark simulations and real data applications.

Maintained by Eric Bridgeford. Last updated 4 years ago.

0.5 match 20 stars 7.28 score 80 scripts

cran

hudr:Providing Data from the US Department of Housing and Urban Development

Provides functions to access data from the US Department of Housing and Urban Development <https://www.huduser.gov/portal/dataset/fmr-api.html>.

Maintained by Paul Richardson. Last updated 2 years ago.

3.2 match 1.15 score 14 scripts

pauljohn32

rockchalk:Regression Estimation and Presentation

A collection of functions for interpretation and presentation of regression analysis. These functions are used to produce the statistics lectures in <https://pj.freefaculty.org/guides/>. Includes regression diagnostics, regression tables, and plots of interactions and "moderator" variables. The emphasis is on "mean-centered" and "residual-centered" predictors. The vignette 'rockchalk' offers a fairly comprehensive overview. The vignette 'Rstyle' has advice about coding in R. The package title 'rockchalk' refers to our school motto, 'Rock Chalk Jayhawk, Go K.U.'.

Maintained by Paul E. Johnson. Last updated 3 years ago.

0.5 match 7.13 score 584 scripts 18 dependents

apariciojohan

flexFitR:Flexible Non-Linear Least Square Model Fitting

Provides tools for flexible non-linear least squares model fitting using general-purpose optimization techniques. The package supports a variety of optimization algorithms, including those provided by the 'optimx' package, making it suitable for handling complex non-linear models. Features include parallel processing support via the 'future' and 'foreach' packages, comprehensive model diagnostics, and visualization capabilities. Implements methods described in Nash and Varadhan (2011, <doi:10.18637/jss.v043.i09>).

Maintained by Johan Aparicio. Last updated 9 days ago.

nls optimization

0.5 match 2 stars 7.09 score 77 scripts

bioc

TADCompare:TADCompare: Identification and characterization of differential TADs

TADCompare is an R package designed to identify and characterize differential Topologically Associated Domains (TADs) between multiple Hi-C contact matrices. It contains functions for finding differential TADs between two datasets, finding differential TADs over time and identifying consensus TADs across multiple matrices. It takes all of the main types of HiC input and returns simple, comprehensive, easy to analyze results.

Maintained by Mikhail Dozmorov. Last updated 5 months ago.

software hic sequencing featureextraction clustering

0.5 match 23 stars 7.04 score 10 scripts

mpascariu

MortalityLaws:Parametric Mortality Models, Life Tables and HMD

Fit the most popular human mortality 'laws', and construct full and abridged life tables given various input indices. A mortality law is a parametric function that describes the dying-out process of individuals in a population during a significant portion of their life spans. For a comprehensive review of the most important mortality laws see Tabeau (2001) <doi:10.1007/0-306-47562-6_1>. Practical functions for downloading data from various human mortality databases are provided as well.

Maintained by Marius D. Pascariu. Last updated 1 year ago.

actuarial-science demography download-hmd human-mortality-laws life-table mortality

0.5 match 32 stars 7.00 score 103 scripts 1 dependents

cran

gss:General Smoothing Splines

A comprehensive package for structural multivariate function estimation using smoothing splines.

Maintained by Chong Gu. Last updated 5 months ago.

fortran openblas

0.6 match 3 stars 6.40 score 137 dependents

statistikat

surveysd:Survey Standard Error Estimation for Cumulated Estimates and their Differences in Complex Panel Designs

Calculate point estimates and their standard errors in complex household surveys using bootstrap replicates. Bootstrapping considers survey design with a rotating panel. A comprehensive description of the methodology can be found under <https://statistikat.github.io/surveysd/articles/methodology.html>.

Maintained by Johannes Gussenbauer. Last updated 3 months ago.

bootstrap error-estimation survey cpp

0.5 match 9 stars 6.86 score 67 scripts
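
The idea of estimating a standard error from bootstrap replicates, mentioned in the entry above, can be illustrated with a deliberately simplified sketch. It resamples individual observations with replacement and ignores survey weights, strata, and the rotating panel structure that a design-aware bootstrap has to respect; the function name `bootstrap_se` and its parameters are hypothetical and not part of any package.

```python
import random
from statistics import mean, stdev


def bootstrap_se(values, estimator=mean, n_boot=500, seed=0):
    """Naive bootstrap standard error of estimator(values).

    Simplification: plain resampling with replacement, no survey design.
    """
    rng = random.Random(seed)
    n = len(values)
    replicates = []
    for _ in range(n_boot):
        # Draw a replicate sample of the same size, with replacement.
        sample = [values[rng.randrange(n)] for _ in range(n)]
        replicates.append(estimator(sample))
    # The spread of the replicate estimates approximates the standard error.
    return stdev(replicates)
```

For example, `bootstrap_se([2.1, 3.4, 1.9, 5.0, 4.2])` returns the standard deviation of 500 resampled means; a design-based version would resample sampling units within strata instead of single observations.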

ikosmidis

cranly:Package Directives and Collaboration Networks in CRAN

Core visualizations and summaries for the CRAN package database. The package provides comprehensive methods for cleaning up and organizing the information in the CRAN package database, for building package directives networks (depends, imports, suggests, enhances, linking to) and collaboration networks, producing package dependence trees, and for computing useful summaries and producing interactive visualizations from the resulting networks and summaries. The resulting networks can be coerced to 'igraph' <https://CRAN.R-project.org/package=igraph> objects for further analyses and modelling.

Maintained by Ioannis Kosmidis. Last updated 3 years ago.

network-analysis network-visualization

0.5 match 49 stars 6.85 score 32 scripts 1 dependents

asa12138

ReporterScore:Generalized Reporter Score-Based Enrichment Analysis for Omics Data

Inspired by the classic 'RSA', we developed the improved 'Generalized Reporter Score-based Analysis (GRSA)' method, implemented in the R package 'ReporterScore', along with comprehensive visualization methods and pathway databases. 'GRSA' is a threshold-free method that works well with all types of biomedical features, such as genes, chemical compounds, and microbial species. Importantly, the 'GRSA' supports multi-group and longitudinal experimental designs, because of the included multi-group-compatible statistical methods.

Maintained by Chen Peng. Last updated 2 months ago.

0.5 match 67 stars 6.79 score 13 scripts

leilamarvian

WeatherSentiment:Comprehensive Analysis of Tweet Sentiments and Weather Data

A comprehensive suite of functions for processing, analyzing, and visualizing textual data from tweets is offered. Users can clean tweets, analyze their sentiments, visualize data, and examine the correlation between sentiments and environmental data such as weather conditions. Main features include text processing, sentiment analysis, data visualization, correlation analysis, and synthetic data generation. Text processing involves cleaning and preparing tweets by removing textual noise and irrelevant words. Sentiment analysis extracts and accurately analyzes sentiments from tweet texts using advanced algorithms. Data visualization creates various charts like word clouds and sentiment polarity graphs for visual representation of data. Correlation analysis examines and calculates the correlation between tweet sentiments and environmental variables such as weather conditions. Additionally, random tweets can be generated for testing and evaluating the performance of analyses, empowering users to effectively analyze and interpret 'Twitter' data for research and commercial purposes.

Maintained by Leila Marvian Mashhad. Last updated 7 months ago.

3.4 match 1.00 score

jokergoo

pkgndep:Analyze Dependency Heaviness of R Packages

A new metric named 'dependency heaviness' is proposed that measures the number of additional dependency packages that a parent package brings to its child package and are unique to the dependency packages imported by all other parents. The dependency heaviness analysis is visualized by a customized heatmap. The package is described in <doi:10.1093/bioinformatics/btac449>. We have also performed the dependency heaviness analysis on the CRAN/Bioconductor package ecosystem and the results are implemented as a web-based database which provides comprehensive tools for querying dependencies of individual R packages. The systematic analysis on the CRAN/Bioconductor ecosystem is described in <doi:10.1016/j.jss.2023.111610>. From 'pkgndep' version 2.0.0, the heaviness database includes snapshots of the CRAN/Bioconductor ecosystems for many old R versions.

Maintained by Zuguang Gu. Last updated 14 days ago.

0.5 match 47 stars 6.75 score 30 scripts

jhk0530

gemini.R:Interface for 'Google Gemini' API

Provides a comprehensive interface for Google Gemini API, enabling users to access and utilize Gemini Large Language Model (LLM) functionalities directly from R. This package facilitates seamless integration with Google Gemini, allowing for advanced language processing, text generation, and other AI-driven capabilities within the R environment. For more information, please visit <https://ai.google.dev/docs/gemini_api_overview>.

Maintained by Jinhwan Kim. Last updated 5 days ago.

0.5 match 68 stars 6.66 score 37 scripts 1 dependents

nepem-ufsc

pliman:Tools for Plant Image Analysis

Tools for both single and batch image manipulation and analysis (Olivoto, 2022 <doi:10.1111/2041-210X.13803>) and phytopathometry (Olivoto et al., 2022 <doi:10.1007/S40858-021-00487-5>). The tools can be used for the quantification of leaf area, object counting, extraction of image indexes, shape measurement, object landmark identification, and Elliptical Fourier Analysis of object outlines (Claude (2008) <doi:10.1007/978-0-387-77789-4>). The package also provides a comprehensive pipeline for generating shapefiles with complex layouts and supports high-throughput phenotyping of RGB, multispectral, and hyperspectral orthomosaics. This functionality facilitates field phenotyping using UAV- or satellite-based imagery.

Maintained by Tiago Olivoto. Last updated 2 days ago.

openblas fftw3 cpp openmp

0.5 match 10 stars 6.68 score 476 scripts

kylegrealis

froggeR:Enhance 'Quarto' Project Workflows and Standards

Streamlines 'Quarto' workflows by providing tools for consistent project setup and documentation. Enables portability through reusable metadata, automated project structure creation, and standardized templates. Features include enhanced project initialization, pre-formatted 'Quarto' documents, comprehensive data protection settings, custom styling, and structured documentation generation. Designed to improve efficiency and collaboration in R data science projects by reducing repetitive setup tasks while maintaining consistent formatting across multiple documents. There are many valuable resources providing in-depth explanations of customizing 'Quarto' templates and theme styling by the Posit team: <https://quarto.org/docs/output-formats/html-themes.html#customizing-themes> & <https://quarto.org/docs/output-formats/html-themes-more.html>, and at the Bootstrap community's GitHub at <https://github.com/twbs/bootstrap/blob/main/scss/_variables.scss>.

Maintained by Kyle Grealis. Last updated 5 hours ago.

data-science project-management quarto

0.5 match 26 stars 6.67 score 6 scripts

cefet-rj-dal

daltoolbox:Leveraging Experiment Lines to Data Analytics

The natural increase in the complexity of current research experiments and data demands better tools to enhance productivity in Data Analytics. The package is a framework designed to address the modern challenges in data analytics workflows. The package is inspired by Experiment Line concepts. It aims to provide seamless support for users in developing their data mining workflows by offering a uniform data model and method API. It enables the integration of various data mining activities, including data preprocessing, classification, regression, clustering, and time series prediction. It also offers options for hyper-parameter tuning and supports integration with existing libraries and languages. Overall, the package provides researchers with a comprehensive set of functionalities for data science, promoting ease of use, extensibility, and integration with various tools and libraries. Information on Experiment Line is based on Ogasawara et al. (2009) <doi:10.1007/978-3-642-02279-1_20>.

Maintained by Eduardo Ogasawara. Last updated 1 month ago.

0.5 match 1 stars 6.65 score 536 scripts 4 dependents

johnschwenck

bp:Blood Pressure Analysis in R

A comprehensive package to aid in the analysis of blood pressure data of all forms by providing both descriptive and visualization tools for researchers.

Maintained by John Schwenck. Last updated 3 years ago.

0.5 match 26 stars 6.23 score 13 scripts

bioc

GSEABenchmarkeR:Reproducible GSEA Benchmarking

The GSEABenchmarkeR package implements an extendable framework for reproducible evaluation of set- and network-based methods for enrichment analysis of gene expression data. This includes support for the efficient execution of these methods on comprehensive real data compendia (microarray and RNA-seq) using parallel computation on standard workstations and institutional computer grids. Methods can then be assessed with respect to runtime, statistical significance, and relevance of the results for the phenotypes investigated.

Maintained by Ludwig Geistlinger. Last updated 5 months ago.

immunooncology microarray rnaseq geneexpression differentialexpression pathways graphandnetwork network genesetenrichment networkenrichment visualization reportwriting bioconductor-package u24ca289073

0.5 match 13 stars 6.55 score 23 scripts

loukiaspin

rnmamod:Bayesian Network Meta-Analysis with Missing Participants

A comprehensive suite of functions to perform and visualise pairwise and network meta-analysis with aggregate binary or continuous missing participant outcome data. The package covers core Bayesian one-stage models implemented in a systematic review with multiple interventions, including fixed-effect and random-effects network meta-analysis, meta-regression, evaluation of the consistency assumption via the node-splitting approach and the unrelated mean effects model (original and revised model proposed by Spineli, (2022) <doi:10.1177/0272989X211068005>), and sensitivity analysis (see Spineli et al., (2021) <doi:10.1186/s12916-021-02195-y>). Missing participant outcome data are addressed in all models of the package (see Spineli, (2019) <doi:10.1186/s12874-019-0731-y>, Spineli et al., (2019) <doi:10.1002/sim.8207>, Spineli, (2019) <doi:10.1016/j.jclinepi.2018.09.002>, and Spineli et al., (2021) <doi:10.1002/jrsm.1478>). The robustness of the primary analysis results can also be investigated using a novel intuitive index (see Spineli et al., (2021) <doi:10.1177/0962280220983544>). Methods to evaluate the transitivity assumption quantitatively are provided (see Spineli, (2024) <doi:10.1186/s12874-024-02436-7>). A novel index to facilitate interpretation of local inconsistency is also available (see Spineli, (2024) <doi:10.1186/s13643-024-02680-4>). The package also offers a rich, user-friendly visualisation toolkit that aids in appraising and interpreting the results thoroughly and preparing the manuscript for journal submission. The visualisation tools comprise the network plot, forest plots, panel of diagnostic plots, heatmaps on the extent of missing participant outcome data in the network, league heatmaps on estimation and prediction, rankograms, Bland-Altman plot, leverage plot, deviance scatterplot, heatmap of robustness, barplot of Kullback-Leibler divergence, heatmap of comparison dissimilarities and dendrogram of comparison clustering. The package also allows the user to export the results to an Excel file in the working directory.

Maintained by Loukia Spineli. Last updated 10 days ago.

jags cpp

0.5 match 5 stars 6.64 score 12 scripts

tvpham

iq:Protein Quantification in Mass Spectrometry-Based Proteomics

An implementation of the MaxLFQ algorithm by Cox et al. (2014) <doi:10.1074/mcp.M113.031591> in a comprehensive pipeline for processing proteomics data in data-independent acquisition mode (Pham et al. 2020 <doi:10.1093/bioinformatics/btz961>). It offers additional options for protein quantification using the N most intense fragment ions, using all fragment ions, and a wrapper for the median polish algorithm by Tukey (1977, ISBN:0201076160). In general, the tool can be used to integrate multiple proportional observations into a single quantitative value.

Maintained by Thang Pham. Last updated 15 days ago.

cpp openmp

0.5 match 27 stars 6.49 score 25 scripts
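
The median polish algorithm referenced in the entry above fits an additive model (overall + row effect + column effect + residual) to a two-way table by alternately sweeping out row and column medians. The sketch below is a generic plain-Python version, not the package's implementation; the function name `median_polish` and its parameters are hypothetical. Reading rows as fragment ions and columns as samples on a log-intensity scale is an assumption made here only for illustration.

```python
from statistics import median


def median_polish(table, max_iter=10, tol=1e-9):
    """Fit table[i][j] = overall + row[i] + col[j] + resid[i][j] by median sweeps."""
    n_rows, n_cols = len(table), len(table[0])
    resid = [[float(x) for x in r] for r in table]
    overall = 0.0
    row = [0.0] * n_rows
    col = [0.0] * n_cols
    for _ in range(max_iter):
        # Sweep the median of each row out of the residuals into the row effects.
        row_delta = [median(r) for r in resid]
        for i in range(n_rows):
            row[i] += row_delta[i]
            for j in range(n_cols):
                resid[i][j] -= row_delta[i]
        # Re-centre the column effects, moving their median into the overall term.
        m = median(col)
        overall += m
        col = [c - m for c in col]
        # Sweep the median of each column out of the residuals into the column effects.
        col_delta = [median(resid[i][j] for i in range(n_rows)) for j in range(n_cols)]
        for j in range(n_cols):
            col[j] += col_delta[j]
            for i in range(n_rows):
                resid[i][j] -= col_delta[j]
        # Re-centre the row effects in the same way.
        m = median(row)
        overall += m
        row = [r - m for r in row]
        # Stop once a full pass no longer changes anything.
        if max(abs(d) for d in row_delta + col_delta) < tol:
            break
    return overall, row, col, resid
```

Under the reading assumed above, `overall + col[j]` would serve as the single summary value for sample `j`, while the medians keep individual outlying cells from dominating the fit.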

riatelab

maplegend:Legends for Maps

Create legends for maps and other graphics. Thematic maps need to be accompanied by legible legends to be fully comprehensible. This package offers a wide range of legends useful for cartography, some of which may also be useful for other types of graphics.

Maintained by Timothée Giraud. Last updated 5 months ago.

legend maps

0.5 match 13 stars 6.33 score 5 scripts 14 dependents

terrytangyuan

scaffolder:Scaffolding Interfaces to Packages in Other Programming Languages

Comprehensive set of tools for scaffolding R interfaces to modules, classes, functions, and documentations written in other programming languages, such as 'Python'.

Maintained by Yuan Tang. Last updated 2 years ago.

code-generation python reticulate scaffolding

0.5 match 27 stars 6.13 score 9 scripts

monty-se

PINstimation:Estimation of the Probability of Informed Trading

A comprehensive bundle of utilities for the estimation of probability of informed trading models: original PIN in Easley and O'Hara (1992) and Easley et al. (1996); Multilayer PIN (MPIN) in Ersan (2016); Adjusted PIN (AdjPIN) in Duarte and Young (2009); and volume-synchronized PIN (VPIN) in Easley et al. (2011, 2012). Implementations of various estimation methods suggested in the literature are included. Additional features comprise posterior probabilities, an implementation of an expectation-maximization (EM) algorithm, and PIN decomposition into layers and into bad/good components. Versatile data simulation tools and trade classification algorithms are among the supplementary utilities. The package provides fast, compact, and precise utilities to tackle the sophisticated, error-prone, and time-consuming estimation procedure of informed trading, using only the raw trade-level data.

Maintained by Montasser Ghachem. Last updated 5 months ago.

clustering-analysis expectation-maximisation-algorithm hierarchical-clustering information-asymmetry market-microstructure maximum-likelihood-estimation mixture-distributions poisson-distribution

0.5 match 36 stars 6.48 score 14 scripts

adwolfer

santaR:Short Asynchronous Time-Series Analysis

A graphical and automated pipeline for the analysis of short time-series in R ('santaR'). This approach is designed to accommodate asynchronous time sampling (i.e. different time points for different individuals), inter-individual variability, noisy measurements and large numbers of variables. Based on a smoothing splines functional model, 'santaR' is able to detect variables highlighting significantly different temporal trajectories between study groups. Designed initially for metabolic phenotyping, 'santaR' is also suited for other Systems Biology disciplines. Command line and graphical analysis (via a 'shiny' application) enable fast and parallel automated analysis and reporting, intuitive visualisation and comprehensive plotting options for non-specialist users.

Maintained by Arnaud Wolfer. Last updated 1 year ago.

pcamethods (>= 1.92.0)

0.5 match 11 stars 6.44 score 63 scripts

smartdata-analysis-and-statistics

SimTOST:Sample Size Estimation for Bio-Equivalence Trials Through Simulation

Sample size estimation for bio-equivalence trials is supported through a simulation-based approach that extends the Two One-Sided Tests (TOST) procedure. The methodology provides flexibility in hypothesis testing, accommodates multiple treatment comparisons, and accounts for correlated endpoints. Users can model complex trial scenarios, including parallel and crossover designs, intra-subject variability, and different equivalence margins. Monte Carlo simulations enable accurate estimation of power and type I error rates, ensuring well-calibrated study designs. The statistical framework builds on established methods for equivalence testing and multiple hypothesis testing in bio-equivalence studies, as described in Schuirmann (1987) <doi:10.1007/BF01068419>, Mielke et al. (2018) <doi:10.1080/19466315.2017.1371071>, Shieh (2022) <doi:10.1371/journal.pone.0269128>, and Sozu et al. (2015) <doi:10.1007/978-3-319-22005-5>. Comprehensive documentation and vignettes guide users through implementation and interpretation of results.

Maintained by Thomas Debray. Last updated 26 days ago.

mcmc multi-arm multiple-comparisons sample-size-calculation sample-size-estimation trial-simulation openblas cpp

0.5 match 2 stars 6.47 score 7 scripts

stc04003

reReg:Recurrent Event Regression

A comprehensive collection of practical and easy-to-use tools for regression analysis of recurrent events, with or without the presence of a (possibly) informative terminal event described in Chiou et al. (2023) <doi:10.18637/jss.v105.i05>. The modeling framework is based on a joint frailty scale-change model, that includes models described in Wang et al. (2001) <doi:10.1198/016214501753209031>, Huang and Wang (2004) <doi:10.1198/016214504000001033>, Xu et al. (2017) <doi:10.1080/01621459.2016.1173557>, and Xu et al. (2019) <doi:10.5705/SS.202018.0224> as special cases. The implemented estimating procedure does not require any parametric assumption on the frailty distribution. The package also allows the users to specify different model forms for both the recurrent event process and the terminal event.

Maintained by Sy Han (Steven) Chiou. Last updated 2 months ago.

openblas cpp

0.5 match 23 stars 6.35 score 36 scripts 1 dependents

theomargel

ProtE:Processing Proteomics Data, Statistical Analysis and Visualization

The 'Proteomics Eye' ('ProtE') offers a comprehensive and intuitive framework for the univariate analysis of label-free proteomics data. By integrating essential data wrangling and processing steps into a single function, 'ProtE' streamlines pairwise statistical comparisons for categorical variables. It provides quality checks and generates publication-ready visualizations, enabling efficient and robust data analysis. 'ProtE' is compatible with proteomics data outputs from 'MaxQuant' (Cox & Mann, (2008) <doi:10.1038/nbt.1511>), 'DIA-NN' (Demichev et al., (2020) <doi:10.1038/s41592-019-0638-x>), and 'Proteome Discoverer' (Thermo Fisher Scientific, version 2.5). The package leverages 'ggplot2' for visualization (Wickham, (2016) <doi:10.1007/978-3-319-24277-4>) and 'limma' for statistical analysis (Ritchie et al., (2015) <doi:10.1093/nar/gkv007>).

Maintained by Theodoros Margelos. Last updated 10 days ago.

0.5 match 6.33 score 2 scripts

bioc

rhinotypeR:Rhinovirus genotyping

"rhinotypeR" is designed to automate the comparison of sequence data against prototype strains, streamlining the genotype assignment process. By implementing predefined pairwise distance thresholds, this package makes genotype assignment accessible to researchers and public health professionals. This tool enhances our epidemiological toolkit by enabling more efficient surveillance and analysis of rhinoviruses (RVs) and other viral pathogens with complex genomic landscapes. Additionally, "rhinotypeR" supports comprehensive visualization and analysis of single nucleotide polymorphisms (SNPs) and amino acid substitutions, facilitating in-depth genetic and evolutionary studies.

Maintained by Martha Luka. Last updated 5 months ago.

sequencing genetics phylogenetics

0.5 match 4 stars 6.28 score 2 scripts

pgomba

MDPIexploreR:Web Scraping and Bibliometric Analysis of MDPI Journals

Provides comprehensive tools to scrape and analyze data from the MDPI journals. It allows users to extract metrics such as submission-to-acceptance times, article types, and whether articles are part of special issues. The package can also visualize this information through plots. Additionally, 'MDPIexploreR' offers tools to explore patterns of self-citations within articles and provides insights into guest-edited special issues.

Maintained by Pablo Gómez Barreiro. Last updated 4 months ago.

analysis data-analysis data-visualization mdpi metrics scientific-journals visualization web-scraping

0.5 match 20 stars 6.20 score 9 scripts

skstavroglou

patterncausality:Pattern Causality Algorithm

A comprehensive package for detecting and analyzing causal relationships in complex systems using pattern-based approaches. Key features include state space reconstruction, pattern identification, and causality strength evaluation.

Maintained by Hui Wang. Last updated 29 days ago.

0.5 match 1 stars 6.08 score 20 scripts