R-universe search: cites

moosa-r

rbioapi:User-Friendly R Interface to Biologic Web Services' API

Currently fully supports Enrichr, JASPAR, miEAA, PANTHER, Reactome, STRING, and UniProt! The goal of rbioapi is to provide a user-friendly and consistent interface to biological databases and services, in a way that insulates the user from the technicalities of web service APIs and creates a unified and easy-to-use interface to biological and medical web services. This is an ongoing project; new databases and services will be added periodically. Feel free to suggest any databases or services you often use.

Maintained by Moosa Rezwani. Last updated 1 month ago.

api-client bioinformatics biology enrichment enrichment-analysis enrichr jaspar mieaa over-representation-analysis panther reactome string uniprot

20.8 match 20 stars 7.60 score 55 scripts

niaid

dsb:Normalize & Denoise Droplet Single Cell Protein Data (CITE-Seq)

This lightweight R package provides a method for normalizing and denoising protein expression data from droplet based single cell experiments. Raw protein Unique Molecular Index (UMI) counts from sequencing DNA-conjugated antibody derived tags (ADT) in droplets (e.g. 'CITE-seq') have substantial measurement noise. Our experiments and computational modeling revealed two major components of this noise: 1) protein-specific noise originating from ambient, unbound antibody encapsulated in droplets that can be accurately inferred via the expected protein counts detected in empty droplets, and 2) droplet/cell-specific noise revealed via the shared variance component associated with isotype antibody controls and background protein counts in each cell. This package normalizes and removes both of these sources of noise from raw protein data derived from methods such as 'CITE-seq', 'REAP-seq', 'ASAP-seq', 'TEA-seq', 'proteogenomic' data from the Mission Bio platform, etc. See the vignette for tutorials on how to integrate dsb with 'Seurat' and 'Bioconductor' and how to use dsb in 'Python'. Please see our paper Mulè M.P., Martins A.J., and Tsang J.S. Nature Communications 2022 <https://www.nature.com/articles/s41467-022-29356-8> for more details on the method.

Maintained by Matthew Mulè. Last updated 9 months ago.

cite-seq niaid-tsang-lab

20.3 match 65 stars 7.73 score 104 scripts
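The first noise component described for dsb above (ambient antibody, estimated from empty droplets) can be illustrated with a small sketch. This is not the dsb API and not necessarily the package's exact procedure. It assumes that each protein is log-transformed with a pseudocount and then rescaled by the mean and standard deviation of the same protein in empty droplets. The function name `ambient_correct` and the default `pseudocount=10` are hypothetical choices made for illustration. The second, per-cell component (isotype controls and background counts) is not shown.

```python
import math
from statistics import mean, stdev


def ambient_correct(cell_counts, empty_counts, pseudocount=10):
    """Rescale one protein's counts against its empty-droplet background.

    cell_counts:  raw UMI counts for this protein in cell-containing droplets
    empty_counts: raw UMI counts for the same protein in empty droplets
                  (needs at least two distinct values so the spread is non-zero)
    pseudocount:  assumed offset added before the log transform
    """
    log_cells = [math.log(c + pseudocount) for c in cell_counts]
    log_empty = [math.log(c + pseudocount) for c in empty_counts]
    background_mean = mean(log_empty)
    background_sd = stdev(log_empty)
    # Express every cell as "standard deviations above ambient background".
    return [(x - background_mean) / background_sd for x in log_cells]
```

Under these assumptions a value near zero means the protein is indistinguishable from ambient background in that droplet, and larger positive values indicate signal above it.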

eguidotti

calculus:High Dimensional Numerical and Symbolic Calculus

Efficient C++ optimized functions for numerical and symbolic calculus as described in Guidotti (2022) <doi:10.18637/jss.v104.i05>. It includes basic arithmetic, tensor calculus, Einstein summing convention, fast computation of the Levi-Civita symbol and generalized Kronecker delta, Taylor series expansion, multivariate Hermite polynomials, high-order derivatives, ordinary differential equations, differential operators (Gradient, Jacobian, Hessian, Divergence, Curl, Laplacian) and numerical integration in arbitrary orthogonal coordinate systems: cartesian, polar, spherical, cylindrical, parabolic or user defined by custom scale factors.

Maintained by Emanuele Guidotti. Last updated 2 years ago.

calculus coordinate-systems curl divergence einstein finite-difference gradient hermite hessian jacobian laplacian numerical-derivation numerical-derivatives numerical-differentiation symbolic-computation symbolic-differentiation taylor cpp

14.0 match 47 stars 8.92 score 66 scripts 7 dependents
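As a small illustration of one item in the calculus description above, the Levi-Civita symbol can be computed from the parity of its indices. The sketch below is plain Python and is independent of the package; the function name `levi_civita` is chosen here for illustration and says nothing about how the package implements it.

```python
from itertools import combinations


def levi_civita(*indices):
    """Return +1, -1, or 0 for the Levi-Civita symbol with the given indices.

    The indices are assumed to be drawn from 1..n (or 0..n-1); this sketch
    only checks that they are distinct, not that they form a full permutation.
    """
    if len(set(indices)) != len(indices):
        return 0  # any repeated index makes the symbol vanish
    # Count inversions: pairs of positions whose values are out of order.
    inversions = sum(1 for a, b in combinations(indices, 2) if a > b)
    # Even number of inversions means an even permutation (+1), odd means -1.
    return -1 if inversions % 2 else 1
```

For example, `levi_civita(1, 2, 3)` is an even permutation, `levi_civita(2, 1, 3)` is a single swap and therefore odd, and `levi_civita(1, 1, 2)` has a repeated index.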

easystats

report:Automated Reporting of Results and Statistical Models

The aim of the 'report' package is to bridge the gap between R’s output and the formatted results contained in your manuscript. This package converts statistical models and data frames into textual reports suited for publication, ensuring standardization and quality in results reporting.

Maintained by Rémi Thériault. Last updated 1 month ago.

anovas apa automated-report-generation automatic bayesian describe easystats hacktoberfest manuscript models report reporting reports scientific statsmodels

8.5 match 698 stars 14.48 score 1.1k scripts 3 dependents

igraph

igraph:Network Analysis and Visualization

Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.

Maintained by Kirill Müller. Last updated 12 hours ago.

complex-networks graph-algorithms graph-theory mathematics network-analysis network-graph fortran libxml2 glpk openblas cpp

5.4 match 582 stars 21.11 score 31k scripts 1.9k dependents

ropengov

regions:Processing Regional Statistics

Validating sub-national statistical typologies, re-coding across standard typologies of sub-national statistics, and making valid aggregate level imputation, re-aggregation, re-weighting and projection down to lower hierarchical levels to create meaningful data panels and time series.

Maintained by Daniel Antal. Last updated 2 years ago.

observatory regions ropengov statistics

12.8 match 12 stars 8.81 score 67 scripts 5 dependents

bioc

scviR:experimental interface from R to scvi-tools

This package defines interfaces from R to scvi-tools. A vignette works through the totalVI tutorial for analyzing CITE-seq data. Another vignette compares outputs of Chapter 12 of the OSCA book with analogous outputs based on totalVI quantifications. Future work will address other components of scvi-tools, with a focus on building understanding of probabilistic methods based on variational autoencoders.

Maintained by Vincent Carey. Last updated 5 months ago.

infrastructure singlecell dataimport bioconductor cite-seq scverse

20.0 match 6 stars 5.60 score 11 scripts

ropensci

rcites:R Interface to the Species+ Database

A programmatic interface to the Species+ <https://speciesplus.net/> database via the Species+/CITES Checklist API <https://api.speciesplus.net/>.

Maintained by Kevin Cazelles. Last updated 2 years ago.

api-client cites database endangered-species trade

14.8 match 14 stars 6.52 score 26 scripts

doi-usgs

dataRetrieval:Retrieval Functions for USGS and EPA Hydrology and Water Quality Data

Collection of functions to help retrieve U.S. Geological Survey and U.S. Environmental Protection Agency water quality and hydrology data from web services. Data are discovered from National Water Information System <https://waterservices.usgs.gov/> and <https://waterdata.usgs.gov/nwis>. Water quality data are obtained from the Water Quality Portal <https://www.waterqualitydata.us/>.

Maintained by Laura DeCicco. Last updated 18 days ago.

usgs

5.3 match 280 stars 14.18 score 1.7k scripts 15 dependents

cboettig

knitcitations:Citations for 'Knitr' Markdown Files

Provides the ability to create dynamic citations in which the bibliographic information is pulled from the web rather than having to be entered into a local database such as 'bibtex' ahead of time. The package is primarily aimed at authoring in the R 'markdown' format, and can provide outputs for web-based authoring such as linked text for inline citations. Cite using a 'DOI', URL, or 'bibtex' file key. See the package URL for details.

Maintained by Carl Boettiger. Last updated 4 years ago.

7.0 match 220 stars 10.21 score 836 scripts 2 dependents

statnet

statnet.common:Common R Scripts and Utilities Used by the Statnet Project Software

Non-statistical utilities used by the software developed by the Statnet Project. They may also be of use to others.

Maintained by Pavel N. Krivitsky. Last updated 27 days ago.

6.0 match 8 stars 11.42 score 197 scripts 148 dependents

ropensci

RefManageR:Straightforward 'BibTeX' and 'BibLaTeX' Bibliography Management

Provides tools for importing and working with bibliographic references. It greatly enhances the 'bibentry' class by providing a class 'BibEntry' which stores 'BibTeX' and 'BibLaTeX' references, supports 'UTF-8' encoding, and can be easily searched by any field, by date ranges, and by various formats for name lists (author by last names, translator by full names, etc.). Entries can be updated, combined, sorted, printed in a number of styles, and exported. 'BibTeX' and 'BibLaTeX' '.bib' files can be read into 'R' and converted to 'BibEntry' objects. Interfaces to 'NCBI Entrez', 'CrossRef', and 'Zotero' are provided for importing references and references can be created from locally stored 'PDF' files using 'Poppler'. Includes functions for citing and generating a bibliography with hyperlinks for documents prepared with 'RMarkdown' or 'RHTML'.

Maintained by Mathew W. McLean. Last updated 4 months ago.

peer-reviewed

5.6 match 115 stars 12.06 score 2.3k scripts 16 dependents

rezakj

iCellR:Analyzing High-Throughput Single Cell Sequencing Data

A toolkit that allows scientists to work with data from single cell sequencing technologies such as scRNA-seq, scVDJ-seq, scATAC-seq, CITE-Seq and Spatial Transcriptomics (ST). Single (i) Cell R package ('iCellR') provides unprecedented flexibility at every step of the analysis pipeline, including normalization, clustering, dimensionality reduction, imputation, visualization, and so on. Users can design both unsupervised and supervised models to best suit their research. In addition, the toolkit provides 2D and 3D interactive visualizations, differential expression analysis, filters based on cells, genes and clusters, data merging, normalizing for dropouts, data imputation methods, correcting for batch differences, pathway analysis, tools to find marker genes for clusters and conditions, predict cell types and pseudotime analysis. See Khodadadi-Jamayran, et al (2020) <doi:10.1101/2020.05.05.078550> and Khodadadi-Jamayran, et al (2020) <doi:10.1101/2020.03.31.019109> for more details.

Maintained by Alireza Khodadadi-Jamayran. Last updated 8 months ago.

10xgenomics 3d batch-normalization cell-type-classification cite-seq clustering clustering-algorithm diffusion-maps dropout icellr imputation intractive-graph normalization pseudotime scrna-seq scvdj-seq singel-cell-sequencing umap cpp

9.7 match 121 stars 5.56 score 7 scripts 1 dependent

alexisvdb

singleCellHaystack:A Universal Differential Expression Prediction Tool for Single-Cell and Spatial Genomics Data

One key exploratory analysis step in single-cell genomics data analysis is the prediction of features with different activity levels. For example, we want to predict differentially expressed genes (DEGs) in single-cell RNA-seq data, spatial DEGs in spatial transcriptomics data, or differentially accessible regions (DARs) in single-cell ATAC-seq data. 'singleCellHaystack' predicts differentially active features in single cell omics datasets without relying on the clustering of cells into arbitrary clusters. 'singleCellHaystack' uses Kullback-Leibler divergence to find features (e.g., genes, genomic regions, etc) that are active in subsets of cells that are non-randomly positioned inside an input space (such as 1D trajectories, 2D tissue sections, multi-dimensional embeddings, etc). For the theoretical background of 'singleCellHaystack' we refer to our original paper Vandenbon and Diez (Nature Communications, 2020) <doi:10.1038/s41467-020-17900-3> and our update Vandenbon and Diez (Scientific Reports, 2023) <doi:10.1038/s41598-023-38965-2>.

Maintained by Alexis Vandenbon. Last updated 1 year ago.

bioinformatics cite-seq pseudotime scatac-seq single-cell spatial-proteomics spatial-transcriptomics transcriptomics

7.5 match 81 stars 6.71 score 64 scripts
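The singleCellHaystack description above relies on the Kullback-Leibler divergence. A minimal sketch of that quantity for two discrete distributions follows. It is not the package's implementation. The framing in the comments (a reference distribution of all cells over bins of the input space versus the distribution of cells in which a feature is active) is an assumption made for illustration, and `kl_divergence` is a hypothetical name.

```python
import math


def kl_divergence(p, q):
    """D_KL(P || Q) in nats for two discrete distributions over the same bins.

    p: e.g. the fraction of feature-active cells falling in each bin (assumed)
    q: e.g. the fraction of all cells falling in each bin (assumed reference)
    q must be positive wherever p is positive.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    # Bins with p_i == 0 contribute nothing to the sum.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

The divergence is zero when the two distributions coincide and grows as the feature's cells concentrate in bins that the reference distribution considers unlikely.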

bioc

recountmethylation:Access and analyze public DNA methylation array data compilations

Resources for cross-study analyses of public DNAm array data from NCBI GEO repo, produced using Illumina's Infinium HumanMethylation450K (HM450K) and MethylationEPIC (EPIC) platforms. Provided functions enable download, summary, and filtering of large compilation files. Vignettes detail background about file formats, example analyses, and more. Note the disclaimer on package load and consult the main manuscripts for further info.

Maintained by Sean K Maden. Last updated 5 months ago.

dnamethylation epigenetics microarray methylationarray experimenthub

7.5 match 9 stars 6.28 score 9 scripts

mathewchamberlain

SignacX:Cell Type Identification and Discovery from Single Cell Gene Expression Data

An implementation of neural networks trained with flow-sorted gene expression data to classify cellular phenotypes in single cell RNA-sequencing data. See Chamberlain M et al. (2021) <doi:10.1101/2021.02.01.429207> for more details.

Maintained by Mathew Chamberlain. Last updated 2 years ago.

cellular-phenotypes seurat single-cell-rna-seq

5.6 match 24 stars 6.46 score 34 scripts

bioc

beadarray:Quality assessment and low-level analysis for Illumina BeadArray data

The package is able to read bead-level data (raw TIFFs and text files) output by BeadScan as well as bead-summary data from BeadStudio. Methods for quality assessment and low-level analysis are provided.

Maintained by Mark Dunning. Last updated 5 months ago.

microarray onechannel qualitycontrol preprocessing

4.5 match 7.88 score 70 scripts 4 dependents

bioc

MuData:Serialization for MultiAssayExperiment Objects

Save MultiAssayExperiments to h5mu files supported by muon and mudata. Muon is a Python framework for multimodal omics data analysis. It uses an HDF5-based format for data storage.

Maintained by Ilia Kats. Last updated 20 days ago.

dataimport anndata bioconductor mudata multi-omics multimodal-omics scrna-seq

6.0 match 5 stars 5.89 score 26 scripts

arnaudgallou

pakret:Cite 'R' Packages on the Fly in 'R Markdown' and 'Quarto'

References and cites 'R' and 'R' packages on the fly in 'R Markdown' and 'Quarto'. 'pakret' provides a minimalistic API that generates preformatted citations of 'R' and 'R' packages, and adds their reference to a '.bib' file directly from within your document.

Maintained by Arnaud Gallou. Last updated 18 days ago.

bib bibtex citation citations generate

7.1 match 5 stars 4.51 score 5 scripts

cjvanlissa

worcs:Workflow for Open Reproducible Code in Science

Create reproducible and transparent research projects in 'R'. This package is based on the Workflow for Open Reproducible Code in Science (WORCS), a step-by-step procedure based on best practices for Open Science. It includes an 'RStudio' project template, several convenience functions, and all dependencies required to make your project reproducible and transparent. WORCS is explained in the tutorial paper by Van Lissa, Brandmaier, Brinkman, Lamprecht, Struiksma, & Vreede (2021). <doi:10.3233/DS-210031>.

Maintained by Caspar J. Van Lissa. Last updated 11 days ago.

3.3 match 83 stars 9.26 score 59 scripts

dormancy1

lefko3:Historical and Ahistorical Population Projection Matrix Analysis

Complete analytical environment for the construction and analysis of matrix population models and integral projection models. Includes the ability to construct historical matrices, which are 2d matrices comprising 3 consecutive times of demographic information. Estimates both raw and function-based forms of historical and standard ahistorical matrices. It also estimates function-based age-by-stage matrices and raw and function-based Leslie matrices.

Maintained by Richard P. Shefferson. Last updated 4 days ago.

openblas cpp

9.0 match 3.30 score 11 scripts

vegandevs

vegan:Community Ecology Package

Ordination methods, diversity analysis and other functions for community and vegetation ecologists.

Maintained by Jari Oksanen. Last updated 16 days ago.

ecological-modelling ecology ordination fortran openblas

1.5 match 472 stars 19.41 score 15k scripts 440 dependents

bioc

recount:Explore and download data from the recount project

Explore and download data from the recount project available at https://jhubiostatistics.shinyapps.io/recount/. Using the recount package you can download RangedSummarizedExperiment objects at the gene, exon or exon-exon junctions level, the raw counts, the phenotype metadata used, the urls to the sample coverage bigWig files or the mean coverage bigWig file for a particular study. The RangedSummarizedExperiment objects can be used by different packages for performing differential expression analysis. Using http://bioconductor.org/packages/derfinder you can perform annotation-agnostic differential expression analyses with the data from the recount project as described at http://www.nature.com/nbt/journal/v35/n4/full/nbt.3838.html.

Maintained by Leonardo Collado-Torres. Last updated 3 months ago.

coverage differentialexpression geneexpression rnaseq sequencing software dataimport immunooncology annotation-agnostic bioconductor count derfinder deseq2 exon gene human illumina junction recount

2.8 match 41 stars 9.57 score 498 scripts 3 dependents

geobosh

Rdpack:Update and Manipulate Rd Documentation Objects

Functions for manipulation of R documentation objects, including functions reprompt() and ereprompt() for updating 'Rd' documentation for functions, methods and classes; 'Rd' macros for citations and import of references from 'bibtex' files for use in 'Rd' files and 'roxygen2' comments; 'Rd' macros for evaluating and inserting snippets of 'R' code and the results of its evaluation or creating graphics on the fly; and many functions for manipulation of references and Rd files.

Maintained by Georgi N. Boshnakov. Last updated 11 hours ago.

bibtex bibtex-references citations documentation rd-format roxygen2

1.9 match 30 stars 13.76 score 73 scripts 2.3k dependents

svmiller

stevemisc:Steve's Miscellaneous Functions

These are miscellaneous functions that I find useful for my research and teaching. The contents include themes for plots, functions for simulating quantities of interest from regression models, functions for simulating various forms of fake data for instructional/research purposes, and many more. All told, the functions provided here are broadly useful for data organization, data presentation, data recoding, and data simulation.

Maintained by Steve Miller. Last updated 6 days ago.

dplyr mixed-effects-models multivariate-normal-distribution tidyverse

3.8 match 10 stars 6.85 score 392 scripts 2 dependents

crsh

rmdfiltr:'Lua'-Filters for R Markdown

A collection of 'Lua' filters that extend the functionality of R Markdown templates (e.g., count words or post-process citations).

Maintained by Frederik Aust. Last updated 5 months ago.

3.1 match 42 stars 8.08 score 4 scripts 3 dependents

waldronlab

SingleCellMultiModal:Integrating Multi-modal Single Cell Experiment datasets

SingleCellMultiModal is an ExperimentHub package that serves multiple datasets obtained from GEO and other sources and represents them as MultiAssayExperiment objects. We provide several multi-modal datasets including scNMT, 10X Multiome, seqFISH, CITEseq, SCoPE2, and others. The scope of the package is to provide data for benchmarking and analysis. To cite, use the 'citation' function and see <https://doi.org/10.1371/journal.pcbi.1011324>.

Maintained by Marcel Ramos. Last updated 4 months ago.

experimentdata singlecelldata reproducibleresearch experimenthub geo bioconductor-package u24ca289073

3.3 match 17 stars 7.29 score 60 scripts

bioc

lute:Framework for cell size scale factor normalized bulk transcriptomics deconvolution experiments

Provides a framework for adjusting for cell type size when performing bulk transcriptomics deconvolution. The main framework function provides a means of reference normalization using cell size scale factors. It allows for marker selection and deconvolution using non-negative least squares (NNLS) by default. The framework is extensible for other marker selection and deconvolution algorithms, and users may reuse the generics, methods, and classes for these when developing new algorithms.

Maintained by Sean K Maden. Last updated 5 months ago.

rnaseq sequencing singlecell coverage transcriptomics normalization

4.5 match 2 stars 5.26 score 3 scripts

ropensci

rgbif:Interface to the Global Biodiversity Information Facility API

A programmatic interface to the Web Service methods provided by the Global Biodiversity Information Facility (GBIF; <https://www.gbif.org/developer/summary>). GBIF is a database of species occurrence records from sources all over the globe. rgbif includes functions for searching for taxonomic names, retrieving information on data providers, getting species occurrence records, getting counts of occurrence records, and using the GBIF tile map service to make rasters summarizing huge amounts of data.

Maintained by John Waller. Last updated 3 days ago.

gbif specimens api web-services occurrences species taxonomy biodiversity data lifewatch oscibio spocc

1.8 match 161 stars 13.26 score 2.1k scripts 20 dependents

ropensci

tidypmc:Parse Full Text XML Documents from PubMed Central

Parse XML documents from the Open Access subset of Europe PubMed Central <https://europepmc.org> including section paragraphs, tables, captions and references.

Maintained by Chris Stubben. Last updated 5 years ago.

3.8 match 33 stars 5.95 score 27 scripts

bioc

MultiAssayExperiment:Software for the integration of multi-omics experiments in Bioconductor

Harmonize data management of multiple experimental assays performed on an overlapping set of specimens. It provides a familiar Bioconductor user experience by extending concepts from SummarizedExperiment, supporting an open-ended mix of standard data classes for individual assays, and allowing subsetting by genomic ranges or rownames. Facilities are provided for reshaping data into wide and long formats for adaptability to graphing and downstream analysis.

Maintained by Marcel Ramos. Last updated 2 months ago.

infrastructure datarepresentation bioconductor bioconductor-package genomics nci-itcr tcga u24ca289073

1.5 match 71 stars 14.95 score 670 scripts 127 dependents

bioc

CiteFuse:CiteFuse: multi-modal analysis of CITE-seq data

The CiteFuse package implements a suite of methods and tools for CITE-seq data from pre-processing to integrative analytics, including doublet detection, network-based modality integration, cell type clustering, differential RNA and protein expression analysis, ADT evaluation, ligand-receptor interaction analysis, and interactive web-based visualisation of the analyses.

Maintained by Yingxin Lin. Last updated 5 months ago.

singlecell geneexpression bioinformatics single-cell cpp

3.4 match 27 stars 6.59 score 18 scripts

crsh

papaja:Prepare American Psychological Association Journal Articles with R Markdown

Tools to create dynamic, submission-ready manuscripts, which conform to American Psychological Association manuscript guidelines. We provide R Markdown document formats for manuscripts (PDF and Word) and revision letters (PDF). Helper functions facilitate reporting statistical analyses or create publication-ready tables and plots.

Maintained by Frederik Aust. Last updated 18 days ago.

apa apa-guidelines journal manuscript psychology reproducible-paper reproducible-research rmarkdown

1.9 match 662 stars 11.74 score 1.7k scripts 1 dependent

hkgjess

Haplin:Analyzing Case-Parent Triad and/or Case-Control Data with SNP Haplotypes

Performs genetic association analyses of case-parent triad (trio) data with multiple markers. It can also incorporate complete or incomplete control triads, for instance independent control children. Estimation is based on haplotypes, for instance SNP haplotypes, even though phase is not known from the genetic data. 'Haplin' estimates relative risk (RR + conf.int.) and p-value associated with each haplotype. It uses maximum likelihood estimation to make optimal use of data from triads with missing genotypic data, for instance if some SNPs have not been typed for some individuals. 'Haplin' also allows estimation of effects of maternal haplotypes and parent-of-origin effects, particularly appropriate in perinatal epidemiology. 'Haplin' allows special models, like X-inactivation, to be fitted on the X-chromosome. A GxE analysis allows testing interactions between environment and all estimated genetic effects. The models were originally described in "Gjessing HK and Lie RT. Case-parent triads: Estimating single- and double-dose effects of fetal and maternal disease gene haplotypes. Annals of Human Genetics (2006) 70, pp. 382-396".

Maintained by Hakon K. Gjessing. Last updated 7 months ago.

cpp

4.5 match 3 stars 4.87 score 49 scripts

bioc

GEOquery:Get data from NCBI Gene Expression Omnibus (GEO)

The NCBI Gene Expression Omnibus (GEO) is a public repository of microarray data. Given the rich and varied nature of this resource, it is only natural to want to apply BioConductor tools to these data. GEOquery is the bridge between GEO and BioConductor.

Maintained by Sean Davis. Last updated 5 months ago.

microarray dataimport onechannel twochannel sage bioconductor bioinformatics data-science genomics ncbi-geo

1.5 match 92 stars 14.46 score 4.1k scripts 44 dependents

cols4all

cols4all:Colors for all

Color palettes for all people, including those with color vision deficiency. Popular color palette series have been organized by type and have been scored on several properties such as color-blind-friendliness and fairness (i.e. do colors stand out equally?). Users can also load and analyse their own palettes. Besides the common palette types (categorical, sequential, and diverging) it also includes cyclic and bivariate color palettes. Furthermore, a color for missing values is assigned to each palette.

Maintained by Martijn Tennekes. Last updated 2 months ago.

2.0 match 343 stars 9.98 score 26 dependents

wadpac

GGIR:Raw Accelerometer Data Analysis

A tool to process and analyse data collected with wearable raw acceleration sensors as described in Migueles and colleagues (JMPB 2019), and van Hees and colleagues (JApplPhysiol 2014; PLoSONE 2015). The package has been developed and tested for binary data from 'GENEActiv' <https://activinsights.com/>, binary (.gt3x) and .csv-export data from 'Actigraph' <https://theactigraph.com> devices, and binary (.cwa) and .csv-export data from 'Axivity' <https://axivity.com>. These devices are currently widely used in research on human daily physical activity. Further, the package can handle accelerometer data files from any other sensor brand, provided that the data is stored in csv format. The package also allows for external function embedding.

Maintained by Vincent T van Hees. Last updated 2 days ago.

accelerometer activity-recognition circadian-rhythm movement-sensor sleep

1.5 match 109 stars 13.20 score 342 scripts 3 dependents

mpierrejean

jointseg:Joint Segmentation of Multivariate (Copy Number) Signals

Methods for fast segmentation of multivariate signals into piecewise constant profiles and for generating realistic copy-number profiles. A typical application is the joint segmentation of total DNA copy numbers and allelic ratios obtained from Single Nucleotide Polymorphism (SNP) microarrays in cancer studies. The methods are described in Pierre-Jean, Rigaill and Neuvial (2015) <doi:10.1093/bib/bbu026>.

Maintained by Morgane Pierre-Jean. Last updated 6 years ago.

cpp

3.0 match 6 stars 6.50 score 44 scripts 2 dependents

ngreifer

cobalt:Covariate Balance Tables and Plots

Generate balance tables and plots for covariates of groups preprocessed through matching, weighting or subclassification, for example, using propensity scores. Includes integration with 'MatchIt', 'WeightIt', 'MatchThem', 'twang', 'Matching', 'optmatch', 'CBPS', 'ebal', 'cem', 'sbw', and 'designmatch' for assessing balance on the output of their preprocessing functions. Users can also specify data for balance assessment not generated through the above packages. Also included are methods for assessing balance in clustered or multiply imputed data sets or data sets with multi-category, continuous, or longitudinal treatments.

Maintained by Noah Greifer. Last updated 11 months ago.

causal-inference propensity-scores

1.5 match 75 stars 12.98 score 1.0k scripts 8 dependents

tconwell

html5:Creates Valid HTML5 Strings

Generates valid HTML tag strings for HTML5 elements documented by Mozilla. Attributes are passed as named lists, with names being the attribute name and values being the attribute value. Attribute values are automatically double-quoted. To declare a DOCTYPE, wrap html() with function doctype(). Mozilla's documentation for HTML5 is available here: <https://developer.mozilla.org/en-US/docs/Web/HTML/Element>. Elements marked as obsolete are not included.

Maintained by Timothy Conwell. Last updated 2 years ago.

5.2 match 1 star 3.65 score 1 script 3 dependents
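The approach described for html5 above (attributes passed as name/value pairs, values automatically double-quoted, and a doctype wrapper around the html element) can be sketched in a few lines. This is a Python illustration of the idea, not the package's R interface. The function names `tag` and `doctype` and their signatures are hypothetical, and the escaping of attribute values is an addition of this sketch that the description does not mention.

```python
from html import escape


def tag(name, content="", attrs=None):
    """Build an HTML element string with double-quoted attribute values."""
    attrs = attrs or {}
    # Each attribute becomes ` key="value"`; quotes inside values are escaped
    # (an assumption of this sketch, not something the description states).
    attr_text = "".join(
        f' {key}="{escape(str(value), quote=True)}"' for key, value in attrs.items()
    )
    return f"<{name}{attr_text}>{content}</{name}>"


def doctype(html_string):
    """Prefix an already-built html element with the HTML5 doctype."""
    return "<!DOCTYPE html>" + html_string
```

For instance, `tag("a", "link", {"href": "https://example.org"})` builds an anchor element, and `doctype(tag("html", tag("body")))` wraps a whole document.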

data-cleaning

validate:Data Validation Infrastructure

Declare data validation rules and data quality indicators; confront data with them and analyze or visualize the results. The package supports rules that are per-field, in-record, cross-record or cross-dataset. Rules can be automatically analyzed for rule type and connectivity. Supports checks implied by an SDMX DSD file as well. See also Van der Loo and De Jonge (2018) <doi:10.1002/9781118897126>, Chapter 6 and the JSS paper (2021) <doi:10.18637/jss.v097.i10>.

Maintained by Mark van der Loo. Last updated 12 days ago.

data-cleaning validation

1.5 match 418 stars 12.50 score 448 scripts 9 dependents

lcrawlab

smer:Sparse Marginal Epistasis Test

The Sparse Marginal Epistasis Test is a computationally efficient genetics method which detects statistical epistasis in complex traits; see Stamp et al. (2025, <doi:10.1101/2025.01.11.632557>) for details.

Maintained by Julian Stamp. Last updated 2 months ago.

genomewideassociation epistasis genetics snp linearmixedmodel cpp epistasis-analysis epistatis gwas gwas-tools mapit zlib cpp openmp

3.8 match 1 star 4.95 score 8 scripts

bioc

sccomp:Tests differences in cell-type proportion for single-cell data, robust to outliers

A robust and outlier-aware method for testing differences in cell-type proportion in single-cell data. This model can infer changes in tissue composition and heterogeneity, and can produce realistic data simulations based on any existing dataset. This model can also transfer knowledge from a large set of integrated datasets to increase accuracy further.

Maintained by Stefano Mangiola. Last updated 1 day ago.

bayesian regression differentialexpression singlecell metagenomics flowcytometry spatial batch-correction composition cytof differential-proportion microbiome multilevel proportions random-effects single-cell unwanted-variation

2.2 match 99 stars 8.43 score 69 scripts

bioc

IsoformSwitchAnalyzeR:Identify, Annotate and Visualize Isoform Switches with Functional Consequences from both short- and long-read RNA-seq data

Analysis of alternative splicing and isoform switches with predicted functional consequences (e.g. gain/loss of protein domains etc.) from quantification of all types of RNASeq by tools such as Kallisto, Salmon, StringTie, Cufflinks/Cuffdiff etc.

Maintained by Kristoffer Vitting-Seerup. Last updated 5 months ago.

geneexpression transcription alternativesplicing differentialexpression differentialsplicing visualization statisticalmethod transcriptomevariant biomedicalinformatics functionalgenomics systemsbiology transcriptomics rnaseq annotation functionalprediction geneprediction dataimport multiplecomparison batcheffect immunooncology

2.0 match 108 stars 9.26 score 125 scripts

florianhartig

DHARMa:Residual Diagnostics for Hierarchical (Multi-Level / Mixed) Regression Models

The 'DHARMa' package uses a simulation-based approach to create readily interpretable scaled (quantile) residuals for fitted (generalized) linear mixed models. Currently supported are linear and generalized linear (mixed) models from 'lme4' (classes 'lmerMod', 'glmerMod'), 'glmmTMB', 'GLMMadaptive', and 'spaMM'; phylogenetic linear models from 'phylolm' (classes 'phylolm' and 'phyloglm'); generalized additive models ('gam' from 'mgcv'); 'glm' (including 'negbin' from 'MASS', but excluding quasi-distributions) and 'lm' model classes. Moreover, externally created simulations, e.g. posterior predictive simulations from Bayesian software such as 'JAGS', 'STAN', or 'BUGS' can be processed as well. The resulting residuals are standardized to values between 0 and 1 and can be interpreted as intuitively as residuals from a linear regression. The package also provides a number of plot and test functions for typical model misspecification problems, such as over/underdispersion, zero-inflation, and residual spatial, phylogenetic and temporal autocorrelation.

Maintained by Florian Hartig. Last updated 12 days ago.

glmm regression regression-diagnostics residual

1.3 match 226 stars 14.74 score 2.8k scripts 10 dependents

christophergandrud

repmis:Miscellaneous Tools for Reproducible Research

Tools to load 'R' packages and automatically generate BibTeX files citing them as well as load and cache plain-text and 'Excel' formatted data stored on 'GitHub', and from other sources.

Maintained by Christopher Gandrud. Last updated 9 years ago.

2.3 match 24 stars 7.54 score 394 scripts 10 dependents

bioc

minfi:Analyze Illumina Infinium DNA methylation arrays

Tools to analyze & visualize Illumina Infinium methylation arrays.

Maintained by Kasper Daniel Hansen. Last updated 4 months ago.

immunooncology dnamethylation differentialmethylation epigenetics microarray methylationarray multichannel twochannel dataimport normalization preprocessing qualitycontrol

1.3 match 60 stars 12.83 score 996 scripts 26 dependents

lazappi

clustree:Visualise Clusterings at Different Resolutions

Deciding what resolution to use can be a difficult question when approaching a clustering analysis. One way to approach this problem is to look at how samples move as the number of clusters increases. This package allows you to produce clustering trees, a visualisation for interrogating clusterings as resolution increases.

Maintained by Luke Zappia. Last updated 1 years ago.

clustering clustering-trees visualisation visualization

1.5 match 219 stars 11.40 score 1.9k scripts 5 dependents

stencila

stencilaschema:Bindings for Stencila Schema

Provides R bindings for the Stencila Schema <https://schema.stenci.la>. This package is primarily aimed at R developers wanting to programmatically generate, or modify, executable documents.

Maintained by Nokome Bentley. Last updated 3 years ago.

json-schema python rust schema-org semantic typescript vocabulary

3.3 match 17 stars 4.93 score 2 scripts

t-kalinowski

keras:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks 'API'. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.

Maintained by Tomasz Kalinowski. Last updated 11 months ago.

1.5 match 10.82 score 10k scripts 54 dependents

mflores72000

ILS:Interlaboratory Study

Performs interlaboratory studies (ILS) to detect laboratories that provide inconsistent results compared to others. It can work simultaneously with several testing materials, from both standard univariate and functional data analysis (FDA) perspectives. The univariate approach, based on ASTM E691-08, consists of estimating Mandel's h and k statistics to identify laboratories that provide significantly different results, and also tests for the presence of outliers using the Cochran and Grubbs tests. Analysis of variance (ANOVA) techniques (F and Tukey tests) are provided to test differences in means between laboratories for each material. To take into account the functional nature of data retrieved in analytical chemistry, applied physics and engineering (spectra, thermograms, etc.), the ILS package provides an FDA approach for finding the distribution of Mandel's k and h statistics by smoothed bootstrap resampling.

Maintained by Miguel Flores. Last updated 2 years ago.

2.5 match 6.48 score 75 scripts

ropenspain

spanishoddata:Get Spanish Origin-Destination Data

Gain seamless access to origin-destination (OD) data from the Spanish Ministry of Transport, hosted at <https://www.transportes.gob.es/ministerio/proyectos-singulares/estudios-de-movilidad-con-big-data/opendata-movilidad>. This package simplifies the management of these large datasets by providing tools to download zone boundaries, handle associated origin-destination data, and process it efficiently with the 'duckdb' database interface. Local caching minimizes repeated downloads, streamlining workflows for researchers and analysts. Extensive documentation is available at <https://ropenspain.github.io/spanishoddata/index.html>, offering guides on creating static and dynamic mobility flow visualizations and transforming large datasets into analysis-ready formats.

Maintained by Egor Kotov. Last updated 7 days ago.

cdr data data-package mobile-telephone-data mobility origin-destination

2.0 match 35 stars 7.89 score 14 scripts

google

CausalImpact:Inferring Causal Effects using Bayesian Structural Time-Series Models

Implements a Bayesian approach to causal impact estimation in time series, as described in Brodersen et al. (2015) <DOI:10.1214/14-AOAS788>. See the package documentation on GitHub <https://google.github.io/CausalImpact/> to get started.

Maintained by Alain Hauser. Last updated 2 years ago.

1.3 match 1.7k stars 11.73 score 276 scripts 2 dependents

bioc

PhyloProfile:PhyloProfile

PhyloProfile is a tool for exploring complex phylogenetic profiles. Phylogenetic profiles, presence/absence patterns of genes over a set of species, are commonly used to trace the functional and evolutionary history of genes across species and time. With PhyloProfile we can enrich regular phylogenetic profiles with further data like sequence/structure similarity, to make phylogenetic profiling more meaningful. Besides the interactive visualisation powered by R-Shiny, the package offers a set of further analysis features, such as gene age estimation and core gene identification.

Maintained by Vinh Tran. Last updated 7 days ago.

software visualization datarepresentation multiplecomparison functionalprediction dimensionreduction bioinformatics heatmap interactive-visualizations orthologs phylogenetic-profile shiny

2.0 match 33 stars 7.77 score 10 scripts

bioc

QUBIC:An R package for qualitative biclustering in support of gene co-expression analyses

The core function of this R package is to provide the implementation of the well-cited and well-reviewed QUBIC algorithm, aiming to deliver an effective and efficient biclustering capability. This package also includes the following related functions: (i) a qualitative representation of the input gene expression data, through a well-designed discretization method that considers the underlying data properties, which can be directly used in other biclustering programs; (ii) visualization of identified biclusters using heatmaps in support of overall expression pattern analysis; (iii) bicluster-based co-expression network elucidation and visualization, where different correlation coefficient scores between a pair of genes are provided; and (iv) a generalized output format for biclusters and the corresponding network, which can be freely downloaded so that a user can easily perform subsequent comprehensive functional enrichment analysis (e.g. DAVID) and advanced network visualization (e.g. Cytoscape).

Maintained by Yu Zhang. Last updated 5 months ago.

statisticalmethod microarray differentialexpression multiplecomparison clustering visualization geneexpression network bioconductor-package bioconductor-packages cpp openmp

2.5 match 3 stars 6.10 score 14 scripts 1 dependents

silvadenisson

electionsBR:R Functions to Download and Clean Brazilian Electoral Data

Offers a set of functions to easily download and clean Brazilian electoral data from the Superior Electoral Court and 'CepespData' websites. Among other features, the package retrieves data on local and federal elections for all positions (city councilor, mayor, state deputy, federal deputy, governor, and president) aggregated by state, city, and electoral zones.

Maintained by Denisson Silva. Last updated 4 months ago.

2.0 match 65 stars 7.54 score 66 scripts

l-ramirez-lopez

prospectr:Miscellaneous Functions for Processing and Sample Selection of Spectroscopic Data

Functions to preprocess spectroscopic data and conduct (representative) sample selection/calibration sampling.

Maintained by Leonardo Ramirez-Lopez. Last updated 11 days ago.

chemometrics derivatives infrared near-infrared nir pedometrics preprocessing resample sampling signal soil-spectroscopy spectroscopy openblas cpp openmp

1.5 match 42 stars 10.00 score 326 scripts 4 dependents

ropengov

helsinki:R Tools for Helsinki Open Data

Tools for accessing various open data APIs in the Helsinki region in Finland. Current data sources include the Service Map API, Linked Events API, and Helsinki Region Infoshare statistics API.

Maintained by Juuso Parkkinen. Last updated 2 years ago.

ropengov finland helsinki helsinki-region

2.8 match 6 stars 5.28 score 21 scripts

daniel1noble

metaDigitise:Extract and Summarise Data from Published Figures

High-throughput, flexible and reproducible extraction of data from figures in primary research papers. metaDigitise() can extract data and / or automatically calculate summary statistics for users from box plots, bar plots (e.g., mean and errors), scatter plots and histograms.

Maintained by Daniel Noble. Last updated 9 months ago.

2.3 match 82 stars 6.46 score 35 scripts

pakillo

grateful:Facilitate Citation of R Packages

Facilitates the citation of R packages used in analysis projects. Scans project for packages used, gets their citations, and produces a document with citations in the preferred bibliography format, ready to be pasted into reports or manuscripts. Alternatively, 'grateful' can be used directly within an 'R Markdown' or 'Quarto' document.

Maintained by Francisco Rodriguez-Sanchez. Last updated 2 days ago.

citation-generator software-citation

1.8 match 229 stars 8.04 score 269 scripts

ropengov

pxweb:R Interface to PXWEB APIs

Generic interface for the PX-Web/PC-Axis API. The PX-Web/PC-Axis API is used by organizations such as Statistics Sweden and Statistics Finland to disseminate data. The R package can interact with all PX-Web/PC-Axis APIs to fetch information about the data hierarchy, extract metadata and extract and parse statistics to R data.frame format. PX-Web is a solution to disseminate PC-Axis data files in dynamic tables on the web. Since 2013, PX-Web has included an API to disseminate PC-Axis files.

Maintained by Mans Magnusson. Last updated 1 years ago.

ropengov

1.9 match 66 stars 7.67 score 2 dependents

ndphillips

FFTrees:Generate, Visualise, and Evaluate Fast-and-Frugal Decision Trees

Create, visualize, and test fast-and-frugal decision trees (FFTs) using the algorithms and methods described by Phillips, Neth, Woike & Gaissmaier (2017), <doi:10.1017/S1930297500006239>. FFTs are simple and transparent decision trees for solving binary classification problems. FFTs can be preferable to more complex algorithms because they require very little information, are easy to understand and communicate, and are robust against overfitting.

Maintained by Hansjoerg Neth. Last updated 5 months ago.

1.5 match 135 stars 9.58 score 144 scripts

r-forge

CHNOSZ:Thermodynamic Calculations and Diagrams for Geochemistry

An integrated set of tools for thermodynamic calculations in aqueous geochemistry and geobiochemistry. Functions are provided for writing balanced reactions to form species from user-selected basis species and for calculating the standard molal properties of species and reactions, including the standard Gibbs energy and equilibrium constant. Calculations of the non-equilibrium chemical affinity and equilibrium chemical activity of species can be portrayed on diagrams as a function of temperature, pressure, or activity of basis species; in two dimensions, this gives a maximum affinity or predominance diagram. The diagrams have formatted chemical formulas and axis labels, and water stability limits can be added to Eh-pH, oxygen fugacity-temperature, and other diagrams with a redox variable. The package has been developed to handle common calculations in aqueous geochemistry, such as solubility due to complexation of metal ions, mineral buffers of redox or pH, and changing the basis species across a diagram ("mosaic diagrams"). CHNOSZ also implements a group additivity algorithm for the standard thermodynamic properties of proteins.

Maintained by Jeffrey Dick. Last updated 8 days ago.

fortran

1.5 match 9.46 score 238 scripts 4 dependents

r-barnes

dggridR:Discrete Global Grids

Spatial analyses involving binning require that every bin have the same area, but this is impossible using a rectangular grid laid over the Earth or over any projection of the Earth. Discrete global grids use hexagons, triangles, and diamonds to overcome this issue, overlaying the Earth with equally-sized bins. This package provides utilities for working with discrete global grids, along with utilities to aid in plotting such data.

Maintained by Sebastian Krantz. Last updated 6 months ago.

discrete-global-grids geospatial spatial-analysis cpp

1.5 match 168 stars 9.37 score 388 scripts 1 dependents

eguidotti

bidask:Efficient Estimation of Bid-Ask Spreads from Open, High, Low, and Close Prices

Implements the efficient estimator of bid-ask spreads from open, high, low, and close prices described in Ardia, Guidotti, & Kroencke (JFE, 2024) <doi:10.1016/j.jfineco.2024.103916>. It also provides an implementation of the estimators described in Roll (JF, 1984) <doi:10.1111/j.1540-6261.1984.tb03897.x>, Corwin & Schultz (JF, 2012) <doi:10.1111/j.1540-6261.2012.01729.x>, and Abdi & Ranaldo (RFS, 2017) <doi:10.1093/rfs/hhx084>.

Maintained by Emanuele Guidotti. Last updated 19 days ago.

finance

2.0 match 107 stars 6.98 score 6 scripts

ropensci

rredlist:'IUCN' Red List Client

'IUCN' Red List (<https://api.iucnredlist.org/>) client. The 'IUCN' Red List is a global list of threatened and endangered species. Functions cover all of the Red List 'API' routes. An 'API' key is required.

Maintained by William Gearty. Last updated 1 months ago.

iucn biodiversity api web-services traits habitat species conservation api-wrapper iucn-red-list taxize

1.2 match 53 stars 11.49 score 195 scripts 24 dependents

january3

tmod:Feature Set Enrichment Analysis for Metabolomics and Transcriptomics

Methods and feature set definitions for feature or gene set enrichment analysis in transcriptional and metabolic profiling data. Package includes tests for enrichment based on ranked lists of features, functions for visualisation and multivariate functional analysis. See Zyla et al. (2019) <doi:10.1093/bioinformatics/btz447>.

Maintained by January Weiner. Last updated 2 months ago.

2.0 match 3 stars 6.88 score 168 scripts 1 dependents

bioc

bambu:Context-Aware Transcript Quantification from Long Read RNA-Seq data

bambu is an R package for multi-sample transcript discovery and quantification using long read RNA-Seq data. You can use bambu after read alignment to obtain expression estimates for known and novel transcripts and genes. The output from bambu can directly be used for visualisation and downstream analysis such as differential gene expression or transcript usage.

Maintained by Ying Chen. Last updated 1 months ago.

alignment coverage differentialexpression featureextraction geneexpression genomeannotation genomeassembly immunooncology longread multiplecomparison normalization rnaseq regression sequencing software transcription transcriptomics bambu bioconductor long-reads nanopore nanopore-sequencing rna-seq rna-seq-analysis transcript-quantification transcript-reconstruction cpp

1.5 match 197 stars 9.03 score 91 scripts 1 dependents

annechao

iNEXT.3D:Interpolation and Extrapolation for Three Dimensions of Biodiversity

Biodiversity is a multifaceted concept covering different levels of organization from genes to ecosystems. 'iNEXT.3D' extends 'iNEXT' to include three dimensions (3D) of biodiversity, i.e., taxonomic diversity (TD), phylogenetic diversity (PD) and functional diversity (FD). This package provides functions to compute standardized 3D diversity estimates with a common sample size or sample coverage. A unified framework based on Hill numbers and their generalizations (Hill-Chao numbers) is used to quantify 3D. All 3D estimates are in the same units of species/lineage equivalents and can be meaningfully compared. The package features size- and coverage-based rarefaction and extrapolation sampling curves to facilitate rigorous comparison of 3D diversity across individual assemblages. Asymptotic 3D diversity estimates are also provided. See Chao et al. (2021) <doi:10.1111/2041-210X.13682> for more details.

Maintained by Anne Chao. Last updated 27 days ago.

cpp

2.0 match 6.74 score 26 scripts 2 dependents

bioc

RaggedExperiment:Representation of Sparse Experiments and Assays Across Samples

This package provides a flexible representation of copy number, mutation, and other data that fit into the ragged array schema for genomic location data. The basic representation of such data provides a rectangular flat table interface to the user with range information in the rows and samples/specimen in the columns. The RaggedExperiment class derives from a GRangesList representation and provides a semblance of a rectangular dataset.

Maintained by Marcel Ramos. Last updated 4 months ago.

infrastructure datarepresentation copynumber core-package data-structure mutations u24ca289073

1.5 match 4 stars 8.96 score 76 scripts 15 dependents

mandymejia

ciftiTools:Tools for Reading, Writing, Viewing and Manipulating CIFTI Files

CIFTI files contain brain imaging data in "grayordinates," which represent the gray matter as cortical surface vertices (left and right) and subcortical voxels (cerebellum, basal ganglia, and other deep gray matter). 'ciftiTools' provides a unified environment for reading, writing, visualizing and manipulating CIFTI-format data. It supports the "dscalar," "dlabel," and "dtseries" intents. Grayordinate data is read in as a "xifti" object, which is structured for convenient access to the data and metadata, and includes support for surface geometry files to enable spatially-dependent functionality such as static or interactive visualizations and smoothing.

Maintained by Amanda Mejia. Last updated 2 months ago.

1.5 match 47 stars 8.90 score 176 scripts 4 dependents

bioc

ChIPpeakAnno:Batch annotation of the peaks identified from either ChIP-seq, ChIP-chip experiments, or any experiments that result in large number of genomic interval data

The package encompasses a range of functions for identifying the closest gene, exon, miRNA, or custom features—such as highly conserved elements and user-supplied transcription factor binding sites. Additionally, users can retrieve sequences around the peaks and obtain enriched Gene Ontology (GO) or Pathway terms. In version 2.0.5 and beyond, new functionalities have been introduced. These include features for identifying peaks associated with bi-directional promoters along with summary statistics (peaksNearBDP), summarizing motif occurrences in peaks (summarizePatternInPeaks), and associating additional identifiers with annotated peaks or enrichedGO (addGeneIDs). The package integrates with various other packages such as biomaRt, IRanges, Biostrings, BSgenome, GO.db, multtest, and stat to enhance its analytical capabilities.

Maintained by Jianhong Ou. Last updated 2 months ago.

annotation chipseq chipchip

1.5 match 8.75 score 584 scripts 6 dependents

jacolien

itsadug:Interpreting Time Series and Autocorrelated Data Using GAMMs

GAMM (Generalized Additive Mixed Modeling; Lin & Zhang, 1999) as implemented in the R package 'mgcv' (Wood, S.N., 2006; 2011) is a nonlinear regression analysis which is particularly useful for time course data such as EEG, pupil dilation, gaze data (eye tracking), and articulography recordings, but also for behavioral data such as reaction times and response data. As time course measures are sensitive to autocorrelation problems, GAMMs implement methods to reduce these problems. This package includes functions for the evaluation of GAMM models (e.g., model comparisons, determining regions of significance, inspection of autocorrelational structure in residuals) and the interpretation of GAMMs (e.g., visualization of complex interactions, and contrasts).

Maintained by Jacolien van Rij. Last updated 3 years ago.

2.0 match 6.51 score 576 scripts 2 dependents

bioc

SPIAT:Spatial Image Analysis of Tissues

SPIAT (Spatial Image Analysis of Tissues) is an R package with a suite of data processing, quality control, visualization and data analysis tools. SPIAT is compatible with data generated from single-cell spatial proteomics platforms (e.g. OPAL, CODEX, MIBI, cellprofiler). SPIAT reads spatial data in the form of X and Y coordinates of cells, marker intensities and cell phenotypes. SPIAT includes six analysis modules that allow visualization, calculation of cell colocalization, categorization of the immune microenvironment relative to tumor areas, analysis of cellular neighborhoods, and the quantification of spatial heterogeneity, providing a comprehensive toolkit for spatial data analysis.

Maintained by Yuzhou Feng. Last updated 15 hours ago.

biomedicalinformatics cellbiology spatial clustering dataimport immunooncology qualitycontrol singlecell software visualization

1.5 match 22 stars 8.59 score 69 scripts

ropensci

lingtypology:Linguistic Typology and Mapping

Provides R with the Glottolog database <https://glottolog.org/> and additional tools for linguistic mapping. The Glottolog database contains the catalogue of languages of the world. This package helps researchers to make linguistic maps, using the philosophy of the Cross-Linguistic Linked Data project <https://clld.org/>, which facilitates uniform access to the data across publications. A tutorial for this package is available on GitHub pages <https://docs.ropensci.org/lingtypology/> and in the package vignette. Maps created by this package can be used for both research and linguistic teaching. In addition, the package provides the ability to download data from typological databases such as WALS, AUTOTYP and some others and to create your own database website.

Maintained by George Moroz. Last updated 5 months ago.

abvd afbo atlas autotype bivaltyp clld glottolog-database linguistic-maps linguistics phoible sails typology wals

1.3 match 51 stars 9.58 score 694 scripts

florianhartig

BayesianTools:General-Purpose MCMC and SMC Samplers and Tools for Bayesian Statistics

General-purpose MCMC and SMC samplers, as well as plots and diagnostic functions for Bayesian statistics, with a particular focus on calibrating complex system models. Implemented samplers include various Metropolis MCMC variants (including adaptive and/or delayed rejection MH), the T-walk, two differential evolution MCMCs, two DREAM MCMCs, and a sequential Monte Carlo (SMC) particle filter.

Maintained by Florian Hartig. Last updated 1 years ago.

bayes ecological-models mcmc optimization smc systems-biology cpp

1.3 match 122 stars 10.17 score 580 scripts 5 dependents

bioc

lefser:R implementation of the LEfSe method for microbiome biomarker discovery

lefser is the R implementation of the popular microbiome biomarker discovery tool, LEfSe. It uses the Kruskal-Wallis test, Wilcoxon rank-sum test, and Linear Discriminant Analysis to find biomarkers from two-level classes (and optional sub-classes).

Maintained by Sehyun Oh. Last updated 26 days ago.

software sequencing differentialexpression microbiome statisticalmethod classification bioconductor-package

1.5 match 55 stars 8.47 score 56 scripts

bioc

mpra:Analyze massively parallel reporter assays

Tools for data management, count preprocessing, and differential analysis in massively parallel reporter assays (MPRA).

Maintained by Leslie Myint. Last updated 5 months ago.

software generegulation sequencing functionalgenomics

2.0 match 6 stars 6.28 score 15 scripts

bioc

derfinder:Annotation-agnostic differential expression analysis of RNA-seq data at base-pair resolution via the DER Finder approach

This package provides functions for annotation-agnostic differential expression analysis of RNA-seq data. Two implementations of the DER Finder approach are included in this package: (1) single base-level F-statistics and (2) DER identification at the expressed regions-level. The DER Finder approach can also be used to identify differentially bound ChIP-seq peaks.

Maintained by Leonardo Collado-Torres. Last updated 3 months ago.

differentialexpression sequencing rnaseq chipseq differentialpeakcalling software immunooncology coverage annotation-agnostic bioconductor derfinder

1.3 match 42 stars 10.03 score 78 scripts 6 dependents

bioc

RAIDS:Accurate Inference of Genetic Ancestry from Cancer Sequences

This package implements specialized algorithms that enable genetic ancestry inference from various cancer sequence sources (RNA, Exome and Whole-Genome sequences). This package also implements a simulation algorithm that generates synthetic cancer-derived data. This code and analysis pipeline was designed and developed for the following publication: Belleau, P et al. Genetic Ancestry Inference from Cancer-Derived Molecular Data across Genomic and Transcriptomic Platforms. Cancer Res 1 January 2023; 83 (1): 49–58.

Maintained by Pascal Belleau. Last updated 5 months ago.

genetics software sequencing wholegenome principalcomponent geneticvariability dimensionreduction biocviews ancestry cancer-genomics exome-sequencing genomics inference r-language rna-seq rna-sequencing whole-genome-sequencing

2.0 match 5 stars 6.23 score 19 scripts

bioc

made4:Multivariate analysis of microarray data using ADE4

Multivariate data analysis and graphical display of microarray data. Functions include supervised dimension reduction (between group analysis) and joint dimension reduction of 2 datasets (coinertia analysis). It contains functions that require R package ade4.

Maintained by Aedin Culhane. Last updated 5 months ago.

clustering classification dimensionreduction principalcomponent transcriptomics multiplecomparison geneexpression sequencing microarray

2.0 match 6.11 score 107 scripts 2 dependents

robindenz1

adjustedCurves:Confounder-Adjusted Survival Curves and Cumulative Incidence Functions

Estimate and plot confounder-adjusted survival curves using either 'Direct Adjustment', 'Direct Adjustment with Pseudo-Values', various forms of 'Inverse Probability of Treatment Weighting', two forms of 'Augmented Inverse Probability of Treatment Weighting', 'Empirical Likelihood Estimation' or 'Targeted Maximum Likelihood Estimation'. Also includes a significance test for the difference between two adjusted survival curves and the calculation of adjusted restricted mean survival times. Additionally enables the user to estimate and plot cause-specific confounder-adjusted cumulative incidence functions in the competing risks setting using the same methods (with some exceptions). For details, see Denz et al. (2023) <doi:10.1002/sim.9681>.

Maintained by Robin Denz. Last updated 29 days ago.

adjusted confidence-intervals cumulative-incidence survival-curves

1.5 match 38 stars 8.12 score 93 scripts

skoval

RISmed:Download Content from NCBI Databases

A set of tools to extract bibliographic content from the National Center for Biotechnology Information (NCBI) databases, including PubMed. The name RISmed is a portmanteau of RIS (for Research Information Systems, a common tag format for bibliographic data) and PubMed.

Maintained by Stephanie Kovalchik. Last updated 3 years ago.

1.8 match 38 stars 6.94 score 252 scripts 3 dependents

anestistouloumis

SimCorMultRes:Simulates Correlated Multinomial Responses

Simulates correlated multinomial responses conditional on a marginal model specification.

Maintained by Anestis Touloumis. Last updated 12 months ago.

binary longitudinal-studies multinomial simulation

2.0 match 7 stars 6.04 score 26 scripts 2 dependents

kaihsianghu

iNEXT.4steps:Four-Step Biodiversity Analysis Based on 'iNEXT'

Expands 'iNEXT' to include the estimation of sample completeness and evenness. The package provides simple functions to perform the following four-step biodiversity analysis: STEP 1: Assessment of sample completeness profiles. STEP 2a: Analysis of size-based rarefaction and extrapolation sampling curves to determine whether the asymptotic diversity can be accurately estimated. STEP 2b: Comparison of the observed and the estimated asymptotic diversity profiles. STEP 3: Analysis of non-asymptotic coverage-based rarefaction and extrapolation sampling curves. STEP 4: Assessment of evenness profiles. The analyses in STEPs 2a, 2b and 3 are mainly based on the previous 'iNEXT' package. Refer to the 'iNEXT' package for details. This package focuses mainly on the computation for STEPs 1 and 4. See Chao et al. (2020) <doi:10.1111/1440-1703.12102> for statistical background.

Maintained by Anne Chao. Last updated 9 months ago.

2.0 match 4 stars 6.00 score 8 scripts

molinlab

Holomics:A User-Friendly R 'shiny' Application for Multi-Omics Data Integration and Analysis

A 'shiny' application that allows you to perform single- and multi-omics analyses using your own omics datasets. After the upload of the omics datasets and a metadata file, single-omics analysis is performed for feature selection and dataset reduction. These datasets are used for pairwise- and multi-omics analyses, where automatic tuning is done to identify correlations between the datasets - the end goal of the recommended 'Holomics' workflow. Methods used in the package were implemented in the package 'mixOmics' by Florian Rohart, Benoît Gautier, Amrit Singh, and Kim-Anh Lê Cao (2017) <doi:10.1371/journal.pcbi.1005752> and are described there in further detail.

Maintained by Katharina Munk. Last updated 9 months ago.

2.2 match 7 stars 5.45 score 7 scripts

bioc

Rcpi:Molecular Informatics Toolkit for Compound-Protein Interaction in Drug Discovery

A molecular informatics toolkit with an integration of bioinformatics and chemoinformatics tools for drug discovery.

Maintained by Nan Xiao. Last updated 5 months ago.

software dataimport datarepresentation featureextraction cheminformatics biomedicalinformatics proteomics go systemsbiology bioconductor bioinformatics drug-discovery feature-extraction fingerprint molecular-descriptors protein-sequences

1.5 match 37 stars 7.81 score 29 scripts

bioc

biocthis:Automate package and project setup for Bioconductor packages

This package expands the usethis package with the goal of helping automate the process of creating R packages for Bioconductor or making them Bioconductor-friendly.

Maintained by Leonardo Collado-Torres. Last updated 3 months ago.

software reportwriting actions bioconductor biocthis github styler usethis

1.5 match 51 stars 7.78 score 4 scripts 1 dependents

edonnachie

ICD10gm:Metadata Processing for the German Modification of the ICD-10 Coding System

Provides convenient access to the German modification of the International Classification of Diseases, 10th revision (ICD-10-GM). It provides functionality to aid in the identification, specification and historisation of ICD-10 codes. Its intended use is the analysis of routinely collected data in the context of epidemiology, medical research and health services research. The underlying metadata are released by the German Institute for Medical Documentation and Information <https://www.dimdi.de>, and are redistributed in accordance with their license.

Maintained by Ewan Donnachie. Last updated 1 years ago.

bfarm charlson comorbidities diagnoses dimdi icd-10 metadata routinedaten versorgungsforschung

2.2 match 10 stars 5.30 score 20 scripts

alexpkeil1

qgcomp:Quantile G-Computation

G-computation for a set of time-fixed exposures with quantile-based basis functions, possibly under linearity and homogeneity assumptions. This approach estimates a regression line corresponding to the expected change in the outcome (on the link basis) given a simultaneous increase in the quantile-based category for all exposures. Works with continuous, binary, and right-censored time-to-event outcomes. Reference: Alexander P. Keil, Jessie P. Buckley, Katie M. O'Brien, Kelly K. Ferguson, Shanshan Zhao, and Alexandra J. White (2019) A quantile-based g-computation approach to addressing the effects of exposure mixtures; <doi:10.1289/EHP5838>.

Maintained by Alexander Keil. Last updated 4 days ago.

exposure exposure-mixture exposure-mixtures quantile-gcomputation survival

1.3 match 37 stars 8.73 score 70 scripts 2 dependents

bradduthie

resevol:Simulate Agricultural Production and Evolution of Pesticide Resistance

Simulates individual-based models of agricultural pest management and the evolution of pesticide resistance. Management occurs on a spatially explicit landscape that is divided into an arbitrary number of farms that can grow one of up to 10 crops and apply one of up to 10 pesticides. Pest genomes are modelled in a way that allows for any number of pest traits with an arbitrary covariance structure that is constructed using an evolutionary algorithm in the mine_gmatrix() function. Simulations are then run using the run_farm_sim() function. This package thereby allows for highly mechanistic social-ecological models of the evolution of pesticide resistance under different types of crop rotation and pesticide application regimes.

Maintained by A. Bradley Duthie. Last updated 1 years ago.

2.5 match 3 stars 4.65 score 1 scripts

bioc

rrvgo:Reduce + Visualize GO

Reduce and visualize lists of Gene Ontology terms by identifying redundancy based on semantic similarity.

Maintained by Sergi Sayols. Last updated 5 months ago.

annotation clustering go network pathways software

1.5 match 24 stars 7.74 score 190 scripts

azizka

conserveR:Identifying Conservation Prioritization Methods Based on Data Availability

Helps biologists choose the most suitable approach to link their research to conservation. After answering a few questions on the data available and the geographic and taxonomic scope, 'conserveR' ranks existing methods for conservation prioritization and systematic conservation planning by suitability. The methods database of 'conserveR' contains 133 methods for conservation prioritization based on a systematic review of > 12,000 scientific publications from the fields of spatial conservation prioritization, systematic conservation planning, biogeography and ecology.

Maintained by Alexander Zizka. Last updated 4 years ago.

3.2 match 8 stars 3.60 score

lifewatch

sdmpredictors:Species Distribution Modelling Predictor Datasets

Terrestrial and marine predictors for species distribution modelling from multiple sources, including WorldClim <https://www.worldclim.org/>, ENVIREM <https://envirem.github.io/>, Bio-ORACLE <https://bio-oracle.org/> and MARSPEC <http://www.marspec.org/>.

Maintained by Salvador Fernandez. Last updated 2 years ago.

bio-oracle lifewatch lifewatchvliz species-distribution-modelling

1.5 match 30 stars 7.47 score 218 scripts

mw201608

SuperExactTest:Exact Test and Visualization of Multi-Set Intersections

Identification of sets of objects with shared features is a common operation in all disciplines. Analysis of intersections among multiple sets is fundamental for in-depth understanding of their complex relationships. This package implements a theoretical framework for efficient computation of statistical distributions of multi-set intersections based upon combinatorial theory, and provides multiple scalable techniques for visualizing the intersection statistics. The statistical algorithm behind this package was published in Wang et al. (2015) <doi:10.1038/srep16923>.

Maintained by Minghui Wang. Last updated 1 years ago.

intersection set statistics visualization

1.5 match 28 stars 7.47 score 70 scripts 1 dependents

bioc

MOSim:Multi-Omics Simulation (MOSim)

The MOSim package simulates multi-omic experiments that mimic regulatory mechanisms within the cell, allowing flexible experimental design including time course and multiple groups.

Maintained by Sonia Tarazona. Last updated 5 months ago.

software timecourse experimentaldesign rnaseq cpp

1.5 match 9 stars 7.46 score 11 scripts

pepijn-devries

ECOTOXr:Download and Extract Data from US EPA's ECOTOX Database

The US EPA ECOTOX database is a freely available database with a treasure trove of aquatic and terrestrial ecotoxicological data. As the online search interface doesn't come with an API, this package provides the means to easily access and search the database in R. To this end, all raw tables are downloaded from the EPA website and stored in a local SQLite database <doi:10.1016/j.chemosphere.2024.143078>.

Maintained by Pepijn de Vries. Last updated 5 days ago.

1.8 match 10 stars 6.20 score 6 scripts

amilkey1

lorad:Lowest Radial Distance Method of Marginal Likelihood Estimation

Estimates marginal likelihood from a posterior sample using the method described in Wang et al. (2023) <doi:10.1093/sysbio/syad007>, which does not require evaluation of any additional points and requires only the log of the unnormalized posterior density for each sampled parameter vector.

Maintained by Analisa Milkey. Last updated 1 years ago.

4.5 match 2.48 score 5 scripts

bioc

ELMER:Inferring Regulatory Element Landscapes and Transcription Factor Networks Using Cancer Methylomes

ELMER is designed to use DNA methylation and gene expression data from a large number of samples to infer the regulatory element landscape and transcription factor network in primary tissue.

Maintained by Tiago Chedraoui Silva. Last updated 5 months ago.

dnamethylation geneexpression motifannotation software generegulation transcription network

1.5 match 7.42 score 176 scripts

nataliepatten

gatoRs:Geographic and Taxonomic Occurrence R-Based Scrubbing

Streamlines downloading and cleaning biodiversity data from Integrated Digitized Biocollections (iDigBio) and the Global Biodiversity Information Facility (GBIF).

Maintained by Natalie N. Patten. Last updated 10 months ago.

1.8 match 11 stars 6.16 score 66 scripts

bioc

enrichViewNet:From functional enrichment results to biological networks

This package enables the visualization of functional enrichment results as network graphs. First, the package enables the visualization of enrichment results, in a format corresponding to the one generated by gprofiler2, as a customizable Cytoscape network. In those networks, both gene datasets (GO terms/pathways/protein complexes) and the genes associated with the datasets are represented as nodes, while the edges connect each gene to its dataset(s). The package also provides the option to create enrichment maps from functional enrichment results. Enrichment maps enable the visualization of enriched terms as a network with edges connecting overlapping genes.

Maintained by Astrid Deschênes. Last updated 5 months ago.

biologicalquestion software network networkenrichment go cystocape functional-enrichment

2.0 match 5 stars 5.54 score 6 scripts

fhdsl

metricminer:Mine Metrics from Common Places on the Web

Mine metrics on common places on the web through the power of their APIs (application programming interfaces). It also helps put the data into a format that is easily used for a dashboard or other purposes. An associated dashboard template and tutorials, currently under development, help you fully utilize 'metricminer'.

Maintained by Candace Savonen. Last updated 3 days ago.

edtech-software

1.8 match 2 stars 6.13 score 21 scripts

pepijn-devries

CopernicusMarine:Search Download and Handle Data from Copernicus Marine Service Information

Subset and download data from EU Copernicus Marine Service Information: <https://data.marine.copernicus.eu>. Import data on the ocean's physical and biogeochemical state from Copernicus into R without the need for external software.

Maintained by Pepijn de Vries. Last updated 3 months ago.

data spatial

1.9 match 25 stars 5.88 score 20 scripts 2 dependents

doi-usgs

toxEval:Exploring Biological Relevance of Environmental Chemistry Observations

Data analysis package for estimating potential biological effects from chemical concentrations in environmental samples. Included are a set of functions to analyze, visualize, and organize measured concentration data as it relates to user-selected chemical-biological interaction benchmark data such as water quality criteria. The intent of these analyses is to develop a better understanding of the potential biological relevance of environmental chemistry data. Results can be used to prioritize which chemicals at which sites may be of greatest concern. These methods are meant to be used as a screening technique to predict potential for biological influence from chemicals that ultimately need to be validated with direct biological assays. A description of the analysis can be found in Blackwell (2017) <doi:10.1021/acs.est.7b01613>.

Maintained by Laura DeCicco. Last updated 3 months ago.

toxicity water-quality

1.5 match 21 stars 7.34 score 58 scripts

bioc

QuasR:Quantify and Annotate Short Reads in R

This package provides a framework for the quantification and analysis of Short Reads. It covers a complete workflow starting from raw sequence reads, over creation of alignments and quality control plots, to the quantification of genomic regions of interest. Read alignments are either generated through Rbowtie (data from DNA/ChIP/ATAC/Bis-seq experiments) or Rhisat2 (data from RNA-seq experiments that require spliced alignments), or can be provided in the form of bam files.

Maintained by Michael Stadler. Last updated 24 days ago.

genetics preprocessing sequencing chipseq rnaseq methylseq coverage alignment qualitycontrol immunooncology curl bzip2 xz-utils zlib cpp

1.3 match 6 stars 8.70 score 79 scripts 1 dependents

bioc

CRISPRseek:Design of guide RNAs in CRISPR genome-editing systems

The package encompasses functions to find potential guide RNAs for the CRISPR-based genome-editing systems including the Base Editors and the Prime Editors when supplied with target sequences as input. Users have the flexibility to filter resulting guide RNAs based on parameters such as the absence of restriction enzyme cut sites or the lack of paired guide RNAs. The package also facilitates genome-wide exploration for off-targets, offering features to score and rank off-targets, retrieve flanking sequences, and indicate whether the hits are located within exon regions. All detected guide RNAs are annotated with the cumulative scores of the top5 and topN off-targets together with detailed information such as mismatch sites and restriction enzyme cut sites. The package also outputs INDELs and their frequencies for Cas9 targeted sites.

Maintained by Lihua Julie Zhu. Last updated 6 days ago.

immunooncology generegulation sequencematching crispr

1.5 match 7.18 score 51 scripts 2 dependents

bioc

isomiRs:Analyze isomiRs and miRNAs from small RNA-seq

Characterization of miRNAs and isomiRs, clustering and differential expression.

Maintained by Lorena Pantano. Last updated 5 months ago.

mirna rnaseq differentialexpression clustering immunooncology analyze-isomirs bioconductor isomirs

1.5 match 8 stars 7.09 score 43 scripts

bioc

GenomicSuperSignature:Interpretation of RNA-seq experiments through robust, efficient comparison to public databases

This package provides a novel method for interpreting new transcriptomic datasets through near-instantaneous comparison to public archives without high-performance computing requirements. Through the pre-computed index, users can identify public resources associated with their dataset such as gene sets, MeSH term, and publication. Functions to identify interpretable annotations and intuitive visualization options are implemented in this package.

Maintained by Sehyun Oh. Last updated 5 months ago.

transcriptomics systemsbiology principalcomponent rnaseq sequencing pathways clustering bioconductor-package exploratory-data-analysis gsea mesh principal-component-analysis rna-sequencing-profiles transferlearning

1.5 match 16 stars 6.97 score 59 scripts

patzaw

BED:Biological Entity Dictionary (BED)

An interface for the 'Neo4j' database providing mapping between different identifiers of biological entities. This Biological Entity Dictionary (BED) has been developed to address three main challenges. The first one is related to the completeness of identifier mappings. Indeed, direct mapping information provided by the different systems is not always complete and can be enriched by mappings provided by other resources. More interestingly, direct mappings not identified by any of these resources can be indirectly inferred by using mappings to a third reference. For example, many human Ensembl gene IDs are not directly mapped to any Entrez gene ID but such mappings can be inferred using respective mappings to HGNC ID. The second challenge is related to the mapping of deprecated identifiers. Indeed, entity identifiers can change from one resource release to another. The identifier history is provided by some resources, such as Ensembl or the NCBI, but it is generally not used by mapping tools. The third challenge is related to the automation of the mapping process according to the relationships between the biological entities of interest. Indeed, mapping between gene and protein ID scopes should not be done the same way as between two scopes regarding gene ID. Also, converting identifiers from different organisms should be possible using gene ortholog information. The method has been published by Godard and van Eyll (2018) <doi:10.12688/f1000research.13925.3>.

Maintained by Patrice Godard. Last updated 3 months ago.

1.5 match 8 stars 6.85 score 25 scripts
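
The indirect inference described in the BED entry above (joining two identifier scopes through a shared third reference) can be illustrated with a minimal Python sketch. This is not the BED package's API or its 'Neo4j' data model; the function name `infer_indirect_mappings` and all identifiers below are hypothetical placeholders chosen for illustration.

```python
def infer_indirect_mappings(source_to_ref, target_to_ref):
    """Map source IDs to target IDs through a shared reference ID.

    Both arguments are dicts from an identifier to its reference ID
    (for example, a gene ID to an HGNC-style ID). Assumption: each
    identifier has at most one reference ID.
    """
    # Invert the target side: reference ID -> set of target IDs.
    ref_to_target = {}
    for target_id, ref_id in target_to_ref.items():
        ref_to_target.setdefault(ref_id, set()).add(target_id)

    # Join the source side on the reference ID.
    inferred = {}
    for source_id, ref_id in source_to_ref.items():
        targets = ref_to_target.get(ref_id)
        if targets:
            inferred[source_id] = set(targets)
    return inferred


# Hypothetical placeholder identifiers, not real database records.
ensembl_to_hgnc = {"ENSG_A": "HGNC:1", "ENSG_B": "HGNC:2", "ENSG_C": "HGNC:3"}
entrez_to_hgnc = {"101": "HGNC:1", "202": "HGNC:2", "203": "HGNC:2"}

mappings = infer_indirect_mappings(ensembl_to_hgnc, entrez_to_hgnc)
print(mappings)
```

Source identifiers whose reference ID has no counterpart on the target side (here `ENSG_C`) are simply left unmapped, which mirrors the incompleteness the entry describes.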

bioc

Rbowtie:R bowtie wrapper

This package provides an R wrapper around the popular bowtie short read aligner and around SpliceMap, a de novo splice junction discovery and alignment tool. The package is used by the QuasR bioconductor package. We recommend using the QuasR package instead of using Rbowtie directly.

Maintained by Michael Stadler. Last updated 2 months ago.

sequencing alignment

1.5 match 1 stars 6.80 score 22 scripts 8 dependents

bioc

dupRadar:Assessment of duplication rates in RNA-Seq datasets

Duplication rate quality control for RNA-Seq datasets.

Maintained by Sergi Sayols. Last updated 5 months ago.

technology sequencing rnaseq qualitycontrol immunooncology

1.5 match 2 stars 6.78 score 60 scripts

bioc

CNVMetrics:Copy Number Variant Metrics

The CNVMetrics package calculates similarity metrics to facilitate copy number variant comparison among samples and/or methods. Similarity metrics can be employed to compare CNV profiles of genetically unrelated samples as well as those with a common genetic background. Some metrics are based on the shared amplified/deleted regions while other metrics rely on the level of amplification/deletion. The data type used as input is a plain text file containing the genomic position of the copy number variations, as well as the status and/or the log2 ratio values. Finally, a visualization tool is provided to explore resulting metrics.

Maintained by Astrid Deschênes. Last updated 5 months ago.

biologicalquestion software copynumbervariation cnv copy-number-variation metrics r-language

2.0 match 4 stars 5.08 score 8 scripts

smbc-nzp

MigConnectivity:Estimate Migratory Connectivity for Migratory Animals

Allows the user to estimate transition probabilities for migratory animals between any two phases of the annual cycle, using a variety of different data types. Also quantifies the strength of migratory connectivity (MC), a standardized metric to quantify the extent to which populations co-occur between two phases of the annual cycle. Includes functions to estimate MC and the more traditional metric of migratory connectivity strength (Mantel correlation) incorporating uncertainty from multiple sources of sampling error. For cross-species comparisons, methods are provided to estimate differences in migratory connectivity strength, incorporating uncertainty. See Cohen et al. (2018) <doi:10.1111/2041-210X.12916>, Cohen et al. (2019) <doi:10.1111/ecog.03974>, and Roberts et al. (2023) <doi:10.1002/eap.2788> for details on some of these methods.

Maintained by Jeffrey A. Hostetler. Last updated 12 months ago.

jags cpp

1.5 match 8 stars 6.77 score 41 scripts

maarten14c

rbacon:Age-Depth Modelling using Bayesian Statistics

An approach to age-depth modelling that uses Bayesian statistics to reconstruct accumulation histories for deposits, through combining radiocarbon and other dates with prior information on accumulation rates and their variability. See Blaauw & Christen (2011).

Maintained by Maarten Blaauw. Last updated 26 days ago.

age-depth-model bayesian holocene lakes ocean-sediments peat radiocarbon-calibration cpp

1.5 match 7 stars 6.75 score 57 scripts 1 dependents

bioc

CoGAPS:Coordinated Gene Activity in Pattern Sets

Coordinated Gene Activity in Pattern Sets (CoGAPS) implements a Bayesian MCMC matrix factorization algorithm, GAPS, and links it to gene set statistic methods to infer biological process activity. It can be used to perform sparse matrix factorization on any data, and when this data represents biomolecules, to do gene set analysis.

Maintained by Elana J. Fertig. Last updated 5 months ago.

geneexpression transcription genesetenrichment differentialexpression bayesian clustering timecourse rnaseq microarray multiplecomparison dimensionreduction immunooncology cpp

1.5 match 6.72 score 104 scripts

bioc

recount3:Explore and download data from the recount3 project

The recount3 package enables access to a large amount of uniformly processed RNA-seq data from human and mouse. You can download RangedSummarizedExperiment objects at the gene, exon or exon-exon junctions level with sample metadata and QC statistics. In addition we provide access to sample coverage BigWig files.

Maintained by Leonardo Collado-Torres. Last updated 3 months ago.

coverage differentialexpression geneexpression rnaseq sequencing software dataimport annotation-agnostic bioconductor count derfinder exon gene human illumina junction mouse recount recount3

1.3 match 33 stars 8.03 score 216 scripts

neon-biodiversity

Ostats:O-Stats, or Pairwise Community-Level Niche Overlap Statistics

O-statistics, or overlap statistics, measure the degree of community-level trait overlap. They are estimated by fitting nonparametric kernel density functions to each species’ trait distribution and calculating their areas of overlap. For instance, the median pairwise overlap for a community is calculated by first determining the overlap of each species pair in trait space, and then taking the median overlap of each species pair in a community. This median overlap value is called the O-statistic (O for overlap). The Ostats() function calculates separate univariate overlap statistics for each trait, while the Ostats_multivariate() function calculates a single multivariate overlap statistic for all traits. O-statistics can be evaluated against null models to obtain standardized effect sizes. 'Ostats' is part of the collaborative Macrosystems Biodiversity Project "Local- to continental-scale drivers of biodiversity across the National Ecological Observatory Network (NEON)." For more information on this project, see the Macrosystems Biodiversity Website (<https://neon-biodiversity.github.io/>). Calculation of O-statistics is described in Read et al. (2018) <doi:10.1111/ecog.03641>, and a teaching module for introducing the underlying biological concepts at an undergraduate level is described in Grady et al. (2018) <http://tiee.esa.org/vol/v14/issues/figure_sets/grady/abstract.html>.

Maintained by Quentin D. Read. Last updated 4 months ago.

ecology

1.5 match 7 stars 6.69 score 28 scripts
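
The median pairwise overlap described in the Ostats entry above can be sketched in plain Python. This is an illustration of the idea only, not the Ostats package's implementation: the Gaussian kernel, the rule-of-thumb bandwidth, the trapezoid integration, and every function name here are assumptions made for this sketch.

```python
import math
from itertools import combinations
from statistics import median, stdev


def rule_of_thumb_bandwidth(values):
    """Normal-reference bandwidth (assumed choice; needs >= 2 distinct values)."""
    return 1.06 * stdev(values) * len(values) ** (-1 / 5)


def gaussian_kde(values, bandwidth):
    """Return a function evaluating a Gaussian kernel density estimate."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density


def overlap_area(a, b, grid_points=2000):
    """Area under the pointwise minimum of two estimated densities."""
    ha, hb = rule_of_thumb_bandwidth(a), rule_of_thumb_bandwidth(b)
    fa, fb = gaussian_kde(a, ha), gaussian_kde(b, hb)
    pad = 4 * max(ha, hb)
    lo = min(min(a), min(b)) - pad
    hi = max(max(a), max(b)) + pad
    step = (hi - lo) / (grid_points - 1)
    ys = [min(fa(lo + i * step), fb(lo + i * step)) for i in range(grid_points)]
    # Trapezoid rule on an evenly spaced grid.
    return step * (sum(ys) - 0.5 * (ys[0] + ys[-1]))


def median_pairwise_overlap(traits_by_species):
    """Median of the overlap areas over all species pairs in one community."""
    names = sorted(traits_by_species)
    return median(
        overlap_area(traits_by_species[s], traits_by_species[t])
        for s, t in combinations(names, 2)
    )


# Made-up trait measurements for three hypothetical species.
community = {
    "sp1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "sp2": [2.0, 3.0, 4.0, 5.0, 6.0],
    "sp3": [101.0, 102.0, 103.0, 104.0, 105.0],
}
print(median_pairwise_overlap(community))
```

An overlap near 1 means two species occupy almost the same region of trait space, and an overlap near 0 means they are well separated.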

bioc

megadepth:megadepth: BigWig and BAM related utilities

This package provides an R interface to Megadepth by Christopher Wilks available at https://github.com/ChristopherWilks/megadepth. It is particularly useful for computing the coverage of a set of genomic regions across bigWig or BAM files. With this package, you can build base-pair coverage matrices for regions or annotations of your choice from BigWig files. Megadepth was used to create the raw files provided by https://bioconductor.org/packages/recount3.

Maintained by David Zhang. Last updated 3 months ago.

software coverage dataimport transcriptomics rnaseq preprocessing bam bigwig daspter megadepth recount2 recount3

1.5 match 12 stars 6.69 score 7 scripts 3 dependents

bioc

proActiv:Estimate Promoter Activity from RNA-Seq data

Most human genes have multiple promoters that control the expression of different isoforms. The use of these alternative promoters enables the regulation of isoform expression pre-transcriptionally. Alternative promoters have been found to be important in a wide number of cell types and diseases. proActiv is an R package that enables the analysis of promoters from RNA-seq data. proActiv uses aligned reads as input, and generates counts and normalized promoter activity estimates for each annotated promoter. In particular, proActiv accepts junction files from TopHat2 or STAR or BAM files as inputs. These estimates can then be used to identify which promoter is active, which promoter is inactive, and which promoters change their activity across conditions. proActiv also allows visualization of promoter activity across conditions.

Maintained by Joseph Lee. Last updated 5 months ago.

rnaseq geneexpression transcription alternativesplicing generegulation differentialsplicing functionalgenomics epigenetics transcriptomics preprocessing alternative-promoters genomics promoter-activity promoter-annotation rna-seq-data

1.5 match 51 stars 6.66 score 15 scripts

bioc

MultiBaC:Multiomic Batch effect Correction

MultiBaC is a strategy to correct batch effects from multiomic datasets distributed across different labs or data acquisition events. MultiBaC is the first batch effect correction algorithm that deals with batch effects in multiomics datasets. MultiBaC is able to remove batch effects across different omics generated within separate batches provided that at least one common omic data type is included in all the batches considered.

Maintained by The package maintainer. Last updated 5 months ago.

software statisticalmethod principalcomponent datarepresentation geneexpression transcription batcheffect

3.0 match 3.30 score 7 scripts

eltebioinformatics

mulea:Enrichment Analysis Using Multiple Ontologies and False Discovery Rate

Background - Traditional gene set enrichment analyses are typically limited to a few ontologies and do not account for the interdependence of gene sets or terms, resulting in overcorrected p-values. To address these challenges, we introduce mulea, an R package offering comprehensive overrepresentation and functional enrichment analysis. Results - mulea employs a progressive empirical false discovery rate (eFDR) method, specifically designed for interconnected biological data, to accurately identify significant terms within diverse ontologies. mulea expands beyond traditional tools by incorporating a wide range of ontologies, encompassing Gene Ontology, pathways, regulatory elements, genomic locations, and protein domains. This flexibility enables researchers to tailor enrichment analysis to their specific questions, such as identifying enriched transcriptional regulators in gene expression data or overrepresented protein domains in protein sets. To facilitate seamless analysis, mulea provides gene sets (in standardised GMT format) for 27 model organisms, covering 22 ontology types from 16 databases and various identifiers resulting in almost 900 files. Additionally, the muleaData ExperimentData Bioconductor package simplifies access to these pre-defined ontologies. Finally, mulea's architecture allows for easy integration of user-defined ontologies, or GMT files from external sources (e.g., MSigDB or Enrichr), expanding its applicability across diverse research areas. Conclusions - mulea is distributed as a CRAN R package. It offers researchers a powerful and flexible toolkit for functional enrichment analysis, addressing limitations of traditional tools with its progressive eFDR and by supporting a variety of ontologies. Overall, mulea fosters the exploration of diverse biological questions across various model organisms.

Maintained by Tamas Stirling. Last updated 3 months ago.

annotation differentialexpression geneexpression genesetenrichment go graphandnetwork multiplecomparison pathways reactome software transcription visualization enrichment enrichment-analysis functional-enrichment-analysis gene-set-enrichment ontologies transcriptomics cpp

1.3 match 28 stars 7.36 score 34 scripts

lakshay-anand

chromoMap:Interactive Genomic Visualization of Biological Data

Provides interactive, configurable and elegant graphics visualization of the chromosomes or chromosome regions of any living organism allowing users to map chromosome elements (like genes, SNPs etc.) on the chromosome plot. It introduces a special plot viz. the "chromosome heatmap" that, in addition to mapping elements, can visualize the data associated with chromosome elements (like gene expression) in the form of heat colors which can be highly advantageous in the scientific interpretations and research work. Because of the large size of the chromosomes, it is impractical to visualize each element on the same plot. However, the plot provides a magnified view for each chromosome locus to render additional information and visualization specific for that location. You can map thousands of genes and can view all mappings easily. Users can investigate the detailed information about the mappings (like gene names or total genes mapped on a location) or can view the magnified single or double stranded view of the chromosome at a location showing each mapped element in sequential order. The package provides multiple features like visualizing multiple sets, chromosome heat-maps, group annotations, adding hyperlinks, and labelling. The plots can be saved as HTML documents that can be customized and shared easily. In addition, you can include them in R Markdown or in R 'Shiny' applications.

Maintained by Lakshay Anand. Last updated 3 years ago.

2.2 match 9 stars 4.46 score 80 scripts

bioc

ChAMP:Chip Analysis Methylation Pipeline for Illumina HumanMethylation450 and EPIC

The package includes quality control metrics, a selection of normalization methods and novel methods to identify differentially methylated regions and to highlight copy number alterations.

Maintained by Yuan Tian. Last updated 5 months ago.

microarray methylationarray normalization twochannel copynumber dnamethylation

1.5 match 6.54 score 278 scripts

bioc

wateRmelon:Illumina DNA methylation array normalization and metrics

15 flavours of betas and three performance metrics, with methods for objects produced by methylumi and minfi packages.

Maintained by Leo C Schalkwyk. Last updated 4 months ago.

dnamethylation microarray twochannel preprocessing qualitycontrol

1.3 match 7.75 score 247 scripts 2 dependents

anestistouloumis

ShrinkCovMat:Shrinkage Covariance Matrix Estimators

Provides nonparametric Steinian shrinkage estimators of the covariance matrix that are suitable in high dimensional settings, that is when the number of variables is larger than the sample size.

Maintained by Anestis Touloumis. Last updated 2 years ago.

covariance-matrix shrinkage-estimators openblas cpp openmp

2.0 match 8 stars 4.83 score 17 scripts

bioc

evaluomeR:Evaluation of Bioinformatics Metrics

Evaluating the reliability of your own metrics and the measurements done on your own datasets by analysing the stability and goodness of the classifications of such metrics.

Maintained by José Antonio Bernabé-Díaz. Last updated 5 months ago.

clustering classification featureextraction assessment clustering-evaluation evaluome evaluomer metrics

2.0 match 4.82 score 33 scripts

bioc

HDTD:Statistical Inference about the Mean Matrix and the Covariance Matrices in High-Dimensional Transposable Data (HDTD)

Characterization of intra-individual variability using physiologically relevant measurements provides important insights into fundamental biological questions ranging from cell type identity to tumor development. For each individual, the data measurements can be written as a matrix with the different subsamples of the individual recorded in the columns and the different phenotypic units recorded in the rows. Datasets of this type are called high-dimensional transposable data. The HDTD package provides functions for conducting statistical inference for the mean relationship between the row and column variables and for the covariance structure within and between the row and column variables.

Maintained by Anestis Touloumis. Last updated 5 months ago.

differentialexpression genetics geneexpression microarray sequencing statisticalmethod software bioconductor-package high-dimensional statistics openblas cpp openmp

2.0 match 1 stars 4.78 score

tkcaccia

KODAMA:Knowledge Discovery by Accuracy Maximization

An unsupervised and semi-supervised learning algorithm that performs feature extraction from noisy and high-dimensional data. It facilitates identification of patterns representing underlying groups on all samples in a data set. Based on Cacciatore S, Tenori L, Luchinat C, Bennett PR, MacIntyre DA. (2017) Bioinformatics <doi:10.1093/bioinformatics/btw705> and Cacciatore S, Luchinat C, Tenori L. (2014) Proc Natl Acad Sci USA <doi:10.1073/pnas.1220873111>.

Maintained by Stefano Cacciatore. Last updated 14 hours ago.

openblas cpp

1.3 match 1 stars 7.00 score 63 scripts 1 dependents

schweflo

pandocfilters:Pandoc Filters for R

The document converter 'pandoc' <https://pandoc.org/> is widely used in the R community. One feature of 'pandoc' is that it can produce and consume JSON-formatted abstract syntax trees (AST). This makes it possible to transform a given source document into a JSON-formatted AST, alter it with so-called filters and pass the altered JSON-formatted AST back to 'pandoc'. This package provides functions for writing such filters in native R code. Although this package is inspired by the Python package 'pandocfilters' <https://github.com/jgm/pandocfilters/>, it provides additional convenience functions which make it simple to use the 'pandocfilters' package as a report generator. Since 'pandocfilters' inherits most of its functionality from 'pandoc' it can create documents in many formats (for more information see <https://pandoc.org/>) but is also bound to the same limitations as 'pandoc'.

Maintained by Florian Schwendinger. Last updated 3 years ago.

3.3 match 2.80 score 63 scripts
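
The filter mechanism described in the pandocfilters entry above (take the JSON-formatted AST, alter it, hand it back) can be sketched with nothing but Python's standard `json` module. This does not use the R package or the Python 'pandocfilters' package; the helper names `walk` and `upper_case` and the tiny hand-written document are assumptions made for illustration.

```python
import json


def walk(node, action):
    """Recursively rebuild a JSON AST, letting `action` replace typed nodes.

    `action(node_type, content)` returns a replacement node, or None to
    keep the node and continue walking into its children.
    """
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            replacement = action(node["t"], node.get("c"))
            if replacement is not None:
                return replacement
        return {key: walk(value, action) for key, value in node.items()}
    return node


def upper_case(node_type, content):
    """Example action: upper-case the text of every Str node."""
    if node_type == "Str":
        return {"t": "Str", "c": content.upper()}
    return None


# A small hand-written document in pandoc's JSON AST layout.
doc = json.loads(
    '{"pandoc-api-version": [1, 23], "meta": {}, "blocks": ['
    '{"t": "Para", "c": [{"t": "Str", "c": "hello"}, {"t": "Space"},'
    ' {"t": "Str", "c": "world"}]}]}'
)

filtered = walk(doc, upper_case)
print(json.dumps(filtered))
```

In real use a filter would read the AST from standard input and write the altered AST to standard output, sitting between `pandoc -t json` and `pandoc -f json`.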

bioc

xCell2:A Tool for Generic Cell Type Enrichment Analysis

xCell2 provides methods for cell type enrichment analysis using cell type signatures. It includes three main functions - 1. xCell2Train for training custom reference objects from bulk or single-cell RNA-seq datasets. 2. xCell2Analysis for conducting the cell type enrichment analysis using the custom reference. 3. xCell2GetLineage for identifying dependencies between different cell types using ontology.

Maintained by Almog Angel. Last updated 16 hours ago.

geneexpression transcriptomics microarray rnaseq singlecell differentialexpression immunooncology genesetenrichment

1.5 match 6 stars 6.16 score 15 scripts

bioc

methylInheritance:Permutation-Based Analysis associating Conserved Differentially Methylated Elements Across Multiple Generations to a Treatment Effect

Permutation analysis, based on Monte Carlo sampling, for testing the hypothesis that the number of conserved differentially methylated elements, between several generations, is associated with an effect inherited from a treatment and that a stochastic effect can be dismissed.

Maintained by Astrid Deschênes. Last updated 5 months ago.

biologicalquestion epigenetics dnamethylation differentialmethylation methylseq software immunooncology statisticalmethod wholegenome sequencing analysis bioconductor bioinformatics cpg differentially-methylated-elements inheritance monte-carlo-sampling permutation

2.0 match 4.60 score 1 scripts

bioc

methInheritSim:Simulating Whole-Genome Inherited Bisulphite Sequencing Data

Simulate a multigeneration methylation case versus control experiment with inheritance relation using a real control dataset.

Maintained by Pascal Belleau. Last updated 5 months ago.

biologicalquestion epigenetics dnamethylation differentialmethylation methylseq software immunooncology statisticalmethod wholegenome sequencing bisulphite-sequencing inheritance methylation simulation

2.0 match 1 stars 4.60 score 1 scripts

lcrawlab

mvMAPIT:Multivariate Genome Wide Marginal Epistasis Test

Epistasis, commonly defined as the interaction between genetic loci, is known to play an important role in the phenotypic variation of complex traits. As a result, many statistical methods have been developed to identify genetic variants that are involved in epistasis, and nearly all of these approaches carry out this task by focusing on analyzing one trait at a time. Previous studies have shown that jointly modeling multiple phenotypes can often dramatically increase statistical power for association mapping. In this package, we present the 'multivariate MArginal ePIstasis Test' ('mvMAPIT') – a multi-outcome generalization of a recently proposed epistatic detection method which seeks to detect marginal epistasis or the combined pairwise interaction effects between a given variant and all other variants. By searching for marginal epistatic effects, one can identify genetic variants that are involved in epistasis without the need to identify the exact partners with which the variants interact – thus, potentially alleviating much of the statistical and computational burden associated with conventional explicit search based methods. Our proposed 'mvMAPIT' builds upon this strategy by taking advantage of correlation structure between traits to improve the identification of variants involved in epistasis. We formulate 'mvMAPIT' as a multivariate linear mixed model and develop a multi-trait variance component estimation algorithm for efficient parameter inference and P-value computation. Together with reasonable model approximations, our proposed approach is scalable to moderately sized genome-wide association studies. Crawford et al. (2017) <doi:10.1371/journal.pgen.1006869>. Stamp et al. (2023) <doi:10.1093/g3journal/jkad118>.

Maintained by Julian Stamp. Last updated 5 months ago.

cpp epistasis epistasis-analysis gwas gwas-tools linear-mixed-models mapit mvmapit variance-components openblas cpp openmp

1.3 match 11 stars 6.90 score 17 scripts 1 dependents

myles-lewis

glmmSeq:General Linear Mixed Models for Gene-Level Differential Expression

Using mixed effects models to analyse longitudinal gene expression can highlight differences between sample groups over time. The most widely used differential gene expression tools are unable to fit linear mixed effect models, and are less optimal for analysing longitudinal data. This package provides negative binomial and Gaussian mixed effects models to fit gene expression and other biological data across repeated samples. This is particularly useful for investigating changes in RNA-Sequencing gene expression between groups of individuals over time, as described in: Rivellese, F., Surace, A. E., Goldmann, K., Sciacca, E., Cubuk, C., Giorli, G., ... Lewis, M. J., & Pitzalis, C. (2022) Nature medicine <doi:10.1038/s41591-022-01789-0>.

Maintained by Myles Lewis. Last updated 2 months ago.

bioinformatics differential-gene-expression gene-expression glmm mixed-models transcriptomics

1.5 match 19 stars 6.11 score 45 scripts

bioc

fastseg:fastseg - a fast segmentation algorithm

fastseg implements a very fast and efficient segmentation algorithm. It has functionality similar to DNACopy (Olshen and Venkatraman 2004), but is considerably faster and more flexible. fastseg can segment data from DNA microarrays and data from next generation sequencing, for example to detect copy number segments. Further, it can segment data from RNA microarrays like tiling arrays to identify transcripts. Most generally, it can segment data given as a matrix or as a vector. Various data formats can be used as input to fastseg, like expression set objects for microarrays or GRanges for sequencing data. The segmentation criterion of fastseg is based on a statistical test in a Bayesian framework, namely the cyber t-test (Baldi 2001). The speed-up arises from the facts that sampling is not necessary for fastseg and that a dynamic programming approach is used for calculation of the segments' first and higher order moments.

Maintained by Alexander Blume. Last updated 1 months ago.

classification copynumbervariation cpp

1.5 match 6.07 score 20 scripts 4 dependents

bioc

AnVILWorkflow:Run workflows implemented in Terra/AnVIL workspace

The AnVIL is a cloud computing resource developed in part by the National Human Genome Research Institute. The main cloud-based genomics platform deployed by the AnVIL project is Terra. The AnVILWorkflow package allows remote access to workflows implemented in Terra, enabling end users to utilize the resources provided by Terra/AnVIL (such as data, workflows, and flexible/scalable computing resources) through conventional R functions.

Maintained by Sehyun Oh. Last updated 27 days ago.

infrastructure software anvil gcp terra workflows

1.5 match 6 stars 6.03 score 1 scripts

bioc

regionReport:Generate HTML or PDF reports for a set of genomic regions or DESeq2/edgeR results

Generate HTML or PDF reports to explore a set of regions such as the results from annotation-agnostic expression analysis of RNA-seq data at base-pair resolution performed by derfinder. You can also create reports for DESeq2 or edgeR results.

Maintained by Leonardo Collado-Torres. Last updated 2 months ago.

differentialexpression sequencing rnaseq software visualization transcription coverage reportwriting differentialmethylation differentialpeakcalling immunooncology qualitycontrol bioconductor derfinder deseq2 edger regionreport rmarkdown

1.3 match 9 stars 7.22 score 46 scripts

ropensci

europepmc:R Interface to the Europe PubMed Central RESTful Web Service

An R Client for the Europe PubMed Central RESTful Web Service (see <https://europepmc.org/RestfulWebService> for more information). It gives access to both metadata on life science literature and open access full texts. Europe PMC indexes all PubMed content and other literature sources including Agricola, a bibliographic database of citations to the agricultural literature, or Biological Patents. In addition to bibliographic metadata, the client allows users to fetch citations and reference lists. Links between life-science literature and other EBI databases, including ENA, PDB or ChEMBL are also accessible. No registration or API key is required. See the vignettes for usage examples.

Maintained by Najko Jahn. Last updated 1 years ago.

bibliometrics europe-pmc pubmed pubmedcentral scientific-literature scientific-publications

1.1 match 27 stars 7.94 score 122 scripts 2 dependents

bioc

HiContacts:Analysing cool files in R with HiContacts

HiContacts provides a collection of tools to analyse and visualize Hi-C datasets imported in R by HiCExperiment.

Maintained by Jacques Serizay. Last updated 5 months ago.

hic dna3dstructure

1.5 match 12 stars 5.95 score 49 scripts

l-ramirez-lopez

resemble:Memory-Based Learning in Spectral Chemometrics

Functions for dissimilarity analysis and memory-based learning (MBL, a.k.a. local modeling) in complex spectral data sets. Most of these functions are based on the methods presented in Ramirez-Lopez et al. (2013) <doi:10.1016/j.geoderma.2012.12.014>.

Maintained by Leonardo Ramirez-Lopez. Last updated 2 years ago.

chemoinformatics chemometrics infrared-spectroscopy lazy-learning local-regression machine-learning memory-based-learning nir pedometrics soil-spectroscopy spectral-data spectral-library spectroscopy openblas cpp openmp

1.5 match 20 stars 5.91 score 27 scripts

bioc

ASSIGN:Adaptive Signature Selection and InteGratioN (ASSIGN)

ASSIGN is a computational tool to evaluate the pathway deregulation/activation status in individual patient samples. ASSIGN employs a flexible Bayesian factor analysis approach that adapts predetermined pathway signatures derived either from knowledge-based literature or from perturbation experiments to the cell-/tissue-specific pathway signatures. The deregulation/activation level of each context-specific pathway is quantified to a score, which represents the extent to which a patient sample encompasses the pathway deregulation/activation signature.

Maintained by Ying Shen. Last updated 5 months ago.

software geneexpression pathways bayesian

1.2 match 2 stars 7.37 score 65 scripts 1 dependents

bioc

DFplyr:A `DataFrame` (`S4Vectors`) backend for `dplyr`

Provides `dplyr` verbs (`mutate`, `select`, `filter`, etc.) supporting `S4Vectors::DataFrame` objects. Importantly, this is achieved without conversion to an intermediate `tibble`. Adds grouping infrastructure to `DataFrame` which is respected by the transformation verbs.

Maintained by Jonathan Carroll. Last updated 5 months ago.

datarepresentation infrastructure software

1.5 match 21 stars 5.87 score 5 scripts

kzst

neutrostat:Neutrosophic Statistics

Analyzes data involving imprecise and vague information. Provides summary statistics and describes the characteristics of neutrosophic data, as defined by Florentin Smarandache (2013) <ISBN:9781599732749>.

Maintained by Zsolt T. Kosztyan. Last updated 4 months ago.

3.4 match 2.60 score

lvclark

polyRAD:Genotype Calling with Uncertainty from Sequencing Data in Polyploids and Diploids

Read depth data from genotyping-by-sequencing (GBS) or restriction site-associated DNA sequencing (RAD-seq) are imported and used to make Bayesian probability estimates of genotypes in polyploids or diploids. The genotype probabilities, posterior mean genotypes, or most probable genotypes can then be exported for downstream analysis. 'polyRAD' is described by Clark et al. (2019) <doi:10.1534/g3.118.200913>, and the Hind/He statistic for marker filtering is described by Clark et al. (2022) <doi:10.1186/s12859-022-04635-9>. A variant calling pipeline for highly duplicated genomes is also included and is described by Clark et al. (2020, Version 1) <doi:10.1101/2020.01.11.902890>.

Maintained by Lindsay V. Clark. Last updated 8 days ago.

bioinformatics dna-sequencing genotype-likelihoods genotyping-by-sequencing hacktoberfest rad-seq rad-sequencing snp-genotyping cpp

1.3 match 28 stars 6.98 score 85 scripts

bioc

twoddpcr:Classify 2-d Droplet Digital PCR (ddPCR) data and quantify the number of starting molecules

The twoddpcr package takes Droplet Digital PCR (ddPCR) droplet amplitude data from Bio-Rad's QuantaSoft and can classify the droplets. A summary of the positive/negative droplet counts can be generated, which can then be used to estimate the number of molecules using the Poisson distribution. This is the first open source package that facilitates the automatic classification of general two channel ddPCR data. Previous work includes 'definetherain' (Jones et al., 2014) and 'ddpcRquant' (Trypsteen et al., 2015) which both handle one channel ddPCR experiments only. The 'ddpcr' package available on CRAN (Attali et al., 2016) supports automatic gating of a specific class of two channel ddPCR experiments only.

Maintained by Anthony Chiu. Last updated 5 months ago.

ddpcr software classification

1.5 match 10 stars 5.78 score 4 scripts
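
The twoddpcr description above mentions estimating the number of starting molecules from positive/negative droplet counts using the Poisson distribution. The following is a minimal Python sketch of that general calculation, not the package's own code: the function names are hypothetical, and it assumes molecules are distributed independently across droplets, so the fraction of negative droplets equals exp(-lambda).

```python
import math


def poisson_copies_per_droplet(positive: int, total: int) -> float:
    """Mean number of target molecules per droplet (lambda).

    Under a Poisson model the probability that a droplet is empty is
    exp(-lambda), so lambda = -ln(negative / total).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= positive < total:
        # If every droplet is positive the estimate is unbounded.
        raise ValueError("positive must satisfy 0 <= positive < total")
    negative_fraction = (total - positive) / total
    return -math.log(negative_fraction)


def estimate_total_molecules(positive: int, total: int) -> float:
    """Estimated number of molecules across all counted droplets."""
    return poisson_copies_per_droplet(positive, total) * total
```

For example, `estimate_total_molecules(50, 100)` returns `100 * ln(2)`, which is more than the 50 positive droplets because some droplets are expected to hold more than one molecule.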

bioc

RPA:RPA: Robust Probabilistic Averaging for probe-level analysis

Probabilistic analysis of probe reliability and differential gene expression on short oligonucleotide arrays.

Maintained by Leo Lahti. Last updated 5 months ago.

geneexpression microarray preprocessing qualitycontrol

1.5 match 5.78 score 20 scripts 1 dependents

cran

STREAK:Receptor Abundance Estimation using Feature Selection and Gene Set Scoring

Performs receptor abundance estimation for single cell RNA-sequencing data using a supervised feature selection mechanism and a thresholded gene set scoring procedure. Seurat's normalization method is described in: Hao et al., (2021) <doi:10.1016/j.cell.2021.04.048>, Stuart et al., (2019) <doi:10.1016/j.cell.2019.05.031>, Butler et al., (2018) <doi:10.1038/nbt.4096> and Satija et al., (2015) <doi:10.1038/nbt.3192>. Method for reduced rank reconstruction and rank-k selection is detailed in: Javaid et al., (2022) <doi:10.1101/2022.10.08.511197>. Gene set scoring procedure is described in: Frost et al., (2020) <doi:10.1093/nar/gkaa582>. Clustering method is outlined in: Song et al., (2020) <doi:10.1093/bioinformatics/btaa613> and Wang et al., (2011) <doi:10.32614/RJ-2011-015>.

Maintained by Azka Javaid. Last updated 1 years ago.

4.3 match 2.00 score 2 scripts

gtonkinhill

rhierbaps:Clustering Genetic Sequence Data Using the HierBAPS Algorithm

Implements the hierarchical Bayesian analysis of population structure (hierBAPS) algorithm of Cheng et al. (2013) <doi:10.1093/molbev/mst028> for clustering DNA sequences from multiple sequence alignments in FASTA format. The implementation includes improved defaults and plotting capabilities and, unlike the original 'MATLAB' version, removes singleton SNPs by default.

Maintained by Gerry Tonkin-Hill. Last updated 4 years ago.

population-genetics population-genomics population-structure

1.5 match 34 stars 5.66 score 27 scripts

bioc

iSEEindex:iSEE extension for a landing page to a custom collection of data sets

This package provides an interface to any collection of data sets within a single iSEE web-application. The main functionality of this package is to define a custom landing page allowing app maintainers to list a custom collection of data sets that users can select from and directly load objects into an iSEE web-application.

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

software infrastructure bioconductor hacktoberfest

1.5 match 2 stars 5.65 score 8 scripts

bioc

netresponse:Functional Network Analysis

Algorithms for functional network analysis. Includes an implementation of a variational Dirichlet process Gaussian mixture model for nonparametric mixture modeling.

Maintained by Leo Lahti. Last updated 5 months ago.

cellbiology clustering geneexpression genetics network graphandnetwork differentialexpression microarray networkinference transcription

1.5 match 3 stars 5.64 score 21 scripts

biotimehub

BioTIMEr:Tools to Use and Explore the 'BioTIME' Database

The 'BioTIME' database was first published in 2018 and has inspired ideas, questions, projects and research articles. To make it even more accessible, an R package was created. The 'BioTIMEr' package provides tools designed to interact with the 'BioTIME' database. The functions provided include the 'BioTIME' recommended methods for preparing (gridding and rarefaction) time series data, a selection of standard biodiversity metrics (including species richness, numerical abundance and exponential Shannon) alongside examples on how to display change over time. It also includes a sample subset of both the query and meta data, the full versions of which are freely available on the 'BioTIME' website <https://biotime.st-andrews.ac.uk/home.php>.

Maintained by Alban Sagouis. Last updated 8 months ago.

1.5 match 4 stars 5.60 score 10 scripts
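
Two of the biodiversity metrics named in the BioTIMEr description above, species richness and exponential Shannon, can be sketched in a few lines of Python. This is an illustration of the standard textbook definitions only; the function names are hypothetical and nothing here reflects how 'BioTIMEr' itself implements or names these metrics.

```python
import math


def species_richness(abundances):
    """Number of species with a non-zero abundance."""
    return sum(1 for n in abundances if n > 0)


def exponential_shannon(abundances):
    """exp(H), where H = -sum(p_i * ln(p_i)) over relative abundances p_i."""
    counts = [n for n in abundances if n > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one species must have non-zero abundance")
    shannon = -sum((n / total) * math.log(n / total) for n in counts)
    return math.exp(shannon)
```

When all species are equally abundant, the exponential Shannon value equals the species richness; uneven communities give a smaller value.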

alissonrp

fastrep:Time-Saving Package for Creating Reports

Provides templates for reports in 'rmarkdown' and functions to create tables and summaries of data.

Maintained by Alisson Rosa. Last updated 2 years ago.

latex pdf rmarkdown

1.9 match 6 stars 4.48 score 6 scripts

mikejareds

hermiter:Efficient Sequential and Batch Estimation of Univariate and Bivariate Probability Density Functions and Cumulative Distribution Functions along with Quantiles (Univariate) and Nonparametric Correlation (Bivariate)

Facilitates estimation of full univariate and bivariate probability density functions and cumulative distribution functions along with full quantile functions (univariate) and nonparametric correlation (bivariate) using Hermite series based estimators. These estimators are particularly useful in the sequential setting (both stationary and non-stationary) and one-pass batch estimation setting for large data sets. Based on: Stephanou, Michael, Varughese, Melvin and Macdonald, Iain. "Sequential quantiles via Hermite series density estimation." Electronic Journal of Statistics 11.1 (2017): 570-607 <doi:10.1214/17-EJS1245>, Stephanou, Michael and Varughese, Melvin. "On the properties of Hermite series based distribution function estimators." Metrika (2020) <doi:10.1007/s00184-020-00785-z> and Stephanou, Michael and Varughese, Melvin. "Sequential estimation of Spearman rank correlation using Hermite series estimators." Journal of Multivariate Analysis (2021) <doi:10.1016/j.jmva.2021.104783>.

Maintained by Michael Stephanou. Last updated 7 months ago.

cumulative-distribution-function kendall-correlation-coefficient online-algorithms probability-density-function quantile spearman-correlation-coefficient statistics streaming-algorithms streaming-data cpp

1.5 match 15 stars 5.58 score 17 scripts

bioc

iSEEhub:iSEE for the Bioconductor ExperimentHub

This package defines a custom landing page for an iSEE app interfacing with the Bioconductor ExperimentHub. The landing page allows users to browse the ExperimentHub, select a data set, download and cache it, and import it directly into a Bioconductor iSEE app.

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

dataimport immunooncology infrastructure shinyapps singlecell software bioconductor bioconductor-package hacktoberfest isee

1.5 match 3 stars 5.56 score 4 scripts

hughparsonage

TeXCheckR:Parses LaTeX Documents for Errors

Checks LaTeX documents and .bib files for typing errors, such as spelling errors and incorrect quotation marks. Also provides useful functions for parsing and linting bibliography files.

Maintained by Hugh Parsonage. Last updated 1 years ago.

bibtex latex spellcheck

1.9 match 8 stars 4.44 score 23 scripts

bioc

iSEEde:iSEE extension for panels related to differential expression analysis

This package contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels or modes facilitating the analysis of differential expression results. This package does not perform differential expression. Instead, it provides methods to embed precomputed differential expression results in a SummarizedExperiment object, in a manner that is compatible with interactive visualisation in iSEE applications.

Maintained by Kevin Rue-Albrecht. Last updated 4 months ago.

software infrastructure differentialexpression bioconductor hacktoberfest iseeu

1.5 match 1 stars 5.38 score 15 scripts

bioc

visiumStitched:Enable downstream analysis of Visium capture areas stitched together with Fiji

This package provides helper functions for working with multiple Visium capture areas that overlap each other. This package was developed along with the companion example use case data available from https://github.com/LieberInstitute/visiumStitched_brain. visiumStitched prepares SpaceRanger (10x Genomics) output files so you can stitch the images from groups of capture areas together with Fiji. Then visiumStitched builds a SpatialExperiment object with the stitched data and makes an artificial hexagonal grid enabling the seamless use of spatial clustering methods that rely on such a grid to identify neighboring spots, such as PRECAST and BayesSpace. The SpatialExperiment objects created by visiumStitched are compatible with spatialLIBD, which can be used to build interactive websites for stitched SpatialExperiment objects. visiumStitched also enables casting SpatialExperiment objects as Seurat objects.

Maintained by Nicholas J. Eagles. Last updated 3 months ago.

software spatial transcriptomics transcription geneexpression visualization dataimport 10xgenomics bioconductor spatial-transcriptomics spatialexperiment spatiallibd visium

1.5 match 1 stars 5.36 score 4 scripts

jl5000

tidyged:Handle GEDCOM Files Using Tidyverse Principles

Create and summarise family tree GEDCOM files using tidy dataframes.

Maintained by Jamie Lendrum. Last updated 3 years ago.

1.3 match 8 stars 5.96 score 23 scripts 3 dependents

ltrr-arizona-edu

burnr:Forest Fire History Analysis

Tools to read, write, parse, and analyze forest fire history data (e.g. FHX). Described in Malevich et al. (2018) <doi:10.1016/j.dendro.2018.02.005>.

Maintained by Steven Malevich. Last updated 3 years ago.

citation dendrochronology ecology forestfire plot scientific statistics

1.3 match 15 stars 5.95 score 59 scripts

bioc

epialleleR:Fast, Epiallele-Aware Methylation Caller and Reporter

Epialleles are specific DNA methylation patterns that are mitotically and/or meiotically inherited. This package calls and reports cytosine methylation as well as frequencies of hypermethylated epialleles at the level of genomic regions or individual cytosines in next-generation sequencing data using binary alignment map (BAM) files as an input. Among other things, this package can also extract and visualise methylation patterns and assess allele specificity of methylation.

Maintained by Oleksii Nikolaienko. Last updated 11 days ago.

dnamethylation epigenetics methylseq longread bioconductor dna-methylation epiallele next-generation-sequencing samtools curl bzip2 xz-utils zlib cpp

1.3 match 4 stars 5.94 score 5 scripts

aravind-j

augmentedRCBD:Analysis of Augmented Randomised Complete Block Designs

Functions for analysis of data generated from experiments in augmented randomised complete block design according to Federer, W.T. (1961) <doi:10.2307/2527837>. Computes analysis of variance, adjusted means, descriptive statistics, genetic variability statistics etc. Further includes data visualization and report generation functions.

Maintained by J. Aravind. Last updated 5 months ago.

augmented-block augmented-design augmented-rcbd

1.3 match 7 stars 5.94 score 21 scripts

bioc

consensusSeekeR:Detection of consensus regions inside a group of experiences using genomic positions and genomic ranges

This package compares genomic positions and genomic ranges from multiple experiments to extract common regions. The size of the analyzed region is adjustable, as is the number of experiments in which a feature must be present in a potential region to tag this region as a consensus region. In genomic analysis where feature identification generates a position value surrounded by a genomic range, such as ChIP-Seq peaks and nucleosome positions, the replication of an experiment may result in slight differences between predicted values. This package enables the reconciliation of the results into consensus regions.

Maintained by Astrid Deschênes. Last updated 5 months ago.

biologicalquestion chipseq genetics multiplecomparison transcription peakdetection sequencing coverage chip-seq-analysis genomic-data-analysis nucleosome-positioning

1.5 match 1 stars 5.26 score 5 scripts 1 dependents

brian-j-smith

MRMCaov:Multi-Reader Multi-Case Analysis of Variance

Estimation and comparison of the performances of diagnostic tests in multi-reader multi-case studies where true case statuses (or ground truths) are known and one or more readers provide test ratings for multiple cases. Reader performance metrics are provided for area under and expected utility of ROC curves, likelihood ratio of positive or negative tests, and sensitivity and specificity. ROC curves can be estimated empirically or with binormal or binormal likelihood-ratio models. Statistical comparisons of diagnostic tests are based on the ANOVA model of Obuchowski-Rockette and the unified framework of Hillis (2005) <doi:10.1002/sim.2024>. The ANOVA can be conducted with data from a full factorial, nested, or partially paired study design; with random or fixed readers or cases; and covariances estimated with the DeLong method, jackknifing, or an unbiased method. Smith and Hillis (2020) <doi:10.1117/12.2549075>.

Maintained by Brian J Smith. Last updated 2 years ago.

1.5 match 12 stars 5.26 score 8 scripts 1 dependents

bioc

qsvaR:Generate Quality Surrogate Variable Analysis for Degradation Correction

The qsvaR package contains functions for removing the effect of degradation in RNA-seq data from postmortem brain tissue. The package is equipped to help users generate principal components associated with degradation. The components can be used in differential expression analysis to remove the effects of degradation.

Maintained by Hedia Tnani. Last updated 3 months ago.

software workflowstep normalization biologicalquestion differentialexpression sequencing coverage bioconductor brain degradation human qsva

1.5 match 5.26 score 4 scripts

jjustison

SiPhyNetwork:A Phylogenetic Simulator for Reticulate Evolution

A simulator for reticulate evolution under a birth-death-hybridization process. Here the birth-death process is extended to consider reticulate evolution by allowing hybridization events to occur. The general purpose simulator allows the modeling of three different reticulate patterns: lineage generative hybridization, lineage neutral hybridization, and lineage degenerative hybridization. Users can also specify hybridization events to be dependent on a trait value or genetic distance. We also extend some phylogenetic tree utility and plotting functions for networks. We allow two different stopping conditions: simulated to a fixed time or number of taxa. When simulating to a fixed number of taxa, the user can simulate under the Generalized Sampling Approach that properly simulates phylogenies when assuming a uniform prior on the root age.

Maintained by Joshua Justison. Last updated 6 months ago.

cpp

1.5 match 11 stars 5.25 score 16 scripts

bioc

TREG:Tools for finding Total RNA Expression Genes in single nucleus RNA-seq data

RNA abundance and cell size parameters could improve RNA-seq deconvolution algorithms to more accurately estimate cell type proportions given the different cell type transcription activity levels. A Total RNA Expression Gene (TREG) can facilitate estimating total RNA content using single molecule fluorescent in situ hybridization (smFISH). We developed a data-driven approach using a measure of expression invariance to find candidate TREGs in postmortem human brain single nucleus RNA-seq. This R package implements the method for identifying candidate TREGs from snRNA-seq data.

Maintained by Louise Huuki-Myers. Last updated 3 months ago.

software singlecell rnaseq geneexpression transcriptomics transcription sequencing bioconductor deconvolution rnascope scrna-seq smfish snrna-seq treg

1.5 match 4 stars 5.20 score 5 scripts

bioc

regutools:regutools: an R package for data extraction from RegulonDB

RegulonDB has collected, harmonized and centralized data from hundreds of experiments for nearly two decades and is considered a point of reference for transcriptional regulation in Escherichia coli K12. Here, we present the regutools R package to facilitate programmatic access to RegulonDB data in computational biology. regutools provides researchers with the possibility of writing reproducible workflows with automated queries to RegulonDB. The regutools package serves as a bridge between RegulonDB data and the Bioconductor ecosystem by reusing the data structures and statistical methods powered by other Bioconductor packages. We demonstrate the integration of regutools with Bioconductor by analyzing transcription factor DNA binding sites and transcriptional regulatory networks from RegulonDB. We anticipate that regutools will serve as a useful building block in our progress to further our understanding of gene regulatory networks.

Maintained by Joselyn Chavez. Last updated 3 months ago.

generegulation geneexpression systemsbiology network networkinference visualization transcription bioconductor cdsb regulondb

1.5 match 4 stars 5.20 score 6 scripts

bioc

snapcount:R/Bioconductor Package for interfacing with Snaptron for rapid querying of expression counts

snapcount is a client interface to the Snaptron webservices which support querying by gene name or genomic region. Results include raw expression counts derived from alignment of RNA-seq samples and/or various summarized measures of expression across one or more regions/genes per-sample (e.g. percent spliced in).

Maintained by Rone Charles. Last updated 5 months ago.

coverage geneexpression rnaseq sequencing software dataimport

1.5 match 3 stars 5.19 score 13 scripts

ropensci

antanym:Antarctic Geographic Place Names

Antarctic geographic names from the Composite Gazetteer of Antarctica, and functions for working with those place names.

Maintained by Ben Raymond. Last updated 3 years ago.

antarctic southern ocean place names gazetteer peer-reviewed

2.0 match 7 stars 3.89 score 22 scripts

bioc

spaSim:Spatial point data simulator for tissue images

A suite of functions for simulating spatial patterns of cells in tissue images. Output images are multitype point data in SingleCellExperiment format. Each point represents a cell, with its 2D locations and cell type. Potential cell patterns include background cells, tumour/immune cell clusters, immune rings, and blood/lymphatic vessels.

Maintained by Yuzhou Feng. Last updated 5 months ago.

statisticalmethod spatial biomedicalinformatics

1.5 match 2 stars 5.18 score 25 scripts

bioc

derfinderHelper:derfinder helper package

Helper package for speeding up the derfinder package when using multiple cores. This package is particularly useful when using BiocParallel and it helps reduce the time spent loading the full derfinder package when running the F-statistics calculation in parallel.

Maintained by Leonardo Collado-Torres. Last updated 3 months ago.

differentialexpression sequencing rnaseq software immunooncology bioconductor derfinder

1.3 match 6.20 score 7 dependents

trevorhastie

glmnet:Lasso and Elastic-Net Regularized Generalized Linear Models

Extremely efficient procedures for fitting the entire lasso or elastic-net regularization path for linear regression, logistic and multinomial regression models, Poisson regression, Cox model, multiple-response Gaussian, and the grouped multinomial regression; see <doi:10.18637/jss.v033.i01> and <doi:10.18637/jss.v039.i05>. There are two new and important additions. The family argument can be a GLM family object, which opens the door to any programmed family (<doi:10.18637/jss.v106.i01>). This comes with a modest computational cost, so when the built-in families suffice, they should be used instead. The other novelty is the relax option, which refits each of the active sets in the path unpenalized. The algorithm uses cyclical coordinate descent in a path-wise fashion, as described in the papers cited.

Maintained by Trevor Hastie. Last updated 2 years ago.

fortran cpp

0.5 match 82 stars 15.15 score 22k scripts 736 dependents
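
The glmnet description above refers to cyclical coordinate descent applied in a path-wise fashion. The sketch below illustrates that idea in plain Python for the lasso penalty only, under simplifying assumptions that are not taken from the package: no intercept, no standardization, a Gaussian response, and the objective (1/(2n)) * ||y - X b||^2 + lambda * ||b||_1. All function names are hypothetical; the actual package is implemented very differently and far more efficiently.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator S(z, gamma) = sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, beta=None, n_sweeps=200, tol=1e-10):
    """Cycle through coordinates, updating one coefficient at a time."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p if beta is None else list(beta)
    residual = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the partial residual (beta_j removed).
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_value = soft_threshold(rho, lam) / col_sq[j]
            delta = new_value - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_value
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


def lasso_path(X, y, lambdas):
    """Fit from the largest lambda to the smallest, warm-starting each fit."""
    beta, path = None, []
    for lam in sorted(lambdas, reverse=True):
        beta = lasso_coordinate_descent(X, y, lam, beta)
        path.append((lam, beta))
    return path
```

Warm-starting each fit from the previous solution is what makes the path-wise strategy cheap: neighbouring values of lambda usually have similar solutions, so few sweeps are needed per value.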

earthsystemdiagnostics

sedproxy:Simulation of Sediment Archived Climate Proxy Records

Proxy forward modelling for sediment archived climate proxies such as Mg/Ca, d18O or Alkenones. The user provides a hypothesised "true" past climate, such as output from a climate model, and details of the sedimentation rate and sampling scheme of a sediment core. Sedproxy returns simulated proxy records. Implements the methods described in Dolman and Laepple (2018) <doi:10.5194/cp-14-1851-2018>.

Maintained by Andrew Dolman. Last updated 1 months ago.

1.5 match 7 stars 5.10 score 18 scripts

bioc

SplineDV:Differential Variability (DV) analysis for single-cell RNA sequencing data. (e.g. Identify Differentially Variable Genes across two experimental conditions)

A spline-based scRNA-seq method for identifying differentially variable (DV) genes across two experimental conditions. Spline-DV constructs a 3D spline from 3 key gene statistics: mean expression, coefficient of variation (CV), and dropout rate. This is done for both conditions. The 3D spline provides the “expected” behavior of genes in each condition. The distance of the observed mean, CV and dropout rate of each gene from the expected 3D spline is used to measure variability. As the final step, the spline-DV method compares the variabilities of each condition to identify differentially variable (DV) genes.

Maintained by Shreyan Gupta. Last updated 1 month ago.

software singlecell sequencing differentialexpression rnaseq geneexpression transcriptomics featureextraction

1.5 match 2 stars 5.08 score 3 scripts

bioc

icetea:Integrating Cap Enrichment with Transcript Expression Analysis

icetea (Integrating Cap Enrichment with Transcript Expression Analysis) provides functions for end-to-end analysis of multiple 5'-profiling methods such as CAGE, RAMPAGE and MAPCap, beginning from raw reads to detection of transcription start sites using replicates. It also allows performing differential TSS detection between groups of samples, thereby integrating the mRNA cap enrichment information with transcript expression analysis.

Maintained by Vivek Bhardwaj. Last updated 5 months ago.

immunooncology transcription geneexpression sequencing rnaseq transcriptomics differentialexpression cage expression rna-seq

1.5 match 2 stars 5.08 score 7 scripts

bioc

yamss:Tools for high-throughput metabolomics

Tools to analyze and visualize high-throughput metabolomics data acquired using chromatography-mass spectrometry. These tools preprocess data in a way that enables reliable and powerful differential analysis. At the core of these methods is a peak detection phase that pools information across all samples simultaneously. This is in contrast to other methods that detect peaks on a sample-by-sample basis.

Maintained by Leslie Myint. Last updated 5 months ago.

massspectrometry metabolomics peakdetection software

1.5 match 3 stars 5.08 score 9 scripts

ralmond

CPTtools:Tools for Creating Conditional Probability Tables

Provides support for parameterized tables for Bayesian networks, particularly the IRT-like DiBello tables. Also provides some tools for visualizing the networks.

Maintained by Russell Almond. Last updated 3 months ago.

bayesian-network statistics

1.5 match 1 star 5.05 score 21 scripts 4 dependents

giscience

ohsome:An 'ohsome API' Client

A client that grants access to the power of the 'ohsome API' from R. It lets you analyze the rich data source of the 'OpenStreetMap (OSM)' history. You can retrieve the geometry of 'OSM' data at specific points in time, and you can get aggregated statistics on the evolution of 'OSM' elements and specify your own temporal, spatial and/or thematic filters.

Maintained by Oliver Fritz. Last updated 2 years ago.

heigit ohsome openstreetmap openstreetmap-data openstreetmap-history osm osm-data

1.5 match 11 stars 5.04 score 9 scripts

bioc

nucleoSim:Generate synthetic nucleosome maps

This package can generate a synthetic map with reads covering the nucleosome regions as well as a synthetic map with forward and reverse reads emulating next-generation sequencing. The synthetic hybridization data of “Tiling Arrays” can also be generated. The user can choose among three different distributions for the read positioning: Normal, Student and Uniform. In addition, a visualization tool is provided to explore the synthetic nucleosome maps.

Maintained by Astrid Deschênes. Last updated 5 months ago.

genetics sequencing software statisticalmethod alignment bioconductor nucleosome-maps nucleosomes simulation simulator synthetic-nucleosomes

1.5 match 2 stars 5.00 score 8 scripts

ajbass

sffdr:Surrogate Functional False Discovery Rates for Genome-Wide Association Studies

Pleiotropy-informed significance analysis of genome-wide association studies with surrogate functional false discovery rates (sfFDR). The sfFDR framework adapts the fFDR to leverage informative data from multiple sets of GWAS summary statistics to increase power in a study while accounting for linkage disequilibrium. sfFDR provides estimates of key FDR quantities in a significance analysis such as the functional local FDR and q-value, and uses these estimates to derive a functional p-value for type I error rate control and a functional local Bayes factor for post-GWAS analyses (e.g., fine mapping and colocalization).

Maintained by Andrew Bass. Last updated 1 month ago.

cpp

1.5 match 4 stars 5.00 score 3 scripts

bioc

rRDP:Interface to the RDP Classifier

This package installs and interfaces the naive Bayesian classifier for 16S rRNA sequences developed by the Ribosomal Database Project (RDP). With this package the classifier trained with the standard training set can be used or a custom classifier can be trained.

Maintained by Michael Hahsler. Last updated 5 months ago.

genetics sequencing infrastructure classification microbiome immunooncology alignment sequencematching dataimport bayesian bioconductor bioinformatics openjdk

1.5 match 4 stars 5.00 score 6 scripts

deploid-dev

DEploid:Deconvolute Mixed Genomes with Unknown Proportions

Traditional phasing programs are limited to diploid organisms. Our method modifies the Li and Stephens algorithm with Markov chain Monte Carlo (MCMC) approaches, and builds a generic framework that allows haplotype searches in a multiple infection setting. This package is primarily developed as part of the Pf3k project, which is a global collaboration using the latest sequencing technologies to provide a high-resolution view of natural variation in the malaria parasite Plasmodium falciparum. Parasite DNA is extracted from patient blood samples, which often contain more than one parasite strain, in unknown proportions. This package is used for deconvoluting mixed haplotypes and reporting the mixture proportions for each sample.

Maintained by Joe Zhu. Last updated 2 months ago.

deconvoluting-mixed-genomes hmm malaria mcmc parasites phasing unknown-proportions zlib cpp

1.5 match 1 star 4.99 score 39 scripts

bioc

awst:Asymmetric Within-Sample Transformation

We propose an Asymmetric Within-Sample Transformation (AWST) to regularize RNA-seq read counts and reduce the effect of noise on the classification of samples. AWST comprises two main steps: standardization and smoothing. These steps transform gene expression data to reduce the noise of the lowly expressed features, which suffer from background effects and low signal-to-noise ratio, and the influence of the highly expressed features, which may be the result of amplification bias and other experimental artifacts.

Maintained by Davide Risso. Last updated 5 months ago.

normalization geneexpression rnaseq software transcriptomics sequencing singlecell

1.5 match 3 stars 4.95 score 15 scripts

bioc

iSEEpathways:iSEE extension for panels related to pathway analysis

This package contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels or modes facilitating the analysis of pathway analysis results. This package does not perform pathway analysis. Instead, it provides methods to embed precomputed pathway analysis results in a SummarizedExperiment object, in a manner that is compatible with interactive visualisation in iSEE applications.

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

software infrastructure differentialexpression geneexpression gui visualization pathways genesetenrichment go shinyapps bioconductor hacktoberfest isee iseeu

1.5 match 1 star 4.95 score 10 scripts

bioc

epigraHMM:Epigenomic R-based analysis with hidden Markov models

epigraHMM provides a set of tools for the analysis of epigenomic data based on hidden Markov Models. It contains two separate peak callers, one for consensus peaks from biological or technical replicates, and one for differential peaks from multi-replicate multi-condition experiments. In differential peak calling, epigraHMM provides window-specific posterior probabilities associated with every possible combinatorial pattern of read enrichment across conditions.

Maintained by Pedro Baldoni. Last updated 5 months ago.

chipseq atacseq dnaseseq hiddenmarkovmodel epigenetics zlib openblas cpp openmp

1.5 match 4.94 score 88 scripts

tjetka

SLEMI:Statistical Learning Based Estimation of Mutual Information

An implementation of the algorithm for estimating mutual information and channel capacity from experimental data by classification procedures (logistic regression). Technically, it allows estimation of information-theoretic measures between a finite-state input and a multivariate, continuous output. The method is described in Jetka et al. (2019) <doi:10.1371/journal.pcbi.1007132>.

Maintained by Tomasz Jetka. Last updated 1 year ago.

channel-capacity information-theory logistic-regression mutual-information-estimation

1.5 match 4 stars 4.92 score 21 scripts

muschellij2

gcite:Google Citation Parser

Scrapes Google Citation pages and creates data frames of citations over time.

Maintained by John Muschelli. Last updated 3 years ago.

2.0 match 3 stars 3.67 score 31 scripts

joeroe

rpaleoclim:Download Paleoclimate Data from 'PaleoClim'

'PaleoClim' <http://www.paleoclim.org> (Brown et al. 2019, <doi:10.1038/sdata.2018.254>) is a set of free, high resolution paleoclimate surfaces covering the whole globe. It includes data on surface temperature, precipitation and the standard bioclimatic variables commonly used in ecological modelling, derived from the 'HadCM3' general circulation model and downscaled to a spatial resolution of up to 2.5 minutes. Simulations are available for key time periods from the Late Holocene to mid-Pliocene. Data on current and Last Glacial Maximum climate is derived from 'CHELSA' (Karger et al. 2017, <doi:10.1038/sdata.2017.122>) and reprocessed by 'PaleoClim' to match their format; it is available at up to 30 seconds resolution. This package provides a simple interface for downloading 'PaleoClim' data in R, with support for caching and filtering retrieved data by period, resolution, and geographic extent.

Maintained by Joe Roe. Last updated 2 years ago.

paleoclimate

1.5 match 15 stars 4.88 score 3 scripts

bioc

rmelting:R Interface to MELTING 5

R interface to the MELTING 5 program (https://www.ebi.ac.uk/biomodels/tools/melting/) to compute melting temperatures of nucleic acid duplexes along with other thermodynamic parameters.

Maintained by J. Aravind. Last updated 5 months ago.

biomedicalinformatics cheminformatics bioconductor bioinformatics melting-temperature openjdk

1.5 match 2 stars 4.78 score 10 scripts

angeella

pARI:Permutation-Based All-Resolutions Inference

Computes the All-Resolution Inference method in the permutation framework, i.e., simultaneous lower confidence bounds for the number of true discoveries. <doi:10.1002/sim.9725>.

Maintained by Angela Andreella. Last updated 6 months ago.

ari cluster-map copes discoveries fmri fsl permutation selective-inference simultaneous-confidence-bounds spm openblas cpp

1.5 match 4 stars 4.78 score 9 scripts 1 dependent

bioc

wpm:Well Plate Maker

The Well-Plate Maker (WPM) is a shiny application deployed as an R package. Functions for command-line/script use are also available. The WPM allows users to generate well plate maps to carry out their experiments while improving the handling of batch effects. In particular, it helps control the "plate effect" thanks to its ability to randomize samples over multiple well plates. The algorithm for placing the samples is inspired by the backtracking algorithm: the samples are placed at random while respecting specific spatial constraints.

Maintained by Helene Borges. Last updated 5 months ago.

gui proteomics massspectrometry batcheffect experimentaldesign

1.5 match 6 stars 4.78 score 7 scripts

bioc

decompTumor2Sig:Decomposition of individual tumors into mutational signatures by signature refitting

Uses quadratic programming for signature refitting, i.e., to decompose the mutation catalog from an individual tumor sample into a set of given mutational signatures (either Alexandrov-model signatures or Shiraishi-model signatures), computing weights that reflect the contributions of the signatures to the mutation load of the tumor.

Maintained by Rosario M. Piro. Last updated 5 months ago.

software snp sequencing dnaseq genomicvariation somaticmutation biomedicalinformatics genetics biologicalquestion statisticalmethod

1.5 match 1 star 4.78 score 10 scripts 1 dependent

ajaygpb

ammistability:Additive Main Effects and Multiplicative Interaction Model Stability Parameters

Computes various stability parameters from Additive Main Effects and Multiplicative Interaction (AMMI) analysis results such as Modified AMMI Stability Value (MASV), Sums of the Absolute Value of the Interaction Principal Component Scores (SIPC), Sum Across Environments of Genotype-Environment Interaction Modelled by AMMI (AMGE), Sum Across Environments of Absolute Value of Genotype-Environment Interaction Modelled by AMMI (AV_(AMGE)), AMMI Stability Index (ASI), Modified ASI (MASI), AMMI Based Stability Parameter (ASTAB), Annicchiarico's D Parameter (DA), Zhang's D Parameter (DZ), Averages of the Squared Eigenvector Values (EV), Stability Measure Based on Fitted AMMI Model (FA), Absolute Value of the Relative Contribution of IPCs to the Interaction (Za). It also calculates the Simultaneous Selection Index for Yield and Stability from the computed stability parameters. See the vignette for a complete list of citations for the methods implemented.

Maintained by B. C. Ajay. Last updated 2 years ago.

1.5 match 3 stars 4.76 score 19 scripts

mw201608

NetWeaver:Graphic Presentation of Complex Genomic and Network Data Analysis

Implements various simple function utilities and flexible pipelines to generate circular images for visualizing complex genomic and network data analysis features.

Maintained by Minghui Wang. Last updated 2 years ago.

1.5 match 4 stars 4.75 score 28 scripts

bioc

plyinteractions:Extending tidy verbs to genomic interactions

Operate on `GInteractions` objects as tabular data using `dplyr`-like verbs. The functions and methods in `plyinteractions` provide a grammatical approach to manipulate `GInteractions`, to facilitate their integration in genomic analysis workflows.

Maintained by Jacques Serizay. Last updated 5 months ago.

software infrastructure

1.5 match 4.75 score 14 scripts

annechao

iNEXT.beta3D:Interpolation and Extrapolation with Beta Diversity for Three Dimensions of Biodiversity

As a sequel to 'iNEXT', the 'iNEXT.beta3D' package provides functions to compute standardized taxonomic, phylogenetic, and functional diversity (3D) estimates with a common sample size (for alpha and gamma diversity) or sample coverage (for alpha, beta, gamma diversity as well as dissimilarity or turnover indices). Hill numbers and their generalizations are used to quantify 3D and to make multiplicative decomposition (gamma = alpha x beta). The package also features size- and coverage-based rarefaction and extrapolation sampling curves to facilitate rigorous comparison of beta diversity across datasets. See Chao et al. (2023) <doi:10.1002/ecm.1588> for more details.

Maintained by Anne Chao. Last updated 4 months ago.

1.3 match 5.30 score 6 scripts
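
The Hill numbers and the multiplicative decomposition (gamma = alpha x beta) mentioned in the iNEXT.beta3D entry above can be illustrated for the taxonomic case. The sketch below is not the package's code and does no sample-size or coverage standardization; it works on raw counts only. It assumes alpha is the Hill number of the joint assemblage-by-species proportions divided by the number of assemblages, and the names `hill_number` and `gamma_alpha_beta` are hypothetical.

```python
import math


def hill_number(props, q):
    """Hill number of order q for a list of proportions that sum to 1."""
    props = [p for p in props if p > 0]
    if abs(q - 1.0) < 1e-12:
        # Limit q -> 1: exponential of Shannon entropy.
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


def gamma_alpha_beta(counts, q):
    """counts: one list of species abundances per assemblage (same species order)."""
    total = sum(sum(row) for row in counts)
    n_assemblages = len(counts)
    # Gamma: diversity of the pooled assemblage.
    pooled = [sum(col) / total for col in zip(*counts)]
    gamma = hill_number(pooled, q)
    # Alpha (assumed definition): joint diversity divided by the number of assemblages.
    joint = [z / total for row in counts for z in row]
    alpha = hill_number(joint, q) / n_assemblages
    return gamma, alpha, gamma / alpha


# Two assemblages with no shared species, then two identical assemblages.
print(gamma_alpha_beta([[10, 10, 0, 0], [0, 0, 10, 10]], q=1))
print(gamma_alpha_beta([[10, 10], [10, 10]], q=1))
```

Under these assumptions beta ranges from 1 (identical assemblages) to the number of assemblages (no shared species), which is what makes it comparable across orders q.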

bioc

AWFisher:An R package for fast computing for adaptively weighted fisher's method

Implementation of the adaptively weighted Fisher's method, including fast p-value computing, variability index, and meta-pattern.

Maintained by Zhiguang Huo. Last updated 5 months ago.

statisticalmethod software

1.5 match 5 stars 4.70 score 4 scripts
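
As a rough illustration of the idea behind the AWFisher entry above, the sketch below combines p-values with Fisher's method and then searches 0/1 weights for the subset giving the smallest Fisher p-value. It assumes the adaptively weighted statistic is that minimum, and uses the fact that for a fixed subset size the best subset is the smallest p-values. It does not reproduce the package's fast p-value computation for the resulting statistic, and the function names are hypothetical.

```python
import math


def chi2_sf_even(x, k):
    """P(X >= x) for a chi-square variable with 2k degrees of freedom (closed form)."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_pvalue(pvals):
    """Fisher's combined p-value; pvals must lie in (0, 1]."""
    stat = -2.0 * sum(math.log(p) for p in pvals)
    return chi2_sf_even(stat, len(pvals))


def aw_fisher_statistic(pvals):
    """Smallest Fisher p-value over non-empty subsets, plus the 0/1 weights achieving it."""
    order = sorted(range(len(pvals)), key=lambda k: pvals[k])
    best_p, best_weights = float("inf"), [0] * len(pvals)
    for j in range(1, len(order) + 1):
        subset = order[:j]  # the j smallest p-values
        p = fisher_pvalue([pvals[k] for k in subset])
        if p < best_p:
            best_p = p
            best_weights = [1 if k in subset else 0 for k in range(len(pvals))]
    return best_p, best_weights


print(aw_fisher_statistic([0.001, 0.9, 0.02]))
```

The returned minimum is a test statistic, not a calibrated p-value: because it is optimized over subsets, its null distribution must be computed separately.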

bioc

flowcatchR:Tools to analyze in vivo microscopy imaging data focused on tracking flowing blood cells

flowcatchR is a set of tools to analyze in vivo microscopy imaging data, focused on tracking flowing blood cells. It guides the steps from segmentation to calculation of features and filtering out particles not of interest, and also provides a set of utilities to help check the quality of the performed operations (e.g. how good the segmentation was). It allows investigating the issue of tracking flowing cells such as in blood vessels, to categorize the particles into flowing, rolling and adherent. This classification is applied in the study of phenomena such as hemostasis and thrombosis development. Moreover, flowcatchR presents an integrated workflow solution, based on the integration with a Shiny App and Jupyter notebooks, which is delivered alongside the package, and can enable fully reproducible bioimage analysis in the R environment.

Maintained by Federico Marini. Last updated 3 months ago.

software visualization cellbiology classification infrastructure gui shinyapps bioconductor fluorescence microscopy particles tracking

1.3 match 4 stars 5.62 score 8 scripts

bioc

TargetDecoy:Diagnostic Plots to Evaluate the Target Decoy Approach

A first step in the data analysis of Mass Spectrometry (MS) based proteomics data is to identify peptides and proteins. To this end, the huge number of experimental mass spectra typically has to be assigned to theoretical peptides derived from a sequence database. Search engines are used for this purpose. These tools compare each of the observed spectra to all candidate theoretical spectra derived from the sequence database and calculate a score for each comparison. The observed spectrum is then assigned to the theoretical peptide with the best score, which is also referred to as the peptide to spectrum match (PSM). It is of course crucial for the downstream analysis to evaluate the quality of these matches. Therefore False Discovery Rate (FDR) control is used to return a reliable list of PSMs. The FDR, however, requires a good characterisation of the score distribution of PSMs that are matched to the wrong peptide (bad target hits). In proteomics, the target decoy approach (TDA) is typically used for this purpose. The TDA method matches the spectra to a database of real (targets) and nonsense peptides (decoys). A popular approach to generate these decoys is to reverse the target database. Hence, all the PSMs that match to a decoy are known to be bad hits and the distribution of their scores is used to estimate the distribution of the bad scoring target PSMs. A crucial assumption of the TDA is that the decoy PSM hits have similar properties to bad target hits so that the decoy PSM scores are a good simulation of the bad target PSM scores. Users, however, typically do not evaluate these assumptions. To this end we developed TargetDecoy to generate diagnostic plots to evaluate the quality of the target decoy method.

Maintained by Elke Debrie. Last updated 5 months ago.

massspectrometry proteomics qualitycontrol software visualization bioconductor mass-spectrometry

1.5 match 1 star 4.60 score 9 scripts
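
The target decoy approach described in the TargetDecoy entry above is commonly turned into an FDR estimate by counting decoy hits. The sketch below shows that counting estimate only, assuming a concatenated target/decoy search in which higher scores are better. It is not the package's diagnostic plotting code, and `decoy_fdr` and the toy scores are hypothetical.

```python
def decoy_fdr(psms, threshold):
    """Estimate the FDR among target PSMs scoring at or above the threshold.

    psms: list of (score, is_decoy) pairs. Decoy hits above the threshold stand in
    for the unknown number of bad target hits above the same threshold.
    Returns None when no target PSM passes the threshold.
    """
    decoys = sum(1 for score, is_decoy in psms if is_decoy and score >= threshold)
    targets = sum(1 for score, is_decoy in psms if not is_decoy and score >= threshold)
    if targets == 0:
        return None
    return decoys / targets


# Toy PSM list: (search engine score, matched a decoy peptide?)
psms = [(9.0, False), (8.0, False), (7.0, True), (6.0, False), (5.0, False), (4.0, True)]
for t in (8.0, 5.0):
    print(t, decoy_fdr(psms, t))
```

The estimate is only as good as the assumption stated in the entry: decoy scores must behave like the scores of bad target hits, which is what diagnostic plots are meant to check.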

rrwen

nbc4va:Bayes Classifier for Verbal Autopsy Data

An implementation of the Naive Bayes Classifier (NBC) algorithm used for Verbal Autopsy (VA), built on code from Miasnikof et al. (2015) <DOI:10.1186/s12916-015-0521-2>.

Maintained by Richard Wen. Last updated 3 years ago.

autopsy bayes cause classifier coded computer death estimate imputation learning machine mds million naive nbc probability study theory va verbal

1.5 match 4.60 score 79 scripts