Showing 200 of 328 total results
moosa-r
rbioapi:User-Friendly R Interface to Biologic Web Services' API
Currently fully supports Enrichr, JASPAR, miEAA, PANTHER, Reactome, STRING, and UniProt! The goal of rbioapi is to provide a user-friendly and consistent interface to biological databases and services, in a way that insulates the user from the technicalities of web service APIs and creates a unified and easy-to-use interface to biological and medical web services. This is an ongoing project; new databases and services will be added periodically. Feel free to suggest any databases or services you often use.
Maintained by Moosa Rezwani. Last updated 1 month ago.
api-clientbioinformaticsbiologyenrichmentenrichment-analysisenrichrjasparmieaaover-representation-analysispantherreactomestringuniprot
20.8 match 20 stars 7.60 score 55 scripts
eguidotti
calculus:High Dimensional Numerical and Symbolic Calculus
Efficient C++ optimized functions for numerical and symbolic calculus as described in Guidotti (2022) <doi:10.18637/jss.v104.i05>. It includes basic arithmetic, tensor calculus, Einstein summing convention, fast computation of the Levi-Civita symbol and generalized Kronecker delta, Taylor series expansion, multivariate Hermite polynomials, high-order derivatives, ordinary differential equations, differential operators (Gradient, Jacobian, Hessian, Divergence, Curl, Laplacian) and numerical integration in arbitrary orthogonal coordinate systems: cartesian, polar, spherical, cylindrical, parabolic or user defined by custom scale factors.
Maintained by Emanuele Guidotti. Last updated 2 years ago.
calculuscoordinate-systemscurldivergenceeinsteinfinite-differencegradienthermitehessianjacobianlaplaciannumerical-derivationnumerical-derivativesnumerical-differentiationsymbolic-computationsymbolic-differentiationtaylorcpp
14.0 match 47 stars 8.92 score 66 scripts 7 dependents
easystats
report:Automated Reporting of Results and Statistical Models
The aim of the 'report' package is to bridge the gap between R’s output and the formatted results contained in your manuscript. This package converts statistical models and data frames into textual reports suited for publication, ensuring standardization and quality in results reporting.
Maintained by Rémi Thériault. Last updated 1 month ago.
anovasapaautomated-report-generationautomaticbayesiandescribeeasystatshacktoberfestmanuscriptmodelsreportreportingreportsscientificstatsmodels
8.5 match 698 stars 14.48 score 1.1k scripts 3 dependents
igraph
igraph:Network Analysis and Visualization
Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.
Maintained by Kirill Müller. Last updated 12 hours ago.
complex-networksgraph-algorithmsgraph-theorymathematicsnetwork-analysisnetwork-graphfortranlibxml2glpkopenblascpp
5.4 match 582 stars 21.11 score 31k scripts 1.9k dependents
ropengov
regions:Processing Regional Statistics
Validating sub-national statistical typologies, re-coding across standard typologies of sub-national statistics, and making valid aggregate level imputation, re-aggregation, re-weighting and projection down to lower hierarchical levels to create meaningful data panels and time series.
Maintained by Daniel Antal. Last updated 2 years ago.
observatoryregionsropengovstatistics
12.8 match 12 stars 8.81 score 67 scripts 5 dependents
bioc
scviR:experimental interface from R to scvi-tools
This package defines interfaces from R to scvi-tools. A vignette works through the totalVI tutorial for analyzing CITE-seq data. Another vignette compares outputs of Chapter 12 of the OSCA book with analogous outputs based on totalVI quantifications. Future work will address other components of scvi-tools, with a focus on building understanding of probabilistic methods based on variational autoencoders.
Maintained by Vincent Carey. Last updated 5 months ago.
infrastructuresinglecelldataimportbioconductorcite-seqscverse
20.0 match 6 stars 5.60 score 11 scripts
ropensci
rcites:R Interface to the Species+ Database
A programmatic interface to the Species+ <https://speciesplus.net/> database via the Species+/CITES Checklist API <https://api.speciesplus.net/>.
Maintained by Kevin Cazelles. Last updated 2 years ago.
api-clientcitesdatabaseendangered-speciestrade
14.8 match 14 stars 6.52 score 26 scripts
doi-usgs
dataRetrieval:Retrieval Functions for USGS and EPA Hydrology and Water Quality Data
Collection of functions to help retrieve U.S. Geological Survey and U.S. Environmental Protection Agency water quality and hydrology data from web services. Data are discovered from National Water Information System <https://waterservices.usgs.gov/> and <https://waterdata.usgs.gov/nwis>. Water quality data are obtained from the Water Quality Portal <https://www.waterqualitydata.us/>.
Maintained by Laura DeCicco. Last updated 18 days ago.
5.3 match 280 stars 14.18 score 1.7k scripts 15 dependents
cboettig
knitcitations:Citations for 'Knitr' Markdown Files
Provides the ability to create dynamic citations in which the bibliographic information is pulled from the web rather than having to be entered into a local database such as 'bibtex' ahead of time. The package is primarily aimed at authoring in the R 'markdown' format, and can provide outputs for web-based authoring such as linked text for inline citations. Cite using a 'DOI', URL, or 'bibtex' file key. See the package URL for details.
Maintained by Carl Boettiger. Last updated 4 years ago.
7.0 match 220 stars 10.21 score 836 scripts 2 dependents
statnet
statnet.common:Common R Scripts and Utilities Used by the Statnet Project Software
Non-statistical utilities used by the software developed by the Statnet Project. They may also be of use to others.
Maintained by Pavel N. Krivitsky. Last updated 27 days ago.
6.0 match 8 stars 11.42 score 197 scripts 148 dependents
ropensci
RefManageR:Straightforward 'BibTeX' and 'BibLaTeX' Bibliography Management
Provides tools for importing and working with bibliographic references. It greatly enhances the 'bibentry' class by providing a class 'BibEntry' which stores 'BibTeX' and 'BibLaTeX' references, supports 'UTF-8' encoding, and can be easily searched by any field, by date ranges, and by various formats for name lists (author by last names, translator by full names, etc.). Entries can be updated, combined, sorted, printed in a number of styles, and exported. 'BibTeX' and 'BibLaTeX' '.bib' files can be read into 'R' and converted to 'BibEntry' objects. Interfaces to 'NCBI Entrez', 'CrossRef', and 'Zotero' are provided for importing references and references can be created from locally stored 'PDF' files using 'Poppler'. Includes functions for citing and generating a bibliography with hyperlinks for documents prepared with 'RMarkdown' or 'RHTML'.
Maintained by Mathew W. McLean. Last updated 4 months ago.
5.6 match 115 stars 12.06 score 2.3k scripts 16 dependents
rezakj
iCellR:Analyzing High-Throughput Single Cell Sequencing Data
A toolkit that allows scientists to work with data from single cell sequencing technologies such as scRNA-seq, scVDJ-seq, scATAC-seq, CITE-Seq and Spatial Transcriptomics (ST). Single (i) Cell R package ('iCellR') provides unprecedented flexibility at every step of the analysis pipeline, including normalization, clustering, dimensionality reduction, imputation, visualization, and so on. Users can design both unsupervised and supervised models to best suit their research. In addition, the toolkit provides 2D and 3D interactive visualizations, differential expression analysis, filters based on cells, genes and clusters, data merging, normalizing for dropouts, data imputation methods, correcting for batch differences, pathway analysis, tools to find marker genes for clusters and conditions, predict cell types and pseudotime analysis. See Khodadadi-Jamayran, et al (2020) <doi:10.1101/2020.05.05.078550> and Khodadadi-Jamayran, et al (2020) <doi:10.1101/2020.03.31.019109> for more details.
Maintained by Alireza Khodadadi-Jamayran. Last updated 8 months ago.
10xgenomics3dbatch-normalizationcell-type-classificationcite-seqclusteringclustering-algorithmdiffusion-mapsdropouticellrimputationintractive-graphnormalizationpseudotimescrna-seqscvdj-seqsingel-cell-sequencingumapcpp
9.7 match 121 stars 5.56 score 7 scripts 1 dependent
alexisvdb
singleCellHaystack:A Universal Differential Expression Prediction Tool for Single-Cell and Spatial Genomics Data
One key exploratory analysis step in single-cell genomics data analysis is the prediction of features with different activity levels. For example, we want to predict differentially expressed genes (DEGs) in single-cell RNA-seq data, spatial DEGs in spatial transcriptomics data, or differentially accessible regions (DARs) in single-cell ATAC-seq data. 'singleCellHaystack' predicts differentially active features in single cell omics datasets without relying on the clustering of cells into arbitrary clusters. 'singleCellHaystack' uses Kullback-Leibler divergence to find features (e.g., genes, genomic regions, etc) that are active in subsets of cells that are non-randomly positioned inside an input space (such as 1D trajectories, 2D tissue sections, multi-dimensional embeddings, etc). For the theoretical background of 'singleCellHaystack' we refer to our original paper Vandenbon and Diez (Nature Communications, 2020) <doi:10.1038/s41467-020-17900-3> and our update Vandenbon and Diez (Scientific Reports, 2023) <doi:10.1038/s41598-023-38965-2>.
Maintained by Alexis Vandenbon. Last updated 1 year ago.
bioinformaticscite-seqpseudotimescatac-seqsingle-cellspatial-proteomicsspatial-transcriptomicstranscriptomics
7.5 match 81 stars 6.71 score 64 scripts
bioc
recountmethylation:Access and analyze public DNA methylation array data compilations
Resources for cross-study analyses of public DNAm array data from NCBI GEO repo, produced using Illumina's Infinium HumanMethylation450K (HM450K) and MethylationEPIC (EPIC) platforms. Provided functions enable download, summary, and filtering of large compilation files. Vignettes detail background about file formats, example analyses, and more. Note the disclaimer on package load and consult the main manuscripts for further info.
Maintained by Sean K Maden. Last updated 5 months ago.
dnamethylationepigeneticsmicroarraymethylationarrayexperimenthub
7.5 match 9 stars 6.28 score 9 scripts
mathewchamberlain
SignacX:Cell Type Identification and Discovery from Single Cell Gene Expression Data
An implementation of neural networks trained with flow-sorted gene expression data to classify cellular phenotypes in single cell RNA-sequencing data. See Chamberlain M et al. (2021) <doi:10.1101/2021.02.01.429207> for more details.
Maintained by Mathew Chamberlain. Last updated 2 years ago.
cellular-phenotypesseuratsingle-cell-rna-seq
5.6 match 24 stars 6.46 score 34 scripts
bioc
beadarray:Quality assessment and low-level analysis for Illumina BeadArray data
The package is able to read bead-level data (raw TIFFs and text files) output by BeadScan as well as bead-summary data from BeadStudio. Methods for quality assessment and low-level analysis are provided.
Maintained by Mark Dunning. Last updated 5 months ago.
microarrayonechannelqualitycontrolpreprocessing
4.5 match 7.88 score 70 scripts 4 dependents
bioc
MuData:Serialization for MultiAssayExperiment Objects
Save MultiAssayExperiments to h5mu files supported by muon and mudata. Muon is a Python framework for multimodal omics data analysis. It uses an HDF5-based format for data storage.
Maintained by Ilia Kats. Last updated 20 days ago.
dataimportanndatabioconductormudatamulti-omicsmultimodal-omicsscrna-seq
6.0 match 5 stars 5.89 score 26 scripts
arnaudgallou
pakret:Cite 'R' Packages on the Fly in 'R Markdown' and 'Quarto'
References and cites 'R' and 'R' packages on the fly in 'R Markdown' and 'Quarto'. 'pakret' provides a minimalistic API that generates preformatted citations of 'R' and 'R' packages, and adds their reference to a '.bib' file directly from within your document.
Maintained by Arnaud Gallou. Last updated 18 days ago.
bibbibtexcitationcitationsgenerate
7.1 match 5 stars 4.51 score 5 scripts
cjvanlissa
worcs:Workflow for Open Reproducible Code in Science
Create reproducible and transparent research projects in 'R'. This package is based on the Workflow for Open Reproducible Code in Science (WORCS), a step-by-step procedure based on best practices for Open Science. It includes an 'RStudio' project template, several convenience functions, and all dependencies required to make your project reproducible and transparent. WORCS is explained in the tutorial paper by Van Lissa, Brandmaier, Brinkman, Lamprecht, Struiksma, & Vreede (2021). <doi:10.3233/DS-210031>.
Maintained by Caspar J. Van Lissa. Last updated 11 days ago.
3.3 match 83 stars 9.26 score 59 scripts
dormancy1
lefko3:Historical and Ahistorical Population Projection Matrix Analysis
Complete analytical environment for the construction and analysis of matrix population models and integral projection models. Includes the ability to construct historical matrices, which are 2d matrices comprising 3 consecutive times of demographic information. Estimates both raw and function-based forms of historical and standard ahistorical matrices. It also estimates function-based age-by-stage matrices and raw and function-based Leslie matrices.
Maintained by Richard P. Shefferson. Last updated 4 days ago.
9.0 match 3.30 score 11 scripts
vegandevs
vegan:Community Ecology Package
Ordination methods, diversity analysis and other functions for community and vegetation ecologists.
Maintained by Jari Oksanen. Last updated 16 days ago.
ecological-modellingecologyordinationfortranopenblas
1.5 match 472 stars 19.41 score 15k scripts 440 dependents
bioc
recount:Explore and download data from the recount project
Explore and download data from the recount project available at https://jhubiostatistics.shinyapps.io/recount/. Using the recount package you can download RangedSummarizedExperiment objects at the gene, exon or exon-exon junctions level, the raw counts, the phenotype metadata used, the urls to the sample coverage bigWig files or the mean coverage bigWig file for a particular study. The RangedSummarizedExperiment objects can be used by different packages for performing differential expression analysis. Using http://bioconductor.org/packages/derfinder you can perform annotation-agnostic differential expression analyses with the data from the recount project as described at http://www.nature.com/nbt/journal/v35/n4/full/nbt.3838.html.
Maintained by Leonardo Collado-Torres. Last updated 3 months ago.
coveragedifferentialexpressiongeneexpressionrnaseqsequencingsoftwaredataimportimmunooncologyannotation-agnosticbioconductorcountderfinderdeseq2exongenehumanilluminajunctionrecount
2.8 match 41 stars 9.57 score 498 scripts 3 dependents
geobosh
Rdpack:Update and Manipulate Rd Documentation Objects
Functions for manipulation of R documentation objects, including functions reprompt() and ereprompt() for updating 'Rd' documentation for functions, methods and classes; 'Rd' macros for citations and import of references from 'bibtex' files for use in 'Rd' files and 'roxygen2' comments; 'Rd' macros for evaluating and inserting snippets of 'R' code and the results of its evaluation or creating graphics on the fly; and many functions for manipulation of references and Rd files.
Maintained by Georgi N. Boshnakov. Last updated 11 hours ago.
bibtexbibtex-referencescitationsdocumentationrd-formatroxygen2
1.9 match 30 stars 13.76 score 73 scripts 2.3k dependents
svmiller
stevemisc:Steve's Miscellaneous Functions
These are miscellaneous functions that I find useful for my research and teaching. The contents include themes for plots, functions for simulating quantities of interest from regression models, functions for simulating various forms of fake data for instructional/research purposes, and many more. All told, the functions provided here are broadly useful for data organization, data presentation, data recoding, and data simulation.
Maintained by Steve Miller. Last updated 6 days ago.
dplyrmixed-effects-modelsmultivariate-normal-distributiontidyverse
3.8 match 10 stars 6.85 score 392 scripts 2 dependents
crsh
rmdfiltr:'Lua'-Filters for R Markdown
A collection of 'Lua' filters that extend the functionality of R Markdown templates (e.g., count words or post-process citations).
Maintained by Frederik Aust. Last updated 5 months ago.
3.1 match 42 stars 8.08 score 4 scripts 3 dependents
waldronlab
SingleCellMultiModal:Integrating Multi-modal Single Cell Experiment datasets
SingleCellMultiModal is an ExperimentHub package that serves multiple datasets obtained from GEO and other sources and represents them as MultiAssayExperiment objects. We provide several multi-modal datasets including scNMT, 10X Multiome, seqFISH, CITEseq, SCoPE2, and others. The scope of the package is to provide data for benchmarking and analysis. To cite, use the 'citation' function and see <https://doi.org/10.1371/journal.pcbi.1011324>.
Maintained by Marcel Ramos. Last updated 4 months ago.
experimentdatasinglecelldatareproducibleresearchexperimenthubgeobioconductor-packageu24ca289073
3.3 match 17 stars 7.29 score 60 scripts
bioc
lute:Framework for cell size scale factor normalized bulk transcriptomics deconvolution experiments
Provides a framework for adjustment on cell type size when performing bulk transcriptomics deconvolution. The main framework function provides a means of reference normalization using cell size scale factors. It allows for marker selection and deconvolution using non-negative least squares (NNLS) by default. The framework is extensible for other marker selection and deconvolution algorithms, and users may reuse the generics, methods, and classes for these when developing new algorithms.
Maintained by Sean K Maden. Last updated 5 months ago.
rnaseqsequencingsinglecellcoveragetranscriptomicsnormalization
4.5 match 2 stars 5.26 score 3 scripts
ropensci
rgbif:Interface to the Global Biodiversity Information Facility API
A programmatic interface to the Web Service methods provided by the Global Biodiversity Information Facility (GBIF; <https://www.gbif.org/developer/summary>). GBIF is a database of species occurrence records from sources all over the globe. rgbif includes functions for searching for taxonomic names, retrieving information on data providers, getting species occurrence records, getting counts of occurrence records, and using the GBIF tile map service to make rasters summarizing huge amounts of data.
Maintained by John Waller. Last updated 3 days ago.
gbifspecimensapiweb-servicesoccurrencesspeciestaxonomybiodiversitydatalifewatchoscibiospocc
1.8 match 161 stars 13.26 score 2.1k scripts 20 dependents
ropensci
tidypmc:Parse Full Text XML Documents from PubMed Central
Parse XML documents from the Open Access subset of Europe PubMed Central <https://europepmc.org> including section paragraphs, tables, captions and references.
Maintained by Chris Stubben. Last updated 5 years ago.
3.8 match 33 stars 5.95 score 27 scripts
bioc
MultiAssayExperiment:Software for the integration of multi-omics experiments in Bioconductor
Harmonize data management of multiple experimental assays performed on an overlapping set of specimens. It provides a familiar Bioconductor user experience by extending concepts from SummarizedExperiment, supporting an open-ended mix of standard data classes for individual assays, and allowing subsetting by genomic ranges or rownames. Facilities are provided for reshaping data into wide and long formats for adaptability to graphing and downstream analysis.
Maintained by Marcel Ramos. Last updated 2 months ago.
infrastructuredatarepresentationbioconductorbioconductor-packagegenomicsnci-itcrtcgau24ca289073
1.5 match 71 stars 14.95 score 670 scripts 127 dependents
bioc
CiteFuse:CiteFuse: multi-modal analysis of CITE-seq data
The CiteFuse package implements a suite of methods and tools for CITE-seq data from pre-processing to integrative analytics, including doublet detection, network-based modality integration, cell type clustering, differential RNA and protein expression analysis, ADT evaluation, ligand-receptor interaction analysis, and interactive web-based visualisation of the analyses.
Maintained by Yingxin Lin. Last updated 5 months ago.
singlecellgeneexpressionbioinformaticssingle-cellcpp
3.4 match 27 stars 6.59 score 18 scripts
crsh
papaja:Prepare American Psychological Association Journal Articles with R Markdown
Tools to create dynamic, submission-ready manuscripts, which conform to American Psychological Association manuscript guidelines. We provide R Markdown document formats for manuscripts (PDF and Word) and revision letters (PDF). Helper functions facilitate reporting statistical analyses or create publication-ready tables and plots.
Maintained by Frederik Aust. Last updated 18 days ago.
apaapa-guidelinesjournalmanuscriptpsychologyreproducible-paperreproducible-researchrmarkdown
1.9 match 662 stars 11.74 score 1.7k scripts 1 dependent
bioc
GEOquery:Get data from NCBI Gene Expression Omnibus (GEO)
The NCBI Gene Expression Omnibus (GEO) is a public repository of microarray data. Given the rich and varied nature of this resource, it is only natural to want to apply BioConductor tools to these data. GEOquery is the bridge between GEO and BioConductor.
Maintained by Sean Davis. Last updated 5 months ago.
microarraydataimportonechanneltwochannelsagebioconductorbioinformaticsdata-sciencegenomicsncbi-geo
1.5 match 92 stars 14.46 score 4.1k scripts 44 dependents
cols4all
cols4all:Colors for all
Color palettes for all people, including those with color vision deficiency. Popular color palette series have been organized by type and have been scored on several properties such as color-blind-friendliness and fairness (i.e. do colors stand out equally?). Users can also load and analyse their own palettes. Besides the common palette types (categorical, sequential, and diverging) it also includes cyclic and bivariate color palettes. Furthermore, a color for missing values is assigned to each palette.
Maintained by Martijn Tennekes. Last updated 2 months ago.
2.0 match 343 stars 9.98 score 26 dependents
wadpac
GGIR:Raw Accelerometer Data Analysis
A tool to process and analyse data collected with wearable raw acceleration sensors as described in Migueles and colleagues (JMPB 2019), and van Hees and colleagues (JApplPhysiol 2014; PLoSONE 2015). The package has been developed and tested for binary data from 'GENEActiv' <https://activinsights.com/>, binary (.gt3x) and .csv-export data from 'Actigraph' <https://theactigraph.com> devices, and binary (.cwa) and .csv-export data from 'Axivity' <https://axivity.com>. These devices are currently widely used in research on human daily physical activity. Further, the package can handle accelerometer data files from any other sensor brand, provided that the data are stored in csv format. The package also allows for external function embedding.
Maintained by Vincent T van Hees. Last updated 2 days ago.
accelerometeractivity-recognitioncircadian-rhythmmovement-sensorsleep
1.5 match 109 stars 13.20 score 342 scripts 3 dependents
mpierrejean
jointseg:Joint Segmentation of Multivariate (Copy Number) Signals
Methods for fast segmentation of multivariate signals into piecewise constant profiles and for generating realistic copy-number profiles. A typical application is the joint segmentation of total DNA copy numbers and allelic ratios obtained from Single Nucleotide Polymorphism (SNP) microarrays in cancer studies. The methods are described in Pierre-Jean, Rigaill and Neuvial (2015) <doi:10.1093/bib/bbu026>.
Maintained by Morgane Pierre-Jean. Last updated 6 years ago.
3.0 match 6 stars 6.50 score 44 scripts 2 dependents
ngreifer
cobalt:Covariate Balance Tables and Plots
Generate balance tables and plots for covariates of groups preprocessed through matching, weighting or subclassification, for example, using propensity scores. Includes integration with 'MatchIt', 'WeightIt', 'MatchThem', 'twang', 'Matching', 'optmatch', 'CBPS', 'ebal', 'cem', 'sbw', and 'designmatch' for assessing balance on the output of their preprocessing functions. Users can also specify data for balance assessment not generated through the above packages. Also included are methods for assessing balance in clustered or multiply imputed data sets or data sets with multi-category, continuous, or longitudinal treatments.
Maintained by Noah Greifer. Last updated 11 months ago.
causal-inferencepropensity-scores
1.5 match 75 stars 12.98 score 1.0k scripts 8 dependents
tconwell
html5:Creates Valid HTML5 Strings
Generates valid HTML tag strings for HTML5 elements documented by Mozilla. Attributes are passed as named lists, with names being the attribute name and values being the attribute value. Attribute values are automatically double-quoted. To declare a DOCTYPE, wrap html() with function doctype(). Mozilla's documentation for HTML5 is available here: <https://developer.mozilla.org/en-US/docs/Web/HTML/Element>. Elements marked as obsolete are not included.
Maintained by Timothy Conwell. Last updated 2 years ago.
5.2 match 1 star 3.65 score 1 script 3 dependents
data-cleaning
validate:Data Validation Infrastructure
Declare data validation rules and data quality indicators; confront data with them and analyze or visualize the results. The package supports rules that are per-field, in-record, cross-record or cross-dataset. Rules can be automatically analyzed for rule type and connectivity. Supports checks implied by an SDMX DSD file as well. See also Van der Loo and De Jonge (2018) <doi:10.1002/9781118897126>, Chapter 6 and the JSS paper (2021) <doi:10.18637/jss.v097.i10>.
Maintained by Mark van der Loo. Last updated 12 days ago.
1.5 match 418 stars 12.50 score 448 scripts 9 dependents
lcrawlab
smer:Sparse Marginal Epistasis Test
The Sparse Marginal Epistasis Test is a computationally efficient genetics method which detects statistical epistasis in complex traits; see Stamp et al. (2025, <doi:10.1101/2025.01.11.632557>) for details.
Maintained by Julian Stamp. Last updated 2 months ago.
genomewideassociationepistasisgeneticssnplinearmixedmodelcppepistasis-analysisepistatisgwasgwas-toolsmapitzlibcppopenmp
3.8 match 1 star 4.95 score 8 scripts
bioc
sccomp:Tests differences in cell-type proportion for single-cell data, robust to outliers
A robust and outlier-aware method for testing differences in cell-type proportion in single-cell data. This model can infer changes in tissue composition and heterogeneity, and can produce realistic data simulations based on any existing dataset. This model can also transfer knowledge from a large set of integrated datasets to increase accuracy further.
Maintained by Stefano Mangiola. Last updated 1 day ago.
bayesianregressiondifferentialexpressionsinglecellmetagenomicsflowcytometryspatialbatch-correctioncompositioncytofdifferential-proportionmicrobiomemultilevelproportionsrandom-effectssingle-cellunwanted-variation
2.2 match 99 stars 8.43 score 69 scripts
bioc
IsoformSwitchAnalyzeR:Identify, Annotate and Visualize Isoform Switches with Functional Consequences from both short- and long-read RNA-seq data
Analysis of alternative splicing and isoform switches with predicted functional consequences (e.g. gain/loss of protein domains etc.) from quantification of all types of RNASeq by tools such as Kallisto, Salmon, StringTie, Cufflinks/Cuffdiff etc.
Maintained by Kristoffer Vitting-Seerup. Last updated 5 months ago.
geneexpressiontranscriptionalternativesplicingdifferentialexpressiondifferentialsplicingvisualizationstatisticalmethodtranscriptomevariantbiomedicalinformaticsfunctionalgenomicssystemsbiologytranscriptomicsrnaseqannotationfunctionalpredictiongenepredictiondataimportmultiplecomparisonbatcheffectimmunooncology
2.0 match 108 stars 9.26 score 125 scripts
florianhartig
DHARMa:Residual Diagnostics for Hierarchical (Multi-Level / Mixed) Regression Models
The 'DHARMa' package uses a simulation-based approach to create readily interpretable scaled (quantile) residuals for fitted (generalized) linear mixed models. Currently supported are linear and generalized linear (mixed) models from 'lme4' (classes 'lmerMod', 'glmerMod'), 'glmmTMB', 'GLMMadaptive', and 'spaMM'; phylogenetic linear models from 'phylolm' (classes 'phylolm' and 'phyloglm'); generalized additive models ('gam' from 'mgcv'); 'glm' (including 'negbin' from 'MASS', but excluding quasi-distributions) and 'lm' model classes. Moreover, externally created simulations, e.g. posterior predictive simulations from Bayesian software such as 'JAGS', 'STAN', or 'BUGS' can be processed as well. The resulting residuals are standardized to values between 0 and 1 and can be interpreted as intuitively as residuals from a linear regression. The package also provides a number of plot and test functions for typical model misspecification problems, such as over/underdispersion, zero-inflation, and residual spatial, phylogenetic and temporal autocorrelation.
Maintained by Florian Hartig. Last updated 12 days ago.
glmmregressionregression-diagnosticsresidual
1.3 match 226 stars 14.74 score 2.8k scripts 10 dependents
christophergandrud
repmis:Miscellaneous Tools for Reproducible Research
Tools to load 'R' packages and automatically generate BibTeX files citing them as well as load and cache plain-text and 'Excel' formatted data stored on 'GitHub', and from other sources.
Maintained by Christopher Gandrud. Last updated 9 years ago.
2.3 match 24 stars 7.54 score 394 scripts 10 dependents
bioc
minfi:Analyze Illumina Infinium DNA methylation arrays
Tools to analyze & visualize Illumina Infinium methylation arrays.
Maintained by Kasper Daniel Hansen. Last updated 4 months ago.
immunooncologydnamethylationdifferentialmethylationepigeneticsmicroarraymethylationarraymultichanneltwochanneldataimportnormalizationpreprocessingqualitycontrol
1.3 match 60 stars 12.83 score 996 scripts 26 dependents
lazappi
clustree:Visualise Clusterings at Different Resolutions
Deciding what resolution to use can be a difficult question when approaching a clustering analysis. One way to approach this problem is to look at how samples move as the number of clusters increases. This package allows you to produce clustering trees, a visualisation for interrogating clusterings as resolution increases.
Maintained by Luke Zappia. Last updated 1 year ago.
clusteringclustering-treesvisualisationvisualization
1.5 match 219 stars 11.40 score 1.9k scripts 5 dependents
stencila
stencilaschema:Bindings for Stencila Schema
Provides R bindings for the Stencila Schema <https://schema.stenci.la>. This package is primarily aimed at R developers wanting to programmatically generate, or modify, executable documents.
Maintained by Nokome Bentley. Last updated 3 years ago.
json-schemapythonrustschema-orgsemantictypescriptvocabulary
3.3 match 17 stars 4.93 score 2 scripts
t-kalinowski
keras:R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks 'API'. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.
Maintained by Tomasz Kalinowski. Last updated 11 months ago.
1.5 match 10.82 score 10k scripts 54 dependents
ropenspain
spanishoddata:Get Spanish Origin-Destination Data
Gain seamless access to origin-destination (OD) data from the Spanish Ministry of Transport, hosted at <https://www.transportes.gob.es/ministerio/proyectos-singulares/estudios-de-movilidad-con-big-data/opendata-movilidad>. This package simplifies the management of these large datasets by providing tools to download zone boundaries, handle associated origin-destination data, and process it efficiently with the 'duckdb' database interface. Local caching minimizes repeated downloads, streamlining workflows for researchers and analysts. Extensive documentation is available at <https://ropenspain.github.io/spanishoddata/index.html>, offering guides on creating static and dynamic mobility flow visualizations and transforming large datasets into analysis-ready formats.
Maintained by Egor Kotov. Last updated 7 days ago.
cdrdatadata-packagemobile-telephone-datamobilityorigin-destination
2.0 match 35 stars 7.89 score 14 scripts
CausalImpact:Inferring Causal Effects using Bayesian Structural Time-Series Models
Implements a Bayesian approach to causal impact estimation in time series, as described in Brodersen et al. (2015) <DOI:10.1214/14-AOAS788>. See the package documentation on GitHub <https://google.github.io/CausalImpact/> to get started.
Maintained by Alain Hauser. Last updated 2 years ago.
1.3 match 1.7k stars 11.73 score 276 scripts 2 dependents
bioc
PhyloProfile:PhyloProfile
PhyloProfile is a tool for exploring complex phylogenetic profiles. Phylogenetic profiles, presence/absence patterns of genes over a set of species, are commonly used to trace the functional and evolutionary history of genes across species and time. With PhyloProfile we can enrich regular phylogenetic profiles with further data like sequence/structure similarity, to make phylogenetic profiling more meaningful. Besides the interactive visualisation powered by R-Shiny, the package offers a set of further analysis features to gain insights like the gene age estimation or core gene identification.
Maintained by Vinh Tran. Last updated 7 days ago.
softwarevisualizationdatarepresentationmultiplecomparisonfunctionalpredictiondimensionreductionbioinformaticsheatmapinteractive-visualizationsorthologsphylogenetic-profileshiny
2.0 match 33 stars 7.77 score 10 scripts
bioc
QUBIC:An R package for qualitative biclustering in support of gene co-expression analyses
The core function of this R package is to provide the implementation of the well-cited and well-reviewed QUBIC algorithm, aiming to deliver an effective and efficient biclustering capability. This package also includes the following related functions: (i) a qualitative representation of the input gene expression data, obtained through a discretization method that considers the underlying data properties, which can be used directly in other biclustering programs; (ii) visualization of identified biclusters using heatmaps in support of overall expression pattern analysis; (iii) bicluster-based co-expression network elucidation and visualization, where different correlation coefficient scores between a pair of genes are provided; and (iv) a generalized output format for biclusters and the corresponding networks, which can be freely downloaded so that a user can easily carry out subsequent comprehensive functional enrichment analysis (e.g. DAVID) and advanced network visualization (e.g. Cytoscape).
Maintained by Yu Zhang. Last updated 5 months ago.
statisticalmethodmicroarraydifferentialexpressionmultiplecomparisonclusteringvisualizationgeneexpressionnetworkbioconductor-packagebioconductor-packagescppopenmp
2.5 match 3 stars 6.10 score 14 scripts 1 dependents
silvadenisson
electionsBR:R Functions to Download and Clean Brazilian Electoral Data
Offers a set of functions to easily download and clean Brazilian electoral data from the Superior Electoral Court and 'CepespData' websites. Among other features, the package retrieves data on local and federal elections for all positions (city councilor, mayor, state deputy, federal deputy, governor, and president) aggregated by state, city, and electoral zones.
Maintained by Denisson Silva. Last updated 4 months ago.
2.0 match 65 stars 7.54 score 66 scripts
l-ramirez-lopez
prospectr:Miscellaneous Functions for Processing and Sample Selection of Spectroscopic Data
Functions to preprocess spectroscopic data and conduct (representative) sample selection/calibration sampling.
Maintained by Leonardo Ramirez-Lopez. Last updated 11 days ago.
chemometricsderivativesinfrarednear-infrarednirpedometricspreprocessingresamplesamplingsignalsoil-spectroscopyspectroscopyopenblascppopenmp
1.5 match 42 stars 10.00 score 326 scripts 4 dependents
ropengov
helsinki:R Tools for Helsinki Open Data
Tools for accessing various open data APIs in the Helsinki region in Finland. Current data sources include the Service Map API, Linked Events API, and Helsinki Region Infoshare statistics API.
Maintained by Juuso Parkkinen. Last updated 2 years ago.
ropengovfinlandhelsinkihelsinki-region
2.8 match 6 stars 5.28 score 21 scripts
daniel1noble
metaDigitise:Extract and Summarise Data from Published Figures
High-throughput, flexible and reproducible extraction of data from figures in primary research papers. metaDigitise() can extract data and / or automatically calculate summary statistics for users from box plots, bar plots (e.g., mean and errors), scatter plots and histograms.
Maintained by Daniel Noble. Last updated 9 months ago.
2.3 match 82 stars 6.46 score 35 scripts
pakillo
grateful:Facilitate Citation of R Packages
Facilitates the citation of R packages used in analysis projects. Scans project for packages used, gets their citations, and produces a document with citations in the preferred bibliography format, ready to be pasted into reports or manuscripts. Alternatively, 'grateful' can be used directly within an 'R Markdown' or 'Quarto' document.
Maintained by Francisco Rodriguez-Sanchez. Last updated 2 days ago.
citation-generatorsoftware-citation
1.8 match 229 stars 8.04 score 269 scripts
ropengov
pxweb:R Interface to PXWEB APIs
Generic interface for the PX-Web/PC-Axis API. The PX-Web/PC-Axis API is used by organizations such as Statistics Sweden and Statistics Finland to disseminate data. The R package can interact with all PX-Web/PC-Axis APIs to fetch information about the data hierarchy, extract metadata, and extract and parse statistics into R data.frame format. PX-Web is a solution for disseminating PC-Axis data files in dynamic tables on the web. Since 2013, PX-Web has included an API to disseminate PC-Axis files.
Maintained by Mans Magnusson. Last updated 1 years ago.
1.9 match 66 stars 7.67 score 2 dependents
ndphillips
FFTrees:Generate, Visualise, and Evaluate Fast-and-Frugal Decision Trees
Create, visualize, and test fast-and-frugal decision trees (FFTs) using the algorithms and methods described by Phillips, Neth, Woike & Gaissmaier (2017), <doi:10.1017/S1930297500006239>. FFTs are simple and transparent decision trees for solving binary classification problems. FFTs can be preferable to more complex algorithms because they require very little information, are easy to understand and communicate, and are robust against overfitting.
Maintained by Hansjoerg Neth. Last updated 5 months ago.
1.5 match 135 stars 9.58 score 144 scripts
r-barnes
dggridR:Discrete Global Grids
Spatial analyses involving binning require that every bin have the same area, but this is impossible using a rectangular grid laid over the Earth or over any projection of the Earth. Discrete global grids use hexagons, triangles, and diamonds to overcome this issue, overlaying the Earth with equally-sized bins. This package provides utilities for working with discrete global grids, along with utilities to aid in plotting such data.
Maintained by Sebastian Krantz. Last updated 6 months ago.
discrete-global-gridsgeospatialspatial-analysiscpp
1.5 match 168 stars 9.37 score 388 scripts 1 dependents
eguidotti
bidask:Efficient Estimation of Bid-Ask Spreads from Open, High, Low, and Close Prices
Implements the efficient estimator of bid-ask spreads from open, high, low, and close prices described in Ardia, Guidotti, & Kroencke (JFE, 2024) <doi:10.1016/j.jfineco.2024.103916>. It also provides an implementation of the estimators described in Roll (JF, 1984) <doi:10.1111/j.1540-6261.1984.tb03897.x>, Corwin & Schultz (JF, 2012) <doi:10.1111/j.1540-6261.2012.01729.x>, and Abdi & Ranaldo (RFS, 2017) <doi:10.1093/rfs/hhx084>.
Maintained by Emanuele Guidotti. Last updated 19 days ago.
2.0 match 107 stars 6.98 score 6 scripts
ropensci
rredlist:'IUCN' Red List Client
'IUCN' Red List (<https://api.iucnredlist.org/>) client. The 'IUCN' Red List is a global list of threatened and endangered species. Functions cover all of the Red List 'API' routes. An 'API' key is required.
Maintained by William Gearty. Last updated 1 months ago.
iucnbiodiversityapiweb-servicestraitshabitatspeciesconservationapi-wrapperiucn-red-listtaxize
1.2 match 53 stars 11.49 score 195 scripts 24 dependents
january3
tmod:Feature Set Enrichment Analysis for Metabolomics and Transcriptomics
Methods and feature set definitions for feature or gene set enrichment analysis in transcriptional and metabolic profiling data. Package includes tests for enrichment based on ranked lists of features, functions for visualisation and multivariate functional analysis. See Zyla et al (2019) <doi:10.1093/bioinformatics/btz447>.
Maintained by January Weiner. Last updated 2 months ago.
2.0 match 3 stars 6.88 score 168 scripts 1 dependents
bioc
bambu:Context-Aware Transcript Quantification from Long Read RNA-Seq data
bambu is an R package for multi-sample transcript discovery and quantification using long read RNA-Seq data. You can use bambu after read alignment to obtain expression estimates for known and novel transcripts and genes. The output from bambu can be used directly for visualisation and downstream analysis such as differential gene expression or transcript usage.
Maintained by Ying Chen. Last updated 1 months ago.
alignmentcoveragedifferentialexpressionfeatureextractiongeneexpressiongenomeannotationgenomeassemblyimmunooncologylongreadmultiplecomparisonnormalizationrnaseqregressionsequencingsoftwaretranscriptiontranscriptomicsbambubioconductorlong-readsnanoporenanopore-sequencingrna-seqrna-seq-analysistranscript-quantificationtranscript-reconstructioncpp
1.5 match 197 stars 9.03 score 91 scripts 1 dependents
bioc
RaggedExperiment:Representation of Sparse Experiments and Assays Across Samples
This package provides a flexible representation of copy number, mutation, and other data that fit into the ragged array schema for genomic location data. The basic representation of such data provides a rectangular flat table interface to the user with range information in the rows and samples/specimen in the columns. The RaggedExperiment class derives from a GRangesList representation and provides a semblance of a rectangular dataset.
Maintained by Marcel Ramos. Last updated 4 months ago.
infrastructuredatarepresentationcopynumbercore-packagedata-structuremutationsu24ca289073
1.5 match 4 stars 8.96 score 76 scripts 15 dependents
mandymejia
ciftiTools:Tools for Reading, Writing, Viewing and Manipulating CIFTI Files
CIFTI files contain brain imaging data in "grayordinates," which represent the gray matter as cortical surface vertices (left and right) and subcortical voxels (cerebellum, basal ganglia, and other deep gray matter). 'ciftiTools' provides a unified environment for reading, writing, visualizing and manipulating CIFTI-format data. It supports the "dscalar," "dlabel," and "dtseries" intents. Grayordinate data is read in as a "xifti" object, which is structured for convenient access to the data and metadata, and includes support for surface geometry files to enable spatially-dependent functionality such as static or interactive visualizations and smoothing.
Maintained by Amanda Mejia. Last updated 2 months ago.
1.5 match 47 stars 8.90 score 176 scripts 4 dependents
jacolien
itsadug:Interpreting Time Series and Autocorrelated Data Using GAMMs
GAMM (Generalized Additive Mixed Modeling; Lin & Zhang, 1999) as implemented in the R package 'mgcv' (Wood, S.N., 2006; 2011) is a nonlinear regression analysis which is particularly useful for time course data such as EEG, pupil dilation, gaze data (eye tracking), and articulography recordings, but also for behavioral data such as reaction times and response data. As time course measures are sensitive to autocorrelation problems, GAMMs implement methods to reduce these problems. This package includes functions for the evaluation of GAMM models (e.g., model comparisons, determining regions of significance, inspection of autocorrelational structure in residuals) and the interpretation of GAMMs (e.g., visualization of complex interactions, and contrasts).
Maintained by Jacolien van Rij. Last updated 3 years ago.
2.0 match 6.51 score 576 scripts 2 dependents
bioc
SPIAT:Spatial Image Analysis of Tissues
SPIAT (Spatial Image Analysis of Tissues) is an R package with a suite of data processing, quality control, visualization and data analysis tools. SPIAT is compatible with data generated from single-cell spatial proteomics platforms (e.g. OPAL, CODEX, MIBI, cellprofiler). SPIAT reads spatial data in the form of X and Y coordinates of cells, marker intensities and cell phenotypes. SPIAT includes six analysis modules that allow visualization, calculation of cell colocalization, categorization of the immune microenvironment relative to tumor areas, analysis of cellular neighborhoods, and the quantification of spatial heterogeneity, providing a comprehensive toolkit for spatial data analysis.
Maintained by Yuzhou Feng. Last updated 15 hours ago.
biomedicalinformaticscellbiologyspatialclusteringdataimportimmunooncologyqualitycontrolsinglecellsoftwarevisualization
1.5 match 22 stars 8.59 score 69 scripts
ropensci
lingtypology:Linguistic Typology and Mapping
Provides R with the Glottolog database <https://glottolog.org/> and some more abilities for purposes of linguistic mapping. The Glottolog database contains the catalogue of languages of the world. This package helps researchers to make linguistic maps, using the philosophy of the Cross-Linguistic Linked Data project <https://clld.org/>, which allows for while at the same time facilitating uniform access to the data across publications. A tutorial for this package is available on GitHub pages <https://docs.ropensci.org/lingtypology/> and in the package vignette. Maps created by this package can be used both for investigation and for linguistic teaching. In addition, the package provides the ability to download data from typological databases such as WALS, AUTOTYP and some others, and to create your own database website.
Maintained by George Moroz. Last updated 5 months ago.
abvdafboatlasautotypebivaltypclldglottolog-databaselinguistic-mapslinguisticsphoiblesailstypologywals
1.3 match 51 stars 9.58 score 694 scripts
florianhartig
BayesianTools:General-Purpose MCMC and SMC Samplers and Tools for Bayesian Statistics
General-purpose MCMC and SMC samplers, as well as plots and diagnostic functions for Bayesian statistics, with a particular focus on calibrating complex system models. Implemented samplers include various Metropolis MCMC variants (including adaptive and/or delayed rejection MH), the T-walk, two differential evolution MCMCs, two DREAM MCMCs, and a sequential Monte Carlo (SMC) particle filter.
Maintained by Florian Hartig. Last updated 1 years ago.
bayesecological-modelsmcmcoptimizationsmcsystems-biologycpp
1.3 match 122 stars 10.17 score 580 scripts 5 dependents
bioc
lefser:R implementation of the LEfSE method for microbiome biomarker discovery
lefser is the R implementation of the popular microbiome biomarker discovery tool, LEfSe. It uses the Kruskal-Wallis test, Wilcoxon-Rank Sum test, and Linear Discriminant Analysis to find biomarkers from two-level classes (and optional sub-classes).
Maintained by Sehyun Oh. Last updated 26 days ago.
softwaresequencingdifferentialexpressionmicrobiomestatisticalmethodclassificationbioconductor-package
1.5 match 55 stars 8.47 score 56 scripts
bioc
mpra:Analyze massively parallel reporter assays
Tools for data management, count preprocessing, and differential analysis in massively parallel reporter assays (MPRA).
Maintained by Leslie Myint. Last updated 5 months ago.
softwaregeneregulationsequencingfunctionalgenomics
2.0 match 6 stars 6.28 score 15 scripts
bioc
derfinder:Annotation-agnostic differential expression analysis of RNA-seq data at base-pair resolution via the DER Finder approach
This package provides functions for annotation-agnostic differential expression analysis of RNA-seq data. Two implementations of the DER Finder approach are included in this package: (1) single base-level F-statistics and (2) DER identification at the expressed regions-level. The DER Finder approach can also be used to identify differentially bound ChIP-seq peaks.
Maintained by Leonardo Collado-Torres. Last updated 3 months ago.
differentialexpressionsequencingrnaseqchipseqdifferentialpeakcallingsoftwareimmunooncologycoverageannotation-agnosticbioconductorderfinder
1.3 match 42 stars 10.03 score 78 scripts 6 dependents
bioc
RAIDS:Accurate Inference of Genetic Ancestry from Cancer Sequences
This package implements specialized algorithms that enable genetic ancestry inference from various cancer sequence sources (RNA, Exome and Whole-Genome sequences). This package also implements a simulation algorithm that generates synthetic cancer-derived data. This code and analysis pipeline was designed and developed for the following publication: Belleau, P et al. Genetic Ancestry Inference from Cancer-Derived Molecular Data across Genomic and Transcriptomic Platforms. Cancer Res 1 January 2023; 83 (1): 49–58.
Maintained by Pascal Belleau. Last updated 5 months ago.
geneticssoftwaresequencingwholegenomeprincipalcomponentgeneticvariabilitydimensionreductionbiocviewsancestrycancer-genomicsexome-sequencinggenomicsinferencer-languagerna-seqrna-sequencingwhole-genome-sequencing
2.0 match 5 stars 6.23 score 19 scripts
bioc
made4:Multivariate analysis of microarray data using ADE4
Multivariate data analysis and graphical display of microarray data. Functions include supervised dimension reduction (between group analysis) and joint dimension reduction of 2 datasets (coinertia analysis). It contains functions that require the R package ade4.
Maintained by Aedin Culhane. Last updated 5 months ago.
clusteringclassificationdimensionreductionprincipalcomponenttranscriptomicsmultiplecomparisongeneexpressionsequencingmicroarray
2.0 match 6.11 score 107 scripts 2 dependents
robindenz1
adjustedCurves:Confounder-Adjusted Survival Curves and Cumulative Incidence Functions
Estimate and plot confounder-adjusted survival curves using either 'Direct Adjustment', 'Direct Adjustment with Pseudo-Values', various forms of 'Inverse Probability of Treatment Weighting', two forms of 'Augmented Inverse Probability of Treatment Weighting', 'Empirical Likelihood Estimation' or 'Targeted Maximum Likelihood Estimation'. Also includes a significance test for the difference between two adjusted survival curves and the calculation of adjusted restricted mean survival times. Additionally enables the user to estimate and plot cause-specific confounder-adjusted cumulative incidence functions in the competing risks setting using the same methods (with some exceptions). For details, see Denz et al. (2023) <doi:10.1002/sim.9681>.
Maintained by Robin Denz. Last updated 29 days ago.
adjustedconfidence-intervalscumulative-incidencesurvival-curves
1.5 match 38 stars 8.12 score 93 scripts
skoval
RISmed:Download Content from NCBI Databases
A set of tools to extract bibliographic content from the National Center for Biotechnology Information (NCBI) databases, including PubMed. The name RISmed is a portmanteau of RIS (for Research Information Systems, a common tag format for bibliographic data) and PubMed.
Maintained by Stephanie Kovalchik. Last updated 3 years ago.
1.8 match 38 stars 6.94 score 252 scripts 3 dependents
anestistouloumis
SimCorMultRes:Simulates Correlated Multinomial Responses
Simulates correlated multinomial responses conditional on a marginal model specification.
Maintained by Anestis Touloumis. Last updated 12 months ago.
binarylongitudinal-studiesmultinomialsimulation
2.0 match 7 stars 6.04 score 26 scripts 2 dependents
molinlab
Holomics:A User-Friendly R 'shiny' Application for Multi-Omics Data Integration and Analysis
A 'shiny' application that allows you to perform single- and multi-omics analyses using your own omics datasets. After the upload of the omics datasets and a metadata file, single-omics analysis is performed for feature selection and dataset reduction. These datasets are used for pairwise- and multi-omics analyses, where automatic tuning is done to identify correlations between the datasets - the end goal of the recommended 'Holomics' workflow. Methods used in the package were implemented in the package 'mixomics' by Florian Rohart, Benoît Gautier, Amrit Singh, Kim-Anh Lê Cao (2017) <doi:10.1371/journal.pcbi.1005752> and are described there in further detail.
Maintained by Katharina Munk. Last updated 9 months ago.
2.2 match 7 stars 5.45 score 7 scripts
bioc
Rcpi:Molecular Informatics Toolkit for Compound-Protein Interaction in Drug Discovery
A molecular informatics toolkit with an integration of bioinformatics and chemoinformatics tools for drug discovery.
Maintained by Nan Xiao. Last updated 5 months ago.
softwaredataimportdatarepresentationfeatureextractioncheminformaticsbiomedicalinformaticsproteomicsgosystemsbiologybioconductorbioinformaticsdrug-discoveryfeature-extractionfingerprintmolecular-descriptorsprotein-sequences
1.5 match 37 stars 7.81 score 29 scripts
bioc
biocthis:Automate package and project setup for Bioconductor packages
This package expands the usethis package with the goal of helping automate the process of creating R packages for Bioconductor or making them Bioconductor-friendly.
Maintained by Leonardo Collado-Torres. Last updated 3 months ago.
softwarereportwritingactionsbioconductorbiocthisgithubstylerusethis
1.5 match 51 stars 7.78 score 4 scripts 1 dependents
edonnachie
ICD10gm:Metadata Processing for the German Modification of the ICD-10 Coding System
Provides convenient access to the German modification of the International Classification of Diagnoses, 10th revision (ICD-10-GM). It provides functionality to aid in the identification, specification and historisation of ICD-10 codes. Its intended use is the analysis of routinely collected data in the context of epidemiology, medical research and health services research. The underlying metadata are released by the German Institute for Medical Documentation and Information <https://www.dimdi.de>, and are redistributed in accordance with their license.
Maintained by Ewan Donnachie. Last updated 1 years ago.
bfarmcharlsoncomorbiditiesdiagnosesdimdiicd-10metadataroutinedatenversorgungsforschung
2.2 match 10 stars 5.30 score 20 scripts
alexpkeil1
qgcomp:Quantile G-Computation
G-computation for a set of time-fixed exposures with quantile-based basis functions, possibly under linearity and homogeneity assumptions. This approach estimates a regression line corresponding to the expected change in the outcome (on the link basis) given a simultaneous increase in the quantile-based category for all exposures. Works with continuous, binary, and right-censored time-to-event outcomes. Reference: Alexander P. Keil, Jessie P. Buckley, Katie M. OBrien, Kelly K. Ferguson, Shanshan Zhao, and Alexandra J. White (2019) A quantile-based g-computation approach to addressing the effects of exposure mixtures; <doi:10.1289/EHP5838>.
Maintained by Alexander Keil. Last updated 4 days ago.
exposureexposure-mixtureexposure-mixturesquantile-gcomputationsurvival
1.3 match 37 stars 8.73 score 70 scripts 2 dependents
bradduthie
resevol:Simulate Agricultural Production and Evolution of Pesticide Resistance
Simulates individual-based models of agricultural pest management and the evolution of pesticide resistance. Management occurs on a spatially explicit landscape that is divided into an arbitrary number of farms that can grow one of up to 10 crops and apply one of up to 10 pesticides. Pest genomes are modelled in a way that allows for any number of pest traits with an arbitrary covariance structure that is constructed using an evolutionary algorithm in the mine_gmatrix() function. Simulations are then run using the run_farm_sim() function. This package thereby allows for highly mechanistic social-ecological models of the evolution of pesticide resistance under different types of crop rotation and pesticide application regimes.
Maintained by A. Bradley Duthie. Last updated 1 years ago.
2.5 match 3 stars 4.65 score 1 scripts
bioc
rrvgo:Reduce + Visualize GO
Reduce and visualize lists of Gene Ontology terms by identifying redundancy based on semantic similarity.
Maintained by Sergi Sayols. Last updated 5 months ago.
annotationclusteringgonetworkpathwayssoftware
1.5 match 24 stars 7.74 score 190 scripts
azizka
conserveR:Identifying Conservation Prioritization Methods Based on Data Availability
Helps biologists choose the most suitable approach to link their research to conservation. After answering a few questions on the data available, geographic and taxonomic scope, 'conserveR' ranks existing methods for conservation prioritization and systematic conservation planning by suitability. The methods database of 'conserveR' contains 133 methods for conservation prioritization based on a systematic review of > 12,000 scientific publications from the fields of spatial conservation prioritization, systematic conservation planning, biogeography and ecology.
Maintained by Alexander Zizka. Last updated 4 years ago.
3.2 match 8 stars 3.60 score
lifewatch
sdmpredictors:Species Distribution Modelling Predictor Datasets
Terrestrial and marine predictors for species distribution modelling from multiple sources, including WorldClim <https://www.worldclim.org/>, ENVIREM <https://envirem.github.io/>, Bio-ORACLE <https://bio-oracle.org/> and MARSPEC <http://www.marspec.org/>.
Maintained by Salvador Fernandez. Last updated 2 years ago.
bio-oraclelifewatchlifewatchvlizspecies-distribution-modelling
1.5 match 30 stars 7.47 score 218 scripts
mw201608
SuperExactTest:Exact Test and Visualization of Multi-Set Intersections
Identification of sets of objects with shared features is a common operation in all disciplines. Analysis of intersections among multiple sets is fundamental for in-depth understanding of their complex relationships. This package implements a theoretical framework for efficient computation of statistical distributions of multi-set intersections based upon combinatorial theory, and provides multiple scalable techniques for visualizing the intersection statistics. The statistical algorithm behind this package was published in Wang et al. (2015) <doi:10.1038/srep16923>.
Maintained by Minghui Wang. Last updated 1 years ago.
intersectionsetstatisticsvisualization
1.5 match 28 stars 7.47 score 70 scripts 1 dependents
bioc
MOSim:Multi-Omics Simulation (MOSim)
MOSim package simulates multi-omic experiments that mimic regulatory mechanisms within the cell, allowing flexible experimental design including time course and multiple groups.
Maintained by Sonia Tarazona. Last updated 5 months ago.
softwaretimecourseexperimentaldesignrnaseqcpp
1.5 match 9 stars 7.46 score 11 scripts
pepijn-devries
ECOTOXr:Download and Extract Data from US EPA's ECOTOX Database
The US EPA ECOTOX database is a freely available database with a treasure of aquatic and terrestrial ecotoxicological data. As the online search interface doesn't come with an API, this package provides the means to easily access and search the database in R. To this end, all raw tables are downloaded from the EPA website and stored in a local SQLite database <doi:10.1016/j.chemosphere.2024.143078>.
Maintained by Pepijn de Vries. Last updated 5 days ago.
1.8 match 10 stars 6.20 score 6 scripts
amilkey1
lorad:Lowest Radial Distance Method of Marginal Likelihood Estimation
Estimates marginal likelihood from a posterior sample using the method described in Wang et al. (2023) <doi:10.1093/sysbio/syad007>, which does not require evaluation of any additional points and requires only the log of the unnormalized posterior density for each sampled parameter vector.
Maintained by Analisa Milkey. Last updated 1 years ago.
4.5 match 2.48 score 5 scripts
bioc
ELMER:Inferring Regulatory Element Landscapes and Transcription Factor Networks Using Cancer Methylomes
ELMER is designed to use DNA methylation and gene expression from a large number of samples to infer the regulatory element landscape and transcription factor network in primary tissue.
Maintained by Tiago Chedraoui Silva. Last updated 5 months ago.
dnamethylationgeneexpressionmotifannotationsoftwaregeneregulationtranscriptionnetwork
1.5 match 7.42 score 176 scripts
nataliepatten
gatoRs:Geographic and Taxonomic Occurrence R-Based Scrubbing
Streamlines downloading and cleaning biodiversity data from Integrated Digitized Biocollections (iDigBio) and the Global Biodiversity Information Facility (GBIF).
Maintained by Natalie N. Patten. Last updated 10 months ago.
1.8 match 11 stars 6.16 score 66 scripts
bioc
enrichViewNet:From functional enrichment results to biological networks
This package enables the visualization of functional enrichment results as network graphs. First, the package enables the visualization of enrichment results, in a format corresponding to the one generated by gprofiler2, as a customizable Cytoscape network. In those networks, both gene datasets (GO terms/pathways/protein complexes) and genes associated with the datasets are represented as nodes, while the edges connect each gene to its dataset(s). The package also provides the option to create enrichment maps from functional enrichment results. Enrichment maps enable the visualization of enriched terms as a network with edges connecting overlapping genes.
Maintained by Astrid Deschênes. Last updated 5 months ago.
biologicalquestionsoftwarenetworknetworkenrichmentgocystocapefunctional-enrichment
2.0 match 5 stars 5.54 score 6 scripts
fhdsl
metricminer:Mine Metrics from Common Places on the Web
Mine metrics from common places on the web through the power of their APIs (application programming interfaces). It also helps put the data in a format that is easily used for a dashboard or other purposes. There is an associated dashboard template and tutorials that are under development that help you fully utilize 'metricminer'.
Maintained by Candace Savonen. Last updated 3 days ago.
1.8 match 2 stars 6.13 score 21 scripts
pepijn-devries
CopernicusMarine:Search Download and Handle Data from Copernicus Marine Service Information
Subset and download data from EU Copernicus Marine Service Information: <https://data.marine.copernicus.eu>. Import data on the oceans' physical and biogeochemical state from Copernicus into R without the need for external software.
Maintained by Pepijn de Vries. Last updated 3 months ago.
1.9 match 25 stars 5.88 score 20 scripts 2 dependents
doi-usgs
toxEval:Exploring Biological Relevance of Environmental Chemistry Observations
Data analysis package for estimating potential biological effects from chemical concentrations in environmental samples. Included are a set of functions to analyze, visualize, and organize measured concentration data as it relates to user-selected chemical-biological interaction benchmark data such as water quality criteria. The intent of these analyses is to develop a better understanding of the potential biological relevance of environmental chemistry data. Results can be used to prioritize which chemicals at which sites may be of greatest concern. These methods are meant to be used as a screening technique to predict potential for biological influence from chemicals that ultimately need to be validated with direct biological assays. A description of the analysis can be found in Blackwell (2017) <doi:10.1021/acs.est.7b01613>.
Maintained by Laura DeCicco. Last updated 3 months ago.
1.5 match 21 stars 7.34 score 58 scripts
bioc
QuasR:Quantify and Annotate Short Reads in R
This package provides a framework for the quantification and analysis of Short Reads. It covers a complete workflow starting from raw sequence reads, over creation of alignments and quality control plots, to the quantification of genomic regions of interest. Read alignments are either generated through Rbowtie (data from DNA/ChIP/ATAC/Bis-seq experiments) or Rhisat2 (data from RNA-seq experiments that require spliced alignments), or can be provided in the form of bam files.
Maintained by Michael Stadler. Last updated 24 days ago.
geneticspreprocessingsequencingchipseqrnaseqmethylseqcoveragealignmentqualitycontrolimmunooncologycurlbzip2xz-utilszlibcpp
1.3 match 6 stars 8.70 score 79 scripts 1 dependents
bioc
CRISPRseek:Design of guide RNAs in CRISPR genome-editing systems
The package encompasses functions to find potential guide RNAs for the CRISPR-based genome-editing systems including the Base Editors and the Prime Editors when supplied with target sequences as input. Users have the flexibility to filter resulting guide RNAs based on parameters such as the absence of restriction enzyme cut sites or the lack of paired guide RNAs. The package also facilitates genome-wide exploration for off-targets, offering features to score and rank off-targets, retrieve flanking sequences, and indicate whether the hits are located within exon regions. All detected guide RNAs are annotated with the cumulative scores of the top5 and topN off-targets together with detailed information such as mismatch sites and restriction enzyme cut sites. The package also outputs INDELs and their frequencies for Cas9 targeted sites.
Maintained by Lihua Julie Zhu. Last updated 6 days ago.
immunooncologygeneregulationsequencematchingcrispr
1.5 match 7.18 score 51 scripts 2 dependents
bioc
isomiRs:Analyze isomiRs and miRNAs from small RNA-seq
Characterization of miRNAs and isomiRs, clustering and differential expression.
Maintained by Lorena Pantano. Last updated 5 months ago.
mirnarnaseqdifferentialexpressionclusteringimmunooncologyanalyze-isomirsbioconductorisomirs
1.5 match 8 stars 7.09 score 43 scripts
bioc
GenomicSuperSignature:Interpretation of RNA-seq experiments through robust, efficient comparison to public databases
This package provides a novel method for interpreting new transcriptomic datasets through near-instantaneous comparison to public archives without high-performance computing requirements. Through the pre-computed index, users can identify public resources associated with their dataset such as gene sets, MeSH terms, and publications. Functions to identify interpretable annotations and intuitive visualization options are implemented in this package.
Maintained by Sehyun Oh. Last updated 5 months ago.
transcriptomicssystemsbiologyprincipalcomponentrnaseqsequencingpathwaysclusteringbioconductor-packageexploratory-data-analysisgseameshprincipal-component-analysisrna-sequencing-profilestransferlearning
1.5 match 16 stars 6.97 score 59 scripts
bioc
Rbowtie:R bowtie wrapper
This package provides an R wrapper around the popular bowtie short read aligner and around SpliceMap, a de novo splice junction discovery and alignment tool. The package is used by the QuasR Bioconductor package. We recommend using the QuasR package instead of using Rbowtie directly.
Maintained by Michael Stadler. Last updated 2 months ago.
1.5 match 1 stars 6.80 score 22 scripts 8 dependents
bioc
dupRadar:Assessment of duplication rates in RNA-Seq datasets
Duplication rate quality control for RNA-Seq datasets.
Maintained by Sergi Sayols. Last updated 5 months ago.
technologysequencingrnaseqqualitycontrolimmunooncology
1.5 match 2 stars 6.78 score 60 scripts
bioc
CNVMetrics:Copy Number Variant Metrics
The CNVMetrics package calculates similarity metrics to facilitate copy number variant comparison among samples and/or methods. Similarity metrics can be employed to compare CNV profiles of genetically unrelated samples as well as those with a common genetic background. Some metrics are based on the shared amplified/deleted regions while other metrics rely on the level of amplification/deletion. The data type used as input is a plain text file containing the genomic position of the copy number variations, as well as the status and/or the log2 ratio values. Finally, a visualization tool is provided to explore resulting metrics.
Maintained by Astrid Deschênes. Last updated 5 months ago.
biologicalquestionsoftwarecopynumbervariationcnvcopy-number-variationmetricsr-language
2.0 match 4 stars 5.08 score 8 scripts
smbc-nzp
MigConnectivity:Estimate Migratory Connectivity for Migratory Animals
Allows the user to estimate transition probabilities for migratory animals between any two phases of the annual cycle, using a variety of different data types. Also quantifies the strength of migratory connectivity (MC), a standardized metric to quantify the extent to which populations co-occur between two phases of the annual cycle. Includes functions to estimate MC and the more traditional metric of migratory connectivity strength (Mantel correlation) incorporating uncertainty from multiple sources of sampling error. For cross-species comparisons, methods are provided to estimate differences in migratory connectivity strength, incorporating uncertainty. See Cohen et al. (2018) <doi:10.1111/2041-210X.12916>, Cohen et al. (2019) <doi:10.1111/ecog.03974>, and Roberts et al. (2023) <doi:10.1002/eap.2788> for details on some of these methods.
Maintained by Jeffrey A. Hostetler. Last updated 12 months ago.
1.5 match 8 stars 6.77 score 41 scripts
maarten14c
rbacon:Age-Depth Modelling using Bayesian Statistics
An approach to age-depth modelling that uses Bayesian statistics to reconstruct accumulation histories for deposits, through combining radiocarbon and other dates with prior information on accumulation rates and their variability. See Blaauw & Christen (2011).
Maintained by Maarten Blaauw. Last updated 26 days ago.
age-depth-modelbayesianholocenelakesocean-sedimentspeatradiocarbon-calibrationcpp
1.5 match 7 stars 6.75 score 57 scripts 1 dependents
bioc
CoGAPS:Coordinated Gene Activity in Pattern Sets
Coordinated Gene Activity in Pattern Sets (CoGAPS) implements a Bayesian MCMC matrix factorization algorithm, GAPS, and links it to gene set statistic methods to infer biological process activity. It can be used to perform sparse matrix factorization on any data, and when this data represents biomolecules, to do gene set analysis.
Maintained by Elana J. Fertig. Last updated 5 months ago.
geneexpressiontranscriptiongenesetenrichmentdifferentialexpressionbayesianclusteringtimecoursernaseqmicroarraymultiplecomparisondimensionreductionimmunooncologycpp
1.5 match 6.72 score 104 scripts
bioc
recount3:Explore and download data from the recount3 project
The recount3 package enables access to a large amount of uniformly processed RNA-seq data from human and mouse. You can download RangedSummarizedExperiment objects at the gene, exon or exon-exon junctions level with sample metadata and QC statistics. In addition we provide access to sample coverage BigWig files.
Maintained by Leonardo Collado-Torres. Last updated 3 months ago.
coveragedifferentialexpressiongeneexpressionrnaseqsequencingsoftwaredataimportannotation-agnosticbioconductorcountderfinderexongenehumanilluminajunctionmouserecountrecount3
1.3 match 33 stars 8.03 score 216 scripts
bioc
megadepth:megadepth: BigWig and BAM related utilities
This package provides an R interface to Megadepth by Christopher Wilks available at https://github.com/ChristopherWilks/megadepth. It is particularly useful for computing the coverage of a set of genomic regions across bigWig or BAM files. With this package, you can build base-pair coverage matrices for regions or annotations of your choice from BigWig files. Megadepth was used to create the raw files provided by https://bioconductor.org/packages/recount3.
Maintained by David Zhang. Last updated 3 months ago.
softwarecoveragedataimporttranscriptomicsrnaseqpreprocessingbambigwigdasptermegadepthrecount2recount3
1.5 match 12 stars 6.69 score 7 scripts 3 dependents
bioc
proActiv:Estimate Promoter Activity from RNA-Seq data
Most human genes have multiple promoters that control the expression of different isoforms. The use of these alternative promoters enables the regulation of isoform expression pre-transcriptionally. Alternative promoters have been found to be important in a wide number of cell types and diseases. proActiv is an R package that enables the analysis of promoters from RNA-seq data. proActiv uses aligned reads as input, and generates counts and normalized promoter activity estimates for each annotated promoter. In particular, proActiv accepts junction files from TopHat2 or STAR or BAM files as inputs. These estimates can then be used to identify which promoter is active, which promoter is inactive, and which promoters change their activity across conditions. proActiv also allows visualization of promoter activity across conditions.
Maintained by Joseph Lee. Last updated 5 months ago.
rnaseqgeneexpressiontranscriptionalternativesplicinggeneregulationdifferentialsplicingfunctionalgenomicsepigeneticstranscriptomicspreprocessingalternative-promotersgenomicspromoter-activitypromoter-annotationrna-seq-data
1.5 match 51 stars 6.66 score 15 scripts
bioc
MultiBaC:Multiomic Batch effect Correction
MultiBaC is a strategy to correct batch effects from multiomic datasets distributed across different labs or data acquisition events. MultiBaC is the first batch effect correction algorithm that deals with batch effect correction in multiomic datasets. MultiBaC is able to remove batch effects across different omics generated within separate batches provided that at least one common omic data type is included in all the batches considered.
Maintained by The package maintainer. Last updated 5 months ago.
softwarestatisticalmethodprincipalcomponentdatarepresentationgeneexpressiontranscriptionbatcheffect
3.0 match 3.30 score 7 scripts
eltebioinformatics
mulea:Enrichment Analysis Using Multiple Ontologies and False Discovery Rate
Background - Traditional gene set enrichment analyses are typically limited to a few ontologies and do not account for the interdependence of gene sets or terms, resulting in overcorrected p-values. To address these challenges, we introduce mulea, an R package offering comprehensive overrepresentation and functional enrichment analysis. Results - mulea employs a progressive empirical false discovery rate (eFDR) method, specifically designed for interconnected biological data, to accurately identify significant terms within diverse ontologies. mulea expands beyond traditional tools by incorporating a wide range of ontologies, encompassing Gene Ontology, pathways, regulatory elements, genomic locations, and protein domains. This flexibility enables researchers to tailor enrichment analysis to their specific questions, such as identifying enriched transcriptional regulators in gene expression data or overrepresented protein domains in protein sets. To facilitate seamless analysis, mulea provides gene sets (in standardised GMT format) for 27 model organisms, covering 22 ontology types from 16 databases and various identifiers resulting in almost 900 files. Additionally, the muleaData ExperimentData Bioconductor package simplifies access to these pre-defined ontologies. Finally, mulea's architecture allows for easy integration of user-defined ontologies, or GMT files from external sources (e.g., MSigDB or Enrichr), expanding its applicability across diverse research areas. Conclusions - mulea is distributed as a CRAN R package. It offers researchers a powerful and flexible toolkit for functional enrichment analysis, addressing limitations of traditional tools with its progressive eFDR and by supporting a variety of ontologies. Overall, mulea fosters the exploration of diverse biological questions across various model organisms.
Maintained by Tamas Stirling. Last updated 3 months ago.
annotationdifferentialexpressiongeneexpressiongenesetenrichmentgographandnetworkmultiplecomparisonpathwaysreactomesoftwaretranscriptionvisualizationenrichmentenrichment-analysisfunctional-enrichment-analysisgene-set-enrichmentontologiestranscriptomicscpp
1.3 match 28 stars 7.36 score 34 scripts
bioc
ChAMP:Chip Analysis Methylation Pipeline for Illumina HumanMethylation450 and EPIC
The package includes quality control metrics, a selection of normalization methods and novel methods to identify differentially methylated regions and to highlight copy number alterations.
Maintained by Yuan Tian. Last updated 5 months ago.
microarraymethylationarraynormalizationtwochannelcopynumberdnamethylation
1.5 match 6.54 score 278 scripts
bioc
wateRmelon:Illumina DNA methylation array normalization and metrics
15 flavours of betas and three performance metrics, with methods for objects produced by methylumi and minfi packages.
Maintained by Leo C Schalkwyk. Last updated 4 months ago.
dnamethylationmicroarraytwochannelpreprocessingqualitycontrol
1.3 match 7.75 score 247 scripts 2 dependents
anestistouloumis
ShrinkCovMat:Shrinkage Covariance Matrix Estimators
Provides nonparametric Steinian shrinkage estimators of the covariance matrix that are suitable in high dimensional settings, that is when the number of variables is larger than the sample size.
Maintained by Anestis Touloumis. Last updated 2 years ago.
covariance-matrixshrinkage-estimatorsopenblascppopenmp
2.0 match 8 stars 4.83 score 17 scripts
bioc
evaluomeR:Evaluation of Bioinformatics Metrics
Evaluating the reliability of your own metrics and the measurements done on your own datasets by analysing the stability and goodness of the classifications of such metrics.
Maintained by José Antonio Bernabé-Díaz. Last updated 5 months ago.
clusteringclassificationfeatureextractionassessmentclustering-evaluationevaluomeevaluomermetrics
2.0 match 4.82 score 33 scripts
bioc
HDTD:Statistical Inference about the Mean Matrix and the Covariance Matrices in High-Dimensional Transposable Data (HDTD)
Characterization of intra-individual variability using physiologically relevant measurements provides important insights into fundamental biological questions ranging from cell type identity to tumor development. For each individual, the data measurements can be written as a matrix with the different subsamples of the individual recorded in the columns and the different phenotypic units recorded in the rows. Datasets of this type are called high-dimensional transposable data. The HDTD package provides functions for conducting statistical inference for the mean relationship between the row and column variables and for the covariance structure within and between the row and column variables.
Maintained by Anestis Touloumis. Last updated 5 months ago.
differentialexpressiongeneticsgeneexpressionmicroarraysequencingstatisticalmethodsoftwarebioconductor-packagehigh-dimensionalstatisticsopenblascppopenmp
2.0 match 1 stars 4.78 score
tkcaccia
KODAMA:Knowledge Discovery by Accuracy Maximization
An unsupervised and semi-supervised learning algorithm that performs feature extraction from noisy and high-dimensional data. It facilitates identification of patterns representing underlying groups on all samples in a data set. Based on Cacciatore S, Tenori L, Luchinat C, Bennett PR, MacIntyre DA. (2017) Bioinformatics <doi:10.1093/bioinformatics/btw705> and Cacciatore S, Luchinat C, Tenori L. (2014) Proc Natl Acad Sci USA <doi:10.1073/pnas.1220873111>.
Maintained by Stefano Cacciatore. Last updated 14 hours ago.
1.3 match 1 stars 7.00 score 63 scripts 1 dependents
bioc
xCell2:A Tool for Generic Cell Type Enrichment Analysis
xCell2 provides methods for cell type enrichment analysis using cell type signatures. It includes three main functions - 1. xCell2Train for training custom reference objects from bulk or single-cell RNA-seq datasets. 2. xCell2Analysis for conducting the cell type enrichment analysis using the custom reference. 3. xCell2GetLineage for identifying dependencies between different cell types using ontology.
Maintained by Almog Angel. Last updated 16 hours ago.
geneexpressiontranscriptomicsmicroarrayrnaseqsinglecelldifferentialexpressionimmunooncologygenesetenrichment
1.5 match 6 stars 6.16 score 15 scripts
bioc
methylInheritance:Permutation-Based Analysis associating Conserved Differentially Methylated Elements Across Multiple Generations to a Treatment Effect
Permutation analysis, based on Monte Carlo sampling, for testing the hypothesis that the number of conserved differentially methylated elements between several generations is associated with an effect inherited from a treatment and that stochastic effects can be dismissed.
Maintained by Astrid Deschênes. Last updated 5 months ago.
biologicalquestionepigeneticsdnamethylationdifferentialmethylationmethylseqsoftwareimmunooncologystatisticalmethodwholegenomesequencinganalysisbioconductorbioinformaticscpgdifferentially-methylated-elementsinheritancemonte-carlo-samplingpermutation
2.0 match 4.60 score 1 scripts
bioc
methInheritSim:Simulating Whole-Genome Inherited Bisulphite Sequencing Data
Simulate a multigeneration methylation case versus control experiment with inheritance relation using a real control dataset.
Maintained by Pascal Belleau. Last updated 5 months ago.
biologicalquestionepigeneticsdnamethylationdifferentialmethylationmethylseqsoftwareimmunooncologystatisticalmethodwholegenomesequencingbisulphite-sequencinginheritancemethylationsimulation
2.0 match 1 stars 4.60 score 1 scripts
lcrawlab
mvMAPIT:Multivariate Genome Wide Marginal Epistasis Test
Epistasis, commonly defined as the interaction between genetic loci, is known to play an important role in the phenotypic variation of complex traits. As a result, many statistical methods have been developed to identify genetic variants that are involved in epistasis, and nearly all of these approaches carry out this task by focusing on analyzing one trait at a time. Previous studies have shown that jointly modeling multiple phenotypes can often dramatically increase statistical power for association mapping. In this package, we present the 'multivariate MArginal ePIstasis Test' ('mvMAPIT') – a multi-outcome generalization of a recently proposed epistatic detection method which seeks to detect marginal epistasis or the combined pairwise interaction effects between a given variant and all other variants. By searching for marginal epistatic effects, one can identify genetic variants that are involved in epistasis without the need to identify the exact partners with which the variants interact – thus, potentially alleviating much of the statistical and computational burden associated with conventional explicit search based methods. Our proposed 'mvMAPIT' builds upon this strategy by taking advantage of correlation structure between traits to improve the identification of variants involved in epistasis. We formulate 'mvMAPIT' as a multivariate linear mixed model and develop a multi-trait variance component estimation algorithm for efficient parameter inference and P-value computation. Together with reasonable model approximations, our proposed approach is scalable to moderately sized genome-wide association studies. Crawford et al. (2017) <doi:10.1371/journal.pgen.1006869>. Stamp et al. (2023) <doi:10.1093/g3journal/jkad118>.
Maintained by Julian Stamp. Last updated 5 months ago.
cppepistasisepistasis-analysisgwasgwas-toolslinear-mixed-modelsmapitmvmapitvariance-componentsopenblascppopenmp
1.3 match 11 stars 6.90 score 17 scripts 1 dependents
myles-lewis
glmmSeq:General Linear Mixed Models for Gene-Level Differential Expression
Using mixed effects models to analyse longitudinal gene expression can highlight differences between sample groups over time. The most widely used differential gene expression tools are unable to fit linear mixed effect models, and are less optimal for analysing longitudinal data. This package provides negative binomial and Gaussian mixed effects models to fit gene expression and other biological data across repeated samples. This is particularly useful for investigating changes in RNA-Sequencing gene expression between groups of individuals over time, as described in: Rivellese, F., Surace, A. E., Goldmann, K., Sciacca, E., Cubuk, C., Giorli, G., ... Lewis, M. J., & Pitzalis, C. (2022) Nature medicine <doi:10.1038/s41591-022-01789-0>.
Maintained by Myles Lewis. Last updated 2 months ago.
bioinformaticsdifferential-gene-expressiongene-expressionglmmmixed-modelstranscriptomics
1.5 match 19 stars 6.11 score 45 scripts
bioc
fastseg:fastseg - a fast segmentation algorithm
fastseg implements a very fast and efficient segmentation algorithm. It has similar functionality to DNACopy (Olshen and Venkatraman 2004), but is considerably faster and more flexible. fastseg can segment data from DNA microarrays and data from next generation sequencing, for example to detect copy number segments. Further, it can segment data from RNA microarrays like tiling arrays to identify transcripts. Most generally, it can segment data given as a matrix or as a vector. Various data formats can be used as input to fastseg, like expression set objects for microarrays or GRanges for sequencing data. The segmentation criterion of fastseg is based on a statistical test in a Bayesian framework, namely the cyber t-test (Baldi 2001). The speed-up arises from the facts that sampling is not necessary for fastseg and that a dynamic programming approach is used for calculation of the segments' first and higher order moments.
Maintained by Alexander Blume. Last updated 1 months ago.
classificationcopynumbervariationcpp
1.5 match 6.07 score 20 scripts 4 dependents
bioc
AnVILWorkflow:Run workflows implemented in Terra/AnVIL workspace
The AnVIL is a cloud computing resource developed in part by the National Human Genome Research Institute. The main cloud-based genomics platform deployed by the AnVIL project is Terra. The AnVILWorkflow package allows remote access to Terra implemented workflows, enabling end-users to utilize Terra/AnVIL provided resources - such as data, workflows, and flexible/scalable computing resources - through the conventional R functions.
Maintained by Sehyun Oh. Last updated 27 days ago.
infrastructuresoftwareanvilgcpterraworkflows
1.5 match 6 stars 6.03 score 1 scripts
bioc
regionReport:Generate HTML or PDF reports for a set of genomic regions or DESeq2/edgeR results
Generate HTML or PDF reports to explore a set of regions such as the results from annotation-agnostic expression analysis of RNA-seq data at base-pair resolution performed by derfinder. You can also create reports for DESeq2 or edgeR results.
Maintained by Leonardo Collado-Torres. Last updated 2 months ago.
differentialexpressionsequencingrnaseqsoftwarevisualizationtranscriptioncoveragereportwritingdifferentialmethylationdifferentialpeakcallingimmunooncologyqualitycontrolbioconductorderfinderdeseq2edgerregionreportrmarkdown
1.3 match 9 stars 7.22 score 46 scripts
ropensci
europepmc:R Interface to the Europe PubMed Central RESTful Web Service
An R Client for the Europe PubMed Central RESTful Web Service (see <https://europepmc.org/RestfulWebService> for more information). It gives access to both metadata on life science literature and open access full texts. Europe PMC indexes all PubMed content and other literature sources including Agricola, a bibliographic database of citations to the agricultural literature, or Biological Patents. In addition to bibliographic metadata, the client allows users to fetch citations and reference lists. Links between life-science literature and other EBI databases, including ENA, PDB or ChEMBL are also accessible. No registration or API key is required. See the vignettes for usage examples.
Maintained by Najko Jahn. Last updated 1 years ago.
bibliometricseurope-pmcpubmedpubmedcentralscientific-literaturescientific-publications
1.1 match 27 stars 7.94 score 122 scripts 2 dependents
bioc
HiContacts:Analysing cool files in R with HiContacts
HiContacts provides a collection of tools to analyse and visualize Hi-C datasets imported in R by HiCExperiment.
Maintained by Jacques Serizay. Last updated 5 months ago.
1.5 match 12 stars 5.95 score 49 scripts
l-ramirez-lopez
resemble:Memory-Based Learning in Spectral Chemometrics
Functions for dissimilarity analysis and memory-based learning (MBL, a.k.a. local modeling) in complex spectral data sets. Most of these functions are based on the methods presented in Ramirez-Lopez et al. (2013) <doi:10.1016/j.geoderma.2012.12.014>.
Maintained by Leonardo Ramirez-Lopez. Last updated 2 years ago.
chemoinformaticschemometricsinfrared-spectroscopylazy-learninglocal-regressionmachine-learningmemory-based-learningnirpedometricssoil-spectroscopyspectral-dataspectral-libraryspectroscopyopenblascppopenmp
1.5 match 20 stars 5.91 score 27 scripts
bioc
ASSIGN:Adaptive Signature Selection and InteGratioN (ASSIGN)
ASSIGN is a computational tool to evaluate the pathway deregulation/activation status in individual patient samples. ASSIGN employs a flexible Bayesian factor analysis approach that adapts predetermined pathway signatures derived either from knowledge-based literature or from perturbation experiments to the cell-/tissue-specific pathway signatures. The deregulation/activation level of each context-specific pathway is quantified to a score, which represents the extent to which a patient sample encompasses the pathway deregulation/activation signature.
Maintained by Ying Shen. Last updated 5 months ago.
softwaregeneexpressionpathwaysbayesian
1.2 match 2 stars 7.37 score 65 scripts 1 dependents
bioc
DFplyr:A `DataFrame` (`S4Vectors`) backend for `dplyr`
Provides `dplyr` verbs (`mutate`, `select`, `filter`, etc...) supporting `S4Vectors::DataFrame` objects. Importantly, this is achieved without conversion to an intermediate `tibble`. Adds grouping infrastructure to `DataFrame` which is respected by the transformation verbs.
Maintained by Jonathan Carroll. Last updated 5 months ago.
datarepresentationinfrastructuresoftware
1.5 match 21 stars 5.87 score 5 scripts
kzst
neutrostat:Neutrosophic Statistics
Analyzes data involving imprecise and vague information. Provides summary statistics and describes the characteristics of neutrosophic data, as defined by Florentin Smarandache (2013) <ISBN:9781599732749>.
Maintained by Zsolt T. Kosztyan. Last updated 4 months ago.
3.4 match 2.60 score
lvclark
polyRAD:Genotype Calling with Uncertainty from Sequencing Data in Polyploids and Diploids
Read depth data from genotyping-by-sequencing (GBS) or restriction site-associated DNA sequencing (RAD-seq) are imported and used to make Bayesian probability estimates of genotypes in polyploids or diploids. The genotype probabilities, posterior mean genotypes, or most probable genotypes can then be exported for downstream analysis. 'polyRAD' is described by Clark et al. (2019) <doi:10.1534/g3.118.200913>, and the Hind/He statistic for marker filtering is described by Clark et al. (2022) <doi:10.1186/s12859-022-04635-9>. A variant calling pipeline for highly duplicated genomes is also included and is described by Clark et al. (2020, Version 1) <doi:10.1101/2020.01.11.902890>.
Maintained by Lindsay V. Clark. Last updated 8 days ago.
bioinformaticsdna-sequencinggenotype-likelihoodsgenotyping-by-sequencinghacktoberfestrad-seqrad-sequencingsnp-genotypingcpp
1.3 match 28 stars 6.98 score 85 scripts
bioc
twoddpcr:Classify 2-d Droplet Digital PCR (ddPCR) data and quantify the number of starting molecules
The twoddpcr package takes Droplet Digital PCR (ddPCR) droplet amplitude data from Bio-Rad's QuantaSoft and can classify the droplets. A summary of the positive/negative droplet counts can be generated, which can then be used to estimate the number of molecules using the Poisson distribution. This is the first open source package that facilitates the automatic classification of general two channel ddPCR data. Previous work includes 'definetherain' (Jones et al., 2014) and 'ddpcRquant' (Trypsteen et al., 2015) which both handle one channel ddPCR experiments only. The 'ddpcr' package available on CRAN (Attali et al., 2016) supports automatic gating of a specific class of two channel ddPCR experiments only.
Maintained by Anthony Chiu. Last updated 5 months ago.
1.5 match 10 stars 5.78 score 4 scripts
bioc
RPA:RPA: Robust Probabilistic Averaging for probe-level analysis
Probabilistic analysis of probe reliability and differential gene expression on short oligonucleotide arrays.
Maintained by Leo Lahti. Last updated 5 months ago.
geneexpressionmicroarraypreprocessingqualitycontrol
1.5 match 5.78 score 20 scripts 1 dependents
gtonkinhill
rhierbaps:Clustering Genetic Sequence Data Using the HierBAPS Algorithm
Implements the hierarchical Bayesian analysis of population structure (hierBAPS) algorithm of Cheng et al. (2013) <doi:10.1093/molbev/mst028> for clustering DNA sequences from multiple sequence alignments in FASTA format. The implementation includes improved defaults and plotting capabilities and, unlike the original 'MATLAB' version, removes singleton SNPs by default.
Maintained by Gerry Tonkin-Hill. Last updated 4 years ago.
population-geneticspopulation-genomicspopulation-structure
1.5 match 34 stars 5.66 score 27 scripts
bioc
iSEEindex:iSEE extension for a landing page to a custom collection of data sets
This package provides an interface to any collection of data sets within a single iSEE web-application. The main functionality of this package is to define a custom landing page allowing app maintainers to list a custom collection of data sets that users can select from and directly load objects into an iSEE web-application.
Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.
softwareinfrastructurebioconductorhacktoberfest
1.5 match 2 stars 5.65 score 8 scripts
bioc
netresponse:Functional Network Analysis
Algorithms for functional network analysis. Includes an implementation of a variational Dirichlet process Gaussian mixture model for nonparametric mixture modeling.
Maintained by Leo Lahti. Last updated 5 months ago.
cellbiologyclusteringgeneexpressiongeneticsnetworkgraphandnetworkdifferentialexpressionmicroarraynetworkinferencetranscription
1.5 match 3 stars 5.64 score 21 scripts
biotimehub
BioTIMEr:Tools to Use and Explore the 'BioTIME' Database
The 'BioTIME' database was first published in 2018 and has inspired ideas, questions, projects and research articles. To make it even more accessible, an R package was created. The 'BioTIMEr' package provides tools designed to interact with the 'BioTIME' database. The functions provided include the 'BioTIME' recommended methods for preparing (gridding and rarefaction) time series data, a selection of standard biodiversity metrics (including species richness, numerical abundance and exponential Shannon) alongside examples on how to display change over time. It also includes a sample subset of both the query and meta data, the full versions of which are freely available on the 'BioTIME' website <https://biotime.st-andrews.ac.uk/home.php>.
Maintained by Alban Sagouis. Last updated 8 months ago.
1.5 match 4 stars 5.60 score 10 scripts
alissonrp
fastrep:Time-Saving Package for Creating Reports
Provides templates for reports in 'rmarkdown' and functions to create tables and summaries of data.
Maintained by Alisson Rosa. Last updated 2 years ago.
1.9 match 6 stars 4.48 score 6 scripts
mikejareds
hermiter:Efficient Sequential and Batch Estimation of Univariate and Bivariate Probability Density Functions and Cumulative Distribution Functions along with Quantiles (Univariate) and Nonparametric Correlation (Bivariate)
Facilitates estimation of full univariate and bivariate probability density functions and cumulative distribution functions along with full quantile functions (univariate) and nonparametric correlation (bivariate) using Hermite series based estimators. These estimators are particularly useful in the sequential setting (both stationary and non-stationary) and one-pass batch estimation setting for large data sets. Based on: Stephanou, Michael, Varughese, Melvin and Macdonald, Iain. "Sequential quantiles via Hermite series density estimation." Electronic Journal of Statistics 11.1 (2017): 570-607 <doi:10.1214/17-EJS1245>, Stephanou, Michael and Varughese, Melvin. "On the properties of Hermite series based distribution function estimators." Metrika (2020) <doi:10.1007/s00184-020-00785-z> and Stephanou, Michael and Varughese, Melvin. "Sequential estimation of Spearman rank correlation using Hermite series estimators." Journal of Multivariate Analysis (2021) <doi:10.1016/j.jmva.2021.104783>.
Maintained by Michael Stephanou. Last updated 7 months ago.
cumulative-distribution-functionkendall-correlation-coefficientonline-algorithmsprobability-density-functionquantilespearman-correlation-coefficientstatisticsstreaming-algorithmsstreaming-datacpp
1.5 match 15 stars 5.58 score 17 scriptsbioc
iSEEhub:iSEE for the Bioconductor ExperimentHub
This package defines a custom landing page for an iSEE app interfacing with the Bioconductor ExperimentHub. The landing page allows users to browse the ExperimentHub, select a data set, download and cache it, and import it directly into a Bioconductor iSEE app.
Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.
dataimportimmunooncology infrastructureshinyappssinglecellsoftwarebioconductorbioconductor-packagehacktoberfestisee
1.5 match 3 stars 5.56 score 4 scriptshughparsonage
TeXCheckR:Parses LaTeX Documents for Errors
Checks LaTeX documents and .bib files for typing errors, such as spelling errors and incorrect quotation marks. Also provides useful functions for parsing and linting bibliography files.
Maintained by Hugh Parsonage. Last updated 1 year ago.
1.9 match 8 stars 4.44 score 23 scriptsbioc
iSEEde:iSEE extension for panels related to differential expression analysis
This package contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels or modes facilitating the analysis of differential expression results. This package does not perform differential expression. Instead, it provides methods to embed precomputed differential expression results in a SummarizedExperiment object, in a manner that is compatible with interactive visualisation in iSEE applications.
Maintained by Kevin Rue-Albrecht. Last updated 4 months ago.
softwareinfrastructuredifferentialexpressionbioconductorhacktoberfestiseeu
1.5 match 1 stars 5.38 score 15 scriptsbioc
visiumStitched:Enable downstream analysis of Visium capture areas stitched together with Fiji
This package provides helper functions for working with multiple Visium capture areas that overlap each other. This package was developed along with the companion example use case data available from https://github.com/LieberInstitute/visiumStitched_brain. visiumStitched prepares SpaceRanger (10x Genomics) output files so you can stitch the images from groups of capture areas together with Fiji. Then visiumStitched builds a SpatialExperiment object with the stitched data and makes an artificial hexagonal grid enabling the seamless use of spatial clustering methods that rely on such a grid to identify neighboring spots, such as PRECAST and BayesSpace. The SpatialExperiment objects created by visiumStitched are compatible with spatialLIBD, which can be used to build interactive websites for stitched SpatialExperiment objects. visiumStitched also enables casting SpatialExperiment objects as Seurat objects.
Maintained by Nicholas J. Eagles. Last updated 3 months ago.
softwarespatialtranscriptomicstranscriptiongeneexpressionvisualizationdataimport10xgenomicsbioconductorspatial-transcriptomicsspatialexperimentspatiallibdvisium
1.5 match 1 stars 5.36 score 4 scriptsjl5000
tidyged:Handle GEDCOM Files Using Tidyverse Principles
Create and summarise family tree GEDCOM files using tidy dataframes.
Maintained by Jamie Lendrum. Last updated 3 years ago.
1.3 match 8 stars 5.96 score 23 scripts 3 dependentsltrr-arizona-edu
burnr:Forest Fire History Analysis
Tools to read, write, parse, and analyze forest fire history data (e.g. FHX). Described in Malevich et al. (2018) <doi:10.1016/j.dendro.2018.02.005>.
Maintained by Steven Malevich. Last updated 3 years ago.
citationdendrochronologyecologyforestfireplotscientificstatistics
1.3 match 15 stars 5.95 score 59 scriptsbioc
epialleleR:Fast, Epiallele-Aware Methylation Caller and Reporter
Epialleles are specific DNA methylation patterns that are mitotically and/or meiotically inherited. This package calls and reports cytosine methylation as well as frequencies of hypermethylated epialleles at the level of genomic regions or individual cytosines in next-generation sequencing data using binary alignment map (BAM) files as an input. Among other things, this package can also extract and visualise methylation patterns and assess allele specificity of methylation.
Maintained by Oleksii Nikolaienko. Last updated 11 days ago.
dnamethylationepigeneticsmethylseqlongreadbioconductordna-methylationepiallelenext-generation-sequencingsamtoolscurlbzip2xz-utilszlibcpp
1.3 match 4 stars 5.94 score 5 scriptsaravind-j
augmentedRCBD:Analysis of Augmented Randomised Complete Block Designs
Functions for analysis of data generated from experiments in augmented randomised complete block design according to Federer, W.T. (1961) <doi:10.2307/2527837>. Computes analysis of variance, adjusted means, descriptive statistics, genetic variability statistics etc. Further includes data visualization and report generation functions.
Maintained by J. Aravind. Last updated 5 months ago.
augmented-blockaugmented-designaugmented-rcbd
1.3 match 7 stars 5.94 score 21 scriptsbioc
consensusSeekeR:Detection of consensus regions inside a group of experiences using genomic positions and genomic ranges
This package compares genomic positions and genomic ranges from multiple experiments to extract common regions. The size of the analyzed region is adjustable, as is the number of experiments in which a feature must be present in a potential region to tag this region as a consensus region. In genomic analysis where feature identification generates a position value surrounded by a genomic range, such as ChIP-Seq peaks and nucleosome positions, the replication of an experiment may result in slight differences between predicted values. This package enables the reconciliation of the results into consensus regions.
Maintained by Astrid Deschênes. Last updated 5 months ago.
biologicalquestionchipseqgeneticsmultiplecomparisontranscriptionpeakdetectionsequencingcoveragechip-seq-analysisgenomic-data-analysisnucleosome-positioning
1.5 match 1 stars 5.26 score 5 scripts 1 dependentsbrian-j-smith
MRMCaov:Multi-Reader Multi-Case Analysis of Variance
Estimation and comparison of the performances of diagnostic tests in multi-reader multi-case studies where true case statuses (or ground truths) are known and one or more readers provide test ratings for multiple cases. Reader performance metrics are provided for area under and expected utility of ROC curves, likelihood ratio of positive or negative tests, and sensitivity and specificity. ROC curves can be estimated empirically or with binormal or binormal likelihood-ratio models. Statistical comparisons of diagnostic tests are based on the ANOVA model of Obuchowski-Rockette and the unified framework of Hillis (2005) <doi:10.1002/sim.2024>. The ANOVA can be conducted with data from a full factorial, nested, or partially paired study design; with random or fixed readers or cases; and covariances estimated with the DeLong method, jackknifing, or an unbiased method. Smith and Hillis (2020) <doi:10.1117/12.2549075>.
Maintained by Brian J Smith. Last updated 2 years ago.
1.5 match 12 stars 5.26 score 8 scripts 1 dependentsbioc
qsvaR:Generate Quality Surrogate Variable Analysis for Degradation Correction
The qsvaR package contains functions for removing the effect of degradation in RNA-seq data from postmortem brain tissue. The package is equipped to help users generate principal components associated with degradation. The components can be used in differential expression analysis to remove the effects of degradation.
Maintained by Hedia Tnani. Last updated 3 months ago.
softwareworkflowstepnormalizationbiologicalquestiondifferentialexpressionsequencingcoveragebioconductorbraindegradationhumanqsva
1.5 match 5.26 score 4 scriptsjjustison
SiPhyNetwork:A Phylogenetic Simulator for Reticulate Evolution
A simulator for reticulate evolution under a birth-death-hybridization process. Here the birth-death process is extended to consider reticulate evolution by allowing hybridization events to occur. The general purpose simulator allows the modeling of three different reticulate patterns: lineage generative hybridization, lineage neutral hybridization, and lineage degenerative hybridization. Users can also specify hybridization events to be dependent on a trait value or genetic distance. We also extend some phylogenetic tree utility and plotting functions for networks. We allow two different stopping conditions: simulated to a fixed time or number of taxa. When simulating to a fixed number of taxa, the user can simulate under the Generalized Sampling Approach that properly simulates phylogenies when assuming a uniform prior on the root age.
Maintained by Joshua Justison. Last updated 6 months ago.
1.5 match 11 stars 5.25 score 16 scriptsbioc
TREG:Tools for finding Total RNA Expression Genes in single nucleus RNA-seq data
RNA abundance and cell size parameters could improve RNA-seq deconvolution algorithms to more accurately estimate cell type proportions given the different cell type transcription activity levels. A Total RNA Expression Gene (TREG) can facilitate estimating total RNA content using single molecule fluorescent in situ hybridization (smFISH). We developed a data-driven approach using a measure of expression invariance to find candidate TREGs in postmortem human brain single nucleus RNA-seq. This R package implements the method for identifying candidate TREGs from snRNA-seq data.
Maintained by Louise Huuki-Myers. Last updated 3 months ago.
softwaresinglecellrnaseqgeneexpressiontranscriptomicstranscriptionsequencingbioconductordeconvolutionrnascopescrna-seqsmfishsnrna-seqtreg
1.5 match 4 stars 5.20 score 5 scriptsbioc
regutools:regutools: an R package for data extraction from RegulonDB
RegulonDB has collected, harmonized and centralized data from hundreds of experiments for nearly two decades and is considered a point of reference for transcriptional regulation in Escherichia coli K12. Here, we present the regutools R package to facilitate programmatic access to RegulonDB data in computational biology. regutools provides researchers with the possibility of writing reproducible workflows with automated queries to RegulonDB. The regutools package serves as a bridge between RegulonDB data and the Bioconductor ecosystem by reusing the data structures and statistical methods powered by other Bioconductor packages. We demonstrate the integration of regutools with Bioconductor by analyzing transcription factor DNA binding sites and transcriptional regulatory networks from RegulonDB. We anticipate that regutools will serve as a useful building block in our progress to further our understanding of gene regulatory networks.
Maintained by Joselyn Chavez. Last updated 3 months ago.
generegulationgeneexpressionsystemsbiologynetworknetworkinferencevisualizationtranscriptionbioconductorcdsbregulondb
1.5 match 4 stars 5.20 score 6 scriptsbioc
snapcount:R/Bioconductor Package for interfacing with Snaptron for rapid querying of expression counts
snapcount is a client interface to the Snaptron webservices which support querying by gene name or genomic region. Results include raw expression counts derived from alignment of RNA-seq samples and/or various summarized measures of expression across one or more regions/genes per-sample (e.g. percent spliced in).
Maintained by Rone Charles. Last updated 5 months ago.
coveragegeneexpressionrnaseqsequencingsoftwaredataimport
1.5 match 3 stars 5.19 score 13 scriptsropensci
antanym:Antarctic Geographic Place Names
Antarctic geographic names from the Composite Gazetteer of Antarctica, and functions for working with those place names.
Maintained by Ben Raymond. Last updated 3 years ago.
antarcticsouthern oceanplace namesgazetteerpeer-reviewed
2.0 match 7 stars 3.89 score 22 scriptsbioc
spaSim:Spatial point data simulator for tissue images
A suite of functions for simulating spatial patterns of cells in tissue images. Output images are multitype point data in SingleCellExperiment format. Each point represents a cell, with its 2D locations and cell type. Potential cell patterns include background cells, tumour/immune cell clusters, immune rings, and blood/lymphatic vessels.
Maintained by Yuzhou Feng. Last updated 5 months ago.
statisticalmethodspatialbiomedicalinformatics
1.5 match 2 stars 5.18 score 25 scriptsbioc
derfinderHelper:derfinder helper package
Helper package for speeding up the derfinder package when using multiple cores. This package is particularly useful when using BiocParallel and it helps reduce the time spent loading the full derfinder package when running the F-statistics calculation in parallel.
Maintained by Leonardo Collado-Torres. Last updated 3 months ago.
differentialexpressionsequencingrnaseqsoftwareimmunooncologybioconductorderfinder
1.3 match 6.20 score 7 dependentstrevorhastie
glmnet:Lasso and Elastic-Net Regularized Generalized Linear Models
Extremely efficient procedures for fitting the entire lasso or elastic-net regularization path for linear regression, logistic and multinomial regression models, Poisson regression, Cox model, multiple-response Gaussian, and the grouped multinomial regression; see <doi:10.18637/jss.v033.i01> and <doi:10.18637/jss.v039.i05>. There are two new and important additions. The family argument can be a GLM family object, which opens the door to any programmed family (<doi:10.18637/jss.v106.i01>). This comes with a modest computational cost, so when the built-in families suffice, they should be used instead. The other novelty is the relax option, which refits each of the active sets in the path unpenalized. The algorithm uses cyclical coordinate descent in a path-wise fashion, as described in the papers cited.
Maintained by Trevor Hastie. Last updated 2 years ago.
0.5 match 82 stars 15.15 score 22k scripts 736 dependentsearthsystemdiagnostics
sedproxy:Simulation of Sediment Archived Climate Proxy Records
Proxy forward modelling for sediment archived climate proxies such as Mg/Ca, d18O or Alkenones. The user provides a hypothesised "true" past climate, such as output from a climate model, and details of the sedimentation rate and sampling scheme of a sediment core. Sedproxy returns simulated proxy records. Implements the methods described in Dolman and Laepple (2018) <doi:10.5194/cp-14-1851-2018>.
Maintained by Andrew Dolman. Last updated 1 month ago.
1.5 match 7 stars 5.10 score 18 scriptsbioc
SplineDV:Differential Variability (DV) analysis for single-cell RNA sequencing data. (e.g. Identify Differentially Variable Genes across two experimental conditions)
A spline-based scRNA-seq method for identifying differentially variable (DV) genes across two experimental conditions. Spline-DV constructs a 3D spline from 3 key gene statistics: mean expression, coefficient of variation, and dropout rate. This is done for both conditions. The 3D spline provides the “expected” behavior of genes in each condition. The distance of the observed mean, CV and dropout rate of each gene from the expected 3D spline is used to measure variability. As the final step, the spline-DV method compares the variabilities of each condition to identify differentially variable (DV) genes.
Maintained by Shreyan Gupta. Last updated 1 month ago.
softwaresinglecellsequencingdifferentialexpressionrnaseqgeneexpressiontranscriptomicsfeatureextraction
1.5 match 2 stars 5.08 score 3 scriptsbioc
icetea:Integrating Cap Enrichment with Transcript Expression Analysis
icetea (Integrating Cap Enrichment with Transcript Expression Analysis) provides functions for end-to-end analysis of multiple 5'-profiling methods such as CAGE, RAMPAGE and MAPCap, beginning from raw reads to detection of transcription start sites using replicates. It also allows performing differential TSS detection between groups of samples, thereby integrating the mRNA cap enrichment information with transcript expression analysis.
Maintained by Vivek Bhardwaj. Last updated 5 months ago.
immunooncologytranscriptiongeneexpressionsequencingrnaseqtranscriptomicsdifferentialexpressioncageexpressionrna-seq
1.5 match 2 stars 5.08 score 7 scriptsbioc
yamss:Tools for high-throughput metabolomics
Tools to analyze and visualize high-throughput metabolomics data acquired using chromatography-mass spectrometry. These tools preprocess data in a way that enables reliable and powerful differential analysis. At the core of these methods is a peak detection phase that pools information across all samples simultaneously. This is in contrast to other methods that detect peaks on a sample-by-sample basis.
Maintained by Leslie Myint. Last updated 5 months ago.
massspectrometrymetabolomicspeakdetectionsoftware
1.5 match 3 stars 5.08 score 9 scriptsralmond
CPTtools:Tools for Creating Conditional Probability Tables
Provides support for parameterized tables for Bayesian networks, particularly the IRT-like DiBello tables. Also provides some tools for visualizing the networks.
Maintained by Russell Almond. Last updated 3 months ago.
1.5 match 1 stars 5.05 score 21 scripts 4 dependentsgiscience
ohsome:An 'ohsome API' Client
A client that grants access to the power of the 'ohsome API' from R. It lets you analyze the rich data source of the 'OpenStreetMap (OSM)' history. You can retrieve the geometry of 'OSM' data at specific points in time, and you can get aggregated statistics on the evolution of 'OSM' elements and specify your own temporal, spatial and/or thematic filters.
Maintained by Oliver Fritz. Last updated 2 years ago.
heigitohsomeopenstreetmapopenstreetmap-dataopenstreetmap-historyosmosm-data
1.5 match 11 stars 5.04 score 9 scriptsbioc
nucleoSim:Generate synthetic nucleosome maps
This package can generate a synthetic map with reads covering the nucleosome regions as well as a synthetic map with forward and reverse reads emulating next-generation sequencing. The synthetic hybridization data of “Tiling Arrays” can also be generated. The user has a choice of three different distributions for the read positioning: Normal, Student and Uniform. In addition, a visualization tool is provided to explore the synthetic nucleosome maps.
Maintained by Astrid Deschênes. Last updated 5 months ago.
geneticssequencingsoftwarestatisticalmethodalignmentbioconductornucleosome-mapsnucleosomessimulationsimulatorsynthetic-nucleosomes
1.5 match 2 stars 5.00 score 8 scriptsajbass
sffdr:Surrogate Functional False Discovery Rates for Genome-Wide Association Studies
Pleiotropy-informed significance analysis of genome-wide association studies with surrogate functional false discovery rates (sfFDR). The sfFDR framework adapts the fFDR to leverage informative data from multiple sets of GWAS summary statistics to increase power in a study while accounting for linkage disequilibrium. sfFDR provides estimates of key FDR quantities in a significance analysis such as the functional local FDR and q-value, and uses these estimates to derive a functional p-value for type I error rate control and a functional local Bayes' factor for post-GWAS analyses (e.g., fine mapping and colocalization).
Maintained by Andrew Bass. Last updated 1 month ago.
1.5 match 4 stars 5.00 score 3 scriptsbioc
rRDP:Interface to the RDP Classifier
This package installs and interfaces the naive Bayesian classifier for 16S rRNA sequences developed by the Ribosomal Database Project (RDP). With this package the classifier trained with the standard training set can be used or a custom classifier can be trained.
Maintained by Michael Hahsler. Last updated 5 months ago.
geneticssequencinginfrastructureclassificationmicrobiomeimmunooncologyalignmentsequencematchingdataimportbayesianbioconductorbioinformaticsopenjdk
1.5 match 4 stars 5.00 score 6 scriptsdeploid-dev
DEploid:Deconvolute Mixed Genomes with Unknown Proportions
Traditional phasing programs are limited to diploid organisms. Our method modifies the Li and Stephens algorithm with Markov chain Monte Carlo (MCMC) approaches, and builds a generic framework that allows haplotype searches in a multiple infection setting. This package is primarily developed as part of the Pf3k project, which is a global collaboration using the latest sequencing technologies to provide a high-resolution view of natural variation in the malaria parasite Plasmodium falciparum. Parasite DNA is extracted from patient blood samples, which often contain more than one parasite strain, with unknown proportions. This package is used for deconvoluting mixed haplotypes, and reporting the mixture proportions from each sample.
Maintained by Joe Zhu. Last updated 2 months ago.
deconvoluting-mixed-genomeshmmmalariamcmcparasitesphasingunknown-proportionszlibcpp
1.5 match 1 stars 4.99 score 39 scriptsbioc
awst:Asymmetric Within-Sample Transformation
We propose an Asymmetric Within-Sample Transformation (AWST) to regularize RNA-seq read counts and reduce the effect of noise on the classification of samples. AWST comprises two main steps: standardization and smoothing. These steps transform gene expression data to reduce the noise of the lowly expressed features, which suffer from background effects and low signal-to-noise ratio, and the influence of the highly expressed features, which may be the result of amplification bias and other experimental artifacts.
Maintained by Davide Risso. Last updated 5 months ago.
normalizationgeneexpressionrnaseqsoftwaretranscriptomicssequencingsinglecell
1.5 match 3 stars 4.95 score 15 scriptsbioc
iSEEpathways:iSEE extension for panels related to pathway analysis
This package contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels or modes facilitating the analysis of pathway analysis results. This package does not perform pathway analysis. Instead, it provides methods to embed precomputed pathway analysis results in a SummarizedExperiment object, in a manner that is compatible with interactive visualisation in iSEE applications.
Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.
softwareinfrastructuredifferentialexpressiongeneexpressionguivisualizationpathwaysgenesetenrichmentgoshinyappsbioconductorhacktoberfestiseeiseeu
1.5 match 1 stars 4.95 score 10 scriptsbioc
epigraHMM:Epigenomic R-based analysis with hidden Markov models
epigraHMM provides a set of tools for the analysis of epigenomic data based on hidden Markov Models. It contains two separate peak callers, one for consensus peaks from biological or technical replicates, and one for differential peaks from multi-replicate multi-condition experiments. In differential peak calling, epigraHMM provides window-specific posterior probabilities associated with every possible combinatorial pattern of read enrichment across conditions.
Maintained by Pedro Baldoni. Last updated 5 months ago.
chipseqatacseqdnaseseqhiddenmarkovmodelepigeneticszlibopenblascppopenmp
1.5 match 4.94 score 88 scriptstjetka
SLEMI:Statistical Learning Based Estimation of Mutual Information
An implementation of the algorithm for estimation of mutual information and channel capacity from experimental data by classification procedures (logistic regression). Technically, it allows estimation of information-theoretic measures between finite-state input and multivariate, continuous output. Method described in Jetka et al. (2019) <doi:10.1371/journal.pcbi.1007132>.
Maintained by Tomasz Jetka. Last updated 1 year ago.
channel-capacityinformation-theorylogistic-regressionmutual-information-estimation
1.5 match 4 stars 4.92 score 21 scriptsmuschellij2
gcite:Google Citation Parser
Scrapes Google Citation pages and creates data frames of citations over time.
Maintained by John Muschelli. Last updated 3 years ago.
2.0 match 3 stars 3.67 score 31 scriptsbioc
rmelting:R Interface to MELTING 5
R interface to the MELTING 5 program (https://www.ebi.ac.uk/biomodels/tools/melting/) to compute melting temperatures of nucleic acid duplexes along with other thermodynamic parameters.
Maintained by J. Aravind. Last updated 5 months ago.
biomedicalinformaticscheminformaticsbioconductorbioinformaticsmelting-temperatureopenjdk
1.5 match 2 stars 4.78 score 10 scriptsangeella
pARI:Permutation-Based All-Resolutions Inference
Computes the All-Resolution Inference method in the permutation framework, i.e., simultaneous lower confidence bounds for the number of true discoveries. <doi:10.1002/sim.9725>.
Maintained by Angela Andreella. Last updated 6 months ago.
aricluster-mapcopesdiscoveriesfmrifslpermutationselective-inferencesimultaneous-confidence-boundsspmopenblascpp
1.5 match 4 stars 4.78 score 9 scripts 1 dependentsbioc
wpm:Well Plate Maker
The Well-Plate Maker (WPM) is a shiny application deployed as an R package. Functions for command-line/script use are also available. The WPM allows users to generate well plate maps to carry out their experiments while improving the handling of batch effects. In particular, it helps control the "plate effect" thanks to its ability to randomize samples over multiple well plates. The algorithm for placing the samples is inspired by the backtracking algorithm: the samples are placed at random while respecting specific spatial constraints.
Maintained by Helene Borges. Last updated 5 months ago.
guiproteomicsmassspectrometrybatcheffectexperimentaldesign
1.5 match 6 stars 4.78 score 7 scriptsbioc
decompTumor2Sig:Decomposition of individual tumors into mutational signatures by signature refitting
Uses quadratic programming for signature refitting, i.e., to decompose the mutation catalog from an individual tumor sample into a set of given mutational signatures (either Alexandrov-model signatures or Shiraishi-model signatures), computing weights that reflect the contributions of the signatures to the mutation load of the tumor.
Maintained by Rosario M. Piro. Last updated 5 months ago.
softwaresnpsequencingdnaseqgenomicvariationsomaticmutationbiomedicalinformaticsgeneticsbiologicalquestionstatisticalmethod
1.5 match 1 stars 4.78 score 10 scripts 1 dependentsmw201608
NetWeaver:Graphic Presentation of Complex Genomic and Network Data Analysis
Implements various simple function utilities and flexible pipelines to generate circular images for visualizing complex genomic and network data analysis features.
Maintained by Minghui Wang. Last updated 2 years ago.
1.5 match 4 stars 4.75 score 28 scriptsbioc
plyinteractions:Extending tidy verbs to genomic interactions
Operate on `GInteractions` objects as tabular data using `dplyr`-like verbs. The functions and methods in `plyinteractions` provide a grammatical approach to manipulate `GInteractions`, to facilitate their integration in genomic analysis workflows.
Maintained by Jacques Serizay. Last updated 5 months ago.
1.5 match 4.75 score 14 scriptsannechao
iNEXT.beta3D:Interpolation and Extrapolation with Beta Diversity for Three Dimensions of Biodiversity
As a sequel to 'iNEXT', the 'iNEXT.beta3D' package provides functions to compute standardized taxonomic, phylogenetic, and functional diversity (3D) estimates with a common sample size (for alpha and gamma diversity) or sample coverage (for alpha, beta, gamma diversity as well as dissimilarity or turnover indices). Hill numbers and their generalizations are used to quantify 3D and to make multiplicative decomposition (gamma = alpha x beta). The package also features size- and coverage-based rarefaction and extrapolation sampling curves to facilitate rigorous comparison of beta diversity across datasets. See Chao et al. (2023) <doi:10.1002/ecm.1588> for more details.
Maintained by Anne Chao. Last updated 4 months ago.
1.3 match 5.30 score 6 scriptsbioc
AWFisher:An R package for fast computing for adaptively weighted fisher's method
Implementation of the adaptively weighted Fisher's method, including fast p-value computing, variability index, and meta-pattern.
Maintained by Zhiguang Huo. Last updated 5 months ago.
1.5 match 5 stars 4.70 score 4 scriptsbioc
flowcatchR:Tools to analyze in vivo microscopy imaging data focused on tracking flowing blood cells
flowcatchR is a set of tools to analyze in vivo microscopy imaging data, focused on tracking flowing blood cells. It guides the steps from segmentation to calculation of features, filtering out particles not of interest, and also provides a set of utilities to help check the quality of the performed operations (e.g. how good the segmentation was). It allows investigating the issue of tracking flowing cells such as in blood vessels, to categorize the particles into flowing, rolling and adherent. This classification is applied in the study of phenomena such as hemostasis and thrombosis development. Moreover, flowcatchR presents an integrated workflow solution, based on the integration with a Shiny App and Jupyter notebooks, which is delivered alongside the package, and can enable fully reproducible bioimage analysis in the R environment.
Maintained by Federico Marini. Last updated 3 months ago.
softwarevisualizationcellbiologyclassificationinfrastructureguishinyappsbioconductorfluorescencemicroscopyparticlestracking
1.3 match 4 stars 5.62 score 8 scriptsbioc
TargetDecoy:Diagnostic Plots to Evaluate the Target Decoy Approach
A first step in the data analysis of Mass Spectrometry (MS) based proteomics data is to identify peptides and proteins. In this respect, the huge number of experimental mass spectra typically have to be assigned to theoretical peptides derived from a sequence database. Search engines are used for this purpose. These tools compare each of the observed spectra to all candidate theoretical spectra derived from the sequence database and calculate a score for each comparison. The observed spectrum is then assigned to the theoretical peptide with the best score, which is also referred to as the peptide to spectrum match (PSM). It is of course crucial for the downstream analysis to evaluate the quality of these matches. Therefore False Discovery Rate (FDR) control is used to return a reliable list of PSMs. The FDR, however, requires a good characterisation of the score distribution of PSMs that are matched to the wrong peptide (bad target hits). In proteomics, the target decoy approach (TDA) is typically used for this purpose. The TDA method matches the spectra to a database of real (targets) and nonsense peptides (decoys). A popular approach to generate these decoys is to reverse the target database. Hence, all the PSMs that match to a decoy are known to be bad hits and the distribution of their scores is used to estimate the distribution of the bad scoring target PSMs. A crucial assumption of the TDA is that the decoy PSM hits have similar properties as bad target hits so that the decoy PSM scores are a good simulation of the target PSM scores. Users, however, typically do not evaluate these assumptions. To this end we developed TargetDecoy to generate diagnostic plots to evaluate the quality of the target decoy method.
Maintained by Elke Debrie. Last updated 5 months ago.
massspectrometryproteomicsqualitycontrolsoftwarevisualizationbioconductormass-spectrometry
1.5 match 1 stars 4.60 score 9 scriptsrrwen
nbc4va:Bayes Classifier for Verbal Autopsy Data
An implementation of the Naive Bayes Classifier (NBC) algorithm used for Verbal Autopsy (VA) built on code from Miasnikof et al. (2015) <DOI:10.1186/s12916-015-0521-2>.
Maintained by Richard Wen. Last updated 3 years ago.
autopsybayescauseclassifiercodedcomputerdeathestimateimputationlearningmachinemdsmillionnaivenbcprobabilitystudytheoryvaverbal
1.5 match 4.60 score 79 scripts