Showing 27 of 27 results
rspatial
terra: Spatial Data Analysis
Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).
Maintained by Robert J. Hijmans. Last updated 21 hours ago.
geospatial raster spatial vector onetbb proj gdal geos cpp
560 stars 17.65 score 17k scripts 856 dependents
quanteda
quanteda: Quantitative Analysis of Textual Data
A fast, flexible, and comprehensive framework for quantitative text analysis in R. Provides functionality for corpus management, creating and manipulating tokens and n-grams, exploring keywords in context, forming and manipulating sparse matrices of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and distances, applying content dictionaries, applying supervised and unsupervised machine learning, visually representing text and text analyses, and more.
Maintained by Kenneth Benoit. Last updated 3 months ago.
corpus natural-language-processing quanteda text-analytics onetbb cpp
851 stars 16.65 score 5.4k scripts 52 dependents
bioc
microbiome: Microbiome Analytics
Utilities for microbiome analysis.
Maintained by Leo Lahti. Last updated 5 months ago.
metagenomics microbiome sequencing systemsbiology hitchip hitchip-atlas human-microbiome microbiology microbiome-analysis phyloseq population-study
293 stars 12.51 score 2.0k scripts 5 dependents
data-cleaning
validate: Data Validation Infrastructure
Declare data validation rules and data quality indicators; confront data with them and analyze or visualize the results. The package supports rules that are per-field, in-record, cross-record or cross-dataset. Rules can be automatically analyzed for rule type and connectivity. Supports checks implied by an SDMX DSD file as well. See also Van der Loo and De Jonge (2018) <doi:10.1002/9781118897126>, Chapter 6 and the JSS paper (2021) <doi:10.18637/jss.v097.i10>.
Maintained by Mark van der Loo. Last updated 26 days ago.
419 stars 12.39 score 448 scripts 8 dependents
bioc
VariantAnnotation: Annotation of Genetic Variants
Annotate variants, compute amino acid coding changes, predict coding outcomes.
Maintained by Bioconductor Package Maintainer. Last updated 3 months ago.
dataimport sequencing snp annotation genetics variantannotation curl bzip2 xz-utils zlib
11.39 score 1.9k scripts 152 dependents
ropensci
git2rdata: Store and Retrieve Data.frames in a Git Repository
The git2rdata package is an R package for writing and reading dataframes as plain text files. A metadata file stores important information. 1) Storing metadata makes it possible to preserve the classes of variables. By default, git2rdata optimizes the data for file storage. The optimization is most effective on data containing factors, but it makes the data less human readable. Users who prefer a human-readable format over smaller files can turn it off. Details on the implementation are available in vignette("plain_text", package = "git2rdata"). 2) Storing metadata also allows for smaller row-based diffs between two consecutive commits. This is a useful feature when storing data as plain text files under version control. Details on this part of the implementation are available in vignette("version_control", package = "git2rdata"). Although git2rdata was designed with a git workflow in mind, you can use it in combination with other version control systems such as Subversion or Mercurial. 3) git2rdata is a useful tool in a reproducible and traceable workflow. vignette("workflow", package = "git2rdata") gives a toy example. 4) vignette("efficiency", package = "git2rdata") provides some insight into the efficiency of file storage, git repository size, and speed for writing and reading.
Maintained by Thierry Onkelinx. Last updated 2 months ago.
reproducible-research version-control
99 stars 10.03 score 216 scripts 4 dependents
ropensci
RNeXML: Semantically Rich I/O for the 'NeXML' Format
Provides access to phyloinformatic data in 'NeXML' format. The package should add new functionality to R, such as the ability to manipulate 'NeXML' objects in more varied and refined ways, and compatibility with 'ape' objects.
Maintained by Carl Boettiger. Last updated 11 months ago.
metadata nexml phylogenetics linked-data
13 stars 9.97 score 100 scripts 19 dependents
mikewlcheung
metaSEM: Meta-Analysis using Structural Equation Modeling
A collection of functions for conducting meta-analysis using a structural equation modeling (SEM) approach via the 'OpenMx' and 'lavaan' packages. It also implements various procedures to perform meta-analytic structural equation modeling on correlation and covariance matrices; see Cheung (2015) <doi:10.3389/fpsyg.2014.01521>.
Maintained by Mike Cheung. Last updated 23 days ago.
meta-analysis meta-analytic-sem missing-data multilevel-models multivariate-analysis structural-equation-modeling structural-equation-models
30 stars 9.43 score 208 scripts 1 dependent
kurthornik
NLP: Natural Language Processing Infrastructure
Basic classes and methods for Natural Language Processing.
Maintained by Kurt Hornik. Last updated 4 months ago.
6 stars 9.42 score 1.0k scripts 127 dependents
ropensci
textreuse: Detect Text Reuse and Document Similarity
Tools for measuring similarity among documents and detecting passages which have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
Maintained by Yaoxiang Li. Last updated 1 month ago.
200 stars 9.28 score 226 scripts
bioc
cmapR: CMap Tools in R
The Connectivity Map (CMap) is a massive resource of perturbational gene expression profiles built by researchers at the Broad Institute and funded by the NIH Library of Integrated Network-Based Cellular Signatures (LINCS) program. Please visit https://clue.io for more information. The cmapR package implements methods to parse, manipulate, and write common CMap data objects, such as annotated matrices and collections of gene sets.
Maintained by Ted Natoli. Last updated 5 months ago.
dataimport datarepresentation geneexpression bioconductor bioinformatics cmap
90 stars 8.86 score 298 scripts
bioc
dittoSeq: User Friendly Single-Cell and Bulk RNA Sequencing Visualization
A universal, user friendly, single-cell and bulk RNA sequencing visualization toolkit that allows highly customizable creation of color blindness friendly, publication-quality figures. dittoSeq accepts both SingleCellExperiment (SCE) and Seurat objects, as well as the import and usage, via conversion to an SCE, of SummarizedExperiment or DGEList bulk data. Visualizations include dimensionality reduction plots, heatmaps, scatterplots, percent composition or expression across groups, and more. Customizations range from size and title adjustments to automatic generation of annotations for heatmaps, overlay of trajectory analysis onto any dimensionality reduction plot, hidden data overlay upon cursor hovering via ggplotly conversion, and many more, all with simple, discrete inputs. Color blindness friendliness is powered by legend adjustments (enlarged keys), and by allowing the use of shapes or letter-overlay in addition to the carefully selected dittoColors().
Maintained by Daniel Bunis. Last updated 5 months ago.
software visualization rnaseq singlecell geneexpression transcriptomics dataimport
7.56 score 760 scripts 2 dependents
meireles
spectrolab: Class and Methods for Spectral Data
Input/Output, processing and visualization of spectra taken with different spectrometers, including SVC (Spectra Vista), ASD and PSR (Spectral Evolution). Implements an S3 class spectra that other packages can build on. Provides methods to access, plot, manipulate, splice sensor overlap, vector normalize and smooth spectra.
Maintained by Jose Eduardo Meireles. Last updated 3 months ago.
16 stars 7.39 score 256 scripts
bioc
multiHiCcompare: Normalize and detect differences between Hi-C datasets when replicates of each experimental condition are available
multiHiCcompare provides functions for joint normalization and difference detection in multiple Hi-C datasets. This extension of the original HiCcompare package now allows for Hi-C experiments with more than 2 groups and multiple samples per group. multiHiCcompare operates on processed Hi-C data in the form of sparse upper triangular matrices. It accepts four column (chromosome, region1, region2, IF) tab-separated text files storing chromatin interaction matrices. multiHiCcompare provides cyclic loess and fast loess (fastlo) methods adapted to jointly normalizing Hi-C data. Additionally, it provides a general linear model (GLM) framework adapting the edgeR package to detect differences in Hi-C data in a distance dependent manner.
Maintained by Mikhail Dozmorov. Last updated 5 months ago.
software hic sequencing normalization
9 stars 7.30 score 37 scripts 2 dependents
bioc
SIAMCAT: Statistical Inference of Associations between Microbial Communities And host phenoTypes
Pipeline for Statistical Inference of Associations between Microbial Communities And host phenoTypes (SIAMCAT). A primary goal of analyzing microbiome data is to determine changes in community composition that are associated with environmental factors. In particular, linking human microbiome composition to host phenotypes such as diseases has become an area of intense research. For this, robust statistical modeling and biomarker extraction toolkits are crucially needed. SIAMCAT provides a full pipeline supporting data preprocessing, statistical association testing, statistical modeling (LASSO logistic regression) including tools for evaluation and interpretation of these models (such as cross validation, parameter selection, ROC analysis and diagnostic model plots).
Maintained by Jakob Wirbel. Last updated 5 months ago.
immunooncology metagenomics classification microbiome sequencing preprocessing clustering featureextraction geneticvariability multiplecomparison regression
6.72 score 147 scripts
gadenbuie
metathis: HTML Metadata Tags for 'R Markdown' and 'Shiny'
Create meta tags for 'R Markdown' HTML documents and 'Shiny' apps for customized social media cards, for accessibility, and quality search engine indexing. 'metathis' currently supports HTML documents created with 'rmarkdown', 'shiny', 'xaringan', 'pagedown', 'bookdown', and 'flexdashboard'.
Maintained by Garrick Aden-Buie. Last updated 1 year ago.
67 stars 6.29 score 584 scripts
bioc
biscuiteer: Convenience Functions for Biscuit
A test harness for bsseq loading of Biscuit output, summarization of WGBS data over defined regions and in mappable samples, with or without imputation, dropping of mostly-NA rows, age estimates, etc.
Maintained by Jacob Morrison. Last updated 5 months ago.
dataimport methylseq dnamethylation
6 stars 5.98 score 16 scripts
rethomics
behavr: Canonical Data Structure for Behavioural Data
Implements an S3 class based on 'data.table' to efficiently store and process ethomics (high-throughput behavioural) data.
Maintained by Quentin Geissmann. Last updated 4 years ago.
biological-data-analysis data-structures ethomics
6 stars 5.91 score 64 scripts 7 dependents
statistikat
tatoo: Combine and Export Data Frames
Functions to combine data.frames in ways that require additional effort in base R, and to add metadata (id, title, ...) that can be used for printing and xlsx export. The 'Tatoo_report' class is provided as a convenient helper to write several such tables to a workbook, one table per worksheet. Tatoo is built on top of 'openxlsx', but intimate knowledge of that package is not required to use tatoo.
Maintained by Stefan Fleck. Last updated 2 years ago.
7 stars 5.53 score 24 scripts
bioc
Rcwl: An R interface to the Common Workflow Language
The Common Workflow Language (CWL) is an open standard for development of data analysis workflows that is portable and scalable across different tools and working environments. Rcwl provides a simple way to wrap command line tools and build CWL data analysis pipelines programmatically within R. It increases the ease of usage, development, and maintenance of CWL pipelines.
Maintained by Qiang Hu. Last updated 5 months ago.
software workflowstep immunooncology
5.52 score 37 scripts 2 dependents
bioc
bacon: Controlling bias and inflation in association studies using the empirical null distribution
Bacon can be used to remove inflation and bias often observed in epigenome- and transcriptome-wide association studies. To this end bacon constructs an empirical null distribution using a Gibbs Sampling algorithm by fitting a three-component normal mixture on z-scores.
Maintained by Maarten van Iterson. Last updated 5 months ago.
immunooncology statisticalmethod bayesian regression genomewideassociation transcriptomics rnaseq methylationarray batcheffect multiplecomparison
5.19 score 97 scripts
bergsmat
nonmemica: Create and Evaluate NONMEM Models in a Project Context
Systematically creates and modifies NONMEM(R) control streams. Harvests NONMEM output, builds run logs, creates derivative data, generates diagnostics. NONMEM (ICON Development Solutions <https://www.iconplc.com/>) is software for nonlinear mixed effects modeling. See 'package?nonmemica'.
Maintained by Tim Bergsma. Last updated 3 months ago.
4 stars 4.58 score 45 scripts
bioc
SpatialOmicsOverlay: Spatial Overlay for Omic Data from Nanostring GeoMx Data
Tools for NanoString Technologies GeoMx Technology. Package to easily graph on top of an OME-TIFF image. Plotting annotations can range from tissue segment to gene expression.
Maintained by Maddy Griswold. Last updated 5 months ago.
geneexpression transcription cellbasedassays dataimport transcriptomics proteomics proprietaryplatforms rnaseq spatial datarepresentation visualization openjdk
4.30 score 8 scripts
tconwell
html5: Creates Valid HTML5 Strings
Generates valid HTML tag strings for HTML5 elements documented by Mozilla. Attributes are passed as named lists, with names being the attribute name and values being the attribute value. Attribute values are automatically double-quoted. To declare a DOCTYPE, wrap html() with function doctype(). Mozilla's documentation for HTML5 is available here: <https://developer.mozilla.org/en-US/docs/Web/HTML/Element>. Elements marked as obsolete are not included.
Maintained by Timothy Conwell. Last updated 2 years ago.
1 star 3.65 score 1 script 3 dependents
lawremi
rsolr: R to Solr Interface
A comprehensive R API for querying Apache Solr databases. A Solr core is represented as a data frame or list that supports Solr-side filtering, sorting, transformation and aggregation, all through the familiar base R API. Queries are processed lazily, i.e., a query is only sent to the database when the data are required.
Maintained by Michael Lawrence. Last updated 3 years ago.
9 stars 3.65 score 6 scripts
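The lazy query processing mentioned in the rsolr description can be illustrated with a small conceptual sketch. It is written in Python rather than R and does not reproduce rsolr's API or talk to Solr. The names `FakeBackend`, `LazyQuery`, `filter`, `sort_by`, and `collect` are hypothetical, and the in-memory backend is an assumption standing in for a remote server. The point is only that filter and sort steps are recorded first and the backend is contacted once, when the data are actually requested.

```python
class FakeBackend:
    """Hypothetical stand-in for a remote search server; counts queries received."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.calls = 0

    def execute(self, filters, sort_key):
        # Every call here represents one round trip to the server.
        self.calls += 1
        result = [row for row in self.rows if all(f(row) for f in filters)]
        if sort_key is not None:
            result.sort(key=lambda row: row[sort_key])
        return result


class LazyQuery:
    """Accumulates filter and sort steps; contacts the backend only in collect()."""

    def __init__(self, backend, filters=(), sort_key=None):
        self._backend = backend
        self._filters = tuple(filters)
        self._sort_key = sort_key

    def filter(self, predicate):
        # Returns a new query description; nothing is executed yet.
        return LazyQuery(self._backend, self._filters + (predicate,), self._sort_key)

    def sort_by(self, key):
        # Also only recorded, not executed.
        return LazyQuery(self._backend, self._filters, key)

    def collect(self):
        # The single point where the query is actually sent.
        return self._backend.execute(self._filters, self._sort_key)


backend = FakeBackend([
    {"id": 3, "score": 0.2},
    {"id": 1, "score": 0.9},
    {"id": 2, "score": 0.7},
])

query = LazyQuery(backend).filter(lambda row: row["score"] > 0.5).sort_by("id")
calls_before_collect = backend.calls  # no query has been sent at this point
rows = query.collect()                # the backend is queried here, once
```

Because each step returns a new query object instead of data, several steps can be chained and combined into a single request, which is the usual motivation for lazy evaluation against a remote store.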