Showing 200 of total 234 results
kjhealy
gssrdoc:Document General Social Survey Variable
The General Social Survey (GSS) is a long-running, mostly annual survey of US households. It is administered by the National Opinion Research Center (NORC). This package contains a tibble with information on the survey variables, together with every variable documented as an R help page. For more information on the GSS see <http://gss.norc.org>.
Maintained by Kieran Healy. Last updated 12 months ago.
75.5 match 2.28 score 38 scripts
eagerai
fastai:Interface to 'fastai'
The 'fastai' <https://docs.fast.ai/index.html> library simplifies training fast and accurate neural networks using modern best practices. It is based on research into deep learning best practices undertaken at 'fast.ai', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.
Maintained by Turgut Abdullayev. Last updated 12 months ago.
audiocollaborative-filteringdarknetdarknet-image-classificationfastaimedicalobject-detectiontabulartextvision
11.1 match 118 stars 9.40 score 76 scripts
selkamand
assertions:Simple Assertions for Beautiful and Customisable Error Messages
Provides simple assertions with sensible defaults and customisable error messages. It offers convenient assertion call wrappers and a general assert function that can handle any condition. Default error messages are user friendly and easily customized with inline code evaluation and styling powered by the 'cli' package.
Maintained by Sam El-Kamand. Last updated 4 months ago.
14.3 match 3 stars 6.84 score 172 scripts 3 dependents
bozercavdar
less:Learning with Subset Stacking
"Learning with Subset Stacking" is a supervised learning algorithm that is based on training many local estimators on subsets of a given dataset, and then passing their predictions to a global estimator. You can find the details about LESS in our manuscript at <arXiv:2112.06251>.
Maintained by Burhan Ozer Cavdar. Last updated 3 years ago.
55.5 match 1.70 score 5 scripts
henrikbengtsson
R.utils:Various Programming Utilities
Utility functions useful when programming and developing R packages.
Maintained by Henrik Bengtsson. Last updated 1 year ago.
5.6 match 63 stars 13.74 score 5.7k scripts 814 dependents
ciirc-kso
rless:Leaner Style Sheets
Converts LESS to CSS. It uses the V8 engine, in which the LESS parser is run. Functions for LESS text, file or folder conversion are provided. This work was supported by a junior grant research project by Czech Science Foundation 'GACR' no. 'GJ18-04150Y'.
Maintained by Jonas Vaclavek. Last updated 6 years ago.
18.7 match 1 star 4.00 score 8 scripts
mlverse
torch:Tensors and Neural Networks with 'GPU' Acceleration
Provides functionality to define and train neural networks similar to 'PyTorch' by Paszke et al (2019) <doi:10.48550/arXiv.1912.01703> but written entirely in R using the 'libtorch' library. Also supports low-level tensor operations and 'GPU' acceleration.
Maintained by Daniel Falbel. Last updated 6 days ago.
3.3 match 521 stars 16.50 score 1.4k scripts 39 dependents
opengeos
whitebox:'WhiteboxTools' R Frontend
An R frontend for the 'WhiteboxTools' library, which is an advanced geospatial data analysis platform developed by Prof. John Lindsay at the University of Guelph's Geomorphometry and Hydrogeomatics Research Group. 'WhiteboxTools' can be used to perform common geographical information systems (GIS) analysis operations, such as cost-distance analysis, distance buffering, and raster reclassification. Remote sensing and image processing tasks include image enhancement (e.g. panchromatic sharpening, contrast adjustments), image mosaicing, numerous filtering operations, simple classification (k-means), and common image transformations. 'WhiteboxTools' also contains advanced tooling for spatial hydrological analysis (e.g. flow-accumulation, watershed delineation, stream network analysis, sink removal), terrain analysis (e.g. common terrain indices such as slope, curvatures, wetness index, hillshading; hypsometric analysis; multi-scale topographic position analysis), and LiDAR data processing. Suggested citation: Lindsay (2016) <doi:10.1016/j.cageo.2016.07.003>.
Maintained by Andrew Brown. Last updated 6 months ago.
geomorphometrygeoprocessinggeospatialgishydrologyremote-sensingrstudio
5.3 match 173 stars 9.65 score 203 scripts 2 dependents
poissonconsulting
chk:Check User-Supplied Function Arguments
For developers to check user-supplied function arguments. It is designed to be simple, fast and customizable. Error messages follow the tidyverse style guide.
Maintained by Joe Thorley. Last updated 2 months ago.
4.3 match 48 stars 11.91 score 22 scripts 98 dependents
ixpantia
lacrmr:Connect to the 'Less Annoying CRM' API
Connect to the 'Less Annoying CRM' API with ease to get your crm data in a clean and tidy format. 'Less Annoying CRM' is a simple CRM built for small businesses, more information is available on their website <https://www.lessannoyingcrm.com/>.
Maintained by Frans van Dunné. Last updated 1 year ago.
crmhacktoberfestless-annoying-crm
10.5 match 2 stars 4.48 score 4 scripts
tengmcing
bandicoot:Light-Weight 'python'-Like Object-Oriented System
A light-weight object-oriented system with 'python'-like syntax which supports multiple inheritances and incorporates a 'python'-like method resolution order.
Maintained by Weihao Li. Last updated 1 year ago.
7.7 match 4 stars 4.78 score 10 scripts 1 dependent
gdemin
expss:Tables, Labels and Some Useful Functions from Spreadsheets and 'SPSS' Statistics
The package computes and displays tables with support for 'SPSS'-style labels, multiple and nested banners, weights, multiple-response variables and significance testing. There are facilities for nice output of tables in 'knitr', 'Shiny', '*.xlsx' files, R and 'Jupyter' notebooks. Methods for labelled variables add value labels support to base R functions and to some functions from other packages. Additionally, the package brings popular data transformation functions from 'SPSS' Statistics and 'Excel': 'RECODE', 'COUNT', 'COUNTIF', 'VLOOKUP', etc. These functions are very useful for data processing in marketing research surveys. The package is intended to help people move data processing from 'Excel' and 'SPSS' to R.
Maintained by Gregory Demin. Last updated 12 months ago.
excellabelslabels-supportmsexcelpivot-tablesrecodespssspss-statisticstablesvariable-labelsvlookup
3.3 match 84 stars 11.00 score 1.8k scripts 4 dependents
r-lib
testthat:Unit Testing for R
Software testing is important, but, in part because it is frustrating and boring, many of us avoid it. 'testthat' is a testing framework for R that is easy to learn and use, and integrates with your existing 'workflow'.
Maintained by Hadley Wickham. Last updated 1 month ago.
1.7 match 900 stars 20.99 score 74k scripts 471 dependents
rstudio
pointblank:Data Validation and Organization of Metadata for Local and Remote Tables
Validate data in data frames, 'tibble' objects, 'Spark' 'DataFrames', and database tables. Validation pipelines can be made using easily-readable, consecutive validation steps. Upon execution of the validation plan, several reporting options are available. User-defined thresholds for failure rates allow for the determination of appropriate reporting actions. Many other workflows are available including an information management workflow, where the aim is to record, collect, and generate useful information on data tables.
Maintained by Richard Iannone. Last updated 6 days ago.
data-assertionsdata-checkerdata-dictionariesdata-framesdata-inferencedata-managementdata-profilerdata-qualitydata-validationdata-verificationdatabase-tableseasy-to-understandreporting-toolschema-validationtesting-toolsyaml-configuration
3.4 match 942 stars 10.73 score 284 scripts
hrbrmstr
epidata:Tools to Retrieve Economic Policy Institute Data Library Extracts
The Economic Policy Institute (<http://www.epi.org/>) provides researchers, media, and the public with easily accessible, up-to-date, and comprehensive historical data on the American labor force. It is compiled from Economic Policy Institute analysis of government data sources. Use it to research wages, inequality, and other economic indicators over time and among demographic groups. Data is usually updated monthly.
Maintained by Bob Rudis. Last updated 5 years ago.
6.5 match 19 stars 5.42 score 28 scripts
thothorn
ipred:Improved Predictors
Improved predictive models by indirect classification and bagging for classification, regression and survival problems as well as resampling based estimators of prediction error.
Maintained by Torsten Hothorn. Last updated 9 months ago.
3.1 match 10.76 score 3.3k scripts 411 dependents
rstudio
shinyvalidate:Input Validation for Shiny Apps
Improves the user experience of Shiny apps by helping to provide feedback when required inputs are missing, or input values are not valid.
Maintained by Carson Sievert. Last updated 1 year ago.
3.5 match 112 stars 9.10 score 316 scripts 13 dependents
r-quantities
units:Measurement Units for R Vectors
Support for measurement units in R vectors, matrices and arrays: automatic propagation, conversion, derivation and simplification of units; raising errors in case of unit incompatibility. Compatible with the POSIXct, Date and difftime classes. Uses the UNIDATA udunits library and unit database for unit compatibility checking and conversion. Documentation about 'units' is provided in the paper by Pebesma, Mailund & Hiebert (2016, <doi:10.32614/RJ-2016-061>), included in this package as a vignette; see 'citation("units")' for details.
Maintained by Edzer Pebesma. Last updated 20 days ago.
1.8 match 181 stars 17.28 score 3.3k scripts 1.2k dependents
r-spatial
spdep:Spatial Dependence: Weighting Schemes, Statistics
A collection of functions to create spatial weights matrix objects from polygon 'contiguities', from point patterns by distance and tessellations, for summarizing these objects, and for permitting their use in spatial data analysis, including regional aggregation by minimum spanning tree; a collection of tests for spatial 'autocorrelation', including global 'Morans I' and 'Gearys C' proposed by 'Cliff' and 'Ord' (1973, ISBN: 0850860369) and (1981, ISBN: 0850860814), 'Hubert/Mantel' general cross product statistic, Empirical Bayes estimates and 'Assunção/Reis' (1999) <doi:10.1002/(SICI)1097-0258(19990830)18:16%3C2147::AID-SIM179%3E3.0.CO;2-I> Index, 'Getis/Ord' G ('Getis' and 'Ord' 1992) <doi:10.1111/j.1538-4632.1992.tb00261.x> and multicoloured join count statistics, 'APLE' ('Li 'et al.' ) <doi:10.1111/j.1538-4632.2007.00708.x>, local 'Moran's I', 'Gearys C' ('Anselin' 1995) <doi:10.1111/j.1538-4632.1995.tb00338.x> and 'Getis/Ord' G ('Ord' and 'Getis' 1995) <doi:10.1111/j.1538-4632.1995.tb00912.x>, 'saddlepoint' approximations ('Tiefelsdorf' 2002) <doi:10.1111/j.1538-4632.2002.tb01084.x> and exact tests for global and local 'Moran's I' ('Bivand et al.' 2009) <doi:10.1016/j.csda.2008.07.021> and 'LOSH' local indicators of spatial heteroscedasticity ('Ord' and 'Getis') <doi:10.1007/s00168-011-0492-y>. The implementation of most of these measures is described in 'Bivand' and 'Wong' (2018) <doi:10.1007/s11749-018-0599-x>, with further extensions in 'Bivand' (2022) <doi:10.1111/gean.12319>. 'Lagrange' multiplier tests for spatial dependence in linear models are provided ('Anselin et al'. 1996) <doi:10.1016/0166-0462(95)02111-6>, as are 'Rao' score tests for hypothesised spatial 'Durbin' models based on linear models ('Koley' and 'Bera' 2023) <doi:10.1080/17421772.2023.2256810>. A local indicators for categorical data (LICD) implementation based on 'Carrer et al.' (2021) <doi:10.1016/j.jas.2020.105306> and 'Bivand et al.' 
(2017) <doi:10.1016/j.spasta.2017.03.003> was added in 1.3-7. From 'spdep' and 'spatialreg' versions >= 1.2-1, the model fitting functions previously present in this package are defunct in 'spdep' and may be found in 'spatialreg'.
Maintained by Roger Bivand. Last updated 1 month ago.
spatial-autocorrelationspatial-dependencespatial-weights
1.8 match 131 stars 16.59 score 6.0k scripts 106 dependents
stan-dev
loo:Efficient Leave-One-Out Cross-Validation and WAIC for Bayesian Models
Efficient approximate leave-one-out cross-validation (LOO) for Bayesian models fit using Markov chain Monte Carlo, as described in Vehtari, Gelman, and Gabry (2017) <doi:10.1007/s11222-016-9696-4>. The approximation uses Pareto smoothed importance sampling (PSIS), a new procedure for regularizing importance weights. As a byproduct of the calculations, we also obtain approximate standard errors for estimated predictive errors and for the comparison of predictive errors between models. The package also provides methods for using stacking and other model weighting techniques to average Bayesian predictive distributions.
Maintained by Jonah Gabry. Last updated 19 days ago.
bayesbayesianbayesian-data-analysisbayesian-inferencebayesian-methodsbayesian-statisticscross-validationinformation-criterionmodel-comparisonstan
1.6 match 152 stars 17.30 score 2.6k scripts 297 dependents
statnet
ergm:Fit, Simulate and Diagnose Exponential-Family Models for Networks
An integrated set of tools to analyze and simulate networks based on exponential-family random graph models (ERGMs). 'ergm' is a part of the Statnet suite of packages for network analysis. See Hunter, Handcock, Butts, Goodreau, and Morris (2008) <doi:10.18637/jss.v024.i03> and Krivitsky, Hunter, Morris, and Klumb (2023) <doi:10.18637/jss.v105.i06>.
Maintained by Pavel N. Krivitsky. Last updated 23 days ago.
1.8 match 100 stars 15.36 score 1.4k scripts 36 dependents
r-lib
clock:Date-Time Types and Tools
Provides a comprehensive library for date-time manipulations using a new family of orthogonal date-time classes (durations, time points, zoned-times, and calendars) that partition responsibilities so that the complexities of time zones are only considered when they are really needed. Capabilities include: date-time parsing, formatting, arithmetic, extraction and updating of components, and rounding.
Maintained by Davis Vaughan. Last updated 15 days ago.
1.8 match 106 stars 14.53 score 296 scripts 407 dependents
cwickham
munsell:Utilities for Using Munsell Colours
Provides easy access to, and manipulation of, the Munsell colours. Provides a mapping between Munsell's original notation (e.g. "5R 5/10") and hexadecimal strings suitable for use directly in R graphics. Also provides utilities to explore slices through the Munsell colour tree, to transform Munsell colours and display colour palettes.
Maintained by Charlotte Wickham. Last updated 12 months ago.
1.8 match 111 stars 13.99 score 179 scripts 8.0k dependents
dgerbing
lessR:Less Code, More Results
Each function replaces multiple standard R functions. For example, two function calls, Read() and CountAll(), generate summary statistics for all variables in the data frame, plus histograms and bar charts as appropriate. Other functions provide for summary statistics via pivot tables, a comprehensive regression analysis, ANOVA and t-test, visualizations including the Violin/Box/Scatter plot for a numerical variable, bar chart, histogram, box plot, density curves, calibrated power curve, reading multiple data formats with the same function call, variable labels, time series with aggregation and forecasting, color themes, and Trellis (facet) graphics. Also includes a confirmatory factor analysis of multiple indicator measurement models, pedagogical routines for data simulation such as for the Central Limit Theorem, generation and rendering of regression instructions for interpretative output, and interactive visualizations.
Maintained by David W. Gerbing. Last updated 16 days ago.
3.3 match 6 stars 7.42 score 394 scripts 3 dependents
thothorn
maxstat:Maximally Selected Rank Statistics
Maximally selected rank statistics with several p-value approximations.
Maintained by Torsten Hothorn. Last updated 8 years ago.
3.1 match 1 star 7.69 score 107 scripts 59 dependents
brodieg
diffobj:Diffs for R Objects
Generate a colorized diff of two R objects for an intuitive visualization of their differences.
Maintained by Brodie Gaslam. Last updated 3 years ago.
1.8 match 231 stars 13.17 score 107 scripts 494 dependents
f0nzie
rTorch:R Bindings to 'PyTorch'
'R' implementation and interface of the Machine Learning platform 'PyTorch' <https://pytorch.org/> developed in 'Python'. It requires a 'conda' environment with 'torch' and 'torchvision' Python packages to provide 'PyTorch' functions, methods and classes. The key object in 'PyTorch' is the tensor which is in essence a multidimensional array. These tensors are fairly flexible in performing calculations in CPUs as well as 'GPUs' to accelerate tensor operations.
Maintained by Alfonso R. Reyes. Last updated 3 years ago.
3.7 match 6 stars 5.97 score 157 scripts
pecanproject
PEcAn.data.atmosphere:PEcAn Functions Used for Managing Climate Driver Data
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The PEcAn.data.atmosphere package converts climate driver data into a standard format for models integrated into PEcAn. As a standalone package, it provides an interface to access diverse climate data sets.
Maintained by David LeBauer. Last updated 8 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplants
1.9 match 216 stars 11.62 score 64 scripts 14 dependents
nhs-r-community
NHSRplotthedots:Draw XmR Charts for NHSE/I 'Making Data Count' Programme
Provides tools for drawing Statistical Process Control (SPC) charts. This package supports the NHSE/I programme 'Making Data Count', and allows users to draw XmR charts, use change points and apply rules with summary indicators for when rules are breached.
Maintained by Christopher Reading. Last updated 2 months ago.
2.5 match 49 stars 8.35 score 58 scripts
glsnow
TeachingDemos:Demonstrations for Teaching and Learning
Demonstration functions that can be used in a classroom to demonstrate statistical concepts, or on your own to better understand the concepts or the programming.
Maintained by Greg Snow. Last updated 1 year ago.
2.9 match 7.18 score 760 scripts 13 dependents
sgibb
MALDIquant:Quantitative Analysis of Mass Spectrometry Data
A complete analysis pipeline for matrix-assisted laser desorption/ionization-time-of-flight (MALDI-TOF) and other two-dimensional mass spectrometry data. In addition to commonly used plotting and processing methods it includes distinctive features, namely baseline subtraction methods such as morphological filters (TopHat) or the statistics-sensitive non-linear iterative peak-clipping algorithm (SNIP), peak alignment using warping functions, handling of replicated measurements as well as allowing spectra with different resolutions.
Maintained by Sebastian Gibb. Last updated 7 months ago.
maldimaldi-imsmaldi-tof-msmass-spectrometry
1.9 match 62 stars 11.06 score 180 scripts 44 dependents
mmaechler
sfsmisc:Utilities from 'Seminar fuer Statistik' ETH Zurich
Useful utilities ['goodies'] from Seminar fuer Statistik ETH Zurich, some of which were ported from S-plus in the 1990s. For graphics, have pretty (Log-scale) axes eaxis(), an enhanced Tukey-Anscombe plot, combining histogram and boxplot, 2d-residual plots, a 'tachoPlot()', pretty arrows, etc. For robustness, have a robust F test and robust range(). For system support, notably on Linux, provides 'Sys.*()' functions with more access to system and CPU information. Finally, miscellaneous utilities such as simple efficient prime numbers, integer codes, Duplicated(), toLatex.numeric() and is.whole().
Maintained by Martin Maechler. Last updated 5 months ago.
1.9 match 11 stars 10.85 score 566 scripts 119 dependents
jthomasmock
gtExtras:Extending 'gt' for Beautiful HTML Tables
Provides additional functions for creating beautiful tables with 'gt'. The functions are generally wrappers around boilerplate, or add opinionated niche capabilities and helper functions.
Maintained by Thomas Mock. Last updated 12 months ago.
data-sciencedata-visualizationdatascienceggplot2gtplotssparklinesparkline-graphssparklinestables
1.7 match 201 stars 11.66 score 2.4k scripts 5 dependents
fvafrcu
fritools:Utilities for the Forest Research Institute of the State Baden-Wuerttemberg
Miscellaneous utilities, tools and helper functions for finding and searching files on disk, searching for and removing R objects from the workspace. Does not import or depend on any third party package, but on core R only (i.e. it may depend on packages with priority 'base').
Maintained by Andreas Dominik Cullmann. Last updated 1 month ago.
3.4 match 5.82 score 4 scripts 6 dependents
plangfelder
WGCNA:Weighted Correlation Network Analysis
Functions necessary to perform Weighted Correlation Network Analysis on high-dimensional data as originally described in Horvath and Zhang (2005) <doi:10.2202/1544-6115.1128> and Langfelder and Horvath (2008) <doi:10.1186/1471-2105-9-559>. Includes functions for rudimentary data cleaning, construction of correlation networks, module identification, summarization, and relating of variables and modules to sample traits. Also includes a number of utility functions for data manipulation and visualization.
Maintained by Peter Langfelder. Last updated 7 months ago.
2.0 match 54 stars 9.65 score 5.3k scripts 32 dependents
fishr-core-team
FSA:Simple Fisheries Stock Assessment Methods
A variety of simple fish stock assessment methods.
Maintained by Derek H. Ogle. Last updated 2 months ago.
fishfisheriesfisheries-managementfisheries-stock-assessmentpopulation-dynamicsstock-assessment
1.7 match 69 stars 11.16 score 1.7k scripts 6 dependents
jverzani
UsingR:Data Sets, Etc. for the Text "Using R for Introductory Statistics", Second Edition
A collection of data sets to accompany the textbook "Using R for Introductory Statistics," second edition.
Maintained by John Verzani. Last updated 3 years ago.
3.8 match 1 star 5.00 score 1.4k scripts
ropensci
aorsf:Accelerated Oblique Random Forests
Fit, interpret, and compute predictions with oblique random forests. Includes support for partial dependence, variable importance, passing customized functions for variable importance and identification of linear combinations of features. Methods for the oblique random survival forest are described in Jaeger et al., (2023) <DOI:10.1080/10618600.2023.2231048>.
Maintained by Byron Jaeger. Last updated 11 days ago.
data-scienceobliquerandom-forestsurvivalopenblascppopenmp
2.0 match 58 stars 9.29 score 60 scripts 1 dependent
gnguy
assertable:Verbose Assertions for Tabular Data (Data.frames and Data.tables)
Simple, flexible, assertions on data.frame or data.table objects with verbose output for vetting. While other assertion packages apply towards more general use-cases, assertable is tailored towards tabular data. It includes functions to check variable names and values, whether the dataset contains all combinations of a given set of unique identifiers, and whether it is a certain length. In addition, assertable includes utility functions to check the existence of target files and to efficiently import multiple tabular data files into one data.table.
Maintained by Grant Nguyen. Last updated 4 years ago.
2.9 match 6.29 score 219 scripts 2 dependents
natverse
nat:NeuroAnatomy Toolbox for Analysis of 3D Image Data
NeuroAnatomy Toolbox (nat) enables analysis and visualisation of 3D biological image data, especially traced neurons. Reads and writes 3D images in NRRD and 'Amira' AmiraMesh formats and reads surfaces in 'Amira' hxsurf format. Traced neurons can be imported from and written to SWC and 'Amira' LineSet and SkeletonGraph formats. These data can then be visualised in 3D via 'rgl', manipulated including applying calculated registrations, e.g. using the 'CMTK' registration suite, and analysed. There is also a simple representation for neurons that have been subjected to 3D skeletonisation but not formally traced; this allows morphological comparison between neurons including searches and clustering (via the 'nat.nblast' extension package).
Maintained by Gregory Jefferis. Last updated 6 months ago.
3dconnectomicsimage-analysisneuroanatomyneuroanatomy-toolboxneuronneuron-morphologyneurosciencevisualisation
1.7 match 67 stars 9.94 score 436 scripts 2 dependents
gadenbuie
cleanrmd:Clean Class-Less 'R Markdown' HTML Documents
A collection of clean 'R Markdown' HTML document templates using classy-looking classless CSS styles. These documents use a minimal set of dependencies but still look great, making them suitable for use as package vignettes or for sharing results via email.
Maintained by Garrick Aden-Buie. Last updated 2 years ago.
classlessclassless-themecleancsshtmlrmarkdownstyletheme
2.9 match 151 stars 5.95 score 10 scripts 1 dependent
bioc
imcRtools:Methods for imaging mass cytometry data analysis
This R package supports the handling and analysis of imaging mass cytometry and other highly multiplexed imaging data. The main functionality includes reading in single-cell data after image segmentation and measurement, data formatting to perform channel spillover correction and a number of spatial analysis approaches. First, cell-cell interactions are detected via spatial graph construction; these graphs can be visualized with cells representing nodes and interactions representing edges. Furthermore, per cell, its direct neighbours are summarized to allow spatial clustering. Per image/grouping level, interactions between types of cells are counted, averaged and compared against random permutations. In that way, types of cells that interact more (attraction) or less (avoidance) frequently than expected by chance are detected.
Maintained by Daniel Schulz. Last updated 5 months ago.
immunooncologysinglecellspatialdataimportclusteringimcsingle-cell
2.2 match 24 stars 7.58 score 126 scripts
billdenney
PKNCA:Perform Pharmacokinetic Non-Compartmental Analysis
Compute standard Non-Compartmental Analysis (NCA) parameters for typical pharmacokinetic analyses and summarize them.
Maintained by Bill Denney. Last updated 1 month ago.
ncanoncompartmental-analysispharmacokinetics
1.3 match 73 stars 12.53 score 214 scripts 4 dependents
stemangiola
tidyHeatmap:A Tidy Implementation of Heatmap
This is a tidy implementation for heatmap. At the moment it is based on the (great) package 'ComplexHeatmap'. The goal of this package is to interface a tidy data frame with this powerful tool. Some of the advantages are: Row and/or columns colour annotations are easy to integrate just specifying one parameter (column names). Custom grouping of rows is easy to specify providing a grouped tbl. For example: df %>% group_by(...). Labels size adjusted by row and column total number. Default use of Brewer and Viridis palettes.
Maintained by Stefano Mangiola. Last updated 2 months ago.
assaydomaininfrastructurebrewercomplexheatmapcustom-palettedplyrgraphvizheatmapmtcarsplottingrstudioscaletibbletidytidy-data-frametidybulktidyverseviridis
1.6 match 335 stars 10.23 score 197 scripts 1 dependent
sujit-sahu
bmstdr:Bayesian Modeling of Spatio-Temporal Data with R
Fits, validates and compares a number of Bayesian models for spatial and space time point referenced and areal unit data. Model fitting is done using several packages: 'rstan', 'INLA', 'spBayes', 'spTimer', 'spTDyn', 'CARBayes' and 'CARBayesST'. Model comparison is performed using the DIC and WAIC, and K-fold cross-validation where the user is free to select their own subset of data rows for validation. Sahu (2022) <doi:10.1201/9780429318443> describes the methods in detail.
Maintained by Sujit K. Sahu. Last updated 1 day ago.
bayesianmodellingspatio-temporal-datacpp
3.1 match 16 stars 5.28 score 12 scripts
traversc
stringfish:Alt String Implementation
Provides an extendable, performant and multithreaded 'alt-string' implementation backed by 'C++' vectors and strings.
Maintained by Travers Ching. Last updated 5 months ago.
1.5 match 67 stars 10.14 score 14 scripts 57 dependents
bioc
BatchQC:Batch Effects Quality Control Software
Sequencing and microarray samples often are collected or processed in multiple batches or at different times. This often produces technical biases that can lead to incorrect results in the downstream analysis. BatchQC is a software tool that streamlines batch preprocessing and evaluation by providing interactive diagnostics, visualizations, and statistical analyses to explore the extent to which batch variation impacts the data. BatchQC diagnostics help determine whether batch adjustment needs to be done, and how correction should be applied before proceeding with a downstream analysis. Moreover, BatchQC interactively applies multiple common batch effect approaches to the data and the user can quickly see the benefits of each method. BatchQC is developed as a Shiny App. The output is organized into multiple tabs and each tab features an important part of the batch effect analysis and visualization of the data. The BatchQC interface has the following analysis groups: Summary, Differential Expression, Median Correlations, Heatmaps, Circular Dendrogram, PCA Analysis, Shape, ComBat and SVA.
Maintained by Jessica Anderson. Last updated 14 days ago.
batcheffectgraphandnetworkmicroarraynormalizationprincipalcomponentsequencingsoftwarevisualizationqualitycontrolrnaseqpreprocessingdifferentialexpressionimmunooncology
1.6 match 7 stars 9.06 score 54 scripts
nlmixr2
nlmixr2:Nonlinear Mixed Effects Models in Population PK/PD
Fit and compare nonlinear mixed-effects models in differential equations with flexible dosing information commonly seen in pharmacokinetics and pharmacodynamics (Almquist, Leander, and Jirstrand 2015 <doi:10.1007/s10928-015-9409-1>). Differential equation solving is by compiled C code provided in the 'rxode2' package (Wang, Hallow, and James 2015 <doi:10.1002/psp4.12052>).
Maintained by Matthew Fidler. Last updated 1 day ago.
1.7 match 52 stars 8.32 score 120 scripts 3 dependents
mrc-ide
gonovax:Deterministic Compartmental Model of Gonorrhoea with Vaccination
Model for gonorrhoea vaccination, using odin.
Maintained by Lilith Whittles. Last updated 18 days ago.
3.0 match 3 stars 4.56 score
darwin-eu
DrugUtilisation:Summarise Patient-Level Drug Utilisation in Data Mapped to the OMOP Common Data Model
Summarise patient-level drug utilisation cohorts using data mapped to the Observational Medical Outcomes Partnership (OMOP) common data model. New users and prevalent users cohorts can be generated and their characteristics, indication and drug use summarised.
Maintained by Martí Català. Last updated 3 months ago.
1.7 match 8.20 score 156 scripts 2 dependents
cran
epiR:Tools for the Analysis of Epidemiological Data
Tools for the analysis of epidemiological and surveillance data. Contains functions for directly and indirectly adjusting measures of disease frequency, quantifying measures of association on the basis of single or multiple strata of count data presented in a contingency table, computation of confidence intervals around incidence risk and incidence rate estimates and sample size calculations for cross-sectional, case-control and cohort studies. Surveillance tools include functions to calculate an appropriate sample size for 1- and 2-stage representative freedom surveys, functions to estimate surveillance system sensitivity and functions to support scenario tree modelling analyses.
Maintained by Mark Stevenson. Last updated 2 months ago.
1.7 match 10 stars 8.06 score 10 dependents
bioc
netZooR:Unified methods for the inference and analysis of gene regulatory networks
netZooR unifies the implementations of several Network Zoo methods (netzoo, netzoo.github.io) into a single package by creating interfaces between network inference and network analysis methods. Currently, the package has 3 methods for network inference including PANDA and its optimized implementation OTTER (network reconstruction using multiple lines of biological evidence), LIONESS (single-sample network inference), and EGRET (genotype-specific networks). Network analysis methods include CONDOR (community detection), ALPACA (differential community detection), CRANE (significance estimation of differential modules), MONSTER (estimation of network transition states). In addition, YARN allows processing of gene expression data for tissue-specific analyses and SAMBAR infers missing mutation data based on pathway information.
Maintained by Tara Eicher. Last updated 14 days ago.
networkinferencenetworkgeneregulationgeneexpressiontranscriptionmicroarraygraphandnetworkgene-regulatory-networktranscription-factors
1.7 match 105 stars 7.98 score
moderndive
moderndive:Tidyverse-Friendly Introductory Linear Regression
Datasets and wrapper functions for tidyverse-friendly introductory linear regression, used in "Statistical Inference via Data Science: A ModernDive into R and the Tidyverse" available at <https://moderndive.com/>.
Maintained by Albert Y. Kim. Last updated 3 months ago.
1.1 match 88 stars 11.32 score 1.8k scripts
cardiomoon
ztable:Zebra-Striped Tables in LaTeX and HTML Formats
Makes zebra-striped tables (tables with alternating row colors) in LaTeX and HTML formats easily from a data.frame, matrix, lm, aov, anova, glm, coxph, nls, fitdistr, mytable and cbind.mytable objects.
Maintained by Keon-Woong Moon. Last updated 2 years ago.
1.6 match 21 stars 7.90 score 212 scripts 2 dependents
kharchenkolab
pagoda2:Single Cell Analysis and Differential Expression
Analyzing and interactively exploring large-scale single-cell RNA-seq datasets. 'pagoda2' primarily performs normalization and differential gene expression analysis, with an interactive application for exploring single-cell RNA-seq datasets. It performs basic tasks such as cell size normalization, gene variance normalization, and can be used to identify subpopulations and run differential expression within individual samples. 'pagoda2' was written to rapidly process modern large-scale scRNAseq datasets of approximately 1e6 cells. The companion web application allows users to explore which gene expression patterns form the different subpopulations within your data. The package also serves as the primary method for preprocessing data for conos, <https://github.com/kharchenkolab/conos>. This package interacts with data available through the 'p2data' package, which is available in a 'drat' repository. To access this data package, see the instructions at <https://github.com/kharchenkolab/pagoda2>. The size of the 'p2data' package is approximately 6 MB.
Maintained by Evan Biederstedt. Last updated 1 years ago.
scrna-seqsingle-cellsingle-cell-rna-seqtranscriptomicsopenblascppopenmp
1.6 match 223 stars 8.00 score 282 scripts
tomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 years ago.
1.5 match 3 stars 8.20 score 7.8k scripts 11 dependents
david-hervas
clickR:Semi-Automatic Preprocessing of Messy Data with Change Tracking for Dataset Cleaning
Tools for assessing data quality, performing exploratory analysis, and semi-automatic preprocessing of messy data with change tracking for integral dataset cleaning.
Maintained by David Hervas Marin. Last updated 1 months ago.
2.3 match 2 stars 5.53 score 25 scripts 1 dependents
bioc
psichomics:Graphical Interface for Alternative Splicing Quantification, Analysis and Visualisation
Interactive R package with an intuitive Shiny-based graphical interface for alternative splicing quantification and integrative analyses of alternative splicing and gene expression based on The Cancer Genome Atlas (TCGA), the Genotype-Tissue Expression project (GTEx), Sequence Read Archive (SRA) and user-provided data. The tool interactively performs survival, dimensionality reduction and median- and variance-based differential splicing and gene expression analyses that benefit from the incorporation of clinical and molecular sample-associated features (such as tumour stage or survival). Interactive visual access to genomic mapping and functional annotation of selected alternative splicing events is also included.
Maintained by Nuno Saraiva-Agostinho. Last updated 5 months ago.
sequencingrnaseqalternativesplicingdifferentialsplicingtranscriptionguiprincipalcomponentsurvivalbiomedicalinformaticstranscriptomicsimmunooncologyvisualizationmultiplecomparisongeneexpressiondifferentialexpressionalternative-splicingbioconductordata-analysesdifferential-gene-expressiondifferential-splicing-analysisgene-expressiongtexrecount2rna-seq-datasplicing-quantificationsratcgavast-toolscpp
1.8 match 36 stars 6.95 score 31 scripts
bioc
ncdfFlow:ncdfFlow: A package that provides HDF5 based storage for flow cytometry data.
Provides HDF5 storage based methods and functions for manipulation of flow cytometry data.
Maintained by Mike Jiang. Last updated 3 months ago.
immunooncologyflowcytometryzlibcpp
1.6 match 7.56 score 96 scripts 11 dependents
joycekang
symphony:Efficient and Precise Single-Cell Reference Atlas Mapping
Implements the Symphony single-cell reference building and query mapping algorithms and additional functions described in Kang et al <https://www.nature.com/articles/s41467-021-25957-x>.
Maintained by Joyce Kang. Last updated 2 years ago.
3.1 match 3.83 score 134 scripts
kharchenkolab
conos:Clustering on Network of Samples
Wires together large collections of single-cell RNA-seq datasets, which allows for both the identification of recurrent cell clusters and the propagation of information between datasets in multi-sample or atlas-scale collections. 'Conos' focuses on the uniform mapping of homologous cell types across heterogeneous sample collections. For instance, users could investigate a collection of dozens of peripheral blood samples from cancer patients combined with dozens of controls, which perhaps includes samples of a related tissue such as lymph nodes. This package interacts with data available through the 'conosPanel' package, which is available in a 'drat' repository. To access this data package, see the instructions at <https://github.com/kharchenkolab/conos>. The size of the 'conosPanel' package is approximately 12 MB.
Maintained by Evan Biederstedt. Last updated 1 years ago.
batch-correctionscrna-seqsingle-cell-rna-seqopenblascppopenmp
1.6 match 205 stars 7.33 score 258 scripts
bioc
categoryCompare:Meta-analysis of high-throughput experiments using feature annotations
Calculates significant annotations (categories) in each of two (or more) feature (i.e. gene) lists, determines the overlap between the annotations, and returns graphical and tabular data about the significant annotations and which combinations of feature lists the annotations were found to be significant. Interactive exploration is facilitated through the use of RCytoscape (heavily suggested).
Maintained by Robert M. Flight. Last updated 5 months ago.
annotationgomultiplecomparisonpathwaysgeneexpressionbioconductor
1.7 match 6 stars 6.68 score
hayamizu-lab
treefit:The First Software for Quantitative Trajectory Inference
Perform two types of analysis: 1) checking the goodness-of-fit of tree models to your single-cell gene expression data; and 2) deciding which tree best fits your data.
Maintained by Kouhei Sutou. Last updated 2 months ago.
2.3 match 3 stars 4.95 score 1 scripts
markbravington
mvbutils:General utilities, workspace organization, code and docu editing, live package maintenance, etc
Hierarchical workspace tree, code editing and backup, easy package prep, editing of packages while loaded, per-object lazy-loading, easy documentation, macro functions, and miscellaneous utilities. Needed by debug package.
Maintained by Mark V. Bravington. Last updated 8 days ago.
1.7 match 6.57 score 138 scripts 18 dependents
jhudsl
ottrpal:Companion Tools for Open-Source Tools for Training Resources (OTTR)
Tools for converting Open-Source Tools for Training Resources (OTTR) courses into Leanpub or Coursera courses. 'ottrpal' is for use with the OTTR Template repository to create courses.
Maintained by Candace Savonen. Last updated 6 hours ago.
1.7 match 3 stars 6.53 score 10 scripts 1 dependents
cran
BayesGP:Efficient Implementation of Gaussian Process in Bayesian Hierarchical Models
Implements Bayesian hierarchical models with flexible Gaussian process priors, focusing on Extended Latent Gaussian Models and incorporating various Gaussian process priors for Bayesian smoothing. Computations leverage finite element approximations and adaptive quadrature for efficient inference. Methods are detailed in Zhang, Stringer, Brown, and Stafford (2023) <doi:10.1177/09622802221134172>; Zhang, Stringer, Brown, and Stafford (2024) <doi:10.1080/10618600.2023.2289532>; Zhang, Brown, and Stafford (2023) <doi:10.48550/arXiv.2305.09914>; and Stringer, Brown, and Stafford (2021) <doi:10.1111/biom.13329>.
Maintained by Ziang Zhang. Last updated 5 months ago.
3.4 match 3.18 score
marcelschweiker
comf:Models and Equations for Human Comfort Research
Calculation of various common and less common comfort indices such as predicted mean vote or the two node model. Converts physical variables such as relative to absolute humidity and evaluates the performance of comfort indices.
Maintained by Marcel Schweiker. Last updated 3 months ago.
2.2 match 3 stars 4.78 score 40 scripts
r-lib
marquee:Markdown Parser and Renderer for R Graphics
Provides the means to parse and render markdown text with grid along with facilities to define the styling of the text.
Maintained by Thomas Lin Pedersen. Last updated 4 days ago.
1.2 match 86 stars 8.59 score 28 scripts 1 dependents
quexiang
OpenMindat:Quickly Retrieve Datasets from the 'Mindat' API
The goal of the OpenMindat R package is to provide functions for users or machines to quickly and easily retrieve datasets from the mindat.org API (<https://api.mindat.org/schema/redoc/>).
Maintained by Xiang Que. Last updated 2 months ago.
1.8 match 34 stars 5.83 score 3 scripts
felixfan
FinCal:Time Value of Money, Time Series Analysis and Computational Finance
Package for time value of money calculation, time series analysis and computational finance.
Maintained by Felix Yanhui Fan. Last updated 8 years ago.
1.7 match 23 stars 6.14 score 203 scripts 1 dependents
hanettools
Perc:Using Percolation and Conductance to Find Information Flow Certainty in a Direct Network
Finds the certainty of dominance interactions, taking indirect interactions into account.
Maintained by Jessica Vandeleest. Last updated 4 years ago.
1.7 match 5.88 score 38 scripts
bioc
rBiopaxParser:Parses BioPax files and represents them in R
Parses BioPAX files and represents them in R. At the moment, BioPAX level 2 and level 3 are supported.
Maintained by Frank Kramer. Last updated 5 months ago.
1.7 match 10 stars 5.85 score 7 scripts
nlmixr2
nlmixr2extra:Nonlinear Mixed Effects Models in Population PK/PD, Extra Support Functions
Fit and compare nonlinear mixed-effects models in differential equations with flexible dosing information commonly seen in pharmacokinetics and pharmacodynamics (Almquist, Leander, and Jirstrand 2015 <doi:10.1007/s10928-015-9409-1>). Differential equation solving is by compiled C code provided in the 'rxode2' package (Wang, Hallow, and James 2015 <doi:10.1002/psp4.12052>). This package is for support functions like preconditioned fits <doi:10.1208/s12248-016-9866-5>, bootstrap and stepwise covariate selection.
Maintained by Matthew Fidler. Last updated 1 months ago.
1.7 match 3 stars 5.83 score 11 scripts 5 dependents
altabering
altadata:API Wrapper for Altadata.io
Functions for interacting directly with the 'ALTADATA' API. With this R package, developers can build applications around the 'ALTADATA' API without having to deal with accessing and managing requests and responses. 'ALTADATA' is a curated data marketplace. For more information, go to <https://www.altadata.io>.
Maintained by Emre Durukan. Last updated 4 years ago.
3.5 match 1 stars 2.70 score 1 scripts
cran
PLORN:Prediction with Less Overfitting and Robust to Noise
A method for quantitative prediction with many predictors. This package provides functions to construct a quantitative prediction model that overfits less and is robust to noise.
Maintained by Takahiko Koizumi. Last updated 3 years ago.
3.5 match 2.70 score
bioc
spiky:Spike-in calibration for cell-free MeDIP
spiky implements methods and model generation for cfMeDIP (cell-free methylated DNA immunoprecipitation) with spike-in controls. CfMeDIP is an enrichment protocol which avoids destructive conversion of scarce template, making it ideal as a "liquid biopsy," but creating certain challenges in comparing results across specimens, subjects, and experiments. The use of synthetic spike-in standard oligos allows diagnostics performed with cfMeDIP to quantitatively compare samples across subjects, experiments, and time points in both relative and absolute terms.
Maintained by Tim Triche. Last updated 5 months ago.
differentialmethylationdnamethylationnormalizationpreprocessingqualitycontrolsequencing
1.9 match 2 stars 4.90 score 3 scripts
ibecav
CGPfunctions:Powell Miscellaneous Functions for Teaching and Learning Statistics
Miscellaneous functions useful for teaching statistics as well as actually practicing the art. They typically are not new methods but rather wrappers around either base R or other packages.
Maintained by Chuck Powell. Last updated 4 years ago.
1.3 match 27 stars 7.28 score 122 scripts
sfcheung
modelbpp:Model BIC Posterior Probability
Fits the neighboring models of a fitted structural equation model and assesses the model uncertainty of the fitted model based on BIC posterior probabilities, using the method presented in Wu, Cheung, and Leung (2020) <doi:10.1080/00273171.2019.1574546>.
Maintained by Shu Fai Cheung. Last updated 7 months ago.
lavaanmodel-comparisonmodel-comparison-and-selectionmodel-selectionstructural-equation-modeling
2.0 match 4.54 score 2 scripts
bioc
lemur:Latent Embedding Multivariate Regression
Fit a latent embedding multivariate regression (LEMUR) model to multi-condition single-cell data. The model provides a parametric description of single-cell data measured with treatment vs. control or more complex experimental designs. The parametric model is used to (1) align conditions, (2) predict log fold changes between conditions for all cells, and (3) identify cell neighborhoods with consistent log fold changes. For those neighborhoods, a pseudobulked differential expression test is conducted to assess which genes are significantly changed.
Maintained by Constantin Ahlmann-Eltze. Last updated 5 months ago.
transcriptomicsdifferentialexpressionsinglecelldimensionreductionregressionopenblascpp
1.1 match 87 stars 7.69 score 81 scripts
kevinhzq
healthdb:Working with Healthcare Databases
A system for identifying diseases or events from healthcare databases and preparing data for epidemiological studies. It includes capabilities not supported by 'SQL', such as matching strings by 'stringr' style regular expressions, and can compute comorbidity scores (Quan et al. (2005) <doi:10.1097/01.mlr.0000182534.19832.83>) directly on a database server. The implementation is based on 'dbplyr' with full 'tidyverse' compatibility.
Maintained by Kevin Hu. Last updated 1 months ago.
1.8 match 2 stars 4.95 score
pboutros
ISOpureR:Deconvolution of Tumour Profiles
Deconvolution of mixed tumour profiles into normal and cancer for each patient, using the ISOpure algorithm in Quon et al. Genome Medicine, 2013 5:29. Deconvolution requires mixed tumour profiles and a set of unmatched "basis" normal profiles.
Maintained by Paul C Boutros. Last updated 6 years ago.
2.3 match 3 stars 3.61 score 34 scripts
wangyuncw
Trendtwosub:Two Sample Order Free Trend Nonparametric Inference
The package contains functions for non-parametric trend comparison of two independent samples with sequential subsamples.
Maintained by Yishi Wang Developer. Last updated 5 years ago.
4.0 match 2.00 score
opisthokonta
implied:Convert Between Bookmaker Odds and Probabilities
Convert between bookmaker odds and probabilities. Eight different algorithms are available, including basic normalization, Shin's method (Hyun Song Shin, (1992) <doi:10.2307/2234526>), and others.
Maintained by Jonas Christoffer Lindstrøm. Last updated 2 years ago.
1.2 match 9 stars 6.85 score 9 scripts
daniel-jg
ontologyPlot:Visualising Sets of Ontological Terms
Create R plots visualising ontological terms and the relationships between them with various graphical options - Greene et al. 2017 <doi:10.1093/bioinformatics/btw763>.
Maintained by Daniel Greene. Last updated 1 years ago.
1.8 match 4.48 score 50 scripts 5 dependents
hubverse-org
hubValidations:Testing framework for hubverse hub validations
This package aims at providing a simple interface to run validations on data and metadata submitted to a hubverse modeling hub. Validation tests can be run at different levels (single file, single folder, whole repository) and locally as well as part of a continuous integration workflow.
Maintained by Anna Krystalli. Last updated 20 days ago.
1.7 match 1 stars 4.67 score 27 scripts 1 dependents
tidymodels
hardhat:Construct Modeling Packages
Building modeling packages is hard. A large amount of effort generally goes into providing an implementation for a new method that is efficient, fast, and correct, but often less emphasis is put on the user interface. A good interface requires specialized knowledge about S3 methods and formulas, which the average package developer might not have. The goal of 'hardhat' is to reduce the burden around building new modeling packages by providing functionality for preprocessing, predicting, and validating input.
Maintained by Hannah Frick. Last updated 2 months ago.
0.5 match 104 stars 14.86 score 175 scripts 437 dependents
iancero
checkthat:Intuitive Unit Testing Tools for Data Manipulation
Provides a lightweight data validation and testing toolkit for R. Its guiding philosophy is that adding code-based data checks to users' existing workflow should be both quick and intuitive. The suite of functions included therefore mirrors the common data checks many users already perform by hand or by eye. Additionally, the 'checkthat' package is optimized to work within 'tidyverse' data manipulation pipelines.
Maintained by Ian Cero. Last updated 2 years ago.
1.8 match 3 stars 4.18 score 4 scripts
bioc
yarn:YARN: Robust Multi-Condition RNA-Seq Preprocessing and Normalization
Expedite large RNA-Seq analyses using a combination of previously developed tools. YARN is meant to make it easier for the user in performing basic mis-annotation quality control, filtering, and condition-aware normalization. YARN leverages many Bioconductor tools and statistical techniques to account for the large heterogeneity and sparsity found in very large RNA-seq experiments.
Maintained by Joseph N Paulson. Last updated 5 months ago.
softwarequalitycontrolgeneexpressionsequencingpreprocessingnormalizationannotationvisualizationclustering
1.7 match 4.49 score 31 scripts
r-forge
numDeriv:Accurate Numerical Derivatives
Methods for calculating (usually) accurate numerical first and second order derivatives. Accurate calculations are done using Richardson's extrapolation or, when applicable, a complex step derivative is available. A simple difference method is also provided. Simple difference is (usually) less accurate but is much quicker than Richardson's extrapolation and provides a useful cross-check. Methods are provided for real scalar and vector valued functions.
Maintained by Paul Gilbert. Last updated 3 months ago.
0.5 match 1 stars 14.10 score 1.2k scripts 3.1k dependents
hugogogo
natural:Estimating the Error Variance in a High-Dimensional Linear Model
Implementation of the two error variance estimation methods in high-dimensional linear models of Yu, Bien (2017) <arXiv:1712.02412>.
Maintained by Guo Yu. Last updated 7 years ago.
1.6 match 1 stars 4.48 score 9 scripts
bioc
qckitfastq:FASTQ Quality Control
Assessment of FASTQ file format with multiple metrics including quality score, sequence content, overrepresented sequence and Kmers.
Maintained by August Guang. Last updated 5 months ago.
softwarequalitycontrolsequencingzlibcpp
1.6 match 4.38 score 24 scripts
rstudio
pool:Object Pooling
Enables the creation of object pools, which make it less computationally expensive to fetch a new object. Currently the only supported pooled objects are 'DBI' connections.
Maintained by Hadley Wickham. Last updated 6 months ago.
0.5 match 255 stars 12.85 score 684 scripts 27 dependents
bioc
dada2:Accurate, high-resolution sample inference from amplicon sequencing data
The dada2 package infers exact amplicon sequence variants (ASVs) from high-throughput amplicon sequencing data, replacing the coarser and less accurate OTU clustering approach. The dada2 pipeline takes as input demultiplexed fastq files, and outputs the sequence variants and their sample-wise abundances after removing substitution and chimera errors. Taxonomic classification is available via a native implementation of the RDP naive Bayesian classifier, and species-level assignment to 16S rRNA gene fragments by exact matching.
Maintained by Benjamin Callahan. Last updated 5 months ago.
immunooncologymicrobiomesequencingclassificationmetagenomicsampliconbioconductorbioinformaticsmetabarcodingtaxonomycpp
0.5 match 487 stars 13.17 score 3.0k scripts 4 dependents
ngthomas
microhaplot:Microhaplotype Constructor and Visualizer
A downstream bioinformatics tool to construct and assist curation of microhaplotypes from short read sequences.
Maintained by Thomas Ng. Last updated 4 years ago.
amplicon-sequencingmicrohaplot-shinyshinyvcf
1.2 match 18 stars 5.73 score 10 scripts
poissonconsulting
subfoldr2:Save and Load R Objects
Facilitates saving and loading R objects, data frames, tables, plots, text blocks and numbers to subfolders.
Maintained by Joe Thorley. Last updated 30 days ago.
1.8 match 2 stars 3.70 score 5 scripts
bioc
plasmut:Stratifying mutations observed in cell-free DNA and white blood cells as germline, hematopoietic, or somatic
A Bayesian method for quantifying the likelihood that a given plasma mutation arises from clonal hematopoiesis or the underlying tumor. It requires sequencing data of the mutation in plasma and white blood cells with the number of distinct and mutant reads in both tissues. We implement a Monte Carlo importance sampling method to assess the likelihood that a mutation arises from the tumor relative to non-tumor origin.
Maintained by Adith Arun. Last updated 5 months ago.
bayesiansomaticmutationgermlinemutationsequencing
1.6 match 4.00 score 2 scripts
derek-corcoran-barrios
NetworkExtinction:Extinction Simulation in Ecological Networks
Simulates the extinction of species in ecological networks and analyzes its cascading effects, as described in Dunne et al. (2002) <doi:10.1073/pnas.192407699>.
Maintained by Derek Corcoran. Last updated 5 months ago.
1.2 match 5 stars 5.15 score 19 scripts
strevisani
SurfRough:Calculate Surface/Image Texture Indexes
Methods for the computation of surface/image texture indices using a geostatistical based approach (Trevisani et al. (2023) <doi:10.1016/j.geomorph.2023.108838>). It provides various functions for the computation of surface texture indices (e.g., omnidirectional roughness and roughness anisotropy), including the ones based on the robust MAD estimator. The kernels included in the software also make it possible to calculate the surface/image texture indices directly from the input surface (i.e., without de-trending) using increments of order 2. It also provides the new radial roughness index (RRI), an improvement on the popular topographic roughness index (TRI). The framework can be easily extended with ad-hoc surface/image texture indices.
Maintained by Sebastiano Trevisani. Last updated 18 days ago.
1.7 match 1 stars 3.65 score
bioc
GeneGA:Design gene based on both mRNA secondary structure and codon usage bias using Genetic algorithm
R based Genetic algorithm for gene expression optimization by considering both mRNA secondary structure and codon usage bias. GeneGA includes the information of highly expressed genes of almost 200 genomes. The Vienna RNA Package is needed for GeneGA to function properly.
Maintained by Zhenpeng Li. Last updated 5 months ago.
2.5 match 2.30 score 6 scripts
bioc
gdsfmt:R Interface to CoreArray Genomic Data Structure (GDS) Files
Provides a high-level R interface to CoreArray Genomic Data Structure (GDS) data files. GDS is portable across platforms with hierarchical structure to store multiple scalable array-oriented data sets with metadata information. It is suited for large-scale datasets, especially for data which are much larger than the available random-access memory. The gdsfmt package offers efficient operations specifically designed for integers of less than 8 bits, since a diploid genotype, like single-nucleotide polymorphism (SNP), usually occupies fewer bits than a byte. Data compression and decompression are available with relatively efficient random access. A GDS file can also be read in parallel by multiple R processes, supported by the package parallel.
Maintained by Xiuwen Zheng. Last updated 18 days ago.
infrastructuredataimportbioinformaticsgds-formatgenomicscpp
0.5 match 18 stars 11.34 score 920 scripts 29 dependents
andymckenzie
bayesbio:Miscellaneous Functions for Bioinformatics and Bayesian Statistics
A hodgepodge of hopefully helpful functions. Two of these perform shrinkage estimation: one using a simple weighted method where the user can specify the degree of shrinkage required, and one using James-Stein shrinkage estimation for the case of unequal variances.
Maintained by Andrew McKenzie. Last updated 6 years ago.
1.8 match 1 stars 3.18 score 30 scripts
wrathematics
float:32-Bit Floats
R comes with a suite of utilities for linear algebra with "numeric" (double precision) vectors/matrices. However, sometimes single precision (or less!) is more than enough for a particular task. This package extends R's linear algebra facilities to include 32-bit float (single precision) data. Float vectors/matrices have half the precision of their "numeric"-type counterparts but are generally faster to numerically operate on, for a performance vs accuracy trade-off. The internal representation is an S4 class, which allows us to keep the syntax identical to that of base R. Interaction between floats and base types for binary operators is generally possible; in these cases, type promotion always defaults to the higher precision. The package ships with copies of the single precision 'BLAS' and 'LAPACK', which are automatically built in the event they are not available on the system.
Maintained by Drew Schmidt. Last updated 23 days ago.
float-matrixhpclinear-algebramatrixfortranopenblasopenmp
0.5 match 46 stars 10.53 score 228 scripts 42 dependents
jimbrig
jimstools:Tools for R
What the package does (one paragraph).
Maintained by Jimmy Briggs. Last updated 3 years ago.
1.8 match 2 stars 3.00 score 2 scripts
cran
PracTools:Designing and Weighting Survey Samples
Functions and datasets to support Valliant, Dever, and Kreuter (2018), <doi:10.1007/978-3-319-93632-1>, "Practical Tools for Designing and Weighting Survey Samples". Contains functions for sample size calculation for survey samples using stratified or clustered one-, two-, and three-stage sample designs, and single-stage audit sample designs. Functions are included that will group geographic units accounting for distances apart and measures of size. Other functions compute variance components for multistage designs and sample sizes in two-phase designs. A number of example data sets are included.
Maintained by Richard Valliant. Last updated 9 months ago.
1.7 match 1 stars 3.18 score 1 dependents
bioc
UCell:Rank-based signature enrichment analysis for single-cell data
UCell is a package for evaluating gene signatures in single-cell datasets. UCell signature scores, based on the Mann-Whitney U statistic, are robust to dataset size and heterogeneity, and their calculation demands less computing time and memory than other available methods, enabling the processing of large datasets in a few minutes even on machines with limited computing power. UCell can be applied to any single-cell data matrix, and includes functions to directly interact with SingleCellExperiment and Seurat objects.
Maintained by Massimo Andreatta. Last updated 5 months ago.
singlecellgenesetenrichmenttranscriptomicsgeneexpressioncellbasedassays
0.5 match 145 stars 10.21 score 454 scripts 2 dependents
msberends
plot2:A Plotting Assistant for Fast 'ggplot2' Visualisations
A streamlined extension of 'ggplot2' designed to simplify and accelerate the creation of data visualisations. 'plot2' automates common tasks such as axis handling, plot type selection, and data transformation, allowing users to create complex, publication-ready plots with minimal code. It integrates seamlessly with the tidyverse and retains full compatibility with 'ggplot2', while offering additional conveniences like enhanced sorting, faceting, and custom theming.
Maintained by Matthijs S. Berends. Last updated 24 days ago.
ggplot2helperplottingtidyverse
1.2 match 1 stars 4.32 score 8 scripts 1 dependents
ropensci
git2rdata:Store and Retrieve Data.frames in a Git Repository
The git2rdata package is an R package for writing and reading dataframes as plain text files. A metadata file stores important information. 1) Storing metadata makes it possible to maintain the classes of variables. By default, git2rdata optimizes the data for file storage. The optimization is most effective on data containing factors. The optimization makes the data less human readable. The user can turn this off when they prefer a human readable format over smaller files. Details on the implementation are available in vignette("plain_text", package = "git2rdata"). 2) Storing metadata also allows smaller row based diffs between two consecutive commits. This is a useful feature when storing data as plain text files under version control. Details on this part of the implementation are available in vignette("version_control", package = "git2rdata"). Although we envisioned git2rdata with a git workflow in mind, you can use it in combination with other version control systems like subversion or mercurial. 3) git2rdata is a useful tool in a reproducible and traceable workflow. vignette("workflow", package = "git2rdata") gives a toy example. 4) vignette("efficiency", package = "git2rdata") provides some insight into the efficiency of file storage, git repository size and speed for writing and reading.
Maintained by Thierry Onkelinx. Last updated 2 months ago.
reproducible-researchversion-control
0.5 match 99 stars 10.03 score 216 scripts 4 dependents
dangerzig
tipitaka:Data and Tools for Analyzing the Pali Canon
Provides access to the complete Pali Canon, or Tipitaka, the canonical scripture for Theravadin Buddhists worldwide. Based on the Chattha Sangayana Tipitaka version 4 (Vipassana Research Institute, 1990).
Maintained by Dan Zigmond. Last updated 3 years ago.
1.8 match 1 stars 2.70 score 1 scripts
bioc
DepInfeR:Inferring tumor-specific cancer dependencies through integrating ex-vivo drug response assays and drug-protein profiling
DepInfeR integrates two experimentally accessible input data matrices: the drug sensitivity profiles of cancer cell lines or primary tumors ex-vivo (X), and the drug affinities of a set of proteins (Y), to infer a matrix of molecular protein dependencies of the cancers (ß). DepInfeR deconvolutes the protein inhibition effect on the viability phenotype by using regularized multivariate linear regression. It assigns a “dependence coefficient” to each protein and each sample, and therefore could be used to gain a causal and accurate understanding of functional consequences of genomic aberrations in a heterogeneous disease, as well as to guide the choice of pharmacological intervention for a specific cancer type, sub-type, or an individual patient. For more information, please read our preprint on bioRxiv: https://doi.org/10.1101/2022.01.11.475864.
Maintained by Junyan Lu. Last updated 5 months ago.
softwareregressionpharmacogeneticspharmacogenomicsfunctionalgenomics
1.1 match 1 star 4.36 score 23 scripts
cdruee
readmet:Read some less Popular Formats Used in Meteorology
Contains tools for reading and writing data from or to files in the formats: akterm, dmna, Scintec Format-1, and Campbell Scientific TOA5.
Maintained by Clemens Druee. Last updated 1 year ago.
4.7 match 1.00 score
ledell
cvAUC:Cross-Validated Area Under the ROC Curve Confidence Intervals
Tools for working with and evaluating cross-validated area under the ROC curve (AUC) estimators. The primary functions of the package are ci.cvAUC and ci.pooled.cvAUC, which report cross-validated AUC and compute confidence intervals for cross-validated AUC estimates based on influence curves for i.i.d. and pooled repeated measures data, respectively. One benefit to using influence curve based confidence intervals is that they require much less computation time than bootstrapping methods. The utility functions, AUC and cvAUC, are simple wrappers for functions from the ROCR package.
Maintained by Erin LeDell. Last updated 3 years ago.
aucconfidence-intervalscross-validationmachine-learningstatisticsvariance
0.5 match 23 stars 9.17 score 317 scripts 40 dependents
bernice0321
UnivRNG:Univariate Pseudo-Random Number Generation
Pseudo-random number generation of 17 univariate distributions proposed by Demirtas (2005) <DOI:10.22237/jmasm/1114907220>.
Maintained by Ran Gao. Last updated 4 years ago.
3.6 match 1.26 score 18 scripts
cran
drcarlate:Improving Estimation Efficiency in CAR with Imperfect Compliance
We provide a list of functions for replicating the results of the Monte Carlo simulations and empirical application of Jiang et al. (2022). In particular, we provide corresponding functions for generating the three types of random data described in this paper, as well as all the estimation strategies. Detailed information about the data generation process and estimation strategy can be found in Jiang et al. (2022) <doi:10.48550/arXiv.2201.13004>.
Maintained by Mingxin Zhang. Last updated 2 years ago.
1.6 match 2.70 score
chris31415926535
pseudohouseholds:Generate Pseudohouseholds on Road Networks in Regions
Given an arbitrary set of spatial regions and road networks, generate a set of representative points, or pseudohouseholds, that can be used for travel burden analysis. Parallel processing is supported.
Maintained by Christopher Belanger. Last updated 1 year ago.
1.2 match 3.70 score 9 scripts
bristol-vaccine-centre
avoncap:AvonCap Study Analysis
A work-in-progress set of functions for loading and wrangling the AvonCap data set.
Maintained by Rob Challen. Last updated 4 months ago.
1.8 match 2.34 score 11 scripts
bioc
mzID:An mzIdentML parser for R
A parser for mzIdentML files implemented using the XML package. The parser tries to be general and able to handle all types of mzIdentML files with the drawback of having less 'pretty' output than a vendor specific parser. Please contact the maintainer with any problems and supply an mzIdentML file so the problems can be fixed quickly.
Maintained by Laurent Gatto. Last updated 5 months ago.
immunooncologydataimportmassspectrometryproteomics
0.5 match 7.83 score 32 scripts 38 dependents
jasdumas
shinyLP:Bootstrap Landing Home Pages for Shiny Applications
Provides functions that wrap HTML Bootstrap components code to enable the design and layout of informative landing home pages for Shiny applications. This can lead to a better experience for users and less HTML for the developer to write.
Maintained by Jasmine Daly. Last updated 29 days ago.
bootstrapr-shinyshinyui-design
0.5 match 115 stars 7.29 score 85 scripts 2 dependents
hongyuanjia
eplusr:A Toolkit for Using Whole Building Simulation Program 'EnergyPlus'
A rich toolkit for using the whole building simulation program 'EnergyPlus' (<https://energyplus.net>), which enables programmatic navigation and modification of 'EnergyPlus' models and makes it less painful to do parametric simulations and analysis.
Maintained by Hongyuan Jia. Last updated 8 months ago.
energy-simulationenergyplusenergyplus-modelseplusepwiddidfparametric-simulationr6simulation
0.5 match 71 stars 7.19 score 91 scripts 4 dependents
bioc
ATACseqQC:ATAC-seq Quality Control
ATAC-seq, an assay for Transposase-Accessible Chromatin using sequencing, is a rapid and sensitive method for chromatin accessibility analysis. It was developed as an alternative method to MNase-seq, FAIRE-seq and DNAse-seq. Compared to the other methods, ATAC-seq requires less biological sample and less processing time. In the process of analyzing several ATAC-seq datasets produced in our labs, we learned some of the unique aspects of the quality assessment for ATAC-seq data. To help users quickly assess whether their ATAC-seq experiment is successful, we developed the ATACseqQC package, partially following the guideline published in Nature Methods 2013 (Greenleaf et al.), including diagnostic plots of fragment size distribution, proportion of mitochondrial reads, nucleosome positioning pattern, and CTCF or other transcription factor footprints.
Maintained by Jianhong Ou. Last updated 3 months ago.
sequencingdnaseqatacseqgeneregulationqualitycontrolcoveragenucleosomepositioningimmunooncology
0.5 match 7.12 score 146 scripts 1 dependent
thinkr-open
thinkr:Tools for Cleaning Up Messy Files
Some tools for cleaning up messy 'Excel' files to be suitable for R. People who have been working with 'Excel' for years built more or less complicated sheets with names, characters, and formats that are not homogeneous. To be able to use them in R nowadays, we built a set of functions that avoid the majority of import problems and preserve as much of the data as possible.
Maintained by Vincent Guyader. Last updated 3 years ago.
hacktoberfestthinkr-not-maintained
0.5 match 29 stars 6.96 score 45 scripts
jwiley
extraoperators:Extra Binary Relational and Logical Operators
Speed up common tasks, particularly logical or relational comparisons and routine follow-up tasks such as finding the indices and subsetting. Inspired by mathematics, where something like: 3 < x < 6 is a standard, elegant and clear way to assert that x is both greater than 3 and less than 6 (see for example <https://en.wikipedia.org/wiki/Relational_operator>), a chaining operator is implemented. The chaining operator, %c%, allows multiple relational operations to be used in quotes on the right hand side for the same object, on the left hand side. The %e% operator allows something like set-builder notation (see for example <https://en.wikipedia.org/wiki/Set-builder_notation>) to be used on the right hand side. All operators have built-in prefixes defined for all, subset, and which to reduce the amount of code needed for common tasks, such as returning those values that are true.
Maintained by Joshua F. Wiley. Last updated 1 year ago.
0.5 match 3 stars 7.06 score 239 scripts 7 dependents
tom-wolff
ideanet:Integrating Data Exchange and Analysis for Networks ('ideanet')
A suite of convenient tools for social network analysis geared toward students, entry-level users, and non-expert practitioners. ‘ideanet’ features unique functions for the processing and measurement of sociocentric and egocentric network data. These functions automatically generate node- and system-level measures commonly used in the analysis of these types of networks. Outputs from these functions maximize the ability of novice users to employ network measurements in further analyses while making all users less prone to common data analytic errors. Additionally, ‘ideanet’ features an R Shiny graphic user interface that allows novices to explore network data with minimal need for coding.
Maintained by Tom Wolff. Last updated 19 days ago.
0.5 match 6 stars 6.80 score 10 scripts
kozodoi
fairness:Algorithmic Fairness Metrics
Offers calculation, visualization and comparison of algorithmic fairness metrics. Fair machine learning is an emerging topic with the overarching aim to critically assess whether ML algorithms reinforce existing social biases. Unfair algorithms can propagate such biases and produce predictions with a disparate impact on various sensitive groups of individuals (defined by sex, gender, ethnicity, religion, income, socioeconomic status, physical or mental disabilities). Fair algorithms possess the underlying foundation that these groups should be treated similarly or have similar prediction outcomes. The fairness R package offers the calculation and comparisons of commonly and less commonly used fairness metrics in population subgroups. These methods are described by Calders and Verwer (2010) <doi:10.1007/s10618-010-0190-x>, Chouldechova (2017) <doi:10.1089/big.2016.0047>, Feldman et al. (2015) <doi:10.1145/2783258.2783311> , Friedler et al. (2018) <doi:10.1145/3287560.3287589> and Zafar et al. (2017) <doi:10.1145/3038912.3052660>. The package also offers convenient visualizations to help understand fairness metrics.
Maintained by Nikita Kozodoi. Last updated 2 years ago.
algorithmic-discriminationalgorithmic-fairnessdiscriminationdisparate-impactfairnessfairness-aifairness-mlmachine-learning
0.5 match 32 stars 6.82 score 69 scripts 1 dependent
m-jahn
WeightedTreemaps:Generate and Plot Voronoi or Sunburst Treemaps from Hierarchical Data
Treemaps are a visually appealing graphical representation of numerical data using a space-filling approach. A plane or 'map' is subdivided into smaller areas called cells. The cells in the map are scaled according to an underlying metric, which makes it possible to grasp the hierarchical organization and relative importance of many objects at once. This package contains two different implementations of treemaps, Voronoi treemaps and Sunburst treemaps. The Voronoi treemap function subdivides the plot area into polygonal cells according to the highest hierarchical level, then continues to subdivide those parental cells on the next lower hierarchical level, and so on. The Sunburst treemap is a computationally less demanding treemap that does not require iterative refinement, but simply generates circle sectors that are sized according to predefined weights. The Voronoi tessellation is based on functions from Paul Murrell (2012) <https://www.stat.auckland.ac.nz/~paul/Reports/VoronoiTreemap/voronoiTreeMap.html>.
Maintained by Michael Jahn. Last updated 4 months ago.
r-programmingrcppsunburst-treemapvoronoi-diagramvoronoi-treemapcpp
0.5 match 50 stars 6.73 score 18 scripts
ropensci
landscapetools:Landscape Utility Toolbox
Provides utility functions for some of the less-glamorous tasks involved in landscape analysis. It includes functions to coerce raster data to the common tibble format and vice versa, helps with flexible reclassification of raster data, and provides a function to merge multiple rasters. Furthermore, 'landscapetools' helps landscape scientists to visualize their data by providing optional themes and utility functions to plot single landscapes, rasterstacks, -bricks and lists of rasters.
Maintained by Marco Sciaini. Last updated 2 years ago.
landscapelandscape-ecologyrastervisualizationworkflow
0.5 match 47 stars 6.61 score 191 scripts
abhi-1u
texor:Converting 'LaTeX' 'R Journal' Articles into 'RJ-web-articles'
Articles in the 'R Journal' were first authored in 'LaTeX', which performs admirably for 'PDF' files but is less than ideal for modern online interfaces. The 'texor' package does all the transitional chores and conversions necessary to move to the online versions.
Maintained by Abhishek Ulayil. Last updated 6 hours ago.
0.5 match 7 stars 6.36 score 8 scripts
przechoj
gips:Gaussian Model Invariant by Permutation Symmetry
Find the permutation symmetry group such that the covariance matrix of the given data is approximately invariant under it. Discovering such a permutation decreases the number of observations needed to fit a Gaussian model, which is of great use when it is smaller than the number of variables. Even if that is not the case, the covariance matrix found with 'gips' approximates the actual covariance with less statistical error. The methods implemented in this package are described in Graczyk et al. (2022) <doi:10.1214/22-AOS2174>. Documentation about 'gips' is provided via its website at <https://przechoj.github.io/gips/> and the paper by Chojecki, Morgen, Kołodziejek (2025, <doi:10.18637/jss.v112.i07>).
Maintained by Adam Przemysław Chojecki. Last updated 15 days ago.
covariance-estimationmachine-learningnormal-distribution
0.5 match 6 stars 6.52 score 31 scripts
tconwell
textTools:Functions for Text Cleansing and Text Analysis
A framework for text cleansing and analysis. Conveniently prepare and process large amounts of text for analysis. Includes various metrics for word counts/frequencies that scale efficiently. Quickly analyze large amounts of text data using a text.table (a data.table created with one word (or unit of text analysis) per row, similar to the tidytext format). Offers flexibility to efficiently work with text data stored in vectors as well as text data formatted as a text.table.
Maintained by Timothy Conwell. Last updated 4 years ago.
3.3 match 1.00 score 4 scripts
green-striped-gecko
dartR.popgen:Analysing 'SNP' and 'Silicodart' Data Generated by Genome-Wide Restriction Fragment Analysis
Facilitates the analysis of SNP (single nucleotide polymorphism) and silicodart (presence/absence) data. 'dartR.popgen' provides a suite of functions to analyse such data in a population genetics context. It provides several functions to calculate population genetic metrics and to study population structure. Quite a few functions need additional software to be able to run (gl.run.structure(), gl.blast(), gl.LDNe()). The help pages describe in detail how to download and link the software so that these functions can run it. 'dartR.popgen' is part of the 'dartRverse' suite of packages. Gruber et al. (2018) <doi:10.1111/1755-0998.12745>. Mijangos et al. (2022) <doi:10.1111/2041-210X.13918>.
Maintained by Bernd Gruber. Last updated 9 months ago.
1.6 match 2.00 score 9 scripts
rainers48
tsapp:Time Series, Analysis and Application
Accompanies the book by Rainer Schlittgen and Cristina Sattarhoff (2020) <https://www.degruyter.com/view/title/575978> "Angewandte Zeitreihenanalyse mit R, 4. Auflage". The package contains the time series and functions used therein. It was developed over many years of teaching courses about time series analysis.
Maintained by Rainer Schlittgen. Last updated 3 years ago.
3.2 match 1.00 score 1 script
myles-lewis
glmmSeq:General Linear Mixed Models for Gene-Level Differential Expression
Using mixed effects models to analyse longitudinal gene expression can highlight differences between sample groups over time. The most widely used differential gene expression tools are unable to fit linear mixed effect models, and are less optimal for analysing longitudinal data. This package provides negative binomial and Gaussian mixed effects models to fit gene expression and other biological data across repeated samples. This is particularly useful for investigating changes in RNA-Sequencing gene expression between groups of individuals over time, as described in: Rivellese, F., Surace, A. E., Goldmann, K., Sciacca, E., Cubuk, C., Giorli, G., ... Lewis, M. J., & Pitzalis, C. (2022) Nature medicine <doi:10.1038/s41591-022-01789-0>.
Maintained by Myles Lewis. Last updated 2 months ago.
bioinformaticsdifferential-gene-expressiongene-expressionglmmmixed-modelstranscriptomics
0.5 match 20 stars 6.13 score 45 scripts
cran
fipp:Induced Priors in Bayesian Mixture Models
Computes implicitly induced quantities from prior/hyperparameter specifications of three Mixtures of Finite Mixtures models: Dirichlet Process Mixtures (DPMs; Escobar and West (1995) <doi:10.1080/01621459.1995.10476550>), Static Mixtures of Finite Mixtures (Static MFMs; Miller and Harrison (2018) <doi:10.1080/01621459.2016.1255636>), and Dynamic Mixtures of Finite Mixtures (Dynamic MFMs; Frühwirth-Schnatter, Malsiner-Walli and Grün (2020) <arXiv:2005.09918>). For methodological details, please refer to Greve, Grün, Malsiner-Walli and Frühwirth-Schnatter (2020) <arXiv:2012.12337> as well as the package vignette.
Maintained by Jan Greve. Last updated 4 years ago.
1.1 match 2.70 score
mrc-ide
odin.dust:Compile Odin to Dust
Less painful than it sounds, this package compiles an odin model to use dust, our new stochastic model system. Supports only a subset of odin models (discrete time stochastic models with no interpolation and no delays).
Maintained by Rich FitzJohn. Last updated 6 months ago.
0.5 match 3 stars 5.71 score 122 scripts
cran
psyphy:Functions for Analyzing Psychophysical Data in R
An assortment of functions that could be useful in analyzing data from psychophysical experiments. It includes functions for calculating d' from several different experimental designs, links for m-alternative forced-choice (mafc) data to be used with the binomial family in glm (and possibly other contexts) and self-Start functions for estimating gamma values for CRT screen calibrations.
Maintained by Ken Knoblauch. Last updated 2 years ago.
1.7 match 1.78 score
etiennebacher
flir:Find and Fix Lints in R Code
Lints are code patterns that are not optimal because they are inefficient, forget corner cases, or are less readable. 'flir' provides a small set of functions to detect those lints and automatically fix them. It builds on 'astgrepr', which itself uses the Rust crate 'ast-grep' to parse and navigate R code.
Maintained by Etienne Bacher. Last updated 1 month ago.
0.5 match 51 stars 5.73 score 1 script
cran
dotsViolin:Dot Plots Mimicking Violin Plots
Modifies dot plots to have different sizes of dots mimicking violin plots and identifies modes or peaks for them based on frequency and kernel density estimates (Rosenblatt, 1956) <doi:10.1214/aoms/1177728190> (Parzen, 1962) <doi:10.1214/aoms/1177704472>.
Maintained by Fernando Roa. Last updated 1 year ago.
1.7 match 1.70 score
fredhasselman
invctr:Infix Functions For Vector Operations
Vector operations between grapes: An infix-only package! The 'invctr' functions perform common and less common operations on vectors, data frames, matrices and list objects: - Extracting a value (range), or finding the indices of a value (range). - Trimming, or padding a vector with a value of your choice. - Simple polynomial regression. - Set and membership operations. - General check & replace function for NAs, Inf and other values.
Maintained by Fred Hasselman. Last updated 1 month ago.
0.5 match 5 stars 5.30 score 40 scripts
erikseulean
nonparametric.bayes:Project Code - Nonparametric Bayes
Basic implementation of a Gibbs sampler for a Chinese Restaurant Process along with some visual aids to help understand how the sampling works. This was developed as part of a postgraduate school project for an Advanced Bayesian Nonparametric course. It is inspired by Tamara Broderick's presentation on Nonparametric Bayesian statistics given at the Simons Institute.
Maintained by Erik-Cristian Seulean. Last updated 3 years ago.
1.6 match 1.70 score
baderlab
FLASHMM:Fast and Scalable Single Cell Differential Expression Analysis using Mixed-Effects Models
A fast and scalable linear mixed-effects model (LMM) estimation algorithm for analysis of single-cell differential expression. The algorithm uses summary-level statistics and requires less computer memory to fit the LMM.
Maintained by Changjiang Xu. Last updated 1 day ago.
0.5 match 2 stars 5.00 score 3 scripts
mbojan
netseg:Measures of Network Segregation and Homophily
Segregation is a network-level property such that edges between predefined groups of vertices are relatively less likely. Network homophily is an individual-level tendency to form relations with people who are similar on some attribute (e.g. gender, music taste, social status, etc.). In general homophily leads to segregation, but segregation might arise without homophily. This package implements descriptive indices measuring homophily/segregation. It is a computational companion to Bojanowski & Corten (2014) <doi:10.1016/j.socnet.2014.04.001>.
Maintained by Michal Bojanowski. Last updated 2 years ago.
homophilysegregationsocial-networks
0.5 match 17 stars 5.08 score 14 scripts
aadler
Delaporte:Statistical Functions for the Delaporte Distribution
Provides probability mass, distribution, quantile, random-variate generation, and method-of-moments parameter-estimation functions for the Delaporte distribution with parameterization based on Vose (2008) <isbn:9780470512845>. The Delaporte is a discrete probability distribution which can be considered the convolution of a negative binomial distribution with a Poisson distribution. Alternatively, it can be considered a counting distribution with both Poisson and negative binomial components. It has been studied in actuarial science as a frequency distribution which has more variability than the Poisson, but less than the negative binomial.
Maintained by Avraham Adler. Last updated 10 months ago.
0.5 match 4 stars 5.00 score 14 scripts 2 dependents
pachadotdev
kendallknight:Efficient Implementation of Kendall's Correlation Coefficient Computation
The computational complexity of the implemented algorithm for Kendall's correlation is O(n log(n)), which is faster than the base R implementation with a computational complexity of O(n^2). For small vectors (i.e., less than 100 observations), the time difference is negligible. However, for larger vectors, the speed difference can be substantial and the numerical difference is minimal. The references are Knight (1966) <doi:10.2307/2282833>, Abrevaya (1999) <doi:10.1016/S0165-1765(98)00255-9>, Christensen (2005) <doi:10.1007/BF02736122> and Emara (2024) <https://learningcpp.org/>. This implementation is described in Vargas Sepulveda (2024) <doi:10.48550/arXiv.2408.09618>.
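The O(n log(n)) versus O(n^2) contrast above can be illustrated with a minimal Python sketch. It is not the 'kendallknight' implementation: it assumes at least two observations and no ties in either vector (so it computes tau-a), and the function names are hypothetical. The fast version sorts the pairs by x and counts inversions in y with a merge sort; each inversion is one discordant pair.

```python
from itertools import combinations


def count_inversions(values):
    """Return (sorted copy, number of inversions) via merge sort, O(n log n)."""
    n = len(values)
    if n <= 1:
        return list(values), 0
    mid = n // 2
    left, inv_left = count_inversions(values[:mid])
    right, inv_right = count_inversions(values[mid:])
    merged, inversions = [], inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] is smaller than every remaining element of left,
            # so it forms one inversion with each of them.
            merged.append(right[j])
            inversions += len(left) - i
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inversions


def kendall_tau_fast(x, y):
    """Kendall's tau-a in O(n log n). Assumption: n >= 2 and no ties."""
    n = len(x)
    y_by_x = [yi for _, yi in sorted(zip(x, y))]  # reorder y by increasing x
    _, discordant = count_inversions(y_by_x)
    total_pairs = n * (n - 1) // 2
    # concordant - discordant = total_pairs - 2 * discordant when there are no ties
    return (total_pairs - 2 * discordant) / total_pairs


def kendall_tau_naive(x, y):
    """Reference O(n^2) version that inspects every pair."""
    n = len(x)
    score = 0
    for i, j in combinations(range(n), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        score += (product > 0) - (product < 0)
    return score / (n * (n - 1) // 2)
```

Both functions should agree on tie-free input; handling ties (tau-b) needs extra bookkeeping that this sketch leaves out.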
Maintained by Mauricio Vargas Sepulveda. Last updated 1 month ago.
0.5 match 3 stars 5.02 score 3 scripts
mdelacre
Routliers:Robust Outliers Detection
Detecting outliers using robust methods, i.e. the Median Absolute Deviation (MAD) for univariate outliers; Leys, Ley, Klein, Bernard, & Licata (2013) <doi:10.1016/j.jesp.2013.03.013> and the Mahalanobis-Minimum Covariance Determinant (MMCD) for multivariate outliers; Leys, C., Klein, O., Dominicy, Y. & Ley, C. (2018) <doi:10.1016/j.jesp.2017.09.011>. There is also the better-known but less robust Mahalanobis distance method, only for comparison purposes.
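As a rough illustration of the univariate MAD rule mentioned above, here is a minimal Python sketch. It does not call 'Routliers'; the function name and the cutoff of 3 scaled MADs are assumptions made for the example, while 1.4826 is the usual constant that makes the MAD consistent with the standard deviation under normality.

```python
from statistics import median


def mad_outliers(values, threshold=3.0, consistency=1.4826):
    """Return the values lying more than `threshold` scaled MADs from the median.

    Assumptions for this sketch: `threshold` is a hypothetical default and
    `values` is a non-empty list of numbers.
    """
    center = median(values)
    deviations = [abs(v - center) for v in values]
    mad = consistency * median(deviations)
    if mad == 0:
        # More than half of the values are identical; the rule is undefined here.
        return []
    return [v for v in values if abs(v - center) / mad > threshold]
```

Because the median and the MAD are barely affected by a few extreme values, the cutoff itself is not inflated by the outliers it is meant to find, unlike a rule based on the mean and standard deviation.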
Maintained by Marie Delacre. Last updated 4 years ago.
0.5 match 11 stars 4.86 score 66 scripts
christiangoueguel
ConfidenceEllipse:Computation of 2D and 3D Elliptical Joint Confidence Regions
Computing elliptical joint confidence regions at a specified confidence level. It provides the flexibility to estimate either classical or robust confidence regions, which can be visualized in 2D or 3D plots. The classical approach assumes normality and uses the mean and covariance matrix to define the confidence regions. Alternatively, the robustified version employs estimators like minimum covariance determinant (MCD) and M-estimator, making them less sensitive to outliers and departures from normality. Furthermore, the functions allow users to group the dataset based on categorical variables and estimate separate confidence regions for each group. This capability is particularly useful for exploring potential differences or similarities across subgroups within a dataset. Varmuza and Filzmoser (2009, ISBN:978-1-4200-5947-2). Johnson and Wichern (2007, ISBN:0-13-187715-1). Raymaekers and Rousseeuw (2019) <DOI:10.1080/00401706.2019.1677270>.
Maintained by Christian L. Goueguel. Last updated 11 months ago.
confidence-ellipseconfidence-ellipsoidconfidence-regionmultivariate-distributionoutliers-detectionrobust-statistics
0.5 match 1 star 4.70 score
bioc
LPE:Methods for analyzing microarray data using Local Pooled Error (LPE) method
This LPE library is used to do significance analysis of microarray data with a small number of replicates. It uses resampling-based FDR adjustment, and gives less conservative results than traditional 'BH' or 'BY' procedures. Data accepted is raw data in txt format from MAS4, MAS5 or dChip. Data can also be supplied after normalization. LPE library is primarily used for analyzing data between two conditions. To use it for paired data, see LPEP library. For using LPE in multiple conditions, use HEM library.
Maintained by Nitin Jain. Last updated 5 months ago.
microarraydifferentialexpression
0.5 match 4.58 score 21 scripts 1 dependent
ironholds
primes:Fast Functions for Prime Numbers
Fast functions for dealing with prime numbers, such as testing whether a number is prime and generating a sequence of prime numbers. Additional functions include finding prime factors and Ruth-Aaron pairs, finding next and previous prime numbers in the series, finding or estimating the nth prime, estimating the number of primes less than or equal to an arbitrary number, computing primorials, prime k-tuples (e.g., twin primes), finding the greatest common divisor and smallest (least) common multiple, testing whether two numbers are coprime, and computing Euler's totient function. Most functions are vectorized for speed and convenience.
Maintained by Os Keyes. Last updated 1 year ago.
0.5 match 11 stars 4.50 score 47 scripts
haghish
mlim:Single and Multiple Imputation with Automated Machine Learning
Machine learning algorithms have been used for performing single missing data imputation and, most recently, multiple imputation. However, this is the first attempt at using automated machine learning algorithms for performing both single and multiple imputation. Automated machine learning is a procedure for fine-tuning the model automatically, performing a random search for a model that results in less error, without overfitting the data. The main idea is to allow the model to set its own parameters for imputing each variable separately instead of setting fixed predefined parameters to impute all variables of the dataset. Using automated machine learning, the package fine-tunes an Elastic Net (default) or Gradient Boosting, Random Forest, Deep Learning, Extreme Gradient Boosting, or Stacked Ensemble machine learning model (from one or a combination of other supported algorithms) for imputing the missing observations. This procedure has been implemented for the first time by this package and is expected to outperform other packages for imputing missing data that do not fine-tune their models. The multiple imputation is implemented via bootstrapping without letting duplicated observations harm the cross-validation procedure, which is the way imputed variables are evaluated. Most notably, the package implements an automated procedure for imputing imbalanced data (class rarity problem), which happens when a factor variable has a level that is far more prevalent than the other(s). This is known to result in biased predictions and hence biased imputation of missing data. However, the autobalancing procedure ensures that instead of focusing on maximizing accuracy (classification error) in imputing factor variables, a fairer procedure and imputation method is practiced.
Maintained by E. F. Haghish. Last updated 8 months ago.
automatic-machine-learningautomlclassimbalancedata-scienceelastic-netextreme-gradient-boostinggbmglmgradient-boostinggradient-boosting-machineimputationimputation-algorithmimputation-methodsmachine-learningmissing-datamultipleimputationstack-ensemble
0.5 match 31 stars 4.49 score 7 scripts
boxiang-wang
ARTtransfer:Adaptive and Robust Pipeline for Transfer Learning
Adaptive and Robust Transfer Learning (ART) is a flexible framework for transfer learning that integrates information from auxiliary data sources to improve model performance on primary tasks. It is designed to be robust against negative transfer by including the non-transfer model in the candidate pool, ensuring stable performance even when auxiliary datasets are less informative. See the paper, Wang, Wu, and Ye (2023) <doi:10.1002/sta4.582>.
Maintained by Boxiang Wang. Last updated 2 months ago.
0.5 match 4.40 score 1 script
bioc
zitools:Analysis of zero-inflated count data
zitools allows for zero-inflated count data analysis by either down-weighting excess zeros or replacing an appropriate proportion of excess zeros with NA. By overloading frequently used statistical functions (such as mean, median, standard deviation), plotting functions (such as boxplots or heatmap) or differential abundance tests, it allows a wide range of downstream analyses for zero-inflated data in a less biased manner. This becomes applicable in the context of microbiome analyses, where the data is often overdispersed and zero-inflated, therefore making data analysis extremely challenging.
Maintained by Carlotta Meyring. Last updated 5 months ago.
softwarestatisticalmethodmicrobiome
0.5 match 4.40 score 6 scripts
jimclarkatduke
mastif:Mast Inference and Forecasting
Analyzes production and dispersal of seeds dispersed from trees and recovered in seed traps. Motivated by long-term inventory plots where seed collections are used to infer seed production by each individual plant.
Maintained by James S. Clark. Last updated 1 year ago.
1.1 match 2.00 score
fvafrcu
cleanr:Helps You to Code Cleaner
Check your R code for some of the most common layout flaws. Many tried to teach us how to write less dreadful code, be it implicitly as B. W. Kernighan and D. M. Ritchie (1988) <ISBN:0-13-110362-8> in 'The C Programming Language' did, be it explicitly as R.C. Martin (2008) <ISBN:0-13-235088-2> in 'Clean Code: A Handbook of Agile Software Craftsmanship' did. So we should check our code for files too long or wide, functions with too many lines, too wide lines, too many arguments or too many levels of nesting. Note: This is not a static code analyzer like pylint or the like. Check out <https://cran.r-project.org/package=lintr> instead.
Maintained by Andreas Dominik Cullmann. Last updated 2 years ago.
0.5 match 4.22 score 33 scripts
bioc
odseq:Outlier detection in multiple sequence alignments
Performs outlier detection of sequences in a multiple sequence alignment using bootstrap of predefined distance metrics. Outlier sequences can make downstream analyses unreliable or make the alignments less accurate while they are being constructed. This package implements the OD-seq algorithm proposed by Jehl et al (doi 10.1186/s12859-015-0702-1) for aligned sequences and a variant using string kernels for unaligned sequences.
Maintained by José Jiménez. Last updated 5 months ago.
alignmentmultiplesequencealignment
0.5 match 4.18 score 25 scripts 1 dependent
medewitt
staninside:Facilitating the Use of Stan Within Packages
This package provides helper functions that can be used for integrating Stan code driven by the CmdStanR package. Using CmdStanR and pre-written Stan code can make package installation easy and less prone to fail because it removes the need for Rcpp and RStan packages (and their dependencies). Using CmdStanR will also afford users the opportunity to use the latest developments within CmdStan. However, building these packages requires some work and this package provides tools to assist with that.
Maintained by Michael DeWitt. Last updated 3 years ago.
0.5 match 7 stars 4.02 score 2 scripts 1 dependent
chenlaboratory
hexDensity:Fast Kernel Density Estimation with Hexagonal Grid
Kernel density estimation with hexagonal grid for bivariate data. Hexagonal grid has many beneficial properties like equidistant neighbours and less edge bias, making it better for spatial analyses than the more commonly used rectangular grid. Carr, D. B. et al. (1987) <doi:10.2307/2289444>. Diggle, P. J. (2010) <doi:10.1201/9781420072884>. Hill, B. (2017) <https://blog.bruce-hill.com/meandering-triangles>. Jones, M. C. (1993) <doi:10.1007/BF00147776>.
Maintained by Quoc Hoang Nguyen. Last updated 2 months ago.
0.5 match 3.93 score
drjohanlk
kollaR:Filtering, Visualization and Analysis of Eye Tracking Data
Functions for analysing eye tracking data, including event detection (I-VT, I-DT and two-means clustering), visualizations and area of interest (AOI) based analyses. See separate documentation for each function. The principles underlying I-VT and I-DT filters are described in Salvucci & Goldberg (2000, <doi:10.1145/355017.355028>). Two-means clustering is described in Hessels et al. (2017, <doi:10.3758/s13428-016-0822-1>).
Maintained by Johan Lundin Kleberg. Last updated 1 months ago.
1.5 match 1.30 score
bioc
RareVariantVis:A suite for analysis of rare genomic variants in whole genome sequencing data
The second version of the RareVariantVis package aims to provide comprehensive information about rare variants in your genome data. It annotates, filters and presents genomic variants (especially rare ones) in a global, per-chromosome way. For discovered rare variants, CRISPR guide RNAs are designed, so the user can plan further functional studies. Large structural variants, including copy number variants, are also supported. The package accepts variants directly from a variant caller, for example GATK or Speedseq. The output of the package is lists of variants together with adequate visualization. Visualization of variants is performed in two ways: standard, which outputs png figures, and interactive, which uses the JavaScript d3 package. Interactive visualization allows analysis of trio/family data, for example in search of causative variants in rare Mendelian diseases, in a point-and-click interface. The package includes a homozygous region caller and allows analysis of whole human genomes in less than 30 minutes on a desktop computer. RareVariantVis disclosed novel causes of several rare monogenic disorders, including one with a non-coding causative variant: keratolytic winter erythema.
Maintained by Tomasz Stokowy. Last updated 5 months ago.
genomicvariationsequencingwholegenome
0.5 match 3.90 score 1 scripts
mitra-ep
rSEA:Simultaneous Enrichment Analysis
SEA performs simultaneous feature-set testing for (gen)omics data. It tests the unified null hypothesis and controls the family-wise error rate for all possible pathways. The unified null hypothesis is defined as: "The proportion of true features in the set is less than or equal to a threshold." Family-wise error rate control is provided through use of closed testing with Simes test. There are some practical functions to play around with the pathways of interest.
Maintained by Mitra Ebrahimpoor. Last updated 10 months ago.
0.5 match 3.70 score 10 scripts
eleanorcaves
AcuityView:A Package for Displaying Visual Scenes as They May Appear to an Animal with Lower Acuity
This code provides a simple method for representing a visual scene as it may be seen by an animal with less acute vision. When using (or for more information), please cite the original publication.
Maintained by Eleanor Caves. Last updated 8 years ago.
0.5 match 3 stars 3.48 score 1 scripts
bioc
omada:Machine learning tools for automated transcriptome clustering analysis
Symptomatic heterogeneity in complex diseases reveals differences in molecular states that need to be investigated. However, selecting the numerous parameters of an exploratory clustering analysis in RNA profiling studies requires deep understanding of machine learning and extensive computational experimentation. Tools that assist with such decisions without prior field knowledge are nonexistent and further gene association analyses need to be performed independently. We have developed a suite of tools to automate these processes and make robust unsupervised clustering of transcriptomic data more accessible through automated machine learning based functions. The efficiency of each tool was tested with four datasets characterised by different expression signal strengths. Our toolkit’s decisions reflected the real number of stable partitions in datasets where the subgroups are discernible. Even in datasets with less clear biological distinctions, stable subgroups with different expression profiles and clinical associations were found.
Maintained by Sokratis Kariotis. Last updated 5 months ago.
softwareclusteringrnaseqgeneexpression
0.5 match 3.60 score 5 scripts
han-siyu
TIGERr:Technical Variation Elimination with Ensemble Learning Architecture
The R implementation of TIGER. TIGER integrates the random forest algorithm into an innovative ensemble learning architecture. Benefiting from this advanced architecture, TIGER is resilient to outliers, free from model tuning and less likely to be affected by specific hyperparameters. TIGER supports targeted and untargeted metabolomics data and can perform both intra- and inter-batch technical variation removal. TIGER can also be used for cross-kit adjustment to ensure data obtained from different analytical assays can be effectively combined and compared. Reference: Han S. et al. (2022) <doi:10.1093/bib/bbab535>.
Maintained by Siyu Han. Last updated 6 months ago.
0.5 match 6 stars 3.48 score 1 scripts
cran
vitality:Fitting Routines for the Vitality Family of Mortality Models
Provides fitting routines for four versions of the Vitality family of mortality models.
Maintained by David J. Sharrow. Last updated 7 years ago.
1.8 match 1.00 score
bioc
SICtools:Find SNV/Indel differences between two bam files with near relationship
This package finds SNV/Indel differences between two bam files with near relationship by pairwise comparison through each base position across the genome region of interest. The difference is inferred by Fisher test and Euclidean distance, the input of which is the base count (A,T,G,C) in a given position and read counts for indels that span no less than 2bp on both sides of the indel region.
Maintained by Xiaobin Xing. Last updated 5 months ago.
alignmentsequencingcoveragesequencematchingqualitycontroldataimportsoftwaresnpvariantdetection
0.5 match 3.30 score 1 scripts
jabiru
csvread:Fast Specialized CSV File Loader
Functions for loading large (10M+ lines) CSV and other delimited files, similar to read.csv, but typically faster and using less memory than the standard R loader. While not entirely general, it covers many common use cases when the types of columns in the CSV file are known in advance. In addition, the package provides a class 'int64', which represents 64-bit integers exactly when reading from a file. The latter is useful when working with 64-bit integer identifiers exported from databases. The CSV file loader supports common column types including 'integer', 'double', 'string', and 'int64', leaving further type transformations to the user.
Maintained by Sergei Izrailev. Last updated 5 months ago.
0.5 match 3.32 score 29 scripts
bioc
transcriptR:An Integrative Tool for ChIP- And RNA-Seq Based Primary Transcripts Detection and Quantification
The differences in the RNA types being sequenced have an impact on the resulting sequencing profiles. mRNA-seq data is enriched with reads derived from exons, while GRO-, nucRNA- and chrRNA-seq demonstrate a substantially broader coverage of both exonic and intronic regions. The presence of intronic reads in GRO-seq type of data makes it possible to use it to computationally identify and quantify all de novo continuous regions of transcription distributed across the genome. This type of data, however, is more challenging to interpret and less commonly used than mRNA-seq. One of the challenges for primary transcript detection concerns the simultaneous transcription of closely spaced genes, which needs to be properly divided into individually transcribed units. The R package transcriptR combines RNA-seq data with ChIP-seq data of histone modifications that mark active Transcription Start Sites (TSSs), such as H3K4me3 or H3K9/14Ac, to overcome this challenge. The advantage of this approach over the use of, for example, gene annotations is that this approach is data driven and therefore also able to deal with novel and case-specific events. Furthermore, the integration of ChIP- and RNA-seq data allows the identification of all known and novel active transcription start sites within a given sample.
Maintained by Armen R. Karapetyan. Last updated 5 months ago.
immunooncologytranscriptionsoftwaresequencingrnaseqcoverage
0.5 match 3.30 score 2 scripts
cran
survIDINRI:IDI and NRI for Comparing Competing Risk Prediction Models with Censored Survival Data
Performs inference for a class of measures to compare competing risk prediction models with censored survival data. The class includes the integrated discrimination improvement index (IDI) and category-less net reclassification index (NRI).
Maintained by Hajime Uno. Last updated 3 years ago.
0.5 match 1 stars 3.18 score 1 dependents
ptfonseca
inspector:Validation of Arguments and Objects in User-Defined Functions
Utility functions that implement and automate common sets of validation tasks. These functions are particularly useful to validate inputs, intermediate objects and output values in user-defined functions, resulting in tidier and less verbose functions.
Maintained by Pedro Fonseca. Last updated 4 years ago.
input-validationstatisticsvalidationvalidations
0.5 match 3.00 score 2 scripts
wenlongren
ScoreEB:Score Test Integrated with Empirical Bayes for Association Study
Performs association tests within a linear mixed model framework using a score test integrated with empirical Bayes for genome-wide association studies. First, a score test is conducted for each single nucleotide polymorphism (SNP) under the linear mixed model framework, taking into account genetic relatedness and population structure. Then all the potentially associated SNPs are selected with a less stringent criterion. Finally, empirical Bayes is applied to all the selected SNPs in a multi-locus model to identify the true quantitative trait nucleotide (QTN).
Maintained by Wenlong Ren. Last updated 3 years ago.
0.5 match 2 stars 3.00 score 1 scripts
zhengxiaouvic
rmBayes:Performing Bayesian Inference for Repeated-Measures Designs
A Bayesian credible interval is interpreted with respect to posterior probability, and this interpretation is far more intuitive than that of a frequentist confidence interval. However, standard highest-density intervals can be wide due to between-subjects variability and tend to hide within-subject effects, rendering their relationship with the Bayes factor less clear in within-subject (repeated-measures) designs. This urgent issue can be addressed by using within-subject intervals in within-subject designs, which integrate four methods including the Wei-Nathoo-Masson (2023) <doi:10.3758/s13423-023-02295-1>, the Loftus-Masson (1994) <doi:10.3758/BF03210951>, the Nathoo-Kilshaw-Masson (2018) <doi:10.1016/j.jmp.2018.07.005>, and the Heck (2019) <doi:10.31234/osf.io/whp8t> interval estimates.
Maintained by Zhengxiao Wei. Last updated 1 years ago.
bayesian-inferencecredible-intervalhdirepeated-measuresstanwithin-subjectcpp
0.5 match 2 stars 3.00 score 2 scripts
mbelitz
phenesse:Estimate Phenological Metrics using Presence-Only Data
Generates Weibull-parameterized estimates of phenology for any percentile of a distribution using the framework established in Cooke (1979) <doi:10.1093/biomet/66.2.367>. Extensive testing against other estimators suggests the weib_percentile() function is especially useful in generating more accurate and less biased estimates of onset and offset (Belitz et al. 2020 <doi:10.1111/2041-210X.13448>). Non-parametric bootstrapping can be used to generate confidence intervals around those estimates, although this is computationally expensive. Additionally, this package offers an easy way to perform non-parametric bootstrapping to generate confidence intervals for quantile estimates, mean estimates, or any statistical function of interest.
Maintained by Michael Belitz. Last updated 5 years ago.
0.5 match 2.95 score 18 scripts
ncss-tech
rosettaPTF:R Frontend for Rosetta Pedotransfer Functions
Access Python rosetta-soil pedotransfer functions in an R environment. Rosetta is a neural network-based model for predicting unsaturated soil hydraulic parameters from basic soil characterization data. The model predicts parameters for the van Genuchten unsaturated soil hydraulic properties model, using sand, silt, and clay, bulk density and water content. The codebase is now maintained by Dr. Todd Skaggs and other U.S. Department of Agriculture employees. This R package is intended to provide for use cases that involve many thousands of calls to the pedotransfer function. Less demanding use cases are encouraged to use the web interface or API endpoint. There are additional wrappers of the API endpoints provided by the soilDB R package `ROSETTA()` method.
Maintained by Andrew G. Brown. Last updated 3 months ago.
hydraulichydrologyksatpedotransferpythonreticulaterosettasoil
0.5 match 8 stars 2.90 score 8 scripts
tim-salabim
remote:Empirical Orthogonal Teleconnections in R
Empirical orthogonal teleconnections in R. 'remote' is short for 'R(-based) EMpirical Orthogonal TEleconnections'. It implements a collection of functions to facilitate empirical orthogonal teleconnection analysis. Empirical Orthogonal Teleconnections (EOTs) denote a regression based approach to decompose spatio-temporal fields into a set of independent orthogonal patterns. They are quite similar to Empirical Orthogonal Functions (EOFs) with EOTs producing less abstract results. In contrast to EOFs, which are orthogonal in both space and time, EOT analysis produces patterns that are orthogonal in either space or time.
Maintained by Tim Appelhans. Last updated 9 years ago.
0.5 match 2.79 score 100 scripts
yuande
tatest:Two-Group Ta-Test
The ta-test is a modified two-sample or two-group t-test of Gosset (1908). In small samples with fewer than 15 replicates, the ta-test significantly reduces the type I error rate but has almost the same power as the t-test and hence can greatly enhance reliability or reproducibility of discoveries in biology and medicine. The ta-test can test a single null hypothesis or multiple null hypotheses without needing to correct p-values.
Maintained by Yuan-De Tan. Last updated 3 years ago.
0.5 match 2.70 score 7 scripts
tomasmrkvicka
binspp:Bayesian Inference for Neyman-Scott Point Processes
Bayesian MCMC estimation of parameters for Thomas-type cluster point processes with various inhomogeneities. It allows for inhomogeneity in (i) the distribution of parent points, (ii) the mean number of points in a cluster, (iii) the cluster spread. The package also provides a Bayesian MCMC algorithm for the homogeneous generalized Thomas process. The cluster size is allowed to have a variance that is greater or less than the expected value (cluster sizes are over- or under-dispersed). Details are described in Dvořák, Remeš, Beránek & Mrkvička (2022) <doi:10.48550/arXiv.2205.07946>.
Maintained by Remes Radim. Last updated 2 days ago.
0.5 match 1 stars 2.70 score
andrija-djurovic
monobinShiny:Shiny User Interface for 'monobin' Package
This is an add-on package to the 'monobin' package that simplifies its use. It provides a shiny-based user interface (UI) that is especially handy for less experienced 'R' users as well as for those who intend to perform quick scanning of numeric risk factors when building credit rating models. The additional functions implemented in 'monobinShiny' that do not exist in the 'monobin' package are: descriptive statistics, special case and outlier imputation. The descriptive statistics function is exported and can be used in 'R' sessions independently from the user interface, while the special case and outlier imputation functions are written to be used with the shiny UI.
Maintained by Andrija Djurovic. Last updated 3 years ago.
0.5 match 1 stars 2.70 score 6 scripts
rjauslin
WaveSampling:Weakly Associated Vectors (WAVE) Sampling
Spatial data are generally auto-correlated, meaning that if two units selected are close to each other, then it is likely that they share the same properties. For this reason, when sampling in the population it is often needed that the sample is well spread over space. A new method to draw a sample from a population with spatial coordinates is proposed. This method is called wave (Weakly Associated Vectors) sampling. It uses the less correlated vector to a spatial weights matrix to update the inclusion probabilities vector into a sample. For more details see Raphaël Jauslin and Yves Tillé (2019) <doi:10.1007/s13253-020-00407-1>.
Maintained by Raphaël Jauslin. Last updated 2 months ago.
0.5 match 1 stars 2.70 score 8 scripts
samhaycock
stressor:Algorithms for Testing Models under Stress
Traditional model evaluation metrics fail to capture model performance under less than ideal conditions. This package employs techniques to evaluate models "under stress". This includes testing models' extrapolation ability, or testing accuracy on specific sub-samples of the overall model space. Details describing stress-testing methods in this package are provided in Haycock (2023) <doi:10.26076/2am5-9f67>. The other primary contribution of this package is providing R users with access to the 'Python' library 'PyCaret' <https://pycaret.org/> for quick and easy access to auto-tuned machine learning models.
Maintained by Sam Haycock. Last updated 11 months ago.
0.5 match 2.70 score 6 scripts
aalfons
robmedExtra:Extra Functionality for (Robust) Mediation Analysis
This companion package extends the package 'robmed' (Alfons, Ates & Groenen, 2022b; <doi:10.18637/jss.v103.i13>) in various ways. Most notably, it provides a graphical user interface for the robust bootstrap test ROBMED (Alfons, Ates & Groenen, 2022a; <doi:10.1177/1094428121999096>) to make the method more accessible to less proficient 'R' users, as well as functions to export the results as a table in a 'Microsoft Word' or 'Microsoft Powerpoint' document, or as a 'LaTeX' table. Furthermore, the package contains a 'shiny' app to compare various bootstrap procedures for mediation analysis on simulated data.
Maintained by Andreas Alfons. Last updated 5 months ago.
0.5 match 1 stars 2.70 score
umich-biostatistics
MetaIntegration:Ensemble Meta-Inference Framework
An ensemble meta-inference framework to integrate multiple regression models into a current study. Gu, T., Taylor, J.M.G. and Mukherjee, B. (2021) <arXiv:2010.09971>. A meta-analysis framework along with two weighted estimators as the ensemble of empirical Bayes estimators, which combines the estimates from the different external models. The proposed framework is flexible and robust in the ways that (i) it is capable of incorporating external models that use a slightly different set of covariates; (ii) it is able to identify the most relevant external information and diminish the influence of information that is less compatible with the internal data; and (iii) it nicely balances the bias-variance trade-off while preserving the most efficiency gain. The proposed estimators are more efficient than the naive analysis of the internal data and other naive combinations of external estimators.
Maintained by Michael Kleinsasser. Last updated 4 years ago.
0.5 match 1 stars 2.70 score 1 scripts