R-universe search: span

jimclarkatduke

gjam:Generalized Joint Attribute Modeling

Analyzes joint attribute data (e.g., species abundance) that are combinations of continuous and discrete data with Gibbs sampling. Full model and computation details are described in Clark et al. (2018) <doi:10.1002/ecm.1241>.

Maintained by James S. Clark. Last updated 3 years ago.

openblas cpp openmp

87.5 match 3.18 score 150 scripts

grunwaldlab

poppr:Genetic Analysis of Populations with Mixed Reproduction

Population genetic analyses for hierarchical analysis of partially clonal populations built upon the architecture of the 'adegenet' package. Originally described in Kamvar, Tabima, and Grünwald (2014) <doi:10.7717/peerj.281> with version 2.0 described in Kamvar, Brooks, and Grünwald (2015) <doi:10.3389/fgene.2015.00208>.

Maintained by Zhian N. Kamvar. Last updated 11 months ago.

clonality genetic-analysis genetic-distances minimum-spanning-networks multilocus-genotypes multilocus-lineages population-genetics populations openmp

18.6 match 69 stars 10.84 score 672 scripts

annechao

iNEXT.3D:Interpolation and Extrapolation for Three Dimensions of Biodiversity

Biodiversity is a multifaceted concept covering different levels of organization from genes to ecosystems. 'iNEXT.3D' extends 'iNEXT' to include three dimensions (3D) of biodiversity, i.e., taxonomic diversity (TD), phylogenetic diversity (PD) and functional diversity (FD). This package provides functions to compute standardized 3D diversity estimates with a common sample size or sample coverage. A unified framework based on Hill numbers and their generalizations (Hill-Chao numbers) is used to quantify 3D. All 3D estimates are in the same units of species/lineage equivalents and can be meaningfully compared. The package features size- and coverage-based rarefaction and extrapolation sampling curves to facilitate rigorous comparison of 3D diversity across individual assemblages. Asymptotic 3D diversity estimates are also provided. See Chao et al. (2021) <doi:10.1111/2041-210X.13682> for more details.

Maintained by Anne Chao. Last updated 1 month ago.

cpp

28.7 match 6.59 score 26 scripts 2 dependents

thomasp85

ggraph:An Implementation of Grammar of Graphics for Graphs and Networks

The grammar of graphics as implemented in ggplot2 is a poor fit for graph and network visualizations due to its reliance on tabular data input. ggraph is an extension of the ggplot2 API tailored to graph visualizations and provides the same flexible approach to building up plots layer by layer.

Maintained by Thomas Lin Pedersen. Last updated 1 year ago.

ggplot-extension ggplot2 graph-visualization network-visualization visualization cpp

7.6 match 1.1k stars 16.96 score 9.2k scripts 111 dependents

kurthornik

NLP:Natural Language Processing Infrastructure

Basic classes and methods for Natural Language Processing.

Maintained by Kurt Hornik. Last updated 4 months ago.

13.4 match 6 stars 9.42 score 1.0k scripts 127 dependents

jakobbossek

mcMST:A Toolbox for the Multi-Criteria Minimum Spanning Tree Problem

Algorithms to approximate the Pareto-front of multi-criteria minimum spanning tree problems.

Maintained by Jakob Bossek. Last updated 2 years ago.

evolutionary-algorithms mcmst minimum-spanning-trees multi-objective-optimization spanningtrees

26.0 match 4 stars 4.73 score 27 scripts

igraph

igraph:Network Analysis and Visualization

Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.

Maintained by Kirill Müller. Last updated 4 hours ago.

complex-networks graph-algorithms graph-theory mathematics network-analysis network-graph fortran libxml2 glpk openblas cpp

5.8 match 584 stars 21.14 score 31k scripts 1.9k dependents

dbosak01

reporter:Creates Statistical Reports

Contains functions to create regulatory-style statistical reports. Originally designed to create tables, listings, and figures for the pharmaceutical, biotechnology, and medical device industries, these reports are generalized enough that they could be used in any industry. Generates text, rich-text, PDF, HTML, and Microsoft Word file formats. The package specializes in printing wide and long tables with automatic page wrapping and splitting. Reports can be produced with a minimum of function calls, and without relying on other table packages. The package supports titles, footnotes, page header, page footers, spanning headers, page by variables, and automatic page numbering.

Maintained by David Bosak. Last updated 1 year ago.

report reporting reports rptr

12.1 match 16 stars 9.35 score 173 scripts 4 dependents

r-lib

clock:Date-Time Types and Tools

Provides a comprehensive library for date-time manipulations using a new family of orthogonal date-time classes (durations, time points, zoned-times, and calendars) that partition responsibilities so that the complexities of time zones are only considered when they are really needed. Capabilities include: date-time parsing, formatting, arithmetic, extraction and updating of components, and rounding.

Maintained by Davis Vaughan. Last updated 15 days ago.

cpp

7.7 match 106 stars 14.53 score 296 scripts 407 dependents

jimclarkatduke

mastif:Mast Inference and Forecasting

Analyzes production and dispersal of seeds dispersed from trees and recovered in seed traps. Motivated by long-term inventory plots where seed collections are used to infer seed production by each individual plant.

Maintained by James S. Clark. Last updated 1 year ago.

openblas cpp

55.3 match 2.00 score

edwinth

padr:Quickly Get Datetime Data Ready for Analysis

Transforms datetime data into a format ready for analysis. It offers two core functionalities: aggregating data to a higher level interval (thicken) and imputing records where observations were absent (pad).

Maintained by Edwin Thoen. Last updated 4 months ago.

cpp

9.1 match 132 stars 11.55 score 428 scripts 20 dependents

trinker

qdap:Bridging the Gap Between Qualitative Data and Quantitative Analysis

Automates many of the tasks associated with quantitative discourse analysis of transcripts containing discourse including frequency counts of sentence types, words, sentences, turns of talk, syllables and other assorted analysis tasks. The package provides parsing tools for preparing transcript data. Many functions enable the user to aggregate data by any number of grouping variables, providing analysis and seamless integration with other R packages that undertake higher level analysis and visualization of text. This affords the user a more efficient and targeted analysis. 'qdap' is designed for transcript analysis, however, many functions are applicable to other areas of Text Mining/ Natural Language Processing.

Maintained by Tyler Rinker. Last updated 5 years ago.

qdap quantitative-discourse-analysis text-analysis text-mining text-plotting openjdk

10.8 match 176 stars 9.61 score 1.3k scripts 3 dependents

r-spatial

spdep:Spatial Dependence: Weighting Schemes, Statistics

A collection of functions to create spatial weights matrix objects from polygon 'contiguities', from point patterns by distance and tessellations, for summarizing these objects, and for permitting their use in spatial data analysis, including regional aggregation by minimum spanning tree; a collection of tests for spatial 'autocorrelation', including global 'Morans I' and 'Gearys C' proposed by 'Cliff' and 'Ord' (1973, ISBN: 0850860369) and (1981, ISBN: 0850860814), 'Hubert/Mantel' general cross product statistic, Empirical Bayes estimates and 'Assunção/Reis' (1999) <doi:10.1002/(SICI)1097-0258(19990830)18:16%3C2147::AID-SIM179%3E3.0.CO;2-I> Index, 'Getis/Ord' G ('Getis' and 'Ord' 1992) <doi:10.1111/j.1538-4632.1992.tb00261.x> and multicoloured join count statistics, 'APLE' ('Li et al.') <doi:10.1111/j.1538-4632.2007.00708.x>, local 'Moran's I', 'Gearys C' ('Anselin' 1995) <doi:10.1111/j.1538-4632.1995.tb00338.x> and 'Getis/Ord' G ('Ord' and 'Getis' 1995) <doi:10.1111/j.1538-4632.1995.tb00912.x>, 'saddlepoint' approximations ('Tiefelsdorf' 2002) <doi:10.1111/j.1538-4632.2002.tb01084.x> and exact tests for global and local 'Moran's I' ('Bivand et al.' 2009) <doi:10.1016/j.csda.2008.07.021> and 'LOSH' local indicators of spatial heteroscedasticity ('Ord' and 'Getis') <doi:10.1007/s00168-011-0492-y>. The implementation of most of these measures is described in 'Bivand' and 'Wong' (2018) <doi:10.1007/s11749-018-0599-x>, with further extensions in 'Bivand' (2022) <doi:10.1111/gean.12319>. 'Lagrange' multiplier tests for spatial dependence in linear models are provided ('Anselin et al.' 1996) <doi:10.1016/0166-0462(95)02111-6>, as are 'Rao' score tests for hypothesised spatial 'Durbin' models based on linear models ('Koley' and 'Bera' 2023) <doi:10.1080/17421772.2023.2256810>. A local indicators for categorical data (LICD) implementation based on 'Carrer et al.' (2021) <doi:10.1016/j.jas.2020.105306> and 'Bivand et al.' (2017) <doi:10.1016/j.spasta.2017.03.003> was added in 1.3-7. From 'spdep' and 'spatialreg' versions >= 1.2-1, the model fitting functions previously present in this package are defunct in 'spdep' and may be found in 'spatialreg'.

Maintained by Roger Bivand. Last updated 1 month ago.

spatial-autocorrelation spatial-dependence spatial-weights

6.1 match 131 stars 16.59 score 6.0k scripts 106 dependents

njtierney

naniar:Data Structures, Summaries, and Visualisations for Missing Data

Missing values are ubiquitous in data and need to be explored and handled in the initial stages of analysis. 'naniar' provides data structures and functions that facilitate the plotting of missing values and examination of imputations. This allows missing data dependencies to be explored with minimal deviation from the common work patterns of 'ggplot2' and tidy data. The work is fully discussed at Tierney & Cook (2023) <doi:10.18637/jss.v105.i07>.

Maintained by Nicholas Tierney. Last updated 19 days ago.

data-visualisation ggplot2 missing-data missingness tidy-data

6.5 match 657 stars 15.63 score 5.1k scripts 9 dependents

italo-granato

snpReady:Preparing Genotypic Datasets in Order to Run Genomic Analysis

Three functions to clean, summarize and prepare genomic datasets for Genome Selection and Genome Association analysis, and to estimate population genetic parameters.

Maintained by Italo Granato. Last updated 5 years ago.

16.2 match 4 stars 5.90 score 33 scripts

tidyverse

lubridate:Make Dealing with Dates a Little Easier

Functions to work with date-times and time-spans: fast and user friendly parsing of date-time data, extraction and updating of components of a date-time (years, months, days, hours, minutes, and seconds), algebraic manipulation on date-time and time-span objects. The 'lubridate' package has a consistent and memorable syntax that makes working with dates easy and fun.

Maintained by Vitalie Spinu. Last updated 4 months ago.

date date-time

4.4 match 757 stars 20.95 score 135k scripts 1.9k dependents

bioc

xcms:LC-MS and GC-MS Data Analysis

Framework for processing and visualization of chromatographically separated and single-spectra mass spectral data. Imports from AIA/ANDI NetCDF, mzXML, mzData and mzML files. Preprocesses data for high-throughput, untargeted analyte profiling.

Maintained by Steffen Neumann. Last updated 18 days ago.

immunooncology massspectrometry metabolomics bioconductor feature-detection mass-spectrometry peak-detection cpp

6.3 match 196 stars 14.31 score 984 scripts 11 dependents

rspatial

geosphere:Spherical Trigonometry

Spherical trigonometry for geographic applications. That is, compute distances and related measures for angular (longitude/latitude) locations.

Maintained by Robert J. Hijmans. Last updated 6 months ago.

cpp

5.6 match 36 stars 13.79 score 5.7k scripts 116 dependents

bioc

monocle:Clustering, differential expression, and trajectory analysis for single-cell RNA-Seq

Monocle performs differential expression and time-series analysis for single-cell expression experiments. It orders individual cells according to progress through a biological process, without knowing ahead of time which genes define progress through that process. Monocle also performs differential expression analysis, clustering, visualization, and other useful tasks on single cell expression data. It is designed to work with RNA-Seq and qPCR data, but could be used with other types as well.

Maintained by Cole Trapnell. Last updated 5 months ago.

immunooncology sequencing rnaseq geneexpression differentialexpression infrastructure dataimport datarepresentation visualization clustering multiplecomparison qualitycontrol cpp

8.7 match 8.71 score 1.6k scripts 2 dependents

span-18

spStack:Bayesian Geostatistics Using Predictive Stacking

Fits Bayesian hierarchical spatial process models for point-referenced Gaussian, Poisson, binomial, and binary data using stacking of predictive densities. It involves sampling from analytically available posterior distributions conditional upon some candidate values of the spatial process parameters and subsequently assimilating inference from these individual posterior distributions using Bayesian predictive stacking. Our algorithm is highly parallelizable and hence much faster than traditional Markov chain Monte Carlo algorithms while delivering competitive predictive performance. See Zhang, Tang, and Banerjee (2024) <doi:10.48550/arXiv.2304.12414> and Pan, Zhang, Bradley, and Banerjee (2024) <doi:10.48550/arXiv.2406.04655> for details.

Maintained by Soumyakanti Pan. Last updated 4 days ago.

openblas cpp

15.0 match 4.98 score 6 scripts

r-forge

CHNOSZ:Thermodynamic Calculations and Diagrams for Geochemistry

An integrated set of tools for thermodynamic calculations in aqueous geochemistry and geobiochemistry. Functions are provided for writing balanced reactions to form species from user-selected basis species and for calculating the standard molal properties of species and reactions, including the standard Gibbs energy and equilibrium constant. Calculations of the non-equilibrium chemical affinity and equilibrium chemical activity of species can be portrayed on diagrams as a function of temperature, pressure, or activity of basis species; in two dimensions, this gives a maximum affinity or predominance diagram. The diagrams have formatted chemical formulas and axis labels, and water stability limits can be added to Eh-pH, oxygen fugacity-temperature, and other diagrams with a redox variable. The package has been developed to handle common calculations in aqueous geochemistry, such as solubility due to complexation of metal ions, mineral buffers of redox or pH, and changing the basis species across a diagram ("mosaic diagrams"). CHNOSZ also implements a group additivity algorithm for the standard thermodynamic properties of proteins.

Maintained by Jeffrey Dick. Last updated 24 days ago.

fortran

7.5 match 9.46 score 238 scripts 4 dependents

rstudio

shiny:Web Application Framework for R

Makes it incredibly easy to build interactive web applications with R. Automatic "reactive" binding between inputs and outputs and extensive prebuilt widgets make it possible to build beautiful, responsive, and powerful applications with minimal effort.

Maintained by Winston Chang. Last updated 6 days ago.

reactive rstudio shiny web-app web-development

3.3 match 5.5k stars 21.31 score 108k scripts 1.8k dependents

bxc147

Epi:Statistical Analysis in Epidemiology

Functions for demographic and epidemiological analysis in the Lexis diagram, i.e. register and cohort follow-up data. In particular representation, manipulation, rate estimation and simulation for multistate data - the Lexis suite of functions, which includes interfaces to 'mstate', 'etm' and 'cmprsk' packages. Contains functions for Age-Period-Cohort and Lee-Carter modeling and a function for interval censored data and some useful functions for tabulation and plotting, as well as a number of epidemiological data sets.

Maintained by Bendix Carstensen. Last updated 3 months ago.

6.9 match 4 stars 9.64 score 708 scripts 11 dependents

cmlmagneville

mFD:Compute and Illustrate the Multiple Facets of Functional Diversity

Computes functional trait-based distances between pairs of species gathered in assemblages, from which several functional spaces can be built. The package computes functional diversity indices assessing the distribution of species (and of their dominance) in a given functional space for each assemblage and the overlap between assemblages in a given functional space; see Chao et al. (2018) <doi:10.1002/ecm.1343>, Maire et al. (2015) <doi:10.1111/geb.12299>, Mouillot et al. (2013) <doi:10.1016/j.tree.2012.10.004>, Mouillot et al. (2014) <doi:10.1073/pnas.1317625111>, Ricotta and Szeidl (2009) <doi:10.1016/j.tpb.2009.10.001>. Graphical outputs are included. Visit the 'mFD' website for more information, documentation and examples.

Maintained by Camille Magneville. Last updated 3 months ago.

9.3 match 26 stars 7.22 score 61 scripts

annechao

iNEXT.beta3D:Interpolation and Extrapolation with Beta Diversity for Three Dimensions of Biodiversity

As a sequel to 'iNEXT', the 'iNEXT.beta3D' package provides functions to compute standardized taxonomic, phylogenetic, and functional diversity (3D) estimates with a common sample size (for alpha and gamma diversity) or sample coverage (for alpha, beta, gamma diversity as well as dissimilarity or turnover indices). Hill numbers and their generalizations are used to quantify 3D and to make multiplicative decomposition (gamma = alpha x beta). The package also features size- and coverage-based rarefaction and extrapolation sampling curves to facilitate rigorous comparison of beta diversity across datasets. See Chao et al. (2023) <doi:10.1002/ecm.1588> for more details.

Maintained by Anne Chao. Last updated 5 months ago.

12.7 match 5.18 score 6 scripts

jfrench

smerc:Statistical Methods for Regional Counts

Implements statistical methods for analyzing the counts of areal data, with a focus on the detection of spatial clusters and clustering. The package has a heavy emphasis on spatial scan methods, which were first introduced by Kulldorff and Nagarwalla (1995) <doi:10.1002/sim.4780140809> and Kulldorff (1997) <doi:10.1080/03610929708831995>.

Maintained by Joshua French. Last updated 5 months ago.

cpp

10.5 match 3 stars 6.08 score 45 scripts 3 dependents

klmr

box:Write Reusable, Composable and Modular R Code

A modern module system for R. Organise code into hierarchical, composable, reusable modules, and use it effortlessly across projects via a flexible, declarative dependency loading syntax.

Maintained by Konrad Rudolph. Last updated 28 days ago.

modules packages

5.0 match 888 stars 12.39 score 47 scripts 4 dependents

dpmcsuss

iGraphMatch:Tools for Graph Matching

Versatile tools and data for graph matching analysis with various forms of prior information that supports working with 'igraph' objects, matrix objects, or lists of either.

Maintained by Daniel Sussman. Last updated 11 months ago.

graph-algorithms graph-matching cpp

11.0 match 9 stars 5.65 score 9 scripts

microsoft

wpa:Tools for Analysing and Visualising Viva Insights Data

Opinionated functions that enable easier and faster analysis of Viva Insights data. There are three main types of functions in 'wpa': (1) Standard functions create a 'ggplot' visual or a summary table based on a specific Viva Insights metric; (2) Report Generation functions generate HTML reports on a specific analysis area, e.g. Collaboration; (3) Other miscellaneous functions cover more specific applications (e.g. Subject Line text mining) of Viva Insights data. This package adheres to 'tidyverse' principles and works well with the pipe syntax. 'wpa' is built with beginner-to-intermediate R users in mind, and is optimised for simplicity.

Maintained by Martin Chan. Last updated 4 months ago.

workplace-analytics

8.8 match 30 stars 6.68 score 39 scripts 1 dependent

leef-uzh

LEEF:Data Package Containing Only Data and Data Information

Setup package for the LEEF pipeline which loads / installs all necessary packages and functions to run the pipeline.

Maintained by Rainer M. Krug. Last updated 3 years ago.

data-analysis data-processing leef

19.8 match 2.95 score

rstudio

htmltools:Tools for HTML

Tools for HTML generation and output.

Maintained by Carson Sievert. Last updated 11 months ago.

3.3 match 218 stars 17.61 score 10k scripts 4.5k dependents

kaihsianghu

iNEXT.4steps:Four-Step Biodiversity Analysis Based on 'iNEXT'

Expands 'iNEXT' to include the estimation of sample completeness and evenness. The package provides simple functions to perform the following four-step biodiversity analysis: STEP 1: Assessment of sample completeness profiles. STEP 2a: Analysis of size-based rarefaction and extrapolation sampling curves to determine whether the asymptotic diversity can be accurately estimated. STEP 2b: Comparison of the observed and the estimated asymptotic diversity profiles. STEP 3: Analysis of non-asymptotic coverage-based rarefaction and extrapolation sampling curves. STEP 4: Assessment of evenness profiles. The analyses in STEPs 2a, 2b and STEP 3 are mainly based on the previous 'iNEXT' package. Refer to the 'iNEXT' package for details. This package is mainly focusing on the computation for STEPs 1 and 4. See Chao et al. (2020) <doi:10.1111/1440-1703.12102> for statistical background.

Maintained by Anne Chao. Last updated 10 months ago.

9.1 match 4 stars 6.00 score 8 scripts

pmartr

pmartR:Panomics Marketplace - Quality Control and Statistical Analysis for Panomics Data

Provides functionality for quality control processing and statistical analysis of mass spectrometry (MS) omics data, in particular proteomic (either at the peptide or the protein level), lipidomic, and metabolomic data, as well as RNA-seq based count data and nuclear magnetic resonance (NMR) data. This includes data transformation, specification of groups that are to be compared against each other, filtering of features and/or samples, data normalization, data summarization (correlation, PCA), and statistical comparisons between defined groups. Implements methods described in: Webb-Robertson et al. (2014) <doi:10.1074/mcp.M113.030932>. Webb-Robertson et al. (2011) <doi:10.1002/pmic.201100078>. Matzke et al. (2011) <doi:10.1093/bioinformatics/btr479>. Matzke et al. (2013) <doi:10.1002/pmic.201200269>. Polpitiya et al. (2008) <doi:10.1093/bioinformatics/btn217>. Webb-Robertson et al. (2010) <doi:10.1021/pr1005247>.

Maintained by Lisa Bramer. Last updated 18 days ago.

data-summarization lipids mass-spectrometry metabolites metabolomics-data peptides proteins rna-seq-analysis openblas cpp

7.0 match 40 stars 7.69 score 144 scripts

jonclayden

RcppArray:'Rcpp' Meets 'C++' Arrays

Interoperability between 'Rcpp' and the 'C++11' array and tuple types. Linking to this package allows fixed-length 'std::array' objects to be converted to and from equivalent R vectors, and 'std::tuple' objects converted to lists, via the as() and wrap() functions. There is also experimental support for 'std::span' from 'C++20'.

Maintained by Jon Clayden. Last updated 3 months ago.

array cpp11 cpp20 rcpp span cpp

11.5 match 5 stars 3.88 score 5 scripts

graemetlloyd

Claddis:Measuring Morphological Diversity and Evolutionary Tempo

Measures morphological diversity from discrete character data and estimates evolutionary tempo on phylogenetic trees. Imports morphological data from #NEXUS (Maddison et al. (1997) <doi:10.1093/sysbio/46.4.590>) format with read_nexus_matrix(), and writes to both #NEXUS and TNT format (Goloboff et al. (2008) <doi:10.1111/j.1096-0031.2008.00217.x>). Main functions are test_rates(), which implements AIC and likelihood ratio tests for discrete character rates introduced across Lloyd et al. (2012) <doi:10.1111/j.1558-5646.2011.01460.x>, Brusatte et al. (2014) <doi:10.1016/j.cub.2014.08.034>, Close et al. (2015) <doi:10.1016/j.cub.2015.06.047>, and Lloyd (2016) <doi:10.1111/bij.12746>, and calculate_morphological_distances(), which implements multiple discrete character distance metrics from Gower (1971) <doi:10.2307/2528823>, Wills (1998) <doi:10.1006/bijl.1998.0255>, Lloyd (2016) <doi:10.1111/bij.12746>, and Hopkins and St John (2018) <doi:10.1098/rspb.2018.1784>. This also includes the GED correction from Lehmann et al. (2019) <doi:10.1111/pala.12430>. Multiple functions implement morphospace plots: plot_chronophylomorphospace() implements Sakamoto and Ruta (2012) <doi:10.1371/journal.pone.0039752>, plot_morphospace() implements Wills et al. (1994) <doi:10.1017/S009483730001263X>, plot_changes_on_tree() implements Wang and Lloyd (2016) <doi:10.1098/rspb.2016.0214>, and plot_morphospace_stack() implements Foote (1993) <doi:10.1017/S0094837300015864>. Other functions include safe_taxonomic_reduction(), which implements Wilkinson (1995) <doi:10.1093/sysbio/44.4.501>, map_dollo_changes() implements the Dollo stochastic character mapping of Tarver et al. (2018) <doi:10.1093/gbe/evy096>, and estimate_ancestral_states() implements the ancestral state options of Lloyd (2018) <doi:10.1111/pala.12380>. calculate_tree_length() and reconstruct_ancestral_states() implements the generalised algorithms from Swofford and Maddison (1992; no doi).

Maintained by Graeme T. Lloyd. Last updated 7 days ago.

5.4 match 13 stars 7.86 score 77 scripts 2 dependents

harish11999

transformerForecasting:Transformer Deep Learning Model for Time Series Forecasting

Time series forecasting faces challenges due to the non-stationarity, nonlinearity, and chaotic nature of the data. Traditional deep learning models like Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) process data sequentially but are inefficient for long sequences. To overcome the limitations of these models, we proposed a transformer-based deep learning architecture utilizing an attention mechanism for parallel processing, enhancing prediction accuracy and efficiency. This paper presents user-friendly code for the implementation of the proposed transformer-based deep learning architecture utilizing an attention mechanism for parallel processing. References: Nayak et al. (2024) <doi:10.1007/s40808-023-01944-7> and Nayak et al. (2024) <doi:10.1016/j.simpa.2024.100716>.

Maintained by G H Harish Nayak. Last updated 25 days ago.

21.2 match 1 star 2.00 score

bioc

plyranges:A fluent interface for manipulating GenomicRanges

A dplyr-like interface for interacting with the common Bioconductor classes Ranges and GenomicRanges. By providing a grammatical and consistent way of manipulating these classes, their accessibility for new Bioconductor users is hopefully increased.

Maintained by Michael Love. Last updated 13 days ago.

infrastructure datarepresentation workflowstep coverage bioconductor data-analysis dplyr genomic-ranges genomics tidy-data

3.3 match 144 stars 12.66 score 1.9k scripts 20 dependents

jakobbossek

grapherator:A Modular Multi-Step Graph Generator

Set of functions for step-wise generation of (weighted) graphs. Aimed for research in the field of single- and multi-objective combinatorial optimization. Graphs are generated adding nodes, edges and weights. Each step may be repeated multiple times with different predefined and custom generators resulting in high flexibility regarding the graph topology and structure of edge weights.

Maintained by Jakob Bossek. Last updated 4 years ago.

combinatorial-optimization graph-generator minimum-spanning-tree multi-objective-optimization optimization

6.7 match 9 stars 6.04 score 27 scripts 1 dependent

davisvaughan

ivs:Interval Vectors

Provides a library for generic interval manipulations using a new interval vector class. Capabilities include: locating various kinds of relationships between two interval vectors, merging overlaps within a single interval vector, splitting an interval vector on its overlapping endpoints, and applying set theoretical operations on interval vectors. Many of the operations in this package were inspired by James Allen's interval algebra, Allen (1983) <doi:10.1145/182.358434>.

Maintained by Davis Vaughan. Last updated 2 years ago.

5.6 match 48 stars 7.05 score 39 scripts 2 dependents

vegandevs

vegan:Community Ecology Package

Ordination methods, diversity analysis and other functions for community and vegetation ecologists.

Maintained by Jari Oksanen. Last updated 1 month ago.

ecological-modelling ecology ordination fortran openblas

2.0 match 476 stars 19.40 score 15k scripts 445 dependents

bioc

TRONCO:TRONCO, an R package for TRanslational ONCOlogy

The TRONCO (TRanslational ONCOlogy) R package collects algorithms to infer progression models via the approach of Suppes-Bayes Causal Network, both from an ensemble of tumors (cross-sectional samples) and within an individual patient (multi-region or single-cell samples). The package provides parallel implementation of algorithms that process binary matrices where each row represents a tumor sample and each column a single-nucleotide or a structural variant driving the progression; a 0/1 value models the absence/presence of that alteration in the sample. The tool can import data from plain, MAF or GISTIC format files, and can fetch it from the cBioPortal for cancer genomics. Functions for data manipulation and visualization are provided, as well as functions to import/export such data to other bioinformatics tools for, e.g, clustering or detection of mutually exclusive alterations. Inferred models can be visualized and tested for their confidence via bootstrap and cross-validation. TRONCO is used for the implementation of the Pipeline for Cancer Inference (PICNIC).

Maintained by Luca De Sano. Last updated 5 days ago.

biomedicalinformatics bayesian graphandnetwork somaticmutation networkinference network clustering dataimport singlecell immunooncology algorithms cancer-inference tumors

4.6 match 30 stars 8.35 score 38 scripts

allanvc

emstreeR:Tools for Fast Computing and Visualizing Euclidean Minimum Spanning Trees

Quickly and easily computes a Euclidean Minimum Spanning Tree (EMST) from data, relying on the R API for 'mlpack' - the C++ Machine Learning Library (Curtin et al., 2013). 'emstreeR' uses the Dual-Tree Boruvka (March, Ram, Gray, 2010, <doi:10.1145/1835804.1835882>), which is theoretically and empirically the fastest algorithm for computing an EMST. This package also provides functions and an S3 method for readily visualizing Minimum Spanning Trees (MST) using either the style of the 'base', 'scatterplot3d', or 'ggplot2' libraries; and functions to export the MST output to shapefiles.

Maintained by Allan Quadros. Last updated 1 year ago.

9.0 match 7 stars 4.23 score 16 scripts 1 dependent

adeverse

adespatial:Multivariate Multiscale Spatial Analysis

Tools for the multiscale spatial analysis of multivariate data. Several methods are based on the use of a spatial weighting matrix and its eigenvector decomposition (Moran's Eigenvectors Maps, MEM). Several approaches are described in the review Dray et al (2012) <doi:10.1890/11-1183.1>.

Maintained by Aurélie Siberchicot. Last updated 11 hours ago.

fortran openblas

3.4 match 36 stars 11.25 score 398 scripts 2 dependents

pbs-software

PBSmapping:Mapping Fisheries Data and Spatial Analysis Tools

This software has evolved from fisheries research conducted at the Pacific Biological Station (PBS) in 'Nanaimo', British Columbia, Canada. It extends the R language to include two-dimensional plotting features similar to those commonly available in a Geographic Information System (GIS). Embedded C code speeds algorithms from computational geometry, such as finding polygons that contain specified point events or converting between longitude-latitude and Universal Transverse Mercator (UTM) coordinates. Additionally, we include 'C++' code developed by Angus Johnson for the 'Clipper' library, data for a global shoreline, and other data sets in the public domain. Under the user's R library directory '.libPaths()', specifically in './PBSmapping/doc', a complete user's guide is offered and should be consulted to use package functions effectively.

Maintained by Rowan Haigh. Last updated 6 months ago.

cpp

3.6 match 11 stars 10.16 score 652 scripts 9 dependents

atusy

ftExtra:Extensions for 'Flextable'

Build display tables easily by extending the functionality of the 'flextable' package. Features include spanning headers, row grouping, markdown parsing, and so on.

Maintained by Atsushi Yasumoto. Last updated 2 months ago.

3.9 match 65 stars 9.19 score 194 scripts

eglenn

acs:Download, Manipulate, and Present American Community Survey and Decennial Data from the US Census

Provides a general toolkit for downloading, managing, analyzing, and presenting data from the U.S. Census (<https://www.census.gov/data/developers/data-sets.html>), including SF1 (Decennial short-form), SF3 (Decennial long-form), and the American Community Survey (ACS). Confidence intervals provided with ACS data are converted to standard errors to be bundled with estimates in complex acs objects. Package provides new methods to conduct standard operations on acs objects and present/plot data in statistically appropriate ways.

Maintained by Ezra Haber Glenn. Last updated 6 years ago.

6.3 match 11 stars 5.33 score 430 scripts 3 dependents

lbbe-software

MareyMap:Estimation of Meiotic Recombination Rates Using Marey Maps

Local recombination rates are graphically estimated across a genome using Marey maps.

Maintained by Aurélie Siberchicot. Last updated 29 days ago.

6.3 match 1 stars 5.30 score 20 scripts

emmanuelparadis

ape:Analyses of Phylogenetics and Evolution

Functions for reading, writing, plotting, and manipulating phylogenetic trees, analyses of comparative data in a phylogenetic framework, ancestral character analyses, analyses of diversification and macroevolution, computing distances from DNA sequences, reading and writing nucleotide sequences as well as importing from BioConductor, and several tools such as Mantel's test, generalized skyline plots, graphical exploration of phylogenetic data (alex, trex, kronoviz), estimation of absolute evolutionary rates and clock-like trees using mean path lengths and penalized likelihood, dating trees with non-contemporaneous sequences, translating DNA into AA sequences, and assessing sequence alignments. Phylogeny estimation can be done with the NJ, BIONJ, ME, MVR, SDM, and triangle methods, and several methods handling incomplete distance matrices (NJ*, BIONJ*, MVR*, and the corresponding triangle method). Some functions call external applications (PhyML, Clustal, T-Coffee, Muscle) whose results are returned into R.

Maintained by Emmanuel Paradis. Last updated 20 hours ago.

openblas cpp

2.0 match 64 stars 15.83 score 13k scripts 599 dependents

cran

fossil:Palaeoecological and Palaeogeographical Analysis Tools

A set of analytical tools useful in analysing ecological and geographical data sets, both ancient and modern. The package includes functions for estimating species richness (Chao 1 and 2, ACE, ICE, Jackknife), shared species/beta diversity, species area curves and geographic distances and areas.

Maintained by Matthew J. Vavrek. Last updated 5 years ago.

8.9 match 1 stars 3.44 score 7 dependents

alanarnholt

BSDA:Basic Statistics and Data Analysis

Data sets for book "Basic Statistics and Data Analysis" by Larry J. Kitchens.

Maintained by Alan T. Arnholt. Last updated 2 years ago.

3.4 match 7 stars 9.11 score 1.3k scripts 6 dependents

r-lib

gtable:Arrange 'Grobs' in Tables

Tools to make it easier to work with "tables" of 'grobs'. The 'gtable' package defines a 'gtable' grob class that specifies a grid along with a list of grobs and their placement in the grid. Further the package makes it easy to manipulate and combine 'gtable' objects so that complex compositions can be built up sequentially.

Maintained by Thomas Lin Pedersen. Last updated 5 months ago.

1.7 match 92 stars 18.12 score 4.1k scripts 7.6k dependents

bioc

RBGL:An interface to the BOOST graph library

A fairly extensive and comprehensive interface to the graph algorithms contained in the BOOST library.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

graphandnetwork network cpp

3.5 match 8.59 score 320 scripts 132 dependents

adeverse

ade4:Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences

Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007) <doi:10.18637/jss.v022.i04>.

Maintained by Aurélie Siberchicot. Last updated 12 days ago.

openblas cpp

2.0 match 40 stars 15.10 score 2.2k scripts 257 dependents

ddsjoberg

gtsummary:Presentation-Ready Data Summary and Analytic Result Tables

Creates presentation-ready tables summarizing data sets, regression models, and more. The code to create the tables is concise and highly customizable. Data frames can be summarized with any function, e.g. mean(), median(), even user-written functions. Regression models are summarized and include the reference rows for categorical variables. Common regression models, such as logistic regression and Cox proportional hazards regression, are automatically identified and the tables are pre-filled with appropriate column headers.

Maintained by Daniel D. Sjoberg. Last updated 6 days ago.

easy-to-use gt html5 regression-models reproducibility reproducible-research statistics summary-statistics summary-tables table1 tableone

1.8 match 1.1k stars 17.02 score 8.2k scripts 15 dependents

squidlobster

castor:Efficient Phylogenetics on Large Trees

Efficient phylogenetic analyses on massive phylogenies comprising up to millions of tips. Functions include pruning, rerooting, calculation of most-recent common ancestors, calculating distances from the tree root and calculating pairwise distances. Calculation of phylogenetic signal and mean trait depth (trait conservatism), ancestral state reconstruction and hidden character prediction of discrete characters, simulating and fitting models of trait evolution, fitting and simulating diversification models, dating trees, comparing trees, and reading/writing trees in Newick format. Citation: Louca, Stilianos and Doebeli, Michael (2017) <doi:10.1093/bioinformatics/btx701>.

Maintained by Stilianos Louca. Last updated 4 months ago.

cpp

5.2 match 2 stars 5.75 score 450 scripts 9 dependents

lydialucchesi

smallsets:Visual Documentation for Data Preprocessing

Data practitioners regularly use the 'R' and 'Python' programming languages to prepare data for analyses. Thus, they encode important data preprocessing decisions in 'R' and 'Python' code. The 'smallsets' package subsequently decodes these decisions into a Smallset Timeline, a static, compact visualisation of data preprocessing decisions (Lucchesi et al. (2022) <doi:10.1145/3531146.3533175>). The visualisation consists of small data snapshots of different preprocessing steps. The 'smallsets' package builds this visualisation from a user's dataset and preprocessing code located in an 'R', 'R Markdown', 'Python', or 'Jupyter Notebook' file. Users simply add structured comments with snapshot instructions to the preprocessing code. One optional feature in 'smallsets' requires installation of the 'Gurobi' optimisation software and 'gurobi' 'R' package, available from <https://www.gurobi.com>. More information regarding the optional feature and 'gurobi' installation can be found in the 'smallsets' vignette.

Maintained by Lydia R. Lucchesi. Last updated 2 months ago.

data-science data-visualization documentation-tool machine-learning preprocessing python visualization-tools

5.7 match 14 stars 5.19 score 11 scripts

fmestre1

MetaLandSim:Landscape and Range Expansion Simulation

Tools to generate random landscape graphs, evaluate species occurrence in dynamic landscapes, simulate future landscape occupation and evaluate range expansion when new empty patches are available (e.g. as a result of climate change). References: Mestre, F., Canovas, F., Pita, R., Mira, A., Beja, P. (2016) <doi:10.1016/j.envsoft.2016.03.007>; Mestre, F., Risk, B., Mira, A., Beja, P., Pita, R. (2017) <doi:10.1016/j.ecolmodel.2017.06.013>; Mestre, F., Pita, R., Mira, A., Beja, P. (2020) <doi:10.1186/s12898-019-0273-5>.

Maintained by Frederico Mestre. Last updated 2 years ago.

biogeography ecology metapopulation species

5.7 match 3 stars 5.10 score 28 scripts

elvanceyhan

pcds:Proximity Catch Digraphs and Their Applications

Contains functions for the construction and visualization of various families of proximity catch digraphs (PCDs) (see Ceyhan (2005) ISBN:978-3-639-19063-2), and for computing the graph invariants used to test patterns of segregation and association against complete spatial randomness (CSR) or uniformity in one-, two-, and three-dimensional cases. The package also has tools for generating points from these spatial patterns. The graph invariants used in testing spatial point data are the domination number (Ceyhan (2011) <doi:10.1080/03610921003597211>) and arc density (Ceyhan et al. (2006) <doi:10.1016/j.csda.2005.03.002>; Ceyhan et al. (2007) <doi:10.1002/cjs.5550350106>). The PCD families considered are Arc-Slice PCDs, Proportional-Edge PCDs, and Central Similarity PCDs.

Maintained by Elvan Ceyhan. Last updated 2 years ago.

5.0 match 5.80 score 21 scripts 2 dependents

martin3141

spant:MR Spectroscopy Analysis Tools

Tools for reading, visualising and processing Magnetic Resonance Spectroscopy data. The package includes methods for spectral fitting: Wilson (2021) <DOI:10.1002/mrm.28385> and spectral alignment: Wilson (2018) <DOI:10.1002/mrm.27605>.

Maintained by Martin Wilson. Last updated 4 days ago.

brain mri mrs mrshub spectroscopy fortran

3.3 match 25 stars 8.52 score 81 scripts

kasperwelbers

corpustools:Managing, Querying and Analyzing Tokenized Text

Provides text analysis in R, focusing on the use of a tokenized text format. In this format, the positions of tokens are maintained, and each token can be annotated (e.g., part-of-speech tags, dependency relations). Prominent features include advanced Lucene-like querying for specific tokens or contexts (e.g., documents, sentences), similarity statistics for words and documents, exporting to DTM for compatibility with many text analysis packages, and the possibility to reconstruct original text from tokens to facilitate interpretation.

Maintained by Kasper Welbers. Last updated 6 months ago.

cpp

3.7 match 31 stars 7.50 score 174 scripts 1 dependents

rich-iannone

DiagrammeR:Graph/Network Visualization

Build graph/network structures using functions for stepwise addition and deletion of nodes and edges. Work with data available in tables for bulk addition of nodes, edges, and associated metadata. Use graph selections and traversals to apply changes to specific nodes or edges. A wide selection of graph algorithms allows for the analysis of graphs. Visualize the graphs and take advantage of any aesthetic properties assigned to nodes and edges.

Maintained by Richard Iannone. Last updated 2 months ago.

graph graph-functions network-graph property-graph visualization

1.8 match 1.7k stars 15.29 score 3.8k scripts 86 dependents

lcbc-uio

Conversions:Collection of functions that convert certain RAW data in the LCBC database

Collection of functions that convert certain RAW data in the LCBC database.

Maintained by Athanasia Monika Mowinckel. Last updated 4 years ago.

7.5 match 3.65 score 10 scripts

gagolews

genieclust:Fast and Robust Hierarchical Clustering with Noise Points Detection

A retake on the Genie algorithm (Gagolewski, 2021 <DOI:10.1016/j.softx.2021.100722>), which is a robust hierarchical clustering method (Gagolewski, Bartoszuk, Cena, 2016 <DOI:10.1016/j.ins.2016.05.003>). It is now faster and more memory efficient; determining the whole cluster hierarchy for datasets of 10M points in low-dimensional Euclidean spaces or 100K points in high-dimensional ones takes only a minute or so. Allows clustering with respect to mutual reachability distances so that it can act as a noise point detector or a robustified version of 'HDBSCAN*' (that is able to detect a predefined number of clusters and hence does not depend on the somewhat fragile 'eps' parameter). The package also features an implementation of inequality indices (e.g., Gini and Bonferroni), external cluster validity measures (e.g., the normalised clustering accuracy, the adjusted Rand index, the Fowlkes-Mallows index, and normalised mutual information), and internal cluster validity indices (e.g., the Calinski-Harabasz, Davies-Bouldin, Ball-Hall, Silhouette, and generalised Dunn indices). See also the 'Python' version of 'genieclust' available on 'PyPI', which supports sparse data, more metrics, and even larger datasets.

Maintained by Marek Gagolewski. Last updated 11 days ago.

cluster-analysis clustering clustering-algorithm data-analysis data-mining data-science genie hdbscan hierarchical-clustering hierarchical-clustering-algorithm machine-learning machine-learning-algorithms mlpack nmslib python python3 sparse cpp openmp

3.5 match 61 stars 7.33 score 13 scripts 5 dependents

r-lib

marquee:Markdown Parser and Renderer for R Graphics

Provides the means to parse and render markdown text with grid along with facilities to define the styling of the text.

Maintained by Thomas Lin Pedersen. Last updated 3 days ago.

cpp

3.0 match 86 stars 8.59 score 28 scripts 1 dependents

bioc

limma:Linear Models for Microarray and Omics Data

Data analysis, linear models and differential expression for omics data.

Maintained by Gordon Smyth. Last updated 3 days ago.

exonarray geneexpression transcription alternativesplicing differentialexpression differentialsplicing genesetenrichment dataimport bayesian clustering regression timecourse microarray micrornaarray mrnamicroarray onechannel proprietaryplatforms twochannel sequencing rnaseq batcheffect multiplecomparison normalization preprocessing qualitycontrol biomedicalinformatics cellbiology cheminformatics epigenetics functionalgenomics genetics immunooncology metabolomics proteomics systemsbiology transcriptomics

1.8 match 13.84 score 16k scripts 587 dependents

bioc

tidytof:Analyze High-dimensional Cytometry Data Using Tidy Data Principles

This package implements an interactive, scientific analysis pipeline for high-dimensional cytometry data built using tidy data principles. It is specifically designed to play well with both the tidyverse and Bioconductor software ecosystems, with functionality for reading/writing data files, data cleaning, preprocessing, clustering, visualization, modeling, and other quality-of-life functions. tidytof implements a "grammar" of high-dimensional cytometry data analysis.

Maintained by Timothy Keyes. Last updated 5 months ago.

singlecell flowcytometry bioinformatics cytometry data-science single-cell tidyverse cpp

3.3 match 18 stars 7.24 score 35 scripts

alarm-redist

redistmetrics:Redistricting Metrics

Reliable and flexible tools for scoring redistricting plans using common measures and metrics. These functions provide key direct access to tools useful for non-simulation analyses of redistricting plans, such as for measuring compactness or partisan fairness. Tools are designed to work with the 'redist' package seamlessly.

Maintained by Christopher T. Kenny. Last updated 10 months ago.

openblas cpp

3.0 match 10 stars 7.57 score 23 scripts 2 dependents

ropensci

pathviewr:Wrangle, Analyze, and Visualize Animal Movement Data

Tools to import, clean, and visualize movement data, particularly from motion capture systems such as Optitrack's 'Motive', the Straw Lab's 'Flydra', or from other sources. We provide functions to remove artifacts, standardize tunnel position and tunnel axes, select a region of interest, isolate specific trajectories, fill gaps in trajectory data, and calculate 3D and per-axis velocity. For experiments of visual guidance, we also provide functions that use subject position to estimate perception of visual stimuli.

Maintained by Vikram B. Baliga. Last updated 2 years ago.

animal-movement flydra motion movement-data optitrack trajectories trajectory-analysis visual-guidance visual-perception

3.5 match 8 stars 6.56 score 102 scripts

epiverse-trace

cleanepi:Clean and Standardize Epidemiological Data

A package for cleaning and standardizing tabular data, tailored specifically for curating epidemiological data. It streamlines various data cleaning tasks that are typically expected when working with datasets in epidemiology. It returns the processed data in the same format, and generates a comprehensive report detailing the outcomes of each cleaning task.

Maintained by Karim Mané. Last updated 11 days ago.

data-cleaning epidemiology epiverse

3.0 match 8 stars 7.39 score 19 scripts

bioc

slingshot:Tools for ordering single-cell sequencing

Provides functions for inferring continuous, branching lineage structures in low-dimensional data. Slingshot was designed to model developmental trajectories in single-cell RNA sequencing data and serve as a component in an analysis pipeline after dimensionality reduction and clustering. It is flexible enough to handle arbitrarily many branching events and allows for the incorporation of prior knowledge through supervised graph construction.

Maintained by Kelly Street. Last updated 5 months ago.

clustering differentialexpression geneexpression rnaseq sequencing software singlecell transcriptomics visualization

1.8 match 286 stars 12.01 score 1.0k scripts 4 dependents

vincentarelbundock

tinytable:Simple and Configurable Tables in 'HTML', 'LaTeX', 'Markdown', 'Word', 'PNG', 'PDF', and 'Typst' Formats

Create highly customized tables with this simple and dependency-free package. Data frames can be converted to 'HTML', 'LaTeX', 'Markdown', 'Word', 'PNG', 'PDF', or 'Typst' tables. The user interface is minimalist and easy to learn. The syntax is concise. 'HTML' tables can be customized using the flexible 'Bootstrap' framework, and 'LaTeX' code with the 'tabularray' package.

Maintained by Vincent Arel-Bundock. Last updated 10 days ago.

1.8 match 264 stars 12.26 score 562 scripts 10 dependents

lyzander

tableHTML:A Tool to Create HTML Tables

A tool to create and style HTML tables with CSS. These can be exported and used in any application that accepts HTML (e.g. 'shiny', 'rmarkdown', 'PowerPoint'). It also provides functions to create CSS files (which also work with shiny).

Maintained by Theo Boutaris. Last updated 2 years ago.

2.3 match 22 stars 9.10 score 280 scripts 2 dependents

ropensci

tsbox:Class-Agnostic Time Series

Time series toolkit with identical behavior for all time series classes: 'ts','xts', 'data.frame', 'data.table', 'tibble', 'zoo', 'timeSeries', 'tsibble', 'tis' or 'irts'. Also converts reliably between these classes.

Maintained by Christoph Sax. Last updated 5 months ago.

graphics time-series

2.0 match 150 stars 10.61 score 496 scripts 4 dependents

mmaechler

cluster:"Finding Groups in Data": Cluster Analysis Extended Rousseeuw et al.

Methods for cluster analysis. A much-extended version of the original by Peter Rousseeuw, Anja Struyf and Mia Hubert, based on Kaufman and Rousseeuw (1990) "Finding Groups in Data".

Maintained by Martin Maechler. Last updated 19 days ago.

1.7 match 3 stars 11.98 score 14k scripts 2.2k dependents

wa-department-of-agriculture

soils:Visualize and Report Soil Health Data

Collection of soil health data visualization and reporting tools, including an RStudio project template with everything you need to generate custom HTML and Microsoft Word reports for each participant in your soil health sampling project.

Maintained by Jadey N Ryan. Last updated 6 days ago.

3.5 match 10 stars 5.73 score 9 scripts

rjdverse

rjd3toolkit:Utility Functions around 'JDemetra+ 3.0'

R Interface to 'JDemetra+ 3.x' (<https://github.com/jdemetra>) time series analysis software. It provides functions to model time series (create outlier regressors, user-defined calendar regressors, UCARIMA models...), to test for the presence of trading days or seasonal effects, and to set specifications in pre-adjustment and benchmarking when using rjd3x13 or rjd3tramoseats.

Maintained by Tanguy Barthelemy. Last updated 5 months ago.

java jdemetra seasonal-adjustment time-series timeseries openjdk

3.5 match 6 stars 5.74 score 48 scripts 16 dependents

ovvo-financial

NNS:Nonlinear Nonparametric Statistics

Nonlinear nonparametric statistics using partial moments. Partial moments are the elements of variance and asymptotically approximate the area of f(x). These robust statistics provide the basis for nonlinear analysis while retaining linear equivalences. NNS offers: Numerical integration, Numerical differentiation, Clustering, Correlation, Dependence, Causal analysis, ANOVA, Regression, Classification, Seasonality, Autoregressive modeling, Normalization, Stochastic dominance and Advanced Monte Carlo sampling. All routines are based on: Viole, F. and Nawrocki, D. (2013), Nonlinear Nonparametric Statistics: Using Partial Moments (ISBN: 1490523995).

Maintained by Fred Viole. Last updated 4 days ago.

clustering econometrics machine-learning nonlinear nonparametric partial-moments statistics time-series cpp

1.8 match 72 stars 10.77 score 66 scripts 3 dependents

ms609

TreeTools:Create, Modify and Analyse Phylogenetic Trees

Efficient implementations of functions for the creation, modification and analysis of phylogenetic trees. Applications include: generation of trees with specified shapes; tree rearrangement; analysis of tree shape; rooting of trees and extraction of subtrees; calculation and depiction of split support; plotting the position of rogue taxa (Klopfstein & Spasojevic 2019) <doi:10.1371/journal.pone.0212942>; calculation of ancestor-descendant relationships, of 'stemwardness' (Asher & Smith, 2022) <doi:10.1093/sysbio/syab072>, and of tree balance (Mir et al. 2013, Lemant et al. 2022) <doi:10.1016/j.mbs.2012.10.005>, <doi:10.1093/sysbio/syac027>; artificial extinction (Asher & Smith, 2022) <doi:10.1093/sysbio/syab072>; import and export of trees from Newick, Nexus (Maddison et al. 1997) <doi:10.1093/sysbio/46.4.590>, and TNT <https://www.lillo.org.ar/phylogeny/tnt/> formats; and analysis of splits and cladistic information.

Maintained by Martin R. Smith. Last updated 7 days ago.

evolutionary-biology phylogenetic-trees phylogenetics cpp

2.0 match 23 stars 9.83 score 124 scripts 10 dependents

gforge

Gmisc:Descriptive Statistics, Transition Plots, and More

Tools for making the descriptive "Table 1" used in medical articles, a transition plot for showing changes between categories (also known as a Sankey diagram), flow charts by extending the grid package, a method for variable selection based on the SVD, Bézier lines with arrows complementing the ones in the 'grid' package, and more.

Maintained by Max Gordon. Last updated 2 years ago.

cpp

1.9 match 51 stars 10.34 score 233 scripts 2 dependents

tconwell

html5:Creates Valid HTML5 Strings

Generates valid HTML tag strings for HTML5 elements documented by Mozilla. Attributes are passed as named lists, with names being the attribute name and values being the attribute value. Attribute values are automatically double-quoted. To declare a DOCTYPE, wrap html() with function doctype(). Mozilla's documentation for HTML5 is available here: <https://developer.mozilla.org/en-US/docs/Web/HTML/Element>. Elements marked as obsolete are not included.

Maintained by Timothy Conwell. Last updated 2 years ago.

5.2 match 1 stars 3.65 score 1 scripts 3 dependents

bioc

IntEREst:Intron-Exon Retention Estimator

This package performs Intron-Exon Retention analysis on RNA-seq data (.bam files).

Maintained by Ali Oghabian. Last updated 4 days ago.

software alternativesplicing coverage differentialsplicing sequencing rnaseq alignment normalization differentialexpression immunooncology

4.5 match 4.16 score 12 scripts

emmanuelparadis

pegas:Population and Evolutionary Genetics Analysis System

Functions for reading, writing, plotting, analysing, and manipulating allelic and haplotypic data, including from VCF files, and for the analysis of population nucleotide sequences and micro-satellites including coalescent analyses, linkage disequilibrium, population structure (Fst, Amova) and equilibrium (HWE), haplotype networks, minimum spanning tree and network, and median-joining networks.

Maintained by Emmanuel Paradis. Last updated 1 year ago.

2.4 match 7.56 score 576 scripts 17 dependents

insightsengineering

formatters:ASCII Formatting for Values and Tables

We provide a framework for rendering complex tables to ASCII, and a set of formatters for transforming values or sets of values into ASCII-ready display strings.

Maintained by Joe Zhu. Last updated 3 months ago.

format matrix table

1.8 match 17 stars 10.19 score 22 scripts 20 dependents

ms609

TreeDist:Calculate and Map Distances Between Phylogenetic Trees

Implements measures of tree similarity, including information-based generalized Robinson-Foulds distances (Phylogenetic Information Distance, Clustering Information Distance, Matching Split Information Distance; Smith 2020) <doi:10.1093/bioinformatics/btaa614>; Jaccard-Robinson-Foulds distances (Bocker et al. 2013) <doi:10.1007/978-3-642-40453-5_13>, including the Nye et al. (2006) metric <doi:10.1093/bioinformatics/bti720>; the Matching Split Distance (Bogdanowicz & Giaro 2012) <doi:10.1109/TCBB.2011.48>; Maximum Agreement Subtree distances; the Kendall-Colijn (2016) distance <doi:10.1093/molbev/msw124>, and the Nearest Neighbour Interchange (NNI) distance, approximated per Li et al. (1996) <doi:10.1007/3-540-61332-3_168>. Includes tools for visualizing mappings of tree space (Smith 2022) <doi:10.1093/sysbio/syab100>, for identifying islands of trees (Silva and Wilkinson 2021) <doi:10.1093/sysbio/syab015>, for calculating the median of sets of trees, and for computing the information content of trees and splits.

Maintained by Martin R. Smith. Last updated 2 months ago.

phylogenetics tree-distance phylogenetic-trees tree-distances trees cpp

1.7 match 32 stars 10.32 score 97 scripts 5 dependents

lukejharmon

geiger:Analysis of Evolutionary Diversification

Methods for fitting macroevolutionary models to phylogenetic trees; see Pennell (2014) <doi:10.1093/bioinformatics/btu181>.

Maintained by Luke Harmon. Last updated 2 years ago.

openblas cpp

2.3 match 1 stars 7.84 score 2.3k scripts 28 dependents

animaltags

tagtools:Work with Data from High-Resolution Biologging Tags

High-resolution movement-sensor tags typically include accelerometers to measure body posture and sudden movements or changes in speed, magnetometers to measure direction of travel, and pressure sensors to measure dive depth in aquatic or marine animals. The sensors in these tags usually sample many times per second. Some tags include sensors for speed, turning rate (gyroscopes), and sound. This package provides software tools to facilitate calibration, processing, and analysis of such data. Tools are provided for: data import/export; calibration (from raw data to calibrated data in scientific units); visualization (for example, multi-panel time-series plots); data processing (such as event detection, calculation of derived metrics like jerk and dynamic acceleration, dive detection, and dive parameter calculation); and statistical analysis (for example, track reconstruction, a rotation test, and Mahalanobis distance analysis).

Maintained by Stacy DeRuiter. Last updated 9 months ago.

3.8 match 8 stars 4.62 score 26 scripts

tguillerme

dispRity:Measuring Disparity

A modular package for measuring disparity (multidimensional space occupancy). Disparity can be calculated from any matrix defining a multidimensional space. The package provides a set of implemented metrics to measure properties of the space and allows users to provide and test their own metrics. The package also provides functions for looking at disparity in a serial way (e.g. disparity through time) or per group, as well as for visualising the results. Finally, this package provides several statistical tests for disparity analysis.

Maintained by Thomas Guillerme. Last updated 14 days ago.

disparity ecology multidimensionality palaeobiology

2.0 match 26 stars 8.65 score 220 scripts 1 dependents

bioc

FlowSOM:Using self-organizing maps for visualization and interpretation of cytometry data

FlowSOM offers visualization options for cytometry data, by using Self-Organizing Map clustering and Minimal Spanning Trees.

Maintained by Sofie Van Gassen. Last updated 5 months ago.

cellbiology flowcytometry clustering visualization software cellbasedassays

2.2 match 7.71 score 468 scripts 10 dependents

natverse

nat:NeuroAnatomy Toolbox for Analysis of 3D Image Data

NeuroAnatomy Toolbox (nat) enables analysis and visualisation of 3D biological image data, especially traced neurons. Reads and writes 3D images in NRRD and 'Amira' AmiraMesh formats and reads surfaces in 'Amira' hxsurf format. Traced neurons can be imported from and written to SWC and 'Amira' LineSet and SkeletonGraph formats. These data can then be visualised in 3D via 'rgl', manipulated including applying calculated registrations, e.g. using the 'CMTK' registration suite, and analysed. There is also a simple representation for neurons that have been subjected to 3D skeletonisation but not formally traced; this allows morphological comparison between neurons including searches and clustering (via the 'nat.nblast' extension package).

Maintained by Gregory Jefferis. Last updated 6 months ago.

3d connectomics image-analysis neuroanatomy neuroanatomy-toolbox neuron neuron-morphology neuroscience visualisation

1.7 match 67 stars 9.94 score 436 scripts 2 dependents

bioc

HGC:A fast hierarchical graph-based clustering method

HGC (short for Hierarchical Graph-based Clustering) is an R package for conducting hierarchical clustering on large-scale single-cell RNA-seq (scRNA-seq) data. The key idea is to construct a dendrogram of cells on their shared nearest neighbor (SNN) graph. HGC provides functions for building graphs and for conducting hierarchical clustering on the graph. Users with an older R version can visit https://github.com/XuegongLab/HGC/tree/HGC4oldRVersion to get the HGC package built for R 3.6.

Maintained by XGlab. Last updated 5 months ago.

singlecell software clustering rnaseq graphandnetwork dnaseq cpp

3.5 match 4.70 score 25 scripts

bayesball

ProbBayes:Probability and Bayesian Modeling

Functions and datasets to accompany J. Albert and J. Hu, "Probability and Bayesian Modeling", CRC Press, (2019, ISBN: 1138492566).

Maintained by Jim Albert. Last updated 4 years ago.

3.8 match 5 stars 4.30 score 80 scripts

pharmaverse

tidytlg:Create TLGs using the 'tidyverse'

Generate tables, listings, and graphs (TLG) using 'tidyverse'. Tables can be created functionally, using a standard TLG process, or by specifying table and column metadata to create generic analysis summaries. The 'envsetup' package can also be leveraged to create environments for table creation.

Maintained by Konrad Pagacz. Last updated 9 months ago.

2.0 match 33 stars 7.96 score 22 scripts

bioc

PREDA:Position Related Data Analysis

Package for the position related analysis of quantitative functional genomics data.

Maintained by Francesco Ferrari. Last updated 5 months ago.

software copynumbervariation geneexpression genetics

3.7 match 4.30 score 9 scripts

sevvandi

stxplore:Exploration of Spatio-Temporal Data

A set of statistical tools for spatio-temporal data exploration. Includes simple plotting functions, covariance calculations and computations similar to principal component analysis for spatio-temporal data. Can use both dataframes and stars objects for all plots and computations. For more details, refer to 'Spatio-Temporal Statistics with R' (Christopher K. Wikle, Andrew Zammit-Mangion, Noel Cressie, 2019, ISBN:9781138711136).

Maintained by Sevvandi Kandanaarachchi. Last updated 2 years ago.

3.3 match 5 stars 4.70 score 7 scripts

tesselle

kairos:Analysis of Chronological Patterns from Archaeological Count Data

A toolkit for absolute and relative dating and analysis of chronological patterns. This package includes functions for chronological modeling and dating of archaeological assemblages from count data. It provides methods for matrix seriation. It also computes time point estimates and density estimates of the occupation and duration of an archaeological site.

Maintained by Nicolas Frerebeau. Last updated 6 hours ago.

chronology matrix-seriation archaeology archaeological-science

3.3 match 4.69 score 11 scripts 1 dependents

john-harrold

onbrand:Templated Reporting Workflows in Word and PowerPoint

Automated reporting in Word and PowerPoint can require customization for each organizational template. This package works around this by adding standard reporting functions and an abstraction layer to facilitate automated reporting workflows that can be replicated across different organizational templates.

Maintained by John Harrold. Last updated 2 months ago.

1.9 match 24 stars 8.11 score 45 scripts 4 dependents

k5cents

gluedown:Wrap Vectors in Markdown Formatting

Ease the transition between R vectors and markdown text. With 'gluedown' and 'rmarkdown', users can create traditional vectors in R, glue those strings together with the markdown syntax, and print those formatted vectors directly to the document. This package primarily uses GitHub Flavored Markdown (GFM), an offshoot of the unambiguous CommonMark specification by John MacFarlane (2019) <https://spec.commonmark.org/>.

Maintained by Kiernan Nicholls. Last updated 1 year ago.

markdown

2.0 match 115 stars 7.59 score 57 scripts

matloff

qeML:Quick and Easy Machine Learning Tools

The letters 'qe' in the package title stand for "quick and easy," alluding to the convenience goal of the package. We bring together a variety of machine learning (ML) tools from standard R packages, providing wrappers with a simple, convenient, and uniform interface.

Maintained by Norm Matloff. Last updated 11 days ago.

1.8 match 41 stars 8.37 score 48 scripts 1 dependents

bioc

GSAR:Gene Set Analysis in R

Gene set analysis using specific alternative hypotheses. Tests for differential expression, scale and net correlation structure.

Maintained by Yasir Rahmatallah. Last updated 5 months ago.

software statisticalmethod differentialexpression

3.4 match 4.38 score 7 scripts

gsk-biostatistics

tfrmt:Applies Display Metadata to Analysis Results Datasets

Creates a framework to store and apply display metadata to Analysis Results Datasets (ARDs). The use of 'tfrmt' allows users to define table format and styling without the data, and later apply the format to the data.

Maintained by Becca Krouse. Last updated 4 months ago.

1.8 match 73 stars 8.03 score 84 scripts 1 dependents

alexchristensen

NetworkToolbox:Methods and Measures for Brain, Cognitive, and Psychometric Network Analysis

Implements network analysis and graph theory measures used in neuroscience, cognitive science, and psychology. Methods include various filtering methods and approaches such as threshold, dependency (Kenett, Tumminello, Madi, Gur-Gershgoren, Mantegna, & Ben-Jacob, 2010 <doi:10.1371/journal.pone.0015032>), Information Filtering Networks (Barfuss, Massara, Di Matteo, & Aste, 2016 <doi:10.1103/PhysRevE.94.062306>), and Efficiency-Cost Optimization (Fallani, Latora, & Chavez, 2017 <doi:10.1371/journal.pcbi.1005305>). Brain methods include the recently developed Connectome Predictive Modeling (see references in package). Also implements several network measures including local network characteristics (e.g., centrality), community-level network characteristics (e.g., community centrality), global network characteristics (e.g., clustering coefficient), and various other measures associated with the reliability and reproducibility of network analysis.

Maintained by Alexander Christensen. Last updated 2 years ago.

network-analysis

2.0 match 23 stars 7.04 score 101 scripts 4 dependents

bioc

TrajectoryUtils:Single-Cell Trajectory Analysis Utilities

Implements low-level utilities for single-cell trajectory analysis, primarily intended for re-use inside higher-level packages. Includes a function to create a cluster-level minimum spanning tree and data structures to hold pseudotime inference results.

Maintained by Aaron Lun. Last updated 5 months ago.

geneexpression singlecell

2.3 match 5.91 score 16 scripts 9 dependents

sebastian-engelke

graphicalExtremes:Statistical Methodology for Graphical Extreme Value Models

Statistical methodology for sparse multivariate extreme value models. Methods are provided for exact simulation and statistical inference for multivariate Pareto distributions on graphical structures as described in the paper 'Graphical Models for Extremes' by Engelke and Hitz (2020) <doi:10.1111/rssb.12355>.

Maintained by Sebastian Engelke. Last updated 3 months ago.

1.8 match 16 stars 7.38 score 28 scripts 1 dependents

rubensmoura87

MultiATSM:Multicountry Term Structure of Interest Rates Models

Estimation routines for several classes of affine term structure of interest rates models. All the models are based on the single-country unspanned macroeconomic risk framework from Joslin, Priebsch, and Singleton (2014, JF) <doi:10.1111/jofi.12131>. Multicountry extensions such as the ones of Jotikasthira, Le, and Lundblad (2015, JFE) <doi:10.1016/j.jfineco.2014.09.004>, Candelon and Moura (2023, EM) <doi:10.1016/j.econmod.2023.106453>, and Candelon and Moura (Forthcoming, JFEC) <doi:10.1093/jjfinec/nbae008> are also available.

Maintained by Rubens Moura. Last updated 8 days ago.

3.3 match 4.00 score 8 scripts

josherrickson

rlemon:R Access to LEMON Graph Algorithms

Allows easy access to the LEMON Graph Library set of algorithms, written in C++. See the LEMON project page at <https://lemon.cs.elte.hu/trac/lemon>. Current LEMON version is 1.3.1.

Maintained by Josh Errickson. Last updated 2 months ago.

cpp

1.9 match 8 stars 7.04 score 1 scripts 15 dependents

corybrunson

ordr:A Tidyverse Extension for Ordinations and Biplots

Ordination comprises several multivariate exploratory and explanatory techniques with theoretical foundations in geometric data analysis; see Podani (2000, ISBN:90-5782-067-6) for techniques and applications and Le Roux & Rouanet (2005) <doi:10.1007/1-4020-2236-0> for foundations. Greenacre (2010, ISBN:978-84-923846) shows how the most established of these, including principal components analysis, correspondence analysis, multidimensional scaling, factor analysis, and discriminant analysis, rely on eigen-decompositions or singular value decompositions of pre-processed numeric matrix data. These decompositions give rise to a set of shared coordinates along which the row and column elements can be measured. The overlay of their scatterplots on these axes, introduced by Gabriel (1971) <doi:10.1093/biomet/58.3.453>, is called a biplot. 'ordr' provides inspection, extraction, manipulation, and visualization tools for several popular ordination classes supported by a set of recovery methods. It is inspired by and designed to integrate into 'tidyverse' workflows provided by Wickham et al. (2019) <doi:10.21105/joss.01686>.

Maintained by Jason Cory Brunson. Last updated 28 days ago.

biplot data-visualization dimension-reduction geometric-data-analysis grammar-of-graphics log-ratio-analysis multivariate-analysis multivariate-statistics ordination tidymodels tidyverse

1.7 match 24 stars 7.26 score 28 scripts

beanumber

etl:Extract-Transform-Load Framework for Medium Data

A predictable and pipeable framework for performing ETL (extract-transform-load) operations on publicly accessible medium-sized data sets. This package sets up the method structure and implements generic functions. Packages that depend on this package download specific data sets from the Internet, clean them up, and import them into a local or remote relational database management system.

Maintained by Benjamin S. Baumer. Last updated 1 year ago.

1.7 match 129 stars 7.17 score 38 scripts 1 dependents

bluegreen-labs

phenocamr:Facilitates 'PhenoCam' Data Access and Time Series Post-Processing

Programmatic interface to the 'PhenoCam' web services (<https://phenocam.nau.edu/webcam>). Allows for easy downloading of 'PhenoCam' data directly to your R workspace or your computer and provides post-processing routines for consistent and easy time series outlier detection, smoothing and estimation of phenological transition dates. Methods for this package are described in detail in Hufkens et al. (2018) <doi:10.1111/2041-210X.12970>.

Maintained by Koen Hufkens. Last updated 1 year ago.

phenocam phenocam-data phenology-modelling remote-sensing

1.8 match 23 stars 6.71 score 75 scripts 1 dependents

maxcal88

locationgamer:Identification of Location Game Equilibria in Networks

Identification of equilibrium locations in location games (Hotelling (1929) <doi:10.2307/2224214>). In these games, two competing actors place customer-serving units in two locations simultaneously. Customers make the decision to visit the location that is closest to them. The functions in this package include Prim's algorithm (Prim (1957) <doi:10.1002/j.1538-7305.1957.tb01515.x>) to find the minimum spanning tree connecting all network vertices, an implementation of Dijkstra's algorithm (Dijkstra (1959) <doi:10.1007/BF01386390>) to find the shortest distance and path between any two vertices, a self-developed algorithm using elimination of purely dominated strategies to find the equilibrium, and several plotting functions.

Maintained by Maximilian Zellner. Last updated 4 years ago.

4.0 match 3.00 score

zajichek

cheese:Tools for Working with Data During Statistical Analysis

Contains tools for working with data during statistical analysis, promoting flexible, intuitive, and reproducible workflows. There are functions designated for specific statistical tasks such as building a custom univariate descriptive table, computing pairwise association statistics, etc. These are built on a collection of data manipulation tools designed for general use that are motivated by the functional programming concept.

Maintained by Alex Zajichek. Last updated 2 years ago.

data-manipulation statistical-analysis

3.0 match 4.00 score 2 scripts

elianhugh

quartools:Programmatic Element Creation For Quarto Documents

Programmatically generate Quarto-compliant markdown elements.

Maintained by Elian Thiele-Evans. Last updated 1 year ago.

markdown quarto

3.8 match 27 stars 3.13 score 3 scripts

bioc

LOBSTAHS:Lipid and Oxylipin Biomarker Screening through Adduct Hierarchy Sequences

LOBSTAHS is a multifunction package for screening, annotation, and putative identification of mass spectral features in large, HPLC-MS lipid datasets. In silico data for a wide range of lipids, oxidized lipids, and oxylipins can be generated from user-supplied structural criteria with a database generation function. LOBSTAHS then applies these databases to assign putative compound identities to features in any high-mass accuracy dataset that has been processed using xcms and CAMERA. Users can then apply a series of orthogonal screening criteria based on adduct ion formation patterns, chromatographic retention time, and other properties, to evaluate and assign confidence scores to this list of preliminary assignments. During the screening routine, LOBSTAHS rejects assignments that do not meet the specified criteria, identifies potential isomers and isobars, and assigns a variety of annotation codes to assist the user in evaluating the accuracy of each assignment.

Maintained by Henry Holm. Last updated 5 months ago.

immunooncology massspectrometry metabolomics lipidomics dataimport adduct algae bioconductor hplc-esi-ms lipid mass-spectrometry oxidative-stress-biomarkers oxidized-lipids oxylipins plankton

1.8 match 8 stars 6.56 score 9 scripts

bioc

chimeraviz:Visualization tools for gene fusions

chimeraviz manages data from fusion gene finders and provides useful visualization tools.

Maintained by Stian Lågstad. Last updated 5 months ago.

infrastructure alignment

1.8 match 37 stars 6.71 score 14 scripts

psychelzh

preproc.iquizoo:Utility Functions for Data Processing of Iquizoo Games

Several games are developed by IQUIZOO.COM. This package provides the functions used to process the data from all of those games.

Maintained by Liang Zhang. Last updated 6 months ago.

cognitive-science data-processing

5.3 match 1 stars 2.18 score 1 scripts

rmheiberger

HH:Statistical Analysis and Data Display: Heiberger and Holland

Support software for Statistical Analysis and Data Display (Second Edition, Springer, ISBN 978-1-4939-2121-8, 2015) and (First Edition, Springer, ISBN 0-387-40270-5, 2004) by Richard M. Heiberger and Burt Holland. This contemporary presentation of statistical methods features extensive use of graphical displays for exploring data and for displaying the analysis. The second edition includes redesigned graphics and additional chapters. The authors emphasize how to construct and interpret graphs, discuss principles of graphical design, and show how accompanying traditional tabular results are used to confirm the visual impressions derived directly from the graphs. Many of the graphical formats are novel and appear here for the first time in print. All chapters have exercises. All functions introduced in the book are in the package. R code for all examples, both graphs and tables, in the book is included in the scripts directory of the package.

Maintained by Richard M. Heiberger. Last updated 2 months ago.

1.8 match 3 stars 6.42 score 752 scripts 5 dependents

landscitech

roads:Road Network Projection

Iterative least cost path and minimum spanning tree methods for projecting forest road networks. The methods connect a set of target points to an existing road network using 'igraph' <https://igraph.org> to identify least cost routes. The cost of constructing a road segment between adjacent pixels is determined by a user supplied weight raster and a weight function; options include the average of adjacent weight raster values, and a function of the elevation differences between adjacent cells that penalizes steep grades. These road network projection methods are intended for integration into R workflows and modelling frameworks used for forecasting forest change, and can be applied over multiple time-steps without rebuilding a graph at each time-step.

Maintained by Sarah Endicott. Last updated 7 months ago.

1.6 match 4 stars 6.54 score 29 scripts

r-forge

tframe:Time Frame Coding Kernel

A kernel of functions for programming time series methods in a way that is relatively independent of the representation of time. Also provides plotting, time windowing, and some other utility functions which are specifically intended for time series. See the Guide distributed as a vignette, or '?tframe.Intro' for more details. (User utilities are in package 'tfplot'.)

Maintained by Paul Gilbert. Last updated 1 year ago.

2.3 match 4.49 score 34 scripts 3 dependents

kasperwelbers

tokenbrowser:Create Full Text Browsers from Annotated Token Lists

Create browsers for reading full texts from a token list format. Information obtained from text analyses (e.g., topic modeling, word scaling) can be used to annotate the texts.

Maintained by Kasper Welbers. Last updated 4 years ago.

cpp

1.9 match 7 stars 5.38 score 13 scripts 5 dependents

jmw86069

jamba:Just Analysis Methods Base

Just analysis methods ('jam') base functions focused on bioinformatics. Version- and gene-centric alphanumeric sort, unique name and version assignment, colorized console and 'HTML' output, color ramp and palette manipulation, 'Rmarkdown' cache import, styled 'Excel' worksheet import and export, interpolated raster output from smooth scatter and image plots, list to delimited vector, efficient list tools.

Maintained by James M. Ward. Last updated 4 days ago.

bioinformatics

1.8 match 6 stars 5.52 score

spkaluzny

splusTimeDate:Times and Dates from 'S-PLUS'

A collection of classes and methods for working with times and dates. The code was originally available in 'S-PLUS'.

Maintained by Stephen Kaluzny. Last updated 2 months ago.

datetime

2.0 match 4.94 score 58 scripts 2 dependents

drag05

replacer:A Value Replacement Utility

Updates values within csv format data files using a custom, user-built csv format lookup file. Based on the 'data.table' package.

Maintained by Bandur Dragos. Last updated 3 years ago.

3.6 match 2.70 score

pweidemueller

fullRankMatrix:Generation of Full Rank Design Matrix

Creates a full rank matrix out of a given matrix. The intended use is for one-hot encoded design matrices that should be used in linear models to ensure that significant associations can be correctly interpreted. However, 'fullRankMatrix' can be applied to any matrix to make it full rank. It removes columns with only 0's, merges duplicated columns and discovers linearly dependent columns and replaces them with linearly independent columns that span the space of the original columns. Columns are renamed to reflect those modifications. This results in a full rank matrix that can be used as a design matrix in linear models. The algorithm and some functions are inspired by Kuhn, M. (2008) <doi:10.18637/jss.v028.i05>.

Maintained by Paula Weidemueller. Last updated 9 months ago.

1.6 match 14 stars 5.62 score 6 scripts

schweflo

pandocfilters:Pandoc Filters for R

The document converter 'pandoc' <https://pandoc.org/> is widely used in the R community. One feature of 'pandoc' is that it can produce and consume JSON-formatted abstract syntax trees (AST). This makes it possible to transform a given source document into a JSON-formatted AST, alter it with so-called filters and pass the altered JSON-formatted AST back to 'pandoc'. This package provides functions for writing such filters in native R code. Although this package is inspired by the Python package 'pandocfilters' <https://github.com/jgm/pandocfilters/>, it provides additional convenience functions which make it simple to use the 'pandocfilters' package as a report generator. Since 'pandocfilters' inherits most of its functionality from 'pandoc', it can create documents in many formats (for more information see <https://pandoc.org/>) but is also bound to the same limitations as 'pandoc'.

Maintained by Florian Schwendinger. Last updated 3 years ago.

3.3 match 2.80 score 63 scripts

zheng206

ComBatFamQC:Comprehensive Batch Effect Diagnostics and Harmonization

Provides a comprehensive framework for batch effect diagnostics, harmonization, and post-harmonization downstream analysis. Features include interactive visualization tools, robust statistical tests, and a range of harmonization techniques. Additionally, 'ComBatFamQC' enables the creation of life-span age trend plots with estimated age-adjusted centiles and facilitates the generation of covariate-corrected residuals for analytical purposes. Methods for harmonization are based on approaches described in Johnson et al. (2007) <doi:10.1093/biostatistics/kxj037>, Beer et al. (2020) <doi:10.1016/j.neuroimage.2020.117129>, Pomponio et al. (2020) <doi:10.1016/j.neuroimage.2019.116450>, and Chen et al. (2021) <doi:10.1002/hbm.25688>.

Maintained by Zheng Ren. Last updated 8 days ago.

diagnostic-tool harmonization rshinyapp

1.7 match 2 stars 5.41 score 16 scripts

mrc-ide

eppasm:Age-structured EPP Model for HIV Epidemic Estimates

What the package does (one paragraph).

Maintained by Jeff Eaton. Last updated 4 months ago.

1.8 match 6 stars 5.04 score 34 scripts 3 dependents

adrientaudiere

cati:Community Assembly by Traits: Individuals and Beyond

Detect and quantify community assembly processes using trait values of individuals or populations, the T-statistics and other metrics, and dedicated null models.

Maintained by Adrien Taudiere. Last updated 5 months ago.

1.7 match 12 stars 5.33 score 15 scripts

dgrun

RaceID:Identification of Cell Types, Inference of Lineage Trees, and Prediction of Noise Dynamics from Single-Cell RNA-Seq Data

Application of 'RaceID' allows inference of cell types and prediction of lineage trees by the 'StemID2' algorithm (Herman, J.S., Sagar, Grun D. (2018) <DOI:10.1038/nmeth.4662>). 'VarID2' is part of this package and allows quantification of biological gene expression noise at single-cell resolution (Rosales-Alvarez, R.E., Rettkowski, J., Herman, J.S., Dumbovic, G., Cabezas-Wallscheid, N., Grun, D. (2023) <DOI:10.1186/s13059-023-02974-1>).

Maintained by Dominic Grün. Last updated 4 months ago.

cpp

1.8 match 4.74 score 110 scripts

scholaempirica

reschola:The Schola Empirica Package

A collection of utilities, themes and templates for data analysis at Schola Empirica.

Maintained by Jan Netík. Last updated 6 months ago.

1.7 match 4 stars 4.83 score 14 scripts

hadley

highlight:Syntax Highlighter

Syntax highlighter for R code based on the results of the R parser. Rendering in HTML and latex markup. Custom Sweave driver performing syntax highlighting of R code chunks.

Maintained by Hadley Wickham. Last updated 2 years ago.

cpp

1.8 match 4.72 score 67 scripts 4 dependents

kevinhzq

healthdb:Working with Healthcare Databases

A system for identifying diseases or events from healthcare databases and preparing data for epidemiological studies. It includes capabilities not supported by 'SQL', such as matching strings by 'stringr' style regular expressions, and can compute comorbidity scores (Quan et al. (2005) <doi:10.1097/01.mlr.0000182534.19832.83>) directly on a database server. The implementation is based on 'dbplyr' with full 'tidyverse' compatibility.

Maintained by Kevin Hu. Last updated 1 month ago.

1.6 match 2 stars 4.95 score

alexwhitworth

synthACS:Synthetic Microdata and Spatial MicroSimulation Modeling for ACS Data

Provides access to curated American Community Survey (ACS) base tables via a wrapper to library(acs). Builds synthetic micro-datasets at any user-specified geographic level with ten default attributes, and conducts spatial microsimulation modeling (SMSM) via simulated annealing. SMSM is conducted in parallel by default. Lastly, we provide functionality for data-extensibility of micro-datasets <doi:10.18637/jss.v104.i07>.

Maintained by Alex Whitworth. Last updated 2 years ago.

acs acs-data microsimulation spatial-data-analysis cpp

1.9 match 5 stars 4.29 score 78 scripts

groditi

blsR:Make Requests from the Bureau of Labor Statistics API

Implements v2 of the B.L.S. API for requests of survey information and time series data through a 3-tiered API that allows users to interact with the raw API directly, create queries through a functional interface, and re-shape the data structures returned to fit common uses. The API definition is located at: <https://www.bls.gov/developers/api_signature_v2.htm>.

Maintained by Guillermo Roditi Dominguez. Last updated 1 year ago.

1.8 match 14 stars 4.45 score 40 scripts

traversc

seqtrie:Radix Tree and Trie-Based String Distances

A collection of Radix Tree and Trie algorithms for finding similar sequences and calculating sequence distances (Levenshtein and other distance metrics). This work was inspired by a trie implementation in Python: "Fast and Easy Levenshtein distance using a Trie." Hanov (2011) <https://stevehanov.ca/blog/index.php?id=114>.

Maintained by Travers Ching. Last updated 30 days ago.

cpp

1.6 match 8 stars 4.90 score 9 scripts

permaverse

nevada:Network-Valued Data Analysis

A flexible statistical framework for network-valued data analysis. It leverages the complexity of the space of distributions on graphs by using the permutation framework for inference as implemented in the 'flipr' package. Currently, only the two-sample testing problem is covered and generalization to k samples and regression will be added in the future as well. It is a 4-step procedure where the user chooses a suitable representation of the networks, a suitable metric to embed the representation into a metric space, one or more test statistics to target specific aspects of the distributions to be compared and a formula to compute the permutation p-value. Two types of inference are provided: a global test answering whether there is a difference between the distributions that generated the two samples and a local test for localizing differences on the network structure. The latter is assumed to be shared by all networks of both samples. References: Lovato, I., Pini, A., Stamm, A., Vantini, S. (2020) "Model-free two-sample test for network-valued data" <doi:10.1016/j.csda.2019.106896>; Lovato, I., Pini, A., Stamm, A., Taquet, M., Vantini, S. (2021) "Multiscale null hypothesis testing for network-valued data: Analysis of brain networks of patients with autism" <doi:10.1111/rssc.12463>.

Maintained by Aymeric Stamm. Last updated 4 months ago.

openblas cpp

1.7 match 6 stars 4.48 score 7 scripts

bioc

RCSL:Rank Constrained Similarity Learning for single cell RNA sequencing data

A novel clustering algorithm and toolkit RCSL (Rank Constrained Similarity Learning) to accurately identify various cell types using scRNA-seq data from a complex tissue. RCSL considers both local similarity and global similarity among the cells to discern the subtle differences among cells of the same type as well as larger differences among cells of different types. RCSL uses Spearman’s rank correlations of a cell’s expression vector with those of other cells to measure its global similarity, and adaptively learns neighbour representation of a cell as its local similarity. The overall similarity of a cell to other cells is a linear combination of its global similarity and local similarity.

Maintained by Qinglin Mei. Last updated 5 months ago.

singlecell software clustering dimensionreduction rnaseq visualization sequencing

1.7 match 2 stars 4.48 score 10 scripts

bioc

hiAnnotator:Functions for annotating GRanges objects

hiAnnotator contains a set of functions which allow users to annotate a GRanges object with a custom set of annotations. The basic philosophy of this package is to take two GRanges objects (query & subject) with a common set of seqnames (i.e. chromosomes) and return associated annotation per seqnames and rows from the query matching seqnames and rows from the subject (i.e. genes or cpg islands). The package comes with three types of annotation functions which calculate if a position from query is: within a feature, near a feature, or count features in defined window sizes. Moreover, each function is equipped with a parallel backend to utilize the foreach package. In addition, the package is equipped with wrapper functions, which find the appropriate columns needed to make a GRanges object from a common data frame.

Maintained by Nirav V Malani. Last updated 5 months ago.

software annotation

1.6 match 4.65 score 15 scripts 1 dependents

hughjonesd

huxtable:Easily Create and Style Tables for LaTeX, HTML and Other Formats

Creates styled tables for data presentation. Export to HTML, LaTeX, RTF, 'Word', 'Excel', and 'PowerPoint'. Simple, modern interface to manipulate borders, size, position, captions, colours, text styles and number formatting. Table cells can span multiple rows and/or columns. Includes a 'huxreg' function for creation of regression tables, and 'quick_*' one-liners to print data to a new document.

Maintained by David Hugh-Jones. Last updated 27 days ago.

html huxtable latex microsoft-word powerpoint reproducible-research tables

0.5 match 323 stars 13.93 score 1.9k scripts 16 dependents

buttrey

AcrossTic:A Cost-Minimal Regular Spanning Subgraph with TreeClust

Construct a minimum-cost regular spanning subgraph as part of a non-parametric two-sample test for equality of distribution.

Maintained by Sam Buttrey. Last updated 9 years ago.

7.0 match 1.00 score 4 scripts

bioc

RUVnormalize:RUV for normalization of expression array data

RUVnormalize is meant to remove unwanted variation from gene expression data when the factor of interest is not defined, e.g., to clean up a dataset for general use or to do any kind of unsupervised analysis.

Maintained by Laurent Jacob. Last updated 5 months ago.

statisticalmethod normalization

1.7 match 4.11 score 32 scripts

rundel

md4r:Markdown Parser Implemented using the 'MD4C' Library

Provides an R wrapper for the 'MD4C' (Markdown for 'C') library. Functions exist for parsing markdown ('CommonMark' compliant) along with support for other common markdown extensions (e.g. 'GitHub' flavored markdown, 'LaTeX' equation support, etc.). The package also provides a number of higher level functions for exploring and manipulating markdown abstract syntax trees as well as translating and displaying the documents.

Maintained by Colin Rundel. Last updated 1 year ago.

cpp

1.9 match 4 stars 3.60 score 3 scripts

rcurtin

mlpack:'Rcpp' Integration for the 'mlpack' Library

A fast, flexible machine learning library, written in C++, that aims to provide fast, extensible implementations of cutting-edge machine learning algorithms. See also Curtin et al. (2023) <doi:10.21105/joss.05026>.

Maintained by Ryan Curtin. Last updated 4 months ago.

openblas cpp openmp

1.8 match 3.71 score 20 scripts 8 dependents

davidbuch

gretel:Generalized Path Analysis for Social Networks

The social network literature features numerous methods for assigning value to paths as a function of their ties. 'gretel' systemizes these approaches, casting them as instances of a generalized path value function indexed by a penalty parameter. The package also calculates probabilistic path value and identifies optimal paths in either value framework. Finally, proximity matrices can be generated in these frameworks that capture high-order connections overlooked in primitive adjacency sociomatrices. Novel methods are described in Buch (2019) <https://davidbuch.github.io/analyzing-networks-with-gretel.html>. More traditional methods are also implemented, as described in Yang, Knoke (2001) <doi:10.1016/S0378-8733(01)00043-0>.

Maintained by David Buch. Last updated 5 years ago.

cpp

1.8 match 1 stars 3.70 score 6 scripts

loicschwaller

saturnin:Spanning Trees Used for Network Inference

Bayesian inference of graphical model structures using spanning trees.

Maintained by Loïc Schwaller. Last updated 10 years ago.

cpp

5.4 match 1.18 score 15 scripts

repboxr

repboxMap:Mapping information from article and run supplement. Mainly (regression) tables.

This package should use as input only information stored in regdb tables.

Maintained by Sebastian Kranz. Last updated 2 months ago.

1.9 match 3.26 score 1 scripts 2 dependents

drelliesmall

smallstuff:Dr. Small's Functions

Functions used in courses taught by Dr. Small at Drew University.

Maintained by Ellie Small. Last updated 1 year ago.

4.1 match 1.48 score 2 scripts 1 dependents

re2simlab

ViSiElse:A Visual Tool for Behavior Analysis over Time

A graphical R package designed to visualize behavioral observations over time. Based on raw time data extracted from video-recorded sessions of experimental observations, ViSiElse grants a global overview of a process by combining the visualization of multiple action timestamps for all participants in a single graph. Individual and/or group behavior can easily be assessed. Supplementary features allow users to further inspect their data by adding summary statistics (mean, standard deviation, quantile or statistical test) and/or time constraints to assess the accuracy of the realized actions.

Maintained by Elodie Garnier. Last updated 5 years ago.

1.3 match 2 stars 4.34 score 11 scripts

pariya

netShiny:Tool for Comparison and Visualization of Multiple Networks

We developed a comprehensive tool that helps with visualization and analysis of networks with the same variables across multiple factor levels. 'netShiny' contains most of the popular network features such as centrality measures, modularity, and other summary statistics (e.g. clustering coefficient). It also contains known tools to look at the (dis)similarities between two networks, such as pairwise distance measures between networks, set operations on the nodes of the networks, distribution of the weights of the edges and a network representing the difference between two correlation matrices. The package 'netShiny' also contains tools to perform bootstrapping and find clusters in networks. See the 'netShiny' manual for more information, documentation and examples.

Maintained by Pariya Behrouzi. Last updated 3 years ago.

1.9 match 3.00 score 5 scripts

chrisumphlett

revulyticsR:Connect to Your 'Revulytics' Data

Facilitates making a connection to the 'Revulytics' API and executing various queries. You can use it to get event data and metadata. The Revulytics documentation is available at <https://docs.revenera.com/ui560/report/>. This package is not supported by 'Flexera' (owner of the software).

Maintained by Chris Umphlett. Last updated 4 years ago.

3.4 match 1.70 score

cran

ips:Interfaces to Phylogenetic Software in R

Functions that wrap popular phylogenetic software for sequence alignment, masking of sequence alignments, and estimation of phylogenies and ancestral character states.

Maintained by Christoph Heibl. Last updated 11 months ago.

1.8 match 3.18 score 1 dependents

kasperwelbers

rsyntax:Extract Semantic Relations from Text by Querying and Reshaping Syntax

Various functions for querying and reshaping dependency trees, as for instance created with the 'spacyr' or 'udpipe' packages. This enables the automatic extraction of useful semantic relations from texts, such as quotes (who said what) and clauses (who did what). Method proposed in Van Atteveldt et al. (2017) <doi:10.1017/pan.2016.12>.

Maintained by Kasper Welbers. Last updated 3 years ago.

1.8 match 3.19 score 19 scripts 4 dependents

chrisumphlett

reveneraR:Connect to Your 'Revenera' (Formerly 'Revulytics') Data

Facilitates making a connection to the 'Revenera' API and executing various queries. You can use it to get event data and metadata. The 'Revenera' documentation is available at <https://rui-api.redoc.ly/>. This package is not supported by 'Flexera' (owner of the software).

Maintained by Chris Umphlett. Last updated 1 months ago.

api-wrapper flexera revenera usage-data

1.7 match 3.18 score

joemsong

TreeDimensionTest:Trajectory Presence and Heterogeneity in Multivariate Data

Testing for trajectory presence and heterogeneity on multivariate data. Two statistical methods (Tenha & Song 2022) <doi:10.1371/journal.pcbi.1009829> are implemented. The tree dimension test quantifies the statistical evidence for trajectory presence. The subset specificity measure summarizes pattern heterogeneity using the minimum subtree cover. There are no user-tunable parameters for either method. Examples are included to illustrate how to use the methods on single-cell data for studying gene and pathway expression dynamics and pathway expression specificity.

Maintained by Joe Song. Last updated 3 years ago.

cpp

1.8 match 2.70 score 1 scripts

gchapron

MDPtoolbox:Markov Decision Processes Toolbox

The Markov Decision Processes (MDP) toolbox proposes functions related to the resolution of discrete-time Markov Decision Processes: finite horizon, value iteration, policy iteration, linear programming algorithms with some variants and also proposes some functions related to Reinforcement Learning.

Maintained by Guillaume Chapron. Last updated 8 years ago.

2.0 match 3 stars 2.40 score 84 scripts

cran

sim1000G:Genotype Simulations for Rare or Common Variants Using Haplotypes from 1000 Genomes

Generates realistic simulated genetic data in families or unrelated individuals.

Maintained by Apostolos Dimitromanolakis. Last updated 6 years ago.

1.7 match 1 stars 2.78 score

matthewabirk

birk:MA Birk's Functions

Collection of tools to make R more convenient. Includes tools to summarize data using statistics not available with base R and manipulate objects for analyses.

Maintained by Matthew A. Birk. Last updated 9 years ago.

1.8 match 2.38 score 35 scripts 2 dependents

smbartell

MapGAM:Mapping Smoothed Effect Estimates from Individual-Level Data

Contains functions for mapping odds ratios, hazard ratios, or other effect estimates using individual-level data such as case-control study data, using generalized additive models (GAMs) or Cox models for smoothing with a two-dimensional predictor (e.g., geolocation or exposure to chemical mixtures) while adjusting linearly for confounding variables, using methods described by Kelsall and Diggle (1998), Webster et al. (2006), and Bai et al. (2020). Includes convenient functions for mapping point estimates and confidence intervals, efficient control sampling, and permutation tests for the null hypothesis that the two-dimensional predictor is not associated with the outcome variable (adjusting for confounders).

Maintained by Scott Bartell. Last updated 2 years ago.

1.8 match 1 stars 2.37 score 26 scripts 1 dependents

uscbiostats

slurmR:A Lightweight Wrapper for 'Slurm'

'Slurm', Simple Linux Utility for Resource Management <https://slurm.schedmd.com/>, is a popular 'Linux' based software used to schedule jobs in 'HPC' (High Performance Computing) clusters. This R package provides a specialized lightweight wrapper of 'Slurm' with a syntax similar to that found in the 'parallel' R package. The package also includes a method for creating socket cluster objects spanning multiple nodes that can be used with the 'parallel' package.

Maintained by George Vega Yon. Last updated 1 years ago.

bioinformatics hpc slurm

0.5 match 60 stars 8.07 score 216 scripts 1 dependents

wjschne

WJSmisc:Miscellaneous functions from W. Joel Schneider

Several functions I find useful.

Maintained by W. Joel Schneider. Last updated 2 years ago.

1.7 match 5 stars 2.40 score 10 scripts

bioc

TSCAN:Tools for Single-Cell Analysis

Provides methods to perform trajectory analysis based on a minimum spanning tree constructed from cluster centroids. Computes pseudotemporal cell orderings by mapping cells in each cluster (or new cells) to the closest edge in the tree. Uses linear modelling to identify differentially expressed genes along each path through the tree. Several plotting and interactive visualization functions are also implemented.

Maintained by Zhicheng Ji. Last updated 5 months ago.

geneexpression visualization gui

0.5 match 7.58 score 207 scripts 3 dependents

openintrostat

usdata:Data on the States and Counties of the United States

Demographic data on the United States at the county and state levels spanning multiple years.

Maintained by Mine Çetinkaya-Rundel. Last updated 10 months ago.

data openintro

0.6 match 9 stars 6.89 score 294 scripts 1 dependents

cran

DataSimilarity:Quantifying Similarity of Datasets and Multivariate Two- And k-Sample Testing

A collection of methods for quantifying the similarity of two or more datasets, many of which can be used for two- or k-sample testing. It provides newly implemented methods as well as wrapper functions for existing methods that enable calling many different methods in a unified framework. The methods were selected from the review and comparison of Stolte et al. (2024) <doi:10.1214/24-SS149>.

Maintained by Marieke Stolte. Last updated 13 days ago.

1.9 match 2.00 score

stla

html2R:Convert 'HTML' to 'R' with a 'Shiny' App

Provides a 'Shiny' app allowing to convert 'HTML' code to 'R' code (e.g. '<span>Hello</span>' to 'tags$span("Hello")'), for usage in a 'Shiny' UI.

Maintained by Stéphane Laurent. Last updated 5 years ago.

html shiny

1.0 match 9 stars 3.65 score

cran

MPDiR:Data Sets and Scripts for Modeling Psychophysical Data in R

Data sets and scripts for Modeling Psychophysical Data in R (Springer).

Maintained by Ken Knoblauch. Last updated 2 years ago.

3.6 match 1.00 score

mpascariu

MortalityLaws:Parametric Mortality Models, Life Tables and HMD

Fit the most popular human mortality 'laws', and construct full and abridged life tables given various input indices. A mortality law is a parametric function that describes the dying-out process of individuals in a population during a significant portion of their life spans. For a comprehensive review of the most important mortality laws see Tabeau (2001) <doi:10.1007/0-306-47562-6_1>. Practical functions for downloading data from various human mortality databases are provided as well.

Maintained by Marius D. Pascariu. Last updated 1 years ago.

actuarial-science demography download-hmd human-mortality-laws life-table mortality

0.5 match 32 stars 7.00 score 103 scripts 1 dependents

tharindupdealwis

sdrt:Estimating the Sufficient Dimension Reduction Subspaces in Time Series

The sdrt() function is designed for estimating subspaces for Sufficient Dimension Reduction (SDR) in time series, with a specific focus on the Time Series Central Mean subspace (TS-CMS). The package employs the Fourier transformation method proposed by Samadi and De Alwis (2023) <doi:10.48550/arXiv.2312.02110> and the Nadaraya-Watson kernel smoother method proposed by Park et al. (2009) <doi:10.1198/jcgs.2009.08076> for estimating the TS-CMS. The package provides tools for estimating distances between subspaces and includes functions for selecting model parameters using the Fourier transformation method.

Maintained by Tharindu P. De Alwis. Last updated 1 years ago.

1.7 match 2.00 score

oliverehmer

act:Aligned Corpus Toolkit

The Aligned Corpus Toolkit (act) is designed for linguists who work with time-aligned transcription data. It offers functions to import and export various annotation file formats ('ELAN' .eaf, 'EXMARaLDA' .exb and 'Praat' .TextGrid files), create print transcripts in the style of conversation analysis, search transcripts (span searches across multiple annotations, search in normalized annotations, make concordances etc.), export and re-import search results (.csv and 'Excel' .xlsx format), create cuts for the search results (print transcripts, audio/video cuts using 'FFmpeg' and video subtitles in 'SubRip' .srt format), modify the data in a corpus (search/replace, delete, filter etc.), interact with 'Praat' using 'Praat'-scripts, and exchange data with the 'rPraat' package. The package is itself written in R and may be expanded by other users.

Maintained by Oliver Ehmer. Last updated 2 years ago.

0.5 match 4 stars 6.65 score 184 scripts

delosh653

mwshiny:'Shiny' for Multiple Windows

A simple function, mwsApp(), that runs a 'shiny' app spanning multiple, connected windows. This uses all standard 'shiny' conventions, and depends only on the 'shiny' package.

Maintained by Hannah De los Santos. Last updated 5 years ago.

0.5 match 25 stars 5.83 score 27 scripts

fbartos

BayesTools:Tools for Bayesian Analyses

Provides tools for conducting Bayesian analyses and Bayesian model averaging (Kass and Raftery, 1995, <doi:10.1080/01621459.1995.10476572>, Hoeting et al., 1999, <doi:10.1214/ss/1009212519>). The package contains functions for creating a wide range of prior distribution objects, mixing posterior samples from 'JAGS' and 'Stan' models, plotting posterior distributions, and more. The tools for working with prior distributions span from visualization and generation of 'JAGS' and 'bridgesampling' syntax to basic functions such as rng, quantile, and distribution functions.

Maintained by František Bartoš. Last updated 2 months ago.

bayesian model-averaging

0.5 match 7 stars 6.06 score 17 scripts 3 dependents

mechantrouquin

SMITIDvisu:Visualize Data for Host and Viral Population from 'SMITIDstruct' using 'HTMLwidgets'

Visualisation tools for the 'SMITIDstruct' package. Allows visualizing host timeline, transmission tree, index diversities and variant graph using 'HTMLwidgets'. It mainly uses the 'D3JS' JavaScript framework.

Maintained by Jean-Francois Rey. Last updated 4 years ago.

cpp

1.9 match 1.63 score 43 scripts

big-life-lab

cchsflow:Transforming and Harmonizing CCHS Variables

Supporting the use of the Canadian Community Health Survey (CCHS) by transforming variables from each cycle into harmonized, consistent versions that span survey cycles (currently, 2001 to 2018). CCHS data used in this library is accessed and adapted in accordance with the Statistics Canada Open Licence Agreement. This package uses rec_with_table(), which was developed from 'sjmisc' rec(). Lüdecke D (2018). "sjmisc: Data and Variable Transformation Functions". Journal of Open Source Software, 3(26), 754. <doi:10.21105/joss.00754>.

Maintained by Kitty Chen. Last updated 1 years ago.

cchs opensci openscience

0.5 match 12 stars 6.02 score 192 scripts

maarten14c

coffee:Chronological Ordering for Fossils and Environmental Events

While individual calibrated radiocarbon dates can span several centuries, combining multiple dates together with any chronological constraints can make a chronology much more robust and precise. This package uses Bayesian methods to enforce the chronological ordering of radiocarbon and other dates, for example for trees with multiple radiocarbon dates spaced at exactly known intervals (e.g., 10 annual rings). For methods see Christen 2003 <doi:10.11141/ia.13.2>. Another example is sites where the relative chronological position of the dates is taken into account - the ages of dates further down a site must be older than those of dates further up (Buck, Kenworthy, Litton and Smith 1991 <doi:10.1017/S0003598X00080534>; Nicholls and Jones 2001 <doi:10.1111/1467-9876.00250>). The paper accompanying this R package is Blaauw et al. 2024 <doi:10.1017/RDC.2024.56>.

Maintained by Maarten Blaauw. Last updated 3 months ago.

0.5 match 7 stars 6.02 score 6 scripts

kwb-r

kwbGompitz:Interface to GompitZ Tool for Modelling the Degradation of Sewer Pipelines

Functions enabling the writing of GompitZ input files, running of GompitZ Tools (gompcal.exe, gompred.exe) and reading of GompitZ output files.

Maintained by Hauke Sonnenberg. Last updated 3 years ago.

modelling project-sema sewer-classification cpp

1.8 match 1.70 score 1 scripts

lightbluetitan

timeSeriesDataSets:Time Series Data Sets

Provides a diverse collection of time series datasets spanning various fields such as economics, finance, energy, healthcare, and more. Designed to support time series analysis in R by offering datasets from multiple disciplines, making it a valuable resource for researchers and analysts.

Maintained by Renzo Caceres Rossi. Last updated 7 months ago.

0.5 match 10 stars 5.71 score 103 scripts

lightbluetitan

MedDataSets:Comprehensive Medical, Disease, Treatment, and Drug Datasets

Provides an extensive collection of datasets related to medicine, diseases, treatments, drugs, and public health. This package covers topics such as drug effectiveness, vaccine trials, survival rates, infectious disease outbreaks, and medical treatments. The included datasets span various health conditions, including AIDS, cancer, bacterial infections, and COVID-19, along with information on pharmaceuticals and vaccines. These datasets are sourced from the R ecosystem and other R packages, remaining unaltered to ensure data integrity. This package serves as a valuable resource for researchers, analysts, and healthcare professionals interested in conducting medical and public health data analysis in R.

Maintained by Renzo Caceres Rossi. Last updated 5 months ago.

0.5 match 8 stars 5.68 score 60 scripts

eddelbuettel

td:Access to the 'twelvedata' Financial Data API

The 'twelvedata' REST service offers access to current and historical data on stocks, standard as well as digital 'crypto' currencies, and other financial assets covering a wide variety of course and time spans. See <https://twelvedata.com/> for details, to create an account, and to request an API key for free-but-capped access to the data.

Maintained by Dirk Eddelbuettel. Last updated 4 months ago.

0.5 match 16 stars 5.37 score 73 scripts

liyabera

EngrEcon:Engineering Economics Analysis for Engineering Projects Cost Analysis

Economic analysis is a routine activity in civil infrastructure and ecosystem restoration projects. This package contains standard cost engineering and engineering economics methods that are applied to convert between present, future, and annualized costs. Newnan D. (2020) <ISBN 9780190931919> “Engineering Economic Analysis”.
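
As a rough illustration of the conversions this description mentions, the standard textbook relations between a present cost $P$, a future cost $F$, and a uniform annualized cost $A$ are shown below. The symbols $i$ (interest rate per period) and $n$ (number of periods) are notation chosen here, and whether the package implements exactly these forms is an assumption.

```latex
% Compound-interest relations, assuming a constant rate i > 0 over n periods
\begin{align}
  F &= P\,(1+i)^{n}
      && \text{(present to future)} \\
  P &= \frac{F}{(1+i)^{n}}
      && \text{(future to present)} \\
  A &= P\,\frac{i\,(1+i)^{n}}{(1+i)^{n}-1}
      && \text{(present to annualized)} \\
  P &= A\,\frac{(1+i)^{n}-1}{i\,(1+i)^{n}}
      && \text{(annualized to present)}
\end{align}
```

For example, with $P = 1000$, $i = 0.10$ and $n = 2$, the first relation gives $F = 1000 \times 1.1^{2} = 1210$.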

Maintained by Liya Abera. Last updated 12 months ago.

1.6 match 1.70 score 1 scripts

lightbluetitan

crimedatasets:A Comprehensive Collection of Crime-Related Datasets

A comprehensive collection of datasets exclusively focused on crimes, criminal activities, and related topics. This package serves as a valuable resource for researchers, analysts, and students interested in crime analysis, criminology, social and economic studies related to criminal behavior. Datasets span global and local contexts, with a mix of tabular and spatial data.

Maintained by Renzo Caceres Rossi. Last updated 4 months ago.

0.5 match 8 stars 4.90 score 3 scripts

chiefmurph

mondate:Keep track of dates in terms of months

Keep track of dates in terms of months. Model dates as at close of business. Perform date arithmetic in units of "months" and "years". Allow "infinite" dates to model "ultimate" time spans.

Maintained by Dan Murphy. Last updated 3 years ago.

0.5 match 2 stars 4.53 score 91 scripts

populationstatistics

refugees:UNHCR Refugee Population Statistics Database

The Refugee Population Statistics Database published by The Office of The United Nations High Commissioner for Refugees (UNHCR) contains information about forcibly displaced populations spanning more than 70 years of statistical activities. It covers displaced populations such as refugees, asylum-seekers and internally displaced people, including their demographics. Stateless people are also included, most of whom have never been displaced. The database also reflects the different types of solutions for displaced populations such as repatriation or resettlement. More information on the data and methodology can be found on the UNHCR Refugee Data Finder <https://www.unhcr.org/refugee-statistics/>.

Maintained by Hisham Galal. Last updated 5 months ago.

0.5 match 8 stars 4.57 score 46 scripts

bioc

FeatSeekR:FeatSeekR an R package for unsupervised feature selection

FeatSeekR performs unsupervised feature selection using replicated measurements. It iteratively selects features with the highest reproducibility across replicates, after projecting out those dimensions from the data that are spanned by the previously selected features. The selected set of features has a high replicate reproducibility and a high degree of uniqueness.

Maintained by Tuemay Capraz. Last updated 2 months ago.

software statisticalmethod featureextraction massspectrometry

0.5 match 2 stars 4.48 score 3 scripts

shangkun-wang

OSFD:Output Space-Filling Design

Methods to generate a design in the input space that sequentially fills the output space of a black-box function. The output space-filling designs are helpful in inverse design or feature-based modeling problems. See Wang, Shangkun, Adam P. Generale, Surya R. Kalidindi, and V. Roshan Joseph (2024), Sequential designs for filling output spaces, Technometrics, 66, 65–76, for details. This work is supported by U.S. National Science Foundation grant CMMI-1921646.

Maintained by Shangkun Wang. Last updated 28 days ago.

openblas cpp

1.7 match 1.30 score

labbdm

RAMpath:Structural Equation Modeling Using the Reticular Action Model (RAM) Notation

We rewrote the RAMpath software developed by John McArdle and Steven Boker as an R package. In addition to performing regular SEM analysis through the R package lavaan, RAMpath has unique features. First, it can generate path diagrams according to a given model. Second, it can display path tracing rules through path diagrams and decompose total effects into their respective direct and indirect effects as well as decompose variance and covariance into individual bridges. Furthermore, RAMpath can fit dynamic system models automatically based on latent change scores and generate vector field plots based upon results obtained from a bivariate dynamic system. Starting with version 0.4, RAMpath can conduct power analysis for both univariate and bivariate latent change score models.

Maintained by Zhiyong Zhang. Last updated 2 years ago.

2.0 match 1 stars 1.11 score 13 scripts

bioc

flowGate:Interactive Cytometry Gating in R

flowGate adds an interactive Shiny app to allow manual GUI-based gating of flow cytometry data in R. Using flowGate, you can draw 1D and 2D span/rectangle gates, quadrant gates, and polygon gates on flow cytometry data by interactively drawing the gates on a plot of your data, rather than by specifying gate coordinates. This package is especially geared toward wet-lab cytometrists looking to take advantage of R for cytometry analysis, without necessarily having a lot of R experience.

Maintained by Andrew Wight. Last updated 5 months ago.

software workflowstep flowcytometry preprocessing immunooncology dataimport

0.5 match 4.00 score 3 scripts

tjmahr

WrapRmd:RStudio Addin for Wrapping RMarkdown Paragraphs

Provides an RStudio addin for wrapping paragraphs in an RMarkdown document without inserting linebreaks into spans of inline R code.

Maintained by Tristan Mahr. Last updated 1 years ago.

knitr rmarkdown rstudio

0.5 match 104 stars 3.72 score 3 scripts

taylor-arnold

sotu:United States Presidential State of the Union Addresses

The President of the United States is constitutionally obligated to provide a report known as the 'State of the Union'. The report summarizes the current challenges facing the country and the president's upcoming legislative agenda. While historically the State of the Union was often a written document, in recent decades it has always taken the form of an oral address to a joint session of the United States Congress. This package provides the raw text from every such address with the intention of being used for meaningful examples of text analysis in R. The corpus is well suited to the task as it is historically important, includes material intended to be read and material intended to be spoken, and it falls in the public domain. As the corpus spans over two centuries it is also a good test of how well various methods hold up to the idiosyncrasies of historical texts. Associated data about each address, such as the year, president, party, and format, are also included.

Maintained by Taylor B. Arnold. Last updated 3 years ago.

0.5 match 2 stars 3.87 score 74 scripts

singator

autoharp:Semi-Automatic Grading of R and Rmd Scripts

A customisable set of tools for assessing and grading R or R-markdown scripts from students. It allows for checking correctness of code output, runtime statistics and static code analysis. The latter feature is made possible by representing R expressions using a tree structure.

Maintained by Vik Gopal. Last updated 3 years ago.

1.9 match 1 stars 1.00 score 8 scripts

mrbcuda

ustyc:Fetch US Treasury yield curve data.

Forms a query to submit for US Treasury yield curve data, posting this query to the US Treasury web site's data feed service. By default the download includes yield data for 12 products from January 1, 1990, some of which are NA during this span. The caller can pass parameters to limit the query to a certain year or year and month, but the full download is not especially large. The downloaded data from the service is in XML format. The package's main function transforms that XML data into a numeric data frame with treasury product items (constant maturity yields for 12 kinds of bills, notes, and bonds) as columns and dates as row names. The function returns a list which includes an item for this data frame as well as query-related values for reference and the update date from the service.

Maintained by Matt Barry. Last updated 5 years ago.

0.5 match 9 stars 3.65 score 9 scripts

jerryratcliffe

aoristic:Generates Aoristic Probability Distributions

It can sometimes be difficult to ascertain when some events (such as property crime) occur because the victim is not present when the crime happens. As a result, police databases often record a 'start' (or 'from') date and time, and an 'end' (or 'to') date and time. The time span between these date/times can be minutes, hours, or sometimes days, hence the term 'Aoristic'. Aoristic is one of the past tenses in Greek and represents an uncertain occurrence in time. For events with a location described by either a latitude/longitude or an X,Y coordinate pair, and a start and end date/time, this package generates an aoristic data frame with aoristic weighted probability values for each hour of the week, for each observation. The coordinates are not necessary for the program to calculate aoristic weights; however, they are part of this package because a spatial component has been integral to aoristic analysis from the start. Dummy coordinates can be introduced if the user only has temporal data. Outputs include an aoristic data frame, as well as summary graphs and displays. For more information see: Ratcliffe, JH (2002) Aoristic signatures and the temporal analysis of high volume crime patterns, Journal of Quantitative Criminology. 18 (1): 23-43. Note: This package replaces an original 'aoristic' package (version 0.6) by George Kikuchi that has been discontinued with his permission.
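
The idea of weighting each hour of the week by its share of an event's start-to-end span can be sketched as follows. This is a minimal Python illustration of the general concept only, not the package's R interface: the function name `aoristic_weights`, the Monday-first hour indexing, and the handling of zero-length spans are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def aoristic_weights(start, end):
    """Spread one event's unit probability over the 168 hours of the week.

    Hypothetical helper: bin index = weekday * 24 + hour, with Monday = 0.
    Each bin receives the fraction of the [start, end) span that falls in it.
    A zero-length or inverted span puts all weight on the start hour.
    """
    weights = [0.0] * 168
    total = (end - start).total_seconds()
    if total <= 0:
        weights[start.weekday() * 24 + start.hour] = 1.0
        return weights

    cursor = start
    while cursor < end:
        # End of the clock hour that contains `cursor`.
        next_hour = cursor.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
        segment_end = min(next_hour, end)
        share = (segment_end - cursor).total_seconds() / total
        weights[cursor.weekday() * 24 + cursor.hour] += share
        cursor = segment_end
    return weights


# An event known only to have happened between Monday 22:30 and Tuesday 00:30.
w = aoristic_weights(datetime(2024, 1, 1, 22, 30), datetime(2024, 1, 2, 0, 30))
print(w[22], w[23], w[24])
```

Summing such weight vectors over many events gives an hour-of-week profile in which precisely timed events count fully toward one hour and long, uncertain spans are spread thinly across many.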

Maintained by Jerry Ratcliffe. Last updated 2 years ago.

0.5 match 7 stars 3.54 score 9 scripts

bsandel

painter:Creation and Manipulation of Color Palettes

Functions for creating color palettes, visualizing palettes, modifying colors, and assigning colors for plotting.

Maintained by Brody Sandel. Last updated 7 years ago.

1.8 match 1.00 score 9 scripts

cran

IALS:Iterative Alternating Least Square Estimation for Large-Dimensional Matrix Factor Model

The matrix factor model has drawn growing attention for its advantage in achieving two-directional dimension reduction simultaneously for matrix-structured observations. In contrast to the Principal Component Analysis (PCA)-based methods, we propose a simple Iterative Alternating Least Squares (IALS) algorithm for matrix factor model, see the details in He et al. (2023) <arXiv:2301.00360>.

Maintained by Ran Zhao. Last updated 1 years ago.

1.8 match 1.00 score

cran

BalanceCheck:Balance Check for Multiple Covariates in Matched Observational Studies

Two practical tests are provided for assessing whether multiple covariates in a treatment group and a matched control group are balanced in observational studies.

Maintained by Hao Chen. Last updated 6 years ago.

1.8 match 1 stars 1.00 score

gabrielslpires

ggdaynight:Add Day/Night Patterns to 'ggplot2' Plots

It provides a custom 'ggplot2' geom to add day/night patterns to plots. It visually distinguishes daytime and nighttime periods. It is useful for visualizing data that spans multiple days and for highlighting diurnal patterns.

Maintained by Gabriel S. Pires. Last updated 10 months ago.

0.5 match 2 stars 3.30 score 2 scripts

bioc

SICtools:Find SNV/Indel differences between two bam files with near relationship

This package finds SNV/Indel differences between two closely related bam files by pairwise comparison through each base position across the genome region of interest. The difference is inferred by Fisher test and Euclidean distance, the input of which is the base count (A,T,G,C) in a given position and read counts for indels that span no less than 2bp on both sides of indel region.

Maintained by Xiaobin Xing. Last updated 5 months ago.

alignment sequencing coverage sequencematching qualitycontrol dataimport software snp variantdetection

0.5 match 3.30 score 1 scripts

zh2395

KPC:Kernel Partial Correlation Coefficient

Implementations of two empirical versions of the kernel partial correlation (KPC) coefficient and the associated variable selection algorithms. KPC is a measure of the strength of conditional association between Y and Z given X, with X, Y, Z being random variables taking values in general topological spaces. As the name suggests, KPC is defined in terms of kernels on reproducing kernel Hilbert spaces (RKHSs). The population KPC is a deterministic number between 0 and 1; it is 0 if and only if Y is conditionally independent of Z given X, and it is 1 if and only if Y is a measurable function of Z and X. One empirical KPC estimator is based on geometric graphs, such as K-nearest neighbor graphs and minimum spanning trees, and is consistent under very weak conditions. The other empirical estimator, defined using conditional mean embeddings (CMEs) as used in the RKHS literature, is also consistent under suitable conditions. Using KPC, a stepwise forward variable selection algorithm KFOCI (using the graph based estimator of KPC) is provided, as well as a similar stepwise forward selection algorithm based on the RKHS based estimator. For more details on KPC, its empirical estimators and its application on variable selection, see Huang, Z., N. Deb, and B. Sen (2022). “Kernel partial correlation coefficient – a measure of conditional dependence” (URL listed below). When X is empty, KPC measures the unconditional dependence between Y and Z, which has been described in Deb, N., P. Ghosal, and B. Sen (2020), “Measuring association on topological spaces using kernels and geometric graphs” <arXiv:2010.01768>, and it is implemented in the functions KMAc() and Klin() in this package. The latter can be computed in near linear time.

Maintained by Zhen Huang. Last updated 1 years ago.

0.5 match 4 stars 3.30 score 6 scripts

gatesdupont

StateLevelForest:Historical State-Level Forest Cover Data in the United States

Provides a unique dataset of historical forest cover across all states in the United States, spanning from 1907 to 2017, along with 1630 as a reference year. This dataset is important for understanding environmental changes and land use trends over time. It includes functionality for easy access of the data.

Maintained by Gates Dupont. Last updated 1 years ago.

0.5 match 3.00 score 3 scripts

zh2395

KMD:Kernel Measure of Multi-Sample Dissimilarity

Implementations of the kernel measure of multi-sample dissimilarity (KMD) between several samples using K-nearest neighbor graphs and minimum spanning trees. The KMD measures the dissimilarity between multiple samples, based on the observations from them. It converges to the population quantity (depending on the kernel) which is between 0 and 1. A small value indicates the multiple samples are from the same distribution, and a large value indicates the corresponding distributions are different. The population quantity is 0 if and only if all distributions are the same, and 1 if and only if all distributions are mutually singular. The package also implements the tests based on KMD for H0: the M distributions are equal against H1: not all the distributions are equal. Both permutation test and asymptotic test are available. These tests are consistent against all alternatives where at least two samples have different distributions. For more details on KMD and the associated tests, see Huang, Z. and B. Sen (2022) <arXiv:2210.00634>.

Maintained by Zhen Huang. Last updated 2 years ago.

0.5 match 1 stars 2.70 score 2 scripts

alighanbari26

GPRMortality:Gaussian Process Regression for Mortality Rates

A Bayesian statistical model for estimating child (under-five age group) and adult (15-60 age group) mortality. The main challenge is how to combine and integrate these different time series and how to produce unified estimates of mortality rates during a specified time span. Gaussian Process Regression (GPR) is a Bayesian statistical model for estimating child and adult mortality rates whose data likelihood is built from mortality rates from different data sources, such as the Death Registration System (DRS), censuses, or surveys. There are also various hyper-parameters for completeness of DRS, mean, covariance functions and variances as priors. This function produces estimates and uncertainty (95% or any desired percentiles) based on sampling and non-sampling errors due to variation in data sources. The GP model utilizes Bayesian inference to update predicted mortality rates as a posterior in Bayes rule by combining data and a prior probability distribution over parameters in mean, covariance function, and the regression model. This package uses Markov Chain Monte Carlo (MCMC) to sample from the posterior probability distribution via the 'rstan' package in R. Details are given in Wang H, Dwyer-Lindgren L, Lofgren KT, et al. (2012) <doi:10.1016/S0140-6736(12)61719-X>, Wang H, Liddell CA, Coates MM, et al. (2014) <doi:10.1016/S0140-6736(14)60497-9> and Mohammadi, Parsaeian, Mehdipour et al. (2017) <doi:10.1016/S2214-109X(17)30105-5>.

Maintained by Ali Ghanbari. Last updated 4 years ago.

0.5 match 2.70 score 7 scripts

amlinz

OTUtable:North Temperate Lakes - Microbial Observatory 16S Time Series Data and Functions

Analyses of OTU tables produced by 16S rRNA gene amplicon sequencing, as well as example data. It contains the data and scripts used in the paper Linz, et al. (2017) "Bacterial community composition and dynamics spanning five years in freshwater bog lakes," <doi:10.1128/mSphere.00169-17>.

Maintained by Alexandra Linz. Last updated 7 years ago.

0.5 match 2.20 score 53 scripts

surajitstat

Modalclust:Hierarchical Modal Clustering

Performs Modal Clustering (MAC) including Hierarchical Modal Clustering (HMAC) along with their parallel implementation (PHMAC) over several processors. These model-based non-parametric clustering techniques can extract clusters in very high dimensions with arbitrary density shapes. By default, clustering is performed over several resolutions and the results are summarised as a hierarchical tree. Associated plot functions are also provided. There is a package vignette that provides many examples. This version adheres to the CRAN policy of not spawning more than two child processes by default.

Maintained by Surajit Ray. Last updated 6 years ago.

0.5 match 2.08 score 12 scripts

cran

CMHSU:Mental Health Status, Substance Use Status and their Concurrent Status in North American Healthcare Administrative Databases

Patients' Mental Health (MH) status, Substance Use (SU) status, and concurrent MH/SU status in the American/Canadian Healthcare Administrative Databases can be identified. The detection is based on parameters of interest specified by clinicians, including the list of plausible ICD MH/SU codes (3/4/5 characters), the required number of hospital visits for MH/SU, the required number of physician service visits for MH/SU, and the maximum time span within MH visits, within SU visits, and between MH and SU visits. Methods are described in: Khan S <https://pubmed.ncbi.nlm.nih.gov/29044442/>, Keen C, et al. (2021) <doi:10.1111/add.15580>, Lavergne MR, et al. (2022) <doi:10.1186/s12913-022-07759-z>, Casillas, S M, et al. (2022) <doi:10.1016/j.abrep.2022.100464>, CIHI (2022) <https://www.cihi.ca/en>, CDC (2024) <https://www.cdc.gov>, WHO (2019) <https://icd.who.int/en>.

Maintained by Chel Hee Lee. Last updated 3 months ago.

0.5 match 2.00 score