R-universe search: preprint

ropensci

medrxivr:Access and Search MedRxiv and BioRxiv Preprint Data

Preprints, preliminary versions of research articles that have yet to undergo peer review, are an increasingly important source of health-related bibliographic content. The two preprint repositories most relevant to health-related sciences are medRxiv <https://www.medrxiv.org/> and bioRxiv <https://www.biorxiv.org/>, both of which are operated by the Cold Spring Harbor Laboratory. 'medrxivr' provides programmatic access to the 'Cold Spring Harbor Laboratory (CSHL)' API <https://api.biorxiv.org/>, allowing users to easily download medRxiv and bioRxiv preprint metadata (e.g. title, abstract, publication date, author list, etc.) into R. 'medrxivr' also provides functions to search the downloaded preprint records using regular expressions and Boolean logic, as well as helper functions that allow users to export their search results to a .BIB file for easy import to a reference manager and to download the full-text PDFs of preprints matching their search criteria.

Maintained by Yaoxiang Li. Last updated 1 month ago.

bibliographic-database biorxiv evidence-synthesis medrxiv-data peer-reviewed preprint-records systematic-reviews

16.9 match 56 stars 7.17 score 44 scripts
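
The idea of searching downloaded preprint records with regular expressions and Boolean logic, as the 'medrxivr' description mentions, can be sketched in a few lines. This is a minimal illustration in Python, not the package's API: the `search` function, the record fields, and the toy records are all hypothetical, and the convention used here (OR within a topic, AND across topics) is an assumption made for the example.

```python
import re

# Hypothetical toy records standing in for downloaded preprint metadata.
RECORDS = [
    {"title": "Dementia risk and statin use",
     "abstract": "A cohort study of lipid-lowering drugs."},
    {"title": "Vaccine uptake in adults",
     "abstract": "Survey data on influenza vaccination."},
    {"title": "Statins and mortality",
     "abstract": "No cognitive outcomes were measured."},
]


def search(records, topics, fields=("title", "abstract")):
    """Return records that match every topic.

    Each topic is a list of regex patterns; a topic matches a record
    when any one of its patterns is found (OR within a topic, AND
    across topics). Matching is case-insensitive.
    """
    compiled = [[re.compile(p, re.IGNORECASE) for p in topic] for topic in topics]
    hits = []
    for rec in records:
        # Search the concatenation of the chosen text fields.
        text = " ".join(rec.get(f, "") for f in fields)
        if all(any(rx.search(text) for rx in topic) for topic in compiled):
            hits.append(rec)
    return hits


# (dementia OR cognit...) AND (statin(s) OR lipid)
matches = search(RECORDS, [[r"dementia", r"cognit"], [r"statins?\b", r"lipid"]])
titles = [m["title"] for m in matches]
```

Note that the third toy record matches even though it only says cognitive outcomes were not measured, which is why regex-based search results usually still need manual screening.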

arnaudgallou

plume:A Simple Author Handler for Scientific Writing

Handles and formats author information in scientific writing in 'R Markdown' and 'Quarto'. 'plume' provides easy-to-use and flexible tools for injecting author metadata in 'YAML' headers as well as generating author and contribution lists (among others) as strings from tabular data.

Maintained by Arnaud Gallou. Last updated 30 days ago.

authors contribution contributions list lists markdown paper preprint quarto role roles

11.0 match 21 stars 6.84 score 15 scripts

lazappi

doilinker:Link Preprints And Publications By DOI

Links preprints to publications using the method described in Cabanac G, Oikonomidi T, Boutron I. "Day-to-day discovery of preprint-publication links". Scientometrics. 2021;1–20. DOI: 10.1007/s11192-021-03900-7.

Maintained by Luke Zappia. Last updated 1 year ago.

doi preprint publication

17.2 match 5 stars 3.40 score 3 scripts

stephenturner

biorecap:Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama

Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama.

Maintained by Stephen Turner. Last updated 6 months ago.

12.8 match 64 stars 4.20 score 5 scripts

paolomaranzano

SCDA:Spatially-Clustered Data Analysis

Contains functions for statistical data analysis based on spatially-clustered techniques. The package allows estimating the spatially-clustered spatial regression models presented in Cerqueti, Maranzano & Mattera (2024), "Spatially-clustered spatial autoregressive models with application to agricultural market concentration in Europe", arXiv preprint 2407.15874 <doi:10.48550/arXiv.2407.15874>. Specifically, the current release allows the estimation of the spatially-clustered linear regression model (SCLM), the spatially-clustered spatial autoregressive model (SCSAR), the spatially-clustered spatial Durbin model (SCSEM), and the spatially-clustered linear regression model with spatially-lagged exogenous covariates (SCSLX). From release 0.0.2, the library contains functions to estimate spatial clustering based on Adjacent Matrix K-Means (AMKM) as described in Zhou, Liu & Zhu (2019), "Weighted adjacent matrix for K-means clustering", Multimedia Tools and Applications, 78 (23) <doi:10.1007/s11042-019-08009-x>.

Maintained by Paolo Maranzano. Last updated 5 months ago.

9.8 match 1.79 score 31 scripts

alexanderhenzi

isodistrreg:Isotonic Distributional Regression (IDR)

Distributional regression under stochastic order restrictions for numeric and binary response variables and partially ordered covariates. See Henzi, Ziegel, Gneiting (2021) <doi:10.1111/rssb.12450>.

Maintained by Alexander Henzi. Last updated 1 year ago.

cpp

3.3 match 15 stars 5.20 score 21 scripts

rfastofficial

Rfast:A Collection of Efficient and Extremely Fast R Functions

A collection of fast (utility) functions for data analysis. Column- and row-wise means, medians, variances, minimums, maximums, many t, F and G-square tests, many regressions (normal, logistic, Poisson) are some of the many fast functions. References: a) Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>. b) Tsagris M. and Papadakis M. (2018). Forward regression in R: from the extreme slow to the extreme fast. Journal of Data Science, 16(4): 771--780. <doi:10.6339/JDS.201810_16(4).00006>. c) Chatzipantsiou C., Dimitriadis M., Papadakis M. and Tsagris M. (2020). Extremely Efficient Permutation and Bootstrap Hypothesis Tests Using R. Journal of Modern Applied Statistical Methods, 18(2), eP2898. <doi:10.48550/arXiv.1806.10947>. d) Tsagris M., Papadakis M., Alenazi A. and Alzeley O. (2024). Computationally Efficient Outlier Detection for High-Dimensional Data Using the MDP Algorithm. Computation, 12(9): 185. <doi:10.3390/computation12090185>. e) Tsagris M. and Papadakis M. (2025). Fast and light-weight energy statistics using the R package Rfast. <doi:10.48550/arXiv.2501.02849>.

Maintained by Manos Papadakis. Last updated 17 days ago.

openblas cpp openmp

0.8 match 147 stars 12.54 score 1.2k scripts 166 dependents

rfastofficial

Rfast2:A Collection of Efficient and Extremely Fast R Functions II

A collection of fast statistical and utility functions for data analysis. Functions for regression, maximum likelihood, column-wise statistics and many more have been included. C++ has been utilized to speed up the functions. References: Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>.

Maintained by Manos Papadakis. Last updated 1 year ago.

openblas cpp openmp

0.8 match 38 stars 8.09 score 75 scripts 26 dependents

cran

psyverse:Decentralized Unequivocality in Psychological Science

The constructs used to study human psychology have many definitions and corresponding instructions for eliciting and coding qualitative data pertaining to constructs' content and for measuring the constructs. This plethora of definitions and instructions necessitates unequivocal reference to specific definitions and instructions in empirical and secondary research. This package implements a human- and machine-readable standard for specifying construct definitions and instructions for measurement and qualitative research based on 'YAML'. This standard facilitates systematic unequivocal reference to specific construct definitions and corresponding instructions in a decentralized manner (i.e. without requiring central curation; Peters (2020) <doi:10.31234/osf.io/xebhn>).

Maintained by Gjalt-Jorn Peters. Last updated 2 years ago.

2.2 match 2.70 score

pascalkieslich

mousetrap:Process and Analyze Mouse-Tracking Data

Mouse-tracking, the analysis of mouse movements in computerized experiments, is a method that is becoming increasingly popular in the cognitive sciences. The mousetrap package offers functions for importing, preprocessing, analyzing, aggregating, and visualizing mouse-tracking data. An introduction to mouse-tracking analyses using mousetrap can be found in Wulff, Kieslich, Henninger, Haslbeck, & Schulte-Mecklenbeck (2023) <doi:10.31234/osf.io/v685r> (preprint: <https://osf.io/preprints/psyarxiv/v685r>).

Maintained by Pascal J. Kieslich. Last updated 1 year ago.

analysis clustering mouse-tracking visualization cpp

0.8 match 46 stars 6.68 score 124 scripts

ropensci

aRxiv:Interface to the arXiv API

An interface to the API for 'arXiv', a repository of electronic preprints for computer science, mathematics, physics, quantitative biology, quantitative finance, and statistics.

Maintained by Karl Broman. Last updated 1 year ago.

arxiv arxiv-analytics arxiv-api arxiv-org

0.5 match 63 stars 6.97 score 74 scripts

frbcesab

rcompendium:Create a Package or Research Compendium Structure

Makes it easier to create an R package or research compendium (i.e. a predefined files/folders structure) so that users can focus on the code/analysis instead of wasting time organizing files. A full ready-to-work structure is set up with some additional features: version control, remote repository creation, CI/CD configuration (check package integrity under several OS, test code with 'testthat', and build and deploy website using 'pkgdown'). This package heavily relies on the R packages 'devtools' and 'usethis' and follows recommendations made by Wickham H. (2015) <ISBN:9781491910597> and Marwick B. et al. (2018) <doi:10.7287/peerj.preprints.3192v2>.

Maintained by Nicolas Casajus. Last updated 1 month ago.

reproducible-research research-compendium

0.5 match 40 stars 6.72 score 22 scripts

comp-cogneuro-lang

LexFindR:Find Related Items and Lexical Dimensions in a Lexicon

Implements code to identify lexical competitors in a given list of words. We include many of the standard competitor types used in spoken word recognition research, such as functions to find cohorts, neighbors, and rhymes, amongst many others. The package includes documentation for using a variety of lexicon files, including those with form codes made up of multiple letters (i.e., phoneme codes) and also basic orthographies. Importantly, the code makes use of multiple CPU cores and vectorization when possible, making it extremely fast and able to handle large lexicons. Additionally, the package contains documentation for users to easily write new functions, allowing researchers to examine other relationships within a lexicon. Preprint: <https://osf.io/preprints/psyarxiv/8dyru/>. Open access: <doi:10.3758/s13428-021-01667-6>. Citation: Li, Z., Crinnion, A.M. & Magnuson, J.S. (2021). <doi:10.3758/s13428-021-01667-6>.

Maintained by ZhaoBin Li. Last updated 9 months ago.

0.8 match 4 stars 4.30 score 5 scripts

nsaph-software

CRE:Interpretable Discovery and Inference of Heterogeneous Treatment Effects

Provides a new method for interpretable heterogeneous treatment effects characterization in terms of decision rules via an extensive exploration of heterogeneity patterns by an ensemble-of-trees approach, enforcing high stability in the discovery. It relies on a two-stage pseudo-outcome regression, and it is supported by theoretical convergence guarantees. Bargagli-Stoffi, F. J., Cadei, R., Lee, K., & Dominici, F. (2023) Causal rule ensemble: Interpretable Discovery and Inference of Heterogeneous Treatment Effects. arXiv preprint <doi:10.48550/arXiv.2009.09036>.

Maintained by Falco Joannes Bargagli Stoffi. Last updated 5 months ago.

0.5 match 13 stars 6.41 score 11 scripts

nsaph-software

GPCERF:Gaussian Processes for Estimating Causal Exposure Response Curves

Provides a non-parametric Bayesian framework based on Gaussian process priors for estimating causal effects of a continuous exposure and detecting change points in the causal exposure response curves using observational data. Ren, B., Wu, X., Braun, D., Pillai, N., & Dominici, F.(2021). "Bayesian modeling for exposure response curve via gaussian processes: Causal effects of exposure to air pollution on health outcomes." arXiv preprint <doi:10.48550/arXiv.2105.03454>.

Maintained by Boyu Ren. Last updated 11 months ago.

cpp

0.5 match 9 stars 6.33 score 16 scripts

shaunpwilkinson

insect:Informatic Sequence Classification Trees

Provides tools for probabilistic taxon assignment with informatic sequence classification trees. See Wilkinson et al (2018) <doi:10.7287/peerj.preprints.26812v1>.

Maintained by Shaun Wilkinson. Last updated 4 years ago.

0.5 match 14 stars 5.80 score 91 scripts

brandmaier

reproducibleRchunks:Automated Reproducibility Checks for R Markdown Documents

Provides reproducible R chunks in R Markdown documents that automatically check computational results for reproducibility. This is achieved by creating JSON files storing metadata about computational results. A comprehensive tutorial to the package is available as a preprint by Brandmaier & Peikert (2024, <doi:10.31234/osf.io/3zjvf>).

Maintained by Andreas M. Brandmaier. Last updated 17 days ago.

reproducibility

0.5 match 25 stars 5.55 score 11 scripts

sehellmann

dynConfiR:Dynamic Models for Confidence and Response Time Distributions

Provides density functions for the joint distribution of choice, response time and confidence for discrete confidence judgments as well as functions for parameter fitting, prediction and simulation for various dynamical models of decision confidence. All models are explained in detail by Hellmann et al. (2023; Preprint available at <https://osf.io/9jfqr/>, published version: <doi:10.1037/rev0000411>). Implemented models are the dynaViTE model, dynWEV model, the 2DSD model (Pleskac & Busemeyer, 2010, <doi:10.1037/a0019737>), and various race models. C++ code for dynWEV and 2DSD is based on the 'rtdists' package by Henrik Singmann.

Maintained by Sebastian Hellmann. Last updated 17 hours ago.

cpp

0.5 match 3 stars 5.47 score 18 scripts

tslumley

DHBins:Hexmaps for NZ District Health Boards

Draws stylized choropleth maps -- hexagonal maps and triangular multiclass hex maps -- for New Zealand District Health Boards and Regional Council areas. These allow faceted, coloured displays of quantitative information for comparison across District Health Boards or Regional Councils. The preprint Lumley (2019) <arXiv:1912.04435> is based on the methods in this package.

Maintained by Thomas Lumley. Last updated 3 years ago.

0.5 match 3 stars 4.59 score 13 scripts

greifflab

immuneSIM:Tunable Simulation of B- And T-Cell Receptor Repertoires

Simulate full B-cell and T-cell receptor repertoires using an in silico recombination process that includes a wide variety of tunable parameters to introduce noise and biases. Additional post-simulation modification functions allow the user to implant motifs or codon biases as well as remodeling sequence similarity architecture. The output repertoires contain records of all relevant repertoire dimensions and can be analyzed using provided repertoire analysis functions. Preprint is available at bioRxiv (Weber et al., 2019 <doi:10.1101/759795>).

Maintained by Cédric R. Weber. Last updated 1 year ago.

0.5 match 37 stars 4.44 score 15 scripts

bioc

DepInfeR:Inferring tumor-specific cancer dependencies through integrating ex-vivo drug response assays and drug-protein profiling

DepInfeR integrates two experimentally accessible input data matrices: the drug sensitivity profiles of cancer cell lines or primary tumors ex-vivo (X), and the drug affinities of a set of proteins (Y), to infer a matrix of molecular protein dependencies of the cancers (ß). DepInfeR deconvolutes the protein inhibition effect on the viability phenotype by using regularized multivariate linear regression. It assigns a “dependence coefficient” to each protein and each sample, and therefore could be used to gain a causal and accurate understanding of functional consequences of genomic aberrations in a heterogeneous disease, as well as to guide the choice of pharmacological intervention for a specific cancer type, sub-type, or an individual patient. For more information, please read our preprint on bioRxiv: https://doi.org/10.1101/2022.01.11.475864.

Maintained by Junyan Lu. Last updated 5 months ago.

software regression pharmacogenetics pharmacogenomics functionalgenomics

0.5 match 1 star 4.36 score 23 scripts

collinerickson

CGGP:Composite Grid Gaussian Processes

Run computer experiments using the adaptive composite grid algorithm with a Gaussian process model. The algorithm works best when running an experiment that can evaluate thousands of points from a deterministic computer simulation. This package is an implementation of a forthcoming paper by Plumlee, Erickson, Ankenman, et al. For a preprint of the paper, contact the maintainer of this package.

Maintained by Collin Erickson. Last updated 1 year ago.

cpp

0.5 match 2 stars 4.08 score 12 scripts

rtgodwin

oneinfl:Estimates OIPP and OIZTNB Regression Models

Estimates one-inflated positive Poisson (OIPP) and one-inflated zero-truncated negative binomial (OIZTNB) regression models. A suite of ancillary statistical tools is also provided, including: estimation of positive Poisson (PP) and zero-truncated negative binomial (ZTNB) models; marginal effects and their standard errors; diagnostic likelihood ratio and Wald tests; plotting; predicted counts and expected responses; and random variate generation. The models and tools, as well as four applications, are shown in Godwin, R. T. (2024). "One-inflated zero-truncated count regression models" arXiv preprint <arXiv:2402.02272>.

Maintained by Ryan T. Godwin. Last updated 2 months ago.

0.5 match 2 stars 3.78 score

ncsoft

promotionImpact:Analysis & Measurement of Promotion Effectiveness

Analysis and measurement of promotion effectiveness on a given target variable (e.g. daily sales). After converting promotion schedule into dummy or smoothed predictor variables, the package estimates the effects of these variables controlled for trend/periodicity/structural change using prophet by Taylor and Letham (2017) <doi:10.7287/peerj.preprints.3190v2> and some prespecified variables (e.g. start of a month).

Maintained by Nahyun Kim. Last updated 5 years ago.

0.5 match 47 stars 3.67 score 2 scripts

wraff

wrProteo:Proteomics Data Analysis Functions

Data analysis of proteomics experiments by mass spectrometry is supported by this collection of functions mostly dedicated to the analysis of (bottom-up) quantitative (XIC) data. Fasta-formatted proteomes (eg from UniProt Consortium <doi:10.1093/nar/gky1049>) can be read with automatic parsing and multiple annotation types (like species origin, abbreviated gene names, etc) extracted. Initial results from multiple software for protein (and peptide) quantitation can be imported (to a common format): MaxQuant (Tyanova et al 2016 <doi:10.1038/nprot.2016.136>), Dia-NN (Demichev et al 2020 <doi:10.1038/s41592-019-0638-x>), Fragpipe (da Veiga et al 2020 <doi:10.1038/s41592-020-0912-y>), ionbot (Degroeve et al 2021 <doi:10.1101/2021.07.02.450686>), MassChroq (Valot et al 2011 <doi:10.1002/pmic.201100120>), OpenMS (Strauss et al 2021 <doi:10.1038/nmeth.3959>), ProteomeDiscoverer (Orsburn 2021 <doi:10.3390/proteomes9010015>), Proline (Bouyssie et al 2020 <doi:10.1093/bioinformatics/btaa118>), AlphaPept (preprint Strauss et al <doi:10.1101/2021.07.23.453379>) and Wombat-P (Bouyssie et al 2023 <doi:10.1021/acs.jproteome.3c00636>). Meta-data provided by initial analysis software and/or in sdrf format can be integrated to the analysis. Quantitative proteomics measurements frequently contain multiple NA values, due to physical absence of given peptides in some samples, limitations in sensitivity or other reasons. Help is provided to inspect the data graphically to investigate the nature of NA-values via their respective replicate measurements and to help/confirm the choice of NA-replacement algorithms. Meta-data in sdrf-format (Perez-Riverol et al 2020 <doi:10.1021/acs.jproteome.0c00376>) or similar tabular formats can be imported and included. Missing values can be inspected and imputed based on the concept of NA-neighbours or other methods. Dedicated filtering and statistical testing using the framework of package 'limma' <doi:10.18129/B9.bioc.limma> can be run, enhanced by multiple rounds of NA-replacements to provide robustness towards rare stochastic events. Multi-species samples, as frequently used in benchmark-tests (eg Navarro et al 2016 <doi:10.1038/nbt.3685>, Ramus et al 2016 <doi:10.1016/j.jprot.2015.11.011>), can be run with special options considering such sub-groups during normalization and testing. Subsequently, ROC curves (Hand and Till 2001 <doi:10.1023/A:1010920819831>) can be constructed to compare multiple analysis approaches. As a detailed example, the data-set from Ramus et al 2016 <doi:10.1016/j.jprot.2015.11.011> quantified by MaxQuant, ProteomeDiscoverer, and Proline is provided with a detailed analysis of heterologous spike-in proteins.

Maintained by Wolfgang Raffelsberger. Last updated 4 months ago.

0.5 match 3.67 score 17 scripts 1 dependent

tobiaskley

forecastSNSTS:Forecasting for Stationary and Non-Stationary Time Series

Methods to compute linear h-step ahead prediction coefficients based on localised and iterated Yule-Walker estimates and empirical mean squared and absolute prediction errors for the resulting predictors. Also, functions to compute autocovariances for AR(p) processes, to simulate tvARMA(p,q) time series, and to verify an assumption from Kley et al. (2017), Preprint <http://personal.lse.ac.uk/kley/forecastSNSTS.pdf>.

Maintained by Tobias Kley. Last updated 7 years ago.

cpp

0.5 match 5 stars 3.40 score 9 scripts

lukaswallrich

rsprite2:Identify Distributions that Match Reported Sample Parameters (SPRITE)

The SPRITE algorithm creates possible distributions of discrete responses based on reported sample parameters, such as mean, standard deviation and range (Heathers et al., 2018, <doi:10.7287/peerj.preprints.26968v1>). This package implements it, drawing heavily on the code for Nick Brown's 'rSPRITE' Shiny app <https://shiny.ieis.tue.nl/sprite/>. In addition, it supports the modeling of distributions based on multi-item (Likert-type) scales and the use of restrictions on the frequency of particular responses.

Maintained by Lukas Wallrich. Last updated 1 year ago.

0.5 match 2 stars 3.30 score 10 scripts
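
The problem that 'rsprite2' addresses, finding discrete response distributions consistent with a reported mean, standard deviation and range, can be illustrated with a brute-force search over small samples. This is a Python sketch under stated assumptions, not the SPRITE algorithm and not the package's API: it enumerates every candidate sample exhaustively, which is only feasible for small `n`, and the function name `matching_samples` and its parameters are hypothetical.

```python
from itertools import combinations_with_replacement
from statistics import mean, stdev


def matching_samples(n, target_mean, target_sd, low, high, digits=2):
    """Brute-force every multiset of n integer responses in [low, high]
    and keep those whose rounded mean and rounded sample SD equal the
    reported values (also rounded to `digits` decimals).

    Exhaustive enumeration, so only practical for small n; requires
    n >= 2 because the sample SD is undefined for a single value.
    """
    found = []
    for sample in combinations_with_replacement(range(low, high + 1), n):
        mean_ok = round(mean(sample), digits) == round(target_mean, digits)
        sd_ok = round(stdev(sample), digits) == round(target_sd, digits)
        if mean_ok and sd_ok:
            found.append(list(sample))
    return found


# Reported: n = 5, mean = 3.00, SD = 1.58 on a 1-5 scale.
solutions = matching_samples(n=5, target_mean=3.00, target_sd=1.58, low=1, high=5)
```

For these inputs the only matching sample is `[1, 2, 3, 4, 5]`. An empty result would mean that no distribution of integer responses on that scale is consistent with the reported statistics.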

jmbh

fspe:Estimating the Number of Factors in EFA with Out-of-Sample Prediction Errors

Estimating the number of factors in Exploratory Factor Analysis (EFA) with out-of-sample prediction errors using a cross-validation scheme. Haslbeck & van Bork (Preprint) <https://psyarxiv.com/qktsd>.

Maintained by Jonas Haslbeck. Last updated 2 years ago.

0.5 match 1 star 3.18 score 2 scripts 1 dependent

cran

prevtoinc:Prevalence to Incidence Calculations for Point-Prevalence Studies in a Nosocomial Setting

Functions to simulate point prevalence studies (PPSs) of healthcare-associated infections (HAIs) and to convert prevalence to incidence in steady state setups. Companion package to the preprint Willrich et al., From prevalence to incidence - a new approach in the hospital setting <doi:10.1101/554725>, where methods are explained in detail.

Maintained by Niklas Willrich. Last updated 6 years ago.

0.5 match 3.18 score 1 dependent

plambertuliege

ordgam:Additive Model for Ordinal Data using Laplace P-Splines

Additive proportional odds model for ordinal data using Laplace P-splines. The combination of Laplace approximations and P-splines enables fast and flexible inference in a Bayesian framework. Specific approximations are proposed to account for the asymmetry in the marginal posterior distributions of non-penalized parameters. For more details, see Lambert and Gressani (2023) <doi:10.1177/1471082X231181173> (preprint: <arXiv:2210.01668>).

Maintained by Philippe Lambert. Last updated 2 years ago.

0.5 match 3.02 score 21 scripts

migurke

GenoPop:Genotype Imputation and Population Genomics Efficiently from Variant Call Formatted (VCF) Files

Tools for efficient processing of large, whole genome genotype data sets in variant call format (VCF). It includes several functions to calculate commonly used population genomic metrics and a method for reference panel free genotype imputation, which is described in the preprint Gurke & Mayer (2024) <doi:10.22541/au.172515591.10119928/v1>.

Maintained by Marie Gurke. Last updated 4 months ago.

0.5 match 3.00 score 6 scripts

mondrus96

fabisearch:Change Point Detection in High-Dimensional Time Series Networks

Implementation of the Factorized Binary Search (FaBiSearch) methodology for the estimation of the number and the location of multiple change points in the network (or clustering) structure of multivariate high-dimensional time series. The method is motivated by the detection of change points in functional connectivity networks for functional magnetic resonance imaging (fMRI) data. FaBiSearch uses non-negative matrix factorization (NMF), an unsupervised dimension reduction technique, and a new binary search algorithm to identify multiple change points. It requires minimal assumptions. Lastly, we provide interactive, 3-dimensional, brain-specific network visualization capability in a flexible, stand-alone function. This function can be conveniently used with any node coordinate atlas, and nodes can be color coded according to community membership, if applicable. The output is an elegantly displayed network laid over a cortical surface, which can be rotated in the 3-dimensional space. The main routines of the package are detect.cps(), for multiple change point detection, est.net(), for estimating a network between stationary multivariate time series, net.3dplot(), for plotting the estimated functional connectivity networks, and opt.rank(), for finding the optimal rank in NMF for a given data set. The functions have been extensively tested on simulated multivariate high-dimensional time series data and fMRI data. For details on the FaBiSearch methodology, please see Ondrus et al. (2021) <arXiv:2103.06347>. For a more detailed explanation and applied examples of the fabisearch package, please see Ondrus and Cribben (2022), preprint.

Maintained by Martin Ondrus. Last updated 7 months ago.

0.5 match 1 star 3.00 score 2 scripts

thomasferte

PheVis:Automatic Phenotyping of Electronic Health Record at Visit Resolution

Using Electronic Health Records (EHRs) is difficult because most of the time the true characteristic of the patient is not available. Instead we can retrieve the International Classification of Disease code related to the disease of interest or we can count the occurrence of the Unified Medical Language System. None of them is the true phenotype, which needs chart review to identify. However, chart review is time consuming and costly. 'PheVis' is an algorithm that performs phenotyping (i.e., identifies a characteristic) at the visit level in an unsupervised fashion. It can be used for chronic or acute diseases. An example of how to use 'PheVis' is available in the vignette. Basically there are two functions that are to be used: `train_phevis()`, which trains the algorithm, and `test_phevis()`, which gets the predicted probabilities. The detailed method is described in the preprint by Ferté et al. (2020) <doi:10.1101/2020.06.15.20131458>.

Maintained by Thomas Ferte. Last updated 1 year ago.

cpp

0.5 match 1 star 2.70 score 3 scripts

fernandalschumacher

ARCensReg:Fitting Univariate Censored Linear Regression Model with Autoregressive Errors

It fits a univariate left, right, or interval censored linear regression model with autoregressive errors, considering the normal or the Student-t distribution for the innovations. It provides estimates and standard errors of the parameters, predicts future observations, and supports missing values on the dependent variable. References used for this package: Schumacher, F. L., Lachos, V. H., & Dey, D. K. (2017). Censored regression models with autoregressive errors: A likelihood-based perspective. Canadian Journal of Statistics, 45(4), 375-392 <doi:10.1002/cjs.11338>. Schumacher, F. L., Lachos, V. H., Vilca-Labra, F. E., & Castro, L. M. (2018). Influence diagnostics for censored regression models with autoregressive errors. Australian & New Zealand Journal of Statistics, 60(2), 209-229 <doi:10.1111/anzs.12229>. Valeriano, K. A., Schumacher, F. L., Galarza, C. E., & Matos, L. A. (2021). Censored autoregressive regression models with Student-t innovations. arXiv preprint <arXiv:2110.00224>.

Maintained by Fernanda L. Schumacher. Last updated 2 years ago.

openblas cpp openmp

0.5 match 1 star 2.70 score 9 scripts

chriscpritchard

PRISMA2020:Make Interactive 'PRISMA' Flow Diagrams

Systematic reviews should be described in a high degree of methodological detail. The 'PRISMA' Statement calls for a high level of reporting detail in systematic reviews and meta-analyses. An integral part of the methodological description of a review is a flow diagram. This package produces an interactive flow diagram that conforms to the 'PRISMA2020' preprint. When made interactive, the reader/user can click on each box and be directed to another website or file online (e.g. a detailed description of the screening methods, or a list of excluded full texts), with a mouse-over tool tip that describes the information linked to in more detail. Interactive versions can be saved as HTML files, whilst static versions for inclusion in manuscripts can be saved as HTML, PDF, PNG, SVG, PS or WEBP files.

Maintained by Chris Pritchard. Last updated 2 years ago.

0.5 match 1 star 2.46 score 29 scripts

michaelklein916

crso:Cancer Rule Set Optimization ('crso')

An algorithm for identifying candidate driver combinations in cancer. CRSO is based on a theoretical model of cancer in which a cancer rule is defined to be a collection of two or more events (i.e., alterations) that are minimally sufficient to cause cancer. A cancer rule set is a set of cancer rules that collectively are assumed to account for all of the ways to cause cancer in the population. In CRSO every event is designated explicitly as a passenger or driver within each patient. Each event is associated with a patient-specific, event-specific passenger penalty, reflecting how unlikely the event would have happened by chance, i.e., as a passenger. CRSO evaluates each rule set by assigning all samples to a rule in the rule set, or to the null rule, and then calculating the total statistical penalty from all unassigned events. CRSO uses a three-phase procedure to find the best rule set of fixed size K for a range of Ks. A core rule set is then identified from among the best rule sets of size K as the rule set that best balances rule set size and statistical penalty. Users should consult the 'crso' vignette for an example walk through of a full CRSO run. The full description of the CRSO algorithm is presented in: Klein MI, Cannataro V, Townsend J, Stern DF and Zhao H. "Identifying combinations of cancer driver in individual patients." BioRxiv 674234 [Preprint]. June 19, 2019. <doi:10.1101/674234>. Please cite this article if you use 'crso'.

Maintained by Michael Klein. Last updated 6 years ago.

0.5 match 2.32 score 21 scripts

lmrodriguezr

enveomics.R:Various Utilities for Microbial Genomics and Metagenomics

A collection of functions for microbial ecology and other applications of genomics and metagenomics. Companion package for the Enveomics Collection (Rodriguez-R, L.M. and Konstantinidis, K.T., 2016 <DOI:10.7287/peerj.preprints.1900v1>).

Maintained by Luis M. Rodriguez-R. Last updated 1 month ago.

0.5 match 2.17 score 26 scripts

melinar

ZIprop:Permutations Tests and Performance Indicator for Zero-Inflated Proportions Response

Permutations tests to identify factors correlated with a zero-inflated proportions response. Provides a performance indicator based on Spearman correlation to quantify the part of correlation explained by the selected set of factors. For details of the method, see the following preprint: <https://hal.archives-ouvertes.fr/hal-02936779v3>.

Maintained by Melina Ribaud. Last updated 4 years ago.

0.5 match 2.08 score 12 scripts

cran

kko:Kernel Knockoffs Selection for Nonparametric Additive Models

A variable selection procedure, dubbed KKO, for nonparametric additive models with a finite-sample false discovery rate control guarantee. The method integrates three key components: knockoffs, subsampling for stability, and random feature mapping for nonparametric function approximation. For more information, see the accompanying paper: Dai, X., Lyu, X., & Li, L. (2021). “Kernel Knockoffs Selection for Nonparametric Additive Models”. arXiv preprint <arXiv:2105.11659>.

Maintained by Xiang Lyu. Last updated 3 years ago.

0.5 match 1 stars 2.00 score

jdgonzalezwork

tetrascatt:Acoustic Scattering for Complex Shapes by Using the DWBA

Uses the Distorted Wave Born Approximation (DWBA) to compute the acoustic backward scattering. The geometry of the object is represented by a volumetric mesh composed of tetrahedrons. The computation is done efficiently through an analytical 3D integration, which yields a solution expressed in terms of elementary functions for each tetrahedron. Note that this method is only valid for objects whose acoustic properties, such as density and sound speed, do not vary significantly compared to the surrounding medium. (See Lavia, Cascallares and Gonzalez, J. D. (2023). TetraScatt model: Born approximation for the estimation of acoustic dispersion of fluid-like objects of arbitrary geometries. arXiv preprint <arXiv:2312.16721>).

Maintained by Juan Domingo Gonzalez. Last updated 1 years ago.

openblas cpp

0.5 match 2.00 score 5 scripts

chrislloyd58

exact.n:Exact Samples Sizes and Inference for Clinical Trials with Binary Endpoint

Allows the user to determine minimum sample sizes that achieve target size and power at a specified alternative. For more information, see “Exact samples sizes for clinical trials subject to size and power constraints” by Lloyd, C.J. (2022) Preprint <doi:10.13140/RG.2.2.11828.94085>.

Maintained by Chris J. Lloyd. Last updated 1 years ago.

0.5 match 1.70 score

cran

GeoAdjust:Accounting for Random Displacements of True GPS Coordinates of Data

The purpose is to account for the random displacements (jittering) of true survey household cluster center coordinates in geostatistical analyses of Demographic and Health Surveys program (DHS) data. Adjustment for jittering can be implemented either in the spatial random effect, or in the raster/distance based covariates, or in both. Detailed information about the methods behind the package functionality can be found in two preprints. Umut Altay, John Paige, Andrea Riebler, Geir-Arne Fuglstad (2022) <arXiv:2202.11035v2>. Umut Altay, John Paige, Andrea Riebler, Geir-Arne Fuglstad (2022) <arXiv:2211.07442v1>.

Maintained by Umut Altay. Last updated 1 years ago.

cpp

0.5 match 1.70 score 1 scripts

vcerqueira

autoBagging:Learning to Rank Bagging Workflows with Metalearning

A framework for automated machine learning. Concretely, the focus is on the optimisation of bagging workflows. A bagging workflow is composed of three phases: (i) generation: which and how many predictive models to learn; (ii) pruning: after learning a set of models, the worst ones are cut off from the ensemble; and (iii) integration: how the models are combined for predicting a new observation. autoBagging optimises these processes by combining metalearning and a learning to rank approach to learn from metadata. It automatically ranks 63 bagging workflows by exploiting past performance and dataset characterization. A complete description of the method can be found in: Pinto, F., Cerqueira, V., Soares, C., Mendes-Moreira, J. (2017): "autoBagging: Learning to Rank Bagging Workflows with Metalearning" arXiv preprint arXiv:1706.09367.

Maintained by Vitor Cerqueira. Last updated 8 years ago.

0.5 match 1.70 score 9 scripts

junyuchen-econ

ablasso:Arellano-Bond LASSO Estimator for Dynamic Linear Panel Models

Implements the Arellano-Bond estimation method combined with LASSO for dynamic linear panel models. See Chernozhukov et al. (2024) "Arellano-Bond LASSO Estimator for Dynamic Linear Panel Models". arXiv preprint <doi:10.48550/arXiv.2402.00584>.

Maintained by Junyu Chen. Last updated 1 months ago.

0.5 match 1 stars 1.30 score 1 scripts

shchurch

countland:Analysis of Biological Count Data, Especially from Single-Cell RNA-Seq

A set of functions for applying a restricted linear algebra to the analysis of count-based data. See the accompanying preprint manuscript: "Normalizing need not be the norm: count-based math for analyzing single-cell data" Church et al (2022) <doi:10.1101/2022.06.01.494334>. This tool is specifically designed to analyze count matrices from single cell RNA sequencing assays. The tools implement several count-based approaches for standard steps in single-cell RNA-seq analysis, including scoring genes and cells, comparing cells and clustering, calculating differential gene expression, and several methods for rank reduction. There are many opportunities for further optimization that may prove useful in the analysis of other data. The source code is freely available at <https://github.com/shchurch/countland>, and users and developers are encouraged to fork the code for their own purposes.

Maintained by Church Samuel H. Last updated 1 years ago.

0.5 match 1.32 score 21 scripts

e-caron

slm:Stationary Linear Models

Provides statistical procedures for linear regression in the general context where the errors are assumed to be correlated. Different ways to estimate the asymptotic covariance matrix of the least squares estimators are available. Starting from this estimation of the covariance matrix, the confidence intervals and the usual tests on the parameters are modified. The functions of this package are very similar to those of 'lm': it contains methods such as summary(), plot(), confint() and predict(). The 'slm' package is described in the paper by E. Caron, J. Dedecker and B. Michel (2019), "Linear regression with stationary errors: the R package slm", arXiv preprint <arXiv:1906.06583>.

Maintained by Emmanuel Caron. Last updated 5 years ago.

0.5 match 1.28 score 19 scripts

gloewing

sMTL:Sparse Multi-Task Learning

Implements L0-constrained Multi-Task Learning and domain generalization algorithms. The algorithms are coded in Julia, allowing for fast implementations of the coordinate descent and local combinatorial search algorithms. For more details, see a preprint of the paper: Loewinger et al. (2022) <arXiv:2212.08697>.

Maintained by Gabriel Loewinger. Last updated 2 years ago.

0.5 match 1.00 score 8 scripts

cran

cspec:Complete Discrete Fourier Transform (DFT) and Periodogram

Calculate the predictive discrete Fourier transform, complete discrete Fourier transform, complete periodogram, and tapered complete periodogram. This algorithm is based on the preprint "Spectral methods for small sample time series: A complete periodogram approach" (2020) by Sourav Das, Suhasini Subba Rao, and Junho Yang.

Maintained by Junho Yang. Last updated 5 years ago.

0.5 match 1.00 score 1 scripts

cran

multiRDPG:Multiple Random Dot Product Graphs

Fits the Multiple Random Dot Product Graph Model and performs a test for whether two networks come from the same distribution. Both methods are proposed in Nielsen, A.M., Witten, D., (2018) "The Multiple Random Dot Product Graph Model", arXiv preprint <arXiv:1811.12172> (Submitted to Journal of Computational and Graphical Statistics).

Maintained by Agnes Martine Nielsen. Last updated 6 years ago.

0.5 match 1.00 score

cran

hdthreshold:Inference on Many Jumps in Nonparametric Panel Regression Models

Provides uniform testing procedures for existence and heterogeneity of threshold effects in high-dimensional nonparametric panel regression models. The package accompanies the paper Chen, Keilbar, Su and Wang (2023) "Inference on many jumps in nonparametric panel regression models". arXiv preprint <doi:10.48550/arXiv.2312.01162>.

Maintained by Georg Keilbar. Last updated 3 months ago.

0.5 match 1.00 score

rl1081

L2hdchange:L2 Inference for Change Points in High-Dimensional Time Series

Provides a method for detecting multiple change points in high-dimensional time series, targeting dense or spatially clustered signals. See Li et al. (2023) "L2 Inference for Change Points in High-Dimensional Time Series via a Two-Way MOSUM". arXiv preprint <arXiv:2208.13074>.

Maintained by Rui Lin. Last updated 2 years ago.

0.5 match 1 stars 1.00 score 1 scripts

cran

l1spectral:An L1-Version of the Spectral Clustering

Provides an l1-version of the spectral clustering algorithm devoted to robustly clustering highly perturbed graphs using an l1-penalty. This algorithm is described in more detail in the preprint C. Champion, M. Champion, M. Blazère, R. Burcelin and J.M. Loubes, "l1-spectral clustering algorithm: a spectral clustering method using l1-regularization" (2022).

Maintained by Magali Champion. Last updated 3 years ago.

openblas cpp

0.5 match 1.00 score

cran

latentgraph:Graphical Models with Latent Variables

Three methods are provided to estimate graphical models with latent variables: (1) Jin, Y., Ning, Y., and Tan, K. M. (2020) (preprint available); (2) Chandrasekaran, V., Parrilo, P. A. & Willsky, A. S. (2012) <doi:10.1214/11-AOS949>; (3) Tan, K. M., Ning, Y., Witten, D. M. & Liu, H. (2016) <doi:10.1093/biomet/asw050>.

Maintained by Yanxin Jin. Last updated 4 years ago.

openblas cpp

0.5 match 1.00 score

cran

neuromplex:Neural Multiplexing Analysis

Statistical methods for whole-trial and time-domain analysis of single cell neural response to multiple stimuli presented simultaneously. The package is based on the paper by C Glynn, ST Tokdar, A Zaman, VC Caruso, JT Mohl, SM Willett, and JM Groh (2021) "Analyzing second order stochasticity of neural spiking under stimuli-bundle exposure", which is in press at the Annals of Applied Statistics. A preprint may be found at <arXiv:1911.04387>.

Maintained by Surya Tokdar. Last updated 4 years ago.

0.5 match 1.00 score

cran

cotrend:Consistent Co-Trending Rank Selection

Implements the cointegration/co-trending rank selection algorithm of Guo and Shintani (2013) "Consistent co-trending rank selection when both stochastic and nonlinear deterministic trends are present". The Econometrics Journal 16: 473-483 <doi:10.1111/j.1368-423X.2012.00392.x>. Numbered examples correspond to the Feb 2011 preprint <http://www.fas.nus.edu.sg/ecs/events/seminar/seminar-papers/05Apr11.pdf>.

Maintained by A. Christian Silva. Last updated 5 years ago.

0.5 match 1.00 score 2 scripts

nkyangtzeqian

ACSSpack:ACSS, Corresponding ACSS, and GLP Algorithm

Allows the user to run the Adaptive Correlated Spike and Slab (ACSS) algorithm, the corresponding INdependent Spike and Slab (INSS) algorithm, and the Giannone, Lenza and Primiceri (GLP) algorithm with adaptive burn-in. All three algorithms are used to fit high-dimensional data sets with either a sparse structure or a dense structure with smaller contributions from all predictors. The state-of-the-art GLP algorithm is described in Giannone, D., Lenza, M., & Primiceri, G. E. (2021, ISBN:978-92-899-4542-4) "Economic predictions with big data: The illusion of sparsity". The two new algorithms, ACSS and INSS, and a discussion of their performance can be found in Yang, Z., Khare, K., & Michailidis, G. (2024, preprint) "Bayesian methodology for adaptive sparsity and shrinkage in regression".

Maintained by Ziqian Yang. Last updated 8 months ago.

openblas cpp openmp

0.5 match 1.00 score