R-universe search: string

tidyverse

stringr:Simple, Consistent Wrappers for Common String Operations

A consistent, simple and easy to use set of wrappers around the fantastic 'stringi' package. All function and argument names (and positions) are consistent, all functions deal with "NA"'s and zero length vectors in the same way, and the output from one function is easy to feed into the input of another.

Maintained by Hadley Wickham. Last updated 7 months ago.

regular-expression strings

68.3 match 622 stars 21.97 score 164k scripts 8.2k dependents

gagolews

stringi:Fast and Portable Character String Processing Facilities

A collection of character string/text/natural language processing tools for pattern searching (e.g., with 'Java'-like regular expressions or the 'Unicode' collation algorithm), random string generation, case mapping, string transliteration, concatenation, sorting, padding, wrapping, Unicode normalisation, date-time formatting and parsing, and many more. They are fast, consistent, convenient, and - thanks to 'ICU' (International Components for Unicode) - portable across all locales and platforms. Documentation about 'stringi' is provided via its website at <https://stringi.gagolewski.com/> and the paper by Gagolewski (2022, <doi:10.18637/jss.v103.i02>).

Maintained by Marek Gagolewski. Last updated 1 month ago.

icu icu4c natural-language-processing nlp regex regexp string-manipulation stringi stringr text text-processing tidy-data unicode cpp

67.3 match 309 stars 18.31 score 10k scripts 8.6k dependents

rpolars

polars:Lightning-Fast 'DataFrame' Library

Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.

Maintained by Soren Welling. Last updated 3 days ago.

arrow polars rust

75.3 match 499 stars 12.01 score 1.0k scripts 2 dependents

andrisignorell

DescTools:Tools for Descriptive Statistics

A collection of miscellaneous basic statistical functions and convenience wrappers for efficiently describing data. The author's intention was to create a toolbox that facilitates the (notoriously time-consuming) first descriptive tasks in data analysis: calculating descriptive statistics, drawing graphical summaries, and reporting the results. The package also contains functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel. Many of the included functions can be found scattered in other packages and other sources written partly by Titans of R. The reason for collecting them here was primarily to have them consolidated in ONE package instead of dozens (which themselves might depend on other packages that are not needed at all), and to provide a common and consistent interface as far as function and argument naming, NA handling, recycling rules, etc. are concerned. Google style guides were used as naming rules (in the absence of convincing alternatives). The 'BigCamelCase' style was consequently applied to functions borrowed from contributed R packages as well.

Maintained by Andri Signorell. Last updated 9 days ago.

fortran cpp

44.6 match 87 stars 16.68 score 7.7k scripts 99 dependents

tidyverse

glue:Interpreted String Literals

An implementation of interpreted string literals, inspired by Python's Literal String Interpolation <https://www.python.org/dev/peps/pep-0498/> and Docstrings <https://www.python.org/dev/peps/pep-0257/> and Julia's Triple-Quoted String Literals <https://docs.julialang.org/en/v1.3/manual/strings/#Triple-Quoted-String-Literals-1>.

Maintained by Jennifer Bryan. Last updated 5 months ago.

string-interpolation strings

29.8 match 729 stars 21.76 score 57k scripts 14k dependents

r-lib

cli:Helpers for Developing Command Line Interfaces

A suite of tools to build attractive command line interfaces ('CLIs'), from semantic elements: headings, lists, alerts, paragraphs, etc. Supports custom themes via a 'CSS'-like language. It also contains a number of lower level 'CLI' elements: rules, boxes, trees, and 'Unicode' symbols with 'ASCII' alternatives. It supports ANSI colors and text styles as well.

Maintained by Gábor Csárdi. Last updated 5 days ago.

cli

29.2 match 664 stars 19.30 score 1.4k scripts 14k dependents

lrberge

stringmagic:Character String Operations and Interpolation, Magic Edition

Performs complex string operations compactly and efficiently. Supports string interpolation jointly with over 50 string operations. Also enhances regular string functions (like grep() and co). See an introduction at <https://lrberge.github.io/stringmagic/>.

Maintained by Laurent R Berge. Last updated 7 months ago.

interpolation string cpp

51.3 match 15 stars 10.56 score 37 scripts 33 dependents

rorynolan

strex:Extra String Manipulation Functions

There are some things that I wish were easier with the 'stringr' or 'stringi' packages. The foremost of these is the extraction of numbers from strings. 'stringr' and 'stringi' make you figure out the regular expression for yourself; 'strex' takes care of this for you. There are many other handy functionalities in 'strex'. Contributions to this package are encouraged; it is intended as a miscellany of string manipulation functions that cannot be found in 'stringi' or 'stringr'.

Maintained by Rory Nolan. Last updated 6 months ago.

46.8 match 41 stars 10.59 score 1.2k scripts 18 dependents

harrelfe

Hmisc:Harrell Miscellaneous

Contains many functions useful for data analysis, high-level graphics, utility operations, functions for computing sample size and power, simulation, importing and annotating datasets, imputing missing values, advanced table making, variable clustering, character string manipulation, conversion of R objects to LaTeX and html code, recoding variables, caching, simplified parallel computing, encrypting and decrypting data using a safe workflow, general moving window statistical estimation, and assistance in interpreting principal component analysis.

Maintained by Frank E Harrell Jr. Last updated 2 days ago.

fortran

27.5 match 210 stars 17.61 score 17k scripts 750 dependents

ropensci

redland:RDF Library Bindings in R

Provides methods to parse, query and serialize information stored in the Resource Description Framework (RDF). RDF is described at <https://www.w3.org/TR/rdf-primer/>. This package supports RDF by implementing an R interface to the Redland RDF C library, described at <https://librdf.org/docs/api/index.html>. In brief, RDF provides a structured graph consisting of Statements composed of Subject, Predicate, and Object Nodes.

Maintained by Matthew B. Jones. Last updated 1 year ago.

redland

58.8 match 17 stars 7.85 score 98 scripts 13 dependents

kwb-r

kwb.utils:General Utility Functions Developed at KWB

This package contains some small helper functions that aim at improving the quality of code developed at Kompetenzzentrum Wasser gGmbH (KWB).

Maintained by Hauke Sonnenberg. Last updated 12 months ago.

62.4 match 8 stars 7.33 score 12 scripts 78 dependents

eitsupi

neopolars:R Bindings for the 'polars' Rust Library

Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.

Maintained by Tatsuya Shima. Last updated 4 days ago.

rust cargo

74.0 match 38 stars 4.84 score 1 script

tiledb-inc

tiledb:Modern Database Engine for Complex Data Based on Multi-Dimensional Arrays

The modern database 'TileDB' introduces a powerful on-disk format for storing and accessing any complex data based on multi-dimensional arrays. It supports dense and sparse arrays, dataframes and key-values stores, cloud storage ('S3', 'GCS', 'Azure'), chunked arrays, multiple compression, encryption and checksum filters, uses a fully multi-threaded implementation, supports parallel I/O, data versioning ('time travel'), metadata and groups. It is implemented as an embeddable cross-platform C++ library with APIs from several languages, and integrations. This package provides the R support.

Maintained by Isaiah Norton. Last updated 3 days ago.

array hdfs s3 storage-manager tiledb cpp

26.4 match 107 stars 11.96 score 306 scripts 4 dependents

rorynolan

filesstrings:Handy File and String Manipulation

This started out as a package for file and string manipulation. Since then, the 'fs' and 'strex' packages emerged, offering functionality previously given by this package (but it's done better in these new ones). Those packages have hence almost pushed 'filesstrings' into extinction. However, it still has a small number of unique, handy file manipulation functions which can be seen in the vignette. One example is a function to remove spaces from all file names in a directory.

Maintained by Rory Nolan. Last updated 1 year ago.

36.6 match 22 stars 8.59 score 632 scripts 4 dependents

sjmack

HLAtools:Toolkit for HLA Immunogenomics

A toolkit for the analysis and management of data for genes in the so-called "Human Leukocyte Antigen" (HLA) region. Functions extract reference data from the Anthony Nolan HLA Informatics Group/ImmunoGeneTics HLA 'GitHub' repository (ANHIG/IMGTHLA) <https://github.com/ANHIG/IMGTHLA>, validate Genotype List (GL) Strings, convert between UNIFORMAT and GL String Code (GLSC) formats, translate HLA alleles and GLSCs across ImmunoPolymorphism Database (IPD) IMGT/HLA Database release versions, identify differences between pairs of alleles at a locus, generate customized, multi-position sequence alignments, trim and convert allele-names across nomenclature epochs, and extend existing data-analysis methods.

Maintained by Steven Mack. Last updated 12 days ago.

49.2 match 4 stars 6.21 score 7 scripts 1 dependent

pythonicr

strs:'Python' Style String Functions

A comprehensive set of string manipulation functions based on those found in 'Python' without relying on 'reticulate'. It provides functions that intend to (1) make it easier for users familiar with 'Python' to work with strings, (2) reduce the complexity often associated with string operations, (3) and enable users to write more readable and maintainable code that manipulates strings.

Maintained by Garrett Shipley. Last updated 2 months ago.

77.7 match 2 stars 3.90 score 5 scripts

dankelley

oce:Analysis of Oceanographic Data

Supports the analysis of Oceanographic data, including 'ADCP' measurements, measurements made with 'argo' floats, 'CTD' measurements, sectional data, sea-level time series, coastline and topographic data, etc. Provides specialized functions for calculating seawater properties such as potential temperature in either the 'UNESCO' or 'TEOS-10' equation of state. Produces graphical displays that conform to the conventions of the Oceanographic literature. This package is discussed extensively by Kelley (2018) "Oceanographic Analysis with R" <doi:10.1007/978-1-4939-8844-0>.

Maintained by Dan Kelley. Last updated 8 days ago.

oceanography fortran cpp

18.5 match 146 stars 15.45 score 4.2k scripts 18 dependents

bioc

Biostrings:Efficient manipulation of biological strings

Memory efficient string containers, string matching algorithms, and other utilities, for fast manipulation of large biological sequences or sets of sequences.

Maintained by Hervé Pagès. Last updated 23 days ago.

sequencematching alignment sequencing genetics dataimport datarepresentation infrastructure bioconductor-package core-package

14.8 match 61 stars 17.83 score 8.6k scripts 1.2k dependents

berndbischl

BBmisc:Miscellaneous Helper Functions for B. Bischl

Miscellaneous helper functions for and from B. Bischl and some other guys, mainly for package development.

Maintained by Bernd Bischl. Last updated 2 years ago.

24.4 match 20 stars 10.59 score 980 scripts 69 dependents

r-lib

crayon:Colored Terminal Output

The crayon package is now superseded. Please use the 'cli' package for new projects. Colored terminal output on terminals that support 'ANSI' color and highlight codes. It also works in 'Emacs' 'ESS'. 'ANSI' color support is automatically detected. Colors and highlighting can be combined and nested. New styles can also be created easily. This package was inspired by the 'chalk' 'JavaScript' project.

Maintained by Gábor Csárdi. Last updated 5 months ago.

14.8 match 324 stars 16.61 score 1.5k scripts 6.0k dependents

winvector

wrapr:Wrap R Tools for Debugging and Parametric Programming

Tools for writing and debugging R code. Provides: '%.>%' dot-pipe (an 'S3' configurable pipe), unpack/to (R style multiple assignment/return), 'build_frame()'/'draw_frame()' ('data.frame' example tools), 'qc()' (quoting concatenate), ':=' (named map builder), 'let()' (converts non-standard evaluation interfaces to parametric standard evaluation interfaces, inspired by 'gtools::strmacro()' and 'base::bquote()'), and more.

Maintained by John Mount. Last updated 2 years ago.

22.1 match 137 stars 11.11 score 390 scripts 12 dependents

moosa-r

rbioapi:User-Friendly R Interface to Biologic Web Services' API

Currently fully supports Enrichr, JASPAR, miEAA, PANTHER, Reactome, STRING, and UniProt! The goal of rbioapi is to provide a user-friendly and consistent interface to biological databases and services, in a way that insulates the user from the technicalities of using web service APIs and creates a unified and easy-to-use interface to biological and medical web services. This is an ongoing project; new databases and services will be added periodically. Feel free to suggest any databases or services you often use.

Maintained by Moosa Rezwani. Last updated 1 month ago.

api-client bioinformatics biology enrichment enrichment-analysis enrichr jaspar mieaa over-representation-analysis panther reactome string uniprot

30.5 match 20 stars 7.60 score 55 scripts

henrikbengtsson

R.utils:Various Programming Utilities

Utility functions useful when programming and developing R packages.

Maintained by Henrik Bengtsson. Last updated 1 year ago.

16.2 match 63 stars 13.74 score 5.7k scripts 814 dependents

r-lib

systemfonts:System Native Font Finding

Provides system native access to the font catalogue. As font handling varies between systems it is difficult to correctly locate installed fonts across different operating systems. The 'systemfonts' package provides bindings to the native libraries on Windows, macOS and Linux for finding font files that can then be used further by e.g. graphic devices. The main use is intended to be from compiled code but 'systemfonts' also provides access from R.

Maintained by Thomas Lin Pedersen. Last updated 2 months ago.

fonts fontconfig freetype cpp

13.6 match 95 stars 15.62 score 384 scripts 990 dependents

atorus-research

Tplyr:A Traceability Focused Grammar of Clinical Data Summary

A traceability focused tool created to simplify the data manipulation necessary to create clinical summaries.

Maintained by Mike Stackhouse. Last updated 1 year ago.

pharma tables

21.8 match 95 stars 9.49 score 138 scripts 2 dependents

brodieg

fansi:ANSI Control Sequence Aware String Functions

Counterparts to R string manipulation functions that account for the effects of ANSI text formatting control sequences.

Maintained by Brodie Gaslam. Last updated 10 months ago.

string-manipulation

14.5 match 54 stars 14.18 score 136 scripts 11k dependents

tomasfryda

h2o:R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Maintained by Tomas Fryda. Last updated 1 year ago.

24.5 match 3 stars 8.20 score 7.8k scripts 11 dependents

bioc

alabaster.string:Save and Load Biostrings to/from File

Save Biostrings objects to file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.

Maintained by Aaron Lun. Last updated 5 months ago.

dataimport datarepresentation

38.8 match 4.95 score 5 scripts 2 dependents

tbates

umx:Structural Equation Modeling and Twin Modeling in R

Quickly create, run, and report structural equation models, and twin models. See '?umx' for help, and umx_open_CRAN_page("umx") for NEWS. Timothy C. Bates, Michael C. Neale, Hermine H. Maes, (2019). umx: A library for Structural Equation and Twin Modelling in R. Twin Research and Human Genetics, 22, 27-41. <doi:10.1017/thg.2019.2>.

Maintained by Timothy C. Bates. Last updated 23 hours ago.

behavior-genetics genetics openmx psychology sem statistics structural-equation-modeling tutorials twin-models umx

19.9 match 44 stars 9.45 score 472 scripts

yihui

xfun:Supporting Functions for Packages Maintained by 'Yihui Xie'

Miscellaneous functions commonly used in other packages maintained by 'Yihui Xie'.

Maintained by Yihui Xie. Last updated 2 days ago.

10.0 match 145 stars 18.18 score 916 scripts 4.4k dependents

stochastictree

stochtree:Stochastic Tree Ensembles (XBART and BART) for Supervised Learning and Causal Inference

Flexible stochastic tree ensemble software. Robust implementations of Bayesian Additive Regression Trees (BART) Chipman, George, McCulloch (2010) <doi:10.1214/09-AOAS285> for supervised learning and Bayesian Causal Forests (BCF) Hahn, Murray, Carvalho (2020) <doi:10.1214/19-BA1195> for causal inference. Enables model serialization and parallel sampling and provides a low-level interface for custom stochastic forest samplers.

Maintained by Drew Herren. Last updated 17 days ago.

bart bayesian-machine-learning bayesian-methods decision-trees gradient-boosted-trees machine-learning probabilistic-models tree-ensembles cpp

21.3 match 20 stars 8.52 score 40 scripts

hneth

ds4psy:Data Science for Psychologists

All datasets and functions required for the examples and exercises of the book "Data Science for Psychologists" (by Hansjoerg Neth, Konstanz University, 2023), freely available at <https://bookdown.org/hneth/ds4psy/>. The book and course introduce principles and methods of data science to students of psychology and other biological or social sciences. The 'ds4psy' package primarily provides datasets, but also functions for data generation and manipulation (e.g., of text and time data) and graphics that are used in the book and its exercises. All functions included in 'ds4psy' are designed to be explicit and instructive, rather than efficient or elegant.

Maintained by Hansjoerg Neth. Last updated 1 month ago.

data-literacy data-science education exploratory-data-analysis psychology social-sciences visualisation

26.1 match 22 stars 6.79 score 70 scripts

dstgithub

GrpString:Patterns and Statistical Differences Between Two Groups of Strings

Methods include converting series of event names to strings, finding common patterns in a group of strings, discovering featured patterns when comparing two groups of strings as well as the number and starting position of each pattern in each string, obtaining transition matrix, computing transition entropy, statistically comparing the difference between two groups of strings, and clustering string groups. Event names can be any action names or labels such as events in log files or areas of interest (AOIs) in eye tracking research.

Maintained by Hui (Tom) Tang. Last updated 7 years ago.

50.0 match 2 stars 3.48 score 30 scripts

tidyverse

tidyr:Tidy Messy Data

Tools to help to create tidy data, where each column is a variable, each row is an observation, and each cell contains a single value. 'tidyr' contains tools for changing the shape (pivoting) and hierarchy (nesting and 'unnesting') of a dataset, turning deeply nested lists into rectangular data frames ('rectangling'), and extracting values out of string columns. It also includes tools for working with missing values (both implicit and explicit).

Maintained by Hadley Wickham. Last updated 12 days ago.

tidy-data cpp

7.5 match 1.4k stars 22.88 score 168k scripts 5.5k dependents

leonawicz

tabr:Music Notation Syntax, Manipulation, Analysis and Transcription in R

Provides a music notation syntax and a collection of music programming functions for generating, manipulating, organizing, and analyzing musical information in R. Music syntax can be entered directly in character strings, for example to quickly transcribe short pieces of music. The package contains functions for directly performing various mathematical, logical and organizational operations and musical transformations on special object classes that facilitate working with music data and notation. The same music data can be organized in tidy data frames for a familiar and powerful approach to the analysis of large amounts of structured music data. Functions are available for mapping seamlessly between these formats and their representations of musical information. The package also provides an API to 'LilyPond' (<https://lilypond.org/>) for transcribing musical representations in R into tablature ("tabs") and sheet music. 'LilyPond' is open source music engraving software for generating high quality sheet music based on markup syntax. The package generates 'LilyPond' files from R code and can pass them to the 'LilyPond' command line interface to be rendered into sheet music PDF files or inserted into R markdown documents. The package offers nominal MIDI file output support in conjunction with rendering sheet music. The package can read MIDI files and attempts to structure the MIDI data to integrate as best as possible with the data structures and functionality found throughout the package.

Maintained by Matthew Leonawicz. Last updated 6 months ago.

guitar-tablature lilypond lilypond-api music-analysis music-data music-notation music-programming music-syntax music-transcription sheet-music

21.4 match 132 stars 7.87 score 94 scripts

gbganalyst

forstringr:String Manipulation Package for Those Familiar with 'Microsoft Excel'

The goal of 'forstringr' is to enable complex string manipulation in R especially to those more familiar with LEFT(), RIGHT(), and MID() functions in Microsoft Excel. The package combines the power of 'stringr' with other manipulation packages such as 'dplyr' and 'tidyr'.

Maintained by Ezekiel Ogundepo. Last updated 6 months ago.

glue regex string

28.4 match 10 stars 5.89 score 26 scripts 1 dependent

rapporter

rapportools:Miscellaneous (Stats) Helper Functions with Sane Defaults for Reporting

Helper functions that act as wrappers to more advanced statistical methods with the advantage of having sane defaults for quick reporting.

Maintained by Gergely Daróczi. Last updated 15 days ago.

21.9 match 8 stars 7.50 score 186 scripts 11 dependents

rossellhayes

stringstatic:Dependency-Free String Operations

Provides drop-in replacements for functions from the 'stringr' package, with the same user interface. These functions have no external dependencies and can be copied directly into your package code using the 'staticimports' package.

Maintained by Alexander Rossell Hayes. Last updated 2 years ago.

43.6 match 6 stars 3.48 score 1 script

gagolews

stringx:Replacements for Base String Functions Powered by 'stringi'

English is the native language for only 5% of the World population. Also, only 17% of us can understand this text. Moreover, the Latin alphabet is the main one for merely 36% of the total. The early computer era, now a very long time ago, was dominated by the US. Due to the proliferation of the internet, smartphones, social media, and other technologies and communication platforms, this is no longer the case. This package replaces base R string functions (such as grep(), tolower(), sprintf(), and strptime()) with ones that fully support the Unicode standards related to natural language and date-time processing. It also fixes some long-standing inconsistencies, and introduces some new, useful features. Thanks to 'ICU' (International Components for Unicode) and 'stringi', they are fast, reliable, and portable across different platforms.

Maintained by Marek Gagolewski. Last updated 2 months ago.

icu icu4c natural-language-processing nlp regex regexp string-manipulation stringi text text-processing unicode

31.4 match 28 stars 4.75 score 1 script

insightsengineering

formatters:ASCII Formatting for Values and Tables

We provide a framework for rendering complex tables to ASCII, and a set of formatters for transforming values or sets of values into ASCII-ready display strings.

Maintained by Joe Zhu. Last updated 2 months ago.

format matrix table

14.4 match 17 stars 10.19 score 22 scripts 20 dependents

r-lib

rlang:Functions for Base Types and Core R and 'Tidyverse' Features

A toolbox for working with base types, core R features like the condition system, and core 'Tidyverse' features like tidy evaluation.

Maintained by Lionel Henry. Last updated 19 days ago.

7.1 match 517 stars 20.53 score 9.8k scripts 15k dependents

cbielow

PTXQC:Quality Report Generation for MaxQuant and mzTab Results

Generates Proteomics (PTX) quality control (QC) reports for shotgun LC-MS data analyzed with the MaxQuant software suite (from .txt files) or mzTab files (ideally from OpenMS 'QualityControl' tool). Reports are customizable (target thresholds, subsetting) and available in HTML or PDF format. Published in J. Proteome Res., Proteomics Quality Control: Quality Control Software for MaxQuant Results (2015) <doi:10.1021/acs.jproteome.5b00780>.

Maintained by Chris Bielow. Last updated 1 year ago.

drag-and-drop hacktoberfest heatmap match-between-runs maxquant metric mztab openms proteomics quality-control quality-metrics report

15.5 match 42 stars 9.35 score 105 scripts 1 dependent

kurthornik

NLP:Natural Language Processing Infrastructure

Basic classes and methods for Natural Language Processing.

Maintained by Kurt Hornik. Last updated 4 months ago.

15.3 match 6 stars 9.37 score 1.0k scripts 127 dependents

easystats

insight:Easy Access to Model Information for Various Model Objects

A tool to provide easy, intuitive and consistent access to information contained in various R models, like model formulas, model terms, information about random effects, data that was used to fit the model or data from response variables. 'insight' mainly revolves around two types of functions: functions that find (the names of) information, starting with 'find_', and functions that get the underlying data, starting with 'get_'. The package has a consistent syntax and works with many different model objects, where otherwise functions to access this information are missing.

Maintained by Daniel Lüdecke. Last updated 4 days ago.

easystats hacktoberfest insight models names predictors random

7.8 match 412 stars 17.24 score 568 scripts 210 dependents

mmaechler

sfsmisc:Utilities from 'Seminar fuer Statistik' ETH Zurich

Useful utilities ['goodies'] from Seminar fuer Statistik ETH Zurich, some of which were ported from S-plus in the 1990s. For graphics, have pretty (Log-scale) axes eaxis(), an enhanced Tukey-Anscombe plot, combining histogram and boxplot, 2d-residual plots, a 'tachoPlot()', pretty arrows, etc. For robustness, have a robust F test and robust range(). For system support, notably on Linux, provides 'Sys.*()' functions with more access to system and CPU information. Finally, miscellaneous utilities such as simple efficient prime numbers, integer codes, Duplicated(), toLatex.numeric() and is.whole().

Maintained by Martin Maechler. Last updated 5 months ago.

12.3 match 11 stars 10.87 score 566 scripts 119 dependents

ngmarchant

comparator:Comparison Functions for Clustering and Record Linkage

Implements functions for comparing strings, sequences and numeric vectors for clustering and record linkage applications. Supported comparison functions include: generalized edit distances for comparing sequences/strings, Monge-Elkan similarity for fuzzy comparison of token sets, and L-p distances for comparing numeric vectors. Where possible, comparison functions are implemented in C/C++ to ensure good performance.

Maintained by Neil Marchant. Last updated 3 years ago.

clustering distance-measures distance-metrics entity-resolution record-linkage similarity-measures string-similarity cpp

28.0 match 18 stars 4.63 score 47 scripts

ggrothendieck

gsubfn:Utilities for Strings and Function Arguments

The gsubfn function is like gsub but can take a replacement function or certain other objects instead of the replacement string. Matches and back references are input to the replacement function and replaced by the function output. gsubfn can be used to split strings based on content rather than delimiters and for quasi-perl-style string interpolation. The package also has facilities for translating formulas to functions and allowing such formulas in function calls instead of functions. This can be used with R functions such as apply, sapply, lapply, optim, integrate, xyplot, Filter and any other function that expects another function as an input argument or functions like cat or sql calls that may involve strings where substitution is desirable. There is also a facility for returning multiple objects from functions and a version of transform that allows the RHS to refer to LHS used in the same transform.

Maintained by G. Grothendieck. Last updated 7 years ago.

12.1 match 11 stars 10.58 score 872 scripts 76 dependents

skranz

stringtools:Tools for working with strings in R

Tools for working with strings in R

Maintained by Sebastian Kranz. Last updated 3 years ago.

34.5 match 2 stars 3.66 score 29 scripts 26 dependents

hwborchers

pracma:Practical Numerical Math Functions

Provides a large number of functions from numerical analysis and linear algebra, numerical optimization, differential equations, time series, plus some well-known special mathematical functions. Uses 'MATLAB' function names where appropriate to simplify porting.

Maintained by Hans W. Borchers. Last updated 1 year ago.

10.1 match 29 stars 12.34 score 6.6k scripts 931 dependents

enricoschumann

textutils:Utilities for Handling Strings and Text

Utilities for handling character vectors that store human-readable text (either plain or with markup, such as HTML or LaTeX). The package provides, in particular, functions that help with the preparation of plain-text reports, e.g. for expanding and aligning strings that form the lines of such reports. The package also provides generic functions for transforming R objects to HTML and to plain text.

Maintained by Enrico Schumann. Last updated 2 months ago.

html5 string-manipulation

16.9 match 11 stars 7.37 score 47 scripts 12 dependents

emilhvitfeldt

emoji:Data and Function to Work with Emojis

Contains data about emojis with relevant metadata, and functions to work with emojis when they are in strings.

Maintained by Emil Hvitfeldt. Last updated 5 months ago.

15.6 match 28 stars 7.97 score 304 scripts 3 dependents

yihui

knitr:A General-Purpose Package for Dynamic Report Generation in R

Provides a general-purpose tool for dynamic report generation in R using Literate Programming techniques.

Maintained by Yihui Xie. Last updated 1 day ago.

dynamic-documents knitr literate-programming rmarkdown sweave

5.3 match 2.4k stars 23.62 score 116k scripts 4.2k dependents

laresbernardo

lares:Analytics & Machine Learning Sidekick

Auxiliary package for better/faster analytics, visualization, data mining, and machine learning tasks. With a wide variety of function families, like Machine Learning, Data Wrangling, Marketing Mix Modeling (Robyn), Exploratory, API, and Scraper, it helps the analyst or data scientist to get quick and robust results, without the need of repetitive coding or advanced R programming skills.

Maintained by Bernardo Lares. Last updated 23 days ago.

analytics api automation automl data-science descriptive-statistics h2o machine-learning marketing mmm predictive-modeling puzzle rlanguage robyn visualization

12.5 match 233 stars 9.84 score 185 scripts 1 dependent

ohdsi

SqlRender:Rendering Parameterized SQL and Translation to Dialects

A rendering tool for parameterized SQL that also translates into different SQL dialects. These dialects include 'Microsoft SQL Server', 'Oracle', 'PostgreSql', 'Amazon RedShift', 'Apache Impala', 'IBM Netezza', 'Google BigQuery', 'Microsoft PDW', 'Snowflake', 'Azure Synapse Analytics Dedicated', 'Apache Spark', 'SQLite', and 'InterSystems IRIS'.

Maintained by Martijn Schuemie. Last updated 3 days ago.

hades openjdk

9.7 match 82 stars 12.52 score 488 scripts 13 dependents

mlampros

fuzzywuzzyR:Fuzzy String Matching

Fuzzy string matching implementation of the 'fuzzywuzzy' <https://github.com/seatgeek/fuzzywuzzy> 'python' package. It uses the Levenshtein Distance <https://en.wikipedia.org/wiki/Levenshtein_distance> to calculate the differences between sequences.

Maintained by Lampros Mouselimis. Last updated 2 years ago.

fuzzywuzzy matching python reticulate string

20.5 match 37 stars 5.87 score 40 scripts

davidgohel

gdtools:Utilities for Graphical Rendering and Fonts Management

Tools are provided to compute metrics of formatted strings and to check the availability of a font. Another set of functions is provided to support the collection of fonts from 'Google Fonts' in a cache. Their use is simple within 'R Markdown' documents and 'shiny' applications but also with graphic productions generated with the 'ggiraph', 'ragg' and 'svglite' packages or with tabular productions from the 'flextable' package.

Maintained by David Gohel. Last updated 2 months ago.

cairo freetype cpp

9.4 match 27 stars 12.80 score 234 scripts 144 dependents

janmarvin

openxlsx2:Read, Write and Edit 'xlsx' Files

Simplifies the creation of 'xlsx' files by providing a high level interface to writing, styling and editing worksheets.

Maintained by Jan Marvin Garbuszus. Last updated 2 days ago.

xlsx cpp

8.1 match 137 stars 13.66 score 194 scripts 11 dependents

bioc

rhdf5:R Interface to HDF5

This package provides an interface between HDF5 and R. HDF5's main features are the ability to store and access very large and/or complex datasets and a wide variety of metadata on mass storage (disk) through a completely portable file format. The rhdf5 package is thus suited for the exchange of large and/or complex datasets between R and other software packages, and for letting R applications work on datasets that are larger than the available RAM.

Maintained by Mike Smith. Last updated 2 months ago.

infrastructure dataimport hdf5 rhdf5 openssl curl zlib cpp

6.9 match 62 stars 15.93 score 4.2k scripts 232 dependents

coolbutuseless

yyjsonr:Fast 'JSON', 'NDJSON' and 'GeoJSON' Parser and Generator

A fast 'JSON' parser, generator and validator which converts 'JSON', 'NDJSON' (Newline Delimited 'JSON') and 'GeoJSON' (Geographic 'JSON') data to/from R objects. The standard R data types are supported (e.g. logical, numeric, integer) with configurable handling of NULL and NA values. Data frames, atomic vectors and lists are all supported as data containers translated to/from 'JSON'. 'GeoJSON' data is read in as 'simple features' objects. This implementation wraps the 'yyjson' 'C' library which is available from <https://github.com/ibireme/yyjson>.

Maintained by Mike Cheng. Last updated 4 months ago.

zlib

11.4 match 147 stars 9.56 score 22 scripts 9 dependents

bmewing

mgsub:Safe, Multiple, Simultaneous String Substitution

Designed to enable simultaneous substitution in strings in a safe fashion. Safe means it does not rely on placeholders (which can cause errors in same length matches).

Maintained by Mark Ewing. Last updated 17 days ago.

hacktoberfest

11.7 match 14 stars 9.29 score 270 scripts 26 dependents

mlr-org

mlr3misc:Helper Functions for 'mlr3'

Frequently used helper functions and assertions used in 'mlr3' and its companion packages. Comes with helper functions for functional programming, for printing, to work with 'data.table', as well as some generally useful 'R6' classes. This package also supersedes the package 'BBmisc'.

Maintained by Marc Becker. Last updated 4 months ago.

machine-learning miscellaneous mlr3

10.4 match 12 stars 10.28 score 302 scripts 42 dependents

qile0317

FastUtils:Fast, Readable Utility Functions

A wide variety of tools for general data analysis, wrangling, spelling, statistics, visualizations, package development, and more. All functions have vectorized implementations whenever possible. Exported names are designed to be readable, with longer names possessing short aliases.

Maintained by Qile Yang. Last updated 4 months ago.

scientific-computing utilities utility cpp

21.1 match 2 stars 4.95 score 2 scripts

pecanproject

PEcAn.utils:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by Rob Kooper. Last updated 1 day ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

9.3 match 216 stars 10.92 score 218 scripts 35 dependents

lewinfox

levitate:Fuzzy String Comparison

Provides string similarity calculations inspired by the Python 'thefuzz' package. Compare strings by edit distance, similarity ratio, best matching substring, ordered token matching and set-based token matching. A range of edit distance measures are available thanks to the 'stringdist' package.

Maintained by Lewin Appleton-Fox. Last updated 10 months ago.

data-matching fuzzy-matching similarity-measures string-similarity thefuzz

19.1 match 35 stars 5.24 score 4 scripts

tazinho

snakecase:Convert Strings into any Case

A consistent, flexible and easy to use tool to parse and convert strings into cases like snake or camel among others.

Maintained by Malte Grosser. Last updated 2 years ago.

camelcase case conversion pascalcase snake-case

7.1 match 150 stars 13.99 score 744 scripts 290 dependents

rsheets

cellranger:Translate Spreadsheet Cell Ranges to Rows and Columns

Helper functions to work with spreadsheets and the "A1:D10" style of cell range specification.

Maintained by Jennifer Bryan. Last updated 7 years ago.

7.2 match 51 stars 13.84 score 80 scripts 843 dependents

oobianom

quickcode:Quick and Essential 'R' Tricks for Better Scripts

The NOT functions, 'R' tricks, and a compilation of simple, quick, and often-used 'R' code to improve your scripts. Improve the quality and reproducibility of 'R' scripts.

Maintained by Obinna Obianom. Last updated 13 days ago.

colors data distributions images

12.8 match 5 stars 7.76 score 7 scripts 6 dependents

skranz

RTutor:Interactive R problem sets with automatic testing of solutions and automatic hints

Interactive R problem sets with automatic testing of solutions and automatic hints

Maintained by Sebastian Kranz. Last updated 1 year ago.

economics learn-to-code problem-set rstudio rtutor shiny teaching

17.0 match 205 stars 5.83 score 111 scripts 1 dependent

brockk

escalation:A Modular Approach to Dose-Finding Clinical Trials

Methods for working with dose-finding clinical trials. We provide implementations of many dose-finding clinical trial designs, including the continual reassessment method (CRM) by O'Quigley et al. (1990) <doi:10.2307/2531628>, the toxicity probability interval (TPI) design by Ji et al. (2007) <doi:10.1177/1740774507079442>, the modified TPI (mTPI) design by Ji et al. (2010) <doi:10.1177/1740774510382799>, the Bayesian optimal interval design (BOIN) by Liu & Yuan (2015) <doi:10.1111/rssc.12089>, EffTox by Thall & Cook (2004) <doi:10.1111/j.0006-341X.2004.00218.x>, the design of Wages & Tait (2015) <doi:10.1080/10543406.2014.920873>, and the 3+3 described by Korn et al. (1994) <doi:10.1002/sim.4780131802>. All designs are implemented with a common interface. We also offer optional additional classes to tailor the behaviour of all designs, including avoiding skipping doses, stopping after n patients have been treated at the recommended dose, stopping when a toxicity condition is met, or demanding that n patients are treated before stopping is allowed. By daisy-chaining together these classes using the pipe operator from 'magrittr', it is simple to tailor the behaviour of a dose-finding design so it behaves how the trialist wants. Having provided a flexible interface for specifying designs, we then provide functions to run simulations and calculate dose-paths for future cohorts of patients.

Maintained by Kristian Brock. Last updated 2 months ago.

12.3 match 15 stars 7.91 score 67 scripts

mplex

multiplex:Algebraic Tools for the Analysis of Multiple Social Networks

Algebraic procedures for analyses of multiple social networks are delivered with this package as described in Ostoic (2020) <DOI:10.18637/jss.v092.i11>. 'multiplex' makes it possible, among other things, to create and manipulate multiplex, multimode, and multilevel network data with different formats. Effective ways are available to treat multiple networks with routines that combine algebraic systems like the partially ordered semigroup with decomposition procedures or semiring structures with the relational bundles occurring in different types of multivariate networks. 'multiplex' also provides an algebraic approach for affiliation networks through Galois derivations between families of the pairs of subsets in the two domains of the network with visualization options.

Maintained by Antonio Rivero Ostoic. Last updated 2 months ago.

algebra network-analysis semigroup semiring

11.9 match 23 stars 8.12 score 69 scripts 2 dependents

william-swl

baizer:Useful Functions for Data Processing

In ancient Chinese mythology, Bai Ze is a divine creature that knows the needs of everything. 'baizer' provides data processing functions frequently used by the author. Hope this package also knows what you want!

Maintained by William Song. Last updated 1 year ago.

dataframe numbers strings tidyverse

24.4 match 6 stars 3.95 score 5 scripts 1 dependent

spgarbet

tangram:The Grammar of Tables

Provides an extensible formula system to quickly and easily create production quality tables. The processing steps are a formula parser, statistical content generation from data as defined by formula, followed by rendering into a table. Each step of the processing is separate and user definable thus creating a set of composable building blocks for highly customizable table generation. A user is not limited by any of the choices of the package creator other than the formula grammar. For example, one could choose to add a different S3 rendering function and output a format not provided in the default package, or possibly one would rather have Gini coefficients for their statistical content in a resulting table. Routines to achieve New England Journal of Medicine style, Lancet style and Hmisc::summaryM() statistics are provided. Rendering for HTML5, Rmarkdown and an indexing format for use in tracing and tracking are also provided.

Maintained by Shawn Garbett. Last updated 2 years ago.

16.2 match 68 stars 5.93 score 62 scripts

jamovi

jmvcore:Dependencies for the 'jamovi' Framework

A framework for creating rich interactive analyses for the jamovi platform (see <https://www.jamovi.org> for more information).

Maintained by Jonathon Love. Last updated 6 months ago.

14.7 match 4 stars 6.51 score 20 scripts 8 dependents

r-gregmisc

gdata:Various R Programming Tools for Data Manipulation

Various R programming tools for data manipulation, including medical unit conversions, combining objects, character vector operations, factor manipulation, obtaining information about R objects, generating fixed-width format files, extracting components of date & time objects, operations on columns of data frames, matrix operations, operations on vectors, operations on data frames, value of last evaluated expression, and a resample() wrapper for sample() that ensures consistent behavior for both scalar and vector arguments.

Maintained by Arni Magnusson. Last updated 2 months ago.

6.9 match 9 stars 13.62 score 4.5k scripts 124 dependents

r-lib

textshaping:Bindings to the 'HarfBuzz' and 'Fribidi' Libraries for Text Shaping

Provides access to the text shaping functionality in the 'HarfBuzz' library and the bidirectional algorithm in the 'Fribidi' library. 'textshaping' is a low-level utility package mainly for graphic devices that expands upon the font tool-set provided by the 'systemfonts' package.

Maintained by Thomas Lin Pedersen. Last updated 2 months ago.

harfbuzz freetype fribidi cpp

6.6 match 19 stars 13.58 score 66 scripts 484 dependents

kevinstadler

cultevo:Tools, Measures and Statistical Tests for Cultural Evolution

Provides tools and statistics useful for analysing data from artificial language experiments. It implements the information-theoretic measure of the compositionality of signalling systems due to Spike (2016) <http://hdl.handle.net/1842/25930>, the Mantel test for distance matrix correlation (after Dietz 1983) <doi:10.1093/sysbio/32.1.21>, functions for computing string and meaning distance matrices as well as an implementation of the Page test for monotonicity of ranks (Page 1963) <doi:10.1080/01621459.1963.10500843> with exact p-values up to k = 22.

Maintained by Kevin Stadler. Last updated 1 year ago.

13.7 match 8 stars 6.50 score 131 scripts 1 dependent

r-lib

cpp11:A C++11 Interface for R's C Interface

Provides a header only, C++11 interface to R's C interface. Compared to other approaches 'cpp11' strives to be safe against long jumps from the C API as well as C++ exceptions, to conform to normal R function semantics, and to support interaction with 'ALTREP' vectors.

Maintained by Davis Vaughan. Last updated 11 days ago.

cpp cpp11

5.0 match 212 stars 17.69 score 104 scripts 8.6k dependents

svilsen

STRMPS:Analysis of Short Tandem Repeat (STR) Massively Parallel Sequencing (MPS) Data

Loading, identifying, aggregating, manipulating, and analysing short tandem repeat regions of massively parallel sequencing data in forensic genetics. The analyses and framework implemented in this package rely on the papers of Vilsen et al. (2017) <doi:10.1016/j.fsigen.2017.01.017> and Vilsen et al. (2018) <doi:10.1016/j.fsigen.2018.04.003>. Note that the parallelisation in the package relies on mclapply() and, thus, speed-ups will only be seen on UNIX based systems.

Maintained by Søren B. Vilsen. Last updated 2 days ago.

biostrings pwalign shortread iranges

20.2 match 4.30 score

dipterix

dipsaus:A Dipping Sauce for Data Analysis and Visualizations

Works as an "add-on" to packages like 'shiny', 'future', as well as 'rlang', and provides utility functions. Just like dipping sauce adding flavors to potato chips or pita bread, 'dipsaus' for data analysis and visualizations adds handy functions and enhancements to popular packages. The goal is to provide simple solutions that are frequently asked for online, such as how to synchronize 'shiny' inputs without freezing the app, or how to get memory size on 'Linux' or 'MacOS' systems. The enhancements roughly fall into these four categories: 1. 'shiny' input widgets; 2. high-performance computing using the 'future' package; 3. modify R calls and convert among numbers, strings, and other objects; 4. utility functions to get system information such as CPU chip-set, memory limit, etc.

Maintained by Zhengjia Wang. Last updated 4 days ago.

cpp

11.0 match 13 stars 7.90 score 85 scripts 3 dependents

markvanderloo

stringdist:Approximate String Matching, Fuzzy Text Search, and String Distance Functions

Implements an approximate string matching version of R's native 'match' function. Also offers fuzzy text search based on various string distance measures. Can calculate various string distances based on edits (Damerau-Levenshtein, Hamming, Levenshtein, optimal string alignment), qgrams (q-gram, cosine, jaccard distance) or heuristic metrics (Jaro, Jaro-Winkler). An implementation of soundex is provided as well. Distances can be computed between character vectors while taking proper care of encoding or between integer vectors representing generic sequences. This package is built for speed and runs in parallel by using 'openMP'. An API for C or C++ is exposed as well. Reference: MPJ van der Loo (2014) <doi:10.32614/RJ-2014-011>.

Maintained by Mark van der Loo. Last updated 3 months ago.

openmp

5.5 match 327 stars 15.54 score 2.0k scripts 179 dependents

bioc

epiNEM:epiNEM

epiNEM is an extension of the original Nested Effects Models (NEM). EpiNEM is able to take into account double knockouts and infer more complex network signalling pathways. It is tailored towards large scale double knock-out screens.

Maintained by Martin Pirkl. Last updated 5 months ago.

pathways systemsbiology networkinference network

14.4 match 1 star 5.83 score 1 script 3 dependents

trinker

qdapRegex:Regular Expression Removal, Extraction, and Replacement Tools

A collection of regular expression tools associated with the 'qdap' package that may be useful outside of the context of discourse analysis. Tools include removal/extraction/replacement of abbreviations, dates, dollar amounts, email addresses, hash tags, numbers, percentages, citations, person tags, phone numbers, times, and zip codes.

Maintained by Tyler Rinker. Last updated 1 year ago.

qdapregex regular-expression

8.8 match 50 stars 9.48 score 502 scripts 41 dependents

rolkra

explore:Simplifies Exploratory Data Analysis

Interactive data exploration with one line of code, automated reporting, or an easy-to-remember set of tidy functions for low-code exploratory data analysis.

Maintained by Roland Krasser. Last updated 3 months ago.

data-exploration data-visualisation decision-trees eda rmarkdown shiny tidy

7.3 match 228 stars 11.43 score 221 scripts 1 dependent

alexanderrobitzsch

miceadds:Some Additional Multiple Imputation Functions, Especially for 'mice'

Contains functions for multiple imputation which complement existing functionality in R. In particular, several imputation methods for the mice package (van Buuren & Groothuis-Oudshoorn, 2011, <doi:10.18637/jss.v045.i03>) are implemented. Main features of the miceadds package include plausible value imputation (Mislevy, 1991, <doi:10.1007/BF02294457>), multilevel imputation for variables at any level or with any number of hierarchical and non-hierarchical levels (Grund, Luedtke & Robitzsch, 2018, <doi:10.1177/1094428117703686>; van Buuren, 2018, Ch.7, <doi:10.1201/9780429492259>), imputation using partial least squares (PLS) for high dimensional predictors (Robitzsch, Pham & Yanagida, 2016), nested multiple imputation (Rubin, 2003, <doi:10.1111/1467-9574.00217>), substantive model compatible imputation (Bartlett et al., 2015, <doi:10.1177/0962280214521348>), and features for the generation of synthetic datasets (Reiter, 2005, <doi:10.1111/j.1467-985X.2004.00343.x>; Nowok, Raab, & Dibben, 2016, <doi:10.18637/jss.v074.i11>).

Maintained by Alexander Robitzsch. Last updated 15 days ago.

missing-data multiple-imputation openblas cpp

8.9 match 16 stars 9.16 score 542 scripts 9 dependents

martinzaefferer

CEGO:Combinatorial Efficient Global Optimization

Model building, surrogate model based optimization and Efficient Global Optimization in combinatorial or mixed search spaces.

Maintained by Martin Zaefferer. Last updated 2 months ago.

26.6 match 1 star 3.04 score 73 scripts

scharlton2

phreeqc:R Interface to Geochemical Modeling Software

A geochemical modeling program developed by the US Geological Survey that is designed to perform a wide variety of aqueous geochemical calculations, including speciation, batch-reaction, one-dimensional reactive-transport, and inverse geochemical calculations.

Maintained by S.R. Charlton. Last updated 17 days ago.

cpp

23.6 match 9 stars 3.37 score 60 scripts

bioc

annotate:Annotation for microarrays

Using R environments for annotation.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

annotation pathways go

7.0 match 11.41 score 812 scripts 243 dependents

r-dbi

DBI:R Database Interface

A database interface definition for communication between R and relational database management systems. All classes in this package are virtual and need to be extended by the various R/DBMS implementations.

Maintained by Kirill Müller. Last updated 3 months ago.

database interface

3.8 match 302 stars 20.88 score 19k scripts 2.9k dependents

renkun-ken

formattable:Create 'Formattable' Data Structures

Provides functions to create formattable vectors and data frames. 'Formattable' vectors are printed with text formatting, and formattable data frames are printed with multiple types of formatting in HTML to improve the readability of data presented in tabular form rendered in web pages.

Maintained by Kun Ren. Last updated 3 months ago.

5.3 match 700 stars 14.69 score 3.6k scripts 26 dependents

bioc

DeepPINCS:Protein Interactions and Networks with Compounds based on Sequences using Deep Learning

The identification of novel compound-protein interaction (CPI) is important in drug discovery. Revealing unknown compound-protein interactions is useful to design a new drug for a target protein by screening candidate compounds. Accurate CPI prediction assists the drug discovery process. To identify potential CPI effectively, prediction methods based on machine learning and deep learning have been developed. Data for sequences are provided as discrete symbolic data. In the data, compounds are represented as SMILES (simplified molecular-input line-entry system) strings and proteins are sequences in which the characters are amino acids. The outcome is defined as a variable that indicates how strongly two molecules interact with each other or whether there is an interaction between them. In this package, a deep-learning based model that takes only sequence information of both compounds and proteins as input and the outcome as output is used to predict CPI. The model is implemented by using compound and protein encoders with useful features. The CPI model also supports other modeling tasks, including protein-protein interaction (PPI), chemical-chemical interaction (CCI), or single compounds and proteins. Although the model is designed for proteins, DNA and RNA can be used if they are represented as sequences.

Maintained by Dongmin Jung. Last updated 5 months ago.

software network graphandnetwork neuralnetwork openjdk

16.2 match 4.78 score 4 scripts 2 dependents

rstudio

reticulate:Interface to 'Python'

Interface to 'Python' modules, classes, and functions. When calling into 'Python', R data types are automatically converted to their equivalent 'Python' types. When values are returned from 'Python' to R they are converted back to R types. Compatible with all versions of 'Python' >= 2.7.

Maintained by Tomasz Kalinowski. Last updated 1 day ago.

cpp

3.6 match 1.7k stars 21.07 score 18k scripts 427 dependents

functionaldata

fdapace:Functional Data Analysis and Empirical Dynamics

A versatile package that provides implementation of various methods of Functional Data Analysis (FDA) and Empirical Dynamics. The core of this package is Functional Principal Component Analysis (FPCA), a key technique for functional data analysis, for sparsely or densely sampled random trajectories and time courses, via the Principal Analysis by Conditional Estimation (PACE) algorithm. This core algorithm yields covariance and mean functions, eigenfunctions and principal component (scores), for both functional data and derivatives, for both dense (functional) and sparse (longitudinal) sampling designs. For sparse designs, it provides fitted continuous trajectories with confidence bands, even for subjects with very few longitudinal observations. PACE is a viable and flexible alternative to random effects modeling of longitudinal data. There is also a Matlab version (PACE) that contains some methods not available on fdapace and vice versa. Updates to fdapace were supported by grants from NIH Echo and NSF DMS-1712864 and DMS-2014626. Please cite our package if you use it (You may run the command citation("fdapace") to get the citation format and bibtex entry). References: Wang, J.L., Chiou, J., Müller, H.G. (2016) <doi:10.1146/annurev-statistics-041715-033624>; Chen, K., Zhang, X., Petersen, A., Müller, H.G. (2017) <doi:10.1007/s12561-015-9137-5>.

Maintained by Yidong Zhou. Last updated 9 months ago.

cpp

6.6 match 31 stars 11.46 score 474 scripts 25 dependents

rstudio

shiny:Web Application Framework for R

Makes it incredibly easy to build interactive web applications with R. Automatic "reactive" binding between inputs and outputs and extensive prebuilt widgets make it possible to build beautiful, responsive, and powerful applications with minimal effort.

Maintained by Winston Chang. Last updated 12 days ago.

reactive rstudio shiny web-app web-development

3.5 match 5.4k stars 21.28 score 108k scripts 1.8k dependents

jbkunst

highcharter:A Wrapper for the 'Highcharts' Library

A wrapper for the 'Highcharts' library including shortcut functions to plot R objects. 'Highcharts' <https://www.highcharts.com/> is a charting library offering numerous chart types with a simple configuration syntax.

Maintained by Joshua Kunst. Last updated 1 year ago.

highcharts htmlwidgets shiny shiny-r visualization wrapper

5.4 match 725 stars 13.93 score 4.9k scripts 18 dependents

tidymodels

recipes:Preprocessing and Feature Engineering Steps for Modeling

A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps that provide preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.

Maintained by Max Kuhn. Last updated 5 days ago.

4.0 match 584 stars 18.71 score 7.2k scripts 380 dependents

thomasp85

farver:High Performance Colour Space Manipulation

The encoding of colour can be handled in many different ways, using different colour spaces. As different colour spaces have different uses, efficient conversion between these representations is important. The 'farver' package provides a set of functions that gives access to very fast colour space conversion and comparisons implemented in C++, and offers speed improvements over the 'convertColor' function in the 'grDevices' package.

Maintained by Thomas Lin Pedersen. Last updated 10 months ago.

color-conversion cpp

5.2 match 136 stars 14.17 score 164 scripts 7.9k dependents

mlampros

textTinyR:Text Processing for Small or Big Data Files

It offers functions for splitting, parsing, tokenizing and creating a vocabulary for big text data files. Moreover, it includes functions for building a document-term matrix and extracting information from those (term-associations, most frequent terms). It also embodies functions for calculating token statistics (collocations, look-up tables, string dissimilarities) and functions to work with sparse matrices. Lastly, it includes functions for Word Vector Representations (i.e. 'GloVe', 'fasttext') and incorporates functions for the calculation of (pairwise) text document dissimilarities. The source code is based on 'C++11' and exported in R through the 'Rcpp', 'RcppArmadillo' and 'BH' packages.

Maintained by Lampros Mouselimis. Last updated 1 year ago.

bh boost cpp11 processing rcpp rcpparmadillo text openblas cpp openmp

9.7 match 38 stars 7.64 score 244 scripts 1 dependent

r-lib

prettyunits:Pretty, Human Readable Formatting of Quantities

Pretty, human readable formatting of quantities. Time intervals: '1337000' -> '15d 11h 23m 20s'. Vague time intervals: '2674000' -> 'about a month ago'. Bytes: '1337' -> '1.34 kB'. Rounding: '99' with 3 significant digits -> '99.0'. p-values: '0.00001' -> '<0.0001'. Colors: '#FF0000' -> 'red'. Quantities: '1239437' -> '1.24 M'.

Maintained by Gabor Csardi. Last updated 7 months ago.

5.5 match 133 stars 13.46 score 86 scripts 3.3k dependents

heardacat

Ramble:Parser Combinator for R

Parser generator for R using combinatory parsers. It is inspired by combinatory parsers developed in Haskell.

Maintained by Chapman Siu. Last updated 8 years ago.

combinatory-parsers parser-combinators parsing

12.4 match 22 stars 5.93 score 39 scripts

psychbruce

bruceR:Broadly Useful Convenient and Efficient R Functions

Broadly useful convenient and efficient R functions that bring users concise and elegant R data analyses. This package includes easy-to-use functions for (1) basic R programming (e.g., set working directory to the path of currently opened file; import/export data from/to files in any format; print tables to Microsoft Word); (2) multivariate computation (e.g., compute scale sums/means/... with reverse scoring); (3) reliability analyses and factor analyses; (4) descriptive statistics and correlation analyses; (5) t-test, multi-factor analysis of variance (ANOVA), simple-effect analysis, and post-hoc multiple comparison; (6) tidy report of statistical models (to R Console and Microsoft Word); (7) mediation and moderation analyses (PROCESS); and (8) additional toolbox for statistics and graphics.

Maintained by Han-Wu-Shuang Bao. Last updated 9 months ago.

anova data-analysis data-science linear-models linear-regression multilevel-models statistics toolbox

9.3 match 176 stars 7.87 score 316 scripts 3 dependents

ropensci

osmdata:Import 'OpenStreetMap' Data as Simple Features or Spatial Objects

Download and import of 'OpenStreetMap' ('OSM') data as 'sf' or 'sp' objects. 'OSM' data are extracted from the 'Overpass' web server (<https://overpass-api.de/>) and processed with very fast 'C++' routines for return to 'R'.

Maintained by Mark Padgham. Last updated 1 month ago.

open0street0map openstreetmap overpass0api osm cpp osm-data overpass-api peer-reviewed cpp

5.1 match 322 stars 14.53 score 2.8k scripts 14 dependents

ropensci

jqr:Client for 'jq', a 'JSON' Processor

Client for 'jq', a 'JSON' processor (<https://jqlang.github.io/jq/>), written in C. 'jq' allows the following with 'JSON' data: index into, parse, do calculations, cut up and filter, change key names and values, perform conditionals and comparisons, and more.

Maintained by Jeroen Ooms. Last updated 3 months ago.

jq json

7.2 match 144 stars 10.04 score 95 scripts 28 dependents

insightsengineering

tern:Create Common TLGs Used in Clinical Trials

Table, Listings, and Graphs (TLG) library for common outputs used in clinical trials.

Maintained by Joe Zhu. Last updated 2 months ago.

clinical-trials graphs listings nest outputs tables

5.7 match 79 stars 12.62 score 186 scripts 9 dependents

rstudio

keras3:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.

Maintained by Tomasz Kalinowski. Last updated 3 days ago.

5.3 match 845 stars 13.57 score 264 scripts 2 dependents

azure

AzureKusto:Interface to 'Kusto'/'Azure Data Explorer'

An interface to 'Azure Data Explorer', also known as 'Kusto', a fast, distributed data exploration service from Microsoft: <https://azure.microsoft.com/en-us/products/data-explorer/>. Includes 'DBI' and 'dplyr' interfaces, with the latter modelled after the 'dbplyr' package, whereby queries are translated from R into the native 'KQL' query language and executed lazily. On the admin side, the package extends the object framework provided by 'AzureRMR' to support creation and deletion of databases, and management of database principals. Part of the 'AzureR' family of packages.

Maintained by Alex Kyllo. Last updated 1 years ago.

azure azure-data-explorer azure-sdk-r big-data-analytics kusto

13.7 match 18 stars 5.19 score 9 scripts

hrbrmstr

qrencoder:Quick Response Code (QR Code) / Matrix Barcode Creator

Quick Response codes (QR codes) are a type of matrix bar code and can be used to authenticate transactions, provide access to multi-factor authentication services and enable general data transfer in an image. QR codes use four standardized encoding modes (numeric, alphanumeric, byte/binary, and kanji) to efficiently store data. Matrix barcode generation is performed efficiently in C via the included 'libqrencoder' library created by Kentaro Fukuchi.

Maintained by Bob Rudis. Last updated 6 years ago.

qrcode qrcode-generator cpp

11.8 match 61 stars 6.03 score 59 scripts 1 dependents

trinker

qdap:Bridging the Gap Between Qualitative Data and Quantitative Analysis

Automates many of the tasks associated with quantitative discourse analysis of transcripts, including frequency counts of sentence types, words, sentences, turns of talk, syllables and other assorted analysis tasks. The package provides parsing tools for preparing transcript data. Many functions enable the user to aggregate data by any number of grouping variables, providing analysis and seamless integration with other R packages that undertake higher level analysis and visualization of text. This affords the user a more efficient and targeted analysis. 'qdap' is designed for transcript analysis; however, many functions are applicable to other areas of Text Mining/Natural Language Processing.

Maintained by Tyler Rinker. Last updated 4 years ago.

qdap quantitative-discourse-analysis text-analysis text-mining text-plotting openjdk

7.3 match 176 stars 9.61 score 1.3k scripts 3 dependents

poissonconsulting

subfoldr2:Save and Load R Objects

Facilitates saving and loading R objects, data frames, tables, plots, text blocks and numbers to subfolders.

Maintained by Joe Thorley. Last updated 13 days ago.

18.9 match 2 stars 3.70 score 5 scripts

skranz

dplyrExtras:extra functionality for dplyr like mutate_rows for mutation of a subset of rows

Some extra functionality that is not (yet) in dplyr, e.g. mutate_rows (mutation of a subset of rows), xsummarise_each (summarise_each with more flexible alignment of results), or s_filter, s_arrange, ... that allow string arguments.

Maintained by Sebastian Kranz. Last updated 5 years ago.

dplyr

14.3 match 20 stars 4.85 score 59 scripts 4 dependents

tidyverse

dbplyr:A 'dplyr' Back End for Databases

A 'dplyr' back end for databases that allows you to work with remote database tables as if they are in-memory data frames. Basic features work with any database that has a 'DBI' back end; more advanced features require 'SQL' translation to be provided by the package author.

Maintained by Hadley Wickham. Last updated 3 months ago.

database

3.5 match 481 stars 19.72 score 5.2k scripts 736 dependents

cardiomoon

rrtable:Reproducible Research with a Table of R Codes

Makes documents containing plots and tables from a table of R codes. Can make "HTML", "pdf('LaTeX')", "docx('MS Word')" and "pptx('MS PowerPoint')" documents with or without R code. In the package, modularized 'shiny' app codes are provided. These modules are intended for reuse across applications.

Maintained by Keon-Woong Moon. Last updated 2 years ago.

10.6 match 3 stars 6.45 score 76 scripts 2 dependents

dan-reznik

clustringr:Cluster Strings by Edit-Distance

Returns an edit-distance based clustering of an input vector of strings. Each cluster will contain a set of strings with small mutual edit-distance (e.g., levenshtein, optimum-sequence-alignment, damerau-lev), as computed by stringdist::stringdist(). The set of all mutual edit-distances is then used by graph algorithms (from package igraph) to single out subsets of high connectivity.

Maintained by Dan S. Reznik. Last updated 6 years ago.

clustering graphs strings

17.7 match 15 stars 3.88 score 4 scripts

rstudio

htmltools:Tools for HTML

Tools for HTML generation and output.

Maintained by Carson Sievert. Last updated 10 months ago.

3.9 match 218 stars 17.61 score 10k scripts 4.5k dependents

elipousson

isstatic:Dependency-Free Object Tests

Convenience functions for checking class inheritance, extracting attributes, basic type conversion, and miscellaneous string manipulation when working with sf, ggplot2, and other packages.

Maintained by Eli Pousson. Last updated 2 years ago.

31.2 match 3 stars 2.18 score 1 scripts

coolbutuseless

ctypesio:Read and Write Standard 'C' Types from Files, Connections and Raw Vectors

Interacting with binary files can be difficult because R's types are a subset of what is generally supported by 'C'. This package provides a suite of functions for reading and writing binary data (with files, connections, and raw vectors) using 'C' type descriptions. These functions convert data between 'C' types and R types while checking for values outside the type limits, 'NA' values, etc.

Maintained by Mike Cheng. Last updated 2 months ago.

11.2 match 5 stars 6.02 score 6 scripts 1 dependents

acharaakshit

rminizinc:R Interface to 'MiniZinc'

Constraint optimization, or constraint programming, is the name given to identifying feasible solutions out of a very large set of candidates, where the problem can be modeled in terms of arbitrary constraints. 'MiniZinc' is a free and open-source constraint modeling language. Constraint satisfaction and discrete optimization problems can be formulated in a high-level modeling language. Models are compiled into an intermediate representation that is understood by a wide range of solvers. 'MiniZinc' itself provides several solvers, for instance 'GeCode'. R users can use the package to solve constraint programming problems without using 'MiniZinc' directly, modify existing 'MiniZinc' models and also create their own models.

Maintained by Akshit Achara. Last updated 3 years ago.

cpp

13.9 match 13 stars 4.81 score 5 scripts

dfsp-spirit

fsbrain:Managing and Visualizing Brain Surface Data

Provides high-level access to neuroimaging data from standard software packages like 'FreeSurfer' <http://freesurfer.net/> on the level of subjects and groups. Load morphometry data, surfaces and brain parcellations based on atlases. Mask data using labels, load data for specific atlas regions only, and visualize data and statistical results directly in 'R'.

Maintained by Tim Schäfer. Last updated 4 months ago.

3d brain dti freesurfer mesh mri neuroimaging research surface visualization voxel

10.3 match 66 stars 6.47 score 15 scripts

r-lib

httr2:Perform HTTP Requests and Process the Responses

Tools for creating and modifying HTTP requests, then performing them and processing the results. 'httr2' is a modern re-imagining of 'httr' that uses a pipe-based interface and solves more of the problems that API wrapping packages face.

Maintained by Hadley Wickham. Last updated 7 days ago.

http

3.7 match 246 stars 17.66 score 1.9k scripts 1.1k dependents

selkamand

assertions:Simple Assertions for Beautiful and Customisable Error Messages

Provides simple assertions with sensible defaults and customisable error messages. It offers convenient assertion call wrappers and a general assert function that can handle any condition. Default error messages are user friendly and easily customized with inline code evaluation and styling powered by the 'cli' package.

Maintained by Sam El-Kamand. Last updated 4 months ago.

9.4 match 3 stars 6.84 score 172 scripts 3 dependents

ropensci

beautier:'BEAUti' from R

'BEAST2' (<https://www.beast2.org>) is a widely used Bayesian phylogenetic tool, that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. 'BEAUti 2' (which is part of 'BEAST2') is a GUI tool that allows users to specify the many possible setups and generates the XML file 'BEAST2' needs to run. This package provides a way to create 'BEAST2' input files without active user input, but using R function calls instead.

Maintained by Richèl J.C. Bilderbeek. Last updated 22 days ago.

bayesian beast beast2 beauti phylogenetic-inference phylogenetics

7.3 match 13 stars 8.76 score 198 scripts 5 dependents

vubiostat

yaml:Methods to Convert R Data to YAML and Back

Implements the 'libyaml' 'YAML' 1.1 parser and emitter (<https://pyyaml.org/wiki/LibYAML>) for R.

Maintained by Shawn Garbett. Last updated 3 months ago.

yaml

3.6 match 166 stars 17.74 score 5.2k scripts 5.1k dependents

apache

arrow:Integration to 'Apache' 'Arrow'

'Apache' 'Arrow' <https://arrow.apache.org/> is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. This package provides an interface to the 'Arrow C++' library.

Maintained by Jonathan Keane. Last updated 1 months ago.

arrow curl openssl cpp

3.3 match 15k stars 19.22 score 10k scripts 81 dependents

strategicprojects

RapidFuzz:String Similarity Computation Using 'RapidFuzz'

Provides a high-performance interface for calculating string similarities and distances, leveraging the efficient library 'RapidFuzz' <https://github.com/rapidfuzz/rapidfuzz-cpp>. This package integrates the 'C++' implementation, allowing 'R' users to access cutting-edge algorithms for fuzzy matching and text analysis.

Maintained by Andre Leite. Last updated 3 months ago.

cpp

16.7 match 4 stars 3.78 score 3 scripts

bioc

rBiopaxParser:Parses BioPax files and represents them in R

Parses BioPAX files and represents them in R. At the moment, BioPAX level 2 and level 3 are supported.

Maintained by Frank Kramer. Last updated 5 months ago.

datarepresentation

10.7 match 10 stars 5.85 score 7 scripts

gadenbuie

epoxy:String Interpolation for Documents, Reports and Apps

Extra strength 'glue' for data-driven templates. String interpolation for 'Shiny' apps or 'R Markdown' and 'knitr'-powered 'Quarto' documents, built on the 'glue' and 'whisker' packages.

Maintained by Garrick Aden-Buie. Last updated 11 months ago.

glue knitr knitr-engine quarto rmarkdown rmd shiny template

7.4 match 218 stars 8.43 score 312 scripts

pacificclimate

ncdf4.helpers:Helper Functions for Use with the 'ncdf4' Package

Contains a collection of helper functions for dealing with 'NetCDF' files <https://www.unidata.ucar.edu/software/netcdf/> opened using 'ncdf4', particularly 'NetCDF' files that conform to the Climate and Forecast (CF) Metadata Conventions <http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html>.

Maintained by Lee Zeman. Last updated 3 years ago.

10.6 match 5 stars 5.85 score 236 scripts 1 dependents

guillaumepressiat

stringfix:Strings manipulations in a piping way of coding

Some infix operators and other utility functions to make syntax more and more left to right. In a manner of coding, I just want to say %>% nothing.

Maintained by Guillaume Pressiat. Last updated 3 years ago.

21.3 match 13 stars 2.89 score 12 scripts

rstudio

gt:Easily Create Presentation-Ready Display Tables

Build display tables from tabular data with an easy-to-use set of functions. With its progressive approach, we can construct display tables with a cohesive set of table parts. Table values can be formatted using any of the included formatting functions. Footnotes and cell styles can be precisely added through a location targeting system. The way in which 'gt' handles things for you means that you don't often have to worry about the fine details.

Maintained by Richard Iannone. Last updated 10 days ago.

docx easy-to-use html latex rtf summary-tables

3.4 match 2.1k stars 18.36 score 20k scripts 112 dependents

r-lib

tidyselect:Select from a Set of Strings

A backend for the selecting functions of the 'tidyverse'. It makes it easy to implement select-like functions in your own packages in a way that is consistent with other 'tidyverse' interfaces for selection.

Maintained by Lionel Henry. Last updated 3 months ago.

3.3 match 130 stars 18.31 score 1.9k scripts 8.2k dependents

r-lib

lintr:A 'Linter' for R Code

Checks adherence to a given style, syntax errors and possible semantic issues. Supports on the fly checking of R code edited with 'RStudio IDE', 'Emacs', 'Vim', 'Sublime Text', 'Atom' and 'Visual Studio Code'.

Maintained by Michael Chirico. Last updated 8 days ago.

linter

3.6 match 1.2k stars 17.00 score 916 scripts 33 dependents

stamats

MKmisc:Miscellaneous Functions from M. Kohl

Contains several functions for statistical data analysis; e.g. for sample size and power calculations, computation of confidence intervals and tests, and generation of similarity matrices.

Maintained by Matthias Kohl. Last updated 2 years ago.

8.2 match 11 stars 7.40 score 129 scripts 1 dependents

miracum

DIZtools:Lightweight Utilities for 'DIZ' R Package Development

Lightweight utility functions used for the R package development infrastructure inside the data integration centers ('DIZ') to standardize and facilitate repetitive tasks such as setting up a database connection or issuing notification messages and to avoid redundancy.

Maintained by Jonathan M. Mang. Last updated 1 years ago.

snippets tools

14.6 match 3 stars 4.13 score 2 scripts 3 dependents

genentech

psborrow2:Bayesian Dynamic Borrowing Analysis and Simulation

Bayesian dynamic borrowing is an approach to incorporating external data to supplement a randomized, controlled trial analysis in which external data are incorporated in a dynamic way (e.g., based on similarity of outcomes); see Viele 2013 <doi:10.1002/pst.1589> for an overview. This package implements the hierarchical commensurate prior approach to dynamic borrowing as described in Hobbes 2011 <doi:10.1111/j.1541-0420.2011.01564.x>. There are three main functionalities. First, 'psborrow2' provides a user-friendly interface for applying dynamic borrowing on the study results and handles the Markov Chain Monte Carlo sampling on behalf of the user. Second, 'psborrow2' provides a simulation framework to compare different borrowing parameters (e.g. full borrowing, no borrowing, dynamic borrowing) and other trial and borrowing characteristics (e.g. sample size, covariates) in a unified way. Third, 'psborrow2' provides a set of functions to generate data for simulation studies, and also allows the user to specify their own data generation process. This package is designed to use the sampling functions from 'cmdstanr' which can be installed from <https://stan-dev.r-universe.dev>.

Maintained by Matt Secrest. Last updated 1 months ago.

bayesian-dynamic-borrowing psborrow2 simulation-study

7.6 match 18 stars 7.87 score 16 scripts

jl5000

tidyged:Handle GEDCOM Files Using Tidyverse Principles

Create and summarise family tree GEDCOM files using tidy dataframes.

Maintained by Jamie Lendrum. Last updated 3 years ago.

10.0 match 8 stars 5.96 score 23 scripts 3 dependents

daroczig

logger:A Lightweight, Modern and Flexible Logging Utility

Inspired by the 'futile.logger' R package and 'logging' Python module, this utility provides a flexible and extensible way of formatting and delivering log messages with low overhead.

Maintained by Gergely Daróczi. Last updated 2 months ago.

3.5 match 298 stars 16.88 score 1.5k scripts 98 dependents

bioc

BiocGenerics:S4 generic functions used in Bioconductor

The package defines many S4 generic functions used in Bioconductor.

Maintained by Hervé Pagès. Last updated 1 months ago.

infrastructure bioconductor-package core-package

4.1 match 12 stars 14.22 score 612 scripts 2.2k dependents

njtierney

naniar:Data Structures, Summaries, and Visualisations for Missing Data

Missing values are ubiquitous in data and need to be explored and handled in the initial stages of analysis. 'naniar' provides data structures and functions that facilitate the plotting of missing values and examination of imputations. This allows missing data dependencies to be explored with minimal deviation from the common work patterns of 'ggplot2' and tidy data. The work is fully discussed at Tierney & Cook (2023) <doi:10.18637/jss.v105.i07>.

Maintained by Nicholas Tierney. Last updated 3 days ago.

data-visualisation ggplot2 missing-data missingness tidy-data

3.8 match 657 stars 15.63 score 5.1k scripts 9 dependents

pzhaonet

pinyin:Convert Chinese Characters into Pinyin, Sijiao, Wubi or Other Codes

Convert Chinese characters into Pinyin (the official romanization system for Standard Chinese in mainland China, Malaysia, Singapore, and Taiwan. See <https://en.wikipedia.org/wiki/Pinyin> for details), Sijiao (four or five numerical digits per character. See <https://en.wikipedia.org/wiki/Four-Corner_Method>.), Wubi (an input method with five strokes. See <https://en.wikipedia.org/wiki/Wubi_method>) or user-defined codes.

Maintained by Peng Zhao. Last updated 5 years ago.

bookdown chinese-characters pinyin

10.3 match 49 stars 5.71 score 35 scripts 1 dependents

s-fleck

lgr:A Fully Featured Logging Framework

A flexible, feature-rich yet light-weight logging framework based on 'R6' classes. It supports hierarchical loggers, custom log levels, arbitrary data fields in log events, and logging to plaintext, 'JSON', (rotating) files, and memory buffers. For extra appenders that support logging to databases, email and push notifications see the package lgr.app.

Maintained by Stefan Fleck. Last updated 4 months ago.

log4j logging r6

5.2 match 81 stars 11.29 score 120 scripts 93 dependents

bioc

Biobase:Biobase: Base functions for Bioconductor

Functions that are needed by many other packages or which replace R functions.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

infrastructure bioconductor-package core-package

3.5 match 9 stars 16.45 score 6.6k scripts 1.8k dependents

brockk

trialr:Clinical Trial Designs in 'rstan'

A collection of clinical trial designs and methods, implemented in 'rstan' and R, including: the Continual Reassessment Method by O'Quigley et al. (1990) <doi:10.2307/2531628>; EffTox by Thall & Cook (2004) <doi:10.1111/j.0006-341X.2004.00218.x>; the two-parameter logistic method of Neuenschwander, Branson & Sponer (2008) <doi:10.1002/sim.3230>; and the Augmented Binary method by Wason & Seaman (2013) <doi:10.1002/sim.5867>; and more. We provide functions to aid model-fitting and analysis. The 'rstan' implementations may also serve as a cookbook to anyone looking to extend or embellish these models. We hope that this package encourages the use of Bayesian methods in clinical trials. There is a preponderance of early phase trial designs because this is where Bayesian methods are used most. If there is a method you would like implemented, please get in touch.

Maintained by Kristian Brock. Last updated 1 years ago.

cpp

6.8 match 41 stars 8.55 score 106 scripts 3 dependents

thierryo

qrcode:Generate QRcodes with R

Create static QR codes in R. The content of the QR code is exactly what the user defines. We don't add a redirect URL, making it impossible for us to track the usage of the QR code. This allows you to generate fast, free-to-use and privacy-friendly QR codes.

Maintained by Thierry Onkelinx. Last updated 6 months ago.

qrcode qrcode-generator r-project

7.7 match 44 stars 7.56 score 456 scripts 7 dependents

parmsam

lzstring:Wrapper for 'lz-string' 'C++' Library

Provide access to the 'lz-string' <http://pieroxy.net/blog/pages/lz-string/index.html> 'C++' library for Lempel-Ziv (LZ) based compression and decompression of strings.

Maintained by Sam Parmar. Last updated 2 months ago.

lzstring cpp

13.1 match 1 stars 4.38 score 4 scripts 1 dependents

ms609

TreeTools:Create, Modify and Analyse Phylogenetic Trees

Efficient implementations of functions for the creation, modification and analysis of phylogenetic trees. Applications include: generation of trees with specified shapes; tree rearrangement; analysis of tree shape; rooting of trees and extraction of subtrees; calculation and depiction of split support; plotting the position of rogue taxa (Klopfstein & Spasojevic 2019) <doi:10.1371/journal.pone.0212942>; calculation of ancestor-descendant relationships, of 'stemwardness' (Asher & Smith, 2022) <doi:10.1093/sysbio/syab072>, and of tree balance (Mir et al. 2013, Lemant et al. 2022) <doi:10.1016/j.mbs.2012.10.005>, <doi:10.1093/sysbio/syac027>; artificial extinction (Asher & Smith, 2022) <doi:10.1093/sysbio/syab072>; import and export of trees from Newick, Nexus (Maddison et al. 1997) <doi:10.1093/sysbio/46.4.590>, and TNT <https://www.lillo.org.ar/phylogeny/tnt/> formats; and analysis of splits and cladistic information.

Maintained by Martin R. Smith. Last updated 1 months ago.

evolutionary-biology phylogenetic-trees phylogenetics cpp

5.8 match 21 stars 9.92 score 124 scripts 10 dependents

insightsengineering

rtables:Reporting Tables

Reporting tables often have structure that goes beyond simple rectangular data. The 'rtables' package provides a framework for declaring complex multi-level tabulations and then applying them to data. This framework models both tabulation and the resulting tables as hierarchical, tree-like objects which support sibling sub-tables, arbitrary splitting or grouping of data in row and column dimensions, cells containing multiple values, and the concept of contextual summary computations. A convenient pipe-able interface is provided for declaring table layouts and the corresponding computations, and then applying them to data.

Maintained by Joe Zhu. Last updated 2 months ago.

pharmaceuticals tables

4.1 match 232 stars 13.65 score 238 scripts 17 dependents

jkcshea

ivmte:Instrumental Variables: Extrapolation by Marginal Treatment Effects

The marginal treatment effect was introduced by Heckman and Vytlacil (2005) <doi:10.1111/j.1468-0262.2005.00594.x> to provide a choice-theoretic interpretation to instrumental variables models that maintain the monotonicity condition of Imbens and Angrist (1994) <doi:10.2307/2951620>. This interpretation can be used to extrapolate from the compliers to estimate treatment effects for other subpopulations. This package provides a flexible set of methods for conducting this extrapolation. It allows for parametric or nonparametric sieve estimation, and allows the user to maintain shape restrictions such as monotonicity. The package operates in the general framework developed by Mogstad, Santos and Torgovitsky (2018) <doi:10.3982/ECTA15463>, and accommodates either point identification or partial identification (bounds). In the partially identified case, bounds are computed using either linear programming or quadratically constrained quadratic programming. Support for four solvers is provided. Gurobi and the Gurobi R API can be obtained from <http://www.gurobi.com/index>. CPLEX can be obtained from <https://www.ibm.com/analytics/cplex-optimizer>. CPLEX R APIs 'Rcplex' and 'cplexAPI' are available from CRAN. MOSEK and the MOSEK R API can be obtained from <https://www.mosek.com/>. The lp_solve library is freely available from <http://lpsolve.sourceforge.net/5.5/>, and is included when installing its API 'lpSolveAPI', which is available from CRAN.

Maintained by Joshua Shea. Last updated 7 months ago.

10.5 match 18 stars 5.33 score 30 scripts

crlsierra

SoilR:Models of Soil Organic Matter Decomposition

Functions for modeling Soil Organic Matter decomposition in terrestrial ecosystems with linear and nonlinear systems of differential equations. The package implements models according to the compartmental system representation described in Sierra and others (2012) <doi:10.5194/gmd-5-1045-2012> and Sierra and others (2014) <doi:10.5194/gmd-7-1919-2014>.

Maintained by Carlos A. Sierra. Last updated 1 years ago.

19.3 match 5 stars 2.88 score 153 scripts

rstudio

pointblank:Data Validation and Organization of Metadata for Local and Remote Tables

Validate data in data frames, 'tibble' objects, 'Spark' 'DataFrames', and database tables. Validation pipelines can be made using easily-readable, consecutive validation steps. Upon execution of the validation plan, several reporting options are available. User-defined thresholds for failure rates allow for the determination of appropriate reporting actions. Many other workflows are available including an information management workflow, where the aim is to record, collect, and generate useful information on data tables.

Maintained by Richard Iannone. Last updated 9 days ago.

data-assertions data-checker data-dictionaries data-frames data-inference data-management data-profiler data-quality data-validation data-verification database-tables easy-to-understand reporting-tool schema-validation testing-tools yaml-configuration

5.3 match 932 stars 10.59 score 284 scripts

pauljohn32

kutils:Project Management Tools

Tools for data importation, recoding, and inspection. There are functions to create new project folders, R code templates, create uniquely named output directories, and to quickly obtain a visual summary for each variable in a data frame. The main feature here is the systematic implementation of the "variable key" framework for data importation and recoding. We are eager to have community feedback about the variable key and the vignette about it. In version 1.7, the function 'semTable' is removed. It had been deprecated since 1.67 and is now provided in a separate package, 'semTable'.

Maintained by Paul Johnson. Last updated 1 years ago.

9.5 match 5.85 score 110 scripts 20 dependents

melff

RKernel:Yet another R kernel for Jupyter

Provides a kernel for Jupyter.

Maintained by Martin Elff. Last updated 14 days ago.

jupyter jupyter-kernel jupyter-kernels jupyter-notebook

12.0 match 38 stars 4.60 score

tconwell

toolbox:List, String, and Meta Programming Utility Functions

Includes functions for mapping named lists to function arguments, random strings, pasting and combining rows together across columns, etc.

Maintained by Timothy Conwell. Last updated 2 years ago.

17.5 match 3.16 score 16 scripts 3 dependents

stuart-lab

Signac:Analysis of Single-Cell Chromatin Data

A framework for the analysis and exploration of single-cell chromatin data. The 'Signac' package contains functions for quantifying single-cell chromatin data, computing per-cell quality control metrics, dimension reduction and normalization, visualization, and DNA sequence motif analysis. Reference: Stuart et al. (2021) <doi:10.1038/s41592-021-01282-5>.

Maintained by Tim Stuart. Last updated 7 months ago.

atac bioinformatics single-cell zlib cpp

4.5 match 349 stars 12.19 score 3.7k scripts 1 dependents

brry

berryFunctions:Function Collection Related to Plotting and Hydrology

Draw horizontal histograms, color scattered points by 3rd dimension, enhance date- and log-axis plots, zoom in X11 graphics, trace errors and warnings, use the unit hydrograph in a linear storage cascade, convert lists to data.frames and arrays, fit multiple functions.

Maintained by Berry Boessenkool. Last updated 1 months ago.

5.8 match 13 stars 9.43 score 350 scripts 16 dependents

qsbase

qs:Quick Serialization of R Objects

Provides functions for quickly writing and reading any R object to and from disk.

Maintained by Travers Ching. Last updated 9 days ago.

compression data-storage encoding serialization libzstd lz4 cpp

3.9 match 414 stars 13.91 score 2.5k scripts 51 dependents

poissonconsulting

chk:Check User-Supplied Function Arguments

For developers to check user-supplied function arguments. It is designed to be simple, fast and customizable. Error messages follow the tidyverse style guide.

Maintained by Joe Thorley. Last updated 2 months ago.

chk

4.5 match 48 stars 11.89 score 22 scripts 95 dependents

mlopez-ibanez

irace:Iterated Racing for Automatic Algorithm Configuration

Iterated race is an extension of the Iterated F-race method for the automatic configuration of optimization algorithms, that is, (offline) tuning their parameters by finding the most appropriate settings given a set of instances of an optimization problem. M. López-Ibáñez, J. Dubois-Lacoste, L. Pérez Cáceres, T. Stützle, and M. Birattari (2016) <doi:10.1016/j.orp.2016.09.002>.

Maintained by Manuel López-Ibáñez. Last updated 30 days ago.

algorithm-configuration hyperparameter-tuning irace optimization-algorithms

5.2 match 63 stars 10.28 score 103 scripts 1 dependents

cysouw

qlcMatrix:Utility Sparse Matrix Functions for Quantitative Language Comparison

Extension of the functionality of the 'Matrix' package for using sparse matrices. Some of the functions are very general, while others are highly specific to special data formats used for quantitative language comparison.

Maintained by Michael Cysouw. Last updated 9 months ago.

7.6 match 6 stars 6.98 score 256 scripts 1 dependents

kasperwelbers

corpustools:Managing, Querying and Analyzing Tokenized Text

Provides text analysis in R, focusing on the use of a tokenized text format. In this format, the positions of tokens are maintained, and each token can be annotated (e.g., part-of-speech tags, dependency relations). Prominent features include advanced Lucene-like querying for specific tokens or contexts (e.g., documents, sentences), similarity statistics for words and documents, exporting to DTM for compatibility with many text analysis packages, and the possibility to reconstruct original text from tokens to facilitate interpretation.

Maintained by Kasper Welbers. Last updated 6 months ago.

cpp

7.0 match 31 stars 7.50 score 174 scripts 1 dependents

teunbrand

ggh4x:Hacks for 'ggplot2'

A 'ggplot2' extension that does a variety of little helpful things. The package extends 'ggplot2' facets through customisation, by setting individual scales per panel, resizing panels and providing nested facets. Also allows multiple colour and fill scales per plot. Also hosts a smaller collection of stats, geoms and axis guides.

Maintained by Teun van den Brand. Last updated 3 months ago.

ggplot-extension ggplot2

3.8 match 616 stars 13.98 score 4.4k scripts 20 dependents

lrberge

fixest:Fast Fixed-Effects Estimations

Fast and user-friendly estimation of econometric models with multiple fixed-effects. Includes ordinary least squares (OLS), generalized linear models (GLM) and the negative binomial. The core of the package is based on optimized parallel C++ code, scaling especially well for large data sets. The method to obtain the fixed-effects coefficients is based on Berge (2018) <https://github.com/lrberge/fixest/blob/master/_DOCS/FENmlm_paper.pdf>. Further provides tools to export and view the results of several estimations with intuitive design to cluster the standard-errors.

Maintained by Laurent Berge. Last updated 7 months ago.

cpp openmp

3.6 match 387 stars 14.69 score 3.8k scripts 25 dependents

henrikbengtsson

R.rsp:Dynamic Generation of Scientific Reports

The RSP markup language makes any text-based document come alive. RSP provides a powerful markup for controlling the content and output of LaTeX, HTML, Markdown, AsciiDoc, Sweave and knitr documents (and more), e.g. 'Today's date is <%=Sys.Date()%>'. Contrary to many other literate programming languages, with RSP it is straightforward to loop over mixtures of code and text sections, e.g. in month-by-month summaries. RSP also has several preprocessing directives for incorporating static and dynamic contents of external files (local or online) among other things. Functions rstring() and rcat() make it easy to process RSP strings, rsource() sources an RSP file as if it were an R script, while rfile() compiles it (even online) into its final output format, e.g. rfile('report.tex.rsp') generates 'report.pdf' and rfile('report.md.rsp') generates 'report.html'. RSP is ideal for self-contained scientific reports and R package vignettes. It's easy to use - if you know how to write an R script, you'll be up and running within minutes.

Maintained by Henrik Bengtsson. Last updated 1 year ago.

document markup report reproducibility science

6.5 match 31 stars 8.04 score 36 scripts 9 dependents

pharmaverse

admiral:ADaM in R Asset Library

A toolbox for programming Clinical Data Interchange Standards Consortium (CDISC) compliant Analysis Data Model (ADaM) datasets in R. ADaM datasets are a mandatory part of any New Drug or Biologics License Application submitted to the United States Food and Drug Administration (FDA). Analysis derivations are implemented in accordance with the "Analysis Data Model Implementation Guide" (CDISC Analysis Data Model Team, 2021, <https://www.cdisc.org/standards/foundational/adam>).

Maintained by Ben Straub. Last updated 4 days ago.

cdisc clinical-trials open-source

3.8 match 236 stars 13.89 score 486 scripts 4 dependents

cran

notebookutils:Dummy R APIs Used in 'Azure Synapse Analytics' for Local Developments

This is a pure dummy interface package that mirrors the 'MsSparkUtils' APIs <https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/microsoft-spark-utilities?pivots=programming-language-r> of 'Azure Synapse Analytics' <https://learn.microsoft.com/en-us/azure/synapse-analytics/> for R users. Customers of Azure Synapse can download this package from CRAN for local development.

Maintained by runtimeexp. Last updated 11 months ago.

22.0 match 2.36 score 23 scripts

allanvc

mRpostman:An IMAP Client for R

An easy-to-use IMAP client that provides tools for message searching, selective fetching of message attributes, mailbox management, attachment extraction, and several other IMAP features, paving the way for e-mail data analysis in R.

Maintained by Allan Quadros. Last updated 6 months ago.

8.7 match 31 stars 5.92 score 18 scripts

trinker

numform:Tools to Format Numbers for Publication

Format numbers and plots for publication; includes the removal of leading zeros, standardization of number of digits, addition of affixes, and a p-value formatter. These tools combine the functionality of several 'base' functions such as 'paste()', 'format()', and 'sprintf()' into specific use case functions that are named in a way that is consistent with usage, making their names easy to remember and easy to deploy.

Maintained by Tyler Rinker. Last updated 3 years ago.

number-formating

8.5 match 51 stars 6.06 score 151 scripts 1 dependent

mhenderson

llinyn:A Few Esoteric String Operations

A few esoteric string operations in R.

Maintained by Matthew Henderson. Last updated 6 months ago.

string-manipulation

20.6 match 2.48 score 1 dependent

plangfelder

WGCNA:Weighted Correlation Network Analysis

Functions necessary to perform Weighted Correlation Network Analysis on high-dimensional data as originally described in Horvath and Zhang (2005) <doi:10.2202/1544-6115.1128> and Langfelder and Horvath (2008) <doi:10.1186/1471-2105-9-559>. Includes functions for rudimentary data cleaning, construction of correlation networks, module identification, summarization, and relating of variables and modules to sample traits. Also includes a number of utility functions for data manipulation and visualization.

Maintained by Peter Langfelder. Last updated 6 months ago.

cpp

5.3 match 54 stars 9.65 score 5.3k scripts 32 dependents

alshum

hashids:Generate Short Unique YouTube-Like IDs (Hashes) from Integers

An R port of the hashids library. hashids generates YouTube-like hashes from integers or vectors of integers. Hashes generated from integers are relatively short, unique and non-sequential. hashids can be used to generate unique ids for URLs and hide database row numbers from the user. By default hashids will avoid generating common English curse words by preventing certain letters from being next to each other. hashids are not one-way: it is easy to encode an integer to a hashid and decode a hashid back into an integer.

Maintained by Alex Shum. Last updated 6 years ago.

12.4 match 18 stars 4.10 score 14 scripts

mllg

checkmate:Fast and Versatile Argument Checks

Tests and assertions to perform frequent argument checks. A substantial part of the package was written in C to minimize any worries about execution time overhead.

Maintained by Michel Lang. Last updated 8 months ago.

assertions testthat

3.1 match 276 stars 16.28 score 1.5k scripts 1.9k dependents

aphalo

ggpmisc:Miscellaneous Extensions to 'ggplot2'

Extensions to 'ggplot2' respecting the grammar of graphics paradigm. Statistics: locate and tag peaks and valleys; label plot with the equation of a fitted polynomial or other types of models; labels with P-value, R^2 or adjusted R^2 or information criteria for fitted models; label with ANOVA table for fitted models; label with summary for fitted models. Model fit classes for which suitable methods are provided by package 'broom' and 'broom.mixed' are supported. Scales and stats to build volcano and quadrant plots based on outcomes, fold changes, p-values and false discovery rates.

Maintained by Pedro J. Aphalo. Last updated 4 months ago.

data-analysis dataviz ggplot2-annotations ggplot2-stats statistics

3.8 match 105 stars 13.32 score 4.4k scripts 14 dependents

insightsengineering

rbmi:Reference Based Multiple Imputation

Implements standard and reference based multiple imputation methods for continuous longitudinal endpoints (Gower-Page et al. (2022) <doi:10.21105/joss.04251>). In particular, this package supports deterministic conditional mean imputation and jackknifing as described in Wolbers et al. (2022) <doi:10.1002/pst.2234>, Bayesian multiple imputation as described in Carpenter et al. (2013) <doi:10.1080/10543406.2013.834911>, and bootstrapped maximum likelihood imputation as described in von Hippel and Bartlett (2021) <doi:10.1214/20-STS793>.

Maintained by Isaac Gravestock. Last updated 23 days ago.

5.7 match 18 stars 8.78 score 33 scripts 1 dependent

trinker

wakefield:Generate Random Data Sets

Generates random data sets including: data.frames, lists, and vectors.

Maintained by Tyler Rinker. Last updated 5 years ago.

data-generation wakefield

7.0 match 256 stars 7.13 score 209 scripts

r-dbi

RPostgres:C++ Interface to PostgreSQL

Fully DBI-compliant C++-backed interface to PostgreSQL <https://www.postgresql.org/>, an open-source relational database.

Maintained by Kirill Müller. Last updated 19 days ago.

database postgres postgresql cpp

3.3 match 338 stars 14.78 score 1.6k scripts 31 dependents

cran

XML:Tools for Parsing and Generating XML Within R and S-Plus

Many approaches for both reading and creating XML (and HTML) documents (including DTDs), both local and accessible via HTTP or FTP. Also offers access to an 'XPath' "interpreter".

Maintained by CRAN Team. Last updated 2 months ago.

libxml2

5.5 match 3 stars 8.87 score 1.3k dependents

systemsbioinformatics

parcr:Construct Parsers for Structured Text Files

Construct parser combinator functions, higher order functions that parse input. Construction of such parsers is transparent and easy. Their main application is the parsing of structured text files like those generated by laboratory instruments. Based on a paper by Hutton (1992) <doi:10.1017/S0956796800000411>.

Maintained by Douwe Molenaar. Last updated 9 months ago.

combinators higher-order-functions parser parsing

9.7 match 4 stars 5.08 score 8 scripts

weirichs

eatTools:Miscellaneous Functions for the Analysis of Educational Assessments

Miscellaneous functions for data cleaning and data analysis of educational assessments. Includes functions for descriptive analyses, character vector manipulations and weighted statistics. Mainly a lightweight dependency for the packages 'eatRep', 'eatGADS', 'eatPrep' and 'eatModel' (which will be subsequently submitted to 'CRAN'). The function for defining (weighted) contrasts in weighted effect coding refers to te Grotenhuis et al. (2017) <doi:10.1007/s00038-016-0901-1>. Functions for weighted statistics refer to Wolter (2007) <doi:10.1007/978-0-387-35099-8>.

Maintained by Sebastian Weirich. Last updated 3 months ago.

9.1 match 2 stars 5.38 score 11 scripts 2 dependents

kkholst

lava:Latent Variable Models

A general implementation of Structural Equation Models with latent variables (MLE, 2SLS, and composite likelihood estimators) with continuous, censored, and ordinal outcomes (Holst and Budtz-Joergensen (2013) <doi:10.1007/s00180-012-0344-y>). Mixture latent variable models and non-linear latent variable models (Holst and Budtz-Joergensen (2020) <doi:10.1093/biostatistics/kxy082>). The package also provides methods for graph exploration (d-separation, back-door criterion), simulation of general non-linear latent variable models, and estimation of influence functions for a broad range of statistical models.

Maintained by Klaus K. Holst. Last updated 2 months ago.

latent-variable-models simulation statistics structural-equation-models

3.8 match 33 stars 12.85 score 610 scripts 476 dependents

hhoeflin

hdf5r:Interface to the 'HDF5' Binary Data Format

'HDF5' is a data model, library and file format for storing and managing large amounts of data. This package provides a nearly feature-complete, object-oriented wrapper for the 'HDF5' API <https://support.hdfgroup.org/documentation/hdf5/latest/_r_m.html> using R6 classes. Additionally, functionality is added so that 'HDF5' objects behave very similarly to their corresponding R counterparts.

Maintained by Holger Hoefling. Last updated 2 months ago.

hdf5

3.9 match 82 stars 12.31 score 988 scripts 33 dependents

calvagone

campsismod:Generic Implementation of a PK/PD Model

A generic, easy-to-use and expandable implementation of a pharmacokinetic (PK) / pharmacodynamic (PD) model based on the S4 class system. This package allows the user to read/write a pharmacometric model from/to files and adapt it further on the fly in the R environment. For this purpose, this package provides an intuitive API to add, modify or delete equations, ordinary differential equations (ODEs), model parameters or compartment properties (like infusion duration or rate, bioavailability and initial values). Finally, this package also provides a useful export of the model for use with simulation packages 'rxode2' and 'mrgsolve'. This package is designed and intended to be used with package 'campsis', a PK/PD simulation platform built on top of 'rxode2' and 'mrgsolve'.

Maintained by Nicolas Luyckx. Last updated 1 month ago.

7.1 match 5 stars 6.64 score 42 scripts 1 dependent

jmotif

jmotif:Time Series Analysis Toolkit Based on Symbolic Aggregate Discretization, i.e. SAX

Implements time series z-normalization, SAX, HOT-SAX, VSM, SAX-VSM, RePair, and RRA algorithms facilitating time series motif (i.e., recurrent pattern), discord (i.e., anomaly), and characteristic pattern discovery along with interpretable time series classification.

Maintained by Pavel Senin. Last updated 2 years ago.

anomalydiscovery discord discretization kdd sax sax-vsm timeseries cpp

9.2 match 55 stars 5.12 score 48 scripts

wrathematics

ngram:Fast n-Gram 'Tokenization'

An n-gram is a sequence of n "words" taken, in order, from a body of text. This is a collection of utilities for creating, displaying, summarizing, and "babbling" n-grams. The 'tokenization' and "babbling" are handled by very efficient C code, which can even be built as its own standalone library. The babbler is a simple Markov chain. The package also offers a vignette with complete example 'workflows' and information about the utilities offered in the package.

Maintained by Drew Schmidt. Last updated 1 year ago.

ngram text text-mining

4.5 match 71 stars 10.45 score 844 scripts 7 dependents

michaelhallquist

MplusAutomation:An R Package for Facilitating Large-Scale Latent Variable Analyses in Mplus

Leverages the R language to automate latent variable model estimation and interpretation using 'Mplus', a powerful latent variable modeling program developed by Muthen and Muthen (<https://www.statmodel.com>). Specifically, this package provides routines for creating related groups of models, running batches of models, and extracting and tabulating model parameters and fit statistics.

Maintained by Michael Hallquist. Last updated 2 months ago.

3.6 match 86 stars 12.96 score 664 scripts 13 dependents

bioc

DAPAR:Tools for the Differential Analysis of Proteins Abundance with R

The package DAPAR is a Bioconductor-distributed R package which provides all the necessary functions to analyze quantitative data from label-free proteomics experiments. Contrary to most other similar R packages, it is endowed with rich and user-friendly graphical interfaces, so that no programming skill is required (see `Prostar` package).

Maintained by Samuel Wieczorek. Last updated 5 months ago.

proteomics normalization preprocessing massspectrometry qualitycontrol go dataimport prostar1

8.6 match 2 stars 5.42 score 22 scripts 1 dependent

canmod

macpan2:Fast and Flexible Compartmental Modelling

Fast and flexible compartmental modelling with Template Model Builder.

Maintained by Steve Walker. Last updated 1 day ago.

compartmental-models epidemiology forecasting mixed-effects model-fitting optimization simulation simulation-modeling cpp

5.3 match 4 stars 8.89 score 246 scripts 1 dependent

truecluster

ff:Memory-Efficient Storage of Large Data on Disk and Fast Access Functions

The ff package provides data structures that are stored on disk but behave (almost) as if they were in RAM by transparently mapping only a section (pagesize) in main memory - the effective virtual memory consumption per ff object. ff supports R's standard atomic data types 'double', 'logical', 'raw' and 'integer' and non-standard atomic types boolean (1 bit), quad (2 bit unsigned), nibble (4 bit unsigned), byte (1 byte signed with NAs), ubyte (1 byte unsigned), short (2 byte signed with NAs), ushort (2 byte unsigned), single (4 byte float with NAs). For example 'quad' allows efficient storage of genomic data as an 'A','T','G','C' factor. The unsigned types support 'circular' arithmetic. There is also support for close-to-atomic types 'factor', 'ordered', 'POSIXct', 'Date' and custom close-to-atomic types. ff has native C-support for vectors, matrices and arrays with flexible dimorder (major column-order, major row-order and generalizations for arrays); there is also an ffdf class not unlike data.frames, and import/export filters for csv files. ff objects store raw data in binary flat files in native encoding, and complement this with metadata stored in R as physical and virtual attributes. ff objects have well-defined hybrid copying semantics, which gives rise to certain performance improvements through virtualization. ff objects can be stored and reopened across R sessions. ff files can be shared by multiple ff R objects (using different data en/de-coding schemes) in the same process or from multiple R processes to exploit parallelism. A wide choice of finalizer options allows working with 'permanent' files as well as creating/removing 'temporary' ff files completely transparently to the user. On certain OS/Filesystem combinations, creating the ff files works without notable delay thanks to using sparse file allocation.
Several access optimization techniques such as Hybrid Index Preprocessing and Virtualization are implemented to achieve good performance even with large datasets, for example virtual matrix transpose without touching a single byte on disk. Further, to reduce disk I/O, 'logicals' and non-standard data types are stored natively and compactly in binary flat files, i.e. logicals take up exactly 2 bits to represent TRUE, FALSE and NA. Beyond basic access functions, the ff package also provides compatibility functions that facilitate writing code for ff and ram objects and support for batch processing on ff objects (e.g. as.ram, as.ff, ffapply). ff interfaces closely with functionality from package 'bit': chunked looping, fast bit operations and coercions between different objects that can store subscript information ('bit', 'bitwhich', ff 'boolean', ri range index, hi hybrid index). This allows working interactively with selections of large datasets and quickly modifying selection criteria. Further high-performance enhancements can be made available upon request.

Maintained by Jens Oehlschlägel. Last updated 2 months ago.

cpp

3.9 match 27 stars 12.01 score 764 scripts 71 dependents

poissonconsulting

tidyplus:Additional 'tidyverse' Functions

Provides functions such as str_crush(), add_missing_column(), coalesce_data() and drop_na_all() that complement 'tidyverse' functionality or functions that provide alternative behaviors such as if_else2() and str_detect2().

Maintained by Ayla Pearson. Last updated 2 months ago.

7.3 match 9 stars 6.33 score 1 script 4 dependents

adamlilith

omnibus:Helper Tools for Managing Data, Dates, Missing Values, and Text

An assortment of helper functions for managing data (e.g., rotating values in matrices by a user-defined angle, switching from row- to column-indexing), dates (e.g., intuiting year from messy date strings), handling missing values (e.g., removing elements/rows across multiple vectors or matrices if any have an NA), text (e.g., flushing reports to the console in real-time); and combining data frames with different schema (copying, filling, or concatenating columns or applying functions before combining).

Maintained by Adam B. Smith. Last updated 6 months ago.

count-decimals leap-year merge-lists missing-values rotate-matrix sampling

7.9 match 4 stars 5.83 score 54 scripts 3 dependents

hadley

assertthat:Easy Pre and Post Assertions

An extension to stopifnot() that makes it easy to declare the pre and post conditions that your code should satisfy, while also producing friendly error messages so that your users know what's gone wrong.

Maintained by Hadley Wickham. Last updated 6 years ago.

3.0 match 207 stars 15.21 score 2.5k scripts 984 dependents

chrismuir

refinr:Cluster and Merge Similar Values Within a Character Vector

These functions take a character vector as input, identify and cluster similar values, and then merge clusters together so their values become identical. The functions are an implementation of the key collision and ngram fingerprint algorithms from the open source tool Open Refine <https://openrefine.org/>. More info on key collision and ngram fingerprint can be found here <https://openrefine.org/docs/technical-reference/clustering-in-depth>.

Maintained by Chris Muir. Last updated 1 year ago.

approximate-string-matching clustering data-cleaning data-clustering fuzzy-matching ngram openrefine cpp

6.7 match 104 stars 6.80 score 121 scripts

aryoda

tryCatchLog:Advanced 'tryCatch()' and 'try()' Functions

Advanced tryCatch() and try() functions for better error handling (logging, stack trace with source code references and support for post-mortem analysis via dump files).

Maintained by Juergen Altfeld. Last updated 2 years ago.

5.2 match 73 stars 8.63 score 116 scripts 9 dependents

dbosak01

common:Solutions for Common Problems in Base R

Contains functions for solving commonly encountered problems while programming in R. This package is intended to provide a lightweight supplement to Base R, and will be useful for almost any R user.

Maintained by David Bosak. Last updated 12 months ago.

5.5 match 6 stars 8.21 score 193 scripts 11 dependents

kwb-r

kwb.datetime:Functions for date/time objects

Functions for date/time objects, e.g. functions to convert timestamps between different time zones. Correctness for some functions still to be verified!

Maintained by Hauke Sonnenberg. Last updated 4 years ago.

datetime

9.0 match 4.98 score 5 scripts 32 dependents

fvafrcu

fritools:Utilities for the Forest Research Institute of the State Baden-Wuerttemberg

Miscellaneous utilities, tools and helper functions for finding and searching files on disk, searching for and removing R objects from the workspace. Does not import or depend on any third party package, but on core R only (i.e. it may depend on packages with priority 'base').

Maintained by Andreas Dominik Cullmann. Last updated 26 days ago.

7.5 match 5.98 score 4 scripts 6 dependents

jinghuazhao

gap:Genetic Analysis Package

As first reported [Zhao, J. H. 2007. "gap: Genetic Analysis Package". J Stat Soft 23(8):1-18. <doi:10.18637/jss.v023.i08>], it is designed as an integrated package for genetic data analysis of both population and family data. Currently, it contains functions for sample size calculations of both population-based and family-based designs, probability of familial disease aggregation, kinship calculation, statistics in linkage analysis, and association analysis involving genetic markers including haplotype analysis with or without environmental covariates. Over the years, the package has been developed in between many projects, hence also in line with the name (gap).

Maintained by Jing Hua Zhao. Last updated 15 days ago.

genetics imputation lmm fortran

3.8 match 12 stars 11.88 score 448 scripts 16 dependents

tconwell

html5:Creates Valid HTML5 Strings

Generates valid HTML tag strings for HTML5 elements documented by Mozilla. Attributes are passed as named lists, with names being the attribute name and values being the attribute value. Attribute values are automatically double-quoted. To declare a DOCTYPE, wrap html() with function doctype(). Mozilla's documentation for HTML5 is available here: <https://developer.mozilla.org/en-US/docs/Web/HTML/Element>. Elements marked as obsolete are not included.

Maintained by Timothy Conwell. Last updated 2 years ago.

12.2 match 1 star 3.65 score 1 script 3 dependents

atfutures

calendar:Create, Read, Write, and Work with 'iCalendar' Files, Calendars and Scheduling Data

Provides functions to create, read, write, and work with 'iCalendar' files (which typically have '.ics' or '.ical' extensions), and the scheduling data, calendars and timelines of people, organisations and other entities that they represent. 'iCalendar' is an open standard for exchanging calendar and scheduling information between users and computers, described at <https://icalendar.org/>.

Maintained by Robin Lovelace. Last updated 7 months ago.

calendar ical

5.3 match 42 stars 8.33 score 113 scripts 1 dependent

bioc

glmSparseNet:Network Centrality Metrics for Elastic-Net Regularized Models

glmSparseNet is an R-package that generalizes sparse regression models when the features (e.g. genes) have a graph structure (e.g. protein-protein interactions), by including network-based regularizers. glmSparseNet uses the glmnet R-package, by including centrality measures of the network as penalty weights in the regularization. The current version implements regularization based on node degree, i.e. the strength and/or number of its associated edges, either by promoting hubs in the solution or orphan genes in the solution. All the glmnet distribution families are supported, namely "gaussian", "poisson", "binomial", "multinomial", "cox", and "mgaussian".

Maintained by André Veríssimo. Last updated 5 months ago.

software statisticalmethod dimensionreduction regression classification survival network graphandnetwork

5.9 match 6 stars 7.42 score 41 scripts 1 dependent

wraff

wrMisc:Analyze Experimental High-Throughput (Omics) Data

This collection of diverse functions facilitates the efficient treatment and convenient analysis of experimental high-throughput (omics) data. Several functions address advanced object-conversions, like manipulating lists of lists or lists of arrays, reorganizing lists to arrays or into separate vectors, merging of multiple entries, etc. Another set of functions provides speed-optimized calculation of standard deviation (sd), coefficient of variation (CV) or standard error of the mean (SEM) for data in matrices or means per line with respect to additional grouping (e.g. n groups of replicates). A group of functions facilitates dealing with non-redundant information, by indexing unique entries, adding counters to redundant ones or eliminating lines with respect to redundancy in a given reference-column, etc. Help is provided to identify very closely matching numeric values to generate (partial) distance matrices for very big data in a memory-efficient manner or to reduce the complexity of large data-sets by combining very close values. Other functions help aligning a matrix or data.frame to a reference using partial matching or to mine an experimental setup to extract patterns of replicate samples. Large experimental datasets often need some additional filtering, for which adequate functions are provided. Convenient data normalization is supported in various modes; parameter estimation via permutations or bootstrap as well as flexible testing of multiple pair-wise combinations using the framework of 'limma' is provided, too. Batch reading (or writing) of sets of files and combining data to arrays is also supported.

Maintained by Wolfgang Raffelsberger. Last updated 7 months ago.

9.9 match 4.44 score 33 scripts 4 dependents

epiverse-trace

epiparameter:Classes and Helper Functions for Working with Epidemiological Parameters

Classes and helper functions for loading, extracting, converting, manipulating, plotting and aggregating epidemiological parameters for infectious diseases. Epidemiological parameters extracted from the literature are loaded from the 'epiparameterDB' R package.

Maintained by Joshua W. Lambert. Last updated 2 months ago.

data-access data-package epidemiology epiverse probability-distribution

4.4 match 33 stars 9.84 score 102 scripts 1 dependent

cran

tmcn:A Text Mining Toolkit for Chinese

A text mining toolkit for Chinese, which includes facilities for Chinese string processing, Chinese NLP support, and encoding detection and conversion. It also provides some functions to support the 'tm' package in Chinese.

Maintained by Jian Li. Last updated 6 years ago.

18.3 match 1 star 2.38 score 5 dependents