Showing 142 of 142 results
tidymodels
recipes:Preprocessing and Feature Engineering Steps for Modeling
A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps that provide preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.
Maintained by Max Kuhn. Last updated 5 days ago.
11.4 match 584 stars 18.71 score 7.2k scripts 380 dependents
arnaudgallou
plume:A Simple Author Handler for Scientific Writing
Handles and formats author information in scientific writing in 'R Markdown' and 'Quarto'. 'plume' provides easy-to-use and flexible tools for injecting author metadata in 'YAML' headers as well as generating author and contribution lists (among others) as strings from tabular data.
Maintained by Arnaud Gallou. Last updated 30 days ago.
authorscontributioncontributionslistlistsmarkdownpaperpreprintquartoroleroles
24.6 match 21 stars 6.84 score 15 scripts
josesamos
rolap:Obtaining Star Databases from Flat Tables
Data in multidimensional systems is obtained from operational systems and is transformed to adapt it to the new structure. Frequently, the operations to be performed aim to transform a flat table into a ROLAP (Relational On-Line Analytical Processing) star database. The main objective of the package is to allow the definition of these transformations easily. The implementation of the multidimensional database obtained can be exported to work with multidimensional analysis tools on spreadsheets or relational databases.
Maintained by Jose Samos. Last updated 1 year ago.
14.8 match 5 stars 6.12 score 25 scripts 1 dependents
tychobra
polished:Authentication and Hosting for 'shiny' Apps
Authentication, user administration, hosting, and additional infrastructure for 'shiny' apps. See <https://polished.tech> for additional documentation and examples.
Maintained by Andy Merlino. Last updated 1 year ago.
10.9 match 233 stars 8.09 score 75 scripts
azure
AzureRMR:Interface to 'Azure Resource Manager'
A lightweight but powerful R interface to the 'Azure Resource Manager' REST API. The package exposes a comprehensive class framework and related tools for creating, updating and deleting 'Azure' resource groups, resources and templates. While 'AzureRMR' can be used to manage any 'Azure' service, it can also be extended by other packages to provide extra functionality for specific services. Part of the 'AzureR' family of packages.
Maintained by Hong Ooi. Last updated 1 year ago.
azureazure-resource-managerazure-sdk-rcloud
8.1 match 20 stars 9.94 score 51 scripts 12 dependents
r-lib
desc:Manipulate DESCRIPTION Files
Tools to read, write, create, and manipulate DESCRIPTION files. It is intended for packages that create or manipulate other packages.
Maintained by Gábor Csárdi. Last updated 1 month ago.
5.3 match 123 stars 14.68 score 409 scripts 1.1k dependents
kjhealy
gssrdoc:Document General Social Survey Variable
The General Social Survey (GSS) is a long-running, mostly annual survey of US households. It is administered by the National Opinion Research Center (NORC). This package contains a tibble with information on the survey variables, together with every variable documented as an R help page. For more information on the GSS see <http://gss.norc.org>.
Maintained by Kieran Healy. Last updated 11 months ago.
24.8 match 2.28 score 38 scripts
revelle
psychTools:Tools to Accompany the 'psych' Package for Psychological Research
Support functions, data sets, and vignettes for the 'psych' package. Contains several of the biggest data sets for the 'psych' package as well as four vignettes. A few helper functions for file manipulation are included as well. For more information, see the <https://personality-project.org/r/> web page.
Maintained by William Revelle. Last updated 12 months ago.
6.9 match 5.89 score 178 scripts 5 dependents
vubiostat
redcapAPI:Interface to 'REDCap'
Access data stored in 'REDCap' databases using the Application Programming Interface (API). 'REDCap' (Research Electronic Data CAPture; <https://projectredcap.org>, Harris, et al. (2009) <doi:10.1016/j.jbi.2008.08.010>, Harris, et al. (2019) <doi:10.1016/j.jbi.2019.103208>) is a web application for building and managing online surveys and databases developed at Vanderbilt University. The API allows users to access data and project meta data (such as the data dictionary) from the web programmatically. The 'redcapAPI' package facilitates the process of accessing data with options to prepare an analysis-ready data set consistent with the definitions in a database's data dictionary.
Maintained by Shawn Garbett. Last updated 8 days ago.
3.5 match 22 stars 10.47 score 134 scripts 2 dependents
tom-wolff
ideanet:Integrating Data Exchange and Analysis for Networks ('ideanet')
A suite of convenient tools for social network analysis geared toward students, entry-level users, and non-expert practitioners. ‘ideanet’ features unique functions for the processing and measurement of sociocentric and egocentric network data. These functions automatically generate node- and system-level measures commonly used in the analysis of these types of networks. Outputs from these functions maximize the ability of novice users to employ network measurements in further analyses while making all users less prone to common data analytic errors. Additionally, ‘ideanet’ features an R Shiny graphic user interface that allows novices to explore network data with minimal need for coding.
Maintained by Tom Wolff. Last updated 2 days ago.
5.1 match 6 stars 6.80 score 10 scripts
bioc
rsbml:R support for SBML, using libsbml
Links R to libsbml for SBML parsing and validating output, provides an S4 SBML DOM, and converts SBML to R graph objects. Optionally links to the SBML ODE Solver Library (SOSLib) for simulating models.
Maintained by Michael Lawrence. Last updated 17 days ago.
graphandnetworkpathwaysnetworklibsbmlcpp
6.3 match 4.71 score 19 scripts 1 dependents
bioc
Biobase:Biobase: Base functions for Bioconductor
Functions that are needed by many other packages or which replace R functions.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
infrastructurebioconductor-packagecore-package
1.8 match 9 stars 16.45 score 6.6k scripts 1.8k dependents
avdrark
cmm:Categorical Marginal Models
Quite extensive package for maximum likelihood estimation and weighted least squares estimation of categorical marginal models (CMMs; e.g., Bergsma and Rudas, 2002, <http://www.jstor.org/stable/2700006>; Bergsma, Croon and Hagenaars, 2009, <DOI:10.1007/b12532>).
Maintained by L. A. van der Ark. Last updated 2 years ago.
10.2 match 2.73 score 25 scripts 4 dependents
thothorn
HSAUR3:A Handbook of Statistical Analyses Using R (3rd Edition)
Functions, data sets, analyses and examples from the third edition of the book ''A Handbook of Statistical Analyses Using R'' (Torsten Hothorn and Brian S. Everitt, Chapman & Hall/CRC, 2014). The first chapter of the book, which is entitled ''An Introduction to R'', is completely included in this package; for all other chapters, a vignette containing all data analyses is available. In addition, Sweave source code for slides of selected chapters is included in this package (see HSAUR3/inst/slides). The publisher's web page is '<https://www.routledge.com/A-Handbook-of-Statistical-Analyses-using-R/Hothorn-Everitt/p/book/9781482204582>'.
Maintained by Torsten Hothorn. Last updated 7 months ago.
4.0 match 6 stars 6.72 score 120 scripts 2 dependents
akgold
onelogin:Interact with the 'OneLogin' API
The identity provider ['OneLogin']<http://onelogin.com> is used for authentication via Single Sign On (SSO). This package provides an R interface to their API.
Maintained by Alex Gold. Last updated 6 years ago.
9.9 match 2.70 score 1 scripts
markedmondson1234
googleAuthR:Authenticate and Create Google APIs
Create R functions that interact with OAuth2 Google APIs <https://developers.google.com/apis-explorer/> easily, with auto-refresh and Shiny compatibility.
Maintained by Erik Grönroos. Last updated 10 months ago.
apiauthenticationgooglegoogleauthroauth2-flowshiny
2.0 match 178 stars 12.84 score 804 scripts 13 dependents
smartdata-analysis-and-statistics
metamisc:Meta-Analysis of Diagnosis and Prognosis Research Studies
Facilitate frequentist and Bayesian meta-analysis of diagnosis and prognosis research studies. It includes functions to summarize multiple estimates of prediction model discrimination and calibration performance (Debray et al., 2019) <doi:10.1177/0962280218785504>. It also includes functions to evaluate funnel plot asymmetry (Debray et al., 2018) <doi:10.1002/jrsm.1266>. Finally, the package provides functions for developing multivariable prediction models from datasets with clustering (de Jong et al., 2021) <doi:10.1002/sim.8981>.
Maintained by Thomas Debray. Last updated 30 days ago.
meta-analysisprognosisprognostic-models
3.4 match 7 stars 7.48 score 102 scripts
azure
AzureStor:Storage Management in 'Azure'
Manage storage in Microsoft's 'Azure' cloud: <https://azure.microsoft.com/en-us/product-categories/storage/>. On the admin side, 'AzureStor' includes features to create, modify and delete storage accounts. On the client side, it includes an interface to blob storage, file storage, and 'Azure Data Lake Storage Gen2': upload and download files and blobs; list containers and files/blobs; create containers; and so on. Authenticated access to storage is supported, via either a shared access key or a shared access signature (SAS). Part of the 'AzureR' family of packages.
Maintained by Hong Ooi. Last updated 2 years ago.
azure-data-lakeazure-sdk-razure-storageazure-storage-blobazure-storage-file
2.3 match 64 stars 10.72 score 298 scripts 4 dependents
bioc
MoonlightR:Identify oncogenes and tumor suppressor genes from omics data
Motivation: The understanding of cancer mechanism requires the identification of genes playing a role in the development of the pathology and the characterization of their role (notably oncogenes and tumor suppressors). Results: We present an R/bioconductor package called MoonlightR which returns a list of candidate driver genes for specific cancer types on the basis of TCGA expression data. The method first infers gene regulatory networks and then carries out a functional enrichment analysis (FEA) (implementing an upstream regulator analysis, URA) to score the importance of well-known biological processes with respect to the studied cancer type. Eventually, by means of random forests, MoonlightR predicts two specific roles for the candidate driver genes: i) tumor suppressor genes (TSGs) and ii) oncogenes (OCGs). As a consequence, this methodology does not only identify genes playing a dual role (e.g. TSG in one cancer type and OCG in another) but also helps in elucidating the biological processes underlying their specific roles. In particular, MoonlightR can be used to discover OCGs and TSGs in the same cancer type. This may help in answering the question whether some genes change role between early stages (I, II) and late stages (III, IV) in breast cancer. In the future, this analysis could be useful to determine the causes of different resistances to chemotherapeutic treatments.
Maintained by Matteo Tiberti. Last updated 5 months ago.
dnamethylationdifferentialmethylationgeneregulationgeneexpressionmethylationarraydifferentialexpressionpathwaysnetworksurvivalgenesetenrichmentnetworkenrichment
3.7 match 17 stars 6.57 score
tidymodels
textrecipes:Extra 'Recipes' for Text Processing
Converting text to numerical features requires specifically created procedures, which are implemented as steps according to the 'recipes' package. These steps allow for tokenization, filtering, counting (tf and tfidf) and feature hashing.
Maintained by Emil Hvitfeldt. Last updated 8 days ago.
2.3 match 160 stars 10.87 score 964 scripts 1 dependents
thothorn
HSAUR:A Handbook of Statistical Analyses Using R (1st Edition)
Functions, data sets, analyses and examples from the book ''A Handbook of Statistical Analyses Using R'' (Brian S. Everitt and Torsten Hothorn, Chapman & Hall/CRC, 2006). The first chapter of the book, which is entitled ''An Introduction to R'', is completely included in this package; for all other chapters, a vignette containing all data analyses is available.
Maintained by Torsten Hothorn. Last updated 3 years ago.
4.0 match 6.07 score 253 scripts 5 dependents
azure
AzureGraph:Simple Interface to 'Microsoft Graph'
A simple interface to the 'Microsoft Graph' API <https://learn.microsoft.com/en-us/graph/overview>. 'Graph' is a comprehensive framework for accessing data in various online Microsoft services. This package was originally intended to provide an R interface only to the 'Azure Active Directory' part, with a view to supporting interoperability of R and 'Azure': users, groups, registered apps and service principals. However it has since been expanded into a more general tool for interacting with Graph. Part of the 'AzureR' family of packages.
Maintained by Hong Ooi. Last updated 2 years ago.
azure-active-directory-graph-apiazure-sdk-rmicrosoft-graph-api
2.3 match 32 stars 10.30 score 36 scripts 21 dependents
mlr-org
mlr3pipelines:Preprocessing Operators and Pipelines for 'mlr3'
Dataflow programming toolkit that enriches 'mlr3' with a diverse set of pipelining operators ('PipeOps') that can be composed into graphs. Operations exist for data preprocessing, model fitting, and ensemble learning. Graphs can themselves be treated as 'mlr3' 'Learners' and can therefore be resampled, benchmarked, and tuned.
Maintained by Martin Binder. Last updated 8 days ago.
baggingdata-sciencedataflow-programmingensemble-learningmachine-learningmlr3pipelinespreprocessingstacking
1.9 match 141 stars 12.36 score 448 scripts 7 dependents
stouffer
rnetcarto:Fast Network Modularity and Roles Computation by Simulated Annealing (Rgraph C Library Wrapper for R)
Provides functions to compute the modularity and modularity-related roles in networks. It is a wrapper around the rgraph library (Guimera & Amaral, 2005, <doi:10.1038/nature03288>).
Maintained by Daniel B. Stouffer. Last updated 2 years ago.
5.0 match 1 stars 4.58 score 38 scripts
r-forge
Sleuth3:Data Sets from Ramsey and Schafer's "Statistical Sleuth (3rd Ed)"
Data sets from Ramsey, F.L. and Schafer, D.W. (2013), "The Statistical Sleuth: A Course in Methods of Data Analysis (3rd ed)", Cengage Learning.
Maintained by Berwin A Turlach. Last updated 1 year ago.
3.6 match 6.38 score 522 scripts
paws-r
paws:Amazon Web Services Software Development Kit
Interface to Amazon Web Services <https://aws.amazon.com>, including storage, database, and compute services, such as 'Simple Storage Service' ('S3'), 'DynamoDB' 'NoSQL' database, and 'Lambda' functions-as-a-service.
Maintained by Dyfan Jones. Last updated 3 days ago.
2.0 match 332 stars 11.25 score 177 scripts 12 dependents
jiefei-wang
aws.ecx:Communicating with AWS EC2 and ECS using AWS REST APIs
Provides functions for communicating with Amazon Web Services (AWS) Elastic Compute Cloud (EC2) and Elastic Container Service (ECS). The functions have the prefix 'ecs_' or 'ec2_' depending on the class of the API. Requests are sent via the REST API, and the parameters are given by the function arguments. The credentials can be set via 'aws_set_credentials'. The EC2 documentation can be found at <https://docs.aws.amazon.com/AWSEC2/latest/APIReference/Welcome.html> and the ECS documentation at <https://docs.aws.amazon.com/AmazonECS/latest/APIReference/Welcome.html>.
Maintained by Jiefei Wang. Last updated 3 years ago.
5.3 match 1 stars 4.18 score 2 scripts
thothorn
HSAUR2:A Handbook of Statistical Analyses Using R (2nd Edition)
Functions, data sets, analyses and examples from the second edition of the book ''A Handbook of Statistical Analyses Using R'' (Brian S. Everitt and Torsten Hothorn, Chapman & Hall/CRC, 2008). The first chapter of the book, which is entitled ''An Introduction to R'', is completely included in this package; for all other chapters, a vignette containing all data analyses is available. In addition, the package contains Sweave code for producing slides for selected chapters (see HSAUR2/inst/slides).
Maintained by Torsten Hothorn. Last updated 2 years ago.
4.0 match 5.51 score 181 scripts 1 dependents
bioc
DiffLogo:DiffLogo: A comparative visualisation of biooligomer motifs
DiffLogo is an easy-to-use tool to visualize motif differences.
Maintained by Hendrik Treutler. Last updated 5 months ago.
softwaresequencematchingmultiplecomparisonmotifannotationvisualizationalignment
3.2 match 8 stars 6.66 score 27 scripts
jkropko
coxed:Duration-Based Quantities of Interest for the Cox Proportional Hazards Model
Functions for generating, simulating, and visualizing expected durations and marginal changes in duration from the Cox proportional hazards model as described in Kropko and Harden (2017) <doi:10.1017/S000712341700045X> and Harden and Kropko (2018) <doi:10.1017/psrm.2018.19>.
Maintained by Jonathan Kropko. Last updated 4 years ago.
3.3 match 25 stars 6.00 score 132 scripts 1 dependents
bioc
ReactomeGraph4R:Interface for the Reactome Graph Database
Pathways, reactions, and biological entities in Reactome knowledge are systematically represented as an ordered network. Instances are represented as nodes and relationships between instances as edges; they are all stored in the Reactome Graph Database. This package serves as an interface to query the interconnected data from a local Neo4j database, with the aim of minimizing the usage of Neo4j Cypher queries.
Maintained by Chi-Lam Poon. Last updated 5 months ago.
dataimportpathwaysreactomenetworkgraphandnetwork
3.5 match 6 stars 5.26 score 6 scripts
paws-r
paws.security.identity:'Amazon Web Services' Security, Identity, & Compliance Services
Interface to 'Amazon Web Services' security, identity, and compliance services, including the 'Identity & Access Management' ('IAM') service for managing access to services and resources, and more <https://aws.amazon.com/>.
Maintained by Dyfan Jones. Last updated 3 days ago.
2.0 match 332 stars 9.17 score 15 dependents
bioc
OmnipathR:OmniPath web service client and more
A client for the OmniPath web service (https://www.omnipathdb.org) and many other resources. It also includes functions to transform and pretty print some of the downloaded data, functions to access a number of other resources such as BioPlex, ConsensusPathDB, EVEX, Gene Ontology, Guide to Pharmacology (IUPHAR/BPS), Harmonizome, HTRIdb, Human Phenotype Ontology, InWeb InBioMap, KEGG Pathway, Pathway Commons, Ramilowski et al. 2015, RegNetwork, ReMap, TF census, TRRUST and Vinayagam et al. 2011. Furthermore, OmnipathR features a close integration with the NicheNet method for ligand activity prediction from transcriptomics data, and its R implementation `nichenetr` (available only on github).
Maintained by Denes Turei. Last updated 18 days ago.
graphandnetworknetworkpathwayssoftwarethirdpartyclientdataimportdatarepresentationgenesignalinggeneregulationsystemsbiologytranscriptomicssinglecellannotationkeggcomplexesenzyme-ptmnetworksnetworks-biologyomnipathproteinsquarto
1.8 match 126 stars 9.90 score 226 scripts 2 dependents
civisanalytics
civis:R Client for the 'Civis Platform API'
A convenient interface for making requests directly to the 'Civis Platform API' <https://www.civisanalytics.com/platform/>. Full documentation available 'here' <https://civisanalytics.github.io/civis-r/>.
Maintained by Peter Cooman. Last updated 2 months ago.
2.3 match 16 stars 7.84 score 144 scripts
bioc
DAPAR:Tools for the Differential Analysis of Proteins Abundance with R
The package DAPAR is a Bioconductor-distributed R package which provides all the necessary functions to analyze quantitative data from label-free proteomics experiments. Contrary to most other similar R packages, it is endowed with rich and user-friendly graphical interfaces, so that no programming skill is required (see `Prostar` package).
Maintained by Samuel Wieczorek. Last updated 5 months ago.
proteomicsnormalizationpreprocessingmassspectrometryqualitycontrolgodataimportprostar1
3.2 match 2 stars 5.42 score 22 scripts 1 dependents
melff
memisc:Management of Survey Data and Presentation of Analysis Results
An infrastructure for the management of survey data including value labels, definable missing values, recoding of variables, production of code books, and import of (subsets of) 'SPSS' and 'Stata' files is provided. Further, the package can produce tables and data frames of arbitrary descriptive statistics and (almost) publication-ready tables of regression model estimates, which can be exported to 'LaTeX' and HTML.
Maintained by Martin Elff. Last updated 11 days ago.
1.3 match 46 stars 12.34 score 1.2k scripts 13 dependents
mages
googleVis:R Interface to Google Charts
R interface to Google's chart tools, allowing users to create interactive charts based on data frames. Charts are displayed locally via the R HTTP help server. A modern browser with an Internet connection is required. The data remains local and is not uploaded to Google.
Maintained by Markus Gesmann. Last updated 10 months ago.
1.3 match 361 stars 12.98 score 2.4k scripts 11 dependents
brian-j-smith
MachineShop:Machine Learning Models and Tools
Meta-package for statistical and machine learning with a unified interface for model fitting, prediction, performance assessment, and presentation of results. Approaches for model fitting and prediction of numerical, categorical, or censored time-to-event outcomes include traditional regression models, regularization methods, tree-based methods, support vector machines, neural networks, ensembles, data preprocessing, filtering, and model tuning and selection. Performance metrics are provided for model assessment and can be estimated with independent test sets, split sampling, cross-validation, or bootstrap resampling. Resample estimation can be executed in parallel for faster processing and nested in cases of model tuning and selection. Modeling results can be summarized with descriptive statistics; calibration curves; variable importance; partial dependence plots; confusion matrices; and ROC, lift, and other performance curves.
Maintained by Brian J Smith. Last updated 7 months ago.
classification-modelsmachine-learningpredictive-modelingregression-modelssurvival-models
2.0 match 61 stars 7.95 score 121 scripts
datawookie
clockify:A Wrapper for the 'Clockify' API
A wrapper for the Clockify API <https://docs.clockify.me/>, making it possible to query, insert and update time keeping data.
Maintained by Andrew B. Collier. Last updated 10 months ago.
4.0 match 2 stars 3.95 score 6 scripts
bioc
PIUMA:Phenotypes Identification Using Mapper from topological data Analysis
The PIUMA package offers a tidy pipeline of Topological Data Analysis frameworks to identify and characterize communities in high and heterogeneous dimensional data.
Maintained by Mattia Chiesa. Last updated 5 months ago.
clusteringgraphandnetworkdimensionreductionnetworkclassification
3.0 match 4 stars 5.08 score 2 scripts
huanglabumn
oncoPredict:Drug Response Modeling and Biomarker Discovery
Allows for building drug response models using screening data between bulk RNA-Seq and a drug response metric and two additional tools for biomarker discovery that have been developed by the Huang Laboratory at University of Minnesota. There are 3 main functions within this package. (1) calcPhenotype is used to build drug response models on RNA-Seq data and impute them on any other RNA-Seq dataset given to the model. (2) GLDS is used to calculate the general level of drug sensitivity, which can improve biomarker discovery. (3) IDWAS can take the results from calcPhenotype and link the imputed response back to available genomic data (mutation and CNV alterations) to identify biomarkers. Each of these functions comes from a paper from the Huang research laboratory. The relevant paper for each function is given below. calcPhenotype - Geeleher et al, Clinical drug response can be predicted using baseline gene expression levels and in vitro drug sensitivity in cell lines. GLDS - Geeleher et al, Cancer biomarker discovery is improved by accounting for variability in general levels of drug sensitivity in pre-clinical models. IDWAS - Geeleher et al, Discovering novel pharmacogenomic biomarkers by imputing drug response in cancer patients from large genomics studies.
Maintained by Robert Gruener. Last updated 12 months ago.
svapreprocesscorestringrbiomartgenefilterorg.hs.eg.dbgenomicfeaturestxdb.hsapiens.ucsc.hg19.knowngenetcgabiolinksbiocgenericsgenomicrangesirangess4vectors
2.4 match 18 stars 6.47 score 41 scripts
nicebread
fSRM:Social Relations Analyses with Roles ("Family SRM")
Social Relations Analyses with roles ("Family SRM") are computed, using a structural equation modeling approach. Groups ranging from three members up to an unlimited number of members are supported and the mean structure can be computed. Means and variances can be compared between different groups of families and between roles.
Maintained by Felix Schönbrodt. Last updated 4 years ago.
14.7 match 1.04 score 11 scripts
dyfanjones
noctua:Connect to 'AWS Athena' using R 'AWS SDK' 'paws' ('DBI' Interface)
Designed to be compatible with the 'R' package 'DBI' (Database Interface) when connecting to Amazon Web Service ('AWS') Athena <https://aws.amazon.com/athena/>. To do this the 'R' 'AWS' Software Development Kit ('SDK') 'paws' <https://github.com/paws-r/paws> is used as a driver.
Maintained by Dyfan Jones. Last updated 11 months ago.
1.9 match 46 stars 7.48 score 58 scripts
mspeekenbrink
sdamr:Statistics: Data Analysis and Modelling
Data sets and functions to support the books "Statistics: Data analysis and modelling" by Speekenbrink, M. (2021) <https://mspeekenbrink.github.io/sdam-book/> and "An R companion to Statistics: data analysis and modelling" by Speekenbrink, M. (2021) <https://mspeekenbrink.github.io/sdam-r-companion/>. All datasets analysed in these books are provided in this package. In addition, the package provides functions to compute sample statistics (variance, standard deviation, mode), create raincloud and enhanced Q-Q plots, and expand Anova results into omnibus tests and tests of individual contrasts.
Maintained by Maarten Speekenbrink. Last updated 1 month ago.
3.1 match 5 stars 4.39 score 99 scripts
big-life-lab
recodeflow:Contains functions to interface with variable details sheets, including recoding variables and converting them to PMML
Recode and harmonize data using variable and details sheets.
Maintained by Yulric Sequeria. Last updated 5 days ago.
2.0 match 6 stars 6.75 score 7 scripts
dyfanjones
RAthena:Connect to 'AWS Athena' using 'Boto3' ('DBI' Interface)
Designed to be compatible with the R package 'DBI' (Database Interface) when connecting to Amazon Web Service ('AWS') Athena <https://aws.amazon.com/athena/>. To do this 'Python' 'Boto3' Software Development Kit ('SDK') <https://boto3.amazonaws.com/v1/documentation/api/latest/index.html> is used as a driver.
Maintained by Dyfan Jones. Last updated 1 year ago.
1.9 match 37 stars 7.10 score 38 scripts
pablobarbera
Rfacebook:Access to Facebook API via R
Provides an interface to the Facebook API.
Maintained by Pablo Barbera. Last updated 5 years ago.
1.7 match 351 stars 7.75 score 268 scripts
lvclark
polyRAD:Genotype Calling with Uncertainty from Sequencing Data in Polyploids and Diploids
Read depth data from genotyping-by-sequencing (GBS) or restriction site-associated DNA sequencing (RAD-seq) are imported and used to make Bayesian probability estimates of genotypes in polyploids or diploids. The genotype probabilities, posterior mean genotypes, or most probable genotypes can then be exported for downstream analysis. 'polyRAD' is described by Clark et al. (2019) <doi:10.1534/g3.118.200913>, and the Hind/He statistic for marker filtering is described by Clark et al. (2022) <doi:10.1186/s12859-022-04635-9>. A variant calling pipeline for highly duplicated genomes is also included and is described by Clark et al. (2020, Version 1) <doi:10.1101/2020.01.11.902890>.
Maintained by Lindsay V. Clark. Last updated 7 days ago.
bioinformaticsdna-sequencinggenotype-likelihoodsgenotyping-by-sequencinghacktoberfestrad-seqrad-sequencingsnp-genotypingcpp
1.8 match 28 stars 6.98 score 85 scripts
cran
ILSM:Analyze Interconnection Structure of Multilayer Interaction Networks
Although the analysis of the structural characteristics of multilayer networks is well developed, there is still no unified operation that can quickly obtain the corresponding characteristics of a multilayer network. To address this gap, 'ILSM' was designed to calculate such metrics of multilayer networks through the functions of this R package.
Maintained by WeiCheng Sun. Last updated 6 months ago.
3.7 match 3.30 score
vusaverse
vvcanvas:'Canvas' LMS API Integration
Allow R users to interact with the 'Canvas' Learning Management System (LMS) API (see <https://canvas.instructure.com/doc/api/all_resources.html> for details). It provides a set of functions to access and manipulate course data, assignments, grades, users, and other resources available through the 'Canvas' API.
Maintained by Tomer Iwan. Last updated 3 days ago.
canvascanvas-lmscanvas-lms-apicanvasapieducationalinstructure-canvas
1.9 match 7 stars 6.23 score 10 scripts
cran
randomLCA:Random Effects Latent Class Analysis
Fits standard and random effects latent class models. The single level random effects model is described in Qu et al <doi:10.2307/2533043> and the two level random effects model in Beath and Heller <doi:10.1177/1471082X0800900302>. Examples are given for their use in diagnostic testing.
Maintained by Ken Beath. Last updated 6 months ago.
3.8 match 3.10 score 42 scripts
matloff
qeML:Quick and Easy Machine Learning Tools
The letters 'qe' in the package title stand for "quick and easy," alluding to the convenience goal of the package. We bring together a variety of machine learning (ML) tools from standard R packages, providing wrappers with a simple, convenient, and uniform interface.
Maintained by Norm Matloff. Last updated 25 days ago.
1.3 match 41 stars 8.41 score 48 scripts 1 dependents
daroczig
botor:'AWS Python SDK' ('boto3') for R
Fork-safe, raw access to the 'Amazon Web Services' ('AWS') 'SDK' via the 'boto3' 'Python' module, and convenient helper functions to query the 'Simple Storage Service' ('S3') and 'Key Management Service' ('KMS'), partial support for 'IAM', the 'Systems Manager Parameter Store' and 'Secrets Manager'.
Maintained by Gergely Daróczi. Last updated 2 months ago.
amazon-web-servicesawsboto3python
1.7 match 31 stars 6.61 score 32 scripts
mlr-org
mlr3spatiotempcv:Spatiotemporal Resampling Methods for 'mlr3'
Extends the mlr3 machine learning framework with spatio-temporal resampling methods to account for the presence of spatiotemporal autocorrelation (STAC) in predictor variables. STAC may cause highly biased performance estimates in cross-validation if ignored. A JSS article is available at <doi:10.18637/jss.v111.i07>.
Maintained by Patrick Schratz. Last updated 4 months ago.
cross-validationmlr3resamplingresampling-methodsspatialtemporal
1.3 match 50 stars 8.09 score 123 scripts
floschuberth
cSEM:Composite-Based Structural Equation Modeling
Estimate, assess, test, and study linear, nonlinear, hierarchical and multigroup structural equation models using composite-based approaches and procedures, including estimation techniques such as partial least squares path modeling (PLS-PM) and its derivatives (PLSc, ordPLSc, robustPLSc), generalized structured component analysis (GSCA), generalized structured component analysis with uniqueness terms (GSCAm), generalized canonical correlation analysis (GCCA), principal component analysis (PCA), factor score regression (FSR) using sum score, regression or Bartlett scores (including bias correction using Croon’s approach), as well as several tests and typical postestimation procedures (e.g., verify admissibility of the estimates, assess the model fit, test the model fit etc.).
Maintained by Florian Schuberth. Last updated 16 days ago.
1.1 match 28 stars 9.11 score 56 scripts 2 dependents
mbannert
timeseriesdb:A Time Series Database for Official Statistics with R and PostgreSQL
Archive and manage time series data from official statistics. The 'timeseriesdb' package was designed to manage a large catalog of time series from official statistics, which are typically published on a monthly, quarterly or yearly basis. Thus 'timeseriesdb' is optimized to handle updates caused by data revision as well as elaborate, multilingual meta information.
Maintained by Matthias Bannert. Last updated 6 months ago.
1.5 match 24 stars 6.89 score 26 scripts
josesamos
starschemar:Obtaining Stars from Flat Tables
Data in multidimensional systems is obtained from operational systems and is transformed to adapt it to the new structure. Frequently, the operations to be performed aim to transform a flat table into a star schema. Transformations can be carried out using professional extract, transform and load tools or tools intended for data transformation for end users. With the tools mentioned, this transformation can be carried out, but it requires a lot of work. The main objective of this package is to define transformations that allow obtaining stars from flat tables easily. In addition, it includes basic data cleaning, dimension enrichment, incremental data refresh and query operations, adapted to this context.
Maintained by Jose Samos. Last updated 11 months ago.
1.8 match 7 stars 5.66 score 11 scripts 2 dependents
emitanaka
edibble:Encapsulating Elements of Experimental Design
A system to facilitate designing comparative (and non-comparative) experiments using the grammar of experimental designs <https://emitanaka.org/edibble-book/>. An experimental design is treated as an intermediate, mutable object that is built progressively by fundamental experimental components like units, treatments, and their relation. The system aids in experimental planning, management and workflow.
Maintained by Emi Tanaka. Last updated 4 months ago.
1.3 match 217 stars 7.43 score 62 scripts
ropensci
tic:Tasks Integrating Continuously: CI-Agnostic Workflow Definitions
Provides a way to describe common build and deployment workflows for R-based projects: packages, websites (e.g. blogdown, pkgdown), or data processing (e.g. research compendia). The recipe is described independent of the continuous integration tool used for processing the workflow (e.g. 'GitHub Actions' or 'Circle CI'). This package has been peer-reviewed by rOpenSci (v0.3.0.9004).
Maintained by Eli Miller. Last updated 1 months ago.
appveyorcontinuous-integrationdeploymentgithubactionstravis-ci
1.3 match 155 stars 7.57 score 16 scripts
mbojan
isnar:Introduction to Social Network Analysis with R
Functions and datasets accompanying the workshop "Introduction to Social Network Analysis with R" held at the annual INSNA Sunbelt conferences.
Maintained by Michal Bojanowski. Last updated 4 years ago.
3.3 match 8 stars 2.86 score 18 scripts
cloudyr
aws.iam:AWS IAM Client Package
A simple client for the Amazon Web Services ('AWS') Identity and Access Management ('IAM') 'API' <https://aws.amazon.com/iam/>.
Maintained by Simon Urbanek. Last updated 5 years ago.
2.0 match 15 stars 4.65 score 10 scripts
bioc
GenomicRanges:Representation and manipulation of genomic intervals
The ability to efficiently represent and manipulate genomic annotations and alignments is playing a central role when it comes to analyzing high-throughput sequencing data (a.k.a. NGS data). The GenomicRanges package defines general purpose containers for storing and manipulating genomic intervals and variables defined along a genome. More specialized containers for representing and manipulating short alignments against a reference genome, or a matrix-like summarization of an experiment, are defined in the GenomicAlignments and SummarizedExperiment packages, respectively. Both packages build on top of the GenomicRanges infrastructure.
Maintained by Hervé Pagès. Last updated 4 months ago.
geneticsinfrastructuredatarepresentationsequencingannotationgenomeannotationcoveragebioconductor-packagecore-package
0.5 match 44 stars 17.75 score 13k scripts 1.3k dependents
ropensci
CRediTas:Generate CRediT Author Statements
A tiny package to generate CRediT author statements (<https://credit.niso.org/>). It provides three functions: create a template, read it back and generate the CRediT author statement in a text file.
Maintained by Josep Pueyo-Ros. Last updated 2 years ago.
1.9 match 8 stars 4.75 score 14 scripts
jiscah
sequoia:Pedigree Inference from SNPs
Multi-generational pedigree inference from incomplete data on hundreds of SNPs, including parentage assignment and sibship clustering. See Huisman (2017) (<DOI:10.1111/1755-0998.12665>) for more information.
Maintained by Jisca Huisman. Last updated 9 months ago.
pedigreepedigree-reconstructionpedigreessequoiasnpsnp-datafortran
1.2 match 26 stars 7.40 score 79 scripts
jimbrig
lossrx:Actuarial Loss Development and Reserving with R
Actuarial Loss Development and Reserving Helper Functions and ShinyApp.
Maintained by Jimmy Briggs. Last updated 3 months ago.
actuarial-scienceclaims-dataclaims-reservingdata-scienceinsurancemodellingproperty-casualtyreservingrshinyworkflow
1.5 match 14 stars 5.89 score 7 scripts
jglev
veccompare:Perform Set Operations on Vectors, Automatically Generating All n-Wise Comparisons, and Create Markdown Output
Automates set operations (i.e., comparisons of overlap) between multiple vectors. It also contains a function for automating reporting in 'RMarkdown', by generating markdown output for easy analysis, as well as an 'RMarkdown' template for use with 'RStudio'.
Maintained by Jacob Gerard Levernier. Last updated 8 years ago.
2.4 match 8 stars 3.60 score 10 scripts
kzavez
LearnVizLMM:Learning and Communicating Linear Mixed Models Without Data
Summarizes characteristics of linear mixed effects models without data or a fitted model by converting code for fitting lmer() from 'lme4' and lme() from 'nlme' into tables, equations, and visuals. Outputs can be used to learn how to fit linear mixed effects models in 'R' and to communicate about these models in presentations, manuscripts, and analysis plans.
Maintained by Katherine Zavez. Last updated 5 months ago.
2.3 match 3.70 score 2 scripts
drg-123
IIS:Datasets to Accompany Wolfe and Schneider - Intuitive Introductory Statistics
These datasets and functions accompany Wolfe and Schneider (2017) - Intuitive Introductory Statistics (ISBN: 978-3-319-56070-0) <doi:10.1007/978-3-319-56072-4>. They are used in the examples throughout the text and in the end-of-chapter exercises. The datasets are meant to cover a broad range of topics in order to appeal to the diverse set of interests and backgrounds typically present in an introductory Statistics class.
Maintained by Grant Schneider. Last updated 1 months ago.
4.5 match 1.74 score 55 scripts
cran
datarobot:'DataRobot' Predictive Modeling API
For working with the 'DataRobot' predictive modeling platform's API <https://www.datarobot.com/>.
Maintained by AJ Alon. Last updated 1 years ago.
2.3 match 2 stars 3.48 score
bioc
awst:Asymmetric Within-Sample Transformation
We propose an Asymmetric Within-Sample Transformation (AWST) to regularize RNA-seq read counts and reduce the effect of noise on the classification of samples. AWST comprises two main steps: standardization and smoothing. These steps transform gene expression data to reduce the noise of the lowly expressed features, which suffer from background effects and low signal-to-noise ratio, and the influence of the highly expressed features, which may be the result of amplification bias and other experimental artifacts.
Maintained by Davide Risso. Last updated 5 months ago.
normalizationgeneexpressionrnaseqsoftwaretranscriptomicssequencingsinglecell
1.5 match 3 stars 4.95 score 15 scripts
gi0na
ghypernet:Fit and Simulate Generalised Hypergeometric Ensembles of Graphs
Provides functions for model fitting and selection of generalised hypergeometric ensembles of random graphs (gHypEG). To learn how to use it, check the vignettes for a quick tutorial. Please reference its use as Casiraghi, G., Nanumyan, V. (2019) <doi:10.5281/zenodo.2555300> together with the relevant references from those listed below. The package is based on the research developed at the Chair of Systems Design, ETH Zurich. Casiraghi, G., Nanumyan, V., Scholtes, I., Schweitzer, F. (2016) <arXiv:1607.02441>. Casiraghi, G., Nanumyan, V., Scholtes, I., Schweitzer, F. (2017) <doi:10.1007/978-3-319-67256-4_11>. Casiraghi, G. (2017) <arXiv:1702.02048>. Brandenberger, L., Casiraghi, G., Nanumyan, V., Schweitzer, F. (2019) <doi:10.1145/3341161.3342926>. Casiraghi, G. (2019) <doi:10.1007/s41109-019-0241-1>. Casiraghi, G., Nanumyan, V. (2021) <doi:10.1038/s41598-021-92519-y>. Casiraghi, G. (2021) <doi:10.1088/2632-072X/ac0493>.
Maintained by Giona Casiraghi. Last updated 11 months ago.
data-miningdata-sciencegraphsnetworknetwork-analysisrandom-graph-generationrandom-graphs
1.3 match 8 stars 5.68 score 20 scripts
karlines
NetIndices:Estimating Network Indices, Including Trophic Structure of Foodwebs in R
Given a network (e.g. a food web), estimates several network indices. These include: Ascendency network indices, Direct and indirect dependencies, Effective measures, Environ network indices, General network indices, Pathway analysis, Network uncertainty indices and constraint efficiencies and the trophic level and omnivory indices of food webs.
Maintained by Karline Soetaert. Last updated 3 years ago.
1.7 match 3.91 score 134 scripts 2 dependents
bioc
Moonlight2R:Identify oncogenes and tumor suppressor genes from omics data
Understanding cancer mechanisms requires the identification of genes playing a role in the development of the pathology and the characterization of their role (notably oncogenes and tumor suppressors). We present an updated version of the R/bioconductor package called MoonlightR, namely Moonlight2R, which returns a list of candidate driver genes for specific cancer types on the basis of omics data integration. The Moonlight framework contains a primary layer where gene expression data and information about biological processes are integrated to predict genes called oncogenic mediators, divided into putative tumor suppressors and putative oncogenes. This is done through functional enrichment analyses, gene regulatory networks and upstream regulator analyses to score the importance of well-known biological processes with respect to the studied cancer type. By evaluating the effect of the oncogenic mediators on biological processes or through random forests, the primary layer predicts two putative roles for the oncogenic mediators: i) tumor suppressor genes (TSGs) and ii) oncogenes (OCGs). As gene expression data alone is not enough to explain the deregulation of the genes, a second layer of evidence is needed. We have automated the integration of a secondary mutational layer through new functionalities in Moonlight2R. These functionalities analyze mutations in the cancer cohort and classify them into driver and passenger mutations using the driver mutation prediction tool, CScape-somatic. Those oncogenic mediators with at least one driver mutation are retained as the driver genes. As a consequence, this methodology not only identifies genes playing a dual role (e.g. TSG in one cancer type and OCG in another) but also helps in elucidating the biological processes underlying their specific roles. In particular, Moonlight2R can be used to discover OCGs and TSGs in the same cancer type.
This may for instance help in answering the question whether some genes change role between early stages (I, II) and late stages (III, IV). In the future, this analysis could be useful to determine the causes of different resistances to chemotherapeutic treatments. An additional mechanistic layer evaluates if there are mutations affecting the protein stability of the transcription factors (TFs) of the TSGs and OCGs, as that may have an effect on the expression of the genes.
Maintained by Matteo Tiberti. Last updated 2 months ago.
dnamethylationdifferentialmethylationgeneregulationgeneexpressionmethylationarraydifferentialexpressionpathwaysnetworksurvivalgenesetenrichmentnetworkenrichment
1.0 match 5 stars 6.59 score 43 scripts
openvolley
ovlytics:Functions and Algorithms for Volleyball Analytics
Analytical functions for volleyball analytics, to be used in conjunction with the datavolley and peranavolley packages.
Maintained by Ben Raymond. Last updated 3 months ago.
2.0 match 3.13 score 9 scripts 3 dependents
cran
airGRteaching:Teaching Hydrological Modelling with the GR Rainfall-Runoff Models ('Shiny' Interface Included)
Add-on package to the 'airGR' package that simplifies its use and is aimed at teaching hydrology. The package provides 1) three functions that make it very simple to complete a hydrological modelling exercise, 2) plotting functions to help students explore observed data and interpret the results of calibration and simulation of the GR ('Génie rural') models, and 3) a 'Shiny' graphical interface that displays the impact of model parameters on hydrographs and on the models' internal variables.
Maintained by Olivier Delaigue. Last updated 1 months ago.
1.3 match 6 stars 4.82 score
ninohardt
echoice2:Choice Models with Economic Foundation
Implements choice models based on economic theory, including estimation using Markov chain Monte Carlo (MCMC), prediction, and more. Its usability is inspired by ideas from 'tidyverse'. Models include versions of the Hierarchical Multinomial Logit and Multiple Discrete-Continuous (Volumetric) models with and without screening. The foundations of these models are described in Allenby, Hardt and Rossi (2019) <doi:10.1016/bs.hem.2019.04.002>. Models with conjunctive screening are described in Kim, Hardt, Kim and Allenby (2022) <doi:10.1016/j.ijresmar.2022.04.001>. Models with set-size variation are described in Hardt and Kurz (2020) <doi:10.2139/ssrn.3418383>.
Maintained by Nino Hardt. Last updated 1 years ago.
choice-modelsopenblascppopenmp
1.5 match 1 stars 4.00 score 7 scripts
ifpri
ARIA:App for IMPACT (App foR ImpAct)
App for IMPACT (App foR ImpAct).
Maintained by Abhijeet Mishra. Last updated 10 months ago.
2.2 match 2.70 score 6 scripts
ncss-tech
aqp:Algorithms for Quantitative Pedology
The Algorithms for Quantitative Pedology (AQP) project was started in 2009 to organize a loosely-related set of concepts and source code on the topic of soil profile visualization, aggregation, and classification into this package (aqp). Over the past 8 years, the project has grown into a suite of related R packages that enhance and simplify the quantitative analysis of soil profile data. Central to the AQP project is a new vocabulary of specialized functions and data structures that can accommodate the inherent complexity of soil profile information, freeing the scientist to focus on ideas rather than boilerplate data processing tasks <doi:10.1016/j.cageo.2012.10.020>. These functions and data structures have been extensively tested and documented, applied to projects involving hundreds of thousands of soil profiles, and deeply integrated into widely used tools such as SoilWeb <https://casoilresource.lawr.ucdavis.edu/soilweb-apps>. Components of the AQP project (aqp, soilDB, sharpshootR, soilReports packages) serve an important role in routine data analysis within the USDA-NRCS Soil Science Division. The AQP suite of R packages offers a convenient platform for bridging the gap between pedometric theory and practice.
Maintained by Dylan Beaudette. Last updated 28 days ago.
digital-soil-mappingncss-technrcspedologypedometricssoilsoil-surveyusda
0.5 match 55 stars 11.77 score 1.2k scripts 2 dependents
dyfanjones
smdocker:Build 'Docker Images' in 'Amazon SageMaker Studio' using 'Amazon Web Service CodeBuild'
Allows users to easily build custom 'docker images' <https://docs.docker.com/> from 'Amazon Web Service Sagemaker' <https://aws.amazon.com/sagemaker/> using 'Amazon Web Service CodeBuild' <https://aws.amazon.com/codebuild/>.
Maintained by Dyfan Jones. Last updated 2 years ago.
1.7 match 5 stars 3.40 score 4 scripts
psolymos
clickrup:Interacting with the ClickUp v2 API from R
Work with the ClickUp productivity app from R to manage tasks, goals, time tracking, and more.
Maintained by Peter Solymos. Last updated 1 years ago.
apiclickupclickup-apiproject-management
1.8 match 18 stars 3.26 score 7 scripts
jhmaindonald
hddplot:Use Known Groups in High-Dimensional Data to Derive Scores for Plots
Cross-validated linear discriminant calculations determine the optimum number of features. Test and training scores from successive cross-validation steps determine, via a principal components calculation, a low-dimensional global space onto which test scores are projected, in order to plot them. Further functions are included that are intended for didactic use. The package implements, and extends, methods described in J.H. Maindonald and C.J. Burden (2005) <https://journal.austms.org.au/V46/CTAC2004/Main/home.html>.
Maintained by John Maindonald. Last updated 2 years ago.
1.7 match 3.00 score 10 scripts
dyfanjones
sagemaker.core:Sagemaker core classes, methods and functions
Contains core classes, methods and functions that support `AWS Sagemaker R Software Development Kit (SDK)`.
Maintained by Dyfan Jones. Last updated 3 years ago.
amazon-sagemakerawsmachine-learningsagemakersdk
1.7 match 2.88 score 1 scripts 5 dependents
tobiasschoch
wbacon:Weighted BACON Algorithms
The BACON algorithms are methods for multivariate outlier nomination (detection) and robust linear regression by Billor, Hadi, and Velleman (2000) <doi:10.1016/S0167-9473(99)00101-2>. The extension to weighted problems is due to Beguin and Hulliger (2008) <https://www150.statcan.gc.ca/n1/en/catalogue/12-001-X200800110616>; see also <doi:10.21105/joss.03238>.
Maintained by Tobias Schoch. Last updated 6 months ago.
outlieroutlier-detectionrobust-regressionstatisticsopenblasopenmp
1.2 match 2 stars 4.00 score 8 scripts
bioc
midasHLA:R package for immunogenomics data handling and association analysis
MiDAS is an R package for immunogenetics data transformation and statistical analysis. MiDAS accepts input data in the form of HLA alleles and KIR types, and can transform it into biologically meaningful variables, enabling HLA amino acid fine mapping, analyses of HLA evolutionary divergence, KIR gene presence, as well as validated HLA-KIR interactions. Further, it allows comprehensive statistical association analysis workflows with phenotypes of diverse measurement scales. MiDAS closes a gap between the inference of immunogenetic variation and its efficient utilization to make relevant discoveries related to T cell, Natural Killer cell, and disease biology.
Maintained by Maciej Migdał. Last updated 5 months ago.
cellbiologygeneticsstatisticalmethod
1.1 match 4.30 score 3 scripts
inqs909
csucistats:CSU Channel Islands R Tools
An R package containing functions for statistics courses at CSUCI.
Maintained by Isaac Quintanilla Salinas. Last updated 2 months ago.
1.6 match 2.99 score 14 scripts
johannes-titz
passt:Probability Associator Time (PASS-T)
Simulates judgments of frequency and duration based on the Probability Associator Time (PASS-T) model. PASS-T is a memory model based on a simple competitive artificial neural network. It can imitate human judgments of frequency and duration, which have been extensively studied in cognitive psychology (e.g. Hintzman (1970) <doi:10.1037/h0028865>, Betsch et al. (2010) <https://psycnet.apa.org/record/2010-18204-003>). The PASS-T model is an extension of the PASS model (Sedlmeier, 2002, ISBN:0198508638). The package provides an easy way to run simulations, which can then be compared with empirical data in human judgments of frequency and duration.
Maintained by Johannes Titz. Last updated 4 years ago.
1.3 match 3.70 score 3 scripts
abdisalammuse
AHSurv:Flexible Parametric Accelerated Hazards Models
Flexible parametric Accelerated Hazards (AH) regression models in overall and relative survival frameworks with 13 distinct Baseline Distributions. The AH Model can also be applied to lifetime data with crossed survival curves. Any user-defined parametric distribution can be fitted, given at least an R function defining the cumulative hazard and hazard rate functions. See Chen and Wang (2000) <doi:10.1080/01621459.2000.10474236>, and Lee (2015) <doi:10.1007/s10985-015-9349-5> for more details.
Maintained by Abdisalam Hassan Muse. Last updated 3 years ago.
3.1 match 1.48 score 1 dependents
nhs-r-community
NHSRtools:NHS-R Tools
Provides tools commonly used by Analysts within the NHS.
Maintained by Tom Jemmett. Last updated 4 years ago.
2.3 match 2 stars 2.00 score 1 scripts
ropengov
mpg:FuelEconomy.gov Data
Extract fuel economy data from FuelEconomy.gov.
Maintained by Thomas J. Leeper. Last updated 3 years ago.
1.6 match 12 stars 2.78 score
erictleung
pyblack:Style Python code blocks with black
RStudio addin to help format and style Python code in RMarkdown and Quarto documents with the Python code formatter, black.
Maintained by Eric Leung. Last updated 9 months ago.
blackformattingpythonrstudiorstudio-addin
1.6 match 8 stars 2.60 score
shah-in-boots
rmdl:A Causality-Informed Modeling Approach
A system for describing and manipulating the many models that are generated in causal inference and data analysis projects, as based on the causal theory and criteria of Austin Bradford Hill (1965) <doi:10.1177/003591576505800503>. This system includes the addition of formal attributes that modify base `R` objects, including terms and formulas, with a focus on variable roles in the "do-calculus" of modeling, as described in Pearl (2010) <doi:10.2202/1557-4679.1203>. For example, the definition of exposure, outcome, and interaction are implicit in the roles variables take in a formula. These premises allow for a more fluent modeling approach focusing on variable relationships, and assessing effect modification, as described by VanderWeele and Robins (2007) <doi:10.1097/EDE.0b013e318127181b>. The essential goal is to help contextualize formulas and models in causality-oriented workflows.
Maintained by Anish S. Shah. Last updated 10 months ago.
epidemiologymodelingstatistics
0.8 match 4.60 score 7 scripts
lcrawlab
mvMAPIT:Multivariate Genome Wide Marginal Epistasis Test
Epistasis, commonly defined as the interaction between genetic loci, is known to play an important role in the phenotypic variation of complex traits. As a result, many statistical methods have been developed to identify genetic variants that are involved in epistasis, and nearly all of these approaches carry out this task by focusing on analyzing one trait at a time. Previous studies have shown that jointly modeling multiple phenotypes can often dramatically increase statistical power for association mapping. In this package, we present the 'multivariate MArginal ePIstasis Test' ('mvMAPIT') – a multi-outcome generalization of a recently proposed epistatic detection method which seeks to detect marginal epistasis or the combined pairwise interaction effects between a given variant and all other variants. By searching for marginal epistatic effects, one can identify genetic variants that are involved in epistasis without the need to identify the exact partners with which the variants interact – thus, potentially alleviating much of the statistical and computational burden associated with conventional explicit search based methods. Our proposed 'mvMAPIT' builds upon this strategy by taking advantage of correlation structure between traits to improve the identification of variants involved in epistasis. We formulate 'mvMAPIT' as a multivariate linear mixed model and develop a multi-trait variance component estimation algorithm for efficient parameter inference and P-value computation. Together with reasonable model approximations, our proposed approach is scalable to moderately sized genome-wide association studies. Crawford et al. (2017) <doi:10.1371/journal.pgen.1006869>. Stamp et al. (2023) <doi:10.1093/g3journal/jkad118>.
Maintained by Julian Stamp. Last updated 5 months ago.
cppepistasisepistasis-analysisgwasgwas-toolslinear-mixed-modelsmapitmvmapitvariance-componentsopenblascppopenmp
0.5 match 11 stars 6.90 score 17 scripts 1 dependents
bioc
gDRstyle:A package with style requirements for the gDR suite
This package serves as a helper for the whole gDR suite. It supports good development practices by maintaining style requirements and style tests for the other packages. It also contains build helpers that ensure all package requirements are met.
Maintained by Arkadiusz Gladki. Last updated 1 months ago.
0.5 match 2 stars 6.10 score 2 scripts
bioc
scMET:Bayesian modelling of cell-to-cell DNA methylation heterogeneity
High-throughput single-cell measurements of DNA methylomes can quantify methylation heterogeneity and uncover its role in gene regulation. However, technical limitations and sparse coverage can preclude this task. scMET is a hierarchical Bayesian model which overcomes sparsity, sharing information across cells and genomic features to robustly quantify genuine biological heterogeneity. scMET can identify highly variable features that drive epigenetic heterogeneity, and perform differential methylation and variability analyses. We illustrate how scMET facilitates the characterization of epigenetically distinct cell populations and how it enables the formulation of novel hypotheses on the epigenetic regulation of gene expression.
Maintained by Andreas C. Kapourani. Last updated 5 months ago.
immunooncologydnamethylationdifferentialmethylationdifferentialexpressiongeneexpressiongeneregulationepigeneticsgeneticsclusteringfeatureextractionregressionbayesiansequencingcoveragesinglecellbayesian-inferencegeneralised-linear-modelsheterogeneityhierarchical-modelsmethylation-analysissingle-cellcpp
0.5 match 20 stars 6.23 score 42 scripts
rsetienne
secsse:Several Examined and Concealed States-Dependent Speciation and Extinction
Simultaneously infers state-dependent diversification across two or more states of a single or multiple traits while accounting for the role of a possible concealed trait. See Herrera-Alsina et al. (2019) <doi:10.1093/sysbio/syy057>.
Maintained by Rampal S. Etienne. Last updated 11 months ago.
0.5 match 1 stars 5.83 score 34 scripts
cran
latcontrol:Evaluation of the Role of Control Variables in Structural Equation Models
Various ways to evaluate the effects of including one or more control variables in structural equation models on model-implied variances, covariances, and parameter estimates. The derivation of the methodology employed in this package can be obtained from Blötner (2023) <doi:10.31234/osf.io/dy79z>.
Maintained by Christian Blötner. Last updated 9 months ago.
2.9 match 1.00 score
bioc
PICB:piRNA Cluster Builder
piRNAs (short for PIWI-interacting RNAs) and their PIWI protein partners play a key role in fertility and maintaining genome integrity by restricting mobile genetic elements (transposons) in germ cells. piRNAs originate from genomic regions known as piRNA clusters. The piRNA Cluster Builder (PICB) is a versatile toolkit designed to identify genomic regions with a high density of piRNAs. It constructs piRNA clusters through a stepwise integration of unique and multimapping piRNAs and offers wide-ranging parameter settings, supported by an optimization function that allows users to test different parameter combinations to tailor the analysis to their specific piRNA system. The output includes extensive metadata columns, enabling researchers to rank clusters and extract cluster characteristics.
Maintained by Franziska Ahrend. Last updated 1 months ago.
geneticsgenomeannotationsequencingfunctionalpredictioncoveragetranscriptomics
0.5 match 5 stars 5.57 score
amoeba
pkgsci:Does science on R packages
Various utility functions for analyzing coding practices in R packages.
Maintained by Bryce Mecum. Last updated 6 years ago.
1.6 match 1.70 score 1 scripts
bioc
GRaNIE:GRaNIE: Reconstruction cell type specific gene regulatory networks including enhancers using single-cell or bulk chromatin accessibility and RNA-seq data
Genetic variants associated with diseases often affect non-coding regions, thus likely having a regulatory role. To understand the effects of genetic variants in these regulatory regions, identifying genes that are modulated by specific regulatory elements (REs) is crucial. The effect of gene regulatory elements, such as enhancers, is often cell-type specific, likely because the combinations of transcription factors (TFs) that are regulating a given enhancer have cell-type specific activity. This TF activity can be quantified with existing tools such as diffTF and captures differences in binding of a TF in open chromatin regions. Collectively, this forms a gene regulatory network (GRN) with cell-type and data-specific TF-RE and RE-gene links. Here, we reconstruct such a GRN using single-cell or bulk RNAseq and open chromatin (e.g., using ATACseq or ChIPseq for open chromatin marks) and optionally (Capture) Hi-C data. Our network contains different types of links, connecting TFs to regulatory elements, the latter of which is connected to genes in the vicinity or within the same chromatin domain (TAD). We use a statistical framework to assign empirical FDRs and weights to all links using a permutation-based approach.
Maintained by Christian Arnold. Last updated 5 months ago.
softwaregeneexpressiongeneregulationnetworkinferencegenesetenrichmentbiomedicalinformaticsgeneticstranscriptomicsatacseqrnaseqgraphandnetworkregressiontranscriptionchipseq
0.5 match 5.40 score 24 scripts
hezibu
alien:Estimate Invasive and Alien Species (IAS) Introduction Rates
Easily estimate the introduction rates of alien species given first records data. It specializes in addressing the role of sampling on the pattern of discoveries, thus providing better estimates than using Generalized Linear Models which assume perfect immediate detection of newly introduced species.
Maintained by Yehezkel Buba. Last updated 9 months ago.
0.5 match 1 stars 5.08 score 10 scripts
mbq
vistla:Detecting Influence Paths with Information Theory
Traces information spread through interactions between features, utilising information theory measures and a higher-order generalisation of the concept of widest paths in graphs. In particular, 'vistla' can be used to better understand the results of high-throughput biomedical experiments, by organising the effects of the investigated intervention in a tree-like hierarchy from direct to indirect ones, following the plausible information relay circuits. Due to its higher-order nature, 'vistla' can handle multi-modality and assign multiple roles to a single feature.
Maintained by Miron B. Kursa. Last updated 24 days ago.
0.5 match 4.78 score 3 scripts
thiyangt
DSjobtracker:What Skills and Qualifications are Required for Data Science Related Jobs?
Dataset containing information about job listings for data science job roles.
Maintained by Thiyanga S. Talagala. Last updated 1 years ago.
datasetqualificationsskillsstatisticstidy
0.6 match 3 stars 4.29 score 13 scripts
stscl
sshicm:Information Consistency-Based Measures for Spatial Stratified Heterogeneity
Spatial stratified heterogeneity (SSH) denotes the coexistence of within-strata homogeneity and between-strata heterogeneity. Information consistency-based methods provide a rigorous approach to quantify SSH and evaluate its role in spatial processes, grounded in principles of geographical stratification and information theory (Bai, H. et al. (2023) <doi:10.1080/24694452.2023.2223700>; Wang, J. et al. (2024) <doi:10.1080/24694452.2023.2289982>).
Maintained by Wenbo Lv. Last updated 3 months ago.
geoinformatics geospatial-analysis information-theory spatial-statistics spatial-stratified-heterogeneity cpp
0.5 match 3 stars 4.65 score 2 scripts
bioc
NoRCE:NoRCE: Noncoding RNA Sets Cis Annotation and Enrichment
While some non-coding RNAs (ncRNAs) are assigned critical regulatory roles, most remain functionally uncharacterized. This presents a challenge whenever an interesting set of ncRNAs needs to be analyzed in a functional context. Transcripts located close by on the genome are often regulated together. This genomic proximity on the sequence can hint at a functional association. We present a tool, NoRCE, that performs cis enrichment analysis for a given set of ncRNAs. Enrichment is carried out using the functional annotations of the coding genes located proximal to the input ncRNAs. Other biologically relevant information such as topologically associating domain (TAD) boundaries, co-expression patterns, and miRNA target prediction information can be incorporated to conduct a richer enrichment analysis. To this end, NoRCE includes several relevant datasets as part of its data repository, including cell-line specific TAD boundaries, functional gene sets, and expression data for coding & ncRNAs specific to cancer. Additionally, users can utilize custom data files in their investigation. Enrichment results can be retrieved in a tabular format or visualized in several different ways. NoRCE is currently available for the following species: human, mouse, rat, zebrafish, fruit fly, worm, and yeast.
Maintained by Gulden Olgun. Last updated 5 months ago.
biologicalquestion differentialexpression genomeannotation genesetenrichment genetarget genomeassembly go
0.5 match 1 star 4.60 score 6 scripts
josesamos
geomultistar:Multidimensional Queries Enriched with Geographic Data
Multidimensional systems allow complex queries to be carried out in an easy way. The geographical dimension, together with the temporal dimension, plays a fundamental role in multidimensional systems. Through this package, vector geographic data layers can be associated to the attributes of geographic dimensions, so that the results of multidimensional queries can be obtained directly as vector layers. The multidimensional structures on which we can define the queries can be created from a flat table or imported directly using functions from this package.
Maintained by Jose Samos. Last updated 8 months ago.
0.5 match 2 stars 4.48 score 8 scripts 1 dependent
kisungyou
mclustcomp:Measures for Comparing Clusters
Given a set of data points, a clustering is defined as a disjoint partition where each pair of sets in a partition has no overlapping elements. This package provides 25 methods that play a role somewhat similar to distance or metric that measures similarity of two clusterings - or partitions. For a more detailed description, see Meila, M. (2005) <doi:10.1145/1102351.1102424>.
Maintained by Kisung You. Last updated 2 years ago.
0.5 match 1 star 4.43 score 18 scripts 10 dependents
bioc
RLassoCox:A reweighted Lasso-Cox by integrating gene interaction information
RLassoCox is a package that implements the RLasso-Cox model proposed by Wei Liu. The RLasso-Cox model integrates gene interaction information into the Lasso-Cox model for accurate survival prediction and survival biomarker discovery. It is based on the hypothesis that topologically important genes in the gene interaction network tend to have stable expression changes. The RLasso-Cox model uses random walk to evaluate the topological weight of genes, and then highlights topologically important genes to improve the generalization ability of the Lasso-Cox model. The RLasso-Cox model has the advantage of identifying small gene sets with high prognostic performance on independent datasets, which may play an important role in identifying robust survival biomarkers for various cancer types.
Maintained by Wei Liu. Last updated 5 months ago.
survival regression geneexpression geneprediction network
0.5 match 3 stars 4.48 score 2 scripts
bioc
flowSpecs:Tools for processing of high-dimensional cytometry data
This package is intended to fill the role of conventional cytometry pre-processing software, for spectral decomposition, transformation, visualization and cleanup, and to aid further downstream analyses, such as with DepecheR, by enabling transformation of flowFrames and flowSets to dataframes. Functions for flowCore-compliant automatic 1D-gating/filtering are in the pipeline. The package name has been chosen both as it will deal with spectral cytometry and as it will hopefully give the user a nice pair of spectacles through which to view their data.
Maintained by Jakob Theorell. Last updated 5 months ago.
software cellbasedassays datarepresentation immunooncology flowcytometry singlecell visualization normalization dataimport
0.5 match 6 stars 4.38 score 7 scripts
bioc
transite:RNA-binding protein motif analysis
transite is a computational method that allows comprehensive analysis of the regulatory role of RNA-binding proteins in various cellular processes by leveraging preexisting gene expression data and current knowledge of binding preferences of RNA-binding proteins.
Maintained by Konstantin Krismer. Last updated 5 months ago.
geneexpression transcription differentialexpression microarray mrnamicroarray genetics genesetenrichment cpp
0.5 match 4.30 score 20 scripts
bbcrown
solrad:Calculating Solar Radiation and Related Variables Based on Location, Time and Topographical Conditions
For surface energy models and estimation of solar positions and components with varying topography, time and locations. The functions calculate solar top-of-atmosphere, open, diffuse and direct components, atmospheric transmittance and diffuse factors, day length, sunrise and sunset, solar azimuth, zenith, altitude, incidence, and hour angles, earth declination angle, equation of time, and solar constant. Details about the methods and equations are explained in Seyednasrollah, Bijan, Mukesh Kumar, and Timothy E. Link. 'On the role of vegetation density on net snow cover radiation at the forest floor.' Journal of Geophysical Research: Atmospheres 118.15 (2013): 8359-8374, <doi:10.1002/jgrd.50575>.
Maintained by Bijan Seyednasrollah. Last updated 6 years ago.
0.5 match 10 stars 4.32 score 42 scripts
josesamos
geodimension:Definition of Geographic Dimensions
The geographic dimension plays a fundamental role in multidimensional systems. To define a geographic dimension in a star schema, we need a table with attributes corresponding to the levels of the dimension. Additionally, we will also need one or more geographic layers to represent the data using this dimension. The goal of this package is to support the definition of geographic dimensions from layers of geographic information related to each other. It makes it easy to define relationships between layers and obtain the necessary data from them.
Maintained by Jose Samos. Last updated 1 years ago.
0.5 match 2 stars 4.00 score 8 scripts
bioc
DMCHMM:Differentially Methylated CpG using Hidden Markov Model
A pipeline for identifying differentially methylated CpG sites using Hidden Markov Model in bisulfite sequencing data. DNA methylation studies have enabled researchers to understand methylation patterns and their regulatory roles in biological processes and disease. However, only a limited number of statistical approaches have been developed to provide formal quantitative analysis. Specifically, a few available methods do identify differentially methylated CpG (DMC) sites or regions (DMR), but they suffer from limitations that arise mostly due to challenges inherent in bisulfite sequencing data. These challenges include: (1) that read-depths vary considerably among genomic positions and are often low; (2) both methylation and autocorrelation patterns change as regions change; and (3) CpG sites are distributed unevenly. Furthermore, there are several methodological limitations: almost none of these tools is capable of comparing multiple groups and/or working with missing values, and only a few allow continuous or multiple covariates. The last of these is of great interest among researchers, as the goal is often to find which regions of the genome are associated with several exposures and traits. To tackle these issues, we have developed an efficient DMC identification method based on Hidden Markov Models (HMMs) called “DMCHMM” which is a three-step approach (model selection, prediction, testing) aiming to address the aforementioned drawbacks.
Maintained by Farhad Shokoohi. Last updated 5 months ago.
differentialmethylation sequencing hiddenmarkovmodel coverage
0.5 match 3.78 score 3 scripts
florian-laroumagne
previsionio:'Prevision.io' R SDK
For working with the 'Prevision.io' AI model management platform's API <https://prevision.io/>.
Maintained by Florian Laroumagne. Last updated 3 years ago.
1.8 match 1.00 score 1 script
kolassa-dev
PHInfiniteEstimates:Tools for Inference in the Presence of a Monotone Likelihood
Proportional hazards estimation in the presence of a partially monotone likelihood has difficulties, in that finite estimators do not exist. These difficulties are related to those arising from logistic and multinomial regression. References for methods are given in the separate function documents. Supported by grant NSF DMS 1712839.
Maintained by John E. Kolassa. Last updated 1 years ago.
1.6 match 1.00 score
r-forge
multiColl:Collinearity Detection in a Multiple Linear Regression Model
The detection of worrying approximate collinearity in a multiple linear regression model is a problem addressed in all existing statistical packages. However, we have detected deficits regarding the incorrect treatment of qualitative independent variables and the role of the intercept of the model. The objective of this package is to correct these deficits. The package makes available both traditionally used and recently developed detection and treatment techniques. D.A. Belsley (1982) <doi:10.1016/0304-4076(82)90020-3>. D.A. Belsley (1991, ISBN: 978-0471528890). C. Garcia, R. Salmeron and C.B. Garcia (2019) <doi:10.1080/00949655.2018.1543423>. R. Salmeron, C.B. Garcia and J. Garcia (2018) <doi:10.1080/00949655.2018.1463376>. G.W. Stewart (1987) <doi:10.1214/ss/1177013444>.
Maintained by R. Salmeron. Last updated 2 years ago.
0.5 match 2.59 score 13 scripts 1 dependent
matt-dray
potato:Play a Game of 'Potato'
Play in your console an interactive version of 'Potato', a one-page role-playing game by Oliver Darkshire.
Maintained by Matt Dray. Last updated 3 years ago.
0.5 match 1.70 score 6 scripts
slimaneregui
PerRegMod:Fitting Periodic Coefficients Linear Regression Models
Provides tools for fitting periodic coefficients regression models to data where periodicity plays a crucial role. It allows users to model and analyze relationships between variables that exhibit cyclical or seasonal patterns, offering functions for estimating parameters and testing the periodicity of coefficients in linear regression models. For the simple periodic coefficient regression model, see Regui et al. (2024) <doi:10.1080/03610918.2024.2314662>.
Maintained by Slimane Regui. Last updated 3 months ago.
0.5 match 1.48 score 1 script
randel
GMAC:Genomic Mediation Analysis with Adaptive Confounding Adjustment
Performs genomic mediation analysis with adaptive confounding adjustment (GMAC) proposed by Yang et al. (2017) <doi:10.1101/078683>. It implements large scale mediation analysis and adaptively selects potential confounding variables to adjust for each mediation test from a pool of candidate confounders. The package is tailored for but not limited to genomic mediation analysis (e.g., cis-gene mediating trans-gene regulation pattern where an eQTL, its cis-linking gene transcript, and its trans-gene transcript play the roles of treatment, mediator, and outcome, respectively), restricting to scenarios with the presence of cis-association (i.e., treatment-mediator association) and random eQTL (i.e., treatment).
Maintained by Jiebiao Wang. Last updated 3 years ago.
0.5 match 1.48 score 3 scripts
cran
DNAmotif:DNA Sequence Motifs
Motifs within biological sequences play a significant role. This package utilizes user-defined threshold values (window size and similarity) to create consensus segments or motifs through dynamic-programming local alignment with gaps, and it calculates the frequency of each identified motif, offering a detailed view of their prevalence within the dataset. It allows for thorough exploration and understanding of sequence patterns and their biological importance.
Maintained by Subham Ghosh. Last updated 6 months ago.
0.5 match 1.30 score
cran
censCov:Linear Regression with a Randomly Censored Covariate
Implementations of threshold regression approaches for linear regression models with a covariate subject to random censoring, including deletion threshold regression and completion threshold regression. Reverse survival regression, which flips the roles of the response variable and the covariate, is also considered.
Maintained by Sy Han (Steven) Chiou. Last updated 8 years ago.
0.5 match 1.00 score 1 script
cran
EnviroPRA2:Environmental Probabilistic Risk Assessment Tools
It contains functions for dose calculation for different routes, fitting data to probability distributions, random number generation (Monte Carlo simulation) and calculation of systemic and carcinogenic risks. For more information see the publication: Barrio-Parra et al. (2019) "Human-health probabilistic risk assessment: the role of exposure factors in an urban garden scenario" <doi:10.1016/j.landurbplan.2019.02.005>.
Maintained by Fernando Barrio-Parra. Last updated 1 years ago.
0.5 match 1.00 score
cran
CoreMicrobiomeR:Identification of Core Microbiome
The Core Microbiome refers to the group of microorganisms that are consistently present in a particular environment, habitat, or host species. These microorganisms play a crucial role in the functioning and stability of that ecosystem. Identifying these microorganisms can contribute to the emerging field of personalized medicine. 'CoreMicrobiomeR' is designed to facilitate the identification, statistical testing, and visualization of this group of microorganisms. This package offers three key functions to analyze and visualize microbial community data. This package has been developed based on the research papers published by Pereira et al. (2018) <doi:10.1186/s12864-018-4637-6> and Beule L, Karlovsky P. (2020) <doi:10.7717/peerj.9593>.
Maintained by Mohammad Samir Farooqi. Last updated 12 months ago.
0.5 match 1.00 score