Showing 200 of 249 results
oscarkjell
text:Analyses of Text using Transformers Models from HuggingFace, Natural Language Processing and Machine Learning
Link R with Transformers from Hugging Face to transform text variables to word embeddings, where the word embeddings are used to statistically test the mean difference between sets of texts, compute semantic similarity scores between texts, predict numerical variables, and visualize statistically significant words according to various dimensions. For more information see <https://www.r-text.org>.
Maintained by Oscar Kjell. Last updated 6 days ago.
deep-learning, machine-learning, nlp, transformers, openjdk
145 stars 13.21 score 436 scripts 1 dependent
ohdsi
DatabaseConnector:Connecting to Various Database Platforms
An R 'DataBase Interface' ('DBI') compatible interface to various database platforms ('PostgreSQL', 'Oracle', 'Microsoft SQL Server', 'Amazon Redshift', 'Microsoft Parallel Database Warehouse', 'IBM Netezza', 'Apache Impala', 'Google BigQuery', 'Snowflake', 'Spark', 'SQLite', and 'InterSystems IRIS'). Also includes support for fetching data as 'Andromeda' objects. Uses either 'Java Database Connectivity' ('JDBC') or other 'DBI' drivers to connect to databases.
Maintained by Martijn Schuemie. Last updated 2 months ago.
56 stars 12.63 score 772 scripts 11 dependents
ohdsi
SqlRender:Rendering Parameterized SQL and Translation to Dialects
A rendering tool for parameterized SQL that also translates into different SQL dialects. These dialects include 'Microsoft SQL Server', 'Oracle', 'PostgreSql', 'Amazon RedShift', 'Apache Impala', 'IBM Netezza', 'Google BigQuery', 'Microsoft PDW', 'Snowflake', 'Azure Synapse Analytics Dedicated', 'Apache Spark', 'SQLite', and 'InterSystems IRIS'.
Maintained by Martijn Schuemie. Last updated 15 days ago.
82 stars 12.52 score 488 scripts 13 dependents
miraisolutions
XLConnect:Excel Connector for R
Provides comprehensive functionality to read, write and format Excel data.
Maintained by Martin Studer. Last updated 29 days ago.
cross-platform, excel, r-language, xlconnect, openjdk
130 stars 12.28 score 1.2k scripts 1 dependent
ohdsi
PatientLevelPrediction:Develop Clinical Prediction Models Using the Common Data Model
A user friendly way to create patient level prediction models using the Observational Medical Outcomes Partnership Common Data Model. Given a cohort of interest and an outcome of interest, the package can use data in the Common Data Model to build a large set of features. These features can then be used to fit a predictive model with a number of machine learning algorithms. This is further described in Reps (2017) <doi:10.1093/jamia/ocy032>.
Maintained by Egill Fridgeirsson. Last updated 21 days ago.
190 stars 10.85 score 297 scripts
ohdsi
FeatureExtraction:Generating Features for a Cohort
An R interface for generating features for a cohort using data in the Common Data Model. Features can be constructed using default or custom made feature definitions. Furthermore it's possible to aggregate features and get the summary statistics.
Maintained by Ger Inberg. Last updated 8 days ago.
62 stars 10.64 score 209 scripts 2 dependents
ropensci
tabulapdf:Extract Tables from PDF Documents
Bindings for the 'Tabula' <https://tabula.technology/> 'Java' library, which can extract tables from PDF files. This tool can reduce time and effort in data extraction processes in fields like investigative journalism. It allows for automatic and manual table extraction, the latter facilitated through a 'Shiny' interface, enabling manual area selection with a computer mouse for data retrieval.
Maintained by Mauricio Vargas Sepulveda. Last updated 3 months ago.
java, pdf, pdf-document, peer-reviewed, ropensci, tabula, tabular-data, openjdk
552 stars 10.07 score 159 scripts 1 dependent
trinker
qdap:Bridging the Gap Between Qualitative Data and Quantitative Analysis
Automates many of the tasks associated with quantitative discourse analysis of transcripts containing discourse including frequency counts of sentence types, words, sentences, turns of talk, syllables and other assorted analysis tasks. The package provides parsing tools for preparing transcript data. Many functions enable the user to aggregate data by any number of grouping variables, providing analysis and seamless integration with other R packages that undertake higher level analysis and visualization of text. This affords the user a more efficient and targeted analysis. 'qdap' is designed for transcript analysis; however, many functions are applicable to other areas of text mining and natural language processing.
Maintained by Tyler Rinker. Last updated 5 years ago.
qdap, quantitative-discourse-analysis, text-analysis, text-mining, text-plotting, openjdk
176 stars 9.61 score 1.3k scripts 3 dependents
s-u
RJDBC:Provides Access to Databases Through the JDBC Interface
The RJDBC package is an implementation of R's DBI interface using JDBC as a back-end. This allows R to connect to any DBMS that has a JDBC driver.
Maintained by Simon Urbanek. Last updated 2 years ago.
52 stars 9.52 score 1.1k scripts 6 dependents
mimno
mallet:An R Wrapper for the Java Mallet Topic Modeling Toolkit
An R interface for the Java Machine Learning for Language Toolkit (mallet) <https://mimno.github.io/Mallet/> to estimate probabilistic topic models, such as Latent Dirichlet Allocation. We can use the R package to read textual data into 'mallet' from R objects, run the Java implementation of 'mallet' directly in R, and extract results as R objects. The 'mallet' toolkit has many functions; this wrapper focuses on the topic modeling sub-package written by David Mimno. The package uses the 'rJava' package to connect to a Java Virtual Machine (JVM).
Maintained by Måns Magnusson. Last updated 3 years ago.
39 stars 8.68 score 141 scripts 3 dependents
rjdverse
RJDemetra:Interface to 'JDemetra+' Seasonal Adjustment Software
Interface around 'JDemetra+' (<https://github.com/jdemetra/jdemetra-app>), the seasonal adjustment software officially recommended to the members of the European Statistical System (ESS) and the European System of Central Banks. It offers full access to all options and outputs of 'JDemetra+', including the two leading seasonal adjustment methods TRAMO/SEATS+ and X-12ARIMA/X-13ARIMA-SEATS.
Maintained by Alain Quartier-la-Tente. Last updated 21 days ago.
53 stars 8.67 score 128 scripts 5 dependents
ropensci
babette:Control 'BEAST2'
'BEAST2' (<https://www.beast2.org>) is a widely used Bayesian phylogenetic tool, that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. 'BEAST2' is commonly accompanied by 'BEAUti 2', 'Tracer' and 'DensiTree'. 'babette' provides for an alternative workflow of using all these tools separately. This allows doing complex Bayesian phylogenetics easily and reproducibly from 'R'.
Maintained by Richèl J.C. Bilderbeek. Last updated 11 days ago.
bayesian-inference, beast2, phylogenetics, openjdk
45 stars 8.55 score 53 scripts 1 dependent
theharmonylab
topics:Creating and Significance Testing Language Features for Visualisation
Implements differential language analysis with statistical tests and offers various language visualization techniques for n-grams and topics. It also supports the 'text' package. For more information, visit <https://r-topics.org/> and <https://www.r-text.org/>.
Maintained by Oscar Kjell. Last updated 2 days ago.
5 stars 8.38 score 22 scripts 2 dependents
wallaceecomod
wallace:A Modular Platform for Reproducible Modeling of Species Niches and Distributions
The 'shiny' application Wallace is a modular platform for reproducible modeling of species niches and distributions. Wallace guides users through a complete analysis, from the acquisition of species occurrence and environmental data to visualizing model predictions on an interactive map, thus bundling complex workflows into a single, streamlined interface. An extensive vignette, which guides users through most package functionality can be found on the package's GitHub Pages website: <https://wallaceecomod.github.io/wallace/articles/tutorial-v2.html>.
Maintained by Mary E. Blair. Last updated 21 days ago.
133 stars 8.36 score 96 scripts
larskotthoff
FSelector:Selecting Attributes
Functions for selecting attributes from a given dataset. Attribute subset selection is the process of identifying and removing as much of the irrelevant and redundant information as possible.
Maintained by Lars Kotthoff. Last updated 2 years ago.
12 stars 8.26 score 478 scripts 4 dependents
kurthornik
RWeka:R/Weka Interface
An R interface to Weka (Version 3.9.3). Weka is a collection of machine learning algorithms for data mining tasks written in Java, containing tools for data pre-processing, classification, regression, clustering, association rules, and visualization. Package 'RWeka' contains the interface code, the Weka jar is in a separate package 'RWekajars'. For more information on Weka see <https://www.cs.waikato.ac.nz/ml/weka/>.
Maintained by Kurt Hornik. Last updated 2 years ago.
4 stars 8.24 score 1.8k scripts 14 dependents
ifellows
OpenStreetMap:Access to Open Street Map Raster Images
Accesses high resolution raster maps using the OpenStreetMap protocol. Dozens of road, satellite, and topographic map servers are directly supported, including Apple, Mapnik, Bing, and stamen. Additionally raster maps may be constructed using custom tile servers. Maps can be plotted using either base graphics, or ggplot2. This package is not affiliated with the OpenStreetMap.org mapping project.
Maintained by Ian Fellows. Last updated 1 year ago.
11 stars 8.09 score 498 scripts 4 dependents
ohdsi
CohortGenerator:Cohort Generation for the OMOP Common Data Model
Generate cohorts and subsets using an Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) Database. Cohorts are defined using 'CIRCE' (<https://github.com/ohdsi/circe-be>) or SQL compatible with 'SqlRender' (<https://github.com/OHDSI/SqlRender>).
Maintained by Anthony Sena. Last updated 6 months ago.
13 stars 7.91 score 165 scripts
ropensci
beastier:Call 'BEAST2'
'BEAST2' (<https://www.beast2.org>) is a widely used Bayesian phylogenetic tool, that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. 'BEAST2' is a command-line tool. This package provides a way to call 'BEAST2' from an 'R' function call.
Maintained by Richèl J.C. Bilderbeek. Last updated 1 month ago.
bayesian, beast, beast2, phylogenetic-inference, phylogenetics, openjdk
11 stars 7.87 score 47 scripts 4 dependents
bioc
RBioFormats:R interface to Bio-Formats
An R package which interfaces the OME Bio-Formats Java library to allow reading of proprietary microscopy image data and metadata.
Maintained by Andrzej Oleś. Last updated 5 months ago.
dataimport, bio-formats, bioconductor, image-processing, openjdk
25 stars 7.57 score 82 scripts 1 dependent
rpremrajgit
mailR:A Utility to Send Emails from R
Interface to Apache Commons Email to send emails from R.
Maintained by Rahul Premraj. Last updated 3 years ago.
14 stars 7.46 score 374 scripts
ohdsi
ResultModelManager:Result Model Manager
Database data model management utilities for R packages in the Observational Health Data Sciences and Informatics program <https://ohdsi.org>. 'ResultModelManager' provides utility functions to allow package maintainers to migrate existing SQL database models, export and import results in consistent patterns.
Maintained by Jamie Gilbert. Last updated 6 months ago.
4 stars 7.38 score 9 scripts 3 dependents
kornl
gMCP:Graph Based Multiple Comparison Procedures
Functions and a graphical user interface for graphically described multiple test procedures.
Maintained by Kornelius Rohmeyer. Last updated 1 year ago.
10 stars 7.31 score 105 scripts 2 dependents
pridiltal
staplr:A Toolkit for PDF Files
Provides functions to manipulate PDF files: fill out PDF forms; merge multiple PDF files into one; remove selected pages from a file; rename multiple files in a directory; rotate an entire PDF document; rotate selected pages of a PDF file; select pages from a file; split a single input PDF document into individual pages; split a single input PDF document into parts from given points.
Maintained by Priyanga Dilini Talagala. Last updated 1 year ago.
267 stars 7.31 score 36 scripts 2 dependents
colearendt
xlsxjars:Package required POI jars for the xlsx package
The xlsxjars package collects all the external jars required for the xlsx package. This release corresponds to POI 3.13.
Maintained by Adrian A. Dragulescu. Last updated 9 years ago.
1 star 7.26 score 100 scripts 40 dependents
amattioc
RJSDMX:R Interface to SDMX Web Services
Provides functions to retrieve data and metadata from providers that disseminate data by means of SDMX web services. SDMX (Statistical Data and Metadata eXchange) is a standard that has been developed with the aim of simplifying the exchange of statistical information. More about the SDMX standard and the SDMX Web Services can be found at: <https://sdmx.org>.
Maintained by Attilio Mattiocco. Last updated 4 days ago.
86 stars 7.25 score 27 scripts
bioc
BridgeDbR:Code for using BridgeDb identifier mapping framework from within R
Use BridgeDb functions and load identifier mapping databases in R. It uses GitHub, Zenodo, and Figshare if you use this package to download identifier mappings files.
Maintained by Egon Willighagen. Last updated 5 months ago.
software, annotation, metabolomics, cheminformatics, bioconductor-package, bridgedb, genes, identifiers, life-sciences, metabolites, proteins, openjdk
4 stars 6.97 score 43 scripts
giabaio
survHE:Survival Analysis in Health Economic Evaluation
Contains a suite of functions for survival analysis in health economics. These can be used to run survival models under a frequentist (based on maximum likelihood) or a Bayesian approach (both based on Integrated Nested Laplace Approximation or Hamiltonian Monte Carlo). To run the Bayesian models, the user needs to install additional modules (packages), i.e. 'survHEinla' and 'survHEhmc'. These can be installed using 'remotes::install_github' from their GitHub repositories: (<https://github.com/giabaio/survHEhmc> and <https://github.com/giabaio/survHEinla/> respectively). 'survHEinla' is based on the package INLA, which is available for download at <https://inla.r-inla-download.org/R/stable/>. The user can specify a set of parametric models using a common notation and select the preferred mode of inference. The results can also be post-processed to produce probabilistic sensitivity analysis and can be used to export the output to an Excel file (e.g. for a Markov model, as often done by modellers and practitioners). <doi:10.18637/jss.v095.i14>.
Maintained by Gianluca Baio. Last updated 21 days ago.
frequentist, hamiltonian-monte-carlo, health-economic-evaluation, inla, plotting-survival-curves, rstan, survival-analysis, survival-models, uncertainty, openjdk
42 stars 6.88 score 2 dependents
zachcp
rcdk:Interface to the 'CDK' Libraries
Allows the user to access functionality in the 'CDK', a Java framework for chemoinformatics. This allows the user to load molecules, evaluate fingerprints, calculate molecular descriptors and so on. In addition, the 'CDK' API allows the user to view structures in 2D.
Maintained by Zachary Charlop-Powers. Last updated 2 years ago.
1 star 6.80 score 287 scripts 11 dependents
ohdsi
Eunomia:Standard Dataset Manager for Observational Medical Outcomes Partnership Common Data Model Sample Datasets
Facilitates access to sample datasets from the 'EunomiaDatasets' repository (<https://github.com/ohdsi/EunomiaDatasets>).
Maintained by Frank DeFalco. Last updated 11 months ago.
43 stars 6.73 score 139 scripts
rjdverse
rjd3sts:State Space Framework and Structural Time Series with 'JDemetra+ 3.x'
R Interface to 'JDemetra+ 3.x' (<https://github.com/jdemetra>) time series analysis software. It offers access to several functions on state space models and structural time series.
Maintained by Jean Palate. Last updated 9 months ago.
2 stars 6.64 score 25 scripts 4 dependents
miracum
DQAstats:Core Functions for Data Quality Assessment
Perform data quality assessment ('DQA') of electronic health records ('EHR'). Publication: Kapsner et al. (2021) <doi:10.1055/s-0041-1733847>.
Maintained by Lorenz A. Kapsner. Last updated 25 days ago.
9 stars 6.55 score 4 scripts 1 dependent
s-u
venneuler:Venn and Euler Diagrams
Calculates and displays Venn and Euler Diagrams.
Maintained by Simon Urbanek. Last updated 1 year ago.
4 stars 6.54 score 273 scripts 4 dependents
rickhelmus
patRoon:Workflows for Mass-Spectrometry Based Non-Target Analysis
Provides an easy-to-use interface to a mass spectrometry based non-target analysis workflow. Various (open-source) tools are combined which provide algorithms for extraction and grouping of features, extraction of MS and MS/MS data, automatic formula and compound annotation and grouping related features to components. In addition, various tools are provided for e.g. data preparation and cleanup, plotting results and automatic reporting.
Maintained by Rick Helmus. Last updated 7 days ago.
mass-spectrometry, non-target, cpp, openjdk
65 stars 6.24 score 43 scripts
ohdsi
Characterization:Implement Descriptive Studies Using the Common Data Model
An end-to-end framework that enables users to implement various descriptive studies for a given set of target and outcome cohorts for data mapped to the Observational Medical Outcomes Partnership Common Data Model.
Maintained by Jenna Reps. Last updated 29 days ago.
3 stars 6.13 score
josesamos
rolap:Obtaining Star Databases from Flat Tables
Data in multidimensional systems is obtained from operational systems and is transformed to adapt it to the new structure. Frequently, the operations to be performed aim to transform a flat table into a ROLAP (Relational On-Line Analytical Processing) star database. The main objective of the package is to allow the definition of these transformations easily. The implementation of the multidimensional database obtained can be exported to work with multidimensional analysis tools on spreadsheets or relational databases.
Maintained by Jose Samos. Last updated 1 year ago.
5 stars 6.12 score 25 scripts 1 dependent
bioc
esATAC:An Easy-to-use Systematic pipeline for ATACseq data analysis
This package provides a framework and complete preset pipeline for quantification and analysis of ATAC-seq reads. It covers raw sequencing reads preprocessing (FASTQ files), reads alignment (Rbowtie2), aligned reads file operations (SAM, BAM, and BED files), peak calling (F-seq), genome annotations (Motif, GO, SNP analysis) and quality control report. The package is managed by a dataflow graph. It is easy for users to pass variables seamlessly between processes and understand the workflow. Users can process FASTQ files through an end-to-end preset pipeline which produces a pretty HTML report for quality control and preliminary statistical results, or customize the workflow starting from any intermediate stage with esATAC functions easily and flexibly.
Maintained by Zheng Wei. Last updated 5 months ago.
immunooncology, sequencing, dnaseq, qualitycontrol, alignment, preprocessing, coverage, atacseq, dnaseseq, atac-seq, bioconductor, pipeline, cpp, openjdk
23 stars 6.11 score 3 scripts
kapelner
bartMachine:Bayesian Additive Regression Trees
An advanced implementation of Bayesian Additive Regression Trees with expanded features for data analysis and visualization.
Maintained by Adam Kapelner. Last updated 2 years ago.
6.08 score 309 scripts 6 dependents
aqlt
ggdemetra:'ggplot2' Extension for Seasonal and Trading Day Adjustment with 'RJDemetra'
Provides 'ggplot2' functions to return the results of seasonal and trading day adjustment made by 'RJDemetra'. 'RJDemetra' is an 'R' interface around 'JDemetra+' (<https://github.com/jdemetra/jdemetra-app>), the seasonal adjustment software officially recommended to the members of the European Statistical System and the European System of Central Banks.
Maintained by Alain Quartier-la-Tente. Last updated 8 months ago.
12 stars 6.06 score 16 scripts 1 dependent
msperlin
GetDFPData:Reading Annual Financial Reports from Bovespa's DFP, FRE and FCA System
Reads annual financial reports including assets, liabilities, dividends history, stockholder composition and much more from Bovespa's DFP, FRE and FCA systems <http://www.b3.com.br/pt_br/produtos-e-servicos/negociacao/renda-variavel/empresas-listadas.htm>. These are web based interfaces for all financial reports of companies traded at Bovespa. The package is specially designed for large scale data importation, keeping a tabular (long) structure for easier processing.
Maintained by Marcelo Perlin. Last updated 4 years ago.
33 stars 6.06 score 69 scripts
bioc
RMassBank:Workflow to process tandem MS files and build MassBank records
Workflow to process tandem MS files and build MassBank records. Functions include automated extraction of tandem MS spectra, formula assignment to tandem MS fragments, recalibration of tandem MS spectra with assigned fragments, spectrum cleanup, automated retrieval of compound information from Internet databases, and export to MassBank records.
Maintained by RMassBank at Eawag. Last updated 5 months ago.
immunooncology, bioinformatics, massspectrometry, metabolomics, software, openjdk
6.02 score 26 scripts
thiloklein
matchingMarkets:Analysis of Stable Matchings
Implements structural estimators to correct for the sample selection bias from observed outcomes in matching markets. This includes one-sided matching of agents into groups as well as two-sided matching of students to schools. The package also contains algorithms to find stable matchings in the three most common matching problems: the stable roommates problem, the college admissions problem, and the house allocation problem.
Maintained by Thilo Klein. Last updated 5 years ago.
40 stars 5.99 score 49 scripts
mhahsler
streamMOA:Interface for MOA Stream Clustering Algorithms
Interface for data stream clustering algorithms implemented in the MOA (Massive Online Analysis) framework (Albert Bifet, Geoff Holmes, Richard Kirkby, Bernhard Pfahringer (2010). MOA: Massive Online Analysis, Journal of Machine Learning Research 11: 1601-1604).
Maintained by Michael Hahsler. Last updated 7 months ago.
clustering, datamining, datastream, openjdk
13 stars 5.98 score 37 scripts
ohdsi
EvidenceSynthesis:Synthesizing Causal Evidence in a Distributed Research Network
Routines for combining causal effect estimates and study diagnostics across multiple data sites in a distributed study, without sharing patient-level data. Allows for normal and non-normal approximations of the data-site likelihood of the effect parameter.
Maintained by Martijn Schuemie. Last updated 7 months ago.
8 stars 5.87 score 31 scripts
rjdverse
rjd3toolkit:Utility Functions around 'JDemetra+ 3.0'
R Interface to 'JDemetra+ 3.x' (<https://github.com/jdemetra>) time series analysis software. It provides functions to model time series (create outlier regressors, user-defined calendar regressors, UCARIMA models...), to test for the presence of trading-day or seasonal effects, and to set specifications in pre-adjustment and benchmarking when using rjd3x13 or rjd3tramoseats.
Maintained by Tanguy Barthelemy. Last updated 5 months ago.
java, jdemetra, seasonal-adjustment, time-series, timeseries, openjdk
6 stars 5.74 score 48 scripts 16 dependents
josesamos
when:Definition of Date and Time Dimension Tables
In Multidimensional Systems the When dimension allows us to express when the analysed facts have occurred. The purpose of this package is to provide support for implementing this dimension in the form of date and time tables for Relational On-Line Analytical Processing star database systems.
Maintained by Jose Samos. Last updated 1 year ago.
1 star 5.73 score 177 scripts 2 dependents
bioc
debCAM:Deconvolution by Convex Analysis of Mixtures
An R package for fully unsupervised deconvolution of complex tissues. It provides basic functions to perform unsupervised deconvolution on mixture expression profiles by Convex Analysis of Mixtures (CAM) and some auxiliary functions to help understand the subpopulation-specific results. It also implements functions to perform supervised deconvolution based on prior knowledge of molecular markers, S matrix or A matrix. Combining molecular markers from CAM and from prior knowledge can achieve semi-supervised deconvolution of mixtures.
Maintained by Lulu Chen. Last updated 5 months ago.
software, cellbiology, geneexpression, openjdk
7 stars 5.69 score 14 scripts
verbal-autopsy-software
InSilicoVA:Probabilistic Verbal Autopsy Coding with 'InSilicoVA' Algorithm
Computes individual causes of death and population cause-specific mortality fractions using the 'InSilicoVA' algorithm from McCormick et al. (2016) <DOI:10.1080/01621459.2016.1152191>. It uses data derived from verbal autopsy (VA) interviews, in a format similar to the input of the widely used 'InterVA' method. This package provides general model fitting and customization for 'InSilicoVA' algorithm and basic graphical visualization of the output.
Maintained by Zehang Richard Li. Last updated 1 month ago.
3 stars 5.67 score 35 scripts 1 dependent
r-forge
RNetLogo:Provides an Interface to the Agent-Based Modelling Platform 'NetLogo'
Interface to use and access Wilensky's 'NetLogo' (Wilensky 1999) from R using either headless (no GUI) or interactive GUI mode. Provides functions to load models, execute commands, and get values from reporters. Mostly analogous to the 'NetLogo' 'Mathematica' Link <https://github.com/NetLogo/Mathematica-Link>.
Maintained by Jan C. Thiele. Last updated 8 years ago.
5.62 score 140 scripts 1 dependent
dnme-minturdep
comunicacion:DNMyE - Communication
Communication tools for the National Directorate of Markets and Statistics (Dirección Nacional de Mercados y Estadística, DNMyE) of Argentina's Undersecretariat of Tourism.
Maintained by Juan Pablo Ruiz Nicolini. Last updated 8 months ago.
4 stars 5.57 score 124 scripts 1 dependent
adamlilith
enmSdmX:Species Distribution Modeling and Ecological Niche Modeling
Implements species distribution modeling and ecological niche modeling, including: bias correction, spatial cross-validation, model evaluation, raster interpolation, biotic "velocity" (speed and direction of movement of a "mass" represented by a raster), interpolating across a time series of rasters, and use of spatially imprecise records. The heart of the package is a set of "training" functions which automatically optimize model complexity based on the number of available occurrences. These algorithms include MaxEnt, MaxNet, boosted regression trees/gradient boosting machines, generalized additive models, generalized linear models, natural splines, and random forests. To enhance interoperability with other modeling packages, no new classes are created. The package works with 'PROJ6' geodetic objects and coordinate reference systems.
Maintained by Adam B. Smith. Last updated 1 month ago.
bias-correction, biogeography, ecological-niche-modeling, ecological-niche-modelling, niche-modeling, niche-modelling, species-distribution-modeling, openjdk
25 stars 5.57 score 37 scripts
ropensci
mauricer:Work with 'BEAST2' Packages
'BEAST2' (<https://www.beast2.org>) is a widely used Bayesian phylogenetic tool, that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. 'BEAST2' is commonly accompanied by 'BEAUti 2' (<https://www.beast2.org>), which, among other things, allows one to install 'BEAST2' packages. This package allows one to work with 'BEAST2' packages from 'R'.
Maintained by Richèl J.C. Bilderbeek. Last updated 7 months ago.
3 stars 5.56 score 10 scripts 2 dependents
mattsheng
iBART:Iterative Bayesian Additive Regression Trees Descriptor Selection Method
A statistical method based on Bayesian Additive Regression Trees with Global Standard Error Permutation Test (BART-G.SE) for descriptor selection and symbolic regression. It finds the symbolic formula of the regression function y=f(x) as described in Ye, Senftle, and Li (2023) <https://www.tandfonline.com/doi/abs/10.1080/01621459.2023.2294527>.
Maintained by Shengbin Ye. Last updated 2 months ago.
7 stars 5.53 score 16 scripts
mannau
boilerpipeR:Interface to the Boilerpipe Java Library
Generic Extraction of main text content from HTML files; removal of ads, sidebars and headers using the boilerpipe <https://github.com/kohlschutter/boilerpipe> Java library. The extraction heuristics from boilerpipe show a robust performance for a wide range of web site templates.
Maintained by Mario Annau. Last updated 4 years ago.
22 stars 5.52 score 30 scripts
dimitri-justeau
rflsgen:Neutral Landscape Generator with Targets on Landscape Indices
Interface to the 'flsgen' neutral landscape generator <https://github.com/dimitri-justeau/flsgen>. It allows users to: generate fractal terrain; generate landscape structures satisfying user targets over landscape indices; generate landscape rasters from landscape structures.
Maintained by Dimitri Justeau-Allaire. Last updated 10 months ago.
13 stars 5.51 score 6 scripts
bioc
miRSM:Inferring miRNA sponge modules in heterogeneous data
The package aims to identify miRNA sponge or ceRNA modules in heterogeneous data. It provides several functions to study miRNA sponge modules at single-sample and multi-sample levels, including popular methods for inferring gene modules (candidate miRNA sponge or ceRNA modules), and two functions to identify miRNA sponge modules at single-sample and multi-sample levels, as well as several functions to conduct modular analysis of miRNA sponge modules.
Maintained by Junpeng Zhang. Last updated 5 months ago.
geneexpression, biomedicalinformatics, clustering, genesetenrichment, microarray, software, generegulation, genetarget, cerna, mirna, mirna-sponge, mirna-targets, modules, openjdk
4 stars 5.51 score 5 scripts
ropensci
mcbette:Model Comparison Using 'babette'
'BEAST2' (<https://www.beast2.org>) is a widely used Bayesian phylogenetic tool, that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. 'mcbette' allows one to do a Bayesian model comparison over some site and clock models, using 'babette' (<https://github.com/ropensci/babette/>).
Maintained by Richèl J.C. Bilderbeek. Last updated 8 months ago.
7 stars 5.50 score 18 scripts
kurthornik
openNLP:Apache OpenNLP Tools Interface
An interface to the Apache OpenNLP tools (version 1.5.3). The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text written in Java. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. See <https://opennlp.apache.org/> for more information.
Maintained by Kurt Hornik. Last updated 5 years ago.
4 stars 5.48 score 386 scripts 8 dependents
crew102
slowraker:A Slow Version of the Rapid Automatic Keyword Extraction (RAKE) Algorithm
A mostly pure-R implementation of the RAKE algorithm (Rose, S., Engel, D., Cramer, N. and Cowley, W. (2010) <doi:10.1002/9780470689646.ch1>), which can be used to extract keywords from documents without any training data.
Maintained by Christopher Baker. Last updated 7 months ago.
6 stars 5.37 score 13 scripts 1 dependent
bioc
DaMiRseq:Data Mining for RNA-seq data: normalization, feature selection and classification
The DaMiRseq package offers a tidy pipeline of data mining procedures to identify transcriptional biomarkers and exploit them for both binary and multi-class classification purposes. The package accepts any kind of data presented as a table of raw counts and allows including both continuous and factorial variables that occur with the experimental setting. A series of functions enable the user to clean up the data by filtering genomic features and samples, to adjust data by identifying and removing the unwanted source of variation (i.e. batches and confounding factors) and to select the best predictors for modeling. Finally, a "stacking" ensemble learning technique is applied to build a robust classification model. Every step includes a checkpoint that the user may exploit to assess the effects of data management by looking at diagnostic plots, such as clustering and heatmaps, RLE boxplots, MDS or correlation plot.
Maintained by Mattia Chiesa. Last updated 5 months ago.
sequencingrnaseqclassificationimmunooncologyopenjdk
5.32 score 7 scripts 1 dependentscdk-r
rcdklibs:The CDK Libraries Packaged for R
An R interface to the Chemistry Development Kit, a Java library for chemoinformatics. Given the size of the library itself, this package is not expected to change very frequently. To make use of the CDK within R, it is suggested that you use the 'rcdk' package. Note that it is possible to directly interact with the CDK using 'rJava'. However 'rcdk' exposes functionality in a more idiomatic way. The CDK library itself is released as LGPL and the sources can be obtained from <https://github.com/cdk/cdk>.
Maintained by Zachary Charlop-Powers. Last updated 1 year ago.
1 stars 5.31 score 20 scripts 12 dependentsbioc
fastreeR:Phylogenetic, Distance and Other Calculations on VCF and Fasta Files
Calculate distances, build phylogenetic trees or perform hierarchical clustering between the samples of a VCF or FASTA file. Functions are implemented in Java and called via rJava. Parallel implementation that operates directly on the VCF or FASTA file for fast execution.
Maintained by Anestis Gkanogiannis. Last updated 5 months ago.
phylogeneticsmetagenomicsclusteringopenjdk
3 stars 5.26 score 20 scriptsrjdverse
rjd3filters:Trend-Cycle Extraction with Linear Filters based on JDemetra+ v3.x
This package provides functions to build and apply symmetric and asymmetric moving averages (= linear filters) for trend-cycle extraction. In particular, it implements several modern approaches for real-time estimates from the viewpoint of revisions and time delay in detecting turning points. It includes the local polynomial approach of Proietti and Luati (2008), the Reproducing Kernel Hilbert Space (RKHS) of Dagum and Bianconcini (2008) and the Fidelity-Smoothness-Timeliness approach of Grun-Rehomme, Guggemos, and Ladiray (2018). It is based on Java libraries developed in 'JDemetra+' (<https://github.com/jdemetra>), time series analysis software.
Maintained by Alain Quartier-la-Tente. Last updated 27 days ago.
3 stars 5.19 score 77 scripts 3 dependentsrjdverse
rjd3highfreq:Seasonal Adjustment of High Frequency Data with 'JDemetra+ 3.x'
R Interface to 'JDemetra+ 3.x' (<https://github.com/jdemetra>) time series analysis software. It provides functions for seasonal adjustment of high-frequency data displaying multiple, non-integer periodicities, including pre-adjustment with an extended airline model and ARIMA model-based decomposition.
Maintained by Jean Palate. Last updated 9 months ago.
2 stars 5.15 score 33 scripts 3 dependentsthiyangt
denguedatahub:A Tidy Format Datasets of Dengue by Country
Provides weekly, monthly, and yearly summaries of dengue cases by state/province/country.
Maintained by Thiyanga S. Talagala. Last updated 1 month ago.
11 stars 5.12 score 34 scriptsmiracum
DIZutils:Utilities for 'DIZ' R Package Development
Utility functions used for the R package development infrastructure inside the data integration centers ('DIZ') to standardize and facilitate repetitive tasks such as setting up a database connection or issuing notification messages and to avoid redundancy.
Maintained by Jonathan M. Mang. Last updated 4 months ago.
3 stars 5.03 score 5 scripts 2 dependentsrjdverse
rjd3revisions:Revision analysis with 'JDemetra+ 3.x'
Revision analysis tool, part of the 'JDemetra+ 3.x' (<https://github.com/jdemetra>) time series analysis software. It performs a battery of tests on revisions and submits a report with the results. The various tests enable users to detect potential bias and sources of inefficiency in preliminary estimates.
Maintained by Corentin Lemasson. Last updated 7 days ago.
3 stars 5.01 score 4 scriptsbioc
GARS:GARS: Genetic Algorithm for the identification of Robust Subsets of variables in high-dimensional and challenging datasets
Feature selection aims to identify and remove redundant, irrelevant and noisy variables from high-dimensional datasets. Selecting informative features affects the subsequent classification and regression analyses by improving their overall performance. Several methods have been proposed to perform feature selection: most of them rely on univariate statistics, correlation, entropy measurements or the usage of backward/forward regressions. Herein, we propose an efficient, robust and fast method that adopts stochastic optimization approaches for high-dimensional datasets. GARS is an innovative implementation of a genetic algorithm that selects robust features in high-dimensional and challenging datasets.
Maintained by Mattia Chiesa. Last updated 5 months ago.
classificationfeatureextractionclusteringopenjdk
5.00 score 2 scriptsjorgetendeiro
GGUM:Generalized Graded Unfolding Model
An implementation of the generalized graded unfolding model (GGUM) in R; see Roberts, Donoghue, and Laughlin (2000) <doi:10.1177/01466216000241001>. It allows simulating data sets based on the GGUM. It fits the GGUM and the GUM, and it retrieves item and person parameter estimates. Several plotting functions are available (item and test information functions; item and test characteristic curves; item category response curves). Additionally, there are some functions that facilitate the communication between R and 'GGUM2004'. Finally, a model-fit checking utility, MODFIT(), is also available.
Maintained by Jorge N. Tendeiro. Last updated 2 years ago.
6 stars 4.99 score 18 scripts 1 dependentskornl
Crossover:Analysis and Search of Crossover Designs
Generate and analyse crossover designs from combinatorial or search algorithms as well as from the literature, with a GUI to access them.
Maintained by Kornelius Rohmeyer. Last updated 1 year ago.
6 stars 4.94 score 29 scriptssocialresearchcentre
dialrjars:Required 'libphonenumber' jars for the 'dialr' Package
Collects 'libphonenumber' jars required for the 'dialr' package.
Maintained by Danny Smith. Last updated 23 days ago.
2 stars 4.93 score 4 scripts 1 dependentssocialresearchcentre
dialr:Parse, Format, and Validate International Phone Numbers
Parse, format, and validate international phone numbers using Google's 'libphonenumber' Java library, <https://github.com/google/libphonenumber>.
Maintained by Danny Smith. Last updated 1 year ago.
12 stars 4.89 score 13 scriptsbioc
rRDP:Interface to the RDP Classifier
This package installs and interfaces the naive Bayesian classifier for 16S rRNA sequences developed by the Ribosomal Database Project (RDP). With this package the classifier trained with the standard training set can be used or a custom classifier can be trained.
Maintained by Michael Hahsler. Last updated 5 months ago.
geneticssequencinginfrastructureclassificationmicrobiomeimmunooncologyalignmentsequencematchingdataimportbayesianbioconductorbioinformaticsopenjdk
3 stars 4.88 score 6 scriptsbioc
DeepPINCS:Protein Interactions and Networks with Compounds based on Sequences using Deep Learning
The identification of novel compound-protein interactions (CPI) is important in drug discovery. Revealing unknown compound-protein interactions is useful for designing a new drug for a target protein by screening candidate compounds. Accurate CPI prediction assists in an effective drug discovery process. To identify potential CPI effectively, prediction methods based on machine learning and deep learning have been developed. Data for sequences are provided as discrete symbolic data. In the data, compounds are represented as SMILES (simplified molecular-input line-entry system) strings and proteins are sequences in which the characters are amino acids. The outcome is defined as a variable that indicates how strongly two molecules interact with each other or whether there is an interaction between them. In this package, a deep-learning based model that takes only sequence information of both compounds and proteins as input and the outcome as output is used to predict CPI. The model is implemented by using compound and protein encoders with useful features. The CPI model also supports other modeling tasks, including protein-protein interaction (PPI), chemical-chemical interaction (CCI), or single compounds and proteins. Although the model is designed for proteins, DNA and RNA can be used if they are represented as sequences.
Maintained by Dongmin Jung. Last updated 5 months ago.
softwarenetworkgraphandnetworkneuralnetworkopenjdk
4.78 score 4 scripts 2 dependentsbioc
rmelting:R Interface to MELTING 5
R interface to the MELTING 5 program (https://www.ebi.ac.uk/biomodels/tools/melting/) to compute melting temperatures of nucleic acid duplexes along with other thermodynamic parameters.
Maintained by J. Aravind. Last updated 5 months ago.
biomedicalinformaticscheminformaticsbioconductorbioinformaticsmelting-temperatureopenjdk
2 stars 4.78 score 10 scriptsbioc
sarks:Suffix Array Kernel Smoothing for discovery of correlative sequence motifs and multi-motif domains
Suffix Array Kernel Smoothing (see https://academic.oup.com/bioinformatics/article-abstract/35/20/3944/5418797), or SArKS, identifies sequence motifs whose presence correlates with numeric scores (such as differential expression statistics) assigned to the sequences (such as gene promoters). SArKS smooths over sequence similarity, quantified by location within a suffix array based on the full set of input sequences. A second round of smoothing over spatial proximity within sequences reveals multi-motif domains. Discovered motifs can then be merged or extended based on adjacency within MMDs. False positive rates are estimated and controlled by permutation testing.
Maintained by Dennis Wylie. Last updated 5 months ago.
motifdiscoverygeneregulationgeneexpressiontranscriptomicsrnaseqdifferentialexpressionfeatureextractionopenjdk
3 stars 4.78 score 3 scriptsjatanrt
eprscope:Processing and Analysis of Electron Paramagnetic Resonance Data and Spectra in Chemistry
Processing, analysis and plotting of Electron Paramagnetic Resonance (EPR) spectra in chemistry. Even though the package is mainly focused on continuous wave (CW) EPR/ENDOR, many functions may also be used for the integrated forms of 1D PULSED EPR spectra. It is able to find the most important spectral characteristics like g-factor, linewidth, maximum of derivative or integral intensities and single/double integrals. This is especially important in spectral (time) series consisting of many EPR spectra like during variable temperature experiments, electrochemical or photochemical radical generation and/or decay. The package also enables processing of data/spectra for analytical (quantitative) purposes, namely determining how many radicals or paramagnetic centers can be found in the analyte/sample. The goal is to evaluate rate constants, considering different kinetic models, to describe the radical reactions. The key feature of the package resides in processing of the universal ASCII text formats (such as '.txt', '.csv' or '.asc') from scratch. No proprietary formats are used (except the MATLAB EasySpin outputs) and in this respect the package is in accordance with the FAIR data principles. Upon 'reading' (also providing automatic procedures for the most common EPR spectrometers) the spectral data are transformed into the universal R 'data frame' format. Subsequently, the EPR spectra can be visualized and are fully consistent either with the 'ggplot2' package or with the interactive formats based on 'plotly'. Additionally, simulations and fitting of the isotropic EPR spectra are also included in the package. Advanced simulation parameters provided by the MATLAB-EasySpin toolbox and results from the quantum chemical calculations like g-factor and hyperfine splitting/coupling constants (a/A) can be compared and summarized in table format in order to analyze the EPR spectra in the most effective way.
Maintained by Ján Tarábek. Last updated 16 hours ago.
chemistrydata-analysisdata-visualizationepresrfittingoptimizationprogramming-languagereproducible-researchscientific-plottingspectroscopyopenjdk
4.76 score 7 scriptsbioc
methodical:Discovering genomic regions where methylation is strongly associated with transcriptional activity
DNA methylation is generally considered to be associated with transcriptional silencing. However, comprehensive, genome-wide investigation of this relationship requires the evaluation of potentially millions of correlation values between the methylation of individual genomic loci and expression of associated transcripts in a relatively large number of samples. Methodical makes this process quick and easy while keeping a low memory footprint. It also provides a novel method for identifying regions where a number of methylation sites are consistently strongly associated with transcriptional expression. In addition, Methodical enables housing DNA methylation data from diverse sources (e.g. WGBS, RRBS and methylation arrays) with a common framework, lifting over DNA methylation data between different genome builds and creating base-resolution plots of the association between DNA methylation and transcriptional activity at transcriptional start sites.
Maintained by Richard Heery. Last updated 2 months ago.
dnamethylationmethylationarraytranscriptiongenomewideassociationsoftwareopenjdk
4.65 score 14 scriptscoseal
aslib:Interface to the Algorithm Selection Benchmark Library
Provides an interface to the algorithm selection benchmark library at <http://www.aslib.net> and the 'LLAMA' package (<https://cran.r-project.org/package=llama>) for building algorithm selection models; see Bischl et al. (2016) <doi:10.1016/j.artint.2016.04.003>.
Maintained by Lars Kotthoff. Last updated 3 years ago.
7 stars 4.65 score 32 scriptsaqlt
rjdmarkdown:'rmarkdown' Extension for Formatted 'RJDemetra' Outputs
Functions to have nice 'rmarkdown' outputs of the seasonal and trading day adjustment models made with 'RJDemetra'.
Maintained by Alain Quartier-la-Tente. Last updated 8 months ago.
4 stars 4.60 score 9 scriptsohdsi
CirceR:Construct Cohort Inclusion and Restriction Criteria Expressions
Wraps the 'CIRCE' (<https://github.com/ohdsi/circe-be>) 'Java' library allowing cohort definition expressions to be edited and converted to 'Markdown' or 'SQL'.
Maintained by Chris Knoll. Last updated 11 months ago.
6 stars 4.58 score 64 scriptsingmarboeschen
JATSdecoder:A Metadata and Text Extraction and Manipulation Tool Set
Provides a function collection to extract metadata, sectioned text and study characteristics from scientific articles in 'NISO-JATS' format. Articles in PDF format can be converted to 'NISO-JATS' with the 'Content ExtRactor and MINEr' ('CERMINE', <https://github.com/CeON/CERMINE>). For convenience, two functions bundle the extraction heuristics: JATSdecoder() converts 'NISO-JATS'-tagged XML files to a structured list with elements title, author, journal, history, 'DOI', abstract, sectioned text and reference list. study.character() extracts multiple study characteristics like number of included studies, statistical methods used, alpha error, power, statistical results, correction method for multiple testing, software used. An estimation of the involved sample size is performed based on reports within the abstract and the reported degrees of freedom within statistical results. In addition, the package contains some useful functions to process text (text2sentences(), text2num(), ngram(), strsplit2(), grep2()). See Böschen, I. (2021) <doi:10.1007/s11192-021-04162-z>, Böschen, I. (2021) <doi:10.1038/s41598-021-98782-3>, and Böschen, I. (2023) <doi:10.1038/s41598-022-27085-y>.
Maintained by Ingmar Böschen. Last updated 17 days ago.
cermineniso-jatspubmedcentraltext-extractiontext-miningxml-filesopenjdk
18 stars 4.56 score 7 scriptsohdsi
OhdsiReportGenerator:Observational Health Data Sciences and Informatics Report Generator
Extract results into R from the Observational Health Data Sciences and Informatics result database (see <https://ohdsi.github.io/Strategus/results-schema/index.html>) and generate reports/presentations via 'quarto' that summarize results in HTML format. Learn more about 'OhdsiReportGenerator' at <https://ohdsi.github.io/OhdsiReportGenerator/>.
Maintained by Jenna Reps. Last updated 30 days ago.
4.54 score 2 scriptsantonio-pgarcia
rrepast:Invoke 'Repast Simphony' Simulation Models
An R and Repast integration tool for running individual-based (IbM) simulation models developed using the 'Repast Simphony' agent-based framework directly from R code, with support for multicore execution. This package integrates 'Repast Simphony' models within the R environment, making it easier to run models and analyze their output data for automated parameter calibration and for uncertainty and sensitivity analysis using the power of the R environment.
Maintained by Antonio Prestes Garcia. Last updated 5 years ago.
3 stars 4.53 score 38 scripts 1 dependentsrjdverse
rjd3x13:Seasonal Adjustment with X-13 in 'JDemetra+ 3.x'
R Interface to 'JDemetra+ 3.x' (<https://github.com/jdemetra>) time series analysis software. It offers full access to options and outputs of X-13, including RegARIMA modelling (automatic ARIMA model with outlier detection and trading days adjustment) and X-11 decomposition.
Maintained by Tanguy Barthelemy. Last updated 5 months ago.
javajdemetraseasonal-adjustmenttime-seriestimeseriesx13openjdk
5 stars 4.48 score 8 scripts 4 dependentsnaidantu
bmggum:Bayesian Multidimensional Generalized Graded Unfolding Model
Full Bayesian estimation of the Multidimensional Generalized Graded Unfolding Model (MGGUM) using 'rstan' (see Stan Development Team (2020) <https://mc-stan.org/>). Functions are provided for estimation, result extraction, model fit statistics, and plotting.
Maintained by Naidan Tu. Last updated 3 years ago.
5 stars 4.40 score 5 scriptsocbe-uio
DIscBIO:A User-Friendly Pipeline for Biomarker Discovery in Single-Cell Transcriptomics
An open, multi-algorithmic pipeline for easy, fast and efficient analysis of cellular sub-populations and the molecular signatures that characterize them. The pipeline consists of four successive steps: data pre-processing, cellular clustering with pseudo-temporal ordering, defining differentially expressed genes and biomarker identification. More details in Ghannoum et al. (2021) <doi:10.3390/ijms22031399>. This package implements extensions of the work published by Ghannoum et al. (2019) <doi:10.1101/700989>.
Maintained by Waldir Leoncio. Last updated 1 year ago.
biomarker-discoveryjupyter-notebookscrna-seqsingle-cell-analysistranscriptomicsopenjdk
12 stars 4.38 score 5 scriptsvincexxc
glmulti:Model Selection and Multimodel Inference Made Easy
Automated model selection and model-averaging. Provides a wrapper for glm and other functions, automatically generating all possible models (under constraints set by the user) with the specified response and explanatory variables, and finding the best models in terms of some Information Criterion (AIC, AICc or BIC). Can handle very large numbers of candidate models. Features a Genetic Algorithm to find the best models when an exhaustive screening of the candidates is not feasible.
Maintained by Vincent Calcagno. Last updated 5 years ago.
1 stars 4.34 score 278 scripts 1 dependentsrjdverse
rjd3tramoseats:Seasonal Adjustment with TRAMO-SEATS in 'JDemetra+ 3.x'
R Interface to 'JDemetra+ 3.x' (<https://github.com/jdemetra>) time series analysis software. It offers full access to options and outputs of TRAMO-SEATS (Time series Regression with ARIMA noise, Missing values and Outliers - Signal Extraction in ARIMA Time Series), including TRAMO modelling (automatic ARIMA model with outlier detection and trading days adjustment).
Maintained by Tanguy Barthelemy. Last updated 5 months ago.
javajdemetraseasonal-adjustmenttime-seriestimeseriestramoseatsopenjdk
5 stars 4.33 score 12 scripts 3 dependentsbioc
SpatialOmicsOverlay:Spatial Overlay for Omic Data from Nanostring GeoMx Data
Tools for NanoString Technologies GeoMx Technology. Package to easily graph on top of an OME-TIFF image. Plotting annotations can range from tissue segment to gene expression.
Maintained by Maddy Griswold. Last updated 5 months ago.
geneexpressiontranscriptioncellbasedassaysdataimporttranscriptomicsproteomicsproprietaryplatformsrnaseqspatialdatarepresentationvisualizationopenjdk
4.30 score 8 scriptsbioc
SELEX:Functions for analyzing SELEX-seq data
Tools for quantifying DNA binding specificities based on SELEX-seq data.
Maintained by Harmen J. Bussemaker. Last updated 5 months ago.
softwaremotifdiscoverymotifannotationgeneregulationtranscriptionopenjdk
4.30 score 8 scriptskliegr
qCBA:Postprocessing of Rule Classification Models Learnt on Quantized Data
Implements the Quantitative Classification-based on Association Rules (QCBA) algorithm (<doi:10.1007/s10489-022-04370-x>). QCBA postprocesses rule classification models making them typically smaller and in some cases more accurate. Supported are 'CBA' implementations from 'rCBA', 'arulesCBA' and 'arc' packages, and 'CPAR', 'CMAR', 'FOIL2' and 'PRM' implementations from 'arulesCBA' package and 'SBRL' implementation from the 'sbrl' package. The result of the post-processing is an ordered CBA-like rule list.
Maintained by Tomáš Kliegr. Last updated 7 months ago.
11 stars 4.30 score 12 scriptsaqlt
ggdemetra3:'ggplot2' Extension for Seasonal and Trading Day Adjustment with 'JDemetra+' 3.0
Provides 'ggplot2' functions to return the results of seasonal and trading day adjustment made by the R interface to 'JDemetra+' 3.0.
Maintained by Alain Quartier-la-Tente. Last updated 3 months ago.
4 stars 4.26 score 8 scripts 1 dependentsohdsi
CohortExplorer:Explorer of Profiles of Patients in a Cohort
This software tool is designed to extract data from a randomized subset of individuals within a cohort and make it available for exploration in a shiny application environment. It retrieves date-stamped, event-level records from one or more data sources that represent patient data in the Observational Medical Outcomes Partnership (OMOP) data model format. This tool features a user-friendly interface that enables users to efficiently explore the extracted profiles, thereby facilitating applications, such as reviewing structured profiles.
Maintained by Gowtham Rao. Last updated 1 year ago.
3 stars 4.18 score 3 scriptsdaroczig
AWR:'AWS' Java 'SDK' for R
Make the compiled Java modules of the Amazon Web Services ('AWS') 'SDK' available to be used in downstream R packages interacting with 'AWS'. See <https://aws.amazon.com/sdk-for-java> for more information on the 'AWS' 'SDK' for Java.
Maintained by Gergely Daroczi. Last updated 4 years ago.
4.18 score 1 dependentsyangcq-ivy
NicheBarcoding:Niche-model-Based Species Identification
Species Identification using DNA Barcodes Integrated with Environmental Niche Models.
Maintained by Cai-qing YANG. Last updated 8 months ago.
1 stars 4.18 score 7 scriptsbioc
metabinR:Abundance and Compositional Based Binning of Metagenomes
Provide functions for performing abundance and compositional based binning on metagenomic samples, directly from FASTA or FASTQ files. Functions are implemented in Java and called via rJava. Parallel implementation that operates directly on input FASTA/FASTQ files for fast execution.
Maintained by Anestis Gkanogiannis. Last updated 5 months ago.
classificationclusteringmicrobiomesequencingsoftwareopenjdk
4.18 score 2 scriptsgk-crop
simplace:Interface to Use the Modelling Framework 'SIMPLACE'
Interface to interact with the modelling framework 'SIMPLACE' and to parse the results of simulations.
Maintained by Gunther Krauss. Last updated 6 months ago.
1 stars 4.18 score 6 scripts 1 dependentsrjdverse
rjd3x11plus:Interface to 'JDemetra+ 3.x' time series analysis software
R Interface to 'JDemetra+ 3.x' (<https://github.com/jdemetra>) time series analysis software.
Maintained by Jean Palate. Last updated 9 months ago.
1 stars 4.16 score 4 scripts 2 dependentsmiracum
DQAgui:Graphical User Interface for Data Quality Assessment
A graphical user interface (GUI) to the functions implemented in the R package 'DQAstats'. Publication: Mang et al. (2021) <doi:10.1186/s12911-022-01961-z>.
Maintained by Lorenz A. Kapsner. Last updated 2 months ago.
2 stars 4.15 score 10 scriptsstla
swipeR:Carousels using the 'JavaScript' Library 'Swiper'
Create carousels using the 'JavaScript' library 'Swiper' and the package 'htmlwidgets'. The carousels can be displayed in the 'RStudio' viewer pane, in 'Shiny' applications and in 'R markdown' documents. The package also provides an 'RStudio' addin for choosing image files and displaying them in the viewer pane.
Maintained by Stéphane Laurent. Last updated 1 year ago.
carouselhtmlwidgetsshinyswiperopenjdk
11 stars 4.14 score 25 scriptsjaroslav-kuchar
rCBA:CBA Classifier for R
Provides implementations of a classifier based on the "Classification Based on Associations" (CBA). It can be used for building classification models from association rules. Rules are pruned in the order of precedence given by the sort criteria and a default rule is added. The final classifier labels provided instances. CBA was originally proposed by Liu, B. Hsu, W. and Ma, Y. Integrating Classification and Association Rule Mining. Proceedings KDD-98, New York, 27-31 August. AAAI. pp80-86 (1998, ISBN:1-57735-070-7).
Maintained by Jaroslav Kuchar. Last updated 6 years ago.
7 stars 4.14 score 39 scriptsrjdverse
rjd3nowcasting:Nowcasting with 'JDemetra+ 3.0'
Interface around 'JDemetra+ 3.x' (<https://github.com/jdemetra/jdplus-nowcasting>), TSACE project. It defines and estimates Dynamic Factor Models with the purpose of Nowcasting. News analysis is included in this second version.
Maintained by Corentin Lemasson. Last updated 9 months ago.
3 stars 4.13 score 1 scriptskarlropkins
AQEval:Air Quality Evaluation
Developed for use by those tasked with the routine detection, characterisation and quantification of discrete changes in air quality time-series, such as identifying the impacts of air quality policy interventions. The main functions use signal isolation then break-point/segment (BP/S) methods based on 'strucchange' and 'segmented' methods to detect and quantify change events (Ropkins & Tate, 2021, <doi:10.1016/j.scitotenv.2020.142374>).
Maintained by Karl Ropkins. Last updated 2 months ago.
9 stars 4.13 scorefdefalco
Achilles:Achilles Data Source Characterization
Automated Characterization of Health Information at Large-Scale Longitudinal Evidence Systems. Creates a descriptive statistics summary for an Observational Medical Outcomes Partnership Common Data Model standardized data source. This package includes functions for executing summary queries on the specified data source and exporting reporting content for use across a variety of Observational Health Data Sciences and Informatics community applications.
Maintained by Frank DeFalco. Last updated 2 years ago.
4.06 score 115 scriptspsychbruce
PsychWordVec:Word Embedding Research Framework for Psychological Science
An integrative toolbox of word embedding research that provides: (1) a collection of 'pre-trained' static word vectors in the '.RData' compressed format <https://psychbruce.github.io/WordVector_RData.pdf>; (2) a series of functions to process, analyze, and visualize word vectors; (3) a range of tests to examine conceptual associations, including the Word Embedding Association Test <doi:10.1126/science.aal4230> and the Relative Norm Distance <doi:10.1073/pnas.1720347115>, with permutation test of significance; (4) a set of training methods to locally train (static) word vectors from text corpora, including 'Word2Vec' <arXiv:1301.3781>, 'GloVe' <doi:10.3115/v1/D14-1162>, and 'FastText' <arXiv:1607.04606>; (5) a group of functions to download 'pre-trained' language models (e.g., 'GPT', 'BERT') and extract contextualized (dynamic) word vectors (based on the R package 'text').
Maintained by Han-Wu-Shuang Bao. Last updated 1 year ago.
bertcosine-similarityfasttextglovegptlanguage-modelnatural-language-processingnlppretrained-modelspsychologysemantic-analysistext-analysistext-miningtsneword-embeddingsword-vectorsword2vecopenjdk
22 stars 4.04 score 10 scriptsr-forge
loa:Lattice Options and Add-Ins
Various plots and functions that make use of the lattice/trellis plotting framework. The plots, which include loaPlot(), loaMapPlot() and trianglePlot(), use panelPal(), a function that extends 'lattice' and 'hexbin' package methods to automate plot subscript and panel-to-panel and panel-to-key synchronization/management.
Maintained by Karl Ropkins. Last updated 3 months ago.
4.03 score 18 scripts 2 dependentsohdsi
CohortAlgebra:Use of Interval Algebra to Create New Cohort(s) from Existing Cohorts
This software tool is designed to generate new cohorts utilizing data from previously instantiated cohorts. It employs interval algebra operators such as UNION, INTERSECT, and MINUS to manipulate the data within the instantiated cohorts and create new cohorts.
Maintained by Gowtham Rao. Last updated 1 month ago.
1 stars 4.00 scorejosesamos
geogenr:Generator from American Community Survey Geodatabases
The American Community Survey (ACS) <https://www.census.gov/programs-surveys/acs> offers geodatabases with geographic information and associated data of interest to researchers in the area. The goal of this package is to generate objects that allow us to access and consult the information available in various formats, such as in 'GeoPackage' format or in multidimensional 'ROLAP' (Relational On-Line Analytical Processing) star format.
Maintained by Jose Samos. Last updated 11 months ago.
2 stars 4.00 score 8 scriptsbioc
VAExprs:Generating Samples of Gene Expression Data with Variational Autoencoders
A fundamental problem in biomedical research is the low number of observations, mostly due to a lack of available biosamples, prohibitive costs, or ethical reasons. By augmenting a few real observations with artificially generated samples, their analysis could lead to more robust and more reproducible results. One possible solution to the problem is the use of generative models, which are statistical models of data that attempt to capture the entire probability distribution from the observations. Using the variational autoencoder (VAE), a well-known deep generative model, this package aims to generate samples with gene expression data, especially for single-cell RNA-seq data. Furthermore, the VAE can use conditioning to produce specific cell types or subpopulations. The conditional VAE (CVAE) allows us to create targeted samples rather than completely random ones.
Maintained by Dongmin Jung. Last updated 5 months ago.
softwaregeneexpressionsinglecellopenjdk
4.00 score 4 scriptsjosesamos
rexer:Random Exercises and Exams Generator
The main purpose of this package is to streamline the generation of exams that include random elements in exercises. Exercises can be defined in a table, based on text and figures, and may contain gaps to be filled with provided options. Exam documents can be generated in various formats. It allows us to generate a version for conducting the assessment and another version that facilitates correction, linked through a code.
Maintained by Jose Samos. Last updated 1 year ago.
4.00 score 4 scriptsjosesamos
moodef:Defining 'Moodle' Elements from R
The main objective of this package is to support the definition of 'Moodle' elements taking advantage of the power that R offers. In this first version, it allows the definition of quizzes to be included in the question bank.
Maintained by Jose Samos. Last updated 2 months ago.
1 stars 4.00 score 4 scriptsinseefrlab
rjdworkspace:Manipulate 'JDemetra+' Workspaces
Set of tools to manipulate the 'JDemetra+' workspaces. Based on the 'RJDemetra' package (which interfaces with version 2 of the 'JDemetra+' (<https://github.com/jdemetra/jdemetra-app>), the seasonal adjustment software officially recommended to the members of the European Statistical System (ESS) and the European System of Central Banks). This package provides access to additional workspace manipulation functions such as metadata manipulation, raw paths and wrangling of several workspaces simultaneously. These additional functionalities are useful as part of a CVS data production chain.
Maintained by Tanguy Barthelemy. Last updated 2 months ago.
jdemetra rjdemetra workspace-management openjdk
1 star 3.95 score 2 scripts
rjdverse
rjd3bench:Interface to 'JDemetra+ 3.x' time series analysis software
R Interface to 'JDemetra+ 3.x' (<https://github.com/jdemetra>) time series analysis software.
Maintained by Corentin Lemasson. Last updated 8 months ago.
2 stars 3.94 score 11 scripts
antonio-pgarcia
evoper:Evolutionary Parameter Estimation for 'Repast Simphony' Models
EvoPER (Evolutionary Parameter Estimation for Individual-based Models) is an extensible package providing optimization-driven parameter estimation methods using metaheuristics and evolutionary computation techniques (Particle Swarm Optimization, Simulated Annealing, Ant Colony Optimization for continuous domains, Tabu Search, Evolutionary Strategies, ...) which could be more efficient and require, in some cases, fewer model evaluations than alternatives relying on experimental design. Currently there is built-in support for models developed with the 'Repast Simphony' agent-based framework (<https://repast.github.io/>) and with NetLogo (<https://ccl.northwestern.edu/netlogo/>), which are the most used frameworks for agent-based modeling.
Maintained by Antonio Prestes Garcia. Last updated 5 years ago.
6 stars 3.92 score 28 scripts
kurthornik
RWekajars:R/Weka Interface Jars
External jars required for package 'RWeka'.
Maintained by Kurt Hornik. Last updated 5 years ago.
3.89 score 32 scripts 15 dependents
bioc
CHRONOS:CHRONOS: A time-varying method for microRNA-mediated sub-pathway enrichment analysis
A package used for efficient unraveling of the inherent dynamic properties of pathways. MicroRNA-mediated subpathway topologies are extracted and evaluated by exploiting the temporal transition and the fold change activity of the linked genes/microRNAs.
Maintained by Panos Balomenos. Last updated 5 months ago.
systemsbiology graphandnetwork pathways kegg openjdk
3.86 score 12 scripts
aqlt
rjdqa:Quality Assessment for Seasonal Adjustment
Add-in to the 'RJDemetra' package on seasonal adjustments. It allows users to produce dashboards to summarise models and quickly check the quality of the seasonal adjustment.
Maintained by Alain Quartier-la-Tente. Last updated 5 months ago.
jdemetra quality-assessment openjdk
2 stars 3.85 score 8 scripts
s-u
scagnostics:Compute scagnostics - scatterplot diagnostics
Calculates graph-theoretic scagnostics. Scagnostics describe various measures of interest for pairs of variables, based on their appearance on a scatterplot. They are a useful tool for discovering interesting or unusual scatterplots from a scatterplot matrix, without having to look at every individual plot.
Maintained by Simon Urbanek. Last updated 3 years ago.
1 star 3.81 score 87 scripts 1 dependent
yvonneglanville
joinXL:Perform Joins or Minus Queries on 'Excel' Files
Performs joins and minus queries on 'Excel' files. fulljoinXL() merges all rows of 2 'Excel' files based upon a common column in the files. innerjoinXL() merges all rows from the base file and the join file when the join condition is met. leftjoinXL() merges all rows from the base file, and all rows from the join file if the join condition is met. rightjoinXL() merges all rows from the join file, and all rows from the base file if the join condition is met. minusXL() performs 2 operations, source-minus-target and target-minus-source; if the files are identical, all output files will be empty. Choose two 'Excel' files via a dialog box, and then follow prompts at the console to choose a base or source file and columns to merge or minus on.
Maintained by Yvonne Glanville. Last updated 9 years ago.
1 star 3.70 score 5 scripts
ncss-tech
jNSMR:Interface to the 'Java Newhall Simulation Model' (jNSM) "A Traditional Soil Climate Simulation Model"
Provides methods to create input, read output, and run the routines from the legacy Java Newhall Simulation Model (jNSM) for soil climate. Currently this package uses a modified version of the jNSM v1.6.1 which is available for download here: <https://www.nrcs.usda.gov/wps/portal/nrcs/detail/?cid=nrcs142p2_053559> and the source code found here <https://github.com/drww/newhall/>. The system requirements of the extraction and installation tools (Windows .EXE archive) at the official download link may not be met on your system but the core Java class files are stored in a platform-independent format (a Java JAR file; e.g. newhall-1.6.1.jar) which is a core dependency in this package. Several more recent modifications to the Newhall JAR file allow for higher throughput and more efficient batching of many simulations allowing for larger-than-memory raster-based inputs and outputs.
Maintained by Andrew G. Brown. Last updated 24 days ago.
climate java jnsm model newhall simulation soil openjdk
1 star 3.70 score 25 scripts
terminological
testRapi:A Test Library
Documents the features of the 'r6-generator-maven-plugin' by providing an example of an R package automatically generated from Java code by the plugin. It is not intended to be useful beyond testing, demonstrating and documenting the features of the r6 generator plugin.
Maintained by Rob Challen. Last updated 11 months ago.
3.70 score 1 script
drordas
D2MCS:Data Driving Multiple Classifier System
Provides a novel framework able to automatically develop and deploy an accurate Multiple Classifier System based on the feature-clustering distribution achieved from an input dataset. 'D2MCS' was developed with a focus on four main aspects: (i) the ability to determine an effective method to evaluate the independence of features, (ii) the identification of the optimal number of feature clusters, (iii) the training and tuning of ML models and (iv) the execution of voting schemes to combine the outputs of each classifier comprising the Multiple Classifier System.
Maintained by Miguel Ferreiro-Díaz. Last updated 3 years ago.
3.70 score
samkerns
SQLove:Execute 'SQL' Scripts in 'R' Containing Multiple Queries
Working efficiently with structured query language ('SQL') scripts often requires the creation of temporary tables, and there are few clean and simple 'R' 'SQL' execution approaches that allow you to complete this kind of work within the 'R' environment. This package seeks to give 'SQL' implementations in 'R' a little love by providing functions that allow you to deploy complex 'SQL' scripts within a typical 'R' workflow.
Maintained by Kerns Sam. Last updated 1 year ago.
3.70 score 2 scripts
rjdverse
rjd3providers:Interface to 'JDemetra+ 3.x' time series analysis software.
Interface to 'JDemetra+ 3.x' (<https://github.com/jdemetra>) time series analysis software. It offers full access to txt, csv, xml and spreadsheet files which are meant to be read by the JDemetra+ Graphical User Interface.
Maintained by Alessandro Piovani. Last updated 9 months ago.
1 star 3.68 score 2 scripts 1 dependent
s-u
JavaGD:Java Graphics Device
Graphics device routing all graphics commands to a Java program. The actual functionality of the JavaGD depends on the Java-side implementation. Simple AWT and Swing implementations are included.
Maintained by Simon Urbanek. Last updated 2 years ago.
3.65 score 15 scripts 7 dependents
alexfoxfox
rChoiceDialogs:'rChoiceDialogs' Collection
Collection of portable choice dialog widgets.
Maintained by Alex Lisovich. Last updated 3 years ago.
3.64 score 49 scripts 3 dependents
rjdverse
rjd3workspace:Interface to 'JDemetra+ 3.x' time series analysis software.
R Interface to 'JDemetra+ 3.x' (<https://github.com/jdemetra>). It offers several functions to manipulate 'JDemetra+' workspaces, which can be read by the software and can store several seasonally adjusted series along with user-defined calendars or regression variables.
Maintained by Tanguy Barthelemy. Last updated 2 months ago.
3 stars 3.62 score
act-org
RSCAT:Shadow-Test Approach to Computerized Adaptive Testing
As an advanced approach to computerized adaptive testing (CAT), shadow testing (van der Linden (2005) <doi:10.1007/0-387-29054-0>) dynamically assembles entire shadow tests as a part of selecting items throughout the testing process. Selecting items from shadow tests guarantees compliance with all content constraints defined by the blueprint. 'RSCAT' is an R package for the shadow-test approach to CAT. The objective of 'RSCAT' is twofold: 1) enhancing the effectiveness of shadow-test CAT simulation; 2) contributing to the academic and scientific community for CAT research. 'RSCAT' is currently designed for dichotomous items based on the three-parameter logistic (3PL) model.
Maintained by Bingnan Jiang. Last updated 3 years ago.
computerized-adaptive-testing item-response-theory java mixed-integer-programming open-source optimization shadow-testing shiny xpress-mosel openjdk
7 stars 3.54 score 6 scripts
markush81
JGR:Java GUI for R
Java GUI for R - cross-platform, universal and unified Graphical User Interface for R. For full functionality on Windows and Mac OS X, JGR requires a start application which depends on your OS.
Maintained by Markus Helbig. Last updated 2 years ago.
3.48 score 42 scripts 2 dependents
cran
spcosa:Spatial Coverage Sampling and Random Sampling from Compact Geographical Strata
Spatial coverage sampling and random sampling from compact geographical strata created by k-means. See Walvoort et al. (2010) <doi:10.1016/j.cageo.2010.04.005> for details.
Maintained by Dennis Walvoort. Last updated 2 years ago.
2 stars 3.48 score 1 dependent
gk-crop
simplaceUtil:Provides Utility Functions and ShinyApps to work with the modeling framework 'SIMPLACE'
Provides Utility Functions and ShinyApps to work with the modeling framework 'SIMPLACE'. It visualises components of a solution, runs simulations and displays results.
Maintained by Gunther Krauss. Last updated 1 month ago.
1 star 3.48 score 2 scripts
spang-lab
FastRet:Retention Time Prediction in Liquid Chromatography
A framework for predicting retention times in liquid chromatography. Users can train custom models for specific chromatography columns, predict retention times using existing models, or adjust existing models to account for altered experimental conditions. The provided functionalities can be accessed either via the R console or via a graphical user interface. Related work: Bonini et al. (2020) <doi:10.1021/acs.analchem.9b05765>.
Maintained by Tobias Schmidt. Last updated 2 months ago.
retentiontime chromotography lc-ms data-science lcms openjdk
3.48 score 4 scripts
mhahsler
arulesNBMiner:Mining NB-Frequent Itemsets and NB-Precise Rules
NBMiner is an implementation of the model-based mining algorithm for mining NB-frequent itemsets and NB-precise rules. Michael Hahsler (2006) <doi:10.1007/s10618-005-0026-2>.
Maintained by Michael Hahsler. Last updated 3 years ago.
6 stars 3.48 score 10 scripts
dmkaplan2000
RH2:DBI/RJDBC Interface to H2 Database
DBI/RJDBC interface to h2 database. h2 version 1.3.175 is included.
Maintained by "David M. Kaplan". Last updated 7 years ago.
1 star 3.46 score 40 scripts
kurthornik
openNLPdata:Apache OpenNLP Jars and Basic English Language Models
Apache OpenNLP jars and basic English language models.
Maintained by Kurt Hornik. Last updated 1 year ago.
3.44 score 40 scripts 9 dependents
ifellows
Deducer:A Data Analysis GUI for R
An intuitive, cross-platform graphical data analysis system. It uses menus and dialogs to guide the user efficiently through the data manipulation and analysis process, and has an Excel-like spreadsheet for easy data frame visualization and editing. Deducer works best when used with the Java-based R GUI JGR, but the dialogs can be called from the command line. Dialogs have also been integrated into the Windows Rgui.
Maintained by Ian Fellows. Last updated 9 years ago.
3.44 score 91 scripts 1 dependent
rjdverse
rjd3stl:R Interface to 'JDemetra+ 3.x' time series analysis software.
R Interface to 'JDemetra+ 3.x' (<https://github.com/jdemetra>) time series analysis software. It provides functions to decompose a time series, including high-frequency data with multiple periodicities.
Maintained by Jean Palate. Last updated 9 months ago.
1 star 3.38 score 1 script
rapidsurveys
odkr:'Open Data Kit' ('ODK') R API
Utility functions for working with datasets gathered using 'Open Data Kit' ('ODK') <https://opendatakit.org/>. These include an API to interface with 'ODK Briefcase', a 'Java' application for fetching and pushing 'ODK' forms and their contents, that allows pulling of data from either a remote 'ODK Aggregate Server' or a local 'ODK' folder; a rename function to give more human-readable variable names for 'ODK' datasets; a merge function to create a single dataframe from a nested 'ODK' dataset; and an expand function to disaggregate multiple choice answers that have been collapsed into a single code by 'ODK'.
Maintained by Ernest Guevarra. Last updated 6 months ago.
odk odk-briefcase open-data-kit openjdk
11 stars 3.32 score 19 scripts
daroczig
AWR.Kinesis:Amazon 'Kinesis' Consumer Application for Stream Processing
Fetching data from Amazon 'Kinesis' Streams using the Java-based 'MultiLangDaemon' interacting with Amazon Web Services ('AWS') for easy stream processing from R. For more information on 'Kinesis', see <https://aws.amazon.com/kinesis>.
Maintained by Gergely Daroczi. Last updated 7 years ago.
4 stars 3.30 score 7 scripts
cefet-rj-dal
daltoolboxdp:Data Pre-Processing Extensions
An important aspect of data analytics is related to data management support for artificial intelligence. It is related to preparing data correctly. This package provides extensions to support data preparation in terms of both data sampling and data engineering. Overall, the package provides researchers with a comprehensive set of functionalities for data science based on experiment lines, promoting ease of use, extensibility, and integration with various tools and libraries. Information on Experiment Line is based on Ogasawara et al. (2009) <doi:10.1007/978-3-642-02279-1_20>.
Maintained by Eduardo Ogasawara. Last updated 4 months ago.
1 star 3.26 score 12 scripts
danielollech
dsa:Seasonal Adjustment of Daily Time Series
Seasonal and calendar adjustment of time series with daily frequency using the DSA approach developed by Ollech, Daniel (2018): Seasonal adjustment of daily time series. Bundesbank Discussion Paper 41/2018.
Maintained by Daniel Ollech. Last updated 4 years ago.
1 star 3.19 score 52 scripts 1 dependent
beast-dev
BeastJar:JAR Dependency for MCMC Using 'BEAST'
Provides JAR to perform Markov chain Monte Carlo (MCMC) inference using the popular Bayesian Evolutionary Analysis by Sampling Trees 'BEAST' software library of Suchard et al (2018) <doi:10.1093/ve/vey016>. 'BEAST' supports auto-tuning Metropolis-Hastings, slice, Hamiltonian Monte Carlo and Sequential Monte Carlo sampling for a large variety of composable standard and phylogenetic statistical models using high performance computing. By placing the 'BEAST' JAR in this package, we offer an efficient distribution system for 'BEAST' use by other R packages using CRAN.
Maintained by Marc A. Suchard. Last updated 5 months ago.
1 star 3.18 score 2 scripts 1 dependent
varungiri
RxnSim:Functions to Compute Chemical and Chemical Reaction Similarity
Methods to compute chemical similarity between two or more reactions and molecules. Allows masking of chemical substructures for weighted similarity computations. Uses packages 'rCDK' and 'fingerprint' for cheminformatics functionality. Methods for reaction similarity and sub-structure masking are as described in: Giri et al. (2015) <doi:10.1093/bioinformatics/btv416>.
Maintained by Varun Giri. Last updated 2 years ago.
2 stars 3.18 score 15 scripts
lianekluge
informativeSCI:Informative Simultaneous Confidence Intervals
Calculation of informative simultaneous confidence intervals for graphically described multiple test procedures and given information weights. Bretz et al. (2009) <doi:10.1002/sim.3495> and Brannath et al. (2024) <doi:10.48550/arXiv.2402.13719>. Furthermore, exploration of the behavior of the informative bounds depending on the information weights. Comparisons with compatible bounds are possible. Strassburger and Bretz (2008) <doi:10.1002/sim.3338>.
Maintained by Liane Kluge. Last updated 9 months ago.
3.18 score
cran
oRus:Operational Research User Stories
A first implementation of automated parsing of user stories, when used to define functional requirements for operational research mathematical models. It allows reading user stories, splitting them on the who-what-why template, and classifying them according to the parts of the mathematical model that they represent. Also provides semantic grouping of stories, for project management purposes.
Maintained by Melina Vidoni. Last updated 5 years ago.
3.18 score
maressyl
ODB:Open Document Databases '.odb' Management
Functions to create, connect, update and query 'HSQL' databases embedded in Open Document Databases files, as 'OpenOffice' and 'LibreOffice' do.
Maintained by Sylvain Mareschal. Last updated 5 years ago.
1 star 3.15 score 14 scripts
brry
OSMscale:Add a Scale Bar to 'OpenStreetMap' Plots
Functionality to handle and project lat-long coordinates, easily download background maps and add a correct scale bar to 'OpenStreetMap' plots in any map projection.
Maintained by Berry Boessenkool. Last updated 1 year ago.
1 star 3.11 score 26 scripts
r-forge
pems.utils:Portable Emissions (and Other Mobile) Measurement System Utilities
Utility functions for the handling, analysis and visualisation of data from portable emissions measurement systems ('PEMS') and other similar mobile activity monitoring devices. The package includes a dedicated 'pems' data class that manages many of the quality control, unit handling and data archiving issues that can hinder efforts to standardise 'PEMS' research.
Maintained by Karl Ropkins. Last updated 3 months ago.
3.06 score 19 scripts
taylor-arnold
coreNLP:Wrappers Around Stanford CoreNLP Tools
Provides a minimal interface for applying annotators from the 'Stanford CoreNLP' java library. Methods are provided for tasks such as tokenisation, part of speech tagging, lemmatisation, named entity recognition, coreference detection and sentiment analysis.
Maintained by Taylor Arnold. Last updated 3 years ago.
1 star 3.04 score 55 scripts
marvels2031
MultiGroupSequential:Group-Sequential Procedures with Multiple Endpoints
Provides various testing procedures for group-sequential trials with multiple endpoints. Two sets of procedures are provided.
Maintained by Xiaodong Luo. Last updated 2 years ago.
3.00 score 1 script
giabaio
survHEhmc:Survival Analysis in Health Economic Evaluation using Hamiltonian Monte Carlo
A module to complement the backbone structure of the package 'survHE' and expand its functionality to run survival models under a Bayesian approach (based on Hamiltonian Monte Carlo). <doi:10.18637/jss.v095.i14>.
Maintained by Gianluca Baio. Last updated 25 days ago.
hamiltonian-monte-carlo health-economic-evaluations rstan survival-analysis uncertainty cpp openjdk
5 stars 3.00 score 1 script
ohdsi
DatabaseConnectorJars:JAR Dependencies for the 'DatabaseConnector' Package
Provides external JAR dependencies for the 'DatabaseConnector' package.
Maintained by Martijn Schuemie. Last updated 3 years ago.
1 star 3.00 score
aphalo
rOmniDriver:Omni Driver R wrapper
This package is a wrapper of the OmniDriver java driver for Ocean Optics spectrometers.
Maintained by Pedro J. Aphalo. Last updated 7 months ago.
data-acquisition spectroscopy openjdk
1 star 3.00 score 6 scripts
cran
rmcfs:The MCFS-ID Algorithm for Feature Selection and Interdependency Discovery
MCFS-ID (Monte Carlo Feature Selection and Interdependency Discovery) is a Monte Carlo method-based tool for feature selection. It also allows for the discovery of interdependencies between the relevant features. MCFS-ID is particularly suitable for the analysis of high-dimensional, 'small n large p' transactional and biological data. M. Draminski, J. Koronacki (2018) <doi:10.18637/jss.v085.i12>.
Maintained by Michal Draminski. Last updated 7 months ago.
1 star 2.95 score 1 dependent
fischuu
REPPlab:R Interface to 'EPP-Lab', a Java Program for Exploratory Projection Pursuit
An R Interface to 'EPP-lab' v1.0. 'EPP-lab' is a Java program for projection pursuit using genetic algorithms written by Alain Berro and S. Larabi Marie-Sainte and is included in the package. The 'EPP-lab' sources are available under https://github.com/fischuu/EPP-lab.git.
Maintained by Daniel Fischer. Last updated 2 years ago.
2.95 score 30 scripts 1 dependent
larskotthoff
llama:Leveraging Learning to Automatically Manage Algorithms
Provides functionality to train and evaluate algorithm selection models for portfolios.
Maintained by Lars Kotthoff. Last updated 4 years ago.
4 stars 2.80 score 53 scripts 1 dependent
giabaio
survHEinla:Survival Analysis in Health Economic Evaluation using INLA
A module to complement the backbone structure of the package survHE and expand its functionality to run survival models under a Bayesian approach (based on Integrated Nested Laplace Approximation; the underlying 'INLA' package is available for download at <https://inla.r-inla-download.org/R/stable/>). <doi:10.18637/jss.v095.i14>.
Maintained by Gianluca Baio. Last updated 25 days ago.
bayesian-inference cost-effectiveness-analysis health-economic-evaluation integrated-nested-laplace-approximation survival-analysis uncertainty openjdk
4 stars 2.78 score
sangillee
CBSr:Fits Cubic Bezier Spline Functions to Intertemporal and Risky Choice Data
Uses monotonically constrained Cubic Bezier Splines (CBS) to approximate latent utility functions in intertemporal choice and risky choice data. For more information, see Lee, Glaze, Bradlow, and Kable <doi:10.1007/s11336-020-09723-4>.
Maintained by Sangil Lee. Last updated 4 years ago.
2.70 score
ohdsi
CohortPathways:Create Pathways from Target to Event Cohorts
Software tool designed to compute the temporal relationship, defined as pathways, between any two instantiated cohorts. The cohorts are input as target and event cohorts.
Maintained by Gowtham Rao. Last updated 9 days ago.
2.70 score
kapelner
OptimalRerandExpDesigns:Optimal Rerandomization Experimental Designs
This is a tool to find the optimal rerandomization threshold in non-sequential experiments. We offer three procedures.
Maintained by Adam Kapelner. Last updated 4 years ago.
2.70 score 1 script
atalay-k
drawsample:Draw Samples with the Desired Properties from a Data Set
A tool to sample data with the desired properties. Samples can be drawn by purposive sampling by specifying distributional conditions, such as deviation from normality (skewness and kurtosis), and sample size in quantitative research studies. For purposive sampling, a researcher has something in mind and participants that fit the purpose of the study are included (Etikan, Musa, & Alkassim, 2015) <doi:10.11648/j.ajtas.20160501.11>. Purposive sampling can be useful for answering many research questions (Klar & Leeper, 2019) <doi:10.1002/9781119083771.ch21>.
Maintained by Kubra Atalay Kabasakal. Last updated 3 years ago.
1 star 2.70 score 1 script
matthbogaert
DecorateR:Fit and Deploy DECORATE Trees
DECORATE (Diverse Ensemble Creation by Oppositional Relabeling of Artificial Training Examples) builds an ensemble of J48 trees by recursively adding artificial samples of the training data ("Melville, P., & Mooney, R. J. (2005) <DOI:10.1016/j.inffus.2004.04.001>").
Maintained by Matthias Bogaert. Last updated 4 years ago.
2.70 score 2 scripts
terminological
jplantuml4r:Jplantuml4r: R6 Java Wrapper Package
An R6 package wrapping java code in the org.github.io.github.terminological:jplantuml4r library. This library was generated by the r6-generator-maven-plugin.
Maintained by Rob Challen. Last updated 11 months ago.
1 star 2.70 score 3 scripts
xiaoruizhu
DataClean:Data Cleaning
Includes functions that researchers or practitioners may use to clean raw data, converting html, xlsx, and txt data files into other formats. It can also be used to manipulate text variables, extract numeric variables from text variables, and perform other variable cleaning processes. It originated from an author's project which focuses on creative performance in an online education environment. The resulting paper of that study will be published soon.
Maintained by Xiaorui(Jeremy) Zhu. Last updated 7 years ago.
2.70 score 4 scripts
crew102
rapidraker:Rapid Automatic Keyword Extraction (RAKE) Algorithm
A 'Java' implementation of the RAKE algorithm ('Rose', S., 'Engel', D., 'Cramer', N. and 'Cowley', W. (2010) <doi:10.1002/9780470689646.ch1>), which can be used to extract keywords from documents without any training data.
Maintained by Christopher Baker. Last updated 4 years ago.
1 star 2.70 score 5 scripts
thijsjanzen
nodeSub:Simulate DNA Alignments Using Node Substitutions
Simulate DNA sequences for the node substitution model. In the node substitution model, substitutions accumulate additionally during a speciation event, providing a potential mechanistic explanation for substitution rate variation. This package provides tools to simulate such a process, simulate a reference process with only substitutions along the branches, and provides tools to infer phylogenies from alignments. More information can be found in Janzen (2021) <doi:10.1093/sysbio/syab085>.
Maintained by Thijs Janzen. Last updated 1 year ago.
1 star 2.70 score 3 scripts
kidoishi
MadanText:Persian Textmining Tool for Frequency Analysis, Statistical Analysis, and Word Clouds
MadanText is an open-source software designed specifically for text mining in the Persian language. It allows users to examine word frequencies, download data for analysis, and generate word clouds. This tool is particularly useful for researchers and analysts working with Persian language data.
Maintained by Kido Ishikawa. Last updated 1 year ago.
2.70 score
kidoishi
MadanTextNetwork:Persian Textmining Tool for Co-Occurrence_Network
MadanText_co-occurrence_network is an open-source software designed specifically for text mining in the Persian language. It adds co-occurrence network functionality to MadanText. The input file uses an Excel format instead of the text format.
Maintained by Kido Ishikawa. Last updated 1 year ago.
2.70 score
cran
RSentiment:Analyse Sentiment of English Sentences
Analyses the sentiment of a sentence in English and assigns a score to it. It can classify sentences into the following categories of sentiment: Positive, Negative, Very Positive, Very Negative, Neutral. For a vector of sentences, it counts the number of sentences in each category of sentiment. In calculating the score, negation and various degrees of adjectives are taken into consideration. It deals only with English sentences.
Maintained by Subhasree Bose. Last updated 7 years ago.
2.70 score
kapelner
bartMachineJARs:bartMachine JARs
These are bartMachine's Java dependency libraries. Note: this package has no functionality of its own and should not be installed as a standalone package without bartMachine.
Maintained by Adam Kapelner. Last updated 3 years ago.
2.64 score 9 scripts 7 dependents
seasadjwg
SAvalidation:Validation checks on seasonally adjusted time series
Functions for running validation checks on a pair of time series, an unadjusted (NSA) and a seasonally adjusted (SA) series.
Maintained by Duncan Elliott. Last updated 5 months ago.
2.60 score 5 scripts
aqlt
publishTC:Tools to help publish the trend-cycle
This package provides several functions to facilitate the computation of the trend-cycle component. In particular, the computation can be done: using the Cascade Linear Filter (CLF) of Dagum, E. B., & Luati, A. (2008); using the classical Henderson symmetric filter and the surrogate Musgrave asymmetric filters; using a local parametrization of the Musgrave asymmetric filters (Quartier-la-Tente 2024); extending the Henderson symmetric filter and the surrogate Musgrave asymmetric filters to take into account additive outliers and level shifts (Quartier-la-Tente 2025). Confidence intervals can be computed and several plots are available.
Maintained by Alain Quartier-la-Tente. Last updated 8 days ago.
2.54 score
jwijffels
RMOA:Connect R with MOA for Massive Online Analysis
Connect R with MOA (Massive Online Analysis - <https://moa.cms.waikato.ac.nz/>) to build classification models and regression models on streaming data or out-of-RAM data. Also streaming recommendation models are made available.
Maintained by Jan Wijffels. Last updated 3 years ago.
1 star 2.53 score 34 scripts
nielsjdewinter
shelltrace:Bivalve Growth and Trace Element Accumulation Model
Contains all the formulae of the growth and trace element uptake model described in the equally-named Geoscientific Model Development paper (de Winter, 2017, <doi:10.5194/gmd-2017-137>). The model takes as input a file with X- and Y-coordinates of digitized growth increments recognized on a longitudinal cross section through the bivalve shell, as well as a BMP file of an elemental map of the cross section surface with chemically distinct phases separated by phase analysis. It proceeds by a step-by-step process described in the paper, by which digitized growth increments are used to calculate changes in shell height, shell thickness, shell volume, shell mass and shell growth rate through the bivalve's life time. Then, results of this growth modelling are combined with the trace element mapping results to trace the incorporation of trace elements into the bivalve shell. Results of various modelling parameters can be exported in the form of XLSX files.
Maintained by Niels J. de Winter. Last updated 7 years ago.
bivalve chemistry elements fisheries growth mollusk palaeobiology palaeoecology paleobiology paleoclimate shell openjdk
2.49 score 31 scripts
jwijffels
RMOAjars:External jars Required for Package RMOA
External jars required for package RMOA. RMOA is a framework to build data stream models on top of MOA (Massive Online Analysis - <https://moa.cms.waikato.ac.nz/>). The jar files are put in this R package; the modelling logic can be found in the RMOA package.
Maintained by Jan Wijffels. Last updated 3 years ago.
1 star 2.48 score 1 dependent
fabriciomlopes
BASiNET:Classification of RNA Sequences using Complex Network Theory
It creates networks from RNA sequences and then abstracts characteristics of these networks using a threshold methodology in order to classify the sequences into classes. There are four data sets in the 'BASiNET' package, "sequences", "sequences2", "sequences-predict" and "sequences2-predict", with 11, 10, 11 and 11 sequences respectively. These sequences were taken from the data set used in the article (LI, Aimin; ZHANG, Junying; ZHOU, Zhongyin, 2014) <doi:10.1186/1471-2105-15-311> and are used to run examples. BASiNET was published in Nucleic Acids Research (ITO, Eric; KATAHIRA, Isaque; VICENTE, Fábio; PEREIRA, Felipe; LOPES, Fabrício, 2018) <doi:10.1093/nar/gky462>.
Maintained by Fabricio Martins Lopes. Last updated 3 years ago.
software biologicalquestion geneprediction openjdk
2.48 score 7 scripts
i02momuj
RKEEL:Using 'KEEL' in R Code
'KEEL' is a popular 'Java' software for a large number of different knowledge data discovery tasks. This package takes advantage of both 'KEEL' and R, allowing the use of 'KEEL' algorithms in simple R code. The R code layer between R and 'KEEL' makes it easy both to use 'KEEL' algorithms in R and to implement new algorithms for 'RKEEL'. It includes more than 100 algorithms for classification, regression, preprocessing, association rules and imbalance learning, which allows a more complete experimentation process. For more information about 'KEEL', see <http://www.keel.es/>.
Maintained by Jose M. Moyano. Last updated 2 years ago.
2 stars 2.41 score 130 scripts
etendard7
ELT:Experience Life Tables
Build experience life tables.
Maintained by Wassim Youssef. Last updated 2 years ago.
2.40 score 25 scripts
hdbeukel
corehunter:Multi-Purpose Core Subset Selection
Core Hunter is a tool to sample diverse, representative subsets from large germplasm collections, with minimum redundancy. Such so-called core collections have applications in plant breeding and genetic resource management in general. Core Hunter can construct cores based on genetic marker data, phenotypic traits or precomputed distance matrices, optimizing one of many provided evaluation measures depending on the precise purpose of the core (e.g. high diversity, representativeness, or allelic richness). In addition, multiple measures can be simultaneously optimized as part of a weighted index to bring the different perspectives closer together. The Core Hunter library is implemented in Java 8 as an open source project (see <http://www.corehunter.org>).
Maintained by Herman De Beukelaer. Last updated 2 years ago.
1 stars 2.34 score 22 scripts
aqlt
rjd3report:Quality Assessment and Reporting for Seasonal Adjustment
Add-in to the 'RJDemetra' package for seasonal adjustment. It produces quality assessment outputs (such as dashboards).
Maintained by Alain Quartier-la-Tente. Last updated 4 months ago.
1 stars 2.30 score 2 scripts
arnoxieseg
LLM:Logit Leaf Model Classifier for Binary Classification
Fits the Logit Leaf Model, makes predictions and visualizes the output. (De Caigny et al., (2018) <DOI:10.1016/j.ejor.2018.02.009>).
Maintained by Arno De Caigny. Last updated 5 years ago.
2 stars 2.30 score 5 scripts
gunjan15
ISM:Interpretive Structural Modelling (ISM)
ISM was developed by Warfield in 1974. ISM is the process of assembling distinct or related elements into a simplified and organized format. Hence, ISM is a methodology that identifies the interrelationships among the elements considered and arranges them in a hierarchical, multilevel structure. To run this package, the user needs to provide a matrix (VAXO) converted into 0's and 1's. Warfield, J.N. (1974) <doi:10.1109/TSMC.1974.5408524> Warfield, J.N. (1974, E-ISSN:2168-2909).
Maintained by Gunjan Bansal. Last updated 7 years ago.
2 stars 2.30 score 1 scripts
alun-thomas
rviewgraph:Animated Graph Layout Viewer
Provides 'Java' graphical user interfaces for viewing, manipulating and plotting graphs. Graphs may be directed or undirected.
Maintained by Alun Thomas. Last updated 2 years ago.
2.30 score 3 scripts
aybekec
RSP:'shiny' Applications for Statistical and Psychometric Analysis
Toolbox with 'shiny' applications for widely used psychometric methods. These methods include the following analyses: item analysis, item response theory calibration, principal component analysis, confirmatory factor analysis - structural equation modeling, and generation of simulated data. References: Chalmers (2012, <doi:10.18637/jss.v048.i06>); Revelle (2022, <https://CRAN.R-project.org/package=psych>, version 2.2.9); Rosseel (2012, <doi:10.18637/jss.v048.i02>); Magis & Raiche (2012, <doi:10.18637/jss.v048.i08>); Magis & Barrada (2017, <doi:10.18637/jss.v076.c01>).
Maintained by Eren Can Aybek. Last updated 2 years ago.
2.28 score 19 scripts