Showing 24 of 24 results
gagolews
stringi:Fast and Portable Character String Processing Facilities
A collection of character string/text/natural language processing tools for pattern searching (e.g., with 'Java'-like regular expressions or the 'Unicode' collation algorithm), random string generation, case mapping, string transliteration, concatenation, sorting, padding, wrapping, Unicode normalisation, date-time formatting and parsing, and many more. They are fast, consistent, convenient, and - thanks to 'ICU' (International Components for Unicode) - portable across all locales and platforms. Documentation about 'stringi' is provided via its website at <https://stringi.gagolewski.com/> and the paper by Gagolewski (2022, <doi:10.18637/jss.v103.i02>).
Maintained by Marek Gagolewski. Last updated 1 month ago.
icu icu4c natural-language-processing nlp regex regexp string-manipulation stringi stringr text text-processing tidy-data unicode cpp
55.5 match 309 stars 18.31 score 10k scripts 8.6k dependents
s-u
Cairo:R Graphics Device using Cairo Graphics Library for Creating High-Quality Bitmap (PNG, JPEG, TIFF), Vector (PDF, SVG, PostScript) and Display (X11 and Win32) Output
R graphics device using the cairographics library that can be used to create high-quality vector (PDF, PostScript and SVG) and bitmap output (PNG, JPEG, TIFF), and high-quality rendering in displays (X11 and Win32). Since it uses the same back-end for all output, copying across formats is WYSIWYG. Files are created without dependence on X11 or other external programs. This device supports an alpha channel (semi-transparent drawing), and the resulting images can contain transparent and semi-transparent regions. It is ideal for use in server environments (file output) and as a replacement for other devices that don't have Cairo's capabilities such as alpha support or anti-aliasing. Backends are modular such that any subset of backends is supported.
Maintained by Simon Urbanek. Last updated 7 months ago.
freetype cairo libx11 libjpeg-turbo harfbuzz icu tiff
33.0 match 14 stars 12.52 score 3.9k scripts 71 dependents
nliulab
AutoScore:An Interpretable Machine Learning-Based Automatic Clinical Score Generator
A novel interpretable machine learning-based framework to automate the development of a clinical scoring model for predefined outcomes. Our novel framework consists of six modules: variable ranking with machine learning, variable transformation, score derivation, model selection, domain knowledge-based score fine-tuning, and performance evaluation. The details are described in our research paper <doi:10.2196/21798>. Users or clinicians could seamlessly generate parsimonious sparse-score risk models (i.e., risk scores), which can be easily implemented and validated in clinical practice. We hope to see its application in various medical case studies.
Maintained by Feng Xie. Last updated 14 days ago.
14.1 match 32 stars 7.70 score 30 scripts
gagolews
stringx:Replacements for Base String Functions Powered by 'stringi'
English is the native language for only 5% of the World population. Also, only 17% of us can understand this text. Moreover, the Latin alphabet is the main one for merely 36% of the total. The early computer era, now a very long time ago, was dominated by the US. Due to the proliferation of the internet, smartphones, social media, and other technologies and communication platforms, this is no longer the case. This package replaces base R string functions (such as grep(), tolower(), sprintf(), and strptime()) with ones that fully support the Unicode standards related to natural language and date-time processing. It also fixes some long-standing inconsistencies, and introduces some new, useful features. Thanks to 'ICU' (International Components for Unicode) and 'stringi', they are fast, reliable, and portable across different platforms.
Maintained by Marek Gagolewski. Last updated 2 months ago.
icu icu4c natural-language-processing nlp regex regexp string-manipulation stringi text text-processing unicode
11.5 match 28 stars 4.75 score 1 script
eth-mds
ricu:Intensive Care Unit Data with R
Focused on (but not exclusive to) data sets hosted on PhysioNet (<https://physionet.org>), 'ricu' provides utilities for download, setup and access of intensive care unit (ICU) data sets. In addition to functions for running arbitrary queries against available data sets, a system for defining clinical concepts and encoding their representations in tabular ICU data is presented.
Maintained by Nicolas Bennett. Last updated 9 months ago.
8.6 match 39 stars 5.65 score 77 scripts
friendly
vcdExtra:'vcd' Extensions and Additions
Provides additional data sets, methods and documentation to complement the 'vcd' package for Visualizing Categorical Data and the 'gnm' package for Generalized Nonlinear Models. In particular, 'vcdExtra' extends mosaic, assoc and sieve plots from 'vcd' to handle 'glm()' and 'gnm()' models and adds a 3D version in 'mosaic3d'. Additionally, methods are provided for comparing and visualizing lists of 'glm' and 'loglm' objects. This package is now a support package for the book, "Discrete Data Analysis with R" by Michael Friendly and David Meyer.
Maintained by Michael Friendly. Last updated 5 months ago.
categorical-data-visualization generalized-linear-models mosaic-plots
4.0 match 24 stars 10.34 score 472 scripts 3 dependents
adibender
pammtools:Piece-Wise Exponential Additive Mixed Modeling Tools for Survival Analysis
The Piece-wise exponential (Additive Mixed) Model (PAMM; Bender and others (2018) <doi:10.1177/1471082X17748083>) is a powerful model class for the analysis of survival (or time-to-event) data, based on Generalized Additive (Mixed) Models (GA(M)Ms). It offers intuitive specification and robust estimation of complex survival models with stratified baseline hazards, random effects, time-varying effects, time-dependent covariates and cumulative effects (Bender and others (2019)), as well as support for left-truncated, competing risks and recurrent events data. pammtools provides a tidy workflow for survival analysis with PAMMs, including data simulation, transformation and other functions for data preprocessing and model post-processing as well as visualization.
Maintained by Andreas Bender. Last updated 2 months ago.
additive-models pamm pammtools piece-wise-exponential survival-analysis
3.5 match 48 stars 8.78 score 310 scripts 8 dependents
lbraglia
aplore3:Datasets from Hosmer, Lemeshow and Sturdivant, "Applied Logistic Regression" (3rd Ed., 2013)
An unofficial companion to "Applied Logistic Regression" by D.W. Hosmer, S. Lemeshow and R.X. Sturdivant (3rd ed., 2013) containing the dataset used in the book.
Maintained by Luca Braglia. Last updated 8 years ago.
4.5 match 12 stars 5.86 score 108 scripts
cran
lsm:Estimation of the log Likelihood of the Saturated Model
When the values of the outcome variable Y are either 0 or 1, the function lsm() calculates the estimate of the log likelihood in the saturated model. This model is characterized by Llinas (2006, ISSN:2389-8976) in section 2.3 through assumptions 1 and 2. The function LogLik() works (almost perfectly) when the number of independent variables K is high, but for small K it calculates wrong values in some cases. For this reason, when Y is dichotomous and the data are grouped in J populations, it is recommended to use the function lsm() because it works very well for all K.
Maintained by Jorge Villalba. Last updated 9 months ago.
6.6 match 2.48 score
csids
cstidy:Helpful Functions for Cleaning Surveillance Data
Helpful functions for the cleaning and manipulation of surveillance data, especially with regard to the creation and validation of panel data from individual-level surveillance data.
Maintained by Richard Aubrey White. Last updated 9 months ago.
3.5 match 4.65 score 1 dependent
a-roshani
ntsDatasets:Neutrosophic Data Sets
Provides a collection of datasets related to neutrosophic sets for statistical modeling and analysis.
Maintained by Amin Roshani. Last updated 8 months ago.
3.8 match 1 star 3.78 score
aallignol
kmi:Kaplan-Meier Multiple Imputation for the Analysis of Cumulative Incidence Functions in the Competing Risks Setting
Performs a Kaplan-Meier multiple imputation to recover the missing potential censoring information from competing risks events, so that standard right-censored methods could be applied to the imputed data sets to perform analyses of the cumulative incidence functions (Allignol and Beyersmann, 2010 <doi:10.1093/biostatistics/kxq018>).
Maintained by Arthur Allignol. Last updated 6 years ago.
3.8 match 2 stars 2.98 score 16 scripts
cran
givitiR:The GiViTI Calibration Test and Belt
Functions to assess the calibration of logistic regression models with the GiViTI (Gruppo Italiano per la Valutazione degli interventi in Terapia Intensiva, Italian Group for the Evaluation of the Interventions in Intensive Care Units - see <http://www.giviti.marionegri.it/>) approach. The approach consists of a graphical tool, namely the GiViTI calibration belt, and the associated statistical test. These tools can be used both to evaluate the internal calibration (i.e. the goodness of fit) and to assess the validity of an externally developed model.
Maintained by Giovanni Nattino. Last updated 8 years ago.
3.3 match 3.32 score 21 scripts
dewittpe
pedalfast.data:PEDALFAST Data
Data files and documentation for PEDiatric vALidation oF vAriableS in TBI (PEDALFAST). The data were used in "Functional Status Scale in Children With Traumatic Brain Injury: A Prospective Cohort Study" by Bennett, Dixon, et al. (2016) <doi:10.1097/PCC.0000000000000934>.
Maintained by Peter DeWitt. Last updated 6 months ago.
4.7 match 2.30 score 5 scripts
epimedplotly
ems:Epimed Solutions Collection for Data Editing, Analysis, and Benchmark of Health Units
Collection of functions related to benchmark with prediction models for data analysis and editing of clinical and epidemiological data.
Maintained by Lunna Borges. Last updated 3 years ago.
4.0 match 1 star 2.26 score 12 scripts 1 dependent
raikens1
stratamatch:Stratification and Matching for Large Observational Data Sets
A pilot matching design to automatically stratify and match large datasets. The manual_stratify() function allows users to manually stratify a dataset based on categorical variables of interest, while the auto_stratify() function does so automatically by allocating a held-aside (pilot) data set, fitting a prognostic score (see Hansen (2008) <doi:10.1093/biomet/asn004>) on the pilot set, and stratifying the data set based on prognostic score quantiles. The strata_match() function then performs optimal matching of the data set in parallel within strata.
Maintained by Rachael C. Aikens. Last updated 3 years ago.
3.5 match 2.30 score 6 scripts
rachelhey
TBFmultinomial:TBF Methodology Extension for Multinomial Outcomes
Extends the test-based Bayes factor (TBF) methodology to multinomial regression models and discrete time-to-event models with competing risks. The TBF methodology has been well developed and implemented for the generalised linear model [Held et al. (2015) <doi:10.1214/14-STS510>] and for the Cox model [Held et al. (2016) <doi:10.1002/sim.7089>].
Maintained by Rachel Heyard. Last updated 6 years ago.
3.6 match 2.00 score 7 scripts
jodamatta
SLOS:ICU Length of Stay Prediction and Efficiency Evaluation
Provides tools for predicting ICU length of stay and assessing ICU efficiency. It is based on the methodologies proposed by Peres et al. (2022, 2023), which utilize data-driven approaches for modeling and validation, offering insights into ICU performance and patient outcomes. References: Peres et al. (2022) <https://pubmed.ncbi.nlm.nih.gov/35988701/>, Peres et al. (2023) <https://pubmed.ncbi.nlm.nih.gov/37922007/>. More information: <https://github.com/igor-peres/ICU-Length-of-Stay-Prediction>.
Maintained by Joana da Matta. Last updated 1 month ago.
3.9 match 1.30 score
cran
cif:Cointegrated ICU Forecasting
Set of forecasting tools to predict ICU beds using a Vector Error Correction model with a single cointegrating vector. Method described in Berta, P. Lovaglio, P.G. Paruolo, P. Verzillo, S., 2020. "Real Time Forecasting of Covid-19 Intensive Care Units demand" Health, Econometrics and Data Group (HEDG) Working Papers 20/16, HEDG, Department of Economics, University of York, <https://www.york.ac.uk/media/economics/documents/hedg/workingpapers/2020/2016.pdf>.
Maintained by Paolo Paruolo. Last updated 3 years ago.
3.8 match 1.00 score 4 scripts
tdhock
nc:Named Capture to Data Tables
User-friendly functions for extracting a data table (row for each match, column for each group) from non-tabular text data using regular expressions, and for melting columns that match a regular expression. Patterns are defined using a readable syntax that makes it easy to build complex patterns in terms of simpler, re-usable sub-patterns. Named R arguments are translated to column names in the output; capture groups without names are used internally in order to provide a standard interface to three regular expression 'C' libraries ('PCRE', 'RE2', 'ICU'). Output can also include numeric columns via user-specified type conversion functions.
Maintained by Toby Hocking. Last updated 2 months ago.
0.5 match 16 stars 6.85 score 46 scripts
paithiov909
audubon:Japanese Text Processing Tools
A collection of Japanese text processing tools for filling Japanese iteration marks, Japanese character type conversions, segmentation by phrase, and text normalization which is based on rules for the 'Sudachi' morphological analyzer and the 'NEologd' (Neologism dictionary for 'MeCab'). These features are specific to Japanese and are not implemented in 'ICU' (International Components for Unicode).
Maintained by Akiru Kato. Last updated 21 days ago.
0.5 match 10 stars 5.61 score 3 scripts 1 dependent
imranshakoor
DataSetsUni:A Collection of Univariate Data Sets
A collection of widely used univariate data sets from various applied domains, for applications of distribution theory. The functions allow researchers and practitioners to quickly, easily, and efficiently access and use these data sets. The data relate to the following applied domains: biomedicine, survival analysis, medicine, reliability analysis, hydrology, actuarial science, operational research, meteorology, extreme values, quality control, engineering, finance, sports and economics. All 100 data sets are documented along with associated references for further details and uses.
Maintained by Muhammad Imran. Last updated 2 years ago.
1.7 match 1.00 score 1 script