R-universe search: occurrence

ropensci

spocc:Interface to Species Occurrence Data Sources

A programmatic interface to many species occurrence data sources, including Global Biodiversity Information Facility ('GBIF'), 'iNaturalist', 'eBird', Integrated Digitized 'Biocollections' ('iDigBio'), 'VertNet', Ocean 'Biogeographic' Information System ('OBIS'), and Atlas of Living Australia ('ALA'). Includes functionality for retrieving species occurrence data, and combining those data.

Maintained by Hannah Owens. Last updated 1 month ago.

specimens api web-services occurrences species taxonomy gbif inat vertnet ebird idigbio obis ala antweb bison data ecoengine inaturalist occurrence species-occurrence spocc

49.0 match 118 stars 10.09 score 552 scripts 5 dependents

ropensci

rgbif:Interface to the Global Biodiversity Information Facility API

A programmatic interface to the Web Service methods provided by the Global Biodiversity Information Facility (GBIF; <https://www.gbif.org/developer/summary>). GBIF is a database of species occurrence records from sources all over the globe. rgbif includes functions for searching for taxonomic names, retrieving information on data providers, getting species occurrence records, getting counts of occurrence records, and using the GBIF tile map service to make rasters summarizing huge amounts of data.

Maintained by John Waller. Last updated 3 days ago.

gbif specimens api web-services occurrences species taxonomy biodiversity data lifewatch oscibio spocc

34.1 match 161 stars 13.26 score 2.1k scripts 20 dependents

nowosad

comat:Creates Co-Occurrence Matrices of Spatial Data

Builds co-occurrence matrices based on spatial raster data. It includes creation of weighted co-occurrence matrices (wecoma) and integrated co-occurrence matrices (incoma; Vadivel et al. (2007) <doi:10.1016/j.patrec.2007.01.004>).

Maintained by Jakub Nowosad. Last updated 1 year ago.

co-occurrence raster spatial cpp

34.2 match 6 stars 6.31 score 25 scripts 3 dependents

jbdorey

BeeBDC:Occurrence Data Cleaning

Flags and checks occurrence data that are in Darwin Core format. The package includes generic functions and data as well as some that are specific to bees. This package is meant to build upon and be complementary to other excellent occurrence cleaning packages, including 'bdc' and 'CoordinateCleaner'. This package uses datasets from several sources and particularly from the Discover Life Website, created by Ascher and Pickering (2020). For further information, please see the original publication and package website. Publication - Dorey et al. (2023) <doi:10.1101/2023.06.30.547152> and package website - Dorey et al. (2023) <https://github.com/jbdorey/BeeBDC>.

Maintained by James B. Dorey. Last updated 4 months ago.

35.4 match 3 stars 5.68 score 7 scripts

r-a-dobson

dynamicSDM:Species Distribution and Abundance Modelling at High Spatio-Temporal Resolution

A collection of novel tools for generating species distribution and abundance models (SDM) that are dynamic through both space and time. These highly flexible functions incorporate spatial and temporal aspects across key SDM stages, including when cleaning and filtering species occurrence data, generating pseudo-absence records, assessing and correcting sampling biases and autocorrelation, extracting explanatory variables and projecting distribution patterns. Throughout, functions utilise Google Earth Engine and Google Drive to minimise the computing power and storage demands associated with species distribution modelling at high spatio-temporal resolution.

Maintained by Rachel Dobson. Last updated 27 days ago.

dynamicsdm google-earth-engine googledrive sdm spatiotemporal spatiotemporal-data-analysis spatiotemporal-forecasting species-distribution-modelling species-distributions

32.0 match 6 stars 6.16 score 20 scripts

luomus

finbif:Interface for the 'Finnish Biodiversity Information Facility' API

A programmatic interface to the 'Finnish Biodiversity Information Facility' ('FinBIF') API (<https://api.laji.fi>). 'FinBIF' aggregates Finnish biodiversity data from multiple sources in a single open access portal for researchers, citizen scientists, industry and government. 'FinBIF' allows users of biodiversity information to find, access, combine and visualise data on Finnish plants, animals and microorganisms. The 'finbif' package makes the publicly available data in 'FinBIF' easily accessible to programmers. Biodiversity information is available on taxonomy and taxon occurrence. Occurrence data can be filtered by taxon, time, location and other variables. The data accessed are conveniently preformatted for subsequent analyses.

Maintained by William K. Morris. Last updated 6 days ago.

api biodiversity biodiversity-informatics biodiversity-information finbif finbif-access occurrences r-programming species specimens taxon taxonomy web-services

24.2 match 5 stars 8.15 score 42 scripts 3 dependents

divdyn

divDyn:Diversity Dynamics using Fossil Sampling Data

Functions to describe sampling and diversity dynamics of fossil occurrence datasets (e.g. from the Paleobiology Database). The package includes methods to calculate range- and occurrence-based metrics of taxonomic richness, extinction and origination rates, along with traditional sampling measures. A powerful subsampling tool is also included that implements frequently used sampling standardization methods in a multiple bin-framework. The plotting of time series and the occurrence data can be simplified by the functions incorporated in the package, as well as other calculations, such as environmental affinities and extinction selectivity testing. Details can be found in: Kocsis, A.T.; Reddin, C.J.; Alroy, J. and Kiessling, W. (2019) <doi:10.1101/423780>.

Maintained by Adam T. Kocsis. Last updated 4 months ago.

diversity extinction fossil-data occurrences origination paleobiology cpp

28.7 match 11 stars 6.48 score 137 scripts

gagolews

stringi:Fast and Portable Character String Processing Facilities

A collection of character string/text/natural language processing tools for pattern searching (e.g., with 'Java'-like regular expressions or the 'Unicode' collation algorithm), random string generation, case mapping, string transliteration, concatenation, sorting, padding, wrapping, Unicode normalisation, date-time formatting and parsing, and many more. They are fast, consistent, convenient, and - thanks to 'ICU' (International Components for Unicode) - portable across all locales and platforms. Documentation about 'stringi' is provided via its website at <https://stringi.gagolewski.com/> and the paper by Gagolewski (2022, <doi:10.18637/jss.v103.i02>).

Maintained by Marek Gagolewski. Last updated 1 month ago.

icu icu4c natural-language-processing nlp regex regexp string-manipulation stringi stringr text text-processing tidy-data unicode cpp

9.9 match 309 stars 18.31 score 10k scripts 8.6k dependents

biodiverse

unmarked:Models for Data from Unmarked Animals

Fits hierarchical models of animal abundance and occurrence to data collected using survey methods such as point counts, site occupancy sampling, distance sampling, removal sampling, and double observer sampling. Parameters governing the state and observation processes can be modeled as functions of covariates. References: Kellner et al. (2023) <doi:10.1111/2041-210X.14123>, Fiske and Chandler (2011) <doi:10.18637/jss.v043.i10>.

Maintained by Ken Kellner. Last updated 1 day ago.

openblas cpp openmp

12.0 match 4 stars 13.03 score 652 scripts 12 dependents

ecospat

ecospat:Spatial Ecology Miscellaneous Methods

Collection of R functions and data sets for the support of spatial ecology analyses with a focus on pre, core and post modelling analyses of species distribution, niche quantification and community assembly. Written by current and former members and collaborators of the ecospat group of Antoine Guisan, Department of Ecology and Evolution (DEE) and Institute of Earth Surface Dynamics (IDYST), University of Lausanne, Switzerland. Read Di Cola et al. (2016) <doi:10.1111/ecog.02671> for details.

Maintained by Olivier Broennimann. Last updated 1 month ago.

15.8 match 32 stars 9.35 score 418 scripts 1 dependent

griffithdan

cooccur:Probabilistic Species Co-Occurrence Analysis in R

This R package applies the probabilistic model of species co-occurrence (Veech 2013) to a set of species distributed among a set of survey or sampling sites. The algorithm calculates the observed and expected frequencies of co-occurrence between each pair of species. The expected frequency is based on the distribution of each species being random and independent of the other species. The analysis returns the probabilities that a more extreme (either low or high) value of co-occurrence could have been obtained by chance. The package also includes functions for visualizing species co-occurrence results and preparing data for downstream analyses.

Maintained by Daniel M. Griffith. Last updated 7 years ago.

29.9 match 3 stars 4.63 score 142 scripts

b-cubed-eu

b3gbi:General Biodiversity Indicators for Biodiversity Data Cubes

Calculate general biodiversity indicators from GBIF data cubes. Includes many common indicators such as species richness and evenness, which can be calculated over time (trends) or space (maps).

Maintained by Shawn Dove. Last updated 13 days ago.

biodiversity-indicators data-cubes

21.7 match 3 stars 6.26 score 34 scripts 1 dependent

avrodrigues

naturaList:Classify Occurrences by Confidence Levels in the Species ID

Classify occurrence records based on confidence levels of species identification. In addition, implement tools to filter occurrences inside grid cells and to manually check for possible errors with an interactive shiny application.

Maintained by Arthur Vinicius Rodrigues. Last updated 1 year ago.

27.3 match 4.66 score 23 scripts

dwbapst

paleotree:Paleontological and Phylogenetic Analyses of Evolution

Provides tools for transforming, a posteriori time-scaling, and modifying phylogenies containing extinct (i.e. fossil) lineages. In particular, most users are interested in the functions timePaleoPhy, bin_timePaleoPhy, cal3TimePaleoPhy and bin_cal3TimePaleoPhy, which date cladograms of fossil taxa using stratigraphic data. This package also contains a large number of likelihood functions for estimating sampling and diversification rates from different types of data available from the fossil record (e.g. range data, occurrence data, etc). paleotree users can also simulate diversification and sampling in the fossil record using the function simFossilRecord, which is a detailed simulator for branching birth-death-sampling processes composed of discrete taxonomic units arranged in ancestor-descendant relationships. Users can use simFossilRecord to simulate diversification in incompletely sampled fossil records, under various models of morphological differentiation (i.e. the various patterns by which morphotaxa originate from one another), and with time-dependent, longevity-dependent and/or diversity-dependent rates of diversification, extinction and sampling. Additional functions allow users to translate simulated ancestor-descendant data from simFossilRecord into standard time-scaled phylogenies or unscaled cladograms that reflect the relationships among taxon units.

Maintained by David W. Bapst. Last updated 8 months ago.

16.0 match 21 stars 7.53 score 216 scripts 2 dependents

r-forge

wordspace:Distributional Semantic Models in R

An interactive laboratory for research on distributional semantic models ('DSM', see <https://en.wikipedia.org/wiki/Distributional_semantics> for more information).

Maintained by Stephanie Evert. Last updated 3 months ago.

cpp openmp

24.0 match 4.95 score 150 scripts 2 dependents

mjanuario

evolved:Open Software for Teaching Evolutionary Biology at Multiple Scales Through Virtual Inquiries

"Evolutionary Virtual Education" - 'evolved' - provides multiple tools to help educators (especially at the graduate level or in advanced undergraduate level courses) apply inquiry-based learning in general evolution classes. In particular, the tools provided include functions that simulate evolutionary processes (e.g., genetic drift, natural selection within a single locus) or concepts (e.g. Hardy-Weinberg equilibrium, phylogenetic distribution of traits). More than only simulating, the package also provides tools for students to analyze (e.g., measuring, testing, visualizing) datasets with characteristics that are common to many fields related to evolutionary biology. Importantly, the package is heavily oriented towards providing tools for inquiry-based learning - where students follow scientific practices to actively construct knowledge. For additional details, see package's vignettes.

Maintained by Matheus Januario. Last updated 1 month ago.

17.3 match 3 stars 6.73 score 23 scripts

ropensci

occCite:Querying and Managing Large Biodiversity Occurrence Datasets

Facilitates the gathering of biodiversity occurrence data from disparate sources. Metadata is managed throughout the process to facilitate reporting and enhanced ability to repeat analyses.

Maintained by Hannah L. Owens. Last updated 5 months ago.

biodiversity-data biodiversity-informatics biodiversity-standards citations museum-collection-specimens museum-collections museum-metadata

15.8 match 23 stars 7.30 score 43 scripts

frareb

inpdfr:Analyse Text Documents Using Ecological Tools

A set of functions to analyse and compare texts, using classical text mining functions, as well as those from theoretical ecology.

Maintained by Rebaudo Francois. Last updated 2 years ago.

25.7 match 2 stars 4.41 score 26 scripts

wallaceecomod

wallace:A Modular Platform for Reproducible Modeling of Species Niches and Distributions

The 'shiny' application Wallace is a modular platform for reproducible modeling of species niches and distributions. Wallace guides users through a complete analysis, from the acquisition of species occurrence and environmental data to visualizing model predictions on an interactive map, thus bundling complex workflows into a single, streamlined interface. An extensive vignette, which guides users through most package functionality, can be found on the package's GitHub Pages website: <https://wallaceecomod.github.io/wallace/articles/tutorial-v2.html>.

Maintained by Mary E. Blair. Last updated 9 days ago.

openjdk

13.5 match 133 stars 8.36 score 96 scripts

tjheaton

carbondate:Calibration and Summarisation of Radiocarbon Dates

Performs Bayesian non-parametric calibration of multiple related radiocarbon determinations, and summarises the calendar age information to plot their joint calendar age density (see Heaton (2022) <doi:10.1111/rssc.12599>). Also models the occurrence of radiocarbon samples as a variable-rate (inhomogeneous) Poisson process, plotting the posterior estimate for the occurrence rate of the samples over calendar time, and providing information about potential change points.

Maintained by Timothy J Heaton. Last updated 2 months ago.

cpp

19.4 match 5 stars 5.78 score 20 scripts

palaeoverse

palaeoverse:Prepare and Explore Data for Palaeobiological Analyses

Provides functionality to support data preparation and exploration for palaeobiological analyses, improving code reproducibility and accessibility. The wider aim of 'palaeoverse' is to bring the palaeobiological community together to establish agreed standards. The package currently includes functionality for data cleaning, binning (time and space), exploration, summarisation and visualisation. Reference datasets (i.e. Geological Time Scales <https://stratigraphy.org/chart>) and auxiliary functions are also provided. Details can be found in: Jones et al. (2023) <doi:10.1111/2041-210X.14099>.

Maintained by Lewis A. Jones. Last updated 5 months ago.

biodiversity fossil palaeobiology paleobiology

12.8 match 21 stars 8.57 score 44 scripts 1 dependent

bmaitner

BIEN:Tools for Accessing the Botanical Information and Ecology Network Database

Provides Tools for Accessing the Botanical Information and Ecology Network Database. The BIEN database contains cleaned and standardized botanical data including occurrence, trait, plot and taxonomic data (See <https://bien.nceas.ucsb.edu/bien/> for more Information). This package provides functions that query the BIEN database by constructing and executing optimized SQL queries.

Maintained by Brian Maitner. Last updated 1 month ago.

17.9 match 6.04 score 205 scripts 5 dependents

insightsengineering

tern:Create Common TLGs Used in Clinical Trials

Tables, Listings, and Graphs (TLG) library for common outputs used in clinical trials.

Maintained by Joe Zhu. Last updated 2 months ago.

clinical-trials graphs listings nest outputs tables

8.3 match 79 stars 12.62 score 186 scripts 9 dependents

ropensci

rvertnet:Search 'Vertnet', a 'Database' of Vertebrate Specimen Records

Retrieve, map and summarize data from the 'VertNet.org' archives (<https://vertnet.org/>). Functions allow searching by many parameters, including 'taxonomic' names, places, and dates. In addition, there is an interface for conducting spatially delimited searches, and another for requesting large 'datasets' via email.

Maintained by Dave Slager. Last updated 5 months ago.

species occurrences biodiversity maps vertnet mammals mammalia specimens api-wrapper specimen spocc

11.9 match 7 stars 8.51 score 35 scripts 6 dependents

ropensci

CoordinateCleaner:Automated Cleaning of Occurrence Records from Biological Collections

Automated flagging of common spatial and temporal errors in biological and paleontological collection data, for the use in conservation, ecology and paleontology. Includes automated tests to easily flag (and exclude) records assigned to country or province centroid, the open ocean, the headquarters of the Global Biodiversity Information Facility, urban areas or the location of biodiversity institutions (museums, zoos, botanical gardens, universities). Furthermore identifies per species outlier coordinates, zero coordinates, identical latitude/longitude and invalid coordinates. Also implements an algorithm to identify data sets with a significant proportion of rounded coordinates. Especially suited for large data sets. The reference for the methodology is: Zizka et al. (2019) <doi:10.1111/2041-210X.13152>.

Maintained by Alexander Zizka. Last updated 1 year ago.

8.4 match 82 stars 10.93 score 306 scripts 3 dependents

iobis

robis:Ocean Biodiversity Information System (OBIS) Client

Client for the Ocean Biodiversity Information System (<https://obis.org>).

Maintained by Pieter Provoost. Last updated 1 year ago.

11.9 match 41 stars 7.54 score 282 scripts

tbep-tech

tbeptools:Data and Indicators for the Tampa Bay Estuary Program

Several functions are provided for working with Tampa Bay Estuary Program data and indicators, including the water quality report card, tidal creek assessments, Tampa Bay Nekton Index, Tampa Bay Benthic Index, seagrass transect data, habitat report card, and fecal indicator bacteria. Additional functions are provided for miscellaneous tasks, such as reference library curation.

Maintained by Marcus Beck. Last updated 9 days ago.

data-analysis tampa-bay tbep water-quality

11.3 match 10 stars 7.86 score 133 scripts

malaria-atlas-project

malariaAtlas:An R Interface to Open-Access Malaria Data, Hosted by the 'Malaria Atlas Project'

A suite of tools to allow you to download all publicly available parasite rate survey points, mosquito occurrence points and raster surfaces from the 'Malaria Atlas Project' <https://malariaatlas.org/> servers as well as utility functions for plotting the downloaded data.

Maintained by Mauricio van den Berg. Last updated 8 months ago.

database malaria opendata raster

9.7 match 44 stars 9.10 score 118 scripts 3 dependents

insightsengineering

chevron:Standard TLGs for Clinical Trials Reporting

Provide standard tables, listings, and graphs (TLGs) libraries used in clinical trials. This package implements a structure to reformat the data with 'dunlin', create reporting tables using 'rtables' and 'tern' with standardized input arguments to enable quick generation of standard outputs. In addition, it also provides comprehensive data checks and script generation functionality.

Maintained by Joe Zhu. Last updated 24 days ago.

clinical-trials graphs listings nest reporting tables

9.8 match 12 stars 8.24 score 12 scripts

config-i1

smooth:Forecasting Using State Space Models

Functions implementing Single Source of Error state space models for purposes of time series analysis and forecasting. The package includes ADAM (Svetunkov, 2023, <https://openforecast.org/adam/>), Exponential Smoothing (Hyndman et al., 2008, <doi:10.1007/978-3-540-71918-2>), SARIMA (Svetunkov & Boylan, 2019, <doi:10.1080/00207543.2019.1600764>), Complex Exponential Smoothing (Svetunkov & Kourentzes, 2018, <doi:10.13140/RG.2.2.24986.29123>), Simple Moving Average (Svetunkov & Petropoulos, 2018, <doi:10.1080/00207543.2017.1380326>) and several simulation functions. It also allows dealing with intermittent demand based on the iETS framework (Svetunkov & Boylan, 2019, <doi:10.13140/RG.2.2.35897.06242>).

Maintained by Ivan Svetunkov. Last updated 2 days ago.

arima arima-forecasting ces ets exponential-smoothing forecast state-space time-series openblas cpp

5.7 match 90 stars 11.87 score 412 scripts 25 dependents

bioc

seqPattern:Visualising oligonucleotide patterns and motif occurrences across a set of sorted sequences

Visualising oligonucleotide patterns and sequence motif occurrences across a large set of sequences centred at a common reference point and sorted by a user-defined feature.

Maintained by Vanja Haberle. Last updated 5 months ago.

visualization sequencematching

14.0 match 4.78 score 12 scripts 7 dependents

rpolars

polars:Lightning-Fast 'DataFrame' Library

Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.

Maintained by Soren Welling. Last updated 3 days ago.

arrow polars rust

5.5 match 499 stars 12.01 score 1.0k scripts 2 dependents

edsandorf

spdesign:Designing Stated Preference Experiments

Contemporary software commonly used to design stated preference experiments is expensive and the code is closed source. This is a free software package with an easy-to-use interface to make flexible stated preference experimental designs using state-of-the-art methods. For an overview of stated choice experimental design theory, see e.g., Rose, J. M. & Bliemer, M. C. J. (2014) in Hess, S. & Daly, A. <doi:10.4337/9781781003152>. The package website can be accessed at <https://spdesign.edsandorf.me>. We acknowledge funding from the European Union’s Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant INSPiRE (Grant agreement ID: 793163).

Maintained by Erlend Dancke Sandorf. Last updated 5 months ago.

14.1 match 4.60 score 20 scripts

gawainantell

divvy:Spatial Subsampling of Biodiversity Occurrence Data

Divide taxonomic occurrence data into geographic regions of fair comparison, with three customisable methods to standardise area and extent. Calculate common biodiversity and range-size metrics on subsampled data. Background theory and practical considerations for the methods are described in Antell and others (2023) <doi:10.31223/X5997Z>.

Maintained by Gawain Antell. Last updated 1 year ago.

16.0 match 4.00 score 10 scripts

marlonecobos

nichevol:Tools for Ecological Niche Evolution Assessment Considering Uncertainty

A collection of tools that allow users to perform critical steps in the process of assessing ecological niche evolution over phylogenies, with uncertainty incorporated explicitly in reconstructions. The method proposed here for ancestral reconstruction of ecological niches characterizes species' niches using a bin-based approach that incorporates uncertainty in estimations. Compared to other existing methods, the approaches presented here reduce risk of overestimation of amounts and rates of ecological niche evolution. The main analyses include: initial exploration of environmental data in occurrence records and accessible areas, preparation of data for phylogenetic analyses, executing comparative phylogenetic analyses of ecological niches, and plotting for interpretations. Details on the theoretical background and methods used can be found in: Owens et al. (2020) <doi:10.1002/ece3.6359>, Peterson et al. (1999) <doi:10.1126/science.285.5431.1265>, Soberón and Peterson (2005) <doi:10.17161/bi.v2i0.4>, Peterson (2011) <doi:10.1111/j.1365-2699.2010.02456.x>, Barve et al. (2011) <doi:10.1111/ecog.02671>, Machado-Stredel et al. (2021) <doi:10.21425/F5FBG48814>, Owens et al. (2013) <doi:10.1016/j.ecolmodel.2013.04.011>, Saupe et al. (2018) <doi:10.1093/sysbio/syx084>, and Cobos et al. (2021) <doi:10.1111/jav.02868>.

Maintained by Marlon E. Cobos. Last updated 2 years ago.

16.2 match 14 stars 3.85 score 2 scripts

rorynolan

filesstrings:Handy File and String Manipulation

This started out as a package for file and string manipulation. Since then, the 'fs' and 'strex' packages emerged, offering functionality previously given by this package (but it's done better in these new ones). Those packages have hence almost pushed 'filesstrings' into extinction. However, it still has a small number of unique, handy file manipulation functions which can be seen in the vignette. One example is a function to remove spaces from all file names in a directory.

Maintained by Rory Nolan. Last updated 1 year ago.

7.3 match 22 stars 8.59 score 632 scripts 4 dependents

b-cubed-eu

gcube:Simulating Biodiversity Data Cubes

This R package provides a simulation framework for biodiversity data cubes. This can start from simulating multiple species distributed in a landscape over a temporal scope. In a second phase, the simulation of a variety of observation processes and effort can generate actual occurrence datasets. Based on their (simulated) spatial uncertainty, occurrences can then be designated to a grid to form a data cube.

Maintained by Ward Langeraert. Last updated 1 month ago.

biodiversity-informatics data-cubes simulations

13.5 match 6 stars 4.60 score 9 scripts

skembel

picante:Integrating Phylogenies and Ecology

Functions for phylocom integration, community analyses, null-models, traits and evolution. Implements numerous ecophylogenetic approaches including measures of community phylogenetic and trait diversity, phylogenetic signal, estimation of trait values for unobserved taxa, null models for community and phylogeny randomizations, and utility functions for data input/output and phylogeny plotting. A full description of package functionality and methods is provided by Kembel et al. (2010) <doi:10.1093/bioinformatics/btq166>.

Maintained by Steven W. Kembel. Last updated 2 years ago.

5.3 match 34 stars 11.42 score 1.1k scripts 16 dependents

jamiemkass

ENMeval:Automated Tuning and Evaluations of Ecological Niche Models

Runs ecological niche models over all combinations of user-defined settings (i.e., tuning), performs cross validation to evaluate models, and returns data tables to aid in selection of optimal model settings that balance goodness-of-fit and model complexity. Also has functions to partition data spatially (or not) for cross validation, to plot multiple visualizations of results, to run null models to estimate significance and effect sizes of performance metrics, and to calculate range overlap between model predictions, among others. The package was originally built for Maxent models (Phillips et al. 2006, Phillips et al. 2017), but the current version allows possible extensions for any modeling algorithm. The extensive vignette, which guides users through most package functionality but unfortunately has a file size too big for CRAN, can be found here on the package's Github Pages website: <https://jamiemkass.github.io/ENMeval/articles/ENMeval-2.0-vignette.html>.

Maintained by Jamie M. Kass. Last updated 2 months ago.

5.3 match 49 stars 11.25 score 332 scripts 2 dependents

xijianzheng

coefa:Meta Analysis of Factor Analysis Based on CO-Occurrence Matrices

Provide a series of functions to conduct a meta analysis of factor analysis based on co-occurrence matrices. The tool can be used to solve the factor structure (i.e. inner structure of a construct, or scale) debate in several disciplines, such as psychology, psychiatry, management, education, and so on. References: Shafer (2005) <doi:10.1037/1040-3590.17.3.324>; Shafer (2006) <doi:10.1002/jclp.20213>; Loeber and Schmaling (1985) <doi:10.1007/BF00910652>.

Maintained by Xijian Zheng. Last updated 2 years ago.

21.9 match 2.70 score 4 scripts

rorynolan

strex:Extra String Manipulation Functions

There are some things that I wish were easier with the 'stringr' or 'stringi' packages. The foremost of these is the extraction of numbers from strings. 'stringr' and 'stringi' make you figure out the regular expression for yourself; 'strex' takes care of this for you. There are many other handy functionalities in 'strex'. Contributions to this package are encouraged; it is intended as a miscellany of string manipulation functions that cannot be found in 'stringi' or 'stringr'.

Maintained by Rory Nolan. Last updated 6 months ago.

5.3 match 41 stars 10.59 score 1.2k scripts 18 dependents

mlammens

spThin:Functions for Spatial Thinning of Species Occurrence Records for Use in Ecological Models

A set of functions that can be used to spatially thin species occurrence data. The resulting thinned data can be used in ecological modeling, such as ecological niche modeling.

Maintained by Matthew E. Aiello-Lammens. Last updated 5 years ago.

6.9 match 2 stars 8.00 score 209 scripts 3 dependents

ropensci

paleobioDB:Download and Process Data from the Paleobiology Database

Includes functions to wrap most endpoints of the 'PaleobioDB' API and functions to visualize and process the fossil data. The API documentation for the Paleobiology Database can be found at <https://paleobiodb.org/data1.2/>.

Maintained by Adrián Castro Insua. Last updated 1 year ago.

8.9 match 42 stars 6.19 score 74 scripts

dwarton

ecostats:Code and Data Accompanying the Eco-Stats Text (Warton 2022)

Functions and data supporting the Eco-Stats text (Warton, 2022, Springer), and solutions to exercises. Functions include tools for using simulation envelopes in diagnostic plots, and a function for diagnostic plots of multivariate linear models. Datasets mentioned in the package are included here (where not available elsewhere) and there is a vignette for each chapter of the text with solutions to exercises.

Maintained by David Warton. Last updated 1 year ago.

8.3 match 8 stars 6.58 score 53 scripts

bioc

sarks:Suffix Array Kernel Smoothing for discovery of correlative sequence motifs and multi-motif domains

Suffix Array Kernel Smoothing (see https://academic.oup.com/bioinformatics/article-abstract/35/20/3944/5418797), or SArKS, identifies sequence motifs whose presence correlates with numeric scores (such as differential expression statistics) assigned to the sequences (such as gene promoters). SArKS smooths over sequence similarity, quantified by location within a suffix array based on the full set of input sequences. A second round of smoothing over spatial proximity within sequences reveals multi-motif domains. Discovered motifs can then be merged or extended based on adjacency within MMDs. False positive rates are estimated and controlled by permutation testing.

Maintained by Dennis Wylie. Last updated 5 months ago.

motifdiscovery generegulation geneexpression transcriptomics rnaseq differentialexpression featureextraction openjdk

11.0 match 3 stars 4.78 score 3 scripts

bioc

MutationalPatterns:Comprehensive genome-wide analysis of mutational processes

Mutational processes leave characteristic footprints in genomic DNA. This package provides a comprehensive set of flexible functions that allows researchers to easily evaluate and visualize a multitude of mutational patterns in base substitution catalogues of e.g. healthy samples, tumour samples, or DNA-repair deficient cells. The package covers a wide range of patterns including: mutational signatures, transcriptional and replicative strand bias, lesion segregation, genomic distribution and association with genomic features, which are collectively meaningful for studying the activity of mutational processes. The package works with single nucleotide variants (SNVs), insertions and deletions (Indels), double base substitutions (DBSs) and larger multi base substitutions (MBSs). The package provides functionalities for both extracting mutational signatures de novo and determining the contribution of previously identified mutational signatures on a single sample level. MutationalPatterns integrates with common R genomic analysis workflows and allows easy association with (publicly available) annotation data.

Maintained by Mark van Roosmalen. Last updated 5 months ago.

genetics somaticmutation

7.1 match 7.27 score 251 scripts 1 dependents

mapme-initiative

mapme.biodiversity:Efficient Monitoring of Global Biodiversity Portfolios

Biodiversity areas, especially primary forest, serve a multitude of functions for the local economy, the regional functionality of ecosystems, and the global health of our planet. Recently, adverse changes in human land use practices and climatic responses to increased greenhouse gas emissions have put these biodiversity areas under a variety of threats. The present package helps to analyse a number of biodiversity indicators based on freely available geographical datasets. It supports computationally efficient routines that allow the analysis of potentially global biodiversity portfolios. The primary use case of the package is to support evidence-based reporting of an organization's effort to protect biodiversity areas under threat and to identify regions where intervention is most urgently needed.

Maintained by Darius A. Görgen. Last updated 3 months ago.

environment eo gis mapme spatial sustainability

5.5 match 35 stars 9.24 score 287 scripts

ctmm-initiative

ctmm:Continuous-Time Movement Modeling

Functions for identifying, fitting, and applying continuous-space, continuous-time stochastic-process movement models to animal tracking data. The package is described in Calabrese et al (2016) <doi:10.1111/2041-210X.12559>, with models and methods based on those introduced and detailed in Fleming & Calabrese et al (2014) <doi:10.1086/675504>, Fleming et al (2014) <doi:10.1111/2041-210X.12176>, Fleming et al (2015) <doi:10.1103/PhysRevE.91.032107>, Fleming et al (2015) <doi:10.1890/14-2010.1>, Fleming et al (2016) <doi:10.1890/15-1607>, Péron & Fleming et al (2016) <doi:10.1186/s40462-016-0084-7>, Fleming & Calabrese (2017) <doi:10.1111/2041-210X.12673>, Péron et al (2017) <doi:10.1002/ecm.1260>, Fleming et al (2017) <doi:10.1016/j.ecoinf.2017.04.008>, Fleming et al (2018) <doi:10.1002/eap.1704>, Winner & Noonan et al (2018) <doi:10.1111/2041-210X.13027>, Fleming et al (2019) <doi:10.1111/2041-210X.13270>, Noonan & Fleming et al (2019) <doi:10.1186/s40462-019-0177-1>, Fleming et al (2020) <doi:10.1101/2020.06.12.130195>, Noonan et al (2021) <doi:10.1111/2041-210X.13597>, Fleming et al (2022) <doi:10.1111/2041-210X.13815>, Silva et al (2022) <doi:10.1111/2041-210X.13786>, Alston & Fleming et al (2023) <doi:10.1111/2041-210X.14025>.

Maintained by Christen H. Fleming. Last updated 2 months ago.

4.8 match 49 stars 10.57 score 534 scripts 4 dependents

tommyjones

textmineR:Functions for Text Mining and Topic Modeling

An aid for text mining in R, with a syntax that should be familiar to experienced R users. Provides a wrapper for several topic models that take similarly-formatted input and give similarly-formatted output. Provides additional functionality for analyzing and diagnosing topic models.

Maintained by Tommy Jones. Last updated 2 years ago.

cpp

4.6 match 106 stars 10.83 score 310 scripts 7 dependents

sylvainschmitt

SSDM:Stacked Species Distribution Modelling

Maps species richness and endemism based on stacked species distribution models (SSDM). Individual SDMs can be created using a single or multiple algorithms (ensemble SDMs). For each species, an SDM can yield a habitat suitability map, a binary map, a between-algorithm variance map, and can assess variable importance, algorithm accuracy, and between-algorithm correlation. Methods to stack individual SDMs include summing individual probabilities and thresholding then summing. Thresholding can be based on a specific evaluation metric or by drawing repeatedly from a Bernoulli distribution. The SSDM package also provides a user-friendly interface.

Maintained by Sylvain Schmitt. Last updated 10 months ago.

7.0 match 44 stars 6.99 score 44 scripts

archaeostat

ArchaeoPhases:Post-Processing of Markov Chain Monte Carlo Simulations for Chronological Modelling

Statistical analysis of archaeological dates and groups of dates. This package post-processes Markov Chain Monte Carlo (MCMC) simulations from 'ChronoModel' <https://chronomodel.com/>, 'Oxcal' <https://c14.arch.ox.ac.uk/oxcal.html> or 'BCal' <https://bcal.shef.ac.uk/>. It provides functions for the study of rhythms of the long term from the posterior distribution of a series of dates (tempo and activity plot). It also allows the estimation and visualization of time ranges from the posterior distribution of groups of dates (e.g. duration, transition and hiatus between successive phases) as described in Philippe and Vibet (2020) <doi:10.18637/jss.v093.c01>.

Maintained by Anne Philippe. Last updated 11 months ago.

archaeology bayesian-statistics geochronology markov-chain radiocarbon-dates

6.8 match 10 stars 6.90 score 66 scripts

gagolews

stringx:Replacements for Base String Functions Powered by 'stringi'

English is the native language for only 5% of the World population. Also, only 17% of us can understand this text. Moreover, the Latin alphabet is the main one for merely 36% of the total. The early computer era, now a very long time ago, was dominated by the US. Due to the proliferation of the internet, smartphones, social media, and other technologies and communication platforms, this is no longer the case. This package replaces base R string functions (such as grep(), tolower(), sprintf(), and strptime()) with ones that fully support the Unicode standards related to natural language and date-time processing. It also fixes some long-standing inconsistencies, and introduces some new, useful features. Thanks to 'ICU' (International Components for Unicode) and 'stringi', they are fast, reliable, and portable across different platforms.

Maintained by Marek Gagolewski. Last updated 2 months ago.

icu icu4c natural-language-processing nlp regex regexp string-manipulation stringi text text-processing unicode

9.8 match 28 stars 4.75 score 1 scripts

lleisong

itsdm:Isolation Forest-Based Presence-Only Species Distribution Modeling

Collection of R functions to do purely presence-only species distribution modeling with isolation forest (iForest) and its variations such as Extended isolation forest and SCiForest. See the details of these methods in references: Liu, F.T., Ting, K.M. and Zhou, Z.H. (2008) <doi:10.1109/ICDM.2008.17>, Hariri, S., Kind, M.C. and Brunner, R.J. (2019) <doi:10.1109/TKDE.2019.2947676>, Liu, F.T., Ting, K.M. and Zhou, Z.H. (2010) <doi:10.1007/978-3-642-15883-4_18>, Guha, S., Mishra, N., Roy, G. and Schrijvers, O. (2016) <https://proceedings.mlr.press/v48/guha16.html>, Cortes, D. (2021) <arXiv:2110.13402>. Additionally, Shapley values are used to explain model inputs and outputs. See details in references: Shapley, L.S. (1953) <doi:10.1515/9781400881970-018>, Lundberg, S.M. and Lee, S.I. (2017) <https://dl.acm.org/doi/abs/10.5555/3295222.3295230>, Molnar, C. (2020) <ISBN:978-0-244-76852-2>, Štrumbelj, E. and Kononenko, I. (2014) <doi:10.1007/s10115-013-0679-x>. itsdm also provides functions to diagnose variable response, analyze variable importance, draw spatial dependence of variables and examine variable contribution. As utilities, the package includes a few functions to download bioclimatic variables including 'WorldClim' version 2.0 (see Fick, S.E. and Hijmans, R.J. (2017) <doi:10.1002/joc.5086>) and 'CMCC-BioClimInd' (see Noce, S., Caporaso, L. and Santini, M. (2020) <doi:10.1038/s41597-020-00726-5>).

Maintained by Lei Song. Last updated 2 years ago.

isolation-forest outlier-detection presence-onlymodel shapley-value species-distribution-modelling

8.1 match 4 stars 5.59 score 65 scripts

atlasoflivingaustralia

galah:Biodiversity Data from the GBIF Node Network

The Global Biodiversity Information Facility ('GBIF', <https://www.gbif.org>) sources data from an international network of data providers, known as 'nodes'. Several of these nodes - the "living atlases" (<https://living-atlases.gbif.org>) - maintain their own web services using software originally developed by the Atlas of Living Australia ('ALA', <https://www.ala.org.au>). 'galah' enables the R community to directly access data and resources hosted by 'GBIF' and its partner nodes.

Maintained by Martin Westgate. Last updated 1 months ago.

4.9 match 43 stars 9.17 score 275 scripts 1 dependents

bnosac

udpipe:Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit

This natural language processing toolkit provides language-agnostic 'tokenization', 'parts of speech tagging', 'lemmatization' and 'dependency parsing' of raw text. Next to text parsing, the package also allows you to train annotation models based on data of 'treebanks' in 'CoNLL-U' format as provided at <https://universaldependencies.org/format.html>. The techniques are explained in detail in the paper: 'Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe', available at <doi:10.18653/v1/K17-3009>. The toolkit also contains functionalities for commonly used data manipulations on texts which are enriched with the output of the parser. These include functionalities and algorithms for collocations, token co-occurrence, document term matrix handling, term frequency inverse document frequency calculations, information retrieval metrics (Okapi BM25), handling of multi-word expressions, keyword detection (Rapid Automatic Keyword Extraction, noun phrase extraction, syntactical patterns), sentiment scoring and semantic similarity analysis.

Maintained by Jan Wijffels. Last updated 2 years ago.

conll dependency-parser lemmatization natural-language-processing nlp pos-tagging r-pkg rcpp text-mining tokenizer udpipe cpp

3.8 match 215 stars 11.83 score 1.2k scripts 9 dependents

dracor-org

rdracor:Access to the 'DraCor' API

Provides an interface to the 'Drama Corpora Project' ('DraCor') API: <https://dracor.org/documentation/api>.

Maintained by Ivan Pozdniakov. Last updated 6 months ago.

8.7 match 14 stars 5.05 score 40 scripts

theropod1

paleoDiv:Extracting and Visualizing Paleobiodiversity

Contains various tools for conveniently downloading and editing taxon-specific datasets from the Paleobiology Database <https://paleobiodb.org>, extracting information on abundance, temporal distribution of subtaxa and taxonomic diversity through deep time, and visualizing these data in relation to phylogeny and stratigraphy.

Maintained by Darius Nau. Last updated 5 months ago.

15.7 match 2 stars 2.78 score

thomaschln

nlpembeds:Natural Language Processing Embeddings

Provides efficient methods to compute co-occurrence matrices, pointwise mutual information (PMI) and singular value decomposition (SVD). In the biomedical and clinical settings, one challenge is the huge size of databases, e.g. when analyzing data of millions of patients over tens of years. To address this, this package provides functions to efficiently compute monthly co-occurrence matrices, which is the computational bottleneck of the analysis, by using the 'RcppAlgos' package and sparse matrices. Furthermore, the functions can be called on 'SQL' databases, enabling the computation of co-occurrence matrices of tens of gigabytes of data, representing millions of patients over tens of years. Partly based on Hong C. (2021) <doi:10.1038/s41746-021-00519-z>.

Maintained by Thomas Charlon. Last updated 25 days ago.

8.7 match 4.98 score

biorgeo

bioregion:Comparison of Bioregionalisation Methods

The main purpose of this package is to propose a transparent methodological framework to compare bioregionalisation methods based on hierarchical and non-hierarchical clustering algorithms (Kreft & Jetz (2010) <doi:10.1111/j.1365-2699.2010.02375.x>) and network algorithms (Lenormand et al. (2019) <doi:10.1002/ece3.4718> and Leroy et al. (2019) <doi:10.1111/jbi.13674>).

Maintained by Maxime Lenormand. Last updated 11 days ago.

biogeography bioregion bioregionalization cpp

6.9 match 7 stars 6.27 score 11 scripts

ecor

RGENERATEPREC:Tools to Generate Daily-Precipitation Time Series

The method 'generate()' is extended for spatial multi-site stochastic generation of daily precipitation. It generates precipitation occurrence in several sites using logit regression (Generalized Linear Models) and the approach by D.S. Wilks (1998) <doi:10.1016/S0022-1694(98)00186-3>.

Maintained by Emanuele Cordano. Last updated 7 months ago.

8.0 match 4 stars 5.26 score 45 scripts

equitable-equations

fqar:Floristic Quality Assessment Tools for R

Tools for downloading and analyzing floristic quality assessment data. See Freyman et al. (2015) <doi:10.1111/2041-210X.12491> for more information about floristic quality assessment and the associated database.

Maintained by Andrew Gard. Last updated 2 months ago.

7.0 match 5 stars 5.88 score 5 scripts

lbbe-software

Mondrian:A Simple Graphical Representation of the Relative Occurrence and Co-Occurrence of Events

The single function of this package represents, in one graph, the relative occurrence and co-occurrence of events measured in a sample. As an example, the package was applied to describe the occurrence and co-occurrence of different species of bacterial or viral symbionts infecting arthropods at the individual level. The graphic shows the prevalence of each symbiont and the patterns of multiple infections (i.e. whether or not different symbionts share the same individual hosts). We named the package after the famous painter as the graphical output recalls Mondrian’s paintings.

Maintained by Aurélie Siberchicot. Last updated 8 months ago.

10.2 match 2 stars 4.00 score 8 scripts

macroecology

letsR:Data Handling and Analysis in Macroecology

Handling, processing, and analyzing geographic data on species' distributions and environmental variables. Read Vilela & Villalobos (2015) <doi:10.1111/2041-210X.12401> for details.

Maintained by Bruno Vilela. Last updated 2 months ago.

4.5 match 29 stars 8.87 score 104 scripts

bmaitner

S4DM:Small Sample Size Species Distribution Modeling

Implements a set of distribution modeling methods that are suited to species with small sample sizes (e.g., poorly sampled species or rare species). While these methods can also be used on well-sampled taxa, they are united by the fact that they can be utilized with relatively few data points. More details on the currently implemented methodologies can be found in Drake and Richards (2018) <doi:10.1002/ecs2.2373>, Drake (2015) <doi:10.1098/rsif.2015.0086>, and Drake (2014) <doi:10.1890/ES13-00202.1>.

Maintained by Brian S. Maitner. Last updated 1 months ago.

open-science range-modelling rare-species species-distribution-modeling species-distribution-modelling

6.7 match 4 stars 5.97 score 33 scripts

luomus

f2g:FinBIF to GBIF

Tools for publishing FinBIF data to GBIF.

Maintained by William K. Morris. Last updated 10 days ago.

13.1 match 1 stars 3.02 score

quanteda

quanteda:Quantitative Analysis of Textual Data

A fast, flexible, and comprehensive framework for quantitative text analysis in R. Provides functionality for corpus management, creating and manipulating tokens and n-grams, exploring keywords in context, forming and manipulating sparse matrices of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and distances, applying content dictionaries, applying supervised and unsupervised machine learning, visually representing text and text analyses, and more.

Maintained by Kenneth Benoit. Last updated 2 months ago.

corpus natural-language-processing quanteda text-analytics onetbb cpp

2.3 match 851 stars 16.68 score 5.4k scripts 51 dependents

r-forge

surveillance:Temporal and Spatio-Temporal Modeling and Monitoring of Epidemic Phenomena

Statistical methods for the modeling and monitoring of time series of counts, proportions and categorical data, as well as for the modeling of continuous-time point processes of epidemic phenomena. The monitoring methods focus on aberration detection in count data time series from public health surveillance of communicable diseases, but applications could just as well originate from environmetrics, reliability engineering, econometrics, or social sciences. The package implements many typical outbreak detection procedures such as the (improved) Farrington algorithm, or the negative binomial GLR-CUSUM method of Hoehle and Paul (2008) <doi:10.1016/j.csda.2008.02.015>. A novel CUSUM approach combining logistic and multinomial logistic modeling is also included. The package contains several real-world data sets, the ability to simulate outbreak data, and to visualize the results of the monitoring in a temporal, spatial or spatio-temporal fashion. A recent overview of the available monitoring procedures is given by Salmon et al. (2016) <doi:10.18637/jss.v070.i10>. For the retrospective analysis of epidemic spread, the package provides three endemic-epidemic modeling frameworks with tools for visualization, likelihood inference, and simulation. hhh4() estimates models for (multivariate) count time series following Paul and Held (2011) <doi:10.1002/sim.4177> and Meyer and Held (2014) <doi:10.1214/14-AOAS743>. twinSIR() models the susceptible-infectious-recovered (SIR) event history of a fixed population, e.g., epidemics across farms or networks, as a multivariate point process as proposed by Hoehle (2009) <doi:10.1002/bimj.200900050>. twinstim() estimates self-exciting point process models for a spatio-temporal point pattern of infective events, e.g., time-stamped geo-referenced surveillance data, as proposed by Meyer et al. (2012) <doi:10.1111/j.1541-0420.2011.01684.x>. A recent overview of the implemented space-time modeling frameworks for epidemic phenomena is given by Meyer et al. (2017) <doi:10.18637/jss.v077.i11>.

Maintained by Sebastian Meyer. Last updated 2 days ago.

cpp

3.6 match 2 stars 10.68 score 446 scripts 3 dependents

ohdsi

omock:Creation of Mock Observational Medical Outcomes Partnership Common Data Model

Creates mock data for testing and package development for the Observational Medical Outcomes Partnership common data model. The package offers functions crafted with pipeline-friendly implementation, enabling users to effortlessly include only the necessary tables for their testing needs.

Maintained by Mike Du. Last updated 1 months ago.

5.1 match 2 stars 7.44 score 45 scripts 1 dependents

bioc

mosbi:Molecular Signature identification using Biclustering

This package is an implementation of the biclustering ensemble method MoSBi (Molecular Signature Identification from Biclustering). MoSBi provides standardized interfaces for biclustering results and can combine their results with a multi-algorithm ensemble approach to compute robust ensemble biclusters on molecular omics data. This is done by computing similarity networks of biclusters and filtering for overlaps using a custom error model. After that, Louvain modularity is used to extract bicluster communities from the similarity network, which can then be converted to ensemble biclusters. Additionally, MoSBi includes several network visualization methods to give an intuitive and scalable overview of the results. MoSBi comes with several biclustering algorithms, but can be easily extended to new biclustering algorithms.

Maintained by Tim Daniel Rose. Last updated 5 months ago.

software statisticalmethod clustering network cpp

8.8 match 4.30 score 8 scripts

nataliepatten

gatoRs:Geographic and Taxonomic Occurrence R-Based Scrubbing

Streamlines downloading and cleaning biodiversity data from Integrated Digitized Biocollections (iDigBio) and the Global Biodiversity Information Facility (GBIF).

Maintained by Natalie N. Patten. Last updated 10 months ago.

6.0 match 11 stars 6.16 score 66 scripts

tesselle

tabula:Analysis and Visualization of Archaeological Count Data

An easy way to examine archaeological count data. This package provides several tests and measures of diversity: heterogeneity and evenness (Brillouin, Shannon, Simpson, etc.), richness and rarefaction (Chao1, Chao2, ACE, ICE, etc.), turnover and similarity (Brainerd-Robinson, etc.). It allows easy visualization of count data and statistical thresholds: rank vs abundance plots, heatmaps, Ford (1962) and Bertin (1977) diagrams, etc.

Maintained by Nicolas Frerebeau. Last updated 13 days ago.

data-visualization archaeology archaeological-science

7.1 match 5.10 score 38 scripts 1 dependents

trinker

qdap:Bridging the Gap Between Qualitative Data and Quantitative Analysis

Automates many of the tasks associated with quantitative discourse analysis of transcripts containing discourse, including frequency counts of sentence types, words, sentences, turns of talk, syllables and other assorted analysis tasks. The package provides parsing tools for preparing transcript data. Many functions enable the user to aggregate data by any number of grouping variables, providing analysis and seamless integration with other R packages that undertake higher level analysis and visualization of text. This affords the user a more efficient and targeted analysis. 'qdap' is designed for transcript analysis; however, many functions are applicable to other areas of Text Mining/Natural Language Processing.

Maintained by Tyler Rinker. Last updated 4 years ago.

qdap quantitative-discourse-analysis text-analysis text-mining text-plotting openjdk

3.7 match 176 stars 9.61 score 1.3k scripts 3 dependents

jmestret

GeoThinneR:Simple Spatial Thinning for Ecological and Spatial Analysis

Provides efficient geospatial thinning algorithms to reduce the density of coordinate data while maintaining spatial relationships. Implements K-D Tree and brute-force distance-based thinning, as well as grid-based and precision-based thinning methods. For more information on the methods, see Elseberg et al. (2012) <https://hdl.handle.net/10446/86202>.

Maintained by Jorge Mestre-Tomás. Last updated 5 months ago.

cpp

6.2 match 9 stars 5.73 score 7 scripts 1 dependents

bjoelle

FossilSim:Simulation and Plots for Fossil and Taxonomy Data

Simulating and plotting taxonomy and fossil data on phylogenetic trees under mechanistic models of speciation, preservation and sampling.

Maintained by Joelle Barido-Sottani. Last updated 6 months ago.

6.7 match 1 stars 5.24 score 65 scripts 1 dependents

adamlilith

enmSdmX:Species Distribution Modeling and Ecological Niche Modeling

Implements species distribution modeling and ecological niche modeling, including: bias correction, spatial cross-validation, model evaluation, raster interpolation, biotic "velocity" (speed and direction of movement of a "mass" represented by a raster), interpolating across a time series of rasters, and use of spatially imprecise records. The heart of the package is a set of "training" functions which automatically optimize model complexity based on the number of available occurrences. These algorithms include MaxEnt, MaxNet, boosted regression trees/gradient boosting machines, generalized additive models, generalized linear models, natural splines, and random forests. To enhance interoperability with other modeling packages, no new classes are created. The package works with 'PROJ6' geodetic objects and coordinate reference systems.

Maintained by Adam B. Smith. Last updated 25 days ago.

bias-correction biogeography ecological-niche-modeling ecological-niche-modelling niche-modeling niche-modelling species-distribution-modeling openjdk

6.2 match 25 stars 5.62 score 37 scripts

eitsupi

neopolars:R Bindings for the 'polars' Rust Library

Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.

Maintained by Tatsuya Shima. Last updated 1 days ago.

rust cargo

7.0 match 40 stars 4.86 score 1 scripts

dvrbts

labdsv:Ordination and Multivariate Analysis for Ecology

A variety of ordination and community analyses useful in analysis of data sets in community ecology. Includes many of the common ordination methods, with graphical routines to facilitate their interpretation, as well as several novel analyses.

Maintained by David W. Roberts. Last updated 2 years ago.

fortran

5.6 match 3 stars 6.08 score 452 scripts 13 dependents

azvoleff

glcm:Calculate Textures from Grey-Level Co-Occurrence Matrices (GLCMs)

Enables calculation of image textures (Haralick 1973) <doi:10.1109/TSMC.1973.4309314> from grey-level co-occurrence matrices (GLCMs). Supports processing images that cannot fit in memory.

Maintained by Alex Zvoleff. Last updated 5 years ago.

openblas cpp

6.7 match 15 stars 5.05 score 74 scripts

luismurao

tenm:Temporal Ecological Niche Models

Implements methods and functions to calibrate time-specific niche models (multi-temporal calibration), letting users execute a strict calibration and selection process of niche models based on ellipsoids, as well as functions to project the potential distribution in the present and in global change scenarios. The 'tenm' package has functions to recover information that may be lost or overlooked while applying a data curation protocol. This curation involves preserving occurrences that may appear spatially redundant (occurring in the same pixel) but originate from different time periods. A novel aspect of this package is that it might reconstruct the fundamental niche more accurately than mono-calibrated approaches. The theoretical background of the package can be found in Peterson et al. (2011) <doi:10.5860/CHOICE.49-6266>.

Maintained by Luis Osorio-Olvera. Last updated 8 months ago.

5.8 match 5 stars 5.77 score 34 scripts

viralemergence

insectDisease:Ecological Database of the World's Insect Pathogens

David Onstad provided us with this insect disease database, sometimes referred to as the 'Ecological Database of the World's Insect Pathogens' or EDWIP. Files have been converted from 'SQL' to csv, and ported into 'R' for easy exploration and analysis. Thanks to the Macroecology of Infectious Disease Research Coordination Network (RCN) for funding and support. Data are also served online in a static format at <https://edwip.ecology.uga.edu/>.

Maintained by Tad Dallas. Last updated 2 months ago.

7.5 match 13 stars 4.41 score 2 scripts

config-i1

greybox:Toolbox for Model Building and Forecasting

Implements functions and instruments for regression model building and its application to forecasting. The main scope of the package is variable selection and model specification for time series data. This includes promotional modelling, selection between different dynamic regressions with non-standard distributions of errors, selection based on cross validation, solutions to the fat regression model problem and more. Models developed in the package are tailored specifically for forecasting purposes. As a result, there are several methods for producing forecasts from these models and visualising them.

Maintained by Ivan Svetunkov. Last updated 2 days ago.

forecasting model-selection model-selection-and-evaluation regression regression-models statistics cpp

3.0 match 30 stars 11.03 score 97 scripts 34 dependents

mrmaxent

maxnet:Fitting 'Maxent' Species Distribution Models with 'glmnet'

Procedures to fit species distribution models from occurrence records and environmental variables, using 'glmnet' for model fitting. Model structure is the same as for the 'Maxent' Java package, version 3.4.0, with the same feature types and regularization options. See the 'Maxent' website <http://biodiversityinformatics.amnh.org/open_source/maxent> for more details.

Maintained by Steven Phillips. Last updated 2 years ago.

3.8 match 75 stars 8.68 score 169 scripts 7 dependents

emcramer

CHOIRBM:Plots the CHOIR Body Map

Collection of utility functions for visualizing body map data collected with the Collaborative Health Outcomes Information Registry.

Maintained by Eric Cramer. Last updated 1 years ago.

body-map cbm choir data-visualization visualization

5.9 match 5 stars 5.51 score 26 scripts

henrikbengtsson

matrixStats:Functions that Apply to Rows and Columns of Matrices (and to Vectors)

High-performing functions operating on rows and columns of matrices, e.g. col / rowMedians(), col / rowRanks(), and col / rowSds(). Functions are optimized per data type and for subsetted calculations such that both memory usage and processing time are minimized. There are also optimized vector-based methods, e.g. binMeans(), madDiff() and weightedMedian().

Maintained by Henrik Bengtsson. Last updated 2 months ago.

matrix performance vector

1.8 match 208 stars 18.09 score 20k scripts 2.3k dependents

mindthegap-erc

StratPal:Stratigraphic Paleobiology Modeling Pipelines

The fossil record is a joint expression of ecological, taphonomic, evolutionary, and stratigraphic processes (Holland and Patzkowsky, 2012, ISBN:978-0226649382). This package allows users to simulate biological processes in the time domain (e.g., trait evolution, fossil abundance), and examine how their expression in the rock record (stratigraphic domain) is influenced by age-depth models, ecological niche models, and taphonomic effects. Functions simulating common processes used in modeling trait evolution or event type data such as first/last occurrences are provided and can be used standalone or as part of a pipeline. The package comes with example data sets and tutorials in several vignettes, which can be used as a template to set up one's own simulation.

Maintained by Niklas Hohmann. Last updated 24 days ago.

palaeobiology palaeontology paleobiology paleontology stratigraphic-paleobiology stratigraphy

5.3 match 1 stars 5.88 score 18 scripts

andrisignorell

DescTools:Tools for Descriptive Statistics

A collection of miscellaneous basic statistic functions and convenience wrappers for efficiently describing data. The author's intention was to create a toolbox that facilitates the (notoriously time-consuming) first descriptive tasks in data analysis, consisting of calculating descriptive statistics, drawing graphical summaries and reporting the results. The package also contains functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel. Many of the included functions can be found scattered in other packages and other sources written partly by Titans of R. The reason for collecting them here was primarily to have them consolidated in ONE instead of dozens of packages (which themselves might depend on other packages which are not needed at all), and to provide a common and consistent interface as far as function and argument naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in the absence of convincing alternatives). The 'BigCamelCase' style was consequently applied to functions borrowed from contributed R packages as well.

Maintained by Andri Signorell. Last updated 9 hours ago.

fortran cpp

1.8 match 87 stars 16.70 score 7.7k scripts 99 dependents

brpetrucci

paleobuddy:Simulating Diversification Dynamics

Simulation of species diversification, fossil records, and phylogenies. While the literature on species birth-death simulators is extensive, including important software like 'paleotree' and 'APE', we concluded there were interesting gaps to be filled regarding possible diversification scenarios. Here we strove for flexibility over focus, implementing a large array of regimens for users to experiment with and combine. In this way, 'paleobuddy' can be used in complement to other simulators as a flexible jack of all trades, or, in the case of scenarios implemented only here, can allow for robust and easy simulations for novel situations. Environmental data modified from that in 'RPANDA': Morlon H. et al (2016) <doi:10.1111/2041-210X.12526>.

Maintained by Bruno do Rosario Petrucci. Last updated 1 months ago.

evolution macroevolution paleobiology paleontology phylogenetics

5.9 match 6 stars 4.95 score 4 scripts

bbolker

emdbook:Support Functions and Data for "Ecological Models and Data"

Auxiliary functions and data sets for "Ecological Models and Data", a book presenting maximum likelihood estimation and related topics for ecologists (ISBN 978-0-691-12522-0).

Maintained by Ben Bolker. Last updated 8 months ago.

3.6 match 4 stars 8.04 score 656 scripts 21 dependents

mstrimas

colorist:Coloring Wildlife Distributions in Space-Time

Color and visualize wildlife distributions in space-time using raster data. In addition to enabling display of sequential change in distributions through the use of small multiples, 'colorist' provides functions for extracting several features of interest from a sequence of distributions and for visualizing those features using HCL (hue-chroma-luminance) color palettes. Resulting maps allow for "fair" visual comparison of intensity values (e.g., occurrence, abundance, or density) across space and time and can be used to address questions about where, when, and how consistently a species, group, or individual is likely to be found.

Maintained by Matthew Strimas-Mackey. Last updated 11 months ago.

5.1 match 14 stars 5.60 score 19 scripts

gustavobio

flora:Tools for Interacting with the Brazilian Flora 2020

Tools to quickly compile taxonomic and distribution data from the Brazilian Flora 2020.

Maintained by Gustavo Carvalho. Last updated 1 years ago.

5.3 match 29 stars 5.37 score 54 scripts 1 dependents

vascobranco

red:IUCN Redlisting Tools

Includes algorithms to facilitate the assessment of extinction risk of species according to the IUCN (International Union for Conservation of Nature, see <https://www.iucn.org/> for more information) red list criteria.

Maintained by Vasco V. Branco. Last updated 3 months ago.

biodiversity extinction-risk

6.1 match 1 stars 4.54 score 29 scripts 1 dependents

usepa

pTITAN2:Permutations of Treatment Labels and TITAN2 Analysis

Permute treatment labels for taxa and environmental gradients to generate an empirical distribution of change points. This is an extension for the 'TITAN2' package <https://cran.r-project.org/package=TITAN2>.

Maintained by Peter DeWitt. Last updated 3 years ago.

epa-unknown

7.5 match 1 stars 3.70 score 7 scripts

jlmelville

rnndescent:Nearest Neighbor Descent Method for Approximate Nearest Neighbors

The Nearest Neighbor Descent method for finding approximate nearest neighbors by Dong and co-workers (2010) <doi:10.1145/1963405.1963487>. Based on the 'Python' package 'PyNNDescent' <https://github.com/lmcinnes/pynndescent>.

Maintained by James Melville. Last updated 8 months ago.

approximate-nearest-neighbor-search cpp

3.7 match 11 stars 7.31 score 75 scripts

smithsonian

gde:GBIF Dataset Explorer

Functions to explore datasets from the Global Biodiversity Information Facility (GBIF - <https://www.gbif.org/>) using a Shiny interface.

Maintained by Luis J Villanueva. Last updated 2 years ago.

biodiversity-informatics data-issues gbif gbif-data occurrence shiny shiny-r

10.0 match 1 stars 2.70 score

ptitle

rangeBuilder:Occurrence Filtering, Geographic Standardization and Generation of Species Range Polygons

Provides tools for filtering occurrence records, generating alpha-hull-derived range polygons and mapping species distributions.

Maintained by Pascal Title. Last updated 5 months ago.

cpp

5.1 match 9 stars 4.99 score 72 scripts 1 dependents

ghislainv

hSDM:Hierarchical Bayesian Species Distribution Models

User-friendly and fast set of functions for estimating parameters of hierarchical Bayesian species distribution models (Latimer and others 2006 <doi:10.1890/04-0609>). Such models allow interpreting the observations (occurrence and abundance of a species) as a result of several hierarchical processes including ecological processes (habitat suitability, spatial dependence and anthropogenic disturbance) and observation processes (species detectability). Hierarchical species distribution models are essential for accurately characterizing the environmental response of species, predicting their probability of occurrence, and assessing uncertainty in the model results.

Maintained by Ghislain Vieilledent. Last updated 2 years ago.

gsl

4.1 match 9 stars 6.04 score 41 scripts

rspatial

geodata:Download Geographic Data

Functions for downloading of geographic data for use in spatial analysis and mapping. The package facilitates access to climate, crops, elevation, land use, soil, species occurrence, accessibility, administrative boundaries and other data.

Maintained by Robert J. Hijmans. Last updated 1 months ago.

2.3 match 162 stars 10.75 score 1.5k scripts 7 dependents

ternaustralia

ausplotsR:TERN AusPlots Australian Ecosystem Monitoring Data

Extraction, preparation, visualisation and analysis of TERN AusPlots ecosystem monitoring data. Direct access to plot-based data on vegetation and soils across Australia, including physical sample barcode numbers. Simple function calls extract the data and merge them into species occurrence matrices for downstream analysis, or calculate things like basal area and fractional cover. TERN AusPlots is a national field plot-based ecosystem surveillance monitoring method and dataset for Australia. The data have been collected across a national network of plots and transects by the Terrestrial Ecosystem Research Network (TERN - <https://www.tern.org.au>), an Australian Government NCRIS-enabled project, and its Ecosystem Surveillance platform (<https://www.tern.org.au/tern-land-observatory/ecosystem-surveillance-and-environmental-monitoring/>).

Maintained by Greg Guerin. Last updated 1 years ago.

4.1 match 10 stars 6.07 score 59 scripts

b-cubed-eu

impIndicator:Impact Indicators of Alien Taxa

Compute impact indicators of alien taxa using GBIF occurrence cube and EICAT assessment of alien species. Aggregates species impact of various scores due to mechanism. Aggregates site impact of various scores due to species.

Maintained by Mukhtar Muhammed Yahaya. Last updated 13 hours ago.

biodiversity-indicators impact invasive-species

5.6 match 4.38 score 4 scripts

julienvollering

MIAmaxent:A Modular, Integrated Approach to Maximum Entropy Distribution Modeling

Tools for training, selecting, and evaluating maximum entropy (and standard logistic regression) distribution models. This package provides tools for user-controlled transformation of explanatory variables, selection of variables by nested model comparison, and flexible model evaluation and projection. It follows principles based on the maximum-likelihood interpretation of maximum entropy modeling, and uses infinitely-weighted logistic regression for model fitting. The package is described in Vollering et al. (2019; <doi:10.1002/ece3.5654>).

Maintained by Julien Vollering. Last updated 7 months ago.

3.8 match 14 stars 6.53 score 30 scripts

mhahsler

arules:Mining Association Rules and Frequent Itemsets

Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides C implementations of the association mining algorithms Apriori and Eclat. Hahsler, Gruen and Hornik (2005) <doi:10.18637/jss.v014.i15>.

Maintained by Michael Hahsler. Last updated 1 months ago.

arules association-rules frequent-itemsets

1.7 match 194 stars 13.99 score 3.3k scripts 28 dependents

cmerow

rangeModelMetadata:Provides Templates for Metadata Files Associated with Species Range Models

Range Modeling Metadata Standards (RMMS) address three challenges: they (i) are designed for convenience to encourage use, (ii) accommodate a wide variety of applications, and (iii) are extensible to allow the community of range modelers to steer it as needed. RMMS are based on a data dictionary that specifies a hierarchical structure to catalog different aspects of the range modeling process. The dictionary balances a constrained, minimalist vocabulary to improve standardization with flexibility for users to provide their own values. Merow et al. (2019) <DOI:10.1111/geb.12993> describe the standards in more detail. Note that users who prefer to use the R package 'ecospat' can obtain it from <https://github.com/ecospat/ecospat>.

Maintained by Cory Merow. Last updated 8 months ago.

ecological-metadata-language ecological-modelling ecological-models ecology species-distribution-modelling species-distributions

3.4 match 6 stars 6.96 score 16 scripts 3 dependents

rstudio

keras3:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.

Maintained by Tomasz Kalinowski. Last updated 4 days ago.

1.8 match 845 stars 13.57 score 264 scripts 2 dependents

jmsigner

amt:Animal Movement Tools

Manage and analyze animal movement data. The functionality of 'amt' includes methods to calculate home ranges and track statistics (e.g. step lengths, speed, or turning angles), prepare data for fitting habitat selection analyses, and simulate space use from fitted step-selection functions.

Maintained by Johannes Signer. Last updated 4 months ago.

2.3 match 41 stars 10.54 score 418 scripts

farewe

virtualspecies:Generation of Virtual Species Distributions

Provides a framework for generating virtual species distributions, a procedure increasingly used in ecology to improve species distribution models. This package integrates the existing methodological approaches with the objective of generating virtual species distributions with increased ecological realism.

Maintained by Boris Leroy. Last updated 1 years ago.

3.5 match 17 stars 6.68 score 158 scripts 1 dependents

hope-data-science

akc:Automatic Knowledge Classification

A tidy framework for automatic knowledge classification and visualization. Currently, the core functionality of the framework is mainly supported by modularity-based clustering (community detection) in keyword co-occurrence networks, and focuses on co-word analysis of bibliometric research. However, the designed functions in 'akc' are general, and could be extended to solve other tasks in text mining as well.

Maintained by Tian-Yuan Huang. Last updated 19 days ago.

4.0 match 15 stars 5.85 score 47 scripts

kpmainali

CooccurrenceAffinity:Affinity in Co-Occurrence Data

Computes a novel metric of affinity between two entities based on their co-occurrence (using binary presence/absence data). The metric and its MLE, alpha hat, were advanced in Mainali, Slud, et al, 2021 <doi:10.1126/sciadv.abj9204>. Various types of confidence intervals and median interval were developed in Mainali and Slud, 2022 <doi:10.1101/2022.11.01.514801>.

Maintained by Kumar Mainali. Last updated 2 years ago.

5.3 match 26 stars 4.39 score 19 scripts

massimoaria

bibliometrix:Comprehensive Science Mapping Analysis

Tool for quantitative research in scientometrics and bibliometrics. It implements the comprehensive workflow for science mapping analysis proposed in Aria M. and Cuccurullo C. (2017) <doi:10.1016/j.joi.2017.08.007>. 'bibliometrix' provides various routines for importing bibliographic data from 'SCOPUS', 'Clarivate Analytics Web of Science' (<https://www.webofknowledge.com/>), 'Digital Science Dimensions' (<https://www.dimensions.ai/>), 'OpenAlex' (<https://openalex.org/>), 'Cochrane Library' (<https://www.cochranelibrary.com/>), 'Lens' (<https://lens.org>), and 'PubMed' (<https://pubmed.ncbi.nlm.nih.gov/>) databases, performing bibliometric analysis and building networks for co-citation, coupling, scientific collaboration and co-word analysis.

Maintained by Massimo Aria. Last updated 8 days ago.

bibliometric-analysis bibliometrics citation citation-network citations co-authors co-occurence co-word-analysis correspondence-analysis coupling isi-web journal manuscript quantitative-analysis scholars science science-mapping scientific scientometrics scopus

1.8 match 545 stars 12.54 score 518 scripts 2 dependents

steffenmoritz

imputeTS:Time Series Missing Value Imputation

Imputation (replacement) of missing values in univariate time series. Offers several imputation functions and missing data plots. Available imputation algorithms include: 'Mean', 'LOCF', 'Interpolation', 'Moving Average', 'Seasonal Decomposition', 'Kalman Smoothing on Structural Time Series models', 'Kalman Smoothing on ARIMA models'. Published in Moritz and Bartz-Beielstein (2017) <doi:10.32614/RJ-2017-009>.

Maintained by Steffen Moritz. Last updated 3 years ago.

data-visualization imputation imputation-algorithm imputets missing-data time-series cpp

1.8 match 162 stars 12.18 score 1.9k scripts 27 dependents

ewenharrison

finalfit:Quickly Create Elegant Regression Results Tables and Plots when Modelling

Generate regression results tables and plots in final format for publication. Explore models and export directly to PDF and 'Word' using 'RMarkdown'.

Maintained by Ewen Harrison. Last updated 7 months ago.

1.9 match 270 stars 11.43 score 1.0k scripts

pythonicr

strs:'Python' Style String Functions

A comprehensive set of string manipulation functions based on those found in 'Python' without relying on 'reticulate'. It provides functions that intend to (1) make it easier for users familiar with 'Python' to work with strings, (2) reduce the complexity often associated with string operations, and (3) enable users to write more readable and maintainable code that manipulates strings.

Maintained by Garrett Shipley. Last updated 2 months ago.

5.5 match 2 stars 3.90 score 5 scripts

nicholasjclark

MRFcov:Markov Random Fields with Additional Covariates

Approximate node interaction parameters of Markov Random Fields graphical networks. Models can incorporate additional covariates, allowing users to estimate how interactions between nodes in the graph are predicted to change across covariate gradients. The general methods implemented in this package are described in Clark et al. (2018) <doi:10.1002/ecy.2221>.

Maintained by Nicholas J Clark. Last updated 12 months ago.

conditional-random-fields graphical-models machine-learning markov-random-field multivariate-analysis multivariate-statistics network-analysis networks

3.5 match 24 stars 6.03 score 30 scripts

anttonalberdi

hilldiv:Integral Analysis of Diversity Based on Hill Numbers

Tools for analysing, comparing, visualising and partitioning diversity based on Hill numbers. 'hilldiv' is an R package that provides a set of functions to assist analysis of diversity for diet reconstruction, microbial community profiling or more general ecosystem characterisation analyses based on Hill numbers, using OTU/ASV tables and associated phylogenetic trees as inputs. The package includes functions for (phylo)diversity measurement, (phylo)diversity profile plotting, (phylo)diversity comparison between samples and groups, (phylo)diversity partitioning and (dis)similarity measurement. All of these are grounded in abundance-based and incidence-based Hill numbers. The statistical framework developed around Hill numbers encompasses many of the most broadly employed diversity (e.g. richness, Shannon index, Simpson index), phylogenetic diversity (e.g. Faith's PD, Allen's H, Rao's quadratic entropy) and dissimilarity (e.g. Sorensen index, Unifrac distances) metrics. This enables the most common analyses of diversity to be performed while grounded in a single statistical framework. The methods are described in Jost et al. (2007) <DOI:10.1890/06-1736.1>, Chao et al. (2010) <DOI:10.1098/rstb.2010.0272> and Chiu et al. (2014) <DOI:10.1890/12-0960.1>; and reviewed in the framework of molecularly characterised biological systems in Alberdi & Gilbert (2019) <DOI:10.1111/1755-0998.13014>.

Maintained by Antton Alberdi. Last updated 4 years ago.

4.8 match 11 stars 4.35 score 41 scripts

tdaverse

ripserr:Calculate Persistent Homology with Ripser-Based Engines

Ports the Ripser <https://arxiv.org/abs/1908.02518> and Cubical Ripser <https://arxiv.org/abs/2005.12692> persistent homology calculation engines from C++. Can be used as a rapid calculation tool in topological data analysis pipelines.

Maintained by Raoul Wadhwa. Last updated 2 days ago.

algebraic-topology cohomology cpp cubical-complex persistent-homology pixel point-cloud r-language r-programming rcpp rips-complex ripser simplicial-complex simplicial-homology topological-data-analysis topology vietoris-complex voxel

3.6 match 7 stars 5.80 score 6 scripts

billdenney

PKNCA:Perform Pharmacokinetic Non-Compartmental Analysis

Compute standard Non-Compartmental Analysis (NCA) parameters for typical pharmacokinetic analyses and summarize them.

Maintained by Bill Denney. Last updated 16 days ago.

nca noncompartmental-analysis pharmacokinetics

1.7 match 73 stars 12.61 score 214 scripts 4 dependents

yiluheihei

RevEcoR:Reverse Ecology Analysis on Microbiome

An implementation of the reverse ecology framework. Reverse ecology refers to the use of genomics to study ecology with no a priori assumptions about the organism(s) under consideration, linking organisms to their environment. It allows researchers to reconstruct the metabolic networks and study the ecology of poorly characterized microbial species from their genomic information, and has substantial potential for microbial community ecological analysis.

Maintained by Yang Cao. Last updated 6 years ago.

3.5 match 6 stars 5.77 score 22 scripts 1 dependents

hugheylab

phers:Calculate Phenotype Risk Scores

Use phenotype risk scores based on linked clinical and genetic data to study Mendelian disease and rare genetic variants. See Bastarache et al. 2018 <doi:10.1126/science.aal4043>.

Maintained by Jake Hughey. Last updated 2 years ago.

6.8 match 3.00 score 1 scripts

eikeluedeling

decisionSupport:Quantitative Support of Decision Making under Uncertainty

Supporting the quantitative analysis of binary welfare-based decision-making processes using Monte Carlo simulations. Decision support is given on two levels: (i) The actual decision level is to choose between two alternatives under probabilistic uncertainty. This package calculates the optimal decision based on maximizing expected welfare. (ii) The meta decision level is to allocate resources to reduce the uncertainty in the underlying decision problem, i.e., to increase the current information to improve the actual decision-making process. This problem is dealt with using the Value of Information Analysis. The Expected Value of Information for arbitrary prospective estimates can be calculated as well as Individual Expected Value of Perfect Information. The probabilistic calculations are done via Monte Carlo simulations. This Monte Carlo functionality can be used on its own.

Maintained by Eike Luedeling. Last updated 11 months ago.

3.9 match 6 stars 5.17 score 123 scripts

bioc

ChIPpeakAnno:Batch annotation of the peaks identified from either ChIP-seq, ChIP-chip experiments, or any experiments that result in large number of genomic interval data

The package encompasses a range of functions for identifying the closest gene, exon, miRNA, or custom features—such as highly conserved elements and user-supplied transcription factor binding sites. Additionally, users can retrieve sequences around the peaks and obtain enriched Gene Ontology (GO) or Pathway terms. In version 2.0.5 and beyond, new functionalities have been introduced. These include features for identifying peaks associated with bi-directional promoters along with summary statistics (peaksNearBDP), summarizing motif occurrences in peaks (summarizePatternInPeaks), and associating additional identifiers with annotated peaks or enrichedGO (addGeneIDs). The package integrates with various other packages such as biomaRt, IRanges, Biostrings, BSgenome, GO.db, multtest, and stat to enhance its analytical capabilities.

Maintained by Jianhong Ou. Last updated 2 months ago.

annotation chipseq chipchip

2.3 match 8.75 score 584 scripts 6 dependents

bernd-mueller

epos:Epilepsy Ontologies' Similarities

Analysis and visualization of similarities between epilepsy ontologies based on text mining results by comparing ranked lists of co-occurring drug terms in the BioASQ corpus. The ranked result lists of neurological drug terms co-occurring with terms from the epilepsy ontologies EpSO, ESSO, EPILONT, EPISEM and FENICS undergo further analysis. The source data to create the ranked lists of drug names is produced using the text mining workflows described in Mueller, Bernd and Hagelstein, Alexandra (2016) <doi:10.4126/FRL01-006408558>, Mueller, Bernd et al. (2017) <doi:10.1007/978-3-319-58694-6_22>, Mueller, Bernd and Rebholz-Schuhmann, Dietrich (2020) <doi:10.1007/978-3-030-43887-6_52>, and Mueller, Bernd et al. (2022) <doi:10.1186/s13326-021-00258-w>.

Maintained by Bernd Mueller. Last updated 1 years ago.

4.8 match 4.03 score 53 scripts

cran

NHPoisson:Modelling and Validation of Non Homogeneous Poisson Processes

Tools for modelling, ML estimation, validation analysis and simulation of non homogeneous Poisson processes in time.

Maintained by Ana C. Cebrian. Last updated 5 years ago.

7.1 match 2 stars 2.71 score 43 scripts 2 dependents

paballand

EconGeo:Computing Key Indicators of the Spatial Distribution of Economic Activities

Functions to compute a series of indices commonly used in the fields of economic geography, economic complexity, and evolutionary economics to describe the location, distribution, spatial organization, structure, and complexity of economic activities. Functions include basic spatial indicators such as the location quotient, the Krugman specialization index, the Herfindahl or the Shannon entropy indices but also more advanced functions to compute different forms of normalized relatedness between economic activities or network-based measures of economic complexity. Most of the functions use matrix calculus and are based on bipartite (incidence) matrices consisting of region - industry pairs.

Maintained by Pierre-Alexandre Balland. Last updated 2 years ago.

3.9 match 41 stars 4.96 score 44 scripts

philipmostert

intSDM:Reproducible Integrated Species Distribution Models Across Norway using 'INLA'

Integration of disparate datasets is needed in order to make efficient use of all available data and thereby address the issues currently threatening biodiversity. Data integration is a powerful modeling framework which allows us to combine these datasets into a single model, yet retain the strengths of each individual dataset. We therefore introduce the package, 'intSDM': an R package designed to help ecologists develop a reproducible workflow of integrated species distribution models, using data both provided from the user as well as data obtained freely online. An introduction to data integration methods is discussed in Isaac, Jarzyna, Keil, Dambly, Boersch-Supan, Browning, Freeman, Golding, Guillera-Arroita, Henrys, Jarvis, Lahoz-Monfort, Pagel, Pescott, Schmucki, Simmonds and O’Hara (2020) <doi:10.1016/j.tree.2019.08.006>.

Maintained by Philip Mostert. Last updated 2 months ago.

3.1 match 5 stars 6.26 score 12 scripts

gabrielnakamura

FishPhyloMaker:Phylogenies for a List of Finned-Ray Fishes

Provides an alternative to facilitate the construction of a phylogeny for fish species from a list of species or a community matrix using as a backbone the phylogenetic tree proposed by Rabosky et al. (2018) <doi:10.1038/s41586-018-0273-1>.

Maintained by Gabriel Nakamura. Last updated 1 years ago.

3.5 match 8 stars 5.49 score 13 scripts

agi-lab

SynthETIC:Synthetic Experience Tracking Insurance Claims

Creation of an individual claims simulator which generates various features of non-life insurance claims. An initial set of test parameters, designed to mirror the experience of an Auto Liability portfolio, were set up and applied by default to generate a realistic test data set of individual claims (see vignette). The simulated data set then allows practitioners to back-test the validity of various reserving models and to prove and/or disprove certain actuarial assumptions made in claims modelling. The distributional assumptions used to generate this data set can be easily modified by users to match their experiences. Reference: Avanzi B, Taylor G, Wang M, Wong B (2020) "SynthETIC: an individual insurance claim simulator with feature control" <arXiv:2008.05693>.

Maintained by Melantha Wang. Last updated 1 years ago.

3.1 match 12 stars 6.22 score 23 scripts 2 dependents

cran

zetadiv:Functions to Compute Compositional Turnover Using Zeta Diversity

Functions to compute compositional turnover using zeta-diversity, the number of species shared by multiple assemblages. The package includes functions to compute zeta-diversity for a specific number of assemblages and to compute zeta-diversity for a range of numbers of assemblages. It also includes functions to explain how zeta-diversity varies with distance and with differences in environmental variables between assemblages, using generalised linear models, linear models with negative constraints, generalised additive models, shape constrained additive models, and I-splines.

Maintained by Guillaume Latombe. Last updated 3 years ago.

6.5 match 3 stars 2.89 score 64 scripts

brry

berryFunctions:Function Collection Related to Plotting and Hydrology

Draw horizontal histograms, color scattered points by 3rd dimension, enhance date- and log-axis plots, zoom in X11 graphics, trace errors and warnings, use the unit hydrograph in a linear storage cascade, convert lists to data.frames and arrays, fit multiple functions.

Maintained by Berry Boessenkool. Last updated 1 months ago.

2.0 match 13 stars 9.43 score 350 scripts 16 dependents

pharmaverse

admiral:ADaM in R Asset Library

A toolbox for programming Clinical Data Interchange Standards Consortium (CDISC) compliant Analysis Data Model (ADaM) datasets in R. ADaM datasets are a mandatory part of any New Drug or Biologics License Application submitted to the United States Food and Drug Administration (FDA). Analysis derivations are implemented in accordance with the "Analysis Data Model Implementation Guide" (CDISC Analysis Data Model Team, 2021, <https://www.cdisc.org/standards/foundational/adam>).

Maintained by Ben Straub. Last updated 4 days ago.

cdisc clinical-trials open-source

1.3 match 236 stars 13.89 score 486 scripts 4 dependents

bioc

genomation:Summary, annotation and visualization of genomic data

A package for summary and annotation of genomic intervals. Users can visualize and quantify genomic intervals over pre-defined functional regions, such as promoters, exons, introns, etc. The genomic intervals represent regions with a defined chromosome position, which may be associated with a score, such as aligned reads from HT-seq experiments, TF binding sites, methylation scores, etc. The package can use any tabular genomic feature data as long as it has minimal information on the locations of genomic intervals. In addition, it can use BAM or BigWig files as input.

Maintained by Altuna Akalin. Last updated 5 months ago.

annotation sequencing visualization cpgisland cpp

1.7 match 75 stars 11.09 score 738 scripts 5 dependents

bioc

MicrobiotaProcess:A comprehensive R package for managing and analyzing microbiome and other ecological data within the tidy framework

MicrobiotaProcess is an R package for analysis, visualization and biomarker discovery of microbial datasets. It introduces the MPSE class, which makes it more interoperable with the existing computing ecosystem. Moreover, it introduces a tidy microbiome data structure paradigm and analysis grammar. It provides a wide variety of microbiome data analysis procedures under a unified and common framework (tidy-like framework).

Maintained by Shuangbin Xu. Last updated 5 months ago.

visualization microbiome software multiplecomparison featureextraction microbiome-analysis microbiome-data

1.9 match 183 stars 9.70 score 126 scripts 1 dependents

hannahlowens

voluModel:Modeling Species Distributions in Three Dimensions

Facilitates modeling species' ecological niches and geographic distributions based on occurrences and environments that have a vertical as well as horizontal component, and projecting models into three-dimensional geographic space. Working in three dimensions is useful in an aquatic context when the organisms one wishes to model can be found across a wide range of depths in the water column. The package also contains functions to automatically generate marine model training regions using machine learning, and interpolate and smooth patchily sampled environmental rasters using thin plate splines. Davis Rabosky AR, Cox CL, Rabosky DL, Title PO, Holmes IA, Feldman A, McGuire JA (2016) <doi:10.1038/ncomms11484>. Nychka D, Furrer R, Paige J, Sain S (2021) <doi:10.5065/D6W957CT>. Pateiro-Lopez B, Rodriguez-Casal A (2022) <https://CRAN.R-project.org/package=alphahull>.

Maintained by Hannah L. Owens. Last updated 2 days ago.

2.8 match 9 stars 6.60 score 35 scripts

humaniverse

asylum:Data on Asylum and Resettlement for the UK

Data on Asylum and Resettlement for the UK, provided by the Home Office <https://www.gov.uk/government/statistical-data-sets/immigration-system-statistics-data-tables>.

Maintained by Matthew Gwynfryn Thomas. Last updated 17 days ago.

3.6 match 3 stars 4.99 score 36 scripts

schochastics

netrankr:Analyzing Partial Rankings in Networks

Implements methods for centrality related analyses of networks. While the package includes the possibility to build more than 20 indices, its main focus lies on index-free assessment of centrality via partial rankings obtained by neighborhood-inclusion or positional dominance. These partial rankings can be analyzed with different methods, including probabilistic methods like computing expected node ranks and relative rank probabilities (how likely is it that a node is more central than another?). The methodology is described in depth in the vignettes and in Schoch (2018) <doi:10.1016/j.socnet.2017.12.003>.

Maintained by David Schoch. Last updated 1 months ago.

network-analysis network-centrality openblas cpp openmp

1.9 match 49 stars 9.56 score 91 scripts 2 dependents

ropensci

SymbiotaR2:Downloading Data from Symbiota2 Portals into R

Download data from Symbiota2 portals using Symbiota's API. Covers the Checklists, Collections, Crowdsource, Exsiccati, Glossary, ImageProcessor, Key, Media, Occurrence, Reference, Taxa, Traits, and UserRoles API families. Each Symbiota2 portal owner can load their own plugins (and modified code), and so this package may not cover every possible API endpoint from a given Symbiota2 instance.

Maintained by Austin Koontz. Last updated 3 years ago.

database library specimen-records symbiota symbiota2 symbiota2-portal

5.3 match 2 stars 3.30 score 4 scripts

moderndive

moderndive:Tidyverse-Friendly Introductory Linear Regression

Datasets and wrapper functions for tidyverse-friendly introductory linear regression, used in "Statistical Inference via Data Science: A ModernDive into R and the Tidyverse" available at <https://moderndive.com/>.

Maintained by Albert Y. Kim. Last updated 3 months ago.

1.5 match 88 stars 11.35 score 1.8k scripts

pakillo

rSDM:Species distribution and niche modelling in R

Functions for niche modelling and SDM.

Maintained by Francisco Rodriguez-Sanchez. Last updated 18 days ago.

5.4 match 7 stars 3.23 score 16 scripts

bioc

periodicDNA:Set of tools to identify periodic occurrences of k-mers in DNA sequences

This R package helps the user identify k-mers (e.g. di- or tri-nucleotides) present periodically in a set of genomic loci (typically regulatory elements). The functions of this package provide a straightforward approach to find periodic occurrences of k-mers in DNA sequences, such as regulatory elements. It is not aimed at identifying motifs separated by a conserved distance; for this type of analysis, please visit the MEME website.

Maintained by Jacques Serizay. Last updated 5 months ago.

sequencematching motifdiscovery motifannotation sequencing coverage alignment dataimport

3.3 match 6 stars 5.26 score 5 scripts

babaknaimi

sdm:Species Distribution Modelling

An extensible framework for developing species distribution models using individual and community-based approaches, generating ensembles of models, evaluating the models, and predicting species' potential distributions in space and time. For more information, please check the following paper: Naimi, B., Araujo, M.B. (2016) <doi:10.1111/ecog.01881>.

Maintained by Babak Naimi. Last updated 2 months ago.

1.8 match 24 stars 9.53 score 312 scripts 1 dependents

alrobles

cofid:Copepod Fish Interaction Database

A curated list of copepod-fish ecological interaction records. It contains the taxonomy of the copepod and the fish and the publication from which the information was obtained. This database contains only marine and brackish water fish species. It excludes fish species that inhabit only freshwater.

Maintained by Angel Robles. Last updated 4 months ago.

5.0 match 3.40 score 3 scripts

darwin-eu

PatientProfiles:Identify Characteristics of Patients in the OMOP Common Data Model

Identify the characteristics of patients in data mapped to the Observational Medical Outcomes Partnership (OMOP) common data model.

Maintained by Marti Catala. Last updated 10 days ago.

1.7 match 1 stars 9.97 score 225 scripts 9 dependents

thecomeonman

ggTimeSeries:Time Series Visualisations Using the Grammar of Graphics

Provides additional display mediums for time series visualisations.

Maintained by Aditya Kothari. Last updated 6 years ago.

3.2 match 1 stars 5.23 score 112 scripts

alj1983

MaxentVariableSelection:Selecting the Best Set of Relevant Environmental Variables along with the Optimal Regularization Multiplier for Maxent Niche Modeling

Complex niche models show low performance in identifying the most important range-limiting environmental variables and in transferring habitat suitability to novel environmental conditions (Warren and Seifert, 2011 <DOI:10.1890/10-1171.1>; Warren et al., 2014 <DOI:10.1111/ddi.12160>). This package helps to identify the most important set of uncorrelated variables and to fine-tune Maxent's regularization multiplier. In combination, this makes it possible to constrain complexity and increase the performance of Maxent niche models (assessed by information criteria, such as AICc (Akaike, 1974 <DOI:10.1109/TAC.1974.1100705>), and by the area under the receiver operating characteristic curve (AUC) (Fielding and Bell, 1997 <DOI:10.1017/S0376892997000088>)). Users of this package should be familiar with Maxent niche modelling.

Maintained by Alexander Jueterbock. Last updated 6 years ago.

3.1 match 4 stars 5.34 score 11 scripts

alarm-redist

redist:Simulation Methods for Legislative Redistricting

Enables researchers to sample redistricting plans from a pre-specified target distribution using Sequential Monte Carlo and Markov Chain Monte Carlo algorithms. The package allows for the implementation of various constraints in the redistricting process such as geographic compactness and population parity requirements. Tools for analysis such as computation of various summary statistics and plotting functionality are also included. The package implements the SMC algorithm of McCartan and Imai (2023) <doi:10.1214/23-AOAS1763>, the enumeration algorithm of Fifield, Imai, Kawahara, and Kenny (2020) <doi:10.1080/2330443X.2020.1791773>, the Flip MCMC algorithm of Fifield, Higgins, Imai and Tarr (2020) <doi:10.1080/10618600.2020.1739532>, the Merge-split/Recombination algorithms of Carter et al. (2019) <arXiv:1911.01503> and DeFord et al. (2021) <doi:10.1162/99608f92.eb30390f>, and the Short-burst optimization algorithm of Cannon et al. (2020) <arXiv:2011.02288>.

Maintained by Christopher T. Kenny. Last updated 2 months ago.

geospatial gerrymandering redistricting sampling openblas cpp openmp

1.8 match 68 stars 9.17 score 259 scripts

mikldk

malan:MAle Lineage ANalysis

MAle Lineage ANalysis by simulating genealogies backwards and imposing short tandem repeats (STR) mutations forwards. Intended for forensic Y chromosomal STR (Y-STR) haplotype analyses. Numerous analyses are possible, e.g. number of matches and meiotic distance to matches. Refer to papers mentioned in citation("malan") (DOIs: <doi:10.1371/journal.pgen.1007028>, <doi:10.21105/joss.00684> and <doi:10.1016/j.fsigen.2018.10.004>).

Maintained by Mikkel Meyer Andersen. Last updated 1 years ago.

openblas cpp

3.7 match 4.48 score 6 scripts

bioc

ShortRead:FASTQ input and manipulation

This package implements sampling, iteration, and input of FASTQ files. The package includes functions for filtering and trimming reads, and for generating a quality assessment report. Data are represented as DNAStringSet-derived objects, and easily manipulated for a diversity of purposes. The package also contains legacy support for early single-end, ungapped alignment formats.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

dataimport sequencing qualitycontrol bioconductor-package core-package zlib cpp

1.3 match 8 stars 12.08 score 1.8k scripts 49 dependents

bioc

spatzie:Identification of enriched motif pairs from chromatin interaction data

Identifies motifs that are significantly co-enriched from enhancer-promoter interaction data. While enhancer-promoter annotation is commonly used to define groups of interaction anchors, spatzie also supports co-enrichment analysis between preprocessed interaction anchors. Supports BEDPE interaction data derived from genome-wide assays such as HiC, ChIA-PET, and HiChIP. Can also be used to look for differentially enriched motif pairs between two interaction experiments.

Maintained by Jennifer Hammelman. Last updated 5 months ago.

dna3dstructure generegulation peakdetection epigenetics functionalgenomics classification hic transcription

3.7 match 4.30 score 5 scripts

rbarkerclarke

gtexture:Generalized Application of Co-Occurrence Matrices and Haralick Texture

Generalizes application of gray-level co-occurrence matrix (GLCM) metrics to objects outside of images. The current focus is to apply GLCM metrics to the study of biological networks and fitness landscapes that are used in studying evolutionary medicine and biology, particularly the evolution of cancer resistance. The package was used in our publication, Barker-Clarke et al. (2023) <doi:10.1088/1361-6560/ace305>. A general reference to learn more about mathematical oncology can be found at Rockne et al. (2019) <doi:10.1088/1478-3975/ab1a09>.

Maintained by Rowan Barker-Clarke. Last updated 12 months ago.

5.2 match 3.00 score 1 scripts

ediorg

ecocomDP:Tools to Create, Use, and Convert ecocomDP Data

Work with the Ecological Community Data Design Pattern. 'ecocomDP' is a flexible data model for harmonizing ecological community surveys, in a research question agnostic format, from source data published across repositories, and with methods that keep the derived data up-to-date as the underlying sources change. Described in O'Brien et al. (2021), <doi:10.1016/j.ecoinf.2021.101374>.

Maintained by Colin Smith. Last updated 7 months ago.

1.9 match 32 stars 8.22 score 77 scripts

ropensci

helminthR:Access London Natural History Museum Host-Helminth Record Database

Access to large host-parasite data is often hampered by the availability of data and difficulty in obtaining it in a programmatic way to encourage analyses. 'helminthR' provides a programmatic interface to the London Natural History Museum's host-parasite database, one of the largest host-parasite databases existing currently <https://www.nhm.ac.uk/research-curation/scientific-resources/taxonomy-systematics/host-parasites/>. The package allows the user to query by host species, parasite species, and geographic location.

Maintained by Tad Dallas. Last updated 2 years ago.

disease-networks helminth open-data parasites

3.5 match 7 stars 4.32 score 12 scripts

gavinsimpson

analogue:Analogue and Weighted Averaging Methods for Palaeoecology

Fits Modern Analogue Technique and Weighted Averaging transfer function models for prediction of environmental data from species data, and related methods used in palaeoecology.

Maintained by Gavin L. Simpson. Last updated 6 months ago.

1.7 match 14 stars 8.96 score 185 scripts 4 dependents

gabrielhoc

Mapinguari:Process-Based Biogeographical Analysis

Facilitates the incorporation of biological processes in biogeographical analyses. It offers conveniences in fitting, comparing and extrapolating models of biological processes such as physiology and phenology. These spatial extrapolations can be informative by themselves, but also complement traditional correlative species distribution models, by mixing environmental and process-based predictors. Caetano et al (2020) <doi:10.1111/oik.07123>.

Maintained by Gabriel Caetano. Last updated 2 years ago.

5.5 match 5 stars 2.70 score 4 scripts

plant-data

ritalic:Interface to the ITALIC Database of Lichen Biodiversity

A programmatic interface to the Web Service methods provided by ITALIC (<https://italic.units.it>). ITALIC is a database of lichen data in Italy and bordering European countries. 'ritalic' includes functions for retrieving information about lichen scientific names, geographic distribution, ecological data, morpho-functional traits and identification keys. More information about the data is available at <https://italic.units.it/?procedure=base&t=59&c=60>. The API documentation is available at <https://italic.units.it/?procedure=api>.

Maintained by Matteo Conti. Last updated 14 days ago.

biodiversity ecology fungi italic lichen

3.6 match 2 stars 4.04 score 6 scripts

traminer

TraMineR:Trajectory Miner: a Sequence Analysis Toolkit

Set of sequence analysis tools for manipulating, describing and rendering categorical sequences, and more generally mining sequence data in the field of social sciences. Although this sequence analysis package is primarily intended for state or event sequences that describe time use or life courses such as family formation histories or professional careers, its features also apply to many other kinds of categorical sequence data. It accepts many different sequence representations as input and provides tools for converting sequences from one format to another. It offers several functions for describing and rendering sequences, for computing distances between sequences with different metrics (among which optimal matching), original dissimilarity-based analysis tools, and functions for extracting the most frequent event subsequences and identifying the most discriminating ones among them. A user's guide can be found on the TraMineR web page.

Maintained by Gilbert Ritschard. Last updated 3 months ago.

cpp

1.8 match 11 stars 8.24 score 534 scripts 13 dependents

bioc

survcomp:Performance Assessment and Comparison for Survival Analysis

Assessment and Comparison for Performance of Risk Prediction (Survival) Models.

Maintained by Benjamin Haibe-Kains. Last updated 5 months ago.

geneexpression differentialexpression visualization cpp

1.7 match 8.46 score 448 scripts 12 dependents

dkahle

TITAN2:Threshold Indicator Taxa Analysis

Uses indicator species scores across binary partitions of a sample set to detect congruence in taxon-specific changes of abundance and occurrence frequency along an environmental gradient as evidence of an ecological community threshold. Relevant references include Baker and King (2010) <doi:10.1111/j.2041-210X.2009.00007.x>, King and Baker (2010) <doi:10.1899/09-144.1>, and Baker and King (2013) <doi:10.1899/12-142.1>.

Maintained by David Kahle. Last updated 1 years ago.

2.2 match 13 stars 6.59 score 30 scripts

willgearty

deeptime:Plotting Tools for Anyone Working in Deep Time

Extends the functionality of other plotting packages (notably 'ggplot2') to help facilitate the plotting of data over long time intervals, including, but not limited to, geological, evolutionary, and ecological data. The primary goal of 'deeptime' is to enable users to add highly customizable timescales to their visualizations. Other functions are also included to assist with other areas of deep time visualization.

Maintained by William Gearty. Last updated 3 months ago.

geology ggplot2 paleontology visualization

1.3 match 92 stars 10.61 score 207 scripts 3 dependents

insightsengineering

cardx:Extra Analysis Results Data Utilities

Create extra Analysis Results Data (ARD) summary objects. The package supplements the simple ARD functions from the 'cards' package, exporting functions to put statistical results in the ARD format. These objects are used and re-used to construct summary tables, visualizations, and written reports.

Maintained by Daniel D. Sjoberg. Last updated 20 days ago.

1.7 match 19 stars 8.46 score 50 scripts

r-forge

fuzzySim:Fuzzy Similarity in Species Distributions

Functions to compute fuzzy versions of species occurrence patterns based on presence-absence data (including inverse distance interpolation, trend surface analysis, and prevalence-independent favourability obtained from probability of presence), as well as pair-wise fuzzy similarity (based on fuzzy logic versions of commonly used similarity indices) among those occurrence patterns. Also includes functions for model consensus and comparison (overlap and fuzzy similarity, fuzzy loss, fuzzy gain), and for data preparation, such as obtaining unique abbreviations of species names, defining the background region, cleaning and gridding (thinning) point occurrence data onto raster maps, selecting among (pseudo)absences to address survey bias, converting species lists (long format) to presence-absence tables (wide format), transposing part of a data frame, selecting relevant variables for models, assessing the false discovery rate, or analysing and dealing with multicollinearity. Initially described in Barbosa (2015) <doi:10.1111/2041-210X.12372>.

Maintained by A. Marcia Barbosa. Last updated 20 days ago.

2.6 match 2 stars 5.35 score 156 scripts

cran

CircNNTSR:Statistical Analysis of Circular Data using Nonnegative Trigonometric Sums (NNTS) Models

Includes functions for the analysis of circular data using distributions based on Nonnegative Trigonometric Sums (NNTS). The package includes functions for calculation of densities and distributions, for the estimation of parameters, for plotting and more.

Maintained by Maria Mercedes Gregorio-Dominguez. Last updated 2 years ago.

7.8 match 1.78 score 2 dependents

niklashohmann

DAIME:Effects of Changing Deposition Rates

Reverse and model the effects of changing deposition rates on geological data and rates. Based on Hohmann (2018) <doi:10.13140/RG.2.2.23372.51841>.

Maintained by Niklas Hohmann. Last updated 5 years ago.

4.6 match 3.00 score

kwb-r

kwb.utils:General Utility Functions Developed at KWB

This package contains some small helper functions that aim at improving the quality of code developed at Kompetenzzentrum Wasser gGmbH (KWB).

Maintained by Hauke Sonnenberg. Last updated 12 months ago.

1.9 match 8 stars 7.33 score 12 scripts 78 dependents

aibrt

FreqProf:Frequency Profiles Computing and Plotting

Tools for generating an informative type of line graph, the frequency profile, which allows single behaviors, multiple behaviors, or the specific behavioral patterns of individual subjects to be graphed from occurrence/nonoccurrence behavioral data.

Maintained by Ronald E. Robertson. Last updated 9 years ago.

4.0 match 2 stars 3.48 score 7 scripts

gavinsimpson

coenocliner:Coenocline Simulation

Simulate species occurrence and abundances (counts) along gradients.

Maintained by Gavin L. Simpson. Last updated 4 years ago.

2.2 match 12 stars 6.03 score 15 scripts 1 dependents

bioc

Rcpi:Molecular Informatics Toolkit for Compound-Protein Interaction in Drug Discovery

A molecular informatics toolkit with an integration of bioinformatics and chemoinformatics tools for drug discovery.

Maintained by Nan Xiao. Last updated 5 months ago.

software dataimport datarepresentation featureextraction cheminformatics biomedicalinformatics proteomics go systemsbiology bioconductor bioinformatics drug-discovery feature-extraction fingerprint molecular-descriptors protein-sequences

1.7 match 37 stars 7.81 score 29 scripts

codymarquart

rENA:Epistemic Network Analysis

ENA (Shaffer, D. W. (2017) Quantitative Ethnography. ISBN: 0578191687) is a method used to identify meaningful and quantifiable patterns in discourse or reasoning. ENA moves beyond the traditional frequency-based assessments by examining the structure of the co-occurrence, or connections in coded data. Moreover, compared to other methodological approaches, ENA has the novelty of (1) modeling whole networks of connections and (2) affording both quantitative and qualitative comparisons between different network models. Shaffer, D.W., Collier, W., & Ruis, A.R. (2016).

Maintained by Cody L Marquart. Last updated 1 years ago.

openblas cpp

5.9 match 1 stars 2.26 score 36 scripts

calvagone

campsismod:Generic Implementation of a PK/PD Model

A generic, easy-to-use and expandable implementation of a pharmacokinetic (PK) / pharmacodynamic (PD) model based on the S4 class system. This package allows the user to read/write a pharmacometric model from/to files and adapt it further on the fly in the R environment. For this purpose, this package provides an intuitive API to add, modify or delete equations, ordinary differential equations (ODEs), model parameters or compartment properties (like infusion duration or rate, bioavailability and initial values). Finally, this package also provides a useful export of the model for use with simulation packages 'rxode2' and 'mrgsolve'. This package is designed and intended to be used with package 'campsis', a PK/PD simulation platform built on top of 'rxode2' and 'mrgsolve'.

Maintained by Nicolas Luyckx. Last updated 1 months ago.

2.0 match 5 stars 6.64 score 42 scripts 1 dependents

tomasfryda

h2o:R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Maintained by Tomas Fryda. Last updated 1 years ago.

1.6 match 3 stars 8.20 score 7.8k scripts 11 dependents

jenniniku

gllvm:Generalized Linear Latent Variable Models

Analysis of multivariate data using generalized linear latent variable models (gllvm). Estimation is performed using either the Laplace method, variational approximations, or extended variational approximations, implemented via TMB (Kristensen et al. (2016), <doi:10.18637/jss.v070.i05>).

Maintained by Jenni Niku. Last updated 13 hours ago.

cpp openmp

1.3 match 52 stars 10.53 score 176 scripts 1 dependents

cwolock

survML:Tools for Flexible Survival Analysis Using Machine Learning

Statistical tools for analyzing time-to-event data using machine learning. Implements survival stacking for conditional survival estimation, standardized survival function estimation for current status data, and methods for algorithm-agnostic variable importance. See Wolock CJ, Gilbert PB, Simon N, and Carone M (2024) <doi:10.1080/10618600.2024.2304070>.

Maintained by Charles Wolock. Last updated 2 months ago.

1.6 match 16 stars 8.06 score 73 scripts 1 dependents

kevinstadler

cultevo:Tools, Measures and Statistical Tests for Cultural Evolution

Provides tools and statistics useful for analysing data from artificial language experiments. It implements the information-theoretic measure of the compositionality of signalling systems due to Spike (2016) <http://hdl.handle.net/1842/25930>, the Mantel test for distance matrix correlation (after Dietz 1983 <doi:10.1093/sysbio/32.1.21>), functions for computing string and meaning distance matrices, as well as an implementation of the Page test for monotonicity of ranks (Page 1963 <doi:10.1080/01621459.1963.10500843>) with exact p-values up to k = 22.

Maintained by Kevin Stadler. Last updated 1 years ago.

2.0 match 8 stars 6.50 score 131 scripts 1 dependents

kasperwelbers

corpustools:Managing, Querying and Analyzing Tokenized Text

Provides text analysis in R, focusing on the use of a tokenized text format. In this format, the positions of tokens are maintained, and each token can be annotated (e.g., part-of-speech tags, dependency relations). Prominent features include advanced Lucene-like querying for specific tokens or contexts (e.g., documents, sentences), similarity statistics for words and documents, exporting to DTM for compatibility with many text analysis packages, and the possibility to reconstruct original text from tokens to facilitate interpretation.

Maintained by Kasper Welbers. Last updated 6 months ago.

cpp

1.7 match 31 stars 7.50 score 174 scripts 1 dependents

venelin

PCMBase:Simulation and Likelihood Calculation of Phylogenetic Comparative Models

Phylogenetic comparative methods represent models of continuous trait data associated with the tips of a phylogenetic tree. Examples of such models are Gaussian continuous time branching stochastic processes such as Brownian motion (BM) and Ornstein-Uhlenbeck (OU) processes, which regard the data at the tips of the tree as an observed (final) state of a Markov process starting from an initial state at the root and evolving along the branches of the tree. The PCMBase R package provides a general framework for manipulating such models. This framework consists of an application programming interface for specifying data and model parameters, and efficient algorithms for simulating trait evolution under a model and calculating the likelihood of model parameters for an assumed model and trait data. The package implements a growing collection of models, which currently includes BM, OU, BM/OU with jumps, two-speed OU as well as mixed Gaussian models, in which different types of the above models can be associated with different branches of the tree. The PCMBase package is limited to trait-simulation and likelihood calculation of (mixed) Gaussian phylogenetic models. The PCMFit package provides functionality for inference of these models to tree and trait data. The package web-site <https://venelin.github.io/PCMBase/> provides access to the documentation and other resources.

Maintained by Venelin Mitov. Last updated 10 months ago.

1.7 match 6 stars 7.56 score 85 scripts 3 dependents

cranhaven

rock:Reproducible Open Coding Kit

The Reproducible Open Coding Kit ('ROCK', and this package, 'rock') was developed to facilitate reproducible and open coding, specifically geared towards qualitative research methods. Although it is a general-purpose toolkit, three specific applications have been implemented, specifically an interface to the 'rENA' package that implements Epistemic Network Analysis ('ENA'), means to process notes from Cognitive Interviews ('CIs'), and means to work with decentralized construct taxonomies ('DCTs'). The 'ROCK' and this 'rock' package are described in the ROCK book <https://rockbook.org> and more information, such as tutorials, is available at <https://rock.science>.

Maintained by Gjalt-Jorn Peters. Last updated 8 days ago.

archived packages r-universe

3.8 match 5 stars 3.40 score

cran

SAPP:Statistical Analysis of Point Processes

Functions for statistical analysis of point processes.

Maintained by Masami Saga. Last updated 2 years ago.

fortran glibc

4.0 match 3.18 score 15 scripts

andrew-plowright

ForestTools:Tools for Analyzing Remote Sensing Forest Data

Tools for analyzing remote sensing forest data, including functions for detecting treetops from canopy models, outlining tree crowns, and calculating textural metrics.

Maintained by Andrew Plowright. Last updated 1 months ago.

1.8 match 73 stars 7.01 score 103 scripts 1 dependents

roelandkindt

maxlike:Model Species Distributions by Estimating the Probability of Occurrence Using Presence-Only Data

Provides a likelihood-based approach to modeling species distributions using presence-only data. In contrast to the popular software program MAXENT, this approach yields estimates of the probability of occurrence, which is a natural descriptor of a species' distribution.

Maintained by Roeland Kindt. Last updated 12 months ago.

6.7 match 1.87 score 20 scripts

config-i1

legion:Forecasting Using Multivariate Models

Functions implementing multivariate state space models for purposes of time series analysis and forecasting. The focus of the package is on multivariate models, such as Vector Exponential Smoothing, Vector ETS (Error-Trend-Seasonal model) etc. It currently includes Vector Exponential Smoothing (VES, de Silva et al., 2010, <doi:10.1177/1471082X0901000401>), Vector ETS (Svetunkov et al., 2023, <doi:10.1016/j.ejor.2022.04.040>) and simulation function for VES.

Maintained by Ivan Svetunkov. Last updated 1 months ago.

openblas cpp

1.8 match 11 stars 6.90 score 1 scripts 1 dependents

inbo

ladybird:Analysis of Ladybird Occurrence Data

Analysis of ladybird occurrence data from Belgium, the Netherlands and the UK since 1990.

Maintained by Thierry Onkelinx. Last updated 4 years ago.

7.3 match 1.70 score 3 scripts

roelandkindt

BiodiversityR:Package for Community Ecology and Suitability Analysis

Graphical User Interface (via the R-Commander) and utility functions (often based on the vegan package) for statistical analysis of biodiversity and ecological communities, including species accumulation curves, diversity indices, Renyi profiles, GLMs for analysis of species abundance and presence-absence, distance matrices, Mantel tests, and cluster, constrained and unconstrained ordination analysis. A book on biodiversity and community ecology analysis is available for free download from the website. In 2012, methods for (ensemble) suitability modelling and mapping were expanded in the package.

Maintained by Roeland Kindt. Last updated 2 months ago.

1.7 match 16 stars 7.42 score 390 scripts 2 dependents

farewe

Rarity:Calculation of Rarity Indices for Species and Assemblages of Species

Allows calculation of rarity weights for species and indices of rarity for assemblages of species according to different methods (Leroy et al. 2012, Insect. Conserv. Divers. 5:159-168 <doi:10.1111/j.1752-4598.2011.00148.x>; Leroy et al. 2013, Divers. Distrib. 19:794-803 <doi:10.1111/ddi.12040>).

Maintained by Boris Leroy. Last updated 2 years ago.

3.4 match 1 stars 3.61 score 27 scripts 1 dependents

babaknaimi

usdm:Uncertainty Analysis for Species Distribution Models

This is a framework that aims to provide methods and tools for assessing the impact of different sources of uncertainty (e.g. positional uncertainty) on the performance of species distribution models (SDMs).

Maintained by Babak Naimi. Last updated 1 years ago.

1.8 match 3 stars 6.87 score 644 scripts 1 dependents

roux-ohdsi

allofus:Interface for 'All of Us' Researcher Workbench

Streamline use of the 'All of Us' Researcher Workbench (<https://www.researchallofus.org/data-tools/workbench/>) with tools to extract and manipulate data from the 'All of Us' database. Increase interoperability with the Observational Health Data Science and Informatics ('OHDSI') tool stack by decreasing reliance on 'All of Us' tools and allowing for cohort creation via 'Atlas'. Improve reproducible and transparent research using 'All of Us'.

Maintained by Rob Cavanaugh. Last updated 4 months ago.

1.7 match 16 stars 7.19 score 30 scripts

pythonicr

re:'Python' Style Regular Expression Functions

A comprehensive set of regular expression functions based on those found in 'Python' without relying on 'reticulate'. It provides functions that intend to (1) make it easier for users familiar with 'Python' to work with regular expressions, (2) reduce the complexity often associated with regular expression code, and (3) enable users to write more readable and maintainable code that relies on regular expression-based pattern matching.

Maintained by Garrett Shipley. Last updated 7 months ago.

3.8 match 1 stars 3.28 score 19 scripts

quanteda

quanteda.textplots:Plots for the Quantitative Analysis of Textual Data

Plotting functions for visualising textual data. Extends 'quanteda' and related packages with plot methods designed specifically for text data, textual statistics, and models fit to textual data. Plot types include word clouds, lexical dispersion plots, scaling plots, network visualisations, and word 'keyness' plots.

Maintained by Kenneth Benoit. Last updated 7 months ago.

cpp

1.8 match 7 stars 6.77 score 648 scripts

bioc

soGGi:Visualise ChIP-seq, MNase-seq and motif occurrence as aggregate plots Summarised Over Grouped Genomic Intervals

The soGGi package provides a toolset to create genomic interval aggregate/summary plots of signal or motif occurrence from BAM and bigWig files as well as PWM, rlelist, GRanges and GAlignments Bioconductor objects. soGGi allows for normalisation, transformation and arithmetic operation on and between summary plot objects as well as grouping and subsetting of plots by GRanges objects and user supplied metadata. Plots are created using the ggplot2 library to allow user defined manipulation of the returned plot object. Coupled together, soGGi features a broad set of methods to visualise genomics data in the context of groups of genomic intervals such as genes, superenhancers and transcription factor binding events.

Maintained by Tom Carroll. Last updated 5 months ago.

sequencing chipseq coverage

2.7 match 4.49 score 51 scripts 1 dependents

cran

ppgm:PaleoPhyloGeographic Modeling of Climate Niches and Species Distributions

Reconstruction of paleoclimate niches using phylogenetic comparative methods and projection of reconstructed niches onto paleoclimate maps. The user can specify various models of trait evolution or estimate the best fit model, include fossils, use one or multiple phylogenies for inference, and make animations of shifting suitable habitat through time. This model was first used in Lawing and Polly (2011), and further implemented in Lawing et al (2016) and Rivera et al (2020). Lawing and Polly (2011) <doi:10.1371/journal.pone.0028554> "Pleistocene climate, phylogeny and climate envelope models: An integrative approach to better understand species' response to climate change". Lawing et al (2016) <doi:10.1086/687202> "Including fossils in phylogenetic climate reconstructions: A deep time perspective on the climatic niche evolution and diversification of spiny lizards (Sceloporus)". Rivera et al (2020) <doi:10.1111/jbi.13915> "Reconstructing historical shifts in suitable habitat of Sceloporus lineages using phylogenetic niche modelling".

Maintained by Alexandra Howard. Last updated 4 days ago.

4.0 match 3.00 score

fauvernierma

survPen:Multidimensional Penalized Splines for (Excess) Hazard Models, Relative Mortality Ratio Models and Marginal Intensity Models

Fits (excess) hazard, relative mortality ratio or marginal intensity models with multidimensional penalized splines allowing for time-dependent effects, non-linear effects and interactions between several continuous covariates. In survival and net survival analysis, in addition to modelling the effect of time (via the baseline hazard), one often has to deal with several continuous covariates and model their functional forms, their time-dependent effects, and their interactions. Model specification therefore becomes a complex problem and penalized regression splines represent an appealing solution to that problem as splines offer the required flexibility while penalization limits overfitting issues. Current implementations of penalized survival models can be slow or unstable and sometimes lack some key features like taking into account expected mortality to provide net survival and excess hazard estimates. In contrast, survPen provides an automated, fast, and stable implementation (thanks to explicit calculation of the derivatives of the likelihood) and offers a unified framework for multidimensional penalized hazard and excess hazard models. Later versions (>2.0.0) include penalized models for relative mortality ratio, and marginal intensity in the recurrent event setting. survPen may be of interest to those who 1) analyse any kind of time-to-event data: mortality, disease relapse, machinery breakdown, unemployment, etc., 2) wish to describe the associated hazard and to understand which predictors impact its dynamics, 3) wish to model the relative mortality ratio between a cohort and a reference population, 4) wish to describe the marginal intensity for recurrent event data. See Fauvernier et al. (2019a) <doi:10.21105/joss.01434> for an overview of the package and Fauvernier et al. (2019b) <doi:10.1111/rssc.12368> for the method.

Maintained by Mathieu Fauvernier. Last updated 3 months ago.

cpp

1.8 match 12 stars 6.82 score 85 scripts 1 dependents

bblonder

netassoc:Inference of Species Associations from Co-Occurrence Data

Infers species associations from community matrices. Uses local and (optional) regional-scale co-occurrence data by comparing observed partial correlation coefficients between species to those estimated from regional species distributions. Extends Gaussian graphical models to a null modeling framework. Provides interface to a variety of inverse covariance matrix estimation methods.

Maintained by Benjamin Blonder. Last updated 3 years ago.

5.2 match 2 stars 2.30 score 9 scripts

hemingnm

SESraster:Raster Randomization for Null Hypothesis Testing

Randomization of presence/absence species distribution raster data with or without including spatial structure for calculating standardized effect sizes and testing null hypotheses. The randomization algorithms are based on classical algorithms for matrices (Gotelli 2000, <doi:10.2307/177478>) implemented for raster data.

Maintained by Neander Marcel Heming. Last updated 5 months ago.

null-models randomization raster spatial spatial-analysis species-distribution-modelling

1.8 match 7 stars 6.61 score 32 scripts 2 dependents

cysouw

qlcMatrix:Utility Sparse Matrix Functions for Quantitative Language Comparison

Extension of the functionality of the 'Matrix' package for using sparse matrices. Some of the functions are very general, while others are highly specific to the special data formats used for quantitative language comparison.

Maintained by Michael Cysouw. Last updated 9 months ago.

1.7 match 6 stars 6.98 score 256 scripts 1 dependents

philipmostert

PointedSDMs:Fit Models Derived from Point Processes to Species Distributions using 'inlabru'

Integrated species distribution modeling is a rising field in quantitative ecology thanks to significant rises in the quantity of data available, increases in computational speed and the proven benefits of using such models. Despite this, the general software to help ecologists construct such models in an easy-to-use framework is lacking. We therefore introduce the R package 'PointedSDMs', which provides the tools to help ecologists set up integrated models and perform inference on them. There are also functions within the package to help run spatial cross-validation for model selection, as well as generic plotting and predicting functions. These methods are introduced in Isaac, Jarzyna, Keil, Dambly, Boersch-Supan, Browning, Freeman, Golding, Guillera-Arroita, Henrys, Jarvis, Lahoz-Monfort, Pagel, Pescott, Schmucki, Simmonds and O’Hara (2020) <doi:10.1016/j.tree.2019.08.006>.

Maintained by Philip Mostert. Last updated 2 months ago.

1.3 match 25 stars 8.57 score 50 scripts 1 dependents

red-list-ecosystem

redlistr:Tools for the IUCN Red List of Ecosystems and Species

A toolbox created by members of the International Union for Conservation of Nature (IUCN) Red List of Ecosystems Committee for Scientific Standards. Primarily, it is a set of tools suitable for calculating the metrics required for making assessments of species and ecosystems against the IUCN Red List of Threatened Species and the IUCN Red List of Ecosystems categories and criteria. See the IUCN website for detailed guidelines, the criteria, publications and other information.

Maintained by Calvin Lee. Last updated 1 years ago.

1.8 match 32 stars 6.35 score 35 scripts

njlyon0

supportR:Support Functions for Wrangling and Visualization

Suite of helper functions for data wrangling and visualization. The only theme for these functions is that they tend towards simple, short, and narrowly-scoped. These functions are built for tasks that often recur but are not large enough in scope to warrant an ecosystem of interdependent functions.

Maintained by Nicholas J Lyon. Last updated 4 months ago.

data-science

1.8 match 5 stars 6.22 score 15 scripts

microsoft

wpa:Tools for Analysing and Visualising Viva Insights Data

Opinionated functions that enable easier and faster analysis of Viva Insights data. There are three main types of functions in 'wpa': (1) Standard functions create a 'ggplot' visual or a summary table based on a specific Viva Insights metric; (2) Report Generation functions generate HTML reports on a specific analysis area, e.g. Collaboration; (3) Other miscellaneous functions cover more specific applications (e.g. Subject Line text mining) of Viva Insights data. This package adheres to 'tidyverse' principles and works well with the pipe syntax. 'wpa' is built with beginner-to-intermediate R users in mind, and is optimised for simplicity.

Maintained by Martin Chan. Last updated 4 months ago.

workplace-analytics

1.7 match 30 stars 6.69 score 39 scripts 1 dependents

jsanchezalv

WARDEN:Workflows for Health Technology Assessments in R using Discrete EveNts

Toolkit to support and perform discrete event simulations without resource constraints in the context of health technology assessments (HTA). The package focuses on cost-effectiveness modelling and aims to be submission-ready to relevant HTA bodies in alignment with 'NICE TSD 15' <https://www.sheffield.ac.uk/nice-dsu/tsds/patient-level-simulation>. More details and examples can be found on the package website <https://jsanchezalv.github.io/WARDEN/>.

Maintained by Javier Sanchez Alvarez. Last updated 3 months ago.

1.7 match 6 stars 6.69 score 9 scripts

matildabrown

rWCVP:Generating Summaries, Reports and Plots from the World Checklist of Vascular Plants

A companion to the World Checklist of Vascular Plants (WCVP). It includes functions to generate maps and species lists, as well as match names to the WCVP. For more details and to cite the package, see: Brown M.J.M., Walker B.E., Black N., Govaerts R., Ondo I., Turner R., Nic Lughadha E. (in press). "rWCVP: A companion R package to the World Checklist of Vascular Plants". New Phytologist.

Maintained by Matilda Brown. Last updated 1 years ago.

1.8 match 22 stars 6.17 score 45 scripts 1 dependents

mt1022

cubar:Codon Usage Bias Analysis

A suite of functions for rapid and flexible analysis of codon usage bias. It provides in-depth analysis at the codon level, including relative synonymous codon usage (RSCU), tRNA weight calculations, machine learning predictions for optimal or preferred codons, and visualization of codon-anticodon pairing. Additionally, it can calculate various gene-specific codon indices such as codon adaptation index (CAI), effective number of codons (ENC), fraction of optimal codons (Fop), tRNA adaptation index (tAI), mean codon stabilization coefficients (CSCg), and GC contents (GC/GC3s/GC4d). It also supports both standard and non-standard genetic code tables found in NCBI, as well as custom genetic code tables.

Maintained by Hong Zhang. Last updated 3 months ago.

bioinformatics codon-usage machine-learning sequence-analysis

1.9 match 6 stars 5.82 score 8 scripts

fabrice-rossi

mixvlmc:Variable Length Markov Chains with Covariates

Estimates Variable Length Markov Chains (VLMC) models and VLMC with covariates models from discrete sequences. Supports model selection via information criteria and simulation of new sequences from an estimated model. See Bühlmann, P. and Wyner, A. J. (1999) <doi:10.1214/aos/1018031204> for VLMC and Zanin Zambom, A., Kim, S. and Lopes Garcia, N. (2022) <doi:10.1111/jtsa.12615> for VLMC with covariates.

Maintained by Fabrice Rossi. Last updated 10 months ago.

machine-learning markov-chain markov-model statistics time-series cpp

1.8 match 2 stars 6.23 score 20 scripts

bioc

dominoSignal:Cell Communication Analysis for Single Cell RNA Sequencing

dominoSignal is a package developed to analyze cell signaling through ligand - receptor - transcription factor networks in scRNAseq data. It takes transcriptomic data as input, requiring counts, z-scored counts, and cluster labels, as well as information on transcription factor activation (such as from SCENIC) and a database of ligand and receptor pairings (such as from CellPhoneDB). This package creates an object storing ligand - receptor - transcription factor linkages by cluster and provides several methods for exploring, summarizing, and visualizing the analysis.

Maintained by Jacob T Mitchell. Last updated 5 months ago.

systemsbiology singlecell transcriptomics network

1.7 match 5 stars 6.50 score 5 scripts

danlwarren

ENMTools:Analysis of Niche Evolution using Niche and Distribution Models

Constructing niche models and analyzing patterns of niche evolution. Acts as an interface for many popular modeling algorithms, and allows users to conduct Monte Carlo tests to address basic questions in evolutionary ecology and biogeography. Warren, D.L., R.E. Glor, and M. Turelli (2008) <doi:10.1111/j.1558-5646.2008.00482.x> Glor, R.E., and D.L. Warren (2011) <doi:10.1111/j.1558-5646.2010.01177.x> Warren, D.L., R.E. Glor, and M. Turelli (2010) <doi:10.1111/j.1600-0587.2009.06142.x> Cardillo, M., and D.L. Warren (2016) <doi:10.1111/geb.12455> D.L. Warren, L.J. Beaumont, R. Dinnage, and J.B. Baumgartner (2019) <doi:10.1111/ecog.03900>.

Maintained by Dan Warren. Last updated 2 months ago.

1.6 match 105 stars 6.91 score 126 scripts