Showing 102 of 102 results
business-science
anomalize:Tidy Anomaly Detection
The 'anomalize' package enables a "tidy" workflow for detecting anomalies in data. The main functions are time_decompose(), anomalize(), and time_recompose(). When combined, it's quite simple to decompose time series, detect anomalies, and create bands separating the "normal" data from the anomalous data at scale (i.e. for multiple time series). Time series decomposition is used to remove trend and seasonal components via the time_decompose() function and methods include seasonal decomposition of time series by Loess ("stl") and seasonal decomposition by piecewise medians ("twitter"). The anomalize() function implements two methods for anomaly detection of residuals: the interquartile range ("iqr") and the generalized extreme studentized deviate ("gesd"). These methods are based on those used in the 'forecast' package and the Twitter 'AnomalyDetection' package. Refer to the associated functions for specific references for these methods.
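As a rough illustration of the decompose-then-detect idea in this description, the following Python sketch removes a seasonal component using per-phase medians and flags remainders that fall outside interquartile-range fences. It is not the package's R interface: the function names, the fixed `period`, and the fence multiplier of 3 are assumptions made for this example, and no trend component is handled.

```python
import statistics

def seasonal_median_remainder(values, period):
    """Subtract the median of each seasonal phase (assumes there is no trend)."""
    phase_median = [statistics.median(values[p::period]) for p in range(period)]
    return [v - phase_median[i % period] for i, v in enumerate(values)]

def iqr_anomalies(remainder, multiplier=3.0):
    """Return the indices whose remainder lies outside the IQR fences."""
    q1, _, q3 = statistics.quantiles(remainder, n=4)
    iqr = q3 - q1
    lower, upper = q1 - multiplier * iqr, q3 + multiplier * iqr
    return [i for i, r in enumerate(remainder) if r < lower or r > upper]

# Hypothetical series: a period-4 pattern with small jitter and one injected spike.
pattern = [10.0, 20.0, 30.0, 20.0]
jitter = [0.0, 0.2, -0.2]
series = [pattern[i % 4] + jitter[i % 3] for i in range(24)]
series[9] += 50.0

remainder = seasonal_median_remainder(series, period=4)
flagged = iqr_anomalies(remainder)
print(flagged)
```

The fences play the role of the bands that separate "normal" from anomalous observations; a wider multiplier flags fewer points.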
Maintained by Matt Dancho. Last updated 1 year ago.
anomaly, anomaly-detection, decomposition, detect-anomalies, iqr, time-series
50.5 match 339 stars 9.56 score 332 scripts
cefet-rj-dal
harbinger:A Unified Time Series Event Detection Framework
By analyzing time series, it is possible to observe significant changes in the behavior of observations that frequently characterize events. Events present themselves as anomalies, change points, or motifs. In the literature, there are several methods for detecting events. However, searching for a suitable time series method is a complex task, especially considering that the nature of events is often unknown. This work presents Harbinger, a framework for integrating and analyzing event detection methods. Harbinger contains several state-of-the-art methods described in Salles et al. (2020) <doi:10.5753/sbbd.2020.13626>.
Maintained by Eduardo Ogasawara. Last updated 3 months ago.
33.7 match 18 stars 8.32 score 216 scripts
waternumbers
anomalous:Anomaly Detection using the CAPA and PELT Algorithms
Implementations of the univariate CAPA <doi:10.1002/sam.11586> and PELT <doi:10.1080/01621459.2012.737745> algorithms along with various cost functions for different distributions and models. The modular design, using R6 classes, favours ease of extension (for example, user-written cost functions) over the performance of other implementations (e.g. <doi:10.32614/CRAN.package.changepoint>, <doi:10.32614/CRAN.package.anomaly>).
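To make the PELT idea concrete, here is a minimal Python sketch of pruned dynamic programming for changepoints in the mean. It assumes a squared-error segment cost and a hand-picked penalty; the function name `pelt_mean_changepoints` and the example data are hypothetical, and the sketch does not reproduce the package's R6 classes or its cost functions.

```python
def pelt_mean_changepoints(y, penalty):
    """Changepoints in the mean via PELT-style pruned dynamic programming."""
    n = len(y)
    # Prefix sums give the squared-error cost of any segment in O(1).
    s1, s2 = [0.0], [0.0]
    for v in y:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(a, b):
        """Sum of squared deviations from the mean of y[a:b], with b > a."""
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    best = [0.0] * (n + 1)   # best[t]: optimal penalised cost of y[:t]
    best[0] = -penalty
    last = [0] * (n + 1)     # last[t]: start of the final segment in that optimum
    candidates = [0]
    for t in range(1, n + 1):
        values = [(best[s] + cost(s, t) + penalty, s) for s in candidates]
        best[t], last[t] = min(values)
        # Pruning step: drop starts that can never be optimal for later t.
        candidates = [s for v, s in values if v - penalty <= best[t]] + [t]

    changepoints = []
    t = n
    while last[t] > 0:
        changepoints.append(last[t])
        t = last[t]
    return sorted(changepoints)

# Hypothetical data: the mean jumps from 0 to 10 at index 10.
signal = [0.0] * 10 + [10.0] * 10
changepoints = pelt_mean_changepoints(signal, penalty=5.0)
print(changepoints)
```

Swapping in a different `cost` function is the kind of extension that the description's modular design is aimed at.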
Maintained by Paul Smith. Last updated 3 months ago.
52.3 match 4.61 score 18 scripts
business-science
timetk:A Tool Kit for Working with Time Series
Easy visualization, wrangling, and feature engineering of time series data for forecasting and machine learning prediction. Consolidates and extends time series functionality from packages including 'dplyr', 'stats', 'xts', 'forecast', 'slider', 'padr', 'recipes', and 'rsample'.
Maintained by Matt Dancho. Last updated 1 year ago.
coercion, coercion-functions, data-mining, dplyr, forecast, forecasting, forecasting-models, machine-learning, series-decomposition, series-signature, tibble, tidy, tidyquant, tidyverse, time, time-series, timeseries
16.0 match 625 stars 14.15 score 4.0k scripts 16 dependents
dankelley
oce:Analysis of Oceanographic Data
Supports the analysis of Oceanographic data, including 'ADCP' measurements, measurements made with 'argo' floats, 'CTD' measurements, sectional data, sea-level time series, coastline and topographic data, etc. Provides specialized functions for calculating seawater properties such as potential temperature in either the 'UNESCO' or 'TEOS-10' equation of state. Produces graphical displays that conform to the conventions of the Oceanographic literature. This package is discussed extensively by Kelley (2018) "Oceanographic Analysis with R" <doi:10.1007/978-1-4939-8844-0>.
Maintained by Dan Kelley. Last updated 1 day ago.
14.2 match 146 stars 15.42 score 4.2k scripts 18 dependents
teos-10
gsw:Gibbs Sea Water Functions
Provides an interface to the Gibbs 'SeaWater' ('TEOS-10') C library, version 3.06-16-0 (commit '657216dd4f5ea079b5f0e021a4163e2d26893371', dated 2022-10-11, available at <https://github.com/TEOS-10/GSW-C>), which stems from 'Matlab' and other code written by members of Working Group 127 of 'SCOR'/'IAPSO' (Scientific Committee on Oceanic Research / International Association for the Physical Sciences of the Oceans).
Maintained by Dan Kelley. Last updated 8 days ago.
gibbs, oceanography, seawater, teos-10
19.5 match 8 stars 8.53 score 286 scripts 19 dependents
bioc
scDiagnostics:Cell type annotation diagnostics
The scDiagnostics package provides diagnostic plots to assess the quality of cell type assignments from single cell gene expression profiles. The implemented functionality makes it possible to assess the reliability of cell type annotations, investigate gene expression patterns, and explore relationships between different cell types in query and reference datasets, allowing users to detect potential misalignments between reference and query datasets. The package also provides visualization capabilities for diagnostic purposes.
Maintained by Anthony Christidis. Last updated 5 months ago.
annotation, classification, clustering, geneexpression, rnaseq, singlecell, software, transcriptomics
14.0 match 8 stars 7.77 score 46 scripts
david-cortes
isotree:Isolation-Based Outlier Detection
Fast and multi-threaded implementation of isolation forest (Liu, Ting, Zhou (2008) <doi:10.1109/ICDM.2008.17>), extended isolation forest (Hariri, Kind, Brunner (2018) <doi:10.48550/arXiv.1811.02141>), SCiForest (Liu, Ting, Zhou (2010) <doi:10.1007/978-3-642-15883-4_18>), fair-cut forest (Cortes (2021) <doi:10.48550/arXiv.2110.13402>), robust random-cut forest (Guha, Mishra, Roy, Schrijvers (2016) <http://proceedings.mlr.press/v48/guha16.html>), and customizable variations of them, for isolation-based outlier detection, clustered outlier detection, distance or similarity approximation (Cortes (2019) <doi:10.48550/arXiv.1910.12362>), isolation kernel calculation (Ting, Zhu, Zhou (2018) <doi:10.1145/3219819.3219990>), and imputation of missing values (Cortes (2019) <doi:10.48550/arXiv.1911.06646>), based on random or guided decision tree splitting, and providing different metrics for scoring anomalies based on isolation depth or density (Cortes (2021) <doi:10.48550/arXiv.2111.11639>). Provides simple heuristics for fitting the model to categorical columns and handling missing data, and offers options for varying between random and guided splits, and for using different splitting criteria.
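The basic isolation forest idea (points that are separated by fewer random splits are more anomalous) can be sketched in a few lines of Python. This is only the plain variant with axis-parallel random splits and no subsampling; the function names, tree count, and seed are assumptions for illustration, and none of the extended, fair-cut, or guided variants listed in the description are covered.

```python
import math
import random

def average_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup among n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def build_tree(points, depth, max_depth, rng):
    if len(points) <= 1 or depth >= max_depth:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # no split possible on this feature; this sketch simply stops
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }

def path_length(point, node, depth=0):
    if "size" in node:
        # Unexpanded leaves get the expected extra depth for their size.
        return depth + average_path_length(node["size"])
    child = node["left"] if point[node["dim"]] < node["split"] else node["right"]
    return path_length(point, child, depth + 1)

def isolation_scores(points, n_trees=100, seed=0):
    """Scores in (0, 1]; higher means easier to isolate, i.e. more anomalous."""
    rng = random.Random(seed)
    max_depth = math.ceil(math.log2(max(len(points), 2)))
    trees = [build_tree(points, 0, max_depth, rng) for _ in range(n_trees)]
    norm = average_path_length(len(points))
    scores = []
    for p in points:
        mean_depth = sum(path_length(p, t) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / norm))
    return scores

# Hypothetical data: a Gaussian cluster plus one distant point (the last one).
data_rng = random.Random(1)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(50)] + [(20.0, 20.0)]
scores = isolation_scores(data)
most_anomalous = scores.index(max(scores))
```

Scoring by isolation depth, as here, is one of the metrics the description mentions; density-based scoring would need additional bookkeeping per node.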
Maintained by David Cortes. Last updated 15 days ago.
anomaly-detection, imputation, isolation-forest, outlier-detection, cpp, openmp
8.0 match 203 stars 10.41 score 115 scripts 6 dependents
eliocamp
metR:Tools for Easier Analysis of Meteorological Fields
Many useful functions and extensions for dealing with meteorological data in the tidy data framework. Extends 'ggplot2' for better plotting of scalar and vector fields and provides commonly used analysis methods in the atmospheric sciences.
Maintained by Elio Campitelli. Last updated 21 days ago.
atmospheric-science, ggplot2, visualization
6.0 match 144 stars 12.19 score 1000 scripts 22 dependents
cran
anomaly:Detecting Anomalies in Data
Implements Collective And Point Anomaly (CAPA) Fisch, Eckley, and Fearnhead (2022) <doi:10.1002/sam.11586>, Multi-Variate Collective And Point Anomaly (MVCAPA) Fisch, Eckley, and Fearnhead (2021) <doi:10.1080/10618600.2021.1987257>, Proportion Adaptive Segment Selection (PASS) Jeng, Cai, and Li (2012) <doi:10.1093/biomet/ass059>, and Bayesian Abnormal Region Detector (BARD) Bardwell and Fearnhead (2015) <doi:10.1214/16-BA998>. These methods are for the detection of anomalies in time series data. Further information regarding the use of this package along with detailed examples can be found in Fisch, Grose, Eckley, Fearnhead, and Bardwell (2024) <doi:10.18637/jss.v110.i01>.
Maintained by Daniel Grose. Last updated 7 months ago.
63.5 match 1 star 1.00 score
nixtla
nixtlar:A Software Development Kit for 'Nixtla''s 'TimeGPT'
A Software Development Kit for working with 'Nixtla''s 'TimeGPT', a foundation model for time series forecasting. 'API' is an acronym for 'application programming interface'; this package allows users to interact with 'TimeGPT' via the 'API'. You can set and validate 'API' keys and generate forecasts via 'API' calls. It is compatible with 'tsibble' and base R. For more details visit <https://docs.nixtla.io/>.
Maintained by Mariana Menchero. Last updated 28 days ago.
7.7 match 30 stars 8.16 score 38 scripts
bflammers
ANN2:Artificial Neural Networks for Anomaly Detection
Training of neural networks for classification and regression tasks using mini-batch gradient descent. Special features include a function for training autoencoders, which can be used to detect anomalies, and some related plotting functions. Multiple activation functions are supported, including tanh, relu, step and ramp. For the use of the step and ramp activation functions in detecting anomalies using autoencoders, see Hawkins et al. (2002) <doi:10.1007/3-540-46145-0_17>. Furthermore, several loss functions are supported, including robust ones such as Huber and pseudo-Huber loss, as well as L1 and L2 regularization. The possible options for optimization algorithms are RMSprop, Adam and SGD with momentum. The package contains a vectorized C++ implementation that facilitates fast training through mini-batch learning.
Maintained by Bart Lammers. Last updated 4 years ago.
anomaly-detection, artificial-neural-networks, autoencoders, neural-networks, robust-statistics, openblas, cpp, openmp
11.3 match 13 stars 5.59 score 60 scripts
robjhyndman
weird:Functions and Data Sets for "That's Weird: Anomaly Detection Using R" by Rob J Hyndman
All functions and data sets required for the examples in the book Hyndman (2024) "That's Weird: Anomaly Detection Using R" <https://OTexts.com/weird/>. All packages needed to run the examples are also loaded.
Maintained by Rob Hyndman. Last updated 3 months ago.
10.8 match 15 stars 5.71 score 18 scripts
pridiltal
stray:Anomaly Detection in High Dimensional and Temporal Data
This is a modification of the 'HDoutliers' package. The 'HDoutliers' algorithm is a powerful unsupervised algorithm for detecting anomalies in high-dimensional data, with a strong theoretical foundation. However, it suffers from some limitations that significantly hinder its performance under certain circumstances. This package implements the algorithm proposed in Talagala, Hyndman and Smith-Miles (2019) <arXiv:1908.04000> for detecting anomalies in high-dimensional data, which addresses these limitations of the 'HDoutliers' algorithm. We define an anomaly as an observation that deviates markedly from the majority with a large distance gap. An approach based on extreme value theory is used for the anomalous threshold calculation.
Maintained by Priyanga Dilini Talagala. Last updated 1 year ago.
10.9 match 58 stars 5.47 score 34 scripts 1 dependent
dbolotov
amelie:Anomaly Detection with Normal Probability Functions
Implements anomaly detection as binary classification for cross-sectional data. Uses maximum likelihood estimates and normal probability functions to classify observations as anomalous. The method is presented in the following lecture from the Machine Learning course by Andrew Ng: <https://www.coursera.org/learn/machine-learning/lecture/C8IJp/algorithm/>, and is also described in: Aleksandar Lazarevic, Levent Ertoz, Vipin Kumar, Aysel Ozgur, Jaideep Srivastava (2003) <doi:10.1137/1.9781611972733.3>.
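The approach described here (maximum likelihood estimates plus normal densities, with a cut-off that turns densities into a binary decision) can be sketched in Python as follows. The function names, the independence assumption across features, and the threshold `epsilon = 0.01` are illustrative assumptions, not the package's interface; in practice the threshold would be tuned on labelled validation data.

```python
import math

def fit_gaussian(rows):
    """Per-feature maximum likelihood estimates of the mean and variance."""
    n, d = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(d)]
    var = [sum((r[j] - mu[j]) ** 2 for r in rows) / n for j in range(d)]
    return mu, var

def density(x, mu, var):
    """Product of univariate normal densities (features treated as independent)."""
    p = 1.0
    for xj, m, v in zip(x, mu, var):
        p *= math.exp(-((xj - m) ** 2) / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
    return p

def is_anomaly(x, mu, var, epsilon):
    """Classify x as anomalous when its density falls below epsilon."""
    return density(x, mu, var) < epsilon

# Hypothetical training data centred on the origin.
train = [(0.0, 0.0), (1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
mu, var = fit_gaussian(train)
epsilon = 0.01  # assumed threshold for this example
typical = is_anomaly((0.0, 0.0), mu, var, epsilon)
unusual = is_anomaly((5.0, 5.0), mu, var, epsilon)
```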
Maintained by Dmitriy Bolotov. Last updated 6 years ago.
15.8 match 3.70 score 6 scripts
david-cortes
outliertree:Explainable Outlier Detection Through Decision Tree Conditioning
Outlier detection method that flags suspicious values within observations, contrasting them against the normal values in a user-readable format, potentially describing conditions within the data that make a given outlier more rare. Full procedure is described in Cortes (2020) <doi:10.48550/arXiv.2001.00636>. Loosely based on the 'GritBot' <https://www.rulequest.com/gritbot-info.html> software.
Maintained by David Cortes. Last updated 2 months ago.
anomaly-detection, outlier-detection, cpp, openmp
7.5 match 58 stars 7.34 score 21 scripts 2 dependents
dicook
mulgar:Functions for Pre-Processing Data for Multivariate Data Visualisation using Tours
This is a companion to the book "Interactively exploring high-dimensional data and models in R" by Cook, D. and Laa, U. (2023) <https://dicook.github.io/mulgar_book/>. It contains useful functions for processing data in preparation for visualising with a tour. There are also several sample data sets.
Maintained by Dianne Cook. Last updated 2 months ago.
12.0 match 4 stars 4.50 score 79 scripts
ggobi
tourr:Tour Methods for Multivariate Data Visualisation
Implements geodesic interpolation and basis generation functions that allow you to create new tour methods from R.
Maintained by Dianne Cook. Last updated 17 days ago.
4.1 match 65 stars 11.17 score 426 scripts 9 dependents
haydarde
dLagM:Time Series Regression Models with Distributed Lag Models
Provides time series regression models with one predictor using finite distributed lag models, polynomial (Almon) distributed lag models, geometric distributed lag models with Koyck transformation, and autoregressive distributed lag models. It also includes functions for computing h-step ahead forecasts from these models. See Demirhan (2020) (<doi:10.1371/journal.pone.0228812>) and Baltagi (2011) (<doi:10.1007/978-3-642-20059-5>) for more information.
Maintained by Haydar Demirhan. Last updated 1 year ago.
13.3 match 2 stars 3.18 score 127 scripts
cvxgrp
CVXR:Disciplined Convex Optimization
An object-oriented modeling language for disciplined convex programming (DCP) as described in Fu, Narasimhan, and Boyd (2020, <doi:10.18637/jss.v094.i14>). It allows the user to formulate convex optimization problems in a natural way following mathematical convention and DCP rules. The system analyzes the problem, verifies its convexity, converts it into a canonical form, and hands it off to an appropriate solver to obtain the solution. Interfaces to solvers on CRAN and elsewhere are provided, both commercial and open source.
Maintained by Anqi Fu. Last updated 4 months ago.
3.2 match 207 stars 12.89 score 768 scripts 51 dependents
lucasvenez
precintcon:Precipitation Intensity, Concentration and Anomaly Analysis
It contains functions to analyze the precipitation intensity, concentration and anomaly.
Maintained by Lucas Venezian Povoa. Last updated 9 years ago.
9.2 match 10 stars 4.28 score 38 scripts
bioc
GWASTools:Tools for Genome Wide Association Studies
Classes for storing very large GWAS data sets and annotation, and functions for GWAS data cleaning and analysis.
Maintained by Stephanie M. Gogarten. Last updated 5 months ago.
snp, geneticvariability, qualitycontrol, microarray
3.6 match 17 stars 10.50 score 396 scripts 5 dependents
emilio-berti
GHCNr:Download Weather Station Data from GHCNd
The goal of 'GHCNr' is to provide a fast and friendly interface with the Global Historical Climatology Network daily (GHCNd) database, which contains daily summaries of weather station data worldwide (<https://www.ncei.noaa.gov/products/land-based-station/global-historical-climatology-network-daily>). GHCNd is accessed through the web API <https://www.ncei.noaa.gov/access/services/data/v1>. The main functionalities of 'GHCNr' are downloading data from GHCNd, filtering it, and aggregating it at monthly and annual scales.
Maintained by Emilio Berti. Last updated 2 months ago.
7.5 match 2 stars 4.95 score 3 scripts
vitomuggeo
segmented:Regression Models with Break-Points / Change-Points Estimation (with Possibly Random Effects)
Fitting regression models where, in addition to possible linear terms, one or more covariates have segmented (i.e., broken-line or piece-wise linear) or stepmented (i.e. piece-wise constant) effects. Multiple breakpoints for the same variable are allowed. The estimation method is discussed in Muggeo (2003, <doi:10.1002/sim.1545>) and illustrated in Muggeo (2008, <https://www.r-project.org/doc/Rnews/Rnews_2008-1.pdf>). An approach for hypothesis testing is presented in Muggeo (2016, <doi:10.1080/00949655.2016.1149855>), and interval estimation for the breakpoint is discussed in Muggeo (2017, <doi:10.1111/anzs.12200>). Segmented mixed models, i.e. random effects in the change point, are discussed in Muggeo (2014, <doi:10.1177/1471082X13504721>). Estimation of piecewise-constant relationships and changepoints (mean-shift models) is discussed in Fasola et al. (2018, <doi:10.1007/s00180-017-0740-4>).
Maintained by Vito M. R. Muggeo. Last updated 16 days ago.
3.6 match 9 stars 10.03 score 1.2k scripts 203 dependents
cran
datarobot:'DataRobot' Predictive Modeling API
For working with the 'DataRobot' predictive modeling platform's API <https://www.datarobot.com/>.
Maintained by AJ Alon. Last updated 1 year ago.
10.3 match 2 stars 3.48 score
tomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 year ago.
4.0 match 3 stars 8.20 score 7.8k scripts 11 dependents
al-obrien
spectralAnomaly:Detect Anomalies Using the Spectral Residual Algorithm
Apply the spectral residual algorithm to data, such as a time series, to detect anomalies. Anomaly scores can be used to determine outliers based upon a threshold or fed into more sophisticated prediction models. Methods are based upon "Time-Series Anomaly Detection Service at Microsoft", Ren, H., Xu, B., Wang, Y., et al., (2019) <doi:10.48550/arXiv.1906.03821>.
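A compact Python sketch of the spectral residual idea follows: take the log amplitude spectrum, subtract a smoothed copy of it, and transform back to obtain a saliency score per observation. It assumes a naive O(n^2) DFT from the standard library, a circular moving average with an odd window, and no forecasting-based padding of the series; the function names and example data are hypothetical and do not mirror the package's functions.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive discrete Fourier transform (fine for short illustrative series)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def spectral_residual_saliency(series, window=3):
    """Saliency map; `window` is assumed odd for the centred moving average."""
    n = len(series)
    eps = 1e-12  # guards against log(0) and division by zero
    spectrum = dft(series)
    log_amp = [math.log(abs(c) + eps) for c in spectrum]
    half = window // 2
    smooth = [
        sum(log_amp[(k + j) % n] for j in range(-half, half + 1)) / window
        for k in range(n)
    ]
    residual = [la - sm for la, sm in zip(log_amp, smooth)]
    # Keep the original phase; replace the amplitude by exp(residual).
    adjusted = [math.exp(r) * (c / (abs(c) + eps)) for r, c in zip(residual, spectrum)]
    return [abs(v) for v in dft(adjusted, inverse=True)]

# Hypothetical series: flat signal with a single spike at index 20.
series = [1.0] * 32
series[20] = 10.0
saliency = spectral_residual_saliency(series)
peak = max(range(len(saliency)), key=lambda i: saliency[i])
outliers = [i for i, s in enumerate(saliency) if s > 0.5]  # assumed threshold
```

The saliency values are the anomaly scores; thresholding them, as in the last line, is the simplest of the two uses the description mentions.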
Maintained by Allen OBrien. Last updated 7 days ago.
9.3 match 2 stars 3.48 score 3 scripts
zcebeci
odetector:Outlier Detection Using Partitioning Clustering Algorithms
An object is called an "outlier" if it deviates remarkably from the other objects in a data set. Outlier detection is the process of finding outliers using methods based on distance measures, clustering and spatial methods (Ben-Gal, 2005 <ISBN 0-387-24435-2>). It is one of the intensively studied research topics for identification of novelties, frauds, anomalies, deviations or exceptions, in addition to its use for outlier removal in data processing. This package provides implementations of some novel approaches to detect outliers based on typicality degrees that are obtained with soft partitioning clustering algorithms such as Fuzzy C-means and its variants.
Maintained by Zeynel Cebeci. Last updated 2 years ago.
anomaly-detection, cluster-analysis, clustering, clustering-methods, data, datapreparation, datapreprocessing, exception-handling, fcm, fraud-detection, fuzzy-clustering, novelty-detection, outlier-detection, outlier-removal, outliers, partitioning, pcm, surprise-exploration
8.0 match 3.70 score 4 scripts
mlverse
torch:Tensors and Neural Networks with 'GPU' Acceleration
Provides functionality to define and train neural networks similar to 'PyTorch' by Paszke et al (2019) <doi:10.48550/arXiv.1912.01703> but written entirely in R using the 'libtorch' library. Also supports low-level tensor operations and 'GPU' acceleration.
Maintained by Daniel Falbel. Last updated 6 days ago.
1.7 match 520 stars 16.52 score 1.4k scripts 38 dependents
cran
s2dv:A Set of Common Tools for Seasonal to Decadal Verification
The advanced version of the package 's2dverification'. It is intended for 'seasonal to decadal' (s2d) climate forecast verification, but it can also be used for other kinds of forecasts or general climate analysis. This package is specially designed for the comparison between experimental and observational datasets. The included functions cover data retrieval, data post-processing, computation of skill scores against observations, and visualization. Compared to 's2dverification', 's2dv' is more compatible with the package 'startR', can use multiple cores for computation, and handles multi-dimensional arrays with greater flexibility. The CDO version used in development is 1.9.8.
Maintained by Ariadna Batalla. Last updated 5 months ago.
14.2 match 1.95 score 3 dependents
sicarul
xray:X Ray Vision on your Datasets
Tools to analyze datasets prior to any statistical modeling. Includes various functions designed to find inconsistencies and understand the distribution of the data.
Maintained by Pablo Seibelt. Last updated 7 years ago.
4.9 match 75 stars 5.60 score 35 scripts
krahim
multitaper:Spectral Analysis Tools using the Multitaper Method
Implements multitaper spectral analysis using discrete prolate spheroidal sequences (Slepians) and sine tapers. It includes an adaptive weighted multitaper spectral estimate, a coherence estimate, Thomson's Harmonic F-test, and complex demodulation. The Slepian sequences are generated efficiently using a tridiagonal matrix solution, and jackknifed confidence intervals are available for most estimates. This package is an implementation of the method described in D.J. Thomson (1982) "Spectrum estimation and harmonic analysis" <doi:10.1109/PROC.1982.12433>.
Maintained by Karim Rahim. Last updated 8 months ago.
3.4 match 10 stars 7.62 score 67 scripts 26 dependents
cran
ClimProjDiags:Set of Tools to Compute Various Climate Indices
Set of tools to compute metrics and indices for climate analysis. The package provides functions to compute extreme indices, evaluate the agreement between models and combine these models into an ensemble. Multi-model time series of climate indices can be computed either after averaging the 2-D fields from different models provided they share a common grid or by combining time series computed on the model native grid. Indices can be assigned weights and/or combined to construct new indices.
Maintained by Victòria Agudetse. Last updated 1 year ago.
5.1 match 5.14 score 58 scripts 4 dependents
sevvandi
oddnet:Anomaly Detection in Temporal Networks
Anomaly detection in dynamic, temporal networks. The package 'oddnet' uses a feature-based method to identify anomalies. First, it computes many features for each network. Then it models the features using time series methods. Using time series residuals it detects anomalies. This way, the temporal dependencies are accounted for when identifying anomalies (Kandanaarachchi, Hyndman 2022) <arXiv:2210.07407>.
Maintained by Sevvandi Kandanaarachchi. Last updated 10 months ago.
6.2 match 3 stars 4.22 score 11 scripts
hoxo-m
sGMRFmix:Sparse Gaussian Markov Random Field Mixtures for Anomaly Detection
An implementation of sparse Gaussian Markov random field mixtures presented by Ide et al. (2016) <doi:10.1109/ICDM.2016.0119>. It provides a novel anomaly detection method for multivariate noisy sensor data. It can automatically handle multiple operational modes and can also compute variable-wise anomaly scores.
Maintained by Koji Makiyama. Last updated 7 years ago.
10.9 match 1 star 2.34 score 22 scripts
ecor
RGENERATE:Tools to Generate Vector Time Series
A method 'generate()' is implemented in this package for the random generation of vector time series according to models obtained by 'RMAWGEN', 'vars' or other packages. This package was created to generalize the algorithms of the 'RMAWGEN' package for the analysis and generation of any environmental vector time series.
Maintained by Emanuele Cordano. Last updated 7 months ago.
5.6 match 1 star 4.38 score 16 scripts 1 dependent
promerpr
scanstatistics:Space-Time Anomaly Detection using Scan Statistics
Detection of anomalous space-time clusters using the scan statistics methodology. Focuses on prospective surveillance of data streams, scanning for clusters with ongoing anomalies. Hypothesis testing is made possible by Monte Carlo simulation. Allévius (2018) <doi:10.21105/joss.00515>.
Maintained by Paul Romer Present. Last updated 2 years ago.
5.1 match 1 star 4.81 score 43 scripts
wch
gcookbook:Data for "R Graphics Cookbook"
Data sets used in the book "R Graphics Cookbook" by Winston Chang, published by O'Reilly Media.
Maintained by Winston Chang. Last updated 6 years ago.
3.4 match 10 stars 6.77 score 1.3k scripts 1 dependent
andrewzm
EFDR:Wavelet-Based Enhanced FDR for Detecting Signals from Complete or Incomplete Spatially Aggregated Data
Enhanced False Discovery Rate (EFDR) is a tool to detect anomalies in an image. The image is first transformed into the wavelet domain in order to decorrelate any noise components, following which the coefficients at each resolution are standardised. Statistical tests (in a multiple hypothesis testing setting) are then carried out to find the anomalies. The power of EFDR exceeds that of standard FDR, which would carry out tests on every wavelet coefficient: EFDR chooses which wavelets to test based on a criterion described in Shen et al. (2002). The package also provides elementary tools to interpolate spatially irregular data onto a grid of the required size. The work is based on Shen, X., Huang, H.-C., and Cressie, N. 'Nonparametric hypothesis testing for a spatial signal.' Journal of the American Statistical Association 97.460 (2002): 1122-1140.
Maintained by Andrew Zammit-Mangion. Last updated 2 years ago.
4.4 match 5 stars 4.74 score 22 scripts
sbshah10
discharge:Fourier Analysis of Discharge Data
Computes discrete fast Fourier transform of river discharge data and the derived metrics. The methods are described in J. L. Sabo, D. M. Post (2008) <doi:10.1890/06-1340.1> and J. L. Sabo, A. Ruhi, G. W. Holtgrieve, V. Elliott, M. E. Arias, P. B. Ngor, T. A. Räsänsen, S. Nam (2017) <doi:10.1126/science.aao1053>.
Maintained by Samarth Shah. Last updated 6 years ago.
12.5 match 1.56 score 36 scripts
jhmaindonald
gamclass:Functions and Data for a Course on Modern Regression and Classification
Functions and data are provided that support a course that emphasizes statistical issues of inference and generalizability. The functions are designed to make it straightforward to illustrate the use of cross-validation, the training/test approach, simulation, and model-based estimates of accuracy. Methods considered are Generalized Additive Modeling, Linear and Quadratic Discriminant Analysis, Tree-based methods, and Random Forests.
Maintained by John Maindonald. Last updated 2 years ago.
4.0 match 4.82 score 44 scripts
benrenard
sequenceR:A Simple Sequencer for Data Sonification
A rudimentary sequencer to define, manipulate and mix sound samples. The underlying motivation is to sonify data, as demonstrated in the blog <https://globxblog.github.io/>, the presentation by Renard and Le Bescond (2022, <https://hal.science/hal-03710340v1>) or the poster by Renard et al. (2023, <https://hal.inrae.fr/hal-04388845v1>).
Maintained by Benjamin Renard. Last updated 2 months ago.
3.8 match 3 stars 5.13 score 15 scripts
jmotif
jmotif:Time Series Analysis Toolkit Based on Symbolic Aggregate Discretization, i.e. SAX
Implements time series z-normalization, SAX, HOT-SAX, VSM, SAX-VSM, RePair, and RRA algorithms facilitating time series motif (i.e., recurrent pattern), discord (i.e., anomaly), and characteristic pattern discovery along with interpretable time series classification.
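The first two building blocks named here, z-normalization and SAX, can be illustrated with a short Python sketch: normalize the series, average it over equal-width segments (piecewise aggregate approximation), and map each segment mean to a letter using cut points of the standard normal distribution. The function names, the restriction to alphabet sizes 3 and 4, and the requirement that the length be divisible by the number of segments are assumptions of this example, not the package's interface.

```python
import statistics

# Approximate standard normal quantiles that split the real line into
# equiprobable regions, for alphabet sizes 3 and 4.
BREAKPOINTS = {3: [-0.43, 0.43], 4: [-0.67, 0.0, 0.67]}

def z_normalize(series):
    """Rescale to zero mean and unit standard deviation."""
    mean = statistics.fmean(series)
    sd = statistics.pstdev(series)
    if sd == 0:
        return [0.0] * len(series)
    return [(v - mean) / sd for v in series]

def paa(series, segments):
    """Piecewise aggregate approximation over equal-width segments."""
    if len(series) % segments != 0:
        raise ValueError("this sketch needs len(series) divisible by segments")
    width = len(series) // segments
    return [statistics.fmean(series[i * width:(i + 1) * width]) for i in range(segments)]

def sax_word(series, segments, alphabet_size=4):
    """Convert a numeric series into a SAX word."""
    cuts = BREAKPOINTS[alphabet_size]
    letters = "abcd"
    word = []
    for value in paa(z_normalize(series), segments):
        index = sum(1 for c in cuts if value >= c)
        word.append(letters[index])
    return "".join(word)

rising = sax_word([1, 2, 3, 4, 5, 6, 7, 8], segments=4)
falling = sax_word([8, 7, 6, 5, 4, 3, 2, 1], segments=4)
print(rising, falling)
```

Motif and discord discovery then operate on these words, for example by counting how often each word occurs across sliding windows.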
Maintained by Pavel Senin. Last updated 2 years ago.
anomalydiscovery, discord, discretization, kdd, sax, sax-vsm, timeseries, cpp
3.7 match 55 stars 5.12 score 48 scripts
fchamroukhi
meteorits:Mixture-of-Experts Modeling for Complex Non-Normal Distributions
Provides a unified mixture-of-experts (ME) modeling and estimation framework with several original and flexible ME models to model, cluster and classify heterogeneous data in many complex situations where the data are distributed according to non-normal, possibly skewed distributions, and when they might be corrupted by atypical observations. Mixtures-of-Experts models for complex and non-normal distributions ('meteorits') are originally introduced and written in 'Matlab' by Faicel Chamroukhi. The references are mainly the following ones. Chamroukhi F., Same A., Govaert, G. and Aknin P. (2009) <doi:10.1016/j.neunet.2009.06.040>. Chamroukhi F. (2010) <https://chamroukhi.com/FChamroukhi-PhD.pdf>. Chamroukhi F. (2015) <arXiv:1506.06707>. Chamroukhi F. (2015) <https://chamroukhi.com/FChamroukhi-HDR.pdf>. Chamroukhi F. (2016) <doi:10.1109/IJCNN.2016.7727580>. Chamroukhi F. (2016) <doi:10.1016/j.neunet.2016.03.002>. Chamroukhi F. (2017) <doi:10.1016/j.neucom.2017.05.044>.
Maintained by Florian Lecocq. Last updated 5 years ago.
artificial-intelligence, clustering, em-algorithm, mixture-of-experts, neural-networks, non-linear-regression, prediction, robust-learning, skew-normal, skew-t, skewed-data, statistical-inference, statistical-learning, t-distribution, unsupervised-learning, openblas, cpp
3.3 match 3 stars 5.12 score 11 scripts
cran
vectorsurvR:Data Access and Analytical Tools for 'VectorSurv' Users
Allows registered 'VectorSurv' <https://vectorsurv.org/> users access to data through the 'VectorSurv API' <https://api.vectorsurv.org/>. Additionally provides functions for analysis and visualization.
Maintained by Christina De Cesaris. Last updated 2 months ago.
5.2 match 3.30 score
cmsaf
cmsafvis:Tools to Visualize CM SAF NetCDF Data
The Satellite Application Facility on Climate Monitoring (CM SAF) is a ground segment of the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT) and one of EUMETSAT's Satellite Application Facilities. The CM SAF contributes to the sustainable monitoring of the climate system by providing essential climate variables related to the energy and water cycle of the atmosphere (<https://www.cmsaf.eu>). It is a joint cooperation of eight National Meteorological and Hydrological Services. The 'cmsafvis' R-package provides a collection of R-operators for the analysis and visualization of CM SAF NetCDF data. CM SAF climate data records are provided for free via (<https://wui.cmsaf.eu/safira>). Detailed information and test data are provided on the CM SAF webpage (<http://www.cmsaf.eu/R_toolbox>).
Maintained by Steffen Kothe. Last updated 6 months ago.
3.5 match 2 stars 4.73 score 1 script 1 dependent
benrwoodard
adobeanalyticsr:R Client for 'Adobe Analytics' API 2.0
Connect to the 'Adobe Analytics' API v2.0 <https://github.com/AdobeDocs/analytics-2.0-apis> which powers 'Analysis Workspace'. The package was developed with the analyst in mind, and it will continue to be developed with the guiding principles of iterative, repeatable, timely analysis.
Maintained by Ben Woodard. Last updated 2 months ago.
2.3 match 18 stars 7.02 score 39 scripts
bupaverse
daqapo:Data Quality Assessment for Process-Oriented Data
Provides a variety of methods to identify data quality issues in process-oriented data, which are useful to verify data quality in a process mining context. Builds on the class for activity logs implemented in the package 'bupaR'. Methods to identify data quality issues either consider each activity log entry independently (e.g. missing values, activity duration outliers,...), or focus on the relation amongst several activity log entries (e.g. batch registrations, violations of the expected activity order,...).
Maintained by Niels Martin. Last updated 3 years ago.
3.9 match 6 stars 3.78 score 9 scripts
labgrs
npphen:Vegetation Phenological Cycle and Anomaly Detection using Remote Sensing Data
Calculates phenological cycle and anomalies using a non-parametric approach applied to time series of vegetation indices derived from remote sensing data or field measurements. The package implements basic and high-level functions for manipulating vector data (numerical series) and raster data (satellite derived products). Processing of very large raster files is supported. For more information, please check the following paper: Chávez et al. (2023) <doi:10.3390/rs15010073>.
Maintained by José A. Lastra. Last updated 3 months ago.
3.3 match 5 stars 4.32 score 14 scripts
jsta
wql:Exploring Water Quality Monitoring Data
Functions to assist in the processing and exploration of data from environmental monitoring programs. The package name stands for "water quality" and reflects the original focus on time series data for physical and chemical properties of water, as well as the biota. Intended for programs that sample approximately monthly, quarterly or annually at discrete stations, a feature of many legacy data sets. Most of the functions should be useful for analysis of similar-frequency time series regardless of the subject matter.
Maintained by Jemma Stachelek. Last updated 2 months ago.
1.9 match 12 stars 7.34 score 204 scripts 3 dependents
hubbardalex
autostsm:Automatic Structural Time Series Models
Automatic model selection for structural time series decomposition into trend, cycle, and seasonal components, plus optionality for structural interpolation, using the Kalman filter. Koopman, Siem Jan and Marius Ooms (2012) "Forecasting Economic Time Series Using Unobserved Components Time Series Models" <doi:10.1093/oxfordhb/9780195398649.013.0006>. Kim, Chang-Jin and Charles R. Nelson (1999) "State-Space Models with Regime Switching: Classical and Gibbs-Sampling Approaches with Applications" <doi:10.7551/mitpress/6444.001.0001> <http://econ.korea.ac.kr/~cjkim/>.
Maintained by Alex Hubbard. Last updated 9 months ago.
3.8 match 3.55 score 29 scripts
cran
dslabs:Data Science Labs
Datasets and functions that can be used for data analysis practice, homework and projects in data science courses and workshops. 26 datasets are available for case studies in data visualization, statistical inference, modeling, linear regression, data wrangling and machine learning.
Maintained by Rafael A. Irizarry. Last updated 1 year ago.
3.4 match 3.56 score 2 dependents
cran
AnomalyScore:Anomaly Scoring for Multivariate Time Series
Compute an anomaly score for multivariate time series based on the k-nearest neighbors algorithm. Different computations of distances between time series are provided.
Maintained by Guillermo Granados. Last updated 4 months ago.
7.0 match 1.70 score 1 scripts
jmhewitt
telefit:Estimation and Prediction for Remote Effects Spatial Process Models
Implementation of the remote effects spatial process (RESP) model for teleconnection. The RESP model is a geostatistical model that allows a spatially-referenced variable (like average precipitation) to be influenced by covariates defined on a remote domain (like sea surface temperatures). The RESP model is introduced in Hewitt et al. (2018) <doi:10.1002/env.2523>. Sample code for working with the RESP model is available at <https://jmhewitt.github.io/research/resp_example>. This material is based upon work supported by the National Science Foundation under grant number AGS 1419558. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
Maintained by Joshua Hewitt. Last updated 5 years ago.
3.8 match 1 stars 3.19 score 31 scripts
jmzobitz
demodelr:Simulating Differential Equations with Data
Designed to support the visualization, numerical computation, qualitative analysis, model-data fusion, and stochastic simulation for autonomous systems of differential equations. Euler and Runge-Kutta methods are implemented, along with tools to visualize the two-dimensional phase plane. Likelihood surfaces and a simple Markov Chain Monte Carlo parameter estimator can be used for model-data fusion of differential equations and empirical models. The Euler-Maruyama method is provided for simulation of stochastic differential equations. The package was originally written for internal use to support teaching by Zobitz, and refined to support the text "Exploring modeling with data and differential equations using R" by John Zobitz (2021) <https://jmzobitz.github.io/ModelingWithR/index.html>.
Maintained by John Zobitz. Last updated 17 hours ago.
3.5 match 4 stars 3.34 score 11 scripts
beanumber
macleish:Retrieve Data from MacLeish Field Station
Download data from the Ada and Archibald MacLeish Field Station in Whately, MA. The Ada and Archibald MacLeish Field Station is a 260-acre patchwork of forest and farmland located in West Whately, MA that provides opportunities for faculty and students to pursue environmental research, outdoor education, and low-impact recreation (see <https://www.smith.edu/about-smith/sustainable-smith/macleish> for more information). This package contains weather data over several years, and spatial data on various man-made and natural structures.
Maintained by Benjamin S. Baumer. Last updated 3 years ago.
2.0 match 2 stars 5.72 score 87 scripts
kryberg-usgs
waterData:Retrieval, Analysis, and Anomaly Calculation of Daily Hydrologic Time Series Data
Imports U.S. Geological Survey (USGS) daily hydrologic data from USGS web services (see <https://waterservices.usgs.gov/> for more information), plots the data, addresses some common data problems, and calculates and plots anomalies.
Maintained by Karen R. Ryberg. Last updated 8 years ago.
3.3 match 3.45 score 56 scripts
tim-salabim
remote:Empirical Orthogonal Teleconnections in R
Empirical orthogonal teleconnections in R. 'remote' is short for 'R(-based) EMpirical Orthogonal TEleconnections'. It implements a collection of functions to facilitate empirical orthogonal teleconnection analysis. Empirical Orthogonal Teleconnections (EOTs) denote a regression based approach to decompose spatio-temporal fields into a set of independent orthogonal patterns. They are quite similar to Empirical Orthogonal Functions (EOFs) with EOTs producing less abstract results. In contrast to EOFs, which are orthogonal in both space and time, EOT analysis produces patterns that are orthogonal in either space or time.
Maintained by Tim Appelhans. Last updated 8 years ago.
4.0 match 2.79 score 100 scripts
allen-1242
StructuralDecompose:Decomposes a Level Shifted Time Series
Explains the behavior of a time series by decomposing it into its trend, seasonality and residuals. It is built to perform very well in the presence of significant level shifts. It is designed to play well with any breakpoint algorithm and any smoothing algorithm. Currently defaults to 'lowess' for smoothing and 'strucchange' for breakpoint identification. The package is useful in areas such as trend analysis, time series decomposition, breakpoint identification and anomaly detection.
Maintained by Allen Sunny. Last updated 2 years ago.
decompositiontimeseries-analysis
2.5 match 1 stars 4.18 score 4 scripts
jomopo
BINCOR:Estimate the Correlation Between Two Irregular Time Series
Estimate the correlation between two irregular time series that are not necessarily sampled on identical time points. This program is also applicable to the situation of two evenly spaced time series that are not on the same time grid. 'BINCOR' is based on a novel estimation approach proposed by Mudelsee (2010, 2014) to estimate the correlation between two climate time series with different timescales. The idea is that autocorrelation (AR1 process) makes it possible to correlate values obtained on different time points. 'BINCOR' contains four functions: bin_cor() (the main function to build the binned time series), plot_ts() (to plot and compare the irregular and binned time series), cor_ts() (to estimate the correlation between the binned time series) and ccf_ts() (to estimate the cross-correlation between the binned time series).
Maintained by Josué M. Polanco-Martínez. Last updated 7 years ago.
6.8 match 1.48 score 5 scripts
ropensci
rsat:Dealing with Multiplatform Satellite Images
Downloading, customizing, and processing time series of satellite images for a region of interest. 'rsat' functions allow unified access to multispectral images from Landsat, MODIS and Sentinel repositories. 'rsat' also offers capabilities for customizing satellite images, such as tile mosaicking, image cropping and computation of new variables. Finally, 'rsat' covers the processing, including cloud masking, compositing and gap-filling/smoothing time series of images (Militino et al., 2018 <doi:10.3390/rs10030398> and Militino et al., 2019 <doi:10.1109/TGRS.2019.2904193>).
Maintained by Unai Pérez - Goya. Last updated 11 months ago.
1.3 match 54 stars 7.45 score 52 scripts
atsa-es
atsalibrary:Packages, data and scripts for ATSA course and lab book
This package will load the needed packages and data files for the ATSA course material when students install from GitHub.
Maintained by Elizabeth E. Holmes. Last updated 2 years ago.
3.5 match 4 stars 2.41 score 13 scripts
inbo
inlatools:Diagnostic Tools for INLA Models
Several functions which can be useful to choose sensible priors and diagnose the fitted model.
Maintained by Thierry Onkelinx. Last updated 5 months ago.
bayesian-statisticsgplv3inlamixed-modelsmodel-checkingmodel-validation
1.9 match 4 stars 4.41 score 43 scripts
txwri
adc:Calculate Antecedent Discharge Conditions
Calculates some antecedent discharge conditions useful in water quality modeling. Includes methods for calculating flow anomalies, base flow, and smooth discounted flows from daily flow measurements. Antecedent discharge algorithms are described and reviewed in Zhang and Ball (2017) <doi:10.1016/j.jhydrol.2016.12.052>.
Maintained by Michael Schramm. Last updated 2 years ago.
2.5 match 3 stars 3.18 score 3 scripts
shirintaheri
climetrics:Climate Change Metrics
A framework that facilitates spatio-temporal analysis of climate dynamics through exploring and measuring different dimensions of climate change in space and time.
Maintained by Shirin Taheri. Last updated 11 months ago.
2.0 match 14 stars 3.85 score 9 scripts
inbo
n2kupdate:Auxiliary Functions to Update the n2kresult Database
The functions are useful to store the results from https://github.com/inbo/n2kanalysis into a PostgreSQL database created with https://github.com/inbo/n2kresult.
Maintained by Thierry Onkelinx. Last updated 6 years ago.
4.3 match 1.70 score 1 scripts
cicarrascog
imputeREE:Impute Missing Rare Earth Element Data in Zircon
Set of functions to impute missing rare earth data, calculate La and Pr concentrations and Ce anomalies in zircons based on the Chondrite-Onuma and Chondrite-Lattice of Carrasco-Godoy and Campbell (2023) <doi:10.1007/s00410-023-02025-9> and the Logarithmic regression from Zhong et al. (2019) <doi:10.1007/s00710-019-00682-y>.
Maintained by Carlos Carrasco Godoy. Last updated 1 year ago.
2.3 match 3 stars 3.18 score 3 scripts
gabrielodom
mvMonitoring:Multi-State Adaptive Dynamic Principal Component Analysis for Multivariate Process Monitoring
Use multi-state splitting to apply Adaptive-Dynamic PCA (ADPCA) to data generated from a continuous-time multivariate industrial or natural process. Employ PCA-based dimension reduction to extract linear combinations of relevant features, reducing computational burdens. For a description of ADPCA, see <doi:10.1007/s00477-016-1246-2>, the 2016 paper from Kazor et al. The multi-state application of ADPCA is from a manuscript under current revision entitled "Multi-State Multivariate Statistical Process Control" by Odom, Newhart, Cath, and Hering, and is expected to appear in Q1 of 2018.
Maintained by Gabriel Odom. Last updated 1 year ago.
1.3 match 4 stars 5.24 score 29 scripts
inbo
n2kanalysis:Generic Functions to Analyse Data from the 'Natura 2000' Monitoring
All generic functions and classes for the analysis for the 'Natura 2000' monitoring. The classes contain all required data and definitions to fit the model without the need to access other sources. Potentially they might need access to one or more parent objects. An aggregation object might for example need the result of an imputation object. The actual definition of the analysis, using these generic functions and classes, is defined in dedicated analysis R packages for every monitoring scheme. For example 'abvanalysis' and 'watervogelanalysis'.
Maintained by Thierry Onkelinx. Last updated 2 months ago.
2.0 match 1 stars 3.18 score 7 scripts
jaydevine
pheble:Classifying High-Dimensional Phenotypes with Ensemble Learning
A system for binary and multi-class classification of high-dimensional phenotypic data using ensemble learning. By combining predictions from different classification models, this package attempts to improve performance over individual learners. The pre-processing, training, validation, and testing are performed end-to-end to minimize user input and simplify the process of classification.
Maintained by Jay Devine. Last updated 2 years ago.
2.3 match 2.70 score
cran
adamethods:Archetypoid Algorithms and Anomaly Detection
Collection of several algorithms to obtain archetypoids with small and large databases, and with both classical multivariate data and functional data (univariate and multivariate). Some of these algorithms can also detect anomalies (outliers). Please see Vinue and Epifanio (2020) <doi:10.1007/s11634-020-00412-9>.
Maintained by Guillermo Vinue. Last updated 5 years ago.
3.6 match 1.63 score 43 scripts
mfacevedol
renpow:Renewable Power Systems and the Environment
Supports calculations and visualization for renewable power systems and the environment. Analysis and graphical tools for DC and AC circuits and their use in electric power systems. Analysis and graphical tools for thermodynamic cycles and heat engines, supporting efficiency calculations in coal-fired power plants, gas-fired power plants. Calculations of carbon emissions and atmospheric CO2 dynamics. Analysis of power flow and demand for the grid, as well as power models for microgrids and off-grid systems. Provides resource and power generation for hydro power, wind power, and solar power.
Maintained by Miguel F. Acevedo. Last updated 7 years ago.
3.6 match 1.48 score 30 scripts
dyfanjones
sagemaker.mlframework:sagemaker machine learning developed by Amazon
`sagemaker` machine learning developed by Amazon.
Maintained by Dyfan Jones. Last updated 3 years ago.
amazon-sagemakerawsmachine-learningsagemakersdk
1.8 match 2.48 score 2 dependents
collincr-usgs
gravmagsubs:Gravitational and Magnetic Attraction of 3-D Vertical Rectangular Prisms
Computes the gravitational and magnetic anomalies generated by 3-D vertical rectangular prisms at specific observation points using the method of Plouff (1976) <doi:10.1190/1.1440645>.
Maintained by C. Cronkite-Ratcliff. Last updated 2 years ago.
1.6 match 2.70 score 4 scripts
talegari
solitude:An Implementation of Isolation Forest
Isolation forest is an anomaly detection method introduced in the paper "Isolation-Based Anomaly Detection" (Liu, Ting and Zhou <doi:10.1145/2133360.2133363>).
Maintained by Komala Sheshachala Srikanth. Last updated 4 years ago.
isolation-forestoutliersrpackages
0.8 match 24 stars 5.24 score 70 scripts 1 dependents
ropenspain
climaemet:Climate AEMET Tools
Tools to download the climatic data of the Spanish Meteorological Agency (AEMET) directly from R using their API and create scientific graphs (climate charts, trend analysis of climate time series, temperature and precipitation anomalies maps, warming stripes graphics, climatograms, etc.).
Maintained by Diego Hernangómez. Last updated 1 month ago.
aemetclimatedataforecast-apiropenspainsciencespainweather-api
0.5 match 43 stars 8.32 score 59 scripts
cumulocity-iot
pmml:Generate PMML for Various Models
The Predictive Model Markup Language (PMML) is an XML-based language which provides a way for applications to define machine learning, statistical and data mining models and to share models between PMML compliant applications. More information about the PMML industry standard and the Data Mining Group can be found at <http://dmg.org/>. The generated PMML can be imported into any PMML consuming application, such as Zementis Predictive Analytics products. The package isofor (used for anomaly detection) can be installed with devtools::install_github("gravesee/isofor").
Maintained by Dmitriy Bolotov. Last updated 3 years ago.
0.5 match 20 stars 7.98 score 560 scripts 1 dependents
zhaokg
Rbeast:Bayesian Change-Point Detection and Time Series Decomposition
Interpretation of time series data is affected by model choices. Different models can give different or even contradicting estimates of patterns, trends, and mechanisms for the same data--a limitation alleviated by the Bayesian estimator of abrupt change, seasonality, and trend (BEAST) of this package. BEAST seeks to improve time series decomposition by forgoing the "single-best-model" concept and embracing all competing models into the inference via a Bayesian model averaging scheme. It is a flexible tool to uncover abrupt changes (i.e., change-points), cyclic variations (e.g., seasonality), and nonlinear trends in time-series observations. BEAST not only tells when changes occur but also quantifies how likely the detected changes are to be true. It detects not just piecewise linear trends but also arbitrary nonlinear trends. BEAST is applicable to real-valued time series data of all kinds, be it for remote sensing, economics, climate sciences, ecology, or hydrology. Example applications include its use to identify regime shifts in ecological data, map forest disturbance and land degradation from satellite imagery, detect market trends in economic data, pinpoint anomalies and extreme events in climate data, and unravel system dynamics in biological data. Details on BEAST are reported in Zhao et al. (2019) <doi:10.1016/j.rse.2019.04.034>.
Maintained by Kaiguang Zhao. Last updated 6 months ago.
anomoly-detectionbayesian-time-seriesbreakpoint-detectionchangepoint-detectioninterrupted-time-seriesseasonality-analysisstructural-breakpointtechnical-analysistime-seriestime-series-decompositiontrendtrend-analysis
0.5 match 302 stars 7.63 score 89 scripts
bioc
PeacoQC:Peak-based selection of high quality cytometry data
This is a package that includes pre-processing and quality control functions that can remove margin events, compensate and transform the data and that will use PeacoQCSignalStability for quality control. This last function will first detect peaks in each channel of the flowframe. It will remove anomalies based on the IsolationTree function and the MAD outlier detection method. This package can be used for both flow- and mass cytometry data.
Maintained by Annelies Emmaneel. Last updated 5 months ago.
flowcytometryqualitycontrolpreprocessingpeakdetection
0.5 match 16 stars 7.38 score 28 scripts 3 dependents
rabarata
exdqlm:Extended Dynamic Quantile Linear Models
Routines for Bayesian estimation and analysis of dynamic quantile linear models utilizing the extended asymmetric Laplace error distribution, also known as extended dynamic quantile linear models (exDQLM) described in Barata et al (2020) <doi:10.1214/21-AOAS1497>.
Maintained by Raquel Barata. Last updated 2 years ago.
3.6 match 1 stars 1.00 score
cran
SpatialVx:Spatial Forecast Verification
Spatial forecast verification refers to verifying weather forecasts when the verification set (forecast and observations) is on a spatial field, usually a high-resolution gridded spatial field. Most of the functions here require the forecast and observed fields to be gridded and on the same grid. For a thorough review of most of the methods in this package, please see Gilleland et al. (2009) <doi:10.1175/2009WAF2222269.1> and for a tutorial on some of the main functions available here, see Gilleland (2022) <doi:10.5065/4px3-5a05>.
Maintained by Eric Gilleland. Last updated 4 months ago.
1.9 match 1 stars 1.83 score 68 scripts
adunaic
forecastLSW:Forecasting Routines for Locally Stationary Wavelet Processes
Implementation to perform forecasting of locally stationary wavelet processes by examining the local second order structure of the time series.
Maintained by Rebecca Killick. Last updated 2 years ago.
3.3 match 1.00 score 3 scripts
hoxo-m
densratio:Density Ratio Estimation
Density ratio estimation. The estimated density ratio function can be used in many applications such as anomaly detection, change-point detection, and covariate shift adaptation. The implemented methods are uLSIF (Hido et al. (2011) <doi:10.1007/s10115-010-0283-2>), RuLSIF (Yamada et al. (2011) <doi:10.1162/NECO_a_00442>), and KLIEP (Sugiyama et al. (2007) <doi:10.1007/s10463-008-0197-x>).
Maintained by Koji Makiyama. Last updated 6 years ago.
anomalydetectionmachine-learningmachine-learning-algorithmsmachine-learning-libraryr-languagestatistics
0.5 match 21 stars 6.36 score 36 scripts 2 dependents
bioc
flowAI:Automatic and interactive quality control for flow cytometry data
The package is able to perform an automatic or interactive quality control on FCS data acquired using flow cytometry instruments. By evaluating three different properties: 1) flow rate, 2) signal acquisition, 3) dynamic range, the quality control enables the detection and removal of anomalies.
Maintained by Gianni Monaco. Last updated 5 months ago.
flowcytometryqualitycontrolbiomedicalinformaticsimmunooncology
0.5 match 5.67 score 86 scripts 3 dependents
mayer79
outForest:Multivariate Outlier Detection and Replacement
Provides a random forest based implementation of the method described in Chapter 7.1.2 (Regression model based anomaly detection) of Chandola et al. (2009) <doi:10.1145/1541880.1541882>. It works as follows: Each numeric variable is regressed onto all other variables by a random forest. If the scaled absolute difference between observed value and out-of-bag prediction of the corresponding random forest is suspiciously large, then a value is considered an outlier. The package offers different options to replace such outliers, e.g. by realistic values found via predictive mean matching. Once the method is trained on reference data, it can be applied to new data.
Maintained by Michael Mayer. Last updated 8 months ago.
machine-learningoutlieroutlier-analysisoutlier-detectionrandom-forest
0.5 match 13 stars 5.39 score 19 scripts
sevvandi
outlierensembles:A Collection of Outlier Ensemble Algorithms
Ensemble functions for outlier/anomaly detection. There is a new ensemble method proposed using Item Response Theory. Existing outlier ensemble methods from Schubert et al (2012) <doi:10.1137/1.9781611972825.90>, Chiang et al (2017) <doi:10.1016/j.jal.2016.12.002> and Aggarwal and Sathe (2015) <doi:10.1145/2830544.2830549> are also included.
Maintained by Sevvandi Kandanaarachchi. Last updated 23 days ago.
0.5 match 3 stars 4.18 score 9 scripts
ainsuotain
pasadr:An Implementation of Process-Aware Stealthy Attack Detection (PASAD)
Anomaly detection method based on the paper "Truth will out: Departure-based process-level detection of stealthy attacks on control systems" from Wissam Aoudi, Mikel Iturbe, and Magnus Almgren (2018) <DOI:10.1145/3243734.3243781>. The following implementation was also consulted: <https://github.com/rahulrajpl/PyPASAD>.
Maintained by Donghwan Kim. Last updated 4 years ago.
0.5 match 3 stars 3.18 score
kumes
seasonalityPlot:Seasonality Variation Plots of Stock Prices and Cryptocurrencies
The price action at any given time is determined by investor sentiment and market conditions. Although there is no established principle, over a long period of time, things often move with a certain periodicity. This is sometimes referred to as an anomaly. The seasonPlot() function in this package calculates and visualizes the average value of price movements over a year for any given period. In addition, the monthly increase or decrease in price movement is represented with a colored background. This seasonPlot() function can use the same symbols as the 'quantmod' package (e.g. ^IXIC, ^DJI, SPY, BTC-USD, and ETH-USD).
Maintained by Satoshi Kume. Last updated 6 months ago.
0.5 match 1 stars 3.00 score 6 scripts
myaseen208
PakNAcc:'shiny' App for National Accounts
Provides a comprehensive suite of tools for analyzing Pakistan's Quarterly National Accounts data. Users can gain detailed insights into Pakistan's economic performance, visualize quarterly trends, and detect patterns and anomalies in key economic indicators. Compare sector contributions—including agriculture, industry, and services—to understand their influence on economic growth or decline. Customize analyses by filtering and manipulating data to focus on specific areas of interest. Ideal for policymakers, researchers, and analysts aiming to make informed, data-driven decisions based on timely and detailed economic insights.
Maintained by Muhammad Yaseen. Last updated 4 months ago.
0.5 match 3.00 score
pigian
snap:Simple Neural Application
A simple wrapper to easily design vanilla deep neural networks using 'Tensorflow'/'Keras' backend for regression, classification and multi-label tasks, with some tweaks and tricks (skip shortcuts, embedding, feature selection and anomaly detection).
Maintained by Giancarlo Vercellino. Last updated 4 years ago.
0.5 match 2.00 score
cran
adplots:Ad-Plot and Ud-Plot for Visualizing Distributional Properties and Normality
The empirical cumulative average deviation function introduced by the author is utilized to develop both Ad- and Ud-plots. The Ad-plot can identify symmetry, skewness, and outliers of the data distribution, including anomalies. The Ud-plot, created by slightly modifying the Ad-plot, is exceptional in assessing normality, outperforming the normal QQ-plot, the normal PP-plot, and their derivations. The d-value, which quantifies the degree of proximity between the Ud-plot and the graph of the estimated normal density function, helps guide decisions about confirming normality. A full description of this methodology can be found in the article by Wijesuriya (2025) <doi:10.1080/03610926.2024.2440583>.
Maintained by Uditha Amarananda Wijesuriya. Last updated 2 months ago.
0.5 match 2.00 score
marchionnilab
lpcover:LPCover: Functionality for integer programming methods for covering
Integer programming functionality for different 'covering' optimizations as presented in Ke et al, "Efficient Representations of Tumor Diversity with Paired DNA-RNA Anomalies".
Maintained by Wikum Dinalankara. Last updated 4 years ago.
0.5 match 1.70 score 2 scripts
cran
kmodR:K-Means with Simultaneous Outlier Detection
An implementation of the 'k-means--' algorithm proposed by Chawla and Gionis, 2013 in their paper, "k-means-- : A unified approach to clustering and outlier detection. SIAM International Conference on Data Mining (SDM13)", <doi:10.1137/1.9781611972832.21> and using 'ordering' described by Howe, 2013 in the thesis, "Clustering and anomaly detection in tropical cyclones". Useful for creating (potentially) tighter clusters than standard k-means and simultaneously finding outliers inexpensively in multidimensional space.
Maintained by David Charles Howe. Last updated 3 years ago.
0.5 match 1.70 score
cran
kuiper.2samp:Two-Sample Kuiper Test
This function performs the two-sample Kuiper test to assess the anomaly of continuous, one-dimensional probability distributions. References used for this method are (1). Kuiper, N. H. (1960). <DOI:10.1016/S1385-7258(60)50006-0> and (2). Paltani, S. (2004). <DOI:10.1051/0004-6361:20034220>.
Maintained by Ying Ruan. Last updated 6 years ago.
0.5 match 1.00 score