R-universe search: anomaly

business-science

anomalize:Tidy Anomaly Detection

The 'anomalize' package enables a "tidy" workflow for detecting anomalies in data. The main functions are time_decompose(), anomalize(), and time_recompose(). When combined, it's quite simple to decompose time series, detect anomalies, and create bands separating the "normal" data from the anomalous data at scale (i.e. for multiple time series). Time series decomposition is used to remove trend and seasonal components via the time_decompose() function and methods include seasonal decomposition of time series by Loess ("stl") and seasonal decomposition by piecewise medians ("twitter"). The anomalize() function implements two methods for anomaly detection of residuals: the interquartile range ("iqr") and generalized extreme studentized deviation ("gesd"). These methods are based on those used in the 'forecast' package and the Twitter 'AnomalyDetection' package. Refer to the associated functions for specific references for these methods.

Maintained by Matt Dancho. Last updated 1 year ago.

anomaly anomaly-detection decomposition detect-anomalies iqr time-series

50.5 match 339 stars 9.56 score 332 scripts

cefet-rj-dal

harbinger:A Unified Time Series Event Detection Framework

By analyzing time series, it is possible to observe significant changes in the behavior of observations that frequently characterize events. Events present themselves as anomalies, change points, or motifs. In the literature, there are several methods for detecting events. However, searching for a suitable time series method is a complex task, especially considering that the nature of events is often unknown. This work presents Harbinger, a framework for integrating and analyzing event detection methods. Harbinger contains several state-of-the-art methods described in Salles et al. (2020) <doi:10.5753/sbbd.2020.13626>.

Maintained by Eduardo Ogasawara. Last updated 3 months ago.

33.7 match 18 stars 8.32 score 216 scripts

waternumbers

anomalous:Anomaly Detection using the CAPA and PELT Algorithms

Implementations of the univariate CAPA <doi:10.1002/sam.11586> and PELT <doi:10.1080/01621459.2012.737745> algorithms along with various cost functions for different distributions and models. The modular design, using R6 classes, favours ease of extension (for example user-written cost functions) over the performance of other implementations (e.g. <doi:10.32614/CRAN.package.changepoint>, <doi:10.32614/CRAN.package.anomaly>).

Maintained by Paul Smith. Last updated 3 months ago.

cpp

52.3 match 4.61 score 18 scripts

business-science

timetk:A Tool Kit for Working with Time Series

Easy visualization, wrangling, and feature engineering of time series data for forecasting and machine learning prediction. Consolidates and extends time series functionality from packages including 'dplyr', 'stats', 'xts', 'forecast', 'slider', 'padr', 'recipes', and 'rsample'.

Maintained by Matt Dancho. Last updated 1 year ago.

coercion coercion-functions data-mining dplyr forecast forecasting forecasting-models machine-learning series-decomposition series-signature tibble tidy tidyquant tidyverse time time-series timeseries

16.0 match 625 stars 14.15 score 4.0k scripts 16 dependents

dankelley

oce:Analysis of Oceanographic Data

Supports the analysis of Oceanographic data, including 'ADCP' measurements, measurements made with 'argo' floats, 'CTD' measurements, sectional data, sea-level time series, coastline and topographic data, etc. Provides specialized functions for calculating seawater properties such as potential temperature in either the 'UNESCO' or 'TEOS-10' equation of state. Produces graphical displays that conform to the conventions of the Oceanographic literature. This package is discussed extensively by Kelley (2018) "Oceanographic Analysis with R" <doi:10.1007/978-1-4939-8844-0>.

Maintained by Dan Kelley. Last updated 1 day ago.

oceanography fortran cpp

14.2 match 146 stars 15.42 score 4.2k scripts 18 dependents

teos-10

gsw:Gibbs Sea Water Functions

Provides an interface to the Gibbs 'SeaWater' ('TEOS-10') C library, version 3.06-16-0 (commit '657216dd4f5ea079b5f0e021a4163e2d26893371', dated 2022-10-11, available at <https://github.com/TEOS-10/GSW-C>), which stems from 'Matlab' and other code written by members of Working Group 127 of 'SCOR'/'IAPSO' (Scientific Committee on Oceanic Research / International Association for the Physical Sciences of the Oceans).

Maintained by Dan Kelley. Last updated 8 days ago.

gibbs oceanography seawater teos-10

19.5 match 8 stars 8.53 score 286 scripts 19 dependents

bioc

scDiagnostics:Cell type annotation diagnostics

The scDiagnostics package provides diagnostic plots to assess the quality of cell type assignments from single cell gene expression profiles. The implemented functionality lets users assess the reliability of cell type annotations, investigate gene expression patterns, and explore relationships between different cell types in query and reference datasets, so that potential misalignments between reference and query datasets can be detected. The package also provides visualization capabilities for diagnostic purposes.

Maintained by Anthony Christidis. Last updated 5 months ago.

annotation classification clustering geneexpression rnaseq singlecell software transcriptomics

14.0 match 8 stars 7.77 score 46 scripts

david-cortes

isotree:Isolation-Based Outlier Detection

Fast and multi-threaded implementation of isolation forest (Liu, Ting, Zhou (2008) <doi:10.1109/ICDM.2008.17>), extended isolation forest (Hariri, Kind, Brunner (2018) <doi:10.48550/arXiv.1811.02141>), SCiForest (Liu, Ting, Zhou (2010) <doi:10.1007/978-3-642-15883-4_18>), fair-cut forest (Cortes (2021) <doi:10.48550/arXiv.2110.13402>), robust random-cut forest (Guha, Mishra, Roy, Schrijvers (2016) <http://proceedings.mlr.press/v48/guha16.html>), and customizable variations of them, for isolation-based outlier detection, clustered outlier detection, distance or similarity approximation (Cortes (2019) <doi:10.48550/arXiv.1910.12362>), isolation kernel calculation (Ting, Zhu, Zhou (2018) <doi:10.1145/3219819.3219990>), and imputation of missing values (Cortes (2019) <doi:10.48550/arXiv.1911.06646>), based on random or guided decision tree splitting, and providing different metrics for scoring anomalies based on isolation depth or density (Cortes (2021) <doi:10.48550/arXiv.2111.11639>). Provides simple heuristics for fitting the model to categorical columns and handling missing data, and offers options for varying between random and guided splits, and for using different splitting criteria.

Maintained by David Cortes. Last updated 15 days ago.

anomaly-detection imputation isolation-forest outlier-detection cpp openmp

8.0 match 203 stars 10.41 score 115 scripts 6 dependents

cmsaf

cmsafops:Tools for CM SAF NetCDF Data

The Satellite Application Facility on Climate Monitoring (CM SAF) is a ground segment of the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT) and one of EUMETSAT's Satellite Application Facilities. The CM SAF contributes to the sustainable monitoring of the climate system by providing essential climate variables related to the energy and water cycle of the atmosphere (<https://www.cmsaf.eu>). It is a joint cooperation of eight National Meteorological and Hydrological Services. The 'cmsafops' R-package provides a collection of R-operators for the analysis and manipulation of CM SAF NetCDF formatted data. Other CF-conforming NetCDF data with time, longitude and latitude dimensions should also work, but error-free application is not guaranteed. CM SAF climate data records are provided for free via (<https://wui.cmsaf.eu/safira>). Detailed information and test data are provided on the CM SAF webpage (<http://www.cmsaf.eu/R_toolbox>).

Maintained by Steffen Kothe. Last updated 6 months ago.

14.8 match 2 stars 5.03 score 4 scripts 2 dependents

eliocamp

metR:Tools for Easier Analysis of Meteorological Fields

Many useful functions and extensions for dealing with meteorological data in the tidy data framework. Extends 'ggplot2' for better plotting of scalar and vector fields and provides commonly used analysis methods in the atmospheric sciences.

Maintained by Elio Campitelli. Last updated 21 days ago.

atmospheric-science ggplot2 visualization

6.0 match 144 stars 12.19 score 1000 scripts 22 dependents

cran

anomaly:Detecting Anomalies in Data

Implements Collective And Point Anomaly (CAPA) Fisch, Eckley, and Fearnhead (2022) <doi:10.1002/sam.11586>, Multi-Variate Collective And Point Anomaly (MVCAPA) Fisch, Eckley, and Fearnhead (2021) <doi:10.1080/10618600.2021.1987257>, Proportion Adaptive Segment Selection (PASS) Jeng, Cai, and Li (2012) <doi:10.1093/biomet/ass059>, and Bayesian Abnormal Region Detector (BARD) Bardwell and Fearnhead (2015) <doi:10.1214/16-BA998>. These methods are for the detection of anomalies in time series data. Further information regarding the use of this package along with detailed examples can be found in Fisch, Grose, Eckley, Fearnhead, and Bardwell (2024) <doi:10.18637/jss.v110.i01>.

Maintained by Daniel Grose. Last updated 7 months ago.

cpp

63.5 match 1 star 1.00 score

nixtla

nixtlar:A Software Development Kit for 'Nixtla''s 'TimeGPT'

A Software Development Kit for working with 'Nixtla''s 'TimeGPT', a foundation model for time series forecasting. 'API' is an acronym for 'application programming interface'; this package allows users to interact with 'TimeGPT' via the 'API'. You can set and validate 'API' keys and generate forecasts via 'API' calls. It is compatible with 'tsibble' and base R. For more details visit <https://docs.nixtla.io/>.

Maintained by Mariana Menchero. Last updated 28 days ago.

7.7 match 30 stars 8.16 score 38 scripts

bflammers

ANN2:Artificial Neural Networks for Anomaly Detection

Training of neural networks for classification and regression tasks using mini-batch gradient descent. Special features include a function for training autoencoders, which can be used to detect anomalies, and some related plotting functions. Multiple activation functions are supported, including tanh, relu, step and ramp. For the use of the step and ramp activation functions in detecting anomalies using autoencoders, see Hawkins et al. (2002) <doi:10.1007/3-540-46145-0_17>. Furthermore, several loss functions are supported, including robust ones such as Huber and pseudo-Huber loss, as well as L1 and L2 regularization. The possible options for optimization algorithms are RMSprop, Adam and SGD with momentum. The package contains a vectorized C++ implementation that facilitates fast training through mini-batch learning.

Maintained by Bart Lammers. Last updated 4 years ago.

anomaly-detection artificial-neural-networks autoencoders neural-networks robust-statistics openblas cpp openmp

11.3 match 13 stars 5.59 score 60 scripts

robjhyndman

weird:Functions and Data Sets for "That's Weird: Anomaly Detection Using R" by Rob J Hyndman

All functions and data sets required for the examples in the book Hyndman (2024) "That's Weird: Anomaly Detection Using R" <https://OTexts.com/weird/>. All packages needed to run the examples are also loaded.

Maintained by Rob Hyndman. Last updated 3 months ago.

10.8 match 15 stars 5.71 score 18 scripts

pridiltal

stray:Anomaly Detection in High Dimensional and Temporal Data

This is a modification of the 'HDoutliers' package. The 'HDoutliers' algorithm is a powerful unsupervised algorithm for detecting anomalies in high-dimensional data, with a strong theoretical foundation. However, it suffers from some limitations that significantly hinder its performance under certain circumstances. This package implements the algorithm proposed in Talagala, Hyndman and Smith-Miles (2019) <arXiv:1908.04000> for detecting anomalies in high-dimensional data, which addresses these limitations of the 'HDoutliers' algorithm. We define an anomaly as an observation that deviates markedly from the majority with a large distance gap. An approach based on extreme value theory is used for the anomalous threshold calculation.

Maintained by Priyanga Dilini Talagala. Last updated 1 year ago.

stray

10.9 match 58 stars 5.47 score 34 scripts 1 dependent

dbolotov

amelie:Anomaly Detection with Normal Probability Functions

Implements anomaly detection as binary classification for cross-sectional data. Uses maximum likelihood estimates and normal probability functions to classify observations as anomalous. The method is presented in the following lecture from the Machine Learning course by Andrew Ng: <https://www.coursera.org/learn/machine-learning/lecture/C8IJp/algorithm/>, and is also described in: Aleksandar Lazarevic, Levent Ertoz, Vipin Kumar, Aysel Ozgur, Jaideep Srivastava (2003) <doi:10.1137/1.9781611972733.3>.

Maintained by Dmitriy Bolotov. Last updated 6 years ago.

anomaly-detection

15.8 match 3.70 score 6 scripts

cran

CSTools:Assessing Skill of Climate Forecasts on Seasonal-to-Decadal Timescales

Exploits dynamical seasonal forecasts in order to provide information relevant to stakeholders at the seasonal timescale. The package contains process-based methods for forecast calibration, bias correction, statistical and stochastic downscaling, optimal forecast combination and multivariate verification, as well as basic and advanced tools to obtain tailored products. This package was developed in the context of the 'ERA4CS' project 'MEDSCOPE' and the 'H2020 S2S4E' project and includes contributions from the 'ArticXchange' project funded by 'EU-PolarNet 2'. 'Pérez-Zanón et al. (2022) <doi:10.5194/gmd-15-6115-2022>'. 'Doblas-Reyes et al. (2005) <doi:10.1111/j.1600-0870.2005.00104.x>'. 'Mishra et al. (2018) <doi:10.1007/s00382-018-4404-z>'. 'Sanchez-Garcia et al. (2019) <doi:10.5194/asr-16-165-2019>'. 'Straus et al. (2007) <doi:10.1175/JCLI4070.1>'. 'Terzago et al. (2018) <doi:10.5194/nhess-18-2825-2018>'. 'Torralba et al. (2017) <doi:10.1175/JAMC-D-16-0204.1>'. 'D'Onofrio et al. (2014) <doi:10.1175/JHM-D-13-096.1>'. 'Verfaillie et al. (2017) <doi:10.5194/gmd-10-4257-2017>'. 'Van Schaeybroeck et al. (2019) <doi:10.1016/B978-0-12-812372-0.00010-8>'. 'Yiou et al. (2013) <doi:10.1007/s00382-012-1626-3>'.

Maintained by Victoria Agudetse. Last updated 1 year ago.

fortran

10.7 match 2 stars 5.32 score 62 scripts 1 dependent

david-cortes

outliertree:Explainable Outlier Detection Through Decision Tree Conditioning

Outlier detection method that flags suspicious values within observations, contrasting them against the normal values in a user-readable format, potentially describing conditions within the data that make a given outlier more rare. Full procedure is described in Cortes (2020) <doi:10.48550/arXiv.2001.00636>. Loosely based on the 'GritBot' <https://www.rulequest.com/gritbot-info.html> software.

Maintained by David Cortes. Last updated 2 months ago.

anomaly-detection outlier-detection cpp openmp

7.5 match 58 stars 7.34 score 21 scripts 2 dependents

dicook

mulgar:Functions for Pre-Processing Data for Multivariate Data Visualisation using Tours

This is a companion to the book Cook, D. and Laa, U. (2023) <https://dicook.github.io/mulgar_book/> "Interactively exploring high-dimensional data and models in R". It contains useful functions for processing data in preparation for visualising with a tour. There are also several sample data sets.

Maintained by Dianne Cook. Last updated 2 months ago.

12.0 match 4 stars 4.50 score 79 scripts

ggobi

tourr:Tour Methods for Multivariate Data Visualisation

Implements geodesic interpolation and basis generation functions that allow you to create new tour methods from R.

Maintained by Dianne Cook. Last updated 17 days ago.

4.1 match 65 stars 11.17 score 426 scripts 9 dependents

haydarde

dLagM:Time Series Regression Models with Distributed Lag Models

Provides time series regression models with one predictor using finite distributed lag models, polynomial (Almon) distributed lag models, geometric distributed lag models with Koyck transformation, and autoregressive distributed lag models. It also consists of functions for computation of h-step ahead forecasts from these models. See Demirhan (2020)(<doi:10.1371/journal.pone.0228812>) and Baltagi (2011)(<doi:10.1007/978-3-642-20059-5>) for more information.

Maintained by Haydar Demirhan. Last updated 1 year ago.

13.3 match 2 stars 3.18 score 127 scripts

cvxgrp

CVXR:Disciplined Convex Optimization

An object-oriented modeling language for disciplined convex programming (DCP) as described in Fu, Narasimhan, and Boyd (2020, <doi:10.18637/jss.v094.i14>). It allows the user to formulate convex optimization problems in a natural way following mathematical convention and DCP rules. The system analyzes the problem, verifies its convexity, converts it into a canonical form, and hands it off to an appropriate solver to obtain the solution. Interfaces to solvers on CRAN and elsewhere are provided, both commercial and open source.

Maintained by Anqi Fu. Last updated 4 months ago.

cpp

3.2 match 207 stars 12.89 score 768 scripts 51 dependents

lucasvenez

precintcon:Precipitation Intensity, Concentration and Anomaly Analysis

It contains functions to analyze the precipitation intensity, concentration and anomaly.

Maintained by Lucas Venezian Povoa. Last updated 9 years ago.

9.2 match 10 stars 4.28 score 38 scripts

bioc

GWASTools:Tools for Genome Wide Association Studies

Classes for storing very large GWAS data sets and annotation, and functions for GWAS data cleaning and analysis.

Maintained by Stephanie M. Gogarten. Last updated 5 months ago.

snp geneticvariability qualitycontrol microarray

3.6 match 17 stars 10.50 score 396 scripts 5 dependents

emilio-berti

GHCNr:Download Weather Station Data from GHCNd

The goal of 'GHCNr' is to provide a fast and friendly interface with the Global Historical Climatology Network daily (GHCNd) database, which contains daily summaries of weather station data worldwide (<https://www.ncei.noaa.gov/products/land-based-station/global-historical-climatology-network-daily>). GHCNd is accessed through the web API <https://www.ncei.noaa.gov/access/services/data/v1>. The main functions of 'GHCNr' are downloading data from GHCNd, filtering it, and aggregating it at monthly and annual scales.

Maintained by Emilio Berti. Last updated 2 months ago.

7.5 match 2 stars 4.95 score 3 scripts

vitomuggeo

segmented:Regression Models with Break-Points / Change-Points Estimation (with Possibly Random Effects)

Fitting regression models where, in addition to possible linear terms, one or more covariates have segmented (i.e., broken-line or piece-wise linear) or stepmented (i.e. piece-wise constant) effects. Multiple breakpoints for the same variable are allowed. The estimation method is discussed in Muggeo (2003, <doi:10.1002/sim.1545>) and illustrated in Muggeo (2008, <https://www.r-project.org/doc/Rnews/Rnews_2008-1.pdf>). An approach for hypothesis testing is presented in Muggeo (2016, <doi:10.1080/00949655.2016.1149855>), and interval estimation for the breakpoint is discussed in Muggeo (2017, <doi:10.1111/anzs.12200>). Segmented mixed models, i.e. random effects in the change point, are discussed in Muggeo (2014, <doi:10.1177/1471082X13504721>). Estimation of piecewise-constant relationships and changepoints (mean-shift models) is discussed in Fasola et al. (2018, <doi:10.1007/s00180-017-0740-4>).

Maintained by Vito M. R. Muggeo. Last updated 16 days ago.

3.6 match 9 stars 10.03 score 1.2k scripts 203 dependents

cran

datarobot:'DataRobot' Predictive Modeling API

For working with the 'DataRobot' predictive modeling platform's API <https://www.datarobot.com/>.

Maintained by AJ Alon. Last updated 1 year ago.

10.3 match 2 stars 3.48 score

tomasfryda

h2o:R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Maintained by Tomas Fryda. Last updated 1 year ago.

4.0 match 3 stars 8.20 score 7.8k scripts 11 dependents

al-obrien

spectralAnomaly:Detect Anomalies Using the Spectral Residual Algorithm

Apply the spectral residual algorithm to data, such as a time series, to detect anomalies. Anomaly scores can be used to determine outliers based upon a threshold or fed into more sophisticated prediction models. Methods are based upon "Time-Series Anomaly Detection Service at Microsoft", Ren, H., Xu, B., Wang, Y., et al., (2019) <doi:10.48550/arXiv.1906.03821>.

Maintained by Allen OBrien. Last updated 7 days ago.

9.3 match 2 stars 3.48 score 3 scripts

zcebeci

odetector:Outlier Detection Using Partitioning Clustering Algorithms

An object is called an "outlier" if it deviates remarkably from the other objects in a data set. Outlier detection is the process of finding outliers using methods based on distance measures, clustering and spatial methods (Ben-Gal, 2005 <ISBN 0-387-24435-2>). It is an intensively studied research topic for the identification of novelties, frauds, anomalies, deviations or exceptions, in addition to its use for outlier removal in data processing. This package provides implementations of some novel approaches to detect outliers based on typicality degrees that are obtained with soft partitioning clustering algorithms such as Fuzzy C-means and its variants.

Maintained by Zeynel Cebeci. Last updated 2 years ago.

anomaly-detection cluster-analysis clustering clustering-methods data datapreparation datapreprocessing exception-handling fcm fraud-detection fuzzy-clustering novelty-detection outlier-detection outlier-removal outliers partitioning pcm surprise-exploration

8.0 match 3.70 score 4 scripts

mlverse

torch:Tensors and Neural Networks with 'GPU' Acceleration

Provides functionality to define and train neural networks similar to 'PyTorch' by Paszke et al (2019) <doi:10.48550/arXiv.1912.01703> but written entirely in R using the 'libtorch' library. Also supports low-level tensor operations and 'GPU' acceleration.

Maintained by Daniel Falbel. Last updated 6 days ago.

autograd deep-learning torch cpp

1.7 match 520 stars 16.52 score 1.4k scripts 38 dependents

cran

s2dv:A Set of Common Tools for Seasonal to Decadal Verification

The advanced version of package 's2dverification'. It is intended for 'seasonal to decadal' (s2d) climate forecast verification, but it can also be used in other kinds of forecasts or general climate analysis. This package is specially designed for the comparison between the experimental and observational datasets. The included functions range from data retrieval and data post-processing to skill scores against observations and visualization. Compared to 's2dverification', 's2dv' is more compatible with the package 'startR', able to use multiple cores for computation and handle multi-dimensional arrays with greater flexibility. The CDO version used in development is 1.9.8.

Maintained by Ariadna Batalla. Last updated 5 months ago.

14.2 match 1.95 score 3 dependents

sicarul

xray:X Ray Vision on your Datasets

Tools to analyze datasets prior to any statistical modeling. Has various functions designed to find inconsistencies and understand the distribution of the data.

Maintained by Pablo Seibelt. Last updated 7 years ago.

4.9 match 75 stars 5.60 score 35 scripts

krahim

multitaper:Spectral Analysis Tools using the Multitaper Method

Implements multitaper spectral analysis using discrete prolate spheroidal sequences (Slepians) and sine tapers. It includes an adaptive weighted multitaper spectral estimate, a coherence estimate, Thomson's Harmonic F-test, and complex demodulation. The Slepian sequences are generated efficiently using a tridiagonal matrix solution, and jackknifed confidence intervals are available for most estimates. This package is an implementation of the method described in D.J. Thomson (1982) "Spectrum estimation and harmonic analysis" <doi:10.1109/PROC.1982.12433>.

Maintained by Karim Rahim. Last updated 8 months ago.

fortran openblas

3.4 match 10 stars 7.62 score 67 scripts 26 dependents

cran

ClimProjDiags:Set of Tools to Compute Various Climate Indices

Set of tools to compute metrics and indices for climate analysis. The package provides functions to compute extreme indices, evaluate the agreement between models and combine these models into an ensemble. Multi-model time series of climate indices can be computed either after averaging the 2-D fields from different models provided they share a common grid or by combining time series computed on the model native grid. Indices can be assigned weights and/or combined to construct new indices.

Maintained by Victòria Agudetse. Last updated 1 year ago.

5.1 match 5.14 score 58 scripts 4 dependents

sevvandi

oddnet:Anomaly Detection in Temporal Networks

Anomaly detection in dynamic, temporal networks. The package 'oddnet' uses a feature-based method to identify anomalies. First, it computes many features for each network. Then it models the features using time series methods. Using time series residuals it detects anomalies. This way, the temporal dependencies are accounted for when identifying anomalies (Kandanaarachchi, Hyndman 2022) <arXiv:2210.07407>.

Maintained by Sevvandi Kandanaarachchi. Last updated 10 months ago.

6.2 match 3 stars 4.22 score 11 scripts

hoxo-m

sGMRFmix:Sparse Gaussian Markov Random Field Mixtures for Anomaly Detection

An implementation of sparse Gaussian Markov random field mixtures presented by Ide et al. (2016) <doi:10.1109/ICDM.2016.0119>. It provides a novel anomaly detection method for multivariate noisy sensor data. It can automatically handle multiple operational modes. And it can also compute variable-wise anomaly scores.

Maintained by Koji Makiyama. Last updated 7 years ago.

10.9 match 1 star 2.34 score 22 scripts

drwolf85

HRTnomaly:Historical, Relational, and Tail Anomaly-Detection Algorithms

The presence of outliers in a dataset can substantially bias the results of statistical analyses. To correct for outliers, micro edits are manually performed on all records. A set of constraints and decision rules is typically used to aid the editing process. However, straightforward decision rules might overlook anomalies arising from disruption of linear relationships. Computationally efficient methods are provided to identify historical, tail, and relational anomalies at the data-entry level (Sartore et al., 2024; <doi:10.6339/24-JDS1136>). A score statistic is developed for each anomaly type, using a distribution-free approach motivated by the Bienaymé-Chebyshev inequality, and fuzzy logic is used to detect cellwise outliers resulting from different types of anomalies. Each data entry is individually scored and individual scores are combined into a final score to determine anomalous entries. In contrast to fuzzy logic, Bayesian bootstrap and a Bayesian test based on empirical likelihoods are also provided as studied by Sartore et al. (2024; <doi:10.3390/stats7040073>). These algorithms allow for a more nuanced approach to outlier detection, as they can identify outliers at the data-entry level which are not obviously distinct from the rest of the data. --- This research was supported in part by the U.S. Department of Agriculture, National Agriculture Statistics Service. The findings and conclusions in this publication are those of the authors and should not be construed to represent any official USDA, or US Government determination or policy.

Maintained by Luca Sartore. Last updated 18 days ago.

openmp

12.5 match 2.00 score

ecor

RGENERATE:Tools to Generate Vector Time Series

A method 'generate()' is implemented in this package for the random generation of vector time series according to models obtained by 'RMAWGEN', 'vars' or other packages. This package was created to generalize the algorithms of the 'RMAWGEN' package for the analysis and generation of any environmental vector time series.

Maintained by Emanuele Cordano. Last updated 7 months ago.

5.6 match 1 star 4.38 score 16 scripts 1 dependent

promerpr

scanstatistics:Space-Time Anomaly Detection using Scan Statistics

Detection of anomalous space-time clusters using the scan statistics methodology. Focuses on prospective surveillance of data streams, scanning for clusters with ongoing anomalies. Hypothesis testing is made possible by Monte Carlo simulation. Allévius (2018) <doi:10.21105/joss.00515>.

Maintained by Paul Romer Present. Last updated 2 years ago.

cpp

5.1 match 1 star 4.81 score 43 scripts

wch

gcookbook:Data for "R Graphics Cookbook"

Data sets used in the book "R Graphics Cookbook" by Winston Chang, published by O'Reilly Media.

Maintained by Winston Chang. Last updated 6 years ago.

3.4 match 10 stars 6.77 score 1.3k scripts 1 dependent

andrewzm

EFDR:Wavelet-Based Enhanced FDR for Detecting Signals from Complete or Incomplete Spatially Aggregated Data

Enhanced False Discovery Rate (EFDR) is a tool to detect anomalies in an image. The image is first transformed into the wavelet domain in order to decorrelate any noise components, following which the coefficients at each resolution are standardised. Statistical tests (in a multiple hypothesis testing setting) are then carried out to find the anomalies. The power of EFDR exceeds that of standard FDR, which would carry out tests on every wavelet coefficient: EFDR chooses which wavelets to test based on a criterion described in Shen et al. (2002). The package also provides elementary tools to interpolate spatially irregular data onto a grid of the required size. The work is based on Shen, X., Huang, H.-C., and Cressie, N. 'Nonparametric hypothesis testing for a spatial signal.' Journal of the American Statistical Association 97.460 (2002): 1122-1140.

Maintained by Andrew Zammit-Mangion. Last updated 2 years ago.

4.4 match 5 stars 4.74 score 22 scripts

sbshah10

discharge:Fourier Analysis of Discharge Data

Computes discrete fast Fourier transform of river discharge data and the derived metrics. The methods are described in J. L. Sabo, D. M. Post (2008) <doi:10.1890/06-1340.1> and J. L. Sabo, A. Ruhi, G. W. Holtgrieve, V. Elliott, M. E. Arias, P. B. Ngor, T. A. Räsänsen, S. Nam (2017) <doi:10.1126/science.aao1053>.

Maintained by Samarth Shah. Last updated 6 years ago.

12.5 match 1.56 score 36 scripts

jhmaindonald

gamclass:Functions and Data for a Course on Modern Regression and Classification

Functions and data are provided that support a course that emphasizes statistical issues of inference and generalizability. The functions are designed to make it straightforward to illustrate the use of cross-validation, the training/test approach, simulation, and model-based estimates of accuracy. Methods considered are Generalized Additive Modeling, Linear and Quadratic Discriminant Analysis, Tree-based methods, and Random Forests.

Maintained by John Maindonald. Last updated 2 years ago.

4.0 match 4.82 score 44 scripts

benrenard

sequenceR:A Simple Sequencer for Data Sonification

A rudimentary sequencer to define, manipulate and mix sound samples. The underlying motivation is to sonify data, as demonstrated in the blog <https://globxblog.github.io/>, the presentation by Renard and Le Bescond (2022, <https://hal.science/hal-03710340v1>) or the poster by Renard et al. (2023, <https://hal.inrae.fr/hal-04388845v1>).

Maintained by Benjamin Renard. Last updated 2 months ago.

audio sonification

3.8 match 3 stars 5.13 score 15 scripts

jmotif

jmotif:Time Series Analysis Toolkit Based on Symbolic Aggregate Discretization, i.e. SAX

Implements time series z-normalization, SAX, HOT-SAX, VSM, SAX-VSM, RePair, and RRA algorithms facilitating time series motif (i.e., recurrent pattern), discord (i.e., anomaly), and characteristic pattern discovery along with interpretable time series classification.

Maintained by Pavel Senin. Last updated 2 years ago.

anomalydiscovery discord discretization kdd sax sax-vsm timeseries cpp

3.7 match 55 stars 5.12 score 48 scripts

hvillalo

satin:Visualisation and Analysis of Ocean Data Derived from Satellites

The 'satin' functions make it easy to visualise satellite-derived ocean data, extract data, and carry out further analyses such as producing climatologies from several images and computing anomalies. Reading functions can import a user-defined geographical extent of data stored in netCDF files. Currently supported ocean data sources include NASA's Oceancolor web page <https://oceancolor.gsfc.nasa.gov/>, sensors VIIRS-SNPP; MODIS-Terra; MODIS-Aqua; and SeaWiFS. Available variables from this source include chlorophyll concentration, sea surface temperature (SST), and several others. Data sources specific to SST that can also be imported include Pathfinder AVHRR <https://www.ncei.noaa.gov/products/avhrr-pathfinder-sst> and GHRSST <https://www.ghrsst.org/>. In addition, ocean productivity data produced by Oregon State University <http://sites.science.oregonstate.edu/ocean.productivity/> can also be handled after prior conversion from HDF4 to HDF5 format. Many other ocean variables can be processed by importing netCDF data files from two of the European Union's Copernicus Marine Service databases <https://marine.copernicus.eu/>, namely Global Ocean Physical Reanalysis and Global Ocean Biogeochemistry Hindcast.

Maintained by Héctor Villalobos. Last updated 10 months ago.

avhrr copernicus ghrsst modis

5.5 match 4 stars 3.30 score 9 scripts

fchamroukhi

meteorits:Mixture-of-Experts Modeling for Complex Non-Normal Distributions

Provides a unified mixture-of-experts (ME) modeling and estimation framework with several original and flexible ME models to model, cluster and classify heterogeneous data in many complex situations where the data are distributed according to non-normal, possibly skewed distributions, and when they might be corrupted by atypical observations. Mixtures-of-Experts models for complex and non-normal distributions ('meteorits') were originally introduced and written in 'Matlab' by Faicel Chamroukhi. The references are mainly the following ones. Chamroukhi F., Same A., Govaert, G. and Aknin P. (2009) <doi:10.1016/j.neunet.2009.06.040>. Chamroukhi F. (2010) <https://chamroukhi.com/FChamroukhi-PhD.pdf>. Chamroukhi F. (2015) <arXiv:1506.06707>. Chamroukhi F. (2015) <https://chamroukhi.com/FChamroukhi-HDR.pdf>. Chamroukhi F. (2016) <doi:10.1109/IJCNN.2016.7727580>. Chamroukhi F. (2016) <doi:10.1016/j.neunet.2016.03.002>. Chamroukhi F. (2017) <doi:10.1016/j.neucom.2017.05.044>.

Maintained by Florian Lecocq. Last updated 5 years ago.

artificial-intelligence clustering em-algorithm mixture-of-experts neural-networks non-linear-regression prediction robust-learning skew-normal skew-t skewed-data statistical-inference statistical-learning t-distribution unsupervised-learning openblas cpp

3.3 match 3 stars 5.12 score 11 scripts

cran

vectorsurvR:Data Access and Analytical Tools for 'VectorSurv' Users

Allows registered 'VectorSurv' <https://vectorsurv.org/> users access to data through the 'VectorSurv API' <https://api.vectorsurv.org/>. Additionally provides functions for analysis and visualization.

Maintained by Christina De Cesaris. Last updated 2 months ago.

5.2 match 3.30 score

cmsaf

cmsafvis:Tools to Visualize CM SAF NetCDF Data

The Satellite Application Facility on Climate Monitoring (CM SAF) is a ground segment of the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT) and one of EUMETSAT's Satellite Application Facilities. The CM SAF contributes to the sustainable monitoring of the climate system by providing essential climate variables related to the energy and water cycle of the atmosphere (<https://www.cmsaf.eu>). It is a joint cooperation of eight National Meteorological and Hydrological Services. The 'cmsafvis' R-package provides a collection of R-operators for the analysis and visualization of CM SAF NetCDF data. CM SAF climate data records are provided for free via (<https://wui.cmsaf.eu/safira>). Detailed information and test data are provided on the CM SAF webpage (<http://www.cmsaf.eu/R_toolbox>).

Maintained by Steffen Kothe. Last updated 6 months ago.

3.5 match 2 stars 4.73 score 1 scripts 1 dependents

benrwoodard

adobeanalyticsr:R Client for 'Adobe Analytics' API 2.0

Connect to the 'Adobe Analytics' API v2.0 <https://github.com/AdobeDocs/analytics-2.0-apis> which powers 'Analysis Workspace'. The package was developed with the analyst in mind, and it will continue to be developed with the guiding principles of iterative, repeatable, timely analysis.

Maintained by Ben Woodard. Last updated 2 months ago.

2.3 match 18 stars 7.02 score 39 scripts

vmoprojs

GeoModels:Procedures for Gaussian and Non Gaussian Geostatistical (Large) Data Analysis

Functions for Gaussian and Non Gaussian (bivariate) spatial and spatio-temporal data analysis are provided for a) (fast) simulation of random fields, b) inference for random fields using standard likelihood and a likelihood approximation method called weighted composite likelihood based on pairs and c) prediction using (local) best linear unbiased prediction. Weighted composite likelihood can be very efficient for estimation with massive datasets. Both regression and spatial (temporal) dependence analysis can be jointly performed. Flexible covariance models for spatial and spatial-temporal data on Euclidean domains and spheres are provided. There are also many useful functions for plotting and performing diagnostic analysis. Different non Gaussian random fields can be considered in the analysis. Among them are random fields with marginal distributions such as Skew-Gaussian, Student-t, Tukey-h, Sin-Arcsin, Two-piece, Weibull, Gamma, Log-Gaussian, Binomial, Negative Binomial and Poisson. See the URL for the papers associated with this package, as for instance, Bevilacqua and Gaetan (2015) <doi:10.1007/s11222-014-9460-6>, Bevilacqua et al. (2016) <doi:10.1007/s13253-016-0256-3>, Vallejos et al. (2020) <doi:10.1007/978-3-030-56681-4>, Bevilacqua et al. (2020) <doi:10.1002/env.2632>, Bevilacqua et al. (2021) <doi:10.1111/sjos.12447>, Bevilacqua et al. (2022) <doi:10.1016/j.jmva.2022.104949>, Morales-Navarrete et al. (2023) <doi:10.1080/01621459.2022.2140053>, and a large class of examples and tutorials.

Maintained by Moreno Bevilacqua. Last updated 2 months ago.

fortran openblas glibc

3.6 match 3 stars 4.17 score 83 scripts

bupaverse

daqapo:Data Quality Assessment for Process-Oriented Data

Provides a variety of methods to identify data quality issues in process-oriented data, which are useful to verify data quality in a process mining context. Builds on the class for activity logs implemented in the package 'bupaR'. Methods to identify data quality issues either consider each activity log entry independently (e.g. missing values, activity duration outliers,...), or focus on the relation amongst several activity log entries (e.g. batch registrations, violations of the expected activity order,...).

Maintained by Niels Martin. Last updated 3 years ago.

3.9 match 6 stars 3.78 score 9 scripts

labgrs

npphen:Vegetation Phenological Cycle and Anomaly Detection using Remote Sensing Data

Calculates phenological cycle and anomalies using a non-parametric approach applied to time series of vegetation indices derived from remote sensing data or field measurements. The package implements basic and high-level functions for manipulating vector data (numerical series) and raster data (satellite derived products). Processing of very large raster files is supported. For more information, please check the following paper: Chávez et al. (2023) <doi:10.3390/rs15010073>.

Maintained by José A. Lastra. Last updated 3 months ago.

3.3 match 5 stars 4.32 score 14 scripts

jsta

wql:Exploring Water Quality Monitoring Data

Functions to assist in the processing and exploration of data from environmental monitoring programs. The package name stands for "water quality" and reflects the original focus on time series data for physical and chemical properties of water, as well as the biota. Intended for programs that sample approximately monthly, quarterly or annually at discrete stations, a feature of many legacy data sets. Most of the functions should be useful for analysis of similar-frequency time series regardless of the subject matter.

Maintained by Jemma Stachelek. Last updated 2 months ago.

water-quality

1.9 match 12 stars 7.34 score 204 scripts 3 dependents

hubbardalex

autostsm:Automatic Structural Time Series Models

Automatic model selection for structural time series decomposition into trend, cycle, and seasonal components, plus optionality for structural interpolation, using the Kalman filter. Koopman, Siem Jan and Marius Ooms (2012) "Forecasting Economic Time Series Using Unobserved Components Time Series Models" <doi:10.1093/oxfordhb/9780195398649.013.0006>. Kim, Chang-Jin and Charles R. Nelson (1999) "State-Space Models with Regime Switching: Classical and Gibbs-Sampling Approaches with Applications" <doi:10.7551/mitpress/6444.001.0001><http://econ.korea.ac.kr/~cjkim/>.

Maintained by Alex Hubbard. Last updated 9 months ago.

3.8 match 3.55 score 29 scripts

cran

dslabs:Data Science Labs

Datasets and functions that can be used for data analysis practice, homework and projects in data science courses and workshops. 26 datasets are available for case studies in data visualization, statistical inference, modeling, linear regression, data wrangling and machine learning.

Maintained by Rafael A. Irizarry. Last updated 1 years ago.

3.4 match 3.56 score 2 dependents

cran

AnomalyScore:Anomaly Scoring for Multivariate Time Series

Compute an anomaly score for multivariate time series based on the k-nearest neighbors algorithm. Different computations of distances between time series are provided.

Maintained by Guillermo Granados. Last updated 4 months ago.

7.0 match 1.70 score 1 scripts
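
To make the k-nearest-neighbours idea in the 'AnomalyScore' description concrete, here is a minimal sketch that scores each series by its mean distance to its k closest neighbours. It is not the package's API: the function names, the choice of Euclidean distance, and the use of short univariate series (rather than multivariate ones) are assumptions made for brevity.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_anomaly_scores(series_list, k=2, distance=euclidean):
    """Score each series by the mean distance to its k nearest neighbours.

    A larger score means the series is farther from its closest peers.
    """
    if not 1 <= k < len(series_list):
        raise ValueError("k must be between 1 and len(series_list) - 1")
    scores = []
    for i, current in enumerate(series_list):
        # Distances to every other series, smallest first.
        dists = sorted(
            distance(current, other)
            for j, other in enumerate(series_list)
            if j != i
        )
        scores.append(sum(dists[:k]) / k)
    return scores


# Three similar series and one that behaves very differently (made-up data).
series = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.0],
    [0.9, 1.9, 3.1, 4.1],
    [9.0, 1.0, 8.0, 0.0],
]
scores = knn_anomaly_scores(series, k=2)
most_anomalous = scores.index(max(scores))
```

Passing a different `distance` callable is how one would experiment with other distances between series, which the description says the package supports in its own way.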

jmhewitt

telefit:Estimation and Prediction for Remote Effects Spatial Process Models

Implementation of the remote effects spatial process (RESP) model for teleconnection. The RESP model is a geostatistical model that allows a spatially-referenced variable (like average precipitation) to be influenced by covariates defined on a remote domain (like sea surface temperatures). The RESP model is introduced in Hewitt et al. (2018) <doi:10.1002/env.2523>. Sample code for working with the RESP model is available at <https://jmhewitt.github.io/research/resp_example>. This material is based upon work supported by the National Science Foundation under grant number AGS 1419558. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Maintained by Joshua Hewitt. Last updated 5 years ago.

openblas cpp

3.8 match 1 stars 3.19 score 31 scripts

jmzobitz

demodelr:Simulating Differential Equations with Data

Designed to support the visualization, numerical computation, qualitative analysis, model-data fusion, and stochastic simulation for autonomous systems of differential equations. Euler and Runge-Kutta methods are implemented, along with tools to visualize the two-dimensional phaseplane. Likelihood surfaces and a simple Markov Chain Monte Carlo parameter estimator can be used for model-data fusion of differential equations and empirical models. The Euler-Maruyama method is provided for simulation of stochastic differential equations. The package was originally written for internal use to support teaching by Zobitz, and refined to support the text "Exploring modeling with data and differential equations using R" by John Zobitz (2021) <https://jmzobitz.github.io/ModelingWithR/index.html>.

Maintained by John Zobitz. Last updated 19 hours ago.

3.5 match 4 stars 3.34 score 11 scripts

beanumber

macleish:Retrieve Data from MacLeish Field Station

Download data from the Ada and Archibald MacLeish Field Station in Whately, MA. The Ada and Archibald MacLeish Field Station is a 260-acre patchwork of forest and farmland located in West Whately, MA that provides opportunities for faculty and students to pursue environmental research, outdoor education, and low-impact recreation (see <https://www.smith.edu/about-smith/sustainable-smith/macleish> for more information). This package contains weather data over several years, and spatial data on various man-made and natural structures.

Maintained by Benjamin S. Baumer. Last updated 3 years ago.

2.0 match 2 stars 5.72 score 87 scripts

kryberg-usgs

waterData:Retrieval, Analysis, and Anomaly Calculation of Daily Hydrologic Time Series Data

Imports U.S. Geological Survey (USGS) daily hydrologic data from USGS web services (see <https://waterservices.usgs.gov/> for more information), plots the data, addresses some common data problems, and calculates and plots anomalies.

Maintained by Karen R. Ryberg. Last updated 8 years ago.

3.3 match 3.45 score 56 scripts

tim-salabim

remote:Empirical Orthogonal Teleconnections in R

Empirical orthogonal teleconnections in R. 'remote' is short for 'R(-based) EMpirical Orthogonal TEleconnections'. It implements a collection of functions to facilitate empirical orthogonal teleconnection analysis. Empirical Orthogonal Teleconnections (EOTs) denote a regression based approach to decompose spatio-temporal fields into a set of independent orthogonal patterns. They are quite similar to Empirical Orthogonal Functions (EOFs) with EOTs producing less abstract results. In contrast to EOFs, which are orthogonal in both space and time, EOT analysis produces patterns that are orthogonal in either space or time.

Maintained by Tim Appelhans. Last updated 8 years ago.

cpp

4.0 match 2.79 score 100 scripts

allen-1242

StructuralDecompose:Decomposes a Level Shifted Time Series

Explains the behavior of a time series by decomposing it into its trend, seasonality and residuals. It is built to perform very well in the presence of significant level shifts. It is designed to play well with any breakpoint algorithm and any smoothing algorithm. Currently defaults to 'lowess' for smoothing and 'strucchange' for breakpoint identification. The package is useful in areas such as trend analysis, time series decomposition, breakpoint identification and anomaly detection.

Maintained by Allen Sunny. Last updated 2 years ago.

decomposition timeseries-analysis

2.5 match 1 stars 4.18 score 4 scripts

jomopo

BINCOR:Estimate the Correlation Between Two Irregular Time Series

Estimate the correlation between two irregular time series that are not necessarily sampled on identical time points. This program is also applicable to the situation of two evenly spaced time series that are not on the same time grid. 'BINCOR' is based on a novel estimation approach proposed by Mudelsee (2010, 2014) to estimate the correlation between two climate time series with different timescales. The idea is that autocorrelation (AR1 process) makes it possible to correlate values obtained on different time points. 'BINCOR' contains four functions: bin_cor() (the main function to build the binned time series), plot_ts() (to plot and compare the irregular and binned time series), cor_ts() (to estimate the correlation between the binned time series) and ccf_ts() (to estimate the cross-correlation between the binned time series).

Maintained by Josué M. Polanco-Martínez. Last updated 7 years ago.

6.8 match 1.48 score 5 scripts

ropensci

rsat:Dealing with Multiplatform Satellite Images

Downloading, customizing, and processing time series of satellite images for a region of interest. 'rsat' functions provide unified access to multispectral images from Landsat, MODIS and Sentinel repositories. 'rsat' also offers capabilities for customizing satellite images, such as tile mosaicking, image cropping and computation of new variables. Finally, 'rsat' covers the processing, including cloud masking, compositing and gap-filling/smoothing time series of images (Militino et al., 2018 <doi:10.3390/rs10030398> and Militino et al., 2019 <doi:10.1109/TGRS.2019.2904193>).

Maintained by Unai Pérez - Goya. Last updated 11 months ago.

satellite-images

1.3 match 54 stars 7.45 score 52 scripts

atsa-es

atsalibrary:Packages, data and scripts for ATSA course and lab book

This package will load the needed packages and data files for the ATSA course material when students install from GitHub.

Maintained by Elizabeth E. Holmes. Last updated 2 years ago.

3.5 match 4 stars 2.41 score 13 scripts

inbo

inlatools:Diagnostic Tools for INLA Models

Several functions which can be useful to choose sensible priors and diagnose the fitted model.

Maintained by Thierry Onkelinx. Last updated 5 months ago.

bayesian-statistics gplv3 inla mixed-models model-checking model-validation

1.9 match 4 stars 4.41 score 43 scripts

txwri

adc:Calculate Antecedent Discharge Conditions

Calculates some antecedent discharge conditions useful in water quality modeling. Includes methods for calculating flow anomalies, base flow, and smooth discounted flows from daily flow measurements. Antecedent discharge algorithms are described and reviewed in Zhang and Ball (2017) <doi:10.1016/j.jhydrol.2016.12.052>.

Maintained by Michael Schramm. Last updated 2 years ago.

hydrology water water-quality

2.5 match 3 stars 3.18 score 3 scripts

shirintaheri

climetrics:Climate Change Metrics

A framework that facilitates spatio-temporal analysis of climate dynamics through exploring and measuring different dimensions of climate change in space and time.

Maintained by Shirin Taheri. Last updated 11 months ago.

2.0 match 14 stars 3.85 score 9 scripts

inbo

n2kupdate:Auxiliary Functions to Update the n2kresult Database

The functions are useful to store the results from https://github.com/inbo/n2kanalysis into a PostgreSQL database created with https://github.com/inbo/n2kresult.

Maintained by Thierry Onkelinx. Last updated 6 years ago.

etl natura2000

4.3 match 1.70 score 1 scripts

cicarrascog

imputeREE:Impute Missing Rare Earth Element Data in Zircon

Set of functions to impute missing rare earth data, calculate La and Pr concentrations and Ce anomalies in zircons based on the Chondrite-Onuma and Chondrite-Lattice of Carrasco-Godoy and Campbell (2023) <doi:10.1007/s00410-023-02025-9> and the Logarithmic regression from Zhong et al. (2019) <doi:10.1007/s00710-019-00682-y>.

Maintained by Carlos Carrasco Godoy. Last updated 1 years ago.

2.3 match 3 stars 3.18 score 3 scripts

gabrielodom

mvMonitoring:Multi-State Adaptive Dynamic Principal Component Analysis for Multivariate Process Monitoring

Use multi-state splitting to apply Adaptive-Dynamic PCA (ADPCA) to data generated from a continuous-time multivariate industrial or natural process. Employ PCA-based dimension reduction to extract linear combinations of relevant features, reducing computational burdens. For a description of ADPCA, see <doi:10.1007/s00477-016-1246-2>, the 2016 paper from Kazor et al. The multi-state application of ADPCA is from a manuscript under current revision entitled "Multi-State Multivariate Statistical Process Control" by Odom, Newhart, Cath, and Hering, and is expected to appear in Q1 of 2018.

Maintained by Gabriel Odom. Last updated 1 years ago.

1.3 match 4 stars 5.24 score 29 scripts

inbo

n2kanalysis:Generic Functions to Analyse Data from the 'Natura 2000' Monitoring

All generic functions and classes for the analysis for the 'Natura 2000' monitoring. The classes contain all required data and definitions to fit the model without the need to access other sources. Potentially they might need access to one or more parent objects. An aggregation object might for example need the result of an imputation object. The actual definition of the analysis, using these generic functions and classes, is defined in dedicated analysis R packages for every monitoring scheme. For example 'abvanalysis' and 'watervogelanalysis'.

Maintained by Thierry Onkelinx. Last updated 2 months ago.

analysis monitoring natura2000

2.0 match 1 stars 3.18 score 7 scripts

jaydevine

pheble:Classifying High-Dimensional Phenotypes with Ensemble Learning

A system for binary and multi-class classification of high-dimensional phenotypic data using ensemble learning. By combining predictions from different classification models, this package attempts to improve performance over individual learners. The pre-processing, training, validation, and testing are performed end-to-end to minimize user input and simplify the process of classification.

Maintained by Jay Devine. Last updated 2 years ago.

2.3 match 2.70 score

cran

adamethods:Archetypoid Algorithms and Anomaly Detection

Collection of several algorithms to obtain archetypoids with small and large databases, and with both classical multivariate data and functional data (univariate and multivariate). Some of these algorithms can also detect anomalies (outliers). Please see Vinue and Epifanio (2020) <doi:10.1007/s11634-020-00412-9>.

Maintained by Guillermo Vinue. Last updated 5 years ago.

3.6 match 1.63 score 43 scripts

mfacevedol

renpow:Renewable Power Systems and the Environment

Supports calculations and visualization for renewable power systems and the environment. Analysis and graphical tools for DC and AC circuits and their use in electric power systems. Analysis and graphical tools for thermodynamic cycles and heat engines, supporting efficiency calculations in coal-fired power plants, gas-fired power plants. Calculations of carbon emissions and atmospheric CO2 dynamics. Analysis of power flow and demand for the grid, as well as power models for microgrids and off-grid systems. Provides resource and power generation for hydro power, wind power, and solar power.

Maintained by Miguel F. Acevedo. Last updated 7 years ago.

3.6 match 1.48 score 30 scripts

dyfanjones

sagemaker.mlframework:sagemaker machine learning developed by amazon

`sagemaker` machine learning developed by amazon.

Maintained by Dyfan Jones. Last updated 3 years ago.

amazon-sagemaker aws machine-learning sagemaker sdk

1.8 match 2.48 score 2 dependents

collincr-usgs

gravmagsubs:Gravitational and Magnetic Attraction of 3-D Vertical Rectangular Prisms

Computes the gravitational and magnetic anomalies generated by 3-D vertical rectangular prisms at specific observation points using the method of Plouff (1976) <doi:10.1190/1.1440645>.

Maintained by C. Cronkite-Ratcliff. Last updated 2 years ago.

cpp openmp

1.6 match 2.70 score 4 scripts

talegari

solitude:An Implementation of Isolation Forest

Isolation forest is an anomaly detection method introduced in the paper "Isolation-Based Anomaly Detection" (Liu, Ting and Zhou <doi:10.1145/2133360.2133363>).

Maintained by Komala Sheshachala Srikanth. Last updated 4 years ago.

isolation-forest outliers rpackages

0.8 match 24 stars 5.24 score 70 scripts 1 dependents

ropenspain

climaemet:Climate AEMET Tools

Tools to download the climatic data of the Spanish Meteorological Agency (AEMET) directly from R using their API and create scientific graphs (climate charts, trend analysis of climate time series, temperature and precipitation anomalies maps, warming stripes graphics, climatograms, etc.).

Maintained by Diego Hernangómez. Last updated 1 months ago.

aemet climate data forecast-api ropenspain science spain weather-api

0.5 match 43 stars 8.32 score 59 scripts

cumulocity-iot

pmml:Generate PMML for Various Models

The Predictive Model Markup Language (PMML) is an XML-based language which provides a way for applications to define machine learning, statistical and data mining models and to share models between PMML compliant applications. More information about the PMML industry standard and the Data Mining Group can be found at <http://dmg.org/>. The generated PMML can be imported into any PMML consuming application, such as Zementis Predictive Analytics products. The package isofor (used for anomaly detection) can be installed with devtools::install_github("gravesee/isofor").

Maintained by Dmitriy Bolotov. Last updated 3 years ago.

machine-learning pmml zementis

0.5 match 20 stars 7.98 score 560 scripts 1 dependents

zhaokg

Rbeast:Bayesian Change-Point Detection and Time Series Decomposition

Interpretation of time series data is affected by model choices. Different models can give different or even contradicting estimates of patterns, trends, and mechanisms for the same data--a limitation alleviated by the Bayesian estimator of abrupt change, seasonality, and trend (BEAST) of this package. BEAST seeks to improve time series decomposition by forgoing the "single-best-model" concept and embracing all competing models into the inference via a Bayesian model averaging scheme. It is a flexible tool to uncover abrupt changes (i.e., change-points), cyclic variations (e.g., seasonality), and nonlinear trends in time-series observations. BEAST not only tells when changes occur but also quantifies how likely the detected changes are to be true. It detects not only piecewise linear trends but also arbitrary nonlinear trends. BEAST is applicable to real-valued time series data of all kinds, be it for remote sensing, economics, climate sciences, ecology, or hydrology. Example applications include its use to identify regime shifts in ecological data, map forest disturbance and land degradation from satellite imagery, detect market trends in economic data, pinpoint anomalies and extreme events in climate data, and unravel system dynamics in biological data. Details on BEAST are reported in Zhao et al. (2019) <doi:10.1016/j.rse.2019.04.034>.

Maintained by Kaiguang Zhao. Last updated 6 months ago.

anomoly-detection bayesian-time-series breakpoint-detection changepoint-detection interrupted-time-series seasonality-analysis structural-breakpoint technical-analysis time-series time-series-decomposition trend trend-analysis

0.5 match 302 stars 7.63 score 89 scripts

bioc

PeacoQC:Peak-based selection of high quality cytometry data

This is a package that includes pre-processing and quality control functions that can remove margin events, compensate and transform the data and that will use PeacoQCSignalStability for quality control. This last function will first detect peaks in each channel of the flowframe. It will remove anomalies based on the IsolationTree function and the MAD outlier detection method. This package can be used for both flow- and mass cytometry data.

Maintained by Annelies Emmaneel. Last updated 5 months ago.

flowcytometry qualitycontrol preprocessing peakdetection

0.5 match 16 stars 7.38 score 28 scripts 3 dependents
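
The 'PeacoQC' description mentions a MAD (median absolute deviation) outlier detection method. The sketch below shows the general technique on a plain list of numbers. It is not PeacoQC's implementation: the function name `mad_outliers`, the threshold of 3, and the consistency constant 1.4826 (which makes the MAD comparable to a standard deviation for normally distributed data) are assumptions of this example.

```python
import statistics


def mad_outliers(values, threshold=3.0):
    """Flag values whose robust z-score exceeds the threshold.

    The robust z-score is |x - median| / (1.4826 * MAD).
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    scale = 1.4826 * mad
    if scale == 0:
        # More than half the values are identical; nothing can be scaled.
        return [False] * len(values)
    return [abs(v - med) / scale > threshold for v in values]


# Made-up measurements with one obvious spike.
values = [10.1, 9.8, 10.0, 10.3, 9.9, 25.0, 10.2]
flags = mad_outliers(values)
```

Because the median and MAD are barely affected by the spike itself, the spike stands out clearly, which is the usual reason for preferring MAD over a mean and standard deviation rule.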

rabarata

exdqlm:Extended Dynamic Quantile Linear Models

Routines for Bayesian estimation and analysis of dynamic quantile linear models utilizing the extended asymmetric Laplace error distribution, also known as extended dynamic quantile linear models (exDQLM) described in Barata et al (2020) <doi:10.1214/21-AOAS1497>.

Maintained by Raquel Barata. Last updated 2 years ago.

3.6 match 1 stars 1.00 score

pridiltal

oddstream:Outlier Detection in Data Streams

We propose a framework that provides real time support for early detection of anomalous series within a large collection of streaming time series data. By definition, anomalies are rare in comparison to a system's typical behaviour. We define an anomaly as an observation that is very unlikely given the forecast distribution. The algorithm first forecasts a boundary for the system's typical behaviour using a representative sample of the typical behaviour of the system. An approach based on extreme value theory is used for this boundary prediction process. Then a sliding window is used to test for anomalous series within the newly arrived collection of series. Feature based representation of time series is used as the input to the model. To cope with concept drift, the forecast boundary for the system's typical behaviour is updated periodically. More details regarding the algorithm can be found in Talagala, P. D., Hyndman, R. J., Smith-Miles, K., et al. (2019) <doi:10.1080/10618600.2019.1617160>.

Maintained by Priyanga Dilini Talagala. Last updated 5 years ago.

0.8 match 64 stars 4.71 score 16 scripts

cran

SpatialVx:Spatial Forecast Verification

Spatial forecast verification refers to verifying weather forecasts when the verification set (forecast and observations) is on a spatial field, usually a high-resolution gridded spatial field. Most of the functions here require the forecast and observed fields to be gridded and on the same grid. For a thorough review of most of the methods in this package, please see Gilleland et al. (2009) <doi:10.1175/2009WAF2222269.1> and for a tutorial on some of the main functions available here, see Gilleland (2022) <doi:10.5065/4px3-5a05>.

Maintained by Eric Gilleland. Last updated 4 months ago.

1.9 match 1 stars 1.83 score 68 scripts

adunaic

forecastLSW:Forecasting Routines for Locally Stationary Wavelet Processes

Implementation to perform forecasting of locally stationary wavelet processes by examining the local second order structure of the time series.

Maintained by Rebecca Killick. Last updated 2 years ago.

3.3 match 1.00 score 3 scripts

hoxo-m

densratio:Density Ratio Estimation

Density ratio estimation. The estimated density ratio function can be used in many applications such as anomaly detection, change-point detection, covariate shift adaptation. The implemented methods are uLSIF (Hido et al. (2011) <doi:10.1007/s10115-010-0283-2>), RuLSIF (Yamada et al. (2011) <doi:10.1162/NECO_a_00442>), and KLIEP (Sugiyama et al. (2007) <doi:10.1007/s10463-008-0197-x>).

Maintained by Koji Makiyama. Last updated 6 years ago.

anomalydetection machine-learning machine-learning-algorithms machine-learning-library r-language statistics

0.5 match 21 stars 6.36 score 36 scripts 2 dependents

bioc

flowAI:Automatic and interactive quality control for flow cytometry data

The package is able to perform an automatic or interactive quality control on FCS data acquired using flow cytometry instruments. By evaluating three different properties: 1) flow rate, 2) signal acquisition, 3) dynamic range, the quality control enables the detection and removal of anomalies.

Maintained by Gianni Monaco. Last updated 5 months ago.

flowcytometry qualitycontrol biomedicalinformatics immunooncology

0.5 match 5.67 score 86 scripts 3 dependents

mayer79

outForest:Multivariate Outlier Detection and Replacement

Provides a random forest based implementation of the method described in Chapter 7.1.2 (Regression model based anomaly detection) of Chandola et al. (2009) <doi:10.1145/1541880.1541882>. It works as follows: Each numeric variable is regressed onto all other variables by a random forest. If the scaled absolute difference between observed value and out-of-bag prediction of the corresponding random forest is suspiciously large, then a value is considered an outlier. The package offers different options to replace such outliers, e.g. by realistic values found via predictive mean matching. Once the method is trained on reference data, it can be applied to new data.

Maintained by Michael Mayer. Last updated 8 months ago.

machine-learning outlier outlier-analysis outlier-detection random-forest

0.5 match 13 stars 5.39 score 19 scripts

sevvandi

outlierensembles:A Collection of Outlier Ensemble Algorithms

Ensemble functions for outlier/anomaly detection. Includes a new ensemble method based on Item Response Theory. Existing outlier ensemble methods from Schubert et al. (2012) <doi:10.1137/1.9781611972825.90>, Chiang et al. (2017) <doi:10.1016/j.jal.2016.12.002> and Aggarwal and Sathe (2015) <doi:10.1145/2830544.2830549> are also included.

Maintained by Sevvandi Kandanaarachchi. Last updated 23 days ago.

0.5 match 3 stars 4.18 score 9 scripts

ainsuotain

pasadr:An Implementation of Process-Aware Stealthy Attack Detection (PASAD)

Anomaly detection method based on the paper "Truth will out: Departure-based process-level detection of stealthy attacks on control systems" by Wissam Aoudi, Mikel Iturbe, and Magnus Almgren (2018) <DOI:10.1145/3243734.3243781>. See also the following implementation, which was used as a reference: <https://github.com/rahulrajpl/PyPASAD>.

Maintained by Donghwan Kim. Last updated 4 years ago.

0.5 match 3 stars 3.18 score

kumes

seasonalityPlot:Seasonality Variation Plots of Stock Prices and Cryptocurrencies

The price action at any given time is determined by investor sentiment and market conditions. Although there is no established principle, over a long period of time, prices often move with a certain periodicity. This is sometimes referred to as an anomaly. The seasonPlot() function in this package calculates and visualizes the average value of price movements over a year for any given period. In addition, the monthly increase or decrease in price movement is represented with a colored background. The seasonPlot() function can use the same symbols as the 'quantmod' package (e.g. ^IXIC, ^DJI, SPY, BTC-USD, and ETH-USD).

Maintained by Satoshi Kume. Last updated 6 months ago.

0.5 match 1 stars 3.00 score 6 scripts

myaseen208

PakNAcc:'shiny' App for National Accounts

Provides a comprehensive suite of tools for analyzing Pakistan's Quarterly National Accounts data. Users can gain detailed insights into Pakistan's economic performance, visualize quarterly trends, and detect patterns and anomalies in key economic indicators. Compare sector contributions—including agriculture, industry, and services—to understand their influence on economic growth or decline. Customize analyses by filtering and manipulating data to focus on specific areas of interest. Ideal for policymakers, researchers, and analysts aiming to make informed, data-driven decisions based on timely and detailed economic insights.

Maintained by Muhammad Yaseen. Last updated 4 months ago.

0.5 match 3.00 score

pigian

snap:Simple Neural Application

A simple wrapper to easily design vanilla deep neural networks using 'Tensorflow'/'Keras' backend for regression, classification and multi-label tasks, with some tweaks and tricks (skip shortcuts, embedding, feature selection and anomaly detection).

Maintained by Giancarlo Vercellino. Last updated 4 years ago.

0.5 match 2.00 score

cran

adplots:Ad-Plot and Ud-Plot for Visualizing Distributional Properties and Normality

The empirical cumulative average deviation function introduced by the author is utilized to develop both Ad- and Ud-plots. The Ad-plot can identify symmetry, skewness, and outliers of the data distribution, including anomalies. The Ud-plot, created by slightly modifying the Ad-plot, is exceptional in assessing normality, outperforming the normal QQ-plot, the normal PP-plot, and their derivations. The d-value, which quantifies the degree of proximity between the Ud-plot and the graph of the estimated normal density function, helps guide decisions on whether normality is confirmed. A full description of this methodology can be found in the article by Wijesuriya (2025) <doi:10.1080/03610926.2024.2440583>.

Maintained by Uditha Amarananda Wijesuriya. Last updated 2 months ago.

0.5 match 2.00 score

cran

CLimd:Generating Rainfall Rasters from IMD NetCDF Data

The package is a comprehensive tool for the analysis of India Meteorological Department (IMD) NetCDF rainfall data, specifically designed to process high-resolution daily gridded rainfall datasets. It provides four key functions to process IMD NetCDF rainfall data and create rasters for various temporal scales, including annual, seasonal, monthly, and weekly rainfall. For method details, see Malik, A. (2019) <DOI:10.1007/s12517-019-4454-5>. It supports different aggregation methods, such as sum, min, max, mean, and standard deviation. These functions are designed for spatio-temporal analysis of rainfall patterns, trend analysis, geostatistical modeling of rainfall variability, and identifying rainfall anomalies and extreme events, and their output can serve as input for hydrological and agricultural models.

Maintained by Nobin Chandra Paul. Last updated 1 year ago.

0.5 match 2.00 score

marchionnilab

lpcover:LPCover: Functionality for integer programming methods for covering

Integer programming functionality for different 'covering' optimizations as presented in Ke et al, "Efficient Representations of Tumor Diversity with Paired DNA-RNA Anomalies".

Maintained by Wikum Dinalankara. Last updated 4 years ago.

software statisticalmethod

0.5 match 1.70 score 2 scripts

cran

kmodR:K-Means with Simultaneous Outlier Detection

An implementation of the 'k-means--' algorithm proposed by Chawla and Gionis (2013) in their paper "k-means--: A unified approach to clustering and outlier detection", SIAM International Conference on Data Mining (SDM13) <doi:10.1137/1.9781611972832.21>, using the 'ordering' described by Howe (2013) in the thesis "Clustering and anomaly detection in tropical cyclones". Useful for creating (potentially) tighter clusters than standard k-means and simultaneously finding outliers inexpensively in multidimensional space.

Maintained by David Charles Howe. Last updated 3 years ago.

0.5 match 1.70 score
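
As a rough illustration of the idea behind 'k-means--' named in the entry above, the following Python sketch alternates between setting aside the points farthest from their nearest center and recomputing centers from the remaining points. It is not the kmodR code and does not cover the 'ordering' step attributed to Howe; the function name `kmeans_minus_minus`, its parameters, and the choice to pass initial centers explicitly are assumptions made for this example.

```python
import math


def kmeans_minus_minus(points, centers, n_outliers, n_iter=20):
    """Sketch of k-means with simultaneous outlier removal.

    points     : list of equal-length numeric tuples
    centers    : initial centers (hypothetical choice: supplied by the caller)
    n_outliers : how many points to set aside in every iteration
    Returns (centers, outlier_indices).
    """
    centers = [tuple(c) for c in centers]
    outliers = []
    for _ in range(n_iter):
        # Distance from every point to its nearest current center.
        nearest = []
        for idx, p in enumerate(points):
            dists = [math.dist(p, c) for c in centers]
            j = min(range(len(centers)), key=dists.__getitem__)
            nearest.append((dists[j], idx, j))

        # The n_outliers points farthest from their center are set aside.
        ranked = sorted(nearest, reverse=True)
        outliers = sorted(idx for _, idx, _ in ranked[:n_outliers])
        inliers = ranked[n_outliers:]

        # Recompute each center from its remaining (non-outlier) members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [points[idx] for _, idx, cj in inliers if cj == j]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(old)  # keep an empty cluster's center

        if new_centers == centers:  # no change: converged
            break
        centers = new_centers
    return centers, outliers


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
centers, outliers = kmeans_minus_minus(points, [(0, 0), (10, 10)], n_outliers=1)
```

Because the far-away point is excluded before the means are updated, it does not drag the second center toward itself, which is the intuition behind the "tighter clusters" claim in the description.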

cran

kuiper.2samp:Two-Sample Kuiper Test

This function performs the two-sample Kuiper test to assess the anomaly of continuous, one-dimensional probability distributions. References used for this method are (1) Kuiper, N. H. (1960) <DOI:10.1016/S1385-7258(60)50006-0> and (2) Paltani, S. (2004) <DOI:10.1051/0004-6361:20034220>.

Maintained by Ying Ruan. Last updated 6 years ago.

0.5 match 1.00 score
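
For readers unfamiliar with the Kuiper test mentioned in the entry above, the sketch below computes the two-sample Kuiper statistic, V = D+ + D-, from the two empirical distribution functions. It is a minimal Python illustration, not the kuiper.2samp code; the function name `kuiper_two_sample_statistic` is an assumption, and the p-value calculation is deliberately left out.

```python
from bisect import bisect_right


def kuiper_two_sample_statistic(sample_a, sample_b):
    """Return V = D+ + D-, the two-sample Kuiper statistic.

    D+ is the largest amount by which the empirical CDF of sample_a
    exceeds that of sample_b, and D- is the largest amount in the
    other direction.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")

    d_plus = 0.0
    d_minus = 0.0
    # The empirical CDFs only change at observed values, so it is
    # enough to compare them at every pooled data point.
    for x in a + b:
        f_a = bisect_right(a, x) / len(a)
        f_b = bisect_right(b, x) / len(b)
        d_plus = max(d_plus, f_a - f_b)
        d_minus = max(d_minus, f_b - f_a)
    return d_plus + d_minus


v_same = kuiper_two_sample_statistic([1, 2, 3], [1, 2, 3])
v_spread = kuiper_two_sample_statistic([1, 4], [2, 3])
```

Summing the deviations in both directions is what distinguishes this statistic from the Kolmogorov-Smirnov one, which takes only the larger of the two.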

cran

onlineCOV:Online Change Point Detection in High-Dimensional Covariance Structure

Implements a new stopping rule to detect anomalies in the covariance structure of high-dimensional online data. The detection procedure can be applied to Gaussian or non-Gaussian data with a large number of components. Moreover, it allows both spatial and temporal dependence in the data. The dependence can be estimated by a data-driven procedure. The threshold level in the stopping rule can be determined at a pre-selected average run length. More details can be found in Li, L. and Li, J. (2020) "Online Change-Point Detection in High-Dimensional Covariance Structure with Application to Dynamic Networks." <arXiv:1911.07762>.

Maintained by Jun Li. Last updated 5 years ago.

fortran

0.5 match 1.00 score