Title: | Feed-Forward Neural Networks and Multinomial Log-Linear Models |
---|---|
Description: | Software for feed-forward neural networks with a single hidden layer, and for multinomial log-linear models. |
Authors: | Brian Ripley [aut, cre, cph], William Venables [cph] |
Maintainer: | Brian Ripley <[email protected]> |
License: | GPL-2 \| GPL-3 |
Version: | 7.3-19 |
Built: | 2024-06-15 17:27:55 UTC |
Source: | CRAN |
Generates a class indicator function from a given factor.
class.ind(cl)
cl |
factor or vector of classes for cases. |
a matrix which is zero except for the column corresponding to the class.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
# The function is currently defined as
class.ind <- function(cl)
{
    n <- length(cl)
    cl <- as.factor(cl)
    x <- matrix(0, n, length(levels(cl)))
    x[(1:n) + n*(unclass(cl)-1)] <- 1
    dimnames(x) <- list(names(cl), levels(cl))
    x
}
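As a small illustration of the indicator matrix described above, the following sketch applies class.ind to a made-up four-case factor (the data are hypothetical) and checks that each row carries exactly one 1.

```r
library(nnet)

# A small factor with three classes (hypothetical data for illustration).
cl <- factor(c("a", "b", "a", "c"))

# One row per case, one column per class level;
# a single 1 marks the class of each case.
ind <- class.ind(cl)
print(ind)

# Each row sums to 1, and max.col() recovers the class index of each case.
rowSums(ind)
levels(cl)[max.col(ind)]
```

A matrix of this shape is what nnet expects as the target y when fitting a classification network through the default (non-formula) method.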
Fits multinomial log-linear models via neural networks.
multinom(formula, data, weights, subset, na.action,
         contrasts = NULL, Hess = FALSE, summ = 0, censored = FALSE,
         model = FALSE, ...)
formula |
a formula expression as for regression models, of the form response ~ predictors. |
data |
an optional data frame in which to interpret the variables occurring in formula. |
weights |
optional case weights in fitting. |
subset |
expression saying which subset of the rows of the data should be used in the fit. All observations are included by default. |
na.action |
a function to filter missing data. |
contrasts |
a list of contrasts to be used for some or all of the factors appearing as variables in the model formula. |
Hess |
logical for whether the Hessian (the observed/expected information matrix) should be returned. |
summ |
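integer; if non-zero, summarize by deleting duplicate rows and adjusting weights. Methods 1 and 2 differ in speed (2 uses C). |
censored |
If Y is a matrix with K columns, interpret the entries as one for possible classes and zero for impossible classes, rather than as counts. |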
model |
logical. If true, the model frame is saved as component model of the returned object. |
... |
additional arguments for nnet. |
multinom calls nnet. The variables on the rhs of the formula should be roughly scaled to [0,1] or the fit will be slow or may not converge at all.
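The scaling advice above can be sketched as follows. This is only an illustration on the built-in iris data: the helper rescale01 is a hypothetical name introduced here, and min-max scaling is one of several reasonable ways to bring predictors roughly onto [0,1].

```r
library(nnet)

# Hypothetical helper: min-max rescaling of a numeric vector to [0, 1].
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))

# Rescale every numeric column, leaving the factor response untouched.
iris01 <- iris
num <- vapply(iris01, is.numeric, logical(1))
iris01[num] <- lapply(iris01[num], rescale01)

# Fit the multinomial log-linear model on the rescaled predictors.
# trace = FALSE is passed on to nnet to silence the iteration log.
fit <- multinom(Species ~ Sepal.Length + Sepal.Width,
                data = iris01, trace = FALSE)
fit
```

Coefficients from such a fit refer to the rescaled variables, so they need to be interpreted on that scale.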
A nnet object with additional components:
deviance |
the residual deviance, compared to the full saturated model (that explains individual observations exactly). Also, minus twice log-likelihood. |
edf |
the (effective) number of degrees of freedom used by the model |
AIC |
the AIC for this fit. |
Hessian |
(if Hess is true). |
model |
(if model is true). |
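The components listed above can be inspected directly on a fitted object. The sketch below uses the built-in iris data as a stand-in; the check that AIC equals the deviance plus twice edf is an assumption based on the usual definition of AIC, not something stated in this page.

```r
library(nnet)

# Fit with Hess = TRUE and model = TRUE so the optional components are kept.
fit <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris,
                Hess = TRUE, model = TRUE, trace = FALSE)

fit$deviance   # residual deviance (minus twice the log-likelihood)
fit$edf        # effective degrees of freedom
fit$AIC        # AIC for this fit

# Assumed relationship (usual AIC definition): AIC = deviance + 2 * edf.
all.equal(fit$AIC, fit$deviance + 2 * fit$edf)

# Optional components, present only because they were requested.
dim(fit$Hessian)
head(fit$model)
```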
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
oc <- options(contrasts = c("contr.treatment", "contr.poly"))
library(MASS)
example(birthwt)
(bwt.mu <- multinom(low ~ ., bwt))
options(oc)
Fit single-hidden-layer neural network, possibly with skip-layer connections.
nnet(x, ...)

## S3 method for class 'formula'
nnet(formula, data, weights, ...,
     subset, na.action, contrasts = NULL)

## Default S3 method:
nnet(x, y, weights, size, Wts, mask,
     linout = FALSE, entropy = FALSE, softmax = FALSE,
     censored = FALSE, skip = FALSE, rang = 0.7, decay = 0,
     maxit = 100, Hess = FALSE, trace = TRUE, MaxNWts = 1000,
     abstol = 1.0e-4, reltol = 1.0e-8, ...)
formula |
A formula of the form class ~ x1 + x2 + ... |
x |
matrix or data frame of x values for examples. |
y |
matrix or data frame of target values for examples. |
weights |
(case) weights for each example – if missing defaults to 1. |
size |
number of units in the hidden layer. Can be zero if there are skip-layer units. |
data |
Data frame from which variables specified in formula are preferentially to be taken. |
subset |
An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.) |
na.action |
A function to specify the action to be taken if NAs are found. |
contrasts |
a list of contrasts to be used for some or all of the factors appearing as variables in the model formula. |
Wts |
initial parameter vector. If missing chosen at random. |
mask |
logical vector indicating which parameters should be optimized (default all). |
linout |
switch for linear output units. Default logistic output units. |
entropy |
switch for entropy (= maximum conditional likelihood) fitting. Default by least-squares. |
softmax |
switch for softmax (log-linear model) and maximum conditional likelihood fitting. |
censored |
A variant on softmax, in which non-zero targets mean possible classes. |
skip |
switch to add skip-layer connections from input to output. |
rang |
Initial random weights on [-rang, rang]. |
decay |
parameter for weight decay. Default 0. |
maxit |
maximum number of iterations. Default 100. |
Hess |
If true, the Hessian of the measure of fit at the best set of weights found is returned as component Hessian. |
trace |
switch for tracing optimization. Default TRUE. |
MaxNWts |
The maximum allowable number of weights. There is no intrinsic limit in the code, but increasing MaxNWts will probably allow fits that are very slow and time-consuming. |
abstol |
Stop if the fit criterion falls below abstol, indicating an essentially perfect fit. |
reltol |
Stop if the optimizer is unable to reduce the fit criterion by a factor of at least 1 - reltol. |
... |
arguments passed to or from other methods. |
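To illustrate the size and skip arguments together, the sketch below fits a network with no hidden units and skip-layer connections only, using the default method with a class.ind target. The data set and seed are arbitrary choices for illustration; with 4 inputs and 3 outputs such a network should have (4 + 1) * 3 = 15 weights (one bias plus four input connections per output).

```r
library(nnet)

set.seed(42)  # initial weights are random, so fix the seed for repeatability

x <- as.matrix(iris[, 1:4])     # 4 inputs
y <- class.ind(iris$Species)    # 3 indicator columns as targets

# size = 0 is allowed because skip = TRUE supplies direct
# input-to-output connections; softmax = TRUE gives a log-linear model.
net0 <- nnet(x, y, size = 0, skip = TRUE, softmax = TRUE,
             maxit = 200, trace = FALSE)

length(net0$wts)   # number of weights in the network
net0$n             # units per layer: inputs, hidden, outputs
```

With no hidden layer the fitted model is linear in the inputs on the softmax scale, which is essentially the structure multinom builds on.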
If the response in formula is a factor, an appropriate classification network is constructed; this has one output and entropy fit if the number of levels is two, and a number of outputs equal to the number of classes and a softmax output stage for more levels. If the response is not a factor, it is passed on unchanged to nnet.default.
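The two cases described above can be compared on a fitted object. This is a sketch on the iris data; it relies on the components n, entropy and softmax of the returned object, which are part of the mostly internal structure and are used here only for inspection.

```r
library(nnet)

set.seed(1)

# Two-level factor response: drop one species and its unused level.
two <- droplevels(subset(iris, Species != "setosa"))
net2 <- nnet(Species ~ ., data = two, size = 2, trace = FALSE)

# Three-level factor response: the full iris data.
net3 <- nnet(Species ~ ., data = iris, size = 2, trace = FALSE)

# n holds the number of input, hidden and output units.
list(two_levels   = c(outputs = net2$n[3],
                      entropy = net2$entropy, softmax = net2$softmax),
     three_levels = c(outputs = net3$n[3],
                      entropy = net3$entropy, softmax = net3$softmax))
```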
Optimization is done via the BFGS method of optim.
An object of class "nnet" or "nnet.formula". Mostly internal structure, but it has the components:
wts |
the best set of weights found |
value |
value of fitting criterion plus weight decay term. |
fitted.values |
the fitted values for the training data. |
residuals |
the residuals for the training data. |
convergence |
1 if the maximum number of iterations was reached, otherwise 0. |
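A short sketch of how the components listed above might be inspected after a fit. The iris data, seed and tuning values are illustrative choices; a 4-2-3 network without skip-layer connections should have (4 + 1) * 2 + (2 + 1) * 3 = 19 weights.

```r
library(nnet)

set.seed(1)
net <- nnet(Species ~ ., data = iris, size = 2,
            decay = 5e-4, maxit = 200, trace = FALSE)

length(net$wts)          # number of fitted weights
net$value                # fitting criterion plus weight decay term
dim(net$fitted.values)   # one row per training case, one column per class
dim(net$residuals)       # same shape as the fitted values
net$convergence          # convergence indicator
```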
Ripley, B. D. (1996) Pattern Recognition and Neural Networks. Cambridge.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
# use half the iris data
ir <- rbind(iris3[,,1],iris3[,,2],iris3[,,3])
targets <- class.ind( c(rep("s", 50), rep("c", 50), rep("v", 50)) )
samp <- c(sample(1:50,25), sample(51:100,25), sample(101:150,25))
ir1 <- nnet(ir[samp,], targets[samp,], size = 2, rang = 0.1,
            decay = 5e-4, maxit = 200)
test.cl <- function(true, pred) {
    true <- max.col(true)
    cres <- max.col(pred)
    table(true, cres)
}
test.cl(targets[-samp,], predict(ir1, ir[-samp,]))

# or
ird <- data.frame(rbind(iris3[,,1], iris3[,,2], iris3[,,3]),
        species = factor(c(rep("s",50), rep("c", 50), rep("v", 50))))
ir.nn2 <- nnet(species ~ ., data = ird, subset = samp, size = 2, rang = 0.1,
               decay = 5e-4, maxit = 200)
table(ird$species[-samp], predict(ir.nn2, ird[-samp,], type = "class"))
Evaluates the Hessian (matrix of second derivatives) of the specified neural network. Normally called via argument Hess=TRUE to nnet or via vcov.multinom.
nnetHess(net, x, y, weights)
net |
object of class "nnet" as returned by nnet. |
x |
training data. |
y |
classes for training data. |
weights |
the (case) weights used in the nnet fit. |
square symmetric matrix of the Hessian evaluated at the weights stored in the net.
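The following sketch checks the shape and symmetry of the returned matrix and compares it with the Hessian component stored when the network is fitted with Hess = TRUE. The data, seed and tuning values are illustrative, and the expectation that the two matrices agree is an assumption that follows from the description of Hess = TRUE rather than a documented guarantee.

```r
library(nnet)

set.seed(1)
x <- as.matrix(iris[, 1:4])
y <- class.ind(iris$Species)

# Fit with Hess = TRUE so the Hessian is also stored on the object.
net <- nnet(x, y, size = 2, rang = 0.1, decay = 5e-4,
            maxit = 200, Hess = TRUE, trace = FALSE)

# Evaluate the Hessian at the weights stored in the net.
H <- nnetHess(net, x, y)

dim(H)           # one row and column per weight
isSymmetric(H)   # symmetric up to numerical tolerance

# Compare with the stored component (assumed to be the same quantity).
all.equal(unname(H), unname(net$Hessian))

# A positive smallest eigenvalue suggests a local minimum of the criterion.
min(eigen(H, symmetric = TRUE)$values)
```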
Ripley, B. D. (1996) Pattern Recognition and Neural Networks. Cambridge.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
# use half the iris data
ir <- rbind(iris3[,,1], iris3[,,2], iris3[,,3])
targets <- matrix(c(rep(c(1,0,0),50), rep(c(0,1,0),50), rep(c(0,0,1),50)),
                  150, 3, byrow=TRUE)
samp <- c(sample(1:50,25), sample(51:100,25), sample(101:150,25))
ir1 <- nnet(ir[samp,], targets[samp,], size=2, rang=0.1, decay=5e-4, maxit=200)
eigen(nnetHess(ir1, ir[samp,], targets[samp,]), TRUE)$values
Predict new examples by a trained neural net.
## S3 method for class 'nnet'
predict(object, newdata, type = c("raw","class"), ...)
object |
an object of class "nnet" as returned by nnet. |
newdata |
matrix or data frame of test examples. A vector is considered to be a row vector comprising a single case. |
type |
Type of output |
... |
arguments passed to or from other methods. |
This function is a method for the generic function predict() for class "nnet". It can be invoked by calling predict(x) for an object x of the appropriate class, or directly by calling predict.nnet(x) regardless of the class of the object.
If type = "raw", the matrix of values returned by the trained network; if type = "class", the corresponding class (which is probably only useful if the net was generated by nnet.formula).
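The difference between the two output types can be seen in a short sketch. The iris data, the seed and the train/test split are illustrative choices; because the response has three levels, the network uses a softmax output stage, so each row of the raw output should sum to 1.

```r
library(nnet)

set.seed(1)
train <- c(sample(1:50, 25), sample(51:100, 25), sample(101:150, 25))

net <- nnet(Species ~ ., data = iris, subset = train, size = 2,
            decay = 5e-4, maxit = 200, trace = FALSE)

# type = "raw": matrix of network outputs, one column per class.
raw <- predict(net, iris[-train, ], type = "raw")
head(raw)
range(rowSums(raw))   # rows sum to 1 under softmax

# type = "class": predicted class labels for each test case.
cls <- predict(net, iris[-train, ], type = "class")
table(observed = iris$Species[-train], predicted = cls)
```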
Ripley, B. D. (1996) Pattern Recognition and Neural Networks. Cambridge.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
# use half the iris data
ir <- rbind(iris3[,,1], iris3[,,2], iris3[,,3])
targets <- class.ind( c(rep("s", 50), rep("c", 50), rep("v", 50)) )
samp <- c(sample(1:50,25), sample(51:100,25), sample(101:150,25))
ir1 <- nnet(ir[samp,], targets[samp,], size = 2, rang = 0.1,
            decay = 5e-4, maxit = 200)
test.cl <- function(true, pred) {
    true <- max.col(true)
    cres <- max.col(pred)
    table(true, cres)
}
test.cl(targets[-samp,], predict(ir1, ir[-samp,]))

# or
ird <- data.frame(rbind(iris3[,,1], iris3[,,2], iris3[,,3]),
        species = factor(c(rep("s",50), rep("c", 50), rep("v", 50))))
ir.nn2 <- nnet(species ~ ., data = ird, subset = samp, size = 2, rang = 0.1,
               decay = 5e-4, maxit = 200)
table(ird$species[-samp], predict(ir.nn2, ird[-samp,], type = "class"))
Find the maximum position in a vector, breaking ties at random.
which.is.max(x)
x |
a vector |
Ties are broken at random.
index of a maximal value.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
max.col, which.max which takes the first of ties.
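The contrast with which.max can be sketched on a small vector with a tied maximum (the vectors are made up for illustration). The tie-breaking is random, so repeated calls may return different positions, while which.max always returns the first.

```r
library(nnet)

x <- c(1, 3, 3)   # the maximum 3 occurs at positions 2 and 3

which.max(x)      # always the first maximal position

# Repeated calls pick among the tied positions at random.
set.seed(1)
picks <- replicate(200, which.is.max(x))
table(picks)

# With a unique maximum there is nothing to break, so the answer is fixed.
which.is.max(c(1, 5, 2))
```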
## Not run: ## this is incomplete
pred <- predict(nnet, test)
table(true, apply(pred, 1, which.is.max))
## End(Not run)