Fit Neural Networks
Description
Fit single-hidden-layer neural network, possibly with skip-layer connections.
Usage
nnet(x, ...)
nnet(formula, data, weights, ...,
subset, na.action, contrasts = NULL)
nnet(x, y, weights, size, Wts, mask,
linout = FALSE, entropy = FALSE, softmax = FALSE,
censored = FALSE, skip = FALSE, rang = 0.7, decay = 0,
maxit = 100, Hess = FALSE, trace = TRUE, MaxNWts = 1000,
abstol = 1.0e-4, reltol = 1.0e-8, ...)
Arguments
formula 
A formula of the form class ~ x1 + x2 + ...

x 
matrix or data frame of x values for examples.

y 
matrix or data frame of target values for examples.

weights 
(case) weights for each example – if missing defaults to 1.

size 
number of units in the hidden layer. Can be zero if there are skip-layer units.

data 
Data frame from which variables specified in formula are
preferentially to be taken.

subset 
An index vector specifying the cases to be used in the training
sample. (NOTE: If given, this argument must be named.)

na.action 
A function to specify the action to be taken if NAs are found.
The default action is for the procedure to fail. An alternative is
na.omit, which leads to rejection of cases with missing values on
any required variable. (NOTE: If given, this argument must be named.)

contrasts 
a list of contrasts to be used for some or all of
the factors appearing as variables in the model formula.

Wts 
initial parameter vector. If missing chosen at random.

mask 
logical vector indicating which parameters should be optimized (default all).

linout 
switch for linear output units. Default logistic output units.

entropy 
switch for entropy (= maximum conditional likelihood) fitting.
Default by least-squares.

softmax 
switch for softmax (log-linear model) and maximum conditional
likelihood fitting. linout, entropy, softmax and censored are mutually
exclusive.

censored 
A variant on softmax, in which non-zero targets mean possible
classes. Thus for softmax a row of (0, 1, 1) means one example
each of classes 2 and 3, but for censored it means one example whose
class is only known to be 2 or 3.

skip 
switch to add skip-layer connections from input to output.

rang 
Initial random weights on [-rang, rang]. Value about 0.5 unless the
inputs are large, in which case it should be chosen so that
rang * max(|x|) is about 1.

decay 
parameter for weight decay. Default 0.

maxit 
maximum number of iterations. Default 100.

Hess 
If true, the Hessian of the measure of fit at the best set of weights
found is returned as component Hessian.

trace 
switch for tracing optimization. Default TRUE.

MaxNWts 
The maximum allowable number of weights. There is no intrinsic limit
in the code, but increasing MaxNWts will probably allow fits that
are very slow and time-consuming.

abstol 
Stop if the fit criterion falls below abstol, indicating an
essentially perfect fit.

reltol 
Stop if the optimizer is unable to reduce the fit criterion by a
factor of at least 1 - reltol.

... 
arguments passed to or from other methods.
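
As an illustration of how size, skip, MaxNWts and rang interact, the following is a minimal sketch. The helper n_weights is hypothetical (it is not part of nnet); it assumes one bias per hidden and output unit, which is the usual parameter count for this architecture. The rang line simply applies the rule of thumb given under rang above.

```r
library(nnet)

# Hypothetical helper (not part of nnet): number of weights for a network with
# p inputs, `size` hidden units and k outputs, assuming one bias per unit.
n_weights <- function(p, size, k, skip = FALSE) {
  (p + 1) * size + (size + 1) * k + if (skip) p * k else 0
}

x <- as.matrix(iris[, 1:4])      # 4 inputs
y <- class.ind(iris$Species)     # 3 indicator columns as targets

# Compare this count with MaxNWts (default 1000) before fitting.
n_weights(ncol(x), size = 2, k = ncol(y))

# Rule of thumb from the rang argument: rang * max(|x|) is about 1.
rang <- 1 / max(abs(x))

set.seed(1)
fit <- nnet(x, y, size = 2, softmax = TRUE, rang = rang,
            decay = 5e-4, maxit = 50, trace = FALSE)

# The fitted weight vector should have the length computed above.
length(fit$wts)
```

If the computed count exceeds MaxNWts, either reduce size or raise MaxNWts explicitly in the call.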

Details
If the response in formula is a factor, an appropriate classification
network is constructed; this has one output and entropy fit if the
number of levels is two, and a number of outputs equal to the number
of classes and a softmax output stage for more levels. If the
response is not a factor, it is passed on unchanged to nnet.default.
Optimization is done via the BFGS method of optim.
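
The effect of a factor response on the output layer can be checked from the raw predictions. The sketch below is only illustrative: it uses the built-in iris data, and the object names (two, fit2, fit3) are made up for the example. It looks at the number of columns returned by predict with type = "raw" rather than at any internal component.

```r
library(nnet)
set.seed(1)

# Two-level factor response: keep two species and drop the unused level.
two <- droplevels(subset(iris, Species != "setosa"))
fit2 <- nnet(Species ~ ., data = two, size = 2, decay = 5e-4,
             maxit = 100, trace = FALSE)
p2 <- predict(fit2, two, type = "raw")
ncol(p2)            # a single output unit is expected

# Three-level factor response: one output per class with a softmax stage.
fit3 <- nnet(Species ~ ., data = iris, size = 2, decay = 5e-4,
             maxit = 100, trace = FALSE)
p3 <- predict(fit3, iris, type = "raw")
ncol(p3)            # one column per class is expected
range(rowSums(p3))  # softmax outputs should sum to 1 in each row
```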
Value
object of class "nnet" or "nnet.formula".
Mostly internal structure, but has components
wts 
the best set of weights found

value 
value of fitting criterion plus weight decay term.

fitted.values 
the fitted values for the training data.

residuals 
the residuals for the training data.

convergence 
1 if the maximum number of iterations was reached, otherwise 0.
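
A short sketch of inspecting these components after a fit follows. The small maxit is chosen only so that the iteration limit is likely to be hit; the data and settings are illustrative. The Hessian component is present because Hess = TRUE is passed, as described under the Hess argument.

```r
library(nnet)
set.seed(1)

fit <- nnet(Species ~ ., data = iris, size = 2, decay = 5e-4,
            maxit = 5, Hess = TRUE, trace = FALSE)

fit$convergence          # 1 if the fit stopped because maxit was reached
fit$value                # fitting criterion plus weight decay term
length(fit$wts)          # number of weights in the best set found
dim(fit$fitted.values)   # one row per training case, one column per output
dim(fit$residuals)       # same shape as the fitted values
dim(fit$Hessian)         # square matrix, one row and column per weight
```

When convergence is 1, refitting with a larger maxit is usually the first thing to try.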

References
Ripley, B. D. (1996)
Pattern Recognition and Neural Networks. Cambridge.
Venables, W. N. and Ripley, B. D. (2002)
Modern Applied Statistics with S. Fourth edition. Springer.
See Also
predict.nnet, nnetHess
Examples
library(nnet)

# use half the iris data for training
ir <- rbind(iris3[,,1], iris3[,,2], iris3[,,3])
targets <- class.ind( c(rep("s", 50), rep("c", 50), rep("v", 50)) )
samp <- c(sample(1:50,25), sample(51:100,25), sample(101:150,25))
ir1 <- nnet(ir[samp,], targets[samp,], size = 2, rang = 0.1,
            decay = 5e-4, maxit = 200)

# cross-tabulate true against predicted classes
test.cl <- function(true, pred) {
    true <- max.col(true)
    cres <- max.col(pred)
    table(true, cres)
}
# evaluate on the cases not used for training
test.cl(targets[-samp,], predict(ir1, ir[-samp,]))

# or, using the formula interface
ird <- data.frame(rbind(iris3[,,1], iris3[,,2], iris3[,,3]),
                  species = factor(c(rep("s",50), rep("c", 50), rep("v", 50))))
ir.nn2 <- nnet(species ~ ., data = ird, subset = samp, size = 2, rang = 0.1,
               decay = 5e-4, maxit = 200)
table(ird$species[-samp], predict(ir.nn2, ird[-samp,], type = "class"))