---
title: "NeuralNetwork"
language: "en"
type: "Method"
summary: "NeuralNetwork (Machine Learning Method) Method for Classify and Predict. Models class probabilities or predicts the value distribution using a neural network. A neural network consists of stacked layers, each performing a simple computation. Information is processed layer by layer from the input layer to the output layer. The neural network is trained to minimize a loss function on the training set using gradient descent. The following options can be given: The option NetworkDepth controls the capacity of the network. A deeper network will be able to fit more complex patterns but will be more prone to overfitting. The option MaxTrainingRounds can be used to speed up the training but also as a regularization parameter: setting a lower value can prevent overfitting."
keywords: 
- Machine Learning
- Prediction
- Classification
- Neural Networks
- Layers
- Activation function
canonical_url: "https://reference.wolfram.com/language/ref/method/NeuralNetwork.html"
source: "Wolfram Language Documentation"
---
[EXPERIMENTAL]

# "NeuralNetwork" (Machine Learning Method)

* Method for ``Classify`` and ``Predict``.

* Models class probabilities or predicts the value distribution using a neural network.

---

## Details & Suboptions

* A neural network consists of stacked layers, each performing a simple computation. Information is processed layer by layer from the input layer to the output layer. The neural network is trained to minimize a loss function on the training set using gradient descent.
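The same idea can be written out explicitly with the framework-level neural network functions. The following is a minimal sketch, not the internal implementation used by ``Classify`` and ``Predict``; the layer size and training data are arbitrary, chosen only for illustration:

```wl
(* a small network: two stacked linear layers with a Ramp nonlinearity between them *)
net = NetChain[{LinearLayer[16], Ramp, LinearLayer[1]},
   "Input" -> "Scalar", "Output" -> "Scalar"];

(* NetTrain adjusts the layer weights by stochastic gradient descent
   to minimize a loss function on the training set *)
trained = NetTrain[net, {1 -> 1., 2 -> 2., 3 -> 3., 4 -> 4.}]
```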

* The following options can be given:

| option | default value | description |
| ------------------ | --------- | --------------------------------------------- |
| MaxTrainingRounds  | Automatic | maximum number of iterations over the dataset |
| "NetworkDepth"     | Automatic | the depth of the network                      |

* The option ``"NetworkDepth"`` controls the capacity of the network. A deeper network will be able to fit more complex patterns but will be more prone to overfitting.

* The option ``MaxTrainingRounds`` can be used to speed up training, but it also acts as a regularization parameter: setting a lower value can prevent overfitting.
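Both suboptions can be combined in a single ``Method`` specification. A hypothetical example (the depth and round count here are illustrative, not recommended defaults):

```wl
(* a shallow network trained for a limited number of rounds,
   trading capacity for robustness to overfitting *)
Predict[data, Method -> {"NeuralNetwork", "NetworkDepth" -> 2, MaxTrainingRounds -> 50}]
```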

---

## Examples (4)

### Basic Examples (2)

Train a classifier function on labeled examples:

```wl
In[1]:= c = Classify[{1, 2, 3, 4} -> {"A", "A", "B", "B"}, Method -> "NeuralNetwork"]

Out[1]= ClassifierFunction[ ... ]
```

Obtain information about the classifier:

```wl
In[2]:= Information[c]

Out[2]= MachineLearning`MLInformationObject[ ... ]
```

Classify a new example:

```wl
In[3]:= c[1.3]

Out[3]= "A"
```

---

Generate some data and visualize it:

```wl
In[1]:=
data = Table[x -> x + RandomVariate[NormalDistribution[0, 2]], {x, RandomReal[{-5, 5}, 100]}];
ListPlot[List@@@data]

Out[1]= [image]
```

Train a predictor function on it:

```wl
In[2]:= p = Predict[data, Method -> "NeuralNetwork"]

Out[2]= PredictorFunction[ ... ]
```

Compare the data with the predicted values and look at the standard deviation:

```wl
In[3]:=
Show[
  Plot[
    {p[x],
     p[x] + StandardDeviation[p[x, "Distribution"]],
     p[x] - StandardDeviation[p[x, "Distribution"]]},
    {x, -5, 5},
    PlotStyle -> {Blue, Gray, Gray},
    Filling -> {2 -> {3}},
    Exclusions -> False,
    PerformanceGoal -> "Speed",
    PlotLegends -> {"Prediction", "Confidence Interval"}],
  ListPlot[List @@@ data, PlotStyle -> Red, PlotLegends -> {"Data"}]]

Out[3]= [image]
```

### Options (2)

#### MaxTrainingRounds (1)

Generate a training set and visualize it:

```wl
In[1]:=
trainingdata = Table[x -> Sin[x] + RandomVariate[NormalDistribution[0, .3]], {x, RandomReal[{-5, 5}, 50]}];
ListPlot[List@@@trainingdata]

Out[1]= [image]
```

Train three predictors using different values of ``MaxTrainingRounds`` and compare their performance on the training set:

```wl
In[2]:= {p1, p2, p3} = Predict[trainingdata, Method -> {"NeuralNetwork", MaxTrainingRounds -> #}]& /@ {20, 100, 500};

In[3]:= Show[ListPlot[List@@@trainingdata], Plot[{p1[x], p2[x], p3[x]}, {x, -10, 10}, PlotLegends -> {20, 100, 500}]]

Out[3]= [image]
```

#### "NetworkDepth" (1)

Use the ``"NetworkDepth"`` suboption to specify the number of layers in the neural network:

```wl
In[1]:= pr1 = Predict[{1, 2, 3, 4, 5} -> {1, 2, 3, 4, 5}, Method -> {"NeuralNetwork", "NetworkDepth" -> 1}]

Out[1]= PredictorFunction[ ... ]
```

Train a second ``PredictorFunction`` with a larger ``"NetworkDepth"``:

```wl
In[2]:= pr4 = Predict[{1, 2, 3, 4, 5} -> {1, 2, 3, 4, 5}, Method -> {"NeuralNetwork", "NetworkDepth" -> 4}]

Out[2]= PredictorFunction[ ... ]
```

Plot the mean prediction:

```wl
In[3]:= Show[Plot[{pr1[x], pr4[x]}, {x, 0, 6}], ListPlot[Transpose[{{1, 2, 3, 4, 5}, {1, 2, 3, 4, 5}}]]]

Out[3]= [image]
```

## See Also

* [`Classify`](https://reference.wolfram.com/language/ref/Classify.en.md)
* [`Predict`](https://reference.wolfram.com/language/ref/Predict.en.md)
* [`NetTrain`](https://reference.wolfram.com/language/ref/NetTrain.en.md)
* [`ClassifierFunction`](https://reference.wolfram.com/language/ref/ClassifierFunction.en.md)
* [`PredictorFunction`](https://reference.wolfram.com/language/ref/PredictorFunction.en.md)
* [`ClassifierMeasurements`](https://reference.wolfram.com/language/ref/ClassifierMeasurements.en.md)
* [`PredictorMeasurements`](https://reference.wolfram.com/language/ref/PredictorMeasurements.en.md)
* [`SequencePredict`](https://reference.wolfram.com/language/ref/SequencePredict.en.md)
* [`ClusterClassify`](https://reference.wolfram.com/language/ref/ClusterClassify.en.md)
* [`DecisionTree`](https://reference.wolfram.com/language/ref/method/DecisionTree.en.md)
* [`LinearRegression`](https://reference.wolfram.com/language/ref/method/LinearRegression.en.md)
* [`LogisticRegression`](https://reference.wolfram.com/language/ref/method/LogisticRegression.en.md)
* [`GaussianProcess`](https://reference.wolfram.com/language/ref/method/GaussianProcess.en.md)
* [`GradientBoostedTrees`](https://reference.wolfram.com/language/ref/method/GradientBoostedTrees.en.md)
* [`Markov`](https://reference.wolfram.com/language/ref/method/Markov.en.md)
* [`NaiveBayes`](https://reference.wolfram.com/language/ref/method/NaiveBayes.en.md)
* [`NearestNeighbors`](https://reference.wolfram.com/language/ref/method/NearestNeighbors.en.md)
* [`RandomForest`](https://reference.wolfram.com/language/ref/method/RandomForest.en.md)
* [`SupportVectorMachine`](https://reference.wolfram.com/language/ref/method/SupportVectorMachine.en.md)

## Related Links

* [An Elementary Introduction to the Wolfram Language: Machine Learning](https://www.wolfram.com/language/elementary-introduction/22-machine-learning.html)

## History

* [Introduced in 2014 (10.0)](https://reference.wolfram.com/language/guide/SummaryOfNewFeaturesIn100.en.md) \| [Updated in 2017 (11.2)](https://reference.wolfram.com/language/guide/SummaryOfNewFeaturesIn112.en.md)