Mixture density networks

Chris Bishop

Year published: 1994
Citations: 1,273
Access: Open access

Abstract

Minimization of a sum-of-squares or cross-entropy error function leads to network outputs which approximate the conditional averages of the target data, conditioned on the input vector. For classification problems, with a suitably chosen target coding scheme, these averages represent the posterior probabilities of class membership, and so can be regarded as optimal. For problems involving the prediction of continuous variables, however, the conditional averages provide only a very limited description of the properties of the target variables. This is particularly true for problems in which the mapping to be learned is multi-valued, as often arises in the solution of inverse problems, since the average of several correct target values is not necessarily itself a correct value. In order to obtain a complete description of the data, for the purposes of predicting the outputs corresponding to new input vectors, we must model the conditional probability distribution of the target data, again conditioned on the input vector. In this paper we introduce a new class of network models obtained by combining a conventional neural network with a mixture density model. The complete system is called a Mixture Density Network, and can in principle represent arbitrary conditional probability distributions in the same way that a conventional neural network can represent arbitrary functions. We demonstrate the effectiveness of Mixture Density Networks using both a toy problem and a problem involving robot inverse kinematics.

Previously issued as NCRG/94/4288.
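The point about multi-valued mappings can be illustrated with a minimal sketch. It assumes a scalar target, Gaussian mixture components, a softmax over the mixing coefficients, and an exponential to keep the widths positive; the abstract itself does not fix these details. The names `mixture_density` and `negative_log_likelihood`, and the example output values, are hypothetical and chosen only for illustration. They stand in for the raw outputs a network might produce for one input that has two equally valid targets, -1 and +1.

```python
import math

def softmax(logits):
    """Map raw outputs to mixing coefficients that are positive and sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mixture_density(t, logits, means, log_sigmas):
    """Conditional density p(t | x) of a scalar target t, given the raw
    outputs (logits, means, log_sigmas) assumed to come from a network
    evaluated at a single input x."""
    alphas = softmax(logits)
    density = 0.0
    for alpha, mu, log_sigma in zip(alphas, means, log_sigmas):
        sigma = math.exp(log_sigma)  # exponential keeps the width positive
        kernel = math.exp(-0.5 * ((t - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        density += alpha * kernel
    return density

def negative_log_likelihood(t, logits, means, log_sigmas):
    """Per-example error term that would be minimized in place of sum-of-squares."""
    return -math.log(mixture_density(t, logits, means, log_sigmas))

# Hypothetical outputs for an input with two equally valid targets.
logits = [0.0, 0.0]
means = [-1.0, 1.0]
log_sigmas = [math.log(0.1), math.log(0.1)]

# The conditional average, which a sum-of-squares fit would approximate.
conditional_mean = sum(a * m for a, m in zip(softmax(logits), means))

density_at_mean = mixture_density(conditional_mean, logits, means, log_sigmas)
density_at_target = mixture_density(1.0, logits, means, log_sigmas)
```

Under these assumptions the conditional average falls midway between the two valid targets, where the modelled density is negligible, while the density at either true target is high. This is the sense in which the average of several correct values need not itself be a correct value.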

Keywords

Artificial neural network; Conditional probability distribution; Mathematics; Minification; Probability density function; Conditional probability; Entropy (arrow of time); Posterior probability; Algorithm; Mathematical optimization
