
An adaptive Gauss-Newton algorithm for training multilayer nonlinear filters that have embedded memory


Abstract

We describe herein a new means of training dynamic multilayer nonlinear adaptive filters, or neural networks. We restrict our discussion to multilayer dynamic Volterra networks, which are structured so as to restrict their degrees of computational freedom, based on a priori knowledge about the dynamic operation to be emulated. The networks consist of linear dynamic filters together with nonlinear generalized single-layer subnets. We describe how a Newton-like optimization strategy can be applied to these dynamic architectures and detail a new modified Gauss-Newton optimization technique. The new training algorithm converges faster, and to a smaller value of the cost function, than backpropagation-through-time for a wide range of adaptive filtering applications. We apply the algorithm to modeling the inverse of a nonlinear dynamic tracking system, and demonstrate its superior performance over standard techniques.
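For orientation, the sketch below shows a generic damped Gauss-Newton iteration for a nonlinear least-squares problem in Python. It is a minimal illustration of the underlying update, not the authors' modified algorithm: the paper's handling of embedded memory, its multilayer Volterra structure, and its specific modifications are not reproduced here, and all function names and the toy model are illustrative assumptions.

```python
import numpy as np

def gauss_newton(residual, jacobian, w0, n_iter=50, damping=1e-6):
    """Generic damped Gauss-Newton iteration for the least-squares cost
    J(w) = 0.5 * ||r(w)||^2.  `residual` maps weights -> residual vector,
    `jacobian` maps weights -> dr/dw (shape: n_samples x n_weights).
    (Illustrative sketch only; not the paper's modified algorithm.)"""
    w = np.asarray(w0, dtype=float)
    for _ in range(n_iter):
        r = residual(w)   # error at each sample
        J = jacobian(w)   # sensitivity of each error to each weight
        # Normal equations with a small Levenberg-style damping term
        # for numerical stability: (J^T J + damping*I) dw = -J^T r
        H = J.T @ J + damping * np.eye(w.size)
        g = J.T @ r
        w = w - np.linalg.solve(H, g)
    return w

# Toy example (assumed, not from the paper): fit y = a*(1 - exp(-b*x)).
rng = np.random.default_rng(0)
x = np.linspace(0.1, 4.0, 40)
y = 2.0 * (1.0 - np.exp(-1.3 * x)) + 0.01 * rng.standard_normal(x.size)

def residual(w):
    a, b = w
    return a * (1.0 - np.exp(-b * x)) - y

def jacobian(w):
    a, b = w
    return np.column_stack([1.0 - np.exp(-b * x),      # dr/da
                            a * x * np.exp(-b * x)])   # dr/db

w_hat = gauss_newton(residual, jacobian, w0=[1.0, 1.0])
print(w_hat)  # should approach [2.0, 1.3]
```

The key point, which carries over to the paper's setting, is that Gauss-Newton replaces the exact Hessian with the positive semidefinite approximation J^T J, giving curvature information at roughly first-order cost; this is why such methods typically converge in far fewer iterations than purely gradient-based schemes like backpropagation-through-time.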




Additional information

This work was supported by the Stanford Gravity Probe B project under NASA contract NAS 8-36125.


Cite this article

Rabinowitz, M., Gutt, G.M. & Franklin, G.F. An adaptive Gauss-Newton algorithm for training multilayer nonlinear filters that have embedded memory. Circuits, Systems and Signal Processing 18, 407–429 (1999). https://doi.org/10.1007/BF01200791
