function [ iters, endpt ] = hooke(nvars, startpt, rho, epsilon, itermax)
% HOOKE - Performs Nonlinear Optimization using the algorithm
% of Hooke and Jeeves ... see below for references.
%
% Copyright (c) 2004-2007 John L. Weatherwax.
% All rights reserved.
%
% This is a MATLAB(TM) version of the C routine written by M. G. Johnson.
%
% Redistribution and use in source and binary forms, with or without
% modification, are permitted provided that the following conditions
% are met:
%
% 1. Redistributions of source code must retain the above copyright
% notice, this list of conditions and the following disclaimer.
% 2. Redistributions in binary form must reproduce the above copyright
% notice, this list of conditions and the following disclaimer in the
% documentation and/or other materials provided with the distribution.
%
% THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
% ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
% IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
% ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
% FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
% DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
% OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
% HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
% LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
% OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
% SUCH DAMAGE.
%
%
% Find a point X where the nonlinear function f(X) has a local minimum.
% X is an n-vector and f(X) is a scalar. In mathematical notation
% f: R^n -> R^1. The objective function f() is NOT required to be
% continuous. Nor does f() need to be differentiable. The program
% does not use or require derivatives of f().
%
% The software user supplies three things: a subroutine that
% computes f(X), an initial "starting guess" of the minimum point
% X, and values for the algorithm convergence parameters. Then
% the program searches for a local minimum, beginning from the
% starting guess, using the Direct Search algorithm of Hooke and
% Jeeves.
%
% The C program was adapted from the Algol pseudocode found in:
%
% "Algorithm 178: Direct Search"
% by Arthur F. Kaupe Jr.,
% Communications of the ACM, Vol 6. p.313 (June 1963).
%
% It includes the improvements suggested by
%
% Bell and Pike (CACM v.9, p. 684, Sept 1966)
%
% and those of
%
% Tomlin and Smith, "Remark on Algorithm 178" (CACM v.12).
%
% The original paper, which I don't recommend as highly as the one
% by A. Kaupe, is:
%
% R. Hooke and T. A. Jeeves,
% "Direct Search Solution of Numerical and Statistical Problems",
% Journal of the ACM, Vol. 8, April 1961, pp. 212-229.
%
% Calling sequence (MATLAB version):
% [ iters, endpt ] = hooke(nvars, startpt, rho, epsilon, itermax)
%
% nvars {an integer} This is the number of dimensions
% in the domain of f(). It is the number of
% coordinates of the starting point (and the
% minimum point).
% startpt {a vector of doubles} This is the user-
% supplied guess at the minimum.
% endpt {a vector of doubles; returned} This is the
% location of the local minimum, calculated by
% the program.
% iters {an integer; returned} The number of outer
% iterations performed before the search halted.
% rho {a double} This is a user-supplied convergence
% parameter (more detail below), which should be
% set to a value between 0.0 and 1.0. Larger
% values of rho give greater probability of
% convergence on highly nonlinear functions, at a
% cost of more function evaluations. Smaller
% values of rho reduce the number of evaluations
% (and the program running time), but increases
% the risk of nonconvergence. See below.
% epsilon {a double} This is the criterion for halting
% the search for a minimum. When the algorithm
% begins to make less and less progress on each
% iteration, it checks the halting criterion: if
% the stepsize is below epsilon, terminate the
% iteration and return the current best estimate
% of the minimum. Larger values of epsilon (such
% as 1.0e-4) give quicker running time, but a
% less accurate estimate of the minimum. Smaller
% values of epsilon (such as 1.0e-7) give longer
% running time, but a more accurate estimate of
% the minimum.
% itermax {an integer} A second, rarely used, halting
% criterion. If the algorithm uses >= itermax
% iterations, halt.
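%
% Example usage (a minimal sketch; the objective is whatever
% function the body below actually calls, here banana()):
%
%   nvars   = 2;
%   startpt = [ -1.2, 1.0 ];   % an example starting guess
%   rho     = 0.5;
%   epsilon = 1e-6;
%   itermax = 5000;
%   [ iters, endpt ] = hooke( nvars, startpt, rho, epsilon, itermax );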
%
%
% The user-supplied objective function should return a scalar
% double. In this MATLAB version it is called with a single
% argument: x, the point (a vector with nvars coordinates) at
% which f(x) should be evaluated. (The C original also passed n,
% the number of coefficients being fitted, as a second argument.)
%
% rho, the algorithm convergence control
% The algorithm works by taking "steps" from one estimate of
% a minimum, to another (hopefully better) estimate. Taking
% big steps gets to the minimum more quickly, at the risk of
% "stepping right over" an excellent point. The stepsize is
% controlled by a user supplied parameter called rho. At each
% iteration, the stepsize is multiplied by rho (0 < rho < 1),
% so the stepsize is successively reduced.
% Small values of rho correspond to big stepsize changes,
% which make the algorithm run more quickly. However, there
% is a chance (especially with highly nonlinear functions)
% that these big changes will accidentally overlook a
% promising search vector, leading to nonconvergence.
% Large values of rho correspond to small stepsize changes,
% which force the algorithm to carefully examine nearby points
% instead of optimistically forging ahead. This improves the
% probability of convergence.
% The stepsize is reduced until it is equal to (or smaller
% than) epsilon. So the number of iterations performed by
% Hooke-Jeeves is determined by rho and epsilon:
% rho^(number_of_iterations) = epsilon
% In general it is a good idea to set rho to an aggressively
% small value like 0.5 (hoping for fast convergence). Then,
% if the user suspects that the reported minimum is incorrect
% (or perhaps not accurate enough), the program can be run
% again with a larger value of rho such as 0.85, using the
% result of the first minimization as the starting guess to
% begin the second minimization.
% Normal use: (1) Code your function f() in MATLAB
% (2) Install your starting guess {or read it in}
% (3) Run the program
% (4) {for the skeptical}: Use the computed minimum
% as the starting point for another run
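%
% For instance, the rerun in step (4) might look like this
% (hypothetical parameter values, feeding the first result back
% in as the second starting guess):
%
%   [ it1, x1 ] = hooke( nvars, startpt, 0.5,  1e-6, 5000 );
%   [ it2, x2 ] = hooke( nvars, x1,      0.85, 1e-6, 5000 );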
% Data Fitting:
% Code your function f() to be the sum of the squares of the
% errors (differences) between the computed values and the
% measured values. Then minimize f() using Hooke-Jeeves.
% EXAMPLE: you have 20 datapoints (ti, yi) and you want to
% find A,B,C such that (A*t*t) + (B*exp(t)) + (C*tan(t))
% fits the data as closely as possible. Then f() is just
% f(x) = SUM_i ( measured_y[i] - (A*t[i]*t[i] + B*exp(t[i])
% + C*tan(t[i])) )^2
% where x[] is a 3-vector consisting of {A, B, C}.
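%
% A minimal MATLAB sketch of such an objective (the data vectors
% t and measured_y are hypothetical and must be supplied; you
% would call this in place of banana() in the body below):
%
%   function fval = sum_sq_err( x )
%     global t measured_y   % the 20 datapoints (ti, yi)
%     model = x(1)*t.^2 + x(2)*exp(t) + x(3)*tan(t);
%     fval  = sum( (measured_y - model).^2 );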
%
% The original author of this software was M. G. Johnson.
% Permission to use, copy, modify, and distribute this software
% for any purpose without fee is hereby granted, provided that
% this entire notice is included in all copies of any software
% which is or includes a copy or modification of this software
% and in all copies of the supporting documentation for such
% software. THIS SOFTWARE IS BEING PROVIDED "AS IS", WITHOUT
% ANY EXPRESS OR IMPLIED WARRANTY. IN PARTICULAR, NEITHER THE
% AUTHOR NOR AT&T MAKE ANY REPRESENTATION OR WARRANTY OF ANY
% KIND CONCERNING THE MERCHANTABILITY OF THIS SOFTWARE OR ITS
% FITNESS FOR ANY PARTICULAR PURPOSE.
%
% Matlab Version Written by:
% --
% John L. Weatherwax 2004-10-08
%
% email: wax@alum.mit.edu
%
% Please send comments and especially bug reports to the
% above email address.
%
%-----
newx = startpt;
xbefore = startpt;
% initial coordinate step sizes: a fraction rho of each coordinate,
% with zero coordinates given a step of rho so no direction is frozen
delta = abs( rho*startpt );
delta( delta==0.0 ) = rho;
iadj = 0; % retained from the C original; not otherwise used
steplength = rho;
iters = 0;
% NOTE: the objective function is hard-coded as banana() here (and
% presumably inside best_nearby); replace it with your own f().
%fbefore = f(newx);
fbefore = banana(newx);
newf = fbefore;
while( (iters < itermax) && (steplength > epsilon) )
iters=iters+1;
iadj=iadj+1;
% Find best new point, one coord at a time
newx = xbefore;
[ newf, delta, newx ] = best_nearby(delta, newx, fbefore, nvars);
% if we made some improvements, pursue that direction
keep = 1;
while( (newf < fbefore) && (keep == 1) )
iadj = 0;
for i=1:nvars,
% firstly, arrange the sign of delta()
if( newx(i) <= xbefore(i) )
delta(i) = -abs(delta(i));
else
delta(i) = +abs(delta(i));
end
% now, move further in this direction (the "pattern move":
% newx = 2*newx - xbefore, reflecting the old point through the new)
tmp = xbefore(i);
xbefore(i) = newx(i);
newx(i) = newx(i)+newx(i)-tmp;
end
fbefore = newf;
[ newf, delta, newx ] = best_nearby(delta, newx, fbefore, nvars);
% if the further (optimistic) move was bad....
if( newf >= fbefore ) break; end
% make sure that the differences between the new
% and the old points are due to actual displacements;
% beware of roundoff errors that might cause newf < fbefore
keep = 0;
for i=1:nvars,
keep=1;
if( abs(newx(i)-xbefore(i)) > 0.5*abs(delta(i)) )
break;
else
keep=0;
end
end % End for...
end % End inner while...
if( (steplength >= epsilon) && (newf >= fbefore) )
steplength = steplength * rho;
delta = rho*delta;
end
end
endpt = xbefore;
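
%-----
% The routine above calls two helpers that are not defined in this
% file. The sketches below are assumptions, supplied so the file can
% run stand-alone: best_nearby is reconstructed from the best_nearby
% routine in M. G. Johnson's C original, and banana() is presumed to
% be Rosenbrock's classic "banana" test function. Replace either
% with your own versions as appropriate.

function [ minf, delta, point ] = best_nearby( delta, point, prevbest, nvars )
% BEST_NEARBY - given a point and a vector of coordinate step sizes,
% probe each coordinate direction in turn, keeping any improvement.
minf = prevbest;
z = point;
for i=1:nvars,
  z(i) = point(i) + delta(i);
  ftmp = banana(z);
  if( ftmp < minf )
    minf = ftmp;
  else
    % no luck; try stepping the other way along this coordinate
    delta(i) = -delta(i);
    z(i) = point(i) + delta(i);
    ftmp = banana(z);
    if( ftmp < minf )
      minf = ftmp;
    else
      z(i) = point(i);
    end
  end
end
point = z;

function f = banana( x )
% BANANA - Rosenbrock's "banana" function; minimum f=0 at x=[1,1].
f = 100*( x(2) - x(1)^2 )^2 + ( 1 - x(1) )^2;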