| [Top] | [Contents] | [Index] | [ ? ] |
This file documents the Open Optimization Library (OOL), a collection of numerical routines for optimization. It corresponds to release 0.2.0 of the library.
More information about OOL can be found at the project homepage, http://ool.sourceforge.net.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The Open Optimization Library (OOL) is a collection of routines for constrained minimization problems, implemented following the GNU Scientific Library (GSL) standards. The routines have been either translated into C or written from scratch in C, and present a modern Applications Programming Interface (API) for C programmers, allowing wrappers to be written for very high level languages. The source code is distributed under the GNU General Public License.
| 1.1 Routines available in OOL | ||
| 1.2 OOL is Free Software | ||
| 1.3 Obtaining OOL | ||
| 1.4 No Warranty | ||
| 1.5 Reporting Bugs |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The library aims to cover a wide range of methods in constrained optimization. Initially, routines are available for minimization of differentiable functions subject to simple bounds on their variables.
The use of these routines is described in this manual. Each chapter provides detailed definitions of the functions, followed by example programs and references to the articles on which the algorithms are based.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Specifically, we want to make sure that you have the right to share copies of programs that you are given which use the Open Optimization Library, that you receive their source code or else can get it if you want it, that you can change these programs or use pieces of them in new free programs, and that you know you can do these things.
To make sure that everyone has such rights, we have to forbid you to deprive anyone else of these rights. For example, if you distribute copies of any code which uses the Open Optimization Library, you must give the recipients all the rights that you have received. You must make sure that they, too, receive or can get the source code, both to the library and the code which uses it. And you must tell them their rights. This means that the library should not be redistributed in proprietary programs.
Also, for our own protection, we must make certain that everyone finds out that there is no warranty for the Open Optimization Library. If these programs are modified by someone else and passed on, we want their recipients to know that what they have is not what we distributed, so that any problems introduced by others will not reflect on our reputation.
The precise conditions for the distribution of software related to the Open Optimization Library are found in the GNU General Public License (see section GNU General Public License). Further information about this license is available from the GNU Project webpage Frequently Asked Questions about the GNU GPL,
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The preferred platform for the library is a GNU system, which allows it to take advantage of additional features in the GNU C compiler and GNU C library. However, the library is fully portable and should compile on most systems.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
If you find a bug which is not listed in these files please report it to biloti@mat.ufpr.br.
All bug reports should include:
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
| 2.1 ANSI C Compliance | ||
| 2.2 Compiling and Linking | ||
| 2.3 Shared Libraries | ||
| 2.4 Compatibility with C++ | ||
| 2.5 Aliasing of arrays | ||
| 2.6 Thread-safety |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The library is written in ANSI C and is intended to conform to the ANSI C standard. It should be portable to any system with a working ANSI C compiler.
The library does not rely on any non-ANSI extensions in the interface it exports to the user. Programs you write using OOL can be ANSI compliant. Extensions which can be used in a way compatible with pure ANSI C are supported, however, via conditional compilation. This allows the library to take advantage of compiler extensions on those platforms which support them.
When an ANSI C feature is known to be broken on a particular system the library will exclude any related functions at compile-time. This should make it impossible to link a program that would use these functions and give incorrect results.
To avoid namespace conflicts all exported function names and variables
have the prefix ool_, while exported macros have the prefix
OOL_.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
#include <ool/ool_conmin.h> |
If the directory is not installed on the standard search path of your
compiler you will also need to provide its location to the preprocessor
as a command line flag. The default location of the `ool'
directory is `/usr/local/include/ool'. A typical compilation
command for a source file `example.c' with the GNU C compiler
gcc is,
gcc -I/usr/local/include -c example.c |
gcc searches `/usr/local/include' automatically so
the -I option can be omitted when OOL is installed in its default
location.
The library is installed as a single file, `libool.a'. A shared version of the library is also installed on systems that support shared libraries. The default location of these files is `/usr/local/lib'. To link against the library you need to specify not only the main library, but also the GSL library, which is required by OOL, and a supporting CBLAS library, which provides standard basic linear algebra subroutines. A suitable CBLAS implementation is provided in the library `libgslcblas.a' if your system does not provide one. The following example shows how to link an application with the library,
gcc example.o -lool -lgsl -lgslcblas -lm |
gcc example.o -lool -lgsl -lcblas -lm |
-lcblas. The library must conform to
the CBLAS standard. The ATLAS package provides a portable
high-performance BLAS library with a CBLAS interface. It is
free software and should be installed for any work requiring fast vector
and matrix operations. The following command line will link with the
ATLAS library and its CBLAS interface,
gcc example.o -lool -lgsl -lcblas -latlas -lm |
The program ool-config provides information on the local version
of the library. For example, the following command shows that the
library has been installed under the directory `/usr/local',
bash$ ool-config --prefix /usr/local |
ool-config --help.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
LD_LIBRARY_PATH to include
the directory where the library is installed. For example, in the
Bourne shell (/bin/sh or /bin/bash), the library path can be set
with the following commands:
LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH export LD_LIBRARY_PATH ./example |
/bin/csh or /bin/tcsh) the equivalent
command is,
setenv LD_LIBRARY_PATH /usr/local/lib:$LD_LIBRARY_PATH |
To save retyping these commands each session they should be placed in an individual or system-wide login file.
To compile a statically linked version of the program, use the
-static flag in gcc,
gcc -static example.o -lool -lgsl -lgslcblas -lm |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
extern "C" linkage when included in C++ programs.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
const
arguments) then overlapping or aliased memory regions can be safely
used.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The library can be used in multi-threaded programs. All the functions
are thread-safe, in the sense that they do not use static variables.
Memory is always associated with objects and not with functions. For
functions which use workspace objects as temporary storage the
workspaces should be allocated on a per-thread basis. For functions
which use table objects as read-only memory the tables can be used
by multiple threads simultaneously. Table arguments are always declared
const in function prototypes, to indicate that they may be
safely accessed by different threads.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
To impatients, this chapter works out an example on the usage of the library. The code presented, despite its simplicity, can be used to solve real problems with minor modifications.
| 3.1 Example |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Consider the problem
| minimize ƒ(x) ≡ ∑ (xi - ai)² |
| subject to L ≤ x ≤ U |
We present a sample code for solving this problem with help of the OOL, and comment it line by line.
1: #include <stdio.h> 2: #include <ool/ool_conmin.h> 3: void iteration_echo( ool_conmin_minimizer *M ); |
Line 1 includes the C standard IO header and line 2 includes the
header for the OOL library. The last should be included in every
program that uses OOL. Line 3 simply defines the prototype of an
auxiliary function defined below. The following block defines the
objective function. The prototype for objective function should always
be like in line 4 and 5. The objective function may have parameters,
which are provided by a void pointer params. In this example
params points to a vector. To be used inside the function,
params is typecasted to gsl_vector * (line 7). For
functions depending on several parameters, params could points
to a structure defined to cluster them all.
4: double
5: fun( const gsl_vector *X, void* params )
6: {
7: gsl_vector *a = (gsl_vector *) params;
8: size_t ii, nn;
9: double ai, xi;
10: double f;
11: nn = X->size;
12: f = 0;
13: for( ii = 0; ii < nn; ii++ )
14: {
15: ai = gsl_vector_get( a, ii );
16: xi = gsl_vector_get( X, ii );
17: f += (xi - ai)*(xi - ai);
18: }
19: return f;
20: }
|
The next block implements functions to evaluate the gradient (lines 21--35), objective function and gradient (simultaneously) (36--53), and the product of Hessian by a given vector (54--66).
21: void
22: fun_df( const gsl_vector *X, void* params, gsl_vector *G )
23: {
24: gsl_vector *a = (gsl_vector *) params;
25: size_t ii, nn;
26: double ai, xi, gi;
27: nn = X->size;
28: for( ii = 0; ii < nn; ii++ )
29: {
30: ai = gsl_vector_get( a, ii );
31: xi = gsl_vector_get( X, ii );
32: gi = 2 * ( xi - ai );
33: gsl_vector_set( G, ii, gi );
34: }
35: }
36: void
37: fun_fdf( const gsl_vector *X, void* params,
38: double *f, gsl_vector *G )
39: {
40: gsl_vector *a = (gsl_vector *) params;
41: size_t ii, nn;
42: double ai, xi, gi;
43: nn = X->size;
44: (*f) = 0;
45: for( ii = 0; ii < nn; ii++ )
46: {
47: ai = gsl_vector_get( a, ii );
48: xi = gsl_vector_get( X, ii );
49: (*f) += (xi - ai)*(xi - ai);
50: gi = 2 * ( xi - ai );
51: gsl_vector_set( G, ii, gi );
52: }
53: }
54: void
55: fun_Hv( const gsl_vector *X, void *params,
56: const gsl_vector *V, gsl_vector *hv )
57: {
58: size_t ii, nn;
59: double hvi;
60: nn = X->size;
61: for( ii = 0; ii < nn; ii++ )
62: {
63: hvi = 2 * gsl_vector_get( V, ii );
64: gsl_vector_set( hv, ii, hvi);
65: }
66: }
|
Finally, comes the main routine which drives the optimization. Lines 69 and 70 define the number of variables of the problem and the limit for the number of iterations, respectively.
67: int main( void )
68: {
69: size_t nn = 100;
70: size_t nmax = 10000;
71: size_t ii;
72: int status;
|
The next two lines are the only two lines which are method dependent. They are responsible to select which optimization algorithm will be used, the SPG algorithm, in this example.
73: const ool_conmin_minimizer_type *T = ool_conmin_minimizer_spg; 74: ool_conmin_spg_parameters P; |
The next four lines declare variables to hold the objective function, the constraints, the minimizer method, and the initial iterate, respectively.
75: ool_conmin_function F; 76: ool_conmin_constraint C; 77: ool_conmin_minimizer *M; 78: gsl_vector *X; |
In line 79 a vector is declared to represent the function parameters as in the description of the problem. In the following block, this vector is initialized as a = (0.1, 0.2,..., 10)T.
79: gsl_vector *a; 80: a = gsl_vector_alloc( nn ); 81: for ( ii = 0; ii < nn; ii++) 82: gsl_vector_set( a, ii, ((double) ii + 1.0)/10.0 ); |
The function structure is filled in with the number of variables (line 83), pointers to routines to evaluate the objective function and its derivatives (84--87), and a pointer to the function parameters (88).
83: F.n = nn; 84: F.f = &fun; 85: F.df = &fun_df; 86: F.fdf = &fun_fdf; 87: F.Hv = &fun_Hv; 88: F.params = (void *) a; |
The memory allocation to store the bounds is performed in lines 90 and 91. Further the lower and upper bounds are set to -3 and 3, respectively, to all variables.
89: C.n = nn; 90: C.L = gsl_vector_alloc( C.n ); 91: C.U = gsl_vector_alloc( C.n ); 92: gsl_vector_set_all( C.L, -3.0 ); 93: gsl_vector_set_all( C.U, 3.0 ); |
These two lines allocate and set the initial iterate.
94: X = gsl_vector_alloc( nn ); 95: gsl_vector_set_all( X, 1.0 ); |
Line 96 allocate the necessary memory for an instance of the
optimization algorithm of type T (defined in line 73). Line 97
initializes its parameters to default values. If the you want to tune
the method by changing the default values to some parameter this
should be done between lines 97 and 98.
96: M = ool_conmin_minimizer_alloc( T, nn ); 97: ool_conmin_parameters_default( T, (void*)(&P) ); |
In line 98, everything is put together. It states that this instance
of the method M is responsible for minimizing function F,
subject to constraints C, starting from point X, with
parameters P.
98: ool_conmin_minimizer_set( M, &F, &C, X, (void*)(&P) ); |
We are now in position to begin iterating. The iteration counter is
initialized (line 99) and some information concerning the initial
point is displayed (101--102). The iteration loop is repeated while
the maximum number of iterations was not reached and the status
is OOL_CONTINUE. Further conditions could also be considered
(maximum number of function/gradient evaluation for example). The
iteration counter is incremented (104) and one single iteration of the
method is performed (105). In line 106 the current iterate is checked
for optimality. Finally some information concerning this iteration is
displayed.
99: ii = 0;
100: status = OOL_CONTINUE;
101: printf( "%4i : ", ii );
102: iteration_echo ( M );
103: while( ii < nmax && status == OOL_CONTINUE ){
104: ii++;
105: ool_conmin_minimizer_iterate( M );
106: status = ool_conmin_is_optimal( M );
107: printf( "%4i : ", ii );
108: iteration_echo( M );
109: }
|
The program ends displaying the convergence status (110--113), number of variables, function and gradient evaluations, the objective function value at the last iterate and the norm of its projected gradient (114--123).
110: if(status == OOL_SUCCESS)
111: printf("\nConvergence in %i iterations", ii);
112: else
113: printf("\nStopped with %i iterations", ii);
114: printf("\nvariables................: %6i"
115: "\nfunction evaluations.....: %6i"
116: "\ngradient evaluations.....: %6i"
117: "\nfunction value...........: % .6e"
118: "\nprojected gradient norm..: % .6e\n",
119: nn,
120: ool_conmin_minimizer_fcount( M ),
121: ool_conmin_minimizer_gcount( M ),
122: ool_conmin_minimizer_minimum( M ),
123: ool_conmin_minimizer_size( M ));
|
To finalize, all allocated memory is freed.
124: gsl_vector_free( C.L ); 125: gsl_vector_free( C.U ); 126: gsl_vector_free( X ); 127: gsl_vector_free( a ); 128: ool_conmin_minimizer_free( M ); 129: return OOL_SUCCESS; 130: } |
This last block codes an auxiliar function to display few elements of the current iterate and the objective value.
131: void iteration_echo( ool_conmin_minimizer *M )
132: {
133: double f = M->f;
134: size_t ii, nn;
135: nn = X->size;
136: printf( "f( " );
137: for( ii = 0; ii < 3; ii++ )
138: printf( "%+6.3e, ", gsl_vector_get( M->x, ii ) );
139: printf( "... ) = %+6.3e\n", f );
140: }
|
This code produces the output:
0 : f( +0.000e+00, +0.000e+00, +0.000e+00, ... ) = +3.384e+03 1 : f( +1.000e-02, +2.000e-02, +3.000e-02, ... ) = +2.741e+03 2 : f( +1.000e-01, +2.000e-01, +3.000e-01, ... ) = +1.168e+03 Convergence in 2 iterations variables................: 100 function evaluations.....: 3 gradient evaluations.....: 3 function value...........: 1.167950e+03 projected gradient norm..: 8.881784e-16 |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
This chapter furnishes a brief description of the theory regarding constrained minimization.
| 4.1 Theory Overview | ||
| 4.2 Optimality Conditions | ||
| 4.3 General Method Structure | ||
| 4.4 References and Further Reading |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The problem considered in the CONMIN library is the minimization of a smooth function ƒ: Rn → R subject to x ∈ Ω. The set Ω defined by the constraints is called feasile set. Briefly, our aim is to solve
min ƒ(x), s.t. x ∈ Ω
where, by this, we mean to find a local minimizer, that is, a point x∗ such that
ƒ(x∗) ≤ ƒ(x), for all x ∈ Ω near x∗.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The most general situation for the feasible set is when
Ω = { x ∈ Rn | g(x) &le 0, h(x) = 0, L ≤ x &le U }.
where g:Rn → Rng and h:Rn → Rnh. In this case, the Karush-Kuhn-Tucker (KKT) first order optimality conditions states that, if x∗ is a local minimum, then there exist λi, for i = 1,…nh and μi, for i = 1,…ng, such that
| ∇ƒ(x) + ∑λi ∇hi (x∗) + ∑μi ∇gi (x∗) = 0, |
| x∗ ∈ Ω, |
| μi ≥ 0, for i=1,…,ng. |
We say that x∗ is a critical point if x∗ solves the system above. Therefore, the critical points are candidates for the local minimum.
| 4.2.1 Convex Case | ||
| 4.2.2 Box Constraints |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Instead of finding the critical points by solving the KKT system above, methods for minimization subject to general constraints often may be built based on combinations of methods for minimization with simpler constraints. For instance, when Ω is convex and ƒ is a smooth function, the necessary optimality condition is that the projected gradient given by
gp = PΩ (x - ∇ ƒ(x) ) - x.
vanishes at the minimum, where PΩ is the ortogonal projection onto Ω. This can be seen as a particular form of the KKT optimality conditions. The critical points x∗, in this case, are those such that gp(x∗) = 0.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The simplest convex case, regarding the constraints, is when the feasible set Ω is given by
Ω = { x ∈ Rn | L ≤ x &le U }.
We say that the minimization is subject to bound, simple bound or box constraints. Note that the projection onto the feasible set is simply a box projection, and is given by
(PΩ(x))i = min{ max{ Li, xi }, Ui }.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
| 4.3.1 Line Search |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
A common component of optimization methods are line search strategies. By this we mean that once given the current point xc, and a descent direction d, we look for λ such that
ƒ(xc + λ d) &le ƒ(xc).
However, if the decreasing achieved at the inequality above for some λ is too small, it is not possible to guarantee convergence to a local minimum. To prevent this, we require that the trial λ to satisfy the Armijo "sufficient decrease" condition given byƒ(xc + λ d) ≤ ƒ(xc) + λ γ ∇ƒ(xc)T d,
where γ ∈ (0,1). It says that we must choose a λ such that ƒ(x+) for the trial point x+ = xc + λ d is smaller than the damped linear model. In the figure below, the interval for λ where the condition holds is [0,λmax]. Clearly, we can rewrite it alternatively asƒ(x+) ≤ ƒ(xc) + γ ∇ƒ(xc)T (x+ - xc),
where we avoid explicitly writing λ and d.

The nonmonotone Armijo condition to accept a trial point x+ is
ƒ(x+) ≤ max0 ≤ j ≤ min{ k, M-1} ƒ(xk-j) + γ ∇ƒ(xk)T (x+ - xk),
where M is an integer greater than zero. If x+ is rejected, the steplength is redefined asλ ← max{ σ1 λ, min{ σ2 λ, &lambda∗ } }
where λ∗ minimizes a quadratic model.| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
This chapter describes the common interfaces for constrained minimization of arbitrary multidimensional functions. The library provides implementations and reimplementations of methods published in well known optimization and applied mathematics journals. The library follows, as close as possible, the GSL standards and tries to use the GSL routines, specially those ones for optimization, described at chapters 31, 32, 33, 34, 35 and 36 of the GSL Reference Manual, version 1.4.
In order to fit the same standards of the GSL multidimensional minimization, only one iteration, for each method, is implemented. This means that in case of reimplemented codes, to obtain the same results of the original version, the user must provide a main routine that calls the OOL method and checks for optimality, as well as for stopping criteria, exactly as in the original implementation.
The header file `ool_conmin.h' contains prototypes for the minimization functions and related declarations.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Initially the user must define two variables: T indicates the
method used and P stands for its parameters. For example, if you
choose to use the Projected Gradient Methods (pgrad), the code lines
are
/* Method structs */ const ool_conmin_minimizer_type *T = ool_conmin_minimizer_pgrad; ool_conmin_pgrad_parameters P; |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The following function initializes a multidimensional constrained minimizer. The minimizer itself depends only on the dimension of the problem and the algorithm and can be reused for different problems.
OOL_ENOMEM.
printf ("s is a '%s' minimizer\n",
ool_conmin_minimizer_name (s));
|
s is a 'pgrad' minimizer.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
You must provide a parametric function of variables for the minimizer to operate on. You may also need to provide a routine which calculates the gradient of the function and a third routine which calculates both the function value and the gradient together. Some methods may require a routine to calculate the Hessian times a given vector. In order to allow for general parameters the functions are defined by the following data type:
size_t n
double (* f) (const gsl_vector * x, const void * params)
void (* df) (const gsl_vector * x, const void * params, gsl_vector * g)
void (* fdf) (const gsl_vector * x, const void * params, double * f, gsl_vector * g)
void (* Hv) (const gsl_vector *x, const void * params, const gsl_vector * V, gsl_vector * hv )
void * params
double
quadratic( const gsl_vector *X, void* params )
{
size_t ii, nn;
double f, x;
nn = X->size;
f = 0;
for( ii = 0; ii < nn; ii++ )
{
x = gsl_vector_get( X, ii );
f += (ii+1) * gsl_pow_2( x );
}
return f;
}
void
quadratic_df( const gsl_vector *X, void* params, gsl_vector *G )
{
size_t ii, nn;
double xx, gg;
nn = X->size;
for( ii = 0; ii < nn; ii++ )
{
xx = gsl_vector_get( X, ii );
gg = 2.0 * (ii+1) * xx;
gsl_vector_set( G, ii, gg );
}
}
void
quadratic_fdf( const gsl_vector *X, void* params,
double *f, gsl_vector *G )
{
size_t ii, ip1, nn;
double xx, gg;
nn = X->size;
(*f) = 0;
for( ii = 0; ii < nn; ii++ )
{
ip1 = ii + 1;
xx = gsl_vector_get( X, ii );
(*f) += ip1 * gsl_pow_2( xx );
gg = 2.0 * ip1 * xx;
gsl_vector_set( G, ii, gg );
}
}
void
quadratic_Hv( const gsl_vector *X, void *params,
const gsl_vector *V, gsl_vector *hv )
{
size_t ii, nn;
double aux;
nn = X->size;
for( ii = 0; ii < nn; ii++ )
{
aux = 2.0 * (ii+1) * gsl_vector_get( V, ii );
gsl_vector_set( hv, ii, aux );
}
}
|
The function structure can be initialized using the following code,
ool_conmin_function F; F.n = nn; F.f = &quadratic; F.df = &quadratic_df; F.fdf = &quadratic_fdf; F.Hv = &quadratic_Hv; F.params = NULL; |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
You must provide the lower and upper bound of the box constraints for the minimizers to operate on.
size_t n
gsl_vector * L
gsl_vector * U
The constraints can be initialized using the following code,
ool_conmin_constraint C; C.n = nn; C.L = gsl_vector_alloc( nn ); C.U = gsl_vector_alloc( nn ); gsl_vector_set_all( C.L, 1.0 ); gsl_vector_set_all( C.U, 10.0 ); |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The following function drives the iteration of each algorithm. The function performs one iteration to update the state of the minimizer. The same function works for all minimizers so that different methods can be substituted at runtime without modifications to the code.
The method parameters can be changed between the iterations, using the functions
The minimizer maintains a current best estimate of the minimum at all times. This information can be accessed through the following auxiliary functions,
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
A minimization procedure should stop when one of the following conditions is true:
The handling of these conditions is under user control. The functions below permit the user to test the precision of the current result.
OOL_SUCCESS if the following condition is achieved, and
OOL_CONTINUE otherwise.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Presently, there are only algorithms for simple bound constraints. They use the value of the function and most of its gradient at each evaluation point, too.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
This chapter describes the routine PGRAD, for constrained minimization of differentiable multidimensional functions, which implements the Projected Gradient Method.
| 6.1 Overview | ||
| 6.2 Parameters | ||
| 6.3 Stopping Criteria | ||
| 6.4 Example | ||
| 6.5 References and Further Reading |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The problem solved by PGRAD is the minimization of a smooth function with bounds on the variables (see section 4. Brief Theoretical Introduction and 4.2.2 Box Constraints).
The Projected Gradient method is a natural extension of the Steepest Descent method for unconstrained minimization. At each step a line search is performed over direction of projected gradient. This method is not inteded to be used for production. For further details, see the reference at the end of this section.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Inside PGRAD, there are few parameters which can be choosen to improve the minimizing efforts. The parameter list, together with the default values are given below.
= 1.0e-4
= -1e+99
= 1.0e-4
= 0.1
= 0.9
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The algorithm declares convergence whenever the infinite norm of projected gradient is less than or equal to a given tolerance.
The infinite norm of the projected gradient at the current iterate can
be retrieved by calling ool_conmin_minimizer_size() function.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
For an example of PGRAD, see section 3. Quick Start, replacing lines 73 and 75 of that code by
const ool_conmin_minimizer_type *T = ool_conmin_minimizer_pgrad; ool_conmin_pgrad_parameters P; |
The output of the program is shown below.
0 : f( +0.000e+00, +0.000e+00, +0.000e+00, ... ) = +3.384e+03 1 : f( +2.000e-01, +4.000e-01, +6.000e-01, ... ) = +1.190e+03 2 : f( +1.800e-01, +3.600e-01, +5.400e-01, ... ) = +1.182e+03 3 : f( +1.640e-01, +3.280e-01, +4.920e-01, ... ) = +1.177e+03 4 : f( +1.512e-01, +3.024e-01, +4.536e-01, ... ) = +1.174e+03 5 : f( +1.410e-01, +2.819e-01, +4.229e-01, ... ) = +1.172e+03 6 : f( +1.328e-01, +2.655e-01, +3.983e-01, ... ) = +1.170e+03 7 : f( +1.262e-01, +2.524e-01, +3.786e-01, ... ) = +1.169e+03 8 : f( +1.210e-01, +2.419e-01, +3.629e-01, ... ) = +1.169e+03 9 : f( +1.168e-01, +2.336e-01, +3.503e-01, ... ) = +1.169e+03 10 : f( +1.134e-01, +2.268e-01, +3.403e-01, ... ) = +1.168e+03 11 : f( +1.107e-01, +2.215e-01, +3.322e-01, ... ) = +1.168e+03 12 : f( +1.086e-01, +2.172e-01, +3.258e-01, ... ) = +1.168e+03 13 : f( +1.069e-01, +2.137e-01, +3.206e-01, ... ) = +1.168e+03 14 : f( +1.055e-01, +2.110e-01, +3.165e-01, ... ) = +1.168e+03 15 : f( +1.044e-01, +2.088e-01, +3.132e-01, ... ) = +1.168e+03 16 : f( +1.035e-01, +2.070e-01, +3.106e-01, ... ) = +1.168e+03 17 : f( +1.028e-01, +2.056e-01, +3.084e-01, ... ) = +1.168e+03 18 : f( +1.023e-01, +2.045e-01, +3.068e-01, ... ) = +1.168e+03 19 : f( +1.018e-01, +2.036e-01, +3.054e-01, ... ) = +1.168e+03 20 : f( +1.014e-01, +2.029e-01, +3.043e-01, ... ) = +1.168e+03 21 : f( +1.012e-01, +2.023e-01, +3.035e-01, ... ) = +1.168e+03 22 : f( +1.009e-01, +2.018e-01, +3.028e-01, ... ) = +1.168e+03 23 : f( +1.007e-01, +2.015e-01, +3.022e-01, ... ) = +1.168e+03 24 : f( +1.006e-01, +2.012e-01, +3.018e-01, ... ) = +1.168e+03 25 : f( +1.005e-01, +2.009e-01, +3.014e-01, ... ) = +1.168e+03 26 : f( +1.004e-01, +2.008e-01, +3.011e-01, ... ) = +1.168e+03 27 : f( +1.003e-01, +2.006e-01, +3.009e-01, ... ) = +1.168e+03 28 : f( +1.002e-01, +2.005e-01, +3.007e-01, ... ) = +1.168e+03 29 : f( +1.002e-01, +2.004e-01, +3.006e-01, ... ) = +1.168e+03 30 : f( +1.002e-01, +2.003e-01, +3.005e-01, ... ) = +1.168e+03 31 : f( +1.001e-01, +2.002e-01, +3.004e-01, ... ) = +1.168e+03 32 : f( +1.001e-01, +2.002e-01, +3.003e-01, ... ) = +1.168e+03 33 : f( +1.001e-01, +2.002e-01, +3.002e-01, ... ) = +1.168e+03 34 : f( +1.001e-01, +2.001e-01, +3.002e-01, ... ) = +1.168e+03 35 : f( +1.001e-01, +2.001e-01, +3.002e-01, ... ) = +1.168e+03 36 : f( +1.000e-01, +2.001e-01, +3.001e-01, ... ) = +1.168e+03 37 : f( +1.000e-01, +2.001e-01, +3.001e-01, ... ) = +1.168e+03 38 : f( +1.000e-01, +2.001e-01, +3.001e-01, ... ) = +1.168e+03 39 : f( +1.000e-01, +2.000e-01, +3.001e-01, ... ) = +1.168e+03 40 : f( +1.000e-01, +2.000e-01, +3.000e-01, ... ) = +1.168e+03 41 : f( +1.000e-01, +2.000e-01, +3.000e-01, ... ) = +1.168e+03 42 : f( +1.000e-01, +2.000e-01, +3.000e-01, ... ) = +1.168e+03 43 : f( +1.000e-01, +2.000e-01, +3.000e-01, ... ) = +1.168e+03 44 : f( +1.000e-01, +2.000e-01, +3.000e-01, ... ) = +1.168e+03 45 : f( +1.000e-01, +2.000e-01, +3.000e-01, ... ) = +1.168e+03 46 : f( +1.000e-01, +2.000e-01, +3.000e-01, ... ) = +1.168e+03 47 : f( +1.000e-01, +2.000e-01, +3.000e-01, ... ) = +1.168e+03 48 : f( +1.000e-01, +2.000e-01, +3.000e-01, ... ) = +1.168e+03 Convergence in 48 iterations variables................: 100 function evaluations.....: 96 gradient evaluations.....: 49 function value...........: 1.167950e+03 projected gradient norm..: 8.362779e-05 |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
This chapter describes the routine SPG for constrained minimization of differentiable multidimensional functions on convex sets.
| 7.1 Overview | ||
| 7.2 Parameters | ||
| 7.3 Stopping Criteria | ||
| 7.4 Example | ||
| 7.5 References and Further Reading |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The Spectral Projected Gradient Method is designed to minimize differentiable function on convex sets. The SPG Method is similar to the Projected Gradient Method, differing only by the choice of the spectral steplength, which accelerates the convergence of the method. Despite this similarity, it can be shown that the SPG Method is related to the quasi-Newton family of methods.
This implementation deals with box constrained (see section 4. Brief Theoretical Introduction and 4.2.2 Box Constraints).
The SPG routine implements the version SPG2 of the Nonmonotone Spectral Projected Gradient Method, as published in Birgin et al. (2000), which differs from the classical SPG method by the nonmonotone Armijo sufficient decrease condition to accept a trial point (see section 4.3.1 Line Search). This method is a competitive option for large problems, since it has low memory requirements.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Inside SPG, there is a set of parameters which can be changed to improve the minimizing efforts. The parameter list, together with the default values are given below.
= -1.0e+99
= -1.0e-4
= 10
= 1.0e+30
= 1.0e-30
= 1.0e-4
= 0.1
= 0.9
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The method declares convergence whenever the function value is equal to or less than a prescribed value, through the fmin parameter, or the inifinte norm of projected gradient is less than the tolerance defined in tol parameter.
The infinite norm of the projected gradient at the current iterate can
be retrieved by calling ool_conmin_minimizer_size() function.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
See section 3. Quick Start, for an example program which employs SPG.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
This chapter describes the routine GENCAN for constrained minimization of arbitrary multidimensional functions.
| 8.1 Overview | ||
| 8.2 Parameters | ||
| 8.3 Stopping Criteria | ||
| 8.4 Example | ||
| 8.5 References and Further Reading |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The problem solved by GENCAN is the minimization of a smooth function with bounds on the variables (see section 4. Brief Theoretical Introduction and 4.2.2 Box Constraints).
Suppose that an iterate is inside a given face of (the feasible set) and consider to be the projection of inside this face. The main algorithm then performs the test , where . If false, the new point must remain in the face and an iteration of a truncated Newton method is performed (only for the free variables). However, if that condition holds, some constraint must be abandoned and the new point is computed doing one iteration of the "Spectral Projected Gradient" method. For further details, see the reference at the end of this section.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Inside GENCAN, there is a set of parameters which can be changed to improve the minimizing efforts. The parameter list, together with the default values are given below.
= 1.0e-5
= 1.0e-5
= -1.0e+99
= -1
delmin, 0.1 max( 1, )) is
used if udelta0 is non-positive.
= -1
= -1
ucgmia and ucgmib are both set
to positive values, the maximum number of iterations is set as max(1,
ucgmia * nind + ucgmib), where nind states for the number
of variables of the subproblem. Otherwise, thr default value for the
maximum number of iterations is a linear function of the infinite norm
of the projected gradient, which ranges from max(1, 10 *
log(nind)) to nind, as the current iterate gets closer to
the solution.
= 1
= epsgpen
= 1.0e-1
= 1.0e-5
cg_scre states for Conjugate Gradient Stopping Criterion
Relation, and cg_gpnf states for Conjugate Gradient Projected
Gradient Final Norm. Both are related to a stopping criterion of
Conjugate Gradients. This stopping criterion depends on the norm of
the residual of the linear system. The norm of the residual should be
less or equal than a small quantity which decreases as the iterate
gets closer to the solution of the minimization problem. Then, the log
of the required accuracy requested to Conjugate Gradient has a linear
dependence on the log of the norm of the continuous projected
gradient. This linear relation uses the squared Euclidian norm of the
projected gradient if cg_scre is equal to 1 and uses the
infinite norm if cg_scre is equal to 2. In addition, the
precision required to CG is equal to cg_epsi (conjugate gradient
initial epsilon) at the initial iterate and cg_epsf (conjugate
gradient final epsilon) when the Euclidian- or infinite norm of the
projected gradient is equal to cg_gpnf (conjugate gradients
projected gradient final norm) which is an estimation of the value of
the Euclidian- or infinite norm of the projected gradient at the
solution. In case cg_scre is equal to 2, one suggests that
cg_gpnf be equal to epsgpsn.
= 1.0e-4
= 5
epsnqmp of the best progress during
maxitnqmp consecutive iterations then CG is stopped declaring
"not enough progress of the quadratic model".
= 0
nearlyq = 1 if the objective function is nearly quadratic.
When an iteration of the CG finds a direction such that
then, depending on nearlyq, CG does: if
nearlyq=0, it stops at current point or if nearly=1 it takes
the direction and tries to go to the boundary choosing the
best among the two points at the boundary and the current one.
= 2.0
sigma1
and sigma2. Sometimes we take as a new trial step the previous
one divided by nint.
= 2.0
= 4
mininterp
interpolations, the steplength is too small. In that case, failure of
the line search is declared. It is possible that the direction is not
a descent direction due to an error in the gradient calculations.
= 100
= 0
trtype = 0 means Euclidian norm
trust-region and trtype = 1 means infinite-norm trust-region.
= 0.9
eta) times the norm of the continuous projected gradient. Using
the default value is a rather conservative strategy in the sense that
internal iterations are preferred over SPG iterations.
= 0.1
= 1.0e-10
= 1.0e+10
lspgmi,lspgma].
= 1.0e-6
= 1.0e-4
= 0.5
beta . If satisfies the Armijo condition but does not satisfy the beta
condition then the point is accepted, but if it satisfied the Armijo
condition and also satisfies the beta condition then we know that
there is the possibility for a successful extrapolation.
= 0.1
= 0.9
nint.
= 1.0e-7
= 1.0e-10
= 1.0e+20
= 1.0e+99
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
In the original paper, some criteria are used to detect if the iteration process should stop. They are (in the same order they appear)
The infinite norm of the projected gradient at the current iterate can
be retrieved by calling ool_conmin_minimizer_size() function.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] |