Stochastic optimization algorithms (FFR105, FIM711), LP1 (first quarter), 2018

Lecturer and examiner: Professor Mattias Wahde, Tel: 772 3727, email: mattias.wahde@chalmers.se
Course assistants:
Sina Torabi, email: sina.torabi@chalmers.se
Luca Caltagirone, email: luca.caltagirone@chalmers.se
Literature: Wahde, M., Biologically Inspired Optimization Methods: An Introduction, WIT Press, 2008.
Note: The book is sold by Chalmers' bookstore (Cremona). It is also possible to buy the book online (at Amazon and many other bookstores). Note, however, that Cremona offers the book at a reduced price but has only a limited supply.
Preliminary program:

Date        Time         Room        Contents
2018-09-04  10.00–11.45  HB2         Course introduction and motivation (pp. 1–8); Classical optimization methods (introduction, pp. 8–12)
2018-09-05  08.00–09.45  HB4         Classical optimization methods (i) (pp. 12–21; Appendix B.1, pp. 173–174)
2018-09-07  10.00–11.45  HB2         Classical optimization methods (ii) (pp. 21–34); Handout of programming problem
2018-09-11  10.00–11.45  FB          Evolutionary algorithms: background and introduction (pp. 35–45, 82–83); Handout of home problem 1
2018-09-11  18.00–21.00  MT9, 11–13  Introduction to Matlab programming for stochastic optimization algorithms
2018-09-12  08.00–09.45  HB4         Evolutionary algorithms: components of EAs (pp. 46–59)
2018-09-14  10.00–11.45  KE          Evolutionary algorithms: properties (pp. 59–71; Appendix B.2, pp. 174–183); Hand-in of programming problem
2018-09-18  10.00–11.45  FB          Classical optimization methods and evolutionary algorithms (review, problem solving, Q&A, etc.)
2018-09-19  08.00–09.45  HB4         Linear genetic programming and interactive evolutionary computation (pp. 72–81)
2018-09-21  10.00–11.45  FB          Neural networks (Appendix A, pp. 151–172); data analysis (Appendix C, pp. 193–204)
2018-09-25  10.00–11.45  HB1         Evolutionary algorithms: Applications I (various papers etc.); Hand-in of home problem 1
2018-09-26  08.00–09.45  –           No lecture
2018-09-28  10.00–11.45  FB          Evolutionary algorithms: Applications II (various papers etc.); Handout of home problem 2
2018-10-02  10.00–11.45  FB          Ant colony optimization: background and introduction (pp. 99–106)
2018-10-03  08.00–09.45  HB4         Ant colony optimization: AS vs. MMAS, applications, properties of ACO (pp. 107–116; Appendix B.3, pp. 183–187)
2018-10-05  10.00–11.45  FB          Particle swarm optimization: background and introduction (pp. 117–124)
2018-10-09  10.00–11.45  FB          Particle swarm optimization: properties of PSO, applications (pp. 124–138)
2018-10-10  08.00–09.45  HB4         Problem-solving class (various problems), review
2018-10-12  10.00–11.45  –           No lecture
2018-10-16  10.00–11.45  FB          Performance comparison (EAs, ACO, PSO) (pp. 139–149)
2018-10-17  08.00–09.45  HB4         Applications of biologically inspired optimization algorithms in autonomous robots, vehicles, and software agents; Hand-in of home problem 2
2018-10-19  10.00–11.45  FB          Course summary
2018-10-23  10.00–11.45  My office   Consultation (exam preparation)
2018-10-24  08.00–09.45  –           No lecture
Slides, other handouts, home problems, web links, demonstration programs, and videos:
2018-09-04  Course memo (kurspm)
2018-09-04  List of misprints in the book
2018-09-04  Slides from Lecture 1
2018-09-04  Quiz from Lecture 1
2018-09-05  Slides from Lecture 2
2018-09-05  Quiz from Lecture 2
2018-09-07  Matlab coding standard (Note: Follow this standard carefully!)
2018-09-07  Introductory programming problem (Note! Strict deadline: 2018-09-14)
2018-09-07  Slides from Lecture 3
2018-09-07  Quiz from Lecture 3
2018-09-11  Slides from Lecture 4
2018-09-11  Quiz from Lecture 4
2018-09-11  Checklist for home problem submission. Note: Read carefully before submitting HP1 (and HP2)!
2018-09-11  Home problem 1. Note: Strict deadline, 2018-09-25; see also the course memo above, p. 3
2018-09-11  Introduction to Matlab programming for stochastic optimization (study carefully if you did not attend the evening session)
2018-09-12  Slides from Lecture 5
2018-09-12  Quiz from Lecture 5
2018-09-14  Slides from Lecture 6
2018-09-14  Quiz from Lecture 6
2018-09-18  Slides from Lecture 7 (no quiz today). Note: For general feedback regarding the IPP, see the last slides.
2018-09-19  Slides from Lecture 8
2018-09-19  Quiz from Lecture 8
2018-09-21  Slides from Lecture 9
2018-09-21  Quiz from Lecture 9
2018-09-25  Slides from Lecture 10 (no quiz today)
2018-09-28  Home problem 2. Note: Strict deadline, 2018-10-17
2018-09-28  TSPGraphics.zip (needed for HP2.1)
2018-09-28  LoadCityLocations.m (needed for HP2.1)
2018-09-28  AntSystem.m (needed for HP2.1c)
2018-09-28  GetSlopeAngle.m (needed as a starting point for HP2.3)
2018-09-28  LoadFunctionData.m (needed for HP2.4)
2018-09-28  StructExample.zip (struct usage illustration, for HP2.4)
2018-09-28  Slides from Lecture 11 (no quiz today)
2018-10-02  Slides from Lecture 12
2018-10-02  Quiz from Lecture 12
2018-10-03  Slides from Lecture 13
2018-10-03  Quiz from Lecture 13
2018-10-05  Slides from Lecture 14
2018-10-05  Quiz from Lecture 14
2018-10-09  Exam kit (three exams with solutions)
2018-10-09  Slides from Lecture 15
2018-10-09  Quiz from Lecture 15
2018-10-10  Slides from Lecture 16 (no quiz today)
2018-10-16  Slides from Lecture 17
2018-10-16  Quiz from Lecture 17
2018-10-17  Slides from Lecture 18 (no quiz today)
Examination and grade requirements:
Exam: 2018-10-31 (14.00–18.00). Maximum score: 25p.
Home problems: Maximum (total) score: 25p.
Grade requirements: A minimum of 10p is required on the exam for a
passing grade. In addition, you must solve the introductory programming
problem, and (as a minimum) provide satisfactory solutions to the
mandatory home problems (which will be marked on the problem sheets).
The requirements for the various grades are specified in the course
memo and can be summarized as follows (the numbers refer to the sum of
the exam result and the results on the two home problems, maximum 50p
in total):
Chalmers:
5: Total score in [42,50]
4: Total score in [33,41.5]
3: Total score up to 32.5 (but note also the minimum requirements listed above)
GU:
VG: Total score in [39,50]
G: Total score up to 38.5 (but note also the minimum requirements listed above)
ECTS grades: A=5, B=4, C=3, D = weak 3 (total score below 25).
FAQ:
Q1. Will you
send a confirmation regarding the home problem submissions?
A1. For the Introductory programming problem (IPP) we will
not, in general, but for HP1 and HP2 you will get a confirmation
when the assistants start correcting your solutions. In any case,
always keep the home problem submissions in your Sent box. In the very unlikely
case that I do not receive an email that you send, it is good to
have a copy. Also important: preferably use your Chalmers email
address, i.e. "...@student.chalmers.se" (or the equivalent at GU). If
you use that address, the risk that your email will bounce back or end
up as spam is very low.
Q2. What
happens if one submits the IPP late (i.e. after 2018-09-14)?
A2. You should really strive to
complete the IPP by the deadline. It should take only a few
hours to complete it. We will first correct the IPPs received
before the deadline, and then give feedback (individually, to
each student) which may be highly relevant for your work with
HP1 and HP2. If you submit the IPP late, we may not be able to
give you feedback before the deadline for HP1. Moreover, for any
given student, we will not begin correcting
HP1 or HP2 until the
IPP has been received. Thus, in order to get feedback (and
points) on the home problems, you must first submit the IPP
(Exception: Those who already handed in the IPP last year need
not hand it in again).
Q3. Are there any recommended problems (from Chapter
2) for self-study?
A3. Since the number of problems is quite small (in
the book), it is a good idea to do all of them, and also to go
through carefully the examples in the book. Note that, in the
first exercise class (2018-09-18), I will do problems 2.12 and
2.13 on the blackboard (and some problems from Chapter 3).
Q4. When
will we receive feedback regarding the Introductory
programming problem (IPP)?
A4.
The individual feedback was already handed out, at the lecture on
Sept. 18th. I will bring the individual feedback sheets to the
next few lectures (but we will not email the individual
feedback), so that everyone can receive their sheet (which is
important!). In addition, general feedback regarding the IPP was
given in the slides from 2018-09-18. Please read those slides
carefully before submitting HP1!
Q5. In HP1.3c) how
do we know which point to check (for stationarity)?
A5. The
GA does not generally give the exact minimum (i.e. the
(x1,x2) at which grad f = (0,0)) but a close enough
approximation. Given that approximation, it is not difficult
to find the minimum (x1*,x2*), and then prove that this point
is indeed a stationary point. For example, if (for some
function) a GA were to converge to, say (8.999999,2.999999),
one could conclude that the true minimum would probably be at
(9,3), and then check that point, to conclude that it
is the stationary point (This is indeed the reason for
carrying out calculations in 1.3c). Note that the numbers just
given are examples, not the
values you will get for the function in 1.3. As for the proof
of stationarity, yes, it does require some steps (which should
be carried out manually), but one can shorten the calculations
by carefully considering how to approach the computation of
the gradient (for example, using expressions for the
derivative of a product of two functions etc.), rather than
trying a brute force approach.
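As a quick numerical sanity check (not a substitute for the manual proof), one can verify that a finite-difference approximation of the gradient vanishes at the candidate point. A minimal sketch (shown in Python for brevity; the function f below is a made-up example with its minimum at (9,3), not the HP1.3 function, and your actual solution should of course be written in Matlab):

```python
def grad(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

# Hypothetical example function with a stationary point at (9, 3):
f = lambda x: (x[0] - 9)**2 + (x[1] - 3)**2
print(grad(f, [9.0, 3.0]))  # both components should be (near) zero
```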
Q6. In HP1.1, is the "modulus" the same as the "norm" of the vector?
A6. Yes, the modulus of a vector refers to its length, and is a special
case of the norm (to be precise, "norm" is the more general term; the
so-called L2-norm equals the length of the vector, i.e. its modulus).
One can also refer to it as the "magnitude" of the vector. However
(NOTE!), in Matlab it is the command norm that you should use to get
the length of the vector. The mod command does something else: it
computes the so-called "modulus after division" and should of course
not be used in this case!
Q7. Shall
we use elitism in the GA for HP2.1b?
A7. Yes.
Q8. How should the GA be structured for HP2.1b?
A8. As mentioned in the problem
formulation, we will not use crossover in this problem
(one can define specialized crossover operators for
TSP, but we will not do so here). Thus, once all the
individuals (of a given generation) have been
evaluated, you should (of course) apply selection as
usual (either tournament or roulette-wheel), and then
apply (swap) mutations to the selected individuals.
Note: Do not omit the selection step! When generating new individuals in a GA,
one applies three steps: selection (e.g. tournament),
crossover, and mutation. Not using crossover does NOT
imply that one should skip the selection step as well!
This step is crucial, since it gives preference to
individuals with higher fitness. Without selection, the
algorithm would only have access to mutations, which will
slow it down considerably. Thus, if the population size is
N, carry out N/2 selection steps, where, in
each step, (tournament) selection is called twice, to
obtain two individuals. Then skip the crossover step, and
proceed directly to mutating the two selected individuals
etc. Do not forget to include the elitism step, as well
(see the previous question).
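The generation loop just described can be sketched as follows. This is only an illustrative outline (in Python for brevity); your submission should follow the course's Matlab coding standard, and all function names and parameter values below are hypothetical:

```python
import random

def tournament_select(population, fitness, p_tour=0.75, size=2):
    """Pick `size` candidates at random; return the best one with
    probability p_tour, otherwise the worst of the candidates."""
    candidates = random.sample(range(len(population)), size)
    candidates.sort(key=lambda i: fitness[i], reverse=True)
    return candidates[0] if random.random() < p_tour else candidates[-1]

def swap_mutate(path, p_mut):
    """Swap mutation: each position, with probability p_mut, is
    swapped with another randomly chosen position."""
    path = path[:]
    for i in range(len(path)):
        if random.random() < p_mut:
            j = random.randrange(len(path))
            path[i], path[j] = path[j], path[i]
    return path

def next_generation(population, fitness, p_mut, n_elite=1):
    N = len(population)
    best = max(range(N), key=lambda i: fitness[i])
    new_pop = [population[best][:] for _ in range(n_elite)]  # elitism
    while len(new_pop) < N:
        # Selection is carried out as usual (twice per pair),
        # but the crossover step is skipped entirely:
        i1 = tournament_select(population, fitness)
        i2 = tournament_select(population, fitness)
        new_pop.append(swap_mutate(population[i1], p_mut))
        if len(new_pop) < N:
            new_pop.append(swap_mutate(population[i2], p_mut))
    return new_pop
```

Note that selection is still applied to pick each pair; only the crossover step is omitted.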
Q9. My code for 2.1c (ACO) gives division by zero in some
situations. What should I do?
A9. In
all likelihood, there is an error in your code for the pheromone
update (ComputeDeltaPheromoneLevels or, less likely,
UpdatePheromoneLevels). The division by zero problem should not
occur if the code is correct (in fact we (Sina, Luca, and I)
never get this problem with our respective ACO codes, regardless
of the duration of the run). A common error is to
forget to update the pheromone level on the final edge in a tour, i.e. when returning to the city of
origin. However, there are also other versions of this error.
Suggestion: Just for testing (not for submitting), make a simple
graph with, say, 4 nodes, and compute manually the change in
pheromone levels, according to the equation(s) in the course
book. Then check whether or not you get the same pheromone
levels from your code. When writing the code, I suggest that you
do not try to carry out the entire pheromone update as a single
vector or matrix operation. Doing so is a very common source of
error. The risk of making an error is much smaller if you write
the code with loops, defining suitable temporary variables as
required etc. Once you are certain that the code works as it
should, you can (if you wish  not required) consider various
ways of optimizing the code.
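For the suggested manual test on a small graph, a loop-based pheromone update might look as follows. This is a sketch (in Python, with hypothetical names) of the standard AS update, in which each ant deposits 1/L_k on every edge of its tour; always check against the equations in the course book:

```python
def compute_delta_pheromone(tours, tour_lengths, n_nodes):
    """AS update: each ant k deposits 1/L_k on every edge of its tour,
    including the final edge back to the city of origin."""
    delta = [[0.0] * n_nodes for _ in range(n_nodes)]
    for tour, length in zip(tours, tour_lengths):
        contribution = 1.0 / length
        for step in range(len(tour)):
            i = tour[step]
            j = tour[(step + 1) % len(tour)]  # wraps back to the origin city
            delta[i][j] += contribution       # summed over ALL ants
    return delta

def update_pheromone(tau, delta, rho=0.5):
    """Evaporation followed by deposition: tau <- (1 - rho)*tau + delta."""
    n = len(tau)
    return [[(1 - rho) * tau[i][j] + delta[i][j] for j in range(n)]
            for i in range(n)]
```

With a 4-node graph and a single hand-computed tour, the entries of delta can be checked directly against a manual calculation.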
Q10. What is a suitable
value for the number of ants for the ACO problem (2.1c)?
A10. As
mentioned in the course book, the number of ants is
usually set equal to the number of nodes, so 50 in this case.
Note that the number given in AntSystem (i.e. 20) is certainly
NOT suitable (but neither should you expect it to be! After all,
it does say "set to appropriate value"...)
Q11. What
should be the weight range for the neural network
in HP2.3?
A11. The weight range should be sufficiently large so that the sigmoid
sigma(z) = 1/(1 + exp(-cz)) can take values in (almost) the full range
[0,1]. With c = 1, a suitable weight range is, say, [-5,5] or [-10,10].
If the weight range is too small, it will be difficult (or impossible)
to solve the problem, since the sigmoid will then only give values near
0.5. On the other hand, if the weight range is too large, it might be
difficult for the GA (or any optimization algorithm) to find suitable
weight values. Note that both weights and biases can take negative as
well as positive values (and zero, of course). A common mistake is to
allow only positive weights (and biases). Note also that, in the
initial population, it might be a good idea to have rather small
network weights; then, as the evolution progresses, both weights and
biases should be allowed to take values in the full specified range
(e.g. [-5,5] or [-10,10]).
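To see concretely why the range matters, one can evaluate the sigmoid for a few input values (a quick illustrative check, in Python; the numbers are examples only, not HP2.3 values):

```python
import math

def sigmoid(z, c=1.0):
    return 1.0 / (1.0 + math.exp(-c * z))

# If the weights are confined to a small range, the weighted sums stay
# small and the sigmoid output stays close to 0.5:
print(sigmoid(0.1))   # ~0.525
# With a range such as [-5,5], the output can approach 0 or 1:
print(sigmoid(5.0))   # ~0.993
print(sigmoid(-5.0))  # ~0.007
```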
Q12. In HP2.4, is it allowed to hardcode the final
operation (as a division)?
A12. No. I give
the functional form of g(x) in the problem formulation to reduce
(a bit) the complexity of the problem, but when you implement
the LGP, you should not assume that the functional form is
known.
Q13. What is a typical running time for HP2.4?
A13. Of course,
the running time varies since the optimization algorithm is
stochastic, but the number of evaluated individuals would
typically be in the range of a few hundred thousand to a few
million (meaning thousands of generations, in case of a
population size of 100). Note that this does not imply
that the optimization procedure is inefficient: The number of
possible functions (involving the four operators used here) is
huge, even if one were to restrict the possible values of the
coefficients (which we don't do in this case!). Thus, even after a
million individuals, the LGP
algorithm has only sampled a tiny part of the space of possible
functions. In any case, before starting a long run (and this
also applies to HP2.3), make a few short runs to test the
performance of your optimization algorithm for various parameter
settings.
Q14. What is required for HP2.1d?
A14. Regarding the
code, all you need to do is to write a wrapper program around
the function that generates the path length (which you will have
implemented for the ACO), so that the length of the NN path can
be computed without the user having to enter any parameters
(since the starting node should be selected randomly here, the
length-calculation program in HP2.1d will give slightly
different results if run repeatedly. This is perfectly OK). You
should also include the length of the NN path in the report (and
also perhaps add some conclusion regarding the reason for the
ACO's superior performance for this problem).
Q15. Are we allowed to
modify the GA or LGP algorithm (relative to the standard
versions) for HP2.3 and HP2.4?
A15. Yes, you may
(and probably should, at least for HP2.4). Note that this does
not apply to HP2.1b (for which the GA is described in detail in
the problem formulation). For HP2.4 (and perhaps HP2.3) you may
wish to try, for example, with varying mutation rates, and other
minor modifications. As mentioned above (Q13), it's also a
good idea to make a few short runs with your GA, in order to
find suitable parameter settings. The required running time
might be much longer if the operators and parameters are not
chosen with some care. Of course, you should also carefully
check that the GA is correctly implemented (some students had an
error in their tournament selection method for HP1.3, for
example).
Q16. Should the pheromone
matrix (tau_ij) be symmetric?
A16. There is no
(general) reason to make it symmetric. For the particular case
of TSP, the visibility matrix is symmetric, but
that is not the case for all problems. Moreover, forcing
tau_ij to be symmetric complicates things unnecessarily (for
example, you must then decide how to symmetrize it 
there are several ways). I recommend not trying to symmetrize
the matrix but, at the same time, it is not forbidden to do
so.
Q17. Should we also
optimize the biases in HP2.3?
A17. Yes! Biases are
weights, and should be encoded and optimized along with
the other weights.
Q18. Should we implement
the neural network (for HP2.3) ourselves?
A18. Yes you should.
It's rather easy to implement a neural network with a single
hidden layer, and it's a good exercise to do so. Specifically,
do not attempt to use the deep learning toolbox (or
any other neural network toolbox in Matlab) in this case:
Understanding fully how those toolboxes work (and setting the
correct parameters, activation functions etc.) will take much
longer time than just implementing a simple neural network of
the kind you need for HP2.3.
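For reference, the forward pass of a network with a single hidden layer really is only a few lines. A sketch (in Python, with hypothetical names; the Matlab version is similarly short):

```python
import math

def sigmoid(z, c=1.0):
    return 1.0 / (1.0 + math.exp(-c * z))

def forward(x, w_hidden, b_hidden, w_output, b_output):
    """Single hidden layer: hidden_i = sigma(sum_j W_ij x_j + b_i),
    output = sigma(sum_i w_i hidden_i + b)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_output, hidden)) + b_output)
```

Remember that the biases (b_hidden, b_output) are part of the chromosome too (see Q17).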
Q19. How should one set the
mutation probability in HP2.4?
A19. It's a good idea to
let the mutation probability vary with the length of the
chromosome. Thus, one can set it as pmut = c/L, where L is
the length of the chromosome under consideration, and c is
a parameter of order one (say, in the interval 2–5 or
so). It also generally helps to let c fall gradually (with
the generations) during the LGP run, down to (around) 1.
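The scheme above can be sketched as follows (illustrative Python; the parameter values and the linear decay of c are examples, not prescribed settings):

```python
def mutation_probability(chromosome_length, generation, n_generations,
                         c_start=3.0, c_end=1.0):
    """p_mut = c/L, where L is the length of the chromosome under
    consideration and c falls linearly from c_start to c_end
    over the course of the run."""
    frac = generation / max(1, n_generations - 1)
    c = c_start + (c_end - c_start) * frac
    return c / chromosome_length

# Example: for a chromosome of length 60, early in the run:
print(mutation_probability(60, 0, 1000))    # 3/60 = 0.05
# ...and at the end of the run:
print(mutation_probability(60, 999, 1000))  # 1/60
```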
Q20. Despite updating
the pheromone on the edge that takes an ant back
to its node of origin (in ACO; see also Q9 above),
there are still problems with the convergence.
What else could be wrong?
A20. Other common errors include (i)
forgetting to sum the pheromone contribution from all ants (in
this problem, we use AS, not MMAS!); (ii) incorrectly
computing the cumulative sum of probabilities (for the
denominator in the equation for p(e_ij|S)) when selecting the
next node; (iii) setting the evaporation rate too high (around
0.5 is usually suitable).
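Regarding error (ii): a simple, robust way to draw the next node is to normalize the weights, accumulate them, and compare with a single uniform random number. An illustrative sketch (in Python, with hypothetical names; in the actual HP2.1c code, weights[j] would be tau_ij^alpha * eta_ij^beta for the unvisited nodes j, and zero for visited ones):

```python
import random

def select_next_node(weights):
    """Roulette-wheel selection of the next node: normalize the weights
    (the sum is the denominator in p(e_ij|S)), accumulate, and draw."""
    total = sum(weights)
    r = random.random()
    cumulative = 0.0
    for j, w in enumerate(weights):
        cumulative += w / total
        if r <= cumulative:
            return j
    return len(weights) - 1  # guard against floating-point rounding
```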
Last update: 2018-10-17, 11.00