AAI research group


Stochastic optimization algorithms (FFR105, FIM711), LP1 (first quarter), 2018

Lecturer and examiner:
Professor Mattias Wahde, Tel: 772 3727, e-mail: mattias.wahde@chalmers.se

Course assistants:

Sina Torabi, e-mail: sina.torabi@chalmers.se
Luca Caltagirone, e-mail: luca.caltagirone@chalmers.se

Literature: Wahde, M., Biologically Inspired Optimization Methods: An Introduction, WIT Press, 2008
Note: The book is sold by Chalmers' bookstore (Cremona). It is also possible to buy the book online (at Amazon and many other bookstores). Note, however, that Cremona offers the book at a reduced price but has only a limited supply.

Preliminary program:

Date        Time          Room          Contents

20180904    10.00-11.45   HB2           Course introduction and motivation (pp. 1-8); classical optimization methods (introduction, pp. 8-12)
20180905    08.00-09.45   HB4           Classical optimization methods (i) (pp. 12-21; Appendix B.1, pp. 173-174)
20180907    10.00-11.45   HB2           Classical optimization methods (ii) (pp. 21-34); handout of programming problem
20180911    10.00-11.45   FB            Evolutionary algorithms: background and introduction (pp. 35-45, 82-83); handout of home problem 1
20180911    18.00-21.00   MT9, 11-13    Introduction to Matlab programming for stochastic optimization algorithms
20180912    08.00-09.45   HB4           Evolutionary algorithms: components of EAs (pp. 46-59)
20180914    10.00-11.45   KE            Evolutionary algorithms: properties (pp. 59-71; Appendix B.2, pp. 174-183); hand-in of programming problem
20180918    10.00-11.45   FB            Classical optimization methods and evolutionary algorithms (review, problem solving, Q&A, etc.)
20180919    08.00-09.45   HB4           Linear genetic programming and interactive evolutionary computation (pp. 72-81)
20180921    10.00-11.45   FB            Neural networks (Appendix A, pp. 151-172); data analysis (Appendix C, pp. 193-204)
20180925    10.00-11.45   HB1           Evolutionary algorithms: applications I (various papers etc.); hand-in of home problem 1
20180926    08.00-09.45   -             No lecture
20180928    10.00-11.45   FB            Evolutionary algorithms: applications II (various papers etc.); handout of home problem 2
20181002    10.00-11.45   FB            Ant colony optimization: background and introduction (pp. 99-106)
20181003    08.00-09.45   HB4           Ant colony optimization: AS vs. MMAS, applications, properties of ACO (pp. 107-116; Appendix B.3, pp. 183-187)
20181005    10.00-11.45   FB            Particle swarm optimization: background and introduction (pp. 117-124)
20181009    10.00-11.45   FB            Particle swarm optimization: properties of PSO, applications (pp. 124-138)
20181010    08.00-09.45   HB4           Problem-solving class (various problems), review
20181012    10.00-11.45   -             No lecture
20181016    10.00-11.45   FB            Performance comparison (EAs, ACO, PSO) (pp. 139-149)
20181017    08.00-09.45   HB4           Applications of biologically inspired optimization algorithms in autonomous robots, vehicles, and software agents; hand-in of home problem 2
20181019    10.00-11.45   FB            Course summary
20181023    10.00-11.45   My office     Consultation (exam preparation)
20181024    08.00-09.45   -             No lecture


Slides, other handouts, home problems, web links, demonstration programs, and videos:

20180904 Course memo (kurs-pm)
20180904 List of misprints in the book
20180904 Slides from Lecture 1
20180904 Quiz from Lecture 1
20180905 Slides from Lecture 2
20180905 Quiz from Lecture 2
20180907 Matlab coding standard (Note: Follow this standard carefully!)
20180907 Introductory programming problem (Note! Strict deadline: 20180914)
20180907 Slides from Lecture 3
20180907 Quiz from Lecture 3
20180911 Slides from Lecture 4
20180911 Quiz from Lecture 4
20180911 Checklist for home problem submission. Note: Read carefully before submitting HP1 (and HP2)!
20180911 Home problem 1 Note: Strict deadline, 20180925; See also the course memo above, p. 3
20180911 Introduction to Matlab programming for stochastic optimization (study carefully if you did not attend the evening session)
20180912 Slides from Lecture 5
20180912 Quiz from Lecture 5
20180914 Slides from Lecture 6
20180914 Quiz from Lecture 6
20180918 Slides from Lecture 7 (no quiz today). Note: For general feedback regarding the IPP, see the last slides.
20180919 Slides from Lecture 8
20180919 Quiz from Lecture 8
20180921 Slides from Lecture 9
20180921 Quiz from Lecture 9
20180925 Slides from Lecture 10 (no quiz today)
20180928 Home problem 2 Note: Strict deadline, 20181017
20180928 TSPGraphics.zip (needed for HP2.1)
20180928 LoadCityLocations.m (needed for HP2.1)
20180928 AntSystem.m (needed for HP2.1c)
20180928 GetSlopeAngle.m (needed as a starting point for HP2.3)
20180928 LoadFunctionData.m (needed for HP2.4)
20180928 StructExample.zip (struct usage illustration, for HP2.4)
20180928 Slides from Lecture 11 (no quiz today)
20181002 Slides from Lecture 12
20181002 Quiz from Lecture 12
20181003 Slides from Lecture 13
20181003 Quiz from Lecture 13
20181005 Slides from Lecture 14
20181005 Quiz from Lecture 14
20181009 Exam kit (three exams with solutions)
20181009 Slides from Lecture 15
20181009 Quiz from Lecture 15
20181010 Slides from Lecture 16 (no quiz today)
20181016 Slides from Lecture 17
20181016 Quiz from Lecture 17
20181017 Slides from Lecture 18 (no quiz today)

Examination and grade requirements:

Exam: 20181031 (14.00-18.00), maximum score: 25p.
Home problems: maximum (total) score: 25p.

Grade requirements: A minimum of 10p is required on the exam for a passing grade. In addition, you must solve the introductory programming problem and (as a minimum) provide satisfactory solutions to the mandatory home problems (which will be marked on the problem sheets). The requirements for the various grades are specified in the course memo and can be summarized as follows (the numbers refer to the sum of the exam result and the results on the two home problems, maximum 50p in total):

Chalmers:
5   Total score in [42,50]
4   Total score in [33,41.5]
3   Total score up to 32.5
(but note also the minimum requirements listed above)

GU:
VG: Total score in [39,50]
G: Total score up to 38.5 (but note also the minimum requirements listed above)

ECTS grades: A = 5, B = 4, C = 3, D = weak 3 (total score below 25).


FAQ:

Q1. Will you send a confirmation regarding the home problem submissions?
A1. For the introductory programming problem (IPP) we will not, in general, send a confirmation, but for HP1 and HP2 you will receive a confirmation when the assistants start correcting your solutions. In any case, always keep your home problem submissions in your Sent box: in the very unlikely case that an e-mail you send does not reach me, it is good to have a copy. Also important: preferably use your Chalmers e-mail address, i.e. "...@student.chalmers.se" (or the equivalent at GU). If you use that address, the risk that your e-mail bounces or ends up as spam is very low.

Q2. What happens if one submits the IPP late (i.e. after 20180914)?
A2. You should really strive to complete the IPP by the deadline. It should take only a few hours to complete it. We will first correct the IPPs received before the deadline, and then give feedback (individually, to each student) which may be highly relevant for your work with HP1 and HP2. If you submit the IPP late, we may not be able to give you feedback before the deadline for HP1. Moreover, for any given student, we will not begin correcting HP1 or HP2 until the IPP has been received. Thus, in order to get feedback (and points) on the home problems, you must first submit the IPP (Exception: Those who already handed in the IPP last year need not hand it in again).

Q3. Are there any recommended problems (from Chapter 2) for self-study?
A3. Since the number of problems in the book is quite small, it is a good idea to do all of them, and also to go carefully through the examples in the book. Note that, in the first exercise class (20180918), I will do problems 2.12 and 2.13 on the blackboard (as well as some problems from Chapter 3).

Q4. When will we receive feedback regarding the Introductory programming problem (IPP)?
A4. The individual feedback was already handed out at the lecture on Sept. 18th. I will bring the individual feedback sheets to the next few lectures (we will not e-mail the individual feedback), so that everyone can receive their sheet (which is important!). In addition, general feedback regarding the IPP was given in the slides from 20180918. Please read those slides carefully before submitting HP1!

Q5. In HP1.3c) how do we know which point to check (for stationarity)?
A5. The GA does not generally give the exact minimum (i.e. the (x1,x2) at which grad f = (0,0)), but a close enough approximation. Given that approximation, it is not difficult to guess the minimum (x1*,x2*), and then prove that this point is indeed a stationary point. For example, if (for some function) a GA were to converge to, say, (8.999999, 2.999999), one could conclude that the true minimum is probably at (9,3), and then check that point to confirm that it is a stationary point (this is indeed the reason for carrying out the calculations in 1.3c). Note that the numbers just given are examples, not the values you will get for the function in 1.3. As for the proof of stationarity: yes, it does require some steps (which should be carried out manually), but one can shorten the calculations by carefully considering how to approach the computation of the gradient (for example, using the expression for the derivative of a product of two functions), rather than attempting a brute-force approach.
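Before carrying out the manual proof, one can sanity-check a candidate stationary point numerically with central differences. A minimal sketch in Python (for illustration only; the course uses Matlab, and the function f and the point (9, 3) below are made-up examples, NOT the function or minimum from HP1.3):

```python
# Hypothetical example: numerically check that a candidate point is stationary.
# f and the candidate point (9, 3) are illustrative only.

def f(x1, x2):
    return (x1 - 9)**2 + (x2 - 3)**2  # example function with minimum at (9, 3)

def numerical_gradient(f, x1, x2, h=1e-6):
    """Central-difference estimate of grad f at (x1, x2)."""
    df_dx1 = (f(x1 + h, x2) - f(x1 - h, x2)) / (2 * h)
    df_dx2 = (f(x1, x2 + h) - f(x1, x2 - h)) / (2 * h)
    return df_dx1, df_dx2

g1, g2 = numerical_gradient(f, 9.0, 3.0)
print(abs(g1) < 1e-6 and abs(g2) < 1e-6)  # True: (9, 3) is stationary
```

A near-zero numerical gradient only supports the claim; the actual proof of stationarity in 1.3c should still be done analytically.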

Q6. In HP1.1, is the "modulus" the same as the "norm" of the vector?
A6. Yes, the modulus of a vector refers to its length and is a special case of the norm (to be precise, "norm" is the more general term; the so-called L2-norm equals the length of the vector, i.e. its modulus). One can also refer to it as the "magnitude" of the vector. However (NOTE!), in Matlab it is the command norm that you should use to get the length of a vector. The mod command does something else entirely: it computes the so-called "modulus after division" and should of course not be used in this case!
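A minimal illustration of the distinction (in Python, for comparison; in Matlab, simply use norm(v), and Matlab's mod corresponds to Python's % operator):

```python
import math

v = [3.0, 4.0]

# L2 norm (the "modulus"/length of the vector):
length = math.sqrt(sum(x * x for x in v))
print(length)  # 5.0

# "Modulus after division" (Matlab's mod) is Python's % operator,
# which has nothing to do with vector length:
print(7 % 3)  # 1
```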

Q7. Shall we use elitism in the GA for HP2.1b?
A7. Yes.

Q8. How should the GA be structured for HP2.1b?
A8. As mentioned in the problem formulation, we will not use crossover in this problem (one can define specialized crossover operators for TSP, but we will not do so here). Thus, once all the individuals of a given generation have been evaluated, you should (of course) apply selection as usual (either tournament or roulette-wheel), and then apply (swap) mutations to the selected individuals. Note: Do not omit the selection step! When generating new individuals in a GA, one applies three steps: selection (e.g. tournament), crossover, and mutation. Not using crossover does NOT imply that one should skip the selection step as well! This step is crucial, since it gives preference to individuals with higher fitness. Without selection, the algorithm would only have access to mutations, which would slow it down considerably. Thus, if the population size is N, carry out N/2 selection steps, where, in each step, (tournament) selection is called twice to obtain two individuals. Then skip the crossover step and proceed directly to mutating the two selected individuals, etc. Do not forget to include the elitism step as well (see the previous question).
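The generation loop just described can be sketched as follows (in Python for illustration; your HP2.1b code should of course be in Matlab). The function names and parameter values (p_tour, p_mut, n_elite) are illustrative assumptions, not part of the problem formulation:

```python
import random

def tournament_select(population, fitness, p_tour=0.75):
    """Binary tournament selection: pick two individuals at random,
    return a copy of the better one with probability p_tour."""
    a, b = random.sample(range(len(population)), 2)
    if fitness[a] < fitness[b]:
        a, b = b, a  # a is now the fitter contestant
    return list(population[a]) if random.random() < p_tour else list(population[b])

def swap_mutate(path, p_mut):
    """Swap mutation: with probability p_mut, exchange two random cities."""
    path = list(path)
    if random.random() < p_mut:
        i, j = random.sample(range(len(path)), 2)
        path[i], path[j] = path[j], path[i]
    return path

def next_generation(population, fitness, p_mut=0.8, n_elite=1):
    new_population = []
    # N/2 selection steps; selection is called twice in each step.
    # No crossover: go directly from selection to (swap) mutation.
    for _ in range(len(population) // 2):
        for _ in range(2):
            individual = tournament_select(population, fitness)
            new_population.append(swap_mutate(individual, p_mut))
    # Elitism: copy the best individual unchanged into the new generation.
    best = max(range(len(population)), key=lambda i: fitness[i])
    for k in range(n_elite):
        new_population[k] = list(population[best])
    return new_population
```

Note that swap mutation preserves the permutation property of each tour, so no repair step is needed.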

Q9. My code for 2.1c (ACO) gives division by zero in some situations. What should I do?
A9. In all likelihood, there is an error in your code for the pheromone update (ComputeDeltaPheromoneLevels or, less likely, UpdatePheromoneLevels). The division by zero problem should not occur if the code is correct (in fact we (Sina, Luca, and I) never get this problem with our respective ACO codes, regardless of the duration of the run). A common error is to forget to update the pheromone level on the final edge in a tour, i.e. when returning to the city of origin. However, there are also other versions of this error. Suggestion: Just for testing (not for submitting), make a simple graph with, say, 4 nodes, and compute manually the change in pheromone levels, according to the equation(s) in the course book. Then check whether or not you get the same pheromone levels from your code. When writing the code, I suggest that you do not try to carry out the entire pheromone update as a single vector or matrix operation. Doing so is a very common source of error. The risk of making an error is much smaller if you write the code with loops, defining suitable temporary variables as required etc. Once you are certain that the code works as it should, you can (if you wish - not required) consider various ways of optimizing the code.
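As an illustration of the loop-based (rather than vectorized) approach suggested above, here is a sketch in Python (your submission should follow the Matlab coding standard; the data layout, a list of tours with one tour per ant, is an assumption, and the function names merely mirror the Matlab ones mentioned above):

```python
def compute_delta_pheromone(paths, path_lengths, n_cities):
    """AS pheromone increment: sum the contribution from ALL ants.
    Each ant k deposits 1/L_k on every edge of its tour, INCLUDING
    the final edge back to the city of origin."""
    delta = [[0.0] * n_cities for _ in range(n_cities)]
    for path, length in zip(paths, path_lengths):
        for step in range(len(path)):
            i = path[step]
            j = path[(step + 1) % len(path)]  # wraps around to the origin city
            delta[i][j] += 1.0 / length
    return delta

def update_pheromone(tau, delta, rho=0.5):
    """Standard AS update: evaporation followed by deposition."""
    n = len(tau)
    for i in range(n):
        for j in range(n):
            tau[i][j] = (1.0 - rho) * tau[i][j] + delta[i][j]
    return tau
```

On a hand-checkable 4-node graph with a single ant following the tour 0-1-2-3-0 of length 2, every edge of the tour (including the closing edge 3-0) should receive a deposit of 0.5, which is easy to verify manually against the equations in the course book.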

Q10. What is a suitable value for the number of ants for the ACO problem (2.1c)?
A10. As mentioned in the course book, the number of ants is usually set equal to the number of nodes, so 50 in this case. Note that the number given in AntSystem (i.e. 20) is certainly NOT suitable (but neither should you expect it to be! After all, it does say "set to appropriate value"...)

Q11. What should be the weight range for the neural network in HP2.3?
A11. The weight range should be sufficiently large that the sigmoid, sigma(z) = 1/(1 + exp(-cz)), can take values in (almost) the full range [0,1]. With c = 1, a suitable weight range is, say, [-5,5] or [-10,10]. If the weight range is too small, it will be difficult (or impossible) to solve the problem, since the sigmoid will then only give values near 0.5. On the other hand, if the weight range is too large, it might be difficult for the GA (or any optimization algorithm) to find suitable weight values. Note that both weights and biases can take negative as well as positive values (and zero, of course); a common mistake is to allow only positive weights (and biases). Note also that, in the initial population, it might be a good idea to use rather small network weights, but then, as the evolution progresses, both weights and biases should be allowed to take values in the full specified range (e.g. [-5,5] or [-10,10]).
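The effect of the weight range on the sigmoid can be seen directly; a minimal Python illustration (with c = 1, as in the answer above):

```python
import math

def sigmoid(z, c=1.0):
    return 1.0 / (1.0 + math.exp(-c * z))

# With weights (and thus z) confined to a small range, the sigmoid
# stays near 0.5, so the network output barely varies:
print(sigmoid(0.1), sigmoid(-0.1))   # both close to 0.5

# With a range such as [-5, 5], almost the full output range [0, 1]
# is reachable:
print(sigmoid(5.0), sigmoid(-5.0))   # close to 1 and close to 0
```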

Q12. In HP2.4, is it allowed to hardcode the final operation (as a division)?
A12. No. I give the functional form of g(x) in the problem formulation to reduce (a bit) the complexity of the problem, but when you implement the LGP, you should not assume that the functional form is known.

Q13. What is a typical running time for HP2.4?
A13. Of course, the running time varies, since the optimization algorithm is stochastic, but the number of evaluated individuals would typically be in the range of a few hundred thousand to a few million (meaning thousands of generations, in the case of a population size of 100). Note that this does not imply that the optimization procedure is inefficient: the number of possible functions (involving the four operators used here) is huge, even if one were to restrict the possible values of the coefficients (which we do not do in this case!). Thus, even after a million individuals, the LGP algorithm has only sampled a tiny part of the space of possible functions. In any case, before starting a long run (and this also applies to HP2.3), make a few short runs to test the performance of your optimization algorithm for various parameter settings.

Q14. What is required for HP2.1d?
A14. Regarding the code, all you need to do is to write a wrapper program around the function that generates the path length (which you will have implemented for the ACO), so that the length of the NN path can be computed without the user having to enter any parameters. Since the starting node should be selected randomly here, the length-calculation program in HP2.1d will give slightly different results if run repeatedly; this is perfectly OK. You should also include the length of the NN path in the report (and perhaps add some conclusion regarding the reason for the ACO's superior performance on this problem).

Q15. Are we allowed to modify the GA or LGP algorithm (relative to the standard versions) for HP2.3 and HP2.4?
A15. Yes, you may (and probably should, at least for HP2.4). Note that this does not apply to HP2.1b, for which the GA is described in detail in the problem formulation. For HP2.4 (and perhaps HP2.3) you may wish to try, for example, varying mutation rates and other minor modifications. As mentioned above (Q13), it is also a good idea to make a few short runs with your GA in order to find suitable parameter settings; the required running time might be much longer if the operators and parameters are not chosen with some care. Of course, you should also carefully check that the GA is correctly implemented (some students had an error in their tournament selection method for HP1.3, for example).

Q16. Should the pheromone matrix (tau_ij) be symmetric?
A16. There is no (general) reason to make it symmetric. For the particular case of TSP, the visibility matrix is symmetric, but that is not the case for all problems. Moreover, forcing tau_ij to be symmetric complicates things unnecessarily (for example, you must then decide how to symmetrize it - there are several ways). I recommend not trying to symmetrize the matrix but, at the same time, it is not forbidden to do so.

Q17. Should we also optimize the biases in HP2.3?
A17. Yes! Biases are weights, and should be encoded and optimized along with the other weights.

Q18. Should we implement the neural network (for HP2.3) ourselves?
A18. Yes, you should. It is rather easy to implement a neural network with a single hidden layer, and it is a good exercise to do so. Specifically, do not attempt to use the deep learning toolbox (or any other neural network toolbox in Matlab) in this case: understanding fully how those toolboxes work (and setting the correct parameters, activation functions, etc.) will take much longer than just implementing a simple neural network of the kind you need for HP2.3.
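A single-hidden-layer forward pass really is short. As an illustration only (in Python; your HP2.3 code should be in Matlab, and this sketch makes its own assumptions about how the weights are stored):

```python
import math

def sigmoid(z, c=1.0):
    return 1.0 / (1.0 + math.exp(-c * z))

def forward(x, w_ih, b_h, w_ho, b_o, c=1.0):
    """Forward pass through a network with one hidden layer.
    Assumed layout: w_ih[j][i] is the weight from input i to hidden
    neuron j, b_h[j] the hidden biases; w_ho[k][j] is the weight from
    hidden neuron j to output k, b_o[k] the output biases."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b, c)
              for row, b in zip(w_ih, b_h)]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden)) + b, c)
              for row, b in zip(w_ho, b_o)]
    return output
```

For HP2.3, the GA chromosome would then encode all the entries of w_ih, b_h, w_ho, and b_o (biases included; see Q17).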

Q19. How should one set the mutation probability in HP2.4?
A19. It is a good idea to let the mutation probability vary with the length of the chromosome. Thus, one can set it as pmut = c/L, where L is the length of the chromosome under consideration and c is a parameter of order one (say, in the interval 2-5 or so). It also generally helps to let c fall gradually (with the generations) during the LGP run, down to (around) 1.
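A sketch of this scheme (in Python for illustration; the linear decay of c is just one possible choice, and the default values of c_start and c_end are assumptions within the ranges suggested above):

```python
def mutation_probability(chromosome_length, generation, n_generations,
                         c_start=3.0, c_end=1.0):
    """pmut = c/L, where c decays linearly from c_start (first generation)
    to c_end (last generation)."""
    frac = generation / max(1, n_generations - 1)  # 0 at start, 1 at end
    c = c_start + (c_end - c_start) * frac
    return c / chromosome_length
```

Note that, since L varies between individuals in LGP, pmut should be recomputed for each chromosome being mutated.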

Q20. Despite updating the pheromone on the edge that takes an ant back to its node of origin (in ACO; see also Q9 above), there are still problems with the convergence. What else could be wrong?
A20. Other common errors include (i) forgetting to sum the pheromone contribution from all ants (in this problem, we use AS, not MMAS!); (ii) incorrectly computing the cumulative sum of probabilities (for the denominator in the equation for p(e_ij|S)) when selecting the next node; (iii) setting the evaporation rate too high (around 0.5 is usually suitable).
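Regarding point (ii), the node-selection step can be sketched as follows (in Python for illustration; the alpha and beta values are placeholders, and tau/eta are assumed to be matrices indexed by node). The key point is that both the cumulative sum and its normalization run over exactly the set of not-yet-visited nodes:

```python
import random

def select_next_node(current, unvisited, tau, eta, alpha=1.0, beta=2.0):
    """Roulette-wheel selection of the next node, with probability
    proportional to tau_ij^alpha * eta_ij^beta over the unvisited nodes.
    The normalization (denominator) runs over that same set."""
    weights = [tau[current][j] ** alpha * eta[current][j] ** beta
               for j in unvisited]
    total = sum(weights)
    r = random.random() * total
    cumulative = 0.0
    for j, w in zip(unvisited, weights):
        cumulative += w
        if r <= cumulative:
            return j
    return unvisited[-1]  # guard against floating-point round-off
```

A common error is to normalize over all nodes (or to forget to exclude already-visited nodes from the cumulative sum), which distorts the selection probabilities.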

Last update: 20181017, 11.00