
A Reactive Search Optimization approach to interactive decision making / Campigotto, Paolo. - (2011), pp. 1-95.

A Reactive Search Optimization approach to interactive decision making

Campigotto, Paolo
2011-01-01

Abstract

Reactive Search Optimization (RSO) advocates the integration of learning techniques into search heuristics for solving complex optimization problems. In recent years, RSO has mostly been employed to make a local search method self-adapt based on the previous history of the search. The learning signals consist of data about the structural characteristics of the instance, collected while the algorithm is running: for example, the sizes of basins of attraction, the entrapment of trajectories, and repetitions of previously visited configurations. In this context, the algorithm learns by interacting with a previously unknown environment given by an existing (and fixed) problem definition. This thesis considers a second interesting online learning loop, where the source of learning signals is the decision maker (DM), who fine-tunes her preferences (formalized as a utility function) through a learning process triggered by the presentation of tentative solutions. The objective function and, more generally, the problem definition are not fully stated at the beginning and need to be refined during the search for a satisfying solution. In practice, this lack of complete knowledge may occur for different reasons: insufficient or costly knowledge elicitation, soft constraints that exist only in the mind of the decision maker, revision of preferences after becoming aware of some possible solutions, etc. The work developed in this thesis falls within the well-known paradigm of Interactive Decision Making (IDM). In particular, it considers interactive optimization from a machine learning perspective, where IDM is seen as a joint learning process involving both the optimization component and the DM herself. During the interactive process, on the one hand, the decision maker improves her knowledge about the problem at hand; on the other hand, the preference model learnt by the optimization component evolves in response to the additional information provided by the user. We believe that understanding the interplay between these two learning processes is essential to improving the design of interactive decision making systems. This thesis moves in this direction: 1) it considers a final user who may change her preferences as she gains deeper knowledge of the problem, and who may occasionally provide inconsistent feedback during the interactive process; 2) it introduces two IDM techniques that can learn an arbitrary preference model under these changing and noisy conditions. The investigation is performed in two different problem settings: traditional multi-objective optimization and a constraint-based formulation of the DM's preferences. In both cases, the ultimate goal of the developed IDM algorithms is the identification of the solution preferred by the final user. This task is accomplished by alternating a learning phase, which generates an approximate model of the user's preferences, with an optimization stage, which identifies the optimizers of the current model. The current tentative solutions are then evaluated by the final user, providing additional training data. However, the cognitive limitations of the user in analyzing tentative solutions demand that the amount of elicited information be minimized. This requires a paradigm shift with respect to standard machine learning strategies: the aim is to model the relevant areas of the optimization surface rather than to reconstruct it entirely. In our approach, this shift is achieved both by applying well-known active learning principles during the learning phase and by a suitable trade-off between diversification and intensification of the search during the optimization stage.
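To make the history-based self-adaptation mentioned at the start of the abstract concrete, the following is a minimal sketch in the spirit of Reactive Tabu Search: the prohibition period grows when a previously visited configuration is revisited (a signal of entrapment in a basin of attraction) and shrinks otherwise. All names, the doubling rule, and the Python formulation are illustrative assumptions, not components taken from the thesis.

```python
def reactive_search(start, neighbors, cost, max_steps=1000):
    """Local search whose prohibition period reacts to repetitions of
    previously visited configurations (Reactive Tabu Search flavor)."""
    current, best = start, start
    tenure = 1                  # prohibition period, adapted during the run
    tabu_until = {}             # configuration -> step at which it is freed
    seen = {start}              # search history: visited configurations
    for step in range(1, max_steps + 1):
        allowed = [n for n in neighbors(current)
                   if tabu_until.get(n, 0) <= step]
        candidates = allowed or neighbors(current)  # never strand the search
        current = min(candidates, key=cost)
        # Learning signal: revisiting a configuration suggests entrapment in
        # a basin of attraction, so enlarge the prohibition (diversify);
        # otherwise slowly shrink it back (intensify).
        tenure = min(2 * tenure, max_steps) if current in seen \
            else max(1, tenure - 1)
        seen.add(current)
        tabu_until[current] = step + tenure
        if cost(current) < cost(best):
            best = current
    return best
```

The sketch assumes configurations are hashable (e.g., tuples) and that `neighbors` always returns a nonempty list.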
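The alternation of learning and optimization described in the abstract can likewise be sketched as a loop. Below, the preference model is a hypothetical linear utility fit by least squares, the uncertainty proxy is the distance from already-rated solutions, and diversification is a random candidate; none of these choices is claimed to match the techniques actually developed in the thesis.

```python
import numpy as np

def interactive_dm(pool, ask_user, rounds=10, seed=0):
    """pool: (n, d) array of candidate solutions (feature vectors);
    ask_user(x) -> scalar rating, possibly noisy or inconsistent."""
    rng = np.random.default_rng(seed)
    X, y = [], []                     # training data elicited from the DM
    w = np.zeros(pool.shape[1])       # weights of a linear utility model
    for _ in range(rounds):
        # Optimization stage: the optimizer of the current model
        # (intensification) plus a random candidate (diversification).
        tentative = [pool[int(np.argmax(pool @ w))],
                     pool[rng.integers(len(pool))]]
        # Active learning: query the tentative solution farthest from the
        # already-rated ones (a crude uncertainty proxy), so as to minimize
        # the amount of information elicited from the user.
        if X:
            query = max(tentative,
                        key=lambda x: min(np.linalg.norm(x - xi) for xi in X))
        else:
            query = tentative[0]
        X.append(query)
        y.append(ask_user(query))     # feedback may be noisy or inconsistent
        # Learning phase: refit the approximate preference model; recent
        # feedback could be weighted more to track drifting preferences.
        w = np.linalg.lstsq(np.asarray(X), np.asarray(y), rcond=None)[0]
    return pool[int(np.argmax(pool @ w))]  # best solution under final model
```

To test robustness to inconsistent feedback, `ask_user` can be simulated with a hidden utility plus noise, e.g. `lambda x: x @ w_true + rng.normal(0, 0.1)`.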
Year: 2011
Cycle: XXIII
Academic year: 2010-2011
Department: Ingegneria e Scienza dell'Informazione (Information Engineering and Computer Science; ceased 4/11/12)
Doctoral programme: Information and Communication Technology
Supervisor: Battiti, Roberto
Language: English
Scientific sectors: Settore INF/01 - Informatica (Computer Science); Settore MAT/09 - Ricerca Operativa (Operations Research)
Files in this record:
thesisCampigotto.pdf
  Access: open access (accesso aperto)
  Type: Tesi di dottorato (Doctoral Thesis)
  License: Tutti i diritti riservati (All rights reserved)
  Size: 595.27 kB
  Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/368048