Tutorial T3: Meta-Learning & Algorithm Selection

Room D.Maria, Monday, September 7, 9:00–12:10

by
Pavel Brazdil (contact person), Joaquin Vanschoren, Christophe Giraud-Carrier, Lars Kotthoff

 Slides  (updated 7.9.)

Main Topics at a Glance and Schedule

9h     1. Introduction to meta-learning and algorithm selection (by Pavel Brazdil)
9h30  2. Meta-learning infrastructures (by Joaquin Vanschoren) 
10h    3. Hyper-parameter optimization and other topics (by Christophe Giraud-Carrier)

10h30 Coffee break
11h    4. Algorithm selection and configuration in different domains (by Lars Kotthoff)
11h30 5. Advanced Topics and Current / Future challenges 
12h10 - 13h20 Lunch break

Details of the Topics Covered

Introduction to meta-learning and algorithm selection

Overview of basic concepts
Types of recommendation (best algorithms, ranking etc.)
Data-specific recommendations - role of data characteristics
How to define and solve a metalearning problem
Combining metalearning and deployment of algorithms (active testing)

Meta-learning infrastructures

OpenML.org
Sharing and organizing algorithms, datasets and experiments
Integration in data mining environments (R, WEKA, MOA, RapidMiner,…)
Incorporating meta-learning within data mining systems
Demonstrations of systems

Hyper-parameter optimization and other topics

Hyper-parameter optimization
Clustering of ML algorithms and its effects on meta-learning

Algorithm selection and configuration in different domains

Problem definition
Components of algorithm selection systems
Algorithm selection for specific problems: Boolean satisfiability (SAT), traveling salesman, job-shop scheduling, quadratic assignment, sorting, etc.
Example systems: SATzilla, LLAMA
System demonstrations

Advanced Topics and Current / Future challenges

Exploiting metalearning in workflow design
Metalearning in the context of streaming data
Other advanced topics and current / future challenges
Scheduling meta-data acquisition for meta-learning & algorithm selection systems

Comments on tutorial structure and contents
The tutorial will introduce and cover the state of the art in meta-learning, algorithm selection, and algorithm configuration. Algorithm selection and configuration are increasingly relevant today. Researchers and practitioners from all branches of science and technology face a large choice of parameterized machine learning algorithms, with little guidance as to when and how to use these techniques. Data mining challenges frequently remind us that algorithm selection and configuration are crucial for achieving the best performance and for driving industrial applications.

Meta-learning leverages knowledge of past algorithm applications to learn how to select the best techniques for future applications, and offers effective methods that are superior to humans both in terms of the end result and especially in the time required to achieve it. Recently, meta-learning has been incorporated into algorithm configuration. This synergy leads to new methods that recommend the best algorithms and their parameter settings simultaneously, and that speed up algorithm configuration by learning which parameter settings are likely to be most useful on the data at hand.

After introducing the nature of algorithm selection, we elucidate how it arises in machine learning and data mining, but also in other domains such as optimization and SAT solving. We show that it is possible to use meta-learning techniques to identify the potentially best algorithm(s) and their parameter settings for a new task, based on meta-level information and prior experiments.
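As a rough illustration of the idea in the paragraph above, the following Python sketch ranks candidate algorithms for a new dataset by averaging their recorded performance on the most similar past datasets. It is only one simple instance of meta-learning (a nearest-neighbour recommender over meta-features), not the method taught in the tutorial. The dataset names, meta-features, accuracy values, and the function `recommend` are all hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical meta-level information: for each past dataset, three
# meta-features (log10 of instance count, number of attributes, class entropy).
META_FEATURES = {
    "d1": (2.0, 10.0, 0.9),
    "d2": (4.0, 50.0, 0.5),
    "d3": (2.2, 12.0, 1.0),
    "d4": (4.1, 45.0, 0.4),
}

# Hypothetical prior experiments: accuracy of each algorithm on each dataset.
PERFORMANCE = {
    "d1": {"knn": 0.81, "tree": 0.78, "svm": 0.85},
    "d2": {"knn": 0.70, "tree": 0.88, "svm": 0.75},
    "d3": {"knn": 0.79, "tree": 0.80, "svm": 0.86},
    "d4": {"knn": 0.68, "tree": 0.90, "svm": 0.77},
}


def recommend(new_features, k=2):
    """Rank algorithms for a new dataset using its k nearest past datasets."""
    # Per-feature value ranges, used to put all meta-features on one scale.
    columns = list(zip(*META_FEATURES.values()))
    spans = [(max(col) - min(col)) or 1.0 for col in columns]

    def distance(name):
        past = META_FEATURES[name]
        return math.sqrt(sum(((a - b) / s) ** 2
                             for a, b, s in zip(new_features, past, spans)))

    # The k past datasets whose meta-features are closest to the new one.
    neighbours = sorted(META_FEATURES, key=distance)[:k]

    # Average each algorithm's accuracy over those neighbours.
    algorithms = PERFORMANCE[neighbours[0]].keys()
    mean_accuracy = {
        algo: sum(PERFORMANCE[n][algo] for n in neighbours) / len(neighbours)
        for algo in algorithms
    }

    # Best expected accuracy first.
    return sorted(mean_accuracy, key=mean_accuracy.get, reverse=True)


if __name__ == "__main__":
    # A small dataset resembling d1 and d3.
    print(recommend((2.1, 11.0, 0.95)))
```

Returning a full ranking rather than a single winner lets a user try the next candidate if the top recommendation disappoints on the new task.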

Moreover, many contemporary problems require that solutions be elaborated in the form of complex systems or workflows which include many different processes or operations. Constructing such complex systems or workflows requires extensive expertise, and could be greatly facilitated by leveraging planning, meta-learning, and intelligent system design. This task is inherently interdisciplinary. The tutorial will provide an introduction to this new and challenging domain. We also discuss the prerequisites for effective meta-learning systems, and how recent infrastructures, such as OpenML.org, open up new possibilities to build systems that effectively advise users on which algorithms to apply.

The tutorial is relevant to the ECML PKDD community because one of the largest hurdles to adopting machine learning solutions in applications is the plethora of methods available and their varying performance: there is no guarantee that a particular approach will continue to perform well beyond the testing phase. The techniques presented in this tutorial can help to alleviate this problem, while at the same time making it easier for researchers to use machine learning and data mining as off-the-shelf tools.

The intended audience includes researchers (Ph.D.s) and research students interested in learning about, or consolidating their knowledge of: the state of the art in algorithm selection and algorithm configuration; how to use data mining software and platforms to select algorithms in practice; and how to provide advice to end users about which algorithms to select in diverse domains, including optimization and SAT, and incorporate this knowledge in new platforms. We specifically aim to attract researchers in diverse areas who have encountered the problem of algorithm selection, and thus to promote the exchange of ideas and possible collaborations.

This tutorial can be seen as part of a joint endeavor that also includes a workshop on Metalearning & Algorithm Selection (http://metasel2015.inesctec.pt/).

Names of the tutorial instructors

Pavel Brazdil
University of Porto / INESC TEC, Porto, Portugal
pbrazdil at inescporto.pt
http://www.liaad.up.pt/area/pbrazdil/
Main research interests: Data mining, machine learning, metalearning and text mining.

Joaquin Vanschoren
Eindhoven University of Technology (TU/e), Eindhoven, The Netherlands
j.vanschoren at tue.nl
http://about.me/joaquinvanschoren
Main research interests: meta-learning, web-scale machine learning and data science.

Christophe Giraud-Carrier
Dept. of Computer Science, Brigham Young University (BYU), USA
cgc at cs.byu.edu
Main research interests: Data mining, machine learning, metalearning.

Lars Kotthoff
University College Cork, Ireland
lars.kotthoff@insight-centre.org
Main research interests: Algorithm selection and configuration, innovative applications of constraint programming, especially in machine learning and data mining.

List of previous venues
This tutorial builds on several prior tutorials on meta-learning, algorithm selection, and algorithm configuration. These were mostly presented at AI conferences, and typically covered these fields separately. The proposed tutorial has been updated significantly, and stresses recent advances at the confluence of these fields, i.e., the use of machine learning models for combined algorithm selection and configuration, as well as recent infrastructures and tools.

  • A tutorial on Meta-learning & Algorithm Selection was presented at ECAI 2014 and attracted over 50 participants. Two of the current proponents (P. Brazdil, J. Vanschoren) presented parts of this tutorial. 
  • A tutorial on Automatically Improving Empirical Performance: Algorithm Configuration and Selection was presented at AAAI 2013, in part by Lars Kotthoff, attracting approximately 50 participants.
  • A tutorial on Advances in Algorithm Selection and Configuration for Constraint Solving and Satisfiability was presented at IJCAI 2013, in part by Lars Kotthoff, attracting approximately 50 participants.
  • A tutorial on Metalearning was presented by Christophe Giraud-Carrier at ICMLA 2008 and was attended by about 15 participants.
