COST Action IC0702 Spring School 2009

**A Practical Tutorial on Normative Reasoning**

Leon van der Torre

**Arguing and Explaining Decisions**

Leila Amgoud

**Belief Functions: Theory and Applications to Pattern Recognition and Learning**

Thierry Denoeux

**Graphical Models and How to Learn them from Data**

Christian Borgelt

**Statistical Reasoning under Uncertainty and Imprecision**

María Ángeles Gil

**Fuzzy Representation of Random Variables**

Ana Colubi

**Regression Analysis with Imprecise Data**

Gil González-Rodríguez

**Robust Statistics: Concepts, Methods, Applications and Software**

Peter Filzmoser

**Small Area Estimation under Time Models**

Domingo Morales

**The Role of Similarity Relations in Fuzzy Systems**

Frank Klawonn

**A Consistent Hybrid Model for Handling Partial Knowledge**

Giulianella Coletti

**Fuzzy Inference Tools for Decision Makers**

Raimundas Jasinevicius

**Optimizing Risk and Reward of Financial Portfolios**

Manfred Gilli

**The Convergence of Estimators based on Heuristics**

Peter Winker

**Classification of Gene Expression Data**

Mario Guarracino

Normative systems are "systems in the behavior of which norms play a role and which need normative concepts in order to be described or specified". A normative multi-agent system combines models for normative systems (dealing, for example, with obligations, permissions and prohibitions) with models for multi-agent systems. Interest in normative systems in the computer science community is increasing, due to the observation five years ago in the so-called AgentLink Roadmap, a consensus document on the future of multi-agent systems research, that norms must be introduced into agent technology in the medium term (i.e., now!) to provide infrastructure for open communities, reasoning in open environments, and trust and reputation. Norms have been proposed in multi-agent systems and computer science to deal with coordination issues, to deal with security issues of multi-agent systems, to model legal issues in electronic institutions and electronic commerce, to model multi-agent organizations, and so on. In this tutorial we consider normative decision making and reasoning under uncertainty and imprecision.

Argumentation is a reasoning model based on the construction and evaluation of interacting arguments. These arguments are intended to support, explain or attack statements, which may be decisions, opinions, and so on. Argumentation has been used for different purposes, such as non-monotonic reasoning, and has also been used extensively for modeling different kinds of dialogues, in particular persuasion and negotiation.

In this presentation, we deal with an argumentative view of decision making, thus focusing on the issue of justifying the best decision to make in a given situation by arguments. We present a framework in which decisions are articulated on the basis of arguments. This framework is relevant for different decision problems or approaches such as decision making under uncertainty, multiple criteria decisions, or rule-based decisions.

The theory of belief functions (also referred to as Dempster-Shafer theory) is a generalization of probability theory allowing for the representation of uncertain and imprecise knowledge. Introduced in the context of statistical inference by A. P. Dempster in the 1960s, it was developed by G. Shafer in the 1970s as a general framework for combining evidence and reasoning under uncertainty. After a general introduction to this theory, we will focus on its application to data classification and clustering. As will be shown, Dempster-Shafer theory makes it possible to handle uncertain and imprecise observations, such as partially supervised data in classification tasks. The language of belief functions also allows us to generate rich descriptions of the data (using, e.g., the new concept of credal partition in clustering problems), and to combine efficiently information coming from several sources (such as statistical data and expert knowledge).
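As a minimal illustration of how evidence from several sources is combined in this framework, the sketch below implements Dempster's rule of combination for two mass functions over a small frame of discernment; the frame {a, b, c} and the numbers are invented for the example:

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule: intersect focal sets, renormalise by 1 - conflict."""
    combined = {}
    conflict = 0.0
    for (A, p), (B, q) in product(m1.items(), m2.items()):
        inter = A & B
        if inter:
            combined[inter] = combined.get(inter, 0.0) + p * q
        else:
            conflict += p * q  # mass assigned to the empty intersection
    return {A: p / (1.0 - conflict) for A, p in combined.items()}

# Two independent pieces of evidence about a class label in {a, b, c};
# mass on a non-singleton set (e.g. {a, b}) expresses imprecision.
m1 = {frozenset("a"): 0.6, frozenset("ab"): 0.3, frozenset("abc"): 0.1}
m2 = {frozenset("a"): 0.5, frozenset("bc"): 0.2, frozenset("abc"): 0.3}
m12 = combine(m1, m2)
```

Note how the vacuous-ish masses on the whole frame keep the combination cautious, while agreement of both sources on {a} concentrates mass there after normalisation.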

In the last 20 years, probabilistic graphical models - in particular Bayes networks and Markov networks - have become very popular as tools for structuring uncertain knowledge about a domain of interest and for building knowledge-based systems that allow sound and efficient inferences about this domain.

The core idea of graphical models is that certain independence relations usually hold between the attributes used to describe a domain of interest. In most uncertainty calculi - and in particular in probability theory - the structure of these independence relations is very similar to properties concerning the connectivity of nodes in a graph. As a consequence, one tries to capture the independence relations with a graph, in which each node represents an attribute and each edge a direct dependence between attributes. In addition, provided that the graph captures only valid independences, it prescribes how a probability distribution on the (usually high-dimensional) space spanned by the attributes can be decomposed into a set of smaller (marginal or conditional) distributions. This decomposition can be exploited to derive evidence propagation methods and thus enables sound and efficient reasoning under uncertainty.
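For instance, in a toy Bayes network with chain structure A → B → C over binary variables (the probability tables below are made up for illustration), the joint distribution decomposes as P(a, b, c) = P(a)·P(b|a)·P(c|b), so any marginal can be computed from the three small factors without ever building the full joint table:

```python
from itertools import product

# Conditional probability tables of a chain A -> B -> C (hypothetical numbers)
P_a = {0: 0.7, 1: 0.3}
P_b_a = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.4, (1, 1): 0.6}  # P(b | a)
P_c_b = {(0, 0): 0.8, (1, 0): 0.2, (0, 1): 0.3, (1, 1): 0.7}  # P(c | b)

def joint(a, b, c):
    # The graph licenses the decomposition P(a, b, c) = P(a) P(b|a) P(c|b)
    return P_a[a] * P_b_a[(b, a)] * P_c_b[(c, b)]

# Marginal P(C = 1) by summing out A and B over the small factors
p_c1 = sum(joint(a, b, 1) for a, b in product((0, 1), repeat=2))
```

With three binary variables the saving is negligible, but for dozens of attributes this decomposition is exactly what makes evidence propagation tractable.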

This lecture provides a brief introduction to the core ideas underlying graphical models, starting from their relational counterparts and highlighting the relation between independence and decomposition. Furthermore, the basics of model construction and evidence propagation are discussed, with an emphasis on join/junction tree propagation. A substantial part of the lecture is then devoted to learning graphical models from data, covering quantitative learning (parameter estimation) as well as the more complex qualitative or structural learning (model selection). The lecture closes with a brief discussion of example applications.

Randomness and fuzziness involve different approaches and models for dealing with uncertainty, and the two often coexist in real life. We will first discuss the main divergences and meeting points between the two approaches.

As part of the meeting points, some concepts in which the two sources of uncertainty arise combined have been established in the literature. Among them, fuzzy random variables (FRVs) play a key role. FRVs have been introduced to formalize either fuzzy perceptions/observations of underlying real-valued random mechanisms or, mainly, intrinsically fuzzy-valued random mechanisms. The interest of modelling and analyzing these mechanisms from a probabilistic/statistical viewpoint will be discussed.

Puri and Ralescu's model will be presented and some of its main probabilistic aspects and implications will be commented on. For many years, the statistical aspects and applications of FRVs received less attention than their probabilistic features. Recently, both descriptive and inferential studies about fuzzy- and real-valued parameters/characteristics associated with FRVs have been developed. A statistical methodology for estimating and testing hypotheses about the means of fuzzy random variables will be presented and illustrated. Other statistical problems involving FRVs will also be discussed.

Certain random experiments traditionally modeled through ordinal variables can be intuitively modeled by means of fuzzy random variables. This is the case of the forest fire index, which is usually fixed to range on a discrete scale from 1 to 5 (1 being absence of risk and 5 maximum risk). The different nature of the extreme values, together with the lack of precision entailed by the discretization, suggests representing the levels by means of specific fuzzy sets capturing these features.
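As a hypothetical sketch of such a representation (the shapes and the rescaling to [0, 1] are our own illustrative choices, not the fuzzification used in the talk), the five risk levels could be mapped to fuzzy sets, with shoulder-shaped sets for the extreme levels to reflect their different nature:

```python
def triangle(a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    def mu(x):
        if a < x <= b:
            return (x - a) / (b - a)
        if b < x < c:
            return (c - x) / (c - b)
        return 1.0 if x == b else 0.0
    return mu

def left_shoulder(b, c):
    """Full membership up to b, then a linear descent to 0 at c."""
    def mu(x):
        if x <= b:
            return 1.0
        return (c - x) / (c - b) if x < c else 0.0
    return mu

def right_shoulder(a, b):
    """Linear ascent from 0 at a, full membership from b onwards."""
    def mu(x):
        if x >= b:
            return 1.0
        return (x - a) / (b - a) if x > a else 0.0
    return mu

# Risk levels 1..5 rescaled to [0, 1]; the extremes are modelled by shoulders
# rather than triangles, capturing their qualitatively different nature.
levels = {
    1: left_shoulder(0.0, 0.25),
    2: triangle(0.0, 0.25, 0.5),
    3: triangle(0.25, 0.5, 0.75),
    4: triangle(0.5, 0.75, 1.0),
    5: right_shoulder(0.75, 1.0),
}
```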

Comparative trials concerning statistical hypothesis testing showed that modeling the experiment through fuzzy random variables frequently leads to better results (in terms of power) than simply using the ordinal scale. This fact led to the consideration of different kinds of fuzzifications, which were shown to be connected with the characterization of the distribution of real-valued random variables.

In this talk, different fuzzifications are presented and their uses are discussed. Specifically, empirical analyses supporting the suitability of some of the fuzzifications in connection with goodness-of-fit and equality-of-distribution tests will be shown. In addition, the usefulness of graphical displays will be examined in some cases, to illustrate that some of the most relevant parameters (namely, the mean value, variance and asymmetry) can be identified immediately.

The simple linear regression problem for fuzzy random variables is considered. Some of the more relevant regression analyses in this context will be recalled, although the focus will be on a specific model based on fuzzy-valued arithmetic. One advantage of this model is its ease of handling, in contrast to other models, which become quite complex in practice.

Since the space of fuzzy values is not linear, some pathological cases arise, namely, the possibility of a double theoretical linear model in some special situations. The conditions under which the double model exists are identified and characterized.

Whereas usual linear models between fuzzy random variables involve the conditional expectation (regression function), the model considered here is assumed to hold, in a wider setting, between the variables themselves. This entails a remarkable simplification of the associated estimation problems in the regression/correlation analyses.

The least squares estimation problem in terms of a versatile metric is addressed. The solutions are established in terms of the moments of the random elements involved, by employing the concept of the support function of a fuzzy set. Some considerations concerning the applicability of the model are made.
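For interval-valued data (a particular case of fuzzy values, where the support function reduces to the two endpoints, or equivalently a midpoint and a spread), a simplified moment-based least-squares solution can be sketched as follows. The equal weighting of midpoints and spreads and the restriction to a non-negative slope are simplifying assumptions of this illustration, not features of the model discussed in the talk:

```python
import statistics as st

def cov(u, v):
    """Population covariance of two equal-length samples."""
    mu, mv = st.fmean(u), st.fmean(v)
    return st.fmean([(x - mu) * (y - mv) for x, y in zip(u, v)])

def interval_ls(mid_x, spr_x, mid_y, spr_y):
    """Least-squares fit of Y = a X + eps for interval data, representing each
    interval by (midpoint, spread). Assumes a >= 0, so spr(aX) = a * spr(X),
    and a metric weighting midpoint and spread discrepancies equally."""
    a = (cov(mid_x, mid_y) + cov(spr_x, spr_y)) / (cov(mid_x, mid_x) + cov(spr_x, spr_x))
    # The residual interval eps is estimated through its mean midpoint/spread
    b_mid = st.fmean(mid_y) - a * st.fmean(mid_x)
    b_spr = st.fmean(spr_y) - a * st.fmean(spr_x)
    return a, (b_mid, b_spr)
```

The estimators depend only on sample moments of the midpoints and spreads, which is the kind of closed-form, moment-based solution the abstract alludes to.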

We will outline the basic concepts of robust statistics, from outlier detection and robust regression to robust location and covariance estimation. The latter estimates are used for robustifying various multivariate methods. Applications to real data examples will demonstrate the advantages of robust methods over their classical counterparts. The R packages "robustbase" and "rrcov" include the most important methods, and we will briefly introduce their essential features.
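As a small illustration of the basic idea (sketched in Python rather than R, with invented data), the median and the median absolute deviation (MAD) give location and scale estimates that a single gross outlier barely affects, whereas the classical mean and standard deviation are dragged away by it:

```python
import statistics as st

data = [9.8, 10.1, 10.0, 9.9, 10.2, 55.0]  # one gross outlier (made-up data)

mean, sd = st.fmean(data), st.stdev(data)    # classical: pulled towards 55
med = st.median(data)                        # robust location
mad = st.median([abs(x - med) for x in data])
robust_scale = 1.4826 * mad                  # consistency factor for Gaussian data

# Flag points whose robust z-score exceeds 3
outliers = [x for x in data if abs(x - med) / robust_scale > 3]
```

The same plug-in principle (replace mean/covariance by robust counterparts) underlies the robustified multivariate methods mentioned above.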

Small area parameters usually take the form *h(y)*, where *y* is the vector containing the values of all units in the domain and *h* is a linear or nonlinear function. If *h* is not linear or the target variable is not normally distributed, then the unit-level approach has no standard procedure and each case must be treated with a specific methodology. Area-level linear mixed models can be applied quite generally to produce EBLUP estimates of linear and non-linear parameters, because direct estimates are weighted sums, so that the assumption of normality may be acceptable. In this tutorial we treat the problem of estimating small area non-linear parameters, with special emphasis on the estimation of poverty indicators. To this end, we borrow strength from time by using area-level linear time models. We consider two time-dependent area-level models, empirically investigate their behavior, and apply them to estimate poverty indicators from the Spanish Living Conditions Survey.

This tutorial provides an overview of fuzzy systems from the viewpoint of similarity relations. Similarity relations turn out to be an appealing framework in which typical concepts and techniques applied in fuzzy systems and fuzzy control can be better understood and interpreted. They can also be used to describe the unavoidable indistinguishability inherent in any fuzzy system. Similarity relations also provide a stricter framework for reasoning under uncertainty than arbitrary fuzzy systems.
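A minimal sketch of this connection (our own illustration, with an invented scale parameter): a similarity relation on the reals induces, for each crisp value x0, an extensional fuzzy set "approximately x0" whose membership function is the similarity to x0. The familiar triangular fuzzy sets arise exactly in this way:

```python
def similarity(x, y, scale=1.0):
    """A similarity relation on the reals induced by scaled distance:
    E(x, y) = max(0, 1 - |x - y| / scale). Reflexive and symmetric;
    two values closer than `scale` are partially indistinguishable."""
    return max(0.0, 1.0 - abs(x - y) / scale)

def extensional_point(x0, scale=1.0):
    """The fuzzy set 'approximately x0': membership of x is E(x0, x),
    i.e. the degree to which x is indistinguishable from x0."""
    return lambda x: similarity(x0, x, scale)

about_2 = extensional_point(2.0)  # a triangular fuzzy set peaked at 2
```

Read this way, the granularity of a fuzzy partition is not an arbitrary design choice but a statement about which inputs the system cannot (and need not) distinguish.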

Coherent conditional probabilities can be regarded as a general framework in which different measures of uncertainty can be reread as particular coherent assessments. This model is thus able to consistently manage partial, imprecise and vague information and the relevant inferential process. Moreover, the coherence principle provides algorithms both for checking the consistency of assessments and for coherently extending an assessment to new events.

Introduction: decision making process. Verbal vs. numerical, fuzzy vs. crisp. SWOT analysis and FCMs basics. SWOT + FCM = new tool for decision dynamics. FCMs extensions. Fuzzy expert maps (FEM) and decision suggesting tools. Examples and applications (environmental project; NATO enlargement; port security system; medical diagnostics; USA war in Iraq; fund portfolio management). Concluding remarks.

Manfred Gilli (Enrico Schumann)

Portfolios of stocks are often characterised by a desired property, the "reward", and something undesirable, the "risk". The aim of portfolio selection (or optimisation) is then to find (Pareto-)efficient combinations of stocks with respect to these properties. That is, we look for portfolios that cannot be improved any more with respect to one property without worsening the other.

Reward and risk are usually identified with mean portfolio return and variance of returns, respectively. While this is computationally attractive, it is often argued that, with financial time series so demonstrably non-Gaussian, other specifications like partial and conditional moments, quantiles, or drawdowns are more appropriate. In practice, however, such alternative measures are mainly used for ex post performance evaluation, only rarely for explicit portfolio optimisation.

One reason is that, other than in the mean--variance case, the optimisation with these alternative risk and reward measures is more difficult since the resulting problems are often not convex and cannot be solved with standard methods.

We detail a simple but effective optimisation heuristic, Threshold Accepting, that allows us to replace mean and variance by alternative functions, while not requiring any parametric assumptions for the data, ie, the technique works directly on the empirical distribution function of portfolio returns. More specifically, we describe how to select portfolios by optimising risk--reward ratios constructed from alternative risk and reward specifications.
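The acceptance scheme of Threshold Accepting can be sketched in a few lines (a generic illustration, not the exact implementation discussed in the talk). In the portfolio application, x would be a vector of asset weights, the neighbour function would shift a small amount of weight between two assets, and the objective would be the chosen risk-reward ratio evaluated on the empirical distribution of portfolio returns:

```python
import random

def threshold_accepting(objective, neighbour, x0, thresholds, steps=2000, seed=42):
    """Minimise `objective` by local search that tolerates small deteriorations:
    a candidate is accepted whenever it worsens the objective by less than the
    current threshold; the thresholds shrink to zero over the rounds, so the
    search ends as a pure descent."""
    rng = random.Random(seed)
    x, fx = x0, objective(x0)
    for tau in thresholds:
        for _ in range(steps):
            y = neighbour(x, rng)
            fy = objective(y)
            if fy - fx < tau:  # accept improvements and small deteriorations
                x, fx = y, fy
    return x, fx

# Toy usage on a one-dimensional objective (illustration only)
x_best, f_best = threshold_accepting(
    objective=lambda x: x * x,
    neighbour=lambda x, rng: x + rng.uniform(-0.5, 0.5),
    x0=5.0,
    thresholds=[1.0, 0.1, 0.0],
)
```

Unlike simulated annealing, acceptance is deterministic given the threshold sequence, which makes the heuristic simple to tune; and since only objective values are compared, no convexity or smoothness of the risk-reward measure is required.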

Econometric theory describes estimators and their properties, e.g., the convergence of maximum likelihood estimators. However, it often ignores that the estimators cannot be computed using standard tools, e.g., due to multiple local optima. In such cases, optimization heuristics might be helpful. The additional random component of the heuristics can be analyzed together with the econometric model. A formal framework is proposed for analyzing the joint convergence of estimator and stochastic optimization algorithm. In an application to a GARCH model, actual rates of convergence are estimated by simulation. The overall quality of the estimates improves compared to conventional approaches.

Microarray technology makes it possible to measure the activation levels of thousands of genes simultaneously. It is possible, for example, to classify types of cancer with respect to the patterns of gene activity in the tumor cells.

Additionally, by examining the differences in gene activity between untreated and treated tumor cells, it is possible to understand how different therapies affect tumors. However, data produced by high-throughput medical equipment are noisy and incomplete, resulting in uncertainty in gene expression datasets. In this talk, the problem of dealing with imprecise classification methods and noisy datasets will be addressed, providing solutions based on statistical learning theory and a priori knowledge.