Evaluation and Validation Criteria for Agent-oriented Software Engineering Methodologies

As outlined in Comparing and Evaluating Agent-oriented Software Engineering Approaches, no validation method for software engineering processes is flawless. Personal preferences and biases, familiarity with the evaluated processes, learning effects, prior experience, and similar factors always prevent a truly objective evaluation. Nonetheless, it is crucial to provide evidence that a process is applicable and to give guidance as to its specialities. We follow Tran and Low (2005) and Tran et al. (2005) by evaluating PosoMAS in four areas: process-related criteria, covering aspects of the process life-cycle, the steps that are used, and the agent architectures that are supported; technique-related criteria, evaluating the techniques employed in the individual modelling steps; model-related criteria, looking at the notational aspects and the concepts included in the models used; and supportive-feature criteria, focusing on tools and support for various AOSE specialities. We do not use the full catalogue from the original papers but instead incorporate some aspects from Al-Hashel et al. (2007) to bolster the structural evaluation.

The criteria put forward in Tran and Low (2005) and Tran et al. (2005) are, at times, very specific. They contain, e.g., criteria such as "Communication ability: Can the models support and represent a `knowledge-level' communication ability (i.e., the ability to communicate with other agents with language resembling human-like speech acts)?" (emphasis in the original text). It follows that the evaluation is mostly tailored to methodologies that provide specialised models instead of re-using standard software engineering modelling techniques the way PosoMAS does. Other criteria explicitly evaluate the use of "standard AOSE concepts", e.g., roles for agents or the use of ontologies. For reasons outlined in Steghöfer et al. (2014), such elements are deliberately omitted in PosoMAS, mainly to avoid limiting the applicability of the process to specialised architectures or agent meta-models. Therefore, some of the model-related criteria, e.g., the one mentioned above or "Human Computer Interaction: Do the models represent human users and the user interface?", are ignored or altered in the following: such aspects can certainly be included, but PosoMAS does not prescribe how or whether this is done. The technique-related criteria and the criteria concerning the steps of the process are treated separately as well.

On the other hand, the criteria used by Tran and Low (2005) do not always provide clear guidelines on how they are to be evaluated. The criterion "complexity", e.g., is defined as "is there a manageable number of concepts expressed in each model/diagram?" Clearly, the interpretation of "manageable" differs considerably between evaluators. In addition, the authors use some terms in the description of the criteria differently from their common usage in the SE community: analysis, design, and implementation, e.g., are usually designated not as the "phases" of a process but as its disciplines. These oversights have been fixed as far as possible in the overview below. Unless explicitly noted otherwise, all criteria and their descriptions are taken from Tran et al. (2005).

Process-related criteria

These criteria are used to determine general properties of the process, such as the supported life-cycle, as well as its specific suitability for multi-agent system development, especially the kind of MAS it can be applied to. The criteria are listed in Table 1. They can be assessed based on the process description, especially the life-cycle and the activities that are defined. A criterion not present in Tran et al. (2005) is "meta-model based": a meta-model-based process prescribes a meta-model for the description of the agent and/or system architecture. As argued in Steghöfer et al. (2014), meta-models can be helpful, but they also limit the designer's freedom in design choices.

Table 1: Process-related criteria for the evaluation of agent-oriented software engineering processes according to Tran et al. (2005)
Criterion Description Note
Development life-cycle Is the life-cycle formally defined? Which life-cycle is adopted (e.g., Waterfall)?
Coverage of the life-cycle What disciplines are covered by the methodology (e.g., analysis, design, implementation). Description was originally: "What phases of the life-cycle..."
Development perspectives Does the methodology support both bottom-up and top-down development or does it follow a hybrid approach?
Application Domain Can the methodology be applied to arbitrary domains or is it suited for specific domains?
Size of MAS Which system size does the methodology support?
Agent paradigm Is the methodology suited for the development of agents of a specific paradigm (e.g., BDI agents) or is it independent of the reasoning mechanism and meta-model used? Was originally: "Agent nature".
Support for model validation Are there provisions for the validation and verification of the models and specifications developed in the methodology?
Refinability Does the process define how a model is gradually refined and augmented in the course of the process?
Approach towards MAS development Does the methodology apply a specific MAS development approach (e.g., based on methods from knowledge engineering, object-oriented, role-based, or "non-role-based", i.e., relying on other means such as use cases, workflow models, or interactions)?
Meta-model Based Does the process prescribe a meta-model as the basis for architectural models of the system? New

Model-related criteria

The models created in the course of a process are the core vehicle to document design decisions, communicate with the stakeholders, guide the developers, and determine the scope of the project. Therefore, a number of criteria evaluate their expressiveness, the concepts they capture, and the way they are used in the process. As noted above, some of the criteria originally defined by Tran et al. (2005) are omitted or changed, since they were either formulated with very specific expectations about a MAS in mind or depend on the discretion of the designer. High complexity, e.g., can easily be mitigated by creating different diagrams for a model, thus providing different views on the same model. While the model itself can be highly complex, the views, usually realised as different diagrams, can shed light on specific aspects and thus simplify its depiction tremendously. The criteria are listed in Table 2. Tran et al. (2005) also provide a list of MAS concepts that are used to evaluate the coverage of more specific notions. This list is used in Qualitative Comparison of AOSE processes as well.

Table 2: Model-related criteria for the evaluation of agent-oriented software engineering processes according to Tran et al. (2005)
Criterion Description Note
Syntax and Semantics Are the syntactical elements of the models and their semantics clearly defined? Was originally: "Formalization/Preciseness of models"
Model transformations Is the transformation of models as part of a model-driven engineering approach supported by providing guidelines for the transformations? Was originally: "Model derivation"
Consistency Do provisions exist that guarantee internal consistency within a model and between models? Is it possible to check consistency between levels of abstractions and between different representations of the same aspect in different models?
Modularity Can agents be structured in a modular way?
Abstraction Is it possible to model the system and the agents at different levels of detail and abstraction?
Autonomy Does the methodology support modelling the autonomous features of an agent?
Adaptability Are features of adaptability such as learning supported by the models?
Cooperative behaviour Does the methodology support modelling the cooperative behaviour of the agents?
Inferential capability Is it possible to model automatic agent inference, i.e., the capability of an agent to derive concrete actions from abstract commands?
Communication ability Does the methodology support modelling communication and knowledge exchange between agents?
Personality Does the methodology support modelling an agent's "ability to manifest attributes of a `believable' human character"?
Reactivity Is there support for the modelling of reactive agents that act as a response to sensed stimuli?
Proactivity Does the methodology support modelling proactive agent behaviour, i.e., the ability of the agents to self-initiate a deliberate act to achieve a certain goal? Was originally: "Deliberative behaviour"
Temporal continuity Can the models support and represent temporal continuity of agents (i.e., persistence of identity and state over long periods of time)?
Concurrency Do the models prescribed by the methodology allow capturing concurrency of and synchronisation between processes?
Model Reuse Does the methodology provide, or make it possible to use, a library of reusable models?

As PosoMAS uses standard UML diagrams with a specialised UML profile, most of the criteria regarding the expressiveness of the models can be answered in the affirmative. The question becomes whether the methodology gives guidance on how these features can be modelled with standard UML. While some of the criteria are very general and apply to all kinds of multi-agent systems, others aim at very specific kinds of MAS or agents. The "personality" criterion, e.g., is applicable when an agent should act as a placeholder for a human. If a game is developed, the non-player characters could have a "personality" of their own. The role of emotion has been investigated in the multi-agent systems community as well (see, e.g., Brave et al., 2005), and personality and emotion can be linked (Allbeck and Badler, 2002). However, this aspect of multi-agent systems is usually ignored by AOSE processes, since they are mostly aimed at technical solutions without strong requirements for such concepts.

Supportive feature criteria

An AOSE methodology can support the development team in a number of ways that are captured in the set of criteria detailed in Table 3. This includes special features such as mobile agents or ontologies, but also support for a number of other properties, such as self-organisation and self-interested agents. These criteria thus include a number of important principles for open, self-organising MAS. As the evaluation will show, these criteria also capture the most important differences between the processes. This can be ascribed to the fact that Tran et al. (2005) added a number of central concepts to these "supportive features". The criterion "dynamic structure", e.g., deals with self-organisation, an important part of processes such as O-MaSE and PosoMAS.

Table 3: Supportive feature criteria for the evaluation of agent-oriented software engineering processes according to Tran et al. (2005)
Criterion Description Note
Software and methodological support Are there tools and libraries of reusable components that support the methodology?
Open systems and scalability Does the methodology provide support for open systems that allow agents and resources to dynamically enter and leave the system?
Dynamic structure Is there support for self-organisation processes, i.e., the dynamic reconfiguration of the system structure?
Performance and robustness Are techniques for dealing with exceptions, capturing error states and recovering from failures as well as performance monitoring provided? Was originally: "Agility and robustness"
Support for conventional objects Can "ordinary objects" be integrated into the design and are interfaces to such objects captured?
Support for mobile agents Are there possibilities to integrate mobile agents in the developed MAS?
Support for self-interested agents Is there support for agents that do not adhere to the benevolence assumption and thus do not necessarily contribute to the overall system goal?
Support for ontologies Can ontologies be integrated in the design of the MAS?

Process Steps and Related Criteria

In addition, by analysing existing methodologies, Tran and Low (2005) derive the 19 crucial process steps shown in Table 4. According to the authors, the importance of these steps has been verified by "experts"; they can thus be seen as an agreed-upon minimal set of steps for the design of a multi-agent system. The presence of these steps, the "notational components" that are created in them, the weaknesses and strengths of the concrete realisations, the ease of understanding, and the definition of inputs and outputs of the steps are further process-related criteria that are considered in the context of these steps.

Table 4: 19 required steps for an AOSE process according to Tran et al. (2005)
  1. Identify system goals
  2. Identify system tasks/behaviour
  3. Specify use case scenarios
  4. Identify roles
  5. Identify agent classes
  6. Model domain conceptualisation
  7. Specify acquaintances between agent classes
  8. Define interaction protocols
  9. Define content of exchanged messages
  10. Specify agent architecture
  11. Define agent mental attitudes (e.g., goals, beliefs, plans, commitments,...)
  12. Define agent behavioural interface (e.g., capabilities, services, contracts, ...)
  13. Specify system architecture (i.e., overview of all components and their connections)
  14. Specify organisational structure/control regime/inter-agent social relationships
  15. Model MAS environment (e.g., resources, facilities, characteristics)
  16. Specify agent-environment interaction mechanism
  17. Specify agent inheritance and aggregation
  18. Instantiate agent classes
  19. Specify agent instances deployment

Furthermore, the technique-related criteria listed in Table 5 are evaluated for each of the steps. While these criteria again introduce subjective measures, it is important to determine whether individual steps are merely mentioned or described extensively and supported by examples. The criteria are used in a metric to evaluate how well the steps are covered by a process.

Table 5: Technique-related criteria for the evaluation of singular steps within
agent-oriented software engineering processes according to Tran et al. (2005)
Criterion Description Note
Availability of techniques and heuristics Which techniques are used in each step and how are the models produced through them?
Ease of understanding and usability of the techniques Are the techniques described in a way that is easy to understand and to apply for a developer?
Provision of examples Does the process description detail the techniques with examples?
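To illustrate how such a coverage metric might be computed, the following sketch rates each of the 19 steps on the three technique-related criteria and aggregates the ratings. The ordinal 0–2 scale and the equal weighting of the criteria are assumptions chosen for illustration; they are not prescribed by Tran et al. (2005):

```python
from dataclasses import dataclass
from statistics import mean

# Assumed ordinal scale (not from the original papers):
# 0 = not addressed, 1 = merely mentioned, 2 = described with techniques and examples
@dataclass
class StepAssessment:
    step: str                 # one of the 19 required steps
    techniques: int           # availability of techniques and heuristics
    understandability: int    # ease of understanding and usability
    examples: int             # provision of examples

def step_score(a: StepAssessment) -> float:
    """Average the three technique-related criteria for one step."""
    return mean([a.techniques, a.understandability, a.examples])

def coverage(assessments: list[StepAssessment]) -> float:
    """Overall coverage: mean step score, normalised to [0, 1]."""
    return mean(step_score(a) for a in assessments) / 2

# Example: a process that fully covers goal identification but only
# mentions role identification.
ratings = [
    StepAssessment("Identify system goals", 2, 2, 2),
    StepAssessment("Identify roles", 1, 1, 0),
]
print(f"coverage: {coverage(ratings):.2f}")
```

A full assessment would rate all 19 steps; partial coverage of a step then lowers the overall score proportionally rather than counting the step as simply present or absent.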

Criteria evaluation and justification

In evaluating these criteria, it is necessary to justify the answers. The criterion "Size of MAS", e.g., can be answered by a simple number, but the answer has to be based on characteristics of the process, which should be made explicit. If a process provides guidance for a certain system size, this guidance and how it supports the developer should be mentioned. The mere presence of such guidance, however, does not ensure that it is usable: it has to be effective and helpful to the development team as well. Unfortunately, the original comparison papers do not contain such justifications. However, we will provide the rationales for our assessment of PosoMAS in Qualitative Comparison of AOSE processes.
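One way to keep such justifications explicit is to record each assessment as a structured triple of criterion, answer, and supporting rationale. The record layout and the example values below are hypothetical, intended only to show the shape of a justified assessment:

```python
from dataclasses import dataclass

# Hypothetical record pairing a criterion with its answer and the
# process characteristics that justify it.
@dataclass(frozen=True)
class CriterionAssessment:
    criterion: str      # e.g. "Size of MAS"
    answer: str         # the assessed value
    justification: str  # explicit evidence from the process description

example = CriterionAssessment(
    criterion="Size of MAS",
    answer="large-scale systems",
    justification="The process description contains explicit guidance on "
                  "system decomposition and hierarchical organisation.",
)
print(example.criterion, "->", example.answer)
```

Keeping the justification field mandatory forces each answer to be traceable to the process description rather than to the evaluator's impression.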



This material is made available under the Creative Commons Attribution-ShareAlike License v3.0.

© Copyright 2013, 2014 by Institute for Software & Systems Engineering, University of Augsburg
Contact: Jan-Philipp Steghöfer