Publications

[1] F. Scharffe and J. Euzenat, “Méthodes et outils pour lier le web des données,” in Actes de RFIA 2010, 2010. [ bib | .pdf ]
The Web of data consists of publishing data on the Web in such a way that they can be interpreted and connected together. It is thus critical to be able to establish links between these data, both for the Web of data and for the Semantic Web that it helps to feed. We propose a general framework and show how the diverse techniques developed for establishing these links fit into it. We then propose an architecture that makes it possible to combine various interlinking systems and have them collaborate with systems developed for ontology matching, which has much in common with link discovery.

[2] O. Sváb-Zamazal, V. Svátek, and F. Scharffe, “Pattern-based ontology transformation service,” in Proc. 1st IC3K international conference on knowledge engineering and ontology development (KEOD), Funchal (PT), 2009. [ bib | .pdf ]
Many use cases for semantic technologies (e.g. reasoning, modularisation, matching) could benefit from an ontology transformation service. This service is supported by ontology transformation patterns, each consisting of corresponding ontology patterns that capture alternative modelling choices and an alignment between them. In this paper we present the transformation process together with its two constituents: a pattern detection process and an ontology transformation process. The pattern detection process is based on SPARQL, and the transformation process is based on an ontology alignment representation with specific extensions carrying detailed information about the transformation.

[3] O. Sváb-Zamazal, V. Svátek, J. David, and F. Scharffe, “Towards metamorphic semantic models,” in Proc. 6th european conference on semantic web (ESWC), Heraklion (GR), 2009. [ bib ]
On-demand ontology transformation inside a formalism such as OWL can be useful for semantic applications. The same conceptualisation can be modelled in diverse ways; (parts of) an ontology thus can be transformed from one modelling choice to another, taking advantage of patterns for e.g. n-ary relations, specified values or naming variations. Although the strictly formal semantics may change, the intended meaning should be preserved. Different use cases can share an ontology transformation service that will apply transformation patterns from a library; the development of the library (reusing existing resources) is in progress. We describe three use cases, focusing on the ontology matching use case.

[4] F. Scharffe, Y. Liu, and C. Zhou, “Rdf-ai: an architecture for rdf datasets matching, fusion and interlink,” in Proc. IJCAI 2009 workshop on Identity, reference, and knowledge representation (IR-KR), Pasadena (CA US), 2009. [ bib | .pdf ]
With the recent publication of large quantities of RDF data, the Semantic Web now allows concrete applications to be developed. Multiple datasets are effectively published according to the linked-data principles. Integrating these datasets through interlinking or fusion is needed in order to ensure interoperability between the resources composing them. There is thus a growing need for tools providing dataset management. We present in this paper RDF-AI, a framework and a tool for managing the integration of RDF datasets. The framework includes five modules for pre-processing, matching, fusing, interlinking and post-processing datasets. The framework implementation results in a tool providing RDF dataset integration functionalities in a linked-data context. Evaluation of RDF-AI on existing datasets shows promising results towards a Semantic Web aware dataset integration tool.
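
As a rough illustration of how five such modules could be chained, here is a minimal sketch. It is not the RDF-AI implementation: the function names, the exact-value matching rule, the order in which interlinking and fusion are called, and the `ex:` sample data are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]
Dataset = Set[Triple]


def preprocess(ds: Dataset) -> Dataset:
    """Normalise object values (toy rule: every object is treated as a literal)."""
    return {(s, p, o.strip().lower()) for s, p, o in ds}


def match(a: Dataset, b: Dataset) -> Set[Tuple[str, str]]:
    """Pair subjects of a and b that share at least one (property, value)."""
    index: Dict[Tuple[str, str], Set[str]] = {}
    for s, p, o in b:
        index.setdefault((p, o), set()).add(s)
    return {(s, t) for s, p, o in a for t in index.get((p, o), set())}


def interlink(pairs: Set[Tuple[str, str]]) -> Dataset:
    """Express matched pairs as owl:sameAs links."""
    return {(s, "owl:sameAs", t) for s, t in pairs}


def fuse(a: Dataset, b: Dataset, pairs: Set[Tuple[str, str]]) -> Dataset:
    """Merge b into a, rewriting matched b resources to their a counterpart."""
    rename = {t: s for s, t in sorted(pairs)}
    return a | {(rename.get(s, s), p, rename.get(o, o)) for s, p, o in b}


def postprocess(ds: Dataset) -> List[Triple]:
    """Return the triples in a stable order for serialisation."""
    return sorted(ds)


def integrate(a: Dataset, b: Dataset) -> Tuple[Dataset, List[Triple]]:
    """Run the hypothetical five stages and return (links, fused triples)."""
    a, b = preprocess(a), preprocess(b)
    pairs = match(a, b)
    return interlink(pairs), postprocess(fuse(a, b, pairs))


if __name__ == "__main__":
    left = {("ex:a1", "foaf:name", " Ada Lovelace ")}
    right = {("ex:b7", "foaf:name", "ada lovelace"),
             ("ex:b7", "foaf:mbox", "mailto:ada@example.org")}
    links, fused = integrate(left, right)
    print(links)
    print(fused)
```

A real matcher would use similarity measures rather than exact value equality; the point here is only the shape of the pipeline.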

[5] F. Scharffe, Correspondence Patterns Representation. PhD thesis, University of Innsbruck, 2009. [ bib | .pdf ]
We introduce in this dissertation correspondence patterns as a novel way to model ontology alignments. Correspondence patterns are meant to provide reference templates that help to model ontology alignments, just as design patterns in software engineering help to model software designs. We develop an ontology mediation framework positioning patterns at the top level of abstraction of an ontology alignment representation, and introduce an expressive ontology alignment language able to represent correspondence patterns. Based on this framework, we propose a correspondence pattern library containing a number of patterns that solve classical ontology mismatches, modeled using the expressive alignment language.

[6] F. Scharffe and D. Fensel, “Correspondence patterns for ontology mediation,” in Proceedings of the 16th International Conference on Knowledge Engineering and Knowledge Management (EKAW2008), (Acitrezza, Italy), Springer, September 2008. [ bib | .pdf ]
We introduce in this paper correspondence patterns as templates to model ontology alignments. Correspondence patterns capture regularities that recur when aligning ontologies. They complement ontology matching algorithms and graphical mapping tools, and facilitate the task of the engineer building the alignment between a pair of ontologies. We develop an ontology mediation framework based on three ontology correspondence abstraction levels. We particularly detail the most abstract level: correspondence patterns.

[7] O. Shafiq, F. Scharffe, D. Wutke, and G. T. del Valle, “Resolving data heterogeneity issues in open distributed communication middleware,” in Proceedings of the Third International Conference on Internet and Web Applications and Services ICIW 2008, (Athens, Greece), June 2008. [ bib | http ]
Triple Space Computing is a communication and coordination paradigm that allows semantic technologies in general to communicate by publishing and reading semantic data. It has also been provided as an underlying communication middleware for Semantic Web Services. As Triple Space Computing is scaled and opened up to reach its full potential at a global level, heterogeneity among the different users communicating over Triple Space is very likely to arise. This paper focuses on providing Triple Space Computing with data mediation to enable easy integration of data, information, and knowledge. Mediation is a technique to overcome heterogeneity issues in a system, i.e. to remove differences in the syntactic representation and the intended semantics of the data that is exchanged. The paper introduces an Abstract Mapping Language and shows how mapping rules can be created using this mapping language. It further proposes mediation APIs for users and for the internal system. It also explains the grounding of mediation mapping rules to Triple Space. Finally, it provides a refined version of the architecture of the mediation engine along with its bindings to other components of the Triple Space Computing paradigm.

[8] F. Scharffe, J. Euzenat, and D. Fensel, “Towards correspondence patterns for ontology mediation,” in Proceedings of the ACM Symposium on Applied Computing, Semantic Web and Applications track (ACM SAC 2008), (Fortaleza, Brazil), March 2008. [ bib | .pdf ]
Aligning ontologies is a crucial and tedious task. Matching algorithms and tools help the user define correspondences between ontology entities. However, automatic matching is currently limited to the detection of simple one-to-one correspondences, which the user must then refine. We propose in this paper the use of correspondence patterns as a tool to assist the design of ontology alignments. Based on existing research on patterns in the fields of software and ontology engineering, we propose a pattern template as a helper for developing a correspondence pattern library. We outline how patterns can be represented using an appropriate correspondence representation formalism: the Alignment Ontology.

[9] J. Euzenat, A. Polleres, and F. Scharffe, “Processing ontology alignments with sparql,” in Proceedings of the International Workshop on Ontology Alignment and Visualization, CISIS 2008, (Barcelona, Spain), March 2008. [ bib | .pdf ]
Solving problems raised by heterogeneous ontologies can be achieved by matching the ontologies and processing the resulting alignments. This is typical of data mediation, in which the data must be translated from one knowledge source to another. We propose to perform data translation using the SPARQL query language. Indeed, such a language is particularly adequate for extracting data from one ontology and, through its CONSTRUCT statement, for generating new data. We present examples of such transformations, as well as a set of example correspondences illustrating the need for particular representation constructs, such as aggregates, value-generating built-in functions and paths, which are missing from SPARQL. Hence, we advocate the use of SPARQL++ and PSPARQL, two SPARQL extensions providing these missing features.
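
To make the idea of CONSTRUCT-style data translation concrete, the following sketch emulates it in plain Python over triples held in memory. It is neither SPARQL++ nor PSPARQL: the helpers `unify`, `solve` and `construct`, the `bind` parameter standing in for a value-generating built-in, and the `ex:ada` data are assumptions made for this example. The FOAF and vCard property names are chosen for illustration only.

```python
from typing import Callable, Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Bindings = Dict[str, str]

FOAF = "http://xmlns.com/foaf/0.1/"
VCARD = "http://www.w3.org/2006/vcard/ns#"


def unify(pattern: Triple, triple: Triple, env: Bindings) -> Optional[Bindings]:
    """Extend env so that pattern equals triple, or return None on mismatch.

    Terms starting with '?' are variables (a toy convention for this sketch).
    """
    out = dict(env)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if out.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return out


def solve(where: List[Triple], data: List[Triple]) -> List[Bindings]:
    """Return every variable binding that satisfies all WHERE patterns."""
    envs: List[Bindings] = [{}]
    for pattern in where:
        extended: List[Bindings] = []
        for env in envs:
            for triple in data:
                new_env = unify(pattern, triple, env)
                if new_env is not None:
                    extended.append(new_env)
        envs = extended
    return envs


def construct(template: List[Triple], where: List[Triple], data: List[Triple],
              bind: Optional[Dict[str, Callable[[Bindings], str]]] = None
              ) -> List[Triple]:
    """Instantiate the template once per solution, like a CONSTRUCT query.

    bind maps a new variable to a function computing its value from the
    solution; it plays the role of a value-generating built-in.
    """
    out: List[Triple] = []
    for env in solve(where, data):
        env = dict(env)
        for var, fn in (bind or {}).items():
            env[var] = fn(env)
        for s, p, o in template:
            out.append((env.get(s, s), env.get(p, p), env.get(o, o)))
    return out


if __name__ == "__main__":
    data = [("ex:ada", FOAF + "givenName", "Ada"),
            ("ex:ada", FOAF + "familyName", "Lovelace")]
    where = [("?p", FOAF + "givenName", "?g"),
             ("?p", FOAF + "familyName", "?f")]
    template = [("?p", VCARD + "fn", "?fn")]
    print(construct(template, where, data,
                    bind={"?fn": lambda e: e["?g"] + " " + e["?f"]}))
```

The `bind` step is exactly the kind of construct the abstract says plain SPARQL lacked: without it, the template can only reuse values already present in the source data.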

[10] Y. Liu, F. Scharffe, and C. Zhou, “Towards practical rdf datasets fusion,” in Proceedings of the Workshop on data integration through semantic technology (DIST2008), (Bangkok, Thailand), 2008. [ bib | .pdf ]
In this paper, we describe our ongoing work on RDF dataset fusion. We first define what RDF dataset fusion is and list some key challenges around this issue. Based on this problem analysis, we outline possible solutions for each problem. We detail an experimental fusion algorithm, which takes into account both the similarity of literal contents and the graph structure of the dataset, and test this algorithm on small-scale RDF datasets. Our approach gives the user an important role in defining the input of the algorithm. Finally, this paper lists the remaining issues and the future work needed to achieve a practical RDF dataset fusion system.
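
One way to picture a score that combines literal similarity with graph structure is sketched below. This is not the algorithm of the paper: the string measure (`difflib.SequenceMatcher`), the use of shared outgoing predicates as a stand-in for graph structure, and the user-supplied `weight` are all assumptions made for this example.

```python
from difflib import SequenceMatcher
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def literal_similarity(a_vals: Set[str], b_vals: Set[str]) -> float:
    """Average, over a's values, of the best string match found in b."""
    if not a_vals or not b_vals:
        return 0.0
    best = [max(SequenceMatcher(None, x, y).ratio() for y in b_vals)
            for x in a_vals]
    return sum(best) / len(best)


def structure_similarity(a_preds: Set[str], b_preds: Set[str]) -> float:
    """Jaccard overlap of the outgoing predicates of the two nodes."""
    union = a_preds | b_preds
    return len(a_preds & b_preds) / len(union) if union else 0.0


def score(ds_a: Set[Triple], a: str, ds_b: Set[Triple], b: str,
          weight: float = 0.5) -> float:
    """Blend both similarities; weight is a hypothetical user-chosen input."""
    a_vals = {o for s, _, o in ds_a if s == a}
    b_vals = {o for s, _, o in ds_b if s == b}
    a_preds = {p for s, p, _ in ds_a if s == a}
    b_preds = {p for s, p, _ in ds_b if s == b}
    return (weight * literal_similarity(a_vals, b_vals)
            + (1 - weight) * structure_similarity(a_preds, b_preds))


if __name__ == "__main__":
    ds_a = {("a:1", "name", "Grenoble"), ("a:1", "country", "France")}
    ds_b = {("b:1", "name", "Grenoble"), ("b:1", "country", "France"),
            ("b:2", "label", "Innsbruck")}
    print(score(ds_a, "a:1", ds_b, "b:1"))
    print(score(ds_a, "a:1", ds_b, "b:2"))
```

Candidate pairs whose score exceeds a chosen threshold would then be proposed for fusion, with the threshold and weight left to the user.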

[11] J. Euzenat, A. Polleres, and F. Scharffe, “Sparql extensions for processing alignments,” IEEE Intelligent Systems T&C, Making Ontologies Talk: Knowledge Interoperability in the Semantic Web, pp. 80-84, 2008. [ bib | .pdf ]
Heterogeneity between ontologies is often handled by establishing correspondences between ontologies’ entities and transforming data according to these correspondences, whether for integrating heterogeneous data sources or exchanging messages between services. Relations between aligned entities can be very complex, so we’ve developed an alignment language for expressing such complexities. The language is independent of knowledge-representation languages and processing languages, but transforming data requires processing the correspondences expressed in the alignment language. In particular, it requires translating a source ontology’s data instances to instances of the target ontology. A complete ontology mediation scenario thus requires first designing an alignment between the ontologies. This alignment, represented in the alignment language, is then grounded to the formalism performing the transformation task. We expect this scenario to become common as more ontologies are developed and used to describe Resource Description Framework (RDF) data. A query language is a natural choice for translating data because it would allow both data extraction and data transformation. Hence, when RDF Schema and Web Ontology Language (OWL) are the standards for describing ontologies and data, SPARQL Protocol and RDF Query Language (SPARQL) seems a natural candidate for expressing and processing the correspondences. However, SPARQL isn’t powerful enough to cover the full expressivity of the alignment language we’ve developed. We therefore propose combining two recent SPARQL extensions to handle complex alignments: SPARQL++ provides aggregates, value-generating built-ins, and (possibly recursive) processing of mappings expressed in SPARQL, and PSPARQL provides queries on path expressions (made from regular expression patterns) which are sufficient for expressing those of the expressive language.
Here, we illustrate our proposal with a data-translation problem between two ontologies: Friend of a Friend (FOAF, http://xmlns.com/foaf/0.1) and vCard (www.w3.org/2006/vcard/ns). Both vocabularies describe information about persons and organizations, both are used extensively, and they cover complementary as well as overlapping aspects of this information.

[12] F. Scharffe, “Ontology mapping specification language,” in Proceedings of the Knowledge Web PhD Symposium at ESWC 2007, (Innsbruck, Austria), June 2007. [ bib | .pdf ]
Ontology mediation is one of the key research topics for the accomplishment of the semantic web. Different tasks can be distinguished under this generic term: instance transformation, query rewriting, instance unification, ontology merging and mapping creation. The first four tasks all require a mapping specification between the ontologies to be mediated. Mapping creation using tools and algorithms outputs such a specification. We argue in this thesis proposal that a specific language to express mapping specifications is needed. This proposal presents arguments why such a language is needed, introducing in particular the concept of mapping patterns, based on a study of the frequent mismatches arising when trying to mediate between ontologies. Such a language is then proposed and its applicability is demonstrated for three scenarios: a graphical tool for ontology mapping, an output format for ontology matching algorithms and a merging algorithm. We also give first results on the language design, mainly represented by an alignment ontology.

[13] F. Scharffe, M. Luger, and Y. Raimond, “Semantics in the easaier framework,” 2007. [ bib | .pdf ]

[14] A. Polleres, F. Scharffe, and R. Schindlauer, “Sparql++ for mapping between rdf vocabularies,” in Proceedings of ODBASE, On the Move Conferences (OTM) 2007, (Vilamoura, Portugal), Springer, 2007. [ bib | .pdf ]
Lightweight ontologies in the form of RDF vocabularies such as SIOC, FOAF, vCard, etc. are increasingly being used and exported by “serious” applications. Such vocabularies, together with query languages like SPARQL, also make it possible to syndicate the resulting RDF data from arbitrary Web sources and open the path to finally bringing the Semantic Web into operation. Considering, however, that many of the promoted lightweight ontologies overlap, the lack of suitable standards to describe these overlaps in a declarative fashion becomes evident. In this paper we argue that one does not necessarily need to delve into the huge body of research on ontology mapping for a solution, but that SPARQL itself might, with extensions such as external functions and aggregates, serve as a basis for declaratively describing ontology mappings. We provide the semantic foundations and a path towards implementation for such a mapping language by means of a translation to Datalog with external predicates.

[15] F. Scharffe, J. Euzenat, Y. Ding, and D. Fensel, “Towards correspondence patterns for ontology mediation,” in Proceedings of the Ontology Matching Workshop, ISWC 2007, 2007. [ bib | .pdf ]
We introduce in this paper correspondence patterns as a tool to design ontology alignments. Based on existing research on patterns in the fields of software and ontology engineering, we define a pattern template and use it to develop a correspondence patterns library. This library is published in RDF following a structured vocabulary. It is meant to be used in ontology alignment systems, in order to support the user or improve matching algorithms to refine ontology alignments.

[16] F. Scharffe, Y. Raimond, L. Barthelemy, Y. Ding, and M. Luger, “Publishing and accessing digital archives using the easaier framework,” in Proceedings Workshop on Cultural Heritage and the Semantic Web, ISWC 2007, 2007. [ bib | .pdf ]
Cultural archives have been massively digitized in the past twenty years. Many multimedia databases are now available, many locally and more and more online. The emergence of the web, and its evolution towards the semantic web, opens a new phase for the publication of digital archives. The data and assets they contain can be made available in a structured way, providing more precise as well as wider querying possibilities. In this paper, we present a lightweight architecture for easily publishing and managing digital archives, based on semantic web technologies. This architecture is successfully being used within the EASAIER (Enabling Access to Sound Archives through Integration, Enrichment and Retrieval) European project. We also detail how the Royal Scottish Academy of Music and Drama HOTBED archive was successfully published using such a system.

Access to cultural archives has been significantly simplified thanks to the web. Online archives, however, remain mostly isolated on the network, like islands in an ocean. Searching these archives is tedious since each archive has to be searched separately.

Recent advances in web research have seen a number of technologies emerge that move towards more structured information. Through the use of common vocabularies, unique identification of resources, and reasoning techniques, the semantic web is of great benefit for cultural archives. It allows one to query many archives at the same time, shifting attention away from the archive itself and onto the information needed.

The EASAIER European project aims at providing next-generation tools for accessing and processing cultural archive information, with a focus on audio. We present in this demo the architecture we developed. It has two major goals: first, to provide simple publication of digital archives on the semantic web; second, to give the possibility to enrich archive metadata by either linking resources from various archives or

[17] Z. Yan, F. Scharffe, and Y. Ding, “Semantic search on cross-media cultural archives,” in Proceedings of the Atlantic Web Intelligence Conference (AWIC), (Fontainebleau, France), 2007. [ bib | .pdf ]
With the emergence of the semantic web, traditional archive services face a new challenge: providing more intelligent and interactive services for web users. To take good advantage of semantic web technologies, in particular ontologies and semantic inference, this paper proposes a semantic search portal for cross-media cultural archives involving documents, images, audio and video. With such a semantically-enhanced search portal, implicit multimedia archives can be retrieved with the support of ontology modeling and semantic reasoning. Furthermore, the retrieved cross-media archives can be semantically navigated and repackaged in a more meaningful and integrated way.

[18] J. Euzenat, A. Mocan, and F. Scharffe, Ontology Management: Semantic Web, Semantic Web Services, and Business Applications, ch. Ontology Alignments: an ontology management perspective. Springer, 2007. [ bib | http ]
Relating ontologies is very important for many ontology-based applications and more important in open environments like the semantic web. The relations between ontology entities can be obtained by ontology matching and represented as alignments. Hence, alignments must be taken into account in ontology management. This chapter establishes the requirements for alignment management. After a brief introduction to matching and alignments, we justify the consideration of alignments as independent entities and provide the life cycle of alignments. We describe the important functions of editing, managing and exploiting alignments and illustrate them with existing components.

[19] F. Scharffe, “Dynamerge: A merging algorithm for structured data integration on the web,” in International Workshop on Scalable Web Information Integration and Service at DASFAA 2007, 2006. [ bib | .pdf ]
Integrating various data sources is a major problem in knowledge management. The integration systems proposed as a solution over the last two decades are starting to show limitations, for two reasons. First, the environment has scaled from a few data sources in the same location to an unknown number of possible data sources on the Web. Second, the sources to be integrated may change quickly over time. We propose in this article a method allowing for the dynamic integration of data sources on the web. Based on a network of schemas (database schemas, ontologies) related via mappings, our algorithm generates a global view over a set of resources. Our approach has the advantage of minimizing the human intervention needed to integrate sources.

[20] F. Scharffe, “Schema mappings for the web,” in Proceedings of the International Semantic Web Conference 2006 (ISWC 2006, PhD Symposium), Springer, 2006. [ bib | .pdf ]
Current solutions to data integration present many drawbacks. The bottleneck seems to be the impossibility of automating the whole process. Human intervention will always be needed at some point, and the problem is to find where and how this intervention can be performed most efficiently. In traditional mediator approaches, the global schema and the mappings between the global and local schemas are designed by hand. This is not the way to go if we want to see a “semantic web” emerge. The collaborative development of one-to-one mappings driven by application needs has a much better chance of rapidly creating a network of schemas. We propose to build on top of this view, shifting the human intervention from the elaboration of the global schema to the one-to-one mapping between local schemas. This redistribution of effort, associated with the publication of the local mappings, is the only solution if we want to see the deep web rise up and the semantic web vision come true. I propose to contribute to this paradigm at two levels. First, mappings between heterogeneous schemas must be universally understandable, as schema descriptions may be of various natures (XML, relational, ontologies, semi-structured, ...). An independent language able to model correspondences between two schemas is therefore needed. This language also serves as an exchange format for matching algorithms as well as graphical mapping tools. Second, a global schema is still necessary in order to provide a unified view over resources. We propose in the following to study how a global schema, together with the associated mapping rules, can be extracted from a network of related schemas.

[21] O. Shafiq, F. Scharffe, R. Krummenacher, Y. Ding, and D. Fensel, “Data mediation support for triple space computing,” in Proceedings of the IEEE International Conference on Collaborative Computing (CollaborateCom 2006), 2006. [ bib | .pdf ]

[22] F. Scharffe, “Instance transformation for semantic data mediation,” in Proc. of the Int. Semantic Web and Web Services Conference SWWS'06, 2006. [ bib | .pdf ]
Ontologies formally specify the terminology used to describe the functionalities and behavior of semantic web services. Mediators are used to allow interoperability between heterogeneously described web services, a particular type of mediator being the data mediator. A data mediator is used to solve the terminological mismatches that arise when two different ontologies are used to describe two services. When a service interacts with another, the queries and the resulting instance data must be translated from the terms of one service into the terms of the other. These processes are known as query rewriting and instance transformation. In this paper we address the problem of specifying transformations of instances as part of the mapping specification between two ontologies. We study the different transformations that are necessary and give a classification of them.

[23] Y. Ding, F. Scharffe, A. Harth, and A. Hogan, “Authorrank: Ranking improvement for the web,” in Proc. of the Int. Semantic Web and Web Services Conference SWWS'06, 2006. [ bib | .pdf ]
As the wealth of data on the World Wide Web grows, and as the structuring of that data improves, more sophisticated applications can be developed to derive meaningful characteristics relating to the content and structure of that data. In particular, ranking the various elements of sets of structured information is of great utility with respect to semantic network analysis. In this paper we report on preliminary results of ranking experiments carried out on the DBLP dataset, which contains metadata descriptions of more than 600,000 publications.

[24] J. de Bruijn, M. Ehrig, C. Feier, F. J. Martin-Recuerda, F. Scharffe, and M. Weiten, Semantic Web Technologies, ch. Ontology mediation, merging and aligning. John Wiley & Sons, 2006. [ bib | .html ]
Ontology mediation is a broad field of research which is concerned with determining and overcoming differences between ontologies in order to allow the reuse of such ontologies, and the data annotated using these ontologies, throughout different heterogeneous applications.

Ontology mediation can be subdivided into three areas: ontology mapping, which is mostly concerned with the representation of correspondences between ontologies; ontology alignment, which is concerned with the (semi-)automatic discovery of correspondences between ontologies; and ontology merging, which is concerned with creating a single new ontology, based on a number of source ontologies.

This chapter reviews the work which has been done in the three mentioned areas and proposes an integrated approach to ontology mediation in the area of knowledge management. A language is developed for the representation of correspondences between ontologies. An algorithm, which generalizes current state-of-the-art alignment algorithms, is developed for the (semi-)automated discovery of such mappings. A tool is presented for browsing and editing ontology mappings. An ontology mapping can be used for a variety of different tasks, such as transforming data between different representations and querying different heterogeneous knowledge bases.

[25] F. Scharffe and J. de Bruijn, “A language to specify mappings between ontologies,” in Proc. of the Internet Based Systems IEEE Conference (SITIS05), 2005. [ bib | .pdf ]
The ontology mediation field aims at finding techniques and frameworks to allow interoperability between heterogeneous ontologies that partly overlap. One solution is to design mappings that link the corresponding entities. Many systems and algorithms have been built by different parties, but each system represents the mappings in its own format. To allow reusability of the mapping tools and of the results of matching algorithms, we have designed a language to express these mappings. In this paper we present this language.

[26] M. Autonès, A. Beck, P. Camacho, N. Lassabe, H. Luga, and F. Scharffe, “Evaluation of chess position by modular neural network generated by genetic algorithm,” in Proceedings of the European Conference on Genetic Programming, EuroGP, pp. 1-10, Springer, 2004. [ bib | .pdf ]

[27] F. Scharffe, “Croisements semantiques dans les graphes petits mondes,” Master's thesis, Institut de recherche en Informatique de Toulouse, 2004. [ bib ]
This thesis presents the contribution made to the project on the formalization of meaning undertaken by the DiLan research group. After a review of the essential notions and a brief state of the art, we present the tools for graph extraction and the methods for performing semantic crossings between a verb and a noun within the dictionary graph, both developed during the internship. The results of these methods are then briefly compared and possible applications are outlined.