eagle-i: data integration in a scientific resource discovery network
D. Bourges-Waldegg, PhD(1), S. K. Cheng, MS(1), T. Bashor(2), H. R. Frost, MS(1), M. A. Haendel, PhD(3), C. Torniai, PhD(3), J. A. McMurry, MPH(1), D. MacFadden, MS(1)
(1)Harvard Medical School, Boston, MA; (2)Wonder Lake Software; (3)Oregon Health & Science University, Portland, OR
eagle-i is an open-source, web-based application suite that enables scientists to discover resources across a distributed network. eagle-i’s ontology-driven software supports powerful search methods while maintaining flexibility and interoperability through linked open data (LOD). We discuss a web service that encapsulates interactions with an eagle-i repository in order to compose eagle-i resources and serve them as XML for consumption by other applications.
Introduction. Researchers in the medical field produce and consume a wide variety of resources, such as cell lines, specialized services and animal models. Most of these existing research resources cannot be readily found using conventional methods. The eagle-i approach was developed to address the complexities of this longstanding issue. By removing barriers to resource discovery, eagle-i helps scientists find existing resources more easily, thus reducing time-consuming and expensive duplication.
Technology. eagle-i is built around Semantic Web technologies (RDF, OWL, SPARQL) and follows LOD principles. The eagle-i architecture comprises a set of ontology-driven software components deployed at each institution and a central search application that communicates with these federated components. In the fall of 2011, eagle-i was released under an open-source (BSD-3) license. Currently, adopters may build eagle-i from source code or download binary packages to install. We anticipate offering eagle-i via a pre-configured virtual machine or a hosted model. Because eagle-i is designed as a federated system, all of these deployment options give institutions technical and administrative autonomy: they can choose to make their resources discoverable locally, or globally at www.eagle-i.net.
Data integration in action. The native RDF in eagle-i provides an excellent framework for data integration. The eagle-i applications use SPARQL to retrieve data from eagle-i repositories; they then restructure this RDF and serve it to users as HTML. External applications can also access the data in a variety of ways. For example, the eagle-i software stack now includes an optional lightweight web service that feeds transformed XML data to a Core Facilities Portal application. While any external application can freely use the public SPARQL endpoint directly, in practice many such applications are not well equipped to perform the necessary queries and data transformations. The web-service approach offers a clean, simple solution that 1) handles interaction with the SPARQL endpoint on behalf of the applications and 2) still enables advanced restructuring of the data for different audiences. The same approach could be adapted to other cases that require special visualization, context or aggregation.
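The web-service pattern described above can be illustrated with a minimal Python sketch. It is not the eagle-i implementation: the query, the function names (fetch_bindings, bindings_to_xml) and the XML element names are hypothetical, and the sketch assumes only that the repository exposes a standard SPARQL protocol endpoint able to return results in the W3C SPARQL JSON results format.

```python
import json
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical query for illustration only; it does not use the eagle-i ontology.
EXAMPLE_QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?resource ?label WHERE {
  ?resource rdfs:label ?label .
} LIMIT 10
"""


def fetch_bindings(endpoint_url, query):
    """POST a SELECT query to a SPARQL endpoint and return its result rows.

    endpoint_url is supplied by the caller; no real eagle-i URL is assumed.
    """
    data = urllib.parse.urlencode({"query": query}).encode("utf-8")
    request = urllib.request.Request(
        endpoint_url,
        data=data,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


def bindings_to_xml(bindings):
    """Restructure SPARQL JSON result rows into a simple XML document.

    The ?resource variable becomes a 'uri' attribute; every other
    variable becomes a child element named after the variable.
    """
    root = ET.Element("resources")
    for row in bindings:
        item = ET.SubElement(root, "resource")
        for name, cell in row.items():
            if name == "resource":
                item.set("uri", cell["value"])
            else:
                child = ET.SubElement(item, name)
                child.text = cell["value"]
    return ET.tostring(root, encoding="unicode")
```

A consuming application would then receive only the XML, for example from bindings_to_xml(fetch_bindings(endpoint_url, EXAMPLE_QUERY)), and would not need to issue SPARQL itself. Keeping the query and the restructuring in one place is what allows the output to be tailored to different audiences.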