A walk through xWCPS 2.0

xWCPS (XPath Enabled WCPS) is a novel query language that combines processing of OGC coverage data and metadata. It merges two standards: XPath 2.0, for its handling of semi-structured data, and WCPS, for its array data processing capabilities. Together they deliver a rich set of features that changes the way scientific data can be located and processed: combined search, filtering and processing on both the metadata and the payload of OGC coverages.

Version 2.0 builds on the first specification of the language, as defined in the EarthServer project, and refines its characteristics to facilitate implementation, improve expected query performance and ease user adoption.

In the rest of this article, we describe how the FLWOR syntax paradigm is applied in the xWCPS syntax, the technology behind the reference implementation delivered by EarthServer-2, and finally a set of examples that showcase the syntax and results of xWCPS queries.

Syntax: FLWOR paradigm

Queries are the most fundamental part of the language. In essence, an xWCPS query bases its structure on the "for-where-return" structure of WCPS and introduces a set of new features: the "let" and "order by" clauses (completing FLWOR), the "mixed" results functionality (i.e. returning results comprising both array data and metadata), special characters that identify coverages (*, @) and fetch metadata (::), and support for XPath 2.0 expressions. In the following, we present each part of the syntax in more detail:

The "for" statement: It opens the query and can contain the let clause, which defines variables that can be used later in "where".

The "let" statement: It initializes variables through an assignment expression that ends with a semicolon. Using the let clause can greatly reduce repetition, making xWCPS considerably less verbose than WCPS.

The "where" statement: It specifies one or more metadata or coverage criteria that filter the returned result. Every XPath or WCPS expression that evaluates to a boolean result is a valid xWCPS comparison expression and can be used for filtering. To declare an XPath expression, the "::" notation should follow the variable referring to a coverage. That notation fetches the metadata of the coverage, against which the XPath is evaluated.

The "order by" statement: It orders the returned coverages based on an XPath clause that operates on their metadata. The ordering direction can be ascending (asc) or descending (desc). If no direction is given explicitly, the system uses ascending.

The "return" statement: It defines the form of the result, which may contain textual results, structured XML results, WCPS-encoded results (e.g. png, tiff, csv) or combinations of binary and textual data, called mixed results in xWCPS terminology. xWCPS acts as a wrapper construct on top of XPath 2.0 and WCPS, so it does not offer any language-specific operations of its own.
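The clause order described above can be summarized in a schematic query. This is only a sketch of the overall shape, not a query taken from the specification: the parts in angle brackets are placeholders, the ":=" assignment operator in the let clause is an assumption (the text only states that the assignment ends with a semicolon), and the spelling of the "order by" keyword follows this article and may differ in a given implementation.

```
for $c in <coverage selector>
let $v := <XPath or WCPS expression>;
where <boolean XPath or WCPS expression>
order by <XPath expression on $c::> desc
return <text, XML, encoded coverage, or mixed result>
```

Only "for" and "return" are carried over from the WCPS "for-where-return" structure as mandatory parts in this sketch; "let", "where" and "order by" are shown to illustrate where they would sit.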

Syntax Features

Expressiveness and coherence are key features of the language, now in its second revision, allowing experts who deal with multidimensional array data to adopt it easily and take advantage of what it offers. In general, the new features of xWCPS are:

  • Extended Search Abilities:
    • Ability to identify all coverages through special character "*".
    • Ability to identify all coverages at a specific data source through "*@endpoint" syntax.
    • Ability to identify specific coverage at a specific data source through "id@endpoint" syntax.
  • Exploitation of Descriptive Metadata: Coverage filtering based on the available metadata using XPath 2.0. The "for" and "where" clauses can contain XPath 2.0 expressions in order to restrict results to specific metadata. Metadata can be accessed through the "::" notation.
  • Repetitiveness Reduction: xWCPS supports variable manipulation via the "let" clause, which allows complex expressions to be assigned to variables and reused in subsequent references, avoiding repetition.
  • Extended Set of Results Support: An important feature of xWCPS is the ability to return data accompanied by their metadata. This can be achieved using the "mixed" clause of xWCPS.
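To make the three coverage selectors listed above concrete, the sketch below shows them in the "for" clause. The endpoint name ECMWF is the one used in the examples of this article, while "some_coverage" is a hypothetical coverage identifier. In order, the lines select all coverages, all coverages at the ECMWF data source, and one specific coverage at that data source.

```
for $c in *
for $c in *@ECMWF
for $c in some_coverage@ECMWF
```

Each line is only the opening clause of a query and would be followed by at least a "return" clause.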

Technology

The reference implementation that supports xWCPS query execution in EarthServer is written in Java 8, with ANTLR4 used as the parser generator that translates xWCPS to source code. REST web services (implemented using the Jersey framework) allow the client-side application (implemented in JavaScript and AngularJS) to communicate with the server-side query processor.

The solution uses rasdaman as the backend for processing (geospatial) array queries (WCPS) and FeMME as the engine for hosting and processing the metadata. The latter is supported by two quite different NoSQL datastores:

  • MongoDB for data storage, and
  • ElasticSearch for indexing, in order to improve the performance of particular queries and to offer semantic and full-text queries.

The overall architecture of the system is shown in the next diagram.

xWCPS Examples

In the following, we present some examples of how to use the power of xWCPS.

  • Retrieve all coverages of ECMWF whose RectifiedGrid dimension equals 2, and present only the RectifiedGrid element.

  • Retrieve a mixed result containing an image of a specific coverage and its metadata.

  • Retrieve coverage names of all coverages in ECMWF in descending order.
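The three examples above could be written roughly as follows. These are hedged sketches built from the syntax described in this article, not queries copied from the reference implementation. The assumptions are: the metadata expose a RectifiedGrid element with a "dimension" attribute and a CoverageId element holding the coverage name; "some_coverage" is a hypothetical 2-D coverage that can be encoded as PNG directly; the argument form of "mixed" (an encoded payload followed by a metadata expression) is a guess at how the clause is used; and the "order by" keyword is spelled as in this article.

```
for $c in *@ECMWF
where $c:://RectifiedGrid[@dimension=2]
return $c:://RectifiedGrid

for $c in some_coverage@ECMWF
return mixed(encode($c, "png"), $c::)

for $c in *@ECMWF
order by $c:://CoverageId/text() desc
return $c:://CoverageId/text()
```

In the first query the XPath after "::" acts as the boolean filter: a coverage is kept when the expression matches at least one node in its metadata. In the third query, dropping "desc" would fall back to the default ascending order.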