Metadata

Catalog and schema

All the methods on connection level dealing with metadata and information about the available content provide information in terms that are defined in the SQL standard, including catalogs and schemas.

In most relational databases, a catalog is equivalent to a specific database on a server or cluster, and the schema refers to the collection of tables, views and procedures in that catalog.

From the SQL 1992 standard (find an archived copy here):

(4.12) Catalogs are named collections of schemas in an SQL-environment. An SQL-environment contains zero or more catalogs. A catalog contains one or more schemas, but always contains a schema named INFORMATION_SCHEMA that contains the views and domains of the Information Schema.

This driver does not support catalogs (see discussion 55), so any metadata result set will return literal null when asked for the catalog of a database object. No metadata method supports filtering on a non-null catalog parameter and no catalog specifier can be used in a query. Future developments might use catalogs to describe composite databases, in essence listing the constituents of the composite database defined in the connection.

The same standard defines schemas as follows:

(4.11) An SQL-schema is a persistent descriptor that includes:

[…] the descriptor of every component of the SQL-schema.

In this International Standard, the term "schema" is used only in the sense of SQL-schema. Each component descriptor is either a domain descriptor, a base table descriptor, a view descriptor, an assertion descriptor, a privilege descriptor, a character set descriptor, a collation descriptor, or a translation descriptor. The persistent objects described by the descriptors are said to be owned by or to have been created by the <authorization identifier> of the schema.

We report the literal name public as schema for any result component of a metadata result set. We support querying for objects in any schema, however only literal null or public will potentially produce a result.

Labels will be reported as table objects with the TABLE_TYPE being literal TABLE.

Summary

  • Catalog: Always null; filtering on anything non-null yields no results.

  • Schema: Always public; filtering on public and literal will yield result, anything else won’t.

  • Table descriptors: Reported as TABLE in the TABLE_TYPE column.

Labels to tables

The CLG and Langstar groups speak about "Node type combinations" and gravitate towards "open node type semantics" in the GQL standard:

Node type combinations is a less permissive form of open node type semantics. The idea of node type combinations is that nodes also conform to a graph type if they are not of one of the node types in the node type set of the graph type, but of a node type that is an intersection of (a subset of) the node type in a node type set of the graph type.

An example for their proposal can be found here.

This driver therefore compute node types in a similar way:

  • Single label nodes will map naturally to a table name, the single label will become the table name

    • The label name will be taken as is and will be case-sensitive. The labels Movie, movie, MOVIE will result in three tables in the metadata

    • This is in line with the default SQL-to-Cypher translation

  • Node type combinations will map to table names composed as label1_label2, sorting the labels alphabetically to make them independent of the order Neo4j returns them

  • Property sets for these node type combinations will then be computed