Modifying a Graph Schema

After a graph schema has been created, it can be modified. Data already stored in the graph that is not logically part of the change is retained. For example, if you had 100 Book vertices and then added an attribute to the Book schema, you would still have 100 Books, each with the default value for the new attribute. If you dropped a Book attribute, you would still have all your Books, but that one attribute would be gone.

To safely update the graph schema, the user should follow this procedure:

  • Create a SCHEMA_CHANGE JOB, which defines a sequence of ADD, ALTER and/or DROP statements.

  • Run the SCHEMA_CHANGE JOB (i.e. RUN JOB job_name), which will do the following:

    • Attempt the schema change.

    • If the change is successful, invalidate any loading job or query definitions which are incompatible with the new schema.

    • If the change is unsuccessful, report the failure and return to the state before the attempt.
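The procedure above can be sketched as follows, using the Book example from the introduction. This is a minimal illustration, not a prescribed recipe: the graph name Book_rating, the job name add_book_page_count, and the attribute page_count are assumptions chosen for the example.

```gsql
USE GRAPH Book_rating

# Step 1: define the job. Nothing changes in the schema yet.
CREATE SCHEMA_CHANGE JOB add_book_page_count FOR GRAPH Book_rating {
    # Hypothetical attribute; existing Book vertices receive the default value.
    ALTER VERTEX Book ADD ATTRIBUTE (page_count UINT DEFAULT 0);
}

# Step 2: run the job. The change is attempted here.
RUN JOB add_book_page_count
```

If the run fails, the schema should be left as it was before the attempt, so the job can be corrected and run again.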

A schema change will invalidate any loading jobs or query jobs which relate to an altered part of the schema. Specifically:

  • A loading job becomes invalid if it refers to a vertex or an edge which has been dropped (deleted) or altered.

  • A query becomes invalid if it refers to a vertex, an edge, or an attribute which has been dropped.

Invalid loading jobs are dropped, and invalid queries are uninstalled. After the schema update, the user will need to create and install new load and query jobs based on the new schema.

Jobs and queries for unaltered parts of the schema remain available and do not need to be reinstalled. However, even though these jobs are still valid (that is, they can be run), the user may wish to check whether they still perform the intended operations under the new schema.

Load or query operations which begin before the schema change will be completed based on the pre-change schema. Load or query operations which begin after the schema change, and which have not been invalidated, will be completed based on the post-change schema.

Global vs. Local Schema Changes

Only admin, designer, and superuser users can create and run schema changes. Each user role can create and run a different type of schema change.

Only a superuser can add, alter, or drop global vertex types or global edge types, which are those that are created using CREATE VERTEX or CREATE ... EDGE. This rule applies even if the vertex or edge type is used in only one graph. To make these changes, the superuser uses a GLOBAL SCHEMA_CHANGE JOB.

An admin or designer user can add, alter, or drop local vertex types or local edge types, which are created in the context of that graph. Local vertex and edge types are created using an ADD statement inside a SCHEMA_CHANGE JOB. To alter or drop any of these local types, the admin or designer user uses a regular SCHEMA_CHANGE JOB.

The two types of schema change jobs are described below.

CREATE SCHEMA_CHANGE JOB

The CREATE SCHEMA_CHANGE JOB block defines a sequence of ADD, ALTER, and DROP statements for changing a particular graph. It does not perform the schema change.

CREATE SCHEMA_CHANGE JOB syntax
CREATE SCHEMA_CHANGE JOB job_name FOR GRAPH graph_name {
    [sequence of DROP, ALTER, and ADD statements, each line ending with a semicolon]
}

One use of CREATE SCHEMA_CHANGE JOB is to define an additional vertex type and edge type to be the structure for a secondary index. For example, if you wanted to index the postalCode attribute of the User vertex, you could create a postalCode_idx (PRIMARY_ID id string, code string) vertex type and hasPostalCode (FROM User, TO postalCode_idx) edge type. Then create an index structure having one edge from each User to a postalCode_idx vertex.
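The index structure described above could be defined with a job along these lines. The vertex and edge definitions are taken from the paragraph; the graph name social, the job name add_postal_code_index, and the choice of an undirected edge are assumptions made only for illustration.

```gsql
USE GRAPH social

CREATE SCHEMA_CHANGE JOB add_postal_code_index FOR GRAPH social {
    # One index vertex per distinct postal code
    ADD VERTEX postalCode_idx (PRIMARY_ID id STRING, code STRING);
    # Connects each User to the index vertex for its postalCode
    ADD UNDIRECTED EDGE hasPostalCode (FROM User, TO postalCode_idx);
}

RUN JOB add_postal_code_index
```

The job only creates the types. Populating the index (one hasPostalCode edge from each User to its postalCode_idx vertex) is a separate loading step.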

By its nature, a SCHEMA_CHANGE JOB may contain multiple statements. If the job block is used in the interactive GSQL shell, then the BEGIN and END commands should be used to permit the SCHEMA_CHANGE JOB to be entered on several lines. If the job is stored in a command file to be read in batch mode, then BEGIN and END are not needed.

Remember to include a semicolon at the end of each DROP, ALTER, or ADD statement within the JOB block.

If a SCHEMA_CHANGE JOB defines a new edge type which connects to a new vertex type, the ADD VERTEX statement should precede the related ADD EDGE statement. However, the ADD EDGE and ADD VERTEX statements can be in the same SCHEMA_CHANGE JOB.
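As a sketch of the points above, this is how a multi-line job might be entered in the interactive shell: wrapped in BEGIN and END, with a semicolon after each statement, and with the ADD VERTEX statement placed before the ADD EDGE statement that refers to it. The Publisher vertex type, the published_by edge type, and the job name are hypothetical.

```gsql
BEGIN
CREATE SCHEMA_CHANGE JOB add_publishers FOR GRAPH Book_rating {
    ADD VERTEX Publisher (PRIMARY_ID id STRING, name STRING);
    ADD UNDIRECTED EDGE published_by (FROM Book, TO Publisher);
}
END
```

In a command file run in batch mode, the same job would be written without the BEGIN and END lines.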

ADD VERTEX | EDGE (local)

The ADD statement defines a new type of vertex or edge and automatically adds it to a graph schema. The syntax for the ADD VERTEX | EDGE statement is analogous to that of the CREATE VERTEX | EDGE | GRAPH statements. It may only be used within a SCHEMA_CHANGE JOB.

ADD VERTEX / UNDIRECTED EDGE / DIRECTED EDGE
ADD VERTEX v_type_name (PRIMARY_ID id type [, attribute_list]) [WITH STATS="none"|"outdegree_by_edgetype"]; 
ADD UNDIRECTED EDGE e_type_name (FROM v_type_name, TO v_type_name [, edge_attribute_list]);
ADD DIRECTED EDGE e_type_name (FROM v_type_name, TO v_type_name [, edge_attribute_list])
    [WITH REVERSE_EDGE="rev_name"];

In the current version, v_type_name and e_type_name identifiers must be GLOBALLY unique, even though they are only locally visible to local graph users. As a consequence, when a user runs a SCHEMA_CHANGE JOB with ADD VERTEX/EDGE statements, it is possible that the system will reject the proposed names, because they have already been used by another graph.
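To illustrate the ADD syntax, including the optional WITH STATS and WITH REVERSE_EDGE clauses, here is a small example. The type names Author, written_by, and wrote, the attribute credited_on, and the job name are assumptions; because names must be globally unique, the system may reject any of them if another graph already uses it.

```gsql
USE GRAPH Book_rating

CREATE SCHEMA_CHANGE JOB add_authors FOR GRAPH Book_rating {
    # Vertex type with an explicit statistics setting
    ADD VERTEX Author (PRIMARY_ID id STRING, name STRING)
        WITH STATS="outdegree_by_edgetype";
    # Directed edge with one attribute and a named reverse edge type
    ADD DIRECTED EDGE written_by (FROM Book, TO Author, credited_on DATETIME)
        WITH REVERSE_EDGE="wrote";
}
```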

ALTER VERTEX | EDGE

The ALTER statement is used to add attributes to or remove attributes from an existing vertex type or edge type. It may only be used within a SCHEMA_CHANGE JOB. The basic format is as follows:

ALTER VERTEX / EDGE
ALTER VERTEX|EDGE object_type_name ADD|DROP ATTRIBUTE (attribute_list);

ALTER ... ADD

Added attributes are appended to the end of the schema. The new attributes may include DEFAULT fields:

ALTER ... ADD
ALTER VERTEX|EDGE object_type_name ADD ATTRIBUTE (
       attribute_name type [DEFAULT default_value]
    [, attribute_name type [DEFAULT default_value]]* );
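For instance, a job that adds two attributes to an existing vertex type, each with a default value, might look like this. The attribute names, default values, and job name are hypothetical.

```gsql
CREATE SCHEMA_CHANGE JOB add_book_details FOR GRAPH Book_rating {
    # Both attributes are appended after the existing Book attributes
    ALTER VERTEX Book ADD ATTRIBUTE (
        page_count UINT DEFAULT 0,
        publisher_name STRING DEFAULT "unknown");
}
```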

ALTER ... DROP

ALTER ... DROP
ALTER VERTEX|EDGE object_type_name DROP ATTRIBUTE (
      attribute_name [, attribute_name]* );
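A corresponding sketch for removing attributes is shown here. It assumes the Book vertex type currently has attributes named page_count and publisher_name; both names and the job name are hypothetical.

```gsql
CREATE SCHEMA_CHANGE JOB drop_book_details FOR GRAPH Book_rating {
    # Several attributes can be dropped in one statement
    ALTER VERTEX Book DROP ATTRIBUTE (page_count, publisher_name);
}
```

Note that any query referring to a dropped attribute becomes invalid when the job is run.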

DROP VERTEX | EDGE (local)

The DROP statement removes the specified vertex type or edge type from the database dictionary. The DROP statement should only be used when graph operations are not in progress.

DROP VERTEX / EDGE
DROP VERTEX v_type_name [, v_type_name]*;
DROP EDGE e_type_name [, e_type_name]*;
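As an illustration, the following job drops two local edge types and then the local vertex type they connect to. The type names Review, wrote_review, and review_of_book and the job name are used only as an example, and dropping the edge types before the vertex type is a cautious ordering assumed here rather than a documented requirement.

```gsql
USE GRAPH Book_rating

CREATE SCHEMA_CHANGE JOB remove_reviews FOR GRAPH Book_rating {
    # Drop the edge types that refer to Review first
    DROP EDGE wrote_review, review_of_book;
    DROP VERTEX Review;
}

RUN JOB remove_reviews
```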

RUN SCHEMA_CHANGE JOB

RUN JOB job_name performs the schema change job. After the schema has been changed, the GSQL system checks all existing GSQL queries (described in "GSQL Language Reference, Part 2: Querying"). If an existing GSQL query uses a dropped vertex, edge, or attribute, the query becomes invalid, and GSQL will show the message "Query query_name becomes invalid after schema update, please update it.".

Below is an example. The schema change job add_reviews adds a Review vertex type and two edge types to connect reviews to users and books, respectively.

SCHEMA_CHANGE JOB example
USE GRAPH Book_rating
CREATE SCHEMA_CHANGE JOB add_reviews FOR GRAPH Book_rating {
    ADD VERTEX Review (PRIMARY_ID id UINT, review_date DATETIME, url STRING);
    ADD UNDIRECTED EDGE wrote_review (FROM User, TO Review);
    ADD UNDIRECTED EDGE review_of_book (FROM Review, TO Book);
}
RUN JOB add_reviews

USE GLOBAL

The USE GLOBAL command changes a superuser's mode to global mode. In global mode, a superuser can define or modify global vertex and edge types, and can specify which graphs use those global types. For example, the user should run USE GLOBAL before creating or running a GLOBAL SCHEMA_CHANGE JOB.

CREATE GLOBAL SCHEMA_CHANGE JOB

The CREATE GLOBAL SCHEMA_CHANGE JOB block defines a sequence of ADD, ALTER, and DROP statements which modify either the attributes or the graph membership of global vertex or edge types. Unlike the non-global schema_change job, the header does not include a graph name. However, the ADD/ALTER/DROP statements in the body do mention graphs.

CREATE GLOBAL SCHEMA_CHANGE JOB syntax
CREATE GLOBAL SCHEMA_CHANGE JOB job_name {
    [sequence of global DROP, ALTER, and ADD statements, each line ending with a semicolon]
}

Though both global and local schema change jobs have ADD and DROP statements, the statements have different meanings in each. The table below outlines the differences.

Statement | local SCHEMA_CHANGE | GLOBAL SCHEMA_CHANGE
--------- | ------------------- | --------------------
ADD | Defines a new local vertex/edge type; adds it to the graph's domain. | Adds one or more existing global vertex/edge types to a graph's domain.
DROP | Deletes a local vertex/edge type and its vertex/edge instances. | Removes one or more existing global vertex/edge types from a graph's domain.
ALTER | Adds or drops attributes from a local vertex/edge type. | Adds or drops attributes from a global vertex/edge type, which may affect several graphs.

Remember to include a semicolon at the end of each DROP, ALTER, or ADD statement within the JOB block.

ADD VERTEX | EDGE (global)

The ADD statement adds existing global vertex or edge types to one of the graphs.

ADD VERTEX / UNDIRECTED EDGE / DIRECTED EDGE (Global)
ADD VERTEX v_type_name [,v_type_name...] TO GRAPH gname;
ADD EDGE e_type_name [,e_type_name...] TO GRAPH gname;
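A short sketch of the global ADD statements follows. It assumes that Book, User, and friend_of already exist as global types and that a graph named library exists; the job name is hypothetical.

```gsql
USE GLOBAL

CREATE GLOBAL SCHEMA_CHANGE JOB share_types_with_library {
    # Several existing global vertex types can be added in one statement
    ADD VERTEX Book, User TO GRAPH library;
    ADD EDGE friend_of TO GRAPH library;
}

RUN JOB share_types_with_library
```

No new types are defined here; the job only changes which global types belong to the library graph.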

ALTER VERTEX | EDGE

The ALTER statement is used to add attributes to or remove attributes from an existing global vertex type or edge type. The ALTER VERTEX / EDGE syntax for global schema changes is the same as that for local schema change jobs.

ALTER VERTEX / EDGE
ALTER VERTEX|EDGE object_type_name ADD|DROP ATTRIBUTE (attribute_list);

ALTER ... ADD

Added attributes are appended to the end of the schema. The new attributes may include DEFAULT fields:

ALTER ... ADD
ALTER VERTEX|EDGE object_type_name ADD ATTRIBUTE (
       attribute_name type [DEFAULT default_value]
    [, attribute_name type [DEFAULT default_value]]* );

ALTER ... DROP

ALTER ... DROP
ALTER VERTEX|EDGE object_type_name DROP ATTRIBUTE (
      attribute_name [, attribute_name]* );

DROP VERTEX | EDGE (global)

The DROP statement removes specified global vertex or edge types from one of the graphs. The command does not delete any data.

DROP VERTEX / EDGE (Global)
DROP VERTEX v_type_name [,v_type_name...] FROM GRAPH gname;
DROP EDGE e_type_name   [,e_type_name...] FROM GRAPH gname;
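The reverse operation can be sketched the same way. This example assumes the global types User and friend_of currently belong to a graph named library; the job name is hypothetical, and removing the edge type before the vertex type is a cautious ordering assumed here.

```gsql
USE GLOBAL

CREATE GLOBAL SCHEMA_CHANGE JOB remove_friendship_from_library {
    # Removes the types from the library graph's domain only
    DROP EDGE friend_of FROM GRAPH library;
    DROP VERTEX User FROM GRAPH library;
}

RUN JOB remove_friendship_from_library
```

The global types themselves remain defined and stay available to any other graph that uses them.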

RUN GLOBAL SCHEMA_CHANGE JOB

RUN JOB job_name performs the global schema change job. After the schema has been changed, the GSQL system checks all existing GSQL queries (described in "GSQL Language Reference, Part 2: Querying"). If an existing GSQL query uses a dropped vertex, edge, or attribute, the query becomes invalid, and GSQL will show the message "Query query_name becomes invalid after schema update, please update it.".

Below is an example. The global schema change job alter_friendship_make_library drops the on_date attribute from the friend_of edge type and adds the Book vertex type to the library graph.

GLOBAL SCHEMA_CHANGE JOB example
USE GLOBAL
CREATE GRAPH library()
CREATE GLOBAL SCHEMA_CHANGE JOB alter_friendship_make_library {
    ALTER EDGE friend_of DROP ATTRIBUTE (on_date);
    ADD VERTEX Book TO GRAPH library;
}
RUN JOB alter_friendship_make_library
