Data Loading

Once you have defined a graph schema, you can load data into the graph store. Loading data is a multistep process:

  1. Create a connector.

  2. Define a data source.

  3. Create a loading job.

  4. Run the loading job.

The first two steps apply only if you are loading data through a connector. If you are loading from files that reside locally on a TigerGraph server, you do not need a connector or a data source; you can create a loading job that reads the files directly, as shown in the sketch below.
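The following is a minimal sketch of loading a local CSV file directly, assuming a graph named Social with a Person vertex type (primary ID and attributes name and age) and a file at /home/tigergraph/data/persons.csv; all graph, job, file, and attribute names are illustrative.

  USE GRAPH Social

  CREATE LOADING JOB load_persons FOR GRAPH Social {
    # Bind a label to the local file path (illustrative path)
    DEFINE FILENAME person_file = "/home/tigergraph/data/persons.csv";
    # Map CSV columns (by header name) to the Person vertex attributes
    LOAD person_file
      TO VERTEX Person VALUES ($"name", $"name", $"age")
      USING SEPARATOR=",", HEADER="true", EOL="\n";
  }

  RUN LOADING JOB load_persons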

To learn about the GSQL statements used in defining a loading job, see GSQL language reference: Creating a loading job.

Connectors

Connectors are interfaces built into the TigerGraph system that let users apply the same high-level GSQL protocol for high-speed parallel data loading, whether the data resides directly on the network file system or comes from one of several supported external data sources.

With TigerGraph’s connector offerings, you can load data from a range of supported external sources into your graph.

The data connectors stage temporary data files on the database server’s disk, so you should have free disk space of at least twice the size of your total (uncompressed) input data.

Data source

A data source is an object in GSQL that holds metadata about the external source from which a loading job reads data. You must define a data source whenever you use a connector to load data.
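For illustration, here is a minimal sketch of defining an S3 data source with the CREATE DATA_SOURCE statement; the object name, graph name, and configuration file path are placeholders, and the exact configuration format depends on your TigerGraph version and source type.

  # Define a data source object that loading jobs can reference; the JSON file
  # holds connection settings such as access credentials (names are illustrative)
  CREATE DATA_SOURCE S3 s3_src = "/home/tigergraph/s3_config.json" FOR GRAPH Social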

Data ingestion from Kafka

Connectors often stream data to a Kafka server first, typically TigerGraph’s internal Kafka cluster. Once the data has been streamed to Kafka, you create a Kafka loading job to load it into a graph.

To create a Kafka loading job, see Kafka Loader Overview.
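As a rough sketch of that workflow, assuming a Kafka data source named k1, a topic/partition configuration file, and the same illustrative Social graph as above (see the Kafka Loader documentation for the exact configuration format):

  # Define a Kafka data source from a broker configuration file (illustrative paths)
  CREATE DATA_SOURCE KAFKA k1 = "/home/tigergraph/kafka_broker.json" FOR GRAPH Social

  CREATE LOADING JOB load_from_kafka FOR GRAPH Social {
    # Bind the data source and a topic/partition configuration file to a filename label
    DEFINE FILENAME f1 = "$k1:/home/tigergraph/kafka_topic.json";
    # Each Kafka message is parsed as a CSV line and loaded into Person vertices
    LOAD f1 TO VERTEX Person VALUES ($0, $0, $1) USING SEPARATOR=",";
  }

  RUN LOADING JOB load_from_kafka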