Using The Connector
The connector is included with Pivotal GemFire, and its JAR file is automatically included on the classpath.
To use the connector, specify configuration details in gfsh commands or within a cache.xml file. Do not mix gfsh configuration with cache.xml configuration. To map fields explicitly, or to map only a subset of the fields, specify all configuration in a cache.xml file.
Specification with gfsh
gfsh may be used to configure all aspects of the transfer and the mapping, as follows:

1. If domain objects are not on the classpath, configure PDX serialization with the GemFire configure pdx command after starting locators, but before starting servers. For example:

gfsh>configure pdx --read-serialized=true \
 --auto-serializable-classes=io.pivotal.gemfire.demo.entity.*

2. After starting servers, use the GemFire create jndi-binding command to specify all aspects of the data source. For example:

gfsh>create jndi-binding --name=datasource --type=SIMPLE \
 --jdbc-driver-class="org.postgresql.Driver" \
 --username="g2c_user" --password="changeme" \
 --connection-url="jdbc:postgresql://localhost:5432/gemfire_db"

3. After creating regions (a minimal region-creation sketch follows this list), set up the gpfdist protocol with the configure gpfdist-protocol command. For example:

gfsh>configure gpfdist-protocol --port=8000

4. Specify the mapping of the GPDB table to the GemFire region with the create gpdb-mapping command. For example:

gfsh>create gpdb-mapping --region=/Child --data-source=datasource \
 --pdx-name="io.pivotal.gemfire.demo.entity.Child" --table=child --id=id,parent_id
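The steps above assume that a region named Child already exists. As a minimal sketch, such a region can be created with the standard gfsh create region command; the region name here is carried over from the mapping example and is otherwise an assumption:

gfsh>create region --name=Child --type=PARTITION

A partitioned type is used because, as noted under Connector Requirements and Caveats below, export is supported from partitioned regions only.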
Specification with a cache.xml File
To provide configuration details within a cache.xml file, specify the correct xsi:schemaLocation attribute in that file. For the 3.3.0 connector, use

http://schema.pivotal.io/gemfire/gpdb/gpdb-3.3.xsd
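As an illustrative sketch only, a cache.xml header declaring both the GemFire cache namespace and the connector namespace might look like the following. The GemFire cache namespace, its xsd location, the version attribute, and the connector namespace URI (inferred from the xsd path) are assumptions that depend on your GemFire release; only the connector xsd location above is taken from this page.

<?xml version="1.0" encoding="UTF-8"?>
<cache xmlns="http://geode.apache.org/schema/cache"
       xmlns:gpdb="http://schema.pivotal.io/gemfire/gpdb"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://geode.apache.org/schema/cache
                           http://geode.apache.org/schema/cache/cache-1.0.xsd
                           http://schema.pivotal.io/gemfire/gpdb
                           http://schema.pivotal.io/gemfire/gpdb/gpdb-3.3.xsd"
       version="1.0">
  <!-- region definitions and connector data-source/mapping elements (names assumed) go here -->
</cache>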
Connector Requirements and Caveats
Export is supported from partitioned GemFire regions only. Data cannot be exported from replicated regions. Data can be imported to replicated regions.
The number of Pivotal Greenplum® Database (GPDB) segments must be greater than or equal to the number of Pivotal GemFire servers. If there is a high ratio of GPDB segments to GemFire servers, the GPDB configuration parameter gp_external_max_segs may be used to limit GPDB concurrency. See gp_external_max_segs for details on this parameter. An approach to finding the best setting begins with identifying a representative import operation:

- Measure the performance of the representative import operation with the default setting.
- Measure again with gp_external_max_segs set to half the total number of GPDB segments. If there is no gain in performance, the parameter does not need to be adjusted.
- Iterate, halving the value of gp_external_max_segs at each step, until there is no further performance improvement or the value of gp_external_max_segs equals the number of GemFire servers.
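As a sketch, assuming a standard Greenplum installation, a cluster-wide parameter such as gp_external_max_segs is typically set with the gpconfig utility and reloaded without a full restart; the value 32 below is only a placeholder for one iteration of the halving procedure described above:

$ gpconfig -c gp_external_max_segs -v 32
$ gpstop -u    # reload configuration files without restarting the cluster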
Upgrading Java Applications from Version 2.4 to Version 3.x
API changes introduced in version 3.0.0, which are also present in this connector version, require code revisions in all applications that use import or export functionality.
For this sample version 2.4 export operation, an upsert type of operation was implied:
// Version 2.4 API
long numberExported = GpdbService.createOperation(region).exportRegion();
Here is the equivalent version 3.x code to implement the upsert type of operation:
// Version 3.x API
ExportConfiguration exportConfig = ExportConfiguration.builder(region)
    .setType(ExportType.UPSERT)
    .build();
ExportResult result = GpdbService.exportRegion(exportConfig);
int numberExported = result.getExportedCount();
For this sample version 2.4 import operation,
// Version 2.4 API
long numberImported = GpdbService.createOperation(region).importRegion();
here is the version 3.x code to implement the import operation:
// Version 3.x API
ImportConfiguration importConfig = ImportConfiguration.builder(region)
    .build();
ImportResult result = GpdbService.importRegion(importConfig);
int numberImported = result.getImportedCount();
Please note that the new result objects’ counts are of type int instead of type long. This is for consistency, as the connector internally uses JDBC’s executeQuery(), which supports int.