Datatype Mapping
How the types correspond between a Pivotal GemFire® region and a Pivotal
Greenplum® (GPDB) table
are described in a cache.xml
file.
GemFire XML Description of a GPDB Table
In order to use the connector,
a mapping between a GemFire region and a GPDB table has to be described.
This mapping may go in the
cache.xml
file, which has an extended syntax for the mapping information.
The connector makes assumptions about the intended mapping if some parts of the mapping are incomplete. We advise you to explicitly define all keys and fields, such that the connector has no assumptions to make.
This cache.xml
file example provides the
XML description of the mappings for the region.
The XML with the gpdb
tag is the mapping between GemFire region fields
and the GPDB table columns.
It describes how to connect to GPDB,
as well as gpfdist protocol to be used to transfer the data between
the two systems.
<?xml version="1.0" encoding="UTF-8"?>
<cache xmlns="http://geode.apache.org/schema/cache"
xmlns:gpdb="http://schema.pivotal.io/gemfire/gpdb"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://geode.apache.org/schema/cache
http://geode.apache.org/schema/cache/cache-1.0.xsd
http://schema.pivotal.io/gemfire/gpdb
http://schema.pivotal.io/gemfire/gpdb/gpdb-3.4.xsd"
version="1.0">
<pdx read-serialized="true" persistent="false">
<pdx-serializer>
<class-name>org.apache.geode.pdx.ReflectionBasedAutoSerializer</class-name>
<parameter name="classes">
<string>io.pivotal.gemfire.demo.entity.*</string>
</parameter>
</pdx-serializer>
</pdx>
<jndi-bindings>
<jndi-binding jndi-name="DemoDatasource" type="SimpleDataSource"
jdbc-driver-class="org.postgresql.Driver" user-name="gpadmin"
password="changeme" connection-url="jdbc:postgresql://localhost:5432/gemfire_db">
</jndi-binding>
</jndi-bindings>
<region name="Parent">
<region-attributes refid="PARTITION">
<partition-attributes redundant-copies="1" />
</region-attributes>
<gpdb:store datasource="DemoDatasource">
<gpdb:types>
<gpdb:pdx name="io.pivotal.gemfire.demo.entity.Parent"
schema="public"
table="parent">
<gpdb:id field="id" />
<gpdb:fields>
<gpdb:field name="id" column="id" />
<gpdb:field name="name" />
<gpdb:field name="income" class="java.math.BigDecimal" />
</gpdb:fields>
</gpdb:pdx>
</gpdb:types>
</gpdb:store>
</region>
<gpdb:gpfdist port="8000" />
</cache>
jndi-binding
In the example, there is a single JNDI binding
to specify the details of the
connection to the example’s GPDB database.
This element defines a datasource
that is referenced later in gpdb:store
element using datasource
attribute.
The connection to the GPDB instance will be to localhost:5432
,
with a default database of gemfire_db
.
The GPDB user name is gpadmin
, and its password is changeme
.
For more details on JNDI bindings, see JNDI.
gpdb:store
The gpdb:store
element specifies the mapping between GPDB table contents
and the GemFire region.
There should be one store element for each GPDB data source.
In this example, the gpdb:store
element identifies the JNDI binding
to specify the GPDB instance at the other end of the connector.
It is the data source.
gpdb:types
The gpdb:types
element contains a list of gpdb:pdx
instances.
There is only one gpdb:pdx
instance in this example.
gpdb:pdx
Each gpdb:pdx
instance identifies the mapping of a GemFire region’s entries
to its corresponding GPDB table.
This is needed for the data serialization and transfer implemented
by the connector.
The optional schema
attribute specifies the GPDB schema for the table.
See
Creating and Managing Schemas within the GPDB manual for a description of GPDB schemas.
The optional schema
attribute is set to the default value of public
when not specified.
Make the schema
attribute value the same across all GemFire servers
that host the region.
gpdb:id
The gpdb:id
element specifies the fields to be used as the key in
the identification of a GemFire object.
The example uses a singleton key named id
, also mapped to table column id
:
<gpdb:id field="id" />
Here is an example of a composite key with the two fields
parentId
and id
mapped to table columns parentId
and id
.
<gpdb:id>
<gpdb:field ref="parentId" />
<gpdb:field ref="id" />
</gpdb:id>
gpdb:fields
The gpdb:fields
element lists the fields mapped to values of
the GemFire PdxInstance
.
A subset of the available GemFire fields or GPDB columns may be mapped.
Omission of this element causes the connector to automap
the GemFire region fields to GPDB table columns,
using the assumption that the exactly matching field and column names
are to be mapped.
gpdb:field
Each gpdb:field
instance specifies the mapping from a GemFire
PdxIntance
field to a GPDB column.
If no GPDB column
attribute is specified,
the GPDB column is assumed to have exactly the same name (case sensitive)
as the GemFire field.
Here is an example in which a GemFire parentId
field is mapped
to a GPDB parent_id
column,
and the GemFire age
field is of the Java class (data type)
java.lang.Integer
.
<gpdb:field name="parentId" column="parent_id" />
<gpdb:field name="age" class="java.lang.Integer" />
Special cases exist when the gpdb:field
attributes are
missing from the cache.xml
file.
With no gpdb:field
attributes defined,
the connector assumes that the GemFire region fields have
the same names as the GPDB table columns.
This list describes the current behavior for this type of special case:
If there exists a GPDB table column for which there is no GemFire region field, then an export operation does not update that GPDB table column. and an insert of a new row places either null or the defined default into the columns with no associated GemFire region field. An import operates on only those GemFire region fields for which the column with the matching name exists.
If there is a GemFire region field for which there is no matching GPDB table column, then an export operation only exports the GemFire region fields for which there is a GPDB table column of the same name. An import has side effects on the GemFire region. The import correctly acquires those GemFire region fields for which the column with the matching name exists. But, the import sets to null those GemFire region fields for which there is no GPDB table column of the same name; this occurs for both an update and an insert of region entries.
If one or more gpdb:field
attributes are specified,
there can exist a mismatch of fields to columns.
This list describes the current behavior for this type of special case:
If there is exactly one mapping of a field to a column, and the naming of that GemFire region field or a GPDB table column within a
gpdb:field
attribute is incorrect such that it does not match a configured and defined name, an exception is thrown upon export, as this case appears as if there are no mappings.With more than one
gpdb:field
attributes specified, if the naming of one GemFire region field or a GPDB table column is incorrect such that it does not match a configured and defined name, upon export an update to a GPDB table row will not change the value of the incorrectly named column. An insert of a new GPDB table row under the same assumptions acquires the correctly mapped GemFire region fields. The incorrectly mapped GPDB column will be set to null or its defined default value. Upon import, either an update of an existing entry or an insert of a new entry acquires the GPDB table column values for any correctly mapped fields, and uses null for any incorrectly named GemFire field.
gpdb:gpfdist
The gpdb:gpfdist
element is not part of the region specification,
but is part of the cache specification.
It provides details of the gpfdist
protocol used in the connector.
More details may be found within the section on
Working with File-Based External Tables
in the GPDB manual.
To increase the security of the data during transfer between GemFire and GPDB, TLS/SSL transfers can be enabled, as in this example:
<gpdb:gpfdist hostname="gpfdist.gemfire.test">
<gpdb:ssl
key-store="target/test-classes/io/pivotal/gemfire/gpdb/gpfdist/keystore.jks"
key-store-password="123456"
trust-store="target/test-classes/io/pivotal/gemfire/gpdb/gpfdist/truststore.jks"
trust-store-password="123456" />
</gpdb:gpfdist>
All four gpdb:ssl
attributes must be present to configure the transfers
to use SSL.
The two password fields remain in plain text within the cache.xml
file,
so file system protections should be considered.
The specification of optional attributes overrides default values. These are the optional attributes:
A
port
specification overrides the listening port ofgpfdist
to be other than its default of the first available port within the inclusive range of 8080-8280. The value can be a single port number or a range. If specified with a single port number and that port number is already in use, an exception will be thrown. If specified with a range, the syntax uses a colon to separate the two endpoints of the range, as in<lowport>:<highport>
. A value of 0 causes automatic allocation of a port number from an ephemeral port range. An example with a specific port number:<gpdb:gpfdist port="8000" />
An example that specifies a port range:
<gpdb:gpfdist port="8000:8100" />
A
hostname
specifies the host name used in URI generation. If SSL is enabled, this name must match the common name (CN) of installed key store certificate. The default value is an IP address of the local host.<gpdb:gpfdist hostname="host12.example.com" />
An
address
specification identifies the IP address for listening to be other than the default value of 0.0.0.0. For example:<gpdb:gpfdist address="192.0.2.0" />
idle-timeout
specifies a timeout in milliseconds. If no data is transmitted within this timeout period, the connection may be closed. The default value is 60000 milliseconds. There will be no timeout if set to 0. For example<gpdb:gpfdist idle-timeout="120000" />
Requirements and Caveats in the Mapping
- There must be only one GPDB table specified
per GemFire region.
If more than one GPDB table is specified for a single region
by defining distinct GPDB tables
across multiple
cache.xml
files, then an exception is thrown. If more than one GPDB table is specified within a singlecache.xml
file, no error will be thrown or issued, and the import or export operation will proceed, resulting in incorrect results.
Supported Datatypes
The following datatypes are supported. The left side shows the GPDB/PostgreSQL datatype, and the right side shows the associated GemFire/Java datatype which it is mapped to.
GPDB | GemFire |
---|---|
Bigint | Long (Object wrapper) |
Bigint not null | long (primitive) |
Bigserial | long (primitive) |
Boolean not null | boolean (primitive) |
Boolean | Boolean (wrapper) |
Bit | BitSet |
Varbit | BitSet |
Char | String |
Varchar | String |
Decimal | BigDecimal |
Float | Float (Object wrapper) |
Float not null | float (primitive) |
Int | Integer (Object wrapper) |
Int not null | int (primitive) |
Real | Float (Object wrapper) |
Real not null | float (primitive) |
Serial | int (primitive) |
Smallint | Short (Object wrapper) |
Smallint not null | short (primitive) |
Text | String |
Date | java.sql.Date |
Time | Time |
Timetz | Time |
Timestamp | Timestamp |
Timestamptz | Timestamp |
Unsupported GPDB Datatypes
- inet
- cidr
- bytea
- box
- circle
- interval
- seg
- macaddr
- path
- point
- polygon
- xml
- money
- array
Unsupported GemFire Datatypes
Classes with data members that are objects. Not supported, as this would require a representation of nested objects in GPDB. Here is an example of an unsupported type:
class HasAnObjectMember { private UserObject theObject; }
Any data type other than
BitSet
,Boolean
,Date
,Number
, andString
. Array types are an example of an unsupported data type.