Accessing External SQL Databases with JDBC (Beta)

    This section describes how to use PXF with JDBC, including an example of creating and querying an external table that accesses data in a MySQL database table.

    Before accessing external SQL databases using HAWQ and PXF, ensure that:

    • The JDBC plug-in is installed on all cluster nodes. See Installing PXF Plug-ins for PXF plug-in installation information.
    • The JDBC driver JAR files for the external SQL database are installed on all cluster nodes.
    • The file locations of external SQL database JDBC JAR files are added to . If you manage your HAWQ cluster with Ambari, add the JARS via the Ambari UI. If you managed your cluster from the command line, edit the /etc/pxf/conf/pxf-public.classpath file directly.

    Querying External SQL Data

    The PXF JDBC plug-in supports the single profile named Jdbc.

    Use the following syntax to create a HAWQ external table representing external SQL database tables you access via JDBC:

    JDBC-plug-in-specific keywords and values used in the CREATE EXTERNAL TABLE call are described in the table below.

    Note: When creating PXF external tables, you cannot use the HEADER option in your FORMAT specification.

    You include custom options in the LOCATION URI. Preface each option with an ampersand &.

    The JDBC plug-in Jdbc profile supports the following s. The JDBC connection set up options are required.

    Option NameDescription
    JDBC_DRIVERThe JDBC driver class name. (Required)
    DB_URLThe URL to the database; includes the hostname, port, and database name. (Required)
    USERThe database user name. (Required)
    PASSThe database password for USER. (Required)
    PARTITION_BYThe partition column, <column-name>:<column-type>. The JDBC plug-in supports date, int, and enum <column-type>s. Use the yyyy-MM-dd format for the date <column-type>. A null PARTITION_BY defaults to a single fragment.
    RANGE(Used only when PARTITION_BY is specified.) The query range, <start-value>[:<end-value>]. <end-value> may be empty for an <column-type>. The RANGE is left closed, right open. That is, the range includes the <start-value> but does not include the <end-value>.
    INTERVAL(Used only when PARTITION_BY is specified.) The interval, <interval-num>[:<interval-unit>], of one fragment. INTERVAL may be empty for an enum <column-type>. <interval-unit> may be empty for an int <column-type>.

    Example JDBC connection string:

    1. &JDBC_DRIVER=com.mysql.jdbc.Driver&DB_URL=jdbc:mysql://<dbhost>:<dbport>/testdb&USER=user1&PASS=changeme

    Example JDBC substrings identifying partitioning parameters:

    1. &PARTITION_BY=year:int&RANGE=2011:2013&INTERVAL=1
    2. &PARTITION_BY=createdate:date&RANGE=2013-01-01:2016-01-01&INTERVAL=1:month

    Example: Using the Jdbc Profile to Access a MySQL Database Table

    Create a MySQL Table

    Perform the following steps to create a MySQL table named mysql_table1 in the default schema of a database named mtestdb:

    1. Connect to the default MySQL database as the root user:

      1. $ mysql --user=root -p
    2. Create a database named mtestdb:

    3. Assign privileges to mtestdb to a user named muser1 and then exit:

      1. mysql> GRANT ALL on mtestdb.* TO 'muser1'@'localhost' IDENTIFIED BY 'muser1';
      2. mysql> \q
    4. Connect to mtestdb as user :

      1. $ mysql --user=muser1 mtestdb -pmuser1
    5. Create a table named mysql_table1 and insert some data into this table:

      1. mysql> CREATE TABLE mysql_table1(id int);
      2. mysql> INSERT INTO mysql_table1 VALUES (1);
      3. mysql> INSERT INTO mysql_table1 VALUES (2);
      4. mysql> INSERT INTO mysql_table1 VALUES (3);
      1. &JDBC_DRIVER=com.mysql.jdbc.Driver&DB_URL=jdbc:mysql://mydb.server.com:3306/mtestdb&USER=muser1&PASS=muser1

      Save this string for use later.

    Configure PXF

    If not already present on your system, download and copy the MySQL connector JAR file to your system and update pxf-public.classpath with the location of this file.

    For example, if you manage your HAWQ cluster with Ambari, and the MySQL connector jar file is located in the /usr/share/java directory, add the following line:

    1. /usr/share/java/mysql-connector*.jar

    in the Ambari PXF service Configs > Advanced pxf-public.classpath pane and then use the Restart button to restart PXF.

    If you manage your HAWQ cluster from the command line, directly edit the /etc/pxf/conf/pxf-public.classpath file, adding the MySQL connector JAR file path. Then restart PXF on each HAWQ node with the following command:

    1. root@hawq-node$ sudo service pxf-service restart

    Query Using the Jdbc Profile

    Perform the following steps to create and query an external PXF table to access the mysql_table1 table you created in the previous section:

    1. Use the Jdbc profile to create an external table to access the MySQL mysql_table1 table. For example:

      Substitute your PXF , as well as the DB_URL string you constructed in the previous exercise.

    2. Display all rows of the pxf_jdbc_mysql_table1 table:

      1. gpadmin=# SELECT * FROM pxf_jdbc_mysql_table1;
      2. id
      3. ----
      4. 1
      5. 2
      6. 3