J2EE framework

JDBC : Connection to databases

What is JDBC?

JDBC is an API that encapsulates the low?level calls needed for database access and interaction into one common interface. Both the Java Development Kit (JDK) and Java Runtime Environment (JRE) contain the API as part of the standard distribution. The API’s interfaces and classes reside in the java.sql and javax.sql packages. The standard components are packaged in java.sql while the enterprise elements are in javax.sql.

Steps for building a simple JDBC application

1. Import Packages
2. Register JDBC Driver
3. Open a Connection
4. Execute a Query
5. Extract data from Result Set
6. Clean-Up Environment

A Simple Program to demonstrate the JDBC application working:

This is a simple program that retrieves  and displays the values of particular columns from the Employees table stored in the database.

//STEP 1. Import packages
import java.sql.DriverManager;
import java.sql.Connection;
import java.sql.Statement;
import java.sql.ResultSet;
import java.sql.Date;
import java.sql.SQLException;

public class FirstQuery {
public static void main(String[] args) {
//Define Connection variable
Connection conn = null;
//Begin standard error handling
try{
//STEP 2: Register JDBC driver
String driver = “oracle.jdbc.driver.OracleDriver”;  //for oracle database
Class.forName(driver);
//STEP 3: Open a connection
System.out.println(“Connecting to database…”);
String jdbcUrl = “jdbc:oracle:thin:@localhost:1521:XE”;
 /* this url will be different for the different databases */
String user = “username”; // username and password for the database authentication
String password = “password”;
conn = DriverManager.getConnection(jdbcUrl,user,password);
//STEP 4: Execute a query
Statement stmt = conn.createStatement();
String sql;
sql = “SELECT SSN, Name, Salary, Hiredate FROM Employees”;
ResultSet rs = stmt.executeQuery(sql);
//STEP 5: Extract data from result set
while(rs.next()){
//Retrieve by column name
int ssn= rs.getInt(“ssn”);
String name = rs.getString(“name”);
//Retrieve by column index as an example
double salary = rs.getDouble(3);
Date date = rs.getDate(4);
//Display values
System.out.print(“SSN: ” + ssn);
System.out.print(“, Name: ” + name);
System.out.print(“, Salary: $” + salary);
System.out.println(“, HireDate: ” + date);
}
//STEP 6: Clean?up environment
rs.close();
stmt.close();
conn.close();
}catch(SQLException se){
//Handle errors for JDBC
se.printStackTrace();
}catch(Exception e){
//Handle errors for Class.forName
e.printStackTrace();
}finally{
//finally block used to close resources
try{
      if(conn!=null)
         conn.close();
    }catch(SQLException se){
            se.printStackTrace();
      }
}
System.out.println(“Finished!”);
}//end main
}

The Expected Output after you have created the databse with some values of these columns:

Connecting to database…
SSN: 111111111, Name: Harish, Salary: $5000.55, HireDate: 2010?09?16
SSN: 419876541, Name: Vikas, Salary: $1500.75, HireDate: 2011?03?05
SSN: 312654987, Name: Aman, Salary: $2000.95, HireDate: 2012?01?11
SSN: 123456789, Name: Sumit, Salary: $3080.05, HireDate: 2009?09?07
SSN: 987654321, Name: John, Salary: $4351.27, HireDate: 2011?12?31
Finished!
The values that you will store in the databse will be shown.

This was just a simple demonstration program to just give some idea about the JDBC connectivity.
In the later posts, the more detailed and programs for entering the data in the databse, updating and creating the database using either Statement or preparedStatement objects.

For additional affordable web design toronto specifics, look at intelligentwebsitetechnologies.com.
For a good Attracta SEO review, consult this Attracta review post on Yahoo Voices.

JDBC Storage Handler for Hive

I was able to complete the implementation of Hive JDBC Storage Handler with basic functionality. Therefore I thought to write a blog post describing the usage with some sample queries. Currently It supports writing into any database and reading from major databases (MySql, MsSql, Oracle, H2, PostgreSQL). This feature comes with upcoming WSO2 BAM 2.0.0 Milestone 5 release. I’ll update the blog with the link to BAM2 M5 distribution when it is done.

Setting up the BAM to use Hive jdbc-handler. 


You need to add following property to hive-site.xml in $BAM_HOME/repository/conf/advanced/ directory, in this property you will point to the jdbc-driver of your database.
<property>
<name>hive.aux.jars.path</name>
<value>file:///opt/lib/drivers/mysql-connector-java-5.1.10-bin.jar</value>
</property>

Web UI for executing Hive queries.

BAM2 comes with a web ui for executing the Hive queries. Also there is a option to schedule the script

 
User interface for writing Hive Queries



User interface for scheduling hive script

Sample on writing analyzed data into JDBC 


Here I am going to demonstrate the functionality of writing the analyzed data into JDBC storage. In this simple example, We’ll fetch records from a file then analyze it using hive and finally store those analyzed data into MySQL database. 

Records – These are the records that we are going to analyze.

bread   12      12/01/2012
sugar   20      12/01/2012
milk    5       12/01/2012
tea     33      12/01/2012
soap    10      12/01/2012
tea     9       13/01/2012
bread   21      13/01/2012
sugar   9       13/01/2012
milk    14      13/01/2012
soap    8       13/01/2012
biscuit 10      14/01/2012

Hive Queries

//drop tables if already exist
 
drop table productTable;
 
drop table summarizedTable;
 
CREATE TABLE productTable (product STRING, noOfItems INT, dateOfSold STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
 
//Load the file with above records
 
load data local inpath '/opt/sample/data/productInfo.txt' into table productTable;
 
CREATE EXTERNAL TABLE IF NOT EXISTS
summarizedTable( product STRING, itemsSold INT) 
STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler'
    TBLPROPERTIES (
                'mapred.jdbc.driver.class' = 'com.mysql.jdbc.Driver',
                'mapred.jdbc.url' = 'jdbc:mysql://localhost/test',
                'mapred.jdbc.username' = 'username',
                'mapred.jdbc.password' = 'password',
                'hive.jdbc.update.on.duplicate'= 'true',
                'hive.jdbc.primary.key.fields'='product',
                'hive.jdbc.table.create.query''CREATE TABLE productSummary (product VARCHAR(50) NOT NULL PRIMARY KEY, itemsSold INT NOT NULL)');
 
insert overwrite table summarizedTable SELECT product, sum(noOfItems) FROM productTable GROUP BY product;

View the result in mysql.

mysql> select * from productSummary;
+---------+-----------+
| product | itemsSold |
+---------+-----------+
| biscuit | 10 |
| bread | 33 |
| milk | 19 |
| soap | 18 |
| sugar | 29 |
| tea | 42 |
+---------+-----------+
6 rows in set (0.00 sec)

Detail description on TBLPROPERTIES in storage handler.

Property name Required Detail
mapred.jdbc.driver.class Yes
The classname for the JDBC Driver to use. This should be available on Hive’s classpath.
mapred.jdbc.url  Yes The connection url for the database.
mapred.jdbc.username No The database username, if it’s required.
mapred.jdbc.password  No The database Password, if it’s required.
hive.jdbc.table.create.query No
If table already exist in the database, then you don’t need this. Otherwise you should provide the sql query for creating the table in the database.

mapred.jdbc.output.table.name  No
The name of the table in the database. It does not have to be the same as the name of the table in Hive. If you have specified the sql query for creating the table, handler will pick the table name from query. Otherwise you need to specify this if your meta table name is different from the table in database.
hive.jdbc.primary.key.fields Yes If you have any primary keys in the database table
hive.jdbc.update.on.duplicate No
Expected values are either “true” or “false”. If “true” then the storage handler will update the records with duplicate keys. Otherwise it will insert all data. 

hive.jdbc.output.upsert.query No
This can be use to optimize the update operation. The default implementation is  to use insert or update statement after the select statement. So there will be two database round trips. But we can reduce it to one by using db specific upsert statement. Example query for mysql database is ‘INSERT INTO productSummary (product, itemsSold) values (?,?) ON DUPLICATE KEY UPDATE itemsSold=?’

hive.jdbc.upsert.query.values.order No
If you are using an upsert query then this is mandatory. sample values for above query will be ‘product,itemsSold,itemsSold’ //values order for each question mark 

hive.jdbc.input.columns.mapping No
This is mandatory if your field names in meta table and database tables are different. Provide the field names in database table in the same order as the field names in meta table with ‘,’ separated values. example: productNames,noOfItemsSold. These will map to your meta table with product,itemsSold field names.

mapred.jdbc.input.table.name No
Used when reading from a database table. This is needed if the meta table name and database table name are different.

Sample on reading from JDBC.

Now I am going to read the previously saved records from mysql using hive jdbc-handler.

Hive queries

//drop table if already exists
drop table savedRecords;
 
CREATE EXTERNAL TABLE IF NOT EXISTS savedRecords( product STRING, itemsSold INT) 
STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler'        
             TBLPROPERTIES (                
                    'mapred.jdbc.driver.class' = 'com.mysql.jdbc.Driver',
                    'mapred.jdbc.url' = 'jdbc:mysql://localhost/test', 
                    'mapred.jdbc.username' = 'username',     
                    'mapred.jdbc.password' = 'password',
                    'mapred.jdbc.input.table.name' = 'productSummary');
SELECT product,itemsSold FROM savedRecords ORDER BY itemsSold;

This will give all the records in the productSummary table.

How to remote debug Apache Cassandra standalone server

In order to debug the cassandra server from your favorite IDE. You need to add the following into cassandra-env.sh located in apache-cassandra-1.1.0/conf directory.

JVM_OPTS="$JVM_OPTS -Xdebug"
JVM_OPTS="$JVM_OPTS -Xnoagent"
JVM_OPTS="$JVM_OPTS -Djava.compiler=NONE"
JVM_OPTS="$JVM_OPTS -Xrunjdwp:transport=dt_socket,server=y,address=5005,suspend=n"
cassandra-env.sh


After adding this, once you start the server you can see the following line printed in cassandra console

“Listening for transport dt_socket at address: 5005″ 

This the port that you specified in JAVA_OPTS. You can change it to some other value as you want.

Now configure your IDE to run on debug mode.

Now you can debug the apache cassandra server from your favorite IDE :)