Creating a Java 8 Stream from unbounded data using Spliterator

Problem Statement

I have a large XML file. I would like to read it, then group and aggregate its rows using Java 8. A DOM parser with JAXB cannot handle this, as it is a really large file. I would like to create a Stream from the unbounded data contained in the XML file.

Solution

I read the XML by streaming it with StAX. Since I do not load the entire file into memory, I am good. I go a step further and use JAXB to unmarshal small portions of this file, each of which I will call a row. I use a Spliterator backed by a BlockingQueue to create a Stream out of these rows. Once I have the stream, I apply the familiar groupingBy collector and aggregate the rows.

The XML

The sample XML looks like this:

There would be thousands of “T” elements. I have modeled my POJO on the element “T”. I use StAX to read the XML. When I read one “T” element, I use JAXB to unmarshal it to a Java object and then add it to the Stream.

The POJO

I have modeled the POJO as below:

The Stax Parser

The heart of this is the Stax parser:
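A minimal sketch of the reading loop, under some loud assumptions: it skips JAXB and collects each “T” element into a plain Map, it assumes every child of “T” holds text only, and the child element names in main() are hypothetical. It is meant to show the StAX-to-BlockingQueue hand-off, not the exact parser used in the project.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxRowReader {

    /** Reads every "T" element and puts its child values on the queue as one row. */
    static void readRows(Reader xml, BlockingQueue<Map<String, String>> queue)
            throws XMLStreamException, InterruptedException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        Map<String, String> row = null;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("T".equals(name)) {
                        row = new LinkedHashMap<>();   // a new row starts
                    } else if (row != null) {
                        // Assumption: children of "T" contain text only.
                        row.put(name, reader.getElementText());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "T".equals(reader.getLocalName())) {
                    queue.put(row);                    // hand the finished row to the consumer
                    row = null;
                }
            }
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document; the real file has thousands of "T" elements.
        String xml = "<root><T><name>a</name><amount>1</amount></T>"
                + "<T><name>b</name><amount>2</amount></T></root>";
        BlockingQueue<Map<String, String>> queue = new LinkedBlockingQueue<>();
        readRows(new StringReader(xml), queue);
        System.out.println(queue);
    }
}
```

Only one row is held in memory by the reader at any time; everything else sits in the queue until the consumer takes it.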

 

I use the CountDownLatch only because I need my JUnit test to stay alive until the document is read fully. It would not be needed in an actual server environment. Note the usage of the BlockingQueue.

Spliterator implementation
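A minimal sketch of a Spliterator backed by a BlockingQueue. The end-of-data marker (a "poison pill" compared by identity) is an assumption of this sketch for signalling that the producer is done; the class and method names are hypothetical as well.

```java
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class QueueSpliterator<T> extends Spliterators.AbstractSpliterator<T> {

    private final BlockingQueue<T> queue;
    private final T endMarker;
    private boolean finished;

    QueueSpliterator(BlockingQueue<T> queue, T endMarker) {
        // Size is unknown up front, so report Long.MAX_VALUE as the estimate.
        super(Long.MAX_VALUE, Spliterator.ORDERED | Spliterator.NONNULL);
        this.queue = queue;
        this.endMarker = endMarker;
    }

    @Override
    public boolean tryAdvance(Consumer<? super T> action) {
        if (finished) {
            return false;
        }
        try {
            T next = queue.take();           // blocks until the producer adds a row
            if (next == endMarker) {         // identity check: the producer is done
                finished = true;
                return false;
            }
            action.accept(next);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            finished = true;
            return false;
        }
    }

    /** Wraps the queue in a sequential Stream that ends at the marker. */
    static <T> Stream<T> stream(BlockingQueue<T> queue, T endMarker) {
        return StreamSupport.stream(new QueueSpliterator<>(queue, endMarker), false);
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        String end = new String("END");      // distinct instance used only as the marker
        queue.add("row-1");
        queue.add("row-2");
        queue.add(end);
        List<String> rows = stream(queue, end).collect(Collectors.toList());
        System.out.println(rows);
    }
}
```

In a real run the producer would fill the queue from another thread while the stream is being consumed; take() simply blocks whenever the consumer gets ahead.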

 

The grouping logic

This part is very simple. We actually stream a GZip file by using a GZIPInputStream:
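A minimal sketch of the grouping step on top of a GZIPInputStream. To stay short it reads hypothetical "category,amount" text lines instead of the XML rows, so the input format and all names here are assumptions; the point is the groupingBy collector with a downstream aggregation.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipGrouping {

    /** Decompresses on the fly and sums the amounts per category. */
    static Map<String, Double> totalsByCategory(InputStream gzipped) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new GZIPInputStream(gzipped), StandardCharsets.UTF_8))) {
            return reader.lines()
                    .map(line -> line.split(","))
                    .collect(Collectors.groupingBy(
                            parts -> parts[0],
                            TreeMap::new,
                            Collectors.summingDouble(parts -> Double.parseDouble(parts[1]))));
        }
    }

    /** Helper for the demo: compresses a string in memory. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = gzip("books,10.5\nmusic,4\nbooks,2.5\n");
        System.out.println(totalsByCategory(new ByteArrayInputStream(data)));
    }
}
```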

 

Sources

https://github.com/paawak/blog/tree/master/code/reactive-streams/spliterator-demo

I found some large XML files at the location below:

http://www.cs.washington.edu/research/xmldatasets/www/repository.html

 

Building a REST Server with Spring MVC

We would like to build a REST server with Spring MVC. It should be very simple to support various formats like JSON and XML for the same request, just by changing the Accept header. For example, I have the below URL:

http://localhost:8090/simple-spring-rest/bank-detail

It should return JSON or XML, depending on whether the Accept header is set to application/json or application/xml respectively. Other formats can be supported in the same way.

Let's see how to achieve that.

Configuration of Spring MVC

We will use pure Java configuration:
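A minimal sketch of such a configuration, assuming Spring 4.x with Jackson and JAXB on the classpath. The class name and the scanned package are hypothetical, and it does not run without those dependencies.

```java
import java.util.List;

import org.springframework.context.annotation.ComponentScan;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.converter.HttpMessageConverter;
import org.springframework.http.converter.json.MappingJackson2HttpMessageConverter;
import org.springframework.http.converter.xml.Jaxb2RootElementHttpMessageConverter;
import org.springframework.web.servlet.config.annotation.EnableWebMvc;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurerAdapter;

@Configuration
@EnableWebMvc
@ComponentScan(basePackages = "com.example.rest") // hypothetical package
public class WebConfig extends WebMvcConfigurerAdapter {

    @Override
    public void configureMessageConverters(List<HttpMessageConverter<?>> converters) {
        // Writes application/json responses.
        converters.add(new MappingJackson2HttpMessageConverter());
        // Writes application/xml responses for classes annotated with @XmlRootElement.
        converters.add(new Jaxb2RootElementHttpMessageConverter());
    }
}
```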

Note the use of WebMvcConfigurerAdapter. It comes in handy when you want to work with Spring MVC. Especially noteworthy is the configureMessageConverters() method, which you would use to configure a REST service. It defines how Spring handles the @ResponseBody or @RestController annotation to translate a POJO to the response type: JSON, XML, etc. In the above example, we are using MappingJackson2HttpMessageConverter to convert our POJOs to JSON and Jaxb2RootElementHttpMessageConverter to convert them to XML.

Model

Note the use of @XmlRootElement. This is absolutely necessary as we are using Jaxb2RootElementHttpMessageConverter to convert our POJOs to XML. If you omit this, you will get an “Error 406 Not Acceptable” response, the underlying cause being:

org.springframework.web.HttpMediaTypeNotAcceptableException: Could not find acceptable representation

Controller

The controller is very simple, and returns the POJO. It is up to the HttpMessageConverter to make sense of it and convert it to either JSON or XML.

This makes perfect sense, as the controller can just return the model, and the conversion becomes a configuration detail.

Alternate ways of specifying the desired response type

Spring gives us the flexibility of doing away with the Accept header to specify the type of response. If we want json output, we can simply say:

http://localhost:8090/simple-spring-rest/bank-detail.json

For xml, we can similarly say:

http://localhost:8090/simple-spring-rest/bank-detail.xml

Sources

The sources can be found here:

https://github.com/paawak/blog/tree/master/code/simple-spring-rest

Spring Java Config

After Spring came out with annotation-based Java configuration, I found it very handy. You can get rid of the XML Spring configs, as Java configs are safe to refactor, more readable and less verbose. Here are some of the examples that I used:

Configuration of Jdbc Connection Pool

Configuration of Spring MVC

Note the use of WebMvcConfigurerAdapter. It comes in handy when you want to work with Spring MVC. Especially noteworthy is the configureMessageConverters() method, which you would use to configure a REST service. It defines how Spring handles the @ResponseBody or @RestController annotation to translate a POJO to the response type: JSON, XML, etc. In the above example, we are using MappingJackson2HttpMessageConverter to convert our POJOs to JSON and Jaxb2RootElementHttpMessageConverter to convert them to XML.

Excluding a specific class from the config

Sometimes we would like to selectively exclude a couple of classes from the annotation config. This is how it is done:
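One way to do this, sketched here as an assumption rather than the exact code from the project, is an exclude filter on @ComponentScan. The package name and the excluded class are hypothetical, and Spring must be on the classpath.

```java
import org.springframework.context.annotation.ComponentScan;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.FilterType;

@Configuration
@ComponentScan(
        basePackages = "com.example.app", // hypothetical package
        excludeFilters = @ComponentScan.Filter(
                type = FilterType.ASSIGNABLE_TYPE,
                value = AppConfig.LegacyConfig.class))
public class AppConfig {

    // Hypothetical class that would normally be picked up by the scan,
    // but is skipped because of the exclude filter above.
    @Configuration
    public static class LegacyConfig {
    }
}
```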

 

Sources

The sources can be found here:

https://github.com/paawak/blog/tree/master/code/simple-spring-rest

Replacing web.xml with Java config

From Servlet 3.0 onwards, web.xml is no longer mandatory. It can now be replaced with pure Java configuration.

We can do that even without Spring. But since we are using Spring MVC, I am taking the Spring example. Define a class implementing the WebApplicationInitializer, as below:
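A minimal sketch of such an initializer, assuming Spring 4.x on the javax.servlet API. The class name, the servlet name and the config package are hypothetical.

```java
import javax.servlet.ServletContext;
import javax.servlet.ServletException;
import javax.servlet.ServletRegistration;

import org.springframework.web.WebApplicationInitializer;
import org.springframework.web.context.support.AnnotationConfigWebApplicationContext;
import org.springframework.web.servlet.DispatcherServlet;

public class WebAppInitializer implements WebApplicationInitializer {

    @Override
    public void onStartup(ServletContext servletContext) throws ServletException {
        // Load the @Configuration classes found in this (hypothetical) package.
        AnnotationConfigWebApplicationContext context = new AnnotationConfigWebApplicationContext();
        context.setConfigLocation("com.example.rest.config");

        // Equivalent of the <servlet> and <servlet-mapping> entries in web.xml.
        ServletRegistration.Dynamic dispatcher =
                servletContext.addServlet("dispatcher", new DispatcherServlet(context));
        dispatcher.setLoadOnStartup(1);
        dispatcher.addMapping("/");
    }
}
```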

Note that we can define servlets, filters, etc., whatever we would otherwise do in the web.xml. It's super simple and very readable.

Please be aware that you need to include the below servlet 3.x dependency in your pom.xml:

Moreover, you have to include the below Maven plugin and set the failOnMissingWebXml property to false in pom.xml:

 

The sources can be found here:

https://github.com/paawak/blog/tree/master/code/simple-spring-rest

Running Jetty 9 with Maven

Running Jetty through Maven is super simple. The only problem is that a lot of configuration changed in Jetty 9. It is significantly different from, say, Jetty 6. I will keep it short. The below lines in pom.xml would get the job done:

Note that it defines a custom port and a context path as well.

Use the below command to run Jetty:

mvn jetty:run

The sources are available here:

https://github.com/paawak/blog/tree/master/code/simple-spring-rest

Running Jetty 9 with Ant

We have a simple web application written in Spring MVC. Today, we will demo how to run the war through Ant, using Jetty 9. The needed libraries can be manually downloaded and provided as a part of the project. However, that becomes unwieldy, especially if you want to upgrade the Jetty version. So, we will define them as Maven dependencies in the build.xml.

The beauty of defining the required jars as Maven dependencies is that these jars will be downloaded into the local Maven repository if not present.

Note that we define the location of the war file. Also, we define a custom port and a context path.

The source code can be found here:

https://github.com/paawak/blog/tree/master/code/simple-spring-rest

[jackson] Handle LocalDate for json

Post Java 8, LocalDate has become mainstream. Of course, Joda Time, from which LocalDate derives, was quite popular as well. I was trying to convert LocalDate to and from json.

Consider the below simple Person POJO:

 

The below code converts a Person to Json:

The generated json is as below:

Notice that the dateOfBirth field has lots of attributes. The next logical thing to do would be to take this string and try to unmarshal it back to a Java object:

However, this results in the below exception:

 

Solution

It's very easy to solve. We need to add the LocalDate type module in pom.xml:

To enable it:
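A minimal sketch, assuming the module in question is jackson-datatype-jsr310 and that its JavaTimeModule class is available (Jackson 2.6 or later). The Person class below is a hypothetical stand-in for the POJO above, and the snippet needs Jackson on the classpath.

```java
import java.time.LocalDate;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;

public class LocalDateJsonSketch {

    // Hypothetical POJO with a LocalDate field.
    public static class Person {
        private String name;
        private LocalDate dateOfBirth;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public LocalDate getDateOfBirth() { return dateOfBirth; }
        public void setDateOfBirth(LocalDate dateOfBirth) { this.dateOfBirth = dateOfBirth; }
    }

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        // Teaches the mapper how to read and write java.time types.
        mapper.registerModule(new JavaTimeModule());

        Person person = new Person();
        person.setName("Sachin");
        person.setDateOfBirth(LocalDate.of(1980, 1, 1));

        String json = mapper.writeValueAsString(person);
        Person roundTripped = mapper.readValue(json, Person.class);
        System.out.println(json);
        System.out.println(roundTripped.getDateOfBirth());
    }
}
```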

Now we can use the ObjectMapper in the usual way. Note that the JSON string now becomes more concise:

Sources

The entire project is here:

https://github.com/paawak/blog/tree/master/code/json-inheritance-demo

The full test case is here:

https://github.com/paawak/blog/blob/master/code/json-inheritance-demo/src/test/java/com/swayam/demo/json/simple/LocalDateJsonTest.java

 

Using SwingWorker to update Swing Components asynchronously

I have a JList, the contents of which are updated asynchronously by a different thread. The way I update the JList is by updating the model:

The only problem is, if this is not done on the event dispatch thread, the UI becomes unresponsive and starts behaving weirdly.

The way to solve this is to use the SwingWorker.

I have a Jini server, which streams data to a Jini client, in this case, a Swing JFrame. I have a JList to which I publish the data as and when it becomes available.

Jini Service API

It's a simple streaming service:

At its heart is the interface RemoteDataListener, on to which the server publishes data as it becomes available.

Client Implementation

The key is to implement the interface RemoteDataListener along with the SwingWorker as shown below as an inner class in the JFrame:

We invoke the long-running service in the doInBackground() method and pass the worker itself as the listener. When new data is received in the newData() method, we immediately call publish(), which hands it over to the process() method. The process() method is invoked on the event dispatch thread, which ensures that any update made to the model of the JList is reflected on the UI. So, the code to update Swing components resides here. Also note that the done() method is called by the SwingWorker after doInBackground() has finished.
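The publish()/process() hand-off can be sketched in isolation. This is a simplified stand-in, not the client from the project: the remote service is replaced by a plain list, and the class name and the CountDownLatch used for waiting are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import javax.swing.DefaultListModel;
import javax.swing.SwingWorker;

class ListModelUpdater extends SwingWorker<Void, String> {

    private final DefaultListModel<String> model;
    private final List<String> source;
    // Hypothetical helper so that callers can wait until done() has run.
    final CountDownLatch finished = new CountDownLatch(1);

    ListModelUpdater(DefaultListModel<String> model, List<String> source) {
        this.model = model;
        this.source = source;
    }

    @Override
    protected Void doInBackground() {
        // Runs on a worker thread; stands in for the long-running remote call.
        for (String item : source) {
            publish(item);
        }
        return null;
    }

    @Override
    protected void process(List<String> chunks) {
        // Runs on the event dispatch thread, so touching the model is safe here.
        for (String item : chunks) {
            model.addElement(item);
        }
    }

    @Override
    protected void done() {
        // Also runs on the event dispatch thread, once doInBackground() has returned.
        finished.countDown();
    }

    public static void main(String[] args) throws Exception {
        DefaultListModel<String> model = new DefaultListModel<>();
        ListModelUpdater worker = new ListModelUpdater(model, Arrays.asList("a", "b", "c"));
        worker.execute();
        worker.finished.await(10, TimeUnit.SECONDS);
        System.out.println(model);
    }
}
```

In the real frame the same DefaultListModel would be set on the JList, so each process() call shows up on screen as new rows.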

Running the example

Run the SpringNonSecureRmiServer to start the Reggie and the Jini Server. After the Jini Server starts up, run the RmiStreamingDemoFrame.

Sources

The sources can be found here: https://github.com/paawak/blog/tree/master/code/jini/unsecure/streaming-with-jini

There are 3 Maven projects under that:

  1. rm-api
  2. rmi-client
  3. rmi-server

Data streaming with Jini

Let's take a simple example of how we can stream data from a Jini Server to a Jini Client. Data streaming means that we can send huge, unbounded data to a consumer. In this example, we will read from a DataStore and directly send it to the client.

We have the following simple api:

The RemoteDataListener is a remote call-back where the data would be published from the Jini server as it becomes available.

Note that this is a Remote interface as well.

Server Implementation

The server implementation is very simple: it's a straight delegation to the Dao.

The Dao just pushes data to the RemoteDataListener as shown below:
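The callback idea can be sketched with plain java.rmi types. Only the names RemoteDataListener and newData() come from this post; the String payload, the endOfData() method and the other classes are hypothetical, and the listener is called in-process here instead of being exported through Jini.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Remote call-back on which the server publishes data as it becomes available.
interface RemoteDataListener extends Remote {
    void newData(String data) throws RemoteException;

    // Hypothetical: tells the client that no more data will follow.
    void endOfData() throws RemoteException;
}

// Hypothetical Dao that pushes each row to the listener instead of returning a list.
class InMemoryDao {
    private final List<String> rows;

    InMemoryDao(List<String> rows) {
        this.rows = rows;
    }

    void streamAll(RemoteDataListener listener) throws RemoteException {
        for (String row : rows) {
            listener.newData(row);   // one call per row; nothing is buffered here
        }
        listener.endOfData();
    }
}

// Hypothetical listener that just remembers what it received.
class CollectingListener implements RemoteDataListener {
    final List<String> received = new ArrayList<>();
    boolean finished;

    @Override
    public void newData(String data) {
        received.add(data);
    }

    @Override
    public void endOfData() {
        finished = true;
    }
}

class StreamingSketch {
    public static void main(String[] args) throws RemoteException {
        CollectingListener listener = new CollectingListener();
        new InMemoryDao(Arrays.asList("row-1", "row-2")).streamAll(listener);
        System.out.println(listener.received + " finished=" + listener.finished);
    }
}
```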

Client Implementation

The trick really is in the client implementation. In this example, since the RemoteDataListener is a Remote listener, it is an exported Jini service which lives on the client. This is a case where the Jini Client and the Jini Server swap roles and the client becomes the server. Just to illustrate our point, we have created a very simple client which just writes the received data to the console.

Then, we have a simple class with the main method to invoke the streaming service:

Note that just before the streaming service is invoked, the RemoteDataListener is exported as a Jini service, and it is the exported RemoteDataListener that is then passed to the streaming service. That is the trick, really.

The SpringContextHelper is a simple class to load up Spring Context and help look up remote services:

Running the example

We have embedded the Reggie in, so simply run the SpringNonSecureRmiServer to start the Reggie and the Jini Server. After the Jini Server starts up, run the SimpleStreamingClient. You would see the following on the console:

Sources

The sources can be found here: https://github.com/paawak/blog/tree/master/code/jini/unsecure/streaming-with-jini

There are 3 Maven projects under that:

  1. rm-api
  2. rmi-client
  3. rmi-server

Maven: Creating deployable distribution: Part 2: creating fat jar

In Part 1 of this series, we saw how to use the appassembler plugin to create a distribution tar. In this increment, we will see how to use the Maven assembly plugin to create an uber jar or fat jar. An uber jar or fat jar is a single jar having the contents of all the dependencies needed for the project. Again, we will be using the maven-distribution-example as reference. This project needs the project rmi-service-api to compile. We have created a profile called uber-jar:

The assembly plugin uses a pre-fabricated descriptor called jar-with-dependencies. This would condense all dependencies into the single fat jar.

Note there are 2 executions. The first one creates the uber jar and the second one takes that uber jar and the readme.txt and creates a zip file. It does this by using the below assembly descriptor (uber-jar-assembly-descriptor.xml):

The uber jar can be generated by:

mvn clean package -P uber-jar

The sources can be found here: https://github.com/paawak/blog/tree/master/code/maven-ant-assembly-example