Wednesday, July 8, 2015

How to create a custom Maven packaging

Distributing software made in Java is pretty standard: either a jar, a war or an ear format is used. However, the need for other formats is not uncommon. Cases like applications distributed as a tar or zip file, along with some scripts, are often found.

Maven provides an easy mechanism for packaging applications in the standard formats, and when a custom packaging is required, the Assembly plugin is the most common option. Though it is a good solution, it tends to overload the build specification with its configuration and descriptor file. It is also not easy to share or reuse among similar projects.

A custom packaging is an appropriate alternative for these cases, as all the magic can be encapsulated inside a Maven plugin, and the pom would simply look like:


<packaging>my-custom-packaging</packaging>


Let's create a custom packaging called jzip, which packages the application as a zip file with a start script.
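
The heart of such a plugin is a META-INF/plexus/components.xml file that registers two components for the new packaging: a lifecycle mapping, binding build phases to goals, and an artifact handler, telling Maven what kind of file the artifact is. A minimal sketch could look like this (the coordinates org.softwaredistilled:jzip-maven-plugin and the goal package-jzip are illustrative names, not an existing plugin):

<component-set>
  <components>
    <!-- For projects declaring <packaging>jzip</packaging>, bind each phase to a goal -->
    <component>
      <role>org.apache.maven.lifecycle.mapping.LifecycleMapping</role>
      <role-hint>jzip</role-hint>
      <implementation>org.apache.maven.lifecycle.mapping.DefaultLifecycleMapping</implementation>
      <configuration>
        <lifecycles>
          <lifecycle>
            <id>default</id>
            <phases>
              <process-resources>org.apache.maven.plugins:maven-resources-plugin:resources</process-resources>
              <compile>org.apache.maven.plugins:maven-compiler-plugin:compile</compile>
              <package>org.softwaredistilled:jzip-maven-plugin:package-jzip</package>
              <install>org.apache.maven.plugins:maven-install-plugin:install</install>
              <deploy>org.apache.maven.plugins:maven-deploy-plugin:deploy</deploy>
            </phases>
          </lifecycle>
        </lifecycles>
      </configuration>
    </component>
    <!-- A jzip artifact is, in the end, a file with the .zip extension -->
    <component>
      <role>org.apache.maven.artifact.handler.ArtifactHandler</role>
      <role-hint>jzip</role-hint>
      <implementation>org.apache.maven.artifact.handler.DefaultArtifactHandler</implementation>
      <configuration>
        <type>jzip</type>
        <extension>zip</extension>
        <packaging>jzip</packaging>
      </configuration>
    </component>
  </components>
</component-set>

For the mapping to take effect, the project using the packaging declares the plugin in its build section with <extensions>true</extensions>.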

Tuesday, May 26, 2015

Decoupling filtering, sorting and paging with HATEOAS

Filtering a list of anything, as well as sorting and paging it, requires the client to know which parameters it must use. Furthermore, if some parameters are not compatible (like filters that are not supposed to be used together), the required knowledge is even deeper. This creates a coupling to the server, as any change to these parameters will have an impact on the client.

HATEOAS can be used to reduce the coupling between a client and a server, specifically in the way the client obtains resources, by following links. This can be applied in some scenarios for filtering, sorting and paging lists. As lists are resources, links for these operations can be provided.

Let's take as an example an API for selling cars. The client has the following URL /shop-api/cars, obtained through a link while navigating the API or previously bookmarked. This URL lists the first ten available cars for sale.

In order to sort the cars by price, the client would have to know which sorting parameters to use, but with HATEOAS, the list resource can include a link for this:

    {
        "rel" : "sortedByPriceAsc",
        "href" : "http://shop-api/cars?sortBy=price&sortDirection=ASC",
        "title" : "Price: Low to High"
    }


By locating the link with rel “sortedByPriceAsc” and following it, the client sorts the list by price ascending and never has to deal with the sortBy and sortDirection parameters. If these parameters change (unlikely in real cases, but possible) to something like sort=price-ASC, the client is not affected, as it keeps following the same link, now with the new URL:

    {
        "rel" : "sortedByPriceAsc",
        "href" : "http://shop-api/cars?sort=price-ASC",
        "title" : "Price: Low to High"
    }
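
In code, locating and following such a link is trivial. Here is a minimal Java sketch; the Link class is illustrative (any JSON binding would do), and the actual GET request is left to whatever HTTP client the project already uses:

import java.util.List;
import java.util.Optional;

public class LinkFollower {

    // Plain representation of the link structure shown above.
    public static class Link {
        final String rel;
        final String href;
        final String title;

        Link(String rel, String href, String title) {
            this.rel = rel;
            this.href = href;
            this.title = title;
        }
    }

    // The client only knows the rel; the URL behind it can change freely.
    public static Optional<String> hrefFor(List<Link> links, String rel) {
        return links.stream()
                .filter(link -> rel.equals(link.rel))
                .map(link -> link.href)
                .findFirst();
    }
}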


Once sorted, if the next ten cars are desired, the list resource will provide a link to get them, one that not only handles the pagination parameters but also keeps the navigation consistent, as a sort has been applied previously:

    {
        "rel" : "next",
        "href" : "http://shop-api/cars?sort=price-ASC&page=2&pageSize=10",
        "title" : "Next"
    }


The client is not only decoupled from sorting and paging, but is also relieved from keeping track of the parameters already used for further requests.

Filtering follows the same mechanism. In order to get only convertible cars, the list resource includes this link: 

    {
        "rel" : "convertibles",
        "href" : "http://shop-api/cars?category=convertible",
        "title" : "Convertibles"
    }

In all cases, the navigation coherence of the API is driven by the server. Once a sort, pagination or filter is applied, the obtained resource will provide links for further navigation, including any parameter already used, unless it is incompatible (like several sorts at the same time), in which case it is excluded from the link URL.

The three operations, then, are transformed into following links, requiring the client only to know which one to follow. Also, incorporating any new filter or sort is just a matter of using a new link.


Conclusion


To some extent, filtering, sorting and paging are withdrawn as operations for manipulating lists, in the sense that there is no filter: what there is, is a link to a resource that turns out to be a "reduced" list of resources. Likewise, there is no sort: what there is, is a link to a resource that turns out to be a "sorted" list of resources. The whole API navigation is limited to one mechanism, and that is following links; lists are not special cases. The longer the client can keep the interaction this way, the less coupled it will be.

The first concern regarding this approach is the proliferation of links. For each possible value in a filter, a link must be provided. That seems like a lot of links, and it is, but is this negative enough to eclipse the benefits?

It could be argued that this proliferation of links increases the size of the response, but it is just text and can be efficiently compressed if needed (HTTP compression: gzip). Also, programmatically speaking, building those links at the server doesn't present much of a challenge.

There are two scenarios (there always are) where this approach doesn't fit like a glove. 

The first one is when the filter values are not discrete or there are too many options, making it unrealistic to offer a link for each one. A possible solution, which could work in some cases, is to offer links representing ranges, for example price ranges, as shown below.
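
For instance, instead of one link per price, the list resource could offer a handful of range links (the maxPrice parameter is an illustrative name):

    {
        "rel" : "priceUnder10000",
        "href" : "http://shop-api/cars?maxPrice=10000",
        "title" : "Price: Under 10,000"
    }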

The second scenario is the ability to apply several filters and a sort at the same time, in one interaction. This is the case where the user chooses several options at once and then triggers a search, expecting all criteria to be applied. The link concept here is not flexible enough to tackle the problem, as each link models the application of one criterion. Some possible solutions are to use URI templates in the href value of the link, or to use forms (like HTML forms). In both cases, though the client would need to know how to complete those structures (generating a certain level of coupling), some of the advantages, like decoupling part of the URL and keeping navigation coherent, can still be obtained.
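
As a sketch of the first option, a link could carry a URI template (RFC 6570) in its href; the templated field follows the HAL convention, and the parameter names are illustrative:

    {
        "rel" : "search",
        "href" : "http://shop-api/cars{?category,maxPrice,sort}",
        "templated" : true,
        "title" : "Search"
    }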

There is a third scenario worth mentioning. If the client is an intermediary backend that enhances or aggregates the API with more functionality, the hypermedia mechanism (at least for filtering, sorting and paging) looks a bit cumbersome, as it may have to define some sort of mapping between the links it offers in its own API and those provided by the API it consumes. More on this in further posts, as it seems to be a long discussion.

Not all scenarios are the same, and there is no unique solution for all of them. HATEOAS presents an elegant and simple solution for decoupling filtering, sorting and paging operations for final API clients, reducing the knowledge they must have about parameters and how to combine them, and leveraging an easy interaction between the client and the server.



HATEOAS for reducing coupling

In any Client-Server architecture, there are basically two types of coupling: the data they exchange and the means of communication between them.

Any change in either the data structure or communication will affect the client. No matter how well the client was programmed, a refactor, either big or small, must be performed. And that's the nature of coupling.

But in the pursuit of loose coupling, HATEOAS offers a simple mechanism for reducing the coupling generated by communication.

HATEOAS stands for Hypermedia As The Engine of Application State. The basic idea is to use Hypermedia as the mechanism for the client to interact with the server.

Through the use of links, a client can obtain all the resources it needs. A link has a well-known structure that includes, at least, two fields: href holds a resource URI, and rel helps to understand which resource will be obtained. The whole interaction then is reduced to searching and following links.
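
For example, a minimal link pointing to a list of cars could look like this:

    {
        "rel" : "cars",
        "href" : "http://shop-api/cars"
    }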

But, how will the use of links help the client in reducing communication-related coupling?

Saturday, March 7, 2015

How to debug a Maven plugin

Maven plugins are debugged using the Java remote debugging mechanism. By using mvnDebug instead of mvn, a remote debugging session is started. It can be started by triggering a complete build or by invoking a specific goal, in both cases over a project that uses the plugin.

Let's exemplify with the plugin built in the previous post. By executing the goal check-properties with mvnDebug, the execution will wait, reporting that it is listening on a specific port:


mvnDebug org.softwaredistilled:properties-maven-plugin:1.0:check-properties -Dcheck="i18n_??"

Preparing to Execute Maven in Debug Mode
Listening for transport dt_socket at address: 8000



Another option would be to trigger a build like:

mvnDebug clean package

Preparing to Execute Maven in Debug Mode
Listening for transport dt_socket at address: 8000



Most Java IDEs, if not all of them, provide remote debugging tools. In Eclipse, a Remote Java Application debug configuration has to be created (in the Debug Configurations dialog); once started, execution will stop at any breakpoint.

The tricky part is that the plugin must be installed every time a modification is made in order to see the changes at debugging time.
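
That is, after each change in the plugin project, something like the following has to be run before launching mvnDebug again:

mvn clean install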

Thursday, March 5, 2015

How to create a Maven plugin and goals

Like any generic tool, Maven may require some customization. If this customization involves actions not contemplated by the tool, two approaches can be taken.

The first approach is to include some scripting in the pom file. By embedding ANT code, a lot of tasks can be achieved. Though it is a quick solution, it may turn pom.xml into an illegible and unmanageable file. Plus, it is difficult to reuse or share anything.

The second approach is to create a plugin with goals that perform the desired tasks. 

A plugin can be thought of as a bag of tasks or commands called goals. Each goal can be used during build time to perform a specific job. In other words, a Maven plugin is the distribution mechanism for goals.


Building the plugin


It is very common in Java projects to use .properties files for internationalizing an application or for configuration purposes. It is also very common to forget a key in one of the .properties files. So let's build a plugin with a goal that checks, at build time, whether a set of .properties files all have the same keys, to avoid discovering this error later.

The first step is to use an archetype for creating the plugin project:

mvn archetype:generate \
    -DarchetypeGroupId=org.apache.maven.archetypes \
    -DarchetypeArtifactId=maven-archetype-plugin \
    -DgroupId=org.softwaredistilled \
    -DartifactId=properties-maven-plugin


As with any archetype, this will create all folders and files, plus an example of a goal. Each goal is handled by a Java class, known in the Maven world as a Mojo (Maven plain Old Java Object).

The idea is to create a goal that receives a list of regular expressions indicating the .properties files to check. For instance, by setting the expression "i18n_??", it will validate that all i18n_{language}.properties files have the same keys.
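
As a sketch, the Mojo could look like the following; the annotations come from maven-plugin-annotations, while the resources directory default and the single-expression handling are simplifications of the real goal:

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;
import java.util.Set;

import org.apache.maven.plugin.AbstractMojo;
import org.apache.maven.plugin.MojoExecutionException;
import org.apache.maven.plugins.annotations.LifecyclePhase;
import org.apache.maven.plugins.annotations.Mojo;
import org.apache.maven.plugins.annotations.Parameter;

@Mojo(name = "check-properties", defaultPhase = LifecyclePhase.VALIDATE)
public class CheckPropertiesMojo extends AbstractMojo {

    // Expression like "i18n_??" selecting the .properties files to compare.
    @Parameter(property = "check", required = true)
    private String check;

    @Parameter(defaultValue = "${basedir}/src/main/resources")
    private File resourcesDir;

    public void execute() throws MojoExecutionException {
        // "?" matches a single character, so "i18n_??" becomes "i18n_..".
        String regex = check.replace(".", "\\.").replace("?", ".") + "\\.properties";
        File[] files = resourcesDir.listFiles((dir, name) -> name.matches(regex));
        if (files == null || files.length < 2) {
            return; // nothing to compare
        }
        Set<Object> reference = keysOf(files[0]);
        for (File file : files) {
            if (!keysOf(file).equals(reference)) {
                throw new MojoExecutionException(
                        "Keys differ between " + files[0].getName() + " and " + file.getName());
            }
        }
    }

    private Set<Object> keysOf(File file) throws MojoExecutionException {
        Properties properties = new Properties();
        try (FileInputStream in = new FileInputStream(file)) {
            properties.load(in);
            return properties.keySet();
        } catch (IOException e) {
            throw new MojoExecutionException("Cannot read " + file, e);
        }
    }
}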

Friday, March 1, 2013

Why Dependency Injection is a powerful concept


Dependency Injection (DI) is a type of Inversion of Control (IoC) focused on decoupling and inverting the way an object obtains its collaborators.

In order to achieve its purpose, almost every object interacts with other objects, so there must be some code that materializes "the wiring" among them. 

The easiest way to do this would be to place that code in the object definition; that is to say, the object is responsible for obtaining its collaborators, and one of the places of choice is its constructor.

Let's exemplify this with some Java code:

public class MyBusinessObject {

    private OtherBusinessObject obo;

    public MyBusinessObject() {
        obo = new OtherBusinessObject();
    }

    ...
}

Very simple, but it has some flaws. Let's say you programmed your objects with polymorphism in mind, and its underlying concept of exchangeable objects, to leverage code scalability. In Java, this can be achieved by defining the collaborator type as a superclass (probably abstract) or as an interface.

What if a new business requirement makes your object interact with a different class of its collaborator? 

Although small, some changes have to be made in the code (the collaborator instantiation in the constructor).

Furthermore, what if there are scenarios in which the object has to interact with different definitions of the collaborator (needless to say, all polymorphic among each other)?

This is very common when unit testing: the object is tested in isolation, so you make it interact with a "fake" version of its collaborator (aka a mock).

Clearly this approach is not flexible enough. So, how can we avoid modifying the code and support the ability to interact with different objects in different scenarios?

Here is where DI comes into action.
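
A sketch of the same class, now receiving its collaborator from the outside (constructor injection):

public class MyBusinessObject {

    private final OtherBusinessObject obo;

    // The collaborator is now injected: any polymorphic implementation,
    // including a mock for unit testing, can be passed in without
    // touching this class.
    public MyBusinessObject(OtherBusinessObject obo) {
        this.obo = obo;
    }
}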

Monday, January 28, 2013

JPA without Spring (or any DI framework)

JPA is the ORM defined by the Java Community Process, that is to say, it is a JSR (317). Before JPA, a couple of ORMs occupied the scene, like Hibernate and iBATIS, so the community realized Java needed its own ORM.

As with any other JSR, the specification is defined through documents, interfaces and abstract classes, a couple of tests, etc., letting the vendors implement the details. There are three main JPA implementations: Hibernate, EclipseLink and OpenJPA.

The idea, as always, is to program against the specification (classes, interfaces, annotations, etc.) and make the implementation available in the classpath, so the details are resolved at runtime. That gives you the freedom to choose the implementation that best fits your project, or to change it without touching your source code.

I've been working with JPA for a couple of years in different projects, and most of the time it's configured through a dependency injection (DI) framework like Spring, where the EntityManager is injected into some DAO class.

Although using a DI framework may be the most appropriate approach for plumbing JPA in an enterprise scenario, it is very interesting to dig a bit and see what happens under the hood. It helps a lot to understand what is part of the specification and what is part of the salad of frameworks you are using.
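
As a taste of what that looks like, plain JPA can be bootstrapped with nothing but the specification API; the persistence unit name "my-unit" is an assumption and must match one declared in META-INF/persistence.xml:

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class PlainJpa {

    public static void main(String[] args) {
        // No DI framework: the factory is created directly from the spec API.
        EntityManagerFactory emf = Persistence.createEntityManagerFactory("my-unit");
        EntityManager em = emf.createEntityManager();
        try {
            em.getTransaction().begin();
            // ... persist, find, query ...
            em.getTransaction().commit();
        } finally {
            em.close();
            emf.close();
        }
    }
}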