Technical Concepts In A Capsule

Java Serialization - Basics - Part 1

Background:
A good amount of documentation already exists on serialization and how Java implements it, but most of it covers only a subset of what serialization offers. This set of two articles aims to make the basic concept clear in the first part. In the second part, we will talk about the lesser-known and more complex facts about serialization.

Serialization concept:
Java lets us (programmers) create objects and keep them in memory for reuse. However, an object lives at most as long as the JVM that created it. Once the JVM shuts down, however nice and useful our object is, we can't use it as it is the next time.
So what should we do if we want to keep our object alive after the JVM shuts down and use it the next time the JVM comes up? Is there a way to use the same nice object in a different JVM altogether?
To answer these questions, serialization comes to our rescue.

Time for definition:
Serialization is the process of converting an object's state (the object as it stands in the JVM) into a sequence of bytes that can be persisted in a file, a database, or some other storage. Deserialization is the reverse: bringing the object back into a JVM (the same one or a different one) from those bytes so that it can be reused whenever required.

The definition tells us that we can flatten an object and store it for later use. In the next section we will see how Java does it.

How Java does it:
There are 3 ways in which we can persist an object. We will start with the most basic one and then work our way up to the more complex ways of implementing serialization. Before we start with the first way, we will talk about which objects we can serialize and which ones we can't or shouldn't.

#1. An object can be serialized as long as its class implements the "Serializable" interface.
This is the first and most important rule to remember about serialization. Serializable is a 'marker' interface, meaning it declares no methods. It simply tags a class so that Java's serialization mechanism knows its objects may be serialized.

Now what happens when an object implements the Serializable interface but has another object as an instance variable, and that object, to our misery, doesn't implement the interface? Can we still serialize our object?

Think about it. If we want to store the object and be able to reuse it later, we should store the entire object. Storing only part of its state can lead to pretty frustrating and potentially disastrous scenarios. So the next rule (or amendment to the first rule) is

#2. In order to serialize an object, every object it refers to through its non-transient fields MUST also implement the "Serializable" interface. Otherwise serialization fails at runtime with a NotSerializableException.
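
Rule #2 is easy to see in action. The following is a minimal sketch, not part of the original example set: the Person and Address classes and the failsToSerialize helper are hypothetical names made up for illustration. Person is Serializable, but it refers to an Address that is not, so writing it fails.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class NotSerializableDemo {

    // Hypothetical helper type that does NOT implement Serializable.
    static class Address {
        String city = "London";
    }

    // Serializable itself, but it refers to a non-serializable Address.
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        String name = "John Nash";
        Address address = new Address();
    }

    // Returns true if writing the object fails with NotSerializableException.
    static boolean failsToSerialize(Object obj) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return false;
        } catch (NotSerializableException ex) {
            // The exception message carries the name of the offending class.
            System.out.println("Cannot serialize: " + ex.getMessage());
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(failsToSerialize(new Person()));   // prints true
        System.out.println(failsToSerialize("just a String")); // prints false
    }
}
```

Note that the failure shows up only at runtime, when writeObject reaches the Address field. The compiler does not check this for us.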

Now think about another important fact. If Java's creators had wanted serialization to be available for all objects across the board, they would have made the "Object" class implement the "Serializable" interface. But it isn't so. What might be the reason?

It is easy. There are objects we don't want anybody to be able to serialize, e.g. a live 'Connection' object. Imagine somebody serializing an active db connection and then wanting to use it later. It doesn't make any sense. So it is important to remember that not all objects are meant to be serialized.

One more thing before we dive into the different serialization mechanisms. Imagine we have a "BlogPost" object with a member variable "noOfViews", which represents the number of times the post has been viewed. This variable is updated asynchronously by different threads at the same time, so it doesn't make any sense to store its value when we serialize the BlogPost object. (This may not be the best example around, but you should get the general idea.) So what do we do in such scenarios?
Easy. Java is always two steps ahead of us, so it has a provision for this. Just mark the variable as "transient" and Java's serialization mechanism will ignore it while storing the object. Easy... isn't it?

#3. Mark variables 'transient' if we don't want their values to be persisted while serializing. When the object is read back, a transient variable holds the default value for its type (0, false or null).
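
Here is a small sketch of rule #3 using the BlogPost idea from above. The class layout, the 'title' field and the roundTrip helper are assumptions made for illustration. The object is written to an in-memory byte array and read straight back, and the transient counter does not survive the trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {

    static class BlogPost implements Serializable {
        private static final long serialVersionUID = 1L;
        String title;
        transient int noOfViews; // skipped by the default serialization mechanism

        BlogPost(String title, int noOfViews) {
            this.title = title;
            this.noOfViews = noOfViews;
        }
    }

    // Serializes the post to bytes in memory and deserializes a fresh copy.
    static BlogPost roundTrip(BlogPost post) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(post);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (BlogPost) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        BlogPost copy = roundTrip(new BlogPost("Serialization basics", 42));
        System.out.println(copy.title);     // prints Serialization basics
        System.out.println(copy.noOfViews); // prints 0, the int default
    }
}
```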
Now we can talk about the first, default serialization mechanism.

Default serialization mechanism:
If our class implements the 'Serializable' interface mentioned above, we are done :) Java itself takes care of everything else while serializing the object. The following example shows how this works.

import java.io.Serializable;

public class Employee implements Serializable
{
    private String name;
    private int age;
    // Getters and setters for the fields
}

In all examples, we will use the above class for our reference.
Now we will see how an Employee object can be serialized.

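import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

// 'filename' is the path of the file to write to.
Employee employee = new Employee();
// try-with-resources closes both streams even if writeObject fails
try (FileOutputStream f = new FileOutputStream(filename);
     ObjectOutputStream outStream = new ObjectOutputStream(f)) {
    outStream.writeObject(employee);
} catch (IOException ex) {
    ex.printStackTrace();
}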

As simple as that. No need to do anything else. How Java internally traverses the object, the objects within it, the objects within those, and so on is out of scope for this discussion. We will talk about it in the next part. Till then, just assume that Java has a magic algorithm to traverse the object graph.
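
Our definition also promised that we can bring the object back. Reading is the mirror image of writing: an ObjectInputStream and readObject. The sketch below is self-contained, so it carries its own copy of Employee as a nested class. The temp file name, the serialVersionUID field and the write/read helper names are assumptions made for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class EmployeeRoundTrip {

    static class Employee implements Serializable {
        private static final long serialVersionUID = 1L;
        private String name;
        private int age;

        String getName() { return name; }
        void setName(String name) { this.name = name; }
        int getAge() { return age; }
        void setAge(int age) { this.age = age; }
    }

    // Flattens the employee into the given file.
    static void write(Employee employee, File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(employee);
        }
    }

    // Rebuilds an Employee from the bytes stored in the given file.
    static Employee read(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            // readObject returns Object, so a cast is needed
            return (Employee) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("employee", ".ser");
        file.deleteOnExit();

        Employee employee = new Employee();
        employee.setName("John Nash");
        employee.setAge(30);
        write(employee, file);

        Employee restored = read(file);
        System.out.println(restored.getName() + ", " + restored.getAge()); // prints John Nash, 30
    }
}
```

The restored object is a new instance with the same state, not the original object itself. Also note that the JVM doing the reading must have the Employee class available, otherwise readObject throws ClassNotFoundException.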

We can stop here for now. In the next part we will talk about the other two ways of serializing an object.

See you then.. :) Have a nice time learning Serialization.

URL , URI , URN - What is what and who's who

In the internet world, we come across these terms regularly. What exactly do they mean? When should we use each one, and how can we be sure we are using them properly? We all have these and many related questions. This article aims to make our life easier in this context and hopes to clear up the doubts around these terms.

First and foremost, it needs to be clear that the internet, or any network for that matter, is made up of different 'resources' which we need to 'access'. To make this clearer, here are a few examples of these two concepts.

RESOURCES: Any meaningful entity that is available in a network, e.g. webpages of websites, images hosted on Flickr, this blog post etc.

ACCESS: Any meaningful action taken on a resource, e.g. viewing a webpage, adding a new image on Flickr, deleting a post. These are nothing but different ways to access the resource.

Enough background; getting down to the actual topic (Q and A style)

Q: What do we need to access a resource?
A: Obviously, we first need to identify the resource in the network. Then we need to be able to obtain the resource somehow.

The next natural question is: how do we identify the resource? In a network, a resource can have a name which is unique within some context (also called a namespace) and a distinct location in the network (aka address). So a resource can be identified by its name or by its location (or in some cases by both at the same time, as we will see below).

So there we have it. All the things we need to know are in the few lines above.

Enough sidetracking... Getting down to the thick of it.

URI: URI is, as it says, a Uniform Resource Identifier. It is a string that identifies a resource in the network. As we discussed above, we can do this either by the name of the resource or by its location. It doesn't matter how we do it. It is still a URI, since either way it identifies the resource.

URN: Uniform Resource Name: This is a TYPE of URI that identifies a resource by its name in the network.

URL: Uniform Resource Locator: This is a TYPE of URI that identifies a resource by its location in the network. One special note here! Sometimes the location of the resource also serves as its name, so a URL can be both name and location. It still doesn't matter. It is still a URI for sure.

So for those of us familiar with UML, we can say that URI is an abstract class which can be realized by either a URN or a URL. These are the two classic ways to realize a URI, though a single URI may act as a locator, a name, or both. Following diagram represents this relation.

Time for some examples:

As we discussed above, we can specify a URI by giving either a URL or a URN, so the following examples cover both.

URN examples

urn:isbn:143424254 -- This identifies a resource (a book) by its ISBN number.
urn:oasis:names:specification:docbook:dtd:xml:4.1.2 -- This identifies a specification (a DocBook DTD) by its name in the 'oasis' namespace.

Note here that these examples just identify the resources. They don't say how to get them (no address, no protocol to use). A URN always starts with the 'urn:' scheme. By contrast, www.yahoo.com on its own has no scheme, so strictly it is not a URI (let alone a URN), and something like ldap://some.resource.here names a protocol and a host, which makes it a URL.

URL examples

http://www.yahoo.com -- This identifies the Yahoo website based on its address
ftp://ftp.is.co.za/rfc/rfc1808.txt -- Identifies a particular text file by its location on an FTP server

Note here that we give out the protocol (in technical terms, the 'primary access mechanism') along with the address, i.e. the information about how to get the resource.
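
To see the difference in code, here is a small sketch using the standard java.net.URI class. The describe helper is a made-up name for illustration. It also leans on java.net.URI's own split between 'opaque' URIs (such as urn:...) and 'hierarchical' ones (such as ftp://...), which is only a rough stand-in for the URN versus URL distinction, since other opaque schemes like mailto: exist too.

```java
import java.net.URI;

class UriPartsDemo {

    // Breaks a URI string into the parts that java.net.URI can see.
    static String describe(String text) {
        URI uri = URI.create(text);
        if (uri.isOpaque()) {
            // No host or path: everything after the scheme is treated as one opaque name.
            return "scheme=" + uri.getScheme() + ", name=" + uri.getSchemeSpecificPart();
        }
        // Hierarchical URI: the scheme tells us how to access, host and path tell us where.
        return "scheme=" + uri.getScheme() + ", host=" + uri.getHost() + ", path=" + uri.getPath();
    }

    public static void main(String[] args) {
        System.out.println(describe("urn:isbn:143424254"));
        // prints scheme=urn, name=isbn:143424254
        System.out.println(describe("ftp://ftp.is.co.za/rfc/rfc1808.txt"));
        // prints scheme=ftp, host=ftp.is.co.za, path=/rfc/rfc1808.txt
    }
}
```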

Difference between URL and URN :

A URL gives the road to the resource (the access mechanism) and, in some cases, also serves as the name of the resource.
A URN gives out just the name but keeps mum about the access mechanism.

Difference between URL and URI

(It is not exactly correct to point out differences between a parent and a child, but this topic still seems to be quite a hot one in general.)

A URL is a TYPE of URI, meaning every URL is a URI but the converse is not true.
A URL is expected to point out the access mechanism of the resource. For a URI in general this is not mandatory, as long as it uniquely identifies the resource.

Related mis-conceptions

It is wrong to assume that a URL is tied to the http protocol. A URL can use any protocol as long as it sticks to the specified syntax. ftp URLs are one example.
It is wrong to assume that if we have the URN, we can access the resource. It is perfectly valid to have the name of a resource that is not accessible. URNs say nothing about the availability of the resource.
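
Both points can be checked with a rough sketch in Java. The hasAccessMechanism helper is a hypothetical name. It relies on the fact that URI.toURL() only succeeds when the JVM has a protocol handler for the scheme, which a standard JDK has for http and ftp but not for urn. No network connection is made.

```java
import java.net.MalformedURLException;
import java.net.URI;

class UrlVersusUrnDemo {

    // True if this JVM knows a protocol (access mechanism) for the given URI.
    static boolean hasAccessMechanism(String text) {
        try {
            URI.create(text).toURL();
            return true;
        } catch (MalformedURLException | IllegalArgumentException ex) {
            // MalformedURLException: no protocol handler for the scheme.
            // IllegalArgumentException: the URI has no scheme at all.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasAccessMechanism("http://www.yahoo.com"));
        System.out.println(hasAccessMechanism("ftp://ftp.is.co.za/rfc/rfc1808.txt"));
        System.out.println(hasAccessMechanism("urn:isbn:143424254"));
    }
}
```

On a standard JDK this should report true for the http and ftp examples and false for the URN: the name is a perfectly valid identifier, but nothing in it tells Java how to fetch the book.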

Important note:

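RFC 3986, the current URI specification, recommends that future specifications and related documentation use the general term 'URI' rather than the more restrictive terms 'URL' and 'URN'. So if we are writing any technical documentation, the safest thing is to say URI unless we specifically mean a locator.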

JNDI !! What the heck is it ? Part 2 'Directory services'

If you understood the first part of the series, this should be a cakewalk. We will talk about directory services in this part. So let's start.

To understand directory services, think of any travel website where you can book a flight based on various parameters like price, timing, etc. E.g. a London to Tokyo flight for $1000 at 11pm on Wednesday.
Directory services are similar to naming services: you can associate resources with names and get the objects by their names. But directory services also work very much like these travel websites. We can search for an entity or resource based on its attributes. Most commonly, directory services also let us store objects hierarchically.

So, if we think about it, directory services are nothing but database-like systems where we can query for the objects we need by specifying a search filter. Along with the filter, we can set search parameters such as how deep in the tree the search should go.

Following diagram represents the concept of directory services that we just talked about.

The most common protocol for accessing a directory service is LDAP (Lightweight Directory Access Protocol). LDAP defines how client applications can access or manipulate the data on directory servers. As mentioned above, it defines operations to search the directory based on filters and parameters.
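
As a small preview, this is roughly what a filtered, depth-limited search looks like from Java through the standard javax.naming.directory classes. Treat it as a sketch under assumptions: the filter, the attribute names (cn, sn, telephoneNumber, following the common LDAP 'person' schema) and the server URL and base entry passed on the command line are all hypothetical, and the search itself needs a reachable LDAP server to return anything.

```java
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

class LdapSearchSketch {

    // Search parameters: go through the whole subtree below the base entry,
    // return at most 10 entries, and fetch only two attributes.
    static SearchControls subtreeControls() {
        SearchControls controls = new SearchControls();
        controls.setSearchScope(SearchControls.SUBTREE_SCOPE);
        controls.setCountLimit(10);
        controls.setReturningAttributes(new String[] {"cn", "telephoneNumber"});
        return controls;
    }

    // Connects to the given LDAP server and prints people whose surname is Nash.
    static void printPeople(String ldapUrl, String baseDn) throws NamingException {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, ldapUrl);

        DirContext ctx = new InitialDirContext(env);
        try {
            // The filter is the search criteria: entries of class 'person' with sn=Nash.
            String filter = "(&(objectClass=person)(sn=Nash))";
            NamingEnumeration<SearchResult> results = ctx.search(baseDn, filter, subtreeControls());
            while (results.hasMore()) {
                SearchResult result = results.next();
                System.out.println(result.getNameInNamespace() + " -> " + result.getAttributes());
            }
        } finally {
            ctx.close();
        }
    }

    public static void main(String[] args) throws NamingException {
        if (args.length != 2) {
            System.out.println("usage: java LdapSearchSketch <ldap-url> <base-dn>");
            return;
        }
        printPeople(args[0], args[1]);
    }
}
```

The search scope is the 'depth of tree nodes' parameter in action: OBJECT_SCOPE looks only at the base entry, ONELEVEL_SCOPE at its direct children, and SUBTREE_SCOPE at everything below it.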

This is pretty much it as far as the concept of directory services goes. In the next part we will talk about the real thing: JNDI.

See you then :)

JNDI !! What the heck is it ? Part 1 'Naming services'

Short answer: It is the Java Naming and Directory Interface :)

Well, that's not enough, obviously... So
Long answer: To understand the Java Naming and Directory Interface, we first need to understand what naming services and directory services are and what purpose they serve. So this is a 3-part explanation of JNDI, and we will cover the first part here.

Naming services: To understand a naming service, ask yourself a question: 'Why do I need a contact book in my mobile phone?' The answer will most probably be: 'How the hell am I supposed to remember everybody's phone number? I need to look up the phone numbers of my contacts. Also, if somebody's number changes, I can pretty easily update my contact using just my left thumb.'
Right. When we store numbers in our contact book, we serve the following purposes:
1. We don't want to remember 10 or more digits per person. It is easier to remember 'John Nash' than 555-223-2422 (fictitious name and number, of course).
2. If John Nash changes his number, we don't lose anything. We look him up by his name anyway, so we just update the number. In technical terms we call this 'loose coupling'.
So this is exactly why we need naming services in our lives.

A couple of examples of naming services:
1. DNS (Domain Name System): If it weren't for DNS, we would have gone bananas remembering the IP addresses of our favourite websites. Thank god we have DNS servers, which hide the ugly IP address and gift wrap it in a nice name which we can remember (or bookmark :) )
2. Our file systems: File systems hide the actual storage location of our contents and give us the facility of file names, folders etc.

Enough blabbering!! Getting down to brass tacks

A naming service is a software application that associates a 'name' with the required 'service' or 'information'.
The definition implies that our software can use an entity or an object without knowing its exact location. It can reside in the naming service itself or be hosted on a completely different machine (it has to be accessible though).

Now in our analogy, we can find a contact only as long as we look in the correct contact book. Similarly, when we use naming services, we need to know where we are supposed to find our required entity. This is called the 'context' in the naming service world.
So the basic steps in using naming services are:

1. Obtain the context.
2. Look up the entity in that context.
3. Get the resource by the name it is bound with in the context.
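
These steps can be sketched with a toy, in-memory naming service written in plain Java. This is not JNDI itself: ToyContext and everything in it are made up for illustration, with method names (bind, rebind, lookup) chosen to resemble the ones JNDI's Context interface uses. The replacement phone number is also fictitious.

```java
import java.util.HashMap;
import java.util.Map;

class ToyNamingService {

    // One 'context': a namespace that maps names to objects.
    static class ToyContext {
        private final Map<String, Object> bindings = new HashMap<>();

        // Registers an object under a new name.
        void bind(String name, Object obj) {
            if (bindings.containsKey(name)) {
                throw new IllegalStateException("Name already bound: " + name);
            }
            bindings.put(name, obj);
        }

        // Replaces whatever is registered under the name.
        void rebind(String name, Object obj) {
            bindings.put(name, obj);
        }

        // Finds the object registered under the name.
        Object lookup(String name) {
            if (!bindings.containsKey(name)) {
                throw new IllegalArgumentException("Name not found: " + name);
            }
            return bindings.get(name);
        }
    }

    public static void main(String[] args) {
        // Step 1: obtain the context (our contact book).
        ToyContext contacts = new ToyContext();
        contacts.bind("John Nash", "555-223-2422");

        // Steps 2 and 3: look up the entity by the name it is bound with.
        System.out.println(contacts.lookup("John Nash")); // prints 555-223-2422

        // John changes his number: only the binding changes, callers keep using the name.
        contacts.rebind("John Nash", "555-223-9999");
        System.out.println(contacts.lookup("John Nash")); // prints 555-223-9999
    }
}
```

The caller only ever holds the name 'John Nash'. That is the loose coupling from the contact book analogy: whoever owns the binding can change what the name points to without touching the code that does the lookup.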

Following diagram represents the architecture that we just explained.

So why exactly use a naming service?

1. To decouple the provider from the user or consumer. As long as we know the name under which the service is registered, we are good to go. Why bother about how it is implemented!
2. It is good to have a central registry where we keep all our services registered, and a uniform way of accessing our services or objects or whatever they may be. It frees us from inventing a separate lookup mechanism for every kind of service, such as message queues, EJBs, data sources etc.
3. We can pretty easily change the hosting of any service if we are accessing it through a naming service, since we just need to update the entry and we are good to go.

Some of the most popular examples of naming services are:

1. CORBA Common Object Services (COS) Naming: This service provides a hierarchical way of naming and storing CORBA object references.
2. Network Information Service (NIS): This is a naming service developed by Sun for Unix networks. It predates Java and is not Java based.

So this concludes the first part 'Naming services'. We will talk about directory services in the next part.
