I assume that you're familiar with HTTP and CGI or a proprietary server API
like NSAPI or ISAPI. I also assume that you are somewhat familiar with Java
programming or some other object-oriented language, such as C++. Even if you're
not a Java programmer you should be able to appreciate the benefits of servlets
by reading this article, but before you develop your own servlets I recommend that
you first learn the Java basics.
The Dark Ages
Early in the World Wide Web's history, the Common Gateway
Interface (CGI) was defined to allow Web servers to process user input and serve
dynamic content. CGI programs can be developed in any script or programming
language, but Perl is by far the most common language. CGI is supported by
virtually all Web servers and many Perl modules are available as freeware or
shareware to handle most tasks.
But CGI is not without drawbacks. Performance and scalability are big problems
since a new process is created for each request, quickly draining a busy server
of resources. Sharing resources such as database connections between scripts or
multiple calls to the same script is far from trivial, leading to repeated
execution of expensive operations.
Security is another big concern. Most Perl scripts use the command shell to
execute OS commands with user-supplied data, for instance to send mail, search
for information in a file, or just leverage OS commands in general. This use of
a shell opens up many opportunities for a creative hacker to make the script
remove all files on the server, mail the server's password file to a secret
account, or do other bad things that the script writer didn't anticipate.
The Web server vendors defined APIs to solve some of these problems, notably
Microsoft's ISAPI and Netscape's NSAPI. But an application written to these
proprietary APIs is married to one particular server vendor. If you need to
move the application to a server from another vendor, you have to start from
scratch. Another problem with this approach is reliability. The APIs typically
support C/C++ code executing in the Web server process. If the application
crashes, e.g. due to a bad pointer or division by zero, it brings the Web server
down with it.
Servlets to the rescue!
The Servlet API was developed to leverage the
advantages of the Java platform to solve the issues of CGI and proprietary APIs.
It's a simple API supported by virtually all Web servers and even load-balancing, fault-tolerant Application Servers. It solves the performance problem
by executing all requests as threads in one process, or in a load-balanced
system, in one process per server in the cluster. Servlets can easily share
resources as you will see in this article.
Security is improved in many ways. First of all, you rarely need to let a
shell execute commands with user-supplied data since the Java APIs provide
access to all commonly used functions. You can use JavaMail to read and send
email, Java Database Connectivity (JDBC) to access databases, the File class and
related classes to access the file system, and RMI, CORBA and Enterprise JavaBeans
(EJB) to access legacy systems. The Java security model makes it possible to
implement fine-grained access controls, for instance only allowing access to a
well-defined part of the file system. Java's exception handling also makes a
servlet more reliable than proprietary C/C++ APIs - a divide by zero is reported
as an error instead of crashing the Web server.
The Servlet Run-time Environment
A servlet is a Java class and therefore
needs to be executed in a Java VM by a service we call a servlet engine.
The servlet engine loads the servlet class the first time the servlet is
requested, or optionally already when the servlet engine is started. The servlet
then stays loaded to handle multiple requests until it is explicitly unloaded or
the servlet engine is shut down.
Some Web servers, such as Sun's Java Web Server (JWS), W3C's Jigsaw and
Gefion Software's LiteWebServer (LWS) are implemented in Java and have a
built-in servlet engine. Other Web servers, such as Netscape's Enterprise
Server, Microsoft's Internet Information Server (IIS) and the Apache Group's
Apache, require a servlet engine add-on module. The add-on intercepts all
requests for servlets, executes them and returns the response through the Web
server to the client. Examples of servlet engine add-ons are Gefion Software's
WAICoolRunner, IBM's WebSphere, Live Software's JRun and New Atlanta's
ServletExec.
All Servlet API classes and a simple servlet-enabled Web server are combined
into the Java Servlet Development Kit (JSDK), available for download at Sun's
official Servlet site (see Resources below). To get
started with servlets I recommend that you download the JSDK and play around
with the sample servlets.
As this article is written (early March 1999), the released version of the
JSDK is for the Servlet 2.0 API, with an Early Access version of the JSDK 2.1
available at Java Developer's Connection. All servlet engines mentioned above
support the Servlet 2.0 API, and a few also support the 2.1 API. The examples of
2.1 API features in this article are clearly marked so you don't have to be
surprised when they don't work with your 2.0 servlet engine.
Servlet Interface and Life Cycle
Let's implement our first servlet. A
servlet is a Java class that implements the Servlet interface. This interface
has three methods that define the servlet's life cycle:
public void init(ServletConfig config) throws ServletException
This method is called once when the servlet is loaded
into the servlet engine, before the servlet is asked to process its first
request.
public void service(ServletRequest request, ServletResponse response) throws ServletException, IOException
This method is called to process
a request. It can be called zero, one or many times until the servlet is
unloaded. Multiple threads (one per request) can execute this method in parallel
so it must be thread safe.
public void destroy()
This method is called once just before
the servlet is unloaded and taken out of service.
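To make the order of these calls concrete, here is a minimal sketch in plain Java. It does not use the Servlet API at all: MiniServlet, LoggingServlet and MiniEngine are hypothetical names invented for this illustration, and a real servlet engine does far more work than this.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the three life-cycle methods of the Servlet interface.
interface MiniServlet {
    void init(String initArg);
    String service(String request);
    void destroy();
}

// Records every life-cycle call so the order can be inspected afterwards.
class LoggingServlet implements MiniServlet {
    final List<String> calls = new ArrayList<>();
    private String greeting;

    public void init(String initArg) {
        greeting = initArg;
        calls.add("init");
    }

    public String service(String request) {
        calls.add("service");
        return greeting + ", " + request;
    }

    public void destroy() {
        calls.add("destroy");
    }
}

// Hypothetical engine: initializes the servlet once, then reuses the same instance.
class MiniEngine {
    private final MiniServlet servlet;

    MiniEngine(MiniServlet servlet, String initArg) {
        this.servlet = servlet;
        servlet.init(initArg);            // called once, before the first request
    }

    String handle(String request) {
        return servlet.service(request);  // called once per request
    }

    void shutdown() {
        servlet.destroy();                // called once, when the servlet is unloaded
    }
}
```

Handling two requests and then shutting the engine down leaves the call log as init, service, service, destroy, which is the sequence described above.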
The init method has a ServletConfig parameter. The servlet can
read its initialization arguments through the ServletConfig object. How the
initialization arguments are set is servlet engine dependent but they are
usually defined in a configuration file.
A typical example of an initialization argument is a database identifier. A
servlet can read this argument from the ServletConfig at initialization and then
use it later to open a connection to the database during processing of a
request:
    ...
    private String databaseURL;

    public void init(ServletConfig config) throws ServletException {
        super.init(config);
        databaseURL = config.getInitParameter("database");
    }
The Servlet API is structured to make servlets that use a different protocol
than HTTP possible. The javax.servlet package contains interfaces
and classes intended to be protocol independent and the
javax.servlet.http package contains HTTP specific interfaces and
classes. Since this is just an introduction to servlets I will ignore this
distinction here and focus on HTTP servlets. Our first servlet, named
ReqInfoServlet, will therefore extend a class named HttpServlet. HttpServlet is
part of the JSDK and implements the Servlet interface plus a number of
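convenience methods. We define our class like this:
    // java.io and java.util are needed by the doGet method shown later
    import java.io.*;
    import java.util.*;
    import javax.servlet.*;
    import javax.servlet.http.*;

    public class ReqInfoServlet extends HttpServlet {
        ...
    }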
An important set of methods in HttpServlet are the ones that specialize the
service method in the Servlet interface. The implementation of
service in HttpServlet looks at the type of request it's asked to
handle (GET, POST, HEAD, etc.) and calls a specific method for each type. This
way the servlet developer is relieved from handling the details about obscure
requests like HEAD, TRACE and OPTIONS and can focus on taking care of the more
common request types, i.e. GET and POST. In this first example we will only
implement the doGet method.
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        ...
    }
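The dispatching idea can be sketched in plain Java without the Servlet API. MiniHttpServlet and HelloServlet are hypothetical classes made up for this illustration, the returned strings are placeholders rather than real HTTP responses, and the sketch does not reproduce the default handling of HEAD, TRACE and OPTIONS that the real HttpServlet provides.

```java
// Hypothetical base class that mimics how a service method can dispatch on the request type.
class MiniHttpServlet {
    // Looks at the request method and calls the matching handler.
    final String service(String method, String path) {
        switch (method) {
            case "GET":
                return doGet(path);
            case "POST":
                return doPost(path);
            default:
                return method + " not implemented";
        }
    }

    // Default handlers only report that the method is unsupported; subclasses override them.
    protected String doGet(String path) {
        return "GET not supported";
    }

    protected String doPost(String path) {
        return "POST not supported";
    }
}

// A subclass only overrides the request types it cares about.
class HelloServlet extends MiniHttpServlet {
    @Override
    protected String doGet(String path) {
        return "Hello from " + path;
    }
}
```

Because HelloServlet overrides only doGet, a POST request still falls through to the default handler in the base class.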
Request and Response Objects
The doGet method has two
interesting parameters: HttpServletRequest and HttpServletResponse. These two
objects give you full access to all information about the request and let you
control the output sent to the client as the response to the request.
With CGI you read environment variables and stdin to get information about
the request, but the names of the environment variables may vary between
implementations and some are not provided by all Web servers. The
HttpServletRequest object provides the same information as the CGI environment
variables, plus more, in a standardized way. It also provides methods for
extracting HTTP parameters from the query string or the request body depending
on the type of request (GET or POST). As a servlet developer you access
parameters the same way for both types of requests. Other methods give you
access to all request headers and help you parse date and cookie headers.
Instead of writing the response to stdout as you do with CGI, you get an
OutputStream or a PrintWriter from the HttpServletResponse. The OutputStream is
intended for binary data, such as a GIF or JPEG image, and the PrintWriter for
text output. You can also set all response headers and the status code, without
having to rely on special Web server CGI configurations such as Non Parsed
Headers (NPH). This makes your servlet easier to install.
Let's implement the body of our doGet method and see how we can
use these methods. We will read most of the information we can get from the
HttpServletRequest (saving some methods for the next example) and send the
values as the response to the request.
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();

        // Print the HTML header
        out.println("<HTML><HEAD><TITLE>");
        out.println("Request info");
        out.println("</TITLE></HEAD>");

        // Print the HTML body
        out.println("<BODY><H1>Request info</H1><PRE>");
        out.println("getCharacterEncoding: " + request.getCharacterEncoding());
        out.println("getContentLength: " + request.getContentLength());
        out.println("getContentType: " + request.getContentType());
        out.println("getProtocol: " + request.getProtocol());
        out.println("getRemoteAddr: " + request.getRemoteAddr());
        out.println("getRemoteHost: " + request.getRemoteHost());
        out.println("getScheme: " + request.getScheme());
        out.println("getServerName: " + request.getServerName());
        out.println("getServerPort: " + request.getServerPort());
        out.println("getAuthType: " + request.getAuthType());
        out.println("getMethod: " + request.getMethod());
        out.println("getPathInfo: " + request.getPathInfo());
        out.println("getPathTranslated: " + request.getPathTranslated());
        out.println("getQueryString: " + request.getQueryString());
        out.println("getRemoteUser: " + request.getRemoteUser());
        out.println("getRequestURI: " + request.getRequestURI());
        out.println("getServletPath: " + request.getServletPath());

        // Print all parameters; each one may have several values
        out.println();
        out.println("Parameters:");
        Enumeration paramNames = request.getParameterNames();
        while (paramNames.hasMoreElements()) {
            String name = (String) paramNames.nextElement();
            String[] values = request.getParameterValues(name);
            out.println(" " + name + ":");
            for (int i = 0; i < values.length; i++) {
                out.println(" " + values[i]);
            }
        }

        // Print all request headers
        out.println();
        out.println("Request headers:");
        Enumeration headerNames = request.getHeaderNames();
        while (headerNames.hasMoreElements()) {
            String name = (String) headerNames.nextElement();
            String value = request.getHeader(name);
            out.println(" " + name + " : " + value);
        }

        // Print all cookies; guard against null in case the engine
        // returns no array when the request carries no cookies
        out.println();
        out.println("Cookies:");
        Cookie[] cookies = request.getCookies();
        if (cookies != null) {
            for (int i = 0; i < cookies.length; i++) {
                String name = cookies[i].getName();
                String value = cookies[i].getValue();
                out.println(" " + name + " : " + value);
            }
        }

        // Print the HTML footer
        out.println("</PRE></BODY></HTML>");
        out.close();
    }
The doGet method above uses most of the methods in
HttpServletRequest that provide information about the request. You can read all
about them in the Servlet API documentation so here we'll just look at the most
interesting ones.
getParameterNames and getParameterValues help you
access HTTP parameters no matter if the servlet was requested with the GET or
the POST method. getParameterValues returns a String array because
an HTTP parameter may have multiple values. For instance, if you request the
servlet with a URL like
http://company.com/servlet/ReqInfoServlet?foo=bar&foo=baz
you'll see that the foo parameter has two values: bar
and baz. The same is true if you use the same name for more than
one HTML FORM element and submit the form with METHOD set to POST.
If you're sure that an HTTP parameter can have only one value you can use the
getParameter method instead of getParameterValues. It
returns a single String and if there are multiple values it returns the first
value received with the request.
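The difference between the two methods can be illustrated with a small stand-alone sketch. QueryParams is a hypothetical class written for this article, not part of the Servlet API, and it assumes UTF-8 for decoding; it only mimics the behavior described above for a raw query string.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that collects multi-valued parameters from a query string.
final class QueryParams {
    private final Map<String, List<String>> values = new LinkedHashMap<>();

    QueryParams(String queryString) {
        for (String pair : queryString.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = decode(eq < 0 ? pair : pair.substring(0, eq));
            String value = eq < 0 ? "" : decode(pair.substring(eq + 1));
            // Values with the same name are kept in the order they were received
            values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // All values for the name, or null if the parameter is absent.
    String[] getParameterValues(String name) {
        List<String> list = values.get(name);
        return list == null ? null : list.toArray(new String[0]);
    }

    // Only the first value for the name, or null if the parameter is absent.
    String getParameter(String name) {
        List<String> list = values.get(name);
        return list == null ? null : list.get(0);
    }
}
```

For the query string foo=bar&foo=baz, getParameterValues("foo") in this sketch yields both bar and baz, while getParameter("foo") yields only bar.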
You have access to all HTTP request headers with the
getHeaderNames and getHeader methods.
getHeader returns the String value of the header. If you know that
the header has a date value or an integer value you can get help converting the
header to an appropriate format. getDateHeader returns a date as
the number of milliseconds since January 1, 1970, 00:00:00 GMT. This is the
standard numeric representation of a timestamp in Java and you can use it to
construct a Date object for further manipulation. getIntHeader
returns the header value as an int.
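As a rough illustration of these conversions, the sketch below turns raw header strings into a millisecond timestamp, a Date and an int. HeaderValues is a hypothetical helper, and it relies on the java.time package from current JDKs, which is my choice for this example and not something the Servlet API itself uses.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Date;

// Hypothetical helper showing the kind of conversion done for date and integer headers.
final class HeaderValues {
    // Parses an RFC 1123 date such as "Sun, 06 Nov 1994 08:49:37 GMT"
    // into milliseconds since January 1, 1970, 00:00:00 GMT.
    static long dateHeaderMillis(String headerValue) {
        return ZonedDateTime.parse(headerValue, DateTimeFormatter.RFC_1123_DATE_TIME)
                .toInstant()
                .toEpochMilli();
    }

    // Builds a Date object from the millisecond value for further manipulation.
    static Date dateHeader(String headerValue) {
        return new Date(dateHeaderMillis(headerValue));
    }

    // Parses a numeric header value, ignoring surrounding whitespace.
    static int intHeader(String headerValue) {
        return Integer.parseInt(headerValue.trim());
    }
}
```

A header value of Thu, 01 Jan 1970 00:00:00 GMT maps to 0 in this sketch, which matches the epoch described above.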
getCookies parses the Cookie header and returns all cookies as
an array of Cookie objects. To add a cookie to a response the
HttpServletResponse class provides an addCookie method that takes a
Cookie object as its argument. This saves you from dealing with the format for
different versions of cookie header strings.
If you compile the ReqInfoServlet and install it in your servlet engine you
can now invoke it through a browser with a URL like
http://company.com/servlet/ReqInfoServlet/foo/bar?fee=baz. If
everything goes as planned you will see something like this in your browser:
Request info
getCharacterEncoding:
getContentLength: -1
getContentType: null
getProtocol: HTTP/1.0
getRemoteAddr: 127.0.0.1
getRemoteHost: localhost
getScheme: http
getServerName: company.com
getServerPort: 80
getAuthType: null
getMethod: GET
getPathInfo: /foo/bar
getPathTranslated: D:\PROGRA~1\jsdk2.1\httproot\servlet\ReqInfoServlet\foo\bar
getQueryString: fee=baz
getRemoteUser: null
getRequestURI: /servlet/ReqInfoServlet/foo/bar
getServletPath: /servlet/ReqInfoServlet
Parameters:
fee:
baz
Request headers:
Connection : Keep-Alive
User-Agent : Mozilla/4.5 [en] (WinNT; I)
Host : company.com
Accept : image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
Accept-Encoding : gzip
Accept-Language : en
Accept-Charset : iso-8859-1,*,utf-8
Cookie : TOMCATID=TO04695278486734222MC1010AT
Cookies:
TOMCATID : TO04695278486734222MC1010AT
What if you want this servlet to handle both GET and POST requests? The
default implementations of doGet and doPost return a
message saying the method is not implemented. So far we have only provided a new
implementation of doGet. To handle a POST request the same way we
can simply call doGet from doPost:
    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        doGet(request, response);
    }
Persistent and Shared Data
One of the more interesting features of the
Servlet API is the support for persistent data. Since a servlet stays loaded
between requests, and all servlets are loaded in the same process, it's easy to
remember information from one request to another and to let different servlets
share data.
The Servlet API contains a number of mechanisms to support this directly.
We'll look at some of them in detail below. Another powerful mechanism is to use
a singleton object to handle shared resources. You can read more about this
technique in Improved
Performance with a Connection Pool.
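The singleton idea can be sketched in a few lines of plain Java. Resource and ResourcePool are hypothetical names; Resource stands in for something expensive to create, such as a database connection, and this is a simplified illustration rather than the pool described in the article mentioned above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical shared resource; stands in for something expensive to create.
final class Resource {
    final int id;

    Resource(int id) {
        this.id = id;
    }
}

// Singleton pool: every caller in the same process sees the same instance.
final class ResourcePool {
    private static final ResourcePool INSTANCE = new ResourcePool();

    private final Deque<Resource> idle = new ArrayDeque<>();
    private int created = 0;

    private ResourcePool() {
        // Private constructor prevents additional instances
    }

    static ResourcePool getInstance() {
        return INSTANCE;
    }

    // Reuses an idle resource if one exists, otherwise creates a new one.
    synchronized Resource acquire() {
        Resource r = idle.pollFirst();
        return r != null ? r : new Resource(++created);
    }

    // Returns a resource to the pool so a later request can reuse it.
    synchronized void release(Resource r) {
        idle.addFirst(r);
    }

    synchronized int createdCount() {
        return created;
    }
}
```

The methods are synchronized because several request threads may call them at the same time. A resource released by one request is handed to the next caller instead of being created again.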
Session Tracking
An HttpSession interface was introduced in the 2.0 version
of the Servlet API. Objects implementing this interface can hold information for one user
session between requests. You start a new session by requesting an HttpSession
object from the HttpServletRequest in your doGet or
doPost method:
    HttpSession session = request.getSession(true);
This method takes a boolean argument. true means a new
session is started if none exists, while false only returns an
existing session. The HttpSession object is unique for one user session. The
Servlet API supports two ways to associate multiple requests with a session:
cookies and URL rewriting.
If cookies are used a cookie with a unique session ID is sent to the client
when the session is established. The client then includes the cookie in all
subsequent requests so the servlet engine can figure out which session the
request is associated with. URL rewriting is intended for clients that don't
support cookies or when the user has disabled cookies. With URL rewriting the
session ID is encoded in the URLs your servlet sends to the client. When the
user clicks on an encoded URL, the session ID is sent to the server where it can
be extracted and the request associated with the correct session as above. To
use URL rewriting you must make sure all URLs that you send to the client are
encoded with the encodeURL or encodeRedirectURL
methods in HttpServletResponse.
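To show what URL rewriting amounts to, here is a stand-alone sketch. UrlRewriter and the ;sessionid= marker are assumptions made for this illustration; each servlet engine decides for itself how the ID is embedded, and the sketch ignores details such as URL fragments.

```java
// Hypothetical illustration of embedding a session ID in a URL and reading it back.
final class UrlRewriter {
    // Assumed marker for this sketch; real servlet engines choose their own format.
    private static final String KEY = ";sessionid=";

    // Inserts the session ID before the query string, if there is one.
    static String encode(String url, String sessionId) {
        int q = url.indexOf('?');
        if (q < 0) {
            return url + KEY + sessionId;
        }
        return url.substring(0, q) + KEY + sessionId + url.substring(q);
    }

    // Returns the session ID carried by the URL, or null if there is none.
    static String extractSessionId(String url) {
        int start = url.indexOf(KEY);
        if (start < 0) {
            return null;
        }
        start += KEY.length();
        int end = url.indexOf('?', start);
        return end < 0 ? url.substring(start) : url.substring(start, end);
    }
}
```

With this sketch, /shop/cart?item=7 and the ID abc123 become /shop/cart;sessionid=abc123?item=7, and the server side can recover abc123 from that URL. In a servlet you never build such URLs by hand; the encoding methods of HttpServletResponse do it for you.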
An HttpSession can store any type of object. A typical example is a database
connection allowing multiple requests to be part of the same database
transaction, or information about purchased products in a shopping cart
application so the user can add items to the cart while browsing through the
site. To save an object in an HttpSession you use the putValue
method:
    ...
    // DriverManager is the java.sql class that hands out connections
    Connection con = DriverManager.getConnection(databaseURL, user, password);
    session.putValue("myappl.connection", con);
    ...
In another servlet, or the same servlet processing another request, you
can get the object with the getValue method:
    ...
    HttpSession session = request.getSession(true);
    Connection con = (Connection) session.getValue("myappl.connection");
    if (con != null) {
        // Continue the database transaction
        ...
You can explicitly terminate (invalidate) a session with the
invalidate method or let it be timed-out by the servlet engine. The
session times out if no request associated with the session is received within a
specified interval. Most servlet engines allow you to specify the length of the
interval through a configuration option. In the 2.1 version of the Servlet API
there's also a setMaxInactiveInterval so you can adjust the
interval to meet the needs of each individual application.
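The timeout rule can be sketched as follows. MiniSession is a hypothetical class, not the real HttpSession, and the current time is passed in explicitly so the behavior is easy to follow; a servlet engine would use the system clock and its own bookkeeping.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical session object with an inactivity timeout.
final class MiniSession {
    private final Map<String, Object> values = new HashMap<>();
    private final long maxInactiveMillis;
    private long lastAccessed;
    private boolean valid = true;

    MiniSession(long nowMillis, long maxInactiveMillis) {
        this.lastAccessed = nowMillis;
        this.maxInactiveMillis = maxInactiveMillis;
    }

    // Called for each request associated with the session.
    // Returns false if the session was invalidated or has timed out.
    boolean touch(long nowMillis) {
        if (!valid || nowMillis - lastAccessed > maxInactiveMillis) {
            invalidate();
            return false;
        }
        lastAccessed = nowMillis;
        return true;
    }

    void putValue(String name, Object value) {
        values.put(name, value);
    }

    Object getValue(String name) {
        return values.get(name);
    }

    // Explicitly terminates the session and drops everything stored in it.
    void invalidate() {
        valid = false;
        values.clear();
    }
}
```

Each request that arrives in time resets the inactivity clock, so a session stays alive as long as the gap between requests never exceeds the configured interval.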
ServletContext Attributes
All servlets belong to one servlet context. In
implementations of the 1.0 and 2.0 versions of the Servlet API all servlets on
one host belong to the same context, but with the 2.1 version of the API the
context becomes more powerful and can be seen as the humble beginnings of an
Application concept. Future versions of the API will make this even more
pronounced.
Many servlet engines implementing the Servlet 2.1 API let you group a set of
servlets into one context and support more than one context on the same host.
The ServletContext in the 2.1 API is responsible for the state of its servlets
and knows about resources and attributes available to the servlets in the
context. Here we will only look at how ServletContext attributes can be used to
share information among a group of servlets.
There are three ServletContext methods dealing with context attributes:
getAttribute, setAttribute and
removeAttribute. In addition the servlet engine may provide ways to
configure a servlet context with initial attribute values. This serves as a
welcome addition to the servlet initialization arguments for configuration
information used by a group of servlets, for instance the database identifier we
talked about above, a style sheet URL for an application, the name of a mail
server, etc.
A servlet gets a reference to its ServletContext object through the
ServletConfig object. The HttpServlet actually provides a convenience method
(through its superclass GenericServlet) named getServletContext to
make it really easy:
    ...
    ServletContext context = getServletContext();
    String styleSheet = request.getParameter("stylesheet");
    if (styleSheet != null) {
        // Specify a new style sheet for the application
        context.setAttribute("stylesheet", styleSheet);
    }
    ...
The code above could be part of an application configuration servlet,
processing the request from an HTML FORM where a new style sheet can be
specified for the application. All servlets in the application that generate
HTML can then use the style sheet attribute like this:
    ...
    ServletContext context = getServletContext();
    // getAttribute returns an Object, so a cast is required
    String styleSheet = (String) context.getAttribute("stylesheet");
    out.println("<HTML><HEAD>");
    out.println("<LINK HREF=" + styleSheet + " TYPE=text/css REL=STYLESHEET>");
    ...
Request Attributes and Resources
The 2.1 version of the API adds two
more mechanisms for sharing data between servlets: request attributes and
resources.
The getAttribute, getAttributeNames and
setAttribute methods were added to the HttpServletRequest class
(or to be picky, to the ServletRequest superclass). They are primarily intended
to be used in concert with the RequestDispatcher, an object that can be used to
forward a request from one servlet to another and to include the output from one
servlet in the output from the main servlet.
The getResource and getResourceAsStream methods in the
ServletContext class give you access to external resources, such as an
application configuration file. You may be familiar with the methods with the same
names in the ClassLoader. The ServletContext methods, however, can provide
access to resources that are not necessarily files. A resource can be stored in
a database, available through an LDAP server, anything the servlet engine vendor
decides to support. The servlet engine provides a context configuration option
where you specify the root for the resource base, be it a directory path, an
HTTP URL, a JDBC URL, etc.
Examples of how to use these methods may be the subject of a future article.
Until then you can read about them in the Servlet 2.1 specification.
Multithreading
As you have seen above, concurrent requests for a servlet
are handled by separate threads executing the corresponding request processing
method (e.g. doGet or doPost). It's therefore
important that these methods are thread safe.
The easiest way to guarantee that the code is thread safe is to avoid
instance variables altogether and instead pass all information needed by a
method as arguments. For instance:
    private String someParam;

    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        someParam = request.getParameter("someParam");
        processParam();
    }

    private void processParam() {
        // Do something with someParam
    }
is not safe. If the doGet method is executed by two threads
it's likely that the value of the someParam instance variable is
replaced by the second thread while the first thread is still using it.
A thread safe alternative is:
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        // Local variable: each thread gets its own copy
        String someParam = request.getParameter("someParam");
        processParam(someParam);
    }

    private void processParam(String someParam) {
        // Do something with someParam
    }
Here the processParam gets all data it needs as arguments
instead of relying on instance variables.
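The same pattern can be tried outside a servlet engine. In the sketch below, StatelessHandler is a hypothetical class that plays the role of the servlet, and a small thread pool plays the role of concurrent requests; the pool size of four is an arbitrary choice for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical handler with no instance variables, shared by many threads.
final class StatelessHandler {
    // Everything the method needs arrives as an argument.
    String handle(String someParam) {
        return processParam(someParam);
    }

    private String processParam(String someParam) {
        return "processed:" + someParam;
    }

    // Calls handle() from several threads on ONE shared handler
    // and returns the results in submission order.
    static List<String> runConcurrently(int requests) {
        final StatelessHandler shared = new StatelessHandler();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final String param = "p" + i;
                Callable<String> task = () -> shared.handle(param);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Since the handler keeps no state between calls, every request gets the result for its own parameter no matter how the threads interleave.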
Another reason to avoid instance variables is that in a multi-server system,
there may be one instance of the servlet for each server and requests for the
same servlet may be distributed between the servers. Keeping track of
information in instance variables in this scenario doesn't work at all. In this
type of system you can instead use the HttpSession object, the ServletContext
attributes, or an external data store such as a database or an RMI/CORBA service
to maintain the application state. Even if you start out with a small,
single-server system it's a good idea to write your servlets so that they can
scale to a large, multi-server system the day you strike oil.
Resources
This article barely scratches the surface on the Servlet
API and all the things you can do with servlets. You can learn more by visiting
some of the Web sites below:
http://java.sun.com/products/servlet/
Sun Microsystems' official Servlet API site
http://java.sun.com/products/servlet/runners.html
Servlet enabled Web servers and add-on servlet engines
http://java.sun.com/docs/books/tutorial/servlets/index.html
The servlet chapter in Sun's Java tutorial
http://www.novocode.com/doc/servlet-essentials/
Novocode's Servlet Essentials, a Servlet programming tutorial
http://www.servletcentral.com
Servlet Central, articles about servlet technology, success stories,
resources and more
http://www.javashareware.com/
A database with many servlets, both freeware with source code and commercial
products
http://www.servlets.com/
Information about the O'Reilly Java Servlet Programming book by Jason
Hunter and William Crawford
Hans Bergsten has worked in the computer industry for 18 years, with everything from IBM mainframes to PCs.
Last year he founded Gefion Software, a software development company focused on platform-independent network-based applications.
The current product line includes a free Servlet Engine named WAICoolRunner and a Servlet-based product for database access named InstantOnline Basic.