<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
  <link type="text/css" rel="stylesheet" charset="UTF-8" href="xhtml1-transitional.css"/><title>Web Services and Internationalization</title>
</head>
<body><h1>Web Services and Internationalization</h1><h3>by Addison P. Phillips<br/>Globalization Architect<br/>Quest Software</h3>
<h2>What are Web Services?</h2><p>For the past few years, a lot of attention has been paid to Web services, a new XML-based technology that allows computer systems to interact with each other over the Web. Here's what the <a href="http://www.w3.org/2002/ws/Activity">W3C Web Services Activity Statement</a> says about Web services:</p><div class="exampleOuter"><p>The advent of <a href="http://www.w3.org/XML/"><acronym title="Extensible Markup Language">XML</acronym></a> makes it easier for
systems in different environments to exchange information. The universality
of XML makes it a very attractive way to communicate information between
programs. Programmers can use different operating systems and programming
languages and have their software communicate with each other in an
interoperable manner. Moreover, <a href="http://www.w3.org/XML/">XML</a>, <a href="http://www.w3.org/TR/REC-xml-names/">XML namespaces</a> and <a href="http://www.w3.org/XML/Schema">XML schemas</a> provide useful mechanisms to deal
with structured extensibility in a distributed environment.</p>

<p>Similar to programmatic interfaces available since the early days of the
World Wide Web via HTML forms, programs are now accessible by exchanging XML
data through a Web interface by using <a href="http://www.w3.org/TR/soap12-part1/">SOAP
Version 1.2</a>, the XML-based messaging framework produced by the <a href="http://www.w3.org/2002/ws/Activity#xmlp-wg">XML Protocol Working Group</a>.</p>

<p>Web services provide a standard means of interoperating between different
software applications, running on a variety of platforms and/or
frameworks.</p>

<p>The power of Web services, in addition to their great interoperability and
extensibility thanks to the use of XML, is that they can then be combined in
a loosely coupled way in order to achieve complex operations. Programs
providing simple services can interact with each other in order to deliver
sophisticated added-value services.</p></div><p>As with the early days of the traditional HTML-flavored Web, companies have rushed to stake out "a Web services strategy" and announce new products to take advantage of them. So what are Web services? Why are they crucial? And, of interest to readers of this magazine, are Web services internationalized or can they be?</p><p>Originally the Web provided a way for humans to interact with "resources" on the Internet: content or systems of interest to the end user with a browser. By using technologies such as HTML to provide rich content in a transparent, platform-independent manner, standards could drive the adoption of the Internet for a wide range of human interaction, from blogging to instant messaging to file sharing.</p><p>Some resources are static (like an image file or an HTML page), while others are dynamic, generated by software on the fly to serve a particular need or request. But humans are not the only source of requests, not even for the traditional Web. Anything too boring or repetitive for a person to do is a candidate for automation, and Web services are simply a refinement of this automation: allowing machines to talk to one another to exchange information.</p><p>A Web service is a piece of software that does one specific task: it takes a specific list of information as input, does its thing, and then, optionally, returns a result. In software terms, a Web service is a "function" or a "method call". Anyone who knows the list of inputs can use the service (that's what makes it a "service"). What makes it a "Web" service is that it lives at the end of a URI (a Web address) and you communicate with it (or "invoke" it) by exchanging XML documents with it. You send the inputs in an XML document and, if there is a response, you get an XML document in return. 
As with a Web page, you need not be concerned with the programming language, architecture, operating system, physical location, or other implementation details of the service itself. All you really need to know is how to invoke it and what response(s) to expect. You don't even have to know what it does or whether the service is actually located right there!</p><p>The system or process you communicate with is called a "<dfn id="provider_dfn">provider</dfn>". You can think of a Web service provider as the equivalent of what a Web server or application server is for HTML. It manages the transport layer (such as HTTP) that you use to send and receive the XML documents. The provider decodes the XML documents you send it and calls the service to do its work for you. Any response or error is then packaged up into an XML document and returned to you.</p><p> Notice that the service itself may not be running inside the provider's process: it may not even be on the same machine and it could even be another Web service. The service may be arbitrarily complex or very simple. Your machine (the client invoking the service, called the "<dfn id="requester_dfn">requester</dfn>") doesn't have to know any of the gruesome details. The provider handles calling the service for you. One of the side effects of this is that Web services are "<dfn>composable</dfn>", that is, you can put sets of them together to do some larger task and the result can be treated as a single Web service.</p><p>The XML documents that you use to communicate with the provider use a format called <acronym title="Simple Object Access Protocol">SOAP</acronym>, or "Simple Object Access Protocol". This is an XML dialect that allows you to invoke services and send or receive simple or complex data structures in a standardized format. The basic data structures used are defined using XML Schema, which provides XML representations for common data types such as integers, dates, strings, booleans and so forth. 
These can be built up in SOAP to form any arbitrary data object that you might need to interact with a service. Because SOAP and XML Schema are standardized, the requester and provider can easily map their own programming language constructs or data objects to the standardized format.</p><p>One thing you still need is a way of knowing where a service lives (its address); what SOAP document (data structures) you need to invoke it; and what, if anything, to expect in return. This information about the service is stored in the Web Service Description, which is usually in the form of an XML document that uses <acronym title="Web Services Description Language">WSDL</acronym>, the Web Services Description Language. This document describes the "message exchange pattern" (whether to expect a response, for example), as well as the URI where the service can be found, and other information such as optional "headers" in the SOAP "envelope" that control things such as security and transactionality: in fact, all the metadata about the service (like what it actually does or purports to do).</p><p>Another thing the WSDL document will describe is what happens when things go wrong. In Web services these are called "faults" and they are the equivalent of exceptions or errors in your regular programming language. Faults can happen as a result of the service failing or as a result of a problem with the request itself (such as sending the wrong SOAP document to the service provider).</p><p>In order to use the Web service, you generally have to have access to the WSDL file because it contains the information about how to configure your "requester" to call the right URI with the right SOAP document. The process of finding this information is called "discovery" and it may be automated or a manual operation.</p><img src="webservice_arch.jpg" alt="Web Services Architecture Diagram" class="noResize"/><p>So, to summarize, in a Web service, different software processes can access one another. 
Because they use open standards such as XML, XML Schema, SOAP, WSDL, and so forth, it's possible to connect very different kinds of systems or operating environments without introducing a layer of specialized "translation" middleware, writing custom code, or purchasing a proprietary solution. This means that companies can develop services and then  powerfully integrate their systems without regard for where the data is stored or the particular nuances of its storage format.
And the information about these services, stored in the Web Service Descriptions, can be used to "compose" new services or applications that take advantage of an organization's various software resources, no matter where or how they might be hosted.</p><p>In 2002, the W3C chartered a Task Force within the Internationalization Working Group to look into globalization of Web services. As the chair of this Task Force, I had an opportunity to review the state of Web services and consider how internationalization and localization might be affected. The complete set of conclusions and requirements was recently published as Working Group Notes by the W3C. Some of these conclusions were incorporated into the Internationalization Core Working Group's new charter, approved by the W3C in late 2004.
</p><p>Web services provide a way for systems to “componentize” functionality and to provide transparent access to diverse systems.
Anything you can program a computer to do, from a simple function (add two integers together) to a complex business process (process purchase order), can be a Web service.</p><p>For all their apparent novelty, Web services are really just a new package for old ideas about building distributed systems. Technologies such as CORBA have long promised to deliver similar benefits, using components to assemble virtual applications from “building blocks”. These older technologies have had limited success and have generally been restricted to large enterprise implementations because of the scale of the commitment required to pull distributed systems together and because of interoperability woes. Groups with laser-like focus could sometimes recover the necessary investment and exploit the benefits that resulted from these technologies, but average organizations are not well set up to achieve these kinds of results.
Often there was a "lock-in" effect that made it difficult to exploit the full range of services available within an organization.</p><p>Web services are interesting and successful because they are standardized, open, and based on technologies that are accessible. Interoperability has been the single key focus of Web services' early development: vendors have formed consortia and had massive "code bake-offs" to ensure interoperability above all. This gives us the promise of distributed systems that can be secure, transactional, managed, scaled, and, yes, internationalized on an enterprise scale, while integrating software and solutions from nearly any vendor, no matter how small. And it promises to deliver these benefits to anyone who can plug in to the network, not just corporate behemoths.
</p><h2>Too Many Standards</h2><p>Web services can do all sorts of things, from the trivial to the complex. They can be chained together into transactions or used to provide integration points between different pieces of software, by calling other Web services or by providing a wrapper around some existing system's proprietary API. In fact, there is a whole realm of Web service technologies, such as WS-Choreography, which deal with building business processes and transactions out of collections of Web services.</p><p>At this point Web services become a bit less elegant. Effective Web services, especially in business processes or transactions, require all of the hallmarks of enterprise systems: security, quality-of-service, reliability, speed, transactionality, and so forth. These items have each been addressed as individual concerns, with specific standards designed to address each, so there is WS-Security and WS-ReliableMessaging and WS-Addressing and so forth.</p><p>When creating a specific service with specific capabilities you then rely on <q>service composition</q>. That is, multiple features can be applied to the same service without modifying the service itself. For example, a service can reference both the WS-Security and WS-ReliableMessaging standards without the two standards having to know about one another. You use various standards together to make a collection of the features you need.</p><p>Unfortunately there are a whole lot of standards. There are so many in the vast, amorphous blob called <q>WS*</q> that it becomes difficult to deploy and manage Web services because of the need to support many, sometimes conflicting, options.
</p><p>Web services vendors have turned their attention to this problem and there are proposals at the W3C and elsewhere to create a more modular, extensible architecture for Web services. Some of the proposals use existing Web services technologies and forms (WS-Policy, a proposal from IBM and Microsoft, for example). Others propose to use Semantic Web technologies such as RDF and OWL to augment or replace WSDL. Managing constraints and capabilities in Web services is a key issue, one that currently ignites passions throughout the industry.</p><p>Notably absent in this WS-Goulash? "WS-Internationalization".</p><h2>Web Services and Internationalization: Do We Need It?</h2><p>At first glance, Web services seem internationalized to start with: they don't have a direct user interface, and humans aren’t really reading the SOAP documents directly. Web services originally were supposed to represent a kind of low-level API call. Since Web services use a locale-neutral data representation in the form of XML Schema, it is tempting to think that internationalization doesn't play a major role in a Web service interaction; Web services merely pass objects in XML documents between functions in software, and internationalization is the problem of the service's author: the data still has to be formatted for display to the end user. This model for Web services neither inhibits nor especially encourages any particular internationalized behavior.</p><p>On the other hand, Jonathan Schwartz, COO of Sun Microsystems, recently opined in his <a href="http://blogs.sun.com/roller/page/jonathan/20050528#random_cool_things">blog</a>: <q>On the one hand, I really enjoy seeing the world. It's becoming more true by the day that the globalization of network standards is allowing the <em>localization</em> of the internet itself. A web server in the US is the same as a web server in Brazil. 
But a web <em>service</em> based in the US is unlikely to succeed against its local Brazilian counterparts without comprehending local culture. There's nothing like being there to understand the market.</q></p><p>Making your software comprehend local culture is at the very heart of internationalization. But most programming languages and operating environments use proprietary internationalization models that assume that the user's preferences (generally embodied as something called a <q>locale</q>) are created and maintained by the operating environment. This paradigm worked well in the era of personal computing, but that era effectively ended with the rise of the Internet and, in particular, the mainstream arrival of the Web around 1995. Steve Jobs's famous "one person, one machine" has become "one person, many machines" and our various "NLS subsystems" and creaky locale models have not developed into a single, coherent internationalization model for distributed computing.</p><p>For example, traditional Web technologies such as .NET or J2EE provide proprietary ways to exploit international support in Web applications. In an ASP.NET page, the <code>Culture</code> (which you can think of as the locale) can be set from the HTTP Accept-Language header values chosen by the user in the browser. In a J2EE web application (for example, a JSP page or a servlet), the <code>java.util.Locale</code> object of the user is set from the same header and can be retrieved from the <code>HttpServletRequest.getLocale()</code> method. These capabilities exist in Web servers because the underlying code that makes up an application usually wasn't written with a Web page in mind, as we'll see a bit later in this article.</p><p>The reason, of course, that a Web page needs the locale is so that it can load content in the right language and format dates, numbers, and so on correctly. Since Web services don't have a "user interface", why would they need to know the locale? 
Because ultimately Web services are software functions that serve end users. A particular Web service may not do something that requires a locale and, in fact, this might even describe most well-internationalized Web services, but there are some cases where they do need the locale information.</p><p>Recall that the actual Web service can be any part of an existing application, say a function call. If the application were internationalized originally, it would use the internationalization model for the programming language and platform where it was developed and is running (a Culture if you're a C# program and a Locale if you're a Java program, and so on). The actual function relies on the locale or other international preferences being available in the application's environment or in some runtime variable, just as in the Web examples above. Wrapping this function in a Web service breaks internationalization because the provider doesn't know that the code inside the function it is calling needs the locale information.</p><p>Besides, the provider doesn't currently have a way to obtain that information independently from the requester, even though it needs the same information to do things like look up the human-readable text for errors ("faults" in Web services) that it encounters, even if it never gets around to invoking a service on behalf of the requester. The provider is "supposed" to provide the requester's preferences (which it does not know) in the operating environment (setting the DefaultThreadCulture for a C# program, for example).</p><p>So internationalizing the Web service provider (as well as the Web service itself) requires that we have a way that vendors can use to "activate" their proprietary locale model and exchange international preferences. Let's call this "WS-Internationalization" for a moment. WS-Internationalization should provide locale and other preference identifiers that are as clear and platform-neutral as possible. 
And there should be the possibility of vendor-specific extensions so that vendors can leverage their platform's specific capabilities as well.</p><p>The W3C Internationalization Working Group studied this problem and concluded that there are four basic patterns that a Web service might use that affect its need for locale information:</p><ol><li><strong>Locale Neutral.</strong> Most services are not particularly locale affected. These services can be considered "locale neutral". For example, a service that adds two integers together is locale neutral.</li><li><strong>Data Driven.</strong> The service implementation may be locale affected, but the locale is not under programmatic control. For example, a service that queries a database will return records in the collation order of the database, not in an order controlled by the service itself.</li><li><strong>Service Determined.</strong> The service will have a particular setting built into it. As in: this service always runs in the French for France locale. Or, quite commonly, the service will run in the host's default locale. It may even be a deployment decision that controls which locale or preferences are applied to the service's operation.</li><li><strong>Client Influenced.</strong> The service's operation can use a locale preference provided by the end-user to affect its processing. This is called "influenced" because not every request may be honored by the service (the service may only implement behavior for certain locales or international preference combinations).</li></ol><p>Each of these patterns may apply to a service or an aspect of the service. By describing the "international policy" separately for a service or service aspect, different end-points or bindings of the same service can provide different locale-affected behavior or different localizations. Or the logic might differ in a culturally sensitive manner. 
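</p><p>As a rough illustration of the "client influenced" pattern described above, here is a minimal Java sketch. It is not taken from the Working Group's documents: the class name, the list of supported locales, and the fallback locale are all hypothetical, and it assumes a modern Java runtime in which <code>Locale.forLanguageTag()</code> is available. The point it tries to show is only that the requester's preference is matched against what the service actually implements, and that a "service determined" default applies when nothing matches.</p><div class="example"><pre>import java.util.Locale;

/** Hypothetical sketch of the "client influenced" pattern; all names are illustrative. */
public class ClientInfluencedDemo {

   // Locales this imaginary service has been localized for (an assumption).
   static final Locale[] SUPPORTED = { Locale.US, Locale.FRANCE, Locale.JAPAN };

   // "Service determined" fallback, used when a request cannot be honored.
   static final Locale SERVICE_DEFAULT = Locale.US;

   /** Returns the locale the service will use for a requested language tag. */
   static Locale negotiate(String requestedTag) {
      if (requestedTag == null || requestedTag.isEmpty()) return SERVICE_DEFAULT;
      Locale requested = Locale.forLanguageTag(requestedTag);
      // Prefer an exact language-and-region match.
      for (Locale candidate : SUPPORTED) {
         if (candidate.equals(requested)) return candidate;
      }
      // Otherwise accept any supported locale that shares the language.
      for (Locale candidate : SUPPORTED) {
         if (candidate.getLanguage().equals(requested.getLanguage())) return candidate;
      }
      // The preference only influences the result; it is not guaranteed.
      return SERVICE_DEFAULT;
   }

   public static void main(String[] args) {
      System.out.println(negotiate("fr-FR").toLanguageTag());  // prints fr-FR
      System.out.println(negotiate("fr-CA").toLanguageTag());  // prints fr-FR
      System.out.println(negotiate("de-DE").toLanguageTag());  // prints en-US
   }
}</pre></div><p>Run as written, the sketch honors a request for fr-FR, maps fr-CA to the nearest supported French locale, and falls back to the service's own default for de-DE, which is why the pattern is called "influenced" rather than "controlled".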
</p><p>Once one posits services that can be locale-affected and which obey policies in their Web service descriptions, it is a short journey to composing a service's international constraints and capabilities with other features. For example, a service might be provisioned so that a specific "binding" (combination of URI and various options) returns messages in a specific language. Or the processing rules might differ based on the language requested: requests for French might result in the request going to a server in France, for example.</p><h2 id="appendix_a">Internationalizing a Web Service</h2><p>If Web services leverage proprietary internationalization models they risk exposing the underlying implementation in a way that might break interoperability. But if Web services also require the same international capabilities as traditional applications in order to provide language and culturally affected processing, then how can Web services have both internationalized behavior and platform-neutral interoperability?</p><p>Creating a non-proprietary internationalization model actually requires very little standardization. The Internationalization Working Group believes that this work includes:</p><ul><li>Standardized identifiers for locale and other international preferences, such as collation. This is already a problem for other technologies.</li><li>Standardized SOAP and WSDL descriptions (described as features) for invoking internationalized services and passing 
international preferences to 
them.</li><li>A way of describing the deployment, implementation, or invocation  policy of a service.</li></ul><p>Let's consider an example of a simple Java method converted to a Web service. For simplicity we'll write a little "internationalization demo" that converts a date to a human-readable string. Here's how such a method might be implemented:</p><div class="example"><pre>public static String getDateString(Date date) {
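   // Treat a missing date as "now".
   if (date == null) date = new Date();
   // No Locale argument is passed, so the formatter uses the JVM's default locale.
   DateFormat df = DateFormat.getDateTimeInstance(DateFormat.LONG, 
      DateFormat.MEDIUM);
   return df.format(date);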
}</pre></div><p>In this example, a Date object is formatted as a String using the default locale of the JVM where the program is executing. The locale setting of this method is not part of the service's signature and wouldn't appear in the parameter list necessary for invoking this method.</p><p>This method does require some modifications if we wish to use it in a distributed setting, say a J2EE program:</p><div class="example"><pre>public static String getDateString(Date date) {
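   if (date == null) date = new Date();
   // "request" is assumed to be the current HttpServletRequest, made available
   // to this method by the surrounding servlet or JSP code.
   Locale userLocale = request.getLocale();
   DateFormat df = DateFormat.getDateTimeInstance(DateFormat.LONG, 
      DateFormat.MEDIUM, userLocale);
   return df.format(date);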
}</pre></div><p>The change to the method's implementation is minimal: the developer must know to get the locale from the HttpServletRequest object, in this case. Note that a comparison with C# is instructive: there is no need to change code in .NET, since the service provider sets the thread "culture" (the Microsoft equivalent of a locale) from the request.</p><p>Now there is something mystical about how the J2EE system got the locale into the method. The Java proprietary Locale object was set in the Request object by processing the user's HTTP Accept-Language header and doing some (proprietary) mapping to Java's representation (the Locale object). The locale can then be accessed by a Java program. Although language tags and Locale identifiers are quite similar in many respects, they are distinct: a mapping does take place and it differs from the one a C# or C program might do.</p><p>Both of the examples above produce localized behavior using information maintained by the application's environment. The developer may need to do something in the body of the code to access this capability (as in the second example), but the method signature remains the same. One might be tempted to write the following code, for example:</p><div class="example"><pre>public static String getDateString(Date date, Locale userLocale) {
   // Fall back to the JVM's default locale when the caller supplies none.
   if (userLocale == null) userLocale = Locale.getDefault();
   if (date == null) date = new Date();
   DateFormat df = DateFormat.getDateTimeInstance(DateFormat.LONG, 
      DateFormat.MEDIUM, userLocale);
   return df.format(date);
 }</pre></div><p>The problem here is that the Locale object takes three strings in its constructor and, since you can't stick a Locale object into an XML Schema document directly, the WSDL will request those underlying strings instead. Another Java program knows what "language", "country", and "variant" are and what the valid values are, but to any other program these are three strings with no way to fill them in accurately. It is important to recognize that changing the method signature to get localized behavior is a poor implementation choice. A little more work might improve that, but it requires the service author to create gallons of unusual code:</p><div class="example"><pre>public static String getDateString(Date date, String locale) {
   if (locale == null) locale = "";
   if (date == null) date = new Date();
   Locale userLocale;
   if (locale.length()==0) {
      userLocale = Locale.getDefault();
   } else {
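      // Parse the string called "locale" and create a Locale object. A full
      // parser with validation runs to about 20 lines of code; this minimal
      // stand-in only splits tags such as "fr", "fr-FR", or "fr_FR_POSIX".
      String[] parts = locale.split("[-_]", 3);
      userLocale = new Locale(parts[0],
         parts.length > 1 ? parts[1] : "",
         parts.length > 2 ? parts[2] : "");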
   }
   DateFormat df = DateFormat.getDateTimeInstance(DateFormat.LONG, 
      DateFormat.MEDIUM, userLocale);
   return df.format(date);
 }</pre></div><p>Because thinking about Web service internationalization incorporates the idea of policies and capabilities directly, internationalization provides many specific use cases that illustrate the problems and potential for work in this area.</p><h2>Conclusions</h2><p>The W3C has recognized the need for work on this problem. The recently rechartered Internationalization Core Working Group has two work items that it is developing to address the need for Web service internationalization.</p><p>The first is to describe Language and Locale Identifiers for the Web. Work at the IETF on revising and updating RFC 3066 (the basis for language identifiers in many applications) may result in an update to that standard that maps more closely to language and locale models in modern operating systems and programming languages. It also dovetails with work at the Unicode Consortium on a Common Locale Data Repository and an XML dialect called <acronym title="Locale Data Markup Language">LDML</acronym> that can be used to describe specific locale settings. Using these works as a base, the Internationalization Core Working Group can describe a standard for using language tags in W3C technologies and how to extend or augment language identification to describe specific locale information, as needed for Web services.</p><p>The second deliverable is to create a standard for Web services internationalization. This work necessarily depends on the first deliverable and probably won't be seriously underway until late in 2005, but it should result in new standards that Web services providers and vendors can leverage to provide a richly internationalized Web services environment.</p><p>Creation of WS-Internationalization, complete with standardized "international preference" and policy identifiers, will allow Web services to provide both interoperability and localized behavior. 
Combined with standards for describing complex service behavior and configurations (constraints and capabilities), the creation of rich composite applications can be assured.</p></body>
</html>
