You may have noticed how some websites have small "powered by Google" search boxes on them. How is this done? Are they making a standard HTTP request to Google, munging the results and displaying them? Have they got their own mirror of Google running on the webserver? Chances are that these pages are produced by using the Google Web APIs.
What are the Google Web APIs?
Although most people will be familiar with the ubiquitous Google websites, a lesser-known facility is available to developers - the Google Web APIs. This service allows programmers to use the searching power of Google from an application they are developing. The APIs are presented as a web service, a kind of remote procedure call mechanism built on standards maintained by the W3C. The search requests and responses are packaged as objects and transported over an HTTP connection via SOAP (Simple Object Access Protocol), which makes using the service intuitive and convenient. This may all sound very foreign and complicated, but the object management and communication logic is nicely contained "behind the scenes", so that a developer barely realises the complexity of the operations they're controlling.
How does it all work?
Very simply, in order to do a search, a programmer creates a "search request" type object and passes it to Google via SOAP. Google processes this request, performs a search and passes back a "search results" type object, again via SOAP. The programmer can then use the results as they like: process and store them, display them, ignore them, anything!
Obviously, this is a very convenient abstraction for what is a highly complex process. SOAP is absolutely key to the system; without it, passing objects between languages, between computers, would be a mammoth task in itself.
SOAP is based on XML, and is supported by many programming languages. It provides a convenient way of transmitting structured and typed data between hosts over HTTP. Because the traffic is ordinary HTTP, SOAP messages can usually pass between networks with little or no firewall reconfiguration.
As these SOAP constructs can be arbitrarily complicated, all the functionality and flexibility of a normal Google search can be specified in the search request. For example, to only search for pages written in Estonian, we can assign the value lang_et to the <lr> (language restrict) parameter.
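To make the idea of "structured data over HTTP" concrete, here is a minimal sketch of roughly what such a request looks like on the wire. It builds the XML by hand purely for illustration; in practice the SOAP library does this for you. The class and method names are my own, the urn:GoogleSearch namespace is what I understand the WSDL file to declare, the type attributes a real SOAP library would attach to each element are left out, and "YOUR-KEY" is a placeholder rather than a working key.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: hand-builds the XML body of a doGoogleSearch request. */
class SearchEnvelopeSketch {

    /** Escapes the characters that are special inside XML text. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** The ten request parameters, kept in insertion order. */
    static Map<String, String> estonianSearch(String key, String query) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("key", key);
        p.put("q", query);
        p.put("start", "0");
        p.put("maxResults", "10");
        p.put("filter", "false");
        p.put("restrict", "");
        p.put("safeSearch", "false");
        p.put("lr", "lang_et"); // language restrict: Estonian pages only
        p.put("ie", "latin1");
        p.put("oe", "latin1");
        return p;
    }

    /** Wraps the parameters in a SOAP 1.1 envelope (type attributes omitted). */
    static String buildEnvelope(Map<String, String> params) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<SOAP-ENV:Envelope xmlns:SOAP-ENV=")
           .append("\"http://schemas.xmlsoap.org/soap/envelope/\">\n");
        xml.append("  <SOAP-ENV:Body>\n");
        // Assumption: urn:GoogleSearch is the namespace declared in the WSDL
        xml.append("    <ns1:doGoogleSearch xmlns:ns1=\"urn:GoogleSearch\">\n");
        for (Map.Entry<String, String> p : params.entrySet()) {
            xml.append("      <").append(p.getKey()).append(">")
               .append(escapeXml(p.getValue()))
               .append("</").append(p.getKey()).append(">\n");
        }
        xml.append("    </ns1:doGoogleSearch>\n");
        xml.append("  </SOAP-ENV:Body>\n");
        xml.append("</SOAP-ENV:Envelope>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        // "YOUR-KEY" is a placeholder, not a working licence key
        System.out.println(buildEnvelope(estonianSearch("YOUR-KEY", "moose")));
    }
}
```

Running it prints the envelope for an Estonian-only search for "moose"; the lr element is the only thing that distinguishes it from an unrestricted search.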
How do you use it?
Before using the APIs, you have to register with Google (for free), providing your email address. When this is complete, you receive a key that entitles you to 1000 API-based searches daily. The developer is responsible for keeping track of how many searches have been performed - Google offers no way to query the count, and the only signal you get is the SOAP faults that start arriving once your quota is used up.
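Since the counting is left to you, one possible approach is a small counter that you consult before each search. The following is a minimal sketch, not anything supplied in Google's kit: the class name and methods are hypothetical, the count lives only in memory (a real script would have to persist it between runs), and it assumes the quota resets at the start of each local calendar day, which may not match when Google actually resets it.

```java
import java.time.LocalDate;

/** Sketch only: counts searches per calendar day against a fixed limit. */
class DailyQuotaTracker {
    private final int dailyLimit;
    private LocalDate currentDay; // day the counter below refers to
    private int usedToday;

    DailyQuotaTracker(int dailyLimit) {
        this.dailyLimit = dailyLimit;
    }

    /** Records one search if quota remains; returns false once it is spent. */
    synchronized boolean tryConsume(LocalDate today) {
        if (!today.equals(currentDay)) {
            // Assumption: the quota resets when the local date changes
            currentDay = today;
            usedToday = 0;
        }
        if (usedToday >= dailyLimit) {
            return false;
        }
        usedToday++;
        return true;
    }

    /** Number of searches still available on the given day. */
    synchronized int remaining(LocalDate today) {
        return today.equals(currentDay) ? dailyLimit - usedToday : dailyLimit;
    }

    public static void main(String[] args) {
        DailyQuotaTracker quota = new DailyQuotaTracker(1000);
        LocalDate today = LocalDate.now();
        if (quota.tryConsume(today)) {
            System.out.println("OK to search, " + quota.remaining(today) + " left today");
        } else {
            System.out.println("Quota used up, try again tomorrow");
        }
    }
}
```

Calling tryConsume before every search means the script can stop cleanly instead of discovering the limit through a SOAP fault.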
You will also need to download the developer kit, which includes documentation, a jar file to enable Google API interaction inside a Java programme, quite a lot of example code and a WSDL (Web Services Description Language) file which formally describes the format of requests and responses. This WSDL file ensures that specialised code doesn't need to be written for every language to access the APIs; it can be used to automatically configure SOAP in any language that has a WSDL-aware SOAP library.
As an example of how seamlessly SOAP integrates Google's functionality into an application, I present some sample code:
Java - performs a search for the word "moose".
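// These classes come from the jar file in the developer kit
import com.google.soap.search.GoogleSearch;
import com.google.soap.search.GoogleSearchFault;
import com.google.soap.search.GoogleSearchResult;

GoogleSearch s = new GoogleSearch();
s.setKey("");  // your licence key goes here
s.setQueryString("moose");
try {
    // Sends the SOAP request and waits for the results object
    GoogleSearchResult res = s.doSearch();
    System.out.println("Results: \n" + res.toString());
} catch (GoogleSearchFault e) {
    // A SOAP fault, e.g. a bad key or an exhausted quota
    System.err.println("Error: " + e.toString());
}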
Perl - still looking for that moose.
#!/usr/bin/perl
use strict;
use warnings;
# We need the extra functionality of this SOAP module
use SOAP::Lite;

my $google_key = '';
# Configuration data from WSDL file
my $google_wsdl_file = "./GoogleSearch.wsdl";
my $query_string = "moose";

# Instantiate SOAP::Lite, configured from the WSDL description
my $google_search = SOAP::Lite->service("file:$google_wsdl_file");

# Perform the actual Google search
my $results = $google_search->doGoogleSearch(
    $google_key, $query_string, 0, 10, "false", "", "false",
    "", "latin1", "latin1"
);

# Print the URL of each hit (nothing is printed if the search failed)
print "$_->{URL}\n" for @{ ($results && $results->{resultElements}) || [] };
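In this example, the extra parameters passed to doGoogleSearch after the key and the query string are, in order: the index of the first result wanted (0), the maximum number of results (10), whether near-duplicate results should be filtered, a country or topic restriction, whether SafeSearch should be on, the language restriction, and the input and output encodings.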
Conclusion
There are any number of situations in which using the Google APIs could be helpful, or even essential. Google itself suggests using the APIs to track the proliferation of certain subjects on the web, to perform market research or in an "innovative game", although I'm not sure what form that would take! Also, convenient scripts can be created which use Google from the command line. For example, I have a google_everything script which searches for instances of a certain string (passed as an argument) on everything2. Because of the simplicity of SOAP, it only took a couple of minutes to write, and is very convenient.
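As a rough idea of how little such a script needs to do, here is a minimal sketch of the query-building part in Java. It is a guess at one way to do it, not a copy of my script: the class and method names are made up for illustration, it assumes everything2.com is the domain to restrict to, and it relies on Google's site: query operator. The resulting string would then be handed to the search call as the query.

```java
/** Sketch only: turns command-line words into a site-restricted query. */
class SiteQuerySketch {

    /** Joins the search terms and restricts the query to a single site. */
    static String buildSiteQuery(String site, String... terms) {
        return ("site:" + site + " " + String.join(" ", terms)).trim();
    }

    public static void main(String[] args) {
        // Assumption: everything2.com is the domain to search within
        System.out.println(buildSiteQuery("everything2.com", args));
    }
}
```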
For more information, go to http://www.google.com/apis/