The Google Search Appliance is basically a Google server (or several) for your intranet. Information is taken from http://www.google.com/appliance/. I blame them for any inaccuracy.
The basic product is the GB-1001, a single yellow rackmount Linux server running their Google search engine and hosting the necessary webserver for user access. The point of the product is pretty much that it should be plug-and-play, so they don't say much about what you would need to configure, beyond reassuring the customer that they don't need to know about Linux servers. It does boast the following features (cut-and-paste from Google):
- Web-based Admin Console: The admin console supports multiple logins and administrative roles for crawling, serving, and monitoring with an intuitive, easy-to-use interface.
- Categorize searches according to URL patterns.
- Define synonyms for company-specific acronyms or terminology and have those terms be displayed as suggested alternative queries.
- Define matches between URLs and keywords so that the targeted URL displays above the main set of search results.
- Look and Feel: Search results customizable using XSLT stylesheets.
- Web-based reports show daily and hourly result sets, top queries, special feature usage, and more. Easily export the reports for use in other reporting tools.
- URL Tracking: Analyzes all crawled content and hosts, making it easy for administrators to identify problematic servers, errors, and sources of content.
- Remote Diagnostics: Comes equipped with a modem connection for remote maintenance by Google support if necessary.
As to the kind of content that it can search, it can apparently search stuff that is "NTLM" or "basic authentication" protected; it can search behind proxies, and it can search NFS shares, etc. ... if you expose them using a webserver!
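For the curious, "basic authentication" just means the client sends a username and password, base64-encoded, in an HTTP header with every request. Here is a minimal Python sketch of the header a crawler would have to supply to reach such a page. The function name and the credentials are made up for illustration; Google doesn't say how the appliance is actually given its credentials.

```python
import base64


def basic_auth_header(username, password):
    """Build the HTTP Authorization header used by basic authentication.

    The name of this helper is hypothetical; only the header format is standard.
    """
    # The credentials are joined with a colon and base64-encoded (not encrypted).
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Hypothetical crawler account, for illustration only.
header = basic_auth_header("crawler", "secret")
print(header["Authorization"])
```

Note that base64 is trivially reversible, so this protects nothing unless the connection itself is encrypted.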
As to the actual products, they come in three flavours:
- The basic GB-1001 can search at a rate of 60 queries per minute, and can be licensed to search either 150,000 or 300,000 documents. The obvious question is: can I hack it to remove the limit? The answer is probably "yes", but I certainly don't know how. I do, however, think that any admin would want this cool yellow box in his rack.
- Next is the GB-5005, which "is the result of five servers featuring automatic internal clustering and failover. Searches up to three million documents at a rate of 150 queries per minute (more than one million queries per week), and can run two independent collections." No word as to relative price, of course, so no idea if it would just be cheaper to buy five single units.
- Lastly, you have the GB-8008, eight clustered servers, "up to 15 million documents, more than 1.4 million queries per day, running five independent collections."
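Google mixes its units here (per minute, per week, per day), so a quick sanity check helps when comparing the three boxes. The sketch below assumes the quoted rates are sustained around the clock, which is my assumption and not something Google states; the function names are mine too.

```python
# Convert the quoted throughput figures into common units.
# Assumption: the rate is sustained 24 hours a day, 7 days a week.
MINUTES_PER_DAY = 60 * 24
MINUTES_PER_WEEK = MINUTES_PER_DAY * 7


def per_week(queries_per_minute):
    """Queries per week at a constant per-minute rate."""
    return queries_per_minute * MINUTES_PER_WEEK


def per_minute_from_daily(queries_per_day):
    """Average queries per minute implied by a daily total."""
    return queries_per_day / MINUTES_PER_DAY


gb1001_weekly = per_week(60)                           # 604,800
gb5005_weekly = per_week(150)                          # 1,512,000
gb8008_per_minute = per_minute_from_daily(1_400_000)   # roughly 972

print(f"GB-1001: {gb1001_weekly:,} queries/week")
print(f"GB-5005: {gb5005_weekly:,} queries/week")
print(f"GB-8008: {gb8008_per_minute:.0f} queries/minute")
```

On those assumptions, the GB-5005's "more than one million queries per week" is consistent with 150 queries per minute, and the GB-8008 works out to roughly six and a half times the GB-5005's rate.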
The point of noding this is not just server-porn, but to talk about their ideas for supplying the appliance. They claim that by selling an appliance, instead of just software, they reduce the total cost of ownership, and that it allows them to develop features more quickly. This turns on its head the accepted attitude (although not wisdom) among business-types that "software is cheap".
Google, of course, get around the fact that they have sold you a valuable and durable piece of kit by licensing its operation for only two years. Whether or not this sort of thing is actually enforceable is another question, and one I would like to research, but frankly, I don't have time. In fact, this could well explain why these little babies are only available in the USA. I'd tell you more, but I'd have to pretend I wanted to buy one of these things.