Google Mini Review

As previously promised, I hereby present a mini review of the Google Mini.
Installation
The Mini came in a box conspicuously plastered with “Google” on every conceivable surface, which caused several engineers’ heads to turn as I lugged the box through our lobby. We’d given no one advance warning of our intention to install the Google Mini (GM), and so I heard several “Google? What the…?” comments on my way to the server room.
Once opened, the box revealed the Mini, cords, documentation, mounting brackets and a complimentary Google T-shirt(!). The other sysadmins were jealous, I can assure you, as it’s SOP ’round the office that the freebies that come with the equipment you order are rightfully yours.
On to the physical installation. I think Dell rackmount kits have ruined me, as I now find just about every other vendor’s kits to be extremely difficult to use and install. The Google Mini’s setup is no exception. It shipped in the box with “telco”-style mid-chassis mounting brackets installed, which is all well and good if you’re using double post racks (as opposed to the more common quad post setup) but which is a pain for our datacenter. The GM suffers from the same lack of decent mounting hardware that seems to plague most “appliance” servers these days: after stripping off the telco brackets, it took a coworker and me well over half an hour to get the rear mounting brackets installed and correctly spaced, plus an additional five minutes or so to install the finished assembly in the rack.
With the box in place, I plugged in the Ethernet cable and plugged the Mini into an open power socket. Total physical prep time: around an hour and fifteen minutes.
Software Setup
Once plugged in, the Mini ran through a series of self-diagnostics which were completely obscured from the admin’s (read: my) view. When the diagnostics completed, the Mini played a tune using the motherboard’s PC speaker. Unfortunately, I’m not sure what the tune was, as the air conditioning units in our server room run at a sufficiently high dB level so as to obscure the actual tune. It definitely made noise, though.
Google provides a 40-page instruction manual (plus a two-page addendum) on how to get the Mini up and running, as well as a support site for their “enterprise” customers. The manual guides users through a step-by-step process, demonstrating how to set up networking on the box, how to tell the Mini what sites to crawl, how to create new “collections”, etc.
The first step was to run the initial network configuration, which is accomplished by hooking a laptop directly to the Mini via a crossover cable (included). You iterate through a series of steps, setting such properties as IP address, default gateway, default DNS servers, etc. During this process, you also set the default administrator password. While this portion of the setup was easy enough, it was made overly complicated by Google’s refusal to reveal any details about the box’s hardware configuration, particularly the Ethernet address of the on-board network card, meaning that the Mini adapts poorly (out of the box) when deployed on networks dependent upon DHCP (such as ours). More on this later.
Setting Up A Site Crawl
Once I finished with the network configuration, I unplugged my laptop and returned to my desk, as the Mini was now active on our network and the remaining steps could be completed via a standard web browser. Once again, the instruction manual provided a step-by-step process for setting up an initial “collection” and crawling the content defined by that collection. As suggested by the manual, I pointed the Mini to our external web site (which carries far less content than our internal server – makes for a quicker initial crawl). The first crawl lasted for roughly an hour or so, after which I issued a few test queries and found that the Mini had indeed done a good job of indexing our publicly available content. I then set the Mini’s sights on our internal site and went home for the evening.
Performance/Usage Impressions
Returning to work in the morning, I found that the initial crawl and index of our content had finished overnight, with a combined crawl and index time of around 8 hours (subsequent runs have taken around 2 hours, start to finish). After executing a few test queries and checking the crawl logs, I found that the Mini had, by all appearances, indexed almost every bit of content available on both our internal and external sites, including PowerPoint, Word, Excel, PDF, PostScript and OpenOffice documents. It doesn’t seem to support indexing of images (see: images.google.com), but I doubt this will come up much in our “corporate” use of the Mini. It was rather entertaining to note all of the “404 – not found” messages turned up by the crawl, as there are apparently ghostly remnants of pages floating around on our intranet referencing ~/[username] pages of people who haven’t worked here in years, if not a decade or more.
Google also provides an XML interface to the Mini, meaning that well-formatted GET queries sent to the box will be met with an XML document containing the search results. Our engineers and developers are rather excited, as this could provide many of their applications with the ability to search our entire network for information. We’ll see how that pans out over time.
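For the curious, here is a rough sketch of what a client for that XML interface might look like in Python. Treat everything specific in it as an assumption on my part rather than something taken from the Mini's documentation: the hostname mini.example.com, the /search path, the query parameter names (q, site, output, client) and the result element names (R, U, T) are placeholders to be checked against Google's own reference. The two helper functions, build_search_url and parse_results, are my own invention.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_search_url(host, query, collection="default_collection"):
    """Build a GET URL for a search request.

    The path and parameter names are assumptions, not confirmed
    against the Mini's documentation.
    """
    params = {
        "q": query,                    # the search terms
        "site": collection,            # assumed: which collection to search
        "output": "xml_no_dtd",        # assumed: ask for XML rather than HTML
        "client": "default_frontend",  # assumed: front end name
    }
    return f"http://{host}/search?{urlencode(params)}"


def parse_results(xml_text):
    """Pull (url, title) pairs out of an XML results document.

    Assumes each hit is an <R> element with <U> (URL) and <T> (title)
    children; adjust the tag names to match the real response.
    """
    root = ET.fromstring(xml_text)
    results = []
    for hit in root.iter("R"):
        results.append({
            "url": hit.findtext("U", default=""),
            "title": hit.findtext("T", default=""),
        })
    return results
```

An application would fetch the URL from build_search_url with whatever HTTP client it already uses and hand the response body to parse_results. Keeping the fetch separate from the parsing means the parser can be tried out against a saved response without touching the box.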
If you are used to the snappy response times of google.com, you may be in for a bit of a surprise, as the Mini is relatively slow to serve up search results (at least when compared to the Google mothership). Some queries can take up to a few seconds to return any results, although on average the response times are decent.
Shortcomings
The Mini virtually defines “security through obscurity”. I have yet to find any references to the actual hardware contained within the appliance itself, so I don’t know what sort of processor it is running on, how much memory the box has, or how large its hard drive is. Also missing from the provided documentation is any record of the Ethernet address of the box. To get the Mini working properly in our DHCP-managed environment, I had to tail the logfiles on our dhcpd server, watch for new requests, determine which one was coming from the Mini itself, and then add the address to the relevant NIS tables. While this is a minor nit to pick, it represents a fairly inconvenient oversight on Google’s part.
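The log-watching step above can be roughly automated. The following Python sketch assumes syslog lines in the style ISC dhcpd writes ("DHCPDISCOVER from <mac> via <interface>" and "DHCPREQUEST for <ip> from <mac> via <interface>") and assumes you can supply a set of Ethernet addresses you already know about. The function name unknown_macs and the sample addresses are mine, purely for illustration.

```python
import re

# Matches the client MAC in DHCPDISCOVER / DHCPREQUEST log lines.
# The log wording is an assumption based on ISC dhcpd's usual output.
MAC_RE = re.compile(
    r"DHCP(?:DISCOVER|REQUEST)\b.*?\bfrom\s+"
    r"((?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2})"
)


def unknown_macs(log_lines, known_macs):
    """Return MACs seen requesting leases that are not in known_macs.

    Addresses are compared case-insensitively and returned in
    first-seen order, lowercased, without duplicates.
    """
    known = {mac.lower() for mac in known_macs}
    seen = []
    for line in log_lines:
        match = MAC_RE.search(line)
        if not match:
            continue
        mac = match.group(1).lower()
        if mac not in known and mac not in seen:
            seen.append(mac)
    return seen
```

Feed it the lines from the dhcpd log and the addresses already in your host tables; whatever comes back right after the new box is plugged in is a good candidate for its Ethernet address.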
The Mini also appears to be running at a reduced level of functionality, at least when compared with its more expensive Google Search Appliance cousins. As previously noted, it doesn’t index images. It also doesn’t allow for document-level security, meaning that if you only want to expose certain portions of your search results to a specific group of users, you must place a proxy server in front of the Mini and handle access to the different sub-collections via the proxy server. Lastly, while the Mini’s limit of 100,000 documents may seem like a lot, I think that any organization with more than 150 or so people would quickly run into that cap. Of course, when you start getting larger than that, the $15,000 price tag for a larger Google Search Appliance probably doesn’t present as much of a hurdle as it does to smaller organizations like ours.
Summary
While we have only had the Mini on-site for a week or so, I have already had dozens of users offer up glowing comments on its utility. It truly is a powerful tool crammed into a small package and, though it suffers from some notable shortcomings, is well worth the $3,000 for a small-to-moderately-sized organization looking to get a handle on its web content. Overall, I would rate it an 8.5/10.
