As previously promised, I hereby present a mini review of the Google Mini.
Installation
The Mini came in a box conspicuously plastered with “Google” on every conceivable surface, which caused several engineers’ heads to turn as I lugged the box through our lobby. We’d given no advance warning to anyone of our intention to install the GM and so I heard several “Google? What the…?” comments on my way to the server room.
Once opened, the box revealed the Mini, cords, documentation, mounting brackets and a complimentary Google T-shirt(!). The other sysadmins were jealous, I can assure you, as it’s SOP ’round the office that the freebies that come with the equipment you order are rightfully yours.
On to the physical installation. I think Dell rackmount kits have ruined me, as I now find just about every other vendor’s kits to be extremely difficult to install and use. The Google Mini’s setup is no exception. It shipped with “telco”-style mid-chassis mounting brackets installed, which is all well and good if you’re using double post racks (as opposed to the more common quad post setup) but which is a pain for our datacenter. The GM suffers from the same lack of decent mounting hardware that seems to plague most “appliance” servers these days – after stripping off the telco brackets, it took a coworker and me well over half an hour to get the rear mounting brackets installed and correctly spaced, plus an additional five minutes or so to install the finished assembly in the rack.
With the box in place, I plugged in the Ethernet cable and plugged the Mini into an open power socket. Total physical prep time: around an hour and fifteen minutes.
Software Setup
Once plugged in, the Mini ran through a series of self-diagnostics which were completely hidden from the admin’s (read: my) view. When the diagnostics completed, the Mini played a tune using the motherboard’s PC speaker. Unfortunately, I’m not sure what the tune was, as the air conditioning units in our server room are loud enough to drown it out. It definitely made noise, though.
Google provides a 40-page instruction manual (plus a single two-page addendum) on how to get the Mini up and running, as well as a support site for their “enterprise” customers. The manual guides users through a step-by-step process, demonstrating how to set up networking on the box, how to tell the Mini what sites to crawl, how to create new “collections”, etc.
The first step was to run the initial network configuration, which is accomplished by hooking a laptop directly to the Mini via a crossover cable (included). You iterate through a series of steps, setting such properties as IP address, default gateway, default DNS servers, etc. During this process, you also set the default administrator password. While this portion of the setup was easy enough, it was made overly complicated by Google’s refusal to reveal any details about the box’s hardware configuration, particularly the Ethernet address of the on-board network card, meaning that the Mini adapts poorly (out of the box) when deployed on networks dependent upon DHCP (such as ours). More on this later.
Setting Up A Site Crawl
Once I finished with the network configuration, I unplugged my laptop and returned to my desk, as the Mini was now active on our network and the remaining steps could be completed via a standard web browser. Once again, the instruction manual provided a step-by-step process for setting up an initial “collection” and crawling the content defined by that collection. As suggested by the manual, I pointed the Mini to our external web site (which carries far less content than our internal server – makes for a quicker initial crawl). The first crawl lasted for roughly an hour or so, after which I issued a few test queries and found that the Mini had indeed done a good job of indexing our publicly available content. I then set the Mini’s sights on our internal site and went home for the evening.
Performance/Usage Impressions
Returning to work in the morning, I found that the initial crawl and index of our content had finished overnight, with a combined crawl and index time of around 8 hours (subsequent runs have taken around 2 hours, start to finish). After executing a few test queries and checking the crawl logs, I found that the Mini had, by all appearances, indexed almost every bit of content available on both our internal and external sites, including PowerPoint, Word, Excel, PDF, PostScript and OpenOffice documents. It doesn’t seem to support indexing of images (see: images.google.com), but I doubt this will come up much in our “corporate” use of the Mini. It was rather entertaining to note all of the “404 – not found” messages turned up by the crawl, as there are apparently ghostly remnants of pages floating around on our intranet referencing ~/[username] pages of people who haven’t worked here in years, if not a decade.
Google also provides an XML interface to the Mini, meaning that well-formatted GET queries sent to the box will be met with an XML document containing the search results. Our engineers and developers are rather excited, as this could provide many of their applications with the ability to search our entire network for information. We’ll see how that pans out over time.
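For the curious, here is a minimal Python sketch of how an application might build such a GET query and pull the hits out of the XML that comes back. Treat all of the specifics as assumptions: the host name `mini.example.com`, the collection and front-end names, and the sample response are made up, and the parameter names (`q`, `site`, `client`, `output`) and element names (`R`, `U`, `T`) should be checked against the documentation for your own box before you rely on them.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical host; replace with the address of your own appliance.
MINI_HOST = "mini.example.com"


def build_search_url(query, collection="default_collection", frontend="default_frontend"):
    """Build a GET URL asking the appliance for XML results.

    The parameter names below are assumptions, not confirmed by this post.
    """
    params = {
        "q": query,
        "site": collection,
        "client": frontend,
        "output": "xml_no_dtd",
    }
    return "http://" + MINI_HOST + "/search?" + urlencode(params)


def parse_results(xml_text):
    """Return a list of (url, title) tuples from an XML result document."""
    root = ET.fromstring(xml_text)
    hits = []
    # Assumed layout: each hit is an <R> element with <U> (URL) and <T> (title).
    for result in root.iter("R"):
        url = result.findtext("U", default="")
        title = result.findtext("T", default="")
        hits.append((url, title))
    return hits


# Invented sample response, used here instead of a live network call.
SAMPLE_XML = """<GSP VER="3.2">
  <Q>vacation policy</Q>
  <RES SN="1" EN="2">
    <R N="1"><U>http://intranet.example.com/hr/vacation.html</U><T>Vacation Policy</T></R>
    <R N="2"><U>http://intranet.example.com/hr/faq.pdf</U><T>HR FAQ</T></R>
  </RES>
</GSP>"""

if __name__ == "__main__":
    print(build_search_url("vacation policy"))
    for url, title in parse_results(SAMPLE_XML):
        print(title, "->", url)
```

Parsing is kept separate from fetching so the same function can be pointed at a saved response while you work out what your appliance actually returns.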
If you are used to the snappy response times of google.com, you may be in for a bit of a surprise, as the Mini is relatively slow to serve up search results (at least when compared to the Google mothership). Some queries can take up to a few seconds to return any results, although on average the response times are decent.
Shortcomings
The Mini virtually defines “security through obscurity”. I have yet to find any references to the actual hardware contained within the appliance itself and so I don’t know what sort of processor it is running on, nor how much memory the box has, nor hard drive size. Also missing from the provided documentation is any record of the Ethernet address of the box, meaning that, in order for me to get the Mini working properly in our DHCP-managed environment, I had to tail the logfiles on our dhcpd server and watch for new requests and then determine which one was coming from the Mini itself, then add the address to the relevant NIS tables. While this is a minor nit to pick, it represents a fairly inconvenient oversight on Google’s part.
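If you would rather not eyeball the log by hand, the hunt can be roughly automated. The Python sketch below assumes ISC dhcpd-style log lines of the form `DHCPDISCOVER from <mac> via <interface>`; the sample lines and the list of known addresses are invented for illustration, and `unknown_macs` is just a name I picked, not part of any tool.

```python
import re

# Matches the hardware address in lines such as
# "DHCPDISCOVER from 00:11:22:33:44:55 via eth0" or
# "DHCPREQUEST for 10.0.0.21 from 00:11:22:33:44:55 via eth0".
MAC_PATTERN = re.compile(
    r"DHCP(?:DISCOVER|REQUEST).*?\bfrom ((?:[0-9a-f]{2}:){5}[0-9a-f]{2})",
    re.IGNORECASE,
)


def unknown_macs(log_lines, known_macs):
    """Return MAC addresses in the log that are not already known, in first-seen order."""
    known = {mac.lower() for mac in known_macs}
    seen = []
    for line in log_lines:
        match = MAC_PATTERN.search(line)
        if not match:
            continue
        mac = match.group(1).lower()
        if mac not in known and mac not in seen:
            seen.append(mac)
    return seen


# Invented sample log lines and known-host list (assumptions for illustration).
SAMPLE_LOG = [
    "Mar  3 10:15:01 dhcp1 dhcpd: DHCPDISCOVER from 00:0d:56:aa:bb:01 via eth0",
    "Mar  3 10:15:02 dhcp1 dhcpd: DHCPREQUEST for 10.0.0.21 from 00:0D:56:AA:BB:01 via eth0",
    "Mar  3 10:16:40 dhcp1 dhcpd: DHCPDISCOVER from 00:11:22:33:44:55 via eth0: network 10.0.0.0/24: no free leases",
    "Mar  3 10:17:05 dhcp1 dhcpd: DHCPACK on 10.0.0.21 to 00:0d:56:aa:bb:01 via eth0",
]
KNOWN = ["00:0d:56:aa:bb:01"]

if __name__ == "__main__":
    for mac in unknown_macs(SAMPLE_LOG, KNOWN):
        print("unrecognized client:", mac)
```

Power-cycling the Mini while watching for newly reported addresses should narrow the candidates down to one.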
It also appears to be running at a reduced level of functionality, at least when compared with its more expensive Google Search Appliance cousins. As previously noted, it doesn’t index images. It also doesn’t allow for document-level security, meaning that if you only want to expose certain portions of your search results to a specific group of users, you must place a proxy server in front of the Mini and handle access to the different sub-collections via the proxy server. Lastly, while 100,000 documents may seem like a lot, I think that any organization larger than 150 people or so would quickly run into that document cap. Of course, when you start getting larger than that, the $15,000 price tag for a larger Google Search Appliance probably doesn’t present as much of a hurdle as it does to smaller organizations like ours.
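To give a feel for what the proxy workaround involves, here is a rough Python sketch of the decision such a proxy would have to make for each incoming search. Everything in it is hypothetical: the collection names, the group names, and the assumption that the collection is named in a `site` query parameter. It covers only the allow/deny check, not the proxying itself or how users are authenticated.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical mapping of collection name -> groups allowed to search it.
COLLECTION_ACL = {
    "public_site": {"everyone"},
    "engineering": {"engineers"},
    "hr_docs": {"hr", "managers"},
}


def requested_collection(url):
    """Pull the collection name out of the (assumed) 'site' query parameter."""
    query = parse_qs(urlparse(url).query)
    values = query.get("site", [])
    return values[0] if values else None


def is_allowed(url, user_groups):
    """Decide whether a user belonging to user_groups may run the search in url."""
    collection = requested_collection(url)
    if collection is None:
        return False  # refuse queries that do not name a collection
    allowed = COLLECTION_ACL.get(collection)
    if allowed is None:
        return False  # unknown collection: deny by default
    if "everyone" in allowed:
        return True
    return bool(allowed & set(user_groups))


if __name__ == "__main__":
    print(is_allowed("/search?q=salary&site=hr_docs", ["engineers"]))
    print(is_allowed("/search?q=salary&site=hr_docs", ["hr"]))
```

Denying by default is deliberate: a query that names no collection, or one the proxy has never heard of, gets refused rather than passed through to the Mini.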
Summary
While we have only had the Mini on-site for a week or so, I have already had dozens of users offer up glowing comments on its utility. It truly is a powerful tool crammed into a small package and, though it suffers from some notable shortcomings, is well worth the $3,000 for a small-to-moderately-sized organization looking to get a handle on its web content. Overall, I would rate it an 8.5/10.
WordPress Theming Help
If you’re a WordPress user in need of a new look for your blog and haven’t checked out Urban Giraffe’s WordPress Theme Dissection (parts two and three are also available), then you’re missing out on some valuable advice.
If you’re plum out of ideas, hit those tutorials and then head over to OSWD and pull down some inspiration.
Grrrrr.
This is ridiculous. I’ve been looking for a multi-user, multi-platform calendaring system for a long, long time and have yet to find a decent solution. How stinking hard can it be?
I don’t care if it’s Open Source or commercial – I want something that:
- Runs and syncs on Solaris, Mac OSX, Linux and Windows
- Rich client (read: not web-based) on each of the above-mentioned operating systems
- Allows for server-side configurable reminders
- Allows for reservation of resources, such as conference rooms, overhead projectors, loaner laptops, etc.
- Allows for groups of users in order to easily schedule large-scale meetings
- Allows for proxies, so administrative assistants can set up meetings for their bosses
And really, I don’t need any more than that. I don’t need a “groupware” solution. We already have task tracking, code repositories, file sharing and webmail. Oracle Calendaring is blatant overkill, Meetingmaker stinks to high heaven and won’t do server-side reminders, Open Exchange simply won’t work for me, OpenGroupware (egah!) is, once again, overkill, and Hula Server looks nice but isn’t much more than an email server at the moment.
*sigh*
It can’t be difficult, can it? Why doesn’t someone write a SOAP-based server that would let anything hit it? Why must we always involve the ever-so-vile WebDAV?
Gah. I’m beyond frustrated.
Self-Hosting Web Developers Take Note: XAMPP Is Heaven-Sent
Over the past 5 years or so, a term has arisen in the web-savvy world: LAMP-powered. In the generic case, LAMP stands for “Linux – Apache – MySQL – PHP” (the “P” can also stand for Perl). This combination of operating system, webserver, database and programming language[s] allows for an incredible range of dynamic web content and incredible flexibility for users and developers alike.
Most, if not all, of the major Linux distributions make installing and setting up Apache, MySQL and PHP/Perl extremely easy, either at OS install time or at a later date. Those on Macintosh, Solaris or Windows-based systems, however, often have a rougher go of it. It’s often a relatively aggravating task to track down the relevant packages and get everything installed, especially for Mac and Win users (Solaris users can always fall back on SunFreeWare or Blastwave for prefab packages). If you’re looking to host dynamic content on a *AMP/P platform on one of those systems, you’re in for a rough road.
This is where XAMPP comes in. We’ve been over the “AMP/P” portion before, and, as any computer programmer will tell you, “X” tends to make a good stand-in variable in just about any situation. In this case, the X replaces “Mac OSX”, “Windows”, “Solaris” or “Linux” – XAMPP is an Apache/MySQL/PHP/Perl distribution in a single package for each of those OSes. It’s incredibly slick and, in my case, saved me several hours of labor in the past week. I was enlisted to install an AMP setup on one of the Windows machines at my wife’s office in order to allow users in her office to generate reports, catalog information, track inventory, etc. and instead of having to hunt down and configure Windows distributions of each tool, I simply installed one instance of XAMPP. I was up and running within about five minutes.
If you have any interest in serving up dynamic content easily and quickly, with a minimum of administrative and installation hassle, give XAMPP a look. Heck, if you’re tied to Windows but hosting your website on a Linux server, you could even use your ‘Doze box as a testbed for your designs and applications before deploying them to your hosted environment. It almost makes the whole process too easy.
Side note: For those of you looking for a good Windows-based PHP IDE, give Maguma Open Studio a look. They recently open-sourced it and moved it to SourceForge. Hopefully that means that a Linux version is forthcoming, as Open Studio is an excellent IDE for PHP-based projects.
Snifty Free Font Blog
If you’re a graphic/web designer, Fontleech may interest you. They’re trying to blog/maintain a list of freely available fonts and typesets out there.
Looks like they’re off to a good start.
Web Designers, Your Book Has Arrived
Any web developer worth their salt these days has familiarized themselves with CSS, which adds great depth and complexity to the site designs and layouts possible in modern browsers. Those same developers may well be familiar with the CSS Zen Garden, which serves as an incredible example of the flexibility and power offered by CSS.
What they might not know is that Dave Shea, creator of the Garden, has written a book that details the many ways of using CSS to create beautiful yet standards-compliant websites.
I think I’ll be picking this one up as a reference for work…