When I arrived in the San Francisco airport last week, I saw a large banner for GoToMeeting over the terminal entrance. Citrix is a partner/competitor (yes, both) to Lou’s company, so I pointed the banner out with amusement. As we walked towards the AirTrain, Oracle and McAfee logos greeted us as well.

It was a swift, albeit strange, reminder that SF is a hub for technology. Oracle’s appearance is what put us in ‘strange’ territory: I often think the Bay Area is more for young startups preferring free MySQL and Postgres. In my mind, New York is the Oracle market, skewing towards mature, established companies with the need and money for enterprise-level infrastructure, administrators, and licenses. I realize that is probably why O is advertising in SF and not in New York (SF needs the convincing), but I was still surprised to see that red logo so quickly. I would have expected the Blinkx.com video search billboard on Rt 80 in its place.

My observations there aside, there was a particular SFO banner stating the following:

“10 of the Top Ten Hotels use Oracle.”

I’m sure this is true, but I wonder why Oracle didn’t say ‘exclusively’ or ‘for critical systems’. It was probably a marketing decision, based on length, space, and psychology, but statements of this type are definitely a little grey. What if the 10th hotel uses Sybase*, with only a single application that they bought externally lying atop an Oracle engine?

I was once in a position to choose my department’s survey software. Our current vendor was small, Linux-based, and clunky, with questionable amounts of revenue, but a solid client list and the ability to script. We could upgrade to get what we needed, or we could go elsewhere to a better, shinier application. I spoke to Vovici, an industry leader, who had an asp.net-based web interface and millions of dollars of contracts. Their software was slick, easy, and fulfilled our requirements, but would not let us do any customization. It was also cheaper.

At one point, in one of our calls, my Vovici representative asked what we currently used. When I told him, he said he had never heard of it and went to their site to check them out, then noted his surprise at seeing large company X on their client list. Company X already did millions of dollars of work with Vovici. He noted that X was so big that perhaps some random department gave them a survey once or twice.

It seems likely. I don’t think our vendor was lying. I don’t think Vovici was lying. It’s possible that our vendor had no idea X did business with other survey companies all the time. If they asked, that department in X might not know or be willing to say anyway.

But it is misleading. Any time some company lists only a client list, or a number of clients, with no qualifications like exclusivity, contract size, relationship length, etc., it’s like a list of Facebook friends. People just collect them to have them. You have no idea if they were bar acquaintances or Top Ten.

This is obvious, of course. You always take such things with a grain of salt. My real point is that I don’t see malice or even a mistake in this case. What could our scrappy Linux vendor do? A qualified list that was arbitrarily shorter than their competitors?

We ended up sticking with them. I was pleased because I felt guilty for supporting an MS-based, non-customizable, corporate software over a Linux-based, small-biz one. Vovici was a better product, but there were other factors (administration, maintenance, and infrastructure) outweighing it. As for the client list, it was only important in that we were already on it.

*It is telling I had to look up “enterprise database” to get Sybase. Oracle is the only one I could immediately think of.

Posted by sitarah, filed under Uncategorized. Date: June 26, 2008, 7:09 pm | No Comments »

03  Jun
MySQL passwords

I set up a friend’s account on my server last night. It was the first step towards providing my acquaintances with free, unrestricted web hosting. Slashdot tells me this is a bad idea.

It turned out that, despite having dabbled in MUD development and his own server, he didn’t know Unix commands (just err, “DOS”). I’ll refrain from comment because said friend knows I write here, but I was surprised to find myself explaining the basics. Luckily, we were just installing wordpress, so his interaction with the command line was minimal and will probably be nil in the future.

I went through ls, ls -l, rm, rmdir (no rm -rf; too complicated to explain), permissions (but not directory permissions), mkdir, cd, pico, and relative paths. He got putty and WinSCP. (I like ttssh better but I thought the installation of the ssh piece was one extra step to an already overfull process.) I also showed him the wonder of wget and the despairs of case sensitivity.

Despite all that, MySQL actually provided the most troublesome step. I set him up with a strong MySQL password but still wanted him to change it himself. To my amazement, there is no way to do this in phpmyadmin. Apparently, you can do it in cpanel, but we don’t use that due to its poor updating system. The web would lead me to believe my only alternatives are the following:

  • Change it as root (then I still know the pw, and I don’t want to.)
  • Use sql to update the user table (only the user knows pw but he has access to user table and everyone else’s.)
  • Use the mysql command to do it, but that requires the command line and isn’t appropriate for my friend described above.
  • Delete the user and remake him with new pw. (I still know the password. As an aside, wth? That isn’t a valid suggestion. I’m looking at you)

It seems like a simple enough addition to phpmyadmin. Write a script that accepts the user’s current password, validate against the user table, then ask for the new pw, and run the sql to update it. I can, and probably will, add it myself, but why hasn’t it been added already? It makes me wonder if I am missing something, like the link clearly shown in this picture.
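
As a rough illustration of that flow, here is a minimal Python sketch. It is not phpmyadmin's code. The `connect` argument is a hypothetical callable that takes a user name and password, returns a DB-API style connection, and raises an exception on bad credentials, so logging in with the current password doubles as the validation step. The sketch also assumes an older MySQL server where `SET PASSWORD = PASSWORD(...)` changes the password of the logged-in account; newer servers use a different statement.

```python
def change_own_password(connect, user, current_pw, new_pw):
    """Let a user change their own MySQL password.

    `connect` is a hypothetical callable (user, password) -> DB-API connection
    that raises an exception when the credentials are wrong.
    """
    if not new_pw:
        raise ValueError("new password must not be empty")

    # Step 1: logging in as the user validates the current password.
    conn = connect(user, current_pw)
    try:
        cur = conn.cursor()
        # Step 2: without a FOR clause, SET PASSWORD only touches the
        # logged-in account, so the user never needs the mysql.user table.
        # Assumption: the server still supports the PASSWORD() function.
        cur.execute("SET PASSWORD = PASSWORD(%s)", (new_pw,))
        conn.commit()
    finally:
        conn.close()
```

With this shape, root never sees the new password and the user never gets access to anyone else's row in the user table.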

Granted, people don’t change their database passwords very often, if ever, but the option to give the user a gibberish pw generated by phpmyadmin seems to agree with the concept of ‘user changes it to something meaningful afterwards.’

I have a vague memory from cpanel that there was some manager like mysqllite that handled this sort of thing, but I can’t find it.

Posted by sitarah, filed under Uncategorized. Date: June 3, 2008, 8:54 am | No Comments »

31  May
Google favicon

The Google favicon (the little picture next to the browser url) changed from the uppercase G to the lower case g. I’m noticing it is a little similar to a figure 8/infinity sign, but if that was intentional, I think they’re out of luck. My first thought is Apple’s Infinite Loop street.

I’m not sure I like it. I usually equate cursive writing with archaic, fussy, and classy, whereas the G in a square was simple, much like the Google service itself. I think it is emphasized by the blue font — the icon-view of fonts in Windows is a single, fancy letter, usually in blue.

Posted by sitarah, filed under Uncategorized. Date: May 31, 2008, 12:27 am | No Comments »

I discovered 2 months ago that a large site I work on had not been indexed by Google for A Long Time. When I say not indexed, I mean that, when I search for this site, pages on a certain subdomain had a Cached date from many months ago.
I’ve learned a lot while trying to fix this problem, so I thought I’d share it here.

If your cached date is significantly in the past, then one of two things is happening:

  1. Google is not crawling
  2. Google is crawling but not indexing

Google is not crawling

First, check your robots.txt. If you have it, it could have an error in it. Even if you do not have it, it could still be your problem.

Sign up for Google Webmaster Tools, verify your site, and see if Google reports any errors in Tools > Analyze robots.txt.

If it reports errors, then your path is obvious. If it does not, then look at the Status.

Network unreachable: robots.txt unreachable
Your ISP may be blocking the Googlebot. It can make so many crawling requests that an ISP may mistake them for ‘abuse’. Talk to your ISP or system administrator to see if they are blocking IPs in the 66.249. range. (You should research other IP blocks for the Googlebot, as they could change any time.)

If your ISP swears this isn’t the case, but you don’t see the Googlebot in your server logs, keep asking them until they’re sure and extremely sick of you. These logs can be found in /var/log/httpd on an Apache system.
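
If you would rather not eyeball the logs, a small Python sketch like the one below can pull out the lines that look like Googlebot. It assumes an access log where the client IP comes first on each line, it uses the 66.249. prefix mentioned above (which may be out of date), and it treats a "Googlebot" user agent as a hint only, since user agents can be faked. The function name and the log file name in the usage comment are my own, not anything Apache or Google defines.

```python
def googlebot_hits(log_lines, ip_prefix="66.249."):
    """Return access-log lines that look like Googlebot requests.

    A line matches if it starts with the IP prefix or mentions "Googlebot"
    anywhere (normally in the user-agent field). Both checks are heuristics.
    """
    hits = []
    for line in log_lines:
        if line.startswith(ip_prefix) or "googlebot" in line.lower():
            hits.append(line.rstrip("\n"))
    return hits


# Example usage (the exact file name under /var/log/httpd is an assumption):
# with open("/var/log/httpd/access_log") as fh:
#     for hit in googlebot_hits(fh):
#         print(hit)
```

An empty result over a long stretch of log suggests the requests are being blocked before they reach your server.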

Please check back later.
If Webmaster Tools > Diagnostics > Web Crawl reports 0 errors as well, but you have URLs timing out, look to see if it is the robots.txt timing out. If you see this and/or your httpd logs report the Googlebot is hitting your server, then the Googlebot is not blocked, but it can’t access robots.txt.

When Google requests robots.txt, it can only understand two things:

  • found and read
  • not found

You may think you are covered if you do not have a robots.txt. That counts as not found, right? Only if it returns a 404 header.

If you have access to your *nix server’s command line, use wget to fetch the website in question. wget will access the website and let you see the information that the browser normally gets. It will then save the page’s content into a file.

wget www.yoursite.com/robots.txt
--13:59:51-- http://www.yoursite.com/robots.txt
Resolving www.yoursite.com...
Connecting to www.yoursite.com| |:80... connected.
HTTP request sent, awaiting response... 404 Not Found

A 404 Not Found means the file is not there. This page will either say 404 on it (you’ve seen them before, trust me) or it will be a custom error page with your branding and a friendly message.

A 404 is an appropriate response for a Googlebot looking for robots.txt. Strangely, my site did NOT return a 404 in this case. See below.

wget www.yoursite.com/robots.txt
--13:54:13-- www.yoursite.com
Resolving www.yoursite.com...
Connecting to www.yoursite.com| |:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: www.yoursite.com/error.html?errorpage=robots.txt
[following]
--13:54:14-- www.yoursite.com/error.html
Reusing existing connection to www.yoursite.com:80
HTTP request sent, awaiting response... 200 OK

A 302 is a temporary redirect, as denoted by the “Moved Temporarily” response. For instance, let’s say you have a store which is closed for a 2-week inventory process. You might use a temporary redirect to point www.yoursite.com/store to a message about the 2-week unavailability. After 2 weeks, you would remove it.

If your site has moved permanently, then you should use a 301 Permanent Redirect. The best example is when you move your site from www.domain.com/yoursite to www.yoursite.com. You want anyone going to www.domain.com/yoursite to be automatically transferred to its new home at www.yoursite.com for the foreseeable future.

Google treats 302s and 301s very differently, as you’ll see here:

302 (Moved temporarily) The server is currently responding to the request with a page from a different location, but the requestor should continue to use the original location for future requests. This code is similar to a 301 in that for a GET or HEAD request, it automatically forwards the requestor to a different location, but you shouldn’t use it to tell the Googlebot that a page or site has moved because Googlebot will continue to crawl and index the original location.

In my example above, the use of a 302 redirect to an error page that doesn’t even return a 404 is telling Google:
“Go here to find robots.txt, except it’s not here, but I say it is a success (200). By the way, don’t index it, either — use the old page (302).” This is an entirely unacceptable response. It doesn’t meet the criteria stated above:

  • found and read
  • not found

The robots.txt file was not found, but there was no 404 returned. A 200 was returned, but this wasn’t the robots.txt, and the 302 means that Google should ignore that page anyway. The end result: every request for robots.txt ‘times out’ because the Googlebot can’t understand whether it has permission to index the site. It crawls, but it doesn’t save, and the site’s cache date sits for months. It turns out that our switch to this 302 redirect system was exactly timed with when Google stopped indexing.
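
The same check can be scripted. The Python sketch below is only my reading of the behavior described in this post, not a statement of how Google actually decides: it records every status code on the way to robots.txt without silently following redirects, then sorts the chain into "found", "not found", or "ambiguous". The function names and the choice to accept only 301 hops are assumptions of mine.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECTS = {301, 302, 303, 307, 308}


def status_chain(url, max_hops=5):
    """Fetch `url` hop by hop and return every HTTP status seen."""
    statuses = []
    for _ in range(max_hops):
        parts = urlsplit(url)
        conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                    else http.client.HTTPConnection)
        conn = conn_cls(parts.netloc, timeout=10)
        try:
            path = parts.path or "/"
            if parts.query:
                path += "?" + parts.query
            conn.request("GET", path)
            resp = conn.getresponse()
            status = resp.status
            location = resp.getheader("Location")
        finally:
            conn.close()
        statuses.append(status)
        if status in REDIRECTS and location:
            url = urljoin(url, location)  # follow the redirect ourselves
        else:
            break
    return statuses


def classify_robots_chain(statuses):
    """Map a status chain onto the two outcomes this post describes."""
    if not statuses:
        return "ambiguous"
    *hops, final = statuses
    if any(hop != 301 for hop in hops):
        return "ambiguous"  # e.g. a 302 to an error page that says 200
    if final == 200:
        return "found"
    if final == 404:
        return "not found"
    return "ambiguous"


# Example usage (needs network access):
# print(classify_robots_chain(status_chain("http://www.yoursite.com/robots.txt")))
```

Run against the broken setup above, the chain would be a 302 followed by a 200, which lands in "ambiguous".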

Your options are:

  1. Put up a blank robots.txt. This will count as found and read.
  2. Put up a non-blank robots.txt with whatever criteria you want. This will count as found and read.
  3. Change your error system to return a 404. You can still have a custom error page that returns a 404 rather than a 200.
  4. Change the redirect to be a 301. Google does understand a 301 pointing to a robots.txt, or a 301 pointing to a 404.
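
To make option 3 concrete, here is a toy Python server that shows a friendly, branded error page while still sending a 404 status. It is a sketch only: your real fix belongs in whatever web server or application produces your error pages, and the page contents and names below are made up for illustration.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical site content and error page, for illustration only.
PAGES = {"/": b"<html><body>Home</body></html>"}
ERROR_PAGE = b"<html><body><h1>Sorry, we could not find that page.</h1></body></html>"


def build_response(path):
    """Return (status, body): known pages get 200, everything else a 404."""
    if path in PAGES:
        return 200, PAGES[path]
    # The body is still the friendly page; only the status code says "missing".
    return 404, ERROR_PAGE


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = build_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


# To try it locally:
# from http.server import HTTPServer
# HTTPServer(("127.0.0.1", 8000), Handler).serve_forever()
```

A request for /robots.txt against this server gets the custom page with a 404 header and no redirect, which is the "not found" answer the Googlebot can understand.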

As an aside, I would recommend, for pagerank purposes, that you be very discriminating in your use of 301 redirects vs 302 redirects. They are not interchangeable, particularly when Google has specifically stated that they will only index the original content, not the redirected content, for a 302.

I didn’t mention Google Sitemaps above on purpose. If you have the described robots.txt problem, creating a Sitemap with urls to crawl will not help you. You will see an OK status, but 0 urls crawled and 0 indexed.

Posted by sitarah, filed under Uncategorized. Date: May 31, 2008, 12:21 am | No Comments »

Google is hosting popular ajax frameworks/libraries now to speed up the user experience. It makes perfect sense, but I wonder how many popular applications will use it. The package wouldn’t function if the computer is offline.

Let’s say you are working in an offline development environment — you can’t effectively test. I can also see the small possibility that a developer works in a corporate environment where most webpages are blocked. Maybe they blocked everything on google but www.google.com to prevent employees from using gmail, google docs, and google images to share proprietary information. This means the developer has to download the correct version of the framework and change the include line to refer to it locally. The solution is easy, but it could be a real pain to find that bug. Google gears wouldn’t help since the file was never accessible in the first place.

While this seems like a really unlikely case, if you have software that caters to a traditionally strict corporate audience, you’ll either have to offer two packages (local include, Google include) or just go with the local include. The local include is always the safer choice.

There are also some fine points made on slashdot that this is an excellent way for Google to track ajax usage.

Posted by sitarah, filed under Uncategorized. Date: May 31, 2008, 12:05 am | No Comments »

Yesterday, I overheard a coworker say to my boss “Hey, did you get Firefox 3 yet??”. This coworker is not techy at all and still uses a mix of IE and Firefox at work. I think it is kind of cool he was aware of FF, but I am fairly sure he just saw the Firefox Downloads Try to Break Guinness World Record headline that morning, so it was top of mind.

I don’t care if that’s why, really. Yes, this is a publicity stunt, but it is mostly harmless, and it seems to be working. I love my little heat squirrel, and if more people can find solace in his furry and fiery embrace, excellent. I do hope, though, that FF3 has fixed some of the memory leaks. Yes, yes, I have tuned the settings (all those crazy things with caching and whatnot), and it just never helped. I’m not complaining for myself. I’m fine with Firefox 2. It rarely crashes, and I will never go back to IE, but I fear that less loyal people will get a Firefox 3 that is ablaze for all the wrong reasons and turn to the dark side forever.

If FF crashes and hogs RAM while IE doesn’t, then in some ways, it isn’t the dark side, but the right side, now that IE7 has tabs. The casual user wants it to just work. Tabbed browsing was an awesome advantage over IE before, but now that they are equalized there, the only differences are rendering, performance, and addons. The casual user doesn’t care that IE ignores various web standards. They assume the page is broken. They will not be so cavalier about crashes.

To rebound from that depressing thought, I am intrigued by the notion of Download/BYOBrowser parties. I’ve suggested to Louis that we hold one, bringing the liquor while our friends bring their laptops. He only nervously laughed when I suggested we provide beer, then offer to open the hard stuff once the assembled group has solved our centos SMTP problem. I’m pretty sure it’d solve the issue, and I think paying them with vodka is, in fact, more ethical than asking friends to help for free.

Friends don’t let friends give free advice, right?

Posted by sitarah, filed under Uncategorized. Date: May 30, 2008, 11:56 pm | No Comments »

I currently use twitter for personal messages — where I am, what I am eating, etc. I admit, it is half for me and half for my friends.

I know of a person who uses it exclusively for technical updates instead: “can now update templates in X without validation errors”, “figured out a way to use CURL to do Y”, “switched to RC2 for Z”. When I think about it, this is extremely useful. Here is a real life example: Louis can’t figure out why he can’t properly configure the SMTP daemon on centos*. Imagine if he could search the tweets/feeds of his friends for keywords like email, SMTP, or centos and see if anyone he knows or his friends know has dealt with this issue or a related field like SMTP on *nix. While a little Googling should come before harassment, when you’re really stuck, it’d be nice to have an option between Google and a post to a random forum.

I know some tech people despise trendy things like twitter, but maybe the jaded could be convinced to use a tech twitter clone when it is repackaged as something to:

  1. inform your resume. You have 300 140-char msgs about all the little stuff you wrestled with that you forgot about.
  2. help you find friendly, trusted advice when you are stuck
  3. show everyone how l33t you are because you work with so many different tools/apps/frameworks/languages
  4. help employers recruit full-time or ‘holy crap, this is broken’ consultants

I’m sure it’s already out there somewhere.

*If you’ve done this successfully, please ping me!

Posted by sitarah, filed under Uncategorized. Date: May 30, 2008, 11:45 pm | No Comments »

Until I decided to redesign this site, I spent a lot of time uploading and tagging my ~2,000 photos on Flickr. It wasn’t always this way. I used to have Menalto Gallery locally installed on my server instead, because I tend to err on the side of controlling my own data. I was a little derisive of Flickr, to be honest. There was a lot of hype, and it was just a photo site. Whatever.

This all changed when I went to my job interview for my current position. I met a few people at the time, and one of them I instantly liked. She looked a little too old to be on board the web culture clue train, but she asked me great questions about my work with Google maps, forums, and then, to my surprise, Flickr. She suggested, essentially, that it was a great way to virtually visit a place. I don’t know how she meant it, but that’s how I heard it, and I realized that it was true.

I put this to the test, searching for Stokes, a favorite State Park of mine. To my wonder, some of the photos were familiar — not because I recognized the general scenery, but because I suspected I, too, had taken pictures of that same exact mossy tree. It was strange to think that a handful of people, who I would never know, had looked upon a particular spot and reacted in the same way I did.

I saw, too, that my interviewer was right — I could indeed get a sense of a place with the right search. It wasn’t fair, thorough, or timely, but it was there. More importantly, my thousands of pictures could improve it. Flickr could be more than just a vanity site or a dumping ground a la photobucket, but an archivist and repository. It was, in some way, actually altruistic. At that moment, I was sold, and my account reflects this realization, going beyond ‘pretty’ and ‘professional’ to include the far less glamorous ‘historic’. They answer questions: this is what you get at the local diner, this is the stage size for the King of Prussia Irishfest, and this is the costume you should expect to see at a Ren Faire*.

I once had a Googler friend who tried to sell me on Picasa. I didn’t like Google’s uploader at the time, but the main reason I gave him had to do with the above. Not enough people search Picasa like they do Flickr, and it’s not about the features *I* like best (Smugmug anyone?), but where my information can get the most exposure. I realize this is a catch-22 (refusing to use a less popular site because it is less popular), but he seemed to understand my point. (I see Google has caught onto this by introducing user content and pictures into Google Maps, and I’d be completely onboard Panoramio by now if someone hadn’t already taken my username.)

This may seem like a loveletter to my Lady of One Vowel, but it is not. As much as I enjoy and believe in Flickr, I can’t make it work corporately. On the surface, it is very attractive to companies — a wealth of categorized information on your brand or activities relating to your brand. Do a search, throw the results in a badge or collage, and voila! Web 2.0 shiny.

Until…

  • Your Washington Monument badge gets the randomized photo of a stack of beer cans, impressively shaped like a narrow tower.
  • Your Monmouth Food Court collage returns a volley ball court barbecue in Monmouth, New Jersey. (For example, Monmouth Food Court turned up a bizarre picture of a diseased leaf against an American flag as its 8th picture, along with a Barbados monkey, a Veterans memorial, and a tugboat.)
  • Mischievous teenagers find your Favorite Flickr Dog Contest and purposely vote a picture of a cat to the top.

Granted, you can avoid the last possibility by avoiding contests altogether, but don’t think this isn’t a real danger: Chevrolet discovered this when their user-made Tahoe commercials started including lines like “Yesterday’s Technology, Today.”
As for the other two, you can’t avoid them. You just can’t. “Monmouth food court” in quotes? Nope. No results. Search tags only? That returns pictures of a kid eating a hotdog. Monmouth foodcourt, tags or text? Only 1 picture.

This isn’t wrong behavior. It’s not malicious of someone to tag a photo of a child at Monmouth Food Court as such, even if you can’t see anything distinguishing that food court from Piscataway Food Court. It’s just a by-product of tagging. It introduces both order and chaos. Perhaps one day we will have a system of primary and secondary tags to indicate what is pictured, vs the related informative tags. Today is not that day. Until then, you cannot rely on Flickr search to return a dependable result set, in which dependable is defined as having an obvious picture of what you specified, consistently.

You could create your own group and moderate submissions, but that’s a lot of involvement. If you allow people to vote on pictures (thumbs up, thumbs down), the system could still be gamed with a concentrated effort. At best, you could restrict voting only to ‘bury’, which will make sure no one can purposely make a bad picture float to the top for long. However, even one viewing of the wrong picture can be one too many when you’re dealing with minors, religious groups, or a government agency. A consent form won’t solve this: even if Johnny’s parents can’t legally complain that he saw a bong in his “My Fave Sk8terz” widget, they can make a fuss about it publicly. Perhaps Johnny can screenshot it, too, handily posed with your precious brand name, and send it all around cyberspace with some funny captions.

Unless that is the PR you want to risk, Flickr won’t work for a company. I wish it could.

*As an exercise for the reader, imagine the picture that answers all these questions at once.

Posted by sitarah, filed under Uncategorized. Date: May 30, 2008, 11:38 pm | No Comments »

Most web analytics tools rely on javascript:document.referrer to get the ‘referrer’ for a page, or the page you were at just before you came to this one. Analysts will use this to see if a particular site is driving a lot of traffic via a link or an ad. The explosion of webmail clients means you can also tell when people are coming from an email. You’ll have a referrer from webmail.aol.com or mail.yahoo.com, etc. While best practices dictate that you don’t rely on referrers as your sole determinant of email success, you might try to classify your traffic into email, websites, search engines, and bookmarks, based on your referrals. Unfortunately, gmail causes some issues.
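
As a rough sketch of that classification, the Python below buckets a referrer string by its host name. The host lists are short, hypothetical examples (only webmail.aol.com and mail.yahoo.com come from this post), and an empty referrer is treated as direct/bookmarked traffic, which is where a missing referrer ends up.

```python
from urllib.parse import urlsplit

# Illustrative host lists only; a real report would need far more entries.
WEBMAIL_HOSTS = ("webmail.aol.com", "mail.yahoo.com", "mail.google.com")
SEARCH_HOSTS = ("www.google.com", "search.yahoo.com")


def classify_referrer(referrer):
    """Bucket a referrer URL into email, search engine, website, or direct."""
    if not referrer:
        return "direct/bookmark"  # no referrer at all
    host = urlsplit(referrer).hostname or ""
    if host in WEBMAIL_HOSTS:
        return "email"
    if host in SEARCH_HOSTS:
        return "search engine"
    return "website"
```

Anything that arrives with no referrer falls into the first bucket, no matter where the click really started.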

If I use Yahoo mail and click on a link, then paste javascript:document.referrer into the browser window, I see the yahoo address. If I use gmail and do the same, because the link opens in a new window, the referrer is null. If I use gmail but specifically open the link in another tab, the referrer is gmail.

If the users are more likely to click on the link and not open it in a new tab, then your Direct/bookmarked referrals will be overinflated and your email traffic too low because of the gmail loss. You should compare your referrals with your actual email metrics (you should be coding your email links with some parameter to track the clicks) to see what percent you might be losing. You’ll get a more accurate picture of your traffic sources.
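
That comparison is simple arithmetic. In the Python sketch below, both inputs are numbers you would pull from your own reports: clicks counted through the tracking parameter on your email links, and visits whose referrer was a webmail host. The function name is mine, and the gap it reports covers every email click that arrived without a webmail referrer, not only the gmail ones.

```python
def email_referrer_loss(tracked_email_clicks, email_referrer_visits):
    """Fraction of tracked email clicks that showed no webmail referrer."""
    if tracked_email_clicks <= 0:
        raise ValueError("tracked_email_clicks must be positive")
    missing = max(tracked_email_clicks - email_referrer_visits, 0)
    return missing / tracked_email_clicks


# Made-up numbers: 1000 tracked clicks, 700 visits with a webmail referrer.
# loss = email_referrer_loss(1000, 700)
# print(f"{loss:.0%} of email clicks are hiding in Direct/bookmarked")
```

Subtracting that share from your Direct/bookmarked bucket and adding it to email gives the corrected split.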

Posted by sitarah, filed under Uncategorized. Date: May 30, 2008, 11:22 pm | No Comments »

Wordpress has a new version, 2.5.1. I have been ignoring it because I expect it to break. Let’s play a game. What are the first words that come to mind when I say:

  • Upgrade
  • New Release
  • Launch

Bugs? Breakage? Patch? Feelings of fear, anxiety, and loathing? I know I am not alone in my automatic negative reaction.  I have been well-trained that the above processes never perform smoothly or completely for projects involving more than 2 people.

While it is easy to linger on how sad that is and what it says about the state of software, I’m sure that subject has been beaten to death in many arenas already. Perhaps at this point we just need to accept that the above is true. Yes, we will be upgrading. You’ll be unhappy — perhaps intensely frustrated — but it’ll be worth it in the future.

Just like dental exams, the driving test, the SATs, and puberty, let’s all agree there will be pain, but when the pain is over, you’ll be taller,  your teeth will be straighter, and you can drive your damn self to the mall.

(I’ll still put off upgrading. 2.3.3 works just fine, though I would like to be taller.)

Posted by sitarah, filed under Uncategorized. Date: May 30, 2008, 11:17 pm | No Comments »