January 2005 Archives

January 1, 2005

Acronym markup and Bobsleds - Jan 1, 2005

A few weeks ago, I had a discussion with Guy Leo of Galileo IMS, a friend of mine who does some work with the USBSF. We were discussing how to identify an acronym in a press release I was writing. Guy wanted me to do the traditional thing, which is to follow the acronym with the expanded phrase immediately after its first use in a document.

As a bit of an HTML prude, I think that's pretty FUBAR, or Fouled Up Beyond All Recognition.

I'd rather let the markup carry the expansion. The code looks like this, if you're interested:

<acronym title="HyperText Markup Language">HTML</acronym>

If you're really interested, this tag falls in the category of phrase elements. Phrase elements add structural information to text fragments, according to the documentation.

The other phrase elements are:

  • em
  • strong
  • cite
  • dfn
  • code
  • samp
  • kbd
  • var
  • abbr

The USBSF site is at http://www.usbsf.com/.

2005 Year of the blog - Jan 1, 2005

Dialog: a conversation between two or more persons; an exchange of ideas and opinions - from Webster's Seventh New Collegiate Dictionary, ca. 1965
The Internet has always been about access to ideas and information.


From its earliest beginnings, the Internet has been about sharing information. The earliest chunks of "Internet" allowed the mainframes at four different universities in the US to communicate with each other. If you had access to those machines, you could share resources of the other machines.

Years later the World Wide Web has made this ability to exchange information more accessible to a wider range of people. The increase in availability of bandwidth and inexpensive web hosting in the late 90's and early years of this century allowed businesses of all sizes to take advantage of the Internet to communicate with their customers and market their products and services.

Personal web sites have long been a part of the Internet. If you have ever seen an Apache configuration file, you know "personal" home pages have long been a part of the culture, at least if you're the kind of person, like me, who draws generalizations about culture from web server configuration files. Until recently they tended to be set up once and infrequently maintained. Some people who work on the Internet a lot would update their personal pages frequently; other personal pages would languish.

Blogs are the natural next-step in the evolution of the Internet culture

Blogs empower any individual who wants to say something. And not just those who have decided to create a blog. This blog and most others encourage comments from the readers. In this way everyone can have a voice in the dialog.

In the past year we have seen "blog" become the most looked-up word at Merriam-Webster OnLine (http://www.m-w.com/info/04words.htm), bloggers named the ABC News People of the Week, People of the Year (whatever that means; http://abcnews.go.com/WNT/PersonOfWeek/), and it is rumored they were in the running for Time’s person (sic) of the year. If there's any remaining doubt as to how much a part of our greater culture they have become, bloggers were accredited press at both major US political parties' conventions this past year.

I'm excited that this thing I have been fooling around with for a couple of years is getting more exposure, that there's a buzz around technology that I’m involved with. I hope that over the next year we'll see bloggers all over the world engaged in constructive, open, and free dialog. I wonder what something like that will bring.

January 17, 2005

Good reason to be camera shy - Jan 17, 2005

Monkey methods: Bill Gates Strikes a Pose for Teen Beat Photospread, 1983

Ordinarily this wouldn't merit a post in AdvisorBits because it's mostly humor. Be sure to read the comments; some of them had me in tears.

Things like this are the reason I don't let people take pictures of me very often. I don't really want to remember what I looked like in the early 80's that much.

January 18, 2005

And another thing... about SPAM - Jan 18, 2005

The nice folks at Movable Type have put together an excellent reference for fighting SPAM in MT weblog comments. The document is both informational and instructive. They explain in plain terms how SPAMMERs operate and get around some of the things we try to do. It's understandable these days when systems administrators get frustrated and just want to take the most drastic action possible; it's just not always a good thing. MT acknowledges this and fights against the urge by giving the pros and cons of various methods in a balanced way.

And most importantly, they provide specific advice with regard to MT blogs and comments. I had not followed all the advice, but now I have. The number of SPAM comments went from about 50 yesterday morning to only 3 this morning.

Movable Type Publishing Platform: Guide for Fighting Comment Spam

In the words of Gomer Pyle: "Thank you thank you thank you!"

I would like to add one related note to the discussion: something I noticed the other day that I needed to fix in a hurry before I got Google Dorked. (More on Google hacking and Google Dorks in a future post.)

Many web statistics programs track referers. Referers are explained in the MT article. As I read this section I remembered that I had recently noticed some SPAM URLs in my referer reports. I wondered about this at the time, but moved on.

As I was reading the article, it occurred to me that these references in my statistics may be another side effect they are hoping to gain. If I understand it correctly, there is a class of SPAMMER that merely wants the links to appear on a web page, so they can gain ranking in search engines. They never expect anyone to click the link off my blog or yours. They just want their site to rank well.

So back to the statistics page. I wondered about this, and sure enough, as I watched connections come into the server ( tail -f /var/log/httpd/access-log ), each time before a comment spam from this one source came in, it would hit the page it was commenting on. (And so getting double referer hits, I might add.)
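
For anyone who would rather not stare at tail output, here is a minimal sketch in Python of the same kind of check: it tallies the off-site referers found in access log lines. It assumes the server writes Apache's combined log format. The host names, the sample lines, and the function name external_referers are all made up for illustration; they are not taken from real logs.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Combined Log Format (assumed):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "[^"]*"$'
)


def external_referers(lines, own_host):
    """Count referers whose host is not own_host (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that are not in the assumed format
        referer = match.group("referer")
        if referer in ("", "-"):
            continue  # no referer was sent with this request
        if urlparse(referer).hostname == own_host:
            continue  # a link from our own pages, not interesting here
        counts[referer] += 1
    return counts


# Invented sample: one source hits the entry, then posts a comment to it.
SAMPLE = [
    '192.0.2.10 - - [18/Jan/2005:09:15:02 -0500] "GET /archives/000123.html HTTP/1.1" 200 5120 "http://spam.example/pills" "Mozilla/4.0"',
    '192.0.2.10 - - [18/Jan/2005:09:15:04 -0500] "POST /comments HTTP/1.1" 200 310 "http://spam.example/pills" "Mozilla/4.0"',
    '198.51.100.7 - - [18/Jan/2005:09:20:11 -0500] "GET /index.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [18/Jan/2005:09:21:40 -0500] "GET /archives/000123.html HTTP/1.1" 200 5120 "http://blog.example/index.html" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for referer, hits in external_referers(SAMPLE, "blog.example").most_common():
        print(hits, referer)
```

A referer that shows up in pairs like this, once for the page and once for the comment post, is the pattern described above. The sketch only reports; it leaves the log itself untouched.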

If Google got a hold of my statistics page, and saw that link, the SPAMMER would have scored their goal, and my stats page would be helping them. I put a stop to that.

Recommended additional step to fighting SPAM

Remove public access to your web stats, or at least block Google from indexing them. Most web hosts will allow you to password protect a directory. This is probably the most direct way to resolve the issue. Don't let anyone see those stats who doesn't have a reason to.

If you need to allow public access, at least hide the directory from Google. Use a robots.txt file to keep respectful spiders out of that part of your web site, and don't make links to your stats pages.
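
Here is a rough sketch of how you might sanity-check a robots.txt rule before relying on it, using Python's standard urllib.robotparser module. The /stats/ path, the blog.example host, and the helper name is_allowed are assumptions for illustration; substitute wherever your own statistics pages live. Keep in mind that robots.txt is only a request, so it stops the respectful spiders and nobody else.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask every crawler to stay out of /stats/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /stats/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://blog.example/stats/index.html",
        "http://blog.example/archives/000123.html",
    ):
        print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

Under these rules the stats page comes back as not allowed and the archive page as allowed, which is the split you want if the stats have to stay publicly reachable.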

January 24, 2005

Enigmatic, or Egomaniac? - Jan 24, 2005

PBS | Robert X Cringely Column - "Mini Me, The New Mini Mac is All About Movies"

33% of the people I know who are Mac users (musers?) have heralded the Mini Mac. That would be one person, if you're counting. Like most of the rest of the world, the 99% who use Intel PCs, I have kind of shrugged and said, "Yeah?"

Aside from Cringely's characterization of Jobs, I found the whole Sony-HDTV-Mini Me ... Mac Conspiracy he lays out vaguely plausible.

January 25, 2005

Your Referrers Are Showing - Jan 25, 2005

A couple days after I noticed the SPAM in my web statistics, the topic came up over at the Comment Spam Clearinghouse, a blog run by Jay Allen.

Jay was hired this fall by Six Apart, the publishers of MT, presumably because of his blackbelt in AntiSPAM, as demonstrated by the mission critical MT plug-in he wrote, MT-Blacklist.

Jay's post is about a script that scrapes the spammers out of your web server's access logs. I'm not sure that I agree with this; it strikes me as incorrect from the standpoint of wanting accurate statistics about how my web server is used. As I mentioned, I think it's more appropriate to restrict public access to those statistics pages. Besides, I can't stand to discard data. I'm a hopeless data packrat.

I have noticed a marked increase in comment SPAM since I posted the other entry. I think it must be coincidental, because to believe otherwise would be to think that one of my very few precious readers is a SPAMMER.

January 31, 2005

"THE" question? - Jan 31, 2005

InformationWeek > Weblogs > The Weblog Question > January 31, 2005

This article looks at who owns the content on the weblogs we all love to read. It seems reasonable to me that a blog which is published on the web servers of the company is the property of the company. In fact, it is this kind of thinking that made me an independent consultant. The author goes on to describe a number of cases where the ownership of content is ambiguous, and even a few where big corporations do not assert copyright over the blogs of their employees.

The article presents a well-rounded overview of some of the legal issues surrounding ownership, but it was this quote that I thought both my blogging and document imaging friends would appreciate.

"Forrester envisions a day when new employees on their first day will be handed a sheet of paper with their phone number, E-mail address--and a URL for their blog," analyst Charlene Li observed in the report. "That day is closer than you think."

Sheet of paper?