Category Archives: Web

Moin to the rescue

WebGUI looked good, but I couldn’t seem to get it working correctly. I tried a couple of other things, but none of those panned out, either. So I’ve decided to convert my main site back to Moin.

I used Moin on my main site back in the early 1.5-ish days and it worked rather well. I can’t remember why I changed, though it could well have been one of the many service provider moves I made. At any rate, Moin gives me the flexibility that I want, and I’m not terribly scared of Python if I want to get into some of the code that runs it.

I’ll eventually get my blog moved back there. While WordPress does blogs well, I’d rather keep everything together on one site. For some reason it gives me warm fuzzies.

WebGUI looks interesting

On a whim I decided to have a look around for a content management system that suits my needs. After a little poking around Freshmeat I came across WebGUI, which looks pretty interesting.

A couple of the obvious features I like are that it’s written in Perl and has a good security policy. Of course, TWiki is written in Perl and has a security alerts page, so that doesn’t guarantee much. WebGUI is shipped ready-to-go in binary form, and is ostensibly pre-configured, which I figure should save me from making too many foolish mistakes. Time will tell.

I’m going to give it a go on my main site soon. Photos don’t put themselves on-line!

Trying out OmniWeb

Tonight I remembered a piece of software I hadn’t looked at in a long time. OmniWeb has definitely come into its own since I last looked. It’s a lot faster (though that could be the difference between a 600MHz G3 and a 2.0GHz Core 2 Duo), and it seems to render pages looking a bit more like Safari or Firefox than before.

Another thing that’s changed is the price — just USD15 now. I remember it being a lot more expensive in the past. The 30-day trial will tell me if it’ll do the things I need it to do.

Intermediate storage for webserver logs

Last night I started trying to figure out how to store cumulative web data. It’s a harder problem than it looks, as most things are. The biggest challenge seems to be storing the data compactly.

Compressed raw logs take a lot of time to process. It’s not so bad if your site isn’t terribly busy or if you’ve got a lot of time on your hands. Granted, my site isn’t busy at all and I could wait a while for results, but I’m hoping someone else will find the program useful.

Intermediate storage is of course the key, but that’s where things get really tricky. It’s easy enough to store sums and averages of various numbers, but without storing stats for all 404s, how can you report on the top x of them? I’m half-way to an answer, but it’s probably not going to be pretty. It’s certainly not going to be elegant. But it’s also not going to be Webalizer, which isn’t worth trying to modify to suit my needs.
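
In case it helps explain the problem, here’s a rough Perl sketch of the direction I’m leaning: a crude bounded counter in the spirit of the Space-Saving algorithm. The cap of 1,000 tracked paths is an arbitrary number for illustration, it assumes common log format, and it is not the actual program:

    #!/usr/bin/perl
    # Rough sketch: track the top-N 404 paths without storing every one.
    # Assumes common log format on stdin; $MAX_TRACKED is arbitrary.
    use strict;
    use warnings;

    my $MAX_TRACKED = 1000;   # cap on distinct paths kept in memory
    my %count;

    while (my $line = <>) {
        # e.g. ... "GET /missing.html HTTP/1.0" 404 321
        next unless $line =~ m{"\w+ (\S+) [^"]*" 404 };
        my $path = $1;

        if (exists $count{$path} or keys(%count) < $MAX_TRACKED) {
            $count{$path}++;
        }
        else {
            # Evict the current minimum and inherit its count so a
            # newly hot path can still climb into the tracked set.
            my ($min) = sort { $count{$a} <=> $count{$b} } keys %count;
            my $floor = delete $count{$min};
            $count{$path} = $floor + 1;
        }
    }

    # Report the top ten survivors.
    my @top = (sort { $count{$b} <=> $count{$a} } keys %count)[0 .. 9];
    printf "%6d %s\n", $count{$_}, $_ for grep { defined } @top;

The counts near the cap can be over-estimates (that’s the price of evicting), but for a “top x” report that’s usually an acceptable trade.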

Another idea for a nifty tool is a web log playback tool: given an access_log, it replays the log, making the same requests at the same time offsets as in the file. The requests wouldn’t come from all over the place (they’d originate from one machine), but it could still help debug some performance corner cases.
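
To make the idea concrete, here’s a minimal Perl sketch. It assumes common log format, replays GETs only, and points at a placeholder http://localhost; a real tool would need much more care:

    #!/usr/bin/perl
    # Rough sketch: replay an access_log, issuing each GET at the same
    # time offset as in the file. Assumes common log format; the target
    # host below is a placeholder.
    use strict;
    use warnings;
    use HTTP::Date qw(str2time);   # understands common-logfile timestamps
    use LWP::UserAgent;

    my $base = 'http://localhost';   # placeholder server to replay against
    my $ua   = LWP::UserAgent->new(agent => 'LogReplay/0.1');

    my $log_start;           # timestamp of the first request in the log
    my $wall_start = time;

    while (my $line = <>) {
        # e.g. 1.2.3.4 - - [03/Feb/1994:17:03:55 -0700] "GET /foo HTTP/1.0" ...
        next unless $line =~ m{\[([^\]]+)\] "GET (\S+)};
        my ($stamp, $path) = ($1, $2);

        my $when = str2time($stamp) or next;
        $log_start = $when unless defined $log_start;

        # Sleep until this request's offset from the start of the run.
        my $due = $wall_start + ($when - $log_start);
        sleep($due - time) if $due > time;

        $ua->get($base . $path);
    }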

Well said, Ricardo

Ricardo Signes says it better than I ever could. Especially about the not playing well with others, the lack of RSS (or any other syndication format) feeds, the annoying apps that assume people are complete idiots, the assumption that a message on Facebook is somehow equal to or better than e-mail, and what I personally view as the AOL-ification of Facebook.

I’m sure the folks in the social networking side of the office know I’m the loyal opposition.

Snow and server maintenance

Thanks to the snow that’s slowing down traffic, Robynne is running behind and I’m entertaining myself with server maintenance.

There have been only eight or nine thousand invalid login attempts in the past couple of weeks. That’s a big drop from the peak of over 110,000 in only three weeks! I only black-holed a couple of IPs for that today.

The new annoyance is people/programs that set their HTTP User-Agent header to random strings. Combine that with the tendency of these specific people/programs to access only data that has 404’d for a long while and you’ll find one more IP and a class C black-holed. I can understand changing one’s User-Agent to something different from what it actually is for privacy reasons, but cluttering my web stats output is just rude. Pick one and stick to it. I recommend “NotTelling/1.0”.
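
If I ever automate spotting them, something like this rough Perl sketch might do. It assumes combined log format, and the threshold of ten distinct agents per IP is arbitrary:

    #!/usr/bin/perl
    # Rough sketch: flag client IPs that rotate their User-Agent string.
    # Assumes combined log format; the threshold of ten is arbitrary.
    use strict;
    use warnings;

    my %agents;   # ip => { user-agent string => 1, ... }

    while (my $line = <>) {
        # combined format ends with: "referer" "user-agent"
        next unless $line =~ m{^(\S+) .* "([^"]*)"$};
        $agents{$1}{$2} = 1;
    }

    for my $ip (sort keys %agents) {
        my $n = keys %{ $agents{$ip} };
        print "$ip used $n distinct User-Agent strings\n" if $n > 10;
    }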

RSS or Atom?

In my usual manner of putting the cart before the horse, I’m trying to figure out a syndication format before I even have the blog navigation complete.

Atom is the new kid on the block and looks like a good bet, but RSS seems to be better supported in Perl. I haven’t looked at the situation on Linux yet (only OS X so far), but the Atom module doesn’t seem to want to build (at least not via CPAN).
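
For comparison, the RSS side really is this simple with XML::RSS from CPAN. The channel and item details below are placeholders, not my real site:

    #!/usr/bin/perl
    # Rough sketch of feed generation on the RSS side, via XML::RSS.
    # Channel and item details are placeholders.
    use strict;
    use warnings;
    use XML::RSS;

    my $rss = XML::RSS->new(version => '2.0');

    $rss->channel(
        title       => 'Example Blog',                    # placeholder
        link        => 'http://example.org/',             # placeholder
        description => 'Testing feed generation in Perl',
    );

    $rss->add_item(
        title       => 'RSS or Atom?',
        link        => 'http://example.org/rss-or-atom',  # placeholder
        description => 'Picking a syndication format first, as usual.',
    );

    print $rss->as_string;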