WebLog Analysis

Articles about analysing weblogs and what can be extracted therein.

AWFFull

The latest Production Release is v3.10.2, released on the 13th of December 2008: awffull-3.10.2.tar.gz

AWFFull - A Webalizer Fork, Full o' features!

AWFFull is a webserver log analysis tool. Mainly used to produce simple reports, it can also be used as the starting point for more detailed Web Analytics.

The output produced is a series of simple HTML pages and images, making it trivial to access the reports from practically any web browser. No additional configuration or set-up is necessary.

AWFFull does produce "at a glance" metrics to help identify problem areas within a site.

The released version includes simple segmentation, to move further from mere Reporting towards true Web Analytics.

AWFFull is a fork of the venerable Webalizer log analysis program.

So You Want to Learn Regular Expressions? Part 7: Examples: IP Addresses

The full list of Regular Expression Articles I've done:

In this article I'm going to take you a little through the method and madness of creating regular expressions for filtering or identifying IP Addresses and Ranges.

Why IP Addresses?

It will demonstrate and combine the concepts explored in previous articles, hopefully cast some illumination on the method of solving Regular Expression problems and, highly coincidentally, show how to filter your corporate network from your Google Analytics stats.
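As a rough illustration of where this is heading, here is a minimal Python sketch of an IP range filter. The 192.168.1.x range and the names `CORPORATE_IP` and `is_corporate` are assumptions made up for the example, not taken from the article, and the last octet is matched loosely (any one to three digits).

```python
import re

# Hypothetical corporate range: 192.168.1.0 to 192.168.1.255 (an assumption).
# Each dot is escaped so it matches a literal "." rather than "any character".
# "^" and "$" stop the pattern matching in the middle of a longer address.
CORPORATE_IP = re.compile(r"^192\.168\.1\.[0-9]{1,3}$")

def is_corporate(ip: str) -> bool:
    """Return True if ip looks like it falls inside the assumed range."""
    return CORPORATE_IP.search(ip) is not None

for ip in ("192.168.1.42", "192.168.10.42", "10.0.0.1"):
    print(ip, is_corporate(ip))
```

Without the escaped dots, "192x168y1z42" would slip through as well, which is exactly the sort of trap the earlier articles in the series warn about.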

So You Want to Learn Regular Expressions? Part 6: Errr... Or...

In the previous article, with our box of chocolates, we used a method for choosing between one or more of several, more or less random, characters: [abc]+ for example.
But a common task in any web analytics is to be able to choose between several different items and treat them identically,
e.g. Images: gif, jpg, png
or Pages: htm, html, cfm, php, asp and so on.

Or to put the first case pretty bluntly in English, we want "gif" or "jpg" or "png" at the end of a file name request.
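That English sentence translates almost word for word into a pattern. The following is a minimal Python sketch; the names `IMAGE_REQUEST` and `is_image` and the sample paths are invented for illustration.

```python
import re

# "gif" or "jpg" or "png" at the end of the requested file name.
# "|" is the "or"; the brackets group the choices; "$" pins them to the end.
IMAGE_REQUEST = re.compile(r"\.(gif|jpg|png)$")

def is_image(path: str) -> bool:
    """Return True if the request ends in one of the image extensions."""
    return IMAGE_REQUEST.search(path) is not None

for path in ("/images/logo.png", "/index.html", "/logo.png.html"):
    print(path, is_image(path))
```

The same shape would cover the Pages case by swapping the list for htm|html|cfm|php|asp.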

So You Want to Learn Regular Expressions? Part 5: Just Like a Box of Chocolates

"Just Like a Box of Chocolates"?? Yeah. Pretty cool analogy isn't it! Just wait!

I hope we're all familiar with the principle of being invited to pick a choccy from a box of Chocolates. Pick one and one only. But any one of the myriad of choices arrayed before you.

Well those clever Regular Expressions supply a tasty Box of Chocolates as well.
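Going by the follow-up article's mention of [abc]+, the box in question is the square-bracket character class: pick exactly one character from those listed. A small Python sketch, with the word "grey" and the name `is_grey` chosen purely for illustration:

```python
import re

def is_grey(word: str) -> bool:
    # [ae] is the box: exactly one character is picked, either "a" or "e".
    return re.fullmatch(r"gr[ae]y", word) is not None

for word in ("gray", "grey", "graey", "groy"):
    print(word, is_grey(word))
```

One chocolate only: "graey" fails because it tries to take two.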

So You Want to Learn Regular Expressions? Part 4: More Wildcards

In this instalment of this series on Regular Expressions, I'll expose a wee lie from part 2, and show how wildcards can be less wild. More controlled. And hence more useful.

A LIE????

Urm. Yes. Not to put too fine a point on it. A school/teaching progression type of lie. You see a ".*" construct isn't actually a wildcard. The asterisk is the wild card. All on its very own. Similarly the plus "+" in ".+".
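To see the asterisk and plus working on their own, without the dot, here is a small Python sketch. The helper name `matches` and the sample strings are assumptions for the example.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# "*" repeats whatever comes just before it, zero or more times.
print(matches(r"ab*c", "ac"))       # no b at all is acceptable
print(matches(r"ab*c", "abbbc"))    # so are several b's
# "+" repeats whatever comes just before it, one or more times.
print(matches(r"ab+c", "ac"))       # at least one b is required
# In ".*" the dot means "any character"; the asterisk does the repeating.
print(matches(r"a.*c", "a-123-c"))
```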

So You Want to Learn Regular Expressions? Part 3: Positioning

If you've been tracking the public discussion on Robbin Steif's blog regarding this series, you'll no doubt be aware that she was prompting me (in a really unsubtle fashion ;-) ) to explain the use of the "beginning" and "end" characters. ^ and $ respectively.

So that's what this episode in the series will be focusing on.
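As a quick taste of the two characters, here is a minimal Python sketch. The function names and the sample paths are made up for illustration.

```python
import re

def starts_with_blog(path: str) -> bool:
    # "^" pins the match to the beginning of the string.
    return re.search(r"^/blog/", path) is not None

def ends_with_html(path: str) -> bool:
    # "$" pins the match to the end; "\." means a literal dot.
    return re.search(r"\.html$", path) is not None

for path in ("/blog/post.html", "/old/blog/post.html", "/blog/post.html?page=2"):
    print(path, starts_with_blog(path), ends_with_html(path))
```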

So You Want to Learn Regular Expressions? Part 2: Wild Cards

In the previous article (So You Want to Learn Regular Expressions?) I hopefully managed to explain the underlying concept of using regular expressions via the analogy of a jail.

In this article we'll start to explore the use of wild cards: what they are, when to use them, and more importantly, when NOT to.

When NOT to use Wild Cards

When not to?

Yes. You see, a regular expression will usually have two implied wild cards.
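The teaser stops there, but one common reading is that an unanchored pattern can match anywhere in the text, as if it had a ".*" on either side. A minimal Python sketch under that assumption (the helper `found` and the sample path are hypothetical):

```python
import re

path = "/blog/index.html"

def found(pattern: str, text: str) -> bool:
    # re.search scans the whole string, so the pattern may match anywhere.
    return re.search(pattern, text) is not None

bare = found(r"index", path)          # no wild cards written out
explicit = found(r".*index.*", path)  # the "implied" wild cards made explicit
anchored = found(r"^index", path)     # "^" removes the leading implied one
print(bare, explicit, anchored)
```

Note that Python's `re.match` always starts at the beginning of the string, so `re.search` is the closer fit to how filter fields in analytics tools typically behave.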

So You Want to Learn Regular Expressions?

Perhaps you've been forcibly inducted into the Joy Of Regular Expressions through the use of tools like Google Analytics. Unfortunately, while perfectly correct, the Google Analytics help for Regular Expressions is brief and does not explain the why of when to use X vs Y.
Hopefully the following article will get you through the why. I’m going to assume you’ve had at least some exposure to using Regular Expressions already.

Web Analytics Association: Standard Metrics Definitions

(UPDATE: Having woes with the Captcha. Argh. Have disabled comments till I can get it fixed. Sorry.... Thanks Judah for the heads up!)
(UPDATE2: Captcha Woes Fixed. Now using a new improved "Math" Captcha. Apologies for the mess!)
(UPDATE3: Pulled item 20. Stephen Turner gently corrected me that timestamps are daylight savings independent. I should know that!)

The Web Analytics Association (WAA) has recently released a document of 26 Standard Definitions (PDF) to "... Promote Consistency across the ... Web Analytics Community"

They are by no means a complete list, but you can read various reactions around the Web Analytics industry via Avinash, Judah and/or Robbin. All have a slightly different view on things. And to the best of my hearsay knowledge, are all members of the WAA itself. I'm personally not a member, more due to slack/lazy inertia than a deliberate conscious decision to not join.

I finally had some time today to read the document in detail. As is natural with any new document, there are issues, minor or not. So being a good little bug reporter type, I thought I'd write 'em down. And then email the list through. But the list seemed to grow just a tad, and I figured that perhaps these concerns and issues could benefit from a public airing. Or at least, that I could be shot down publicly... Foot-in-mouth, just deserts et al. Pick one. Or three.

Notwithstanding the below (hopefully constructive) criticisms, this is a pretty good document! Really! It's way past time we did have a standard on what these terms mean in this industry.

Now I have had all too much experience with writing policy documents, and was even Defence's representative on a Standards Australia committee. (Which sounds way more glamorous than it is. What's that? It doesn't sound glamorous? That's what I just said!) So please forgive the anal retentiveness of what follows. I do mean this for the best! No particular order, though I have tended to start from the beginning of the document and gradually work through to the end.

Using AWFFull: Pages - Hits Percentage

You've possibly seen it (Hits%) on the Daily Report and wondered what it was good for and why Steve bothered putting it in.
