Monitoring requests
You can record all the requests to a website in a file or in the database to spot an invading robot or display statistics like the total number of visitors or the 10 most consulted pages.
The configuration parameters are defined in the file config.inc.
$log_dir = ROOT_DIR . DIRECTORY_SEPARATOR . 'log';
global $track_db, $track_log;
global $track_visitor, $track_visitor_agent;
global $track_agent_blacklist;
$track_db=false;
$track_log=false; // true, file name or false
$track_visitor=false;
$track_visitor_agent=false;
$track_agent_blacklist=false; // false or array of agent signatures
$track_visitor set to true triggers logging requests.
$track_visitor_agent adds the content of the header User-Agent to the registered data.
$track_db gives the name of the DB table which contains the log, track by default if $track_db is just true.
$track_log gives the name of the file which contains the log, track.log in the folder defined by $log_dir by default if $track_log is just true.
If $track_db and $track_log are both false, no logging is performed.
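As an illustration, a config.inc set up to log every request to the default file, with the User-Agent included and the DB left out, could look like the following sketch. It uses only the parameters listed above; the particular combination of values is just an example.

```php
<?php
// Example values only: log to track.log in the folder $log_dir,
// record the User-Agent and do not write to the DB.
$track_db=false;
$track_log=true;
$track_visitor=true;
$track_visitor_agent=true;
$track_agent_blacklist=false;
```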
To filter out requests sent by known services, such as Google, Facebook or Nagios, define the parameter $track_agent_blacklist as an array of the signatures, in lowercase, that they write in the field User-Agent of a request.
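For example, a blacklist covering these three services might be set as follows. The signatures shown are substrings often seen in the User-Agent of these clients and are given here as an assumption; check them against your own log before relying on them.

```php
<?php
// Assumed signatures, in lowercase: adjust them to what your log actually shows.
$track_agent_blacklist=array('facebookexternalhit', 'googlebot', 'check_http');
```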
In this configuration, the function track will ignore the requests sent by the Facebook and Google robots and by Nagios probes.
Logging requests is managed by the function dispatch in engine.php:
global $base_path;
...
global $track_visitor, $track_visitor_agent;
$req = $base_path ? substr(request_uri(), strlen($base_path)) : request_uri();
if ($track_visitor) {
    track($req, $track_visitor_agent);
}
...
}
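The code of the function track is not shown here. As a rough sketch of what such a function might do, the example below filters a request against the blacklist and builds one line of the log. All function names are hypothetical, and the layout of the line (date, time and IP address separated by spaces, then the URI and the optional agent separated by tabs) is an assumption chosen to match the cut commands used in this document.

```php
<?php
// Hypothetical helper: true if the User-Agent contains a blacklisted signature.
function sketch_agent_is_blacklisted($user_agent, $blacklist) {
    if (!$user_agent || !is_array($blacklist)) {
        return false;
    }
    $agent = strtolower($user_agent);   // signatures are given in lowercase
    foreach ($blacklist as $signature) {
        if ($signature !== '' && strpos($agent, $signature) !== false) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: builds "date time ip<TAB>uri[<TAB>agent]".
function sketch_log_line($timestamp, $ip, $request_uri, $user_agent=false) {
    $line = date('Y-m-d H:i:s', $timestamp) . ' ' . $ip . "\t" . $request_uri;
    if ($user_agent) {
        $line .= "\t" . $user_agent;
    }
    return $line;
}

// Hypothetical stand-in for track: returns the line to log, or false if the
// request comes from a blacklisted agent.
function sketch_track($timestamp, $ip, $request_uri, $user_agent, $with_agent, $blacklist) {
    if (sketch_agent_is_blacklisted($user_agent, $blacklist)) {
        return false;
    }
    return sketch_log_line($timestamp, $ip, $request_uri, $with_agent ? $user_agent : false);
}
```

A caller would then append the returned line to the log file, for instance with file_put_contents and the flags FILE_APPEND | LOCK_EX.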
The database records the information about a request in the table track.
CREATE TABLE `track` (
  `track_id` INT(10) UNSIGNED NOT NULL,
  `time_stamp` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `ip_address` INT(10) UNSIGNED NOT NULL,
  `request_uri` VARCHAR(255) NOT NULL,
  `user_agent` VARCHAR(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
ALTER TABLE `track` ADD PRIMARY KEY (`track_id`);
ALTER TABLE `track` MODIFY `track_id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT;
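The column ip_address is an unsigned integer, which implies that an IPv4 address is converted to a number before it is stored. How the conversion is actually done is not shown here; the sketch below, with hypothetical function names, shows one way to do it in PHP with ip2long and long2ip.

```php
<?php
// Hypothetical helpers, not part of the documented code.

// Converts a dotted IPv4 address to an unsigned integer, returned as a string.
function sketch_ip_to_uint($ip) {
    $n = ip2long($ip);
    if ($n === false) {
        return false;   // not a valid IPv4 address
    }
    // On 32-bit builds ip2long can return a negative value; %u makes it unsigned.
    return sprintf('%u', $n);
}

// Converts the stored integer back to dotted notation (assumes 64-bit PHP).
function sketch_uint_to_ip($n) {
    return long2ip((int)$n);
}
```

On the MySQL side, INET_NTOA(ip_address) performs the same reverse conversion directly in a query.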
USAGE
To display the content of the connection log:
$ tail track.log
To obtain the total number of visitors:
$ cut -f 1 track.log | cut -d' ' -f 3 | sort | uniq | wc -l
To list the 10 most consulted pages:
$ cut -f 2 track.log | sort | uniq -c | sort -rn | head -10
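The same two statistics can be computed in PHP, for example to display them in a page. The sketch below assumes the line layout implied by the cut commands above (date, time and IP address separated by spaces in the first tab-separated field, the URI in the second); the function names are hypothetical.

```php
<?php
// Hypothetical helper: splits a log line into IP address, URI and optional agent.
function sketch_parse_line($line) {
    $fields = explode("\t", rtrim($line, "\r\n"));
    if (count($fields) < 2) {
        return false;
    }
    $head = explode(' ', $fields[0]);   // date, time, ip
    if (count($head) < 3) {
        return false;
    }
    return array(
        'ip' => $head[2],
        'uri' => $fields[1],
        'agent' => isset($fields[2]) ? $fields[2] : false,
    );
}

// Counts the distinct IP addresses, like cut | sort | uniq | wc -l.
function sketch_count_visitors($lines) {
    $ips = array();
    foreach ($lines as $line) {
        $r = sketch_parse_line($line);
        if ($r) {
            $ips[$r['ip']] = true;
        }
    }
    return count($ips);
}

// Returns the $n most requested URIs with their counts, most requested first.
function sketch_top_pages($lines, $n=10) {
    $counts = array();
    foreach ($lines as $line) {
        $r = sketch_parse_line($line);
        if ($r) {
            $uri = $r['uri'];
            $counts[$uri] = isset($counts[$uri]) ? $counts[$uri] + 1 : 1;
        }
    }
    arsort($counts);
    return array_slice($counts, 0, $n, true);
}
```

With the whole log loaded by file('track.log'), sketch_count_visitors and sketch_top_pages give the same figures as the two pipelines.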
To check the DB:
mysql> SELECT * FROM track;
NOTE: If necessary, add the prefix $db_prefix defined in db.inc to the name of the table track.
To obtain the total number of visitors:
mysql> SELECT COUNT(DISTINCT ip_address) FROM track;
To list the 10 most consulted pages:
mysql> SELECT request_uri, COUNT(request_uri) AS count FROM track GROUP BY request_uri ORDER BY count DESC LIMIT 10;
IMPORTANT: The amount of data generated can rapidly fill up the DB and the log file. Choose only one mode by setting either $track_db or $track_log to false. Once a campaign for analyzing the types of clients (browsers, mobiles, robots, etc.) is over, set the parameter $track_visitor_agent back to false.