Build a Visitor Tracking System for your Website with PHP January 15, 2012
Posted by Tournas Dimitrios in PHP
If you have a website, you are probably interested in knowing details about your visitors: their physical location, which webpages they visit, the date and time of their visits, how they got to your site, and so on. There are online tracking options available, such as Google Analytics, and your hosting provider may offer some statistics. But what if you want a customized tracker that shows you only what you need? Once again, PHP has built-in variables that deliver tracking and browser information. The superglobal $_SERVER array holds a set of parameters that "dump", or deliver, tracking information. Some of these values depend on server or browser settings (for example the User-Agent header) and may be disabled or customized, so take that into account when analyzing the data.
The superglobal $_SERVER array typically holds a few dozen values (the exact set depends on your server configuration); you can explore them by running:
echo "<pre>";
print_r($_SERVER);
echo "</pre>";
My previous article "Logging Visitor's IP with PHP" was a simple example of logging a visitor's location to a text file. This article presents a practical tracker script that saves the following information about each visitor in a database: the IP address; the location (country and city) derived from that IP; the date and time of the visit; some information about the browser and operating system; the referer (if the visitor clicked a link on another site to reach yours, you will know which site referred them); and the query string they searched for, in case they were referred by a search engine. Most of these details live in the superglobal $_SERVER array, except the country/city information, for which we will query a web service that translates the IP into a country and city. Last but not least, the script checks whether the visitor was a bot (a software application that runs automated tasks over the Internet, such as Google's indexing bot). We will download a file that records the IDs of about 300 known bots; our script will use this list as a reference to detect whether the user agent is a bot.
All this can be achieved in seven simple steps; follow along or download the source code:
Step 1 : Create a table in the database where the details will be stored. The source code for this article contains a visitor_tracker.sql file; just import it into your database via phpMyAdmin or the command line.
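If you prefer to create the table by hand, a minimal schema matching the columns used by the script below might look like the sketch that follows. The column types here are my assumptions; the visitor_tracker.sql file in the download is authoritative:

```sql
CREATE TABLE `visitor_tracker` (
  `id`              INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `country`         VARCHAR(64),
  `city`            VARCHAR(64),
  `date`            DATE,
  `time`            TIME,
  `ip`              VARCHAR(45),   -- 45 characters also fits IPv6 addresses
  `web_page`        VARCHAR(255),
  `query_string`    VARCHAR(255),
  `http_referer`    VARCHAR(255),
  `http_user_agent` VARCHAR(255),
  `is_bot`          TINYINT(1) DEFAULT 0,
  PRIMARY KEY (`id`)
);
```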
Step 2 : The IP-to-country/city translation is done by a web service available from ipinfodb.com. This web service requires an API key, which is free, though you have to register (it only takes a few minutes).
Step 3 : Download the list of known bots. This plain-text file contains details about each bot. We only need the bot ID from this file, so we parse it and extract just that information. A function, getBotList(), loads the file and returns an array of all bot IDs.
Step 4 : The function is_bot() compares the $_SERVER['HTTP_USER_AGENT'] value against the bot list and returns a Boolean value (true/false).
Step 5 : Query the web service with the visitor's IP; it returns a string of information (the response can also be requested in XML or JSON). We have to parse the string and assign the values to variables that will be recorded in the database table. The web service's site provides example scripts for querying it, but I decided to build my own; it is only a few lines of code and uses PHP's cURL extension. Most hosts have this extension enabled by default (run phpinfo() to verify that it is), so you can use my code. If your host doesn't support the cURL extension, use the class provided by ipinfodb.com. My previous article "Accessing Remote Url's using Curl" is a good introduction to this topic; read it if you would like to learn more about the cURL extension.
Step 6 : Parsing the string from the web service is as simple as converting it into an array with explode() and assigning the array values to variables.
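The parsing step can be sketched in isolation. The sample response string below is a made-up example; the exact field order is defined by the ipinfodb v3 API, and the indices match the ones used in the complete script further down:

```php
<?php
// Illustrative semicolon-separated ip-city response (fabricated sample data).
$contents = "OK;;74.125.45.100;US;UNITED STATES;CALIFORNIA;MOUNTAIN VIEW;94043;37.4;-122.0;-08:00";

// Split the response into its fields.
$pieces  = explode(";", $contents);
$country = $pieces[4]; // country name
$region  = $pieces[5]; // region / state
$city    = $pieces[6]; // city name

echo "$country / $region / $city\n"; // UNITED STATES / CALIFORNIA / MOUNTAIN VIEW
```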
Step 7 : Query and insert all values into the database, then finally close the DB connection.
Step 7.2 : Paste the code at the end of each web page. A better option is to store the code in a file outside your public server directory (document root) and pull it in with an include statement. If you choose the latter option, don't forget to configure PHP's include path.
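The include-path approach can be sketched as follows. To keep the example self-contained it writes a tiny tracker stub to a temporary directory first; on a real site the file and its location would of course be your own:

```php
<?php
// Self-contained demonstration: create a stub "tracker" file in a
// directory outside the (hypothetical) document root, add that
// directory to PHP's include path, then include the file by name.
$dir = sys_get_temp_dir() . '/tracker_demo';
@mkdir($dir);
file_put_contents($dir . '/tracker.php', "<?php \$tracked = true;");

// Append the private directory to the include path.
set_include_path(get_include_path() . PATH_SEPARATOR . $dir);

// Any page can now pull the tracker in by filename alone.
include 'tracker.php';

var_dump($tracked); // bool(true)
```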
Here follows the complete code :
<?php
$ip = $_SERVER['REMOTE_ADDR'];

/* Set your API key -- this is a fake example :) */
$api    = "1ade0eec6de005cfeedd12678aac3cbf4f47c120bbf83b3cc";
$apiurl = "http://api.ipinfodb.com/v3/ip-city/?key=$api&ip=$ip";

/* Prepare a connection to ipinfodb.com */
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $apiurl);
/* Ask cURL to return the contents in a variable instead of echoing them to the browser */
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
/* Execute and close the cURL session */
$contents = curl_exec($ch);
curl_close($ch);

/* Parse the returned string into an array */
$pieces = explode(";", $contents);

/* Retrieve the values we need from the array */
$country = $pieces[4];
$city    = $pieces[6];
$region  = $pieces[5];

$date            = date("Y-m-d");
$time            = date("H:i:s");
$query_string    = $_SERVER['QUERY_STRING'];
$http_referer    = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : "no referer";
$http_user_agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : "no User-agent";
$web_page        = $_SERVER['SCRIPT_NAME'];
$isbot           = is_bot() ? '1' : '0';

/* Connect to the database --- set your credentials --- */
$connection = new mysqli("localhost", "root", "", "test");
/* Check the connection */
if (mysqli_connect_errno()) {
    printf("Connection failed: %s", mysqli_connect_error());
    exit();
}

/* Escape the client-controlled values before building the query */
$query_string    = mysqli_real_escape_string($connection, $query_string);
$http_referer    = mysqli_real_escape_string($connection, $http_referer);
$http_user_agent = mysqli_real_escape_string($connection, $http_user_agent);

/* Insert the data into the MySQL table */
mysqli_query($connection, "INSERT INTO `visitor_tracker`
    (`country`, `city`, `date`, `time`, `ip`, `web_page`, `query_string`,
     `http_referer`, `http_user_agent`, `is_bot`)
    VALUES ('$country', '$city', '$date', '$time', '$ip', '$web_page',
    '$query_string', '$http_referer', '$http_user_agent', '$isbot')");

/* Close the DB connection */
mysqli_close($connection);

/* Remove this line on production pages */
echo "Your IP is : " . $ip . " and the database is updated ";

/* Detect whether the visitor is a "bot" */
function is_bot() {
    if (!isset($_SERVER['HTTP_USER_AGENT'])) {
        return false;
    }
    $botlist = getBotList();
    foreach ($botlist as $bot) {
        if (strpos($_SERVER['HTTP_USER_AGENT'], $bot) !== false) {
            return true;
        }
    }
    return false;
} // end function is_bot

/* Parse the robotid.txt file into an array of bot IDs */
function getBotList() {
    $bots = array();
    if (($handle = fopen("robotid.txt", "r")) !== FALSE) {
        while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
            if (strstr($data[0], "robot-id:")) {
                /* "robot-id:" is 9 characters long; keep what follows it */
                $botId = substr($data[0], 9);
                array_push($bots, $botId);
            }
        }
        fclose($handle);
    }
    return $bots;
} // end function getBotList
?>
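A word of caution on the INSERT query: several of the inserted values (the referer, the user agent, the query string) are controlled by the client, so interpolating them directly into the SQL string is risky. A safer variant uses mysqli prepared statements; the sketch below assumes the same table, columns, and variables as the script above, and the credentials are placeholders:

```php
<?php
// Hypothetical credentials -- replace with your own.
$connection = new mysqli("localhost", "root", "", "test");

$stmt = $connection->prepare(
    "INSERT INTO `visitor_tracker`
     (`country`, `city`, `date`, `time`, `ip`, `web_page`,
      `query_string`, `http_referer`, `http_user_agent`, `is_bot`)
     VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)");

// Ten string placeholders; mysqli handles the escaping for us.
$stmt->bind_param("ssssssssss",
    $country, $city, $date, $time, $ip, $web_page,
    $query_string, $http_referer, $http_user_agent, $isbot);

$stmt->execute();
$stmt->close();
$connection->close();
```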
Note: the WAMP server does not enable the cURL extension by default; go to the WAMP tray icon -> PHP -> PHP extensions -> enable curl.
Building an interface to access this recorded information is simple, but beyond the scope of this article.
The downloadable code is a "7z" archive, which uses the LZMA compression algorithm; its compression ratio is 30-50% better than the ZIP format. On *nix-like operating systems, use the "7z" utility to decompress the downloaded source code of this article. This utility is most likely not installed by default on your Linux distribution. For instance, on my CentOS 6.x machine I installed the package with:
yum install p7zip-plugins p7zip
7z x Build-visitor-tracking-system-with-PHP.7z
http://www.robotstxt.org/dbexport.html — this link is not working… could you provide an alternative link to get the list?
@vijaysinhparmar
try: http://www.robotstxt.org/db.html
But it is not a plain txt file…