Robot access log
This page explains how the access log is analyzed, with reference to the source code.
The function of this page
This page imports and analyzes the access log in order to detect robot access. Once you understand how it works, you can customize how robots are detected and analyzed.
For details, see Checking the crawl and Import access log.
Location of the source code
The source code of this page is in the "[ALINOUS_HOME]/admin/robotlog/" folder.
Import access log
You can import the HTTP server's access log from this page.
Choose a file with the "Choose File" button and push the "Import" button. A progress bar then appears and the operation starts. For the implementation of the progress bar, see jQuery Progress Bar.
Start importing
When you push the "Import" button, the "$IN.cmd" parameter is set to "uploadlog", and the source code below is executed.
This code calls "uploadlog()", whose source code is below.
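As an illustration of this kind of command dispatch, here is a minimal sketch in Python rather than Alinous-Core script. It is not the page's actual source: the `dispatch` function, the `HANDLERS` table, and the `file` parameter are hypothetical, and only the command name "uploadlog" comes from this page.

```python
# Illustrative Python sketch, not the Alinous-Core source of this page.

def uploadlog(params):
    """Hypothetical stand-in for the page's uploadlog() handler."""
    return "import started for " + params.get("file", "(no file)")


# Map each command name to its handler (only "uploadlog" comes from the page).
HANDLERS = {"uploadlog": uploadlog}


def dispatch(params):
    """Run the handler named by params["cmd"], or do nothing if none matches."""
    handler = HANDLERS.get(params.get("cmd"))
    if handler is None:
        return None
    return handler(params)


print(dispatch({"cmd": "uploadlog", "file": "access.log"}))
```

A request without a recognized "cmd" value falls through and returns nothing, which mirrors a page that simply renders when no command was posted.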
Start a background job with the progress bar
The "ProgressJob.startJob()" function starts a background job with the progress bar. Its source code is in "[ALINOUS_HOME]/include/progress.alns".
The source code is below.
This code performs the following operations:
- Insert a record into the table that manages the progress bar status
- Call Thread.execute() to start the background job
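The two operations above can be pictured with a short Python sketch. This is an illustration only, not the contents of progress.alns: the in-memory `progress_table` dictionary stands in for the database table, Python's `threading.Thread` stands in for Thread.execute(), and the names `start_job` and `sample_work` are hypothetical.

```python
import threading

# In-memory stand-in for the table that manages the progress bar status.
progress_table = {}
table_lock = threading.Lock()


def start_job(job_id, work):
    """Register a progress record, then run work(job_id) on a new thread."""
    with table_lock:
        progress_table[job_id] = {"percent": 0, "finished": False}
    worker = threading.Thread(target=work, args=(job_id,))
    worker.start()
    return worker


def sample_work(job_id):
    """Hypothetical job body: report that half of the work is done."""
    with table_lock:
        progress_table[job_id]["percent"] = 50


thread = start_job("import-1", sample_work)
thread.join()  # wait here only so the example output is deterministic
print(progress_table["import-1"])
```

Creating the record before the thread starts means a progress bar that polls the table never finds the job missing.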
Framework of log analysis
After the main thread launches a new thread to analyze the access log, the following code is executed as a background job.
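The shape of such a background job can be sketched in Python, as an illustration rather than the Alinous-Core source. Here `do_uploadlog`, `job_finished`, and the `progress` dictionary are hypothetical stand-ins for "doUploadlog()", "ProgressJob.jobFinished()", and the progress table. The use of try/finally is an assumption of this sketch, chosen so that the progress bar is closed even if the analysis fails.

```python
# Illustrative Python sketch of a background job wrapper.

progress = {"finished": False, "processed": 0}


def do_uploadlog(lines):
    """Hypothetical main routine: count the lines it was given."""
    count = 0
    for _ in lines:
        count += 1
    return count


def job_finished():
    """Mark the job as done so that the progress bar can stop."""
    progress["finished"] = True


def background_job(lines):
    """Run the main routine, then always report that the job has finished."""
    try:
        progress["processed"] = do_uploadlog(lines)
    finally:
        job_finished()


background_job(["line 1", "line 2"])
print(progress)
```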
The main routine of the log analysis is in the "doUploadlog()" function. After that function has run, the code calls "ProgressJob.jobFinished()" to finish the progress bar.
The source code of the main logic is below.
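As a rough picture of reading lines in batches and analyzing each batch in parallel, here is a Python sketch. It is not the Alinous-Core listing: `BULK_NUM` plays the role of "$bulknum" with an arbitrary value, `concurrent.futures.ThreadPoolExecutor` stands in for the language's parallel execution feature, and `analyze_line` and `read_batches` are hypothetical helpers.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

BULK_NUM = 3  # plays the role of $bulknum; the value is arbitrary


def analyze_line(line):
    """Hypothetical per-line analysis: return the first field of the line."""
    return line.split(" ", 1)[0]


def read_batches(lines, size):
    """Yield successive lists of at most `size` lines."""
    iterator = iter(lines)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def bulk_execute(lines, results):
    """Analyze each batch in parallel and append the results in input order."""
    with ThreadPoolExecutor(max_workers=BULK_NUM) as pool:
        for batch in read_batches(lines, BULK_NUM):
            results.extend(pool.map(analyze_line, batch))


log = ["10.0.0.1 GET /", "10.0.0.2 GET /a", "10.0.0.3 GET /b", "10.0.0.4 GET /c"]
stored = []
bulk_execute(log, stored)
print(stored)
```

Processing one batch at a time keeps memory use bounded by the batch size, however large the log file is.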
The "AccessLog.readConfig()" function gets the access log format, and then lines are read from the log. After reading the number of lines defined by the "$bulknum" variable, the code analyzes them and inserts the results in parallel. To do this, the "bulkExecute()" function uses the "Parallel execution" feature of the Alinous-Core programming language.
Log analysis policy
The log is analyzed when you import the access log. The "AccessLog.analyze()" function contains the policy for analyzing the access log.
To customize how the access log is analyzed, modify this function.
Source code of analyze
The source code of "AccessLog.analyze()" is below.
This function performs the following operations:
- Check whether the access log entry is robot access
- If it is robot access, insert an access log record
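The two steps above can be sketched in Python as follows. This is an illustration, not the source of "AccessLog.analyze()": the `robot_log` table, its columns, and the simple `is_robot` rule are assumptions made for the example, and an in-memory SQLite database stands in for the real data store.

```python
import sqlite3


def is_robot(entry):
    """Hypothetical check: treat user agents containing "bot" as robots."""
    return "bot" in entry.get("user_agent", "").lower()


def analyze(conn, entry):
    """Insert the entry only when it is robot access; return True if stored."""
    if not is_robot(entry):
        return False
    conn.execute(
        "INSERT INTO robot_log (remote_host, user_agent) VALUES (?, ?)",
        (entry["remote_host"], entry["user_agent"]),
    )
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE robot_log (remote_host TEXT, user_agent TEXT)")
analyze(conn, {"remote_host": "192.0.2.1", "user_agent": "Mozilla/5.0"})
analyze(conn, {"remote_host": "192.0.2.2", "user_agent": "ExampleBot/1.0"})
print(conn.execute("SELECT remote_host FROM robot_log").fetchall())
```

Keeping the detection rule in its own function means the policy can be changed without touching the insert logic.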
Detecting whether a log entry is robot access
This function performs the actual check of whether a log entry is robot access.
The policy for the check is as follows:
- If the access log format contains the user agent, check the user agent first
- Check the host name of the remote host's IP address
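The policy above can be sketched in Python as follows. This is an illustration, not the Alinous-Core source. The keyword and host suffix lists are invented for the example, and falling back to the host name check when the user agent does not match is an assumption of this sketch. The reverse lookup is passed in as a function (`resolve_host`, hypothetical) so that the example runs without network access.

```python
# Illustrative Python sketch of the two-step robot check.

ROBOT_AGENT_KEYWORDS = ("bot", "crawler", "spider")  # assumed keywords
ROBOT_HOST_SUFFIXES = (".crawler.example",)          # assumed host suffixes


def is_robot(entry, resolve_host):
    """Return True when the entry looks like robot access."""
    # Step 1: check the user agent, if the log format provides one.
    agent = entry.get("user_agent")
    if agent is not None:
        lowered = agent.lower()
        if any(word in lowered for word in ROBOT_AGENT_KEYWORDS):
            return True
    # Step 2: check the host name that the remote IP address resolves to.
    host_name = resolve_host(entry["remote_host"])
    if host_name is None:
        return False
    return host_name.lower().endswith(ROBOT_HOST_SUFFIXES)


def fake_resolver(ip):
    """Stand-in for a reverse DNS lookup, so the example needs no network."""
    return {"192.0.2.10": "node1.crawler.example"}.get(ip)


print(is_robot({"remote_host": "192.0.2.1", "user_agent": "ExampleBot/1.0"}, fake_resolver))
print(is_robot({"remote_host": "192.0.2.10"}, fake_resolver))
```

In a real setting the resolver could wrap a reverse DNS call such as Python's `socket.gethostbyaddr`, with results cached because lookups are slow.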