UserSpice
Tomfoolery Logging Bot hits - Printable Version




Tomfoolery Logging Bot hits - jdmfarms - 06-16-2017

Hello.

As I'm new, I'm still learning what I can about UserSpice. I haven't found anything that addresses this subject.

Issue: My log constantly records hits from certain IPs. After some research, I found they belong to search engine web crawlers. I'd like the log to omit entries from known bots (Google, Amazon, etc.).

Thoughts: I was thinking of exploring the Tomfoolery PHP file and its support files to see if I could put in an exception, but beforehand I wanted to see whether anyone else has done this already. Another thought is to add rules to the .htaccess files in select directories to block bots from scanning those directories.

Your thoughts? Thank you for any insight you can provide on this. I just don't want my log bloated with bot hits.

JDM


Tomfoolery Logging Bot hits - mudmin - 06-20-2017

We can definitely make a whitelist. I actually don't have that problem on my projects, so it could be some sort of server configuration option.

My thought is that we can add a table of ip addresses that don't show up there. Just make sure not to put your own!

It's going to be 5 weeks until I can get back to active development (I run a camp and we're right in the middle of camp season), but this is doable.
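
The whitelist idea above could look roughly like the sketch below. It is written in Python only to show the logic, not as UserSpice code: the function name `should_log`, the `SKIP_IPS` set, and the bot pattern are all hypothetical, and the IP addresses come from the reserved documentation ranges.

```python
import re

# Hypothetical values for illustration only; none of this comes from UserSpice.
SKIP_IPS = {"203.0.113.7"}  # placeholder address standing in for a known crawler IP
BOT_PATTERN = re.compile(r"googlebot|bingbot|baiduspider", re.IGNORECASE)


def should_log(ip, user_agent):
    """Return False for hits that should be left out of the log."""
    if ip in SKIP_IPS:
        return False
    # A missing User-Agent header is treated as an empty string.
    if BOT_PATTERN.search(user_agent or ""):
        return False
    return True
```

The logging routine would call something like `should_log()` before writing an entry. Matching on the user agent as well as the IP helps because crawler IP addresses change over time.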


Tomfoolery Logging Bot hits - Brandin - 06-20-2017

Very odd that you are getting forked with these crawlers! I haven't had any issues on either of my US projects. However, if it is happening, adding an option to omit them would be great.


Tomfoolery Logging Bot hits - jdmfarms - 06-20-2017

I'm an amateur programmer...still have lots to learn. Therefore, it could be something I could turn off on my server. Or, it could be that I'm on a European service provider's server.


Tomfoolery Logging Bot hits - Brandin - 06-20-2017

I don't have any special configuration on my cPanel VPS other than the ConfigServer firewall. I am wondering if this is why I don't get crawled.


Tomfoolery Logging Bot hits - firestorm - 06-20-2017

You generally do this in a robots.txt file. However, if that fails to stop them, you can use the following in the .htaccess file for the directory you don't want crawled:

<pre>
Code:
RewriteEngine On

RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|Baiduspider) [NC]
RewriteRule .* - [R=403,L]
</pre>


Change the bots to the ones you want stopped; the names aren't case sensitive. You can change the error code to whatever you wish as well; it is currently 403 (access forbidden).


Tomfoolery Logging Bot hits - jdmfarms - 06-25-2017

Firestorm, unfortunately neither seems to work. I tried the htaccess first and it crashed my site. Now I have the robot.txt file in a few folders and they seem to ignore it. However, I only have the basic entry:

User-agent: *
Disallow: /
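
For what it's worth, the two-line entry above is itself valid: for crawlers that honor it, it disallows everything. One way to check this is with Python's standard `urllib.robotparser`, as in the sketch below. The helper name `is_allowed` and the example URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(rules, user_agent, url):
    """Parse robots.txt lines and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules)
    return parser.can_fetch(user_agent, url)


# The basic entry from the post above.
BLOCK_ALL = ["User-agent: *", "Disallow: /"]

# Hypothetical URL, used only to exercise the rules.
blocked = not is_allowed(BLOCK_ALL, "Googlebot", "https://example.com/users/admin.php")
```

Keep in mind that crawlers request the file only as /robots.txt at the root of the site, so copies placed in subfolders, or a file named robot.txt, are never read. Crawlers that ignore the standard will also ignore the file wherever it is placed.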