  • managingbots
  • Jonathan Haack
  • Haack's Networking
  • webmaster@haacksnetworking.org

managingbots


This tutorial is designed for Debian and LAMP stack users. In my case, I have a multi-site WordPress install that includes my tech blog, poetry, and teaching blog. Additionally, I have a separate vhost on the same instance that hosts this very DokuWiki. The first thing I did was create a script that would scrape and tally all the bots and how much traffic each one generated during the last day:
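As a rough illustration of that kind of tally (not the script used on this server), here is a minimal Python sketch. It assumes Apache's "combined" log format and treats any user agent containing one of a few keywords as a bot. The names `LOG_LINE`, `BOT_HINTS`, and `tally_bots` are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumption: logs use Apache's "combined" LogFormat.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumption: a crude keyword match is good enough to spot self-identified bots.
BOT_HINTS = ("bot", "crawl", "spider", "slurp")


def tally_bots(lines, now=None, window=timedelta(days=1)):
    """Count requests and bytes sent per bot user agent within the window."""
    now = now or datetime.now(timezone.utc)
    hits, sent = Counter(), Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that are not in combined format
        when = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
        if now - when > window:
            continue  # older than the reporting window
        agent = m["agent"]
        if not any(hint in agent.lower() for hint in BOT_HINTS):
            continue  # does not look like a bot
        hits[agent] += 1
        size = m["size"]
        sent[agent] += int(size) if size.isdigit() else 0  # "-" means no body
    return hits, sent
```

To use it, pass an open file handle on the access log (on Debian this is typically /var/log/apache2/access.log, though per-vhost logs may live elsewhere) and print `hits.most_common(20)` for the busiest agents. Bots that spoof a browser user agent will not be caught by the keyword match.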

The first report is over here:

My physical host has 48 threads and 384GB of RAM, and the virtual appliance that runs on it, for which this script was built, has 16 vCPU cores and 16GB of RAM. With that in mind, I decided that the first thing to do was to tweak Apache to ensure it could handle the flood long enough for me to take action. So, by increasing the maximum workers to 800, with each worker being able to handle approximately 1.5K - 2.5K requests per second, this means that the server could handle 95K - 145K requests. This is well above what even the most aggressive bot did to my server, and also well above what I've seen in most reports from heavily impacted servers. Here are some bird's eye averages of some of the reports we've all read about:

Okay, so the multi-processing module (MPM) numbers above are set to roughly 5X the worst attack. The next question, before I set those values, is whether my hardware can handle that. In my case, the virtual appliance has 16 vCPUs/cores, so assuming 50 threads per core (16 x 50 = 800 threads) at 2-5MB per thread, that would be roughly 1.6-4GB, or maybe at most 5GB total of RAM usage, which fits within the appliance's 16GB. Okay, so I configured the mpm_event module (no one should be using prefork anymore) by opening /etc/apache2/mods-available/mpm_event.conf and changing the defaults as follows:

#Defaults
#StartServers            2
#MinSpareThreads         25
#MaxSpareThreads         75
#ThreadLimit             64
#ThreadsPerChild         25
#MaxRequestWorkers       150
#MaxConnectionsPerChild  0
#Adjustments
StartServers            4
MinSpareThreads         25
MaxSpareThreads         75
ThreadLimit             64
ThreadsPerChild         25
MaxRequestWorkers       800
MaxConnectionsPerChild  0
#MaxRequestWorkers / ThreadsPerChild = 800 / 25 = 32 is the exact ServerLimit needed, setting to 50 to have some wiggle room
ServerLimit             50
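The arithmetic behind these values can be sketched in a few lines of Python. This is only a sanity check under the assumptions stated above (2-5MB per thread is an estimate, not a measurement), and the function name `mpm_event_check` is hypothetical.

```python
import math


def mpm_event_check(max_request_workers, threads_per_child, server_limit,
                    thread_limit, mb_per_thread=(2, 5)):
    """Check mpm_event sizing and estimate thread memory in MB."""
    # Child processes needed to reach MaxRequestWorkers.
    servers_needed = math.ceil(max_request_workers / threads_per_child)
    low_mb, high_mb = mb_per_thread  # assumed per-thread footprint
    return {
        "servers_needed": servers_needed,
        "server_limit_ok": servers_needed <= server_limit,
        "thread_limit_ok": threads_per_child <= thread_limit,
        "ram_mb": (max_request_workers * low_mb,
                   max_request_workers * high_mb),
    }


# Values from the adjusted mpm_event.conf above.
report = mpm_event_check(max_request_workers=800, threads_per_child=25,
                         server_limit=50, thread_limit=64)
print(report)
```

With the adjusted values this reports 32 child processes needed, which is under the ServerLimit of 50, and an estimated 1600-4000MB for worker threads. This estimate does not include PHP-FPM, the database, or the OS itself.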

The stock configuration seems to be tailored to let a hobbyist run a functioning website with minimal configuration changes. Once those values were changed, and since I use WordPress and Nextcloud, which rely heavily on PHP, I also took a look at /etc/php/8.2/fpm/pool.d/www.conf and adjusted the servers, or child processes, as follows:

;default
;pm.max_children = 5
;pm.start_servers = 2
;pm.min_spare_servers = 1
;pm.max_spare_servers = 3
;adjusted
pm.max_children = 400
pm.start_servers = 40
pm.min_spare_servers = 20
pm.max_spare_servers = 40
pm.max_requests = 1000
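A quick way to reason about these pool values is to multiply the child count by the average memory of one PHP-FPM child and compare the result with the RAM you are willing to give PHP. The Python sketch below does that. The 30MB per-child figure is purely hypothetical; measure your own (for example, from the RSS column of `ps`) before relying on it. The ordering rule checked in `pool_settings_consistent` reflects my understanding of what PHP-FPM expects from a dynamic pool, so treat it as an assumption.

```python
def fpm_worst_case_mb(max_children, avg_child_mb):
    """Memory used if every allowed child is running at once."""
    return max_children * avg_child_mb


def max_children_for(ram_budget_mb, avg_child_mb):
    """Largest pm.max_children that fits inside a RAM budget."""
    return ram_budget_mb // avg_child_mb


def pool_settings_consistent(max_children, start, min_spare, max_spare):
    """Assumed rule: min_spare <= start <= max_spare <= max_children."""
    return 0 < min_spare <= start <= max_spare <= max_children


AVG_CHILD_MB = 30  # hypothetical; replace with a measured value

print(fpm_worst_case_mb(400, AVG_CHILD_MB))        # worst case in MB
print(max_children_for(12000, AVG_CHILD_MB))       # children for a 12000MB budget
print(pool_settings_consistent(400, 40, 20, 40))   # adjusted values above
```

If the measured per-child size turns out larger than the hypothetical one, either lower pm.max_children or raise the RAM budget, since a pool that can outgrow physical memory will push the server into swap under load.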

Assuming that

oemb1905 2025/04/06 08:57

computing/managingbots.1743933494.txt.gz · Last modified: 2025/04/06 09:58 by oemb1905