Fixing the nginx 504 Gateway Time-out error

I have a problem with my website. For a few days now, I have only seen an nginx 504 Gateway Time-out error when I visit it. I followed a lot of instructions on the internet, but nothing helped. The website is still offline.

codenovae's picture

nginx waits in vain for an answer

Nginx 504 Gateway Error

504 Gateway Time-out nginx/1.15.5

Whenever a web page is requested, the server returns an HTTP status code with its response. If you see only a white page with an HTTP status code instead of the website, there can be many different causes. An overview of all HTTP status codes can be found on Wikipedia.

 

To find out why exactly nginx displays a 504 status code, one first needs to understand how a server answers a request from a browser. A request usually begins when the user enters an Internet address in the browser or clicks an element on a website. The browser first asks a DNS server which IP address belongs to the domain and then sends the HTTP request to the server at that address. On the destination server, several interconnected systems are usually responsible for answering the request. In this context, one speaks of a stack.
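The phases described above can be made visible with curl. The following is only a minimal sketch: the function name request_phases is made up for this example, and the 120-second limit is an arbitrary assumption. It prints the status code and the time spent on the DNS lookup, the connection, and the wait for the first byte of the response.

```shell
#!/bin/sh
# request_phases URL   (hypothetical helper name)
# Sends one request and prints the status code plus the duration of each phase.
# The %{...} placeholders are standard curl --write-out variables.
request_phases() {
    curl --silent --output /dev/null --max-time 120 \
        --write-out 'status=%{http_code} dns=%{time_namelookup}s connect=%{time_connect}s first_byte=%{time_starttransfer}s total=%{time_total}s\n' \
        "$1"
}
```

Calling request_phases https://example.com/ against the affected site shows where the time is lost. A small dns value combined with a very large first_byte value suggests that the delay lies behind nginx rather than in name resolution.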

 

Highly simplified pictorial representation of a LAMP software stack:

LAMP stack

Today, requests are often processed internally by a stack made up of Linux, Varnish, Memcached, Apache, nginx, PHP and MySQL, although countless other combinations exist. In such a stack, nginx is often configured as a network gateway (reverse proxy) that forwards HTTP requests to the Apache web server. A network gateway is, in principle, a mediator between systems that often cannot talk to each other directly. In many setups the HTTPS connection is terminated at nginx, which decrypts the request and passes it on, so that the systems behind it only have to handle plain HTTP. This matters for Varnish in particular, which cannot handle HTTPS requests directly. Varnish and Memcached are used to serve repeated requests from faster memory.

Each station in this chain works like a Swiss precision clockwork and fulfills its own special task so that requests are processed as quickly and efficiently as possible. The software stack is not the only thing involved in processing and forwarding the data; the hardware must work properly as well. If one of the stations fails or responds too slowly, the internally defined timeout threshold is exceeded, and the user sees only a white browser window with the nginx 504 Gateway Time-out message. To fix the error, you therefore have to identify which station has failed. Here is an overview of the most common wordings of the error message:

 

  • "Bad Gateway Timeout 504 NGINX"
  • "NGINX Bad Gateway Timeout 504"
  • "HTTP 504 Gateway Timeout NGINX"
  • "504 Gateway Timeout NGINX"
  • "Gateway Timeout 504 NGINX"

 

A 504 error from nginx means that nginx, as one link in the communication chain, waited too long for a response from another web server or service behind it, so a gateway timeout occurred. (Strictly speaking, "Bad Gateway" is the name of status 502, while 504 is "Gateway Timeout"; the two are often mixed up.) How long nginx and the other components wait before giving up is specified in the configuration files of the individual stack components. PHP, Apache, nginx, Plesk and FastCGI, for example, all have settings where the timeouts can be changed. More on that later.
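Before changing anything, it can help to see which timeouts are currently set. The sketch below searches a configuration directory for active nginx timeout directives. The function name list_timeouts is an assumption made for this example, and the pattern only matches lowercase directive names as nginx uses them, so it will not find Apache or PHP settings.

```shell
#!/bin/sh
# list_timeouts DIR   (hypothetical helper name)
# Prints file, line number and content of every active *timeout directive
# below DIR. Lines that start with "#" are comments and are skipped.
list_timeouts() {
    grep -rnE '^[[:space:]]*[a-z_]*timeout[[:space:]]' "$1"
}
```

On a typical installation this would be called as list_timeouts /etc/nginx. A directive that does not show up is not set explicitly, which means nginx uses its built-in default for it.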

 

First of all, you should be aware that an nginx 504 error does not occur out of the blue. In most cases, something in the system configuration or hardware has changed so that a script or hardware component no longer completes its task in the given time. Many instructions on the Internet show how to increase the timeouts until the 504 message stops appearing. Unfortunately, in most cases this does not solve the underlying problem. A better approach is to identify what exactly is responsible for the timeout. If the problem occurs within php-fpm, you can use the slow log to identify slow scripts. To do this, adjust the configuration file of the pool:

# Open the pool configuration file:
sudo vi /etc/php-fpm.d/www.conf

; Then set these two directives in that file:
slowlog = /var/log/php-fpm/www-slow.log
request_slowlog_timeout = 3s

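In the example shown, every PHP script that takes longer than 3 seconds to execute is written to the slow log, together with a stack trace. Restart php-fpm after changing the file so that the settings take effect.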
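Once the slow log has collected some entries, a short summary shows which scripts appear most often. This is a sketch under the assumption that each entry contains a line of the form "script_filename = /path/to/script.php", which is how php-fpm usually writes it; the function name slowest_scripts is made up for this example.

```shell
#!/bin/sh
# slowest_scripts LOGFILE   (hypothetical helper name)
# Counts how often each script appears in a php-fpm slow log and prints
# the scripts sorted by number of entries, most frequent first.
slowest_scripts() {
    grep '^script_filename = ' "$1" \
        | sed 's/^script_filename = //' \
        | sort | uniq -c | sort -rn
}
```

With the configuration above, the call would be slowest_scripts /var/log/php-fpm/www-slow.log. The scripts at the top of the list are the first candidates to examine.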

 

Possible causes of an nginx 504 Gateway Time-out error:

  1. The DNS server is down or too slow. Ask your domain provider to check the DNS servers. Alternatively, you can change the name servers in the domain configuration. Google, for example, operates fast and reliable public DNS servers that are free to use.

  2. The network in the data center is overloaded or down. In this case, all websites hosted on the server would be affected. A router, switch or proxy server in the data center could be down because of a technical defect or overload. Contact the provider to rule out a hardware defect. A hardware reset may also solve the problem if a component has hung.

  3. The proxy settings may be incorrect, either on the client or on the server. Consider whether you changed the proxy settings before the 504 error message first appeared.

  4. The local router could be misconfigured or may have hung. If other websites work without problems, this cause can be ruled out. At https://downforeveryoneorjustme.com/ you can check whether the website is unreachable for everyone or only for you.

  5. If you use a proxy server to relieve the web server and the 504 timeout appears mostly when traffic is particularly high, this is a sign that the hardware is overloaded. In this case, you should buy or rent additional hardware resources and integrate them into the system. A cloud solution is more flexible here, because hardware resources can be adapted to demand at any time. To check how busy the server is, you can run the following command:

    free -h

    The command shows memory and swap usage. To see the CPU utilization and which processes are responsible for a high server load, run:

    top

     

  6. In a server network, separate image or cache servers are often used to relieve the file and database servers. If one of these additional servers becomes unreachable, the requested data is not available and a 504 Gateway Time-out error can result. Check that all servers in the network are working properly.

  7. A newly installed component or piece of code may be incompatible with the version of a particular stack component. If you updated PHP shortly before the error appeared, for example, previously working code may no longer run correctly. In this case, look carefully at the log files under

    /var/log/nginx/error.log

    /var/www/vhosts/system/example.com/logs/proxy_error_log

    Even if the entries are not easy to read at first sight, they usually contain detailed hints about the cause of the problem.

  8. The website has been infected by malware or a virus and altered so that its tasks can no longer be performed as intended. The server may, for example, have been misused to send spam emails, which in turn can overload the hardware. DDoS attacks or bots can also overload the server infrastructure.

  9. An upstream Content Delivery Network (CDN) may be down or malfunctioning due to a bad configuration.

  10. The databases could be corrupted. If a backup has recently been restored, for example, the databases may have been damaged in the process. In that case, the databases must be repaired first. For content management systems such as WordPress or Drupal there are special modules for repairing and tidying up the databases. Depending on the system, different databases such as MySQL or MariaDB are used, sometimes together with a search index such as Apache Solr.

  11. When using a Content Management System (CMS) such as WordPress or Drupal, a previously installed and broken plugin could be responsible for the internal data processing timeout.

  12. The .htaccess file could be configured incorrectly, either because of a faulty change or, particularly on newly set up systems, because it was never correct.

  13. Last but not least, the website may perform very complex and time-consuming tasks that simply exceed the internally defined timeout periods. Only in such a case should you adapt the timeouts in the configuration files.
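For several of the causes above, the nginx error log is the quickest way to narrow things down, because nginx writes an "upstream timed out" entry whenever it gives up waiting. The following sketch counts those entries per request. The function name timed_out_requests is an assumption made for this example, and it relies on the usual error log layout, in which each entry contains a request: "..." field.

```shell
#!/bin/sh
# timed_out_requests LOGFILE   (hypothetical helper name)
# Extracts the request line from every "upstream timed out" entry in an
# nginx error log and prints the requests sorted by how often they timed out.
timed_out_requests() {
    grep 'upstream timed out' "$1" \
        | sed -n 's/.*request: "\([^"]*\)".*/\1/p' \
        | sort | uniq -c | sort -rn
}
```

A call such as timed_out_requests /var/log/nginx/error.log shows whether the timeouts are concentrated on a few URLs, which points to a slow script, or spread across the whole site, which points to the server or network.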

The most common configuration files for the timeout settings of the individual stack components are listed here. The specifications for the timeout period can be adjusted in the respective configuration files:

Apache (httpd.conf)

Timeout 1000

#Restart Apache

service apache2 restart

PHP (php.ini)


max_execution_time = 1000

#Restart Apache (or php-fpm, if PHP runs under php-fpm):

service apache2 restart

NGINX (/etc/nginx/nginx.conf, or a new file such as /etc/nginx/conf.d/timeout.conf)

client_header_timeout 1000;
client_body_timeout 1000;
fastcgi_read_timeout 1000;
client_max_body_size 128m;
fastcgi_buffers 8 128k;
fastcgi_buffer_size 128k;
proxy_send_timeout 1000;
proxy_read_timeout 1000;
send_timeout 1000;
fastcgi_send_timeout 1000;

#Restart nginx

service nginx restart
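A typo in one of these directives will stop nginx from starting again, so it is worth testing the configuration before the restart. The following sketch wraps both steps; the function name safe_nginx_restart is made up for this example, while nginx -t is the built-in configuration test.

```shell
#!/bin/sh
# safe_nginx_restart   (hypothetical helper name)
# Restarts nginx only if "nginx -t" reports that the configuration is valid.
safe_nginx_restart() {
    if nginx -t; then
        service nginx restart
    else
        echo "nginx configuration test failed; not restarting" >&2
        return 1
    fi
}
```

If the test fails, nginx keeps running with the old configuration, and the output of nginx -t names the file and line that need to be corrected.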

 

Plesk

If you use Plesk, you can increase the timeout limits at the domain level. To do so, go to Domains -> yourdomain.com -> Apache & nginx Settings in Plesk, find the Additional nginx directives field and add the following:

proxy_connect_timeout 1000s;
proxy_send_timeout 1000s;
proxy_read_timeout 1000s;
fastcgi_send_timeout 1000s;
fastcgi_read_timeout 1000s;

 

Picture Source: 

By Karsten Adam (own work; network card from OpenClipArt.org) [GFDL (http://www.gnu.org/copyleft/fdl.html)], via Wikimedia Commons

 
