Understanding Internals of Chaincode getState/putState calls

If we look at the ChaincodeStub class in $/hyperledger/fabric-chaincode-node/libraries/fabric-shim/lib/stub.js, we can see the following code:

async getState(key) {
        logger.debug('getState called with key:%s', key);
        // Access public data by setting the collection to empty string
        const collection = '';
        return await this.handler.handleGetState(collection, key, this.channel_id, this.txId);
    }

    async putState(key, value) {
        // Access public data by setting the collection to empty string
        const collection = '';
        if (typeof value === 'string') {
            value = Buffer.from(value);
        }
        return await this.handler.handlePutState(collection, key, value, this.channel_id, this.txId);
    }

The code for the handler (which is of type ChaincodeSupportClient) can be found in $/hyperledger/fabric-chaincode-node/libraries/fabric-shim/lib/handler.js

async handleGetState(collection, key, channel_id, txId) {
        const msg = {
            type: fabprotos.protos.ChaincodeMessage.Type.GET_STATE,
            payload: fabprotos.protos.GetState.encode({key, collection}).finish(),
            txid: txId,
            channel_id: channel_id
        };
        logger.debug('handleGetState - with key:', key);
        return await this._askPeerAndListen(msg, 'GetState');
    }

    async handlePutState(collection, key, value, channel_id, txId) {
        const msg = {
            type: fabprotos.protos.ChaincodeMessage.Type.PUT_STATE,
            payload: fabprotos.protos.PutState.encode({key, value, collection}).finish(),
            txid: txId,
            channel_id: channel_id
        };
        return await this._askPeerAndListen(msg, 'PutState');
    }

and it looks like this handler is instantiated in $/hyperledger/fabric-chaincode-node/libraries/fabric-shim/lib/chaincode.js

static start(chaincode) {
    // ...
    const client = new Handler(chaincode, url, optsCpy);
    // ...
}

which in turn gets called when the chaincode is registered in $/hyperledger/fabric-chaincode-node/libraries/fabric-shim/lib/contract-spi/bootstrap.js

static register(contracts, serializers, fileMetadata, title, version) {
        // load up the metadata that the user may have specified;
        // this will need to be passed in and rationalized with the
        // code as implemented
        const chaincode = new ChaincodeFromContract(contracts, serializers, fileMetadata, title, version);

        // say hello to the peer
        shim.start(chaincode);
    }
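Putting the pieces together: every getState/putState in contract code is a message round trip to the peer, mediated by the handler. A minimal mock of that call path, for illustration only (MockHandler and MiniStub are made-up names standing in for ChaincodeSupportClient and ChaincodeStub, not fabric-shim classes):

```javascript
// MockHandler stands in for the peer-facing handler; it keeps state in a
// Map instead of sending GET_STATE/PUT_STATE protobuf messages.
class MockHandler {
    constructor() { this.state = new Map(); }
    async handleGetState(collection, key) { return this.state.get(key); }
    async handlePutState(collection, key, value) { this.state.set(key, value); }
}

// MiniStub mirrors the stub.js methods shown above: public data is
// addressed with an empty collection, and string values become Buffers.
class MiniStub {
    constructor(handler) { this.handler = handler; }
    async getState(key) {
        const collection = '';
        return await this.handler.handleGetState(collection, key);
    }
    async putState(key, value) {
        const collection = '';
        if (typeof value === 'string') {
            value = Buffer.from(value);
        }
        return await this.handler.handlePutState(collection, key, value);
    }
}
```

The real stub differs only in that the handler serializes the request into a ChaincodeMessage and awaits the peer's reply.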
Posted in Software | Leave a comment

PHP Interactive Shell

I keep forgetting this, so I need to make a note of it. This is how you use it:

$ php -a
Interactive shell

php > $a='hello world';
php > echo $a;
hello world

Don’t forget the semicolons, and remember that PHP variables start with a $.


Useful SO Queries

CDF of User Reputation


with cte as (select reputation, cume_dist() over (order by reputation) as cdf
from users)
select reputation, max(cdf) from cte
group by reputation
order by reputation;

According to this,

  • 71% of users have reputation <= 1. These are users who just created an account on the site.
  • 74% of users have reputation <= 10
  • 92% of users have reputation <= 100
  • 98% of users have reputation <= 1,000
  • 99.8% of users have reputation <= 10,000

also: http://rjbaxley.com/posts/2016/11/08/Stack_Exchange_Reputation_Power_Law.html
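The cume_dist() window function used above can be mimicked outside SQL; a quick sketch of computing the same CDF over a list of reputation values (the max(cdf)-per-reputation step falls out of letting the last index for each value win):

```javascript
// cume_dist(r) = (number of rows with value <= r) / (total rows).
// Sorting and taking (index + 1) / n, with later duplicates overwriting
// earlier ones, reproduces "select reputation, max(cdf) ... group by reputation".
function cumeDist(reputations) {
    const sorted = [...reputations].sort((a, b) => a - b);
    const n = sorted.length;
    const cdf = new Map();
    sorted.forEach((r, i) => cdf.set(r, (i + 1) / n));
    return cdf;
}
```

For example, cumeDist([1, 1, 1, 10]) maps reputation 1 to 0.75, matching the "71% of users have reputation <= 1" style of readout.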

Question Count over time


DECLARE @Tag1 varchar(255)  = ##Tag1:string##;

    SELECT      DATEADD (month, DATEDIFF (month, 0, q.CreationDate), 0)  AS [Month],
                COUNT (q.Id)                AS NumQuests

    FROM        Posts           q
    INNER JOIN  PostTags        pt
    ON          q.Id            = pt.PostId
    INNER JOIN  Tags            t
    ON          t.Id            = pt.TagId

    WHERE       q.PostTypeId    = 1
    AND         q.CreationDate >= '2016-01-01 00:00:00'
    AND         t.TagName       IN (@Tag1)

    GROUP BY    DATEDIFF (month, 0, q.CreationDate)
    ORDER BY    DATEDIFF (month, 0, q.CreationDate);
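The DATEADD (month, DATEDIFF (month, 0, d), 0) idiom truncates a date to the first of its month, which is what buckets the counts. The same bucketing in JS, as a sketch:

```javascript
// Mirror of T-SQL's DATEADD(month, DATEDIFF(month, 0, d), 0):
// truncate a date to the first day of its month (UTC).
function monthBucket(d) {
    return new Date(Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), 1));
}
```

So a question created on 2016-04-22 lands in the 2016-04-01 bucket.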

The graph below shows the number of questions tagged hyperledger-fabric over time.

[Graph: number of questions tagged hyperledger-fabric over time]

CDF of Question Count vs. time

[Graph: total number of questions over time]

failed to create runc console socket: mkdir /tmp/pty064803763: no space left on device: unknown

Got this error today when trying to log into a docker container

$ docker exec -it mysql /bin/bash
failed to create runc console socket: mkdir /tmp/pty064803763: no space left on device: unknown

Then I tried

$ df -h
Filesystem                 Size  Used Avail Use% Mounted on
devtmpfs                   3.9G     0  3.9G   0% /dev
tmpfs                      3.9G  4.0K  3.9G   1% /dev/shm
tmpfs                      3.9G  378M  3.6G  10% /run
tmpfs                      3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/mapper/rootvg-rootlv  7.8G  128M  7.3G   2% /
/dev/mapper/rootvg-usrlv   9.8G  2.3G  7.0G  25% /usr
/dev/sda1                  976M  113M  797M  13% /boot
/dev/mapper/rootvg-optlv   2.0G  1.5G  303M  84% /opt
/dev/mapper/rootvg-homelv  976M  3.9M  905M   1% /home
/dev/mapper/rootvg-tmplv   2.0G  432M  1.4G  24% /tmp
/dev/mapper/rootvg-varlv   7.8G  4.2G  3.3G  56% /var
/dev/sdc                   246G  1.1G  233G   1% /app
/dev/sdb1                   16G  2.1G   13G  14% /mnt/resource
tmpfs                      797M     0  797M   0% /run/user/48081

which did not indicate any problem; /tmp is only 24% used. Then:

$ df -i
Filesystem                  Inodes  IUsed    IFree IUse% Mounted on
devtmpfs                   1016516    431  1016085    1% /dev
tmpfs                      1019390      2  1019388    1% /dev/shm
tmpfs                      1019390    856  1018534    1% /run
tmpfs                      1019390     16  1019374    1% /sys/fs/cgroup
/dev/mapper/rootvg-rootlv   524288   6703   517585    2% /
/dev/mapper/rootvg-usrlv    655360  86575   568785   14% /usr
/dev/sda1                    65536    343    65193    1% /boot
/dev/mapper/rootvg-optlv    131072  53934    77138   42% /opt
/dev/mapper/rootvg-homelv    65536    166    65370    1% /home
/dev/mapper/rootvg-tmplv    131072 131072        0  100% /tmp
/dev/mapper/rootvg-varlv    524288  26116   498172    5% /var
/dev/sdc                  16384000   3649 16380351    1% /app
/dev/sdb1                  1048576     13  1048563    1% /mnt/resource
tmpfs                      1019390      1  1019389    1% /run/user/48081

The above shows that all the inodes under /tmp have been exhausted. On listing /tmp, I saw a lot of files like this:

#-> cat /tmp/tmp.01amFL2hH4
read_kt /etc/krb5.keytab
write_kt /tmp/tmp.kt

For now, I manually deleted the files, after which logging in works again:

$ docker exec -it mysql /bin/bash

Understanding the internals of wordpress php-fpm-alpine image

This post describes the internals of wordpress:php7.4-fpm-alpine Docker image.

The wordpress container does nothing but run the following command and wait for it to end, which it never does, since the command spins up a server that listens for incoming connections forever:

$ docker-entrypoint.sh php-fpm

The file docker-entrypoint.sh can be found at /usr/local/bin/docker-entrypoint.sh inside the container. It is instructive to read and understand this file. It does some setup, such as installing the wordpress files into the WORKDIR (which defaults to /var/www/html), and ends with

exec "$@"

The above line runs php-fpm as if it were a continuation of the docker-entrypoint.sh script [1].

php-fpm is PHP’s FastCGI process manager, which spins up a server listening on port 9000.

By inspecting the docker image (docker image inspect wordpress:php7.4-fpm-alpine) one can see from where the script installs PHP. This is given by PHP_URL in the Env section, excerpted below:

"Env": [
    "PHPIZE_DEPS=autoconf \t\tdpkg-dev dpkg \t\tfile \t\tg++ \t\tgcc \t\tlibc-dev \t\tmake \t\tpkgconf \t\tre2c",
    "PHP_EXTRA_CONFIGURE_ARGS=--enable-fpm --with-fpm-user=www-data --with-fpm-group=www-data --disable-cgi",
    "PHP_CFLAGS=-fstack-protector-strong -fpic -fpie -O2 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64",
    "PHP_CPPFLAGS=-fstack-protector-strong -fpic -fpie -O2 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64",
    "PHP_LDFLAGS=-Wl,-O1 -pie",
    "GPG_KEYS=42670A7FE4D0441C8E4632349E4FDC074A4EF02D 5A52880781F755608BF815FC910DEB46F53EA312",
    ...
]

We can also see from PHP_EXTRA_CONFIGURE_ARGS that the php-fpm process will run as www-data. There are also references to www-data in docker-entrypoint.sh.

TIP: If you want to change the directory in which wordpress gets installed, simply change the --workdir to the location in which you want wordpress to be installed.

WordPress Boot Sequence

There are two great articles explaining how wordpress loads up when a request is received.

The bad: it looks like this boot sequence happens on EVERY request. That is why you don’t need to refresh or restart anything if you make changes to wp-config.php for example.

The boot sequence is this: under /var/www/html there is an index.php, which loads wp-blog-header.php, which loads wp-load.php, which loads wp-config.php, which loads wp-settings.php, which loads as many as 36 files.

bash-5.0# grep -n wp-blog-header.php index.php
4: * wp-blog-header.php which does and tells WordPress to load the theme.
17:require __DIR__ . '/wp-blog-header.php';
bash-5.0# grep -n wp-load.php wp-blog-header.php
13:	require_once __DIR__ . '/wp-load.php';
bash-5.0# grep -n wp-config.php wp-load.php
4: * and loading the wp-config.php file. The wp-config.php
8: * If the wp-config.php file is not found then an error
10: * wp-config.php file.
12: * Will also search for wp-config.php in WordPress' parent
27: * If wp-config.php exists in the WordPress root, or if it exists in the root and wp-settings.php
28: * doesn't, load wp-config.php. The secondary check for wp-settings.php has the added benefit
34:if ( file_exists( ABSPATH . 'wp-config.php' ) ) {
37:	require_once ABSPATH . 'wp-config.php';
39:} elseif ( @file_exists( dirname( ABSPATH ) . '/wp-config.php' ) && ! @file_exists( dirname( ABSPATH ) . '/wp-settings.php' ) ) {
42:	require_once dirname( ABSPATH ) . '/wp-config.php';
76:		/* translators: %s: wp-config.php */
78:		'<code>wp-config.php</code>'
83:		__( 'https://wordpress.org/support/article/editing-wp-config-php/' )
86:		/* translators: %s: wp-config.php */
88:		'<code>wp-config.php</code>'
bash-5.0# grep -n wp-settings.php wp-config.php
108:require_once ABSPATH . 'wp-settings.php';

Request Flow

In my nginx conf, I have:

location / {
    try_files $uri $uri/ /index.php?$args;
}

Now when a request comes in for https://mywordpresssite.com/2020/04/22/e2e-connected-visibility-platform/, nginx will first see if there is a resource with that name. There isn’t any, so nginx will perform an internal redirect and attempt to serve index.php?2020/04/22/e2e-connected-visibility-platform. This kicks in the index.php under /var/www/html and the wordpress boot sequence. So requests for pretty much all resources go through index.php, which acts as the Main method.


Debugging File Upload Issues with WordPress

First make sure you have adequately increased the maximum file size that can be uploaded to wordpress. It is possible that uploads may still not work. On the php-fpm-alpine image (and even on apache), wordpress runs under the credentials of the www-data user. This user needs write access to the wp_upload_dir directory and the subdirectory under it where it will attempt to create the file; the full path resolves to, for example, /var/www/html/wp-content/uploads/2020/06. Use the namei utility with the -om flags to see if www-data has sufficient permissions to create a file in the directory. Use chown -R to make www-data the owner of all subdirectories under wp-content.


How to change the maximum file size you can upload to wordpress on nginx w/ php-fpm

Apache is a solid web server, but my go-to environment for wordpress has now changed to running it on Docker with NGINX + php7.4-fpm-alpine. To change the max file size you can upload to wordpress, you will need to make changes in two places:


In the server block, set the client_max_body_size like so:

server {
    listen 443 ssl;
    server_name '"$COMMON_NAME"';
    ssl_certificate /home/server.crt;
    ssl_certificate_key /home/server.key;
    root /var/www/html;
    client_max_body_size 100M;
    ...
}

This setting controls how much the web server itself will allow before wordpress even comes into the picture.


Create the following file if it does not exist, /usr/local/etc/php/conf.d/uploads.ini, and edit it like so:

file_uploads = On
memory_limit = 500M
upload_max_filesize = 100M
post_max_size = 100M
max_file_uploads = 50000
max_execution_time = 5000
max_input_time = 5000

You are all set! The above settings control how much wordpress will allow; take a look at wp_max_upload_size.
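wp_max_upload_size boils down to the smaller of upload_max_filesize and post_max_size. A sketch of that calculation (the php.ini size parsing here is illustrative, not WordPress's own code):

```javascript
// Parse a php.ini shorthand size like '100M' into bytes.
function iniSizeToBytes(s) {
    const units = { K: 1 << 10, M: 1 << 20, G: 1 << 30 };
    const m = /^(\d+)([KMG])?$/i.exec(s.trim());
    return Number(m[1]) * (m[2] ? units[m[2].toUpperCase()] : 1);
}

// Effective cap: wp_max_upload_size() is essentially the minimum of the
// two ini limits, whatever client_max_body_size allows upstream.
function maxUploadBytes(uploadMaxFilesize, postMaxSize) {
    return Math.min(iniSizeToBytes(uploadMaxFilesize), iniSizeToBytes(postMaxSize));
}
```

This is why raising only one of the two ini settings (or only client_max_body_size in nginx) is not enough: the smallest of the three wins.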

Tip: Verify that /usr/local/etc/php/conf.d/uploads.ini is indeed the correct file to modify by running phpinfo() in the PHP REPL:

bash-5.0# php -a
Interactive shell

php > phpinfo();
PHP Version => 7.4.5


Configuration File (php.ini) Path => /usr/local/etc/php
Loaded Configuration File => (none)
Scan this dir for additional .ini files => /usr/local/etc/php/conf.d
Additional .ini files parsed => /usr/local/etc/php/conf.d/docker-php-ext-bcmath.ini,


Rahul Gandhi with Narendra Modi in Restaurant

Modi ji went into a restaurant, took the seat next to Rahul, and started watching TV.

The 9 PM news was on, showing a man about to commit suicide by jumping off the roof of a very tall building.

Rahul asked Modi ji, "What do you think, will he jump?" Modi ji said, "Yes, he will jump. I can bet on it."

Rahul said, "Fine, it's a bet: he won't jump." Saying "okay", Modi ji put two thousand rupees on the table.

As soon as Rahul put his own two thousand rupees on the table, the man jumped off the building and died. With great disappointment and reluctance, Rahul handed the two thousand rupees to Modi ji.

Modi ji said, "No, I can't take this money from you, because I had already seen this on the 5 PM news, so I knew he would jump."

Rahul said, "I had also seen all of it on the 5 PM news, but I didn't think the man would do it again."...

Now Modi ji took the money.



Complete guide to Azure AD Authentication in Node

There is a lot of documentation available on how to authenticate users against Azure AD. There are even libraries like MSAL.js that are supposed to do the job for you. Unfortunately, I found the Microsoft documentation very noisy and confusing. Which library am I supposed to use? MSAL.js? ADAL.js? There is even a Passport.js option, in classic Microsoft fashion of developing multiple products for the same use case (think Yammer, Teams, Skype, or OneDrive, Sharepoint). Should I do OAuth or OpenIDConnect? It does not stop there. Microsoft provides two endpoints for doing OAuth: v1 and v2.

The link that is most helpful to understand and get the background on doing authentication using OAuth or OpenIdConnect (OIDC) is https://developer.okta.com/blog/2019/10/21/illustrated-guide-to-oauth-and-oidc. Start by reading this link and understand the four things you need in Azure before you can proceed further:

  • Client ID: identifier of your application. Also known as Application ID. You will get this by creating a new App Registration in Azure.
  • Tenant ID: identifier of the organization’s Azure AD against which you are trying to authenticate users.
  • Client Secret: password of your application. You create a new client secret in Azure portal.
  • Redirect URL: this has to be set up in Azure as well.

OAuth authentication is a two-step process. We use the v1 endpoints in this article.

There are two endpoints involved in OAuth authentication. An authorization endpoint and a token endpoint.

1. In the first step, one makes a request to the authorization endpoint. A complete request might look like:
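As a sketch, a v1 authorize request can be assembled like this; the tenant ID, client ID, and redirect URL below are all placeholder assumptions, not values from this post:

```javascript
// Build the v1 authorize URL. Every value below is a placeholder.
const tenantId = '<tenant-id>';
const params = new URLSearchParams({
    client_id: '<client-id>',
    response_type: 'code',                               // ask for an access code
    redirect_uri: 'https://mysite.example.com/auth/callback',
    state: '/page/the/user/wanted'                       // round-tripped back to us
});
const authorizeUrl =
    `https://login.microsoftonline.com/${tenantId}/oauth2/authorize?` + params.toString();
```

The browser is sent to authorizeUrl; everything after the ? is just URL-encoded query parameters.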


This endpoint will take the user to the Microsoft login page. The user logs in. The user is then prompted to grant permissions to the app. If the user approves, an access code is returned to the user’s browser and the browser is redirected to the redirect_uri (think of this as a callback URL). The point to note here is that it is the user’s browser that invokes the redirect_uri with the access code, not Azure; Azure does not call the callback. The next thing to note is that Azure only accepts an https URL for the redirect_uri, and this restriction comes from the protocol itself because of security considerations [https://tools.ietf.org/html/rfc6749#section-10.3]

An example callback might look like
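An illustrative sketch of such a callback, and of pulling the code and state back out of it (the URL and the code value are made up):

```javascript
// Parse the access code and state out of a callback invocation.
// The URL below is an illustrative example, not a real one.
const callback = new URL(
    'https://mysite.example.com/auth/callback?code=AQABAAIA...&state=%2Fpage%2Fthe%2Fuser%2Fwanted');
const accessCode = callback.searchParams.get('code');
const landingUrl = callback.searchParams.get('state'); // decodes to '/page/the/user/wanted'
```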


Whatever you pass in the state parameter in the call to authorize, you will get back in the callback. You should set the state parameter to the URL the user was trying to access in the first place. Otherwise, that information will be lost and you won’t be able to land the user on the appropriate page after logging them in.

2. In your callback method, you use the access code to get an access token for the user. To do this you have to make a POST request to the token endpoint, and you need to know the client secret. Node code for doing that would look like the following:

var postData = querystring.stringify({
                'grant_type': 'authorization_code',
                'code': req.query.code,
                'client_id': this.clientId,
                'client_secret': this.clientSecret,
                'resource': this.clientId,
                'redirect_uri': this.callbackUrl,
                'scope': 'user.read'
            });

var tokenRequestOptions = {
                hostname: this.tokenUrl.hostname,
                port: 443,
                path: this.tokenUrl.pathname,
                method: 'POST',
                timeout: 45000, // ms
                headers: {
                    'Content-Type': 'application/x-www-form-urlencoded',
                    'Content-Length': Buffer.byteLength(postData)
                }
            };

const tokenRequest = https.request(tokenRequestOptions, tokenResponse => {
    // ... accumulate and parse the response body here
});
tokenRequest.write(postData);
tokenRequest.end();

A successful response is returned as a string that you will have to parse using JSON.parse; among its fields are the access_token and the id_token.

From here on, you can use either the access_token or the id_token to get user details. I found both have the same info. I used the id_token, which is a JWT. To decode it, use the jwt-decode or jsonwebtoken library. The decoded token has the following info in it:

Decoded ID Token

The aud field is equal to your client ID, and the UUID in the iss field identifies the issuer, i.e., the tenant ID. You can use these to validate the token. See this for what the other fields mean.
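If you'd rather not pull in jwt-decode just to peek at the claims, decoding the payload is a few lines. Note this only decodes; it does NOT verify the signature (use jsonwebtoken's verify for that):

```javascript
// A JWT is header.payload.signature, each segment base64url-encoded.
// Convert base64url to base64 and decode the middle segment.
function decodeJwtPayload(token) {
    const b64 = token.split('.')[1].replace(/-/g, '+').replace(/_/g, '/');
    return JSON.parse(Buffer.from(b64, 'base64').toString('utf8'));
}
```

Calling decodeJwtPayload(idToken).aud should give back your client ID, per the validation note above.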

That’s it! You have authenticated the user! Observe that the client secret is never shared with the browser. The story does not end here, because now you should set some persistent info (think cookie) in the response sent back to the browser, so that when the user visits the site again they don’t have to log in again! Also, we need to redirect the browser to the page the user has been trying to visit all this time.

There are many ways you can set a cookie. I used the following code, which uses the jsonwebtoken library:

const userToken = {
            uid: userInfo.unique_name,
            name: userInfo.name
        };
        const signedToken = jwt.sign(userToken, this._jwtSecret, { expiresIn: 86400 }); // 24h, in seconds
        const redirectUrl = url.resolve(this._siteUrl, landingUrl);
        res
            .cookie(this._loginCookie, signedToken, { maxAge: 86400 * 1000, httpOnly: true, secure: true }) // maxAge is in ms
            .redirect(redirectUrl);

The secure=true option tells the browser to transmit the cookie only via TLS, and httpOnly=true prevents the cookie from being accessed via client-side javascript. Both are pretty important.

Now when the user makes a request again, they can be authenticated by examining the cookie. We use the cookie-parser library to access cookies via req.cookies in the code below.

const token = req && req.cookies && req.cookies[this._loginCookie];
        if (!token) {
            return false;
        }
        // Test the validity of the token
        try {
            const decodedToken = jwt.verify(token, this._jwtSecret);
            // Compare the token expiry (in seconds) to the current time (in milliseconds)
            // Bail out if the token has expired
            if (decodedToken.exp <= Date.now() / 1000) {
                return clearTokenAndNext();
            }
            return {
                'email': decodedToken.uid,
                'name': decodedToken.name
            };
        } catch (err) {
            return clearTokenAndNext();
        }
Finally, we can wrap this up by adding a method that will log out a signed-in user:

logout(req, res) {
        this._clearLoginCookie(res);
        res.status(200).send('You have successfully logged out');
    }

    _clearLoginCookie(res) {
        // remove the login cookie so subsequent requests are unauthenticated
        res.clearCookie(this._loginCookie);
    }
There you have it! With a little effort, it can all be done without using any library. You will need the following dependencies:

"dependencies": {
    "cookie-parser": "~1.4.5",
    "express": "~4.17.1",
    "jwt-decode": "~2.2.0",
    "jsonwebtoken": "~8.5.1"
}

Good Luck! Make it happen!

Implicit Grant explained: The best practice is to obtain the access or ID token on the server in the callback. But if you are developing an SPA which has no server-side code, then you have no choice but to obtain the token on the client side. To do this, you need to check the implicit grant boxes.

Again, do this only if you have no server-side code, i.e., there is no choice but to get the token on the client itself. https://tools.ietf.org/html/rfc6749#section-10.3 clearly states:

When using the implicit grant type, the access token is transmitted in the URI fragment, which can expose it to unauthorized parties.

If you enable implicit grant, you can acquire the access or ID token in the call to authorize itself as shown in https://docs.microsoft.com/en-us/azure/active-directory/develop/v2-oauth2-implicit-grant-flow

For this to work, you need to use response_type=id_token instead of response_type=code when making the request to the authorize endpoint. The implicit grant allows you to get the token directly from the authorize endpoint without having to know the client secret; it short-circuits the step of using the access code to get the access or ID token in the callback on the server. TL;DR: don’t do it.

V1 vs V2: Azure AD supports two endpoints for OAuth. What’s the difference? Scopes are only supported by the v2 endpoint. In my case the v1 endpoint was perfect, as it returns all the information about the user. See https://docs.microsoft.com/en-us/azure/active-directory/develop/v2-oauth2-auth-code-flow and https://docs.microsoft.com/en-us/azure/active-directory/develop/v2-permissions-and-consent for information on scopes.

  • The email scope allows your app access to the user’s primary email address through the email claim in the id_token, assuming the user has an addressable email address.
  • The profile scope affords your app access to all other basic information about the user, such as their name, preferred username, object ID, and so on, in the id_token.

CloudFlare 522 Error. Server does not respond.

This is probably the longest time I have spent debugging something (4 days), so it’s worth the effort to write about it.

The problem: We built a wordpress site but got CloudFlare 522 error when trying to connect to it. Our logs were completely empty.

The Setup: Our setup consisted of a VM in a private network with no public IP. This VM ran 3 docker containers connected together in a user-defined bridge network. The first container was an nginx server which would forward requests for PHP resources to the wordpress container. The third container was running MySQL 5.7. The nginx server configuration was borrowed from here. If the VM has no public IP, how does one access it? The VM was connected to a layer-4 load balancer in a DMZ with a public IP. Further, the load balancer would only accept traffic from the CloudFlare CDN. The purpose of CloudFlare was to filter out malicious traffic and protect the servers.

CloudFlare -> Load Balancer -> NGINX -> WordPress -> MySQL

How we debugged: We tried all the usual things. We tested that the host is forwarding http (port 80) and https (port 443) traffic to the nginx container.

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                                      NAMES
ec8600d9c4b4        nginx:1.17                    "nginx -g 'daemon of…"   24 hours ago        Up 24 hours         0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp   nginx
feb535607eb2        wordpress:php7.4-fpm-alpine   "docker-entrypoint.s…"   25 hours ago        Up 25 hours         9000/tcp                                   wordpress
48b2dea3706b        mysql:5.7                     "docker-entrypoint.s…"   25 hours ago        Up 25 hours         0.0.0.0:3306->3306/tcp, 33060/tcp          mysql

We tested that the wordpress container is listening on port 9000:

$ docker exec wordpress netstat -tpln
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0*               LISTEN      -
tcp        0      0 :::9000                 :::*                    LISTEN      1/php-fpm.conf)

We tested that the nginx container can connect to the wordpress container:

$ docker exec -it nginx /bin/bash
root@ec8600d9c4b4:/# apt-get install netcat
root@ec8600d9c4b4:/# nc -zv wordpress 9000
DNS fwd/rev mismatch: wordpress != wordpress.wordpress_net
wordpress [] 9000 (?) open

We edited our wp-config.php to enable debug logging in wordpress:

// Enable WP_DEBUG mode
define( 'WP_DEBUG', true );

// Enable Debug logging to the /wp-content/debug.log file
define( 'WP_DEBUG_LOG', true );

// Disable display of errors and warnings
define( 'WP_DEBUG_DISPLAY', false );
@ini_set( 'display_errors', 0 );

// Use dev versions of core JS and CSS files (only needed if you are modifying these core files)
define( 'SCRIPT_DEBUG', true );

We then tailed the following logs:

$ docker logs -f nginx
$ docker logs -f wordpress
$ docker exec wordpress tail -f /var/www/html/wp-content/debug.log

Logs were empty.

The next thing we tried was running a bare-bones nginx server without the wordpress setup, i.e., just:

$ docker run -d -p 80:80 nginx:alpine

Now the error vanished! This led us to believe that CloudFlare was working properly. So the problem had to be with wordpress. But then why was there no error in the logs? Without anything in the logs to give a clue, we were stuck for a long time. Then it dawned on us to access the VM directly using its private IP from the VPN and lo and behold the server responded and wordpress loaded up! So it couldn’t be a problem with wordpress!

When 2+2 does not equal 4: If CloudFlare is working properly as well as WordPress, then we are led to a logical contradiction and the 522 cannot be explained.

When we were out of moves we contacted CloudFlare, and they told us that all requests to the site were timing out, leading them to believe something was blocking requests from CloudFlare’s IPs:

Source IP: Y.Y.Y.Y
nc: connect to X.X.X.X port 443 (tcp) timed out: Operation now in progress
[exit code 1]
Source IP: Y.Y.Y.Y
nc: connect to X.X.X.X port 443 (tcp) timed out: Operation now in progress
[exit code 1]
Source IP: Y.Y.Y.Y
nc: connect to X.X.X.X port 443 (tcp) timed out: Operation now in progress
[exit code 1]
Source IP: Y.Y.Y.Y
nc: connect to X.X.X.X port 443 (tcp) timed out: Operation now in progress
[exit code 1]
Source IP: Y.Y.Y.Y
nc: connect to X.X.X.X port 443 (tcp) timed out: Operation now in progress
[exit code 1]

But we knew there was nothing blocking CloudFlare as the requests did succeed when we ran a bare-bones NGINX server.

Finally, the light struck us on the 4th day. The load balancer in Azure uses health probes to know if a machine is healthy. It construes any non-200 response as an indication of an unhealthy server. This is all documented here:

An HTTP / HTTPS probe fails when:

Probe endpoint returns an HTTP response code other than 200 (for example, 403, 404, or 500). This will mark down the health probe immediately.

but this was the first time I was using a load balancer up close and personal, and I didn’t know this. The endpoint it was probing was /, to which nginx was responding with a 302 redirect. As soon as I pointed the health probe at a custom endpoint from which I configured NGINX to return 200, the error vanished!

    # this special section is for the load balancer.
    # The elastic load balancer in azure needs to know if a machine is healthy.
    # We assume it does that by making a request to the /health-probe endpoint.
    # If the load balancer gets a non-200 response it will mark the machine as unhealthy
    # and not send requests to it.
    location /health-probe {
        return 200 OK;
    }
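The probe semantics that bit us boil down to one line; from the balancer's point of view, the 302 that nginx returned for / was a failing probe:

```javascript
// Azure LB health-probe semantics as documented: only an HTTP 200
// counts as healthy; 302, 403, 404, 500, etc. all mark the backend down.
function probeHealthy(statusCode) {
    return statusCode === 200;
}
```

Which is why a redirect on the probed path silently took the whole backend out of rotation.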

It was a typical example of a problem that needs out-of-the-box thinking. For the longest time I kept thinking the problem might be in the docker network, as I have struggled with it in the past. The fact that CloudFlare worked, and WordPress worked (when we hit the VM using its private IP), yet we were still getting 522s, was the thing that puzzled me the most. It left us with no clue.
