When processes on your Hypernode require more memory than is available, there is a risk of downtime. When such an ‘out of memory’ event occurs, the Linux oom-killer process kills processes in order to free up memory. This oom-killer is a last resort measure to prevent the Hypernode from going down in its entirety.

The OOM-killer explained

The OOM-killer will try to kill the process using the most memory first. As this is usually MySQL, that is not always a good idea. We therefore use cgroups, a Linux kernel feature, to group certain processes together and assign them a score, giving us control over which processes should be killed first and which crucial processes should not be killed at all.

Processes like SSH, crons or inactive Nginx workers will typically be killed first, having been assigned a high score of 1000. You’ll also encounter processes with a score of -1000, which translates to ‘never kill this process’ and is assigned to, for example, mysqld and php-fpm. MySQL in particular tends to hog memory and is reluctant to release it, which can be worked around by performing a graceful restart using the hypernode-servicectl tool.
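
You can look these scores up yourself with standard Linux tooling; this reads straight from /proc and is not a Hypernode-specific command. For example, the command below prints the score of the main MySQL process, which should be -1000 as described above (the exact process name may differ on your node):

$ cat /proc/$(pgrep -o mysqld)/oom_score_adj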

Configuring the OOM-killer

It is possible to change the behaviour of the OOM-killer, configuring it to be more permissive of short-term memory usage and allowing memory-hungry processes to run without overly drastic measures being taken. While this allows short-lived scripts to run more stably, it also means that in shops with a certain type of memory footprint, processes that would later become memory problems are no longer culled at an early stage.
The default is to be more restrictive, as on average that seems to be the most effective setting.

This setting can be changed using the hypernode-api or the hypernode-systemctl command-line tool from the Hypernode.
If you want to enable or disable this setting yourself, you can do so with the command below. But before you do, be sure to read this changelog for an in-depth explanation of what exactly the setting entails.

$ hypernode-systemctl settings permissive_memory_management --value True

To disable this setting use the following command:

$ hypernode-systemctl settings permissive_memory_management --value False

Note that this setting can give you some more leeway in regards to memory utilization on Hypernode: it lets you decide whether you value keeping the site online at the cost of killing non-essential processes early, or would rather trade some risk in terms of stability for a better chance that memory-hungry one-off processes complete. However, if you notice structural Out Of Memory messages in your kern.log, that often indicates a real problem in the shop or that it might be time to upgrade.

Prioritize important processes
To protect a command from the OOM-killer and give it maximum priority over all other services, you can simply ‘wrap’ the command with the hypernode-oom-protect command. This will start the specified command as normal, except that its oom_score_adj will be set to -1000.

For example:
$ hypernode-oom-protect php /data/web/memory_hungry_script.php
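
If you have a recurring job that keeps being killed, you can wrap it in your crontab in the same way. The schedule and script path below are only illustrations; substitute your own:

0 3 * * * hypernode-oom-protect php /data/web/nightly_import.php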

Because the MySQL process is also marked as unkillable by default, it is possible to execute scripts that interact directly with the database (for example using PDO) without them being hampered by the memory state of the system.

There is still a finite amount of memory
Please keep in mind that marking processes as unkillable does not magically add more memory, and it will not prevent memory allocation errors like PHP Fatal error: Out of memory (allocated 1234) (tried to allocate 12345 bytes). It serves the useful purpose of taming the OOM-killer, but ultimately it only shifts the problem slightly. If you persistently have problems with memory (and these are not caused by a structural misallocation of resources), the real solution is still to upgrade to a bigger plan.

For more information about hypernode-oom-protect, please see the changelog.

Debug OOM events

If you receive a notification that an ‘out of memory’ event has occurred, the OOM-killer process will already have done its job and you’ll see that memory has been freed up. So in order to find out what happened, we’ll have to inspect the logs, using one of the commands below:

$ less /var/log/kern.log | grep -v 'UFW BLOCK'
$ dmesg --ctime --color=always | grep -v 'UFW BLOCK'

An example of the output is shown below.

[Screenshot: kern.log output showing a process invoking the OOM-killer]

So you may have found an out of memory event in your kernel log. It will start with ‘[process] invoked oom-killer’, followed by a stack trace and a list of running processes and their child processes.

[Screenshot: kern.log output showing the killed process]

The log ends with ‘Killed process [pid] (name of process)’. You’ll also have a timestamp (not included in these screenshots) so you’ll be able to check other logs and see if something correlates. Note that this does not always mean there is a single culprit. Logs you could check are /var/log/mysql/mysql-slow.log and /var/log/php-fpm/php-slow.log. In addition, you can check the exception.log and system.log in Magento’s /data/web/public/var/log folder.
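
If you just want these key lines without scrolling through the full log, you can filter on the phrases above (a simple sketch using standard grep):

$ grep -E 'invoked oom-killer|Killed process' /var/log/kern.log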

How to deal with out of memory events

If you receive emails about OOM events regularly, there are various ways to deal with this. These options roughly fall into two categories: reducing memory usage and reducing memory requirements. If none of these are sufficient, the remaining option is to upgrade to a larger plan.

Reducing memory usage

By reducing the memory usage of software running on your Hypernode, more memory will be available for visitors and for running periodic tasks. One of the main culprits of memory usage is MySQL. MySQL will allocate more memory when it needs it for running large queries, but will not free this memory afterwards. This means MySQL’s memory usage will only increase over time, and not go down again. A simple way to deal with this is to periodically restart MySQL using the hypernode-servicectl restart mysql command.
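
For example, you could schedule such a graceful restart from your crontab. The schedule below is only an illustration; pick a quiet moment for your shop:

30 4 * * 0 hypernode-servicectl restart mysql
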
If you are using a basic staging environment on your Hypernode, some memory will always be allocated to your staging environment. As such it might be beneficial to replace this with a separate Development Hypernode.

Reducing memory requirements

Some tasks in Magento use a lot of memory, sometimes more than is available on the server. Often this memory will not be freed until the process is finished, even if it is no longer needed.
One way to deal with this is to run smaller tasks, ending each process earlier and freeing up its memory again. By not stacking your cronjobs, and by running several smaller imports instead of one gigantic bulk import, you can turn a single large memory peak into several smaller memory bumps, which have a reduced impact on your system.
Another way is to reduce the amount of memory needed by reducing file sizes. As Magento often loads images into memory to resize them, optimizing images to take up less disk space means they also use less memory when Magento works on these files.
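
A minimal sketch of such an optimization, assuming the jpegoptim package (or any equivalent image optimizer) is available and that your product images live under Magento’s media folder; adjust the path and quality setting to your own setup:

$ find /data/web/public/media -type f -name '*.jpg' -exec jpegoptim --max=85 --strip-all {} \;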

Upgrade to a larger plan

If you cannot make enough memory available, the only remaining option is to upgrade to a larger Hypernode with more memory. As Magento becomes more complex over time, and databases and visitor counts grow, at one point or another your shop will become too large for your existing Hypernode and require an upgrade for extra memory and other server capacity.
