
Standard Linux Tuning

Hello Bloggers,

The majority of applications these days are deployed on a (Debian / Red Hat) Linux operating system as the base OS.

I would like to share some generic tuning that can be done before deploying any application on it.

Index Component Question / Test / Reason  
  These are some checks to validate the network setup.
  Network Are the switches redundant?
Unplug one switch.
  Network Is the cabling redundant?
Pull cables.
  Network Is the network full-duplex?
Double check setup.
Network adapter (NIC) Tuning
  It is recommended to consult with the network adapter provider on recommended Linux TCP/IP settings for optimal performance and stability on Linux.

There are also quite a few TCP/IP tuning sources on the Internet.

  NIC Are the NICs fault-tolerant (a.k.a. auto-port negotiation)?
Pull cables and/or disable network adapter.
  NIC Set the transmission queue depth to at least 1000.

txqueuelen <length>

‘cat /proc/net/softnet_stat’

Performance and stability (packet drops).
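Whether packets are being dropped can be read from /proc/net/softnet_stat: each row is one CPU, and the second column is the hexadecimal count of dropped packets. A minimal sketch that sums that column (the function name softnet_drops is made up for this example):

```shell
# Sum the "dropped" column (2nd field, hexadecimal) of softnet_stat-style input.
# Reads from standard input so it can also be fed saved snapshots.
softnet_drops() {
    total=0
    while read -r _processed dropped _rest; do
        # skip empty or malformed lines
        [ -n "$dropped" ] || continue
        total=$((total + 0x$dropped))
    done
    echo "$total"
}
```

Run it as softnet_drops < /proc/net/softnet_stat before and after a load test; a total that keeps growing points to drops worth investigating. On current distributions the queue depth itself can be set with ip link set dev eth0 txqueuelen 1000, where eth0 is a placeholder device name.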

  NIC Enable TCP/IP offloading (a.k.a. Generic Segmentation Offload (GSO)), which was added in kernel 2.6.18


Check: ethtool -k eth0

Modify: ethtool -K <DevName>


Note: I recommend enabling all supported TCP/IP offloading capabilities on an EMS host to free CPU resources.
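To see which offloads could still be switched on, the output of ethtool -k can be filtered for features that are off and not marked [fixed] (fixed features cannot be changed). A small sketch, assuming the usual "feature-name: on|off [fixed]" output layout; the function name offload_candidates is hypothetical:

```shell
# Read `ethtool -k <dev>` output on stdin and print the names of features
# that are currently off and not marked [fixed], i.e. candidates to enable.
offload_candidates() {
    grep ': off$' | sed 's/^[[:space:]]*//; s/: off$//'
}
```

Typical use would be ethtool -k eth0 | offload_candidates, followed by something like ethtool -K eth0 gso on for each feature the driver supports (eth0 is a placeholder).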

  NIC Enable Interrupt Coalescence (aka. Interrupt Moderation or Interrupt Blanking).


Check: ethtool -c eth0

Modify: ethtool -C <DevName>


Note: The configuration is system-dependent, but the goal is to reduce the number of interrupts per second at the ‘cost’ of slightly increased latency.

TCP/IP Buffer Tuning
  For a low-latency or high-throughput messaging system, TCP/IP buffer tuning is important.
Thus, instead of tuning the default values, one should rather check whether the settings (sysctl -a) provide large enough buffers. The values can be changed via the command sysctl -w <name>=<value>.

The values and comments below were taken from TIBCO support (FAQ1-6YOAA) and serve as a guideline towards “large enough” buffers, i.e. if your system configuration has lower values, it is suggested to raise them to the values below.

  TCP/IP Maximum OS receive buffer size for all connection types.

sysctl -w net.core.rmem_max=8388608

Default: 131071

  TCP/IP Default OS receive buffer size for all connection types.

sysctl -w net.core.rmem_default=65536

Default: 126976

  TCP/IP Maximum OS send buffer size for all connection types.

sysctl -w net.core.wmem_max=8388608

Default: 131071

  TCP/IP Default OS send buffer size for all types of connections.

sysctl -w net.core.wmem_default=65536

Default: 126976
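Since the guideline is to raise a value only when the current one is lower, the comparison can be scripted. A minimal sketch; check_min is a hypothetical helper that only prints the command it would suggest and changes nothing:

```shell
# Print a suggested "sysctl -w" command when the current value is below
# the guideline minimum; print nothing when the value is already large enough.
# $1: setting name, $2: current value, $3: guideline minimum
check_min() {
    name=$1
    current=$2
    minimum=$3
    if [ "$current" -lt "$minimum" ]; then
        echo "sysctl -w $name=$minimum"
    fi
}
```

For example, check_min net.core.rmem_max "$(sysctl -n net.core.rmem_max)" 8388608 prints a command only on hosts where the maximum receive buffer is still below 8 MB.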

  TCP/IP Is TCP/IP window scaling enabled?

sysctl net.ipv4.tcp_window_scaling

Default: 1

Note: As applications set buffer sizes explicitly, this ‘disables’ TCP/IP window scaling on Linux. Thus there is no point in enabling it, though there should be no harm in leaving the default (enabled). [This is my understanding / what I have been told, but I never double-checked and it could vary with kernel versions]

  TCP/IP TCP auto-tuning setting:

sysctl -w net.ipv4.tcp_mem='8388608 8388608 8388608'

Default: 1966087 262144 393216

The tcp_mem variable defines how the TCP stack should behave when it comes to memory usage:

–          The first value specified in the tcp_mem variable tells the kernel the low threshold. Below this point, the TCP stack does not bother at all about putting any pressure on the memory usage by different TCP sockets.

–          The second value tells the kernel at which point to start pressuring memory usage down.

–          The final value tells the kernel how many memory pages it may use maximally. If this value is reached, TCP streams and packets start getting dropped until we reach a lower memory usage again. This value includes all TCP sockets currently in use.
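Note that tcp_mem is counted in memory pages, whereas tcp_rmem and tcp_wmem are counted in bytes, so the same number means very different amounts of memory. A small sketch to convert a page count into MiB (pages_to_mib is a made-up helper; the page size defaults to what getconf PAGESIZE reports):

```shell
# Convert a number of memory pages to MiB.
# $1: page count, $2: page size in bytes (optional, defaults to the system page size)
pages_to_mib() {
    pages=$1
    page_size=${2:-$(getconf PAGESIZE)}
    echo $((pages * page_size / 1024 / 1024))
}
```

With 4 KiB pages, pages_to_mib 8388608 4096 shows that a tcp_mem value of 8388608 corresponds to 32768 MiB, which is worth comparing against the RAM actually installed.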

  TCP/IP TCP auto-tuning (receive) setting:

sysctl -w net.ipv4.tcp_rmem='4096 87380 8388608'

Default: 4096 87380 4194304

The tcp_rmem variable defines how the TCP stack should behave when it comes to memory usage:

–          The first value tells the kernel the minimum receive buffer for each TCP connection, and this buffer is always allocated to a TCP socket, even under high pressure on the system.

–          The second value specified tells the kernel the default receive buffer allocated for each TCP socket. This value overrides the /proc/sys/net/core/rmem_default value used by other protocols.

–          The third and last value specified in this variable specifies the maximum receive buffer that can be allocated for a TCP socket.

  TCP/IP TCP auto-tuning (send) setting:

sysctl -w net.ipv4.tcp_wmem='4096 65536 8388608'

Default: 4096 87380 4194304

This variable takes three different values which hold information on how much TCP send buffer memory space each TCP socket has to use. Every TCP socket has this much buffer space to use before the buffer is filled up.  Each of the three values are used under different conditions:

–          The first value in this variable tells the minimum TCP send buffer space available for a single TCP socket.

–          The second value in the variable tells us the default buffer space allowed for a single TCP socket to use.

–          The third value tells the kernel the maximum TCP send buffer space.

  TCP/IP This will ensure that immediately subsequent connections use these values.

sysctl -w net.ipv4.route.flush=1

Default: Not present

TCP Keep Alive
  In order to detect ungracefully closed sockets either the TCP keep-alive comes into play or the EMS client-server heartbeat. Which setup or which combination of parameters works better depends on the requirements and test scenarios.

As the EMS daemon does not explicitly enable TCP keep-alive on sockets, the TCP keep-alive settings (net.ipv4.tcp_keepalive_intvl, net.ipv4.tcp_keepalive_probes, net.ipv4.tcp_keepalive_time) do not play a role.

  TCP How many times to retry before killing an alive TCP connection. RFC 1122 says that the limit should be longer than 100 sec, which is too small a number. The default value of 15 corresponds to 13-30 minutes depending on the retransmission timeout (RTO).

sysctl -w net.ipv4.tcp_retries2=<test> (7 preferred)

Default: 15

Fault-Tolerance (EMS failover)

The default (15) is often considered too high and a value of 3 is often felt as too ‘edgy’ thus customer testing should establish a good value in the range between 4 and 10.
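To get a feel for what a given tcp_retries2 value means in wall-clock time, the exponential backoff can be added up. This is only a rough sketch: it assumes the RTO starts at 200 ms and is capped at 120 s, whereas the real RTO is derived from the measured round-trip time, so actual timeouts are usually longer. The function name is hypothetical.

```shell
# Hypothetical time (in ms) until a connection is dropped for a given
# tcp_retries2 value, assuming an initial RTO of 200 ms that doubles
# after every retransmission and is capped at 120 s.
retries2_timeout_ms() {
    retries=$1
    rto=200          # assumed initial RTO in ms
    rto_max=120000   # assumed RTO cap in ms
    total=0
    i=0
    while [ "$i" -le "$retries" ]; do
        total=$((total + rto))
        rto=$((rto * 2))
        if [ "$rto" -gt "$rto_max" ]; then
            rto=$rto_max
        fi
        i=$((i + 1))
    done
    echo "$total"
}
```

Under these assumptions retries2_timeout_ms 15 gives 924600 (about 15 minutes) and retries2_timeout_ms 7 gives 51000 (51 seconds), which illustrates why values between 4 and 10 are worth testing for failover.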

Linux System Settings
  System limits (ulimit) are used to establish boundaries for resource utilization by individual processes and thus protect the system and other processes. A too high or unlimited value provides zero protection but a too low value could hinder growth or cause premature errors.
  Linux Is the number of file descriptors at least 4096?

ulimit -n


Note: It is expected that the number of connected clients, and thus the number of connections, is going to increase over time. This setting allows for greater growth and also provides a greater safety margin should some application have a connection leak. Also note that the number of open connections can decrease system performance due to the way the OS handles the select() API. Thus, if the number of connected clients increases over time, care should be taken that all SLAs are still met.
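A start-up script can verify the limit before launching the application. A minimal sketch, where limit_at_least is a made-up helper that treats "unlimited" as large enough:

```shell
# Succeed when a ulimit value meets a minimum.
# $1: current limit as printed by ulimit (a number or "unlimited"), $2: minimum
limit_at_least() {
    current=$1
    minimum=$2
    [ "$current" = "unlimited" ] || [ "$current" -ge "$minimum" ]
}
```

For example: limit_at_least "$(ulimit -n)" 4096 || echo "file descriptor limit is below 4096". The same helper works for the other limits discussed in this section.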

  Linux Limit maximum file size for EMS to 2/5 of the disk space if the disk space is shared between EMS servers.

ulimit -f

Robustness: Contain the damage of a very large backlog.

  Linux Consider limiting the maximum data segment size for EMS daemons in order to avoid one EMS monopolizing all available memory.

ulimit -d

Robustness:  Contain the damage of a very large backlog.

Note: It should be tested if such a limit operates well with (triggers) the EMS reserved memory mode.

  Linux Limit the number of child processes to X to contain a rogue application (shell bomb)

ulimit -u

Robustness: Contain the damage a rogue application can do.

This is just an example of a Linux system setting that is unrelated to TIBCO products. It is recommended to consult with Linux experts for recommended settings.

Linux Virtual Memory Management
  There are a couple of virtual memory related settings that play a role in how likely Linux is to swap out memory pages and how Linux reacts to out-of-memory conditions. Neither aspect is important under “normal” operating conditions, but both are very important under memory pressure and thus for the system’s stability under stress.


A server running EAI software, and even more so a server running a messaging server like EMS, should rarely have to resort to swap space, for obvious performance reasons. However, considerations due to malloc/sbrk high-water-mark behavior, the behavior of the different over-commit strategies and the price of storage lead to the recommendations below: even with the tuning of the EMS server towards larger malloc regions[1], the reality is that the EMS daemon is still subject to the sbrk() high-water-mark and is potentially allocating a lot of memory pages that could be swapped out without impacting performance. Of course the EMS server instance must eventually be bounced, but the recommendations in this section aim to provide operations with a larger window to schedule the maintenance.


As these values operate as a bundle, they must be changed together, or any variation must be well understood.

  Linux Swap-Space:                1.5 to 2x the physical RAM (24-32 GB )

Logical-Partition:        One of the first ones but after the EMS disk storage and application checkpoint files.

Physical-Partition:     Use a different physical partition than the one used for storage files, logging or application checkpoints to avoid competing disk IO.


  Linux Committing virtual memory:

sysctl -w vm.overcommit_memory=2

$ cat /proc/sys/vm/overcommit_memory

Default: 0



Note: The recommended setting uses a new heuristic that only commits as much memory as is available, where available is defined as swap space plus a portion of RAM. The portion of RAM is defined in overcommit_ratio.

  Linux Committing virtual memory II:

sysctl -w vm.overcommit_ratio=25 (or less)

$ cat /proc/sys/vm/overcommit_ratio

Default: 50


Note: This value specifies what percentage of the RAM Linux will add to the swap space in order to calculate the “available” memory. The more the swap space exceeds the physical RAM, the lower the value that might be chosen.
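With overcommit_memory=2 the resulting limit is swap space plus overcommit_ratio percent of RAM, and the kernel reports it as CommitLimit in /proc/meminfo. A small sketch of the arithmetic (commit_limit_kb is a hypothetical helper and ignores huge pages, which the kernel subtracts from RAM first):

```shell
# Approximate commit limit in kB for overcommit_memory=2.
# $1: RAM in kB, $2: swap in kB, $3: overcommit_ratio in percent
commit_limit_kb() {
    ram_kb=$1
    swap_kb=$2
    ratio=$3
    echo $((swap_kb + ram_kb * ratio / 100))
}
```

For a host with 16 GB of RAM, 32 GB of swap and a ratio of 25, commit_limit_kb 16777216 33554432 25 yields 37748736 kB, i.e. 36 GB of committable memory.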

  Linux Swappiness

sysctl -w vm.swappiness=25 (or less)

$ cat /proc/sys/vm/swappiness
Default: 60



Note: The swappiness defines how likely memory pages will be swapped in order to make room for the file buffer cache.


Generally speaking an enterprise server should not need to swap out pages in order to make room for the file buffer cache or other processes which would favor a setting of 0. 


On the other hand it is likely that applications have at least some memory pages that almost never get referenced again and swapping them out is a good thing.

  Linux Exclude essential processes (Application) from being killed by the out-of-memory (OOM) daemon.

echo "-17" > /proc/<pid>/oom_adj

Default: NA




Note: With any configuration other than overcommit_memory=2 and overcommit_ratio=0, the Linux virtual memory management can commit more memory than is available. If that memory must then actually be provided, Linux engages the out-of-memory killer to kill processes based on “badness”. In order to exclude essential processes from being killed, one can set their oom_adj to -17.

  Linux 32bit Low memory area –  32bit Linux only

# cat /proc/sys/vm/lower_zone_protection
# echo "250" > /proc/sys/vm/lower_zone_protection

To set this option on boot, add the following to /etc/sysctl.conf:
vm.lower_zone_protection = 250



Linux CPU Tuning (Processor Binding & Priorities)
  This level of tuning is seldom required for any application solution. The tuning options are mentioned in case there is a need to go the extra mile.
  Linux IRQ-Binding

Recommendation: Leave default
Note: For real-time messaging the binding of interrupts to a certain exclusively used CPU allows reducing jitter and thus improves the system characteristics as needed by ultra-low-latency solutions.

The default on Linux is IRQ balancing across multiple CPUs, and Linux offers two solutions in that realm (kernel and daemon), of which at most one should be enabled.

  Linux Process Base Priority
Recommendation: Leave default


Note: The process base priority is determined by the user running the process instance, and thus running processes as root (chown and set the setuid bit) increases the process’s base priority.

A root user can further increase the priority of the application to real-time scheduling, which can further improve performance, particularly in terms of jitter. However, in 2008 we observed that doing so actually decreased the performance of EMS in terms of number of messages per second. That issue was researched with Novell at that time, but I am not sure of its outcome.

  Linux Foreground and Background Processes

Recommendation: TBD


Note: Linux assigns foreground processes a better base priority than background processes, but whether it really matters, and if so how to change the start-up scripts, is still to be determined.

  Linux Processor Set

Recommendation: Don’t bother


Note: Linux allows defining a processor set and limiting a process to only use cores from that processor set. This can be used to increase cache hits and cap the CPU resource for a particular process instance.
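If a processor set is needed after all, one common way to apply it is the taskset utility, which accepts a hexadecimal CPU mask with one bit per core. A minimal sketch for building such a mask from a list of core numbers (cpu_mask is a made-up helper):

```shell
# Build a hexadecimal CPU affinity mask from a list of CPU numbers.
# Example: CPUs 0 and 2 set bits 0 and 2, giving 0x5.
cpu_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$((mask | (1 << cpu)))
    done
    printf '0x%x\n' "$mask"
}
```

The result can be passed on, for instance taskset -p "$(cpu_mask 2 3)" <pid> to restrict an already running process to cores 2 and 3.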


[1] If larger memory regions are allocated, malloc() in the Linux glibc library uses mmap() instead of sbrk() to provide the memory pages to the process.

Memory-mapped regions (mmap()) are better at releasing memory back to the OS, and thus the high-water-mark effect is avoided for these regions.


CIDR Table – Basic Reference (From Wikipedia)

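Address format  Difference to     Mask             Addresses      2^n   Relative to    Typical use
                last address                       (decimal)            class A, B, C
a.b.c.d / 32    +0.0.0.0          255.255.255.255  1              2^0   1/256 C        Host route
a.b.c.d / 31    +0.0.0.1          255.255.255.254  2              2^1   1/128 C        Point to point links (RFC 3021)
a.b.c.d / 30    +0.0.0.3          255.255.255.252  4              2^2   1/64 C         Point to point links (glue network)
a.b.c.d / 29    +0.0.0.7          255.255.255.248  8              2^3   1/32 C         Smallest multi-host network
a.b.c.d / 28    +0.0.0.15         255.255.255.240  16             2^4   1/16 C         Small LAN
a.b.c.d / 27    +0.0.0.31         255.255.255.224  32             2^5   1/8 C
a.b.c.d / 26    +0.0.0.63         255.255.255.192  64             2^6   1/4 C
a.b.c.d / 25    +0.0.0.127        255.255.255.128  128            2^7   1/2 C          Large LAN
a.b.c.0 / 24    +0.0.0.255        255.255.255.0    256            2^8   1 C
a.b.c.0 / 23    +0.0.1.255        255.255.254.0    512            2^9   2 C
a.b.c.0 / 22    +0.0.3.255        255.255.252.0    1,024          2^10  4 C
a.b.c.0 / 21    +0.0.7.255        255.255.248.0    2,048          2^11  8 C            Small ISP / large business
a.b.c.0 / 20    +0.0.15.255       255.255.240.0    4,096          2^12  16 C
a.b.c.0 / 19    +0.0.31.255       255.255.224.0    8,192          2^13  32 C           ISP / large business
a.b.c.0 / 18    +0.0.63.255       255.255.192.0    16,384         2^14  64 C
a.b.c.0 / 17    +0.0.127.255      255.255.128.0    32,768         2^15  128 C
a.b.0.0 / 16    +0.0.255.255      255.255.0.0      65,536         2^16  256 C = B
a.b.0.0 / 15    +0.1.255.255      255.254.0.0      131,072        2^17  2 B
a.b.0.0 / 14    +0.3.255.255      255.252.0.0      262,144        2^18  4 B
a.b.0.0 / 13    +0.7.255.255      255.248.0.0      524,288        2^19  8 B
a.b.0.0 / 12    +0.15.255.255     255.240.0.0      1,048,576      2^20  16 B
a.b.0.0 / 11    +0.31.255.255     255.224.0.0      2,097,152      2^21  32 B
a.b.0.0 / 10    +0.63.255.255     255.192.0.0      4,194,304      2^22  64 B
a.b.0.0 / 9     +0.127.255.255    255.128.0.0      8,388,608      2^23  128 B
a.0.0.0 / 8     +0.255.255.255    255.0.0.0        16,777,216     2^24  256 B = A      Largest IANA block allocation
a.0.0.0 / 7     +1.255.255.255    254.0.0.0        33,554,432     2^25  2 A
a.0.0.0 / 6     +3.255.255.255    252.0.0.0        67,108,864     2^26  4 A
a.0.0.0 / 5     +7.255.255.255    248.0.0.0        134,217,728    2^27  8 A
a.0.0.0 / 4     +15.255.255.255   240.0.0.0        268,435,456    2^28  16 A
a.0.0.0 / 3     +31.255.255.255   224.0.0.0        536,870,912    2^29  32 A
a.0.0.0 / 2     +63.255.255.255   192.0.0.0        1,073,741,824  2^30  64 A
a.0.0.0 / 1     +127.255.255.255  128.0.0.0        2,147,483,648  2^31  128 A
0.0.0.0 / 0     +255.255.255.255  0.0.0.0          4,294,967,296  2^32  256 A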
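The address count and the netmask for any prefix length follow directly from the number of host bits (32 minus the prefix), so the table can be recomputed instead of memorized. A minimal sketch with two made-up helper names, assuming a shell with 64-bit arithmetic such as bash:

```shell
# Number of addresses in a block with the given prefix length: 2^(32 - prefix).
cidr_addresses() {
    echo $((1 << (32 - $1)))
}

# Dotted-decimal netmask for the given prefix length.
cidr_mask() {
    # set the top <prefix> bits of a 32-bit value
    bits=$(( (0xFFFFFFFF << (32 - $1)) & 0xFFFFFFFF ))
    echo "$(( (bits >> 24) & 255 )).$(( (bits >> 16) & 255 )).$(( (bits >> 8) & 255 )).$(( bits & 255 ))"
}
```

For a /22, cidr_addresses 22 gives 1024 and cidr_mask 22 gives 255.255.252.0.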

Function-as-a-Service? Serverless Architectures

It has never been a better time to be a developer! Thanks to cloud computing, deploying our applications is much easier than it used to be. How we deploy our apps continues to evolve thanks to cloud hosting, Platform-as-a-Service (PaaS), and now Function-as-a-Service.

What is Function-as-a-Service (FaaS)?

FaaS is the concept of serverless computing via serverless architectures. Software developers can leverage this to deploy an individual “function”, action, or piece of business logic. Functions are expected to start within milliseconds, process individual requests, and then end.

Principles of FaaS:

  • Complete abstraction of servers away from the developer
  • Billing based on consumption and executions, not server instance sizes
  • Services that are event-driven and instantaneously scalable

Timeline of moving to FaaS

At the basic level, you could describe functions as a way to run some code when a “thing” happens. Here is a simple example from Azure Functions. It shows how easy it is to process an HTTP request as a “Function”.

using System.Net;

public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, TraceWriter log)
{
    log.Info("C# HTTP trigger function processed a request.");

    // Get request body
    dynamic data = await req.Content.ReadAsAsync<object>();

    // The expression after "Hello " + was lost in the original; data?.name is an assumed stand-in
    return req.CreateResponse(HttpStatusCode.OK, "Hello " + data?.name);
}

Benefits & Use Cases

Like most things, not every app is a good fit for FaaS.

We have been looking to use them at Stackify primarily for our very high volume transactions. We have some transactions that happen hundreds of times per second. We see a lot of value in isolating that logic to a function that we can scale.

  • Super high volume transactions – Isolate them and scale them
  • Dynamic or burstable workloads – If you only run something once a day or month, no need to pay for a server 24/7/365
  • Scheduled tasks – They are a perfect way to run a certain piece of code on a schedule

Function-as-a-Service Features

Types of Functions

There are a lot of potential uses for functions. Below is a simple list of some common scenarios. Support and implementation for them varies by provider.

  • Scheduled tasks or jobs
  • Process a web request
  • Process queue messages
  • Run manually

These functions could also be chained together. For example, a web request could write to a queue, which is then picked up by a different function.

FaaS Providers

AWS, Microsoft Azure, and Google Cloud all provide a solution.  A lot of innovation is still going on in this area and things are rapidly improving and changing.  Read our article on how AWS, Azure, and Google Cloud compare to determine which cloud best meets your needs.

  • AWS Lambda
  • Azure Functions
  • Cloud Functions

Monitoring Challenges

One of the big challenges is monitoring your function apps. You still need to understand how often they occur, how long they take, and potentially, why they are slow.

Since you don’t necessarily have a server or control the resources they are running on, you can’t install any monitoring software.

How we monitor these new types of apps is going to continue to evolve.

Comparing FaaS vs PaaS

Platform-as-a-Service greatly simplifies deploying applications. It allows us to deploy our app and the “cloud” worries about how to deploy the servers to run it. Most PaaS hosting options can even auto-scale the number of servers to handle workloads and save you money during times of low usage.

PaaS offerings like Azure App Services, AWS Elastic Beanstalk, and others, make it easy to deploy an entire application. They handle provisioning servers and deploying your application to the servers.

Function-as-a-Service (FaaS) provides the ability to deploy what is essentially a single function, or part of an application. FaaS is designed to potentially be a serverless architecture, although some providers, like Azure, also allow you to dedicate resources to a Function App.

When deployed as PaaS, an application is typically running on at least one server at all times. With FaaS, it may not be running at all until the function needs to be executed. It starts the function within a few milliseconds and then shuts it down.

Both provide the ability to easily deploy an application and scale it, without having to provision or configure servers.


My OpenFaaS Stack !!!!


TIBCO Adapter Error (AER3-910005) – Exception: “JMS error: “Not allowed to create destination tracking

If you encounter the following error in your adapter logs :-

Error AER3-910005 Exception: “JMS error: “Not allowed to create destination tracking=#B0fo–uT5-V4zkYM9A/UbWgUzas#

The following are the possibilities and pointers to be checked :-

  1. Check that the JMS connection configuration of your adapter is correct.
  2. Ensure the JMS user you use has sufficient permission to create a receiver on the destination.
  3. Check whether dynamic creation is ON in your EMS configuration.
  4. If your destination is a queue, check the “queues.conf” file; if it is a topic, check the “topics.conf” file.
  5. If you don’t want to turn ON dynamic creation, you must manually create the destinations required by the adapter before starting the adapter.
  6. Finally, kill the BW process and the adapter service, then start the adapter service first and the BW service second.


  • Check the repository settings.

TIBCO Adapters – Received read Advisory Error (JMS Related)

While testing failover, we found that the adapter does not fail over properly to the secondary EMS server when the primary is down. The adapter logs show the error below. The adapter does not pick up any messages when this error occurs.

Advisory: _SDK.ERROR.JMS.RECEIVE_FAILED : { {ADV_MSG, M_STRING, “Consumer receive failed. JMS Error: Illegal state, SessionName: TIBCOCOMJmsTerminatorSession, Destination: Rep.adcom.Rep-COMAdapter_Rep_v1.exit” } {^description^, M_STRING, “” } }.

The only way to resolve this is to restart the adapter so that it reconnects to the EMS server. Then it picks up the messages.


“JMS Error: Illegal state” usually happens when a JMS call or request occurs in an inappropriate context. For example, a consumer is trying to receive message while the JMS server is down.  In your case you are saying that this is happening during EMS failover from machine1 to machine2.

One thing to keep in mind is that, depending on the number of outstanding messages, connections, and other resources managed by EMS, there may be a brief period before the secondary server is ready to accept connections.

Clients that disconnect will typically attempt to reconnect, however there is a limit to the number of reconnection attempts (as well as the interval between attempts).   These are specified at the connection factory level in factories.conf.  Here are some of the applicable settings:


reconnect_attempt_count – After losing its server connection, a client program configured with more than one server URL attempts to reconnect, iterating through its URL list until it re-establishes a connection with an EMS server. This property determines the maximum number of iterations. When absent, the default is 4.

reconnect_attempt_delay – When attempting to reconnect, the client sleeps for this interval (in milliseconds) between iterations through its URL list. When absent, the default is 500 milliseconds.

reconnect_attempt_timeout – When attempting to reconnect to the EMS server, you can set this connection timeout period to abort the connection attempt after a specified period of time (in milliseconds).
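These settings bound how long a client keeps trying. As a rough sketch, the time spent sleeping between iterations is about the attempt count times the delay; the time spent inside each connection attempt comes on top and is not modelled here. The helper name reconnect_sleep_ms is made up for this example.

```shell
# Rough lower bound (in ms) on how long a client keeps retrying:
# number of iterations through the URL list times the delay between them.
# $1: reconnect_attempt_count, $2: reconnect_attempt_delay in ms
reconnect_sleep_ms() {
    count=$1
    delay_ms=$2
    echo $((count * delay_ms))
}
```

With the defaults, reconnect_sleep_ms 4 500 gives 2000, i.e. roughly two seconds. If the secondary server needs longer than that to become active, the client may give up before failover completes, so the count or delay should be raised accordingly.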

It may also be helpful to specify heartbeats between the adapter and the EMS server.  This way if the EMS server is brought down either gracefully or ungracefully the connection will be reset when the configured number of heartbeats is missed.  This should then trigger the reconnection attempts described above.  The heartbeat settings are defined in the tibemsd.conf.  Here are some relevant settings:

client_heartbeat_server – Specifies the interval clients are to send heartbeats to the server.

server_timeout_client_connection – Specifies the period of time server will wait for a client heartbeat before terminating the client connection.

server_heartbeat_client – Specifies the interval this server is to send heartbeats to all of its clients.

client_timeout_server_connection – Specifies the period of time a client will wait for a heartbeat from the server before terminating the connection.



Docker – Commands to Manipulate the Containers

Parent command

Command                   Description
docker container          Manage containers

Child commands

Command                   Description
docker container attach   Attach local standard input, output, and error streams to a running container
docker container commit   Create a new image from a container’s changes
docker container cp       Copy files/folders between a container and the local filesystem
docker container create   Create a new container
docker container diff     Inspect changes to files or directories on a container’s filesystem
docker container exec     Run a command in a running container
docker container export   Export a container’s filesystem as a tar archive
docker container inspect  Display detailed information on one or more containers
docker container kill     Kill one or more running containers
docker container logs     Fetch the logs of a container
docker container ls       List containers
docker container pause    Pause all processes within one or more containers
docker container port     List port mappings or a specific mapping for the container
docker container prune    Remove all stopped containers
docker container rename   Rename a container
docker container restart  Restart one or more containers
docker container rm       Remove one or more containers
docker container run      Run a command in a new container
docker container start    Start one or more stopped containers
docker container stats    Display a live stream of container(s) resource usage statistics
docker container stop     Stop one or more running containers
docker container top      Display the running processes of a container
docker container unpause  Unpause all processes within one or more containers
docker container update   Update configuration of one or more containers
docker container wait     Block until one or more containers stop, then print their exit codes

Docker – Add Proxy to Docker Daemon

I am gonna cut the chatter and hit the platter.

Proxy recommendation: to download images from Docker Hub, we need Internet connectivity.

I’ma show you the Steps to configure the proxy for Docker daemon.

  1. Check the OS in which the docker-ce or docker-ee is installed.

ubuntu@docker:~$ cat /etc/*release*
VERSION="16.04.3 LTS (Xenial Xerus)"
PRETTY_NAME="Ubuntu 16.04.3 LTS"

2. Check the Docker version

ubuntu@docker:~$ sudo docker -v
Docker version 17.05.0-ce, build 89658be

3. Create a directory

sudo mkdir -p /etc/systemd/system/docker.service.d

4. Create a Proxy Conf

vim /etc/systemd/system/docker.service.d/http-proxy.conf
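The contents of the file are not shown above. A systemd drop-in for the Docker daemon usually consists of a [Service] section with Environment lines for the proxy variables. A minimal sketch that writes such a file; the function name and the proxy URL used in the usage note are placeholders, not values from this setup:

```shell
# Write a systemd drop-in that passes proxy settings to the Docker daemon.
# $1: drop-in directory, $2: proxy URL (placeholder, use your own proxy)
write_docker_proxy_conf() {
    dir=$1
    proxy=$2
    mkdir -p "$dir"
    cat > "$dir/http-proxy.conf" <<EOF
[Service]
Environment="HTTP_PROXY=$proxy"
Environment="HTTPS_PROXY=$proxy"
EOF
}
```

Run as root, for example write_docker_proxy_conf /etc/systemd/system/docker.service.d http://proxy.example.com:3128. systemd only picks up the drop-in after sudo systemctl daemon-reload followed by sudo systemctl restart docker.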



5. Now try to log in to Docker

ubuntu@docker:~$ sudo docker login
Login with your Docker ID to push and pull images from Docker Hub. If you don’t have a Docker ID, head over to to create one.
Username: <username>
Login Succeeded