Main, Operating System, Redhat / CEntOS / Oracle Linux, Ubuntu

Process Management in Linux

Process Types

Before we start talking about Linux process management, we should review process types. There are four common types of processes:

  • Parent process
  • Child process
  • Orphan Process
  • Daemon Process
  • Zombie Process

Parent process is a process which runs the fork() system call. All processes except process 0 have one parent process.

Child process is created by a parent process.

Orphan Process it continues running while its parent process has terminated or finished.

Daemon Process is always created from a child process and then exit.

Zombie Process exists in the process table although it is terminated.

The orphan process is a process that still executing and its parent process has died while orphan processes do not become zombie processes.

Memory Management

In server administration, memory management is one of your responsibility that you should care about as a system administrator.

One of most used commands in Linux process management is the free command:

$ freem

The -m option to show values in megabytes.

Certificates – Digital Certificates (Summary)01-linux-process-managment-free-command

Our main concern in buff/cache.

The output of free command here means 536 megabytes is used while 1221 megabytes is available.

The second line is the swap. Swapping occurs when memory becomes to be crowded.

The first value is the total swap size which is 3070 megabytes.

The second value is the used swap which is 0.

The third value is the available swap for usage which is 3070.

From the above results, you can say that memory status is good since no swap is used, so while we are talking about the swap, let’s discover what proc directory provides us about the swap.

$ cat /proc/swapscat /proc/swaps

01-linux-process-managment-free-command

This command shows the swap size and how much is used:

$ cat /proc/sys/vm/swappinesscat /proc/sys/vm/swappiness

01-linux-process-managment-free-commandThis command shows a value from 0 to 100, this value means the system will use the swap if the memory becomes 70% used.

Notice: the default value for most distros for this value is between 30 and 60, you can modify it like this:

$ echo 50 >/proc/sys/vm/swappinessecho 50 >/proc/sys/vm/swappiness

Or using sysctl command like this:

$ sudo sysctl -wvm.swappiness=50sudo sysctl -wvm.swappiness=50

Changing the swappiness value using the above commands is not permanent, you have to write it on /etc/sysctl.conf file like this:

$ nano /etc/sysctl.conf

vm.swappiness=50

01-linux-process-managment-free-command

Cool!!

The swap level measures the chance to transfer a process from the memory to the swap.

Choosing the accurate swappiness value for your system requires some experimentation to choose the best value for your server.

Managing virtual memory with vmstat

Another important command used in Linux process management which is vmstat. vmstat command gives a summary reporting about memory, processes, and paging.

$ vmstat -avmstat -a

-a option is used to get all active and inactive processes.

01-linux-process-managment-free-command

And this is the important column outputs from this command:

si: How much swapped in from disk.

so: How much swapped out to disk.

bi: How much sent to block devices.

bo: How much obtained from block devices.

us: The user time.

sy: The system time.

id: The idle time.

Our main concern is the (si) and (so) columns, where (si) column shows page-ins while (so) column provides page-outs.

A better way to look at these values is by viewing the output with a delay option like this:

$ vmstat 2 5vmstat 2 5

01-linux-process-managment-free-command

Where 2 is the delay in seconds and 5 is the number of times vmstat is called. It shows five updates of the command and all data is presented in kilobytes.

Page-in (si) happens when you start an application and the information is paged-in. Page out (so) happens when the kernel is freeing up memory.

System Load & top Command

In Linux process management, the top command gives you a list of the running processes and how they are using CPU and memory ; the output is a real-time data.

If you have a dual core system may have the first core at 40 percent and the second core at 70 percent, in this case, the top command may show a combined result of 110 percent, but you will not know the individual values for each core.

$ top -c-c

01-linux-process-managment-free-command

We use -c option to show the command line or the executable path behind that process.

You can press 1 key while you watch the top command statistics to show individual CPU statuses.

01-linux-process-managment-free-command

Keep in mind that certain processes are spawned like the child processes, you will see multiple processes for the same program like httpd and PHP-fpm.

You shouldn’t rely on top command only, you should review other resources before making a final action.

Monitoring Disk I/O with iotop

The system starts to be slow as a result of high disk activities, so it is important to monitor disk activities. That means figuring out which processes or users cause this disk activity.

The iotop command in Linux process management helps us to monitor disk I/O in real-time. You can install it if you don’t have it:

$ yum install iotop

Running iotop without any options will result in a list all processes.

To view the processes that cause to disk activity, you should use -o option:

$ iotop -o-o

01-linux-process-managment-free-command

You can easily know what program is impacting the system.

ps command

We’ve talked about ps command before on a previous post and how to order the processes by memory usage and CPU usage.

Monitoring System Health with iostat and lsof

iostat command gives you CPU utilization report; it can be used with -c option to display the CPU utilization report.

$ iostat -ciostat -c

The output result is easy to understand, but if the system is busy, you will see %iowait increases. That means the server is transferring or copying a lot of files.

With this command, you can check the read and write operations, so you should have a solid knowledge of what is hanging your disk and take the right decision.

Additionally, lsof command is used to list the open files:

lsof command shows which executable is using the file, the process ID, the user, and the name of the opened file.

Calculating the system load

Calculating system load is very important in Linux process management. The system load is the amount of processing for the system which is currently working. It is not the perfect way to measure system performance, but it gives you some evidence.

The load is calculated like this:

Actual Load = Total Load (uptime) / No. of CPUs

You can calculate the uptime by reviewing uptime command or top command:

$ uptimeuptime

$ toptop

The server load is shown in 1, 5, and 15 minutes.

As you can see, the average load is 0.00 at the first minute, 0.01 at the fifth minute, and 0.05 at fifteenth minutes.

When the load increases, processors are queued, and if there are many processor cores, the load is distributed equally across the server’s cores to balance the work.

You can say that the good load average is about 1. This does not mean if the load exceeds 1 that there is a problem, but if you begin to see higher numbers for a long time, that means a high load and there is a problem.

pgrep and systemctl

You can get the process ID using pgrep command followed by the service name.

$ pgrep servicename

This command shows the process ID or PID.

Note if this command shows more than process ID like httpd or SSH, the smallest process ID is the parent process ID.

On the other hand, you can use the systemctl command to get the main PID like this:

$ systemctl status<service_name>.service

There are more ways to obtain the required process ID or parent process ID, but this one is easy and straight.

Managing Services with systemd

If we are going to talk about Linux process management, we should take a look at systemd. The systemd is responsible for controlling how services are managed on modern Linux systems like CentOS 7.

Instead of using chkconfig command to enable and disable a service during the boot, you can use the systemctl command.

Systemd also ships with its own version of the top command, and in order to show the processes that are associated with a specific service, you can use the system-cgtop command like this:

$ systemdcgtop

As you can see, all associated processes, path, the number of tasks, the % of CPU used, memory allocation, and the inputs and outputs related.

This command can be used to output a recursive list of service content like this:

$ systemdcgls

This command gives us very useful information that can be used to make your decision.

Nice and Renice Processes

The process nice value is a numeric indication that belongs to the process and how it’s fighting for the CPU.

A high nice value indicates a low priority for your process, so how nice you are going to be to other users, and from here the name came.

The nice range is from -20 to +19.

nice command sets the nice value for a process at creation time, while renice command adjusts the value later.

$ nice –n 5 ./myscriptnice –n 5 ./myscript

This command increases the nice value which means lower priority by 5.

$ sudo renice 5 22132213

This command decreases the nice value means increased priority and the number (2213) is the PID.

You can increase its nice value (lower priority) but cannot lower it (high priority) while root user can do both.

Sending the kill signal

To kill a service or application that causes a problem, you can issue a termination signal (SIGTERM). You can review the previous post about signals and jobs.

$ kill process IDkill process IDID

This method is called safe kill. However, depending on your situation, maybe you need to force a service or application to hang up like this:

$ kill -1 process -1 process ID

Sometimes the safe killing and reloading fail to do anything, you can send kill signal SIGKILL by using -9 option which is called forced kill.

$ kill -9 process IDkill -9 process ID

There are no cleanup operations or safe exit with this command and not preferred. However, you can do something more proper by using the pkill command.

$ pkill -9 serviceName-9 serviceNameserviceName

And you can use pgrep command to ensure that all associated processes are killed.

$ pgrep serviceNamepgrep serviceName

I hope you have a good idea about Linux process management and how to make a good action to make the system healthy.

Thank you

Advertisements
Main, Operating System, Redhat / CEntOS / Oracle Linux

Standard Linux Tuning

Hello Bloggers,

Majority of the applications these days are deployed on (Debian / Redhat) Linux Operating System as the Base OS.

I Would like to share some generic tuning that can be done before deploying any application on it.

Index Component Question / Test / Reason  
Network
  These are some checks to validate the network setup.
[ Network Are the switches redundant?
Unplug one switch.
Fault-tolerance.
 
  Network Is the cabling redundant?
Pull cables.
Fault-tolerance.
 
  Network Is the network full-duplex?
Double check setup.
Performance.
 
       
Network adapter (NIC) Tuning
  It is recommended to consult with the network adapter provider on recommended Linux TCP/IP settings for optimal performance and stability on Linux.

There are also quite a few TCP/IP tuning source on the Internet such as http://fasterdata.es.net/TCP-tuning/linux.html

  NIC Are the NIC fault-tolerant (aka. auto-port negotiation)?
Pull cables and/or disable network adapter.
Fault-tolerance.
 
  NIC Set the transmission queue depth to at least 1000.

txqueuelen <length>

‘cat /proc/net/softnet_stat’

Performance and stability (packet drops).

 
  NIC Enable TCP/IP offloading (aka. Generic Segment Offloading (GSO)) which was added in kernel 2.6.18

See: http://www.linuxfoundation.org/en/Net:GSO

Check: ethtool -k eth0

Modify: ethtool -K <DevName>eth

lsb
Performance.

Note: I recommend enabling all supported TCP/IP offloading capabilities on and EMS host to free CPU resources.

 
  NIC Enable Interrupt Coalescence (aka. Interrupt Moderation or Interrupt Blanking).

See: http://kb.pert.geant.net/PERTKB/InterruptCoalescence

Check: ethtool -c eth0

Modify: ethtool -C <DevName>

Performance.

Note: The configuration is system dependant but the goal is to reduce the number of interrupts per second at the ‘cost’ of slightly increased latency.

 
       
TCP/IP Buffer Tuning
  For a low latency or high throughput messaging system TCP/IP buffer tuning is important.
Thus instead of tuning the defaults values one should rather check if the settings (sysctl –a) provide large enough buffer The values can be changed via the command sysctrl –w <name> <value>.

The below values and comments were taken from TIBCO support FAQ1-6YOAA) and serve as a guideline towards “large enough” buffers, i.e. if your system configuration has lower values it is suggested to raise them to below values.

  TCP/IP Maximum OS receive buffer size for all connection types.

sysctl -w net.core.rmem_max=8388608

Default: 131071
Performance.

 
  TCP/IP Default OS receive buffer size for all connection types.

sysctl -w net.core.rmem_default=65536

Default: 126976
Performance.

 
  TCP/IP Maximum OS send buffer size for all connection types.

sysctl -w net.core.wmem_max=8388608

Default: 131071
Performance.

 
  TCP/IP Default OS send buffer size for all types of connections.

sysctl -w net.core.wmem_default=65536

Default: 126976
Performance.

 
  TCP/IP Enable/Disable TCP/IP window scaling enabled?

sysctl net.ipv4.tcp_window_scaling

Default: 1

Performance.
Note: As Applications set buffers sizes explicitly this ‘disables’ the TCP/IP windows scaling on Linux. Thus there is no point in enabling it though there should be no harm on leaving the default (enabled). [This is my understanding / what I have been told but I never double checked and it could vary with kernel versions]

 
  TCP/IP TCP auto-tuning setting:

sysctl -w net.ipv4.tcp_mem=’8388608 8388608 8388608′

Default: 1966087 262144 393216
Performance.

The tcp_mem variable defines how the TCP stack should behave when it comes to memory usage:

–          The first value specified in the tcp_mem variable tells the kernel the low threshold. Below this point, the TCP stack does not bother at all about putting any pressure on the memory usage by different TCP sockets.

–          The second value tells the kernel at which point to start pressuring memory usage down.

–          The final value tells the kernel how many memory pages it may use maximally. If this value is reached, TCP streams and packets start getting dropped until we reach a lower memory usage again. This value includes all TCP sockets currently in use.

 
  TCP/IP TCP auto-tuning (receive) setting:

sysctl -w net.ipv4.tcp_rmem=’4096 87380 8388608′

Default: 4096 87380 4194304
Performance.

The tcp_rmem variable defines how the TCP stack should behave when it comes to memory usage:

–          The first value tells the kernel the minimum receive buffer for each TCP connection, and this buffer is always allocated to a TCP socket, even under high pressure on the system.

–          The second value specified tells the kernel the default receive buffer allocated for each TCP socket. This value overrides the /proc/sys/net/core/rmem_default value used by other protocols.

–          The third and last value specified in this variable specifies the maximum receive buffer that can be allocated for a TCP socket.”

 
  TCP/IP TCP auto-tuning (send) setting:

sysctl -w net.ipv4.tcp_wmem=’4096 65536 8388608′

Default: 4096 87380 4194304
Performance.

This variable takes three different values which hold information on how much TCP send buffer memory space each TCP socket has to use. Every TCP socket has this much buffer space to use before the buffer is filled up.  Each of the three values are used under different conditions:

–          The first value in this variable tells the minimum TCP send buffer space available for a single TCP socket.

–          The second value in the variable tells us the default buffer space allowed for a single TCP socket to use.

–          The third value tells the kernel the maximum TCP send buffer space.”

 
  TCP/IP This will ensure that immediately subsequent connections use these values.

sysctl -w net.ipv4.route.flush=1

Default: Not present

 
       
TCP Keep Alive
  In order to detect ungracefully closed sockets either the TCP keep-alive comes into play or the EMS client-server heartbeat. Which setup or which combination of parameters works better depends on the requirements and test scenarios.

As the EMS daemon does not explicitly enables TCP keep-alive on sockets the TCP keep-alive setting (net.ipv4.tcp_keepalive_intvl, net.ipv4.tcp_keepalive_probes, net.ipv4.tcp_keepalive_time) do not play a role.

  TCP How may times to retry before killing alive TCP connection. RFC1122 says that the limit should be longer than 100 sec. It is too small number. Default value 15 corresponds to 13-30 minutes depending on retransmission timeout (RTO).

sysctl -w net.ipv4.tcp_retries2=<test> (7 preferred)

Default: 15

Fault-Tolerance (EMS failover)

The default (15) is often considered too high and a value of 3 is often felt as too ‘edgy’ thus customer testing should establish a good value in the range between 4 and 10.

 
       
Linux System Settings
  System limits (ulimit) are used to establish boundaries for resource utilization by individual processes and thus protect the system and other processes. A too high or unlimited value provides zero protection but a too low value could hinder growth or cause premature errors.
  Linux Is the number of file descriptor at least 4096

ulimit –n

Scalability

Note: It is expected that the number of connected clients and thus the number of connections is going to increase over time and this setting allows for greater growth and also provides a greater safety room should some application have a connection leak. Also note that the number of open connection can decrease system performance due to the way the OS handles the select() API. Thus care should be taken if the number of connected clients increases over time that all SLA are still met.

 
  Linux Limit maximum file size for EMS to 2/5 of the disk space if the disk space is shared between EMS servers.

ulimit –f

Robustness: Contain the damage of a very large backlog.

 
  Linux Consider limiting the maximum data segment size for EMS daemons in order to avoid one EMS monopolizing all available memory.

ulimit –d

Robustness:  Contain the damage of a very large backlog.

Note: It should be tested if such a limit operates well with (triggers) the EMS reserved memory mode.

 
  Linux Limit number of child processes to X to contain rouge application (shell bomb)

ulimit –u

Robustness: Contain the damage a rogue application can do.
See: http://www.cyberciti.biz/tips/linux-limiting-user-process.html

This is just an example of a Linux system setting that is unrelated to TIBCO products. It is recommended to consult with Linux experts for recommended settings.

 
       
Linux Virtual Memory Management
  There are a couple of virtual memory related setting that play a role on how likely Linux swaps out memory pages and how Linux reacts to out-of-memory conditions. Both aspects are not important under “normal” operation conditions but are very important under memory pressure and thus the system’s stability under stress.

 

A server running EAI software and even more a server running a messaging server like EMS should rarely have to resort to swap space for obvious performance reasons. However considerations due to malloc/sbrk high-water-mark behavior, the behavior of the different over-commit strategies and the price of storage lead to above recommendation: Even with below tuning of EMS server towards larger malloc regions[1] the reality is that the EMS daemon is still subject to the sbrk() high-water-mark and is potentially allocation a lot of memory pages that could be swapped out without impacting performance. Of course the EMS server instance must eventually be bounced but the recommendation in this section aim to provide operations with a larger window to schedule the maintenance.

 

As theses values operate as a bundle they must be changed together or any variation must be well understood.

  Linux Swap-Space:                1.5 to 2x the physical RAM (24-32 GB )

Logical-Partition:        One of the first ones but after the EMS disk storage and application checkpoint files.

Physical-Partition:     Use a different physical partition than the one used for storage files, logging or application checkpoints to avoid competing disk IO.

 

 
  Linux Committing virtual memory:

sysctl -w vm.overcommit_memory=2

$ cat /proc/sys/vm/overcommit_memory

Default: 0

Robustness

 

Note: The recommended setting uses a new heuristic that only commits as much memory as available, where available is defined as swap-space plus a portion of RAM. The portion of RAM is defined in the overcommit_ratio. See also: http://www.mjmwired.net/kernel/Documentation/vm/overcommit-accounting and http://www.centos.org/docs/5/html/5.2/Deployment_Guide/s3-proc-sys-vm.html

 
  Linux Committing virtual memory II:

sysctl -w vm.overcommit_ratio=25 (or less)

$ cat /proc/sys/vm/overcommit_ratio

Default: 50

Robustness

Note: This value specifies how much percent of the RAM Linux will add to the swap space in order to calculate the “available” memory. The more the swap space exceeds the physical RAM the lower values might be chosen. See also: http://www.linuxinsight.com/proc_sys_vm_overcommit_ratio.html

 
  Linux Swappiness

sysctl -w vm.swappiness=25 (or less)

$ cat /proc/sys/vm/swappiness
Default: 60

Robustness

 

Note: The swappiness defines how likely memory pages will be swapped in order to make room for the file buffer cache.

 

Generally speaking an enterprise server should not need to swap out pages in order to make room for the file buffer cache or other processes which would favor a setting of 0. 

 

On the other hand it is likely that applications have at least some memory pages that almost never get referenced again and swapping them out is a good thing.

 
  Linux Exclude essential processes (Application) from being killed by the out-of-memory (OOM) daemon.

Echo “-17: > /proc/<pid>/oom_adj

Default: NA

Robustness

See: http://linux-mm.org/OOM_Killer and http://lwn.net/Articles/317814/

 

Note: With any configuration but overcommit_memory=2 and overcommit_ratio=0 the Linux Virtual Memory Ma­nagement can commit more memory than available. If then the memory must be provided Linux engages the out-of-memory kill daemon to kill process based on “badness”. In order to exclude essential processes from being killed one can set their oom_adj to -17.

 
  Linux 32bit Low memory area –  32bit Linux only

# cat /proc/sys/vm/lower_zone_protection
# echo “250” > /proc/sys/vm/lower_zone_protection

(NOT APPLICABLE)
To set this option on boot, add the following to /etc/sysctl.conf:
vm.lower_zone_protection = 250

 

See: http://linux.derkeiler.com/Mailing-Lists/RedHat/2007-08/msg00062.html

 
       
Linux CPU Tuning (Processor Binding & Priorities)
  This level of tuning is seldom required for Any Application solution. The tuning options are mentioned in case there is a need to go an extra mile. 
  Linux IRQ-Binding

Recommendation: Leave default
Note: For real-time messaging the binding of interrupts to a certain exclusively used CPU allows reducing jitter and thus improves the system characteristics as needed by ultra-low-latency solutions.

The default on Linux is IRQ balancing across multiple CPU and Linux offers two solutions in that real (kernel and daemon) of which only one should be enabled at most.

 
  Linux Process Base Priority
Recommendation: Leave default

 

Note: The process base priority is determined by the user running the process instance and thus running processes as root (chown and set sticky bit) increases the processes base priority.

And a root user can further increase the priority of Application to real-time scheduling which can further improve performance particularly in terms of jitter. However in 2008 we observed that doing so actually decreased the performance of EMS in terms of number of messages per second. That issue was researched with Novell at that time but I am not sure of its outcome.

 
  Linux Foreground and Background Processes

Recommendation: TBD

 

Note: Linux assigns foreground processes a better base priority than background processes but if it really matters and if so then how to change start-up scripts is a to-be-determined. 

 
  Linux Processor Set

Recommendation: Don’t bother

 

Note: Linux allows defining a processor set and limiting a process to only use cores from that processor set. This can be used to increase cache hits and cap the CPU resource for a particular process instance.

 

If larger memory regions are allocated the malloc() in the Linux glibc library uses mmap() instead of sbrk() to provide the memory pages to the process.

The memory mapped files (mmap()) are better in the way how they release memory back to the OS and thus the high-water-mark effect is avoided for these regions.

Redhat / CEntOS / Oracle Linux, Tips and Tricks

How to Delete all files except a Pattern in Unix

Good Morning To All My TECH Ghettos,

Today ima show ya’ll a fuckin command to delete all files except a pattern,

ya’ll can use it in a script or even commandline ……. Life gets easy as Fuck !!!!!!!!

find . -type f ! -name ‘<pattern>’ -delete

A Live Example

Before

Before

After the following Command

find . -type f ! -name ‘*.gz’ -delete

After

Operating System, Redhat / CEntOS / Oracle Linux, Ubuntu

How To Patch and Protect Linux Kernel Stack Clash Vulnerability CVE-2017-1000364 [ 19/June/2017 ]

Avery serious security problem has been found in the Linux kernel called “The Stack Clash.” It can be exploited by attackers to corrupt memory and execute arbitrary code. An attacker could leverage this with another vulnerability to execute arbitrary code and gain administrative/root account privileges. How do I fix this problem on Linux?

the-stack-clash-on-linux-openbsd-netbsd-freebsd-solaris
The Qualys Research Labs discovered various problems in the dynamic linker of the GNU C Library (CVE-2017-1000366) which allow local privilege escalation by clashing the stack including Linux kernel. This bug affects Linux, OpenBSD, NetBSD, FreeBSD and Solaris, on i386 and amd64. It can be exploited by attackers to corrupt memory and execute arbitrary code.

What is CVE-2017-1000364 bug?

From RHN:

A flaw was found in the way memory was being allocated on the stack for user space binaries. If heap (or different memory region) and stack memory regions were adjacent to each other, an attacker could use this flaw to jump over the stack guard gap, cause controlled memory corruption on process stack or the adjacent memory region, and thus increase their privileges on the system. This is a kernel-side mitigation which increases the stack guard gap size from one page to 1 MiB to make successful exploitation of this issue more difficult.

As per the original research post:

Each program running on a computer uses a special memory region called the stack. This memory region is special because it grows automatically when the program needs more stack memory. But if it grows too much and gets too close to another memory region, the program may confuse the stack with the other memory region. An attacker can exploit this confusion to overwrite the stack with the other memory region, or the other way around.

A list of affected Linux distros

  1. Red Hat Enterprise Linux Server 5.x
  2. Red Hat Enterprise Linux Server 6.x
  3. Red Hat Enterprise Linux Server 7.x
  4. CentOS Linux Server 5.x
  5. CentOS Linux Server 6.x
  6. CentOS Linux Server 7.x
  7. Oracle Enterprise Linux Server 5.x
  8. Oracle Enterprise Linux Server 6.x
  9. Oracle Enterprise Linux Server 7.x
  10. Ubuntu 17.10
  11. Ubuntu 17.04
  12. Ubuntu 16.10
  13. Ubuntu 16.04 LTS
  14. Ubuntu 12.04 ESM (Precise Pangolin)
  15. Debian 9 stretch
  16. Debian 8 jessie
  17. Debian 7 wheezy
  18. Debian unstable
  19. SUSE Linux Enterprise Desktop 12 SP2
  20. SUSE Linux Enterprise High Availability 12 SP2
  21. SUSE Linux Enterprise Live Patching 12
  22. SUSE Linux Enterprise Module for Public Cloud 12
  23. SUSE Linux Enterprise Build System Kit 12 SP2
  24. SUSE Openstack Cloud Magnum Orchestration 7
  25. SUSE Linux Enterprise Server 11 SP3-LTSS
  26. SUSE Linux Enterprise Server 11 SP4
  27. SUSE Linux Enterprise Server 12 SP1-LTSS
  28. SUSE Linux Enterprise Server 12 SP2
  29. SUSE Linux Enterprise Server for Raspberry Pi 12 SP2

Do I need to reboot my box?

Yes, as most services depends upon the dynamic linker of the GNU C Library and kernel itself needs to be reloaded in memory.

How do I fix CVE-2017-1000364 on Linux?

Type the commands as per your Linux distro. You need to reboot the box. Before you apply patch, note down your current kernel version:
$ uname -a
$ uname -mrs

Sample outputs:

Linux 4.4.0-78-generic x86_64

Debian or Ubuntu Linux

Type the following apt command/apt-get command to apply updates:
$ sudo apt-get update && sudo apt-get upgrade && sudo apt-get dist-upgrade
Sample outputs:

Reading package lists... Done
Building dependency tree       
Reading state information... Done
Calculating upgrade... Done
The following packages will be upgraded:
  libc-bin libc-dev-bin libc-l10n libc6 libc6-dev libc6-i386 linux-compiler-gcc-6-x86 linux-headers-4.9.0-3-amd64 linux-headers-4.9.0-3-common linux-image-4.9.0-3-amd64
  linux-kbuild-4.9 linux-libc-dev locales multiarch-support
14 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/62.0 MB of archives.
After this operation, 4,096 B of additional disk space will be used.
Do you want to continue? [Y/n] y
Reading changelogs... Done
Preconfiguring packages ...
(Reading database ... 115123 files and directories currently installed.)
Preparing to unpack .../libc6-i386_2.24-11+deb9u1_amd64.deb ...
Unpacking libc6-i386 (2.24-11+deb9u1) over (2.24-11) ...
Preparing to unpack .../libc6-dev_2.24-11+deb9u1_amd64.deb ...
Unpacking libc6-dev:amd64 (2.24-11+deb9u1) over (2.24-11) ...
Preparing to unpack .../libc-dev-bin_2.24-11+deb9u1_amd64.deb ...
Unpacking libc-dev-bin (2.24-11+deb9u1) over (2.24-11) ...
Preparing to unpack .../linux-libc-dev_4.9.30-2+deb9u1_amd64.deb ...
Unpacking linux-libc-dev:amd64 (4.9.30-2+deb9u1) over (4.9.30-2) ...
Preparing to unpack .../libc6_2.24-11+deb9u1_amd64.deb ...
Unpacking libc6:amd64 (2.24-11+deb9u1) over (2.24-11) ...
Setting up libc6:amd64 (2.24-11+deb9u1) ...
(Reading database ... 115123 files and directories currently installed.)
Preparing to unpack .../libc-bin_2.24-11+deb9u1_amd64.deb ...
Unpacking libc-bin (2.24-11+deb9u1) over (2.24-11) ...
Setting up libc-bin (2.24-11+deb9u1) ...
(Reading database ... 115123 files and directories currently installed.)
Preparing to unpack .../multiarch-support_2.24-11+deb9u1_amd64.deb ...
Unpacking multiarch-support (2.24-11+deb9u1) over (2.24-11) ...
Setting up multiarch-support (2.24-11+deb9u1) ...
(Reading database ... 115123 files and directories currently installed.)
Preparing to unpack .../0-libc-l10n_2.24-11+deb9u1_all.deb ...
Unpacking libc-l10n (2.24-11+deb9u1) over (2.24-11) ...
Preparing to unpack .../1-locales_2.24-11+deb9u1_all.deb ...
Unpacking locales (2.24-11+deb9u1) over (2.24-11) ...
Preparing to unpack .../2-linux-compiler-gcc-6-x86_4.9.30-2+deb9u1_amd64.deb ...
Unpacking linux-compiler-gcc-6-x86 (4.9.30-2+deb9u1) over (4.9.30-2) ...
Preparing to unpack .../3-linux-headers-4.9.0-3-amd64_4.9.30-2+deb9u1_amd64.deb ...
Unpacking linux-headers-4.9.0-3-amd64 (4.9.30-2+deb9u1) over (4.9.30-2) ...
Preparing to unpack .../4-linux-headers-4.9.0-3-common_4.9.30-2+deb9u1_all.deb ...
Unpacking linux-headers-4.9.0-3-common (4.9.30-2+deb9u1) over (4.9.30-2) ...
Preparing to unpack .../5-linux-kbuild-4.9_4.9.30-2+deb9u1_amd64.deb ...
Unpacking linux-kbuild-4.9 (4.9.30-2+deb9u1) over (4.9.30-2) ...
Preparing to unpack .../6-linux-image-4.9.0-3-amd64_4.9.30-2+deb9u1_amd64.deb ...
Unpacking linux-image-4.9.0-3-amd64 (4.9.30-2+deb9u1) over (4.9.30-2) ...
Setting up linux-libc-dev:amd64 (4.9.30-2+deb9u1) ...
Setting up linux-headers-4.9.0-3-common (4.9.30-2+deb9u1) ...
Setting up libc6-i386 (2.24-11+deb9u1) ...
Setting up linux-compiler-gcc-6-x86 (4.9.30-2+deb9u1) ...
Setting up linux-kbuild-4.9 (4.9.30-2+deb9u1) ...
Setting up libc-l10n (2.24-11+deb9u1) ...
Processing triggers for man-db (2.7.6.1-2) ...
Setting up libc-dev-bin (2.24-11+deb9u1) ...
Setting up linux-image-4.9.0-3-amd64 (4.9.30-2+deb9u1) ...
/etc/kernel/postinst.d/initramfs-tools:
update-initramfs: Generating /boot/initrd.img-4.9.0-3-amd64
cryptsetup: WARNING: failed to detect canonical device of /dev/md0
cryptsetup: WARNING: could not determine root device from /etc/fstab
W: initramfs-tools configuration sets RESUME=UUID=054b217a-306b-4c18-b0bf-0ed85af6c6e1
W: but no matching swap device is available.
I: The initramfs will attempt to resume from /dev/md1p1
I: (UUID=bf72f3d4-3be4-4f68-8aae-4edfe5431670)
I: Set the RESUME variable to override this.
/etc/kernel/postinst.d/zz-update-grub:
Searching for GRUB installation directory ... found: /boot/grub
Searching for default file ... found: /boot/grub/default
Testing for an existing GRUB menu.lst file ... found: /boot/grub/menu.lst
Searching for splash image ... none found, skipping ...
Found kernel: /boot/vmlinuz-4.9.0-3-amd64
Found kernel: /boot/vmlinuz-3.16.0-4-amd64
Updating /boot/grub/menu.lst ... done

Setting up libc6-dev:amd64 (2.24-11+deb9u1) ...
Setting up locales (2.24-11+deb9u1) ...
Generating locales (this might take a while)...
  en_IN.UTF-8... done
Generation complete.
Setting up linux-headers-4.9.0-3-amd64 (4.9.30-2+deb9u1) ...
Processing triggers for libc-bin (2.24-11+deb9u1) ...

Reboot your server/desktop using reboot command:
$ sudo reboot

Oracle/RHEL/CentOS/Scientific Linux

Type the following yum command:
$ sudo yum update
$ sudo reboot

Fedora Linux

Type the following dnf command:
$ sudo dnf update
$ sudo reboot

Suse Enterprise Linux or Opensuse Linux

Type the following zypper command:
$ sudo zypper patch
$ sudo reboot

SUSE OpenStack Cloud 6

$ sudo zypper in -t patch SUSE-OpenStack-Cloud-6-2017-996=1
$ sudo reboot

SUSE Linux Enterprise Server for SAP 12-SP1

$ sudo zypper in -t patch SUSE-SLE-SAP-12-SP1-2017-996=1
$ sudo reboot

SUSE Linux Enterprise Server 12-SP1-LTSS

$ sudo zypper in -t patch SUSE-SLE-SERVER-12-SP1-2017-996=1
$ sudo reboot

SUSE Linux Enterprise Module for Public Cloud 12

$ sudo zypper in -t patch SUSE-SLE-Module-Public-Cloud-12-2017-996=1
$ sudo reboot

Verification

You need to make sure your version number changed after issuing reboot command
$ uname -a
$ uname -r
$ uname -mrs

Sample outputs:

Linux 4.4.0-81-generic x86_64
Main, Operating System, Redhat / CEntOS / Oracle Linux, Ubuntu

Cpustat – Monitors CPU Utilization by Running Processes in Linux

Operating System, Redhat / CEntOS / Oracle Linux, Ubuntu

Linux security alert: Bug in sudo’s get_process_ttyname() [ CVE-2017-1000367 ]

There is a serious vulnerability in sudo command that grants root access to anyone with a shell account. It works on SELinux enabled systems such as CentOS/RHEL and others too. A local user with privileges to execute commands via sudo could use this flaw to escalate their privileges to root. Patch your system as soon as possible.

It was discovered that Sudo did not properly parse the contents of /proc/[pid]/stat when attempting to determine its controlling tty. A local attacker in some configurations could possibly use this to overwrite any file on the filesystem, bypassing intended permissions or gain root shell.
From the description

We discovered a vulnerability in Sudo’s get_process_ttyname() for Linux:
this function opens “/proc/[pid]/stat” (man proc) and reads the device number of the tty from field 7 (tty_nr). Unfortunately, these fields are space-separated and field 2 (comm, the filename of the command) can
contain spaces (CVE-2017-1000367).

For example, if we execute Sudo through the symlink “./ 1 “, get_process_ttyname() calls sudo_ttyname_dev() to search for the non-existent tty device number “1” in the built-in search_devs[].

Next, sudo_ttyname_dev() calls the function sudo_ttyname_scan() to search for this non-existent tty device number “1” in a breadth-first traversal of “/dev”.

Last, we exploit this function during its traversal of the world-writable “/dev/shm”: through this vulnerability, a local user can pretend that his tty is any character device on the filesystem, and
after two race conditions, he can pretend that his tty is any file on the filesystem.

On an SELinux-enabled system, if a user is Sudoer for a command that does not grant him full root privileges, he can overwrite any file on the filesystem (including root-owned files) with his command’s output,
because relabel_tty() (in src/selinux.c) calls open(O_RDWR|O_NONBLOCK) on his tty and dup2()s it to the command’s stdin, stdout, and stderr. This allows any Sudoer user to obtain full root privileges.

A list of affected Linux distro

  1. Red Hat Enterprise Linux 6 (sudo)
  2. Red Hat Enterprise Linux 7 (sudo)
  3. Red Hat Enterprise Linux Server (v. 5 ELS) (sudo)
  4. Oracle Enterprise Linux 6
  5. Oracle Enterprise Linux 7
  6. Oracle Enterprise Linux Server 5
  7. CentOS Linux 6 (sudo)
  8. CentOS Linux 7 (sudo)
  9. Debian wheezy
  10. Debian jessie
  11. Debian stretch
  12. Debian sid
  13. Ubuntu 17.04
  14. Ubuntu 16.10
  15. Ubuntu 16.04 LTS
  16. Ubuntu 14.04 LTS
  17. SUSE Linux Enterprise Software Development Kit 12-SP2
  18. SUSE Linux Enterprise Server for Raspberry Pi 12-SP2
  19. SUSE Linux Enterprise Server 12-SP2
  20. SUSE Linux Enterprise Desktop 12-SP2
  21. OpenSuse, Slackware, and Gentoo Linux

How do I patch sudo on Debian/Ubuntu Linux server?

To patch Ubuntu/Debian Linux apt-get command or apt command:
$ sudo apt update
$ sudo apt upgrade

How do I patch sudo on CentOS/RHEL/Scientific/Oracle Linux server?

Run yum command:
$ sudo yum update

How do I patch sudo on Fedora Linux server?

Run dnf command:
$ sudo dnf update

How do I patch sudo on Suse/OpenSUSE Linux server?

Run zypper command:
$ sudo zypper update

How do I patch sudo on Arch Linux server?

Run pacman command:
$ sudo pacman -Syu

How do I patch sudo on Alpine Linux server?

Run apk command:
# apk update && apk upgrade

How do I patch sudo on Slackware Linux server?

Run upgradepkg command:
# upgradepkg sudo-1.8.20p1-i586-1_slack14.2.txz

How do I patch sudo on Gentoo Linux server?

Run emerge command:
# emerge --sync
# emerge --ask --oneshot --verbose ">=app-admin/sudo-1.8.20_p1"

Kernel Programming, Operating System, Redhat / CEntOS / Oracle Linux, Ubuntu

Impermanence in Linux – Exclusive (By Hari Iyer)

Impermanence, also called Anicca or Anitya, is one of the essential doctrines and a part of three marks of existence in Buddhism The doctrine asserts that all of conditioned existence, without exception, is “transient, evanescent, inconstant”

On Linux, the root of all randomness is something called the kernel entropy pool. This is a large (4,096 bit) number kept privately in the kernel’s memory. There are 24096 possibilities for this number so it can contain up to 4,096 bits of entropy. There is one caveat – the kernel needs to be able to fill that memory from a source with 4,096 bits of entropy. And that’s the hard part: finding that much randomness.

The entropy pool is used in two ways: random numbers are generated from it and it is replenished with entropy by the kernel. When random numbers are generated from the pool the entropy of the pool is diminished (because the person receiving the random number has some information about the pool itself). So as the pool’s entropy diminishes as random numbers are handed out, the pool must be replenished.

Replenishing the pool is called stirring: new sources of entropy are stirred into the mix of bits in the pool.

This is the key to how random number generation works on Linux. If randomness is needed, it’s derived from the entropy pool. When available, other sources of randomness are used to stir the entropy pool and make it less predictable. The details are a little mathematical, but it’s interesting to understand how the Linux random number generator works as the principles and techniques apply to random number generation in other software and systems.

The kernel keeps a rough estimate of the number of bits of entropy in the pool. You can check the value of this estimate through the following command:

cat /proc/sys/kernel/random/entropy_avail

A healthy Linux system with a lot of entropy available will have return close to the full 4,096 bits of entropy. If the value returned is less than 200, the system is running low on entropy.

The kernel is watching you

I mentioned that the system takes other sources of randomness and uses this to stir the entropy pool. This is achieved using something called a timestamp.

Most systems have precise internal clocks. Every time that a user interacts with a system, the value of the clock at that time is recorded as a timestamp. Even though the year, month, day and hour are generally guessable, the millisecond and microsecond are not and therefore the timestamp contains some entropy. Timestamps obtained from the user’s mouse and keyboard along with timing information from the network and disk each have different amount of entropy.

How does the entropy found in a timestamp get transferred to the entropy pool? Simple, use math to mix it in. Well, simple if you like math.

Just mix it in

A fundamental property of entropy is that it mixes well. If you take two unrelated random streams and combine them, the new stream cannot have less entropy. Taking a number of low entropy sources and combining them results in a high entropy source.

All that’s needed is the right combination function: a function that can be used to combine two sources of entropy. One of the simplest such functions is the logical exclusive or (XOR). This truth table shows how bits x and y coming from different random streams are combined by the XOR function.

Even if one source of bits does not have much entropy, there is no harm in XORing it into another source. Entropy always increases. In the Linux kernel, a combination of XORs is used to mix timestamps into the main entropy pool.

Generating random numbers

Cryptographic applications require very high entropy. If a 128 bit key is generated with only 64 bits of entropy then it can be guessed in 264 attempts instead of 2128 attempts. That is the difference between needing a thousand computers running for a few years to brute force the key versus needing all the computers ever created running for longer than the history of the universe to do so.

Cryptographic applications require close to one bit of entropy per bit. If the system’s pool has fewer than 4,096 bits of entropy, how does the system return a fully random number? One way to do this is to use a cryptographic hash function.

A cryptographic hash function takes an input of any size and outputs a fixed size number. Changing one bit of the input will change the output completely. Hash functions are good at mixing things together. This mixing property spreads the entropy from the input evenly through the output. If the input has more bits of entropy than the size of the output, the output will be highly random. This is how highly entropic random numbers are derived from the entropy pool.

The hash function used by the Linux kernel is the standard SHA-1 cryptographic hash. By hashing the entire pool and and some additional arithmetic, 160 random bits are created for use by the system. When this happens, the system lowers its estimate of the entropy in the pool accordingly.

Above I said that applying a hash like SHA-1 could be dangerous if there wasn’t enough entropy in the pool. That’s why it’s critical to keep an eye on the available system entropy: if it drops too low the output of the random number generator could have less entropy that it appears to have.

Running out of entropy

One of the dangers of a system is running out of entropy. When the system’s entropy estimate drops to around the 160 bit level, the length of a SHA-1 hash, things get tricky, and how they effect programs and performance depends on which of two Linux random number generators are used.

Linux exposes two interfaces for random data that behave differently when the entropy level is low. They are /dev/random and /dev/urandom. When the entropy pool becomes predictable, both interfaces for requesting random numbers become problematic.

When the entropy level is too low, /dev/random blocks and does not return until the level of entropy in the system is high enough. This guarantees high entropy random numbers. If /dev/random is used in a time-critical service and the system runs low on entropy, the delays could be detrimental to the quality of service.

On the other hand, /dev/urandom does not block. It continues to return the hashed value of its entropy pool even though there is little to no entropy in it. This low-entropy data is not suited for cryptographic use.

The solution to the problem is to simply add more entropy into the system.

Hardware random number generation to the rescue?

Intel’s Ivy Bridge family of processors have an interesting feature called “secure key.” These processors contain a special piece of hardware inside that generates random numbers. The single assembly instruction RDRAND returns allegedly high entropy random data derived on the chip.

It has been suggested that Intel’s hardware number generator may not be fully random. Since it is baked into the silicon, that assertion is hard to audit and verify. As it turns out, even if the numbers generated have some bias, it can still help as long as this is not the only source of randomness in the system. Even if the random number generator itself had a back door, the mixing property of randomness means that it cannot lower the amount of entropy in the pool.

On Linux, if a hardware random number generator is present, the Linux kernel will use the XOR function to mix the output of RDRAND into the hash of the entropy pool. This happens here in the Linux source code (the XOR operator is ^ in C).

Third party entropy generators

Hardware number generation is not available everywhere, and the sources of randomness polled by the Linux kernel itself are somewhat limited. For this situation, a number of third party random number generation tools exist. Examples of these are haveged, which relies on processor cache timing, audio-entropyd and video-entropyd which work by sampling the noise from an external audio or video input device. By mixing these additional sources of locally collected entropy into the Linux entropy pool, the entropy can only go up.