Apache2 umask

It is often useful to fine-tune the default permissions of files created on a Linux system. This is simple: if you use bash, all you have to do is set a new umask value somewhere in the bash startup files (/etc/profile is a good place for this), like this:
umask 002
(this allows group write permission by default on newly created files)

Modern Linux distributions usually set this to 022 by default, and you can easily find out what it is on your system by running the umask command without arguments:
umask
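
To make the difference concrete, here is a small sketch that creates one file under each of the two umask values mentioned above and prints the resulting modes. It assumes GNU stat (the -c option); the temporary directory and file names are only illustrative.

```shell
#!/bin/sh
# Compare the mode of a newly created file under two umask values.
# Assumes GNU stat (the -c option), as found on most Linux systems.
workdir=$(mktemp -d)

# Each subshell gets its own umask, so the calling shell is left untouched.
(umask 022; touch "$workdir/with_022")
(umask 002; touch "$workdir/with_002")

mode_022=$(stat -c '%a' "$workdir/with_022")
mode_002=$(stat -c '%a' "$workdir/with_002")

echo "umask 022 -> $mode_022"
echo "umask 002 -> $mode_002"

rm -rf "$workdir"
```

With 022 the file ends up as 644 (group read only), with 002 as 664 (group read and write).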

Contrary to what you might expect, this is not enough to make the setting apply to all applications and daemons on the system. It works fine for files created from a shell session, but files created by other processes, such as the web server, will still use the default unless configured otherwise. To have Apache use a different umask, we can define it inside /etc/apache2/envvars (Debian and Ubuntu systems) or /etc/sysconfig/httpd (RHEL and CentOS systems) like this:
umask 002
and restart Apache for the change to take effect.
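
One way to double-check which umask a running daemon really has is to read it from /proc. This is only a sketch: it assumes a kernel recent enough (4.7 or later) to expose the Umask field in /proc/<pid>/status, and proc_umask is a helper name made up for this example.

```shell
#!/bin/sh
# Print the umask of a running process from /proc/<pid>/status.
# Assumes a kernel new enough (4.7 or later) to expose the "Umask:" field.
proc_umask() {
    awk '/^Umask:/ { print $2 }' "/proc/$1/status"
}

# Demonstrated on the current shell; for the web server, pass the PID of
# one of its running processes instead.
proc_umask "$$"
```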

Other daemons have different locations where you can define this to override the default umask setting (check their documentation if you are unsure).


PHP Sessions in Memcached

As soon as a PHP application grows to run on more than one server, PHP sessions usually start to cause problems. If the application does not keep any session state you are lucky and need not care about this; otherwise, no matter how well your load balancer handles stickiness (sending each user to the same real server), this will slowly become a major issue. There are various ways to store PHP sessions in a shared location, but today I want to present one that is very simple to implement, yet very efficient and, in the long term, better suited than a database backend: using memcached to store the sessions.

The pecl memcache PHP extension has long supported memcache as a session.save_handler, but release 3.0.x (still in beta at this time) brings a set of interesting features for us:
- UDP support
- Binary protocol support
- Non-blocking IO using select()
- Key and session redundancy (values are written to N mirrors)
- Improved error reporting and failover handling

Read the rest of this entry »


Official Ubuntu Amazon EC2 AMIs

Ubuntu has released official images for Amazon EC2 for the Intrepid (8.10) and Hardy (8.04) releases (no Jaunty image yet). These are server edition images. Until now I have always used the great alestic EC2 images created by Eric Hammond for any Ubuntu or Debian release I needed, and I was very happy with the quality of the images Eric maintained. The Ubuntu team noticed this too, and worked with Eric to create their official images with the same quality and most of the features people had come to expect from Ubuntu images in the EC2 world.

In my opinion here are the advantages of the newly released official Ubuntu images:

  • officially supported by Canonical (Eric has done a great job patching and updating his images, but I am sure he has better things to do and can let the Ubuntu team handle this).
  • custom kernels, built with Amazon’s support: 2.6.27 for Intrepid and 2.6.24 for Hardy (the alestic images used the default Amazon Fedora 2.6.21 kernel image).
  • apt mirrors in the ec2 cloud provided by Ubuntu: us.ec2.archive.ubuntu.com and eu.ec2.archive.ubuntu.com
  • RightScale support for advanced integration with the RightScale platform for RightScale users.

Read the rest of this entry »


iptables geoip match on debian lenny

The geoip iptables extension allows you to filter, NAT or mangle packets based on their source or destination country. It does exactly what the geoip Apache module or the regular geoip binary does, but at the iptables level. I will not go into the details of why you would want to use it, but there are many ‘positive’ ways it can be useful… For example, I use it in a project where we want to serve customized content for different countries. Since this is a high-traffic site running on many web servers behind a load-balanced setup, we prefer to split the traffic at the load balancer level rather than at the Apache level, to simplify our setup. We serve customized content to US-based visitors, and a separate international site to all other countries.

This has been working fine for a long time, using the original geoip module and the patch-o-matic-ng method of installation (similar to what is very well described here). Still, that module is unmaintained, and starting with kernel 2.6.22 it no longer works. There is a patch that makes it work with newer kernels, but if you run iptables 1.4.x it fails again, and even though some manual workarounds exist this is still not the best solution.

The solution is called Xtables-addons. Xtables-addons is the successor to patch-o-matic-ng. Likewise, it contains extensions that were not, or are not yet, accepted in the main kernel/iptables packages. Xtables-addons differs from patch-o-matic in that you do not have to patch or recompile the kernel, and sometimes recompiling iptables is not needed either.
The latest version, 1.12, supports iptables >= 1.4.1 and kernel-source >= 2.6.17.
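
To give an idea of what a country-based split like the one described above could look like once the geoip match is available, here is a minimal sketch. It is not the actual setup from this project: the two backend addresses are hypothetical (taken from the documentation range), DNAT is just one possible way to do the split, and by default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: send US visitors to one backend and everyone else to another,
# using the geoip match from Xtables-addons.
# US_BACKEND and INTL_BACKEND are hypothetical addresses (documentation range).
US_BACKEND="192.0.2.10:80"
INTL_BACKEND="192.0.2.20:80"

# Print commands by default; set DRY_RUN=0 (as root) to really apply them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

geoip_rules() {
    # The first matching rule in the nat table wins, so the US rule goes first.
    run iptables -t nat -A PREROUTING -p tcp --dport 80 \
        -m geoip --src-cc US -j DNAT --to-destination "$US_BACKEND"
    # Everything that did not match the US rule goes to the other backend.
    run iptables -t nat -A PREROUTING -p tcp --dport 80 \
        -j DNAT --to-destination "$INTL_BACKEND"
}

geoip_rules
```

Keeping the rules behind a dry-run wrapper makes it easy to review them before touching a live load balancer.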

Read the rest of this entry »


iotop: simple top-like i/o monitor

iotop does for I/O usage what top does for CPU usage. It watches the I/O usage information output by the Linux kernel and displays a table of current I/O usage by processes on the system. The tool is written by Guillaume Chazarain and requires Python >= 2.5 and a Linux kernel >= 2.6.20 to run. This post introduces this very useful tool and shows how to install and use it.

iotop can be downloaded either as a source package or as an rpm package. Starting with lenny, Debian includes iotop in the main repository, and installing it is as simple as running:
aptitude install iotop
This is very cool indeed, and kudos to the Debian team for including iotop in lenny :-)
Read the rest of this entry »


iopp: howto get i/o information per process

We all know and love vmstat, but wouldn’t it be nice to get such information on a per-process basis, to better understand what is causing i/o problems? This is exactly what iopp, written by Mark Wong and released as open source, does:
“It’s a custom tool to go through the Linux process table to get i/o statistics per process. It is open source and can be downloaded from: http://git.postgresql.org/?p=~markwkm/iopp.git;a=summary”

Now this sounds interesting, and I am sure anyone who has dealt with i/o issues in the past will find it very useful. Let’s see how we can install it and what kind of reporting we get. We will install it from source, and here are some quick steps to do this (you will need git and cmake):
git clone git://git.postgresql.org/git/~markwkm/iopp.git
cd iopp
cmake CMakeLists.txt
make
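
To get a feel for the kind of data involved, here is a rough shell sketch that walks /proc and prints the read_bytes and write_bytes counters from /proc/<pid>/io for each process. This is not how iopp itself is implemented, only an illustration of the idea; it assumes a kernel built with per-task I/O accounting, and proc_io is a helper name made up for this example.

```shell
#!/bin/sh
# Print "pid read_bytes write_bytes" for one process, read from /proc/<pid>/io.
# Assumes a kernel built with per-task I/O accounting; reading the counters
# of another user's process normally requires root.
proc_io() {
    [ -r "/proc/$1/io" ] || return 1
    awk -v pid="$1" '
        /^read_bytes:/  { r = $2 }
        /^write_bytes:/ { w = $2 }
        END { if (r != "") print pid, r, w }
    ' "/proc/$1/io" 2>/dev/null
}

# Walk the process table and report every process we are allowed to read.
for dir in /proc/[0-9]*; do
    proc_io "${dir#/proc/}" || continue
done
```

The counters are cumulative since the process started, so a real tool has to sample them twice and print the difference to show a rate.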

Read the rest of this entry »


Mdadm Cheat Sheet

Mdadm is the tool most Linux distributions use these days to manage software RAID arrays; in the past, raidtools was used for this. This cheat sheet shows the most common usages of mdadm to manage software RAID arrays; it assumes you have a good understanding of software RAID and Linux in general, and only explains the command-line usage of mdadm. The examples below use RAID1, but they can be adapted for any RAID level the Linux kernel driver supports.

1. Create a new RAID array

Create (mdadm --create) is used to create a new array; the number of member devices has to be given with --raid-devices:
mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb2
Read the rest of this entry »


HowTo force remote devices (routers/switches) to refresh their arp cache entry for a machine

The Address Resolution Protocol (ARP) is the method for finding a host’s link layer (hardware) address when only its Internet Layer (IP) or some other Network Layer address is known. ARP is a Link Layer protocol (Layer 2) because it only operates on the local area network or point-to-point link that a host is connected to. When we migrate an IP from one machine to another, we might have problems caused by ‘arp caching’. Various devices cache ARP information for a certain amount of time, so even after we have moved the IP, some devices will not see the change and will keep using the cached information. I am talking about directly connected switches or routers, which we may or may not control. If we control all the external devices, we normally just connect to the router or switch and remove the ARP entry, forcing the device to query for the information again. This post tries to help in the situation where we don’t have direct control over the external devices (we are colocated, or use rented servers in a remote datacenter, etc.), to minimize the downtime associated with this type of IP migration.

It is quite common to use separate IPs for various services on the same machine, and to move those IPs to another server if needed. These are sometimes called portable IPs, which can be migrated to any server in a particular colo/LAN. This is normally done to minimize downtime and keep the maintenance effort of such operations low (and to avoid relying on DNS changes). Still, ARP caching on various network devices can cause big problems. Let’s assume we move an IP from one server to another in the same LAN, to move some service away from our main web server. Taking the IP down on the existing server and bringing it up on the new one is all we can do directly if we don’t have access to the switches/routers in front of us. Again, if you control all the devices, just connect to them and delete the ARP cache entry for this IP so that it is re-learned from the new machine.
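
One commonly used way to nudge neighbouring devices without logging in to them is to send gratuitous (unsolicited) ARP packets from the new server right after the IP comes up. The sketch below shows that idea with the arping tool from iputils; it is not necessarily the exact method the rest of this post describes. The IP, prefix length and interface name are hypothetical placeholders, and by default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: announce a migrated IP with gratuitous ARP so that neighbours
# refresh their caches. IP, PREFIX and IFACE are hypothetical placeholders.
IP="192.0.2.50"
PREFIX="24"
IFACE="eth0"

# Print commands by default; set DRY_RUN=0 (as root) to really run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

announce_ip() {
    # Bring the address up on the new server...
    run ip addr add "$IP/$PREFIX" dev "$IFACE"
    # ...then send three unsolicited ARP packets for it (iputils arping).
    run arping -U -I "$IFACE" -c 3 "$IP"
}

announce_ip
```

Whether a given router or switch honours unsolicited ARP depends on the device and its configuration, so this reduces the stale-cache window rather than guaranteeing an immediate update.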

Read the rest of this entry »


Bcfg2 0.9.6 debian package for etch

The Bcfg2 version available in Debian etch is quite old (v0.8.6), and while the one in lenny is newer (v0.9.5.7), it still isn’t the latest stable version, 0.9.6, which was released in November last year. Since this version fixes many bugs, it is the one recommended for production use at this time (unfortunately it breaks the reporting system, which will not be fixed until release 1.0, planned for the coming months). This post shows how to build a Debian package for the latest stable bcfg2 release so we can easily deploy it on several machines.

Bcfg2 is a Debian-friendly project, meaning the source package includes everything needed to build a Debian package very easily. We will use a Debian etch system for this, but it should work on any Debian-based system.

Read the rest of this entry »


HowTo ignore some files/folders from awstats reports

Awstats counts any entry in the log it processes as a page hit. By default, some file extensions (regular image types and css/js) are excluded from what awstats considers a page:
NotPageList="css js class gif jpg jpeg png bmp ico"
(this is the default). All other file types are counted as pages. If we want to completely ignore some files, or even the entire content of a folder, in awstats processing, we can use the SkipFiles parameter. We might want to do this to ignore frames, hidden pages, ajax calls, etc.

Read the rest of this entry »
