infrastructure

currently i'm building a complete infrastructure from scratch: 3 * ibm x3650 m3 => 2 * vmware esxi, 1 * debian stable as nfs storage, ghettoVCB backups to nfs, icinga monitoring for all devices, a highly segmented security network design, vlan trunked switching, an ibm lto5 sas drive with bacula, a postfix/amavisd-new/postgrey/spamassassin mta, exchange 2010 + 2k8 ad, an svn / jenkins development machine, a juniper srx220h firewall and a mag2600 as ssl-vpn solution and some other stuff … day 7 …
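the ghettoVCB part is nothing fancy, just the stock script on the esxi hosts writing to the nfs-mounted backup datastore; a rough sketch (the vm list file, config path, log path and datastore name are placeholders, and the exact options may differ between ghettoVCB versions):

    # on the esxi host: back up the vms listed in vms_to_backup
    # VM_BACKUP_VOLUME in ghettoVCB.conf points at the nfs datastore, e.g. /vmfs/volumes/nfs-backup
    ./ghettoVCB.sh -f vms_to_backup -g ghettoVCB.conf -l /tmp/ghettoVCB.log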

 

monitoring vsphere 4.1 with icinga / pnp4nagios

this was the first time i built icinga monitoring for a whole vsphere environment; before that i had only configured checks directly against the esxi machines. i love the check_esx3.pl plugin from op5.com (a rough check definition sketch is further below). some pics from pnp4nagios:

got the system running on debian squeeze, with all packages from the main, contrib and non-free repositories. pnp4nagios comes from backports. so all packages are covered by security updates and the usual apt super cow powers!
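for the esxi checks, a minimal command/service definition in icinga 1.x object syntax could look roughly like this (hostname, credentials and the generic-service template are placeholders, and the exact check_esx3.pl options and subcommands may vary between plugin versions):

    # command wrapping the op5 plugin; credentials are passed as arguments
    define command{
        command_name    check_esx_host
        command_line    $USER1$/check_esx3.pl -H $HOSTADDRESS$ -u $ARG1$ -p $ARG2$ -l $ARG3$
    }

    # example service: cpu usage of one esxi host ("esx01" is made up)
    define service{
        use                     generic-service
        host_name               esx01
        service_description     esx cpu usage
        check_command           check_esx_host!monitor!secret!cpu
    }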

vmware vsphere and symantec endpoint / webserver port conflict

got a customer's w2k8 r2 mgmt machine running symantec endpoint and vmware vsphere. the problem: after installing the endpoint product, the performance graphs for the vms don't appear in vsphere anymore and the client crashes immediately (in detail: the vmware webservice services crash and cannot be restarted successfully). the cause: both products ship with their own webserver, which means port conflicts.

the solution: the machine has two nics, one for the mgmt net and one for a production net. i changed the localhost definitions in the vmware tomcat config to the mgmt ip, because i saw that the endpoint product automatically used the other interface. now the tomcat services of the vsphere installation run on a different interface and there are no more conflicts between the two webserver instances … the better solution is not to install two products with their own webservers on one machine in the first place, especially on a windows server.
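as a rough sketch of the kind of change (the file path matches a typical vcenter 4.x tomcat bundle and the ip is a made-up mgmt address; verify both against your own installation before editing):

    <!-- e.g. C:\Program Files\VMware\Infrastructure\tomcat\conf\server.xml -->
    <!-- the "address" attribute binds the connector to the mgmt nic only,
         so it no longer collides with the endpoint webserver on the other interface -->
    <Connector port="8443" protocol="HTTP/1.1" SSLEnabled="true"
               address="192.168.10.5" scheme="https" secure="true" />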

ibm ds3500 / degraded drive channel

got a customer's ibm ds3500 (vmware cluster) with the following errors: “drive channel set to degraded” and “individual drive degraded path”

the diagnostics showed errors on the counters for “controller detected errors” and “drive detected errors” on one drive channel. to me that pointed to a problem with the fc cable or the sfps of the controller, but after changing the two sfps and the fc cables the problem was still the same a few hours later.

i gave up on debugging it myself, opened an ibm support call and sent in the detailed drive data. after they analysed the data:

the cause of the errors was one of the disks in a raid10 array! wow, the disk itself was shown as healthy in the storage manager! one defective disk (reported as healthy) leads to a degraded drive channel; this makes me worry about ibm storage … hint: don't waste time on self-debugging (ibm) sans 😉
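for reference, the detailed support data ibm asked for can be collected with the ds storage manager cli, roughly like this (the array name and target path are placeholders, and the exact smcli syntax depends on the storage manager version):

    # run from the storage manager station: collect the complete support bundle of the array
    SMcli -n "DS3500-01" -c 'save storageArray supportData file="/tmp/ds3500_supportdata.zip";'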