IN (Intelligent Networks) Routine Check List
Version 2.12

Author: Wang Dong Xiao
Tel: 13916597332

In addition to the authors named on the cover page, the following persons have collaborated on this document:

  Jiang Xin ping     IN TAC2
  Gilles VERLINDE    VOMS TAC2
  Tu Honglei         IN TAC2

The document comprises 24 pages, all pages have issue no 01.
The document is based on template Normal.dot.
This issue was last saved on 25.04.2002.
This document was edited with MS WinWord 2000.

General Information

Issue Control

History

  Version  Release date  By                   Comment
  1.01     2001-06-14    Mr. Wang Dongxiao    First version
  1.02     2001-10-09    Mr. Jiang Xin ping   Add the VOMS and IVR check items
  1.03     2001-12-20    Mr. Jiang Xin ping   Add the check items for scon commdLog and SCP commDLog
  1.1      2002-03-15    Mr. Wang Dongxiao    Add the check items for CC server
                                              Delete the check item for IVR
  2.0      2002-03-21    Mr. Jiang Xin ping   Reviewed by Jiang Xin ping
                                              Add the check list for VOMS V6
  2.01     2002-04-25    Ms. Tu Honglei       Add the check item for finding the root sessions on SMP and SCP
  2.1      2002-05-30    Mr. Wang Dongxiao    Add the check items for Database Backup
                                              Add check items for monitoring the File Exchange manager
  2.11     2002-06-11    Mr. Wang Dongxiao    Add check for the number of logins on CC server
  2.12     2002-07-09    Mr. Wang Dongxiao    Add check for the online interface of SCP and VOMS
                                              Add check for confirmation tickets on SCP and SMP

Table 1: History

1. Routine Check for SMP

Action and Command    Remarks

· Check File System occupancy
  root > df -k
  Shows how much space is available on each file system. If the available space is close to 0, look for large files as follows:

· Search for large files
  root > find . -size +2000 | xargs ls -l
  (find counts -size in 512-byte blocks, so +2000 matches files larger than about 1 MB.)

· Search for new large files
  root > find /home* -type f -mtime -7 -size +2000

· Search for core files
  root > find /home* -type f -name core

· Check System Performance
  root > sar 10 5
  The result looks like:

    12:00:55  %usr  %sys  %wio  %idle  int/s  intdef/s
    12:01:05     2     4     0     94      6         0
    12:01:15    10    15     2     73     24         0
    12:01:25     4     8    18     69     31         0
    12:01:35    10    15    35     39     67         0
    12:01:45    19    32    43      6     93         0

  Note: %idle should be more than 70%.
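The %idle rule can also be checked by a small filter instead of reading the table by eye. The following is only a sketch: check_idle is a hypothetical helper name (it is not part of the system), and it assumes the sar column layout shown in this check list, with a header line that contains %idle and data lines that start with an hh:mm:ss time stamp.

```shell
#!/bin/sh
# check_idle THRESHOLD
# Hypothetical helper: reads sar CPU output on stdin and prints one line for
# every sample whose %idle value is below THRESHOLD.
# Returns 1 if at least one low sample was found, 0 otherwise.
check_idle() {
    awk -v min="$1" '
        # Until the header is seen, look for the %idle column and skip the line.
        col == 0 {
            for (i = 1; i <= NF; i++) if ($i == "%idle") col = i
            next
        }
        # Data lines start with an hh:mm:ss time stamp.
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $col + 0 < min {
            printf "LOW %s idle=%s\n", $1, $col
            low++
        }
        END { exit (low > 0 ? 1 : 0) }
    '
}
```

Usage, assuming the function has been loaded into the current shell:

  root > sar 10 5 | check_idle 70

Reading from stdin keeps the sketch independent of how sar is called, so a saved sar report can be checked in the same way.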
· Check all root logins
  root > finger -q root

· Send a broadcast to all users currently logged in
  root > wall

· See the number of root logins
  root > finger | grep root | wc -l
  Too many root sessions can reduce system performance and security. Ask users who are logged in from unknown locations to log out by broadcasting a message, e.g.:
  root > wall
  Please exit the root login
  (press Ctrl+D to end the message)

· Check Informix log file
  The Informix log file is /home/Informix/online.log. Its latest entries can be viewed with the command:
  root > onstat -m
  Check that the logical log backup completed successfully. The output should look like:

    Logical Log 2507 Complete
    Logical Log 2507 - Backup Started
    Logical Log 2507 - Backup Completed

· Check Informix logical logs
  root > onstat -l
  Check that the logical logs have been backed up to tape. Look at the flags column (B = backed up, C = current log):

    address   number  flags    uniqid  begin  size  used  %used
    21a25cb8  1       U-B---   …….
    21a25cd4  2       U-B---   …….
    . . . .
    21a25d60  7       U---C-L  ……..

  If the logical logs are not backed up to tape, check whether the tape is mounted on the system.

· Check Database
  root > su - informix
  informix > onstat -
  Check that the Informix database is online.

· Check the Networker processes
  root > ps -ef | grep nsr
  Normally you should find the following processes:

    /opt/nsr/nsrd
    /opt/nsr/nsrexecd
    /opt/nsr/nsrmmd -n 2
    /opt/nsr/nsrmmdbd
    /opt/nsr/nsrindexd
    /opt/nsr/nsrmmd -n 1

· Check the tape information
  root > nsrmm
  Shows which tape is mounted.

· Check backup content
  root > mminfo -m
  Example:
    root@xosmp1:/ # mminfo -m
       volume           written  (%)   read    expires   mounts  capacity
     E SMP_x24smp.001   26 GB    100%  182 MB  03/22/03  21      20 GB
     E SMP_x24smp.002   25 GB    100%  390 MB  09/24/03  15      20 GB
     E SMP_x24smp.004   6402 MB  32%   0 KB    10/29/03  24      20 GB
       SMP_x24smp.013   0 KB           0 KB    11/07/03  0       20 GB
     E SMP_x24smp.035   5249 MB  26%   0 KB    11/28/03  13      20 GB
     E SMP_x24smp.036   5282 MB  26%   0 KB    11/29/03  15      20 GB
     E SMP_x24smp.037   5316 MB  27%   0 KB    11/30/03  13      20 GB
     E SMP_x24smp.038   30 MB    0.1%  0 KB    12/01/03  14      20 GB
     E SMP_x24smp.039   8669 MB  43%   0 KB    12/02/03  14      20 GB
     E SMP_x24smp.040   7286 MB  36%   0 KB    12/03/03  13      20 GB
     E SMP_x24smp.041   7412 MB  37%   0 KB    12/05/03  22      20 GB
     E SMP_x24smp.042   10 MB    0.1%  0 KB    12/21/03  16      20 GB
       SMP_x24smp.125   10 GB    51%   0 KB    03/13/04  9       20 GB
       SMP_x24smp.126   11 GB    54%   0 KB    03/14/04  9       20 GB
       SMP_x24smp.127   5952 MB  30%   0 KB    03/15/04  43      20 GB
       SMP_x24smp.128   10 GB    52%   0 KB    03/16/04  46      20 GB
       SMP_x24smp.129   11 GB    57%   0 KB    03/17/04  35      20 GB
       SMP_x24smp.130   10 GB    51%   0 KB    03/18/04  33      20 GB
       SMP_x24smp.131   9770 MB  49%   0 KB    03/19/04  36      20 GB

· Check Networker backup result
  Open the log file /nsr/logs/daemon.log and check that the defined backup groups were backed up successfully. You should find entries like:

    nsrd: savegroup notice: SMP_DB_GROUP completed, 1 client(s) (All Succeeded)
    nsrd: savegroup notice: SMP_SYS_GROUP completed, 1 client(s) (All Succeeded)

· Check observe SMP
  root > obsinfo
  Check the status of observe on the SMP. The output looks like:

    OBSERVE System Information
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    OBSERVE:                               started
    local system:  x1smp1                  status: orig
    remote system: x1smp2                  status: backup
    connection to remote host:             status: yes
    control line 1 operating:              status: yes
    control line 2 operating:              status: yes
    start of observe allowed at reboot:    status: yes
    autorepair:                            status: no
    autotake:                              status: yes

· Check teleservice state
  root > telestat -2
  Check that the modem is online.

· Check the HW message
  Log in to the system.
  root > cd /usr/bin/lar
  root > ./show_defacts
  Check whether the HW_CONFIG defects table contains any entries. Any hardware malfunction is reported there.

· Check if the confirmation tickets are jammed
  root > cd /stat112
  root > ls -l
  root > ls -l CC
  Usually there are only one or two files under /stat112 and /stat112/CC.
  If there are lots of files under /stat112 but no file under /stat112/CC, check the "move ticket" crontab entry:
  root > crontab -l smsftac

    #-----------------------------------------------------------------------------
    # moveticket cronjob for CCserver
    0,15,30,45 * * * /var/root/bin/moveticket.sh /stat112/ /stat112/CC/ > /dev/null

  If there are lots of files under /stat112/CC, check the LAN/WAN connection between the SMP and the CC server.

2. Routine Check for SCP

· Check File System occupancy
  root > df -k
  Shows how much space is available on each file system. If the available space is close to 0, look for large files as follows:

· Search for large files
  root > find . -size +2000 | xargs ls -l

· Search for new large files
  root > find /home* -type f -mtime -7 -size +2000

· Search for core files
  root > find /home* -type f -name core

· Check System Performance
  root > sar 10 5
  The result looks like:

    12:00:55  %usr  %sys  %wio  %idle  int/s  intdef/s
    12:01:05     2     4     0     94      6         0
    12:01:15    10    15     2     73     24         0
    12:01:25     4     8    18     69     31         0
    12:01:35    10    15    35     39     67         0
    12:01:45    19    32    43      6     93         0

· Check all root logins
  root > finger -q root

· Send a broadcast to all users currently logged in
  root > wall

· See the number of root logins
  root > finger | grep root | wc -l
  Too many root sessions can reduce system performance and security. Ask users who are logged in from unknown locations to log out by broadcasting a message, e.g.:
  root > wall
  Please exit the root login
  (press Ctrl+D to end the message)

· Check Networker backup result
  Open the log file /nsr/logs/daemon.log and check that the defined backup groups were backed up successfully. You should find entries like:

    nsrd: savegroup notice: SCP_DB_GROUP completed, 1 client(s) (All Succeeded)
    nsrd: savegroup notice: SCP_SYS_GROUP completed, 1 client(s) (All Succeeded)

· Check the Networker processes
  root > ps -ef | grep nsr
  Normally you should find the following processes:

    /opt/nsr/nsrd
    /opt/nsr/nsrexecd
    /opt/nsr/nsrmmd -n 2
    /opt/nsr/nsrmmdbd
    /opt/nsr/nsrindexd
    /opt/nsr/nsrmmd -n 1

· Check the tape information
  root > nsrmm
  Example:

    root@xosmp1:/ # nsrmm
    8mm mammoth tape SMP_x24smp.131 mounted on /dev/ios0/rstape003chn, write enabled

· Check backup content
  root > mminfo -m
  Example:

    root@x24ce01:/ # mminfo -m
       volume            written  (%)   read   expires   mounts  capacity
       SCP_x24ce01.001   69 GB    full  0 KB   09/18/03  18      20 GB
       SCP_x24ce01.002   31 GB    100%  17 GB  09/25/03  7       20 GB
       SCP_x24ce01.003   3496 MB  17%   0 KB   10/29/03  19      20 GB
     E SCP_x24ce01.004   35 GB    full  0 KB   10/08/03  9       20 GB
       SCP_x24ce01.006   3608 MB  18%   0 KB   10/31/03  1       20 GB
       SCP_x24ce01.029   25 GB    100%  0 KB   12/02/03  32      20 GB
       SCP_x24ce01.038   13 GB    65%   0 KB   12/16/03  0       20 GB
     E SCP_x24ce01.064   13 MB    0.1%  0 KB   01/30/04  0       20 GB
       SCP_x24ce01.081   11 GB    57%   0 KB   02/18/04  47      20 GB
       SCP_x24ce01.087   12 GB    58%   0 KB   02/25/04  0       20 GB
       SCP_x24ce01.102   12 GB    60%   0 KB   03/13/04  58      20 GB
       SCP_x24ce01.103   12 GB    60%   0 KB   03/14/04  67      20 GB
     E SCP_x24ce01.104   45 MB    0.2%  0 KB   03/15/04  69      20 GB
       SCP_x24ce01.105   12 GB    60%   0 KB   03/16/04  39      20 GB
       SCP_x24ce01.106   12 GB    61%   0 KB   03/17/04  8       20 GB
       SCP_x24ce01.107   12 GB    61%   0 KB   03/18/04  65      20 GB
       SCP_x24ce01.108   12 GB    61%   0 KB   03/19/04  68      20 GB

· Check teleservice state
  root > telestat -2
  Check that the modem is online.

· Check the HW message
  Log in to the system.
  root > cd /usr/bin/lar
  root > ./show_defacts
  Check whether the HW_CONFIG defects table contains any entries. Any hardware malfunction is reported there.

· Check the commD connection to scon on both CE
  root > ps -ef | grep commD
  The output should contain the following information:

    root  3017     1  0 00:26:05 ?         0:05 /opt/SIdlm/commD/commD
    root  3034  3017  0 00:26:05 ?         0:01 /opt/SIdlm/commD/commD
    root  6093  6057  2 10:13:48 inet/141  0:00 grep commD

· Check commD log file of both CE
  root > cd /opt/SIdlm/commD
  root > cat commDLog
  The output should contain the following information for the newest date, for example:

    Thu Dec 20 00:25:46 -> commD_ping: return o.k. with 2
    Thu Dec 20 00:25:46 -> from DlmGetInfo: commD_ping_state_nids[]: 3 - 3 - 1 - 1

· Check the online interface connection from CC server to SCP and from SCP to VOMS
  root > netstat -a | grep 22099
  On one CE, the output should look like:

    root > netstat -a | grep 22099
    tcp  0  0  x10ce01.2351  ccserver.22099  ESTABLISHED
    tcp  0  0  x10ce01.2350  voms30.22099    ESTABLISHED
    tcp  0  0  x10ce01.2349  voms10.22099    ESTABLISHED
    tcp  0  0  x10ce01.2348  voms50.22099    ESTABLISHED
    tcp  0  0  *.22099       *.*             LISTEN

  On the other CE, the output should look like:

    root > netstat -a | grep 22099
    tcp  0  0  x10ce02.1070  voms50.22099    ESTABLISHED
    tcp  0  0  x10ce02.1495  voms10.22099    ESTABLISHED
    tcp  0  0  x10ce02.1494  voms30.22099    ESTABLISHED
    tcp  0  0  *.22099       *.*             LISTEN

· Check if the confirmation tickets are jammed
  root > cd /IN/scp/data.99/tickets/cft
  root > ls -l
  root > ls -l CC
  The number of tickets should be less than 10.
  If there are lots of tickets under /IN/scp/data.99/tickets/cft but no file under /IN/scp/data.99/tickets/cft/CC, check the "move ticket" cronjob.
  root > crontab -l

    #-----------------------------------------------------------------------------
    # moveticket cronjob for CCserver
    * * * * /IN/scp/scripts/moveticket.sh /IN/scp/data.99/tickets/cft/ /IN/scp/data.99/tickets/cft/CC/ > /dev/null

  If there are lots of tickets under /IN/scp/data.99/tickets/cft/CC, check the LAN/WAN connection between the SCP and the CC server.

3. Routine Check for scon

· Check commd connection to both CE
  root > ps -ef | grep /opt/SIdlm/commd/commd\ -d
  The output should contain the following information:

    root   2615   2590  0   Oct 19 ?        0:13 /opt/SIdlm/commd/commd -d
    root  28544  28484  0 11:41:50 inet/23  0:00 grep /opt/SIdlm/commd/commd -d
    root   2590      1  0   Oct 19 ?        0:26 /opt/SIdlm/commd/commd -d

· Check commdLog file of scon
  root > cd /opt/SIdlm/commd/
  root > cat commdLog
  The output should contain the following information for the newest date, for example:

    Thu Dec 20 00:25:46 - Received NOTIFY msg from 'x9ce02' (NID 2)
    Thu Dec 20 00:25:47 - Current cluster size 2 ( 1 2 )
    Thu Dec 20 00:26:04 - Received NOTIFY msg from 'x9ce01' (NID 1)
    Thu Dec 20 00:26:04 - Registration phase finished - commd ready
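The commdLog check on scon can be condensed into one status line. This is a minimal sketch, not a supported tool: scon_commd_status is a hypothetical name, and the two patterns it looks for ("Current cluster size N" and "commd ready") are taken only from the example log lines in this check list, so other commd versions may word them differently. It scans whatever it is given on stdin; to look at the newest entries only, feed it the tail of the log.

```shell
#!/bin/sh
# scon_commd_status EXPECTED_SIZE
# Hypothetical helper: reads commdLog text on stdin, remembers the last
# reported cluster size and whether a "commd ready" line was seen.
# Returns 0 only if the last cluster size equals EXPECTED_SIZE and commd
# reported ready.
scon_commd_status() {
    awk -v want="$1" '
        # The number follows the word "size" in the example log line.
        /Current cluster size/ {
            for (i = 1; i < NF; i++) if ($i == "size") size = $(i + 1)
        }
        /commd ready/ { ready = 1 }
        END {
            printf "cluster_size=%s ready=%s\n", (size == "" ? "unknown" : size), (ready ? "yes" : "no")
            exit (size == want && ready ? 0 : 1)
        }
    '
}
```

Usage, with both CE expected in the cluster:

  root > tail -n 50 /opt/SIdlm/commd/commdLog | scon_commd_status 2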

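The confirmation-ticket checks in sections 1 and 2 follow the same decision rule, so they can share one helper. The sketch below is an illustration under stated assumptions: ticket_backlog and count_files are hypothetical names, and "lots of files" is interpreted as 10 or more, borrowing the limit given for the SCP tickets; adjust the limit to local experience.

```shell
#!/bin/sh
# count_files DIR: number of regular files directly inside DIR
# (sub-directories such as CC are not counted).
count_files() {
    n=0
    for f in "$1"/*; do
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# ticket_backlog DIR [LIMIT]
# Hypothetical helper: compares the ticket count in DIR and DIR/CC with
# LIMIT (default 10, an assumption) and prints which check to do next.
ticket_backlog() {
    dir=$1
    limit=${2:-10}
    top=$(count_files "$dir")
    cc=$(count_files "$dir/CC")
    if [ "$top" -ge "$limit" ] && [ "$cc" -eq 0 ]; then
        echo "top=$top cc=$cc: check the moveticket cronjob"
    elif [ "$cc" -ge "$limit" ]; then
        echo "top=$top cc=$cc: check the LAN/WAN connection to the CC server"
    else
        echo "top=$top cc=$cc: ok"
    fi
}
```

Usage on the SMP and on the SCP, with the directories named in this check list:

  root > ticket_backlog /stat112
  root > ticket_backlog /IN/scp/data.99/tickets/cft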





