= Basic Instructions for KNU Cluster Administrators and Operators =

== Root access to the nodes ==

Getting root access by means of {{{su}}} or {{{sudo}}} is prohibited. Password authentication for the root account works only from the physical console and is disabled over SSH.

The right way to get root access is a key-based SSH connection. The list of administrators' public SSH keys is installed with a dedicated RPM package. Adding keys to {{{authorized_keys}}} manually should be avoided.

== Installing, Removing and Updating packages and configuration ==

The configuration of all nodes is Kickstart-based and package-based only. All configuration packages are installed from the Blackjack-HKR repo.

Changing configs manually is '''strictly prohibited''' except for agreed testing and debugging purposes. Installing and removing packages manually by means of {{{yum install}}} is '''strictly prohibited'''.

To add a package, add the corresponding {{{Requires:}}} dependencies to the {{{knu-wn-deps}}} package (committed to SVN). The package then needs to be rebuilt and published. To update packages immediately, use the {{{yum nodesync}}} command.

To update configuration files, update the corresponding package in SVN or create a new one in {{{knu-config.spec}}}. If you have no experience with RPM spec files or no access to SVN, [[https://trac.grid.org.ua/ClusterSupport/newticket | open a new task in trac]].

== Managing services ==

The KNU Cluster is RHEL7-based and uses systemd for service management. Be prepared to work without LSB scripts and [[https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/System_Administrators_Guide/chap-Managing_Services_with_systemd.html|obtain basic systemd skills]].

High Availability services that must be maintained across the cluster rely on Pacemaker. Services under Pacemaker control '''SHOULD NOT''' be stopped, started or restarted manually!
This includes:
 - All clustered filesystems
 - A-nodes DRBD
 - A-nodes NFS
 - A-nodes FhGFS Management LXC container
 - PBS Mom
 - FhGFS Client

If node or service maintenance has to be done without Pacemaker interaction, the corresponding node '''SHOULD BE''' put into standby mode with {{{pcs cluster standby}}}. To return it to service, use {{{pcs cluster unstandby}}}.
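
The standby procedure above can be wrapped in a small shell helper so that the node is not left in standby by accident. This is only a minimal sketch, not an official cluster tool: the function name {{{with_standby}}} is hypothetical, and passing the node name as an argument to {{{pcs cluster standby}}} is an assumption based on the pcs syntax shipped with RHEL 7. The helper deliberately leaves the node in standby if the maintenance command fails, so that an operator can inspect it first.

```shell
# with_standby NODE COMMAND [ARGS...]
# Hypothetical helper: put NODE into standby, run COMMAND, and return
# NODE to service only if COMMAND succeeded.
with_standby() {
    ws_node="$1"
    shift

    # Ask Pacemaker to stop hosting resources on the node.
    # Assumption: the node name is given explicitly as an argument.
    pcs cluster standby "$ws_node" || return 1

    # Run the maintenance command supplied by the caller.
    if "$@"; then
        # Maintenance succeeded: let Pacemaker use the node again.
        pcs cluster unstandby "$ws_node"
    else
        ws_status=$?
        # Maintenance failed: keep the node in standby for inspection.
        echo "maintenance failed (exit $ws_status); $ws_node left in standby" >&2
        return "$ws_status"
    fi
}
```

The helper would be called with the node name followed by the maintenance command and its arguments. It does not wait for resources to finish stopping, so it is worth checking {{{pcs status}}} before touching anything that Pacemaker was managing on that node.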