Snapshot Abends Server
Date published: January 6, 2012
Products: Portlock Storage Manager Client and Portlock Storage Manager for NetWare.
Environment: Portlock Storage Manager Client is imaging a NetWare 6.5 SP7 server using the Image NetWare Server wizard and the server abended during the image.
Issue #1: The NetWare server abended when performing an online image.
Issue #2: After rebooting the server, a pool snapshot existed that prevented another online imaging attempt.
Issue #3: After completing the Image NetWare Server wizard, an error message about an invalid file handle was reported.
Resolution for Issue #1: Install NetWare 6.5 SP8 and N65NSS8C. Portlock Storage Manager uses Novell's Snapshot feature to image NSS Pools while they are active with mounted volumes. This feature requires SP8 and the post-SP8 NSS patch.
Novell frequently changes its download URLs. If you have a problem with the above links, search for the following file names with Google. Select a download location that is from Novell and not a third party:
Note: Prior to installing a service pack, we recommend using Portlock Storage Manager to image your server. This way if a serious problem appears after installing SP8, you can restore your server back to its current configuration. In this case, do an offline image.
Note: If you do not want to install SP8, then use the offline imaging feature. Do not select a Pool for snapshots in the image wizard.
Resolution for Issue #2: Delete the snapshot. After the server abended, the snapshot that Portlock Storage Manager created was not deleted.
Start Portlock Storage Manager on the server:
- Select Pool Commands
- Select Pool Snapshot Menu
- Select Delete Snapshot
- Select the snapshot to delete
To load Portlock Storage Manager on the server type the following command at the server console:
- LOAD SYS:/STORMGR/STORMGR
NetWare also provides snapshot commands at the server console:
- To get a list of commands type HELP MM
- To list the existing snapshots type MM SNAP LIST
- To delete a snapshot type MM SNAP DELETE snapname
- To delete all snapshots type MM SNAP DELETE ALL /Y
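The console commands above can be collected into a small helper when preparing a cleanup runbook. The following is a minimal sketch in Python that only builds the command strings listed above; it does not connect to a server, and the function name `snapshot_cleanup_commands` and its parameters are hypothetical.

```python
def snapshot_cleanup_commands(snapshot_names=(), delete_all=False):
    """Build NetWare console commands for listing and deleting snapshots.

    Hypothetical helper: it returns strings to type at the server console
    and performs no I/O of its own.
    """
    commands = ["MM SNAP LIST"]  # list first so you can see what exists
    if delete_all:
        # Delete-all must be requested explicitly because it is destructive.
        commands.append("MM SNAP DELETE ALL /Y")
    else:
        for name in snapshot_names:
            commands.append(f"MM SNAP DELETE {name}")
    return commands


if __name__ == "__main__":
    for line in snapshot_cleanup_commands(["SYS_SNAP"]):
        print(line)
```

Requiring `delete_all=True` is a design choice of this sketch so that the destructive form is never produced by accident.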
Resolution for Issue #3: Using the Novell Client, log in to the NetWare server again. When a NetWare server abends, the Novell Client sometimes does not reconnect to the NetWare server once it reboots. In some cases, a reboot of the workstation is required to clear the Novell Client connection table.
More Information
Introduction
Novell SnapShot Backup
Portlock Online Imaging
Common reasons that Online Imaging fails:
NSS Pools must be in a stable and warning free state for NetWare's Snapshot feature to work correctly
If you are seeing pool errors during an Online Image
Portlock Storage Manager Log files
NSS /PoolVerify Log files
NSS /PoolRebuild Log files
Reporting issues to Portlock
Introduction
Data backup and recovery is probably one of the most essential tasks of a
system administrator. In the event of a disaster, a successful recovery of data
and services will determine whether you will be getting your raise or
looking for a new job. Yet many administrators do not pay much attention to
backup and recovery strategy. Many put absolute faith in the backup software
logs that declare that the backup completed successfully for the day.
However, we forget that a successful backup is not the end goal; a successful
restore in the event of a disaster is. Herein lies the trap of false
security. In the event of a disaster, without a clear and tested recovery plan,
an administrator may spend the next 2 days trying to figure out why data cannot
be recovered. It does not help that your users and boss are breathing down your
neck.
Novell SnapShot Backup
The Novell SnapShot Backup feature alleviates problems encountered when backing
up open files. SnapShot creates a virtual image or snapshot of a volume at a
particular point in time. Once the snapshot is created, further changes are
stored as "deltas" from the snapshot so that delta data plus the active volume
can create a complete replica of data storage. Novell's innovative SnapShot
approach allows a pool snapshot to be created in 10 to 20 percent of the size of
the original pool and in a minimum amount of time. NetWare can support up to 500
active snapshots on a given NSS storage pool (500 different snapshots in time on
the disk).
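As a rough planning aid, the 10 to 20 percent figure quoted above can be turned into a space estimate for the pool that will hold the snapshot. This is a minimal sketch in Python; the function name `estimate_snapshot_space` is hypothetical, and the result is only a planning range, since the real amount depends on how much data changes while the snapshot exists.

```python
def estimate_snapshot_space(pool_size_gb):
    """Return a (low, high) estimate in GB for one pool snapshot.

    Applies the 10 to 20 percent range quoted in the text. Hypothetical
    helper for planning only; it does not query a server.
    """
    if pool_size_gb < 0:
        raise ValueError("pool size must not be negative")
    low = pool_size_gb * 10 / 100   # 10 percent of the original pool
    high = pool_size_gb * 20 / 100  # 20 percent of the original pool
    return (low, high)


if __name__ == "__main__":
    low, high = estimate_snapshot_space(500)
    print(f"Plan for roughly {low:.0f} to {high:.0f} GB of snapshot space")
```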
Snapshots provide an instant copy of a volume's data that otherwise would be
difficult to back up because of open files. Novell's "freeze/thaw" technology
provides a consistent data set that facilitates non-disruptive backups. In
contrast to a traditional, full-data copy of the pool, the block-level copy
takes only a moment to create and occurs transparently to the end user.
The Novell freeze/thaw interface manages snapshot events so that applications
(databases, GroupWise®, etc.) are informed that a snapshot
is about to take place. Applications then ready file system data by getting it
consistent and flushing pending transactions. The application indicates its
frozen or ready state to the NSS system where buffers are flushed and open disk
files are rendered consistent. The snapshot then takes place in less than two
seconds and the system indicates that the applications are free to "thaw" and
continue. No longer do administrators have to take down a database or mail
server to get a consistent backup. Freeze/thaw interfaces are published and
being consumed by Novell and its third party partners so that snapshot solutions
provide consistent data when done in a SAN or storage array as well as at a
host.
Snapshots can be used as part of a disaster recovery plan and archived for
online or near-line access. Applications or users can proceed to work from the
snapshot in the event that the original files are inaccessible.
Portlock Online Imaging
Portlock Online Imaging provides block based imaging for NetWare 6.5 SP8 or
later NSS Pools while they are active with mounted volumes. Portlock Online
Imaging depends upon Novell SnapShot Backup. An NSS Pool, which can be empty, is
used by Novell SnapShot Backup to store the deltas during an image command.
Portlock Storage Manager automatically freezes the selected pools, creates the
snapshots, images the pools, thaws the selected pools and finally deletes the
snapshots.
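The order of operations described above can be sketched in Python as follows. This is an illustration only: `online_image`, `RecordingActions` and every method name are hypothetical and are not Portlock or Novell APIs. The `try/finally` cleanup is an assumption of this sketch, not documented Portlock behavior, and it cannot run if the server itself abends, which is why a leftover snapshot may need to be deleted by hand.

```python
class RecordingActions:
    """Stand-in that records each step instead of touching a server."""

    def __init__(self, fail_on_image=False):
        self.steps = []
        self.fail_on_image = fail_on_image

    def freeze(self, pools):
        self.steps.append("freeze")

    def create_snapshots(self, pools):
        self.steps.append("create_snapshots")
        return [f"{pool}_SNAP" for pool in pools]  # hypothetical naming

    def image(self, snapshots):
        self.steps.append("image")
        if self.fail_on_image:
            raise RuntimeError("simulated imaging failure")

    def thaw(self, pools):
        self.steps.append("thaw")

    def delete_snapshots(self, snapshots):
        self.steps.append("delete_snapshots")


def online_image(pools, actions):
    """Freeze, snapshot, image, thaw, then delete the snapshots."""
    actions.freeze(pools)
    snapshots = []
    try:
        snapshots = actions.create_snapshots(pools)
        actions.image(snapshots)
    finally:
        # Always thaw and clean up, even if imaging raised an error.
        actions.thaw(pools)
        if snapshots:
            actions.delete_snapshots(snapshots)


if __name__ == "__main__":
    recorder = RecordingActions()
    online_image(["SYS"], recorder)
    print(recorder.steps)
```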
Common reasons that Online Imaging fails:
- The Pool is corrupted.
- The Pool holding the snapshot is corrupted.
- The Pool has too many deleted files. Consider purging the volumes within the pool.
- The Pool does not have sufficient free space. Clean up the volumes within the pool.
- There is too much I/O activity on the Pool. NetWare must duplicate data
that is modified on the pool during an online image. Consider moving some
application datasets to other volumes to balance pool I/O.
- There is not enough free space on the pool holding the snapshot.
- There are hardware problems causing I/O errors on either the Pool or the
pool holding the snapshot.
NSS Pools must be in a stable and warning free state for NetWare's Snapshot feature to work correctly.
- Check your Pool:
  - Run "Check Pool" from Portlock Storage Manager.
  - This command is located under the "Pool Commands" menu.
  - Add the command line option "-logfile=filename", without the quotes,
    when starting Portlock Storage Manager.
  - Specify the full path to the log file (example: -logfile=C:/STORMGR.LOG).
  - You can specify a floppy so that nothing on the server is modified (-logfile=A:/STORMGR.LOG).
  - Do not specify a volume located on a pool that you are "checking" as
    the log file would be closed before the full results of the Pool Check
    could be written.
- Verify your Pool:
  - Execute Novell's "nss /poolverify" command from the console. Select the
    pool to be verified. See the notes below about log files.
- Rebuild your Pool:
  - If there are any warnings or errors from the above commands, run "nss /poolrebuild"
    and then repeat the check and the verify.
If you are seeing pool errors during an Online Image:
- Reboot the server so that everything is in a stable state.
- Consider purging the volumes in the pool. We have seen a number of issues
(NetWare bugs) when there are a lot of "unpurged" files.
- Manually create a snapshot and verify both the original pool and the
snapshot pool:
  - Assuming that your problem pool is called "SYS" and you have another pool
    called "TEST" to store the snapshot, execute the following commands from the
    NetWare console:
  - mm snap list - This will display any snapshots on the server. This
    should be an empty list.
  - mm snap create sys test sys_snap - This creates a new snapshot called "SYS_SNAP"
    and stores the temporary pool data on pool "TEST". The original pool is called
    "SYS". Change the names according to your setup.
  - mm snap list - Verify that your snapshot was created successfully.
  - mm snap activate sys_snap - This activates the snapshot pool called "SYS_SNAP".
  - nss /PoolVerify=SYS - This will verify the active pool "SYS".
  - nss /PoolVerify=SYS_SNAP - This will verify the snapshot of pool "SYS".
  - When Portlock Storage Manager is performing an "Online Image" of a pool,
    SYS_SNAP is the pool being imaged.
Make sure that you don't have pool corruption in the pool that is holding
the snapshot. In the above example, complete an "nss /PoolVerify" on the "TEST"
pool.
If you see any warnings or errors with the above (and they are not
corrected by a Pool Rebuild), report this to Novell as you have a setup that
does not support snapshots correctly or there is a bug in Novell's code
supporting snapshots.
To see the help screen for NetWare's "mm" commands type "help mm" at the
console.
To delete the above snapshot: "mm snap delete sys_snap".
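The manual test above can be written down as a reusable command list for your own pool names. This is a minimal sketch in Python that only formats the console commands shown in this section; the function name `manual_snapshot_check_commands` is hypothetical and nothing is sent to a server.

```python
def manual_snapshot_check_commands(pool, store_pool, snap_name):
    """Return the console commands for a manual snapshot test, in order.

    Hypothetical helper: the argument order of "mm snap create" follows
    the example in the text (source pool, storing pool, snapshot name).
    """
    return [
        "mm snap list",                                   # should be empty
        f"mm snap create {pool} {store_pool} {snap_name}",
        "mm snap list",                                   # confirm creation
        f"mm snap activate {snap_name}",
        f"nss /PoolVerify={pool}",                        # original pool
        f"nss /PoolVerify={snap_name}",                   # snapshot pool
        f"nss /PoolVerify={store_pool}",                  # pool holding the snapshot
        f"mm snap delete {snap_name}",                    # clean up afterwards
    ]


if __name__ == "__main__":
    for command in manual_snapshot_check_commands("SYS", "TEST", "SYS_SNAP"):
        print(command)
```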
Portlock Storage Manager Log files
- Portlock Storage Manager supports creating a log file for details of
various warnings and errors. This is very important for commands such as Pool
Check. Add the command line option "-logfile=filename", without the quotes.
Specify the full path to the log file (example: -logfile=C:/STORMGR.LOG). You
can specify a floppy so that nothing on the server is modified (-logfile=A:/STORMGR.LOG).
Do not specify a volume located on a pool that you are "checking" as the log
file would be closed before the full results of the Pool Check could be
written.
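The rule about log file placement can be checked before starting Portlock Storage Manager. The following Python sketch builds the "-logfile=" option and rejects a path whose volume is on the pool being checked. The function names are hypothetical, and the list of volumes on the checked pool is something you must supply yourself; the sketch does not discover it.

```python
def logfile_volume(logfile_path):
    """Return the drive or volume part of a path such as C:/STORMGR.LOG."""
    volume, separator, _rest = logfile_path.partition(":")
    if not separator or not volume:
        raise ValueError("expected a full path such as C:/STORMGR.LOG")
    return volume.upper()


def logfile_option(logfile_path, volumes_on_checked_pool):
    """Build the -logfile option, refusing volumes on the checked pool.

    Hypothetical helper: volumes_on_checked_pool is supplied by the caller.
    """
    volume = logfile_volume(logfile_path)
    checked = {name.upper() for name in volumes_on_checked_pool}
    if volume in checked:
        raise ValueError(
            f"{volume} is on the pool being checked; use another location"
        )
    return f"-logfile={logfile_path}"


if __name__ == "__main__":
    print(logfile_option("C:/STORMGR.LOG", ["SYS", "DATA"]))
```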
NSS /PoolVerify Log files
- NetWare stores the results of a Pool Verify in a log file that starts with
the pool name and ends with the suffix VLF (Verify Log File). Each time you
run a Pool Verify the results are appended to this log file. If you are
sending this log file to Portlock, delete the log files first so that the log
file only contains information about this issue being analyzed. We will not
review log files with the results of multiple Pool Verifies stored within
them.
NSS /PoolRebuild Log files
- NetWare stores the results of a Pool Rebuild in a log file that starts
with the pool name and ends with the suffix RLF (Rebuild Log File). Each time
you run a Pool Rebuild the results are appended to this log file. If you are
sending this log file to Portlock, delete the log files first so that the log
file only contains information about this issue being analyzed. We will not
review log files with the results of multiple Pool Rebuilds stored within
them.
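When collecting files to send, the naming rule described above (pool name at the start, VLF or RLF at the end) can be used to sort log files by type. This is a minimal Python sketch; the function name `classify_nss_log` is hypothetical, and the sample file names in the usage lines, including the dot before the suffix, are assumptions for illustration.

```python
def classify_nss_log(filename, pool_name):
    """Classify a log file as "verify", "rebuild", or None.

    Uses only the rule from the text: the name starts with the pool name
    and ends with VLF or RLF. The prefix match is loose, so pool "SYS"
    also matches logs of a pool named "SYS_SNAP".
    """
    name = filename.upper()
    if not name.startswith(pool_name.upper()):
        return None
    if name.endswith("VLF"):
        return "verify"   # Verify Log File
    if name.endswith("RLF"):
        return "rebuild"  # Rebuild Log File
    return None


if __name__ == "__main__":
    for sample in ["SYS.VLF", "sys.rlf", "TEST.VLF"]:  # assumed names
        print(sample, classify_nss_log(sample, "SYS"))
```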
Reporting issues to Portlock:
- Send the NSS /PoolVerify log file. If reporting an Online Image issue,
perform a Pool Verify on both the original pool and the snapshot pool. See
above for manually setting up a snapshot.
- Send the NSS /PoolRebuild log file.
- Send the Portlock Storage Manager log file from the Pool Check command.