[Switch][Juniper] Crash System stuck on Boot-up

http://kb.juniper.net/InfoCenter/index?page=content&id=KB20464

[Upgrade EX switch] Stage 4 – Troubleshoot upgrade failure/crash – System stuck on Boot-up



SUMMARY:

This article documents Stage 4 – Troubleshoot upgrade failure/crash. The EX device may get stuck in the boot process and fail to load Junos. This article describes the recovery methods.

To go directly to other stages of the Junos upgrade on an EX switch, consult the Resolution Guide -- Upgrading EX Series.


PROBLEM OR GOAL:

Symptoms:

  • EX device is stuck loading Junos after upgrade
  • How do I recover from a failed upgrade?
  • 'Loading Junos' is reported in the LCD panel of the EX device

EX Series devices may get stuck in the boot process or fail to boot the OS. In rare cases, following a sudden power loss or ungraceful shutdown, an EX Series switch can experience file system corruption that prevents it from recovering to a functional state. Customers are advised to minimize their log configurations to prevent excessive reads and writes to the file system, which reduces stress on the storage media and therefore the likelihood of this issue. In addition, where power failures are brief and transient, a UPS can keep the EX Series switch from experiencing a sudden power loss.

We do not have to worry about damaging hardware in these situations since the hardware cannot tell the difference between a graceful shutdown and pulling the power cord. The potential for damage is with the file system structure. It is possible for data to be corrupted when the computer's power is interrupted with the operating system running. That data could be in the inodes, which could result in files being lost or file contents being corrupted.

This issue has been observed on the EX2200, EX3200, EX4200 and EX4500 lines of switches. Although rare, the issue is more likely on platforms that use a UNIX/BSD-based operating system, such as Junos, to access flash-based storage media. This issue has been noted in O'Reilly Media’s JUNOS Enterprise Switching book: “Although rare, file system damage can occur with an abrupt power off, which may cause problems on the next boot. Use the request system halt or request system reboot command to gracefully shut down or reboot the OS. Once the OS is halted, it is safe to remove power.”


CAUSE:

SOLUTION:

This article applies to EX devices running Junos 10.4R2 and below. For devices running 10.4R3 and later, refer to PSN-2011-03-201 - Feature Release “Resilient Dual-Root Partitions” for EX Series Switches - (Junos OS Release 10.4R3 and later).
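
As a rough illustration of the version cutoff above, the following Python sketch checks whether a Junos version string falls in the range this article covers. The function name `article_applies` and the assumed version-string shape (`<major>.<minor><type><build>`, for example `10.4R2.6`) are illustrative assumptions and not part of any Juniper tool. Non-R releases on 10.4 are reported as undetermined, because the cutoff is only stated in terms of R releases.

```python
import re
from typing import Optional

# Assumed shape: <major>.<minor><type><build>[...], e.g. "10.4R2.6" or "10.0S10.1".
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)([A-Z])(\d+)")


def article_applies(version: str) -> Optional[bool]:
    """Return True for 10.4R2 and earlier, False for 10.4R3 and later.

    Returns None when the string cannot be parsed or when it is a
    non-R release on 10.4, where the cutoff is not stated.
    """
    match = _VERSION_RE.match(version.strip())
    if match is None:
        return None
    major, minor = int(match.group(1)), int(match.group(2))
    release_type, build = match.group(3), int(match.group(4))

    if (major, minor) < (10, 4):
        return True   # older release train: this article applies
    if (major, minor) > (10, 4):
        return False  # newer release train: see PSN-2011-03-201
    if release_type != "R":
        return None   # e.g. a 10.4 service release: undetermined here
    return build <= 2
```

For example, `article_applies("10.0S10.1")` returns `True` and `article_applies("10.4R3.4")` returns `False`.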

Perform the following steps to recover from an upgrade failure on your EX switch:

    Notes:
  • If your EX device is a Virtual Chassis, then isolate the affected member and proceed as a stand-alone switch
  • If your EX device is a Dual Routing Engine - 8200 Series, then isolate the affected RE and proceed as a stand-alone switch
  • If 'Loading Junos' is reported on the LCD panel of the EX switch, proceed with Step 1.

Step 1. With a console connected to a stand-alone EX switch, at which prompt or error is the bootup process stuck? Find the matching prompt below and follow its instructions.



Loader Prompt (Loader >)
  1. To recover the EX switch from the loader> prompt, reinstall the Junos software using one of the following methods:
  2. If a Root Cause Analysis is required after recovering, perform the steps in KB20569 - Collecting Logs from Juniper EX series Devices.

Debug Prompt (db>)
Proceed to KB20635 - While booting up, switch stuck in db> mode

UBoot Prompt (=>)
  1. Type reset from the UBoot prompt, which will reboot the EX switch.
  2. Break the bootup sequence to get to the loader> prompt:
     When "Loading /boot/defaults/loader.conf" is displayed, followed by
     "Hit [Enter] to boot immediately, or space bar for command prompt.",
     hit the space bar to enter the manual loader. The loader> prompt displays.
     (NOTE: There is only a 1 second delay in which to hit the space bar.)
     (TIP: You can hit the space bar as soon as you see the "Loading /boot/defaults/loader.conf" message.)
  3. Perform the steps under Loader Prompt above.

Can't load kernel error
If the console bootup process does not return to a prompt and stops at the following errors, power cycle the EX device:
can't load '/kernel'
can't load '/kernel.old'
If the switch stops at the same errors after the reboot and does not progress to another prompt or process, proceed to Step 2 to do a Format Install.

'Loading Junos' reported on LCD Panel

If 'Loading Junos' is constantly displayed on the LCD screen of the EX switch, connect a console to the EX switch.  Most likely one of the prompts above will be displayed.  If so, follow the instructions above.  

Sometimes when 'Loading Junos' is reported, the Virtual Chassis has to be reactivated with the request virtual-chassis reactivate command as follows:
--- JUNOS 10.0S10.1 built 2010-11-08 21:23:50 UTC
root@RE0:LC:1% cli
{linecard:1}
root@RE0:LC:1> request virtual-chassis reactivate

Note: The above command is applicable only if the switch is in Linecard mode; otherwise, proceed with the next step.
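
The Step 1 decision can be sketched in Python as a small classifier over captured console output. This is a minimal sketch under stated assumptions: the function name `classify_console`, the state labels, and the idea of inspecting only the last few non-empty lines are all hypothetical choices for illustration, and the prompt strings are taken from the headings in this article rather than from an exhaustive list of what a switch may print.

```python
from typing import Optional

# Recovery path for each detected state, as described in Step 1.
RECOVERY_PATHS = {
    "loader": "Reinstall the Junos software from the loader> prompt.",
    "debug": "Proceed to KB20635.",
    "uboot": "Type reset, break the bootup sequence to reach loader>, then reinstall.",
    "kernel_error": "Power cycle; if the same errors persist, proceed to Step 2.",
}


def classify_console(output: str) -> Optional[str]:
    """Map captured console output to one of the Step 1 states.

    Returns None when no known prompt or error is recognized.
    """
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if not lines:
        return None

    # Normalize the final line so "Loader >" and "loader>" compare equal.
    last = lines[-1].replace(" ", "").lower()
    if last.startswith("loader>"):
        return "loader"
    if last.startswith("db>"):
        return "debug"
    if last.startswith("=>"):
        return "uboot"

    # No prompt: look for the kernel load errors near the end of the output.
    if any("can't load '/kernel" in line for line in lines[-5:]):
        return "kernel_error"
    return None


if __name__ == "__main__":
    state = classify_console("can't load '/kernel'\ncan't load '/kernel.old'\n")
    print(state, "->", RECOVERY_PATHS.get(state, "No match; inspect the console manually."))
```

A lookup table keeps the recovery text separate from the detection logic, so additional prompts can be added without changing the classifier's control flow.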

Step 2. If the above methods do not work, a 'Format Install' is the last recovery option.

WARNING: A Format Install formats the entire file system and storage unit, so the EX switch loses all configuration and logs. A Root Cause Analysis or recovery of any information from the EX switch will no longer be possible. Do NOT perform this install if you need a Root Cause Analysis; instead, contact your technical support representative.

To do the Format Install, proceed to KB20643 - Rewrite the entire file system by issuing "install --format" command from "Loader" mode.

PURPOSE:
Installation
Troubleshooting
