In Linux, is there any difference between the condition after ip link set ... down and real link absence (e.g. the switch’s port burned down, or someone tripped over a wire)?
By difference I mean signs in the system that can be used to distinguish these two conditions.
E.g. will the routing table be identical in these two cases? Will ethtool or something else show the same things? Is there a tool or utility that can distinguish these conditions?
How to solve:
There is a difference between an interface that is administratively up but disconnected and one that is administratively down.
A disconnected interface gets a carrier-down status. How well this is handled might depend on the interface’s driver and on the kernel version. Normally the status is available with ip link show. For example, with a virtual Ethernet (veth) interface:
    # ip link add name vetha up type veth peer name vethb
    # ip link show type veth
    2: vethb@vetha: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
        link/ether 02:a0:3b:9a:ad:4d brd ff:ff:ff:ff:ff:ff
    3: vetha@vethb: <NO-CARRIER,BROADCAST,MULTICAST,UP,M-DOWN> mtu 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000
        link/ether 36:e3:62:1b:a8:1f brd ff:ff:ff:ff:ff:ff
vetha, which is itself administratively UP, displays the NO-CARRIER flag and the equivalent operstate LOWERLAYERDOWN: it’s disconnected.
/sys/ entries exist too:
    # cat /sys/class/net/vetha/carrier /sys/class/net/vetha/operstate
    0
    lowerlayerdown
In usual settings, for an interface which is administratively up the carrier and operstate match (NO-CARRIER <=> LOWERLAYERDOWN or LOWER_UP <=> UP). One exception would be for example when using IEEE 802.1X authentication (advanced details of operstate are described in this kernel documentation: Operational States, but it’s not needed for this explanation).
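The sysfs entries can also be read from a script. The following is only a minimal sketch, not an established tool: the function name link_state and its optional second argument (a base directory, so the logic can be tried against fake files) are made up for illustration. It assumes that /sys/class/net/<name>/flags holds the interface flags in hexadecimal with bit 0x1 meaning administratively up (IFF_UP), and that carrier is only readable while the interface is up.

```shell
# Hypothetical helper: classify one interface from its sysfs entries.
# $1 = interface name, $2 = sysfs base directory (default: /sys/class/net).
link_state() {
    ls_dir="${2:-/sys/class/net}/$1"
    [ -d "$ls_dir" ] || { echo "absent"; return 1; }

    # "flags" is a hex value such as 0x1003; bit 0x1 is IFF_UP (admin up).
    ls_flags=$(cat "$ls_dir/flags")
    if [ $(( $ls_flags & 1 )) -eq 0 ]; then
        echo "admin-down"
        return 0
    fi

    # The interface is administratively up, so "carrier" should be readable:
    # 1 means a link is detected, anything else is reported as no carrier.
    ls_carrier=$(cat "$ls_dir/carrier" 2>/dev/null)
    if [ "$ls_carrier" = "1" ]; then
        echo "carrier"
    else
        echo "no-carrier"
    fi
}
```

Calling link_state vetha in the veth example above should print no-carrier, and link_state vethb should print admin-down.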
ethtool queries a lower level API to retrieve this same carrier status.
Having no carrier doesn’t prevent any layer 3 settings from staying in effect. The kernel doesn’t change addresses or routes when this happens. It’s just that, in the end, a packet that should be emitted won’t be emitted by the interface, and of course no reply will come either. So, for example, trying to connect to another IPv4 address will sooner or later trigger a new ARP request, which will fail, and the application will receive a "No route to host" error. Established TCP connections will just bide their time and stay established.
Above, vethb has operstate DOWN and doesn’t display any carrier status, since it has to be up to detect this. A physical Ethernet interface of course behaves the same.
When the interface is brought down (ip link set ... down), the carrier can’t be detected anymore, since the underlying hardware device was very possibly powered off, and the operstate becomes "down".
ethtool will also just report that there’s no link, so it can’t be used reliably to tell the two cases apart (it will probably display a few unknown entries too, but it’s unclear whether that is a reliable indicator).
This time, layer 3 network settings are affected. The kernel will refuse to add routes using this interface and will remove any existing routes related to it:
- the automatic (proto kernel) LAN routes added when adding an address
- any other route added (e.g. the default route) in any routing table (not only the main routing table) that depends directly on the interface (scope link) or on other previously deleted routes (probably then scope global). As these won’t reappear when the interface is brought back up (ip link set ... up), they are lost until a userspace tool adds them back.
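To see which routes did not come back after a down/up cycle, one option is to compare two route listings. This is only a sketch: the function name lost_routes and the file names in the comments are hypothetical, and enp3s0 is just an example interface name. It assumes a grep where an empty pattern file matches nothing (as GNU grep does), so that an empty "after" listing reports every route as lost.

```shell
# Hypothetical helper: print the routes present in a "before" listing
# but missing from an "after" listing.
# $1 = file with routes captured before, $2 = file with routes captured after.
lost_routes() {
    # -F: fixed strings, -x: whole-line match, -v: invert the match.
    # Keeps the lines of $1 that do not appear verbatim in $2.
    # "|| true" hides grep's non-zero exit status when nothing was lost.
    grep -Fxv -f "$2" "$1" || true
}

# Intended use (needs root):
#   ip route show table all dev enp3s0 > before.txt
#   ip link set enp3s0 down
#   ip link set enp3s0 up
#   ip route show table all dev enp3s0 > after.txt
#   lost_routes before.txt after.txt
```

Comparing text listings keeps the sketch simple; it only reports what was lost and leaves adding the routes back to whatever tool manages the configuration.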
When using recent tools like NetworkManager, one can get confused and think a disconnect is similar to an interface being brought down. That’s because NM monitors links and acts when such events happen. To get an idea of these events, the ip monitor tool can be used, including from scripts, but it currently doesn’t have a stable, parsable output (no JSON output available), so its use there is limited.
So when a wire is disconnected, NM will very likely consider that the current configuration is no longer in use, unless a specific setting prevents it: it will then delete the addresses and routes itself. When the wire is connected back, NM will apply its configuration again, adding back addresses and routes (using DHCP if relevant). This looks the same but isn’t: all this time the interface stayed up, or NM could not even have been notified when the connection came back.
It’s easy to distinguish the two cases: ip link show will display LOWERLAYERDOWN for a disconnected interface, and DOWN for an interface administratively brought down. Some interfaces, typically physical NICs, report state DOWN for a missing carrier too; the flags then make the difference, with NO-CARRIER shown next to UP only when the interface is administratively up.
- setting an interface administratively down (and up) can lose routes
- losing carrier and recovering it doesn’t disrupt network settings. If the delay is short enough, it should not even disrupt ongoing network connections
- but applications managing the network might react and change network settings, sometimes with a result similar to the administratively down case
- you can use commands like ip monitor link to receive events about interfaces set administratively down/up or about carrier changes, or ip monitor to receive all the related events (including address or route changes) that would happen at this time or shortly after
- ip commands (but not ip monitor) have a JSON output available with ip -json ... to help scripts (along with jq)
Example (continuing from the first veth example):
vethb is still down:
    # ip -j link show dev vethb | jq '.[].operstate'
    "DOWN"
    # ip -j link show dev vetha | jq '.[].operstate'
    "LOWERLAYERDOWN"
Set vethb up, which gives both ends a carrier:
    # ip link set vethb up
    # ip -j link show dev vetha | jq '.[].operstate'
    "UP"
This shows the 3 usual states: administratively down, lowerlayerdown (i.e. up but disconnected) or up (i.e. operational).
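Where jq is not available, the same three-way distinction can be approximated from the flag list alone. This is a rough sketch under stated assumptions: classify_links is a made-up name, the input is expected to be in the one-line-per-interface format produced by ip -o link show, and the verdict relies only on the UP and LOWER_UP flags, ignoring finer operstates such as those involved with IEEE 802.1X.

```shell
# Hypothetical helper: read `ip -o link show` lines on stdin and print
# "<name> <verdict>" for each, based only on the <...> flag list.
classify_links() {
    while IFS= read -r cl_line; do
        # Interface name: text after "N: ", up to the next colon, minus @peer.
        cl_name=${cl_line#*: }
        cl_name=${cl_name%%:*}
        cl_name=${cl_name%%@*}
        # Flag list: text between "<" and ">".
        cl_flags=${cl_line#*<}
        cl_flags=${cl_flags%%>*}
        # Surrounding commas make sure UP is matched as a whole flag,
        # not as the tail of LOWER_UP.
        case ",$cl_flags," in
            *,UP,*)
                case ",$cl_flags," in
                    *,LOWER_UP,*) cl_verdict="up" ;;
                    *)            cl_verdict="no-carrier" ;;
                esac ;;
            *)  cl_verdict="admin-down" ;;
        esac
        printf '%s %s\n' "$cl_name" "$cl_verdict"
    done
}

# Intended use:
#   ip -o link show | classify_links
```

Relying on the flags rather than on the state word sidesteps the interfaces that report state DOWN both when administratively down and when merely disconnected.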
Here’s what I see in dmesg after running ip link set enp3s0 down:
    r8169 0000:03:00.0 enp3s0: Link is Down

    ethtool enp3s0
    Settings for enp3s0:
    Cannot get device settings: No such device
        Supports Wake-on: pumbg
        Wake-on: d
        Link detected: no
In the other case ("the switch’s port burned down, or someone tripped over a wire"), the interface must still be there, as well as static IP addresses and routes. ethtool will detect your NIC and show Link detected: no, but the output will be complete, unlike when the device is removed via ip l set interface down.