My home internet provider offers native IPv6 connectivity by default, which I’ve been using for many years. It was never a perfect experience, and the more software adopted IPv6, the more issues I ran into. I’ve developed habits like repeatedly refreshing websites, toggling the WiFi on’n’off, or aborting SSH just to retry with the -4 flag. Less patient users have already disabled IPv6 on their devices. It was time to either stabilize my IPv6 setup or disable it completely.

Protocol selection algorithms

Before I can do anything about it, I need to understand what makes IPv6-enabled software behave erratically.

Happy Eyeballs

To decide which IP version to use, most browsers implement the Happy Eyeballs algorithm1. It basically means that software which supports both IPv4 and IPv6 will attempt to connect via both of them in parallel. Whichever protocol connects first will be used moving forward. There’s a short delay before the IPv4 connection attempt, which gives IPv6 a head start. In theory, this should result in selecting IPv6 whenever available, with a very quick fallback to IPv4 when it’s not.

In practice, Happy Eyeballs can cause non-deterministic behavior. To avoid long wait times for IPv4-only users, the head start of IPv6 is very short.2 When there’s a possibility of packet loss or jitter (typical with anything wireless), the connection race can select between IPv4 and IPv6 semi-randomly. There are no additional checks besides which connection is established first. When there are issues with only one of the protocols, we may or may not experience them.
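
The race can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the function name `race` and its parameters are made up for illustration, each attempt either completes after a fixed latency or never completes, and the IPv4 attempt always starts exactly `attempt_delay` seconds after the IPv6 one.

```python
from typing import Optional


def race(v6_latency: Optional[float], v4_latency: Optional[float],
         attempt_delay: float = 0.25) -> Optional[str]:
    """Return the protocol family whose handshake completes first.

    Latencies are in seconds, measured from the start of each attempt.
    None means the attempt never completes. IPv4 starts attempt_delay late.
    """
    finish = {}
    if v6_latency is not None:
        finish["IPv6"] = v6_latency
    if v4_latency is not None:
        # The IPv4 clock only starts once the head start has elapsed.
        finish["IPv4"] = attempt_delay + v4_latency
    if not finish:
        return None
    # On an exact tie the first inserted key (IPv6) wins.
    return min(finish, key=finish.get)


# A 10 ms head start: a little jitter on the IPv6 path flips the result.
print(race(0.020, 0.015, attempt_delay=0.010))
print(race(0.030, 0.015, attempt_delay=0.010))
```

With a head start of 10 ms, an IPv6 handshake of 20 ms beats an IPv4 handshake of 15 ms, while 10 ms of extra jitter on IPv6 hands the win to IPv4. For real connections, Python’s asyncio offers a `happy_eyeballs_delay` argument on `loop.create_connection`.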

Tip

Firefox performs Happy Eyeballs upon the first connection to each server, then sticks with the selected protocol family until CTRL+F5 is pressed3. This makes it possible to repeatedly refresh a page until it starts working. Many other programs implement similar sticky behavior.

Default Address Selection

Much software doesn’t implement its own protocol selection, but simply asks the operating system for IP addresses to try. When there are multiple possible addresses for a domain (such as an IPv4 and an IPv6 address), the implementation of Default Address Selection determines the order in which to try them.4 This ordering aims to put the most likely working address first, considering the machine’s current configuration. IPv6 addresses are put first when the machine has a suitable IPv6 address itself. The application will try the very first IP from the list, effectively choosing between IPv4 and IPv6. Unlike Happy Eyeballs, the address selection happens before any connection attempt. The order depends on the configured addresses, irrespective of their usability.
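
To see the order the operating system suggests, one can ask the resolver directly. The sketch below uses Python’s standard `socket.getaddrinfo`; the helper name `candidate_order` is made up for illustration, and how closely the result follows RFC 6724 depends on the platform’s resolver.

```python
import socket


def candidate_order(host, port=443):
    """Return 'IPv4 <addr>' / 'IPv6 <addr>' labels in the order the OS returns them."""
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return [f"{names.get(family, str(family))} {sockaddr[0]}"
            for family, _type, _proto, _canon, sockaddr in infos]


# A numeric address needs no DNS lookup, so this works offline.
print(candidate_order("127.0.0.1", 80))
```

Calling `candidate_order` with a dual-stack hostname on a machine that has an IPv6 address configured will typically list the IPv6 entries first, whether or not that address is actually usable.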

Root causes

When IPv6 is disabled, both Happy Eyeballs and Default Address Selection should end up selecting an IPv4 connection. When IPv6 is fully working, both algorithms should end up using IPv6. Erratic behaviors may be present when IPv6 is configured but somewhat broken. So what could constitute a somewhat broken IPv6?

The Dynamic IPv6 prefix

My internet provider requires a PPPoE tunnel to access the internet, over which they allocate IP addresses for me dynamically. This tunnel is broken approximately every 7 days, and new IP addresses are assigned upon each reconnection.

The IPv4 address change is not much of an issue. My network uses a private address space5 - like 192.168.x.y - from which the router translates to the single public IP. Active connections will be broken when the public IP changes, but reconnection attempts will use the new IP without any changes to the devices on my network.

IPv6 is a different story. The router receives not only a single address, but an entire IPv6 prefix. Router advertisement (RA) messages are used to configure an address with this prefix for each of the devices on the network. The benefit is that there’s no network address translation required. The downside is that when the prefix changes, the address must change on all of the devices. This is troublesome.

IPv6 addresses are leased to an interface for a fixed (possibly infinite) length of time. […] To handle the expiration of address bindings gracefully an address goes through two distinct phases while assigned to an interface. Initially, an address is “preferred”, meaning that its use in arbitrary communication is unrestricted. Later, an address becomes “deprecated” […]. While an address is in a deprecated state, its use is discouraged, but not strictly forbidden. New communication (e.g., the opening of a new TCP connection) should use a preferred address […] RFC 4862

To summarize the relevant parts of SLAAC6: an IPv6 address is supposed to go through a lifecycle, spending some time in each state. To keep address conflicts from wreaking havoc, all new addresses are tentative first. Once their uniqueness is verified, they become preferred and can be used normally. Nearing their end of life, they transition to deprecated to make room for a replacement address. When an address becomes deprecated, there should be a different preferred address available. An address is considered invalid once its valid lifetime has expired and it must no longer be used.

---
title: Healthy IPv6 Address Lifecycle
---
flowchart LR
    a>2001:db8:a::1]
    b>2001:db8:b::1]
    a ~~~ a1[/tentative/] --> a2[[preferred]] --> a3[[deprecated]] --> a4[/invalid/]
    b ~~~ b1[/tentative/] --> b2[[preferred]] --> b3[[deprecated]] --> b4[/invalid/]
    subgraph s[ ]
        a
        b
    end
    style s stroke:none,fill:transparent;
    subgraph s2[ ]
        a2
        b1
    end
    style s2 stroke:none,fill:transparent;
    subgraph s3[graceful change]
        a3
        b2
    end
    subgraph s4[ ]
        a4
        b3
    end
    style s4 stroke:none,fill:transparent;
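
The lifecycle above can be written down as a small state function. This is a sketch only: the name `address_state` and the `dad_complete` flag are illustrative assumptions, and real implementations track these timers per address inside the kernel.

```python
def address_state(age, preferred_lifetime, valid_lifetime, dad_complete=True):
    """Classify an address by its age in seconds relative to its two lifetimes."""
    if age >= valid_lifetime:
        return "invalid"
    if not dad_complete:
        # Duplicate address detection has not finished yet.
        return "tentative"
    if age < preferred_lifetime:
        return "preferred"
    return "deprecated"


# Example lifetimes: preferred for one hour, valid for two hours.
for age in (10, 4000, 7200):
    print(age, address_state(age, preferred_lifetime=3600, valid_lifetime=7200))
```

A graceful renumbering needs the replacement address to be preferred during the window in which the old one reports deprecated.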

These SLAAC assumptions are fundamentally violated by binding the life of the IPv6 prefix to the life of the PPPoE tunnel. The next prefix is not known until after the current prefix is already unusable. This makes all IPv6 addresses invalid before a replacement address can even become tentative.

---
title: Lifecycle with PPPoE-bound prefix
---
flowchart LR
    a>2001:db8:a::1]
    b>2001:db8:b::1]
    a ~~~ a1[/tentative/] --> a2[[preferred]] --> a3[[deprecated]] --> a4[/invalid/]
    b ~~~ b1[/tentative/] --> b2[[preferred]] --> b3[[deprecated]] --> b4[/invalid/]
    subgraph s[ ]
        a
        b
    end
    style s stroke:none,fill:transparent;
    subgraph s4[outage during change]
        a4
        b1
    end
    r((reconnection)) -.-> a4 & b1
    style r stroke-dasharray: 5,5;

My provider also does a particularly bad job at IPv6 prefix delegation. Their DHCPv6-PD validity always states exactly 7 days, irrespective of the actual remaining time. Given that the PPPoE reconnection happens approximately every 7 days but is not timed precisely, the prefix lifetime is always incorrect. If the router happens to re-request the prefix without a PPPoE reconnection, the validity could be off by days…

An unexpected PPPoE reconnection will cause even further issues. The correlated prefix change makes all IPv6 addresses unusable immediately.7 It is theoretically possible to send an RA message which immediately deprecates IPv6 addresses, but it is not possible to invalidate them immediately8. Because invalidation is not possible, an unexpected prefix change will always result in squatting of addresses. In the unlikely event of the old prefix being reused at a network we are in contact with, this will cause additional issues.
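
The reason invalidation cannot be immediate is the two-hour rule of RFC 4862 Section 5.5.3 e). The following is a minimal sketch of that rule as I read it; the function name and the `authenticated` parameter are illustrative, not taken from any real network stack.

```python
TWO_HOURS = 2 * 60 * 60  # seconds


def updated_valid_lifetime(remaining, advertised, authenticated=False):
    """Return the valid lifetime a host keeps after receiving a Prefix Information option."""
    if advertised > TWO_HOURS or advertised > remaining:
        # Extending the lifetime (or anything above two hours) is always accepted.
        return advertised
    if remaining <= TWO_HOURS:
        # Shortening is ignored unless the advertisement was authenticated.
        return advertised if authenticated else remaining
    # Otherwise the lifetime can be cut down to two hours, but no further.
    return TWO_HOURS


# A router advertising a valid lifetime of 0 for a prefix with 6 days left:
print(updated_valid_lifetime(remaining=6 * 86400, advertised=0))
```

Even when the router announces a valid lifetime of zero, an unauthenticated advertisement leaves the stale address valid for up to two more hours.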

To have a fully reliable IPv6 network with dynamic prefixes, a window of overlap would be required during which both prefixes are functional. I’ve not seen any provider implementing this, therefore I have a strong preference for static prefixes.

IPv6 is slow by default

Stale addresses could easily confuse the Default Address Selection algorithm, but should be detected by Happy Eyeballs. Since stale IPv6 addresses can’t finish a connection handshake, browsers should quickly fall back to IPv4. Unfortunately, this is not the only issue.

Even when the network prefix is stable, I still experience glitches with IPv6 that are not noticeable for IPv4. The IPv6 protocol defaults are just not designed for a SoHo network.

With IPv4, autoconfiguration and neighbor discovery are done over broadcast messages. They follow the logic that if you don’t know your recipient’s address, or they don’t have an address yet, just send the message to everyone on the network. One example is that when a new computer joins the network, it’ll send a broadcast message to request an IP address. This is fine with a few machines, but it gets noisy when your network scales to thousands of hosts.9

The IPv6 protocol attempts to avoid this noise, by extensive use of multicast and stateless algorithms. Instead of each computer asking for their own IP address separately, the router sends periodic router advertisement messages. These contain sufficient information about the network for all hosts to configure themselves silently.

When a new machine joins the network, it can simply wait silently for the next router advertisement, and configure itself with an address when it’s received. By default such advertisements are sent every 3 to 10 minutes, and the new node cannot communicate until the next one arrives. To speed things up, the machine can send a router solicitation, which causes the router to send a router advertisement sometime within the next 3 seconds.10
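
The numbers above follow from the RFC 4861 defaults. The sketch below is a back-of-the-envelope calculation, not router code; the helper names are made up, and the 3-second bound for the solicited case is taken from the text above.

```python
MAX_RTR_ADV_INTERVAL = 600.0  # seconds, RFC 4861 default for MaxRtrAdvInterval
SOLICITED_BOUND = 3.0         # seconds, the bound quoted in the text above


def default_min_interval(max_interval):
    """RFC 4861 default for MinRtrAdvInterval, derived from MaxRtrAdvInterval."""
    return 0.33 * max_interval if max_interval >= 9 else max_interval


def worst_case_wait(solicited, max_interval=MAX_RTR_ADV_INTERVAL):
    """Longest time (seconds) a new host may wait for its first router advertisement."""
    return SOLICITED_BOUND if solicited else max_interval


low = default_min_interval(MAX_RTR_ADV_INTERVAL) / 60
high = MAX_RTR_ADV_INTERVAL / 60
print(f"unsolicited advertisements every {low:.1f} to {high:.0f} minutes")
print(f"worst case without solicitation: {worst_case_wait(False):.0f} s")
print(f"worst case with solicitation: {worst_case_wait(True):.0f} s")
```

Lowering `MaxRtrAdvInterval` to a few seconds shrinks the unsolicited worst case accordingly.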

Even with the 3-second solicited option, getting an IPv6 address takes observably longer for a device than getting an IPv4 one. Whenever my phone or computer reconnects to the WiFi network, software experiences a short outage of IPv4 and a long outage of IPv6. This messes with protocol selection badly, especially since the Happy Eyeballs result is cached.

Note

I’ve managed university networks briefly where the broadcast traffic resulting from the inner workings of IPv4 could reach megabytes per second. This was large enough at the time to cause measurable CPU overhead, crash embedded devices, and reduce WiFi speeds to a crawl.11 There’s no great solution to fix this. The traditional approach is to segment the network into smaller chunks, and route between them instead.

I’m sympathetic to IPv6 attempting to solve the issue of very large networks. Unfortunately, this introduces great complexity without any immediate benefits. Since dual-stack support is generally a must, the maximum practical network size will still be limited by IPv4.

IPv6 fragments are different

When an IPv4 packet traversing the network is too large to go through a specific link, the router will fragment it into multiple smaller packets. Clients can send arbitrarily large packets, and rely on routers to split them if necessary.12

---
title: A basic IPv4 fragmentation flow
config:
    mirrorActors: false
---
sequenceDiagram
    autonumber
    actor A as 
    participant R as Router
    actor B as 
    note over A,R: MTU: 1500 bytes
    note over R,B: MTU: 1492 bytes
    A->>R: [1000 bytes]
    R->>B: [1000 bytes]
    loop
        A->>R: [1500 bytes]
        Note over R: Fragment packet
        R->>B: [1/2, 1492 bytes]
        R->>B: [2/2, 8 bytes]
    end
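
The diagram simplifies the sizes a little. As a rough sketch, assuming a 20-byte IPv4 header without options, the on-wire size of each fragment can be computed as follows (the function name is made up for illustration).

```python
IPV4_HEADER = 20  # bytes, assuming a header without options


def fragment_sizes(packet_size, mtu, header=IPV4_HEADER):
    """Return the total on-wire size of each fragment of one IPv4 packet."""
    if packet_size <= mtu:
        return [packet_size]
    # Every fragment except the last carries a payload that is a multiple of 8 bytes.
    chunk = (mtu - header) // 8 * 8
    if chunk <= 0:
        raise ValueError("MTU too small to carry any payload")
    payload = packet_size - header
    sizes = []
    while payload > chunk:
        sizes.append(header + chunk)
        payload -= chunk
    sizes.append(header + payload)
    return sizes


sizes = fragment_sizes(1500, 1492)
print(sizes, "total:", sum(sizes))
```

The 8 leftover payload bytes travel in a 28-byte packet, because each fragment carries its own copy of the header. The router forwards 1520 bytes for a 1500-byte packet.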

Fragmentation has a significant performance impact on routers. It doubles the overhead of packet headers, and takes up considerable router resources. Multiple techniques have been developed to mitigate the need for fragmentation, the most common being TCP MSS clamping.13 The IPv6 standard has opted to avoid fragmentation on routers entirely, and to rely on informing the clients of the maximum possible packet size.14

---
title: A basic IPv6 fragmentation flow
config:
    mirrorActors: false
---
sequenceDiagram
    autonumber
    actor A as 
    participant R as Router
    actor B as 
    note over A,R: MTU: 1500 bytes
    note over R,B: MTU: 1492 bytes
    A->>R: [1000 bytes]
    R->>B: [1000 bytes]
    A->>R: [1500 bytes]
    R->>A: ICMPv6: Packet too big!<br/>max 1492 bytes
    note over A: Fragment packet
    loop
        A->>R: [1/2, 1492 bytes]
        R->>B: [1/2, 1492 bytes]
        A->>R: [2/2, 8 bytes]
        R->>B: [2/2, 8 bytes]
    end

This is something that is great in theory. In practice, it takes a non-negligible amount of time until the client learns the correct packet size to use. It has to receive the packet too big notification first, after which it also needs to re-transmit the data inside the lost packet. The time and bandwidth of these steps are not insignificant for new connections.
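
The cost of learning the packet size can be illustrated with a toy model of the discovery loop. This is a sketch under simplifying assumptions: `discover_path_mtu` is a made-up name, the path is just a list of link MTUs in hop order, and every packet too big message is delivered reliably.

```python
def discover_path_mtu(link_mtus, first_guess):
    """Return (path_mtu, lost_packets) for a sender probing a chain of links."""
    size = first_guess
    lost = 0
    while True:
        # Find the first link along the path that cannot carry the packet.
        bottleneck = next((mtu for mtu in link_mtus if mtu < size), None)
        if bottleneck is None:
            return size, lost
        # The router in front of that link drops the packet and reports the link MTU.
        lost += 1
        size = bottleneck


print(discover_path_mtu([1500, 1492], first_guess=1500))
print(discover_path_mtu([1500, 1492, 1420], first_guess=1500))
```

Each progressively narrower link along the path costs one lost packet plus the round trip needed to hear about it, all before the first full-size packet gets through.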

Note
The client machine is in the best position to deal with fragmentation, because it can often avoid it entirely.

In the most common case, packets are already sequential pieces from a larger data stream. Instead of utilizing IPv6 fragmentation, clients can choose to send smaller pieces at a higher level protocol, mitigating the fragmentation overhead entirely.

---
config:
    mirrorActors: false
---
sequenceDiagram
    autonumber 4
    actor A as 
    participant R as Router
    actor B as 
    R->>A: ICMPv6: Packet too big!<br/>max 1492 bytes
    note over A: Reconsider packet sizes
    loop
        A->>R: [1492 bytes]
        R->>B: [1492 bytes]
    end
Warning
Do not filter all ICMPv6 packets in your firewall. They are vital for IPv6 to function properly.

In addition to the initial overhead, ICMPv6 packets are still far too often filtered by firewalls. Despite almost every IPv6 configuration guide having a warning like this, I’ve still encountered multiple companies from which I do not receive the ICMPv6 packet too big messages.

Filtered ICMPv6 is especially bad when combined with Happy Eyeballs. The initial setup of the connection only uses small packets, leaving IPv6 as the selected protocol. When the actual data transfer starts and packets hit maximum size, the connection is effectively dead.

---
config:
    mirrorActors: false
---
sequenceDiagram
    autonumber 3
    actor A as 
    participant R as Router
    actor B as 
    A->>R: [1500 bytes]
    R--XA: ICMPv6: Packet too big!<br/>max 1492 bytes
    A->>R: retry: [1500 bytes]
    R--XA: ICMPv6: Packet too big!<br/>max 1492 bytes
    note over B: timeout
    note over A: give up,<br/> reset connection
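
The same failure can be expressed as a toy model. It is only a sketch: the function name, the packet sizes in the example, and the two status strings are made up for illustration.

```python
def send_all(packet_sizes, path_mtu, icmp_filtered):
    """Return (bytes_delivered, status) for a sequence of packets sent in order."""
    delivered = 0
    for size in packet_sizes:
        if size > path_mtu and icmp_filtered:
            # No feedback arrives, so every retry has the same size and is dropped again.
            return delivered, "stalled"
        # Either the packet fits, or the sender learned the limit and re-sent smaller pieces.
        delivered += size
    return delivered, "complete"


handshake_then_data = [80, 60, 1500, 1500]
print(send_all(handshake_then_data, path_mtu=1492, icmp_filtered=False))
print(send_all(handshake_then_data, path_mtu=1492, icmp_filtered=True))
```

The small handshake packets get through in both cases, which is all that the connection race ever observes.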

For applications to feel responsive, the initial performance of a new connection is critical. To match IPv4 performance, we’d prefer to never have to deal with packet too big messages. IPv6 router advertisements have a field to indicate the maximum packet size that clients should use. We can use that feature to artificially reduce our MTU, effectively keeping the fragmentation issue from surfacing.

IPv6 defines a minimum MTU of 1280 bytes.15 It is tempting to use this on our network and avoid fragmentation altogether. Unfortunately, that would prevent standard-compliant tunnels from working on our network. The inside of an IPv6-compatible tunnel would still need to adhere to the minimum of 1280 bytes. Assuming a non-zero tunnel overhead, this would be impossible.

Cloudflare has a great blog post about their IPv6 MTU configuration, for which they have made in-depth measurements. The situation has improved a lot since the blog post was published. At the time of this writing, Google advertises an MSS of 1440 bytes, indicating support for a 1500-byte MTU. Cloudflare’s MSS is 1360 bytes, corresponding to an MTU of 1420 bytes.
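
The conversion between the two figures is simple arithmetic. The sketch below assumes a 40-byte IPv6 header without extension headers and a 20-byte TCP header without options.

```python
IPV6_HEADER = 40  # bytes, fixed IPv6 header without extension headers
TCP_HEADER = 20   # bytes, TCP header without options


def mtu_from_mss(mss):
    """Return the IPv6 MTU implied by an advertised TCP MSS."""
    return mss + IPV6_HEADER + TCP_HEADER


print(mtu_from_mss(1440))  # the Google figure quoted above
print(mtu_from_mss(1360))  # the Cloudflare figure quoted above
```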

The practical range for IPv6 MTUs is fairly small. Consider all of the following:

  • IPSec tunnel overhead is about 76 bytes, but could be up to 136 bytes.16
  • WireGuard over IPv6 overhead is 80 bytes.17
  • PPPoE overhead is 8 bytes.

I find that IPSec is commonly used by VoWiFi and other end-user software. I also see that a large portion of the end-user internet is stuck behind PPPoE. This gives me a range of 1356 to 1492 bytes suitable for the MTU. Cloudflare’s choice of 1420 being in that range is probably not a coincidence. My personal MTU choice is 1412 bytes, because I need to accommodate WireGuard over IPv6 over PPPoE for some of my locations.
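
These figures can be reproduced from the overheads listed above. The sketch assumes a 1500-byte Ethernet MTU as the starting point, and the derivation of the lower bound (minimum IPv6 MTU plus the typical IPSec overhead) is my reading of the numbers rather than something stated explicitly.

```python
ETHERNET_MTU = 1500       # bytes, assumed starting point
IPV6_MIN_MTU = 1280       # bytes, minimum MTU required by IPv6
PPPOE = 8                 # bytes of overhead
IPSEC_TYPICAL = 76        # bytes of overhead, typical case
WIREGUARD_OVER_IPV6 = 80  # bytes of overhead

# Upper bound: what still fits through a PPPoE link.
upper = ETHERNET_MTU - PPPOE
# Lower bound: leave room for a typical IPSec tunnel carrying a minimum-size IPv6 packet.
lower = IPV6_MIN_MTU + IPSEC_TYPICAL
# WireGuard over IPv6, itself running over PPPoE.
personal = ETHERNET_MTU - PPPOE - WIREGUARD_OVER_IPV6

print(lower, upper, personal)
```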

Solutions

So, what can I do about my home network stability after the in-depth analysis? I have to say that the dual-stack configuration of most routers and ISPs is just not good enough. If I want reliable IPv6 connectivity, I need to fine-tune the configuration manually. Below is the list of things I’ve done, should you wish to embark on a similar endeavor.

  1. Avoid dynamic IPv6 prefixes

    Request a static IPv6 prefix for the network to avoid renumbering events. If a static prefix is unavailable at your provider, consider tunneling a static prefix.

    If you must deal with a dynamic prefix, either use unique-local addresses with prefix-translation, or tweak the preferred and valid lifetimes to minimize the impact of renumbering. Consider going as low as 3-5s router advertisement interval with 9-15s of lifetime.

  2. Tunnel carefully

    If you decide to tunnel IPv6, prefer to route IPv4 over the same route. Make sure that the latency and jitter are similar over IPv4 and IPv6, and that the IPv4 and IPv6 addresses fall within the same geolocation.

  3. Tweak Neighbor Discovery timings

    Configure the neighbor discovery timings. The standard-recommended defaults are only suitable for the largest networks. Consider reducing MIN_DELAY_BETWEEN_RAS to 500ms, and MinRtrAdvInterval and MaxRtrAdvInterval to a few seconds.

    Consider raising your router advertisement priority to high, to reduce the chance of conflict with random devices advertising.

  4. Calculate and advertise the MTU

    Calculate the maximum MTU for your network, and advertise it via router advertisements. Don’t forget to consider the overhead of any tunneling, including ISP-enforced PPP dial-in. If your MTU is higher than 1420 bytes, consider capping at 1420 bytes for best interoperability.

    In addition to proper MTU configuration, consider performing TCP MSS clamping similarly to IPv4. Make sure ICMPv6 is not firewalled, or that the firewall reliably forwards packet too big messages.

  5. Double-check multicast

    Make sure that multicast is working in your network. Unless a high multicast throughput is expected, consider disabling multicast handling and letting it fall back to broadcast.

    If there are vendor-specific multicast optimization options in your devices, turn them off or test them for IPv6 compatibility exhaustively.
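
The numeric advice in items 1 and 4 can be collected into two small helpers. This is a sketch only: the function names are made up, and the factor of 3 between advertisement interval and lifetime is simply my reading of the "3-5s interval with 9-15s of lifetime" suggestion.

```python
IPV6_MIN_MTU = 1280  # bytes, minimum MTU required by IPv6
INTEROP_CAP = 1420   # bytes, the cap suggested in item 4


def advertised_mtu(link_mtu, overheads, cap=INTEROP_CAP):
    """Return the MTU to advertise: link MTU minus tunnel overheads, capped for interoperability."""
    mtu = min(link_mtu - sum(overheads), cap)
    if mtu < IPV6_MIN_MTU:
        raise ValueError(f"{mtu} bytes is below the IPv6 minimum of {IPV6_MIN_MTU}")
    return mtu


def lifetime_for_interval(interval_seconds, factor=3):
    """Return an address lifetime that survives a couple of lost advertisements."""
    return factor * interval_seconds


print(advertised_mtu(1500, [8]))      # PPPoE only
print(advertised_mtu(1500, [8, 80]))  # PPPoE plus WireGuard over IPv6
print(lifetime_for_interval(3), lifetime_for_interval(5))
```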


  1. RFC 8305: Happy Eyeballs Version 2: Better Connectivity Using Concurrency ↩︎

  2. The head start given to IPv6 connections is formally known as the Connection Attempt Delay. Its value is 250 ms by default (RFC 8305 Section 8) but may be as low as 10 ms (RFC 8305 Section 5). ↩︎

  3. This behavior is documented in Bugzilla (Bug 725587 Comment 9). ↩︎

  4. RFC 6724: Default Address Selection for Internet Protocol Version 6 (IPv6) ↩︎

  5. RFC 1918: Address Allocation for Private Internets ↩︎

  6. SLAAC refers to RFC 4862: IPv6 Stateless Address Autoconfiguration ↩︎

  7. Although I’ve rediscovered these issues myself, these are by no means new. There’s RFC 8978: Reaction of IPv6 Stateless Address Autoconfiguration (SLAAC) to Flash-Renumbering Events dated March 2021 discussing this issue and potential mitigations. ↩︎

  8. If RemainingLifetime is less than or equal to 2 hours, ignore the Prefix Information option with regards to the valid lifetime […] RFC 4862 Section 5.5.3 e) ↩︎

  9. See RFC 2131: Dynamic Host Configuration Protocol and RFC 826: An Ethernet Address Resolution Protocol for details. ↩︎

  10. See RFC 4861: Neighbor Discovery in IPv6 particularly sections 6.2.4 and 6.2.6 ↩︎

  11. Since WiFi networks negotiate speed on a per-client basis, the broadcast traffic will generally be transmitted at a lowest common denominator speed, disproportionately slowing down the entire network. ↩︎

  12. Fragmentation by routers can be disabled by the Do Not Fragment bit, but this is not relevant for the current comparison. The fragmentation algorithm is detailed by RFC 791: Internet Protocol ↩︎

  13. There’s a detailed RFC on how to handle fragmentation and reduced MTU due to network tunnels. Most of its recommendations are also widely applicable to any packet size bottleneck. RFC 4459: MTU and Fragmentation Issues with In-the-Network Tunneling ↩︎

  14. Contrary to popular belief, IPv6 does support fragmentation, but it must be done by the original sender. See RFC 2460, Section 4.5 for details of the IPv6 fragmentation header. ↩︎

  15. The minimum is explicitly stated in RFC 8200: Internet Protocol, Version 6 (IPv6) Specification, Section 5. ↩︎

  16. Calculated via the IPSec overhead calculator ↩︎

  17. Overhead can be calculated from the WireGuard protocol description, as summarized by this mailing list post. ↩︎