Would using iptables limit my number of possible hops?

Discussion in 'Linux Networking' started by dominic.jacobssen, Aug 30, 2007.

  1. Hi all.

    I've got an odd networking problem that has completely stumped me. I'm
    very familiar with Linux on a day-to-day basis, but I'm no networking
    guru, and I figured that a usenet post would be the best bet.

    The technical support people at my ISP are no help at all, and insist
    that it must be something to do with my setup. The thing is, I am not
    sure that they're wrong. I wouldn't mind it if they actually backed up
    their assertion with a wireshark trace, but I get the impression it's
    just laziness on their part.

    This is the problem:

    A colleague of mine and I share an ADSL connection going through a
    Linux firewall running iptables. I administer the firewall. The
    firewall machine has two ethernet ports, with pretty simple rules:

    - allow incoming SSH on a high port;
    - allow access to and from the LAN;
    - allow "associated" incoming connections;

    For the most part, Internet connectivity is fine. Web, SMTP, POP3,
    BitTorrent, SSH, you name it. My ISP blocks incoming connections on
    the first 1024 ports as a matter of policy, but apart from that the
    service is solid and fast.

    However, this colleague has an email account hosted by
    fasthosts.net.uk (actually, they seem to go by many names: fasthosts,
    livemail, including various permutations of .co.uk, .net, etc). For
    the last three days he has been unable to connect to any of the
    following:

    smtp-in-112.livemail.co.uk (
    mail213-171-216-21.livemail.co.uk (
    mail213-171-216-230.livemail.co.uk (

    Performing a tracepath on these addresses gives a suspicious pattern
    (I've removed the first few lines):

    $ tracepath
    4: tshape-phome.lim.thunderworx.net ( 65.300ms
    5: r-psdl.lim.thunderworx.net ( asymm 3
    6: r-bbone3.lim.thunderworx.net ( 65.731ms
    7: tshape2.thunderworx.net ( 107.126ms
    8: r-bbone3.lim.thunderworx.net ( asymm 6
    9: r-bbone2.lim.thunderworx.net ( asymm 7
    10: r-bbone2.ldn.thunderworx.net ( asymm 9
    11: no reply
    12: ldn-b1-link.telia.net ( asymm 11
    13: ldn-bb2-pos0-2-0.telia.net ( 144.573ms
    14: ldn-b4-link.telia.net ( 147.085ms
    15: no reply
    16: no reply

    $ tracepath
    4: tshape-phome.lim.thunderworx.net ( 64.224ms
    5: r-psdl.lim.thunderworx.net ( asymm 3
    6: r-bbone3.lim.thunderworx.net ( 197.966ms
    7: tshape2.thunderworx.net ( 197.653ms
    8: r-bbone3.lim.thunderworx.net ( asymm 6
    9: r-bbone2.lim.thunderworx.net ( asymm 7
    10: r-bbone2.ldn.thunderworx.net ( asymm 9
    11: ldn-tch-i1-link.telia.net ( asymm 10
    12: ldn-b1-link.telia.net ( asymm 11
    13: ldn-bb1-link.telia.net ( 171.609ms
    14: ldn-b4-link.telia.net ( 163.505ms
    15: no reply
    16: no reply

    In other words, it never proceeds further than ldn-b4-link.telia.net
    on the 14th hop.

    The guys at the ISP say, "well, it works for me, must be something
    with your setup". Now, I know that this works fine on another ISP, but
    that goes via a different route.

    As far as I know, nothing has changed on my setup. Moreover, I'm
    stumped as to how having something misconfigured on my setup could
    possibly affect connectivity between backbone switches thousands of
    miles away. The only thing I can think of is maybe my MTU setting,
    which is set to 1500.

    How can I do more diagnostics? To whom can I complain? My ISP? Telia?
    The final destination?


    dominic.jacobssen, Aug 30, 2007

  2. all this shows is that traceroute is disabled at a host along the
    path... just because traceroute is disabled... doesn't mean that you
    can't get a tcp connection to it...

    what issue ( besides traceroute ) are you having?

    To debug some of these issues you can just try and do
    telnet host 25 ( 25 - smtp server port )
    telnet host 110 ( 110 - pop3 port )
    telnet host 143 ( 143 - imap port )
    and see if you can get a connection.
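
    The telnet checks above can also be scripted. What follows is a
    minimal Python sketch using only the standard library; the function
    names, the port table and the 5-second timeout are illustrative
    choices, not anything from this thread. It separates a refused
    connection (something answered with a reset) from a timeout (nothing
    answered at all), which helps tell a closed port from dropped packets.

```python
import socket

# Ports from the telnet examples above.
MAIL_PORTS = {25: "smtp", 110: "pop3", 143: "imap"}


def probe_tcp_port(host, port, timeout=5.0):
    """Try one TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <reason>".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # A reset came back: the host is reachable but nothing listens there.
        return "refused"
    except (socket.timeout, TimeoutError):
        # Nothing came back at all: the packets were dropped somewhere.
        return "timeout"
    except OSError as exc:
        # Name resolution failures, unreachable networks, and so on.
        return f"error: {exc}"


def probe_mail_ports(host, timeout=5.0):
    """Probe the three mail ports and return {service name: outcome}."""
    return {name: probe_tcp_port(host, port, timeout)
            for port, name in MAIL_PORTS.items()}


# Example call (hypothetical host name):
#   print(probe_mail_ports("mail.example.org"))
```

    A result of "timeout" on every port, while other hosts report "open",
    would point at filtering on the path rather than at the mail server
    software itself.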

    Some hosts do a reverse name lookup ( which can slow things down
    a lot ) and then decide whether they want to allow the connection or not...

    .... here is an example...

    telnet 25
    Connected to smtp-in-112.livemail.co.uk (
    Escape character is '^]'.
    220 smtp-in-79.livemail.co.uk ESMTP Postfix
    telnet> quit
    Connection closed.

    does that work for you?

    what doesn't work?


    D.A.M. - Mothers Against Dyslexia

    see http://www.jacksnodgrass.com for my contact info.

    jack - Grapevine/Richardson
    Jack Snodgrass, Aug 30, 2007

  3. elsiddik Guest

    I don't think the problem is with the ISP - it's somewhere in your
    Some ISPs lately, most of them, block all traceroute/ping queries at
    their firewall, so the probes never reach the target host. That's not the
    Dig somewhere else or try your log files; try to find out what's
    blocking you or won't let you connect to that mail server.
    Your firewall may be an issue.

    zaher el siddik
    elsiddik, Aug 30, 2007
  4. Hi Jack; thanks for your reply.

    I understand what you're saying about traceroute/ping, but I think
    it's deeper than that. After all, the only reason I'm doing a
    traceroute/ping is because we can't make normal TCP connections.

    Telnetting to the various ports fails completely:

    $ telnet 110
    telnet: Unable to connect to remote host: Connection timed out

    $ telnet 25
    telnet: Unable to connect to remote host: Connection timed out

    However, everything else (meaning "seemingly any connection to
    elsewhere in the internet") works fine; for example:

    $ traceroute smtp.google.com
    traceroute to smtp1.google.com (, 30 hops max, 40 byte
    4 tshape-phome.lim.thunderworx.net ( 42.751 ms 38.499
    ms 64.280 ms
    5 r-psdl.lim.thunderworx.net ( 42.466 ms 38.741 ms
    39.788 ms
    6 r-bbone3.lim.thunderworx.net ( 82.974 ms 40.871 ms
    58.354 ms
    7 tshape2.thunderworx.net ( 62.720 ms 64.320 ms
    39.224 ms
    8 r-bbone3.lim.thunderworx.net ( 114.365 ms 129.280
    ms 124.335 ms
    9 r-bbone2.lim.thunderworx.net ( 57.941 ms 61.109
    ms 59.430 ms
    10 r-bbone2.ldn.thunderworx.net ( 184.927 ms 226.484
    ms 200.451 ms
    11 * * ldn-tch-i1-link.telia.net ( 120.972 ms
    12 ldn-b1-link.telia.net ( 151.799 ms 134.434 ms
    145.181 ms
    13 ldn-bb2-pos0-2-0.telia.net ( 145.616 ms 119.058
    ms 130.195 ms
    14 nyk-bb2-link.telia.net ( 192.176 ms 205.303 ms
    212.208 ms
    15 chi-bb1-pos7-0-0-0.telia.net ( 215.366 ms 265.793
    ms 244.027 ms
    16 google-118691-chi-bb1.c.telia.net ( 208.771 ms
    207.361 ms 209.296 ms
    17 ( 243.090 ms
    ( 233.083 ms 293.161 ms
    18 smtp1.google.com ( 208.287 ms 210.620 ms 250.974

    Note that this link is (1) more than 14 hops and (2) goes through
    exactly the same router (ldn-b1-link.telia.net) as the failing ones
    above, but not through the last "working" point (ldn-b4-

    More examples:

    $ telnet google.com 80
    Connected to google.com.
    Escape character is '^]'.

    $ ping news.bbc.co.uk
    PING newswww.bbc.net.uk ( 56(84) bytes of data.
    64 bytes from nolmedia01.thdo.bbc.co.uk ( icmp_seq=1
    ttl=50 time=144 ms
    64 bytes from nolmedia01.thdo.bbc.co.uk ( icmp_seq=2
    ttl=50 time=136 ms
    64 bytes from nolmedia01.thdo.bbc.co.uk ( icmp_seq=3
    ttl=50 time=158 ms

    Moreover, if I try and do a traceroute using Thunderworx's own web
    traceroute here:


    then (apart from being very slow) this also seemingly fails to reach
    the final destination (unless I'm mistaken and is
    somehow an alias for, although it does manage to


    dominic.jacobssen, Aug 30, 2007
  5. Hi Zaher, thanks for your reply.

    As I said in another reply, it's not just the traceroute/ping traffic
    that fails, everything fails (SMTP, HTTP, POP3). This is something
    that has been working for a couple of years and has recently stopped
    working, with no change to my firewall (the script that initialises
    the firewall is kept in SVN, so I know its modification history).

    Besides, if it were the firewall, surely that would imply that:

    - I wouldn't be able to make the *first* hop, let alone multiple
    subsequent hops;
    - The protocol would be blocked to *any* remote server, not just a
    particular set of servers. For example, my colleague is using
    livemail.co.uk, but I'm using exactly the same protocols (POP3, SMTP)
    with multiple other mail servers and my email works fine.


    dominic.jacobssen, Aug 30, 2007
  6. buck Guest

    There is a way to limit the number of hops with iptables, but it is
    doubtful that it is implemented on your firewall. If you have doubts,
    turn off the firewall during your trace. And use
    tcptraceroute -n DEST PORT rather than tracepath / traceroute.
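
    For reference, the iptables features that can affect hop count are
    the ttl match (-m ttl) and the TTL target (-j TTL, mangle table).
    Below is a rough Python sketch that lists any rule touching the TTL,
    assuming the ruleset has been dumped in iptables-save format. The
    sample ruleset, including its interface name and port number, is
    invented for illustration and is not the firewall from this thread.

```python
import re

# Matches the ttl match module, the TTL target, or any of their options.
TTL_RULE = re.compile(r"-m\s+ttl\b|-j\s+TTL\b|--ttl-(?:eq|lt|gt|set|dec|inc)\b")

# Invented ruleset in iptables-save format, for illustration only.
SAMPLE_RULESET = """\
*mangle
-A POSTROUTING -o eth0 -j TTL --ttl-set 14
COMMIT
*filter
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p tcp -m tcp --dport 22022 -j ACCEPT
-A FORWARD -m ttl --ttl-lt 3 -j DROP
COMMIT
"""


def find_ttl_rules(ruleset_text):
    """Return the rules in an iptables-save style dump that match on or rewrite TTL."""
    rules = []
    for line in ruleset_text.splitlines():
        line = line.strip()
        # Only "-A ..." lines are rules; table names and COMMIT are skipped.
        if line.startswith("-A") and TTL_RULE.search(line):
            rules.append(line)
    return rules


for rule in find_ttl_rules(SAMPLE_RULESET):
    print(rule)
```

    An empty result on the real ruleset would suggest the firewall is not
    limiting hops this way. Checking a live firewall would mean feeding
    the function the output of iptables-save, run as root.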

    buck, Aug 30, 2007
  7. Hi Buck; thanks for your reply.

    Oooh, I hadn't heard of tcptraceroute before. Nice new tools! Thanks for
    the link (for the innocent bystanders in this thread, it's in Ubuntu,
    and therefore presumably Debian: sudo apt-get install tcptraceroute).

    However, the good news ends there:

    $ tcptraceroute -n 80
    Selected device eth0, address, port 45415 for outgoing
    Tracing the path to on TCP port 80 (www), 30 hops max
    4 37.928 ms 99.411 ms 37.867 ms
    5 48.708 ms 58.069 ms 38.322 ms
    6 41.557 ms 54.884 ms 58.748 ms
    7 63.019 ms 62.079 ms 58.319 ms
    8 62.238 ms 62.076 ms 58.077 ms
    9 59.418 ms 64.806 ms 62.257 ms
    10 139.315 ms 118.668 ms 123.422 ms
    11 * 119.929 ms 137.001 ms
    12 118.331 ms 120.699 ms 116.717 ms
    13 125.581 ms 119.391 ms 135.511 ms
    14 121.862 ms 122.924 ms 130.701 ms
    15 * * *
    16 * * *
    17 * * *
    18 * * *
    19 * * *
    20 * * *

    The odd thing is that everything else seems to be working just fine.
    It's just so odd and inexplicable.
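
    Picking out the last hop that answered can be automated when comparing
    several traces. Here is a small Python sketch, assuming output shaped
    like the tcptraceroute and tracepath listings in this thread (hop
    number first, "* * *" or "no reply" for silence). The function name is
    an illustrative choice, and the sample is taken from the listing above.

```python
import re

# A hop line starts with the hop number, optionally followed by a colon.
HOP_LINE = re.compile(r"^\s*(\d+)[:\s]\s*(.*)$")


def last_responding_hop(trace_output):
    """Return the highest hop number that produced any reply, or None."""
    last = None
    for line in trace_output.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # headers and wrapped continuation lines
        hop, rest = int(match.group(1)), match.group(2).strip()
        # tracepath prints "no reply"; traceroute variants print only stars.
        silent = rest == "no reply" or set(rest.split()) <= {"*"}
        if not silent:
            last = hop
    return last


SAMPLE = """\
 13 125.581 ms 119.391 ms 135.511 ms
 14 121.862 ms 122.924 ms 130.701 ms
 15 * * *
 16 * * *
"""

print(last_responding_hop(SAMPLE))  # prints 14
```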


    dominic.jacobssen, Aug 31, 2007
  8. Moe Trin Guest

    On Fri, 31 Aug 2007, in the Usenet newsgroup comp.os.linux.networking, in
    article <>,
    Yes, however it shares one dependency with hping2, hping3, mtr, and
    the various versions of traceroute in that the intermediate hops are
    all identified by sending a packet with incrementally increasing TTLs
    and hoping that the "remote" sites will return an ICMP Type 11 error
    message when they drop the packets after decrementing the TTL.
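
    The mechanism described above can be illustrated with a toy model. It
    is a Python sketch that sends no packets at all; the hop names are
    hypothetical and the two failure modes (a hop that stays silent, and a
    hop that discards forwarded traffic) are simplifications chosen for
    illustration.

```python
def simulate_trace(path, silent=(), blackhole=None):
    """Model TTL-limited probing along `path` (hop names, destination last).

    The probe sent with TTL n expires at hop n, which normally reports
    itself with an ICMP Type 11 (Time Exceeded) message. Hops named in
    `silent` forward traffic but never send that message. The hop named in
    `blackhole` still reports itself but discards everything it should
    forward. The last entry of `path` stands for the destination, which in
    a real trace answers with something other than Time Exceeded.

    Returns a list of (ttl, hop name or None).
    """
    replies = []
    for ttl, hop in enumerate(path, start=1):
        replies.append((ttl, None if hop in silent else hop))
        if hop == blackhole:
            # Probes with a larger TTL die here, so nothing further answers.
            replies.extend((later, None)
                           for later in range(ttl + 1, len(path) + 1))
            break
    return replies


# Hypothetical hop names, not the routers from this thread.
PATH = ["gateway", "isp-core", "transit-a", "transit-b", "mailhost"]

for ttl, hop in simulate_trace(PATH, silent={"transit-a"}):
    print(ttl, hop or "no reply")
```

    A single silent hop leaves a gap but the trace carries on, whereas a
    hop that discards forwarded traffic makes every later TTL go quiet.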

    One thing to look at is the actual packet that is leaving your firewall
    and see if it contains any "strange" flags, or unexpected TTLs. When the
    2.4.0 kernel was introduced back in early 2001, there was a rash of
    connection problems reported because this kernel introduced the use of
    ECN (Explicit Congestion Notification - see RFC3168). Some routers of
    that era were configured to silently discard packets with "unknown"
    flags set. There were bug fixes for these routers, but it is possible
    that you may be encountering one or more that have not been updated.
    You could try disabling the ECN in your firewall by

    echo 0 > /proc/sys/net/ipv4/tcp_ecn

    and see if that fixes the problem. It's not very likely, as the bugfix
    was widely publicized back then, and one would hope that the people who
    operate the routers have their collective heads out of their a$$. But
    one never knows. You mention Ubuntu, and I'm sure you can find a packet
    sniffer (tcpdump, ethereal [now known as wireshark], or what-ever) to see
    what those outgoing packets look like.
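
    Once a packet has been captured, the fields worth checking are few.
    This Python sketch assumes the bytes start at the IPv4 header (any
    link-layer header already stripped) and that the payload is TCP.
    build_sample_syn is a hypothetical helper that fabricates a packet for
    demonstration, with addresses from the documentation range and
    checksums left at zero. Per RFC 3168, a SYN that requests ECN has both
    the ECE and CWR flags set.

```python
import struct

# TCP flag bits, highest bit first.
TCP_FLAGS = [(0x80, "CWR"), (0x40, "ECE"), (0x20, "URG"), (0x10, "ACK"),
             (0x08, "PSH"), (0x04, "RST"), (0x02, "SYN"), (0x01, "FIN")]


def inspect_ipv4_tcp(packet):
    """Extract the TTL, the IP-level ECN bits and the TCP flag names."""
    version_ihl, tos, _length, _ident, _frag, ttl, proto = struct.unpack(
        "!BBHHHBB", packet[:10])
    if version_ihl >> 4 != 4 or proto != 6:
        raise ValueError("expected an IPv4 packet carrying TCP")
    header_len = (version_ihl & 0x0F) * 4  # IHL is counted in 32-bit words
    flag_byte = packet[header_len + 13]    # flags sit at offset 13 of the TCP header
    return {
        "ttl": ttl,
        "ip_ecn_bits": tos & 0x03,
        "tcp_flags": [name for bit, name in TCP_FLAGS if flag_byte & bit],
    }


def build_sample_syn(ttl=64, ecn=True):
    """Hypothetical helper: fabricate a 40-byte SYN for demonstration only."""
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, ttl, 6, 0,
                     bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2]))
    flags = 0x02 | (0xC0 if ecn else 0)  # SYN, plus ECE and CWR when requesting ECN
    tcp = struct.pack("!HHIIBBHHH", 45415, 25, 0, 0, 5 << 4, flags, 65535, 0, 0)
    return ip + tcp


print(inspect_ipv4_tcp(build_sample_syn()))
```
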
    In your original post, you mention mail on ports 25 and 110, rather than
    web stuff on 80. None the less, from here (North America), I can see that
    the port is open and there appears nothing untoward (no ident check to
    port 113 for example).
    10 ( 314.589 ms 298.770 ms 299.088 ms
    11 ( 304.663 ms 298.652 ms 309.089 ms
    12 ( 304.526 ms 298.788 ms 299.091 ms
    13 ( 304.654 ms 308.715 ms 289.123 ms
    14 ( 294.545 ms 298.769 ms 289.097 ms
    15 ( [open] 284.802 ms * 289.584 ms

    although I'm hitting slightly different routers obviously.
    I'd try the ECN - wouldn't be the first (or last) time that has caught
    someone by surprise.
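
    Before changing the setting it may help to see what it currently is.
    The following is a Python sketch, assuming a Linux system where the
    sysctl file named above exists. The descriptions of the values are a
    paraphrase of the kernel's ip-sysctl documentation, and value 2 is
    only found on kernels newer than the ones discussed in this thread.

```python
from pathlib import Path

TCP_ECN_PATH = Path("/proc/sys/net/ipv4/tcp_ecn")

# Paraphrased meanings of the tcp_ecn sysctl values.
ECN_MODES = {
    0: "disabled",
    1: "enabled: requested on outgoing connections, accepted on incoming ones",
    2: "accepted on incoming connections, not requested on outgoing ones",
}


def describe_tcp_ecn(raw):
    """Turn the raw file content (for example "1\\n") into (value, meaning)."""
    value = int(raw.strip())
    return value, ECN_MODES.get(value, "unrecognised value")


def read_tcp_ecn(path=TCP_ECN_PATH):
    """Read the sysctl file and describe the current setting."""
    return describe_tcp_ecn(Path(path).read_text())


if __name__ == "__main__":
    if TCP_ECN_PATH.exists():
        print(read_tcp_ecn())
    else:
        print("no tcp_ecn sysctl here (not Linux?)")
```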

    Old guy
    Moe Trin, Aug 31, 2007
  9. buck Guest

    root@YesICan:~# tcptraceroute -n
    Selected device eth1, address 66.xxx.101.194, port 60405 for outgoing
    Tracing the path to on TCP port 80 (http), 30 hops max
    1 66.xxx.101.198 0.570 ms 0.429 ms 0.460 ms
    2 9.302 ms 3.594 ms 3.569 ms
    3 4.007 ms 5.682 ms 4.347 ms
    4 6.918 ms 7.858 ms 6.270 ms
    5 8.448 ms 6.055 ms 6.951 ms
    6 211.866 ms 59.904 ms 9.372 ms
    7 8.590 ms 7.822 ms 11.016 ms
    8 16.072 ms 8.435 ms 17.012 ms
    9 33.503 ms 31.482 ms 31.137 ms
    10 19.085 ms 33.834 ms 24.976 ms
    11 33.379 ms 31.048 ms 31.963 ms
    12 91.243 ms 92.484 ms 84.391 ms
    13 187.214 ms 160.783 ms 152.232 ms
    14 158.220 ms 170.779 ms 151.425 ms

    Note that you're not getting past here. ldn-b4-link.telia.net is your
    Bad Boy.

    15 154.198 ms 155.783 ms 158.879 ms
    16 156.391 ms 164.434 ms 157.140 ms
    17 156.699 ms 154.534 ms 154.383 ms
    18 [open] 153.877 ms * 158.319 ms

    Was your firewall down?

    Assuming that your firewall was down during this trace so that it
    cannot be at fault, have you any way to discuss this with telia.net?
    My guess would be that ldn-b4-link.telia.net is blocking you.

    One thing you might try is to boot a live CD and run another trace
    while it is running. That should prevent any Bad Thing from occurring
    during the trace and also eliminate anything in your failing setup as
    being at fault.

    ASIDE: it is interesting that is hop 14 for each of us...
    buck, Aug 31, 2007
  10. buck Guest

    Old Guy,

    Aren't you running with ECN set? I am and my trace, like yours,
    works. I know of several sites that still don't work with ECN:
    so ECN seems not to be Dominic's prob.

    Note that my trace to ports 25 and 110 both fail,
    while the trace to port 80 succeeds.
    buck, Sep 1, 2007
  11. Moe Trin Guest

    On Fri, 31 Aug 2007, in the Usenet newsgroup comp.os.linux.networking, in
    Yes I am, but notice we come through different doors. I came in via

    ]] 8 ( 154.631 ms 158.722 ms 169.092 ms
    ]] 9 ( 234.835 ms 228.792 ms 229.119 ms
    while in your other post you show
    Your 15-18 matches my 12-15, but my 11 is different from your 14.
    It's a bit sad as the Cisco bug fixes for this problem were released back
    in late 2000.

    Old guy
    Moe Trin, Sep 2, 2007