Author Topic: FluffOS KeepAlives  (Read 6354 times)

Offline Alexi

  • Acquaintance
  • *
  • Posts: 15
    • View Profile
FluffOS KeepAlives
« on: August 11, 2008, 11:38:34 AM »
There was a chat recently on I3 about adding keepalives to FluffOS.

1) A variety of client solutions are being employed to resolve the challenge of avoiding odd TCP timeouts caused by providers or routers.
2) Client makers such as Zugg are refusing to address the issue directly, and their arguments are sound.
3) Host servers are not at fault.
4) Networks are largely not at fault.
5) The problem is at the mudlib or driver layer.





Offline chaos

  • BFF
  • ***
  • Posts: 291
  • Job, school, social life, sleep. Pick 2.5.
    • View Profile
    • Lost Souls
Re: FluffOS KeepAlives
« Reply #1 on: August 11, 2008, 12:01:04 PM »
Since it's known that these TCP connection terminations are performed by 'providers or routers', it seems a bit much to identify the mudlib or driver as the source of the problem, as such.

Offline Raudhrskal

  • BFF
  • ***
  • Posts: 214
  • The MUD community needs YOUR help!
    • View Profile
Re: FluffOS KeepAlives
« Reply #2 on: August 11, 2008, 01:25:58 PM »
I tested some stuff. The TELNET NOP (no operation, \xff\xf1) would be the best choice, but some mud clients won't interpret it and print two symbols instead. The other possibility left is NUL, the 0 byte, which clients will ignore. But although you can put a \0 into an LPC string, the write() function will interpret it as the standard ASCIIZ terminator.

So, we're probably stuck with sending a null or two using the low level C send(). It could probably be added somewhere in the main event loop... or even as a new efun (send_keepalives() ?) to be called from LPC.
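
A minimal sketch of that low-level idea, assuming a POSIX socket API. The helper name send_nul() is made up for illustration and is not an existing FluffOS function. Because the length is passed explicitly, the NUL byte is not treated as a string terminator.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: write exactly one NUL byte to a connected socket.
 * Returns the number of bytes sent (1 on success) or -1 on error. */
static ssize_t send_nul(int fd)
{
    static const char nul = '\0';

    assert(fd >= 0);

    /* The explicit length of 1 is the point: no strlen() is involved,
     * so the zero byte actually goes out on the wire. */
    return send(fd, &nul, 1, 0);
}
```

In a driver this would presumably be called with each interactive's descriptor, either from the event loop or from an efun; where exactly is left open here.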
I think, therefore i may be wrong.
Please note that if you met a Raudhrskal in a place that's not related to muds, it wasn't me. *sigh*... back when I started there was zero hits on google for that name...

Offline chaos

Re: FluffOS KeepAlives
« Reply #3 on: August 11, 2008, 01:58:49 PM »
Put me down in favor of send_null(obj) where obj is an interactive.

1) Lib control of keepalive transmission is better than driver control.
2) The lib is already capable of sending a TELNET NOP.
3) So, the only thing missing is empowering the lib to send a NUL if TELNET NOP is no good for your users.

Offline cratylus

  • Your favorite and best
  • Administrator
  • ***
  • Posts: 1020
  • Cratylus@Dead Souls <ds> np
    • View Profile
    • About Cratylus
Re: FluffOS KeepAlives
« Reply #4 on: August 11, 2008, 02:10:34 PM »
I put this in /lib/body.c, in the CheckHealing() fun:

Code:
    if(interactive()){
        // try to push a single NUL byte to the player's connection
        receive("\0");
    }

It doesn't seem to break anything.

I can't say if it's a proper keepalive tho. I
don't have one of those flaky routers to test with.

Does it screw up your clients?

-Crat

Offline Raudhrskal

Re: FluffOS KeepAlives
« Reply #5 on: August 11, 2008, 02:31:00 PM »
Sorry, receive() takes its arg as a C-string and stops printing at the char before the first \0. I'd opt for send_null(interactive_ob), or a complete send_binary(inter_ob,data,length) function. It exists for mud-mode sockets, but I don't think there is a player-socket equivalent.

And to make clear why not TCP keepalives:
1) Adjustment of the timeouts is nonportable.
2) The standard MINIMAL delay between keepalives is 2 hours (!).
3) They're often blocked by firewalls and other network equipment, especially if you reduce the 2h interval.
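
For comparison, a sketch of what socket-level TCP keepalives look like in C. SO_KEEPALIVE is widely available, but the option that shortens the idle time (TCP_KEEPIDLE here) is platform-specific, hence the #ifdef. The helper name enable_tcp_keepalive() is made up for illustration.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: turn on TCP keepalive probes for a socket and,
 * where the platform offers a knob for it, shorten the idle time.
 * Returns 0 on success, -1 on failure (errno set by setsockopt). */
static int enable_tcp_keepalive(int fd, int idle_seconds)
{
    int on = 1;

    assert(fd >= 0 && idle_seconds > 0);

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) < 0)
        return -1;

#ifdef TCP_KEEPIDLE
    /* Not available everywhere, which is the portability problem:
     * other systems use a different option name or none at all. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_seconds, sizeof idle_seconds) < 0)
        return -1;
#else
    (void)idle_seconds; /* no per-socket idle setting known here */
#endif

    return 0;
}
```

Even where this works, it only changes how often the kernel probes; it does nothing about intermediate equipment that drops or ignores the probes.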

Offline wodan

  • BFF
  • ***
  • Posts: 434
  • Drink and code, you know you want to!
    • View Profile
Re: FluffOS KeepAlives
« Reply #6 on: August 11, 2008, 02:45:07 PM »
I have no idea what you're trying to fix! (I'm away from my keyboard for about a day at a time, and never lost my connection, so there doesn't seem to be any driver issue there)

Offline cratylus

Re: FluffOS KeepAlives
« Reply #7 on: August 11, 2008, 02:50:00 PM »
Quote
5) The problem is at the mudlib or driver layer.

I think this statement is a little confusing. I think *some* people
have problems staying connected to their mud. I have never
had this problem on a mud in my lan. The "disconnection
problem" is almost always, from what I've seen, either an
ISP issue or a modem/router issue.

So I disagree that their problem is at the mudlib or driver layer.

I will agree, however, that helping folks avoid the problems
caused by their lame routers or lame ISPs is most conveniently
done at the mudlib or driver layer.

-Crat

Offline Raudhrskal

Re: FluffOS KeepAlives
« Reply #8 on: August 11, 2008, 02:52:33 PM »
We're not trying to 'fix' anything. We need a way to put NUL (ASCII 0) into the data stream going to the player (i.e. the player's socket, as opposed to a mud-mode socket). The problem with that: the available efuns (write(), receive()) process the provided LPC string using C <string.h> functions (or their equivalents), so the LPC string is truncated at the first \0, like a typical C string.
Example:
receive("\0") -> no data in packet
receive("before\0after\0") -> "before" in packet

Verified with wireshark.
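
The truncation matches what any strlen()-based output path would produce. A small C illustration, assuming (as the capture suggests, but not confirmed from the driver source) that the efuns measure the string this way. The name cstring_payload_len() is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for how a C-string based output routine
 * decides how many bytes to transmit from a buffer. */
static size_t cstring_payload_len(const char *buf)
{
    assert(buf != NULL);

    /* strlen() counts up to, but not including, the first NUL byte,
     * so anything after an embedded \0 is never seen. */
    return strlen(buf);
}
```

With this measure, "\0" has a payload length of 0 and "before\0after\0" has a payload length of 6, which lines up with the two packets observed above.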

Offline chaos

Re: FluffOS KeepAlives
« Reply #9 on: August 11, 2008, 03:43:50 PM »
If you want to go in the direction of driver convergence you could implement binary_message().

Offline Alexi

Re: FluffOS KeepAlives
« Reply #10 on: August 11, 2008, 03:45:41 PM »
Quote
I have no idea what you're trying to fix! (I'm away from my keyboard for about a day at a time, and never lost my connection, so there doesn't seem to be any driver issue there)

Let me see if I can road map this...

This is a multi-variable challenge, Wodan, one that has been around for years.  People work and play in different clients.  Different clients maintain connections from point A to point N unpredictably.  More elite developers and players tend to gravitate toward more reliable clients.

Unfortunately, it is not a logical conclusion that the more reliable clients are also the more feature rich, user friendly, or popular among a particular user community.  So from a customer service standpoint...

The problem is and always has been disconnections due to firewall and network "are you there" simplicity.  Outside the mud world we have addressed this in telnetd and sshd...but we have been dragging our feet in the mud world.  Or rather we have been cobbling together all kinds of hacks and workarounds for years in lib and clients to avoid confronting the real solution.

The result: today we have clients sending unnecessary "looks" and spamming servers with delay loops and timed triggers, or cloning objects that have unnecessary heartbeats with pseudo-pings on the server side.  The problem has been so thoroughly band-aided that "I don't see the problem" is almost frightening.  I can only presume that seasoned veterans use clients, styles, and behavior that are tuned so tightly that challenges of this nature are rarely observed.

But surely an openminded, thought leader, one that wasn't ossifying would occasionally pick up zmud, cmud, tinyfugue, tintin, wintelnet, putty, teraterm and operate outside their normal environment to explore other spaces and realize that there is such a thing as connection challenges.

The point is, it's time for a 21st century solution.  We may need to consider the need for binary messaging.

Alexi

Offline cratylus

Re: FluffOS KeepAlives
« Reply #11 on: August 11, 2008, 07:37:44 PM »
Skal:
Quote
Sorry, receive() takes

:(

I installed wireshark to test this myself, and it appears you
are right that my simple hack is insufficient. I tried a
bunch of different tricks, but nothing worked invisibly.

Can't be that hard to add a simple keepalive() efun, can it?
I can try to do it, but I'm pretty sure nobody would like
the result. Anyone else up for it?

Alexi:
Quote
But surely an openminded, thought leader, one that wasn't ossifying

Yikes. I think it's possible to express your preference
without painting those who seem to disagree as not openminded.

-Crat

Offline chaos

Re: FluffOS KeepAlives
« Reply #12 on: August 11, 2008, 10:09:55 PM »
For any of the mechanisms discussed, I don't think calling it keepalive() or anything with keepalive in the name is appropriate.  If it's a function for sending a null, call it send_null(); if it's a function for sending an arbitrary binary message, it's probably beneficial to implement the same interface as LDmud's binary_message().  Either way, calling the function keepalive() is like if write() were named send_combat_message().

Offline cratylus

Re: FluffOS KeepAlives
« Reply #13 on: August 11, 2008, 10:22:10 PM »
That is reasonable.

-Crat

Offline wodan

Re: FluffOS KeepAlives
« Reply #14 on: August 12, 2008, 04:59:45 AM »
Right, must be my l33t client (never seen this with any client! I used the more popular ones as well).
But from what I read it could be something as silly as routers in people's homes deciding connections are inactive if they don't send anything for a minute.
Not our fault but we could do worse than trying to work around it.

I think we may as well combine it with something useful, like asking for the terminal size. Not quite all clients report changes, but most seem to reply to a request for the size.
Sending 0 may well confuse some clients, telnet go ahead seems to confuse some clients too, and telnet nop gets printed, so doing something real is probably the way to go.
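
For reference, the Telnet byte sequences under discussion, as a C sketch. The command and option codes are the standard Telnet values; the helper build_naws_request() is hypothetical, and whether a client answers a repeated DO NAWS varies by client.

```c
#include <assert.h>
#include <stddef.h>

/* Standard Telnet command and option codes. */
enum {
    TELNET_IAC      = 255, /* "interpret as command" prefix */
    TELNET_DO       = 253, /* ask the other side to enable an option */
    TELNET_GA       = 249, /* go ahead */
    TELNET_NOP      = 241, /* no operation */
    TELNET_OPT_NAWS = 31   /* negotiate about window size */
};

/* Hypothetical helper: fill `out` with IAC DO NAWS, the request a
 * server sends to ask the client for its terminal size.
 * Returns the number of bytes written, or 0 if `cap` is too small. */
static size_t build_naws_request(unsigned char *out, size_t cap)
{
    assert(out != NULL);

    if (cap < 3)
        return 0;

    out[0] = TELNET_IAC;
    out[1] = TELNET_DO;
    out[2] = TELNET_OPT_NAWS;
    return 3;
}
```

The three bytes would then be written with an explicit length, since a length-based write does not care what the bytes contain.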