LPMuds.net

LPMuds.net Forums => Intermud => Topic started by: cratylus on December 20, 2006, 03:11:53 pm

Title: Intermud-3 router old status
Post by: cratylus on December 20, 2006, 03:11:53 pm
Hey folks. I'm noticing some weird conflicts happening on yatmim.

Please let me know if you're having difficulty connecting.

Thanks.

-Crat
Title: Re: Intermud-3 router status
Post by: Tricky on December 21, 2006, 01:55:29 pm
Was having trouble when my client went berserk... Just deleted the client data file and it connected first time.

Tricky
Title: Re: Intermud-3 router status
Post by: cratylus on December 23, 2006, 03:43:04 pm
There is now a router status page on LPMuds.net, at this url:

http://lpmuds.net/intermud.html

It's a one-click spot for outage warnings and connection info.

-Crat
Title: Re: Intermud-3 router status
Post by: Tricky on December 28, 2006, 03:14:14 pm
I'm getting that bad-mojo - FD collision error again.

It connects, gets some mudlists and processes them, and then suddenly the connection is dropped, dunno by whom. My client re-connects after 30 seconds, sends the startup, and receives the bad-mojo error.

I've deleted the I3 client cache file so it can't be that.

Tricky
Title: Re: Intermud-3 router status
Post by: cratylus on December 29, 2006, 11:18:00 pm
I have to apologize a bit for the unhelpfulness of that error message.

There wasn't an error type that clearly indicated "i have no idea why this is failing",
so I went with the first thing that occurred to me.

My current opinion is that this is a symptom of a problem with the driver and the
lib disagreeing on the socket's file descriptor. What is happening is a
conflict between two muds trying to use the same FD. Because my investigation
strongly suggests the lib has no reason to get it wrong, I'm inclined to think there
is an error somewhere between the lib and driver.

This theory is supported by the fact that the fd's in contention usually
are bracketed by used ports, so that being off by one would cause this kind
of error. The fact that resetting the router clears the problem is also
evidence that the issue is local to the router, and not a connection problem.

I'm looking into it. It'll take some time. If anyone who is up to speed on
MudOS sockets reads this and wants to pitch in, please feel free.

-Crat
Title: Re: Intermud-3 router status
Post by: Tricky on December 30, 2006, 11:01:03 am
I've changed the behaviour of my client to bind the socket to a system-selected port. Before, I was relying on the system to do it automatically, as it should...
Quote
From MudOS LPC Sockets Tutorial (http://aragorn.uio.no/nanvaent/manpages/concepts/socket_efuns.html)

Clients

So far we have discussed a connection from the server's perspective. Now let's back up and walk through the client. Just like the server a client must call socket_create() to create a socket. Since a client does not intend that another client connect to it there is no need to bind the port to socket. Does this mean that it cannot do so? No. It is possible for either a client or server to bind to a port.

But why would a client wish to do so? Well the truth of the matter is this, every socket must be bound before a connection can be established. Every one. However, since clients don't really care what port they are bound to, a special bind is used. It was alluded to above, we're just catching up to it now. If a client calls socket_bind() with a second argument of 0, this indicates that the caller doesn't care what port is selected, just pick any one that is available. And this makes it easy for a client. If the caller did bind to a specific port, what happens if another client is already bound to it? The bind fails. So why not let the system do the work of choosing the port?

Now there is one more trick up our sleeve, however. The operating system is pretty smart. It knows whether a socket is bound or not. It knows when you do a connect (It knows when you've been bad or good). So, seeing how common it would be for a client to wish to connect to a server the designers of the 4.2/4.3BSD networking system put in a neat feature. If you connect on a socket, and the socket is not yet bound, the system will do a socket_bind(s, 0) for you automatically! In fact if you read BSD networking applications you will notice that almost no sockets that are used to initiate connect requests on ever bother doing the bind call. Laziness is bliss.

Before I bound the socket, netstat (in the mud) reported *.* for the local address. Now that I bind it, netstat reports *.<portnum> (where portnum is a port that is available for use).
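
The tutorial's two claims (port 0 means "pick any free port", and an unbound socket gets bound automatically on connect) can be checked outside the mud with ordinary BSD sockets. What follows is a minimal sketch in Python rather than LPC, so the calls are Python's standard socket module and not the MudOS socket_bind() efun. The function names are made up for illustration, and everything stays on the loopback interface.

```python
import socket

def explicit_bind_port():
    """Bind to port 0 and return the port the OS picked for us."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))          # 0 = "any free port"
        return s.getsockname()[1]

def implicit_bind_port():
    """Connect without calling bind(); return the port the OS bound anyway."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))     # throwaway local listener
        server.listen(1)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(server.getsockname())   # no bind() beforehand
            return client.getsockname()[1]

if __name__ == "__main__":
    print("explicit bind(0) got port", explicit_bind_port())
    print("connect without bind got port", implicit_bind_port())
```

Both functions should report a nonzero local port, which matches the tutorial's description. How the in-mud netstat displays a socket before and after binding is a separate matter that this sketch does not cover.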

Tricky
Title: Re: Intermud-3 router status
Post by: cratylus on December 30, 2006, 11:05:36 am
Ah, again I've been imprecise. Rather than "bracketed by used ports" I should have written
"bracketed by used fd's". The port binding isn't the problem, so much as the fd assignment
on socket creation.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on January 06, 2007, 02:02:33 pm
I spent some time last night poking at the router to see if I could gain
new information as to why a small number of folks still has connection problems.

The "incorrect fd" theory I had turns out to be really a symptom of a
different problem. I can say this quite confidently because I've tested
numerous scenarios where the "wrong fd" bug would hose things up,
and it didn't. Instead, the problem *looks* like it's a wrong fd bug, under specific
conditions.

The conditions appear to be a flaky network connection.

If your connection to intermud flakes out in mid-transmission, the router
may detect this drop and remove you from the connected list. It
does so by sending a socket_close() to the router.

Under some circumstances, it appears that the socket is in a
"CLOSING" state for a brief time and not actually closed. If your mud tries
to reconnect right at that moment, it will try to reclaim that closing
socket. Sadness occurs. This is most often seen as an error from
the router claiming "bad mojo" or "your mud is already connected".
This triggers unfortunate events in the router that make it
even more difficult to connect later on.
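
One way to picture the race described above is a connection table that knows about the in-between "CLOSING" state and refuses to reuse a slot until the close has finished. This is only a sketch in Python of that idea, not the actual router code; the class, method names and return strings are hypothetical.

```python
import enum

class State(enum.Enum):
    OPEN = "open"
    CLOSING = "closing"

class ConnectionTable:
    """Hypothetical router-side table: mud name -> (fd, state)."""

    def __init__(self):
        self.entries = {}

    def connect(self, mud, fd):
        entry = self.entries.get(mud)
        if entry is not None and entry[1] is State.OPEN:
            return "already-connected"
        if entry is not None and entry[1] is State.CLOSING:
            # The old socket is not fully gone yet. Do not hand its
            # slot to the newcomer; ask the client to try again.
            return "retry-later"
        self.entries[mud] = (fd, State.OPEN)
        return "ok"

    def begin_close(self, mud):
        fd, _ = self.entries[mud]
        self.entries[mud] = (fd, State.CLOSING)

    def finish_close(self, mud):
        del self.entries[mud]

if __name__ == "__main__":
    table = ConnectionTable()
    print(table.connect("SomeMud", 5))   # first connection
    table.begin_close("SomeMud")
    print(table.connect("SomeMud", 6))   # reconnect while still closing
    table.finish_close("SomeMud")
    print(table.connect("SomeMud", 6))   # reconnect after the close completed
```

The point of the sketch is only that a reconnect arriving mid-close gets a distinct, harmless answer instead of colliding with the half-closed socket.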

Last night's maintenance let me find and correct some stuff in the router that
I believe will dramatically reduce the incidence of these disharmonies.

However, if you still have problems, please let me know. It is important for
you to provide me with the errors you are receiving. If you are on
a Dead Souls mud, you can monitor packets by typing:

arch
go down

To go to the network monitoring room.

However, it may be that I cannot eliminate all such problems, because
the source is network communications I do not understand. To be
precise, these errors occur under one or more of the following conditions:

1) Your network connection occasionally sucks.
2) Your network connection is wireless.

Both of which are beyond my expertise.

I am not raising a white flag here. Simply commenting that fixing the
router to handle these situations may be beyond the scope of my expertise.

-Crat

 
Title: Re: Intermud-3 router status
Post by: cratylus on January 07, 2007, 10:38:33 am
Did further maintenance last night.

In this case, it was for an issue separate from what we've been talking about so far.

It appears that the old *gjs router was very forgiving of certain errors in
startup packets. If your header info was wrong, it seems that it would tolerate it
as long as the rest of the packet elements were ok.

By default, *yatmim did not exercise this forgiveness. If you said you're an I3
protocol 2 startup packet and you have 20 elements, you get rejected, even if
the 20 elements are correctly formatted for a protocol 3 startup.

This is now less strict. If you claim to be protocol 2 but provide protocol 3 startup
data, you will no longer be denied access. Your protocol will get automatically
kicked up a notch to 3, and you should be able to access.
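
The relaxed check can be sketched roughly as follows. This is Python, not the router's LPC, and it is an illustration under assumptions: the only figure taken from the post is the 20-element protocol 3 startup; the "startup-req-2" type name and the function name are assumptions, and genuine protocol 2 packets are left out because their layout is not given in this thread.

```python
PROTOCOL_3_STARTUP_LEN = 20   # element count quoted in the post above

def effective_protocol(packet):
    """Return the protocol version to use for a startup packet, or None.

    None means "not handled by this sketch", not necessarily "rejected".
    """
    kind = packet[0]
    if kind == "startup-req-3" and len(packet) == PROTOCOL_3_STARTUP_LEN:
        return 3
    if kind == "startup-req-2" and len(packet) == PROTOCOL_3_STARTUP_LEN:
        # Header claims protocol 2, body is shaped like protocol 3:
        # bump the protocol instead of rejecting the mud.
        return 3
    return None

if __name__ == "__main__":
    mislabeled = ["startup-req-2"] + [0] * 19
    print(effective_protocol(mislabeled))
```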

If you've been trying to connect to *yatmim and never were able, this is a
good time to retry.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on January 08, 2007, 06:31:45 am
Dead-souls.net and the router are both on a machine on a university network. Sometimes,
during inclement weather, this network goes down.

This is one of those times.

Hopefully things will get straightened out later this morning. If not, I'll
start implementing a workaround.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on January 08, 2007, 08:50:01 am
There is a new page now available with connection information for the alternate router.

The ETA for the university to fix things for yatmim is currently unknown.

If you need your intermud fix without delay, please read:

http://lpmuds.net/alternate_router.html

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on January 09, 2007, 12:40:07 am
I visited the box to see what the problem was.

Apparently it took a power hit, hard, and chunks of it are fried.
The chunks were obviously marginal to begin with, but they are fried all the same.

So much for the university's UPS outlets.

I'm rebuilding it. Should be done in a day or two.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on January 09, 2007, 11:36:18 am
The rebuild is complete, and yatmim is back online. As usual, please let me know
of any glitches you run into.

The router page has been revamped a bit, to consolidate information spread out
over a few pages, which I think was confusing.

Please use http://lpmuds.net/intermud.html to stay updated on current status,
planned outages, the faqs and the rules.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on February 07, 2007, 06:55:45 pm
A couple of muds chose today to go bananas and spam the router with hyperaggressive
reconnects at the same time. This has hosed things up. If you are having trouble connecting to
the router, it's because I'm playing whack-a-mole with these guys as they change ip's
and DoS the router.

I hate to firewall block networks, but it might wind up happening.

In any case, if your intermud connection doesn't get better by tomorrow, send me a PM here
and we'll work something out.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on March 13, 2007, 05:22:28 pm
I've lost connectivity to the *yatmim box. I'll be investigating this evening.

In the meantime, you can use the *i4 router with the command:

switchrouter i4 204.209.44.3 8080

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on March 13, 2007, 06:45:50 pm
Apparently the school campus suffered a power outage that hit most
buildings, including the one that *yatmim hides in. By the time I arrived,
power had been restored and the router was booting.

If you experience further problems, please let me know.

-Crat
Title: Re: Intermud-3 router status
Post by: Atomic on May 22, 2007, 02:33:36 am
On my mud (in the Archroom) the Intermud router status says it is online,
but the printout says: Intermud3 link down, stats unavailable.
down, south (server monitoring room) says I3 Server monitoring (as well as OOB) is online, rest offline.

9 out of 10 that it is caused on my side of the connection... can someone confirm that I'm able to
receive global intermud conversations?

Code:
INTERMUD_D: prev: ({ OBJ(/secure/sefun/sefun), OBJ(/secure/daemon/ping) })
INTERMUD_D reloaded.
Loading object stack:
    0:"get_stack".OBJ(/secure/sefun/sefun)."get_stack"
    1:"create".OBJ(/daemon/intermud)."create"
    2:"CATCH".OBJ(/secure/sefun/sefun)."CATCH"
    3:"update".OBJ(/secure/sefun/sefun)."update"
    4:"CheckOK".OBJ(/secure/daemon/ping)."CheckOK"
    5:"heart_beat".OBJ(/secure/daemon/ping)."heart_beat"
Loading object trail: ({ OBJ(/secure/sefun/sefun), OBJ(/secure/daemon/ping) })
INTERMUD_D: SocketStat: 1
INTERMUD_D setup: ({ "startup-req-3", 5, "Netheria", 0, "*yatmim", 0, 0, -1, -1, 3000, 3005, 3008, "Dead Souls 2.4.2", "Dead Souls 2.4.2", "MudOS
    v22.2b14-dsouls2", "LPMud", "mudlib development", "mujo@net.nl", ([ "rcp" : 2990, "who" : 1, "emoteto" : 1, "http" : 2995, "channel" : 1, "finger" :
    1, "tell" : 1, "auth" : 1, "mail" : 1, "ftp" : 2999, "locate" : 1, "oob" : 3005 ]), ([ ]) })
ERROR RECEIVED: ({ "error", 5, "*yatmim", 0, "Netheria", 0, "not-allowed", "wrong password, and from a new IP", ({ "startup-req-3", 5, "Netheria", 0,
    "*yatmim", 0, 0, -1, -1, 3000, 3005, 3008, "Dead Souls 2.4.2", "Dead Souls 2.4.2", "MudOS v22.2b14-dsouls2", "LPMud", "mudlib development",
    "mujo@net.nl", ([ "who" : 1, "rcp" : 2990, "emoteto" : 1, "http" : 2995, "channel" : 1, "finger" : 1, "tell" : 1, "auth" : 1, "ftp" : 2999, "mail" :
    1, "locate" : 1, "oob" : 3005 ]), ([ "next boot" : "EST Thu Oct 18 00:48:02 2018", "oob port" : 3005, "upsince" : "Tue May 22 09:48:02 2007", "ip" :
    "127.0.0.1", "native version" : "2.4.2", "os build" : "unix" ]) }) })
errorcode: not-allowed

Hmm, not allowed on Yatmim anymore? "Wrong password and from another IP",
well that could be right, I have a DHCP ip so that may vary from time to time. Don't know about
any Yatmim-password though  ??? (edit: checked faq's, gonna try a few other things http://lpmuds.net/intermud.html)

Switching to i4 as router, does seem to accept things, but still not receiving anything from the outside world:
Code:
INTERMUD_D: socket closing!
INTERMUD_D: prev: ({ OBJ(/secure/sefun/sefun), OBJ(/secure/daemon/ping) })
INTERMUD_D reloaded.
Loading object stack:
    0:"get_stack".OBJ(/secure/sefun/sefun)."get_stack"
    1:"create".OBJ(/daemon/intermud)."create"
    2:"CATCH".OBJ(/secure/sefun/sefun)."CATCH"
    3:"update".OBJ(/secure/sefun/sefun)."update"
    4:"CheckOK".OBJ(/secure/daemon/ping)."CheckOK"
    5:"heart_beat".OBJ(/secure/daemon/ping)."heart_beat"
Loading object trail: ({ OBJ(/secure/sefun/sefun), OBJ(/secure/daemon/ping) })
INTERMUD_D: SocketStat: 1
INTERMUD_D setup: ({ "startup-req-3", 5, "Netheria", 0, "*i4", 0, 0, -1, -1, 3000, 3005, 3008, "Dead Souls 2.4.2", "Dead Souls 2.4.2", "MudOS
    v22.2b14-dsouls2", "LPMud", "mudlib development", "mujo@net.nl", ([ "rcp" : 2990, "who" : 1, "emoteto" : 1, "http" : 2995, "channel" : 1, "finger" :
    1, "tell" : 1, "auth" : 1, "mail" : 1, "ftp" : 2999, "locate" : 1, "oob" : 3005 ]), ([ ]) })
...this above piece of code continuously stopping and starting is considered pinging right?


Code:
> mudlist -n fron
No MUDs match your query.
> mudlist -n dead
No MUDs match your query.
> mudlist -n dead*
No MUDs match your query.
>
Title: Re: Intermud-3 router status
Post by: cratylus on May 22, 2007, 08:16:12 am
A good way to test is to change your mud's name to something
you know for sure is completely unique, and reboot. A common
problem is that if your IP changed and you reinstalled a new
copy of the lib, but you use your old mud name, the router will
assume you are not who you say you are, and reject your connection.
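
For anyone wondering why a reinstall plus a new IP gets rejected, here is a rough Python sketch of the kind of check that would produce the "wrong password, and from a new IP" error seen in the log earlier in the thread. It is a guess reconstructed from that error string, not the router's real logic; the function name and the shape of the known-muds table are hypothetical.

```python
def check_startup(known, name, password, ip):
    """known maps mud name -> (password, ip) from the last good connection."""
    if name not in known:
        known[name] = (password, ip)        # first time this name is seen
        return ("ok", None)
    old_password, old_ip = known[name]
    if password == old_password or ip == old_ip:
        known[name] = (old_password, ip)    # remember the latest address
        return ("ok", None)
    # Neither the password nor the address matches the record.
    return ("not-allowed", "wrong password, and from a new IP")

if __name__ == "__main__":
    known = {}
    print(check_startup(known, "Netheria", 1234, "10.0.0.1"))
    # Fresh lib install (password lost) from a different address:
    print(check_startup(known, "Netheria", 0, "10.0.0.2"))
    # Same install under a name the router has never seen:
    print(check_startup(known, "NetheriaX", 0, "10.0.0.2"))
```

Under these assumptions a brand-new name always gets in, which is why renaming the mud is a quick way to tell a name conflict from a real connection problem.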

-Crat
Title: Re: Intermud-3 router status
Post by: Atomic on May 22, 2007, 12:22:57 pm
You're right, changed to NetheriaX and switched routers back to Yatmim, did the trick.
Scared a little at first due to the ansi-stuff scrolling by, but that was probably just to
retrieve all other router-info (the other muds).

You've stated it in the FAQ as well, I should have searched better.  :-[

On a side-note: any chance that names get purged after a while, so that the original name I
chose can reconnect from a new ip, or does that have to be done manually by the Yatmim-admin?
Title: Re: Intermud-3 router status
Post by: cratylus on May 22, 2007, 12:28:45 pm
The "reserved" status of mud names is supposed to drop after 10 days or so of
a mud not reconnecting. However, send me a tell sometime tonight and I'll reset
Netheria for you.
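
The expiry itself could look something like the Python sketch below. The 10-day figure comes from the post ("10 days or so"); the function name, the last_seen table and the use of Unix timestamps are assumptions for illustration.

```python
RESERVATION_SECONDS = 10 * 24 * 60 * 60   # "10 days or so", per the post

def purge_stale(last_seen, now):
    """Drop names not seen within the window.

    last_seen maps mud name -> Unix time of its last connection.
    Returns the list of names that were released.
    """
    stale = [name for name, seen in last_seen.items()
             if now - seen > RESERVATION_SECONDS]
    for name in stale:
        del last_seen[name]
    return stale

if __name__ == "__main__":
    table = {"Netheria": 0, "Fresh": 900_000}
    print(purge_stale(table, 1_000_000), table)
```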

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on May 23, 2007, 09:07:59 am
I'll be upgrading the libs on the primary and secondary routers today and tomorrow.

You may experience occasional service disruptions. Please bear with me.

If you cannot connect at all, even after switching to the alternate router,
please send me a PM to let me know.

Thanks.

-Crat

Title: Re: Intermud-3 router status
Post by: Atomic on May 23, 2007, 09:41:25 am
Quote
The "reserved" status of mud names is supposed to drop after 10 days or so of
a mud not reconnecting.

Ah, so there is a timer-something, good to know.

Quote
However, send me a tell sometime tonight and I'll reset
Netheria for you.

Not that life-threatening at the moment, but thanks. :D
Title: Re: Intermud-3 router status
Post by: cratylus on May 31, 2007, 06:17:04 pm
I've upgraded both *yatmim and *i4 and fixed various IRN problems.

I think it should work just fine, but you never know. Please post if you
have connection or communication problems.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on June 01, 2007, 06:40:20 am
Looks like the upgrade somehow clobbered non-default channels and I didn't notice.

They should be ok now...please let me know if they're still hosed.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on June 15, 2007, 10:00:36 am
I've been running into some troubles with the *i4 router.

I'm working on them right now, and hope to have them resolved soon.

*i4 is the twin node of *yatmim. Normally it doesn't matter which you connect
to, you are on the same network.

However, if you're connected to *i4, you might experience some lack
of access to intermud today. My apologies for the inconvenience. If this
is problematic for you, please switch to *yatmim instead.

On a *totally* unrelated note, Arren's router, *adsr, is also having
some trouble. If you are willing to abide by the *yatmim router rules,
you're welcome to hang out on *yatmim til Arren resolves his issues.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on June 16, 2007, 09:20:44 am
Bad week for routers! *i4 is ok now, but the *yatmim computer is choking on
a bad disk, causing crashes and performance lag. I've been meaning to upgrade
that box anyway, and it looks like this is the Saturday to do it. *yatmim will
be completely unavailable today for a few hours.

Sorry for any inconvenience.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on June 16, 2007, 05:07:03 pm
Ok, the *yatmim box has had the failing disk replaced, and was updated with the
current version of the OS, as well as generally patched and massaged and
fed some grapes and honey.

It will need another couple of reboots following further patching, which
will happen today and tomorrow, but other than that, everything should be
mostly cool. Please let me know if you're having trouble connecting.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on June 18, 2007, 08:10:21 am
The maintenance that began this weekend will be finishing today, 18 June.

*yatmim will reboot twice in order to implement some groovy new patches. Please forgive
the inconvenience.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on July 07, 2007, 06:05:51 am
Normally this thread is all about this problem or that problem.

For a change, I'm reporting that the status of the routers is good.

*yatmim and *i4 are now graced with code that allows:

- efficient sharing of IRN data
- proper handling of null or errored sockets
- implementing of new code without having to drop connections (a separate socket daemon handles connections)
- joy and happiness

Well, the joy and happiness are mine really :)

I've discovered that the router code is now at a point that is so stable that
downtime and problems occur not because of network/code bugs, but
because of hardware failures or because I manually intervened in something
while deep into my pints.

As long as I keep my grubby hands off the throttle and no more disk
or campus network failures occur, I expect the routers to chug along
happily for weeks or months at a stretch.

So, all is good. The router code is still undocumented, so if you're looking to start one,
you'll need to grit your teeth and digest the code on yer own, using the
specs at intermud.org to light your way. However, the upside is that the code
works real good.

-Crat

PS note that the most current router code is in the alpha release

Title: Re: Intermud-3 router status
Post by: cratylus on October 27, 2007, 09:51:30 am
More "no news"!

Forgive me the self-congratulation. But for the history
of yatmim, announcements have almost invariably been
about bad things and failures...and it's just so nice that
such things are now so infrequent.

The *yatmim node has been up without interruption
now for: 7w 18h 12m 24s

And the *i4 node for: 10w 3d 13h 37m 17s

At the risk of jinxing things, I just wanted to do a little
public dance here to celebrate this record.

Just as a heads-up, though, I will probably be doing
some updating of how the router handles old muds. I
don't expect this new system to screw things up, but hey,
you know me. *I* might screw it up. I'll be testing this
weekend on a lab router pair to minimize the
downtime risk.

The intent is to have the router notify muds when an old mud
is removed. This involves your mud receiving a mudlist
with a data packet of: 0

I believe most i3 implementations handle this just fine, but
I thought it was worth bringing it up, in case folks have
homebrew implementations ill-equipped to handle this behavior.
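
For homebrew clients, handling this mostly comes down to treating a 0 in the mudlist data as "forget this mud" instead of trying to index into it. A minimal Python sketch of that idea follows; it assumes the mudlist payload arrives as a name -> data mapping in which 0 marks a removed mud, and the function name is made up.

```python
def apply_mudlist(mudlist, info):
    """Merge a mudlist delta into the local mudlist dict.

    info maps mud name -> data, where data == 0 means the mud was removed.
    Returns the names that were actually dropped from the local list.
    """
    removed = []
    for name, data in info.items():
        if data == 0:
            # Removal notice: drop the entry if we have one, ignore otherwise.
            if mudlist.pop(name, None) is not None:
                removed.append(name)
        else:
            mudlist[name] = data
    return removed

if __name__ == "__main__":
    local = {"OldMud": ["some", "info"], "Keep": ["x"]}
    delta = {"OldMud": 0, "NewMud": ["y"], "NeverHeardOfIt": 0}
    print(apply_mudlist(local, delta), local)
```

The case worth testing in a real client is the last one in the demo: a removal notice for a mud the client never had should be a no-op, not an error.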

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on November 02, 2007, 09:38:04 am
Quote
At the risk of jinxing things, I just wanted to do a little
public dance here to celebrate this record.

Total, total jinx!

I spent the last two days carefully testing and re-testing the
new router code on isolated muds, in a "lab environment".

Once I was sure everything was perfect, I slowly rolled it out
into live production as I had practiced, and mirabile dictu,
*yatmim and *i4 survived the upgrade without a hiccup...no
connections lost, and the new functionality in place.

GREAT SUCCESS! Oh with what joy and smugness did I retire to sleep last night.

Then this morning, the building *yatmim is in had a power failure.

Just shows to go ya.

Anyway, it's back up now.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on November 24, 2007, 07:28:35 am
It looks like *yatmim went down this morning. Not sure why.

Lately the building *yatmim is in has been going through
power outages. My wild guess is that with winter approaching,
the campus maintenance folks are testing generators and
whatnot. I say this because these outages were very rare
in the summer and spring.

Anyway, I'm going to jump in my car today and take a ride
down to the *yatmim server room (read: forgotten departmental closet)
and see wtf is going on.

In the meantime, if you can't bear being isolated from your
precious intermud community, you can switch your router connection
to the *i4 router. *i4 is the twin irn node for *yatmim, meaning that
muds connected to one seamlessly communicate with muds connected to
the other. Because *i4 is on a paid hosting network with a hugalargefat
network pipe, it's actually quite a lot faster and more stable
than the *yatmim node. At some point I'll be encouraging folks
to migrate from *yatmim to *i4 anyway, and this might be a good time
for folks to look into that.

The *i4 connection info is:

*i4 204.209.44.3 8080

-Crat

PS I should've kept my fool mouth shut about the uptime. Talk
about tempting the fates.
Title: Re: Intermud-3 router status
Post by: cratylus on November 24, 2007, 10:14:09 am
Turns out I didn't need to do anything. The *yatmim box and
the i3 server software hadn't gone down at all. Whatever prevented
network access to it wasn't apparent. My guess is some sort
of networking problem on campus that the IT people fixed before
I got to intervene.

Be that as it may, this is a good opportunity to bring to folks'
attention that in general the *i4 node is much more stable
and much faster than *yatmim, and it is a peer to *yatmim, so
you'll be talking with exactly the same muds/channels as before.

Especially if you've run into connection problems with *yatmim
(which is a weak machine on a weak network), you'll find the
robustness of *i4 pleasing.

The connection info is:

*i4 204.209.44.3 8080

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on April 16, 2008, 09:18:46 am
I hate to do it but it looks like there are some deficiencies in channel
handling that need to be addressed in the routers. I will be rolling
out changes to the routers late tonight and also tomorrow. You may
be disconnected or experience weirdness. Please forgive any inconvenience.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on April 16, 2008, 10:23:50 pm
The *major* overhaul is complete. At first it looked like both nodes were happy, but
I'm sorry to say that *i4 crashed mysteriously about a half hour after its upgrade >:|

Fortunately, *yatmim, its sister node, seems to have survived the upgrade
without crashing, and has thus far maintained its 17 week uptime (woot).

Both nodes are done with the "big stuff" upgrades. There's some tweaking that
will be going on over the next few weeks, but this should not risk disconnections.

Thank you for your kind patience.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on May 09, 2008, 07:12:55 am
Looks like someone decided to see what happens when
you DOS *yatmim, and sure enough, it denies service.

If you're having intermud trouble, you can try switching to *i4 rather
than using *yatmim, which will be down for a little while this
morning while I tidy things up.

*i4 204.209.44.3 8080

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on May 10, 2008, 08:00:59 pm
*yatmim is back up. I haven't yet had a chance to thoroughly
investigate what happened, but it appears to involve
someone sending lots of packets doing unusual things.

I'll be poking at the logs and such to figure out what
happened. If I am able to share details, I will do so in
the next few days.

In any case, it was a good excuse to upgrade the
*yatmim computer to a more robust config, so rather
than a wheezing old 300MHz server, it's on an
old-but-still-spry dual 750MHz machine.

I still recommend people get used to the idea of moving to
*i4 permanently. Eventually my old friend on the faculty
will retire, and I'll have no explanation for the university
police as to why I need to be on campus, working on *yatmim.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on August 17, 2008, 10:51:38 am
The server that the *i4 router is on had some trouble this morning,
and rebooted. As far as I can tell, all is ok now, so your intermud
connection should be able to resume.

The neat thing is that the uptime shortly before this happened was:

Quote
Router socket daemon uptime: 17w 1d 12h 58m 36s

And this interruption wasn't because of router code, but
rather server stuff beyond my control. Hurray for stable router code!

-Crat
Title: Re: Intermud-3 router status
Post by: Tricky on August 17, 2008, 06:24:35 pm
Stable? Are you willing to bet your reputation on that?  ::)

Tricky
Title: Re: Intermud-3 router status
Post by: cratylus on September 09, 2008, 11:47:25 am
Quote
Stable? Are you willing to bet your reputation on that?

:(

Today a nasty thunderstorm swept through town and *yatmim got
knocked offline for a bit, and in a strange way. It should be up
and ok now.

This is as good a time as any to announce that *yatmim is no longer
the official primary router of the LPMuds.net intermud network. If
your Intermud-3 client is pointing at *yatmim, please take the time
to change the connection to *i4. It is faster, more reliable, and has
more bandwidth.

Please make sure your configuration points to the correct name and
ip and port. The #1 cause of problems when switching routers is
leaving the old name in the name field. This is not good. This is bad.
Change all three:

name: *i4
ip: 204.209.44.3
port: 8080

Let me know if you have problems I can help with.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on September 09, 2008, 12:24:08 pm
Hmm...actually it looks like the network *yatmim is on might be a bit screwy.

You may not be able to connect to yatmim for a little while.

All the more reason to switch to *i4

:)

-Crat
Title: Re: Intermud-3 router status
Post by: Tricky on September 09, 2008, 06:23:41 pm
All the more reason for admins to implement alternate routers given in the startup-reply packet.

Tricky
Title: Re: Intermud-3 router status
Post by: cratylus on September 10, 2008, 11:56:18 am
Quote
All the more reason for admins to implement alternate routers given in the startup-reply packet.

This is a good idea and I've added code to the router to
make it so. It is not yet enabled, though.

The reason I hesitate is that Dead Souls is not 100% compliant
with the spec's intent of "disconnect and reconnect to the
preferred router if it differs from the current connection."

http://ebspso.dnsalias.org:6061/intermud3.lpc
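
On the client side, the behaviour in question might be sketched like this in Python. It rests on an assumption about the startup-reply layout: that the router list is an array of (name, "ip port") pairs with the preferred router first. Check the spec before relying on that. The function names and the placeholder second router are hypothetical.

```python
def preferred_router(router_list):
    """router_list: list of [name, "ip port"] pairs, preferred router first."""
    name, address = router_list[0]
    ip, port = address.split()
    return name, ip, int(port)

def should_switch(current, router_list):
    """True when the router we are connected to is not the preferred one.

    current is a (name, ip, port) tuple for the live connection.
    """
    return bool(router_list) and preferred_router(router_list) != current

if __name__ == "__main__":
    routers = [["*i4", "204.209.44.3 8080"],
               ["*example", "192.0.2.10 9000"]]   # second entry is a placeholder
    print(should_switch(("*example", "192.0.2.10", 9000), routers))
    print(should_switch(("*i4", "204.209.44.3", 8080), routers))
```

Note that the comparison includes the router name as well as ip and port, since all three have to change together when switching.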

I suspect that other libs, written at a time when multiple
routers were more fantasy than reality, are similar.

It's trivial for me to drag the Dead Souls people into
compliance, but I'm wary of dumping a New Way into
the laps of folks running clients from 1996.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on September 24, 2008, 12:27:00 pm
So I'm in my car, *fuming*, roaring to campus,
and I'm thinking about all the choice words I'll
have for the slackjawed maintenance guy who
unplugged yatmim without calling me like the
sign says to do, or to catch the idjit student
who thinks he's stealing a computer that will
run his favrit video game, and I notice that
traffic all of a sudden is crazy, and the
traffic lights are out. And the closer I get to
campus, the harsher and more acrid the air
smells, and in the parking lot there are
emergency vehicles and people running around...
so I turned around and drove back, ashamed.

I dunno how long yatmim/dead-souls.net will be down. Let's
just hope everyone's ok.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on September 24, 2008, 01:17:33 pm
Apparently the fire wasn't on campus but elsewhere...the emergency
response was due to folks stuck in elevators...as far as I know
everyone is ok :)

Also yatmim is back up, as well as dead-souls.net.

Please remember folks that yatmim is no longer the primary
i3 intermud router. Switch to i4 at your earliest convenience.

name: *i4
ip: 204.209.44.3
port: 8080

(make sure you change the name too, not just ip and port. it's important!)
Title: Re: Intermud-3 router status
Post by: Aidil on October 19, 2008, 12:44:43 pm
A new router has been added to the intermud 3 network:

name: *wpr
ip: 195.242.99.94
port: 8080

It is connected to *yatmim and *i4 and such through irn, and carries the same channels
as those do (also with the same rules obviously).

The primary purpose is to increase the redundancy of the i3 network (secondary purpose,
I kept myself entertained by writing the code for this..)

This router uses a different codebase than Cratylus' routers, it was written by me from scratch
and the irn code was added in the last 2 weeks. All may be considered to be 'in testing' for now.

Because it uses a new codebase, it may behave differently than *yatmim or *i4 do. If this
causes compatibility issues for you, I'd like to hear about it, just like any other problems you may
encounter. You can contact me on i3 as aidil@Way of the Force, or by email as aidil@wotf.org

Oh, 'filtered channels' won't work on my router, and they do not work properly between routers
on the intermud 3 network at all right now.

Aidil.
Title: Re: Intermud-3 router status
Post by: Aidil on October 20, 2008, 04:15:46 am
Some info on the differences between my router (*wpr) when compared to *i4 or *yatmim

- mudlist updates
  When a mud goes offline unexpectedly, or after having sent a shutdown packet with a
  restart_delay of less than 5 minutes, *wpr will not send out an update immediately, rather
  it will wait approximately 5 minutes for the mud to return. If the mud returns within this time,
  and nothing else changes (ie, the mud info is the same), you will not get an update at all.

- mudlists are sent one mud at a time.

- ucache-update packets are only sent to muds that support the ucache service

- different error messages, and at times different errors
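
The delayed mudlist update described in the first item can be sketched roughly as below. This is only an illustration in Python, not *wpr's actual code: the class and method names are made up, restart_delay is assumed to be in seconds, and a restart_delay of 5 minutes or more is assumed to be announced right away.

```python
GRACE_SECONDS = 5 * 60  # "approximately 5 minutes" from the post


class MudlistTracker:
    """Hypothetical sketch of delayed offline announcements; not *wpr's code."""

    def __init__(self):
        self.announced = {}   # mud name -> info last sent to clients
        self.pending = {}     # mud name -> time the connection was lost

    def mud_offline(self, name, now, restart_delay=None):
        """Return True if an offline update should be sent right away."""
        # Assumption: a shutdown packet promising a restart in 5 minutes or
        # more is announced immediately; unexpected drops (restart_delay is
        # None here) and short restarts wait for the mud to come back.
        if restart_delay is not None and restart_delay >= GRACE_SECONDS:
            self.announced.pop(name, None)
            return True
        self.pending[name] = now
        return False

    def mud_online(self, name, info, now):
        """Return True if clients need a mudlist update for this mud."""
        lost_at = self.pending.pop(name, None)
        returned_in_time = lost_at is not None and now - lost_at <= GRACE_SECONDS
        unchanged = self.announced.get(name) == info
        self.announced[name] = info
        # Suppress the update only if the mud came back in time, unchanged.
        return not (returned_in_time and unchanged)

    def expire(self, now):
        """Return muds whose grace period ran out; these get announced offline."""
        expired = [n for n, t in self.pending.items() if now - t > GRACE_SECONDS]
        for name in expired:
            del self.pending[name]
            self.announced.pop(name, None)
        return expired
```

A router loop would call expire() periodically and send an offline update for each name it returns.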

For those interested, documentation for the inter router network protocol is at

http://wotf.org/i3/irn/v1/

Title: Re: Intermud-3 router status
Post by: Aidil on October 25, 2008, 08:58:24 am
It seems *wpr is working very well, absolutely nothing unexpected happened after
fixing a few initial (and mostly cosmetic) glitches. This turned out to be a lot less
hairy than I had expected.

This means that testing is over now, and it should be considered 'production'.

That said, if you are using a custom i3 client, it might be a good idea to check whether it
works with *wpr now instead of waiting until *i4 fails for some reason. It would
also be good to know how things work out with a lot more muds connected.

Aidil.
Title: Re: Intermud-3 router status
Post by: Aidil on October 27, 2008, 04:58:36 pm
The network *wpr is on is experiencing some routing problems. Those should get fixed in the coming hours.
Title: Re: Intermud-3 router status
Post by: Tricky on October 27, 2008, 08:38:15 pm
I did a quick test with AFK 1.7 last week on Aidil's test server and production server and found no problems. The only problem I found was who-req didn't quite work properly (rwho'ing an AFK mud). Actually I didn't get a response.

Tricky
Title: Re: Intermud-3 router status
Post by: Aidil on October 28, 2008, 07:30:52 am
Curious, I don't do anything special wrt rwho-requests.

Maybe when you catch me online, we can look into this some more.

Title: Re: Intermud-3 router status
Post by: Aidil on October 29, 2008, 05:22:05 am
Investigating a little bit further, I think the rwho 'issue' is related to a limitation of the current irn implementation.

Currently, when a mud connects to more than one router (either through the imc2 bridge or through i3), the last
router that the mud connects to becomes authoritative.

When the connection between the mud and this router drops, the router reports the mud offline, despite it also
being connected to another router still.

The consequence is that the mud will still receive broadcasts and can still send out packets. It will not receive
directed packets however (at least not when sent by a mud connected to another router).

Seeing that RtH-AFK connects to the imc2 bridge and to *wpr, and seeing that it was still connected to *wpr while being offline
according to the mudlist, I strongly suspect that this explains the problem Tricky reported.

I changed *wpr such that if a mud gets reported as offline by another router while *wpr has a direct
connection to that mud, *wpr will send out an update to the other routers reporting the mud as online and connected
to *wpr.

Note that this is only implemented on *wpr, and only to test if this is a workable solution.
At some later time, Cratylus and I will have to decide on a network wide 'fix' for this.
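
The workaround described above amounts to a small decision rule on the router. A minimal sketch in Python follows, assuming hypothetical names throughout (handle_irn_update, the dictionary layout, and the example router names are illustrative and not taken from the irn protocol or *wpr's code):

```python
def handle_irn_update(local_router, directly_connected, mud, reported_online, reporter):
    """Decide how to react to a mudlist update received from a peer router.

    Hypothetical sketch: returns the correcting update to send to the other
    routers, or None if the peer's report can be accepted unchanged.
    """
    if reporter == local_router:
        return None  # our own update, nothing to correct
    if not reported_online and mud in directly_connected:
        # A peer says the mud is gone, but we still hold a direct connection:
        # report it as online and connected to us, so directed packets
        # are routed through this router again.
        return {"mud": mud, "online": True, "router": local_router}
    return None


# Example: another router reports RtH-AFK offline while *wpr still has it.
correction = handle_irn_update("*wpr", {"RtH-AFK"}, "RtH-AFK", False, "*i4")
```

Here correction holds the "online, connected to *wpr" report; in the other cases the function returns None and the peer's report stands.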
Title: Re: Intermud-3 router status
Post by: cratylus on December 06, 2008, 08:03:29 am
*i4, *yatmim, and *dalet will receive some code upgrades this
weekend. Hopefully this will not incur service interruptions, but
it is possible that you may be disconnected a couple of times.

Note that Aidil's node ( *wpr 195.242.99.94 8080 ) has no such
maintenance scheduled, so if you're connected to it you shouldn't
experience any interruptions this weekend.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on January 21, 2009, 04:56:43 pm
*yatmim is down. I won't be able to investigate until tomorrow, so expect
a 24-hour outage. If you were connected to *yatmim, please take this
opportunity to switch your i3 network connection settings to one of
the two supported routers:

*wpr 195.242.99.94 8080

*i4 204.209.44.3 8080

-Crat

EDIT: Looks like it's available again. Probably a problem on the
school network, since the server did not go down. In any case,
this is a good time for folks to switch.
Title: Re: Intermud-3 router status
Post by: cratylus on April 06, 2009, 11:44:18 am
Investigating some anomalies on the intermudses today, I tested
a thing that made everything fall down. :( sry!

So if you've had weird problems with your intermud connection today, that's probably why.

Good news is that the LPMuds.net intermud network hubs are now all running on
the same (and latest) driver and lib versions, and the problems I saw earlier
today seem not to be recurring.

-Crat
Title: Re: Intermud-3 router status
Post by: cratylus on April 14, 2009, 10:15:48 pm
Wolfpaw recently updated the server that *i4 is on.

Unfortunately this appears to have lowered the threshold for some kind
of resource monitor, and it's causing *i4 to reboot every half hour.

:(((

I'll be working with the Wolfpaw admin to resolve this. In the meantime,
if this is causing you problems, try connecting to Aidil's server instead:

*wpr 195.242.99.94 8080
Title: Re: Intermud-3 router status
Post by: Aidil on May 13, 2009, 07:39:45 am
*wpr is currently having networking problems, and is unreachable. Network people are working on the issue, and it is expected this will be resolved in the coming hours.

Aidil
Title: Re: Intermud-3 router status
Post by: wodan on May 13, 2009, 08:15:32 am
so, reading the two previous posts...
should we switch to *yatmim?
Title: Re: Intermud-3 router status
Post by: cratylus on May 13, 2009, 11:15:48 am
Quote
so, reading the two previous posts...
should we switch to *yatmim?

Generally when I post in this thread it will be to warn folks of some problem
or other. Sometimes I do announce fabulous uptime for a router, but this seems
to do nothing but jinx things.

I guess that a person could come away from this thread thinking that
there's nothing but problems.

However, generally speaking, *i4 and *wpr are pretty reliable, and generally speaking,
it's a good thing to have a thread people can go to in case the occasional
service interruption needs explanation.

People should absolutely not move to *yatmim, and in fact they need to get
themselves moved over to *i4 or *wpr ASAFP. My buddy oncampus is retiring
soon, and without his academic aegis, *yatmim won't be long for this world.

This is a good time for folks to heed my longtime warning:

Quote
please take this opportunity to switch your i3 network connection settings
to one of the two supported routers:

*wpr 195.242.99.94 8080

*i4 204.209.44.3 8080

-Crat
Title: Re: Intermud-3 router status
Post by: Aidil on May 13, 2009, 01:18:17 pm
Just wanted to mention that *wpr is back up.

With regards to switching to *yatmim, I think Cratylus explained quite well why you shouldn't.

Then, 2 routers on entirely different networks (not to mention, servers), running an entirely different codebase will very seldom be down at the same time, but both of them will be down at times due to all kinds of things. This is why there are multiple routers to begin with.

So you should just ensure your client supports switching between routers, and preferably, does so automatically. If you rely on a single router, you will on occasion not have I3 access simply because NOTHING will be up 100% of the time, it's theoretically impossible :)
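
A rough idea of what automatic switching could look like, as a Python sketch rather than real client code. The router names, IPs and ports are the ones posted in this thread; the function name and the injectable connect parameter are assumptions made for illustration.

```python
import socket

# Router entries as posted in this thread: (name, ip, port).
ROUTERS = [
    ("*i4", "204.209.44.3", 8080),
    ("*wpr", "195.242.99.94", 8080),
]


def connect_to_first_available(routers, timeout=10.0, connect=socket.create_connection):
    """Try each router in order; return (name, connection) for the first that answers.

    `connect` is injectable (hypothetical design choice) so the failover
    logic can be exercised without touching the network.
    """
    errors = []
    for name, ip, port in routers:
        try:
            return name, connect((ip, port), timeout)
        except OSError as exc:
            # Remember why this router failed and move on to the next one.
            errors.append(f"{name}: {exc}")
    raise ConnectionError("no router reachable: " + "; ".join(errors))
```

The returned name matters as much as the connection: as noted earlier in the thread, the client has to use the matching router name, not just the new ip and port.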
Title: Re: Intermud-3 router status
Post by: Aidil on September 24, 2009, 08:07:43 am
Thursday September 24th starting at 10pm (timezone CEST) the *wpr router will be unavailable due to hardware maintenance. This maintenance is expected to last up to 2 hours.
Title: Re: Intermud-3 router status
Post by: cratylus on May 18, 2013, 02:41:41 pm
Looks like the Wolfpaw server I use for i3 is down. I've sent Dale a request to look into it. Hopefully it gets cleared up this weekend.
Title: Re: Intermud-3 router status
Post by: quixadhal on May 19, 2013, 02:42:26 am
SEEEE????  MUD's R dying!!!!!! *grin*