Author Topic: DS3.8 & FluffOS 2.27  (Read 4784 times)

Offline skout23

DS3.8 & FluffOS 2.27
« on: July 24, 2013, 07:42:12 AM »
I tend to use my mud as a glorified chat client, and mostly to toy around with stuff as it piques my interest.  I took the plunge again a couple of weeks back and got DS3.7a up with FluffOS 2.27, and it was stable.  Since I am pretty much stock DS, when an upgrade comes out I tend to just drop the lib in and restart the driver.  I did so again yesterday when I saw DS3.8 was out, and have had a couple of crashes since.  Here is the stack trace:

...
Running autoexec, please wait...
Autoexec daemon run complete. (52ms)

******** FATAL ERROR: couldn't find object PW? in obj_table
FluffOS driver attempting to exit gracefully.
(current object was /secure/sefun/sefun)
--- trace ---
Object: /secure/daemon/instances, Program: /secure/daemon/instances.c
   in heart_beat() at /secure/daemon/instances.c:677
Object: /secure/daemon/instances, Program: /secure/daemon/instances.c
   in CheckConnections() at /secure/daemon/instances.c:637
Object: /secure/sefun/sefun, Program: /secure/sefun/sefun.c
   in socket_names() at /secure/sefun/sockets.c:25
'          CATCH' in '/secure/sefun/sefun.c' ('/  secure/sefun/sefun') /secure/sefun/sockets.c:25
--- end trace ---
*** glibc detected *** /home/skout/Servers/DS/Obscurum/bin/driver: malloc(): smallbin double linked list corrupted: 0x00000000028ab570 ***

I looked for any changes in instances.c, sefun.c & sockets.c from 3.6 -> 3.8 via a github repo which tracks DS, but found none that I could tell.  All that was happening was me idling; the mud client (tintin++) and the mud are on the same box.  Am I just hitting some new idle timeout? Should I revert to an earlier approved version of FluffOS or vault into the 3.0 tags?


Thanks,
Scott

Offline cratylus

Re: DS3.8 & FluffOS 2.27
« Reply #1 on: July 24, 2013, 08:05:53 AM »
Could be that you would have hit this on 3.7 as well eventually.

My default answer is to use the driver the lib comes with, but if you really want to sleuth it out, look at all the diffs in the driver. I vaguely recall this issue being dealt with in the driver.

-Crat

Offline cratylus

Re: DS3.8 & FluffOS 2.27
« Reply #2 on: July 24, 2013, 03:17:57 PM »
Update: I reproduced this on 2.23. I'm now running a debug driver and will work with Kalinash to figure out what it is, next time it happens.

-Crat

Offline FallenTree

Re: DS3.8 & FluffOS 2.27
« Reply #3 on: July 25, 2013, 10:26:56 AM »
Memory issues are best detected by running the driver under valgrind.

valgrind --leak-check=full --track-origins=yes ./driver config

And, of course, a core dump will be useful too.

However, there are a couple of memory corruption issues that are fixed in 3.0 but not backported to 2.27. If valgrind manages to find the problem, maybe we could do a 2.28 release.

Offline skout23

Re: DS3.8 & FluffOS 2.27
« Reply #4 on: July 25, 2013, 06:05:13 PM »
am using the following valgrind cmd:

valgrind --leak-check=full --track-origins=yes --fullpath-after= ./driver mudos.cfg

will post anything I find.

Offline FallenTree

Re: DS3.8 & FluffOS 2.27
« Reply #5 on: July 25, 2013, 06:50:24 PM »
You could also add --db-attach=yes --malloc-fill=0x77 --free-fill=0xff, etc.,

which should make it clear whether a memory issue is at play.

The other option is:

Compile with LLVM (clang) 3.3+ and add -fsanitize=address, which will produce a binary that runs faster than under valgrind but still complains loudly on memory errors.

Offline cratylus

Re: DS3.8 & FluffOS 2.27
« Reply #6 on: July 26, 2013, 08:02:11 AM »
Kalinash analyzed the core I dumped and found the following suspicious line in socket_efuns.c :

Code: [Select]
if (!(lpc_socks[which].flags & STATE_FLUSHING) && lpc_socks[which].owner_ob && !(lpc_socks[which].owner_ob->flags & O_DESTRUCTED)) {
Per his recommendation I've changed it to:

Code: [Select]
if (lpc_socks[which].state != STATE_FLUSHING && lpc_socks[which].owner_ob && !(lpc_socks[which].owner_ob->flags & O_DESTRUCTED)) {
And we'll see how that goes.

-Crat

Offline FallenTree

Re: DS3.8 & FluffOS 2.27
« Reply #7 on: July 26, 2013, 08:25:46 AM »
Nah, that code is correct...

Can you reproduce it on 3.0alpha6.4?

Offline cratylus

Re: DS3.8 & FluffOS 2.27
« Reply #8 on: July 26, 2013, 08:35:13 AM »
Quote
STATE_FLUSHING is a state, not a flag
trying to use it as a bit field mask was wrong


As for reproducing it on 3.0, I'm not using 3.0.

-Crat

Offline FallenTree

Re: DS3.8 & FluffOS 2.27
« Reply #9 on: July 26, 2013, 10:47:44 AM »
Well, state is a bitmask; using & or = makes no difference.

What is wrong with 3.0?

Offline quixadhal

Re: DS3.8 & FluffOS 2.27
« Reply #10 on: July 26, 2013, 12:02:41 PM »
Not quite correct.

If you'll note, the original line was looking at the socket's "flags" element, and seeing if STATE_FLUSHING was set.  The new line is looking at the socket's "state" element, and seeing if it is anything other than STATE_FLUSHING.

One assumes other code is supposed to ensure the "flush" bit in the flags element and the state stay in sync.

Offline FallenTree

Re: DS3.8 & FluffOS 2.27
« Reply #11 on: July 26, 2013, 12:15:01 PM »
I will have to look at the code to tell you whether there are combined states.

However, the crash looks like a missed object reference somewhere. If there are custom efuns, I would look at those first.

I put a lot of effort into fixes between 2.23 and 2.27, so I have a fair bit of confidence that things are accounted for there.

Offline zmax

Re: DS3.8 & FluffOS 2.27
« Reply #12 on: November 29, 2013, 07:36:01 PM »
I upgraded the driver from 2.23 to 2.27 recently
and found a problem in the verb system.
If I write SetRules("WRD"), the driver crashes,
but with SetRules("STR") it does not.

So if I type "drop 100 silver" in game, the driver crashes.
I tried a quick fix:

/verbs/items/drop.c:
Code: [Select]
// SetRules("OBS", "WRD WRD");
SetRules("OBS", "STR STR");
...
//mixed can_drop_wrd_wrd(string num, string curr) {
mixed can_drop_str_str(string num, string curr) {

/verbs/items/include/drop.h:
Code: [Select]
// mixed do_drop_wrd_wrd(string amt, string curr);
mixed do_drop_str_str(string amt, string curr);

It works now.

So, is the "WRD" rule recommended?
And can I just use "STR" instead of "WRD"?

sorry, my english is poor :)

p.s. driver 2.6.1 has no crash issue like that. only driver 2.7
« Last Edit: November 29, 2013, 07:44:47 PM by zmax »

Offline FallenTree

Re: DS3.8 & FluffOS 2.27
« Reply #13 on: November 29, 2013, 09:07:40 PM »
You have to paste the stack trace for me to help you.

Please attach the crash message, or a GDB backtrace.

Offline FallenTree

Re: DS3.8 & FluffOS 2.27
« Reply #14 on: December 01, 2013, 01:21:34 PM »
Or, if you can, produce a small piece of LPC code that reliably crashes the driver.

Cheers.