Topic: Strange behavior in 3.2

Offline wodan

  • BFF
  • ***
  • Posts: 434
  • Drink and code, you know you want to!
Re: Strange behavior in 3.2
« Reply #45 on: June 10, 2011, 11:48:15 am »
Yeah, those settings are unlikely to be the problem. dw uses USE_32BIT_ADDRESSES and arrays and mappings of size 300000, and it doesn't crash all that often.

Offline Maze of Ith

  • Acquaintance
  • *
  • Posts: 33
  • Sometimes nothing can be a really cool hand.
Re: Strange behavior in 3.2
« Reply #46 on: June 11, 2011, 11:57:03 pm »
I added the same code as quixadhal (the additional catch), and that seemed to help. Before, I was crashing every 12-24 hours. Since adding the code he mentioned, the mud has stayed up at least 24 hours between crashes, so far never less. Here is my info as well:

Ubuntu 10.10
Kernel 2.6.35-22-server (SMP)
single 2.66 CPU
1GB RAM

Cheers!
Zed
zed @ looney2.com 8888
zed <at> lilypadmudlib <dot> com

Offline Nulvect

  • BFF
  • ***
  • Posts: 127
Re: Strange behavior in 3.2
« Reply #47 on: June 21, 2011, 03:23:39 pm »
Reporting back in. After setting my mud's auto-reboot to over four days, not a single crash or error. We've had only light activity recently, but it sounds like more than the muds that have been crashing.

Offline Io

  • Acquaintance
  • *
  • Posts: 44
Re: Strange behavior in 3.2
« Reply #48 on: June 21, 2011, 03:42:18 pm »
I've upgraded to 3.4, going to see if that reduces any of the crashing I was experiencing

Offline yeik

  • Acquaintance
  • *
  • Posts: 18
Re: Strange behavior in 3.2
« Reply #49 on: June 25, 2011, 12:36:22 pm »
I got it to crash again with my breakpoint.
I think we might have to go further back somehow.

1459            ret->item[5].u.ob = lpc_socks[which].owner_ob;
1460            if(lpc_socks[which].owner_ob->ref +1 > 32000)
1461            {
1462              error("ref coutn too high!\n");
1463            }
1464            add_ref(lpc_socks[which].owner_ob, "socket_status");
1465        } else {
1466            ret->item[5] = const0u;
(gdb) print lpc_socks[which]
$1 = {fd = 10, flags = 6, mode = MUD, state = STATE_FLUSHING, l_addr = {sin_family = 2, sin_port = 34513, sin_addr = {s_addr = 0},
    sin_zero = "\000\000\000\000\000\000\000"}, r_addr = {sin_family = 2, sin_port = 21282, sin_addr = {s_addr = 1451584353},
    sin_zero = "\000\000\000\000\000\000\000"}, owner_ob = 0xbce0ba8, release_ob = 0x0, read_callback = {f = 0x0, s = 0x0},
  write_callback = {f = 0x0, s = 0x0}, close_callback = {f = 0x0, s = 0x0}, r_buf = 0x0, r_off = 0, r_len = 0, w_buf = 0x9d8a3c0 "",
  w_off = 62, w_len = 8}
(gdb) print which
$2 = 1
(gdb) print lpc_socks[which].owner_ob
$3 = (object_t *) 0xbce0ba8
(gdb) print *lpc_socks[which].owner_ob
$4 = {ref = 58568, flags = 2978, extra_ref = 199399136, obname = 0x0, next_hash = 0x0, next_ch_hash = 0x0, load_time = 1308998383,
  next_reset = -1926266329, time_of_ref = 1308998675, prog = 0x0, next_all = 0x0, prev_all = 0x0, next_inv = 0x0, contains = 0x0,
  super = 0x0, interactive = 0x0, replaced_program = 0x0, shadowing = 0x0, shadowed = 0x0, sent = 0x0, next_hashed_living = 0x0,
  living_name = 0x0, privs = 0x9c75878 "MUDLIBPRIV", stats = {domain = 0x9cf1840, author = 0x0}, pinfo = 0x0, variables = {{type = 2,
      subtype = 4, u = {string = 0x0, number = 0, real = 0, refed = 0x0, buf = 0x0, ob = 0x0, arr = 0x0, map = 0x0, fp = 0x0, lvalue = 0x0,
        ref = 0x0, lvalue_byte = 0x0, error_handler = 0}}}}
(gdb) bt
#0  socket_status (which=1) at socket_efuns.c:1462
#1  0x080d731f in f_socket_status () at sockets.c:334
#2  0x0806b071 in eval_instruction (p=0x9d1fbc2 "\003\233~c\003)c\002\023\017c\002\026\034") at interpret.c:3772
#3  0x0806b266 in do_catch (pc=0x9d1fbc2 "\003\233~c\003)c\002\023\017c\002\026\034", new_pc_offset=22236) at interpret.c:3825
#4  0x0806a855 in eval_instruction (p=0x9d1fbba "\021") at interpret.c:3659
#5  0x0806c735 in call_direct (ob=0x9c456c0, offset=73, origin=8, num_arg=0) at interpret.c:4587
#6  0x080ba744 in call_simul_efun (index=73, num_arg=0) at eoperators.c:1158
#7  0x0806a7b8 in eval_instruction (p=0xa19da8d "\020\024\b") at interpret.c:3636
#8  0x0806c735 in call_direct (ob=0xa159be0, offset=22, origin=1, num_arg=0) at interpret.c:4587
#9  0x08079249 in call_heart_beat () at backend.c:380
#10 0x0809c908 in call_out () at call_out.c:288
#11 0x08078e32 in backend () at backend.c:168
#12 0x0805ef26 in main (argc=2, argv=0xbfb61574) at main.c:440
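
The trap in the listing above can be tried out in isolation. What follows is only a minimal sketch of the idea, not the driver's code: `object_t` is cut down to a single field, and `checked_add_ref` and `REF_LIMIT` are hypothetical names invented for illustration. Only the 32000 threshold comes from the listing.

```c
#include <assert.h>
#include <stdio.h>

/* Threshold taken from the listing above; the name is hypothetical. */
#define REF_LIMIT 32000

/* Hypothetical stand-in for the driver's object_t: only the field used here. */
typedef struct {
    int ref;    /* reference count */
} object_t;

/*
 * Take a reference only if the current count looks plausible.
 * Returns 1 after incrementing, or 0 (count untouched) when the count is
 * negative or already at the limit, which suggests the pointer refers to
 * freed or overwritten memory.
 */
int checked_add_ref(object_t *ob, const char *from)
{
    assert(ob != NULL && from != NULL);
    if (ob->ref < 0 || ob->ref >= REF_LIMIT) {
        fprintf(stderr, "suspicious ref count %d in %s\n", ob->ref, from);
        return 0;   /* a convenient line for a debugger breakpoint */
    }
    ob->ref++;
    return 1;
}
```

With a breakpoint on the failure branch, the debugger stops while the socket entry and the suspect object can still be printed, which is how the dump above was obtained.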

Offline Io

  • Acquaintance
  • *
  • Posts: 44
Re: Strange behavior in 3.2
« Reply #50 on: June 25, 2011, 12:46:49 pm »
I've upgraded to 3.4, going to see if that reduces any of the crashing I was experiencing

So far it's been up for 4 days without incident, but I will feel more confident after it's been up for a couple of weeks.

Offline wodan

  • BFF
  • ***
  • Posts: 434
  • Drink and code, you know you want to!
Re: Strange behavior in 3.2
« Reply #51 on: June 27, 2011, 03:52:32 pm »
Thanks to yeik, I found the problem:

Code:
    if (lpc_socks[which].owner_ob && !(lpc_socks[which].owner_ob->flags & O_DESTRUCTED)) {
        ret->item[5].type = T_OBJECT;
        ret->item[5].u.ob = lpc_socks[which].owner_ob;
        add_ref(lpc_socks[which].owner_ob, "socket_status");
    } else {
        ret->item[5] = const0u;
    }
should be
Code:
    if (!(lpc_socks[which].flags & STATE_FLUSHING) && lpc_socks[which].owner_ob && !(lpc_socks[which].owner_ob->flags & O_DESTRUCTED)) {
        ret->item[5].type = T_OBJECT;
        ret->item[5].u.ob = lpc_socks[which].owner_ob;
        add_ref(lpc_socks[which].owner_ob, "socket_status");
    } else {
        ret->item[5] = const0u;
    }
in the socket_status efun
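
For anyone who wants to see the shape of the guard outside the driver, here is a minimal sketch. It rests on several assumptions: the structs are cut-down, hypothetical stand-ins for the driver's types, `owner_for_status` is an invented helper name, and the `O_DESTRUCTED` bit value and the enum members other than `STATE_FLUSHING` are made up for illustration. The sketch also models flushing as a `state` field, as the gdb dump earlier in the thread shows it, whereas the posted fix tests `flags`, so treat that choice as an assumption too.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bit value, for illustration only. */
#define O_DESTRUCTED 0x01

/* Hypothetical subset of socket states; only STATE_FLUSHING matters here. */
enum sock_state { STATE_CLOSED, STATE_FLUSHING, STATE_BOUND };

/* Hypothetical stand-in for the driver's object_t. */
typedef struct {
    int ref;      /* reference count */
    int flags;    /* may contain O_DESTRUCTED */
} object_t;

/* Hypothetical stand-in for one lpc_socks[] entry. */
typedef struct {
    enum sock_state state;
    object_t *owner_ob;    /* may be stale while the socket is flushing */
} lpc_socket_t;

/*
 * Return the owner and take a reference only when that is safe.
 * The state test comes first, so a possibly stale owner_ob is never
 * dereferenced for a flushing socket.
 */
object_t *owner_for_status(const lpc_socket_t *sock)
{
    assert(sock != NULL);
    if (sock->state != STATE_FLUSHING
        && sock->owner_ob != NULL
        && !(sock->owner_ob->flags & O_DESTRUCTED)) {
        sock->owner_ob->ref++;    /* plays the role of add_ref() */
        return sock->owner_ob;
    }
    return NULL;                  /* plays the role of const0u */
}
```

The order of the tests is the point: because `&&` short-circuits, the flushing check has to come before anything that reads through `owner_ob`.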

Offline quixadhal

  • BFF
  • ***
  • Posts: 642
    • WileyMUD
Re: Strange behavior in 3.2
« Reply #52 on: June 27, 2011, 04:35:01 pm »
Woot!

Good job guys!

Offline Sluggy

  • Friend
  • **
  • Posts: 91
    • Stellarmass
Re: Strange behavior in 3.2
« Reply #53 on: July 21, 2011, 09:41:27 pm »
Hmmm, got anything else?

I'm running a modified version of DS3.0. Originally I was using FluffOS 2.18 and have since upgraded to FluffOS 2.22 with the above fix. The really odd thing is that I hadn't changed anything in the mud (driver, mudlib, or otherwise) since last fall, but this crash didn't become a regular thing until March.

Code:
Stellarmass crashed Tue Jul 19 22:51:39 2011 with error Segmentation fault.

0:OBJ(/secure/sefun/sefun), file: /secure/sefun/sefun.c, fun: get_stack, origin: simul
1:OBJ(/secure/daemon/master), file: /secure/daemon/master.c, fun: crash, origin: driver
2:OBJ(/secure/sefun/sefun), file: /secure/sefun/sefun.c, fun: CATCH, origin: simul
3:OBJ(/secure/sefun/sefun), file: /secure/sefun/sefun.c, fun: socket_names, origin: simul
4:OBJ(/secure/daemon/instances), file: /secure/daemon/instances.c, fun: CheckConnections, origin: local
5:OBJ(/secure/daemon/instances), file: /secure/daemon/instances.c, fun: heart_beat, origin: driver
({ OBJ(/secure/sefun/sefun), OBJ(/secure/daemon/instances) })
---
Stellarmass crashed Wed Jul 20 11:46:48 2011 with error couldn't find object uq in obj_table.

0:OBJ(/secure/sefun/sefun), file: /secure/sefun/sefun.c, fun: get_stack, origin: simul
1:OBJ(/secure/daemon/master), file: /secure/daemon/master.c, fun: crash, origin: driver
2:OBJ(/secure/sefun/sefun), file: /secure/sefun/sefun.c, fun: CATCH, origin: simul
3:OBJ(/secure/sefun/sefun), file: /secure/sefun/sefun.c, fun: socket_names, origin: simul
4:OBJ(/secure/daemon/instances), file: /secure/daemon/instances.c, fun: CheckConnections, origin: local
5:OBJ(/secure/daemon/instances), file: /secure/daemon/instances.c, fun: heart_beat, origin: driver
({ OBJ(/secure/sefun/sefun), OBJ(/secure/daemon/instances) })