Topic: Fluffos 2.23 Crasher : ASYNC

silbago
« on: November 03, 2011, 07:40:02 pm »
For the past 6 months (or less) we have had ever more frequent mud crashes, going from 1-2 times a month to
0-3 times a day now. I think it started when we moved to Fluffos 2.22. This is on Final Realms.

It has been very hard to get error dumps; I can't get the fluff binary to produce a core dump on the server.
I'd very much like the core, but ulimit -c unlimited doesn't help either (just to note it).
It just doesn't dump the bloody *core* anywhere on the hard drive. Sorry, I want the core myself.
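
A note on the missing core: ulimit -c unlimited only changes the limit for the shell it is typed in and for
processes started from that shell, and on Linux the core is written to the crashing process's current
directory unless /proc/sys/kernel/core_pattern says otherwise. Below is a minimal C sketch of how a process
can check and raise its own limit. The function names are made up for illustration and nothing here is taken
from the FluffOS source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>

/* Hypothetical helper: 1 if the soft RLIMIT_CORE allows core files,
 * 0 if it is zero, -1 if the limit cannot be read. */
int core_dumps_enabled(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    return rl.rlim_cur != 0;
}

/* Hypothetical helper: raise the soft core limit to the hard limit.
 * Returns 0 on success, -1 on error. */
int raise_core_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}

/* Hypothetical helper, Linux only: copy the kernel's core file pattern
 * into buf. Returns 0 on success, -1 if the file cannot be read. */
int read_core_pattern(char *buf, size_t len)
{
    FILE *fp;

    assert(buf != NULL);
    if (len == 0)
        return -1;
    fp = fopen("/proc/sys/kernel/core_pattern", "r");
    if (fp == NULL)
        return -1;
    if (fgets(buf, (int)len, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline */
    return 0;
}
```

If core_dumps_enabled() still returns 0 after raise_core_limit(), the hard limit itself is zero and has to be
changed by whatever starts the process. Where these calls would go in the driver's startup is not something
this sketch assumes.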

What I think is wrong is the async stuff. Every time I can track it down in the debug reports (runtime_debug.log), they show
that the mud crashes during this call in auto_load.c -> catch( (ob=clone_object( <whatever filepath> )) );
Nothing should be able to go wrong with that line, but the mud goes down all the same.

It always happens at player login. The player hits enter, sending his password to secure/login.c, and
things get rolling: mortal.c (user.c | player.c | etc) starts up and begins to clone inventory objects from the user
savefile. Since players have some 30 to 250 items in inventory, it is a "big process" (oh well).
The size of the user's inventory doesn't matter; sometimes they carry 30 items, sometimes 300. No pattern.

It is very hard to reproduce; I have tried but haven't been able to. Some players crash the mud for a week or two, then
some other user takes over. 3 weeks ago one specific player crashed the mud EVERY time he logged on !?!
These users have different computers/phones, on different networks and everything.
I noticed a mudcrasher fix note in the changelog for 2.22 -> 2.23, but upgrading the Fluff didn't help us!


Anyway, here's the dump from runtime_debug.log from when the driver goes down:


driver.2.23-fr: pthread_mutex_lock.c:62: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.

******** FATAL ERROR: Aborted(IOT)
FluffOS driver attempting to exit gracefully.
(current object was /global/mortal#406483)
Recent instruction trace:
8678370:  63        2 local                     (0)
8678372:  23       10 branch_when_zero          (1)
8678375:  63        2 local                     (0)
8678377:  17        1 aggregate                 (1)
867837a:  64        3 local_lvalue              (1)
867837c:  96          (void)+=                  (2)
867837d:  25          branch                    (0)
8678389:  35        4 loop_incr                 (0)
867838b:   2          efun2                     (0)
867838f: 125          sizeof                    (2)
8678390:  29          bbranch_lt                (2)
86782b0:   2          efun2                     (0)
86782b4:  71          index                     (2)
86782b5:  98        5 (void)assign_local        (1)
86782b7:  63        5 local                     (0)
86782b9: 125          sizeof                    (1)
86782ba:  11        2 byte                      (1)
86782bc:  22          branch_eq                 (2)
86782c2:   2          efun2                     (0)
86782c6:  71          index                     (2)
86782c7: 155          stringp                   (1)
86782c8:  24          branch_when_non_zero      (1)
86782ce:  15          const0                    (0)
86782cf: 195          set_eval_limit            (1)
86782d0:   1          pop                       (1)
86782d1: 246        2 this_object               (0)
86782d3: 199          getuid                    (1)
86782d4:  24          branch_when_non_zero      (1)
86782e3:   2          efun2                     (0)
86782e7:  71          index                     (2)
86782e8:  15          const0                    (1)
86782e9: 273       29 find_object               (2)
86782eb:  24       18 branch_when_non_zero      (1)
86782ee:  39       14 catch                     (0)
86782f1:   2          efun2                     (0)
86782f5:  71          index                     (2)
86782f6:  14       52 short_string              (1)
86782f8: 246          this_object               (2)
171c046:  16          const1                    (0)
171c047:  98        4 (void)assign_local        (1)
171c049:  63        2 local                     (0)
171c04b:  13      402 string                    (1)
171c04e:  13      403 string                    (2)
171c051:  14      109 short_string              (3)
171c053:  13      404 string                    (4)
171c056:  17        4 aggregate                 (5)
171c059: 113          simul_efun                (2)
6d24f39:  63        0 local                     (0)
6d24f3b: 109          !                         (1)
6d24f3c:  37        5 ||                        (1)
6d24f3f:  63        1 local                     (0)
6d24f41: 109          !                         (1)
6d24f42:  23          branch_when_zero          (1)
6d24f48:   2          efun2                     (0)
6d24f4c: 252        8 member_array              (2)
6d24f4e:  46          return                    (1)
171c05d:  12        1 -byte                     (1)
171c05f:  22        4 branch_eq                 (2)
86782e7:  71          index                     (2)
86782e8:  15          const0                    (1)
86782e9: 273       29 find_object               (2)
86782eb:  24       18 branch_when_non_zero      (1)
86782ee:  39       14 catch                     (0)
86782f1:   2          efun2                     (0)
86782f5:  71          index                     (2)
86782f6:  14       52 short_string              (1)
86782f8: 246          this_object               (2)
171c046:  16          const1                    (0)
171c047:  98        4 (void)assign_local        (1)
171c049:  63        2 local                     (0)
171c04b:  13      402 string                    (1)
171c04e:  13      403 string                    (2)
171c051:  14      109 short_string              (3)
171c053:  13      404 string                    (4)
171c056:  17        4 aggregate                 (5)
171c059: 113          simul_efun                (2)
6d24f39:  63        0 local                     (0)
6d24f3b: 109          !                         (1)
6d24f3c:  37        5 ||                        (1)
6d24f3f:  63        1 local                     (0)
6d24f41: 109          !                         (1)
6d24f42:  23          branch_when_zero          (1)
6d24f48:   2          efun2                     (0)
6d24f4c: 252        8 member_array              (2)
6d24f4e:  46          return                    (1)
171c05d:  12        1 -byte                     (1)
171c05f:  22        4 branch_eq                 (2)

-- I hope the above is useful in some fashion; it is Greek to me.

-- Below follows the stack trace log, where it dies at line 216 in /global/auto_load.c :
    catch( (ob=clone_object( path )) );

}),0
'          CATCH' in '/  global/auto_load.c' ('/global/mortal#406483') /global/auto_load.c:216
--- end trace ---
Fatal error while shutting down.  Aborting.
Aborted
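
For reference, the assertion in the first line of the dump comes from glibc's pthread_mutex_lock: right after
taking a mutex it checks that no owner is still recorded in it. Commonly reported causes are a mutex that was
never initialised or was already destroyed, an unlock from a thread that does not hold the lock, or memory
corruption; the log alone does not say which applies here. Below is a minimal C sketch of one way to make that
kind of misuse return an error code instead of aborting, using POSIX error-checking mutexes. The helper names
are made up for illustration and this is not how the async package is actually written.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: initialise *m as an error-checking mutex.
 * Returns 0 on success or an errno value. */
int init_checked_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(m != NULL);
    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Lock the same mutex twice from one thread and return the result of
 * the second lock: an error code rather than a hang. */
int relock_result(void)
{
    pthread_mutex_t m;
    int rc;

    if (init_checked_mutex(&m) != 0)
        return -1;
    pthread_mutex_lock(&m);
    rc = pthread_mutex_lock(&m);      /* second lock is refused */
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;
}

/* Unlock a mutex that was never locked and return the result of the
 * unlock: an error code rather than silent corruption. */
int stray_unlock_result(void)
{
    pthread_mutex_t m;
    int rc;

    if (init_checked_mutex(&m) != 0)
        return -1;
    rc = pthread_mutex_unlock(&m);    /* nobody holds it */
    pthread_mutex_destroy(&m);
    return rc;
}
```

With a default mutex both of these misuses are undefined behaviour; with the error-checking type the calls
report EDEADLK and EPERM, which can be logged at the point of misuse.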




We don't use the DB package, but we do use the async package,
for save_object_async() and restore_object_async(). Very nice stuff, we will use it far more.
Lag sucks  :), async doesn't.

Silbago @ Final Realms
« Last Edit: November 03, 2011, 07:52:38 pm by silbago »
Participating in development on Final Realms. telnet://fr.hyssing.net:4001

wodan
« Reply #1 on: November 04, 2011, 01:19:57 pm »
Despite plans, we never did start using those on Discworld, so you're probably the first real user! I'll have a look at the code that's involved.

silbago
« Reply #2 on: November 04, 2011, 01:51:20 pm »
The async stuff works just fine; it only takes a bit of work to keep it, shall we say, stateful.
Volothamp implemented save_object_async() and restore_object_async(),
using a callback to the object so we can keep track of what has finished when.

If FR is the first to use it, it isn't surprising that there might be some hiccups.
I have disabled the use of the async stuff for the time being.

Thanks for looking into it anyway, async file handling is just great!