LPMuds.net

LPMuds.net Forums => Drivers => Topic started by: FallenTree on September 28, 2015, 01:15:16 pm

Title: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on September 28, 2015, 01:15:16 pm
Hi all,

It's been a while, and now I present fluffos 3.0-alpha-9.0 to you all.

Major changes:
1. Bundled jemalloc 4.0.3; the build process now builds and links against it automatically. Jemalloc should solve the memory fragmentation issue.
2. Better-formatted startup logging.
3. New RC (runtime config) system that provides a default value for every option and prints the overridden ones on startup.
4. Fixed the '\r' translation issue introduced in 7.4.
5. CRLF is now handled in both RC configs and read_file, restore_*, so no more translation is needed.
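
A minimal sketch of how items 3 and 5 could fit together, written in Python purely for illustration (the driver itself is C++). The `name : value` line format, the `#` comment rule, and the values in `DEFAULTS` are assumptions, not taken from the driver source; only the shape of the `* Config '...' New Value: ...` line mirrors the startup log quoted later in this thread.

```python
# Hypothetical defaults; the real driver ships its own table.
DEFAULTS = {"time to clean up": "600", "time to reset": "900"}


def load_config(text, defaults=DEFAULTS):
    """Parse 'name : value' lines, tolerating both LF and CRLF line endings.

    Returns the effective config and a report of overridden options.
    """
    config = dict(defaults)
    report = []
    # str.splitlines() treats "\n" and "\r\n" alike, so no '\r' leaks into values.
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and (assumed) '#' comments
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'name : value' line
        key, value = key.strip(), value.strip()
        if key in config and config[key] != value:
            report.append(f"* Config '{key}' New Value: {value}")
        config[key] = value
    return config, report
```

Feeding it a file saved with Windows line endings should give the same result as the Unix version of that file, which is the point of item 5.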

Note:
CYGWIN build is broken for now, as jemalloc is having some trouble compiling under it.

As usual, the code is available at https://github.com/fluffos/fluffos . I haven't tagged a release yet due to the CYGWIN issue.

Cheers.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on September 29, 2015, 02:22:50 am
I also found and fixed a libtelnet bug during the process.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on September 30, 2015, 11:34:47 pm
Great!!!
I'll test later with my mudlib.

Thank you very very much!
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: silenus on October 05, 2015, 09:45:09 am
I was trying to test the latest alpha driver under Debian against the latest fluffos release, but ran into a problem understanding some of the new options in the new local options. Is there still a way to make array a reserved word? The ds and nightmare libs use this keyword quite a bit.

Thanks in advance
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on October 06, 2015, 12:40:18 am
Is there still a way to make array a reserved word since ds and nightmare libs use this keyword quite a bit.

You now need to modify https://github.com/fluffos/fluffos/blob/next-3.0/src/base/internal/options_internal.h to enable that.

I don't think the array keyword actually gives you anything. I recommend you use a global define or grep to get rid of it going forward.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: quixadhal on October 06, 2015, 11:23:29 pm
Actually, array makes more sense for LPC than the old C pointer notation.

LPC doesn't allow you to (directly) use pointers as a variable type.  As such, using a cryptic pointer notation to mean an array only makes sense to people who first learned C or C++ programming.  If it weren't for trying to support legacy code, I'd actually get rid of the * notation and force the use of the array keyword as the ONLY way to declare an array.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: silenus on October 07, 2015, 04:42:09 pm
I guess the difficulty of removing either of them is that it would break compatibility with a number of existing mudlibs. Has the malloc memory allocator simplification happened? Can we still use multiple different mallocs or are we down to one?
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on October 08, 2015, 08:08:47 am
We are now down to one: jemalloc.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on October 08, 2015, 09:11:05 am
Hello!
Since Monday, driver 9.0 has been up and works OK.

In those days it has failed only once.

I could see this trace in syslog:
Code: [Select]
Oct  8 14:30:13 driver[16245]: ...bin/driver(+0xd3745)[0x7fe44c731745]
Oct  8 14:30:13 driver[16245]: /lib/x86_64-linux-gnu/libc.so.6(+0x35180)[0x7fe449dcc180]
Oct  8 14:30:13 driver[16245]: ...bin/driver(+0xd2d70)[0x7fe44c730d70]
Oct  8 14:30:13 driver[16245]: ...bin/driver(+0xcfb28)[0x7fe44c72db28]
Oct  8 14:30:13 driver[16245]: ...bin/driver(+0xd192a)[0x7fe44c72f92a]
Oct  8 14:30:13 driver[16245]: ...bin/driver(telnet_recv+0x158)[0x7fe44c724968]
Oct  8 14:30:13 driver[16245]: ...bin/driver(_Z13get_user_dataP13interactive_t+0x571)[0x7fe44c74beb1]
Oct  8 14:30:13 driver[16245]: /usr/lib/libevent-2.1.so.5(+0x179ec)[0x7fe44a9799ec]
Oct  8 14:30:13 driver[16245]: /usr/lib/libevent-2.1.so.5(+0x2164f)[0x7fe44a98364f]
Oct  8 14:30:13 driver[16245]: /usr/lib/libevent-2.1.so.5(event_base_loop+0x4df)[0x7fe44a9841ff]
Oct  8 14:30:13 driver[16245]: ...bin/driver(_Z7backendP10event_base+0x26e)[0x7fe44c728a9e]
Oct  8 14:30:13 driver[16245]: ...bin/driver(main+0x457)[0x7fe44c680d37]
Oct  8 14:30:13 driver[16245]: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7fe449db8b45]
Oct  8 14:30:13 driver[16245]: ...bin/driver(+0x240b1)[0x7fe44c6820b1]
Oct  8 14:30:13 driver[16245]: ******** FATAL ERROR: SIGSEGV: Segmentation fault
Oct  8 14:30:13 driver[16245]: FluffOS driver attempting to exit gracefully.
Oct  8 14:30:13 driver[16245]: crash() in master called successfully.  Aborting.

I understand this info is not sufficient, but I didn't see the error until ten minutes later :(

What info do you need if that error happens again?

Except for that crash, all is ok!

Thank you very much.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: silenus on October 08, 2015, 10:10:02 am
How can I see what information Coverity is generating for fluffos on GitHub?
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on October 09, 2015, 11:46:57 pm
You need to post the whole startup log, and possibly put your driver binary somewhere for me to download and analyse.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on October 09, 2015, 11:47:25 pm
How is it possible to see what information coverity is generating for fluffos on GitHub?

Tell me your email, and I will send you a link.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: silenus on October 10, 2015, 02:00:05 am
Sent you my email via pm.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on October 14, 2015, 10:25:02 am
For now it is OK, but I'm sure it will crash again.

I'm thinking about it.

The error is in:
Code: [Select]
telnet_recv(ip->telnet, reinterpret_cast<const char *>(&buf[0]), num_bytes);

Is it possible that ip->telnet was incorrect or NULL for some telnet client? (I doubt it...)

I could not reproduce the error.

Startup log:
Code: [Select]
========================================================================
Boot Time: Wed Oct 14 17:23:06 2015
FluffOS Version: 3.0-alpha9.0(git-944ee22-1443467939)@ (Linux/x86-64)
Jemalloc Version: 4.0.3-0-ge9192eacf8935e29fc62fddc2701f7942b1cc02c
Core Dump: No, Max FD: 65535
Command: ./driver ./config.rl
========================================================================
Processing config file: ./config.rl
* Config 'time to clean up' New Value: 7200
* Config 'time to reset' New Value: 1800
* Config 'time to swap' New Value: 1800
* Config 'evaluator stack size' New Value: 1000
* Config 'maximum evaluation cost' New Value: 1000000
* Config 'maximum call depth' New Value: 35
* Config 'maximum array size' New Value: 25000
* Config 'maximum mapping size' New Value: 15000
* Config 'maximum bits in a bitfield' New Value: 1200
* Config 'maximum byte transfer' New Value: 10000
* Config 'living hash table size' New Value: 100
* Config 'gametick msec' New Value: 100
* Config 'heartbeat interval msec' New Value: 2000
* Config 'sane sorting' New Value: 0
* Config 'trace' New Value: 0
* Config 'receive snoop' New Value: 0
* Config 'reverse defer' New Value: 1
* Config 'enable_commands call init' New Value: 0
* Config 'sprintf add_justified ignore ANSI colors' New Value: 0
* Config 'call_out(0) nest level' New Value: 10000
Initializing internal stuff ....
Event backend in use: epoll
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on October 16, 2015, 10:37:38 am
I saw exactly the same crash happen on another MUD. I sent in some fixes yesterday; can you update and see if it happens again?
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on October 16, 2015, 11:12:03 am
Perfect, I'll update and test again with this version of the driver.

Thank you very much!
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on October 16, 2015, 12:08:09 pm
Hello, I can't compile the driver.

Error:
Code: [Select]
checking for libevent >= 2.0... no
configure: error: Fail to find libevent (2.0+) header/library, install libevent-dev or change --with_libevent

With version 8.1 it is OK, and with the previous version of 9.0 too.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on October 21, 2015, 10:18:31 am
I am working on fixing that.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on October 26, 2015, 09:51:36 am
Hello, i can't compile driver.

This has been fixed by the latest commit. Also, I think I've fixed the crash as well; please try again.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on October 26, 2015, 02:20:26 pm
Perfect, now it is OK!

Tomorrow I'll come back to this version for tests.

Thank you very much!
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on October 28, 2015, 12:13:37 am
Great, how did it go? I assume everything is working okay?
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on October 28, 2015, 12:31:08 am
Hello, yes, for now all is ok!
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on November 06, 2015, 01:04:44 am
Hello! Today the driver crashed again.

Same error, in telnet_recv.

Uptime was around 6 days.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on November 09, 2015, 09:15:01 am
Do you have a core available?
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on November 09, 2015, 11:48:30 pm
Do you mean "rlimit"?

WARNING: rlimit for core dump is 0, you will not get core on crash.

Start driver:
Code: [Select]
========================================================================
Boot Time: Tue Nov 10 06:44:02 2015
FluffOS Version: 3.0-alpha9.0(git-944ee22-1443467939)@ (Linux/x86-64)
Jemalloc Version: 4.0.3-0-ge9192eacf8935e29fc62fddc2701f7942b1cc02c
Core Dump: No, Max FD: 65535
Command: ./driver ./config.rl
========================================================================

Do you need me to start the driver with rlimit set?
What is the command to activate it?
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on November 10, 2015, 09:24:38 am
You would need to do this:

ulimit -c unlimited && ./driver xxxx

Verify that the driver starts up without the warning. When it crashes, send me the core.xxx file together with the driver binary, and I will be able to see what this is about.
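
For reference, the limit that `ulimit -c` changes can also be inspected from a script. A small Python sketch using the standard `resource` module (Unix only); the function names are made up for illustration and nothing here comes from the driver source.

```python
import resource


def describe_core_limit(soft_limit: int) -> str:
    """Translate a soft RLIMIT_CORE value into a human-readable status."""
    if soft_limit == 0:
        return "disabled"
    if soft_limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"limited to {soft_limit} bytes"


def current_core_limit() -> str:
    """Report the core dump limit inherited by this process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return describe_core_limit(soft)


if __name__ == "__main__":
    print("core dumps:", current_core_limit())
```

A result of `disabled` corresponds to the "rlimit for core dump is 0" warning quoted above. Child processes inherit the limit, so the script has to be run from the same shell that will start the driver.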

Also, can you paste the crash log again?
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on November 10, 2015, 11:55:40 pm
hello, trace is:
Code: [Select]
/home/mud/rlmud/driver/bin/driver(+0xd5415)[0x7f2db9e27415]
/lib/x86_64-linux-gnu/libc.so.6(+0x35180)[0x7f2db74c3180]
/home/mud/rlmud/driver/bin/driver(+0xd4a40)[0x7f2db9e26a40]
/home/mud/rlmud/driver/bin/driver(+0xd1f38)[0x7f2db9e23f38]
/home/mud/rlmud/driver/bin/driver(+0xd365a)[0x7f2db9e2565a]
/home/mud/rlmud/driver/bin/driver(telnet_recv+0x158)[0x7f2db9e14eb8]
/home/mud/rlmud/driver/bin/driver(_Z13get_user_dataP13interactive_t+0x571)[0x7f2db9e3bcd1]
/usr/lib/libevent-2.1.so.5(+0x1785c)[0x7f2db807085c]
/usr/lib/libevent-2.1.so.5(+0x209ae)[0x7f2db80799ae]
/usr/lib/libevent-2.1.so.5(event_base_loop+0x49f)[0x7f2db807a32f]
/home/mud/rlmud/driver/bin/driver(_Z7backendP10event_base+0x26e)[0x7f2db9e18f0e]
/home/mud/rlmud/driver/bin/driver(main+0x302)[0x7f2db9d71d42]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f2db74afb45]
/home/mud/rlmud/driver/bin/driver(+0x20475)[0x7f2db9d72475]
******** FATAL ERROR: SIGSEGV: Segmentation fault
FluffOS driver attempting to exit gracefully.
crash() in master called successfully.  Aborting.

This trace is from fluffos 8.1 too :(

I'll try to get a core.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: quixadhal on November 11, 2015, 04:31:12 am
It's probably also helpful to compile with the debug flags on, and link against the debug versions of your system libraries.  ./build_FluffOS devel used to do this (or maybe debug).

If you know how to reproduce the crash, or you have beefy enough hardware to not mind the overhead, you can also run the driver in gdb, which would let you poke around in the case of a crash.  If you do that though, I'd recommend using screen, or doing it on a console.... a remote shell without screen may stop the driver if you get disconnected.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on November 11, 2015, 02:23:43 pm
For now I can't reproduce this bug, and I can't compile in debug mode (lag for players :()
I'm trying gdb with a core dump from a crash I forced myself (for testing purposes)...
When it crashes again with that error, I'll try it.

Thank you very much!

Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on November 12, 2015, 09:28:54 pm
Can you also use addr2line to get the correct line numbers from your crash stack trace above?
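
A rough sketch of how the lookup could be scripted, in Python. It assumes the `module(symbol+offset)[address]` frame format seen in the traces in this thread, and that the `+0x...` offsets are relative to the module named on the same line, so each offset is resolved against its own binary (driver offsets against the driver, libc offsets against libc). `-f`, `-C` and `-e` are standard addr2line options; the function names are hypothetical.

```python
import re
from collections import defaultdict

# Matches e.g. "/path/driver(+0xd5415)[0x7f2db9e27415]", also after a syslog prefix.
FRAME_RE = re.compile(
    r"(?P<module>\S+)\((?P<symbol>[^+()]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]"
)


def parse_frame(line):
    """Return (module, symbol, offset) for a backtrace line, or None."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    return match.group("module"), match.group("symbol"), int(match.group("offset"), 16)


def addr2line_commands(lines):
    """Build one addr2line command per module for frames that carry a bare offset.

    Frames of the form symbol+offset are skipped: addr2line wants a numeric
    address, so those are easier to inspect in gdb.
    """
    offsets = defaultdict(list)
    for line in lines:
        frame = parse_frame(line)
        if frame is None:
            continue
        module, symbol, offset = frame
        if symbol == "":
            offsets[module].append(hex(offset))
    return [["addr2line", "-f", "-C", "-e", module, *addrs]
            for module, addrs in offsets.items()]
```

Each returned list can be passed to `subprocess.run`. The binary given to `-e` has to be the exact build that crashed, otherwise the line numbers will not match.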
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on November 13, 2015, 11:12:24 am
That backtrace is from Fluffos 8.1, because I didn't save the backtrace from the last 9.0 crash.
Note: line numbers for some files (main.cc) aren't exact because of some changes in my own driver version (I haven't sent a pull request yet), but I don't think the crash is due to these changes.

Command used: addr2line -e ./driver "address"
Trace:
Code: [Select]
/home/mud/rlmud/driver/bin/driver(+0xd5415)[0x7f2db9e27415] => src/main.cc:203
/lib/x86_64-linux-gnu/libc.so.6(+0x35180)[0x7f2db74c3180] => rc/vm/internal/base/object.cc:1230
/home/mud/rlmud/driver/bin/driver(+0xd4a40)[0x7f2db9e26a40] => src/net/telnet.cc:220
/home/mud/rlmud/driver/bin/driver(+0xd1f38)[0x7f2db9e23f38] => src/thirdparty/libtelnet/libtelnet.c:813
/home/mud/rlmud/driver/bin/driver(+0xd365a)[0x7f2db9e2565a] => src/thirdparty/libtelnet/libtelnet.c:1052
/home/mud/rlmud/driver/bin/driver(telnet_recv+0x158)[0x7f2db9e14eb8] => ??:0 ???
/home/mud/rlmud/driver/bin/driver(_Z13get_user_dataP13interactive_t+0x571)[0x7f2db9e3bcd1] => ??:0 ???
/usr/lib/libevent-2.1.so.5(+0x1785c)[0x7f2db807085c]
/usr/lib/libevent-2.1.so.5(+0x209ae)[0x7f2db80799ae]
/usr/lib/libevent-2.1.so.5(event_base_loop+0x49f)[0x7f2db807a32f]
/home/mud/rlmud/driver/bin/driver(_Z7backendP10event_base+0x26e)[0x7f2db9e18f0e]
/home/mud/rlmud/driver/bin/driver(main+0x302)[0x7f2db9d71d42]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f2db74afb45]
/home/mud/rlmud/driver/bin/driver(+0x20475)[0x7f2db9d72475]
******** FATAL ERROR: SIGSEGV: Segmentation fault
FluffOS driver attempting to exit gracefully.
crash() in master called successfully.  Aborting.

With GDB (With "info line"):
Code: [Select]
/home/mud/rlmud/driver/bin/driver(+0xd5415)[0x7f2db9e27415]
Line 203 of "main.cc" starts at address 0xd5415 <attempt_shutdown(int)+149> and ends at 0xd5428 <attempt_shutdown(int)+168>.
Code: [Select]
/lib/x86_64-linux-gnu/libc.so.6(+0x35180)[0x7f2db74c3180]
Line 1230 of "vm/internal/base/object.cc" starts at address 0x3517e <fgv_recurse(program_t*, int*, char*, unsigned short*, int) [clone .lto_priv.364]+1038>
   and ends at 0x35181 <fgv_recurse(program_t*, int*, char*, unsigned short*, int) [clone .lto_priv.364]+1041>.
Code: [Select]
/home/mud/rlmud/driver/bin/driver(+0xd4a40)[0x7f2db9e26a40]
Line 220 of "net/telnet.cc" starts at address 0xd4a40 <telnet_event_handler(telnet_t*, telnet_event_t*, void*) [clone .lto_priv.272]+1184>
   and ends at 0xd4a5d <telnet_event_handler(telnet_t*, telnet_event_t*, void*) [clone .lto_priv.272]+1213>.
Code: [Select]
/home/mud/rlmud/driver/bin/driver(+0xd1f38)[0x7f2db9e23f38]
Line 813 of "thirdparty/libtelnet/libtelnet.c" starts at address 0xd1f38 <_subnegotiate(telnet_t*)+72> and ends at 0xd1f62 <_subnegotiate(telnet_t*)+114>.
Code: [Select]
/home/mud/rlmud/driver/bin/driver(+0xd365a)[0x7f2db9e2565a]
Line 1052 of "thirdparty/libtelnet/libtelnet.c" starts at address 0xd3655 <_process(telnet_t*, char const*, unsigned long) [clone .lto_priv.264]+1301>
   and ends at 0xd3674 <_process(telnet_t*, char const*, unsigned long) [clone .lto_priv.264]+1332>.
Code: [Select]
/home/mud/rlmud/driver/bin/driver(telnet_recv+0x158)[0x7f2db9e14eb8]
Line 1171 of "thirdparty/libtelnet/libtelnet.c" starts at address 0xc2eb8 <telnet_recv+344> and ends at 0xc2ed0 <pop_n_elems(int) [clone .constprop.143]>.
Code: [Select]
/home/mud/rlmud/driver/bin/driver(_Z13get_user_dataP13interactive_t+0x571)[0x7f2db9e3bcd1]
Line 769 of "comm.cc" starts at address 0xe9cd1 <get_user_data(interactive_t*)+1393> and ends at 0xe9ce0 <get_user_data(interactive_t*)+1408>.
Code: [Select]
/home/mud/rlmud/driver/bin/driver(_Z7backendP10event_base+0x26e)[0x7f2db9e18f0e]
Line 259 of "backend.cc" starts at address 0xc6f0e <backend(event_base*)+622> and ends at 0xc6f18 <backend(event_base*)+632>.
Code: [Select]
/home/mud/rlmud/driver/bin/driver(main+0x302)[0x7f2db9d71d42]
Line 168 of "main.cc" starts at address 0x1fd42 <main(int, char**)+770> and ends at 0x1fd58 <main(int, char**)+792>.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on November 13, 2015, 11:32:01 pm
Are you saying the crash is also happening in 3.0alpha8.1?

It looks like a libtelnet bug, though I have yet to figure out how it happens.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on November 14, 2015, 01:25:31 am
Yes, the last backtrace is from fluffos 8.1.
After the last crash with 9.0, I went back to fluffos 8.1 and it eventually crashed too.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: quixadhal on November 14, 2015, 07:26:55 am
I'm not an expert on character set encodings, but my first instinctive thought was... are you using UTF-8 or a similar extended character set, and are there any valid byte sequences that use 255 in them?  TELNET was designed for ASCII transmission, and character 255 was denoted as the special IAC escape code to say "Hey, the next byte is either part of a command sequence, or another 255 escaped"

If libtelnet is scanning data at the byte level for IAC sequences, and one happens to be part of a multi-byte character set in whatever encoding you're using, it's very possible that libtelnet might try to analyze it.  If it happens to be a valid command (but with nonsense data), it might do something unexpected.

That's just a wild guess though.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on November 14, 2015, 11:49:09 am
You're right, my mud has active UTF-8.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on November 15, 2015, 12:40:30 am
I doubt that's the issue, telnet protocol will escape IAC automatically.
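
To illustrate the escaping rule: in the telnet protocol a literal data byte 255 is sent as two 255 bytes (IAC IAC), and valid UTF-8 never contains the byte 0xFF at all. A short Python sketch; the helper names are hypothetical and this is not how libtelnet is implemented.

```python
IAC = 0xFF  # telnet "Interpret As Command" byte


def escape_iac(payload: bytes) -> bytes:
    """Double every 0xFF data byte so it cannot be mistaken for a command."""
    return payload.replace(bytes([IAC]), bytes([IAC, IAC]))


def utf8_contains_iac(text: str) -> bool:
    """Check whether the UTF-8 encoding of text contains a raw 0xFF byte."""
    return IAC in text.encode("utf-8")
```

Single-byte encodings are a different story: in Latin-1 the character "ÿ" is exactly 0xFF, so it relies on the escaping above, while the same character in UTF-8 is the two bytes 0xC3 0xBF.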

This confirms my suspicion: this is a libtelnet bug, which I have yet to get a grasp of.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on November 16, 2015, 02:14:44 am
Hello, last night the driver crashed again.
I have the core dump; if you give me an email address, I'll send it to you with the driver binary.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on November 18, 2015, 05:47:46 am
sunyucong@gmail.com

There must be some configuration-specific trigger, because we haven't had much of an issue on many popular Chinese MUDs.

Please send me binary, core dump, and full log.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on November 18, 2015, 01:33:33 pm
By full log, do you mean runtime_driver?
That log was deleted :(
The only useful info in it was the stack trace, but that info is visible with gdb, isn't it?

The compressed tar.gz core file size is 56338715 MB.
What is the maximum attachment size for your mail? Or do you have an FTP server or similar?

I looked at the core with gdb, and at the moment of the crash I saw a telnet request of type TELNET_TELOPT_LINEMODE (34).
Perhaps some combination of user and telnet client crashes the driver when they connect? I don't know; the telnet protocol is new to me.

Thanks!

PS: Sorry for my English
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on November 19, 2015, 09:50:22 pm
How come your core size is so big... How much memory do you have?

Hmm, your crash looks different from what I have been hearing. I will take a look early next week.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on November 20, 2015, 11:20:57 am
On my server, the driver is run under systemctl.
That core was generated by systemd, because ulimit didn't generate a core in that situation.

Surely I did something wrong; it was my first time with core dumps :D
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on November 23, 2015, 08:31:29 am
@DarkWateR

First: I think this is caused by a broken client, not a server-side problem (although we certainly can do better than crashing...)

This is a very interesting problem. I have a workaround checked in, but I want you to do something with the core to verify something for me.

Here's how: run gdb driver core, use bt to find the stack frame that belongs to "comm.cc", and use "f X" to select that frame:

> print ip->local_port
> x/4ub ((in_addr*)&((struct sockaddr_in*)&(ip->addr))->sin_addr)->s_addr
> x/36xb buf

The first command prints the port it connected to, the second prints the IP address, and the last shows me the whole packet.

Here is what I found: the port is 8000, the remote IP address is in "208.100.*", and the raw data looks like this:

(gdb) x/36xb buf
0x7fff4271eff0: 0xff    0xfc    0x22    0xff    0xfa    0x22    0xff    0xf0
0x7fff4271eff8: 0xff    0xff    0xfc    0x03    0xff    0xfc    0x18    0xff
0x7fff4271f000: 0xfc    0x1f    0xff    0xfc    0x27    0xff    0xfe    0x56
0x7fff4271f008: 0xff    0xfc    0x5b    0xff    0xfe    0x46    0xff    0xfe
0x7fff4271f010: 0x5d    0xff    0xfe    0xc9
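
One way to read the dump above is to split it into telnet commands by hand. The following Python sketch does that under the standard telnet framing rules (IAC = 255, SB = 250, SE = 240, WILL/WONT/DO/DONT = 251 to 254). It is an illustration only, not libtelnet's parser, and `decode_telnet` is a made-up name.

```python
IAC, SE, SB = 0xFF, 0xF0, 0xFA
NEGOTIATION = {0xFB: "WILL", 0xFC: "WONT", 0xFD: "DO", 0xFE: "DONT"}

# The 36 bytes from the gdb dump above.
DUMP = bytes.fromhex(
    "ff fc 22 ff fa 22 ff f0 "
    "ff ff fc 03 ff fc 18 ff "
    "fc 1f ff fc 27 ff fe 56 "
    "ff fc 5b ff fe 46 ff fe "
    "5d ff fe c9"
)


def decode_telnet(buf: bytes) -> list:
    """Split a raw telnet byte stream into readable events."""
    events = []
    i, n = 0, len(buf)
    while i < n:
        if buf[i] != IAC:
            events.append(f"DATA 0x{buf[i]:02x}")
            i += 1
            continue
        if i + 1 >= n:
            events.append("TRUNCATED")
            break
        cmd = buf[i + 1]
        if cmd == IAC:  # IAC IAC is one escaped data byte
            events.append("DATA 0xff")
            i += 2
        elif cmd in NEGOTIATION:  # IAC <verb> <option>
            if i + 2 >= n:
                events.append("TRUNCATED")
                break
            events.append(f"{NEGOTIATION[cmd]} {buf[i + 2]}")
            i += 3
        elif cmd == SB:  # IAC SB <option> <payload...> IAC SE
            if i + 2 >= n:
                events.append("TRUNCATED")
                break
            option, j, payload = buf[i + 2], i + 3, bytearray()
            while j + 1 < n and not (buf[j] == IAC and buf[j + 1] == SE):
                # IAC IAC inside a subnegotiation is one literal 0xff byte
                step = 2 if buf[j] == IAC and buf[j + 1] == IAC else 1
                payload.append(buf[j])
                j += step
            if j + 1 >= n:  # ran out of data before IAC SE
                events.append("TRUNCATED")
                break
            events.append(f"SB {option} len={len(payload)}")
            i = j + 2
        else:
            events.append(f"COMMAND 0x{cmd:02x}")
            i += 2
    return events


if __name__ == "__main__":
    for event in decode_telnet(DUMP):
        print(event)
```

On these 36 bytes it reports `WONT 34`, then `SB 34 len=0` (a subnegotiation for option 34 with an empty payload), three data bytes, and a run of WONT/DONT refusals. The empty subnegotiation may be what the crashing code path did not expect, but that is a guess from the dump, not something confirmed in this thread.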

.

Also, don't forget to try the latest fix :-p


Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on November 23, 2015, 01:09:13 pm
Hello, output is:
(gdb) print ip->local_port
$1 = 5001

(Incorrect coredump or binary?)
(gdb) x/4ub ((struct in_addr*)&((struct sockaddr_in*)&(ip->addr))->sin_addr)->s_addr
0xe71a64d0:     Cannot access memory at address 0xe71a64d0
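
One possible reading of the failed command above: gdb's `x` treats its argument as an address, so passing the `s_addr` value itself makes gdb try to read memory at that value. Assuming an IPv4 peer and a little-endian host (the startup log reports Linux/x86-64), the "address" gdb complains about would then be the packed IP. A small Python sketch of the conversion; `s_addr_to_ip` is a hypothetical helper.

```python
import socket
import struct


def s_addr_to_ip(value: int) -> str:
    """Convert an s_addr integer, as printed on a little-endian host, to dotted form.

    s_addr is stored in network byte order; a little-endian host displays those
    four bytes reversed, so repacking as little-endian restores the wire order.
    """
    return socket.inet_ntoa(struct.pack("<I", value))


if __name__ == "__main__":
    print(s_addr_to_ip(0xE71A64D0))
```

Under those assumptions the value decodes to an address in the same 208.100.* range mentioned earlier in this thread. Using `print` on the `s_addr` expression, or `x/4ub` on the address of `sin_addr`, should avoid the memory access error.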

(gdb)  x/36xb buf
0x7fffc2aa05a0: 0xff    0xfc    0x22    0xff    0xfa    0x22    0xff    0xf0
0x7fffc2aa05a8: 0xff    0xff    0xfc    0x03    0xff    0xfc    0x18    0xff
0x7fffc2aa05b0: 0xfc    0x1f    0xff    0xfc    0x27    0xff    0xfe    0x56
0x7fffc2aa05b8: 0xff    0xfc    0x5b    0xff    0xfe    0x46    0xff    0xfe
0x7fffc2aa05c0: 0x5d    0xff    0xfe    0xc9

The last output is the same.

I'll update with your patch and cross my fingers!
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on November 24, 2015, 05:19:36 am
Oh yeah, the wonders of a broken attacker!
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on November 24, 2015, 12:58:05 pm
Yeah! :D
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: FallenTree on November 28, 2015, 03:37:28 am
I assume there are no more crashes?

Cheers.
Title: Re: WIP: Fluffos 3.0 Alpha 9.0
Post by: DarKWateR on November 28, 2015, 05:49:56 am
Yes, for now it is running OK!