Author Topic: Contributions to the driver  (Read 1217 times)

Offline silenus

  • BFF
  • ***
  • Posts: 178
    • View Profile
Contributions to the driver
« on: October 18, 2015, 06:33:40 PM »
I have been thinking about experimenting with some driver hacking. Are contributions being accepted into the driver?

Offline FallenTree

  • BFF
  • ***
  • Posts: 483
    • View Profile
Re: Contributions to the driver
« Reply #1 on: October 21, 2015, 10:18:13 AM »
Well, of course! Send a PR!

Offline silenus

Re: Contributions to the driver
« Reply #2 on: October 26, 2015, 08:06:31 AM »
My interest primarily lies with the implementation of the VM. For example, one thing I found that might be worth doing is changing the functional/function pointer code a little so that certain types of function pointers can be saved by save_variable and restored afterwards. Of the 4 types, this would only apply to lfuns.

It might also be nice if the string or func form could somehow be merged into the functional syntax, along with the remaining old function pointer types. This would sort of conflict with the above, though, and would require a general scheme for saving and parsing function pointers. I suspect it is doable with a bit more work on saving and restoring function pointer strings, but it would need some additional parsing routines to restore strings to functionals. This is obviously more work than the above and more general, but also not that much more useful.

Another thing I would like to look at is garbage collection of the mark-and-sweep variety, which is already used in certain memory models for LDMud. This would not solve the problem of fragmentation, so I have some hesitations here too, but I don't know other forms of garbage collection well enough to know how reserving large blocks of RAM would affect the OS in general (this would be required for semi-space and generational collectors). The algorithms for stop-the-world garbage collection seem simple enough; most descriptions in textbooks, and even in research papers, are quite simple. (I actually wrote a tricolor mark/sweep in LPC for fun once, where one could customize the root set and clean up objects that weren't referenced by anything.)
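To make the stop-the-world tricolor idea concrete, here is a minimal sketch in C++. It is not driver code: `Cell`, `collect`, and the use of vector indices in place of real pointers are all assumptions made for illustration, and the root set is simply passed in by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical heap model: indices into `heap` stand in for pointers.
enum class Color { White, Grey, Black };

struct Cell {
  std::vector<std::size_t> refs;  // outgoing references (indices into heap)
  Color color = Color::White;
  bool live = true;               // false once the cell has been swept
};

// Stop-the-world tricolor mark and sweep.
// White = not reached, Grey = reached but not scanned, Black = scanned.
// Returns the number of cells reclaimed.
std::size_t collect(std::vector<Cell>& heap,
                    const std::vector<std::size_t>& roots) {
  std::vector<std::size_t> grey;  // worklist of grey cells

  for (auto& cell : heap) cell.color = Color::White;

  // Shade the (customizable) root set grey.
  for (std::size_t r : roots) {
    assert(r < heap.size());
    if (heap[r].live && heap[r].color == Color::White) {
      heap[r].color = Color::Grey;
      grey.push_back(r);
    }
  }

  // Mark phase: scan grey cells until none remain.
  while (!grey.empty()) {
    std::size_t i = grey.back();
    grey.pop_back();
    for (std::size_t child : heap[i].refs) {
      assert(child < heap.size());
      if (heap[child].live && heap[child].color == Color::White) {
        heap[child].color = Color::Grey;
        grey.push_back(child);
      }
    }
    heap[i].color = Color::Black;
  }

  // Sweep phase: anything still white was unreachable, cycles included.
  std::size_t freed = 0;
  for (auto& cell : heap) {
    if (cell.live && cell.color == Color::White) {
      cell.live = false;
      cell.refs.clear();
      ++freed;
    }
  }
  return freed;
}
```

Unlike plain reference counting, this reclaims unreachable cycles. It does nothing about fragmentation, since cells are never moved.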

Lastly, I still dream of having a JIT for some version of LPC that at the very least optimizes away the cost of the branch instructions the VM uses for instruction selection, and that perhaps, with LLVM patchpoints, includes some form of inline caching for call_other in place of virtual functions. One of the biggest problems with LPC is that call_other is typed at runtime, as opposed to statically typed like virtual functions in C++ and Java. That makes it difficult to unbox types in sequences of instructions that call other objects not bound at compile time (inheritables are bound at compile time and don't change, so these are okay). This is quite a bit of work, though it might be fun to do.

One question I do have is how much C++ is expected in the driver in the future. There is a lot of runtime type selection in the code at present which could be replaced with virtual functions and OOAD-style dynamic dispatch. The main conflict is that in certain cases this makes it harder to infer the actual machine-level layout of objects and programs, which would make JIT-style tricks harder to implement, if that is a worthwhile goal at all (as with Ruby and Python, I am not sure one really needs that sort of performance anyhow).

Offline FallenTree

Re: Contributions to the driver
« Reply #3 on: October 27, 2015, 01:06:50 AM »
My suggestion is to start small: do something benign and get to know the code base first. You can start by rewriting the global living hash table to use std::unordered_map, for example.
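As a rough idea of what such a rewrite could look like, here is a sketch of a name -> living object table built on std::unordered_map. The names `object_t`, `living_add`, `living_remove`, and `living_find` are placeholders invented for this example, not the driver's actual interface, and the rule that the most recently registered object wins when several share a name is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the driver's object structure.
struct object_t {
  std::string living_name;
};

namespace {
// One bucket per living name; several objects may share a name.
std::unordered_map<std::string, std::vector<object_t*>> living_table;
}  // namespace

// Register `ob` under `name`.
void living_add(object_t* ob, const std::string& name) {
  assert(ob != nullptr);
  ob->living_name = name;
  living_table[name].push_back(ob);
}

// Unregister `ob`; a no-op if it was never registered.
void living_remove(object_t* ob) {
  assert(ob != nullptr);
  auto it = living_table.find(ob->living_name);
  if (it == living_table.end()) return;
  auto& bucket = it->second;
  bucket.erase(std::remove(bucket.begin(), bucket.end(), ob), bucket.end());
  if (bucket.empty()) living_table.erase(it);  // never keep empty buckets
  ob->living_name.clear();
}

// Return the most recently registered object with this name, or nullptr.
object_t* living_find(const std::string& name) {
  auto it = living_table.find(name);
  return it == living_table.end() ? nullptr : it->second.back();
}
```

Free functions over a file-local map keep the C-style interface the rest of the driver uses, while the hashing, resizing, and chaining are left to the standard library.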

FluffOS is a living driver: new changes need to be compatible with old code, or at least provide a migration path. If you want to implement something as big as a JIT or mark-and-sweep, you need to be very familiar with the codebase to start with. That is something I am still working on myself.

Offline FallenTree

Re: Contributions to the driver
« Reply #4 on: October 27, 2015, 06:50:56 AM »
Also, I think working on the VM is not that interesting, at least to me, other than digging out old issues.

My primary goal is to bridge the driver with Lua, and later maybe also JS, which is clearly a more lively VM.

Offline silenus

Re: Contributions to the driver
« Reply #5 on: October 29, 2015, 05:00:12 AM »
Well, I have been looking over the source code of v8. It is quite a bit cleaner than the code in the FluffOS VM. I think the main issue with patching in another VM is that FluffOS has some semantically unique features that need to be carefully mapped onto other language VMs that lack them. Some that come to mind include:

1) Dynamic class replacement. Code in FluffOS has versions, but the main name of a "class" or program can be replaced by another object at runtime. Most other VMs lack this feature, so it needs to be mapped carefully onto another VM, be it a JS (v8) or Lua VM.

2) Related to 1) is call_other, which isn't a simple function call; in most languages it would be a reflective lookup followed by a call (as in Java). It's more dynamic than virtual functions in most languages (C++, Java, Smalltalk). I have been struggling with efficient ways to implement this and get rid of boxing in LPC. I think it's a difficult problem that might not be worth the time to get right, because LPC doesn't need to be that efficient: most people are running legacy applications on it.

3) OS-type features. Various applies need to be called in certain cases when certain efuns are executed, so in general one needs to wrap function calls with extra code for this (it could just be a wrapper function). Stack depth and too-long-execution checks mean that, at the very least, call_other, other function calls, and back branches need extra bookkeeping to prevent things from running for too long. Arguably one could use some kind of time slicing like in a modern operating system, but the implementation work on top of what exists already is substantial.

4) Inline caching. Because of how it works, inline caching for call_other wouldn't map properly onto another VM in general, i.e. there would still be a substantial performance hit for any function call made on objects. This is obviously a general problem with OO languages, and self-modifying code via caching is generally used to mitigate the cost. Anything other than a custom VM won't be able to handle this if it is ever implemented. LLVM sort of supports this through patchpoints; it's an experimental feature in LLVM, hence the "sort of".
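To illustrate the inline caching idea from points 2) and 4), here is a toy monomorphic cache for a call_other-style dispatch. It is a sketch only: `Program`, `Object`, `CallSite`, and the `int (*)(int)` function signature are all made up for the example, and a real JIT would patch machine code at the call site rather than keep the cache in a struct. The sketch assumes each program carries a non-negative id that changes whenever the program is replaced at runtime.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical function signature for the toy VM.
using Fn = int (*)(int);

struct Program {
  int id;  // assumed to change whenever the program is replaced
  std::unordered_map<std::string, Fn> functions;
};

struct Object {
  const Program* prog;
};

// Per-call-site cache: remembers the last program seen and its resolved
// function, so repeated calls on the same program skip the name lookup.
struct CallSite {
  std::string name;
  int cached_prog_id = -1;
  Fn cached_fn = nullptr;
  int misses = 0;  // counts slow-path lookups, for illustration only
};

// Returns false if the target object does not define the function.
bool call_other(CallSite& site, const Object& ob, int arg, int& result) {
  assert(ob.prog != nullptr);
  if (site.cached_prog_id != ob.prog->id) {
    // Slow path: look the function up by name, then refill the cache.
    ++site.misses;
    auto it = ob.prog->functions.find(site.name);
    if (it == ob.prog->functions.end()) return false;
    site.cached_prog_id = ob.prog->id;
    site.cached_fn = it->second;
  }
  // Fast path: one integer compare and an indirect call.
  result = site.cached_fn(arg);
  return true;
}
```

Because the guard compares program ids, replacing an object's program at runtime simply causes a miss on the next call, after which the site is re-cached for the new version.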

I think in general, be it working on a JIT or remapping the existing semantics onto something else, the development effort for replacing the VM wholesale is nontrivial. I am also uncertain what would be gained, since very few new projects are written in LPC nowadays and it's mostly legacy apps. Writing the compiler mapping to account for the above would mean pretty much a full compiler pass up to the IL level, as is done currently, to do the language translation. The backend would obviously be the new VM.

As for a mark-and-sweep garbage collector, I think this is less difficult than it seems: LPC is already ref counted, so hooks for the creation and destruction of variables already exist. Obviously the destruction hooks can be removed, and the creation hooks augmented or modified to handle the extra bookkeeping required in most cases. However, I do wonder if mark/sweep is the best solution here, since fragmentation is not addressed. I am curious whether, in your opinion, there is any big drawback to allocating two large blocks of memory (or a number of big blocks anyhow) and using some sort of compacting garbage collection scheme, like semi-space or generational collection.

I was looking at the object hash tables, but I haven't yet identified where the living hash lives. I found otable.cc, which houses the primary string name -> object translation. This seems pretty modular, and I suspect that rewriting it to use the STL instead of a custom hash table and linked list should be relatively easy.

One question I do have is: how C++-like can new code contributions be? I noticed that the modified call_out.cc is still written with a primarily C-style interface and no classes or methods.

Offline FallenTree

Re: Contributions to the driver
« Reply #6 on: October 29, 2015, 09:57:57 AM »
I think you misunderstood what bridging to another VM means. It means we can load a script written in another language, evaluate it in its own VM, and provide syscalls for it to access driver internals.

It doesn't mean we need to run LPC code in another VM, which I think is certainly not worth it.

For example, you will be able to load a full Lua or JS script that calls efuns to interact with users while accessing everything a normal Lua/JS VM can access. Think about the possibilities! Of course we need to implement a sort of two-way communication channel for applies to work, but in general I think it can work.

That's why I am not recommending working on the LPC VM anymore, including garbage collectors. oname.cc does need a rewrite; that is fine.

I generally prefer not to introduce OOP into the driver, since I haven't actually found a use for it. Driver code is mostly C-style. I think it is better to leverage the STL or other small C++ libraries and features to make the code easier to maintain and better written. As for the style, just use your judgement; as long as it compiles with gcc 4.8+, we will be fine.

Offline silenus

Re: Contributions to the driver
« Reply #7 on: October 29, 2015, 10:42:09 AM »
I think I get what you mean by bridging now, but by doing this you are still at the mercy of certain things that developing on an LPmud server prevents:

1) malicious code execution that modifies or deletes data.
2) denial of service type code execution that deliberately starves resources via infinite loops etc.

I don't see a good way around this other than rebuilding a layer on top of whatever VM is used, as I mentioned. I also think that by doing this one is no longer really operating in LPC. In general this isn't necessarily a bad thing, but it makes it difficult to program the system while it's running alongside other users, because of the two issues above.
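For reference, the second protection usually comes down to an evaluation budget that is charged on function calls and back branches, plus a call depth limit. Below is a minimal sketch of that bookkeeping. The names `EvalBudget`, `charge`, `CallGuard`, `run_loop`, and `recurse` are invented for the example, the error strings are illustrative, and the costs (one unit per call or back branch) are arbitrary assumptions. An embedded Lua or JS VM would need an equivalent hook of its own.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical per-execution budget, reset before each top-level call.
struct EvalBudget {
  std::int64_t remaining;  // cost units left for this execution
  int depth;               // current call depth
  int max_depth;           // call depth limit
  EvalBudget(std::int64_t cost, int max_depth_)
      : remaining(cost), depth(0), max_depth(max_depth_) {}
};

// Charge `cost` units; abort the execution once the budget is spent.
void charge(EvalBudget& b, std::int64_t cost) {
  assert(cost >= 0);
  if (b.remaining < cost) {
    b.remaining = 0;
    throw std::runtime_error("evaluation too long");
  }
  b.remaining -= cost;
}

// RAII guard placed at every function entry: charges the call and
// enforces the depth limit, restoring the depth on exit or unwind.
class CallGuard {
 public:
  explicit CallGuard(EvalBudget& b) : b_(b) {
    charge(b_, 1);
    if (b_.depth >= b_.max_depth) throw std::runtime_error("recursion too deep");
    ++b_.depth;
  }
  ~CallGuard() { --b_.depth; }
  CallGuard(const CallGuard&) = delete;
  CallGuard& operator=(const CallGuard&) = delete;

 private:
  EvalBudget& b_;
};

// Stand-in for an interpreted loop: every back branch is charged.
std::int64_t run_loop(EvalBudget& b, std::int64_t iterations) {
  std::int64_t i = 0;
  while (i < iterations) {
    ++i;
    charge(b, 1);  // back branch
  }
  return i;
}

// Stand-in for an interpreted recursive function.
int recurse(EvalBudget& b, int n) {
  CallGuard guard(b);
  return n == 0 ? 0 : 1 + recurse(b, n - 1);
}
```

With this in place an infinite loop or runaway recursion in one user's code ends in an error for that execution instead of stalling the whole server.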