Author Topic: partition_array sefun  (Read 2343 times)

Offline silenus

  • BFF
  • ***
  • Posts: 196
partition_array sefun
« on: April 04, 2007, 03:08:22 PM »
Someone mentioned a problem on the intercre or ds channel (I can't remember which) about grouping users with the same IP address into clusters. I thought of this as a possible solution and decided to post it. It's a bit more general and is based on the idea of partitions and equivalence classes, though in this case there is only a partition function. I hope someone finds it useful. For example, solving the above problem is quite easy once you have this function: just call partition_array( users(), (: query_ip_number($1) :) ) and implode the result as needed.

/*
   Partitions an array into equivalence classes based on a partition function.
   Example usage: partition_array( ({1,2,3,4,5,6}), (: $1 % 2 :) ) would return
   ({ ({2, 4, 6}), ({1, 3, 5}) }) (the order of the classes is not guaranteed).
*/

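varargs mixed array partition_array(mixed array ar, function f)
{
   mapping m = ([]);
   mixed key;

   // With no partition function, identical elements are grouped together.
   f = undefinedp(f) ? (: $1 :) : f;

   foreach(mixed elem in ar)
   {
      key = evaluate(f, elem);   // evaluate f only once per element
      if( undefinedp( m[key] ) ) m[key] = ({ elem });
      else m[key] += ({ elem });
   }
   // values() discards the keys, so empty classes cannot be reported.
   return values( m );
}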
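
As a rough sketch of the IP-clustering use mentioned above: the helper name shared_ip_report() is made up for illustration, and the code assumes partition_array() is installed as a sefun on a MudOS-style driver.

```lpc
// Hypothetical helper (not part of the original post): lists each IP address
// that has more than one connected user, with the number of connections.
// Assumes partition_array() above is available as a sefun.
string shared_ip_report()
{
   string out = "";

   foreach(mixed cluster in partition_array( users(), (: query_ip_number($1) :) ))
   {
      if( sizeof(cluster) < 2 ) continue;   // skip IPs used by a single user
      out += sprintf("%s: %d users\n", query_ip_number(cluster[0]), sizeof(cluster));
   }
   return out;
}
```

Each cluster is an array of user objects, so it can also be mapped to names with whatever name function your lib provides.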

« Last Edit: April 04, 2007, 03:36:33 PM by silenus »

Offline daelaskai

  • BFF
  • ***
  • Posts: 174
Re: partition_array sefun
« Reply #1 on: December 27, 2007, 01:46:17 PM »
I was reviewing this post and it seems like there is an oversight.
For example, I'm looking to partition the results from get_dir() into
files and directories, respectively.  This partition_array() sefun works
great if the directory (we'll say "/x/") has both subdirectories and files in it.

Code:
output_files = partition_array( get_dir( "/x/", -1 ), (: ($1[1] == -2) :) );
x = path
y = *subdirectories
z = *files

The problem is this:

If the directory being referenced (/x/) has no subdirectories (only files),
it returns ({ z }) when it should return ({ ({}), z }) or something similar.
The reverse happens if there are only subdirectories: it returns a single array,
({ y }), rather than the expected array of two arrays, ({ y, ({}) }).

Does anyone have any suggestions on how to modify this so it correctly works
for my scenario?  If my post is unclear, I would be happy to explain it further
and would appreciate any help.

Daelas

Offline silenus

Re: partition_array sefun
« Reply #2 on: December 27, 2007, 02:35:14 PM »
Hi Daelas,

If you are using MudOS, there is an efun, unique_array, which creates an equivalence-class partition similar to partition_array (when I wrote this function I didn't realize that this efun existed). This, however, might not solve your problem. You might have to wrap the function to reinsert the missing empty sets. The problem is that, given some partition function f and some set y to partition, it's impossible to know a priori how many empty partitions there should be.
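
One possible shape for such a wrapper, as a sketch only: the caller supplies the keys it expects, so the wrapper knows which empty sets to reinsert. The name partition_with_keys() and its keys argument are invented for this example, and it assumes partition_array() from the first post is available.

```lpc
// Hypothetical wrapper: returns one bucket per entry of `keys`, in that
// order, using ({}) for any key that no element of `ar` maps to.
mixed array partition_with_keys(mixed array ar, function f, mixed array keys)
{
   mapping by_key = ([]);
   mixed array result = ({});

   // Recover each group's key from its first element; groups returned by
   // partition_array() are never empty.
   foreach(mixed group in partition_array(ar, f))
      by_key[ evaluate(f, group[0]) ] = group;

   foreach(mixed key in keys)
      result += ({ undefinedp(by_key[key]) ? ({}) : by_key[key] });

   return result;
}
```

With this, partition_with_keys( get_dir( "/x/", -1 ), (: ($1[1] == -2) :), ({ 1, 0 }) ) should give subdirectories first and files second, with an empty array standing in for whichever is missing. It evaluates f one extra time per group, which is the price of wrapping rather than modifying the original.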

There are ways to fix this, but I cannot think of one off the top of my head that has an elegant implementation. One thought is to modify the function so that instead of returning values(m) (which I now think isn't so nice, since it doesn't preserve order), it returns m, so you have the key/value pairs and thus know which partitions you have and which ones are null.
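
A minimal sketch of that modification, under the assumption that it lives alongside the original as a separate function; the names partition_mapping() and split_dir() are made up for illustration.

```lpc
// Hypothetical variant of partition_array() that returns the whole mapping,
// keyed by the partition function's result, instead of values(m).
varargs mapping partition_mapping(mixed array ar, function f)
{
   mapping m = ([]);
   mixed key;

   f = undefinedp(f) ? (: $1 :) : f;
   foreach(mixed elem in ar)
   {
      key = evaluate(f, elem);
      if( undefinedp( m[key] ) ) m[key] = ({ elem });
      else m[key] += ({ elem });
   }
   return m;
}

// Hypothetical usage: always yields ({ subdirectories, files }).
// Assumes `path` exists, so that get_dir() returns an array.
mixed array split_dir(string path)
{
   mapping parts = partition_mapping( get_dir(path, -1), (: ($1[1] == -2) :) );

   return ({ undefinedp(parts[1]) ? ({}) : parts[1],
             undefinedp(parts[0]) ? ({}) : parts[0] });
}
```

A missing key simply reads back as undefined, so the caller decides what an empty partition should look like.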

Perhaps someone else has some better ideas.

Regards,

Silenus.


Offline daelaskai

Re: partition_array sefun
« Reply #3 on: December 27, 2007, 02:53:22 PM »
Hello Silenus!

It works perfectly when I reference the entire mapping (m) rather than the values(m).
Thanks for the advice!

Daelas