PDA

View Full Version : Because you all love my code questions...


DeepT
05-31-2006, 01:01 PM
I have another. This is just code theory and architecture, and will be as abstract as I can make it. I want to avoid any 'concrete' examples since they tend to get messy and off track. There may be no solution to this 'puzzle'. Finally I say there is no cosmic point to this other then to feed my own neurotic needs to find a better way to do things.

I have objects A to Z. I have Object "Data". I have System N1 which contains Data and Objects A to Z.

Objects A to Z are connected like this:

A has one or more Bs.
B has one ore more Cs.
etc... all the way down to Z.

My only interface to Z is through Y, and Y to X. On the top level I can only directly talk to A since A only knows about B.

System N1 has One and Only one A, and only A. It has no direct knowledge of the many Zs that are out there.

System N1 comes into possession of Data, but it has no use for it. Z has a need for DATA. How do I communicate this to Z without daisy-chaining 26 steps? IE: We do *not* want to pass Data to A who passes it to B who passes it to C, etc... down to Z.

Additional Requirements:
1. Any number of Systems, N1, N2, N3, etc... Each system is unique.

2. All Zs in System N1 need Access to Data which is unique to System N1. Similarly all Zs in system N2 need Access to Data which is unique to N2, etc..

3. "Data" is only available *after* construction is complete.

4. Classes A to Z do not inherit from each other.


Is this possible, and if so, how?

Charles
05-31-2006, 01:07 PM
External messaging system, each class registers itself for the data it wants, and then you just pass the data to the messaging system which sends a message to everything, anything who wants it uses it.

BaconTastesGood
05-31-2006, 01:08 PM
How do I communicate this to Z without daisy-chaining 26 steps?

In my world, we call that a global variable. Or a static method that returns a singleton. Or any zillion other ways to accomplish it.

If Z is the only Object that needs Data, and System doesn't know about Z, then the basic issue is: "Expose System.Data to an arbitrary object it is not familiar with". Propagating Data through a call chain would be stupid, so that leaves a global (or the functional equivalent -- a method that returns a singleton).

Fugitive
05-31-2006, 01:10 PM
If it's absolutely guaranteed that there is only and will only ever be one single instance of Data, you could throw it in a singleton (http://en.wikipedia.org/wiki/Singleton_pattern). Or make the singleton an interface which associates the Data objects with the appropriate System, if they're all in the same process.

(Edit: Bah, beaten.)

Nick Walter
05-31-2006, 01:12 PM
This sounds like a problem that can only be solved by pointer arithmetic! And maybe some judicious use of inline asm.

The goal here being to not only solve the problem but write undocumented garbage code thus ensuring lifelong employment, right? right?

Houngan
05-31-2006, 01:42 PM
Seems to be a code representation of directory services. You're going to need a global addressing scheme that is heirarchical and available to all levels. So, if M wants to send to D, he queries the directory and finds that D is located at a.b.c.d. He can then send it to C, if C in some way controls D, or directly to D, if it can handle the request. Full addressing should allow the request to bypass the upper levels with a global redirector that has knowledge of the entire tree.

H.

DeepT
05-31-2006, 01:58 PM
Please read the requirements.

There will be any number of instances of Data. While system N1 has only one copy of Data, System N2 has it's own unique copy.

A singleton can not work, unless I do not understand singletons.

Messaging/Directory services approach:
Sounds good but there is a detail that is missing.

Lets say that Z instances itself, how does it know what system it belongs too, and therefore when asking the service (or whatever) for the right Data, how would the service know which one to hand it? If you pass down some identifier that says you belong to system 1 or 2 or whatever, you face the exact same daisy chain problem.

Fugitive
05-31-2006, 02:03 PM
That's why I said 'or an interface that can associate...'. Each instance of Z has to have *something* unique to it that can be used to create a mapping between itself and the system it's part of, or what you ask is impossible. Exactly what that unique bit is depends on finer details; if each system is in its own separate thread, maybe you could use the thread ID or thread-local storage.

BaconTastesGood
05-31-2006, 02:20 PM
Please read the requirements.

Your requirements are vague and suck. It's not clear if Data is an Object or if it's a type. You call it an Object, but then later talk about "instances" of it, by which I guess you mean copies, not actually instances?

Do A-Z know who their parents are? Are they allowed to know about their contained system? Can they be contained by multiple systems? Do A-Y need Data? Do A-Z share the same type?

Are you just making a really complicated requirements for the following?

"Given a tree of depth 26, what is the best way to propagate data from the tree's owner to children at its lowest point, ideally without passing that data through all the children?"

If that's the case, then it's still a trivial problem. You can pass that data down directly by reference so all children can access it, or you can pass a reference to the parent System to all children so they can query it for any relevant state.

Houngan
05-31-2006, 03:00 PM
Please read the requirements.

There will be any number of instances of Data. While system N1 has only one copy of Data, System N2 has it's own unique copy.

A singleton can not work, unless I do not understand singletons.

Messaging/Directory services approach:
Sounds good but there is a detail that is missing.

Lets say that Z instances itself, how does it know what system it belongs too, and therefore when asking the service (or whatever) for the right Data, how would the service know which one to hand it? If you pass down some identifier that says you belong to system 1 or 2 or whatever, you face the exact same daisy chain problem.

Presumably if Z is being instanced, it's going to be Y that is doing it. Y passes Z it's FQN, which Z appends itself to. Like objects at multiple levels also inherit the FQN of their context, making them referentially unique while being substantively the same.

H.

DeepT
05-31-2006, 03:14 PM
Your requirements are vague and suck.
I am sorry you do not like them. If you do not like them, you do not need to participate.

It's not clear if Data is an Object or if it's a type. You call it an Object, but then later talk about "instances" of it, by which I guess you mean copies, not actually instances?
It is arbitrary, is it important? It might be an object, it might be a chunk of binary data or even a block of XML. If you have a solution that works ONLY if Data is an object, that is acceptable.
Do A-Z know who their parents are?
They have no parents.
Are they allowed to know about their contained system?
Tircky question to answer. Can A know it is contained by N? Yes. But can a particular instance of A know it is contained by a particular instance of N? No.
Can they be contained by multiple systems?
Each instance of A-Z is unique to one system.
Do A-Y need Data?
No. Only Z needs it. It is useless to everyone else. If A - Z needed the data, I would not mind the daisy-chain method. However since the do not need it, I do mind.
Do A-Z share the same type?
No. There is no relationship between different types of any kind aside from containing each other, such as D contains one or more Es.
Are you just making a really complicated requirements for the following?

"Given a tree of depth 26, what is the best way to propagate data from the tree's owner to children at its lowest point, ideally without passing that data through all the children?"


In an effort to not bring personal bias into it, it is arbitrarily long. If, for example It was just A - B (where B needed that data) you might say, "It is only one step so it is OK." Someone else might think A-C is OK. It becomes subjective. Picking a number like 26 steps should be beyond anyone's threshold of where the it is OK and where it is not OK.

We could then argue where the line is, or how many steps is OK and how many is not OK. I do not want to get into that discussion. Lets just say there are an arbitrary N steps between A and Z that makes daisy-chaining undesirable. How would you solve the problem with the requirements posted.


The reason I picked the requirements I did is because I want a general solution. They are harsh, Ill grant you that. If I was somewhat lenient, say, I said there was only one System N, not multiple, then you could easily say, just make a singleton or a static method or a global. That is fine and dandy, but then what do you do when you need more then one system? Your solution does not work.

If, however, you do find the general solution, you can apply it to any situation with less restrictions. So if we find one that works with multiple Systems N, it will work for just one System N. If we find one that works cases where A - Z are unrelated, then we find one for when they are related.

There is a method to my madness you know. I hope you can appreciate why I am putting such rigid restrictions in.


The ultimate answer may be that there is no general solution. However, is it not worth a look?

Fugitive
05-31-2006, 03:23 PM
Well the general solution is to just daisy-chain the information down the tree of objects, wrapping it in a context object for future expansion and so you only have to pass one thing. It's just that you already eliminated that option as a requirement.

antlers
05-31-2006, 04:10 PM
There absolutely has to be something that the System N1 and the contained Z's share, and that is unique to N1. No way around this. The question becomes, how is this shared thing passed? It's either passed from object to object explicitly, or is part of an environment unique to N1 that all the contained objects share. Such an environment is implementation specific--there's no general solution.

If different N's are in different threads, you can use thread local storage. If the different N's are in the same thread but coordinate with each other, you can use a global variable that marks the "current" N. If different N's are in different processes, you can use globals. If different N's are on different servers, you can use network addresses. If you use aspect oriented programming, you can give A-Z an aspect that represents the environment that you don't have to code for in each individual object.

bago
05-31-2006, 05:51 PM
There's always the XML dom way. Give every node an id.

BaconTastesGood
05-31-2006, 07:21 PM
It is arbitrary, is it important? It might be an object, it might be a chunk of binary data or even a block of XML. If you have a solution that works ONLY if Data is an object, that is acceptable.

Did you read what I said? You don't seem to be differentiating between a type and an object. That's a pretty major difference.

They have no parents.

You said "all B are contained in A" or something like that -- containment implies parentage.

In an effort to not bring personal bias into it, it is arbitrarily long.

That's fine. So, it's just a general propagation of data from root to leaf problem for trees, except with the arbitrary and rather odd stipulation that children don't know whom their parents are and, even better, that you can't propagate data from parents to children.

That constraint makes it an insolvable problem, right? If a child doesn't know who its parent(s) is/are, and the parents can't tell the children anything, there is no mapping in which to get specific data to the children.

In that case, you have to relax the constraints.

There is a method to my madness you know. I hope you can appreciate why I am putting such rigid restrictions in.

Not really, but whatever.

Yaltan
05-31-2006, 09:49 PM
Any reason why each object A-Z doesn't contain a reference to System? That way you can implement poor-man's publish/subscribe. You need to notify Z that there's data available, and then Z can go retrieve it from the reference.

This will obviously cause problems in a multi-threaded program unless you synchronize notification and access.

Basically I imagine A-Z as a singly-linked list of objects. You don't want to pass data down the chain (not sure why other than O(N) instructions and N is very large) and you don't want Z visible to the data owner. The only entity that knows Z is Y, so Y must be the one that informs Z there is data to be had. The same logic applies all the way back to A.

You mentioned not using a singleton because Systems are not unique. Is your definition of System a running executable or virtual machine (unique stack) or an instance of an object? If the former, you don't have a singleton problem.

XPav
05-31-2006, 10:09 PM
This is like some stupid version of the weakest link, where the dumbest person asks a dumb question, smart people answer him, and the dumb person gets mad and kicks everyone off the show for making him look stupid.

FIDGAF
06-01-2006, 05:39 AM
To me, it's more of a One-To-Many relationship and should be handled like any other data: A Database approach.
XML is an excellent method to use for something like this.

The A-Z where B is contained with A is parentage, no question, in a sense but also as One-To-Many so either approach would work IMO.
Using a relationship approach, all records below would be available from their respective node. A would contain all of B's records and calling A as an Object would enable you to use multiple instances of the data.

But I'm sure there's more to it than that, eh T?
;)

Rward
06-01-2006, 06:10 AM
A freind of mine says:

this solution passes POINTER to data down the chain.. not the actual data.

function give(Object *obj,Data *dat)
{
if(obj->name() == 'Z')
obj->ProcessData(dat);
else
for(int i=0;i<count(obj->children);i++)
give(obj->children[i],dat)
}
N1->give(&A,&data);

DeepT
06-01-2006, 07:40 AM
BTG: Ok I misinterpreted the question on child-parent relations. I was thinking base class-child class, not node-child node. So yes, B will know about A and C.

I also read what you said. I know there is a difference, but what I was telling you, is it is your choice. It can be a type, data, instance, whatever you want, as long as it solves the problem given the constraints.

Rward: That is still daisy-chaining. Yes, its much better then daisy-chaining a huge block of text, although passing it by reference would solve that problem anyhow.

As far as threads, thread local storage and all that... N1 and N2 exist in the same memory space, although yes, maybe that is a solution by creating a global/static class member within a memory block where each is unique, so that this global isn't really that global.

In effect it would seem that if I could create a name space to define things in that would work. The problem with that, however, is that it is not an arbitrary amount. I can create name space N1, N2, ... Nx but not Nx+1 . The multi-threaded approach would solve this though since I could create Nx+1 threads.

Yaltan: If A - Z possessed a reference to System, how would they get that information? Its the same problem. The only way I can think of is to daisy-chain the reference. Since A-Y do not need the reference, it would be kind of a waste.


So far there does seem to be one solution offered so far, and that is the multi-threaded one. Is there another way to create the equivalent to name space during run time? A way to create a 'global' that is only global to a subset, where such systems of globals and associated subsets, can exist an any arbitrary number?

Fugitive
06-01-2006, 09:00 AM
So far there does seem to be one solution offered so far, and that is the multi-threaded one. Is there another way to create the equivalent to name space during run time? A way to create a 'global' that is only global to a subset, where such systems of globals and associated subsets, can exist an any arbitrary number?
All of the methods I can think of would still require propagating some piece of information down the tree. The only reason the multithreaded approach works is because you're fooling the OS itself into keeping that information for you, but the avenues for doing so are very limited and OS-specific.

If you're opposed to the propagation just because it's a specific piece of information, then make the propagation more generic. As someone else mentioned, you could uniquely name nodes in a way that could be mapped to the system. Or add some kind of 'SubmitSystemRequest' member to the nodes that lets you propagate a function object back up the tree and execute it against the System. Either would be generic enough to be usable for other purposes in the future.

DeepT
06-01-2006, 09:14 AM
How would Z do this? It calls what method on what object? How does it know what instance to call or how to associate itself with anything in particular? IE: How does it know I am a Z of N1 and not N2? Even sending an identifier, like an INT that can be used for reference faces the same problem, how do you get that to Z without daisy-chaining?.

Yes, the multi-threaded example might be a trick, but is it really OS specific? Sure invoking a thread is, but 'thread theory' ? Objects, weather C++ or Java still obey the same rules. Do threads not obey the same rules?

A thread is a bit overkill, but so far seems to be the only solution. If I could create a frame where Z could look at something that appears to be global to IT, but is only global to the scope of N1, that would work too.

But let me ask a slightly different question, one that I may subjectively have gotten wrong.

Is it bad to daisy-chain data to such an extreme? That is give data to A which Z and only Z needs. It has to be daisy-chained through 25 layers to get to Z.

This whole thing is premised that such a model is a bad thing. If its not, then this whole thing is moot. If it is, then I wish to find the general solution to the problem.

Talisker
06-01-2006, 09:37 AM
Is it bad to daisy-chain data to such an extreme? That is give data to A which Z and only Z needs. It has to be daisy-chained through 25 layers to get to Z.
Maybe yes, maybe no. All depends on what you're actually trying to do, how often you're passing data around, etc etc etc etc.

Fugitive
06-01-2006, 09:46 AM
Well, to be truly generic you have to at least consider the possibility of whether or not you'll have to deal with more than one system within a single thread, in which case the multithreading trick falls apart. It depends on what assumptions you can make and what might change in the future.
How would Z do this? It calls what method on what object?Z would call the 'SubmitSystemRequest' member of Y, which would in turn call the 'SubmitSystemRequest' member of X, and so on, and presumably at least A knows which system it's part of and can execute the function object against that system. They could inherit that functionality from a separate class, even if the nodes don't inherit from each other, to avoid rewriting it in every single node class. It's just an idea though, I'm not trying to push gospel here, only you can judge how well it actually fits into the specifics of your system...
Is it bad to daisy-chain data to such an extreme? That is give data to A which Z and only Z needs. It has to be daisy-chained through 25 layers to get to Z.
'bad' can be rather relative when it comes to code. Something might not be 'good', but still better than a set of even worse alternatives.

Daisy-chaining in itself is not 'bad'. Would there still be a problem if the tree were only 3 levels deep? If not, then it's merely a matter of degree. It 'feels' wrong to stretch something out to unfamiliar lengths, but it doesn't necessarily mean the method itself is wrong.

However, I would say that the daisy-chaining as you've specifically described above is 'bad' because it's not robust. It works for that specific piece of data, but if you suddenly need more data or to perform some other kind of operation, you have to create yet another avenue for that. That's why I suggested the unique naming or function object methods, since they're generic enough to be used for other purposes as well, making it more robust.

But it is more work, and sometimes the quick-n-dirty method (i.e., using thread association) is good enough to satisfy your requirements. It all depends on what kind of assumptions you can make, what other constraints you're working under, and so on. Only you, as the actual code designer/writer can make that judgement though, and we can only offer ideas.

Just don't get angry when we can't read your mind and automatically forsee why you won't like a certain idea.

BaconTastesGood
06-01-2006, 10:38 AM
As pointed out by Fugitive, the thread solution isn't really a solution as much as it is gaming the rules. It's also not portable, since the original requirements didn't state if a system supported multithreading, and for all we know DeepT would have busted back with "U GUYS SUK, I DIDN'T SAY THREADNG WAS ALLOUDED!"

Like I said before, the fundamental problem that makes it unsolvable in a general sense without daisy chaining is that Z doesn't know ultimate who its Nx is, and an Nx isn't allowed to tell Z indirectly about this.

If you remove that constraint, then you generalize it so that node's either know their containing system, or you propagate a query up the tree, or you propagate data down the tree. Propagation up/down is O(N) run-time expensive. Storing a reference is O(N) storage expensive. Pick your poison.

Anything else AFAICT is going to be some one off hack, but I'm ready to be surprised.

If I was implementing this, I'd propagate the System to each Node since that's the cost of a single pointer storage and generalizes so that any node can query its containing system. Or if storage was at a premium, a virtual GetSystem() method like:

class Node:
def __init__(self,parent=None):
self.parent = parent
self.children = []
def get_system( self ):
return self.parent.get_system()
def add_child( self, child ): self.children.append( child )

class System(Node):
def __init__(self,data):
self.data = data
def get_system( self ): return self
def get_data( self ): return self.data

Then the tree builder would just fill in the blanks appropriately, and any node could query for its enclosing system via self.get_system().get_data().

Again, this isn't exactly rocket science.

DeepT
06-01-2006, 11:08 AM
Yes that would work and it's not rocket science.

I do not get mad at anything here, although sometimes I do get annoyed. For example when I say, there can be any number of Ns, then 3 posts later someone says, try X, but it only works with 1 N, do you allow more then 1 N?

N is allowed to talk to Z, or Z to N. I never said that was not allowed. It is just how it is done, thats all. The only particular way I didn't want to have N daisy-chain its information through 25 proxies to do that.

It seems to me that aside to creating threads, there is no general solution to this problem.

With that in mind, I purpose these new programming guidelines Ill add to my rule-book. Daisy-chaining is allowed during construction, but not at any other point in time.

Alan Au
06-01-2006, 11:11 AM
Is it bad to daisy-chain data to such an extreme? That is give data to A which Z and only Z needs. It has to be daisy-chained through 25 layers to get to Z.

This whole thing is premised that such a model is a bad thing. If its not, then this whole thing is moot. If it is, then I wish to find the general solution to the problem.
It depends on the data and how it's stored. The main thing you want to avoid is making 25 extra copies of the data or worse, running some piece of code 25 extra times to return the data.

- Alan

DeepT
06-01-2006, 11:41 AM
I was think of it also from a code read-ability point of view. Imagine you inherit this codebase from someone and you do not understand the whole thing in general. Trying to figure it out would be a huge pain in the ass. Not only do you need to see that Data entering A gets to Z, but also that nobody tampers with Data on the way to Z.

I try and write things in a hybrid manner of efficiency and readability, stressing one or the other as the situation demands. In most cases, readability is more important to me then efficiency since most things I write are for things that do not need the absolute most performance. As an example, the program I am writing, after 1 week consumed a total of 3 CPU seconds. I can afford to splurge a bit here and there.

bago
06-01-2006, 03:14 PM
I take it then that DeepT doesn't know how TLS works, nor the XML DOM?

FIDGAF
06-02-2006, 05:06 AM
I already suggested XML which would do pretty much exactly what he wants but I guess that's not what he wants to use.