- Gregory Stevens 7/22/93
- NNEVOLVE.C
- The Experiment:
-
- I set up a three-layer network architecture, 16-10-4 nodes, initialized
- with random weights and thresholds. The total input pattern set represented
- 4 shapes in each of 4 possible positions (the shapes being horizontal lines,
- vertical lines, 3x3 squares and 3x3 diamonds). The initial output goal
- patterns for supervised training were set up to classify by what the shape
- was, independent of position. The initial net was trained by the standard
- back-propagation algorithm on 10 randomly chosen patterns from the complete
- input pattern set, until each item was classified correctly within 0.20 of
- the goal state (see the output sheet for generation 1).
- Then the final output of that net for each of all 16 patterns was saved as
- the new output goal pattern file, and the next net was initialized and trained
- on 10 random input patterns, with those outputs as the goal outputs. This
- was repeated for 20 generations, with the weights, hidden unit activations,
- and output activations saved for each generation.
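- For concreteness, the following is a minimal sketch of that generational
- scheme in C. It is not the NNEVOLVE.C source itself: the function and
- variable names, the learning rate, and the fixed 500-iteration stopping rule
- (standing in for the 0.20 error criterion used for generation 1) are all
- simplifications assumed here.

  /*
   * Minimal sketch of the generational scheme (not the actual NNEVOLVE.C
   * code): a 16-10-4 feed-forward net with sigmoid units, trained by
   * on-line back-propagation.  Constants and names are assumed.
   */
  #include <stdlib.h>
  #include <math.h>

  #define NIN    16     /* input units                        */
  #define NHID   10     /* hidden units                       */
  #define NOUT    4     /* output units (one per shape class) */
  #define NPAT   16     /* 4 shapes x 4 positions             */
  #define NTRAIN 10     /* patterns shown to each generation  */
  #define NGEN   20     /* generations                        */
  #define EPOCHS 500    /* training iterations per generation */
  #define ETA    0.5    /* learning rate (assumed value)      */

  static double w1[NHID][NIN], b1[NHID];   /* input->hidden weights, thresholds  */
  static double w2[NOUT][NHID], b2[NOUT];  /* hidden->output weights, thresholds */

  static double sigmoid(double x) { return 1.0 / (1.0 + exp(-x)); }
  static double frand(void)       { return 2.0 * rand() / RAND_MAX - 1.0; }

  static void init_net(void)               /* random weights and thresholds */
  {
      int i, j;
      for (i = 0; i < NHID; i++)
          for (b1[i] = frand(), j = 0; j < NIN; j++)  w1[i][j] = frand();
      for (i = 0; i < NOUT; i++)
          for (b2[i] = frand(), j = 0; j < NHID; j++) w2[i][j] = frand();
  }

  static void forward(const double *in, double *hid, double *out)
  {
      int i, j; double s;
      for (i = 0; i < NHID; i++) {
          for (s = b1[i], j = 0; j < NIN; j++)  s += w1[i][j] * in[j];
          hid[i] = sigmoid(s);
      }
      for (i = 0; i < NOUT; i++) {
          for (s = b2[i], j = 0; j < NHID; j++) s += w2[i][j] * hid[j];
          out[i] = sigmoid(s);
      }
  }

  static void backprop(const double *in, const double *goal)
  {
      double hid[NHID], out[NOUT], dout[NOUT], dhid[NHID];
      int i, j;
      forward(in, hid, out);
      for (i = 0; i < NOUT; i++)            /* output deltas */
          dout[i] = (goal[i] - out[i]) * out[i] * (1.0 - out[i]);
      for (i = 0; i < NHID; i++) {          /* hidden deltas */
          double s = 0.0;
          for (j = 0; j < NOUT; j++) s += dout[j] * w2[j][i];
          dhid[i] = s * hid[i] * (1.0 - hid[i]);
      }
      for (i = 0; i < NOUT; i++) {          /* weight and threshold updates */
          b2[i] += ETA * dout[i];
          for (j = 0; j < NHID; j++) w2[i][j] += ETA * dout[i] * hid[j];
      }
      for (i = 0; i < NHID; i++) {
          b1[i] += ETA * dhid[i];
          for (j = 0; j < NIN; j++) w1[i][j] += ETA * dhid[i] * in[j];
      }
  }

  /* patterns[] holds the 16 fixed inputs; goals[] starts as the hand-built
     shape classification and is overwritten, generation by generation,
     with the previous net's actual outputs on all 16 patterns.           */
  double patterns[NPAT][NIN], goals[NPAT][NOUT];

  void run_generations(void)
  {
      double hid[NHID], next[NPAT][NOUT];
      int gen, e, p, i, chosen[NTRAIN];

      for (gen = 0; gen < NGEN; gen++) {
          init_net();
          for (p = 0; p < NTRAIN; p++)      /* 10 randomly chosen patterns */
              chosen[p] = rand() % NPAT;
          for (e = 0; e < EPOCHS; e++)
              for (p = 0; p < NTRAIN; p++)
                  backprop(patterns[chosen[p]], goals[chosen[p]]);
          for (p = 0; p < NPAT; p++)        /* this net teaches the next one */
              forward(patterns[p], hid, next[p]);
          for (p = 0; p < NPAT; p++)
              for (i = 0; i < NOUT; i++)
                  goals[p][i] = next[p][i];
      }
  }

- The first generation's goals[] would be filled with the hand-built shape
- classification before run_generations() is called; every later generation
- inherits whatever its predecessor actually produced.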
-
- Significance:
-
- Many people assume that, especially during communication, one another's
- mental models of the world and of the referents of terms are the same, or
- at least extremely similar. Some even postulate that language is a means for
- the transmission of information, so that the decoding of a term by the
- listener's understanding is so exact a match to the encoding by the speaker's
- understanding that the actual mental state can be said to have been
- transmitted in the speech process. Followers of the Wittgensteinian "private
- language" view maintain that it is impossible to tell how similar people's
- understandings of terms are, because it is only necessary that the behaviors
- of the communicators coincide. Thus, it is feasible, they maintain, that two
- people's internal states when using a term in mutual communication could be
- completely different, and communication would still be successful as long as,
- perhaps coincidentally, each person's behavior was considered appropriate by
- the other.
- How similar cognitive states must be for communication to be successful has
- been the subject of much debate over the years. People of the conservative
- camp (as I will refer to the former view) maintain that the world of
- interaction is so dense with reference and diverse contexts that any
- discrepancy between people's understandings of terms would show up quickly
- as miscommunication. The liberal camp (as I will refer to the latter, more
- extreme philosophy) maintains that because the constraints on communication
- are only behavioral, as long as there is a one-to-one mapping between behavior
- and mental states, there could be completely different internal states giving
- rise to consistently successful communication. There are, of course, many
- and diverse views in between.
-
- The model implemented here draws a loose parallel between input units
- and sensory channels, hidden units and internal cognitive states, and output
- units and behavioral states. The goal is to simulate the notion of a person
- learning a categorization (maybe a word) from some non-exhaustive set of
- examples (we are not exposed to all examples of dogs before learning the
- term "dog"), then teaching another person what the term means based on his
- own generalizations, with that person doing the same to someone else, and
- so on. The resulting hidden unit activation states and nodes could then be
- analysed to determine whether there are different internal states that
- still correspond to the same output states as the teacher's. Thus, by
- analogy, the two nets could (input-output) communicate about the patterns and
- identify them to each other, and it could be seen whether the internal states
- need to be the same for there to be communication, and how similar they must
- be.
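- One simple way to make that comparison concrete is sketched below (helper
- routines with assumed names, not part of NNEVOLVE.C): a behavioral test that
- asks whether two nets pick out the same class for a given pattern, and an
- internal test that measures how far apart their hidden activation vectors are.

  #include <math.h>
  #include <stddef.h>

  /* Behavioral agreement: 1 if both output vectors peak on the same unit. */
  int same_classification(const double *out_a, const double *out_b, size_t n)
  {
      size_t i, best_a = 0, best_b = 0;
      for (i = 1; i < n; i++) {
          if (out_a[i] > out_a[best_a]) best_a = i;
          if (out_b[i] > out_b[best_b]) best_b = i;
      }
      return best_a == best_b;
  }

  /* Internal similarity: largest per-unit difference between two hidden
     activation vectors (near 0 = nearly identical internal states).     */
  double hidden_distance(const double *hid_a, const double *hid_b, size_t n)
  {
      size_t i; double d, worst = 0.0;
      for (i = 0; i < n; i++) {
          d = fabs(hid_a[i] - hid_b[i]);
          if (d > worst) worst = d;
      }
      return worst;
  }

- Runs where same_classification() holds for all 16 patterns while
- hidden_distance() stays large are precisely the "different internal states,
- successful communication" cases at issue.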
-
- Results:
-
- Though the initial net was trained on only 10 of the 16 possible patterns
- at each iteration, it classified all of the input stimuli correctly after
- 500 iterations. After 20 generations, with each net being exposed to 10
- random patterns from the input set, the net was still producing almost
- completely accurate categorization of all input patterns (after 500
- iterations, only 2 of 16 were misclassified), though there was considerable
- degradation of surety (the average deviation was 0.35 rather than under
- 0.20). This presumably could be remedied by increasing the number of
- iterations, but because each net had trained for only 500 iterations when it
- taught the next net, these are the relevant values.
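- The scoring behind these figures can be read roughly as follows (a sketch
- with assumed names; the program's own bookkeeping may differ): a pattern
- counts as correctly classified when every output unit lies within the 0.20
- tolerance of its goal value, and the average deviation is the mean absolute
- difference between output and goal over all units and patterns.

  #include <math.h>

  #define NPAT 16
  #define NOUT  4
  #define TOLERANCE 0.20

  /* Number of patterns whose every output unit is within TOLERANCE of goal. */
  int count_correct(double out[NPAT][NOUT], double goal[NPAT][NOUT])
  {
      int p, i, correct = NPAT;
      for (p = 0; p < NPAT; p++)
          for (i = 0; i < NOUT; i++)
              if (fabs(out[p][i] - goal[p][i]) > TOLERANCE) {
                  correct--;
                  break;                  /* one bad unit fails the pattern */
              }
      return correct;
  }

  /* Mean absolute deviation of outputs from goals over all units/patterns. */
  double avg_deviation(double out[NPAT][NOUT], double goal[NPAT][NOUT])
  {
      int p, i; double sum = 0.0;
      for (p = 0; p < NPAT; p++)
          for (i = 0; i < NOUT; i++)
              sum += fabs(out[p][i] - goal[p][i]);
      return sum / (NPAT * NOUT);
  }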
-
- Upon gross analysis of the hidden unit activations for each generation, for
- each of the 16 input patterns, after 500 iterations for each net, it appeared
- that certain structures of the internal activation representation remained
- the same (within 0.02 across all generations), while others deviated
- considerably with disproportionately little degradation in output activation
- (changes in activation of as much as 0.60 or so). Although, with a 10-unit
- representation of a 16-unit input pattern, it is impossible to analyse the
- function of the units in terms of feature detection and the like, this seems
- to correspond to the notion that there are certain "important" aspects of an
- internal representation that must be preserved, while others can vary a great
- deal across internal representations with little effect on communication.
- Upon gross analysis of the weight structures, it is apparent that the weights
- differ greatly between generations in most of the connections to nodes, and
- that the strongly differing connections are distributed evenly across the
- network, unlike the similar vs. different patterns of activity, which were
- localized.
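- The stable-versus-drifting contrast can be expressed as a simple range
- measure over the saved activations (array names assumed here, not those of
- the analysis actually performed); the same measure applied to the saved
- weights shows the evenly distributed differences just described.

  #define NGEN 20
  #define NPAT 16
  #define NHID 10

  /* hidden[g][p][u]: activation of hidden unit u on pattern p in generation
     g, after that generation's 500 training iterations (assumed layout).   */
  double hidden[NGEN][NPAT][NHID];

  /* Spread of one unit's activation on one pattern across all generations:
     values near 0.02 mark the "preserved" parts of the representation,
     values up toward 0.60 the parts that are free to vary.                 */
  double range_over_generations(int p, int u)
  {
      int g;
      double lo = hidden[0][p][u], hi = lo;
      for (g = 1; g < NGEN; g++) {
          if (hidden[g][p][u] < lo) lo = hidden[g][p][u];
          if (hidden[g][p][u] > hi) hi = hidden[g][p][u];
      }
      return hi - lo;
  }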
-
- Conclusion:
-
- If our actions are guided by internal "models of the world," those models
- must satisfy the constraints of relevant sensory input for survival. If we
- view survival, or at least the creation of internal models of the world, as a
- constraint satisfaction problem, there is a strong parallel to neural net
- learning. If our mental models of the world must simultaneously satisfy
- all the constraints of our sensory inputs, then the problem is strongly
- analogous to solving for the appropriate weights to get appropriate output
- activities given input activities. With this analogy in hand, I maintain that
- the results of this mini-model, although showing nothing conclusive about
- human psychology, indicate that the amount of leeway in internal
- representation that still allows for effective communication and interaction
- with the outside world is greater than at least the conservative camp tends
- to propose.
-