I used to think that robotics was the coolest thing in engineering.
Research Idea
A conditioned learning algorithm that is able to classify its inputs as either punishment or reward based on the effect of the input on the objective function
Background:
In the domain of autonomous robotic navigation, the primary objective is to make robots learn to avoid obstacles. In typical experiments, bumping into an obstacle is defined in advance as punishment, and the robots are programmed to turn around or change direction once they hit an obstacle. The robots then learn to avoid bumping into obstacles in the future from the visual inputs they receive. When the punishments and rewards have to be defined as a prerequisite for learning, the functionality of the robot is also restricted, since it can only learn from inputs that have been defined as either a punishment or a reward.
However, the robot becomes more autonomous if its learning algorithm can determine its own punishments and rewards, and from them its own course of action in various situations. A punishment would be an input that hinders the robot's objective function (here, to keep moving), such as an obstacle, while a reward would be an input that enhances the objective function, such as an empty stretch of space. Such an ability could help the system learn and evolve into a more sophisticated one.
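As an illustration, here is a minimal Python sketch (all names are hypothetical, and the objective function is assumed to be simply the robot's forward speed) of how an input could be labelled from its effect on the objective rather than from a predefined list:

```python
# Hypothetical sketch: label an input as reward or punishment purely from the
# change it produces in the objective function (assumed here to be forward speed).

def objective(state):
    """Assumed objective function: the robot's forward speed."""
    return state["forward_speed"]

def classify_input(state_before, state_after):
    """Classify whatever just happened by its effect on the objective."""
    delta = objective(state_after) - objective(state_before)
    if delta < 0:
        return "punishment"   # e.g. an obstacle that slowed or stopped the robot
    if delta > 0:
        return "reward"       # e.g. open space that let the robot speed up
    return "neutral"

# Hitting an obstacle drops speed from 0.5 to 0.0 -> classified as punishment
print(classify_input({"forward_speed": 0.5}, {"forward_speed": 0.0}))
```

Anything that slows the robot down then becomes a punishment without having to be enumerated in advance.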
Scope:
Investigation is needed into how primitive organisms classify their environmental inputs as either punishment (pain) or reward (pleasure). There is also work in computational neuroscience studying the neural sequences underlying pleasure and pain. Identifying those processes and emulating them in an artificial neural network, using simple Hebbian learning and conditioned-learning algorithms, could yield an algorithm that can be implemented on a robotic system.
The algorithm could then be extended from a single objective to multiple objectives. Research could also go into creating derived objectives that serve a higher objective, building a hierarchy of objectives to be implemented on practical or simulated robotic systems. For example, in living organisms the higher objective is survival, and the derived objectives include moving away from harmful conditions, finding food, assimilating it, and so on.
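A toy sketch of such a hierarchy, with made-up derived objectives and weights, might look like this:

```python
# Toy two-level hierarchy (all names and weights invented for illustration):
# each derived objective scores the current state, and the higher objective
# of survival aggregates them.

def avoid_harm(state):
    return 1.0 - state["collision_risk"]   # higher is safer

def find_food(state):
    return state["energy_level"]           # proxy for having found food

def keep_moving(state):
    return state["forward_speed"]

DERIVED_OBJECTIVES = [(avoid_harm, 0.5), (find_food, 0.3), (keep_moving, 0.2)]

def survival_objective(state):
    """Higher objective as an assumed weighted sum of the derived objectives."""
    return sum(weight * score(state) for score, weight in DERIVED_OBJECTIVES)

state = {"collision_risk": 0.1, "energy_level": 0.7, "forward_speed": 0.4}
print(survival_objective(state))   # 0.5*0.9 + 0.3*0.7 + 0.2*0.4 = 0.74
```

The weighted sum is only illustrative; how derived objectives should actually be combined or prioritised is part of the research question.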
————————————————————————————————————————————————————————————————————–
Hebbian rule and Conditioned Learning:
Watson: 'the human mind is a set of conditioned responses.'
I believe that thought is just a continually changing combination of excitations (firings) of the neurons in the different screen areas of the brain, and that logical thought is brought about by conditioned learning and firing, where one thing or event is associated with another and the neural connections have the strength combinations to bring the thoughts about, well, logically.
Most people would know what a reflex action is: the dog-and-bell experiment, where the bell becomes associated with food.
My idea was to create a neural network using simple Hebbian learning, one that would achieve this association of events and conditioning via a time-based collateral association between pathways.
Hebbian learning works this way: the strength of association between two neural pathways increases while there is activation on both sides. When a pathway achieves excitation, that excitation stays within the node/pathway for some time and gradually decays. Although the bell might excite an auditory area of the brain and the food a smell or visual area, in the conscious region both activities get associated by virtue of being successive events in time. That is, within the conscious part, before the activation from the bell has completely decayed, the activation from the other event comes in, and a connection forms between the neural pathways for hearing the bell and seeing the food, based on the Hebbian principle. This association is also reflected and 'learnt' in the memory part of the brain, thus forming the association between the bell and the food.
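The following is only an illustrative sketch of that idea, not the proposal's actual algorithm: each pathway keeps a decaying activation trace, and the weight between two pathways grows whenever both traces are active at the same time, so a bell followed shortly by food still overlaps in trace space and the pathways become associated.

```python
# Sketch of trace-based Hebbian association (illustrative; names and constants
# are assumptions). Each pathway keeps an activation trace that decays over
# time; the connection weight grows whenever both traces are active together.

class HebbianPair:
    def __init__(self, decay=0.8, learning_rate=0.1):
        self.decay = decay              # fraction of a trace kept per step (assumed)
        self.learning_rate = learning_rate
        self.bell_trace = 0.0           # lingering activation of the "bell" pathway
        self.food_trace = 0.0           # lingering activation of the "food" pathway
        self.weight = 0.0               # association strength between the pathways

    def step(self, bell_input, food_input):
        # traces decay, then new stimulation is added
        self.bell_trace = self.decay * self.bell_trace + bell_input
        self.food_trace = self.decay * self.food_trace + food_input
        # Hebbian update: co-active traces strengthen the connection
        self.weight += self.learning_rate * self.bell_trace * self.food_trace

pair = HebbianPair()
for _ in range(10):                              # repeated bell -> food pairings
    pair.step(bell_input=1.0, food_input=0.0)    # bell rings
    pair.step(bell_input=0.0, food_input=1.0)    # food arrives before the bell trace decays
    pair.step(bell_input=0.0, food_input=0.0)    # quiet interval

print(round(pair.weight, 3))                     # grows with each pairing
```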
So this was the conditioned-learning algorithm and the functional objective I had in mind for the neural network.
In my research proposal, I also talked about the learning objectives of the neural network, where one doesn't have to keep telling the machine what to think; instead the machine does its own thinking based on an objective. The objective of living things is to preserve themselves and survive, and all those functions (first biological, later neural) such as seeking food, shelter and a mate, along with further derived objectives such as moving, seeing, grooming and so on, are advances built on the same core objectives.
The first section was my proposal, for which I was selected, but with only half a research grant, by UTS, UNSW and some other universities in Australia. That was some five years ago, so I couldn't pursue it. Maybe when I become rich I will get back to my real passions, such as this one and physics, and away from this dumb area of Business Intelligence. Unfortunately, so far this blog is the only thing that has come out of my dear idea.