There are various ways to make MaxMSP interact with the outside world. The one I experimented with uses a video source to produce signals that can be used within MaxMSP. The software I used is reacTIVision (http://reactivision.sourceforge.net/).
ReacTIVision can recognize certain video signals and convert them into MIDI signals. Using its own proprietary symbols, the software converts their position (x / y / z) and rotation into MIDI signals. This, however, requires a complex patch that recognizes these specific signals, interprets them, and converts them into commands that MaxMSP can use.
In this way I was able to map the positions and movements of the symbols within the predetermined area. For the animations, I used some features of Jitter as well as simple subpatch objects and scripts.
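
To make the interpreting step more concrete, here is a minimal Python sketch of what such a patch has to do: take incoming MIDI control change messages and turn them into a position and rotation inside a predetermined area. This is not my actual patch. The controller numbers (1, 2, 3), the 640 x 480 area, and the names MarkerState, scale and apply_message are all assumptions made up for illustration; the real controller assignment depends on how the MIDI mapping is configured.

```python
from dataclasses import dataclass

# Hypothetical controller numbers, chosen only for this example.
CC_X, CC_Y, CC_ROTATION = 1, 2, 3


@dataclass
class MarkerState:
    x: float = 0.0         # horizontal position, in area units
    y: float = 0.0         # vertical position, in area units
    rotation: float = 0.0  # rotation, in degrees


def scale(value, out_min, out_max):
    """Map a 7-bit MIDI value (0..127) linearly onto [out_min, out_max]."""
    return out_min + (value / 127.0) * (out_max - out_min)


def apply_message(state, message, width=640.0, height=480.0):
    """Update state from one 3-byte MIDI message and return it.

    Anything that is not a control change message is ignored.
    The area size (width, height) is an assumed default.
    """
    if len(message) != 3:
        return state
    status, controller, value = message
    if status & 0xF0 != 0xB0:  # 0xB0..0xBF are control change messages
        return state
    value &= 0x7F  # MIDI data bytes carry 7 bits
    if controller == CC_X:
        state.x = scale(value, 0.0, width)
    elif controller == CC_Y:
        state.y = scale(value, 0.0, height)
    elif controller == CC_ROTATION:
        state.rotation = scale(value, 0.0, 360.0)
    return state


if __name__ == "__main__":
    state = MarkerState()
    # Three control change messages on MIDI channel 1.
    for msg in [(0xB0, CC_X, 127), (0xB0, CC_Y, 0), (0xB0, CC_ROTATION, 64)]:
        apply_message(state, msg)
    print(state)
```

Inside Max the same idea would roughly correspond to reading controller values (for example with ctlin) and rescaling them (for example with scale) before passing them on to the Jitter objects. Note that a single 7-bit controller only gives 128 steps per axis, so movements mapped this way are fairly coarse.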