Title: 3D-Enabled Object-Centric Machine Learning
Deep knowledge of the world is necessary if we are to have autonomous and intelligent agents and artifacts that can assist us or even carry out tasks entirely independently. One way to factorize the complexity of the world is to associate information and knowledge with stable entities, animate or inanimate, such as a person or a vehicle. In this talk I'll survey a number of recent efforts in my group whose aim is to create and annotate reference representations for objects based on 3D models, with the aim of delivering such information to new observations as needed. In this object-centric view, the goal is to use these reference representations for aggregating information and knowledge about object geometry, appearance, articulation, materials, physical properties, affordances, and functionality. We acquire such information in a multitude of ways, both from crowd-sourcing and from establishing direct links between models and signals, such as images, videos, and 3D scans -- and through these to language and text.
The purity of the 3D representation allows us to establish robust maps and correspondences for transferring information among the 3D models themselves -- making our current 3D repository, ShapeNet, a true network. We introduce certain rigorous mathematical and computational tools for making such relationships or correspondences between 3D models first-class citizens -- so that the relationships themselves become explicit, algebraic, storable, and searchable objects. Information transport and aggregation in such networks naturally lead to abstractions of objects and other visual entities, allowing data compression while capturing variability as well as shared structure. Furthermore, the network can act as a regularizer, allowing us to benefit from the "wisdom of the collection" in performing operations on individual data sets or in map inference between them, ultimately enabling a certain joint understanding of the data that provides the powers of abstraction, analogy, compression, error correction, and summarization.
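The idea of correspondences as explicit, algebraic objects can be illustrated with a minimal sketch in the style of functional maps, where a correspondence between two shapes is encoded as a small matrix acting on spectral coefficients of functions; composition of maps along the network is then simply matrix multiplication. All names and sizes below (k, C_AB, C_BC) are illustrative assumptions, not taken from any specific codebase.

```python
import numpy as np

# Illustrative sketch: a functional map encodes a correspondence between
# shapes A and B as a k x k matrix taking spectral coefficients of
# functions on A to coefficients of the corresponding functions on B
# (e.g., in a truncated Laplace-Beltrami eigenbasis).

k = 4  # number of basis functions kept (assumed, for illustration)
rng = np.random.default_rng(0)

C_AB = rng.standard_normal((k, k))  # map: functions on A -> functions on B
C_BC = rng.standard_normal((k, k))  # map: functions on B -> functions on C

# Because maps are matrices, they compose algebraically: the induced
# correspondence A -> C is just the product of the two matrices.
C_AC = C_BC @ C_AB

# Transporting a function along the path A -> B -> C agrees with
# applying the composed map directly.
f_A = rng.standard_normal(k)        # a function on A, in the spectral basis
assert np.allclose(C_AC @ f_A, C_BC @ (C_AB @ f_A))
```

Storing maps in this matrix form is what makes them searchable and composable network edges: cycle-consistency constraints, for instance, become algebraic conditions on products of matrices around loops.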
This effectively enables us to add missing information to signals through computational imagination, giving us, for example, the ability to infer what an occluded part of an object in an image may look like, or what other object arrangements may be possible, based on the world-knowledge encoded in ShapeNet. I will also briefly discuss current efforts to design deep neural network architectures appropriate for operating directly on irregular 3D data, as well as ways to learn object function from observing multiple action sequences involving objects.
Leonidas Guibas obtained his Ph.D. from Stanford University under the supervision of Donald Knuth. His main subsequent employers were Xerox PARC, DEC/SRC, MIT, and Stanford. He is currently the Paul Pigott Professor of Computer Science (and by courtesy, Electrical Engineering) at Stanford University. He heads the Geometric Computation group and is part of the AI Laboratory, the Graphics Laboratory, the Bio-X Program, and the Institute for Computational and Mathematical Engineering. Professor Guibas’ interests span geometric data analysis, computational geometry, geometric modeling, computer graphics, computer vision, robotics, ad hoc communication and sensor networks, and discrete algorithms. Some well-known past accomplishments include the analysis of double hashing, red-black trees, the quad-edge data structure, Voronoi-Delaunay algorithms, the Earth Mover’s distance, Kinetic Data Structures (KDS), Metropolis light transport, heat-kernel signatures, and functional maps. Professor Guibas is a member of the National Academy of Engineering, an ACM Fellow, an IEEE Fellow, and winner of the ACM Allen Newell Award and the ICCV Helmholtz Prize.