An inverse procedural generator
This is the home page of my research on inverse procedural generation applied to building facades. A quick presentation of the concept follows. I will update this page to keep track of my progress.
In computer graphics it is often useful to generate content automatically to ease the burden of drawing everything by hand. Common examples include cities, forests, landscapes, oceans, and roads, which show up in movies, video games, and urban-planning simulations, to name a few. These features clearly follow patterns; nonetheless, they cannot be drawn with a naive tiling, which would look too regular to come from the real world. Nature and mankind induce some randomness and chaos on the environment.
This is where procedural graphics proves useful: with a set of rules combined with some transformation functions, one can generate a whole city, with its underlying randomness, by specifying only a few parameters.
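To make the idea concrete, here is a minimal sketch (in Python, purely illustrative) of how a few parameters plus controlled randomness avoid the "too regular" look: a grid of positions, say trees in a forest, perturbed by a jitter amount, all driven by a seed.

```python
import random

def jittered_grid(rows, cols, jitter, seed):
    """Place points on a regular grid, then perturb each one randomly.
    Three parameters and a seed yield organic-looking variation."""
    rng = random.Random(seed)
    points = []
    for i in range(rows):
        for j in range(cols):
            points.append((j + rng.uniform(-jitter, jitter),
                           i + rng.uniform(-jitter, jitter)))
    return points
```

Changing only `jitter` moves the output continuously from a sterile grid to a natural-looking scattering, which is exactly the kind of leverage a procedural generator gives.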
Procedural generators can be overwhelming to use: they usually require dozens of interrelated parameters, and a tiny alteration to one parameter value may dramatically affect the result. Therefore, the biggest challenge in procedural graphics is understanding the parameter set of each generator.
To overcome this limitation of procedural generation, we can think of the opposite approach: from a real-world input, deduce the parameter set that generates the desired result. This is called inverse procedural generation.
Nonetheless, facade formalization might involve hundreds of parameters. Some of them affect minor details in the output; others transform the result drastically. In other words, they have different psychophysical properties, i.e. they affect the way we perceive the building unevenly.
Which parameters make a facade look believable? Which make it look old, or sad? Which tie it to a specific neighbourhood of some city? A psychophysical study of architectural properties is relevant to improving the generation process.
The first logical step is to master forward procedural generation. I have therefore implemented a prototype in Lua/Cairo based on the papers Instant Architecture (Wonka et al.) and Procedural Modeling of Buildings (Müller et al.).
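The core mechanism of those papers is the split grammar: a labeled rectangle is repeatedly replaced by sub-rectangles cut along one axis, until only terminal shapes remain. My prototype is in Lua/Cairo; the following is only a Python sketch of the idea, with made-up labels and randomized counts.

```python
import random

# A shape is (label, x, y, w, h). A rule replaces a labeled shape with
# sub-shapes obtained by splitting it along one axis.
def split(shape, axis, parts):
    """parts: list of (relative_size, new_label)."""
    label, x, y, w, h = shape
    total = sum(size for size, _ in parts)
    out, offset = [], 0.0
    for size, new_label in parts:
        frac = size / total
        if axis == "x":
            out.append((new_label, x + offset * w, y, frac * w, h))
        else:
            out.append((new_label, x, y + offset * h, w, frac * h))
        offset += frac
    return out

def derive(facade_w, facade_h, seed=0):
    """Facade -> floors -> window tiles, with randomized counts."""
    rng = random.Random(seed)
    shapes = [("facade", 0.0, 0.0, facade_w, facade_h)]
    terminals = []
    while shapes:
        s = shapes.pop()
        if s[0] == "facade":
            shapes += split(s, "y", [(1.0, "floor")] * rng.randint(3, 5))
        elif s[0] == "floor":
            shapes += split(s, "x", [(1.0, "tile")] * rng.randint(2, 4))
        else:
            terminals.append(s)
    return terminals
```

Real grammars add many more rule types (repeats, snapping, component splits), but the derivation loop stays this simple: pop a shape, apply the rule matching its label, collect terminals.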
A facade can be seen as a tiling of rectangular regions. This model can be conveniently stored in a data structure.
Before working directly with pictures, we can feed our inverse generator with a manually created data structure.
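For instance, such a hand-written structure could be a split tree whose leaves are labeled rectangles with relative sizes. The representation below is hypothetical (my prototype's actual Lua structure differs), but it shows how little is needed to describe a layout and recover absolute coordinates from it.

```python
# Hypothetical hand-written layout: a split tree. "dir" is the split
# direction, "size" a relative extent, leaves are labeled regions.
facade = {
    "dir": "y",
    "children": [
        {"size": 1.0, "leaf": "roof"},
        {"size": 3.0, "dir": "x", "children": [
            {"size": 1.0, "leaf": "wall"},
            {"size": 2.0, "leaf": "window"},
            {"size": 1.0, "leaf": "wall"},
        ]},
        {"size": 1.5, "leaf": "ground_floor"},
    ],
}

def leaves(node, x=0.0, y=0.0, w=1.0, h=1.0):
    """Flatten the tree into absolute rectangles (label, x, y, w, h)."""
    if "leaf" in node:
        return [(node["leaf"], x, y, w, h)]
    out, offset = [], 0.0
    total = sum(c["size"] for c in node["children"])
    for c in node["children"]:
        frac = c["size"] / total
        if node["dir"] == "y":
            out += leaves(c, x, y + offset * h, w, frac * h)
        else:
            out += leaves(c, x + offset * w, y, frac * w, h)
        offset += frac
    return out
```

A tree like this doubles as ground truth: run the inverse generator on the flattened rectangles and compare the recovered rules against the tree we wrote by hand.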
As with the forward generator, I have implemented a Lua prototype with some basic heuristics. The (only?) reference I have been using is Inverse Procedural Modeling of Facade Layouts (Fuzhang Wu et al.).
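To give the flavour of such heuristics, here is one of the simplest imaginable, sketched in Python (my prototype is in Lua, and the paper's method is far more sophisticated): recover the split positions of a one-level layout by collecting the edges of its leaf rectangles, then read off the relative column widths as a candidate rule.

```python
def infer_splits(rects, axis=0):
    """Heuristic: recover the global split positions along one axis
    from a list of leaf rectangles (x, y, w, h). axis 0 = x, 1 = y."""
    cuts = set()
    for r in rects:
        cuts.add(round(r[axis], 6))                 # leading edge
        cuts.add(round(r[axis] + r[axis + 2], 6))   # trailing edge
    return sorted(cuts)

def infer_column_rule(rects):
    """Turn the recovered cut positions into relative split sizes."""
    cuts = infer_splits(rects, axis=0)
    return [b - a for a, b in zip(cuts, cuts[1:])]
```

Real facades immediately break this naive version (misaligned rows, nested splits, noise in the edges), which is exactly where the heuristics and the optimization machinery of the paper come in.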
The data structure and its associated functions are actually quite interesting to study. I will devote an article to it later.
The input data consists of real-world pictures. Some work is needed before we can get to the convenient data structure our inverse generator can use.
Perspective correction: it is easier to work on orthographic data, but real-life pictures of facades are never perfectly aligned. Even with specialized hardware it is not always possible to take orthographic pictures of facades, e.g. in narrow streets with tall buildings. Fortunately, this can be easily rectified.
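The standard tool for this rectification is a homography: the 3x3 projective transform that maps the four corners of the facade in the photo to the corners of a rectangle. In practice a library call does it (e.g. OpenCV's getPerspectiveTransform followed by warpPerspective); the NumPy sketch below only shows the underlying linear system, with h33 fixed to 1.

```python
import numpy as np

def homography(src, dst):
    """Estimate the 3x3 homography mapping 4 src points to 4 dst points
    (direct linear transform, normalizing h33 = 1)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_h(H, p):
    """Apply H to a 2D point in homogeneous coordinates."""
    x, y, w = H @ np.array([p[0], p[1], 1.0])
    return (x / w, y / w)
```

Given the four corners of a skewed facade as `src` and the unit square as `dst`, resampling the image through this transform yields the orthographic view the rest of the pipeline expects.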
Occlusion handling: facade pictures are usually not bare; many elements cannot be taken away when taking the picture, such as street signs and lights, bystanders, trees, and so on. They should not end up in the final data structure, so they must be removed. Most cases can be handled automatically.
Not implemented yet. It is still possible to work without it on ideal facades.
Lighting normalization: depending on the time of day, the weather, and cast shadows, not all parts of the facade may be illuminated equally. This can disturb the segmentation process, e.g. a shadow on a region can make it look like several different regions.
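One crude way to damp such gradients, sketched here as an assumption rather than the method I will actually use, is to rescale each tile of the grayscale image so that its local mean matches the global mean; large smooth variations (shadows) shrink while local contrast (windows against wall) survives.

```python
import numpy as np

def normalize_illumination(img, tiles=4):
    """Coarse sketch: rescale each tile of a grayscale float image so its
    mean matches the global mean, flattening slow brightness gradients."""
    h, w = img.shape
    out = img.astype(float)
    ys = np.linspace(0, h, tiles + 1, dtype=int)
    xs = np.linspace(0, w, tiles + 1, dtype=int)
    global_mean = out.mean()
    for y0, y1 in zip(ys, ys[1:]):
        for x0, x1 in zip(xs, xs[1:]):
            block = out[y0:y1, x0:x1]
            m = block.mean()
            if m > 0:
                block *= global_mean / m  # in-place on the view
    return out
```

The tile size is the sensitive parameter: tiles smaller than a facade region start erasing real content, which is the usual trade-off of local normalization schemes.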
Once our images have been pre-processed, the segmentation can run serenely. The quality of the result can hardly be guaranteed: there is no formal, deterministic definition of a region; it is mostly a human criterion. A semi-automatic segmenter sounds like a reasonable choice here: the first pass is fully automatic; in the second pass, control is left to the user, who can merge or split regions, resize them, group them, and so on.
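The user-facing corrections reduce to a few primitive operations on the rectangle list produced by the automatic pass. A sketch of two of them (illustrative Python; the names and the flat-list representation are mine):

```python
def merge(regions, i, j):
    """User operation: replace regions i and j by their bounding box."""
    a, b = regions[i], regions[j]
    x0, y0 = min(a[0], b[0]), min(a[1], b[1])
    x1 = max(a[0] + a[2], b[0] + b[2])
    y1 = max(a[1] + a[3], b[1] + b[3])
    out = [r for k, r in enumerate(regions) if k not in (i, j)]
    return out + [(x0, y0, x1 - x0, y1 - y0)]

def split_region(regions, i, axis, at):
    """User operation: cut region i at relative position at (0..1)."""
    x, y, w, h = regions[i]
    if axis == "x":
        parts = [(x, y, at * w, h), (x + at * w, y, (1 - at) * w, h)]
    else:
        parts = [(x, y, w, at * h), (x, y + at * h, w, (1 - at) * h)]
    return regions[:i] + parts + regions[i + 1:]
```

Both operations return a new list rather than mutating, which makes undo in the interactive tool trivial: keep a stack of previous states.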
The current state of the art is fairly limited, yet certainly extensible to 3D. This is definitely a field to explore.