Since it's been a while, I thought I would provide an update on my ARB_arrays_of_arrays work, and while I'm at it give a bit of an overview of the Mesa GLSL compiler with some other interesting bits and pieces thrown in.
Since my last update the first part of my ARB_arrays_of_arrays support has been reviewed and committed to Mesa. I've also submitted v3 of the piglit tests for review. Work on piglit execution tests and linking support in Mesa is ongoing, and I've found myself being slowed down a bit by the need to do some more reading/research on compilers in order to get fully across what is required to finish things off. To help explain what I've been doing, I thought I'd provide a basic introduction to Mesa's GLSL compiler while also giving some tips for those interested in starting their own research on the topic.
For those who are not aware, OpenGL has its own programming language called GLSL, which gives developers more direct control of the graphics pipeline. This means that Mesa needs to implement a compiler in order to support GLSL. The code for this compiler is contained in the src/glsl directory of the Mesa source code.
For those interested in understanding more about compilers, I stumbled across a free online course put together by Alex Aiken of Stanford University. The lectures can be viewed without signing up here. I've only watched a couple of the lectures so far but they are quite good and cover lots of the basics. The course seems to put a lot of focus on lexical analysis (the scanner) and syntax analysis (the parser) but not so much on intermediate representations of code, which is a little disappointing but understandable given the target audience.
So how does Mesa's GLSL compiler work? Well, a crude high-level flow diagram would look something like this.
Mesa uses Lex and Yacc (actually Flex and Bison on GNU/Linux) to automatically generate a scanner and parser from the files glsl_lexer.ll and glsl_parser.yy respectively (to be exact, there is also some extra manually written parser code in glsl_parser_extras.cpp).
From here the parser outputs an abstract syntax tree (AST), which is then converted into an intermediate representation (IR) that can be understood by the Intel driver's backend and by Gallium. The main code for this step can be found in the file ast_to_hir.cpp, but there are also other ast_* files involved in this step.
Note: Before the IR is read in by the backend there are various optimisations and lowering passes performed on the IR.
Currently the IR is also in a tree-based form and the code that works on it is implemented using the visitor pattern. (If you're a programmer and have never heard of patterns you might want to take a look at the classic book Design Patterns: Elements of Reusable Object-Oriented Software, a.k.a. the Gang of Four book. Knowing about design patterns will not only help you in your daily job, it can also be handy to brush up on them before a job interview.)
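For anyone who hasn't seen the visitor pattern applied to a tree before, here is a minimal sketch of the idea. The node and visitor classes below (Node, Constant, BinaryOp, Evaluator) are made up for illustration and are not Mesa's actual IR classes. The point is just that each node type has an accept method that calls back into the visitor, so a new pass over the tree can be written as a new visitor without touching the node classes.

```cpp
#include <memory>
#include <utility>

// Hypothetical node types for illustration only; these are NOT Mesa's IR classes.
struct Constant;
struct BinaryOp;

// The visitor interface has one method per concrete node type.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit_constant(const Constant &node) = 0;
    virtual void visit_binary_op(const BinaryOp &node) = 0;
};

// Every node knows how to hand itself to a visitor.
struct Node {
    virtual ~Node() = default;
    virtual void accept(Visitor &v) const = 0;
};

struct Constant : Node {
    float value;
    explicit Constant(float v) : value(v) {}
    void accept(Visitor &v) const override { v.visit_constant(*this); }
};

struct BinaryOp : Node {
    char op;  // '+' or '*' in this sketch
    std::unique_ptr<Node> lhs, rhs;
    BinaryOp(char o, std::unique_ptr<Node> l, std::unique_ptr<Node> r)
        : op(o), lhs(std::move(l)), rhs(std::move(r)) {}
    void accept(Visitor &v) const override { v.visit_binary_op(*this); }
};

// One "pass" over the tree: evaluate the expression bottom-up.
struct Evaluator : Visitor {
    float result = 0.0f;

    void visit_constant(const Constant &node) override { result = node.value; }

    void visit_binary_op(const BinaryOp &node) override {
        node.lhs->accept(*this);  // walk the left subtree first
        const float l = result;
        node.rhs->accept(*this);  // then the right subtree
        const float r = result;
        result = (node.op == '+') ? l + r : l * r;
    }
};

float evaluate(const Node &root) {
    Evaluator e;
    root.accept(e);
    return e.result;
}

// Builds the tree for (2 + 3) * 4.
std::unique_ptr<Node> example_tree() {
    auto sum = std::make_unique<BinaryOp>('+', std::make_unique<Constant>(2.0f),
                                          std::make_unique<Constant>(3.0f));
    return std::make_unique<BinaryOp>('*', std::move(sum),
                                      std::make_unique<Constant>(4.0f));
}
```

Calling evaluate(*example_tree()) walks the tree and computes the value of the expression. An optimisation or lowering pass could be structured the same way, as another class deriving from Visitor.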
Anyway, at FOSDEM 2014 Ian Romanick recently gave an interesting presentation about the history of Mesa's GLSL compiler design and implementation, and highlighted current issues with the tree-like structure of the IR. You can view the presentation here.
Ian also references in that presentation two books used by the Intel developers when implementing the current compiler:
So this is the point I'm currently at: the code that has been committed implements the yacc changes needed for ARB_arrays_of_arrays and adds support to the AST for arrays of arrays. IR is created from the AST without any changes needed, and from my testing it appears to execute correctly when the array of arrays is only visible to a single shader. However, when an array of arrays is passed between shaders or is a uniform, changes are required to the IR linking code and likely the driver backend. I'm currently finding myself needing to look into compiler topics such as register allocation to fully understand this problem.
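To give a rough feel for what "arrays of arrays" means from a type point of view, here is a small sketch. It is my own simplified model, not Mesa's type representation, and the names (Type, make_array, total_elements, flatten_index) are hypothetical. The idea is simply that an array type's element type may itself be an array, and that anything which needs to lay the data out linearly has to agree on how a pair of indices maps to a single offset.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical type description for illustration; NOT Mesa's type classes.
struct Type {
    std::size_t length = 0;               // number of elements, if an array
    std::shared_ptr<const Type> element;  // element type, null for a non-array

    bool is_array() const { return element != nullptr; }
};

std::shared_ptr<const Type> make_scalar() {
    return std::make_shared<const Type>();
}

// An array type whose element type may itself be an array.
std::shared_ptr<const Type> make_array(std::shared_ptr<const Type> element,
                                       std::size_t length) {
    auto t = std::make_shared<Type>();
    t->length = length;
    t->element = std::move(element);
    return t;
}

// Total number of innermost elements, e.g. 6 for "float a[3][2]".
std::size_t total_elements(const Type &t) {
    return t.is_array() ? t.length * total_elements(*t.element) : 1;
}

// Maps indices (outermost first) to an offset in a flat layout.
std::size_t flatten_index(const Type &t, const std::vector<std::size_t> &indices) {
    std::size_t offset = 0;
    const Type *cur = &t;
    for (std::size_t i : indices) {
        if (!cur->is_array() || i >= cur->length)
            throw std::out_of_range("bad array index");
        // Stepping to element i skips i whole copies of the element type.
        offset += i * total_elements(*cur->element);
        cur = cur->element.get();
    }
    return offset;
}
```

With this model, a GLSL declaration like float a[3][2] (an array of 3 arrays of 2 floats) would be built as make_array(make_array(make_scalar(), 2), 3), and a[2][1] would map to flatten_index(*type, {2, 1}). Whether the linker and backends actually want a flat layout like this is exactly the sort of thing I still need to work out.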
Currently I have a partially working patch for linking inputs and outputs, but it seems to be causing GNOME Shell to crash while I'm testing, so it obviously still needs some work to avoid regressions.
To summarise, I will continue working on the remaining arrays of arrays support. I've just been going slowly due to needing to do a bit of self-education in order to feel confident that I'm going about things correctly. I hope someone has found this post useful/entertaining/educational. Please feel free to leave any comments/questions below.
Just before I go: for a more complete/accurate/lower-level description of the GLSL compiler, see the README text in the Mesa src/glsl source directory.