Communication Issues in “The Challenge of the Multicores”
The Talk
A few weeks ago I listened in on a lecture given by Fran Allen, a Turing award winner and a language/compiler/optimization researcher at IBM.
The talk was on “The Challenge of the Multicores”. It meandered a bit into the history of 60’s era IBM machines, a few reflections, and a basic explanation of the problem being encountered with an emphasis on scientific computing. I think- the lecture was rather unfocused, and a few weeks ago, so memory=dead. Part of the lecture was a list of solutions to the multicore challenge:
- Cache needs to die a bloody and well deserved death
- We should move away from low level general purpose languages (C/C++/Java/Python/etc)
(Humorous tilt added by me, but otherwise this is roughly what she said)
I’m not sure if there were more, but these two stuck in my head because of what insued. With the elimination of cache, you could hear almost every engineer in the audience groan in horror, and roughly similar may have occurred with the death of low level languages. The questions at the end of the seminar were roughly 90% “How do we handle the discrepency between CPU speed and memory speed without cache”, “How could any low level general purpose language be completely displaced by a high level language”, etc. And through this, I’m doing a combination of laughing, crying, and shaking my head. This seems a classic issue of miscommunication.
The Death of Cache
With the first point, Fran isn’t saying “Get rid of the cache and have the CPU talk directly to memory” (or she might be, I’m not sure how well she understands hardware concerns). What she is saying is that how cache is implemented today is a large source of performance problems. Cache is operated largely without instruction from the program- it makes educated guesses about to load, and what to keep. When it guesses wrong, a cache miss occurs and a massive microsecond or so overhead occurs while memory is fetched.
Here is the problem: cache guessing is based largely on assembly code and current use. A program can’t tell the CPU ‘I’m going to use this data soon’; a compiler can’t provide meta data to the CPU of the form ‘by the time this method finishes, we’re going to need data from one of these three locations’. Lack of certainty about what the cache will do impedes automated optimization. That is what she was getting at. If instead we were able to treat the cache as 4MB worth of really slow registers, with it’s own lesser instruction set and controller, we could see massive performance gains as we can get much closer to a perfect cache with an intelligent cache controller then we could with a controller that only guesses based on current and previous state.
When phrased that way, cache death doesn’t seem like such an incredibly terrible idea. The cache is still there in some form, it’s just not the cache of today. There are still all manner of issues that would need to be worked out, but it’s alot more clear that “yes, you may have a point there”.
The Death of C
Again, she isn’t saying “C should be eradicated from everywhere and everything”, she’s saying that using a low level language removes the semantics of what you’re trying to do. The best example I can think of would be the difference between using SQL, and hand coding a query over a data structure. (Ignore small issues like ethics, responsibility, pain, etc).
If you encode a query yourself, you are responsible for parallelizing it, and doing that correctly. That means you need to keep in mind data dependencies, lock acquisition orders, etc. You may need to take into account niggly little details about the machine architecture you’re working on if you want massive performance. If you have a sufficiently complex query, it might be worthwhile to precompute certain parts of the query and reuse them. And so on.
With SQL, instead of specifying the steps to get to the state you want, you specify “I want everything that looks like x, cross referenced with everything that feels like y, ordered by z”. Your database server is then responsible for building a query execution plan. It asks “what can I precompute to make this faster”, “Do I need to do a linear scan through this entire data set, or do I have meta data somewhere such that I can do a binary search”, etc.
SQL constrains what you can ask for, but in return it exploits all manner of optimizations and parallelizations, without your lifting of a finger. Fran’s focus is on high performance scientific computing. If the people doing a weather simulation say “I need the solution to Ax=b” rather then coding a matrix inverse or an LU decomposition by hand, the compiler can provide a very efficient implementation, which may be able to exploit optimizations that the programmers could not see. She isn’t saying “Throw out C#, write your web applications in MATLAB”; but she is saying that current practice makes it hard for optimization researchers to help your code run faster.
Conclusion
Communicate better. It’s better for you, your message, and us.
Google Apps
I decided to switch from GMail as a primary address to an address @codertao.com. However, I ran into a bit of a snag. I like GMail. It’s friendly and has over 1,000 daily slashdot headlines waiting for me to search them the next time I want to remember what I read about small scale nuclear reactors. That, and the mail service that comes with my web host isn’t exactly perfect (and a 7GB inbox is larger then the disk space I have in the web hosting account). So I tried the clearly logical thing of playing with the free version of google apps.
And I can now say I’ve found something Google doesn’t do well- GMail -> Google Apps Migration. Once everything is setup and the old accounts are forwarding to the new ones, everything seems to work fine; with the possible exception that Apps seems to be behind GMail in so far as UI features.
To actually get all of my infinitely valuable slashdot daily headlines to transfer, I signed up for the free demo of Google Apps Premiere Edition, which includes an IMAP migration tool. Some amount of setup was need with that, the details of which I have forgotten and don’t wish to live through again. The end result is everything appears to have transferred- tags seem to still be in place, my 600mb inbox has stayed roughly the same; but I am still shocked. That Google doesn’t consider the use case of a user switching from gmail to google apps important enough to warrant time for streamline migration is just… weird.
Granted, I can hardly complain. Standard Edition of Google Apps is free. I finally have an email address simple enough to be written down in chicken scratch and still be read. Along with easy inbox searching and the email log of my last ~5 years online. Overall I think it was worth it.
Slight Update, and Math is Cool
First, the blog shouldn’t be considered dead. It’s just with the summer Math class, and my lack of a game night on Saturdays to ignore while working on implementing Euler (or ACM problems if I get to it) the past two weeks has shot my productivity. That, and working on LC3D has come to a slow halt; though I’m continuing to blame that on a lack of time.
So, let’s chat about that Math class, and the second part of that thar title. I’m taking my first 400 level Math class, ‘Mathematics of Probability’. So far it has been like a combination of statistics and discrete math; and I like it. It’s not the proof intensive class of nightmares that I was worried it might be, which is comforting. And as I already said, I like it. Here’s one specific problem I like:
In how many ways can 4 rooks be placed on a chessboard such that no two rooks can capture each other. For the Chess-o-phobes, rooks capture horizontally and vertically, and board is an 8×8 grid. There are two ways to think about the solution to the problem.
The first way is my personal favorite. You can place the first rook in any of 64 squares, followed by 49 squares, then 36, then 25 squares. That is (8*7*6*5)^2 = (P8,4)^2; or number of permutations of 8 objects, of which 4 are relevant, e.g. 8!/4!. Now, we’ve selected these locations in such a way that order matters; e.g. if we place the first rook in the top right and second in bottom left; that is distinct from placing the first rook in the bottom left, and the second in the top right. To fix that problem we realize we are working with permutations of 4 objects, and divide by 4!. In the end, we get 8!^2/4!^3 ; or (P8,4)^2/4!.
The second way involves conceptualizing it as choosing the rows and columns in which the rooks will be placed: choosing 4 columns, and 4 rows can be done in (C8,4)^2 = (8!/(4!*4!))^2. When we’ve done that we look only at the intersections of these rows and columns, basically a 4×4 grid, in which we will place a pawn on each row, and each column. On the first row, there are 4 locations to place a pawn; on the second there will be only 3; third will have only 2; and the final row can have a pawn in only one location, after the other three rows have been chosen. This means there are 4! ways to place the rooks on a grid that can be chosen in C8,4^2 ways. Multiplication gives you that this can be done in 8!^2/4!^3 = 4!*(C8,4)^2 ways. That might look familiar if you read the last paragraph.
Formula actually comes out to be identical, but there are in fact two ways to arrive at it.
Interesting fact about me: I have a strong, persistent fear of math. I actually like it quite a bit: taking an integral over the volume of a sphere, solving brine tank differential equations using matricies, figuring out the number of unique ways that you can order books on a book shelf, all manner of interesting little math problems. I like knowing how to solve these things which to my knowledge have yet to come up in my life. Not quite true; I’ve actually used bits of Calculus 3, twice, long after I had forgotten the relevant portion that I needed.
But, while I enjoy math, I also have that fear thing. I think it’s based on a fear that, at some point, I’m going to encounter something which I flat out can’t understand; and that anything past that point would be impossible for me to understand. Given that fear explanation, being able to find answers in multiple ways actually gives a fair ammount of comfort to me. Even if this is a rather basic example.
So, that’s my update, and minor thoughts for the day.
Howdy
We’ll see how this turns out. I am Matt Moss, a Software Developer at the Energy Systems Lab in TEES… at Texas A&M; University. I work on a (small) variety of projects, mostly focusing on the backend of our web home energy calculators. While not working I’m also a full time undergrad student in the CompSci department. I’m an active participant in the ACM ICPC programming contests… and that’s about it for extra-curricular activity.
My evil plan for this blog is to get a slight online presence, have incentive to work on side projects, and to get better at ACM-esque problem solving. The evil details of the evil plan are to go through problems (past ACM, project Euler, etc) over this summer (2009 for ye archeaologists), and post analysis of the problem, and a solution. With any luck, I will also post details/expansions/others for ideas for side projects.
Also, Coder Tao is not a fascinating name based on the balance that should come from being able to solve complex problems semi-elegantly while also knowing the important bits of Software Engineering. Nor is it an endorsement of Asian religions. Nor is it a statement involving Torque and coding folk. Tao is the name of a gnome sorcerer. An impressive machine of destruction and chaos. And it’s now a name.
As I said, we’ll see how this turns out.
-Matt
